MSys2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.
That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.
Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).
Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.
Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).
Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).
Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).
Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.
Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.
While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.
Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199
[jes: adjusted test number to avoid conflicts, added support for
chdir(), etc]
Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There is a problem in the way 9ac3f0e5b3 (pack-objects: fix
performance issues on packing large deltas, 2018-07-22) initializes that
mutex in the `packing_data` struct. The problem manifests in a
segmentation fault on Windows, when a mutex (AKA critical section) is
accessed without being initialized. (With pthreads, you apparently do
not really have to initialize them?)
This was reported in https://github.com/git-for-windows/git/issues/1839.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.
Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.
Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.
Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.
Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).
Signed-off-by: Karsten Blees <blees@dcon.de>
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch addresses the segmentation faults in `git difftool --no-index
--dir-diff`: surprisingly, those two options don't make no sense
together.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch conflicts with the next change that will change the
way we call `CreateProcessW()`. So let's merge it early, to avoid merge
conflicts during a merge (because we would have to resolve this with
every single merging-rebase).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The environment variable `HOME` is not exactly a native concept on
Windows, but Git and its scripts rely heavily on it. Make sure that it
is set (using a default that is sensible in most cases, and can easily
be overridden by setting the user-wide environment variable `HOME`
explicitly, before starting Git).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The idea of the C runtime on Windows as to what a locale is does not
mesh well with the idea Git has. So let's just ignore the C runtime.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
MSYS2 inherits the trick from Cygwin to pretend that filenames can
contain characters that are illegal on Windows (by mapping them to a
private Unicode page). As long as we stay safely within the MSYS2 realm
(Bash, GNU make, Perl) that is fine, so technically this change is not
needed. But it is a lot more elegant not to rely on this.
Besides, the suffix `.new` is a lot more intuitive than the suffix
`+`...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is GCC's attempt at making things less predictable and thereby
reduce the attack surface for malware.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows executables can be configured to make use of certain Windows
features only via a so-called "manifest", i.e. a specific, embedded
resource. This manifest is also necessary to determine the Windows
version reliably.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows ships with a subset of MSYS2, and tries to integrate
smoothly, so we want to install the documentation in the location that
is recommended by MSYS2.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch fixes the usage pattern where files are still held
open with an exclusive lock when an external program is asked to open
those very same files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This branch introduces support for reading the "Windows-wide" Git
configuration from `%PROGRAMDATA%\Git\config`. As these settings are
intended to be shared between *all* Git-related software, that config
file takes an even lower precedence than `$(prefix)/etc/gitconfig`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We no longer use MakeMaker, so let's not state in the MINGW section that
we do not want to use it...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch fixes a corner case that is amazingly common in this
developer's workflow: in a `git stash -p`, splitting a hunk and stashing
only part of it runs into a (known) bug where the partial hunk cannot be
applied in reverse.
It is one of those "good enough" fixes, not a full fix, though, as the
full fix would require a 3-way merge between `stash^` and the *worktree*
(not `HEAD`), with `stash` as merge base (i.e. a `git revert`, but on
top of the current worktree).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is the final leg of the journey to a fully built-in `git add`: the
`git add -i` and `git add -p` modes were re-implemented in C, but they
lacked support for a couple of config settings.
The one that sticks out most is the `interactive.singleKey` setting: it
was not only particularly hard to get to work, especially on Windows. It
is also the setting that seems to be incomplete already in the Perl
version: while the name suggests that it applies to the main loop of
`git add --interactive`, or to the file selections in that command, it
does not. Only the `git add --patch` mode respects that setting.
As it is outside the purpose of the conversion of
`git-add--interactive.perl` to C, we will leave that loose end for some
future date.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At this stage on the journey to a fully built-in `git add`, we already
have everything we need, including the `--interactive` and `--patch`
options, as long as the `add.interactive.useBuiltin` setting is set to
`true` (kind of a "turned off feature flag", which it will be for a
while, until we get confident enough that the built-in version does the
job, and retire the Perl script).
However, the internal `add--interactive` helper is also used to back the
`--patch` option of `git stash`, `git reset` and `git checkout`.
This patch series brings them "online".
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This job runs the test suite twice, once in regular mode, and once with
a whole slew of `GIT_TEST_*` variables set.
Now that the built-in version of `git add --interactive` is
feature-complete, let's also throw `GIT_TEST_MULTI_PACK_INDEX` into that
fray.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In 7e9e048661 (stash -p: demonstrate failure of split with mixed y/n,
2015-04-16), a regression test for a known breakage that was added to
the test script `t3904-stash-patch.sh` that demonstrated that splitting
a hunk and trying to stash only part of that split hunk fails (but
shouldn't).
As expected, it still fails, but for the wrong reason: once the bug is
fixed, we would expect stderr to show nothing, yet the regression test
expects stderr to show something.
Let's fix that by telling that regression test case to expect nothing to
be printed to stderr.
While at it, also drop the obvious left-over from debugging where the
regression test did not mind `git stash -p` to return a non-zero exit
status.
Of course, the regression test still fails, but this time for the
correct reason.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>