Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existant ".gitignore" files in every
directory in the worktree.
The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree. Change that when fscache is enabled to
call lstat() first and if present, call open().
This seems backwards because both lstat needs to do more
work than fstat. But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls. This works because of the lstat diversion
to mingw_lstat when fscache is enabled.
This reduced status times on a 350K file enlistment of the
Windows repo on a NVMe SSD by 0.25 seconds.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach FSCACHE to remember "not found" directories.
This is a performance optimization.
FSCACHE is a performance optimization available for Windows. It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext. It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory. This gives a major performance
boost on Windows.
However, it does not remember "not found" directories. When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst. This completely defeats any performance
gains.
This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.
This change reduced status times for my very large repo by 60%.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow
`lstat()` emulation (git calls `lstat()` once for each file in the
index). Windows operating system APIs seem to be much better at scanning
the status of entire directories than checking single files.
Add an `lstat()` implementation that uses a cache for lstat data. Cache
misses read the entire parent directory and add it to the cache.
Subsequent `lstat()` calls for the same directory are served directly
from the cache.
Also implement `opendir()`/`readdir()`/`closedir()` so that they create
and use directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX `dirent` API on Windows via
`FindFirstFile()`/`FindNextFile()` is pretty staightforward, however,
most of the information provided in the `WIN32_FIND_DATA` structure is
thrown away in the process. A more sophisticated implementation may
cache this data, e.g. for later reuse in calls to `lstat()`.
Make the `dirent` implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to `readdir()`/`closedir()`
that match the `opendir()` implementation (similar to vtable pointers in
Object-Oriented Programming). Define `readdir()`/`closedir()` so that
they call the function pointers in the `DIR` structure. This allows to
choose the `opendir()` implementation on a call-by-call basis.
Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may
be implementation specific (e.g. a caching implementation may allocate a
`struct dirent` with _just_ the size needed to hold the `d_name` in
question).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The pattern `return errno = ..., -1;` is observed several times in
`compat/mingw.c`. It has served us well over the years, but now clang
starts complaining:
```
compat/mingw.c:723:24: error: possible misuse of comma operator here [-Werror,-Wcomma]
723 | return errno = ENOSYS, -1;
| ^
```
See for example [this failing workflow
run](https://github.com/git-for-windows/git-sdk-arm64/actions/runs/15457893907/job/43513458823#step:8:201).
Let's appease clang (and also reduce the use of the no longer common
comma operator).
The pattern `return errno = ..., -1;` is observed several times in
`compat/mingw.c`. It has served us well over the years, but now clang
starts complaining:
compat/mingw.c:723:24: error: possible misuse of comma operator here [-Werror,-Wcomma]
723 | return errno = ENOSYS, -1;
| ^
See for example this failing workflow run:
https://github.com/git-for-windows/git-sdk-arm64/actions/runs/15457893907/job/43513458823#step:8:201
Let's appease clang (and also reduce the use of the no longer common
comma operator).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git on Windows 2022 fails to write config files on ReFS with the error
message "Function not implemented". The reason is that
`ERROR_NOT_SUPPORTED` is reported (not `ERROR_INVALID_PARAMETER`, as
expected). Let's handle both errors the same: by falling back to the
best-effort option, namely to rename without POSIX semantics.
This fixes https://github.com/git-for-windows/git/issues/5427
There has been a slow but steady stream of bug reports triggered by the
warning included in ac33519ddf. Since
introducing the escape hatch for Windows 7 in that same patch, however,
I have not seen a single report that pointed to the kind of bug I
anticipated when writing that warning message.
Let's drop this warning, as well as the escape hatch: Git for Windows
dropped supporting Windows 7 (and Windows 8) directly after Git for
Windows v2.46.2. For full details, see
https://gitforwindows.org/requirements#windows-version.
For good measure, let's keep the fall-back, though: It sometimes helps
work around issues (e.g. when Defender still holds a handle on a file
that Git wants to write out).
Since Git LFS v3.5.x implicitly dropped Windows 7 support, we now want
users to be advised _what_ is going wrong on that Windows version. This
topic branch goes out of its way to provide users with such guidance.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Another (hopefully clean) PR for showing the error warning about atomic
append on windows after failure on APFS, which returns EBADF not EINVAL.
Signed-off-by: David Lomas <dl3@pale-eds.co.uk>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
As per
https://github.com/git-for-windows/git/issues/4350#issuecomment-1485041503,
the major block for upgrading Git for Windows' OpenSSL from v1.1 to v3
is the tricky part where such an upgrade would break `git fetch`/`git
clone` and `git push` because the libcurl depends on the OpenSSL DLL,
and the major version bump will _change_ the file name of said DLL.
To overcome that, the plan is to build libcurl flavors for each
supported SSL/TLS backend, aligning with the way MSYS2 builds libcurl,
then switch Git for Windows' SDK to the Secure Channel-flavored libcurl,
and teach Git to look for the specific flavor of libcurl corresponding
to the `http.sslBackend` setting (if that was configured).
Here is the PR to teach Git that trick.
The first three commits are rebased versions of those in gitgitgadget/git#1215. These allow the following:
1. Fix `git config --global foo.bar <path>` from allowing the `<path>`. As a bonus, users with a config value starting with `/` will not get a warning about "old-style" paths needing a "`%(prefix)/`".
2. When in WSL, the path starts with `/` so it needs to be interpolated properly. Update the warning to include `%(prefix)/` to get the right value for WSL users. (This is specifically for using Git for Windows from Git Bash, but in a WSL directory.)
3. When using WSL, the ownership check fails and reports an error message. This is noisy, and happens even if the user has marked the path with `safe.directory`. Remove that error message.
This topic vendors in mimalloc v2.2.3, a fast allocator that allows Git
for Windows to perform efficiently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch allows us to specify absolute paths without the drive
prefix e.g. when cloning.
Example:
C:\Users\me> git clone https://github.com/git/git \upstream-git
This will clone into a new directory C:\upstream-git, in line with how
Windows interprets absolute paths.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
ReFS is an alternative filesystem to NTFS. On Windows 2022, it seems not
to support the rename operation using POSIX semantics that Git uses on
Windows as of 391bceae43 (compat/mingw: support POSIX semantics for
atomic renames, 2024-10-27).
However, Windows 2022 reports `ERROR_NOT_SUPPORTED` in this instance.
This is in contrast to `ERROR_INVALID_PARAMETER` (as previous Windows
versions would report that do not support POSIX semantics in renames at
all).
Let's handle both errors the same: by falling back to the best-effort
option, namely to rename without POSIX semantics.
This fixes https://github.com/git-for-windows/git/issues/5427
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In ac33519ddf (mingw: restrict file handle inheritance only on Windows
7 and later, 2019-11-22), I introduced code to safe-guard the
defense-in-depth handling that restricts handles' inheritance so that it
would work with Windows 7, too.
Let's revert this patch: Git for Windows dropped supporting Windows 7 (and
Windows 8) directly after Git for Windows v2.46.2. For full details, see
https://gitforwindows.org/requirements#windows-version.
Actually, on second thought: revert only the part that makes this handle
inheritance restriction logic optional and that suggests to open a bug
report if it fails, but keep the fall-back to try again without said
logic: There have been a few false positives over the past few years
(where the warning was triggered e.g. because Defender was still
accessing a file that Git wanted to overwrite), and the fall-back logic
seems to have helped occasionally in such situations.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This comment has been true for the longest time; The combination of the
two preceding commits made it incorrect, so let's drop that comment.
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We map WSAGetLastError() errors to errno errors in winsock_error_to_errno(),
but the MSVC strerror() implementation only produces "Unknown error" for
most of them. Produce some more meaningful error messages in these
cases.
Our builds for ARM64 link against the newer UCRT strerror() that does know
these errors, so we won't change the strerror() used there.
The wording of the messages is copied from glibc strerror() messages.
Reported-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The winsock2 library provides functions that work on different data
types than file descriptors, therefore we wrap them.
But that is not the only difference: they also do not set `errno` but
expect the callers to enquire about errors via `WSAGetLastError()`.
Let's translate that into appropriate `errno` values whenever the socket
operations fail so that Git's code base does not have to change its
expectations.
This closes https://github.com/git-for-windows/git/issues/2404
Helped-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git LFS is now built with Go 1.21 which no longer supports Windows 7.
However, Git for Windows still wants to support Windows 7.
Ideally, Git LFS would re-introduce Windows 7 support until Git for
Windows drops support for Windows 7, but that's not going to happen:
https://github.com/git-for-windows/git/issues/4996#issuecomment-2176152565
The next best thing we can do is to let the users know what is
happening, and how to get out of their fix, at least.
This is not quite as easy as it would first seem because programs
compiled with Go 1.21 or newer will simply throw an exception and fail
with an Access Violation on Windows 7.
The only way I found to address this is to replicate the logic from Go's
very own `version` command (which can determine the Go version with
which a given executable was built) to detect the situation, and in that
case offer a helpful error message.
This addresses https://github.com/git-for-windows/git/issues/4996.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `__MINGW64__` constant is defined, surprise, surprise, only when
building for a 64-bit CPU architecture.
Therefore using it as a guard to define `_POSIX_C_SOURCE` (so that
`localtime_r()` is declared, among other functions) is not enough, we
also need to check `__MINGW32__`.
Technically, the latter constant is defined even for 64-bit builds. But
let's make things a bit easier to understand by testing for both
constants.
Making it so fixes this compile warning (turned error in GCC v14.1):
archive-zip.c: In function 'dos_time':
archive-zip.c:612:9: error: implicit declaration of function 'localtime_r';
did you mean 'localtime_s'? [-Wimplicit-function-declaration]
612 | localtime_r(&time, &tm);
| ^~~~~~~~~~~
| localtime_s
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows 10 version 1511 (also known as Anniversary Update), according to
https://learn.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences
introduced native support for ANSI sequence processing. This allows
using colors from the entire 24-bit color range.
All we need to do is test whether the console's "virtual processing
support" can be enabled. If it can, we do not even need to start the
`console_thread` to handle ANSI sequences.
Or, almost all we need to do: When `console_thread()` does its work, it
uses the Unicode-aware `write_console()` function to write to the Win32
Console, which supports Git for Windows' implicit convention that all
text that is written is encoded in UTF-8. The same is not necessarily
true if native ANSI sequence processing is used, as the output is then
subject to the current code page. Let's ensure that the code page is set
to `CP_UTF8` as long as Git writes to it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When running Git for Windows on a remote APFS filesystem, it would
appear that the `mingw_open_append()`/`write()` combination would fail
almost exactly like on some CIFS-mounted shares as had been reported in
https://github.com/git-for-windows/git/issues/2753, albeit with a
different `errno` value.
Let's handle that `errno` value just the same, by suggesting to set
`windows.appendAtomically=false`.
Signed-off-by: David Lomas <dl3@pale-eds.co.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The previous commits introduced a compile-time option to load libcurl
lazily, but it uses the hard-coded name "libcurl-4.dll" (or equivalent
on platforms other than Windows).
To allow for installing multiple libcurl flavors side by side, where
each supports one specific SSL/TLS backend, let's first look whether
`libcurl-<backend>-4.dll` exists, and only use `libcurl-4.dll` as a fall
back.
That will allow us to ship with a libcurl by default that only supports
the Secure Channel backend for the `https://` protocol. This libcurl
won't suffer from any dependency problem when upgrading OpenSSL to a new
major version (which will change the DLL name, and hence break every
program and library that depends on it).
This is crucial because Git for Windows relies on libcurl to keep
working when building and deploying a new OpenSSL package because that
library is used by `git fetch` and `git clone`.
Note that this feature is by no means specific to Windows. On Ubuntu,
for example, a `git` built using `LAZY_LOAD_LIBCURL` will use
`libcurl.so.4` for `http.sslbackend=openssl` and `libcurl-gnutls.so.4`
for `http.sslbackend=gnutls`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>