Commit Graph

930 Commits

Author SHA1 Message Date
Jeff Hostetler
211338e393 fscache: make fscache_enabled() public
Make fscache_enabled() function public rather than static.
Remove unneeded fscache_is_enabled() function.
Change is_fscache_enabled() macro to call fscache_enabled().

is_fscache_enabled() now takes a pathname so that the answer
is more precise and mean "is fscache enabled for this pathname",
since fscache only stores repo-relative paths and not absolute
paths, we can avoid attempting lookups for absolute paths.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:12:01 -05:00
Jeff Hostetler
fbfc38929f dir.c: make add_excludes aware of fscache during status
Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existant ".gitignore" files in every
directory in the worktree.

The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree.  Change that when fscache is enabled to
call lstat() first and if present, call open().

This seems backwards because both lstat needs to do more
work than fstat.  But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls.  This works because of the lstat diversion
to mingw_lstat when fscache is enabled.

This reduced status times on a 350K file enlistment of the
Windows repo on a NVMe SSD by 0.25 seconds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:12:01 -05:00
Jeff Hostetler
3779c1da42 fscache: remember not-found directories
Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext.  It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory.  This gives a major performance
boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst.  This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Jeff Hostetler
8ef10b6c3d fscache: add key for GIT_TRACE_FSCACHE
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
87d9080bd7 fscache: load directories only once
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.

On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.

Signed-off-by: Karsten Blees <blees@dcon.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
c11fc11b69 mingw: add a cache below mingw's lstat and dirent implementations
Checking the work tree status is quite slow on Windows, due to slow
`lstat()` emulation (git calls `lstat()` once for each file in the
index). Windows operating system APIs seem to be much better at scanning
the status of entire directories than checking single files.

Add an `lstat()` implementation that uses a cache for lstat data. Cache
misses read the entire parent directory and add it to the cache.
Subsequent `lstat()` calls for the same directory are served directly
from the cache.

Also implement `opendir()`/`readdir()`/`closedir()` so that they create
and use directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
701fe1626d add infrastructure for read-only file system level caches
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <blees@dcon.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
63ef4dde95 Win32: make the lstat implementation pluggable
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
aa086f1252 mingw: make the dirent implementation pluggable
Emulating the POSIX `dirent` API on Windows via
`FindFirstFile()`/`FindNextFile()` is pretty staightforward, however,
most of the information provided in the `WIN32_FIND_DATA` structure is
thrown away in the process. A more sophisticated implementation may
cache this data, e.g. for later reuse in calls to `lstat()`.

Make the `dirent` implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to `readdir()`/`closedir()`
that match the `opendir()` implementation (similar to vtable pointers in
Object-Oriented Programming). Define `readdir()`/`closedir()` so that
they call the function pointers in the `DIR` structure. This allows to
choose the `opendir()` implementation on a call-by-call basis.

Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may
be implementation specific (e.g. a caching implementation may allocate a
`struct dirent` with _just_ the size needed to hold the `d_name` in
question).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
630c90e14f Win32: dirent.c: Move opendir down
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2021-11-15 12:12:01 -05:00
Karsten Blees
9d0aa9f379 Win32: make FILETIME conversion functions public
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:12:01 -05:00
Jeff Hostetler
6cb01b15d0 Merge branch 'try-v4-fsmonitor-part4' into try-v4-fsmonitor 2021-11-15 12:12:00 -05:00
Jeff Hostetler
037111150c Merge branch 'try-v4-fsmonitor-part3' into try-v4-fsmonitor 2021-11-15 12:12:00 -05:00
Johannes Schindelin
4df89beff9 Merge pull request #3398 from carenas/pthread-unistd
mingw: avoid fallback for {local,gm}time_r()
2021-11-15 12:12:00 -05:00
Johannes Schindelin
aa129899d6 Merge pull request #3220 from dscho/there-is-no-vs/master-anymore
Let the documentation reflect that there is no vs/master anymore
2021-11-15 12:11:59 -05:00
Johannes Schindelin
5410bb4cce Merge pull request #3165 from dscho/increase-allowed-length-of-interpreter-path
mingw: allow for longer paths in `parse_interpreter()`
2021-11-15 12:11:59 -05:00
Johannes Schindelin
185ac94c33 Merge pull request #2915 from dennisameling/windows-arm64-support
Windows arm64 support
2021-11-15 12:11:59 -05:00
Johannes Schindelin
a9ada8a718 Merge pull request #2351 from PhilipOakley/vcpkg-tip
Vcpkg Install: detect lack of working Git, and note possible vcpkg time outs
2021-11-15 12:11:59 -05:00
Johannes Schindelin
618498be4b Merge pull request #2974 from derrickstolee/maintenance-and-headless
Include Windows-specific maintenance and headless-git
2021-11-15 12:11:59 -05:00
Johannes Schindelin
29eae05baa Merge pull request #2506 from dscho/issue-2283
Allow running Git directly from `C:\Program Files\Git\mingw64\bin\git.exe`
2021-11-15 12:11:58 -05:00
Johannes Schindelin
144e74581c Merge pull request #2504 from dscho/access-repo-via-junction
Handle `git add <file>` where <file> traverses an NTFS junction
2021-11-15 12:11:58 -05:00
Johannes Schindelin
a33c5a34f9 Merge pull request #2501 from jeffhostetler/clink-debug-curl
clink.pl: fix MSVC compile script to handle libcurl-d.lib
2021-11-15 12:11:58 -05:00
Johannes Schindelin
8ac6b3c626 Merge pull request #2488 from bmueller84/master
mingw: fix fatal error working on mapped network drives on Windows
2021-11-15 12:11:58 -05:00
Johannes Schindelin
c5667dda79 Merge pull request #2449 from dscho/mingw-getcwd-and-symlinks
Do resolve symlinks in `getcwd()`
2021-11-15 12:11:57 -05:00
Johannes Schindelin
2b315471b1 Merge pull request #2405 from dscho/mingw-setsockopt
Make sure `errno` is set when socket operations fail
2021-11-15 12:11:57 -05:00
Johannes Schindelin
2b6eb0d80f Merge branch 'dont-clean-junctions'
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:11:57 -05:00
Johannes Schindelin
05d3c4362c Merge branch 'fsync-object-files-always'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:11:57 -05:00
Jeff Hostetler
d93addc03b compat/fsmonitor/fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
Teach the listener thread to shutdown the daemon if the spelling of the
worktree root directory changes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
c73b2c1239 compat/fsmonitor/fsm-health-win32: force shutdown daemon if worktree root moves
Force shutdown fsmonitor daemon if the worktree root directory
is moved, renamed, or deleted.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
083e3b5e89 compat/fsmonitor/fsm-health-win32: add framework for periodically monitoring
Create framework in Win32 version of the "health" thread to
periodically inspect the system and shutdown if warranted.

This version just include the setup for the timeout in
WaitForMultipleObjects() and calls (currently empty) table
of functions.

A later commit will add functions to the table to actually
inspect the system.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
f24a928a7a fsmonitor--daemon: stub in health thread
Create another thread to watch over the daemon process and
automatically shut it down if necessary.

This commit creates the basic framework for a "health" thread
to monitor the daemon and/or the file system.  Later commits
will add platform-specific code to do the actual work.

The "health" thread is intended to monitor conditions that
would be difficult to track inside the IPC thread pool and/or
the file system listener threads.  For example, when there are
file system events outside of the watched worktree root or if
we want to have an idle-timeout auto-shutdown feature.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
5717fb8eec fsmonitor--daemon: rename listener thread related variables
Rename platform-specific listener thread related variables
and data types as we prepare to add another backend thread
type.

[] `struct fsmonitor_daemon_backend_data` becomes `struct fsm_listen_data`
[] `state->backend_data` becomes `state->listen_data`
[] `state->error_code` becomes `state->listen_error_code`

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
e5e6ef00d2 fsmonitor--daemon: cd out of worktree root
Teach the fsmonitor--daemon to CD outside of the worktree
before starting up.

The common Git startup mechanism causes the CWD of the daemon process
to be in the root of the worktree.  On Windows, this causes the daemon
process to hold a locked handle on the CWD and prevents other
processes from moving or deleting the worktree while the daemon is
running.

CD to HOME before entering main event loops.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
bea2c029a0 compat/fsmonitor/fsm-listen-darwin: ignore FSEvents caused by xattr changes on MacOS
Ignore FSEvents resulting from `xattr` changes.  Git does not care about
xattr's or changes to xattr's, so don't waste time collecting these
events in the daemon nor transmitting them to clients.

Various security tools add xattrs to files and/or directories, such as
to mark them as having been downloaded.  We should ignore these events
since it doesn't affect the content of the file/directory or the normal
meta-data that Git cares about.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
30c64fa622 fsmonitor-settings: remote repos on Windows are incompatible with FSMonitor
Teach Git to detect remote working directories on Windows and mark them as
incompatible with FSMonitor.

With this `git fsmonitor--daemon run` will error out with a message like it
does for bare repos.

Client commands, such as `git status`, will not attempt to start the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
c93bb80783 compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
Implement file system event listener on MacOS using FSEvent,
CoreFoundation, and CoreServices.

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
1425b687b1 fsmonitor-settings: remote repos on MacOS are incompatible with FSMonitor
Teach Git to detect remote working directories on MacOS and mark them as
incompatible with FSMonitor.

With this, `git fsmonitor--daemon run` will error out with a message
like it does for bare repos.

Client commands, like `git status`, will not attempt to start the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
a17de0604b compat/fsmonitor/fsm-listen-darwin: add macos header files for FSEvent
Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs.  We need GCC and clang versions because of
compiler and header file conflicts.

While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:

In file included
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
     /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
       193 | char bytes[kAuthorizationExternalFormLength];
           |      ^~~~~

The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.

This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082

In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid above-mentioned issue.

Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
06df56f26b fsmonitor-settings: stub in platform-specific incompatibility checking on MacOS
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
2378ba608f compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
Teach the win32 backend to register a watch on the working tree
root directory (recursively).  Also watch the <gitdir> if it is
not inside the working tree.  And to collect path change notifications
into batches and publish.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
852d6aba69 fsmonitor-settings: virtual repos are incompatible with FSMonitor
Virtual repos, such as GVFS (aka VFS for Git), are incompatible
with FSMonitor.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
3ce8546309 fsmonitor-settings: stub in platform-specific incompatibility checking
Extend generic incompatibility checkout with platform-specific
mechanism.  Stub in Win32 version.

In the existing fsmonitor-settings code we have a way to mark
types of repos as incompatible with fsmonitor (whether via the
hook and ipc APIs).  For example, we do this for bare repos,
since there are no files to watch.

Extend this exclusion mechanism for platfor-specific reasons.
This commit just creates the framework and adds a stub for Win32.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Jeff Hostetler
7a087af79f compat/fsmonitor/fsm-listen-win32: handle shortnames
Teach FSMonitor daemon on Windows to recognize shortname paths as
aliases of normal longname paths.  FSMonitor clients, such as `git
status`, should receive the longname spelling of changed files (when
possible).

Sometimes we receive FS events using the shortname, such as when a CMD
shell runs "RENAME GIT~1 FOO" or "RMDIR GIT~1".  The FS notification
arrives using whatever combination of long and shortnames were used by
the other process.  (Shortnames do seem to be case normalized,
however.)

Use Windows GetLongPathNameW() to try to map the pathname spelling in
the notification event into the normalized longname spelling.  (This
can fail if the file/directory is deleted, moved, or renamed, because
we are asking the FS for the mapping in response to the event and
after it has already happened, but we try.)

Special case the shortname spelling of ".git" to avoid under-reporting
these events.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:56 -05:00
Carlo Marcelo Arenas Belón
adf0cd8d14 mingw: avoid fallback for {local,gm}time_r()
mingw-w64's pthread_unistd.h had a bug that mistakenly (because there is
no support for the *lockfile() functions required[1]) defined
_POSIX_THREAD_SAFE_FUNCTIONS and that was being worked around since
3ecd153a3b (compat/mingw: support MSys2-based MinGW build, 2016-01-14).

the bug was fixed in winphtreads, but as a sideeffect, leaves the
reentrant functions from time.h not longer visible and therefore breaks
the build.

since the intention all along was to avoid using the fallback functions,
formalize the use of POSIX by setting the corresponding feature flag and
to make the intention clearer compile out the fallback functions.

[1] https://unix.org/whitepapers/reentrant.html

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2021-11-15 12:11:55 -05:00
Johannes Schindelin
f3f774d33d compat/vcbuild: document preferred way to build in Visual Studio
We used to have that `make vcxproj` hack, but a hack it is. In the
meantime, we have a much cleaner solution: using CMake, either
explicitly, or even more conveniently via Visual Studio's built-in CMake
support (simply open Git's top-level directory via File>Open>Folder...).

Let's let the `README` reflect this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:11:55 -05:00
Johannes Schindelin
53871bb9e0 mingw: allow for longer paths in parse_interpreter()
As reported in https://github.com/newren/git-filter-repo/pull/225, it
looks like 99 bytes is not really sufficient to represent e.g. the full
path to Python when installed via Windows Store (and this path is used
in the hasb bang line when installing scripts via `pip`).

Let's increase it to what is probably the maximum sensible path size:
MAX_PATH. This makes `parse_interpreter()` in line with what
`lookup_prog()` handles.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Vilius Šumskas <vilius@sumskas.eu>
2021-11-15 12:11:55 -05:00
Jeff Hostetler
a826a299c3 compat/fsmonitor/fsm-listen-darwin: stub in backend for Darwin
Stub in empty implementation of fsmonitor--daemon
backend for Darwin (aka MacOS).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:55 -05:00
Jeff Hostetler
6190e98da8 compat/fsmonitor/fsm-listen-win32: stub in backend for Windows
Stub in empty filesystem listener backend for fsmonitor--daemon on Windows.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2021-11-15 12:11:55 -05:00
Dennis Ameling
272baced19 Add schannel to curl installation
Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
2021-11-15 12:11:54 -05:00
Philip Oakley
2c0ad099de vcpkg_install: add comment regarding slow network connections
The vcpkg downloads may not succeed. Warn careful readers of the time out.

A simple retry will usually resolve the issue.

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-11-15 12:11:54 -05:00