Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each
directory and then free it when we call FindClose(). Update fscache to call
the underlying kernel API NtQueryDirectoryFile so that we can do the buffer
management ourselves. That allows us to allocate a single buffer for the
lifetime of the cache and reuse it for each directory.
This change improves performance of 'git status' by 18% in a repo with ~200K
files and 30k folders.
Documentation for NtQueryDirectoryFile can be found at:
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfilehttps://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constantshttps://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags
To determine if the specified directory is a symbolic link, inspect the
FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is
set. If so, EaSize will contain the reparse tag (this is a so far
undocumented feature, but confirmed by the NTFS developers). To
determine if the reparse point is a symbolic link (and not some other
form of reparse point), test whether the tag value equals the value
IO_REPARSE_TAG_SYMLINK.
The NtQueryDirectoryFile() call works best (and on Windows 8.1 and
earlier, it works *only*) with buffer sizes up to 64kB. Which is 32k
wide characters, so let's use that as our buffer size.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The recent change to make fscache thread specific relied on fscache_enable()
being called first from the primary thread before being called in parallel
from worker threads. Make that more robust and protect it with a critical
section to avoid any issues.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Now that the fscache is single threaded, take advantage of the mem_pool as
the allocator to significantly reduce the cost of allocations and frees.
With the reduced cost of free, in future patches, we can start freeing the
fscache at the end of commands instead of just leaking it.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The threading model for fscache has been to have a single, global cache.
This puts requirements on it to be thread safe so that callers like
preload-index can call it from multiple threads. This was implemented
with a single mutex and completion events which introduces contention
between the calling threads.
Simplify the threading model by making fscache thread specific. This allows
us to remove the global mutex and synchronization events entirely and instead
associate a fscache with every thread that requests one. This works well with
the current multi-threading which divides the cache entries into blocks with
a separate thread processing each block.
At the end of each worker thread, if there is a fscache on the primary
thread, merge the cached results from the worker into the primary thread
cache. This enables us to reuse the cache later especially when scanning for
untracked files.
In testing, this reduced the time spent in preload_index() by about 25% and
also reduced the CPU utilization significantly. On a repo with ~200K files,
it reduced overall status times by ~12%.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Update enable_fscache() to take an optional initial size parameter which is
used to initialize the hashmap so that it can avoid having to rehash as
additional entries are added.
Add a separate disable_fscache() macro to make the code clearer and easier
to read.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Track fscache hits and misses for lstat and opendir requests. Reporting of
statistics is done when the cache is disabled for the last time and freed
and is only reported if GIT_TRACE_FSCACHE is set.
Sample output is:
11:33:11.836428 compat/win32/fscache.c:433 fscache: lstat 3775, opendir 263, total requests/misses 4052/269
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Use FindFirstFileExW with FindExInfoBasic to avoid forcing NTFS to look up
the short name. Also switch to a larger (64K vs 4K) buffer using
FIND_FIRST_EX_LARGE_FETCH to minimize round trips to the kernel.
In a repo with ~200K files, this drops warm cache status times from 3.19
seconds to 2.67 seconds for a 16% savings.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
This is retry of #1419.
I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.
git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.
Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333
Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136
I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Make fscache_enabled() function public rather than static.
Remove unneeded fscache_is_enabled() function.
Change is_fscache_enabled() macro to call fscache_enabled().
is_fscache_enabled() now takes a pathname so that the answer
is more precise and mean "is fscache enabled for this pathname",
since fscache only stores repo-relative paths and not absolute
paths, we can avoid attempting lookups for absolute paths.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach read_directory_recursive() and add_excludes() to
be aware of optional fscache and avoid trying to open()
and fstat() non-existant ".gitignore" files in every
directory in the worktree.
The current code in add_excludes() calls open() and then
fstat() for a ".gitignore" file in each directory present
in the worktree. Change that when fscache is enabled to
call lstat() first and if present, call open().
This seems backwards because both lstat needs to do more
work than fstat. But when fscache is enabled, fscache will
already know if the .gitignore file exists and can completely
avoid the IO calls. This works because of the lstat diversion
to mingw_lstat when fscache is enabled.
This reduced status times on a 350K file enlistment of the
Windows repo on a NVMe SSD by 0.25 seconds.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach FSCACHE to remember "not found" directories.
This is a performance optimization.
FSCACHE is a performance optimization available for Windows. It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext. It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory. This gives a major performance
boost on Windows.
However, it does not remember "not found" directories. When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst. This completely defeats any performance
gains.
This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.
This change reduced status times for my very large repo by 60%.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow
`lstat()` emulation (git calls `lstat()` once for each file in the
index). Windows operating system APIs seem to be much better at scanning
the status of entire directories than checking single files.
Add an `lstat()` implementation that uses a cache for lstat data. Cache
misses read the entire parent directory and add it to the cache.
Subsequent `lstat()` calls for the same directory are served directly
from the cache.
Also implement `opendir()`/`readdir()`/`closedir()` so that they create
and use directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX `dirent` API on Windows via
`FindFirstFile()`/`FindNextFile()` is pretty staightforward, however,
most of the information provided in the `WIN32_FIND_DATA` structure is
thrown away in the process. A more sophisticated implementation may
cache this data, e.g. for later reuse in calls to `lstat()`.
Make the `dirent` implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to `readdir()`/`closedir()`
that match the `opendir()` implementation (similar to vtable pointers in
Object-Oriented Programming). Define `readdir()`/`closedir()` so that
they call the function pointers in the `DIR` structure. This allows to
choose the `opendir()` implementation on a call-by-call basis.
Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may
be implementation specific (e.g. a caching implementation may allocate a
`struct dirent` with _just_ the size needed to hold the `d_name` in
question).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch mode
in upstream Git so that we can get broad coverage of the new code
upstream.
We don't actually do fsyncs in the most of the test suite, since
GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
surrounding batch mode code since GIT_TEST_FSYNC merely makes the
maybe_fsync wrapper always appear to succeed.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit NFC or NFC and NFD spellings of pathnames on macOS.
MacOS is Unicode composition insensitive, so NFC and NFD spellings are
treated as aliases and collide. While the spelling of pathnames in
filesystem events depends upon the underlying filesystem, such as
APFS, HFS+ or FAT32, the OS enforces such collisions regardless of
filesystem.
Teach the daemon to always report the NFC spelling and to report
the NFD spelling when stored in that format on the disk.
This is slightly more general than "core.precomposeUnicode".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach the listener thread to shutdown the daemon if the spelling of the
worktree root directory changes.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Force shutdown fsmonitor daemon if the worktree root directory
is moved, renamed, or deleted.
Use Windows low-level GetFileInformationByHandle() to get and
compare the Windows system unique ID for the directory with a
cached version when we started up. This lets us detect the
case where someone renames the directory that we are watching
and then creates a new directory with the original pathname.
This is important because we are listening to a named pipe for
requests and they are stored in the Named Pipe File System (NPFS)
which a kernel-resident pseudo filesystem not associated with
the actual NTFS directory.
For example, if the daemon was watching "~/foo/", it would have
a directory-watch handle on that directory and a named-pipe
handle for "//./pipe/...foo". Moving the directory to "~/bar/"
does not invalidate the directory handle. (So the daemon would
actually be watching "~/bar" but listening on "//./pipe/...foo".
If the user then does "git init ~/foo" and causes another daemon
to start, the first daemon will still have ownership of the pipe
and the second daemon instance will fail to start. "git status"
clients in "~/foo" will ask "//./pipe/...foo" about changes and
the first daemon instance will tell them about "~/bar".
This commit causes the first daemon to shutdown if the system unique
ID for "~/foo" changes (changes from what it was when the daemon
started). Shutdown occurs after a periodic poll. After the
first daemon exits and releases the lock on the named pipe,
subsequent Git commands may cause another daemon to be started
on "~/foo". Similarly, a subsequent Git command may cause another
daemon to be started on "~/bar".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Extend the Windows version of the "health" thread to periodically
inspect the system and shutdown if warranted.
This commit updates the thread's wait loop to use a timeout and
defines a (currently empty) table of functions to poll the system.
A later commit will add functions to the table to actually
inspect the system.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Create another thread to watch over the daemon process and
automatically shut it down if necessary.
This commit creates the basic framework for a "health" thread
to monitor the daemon and/or the file system. Later commits
will add platform-specific code to do the actual work.
The "health" thread is intended to monitor conditions that
would be difficult to track inside the IPC thread pool and/or
the file system listener threads. For example, when there are
file system events outside of the watched worktree root or if
we want to have an idle-timeout auto-shutdown feature.
This commit creates the health thread itself, defines the thread-proc
and sets up the thread's event loop. It integrates this new thread
into the existing IPC and Listener thread models.
This commit defines the API to the platform-specific code where all of
the monitoring will actually happen.
The platform-specific code for MacOS is just stubs. Meaning that the
health thread will immediately exit on MacOS, but that is OK and
expected. Future work can define MacOS-specific monitoring.
The platform-specific code for Windows sets up enough of the
WaitForMultipleObjects() machinery to watch for system and/or custom
events. Currently, the set of wait handles only includes our custom
shutdown event (sent from our other theads). Later commits in this
series will extend the set of wait handles to monitor other
conditions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Rename platform-specific listener thread related variables
and data types as we prepare to add another backend thread
type.
[] `struct fsmonitor_daemon_backend_data` becomes `struct fsm_listen_data`
[] `state->backend_data` becomes `state->listen_data`
[] `state->error_code` becomes `state->listen_error_code`
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach the fsmonitor--daemon to CD outside of the worktree
before starting up.
The common Git startup mechanism causes the CWD of the daemon process
to be in the root of the worktree. On Windows, this causes the daemon
process to hold a locked handle on the CWD and prevents other
processes from moving or deleting the worktree while the daemon is
running.
CD to HOME before entering main event loops.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Ignore FSEvents resulting from `xattr` changes. Git does not care about
xattr's or changes to xattr's, so don't waste time collecting these
events in the daemon nor transmitting them to clients.
Various security tools add xattrs to files and/or directories, such as
to mark them as having been downloaded. We should ignore these events
since it doesn't affect the content of the file/directory or the normal
meta-data that Git cares about.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
On MacOS mark repos on NTFS or FAT32 volumes as incompatible.
The builtin FSMonitor used Unix domain sockets on MacOS for IPC
with clients. These sockets are kept in the .git directory.
Unix sockets are not supported by NTFS and FAT32, so the daemon
cannot start up.
Test for this during our compatibility checking so that client
commands do not keep trying to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach Git to detect remote working directories on Windows and mark them as
incompatible with FSMonitor.
With this `git fsmonitor--daemon run` will error out with a message like it
does for bare repos.
Client commands, such as `git status`, will not attempt to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach Git to detect remote working directories on macOS and mark them as
incompatible with FSMonitor.
With this, `git fsmonitor--daemon run` will error out with a message
like it does for bare repos.
Client commands, like `git status`, will not attempt to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
VFS for Git virtual repositories are incompatible with FSMonitor.
VFS for Git is a downstream fork of Git. It contains its own custom
file system watcher that is aware of the virtualization. If a working
directory is being managed by VFS for Git, we should not try to watch
it because we may get incomplete results.
We do not know anything about how VFS for Git works, but we do
know that VFS for Git working directories contain a well-defined
config setting. If it is set, mark the working directory as
incompatible.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Extend generic incompatibility checkout with platform-specific
mechanism. Stub in Win32 version.
In the existing fsmonitor-settings code we have a way to mark
types of repos as incompatible with fsmonitor (whether via the
hook and IPC APIs). For example, we do this for bare repos,
since there are no files to watch.
Extend this exclusion mechanism for platform-specific reasons.
This commit just creates the framework and adds a stub for Win32.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>