git-for-windows/git - git - Gitea: Self-hosted GitHub

mirror of https://github.com/git-for-windows/git.git synced 2026-03-26 12:56:37 -05:00

Author	SHA1	Message	Date
Karsten Blees	2d12221a42	fscache: load directories only once If multiple threads access a directory that is not yet in the cache, the directory will be loaded by each thread. Only one of the results is added to the cache, all others are leaked. This wastes performance and memory. On cache miss, add a future object to the cache to indicate that the directory is currently being loaded. Subsequent threads register themselves with the future object and wait. When the first thread has loaded the directory, it replaces the future object with the result and notifies waiting threads. Signed-off-by: Karsten Blees <blees@dcon.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	be1eab22ee	mingw: add a cache below mingw's lstat and dirent implementations Checking the work tree status is quite slow on Windows, due to slow `lstat()` emulation (git calls `lstat()` once for each file in the index). Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. Add an `lstat()` implementation that uses a cache for lstat data. Cache misses read the entire parent directory and add it to the cache. Subsequent `lstat()` calls for the same directory are served directly from the cache. Also implement `opendir()`/`readdir()`/`closedir()` so that they create and use directory listings in the cache. The cache doesn't track file system changes and doesn't plug into any modifying file APIs, so it has to be explicitly enabled for git functions that don't modify the working copy. Note: in an earlier version of this patch, the cache was always active and tracked file system changes via ReadDirectoryChangesW. However, this was much more complex and had negative impact on the performance of modifying git commands such as 'git checkout'. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	f10e3ffd9f	add infrastructure for read-only file system level caches Add a macro to mark code sections that only read from the file system, along with a config option and documentation. This facilitates implementation of relatively simple file system level caches without the need to synchronize with the file system. Enable read-only sections for 'git status' and preload_index. Signed-off-by: Karsten Blees <blees@dcon.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	152d9e0d46	Win32: make the lstat implementation pluggable Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite slow. Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. A caching implementation may improve performance by bulk-reading entire directories or reusing data obtained via opendir / readdir. Make the lstat implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	6b6c9ae78c	mingw: make the dirent implementation pluggable Emulating the POSIX `dirent` API on Windows via `FindFirstFile()`/`FindNextFile()` is pretty staightforward, however, most of the information provided in the `WIN32_FIND_DATA` structure is thrown away in the process. A more sophisticated implementation may cache this data, e.g. for later reuse in calls to `lstat()`. Make the `dirent` implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Define a base DIR structure with pointers to `readdir()`/`closedir()` that match the `opendir()` implementation (similar to vtable pointers in Object-Oriented Programming). Define `readdir()`/`closedir()` so that they call the function pointers in the `DIR` structure. This allows to choose the `opendir()` implementation on a call-by-call basis. Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may be implementation specific (e.g. a caching implementation may allocate a `struct dirent` with _just_ the size needed to hold the `d_name` in question). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	238138a91c	Win32: dirent.c: Move opendir down Move opendir down in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de>	2022-04-05 08:14:35 -07:00
Karsten Blees	8f6a80a6b2	Win32: make FILETIME conversion functions public We will use them in the upcoming "FSCache" patches (to accelerate sequential lstat() calls). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2022-04-05 08:14:35 -07:00
Johannes Schindelin	5d3b520414	Merge branch 'ready-for-upstream' This is the branch thicket of patches in Git for Windows that are considered ready for upstream. To keep them in a ready-to-submit shape, they are kept as close to the beginning of the branch thicket as possible.	2022-04-05 08:14:35 -07:00
Victoria Dye	47c70b836f	Merge branch 'ns-batch-fsync' into HEAD	2022-04-05 08:14:34 -07:00
Johannes Schindelin	467afe6621	Merge pull request #3533 from PhilipOakley/hashliteral_t Begin `unsigned long`->`size_t` conversion to support large files on Windows	2022-04-05 08:14:34 -07:00
Jeff Hostetler	ecee229955	Merge branch 'mark-v4-fsmonitor-experimental' into try-v4-fsmonitor	2022-04-05 08:14:34 -07:00
Johannes Schindelin	3dd2f446b5	Enable the built-in FSMonitor as an experimental feature If `feature.experimental` and `feature.manyFiles` are set and the user has not explicitly turned off the builtin FSMonitor, we now start the built-in FSMonitor by default. Only forcing it when UNSET matches the behavior of UPDATE_DEFAULT_BOOL() used for other repo settings. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-05 08:14:33 -07:00
Victoria Dye	a28a3c3326	fsmonitor: reintroduce core.useBuiltinFSMonitor Reintroduce the 'core.useBuiltinFSMonitor' config setting (originally added in `0a756b2a25` (fsmonitor: config settings are repository-specific, 2021-03-05)) after its removal from the upstream version of FSMonitor. Upstream, the 'core.useBuiltinFSMonitor' setting was rendered obsolete by "overloading" the 'core.fsmonitor' setting to take a boolean value. However, several applications (e.g., 'scalar') utilize the original config setting, so it should be preserved for a deprecation period before complete removal: * if 'core.fsmonitor' is a boolean, the user is correctly using the new config syntax; do not use 'core.useBuiltinFSMonitor'. * if 'core.fsmonitor' is unspecified, use 'core.useBuiltinFSMonitor'. * if 'core.fsmonitor' is a path, override and use the builtin FSMonitor if 'core.useBuiltinFSMonitor' is 'true'; otherwise, use the FSMonitor hook indicated by the path. Additionally, for this deprecation period, advise users to switch to using 'core.fsmonitor' to specify their use of the builtin FSMonitor. Signed-off-by: Victoria Dye <vdye@github.com>	2022-04-04 16:22:55 -07:00
Neeraj Singh	88c95ab186	core.fsyncmethod: performance tests for batch mode Add basic performance tests for git commands that can add data to the object database. We cover: * git add * git stash * git update-index (via git stash) * git unpack-objects * git commit --all We cover all currently available fsync methods as well. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	406dc747a9	t/perf: add iteration setup mechanism to perf-lib Tests that affect the repo in stateful ways are easier to write if we can run setup steps outside of the measured portion of perf iteration. This change adds a "--setup 'setup-script'" parameter to test_perf. To make invocations easier to understand, I also moved the prerequisites to a new --prereq parameter. The setup facility will be used in the upcoming perf tests for batch mode, but it already helps in some existing tests, like t5302 and t7820. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	256e857d6a	core.fsyncmethod: tests for batch mode Add test cases to exercise batch mode for: * 'git add' * 'git stash' * 'git update-index' * 'git unpack-objects' These tests ensure that the added data winds up in the object database. In this change we introduce a new test helper lib-unique-files.sh. The goal of this library is to create a tree of files that have different oids from any other files that may have been created in the current test repo. This helps us avoid missing validation of an object being added due to it already being in the repo. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	0d1301dd99	test-lib-functions: add parsing helpers for ls-files and ls-tree Several tests use awk to parse OIDs from the output of 'git ls-files --stage' and 'git ls-tree'. Introduce helpers to centralize these uses of awk. Update t5317-pack-objects-filter-objects.sh to use the new ls-files helper so that it has some usages to review. Other updates are left for the future. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	ac60c1da84	core.fsync: use batch mode and sync loose objects by default on Windows Git for Windows has defaulted to core.fsyncObjectFiles=true since September 2017. We turn on syncing of loose object files with batch mode in upstream Git so that we can get broad coverage of the new code upstream. We don't actually do fsyncs in the most of the test suite, since GIT_TEST_FSYNC is set to 0. However, we do exercise all of the surrounding batch mode code since GIT_TEST_FSYNC merely makes the maybe_fsync wrapper always appear to succeed. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	238b1cf128	unpack-objects: use the bulk-checkin infrastructure The unpack-objects functionality is used by fetch, push, and fast-import to turn the transfered data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling an odb-transaction when unpacking objects, we can take advantage of batched fsyncs. Here are some performance numbers to justify batch mode for unpack-objects, collected on a WSL2 Ubuntu VM. Fsync Mode \| Time for 90 objects (ms) ------------------------------------- Off \| 170 On,fsync \| 760 On,batch \| 230 Note that the default unpackLimit is 100 objects, so there's a 3x benefit in the worst case. The non-batch mode fsync scales linearly with the number of objects, so there are significant benefits even with smaller numbers of objects. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	34ec98ee7a	update-index: use the bulk-checkin infrastructure The update-index functionality is used internally by 'git stash push' to setup the internal stashed commit. This change enables odb-transactions for update-index infrastructure to speed up adding new objects to the object database by leveraging the batch fsync functionality. There is some risk with this change, since under batch fsync, the object files will be in a tmp-objdir until update-index is complete, so callers using the --stdin option will not see them until update-index is done. This risk is mitigated by not keeping an ODB transaction open around --stdin processing if in --verbose mode. Without --verbose mode, a caller feeding update-index via --stdin wouldn't know when update-index adds an object, event without an ODB transaction. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	744f9b005a	builtin/add: add ODB transaction around add_files_to_cache The add_files_to_cache function is invoked internally by builtin/commit.c and builtin/checkout.c for their flags that stage modified files before doing the larger operation. These commands can benefit from batched fsyncing. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	baaec243bf	cache-tree: use ODB transaction around writing a tree Take advantage of the odb transaction infrastructure around writing the cached tree to the object database. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	adf5285ad7	core.fsyncmethod: batched disk flushes for loose-objects When adding many objects to a repo with `core.fsync=loose-object`, the cost of fsync'ing each object file can become prohibitive. One major source of the cost of fsync is the implied flush of the hardware writeback cache within the disk drive. This commit introduces a new `core.fsyncMethod=batch` option that batches up hardware flushes. It hooks into the bulk-checkin odb-transaction functionality, takes advantage of tmp-objdir, and uses the writeout-only support code. When the new mode is enabled, we do the following for each new object: 1a. Create the object in a tmp-objdir. 2a. Issue a pagecache writeback request and wait for it to complete. At the end of the entire transaction when unplugging bulk checkin: 1b. Issue an fsync against a dummy file to flush the log and hardware writeback cache, which should by now have seen the tmp-objdir writes. 2b. Rename all of the tmp-objdir files to their final names. 3b. When updating the index and/or refs, we assume that Git will issue another fsync internal to that operation. This is not the default today, but the user now has the option of syncing the index and there is a separate patch series to implement syncing of refs. On a filesystem with a singular journal that is updated during name operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS we would expect the fsync to trigger a journal writeout so that this sequence is enough to ensure that the user's data is durable by the time the git command returns. This sequence also ensures that no object files appear in the main object store unless they are fsync-durable. Batch mode is only enabled if core.fsync includes loose-objects. If the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does not include loose-objects, we will use file-by-file fsyncing. In step (1a) of the sequence, the tmp-objdir is created lazily to avoid work if no loose objects are ever added to the ODB. We use a tmp-objdir to maintain the invariant that no loose-objects are visible in the main ODB unless they are properly fsync-durable. This is important since future ODB operations that try to create an object with specific contents will silently drop the new data if an object with the target hash exists without checking that the loose-object contents match the hash. Only a full git-fsck would restore the ODB to a functional state where dataloss doesn't occur. In step (1b) of the sequence, we issue a fsync against a dummy file created specifically for the purpose. This method has a little higher cost than using one of the input object files, but makes adding new callers of this mechanism easier, since we don't need to figure out which object file is "last" or risk sharing violations by caching the fd of the last object file. _Performance numbers_: Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD. Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD. Windows - Same host as Linux, a preview version of Windows 11. Adding 500 files to the repo with 'git add' Times reported in seconds. object file syncing \| Linux \| Mac \| Windows --------------------\|-------\|-------\|-------- disabled \| 0.06 \| 0.35 \| 0.61 fsync \| 1.88 \| 11.18 \| 2.47 batch \| 0.15 \| 0.41 \| 1.53 Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	e9bbf03409	bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Make it clearer in the naming and documentation of the plug_bulk_checkin and unplug_bulk_checkin APIs that they can be thought of as a "transaction" to optimize operations on the object database. These transactions may be nested so that subsystems like the cache-tree writing code can optimize their operations without caring whether the top-level code has a transaction active. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Neeraj Singh	6840959be6	bulk-checkin: rename 'state' variable and separate 'plugged' boolean This commit prepares for adding batch-fsync to the bulk-checkin infrastructure. The bulk-checkin infrastructure is currently used to batch up addition of large blobs to a packfile. When a blob is larger than big_file_threshold, we unconditionally add it to a pack. If bulk checkins are 'plugged', we allow multiple large blobs to be added to a single pack until we reach the packfile size limit; otherwise, we simply make a new packfile for each large blob. The 'unplug' call tells us when the series of blob additions is done so that we can finish the packfiles and make their objects available to subsequent operations. Stated another way, bulk-checkin allows callers to define a transaction that adds multiple objects to the object database, where the object database can optimize its internal operations within the transaction boundary. Batched fsync will fit into bulk-checkin by taking advantage of the plug/unplug functionality to determine the appropriate time to fsync and make newly-added objects available in the primary object database. * Rename 'state' variable to 'bulk_checkin_state', since we will later be adding 'bulk_fsync_objdir'. This also makes the variable easier to find in the debugger, since the name is more unique. * Move the 'plugged' data member of 'bulk_checkin_state' into a separate static variable. Doing this avoids resetting the variable in finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we seem to unintentionally disable the plugging functionality the first time a new packfile must be created due to packfile size limits. While disabling the plugging state only results in suboptimal behavior for the current code, it would be fatal for the bulk-fsync functionality later in this patch series. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-04 13:40:10 -07:00
Jeff Hostetler	3a5235120b	Merge branch 'trial-fsmonitor-march-2022-part3' into trial-fsmonitor-march-2022	2022-04-04 12:28:52 -07:00
Jeff Hostetler	047b7f42de	t7527: test Unicode NFC/NFD handling on MacOS Confirm that the daemon reports events using the on-disk spelling for Unicode NFC/NFD characters. On APFS we still have Unicode aliasing, so we cannot create two files that only differ by NFC/NFD, but the on-disk format preserves the spelling used to create the file. On HFS+ we also have aliasing, but the path is always stored on disk in NFD. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	006595a88e	t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd Create a set of prereqs to help understand how file names are handled by the filesystem when they contain NFC and NFD Unicode characters. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	08b925fdf6	fsmonitor: on macOS also emit NFC spelling for NFD pathname Emit NFC or NFC and NFD spellings of pathnames on macOS. MacOS is Unicode composition insensitive, so NFC and NFD spellings are treated as aliases and collide. While the spelling of pathnames in filesystem events depends upon the underlying filesystem, such as APFS, HFS+ or FAT32, the OS enforces such collisions regardless of filesystem. Teach the daemon to always report the NFC spelling and to report the NFD spelling when stored in that format on the disk. This is slightly more general than "core.precomposeUnicode". Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	1d23328829	t7527: test FSMonitor on case insensitive+preserving file system Test that FS events from the OS are received using the preserved, on-disk spelling of files/directories rather than spelling used to make the change. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	90f9857709	fsmonitor: never set CE_FSMONITOR_VALID on submodules Never set CE_FSMONITOR_VALID on the cache-entry of submodule directories. During a client command like 'git status', we may need to recurse into each submodule to compute a status summary for the submodule. Since the purpose of the ce_flag is to let Git avoid scanning a cache-entry, setting the flag causes the recursive call to be avoided and we report incorrect (no status) for the submodule. We created an OS watch on the root directory of our working directory and we receive events for everything in the cone under it. When submodules are present inside our working directory, we receive events for both our repo (the super) and any subs within it. Since our index doesn't have any information for items within the submodules, we can't use those events. We could try to truncate the paths of those events back to the submodule boundary and mark the GITLINK as dirty, but that feels expensive since we would have to prefix compare every FS event that we receive against a list of submodule roots. And it still wouldn't be sufficient to correctly report status on the submodule, since we don't have any space in the cache-entry to cache the submodule's status (the 'SCMU' bits in porcelain V2 speak). That is, the CE_FSMONITOR_VALID bit just says that we don't need to scan/inspect it because we already know the answer -- it doesn't say that the item is clean -- and we don't have space in the cache-entry to store those answers. So we should always do the recursive scan. Therefore, we should never set the flag on GITLINK cache-entries. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	9473b67517	t/perf/p7527: add perf test for builtin FSMonitor Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	1c0b74e4e5	t7527: FSMonitor tests for directory moves Create unit tests to move a directory. Verify that `git status` gives the same result with and without FSMonitor enabled. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	d612def36c	fsmonitor: optimize processing of directory events Teach Git to perform binary search over the cache-entries for a directory notification and then linearly scan forward to find the immediate children. Previously, when the FSMonitor reported a modified directory Git would perform a linear search on the entire cache-entry array for all entries matching that directory prefix and invalidate them. Since the cache-entry array is already sorted, we can use a binary search to find the first matching entry and then only linearly walk forward and invalidate entries until the prefix changes. Also, the original code would invalidate anything having the same directory prefix. Since a directory event should only be received for items that are immediately within the directory (and not within sub-directories of it), only invalidate those entries and not the whole subtree. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	f8563129ce	fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed Teach the listener thread to shutdown the daemon if the spelling of the worktree root directory changes. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	1ebe1cb298	fsm-health-win32: force shutdown daemon if worktree root moves Force shutdown fsmonitor daemon if the worktree root directory is moved, renamed, or deleted. Use Windows low-level GetFileInformationByHandle() to get and compare the Windows system unique ID for the directory with a cached version when we started up. This lets us detect the case where someone renames the directory that we are watching and then creates a new directory with the original pathname. This is important because we are listening to a named pipe for requests and they are stored in the Named Pipe File System (NPFS) which a kernel-resident pseudo filesystem not associated with the actual NTFS directory. For example, if the daemon was watching "~/foo/", it would have a directory-watch handle on that directory and a named-pipe handle for "//./pipe/...foo". Moving the directory to "~/bar/" does not invalidate the directory handle. (So the daemon would actually be watching "~/bar" but listening on "//./pipe/...foo". If the user then does "git init ~/foo" and causes another daemon to start, the first daemon will still have ownership of the pipe and the second daemon instance will fail to start. "git status" clients in "~/foo" will ask "//./pipe/...foo" about changes and the first daemon instance will tell them about "~/bar". This commit causes the first daemon to shutdown if the system unique ID for "~/foo" changes (changes from what it was when the daemon started). Shutdown occurs after a periodic poll. After the first daemon exits and releases the lock on the named pipe, subsequent Git commands may cause another daemon to be started on "~/foo". Similarly, a subsequent Git command may cause another daemon to be started on "~/bar". Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	a970c35dfd	fsm-health-win32: add polling framework to monitor daemon health Extend the Windows version of the "health" thread to periodically inspect the system and shutdown if warranted. This commit updates the thread's wait loop to use a timeout and defines a (currently empty) table of functions to poll the system. A later commit will add functions to the table to actually inspect the system. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	aea0188784	fsmonitor--daemon: stub in health thread Create another thread to watch over the daemon process and automatically shut it down if necessary. This commit creates the basic framework for a "health" thread to monitor the daemon and/or the file system. Later commits will add platform-specific code to do the actual work. The "health" thread is intended to monitor conditions that would be difficult to track inside the IPC thread pool and/or the file system listener threads. For example, when there are file system events outside of the watched worktree root or if we want to have an idle-timeout auto-shutdown feature. This commit creates the health thread itself, defines the thread-proc and sets up the thread's event loop. It integrates this new thread into the existing IPC and Listener thread models. This commit defines the API to the platform-specific code where all of the monitoring will actually happen. The platform-specific code for MacOS is just stubs. Meaning that the health thread will immediately exit on MacOS, but that is OK and expected. Future work can define MacOS-specific monitoring. The platform-specific code for Windows sets up enough of the WaitForMultipleObjects() machinery to watch for system and/or custom events. Currently, the set of wait handles only includes our custom shutdown event (sent from our other theads). Later commits in this series will extend the set of wait handles to monitor other conditions. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	c924227a89	fsmonitor--daemon: rename listener thread related variables Rename platform-specific listener thread related variables and data types as we prepare to add another backend thread type. [] `struct fsmonitor_daemon_backend_data` becomes `struct fsm_listen_data` [] `state->backend_data` becomes `state->listen_data` [] `state->error_code` becomes `state->listen_error_code` Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	8c4e2d788a	fsmonitor--daemon: prepare for adding health thread Refactor daemon thread startup to make it easier to start a third thread class to monitor the health of the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	2b8aceb8a0	fsmonitor--daemon: cd out of worktree root Teach the fsmonitor--daemon to CD outside of the worktree before starting up. The common Git startup mechanism causes the CWD of the daemon process to be in the root of the worktree. On Windows, this causes the daemon process to hold a locked handle on the CWD and prevents other processes from moving or deleting the worktree while the daemon is running. CD to HOME before entering main event loops. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	f5ef15c839	fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS Ignore FSEvents resulting from `xattr` changes. Git does not care about xattr's or changes to xattr's, so don't waste time collecting these events in the daemon nor transmitting them to clients. Various security tools add xattrs to files and/or directories, such as to mark them as having been downloaded. We should ignore these events since it doesn't affect the content of the file/directory or the normal meta-data that Git cares about. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	b5e3c77c74	unpack-trees: initialize fsmonitor_has_run_once in o->result Initialize `o->result.fsmonitor_has_run_once` based upon value in `o->src_index->fsmonitor_has_run_once` to prevent a second fsmonitor query during the tree traversal and possibly getting a skewed view of the working directory. The checkout code has already talked to the fsmonitor and the traversal is updating the index as it traverses, so there is no need to query the fsmonitor. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	67ca4fa97d	fsmonitor-settings: NTFS and FAT32 on MacOS are incompatible On MacOS mark repos on NTFS or FAT32 volumes as incompatible. The builtin FSMonitor used Unix domain sockets on MacOS for IPC with clients. These sockets are kept in the .git directory. Unix sockets are not supported by NTFS and FAT32, so the daemon cannot start up. Test for this during our compatibility checking so that client commands do not keep trying to start the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	5648ec8662	fsmonitor-settings: remote repos on Windows are incompatible Teach Git to detect remote working directories on Windows and mark them as incompatible with FSMonitor. With this `git fsmonitor--daemon run` will error out with a message like it does for bare repos. Client commands, such as `git status`, will not attempt to start the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	3fe6f2353f	fsmonitor-settings: remote repos on macOS are incompatible Teach Git to detect remote working directories on macOS and mark them as incompatible with FSMonitor. With this, `git fsmonitor--daemon run` will error out with a message like it does for bare repos. Client commands, like `git status`, will not attempt to start the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	94d8e5110f	fsmonitor-settings: stub in macOS-specific incompatibility checking Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	de3ef9624c	fsmonitor-settings: VFS for Git virtual repos are incompatible VFS for Git virtual repositories are incompatible with FSMonitor. VFS for Git is a downstream fork of Git. It contains its own custom file system watcher that is aware of the virtualization. If a working directory is being managed by VFS for Git, we should not try to watch it because we may get incomplete results. We do not know anything about how VFS for Git works, but we do know that VFS for Git working directories contain a well-defined config setting. If it is set, mark the working directory as incompatible. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	e5572c54dd	fsmonitor-settings: stub in Win32-specific incompatibility checking Extend generic incompatibility checkout with platform-specific mechanism. Stub in Win32 version. In the existing fsmonitor-settings code we have a way to mark types of repos as incompatible with fsmonitor (whether via the hook and IPC APIs). For example, we do this for bare repos, since there are no files to watch. Extend this exclusion mechanism for platform-specific reasons. This commit just creates the framework and adds a stub for Win32. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00
Jeff Hostetler	e7c2b4b2f1	fsmonitor-settings: bare repos are incompatible with FSMonitor Bare repos do not have a worktree, so there is nothing for the daemon watch. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2022-04-04 12:28:51 -07:00

1 2 3 4 5 ...

130609 Commits