This topic branch re-adds the deprecated --stdin/-z options to `git
reset`. Those patches were overridden by a different set of options in
the upstream Git project before we could propose `--stdin`.
We offered this in MinGit to applications that wanted a safer way to
pass lots of pathspecs to Git, and these applications will need to be
adjusted.
Instead of `--stdin`, `--pathspec-from-file=-` should be used, and
instead of `-z`, `--pathspec-file-nul`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, symbolic links actually have a type depending on the target:
it can be a file or a directory.
In certain circumstances, this poses problems, e.g. when a symbolic link
is supposed to point into a submodule that is not checked out, so there
is no way for Git to auto-detect the type.
To help with that, we will add support over the course of the next
commits to specify that symlink type via the Git attributes. This
requires an index_state, though, something that Git for Windows'
`symlink()` replacement cannot know about because the function signature
is defined by the POSIX standard and not ours to change.
So let's introduce a helper function to create symbolic links that
*does* know about the index_state.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A ton of Git commands simply do not read (or at least parse) the core.*
settings. This is not good, as Git for Windows relies on the
core.longPaths setting to be read quite early on.
So let's just make sure that all commands read the config and give
platform_core_config() a chance.
This patch teaches tons of Git commands to respect the config setting
`core.longPaths = true`, including `pack-refs`, thereby fixing
https://github.com/git-for-windows/git/issues/1218
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git clean` command needs to enumerate plenty of files and
directories, and can therefore benefit from the FSCache.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `--stdin` option was a well-established paradigm in other commands,
therefore we implemented it in `git reset` for use by Visual Studio.
Unfortunately, upstream Git decided that it is time to introduce
`--pathspec-from-file` instead.
To keep backwards-compatibility for some grace period, we therefore
reinstate the `--stdin` option on top of the `--pathspec-from-file`
option, but mark it firmly as deprecated.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It was a bad idea to just remove that option from Git for Windows
v2.15.0, as early users of that (still experimental) option would have
been puzzled what they are supposed to do now.
So let's reintroduce the flag, but make sure to show the user good
advice how to fix this going forward.
We'll remove this option in a more orderly fashion when we're certain
that the option is no longer used (previous Visual Studio versions
relied on it).
The option is deprecated now, therefore we make sure that keeps saying
so until we finally remove it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a third-party tool periodically runs `git status` in order to keep
track of the state of the working tree, it is a bad idea to lock the
index: it might interfere with interactive commands executed by the
user, e.g. when the user wants to commit files.
Git for Windows introduced the `--no-lock-index` option a long time ago
to fix that (it made it into Git for Windows v2.9.2(3)) by simply
avoiding to write that file.
The downside is that the periodic `git status` calls will be a little
bit more wasteful because they may have to refresh the index repeatedly,
only to throw away the updates when it exits. This cannot really be
helped, though, as tools wanting to get a periodic update of the status
have no way to predict when the user may want to lock the index herself.
Sadly, a competing approach was submitted (by somebody who apparently
has less work on their plate than this maintainer) that made it into
v2.15.0 but is *different*: instead of a `git status`-only option, it is
an option that comes *before* the Git command and is called differently,
too.
Let's give previous users a chance to upgrade to newer Git for Windows
versions by handling the `--no-lock-index` option, still, though with a
big fat warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Update enable_fscache() to take an optional initial size parameter which is
used to initialize the hashmap so that it can avoid having to rehash as
additional entries are added.
Add a separate disable_fscache() macro to make the code clearer and easier
to read.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the end of the status command, disable and free the fscache so that we
don't leak the memory and so that we can dump the fscache statistics.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
This is retry of #1419.
I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.
git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.
Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333
Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136
I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.
During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry. This
calls lstat(). On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.
Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.
We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with
not ok 47 - do not add files from a submodule
We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Teach fsmonitor--daemon client threads to create a cookie file
inside the .git directory and then wait until FS events for the
cookie are observed by the FS listener thread.
This helps address the racy nature of file system events by
blocking the client response until the kernel has drained any
event backlog.
This is especially important on MacOS where kernel events are
only issued with a limited frequency. See the `latency` argument
of `FSeventStreamCreate()`. The kernel only signals every `latency`
seconds, but does not guarantee that the kernel queue is completely
drained, so we may have to wait more than one interval. If we
increase the frequency, the system is more likely to drop events.
We avoid these issues by having each client thread create a unique
cookie file and then wait until it is seen in the event stream.
Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.
Clients will ask for the set of changes relative to a token that they
found in the FSMN index extension in the index. (This token is like a
point in time, but different). Clients will then update the index to
contain the response token (so that subsequent commands will be
relative to this new token).
Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older than the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to the talk to the daemon,
we lazily truncate the list.
We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to respond to IPC requests from client
Git processes and respond with a list of modified pathnames
relative to the provided token.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to build a list of changed paths and associate
them with a token-id. This will be used by the platform-specific
backends to accumulate changed paths in response to filesystem events.
The platform-specific file system listener thread receives file system
events containing one or more changed pathnames (with whatever bucketing
or grouping that is convenient for the file system). These paths are
accumulated (without locking) by the file system layer into a `fsmonitor_batch`.
When the file system layer has drained the kernel event queue, it will
"publish" them to our token queue and make them visible to concurrent
client worker threads. The token layer is free to combine and/or de-dup
paths within these batches for efficient presentation to clients.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to classify relative and absolute
pathnames and decide how they should be handled. This will
be used by the platform-specific backend to respond to each
filesystem event.
When we register for filesystem notifications on a directory,
we get events for everything (recursively) in the directory.
We want to report to clients changes to tracked and untracked
paths within the working directory. We do not want to report
changes within the .git directory, for example.
This classification will be used in a later commit by the
different backends to classify paths as events are received.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Implement `run` and `start` commands to try to
begin listening for file system events.
This version defines the thread structure with a single
fsmonitor_fs_listen thread to watch for file system events
and a simple IPC thread pool to wait for connections from
Git clients over a well-known named pipe or Unix domain
socket.
This version does not actually do anything yet because the
backends are still just stubs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Implement `stop` and `status` client commands to control and query the
status of a `fsmonitor--daemon` server process (and implicitly start a
server process if necessary).
Later commits will implement the actual server and monitor the file
system.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Create a built-in file system monitoring daemon that can be used by
the existing `fsmonitor` feature (protocol API and index extension)
to improve the performance of various Git commands, such as `status`.
The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and
provides an alternative to hook access to existing fsmonitors such
as `watchman`.
This commit merely adds the new command without any functionality.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
This commit refactors `git_config_get_fsmonitor()` into the `repo_*()`
form that takes a parameter `struct repository *r`.
That change prepares for the upcoming `core.useBuiltinFSMonitor` flag which
will be stored in the `repo_settings` struct.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We just introduced a helper to avoid showing a console window when the
scheduled task runs `git.exe`. Let's actually use it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Windows' equivalent to "bind mounts", NTFS junction points, can be
unlinked without affecting the mount target. This is clearly what users
expect to happen when they call `git clean -dfx` in a worktree that
contains NTFS junction points: the junction should be removed, and the
target directory of said junction should be left alone (unless it is
inside the worktree).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It seems to be not exactly rare on Windows to install NTFS junction
points (the equivalent of "bind mounts" on Linux/Unix) in worktrees,
e.g. to map some development tools into a subdirectory.
In such a scenario, it is pretty horrible if `git clean -dfx` traverses
into the mapped directory and starts to "clean up".
Let's just not do that. Let's make sure before we traverse into a
directory that it is not a mount point (or junction).
This addresses https://github.com/git-for-windows/git/issues/607
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.
* tb/reverse-midx:
multi-pack-index: fix potential segfault without sub-command
"git status" codepath learned to work with sparsely populated index
without hydrating it fully.
* ds/status-with-sparse-index:
t1092: document bad sparse-checkout behavior
fsmonitor: integrate with sparse index
wt-status: expand added sparse directory entries
status: use sparse-index throughout
status: skip sparse-checkout percentage with sparse-index
diff-lib: handle index diffs with sparse dirs
dir.c: accept a directory as part of cone-mode patterns
unpack-trees: unpack sparse directory entries
unpack-trees: rename unpack_nondirectories()
unpack-trees: compare sparse directories correctly
unpack-trees: preserve cache_bottom
t1092: add tests for status/add and sparse files
t1092: expand repository data shape
t1092: replace incorrect 'echo' with 'cat'
sparse-index: include EXTENDED flag when expanding
sparse-index: skip indexes with unmerged entries
Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.
* ab/attribute-format:
advice.h: add missing __attribute__((format)) & fix usage
*.h: add a few missing __attribute__((format))
*.c static functions: add missing __attribute__((format))
sequencer.c: move static function to avoid forward decl
*.c static functions: don't forward-declare __attribute__
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.
* jk/log-decorate-optim:
load_ref_decorations(): fix decoration with tags
add_ref_decoration(): rename s/type/deco_type/
load_ref_decorations(): avoid parsing non-tag objects
object.h: add lookup_object_by_type() function
object.h: expand docstring for lookup_unknown_object()
log: avoid loading decorations for userformats that don't need it
pretty.h: update and expand docstring for userformat_find_requirements()
"git worktree add --lock" learned to record why the worktree is
locked with a custom message.
* sm/worktree-add-lock:
worktree: teach `add` to accept --reason <string> with --lock
worktree: mark lock strings with `_()` for translation
t2400: clean up '"add" worktree with lock' test
"git commit --allow-empty-message" won't abort the operation upon
an empty message, but the hint shown in the editor said otherwise.
* hj/commit-allow-empty-message:
commit: remove irrelavent prompt on `--allow-empty-message`
commit: reorganise commit hint strings
"git rev-list" learns to omit the "commit <object-name>" header
lines from the output with the `--no-commit-header` option.
* bc/rev-list-without-commit-line:
rev-list: add option for --pretty=format without header
Since cd57bc41bb (builtin/multi-pack-index.c: display usage on
unrecognized command, 2021-03-30) we have used a "usage" label to avoid
having two separate callers of usage_with_options (one when no arguments
are given, and another for unrecognized sub-commands).
But the first caller has been broken since cd57bc41bb, since it will
happily jump to usage without arguments, and then pass argv[0] to the
"unrecognized subcommand" error.
Many compilers will save us from a segfault here, but the end result is
ugly, since it mentions an unrecognized subcommand when we didn't even
pass one, and (on GCC) includes "(null)" in its output.
Move the "usage" label down past the error about unrecognized
subcommands so that it is only triggered when it should be. While we're
at it, bulk up our test coverage in this area, too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup around struct_type_init() functions.
* ab/struct-init:
string-list.h users: change to use *_{nodup,dup}()
string-list.[ch]: add a string_list_init_{nodup,dup}()
dir.[ch]: replace dir_init() with DIR_INIT
*.c *_init(): define in terms of corresponding *_INIT macro
*.h: move some *_INIT to designated initializers
Code clean-up and leak plugging in "git bundle".
* ab/bundle-updates:
bundle: remove "ref_list" in favor of string-list.c API
bundle.c: use a temporary variable for OIDs and names
bundle cmd: stop leaking memory from parse_options_cmd_bundle()
Fill test gaps.
* ab/show-branch-tests:
show-branch tests: add missing tests
show-branch: don't <COLOR></RESET> for space characters
show-branch tests: modernize test code
show-branch tests: rename the one "show-branch" test file
Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.
* ab/fetch-negotiate-segv-fix:
fetch: fix segfault in --negotiate-only without --negotiation-tip=*
fetch: document the --negotiate-only option
send-pack.c: move "no refs in common" abort earlier
The default reason stored in the lock file, "added with --lock",
is unlikely to be what the user would have given in a separate
`git worktree lock` command. Allowing `--reason` to be specified
along with `--lock` when adding a working tree gives the user control
over the reason for locking without needing a second command.
Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By testing 'git -c core.fsmonitor= status -uno', we can check for the
simplest index operations that can be made sparse-aware. The necessary
implementation details are already integrated with sparse-checkout, so
modify command_requires_full_index to be zero for cmd_status().
In refresh_index(), we loop through the index entries to refresh their
stat() information. However, sparse directories have no stat()
information to populate. Ignore these entries.
This allows 'git status' to no longer expand a sparse index to a full
one. This is further tested by dropping the "-uno" option and adding an
untracked file into the worktree.
The performance test p2000-sparse-checkout-operations.sh demonstrates
these improvements:
Test HEAD~1 HEAD
-----------------------------------------------------------------------------
2000.2: git status (full-index-v3) 0.31(0.30+0.05) 0.31(0.29+0.06) +0.0%
2000.3: git status (full-index-v4) 0.31(0.29+0.07) 0.34(0.30+0.08) +9.7%
2000.4: git status (sparse-index-v3) 2.35(2.28+0.10) 0.04(0.04+0.05) -98.3%
2000.5: git status (sparse-index-v4) 2.35(2.24+0.15) 0.05(0.04+0.06) -97.9%
Note that since HEAD~1 was expanding the sparse index by parsing trees,
it was artificially slower than the full index case. Thus, the 98%
improvement is misleading, and instead we should celebrate the 0.34s to
0.05s improvement of 85%. This is more indicative of the peformance
gains we are expecting by using a sparse index.
Note: we are dropping the assignment of core.fsmonitor here. This is not
necessary for the test script as we are not altering the config any
other way. Correct integration with FS Monitor will be validated in
later changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- default lock string, "added with --lock"
- temporary lock string, "initializing"
Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.
* ab/pickaxe-pcre2: (22 commits)
xdiff-interface: replace discard_hunk_line() with a flag
xdiff users: use designated initializers for out_line
pickaxe -G: don't special-case create/delete
pickaxe -G: terminate early on matching lines
xdiff-interface: allow early return from xdiff_emit_line_fn
xdiff-interface: prepare for allowing early return
pickaxe -S: slightly optimize contains()
pickaxe: rename variables in has_changes() for brevity
pickaxe -S: support content with NULs under --pickaxe-regex
pickaxe: assert that we must have a needle under -G or -S
pickaxe: refactor function selection in diffcore-pickaxe()
perf: add performance test for pickaxe
pickaxe/style: consolidate declarations and assignments
diff.h: move pickaxe fields together again
pickaxe: die when --find-object and --pickaxe-all are combined
pickaxe: die when -G and --pickaxe-regex are combined
pickaxe tests: add missing test for --no-pickaxe-regex being an error
pickaxe tests: test for -G, -S and --find-object incompatibility
pickaxe tests: add test for "log -S" not being a regex
pickaxe tests: add test for diffgrep_consume() internals
...
Some more code and doc clarification around "git push".
* fc/push-simple-updates-cleanup:
push: don't get a full remote object
push: only check same_remote when needed
push: remove trivial function
push: remove redundant check
push: factor out the typical case
push: get rid of all the setup_push_* functions
push: trivial simplifications
push: make setup_push_* return the dst
push: only get the branch when needed
push: factor out null branch check
push: split switch cases
push: return immediately in trivial switch case
push: create new get_upstream_ref() helper