A ton of Git commands simply do not read (or at least parse) the core.*
settings. This is not good, as Git for Windows relies on the
core.longPaths setting to be read quite early on.
So let's just make sure that all commands read the config and give
platform_core_config() a chance.
This patch teaches tons of Git commands to respect the config setting
`core.longPaths = true`, including `pack-refs`, thereby fixing
https://github.com/git-for-windows/git/issues/1218
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git clean` command needs to enumerate plenty of files and
directories, and can therefore benefit from the FSCache.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Update enable_fscache() to take an optional initial size parameter which is
used to initialize the hashmap so that it can avoid having to rehash as
additional entries are added.
Add a separate disable_fscache() macro to make the code clearer and easier
to read.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the end of the status command, disable and free the fscache so that we
don't leak the memory and so that we can dump the fscache statistics.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
This is retry of #1419.
I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.
git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.
Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333
Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136
I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.
During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry. This
calls lstat(). On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.
Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.
We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with
not ok 47 - do not add files from a submodule
We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
After writing to `stdout` and before reading from `stdin`, it is a good
idea to flush the former.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These fixes were necessary for Sverre Rabbelier's remote-hg to work,
but for some magic reason they are not necessary for the current
remote-hg. Makes you wonder how that one gets away with it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Windows' equivalent to "bind mounts", NTFS junction points, can be
unlinked without affecting the mount target. This is clearly what users
expect to happen when they call `git clean -dfx` in a worktree that
contains NTFS junction points: the junction should be removed, and the
target directory of said junction should be left alone (unless it is
inside the worktree).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It seems to be not exactly rare on Windows to install NTFS junction
points (the equivalent of "bind mounts" on Linux/Unix) in worktrees,
e.g. to map some development tools into a subdirectory.
In such a scenario, it is pretty horrible if `git clean -dfx` traverses
into the mapped directory and starts to "clean up".
Let's just not do that. Let's make sure before we traverse into a
directory that it is not a mount point (or junction).
This addresses https://github.com/git-for-windows/git/issues/607
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For performance reasons `stdout` is buffered by default. That leads to
problems if after printing to `stdout` a read on `stdin` is performed.
For that reason interactive commands like `git clean -i` do not function
properly anymore if the `stdout` is not flushed by `fflush(stdout)` before
trying to read from `stdin`.
So let's precede all reads on `stdin` in `git clean -i` by flushing
`stdout`.
Signed-off-by: 마누엘 <nalla@hamal.uberspace.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The built-in `git add -i` machinery obviously has its `the_repository`
structure initialized at the point where `cmd_commit()` calls it, and
therefore does not look at the environment variable `GIT_INDEX_FILE`.
But when being called from `commit --interactive`, it has to, because
the index was already locked in that case, and we want to ask the
interactive add machinery to work on the `index.lock` file instead of
the `index` file.
Technically, we could teach `run_add_i()`, or for that matter
`run_add_p()`, to look specifically at that environment variable, but
the entire idea of passing in a parameter of type `struct repository *`
is to allow working on multiple repositories (and their index files)
independently.
So let's instead override the `index_file` field of that structure
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a straight-forward port of 2f0896ec3a (restore: support
--patch, 2019-04-25) which added support for `git restore -p`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch teaches the built-in `git add -p` machinery all the tricks it
needs to know in order to act as the work horse for `git checkout -p`.
Apart from the minor changes (slightly reworded messages, different
`diff` and `apply --check` invocations), it requires a new function to
actually apply the changes, as `git checkout -p` is a bit special in
that respect: when the desired changes do not apply to the index, but
apply to the work tree, Git does not fail straight away, but asks the
user whether to apply the changes to the worktree at least.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The scripted version of `git stash` called directly into the Perl script
`git-add--interactive.perl`, and this was faithfully converted to C.
However, we have a much better way to do this now: call the internal API
directly, which will now incidentally also respect the
`add.interactive.useBuiltin` setting. Let's just do this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As `git add` traditionally did not expose the `--patch=<mode>` modes via
command-line options, the scripted version of `git stash` had to call
`git add--interactive` directly.
But this prevents the built-in `add -p` from kicking in, as
`add--interactive` is the scripted version (which does not have a
"fall-back" to the built-in version).
So let's introduce support for internal switch for `git add` that the
scripted `git stash` can use to call the appropriate backend (scripted
or built-in, depending on `add.interactive.useBuiltin`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git stash` and `git reset` commands support a `--patch` option, and
both simply hand off to `git add -p` to perform that work. Let's teach
the built-in version of that command to be able to perform that work, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Perl script backing `git add -p` is used not only for that command,
but also for `git stash -p`, `git reset -p` and `git checkout -p`.
In preparation for teaching the C version of `git add -p` to support
also the latter commands, let's abstract away what is "stage" specific
into a dedicated data structure describing the differences between the
patch modes.
Finally, please note that the Perl version tries to make sure that the
diffs are only generated for the modified files. This is not actually
necessary, as the calls to Git's diff machinery already perform that
work, and perform it well. This makes it unnecessary to port the
`FILTER` field of the `%patch_modes` struct, as well as the
`get_diff_reference()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `--stdin` option was a well-established paradigm in other commands,
therefore we implemented it in `git reset` for use by Visual Studio.
Unfortunately, upstream Git decided that it is time to introduce
`--pathspec-from-file` instead.
To keep backwards-compatibility for some grace period, we therefore
reinstate the `--stdin` option on top of the `--pathspec-from-file`
option, but mark it firmly as deprecated.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The effort to move "git-add--interactive" to C continues.
* js/add-p-in-c:
built-in add -p: show helpful hint when nothing can be staged
built-in add -p: only show the applicable parts of the help text
built-in add -p: implement the 'q' ("quit") command
built-in add -p: implement the '/' ("search regex") command
built-in add -p: implement the 'g' ("goto") command
built-in add -p: implement hunk editing
strbuf: add a helper function to call the editor "on an strbuf"
built-in add -p: coalesce hunks after splitting them
built-in add -p: implement the hunk splitting feature
built-in add -p: show different prompts for mode changes and deletions
built-in app -p: allow selecting a mode change as a "hunk"
built-in add -p: handle deleted empty files
built-in add -p: support multi-file diffs
built-in add -p: offer a helpful error message when hunk navigation failed
built-in add -p: color the prompt and the help text
built-in add -p: adjust hunk headers as needed
built-in add -p: show colored hunks by default
built-in add -i: wire up the new C code for the `patch` command
built-in add -i: start implementing the `patch` functionality in C
Code cleanup.
* rs/ref-read-cleanup:
remote: pass NULL to read_ref_full() because object ID is not needed
refs: pass NULL to refs_read_ref_full() because object ID is not needed
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.
* ds/sparse-cone: (21 commits)
sparse-checkout: improve OS ls compatibility
sparse-checkout: respect core.ignoreCase in cone mode
sparse-checkout: check for dirty status
sparse-checkout: update working directory in-process for 'init'
sparse-checkout: cone mode should not interact with .gitignore
sparse-checkout: write using lockfile
sparse-checkout: use in-process update for disable subcommand
sparse-checkout: update working directory in-process
sparse-checkout: sanitize for nested folders
unpack-trees: add progress to clear_ce_flags()
unpack-trees: hash less in cone mode
sparse-checkout: init and set in cone mode
sparse-checkout: use hashmaps for cone patterns
sparse-checkout: add 'cone' mode
trace2: add region in clear_ce_flags
sparse-checkout: create 'disable' subcommand
sparse-checkout: add '--stdin' option to set subcommand
sparse-checkout: 'set' subcommand
clone: add --sparse mode
sparse-checkout: create 'init' subcommand
...
Redo "git name-rev" to avoid recursive calls.
* sg/name-rev-wo-recursion:
name-rev: cleanup name_ref()
name-rev: eliminate recursion in name_rev()
name-rev: use 'name->tip_name' instead of 'tip_name'
name-rev: drop name_rev()'s 'generation' and 'distance' parameters
name-rev: restructure creating/updating 'struct rev_name' instances
name-rev: restructure parsing commits and applying date cutoff
name-rev: pull out deref handling from the recursion
name-rev: extract creating/updating a 'struct name_rev' into a helper
t6120: add a test to cover inner conditions in 'git name-rev's name_rev()
name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation
name-rev: avoid unnecessary cast in name_ref()
name-rev: use strbuf_strip_suffix() in get_rev_name()
t6120-describe: modernize the 'check_describe' helper
t6120-describe: correct test repo history graph in comment
"git format-patch" can take a set of configured format.notes values
to specify which notes refs to use in the log message part of the
output. The behaviour of this was not consistent with multiple
--notes command line options, which has been corrected.
* dl/format-patch-notes-config-fixup:
notes.h: fix typos in comment
notes: break set_display_notes() into smaller functions
config/format.txt: clarify behavior of multiple format.notes
format-patch: move git_config() before repo_init_revisions()
format-patch: use --notes behavior for format.notes
notes: extract logic into set_display_notes()
notes: create init_display_notes() helper
notes: rename to load_display_notes()
A few more commands learned the "--pathspec-from-file" command line
option.
* am/pathspec-f-f-checkout:
checkout, restore: support the --pathspec-from-file option
doc: restore: synchronize <pathspec> description
doc: checkout: synchronize <pathspec> description
doc: checkout: fix broken text reference
doc: checkout: remove duplicate synopsis
add: support the --pathspec-from-file option
cmd_add: prepare for next patch
An earlier series to teach "--pathspec-from-file" to "git commit"
forgot to make the option incompatible with "--all", which has been
corrected.
* am/pathspec-from-file:
commit: forbid --pathspec-from-file --all
I forgot this in my previous patch `--pathspec-from-file` for
`git commit` [1]. When both `--pathspec-from-file` and `--all` were
specified, `--all` took precedence and `--pathspec-from-file` was
ignored. Before `--pathspec-from-file` was implemented, this case was
prevented by this check in `parse_and_validate_options()` :
die(_("paths '%s ...' with -a does not make sense"), argv[0]);
It is unfortunate that these two cases are disconnected. This came as
result of how the code was laid out before my patches, where `pathspec`
is parsed outside of `parse_and_validate_options()`. This branch is
already full of refactoring patches and I did not dare to go for another
one.
Fix by mirroring `die()` for `--pathspec-from-file` as well.
[1] Commit e440fc58 ("commit: support the --pathspec-from-file option" 2019-11-19)
Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a pointer that points at a constant string,
just give name directly to the constant string; this way, we
do not have to allocate a pointer variable in addition to
the string we want to use.
Let's convert `need_bad_and_good_revision_warning` and
`need_bisect_start_warning` char pointers to char arrays.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* dl/range-diff-with-notes:
range-diff: clear `other_arg` at end of function
range-diff: mark pointers as const
t3206: fix incorrect test name
"git rebase" did not work well when format.useAutoBase
configuration variable is set, which has been corrected.
* dl/rebase-with-autobase:
rebase: fix format.useAutoBase breakage
format-patch: teach --no-base
t4014: use test_config()
format-patch: fix indentation
t3400: demonstrate failure with format.useAutoBase
Reduce unnecessary reading of state variables back from the disk
during sequencer operation.
* ag/sequencer-todo-updates:
sequencer: directly call pick_commits() from complete_action()
rebase: fill `squash_onto' in get_replay_opts()
sequencer: move the code writing total_nr on the disk to a new function
sequencer: update `done_nr' when skipping commands in a todo list
sequencer: update `total_nr' when adding an item to a todo list
In the previous steps, we re-implemented the main loop of `git add -i`
in C, and most of the commands.
Notably, we left out the actual functionality of `patch`, as the
relevant code makes up more than half of `git-add--interactive.perl`,
and is actually pretty independent of the rest of the commands.
With this commit, we start to tackle that `patch` part. For better
separation of concerns, we keep the code in a separate file,
`add-patch.c`. The new code is still guarded behind the
`add.interactive.useBuiltin` config setting, and for the moment,
it can only be called via `git add -p`.
The actual functionality follows the original implementation of
5cde71d64a (git-add --interactive, 2006-12-10), but not too closely
(for example, we use string offsets rather than copying strings around,
and after seeing whether the `k` and `j` commands are applicable, in the
C version we remember which previous/next hunk was undecided, and use it
rather than looking again when the user asked to jump).
As a further deviation from that commit, We also use a comma instead of
a slash to separate the available commands in the prompt, as the current
version of the Perl script does this, and we also add a line about the
question mark ("print help") to the help text.
While it is tempting to use this conversion of `git add -p` as an excuse
to work on `apply_all_patches()` so that it does _not_ want to read a
file from `stdin` or from a file, but accepts, say, an `strbuf` instead,
we will refrain from this particular rabbit hole at this stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user uses the sparse-checkout feature in cone mode, they
add patterns using "git sparse-checkout set <dir1> <dir2> ..."
or by using "--stdin" to provide the directories line-by-line over
stdin. This behaviour naturally looks a lot like the way a user
would type "git add <dir1> <dir2> ..."
If core.ignoreCase is enabled, then "git add" will match the input
using a case-insensitive match. Do the same for the sparse-checkout
feature.
Perform case-insensitive checks while updating the skip-worktree
bits during unpack_trees(). This is done by changing the hash
algorithm and hashmap comparison methods to optionally use case-
insensitive methods.
When this is enabled, there is a small performance cost in the
hashing algorithm. To tease out the worst possible case, the
following was run on a repo with a deep directory structure:
git ls-tree -d -r --name-only HEAD |
git sparse-checkout set --stdin
The 'set' command was timed with core.ignoreCase disabled or
enabled. For the repo with a deep history, the numbers were
core.ignoreCase=false: 62s
core.ignoreCase=true: 74s (+19.3%)
For reproducibility, the equivalent test on the Linux kernel
repository had these numbers:
core.ignoreCase=false: 3.1s
core.ignoreCase=true: 3.6s (+16%)
Now, this is not an entirely fair comparison, as most users
will define their sparse cone using more shallow directories,
and the performance improvement from eb42feca97 ("unpack-trees:
hash less in cone mode" 2019-11-21) can remove most of the
hash cost. For a more realistic test, drop the "-r" from the
ls-tree command to store only the first-level directories.
In that case, the Linux kernel repository takes 0.2-0.25s in
each case, and the deep repository takes one second, plus or
minus 0.05s, in each case.
Thus, we _can_ demonstrate a cost to this change, but it is
unlikely to matter to any reasonable sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8164c961e1 (format-patch: use --notes behavior for format.notes,
2019-12-09), we introduced set_display_notes() which was a monolithic
function with three mutually exclusive branches. Break the function up
into three small and simple functions that each are only responsible for
one task.
This family of functions accepts an `int *show_notes` instead of
returning a value suitable for assignment to `show_notes`. This is for
two reasons. First of all, this guarantees that the external
`show_notes` variable changes in lockstep with the
`struct display_notes_opt`. Second, this prompts future developers to be
careful about doing something meaningful with this value. In fact, a
NULL check is intentionally omitted because causing a segfault here
would tell the future developer that they are meant to use the value for
something meaningful.
One alternative was making the family of functions accept a
`struct rev_info *` instead of the `struct display_notes_opt *`, since
the former contains the `show_notes` field as well. This does not work
because we have to call git_config() before repo_init_revisions().
However, if we had a `struct rev_info`, we'd need to initialize it before
it gets assigned values from git_config(). As a result, we break the
circular dependency by having standalone `int show_notes` and
`struct display_notes_opt notes_opt` variables which temporarily hold
values from git_config() until the values are copied over to `rev`.
To implement this change, we need to get a pointer to
`rev_info::show_notes`. Unfortunately, this is not possible with
bitfields and only direct-assignment is possible. Change
`rev_info::show_notes` to a non-bitfield int so that we can get its
address.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_ref_full() wraps refs_read_ref_full(), which in turn wraps
refs_resolve_ref_unsafe(), which handles a NULL oid pointer of callers
not interested in the resolved object ID. Make use of that feature to
document that mv() is such a caller.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5e82c3dd22 (bisect--helper: `bisect_reset` shell function in C,
2019-01-02), the `git bisect reset` subcommand was ported to C. When the
call to `git checkout` failed, an error message was reported to the
user.
However, this error message used the `strbuf` that had just been
released already. Let's switch that around: first use it, then release
it.
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hide lower-level verify_signed-buffer() API as a pure helper to
implement the public check_signature() function, in order to
encourage new callers to use the correct and more strict
validation.
* hi/gpg-use-check-signature:
gpg-interface: prefer check_signature() for GPG verification
The interaction between "git clone --recurse-submodules" and
alternate object store was ill-designed. The documentation and
code have been taught to make more clear recommendations when the
users see failures.
* jt/clone-recursesub-ref-advise:
submodule--helper: advise on fatal alternate error
Doc: explain submodule.alternateErrorStrategy
"git rebase -i" learned a few options that are known by "git
rebase" proper.
* ra/rebase-i-more-options:
rebase -i: finishing touches to --reset-author-date
rebase: add --reset-author-date
rebase -i: support --ignore-date
sequencer: rename amend_author to author_to_rename
rebase -i: support --committer-date-is-author-date
sequencer: allow callers of read_author_script() to ignore fields
rebase -i: add --ignore-whitespace flag