Update enable_fscache() to take an optional initial size parameter which is
used to initialize the hashmap so that it can avoid having to rehash as
additional entries are added.
Add a separate disable_fscache() macro to make the code clearer and easier
to read.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
At the end of the status command, disable and free the fscache so that we
don't leak the memory and so that we can dump the fscache statistics.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
This is retry of #1419.
I added flush_fscache macro to flush cached stats after disk writing
with tests for regression reported in #1438 and #1442.
git checkout checks each file path in sorted order, so cache flushing does not
make performance worse unless we have large number of modified files in
a directory containing many files.
Using chromium repository, I tested `git checkout .` performance when I
delete 10 files in different directories.
With this patch:
TotalSeconds: 4.307272
TotalSeconds: 4.4863595
TotalSeconds: 4.2975562
Avg: 4.36372923333333
Without this patch:
TotalSeconds: 20.9705431
TotalSeconds: 22.4867685
TotalSeconds: 18.8968292
Avg: 20.7847136
I confirmed this patch passed all tests in t/ with core_fscache=1.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.
During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry. This
calls lstat(). On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.
Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.
We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with
not ok 47 - do not add files from a submodule
We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Fix racy fsmonitor
The `t7519-status-fsmonitor.sh` tests became a *lot* more flaky with the
recent fsmonitor fix (`js/fsmonitor-refresh-after-discarding-index`).
That fix, however, did not introduce the flakiness, but it just made it
much more likely to be hit. And it seemed to be hit *only* on Windows.
The reason, though, is that the fsmonitor feature failed to mark the
in-memory index as changed, i.e. in need of writing, and it was the
`has_racy_timestamp()` test that hid this bug in most cases (although a
lot less on Windows, where the files' mtimes are actually a lot more
accurate than on Linux).
This fixes https://github.com/gitgitgadget/git/issues/197
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch cleans up some left-overs that were forgotten when
removing the scripted `git rebase`.
As these patches are based on top of v2.22.0-rc1 (which *did* drop the
scripted `git-rebase.sh`), instead of v2.21.0 (on which the current `master`
of Git for Windows is based, and which did *not* yet drop the scripted
`git rebase`) it does not make sense to try to backport them to
`master`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch is a backport of
https://github.com/gitgitgadget/git/pull/208, which avoids a lock
contention in `git gc --auto` where the `git gc` process holds a read
lock to the `commit-graph` file (if `core.commitgraph=true`) and the
spawned `git commit-graph write` (if `gc.writecommitgraph=true`) tries
to overwrite it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
After writing to `stdout` and before reading from `stdin`, it is a good
idea to flush the former.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These fixes were necessary for Sverre Rabbelier's remote-hg to work,
but for some magic reason they are not necessary for the current
remote-hg. Makes you wonder how that one gets away with it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The only remaining scripted part of `git rebase` is the
`--preserve-merges` backend. Meaning: there is little reason to keep the
"library of common rebase functions" as a separate file.
While moving the functions to `git-rebase--preserve-merges.sh`, we also
drop the `move_to_original_branch` function that is no longer used.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We will need to pass down the `struct index_state` to
`mark_fsmonitor_valid()` for an upcoming bug fix, and this here function
calls that there function, so we need to extend the signature of
`fill_stat_cache_info()` first.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Since 21853626ea (built-in rebase: call `git am` directly, 2019-01-18),
the built-in rebase already uses the built-in `git am` directly.
Now that d03ebd411c (rebase: remove the rebase.useBuiltin setting,
2019-03-18) even removed the scripted rebase, there is no longer any
user of `git-rebase--am.sh`, so let's just remove it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The close_all_packs() method is now responsible for more than just pack-files.
It also closes the commit-graph and the multi-pack-index. Rename the function
to be more descriptive of its larger role. The name also fits because the
input parameter is a raw_object_store.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
For performance reasons `stdout` is buffered by default. That leads to
problems if after printing to `stdout` a read on `stdin` is performed.
For that reason interactive commands like `git clean -i` do not function
properly anymore if the `stdout` is not flushed by `fflush(stdout)` before
trying to read from `stdin`.
So let's precede all reads on `stdin` in `git clean -i` by flushing
`stdout`.
Signed-off-by: 마누엘 <nalla@hamal.uberspace.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In 06f5608c14 (bisect--helper: `bisect_start` shell function partially
in C, 2019-01-02), we introduced a call to `get_oid()` and did not check
whether it succeeded before using its output.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This function is marked as `NORETURN`, and it indeed does not want to
return anything. So let's not declare it with the return type `int`.
This fixes the following warning when building with MSVC:
C4646: function declared with 'noreturn' has non-void return type
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The scripted version of `git stash` called directly into the Perl script
`git-add--interactive.perl`, and this was faithfully converted to C.
However, we have a much better way to do this now: call `git add
--patch=<mode>`, which incidentally also respects the config setting
`add.interactive.useBuiltin`.
Let's do this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The built-in `git add -i` machinery obviously has its `the_repository`
structure initialized at the point where `cmd_commit()` calls it, and
therefore does not look at the environment variable `GIT_INDEX_FILE`.
But it has to, because the index was already locked, and we want to ask
the interactive add machinery to work on the `index.lock` file instead
of the `index` file.
Technically, we could teach `run_add_i()` (and `run_add_p()`) to look
specifically at that environment variable, but the entire idea of
passing in a parameter of type `struct repository *` is to allow working
on multiple repositories (and their index files) independently.
So let's instead override the `index_file` field of that structure
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch teaches the built-in `git add -p` machinery all the tricks it
needs to know in order to act as the work horse for `git checkout -p`.
Apart from the minor changes (slightly reworded messages, different
`diff` and `apply --check` invocations), it requires a new function to
actually apply the changes, as `git checkout -p` is a bit special in
that respect: when the desired changes do not apply to the index, but
apply to the work tree, Git does not fail straight away, but asks the
user whether to apply the changes to the worktree at least.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `git stash` and `git reset` commands support a `--patch` option, and
both simply hand off to `git add -p` to perform that work. Let's teach
the built-in version of `git add -p` do perform that work, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As `git add` traditionally did not expose the `--patch=<mode>` modes via
command-line options, `git stash` had to call `git add--interactive`
directly.
But this prevents the built-in `add -p` from kicking in, as
`add--interactive` is the Perl script.
So let's introduce support for an optional `<mode>` argument in `git add
--patch[=<mode>]`, and use that in `git stash -p`, so that the built-in
interactive add can do its job if configured.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Perl script backing `git add -p` is used not only for that command,
but also for `git stash -p`, `git reset -p` and `git checkout -p`.
In preparation for teaching the C version of `git add -p` to support
also the latter commands, let's abstract away what is "stage" specific
into a dedicated data structure describing the differences between the
patch modes.
As we prepare for calling the built-in `git add -p` in
`run_add_interactive()` via code paths that have not let `add_config()`
do its work, we have to make sure to re-parse the config using that
function in those cases.
Finally, please note that the Perl version tries to make sure that the
diffs are only generated for the modified files. This is not actually
necessary, as the calls to Git's diff machinery already perform that
work, and perform it well. This makes it unnecessary to port the
`FILTER` field of the `%patch_modes` struct, as well as the
`get_diff_reference()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In the previous steps, we re-implemented the main loop of `git add -i`
in C, and most of the commands.
Notably, we left out the actual functionality of `patch`, as the
relevant code makes up more than half of `git-add--interactive.perl`,
and is actually pretty independent of the rest of the commands.
With this commit, we start to tackle that `patch` part. For better
separation of concerns, we keep the code in a separate file,
`add-patch.c`. The new code is still guarded behind the
`add.interactive.useBuiltin` config setting, and for the moment,
it can only be called via `git add -p`.
The actual functionality follows the original implementation of
5cde71d64a (git-add --interactive, 2006-12-10), but not too closely
(for example, we use string offsets rather than copying strings around,
and we also remember which previous/next hunk was undecided, rather than
looking again when the user asked to jump there).
As a further deviation from that commit, We also use a comma instead of
a slash to separate the available commands in the prompt, as the current
version of the Perl script does this, and we also add a line about the
question mark ("print help") to the help text.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is hardly the first conversion of a Git command that is implemented
as a script to a built-in. So far, the most successful strategy for such
conversions has been to add a built-in helper and call that for more and
more functionality from the script, as more and more parts are
converted.
With the interactive add, we choose a different strategy. The sole
reason for this is that on Windows (where such a conversion has the most
benefits in terms of speed and robustness) we face the very specific
problem that a `system()` call in Perl seems to close `stdin` in the
parent process when the spawned process consumes even one character from
`stdin`. And that just does not work for us here, as it would stop the
main loop as soon as any interactive command was performed by the
helper. Which is almost all of the commands in `git add -i`.
It is almost as if Perl told us once again that it does not want us to
use it on Windows.
Instead, we follow the opposite route where we start with a bare-bones
version of the built-in interactive add, guarded by the new
`add.interactive.useBuiltin` config variable, and then add more and more
functionality to it, until it is feature complete.
At this point, the built-in version of `git add -i` only states that it
cannot do anything yet ;-)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).
Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.
This feature is marked experimental because it is still under review in
the upstream Git project.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In bff014dac7 (builtin rebase: support the `verbose` and `diffstat`
options, 2018-09-04), we added a line that wanted to remove the
`REBASE_DIFFSTAT` bit from the flags, but it used an incorrect negation.
Found by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--dir-diff" mode of "git difftool" is not useful in "--no-index"
mode; they are now explicitly marked as mutually incompatible.
* js/difftool-no-index:
difftool --no-index: error out on --dir-diff (and don't crash)
The code to generate the multi-pack idx file was not prepared to
see too many packfiles and ran out of open file descriptor, which
has been corrected.
* ds/midx-too-many-packs:
midx: add packs to packed_git linked list
midx: pass a repository pointer
On a filesystem like HFS+, the names of the refs stored as filesystem
entities may become different from what the end-user expects, just
like files in the working tree get "renamed". Work around the
mismatch by paying attention to the core.precomposeUnicode
configuration.
* en/unicode-in-refnames:
Honor core.precomposeUnicode in more places
Update "git difftool" and "git mergetool" so that the combinations
of {diff,merge}.{tool,guitool} configuration variables serve as
fallback settings of each other in a sensible order.
* dl/difftool-mergetool:
difftool: fallback on merge.guitool
difftool: make --gui, --tool and --extcmd mutually exclusive
mergetool: fallback to tool when guitool unavailable
mergetool--lib: create gui_mode function
mergetool: use get_merge_tool function
t7610: add mergetool --gui tests
t7610: unsuppress output
Attempt to use an abbreviated option in "git clone --recurs" is
responded by a request to disambiguate between --recursive and
--recurse-submodules, which is bad because these two are synonyms.
The parse-options API has been extended to define such synonyms
more easily and not produce an unnecessary failure.
* nd/parse-options-aliases:
parse-options: don't emit "ambiguous option" for aliases
"git chery-pick" (and "revert" that shares the same runtime engine)
that deals with multiple commits got confused when the final step
gets stopped with a conflict and the user concluded the sequence
with "git commit". Attempt to fix it by cleaning up the state
files used by these commands in such a situation.
* pw/clean-sequencer-state-upon-final-commit:
fix cherry-pick/revert status after commit
commit/reset: try to clean up sequencer state
The internal implementation of "git rebase -i" has been updated to
avoid forking a separate "rebase--interactive" process.
* pw/rebase-i-internal:
rebase -i: run without forking rebase--interactive
rebase: use a common action enum
rebase -i: use struct rebase_options in do_interactive_rebase()
rebase -i: use struct rebase_options to parse args
rebase -i: use struct object_id for squash_onto
rebase -i: use struct commit when parsing options
rebase -i: remove duplication
rebase -i: combine rebase--interactive.c with rebase.c
rebase: use OPT_RERERE_AUTOUPDATE()
rebase: rename write_basic_state()
rebase: don't translate trace strings
sequencer: always discard index after checkout
The connectivity bitmaps are created by default in bare
repositories now; also the pathname hash-cache is created by
default to avoid making crappy deltas when repacking.
* ew/repack-with-bitmaps-by-default:
pack-objects: default to writing bitmap hash-cache
t5310: correctly remove bitmaps for jgit test
repack: enable bitmaps by default on bare repos
During an initial "git clone --depth=..." partial clone, it is
pointless to spend cycles for a large portion of the connectivity
check that enumerates and skips promisor objects (which by
definition is all objects fetched from the other side). This has
been optimized out.
* js/partial-clone-connectivity-check:
t/perf: add perf script for partial clones
clone: do faster object check for partial clones
In git-difftool.txt, it says
'git difftool' falls back to 'git mergetool' config variables when the
difftool equivalents have not been defined.
However, when `diff.guitool` is missing, it doesn't fallback to
anything. Make git-difftool fallback to `merge.guitool` when `diff.guitool` is
missing.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-difftool, these options specify which tool to ultimately run. As
a result, they are logically conflicting. Explicitly disallow these
options from being used together.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `--no-index` mode, we now no longer require a worktree nor a
repository. But some code paths in `difftool` expect those to be
present.
The most notable such code path is the `--dir-diff` one: we use the
existing checkout machinery to copy the files, and that machinery looks
up replacement refs, looks at alternate ODBs, wants to use the worktree
path, etc.
Rather than running into segmentation faults, let's die with an
informative error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>