If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.
On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.
Signed-off-by: Karsten Blees <blees@dcon.de>
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.
Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.
Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.
The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.
Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.
Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.
Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.
Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.
Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).
Signed-off-by: Karsten Blees <blees@dcon.de>
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When spawning child processes, we do want them to inherit the standard
handles so that we can talk to them. We do *not* want them to inherit
any other handle, as that would hold a lock to the respective files
(preventing them from being renamed, modified or deleted), and the child
process would not know how to access that handle anyway.
Happily, there is an API to make that happen. It is supported in Windows
Vista and later, which is exactly what we promise to support in Git for
Windows for the time being.
This also means that we lift, at long last, the target Windows version
from Windows XP to Windows Vista.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In MSYS2, we have two Python interpreters at our disposal, so we can
include the Python stuff in the build.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Make `git config --system` work like you think it should on Windows: it
should edit the file that is located in `<Git>\etc\gitconfig`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It would appear that least the Isilon network filesystem (and possibly
other network filesystems, too), report non-standard error values when
trying to access a non-existing directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In Git for Windows, there is an option to mark the .git directory as
hidden. Our test cases verify this by using the system utility
`attrib.exe`.
This file name is unfortunately quite generic, and overlaps with a
Unix-y utility that might be hiding the system one in the `PATH`.
Let's specify explicitly which `attrib` to use.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
After writing to `stdout` and before reading from `stdin`, it is a good
idea to flush the former.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is a good idea in general to avoid pipes in test cases, as it makes
things more debuggable to have files to inspect (instead of ephemereal
piped data that is long gone).
This also seemed to work around a problem where MSYS2' Perl would
segfault which may, or may not, still be present today.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch fixes a corner case that is amazingly common in this
developer's workflow: in a `git stash -p`, splitting a hunk and stashing
only part of it runs into a (known) bug where the partial hunk cannot be
applied in reverse.
It is one of those "good enough" fixes, not a full fix, though, as the
full fix would require a 3-way merge between `stash^` and the *worktree*
(not `HEAD`), with `stash` as merge base (i.e. a `git revert`, but on
top of the current worktree).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is the final leg of the journey to a fully built-in `git add`: the
`git add -i` and `git add -p` modes were re-implemented in C, but they
lacked support for a couple of config settings.
The one that sticks out most is the `interactive.singleKey` setting: it
was not only particularly hard to get to work, especially on Windows. It
is also the setting that seems to be incomplete already in the Perl
version: while the name suggests that it applies to the main loop of
`git add --interactive`, or to the file selections in that command, it
does not. Only the `git add --patch` mode respects that setting.
As it is outside the purpose of the conversion of
`git-add--interactive.perl` to C, we will leave that loose end for some
future date.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At this stage on the journey to a fully built-in `git add`, we already
have everything we need, including the `--interactive` and `--patch`
options, as long as the `add.interactive.useBuiltin` setting is set to
`true` (kind of a "turned off feature flag", which it will be for a
while, until we get confident enough that the built-in version does the
job, and retire the Perl script).
However, the internal `add--interactive` helper is also used to back the
`--patch` option of `git stash`, `git reset` and `git checkout`.
This patch series brings them "online".
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Out of all the patch series on the journey to provide `git add
--interactive` and `git add --patch` in a built-in versions, this is the
big one, as can be expected from the fact that the `git add --patch`
functionality makes up over half of the 1,800+ lines of
`git-add--interactive.perl`.
The two patches that stick out are of course the ones to implement hunk
splitting and hunk editing: these operations are fundamentally more
complicated, and less obvious, than the rest of the operations.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
While re-implementing `git add -i` and `git add -p` in C, I tried to
make sure that there is test coverage for all of the features I convert
from Perl to C, to give me some confidence in the correctness from
running the test suite both with `GIT_TEST_ADD_I_USE_BUILTIN=true` and
with `GIT_TEST_ADD_I_USE_BUILTIN=false`.
However, I discovered that there are a couple of gaps. This patch series
intends to close them.
The first patch might actually not be considered a gap by some: it
basically removes the need for the `TTY` prerequisite in the `git add
-i` tests to verify that the output is colored all right. This change is
rather crucial for me, though: on Windows, where the conversion to a
built-in shows the most obvious benefits, there are no pseudo terminals
(yet), therefore `git.exe` cannot work with them (even if the MSYS2 Perl
interpreter used by Git for Windows knows about some sort of pty
emulation). And I *really* wanted to make sure that the colors work on
Windows, as I personally get a lot out of those color cues.
The patch series ends by addressing two issues that are not exactly
covering testing gaps:
- While adding a test case, I noticed that `git add -p` exited with
*success* when it could not even generate a diff. This is so obviously
wrong that I had to fix it right away (I noticed, actually, because my
in-progress built-in `git add -p` failed, and the Perl version did
not), and I used the same test case to verify that this is fixed once
and for all.
- While working on covering those test gaps, I noticed a problem in an
early version of the built-in version of `git add -p` where the `git
apply --allow-overlap` mode failed to work properly, for little
reason, and I fixed it real quick.
It would seem that the `--allow-overlap` function is not only
purposefully under-documented, but also purposefully under-tested,
probably to discourage its use. I do not quite understand what I
perceive to be Junio's aversion to that option, but I did not feel
like I should put up a battle here, so I did not accompany this fix
with a new test script.
In the end, the built-in version of `git add -p` does not use the
`--allow-overlap` function at all, anyway. Which should make everybody
a lot happier.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series implements the rest of the commands in `git add -i`'s
main loop: `update`, `revert`, `add_untracked`, `patch`, `diff`, and
`quit`. Apart from `quit`, these commands are all very similar in that
they first build a list of files, display it, let the user choose which
ones to act on, and then perform the action.
Note that the `patch` command is not actually converted to C, not
completely at least: the built-in version simply hands off to `git
add--interactive` after letting the user select which files to act on.
The reason for this omission is practicality. Out of the 1,800+ lines of
`git-add--interactive.perl`, over a thousand deal *just* with the `git
add -p` part. I did convert that functionality already (to be
contributed in a separate patch series), discovering that there is so
little overlap between the `git add --patch` part and the rest of `git
add --interactive` that I could put the former into a totally different
file: `add-patch.c`. Just a teaser ;-)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This job runs the test suite twice, once in regular mode, and once with
a whole slew of `GIT_TEST_*` variables set.
Now that the built-in version of `git add --interactive` is
feature-complete, let's also throw `GIT_TEST_MULTI_PACK_INDEX` into that
fray.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The built-in `git add -i` machinery obviously has its `the_repository`
structure initialized at the point where `cmd_commit()` calls it, and
therefore does not look at the environment variable `GIT_INDEX_FILE`.
But it has to, because the index was already locked, and we want to ask
the interactive add machinery to work on the `index.lock` file instead
of the `index` file.
Technically, we could teach `run_add_i()` (and `run_add_p()`) to look
specifically at that environment variable, but the entire idea of
passing in a parameter of type `struct repository *` is to allow working
on multiple repositories (and their index files) independently.
So let's instead override the `index_file` field of that structure
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In 7e9e048661 (stash -p: demonstrate failure of split with mixed y/n,
2015-04-16), a regression test for a known breakage that was added to
the test script `t3904-stash-patch.sh` that demonstrated that splitting
a hunk and trying to stash only part of that split hunk fails (but
shouldn't).
As expected, it still fails, but for the wrong reason: once the bug is
fixed, we would expect stderr to show nothing, yet the regression test
expects stderr to show something.
Let's fix that by telling that regression test case to expect nothing to
be printed to stderr.
While at it, also drop the obvious left-over from debugging where the
regression test did not mind `git stash -p` to return a non-zero exit
status.
Of course, the regression test still fails, but this time for the
correct reason.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When `interactive.singlekey = true`, we react immediately to keystrokes,
even to Escape sequences (e.g. when pressing a cursor key).
The problem with Escape sequences is that we do not really know when
they are done, and as a heuristic we poll standard input for half a
second to make sure that we got all of it.
While waiting half a second is not asking for a whole lot, it can become
quite annoying over time, therefore with this patch, we read the
terminal capabilities (if available) and extract known Escape sequences
from there, then stop polling immediately when we detected that the user
pressed a key that generated such a known sequence.
This recapitulates the remaining part of b5cc003253 (add -i: ignore
terminal escape sequences, 2011-05-17).
Note: We do *not* query the terminal capabilities directly. That would
either require a lot of platform-specific code, or it would require
linking to a library such as ncurses.
Linking to a library in the built-ins is something we try very hard to
avoid (we even kicked the libcurl dependency to a non-built-in remote
helper, just to shave off a tiny fraction of a second from Git's startup
time). And the platform-specific code would be a maintenance nightmare.
Even worse: in Git for Windows' case, we would need to query MSYS2
pseudo terminals, which `git.exe` simply cannot do (because it is
intentionally *not* an MSYS2 program).
To address this, we simply spawn `infocmp -L -1` and parse its output
(which works even in Git for Windows, because that helper is included in
the end-user facing installations).
This is done only once, as in the Perl version, but it is done only when
the first Escape sequence is encountered, not upon startup of `git add
-i`; This saves on startup time, yet makes reacting to the first Escape
sequence slightly more sluggish. But it allows us to keep the
terminal-related code encapsulated in the `compat/terminal.c` file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a straight-forward port of 2f0896ec3a (restore: support
--patch, 2019-04-25) which added support for `git restore -p`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This recapitulates part of b5cc003253 (add -i: ignore terminal escape
sequences, 2011-05-17):
add -i: ignore terminal escape sequences
On the author's terminal, the up-arrow input sequence is ^[[A, and
thus fat-fingering an up-arrow into 'git checkout -p' is quite
dangerous: git-add--interactive.perl will ignore the ^[ and [
characters and happily treat A as "discard everything".
As a band-aid fix, use Term::Cap to get all terminal capabilities.
Then use the heuristic that any capability value that starts with ^[
(i.e., \e in perl) must be a key input sequence. Finally, given an
input that starts with ^[, read more characters until we have read a
full escape sequence, then return that to the caller. We use a
timeout of 0.5 seconds on the subsequent reads to avoid getting stuck
if the user actually input a lone ^[.
Since none of the currently recognized keys start with ^[, the net
result is that the sequence as a whole will be ignored and the help
displayed.
Note that we leave part for later which uses "Term::Cap to get all
terminal capabilities", for several reasons:
1. it is actually not really necessary, as the timeout of 0.5 seconds
should be plenty sufficient to catch Escape sequences,
2. it is cleaner to keep the change to special-case Escape sequences
separate from the change that reads all terminal capabilities to
speed things up, and
3. in practice, relying on the terminal capabilities is a bit overrated,
as the information could be incomplete, or plain wrong. For example,
in this developer's tmux sessions, the terminal capabilities claim
that the "cursor up" sequence is ^[M, but the actual sequence
produced by the "cursor up" key is ^[[A.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch teaches the built-in `git add -p` machinery all the tricks it
needs to know in order to act as the work horse for `git checkout -p`.
Apart from the minor changes (slightly reworded messages, different
`diff` and `apply --check` invocations), it requires a new function to
actually apply the changes, as `git checkout -p` is a bit special in
that respect: when the desired changes do not apply to the index, but
apply to the work tree, Git does not fail straight away, but asks the
user whether to apply the changes to the worktree at least.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Perl version of `git add -p` supports this config setting to allow
users to input commands via single characters (as opposed to having to
press the <Enter> key afterwards).
This is an opt-in feature because it requires Perl packages
(Term::ReadKey and Term::Cap, where it tries to handle an absence of the
latter package gracefully) to work. Note that at least on Ubuntu, that
Perl package is not installed by default (it needs to be installed via
`sudo apt-get install libterm-readkey-perl`), so this feature is
probably not used a whole lot.
In C, we obviously do not have these packages available, but we just
introduced `read_single_keystroke()` that is similar to what
Term::ReadKey provides, and we use that here.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Typically, input on the command-line is line-based. It is actually not
really easy to get single characters (or better put: keystrokes).
We provide two implementations here:
- One that handles `/dev/tty` based systems as well as native Windows.
The former uses the `tcsetattr()` function to put the terminal into
"raw mode", which allows us to read individual keystrokes, one by one.
The latter uses `stty.exe` to do the same, falling back to direct
Win32 Console access.
Thanks to the refactoring leading up to this commit, this is a single
function, with the platform-specific details hidden away in
conditionally-compiled code blocks.
- A fall-back which simply punts and reads back an entire line.
Note that the function writes the keystroke into an `strbuf` rather than
a `char`, in preparation for reading Escape sequences (e.g. when the
user hit an arrow key). This is also required for UTF-8 sequences in
case the keystroke corresponds to a non-ASCII letter.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows' Git Bash runs in MinTTY by default, which does not have
a Win32 Console instance, but uses MSYS2 pseudo terminals instead.
This is a problem, as Git for Windows does not want to use the MSYS2
emulation layer for Git itself, and therefore has no direct way to
interact with that pseudo terminal.
As a workaround, use the `stty` utility (which is included in Git for
Windows, and which *is* an MSYS2 program, so it knows how to deal with
the pseudo terminal).
Note: If Git runs in a regular CMD or PowerShell window, there *is* a
regular Win32 Console to work with. This is not a problem for the MSYS2
`stty`: it copes with this scenario just fine.
Also note that we introduce support for more bits than would be
necessary for a mere `disable_echo()` here, in preparation for the
upcoming `enable_non_canonical()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We are about to introduce the function `enable_non_canonical()`, which
shares almost the complete code with `disable_echo()`.
Let's prepare for that, by refactoring out that shared code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Perl version of `git add -p` reads the config setting
`diff.algorithm` and if set, uses it to generate the diff using the
specified algorithm.
This patch ports that functionality to the C version.
To make sure that this works as intended, we add a regression test case
that tries to specify a bogus diff algorithm and then verifies that `git
diff-files` produced the expected error message.
Note: In that new test case, we actually ignore the exit code of `git
add -p`. The reason is that the C version exits with failure (as one
might expect), but the Perl version does not.
In fact, the Perl version continues happily after the uncolored diff
failed, trying to generate the colored diff, still not catching the
problem, and then it pretends to have succeeded (with exit code 0).
This is arguably a bug in the Perl version, and fixing it is safely
outside the scope of this patch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Perl version supports post-processing the colored diff (that is
generated in addition to the uncolored diff, intended to offer a
prettier user experience) by a command configured via that config
setting, and now the built-in version does that, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The scripted version of `git stash` called directly into the Perl script
`git-add--interactive.perl`, and this was faithfully converted to C.
However, we have a much better way to do this now: call `git add
--patch=<mode>`, which incidentally also respects the config setting
`add.interactive.useBuiltin`.
Let's do this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As `git add` traditionally did not expose the `--patch=<mode>` modes via
command-line options, the scripted version of `git stash` had to call
`git add--interactive` directly.
But this prevents the built-in `add -p` from kicking in, as
`add--interactive` is the Perl script.
So let's introduce support for an optional `<mode>` argument in `git add
--patch[=<mode>]`, and use that in the scripted version of `git stash
-p`, so that the built-in interactive add can do its job if configured.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>