Commit Graph

66157 Commits

Author SHA1 Message Date
Johannes Schindelin
2cfb5e5079 Merge pull request #971 from jeffhostetler/jeffhostetler/add_preload_fscache
add: use preload-index and fscache for performance
2016-11-29 23:29:21 +01:00
Johannes Schindelin
7d0be42bc7 Merge pull request #964 from jeffhostetler/jeffhostetler/memihash_perf
Jeffhostetler/memihash perf
2016-11-29 23:29:19 +01:00
Johannes Schindelin
2d11e784ec mingw: make readlink() independent of core.symlinks
Regardless whether we think we are able to create symbolic links, we
should always read them.

This fixes https://github.com/git-for-windows/git/issues/958

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:19 +01:00
Johannes Schindelin
d900b43b33 Merge pull request #955 from jeffhostetler/jeffhostetler/preload_index_perf
preload-index: avoid lstat for skip-worktree items
2016-11-29 23:29:18 +01:00
Johannes Schindelin
785cdc82bb Merge pull request #938 from virtuald/patch-1
git-cvsexportcommit.perl: Force crlf translation
2016-11-29 23:29:16 +01:00
Johannes Schindelin
e146b54b18 Merge branch 'reset-stdin'
This topic branch adds the (experimental) --stdin/-z options to `git
reset`. Those patches are still under review in the upstream Git project,
but are already merged in their experimental form into Git for Windows'
`master` branch, in preparation for a MinGit-only release.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:15 +01:00
Johannes Schindelin
84c89c2514 Merge branch 'mingw-strftime'
This topic branch works around an out-of-memory bug when the user
specified a format via --date=format:<format> that strftime() does
not like.

Reported by Stefan Naewe.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:14 +01:00
Johannes Schindelin
f10034d46a Unbreak interactive GPG prompt upon signing
With the recent update in efee955 (gpg-interface: check gpg signature
creation status, 2016-06-17), we ask GPG to send all status updates to
stderr, and then catch the stderr in an strbuf.

But GPG might fail, and send error messages to stderr. And we simply
do not show them to the user.

Even worse: this swallows any interactive prompt for a passphrase. And
detaches stderr from the tty so that the passphrase cannot be read.

So while the first problem could be fixed (by printing the captured
stderr upon error), the second problem cannot be easily fixed, and
presents a major regression.

So let's just revert commit efee9553a4.

This fixes https://github.com/git-for-windows/git/issues/871

Cc: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:13 +01:00
Johannes Schindelin
782960b172 Merge pull request #866 from landstander668/add_platform
Add reporting of build platform
2016-11-29 23:29:12 +01:00
Johannes Schindelin
f63d4a3568 Merge branch 'interactive-rebase'
This series of branches introduces the git-rebase--helper, a builtin
helping to accelerate the interactive rebase dramatically.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:10 +01:00
Johannes Schindelin
075a513e2b Merge branch 'unhidden-git'
It has been reported that core.hideDotFiles=false stopped working...
This topic branch fixes it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:09 +01:00
Johannes Schindelin
c5ab72dec7 Merge branch 'status-no-lock-index'
This branch allows third-party tools to call `git status
--no-lock-index` to avoid lock contention with the interactive Git usage
of the actual human user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:07 +01:00
Johannes Schindelin
86e65dd5e8 Merge pull request #797 from glhez/master
`git bundle create <bundle>` leaks handle the revlist is empty.
2016-11-29 23:29:06 +01:00
Johannes Schindelin
34e1d951b6 Merge 'release-gc-repack' into HEAD 2016-11-29 23:29:05 +01:00
Jeff Hostetler
0b70192613 add: use preload-index and fscache for performance
Teach "add" to use preload-index and fscache features
to improve performance on very large repositories.

During an "add", a call is made to run_diff_files()
which calls check_remove() for each index-entry.  This
calls lstat().  On Windows, the fscache code intercepts
the lstat() calls and builds a private cache using the
FindFirst/FindNext routines, which are much faster.

Somewhat independent of this, is the preload-index code
which distributes some of the start-up costs across
multiple threads.

We need to keep the call to read_cache() before parsing the
pathspecs (and hence cannot use the pathspecs to limit any preload)
because parse_pathspec() is using the index to determine whether a
pathspec is, in fact, in a submodule. If we would not read the index
first, parse_pathspec() would not error out on a path that is inside
a submodule, and t7400-submodule-basic.sh would fail with

	not ok 47 - do not add files from a submodule

We still want the nice preload performance boost, though, so we simply
call read_cache_preload(&pathspecs) after parsing the pathspecs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:03 +01:00
Johannes Schindelin
0da704a5ed Merge branch 'spawn-with-spaces'
This change lets us spawn .bat scripts whose paths contain spaces.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:03 +01:00
Johannes Schindelin
7cf6a82567 Export the preload_index() function
The purpose of this function is to stat() the files listed in the index
in a multi-threaded fashion. It is called directly after reading the
index in the read_index_preloaded() function.

However, in some cases we may want to separate the index reading from
the preloading step, e.g. in builtin/add.c, where we need to load the
index before we parse the pathspecs (which needs to error out if one of
the pathspecs refers to a path within a submodule, for which the index
must have been read already), and only then will we want to preload,
possibly limited by the just-parsed pathspecs.

So let's just export that function to allow calling it separately.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:03 +01:00
Jeff Hostetler
6ffde6a261 name-hash: remember previous dir_entry during lazy_init_name_hash
Teach hash_dir_entry() to remember the previously found dir_entry
during lazy_init_name_hash() iteration.  This is a performance
optimization.  Since items in the index array are sorted by full
pathname, adjacent items are likely to be in the same directory.
This can save memihash() computations and HashMap lookups.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Jeff Hostetler
1ab7b2d732 preload-index: avoid lstat for skip-worktree items
Teach preload-index to avoid lstat() calls for index-entries
with skip-worktree bit set.  This is a performance optimization.

During a sparse-checkout, the skip-worktree bit is set on items
that were not populated and therefore are not present in the
worktree.  The per-thread preload-index loop performs a series
of tests on each index-entry as it attempts to compare the
worktree version with the index and mark them up-to-date.
This patch short-cuts that work.

On a Windows 10 system with a very large repo (450MB index)
and various levels of sparseness, performance was improved
in the {preloadindex=true, fscache=false} case by 80% and
in the {preloadindex=true, fscache=true} case by 20% for various
commands.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Dustin Spicuzza
b81cd251a7 cvsexportcommit: force crlf translation
When using cvsnt + msys + git, it seems like the output of cvs status
had \r\n in it, and caused the command to fail.

This fixes that.

Signed-off-by: Dustin Spicuzza <dustin@virtualroadside.com>
2016-11-29 23:29:02 +01:00
Johannes Schindelin
afdae580f3 reset: support the experimental --stdin option
Just like with other Git commands, this option makes it read the paths
from the standard input. It comes in handy when resetting many, many
paths at once and wildcards are not an option (e.g. when the paths are
generated by a tool).

Note: we first parse the entire list and perform the actual reset action
only in a second phase. Not only does this make things simpler, it also
helps performance, as do_diff_cache() traverses the index and the
(sorted) pathspecs in simultaneously to avoid unnecessary lookups.

This feature is marked experimental because it is still under review in
the upstream Git project.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:02 +01:00
Johannes Schindelin
433f467adb mingw: abort on invalid strftime formats
On Windows, strftime() does not silently ignore invalid formats, but
warns about them and then returns 0 and sets errno to EINVAL.

Unfortunately, Git does not expect such a behavior, as it disagrees
with strftime()'s semantics on Linux. As a consequence, Git
misinterprets the return value 0 as "I need more space" and grows the
buffer. As the larger buffer does not fix the format, the buffer grows
and grows and grows until we are out of memory and abort.

Ideally, we would switch off the parameter validation just for
strftime(), but we cannot even override the invalid parameter handler
via _set_thread_local_invalid_parameter_handler() using MINGW because
that function is not declared. Even _set_invalid_parameter_handler(),
which *is* declared, does not help, as it simply does... nothing.

So let's just bite the bullet and override strftime() for MINGW and
abort on an invalid format string. While this does not provide the
best user experience, it is the best we can do.

See https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx for more
details.

This fixes https://github.com/git-for-windows/git/issues/863

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:29:02 +01:00
Jeff Hostetler
5ba3a95573 name-hash: specify initial size for istate.dir_hash table
Specify an initial size for the istate.dir_hash HashMap matching
the size of the istate.name_hash.

Previously hashmap_init() was given 0, causing a 64 bucket
hashmap to be created.  When working with very large
repositories, this would cause numerous rehash() calls to
realloc and rebalance the hashmap. This is especially true
when the worktree is deep, with many directories containing
a few files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Jeff Hostetler
baf663d69f name-hash: precompute hash values during preload-index
Precompute the istate.name_hash and istate.dir_hash values
for each cache-entry during the preload-index phase.

Move the expensive memihash() calculations from lazy_init_name_hash()
to the multi-threaded preload-index phase.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Jeff Hostetler
1ccde200c2 hashmap: allow memihash computation to be continued
Add variant of memihash() to allow the hash computation to
be continued.  There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory.  This can be expensive on large repositories.

With this, we can hash the parent directory first. And then
continue the computation to include the "/filename".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Jeff Hostetler
9f122a5ad5 name-hash: eliminate duplicate memihash call
Remove duplicate memihash() call in hash_dir_entry().
The existing code called memihash() to do the find_dir_entry()
and it not found, called memihash() again to do the hashmap_add().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2016-11-29 23:29:02 +01:00
Adric Norris
9a0f75c7f1 Preliminary support for reporting build platform
Add preliminary support for detection of the build plaform, and reporting
of same with the `git version --build-options' command. This can be useful
for bug reporting, to distinguish between 32 and 64-bit builds for
example.

The current implementation can only distinguish between x86 and x86_64.
This will be extended in future patches. In addition, all 32-bit variants
(i686, i586, etc.) are collapsed into `x86'. An example of the output is:

   $ git version --build-options
   git version 2.9.3.windows.2.826.g06c0f2f
   sizeof-long: 4
   machine: x86_64

The label of `machine' was chosen so the new information will approximate
the output of `uname -m'.

Signed-off-by: Adric Norris <landstander668@gmail.com>
2016-11-29 23:29:01 +01:00
Johannes Schindelin
1dc85597ae Merge 'rebase-i-extra' into HEAD 2016-11-29 23:29:00 +01:00
Johannes Schindelin
8fb1046062 Merge 'rebase--helper' into HEAD 2016-11-29 23:28:59 +01:00
Johannes Schindelin
13b660918d rebase -i: rearrange fixup/squash lines using the rebase--helper
This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case; Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
a10efccc51 Merge 'sequencer-i' into HEAD 2016-11-29 23:28:57 +01:00
Johannes Schindelin
7876d7fc49 rebase -i: use the rebase--helper builtin
Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.

Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.

Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.

Note: the test t3404 had to be adjusted because the the error messages
produced by the sequencer comply with our current convention to start with
a lower-case letter.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
838389009f t3415: test fixup with wrapped oneline
The `git commit --fixup` command unwraps wrapped onelines when
constructing the commit message, without wrapping the result.

We need to make sure that `git rebase --autosquash` keeps handling such
cases correctly, in particular since we are about to move the autosquash
handling into the rebase--helper.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
234cc86565 Add a builtin helper for interactive rebases
Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.

Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.

To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.

The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.

In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.

Once that is in place, we can work gradually on tackling the rest of the
technical debt.

Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
084dca28c6 rebase -i: skip unnecessary picks using the rebase--helper
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Note: The original code did not try to skip unnecessary picks of root
commits but punts instead (probably --root was not considered common
enough of a use case to bother optimizing). We do the same, for now.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
14e4885908 rebase -i: check for missing commits in the rebase--helper
In particular on Windows, where shell scripts are even more expensive
than on MacOSX or Linux, it makes sense to move a loop that forks
Git at least once for every line in the todo list into a builtin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
76eb1f5b7c t3404: relax rebase.missingCommitsCheck tests
These tests were a bit anal about the *exact* warning/error message
printed by git rebase. But those messages are intended for the *end
user*, therefore it does not make sense to test so rigidly for the
*exact* wording.

In the following, we will reimplement the missing commits check in
the sequencer, with slightly different words.

So let's just test for the parts in the warning/error message that
we *really* care about, nothing more, nothing less.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
5174dbde5d rebase -i: also expand/collapse the SHA-1s via the rebase--helper
This is crucial to improve performance on Windows, as the speed is now
mostly dominated by the SHA-1 transformation (because it spawns a new
rev-parse process for *every* line, and spawning processes is pretty
slow from Git for Windows' MSYS2 Bash).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
c03d74ecab rebase -i: do not invent onelines when expanding/collapsing SHA-1s
To avoid problems with short SHA-1s that become non-unique during the
rebase, we rewrite the todo script with short/long SHA-1s before and
after letting the user edit the script. Since SHA-1s are not intuitive
for humans, rebase -i also provides the onelines (commit message
subjects) in the script, purely for the user's convenience.

It is very possible to generate a todo script via different means than
rebase -i and then to let rebase -i run with it; In this case, these
onelines are not required.

And this is where the expand/collapse machinery has a bug: it *expects*
that oneline, and failing to find one reuses the previous SHA-1 as
"oneline".

It was most likely an oversight, and made implementation in the (quite
limiting) shell script language less convoluted. However, we are about
to reimplement performance-critical parts in C (and due to spawning a
git.exe process for every single line of the todo script, the
expansion/collapsing of the SHA-1s *is* performance-hampering on
Windows), therefore let's fix this bug to make cross-validation with the
C version of that functionality possible.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
9ab3fc063e rebase -i: remove useless indentation
The commands used to be indented, and it is nice to look at, but when we
transform the SHA-1s, the indentation is removed. So let's do away with it.

For the moment, at least: when we will use the upcoming rebase--helper
to transform the SHA-1s, we *will* keep the indentation and can
reintroduce it. Yet, to be able to validate the rebase--helper against
the output of the current shell script version, we need to remove the
extra indentation.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
f8a2ffd5db rebase -i: generate the script via rebase--helper
The first step of an interactive rebase is to generate the so-called "todo
script", to be stored in the state directory as "git-rebase-todo" and to
be edited by the user.

Originally, we adjusted the output of `git log <options>` using a simple
sed script. Over the course of the years, the code became more
complicated. We now use shell scripting to edit the output of `git log`
conditionally, depending whether to keep "empty" commits (i.e. commits
that do not change any files).

On platforms where shell scripting is not native, this can be a serious
drag. And it opens the door for incompatibilities between platforms when
it comes to shell scripting or to Unix-y commands.

Let's just re-implement the todo script generation in plain C, using the
revision machinery directly.

This is substantially faster, improving the speed relative to the
shell script version of the interactive rebase from 2x to 3x on Windows.

Note that the rearrange_squash() function in git-rebase--interactive
relied on the fact that we set the "format" variable to the config setting
rebase.instructionFormat. Relying on a side effect like this is no good,
hence we explicitly perform that assignment (possibly again) in
rearrange_squash().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:57 +01:00
Johannes Schindelin
1e8863326f sequencer (rebase -i): write out the final message
The shell script version of the interactive rebase has a very specific
final message. Teach the sequencer to print the same.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
f703457526 sequencer (rebase -i): write the progress into files
For the benefit of e.g. the shell prompt, the interactive rebase not
only displays the progress for the user to see, but also writes it into
the msgnum/end files in the state directory.

Teach the sequencer this new trick.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
8513b2daa0 sequencer (rebase -i): show the progress
The interactive rebase keeps the user informed about its progress.
If the sequencer wants to do the grunt work of the interactive
rebase, it also needs to show that progress.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
22cfbad766 sequencer (rebase -i): suggest --edit-todo upon unknown command
This is the same behavior as known from `git rebase -i`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
5f952b3b51 sequencer (rebase -i): show only failed cherry-picks' output
This is the behavior of the shell script version of the interactive
rebase, by using the `output` function defined in `git-rebase.sh`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
28c5e3bca0 sequencer (rebase -i): show only failed git commit's output
This is the behavior of the shell script version of the interactive
rebase, by using the `output` function defined in `git-rebase.sh`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
161d7c0521 run_command_opt(): optionally hide stderr when the command succeeds
This will be needed to hide the output of `git commit` when the
sequencer handles an interactive rebase's script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
5019ce10dd sequencer (rebase -i): differentiate between comments and 'noop'
In the upcoming patch, we will support rebase -i's progress
reporting. The progress skips comments but counts 'noop's.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00
Johannes Schindelin
7d92436b5f sequencer (rebase -i): implement the 'drop' command
The parsing part of a 'drop' command is almost identical to parsing a
'pick', while the operation is the same as that of a 'noop'.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2016-11-29 23:28:56 +01:00