Commit Graph

81568 Commits

Author SHA1 Message Date
Johannes Schindelin
446c21d3cc fscache: add a test for the dir-not-found optimization
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:54:56 +01:00
Jeff Hostetler
37cafeaea7 fscache: remember not-found directories
Teach FSCACHE to remember "not found" directories.

This is a performance optimization.

FSCACHE is a performance optimization available for Windows.  It
intercepts Posix-style lstat() calls into an in-memory directory
using FindFirst/FindNext.  It improves performance on Windows by
catching the first lstat() call in a directory, using FindFirst/
FindNext to read the list of files (and attribute data) for the
entire directory into the cache, and short-cut subsequent lstat()
calls in the same directory.  This gives a major performance
boost on Windows.

However, it does not remember "not found" directories.  When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply return ENOENT for the
file -- it does not remember that the FindFirst on the directory
failed. Thus subsequent lstat() calls in the same directory, each
re-attempt the FindFirst.  This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo and
then doing a read-tree to reset the skip-worktree bits and then
running status.

This change reduced status times for my very large repo by 60%.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:54:56 +01:00
Jeff Hostetler
56e3d74eb7 fscache: add key for GIT_TRACE_FSCACHE
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:54:56 +01:00
Johannes Schindelin
63ed44f974 Merge branch 'long-paths'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:54:54 +01:00
Karsten Blees
d6b4447ead Win32: fix 'lstat("dir/")' with long paths
Use a suffciently large buffer to strip the trailing slash.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:47 +01:00
Karsten Blees
78daacc71b Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199

Note that the test cannot rely on the presence of short names, as they
are not enabled by default except on the system drive.

[jes: adjusted test number to avoid conflicts, reinstated && chain,
adjusted test to work without short names]

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:47 +01:00
Johannes Schindelin
4a67e4d461 Win32: support long paths
Windows paths are typically limited to MAX_PATH = 260 characters, even
though the underlying NTFS file system supports paths up to 32,767 chars.
This limitation is also evident in Windows Explorer, cmd.exe and many
other applications (including IDEs).

Particularly annoying is that most Windows APIs return bogus error codes
if a relative path only barely exceeds MAX_PATH in conjunction with the
current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the
infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG.

Many Windows wide char APIs support longer than MAX_PATH paths through the
file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path.
Notable exceptions include functions dealing with executables and the
current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as
well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...).

Introduce a handle_long_path function to check the length of a specified
path properly (and fail with ENAMETOOLONG), and to optionally expand long
paths using the '\\?\' file namespace prefix. Short paths will not be
modified, so we don't need to worry about device names (NUL, CON, AUX).

Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be
limited to MAX_PATH (at least not on Win7), so we can use it to do the
heavy lifting of the conversion (translate '/' to '\', eliminate '.' and
'..', and make an absolute path).

Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH
limit.

Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs
that support long paths.

While improved error checking is always active, long paths support must be
explicitly enabled via 'core.longpaths' option. This is to prevent end
users to shoot themselves in the foot by checking out files that Windows
Explorer, cmd/bash or their favorite IDE cannot handle.

Test suite:
Test the case is when the full pathname length of a dir is close
to 260 (MAX_PATH).
Bug report and an original reproducer by Andrey Rogozhnikov:
https://github.com/msysgit/git/pull/122#issuecomment-43604199

[jes: adjusted test number to avoid conflicts]

Thanks-to: Martin W. Kirst <maki@bitkings.de>
Thanks-to: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:46 +01:00
Doug Kelly
e27bc9d0d0 Add a test demonstrating a problem with long submodule paths
[jes: adusted test number to avoid conflicts, fixed non-portable use of
the 'export' statement]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:46 +01:00
Karsten Blees
bbc220591a fscache: load directories only once
If multiple threads access a directory that is not yet in the cache, the
directory will be loaded by each thread. Only one of the results is added
to the cache, all others are leaked. This wastes performance and memory.

On cache miss, add a future object to the cache to indicate that the
directory is currently being loaded. Subsequent threads register themselves
with the future object and wait. When the first thread has loaded the
directory, it replaces the future object with the result and notifies
waiting threads.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:42 +01:00
Karsten Blees
b2339a2fde Win32: add a cache below mingw's lstat and dirent implementations
Checking the work tree status is quite slow on Windows, due to slow lstat
emulation (git calls lstat once for each file in the index). Windows
operating system APIs seem to be much better at scanning the status
of entire directories than checking single files.

Add an lstat implementation that uses a cache for lstat data. Cache misses
read the entire parent directory and add it to the cache. Subsequent lstat
calls for the same directory are served directly from the cache.

Also implement opendir / readdir / closedir so that they create and use
directory listings in the cache.

The cache doesn't track file system changes and doesn't plug into any
modifying file APIs, so it has to be explicitly enabled for git functions
that don't modify the working copy.

Note: in an earlier version of this patch, the cache was always active and
tracked file system changes via ReadDirectoryChangesW. However, this was
much more complex and had negative impact on the performance of modifying
git commands such as 'git checkout'.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:42 +01:00
Karsten Blees
35db3b7717 add infrastructure for read-only file system level caches
Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.

This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.

Enable read-only sections for 'git status' and preload_index.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:42 +01:00
Karsten Blees
e778d71e77 Win32: make the lstat implementation pluggable
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:41 +01:00
Karsten Blees
42194ac36c Win32: Make the dirent implementation pluggable
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty staightforward, however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows to choose the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:41 +01:00
Karsten Blees
341c551ea3 Win32: dirent.c: Move opendir down
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:40 +01:00
Karsten Blees
c820a55573 Win32: make FILETIME conversion functions public
Signed-off-by: Karsten Blees <blees@dcon.de>
2018-03-23 13:51:40 +01:00
Johannes Schindelin
43a6e74f32 mingw: unset PERL5LIB by default
Git for Windows ships with its own Perl interpreter, and insists on
using it, so it will most likely wreak havoc if PERL5LIB is set before
launching Git.

Let's just unset that environment variables when spawning processes.

To make this feature extensible (and overrideable), there is a new
config setting `core.unsetenvvars` that allows specifying a
comma-separated list of names to unset before spawning processes.

Reported by Gabriel Fuhrmann.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:37 +01:00
Johannes Schindelin
b16ce0daeb Move Windows-specific config settings into compat/mingw.c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:36 +01:00
Johannes Schindelin
557d2f0ac9 Allow for platform-specific core.* config settings
In the Git for Windows project, we have ample precendent for config
settings that apply to Windows, and to Windows only.

Let's formalize this concept by introducing a platform_core_config()
function that can be #define'd in a platform-specific manner.

This will allow us to contain platform-specific code better, as the
corresponding variables no longer need to be exported so that they can
be defined in environment.c and be set in config.c

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:36 +01:00
Johannes Schindelin
7d13c75997 config: rename dummy parameter to cb in git_default_config()
This is the convention elsewhere (and prepares for the case where we may
need to pass callback data).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-03-23 13:51:36 +01:00
Johannes Schindelin
72d69705f7 Start the merging-rebase to v2.16.3
This commit starts the rebase of a511ce0613 to b64ca2b4ff
2018-03-23 13:51:13 +01:00
Junio C Hamano
d32eb83c1d Git 2.16.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v2.16.3
2018-03-22 14:24:45 -07:00
Junio C Hamano
88595ebceb Merge branch 'ms/non-ascii-ticks' into maint
Doc markup fix.

* ms/non-ascii-ticks:
  Documentation/gitsubmodules.txt: avoid non-ASCII apostrophes
2018-03-22 14:24:26 -07:00
Junio C Hamano
393eee1cad Merge branch 'jk/cached-commit-buffer' into maint
Code clean-up.

* jk/cached-commit-buffer:
  revision: drop --show-all option
  commit: drop uses of get_cached_commit_buffer()
2018-03-22 14:24:25 -07:00
Junio C Hamano
c9bc2c5d4d Merge branch 'sm/mv-dry-run-update' into maint
Code clean-up.

* sm/mv-dry-run-update:
  mv: remove unneeded 'if (!show_only)'
  t7001: add test case for --dry-run
2018-03-22 14:24:25 -07:00
Junio C Hamano
342215be59 Merge branch 'tg/worktree-create-tracking' into maint
Hotfix for a recent topic.

* tg/worktree-create-tracking:
  git-worktree.txt: fix indentation of example and text of 'add' command
  git-worktree.txt: fix missing ")" typo
2018-03-22 14:24:24 -07:00
Junio C Hamano
8bfeb0e42c Merge branch 'gs/test-unset-xdg-cache-home' into maint
Test update.

* gs/test-unset-xdg-cache-home:
  test-lib.sh: unset XDG_CACHE_HOME
2018-03-22 14:24:24 -07:00
Junio C Hamano
e09224812a Merge branch 'sb/status-doc-fix' into maint
Docfix.

* sb/status-doc-fix:
  Documentation/git-status: clarify status table for porcelain mode
2018-03-22 14:24:23 -07:00
Junio C Hamano
9ea8e0ca81 Merge branch 'rd/typofix' into maint
Typofix.

* rd/typofix:
  Correct mispellings of ".gitmodule" to ".gitmodules"
  t/: correct obvious typo "detahced"
2018-03-22 14:24:22 -07:00
Junio C Hamano
5a03f1d75a Merge branch 'bp/fsmonitor' into maint
Doc update for a recently added feature.

* bp/fsmonitor:
  fsmonitor: update documentation to remove reference to invalid config settings
2018-03-22 14:24:21 -07:00
Junio C Hamano
dfc20a5e3c Merge branch 'bc/doc-interpret-trailers-grammofix' into maint
Docfix.

* bc/doc-interpret-trailers-grammofix:
  docs/interpret-trailers: fix agreement error
2018-03-22 14:24:21 -07:00
Junio C Hamano
68559c464a Merge branch 'sg/doc-test-must-fail-args' into maint
Devdoc update.

* sg/doc-test-must-fail-args:
  t: document 'test_must_fail ok=<signal-name>'
2018-03-22 14:24:20 -07:00
Junio C Hamano
67b7dd3d86 Merge branch 'rj/sparse-updates' into maint
Devtool update.

* rj/sparse-updates:
  Makefile: suppress a sparse warning for pack-revindex.c
  config.mak.uname: remove SPARSE_FLAGS setting for cygwin
2018-03-22 14:24:19 -07:00
Junio C Hamano
2e1062d30f Merge branch 'jk/gettext-poison' into maint
Test updates.

* jk/gettext-poison:
  git-sh-i18n: check GETTEXT_POISON before USE_GETTEXT_SCHEME
  t0205: drop redundant test
2018-03-22 14:24:19 -07:00
Junio C Hamano
34f6f0eca2 Merge branch 'nd/ignore-glob-doc-update' into maint
Doc update.

* nd/ignore-glob-doc-update:
  gitignore.txt: elaborate shell glob syntax
2018-03-22 14:24:18 -07:00
Junio C Hamano
fda2326cb7 Merge branch 'rs/cocci-strbuf-addf-to-addstr' into maint
* rs/cocci-strbuf-addf-to-addstr:
  cocci: simplify check for trivial format strings
2018-03-22 14:24:17 -07:00
Junio C Hamano
e55521be8d Merge branch 'jc/worktree-add-short-help' into maint
Error message fix.

* jc/worktree-add-short-help:
  worktree: say that "add" takes an arbitrary commit in short-help
2018-03-22 14:24:17 -07:00
Junio C Hamano
9c34129e6b Merge branch 'tz/doc-show-defaults-to-head' into maint
Doc update.

* tz/doc-show-defaults-to-head:
  doc: mention 'git show' defaults to HEAD
2018-03-22 14:24:17 -07:00
Junio C Hamano
3112c3fa7f Merge branch 'nd/shared-index-fix' into maint
Code clean-up.

* nd/shared-index-fix:
  read-cache: don't write index twice if we can't write shared index
  read-cache.c: move tempfile creation/cleanup out of write_shared_index
  read-cache.c: change type of "temp" in write_shared_index()
2018-03-22 14:24:16 -07:00
Junio C Hamano
bffce882fd Merge branch 'jc/mailinfo-cleanup-fix' into maint
Corner case bugfix.

* jc/mailinfo-cleanup-fix:
  mailinfo: avoid segfault when can't open files
2018-03-22 14:24:16 -07:00
Junio C Hamano
b502aa4f45 Merge branch 'rb/hashmap-h-compilation-fix' into maint
Code clean-up.

* rb/hashmap-h-compilation-fix:
  hashmap.h: remove unused variable
2018-03-22 14:24:15 -07:00
Junio C Hamano
9bcb48912c Merge branch 'rs/describe-unique-abbrev' into maint
Code clean-up.

* rs/describe-unique-abbrev:
  describe: use strbuf_add_unique_abbrev() for adding short hashes
2018-03-22 14:24:14 -07:00
Junio C Hamano
60736db161 Merge branch 'ks/submodule-doc-updates' into maint
Doc updates.

* ks/submodule-doc-updates:
  Doc/git-submodule: improve readability and grammar of a sentence
  Doc/gitsubmodules: make some changes to improve readability and syntax
2018-03-22 14:24:14 -07:00
Junio C Hamano
b1bdf46bb8 Merge branch 'cl/t9001-cleanup' into maint
Test clean-up.

* cl/t9001-cleanup:
  t9001: use existing helper in send-email test
2018-03-22 14:24:13 -07:00
Junio C Hamano
6ef449ea77 Merge branch 'bw/oidmap-autoinit' into maint
Code clean-up.

* bw/oidmap-autoinit:
  oidmap: ensure map is initialized
2018-03-22 14:24:12 -07:00
Junio C Hamano
dab684ff43 Merge branch 'sg/test-i18ngrep' into maint
Test fixes.

* sg/test-i18ngrep:
  t: make 'test_i18ngrep' more informative on failure
  t: validate 'test_i18ngrep's parameters
  t: move 'test_i18ncmp' and 'test_i18ngrep' to 'test-lib-functions.sh'
  t5536: let 'test_i18ngrep' read the file without redirection
  t5510: consolidate 'grep' and 'test_i18ngrep' patterns
  t4001: don't run 'git status' upstream of a pipe
  t6022: don't run 'git merge' upstream of a pipe
  t5812: add 'test_i18ngrep's missing filename parameter
  t5541: add 'test_i18ngrep's missing filename parameter
2018-03-22 14:24:12 -07:00
Junio C Hamano
d78b7eb2d5 Merge branch 'jt/fsck-code-cleanup' into maint
Plug recently introduced leaks in fsck.

* jt/fsck-code-cleanup:
  fsck: fix leak when traversing trees
2018-03-22 14:24:12 -07:00
Junio C Hamano
34b9ec8dd9 Merge branch 'ew/svn-branch-segfault-fix' into maint
Workaround for segfault with more recent versions of SVN.

* ew/svn-branch-segfault-fix:
  git-svn: control destruction order to avoid segfault
2018-03-22 14:24:11 -07:00
Junio C Hamano
091853a1aa Merge branch 'nd/list-merge-strategy' into maint
Completion of "git merge -s<strategy>" (in contrib/) did not work
well in non-C locale.

* nd/list-merge-strategy:
  completion: fix completing merge strategies on non-C locales
2018-03-22 14:24:11 -07:00
Junio C Hamano
f936c9b393 Merge branch 'jk/daemon-fixes' into maint
Assorted fixes to "git daemon".

* jk/daemon-fixes:
  daemon: fix length computation in newline stripping
  t/lib-git-daemon: add network-protocol helpers
  daemon: handle NULs in extended attribute string
  daemon: fix off-by-one in logging extended attributes
  t/lib-git-daemon: record daemon log
  t5570: use ls-remote instead of clone for interp tests
2018-03-22 14:24:11 -07:00
Junio C Hamano
b0e0fc267b Merge branch 'tg/split-index-fixes' into maint
The split-index mode had a few corner case bugs fixed.

* tg/split-index-fixes:
  travis: run tests with GIT_TEST_SPLIT_INDEX
  split-index: don't write cache tree with null oid entries
  read-cache: fix reading the shared index for other repos
2018-03-22 14:24:10 -07:00