Thorough benchmarking with repacking a subset of linux.git (the commit
history reachable from 93a6fefe2f ([PATCH] fix the SYSCTL=n compilation,
2007-02-28), to be precise) suggest that this allocator is on par, in
multi-threaded situations maybe even better than nedmalloc:
`git repack -adfq` with mimalloc, 8 threads:
31.166991900 27.576763800 28.712311000 27.373859000 27.163141900
`git repack -adfq` with nedmalloc, 8 threads:
31.915032900 27.149883100 28.244933700 27.240188800 28.580849500
In a different test using GitHub Actions build agents (probably
single-threaded, a core-strength of nedmalloc)):
`git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with mimalloc:
943.426 978.500 939.709 959.811 954.605
`git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with nedmalloc:
995.383 952.179 943.253 963.043 980.468
While these measurements were not executed with complete scientific
rigor, as no hardware was set aside specifically for these benchmarks,
it shows that mimalloc and nedmalloc perform almost the same, nedmalloc
with a bit higher variance and also slightly higher average (further
testing suggests that nedmalloc performs worse in multi-threaded
situations than in single-threaded ones).
In short: mimalloc seems to be slightly better suited for our purposes
than nedmalloc.
Seeing that mimalloc is developed actively, while nedmalloc ceased to
see any updates in eight years, let's use mimalloc on Windows instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Always use the internal "use_weak" random seed when initializing
the "mimalloc" heap when statically linked on Windows.
The imported "mimalloc" routines support several random sources
to seed the heap data structures, including BCrypt.dll and
RtlGenRandom. Crashes have been reported when using BCrypt.dll
if it initialized during an `atexit()` handler function. Granted,
such DLL initialization should not happen in an atexit handler,
but yet the crashes remain.
It should be noted that on Windows when statically linked, the
mimalloc startup code (called by the GCC CRT to initialize static
data prior to calling `main()`) always uses the internal "weak"
random seed. "mimalloc" does not try to load an alternate
random source until after the OS initialization has completed.
Heap data is stored in `__declspec(thread)` TLS data and in theory
each Git thread will have its own heap data. However, testing
shows that the "mimalloc" library doesn't actually call
`os_random_buf()` (to load a new random source) when creating these
new per-thread heap structures.
However, if an atexit handler is forced to run on a non-main
thread, the "mimalloc" library *WILL* try to create a new heap
and seed it with `os_random_buf()`. (The reason for this is still
a mystery to this author.) The `os_random_buf()` call can cause
the (previously uninitialized BCrypt.dll library) to be dynamically
loaded and a call made into it. Crashes have been reported in
v2.40.1.vfs.0.0 while in this call.
As a workaround, the fix here forces the use of the internal
"use_weak" random code for the subsequent `os_random_buf()` calls.
Since we have been using that random generator for the majority
of the program, it seems safe to use it for the final few mallocs
in the atexit handler (of which there really shouldn't be that many.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
By defining `USE_MIMALLOC`, Git can now be compiled with that
nicely-fast and small allocator.
Note that we have to disable a couple `DEVELOPER` options to build
mimalloc's source code, as it makes heavy use of declarations after
statements, among other things that disagree with Git's conventions.
We even have to silence some GCC warnings in non-DEVELOPER mode. For
example, the `-Wno-array-bounds` flag is needed because in `-O2` builds,
trying to call `NtCurrentTeb()` (which `_mi_thread_id()` does on
Windows) causes the bogus warning about a system header, likely related
to https://sourceforge.net/p/mingw-w64/mailman/message/37674519/ and to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578:
C:/git-sdk-64-minimal/mingw64/include/psdk_inc/intrin-impl.h:838:1:
error: array subscript 0 is outside array bounds of 'long long unsigned int[0]' [-Werror=array-bounds]
838 | __buildreadseg(__readgsqword, unsigned __int64, "gs", "q")
| ^~~~~~~~~~~~~~
Also: The `mimalloc` library uses C11-style atomics, therefore we must
require that standard when compiling with GCC if we want to use
`mimalloc` (instead of requiring "only" C99). This is what we do in the
CMake definition already, therefore this commit does not need to touch
`contrib/buildsystems/`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We want to compile mimalloc's source code as part of Git, rather than
requiring the code to be built as an external library: mimalloc uses a
CMake-based build, which is not necessarily easy to integrate into the
flavors of Git for Windows (which will be the main benefitting port).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This commit imports mimalloc's source code as per v2.1.2, fetched from
the tag at https://github.com/microsoft/mimalloc.
The .c files are from the src/ subdirectory, and the .h files from the
include/ and include/mimalloc/ subdirectories. We will subsequently
modify the source code to accommodate building within Git's context.
Since we plan on using the `mi_*()` family of functions, we skip the
C++-specific source code, some POSIX compliant functions to interact
with mimalloc, and the code that wants to support auto-magic overriding
of the `malloc()` function (mimalloc-new-delete.h, alloc-posix.c,
mimalloc-override.h, alloc-override.c, alloc-override-osx.c,
alloc-override-win.c and static.c).
To appease the `check-whitespace` job of Git's Continuous Integration,
this commit was washed one time via `git rebase --whitespace=fix`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We are about to vendor in `mimalloc`'s source code which we will want to
include `git-compat-util.h` after defining that constant.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The mingw-w64 GCC seems to link implicitly to libwinpthread, which does
implement a pthread emulation (that is more complete than Git's). Let's
keep preferring Git's.
To avoid linker errors where it thinks that the `pthread_self` and the
`pthread_create` symbols are defined twice, let's give our version a
`win32_` prefix, just like we already do for `pthread_join()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
While Git for Windows does not _ship_ Python (in order to save on
bandwidth), MSYS2 provides very fine Python interpreters that users can
easily take advantage of, by using Git for Windows within its SDK.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch extends the protections introduced for Git GUI's
CVE-2022-41953 to cover `gitk`, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Just like CVE-2022-41953 for Git GUI, there exists a vulnerability of
`gitk` where it looks for `taskkill.exe` in the current directory before
searching `PATH`.
Note that the many `exec git` calls are unaffected, due to an obscure
quirk in Tcl's `exec` function. Typically, `git.exe` lives next to
`wish.exe` (i.e. the program that is run to execute `gitk` or Git GUI)
in Git for Windows, and that is the saving grace for `git.exe because
`exec` searches the directory where `wish.exe` lives even before the
current directory, according to
https://www.tcl-lang.org/man/tcl/TclCmd/exec.htm#M24:
If a directory name was not specified as part of the application
name, the following directories are automatically searched in
order when attempting to locate the application:
The directory from which the Tcl executable was loaded.
The current directory.
The Windows 32-bit system directory.
The Windows home directory.
The directories listed in the path.
The same is not true, however, for `taskkill.exe`: it lives in the
Windows system directory (never mind the 32-bit, Tcl's documentation is
outdated on that point, it really means `C:\Windows\system32`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Commit 97e9d0b78a (trailer: find the end of the log message, 2023-10-20)
combined two code paths for finding the end of the log message. For the
"no_divider" case, we used to use find_trailer_end(), and that has now
been rolled into find_end_of_log_message(). But there's a regression;
that function returns early when no_divider is set, returning the whole
string.
That's not how find_trailer_end() behaved. Although it did skip the
"---" processing (which is what "no_divider" is meant to do), we should
still respect ignored_log_message_bytes(), which covers things like
comments, "commit -v" cut lines, and so on.
The bug is actually in the interpret-trailers command, but the obvious
way to experience it is by running "commit -v" with a "--trailer"
option. The new trailer will be added at the end of the verbose diff,
rather than before it (and consequently will be ignored entirely, since
everything after the diff's intro scissors line is thrown away).
I've added two tests here: one for interpret-trailers directly, which
shows the bug via the parsing routines, and one for "commit -v".
The fix itself is pretty simple: instead of returning early, no_divider
just skips the "---" handling but still calls ignored_log_message_bytes().
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to the localized translation in 2.44, for zh_CN, we have
uniformly modified the translation of the word "commit-graph" to make it
more consistent with language usage habits.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
After we upgraded actions/setup-go to v5, the following warning message
was reported every time we ran the CI.
Restore cache failed: Dependencies file is not found ...
Disable cache to suppress warning messages as described in the solution
below.
https://github.com/actions/setup-go/issues/427
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
When we upgraded GitHub Actions "mshick/add-pr-comment" to v2, the
following warning message was reported every time we ran the CI.
Unexpected input(s) 'repo-token-user-login', valid inputs ...
Removed the obsolete parameter "repo-token-user-login" to suppress
warning messages.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
* 'master' of github.com:git/git: (51 commits)
Hopefully the last batch of fixes before 2.44 final
Git 2.43.2
A few more fixes before -rc1
write-or-die: fix the polarity of GIT_FLUSH environment variable
A few more topics before -rc1
completion: add and use __git_compute_second_level_config_vars_for_section
completion: add and use __git_compute_first_level_config_vars_for_section
completion: complete 'submodule.*' config variables
completion: add space after config variable names also in Bash 3
receive-pack: use find_commit_header() in check_nonce()
ci(linux32): add a note about Actions that must not be updated
ci: bump remaining outdated Actions versions
unit-tests: do show relative file paths on non-Windows, too
receive-pack: use find_commit_header() in check_cert_push_options()
prune: mark rebase autostash and orig-head as reachable
sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
ref-filter.c: sort formatted dates by byte value
ssh signing: signal an error with a negative return value
bisect: document command line arguments for "bisect start"
bisect: document "terms" subcommand more fully
...
The command line completion script (in contrib/) learned to
complete configuration variable names better.
* pb/complete-config:
completion: add and use __git_compute_second_level_config_vars_for_section
completion: add and use __git_compute_first_level_config_vars_for_section
completion: complete 'submodule.*' config variables
completion: add space after config variable names also in Bash 3
The code paths that call repo_read_object_file() have been
tightened to react to errors.
* js/check-null-from-read-object-file:
Always check the return value of `repo_read_object_file()`
Code simplification.
* rs/receive-pack-remove-find-header:
receive-pack: use find_commit_header() in check_nonce()
receive-pack: use find_commit_header() in check_cert_push_options()
"git cherry-pick" invoked during "git rebase -i" session lost
the authorship information, which has been corrected.
* vn/rebase-with-cherry-pick-authorship:
sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
The sequencer machinery does not use the ref API and instead
records names of certain objects it needs for its correct operation
in temporary files, which makes these objects susceptible to loss
by garbage collection. These temporary files have been added as
starting points for reachability analysis to fix this.
* pw/gc-during-rebase:
prune: mark rebase autostash and orig-head as reachable