On Windows, there are two kinds of executables, console ones and
non-console ones. Git's executables are all console ones.
When launching the former e.g. in a scheduled task, a CMD window pops
up. This is not what we want for the tasks installed via the `git
maintenance` command.
To work around this, let's introduce `headless-git.exe`, which is a
non-console program that does _not_ pop up any window. All it does is to
re-launch `git.exe`, suppressing that console window, passing through
all command-line arguments as-is.
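As an illustration only (the actual `headless-git.exe` is a native Windows program, not a script), the pass-through idea can be sketched in Python. The helper name `relaunch_args` is hypothetical; `subprocess.CREATE_NO_WINDOW` is a real constant but exists only on Windows, so the sketch falls back to no flags elsewhere.

```python
import subprocess


def relaunch_args(argv, target="git.exe"):
    """Return (command, kwargs) for re-running `target` with the same arguments.

    `relaunch_args` is a hypothetical helper for illustration, not part of Git.
    """
    # Drop argv[0] (our own program name) and pass everything else through as-is.
    command = [target] + list(argv[1:])
    # CREATE_NO_WINDOW only exists on Windows; use no creation flags elsewhere.
    flags = getattr(subprocess, "CREATE_NO_WINDOW", 0)
    kwargs = {"creationflags": flags} if flags else {}
    return command, kwargs


command, kwargs = relaunch_args(["headless-git.exe", "maintenance", "run"])
```

A caller would then hand the result to `subprocess.run(command, **kwargs)` and propagate the child's exit code.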
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Helped-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
So far, we only built Console programs, but we are about to introduce a
program that targets the Windows subsystem (i.e. it is a so-called "GUI"
program).
Let's handle this preemptively in the script that generates the Visual
Studio files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
An upcoming commit will introduce those compile options; MSVC does not
understand them, so let's suppress them when generating the Visual
Studio project files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, we also compile a "resource" file, which is similar to
source code, but contains metadata (such as the program version).
So far, we did not compile it in `MSVC` mode, only when compiling Git
for Windows with the GNU C Compiler.
In preparation for including it also when compiling with MS Visual C,
let's teach our `vcxproj` generator to handle that sort of file, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This seems to have been there since 259d87c354 (Add scripts to
generate projects for other buildsystems (MSVC vcproj, QMake),
2009-09-16), i.e. since the beginning of that file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Move the default `-ENTRY` and `-SUBSYSTEM` arguments for
MSVC=1 builds from `config.mak.uname` into `clink.pl`.
These args are constant for console-mode executables.
Add support to `clink.pl` for generating a Win32 GUI application
using the `-mwindows` argument (to match how GCC does it). This
changes the `-ENTRY` and `-SUBSYSTEM` arguments accordingly.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Ignore the `-fno-stack-protector` compiler argument when building
with MSVC. This will be used in a later commit that needs to build
a Win32 GUI app.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach MSVC=1 builds to depend on the `git.rc` file so that
the resulting executables have Windows-style resources and
version number information within them.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Create a wrapper for the Windows Resource Compiler (RC.EXE)
for use by the MSVC=1 builds. This is similar to the CL.EXE
and LIB.EXE wrappers used for the MSVC=1 builds.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
When building with `make MSVC=1 DEBUG=1`, link to `libexpatd.lib`
rather than `libexpat.lib`.
It appears that the `vcpkg` package for "libexpat" has changed and now
creates `libexpatd.lib` for debug mode builds. Previously, both debug
and release builds created a ".lib" with the same basename.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The convention in the Git project's shell scripts is to have white-space
_before_, but not _after_ the `>` (or `<`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When we commit the template directory as part of `make vcxproj`, the
`branches/` directory is not actually committed, as it is empty.
Two tests were not prepared for that situation.
This developer tried to get rid of the support for `.git/branches/` a
long time ago, but that effort did not bear fruit, so the best we can do
is work around it in these tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It already caused problems with the test suite that the directory
containing `git.vcxproj` is called the same as the Git executable
without its file extension: `./git` is ambiguous, it could refer both to
the directory `git/` as well as to `git.exe`.
Now there is one more problem: when our GitHub workflow runs on the
`vs/master` branch, it fails in all but the Windows builds, as they want
to write the file `git` but there is already a directory in the way.
Let's just go ahead and append `.proj` to all of those directories, e.g.
`git.proj/` instead of `git/`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Internally, Git expects the environment variable `HOME` to be set, and
to point to the current user's home directory.
This environment variable is not set by default on Windows, and
therefore Git tries its best to construct one if it finds `HOME` unset.
There are actually two different approaches Git tries: first, it looks
at `HOMEDRIVE`/`HOMEPATH` because this is widely used in corporate
environments with roaming profiles, and a user generally wants their
global Git settings to be in a roaming profile.
Only when `HOMEDRIVE`/`HOMEPATH` is either unset or does not point to a
valid location, Git will fall back to using `USERPROFILE` instead.
However, starting with Windows Vista, for secondary logons and services,
the environment variables `HOMEDRIVE`/`HOMEPATH` point to Windows'
system directory (usually `C:\Windows\system32`).
That is undesirable, and that location is usually write-protected anyway.
So let's verify that the `HOMEDRIVE`/`HOMEPATH` combo does not point to
Windows' system directory before using it, falling back to `USERPROFILE`
if it does.
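The fallback order described above can be sketched roughly as follows. This is a simplified Python illustration, not Git's actual C code; the function name `pick_home`, the default system directory and the injectable `is_dir` check are assumptions made for the example.

```python
import os


def pick_home(env, system_dir=r"C:\Windows\system32", is_dir=os.path.isdir):
    """Pick a HOME candidate from a dict of environment variables, or None.

    Hypothetical helper: it only models the case where HOME itself is unset.
    """
    drive, path = env.get("HOMEDRIVE"), env.get("HOMEPATH")
    if drive and path:
        candidate = drive + path
        # Compare case-insensitively, ignoring trailing path separators.
        points_at_system_dir = (
            candidate.rstrip("\\/").lower() == system_dir.rstrip("\\/").lower()
        )
        if not points_at_system_dir and is_dir(candidate):
            return candidate
    # Fall back to USERPROFILE when HOMEDRIVE/HOMEPATH is unusable.
    profile = env.get("USERPROFILE")
    if profile and is_dir(profile):
        return profile
    return None
```

With `HOMEDRIVE=C:` and `HOMEPATH=\Windows\System32`, as seen for services, the sketch skips the system directory and returns the `USERPROFILE` value instead.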
This fixes git-for-windows#2709
Initial-Patch-by: Ivan Pozdeev <vano@mail.mipt.ru>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git for Windows wants to add `git.exe` to the users' `PATH`, without
cluttering the latter with unnecessary executables such as `wish.exe`.
To that end, it invented the concept of its "Git wrapper", i.e. a tiny
executable located in `C:\Program Files\Git\cmd\git.exe` (originally a
CMD script) whose sole purpose is to set up a couple of environment
variables and then spawn the _actual_ `git.exe` (which nowadays lives in
`C:\Program Files\Git\mingw64\bin\git.exe` for 64-bit, and the obvious
equivalent for 32-bit installations).
Currently, the following environment variables are set unless already
initialized:
- `MSYSTEM`, to make sure that the MSYS2 Bash and the MSYS2 Perl
interpreter behave as expected,
- `PLINK_PROTOCOL`, to force PuTTY's `plink.exe` to use the SSH
protocol instead of Telnet, and
- `PATH`, to make sure that the `bin` folder in the user's home
directory, as well as the `/mingw64/bin` and the `/usr/bin`
directories are included. The trick here is that the `/mingw64/bin/`
and `/usr/bin/` directories are relative to the top-level installation
directory of Git for Windows (which the included Bash interprets as
`/`, i.e. as the MSYS pseudo root directory).
Using the absence of `MSYSTEM` as a tell-tale, we can detect in
`git.exe` whether these environment variables have been initialized
properly. Therefore we can call `C:\Program Files\Git\mingw64\bin\git`
in-place after this change, without having to call Git through the Git
wrapper.
Obviously, above-mentioned directories must be _prepended_ to the `PATH`
variable, otherwise we risk picking up executables from unrelated Git
installations. We do that by constructing the new `PATH` value from
scratch, appending `$HOME/bin` (if `HOME` is set), then the MSYS2 system
directories, and then appending the original `PATH`.
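The ordering described above can be sketched like this. It is a minimal Python illustration under stated assumptions, not the code in `git.exe`: the function name `build_path`, its parameters and the Windows-style separators are chosen for the example.

```python
def build_path(original_path, top_level, home=None, msystem_dir="mingw64", sep=";"):
    """Construct a new PATH value from scratch (hypothetical helper)."""
    entries = []
    if home:
        # $HOME/bin comes first, but only if HOME is set.
        entries.append(home + "\\bin")
    # Then the MSYS2 system directories, relative to the installation root.
    entries.append(top_level + "\\" + msystem_dir + "\\bin")
    entries.append(top_level + "\\usr\\bin")
    # The original PATH goes last, so unrelated Git installations lose.
    if original_path:
        entries.append(original_path)
    return sep.join(entries)


new_path = build_path(r"C:\Windows", r"C:\Program Files\Git", home=r"C:\Users\me")
```

Because the original value is appended last, executables from the current installation are always found before any others on the `PATH`.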
Side note: this modification of the `PATH` variable is independent of
the modification necessary to reach the executables and scripts in
`/mingw64/libexec/git-core/`, i.e. the `GIT_EXEC_PATH`. That
modification is still performed by Git, elsewhere, long after making the
changes described above.
While we _still_ cannot simply hard-link `mingw64\bin\git.exe` to `cmd`
(because the former depends on a couple of `.dll` files that are only in
`mingw64\bin`, i.e. calling `...\cmd\git.exe` would fail to load due to
missing dependencies), at least we can now avoid that extra process of
running the Git wrapper (which then has to wait for the spawned
`git.exe` to finish) by calling `...\mingw64\bin\git.exe` directly, via
its absolute path.
Testing this in Git's test suite is tricky: we set up a "new" MSYS
pseudo-root and copy the `git.exe` file into the appropriate location,
then verify that `MSYSTEM` is set properly, and also that the `PATH` is
modified so that scripts can be found in `$HOME/bin`, `/mingw64/bin/`
and `/usr/bin/`.
This addresses https://github.com/git-for-windows/git/issues/2283
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A change between versions 2.4.1 and 2.6.0 of the MSYS2 runtime modified
how Cygwin's runtime (and hence Git for Windows' MSYS2 runtime
derivative) handles locales: d16a56306d (Consolidate wctomb/mbtowc calls
for POSIX-1.2008, 2016-07-20).
An unintended side-effect is that "cold-calling" into the POSIX
emulation will start with a locale based on the current code page,
something that Git for Windows is very ill-prepared for, as it expects
to be able to pass a command-line containing non-ASCII characters to the
shell without having those characters munged.
One symptom of this behavior: when `git clone` or `git fetch` shell out
to call `git-upload-pack` with a path that contains non-ASCII
characters, the shell tries to interpret the entire command-line
(including command-line parameters) as executable path, which obviously
must fail.
This fixes https://github.com/git-for-windows/git/issues/1036
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
"git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.
* ev/pull-already-up-to-date-is-noop:
pull: should be noop when already-up-to-date
"git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with versions of PCREv2 library older
than 10.34 in the latest release.
* hm/paint-hits-in-log-grep:
Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
This reverts commit f6526728f9.
The change in f652672 (dir: select directories correctly, 2021-09-24)
caused a regression in directory-based matches with non-cone-mode
patterns, especially for .gitignore patterns. A test is included to
prevent this regression in the future.
The commit ed495847 (dir: fix pattern matching on dirs, 2021-09-24) was
reverted in 5ceb663 (dir: fix directory-matching bug, 2021-11-02) for
similar reasons. Neither commit changed tests, and tests added later in
the series continue to pass when these commits are reverted.
Reported-by: Danial Alihosseini <danial.alihosseini@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit ae39ba431a, as it
breaks "grep" when looking for a string in a non-UTF-8 haystack, when
linked with certain versions of PCREv2 library.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The already-up-to-date pull bug was fixed for --ff-only but it did not
include the case where --ff or --ff-only are not specified. This updates
the --ff-only fix to include the case where --ff or --ff-only are not
specified in command line flags or config.
Signed-off-by: Erwin Villejo <erwin.villejo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The filter system allows for alterations to file contents when they're
moved between the database and the worktree. We already made sure that
it is possible for smudge filters to produce contents that are larger
than `unsigned long` can represent (which matters on systems where
`unsigned long` is narrower than `size_t`, most notably 64-bit Windows).
Now we make sure that clean filters can _consume_ contents that are
larger than that.
Note that this commit only allows clean filters' _input_ to be larger
than can be represented by `unsigned long`.
This change makes only a very minute dent into the much larger project
to teach Git to use `size_t` instead of `unsigned long` wherever
appropriate.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This introduces an additional guard for platforms where `unsigned long`
and `size_t` are not of the same size. If the size of an object in the
database would overflow `unsigned long`, instead we now exit with an
error.
A complete fix will have to update _many_ other functions throughout the
codebase to use `size_t` instead of `unsigned long`. It will have to be
implemented at some stage.
This commit puts in a stop-gap for the time being.
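The kind of guard described here can be illustrated with a small Python sketch. It assumes a 32-bit `unsigned long`, as on Windows; the name `checked_ulong` is hypothetical and does not refer to a function in Git's codebase.

```python
ULONG_MAX_32 = 2**32 - 1  # largest value of a 32-bit `unsigned long`


def checked_ulong(size, ulong_max=ULONG_MAX_32):
    """Return `size` if it fits into `unsigned long`, else raise (hypothetical helper)."""
    if size > ulong_max:
        # Refuse to continue rather than silently narrowing the value.
        raise OverflowError(f"object size {size} does not fit into unsigned long")
    return size
```

Failing loudly is the point of the stop-gap: a truncated size would otherwise propagate into later reads and writes unnoticed.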
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have a `batch` mode, let's be explicit.
This is a follow-up to ce4786fc77 (mingw: change core.fsyncObjectFiles
= 1 by default, 2017-09-04) and will most likely have to be squashed
into it before upstreaming that patch (after the `batch` fsync mode was
upstreamed).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.
But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.
By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.
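To see why the narrowing conversion truncates files, here is a small Python sketch that mimics storing a 64-bit size in a 32-bit `unsigned long` by keeping only the low 32 bits. The function name `narrow_to_ulong32` is made up for the illustration.

```python
GIB = 2**30


def narrow_to_ulong32(size):
    """Mimic assigning a size_t value to a 32-bit unsigned long (illustrative only)."""
    return size & 0xFFFFFFFF  # only the low 32 bits survive


five_gib = 5 * GIB
truncated = narrow_to_ulong32(five_gib)
# A 5 GiB size wraps around to 1 GiB, so only part of the file would be written.
```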
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This merges the topic branch (specifically backported onto v2.33.1 to
allow for integrating into Git for Windows' `main` branch) that strikes
a better balance between safety and speed: rather than `fsync()`ing each
and every loose object file, we now offer to do it in a batch.
This will become the new default in Git for Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The filter system allows for alterations to file contents when they're
added to the database or workdir. ("Smudge" when moving to the workdir;
"clean" when moving to the database.) This is used natively to handle CRLF
to LF conversions. It's also employed by Git-LFS to replace large files
from the workdir with small tracking files in the repo and vice versa.
Git pulls the entire smudged file into memory. While this is inefficient,
there's a more insidious problem on some platforms due to inconsistency
between using unsigned long and size_t for the same type of data (size of
a file in bytes). On most 64-bit platforms, unsigned long is 64 bits, and
size_t is typedef'd to unsigned long. On Windows, however, unsigned long is
only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to
unsigned long long in order to be 64 bits).
Practically speaking, this means 64-bit Windows users of Git-LFS can't
handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer
this limitation.
This commit introduces a test exposing the issue; future commits make it
pass. The test simulates the way Git-LFS works by having a tiny file
checked into the repository and expanding it to a huge file on checkout.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `xutftowcs_path` function canonicalizes absolute paths using GetFullPathNameW.
This canonicalization may change the length of the string (e.g. getting rid of \.\),
which breaks callers that pass the template string in a strbuf and expect the
length of the string to remain the same.
In my particular case, the tmp-objdir code is passing a strbuf to mkdtemp and is
breaking since the strbuf.len is no longer synchronized with strlen(strbuf.buf).
Signed-off-by: Neeraj K. Singh <neerajsi@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add a basic performance test for "git add" and "git stash" of a lot of
new objects with various fsync settings.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow tests that assume a 64-bit `size_t` to be skipped on 32-bit
platforms, regardless of the size of `long`.
This imitates the `LONG_IS_64BIT` prerequisite.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Add test cases to exercise batch mode for:
* 'git add'
* 'git stash'
* 'git update-index'
* 'git unpack-objects'
These tests ensure that the added data winds up in the object database.
In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added due
to it already being in the repo.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this developer's tests, producing one gigabyte worth of NULs in a
busy loop that writes out individual bytes, unbuffered, took ~27sec.
Writing chunked 256kB buffers instead only took ~0.6sec.
This matters because we are about to introduce a pair of test cases that
want to be able to produce 5GB of NULs, and we cannot use `/dev/zero`
because of the HP NonStop platform's lack of support for that device.
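The chunking idea can be sketched in Python as follows. This is not the test helper itself (which is written in C); the function name `write_nuls` and the default chunk size are assumptions for the illustration.

```python
import io


def write_nuls(out, count, chunk_size=256 * 1024):
    """Write `count` NUL bytes to `out` in large chunks (hypothetical helper)."""
    chunk = bytes(chunk_size)  # one reusable buffer of zeroes
    remaining = count
    while remaining > 0:
        n = min(remaining, chunk_size)
        # Write the whole buffer, or a shorter slice for the final partial chunk.
        out.write(chunk if n == chunk_size else chunk[:n])
        remaining -= n


buf = io.BytesIO()
write_nuls(buf, 600 * 1024 + 5)
```

Each `write` call hands over up to 256kB at once, so the number of calls drops by several orders of magnitude compared with writing single bytes.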
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.
By enabling bulk-checkin when unpacking objects, we can take advantage
of batched fsyncs.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d5cfd142ec (tests: teach the test-tool to generate NUL bytes and
use it, 2019-02-14) added a way to generate zeroes in a portable
way without using /dev/zero (needed by HP NonStop), but it uses a
long variable that is limited to 2^31 on Windows.
Use a (POSIX/C99) intmax_t instead, which is at least 64 bits wide
on 64-bit Windows, so that it can be used in a future test.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.
This change enables bulk-checkin for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
pack functionality and the new bulk-fsync functionality.
There is some risk with this change, since under batch fsync, the object
files will not be available until the update-index is entirely complete.
This usage is unlikely, since any tool invoking update-index and
expecting to see objects would have to synchronize with the update-index
process after passing it a file path.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit adds a win32 implementation for fsync_no_flush that is
called git_fsync. The 'NtFlushBuffersFileEx' function being called is
available since Windows 8. If the function is not available, we
return -1 and Git falls back to doing a full fsync.
The operating system is told to flush data only without a hardware
flush primitive. A later full fsync will cause the metadata log
to be flushed and then the disk cache to be flushed on NTFS and
ReFS. Other filesystems will treat this as a full flush operation.
I added a new file here for this system call so as not to conflict with
downstream changes in the git-for-windows repository related to fscache.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding many objects to a repo with core.fsyncObjectFiles set to
true, the cost of fsync'ing each object file can become prohibitive.
One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. Fortunately, Windows
and macOS offer mechanisms to write data from the filesystem page cache
without initiating a hardware flush. Linux has the sync_file_range API,
which issues a pagecache writeback request reliably after version 5.2.
This patch introduces a new 'core.fsyncObjectFiles = batch' option that
batches up hardware flushes. It hooks into the bulk-checkin plugging and
unplugging functionality and takes advantage of tmp-objdir.
When the new mode is enabled, do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.
At the end of the entire transaction when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback
cache, which should by now have processed the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue
another fsync internal to that operation. This is not the case today,
but may be a good extension to those components.
On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns.
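The sequence above can be sketched in Python. This is a simplified illustration, not Git's implementation: Python's standard library has no portable call for a page cache writeback without a hardware flush, so that step is reduced to a plain `flush()` here, and the names `add_objects_batched` and `bulk_fsync` are made up for the example.

```python
import os
import tempfile


def add_objects_batched(objects, objdir):
    """Write `objects` (name -> bytes) into `objdir` with a single fsync."""
    tmpdir = tempfile.mkdtemp(prefix="tmp_objdir-", dir=objdir)
    staged = []
    for name, data in objects.items():
        tmp_path = os.path.join(tmpdir, name)
        with open(tmp_path, "wb") as f:
            f.write(data)
            # Assumption: a real implementation would request a page cache
            # writeback here; flush() only hands the data to the OS.
            f.flush()
        staged.append((tmp_path, os.path.join(objdir, name)))
    # One fsync against a dummy file to flush the hardware writeback cache.
    dummy = os.path.join(tmpdir, "bulk_fsync")
    fd = os.open(dummy, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    os.unlink(dummy)
    # Only now move the objects to their final names.
    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)
    os.rmdir(tmpdir)
    return [final_path for _, final_path in staged]
```

The renames happen only after the single flush, so no object becomes visible under its final name before its contents have been pushed towards the disk.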
This change also updates the macOS code to trigger a real hardware flush
via fcntl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on
macOS there was no guarantee of durability since a simple fsync(2) call
does not flush any hardware caches.
_Performance numbers_:
Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.
This number is from a patch later in the series.
Adding 500 files to the repo with 'git add'. Times reported in seconds.
core.fsyncObjectFiles | Linux | Mac   | Windows
----------------------|-------|-------|--------
                false | 0.06  | 0.35  | 0.61
                 true | 1.88  | 11.18 | 2.47
                batch | 0.15  | 0.41  | 1.53
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.
* Rename 'state' variable to 'bulk_checkin_state', since we will later
be adding 'bulk_fsync_state'. This also makes the variable easier to
find in the debugger, since the name is more distinctive.
* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
static variable. Doing this avoids resetting the variable in
finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
seem to unintentionally disable the plugging functionality the first
time a new packfile must be created due to packfile size limits. While
disabling the plugging state only results in suboptimal behavior for
the current code, it would be fatal for the bulk-fsync functionality
later in this patch series.
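The problem with keeping the flag inside the state can be illustrated with a toy Python model. The class names are hypothetical and only mimic the effect of zeroing a C struct; they are not taken from bulk-checkin.c.

```python
class StateWithFlag:
    """Old layout: the plugged flag lives inside the state that gets zeroed."""

    def __init__(self):
        self.plugged = False
        self.objects_in_pack = 0

    def finish(self):
        # Stand-in for zeroing the whole struct when a packfile is finished.
        self.__init__()


class StateWithoutFlag:
    """New layout: the flag is kept in a separate variable."""

    def __init__(self):
        self.objects_in_pack = 0

    def finish(self):
        self.__init__()


old = StateWithFlag()
old.plugged = True
old.finish()  # old.plugged is silently reset along with everything else

plugged = True  # separate variable, untouched by finish()
new = StateWithoutFlag()
new.finish()
```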
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>