Commit Graph

23274 Commits

Author SHA1 Message Date
Johannes Schindelin
0755efe215 Merge pull request #2655 from jglathe/jg/t0014_trace_extra_info
t/t0014: fix: eliminate additional lines from trace
2025-02-06 19:11:35 +01:00
Johannes Schindelin
802f8776ad Merge pull request #2714 from lbonanomi/crlf-scissors
Rationalize line endings for scissors-cleanup
2025-02-06 19:11:35 +01:00
Johannes Schindelin
27111a8125 Merge 'add-p-many-files'
This topic branch allows `add -p` and `add -i` with a large number of
files. It is kind of a hack that was never really meant to be
upstreamed. Let's see if we can do better in the built-in `add -p`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:11:34 +01:00
Johannes Schindelin
3eb7c106d8 Merge pull request #2618 from dscho/avoid-d/f-conflict-in-vs/master
ci: avoid d/f conflict in vs/master
2025-02-06 19:11:34 +01:00
Johannes Schindelin
56b43f09f5 t5505/t5516: fix white-space around redirectors
The convention in Git project's shell scripts is to have white-space
_before_, but not _after_ the `>` (or `<`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:11:19 +01:00
Johannes Schindelin
6d862c523f t5505/t5516: allow running without .git/branches/ in the templates
When we commit the template directory as part of `make vcxproj`, the
`branches/` directory is not actually commited, as it is empty.

Two tests were not prepared for that situation.

This developer tried to get rid of the support for `.git/branches/` a
long time ago, but that effort did not bear fruit, so the best we can do
is work around in these here tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:11:19 +01:00
Johannes Schindelin
efd59cc4d5 Merge pull request #2506 from dscho/issue-2283
Allow running Git directly from `C:\Program Files\Git\mingw64\bin\git.exe`
2025-02-06 17:59:42 +01:00
Johannes Schindelin
35b304f895 Merge pull request #2504 from dscho/access-repo-via-junction
Handle `git add <file>` where <file> traverses an NTFS junction
2025-02-06 17:59:41 +01:00
Johannes Schindelin
d4f263dcaf Merge branch 'dont-clean-junctions'
This topic branch teaches `git clean` to respect NTFS junctions and Unix
bind mounts: it will now stop at those boundaries.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 17:59:39 +01:00
Johannes Schindelin
6abe642ef0 Merge branch 'drive-prefix'
This topic branch allows us to specify absolute paths without the drive
prefix e.g. when cloning.

Example:

	C:\Users\me> git clone https://github.com/git/git \upstream-git

This will clone into a new directory C:\upstream-git, in line with how
Windows interprets absolute paths.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 17:59:39 +01:00
Jens Glathe
73bab9f567 t0014: fix indentation
For some reason, this test case was indented with 4 spaces instead of 1
horizontal tab. The other test cases in the same test script are fine.

Signed-off-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:14 +01:00
Luke Bonanomi
28cc5977dd commit: accept "scissors" with CR/LF line endings
This change enhances `git commit --cleanup=scissors` by detecting
scissors lines ending in either LF (UNIX-style) or CR/LF (DOS-style).

Regression tests are included to specifically test for trailing
comments after a CR/LF-terminated scissors line.

Signed-off-by: Luke Bonanomi <lbonanomi@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:14 +01:00
Johannes Schindelin
7abe8d8c6e t3701: verify that we can add *lots* of files interactively
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:14 +01:00
Johannes Schindelin
d1e51aa137 mingw: implement a platform-specific strbuf_realpath()
There is a Win32 API function to resolve symbolic links, and we can use
that instead of resolving them manually. Even better, this function also
resolves NTFS junction points (which are somewhat similar to bind
mounts).

This fixes https://github.com/git-for-windows/git/issues/2481.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:13 +01:00
Johannes Schindelin
59dbe7e0b2 mingw: allow git.exe to be used instead of the "Git wrapper"
Git for Windows wants to add `git.exe` to the users' `PATH`, without
cluttering the latter with unnecessary executables such as `wish.exe`.
To that end, it invented the concept of its "Git wrapper", i.e. a tiny
executable located in `C:\Program Files\Git\cmd\git.exe` (originally a
CMD script) whose sole purpose is to set up a couple of environment
variables and then spawn the _actual_ `git.exe` (which nowadays lives in
`C:\Program Files\Git\mingw64\bin\git.exe` for 64-bit, and the obvious
equivalent for 32-bit installations).

Currently, the following environment variables are set unless already
initialized:

- `MSYSTEM`, to make sure that the MSYS2 Bash and the MSYS2 Perl
  interpreter behave as expected, and

- `PLINK_PROTOCOL`, to force PuTTY's `plink.exe` to use the SSH
  protocol instead of Telnet,

- `PATH`, to make sure that the `bin` folder in the user's home
  directory, as well as the `/mingw64/bin` and the `/usr/bin`
  directories are included. The trick here is that the `/mingw64/bin/`
  and `/usr/bin/` directories are relative to the top-level installation
  directory of Git for Windows (which the included Bash interprets as
  `/`, i.e. as the MSYS pseudo root directory).

Using the absence of `MSYSTEM` as a tell-tale, we can detect in
`git.exe` whether these environment variables have been initialized
properly. Therefore we can call `C:\Program Files\Git\mingw64\bin\git`
in-place after this change, without having to call Git through the Git
wrapper.

Obviously, above-mentioned directories must be _prepended_ to the `PATH`
variable, otherwise we risk picking up executables from unrelated Git
installations. We do that by constructing the new `PATH` value from
scratch, appending `$HOME/bin` (if `HOME` is set), then the MSYS2 system
directories, and then appending the original `PATH`.

Side note: this modification of the `PATH` variable is independent of
the modification necessary to reach the executables and scripts in
`/mingw64/libexec/git-core/`, i.e. the `GIT_EXEC_PATH`. That
modification is still performed by Git, elsewhere, long after making the
changes described above.

While we _still_ cannot simply hard-link `mingw64\bin\git.exe` to `cmd`
(because the former depends on a couple of `.dll` files that are only in
`mingw64\bin`, i.e. calling `...\cmd\git.exe` would fail to load due to
missing dependencies), at least we can now avoid that extra process of
running the Git wrapper (which then has to wait for the spawned
`git.exe` to finish) by calling `...\mingw64\bin\git.exe` directly, via
its absolute path.

Testing this is in Git's test suite tricky: we set up a "new" MSYS
pseudo-root and copy the `git.exe` file into the appropriate location,
then verify that `MSYSTEM` is set properly, and also that the `PATH` is
modified so that scripts can be found in `$HOME/bin`, `/mingw64/bin/`
and `/usr/bin/`.

This addresses https://github.com/git-for-windows/git/issues/2283

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:13 +01:00
Johannes Schindelin
31115213ca mingw: demonstrate a git add issue with NTFS junctions
NTFS junctions are somewhat similar in spirit to Unix bind mounts: they
point to a different directory and are resolved by the filesystem
driver. As such, they appear to `lstat()` as if they are directories,
not as if they are symbolic links.

_Any_ user can create junctions, while symbolic links can only be
created by non-administrators in Developer Mode on Windows 10. Hence
NTFS junctions are much more common "in the wild" than NTFS symbolic
links.

It was reported in https://github.com/git-for-windows/git/issues/2481
that adding files via an absolute path that traverses an NTFS junction:
since 1e64d18 (mingw: do resolve symlinks in `getcwd()`), we resolve not
only symbolic links but also NTFS junctions when determining the
absolute path of the current directory. The same is not true for `git
add <file>`, where symbolic links are resolved in `<file>`, but not NTFS
junctions.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:13 +01:00
Johannes Schindelin
b86e095423 clean: remove mount points when possible
Windows' equivalent to "bind mounts", NTFS junction points, can be
unlinked without affecting the mount target. This is clearly what users
expect to happen when they call `git clean -dfx` in a worktree that
contains NTFS junction points: the junction should be removed, and the
target directory of said junction should be left alone (unless it is
inside the worktree).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:12 +01:00
Johannes Schindelin
99ffa429f4 clean: do not traverse mount points
It seems to be not exactly rare on Windows to install NTFS junction
points (the equivalent of "bind mounts" on Linux/Unix) in worktrees,
e.g. to map some development tools into a subdirectory.

In such a scenario, it is pretty horrible if `git clean -dfx` traverses
into the mapped directory and starts to "clean up".

Let's just not do that. Let's make sure before we traverse into a
directory that it is not a mount point (or junction).

This addresses https://github.com/git-for-windows/git/issues/607

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:12 +01:00
Johannes Schindelin
5e12a91106 mingw: allow absolute paths without drive prefix
When specifying an absolute path without a drive prefix, we convert that
path internally. Let's make sure that we handle that case properly, too
;-)

This fixes the command

	git clone https://github.com/git-for-windows/git \G4W

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:11 +01:00
Johannes Schindelin
9cbc71643a mingw: demonstrate a problem with certain absolute paths
On Windows, there are several categories of absolute paths. One such
category starts with a backslash and is implicitly relative to the
drive associated with the current working directory. Example:

	c:
	git clone https://github.com/git-for-windows/git \G4W

should clone into C:\G4W.

There is currently a problem with that, in that mingw_mktemp() does not
expect the _wmktemp() function to prefix the absolute path with the
drive prefix, and as a consequence, the resulting path does not fit into
the originally-passed string buffer. The symptom is a "Result too large"
error.

Reported by Juan Carlos Arevalo Baeza.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:11 +01:00
Sverre Rabbelier
a3388599e4 remote-helper: check helper status after import/export
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
2025-02-06 16:33:11 +01:00
Sverre Rabbelier
591376ec2d t9350: point out that refs are not updated correctly
This happens only when the corresponding commits are not exported in
the current fast-export run. This can happen either when the relevant
commit is already marked, or when the commit is explicitly marked
as UNINTERESTING with a negative ref by another argument.

This breaks fast-export basec remote helpers.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
2025-02-06 16:33:11 +01:00
Johannes Schindelin
230e4ca5e2 Merge branch 'fixes-from-the-git-mailing-list'
These fixes have been sent to the Git mailing list but have not been
picked up by the Git project yet.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:10 +01:00
Junio C Hamano
3a7452c638 Merge branch 'mh/credential-cache-authtype-request-fix'
The "cache" credential back-end did not handle authtype correctly,
which has been corrected.

* mh/credential-cache-authtype-request-fix:
  credential-cache: respect authtype capability

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:10 +01:00
Junio C Hamano
86bd41c1b7 Merge branch 'bf/fetch-set-head-fix' into jch
Fetching into a bare repository incorrectly assumed it always used
a mirror layout when deciding to update remote-tracking HEAD, which
has been corrected.

* bf/fetch-set-head-fix:
  fetch set_head: fix non-mirror remotes in bare repositories

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:10 +01:00
Junio C Hamano
77c2d31b15 Merge branch 'rs/ref-filter-used-atoms-value-fix'
"git branch --sort=..." and "git for-each-ref --format=... --sort=..."
did not work as expected with some atoms, which has been corrected.

* rs/ref-fitler-used-atoms-value-fix:
  ref-filter: remove ref_format_clear()
  ref-filter: move is-base tip to used_atom
  ref-filter: move ahead-behind bases into used_atom

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:10 +01:00
Junio C Hamano
e88733ccb2 Merge branch 'kn/reflog-migration-fix-followup'
Code clean-up.

* kn/reflog-migration-fix-followup:
  reftable: prevent 'update_index' changes after adding records
  refs: use 'uint64_t' for 'ref_update.index'
  refs: mark `ref_transaction_update_reflog()` as static

These patches have been actually rebased onto a better base (the
`kn/reflog-migration` tip instead of the merge commit that merged this
tip).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:09 +01:00
Junio C Hamano
db018ce94e Merge branch 'kn/reflog-migration-fix'
"git refs migrate" for migrating reflog data was broken.

* kn/reflog-migration-fix:
  reftable: write correct max_update_index to header

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:09 +01:00
Junio C Hamano
195825fc57 Merge branch 'en/object-name-with-funny-refname-fix'
Extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") is involved.

* en/object-name-with-funny-refname-fix:
  object-name: be more strict in parsing describe-like output
  object-name: fix resolution of object names containing curly braces

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:09 +01:00
M Hickford
34a7cf4b15 credential-cache: respect authtype capability
Previously, credential-cache populated authtype regardless whether
"get" request had authtype capability. As documented in
git-credential.txt, authtype "should not be sent unless the appropriate
capability ... is provided".

Add test. Without this change, the test failed because "credential fill"
printed an incomplete credential with only protocol and host attributes
(the unexpected authtype attribute was discarded by credential.c).

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:08 +01:00
Bence Ferdinandy
355849e34a fetch set_head: fix non-mirror remotes in bare repositories
In b1b713f722 (fetch set_head: handle mirrored bare repositories,
2024-11-22) it was implicitly assumed that all remotes will be mirrors
in a bare repository, thus fetching a non-mirrored remote could lead to
HEAD pointing to a non-existent reference. Make sure we only overwrite
HEAD if we are in a bare repository and fetching from a mirror.
Otherwise, proceed as normally, and create
refs/remotes/<nonmirrorremote>/HEAD instead.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:08 +01:00
Karthik Nayak
8a95fbc213 reftable: prevent 'update_index' changes after adding records
The function `reftable_writer_set_limits()` allows updating the
'min_update_index' and 'max_update_index' of a reftable writer. These
values are written to both the writer's header and footer.

Since the header is written during the first block write, any subsequent
changes to the update index would create a mismatch between the header
and footer values. The footer would contain the newer values while the
header retained the original ones.

To fix this bug, prevent callers from updating these values after any
record is written. To do this, modify the function to return an error
whenever the limits are modified after any record adds. Check for record
adds within `reftable_writer_set_limits()` by checking the `last_key`
variable, which is set whenever a new record is added.

Modify all callers of the function to anticipate a return type and
handle it accordingly.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:08 +01:00
René Scharfe
64055b320d ref-filter: move is-base tip to used_atom
The string_list "is_base_tips" in struct ref_format stores the
committish part of "is-base:<committish>".  It has the same problems
that its sibling string_list "bases" had.  Fix them the same way as the
previous commit did for the latter, by replacing the string_list with
fields in "used_atom".

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:08 +01:00
Adam Murray
b710fe8167 trace2: prevent segfault on config collection where no value specified
When TRACE2 analytics is enabled, a git config option that has no value
causes a segfault.

Steps to Reproduce
GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.*
git -c status.relativePaths version
Expected Result
git version 2.46.0
Actual Result
zsh: segmentation fault GIT_TRACE2=true

This adds checks to prevent the segfault and instead return
an empty value.

Signed-off-by: Adam Murray <ad@canva.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:08 +01:00
René Scharfe
ffcfbe715f ref-filter: move ahead-behind bases into used_atom
verify_ref_format() parses a ref-filter format string and stores
recognized items in the static array "used_atom".  For
"ahead-behind:<committish>" it stores the committish part in a
string_list member "bases" of struct ref_format.

ref_sorting_options() also parses bare ref-filter format items and
stores stores recognized ones in "used_atom" as well.  The committish
parts go to a dummy struct ref_format in parse_sorting_atom(), though,
and are leaked and forgotten.

If verify_ref_format() is called before ref_sorting_options(), like in
git for-each-ref, then all works well if the sort key is included in the
format string.  If it isn't then sorting cannot work as the committishes
are missing.

If ref_sorting_options() is called first, like in git branch, then we
have the additional issue that if the sort key is included in the format
string then filter_ahead_behind() can't see its committish, will not
generate any results for it and thus it will be expanded to an empty
string.

Fix those issues by replacing the string_list with a field in used_atom
for storing the committish.  This way it can be shared for handling both
ref-filter format strings and sorting options in the same command.

Reported-by: Ross Goldberg <ross.goldberg@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:08 +01:00
Karthik Nayak
5e1c7b13fd reftable: write correct max_update_index to header
In 297c09eabb (refs: allow multiple reflog entries for the same refname,
2024-12-16), the reftable backend learned to handle multiple reflog
entries within the same transaction. This was done modifying the
`update_index` for reflogs with multiple indices. During writing the
logs, the `max_update_index` of the writer was modified to ensure the
limits were raised to the modified `update_index`s.

However, since ref entries are written before the modification to the
`max_update_index`, if there are multiple blocks to be written, the
reftable backend writes the header with the old `max_update_index`. When
all logs are finally written, the footer will be written with the new
`min_update_index`. This causes a mismatch between the header and the
footer and causes the reftable file to be corrupted. The existing tests
only spawn a single block and since headers are lazily written with the
first block, the tests didn't capture this bug.

To fix the issue, the appropriate `max_update_index` limit must be set
even before the first block is written. Add a `max_index` field to the
transaction which holds the `max_index` within all its updates, then
propagate this value to the reftable backend, wherein this is used to
the set the `max_update_index` correctly.

Add a test which creates a few thousand reference updates with multiple
reflog entries, which should trigger the bug.

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:07 +01:00
Elijah Newren
1b7dea1d04 object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt:
    '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
      Output from `git describe`; i.e. a closest tag, optionally
      followed by a dash and a number of commits, followed by a dash, a
      'g', and an abbreviated object name.
which means that output of the format
    ${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}.  This is fine.  However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid.  That is problematic, since it
breaks things like

    git cat-file -p branchname:path/to/file/named/i-gaffed

which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.

A few additional notes:
  - This is a slight backward incompatibility break, because we used
    to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}.  However,
    a backward incompatible break is necessary, because there is no
    other way for someone to be more specific and disambiguate that they
    want the blob master:path/to/who-gabbed instead of the object abbed.
  - There is a possibility that check_refname_format() rules change in
    the future.  However, we can only realistically loosen the rules
    for what that function accepts rather than tighten.  If we were to
    tighten the rules, some real world repositories may already have
    refnames that suddenly become unacceptable and we break those
    repositories.  As such, any describe-like syntax of the form
    ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
    changes in this commit will remain valid in the future.
  - The fact that check_refname_format() rules could loosen in the
    future is probably also an important reason to make this change.  If
    the rules loosen, there might be additional cases within
    ${GARBAGE}-g${HASH} that become ambiguous in the future.  While
    abbreviated hashes can be disambiguated by abbreviating less, it may
    well be that these alternative object names have no way of being
    disambiguated (much like pathnames cannot be).  Accepting all random
    ${GARBAGE} thus makes it difficult for us to allow future
    extensions to object naming.

So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.

Also, add a few tests for git describe using object names of the form
    ${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
    git describe v2.48.0-rc2-161-g6c2274cdbc^0

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:07 +01:00
Jeff King
9804f3af83 test-lib: add a few comments to LSan log checking
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:07 +01:00
Elijah Newren
c9bd22e7ef object-name: fix resolution of object names containing curly braces
Given a branch name of 'foo{bar', commands like

    git cat-file -p foo{bar:README.md

should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2cae (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:

    fatal: Not a valid object name foo{bar:README.md

Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.

Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
  * 'foo@@{...}'
  * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'

Note that we'd prefer not duplicating the special logic for "@^" characters
here, because if get_oid_basic() or interpret_nth_prior_checkout() or
get_oid_basic() or similar gain extra methods of using curly braces,
then the logic in get_oid_with_context_1() would need to be updated as
well.  But it's not clear how to refactor all of these to have a simple
common callpoint with the specialized logic.

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Helped-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:07 +01:00
Jeff King
5996c9fd70 test-lib: simplify lsan results check
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:

  - it's one fewer process to run

  - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
    checked at the beginning of the function, and is the same glob used
    to show the logs in check_test_results_san_file_

  - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
    space in it. For example doing:

       mkdir "/tmp/foo bar"
       TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test

    would yield a lot of:

      grep: /tmp/foo: No such file or directory
      grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory

    when there are leaks. We could do the same thing with "xargs
    --null", but that isn't portable.

We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.

We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:06 +01:00
Johannes Schindelin
656fe4ed8c sideband: do allow ANSI color sequences by default
The preceding two commits introduced special handling of the sideband
channel to neutralize ANSI escape sequences before sending the payload
to the terminal, and `sideband.allowControlCharacters` to override that
behavior.

However, some `pre-receive` hooks that are actively used in practice
want to color their messages and therefore rely on the fact that Git
passes them through to the terminal.

In contrast to other ANSI escape sequences, it is highly unlikely that
coloring sequences can be essential tools in attack vectors that mislead
Git users e.g. by hiding crucial information.

Therefore we can have both: Continue to allow ANSI coloring sequences to
be passed to the terminal, and neutralize all other ANSI escape
sequences.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:33:06 +01:00
Jeff King
acbcc27f09 test-lib: invert return value of check_test_results_san_file_empty
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:

  ! find ... |
  xargs grep leaks |
  grep -v false-positives

where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 16:33:06 +01:00
Johannes Schindelin
e6a6b9d94c sideband: introduce an "escape hatch" to allow control characters
The preceding commit fixed the vulnerability whereas sideband messages
(that are under the control of the remote server) could contain ANSI
escape sequences that would be sent to the terminal verbatim.

However, this fix may not be desirable under all circumstances, e.g.
when remote servers deliberately add coloring to their messages to
increase their urgency.

To help with those use cases, give users a way to opt-out of the
protections: `sideband.allowControlCharacters`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:32:04 +01:00
Johannes Schindelin
20dfd7e9d5 sideband: mask control characters
The output of `git clone` is a vital component for understanding what
has happened when things go wrong. However, these logs are partially
under the control of the remote server (via the "sideband", which
typically contains what the remote `git pack-objects` process sends to
`stderr`), and is currently not sanitized by Git.

This makes Git susceptible to ANSI escape sequence injection (see
CWE-150, https://cwe.mitre.org/data/definitions/150.html), which allows
attackers to corrupt terminal state, to hide information, and even to
insert characters into the input buffer (i.e. as if the user had typed
those characters).

To plug this vulnerability, disallow any control character in the
sideband, replacing them instead with the common `^<letter/symbol>`
(e.g. `^[` for `\x1b`, `^A` for `\x01`).

There is likely a need for more fine-grained controls instead of using a
"heavy hammer" like this, which will be introduced subsequently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:29:44 +01:00
Johannes Schindelin
429023c8d2 credential: disallow Carriage Returns in the protocol by default
While Git has documented that the credential protocol is line-based,
with newlines as terminators, the exact shape of a newline has not been
documented.

From Git's perspective, which is firmly rooted in the Linux ecosystem,
it is clear that "a newline" means a Line Feed character.

However, even Git's credential protocol respects Windows line endings
(a Carriage Return character followed by a Line Feed character, "CR/LF")
by virtue of using `strbuf_getline()`.

There is a third category of line endings that has been used originally
by MacOS, and that is respected by the default line readers of .NET and
node.js: bare Carriage Returns.

Git cannot handle those, and what is worse: Git's remedy against
CVE-2020-5260 does not catch when credential helpers are used that
interpret bare Carriage Returns as newlines.

Git Credential Manager addressed this as CVE-2024-50338, but other
credential helpers may still be vulnerable. So let's not only disallow
Line Feed characters as part of the values in the credential protocol,
but also disallow Carriage Return characters.

In the unlikely event that a credential helper relies on Carriage
Returns in the protocol, introduce an escape hatch via the
`credential.protectProtocol` config setting.

This addresses CVE-2024-52006.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:25:21 +01:00
Johannes Schindelin
db58126452 credential: sanitize the user prompt
When asking the user interactively for credentials, we want to avoid
misleading them e.g. via control sequences that pretend that the URL
targets a trusted host when it does not.

While Git learned, over the course of the preceding commits, to disallow
URLs containing URL-encoded control characters by default, credential
helpers are still allowed to specify values very freely (apart from Line
Feed and NUL characters, anything is allowed), and this would allow,
say, a username containing control characters to be specified that would
then be displayed in the interactive terminal prompt asking the user for
the password, potentially sending those control characters directly to
the terminal. This is undesirable because control characters can be used
to mislead users to divulge secret information to untrusted sites.

To prevent such an attack vector, let's add a `git_prompt()` that forces
the displayed text to be sanitized, i.e. displaying question marks
instead of control characters.

Note: While this commit's diff changes a lot of `user@host` strings to
`user%40host`, which may look suspicious on the surface, there is a good
reason for that: this string specifies a user name, not a
<username>@<hostname> combination! In the context of t5541, the actual
combination looks like this: `user%40@127.0.0.1:5541`. Therefore, these
string replacements document a net improvement introduced by this
commit, as `user@host@127.0.0.1` could have left readers wondering where
the user name ends and where the host name begins.

Hinted-at-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:25:21 +01:00
Johannes Schindelin
501d8da34a credential_format(): also encode <host>[:<port>]
An upcoming change wants to sanitize the credential password prompt
where a URL is displayed that may potentially come from a `.gitmodules`
file. To this end, the `credential_format()` function is employed.

To sanitize the host name (and optional port) part of the URL, we need a
new mode of the `strbuf_add_percentencode()` function because the
current mode is both too strict and too lenient: too strict because it
encodes `:`, `[` and `]` (which should be left unencoded in
`<host>:<port>` and in IPv6 addresses), and too lenient because it does
not encode invalid host name characters `/`, `_` and `~`.

So let's introduce and use a new mode specifically to encode the host
name and optional port part of a URI, leaving alpha-numerical
characters, periods, colons and brackets alone and encoding all others.

This only leads to a change of behavior for URLs that contain invalid
host names.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 16:25:21 +01:00
Junio C Hamano
b28fb93e51 Merge branch 'ps/build-sign-compare'
Last-minute fix for a regression in "git blame --abbrev=<length>"
when insane <length> is specified; we used to correctly cap it to
the hash output length but broke it during the cycle.

* ps/build-sign-compare:
  builtin/blame: fix out-of-bounds write with blank boundary commits
  builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-10 09:19:34 -08:00
Patrick Steinhardt
e7fb2ca945 builtin/blame: fix out-of-bounds write with blank boundary commits
When passing the `-b` flag to git-blame(1), then any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.

The flag can lead to an out-of-bounds write as though when combined with
`--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ`
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.

An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes.
But when the underlying object ID is SHA1, and if the abbreviated length
exceeds the SHA1 length, it would cause us to print more bytes than
desired, and the result would be misaligned.

Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function it would be
handled by the call to printf(3p) correctly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:55 -08:00
Patrick Steinhardt
1fbb8d7ecb builtin/blame: fix out-of-bounds read with excessive --abbrev
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.

It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.

Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:54 -08:00