163436 Commits

Author SHA1 Message Date
Johannes Schindelin
bcdcf4dcce Add a GitHub workflow to verify that Git/Scalar work in Nano Server
In Git for Windows v2.39.0, we fixed a regression where `git.exe` would
no longer work in Windows Nano Server (frequently used in Docker
containers).

This GitHub workflow can be used to verify manually that the Git/Scalar
executables work in Nano Server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:57 +01:00
Johannes Schindelin
c6ca5c1064 Merge branch 'fixes-from-the-git-mailing-list'
These fixes have been sent to the Git mailing list but have not been
picked up by the Git project yet.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:43 +01:00
Junio C Hamano
278028ef96 Merge branch 'mh/credential-cache-authtype-request-fix'
The "cache" credential back-end did not handle authtype correctly,
which has been corrected.

* mh/credential-cache-authtype-request-fix:
  credential-cache: respect authtype capability

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:43 +01:00
Junio C Hamano
a225dbd554 Merge branch 'jc/show-index-h-update'
Doc and short-help text for "show-index" has been clarified to
stress that the command reads its data from the standard input.

* jc/show-index-h-update:
  show-index: the short help should say the command reads from its input

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:42 +01:00
Junio C Hamano
803d56f5d4 Merge branch 'bf/fetch-set-head-fix' into jch
Fetching into a bare repository incorrectly assumed it always used
a mirror layout when deciding to update remote-tracking HEAD, which
has been corrected.

* bf/fetch-set-head-fix:
  fetch set_head: fix non-mirror remotes in bare repositories

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:42 +01:00
Junio C Hamano
e97324a463 Merge branch 'rs/ref-filter-used-atoms-value-fix'
"git branch --sort=..." and "git for-each-ref --format=... --sort=..."
did not work as expected with some atoms, which has been corrected.

* rs/ref-fitler-used-atoms-value-fix:
  ref-filter: remove ref_format_clear()
  ref-filter: move is-base tip to used_atom
  ref-filter: move ahead-behind bases into used_atom

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:42 +01:00
Junio C Hamano
2dace82faa Merge branch 'kn/reflog-migration-fix-followup'
Code clean-up.

* kn/reflog-migration-fix-followup:
  reftable: prevent 'update_index' changes after adding records
  refs: use 'uint64_t' for 'ref_update.index'
  refs: mark `ref_transaction_update_reflog()` as static

These patches have been actually rebased onto a better base (the
`kn/reflog-migration` tip instead of the merge commit that merged this
tip).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:41 +01:00
Junio C Hamano
9607fee092 Merge branch 'kn/reflog-migration-fix'
"git refs migrate" for migrating reflog data was broken.

* kn/reflog-migration-fix:
  reftable: write correct max_update_index to header

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:41 +01:00
Junio C Hamano
02e4e7b521 Merge branch 'en/object-name-with-funny-refname-fix'
Extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") is involved.

* en/object-name-with-funny-refname-fix:
  object-name: be more strict in parsing describe-like output
  object-name: fix resolution of object names containing curly braces

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:41 +01:00
Junio C Hamano
24f5d72492 Merge branch 'jk/pack-header-parse-alignment-fix'
It was possible for "git unpack-objects" and "git index-pack" to
make an unaligned access, which has been corrected.

* jk/pack-header-parse-alignment-fix:
  index-pack, unpack-objects: use skip_prefix to avoid magic number
  index-pack, unpack-objects: use get_be32() for reading pack header
  parse_pack_header_option(): avoid unaligned memory writes
  packfile: factor out --pack_header argument parsing
  bswap.h: squelch potential sparse -Wcast-truncate warnings

These patches have actually been rebased onto v2.46.2 for easier
merging.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:41 +01:00
Jeff King
51354d1af5 update-ref: do set reflog's old_oid
In git 2.48.1, the `git update-ref` subcommand no longer correctly
updates the reflog in some cases. Specifically, it appears that the
`old_oid` field will not be updated when modifying a branch referenced
by another symbolic ref (e.g. HEAD). This doesn't break the `git
reflog` subcommand, but does break references like `HEAD@{1}`, which
appear to read the `old_oid` field:

  git init -b main
  git commit --allow-empty -m "A"
  git commit --allow-empty -m "B"
  git update-ref -m "reason" refs/heads/main HEAD~ HEAD

The `old_oid` field is now empty (all zeroes). This is only the case in
derived reflogs (in this case .git/logs/HEAD). The reflog for
`refs/heads/main` appears to be updated correctly.

This was broken in 297c09eabb (refs: allow multiple reflog entries for
the same refname, 2024-12-16).

The reason for that was that there was assumed the flow of
`lock_ref_for_update()` for reflog only updates was to capture the lock
only. But this is wrong since this misses the `old_oid` population. As
such this patch is the correct fix.

Reported-by: Nika Layzell <nika@thelayzells.com>
Acked-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:40 +01:00
Junio C Hamano
9b7b978476 Merge branch 'ps/object-collision-check'
CI jobs gave sporadic failures, which turns out that that the
object finalization code was giving an error when it did not have
to.

* ps/object-collision-check:
  object-file: retry linking file into place when occluding file vanishes
  object-file: don't special-case missing source file in collision check
  object-file: rename variables in `check_collision()`
  object-file: fix race in object collision check

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:40 +01:00
Jeff King
a2e7060a1f grep: prevent ^$ false match at end of file
In some implementations, `regexec_buf()` assumes that it is fed lines;
Without `REG_NOTEOL` it thinks the end of the buffer is the end of a
line. Which makes sense, but trips up this case because we are not
feeding lines, but rather a whole buffer. So the final newline is not
the start of an empty line, but the true end of the buffer.

This causes an interesting bug:

  $ echo content >file.txt
  $ git grep --no-index -n '^$' file.txt
  file.txt:2:

This bug is fixed by making the end of the buffer consistently the end
of the final line.

The patch was applied from
https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/

Reported-by: Olly Betts <olly@survex.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:40 +01:00
Junio C Hamano
cbb262adf2 Merge branch 'jk/lsan-race-ignore-false-positive'
The code to check LSan results has been simplified and made more
robust.

* jk/lsan-race-ignore-false-positive:
  test-lib: add a few comments to LSan log checking
  test-lib: simplify lsan results check
  test-lib: invert return value of check_test_results_san_file_empty

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:40 +01:00
Adam Murray
299435c3e5 trace2: prevent segfault on config collection where no value specified
When TRACE2 analytics is enabled, a git config option that has no value
causes a segfault.

Steps to Reproduce
GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.*
git -c status.relativePaths version
Expected Result
git version 2.46.0
Actual Result
zsh: segmentation fault GIT_TRACE2=true

This adds checks to prevent the segfault and instead return
an empty value.

Signed-off-by: Adam Murray <ad@canva.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:40 +01:00
M Hickford
f40199fe18 credential-cache: respect authtype capability
Previously, credential-cache populated authtype regardless whether
"get" request had authtype capability. As documented in
git-credential.txt, authtype "should not be sent unless the appropriate
capability ... is provided".

Add test. Without this change, the test failed because "credential fill"
printed an incomplete credential with only protocol and host attributes
(the unexpected authtype attribute was discarded by credential.c).

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
Junio C Hamano
3e20693c1d show-index: the short help should say the command reads from its input
The short help text given by "git show-index -h" says

    $ git show-index -h
    usage: git show-index [--object-format=<hash-algorithm>]

        --[no-]object-format <hash-algorithm>
                              specify the hash algorithm to use

The command takes a pack .idx file from its standard input.  The
user has to _know_ this, as there is no indication from this output.

Give a hint that the data to work on is fed from its standard input.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
Bence Ferdinandy
c4c6f21a2e fetch set_head: fix non-mirror remotes in bare repositories
In b1b713f722 (fetch set_head: handle mirrored bare repositories,
2024-11-22) it was implicitly assumed that all remotes will be mirrors
in a bare repository, thus fetching a non-mirrored remote could lead to
HEAD pointing to a non-existent reference. Make sure we only overwrite
HEAD if we are in a bare repository and fetching from a mirror.
Otherwise, proceed as normally, and create
refs/remotes/<nonmirrorremote>/HEAD instead.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
René Scharfe
135a432f4c ref-filter: remove ref_format_clear()
Now that ref_format_clear() no longer releases any memory we don't need
it anymore.  Remove it and its counterpart, ref_format_init().

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
René Scharfe
2fbabb941b ref-filter: move is-base tip to used_atom
The string_list "is_base_tips" in struct ref_format stores the
committish part of "is-base:<committish>".  It has the same problems
that its sibling string_list "bases" had.  Fix them the same way as the
previous commit did for the latter, by replacing the string_list with
fields in "used_atom".

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
René Scharfe
8849ac4117 ref-filter: move ahead-behind bases into used_atom
verify_ref_format() parses a ref-filter format string and stores
recognized items in the static array "used_atom".  For
"ahead-behind:<committish>" it stores the committish part in a
string_list member "bases" of struct ref_format.

ref_sorting_options() also parses bare ref-filter format items and
stores stores recognized ones in "used_atom" as well.  The committish
parts go to a dummy struct ref_format in parse_sorting_atom(), though,
and are leaked and forgotten.

If verify_ref_format() is called before ref_sorting_options(), like in
git for-each-ref, then all works well if the sort key is included in the
format string.  If it isn't then sorting cannot work as the committishes
are missing.

If ref_sorting_options() is called first, like in git branch, then we
have the additional issue that if the sort key is included in the format
string then filter_ahead_behind() can't see its committish, will not
generate any results for it and thus it will be expanded to an empty
string.

Fix those issues by replacing the string_list with a field in used_atom
for storing the committish.  This way it can be shared for handling both
ref-filter format strings and sorting options in the same command.

Reported-by: Ross Goldberg <ross.goldberg@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:39 +01:00
Karthik Nayak
2b0861883e reftable: prevent 'update_index' changes after adding records
The function `reftable_writer_set_limits()` allows updating the
'min_update_index' and 'max_update_index' of a reftable writer. These
values are written to both the writer's header and footer.

Since the header is written during the first block write, any subsequent
changes to the update index would create a mismatch between the header
and footer values. The footer would contain the newer values while the
header retained the original ones.

To fix this bug, prevent callers from updating these values after any
record is written. To do this, modify the function to return an error
whenever the limits are modified after any record adds. Check for record
adds within `reftable_writer_set_limits()` by checking the `last_key`
variable, which is set whenever a new record is added.

Modify all callers of the function to anticipate a return type and
handle it accordingly.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Karthik Nayak
8ba42f1987 reftable: write correct max_update_index to header
In 297c09eabb (refs: allow multiple reflog entries for the same refname,
2024-12-16), the reftable backend learned to handle multiple reflog
entries within the same transaction. This was done modifying the
`update_index` for reflogs with multiple indices. During writing the
logs, the `max_update_index` of the writer was modified to ensure the
limits were raised to the modified `update_index`s.

However, since ref entries are written before the modification to the
`max_update_index`, if there are multiple blocks to be written, the
reftable backend writes the header with the old `max_update_index`. When
all logs are finally written, the footer will be written with the new
`min_update_index`. This causes a mismatch between the header and the
footer and causes the reftable file to be corrupted. The existing tests
only spawn a single block and since headers are lazily written with the
first block, the tests didn't capture this bug.

To fix the issue, the appropriate `max_update_index` limit must be set
even before the first block is written. Add a `max_index` field to the
transaction which holds the `max_index` within all its updates, then
propagate this value to the reftable backend, wherein this is used to
the set the `max_update_index` correctly.

Add a test which creates a few thousand reference updates with multiple
reflog entries, which should trigger the bug.

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Elijah Newren
5aeb1e471a object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt:
    '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
      Output from `git describe`; i.e. a closest tag, optionally
      followed by a dash and a number of commits, followed by a dash, a
      'g', and an abbreviated object name.
which means that output of the format
    ${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}.  This is fine.  However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid.  That is problematic, since it
breaks things like

    git cat-file -p branchname:path/to/file/named/i-gaffed

which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.

A few additional notes:
  - This is a slight backward incompatibility break, because we used
    to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}.  However,
    a backward incompatible break is necessary, because there is no
    other way for someone to be more specific and disambiguate that they
    want the blob master:path/to/who-gabbed instead of the object abbed.
  - There is a possibility that check_refname_format() rules change in
    the future.  However, we can only realistically loosen the rules
    for what that function accepts rather than tighten.  If we were to
    tighten the rules, some real world repositories may already have
    refnames that suddenly become unacceptable and we break those
    repositories.  As such, any describe-like syntax of the form
    ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
    changes in this commit will remain valid in the future.
  - The fact that check_refname_format() rules could loosen in the
    future is probably also an important reason to make this change.  If
    the rules loosen, there might be additional cases within
    ${GARBAGE}-g${HASH} that become ambiguous in the future.  While
    abbreviated hashes can be disambiguated by abbreviating less, it may
    well be that these alternative object names have no way of being
    disambiguated (much like pathnames cannot be).  Accepting all random
    ${GARBAGE} thus makes it difficult for us to allow future
    extensions to object naming.

So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.

Also, add a few tests for git describe using object names of the form
    ${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
    git describe v2.48.0-rc2-161-g6c2274cdbc^0

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Karthik Nayak
80d74ed7c2 refs: use 'uint64_t' for 'ref_update.index'
The 'ref_update.index' variable is used to store an index for a given
reference update. This index is used to order the updates in a
predetermined order, while the default ordering is alphabetical as per
the refname.

For large repositories with millions of references, it should be safer
to use 'uint64_t'. Let's do that. This also is applied for all other
code sections where we store 'index' and pass it around.

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Elijah Newren
5a88867729 object-name: fix resolution of object names containing curly braces
Given a branch name of 'foo{bar', commands like

    git cat-file -p foo{bar:README.md

should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2caef9 (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:

    fatal: Not a valid object name foo{bar:README.md

Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.

Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
  * 'foo@@{...}'
  * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'

Note that we'd prefer not duplicating the special logic for "@^" characters
here, because if get_oid_basic() or interpret_nth_prior_checkout() or
get_oid_basic() or similar gain extra methods of using curly braces,
then the logic in get_oid_with_context_1() would need to be updated as
well.  But it's not clear how to refactor all of these to have a simple
common callpoint with the specialized logic.

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Helped-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Karthik Nayak
62a5707f7c refs: mark ref_transaction_update_reflog() as static
The `ref_transaction_update_reflog()` function is only used within
'refs.c', so mark it as static.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:38 +01:00
Jeff King
dae2c5e5c7 index-pack, unpack-objects: use skip_prefix to avoid magic number
When parsing --pack_header=, we manually skip 14 bytes to the data.
Let's use skip_prefix() to do this automatically.

Note that we overwrite our pointer to the front of the string, so we
have to add more context to the error message. We could avoid this by
declaring an extra pointer to hold the value, but I think the modified
message is actually preferable; it should give translators a bit more
context.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:37 +01:00
Patrick Steinhardt
a6aca325db object-file: retry linking file into place when occluding file vanishes
Prior to 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30), callers could expect that a successful return from
`finalize_object_file()` means that either the file was moved into
place, or the identical bytes were already present. If neither of those
happens, we'd return an error.

Since that commit, if the destination file disappears between our
link(3p) call and the collision check, we'd return success without
actually checking the contents, and without retrying the link. This
solves the common case that the files were indeed the same, but it means
that we may corrupt the repository if they weren't (this implies a hash
collision, but the whole point of this function is protecting against
hash collisions).

We can't be pessimistic and assume they're different; that hurts the
common case that the mentioned commit was trying to fix. But after
seeing that the destination file went away, we can retry linking again.
Adapt the code to do so when we see that the destination file has racily
vanished. This should generally succeed as we have just observed that
the destination file does not exist anymore, except in the very unlikely
event that it gets recreated by another concurrent process again.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:37 +01:00
Jeff King
16be54a929 index-pack, unpack-objects: use get_be32() for reading pack header
Both of these commands read the incoming pack into a static unsigned
char buffer in BSS, and then parse it by casting the start of the buffer
to a struct pack_header. This can result in SIGBUS on some platforms if
the compiler doesn't place the buffer in a position that is properly
aligned for 4-byte integers.

This reportedly happens with unpack-objects (but not index-pack) on
sparc64 when compiled with clang (but not gcc). But we are definitely in
the wrong in both spots; since the buffer's type is unsigned char, we
can't depend on larger alignment. When it works it is only because we
are lucky.

We'll fix this by switching to get_be32() to read the headers (just like
the last few commits similarly switched us to put_be32() for writing
into the same buffer).

It would be nice to factor this out into a common helper function, but
the interface ends up quite awkward. Either the caller needs to hardcode
how many bytes we'll need, or it needs to pass us its fill()/use()
functions as pointers. So I've just fixed both spots in the same way;
this is not code that is likely to be repeated a third time (most of the
pack reading code uses an mmap'd buffer, which should be properly
aligned).

I did make one tweak to the shared code: our pack_version_ok() macro
expects us to pass the big-endian value we'd get by casting. We can
introduce a "native" variant which uses the host integer ordering.

Reported-by: Koakuma <koachan@protonmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:37 +01:00
Patrick Steinhardt
cec3ebbdd8 object-file: don't special-case missing source file in collision check
In 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30) we have started to ignore ENOENT when opening either the
source or destination file of the collision check. This was done to
handle races more gracefully in case either of the potentially-colliding
disappears.

The fix is overly broad though: while the destination file may indeed
vanish racily, this shouldn't ever happen for the source file, which is
a temporary object file (either loose or in packfile format) that we
have just created. So if any concurrent process would have removed that
temporary file it would indicate an actual issue.

Stop treating ENOENT specially for the source file so that we always
bubble up this error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:37 +01:00
Jeff King
5285cdc7f9 parse_pack_header_option(): avoid unaligned memory writes
In order to recreate a pack header in our in-memory buffer, we cast the
buffer to a "struct pack_header" and assign the individual fields. This
is reported to cause SIGBUS on sparc64 due to alignment issues.

We can work around this by using put_be32() which will write individual
bytes into the buffer.

Reported-by: Koakuma <koachan@protonmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:37 +01:00
Patrick Steinhardt
5e1d9aa05e object-file: rename variables in check_collision()
Rename variables used in `check_collision()` to clearly identify which
file is the source and which is the destination. This will make the next
step easier to reason about when we start to treat those files different
from one another.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:37 +01:00
Jeff King
2dbdf2c476 packfile: factor out --pack_header argument parsing
Both index-pack and unpack-objects accept a --pack_header argument. This
is an undocumented internal argument used by receive-pack and fetch to
pass along information about the header of the pack, which they've
already read from the incoming stream.

In preparation for a bugfix, let's factor the duplicated code into a
common helper.

The callers are still responsible for identifying the option. While this
could likewise be factored out, it is more flexible this way (e.g., if
they ever started using parse-options and wanted to handle both the
stuck and unstuck forms).

Likewise, the callers are responsible for reporting errors, though they
both just call die(). I've tweaked unpack-objects to match index-pack in
marking the error for translation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:37 +01:00
Patrick Steinhardt
6abb17c867 object-file: fix race in object collision check
One of the tests in t5616 asserts that git-fetch(1) with `--refetch`
triggers repository maintenance with the correct set of arguments. This
test is flaky and causes us to fail sometimes:

    ++ git -c protocol.version=0 -c gc.autoPackLimit=0 -c maintenance.incremental-repack.auto=1234 -C pc1 fetch --refetch origin
    error: unable to open .git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack: No such file or directory
    fatal: unable to rename temporary file to '.git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack'
    fatal: could not finish pack-objects to repack local links
    fatal: index-pack failed
    error: last command exited with $?=128

The error message is quite confusing as it talks about trying to rename
a temporary packfile. A first hunch would thus be that this packfile
gets written by git-fetch(1), but removed by git-maintenance(1) while it
hasn't yet been finalized, which shouldn't ever happen. And indeed, when
looking closer one notices that the file that is supposedly of temporary
nature does not have the typical `tmp_pack_` prefix.

As it turns out, the "unable to rename temporary file" fatal error is a
red herring and the real error is "unable to open". That error is raised
by `check_collision()`, which is called by `finalize_object_file()` when
moving the new packfile into place. Because t5616 re-fetches objects, we
end up with the exact same pack as we already have in the repository. So
when the concurrent git-maintenance(1) process rewrites the preexisting
pack and unlinks it exactly at the point in time where git-fetch(1)
wants to check the old and new packfiles for equality we will see ENOENT
and thus `check_collision()` returns an error, which gets bubbled up by
`finalize_object_file()` and is then handled by `rename_tmp_packfile()`.
That function does not know about the exact root cause of the error and
instead just claims that the rename has failed.

This race is thus caused by b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26), where we have newly introduced
the collision check.

By definition, two files cannot collide with each other when one of them
has been removed. We can thus trivially fix the issue by ignoring ENOENT
when opening either of the files we're about to check for collision.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:37 +01:00
Junio C Hamano
ab2a205d1c bswap.h: squelch potential sparse -Wcast-truncate warnings
In put_be32(), we right-shift a uint32_t value various amounts and then
assign the low 8-bits to individual "unsigned char" bytes, throwing away
the high bits. For shifts smaller than 24 bits, those thrown away bits
will be arbitrary bits from the original uint32_t.

This works exactly as we want, but if you feed a constant, then sparse
complains. For example if we write this (which we plan to do in a future
patch):

  put_be32(hdr, PACK_SIGNATURE);

then "make sparse" produces:

  compat/bswap.h:175:22: error: cast truncates bits from constant value (5041 becomes 41)
  compat/bswap.h:176:22: error: cast truncates bits from constant value (504143 becomes 43)
  compat/bswap.h:177:22: error: cast truncates bits from constant value (5041434b becomes 4b)

And the same issue exists in the other put_be*() functions, when used
with a constant.

We can silence this warning by explicitly masking off the truncated
bits. The compiler is smart enough to know the result is the same, and
the asm generated by gcc (with both -O0 and -O2) is identical.

Curiously this line already exists:

	put_be32(&hdr_version, INDEX_EXTENSION_VERSION2);

in the fsmonitor.c file, but it does not get flagged because the CPP
macro expands to a small integer (2).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:37 +01:00
Johannes Schindelin
5cc2d6c16f Merge branch 'disallow-control-characters-in-sideband-channel'
This addresses:

- CVE-2024-52005:

	Insufficient neutralization of ANSI escape sequences in sideband
	payload can be used to mislead Git users into believing that
	certain remote-generated messages actually originate from Git.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2025-02-06 19:32:36 +01:00
Jeff King
ec65acf4c5 test-lib: add a few comments to LSan log checking
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:36 +01:00
Johannes Schindelin
63cab6aa67 unix-socket: avoid leak when initialization fails
When a Unix socket is initialized, the current directory's path is
stored so that the cleanup code can `chdir()` back to where it was
before exit.

If the path that needs to be stored exceeds the default size of the
`sun_path` attribute of `struct sockaddr_un` (which is defined as a
108-sized byte array on Linux), a larger buffer needs to be allocated so
that it can hold the path, and it is the responsibility of the
`unix_sockaddr_cleanup()` function to release that allocated memory.

In Git's CI, this stack allocation is not necessary because the code is
checked out to `/home/runner/work/git/git`. Concatenate the path
`t/trash directory.t0301-credential-cache/.cache/git/credential/socket`
and a terminating NUL, and you end up with 96 bytes, 12 shy of the
default `sun_path` size.

However, I use worktrees with slightly longer paths:
`/home/me/projects/git/yes/i/nest/worktrees/to/organize/them/` is more
in line with what I have. When I recently tried to locally reproduce a
failure of the `linux-leaks` CI job, this t0301 test failed (where it
had not failed in CI).

The reason: When `credential-cache` tries to reach its daemon initially
by calling `unix_sockaddr_init()`, it is expected that the daemon cannot
be reached (the idea is to spin up the daemon in that case and try
again). However, when this first call to `unix_sockaddr_init()` fails,
the code returns early from the `unix_stream_connect()` function
_without_ giving the cleanup code a chance to run, skipping the
deallocation of above-mentioned path.

The fix is easy: do not return early but instead go directly to the
cleanup code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:36 +01:00
Johannes Schindelin
65896a3814 sideband: do allow ANSI color sequences by default
The preceding two commits introduced special handling of the sideband
channel to neutralize ANSI escape sequences before sending the payload
to the terminal, and `sideband.allowControlCharacters` to override that
behavior.

However, some `pre-receive` hooks that are actively used in practice
want to color their messages and therefore rely on the fact that Git
passes them through to the terminal.

In contrast to other ANSI escape sequences, it is highly unlikely that
coloring sequences can be essential tools in attack vectors that mislead
Git users e.g. by hiding crucial information.

Therefore we can have both: Continue to allow ANSI coloring sequences to
be passed to the terminal, and neutralize all other ANSI escape
sequences.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:36 +01:00
Jeff King
75dde9545a test-lib: simplify lsan results check
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:

  - it's one fewer process to run

  - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
    checked at the beginning of the function, and is the same glob used
    to show the logs in check_test_results_san_file_

  - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
    space in it. For example doing:

       mkdir "/tmp/foo bar"
       TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test

    would yield a lot of:

      grep: /tmp/foo: No such file or directory
      grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory

    when there are leaks. We could do the same thing with "xargs
    --null", but that isn't portable.

We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.

We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:36 +01:00
Johannes Schindelin
976299bdeb sideband: introduce an "escape hatch" to allow control characters
The preceding commit fixed the vulnerability whereas sideband messages
(that are under the control of the remote server) could contain ANSI
escape sequences that would be sent to the terminal verbatim.

However, this fix may not be desirable under all circumstances, e.g.
when remote servers deliberately add coloring to their messages to
increase their urgency.

To help with those use cases, give users a way to opt-out of the
protections: `sideband.allowControlCharacters`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:36 +01:00
Jeff King
495ab709fd test-lib: invert return value of check_test_results_san_file_empty
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:

  ! find ... |
  xargs grep leaks |
  grep -v false-positives

where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-06 19:32:36 +01:00
Johannes Schindelin
17ae787843 sideband: mask control characters
The output of `git clone` is a vital component for understanding what
has happened when things go wrong. However, these logs are partially
under the control of the remote server (via the "sideband", which
typically contains what the remote `git pack-objects` process sends to
`stderr`), and is currently not sanitized by Git.

This makes Git susceptible to ANSI escape sequence injection (see
CWE-150, https://cwe.mitre.org/data/definitions/150.html), which allows
attackers to corrupt terminal state, to hide information, and even to
insert characters into the input buffer (i.e. as if the user had typed
those characters).

To plug this vulnerability, disallow any control character in the
sideband, replacing them instead with the common `^<letter/symbol>`
(e.g. `^[` for `\x1b`, `^A` for `\x01`).

There is likely a need for more fine-grained controls instead of using a
"heavy hammer" like this, which will be introduced subsequently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:32:35 +01:00
Johannes Schindelin
acc11dc6c2 Start the merging-rebase to v2.48.1
This commit starts the rebase of a62f00cc8e9e to 6768cd3982a5
2025-02-06 19:30:47 +01:00
Johannes Schindelin
97216ef118 Merge 'readme' into HEAD
Add a README.md for GitHub goodness.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:12:08 +01:00
Johannes Schindelin
5141748846 Merge pull request #2837 from dscho/monitor-component-updates
Start monitoring updates of Git for Windows' component in the open
2025-02-06 19:12:08 +01:00
Johannes Schindelin
b1d5f8360f Merge branch 'deprecate-core.useBuiltinFSMonitor'
Originally introduced as `core.useBuiltinFSMonitor` in Git for Windows
and developed, improved and stabilized there, the built-in FSMonitor
only made it into upstream Git (after unnecessarily long hemming and
hawing and throwing overly perfectionist style review sticks into the
spokes) as `core.fsmonitor = true`.

In Git for Windows, with this topic branch, we re-introduce the
now-obsolete config setting, with warnings suggesting to existing users
how to switch to the new config setting, with the intention to
ultimately drop the patch at some stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:12:08 +01:00
Johannes Schindelin
6808fa0d33 Merge branch 'phase-out-reset-stdin'
This topic branch re-adds the deprecated --stdin/-z options to `git
reset`. Those patches were overridden by a different set of options in
the upstream Git project before we could propose `--stdin`.

We offered this in MinGit to applications that wanted a safer way to
pass lots of pathspecs to Git, and these applications will need to be
adjusted.

Instead of `--stdin`, `--pathspec-from-file=-` should be used, and
instead of `-z`, `--pathspec-file-nul`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:12:07 +01:00
Johannes Schindelin
440af7e32c Merge branch 'un-revert-editor-save-and-reset'
A fix for calling `vim` in Windows Terminal caused a regression and was
reverted. We partially un-revert this, to get the fix again.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-02-06 19:12:07 +01:00