Commit Graph

175431 Commits

Author SHA1 Message Date
Kristoffer Haugsbakk
a454cdca42 doc: add caveat about round-tripping format-patch
git-format-patch(1) and git-am(1) deal with formatting commits as
patches and applying them, respectively. Naturally they use a few
delimiters to mark where the commit message ends. This can lead to
surprising behavior when these delimiters are used in the commit
message itself.

git-format-patch(1) will accept any commit message and not warn or error
about these delimiters being used.[1]

Especially problematic is the presence of unindented diffs in the commit
message; the patch machinery will naturally (since the commit message
has ended) try to apply that diff and everything after it.[2]

It is unclear whether any commands in this chain will learn to warn
about this. One concern could be that users have learned to rely on
the three-dash line rule to conveniently add extra-commit message
information in the commit message, knowing that git-am(1) will
ignore it.[4]

All of this is covered already, technically. However, we should spell
out the implications.

† 1: There is also git-commit(1) to consider. However, making that
     command warn or error out over such delimiters would be disruptive
     to all Git users who never use email in their workflow.
† 2: Recently patch(1) caused this issue for a project, but it was noted
     that git-am(1) has the same behavior[3]
† 3: https://github.com/i3/i3/pull/6564#issuecomment-3858381425
† 4: https://lore.kernel.org/git/xmqqldh4b5y2.fsf@gitster.g/
     https://lore.kernel.org/git/V3_format-patch_caveats.354@msgid.xyz/

Reported-by: Matthias Beyer <mail@beyermatthias.de>
Reported-by: Christoph Anton Mitterer <calestyo@scientia.org>
Reported-by: Matheus Tavares <matheus.tavb@gmail.com>
Reported-by: Chris Packham <judge.packham@gmail.com>
Helped-by: Jakob Haufe <sur5r@sur5r.net>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 14:37:56 -08:00
Ashwani Kumar Kamal
580d86576a t9812: modernize test path helpers
Replace assertion-style 'test -f' checks with Git's
test_path_is_file() helper for clearer failures and
consistency.

Signed-off-by: Ashwani Kumar Kamal <ashwanikamal.im421@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:08:29 -08:00
Patrick Steinhardt
732ec9b17b odb: convert odb_has_object() flags into an enum
Following the reason in the preceding commit, convert the
`odb_has_object()` flags into an enum.

With this change, we would have catched the misuse of `odb_has_object()`
that was fixed in a preceding commit as the compiler would have
generated a warning:

  ../builtin/backfill.c:71:9: error: implicit conversion from enumeration type 'enum odb_object_info_flag' to different enumeration type 'enum odb_has_object_flag' [-Werror,-Wenum-conversion]
     70 |                 if (!odb_has_object(ctx->repo->objects, &list->oid[i],
        |                      ~~~~~~~~~~~~~~
     71 |                                     OBJECT_INFO_FOR_PREFETCH))
        |                                     ^~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:05:08 -08:00
Patrick Steinhardt
f6516a5241 odb: convert object info flags into an enum
Convert the object info flags into an enum and adapt all functions that
receive these flags as parameters to use the enum instead of an integer.
This serves two purposes:

  - The function signatures become more self-documenting, as callers
    don't have to wonder which flags they expect.

  - The compiler can warn when a wrong flag type is passed.

Note that the second benefit is somewhat limited. For example, when
or-ing multiple enum flags together the result will be an integer, and
the compiler will not warn about such use cases. But where it does help
is when a single flag of the wrong type is passed, as the compiler would
generate a warning in that case.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:05:08 -08:00
Patrick Steinhardt
ae77afc347 odb: drop gaps in object info flag values
The object info flag values have a two gaps in their definitions, where
some bits are skipped over. These gaps don't really hurt, but it makes
one wonder whether anything is going on and whether a subset of flags
might be defined somewhere else.

That's not the case though. Instead, this is a case of flags that have
been dropped in the past:

  - The value 4 was used by `OBJECT_INFO_SKIP_CACHED`, removed in
    9c8a294a1a (sha1-file: remove OBJECT_INFO_SKIP_CACHED, 2020-01-02).

  - The value 8 was used by `OBJECT_INFO_ALLOW_UNKNOWN_TYPE`, removed in
    ae24b032a0 (object-file: drop OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag,
    2025-05-16).

Close those gaps to avoid any more confusion.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:05:08 -08:00
Patrick Steinhardt
6cf965ccaf builtin/fsck: fix flags passed to odb_has_object()
In `mark_object()` we invoke `has_object()` with a value of 1. This is
somewhat fishy given that the function expects a bitset of flags, so any
behaviour that this results in is purely coincidental and may break at
any point in time.

The call to `has_object()` was originally introduced in 9eb86f41de
(fsck: do not lazy fetch known non-promisor object, 2020-08-05). The
intent here was to skip lazy fetches of promisor objects: we have
already verified that the object is not a promisor object, so if the
object is missing it indicates a corrupt repository.

The hardcoded value that we pass maps to `HAS_OBJECT_RECHECK_PACKED`,
which is probably the intended behaviour: `odb_has_object()` will not
fetch promisor objects unless `HAS_OBJECT_FETCH_PROMISOR` is passed, but
we may want to verify that no concurrent process has written the object
that we're trying to read.

Convert the code to use the named flag instead of the the hardcoded
value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:05:08 -08:00
Patrick Steinhardt
048357d49d builtin/backfill: fix flags passed to odb_has_object()
The function `fill_missing_blobs()` receives an array of object IDs and
verifies for each of them whether the corresponding object exists. If it
doesn't exist, we add it to a set of objects and then batch-fetch all of
the objects at once.

The check for whether or not we already have the object is broken
though: we pass `OBJECT_INFO_FOR_PREFETCH`, but `odb_has_object()`
expects us to pass `HAS_OBJECT_*` flags. The flag expands to:

  - `OBJECT_INFO_QUICK`, which asks the object database to not reprepare
    in case the object wasn't found. This makes sense, as we'd otherwise
    reprepare the object database as many times as we have missing
    objects.

  - `OBJECT_INFO_SKIP_FETCH_OBJECT`, which asks the object database to
    not fetch the object in case it's missing. Again, this makes sense,
    as we want to batch-fetch the objects.

This shows that we indeed want the equivalent of this flag, but of
course represented as `HAS_OBJECT_*` flags.

Luckily, the code is already working correctly. The `OBJECT_INFO` flag
expands to `(1 << 3) | (1 << 4)`, none of which are valid `HAS_OBJECT`
flags. And if no flags are passed, `odb_has_object()` ends up calling
`odb_read_object_info_extended()` with exactly the above two flags that
we wanted to set in the first place.

Of course, this is pure luck, and this can break any moment. So let's
fix this and correct the code to not pass any flags at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 11:05:08 -08:00
Phillip Wood
dd2a4c0c7a diff --anchored: avoid checking unmatched lines
For a line to be an anchor it has to appear in each of the files being
diffed exactly once. With that in mind lets delay checking whether
a line is an anchor until we know there is exactly one instance of
the line in each file. As each line is checked at most once, there
is no need to cache the result of is_anchor() and we can drop that
field from the hashmap entries. When diffing 5000 recent commits in
git.git this gives a modest speedup of ~2%. In the (rather extreme)
example below that consists largely of deletions the speedup is ~16%.

    seq 0 10000000 >old
    printf '%s\n' 300000 100000 200000 >new
    git diff --no-index --anchored=300000 old new

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12 09:28:49 -08:00
Johannes Schindelin
e9edee0b34 Merge branch 'disallow-ntlm-auth-by-default'
This topic branch addresses the following vulnerability:

- **CVE-2025-66413**:
  When a user clones a repository from an attacker-controlled server,
  Git may attempt NTLM authentication and disclose the user's NTLMv2 hash
  to the remote server. Since NTLM hashing is weak, the captured hash can
  potentially be brute-forced to recover the user's credentials. This is
  addressed by disabling NTLM authentication by default.
  (https://github.com/git-for-windows/git/security/advisories/GHSA-hv9c-4jm9-jh3x)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
v2.53.0.windows.2
2026-02-12 16:06:29 +01:00
Johannes Schindelin
816db62d10 credential: advertise NTLM suppression and allow helpers to re-enable
The previous commits disabled NTLM authentication by default due to its
cryptographic weaknesses. Users can re-enable it via the config setting
http.<url>.allowNTLMAuth, but this requires manual intervention.

Credential helpers may have knowledge about which servers are trusted
for NTLM authentication (e.g., known on-prem Azure DevOps instances).
To allow them to signal this trust, introduce a simple negotiation:
when NTLM is suppressed and the server offered it, Git advertises
ntlm=suppressed to the credential helper. The helper can respond with
ntlm=allow to re-enable NTLM for this request.

This happens precisely at the point where we would otherwise warn the
user about NTLM being suppressed, ensuring the capability is only
advertised when relevant.

Helped-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-02-12 16:00:40 +01:00
Junio C Hamano
453e7b744a Merge branch 'master' of https://github.com/j6t/gitk
* 'master' of https://github.com/j6t/gitk:
  gitk: fix msgfmt being required
  gitk: fix highlighted remote prefix of branches with directories
2026-02-11 14:49:53 -08:00
Junio C Hamano
2f99f50f2d CodingGuidelines: document // comments
We do not use // comments in our C code, which is implied by the
description of multi-line comment rule and its examples, but is not
explicitly spelled out.  Spell it out.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-11 13:29:36 -08:00
Junio C Hamano
6fcee47852 The 3rd batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-11 12:29:08 -08:00
Junio C Hamano
06cef761b1 Merge branch 'rs/blame-ignore-colors-fix'
"git blame --ignore-revs=... --color-lines" did not account for
ignored revisions passing blame to the same commit an adjacent line
gets blamed for.

* rs/blame-ignore-colors-fix:
  blame: fix coloring for repeated suspects
2026-02-11 12:29:08 -08:00
Junio C Hamano
ea03b35bb5 Merge branch 'hs/t9160-test-paths'
Test update.

* hs/t9160-test-paths:
  t9160:modernize test path checking
2026-02-11 12:29:07 -08:00
Junio C Hamano
18df0fa3ca Merge branch 'am/doc-github-contributiong-link-to-submittingpatches'
GitHub repository banner update.

* am/doc-github-contributiong-link-to-submittingpatches:
  .github/CONTRIBUTING.md: link to SubmittingPatches on git-scm.com
2026-02-11 12:29:07 -08:00
Junio C Hamano
40c59964b3 Merge branch 'kh/doc-shortlog-fix'
Doc fix.

* kh/doc-shortlog-fix:
  doc: shortlog: put back trailer paragraphs
2026-02-11 12:29:07 -08:00
Junio C Hamano
cf5b37fb43 Merge branch 'sp/show-index-warn-fallback'
When "git show-index" is run outside a repository, it silently
defaults to SHA-1; the tool now warns when this happens.

* sp/show-index-warn-fallback:
  show-index: use gettext wrapping in user facing error messages
  show-index: warn when falling back to SHA-1 outside a repository
2026-02-11 12:29:06 -08:00
Patrick Steinhardt
f4eff7116d builtin/pack-objects: don't fetch objects when merging packs
The "--stdin-packs" option can be used to merge objects from multiple
packfiles given via stdin into a new packfile. One big upside of this
option is that we don't have to perform a complete rev walk to enumerate
objects. Instead, we can simply enumerate all objects that are part of
the specified packfiles, which can be significantly faster in very large
repositories.

There is one downside though: when we don't perform a rev walk we also
don't have a good way to learn about the respective object's names. As a
consequence, we cannot use the name hashes as a heuristic to get better
delta selection.

We try to offset this downside though by performing a localized rev
walk: we queue all objects that we're about to repack as interesting,
and all objects from excluded packfiles as uninteresting. We then
perform a best-effort rev walk that allows us to fill in object names.

There is one gotcha here though: when "--exclude-promisor-objects" has
not been given we will perform backfill fetches for any promised objects
that are missing. This used to not be an issue though as this option was
mutually exclusive with "--stdin-packs". But that has changed recently,
and starting with dcc9c7ef47 (builtin/repack: handle promisor packs with
geometric repacking, 2026-01-05) we will now repack promisor packs
during geometric compaction. The consequence is that a geometric repack
may now perform a bunch of backfill fetches.

We of course cannot pass "--exclude-promisor-objects" to fix this
issue -- after all, the whole intent is to repack objects part of a
promisor pack. But arguably we don't have to: the rev walk is intended
as best effort, and we already configure it to ignore missing links to
other objects. So we can adapt the walk to unconditionally disable
fetching any missing objects.

Do so and add a test that verifies we don't backfill any objects.

Reported-by: Lukas Wanko <lwanko@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-11 09:44:09 -08:00
Kristoffer Haugsbakk
6bfef81c9a doc: rerere-options.adoc: link to git-rerere(1)
Five commands include these options. Let’s link to the command so that
the curious user can learn more about what “rerere” is about.

It’s also better to consistently refer to things like
e.g. “git-subcommand(1)” over `git subcommand` or `subcommand`.

Also apply the same treatment to git-add(1).

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-10 12:27:07 -08:00
René Scharfe
af5706f033 xdiff-interface: stop using the_repository
Use the algorithm-agnostic is_null_oid() and push the dependency of
read_mmblob() on the_repository->objects to its callers.  This allows it
to be used with arbitrary object databases.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-10 08:16:14 -08:00
Junio C Hamano
864f55e190 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 12:09:10 -08:00
Junio C Hamano
6a3c66d72e Merge branch 'ty/perf-3400-optim'
Improve set-up time of a perf test.

* ty/perf-3400-optim:
  t/perf/p3400: speed up setup using fast-import
2026-02-09 12:09:10 -08:00
Junio C Hamano
35e17b2526 Merge branch 'ac/string-list-sort-u-and-tests'
The string_list API gains a new helper, string_list_sort_u(), and
new unit tests to extend coverage.

* ac/string-list-sort-u-and-tests:
  string-list: add string_list_sort_u() that mimics "sort -u"
  u-string-list: add unit tests for string-list methods
2026-02-09 12:09:10 -08:00
Junio C Hamano
4f3929275b Merge branch 'sb/doc-worktree-prune-expire-improvement'
The help text and the documentation for the "--expire" option of
"git worktree [list|prune]" have been improved.

* sb/doc-worktree-prune-expire-improvement:
  worktree: clarify that --expire only affects missing worktrees
2026-02-09 12:09:10 -08:00
Junio C Hamano
6176ee2349 Merge branch 'kn/ref-batch-output-error-reporting-fix'
A handful of code paths that started using batched ref update API
(after Git 2.51 or so) lost detailed error output, which have been
corrected.

* kn/ref-batch-output-error-reporting-fix:
  fetch: delay user information post committing of transaction
  receive-pack: utilize rejected ref error details
  fetch: utilize rejected ref error details
  update-ref: utilize rejected error details if available
  refs: add rejection detail to the callback function
  refs: skip to next ref when current ref is rejected
2026-02-09 12:09:10 -08:00
Junio C Hamano
8087aae540 Merge branch 'pw/replay-drop-empty'
"git replay" is taught to drop commits that become empty (not the
ones that are empty in the original).

* pw/replay-drop-empty:
  replay: drop commits that become empty
2026-02-09 12:09:09 -08:00
Junio C Hamano
7bf3785d09 Merge branch 'ps/history'
"git history" history rewriting UI.

* ps/history:
  builtin/history: implement "reword" subcommand
  builtin: add new "history" command
  wt-status: provide function to expose status for trees
  replay: support updating detached HEAD
  replay: support empty commit ranges
  replay: small set of cleanups
  builtin/replay: move core logic into "libgit.a"
  builtin/replay: extract core logic to replay revisions
2026-02-09 12:09:09 -08:00
Junio C Hamano
2668b6bdc4 rerere: minor documantation update
Let's not call our users "it".  Also "rerere forget \*.c" does not
forget resolutions for just '*.c'; it forgets for all the files
whose filenames end with ".c".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 11:58:53 -08:00
Kristoffer Haugsbakk
b10e0cb1f3 doc: am: fill out hook discussion
Document `--verify` and rephrase the `--[no-]verify` section to lead
with the default, in imperative mood.[1]

Historically it makes sense that only the negated forms are documented;
they are all run by default and thus you only need to use hook options
if you want to turn some of them off. But, beyond just desiring uniform
documentation,[2] it’s very much possible to have, say, a Git alias with
`--no-verify` that you might sometimes want to turn back on with
the *positive* form.

Also mention the options in the “Hooks” section and mention that
`post-applypatch` cannot be skipped.

† 1: See e.g. acffc5e9 (doc: convert git-remote to synopsis style,
     2025-12-20)
† 2: https://lore.kernel.org/git/xmqqcyct1mtq.fsf@gitster.g/

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 10:34:05 -08:00
Kristoffer Haugsbakk
c2f700ac27 doc: am: add missing config am.messageId
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 10:34:05 -08:00
Kristoffer Haugsbakk
3c18135bfd doc: am: say that --message-id adds a trailer
The option `--message-id` was added in a078f732 (git-am: add
--message-id/--no-message-id, 2014-11-25) back when git-interpret-
trailers(1) was relatively new. Let’s spell out that it is a trailer
and link to the dedicated trailer command.

Also use inline-verbatim for `Message-ID`.

Also link to git-interpret-trailers(1) on `--signoff`.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 10:34:04 -08:00
Kristoffer Haugsbakk
b679ee0422 doc: am: normalize git(1) command links
There are many mentions of commands using inline-verbatim or
emphasis ('). We just mention the command themselves, not specific
invocations like `git am <opts>`. Let’s link to them instead.

There are also many such mentions which then link to the command right
afterwards. Simplify to just using a link.

Also remove “see <gitlink>” phrases where they have now already
been mentioned.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 10:34:04 -08:00
SoutrikDas
aaf3cc3d8d t7003: modernize path existence checks using test helpers
Replace direct uses of 'test -f' and 'test -d' with
git's helper functions 'test_path_is_file' ,
'test_path_is_missing' and 'test_path_is_dir'

Signed-off-by: SoutrikDas <valusoutrik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 10:16:01 -08:00
Burak Kaan Karaçay
168d575719 t2003: modernize path existence checks using test helpers
The old style 'test -f' and 'test -d' checks are silent on failure,
which makes debugging difficult.

Replace them with the 'test_path_is_*' helpers which provide verbose
error messages when a test fails.

Signed-off-by: Burak Kaan Karaçay <bkkaracay@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 09:46:49 -08:00
Phillip Wood
58e4eeeeb5 meson: fix building mergetool docs
Building the documentation with meson when the build directory is
not an immediate subdirectory of the source directory prints the
following error

[2/1349] Generating Documentation/mer... command (wrapped by meson to set env)
../../Documentation/generate-mergetool-list.sh: line 15: ../git-mergetool--lib.sh: No such file or directory

The build does not fail because the failure is upstream of a pipe. Fix
the error by passing the correct source directory when meson runs
"generate-mergetool-list.sh". As that script sets $MERGE_TOOLS_DIR
we do not need to set it in the environment when running the script.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 09:45:48 -08:00
Johannes Schindelin
b24f37778a http: warn if might have failed because of NTLM
The new default of Git is to disable NTLM authentication by default.

To help users find the escape hatch of that config setting, should they
need it, suggest it when the authentication failed and the server had
offered NTLM, i.e. if re-enabling it would fix the problem.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-02-09 18:16:36 +01:00
Junio C Hamano
12fee11f21 checkout: tell "parse_remote_branch" which command is calling it
When "git checkout <dwim>" and "git switch <dwim>" need to error out
due to ambiguity of the branch name <dwim>, these two commands give
an advise message with a sample command that tells the user how to
disambiguate from the parse_remote_branch() function.  The sample
command hardcodes "git checkout", since this feature predates "git
switch" by a large margin.  To a user who said "git switch <dwim>"
and got this message, it is confusing.

Pass the "enum checkout_command", which was invented in the previous
step for this exact purpose, down the call chain leading to
parse_remote_branch() function to change the sample command shown to
the user in this advise message.

Also add a bit more test coverage for this "fail to DWIM under
ambiguity" that we lack, as well as the message we produce when we
fail.

Reported-by: Simon Cheng <cyqsimon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 08:33:17 -08:00
Junio C Hamano
2096948158 checkout: pass program-readable token to unified "main"
The "git checkout", "git switch", and "git restore" commands share a
single implementation, checkout_main(), which switches error message
it gives using the usage string passed by each of these three
front-ends.

In order to be able to tweak behaviours of the commands based on
which one we are executing, invent an enum that denotes which one of
these three commands is currently executing, and pass that to
checkout_main() instead.  With this step, there is no externally
visible behaviour change, as this enum parameter is only used to
choose among the three usage strings.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-09 08:33:17 -08:00
René Scharfe
6c21e53bad version: stop using the_repository
Actually it has never been used in version.c since cf7ee48190 (agent:
advertise OS name via agent capability, 2025-02-15) added the dependency
macro.  Remove it, along with the also unused struct declaration.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-08 15:16:49 -08:00
René Scharfe
10c68d2577 remove duplicate includes
The following command reports that some header files are included twice:

   $ git grep '#include' '*.c' | sort | uniq -cd

Remove the second #include line in each case, as it has no effect.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-08 15:03:06 -08:00
René Scharfe
050566633a commit: use commit_stack
Use commit_stack instead of open-coding it.  Also convert the loop
counter i to size_t to match the type of the nr member of struct
commit_stack.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-08 15:02:09 -08:00
brian m. carlson
d49f23ae2f object-file-convert: always make sure object ID algo is valid
In some cases, we zero-initialize our object IDs, which sets the algo
member to zero as well, which is not a valid algorithm number.  This is
a bad practice, but we typically paper over it in many cases by simply
substituting the repository's hash algorithm.

However, our new Rust loose object map code doesn't handle this
gracefully and can't find object IDs when the algorithm is zero because
they don't compare equal to those with the correct algo field.  In
addition, the comparison code doesn't have any knowledge of what the
main algorithm is because that's global state, so we can't adjust the
comparison.

To make our code function properly and to avoid propagating these bad
entries, if we get a source object ID with a zero algo, just make a copy
of it with the fixed algorithm.  This has the benefit of also fixing the
object IDs if we're in a single algorithm mode as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:03 -08:00
brian m. carlson
39e4dcf77d rust: add a small wrapper around the hashfile code
Our new binary object map code avoids needing to be intimately involved
with file handling by simply writing data to an object implement Write.
This makes it very easy to test by writing to a Cursor wrapping a Vec
for tests, and thus decouples it from intimate knowledge about how we
handle files.

However, we will actually want to write our data to an actual file,
since that's the most practical way to persist data.  Implement a
wrapper around the hashfile code that implements the Write trait so that
we can write our object map into a file.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:03 -08:00
brian m. carlson
40a1b4fb2b rust: add a new binary object map format
Our current loose object format has a few problems.  First, it is not
efficient: the list of object IDs is not sorted and even if it were,
there would not be an efficient way to look up objects in both
algorithms.

Second, we need to store mappings for things which are not technically
loose objects but are not packed objects, either, and so cannot be
stored in a pack index.  These kinds of things include shallows, their
parents, and their trees, as well as submodules. Yet we also need to
implement a sensible way to store the kind of object so that we can
prune unneeded entries.  For instance, if the user has updated the
shallows, we can remove the old values.

For these reasons, introduce a new binary object map format.  The
careful reader will notice that it resembles very closely the pack index
v3 format.  Add an in-memory object map as well, and allow writing to a
batched map, which can then be written later as one of the binary object
maps.  Include several tests for round tripping and data lookup across
algorithms.

Note that the use of this code elsewhere in Git will involve some C code
and some C-compatible code in Rust that will be introduced in a future
commit.  Thus, for example, we ignore the fact that if there is no
current batch and the caller asks for data to be written, this code does
nothing, mostly because this code also does not involve itself with
opening or manipulating files.  The C code that we will add later will
implement this functionality at a higher level and take care of this,
since the code which is necessary for writing to the object store is
deeply involved with our C abstractions and it would require extensive
work (which would not be especially valuable at this point) to port
those to Rust.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:03 -08:00
brian m. carlson
0c04f2621e rust: add functionality to hash an object
In a future commit, we'll want to hash some data when dealing with an
object map.  Let's make this easy by creating a structure to hash
objects and calling into the C functions as necessary to perform the
hashing.  For now, we only implement safe hashing, but in the future we
could add unsafe hashing if we want.  Implement Clone and Drop to
appropriately manage our memory.  Additionally implement Write to make
it easy to use with other formats that implement this trait.

While we're at it, add some tests for the various hashing cases.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:03 -08:00
brian m. carlson
3bb0b0afab rust: add a build.rs script for tests
Cargo uses the build.rs script to determine how to compile and link a
binary.  The only binary we're generating, however, is for our tests,
but in a future commit, we're going to link against libgit.a for some
functionality and we'll need to make sure the test binaries are
complete.

Add a build.rs file for this case and specify the files we're going to
be linking against.  Because we cannot specify different dependencies
when building our static library versus our tests, update the Makefile
to specify these dependencies for our static library to avoid race
conditions during build.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:03 -08:00
brian m. carlson
41d1d14a18 rust: fix linking binaries with cargo
When Cargo links binaries with MSVC, it uses the link.exe linker from
PATH to do so.  However, when running under a shell from MSYS, such as
when building with the Git for Windows SDK, which we do in CI, the
/ming64/bin and /usr/bin entries are first in PATH.  That means that the
Unix link binary shows up first, which obviously does not work for
linking binaries in any useful way.

To solve this problem, adjust PATH to place those binaries at the end of
the list instead of the beginning.  This allows access to the normal
Unix tools, but link.exe will be the compiler's linker.  Make sure to
export PATH explicitly: while this should be the default, it's more
robust to not rely on the shell operating in a certain way.

The reason this has not shown up before is that we typically link our
binaries from the C compiler.  However, now that we're about to
introduce a Rust build script (build.rs file), Rust will end up linking
that script to further drive Cargo, in which case we'll invoke the
linker from it.  There are other solutions, such as using LLD, but this
one is simple and reliable and is most likely to work with existing
systems.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:02 -08:00
brian m. carlson
9a1fa36fe3 hash: expose hash context functions to Rust
We'd like to be able to hash our data in Rust using the same contexts as
in C.  However, we need our helper functions to not be inline so they
can be linked into the binary appropriately.  In addition, to avoid
managing memory manually and since we don't know the size of the hash
context structure, we want to have simple alloc and free functions we
can use to make sure a context can be easily dynamically created.

Expose the helper functions and create alloc, free, and init functions
we can call.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:02 -08:00
brian m. carlson
e54bedc169 write-or-die: add an fsync component for the object map
We'll soon be writing out an object map using the hashfile code. Add an
fsync component to allow us to handle fsyncing it correctly.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07 17:41:02 -08:00