Commit Graph

175431 Commits

Author SHA1 Message Date
Junio C Hamano
a62d0da86a Merge branch 'ps/ci-gitlab-msvc-updates'
CI update.

* ps/ci-gitlab-msvc-updates:
  gitlab-ci: handle failed tests on MSVC+Meson job
  gitlab-ci: use "run-test-slice-meson.sh"
  ci: make test slicing consistent across Meson/Make
  github: fix Meson tests not executing at all
  meson: fix MERGE_TOOL_DIR with "--no-bin-wrappers"
  ci: don't skip smallest test slice in GitLab
  ci: handle failures of test-slice helper
2026-02-27 15:11:52 -08:00
Junio C Hamano
0f0a57e1e3 Merge branch 'jc/whitespace-incomplete-line'
It does not make much sense to apply the "incomplete-line"
whitespace rule to symbolic links, whose contents almost always
lack the final newline.  "git apply" and "git diff" are now taught
to exclude them for a change to symbolic links.

* jc/whitespace-incomplete-line:
  whitespace: symbolic links usually lack LF at the end
2026-02-27 15:11:52 -08:00
Junio C Hamano
c33b464dfd Merge branch 'jc/checkout-switch-restore'
"git switch <name>", in an attempt to create a local branch <name>
after a remote-tracking branch of the same name, gave an advice
message to disambiguate using "git checkout", which has been
updated to use "git switch".

* jc/checkout-switch-restore:
  checkout: tell "parse_remote_branch" which command is calling it
  checkout: pass program-readable token to unified "main"
2026-02-27 15:11:51 -08:00
Junio C Hamano
ce4530ac10 Merge branch 'jk/ref-filter-lrstrip-optim'
Code clean-up.

* jk/ref-filter-lrstrip-optim:
  ref-filter: clarify lstrip/rstrip component counting
  ref-filter: avoid strrchr() in rstrip_ref_components()
  ref-filter: simplify rstrip_ref_components() memory handling
  ref-filter: simplify lstrip_ref_components() memory handling
  ref-filter: factor out refname component counting
2026-02-27 15:11:51 -08:00
Junio C Hamano
bb9c781f4f Merge branch 'ps/history-ergonomics-updates'
UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input
2026-02-27 15:11:50 -08:00
Junio C Hamano
aa95f87c74 Merge branch 'ps/for-each-ref-in-fixes'
A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips
2026-02-27 15:11:50 -08:00
Junio C Hamano
341be27dfe Merge branch 'lo/repo-info-keys'
"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
  repo: add new flag --keys to git-repo-info
  repo: rename the output format "keyvalue" to "lines"
2026-02-27 15:11:49 -08:00
LorenzoPegorari
064b869efc t4052: test for diffstat width when prefix contains ANSI escape codes
Add test checking the calculation of the diffstat display width when the
`line_prefix`, which is text that goes before the diffstat, contains
ANSI escape codes.

This situation happens, for example, when `git log --stat --graph` is
executed:
* `--stat` will create a diffstat for each commit
* `--graph` will stuff `line_prefix` with the graph portion of the log,
  which contains ANSI escape codes to color the text

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 13:59:22 -08:00
LorenzoPegorari
1a9df8de36 diff: handle ANSI escape codes in prefix when calculating diffstat width
The diffstat width is calculated by taking the terminal width and
incorrectly subtracting the `strlen()` of `line_prefix`, instead of the
actual display width of `line_prefix`, which may contain ANSI escape
codes (e.g., ANSI-colored strings in `log --graph --stat`).

Utilize the display width instead, obtained via `utf8_strnwidth()` with
the flag `skip_ansi`.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 13:59:22 -08:00
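
The width fix in 1a9df8de36 above can be illustrated with a small Python sketch. It is not Git's C code: `display_width()` and `diffstat_width()` are hypothetical names, and the regular expression only strips CSI sequences and assumes one column per remaining character, whereas `utf8_strnwidth()` also accounts for wide characters.

```python
import re

# Matches CSI sequences such as "\x1b[31m" (set color) and "\x1b[m" (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def display_width(text: str) -> int:
    """Count columns after dropping ANSI escapes (one column per character)."""
    return len(ANSI_CSI.sub("", text))


def diffstat_width(term_width: int, line_prefix: str) -> int:
    """Columns left for the diffstat once the prefix has been printed."""
    return term_width - display_width(line_prefix)


# A red "|" followed by a space: 10 bytes long, but only 2 visible columns.
prefix = "\x1b[31m|\x1b[m "
print(len(prefix), display_width(prefix), diffstat_width(80, prefix))
```

Subtracting `len(prefix)` here would cost the diffstat eight columns that the terminal never displays.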
René Scharfe
a5f2ff6ce8 pack-objects: remove duplicate --stdin-packs definition
cd846bacc7 (pack-objects: introduce '--stdin-packs=follow', 2025-06-23)
added a new definition of the option --stdin-packs that accepts an
argument.  It kept the old definition, which still shows up in the short
help, but is shadowed by the new one.  Remove it.

Hinted-at-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 11:38:25 -08:00
K Jayatheerth
a66c8c7f91 repo: remove unnecessary variable shadow
Avoid redeclaring `entry` inside the conditional block, removing
unnecessary variable shadowing and improving code clarity without
changing behavior.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Acked-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 10:52:33 -08:00
Jonatan Holmgren
cdef625509 git, help: fix memory leaks in alias listing
The list_aliases() function sets the util pointer of each list item to
a heap-allocated copy of the alias command value.  Two callers failed
to free these util pointers:

 - list_cmds() in git.c collects a string list with STRING_LIST_INIT_DUP
   and clears it with string_list_clear(&list, 0), which frees the
   duplicated strings (strdup_strings=1) but not the util pointers.
   Pass free_util=1 to free them.

 - list_cmds_by_config() in help.c calls string_list_sort_u(list, 0) to
   deduplicate the list before processing completion.commands overrides.
   When duplicate entries are removed, the util pointer of each discarded
   item is leaked because free_util=0.  Pass free_util=1 to free them.

Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 13:07:24 -08:00
Jonatan Holmgren
6589294375 alias: treat empty subsection [alias ""] as plain [alias]
When git-config stores a key of the form alias..name, it records
it under an empty subsection ([alias ""]). The new subsection-aware
alias lookup would see a non-NULL but zero-length subsection and
fall into the subsection code path, where it required a "command"
key and thus silently ignored the entry.

Normalize an empty subsection to NULL before any further processing
so that entries stored this way continue to work as plain
case-insensitive aliases, matching the pre-subsection behaviour.

Users who relied on alias..name to create an alias literally named
".name" may want to migrate to subsection syntax, which looks less confusing:

    [alias ".name"]
        command = <value>

Add tests covering both the empty-subsection compatibility case and
the leading-dot alias via the new syntax.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 13:06:49 -08:00
Jonatan Holmgren
2e3a987f3b doc: fix list continuation in alias subsection example
The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation marks (+)
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 13:06:48 -08:00
Harald Nordgren
3ea95ac9c5 status: add status.compareBranches config for multiple branch comparisons
Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.

Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination

Any other value is ignored and a warning is shown.

When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.

The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons

This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.

Example configuration:
    [status]
        compareBranches = @{upstream} @{push}

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:25:48 -08:00
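
As a rough illustration of the parsing rules described in 3ea95ac9c5 above, the following Python sketch splits a `status.compareBranches` value and separates supported entries from ignored ones. The function name and the returned tuple are assumptions made for this example; Git's own implementation is in C and emits the warning itself.

```python
from typing import List, Optional, Tuple

SUPPORTED = ("@{upstream}", "@{push}")
DEFAULT = "@{upstream}"  # behavior when the variable is not configured


def parse_compare_branches(value: Optional[str]) -> Tuple[List[str], List[str]]:
    """Return (accepted, ignored) entries from a space-separated value."""
    if value is None:
        value = DEFAULT
    accepted: List[str] = []
    ignored: List[str] = []
    for token in value.split():
        # Anything other than the two supported values is set aside so the
        # caller can warn about it.
        (accepted if token in SUPPORTED else ignored).append(token)
    return accepted, ignored
```

A caller would show one comparison per accepted entry and warn once per ignored entry.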
Harald Nordgren
04f47265c1 refactor format_branch_comparison in preparation
Refactor format_branch_comparison function in preparation for showing
comparison with push remote tracking branch.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:25:48 -08:00
Olamide Caleb Bello
cf50830ce1 environment: move "branch.autoSetupMerge" into struct repo_config_values
The config value `branch.autoSetupMerge` is parsed in
`git_default_branch_config()` and stored in the global variable
`git_branch_track`. This global variable can be overwritten
by another repository when multiple Git repos run in the same process.

Move this value into `struct repo_config_values` in the_repository to
retain current behaviours and move towards libifying Git.
Since the variable is no longer a global variable, it has been renamed to
`branch_track` in the struct `repo_config_values`.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:22:53 -08:00
Olamide Caleb Bello
4021751558 environment: stop using core.sparseCheckout globally
The config value `core.sparseCheckout` is parsed in
`git_default_core_config()` and stored globally in
`core_apply_sparse_checkout`. This could cause it to be overwritten
by another repository when different Git repositories run in the same
process.

Move the parsed value into `struct repo_config_values` in the_repository
to retain current behaviours and move towards libifying Git.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:22:51 -08:00
Junio C Hamano
7b2bccb0d5 The 7th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:54:18 -08:00
Junio C Hamano
d1c983d41b Merge branch 'ac/string-list-sort-u-and-tests'
Code clean-up using a new helper function introduced lately.

* ac/string-list-sort-u-and-tests:
  sparse-checkout: use string_list_sort_u
2026-02-25 11:54:18 -08:00
Junio C Hamano
422cae6687 Merge branch 'mc/tr2-process-ancestry-cleanup'
Add process ancestry data to trace2 on macOS to match what we
already do on Linux and Windows.  Also adjust the way Windows
implementation reports this information to match the other two.

* mc/tr2-process-ancestry-cleanup:
  t0213: add trace2 cmd_ancestry tests
  test-tool: extend trace2 helper with 400ancestry
  trace2: emit cmd_ancestry data for Windows
  trace2: refactor Windows process ancestry trace2 event
  build: include procinfo.c impl for macOS
  trace2: add macOS process ancestry tracing
2026-02-25 11:54:18 -08:00
Junio C Hamano
b1f4b5888b Merge branch 'ps/pack-concat-wo-backfill'
"git pack-objects --stdin-packs" with "--exclude-promisor-objects"
fetched objects that are promised, which was not wanted.  This has
been fixed.

* ps/pack-concat-wo-backfill:
  builtin/pack-objects: don't fetch objects when merging packs
2026-02-25 11:54:18 -08:00
Junio C Hamano
d21437a916 Merge branch 'dk/complete-stash-import-export'
Command line completion (in contrib/) update.

* dk/complete-stash-import-export:
  completion: add stash import, export
2026-02-25 11:54:17 -08:00
Junio C Hamano
1a46f31b3e Merge branch 'jc/doc-cg-needswork'
A CodingGuidelines update.

* jc/doc-cg-needswork:
  CodingGuidelines: document NEEDSWORK comments
2026-02-25 11:54:17 -08:00
Junio C Hamano
8d15dd1ce1 Merge branch 'ds/revision-maximal-only'
"git rev-list" and friends learn "--maximal-only" to show only the
commits that are not reachable by other commits.

* ds/revision-maximal-only:
  revision: add --maximal-only option
2026-02-25 11:54:17 -08:00
Junio C Hamano
6b5ad01886 Merge branch 'cc/lop-filter-auto'
"auto filter" logic for large-object promisor remote.

* cc/lop-filter-auto:
  fetch-pack: wire up and enable auto filter logic
  promisor-remote: change promisor_remote_reply()'s signature
  promisor-remote: keep advertised filters in memory
  list-objects-filter-options: support 'auto' mode for --filter
  doc: fetch: document `--filter=<filter-spec>` option
  fetch: make filter_options local to cmd_fetch()
  clone: make filter_options local to cmd_clone()
  promisor-remote: allow a client to store fields
  promisor-remote: refactor initialising field lists
2026-02-25 11:54:17 -08:00
Junio C Hamano
e8c6456592 Merge branch 'pw/commit-msg-sample-hook'
Update sample commit-msg hook to complain when a log message has
material mailinfo considers the end of log message in the middle.

* pw/commit-msg-sample-hook:
  templates: detect commit messages containing diffs
  templates: add .gitattributes entry for sample hooks
2026-02-25 11:54:16 -08:00
Junio C Hamano
bf3c3603fd Merge branch 'kh/doc-am-format-sendmail'
Doc update.

* kh/doc-am-format-sendmail:
  doc: add caveat about round-tripping format-patch
2026-02-25 11:54:16 -08:00
Lucas Seiki Oshiro
8b97dc367a Documentation/git-repo: capitalize format descriptions
The descriptions for the git-repo output formats are in lowercase.
Capitalize these descriptions, making them consistent with the rest of
the documentation.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
906b632c4f Documentation/git-repo: replace 'NUL' with '_NUL_'
Replace all occurrences of "NUL" by "_NUL_" in git-repo.adoc, following the
convention used by other documentation files.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
2db3d0a226 t1901: adjust nul format output instead of expected value
The test 'keyvalue and nul format', as its description says, tests both
`keyvalue` and `nul` format. These formats are similar, differing only in
their field separator (= in the former, LF in the latter) and their
record separator (LF in the former, NUL in the latter). This way, both
formats can be tested using the same expected output and only replacing
the separators in one of the output formats.

However, it is not desirable to have a NUL character in the files
compared by test_cmp because, if that assertion fails, diff will consider
them binary files and won't display the differences properly.

Adjust the output of `git repo structure --format=nul` in t1901, matching the
--format=keyvalue ones. Compare this output against the same value expected
from --format=keyvalue, without using files with NUL characters in
test_cmp.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
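
The separator swap described in 2db3d0a226 above can be sketched in Python. This is only an illustration of the idea, not the shell code used in t1901. The function name and sample keys are made up, and the sketch assumes that no value contains a newline.

```python
def nul_to_keyvalue(output: str) -> str:
    """Rewrite NUL-format text using the keyvalue separators."""
    # Order matters: turn field separators (LF) into "=" first, then turn
    # record separators (NUL) into LF.
    return output.replace("\n", "=").replace("\0", "\n")


# Hypothetical keys and values, used only to show the transformation.
nul_output = "some.key\n1\0other.key\n2\0"
print(repr(nul_to_keyvalue(nul_output)))
```

After the conversion, both outputs can be compared against the same expected text with no NUL bytes in either file.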
Lucas Seiki Oshiro
b62dab3b6d t1900: rename t1900-repo to t1900-repo-info
Since the commit bbb2b93348 (builtin/repo: introduce structure subcommand,
2025-10-21), t1901 specifically tests git-repo-structure. Rename
t1900-repo to t1900-repo-info to clarify that it focuses solely on the
git-repo-info subcommand.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:42 -08:00
Lucas Seiki Oshiro
18f16b889c repo: rename struct field to repo_info_field
Change the name of the struct field to repo_info_field, making it
explicit that it is an internal data type of git-repo-info.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:42 -08:00
Lucas Seiki Oshiro
7377a6ef6b repo: replace get_value_fn_for_key by get_repo_info_field
Remove the function `get_value_fn_for_key`, which returns a function that
retrieves a value for a certain repo info key. Introduce `get_repo_info_field`
instead, which returns a struct field.

This refactor makes the structure of the function print_fields more consistent
with the function print_all_fields, improving its readability.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:42 -08:00
Lucas Seiki Oshiro
3d4e6d3193 repo: rename repo_info_fields to repo_info_field
Rename repo_info_fields as repo_info_field, following the CodingGuidelines rule
for naming arrays in singular. Rename all the references to that array
accordingly.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:42 -08:00
Lucas Seiki Oshiro
c63e64e04d CodingGuidelines: instruct to name arrays in singular
Arrays should be named in the singular form, ensuring that when
accessing an element within an array (e.g. dog[0]) it's clear that
we're referring to an element instead of a collection.

Add a new rule to CodingGuidelines asking for arrays to be named in
singular instead of plural.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:41 -08:00
Karthik Nayak
53592d68e8 refs: add GIT_REFERENCE_BACKEND to specify reference backend
Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support a URI input for the reference backend with a location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repository's $GIT_DIR. The inverse,
i.e. removal of the ref store doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of ref store is only used
when migrating between ref formats and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:40:00 -08:00
Karthik Nayak
01dc84594e refs: allow reference location in refstorage config
The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking a URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Not set up before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:38:41 -08:00
Karthik Nayak
d74aacd7c4 refs: receive and use the reference storage payload
An upcoming commit will add support for providing a URI via the
'extensions.refStorage' config. The URI will contain the reference
backend and a corresponding payload. The payload can then be used for
providing an alternate location for the reference backend.

To prepare for this, modify the existing backends to accept such an
argument when initializing via the 'init()' function. Both the files
and reftable backends will parse the information to be filesystem paths
to store references. Given that no callers pass any payload yet this is
essentially a no-op change for now.

To enable this, provide a 'refs_compute_filesystem_location()' function
which will parse the current 'gitdir' and the 'payload' to provide the
final reference directory and common reference directory (if working in
a linked worktree).

The documentation and tests will be added alongside the extension of the
config variable.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:27:12 -08:00
Karthik Nayak
2a32ac429e refs: move out stub modification to generic layer
When creating the reftable reference backend on disk, we create stubs to
ensure that the directory can be recognized as a Git repository. This is
done by calling `refs_create_refdir_stubs()`. Move this to the generic
layer as this is needed for all backends except the files
backend. In an upcoming commit where we introduce alternate reference
backend locations, we'll have to also create stubs in the $GIT_DIR
irrespective of the backend being used. This commit builds the base to
add that logic.

Similarly, move the logic for deletion of stubs to the generic layer.
The files backend recursively calls the remove function of the
'packed-backend'; skip calling the generic function there, since that
would try to delete stubs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:27:12 -08:00
Karthik Nayak
4ffbb02ee4 refs: extract out refs_create_refdir_stubs()
For Git to recognize a directory as a Git directory, it requires the
directory to contain:

  1. 'HEAD' file
  2. 'objects/' directory
  3. 'refs/' directory

Here, #1 and #3 are part of the reference storage mechanism,
specifically the files backend. Since then, newer backends such as the
reftable backend have moved to using their own path ('reftable/') for
storing references. But to ensure Git still recognizes the directory as
a Git directory, we create stubs.

There are two locations where we create stubs:

- In 'refs/reftable-backend.c' when creating the reftable backend.
- In 'clone.c' before spawning transport helpers.

In a following commit, we'll add another instance. So instead of
repeating the code, let's extract out this code to
`refs_create_refdir_stubs()` and use it.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:27:12 -08:00
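
To make the three requirements listed in 4ffbb02ee4 above concrete, here is a simplified Python sketch. The helper names are hypothetical and the stub HEAD contents are only illustrative. Git's real check in C does more than this, for example validating what HEAD contains.

```python
import os
import tempfile


def has_gitdir_layout(path: str) -> bool:
    """True if path has a HEAD file plus objects/ and refs/ directories."""
    return (
        os.path.isfile(os.path.join(path, "HEAD"))
        and os.path.isdir(os.path.join(path, "objects"))
        and os.path.isdir(os.path.join(path, "refs"))
    )


def create_stubs(path: str) -> None:
    """Create placeholder HEAD and refs/ entries (contents are illustrative)."""
    os.makedirs(os.path.join(path, "refs"), exist_ok=True)
    with open(os.path.join(path, "HEAD"), "w") as f:
        f.write("ref: refs/heads/.invalid\n")


def demo():
    """Check a directory that stores refs elsewhere, before and after stubs."""
    with tempfile.TemporaryDirectory() as gitdir:
        os.mkdir(os.path.join(gitdir, "objects"))
        before = has_gitdir_layout(gitdir)
        create_stubs(gitdir)
        after = has_gitdir_layout(gitdir)
    return before, after
```

A backend that keeps its references elsewhere fails the layout check until the stubs exist, which is why each call site needs the same helper.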
Karthik Nayak
2c69ff4819 setup: don't modify repo in create_reference_database()
The `create_reference_database()` function is used to create the
reference database during initialization of a repository. The function
calls `repo_set_ref_storage_format()` to set the repository's reference
format. This is an unexpected side-effect of the function. More so
because the function is only called in two locations:

  1. During git-init(1) where the value is propagated from the `struct
     repository_format repo_fmt` value.

  2. During git-clone(1) where the value is propagated from the
     `the_repository` value.

The former is valid, however the flow already calls
`repo_set_ref_storage_format()`, so this effort is simply duplicated.
The latter sets the existing value in `the_repository` back to itself.
While this is okay for now, introduction of more fields in
`repo_set_ref_storage_format()` would cause issues, especially
dynamically allocated strings, where we would free/allocate the same
string back into `the_repository`.

To avoid all this confusion, clean up the function to no longer take in
and set the repo's reference storage format.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:27:11 -08:00
cuiweixie
f87593ab1a fetch: fix wrong evaluation order in URL trailing-slash trimming
If i == -1, evaluating url[i] is undefined behavior (UB).

Signed-off-by: cuiweixie <cuiweixie@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:12:31 -08:00
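
The ordering issue named in f87593ab1a above comes down to testing the index before using it. The following Python sketch shows that pattern under stated assumptions: `trim_trailing_slashes()` is a hypothetical helper, not the code in fetch. In C an out-of-range read is undefined behavior. Python raises an error or wraps around to the other end of the string instead, so the same ordering rule still applies.

```python
def trim_trailing_slashes(url: str) -> str:
    """Drop every trailing "/" from url."""
    i = len(url) - 1
    # Check the bound first; "and" short-circuits, so url[i] is only
    # evaluated while i is a valid index.
    while i >= 0 and url[i] == "/":
        i -= 1
    return url[: i + 1]
```

Swapping the two conditions would index the string with i == -1 once every character has been consumed.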
Taylor Blau
d54da84bd9 midx: enable reachability bitmaps during MIDX compaction
Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.

Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.

As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.

In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:35 -08:00
Taylor Blau
9df44a97f1 midx: implement MIDX compaction
When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.

While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:

 - Preserving the MIDX chain is impossible, since there is no way to
   write a MIDX layer that contains objects or packs found in an earlier
   MIDX layer already part of the chain. So callers would have to write
   an entirely new (non-incremental) MIDX containing only the compacted
   layers, discarding all other objects/packs from the MIDX.

 - There is (currently) no way to write a MIDX layer outside of the MIDX
   chain to work around the above, such that the MIDX chain could be
   reassembled substituting the compacted layers with the MIDX that was
   written.

 - The `--stdin-packs` command-line option does not allow us to specify
   the order of packs as they appear in the MIDX. Therefore, even if
   there were workarounds for the previous two challenges, any bitmaps
   belonging to layers which come after the compacted layer(s) would no
   longer be valid.

This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.

Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:

 - Instead of calling `fill_packs_from_midx()`, we call a new function
   `fill_packs_from_midx_range()`, which walks backwards along the
   portion of the MIDX chain which we are compacting, and adds packs one
   layer at a time.

   In order to preserve the pseudo-pack order, the concatenated pack
   order is preserved, with the exception of preferred packs which are
   always added first.

 - After adding entries from the set of packs in the compaction range,
   `compute_sorted_entries()` must adjust the `pack_int_id`'s for all
   objects added in each fanout layer to match their original
   `pack_int_id`'s (as opposed to the index at which each pack appears
   in `ctx.info`).

   Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
   here, as it unconditionally recurs through the `->base_midx`. Factor
   out a `_1()` variant that operates on a single layer, reimplement
   the existing function in terms of it, and use the new variant from
   `midx_fanout_add_compact()`.

   Since we are sorting the list of objects ourselves, the order we add
   them in does not matter.

 - When writing out the new 'multi-pack-index-chain' file, discard any
   layers in the compaction range, replacing them with the newly written
   layer, instead of keeping them and placing the new layer at the end
   of the chain.

This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.

The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved regardless
of whether or not they are sorted in lexicographic order in the original
MIDX chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
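
The last change listed in 9df44a97f1 above, rewriting the chain so that the compacted layers are replaced in place, can be sketched in Python as a list operation. The function name, the inclusive index range and the layer labels are assumptions for this example. The real code works on MIDX checksums in the 'multi-pack-index-chain' file.

```python
from typing import List


def compact_chain(chain: List[str], first: int, last: int, new_layer: str) -> List[str]:
    """Replace chain[first..last] (inclusive) with new_layer, keeping order."""
    if not 0 <= first <= last < len(chain):
        raise ValueError("compaction range must lie within the chain")
    # Layers below and above the range are kept untouched, so anything
    # tied to the newer layers still sits on top of the same history.
    return chain[:first] + [new_layer] + chain[last + 1:]


# Hypothetical layer labels, oldest first.
print(compact_chain(["base", "l1", "l2", "tip"], 1, 2, "l1+l2"))
```

Appending the new layer at the end instead would reorder the chain and disturb the layers above the compacted range.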
Taylor Blau
dedf71f0b1 t/helper/test-read-midx.c: plug memory leak when selecting layer
Though our 'read-midx' test tool is capable of printing information
about a single MIDX layer identified by its checksum, no caller in our
test suite exercises this path.

Unfortunately, there is a memory leak lurking in this (currently) unused
path that would otherwise be exposed by the following commit.

This occurs when providing a MIDX layer checksum other than the tip. As
we walk over the MIDX chain trying to find the matching layer, we drop
our reference to the top-most MIDX layer. Thus, our call to
'close_midx()' later on leaks memory between the top-most MIDX layer and
the MIDX layer immediately following the specified one.

Plug this leak by holding a reference to the tip of the MIDX chain, and
ensure that we call `close_midx()` before terminating the test tool.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
Taylor Blau
9aea84c4e7 midx-write.c: factor fanout layering from compute_sorted_entries()
When computing the set of objects to appear in a MIDX, we use
compute_sorted_entries(), which handles objects from various existing
sources one fanout layer at a time.

The process for computing this set is slightly different during MIDX
compaction, so factor out the existing functionality into its own
routine to prevent `compute_sorted_entries()` from becoming too
difficult to read.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
Taylor Blau
93c67df751 midx-write.c: enumerate pack_int_id values directly
Our `midx-write.c::fill_packs_from_midx()` function currently enumerates
the range [0, m->num_packs), and then shifts its index variable up by
`m->num_packs_in_base` to produce a valid `pack_int_id`.

Instead, directly enumerate the range:

    [m->num_packs_in_base, m->num_packs_in_base + m->num_packs)

, which are the original pack_int_ids themselves as opposed to the
indexes of those packs relative to the MIDX layer they are contained
within.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
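
The equivalence behind 93c67df751 above is easy to check with a short Python sketch. The two function names are hypothetical. The parameters mirror the `num_packs_in_base` and `num_packs` fields named in the message.

```python
from typing import List


def pack_ids_shifted(num_packs_in_base: int, num_packs: int) -> List[int]:
    """Old style: enumerate [0, num_packs) and shift each index up."""
    return [i + num_packs_in_base for i in range(num_packs)]


def pack_ids_direct(num_packs_in_base: int, num_packs: int) -> List[int]:
    """New style: enumerate the pack_int_id range itself."""
    return list(range(num_packs_in_base, num_packs_in_base + num_packs))
```

Both produce the same IDs. The direct form simply makes the loop variable the `pack_int_id` rather than a layer-relative index.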
Taylor Blau
5f3e7f7279 midx-write.c: extract fill_pack_from_midx()
When filling packs from an existing MIDX, `fill_packs_from_midx()`
handles preparing a MIDX'd pack, and reading out its pack name from the
existing MIDX.

MIDX compaction will want to perform an identical operation, though the
caller will look quite different than `fill_packs_from_midx()`. To
reduce any future code duplication, extract `fill_pack_from_midx()`
from `fill_packs_from_midx()` to prepare to call our new helper function
in a future change.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
Taylor Blau
4f8543255e midx-write.c: introduce midx_pack_perm() helper
The `ctx->pack_perm` array can be considered as a permutation between
the original `pack_int_id` of some given pack to its position in the
`ctx->info` array containing all packs.

Today we can always index into this array with any known `pack_int_id`,
since there is never a `pack_int_id` which is greater than or equal to
the value `ctx->nr`.

That is not necessarily the case with MIDX compaction. For example,
suppose we have a MIDX chain with three layers, each containing three
packs. The base of the MIDX chain will have packs with IDs 0, 1, and 2,
the next layer 3, 4, and 5, and so on. If we are compacting the topmost
two layers, we'll have input `pack_int_id` values between [3, 8], but
`ctx->nr` will only be 6.

In that example, if we want to know where the pack whose original
`pack_int_id` value was, say, 7, we would compute `ctx->pack_perm[7]`,
leading to an uninitialized read, since there are only 6 entries
allocated in that array.

To address this, there are a couple of options:

 - We could allocate enough entries in `ctx->pack_perm` to accommodate
   the largest `orig_pack_int_id` value.

 - Or, we could internally shift the input values by the number of packs
   in the base layer of the lower end of the MIDX compaction range.

This patch prepares us to take the latter approach, since it does not
allocate more memory than strictly necessary. (In our above example, the
base of the lower end of the compaction range is the first MIDX layer
(having three packs), so we would end up indexing `ctx->pack_perm[7-3]`,
which is a valid read.)

Note that this patch does not actually implement that approach yet, but
merely performs a behavior-preserving refactoring which will make the
change easier to carry out in the future.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:33 -08:00
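
The index shift that 4f8543255e above prepares for can be sketched in Python using the numbers from its example. `lookup_pack_perm()` is a hypothetical stand-in, not the signature of the `midx_pack_perm()` helper. The permutation values are made up.

```python
from typing import List


def lookup_pack_perm(pack_perm: List[int], orig_pack_int_id: int,
                     num_packs_in_base: int) -> int:
    """Map an original pack_int_id to its slot, shifting by the base size."""
    index = orig_pack_int_id - num_packs_in_base
    if not 0 <= index < len(pack_perm):
        raise IndexError("pack_int_id is outside the compaction range")
    return pack_perm[index]


# Three layers of three packs; compacting the top two gives input IDs 3..8
# but only six slots. The permutation values here are arbitrary.
pack_perm = [5, 4, 3, 2, 1, 0]
print(lookup_pack_perm(pack_perm, 7, 3))  # reads pack_perm[7 - 3]
```

Indexing `pack_perm[7]` directly would fall outside the six allocated entries, which is the uninitialized read the message warns about.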