Commit Graph

17213 Commits

Author SHA1 Message Date
Junio C Hamano
3fe08b8fd1 Merge branch 'cs/add-skip-submodule-ignore-all'
"git add <submodule>" has been taught to honor
submodule.<name>.ignore that is set to "all" (and requires "git add
-f" to override it).

* cs/add-skip-submodule-ignore-all:
  Documentation: update add --force option + ignore=all config
  tests: fix existing tests when add an ignore=all submodule
  tests: t2206-add-submodule-ignored: ignore=all and add --force tests
  read-cache: submodule add need --force given ignore=all configuration
  read-cache: update add_files_to_cache take param ignored_too
2026-03-09 14:36:55 -07:00
Omri Sarig
beca0ca4be doc: make it easier to find custom command information
Git supports creating additional commands through aliases, and through
placement of executables with a "git-" prefix in the PATH.

This information was not easy enough to find - users will look for this
information around the command description, but the documentation
exists in other locations.

Update the "GIT COMMANDS" section to reference the relevant sections,
making it easier to find this information.

Signed-off-by: Omri Sarig <omri.sarig13@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-07 14:04:23 -08:00
Justin Tobler
a190f01f57 Documentation: extend guidance for submitting patches
Before submitting patches on the mailing list, it is often a good idea
to check for previous related discussions or if similar work is already
in progress. This enables better coordination amongst contributors and
could avoid duplicating work.

Additionally, it is often recommended to give reviewers some time to
reply to a patch series before sending new versions. This helps collect
broader feedback and reduces unnecessary churn from rapid rerolls.

Document this guidance in "Documentation/SubmittingPatches" accordingly.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-06 14:27:05 -08:00
Li Chen
e4f9d6b0ab rebase: support --trailer
Add a new --trailer=<trailer> option to git rebase to append trailer
lines to each rewritten commit message (merge backend only).

Because the apply backend does not provide a commit-message filter,
reject --trailer when --apply is in effect and require the merge backend
instead.

This option implies --force-rebase so that fast-forwarded commits are
also rewritten. Validate trailer arguments early to avoid starting an
interactive rebase with invalid input.

Add integration tests covering error paths and trailer insertion across
non-interactive and interactive rebases.

Signed-off-by: Li Chen <me@linux.beauty>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-06 13:02:20 -08:00
Junio C Hamano
795c338de7 The 12th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05 10:04:49 -08:00
Junio C Hamano
628a66ccf6 The 11th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-04 10:53:02 -08:00
Junio C Hamano
a31d4f1860 Merge branch 'ds/config-list-with-type'
"git config list" is taught to show the values interpreted for a
specific type with the "--type=<X>" option.

* ds/config-list-with-type:
  config: use an enum for type
  config: restructure format_config()
  config: format colors quietly
  color: add color_parse_quietly()
  config: format expiry dates quietly
  config: format paths gently
  config: format bools or strings in helper
  config: format bools or ints gently
  config: format bools gently
  config: format int64s gently
  config: make 'git config list --type=<X>' work
  config: add 'gently' parameter to format_config()
  config: move show_all_config()
2026-03-04 10:53:02 -08:00
Junio C Hamano
34af1d6e87 Merge branch 'lo/repo-leftover-bits'
Clean up the code around "git repo info" command.

* lo/repo-leftover-bits:
  Documentation/git-repo: capitalize format descriptions
  Documentation/git-repo: replace 'NUL' with '_NUL_'
  t1901: adjust nul format output instead of expected value
  t1900: rename t1900-repo to t1900-repo-info
  repo: rename struct field to repo_info_field
  repo: replace get_value_fn_for_key by get_repo_info_field
  repo: rename repo_info_fields to repo_info_field
  CodingGuidelines: instruct to name arrays in singular
2026-03-04 10:53:01 -08:00
Junio C Hamano
50d7425767 Merge branch 'ps/maintenance-geometric-default'
"git maintenance" starts using the "geometric" strategy by default.

* ps/maintenance-geometric-default:
  builtin/maintenance: use "geometric" strategy by default
  t7900: prepare for switch of the default strategy
  t6500: explicitly use "gc" strategy
  t5510: explicitly use "gc" strategy
  t5400: explicitly use "gc" strategy
  t34xx: don't expire reflogs where it matters
  t: disable maintenance where we verify object database structure
  t: fix races caused by background maintenance
2026-03-04 10:53:01 -08:00
Junio C Hamano
1d0a2acb78 Merge branch 'kn/ref-location'
Allow the directory in which reference backends store their data to
be specified.

* kn/ref-location:
  refs: add GIT_REFERENCE_BACKEND to specify reference backend
  refs: allow reference location in refstorage config
  refs: receive and use the reference storage payload
  refs: move out stub modification to generic layer
  refs: extract out `refs_create_refdir_stubs()`
  setup: don't modify repo in `create_reference_database()`
2026-03-04 10:52:59 -08:00
Harald Nordgren
68791d7506 status: clarify how status.compareBranches deduplicates
The order of output when multiple branches are specified on the
configuration variable was not clearly spelled out in the
documentation.

Add a paragraph to describe the order and also how the branches are
deduplicated.  Update t6040 with additional tests to illustrate how
multiple branches are shown and deduplicated.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
[jc: made a whole replacement into incremental; wrote log message.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-04 10:13:33 -08:00
Omri Sarig
9c6569a895 doc: add information regarding external commands
Git supports running external commands in the user's PATH as if they
were built-in commands (see execv_dashed_external in git.c).

This feature was not fully documented in Git's user-facing
documentation.

Add short documentation describing how PATH is used to find a custom
subcommand.

Signed-off-by: Omri Sarig <omri.sarig13@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-04 09:21:22 -08:00
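The PATH lookup described in 9c6569a895 can be illustrated with a minimal sketch. It is not Git's implementation (that lives in C, in execv_dashed_external); the function name `find_external_command` is hypothetical, and the sketch only assumes that a custom subcommand `foo` is provided by an executable called `git-foo` somewhere on PATH.

```python
import os


def find_external_command(name, path_env):
    """Return the first executable named 'git-<name>' on path_env, or None.

    Hypothetical helper for illustration only; Git itself does not expose
    a function with this name.
    """
    candidate = "git-" + name
    for directory in path_env.split(os.pathsep):
        if not directory:
            continue  # this simplified sketch skips empty PATH entries
        full = os.path.join(directory, candidate)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(full) and os.access(full, os.X_OK):
            return full
    return None
```

Under these assumptions, `find_external_command("hello", os.environ["PATH"])` returns the path of a `git-hello` executable if one exists, which is the program that `git hello` would be expected to run.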
Patrick Steinhardt
d563ecec28 builtin/history: implement "split" subcommand
It is quite a common use case that one wants to split up one commit into
multiple commits by moving parts of the changes of the original commit
out into a separate commit. This is quite an involved operation though:

  1. Identify the commit in question that is to be dropped.

  2. Perform an interactive rebase on top of that commit's parent.

  3. Modify the instruction sheet to "edit" the commit that is to be
     split up.

  4. Drop the commit via "git reset HEAD~".

  5. Stage changes that should go into the first commit and commit it.

  6. Stage changes that should go into the second commit and commit it.

  7. Finalize the rebase.

This is quite complex, and overall I would claim that most people who
are not experts in Git would struggle with this flow.

Introduce a new "split" subcommand for git-history(1) to make this way
easier. All the user needs to do is to say `git history split $COMMIT`.
From here on, Git asks the user which parts of the commit shall be moved
out into a separate commit and, once done, asks the user for the commit
message. Git then creates that split-out commit and applies the original
commit on top of it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-03 15:09:37 -08:00
Junio C Hamano
50d063e335 The 10th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-03 11:08:13 -08:00
Jonatan Holmgren
73cc549559 doc: fix list continuation in alias.adoc
Add missing list continuation marks ('+') after code blocks and shell examples
so paragraphs render correctly as part of the preceding list item.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-03 09:59:31 -08:00
LorenzoPegorari
a56fa1ca05 doc: gitprotocol-pack: normalize italic formatting
Use a uniform italic style for command and process names.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 21:35:05 -08:00
LorenzoPegorari
b8091b7935 doc: gitprotocol-pack: improve paragraphs structure
Logically separate the introductory sentence from the first transport
description to improve readability and structural clarity.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 21:35:04 -08:00
LorenzoPegorari
267807eae1 doc: gitprotocol-pack: fix pronoun-antecedent agreement
Fix "pronoun-antecedent agreement" errors.

Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 21:35:04 -08:00
Junio C Hamano
4805bb9930 The 9th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 17:06:53 -08:00
Junio C Hamano
34113149cf Merge branch 'kh/doc-patch-id-4'
Doc update.

* kh/doc-patch-id-4:
  doc: patch-id: see also git-cherry(1)
  doc: patch-id: add script example
  doc: patch-id: emphasize multi-patch processing
2026-03-02 17:06:53 -08:00
Junio C Hamano
112252c844 Merge branch 'pw/meson-doc-mergetool'
Update build procedure for mergetool documentation in meson-based builds.

* pw/meson-doc-mergetool:
  meson: fix building mergetool docs
2026-03-02 17:06:52 -08:00
Junio C Hamano
05c4af5c8f Merge branch 'kh/doc-am-xref'
Doc update.

* kh/doc-am-xref:
  doc: am: fill out hook discussion
  doc: am: add missing config am.messageId
  doc: am: say that --message-id adds a trailer
  doc: am: normalize git(1) command links
2026-03-02 17:06:52 -08:00
Justin Tobler
e33ac9cc9e builtin/repo: collect largest inflated objects
The "structure" output for git-repo(1) shows the total inflated and disk
sizes of reachable objects in the repository, but doesn't show the size
of the largest individual objects. Since an individual object may be a
large contributor to the overall repository size, it is useful for users
to know the maximum size of individual objects.

While iterating across objects, record the size and OID of the largest
objects encountered for each object type to provide as output. Note that
the default "table" output format only displays size information and not
the corresponding OID. In a subsequent commit, the table format is
updated to add table annotations that mention the OID.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 13:54:52 -08:00
Kristoffer Haugsbakk
ea3a62c40e doc: diff-options.adoc: make *.noprefix split translatable
We cannot split single words like what we did in the previous
commit. That is because the doc translations are processed in
bigger chunks.

Instead write the two paragraphs with the only variations being this
configuration variable.

Reported-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 08:53:28 -08:00
David Timber
a8215a2051 send-email: add client certificate options
For SMTP servers that do "mutual certificate verification", the mail
client is required to present its own TLS certificate as well. This
patch adds --smtp-ssl-client-cert and --smtp-ssl-client-key for such
servers.

The problem of which private key for the certificate is chosen arises
when there are private keys in both the certificate and private key
file. According to the documentation of IO::Socket::SSL (link supplied),
the behaviour (the private key chosen) depends on the format of the
certificate. In a nutshell,

	- PKCS12: the key in the cert always takes the precedence
	- PEM: if the key file is not given, it will "try" to read one
	  from the cert PEM file

Many users may find this discrepancy unintuitive.

As for client certificates, git-send-email is implemented so that
what is possible with Perl's SSL library is exposed to the user as much
as possible. In this instance, whether to use a PEM file that contains
both the certificate and the private key should be at the user's
discretion, despite the implications.

Link: https://metacpan.org/pod/IO::Socket::SSL#SSL_cert_file-%7C-SSL_cert-%7C-SSL_key_file-%7C-SSL_key
Link: https://lore.kernel.org/all/319bf98c-52df-4bf9-b157-e4bc2bf087d6@dev.snart.me/

Signed-off-by: David Timber <dxdt@dev.snart.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-02 08:39:26 -08:00
Junio C Hamano
2cc7191751 The 8th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27 15:11:55 -08:00
Junio C Hamano
4416ec1ae3 Merge branch 'db/doc-fetch-jobs-auto'
Doc update.

* db/doc-fetch-jobs-auto:
  doc: fetch: document `--jobs=0` behavior
2026-02-27 15:11:54 -08:00
Junio C Hamano
d64a20a1b1 Merge branch 'mf/format-patch-honor-from-for-cover-letter'
"git format-patch --from=<me>" did not honor the command line
option when writing out the cover letter, which has been corrected.

* mf/format-patch-honor-from-for-cover-letter:
  format-patch: fix From header in cover letter
2026-02-27 15:11:54 -08:00
Junio C Hamano
c0d0b8daed Merge branch 'jh/alias-i18n'
Extend the alias configuration syntax to allow aliases using
characters outside ASCII alphanumeric (plus '-').

* jh/alias-i18n:
  completion: fix zsh alias listing for subsection aliases
  alias: support non-alphanumeric names via subsection syntax
  alias: prepare for subsection aliases
  help: use list_aliases() for alias listing
2026-02-27 15:11:53 -08:00
Junio C Hamano
bb9c781f4f Merge branch 'ps/history-ergonomics-updates'
UI improvements for "git history reword".

* ps/history-ergonomics-updates:
  Documentation/git-history: document default for "--update-refs="
  builtin/history: rename "--ref-action=" to "--update-refs="
  builtin/history: replace "--ref-action=print" with "--dry-run"
  builtin/history: check for merges before asking for user input
  builtin/history: perform revwalk checks before asking for user input
2026-02-27 15:11:50 -08:00
Junio C Hamano
aa95f87c74 Merge branch 'ps/for-each-ref-in-fixes'
A handful of places used refs_for_each_ref_in() API incorrectly,
which has been corrected.

* ps/for-each-ref-in-fixes:
  bisect: simplify string_list memory handling
  bisect: fix misuse of `refs_for_each_ref_in()`
  pack-bitmap: fix bug with exact ref match in "pack.preferBitmapTips"
  pack-bitmap: deduplicate logic to iterate over preferred bitmap tips
2026-02-27 15:11:50 -08:00
Junio C Hamano
341be27dfe Merge branch 'lo/repo-info-keys'
"git repo info" learns "--keys" action to list known keys.

* lo/repo-info-keys:
  repo: add new flag --keys to git-repo-info
  repo: rename the output format "keyvalue" to "lines"
2026-02-27 15:11:49 -08:00
Jonatan Holmgren
2e3a987f3b doc: fix list continuation in alias subsection example
The example showing the equivalence between alias.last and
alias.last.command was missing the list continuation mark (+)
between the shell session block and the following prose, leaving
the paragraph detached from the list item in the rendered output.

Signed-off-by: Jonatan Holmgren <jonatan@jontes.page>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 13:06:48 -08:00
Harald Nordgren
3ea95ac9c5 status: add status.compareBranches config for multiple branch comparisons
Add a new configuration variable status.compareBranches that allows
users to specify a space-separated list of branch comparisons in
git status output.

Supported values:
- @{upstream} for the current branch's upstream tracking branch
- @{push} for the current branch's push destination

Any other value is ignored and a warning is shown.

When not configured, the default behavior is equivalent to setting
`status.compareBranches = @{upstream}`, preserving backward
compatibility.

The advice messages shown are context-aware:
- "git pull" advice is shown only when comparing against @{upstream}
- "git push" advice is shown only when comparing against @{push}
- Divergence advice is shown for upstream branch comparisons

This is useful for triangular workflows where the upstream tracking
branch differs from the push destination, allowing users to see their
status relative to both branches at once.

Example configuration:
    [status]
        compareBranches = @{upstream} @{push}

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-26 07:25:48 -08:00
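The value handling described in 3ea95ac9c5 can be sketched as follows. This is an illustration under stated assumptions, not the code in Git: the function name `parse_compare_branches` and the warning text are hypothetical, and the sketch does not attempt the deduplication that a later commit documents.

```python
import sys

# The two values the commit message lists as supported.
SUPPORTED = ("@{upstream}", "@{push}")


def parse_compare_branches(value):
    """Split a status.compareBranches value into its supported entries.

    `value` is the raw config string, or None when the variable is unset.
    """
    if value is None:
        # Unset behaves like "status.compareBranches = @{upstream}".
        return ["@{upstream}"]
    result = []
    for token in value.split():
        if token in SUPPORTED:
            result.append(token)
        else:
            # Hypothetical wording; the commit only says a warning is shown.
            print(f"warning: ignoring unsupported value '{token}'",
                  file=sys.stderr)
    return result
```

With the example configuration from the commit message, `parse_compare_branches("@{upstream} @{push}")` yields both entries in the order given.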
Junio C Hamano
7b2bccb0d5 The 7th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:54:18 -08:00
Junio C Hamano
1a46f31b3e Merge branch 'jc/doc-cg-needswork'
A CodingGuidelines update.

* jc/doc-cg-needswork:
  CodingGuidelines: document NEEDSWORK comments
2026-02-25 11:54:17 -08:00
Junio C Hamano
8d15dd1ce1 Merge branch 'ds/revision-maximal-only'
"git rev-list" and friends learn "--maximal-only" to show only the
commits that are not reachable by other commits.

* ds/revision-maximal-only:
  revision: add --maximal-only option
2026-02-25 11:54:17 -08:00
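The idea behind "--maximal-only" in 8d15dd1ce1 can be illustrated with a small sketch. It assumes that "not reachable by other commits" means not reachable from any other commit in the selected set; the function name `maximal_only` and the `parents` mapping are hypothetical and say nothing about how the revision walk is implemented in Git.

```python
def maximal_only(parents, commits):
    """Return the commits that no other commit in `commits` can reach.

    `parents` maps each commit id to a list of its parent ids.
    """
    def ancestors(start):
        # Collect every commit reachable from `start`, excluding `start`.
        seen = set()
        stack = list(parents.get(start, ()))
        while stack:
            commit = stack.pop()
            if commit not in seen:
                seen.add(commit)
                stack.extend(parents.get(commit, ()))
        return seen

    reachable_from_others = set()
    for commit in commits:
        reachable_from_others |= ancestors(commit)
    # Keep input order; drop anything that is an ancestor of another commit.
    return [c for c in commits if c not in reachable_from_others]
```

For a history where D merges B and C, both of which descend from A, selecting A, B and C leaves B and C, while selecting A and D leaves only D.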
Junio C Hamano
6b5ad01886 Merge branch 'cc/lop-filter-auto'
"auto filter" logic for large-object promisor remote.

* cc/lop-filter-auto:
  fetch-pack: wire up and enable auto filter logic
  promisor-remote: change promisor_remote_reply()'s signature
  promisor-remote: keep advertised filters in memory
  list-objects-filter-options: support 'auto' mode for --filter
  doc: fetch: document `--filter=<filter-spec>` option
  fetch: make filter_options local to cmd_fetch()
  clone: make filter_options local to cmd_clone()
  promisor-remote: allow a client to store fields
  promisor-remote: refactor initialising field lists
2026-02-25 11:54:17 -08:00
Junio C Hamano
bf3c3603fd Merge branch 'kh/doc-am-format-sendmail'
Doc update.

* kh/doc-am-format-sendmail:
  doc: add caveat about round-tripping format-patch
2026-02-25 11:54:16 -08:00
Lucas Seiki Oshiro
8b97dc367a Documentation/git-repo: capitalize format descriptions
The descriptions for the git-repo output formats are in lowercase.
Capitalize these descriptions, making them consistent with the rest of
the documentation.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
906b632c4f Documentation/git-repo: replace 'NUL' with '_NUL_'
Replace all occurrences of "NUL" by "_NUL_" in git-repo.adoc, following the
convention used by other documentation files.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:43 -08:00
Lucas Seiki Oshiro
c63e64e04d CodingGuidelines: instruct to name arrays in singular
Arrays should be named in the singular form, ensuring that when
accessing an element within an array (e.g. dog[0]) it's clear that
we're referring to an element instead of a collection.

Add a new rule to CodingGuidelines asking for arrays to be named in
singular instead of plural.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 11:47:41 -08:00
Karthik Nayak
53592d68e8 refs: add GIT_REFERENCE_BACKEND to specify reference backend
Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support a URI input for reference backend with location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repository's $GIT_DIR. The inverse,
i.e. removal of the ref store doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of ref store is only used
when migrating between ref formats and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:40:00 -08:00
Karthik Nayak
01dc84594e refs: allow reference location in refstorage config
The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking a URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Is not set up before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:38:41 -08:00
Taylor Blau
d54da84bd9 midx: enable reachability bitmaps during MIDX compaction
Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.

Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.

As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.

In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:35 -08:00
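The observation in d54da84bd9, that the pseudo-pack order of a compacted MIDX equals the concatenation of the per-layer pseudo-pack orderings, can be checked with a small sketch. The function names are hypothetical, and the layer contents are borrowed from the example in b2ec8e90c2 (packs X, Y, Z with Y preferred, then A, B, C with B preferred).

```python
def pseudo_pack_order(packs, preferred):
    """Preferred pack first, then the remaining packs in their MIDX order."""
    return [preferred] + [p for p in packs if p != preferred]


def compacted_pseudo_pack_order(layers):
    """Concatenate the per-layer orderings, oldest layer first.

    `layers` is a list of (packs, preferred_pack) tuples.
    """
    order = []
    for packs, preferred in layers:
        order.extend(pseudo_pack_order(packs, preferred))
    return order


# Layers #0 and #1 from the example in b2ec8e90c2.
layers = [(["X", "Y", "Z"], "Y"), (["A", "B", "C"], "B")]
print(", ".join(compacted_pseudo_pack_order(layers)))  # Y, X, Z, B, A, C
```

The printed order matches the one stated in b2ec8e90c2, which is why existing bit positions remain valid as long as the compacted layer keeps this pack order.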
Taylor Blau
9df44a97f1 midx: implement MIDX compaction
When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.

While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:

 - Preserving the MIDX chain is impossible, since there is no way to
   write a MIDX layer that contains objects or packs found in an earlier
   MIDX layer already part of the chain. So callers would have to write
   an entirely new (non-incremental) MIDX containing only the compacted
   layers, discarding all other objects/packs from the MIDX.

 - There is (currently) no way to write a MIDX layer outside of the MIDX
   chain to work around the above, such that the MIDX chain could be
   reassembled substituting the compacted layers with the MIDX that was
   written.

 - The `--stdin-packs` command-line option does not allow us to specify
   the order of packs as they appear in the MIDX. Therefore, even if
   there were workarounds for the previous two challenges, any bitmaps
   belonging to layers which come after the compacted layer(s) would no
   longer be valid.

This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.

Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:

 - Instead of calling `fill_packs_from_midx()`, we call a new function
   `fill_packs_from_midx_range()`, which walks backwards along the
   portion of the MIDX chain which we are compacting, and adds packs one
   layer at a time.

   In order to preserve the pseudo-pack order, the concatenated pack
   order is preserved, with the exception of preferred packs which are
   always added first.

 - After adding entries from the set of packs in the compaction range,
   `compute_sorted_entries()` must adjust the `pack_int_id`'s for all
   objects added in each fanout layer to match their original
   `pack_int_id`'s (as opposed to the index at which each pack appears
   in `ctx.info`).

   Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
   here, as it unconditionally recurs through the `->base_midx`. Factor
   out a `_1()` variant that operates on a single layer, reimplement
   the existing function in terms of it, and use the new variant from
   `midx_fanout_add_compact()`.

   Since we are sorting the list of objects ourselves, the order we add
   them in does not matter.

 - When writing out the new 'multi-pack-index-chain' file, discard any
   layers in the compaction range, replacing them with the newly written
   layer, instead of keeping them and placing the new layer at the end
   of the chain.

This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.

The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved despite
whether or not they are sorted in lexicographic order in the original
MIDX chain.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:34 -08:00
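The chain rewrite described in the last bullet of 9df44a97f1 can be sketched as a list operation. This is only a model: the function name `compact_chain` is hypothetical, layers are represented by plain strings, and the chain is assumed to be listed oldest layer first.

```python
def compact_chain(chain, first, last, new_layer):
    """Replace chain[first..last] (inclusive) with a single new layer.

    Layers outside the compaction range keep their relative position, so
    the new layer sits where the compacted ones used to be instead of
    being appended to the end of the chain.
    """
    if not (0 <= first <= last < len(chain)):
        raise ValueError("compaction range out of bounds")
    return chain[:first] + [new_layer] + chain[last + 1:]
```

For a three-layer chain, `compact_chain(["midx-0", "midx-1", "midx-2"], 0, 1, "midx-0-prime")` gives `["midx-0-prime", "midx-2"]`, leaving the newest layer (and, per the commit message, its bitmaps) in place.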
Taylor Blau
b2ec8e90c2 midx: do not require packs to be sorted in lexicographic order
The MIDX file format currently requires that pack files be identified by
the lexicographic ordering of their names (that is, a pack having a
checksum beginning with "abc" would have a numeric pack_int_id which is
smaller than the same value for a pack beginning with "bcd").

As a result, it is impossible to combine adjacent MIDX layers together
without permuting bits from bitmaps that are in more recent layer(s).

To see why, consider the following example:

          | packs       | preferred pack
  --------+-------------+---------------
  MIDX #0 | { X, Y, Z } | Y
  MIDX #1 | { A, B, C } | B
  MIDX #2 | { D, E, F } | D

, where MIDX #2's base MIDX is MIDX #1, and so on. Suppose that we want
to combine MIDX layers #0 and #1, to create a new layer #0' containing
the packs from both layers. With the original three MIDX layers, objects
are laid out in the bitmap in the order they appear in their source
pack, and the packs themselves are arranged according to the pseudo-pack
order. In this case, that ordering is Y, X, Z, B, A, C.

But recall that the pseudo-pack ordering is defined by the order that
packs appear in the MIDX, with the exception of the preferred pack,
which sorts ahead of all other packs regardless of its position within
the MIDX. In the above example, that means that pack 'Y' could be placed
anywhere (so long as it is designated as preferred), however, all other
packs must be placed in the location listed above.

Because that ordering isn't sorted lexicographically, it is impossible
to compact MIDX layers in the above configuration without permuting the
object-to-bit-position mapping. Changing this mapping would affect all
bitmaps belonging to newer layers, rendering the bitmaps associated with
MIDX #2 unreadable.

One of the goals of MIDX compaction is that we are able to shrink the
length of the MIDX chain *without* invalidating bitmaps that belong to
newer layers, and the lexicographic ordering constraint is at odds with
this goal.

However, packs do not *need* to be lexicographically ordered within the
MIDX. As far as I can gather, the only reason they are sorted lexically
is to make it possible to perform a binary search over the pack names in
a MIDX, necessary to make `midx_contains_pack()`'s performance
logarithmic in the number of packs rather than linear.

Relax this constraint by allowing MIDX writes to proceed with packs that
are not arranged in lexicographic order. `midx_contains_pack()` will
lazily instantiate a `pack_names_sorted` array on the MIDX, which will
be used to implement the binary search over pack names.

This change produces MIDXs which may not be correctly read with external
tools or older versions of Git. Though older versions of Git know how to
gracefully degrade and ignore any MIDX(s) they consider corrupt,
external tools may not be as robust. To avoid unintentionally breaking
any such tools, guard this change behind a version bump in the MIDX's
on-disk format.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:33 -08:00
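The lazily built `pack_names_sorted` array mentioned in b2ec8e90c2 can be illustrated with a short sketch. The class below is a hypothetical stand-in rather than Git's C structure; it only shows how a sorted copy, created on first use, keeps the lookup logarithmic while the stored pack order stays arbitrary.

```python
import bisect


class Midx:
    """Toy model of a MIDX whose packs need not be stored in sorted order."""

    def __init__(self, pack_names):
        self.pack_names = list(pack_names)  # stored order, possibly unsorted
        self._pack_names_sorted = None      # built on the first lookup

    def contains_pack(self, name):
        # Lazily create the sorted copy used only for searching.
        if self._pack_names_sorted is None:
            self._pack_names_sorted = sorted(self.pack_names)
        names = self._pack_names_sorted
        i = bisect.bisect_left(names, name)  # binary search
        return i < len(names) and names[i] == name
```

The stored order (for example Y, X, Z) is left untouched, so the pseudo-pack order and any bitmaps that depend on it are unaffected by the lookup structure.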
Taylor Blau
d0e91c128b git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h'
Since c39fffc1c9 (tests: start asserting that *.txt SYNOPSIS matches -h
output, 2022-10-13), the manual page for 'git multi-pack-index' has a
SYNOPSIS section which differs from 'git multi-pack-index -h'.

Correct this while also documenting additional options accepted by the
'write' sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00
Taylor Blau
f775d5b1cf git-multi-pack-index(1): remove non-existent incompatibility
Since fcb2205b77 (midx: implement support for writing incremental MIDX
chains, 2024-08-06), the command-line options '--incremental' and
'--bitmap' were declared to be incompatible with one another when
running 'git multi-pack-index write'.

However, since 27afc272c4 (midx: implement writing incremental MIDX
bitmaps, 2025-03-20), that incompatibility no longer exists, despite the
documentation saying so. Correct this by removing the stale reference to
their incompatibility.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00
Taylor Blau
6b8fb17490 builtin/multi-pack-index.c: make '--progress' a common option
All multi-pack-index sub-commands (write, verify, repack, and expire)
support a '--progress' command-line option, despite not listing it as
one of the common options in `common_opts`.

As a result each sub-command declares its own `OPT_BIT()` for a
"--progress" command-line option. Centralize this within the
`common_opts` to avoid re-declaring it in each sub-command.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24 11:16:32 -08:00