Enable callers to generate reachability bitmaps when performing MIDX
layer compaction by combining all existing bitmaps from the compacted
layers.
Note that because of the object/pack ordering described by the previous
commit, the pseudo-pack order for the compacted MIDX is the same as
concatenating the individual pseudo-pack orderings for each layer in the
compaction range.
As a result, the only non-test or documentation change necessary is to
treat all objects as non-preferred during compaction so as not to
disturb the object ordering.
In the future, we may want to adjust which commit(s) receive
reachability bitmaps when compacting multiple .bitmap files into one, or
even generate new bitmaps (e.g., if the references have moved
significantly since the .bitmap was generated). This commit only
implements combining all existing bitmaps in range together in order to
demonstrate and lay the groundwork for more exotic strategies.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When managing a MIDX chain with many layers, it is convenient to combine
a sequence of adjacent layers into a single layer to prevent the chain
from growing too long.
While it is conceptually possible to "compact" a sequence of MIDX layers
together by running "git multi-pack-index write --stdin-packs", there
are a few drawbacks that make this less than desirable:
- Preserving the MIDX chain is impossible, since there is no way to
write a MIDX layer that contains objects or packs found in an earlier
MIDX layer already part of the chain. So callers would have to write
an entirely new (non-incremental) MIDX containing only the compacted
layers, discarding all other objects/packs from the MIDX.
- There is (currently) no way to write a MIDX layer outside of the MIDX
chain to work around the above, such that the MIDX chain could be
reassembled substituting the compacted layers with the MIDX that was
written.
- The `--stdin-packs` command-line option does not allow us to specify
the order of packs as they appear in the MIDX. Therefore, even if
there were workarounds for the previous two challenges, any bitmaps
belonging to layers which come after the compacted layer(s) would no
longer be valid.
This commit introduces a way to compact a sequence of adjacent MIDX
layers into a single layer while preserving the MIDX chain, as well as
any bitmap(s) in layers which are newer than the compacted ones.
Implementing MIDX compaction does not require a significant number of
changes to how MIDX layers are written. The main changes are as follows:
- Instead of calling `fill_packs_from_midx()`, we call a new function
`fill_packs_from_midx_range()`, which walks backwards along the
portion of the MIDX chain which we are compacting, and adds packs one
layer a time.
In order to preserve the pseudo-pack order, the concatenated pack
order is preserved, with the exception of preferred packs which are
always added first.
- After adding entries from the set of packs in the compaction range,
`compute_sorted_entries()` must adjust the `pack_int_id`'s for all
objects added in each fanout layer to match their original
`pack_int_id`'s (as opposed to the index at which each pack appears
in `ctx.info`).
Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly
here, as it unconditionally recurs through the `->base_midx`. Factor
out a `_1()` variant that operates on a single layer, reimplement
the existing function in terms of it, and use the new variant from
`midx_fanout_add_compact()`.
Since we are sorting the list of objects ourselves, the order we add
them in does not matter.
- When writing out the new 'multi-pack-index-chain' file, discard any
layers in the compaction range, replacing them with the newly written
layer, instead of keeping them and placing the new layer at the end
of the chain.
This ends up being sufficient to implement MIDX compaction in such a way
that preserves bitmaps corresponding to more recent layers in the MIDX
chain.
The tests for MIDX compaction are so far fairly spartan, since the main
interesting behavior here is ensuring that the right packs/objects are
selected from each layer, and that the pack order is preserved despite
whether or not they are sorted in lexicographic order in the original
MIDX chain.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The MIDX file format currently requires that pack files be identified by
the lexicographic ordering of their names (that is, a pack having a
checksum beginning with "abc" would have a numeric pack_int_id which is
smaller than the same value for a pack beginning with "bcd").
As a result, it is impossible to combine adjacent MIDX layers together
without permuting bits from bitmaps that are in more recent layer(s).
To see why, consider the following example:
| packs | preferred pack
--------+-------------+---------------
MIDX #0 | { X, Y, Z } | Y
MIDX #1 | { A, B, C } | B
MIDX #2 | { D, E, F } | D
, where MIDX #2's base MIDX is MIDX #1, and so on. Suppose that we want
to combine MIDX layers #0 and #1, to create a new layer #0' containing
the packs from both layers. With the original three MIDX layers, objects
are laid out in the bitmap in the order they appear in their source
pack, and the packs themselves are arranged according to the pseudo-pack
order. In this case, that ordering is Y, X, Z, B, A, C.
But recall that the pseudo-pack ordering is defined by the order that
packs appear in the MIDX, with the exception of the preferred pack,
which sorts ahead of all other packs regardless of its position within
the MIDX. In the above example, that means that pack 'Y' could be placed
anywhere (so long as it is designated as preferred), however, all other
packs must be placed in the location listed above.
Because that ordering isn't sorted lexicographically, it is impossible
to compact MIDX layers in the above configuration without permuting the
object-to-bit-position mapping. Changing this mapping would affect all
bitmaps belonging to newer layers, rendering the bitmaps associated with
MIDX #2 unreadable.
One of the goals of MIDX compaction is that we are able to shrink the
length of the MIDX chain *without* invalidating bitmaps that belong to
newer layers, and the lexicographic ordering constraint is at odds with
this goal.
However, packs do not *need* to be lexicographically ordered within the
MIDX. As far as I can gather, the only reason they are sorted lexically
is to make it possible to perform a binary search over the pack names in
a MIDX, necessary to make `midx_contains_pack()`'s performance
logarithmic in the number of packs rather than linear.
Relax this constraint by allowing MIDX writes to proceed with packs that
are not arranged in lexicographic order. `midx_contains_pack()` will
lazily instantiate a `pack_names_sorted` array on the MIDX, which will
be used to implement the binary search over pack names.
This change produces MIDXs which may not be correctly read with external
tools or older versions of Git. Though older versions of Git know how to
gracefully degrade and ignore any MIDX(s) they consider corrupt,
external tools may not be as robust. To avoid unintentionally breaking
any such tools, guard this change behind a version bump in the MIDX's
on-disk format.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since c39fffc1c9 (tests: start asserting that *.txt SYNOPSIS matches -h
output, 2022-10-13), the manual page for 'git multi-pack-index' has a
SYNOPSIS section which differs from 'git multi-pack-index -h'.
Correct this while also documenting additional options accepted by the
'write' sub-command.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since fcb2205b77 (midx: implement support for writing incremental MIDX
chains, 2024-08-06), the command-line options '--incremental' and
'--bitmap' were declared to be incompatible with one another when
running 'git multi-pack-index write'.
However, since 27afc272c4 (midx: implement writing incremental MIDX
bitmaps, 2025-03-20), that incompatibility no longer exists, despite the
documentation saying so. Correct this by removing the stale reference to
their incompatibility.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All multi-pack-index sub-commands (write, verify, repack, and expire)
support a '--progress' command-line option, despite not listing it as
one of the common options in `common_opts`.
As a result each sub-command declares its own `OPT_BIT()` for a
"--progress" command-line option. Centralize this within the
`common_opts` to avoid re-declaring it in each sub-command.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge-file" can be run outside a repository, but it ignored
all configuration, even the per-user ones. The command now uses
available configuration files to find its customization.
* yt/merge-file-outside-a-repository:
merge-file: honor merge.conflictStyle outside of a repository
A handful of documentation pages have been modernized to use the
"synopsis" style.
* ja/doc-synopsis-style-even-more:
doc: convert git-show to synopsis style
doc: fix some style issues in git-clone and for-each-ref-options
doc: finalize git-clone documentation conversion to synopsis style
doc: convert git-submodule to synopsis style
Allow recording process ID of the process that holds the lock next
to a lockfile for diagnosis.
* pc/lockfile-pid:
lockfile: add PID file for debugging stale locks
We do not use // comments in our C code, which is implied by the
description of multi-line comment rule and its examples, but is not
explicitly spelled out. Spell it out.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Five commands include these options. Let’s link to the command so that
the curious user can learn more about what “rerere” is about.
It’s also better to consistently refer to things like
e.g. “git-subcommand(1)” over `git subcommand` or `subcommand`.
Also apply the same treatment to git-add(1).
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The help text and the documentation for the "--expire" option of
"git worktree [list|prune]" have been improved.
* sb/doc-worktree-prune-expire-improvement:
worktree: clarify that --expire only affects missing worktrees
"git replay" is taught to drop commits that become empty (not the
ones that are empty in the original).
* pw/replay-drop-empty:
replay: drop commits that become empty
"git history" history rewriting UI.
* ps/history:
builtin/history: implement "reword" subcommand
builtin: add new "history" command
wt-status: provide function to expose status for trees
replay: support updating detached HEAD
replay: support empty commit ranges
replay: small set of cleanups
builtin/replay: move core logic into "libgit.a"
builtin/replay: extract core logic to replay revisions
Let's not call our users "it". Also "rerere forget \*.c" does not
forget resolutions for just '*.c'; it forgets for all the files
whose filenames end with ".c".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running outside a repository, git merge-file ignores the
merge.conflictStyle configuration variable entirely. Since the
function receives `repo` from the caller (which is NULL outside a
repository), and repo_config() falls back to reading system and user
configuration when passed NULL, pass `repo` to repo_config()
unconditionally.
Also document that merge.conflictStyle is honored.
Signed-off-by: Yannik Tausch <dev@ytausch.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In MyFirstContribution.adoc, the link to the repo_config()
documentation is invalid because the related documentation was moved
to a different file.
Replace the path for the repo_config() documentation from
'Documentation/technical/api-config.h' to 'config.h'.
Signed-off-by: SoutrikDas <valusoutrik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add synopsis block definition in asciidoc.conf.in
* convert commands to synopsis style
* use _<placeholder>_ for arguments
* minor formatting fixes
Reviewed-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* spell out all forms of --[no-]reject-shallow in git-clone
* use imperative mood for the first line of options
* Use asciidoc NOTE macro
* fix markups
Reviewed-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-z" and "--max-depth" documentation (and implementation of
"-z") in the "git last-modified" command have been updated.
* tc/last-modified-options-cleanup:
last-modified: change default max-depth to 0
last-modified: document option '--max-depth'
last-modified: document option '-z'
last-modified: clarify in the docs the command takes a pathspec
Avoid local submodule repository directory paths overlapping with
each other by encoding submodule names before using them as path
components.
* ar/submodule-gitdir-tweak:
submodule: detect conflicts with existing gitdir configs
submodule: hash the submodule name for the gitdir path
submodule: fix case-folding gitdir filesystem collisions
submodule--helper: fix filesystem collisions by encoding gitdir paths
builtin/credential-store: move is_rfc3986_unreserved to url.[ch]
submodule--helper: add gitdir migration command
submodule: allow runtime enabling extensions.submodulePathConfig
submodule: introduce extensions.submodulePathConfig
builtin/submodule--helper: add gitdir command
submodule: always validate gitdirs inside submodule_name_to_gitdir
submodule--helper: use submodule_name_to_gitdir in add_submodule
There is no option --signed-off-cc (without -by) for git send-email.
Signed-off-by: Matěj Cepl <mcepl@cepl.eu>
[kh: rebased and changed subject to house style]
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
[jc: minor copyedit in the commit message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
47beb37b (shortlog: match commit trailers with --group, 2020-09-27)
added the `trailer` bullet point with three paragraphs.[1] Later,
3dc95e09 (shortlog: support arbitrary commit format `--group`s,
2022-10-24) put the single-paragraph bullet point about `format` right
after the first paragraph about `trailer`. That meant that the second
and third paragraphs for `trailer` got moved to `format`.
Move the two paragraphs back to `trailer`. We now also need one blank
line before the final bullet point so that it does not get joined with
the second bullet point.
† 1: Technically the bullet list formatting was immediately fixed to
include all three paragraphs in 63d24fa0 (shortlog: allow multiple
groups to be specified, 2020-09-27)
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --expire option for "git worktree list" and "git worktree prune"
only affects worktrees whose working directory path no longer exists.
The help text did not make this clear, and the documentation
inconsistently used "unused" for prune but "missing" for list.
Update the help text and documentation to consistently describe these
as "missing worktrees", and use "prune" instead of "expire" when
describing the effect on missing worktrees since the terminology is
clearer.
While at it, expand the description of the "prune" subcommand itself
to better explain what it does and when to use it, as suggested by
Junio.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Sam Bostock <sam@sambostock.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a lock file is held, it can be helpful to know which process owns
it, especially when debugging stale locks left behind by crashed
processes. Add an optional feature that creates a companion PID file
alongside each lock file, containing the PID of the lock holder.
For a lock file "foo.lock", the PID file is named "foo~pid.lock". The
tilde character is forbidden in refnames and allowed in Windows
filenames, which guarantees no collision with the refs namespace
(e.g., refs "foo" and "foo~pid" cannot both exist). The file contains
a single line in the format "pid <value>" followed by a newline.
The PID file is created when a lock is acquired (if enabled), and
automatically cleaned up when the lock is released (via commit or
rollback). The file is registered as a tempfile so it gets cleaned up
by signal and atexit handlers if the process terminates abnormally.
When a lock conflict occurs, the code checks for an existing PID file
and, if found, uses kill(pid, 0) to determine if the process is still
running. This allows providing context-aware error messages:
Lock is held by process 12345. Wait for it to finish, or remove
the lock file to continue.
Or for a stale lock:
Lock was held by process 12345, which is no longer running.
Remove the stale lock file to continue.
The feature is controlled via core.lockfilePid configuration (boolean).
Defaults to false. When enabled, PID files are created for all lock
operations.
Existing PID files are always read when displaying lock errors,
regardless of the core.lockfilePid setting. This ensures helpful
diagnostics even when the feature was previously enabled and later
disabled.
Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>