pack-objects's --stdin-packs=follow mode learns to handle
excluded-but-open packs.
* tb/stdin-packs-excluded-but-open:
repack: mark non-MIDX packs above the split as excluded-open
pack-objects: support excluded-open packs with --stdin-packs
t7704: demonstrate failure with once-cruft objects above the geometric split
pack-objects: refactor `read_packs_list_from_stdin()` to use `strmap`
pack-objects: plug leak in `read_stdin_packs()`
Object name handling (disambiguation and abbreviation) has been
refactored to be backend-generic, moving logic into the respective
object database backends.
* ps/odb-generic-object-name-handling:
odb: introduce generic `odb_find_abbrev_len()`
object-file: move logic to compute packed abbreviation length
object-name: move logic to compute loose abbreviation length
object-name: simplify computing common prefixes
object-name: abbreviate loose object names without `disambiguate_state`
object-name: merge `update_candidates()` and `match_prefix()`
object-name: backend-generic `get_short_oid()`
object-name: backend-generic `repo_collect_ambiguous()`
object-name: extract function to parse object ID prefixes
object-name: move logic to iterate through packed prefixed objects
object-name: move logic to iterate through loose prefixed objects
odb: introduce `struct odb_for_each_object_options`
oidtree: extend iteration to allow for arbitrary return codes
oidtree: modernize the code a bit
object-file: fix sparse 'plain integer as NULL pointer' error
"git replay" (experimental) learns, in addition to "pick" and
"replay", a new operating mode "revert".
* sa/replay-revert:
replay: add --revert mode to reverse commit changes
sequencer: extract revert message formatting into shared function
Reduce the reference to the_repository in the worktree subsystem.
* pw/worktree-reduce-the-repository:
worktree: reject NULL worktree in get_worktree_git_dir()
worktree add: stop reading ".git/HEAD"
worktree: remove "the_repository" from is_current_worktree()
Code clean-up around the recent "hooks defined in config" topic.
* ar/config-hook-cleanups:
hook: reject unknown hook names in git-hook(1)
hook: show disabled hooks in "git hook list"
hook: show config scope in git hook list
hook: introduce hook_config_cache_entry for per-hook data
t1800: add test to verify hook execution ordering
hook: make consistent use of friendly-name in docs
hook: replace hook_list_clear() -> string_list_clear_func()
hook: detect & emit two more bugs
hook: rename cb_data_free/alloc -> hook_data_free/alloc
hook: fix minor style issues
builtin/receive-pack: properly init receive_hook strbuf
hook: move unsorted_string_list_remove() to string-list.[ch]
The unsigned integer that is used as an bitset to specify the kind
of branches interpret_branch_name() function has been changed to
use a dedicated enum type.
* jw/object-name-bitset-to-enum:
object-name: turn INTERPRET_BRANCH_* constants into enum values
"git apply" now reports the name of the input file along with the
line number when it encounters a corrupt patch, and correctly
resets the line counter when processing multiple patch files.
* jw/apply-corrupt-location:
apply: report input location in binary and garbage patch errors
apply: report input location in header parsing errors
apply: report the location of corrupt patches
split-index.c has been updated to not use the global the_repository
and the_hash_algo variables.
* rs/split-index-the-repo-fix:
split-index: stop using the_repository and the_hash_algo
The cleanup of remaining bitmaps in "ahead_behind()" has been
simplified.
* rs/ahead-behind-cleanup-optimization:
commit-reach: simplify cleanup of remaining bitmaps in ahead_behind ()
Remove the blank lines at both ends of each test_expect_success body
to match the modern style used elsewhere in the test suite.
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update t8003-blame-corner-cases.sh to redirect git-blame output
to a temporary file instead of piping it directly to not hide
the exit code of git commands behind pipes, as a crash in git
might go unnoticed.
Signed-off-by: Trieu Huynh <vikingtc4@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5ee86c273b (repack: exclude cruft pack(s) from the MIDX where
possible, 2025-06-23), geometric repacking learned to exclude cruft
packs from the MIDX when 'repack.midxMustContainCruft' is set to
'false'.
This works because packs generated with '--stdin-packs=follow' rescue
any once-unreachable objects that later become reachable, making the
resulting packs closed under reachability without needing the cruft pack
in the MIDX.
However, packs above the geometric split that were not part of the
previous MIDX may not have full object closure. When such packs are
marked as excluded-closed ('^'), pack-objects treats them as a
reachability boundary and does not traverse through them during the
follow pass, potentially leaving the resulting pack without full
closure.
Fix this by marking packs above the geometric split that were not in the
previous MIDX as excluded-open ('!') instead of excluded-closed ('^').
This causes pack-objects to walk through their commits during the follow
pass, rescuing any reachable objects not present in the closed-excluded
packs.
Note that MIDXs which were generated prior to this change and are
unlucky enough to not be closed under reachability may still exhibit
this bug, as we treat all MIDX'd packs as closed. That is true in an
overwhelming number of cases, since in order to have a non-closed MIDX
you would have to:
- Generate a pack via an earlier geometric repack that is not closed
under reachability.
- Store that pack in the MIDX.
- Avoid picking any commits to receive reachability bitmaps which
happen to reach objects from which the missing objects are reachable.
In the extremely rare chance that all of the above should happen, an
all-into-one repack will resolve the issue.
Unfortunately, there is no perfect way to determine whether a MIDX'd
pack is closed outside of ensuring that there is a '1' bit in at least
one bitmap for every bit position corresponding to objects in that pack.
While this is possible to do, this approach would treat MIDX'd packs as
open in cases where there is at least one object that is not reachable
from the subset of commits selected for bitmapping.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cd846bacc7 (pack-objects: introduce '--stdin-packs=follow',
2025-06-23), pack-objects learned to traverse through commits in
included packs when using '--stdin-packs=follow', rescuing reachable
objects from unlisted packs into the output.
When we encounter a commit in an excluded pack during this rescuing
phase we will traverse through its parents. But because we set
`revs.no_kept_objects = 1`, commit simplification will prevent us from
showing it via `get_revision()`. (In practice, `--stdin-packs=follow`
walks commits down to the roots, but only opens up trees for ones that
do not appear in an excluded pack.)
But there are certain cases where we *do* need to see the parents of an
object in an excluded pack. Namely, if an object is rescue-able, but
only reachable from object(s) which appear in excluded packs, then
commit simplification will exclude those commits from the object
traversal, and we will never see a copy of that object, and thus not
rescue it.
This is what causes the failure in the previous commit during repacking.
When performing a geometric repack, packs above the geometric split that
weren't part of the previous MIDX (e.g., packs pushed directly into
`$GIT_DIR/objects/pack`) may not have full object closure. When those
packs are listed as excluded via the '^' marker, the reachability
traversal encounters the sequence described above, and may miss objects
which we expect to rescue with `--stdin-packs=follow`.
Introduce a new "excluded-open" pack prefix, '!'. Like '^'-prefixed
packs, objects from '!'-prefixed packs are excluded from the resulting
pack. But unlike '^', commits in '!'-prefixed packs *are* used as
starting points for the follow traversal, and the traversal does not
treat them as a closure boundary.
In order to distinguish excluded-closed from excluded-open packs during
the traversal, introduce a new `pack_keep_in_core_open` bit on
`struct packed_git`, along with a corresponding `KEPT_PACK_IN_CORE_OPEN`
flag for the kept-pack cache.
In `add_object_entry_from_pack()`, move the `want_object_in_pack()`
check to *after* `add_pending_oid()`. This is necessary so that commits
from excluded-open packs are added as traversal tips even though their
objects won't appear in the output. As a consequence, the caller
`for_each_object_in_pack()` will always provide a non-NULL 'p', hence we
are able to drop the "if (p)" conditional.
The `include_check` and `include_check_obj` callbacks on `rev_info` are
used to halt the walk at closed-excluded packs, since objects behind a
'^' boundary are guaranteed to have closure and need not be rescued.
The following commit will make use of this new functionality within the
repack layer to resolve the test failure demonstrated in the previous
commit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test demonstrating a case where geometric repacking fails to
produce a pack with full object closure, thus making it impossible to
write a reachability bitmap.
Mark the test with 'test_expect_failure' for now. The subsequent commit
will explain the precise failure mode, and implement a fix.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--stdin-packs' mode of pack-objects maintains two separate
string_lists: one for included packs, and one for excluded packs. Each
list stores the pack basename as a string and the corresponding
`packed_git` pointer in its `->util` field.
This works, but makes it awkward to extend the set of pack "kinds" that
pack-objects can accept via stdin, since each new kind would need its
own string_list and duplicated handling. A future commit will want to do
just this, so prepare for that change by handling the various "kinds" of
packs specified over stdin in a more generic fashion.
Namely, replace the two `string_list`s with a single `strmap` keyed on
the pack basename, with values pointing to a new `struct
stdin_pack_info`. This struct tracks both the `packed_git` pointer and a
`kind` bitfield indicating whether the pack was specified as included or
excluded.
Extract the logic for sorting packs by mtime and adding their objects
into a separate `stdin_packs_add_pack_entries()` helper.
While we could have used a `string_list`, we must handle the case where
the same pack is specified more than once. With a `string_list` only, we
would have to pay a quadratic cost to either (a) insert elements into
their sorted positions, or (b) a repeated linear search, which is
accidentally quadratic. For that reason, use a strmap instead.
This patch does not include any functional changes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `read_stdin_packs()` function added originally via 339bce27f4
(builtin/pack-objects.c: add '--stdin-packs' option, 2021-02-22)
declares a `rev_info` struct but neglects to call `release_revisions()`
on it before returning, creating the potential for a leak.
The related change in 97ec43247c (pack-objects: declare 'rev_info' for
'--stdin-packs' earlier, 2025-06-23) carried forward this oversight and
did not address it.
Ensure that we call `release_revisions()` appropriately to prevent a
potential leak from this function. Note that in practice our `rev_info`
here does not have a present leak, hence t5331 passes cleanly before
this commit, even when built with SANITIZE=leak.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using format-patch with --commit-list-format different than shortlog,
causes the commit entry lines to wrap if they get longer than
MAIL_DEFAULT_WRAP (72 characters).
While this might be sensible for many when sending changes through
email, it forces this decision of wrapping on the user, reducing the
control granularity of --commit-list-format.
Teach generate_commit_list_cover() to respect commit entry line lengths
and place this wrapping rule on the "modern" preset format instead.
Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation specifies that "git format-patch" would default to
format.commitListFormat if --commit-list-format is not given, but
doesn't specify the default if the format.commitListFormat is not set.
The text for --cover-letter is also obsolete as the commit list can now
be something other than a shortlog.
Document to reflect changes.
Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git backfill` learned to accept revision and pathspec arguments.
* ds/backfill-revs:
t5620: test backfill's unknown argument handling
path-walk: support wildcard pathspecs for blob filtering
backfill: work with prefix pathspecs
backfill: accept revision arguments
t5620: prepare branched repo for revision tests
revision: include object-name.h
Code clean-up overdue by 19 years.
* jc/rerere-modern-strbuf-handling:
cocci: strbuf.buf is never NULL
rerere: update to modern representation of empty strbufs
Doc updates.
* kh/doc-interpret-trailers-1:
interpret-trailers: use placeholder instead of *
doc: config: convert trailers section to synopsis style
doc: interpret-trailers: normalize and fill out options
doc: interpret-trailers: convert to synopsis style
The reference-transaction hook was taught to be triggered before
taking locks on references in the "preparing" phase.
* ej/ref-transaction-hook-preparing:
refs: add 'preparing' phase to the reference-transaction hook
merge-file --object-id used to trigger a BUG when run in a linked
worktree, which has been fixed.
* mr/merge-file-object-id-worktree-fix:
merge-file: fix BUG when --object-id is used in a worktree
Uses of prio_queue as a LIFO stack of commits have been written
with commit_stack.
* rs/prio-queue-to-commit-stack:
use commit_stack instead of prio_queue in LIFO mode
The handling of the incomplete lines at the end by "git
diff-highlight" has been fixed.
* jk/diff-highlight-identical-pairs:
contrib/diff-highlight: do not highlight identical pairs
* mf/format-patch-commit-list-format:
format-patch: --commit-list-format without prefix
format-patch: add preset for --commit-list-format
format-patch: wrap generate_commit_list_cover()
format.commitListFormat: strip meaning from empty
docs/pretty-formats: add %(count) and %(total)
format-patch: rename --cover-letter-format option
format-patch: refactor generate_commit_list_cover
pretty.c: better die message %(count) and %(total)
docs: add usage for the cover-letter fmt feature
format-patch: add commitListFormat config
format-patch: add ability to use alt cover format
format-patch: move cover letter summary generation
pretty.c: add %(count) and %(total) placeholders
"git ls-remote '+refs/tags/*:refs/tags/*' https://..." run outside a
repository would dereference a NULL while trying to see if the given
refspec is a single-object refspec, which has been corrected.
* kj/refspec-parsing-outside-repository:
refspec: fix typo in comment
remote-curl: fall back to default hash outside repo
A test to run a .bat file with whitespaces in the name with arguments
with whitespaces in them was flaky in that sometimes it got killed
before it produced expected side effects, which has been rewritten to
make it more robust.
* jk/t0061-bat-test-update:
t0061: simplify .bat test
"git repo info -h" and "git repo structure -h" limit their help output
to the part that is specific to the subcommand.
* mk/repo-help-strings:
repo: show subcommand-specific help text
repo: factor repo usage strings into shared macros
In case homebrew breaks REG_ENHANCED again, leave a in-code comment
to suggest use of our replacement regex as a workaround.
* jc/macos-homebrew-wo-reg-enhanced:
regexp: leave a pointer to resurrect workaround for Homebrew
Before the recent changes to parse rev-list arguments inside of 'git
backfill', the builtin would take arbitrary arguments without complaint (and
ignore them). This was noticed and a patch was sent [1] which motivates
this change.
[1] https://lore.kernel.org/git/20260321031643.5185-1-r.siddharth.shrimali@gmail.com/
Note that the revision machinery can output an "ambiguous argument"
warning if a value not starting with '--' is found and doesn't make
sense as a reference or a pathspec. For unrecognized arguments starting
with '--' we need to add logic into builtin/backfill.c to catch leftover
arguments.
Reported-by: Siddharth Shrimali <r.siddharth.shrimali@gmail.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, walk_objects_by_path() silently ignored pathspecs containing
wildcards or magic by clearing them. This caused all blobs to be
downloaded regardless of the given pathspec. Wildcard pathspecs like
"d/file.*.txt" are useful for narrowing which blobs to process (e.g.,
during 'git backfill').
Support wildcard pathspecs by making two changes:
1. Add an 'exact_pathspecs' flag to path_walk_context. When the
pathspec has no wildcards or magic, set this flag and use the
existing fast-path prefix matching in add_tree_entries(). When
wildcards are present, skip that block since prefix matching
cannot handle glob patterns.
2. Add a match_pathspec() check in walk_path() to filter out blobs
whose full path does not match the pathspec. This provides the
actual blob-level filtering for wildcard pathspecs.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change allowed specifying revision arguments over the 'git
backfill' command-line. This created the opportunity for restricting the
initial commit set by filtering the revision walk through a pathspec. Other
than filtering the commit set (and thereby the root trees), this did not
restrict the path-walk implementation of 'git backfill' and did not restrict
the blobs that were downloaded to only those matching the pathspec.
Update the path-walk API to accept certain kinds of pathspecs and to
silently ignore anything too complex, for now. We will update this in the
next change to properly restrict to even complex pathspecs.
The current behavior focuses on pathspecs that match paths exactly. This
includes exact filenames, including directory names as prefixes. Pathspecs
containing wildcards or magic are cleared so the path walk downloads all
blobs, as before.
The reason for this restriction is to allow for a faster execution by
pruning the path walk to only trees that could contribute towards one of
those paths as a parent directory.
The test directory 'd/f/' (next to 'd/file*.txt') was prepared in a
previous commit to exercise the subtlety in prefix matching.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing implementation of 'git backfill' only includes downloading
missing blobs reachable from HEAD. Advanced uses may desire more general
commit limiting options, such as '--all' for all references, specifying a
commit range via negative references, or specifying a recency of use such as
with '--since=<date>'.
All of these options are available if we use setup_revisions() to parse the
unknown arguments with the revision machinery. This opens up a large number
of possibilities, only a small set of which are tested here.
For documentation, we avoid duplicating the option documentation and instead
link to the documentation of 'git rev-list'.
Note that these arguments currently allow specifying a pathspec, which
modifies the commit history checks but does not limit the paths used in the
backfill logic. This will be updated in a future change.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare the test infrastructure for upcoming changes that teach 'git
backfill' to accept revision arguments and pathspecs.
Add test_tick before each commit in the setup loop so that commit dates
are deterministic. This enables reliable testing with '--since'.
Rename the 'd/e/' directory to 'd/f/' so that the prefix 'd/f' is
ambiguous with the files 'd/file.*.txt'. This exercises the subtlety
in prefix pathspec matching that will be added in a later commit.
Create a branched version of the test repository (src-revs) with:
- A 'side' branch merged into main, adding s/file.{1,2}.txt with
two versions (4 new blobs, 52 total from main HEAD).
- An unmerged 'other' branch adding o/file.{1,2}.txt (2 more blobs,
54 total reachable from --all).
This structure makes --all, --first-parent, and --since produce
meaningfully different results when used with 'git backfill'.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The REV_INFO_INIT macro includes a use of the DEFAULT_ABBREV macro, which is
defined in object-name.h. Include it in revision.h so consumers of
REV_INFO_INIT do not need to include this hidden dependency.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes the final dependence on "the_repository" in
get_worktree_git_dir(). The last commit removed only caller that
passed a NULL worktree.
get_worktree_git_dir() has the following callers:
- branch.c:prepare_checked_out_branches() which loops over all
worktrees.
- builtin/fsck.c:cmd_fsck() which loops over all worktrees.
- builtin/receive-pack.c:update_worktree() which is called from
update() only when "worktree" is non-NULL.
- builtin/worktree.c:validate_no_submodules() which is called from
check_clean_worktree() and move_worktree(), both of which supply
a non-NULL worktree.
- reachable.c:add_rebase_files() which loops over all worktrees.
- revision.c:add_index_objects_to_pending() which loops over all
worktrees.
- worktree.c:is_current_worktree() which expects a non-NULL worktree.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function can_use_local_refs() prints a warning if there are no local
branches and HEAD is invalid or points to an unborn branch. As part of
the warning it prints the contents of ".git/HEAD". In a repository using
the reftable backend HEAD is not stored in the filesystem so reading
that file is pointless. In a repository using the files backend it is
unclear how useful printing it is - it would be better to diagnose the
problem for the user. For now, simplify the warning by not printing
the file contents and adjust the relevant test case accordingly. Also
fixup the test case to use test_grep so that anyone trying to debug a
test failure in the future is not met by a wall of silence.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>