The experimental "git history" command has been taught a new
"squash" subcommand to fold a range of commits into a single commit,
replaying any descendants on top.
* hn/history-squash:
history: re-edit a squash with every message
history: add squash subcommand to fold a range
history: give commit_tree_ext a message template
history: extract helper for a commit's parent tree
"git log -L<range>:<path>" learned to limit various "diff" operations
like --stat, --check, -G, to the specified range:path.
* mm/line-log-limited-ops:
diffcore-pickaxe: scope -G to the -L tracked range
diff: support --check with -L line ranges
line-log: support diff stat formats with -L
diff: extract a line-range diff helper for reuse
diff: emit -L hunk headers via xdiff's formatter
diff: simplify the line-range filter by classifying removals immediately
diff: rename and group the line-range filter for clarity
A new `diff.<driver>.process` configuration has been introduced to
allow a long-running external process to act as a hunk provider to
allows external tools to control which lines Git considers changed
while leaving all output formatting (word diff, color, blame, etc.) to
Git's standard pipeline.
* mm/diff-process-hunks:
blame: consult diff process for no-hunk detection
diff: bypass diff process with --no-ext-diff and in format-patch
diff: add long-running diff process via diff.<driver>.process
sub-process: separate process lifecycle from hashmap management
userdiff: add diff.<driver>.process config
xdiff: support external hunks via xpparam_t
The contributor guide has been updated to advise new contributors to
trim irrelevant quoted text when replying to review comments, matching
the existing advice given to reviewers.
* wy/doc-myfirstcontribution-trim-quotes:
MyFirstContribution: mention trimming quoted text in replies
git replay learns --linearize option to drop merge commits and
linearize the replayed history, mimicking git rebase
--no-rebase-merges.
* tc/replay-linearize:
replay: offer an option to linearize the commit topology
replay: add helper to put entry into mapped_commits
replay: refactor enum replay_mode into a bool
"git branch" command learned "--delete-merged" option to remove
local branches that have already been merged to the remote-tracking
branches they track.
* hn/branch-delete-merged:
branch: add --dry-run for --delete-merged
branch: add branch.<name>.deleteMerged opt-out
branch: add --delete-merged <branch>
branch: prepare delete_branches for a bulk caller
branch: let delete_branches skip unmerged branches on bulk refusal
branch: convert delete_branches() to a flags argument
branch: add --forked filter for --list mode
The `remote-object-info` command has been added to `git cat-file
--batch-command`, allowing clients to request object metadata
(currently size) from a remote server via protocol v2 without
downloading the entire object.
The client dynamically filters format placeholders based on
server-advertised capabilities and safely returns empty strings for
inapplicable or unsupported fields.
* ps/cat-file-remote-object-info:
cat-file: make remote-object-info allow-list dynamic
cat-file: validate remote atoms with allow_list
cat-file: add remote-object-info to batch-command
transport: add client support for object-info
serve: advertise object-info feature
fetch-pack: move fetch initialization
connect: refactor packet writing
fetch-pack: move function to connect.c
fetch-pack: prepare function to be moved
t1006: split test utility functions into new "lib-cat-file.sh"
cat-file: declare loop counter inside for()
git-compat-util: add strtoul_szt() with error handling
transport-helper: fix memory leak of helper on disconnect
The experimental "git history" command has been taught a new "drop"
subcommand to remove a commit and replay its descendants onto its
parent.
* ps/history-drop:
builtin/history: implement "drop" subcommand
builtin/history: split handling of ref updates into two phases
reset: stop assuming that the caller passes in a clean index
reset: allow the caller to specify the current HEAD object
reset: introduce ability to skip updating HEAD
reset: introduce dry-run mode
reset: modernize flags passed to `reset_working_tree()`
reset: rename `reset_head()`
reset: drop `USE_THE_REPOSITORY_VARIABLE`
read-cache: split out function to drop unmerged entries to stage 0
The -m/-F/-c/-C options to supply commit log message from outside the
editor are now supported for all "git commit --fixup" variations.
* ec/commit-fixup-options:
commit: allow -c/-C for all kinds of --fixup
commit: allow -m/-F for all kinds of --fixup
The [includeIf "condition"] conditional inclusion facility for
configuration files has learned to use the location of worktree
in its condition.
* cl/conditional-config-on-worktree-path:
config: add "worktree" and "worktree/i" includeIf conditions
config: refactor include_by_gitdir() into include_by_path()
"git checkout --track=..." learned to optionally fetch the branch
from the remote the new branch will work with.
* hn/checkout-track-fetch:
checkout: extend --track with a "fetch" mode to refresh start-point
branch: expose helpers for finding the remote owning a tracking ref
Configuration file locking now retries for a short period, avoiding
failures when multiple processes attempt to update the configuration
simultaneously.
* jt/config-lock-timeout:
config: retry acquiring config.lock, configurable via core.configLockTimeout
The merge-base computation has been optimized by stopping the walk
early when one side's exclusive commits in the queue are exhausted,
yielding significant speedups for queries with one-sided histories.
* kk/merge-base-exhaustion:
commit-reach: terminate merge-base walk when one paint side is exhausted
commit-reach: remove unused nonstale_queue dedup wrappers
commit-reach: introduce struct paint_state with per-side counters
commit-reach: add trace2 instrumentation to paint_down_to_common()
t6099, t6600: add side-exhaustion regression tests
t6600: add test cases for side-exhaustion edge cases
Documentation/technical: add paint-down-to-common doc
Documentation updates.
* kh/doc-trailers:
doc: interpret-trailers: document comment line treatment
doc: interpret-trailers: commit to “trailer block” term
doc: interpret-trailers: join new-trailers again
doc: interpret-trailers: add key format example
doc: interpret-trailers: explain key format
doc: interpret-trailers: explain the format after the intro
doc: interpret-trailers: not just for commit messages
doc: interpret-trailers: use “metadata” in Name as well
doc: interpret-trailers: replace “lines” with “metadata”
doc: interpret-trailers: stop fixating on RFC 822
Doc update for "git replay" to actually refer to its configuration
variables.
* kh/doc-replay-config:
doc: replay: move “default” to the right-hand side
doc: replay: use a nested description list
doc: replay: improve config description
doc: link to config for git-replay(1)
The "git refs" toolbox has been extended with new "create", "delete",
"update", and "rename" subcommands to create, delete, update, and
rename references, respectively.
* ps/refs-writing-subcommands:
builtin/refs: add "rename" subcommand
builtin/refs: add "create" subcommand
builtin/refs: add "update" subcommand
builtin/refs: add "delete" subcommand
builtin/refs: drop `the_repository`
Documentation on community contribution guidelines has been updated to
encourage replying to review comments before rerolling, and to advise
a default limit of at most one reroll per day to give reviewers across
different time zones enough time to participate.
* wy/doc-clarify-review-replies:
doc: advise batching patch rerolls
doc: encourage review replies before rerolling
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".
* jk/repo-info-path-keys:
repo: add path.gitdir with absolute and relative suffix formatting
repo: add path.commondir with absolute and relative suffix formatting
path: extract format_path() and use in rev-parse
"git log --follow" has been updated to handle non-linear history, in
which the path being tracked gets renamed differently in multiple
history lines, better.
* mv/log-follow-mergy:
log: improve --follow following renames for non-linear history
The pack-objects command now supports using reachability bitmaps and
delta-islands concurrently with the `--path-walk` option, allowing
faster packaging by falling back to path-walk when bitmaps cannot
fully satisfy the request.
* tb/pack-path-walk-bitmap-delta-islands:
pack-objects: support `--delta-islands` with `--path-walk`
pack-objects: extract `record_tree_depth()` helper
pack-objects: support reachability bitmaps with `--path-walk`
t/perf: drop p5311's lookup-table permutation
The documentation in SubmittingPatches has been updated to clarify how
patch contributors should respond to design and viability critiques,
and how the resolution of such critiques should be recorded in the
final commit messages.
* jc/submittingpatches-design-critiques:
SubmittingPatches: address design critiques
The trailer sections in SubmittingPatches have been updated to
encourage use of standard trailers.
* kh/submittingpatches-trailers:
SubmittingPatches: note that trailer order matters
SubmittingPatches: be consistent with trailer markup
SubmittingPatches: document Based-on-patch-by trailer
SubmittingPatches: discourage common Linux trailers
SubmittingPatches: encourage trailer use for substantial help
The `fetch.followRemoteHEAD` configuration variable has been added to
provide a default for the per-remote `remote.<name>.followRemoteHEAD`
setting.
* mh/fetch-follow-remote-head-config:
fetch: fixup a misaligned comment
fetch: add configuration variable fetch.followRemoteHEAD
fetch: refactor do_fetch handling of followRemoteHEAD
fetch: return 0 on known git_fetch_config
fetch: rename function report_set_head
t5510: cleanup remote in followRemoteHEAD dangling ref test
doc: explain fetchRemoteHEADWarn advice
fetch: fixup set_head advice for warn-if-not-branch
Project-specific configuration for b4 has been introduced, and the
documentation has been updated to recommend using it as a
streamlined method for submitting patches.
* ps/doc-recommend-b4:
b4: introduce configuration for the Git project
MyFirstContribution: recommend the use of b4
MyFirstContribution: recommend shallow threading of cover letters
The handling of promisor-remote protocol capability has been
loosened to allow the other side to add to the list of promisor
remotes via the promisor.acceptFromServerURL configuration
variable.
* cc/promisor-auto-config-url-more:
doc: promisor: improve acceptFromServer entry
promisor-remote: auto-configure unknown remotes
promisor-remote: trust known remotes matching acceptFromServerUrl
promisor-remote: introduce promisor.acceptFromServerUrl
promisor-remote: add 'local_name' to 'struct promisor_info'
urlmatch: add url_normalize_pattern() helper
urlmatch: change 'allow_globs' arg to bool
t5710: simplify 'mkdir X' followed by 'git -C X init'
strstr() is not enough to validate the format placeholders in
remote-object-info causing two errors:
- Atoms recognized by expand_atom() but the remote doesn't returns 1, but
data->type contains garbage causing segfault.
- expand_atom() returns 0 for unknown atoms, calling
strbuf_expand_bad_format() which ends in die() blocking local queries
if the same format is shared.
Add an allow_list with the supported atoms at the top of expand_atom().
In remote mode, unsupported atoms return 1 leaving the sb empty,
honoring how for-each-ref handles known but inapplicable atoms.
As extra safety, initialize data->type to OBJ_BAD and add a NULL check
for type_name() so uninitialized data doesn't cause segfault.
Update tests that expect previous die() behaviour to expect an empty
string and add an explicit test for empty string return on unknown
placeholder.
Update caveat behaviour documentation.
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the `info` command in `cat-file --batch-command` prints object
info for a given object, it is natural to add another command in
`cat-file --batch-command` to print object info for a given object
from a remote.
Add `remote-object-info` to `cat-file --batch-command`.
While `info` takes object ids one at a time, this creates
overhead when making requests to a server. So `remote-object-info`
instead can take multiple object ids at once.
The `cat-file --batch-command` command is generally implemented in
the following manner:
- Receive and parse input from user
- Call respective function attached to command
- Get object info, print object info
In --buffer mode, this changes to:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue
- Call respective function attached to command
- Get object info, print object info
Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:
- Receive and parse input from user
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Parse input, get object info, print object info
And finally for --buffer mode `remote-object-info`:
- Receive and parse input from user
- Store respective function attached to command in a queue
- After flush, loop through commands in queue:
If command is `remote-object-info`:
- Get object info from remote
- Loop through and print each object info
Else:
- Call respective function attached to command
- Get object info, print object info
To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.
In order for `remote-object-info` to avoid remote communication
overhead in the non-buffer mode, the objects are passed in as such:
remote-object-info <remote> <oid> <oid> ... <oid>
rather than
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --dry-run, --delete-merged prints the local branches it would
delete, one "Would delete branch <name>" line each, and exits
without touching any ref. The same filtering applies, so the output
is exactly the set that the real run would delete.
--dry-run is only meaningful together with --delete-merged and is
rejected otherwise.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting branch.<name>.deleteMerged=false exempts that branch from
"git branch --delete-merged", which is useful for a topic you want
to keep developing after an early round of it has been merged
upstream. Unless --quiet is given, each skip is reported so the
user knows why their topic was kept.
Explicit deletion with "git branch -d" still uses the normal merge
check and ignores this setting.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git branch --delete-merged <branch>...
deletes the local branches that "--forked <branch>" would list,
keeping only those whose tip is reachable from their configured
upstream. The work has already landed on the upstream they track,
so the local copy is no longer needed.
A branch is not deleted when:
* it is checked out in any worktree
* its upstream remote-tracking branch no longer exists, since a
missing upstream is not by itself a sign of integration
* its push destination equals its upstream (<branch>@{push} is
the same as <branch>@{upstream}), such as a local "main" that
tracks and pushes to "origin/main". Right after a pull it just
looks "fully merged", so it is kept. Only branches that push
somewhere other than their upstream, typically topics in a fork
workflow, are candidates.
A branch whose work is not yet merged into its upstream is silently
skipped, so one unmerged topic does not abort the whole sweep.
A branch that another, surviving branch tracks as its upstream is
also kept, so a branch is never deleted out from under one stacked
on top of it. Such a kept branch is itself merged, so when its own
upstream is being deleted, clear its now-stale upstream config.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a --forked option to "git branch" list mode that lists only
branches whose configured upstream matches <branch>. The argument
can be a ref (e.g. "origin/main", "master"), a remote name like
"origin" for the branch its origin/HEAD points at, or a shell glob
(e.g. "origin/*"), and may be repeated to widen the filter.
It is an ordinary list filter, so it combines with the others:
git branch --merged origin/main --forked 'origin/*'
lists branches forked from origin that are already merged into
origin/main, and --no-merged inverts the question.
This is the building block for --delete-merged, which deletes the
listed branches once they have landed on their upstream.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default "git history squash" reuses the oldest commit's message.
When --reedit-message is given it only reopened that one message, so the
messages of the folded-in commits were lost.
Gather the messages of every commit in the range, oldest first, and use
them as the editor template when re-editing, mirroring how "git rebase
-i" presents a squash. The combined message is built before the
descendant walk so it is not disturbed by the flags that walk leaves on
the commits.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Folding a series of commits into one required either an interactive
rebase where each commit after the first was hand-edited to "fixup", or
a "git reset --soft" to the merge base followed by "git commit --amend".
Add "git history squash <revision-range>" to do this directly. It folds
every commit in the range into the oldest one, keeping that commit's
message and authorship and taking the tree of the newest commit, so the
range collapses into a single commit. Commits above the range are
replayed on top of the result.
The range is given as <base>..<tip>, so "git history squash @~3.."
folds the three most recent commits and "git history squash @~5..@~2"
squashes an interior range. A merge inside the range is folded like any
other commit, but the range must have a single base, so a range with
more than one entry point is rejected.
The folded commits leave the history, so by default the command refuses
when another ref points at one of them. Use "--update-refs=head" to
rewrite only the current branch and leave those refs untouched.
Inspired-by: Sergey Chernov <serega.morph@gmail.com>
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Forking from an existing remote branch without refreshing first often
has consequences: you start work that has already been done, or you
build on an old version of the code which causes big conflicts later
when you pull. The workaround is two commands ("git fetch <remote>
<branch> && git checkout -b <topic> <remote>/<branch>"), and when
the fetch is skipped the checkout silently starts from a stale tip.
Users may already expect "<remote>/<branch>" to refer to the latest
tip on the remote. While this blurs the line between fetch and
checkout, git already does this in places where it pays off: "git
clone" fetches and checks out, and "git pull" fetches and merges.
Add a "fetch" mode to "--track" that refreshes <start-point> before
checking it out:
git checkout -b new_branch --track=fetch origin/some-branch
Only the requested branch is fetched so other remote-tracking
branches are left untouched. When <start-point> is a bare <remote>
(e.g. "origin"), follow refs/remotes/<remote>/HEAD to learn which
branch to refresh. If "git fetch" fails but the remote-tracking ref
already exists locally, warn and proceed from the existing tip,
otherwise abort.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an early termination check to paint_down_to_common() using the
per-side counters introduced earlier. Once the walk enters the
finite-generation region, terminate early when one side's exclusive
count drops to zero -- no new merge-base can form without both paint
sides meeting.
The check also waits for pending_merge_bases to reach zero, ensuring
all merge-base candidates have been dequeued and recorded before
exiting.
The INFINITY gate ensures correctness: commits without a commit-graph
entry have GENERATION_NUMBER_INFINITY and are ordered by commit date,
which is not topologically reliable. The optimization only fires
once the walk enters the finite-generation region where ordering
guarantees hold.
Step counts measured with trace2 on git.git with commit-graph:
merge-base --all v2.0.0 v2.55.0-rc1:
before: 72264 steps after: 44589 steps
merge-base --all v2.55.0-rc1 v2.55.0-rc1~5:
before: 110 steps after: 7 steps
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a paint_state struct for use by paint_down_to_common() that
wraps a prio_queue with per-side commit counters. Each non-stale
queued commit occupies exactly one counter bucket based on its
paint flags: PARENT1-only, PARENT2-only, or both sides (a pending
merge-base candidate).
The counters are maintained by paint_count_update() which adjusts
the appropriate bucket by a signed delta. An exhaustive switch on
the paint+stale bits documents all valid flag combinations in one
place.
Convert paint_down_to_common() to use paint_state. The loop now
drains the queue via paint_queue_get() which returns NULL when all
counters reach zero, replacing the old pointer-based termination
(max_nonstale). This is equivalent behavior -- both conditions
detect that no non-stale entries remain.
The existing nonstale_queue is left in place for ahead_behind().
Step counts (via trace2 from the previous commit) are identical
before and after this refactoring, confirming no behavioral change.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a technical document describing the paint_down_to_common()
algorithm used for merge-base computation, covering the paint
walk, generation number regions, and termination conditions.
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Scripts need a stable way to locate the git directory without
parsing rev-parse output or relying on its flag-driven path format
selection. There is no way to retrieve this path from git repo info
today.
Introduce path.gitdir.absolute and path.gitdir.relative keys,
consistent with the path.commondir keys added in the previous patch.
Reuse the test_repo_info_path helper introduced there to validate
both variants.
Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Scripts working with worktree setups need a reliable way to discover
the common directory, which diverges from the git directory when
multiple worktrees are in use. There is no way to retrieve this path
from git repo info today.
Introduce path.commondir.absolute and path.commondir.relative keys.
Exposing explicit format variants rather than a single key with a
default avoids ambiguity for scripts that require predictable output.
Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the stated goals of git-replay(1) is to allow implementing the
git-rebase(1) functionality on the server side.
The default mode of git-rebase(1) is to act as if `--no-rebase-merges`
was given. This mode drops merge commits instead of replaying them, and
linearizes the commit history into a sequence of the
regular (single-parent) commits.
Add option `--linearize` to git-replay(1) to do the same.
Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have a repo with a subtree merge, do a 'git log --follow prefix/test.c',
the output only contains history in the outer repo, not commits that
were merged via a subtree merge.
What happens is that 'git log --follow' stores the followed path only in
opt->diffopt.pathspec, so in case the commit history is non-linear, and
multiple parents have renames to the followed path, then the end result
isn't really defined: the first commit that happens to be visited in one
of the parents update opt->diffopt.pathspec, and from that point, only
that updated path is visited.
Fix the problem by introducing a commit -> path map
(follow_pathspec_slab) that stores what will be a path to follow when
visiting that parent. At the top of log_tree_commit(), if the slab has
an entry for this commit, we replace opt->diffopt.pathspec with a path
from this entry, so the correct path is followed, even if an unrelated
sub-tree changed the path to be followed to something else. After
log_tree_diff() runs, we record each parent's path in the slab. As a
result, the walk order doesn't matter, which was exactly the source of
problems previously.
This helps with subtree merges (rename happens inside the merge commit),
but also fixes the general case when the rename happens in the history
of parents, not in the merge commit itself.
Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island that
reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on commits,
exactly as `show_commit()` does.
We have to be careful about the order in which we call this function,
and we must see a commit before its parents in order to have
island marks to propagate.
The path-walk batch preserves that order. Path-walk appends commits
to its `OBJ_COMMIT` batch as they come back from the same
`get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip island propagation for excluded commits to match the regular
traversal, whose `show_commit()` callback is only invoked for
interesting commits. Boundary commits may still be present in
path-walk's callback so they can serve as thin-pack bases, but they
should not contribute island marks.
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the max-depth-wins behavior
when a tree is reached via more than one path. The helper accepts
both the `show_object()` path shape ("foo", "foo/bar") and the
path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` in
increasing-depth order before propagating marks down, so that a
parent tree's marks are finalized before its children inherit them.
Without recording the depth at path-walk time, every
path-walk-discovered tree would land at depth 0 in `to_pack`, the
sort would lose its ordering, and children could inherit marks from
parents whose own contributions had not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.
Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.
This behavior dates back to 70664d2865 (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
In fb2c309b7d (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:
* A path-walk repack that writes bitmaps does not give the bitmap
selector any commits, because path-walk reveals commits through
`add_objects_by_path()` rather than through `show_commit()`, where
`index_commit_for_bitmap()` is normally called.
* An invocation like "git pack-objects --use-bitmap-index --path-walk"
never tries an existing bitmap, even when one is available and could
answer the request.
Fortunately for us, neither restriction is required.
* On the writing side: teach the path-walk object callback to call
`index_commit_for_bitmap()` for commits that it adds to the pack.
That gives the bitmap selector the commit candidates it would have
seen from the regular traversal.
* For bitmap reading, keep passing '--objects' to the internal rev_list
machinery, but stop clearing `use_bitmap_index`. If an existing
bitmap can answer the request, use it; otherwise fall back to
path-walk's own enumeration.
As a result, we can see significantly reduced pack generation times from
p5311 (with our `GIT_PERF_REPO` set to a recent clone of the fluentui
repository) before this commit:
Test HEAD^ HEAD
----------------------------------------------------------------------------------------
5311.40: server (1 days, --path-walk) 1.43(1.39+0.04) 0.01(0.01+0.00) -99.3%
5311.41: size (1 days, --path-walk) 139.6K 139.7K +0.0%
5311.42: client (1 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
5311.44: server (2 days, --path-walk) 1.43(1.39+0.04) 0.01(0.00+0.00) -99.3%
5311.45: size (2 days, --path-walk) 139.6K 139.7K +0.0%
5311.46: client (2 days, --path-walk) 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0%
5311.48: server (4 days, --path-walk) 1.44(1.39+0.04) 0.01(0.01+0.00) -99.3%
5311.49: size (4 days, --path-walk) 238.1K 238.1K +0.0%
5311.50: client (4 days, --path-walk) 0.03(0.03+0.00) 0.03(0.03+0.00) +0.0%
5311.52: server (8 days, --path-walk) 1.43(1.39+0.03) 0.01(0.00+0.00) -99.3%
5311.53: size (8 days, --path-walk) 344.9K 344.9K +0.0%
5311.54: client (8 days, --path-walk) 0.07(0.07+0.00) 0.07(0.08+0.00) +0.0%
5311.56: server (16 days, --path-walk) 1.47(1.44+0.03) 0.10(0.08+0.01) -93.2%
5311.57: size (16 days, --path-walk) 844.0K 844.0K +0.0%
5311.58: client (16 days, --path-walk) 0.09(0.09+0.00) 0.09(0.09+0.00) +0.0%
5311.60: server (32 days, --path-walk) 1.52(1.50+0.05) 0.14(0.15+0.02) -90.8%
5311.61: size (32 days, --path-walk) 4.2M 4.2M +0.1%
5311.62: client (32 days, --path-walk) 0.34(0.48+0.02) 0.34(0.45+0.05) +0.0%
5311.64: server (64 days, --path-walk) 1.55(1.52+0.06) 0.15(0.15+0.04) -90.3%
5311.65: size (64 days, --path-walk) 6.4M 6.4M -0.0%
5311.66: client (64 days, --path-walk) 0.51(0.79+0.05) 0.51(0.80+0.06) +0.0%
5311.68: server (128 days, --path-walk) 1.59(1.57+0.06) 0.16(0.21+0.01) -89.9%
5311.69: size (128 days, --path-walk) 8.4M 8.4M -0.0%
5311.70: client (128 days, --path-walk) 0.72(1.44+0.08) 0.71(1.47+0.09) -1.4%
We get the same size of output pack, but this commit allows us to do so
in a significantly shorter amount of time. Intuitively, we're generating
the same pack (hence the unchanged 'test_size' output from run to run),
but varying how we get there. Before this commit, pack-objects prefers
'--path-walk' to '--use-bitmap-index', so we generate the output pack by
performing a normal '--path-walk' traversal. With this commit, we are
operating over a *repacked* state (that itself was done with a
'--path-walk' traversal), but are able to perform pack-reuse on that
repacked state via bitmaps.
When comparing the size of the repacked pack with/without '--path-walk'
on the previous commit versus this one, we see that (a) the repacked size
improves significantly with '--path-walk', and that (b) writing bitmaps
during repacking does not regress this improvement:
Test HEAD^ HEAD
----------------------------------------------------------------------------------------
5311.3: size of bitmapped pack 558.4M 558.5M +0.0%
5311.38: size of bitmapped pack (--path-walk) 164.4M 164.4M +0.0%
(Note that to observe an improvement here, we must repack with '-F' in
order to avoid reusing non-'--path-walk' deltas, which would otherwise
skew our results.)
There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal does not expect this. Work around this by setting
`revs->boundary` as late as possible within the '--path-walk' traversal,
after any bitmap attempt has either succeeded or declined to answer the
request.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Contributors often need guidance on how quickly to send later iterations
of a patch series. Add a rough default of no more than one new version
of the same series per day so feedback can be batched and reviewers have
time to comment regardless of their time zones.
Mention factors that can affect the timing, such as series size, review
depth, and substantial rework. Also point out that avoiding rapid
rerolls encourages authors to polish each version before sending it, so
reviewers can focus on substantial issues.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Weijie Yuan <wy@wyuan.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Review feedback should not be answered only by sending a new patch
version. Encourage contributors to discuss their planned response in the
mailing-list thread before rerolling.
This makes the author's reasoning explicit before the next version is
prepared, instead of forcing reviewers to infer it from the rerolled
patches. It also encourages more direct social interaction between
contributors and helps foster a more collaborative review process.
Signed-off-by: Weijie Yuan <wy@wyuan.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Contributors sometimes fail to answer fundamental design or
viability comments from reviewers and submit subsequent rounds
without addressing them. When design decisions are resolved on the
mailing list, the final justification should be recorded in the
commit messages.
Instruct authors to be particularly mindful of critiques regarding
high-level design or viability, to defend their choices on the list,
and to accompany new iterations with clearer explanations in the cover
letter, responses, and revised commit messages. Also instruct them to
explicitly document the resolution of these concerns in the commit
message body to keep the historical record complete.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'fetch.followRemoteHEAD' is added as a generic setting used by all
remotes for which 'remote.<name>.followRemoteHEAD' is undefined. If
both variables are undefined, a builtin default of "create" is in
effect, matching the previous behavior.
As mentioned in the previous patch, 'fetch.followRemoteHEAD' supports
all of the values that its 'remote' counterpart does _except_
warn-if-not-$branch, due to its tighter coupling to individual remote
repositories.
This setting interacts with the do_fetch mechanism in the same way as
the previous does, but there are opportunities for improved
user-experience discussed in [1]. See the included NEEDSWORK comment as
well.
Documentation and advice messages for both of the followRemoteHEAD
variables are reworded to better capture the relationship between the
two.
The added tests assert feature parity between the two followRemoteHEAD
variables, as well as the fact that 'remote.<name>.followRemoteHEAD'
always supersedes this new configurable default.
[1]: https://lore.kernel.org/git/xmqqh5n213bw.fsf@gitster.g/
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matt Hunter <m@lfurio.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user sets 'remote.<name>.followRemoteHEAD' to
'warn[-if-not-$branch]', git-fetch will report when a fetched HEAD
disagrees with the locally-configured remote's HEAD. This additional
advice instructs the user how to deal with these warnings, but was
previously undocumented in git-config.
Signed-off-by: Matt Hunter <m@lfurio.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>