Commit Graph

13853 Commits

Author SHA1 Message Date
Junio C Hamano
c192528ef5 Merge branch 'hn/history-squash' into seen
The experimental "git history" command has been taught a new
"squash" subcommand to fold a range of commits into a single commit,
replaying any descendants on top.

* hn/history-squash:
  history: re-edit a squash with every message
  history: add squash subcommand to fold a range
  history: give commit_tree_ext a message template
  history: extract helper for a commit's parent tree
2026-06-25 19:49:54 -07:00
Junio C Hamano
4101c1091e Merge branch 'mm/diff-process-hunks' into seen
A new `diff.<driver>.process` configuration has been introduced to
allow a long-running external process to act as a hunk provider to
allows external tools to control which lines Git considers changed
while leaving all output formatting (word diff, color, blame, etc.) to
Git's standard pipeline.

* mm/diff-process-hunks:
  blame: consult diff process for no-hunk detection
  diff: bypass diff process with --no-ext-diff and in format-patch
  diff: add long-running diff process via diff.<driver>.process
  sub-process: separate process lifecycle from hashmap management
  userdiff: add diff.<driver>.process config
  xdiff: support external hunks via xpparam_t
2026-06-25 19:49:52 -07:00
Junio C Hamano
07f7a0853f Merge branch 'tb/midx-incremental-custom-base' into seen
The `git multi-pack-index write --incremental` command has been
corrected to properly honor the `--base` option. Previously, the
custom base was ignored by the normal write path, and the pack
exclusion logic incorrectly skipped packs from layers above the
selected base, breaking reachability closure for bitmaps.

* tb/midx-incremental-custom-base:
  midx-write: include packs above custom incremental base
  midx: pass custom '--base' through incremental writes
  t5334: expose shared `nth_line()` helper
2026-06-25 19:49:51 -07:00
Junio C Hamano
d90c822705 Merge branch 'tc/replay-linearize' into seen
git replay learns --linearize option to drop merge commits and
linearize the replayed history, mimicking git rebase
--no-rebase-merges.

* tc/replay-linearize:
  replay: offer an option to linearize the commit topology
  replay: add helper to put entry into mapped_commits
  replay: refactor enum replay_mode into a bool
2026-06-25 19:49:50 -07:00
Junio C Hamano
2120b477f4 Merge branch 'hn/branch-delete-merged' into seen
"git branch" command learned "--delete-merged" option to remove
local branches that have already been merged to the remote-tracking
branches they track.

* hn/branch-delete-merged:
  branch: add --dry-run for --delete-merged
  branch: add branch.<name>.deleteMerged opt-out
  branch: add --delete-merged <branch>
  branch: prepare delete_branches for a bulk caller
  branch: let delete_branches skip unmerged branches on bulk refusal
  branch: convert delete_branches() to a flags argument
  branch: add --forked filter for --list mode
2026-06-25 19:49:49 -07:00
Junio C Hamano
3866f16b50 Merge branch 'kk/prio-queue-get-put-fusion' into seen
The lazy priority queue optimization pattern (deferring actual removal
in prio_queue_get() to allow get+put fusion) has been folded directly
into prio_queue itself, speeding up commit traversal workflows and
simplifying callers.

* kk/prio-queue-get-put-fusion:
  prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
  prio-queue: rename .nr to .nr_ and add accessor helpers
2026-06-25 19:49:49 -07:00
Junio C Hamano
a664703d88 Merge branch 'ps/cat-file-remote-object-info' into seen
The `remote-object-info` command has been added to `git cat-file
--batch-command`, allowing clients to request object metadata
(currently size) from a remote server via protocol v2 without
downloading the entire object.

The client dynamically filters format placeholders based on
server-advertised capabilities and safely returns empty strings for
inapplicable or unsupported fields.

* ps/cat-file-remote-object-info:
  cat-file: make remote-object-info allow-list dynamic
  cat-file: validate remote atoms with allow_list
  cat-file: add remote-object-info to batch-command
  transport: add client support for object-info
  serve: advertise object-info feature
  fetch-pack: move fetch initialization
  connect: refactor packet writing
  fetch-pack: move function to connect.c
  fetch-pack: prepare function to be moved
  t1006: split test utility functions into new "lib-cat-file.sh"
  cat-file: declare loop counter inside for()
  git-compat-util: add strtoul_szt() with error handling
  transport-helper: fix memory leak of helper on disconnect
2026-06-25 19:49:48 -07:00
Junio C Hamano
da665bafc9 Merge branch 'ps/history-drop' into seen
The experimental "git history" command has been taught a new "drop"
subcommand to remove a commit and replay its descendants onto its
parent.

* ps/history-drop:
  builtin/history: implement "drop" subcommand
  builtin/history: split handling of ref updates into two phases
  reset: stop assuming that the caller passes in a clean index
  reset: allow the caller to specify the current HEAD object
  reset: introduce ability to skip updating HEAD
  reset: introduce dry-run mode
  reset: modernize flags passed to `reset_working_tree()`
  reset: rename `reset_head()`
  reset: drop `USE_THE_REPOSITORY_VARIABLE`
  read-cache: split out function to drop unmerged entries to stage 0
2026-06-25 19:49:47 -07:00
Junio C Hamano
bd2e9056ec Merge branch 'ec/commit-fixup-options' into seen
The -m/-F/-c/-C options to supply commit log message from outside the
editor are now supported for all "git commit --fixup" variations.

* ec/commit-fixup-options:
  commit: allow -c/-C for all kinds of --fixup
  commit: allow -m/-F for all kinds of --fixup
2026-06-25 19:49:45 -07:00
Junio C Hamano
c8bbce4f07 Merge branch 'hn/checkout-track-fetch' into seen
"git checkout --track=..." learned to optionally fetch the branch
from the remote the new branch will work with.

* hn/checkout-track-fetch:
  checkout: extend --track with a "fetch" mode to refresh start-point
  branch: expose helpers for finding the remote owning a tracking ref
2026-06-25 19:49:43 -07:00
Junio C Hamano
432507c66f Merge branch 'js/parseopt-subcommand-autocorrection' into seen
The parse-options library learned to auto-correct misspelled
subcommand names.

* js/parseopt-subcommand-autocorrection:
  SQUASH???
  doc: document autocorrect API
  parseopt: add tests for subcommand autocorrection
  parseopt: enable subcommand autocorrection for git-remote and git-notes
  parseopt: autocorrect mistyped subcommands
  autocorrect: provide config resolution API
  autocorrect: rename AUTOCORRECT_SHOW to AUTOCORRECT_HINT
  autocorrect: use mode and delay instead of magic numbers
  help: move tty check for autocorrection to autocorrect.c
  help: make autocorrect handling reusable
  parseopt: extract subcommand handling from parse_options_step()
2026-06-25 19:49:42 -07:00
Junio C Hamano
2fb57b8177 Merge branch 'ps/odb-drop-whence' into jch
The whence field in struct object_info has been removed,
refactoring backend-specific object information retrieval into an
opt-in struct object_info_source structure.

* ps/odb-drop-whence:
  odb: document object info fields
  odb: drop `whence` field from object info
  treewide: convert users of `whence` to the new source field
  odb: add `source` field to struct object_info_source
  odb: make backend-specific fields optional
  packfile: thread odb_source_packed through packed_object_info()
2026-06-25 19:49:27 -07:00
Junio C Hamano
a58a2d8b04 Merge branch 'ty/migrate-ignorecase' into jch
The global configuration variable ignore_case (representing the
core.ignorecase configuration) has been migrated into struct
repo_config_values to tie it to a specific repository instance.

* ty/migrate-ignorecase:
  config: use repo_ignore_case() to access core.ignorecase
  environment: move ignore_case into repo_config_values
2026-06-25 19:49:25 -07:00
Junio C Hamano
1a7d1cf76b Merge branch 'ps/refs-writing-subcommands' into jch
The "git refs" toolbox has been extended with new "create", "delete",
"update", and "rename" subcommands to create, delete, update, and
rename references, respectively.

* ps/refs-writing-subcommands:
  builtin/refs: add "rename" subcommand
  builtin/refs: add "create" subcommand
  builtin/refs: add "update" subcommand
  builtin/refs: add "delete" subcommand
  builtin/refs: drop `the_repository`
2026-06-25 19:49:23 -07:00
Junio C Hamano
227b9af174 Merge branch 'ps/refs-avoid-chdir-notify-reparent' into jch
The reference backends have been converted to always use absolute
paths internally. This allows dropping the calls to
`chdir_notify_reparent()` and fixes a memory leak in how the
reference database is constructed with an "onbranch" condition.

* ps/refs-avoid-chdir-notify-reparent:
  refs: protect against chicken-and-egg recursion
  refs/reftable: lazy-load configuration to fix chicken-and-egg
  reftable: split up write options
  refs/files: lazy-load configuration to fix chicken-and-egg
  refs: move parsing of "core.logAllRefUpdates" back into ref stores
  repository: free main reference database
  chdir-notify: drop unused `chdir_notify_reparent()`
  refs: unregister reference stores from "chdir_notify"
  setup: don't apply "GIT_REFERENCE_BACKEND" without a repository
  setup: stop applying repository format twice
  setup: inline `check_and_apply_repository_format()`
2026-06-25 19:49:22 -07:00
Junio C Hamano
8fa129e837 Merge branch 'jk/repo-info-path-keys' into jch
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".

* jk/repo-info-path-keys:
  repo: add path.gitdir with absolute and relative suffix formatting
  repo: add path.commondir with absolute and relative suffix formatting
  path: extract format_path() and use in rev-parse
2026-06-25 19:49:21 -07:00
Junio C Hamano
8811ac8af4 Merge branch 'tb/pack-path-walk-bitmap-delta-islands' into jch
The pack-objects command now supports using reachability bitmaps and
delta-islands concurrently with the `--path-walk` option, allowing
faster packaging by falling back to path-walk when bitmaps cannot
fully satisfy the request.

* tb/pack-path-walk-bitmap-delta-islands:
  pack-objects: support `--delta-islands` with `--path-walk`
  pack-objects: extract `record_tree_depth()` helper
  pack-objects: support reachability bitmaps with `--path-walk`
  t/perf: drop p5311's lookup-table permutation
2026-06-25 19:49:19 -07:00
Junio C Hamano
f0291086bc Merge branch 'mh/fetch-follow-remote-head-config' into jch
The `fetch.followRemoteHEAD` configuration variable has been added to
provide a default for the per-remote `remote.<name>.followRemoteHEAD`
setting.

* mh/fetch-follow-remote-head-config:
  fetch: fixup a misaligned comment
  fetch: add configuration variable fetch.followRemoteHEAD
  fetch: refactor do_fetch handling of followRemoteHEAD
  fetch: return 0 on known git_fetch_config
  fetch: rename function report_set_head
  t5510: cleanup remote in followRemoteHEAD dangling ref test
  doc: explain fetchRemoteHEADWarn advice
  fetch: fixup set_head advice for warn-if-not-branch
2026-06-25 19:49:18 -07:00
Junio C Hamano
2a8c778710 Merge branch 'ps/odb-source-packed' into jch
The packed object source has been refactored into a proper struct
odb_source.

* ps/odb-source-packed:
  odb/source-packed: drop pointer to "files" parent source
  midx: refactor interfaces to work on "packed" source
  odb/source-packed: stub out remaining functions
  odb/source-packed: wire up `freshen_object()` callback
  odb/source-packed: wire up `find_abbrev_len()` callback
  odb/source-packed: wire up `count_objects()` callback
  odb/source-packed: wire up `for_each_object()` callback
  odb/source-packed: wire up `read_object_stream()` callback
  odb/source-packed: wire up `read_object_info()` callback
  packfile: use higher-level interface to implement `has_object_pack()`
  odb/source-packed: wire up `reprepare()` callback
  odb/source-packed: wire up `close()` callback
  odb/source-packed: start converting to a proper `struct odb_source`
  odb/source-packed: store pointer to "files" instead of generic source
  packfile: move packed source into "odb/" subsystem
  packfile: split out packfile list logic
  packfile: rename `struct packfile_store` to `odb_source_packed`
2026-06-25 19:49:17 -07:00
Junio C Hamano
01f18d418a Merge branch 'rs/cat-file-default-format-optim' into jch
* rs/cat-file-default-format-optim:
  cat-file: speed up default format
2026-06-25 19:49:15 -07:00
Junio C Hamano
120798a057 Merge branch 'ps/setup-drop-global-state' into jch
Continuation of "setup.c" refactoring to drop remaining global state
(`git_work_tree_cfg`, `is_bare_repository_cfg`). The most notable
outcome is that `is_bare_repository()` has been updated to no longer
implicitly rely on `the_repository`.

* ps/setup-drop-global-state:
  treewide: drop USE_THE_REPOSITORY_VARIABLE
  environment: stop using `the_repository` in `is_bare_repository()`
  environment: split up concerns of `is_bare_repository_cfg`
  builtin/init: stop modifying `is_bare_repository_cfg`
  setup: remove global `git_work_tree_cfg` variable
  builtin/init: simplify logic to configure worktree
  builtin/init: stop modifying global `git_work_tree_cfg` variable
2026-06-25 19:49:15 -07:00
Pablo Sabater
686196f649 cat-file: make remote-object-info allow-list dynamic
The static allow-list in expand_atom() is hardcoded to only allow
"objectname" and "objectsize" for remote queries. This works because
up to this point all servers will either support object-info with name
and size or they do not support them at all, but we cannot expect that
in a future different servers with different git versions to have the
same object-info capabilities. Therefore, the allow_list needs to be
dynamic depending on what the server advertises.

The client will now:

1. Request the protocol option that the placeholder refers to (i.e.
   "size" when "%(objectsize)").

2. Filters the request in fetch_object_info() dropping any option that
   the server does not advertise.

3. After the fetching, the options that haven't been dropped are the ones
   fetched and supported by the server, these supported options are
   mapped and remote_allowed_atoms is populated with the placeholders.

4. expand_atom() checks remote_allowed_atoms with the same behaviour as
   the static allow_list had.

Move object_info_options out of get_remote_info so the caller which has
data can select what options will be requested instead of requesting
always size.
Move batch_object_write() out so there will always be an output even if
all the placeholders are not supported by the server (returns an empty
line).

Include "type" in the object_info_options so once the server supports
it, the clients know already how to request it.

Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-25 13:21:14 -07:00
Pablo Sabater
8cdf01a420 cat-file: validate remote atoms with allow_list
strstr() is not enough to validate the format placeholders in
remote-object-info causing two errors:

- Atoms recognized by expand_atom() but the remote doesn't returns 1, but
  data->type contains garbage causing segfault.
- expand_atom() returns 0 for unknown atoms, calling
  strbuf_expand_bad_format() which ends in die() blocking local queries
  if the same format is shared.

Add an allow_list with the supported atoms at the top of expand_atom().
In remote mode, unsupported atoms return 1 leaving the sb empty,
honoring how for-each-ref handles known but inapplicable atoms.

As extra safety, initialize data->type to OBJ_BAD and add a NULL check
for type_name() so uninitialized data doesn't cause segfault.

Update tests that expect previous die() behaviour to expect an empty
string and add an explicit test for empty string return on unknown
placeholder.

Update caveat behaviour documentation.

Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-25 13:21:14 -07:00
Eric Ju
1b2b551b5e cat-file: add remote-object-info to batch-command
Since the `info` command in `cat-file --batch-command` prints object
info for a given object, it is natural to add another command in
`cat-file --batch-command` to print object info for a given object
from a remote.

Add `remote-object-info` to `cat-file --batch-command`.

While `info` takes object ids one at a time, this creates
overhead when making requests to a server. So `remote-object-info`
instead can take multiple object ids at once.

The `cat-file --batch-command` command is generally implemented in
the following manner:

 - Receive and parse input from user
 - Call respective function attached to command
 - Get object info, print object info

In --buffer mode, this changes to:

 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue
    - Call respective function attached to command
    - Get object info, print object info

Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:

 - Receive and parse input from user
 If command is `remote-object-info`:
    - Get object info from remote
    - Loop through and print each object info
 Else:
    - Call respective function attached to command
    - Parse input, get object info, print object info

And finally for --buffer mode `remote-object-info`:
 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue:
    If command is `remote-object-info`:
        - Get object info from remote
        - Loop through and print each object info
    Else:
        - Call respective function attached to command
        - Get object info, print object info

To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing the info.

In order for `remote-object-info` to avoid remote communication
overhead in the non-buffer mode, the objects are passed in as such:

remote-object-info <remote> <oid> <oid> ... <oid>

rather than

remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-25 13:21:14 -07:00
Eric Ju
2eec59f0c4 cat-file: declare loop counter inside for()
Some code used in this series declares variable i and only uses it
in a for loop, not in any other logic outside the loop.

Change the declaration of i to be inside the for loop for readability.
While at it, we also change its type from "int" to "size_t" where the
latter makes more sense.

Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Eric Ju <eric.peijian@gmail.com>
Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-25 13:21:13 -07:00
Patrick Steinhardt
a087ec1109 refs: move parsing of "core.logAllRefUpdates" back into ref stores
In cc42c88945 (refs: extract out reflog config to generic layer,
2026-05-04) we have refactored how we parse "core.logAllRefUpdates" so
that it happens in the generic layer. Unfortunately, this has worsened a
preexisting issue where we may recurse when creating the reference store
because of a chicken-and-egg problem between parsing the configuration
and evaluating "onbranch" conditions.

Prepare for a fix by essentially reverting that change so that we handle
this setting in the respective backends again. The backends are already
parsing other configuration anyway, so by moving the logic back in there
we can ensure that all backend configuration is parsed the same way.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-25 13:19:57 -07:00
Harald Nordgren
4a48af9c27 branch: add --dry-run for --delete-merged
With --dry-run, --delete-merged prints the local branches it would
delete, one "Would delete branch <name>" line each, and exits
without touching any ref. The same filtering applies, so the output
is exactly the set that the real run would delete.

--dry-run is only meaningful together with --delete-merged and is
rejected otherwise.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:18:16 -07:00
Harald Nordgren
365384b1db branch: add branch.<name>.deleteMerged opt-out
Setting branch.<name>.deleteMerged=false exempts that branch from
"git branch --delete-merged", which is useful for a topic you want
to keep developing after an early round of it has been merged
upstream. Unless --quiet is given, each skip is reported so the
user knows why their topic was kept.

Explicit deletion with "git branch -d" still uses the normal merge
check and ignores this setting.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:18:15 -07:00
Harald Nordgren
7b6e901ec8 branch: add --delete-merged <branch>
git branch --delete-merged <branch>...

deletes the local branches that "--forked <branch>" would list,
keeping only those whose tip is reachable from their configured
upstream. The work has already landed on the upstream they track,
so the local copy is no longer needed.

A branch is not deleted when:

  * it is checked out in any worktree
  * its upstream remote-tracking branch no longer exists, since a
    missing upstream is not by itself a sign of integration
  * its push destination equals its upstream (<branch>@{push} is
    the same as <branch>@{upstream}), such as a local "main" that
    tracks and pushes to "origin/main". Right after a pull it just
    looks "fully merged", so it is kept. Only branches that push
    somewhere other than their upstream, typically topics in a fork
    workflow, are candidates.

A branch whose work is not yet merged into its upstream is silently
skipped, so one unmerged topic does not abort the whole sweep.

A branch that another, surviving branch tracks as its upstream is
also kept, so a branch is never deleted out from under one stacked
on top of it. Such a kept branch is itself merged, so when its own
upstream is being deleted, clear its now-stale upstream config.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:18:03 -07:00
Harald Nordgren
28827a8756 branch: prepare delete_branches for a bulk caller
Teach delete_branches() two new modes for the upcoming
--delete-merged: one that asks only whether a branch is merged into
its upstream, without falling back to HEAD when there is no
upstream, and one that rehearses the deletions without removing any
ref. Existing callers keep their current behavior.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:17:53 -07:00
Harald Nordgren
cdce9f3c8b branch: let delete_branches skip unmerged branches on bulk refusal
Add a skip-unmerged mode to delete_branches() and check_branch_commit()
so a bulk caller can silently skip branches that are not fully merged
and carry on, rather than erroring with the "use 'git branch -D'"
advice that the plain "git branch -d" path emits. Existing callers are
unaffected.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:17:51 -07:00
Harald Nordgren
aaf5816f00 branch: convert delete_branches() to a flags argument
delete_branches() and check_branch_commit() take a pair of int
booleans (force and quiet) that the next commits would grow further.
Replace them with a single "unsigned int flags" argument and an
enum, splitting the bits back into named bool locals so the body
keeps reading the same named values.

No change in behavior.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:17:49 -07:00
Harald Nordgren
82a974308c branch: add --forked filter for --list mode
Add a --forked option to "git branch" list mode that lists only
branches whose configured upstream matches <branch>. The argument
can be a ref (e.g. "origin/main", "master"), a remote name like
"origin" for the branch its origin/HEAD points at, or a shell glob
(e.g. "origin/*"), and may be repeated to widen the filter.

It is an ordinary list filter, so it combines with the others:

    git branch --merged origin/main --forked 'origin/*'

lists branches forked from origin that are already merged into
origin/main, and --no-merged inverts the question.

This is the building block for --delete-merged, which deletes the
listed branches once they have landed on their upstream.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:17:46 -07:00
Harald Nordgren
ec0f5a68cd history: re-edit a squash with every message
By default "git history squash" reuses the oldest commit's message.
When --reedit-message is given it only reopened that one message, so the
messages of the folded-in commits were lost.

Gather the messages of every commit in the range, oldest first, and use
them as the editor template when re-editing, mirroring how "git rebase
-i" presents a squash. The combined message is built before the
descendant walk so it is not disturbed by the flags that walk leaves on
the commits.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:15:29 -07:00
Harald Nordgren
f4b9baec7a history: add squash subcommand to fold a range
Folding a series of commits into one required either an interactive
rebase where each commit after the first was hand-edited to "fixup", or
a "git reset --soft" to the merge base followed by "git commit --amend".

Add "git history squash <revision-range>" to do this directly. It folds
every commit in the range into the oldest one, keeping that commit's
message and authorship and taking the tree of the newest commit, so the
range collapses into a single commit. Commits above the range are
replayed on top of the result.

The range is given as <base>..<tip>, so "git history squash @~3.."
folds the three most recent commits and "git history squash @~5..@~2"
squashes an interior range. A merge inside the range is folded like any
other commit, but the range must have a single base, so a range with
more than one entry point is rejected.

The folded commits leave the history, so by default the command refuses
when another ref points at one of them. Use "--update-refs=head" to
rewrite only the current branch and leave those refs untouched.

Inspired-by: Sergey Chernov <serega.morph@gmail.com>
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:15:26 -07:00
Harald Nordgren
064f5be22c history: give commit_tree_ext a message template
commit_tree_ext() reuses the message of the commit it is handed. A
caller that folds several commits together wants to seed the message
from more than that single commit, so add an optional message_template
parameter. When NULL, the behavior is unchanged.

Pass NULL from the existing fixup and split callers.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:15:23 -07:00
Harald Nordgren
8bf7871624 history: extract helper for a commit's parent tree
Three places resolve the tree of a commit's first parent, falling back
to the empty tree for a root commit, each repeating the same parse and
oidcpy dance. Extract a first_parent_tree_oid() helper and route the
existing callers through it.

No change in behavior.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:15:17 -07:00
Harald Nordgren
8b13e2d194 checkout: extend --track with a "fetch" mode to refresh start-point
Forking from an existing remote branch without refreshing first often
has consequences: you start work that has already been done, or you
build on an old version of the code which causes big conflicts later
when you pull. The workaround is two commands ("git fetch <remote>
<branch> && git checkout -b <topic> <remote>/<branch>"), and when
the fetch is skipped the checkout silently starts from a stale tip.

Users may already expect "<remote>/<branch>" to refer to the latest
tip on the remote. While this blurs the line between fetch and
checkout, git already does this in places where it pays off: "git
clone" fetches and checks out, and "git pull" fetches and merges.

Add a "fetch" mode to "--track" that refreshes <start-point> before
checking it out:

    git checkout -b new_branch --track=fetch origin/some-branch

Only the requested branch is fetched so other remote-tracking
branches are left untouched. When <start-point> is a bare <remote>
(e.g. "origin"), follow refs/remotes/<remote>/HEAD to learn which
branch to refresh. If "git fetch" fails but the remote-tracking ref
already exists locally, warn and proceed from the existing tip,
otherwise abort.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 15:06:02 -07:00
Patrick Steinhardt
4f48d2a241 treewide: convert users of whence to the new source field
The `whence` field has become redundant now that callers can learn about
the exact source an object has been looked up from via the `struct
object_info_source::source` field.

Adapt callers to use the new field. Note that all callsites already set
up the `info.sourcep` request pointer, so the conversion is rather
straight-forward.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 10:12:35 -07:00
Patrick Steinhardt
695797490e odb: make backend-specific fields optional
The `struct object_info` carries two pieces of information
about how an object was looked up:

  - The `whence` enum identifying the backend.

  - The backend-tagged union `u` exposing backend-specific details
    (currently only the packed-source case, which records the owning
    pack, offset and packed object type).

The union is populated unconditionally, even though most callers don't
care about provenance at all.

Split the backend-specific union out into a new public type, `struct
object_info_source`, and make the object info structure carry it via
just another opt-in request pointer. As with all the other requestable
information, callers that need source info allocate a `struct
object_info_source` on the stack and point `sourcep` at it; callers that
don't care about it simply leave the field as a `NULL` pointer. Adapt
callers accordingly.

Note that the `whence` enum is strictly-speaking also backend-specific
information, so it would be another good candidate to be moved into the
`struct object_info_source`. For now though it is left alone, as it will
be replaced by a `struct odb_source` pointer in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 10:12:35 -07:00
Patrick Steinhardt
1b9f137b43 packfile: thread odb_source_packed through packed_object_info()
Add an optional `struct odb_source_packed *source` parameter to
`packed_object_info()` and `packed_object_info_with_index_pos()`. This
parameter is unused at this point in time, but it will be used in a
follow-up commit so that we can record the source of a specific object.

Note that callers in "odb/source-packed.c" pass the already-available
source, but all other callers pass `NULL` instead. This is fine though,
as we only care about populating this info when called via the packed
store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-24 10:12:35 -07:00
Junio C Hamano
2ed34f72cf Merge branch 'ps/odb-source-packed' into ps/odb-drop-whence
* ps/odb-source-packed:
  odb/source-packed: drop pointer to "files" parent source
  midx: refactor interfaces to work on "packed" source
  odb/source-packed: stub out remaining functions
  odb/source-packed: wire up `freshen_object()` callback
  odb/source-packed: wire up `find_abbrev_len()` callback
  odb/source-packed: wire up `count_objects()` callback
  odb/source-packed: wire up `for_each_object()` callback
  odb/source-packed: wire up `read_object_stream()` callback
  odb/source-packed: wire up `read_object_info()` callback
  packfile: use higher-level interface to implement `has_object_pack()`
  odb/source-packed: wire up `reprepare()` callback
  odb/source-packed: wire up `close()` callback
  odb/source-packed: start converting to a proper `struct odb_source`
  odb/source-packed: store pointer to "files" instead of generic source
  packfile: move packed source into "odb/" subsystem
  packfile: split out packfile list logic
  packfile: rename `struct packfile_store` to `odb_source_packed`
2026-06-24 10:12:12 -07:00
K Jayatheerth
3ac28d832a repo: add path.gitdir with absolute and relative suffix formatting
Scripts need a stable way to locate the git directory without
parsing rev-parse output or relying on its flag-driven path format
selection. There is no way to retrieve this path from git repo info
today.

Introduce path.gitdir.absolute and path.gitdir.relative keys,
consistent with the path.commondir keys added in the previous patch.
Reuse the test_repo_info_path helper introduced there to validate
both variants.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-23 21:15:52 -07:00
K Jayatheerth
1efca6d0b2 repo: add path.commondir with absolute and relative suffix formatting
Scripts working with worktree setups need a reliable way to discover
the common directory, which diverges from the git directory when
multiple worktrees are in use. There is no way to retrieve this path
from git repo info today.

Introduce path.commondir.absolute and path.commondir.relative keys.
Exposing explicit format variants rather than a single key with a
default avoids ambiguity for scripts that require predictable output.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-23 21:15:52 -07:00
K Jayatheerth
60cafea907 path: extract format_path() and use in rev-parse
Path formatting logic in builtin/rev-parse.c writes directly to
stdout. Other builtins cannot reuse it.

Extract this logic into format_path() in path.c and expose
a path_format enum in path.h.

Convert rev-parse to use the new helper in the same step to validate
the API against existing tests and avoid introducing dead code.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-23 21:15:52 -07:00
Johannes Schindelin
50cc7f3814 replay: offer an option to linearize the commit topology
One of the stated goals of git-replay(1) is to allow implementing the
git-rebase(1) functionality on the server side.

The default mode of git-rebase(1) is to act as if `--no-rebase-merges`
was given. This mode drops merge commits instead of replaying them, and
linearizes the commit history into a sequence of the
regular (single-parent) commits.

Add option `--linearize` to git-replay(1) to do the same.

Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-22 08:38:37 -07:00
Junio C Hamano
02bb39c5cb Merge branch 'js/objects-larger-than-4gb-on-windows-more'
* js/objects-larger-than-4gb-on-windows-more:
  odb: use size_t for object_info.sizep and the size APIs
  packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
  pack-objects: use size_t for in-core object sizes
  packfile: widen unpack_entry()'s size out-parameter to size_t
  pack-objects(check_pack_inflate()): use size_t instead of unsigned long
  patch-delta: use size_t for sizes
  compat/msvc: use _chsize_s for ftruncate
2026-06-21 16:41:38 -07:00
Taylor Blau
7e6de2ac62 pack-objects: support --delta-islands with --path-walk
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.

When discussing those original patches on the list, a message from
Stolee in [1] noted the following:

    this could be remedied by [...] doing a separate walk to identify
    islands using the normal method

In a related portion of the thread, Peff explains[2]:

    The delta islands code already does its own tree walk to propagate
    the bits down (it does rely on the base walk's show_commit() to
    propagate through the commits).

    Once each object has its island bitmaps, I think however you
    choose to come up with delta candidates [...] you should be able
    to use it. It's fundamentally just answering the question of "am
    I allowed to delta between these two objects".

That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.

Recall how delta-islands are computed during a normal repack:

 - `show_commit()` calls `propagate_island_marks()` for each commit,
   which merges the commit's island bitset onto its root tree object and
   onto each of its parent commits.

 - `show_object()` for a tree records the tree's depth derived from the
   slash-separated pathname. Subsequent `resolve_tree_islands()` uses
   that depth to walk trees in increasing-depth order, propagating each
   tree's marks to its children.

 - At delta-search time, `in_same_island()` enforces that a delta
   target's island bitmap is a subset of its base's: every island that
   reaches the target must also reach the base.

Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:

 - For each commit batch, call `propagate_island_marks()` on commits,
   exactly as `show_commit()` does.

   We have to be careful about the order in which we call this function,
   and we must see a commit before its parents in order to have
   island marks to propagate.

   The path-walk batch preserves that order. Path-walk appends commits
   to its `OBJ_COMMIT` batch as they come back from the same
   `get_revision()` loop the regular traversal uses, and
   `add_objects_by_path()` iterates the batch in array order. So every
   commit reaches `propagate_island_marks()` in the same sequence that
   `show_commit()` would have seen it, and the descendant-first chain
   that the algorithm relies on is intact.

   Skip island propagation for excluded commits to match the regular
   traversal, whose `show_commit()` callback is only invoked for
   interesting commits. Boundary commits may still be present in
   path-walk's callback so they can serve as thin-pack bases, but they
   should not contribute island marks.

 - For each tree batch, record the tree's depth from the path. Use the
   `record_tree_depth()` helper from the previous commit so both
   callbacks behave identically, including the max-depth-wins behavior
   when a tree is reached via more than one path. The helper accepts
   both the `show_object()` path shape ("foo", "foo/bar") and the
   path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
   recorded from either traversal mode are directly comparable.

   This is implicit in the implementation sketch from Peff above.
   `resolve_tree_islands()` sorts trees by `oe->tree_depth` in
   increasing-depth order before propagating marks down, so that a
   parent tree's marks are finalized before its children inherit them.
   Without recording the depth at path-walk time, every
   path-walk-discovered tree would land at depth 0 in `to_pack`, the
   sort would lose its ordering, and children could inherit marks from
   parents whose own contributions had not yet been merged in.

With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.

Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.

[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-21 16:26:14 -07:00
Taylor Blau
264efee401 pack-objects: extract record_tree_depth() helper
Prepare for a subsequent change that needs to record tree depths from a
second call site by factoring the delta-islands tree-depth bookkeeping
out of `show_object()` and into a helper, `record_tree_depth()`.

The helper looks up the object in `to_pack`, returns early when the
object was not added there, computes the depth from the slash count in
the supplied name, and preserves the existing max-depth-wins behavior
when a tree is reached by more than one path.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-21 16:26:14 -07:00
Taylor Blau
0a37451106 pack-objects: support reachability bitmaps with --path-walk
When 'pack-objects' is invoked with '--path-walk', it prevents us from
using reachability bitmaps.

This behavior dates back to 70664d2865 (pack-objects: add --path-walk
option, 2025-05-16), which included a comment in the relevant portion of
the command-line arguments handling that read as follows:

    /*
     * We must disable the bitmaps because we are removing
     * the --objects / --objects-edge[-aggressive] options.
     */

In fb2c309b7d (pack-objects: pass --objects with --path-walk,
2026-05-02), path-walk learned to pass '--objects' again, but still
kept bitmap traversal disabled. That leaves two useful cases
unsupported:

 * A path-walk repack that writes bitmaps does not give the bitmap
   selector any commits, because path-walk reveals commits through
   `add_objects_by_path()` rather than through `show_commit()`, where
   `index_commit_for_bitmap()` is normally called.

 * An invocation like "git pack-objects --use-bitmap-index --path-walk"
   never tries an existing bitmap, even when one is available and could
   answer the request.

Fortunately for us, neither restriction is required.

 * On the writing side: teach the path-walk object callback to call
   `index_commit_for_bitmap()` for commits that it adds to the pack.
   That gives the bitmap selector the commit candidates it would have
   seen from the regular traversal.

 * For bitmap reading, keep passing '--objects' to the internal rev_list
   machinery, but stop clearing `use_bitmap_index`. If an existing
   bitmap can answer the request, use it; otherwise fall back to
   path-walk's own enumeration.

As a result, we can see significantly reduced pack generation times from
p5311 (with our `GIT_PERF_REPO` set to a recent clone of the fluentui
repository) before this commit:

    Test                                            HEAD^             HEAD
    ----------------------------------------------------------------------------------------
    5311.40: server (1 days, --path-walk)           1.43(1.39+0.04)   0.01(0.01+0.00) -99.3%
    5311.41: size   (1 days, --path-walk)                    139.6K            139.7K +0.0%
    5311.42: client (1 days, --path-walk)           0.02(0.02+0.00)   0.02(0.02+0.00) +0.0%
    5311.44: server (2 days, --path-walk)           1.43(1.39+0.04)   0.01(0.00+0.00) -99.3%
    5311.45: size   (2 days, --path-walk)                    139.6K            139.7K +0.0%
    5311.46: client (2 days, --path-walk)           0.02(0.02+0.00)   0.02(0.02+0.00) +0.0%
    5311.48: server (4 days, --path-walk)           1.44(1.39+0.04)   0.01(0.01+0.00) -99.3%
    5311.49: size   (4 days, --path-walk)                    238.1K            238.1K +0.0%
    5311.50: client (4 days, --path-walk)           0.03(0.03+0.00)   0.03(0.03+0.00) +0.0%
    5311.52: server (8 days, --path-walk)           1.43(1.39+0.03)   0.01(0.00+0.00) -99.3%
    5311.53: size   (8 days, --path-walk)                    344.9K            344.9K +0.0%
    5311.54: client (8 days, --path-walk)           0.07(0.07+0.00)   0.07(0.08+0.00) +0.0%
    5311.56: server (16 days, --path-walk)          1.47(1.44+0.03)   0.10(0.08+0.01) -93.2%
    5311.57: size   (16 days, --path-walk)                   844.0K            844.0K +0.0%
    5311.58: client (16 days, --path-walk)          0.09(0.09+0.00)   0.09(0.09+0.00) +0.0%
    5311.60: server (32 days, --path-walk)          1.52(1.50+0.05)   0.14(0.15+0.02) -90.8%
    5311.61: size   (32 days, --path-walk)                     4.2M              4.2M +0.1%
    5311.62: client (32 days, --path-walk)          0.34(0.48+0.02)   0.34(0.45+0.05) +0.0%
    5311.64: server (64 days, --path-walk)          1.55(1.52+0.06)   0.15(0.15+0.04) -90.3%
    5311.65: size   (64 days, --path-walk)                     6.4M              6.4M -0.0%
    5311.66: client (64 days, --path-walk)          0.51(0.79+0.05)   0.51(0.80+0.06) +0.0%
    5311.68: server (128 days, --path-walk)         1.59(1.57+0.06)   0.16(0.21+0.01) -89.9%
    5311.69: size   (128 days, --path-walk)                    8.4M              8.4M -0.0%
    5311.70: client (128 days, --path-walk)         0.72(1.44+0.08)   0.71(1.47+0.09) -1.4%

We get the same size of output pack, but this commit allows us to do so
in a significantly shorter amount of time. Intuitively, we're generating
the same pack (hence the unchanged 'test_size' output from run to run),
but varying how we get there. Before this commit, pack-objects prefers
'--path-walk' to '--use-bitmap-index', so we generate the output pack by
performing a normal '--path-walk' traversal. With this commit, we are
operating over a *repacked* state (that itself was done with a
'--path-walk' traversal), but are able to perform pack-reuse on that
repacked state via bitmaps.

When comparing the size of the repacked pack with/without '--path-walk'
on the previous commit versus this one, we see that (a) the repacked size
improves significantly with '--path-walk', and that (b) writing bitmaps
during repacking does not regress this improvement:

    Test                                            HEAD^             HEAD
    ----------------------------------------------------------------------------------------
    5311.3: size of bitmapped pack                           558.4M            558.5M +0.0%
    5311.38: size of bitmapped pack (--path-walk)            164.4M            164.4M +0.0%

(Note that to observe an improvement here, we must repack with '-F' in
order to avoid reusing non-'--path-walk' deltas, which would otherwise
skew our results.)

There is one wrinkle when it comes to '--boundary', which we must not
pass into the bitmap walk in the presence of both '--path-walk' and
'--use-bitmap-index'. Path-walk needs boundary commits when it performs
its own traversal, in order to discover bases for thin packs, but the
bitmap traversal does not expect this. Work around this by setting
`revs->boundary` as late as possible within the '--path-walk' traversal,
after any bitmap attempt has either succeeded or declined to answer the
request.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-21 16:26:14 -07:00