Commit Graph

13828 Commits

Author SHA1 Message Date
Junio C Hamano
7f6866d008 Merge branch 'js/objects-larger-than-4gb-on-windows-more' into seen
* js/objects-larger-than-4gb-on-windows-more:
  odb: use size_t for object_info.sizep and the size APIs
  packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
  pack-objects: use size_t for in-core object sizes
  packfile: widen unpack_entry()'s size out-parameter to size_t
  pack-objects(check_pack_inflate()): use size_t instead of unsigned long
  patch-delta: use size_t for sizes
  compat/msvc: use _chsize_s for ftruncate
2026-06-15 10:38:21 -07:00
Junio C Hamano
cfe6a29bff Merge branch 'mm/diff-process-hunks' into seen
A new `diff.<driver>.process` configuration has been introduced to
allow a long-running external process to act as a hunk provider. This
allows external tools to control which lines Git considers changed
while leaving all output formatting (word diff, color, blame, etc.) to
Git's standard pipeline.

* mm/diff-process-hunks:
  blame: consult diff process for no-hunk detection
  diff: bypass diff process with --no-ext-diff and in format-patch
  diff: add long-running diff process via diff.<driver>.process
  sub-process: separate process lifecycle from hashmap management
  userdiff: add diff.<driver>.process config
  xdiff: support external hunks via xpparam_t
2026-06-15 10:38:21 -07:00
Junio C Hamano
9536fbf5de Merge branch 'rs/cat-file-default-format-optim' into seen
* rs/cat-file-default-format-optim:
  cat-file: speed up default format
2026-06-15 10:38:20 -07:00
Junio C Hamano
ec19b5e76c Merge branch 'tb/midx-incremental-custom-base' into seen
The `git multi-pack-index write --incremental` command has been
corrected to properly honor the `--base` option. Previously, the
custom base was ignored by the normal write path, and the pack
exclusion logic incorrectly skipped packs from layers above the
selected base, breaking reachability closure for bitmaps.

* tb/midx-incremental-custom-base:
  midx-write: include packs above custom incremental base
  midx: pass custom '--base' through incremental writes
  t5334: expose shared `nth_line()` helper
2026-06-15 10:27:32 -07:00
Junio C Hamano
604b9ad138 Merge branch 'tc/replay-linearize' into seen
git replay learns --linearize option to drop merge commits and
linearize the replayed history, mimicking git rebase
--no-rebase-merges.

* tc/replay-linearize:
  replay: offer an option to linearize the commit topology
  replay: add helper to put entry into mapped_commits
  replay: refactor enum replay_mode into a bool
2026-06-15 10:27:32 -07:00
Junio C Hamano
24447fa096 Merge branch 'hn/branch-prune-merged' into seen
"git branch" command learned "--delete-merged" option to remove
local branches that have already been merged to the remote-tracking
branches they track.

* hn/branch-prune-merged:
  branch: add --dry-run for --delete-merged
  branch: add branch.<name>.deleteMerged opt-out
  branch: add --delete-merged <branch>
  branch: prepare delete_branches for a bulk caller
  branch: let delete_branches skip unmerged branches on bulk refusal
  branch: convert delete_branches() to a flags argument
  branch: add --forked filter for --list mode
2026-06-15 10:27:31 -07:00
Junio C Hamano
5d8813aa13 Merge branch 'kk/prio-queue-get-put-fusion' into seen
The lazy priority queue optimization pattern (deferring actual removal
in prio_queue_get() to allow get+put fusion) has been folded directly
into prio_queue itself, speeding up commit traversal workflows and
simplifying callers.

* kk/prio-queue-get-put-fusion:
  prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
  prio-queue: rename .nr to .nr_ and add accessor helpers
2026-06-15 10:27:31 -07:00
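
The get+put fusion described in the 'kk/prio-queue-get-put-fusion' entry above can be pictured with a small Python sketch. It is only an illustration under assumptions: the class name `FusedQueue`, its methods and the `sift_operations` counter are hypothetical and do not mirror Git's C prio_queue API. The idea shown is that get() hands out the root without removing it, so a put() that follows can overwrite the root and restructure the heap once instead of twice.

```python
import heapq


class FusedQueue:
    """Min-heap whose get() defers removal so a following put() can fuse with it.

    Hypothetical illustration; names do not mirror Git's prio_queue API.
    """

    def __init__(self):
        self._heap = []
        self._pending_get = False  # root was handed out but not yet removed
        self.sift_operations = 0   # counts heap restructurings, for demonstration

    def _flush(self):
        # Perform the removal that an earlier get() postponed.
        if self._pending_get:
            heapq.heappop(self._heap)
            self.sift_operations += 1
            self._pending_get = False

    def put(self, item):
        if self._pending_get:
            # Fusion: replace the already-returned root and sift only once.
            heapq.heapreplace(self._heap, item)
            self._pending_get = False
        else:
            heapq.heappush(self._heap, item)
        self.sift_operations += 1

    def get(self):
        self._flush()
        if not self._heap:
            return None
        self._pending_get = True
        return self._heap[0]

    def __len__(self):
        return len(self._heap) - (1 if self._pending_get else 0)
```

A traversal loop that pops one commit and immediately pushes its parent is the case that benefits: each such pair costs one heap restructuring rather than a pop followed by a push.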
Junio C Hamano
9cd4ee2fa1 Merge branch 'ps/cat-file-remote-object-info' into seen
The `remote-object-info` command has been added to `git cat-file
--batch-command`, allowing clients to request object metadata
(currently size) from a remote server via protocol v2 without
downloading the entire object.

The client dynamically filters format placeholders based on
server-advertised capabilities and safely returns empty strings for
inapplicable or unsupported fields.

* ps/cat-file-remote-object-info:
  cat-file: make remote-object-info allow-list dynamic
  cat-file: validate remote atoms with allow_list
  cat-file: add remote-object-info to batch-command
  transport: add client support for object-info
  serve: advertise object-info feature
  fetch-pack: move fetch initialization
  connect: refactor packet writing
  fetch-pack: move function to connect.c
  t1006: split test utility functions into new "lib-cat-file.sh"
  cat-file: add declaration of variable i inside its for loop
  git-compat-util: add strtoul_ul() with error handling
  transport-helper: fix memory leak of helper on disconnect
2026-06-15 10:27:31 -07:00
Junio C Hamano
47be044636 Merge branch 'ps/odb-source-packed' into seen
The packed object source has been refactored into a proper struct
odb_source.

* ps/odb-source-packed:
  odb/source-packed: drop pointer to "files" parent source
  midx: refactor interfaces to work on "packed" source
  odb/source-packed: stub out remaining functions
  odb/source-packed: wire up `freshen_object()` callback
  odb/source-packed: wire up `find_abbrev_len()` callback
  odb/source-packed: wire up `count_objects()` callback
  odb/source-packed: wire up `for_each_object()` callback
  odb/source-packed: wire up `read_object_stream()` callback
  odb/source-packed: wire up `read_object_info()` callback
  packfile: use higher-level interface to implement `has_object_pack()`
  odb/source-packed: wire up `reprepare()` callback
  odb/source-packed: wire up `close()` callback
  odb/source-packed: start converting to a proper `struct odb_source`
  odb/source-packed: store pointer to "files" instead of generic source
  packfile: move packed source into "odb/" subsystem
  packfile: split out packfile list logic
  packfile: rename `struct packfile_store` to `odb_source_packed`
2026-06-15 10:27:30 -07:00
Junio C Hamano
4df924f427 Merge branch 'ps/history-drop' into seen
The experimental "git history" command has been taught a new "drop"
subcommand to remove a commit and replay its descendants onto its
parent.

* ps/history-drop:
  builtin/history: implement "drop" subcommand
  builtin/history: split handling of ref updates into two phases
  reset: stop assuming that the caller passes in a clean index
  reset: allow the caller to specify the current HEAD object
  reset: introduce ability to skip updating HEAD
  reset: introduce dry-run mode
  reset: modernize flags passed to `reset_working_tree()`
  reset: rename `reset_head()`
  reset: drop `USE_THE_REPOSITORY_VARIABLE`
  read-cache: split out function to drop unmerged entries to stage 0
2026-06-15 10:27:29 -07:00
Junio C Hamano
85166601f0 Merge branch 'jk/repo-info-path-keys' into seen
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".

* jk/repo-info-path-keys:
  repo: add path.gitdir with absolute and relative suffix formatting
  repo: add path.commondir with absolute and relative suffix formatting
  rev-parse: use append_formatted_path() for path formatting
  path: introduce append_formatted_path() for shared path formatting
2026-06-15 10:27:29 -07:00
Junio C Hamano
a0edb09739 Merge branch 'tb/pack-path-walk-bitmap-delta-islands' into seen
The pack-objects command now supports using reachability bitmaps and
delta-islands concurrently with the `--path-walk` option, allowing
faster packaging by falling back to path-walk when bitmaps cannot
fully satisfy the request.

* tb/pack-path-walk-bitmap-delta-islands:
  pack-objects: support `--delta-islands` with `--path-walk`
  pack-objects: extract `record_tree_depth()` helper
  pack-objects: support reachability bitmaps with `--path-walk`
  t/perf: drop p5311's lookup-table permutation
2026-06-15 10:27:28 -07:00
Junio C Hamano
b0e517b8bf Merge branch 'ec/commit-fixup-options' into seen
The -m/-F/-c/-C options to supply commit log message from outside the
editor are now supported for all "git commit --fixup" variations.

* ec/commit-fixup-options:
  commit: allow -c/-C for all kinds of --fixup
  commit: allow -m/-F for all kinds of --fixup
2026-06-15 10:27:28 -07:00
Junio C Hamano
cc449f1a92 Merge branch 'kk/fetch-store-ref-optimization' into seen
When fetching from a transport that provides a self-contained pack,
pass the transport pointer to the post-fetch `check_connected()` call
to optimize connectivity check.

Retracted.
cf. <CAL71e4MrVqC1=AR6x0_8S=8kVqPdDkhgCZRb4etFsxTzd6s_8Q@mail.gmail.com>

* kk/fetch-store-ref-optimization:
  fetch: pass transport to post-fetch connectivity check
2026-06-15 10:27:28 -07:00
Junio C Hamano
44acece03f Merge branch 'hn/checkout-track-fetch' into seen
"git checkout --track=..." learned to optionally fetch the branch
from the remote the new branch will work with.

* hn/checkout-track-fetch:
  checkout: extend --track with a "fetch" mode to refresh start-point
  branch: expose helpers for finding the remote owning a tracking ref
2026-06-15 10:27:28 -07:00
Junio C Hamano
3f1b7e9586 Merge branch 'js/parseopt-subcommand-autocorrection' into seen
The parse-options library learned to auto-correct misspelled
subcommand names.

* js/parseopt-subcommand-autocorrection:
  SQUASH???
  doc: document autocorrect API
  parseopt: add tests for subcommand autocorrection
  parseopt: enable subcommand autocorrection for git-remote and git-notes
  parseopt: autocorrect mistyped subcommands
  autocorrect: provide config resolution API
  autocorrect: rename AUTOCORRECT_SHOW to AUTOCORRECT_HINT
  autocorrect: use mode and delay instead of magic numbers
  help: move tty check for autocorrection to autocorrect.c
  help: make autocorrect handling reusable
  parseopt: extract subcommand handling from parse_options_step()
2026-06-15 10:27:27 -07:00
Junio C Hamano
36f5666b37 Merge branch 'td/ls-files-pathspec-prefilter' into jch
`git ls-files --modified` and `git ls-files --deleted` have been
optimized to filter with pathspec before calling lstat() when there is
only a single pathspec item, avoiding unnecessary filesystem access
for entries that will not be shown.

* td/ls-files-pathspec-prefilter:
  ls-files: filter pathspec before lstat
2026-06-15 10:27:18 -07:00
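
The ordering change in the 'td/ls-files-pathspec-prefilter' entry above, sketched in Python under simplifying assumptions: the function name `deleted_entries` is hypothetical, the index is a plain list of paths, and `fnmatchcase` stands in for Git's richer pathspec matching. The point is only that the pathspec test runs before lstat(), so entries that would never be shown cost no filesystem access.

```python
import os
from fnmatch import fnmatchcase


def deleted_entries(index_paths, pathspec, lstat=os.lstat):
    """Return index paths matching `pathspec` that no longer exist on disk.

    Hypothetical helper: the pathspec check comes first, so lstat() is
    called only for entries that could appear in the output.
    """
    deleted = []
    for path in index_paths:
        if not fnmatchcase(path, pathspec):
            continue  # filtered out before touching the filesystem
        try:
            lstat(path)
        except FileNotFoundError:
            deleted.append(path)
    return deleted
```

Passing a counting wrapper as `lstat` makes the saving visible: with a pathspec of "doc/*", entries under "src/" are never stat-ed.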
Junio C Hamano
0ca4231483 Merge branch 'ps/setup-drop-global-state' into jch
Continuation of "setup.c" refactoring to drop remaining global state
(`git_work_tree_cfg`, `is_bare_repository_cfg`). The most notable
outcome is that `is_bare_repository()` has been updated to no longer
implicitly rely on `the_repository`.

* ps/setup-drop-global-state:
  treewide: drop USE_THE_REPOSITORY_VARIABLE
  environment: stop using `the_repository` in `is_bare_repository()`
  environment: split up concerns of `is_bare_repository_cfg`
  builtin/init: stop modifying `is_bare_repository_cfg`
  setup: remove global `git_work_tree_cfg` variable
  builtin/init: simplify logic to configure worktree
  builtin/init: stop modifying global `git_work_tree_cfg` variable
2026-06-15 10:27:18 -07:00
Junio C Hamano
b6d3520e47 Merge branch 'td/describe-tag-iteration' into jch
'git describe' has been taught to pass the 'refs/tags/' prefix down to
the ref iterator when '--all' is not requested, avoiding unnecessary
iteration over non-tag refs.

* td/describe-tag-iteration:
  describe: limit default ref iteration to tags
2026-06-15 10:27:18 -07:00
Junio C Hamano
fdd7eec2bf Merge branch 'ab/index-pack-retain-child-bases' into jch
"git index-pack" has been optimized by retaining child bases in the
delta cache instead of immediately freeing them, letting the existing
cache limit policy decide eviction.

* ab/index-pack-retain-child-bases:
  index-pack: retain child bases in delta cache
2026-06-15 10:27:17 -07:00
Junio C Hamano
c3138f9d0d Merge branch 'jk/describe-contains-all-match-fix' into jch
The 'git describe --contains --all' command has been fixed to
properly honor the '--match' and '--exclude' options by passing
them down to 'git name-rev' with the appropriate reference
prefixes.

* jk/describe-contains-all-match-fix:
  describe: fix --exclude, --match with --contains and --all
2026-06-15 10:27:16 -07:00
Junio C Hamano
998a0bc412 Merge branch 'kk/streaming-walk-pqueue' into jch
Streaming revision walks have been optimized by using a priority queue
for date-sorting commits, speeding up walks in repositories with many
merges.

* kk/streaming-walk-pqueue:
  revision: use priority queue for non-limited streaming walks
  revision: introduce rev_walk_mode to clarify get_revision_1()
  pack-objects: call release_revisions() after cruft traversal
2026-06-15 10:27:15 -07:00
Harald Nordgren
609f13f80d branch: add --dry-run for --delete-merged
With --dry-run, --delete-merged prints the local branches it would
delete, one "Would delete branch <name>" line each, and exits
without touching any ref. The same filtering applies, so the output
is exactly the set that the real run would delete.

--dry-run is only meaningful together with --delete-merged and is
rejected otherwise.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
Harald Nordgren
99a37bb7ab branch: add branch.<name>.deleteMerged opt-out
Setting branch.<name>.deleteMerged=false exempts that branch from
"git branch --delete-merged", which is useful for a topic you want
to keep developing after an early round of it has been merged
upstream. Unless --quiet is given, each skip is reported so the
user knows why their topic was kept.

Explicit deletion with "git branch -d" still uses the normal merge
check and ignores this setting.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
Harald Nordgren
163f106aff branch: add --delete-merged <branch>
git branch --delete-merged <branch>...

deletes the local branches that "--forked <branch>" would list,
keeping only those whose tip is reachable from their configured
upstream. The work has already landed on the upstream they track,
so the local copy is no longer needed.

Three kinds of branches are not deleted:

  * any branch checked out in any worktree
  * any branch whose upstream remote-tracking branch no longer
    exists, since a missing upstream is not by itself a sign of
    integration
  * any branch whose push destination equals its upstream
    (<branch>@{push} is the same as <branch>@{upstream}), such as
    a local "main" that tracks and pushes to "origin/main". Right
    after a pull it just looks "fully merged", so it is kept. Only
    branches that push somewhere other than their upstream,
    typically topics in a fork workflow, are candidates.

A branch whose work is not yet merged into its upstream is silently
skipped, so one unmerged topic does not abort the whole sweep.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
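
The selection rules in the "branch: add --delete-merged <branch>" message above can be restated as a short Python sketch. This is not the C implementation: the `LocalBranch` record and its boolean fields are hypothetical stand-ins for facts Git would compute from refs, worktrees and configuration.

```python
from dataclasses import dataclass


@dataclass
class LocalBranch:
    name: str
    checked_out: bool           # checked out in any worktree
    upstream_exists: bool       # remote-tracking upstream ref still present
    push_equals_upstream: bool  # <branch>@{push} is the same as <branch>@{upstream}
    merged_into_upstream: bool  # tip reachable from the configured upstream


def branches_to_delete(forked_branches):
    """Apply the three exclusions, then keep only fully merged branches."""
    doomed = []
    for b in forked_branches:
        if b.checked_out or not b.upstream_exists or b.push_equals_upstream:
            continue  # one of the three protected kinds
        if not b.merged_into_upstream:
            continue  # silently skipped; does not abort the sweep
        doomed.append(b.name)
    return doomed
```

Under these assumptions a local "main" that tracks and pushes to "origin/main" is never returned, while a merged topic that pushes to a fork is.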
Harald Nordgren
63bdf64e03 branch: prepare delete_branches for a bulk caller
Teach delete_branches() two new modes for the upcoming
--delete-merged: one that asks only whether a branch is merged into
its upstream, without falling back to HEAD when there is no
upstream, and one that rehearses the deletions without removing any
ref. Existing callers keep their current behavior.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
Harald Nordgren
01e3fc772f branch: let delete_branches skip unmerged branches on bulk refusal
Add a skip-unmerged mode to delete_branches() and check_branch_commit()
so a bulk caller can silently skip branches that are not fully merged
and carry on, rather than erroring with the "use 'git branch -D'"
advice that the plain "git branch -d" path emits. Existing callers are
unaffected.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
Harald Nordgren
35529797af branch: convert delete_branches() to a flags argument
delete_branches() and check_branch_commit() take a pair of int
booleans (force and quiet) that the next commits would grow further.
Replace them with a single "unsigned int flags" argument and an
enum, splitting the bits back into named bool locals so the body
keeps reading the same named values.

No change in behavior.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
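
The refactoring pattern in the "convert delete_branches() to a flags argument" message above, shown as a Python analogue. The actual change is in C with an "unsigned int flags" argument and an enum; the names `DeleteFlags` and `split_flags` below are hypothetical and chosen only for illustration.

```python
import enum


class DeleteFlags(enum.IntFlag):
    # Hypothetical bit names standing in for the force and quiet booleans.
    FORCE = 1 << 0
    QUIET = 1 << 1


def split_flags(flags):
    """Split the combined bits back into named booleans, as the body would."""
    force = bool(flags & DeleteFlags.FORCE)
    quiet = bool(flags & DeleteFlags.QUIET)
    return force, quiet
```

New modes can then be added as further bits without changing the function signature, which is what the later commits in the series rely on.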
Harald Nordgren
80f889ef00 branch: add --forked filter for --list mode
Add a --forked option to "git branch" list mode that lists only
branches whose configured upstream matches <branch>. The argument
can be a ref (e.g. "origin/main", "master"), a remote name like
"origin" for the branch its origin/HEAD points at, or a shell glob
(e.g. "origin/*"), and may be repeated to widen the filter.

It is an ordinary list filter, so it combines with the others:

    git branch --merged origin/main --forked 'origin/*'

lists branches forked from origin that are already merged into
origin/main, and --no-merged inverts the question.

This is the building block for --delete-merged, which deletes the
listed branches once they have landed on their upstream.

Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:22:45 -07:00
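
A rough Python sketch of how the --forked filter described above selects branches. It rests on assumptions: `forked_from` is a hypothetical function, `fnmatchcase` approximates shell-glob matching (Git's own matcher may differ), and `remote_heads` is a made-up mapping standing in for resolving what a remote's HEAD points at.

```python
from fnmatch import fnmatchcase


def forked_from(upstreams, patterns, remote_heads=None):
    """List branches whose configured upstream matches any pattern.

    upstreams: {local branch: upstream short name, e.g. "origin/main"}
    remote_heads: {remote name: branch its HEAD points at}; hypothetical
    stand-in for looking up the remote's HEAD.
    """
    remote_heads = remote_heads or {}
    # A bare remote name stands for the branch its HEAD points at.
    resolved = [remote_heads.get(p, p) for p in patterns]
    return sorted(
        branch
        for branch, upstream in upstreams.items()
        if any(fnmatchcase(upstream, pat) for pat in resolved)
    )
```

Repeating the option corresponds to passing several patterns, which widens the filter because a branch is listed when any one of them matches.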
Patrick Steinhardt
aadf60cdd8 builtin/history: implement "drop" subcommand
A common operation when editing the commit history is to drop a specific
commit from the history entirely, but this operation is not currently
covered by git-history(1).

A couple of noteworthy bits:

  - This is the first git-history(1) command that will ultimately result
    in changes to both the index and the working tree. We thus have to
    add logic to merge resulting changes into those.

  - It is still not possible to replay merge commits, so this limitation
    is inherited for the new "drop" command.

  - For now we refuse to drop root commits. While we _can_ indeed drop
    root commits in the general case, there are edge cases where the
    resulting history would become completely empty. This is thus left
    to a subsequent patch series.

Other than that, most of the logic is rather straightforward as we can
continue to build on the preexisting logic in git-history(1) for the
most part.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:17:26 -07:00
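
What "drop" means for a linear history, sketched in Python. This is a toy model under stated assumptions, not git-history(1) itself: a commit is a (id, patch) pair, `commit_id` is a hypothetical hash of the parent id and the patch text, and index or working tree updates are left out. It shows why every descendant of the dropped commit gets a new id, and it mirrors the refusal to drop a root commit.

```python
import hashlib


def commit_id(parent, patch):
    # Hypothetical id: depends on the parent, so rewriting a parent changes it.
    return hashlib.sha1(f"{parent}\n{patch}".encode()).hexdigest()[:10]


def build(patches):
    """Create a linear history, oldest first, from a list of patch texts."""
    commits, parent = [], ""
    for patch in patches:
        parent = commit_id(parent, patch)
        commits.append((parent, patch))
    return commits


def drop(commits, target):
    """Remove `target` and replay its descendants onto its parent."""
    index = [cid for cid, _ in commits].index(target)
    if index == 0:
        raise ValueError("refusing to drop a root commit")
    kept = commits[:index]
    parent = kept[-1][0]
    for _, patch in commits[index + 1:]:
        parent = commit_id(parent, patch)  # replayed commit gets a new id
        kept.append((parent, patch))
    return kept
```

Commits before the dropped one keep their ids, which is why only references pointing at the dropped commit or its descendants need updating.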
Patrick Steinhardt
1624075b1a builtin/history: split handling of ref updates into two phases
The function `handle_reference_updates()` is used by git-history(1) to
update all references that refer to commits that have been rewritten. As
such, it performs two steps:

  - It gathers the references that need to be updated in the first
    place.

  - It prepares and commits the reference transaction.

In a subsequent commit we'll want to handle those two steps separately.
Prepare for this by splitting up the function into two.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:17:26 -07:00
Patrick Steinhardt
221202a6fe reset: introduce ability to skip updating HEAD
In a subsequent commit we'll introduce a new caller to
`reset_working_tree()` that really only wants to update the index and
working tree, without updating any references. Introduce a new flag that
makes the caller opt in to updating HEAD and adapt all callers to set
that flag.

Note that in a previous iteration we instead introduced a flag that made
callers opt out of updating any references. This was somewhat awkward
though because we already have the `UPDATE_ORIG_HEAD` flag, so the
result was somewhat inconsistent.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:17:26 -07:00
Patrick Steinhardt
566abc011f reset: modernize flags passed to reset_working_tree()
The flags passed to `reset_working_tree()` are declared as defines. This
has fallen a bit out of practice nowadays, where we instead prefer to
use enums. Furthermore, the prefix of those flags does not match the
function name anymore after the rename in the preceding commit.

Adapt the code to follow modern best practices and adapt the flag names.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:17:26 -07:00
Patrick Steinhardt
47215ec263 reset: rename reset_head()
In a subsequent commit we're about to adapt `reset_head()` so that the
reference update to HEAD is optional, only. At this point the function
starts to feel misnamed, as it doesn't necessarily have anything to do
with the HEAD reference anymore. The gist of the function then is that
we reset the working tree to a specific new commit, updating both the
index and the checked-out files.

Rename it to `reset_working_tree()` to better reflect that.

Note that we don't adjust the flags yet. This will happen in a
subsequent commit.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 10:17:26 -07:00
René Scharfe
800581cd44 cat-file: speed up default format
eb54a3391b (cat-file: skip expanding default format, 2022-03-15) added
special handling for the default batch format.  In the meantime it has
fallen behind the code path for handling arbitrary formats.  Bring it up
to speed by using the new and more efficient strbuf_add_oid_hex() and
strbuf_add_uint() instead of strbuf_addf():

Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):      1.051 s ±  0.003 s    [User: 1.027 s, System: 0.023 s]
  Range (min … max):    1.049 s …  1.058 s    10 runs

Benchmark 2: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):      1.012 s ±  0.002 s    [User: 0.988 s, System: 0.023 s]
  Range (min … max):    1.010 s …  1.018 s    10 runs

Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):     979.0 ms ±   1.1 ms    [User: 954.1 ms, System: 23.2 ms]
  Range (min … max):   977.7 ms … 980.8 ms    10 runs

Summary
  ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran
    1.03 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.07 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 08:10:27 -07:00
Johannes Schindelin
c6a4629e32 odb: use size_t for object_info.sizep and the size APIs
When `js/objects-larger-than-4gb-on-windows` widened the streaming,
index-pack and unpack-objects code paths, in the interest of keeping the
patches somewhat reasonably-sized, it left the public ODB API still
typed in `unsigned long`. In particular `struct object_info::sizep` and
the four wrappers built on top of it (`odb_read_object`,
`odb_read_object_peeled`, `odb_read_object_info`, `odb_pretend_object`)
still return the unpacked size through `unsigned long *`, so on Windows
`cat-file -s` and the `git add` / `git status` paths for a >4 GiB blob
silently cap at 4 GiB.

Widen the field and the four wrappers. The previous commits already
widened the `unpack_entry()` cascade and pack-objects' in-core size
accessors, so most of the cascade arrives here with no further work: the
temporary shims in `packed_object_info_with_index_pos()` and in
`unpack_entry()`'s delta-base recovery path go away, the two
`SET_SIZE(entry, cast_size_t_to_ulong(canonical_size))` calls in
`check_object()` and the matching one in `drop_reused_delta()` collapse
to plain `SET_SIZE`, and `oe_get_size_slow()`'s tail
`cast_size_t_to_ulong()` is gone too.

What remains narrow are the boundaries this series does not
intend to touch: the diff, blame, textconv and fast-import machinery.

Even so, this patch is unfortunately quite large.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:45:41 -07:00
Johannes Schindelin
188bac14f7 pack-objects: use size_t for in-core object sizes
`pack-objects` stores per-entry object sizes in either the 31-bit
`size_` member of the `struct object_entry` or, when the value does not
fit, the `pack->delta_size[]` spill array.  The accessors (`oe_size`,
`oe_delta_size`, `oe_get_size_slow`, `oe_size_*_than`) and the setters
(`oe_set_size`, `oe_set_delta_size`) used `unsigned long` for the spill
type, which on Windows means the spill silently caps at 4 GiB per entry.
That is what made `upload-pack` die with "object too large to read on
this platform" when serving the >4 GiB blob in `t5608` tests 5 and 6
when run with `GIT_TEST_CLONE_2GB`.

Widen them all to `size_t` (including `pack->delta_size`) and drop the
three `cast_size_t_to_ulong()` calls in `check_object()` that guarded
`in_pack_size`.  The two `SET_SIZE(entry, canonical_size)` calls in the
same function stay cast-free as before, since `canonical_size` is still
`unsigned long` until a later commit widens `object_info::sizep`.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:45:41 -07:00
Johannes Schindelin
2d83cc3f84 packfile: widen unpack_entry()'s size out-parameter to size_t
The topic `js/objects-larger-than-4gb-on-windows` widened the streaming,
index-pack and unpack-objects paths to `size_t` but deliberately stopped
at the in-memory `unpack_entry()` cascade, which still hands back the
unpacked size through `unsigned long *`.  On Windows that boundary
truncates above 4 GiB because that data type is only 32 bits wide on
that platform.

Widen the code path, with two exceptions.
`packed_object_info_with_index_pos()` cannot yet pass `oi->sizep`
directly because the field is still `unsigned long *`; bridge it with a
`size_t` temporary that narrows back, and let a later commit drop the
bridge once the field is wide too. `gfi_unpack_entry()` keeps its narrow
signature because fast-import tracks sizes through `unsigned long`
everywhere it crosses subsystem boundaries; leaving it alone keeps the
scope of this commit reasonable.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:45:40 -07:00
Johannes Schindelin
1d43315b31 pack-objects(check_pack_inflate()): use size_t instead of unsigned long
`write_reuse_object()` learned to track its packed-object size as
`size_t` in 606c192380 (odb, packfile: use size_t for streaming
object sizes, 2026-05-08), but the comparison sink it feeds,
`check_pack_inflate()`, still takes the expected decompressed size
as `unsigned long`. The call site bridges the mismatch with
`cast_size_t_to_ulong()`, which on Windows turns a >4 GiB object
into an immediate die().

That function only uses `expect` once: as the right-hand side of a
`stream.total_out == expect` equality test against zlib's counter.
zlib's own `total_out` counter is `uLong` and is therefore still
32-bit-bound on Windows. Widening `expect` to `size_t` cannot fix that,
but it is a strict improvement nonetheless: instead of dying outright,
an oversized object now simply makes the equality fail and lets
`write_reuse_object()` fall back to `write_no_reuse_object()`, which
decompresses and re-deflates the content (and which the larger
pack-objects widening series targets separately).

Drop the `cast_size_t_to_ulong()` shim at the call site now that
the receiving parameter speaks the same type as `entry_size`.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:45:40 -07:00
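
The behavioral difference described in the check_pack_inflate() message above can be modeled in a few lines of Python. The model makes assumptions that are not taken from the Git source: integers masked to 32 bits stand in for a 32-bit `unsigned long`, the zlib counter is assumed to wrap at 32 bits, `can_reuse` is a hypothetical name for the equality test, and the error text is borrowed from the message quoted elsewhere in this listing.

```python
ULONG_MAX_LLP64 = 2**32 - 1  # unsigned long is 32 bits on 64-bit Windows


def cast_size_t_to_ulong(size):
    """Model of the old shim: refuse values that do not fit (a die() in C)."""
    if size > ULONG_MAX_LLP64:
        raise OverflowError("object too large to read on this platform")
    return size


def can_reuse(expect, inflated_bytes):
    """Model of the widened comparison against a 32-bit inflate counter.

    Assumption: the counter wraps at 32 bits, so for an oversized object
    it can never equal the full-width `expect`.
    """
    total_out = inflated_bytes & ULONG_MAX_LLP64
    return total_out == expect
```

With the shim, a 5 GiB object raises before any comparison happens; with the widened parameter the comparison simply returns False, which corresponds to falling back to the non-reuse path.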
Johannes Schindelin
33afe87338 patch-delta: use size_t for sizes
`patch_delta()` takes the source and delta sizes by value and writes
back the reconstructed target size through an `unsigned long *`.  That
datatype cannot represent a value that exceeds 4 GiB on systems where
`unsigned long` is 32-bit (notably 64-bit Windows builds), even though
the delta encoding itself, the on-disk layout, and the in-memory
buffers happily carry such sizes. A `size_t` companion to
`get_delta_hdr_size()`, `get_delta_hdr_size_sz()`, was introduced in
17fa077596 (delta, packfile: use size_t for delta header sizes,
2026-05-08) precisely so that `patch_delta()` could be widened without
changing the on-the-wire decoding helper's signature.

Widen `patch_delta()`'s three size parameters to `size_t` and switch
its internal use of `get_delta_hdr_size()` to the `_sz` variant.
Then propagate the wider type through the callers.

Assisted-by: Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:45:40 -07:00
Junio C Hamano
ff1784217f Merge branch 'ak/typofixes'
Typofixes.

* ak/typofixes:
  doc: fix typos via codespell
2026-06-15 07:42:00 -07:00
Junio C Hamano
883a47ef64 Merge branch 'ob/more-repo-config-values'
Many core configuration variables have been migrated from global
variables into 'repo_config_values' to tie them to a specific
repository instance, avoiding cross-repository state leakage.

* ob/more-repo-config-values:
  environment: move "warn_on_object_refname_ambiguity" into `struct repo_config_values`
  environment: move "sparse_expect_files_outside_of_patterns" into `struct repo_config_values`
  environment: move "core_sparse_checkout_cone" into `struct repo_config_values`
  environment: move "precomposed_unicode" into `struct repo_config_values`
  environment: move "pack_compression_level" into `struct repo_config_values`
  environment: move `zlib_compression_level` into `struct repo_config_values`
  environment: move "check_stat" into `struct repo_config_values`
  environment: move "trust_ctime" into `struct repo_config_values`
2026-06-15 07:42:00 -07:00
Junio C Hamano
cfe6682042 Merge branch 'hn/config-typo-advice'
"git config foo.bar=baz" is not likely to be a request to read the
value of such a variable with '=' in its name; rather it is plausible
that the user meant "git config set foo.bar baz".  Give advice when
giving an error message.

* hn/config-typo-advice:
  config: improve diagnostic for "set" with missing value
  config: add git_config_key_is_valid() for quiet validation
2026-06-15 07:41:59 -07:00
K Jayatheerth
7e3a92cc09 repo: add path.gitdir with absolute and relative suffix formatting
Scripts need a stable way to locate the git directory without
parsing rev-parse output or relying on its flag-driven path format
selection. There is no way to retrieve this path from git repo info
today.

Introduce path.gitdir.absolute and path.gitdir.relative keys,
consistent with the path.commondir keys added in the previous patch.
Reuse the test_repo_info_path helper introduced there to validate
both variants.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:32:23 -07:00
K Jayatheerth
eea209169a repo: add path.commondir with absolute and relative suffix formatting
Scripts working with worktree setups need a reliable way to discover
the common directory, which diverges from the git directory when
multiple worktrees are in use. There is no way to retrieve this path
from git repo info today.

Introduce path.commondir.absolute and path.commondir.relative keys.
Exposing explicit format variants rather than a single key with a
default avoids ambiguity for scripts that require predictable output.

Add a test helper test_repo_info_path that creates isolated
repositories per test case to prevent state leaks, captures the repo
root before changing directories to avoid eval, and accepts an optional
init_command to cover environment variable overrides such as
GIT_COMMON_DIR and GIT_DIR.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:32:23 -07:00
K Jayatheerth
53ff8ed0e4 rev-parse: use append_formatted_path() for path formatting
Now that path formatting logic lives in a shared helper, keeping a
duplicate implementation in rev-parse is unnecessary and risks the
two diverging over time.

Replace the local format_type and default_type enums and the
hand-rolled formatting logic with a call to append_formatted_path().
Introduce PATH_FORMAT_DEFAULT as the initial value of arg_path_format
so that per-path fallback behavior is resolved in print_path() rather
than leaked into the shared helper.

Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:32:23 -07:00
Michael Montalbo
ac69d22b32 diff: bypass diff process with --no-ext-diff and in format-patch
Make --no-ext-diff disable diff.<driver>.process in addition to
diff.<driver>.command.  Although the two mechanisms work differently
(command replaces Git's output, process feeds hunks back into the
pipeline), both invoke external tools and --no-ext-diff means
"no external tools."

Replace the OPT_BOOL for --ext-diff with an OPT_CALLBACK that
sets both allow_external and no_diff_process, so a single option
controls both.  Passing --ext-diff explicitly clears
no_diff_process, so a later --ext-diff overrides an earlier
--no-ext-diff.
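
The "last occurrence wins" behavior can be sketched as below. This is a
simplified model, not Git's parse-options callback: the struct
`ext_diff_flags` and the function `apply_ext_diff()` are hypothetical
names, and only the two fields mentioned above are modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two settings the option controls. */
struct ext_diff_flags {
	bool allow_external;  /* diff.<driver>.command may run */
	bool no_diff_process; /* diff.<driver>.process is bypassed */
};

/*
 * Apply one occurrence of the option. `unset` is true for
 * --no-ext-diff and false for --ext-diff. Both fields are written
 * on every call, so whichever occurrence comes last decides the
 * final state.
 */
void apply_ext_diff(struct ext_diff_flags *f, bool unset)
{
	f->allow_external = !unset;
	f->no_diff_process = unset;
}
```

Calling `apply_ext_diff()` once per occurrence, in command-line order,
leaves the flags in the state requested by the final occurrence.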

Disable the diff process unconditionally in format-patch so that
generated patches are always based on the builtin diff algorithm
and can be applied reliably by recipients who do not have the
external tool.

Document that --diff-algorithm also bypasses the diff process,
since it forces the builtin algorithm.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-15 07:31:09 -07:00
Taylor Blau
8e519b8756 midx: pass custom '--base' through incremental writes
The 'multi-pack-index' builtin parses '--base' for incremental writes,
but the normal write path does not pass that value through to
`write_midx_file()`.

As a result, something like:

    $ git multi-pack-index write --incremental --base=<base>

behaves as if no custom base had been given (unless the caller used the
'--stdin-packs' path).

Thread the parsed base through `write_midx_file()`, and update the
repack caller to pass NULL for the new argument where no custom base
selection is needed.

This exposes a pre-existing problem in incremental writes with custom
bases: the writer skips packs from the full existing MIDX chain, even
when the caller selected an older base or no base at all.

The affected t5334 cases fail while trying to write MIDX bitmaps. The
detached layer omits packs above the selected base, and thus the
resulting MIDX does not have a reachability closure, making it
impossible to generate reachability bitmaps.
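
The closure requirement can be read as a rule about which packs a new
layer may leave out. The sketch below is an illustration under assumed
conventions, not the midx-write code: layers are numbered from 0
(oldest) upward, -1 means "no base" or "pack not in any layer", and
`pack_covered_by_base()` is a hypothetical name.

```c
#include <stdbool.h>

/*
 * Hypothetical rule: a pack may be left out of the new layer only
 * if it already belongs to the selected base or to a layer below
 * it. Packs in layers above the base, and packs in no layer at all,
 * must be written so that the resulting chain stays closed under
 * reachability.
 */
bool pack_covered_by_base(int pack_layer, int base_layer)
{
	return pack_layer >= 0 && pack_layer <= base_layer;
}
```

With no base selected (`base_layer` of -1) the function returns false
for every pack, so nothing is skipped.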

Mark those tests as expected failures accordingly. The following commit
will fix the broken behavior and restore these tests.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-12 13:39:36 -07:00
Tamir Duberstein
3f5203eeb4 ls-files: filter pathspec before lstat
In --deleted and --modified modes, show_files() calls lstat() for each
index entry before show_ce() applies the pathspec. prune_index() avoids
most of these calls for pathspecs with a common directory prefix, but
not for a top-level name or leading wildcard.

Match before lstat() to avoid accessing the worktree for entries that
cannot be shown. Treat this as a prefilter: do not update ps_matched,
and retain the match in show_ce() so --error-unmatch is satisfied only
by entries that the selected modes actually show.

Prefilter only a single pathspec item, bounding the added work for each
index entry. Applying match_pathspec() to multiple arguments can cost
more than the lstat() calls it avoids. In a synthetic repository with
10,000 clean files, passing every path to ls-files --modified increased
runtime from 112.5 ms to 494.1 ms when the prefilter was unconditional.
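
The effect of matching before lstat() can be sketched as below. This is
a toy model, not the ls-files code: `scan_entries()` and
`struct scan_stats` are hypothetical, an exact strcmp() stands in for
pathspec matching, a counter stands in for the lstat() call, and every
entry is assumed to qualify for display once it has been examined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counters for one pass over the index entries. */
struct scan_stats {
	size_t stat_calls; /* how often the worktree would be accessed */
	size_t shown;      /* how many entries would be printed */
};

/*
 * Walk `n` paths with a single pathspec item. When `prefilter` is
 * nonzero, entries that cannot match are skipped before the
 * (simulated) lstat(). The match is repeated at display time, as
 * the message above describes for show_ce().
 */
void scan_entries(const char **paths, size_t n, const char *pathspec,
		  int prefilter, struct scan_stats *st)
{
	size_t i;

	st->stat_calls = 0;
	st->shown = 0;
	for (i = 0; i < n; i++) {
		int match = !strcmp(paths[i], pathspec);

		if (prefilter && !match)
			continue; /* never shown, so never stat it */
		st->stat_calls++; /* stand-in for lstat(paths[i]) */
		if (match)
			st->shown++;
	}
}
```

The set of shown entries is the same either way; only the number of
simulated lstat() calls differs.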

With $parent and $this exported as paths to binaries built from the
parent and this commit, on a repository with 881,290 index entries:

    hyperfine --warmup 0 --runs 3 \
        --command-name parent \
        '$parent -c core.fsmonitor=false ls-files --deleted -- README.md >/dev/null' \
        --command-name this-commit \
        '$this -c core.fsmonitor=false ls-files --deleted -- README.md >/dev/null'

reported means of 65.790 seconds for the parent and 4.987 seconds for
this commit.

Link: https://lore.kernel.org/r/xmqqfr2tnfk0.fsf@gitster.g
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-12 12:47:21 -07:00
Patrick Steinhardt
1ceee7431b treewide: drop USE_THE_REPOSITORY_VARIABLE
Adapt a couple of trivial callers of `is_bare_repository()` to instead
use a repository available via the caller's context so that we can drop
the `USE_THE_REPOSITORY_VARIABLE` macro.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-11 05:05:54 -07:00