Commit Graph

13018 Commits

Author SHA1 Message Date
Justin Tobler
eb5cf58ffc builtin/repo: add object counts in structure output
The amount of objects in a repository can provide insight regarding its
shape. To surface this information, use the path-walk API to count the
number of reachable objects in the repository by object type. All
regular references are used to determine the reachable set of objects.
The object counts are appended to the same table containing the
reference information.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21 14:40:38 -07:00
Justin Tobler
bbb2b93348 builtin/repo: introduce structure subcommand
The structure of a repository's history can have huge impacts on the
performance and health of the repository itself. Currently, Git lacks a
means to surface repository metrics regarding its structure/shape via a
single command. Acquiring this information requires users to be familiar
with the relevant data points and the various Git commands required to
surface them. To fill this gap, supplemental tools such as git-sizer(1)
have been developed.

To allow users to more readily identify repository structure related
information, introduce the "structure" subcommand in git-repo(1). The
goal of this subcommand is to eventually provide similar functionality
to git-sizer(1), but natively in Git.

The initial version of this command only iterates through all references
in the repository and tracks the count of branches, tags, remote refs,
and other reference types. The corresponding information is displayed in
a human-friendly table formatted in a very similar manner to
git-sizer(1). The width of each table column is adjusted automatically
to satisfy the requirements of the widest row contained.

Subsequent commits will surface additional relevant data points to
output and also provide other more machine-friendly output formats.

Based-on-patch-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21 14:40:37 -07:00
Justin Tobler
026ad60160 builtin/repo: rename repo_info() to cmd_repo_info()
Subcommand functions are often prefixed with `cmd_` to denote that they
are an entrypoint. Rename repo_info() to cmd_repo_info() accordingly.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-21 14:40:37 -07:00
Junio C Hamano
7c15d990cc Merge branch 'rs/get-oid-with-flags-cleanup'
Code clean-up.

* rs/get-oid-with-flags-cleanup:
  use repo_get_oid_with_flags()
2025-09-23 11:53:40 -07:00
Junio C Hamano
2e8d7569ea Merge branch 'jk/add-i-color'
Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.

* jk/add-i-color:
  contrib/diff-highlight: mention interactive.diffFilter
  add-interactive: manually fall back color config to color.ui
  add-interactive: respect color.diff for diff coloring
  stash: pass --no-color to diff plumbing child processes
2025-09-23 11:53:40 -07:00
Junio C Hamano
7b776bc308 Merge branch 'pc/range-diff-memory-limit'
"git range-diff" learned a way to limit the memory consumed by
O(N*N) cost matrix.

* pc/range-diff-memory-limit:
  range-diff: add configurable memory limit for cost matrix
2025-09-18 10:07:02 -07:00
Junio C Hamano
1fbfabfa71 Merge branch 'pw/3.0-commentchar-auto-deprecation'
"core.commentChar=auto" that attempts to dynamically pick a
suitable comment character is non-workable, as it is too much
trouble to support for little benefit, and is marked as deprecated.

* pw/3.0-commentchar-auto-deprecation:
  commit: print advice when core.commentString=auto
  config: warn on core.commentString=auto
  breaking-changes: deprecate support for core.commentString=auto
2025-09-18 10:07:00 -07:00
Junio C Hamano
13d1e86888 Merge branch 'lo/repo-info-step-2'
"repo info" learns a short-hand option "-z" that is the same as
"--format=nul", and learns to report the objects format used in the
repository.

* lo/repo-info-step-2:
  repo: add the field objects.format
  repo: add the flag -z as an alias for --format=nul
2025-09-15 08:52:05 -07:00
Junio C Hamano
7d00521d7b Merge branch 'jt/de-global-bulk-checkin'
The bulk-checkin code used to depend on a file-scope static
singleton variable, which has been updated to pass an instance
throughout the callchain.

* jt/de-global-bulk-checkin:
  bulk-checkin: use repository variable from transaction
  bulk-checkin: require transaction for index_blob_bulk_checkin()
  bulk-checkin: remove global transaction state
  bulk-checkin: introduce object database transaction structure
2025-09-15 08:52:05 -07:00
Junio C Hamano
da3799a67b Merge branch 'rs/describe-with-lazy-queue-and-oidset'
Instead of scanning for the remaining items to see if there are
still commits to be explored in the queue, use khash to remember
which items are still on the queue (an unacceptable alternative is
to reserve one object flag bits).

* rs/describe-with-lazy-queue-and-oidset:
  describe: use oidset in finish_depth_computation()
2025-09-12 10:41:21 -07:00
Junio C Hamano
07f29476de Merge branch 'ms/refs-exists'
"git refs exists" that works like "git show-ref --exists" has been
added.

* ms/refs-exists:
  t: add test for git refs exists subcommand
  t1422: refactor tests to be shareable
  t1403: split 'show-ref --exists' tests into a separate file
  builtin/refs: add 'exists' subcommand
2025-09-12 10:41:19 -07:00
Junio C Hamano
c31a276f12 Merge branch 'ps/object-store-midx-dedup-info'
Further code clean-up for multi-pack-index code paths.

* ps/object-store-midx-dedup-info:
  midx: compute paths via their source
  midx: stop duplicating info redundant with its owning source
  midx: write multi-pack indices via their source
  midx: load multi-pack indices via their source
  midx: drop redundant `struct repository` parameter
  odb: simplify calling `link_alt_odb_entry()`
  odb: return newly created in-memory sources
  odb: consistently use "dir" to refer to alternate's directory
  odb: allow `odb_find_source()` to fail
  odb: store locality in object database sources
2025-09-12 10:41:18 -07:00
René Scharfe
a66fc22bf9 use repo_get_oid_with_flags()
get_oid_with_context() allows specifying flags and reports object
details via a passed-in struct object_context.  Some callers just want
to specify flags, but don't need any details back.  Convert them to
repo_get_oid_with_flags(), which provides just that and frees them from
dealing with the context structure.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-10 14:29:49 -07:00
Junio C Hamano
95a8428323 Merge branch 'tc/last-modified'
A new command "git last-modified" has been added to show the closest
ancestor commit that touched each path.

* tc/last-modified:
  last-modified: use Bloom filters when available
  t/perf: add last-modified perf script
  last-modified: new subcommand to show when files were last modified
2025-09-08 14:54:35 -07:00
Junio C Hamano
576e0b6eb3 Merge branch 'ds/ls-files-lazy-unsparse'
"git ls-files <pathspec>..." should not necessarily have to expand
the index fully if a sparsified directory is excluded by the
pathspec; the code is taught to expand the index on demand to avoid
this.

* ds/ls-files-lazy-unsparse:
  ls-files: conditionally leave index sparse
2025-09-08 14:54:35 -07:00
Jeff King
89b4183efe stash: pass --no-color to diff plumbing child processes
After a partial stash, we may clear out the working tree by capturing
the output of diff-tree and piping it into git-apply (and likewise we
may use diff-index to restore the index). So we most definitely do not
want color diff output from that diff-tree process.  And it normally
would not produce any, since its stdout is not going to a tty, and the
default value of color.ui is "auto".

However, if GIT_PAGER_IN_USE is set in the environment, that overrides
the tty check, and we'll produce a colorized diff that chokes git-apply:

  $ echo y | GIT_PAGER_IN_USE=1 git stash -p
  [...]
  Saved working directory and index state WIP on main: 4f2e2bb foo
  error: No valid patches in input (allow with "--allow-empty")
  Cannot remove worktree changes

Setting this variable is a relatively silly thing to do, and not
something most users would run into. But we sometimes do it in our tests
to stimulate color. And it is a user-visible bug, so let's fix it rather
than work around it in the tests.

The root issue here is that diff-tree (and other diff plumbing) should
probably not ever produce color by default. It does so not by parsing
color.ui, but because of the baked-in "auto" default from 4c7f1819b3
(make color.ui default to 'auto', 2013-06-10). But changing that is
risky; we've had discussions back and forth on the topic over the years.
E.g.:

  https://lore.kernel.org/git/86D0A377-8AFD-460D-A90E-6327C6934DFC@gmail.com/.

So let's accept that as the status quo for now and protect ourselves by
passing --no-color to the child processes. This is the same thing we did
for add-interactive itself in 1c6ffb546b (add--interactive.perl: specify
--no-color explicitly, 2020-09-07).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-08 14:00:32 -07:00
Lucas Seiki Oshiro
c2e3713334 repo: add the field objects.format
The flag `--show-object-format` from git-rev-parse is used for
retrieving the object storage format. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.

Add a new field `objects.format` to the git-repo-info subcommand
containing that information.

Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-04 11:36:40 -07:00
Lucas Seiki Oshiro
a92f5ca0d5 repo: add the flag -z as an alias for --format=nul
Other Git commands that have nul-terminated output (e.g. git-config,
git-status, git-ls-files) have a flag `-z` for using the null character
as the record separator.

Add the `-z` flag to git-repo-info as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.

Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-04 11:36:39 -07:00
René Scharfe
30598ccc4d describe: use oidset in finish_depth_computation()
Depth computation can end early if all remaining commits are flagged.
The current code determines if that's the case by checking all queue
items each time it dequeues a flagged commit.  This can cause
quadratic complexity.

We could simply count the flagged items in the queue and then update
that number as we add and remove items.  That would provide a general
speedup, but leave one case where we have to scan the whole queue: When
we flag a previously seen, but unflagged commit.  It could be on the
queue and then we'd have to decrease our count.

We could dedicate an object flag to track queue membership, but that
would leave less for candidate tags, affecting the results.  So use a
hash table, specifically an oidset of commit hashes, to track that.
This avoids quadratic behaviour in all cases and provides a nice
performance boost over the previous commit, 08bb69d70f (describe: use
prio_queue_replace(), 2025-08-03):

Benchmark 1: ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     855.3 ms ±   1.3 ms    [User: 790.8 ms, System: 49.9 ms]
  Range (min … max):   853.7 ms … 857.8 ms    10 runs

Benchmark 2: ./git describe $(git rev-list v2.41.0..v2.47.0)
  Time (mean ± σ):     610.8 ms ±   1.7 ms    [User: 546.9 ms, System: 49.3 ms]
  Range (min … max):   608.9 ms … 613.3 ms    10 runs

Summary
  ./git describe $(git rev-list v2.41.0..v2.47.0) ran
    1.40 ± 0.00 times faster than ./git_08bb69d70f describe $(git rev-list v2.41.0..v2.47.0)

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-02 15:15:13 -07:00
Meet Soni
0f0a8a11c0 builtin/refs: add 'exists' subcommand
As part of the ongoing effort to consolidate reference handling,
introduce a new `exists` subcommand. This command provides the same
functionality and exit-code behavior as `git show-ref --exists`, serving
as its modern replacement.

The logic for `show-ref --exists` is minimal. Rather than creating a
shared helper function which would be overkill for ~20 lines of code,
its implementation is intentionally duplicated here. This contrasts with
`git refs list`, where sharing the larger implementation of
`for-each-ref` was necessary.

Documentation for the new subcommand is also added to the `git-refs(1)`
man page.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-02 09:58:35 -07:00
Paulo Casaretto
00727249ec range-diff: add configurable memory limit for cost matrix
When comparing large commit ranges (e.g., 250,000+ commits), range-diff
attempts to allocate an n×n cost matrix that can exhaust available
memory. For example, with 256,784 commits (n = 513,568), the matrix
would require approximately 256GB of memory (513,568² × 4 bytes),
causing either immediate segmentation faults due to integer overflow or
system hangs.

Add a memory limit check in get_correspondences() before allocating the
cost matrix. This check uses the total size in bytes (n² × sizeof(int))
and compares it against a configurable maximum, preventing both
excessive memory usage and integer overflow issues.

The limit is configurable via a new --max-memory option that accepts
human-readable sizes (e.g., "1G", "500M"). The default is 4GB for 64 bit
systems and 2GB for 32 bit systems. This allows comparing ranges of
approximately 32,000 (16,000) commits - generous for real-world use cases
while preventing impractical operations.

When the limit is exceeded, range-diff now displays a clear error
message showing both the requested memory size and the maximum allowed,
formatted in human-readable units for better user experience.

Example usage:
  git range-diff --max-memory=1G branch1...branch2
  git range-diff --max-memory=500M base..topic1 base..topic2

This approach was chosen over alternatives:
- Pre-counting commits: Would require spawning additional git processes
  and reading all commits twice
- Limiting by commit count: Less precise than actual memory usage
- Streaming approach: Would require significant refactoring of the
  current algorithm

This issue was previously discussed in:
https://lore.kernel.org/git/RFC-cover-v2-0.5-00000000000-20211210T122901Z-avarab@gmail.com/

Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-29 09:46:07 -07:00
Junio C Hamano
c64ec662d0 Merge branch 'jk/describe-blob'
"git describe <blob>" misbehaves and/or crashes in some corner
cases, which has been taught to exit with failure gracefully.

* jk/describe-blob:
  describe: pass commit to describe_commit()
  describe: handle blob traversal with no commits
  describe: catch unborn branch in describe_blob()
  describe: error if blob not found
  describe: pass oid struct by const pointer
2025-08-29 09:44:38 -07:00
Toon Claes
8d9a7cdfda last-modified: use Bloom filters when available
Our 'git last-modified' performs a revision walk, and computes a diff at
each point in the walk to figure out whether a given revision changed
any of the paths it considers interesting.

When changed-path Bloom filters are available, we can avoid computing
many such diffs. Before computing a diff, we first check if any of the
remaining paths of interest were possibly changed at a given commit by
consulting its Bloom filter. If any of them are, we are resigned to
compute the diff.

If none of those queries returned "maybe", we know that the given commit
doesn't contain any changed paths which are interesting to us. So, we
can avoid computing it in this case.

Comparing the perf test results on git.git:

    Test                                        HEAD~             HEAD
    ------------------------------------------------------------------------------------
    8020.1: top-level last-modified             4.49(4.34+0.11)   2.22(2.05+0.09) -50.6%
    8020.2: top-level recursive last-modified   5.64(5.45+0.11)   5.62(5.30+0.11) -0.4%
    8020.3: subdir last-modified                0.11(0.06+0.04)   0.07(0.03+0.04) -36.4%

Based-on-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 16:44:58 -07:00
Toon Claes
32f74582bc last-modified: new subcommand to show when files were last modified
Similar to git-blame(1), introduce a new subcommand
git-last-modified(1). This command shows the most recent modification to
paths in a tree. It does so by expanding the tree at a given commit,
taking note of the current state of each path, and then walking
backwards through history looking for commits where each path changed
into its final commit ID.

Based-on-patch-by: Jeff King <peff@peff.net>
Improved-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 16:44:58 -07:00
Derrick Stolee
681f26bccc ls-files: conditionally leave index sparse
When running 'git ls-files' with a pathspec, the index entries get
filtered according to that pathspec before iterating over them in
show_files().  In 78087097b8 (ls-files: add --sparse option,
2021-12-22), this iteration was prefixed with a check for the '--sparse'
option which allows the command to output directory entries; this
created a pre-loop call to ensure_full_index().

However, when a user runs 'git ls-files' where the pathspec matches
directories that are recursively matched in the sparse-checkout, there
are not any sparse directories that match the pathspec so they would not
be written to the output. The expansion in this case is just a
performance drop for no behavior difference.

Replace this global check to expand the index with a check inside the
loop for a matched sparse directory. If we see one, then expand the
index and continue from the current location. This is safe since the
previous entries in the index did not have any sparse directories and
thus would remain stable in this expansion.

A test in t1092 confirms that this changes the behavior.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28 08:02:37 -07:00
Phillip Wood
a0e6aaea7d config: warn on core.commentString=auto
As support for this setting was deprecated in the last commit print a
warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set.
Avoid bombarding the user with warnings by only printing it (a) when
running commands that call "git commit" and (b) only once per command.

Some scaffolding is added to repo_read_config() to allow it to
detect deprecated config settings and warn about them. As both
"core.commentChar" and "core.commentString" set the comment
character we record which one of them is used and tailor the
warning message appropriately.

Note the odd combination of die_message() followed by die(NULL)
is to allow the next commit to insert a call to advise() in the middle.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26 08:52:44 -07:00
Phillip Wood
fdae4114a6 breaking-changes: deprecate support for core.commentString=auto
When "core.commentString" is set to "auto" then "git commit" will
automatically select the comment character ensuring that it is not the
first character on any of the lines in the commit message. This was
introduced by commit 84c9dc2c5a (commit: allow core.commentChar=auto
for character auto selection, 2014-05-17). The motivation seems to be
to avoid commenting out lines from the existing message when amending
a commit that was created with a message from a file.

Unfortunately this feature does not work with:

 * commit message templates that contain comments.

 * prepare-commit-msg hooks that introduce comments.

 * "git commit --cleanup=strip --edit -F <file>" which means that it
   is incompatible with

   - the "fixup" and "squash" commands of "git rebase -i" as the
     comments added by those commands are then treated as part of
     the commit message.

   - the conflict comments added to the commit message by "git
     cherry-pick", "git rebase" etc. as these comments are then
     treated as part of the commit message.

It is also ignored by "git notes" when amending a note.

The issues with comments coming from a template, hook or file are a
consequence of the design of this feature and are therefore hard to
fix.

As the costs of this feature outweigh the benefits, deprecate it and
remove it in Git 3.0. If someone comes up with some patches that fix
all the issues in a maintainable way then I'd be happy to see this
change reverted.

The next commits will add a warning and some advice for users on how
they can update their config settings.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-26 08:47:37 -07:00
Junio C Hamano
ebb45da976 Merge branch 'lo/repo-info'
A new subcommand "git repo" gives users a way to grab various
repository characteristics.

* lo/repo-info:
  repo: add the --format flag
  repo: add the field layout.shallow
  repo: add the field layout.bare
  repo: add the field references.format
  repo: declare the repo command
2025-08-25 14:22:04 -07:00
Junio C Hamano
eed447dd95 Merge branch 'ps/commit-graph-wo-globals'
Remove dependency on the_repository and other globals from the
commit-graph code, and other changes unrelated to de-globaling.

* ps/commit-graph-wo-globals:
  commit-graph: stop passing in redundant repository
  commit-graph: stop using `the_repository`
  commit-graph: stop using `the_hash_algo`
  commit-graph: refactor `parse_commit_graph()` to take a repository
  commit-graph: store the hash algorithm instead of its length
  commit-graph: stop using `the_hash_algo` via macros
2025-08-25 14:22:03 -07:00
Junio C Hamano
a3c6459ab6 Merge branch 'dk/help-all'
"git cmd --help-all" now works outside repositories.

* dk/help-all:
  builtin: also setup gently for --help-all
  parse-options: refactor flags for usage_with_options_internal
2025-08-25 14:22:00 -07:00
Justin Tobler
b336144725 bulk-checkin: remove global transaction state
Object database transactions in the bulk-checkin subsystem rely on
global state to track transaction status. Stop relying on global state
and instead store the transaction in the `struct object_database`.
Functions that operate on transactions are updated to now wire
transaction state.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-25 09:48:13 -07:00
Junio C Hamano
9d6e319ec5 Merge branch 'ac/deglobal-fmt-merge-log-config'
Code clean-up.

* ac/deglobal-fmt-merge-log-config:
  builtin/fmt-merge-msg: stop depending on 'the_repository'
  environment: remove the global variable 'merge_log_config'
2025-08-22 13:13:21 -07:00
Junio C Hamano
72e4eb56f0 Merge branch 'jc/diff-no-index-in-subdir'
"git diff --no-index" run inside a subdirectory under control of a
Git repository operated at the top of the working tree and stripped
the prefix from the output, and oddballs like "-" (stdin) did not
work correctly because of it.  Correct the set-up by undoing what
the set-up sequence did to cwd and prefix.

* jc/diff-no-index-in-subdir:
  diff: --no-index should ignore the worktree
2025-08-22 13:13:21 -07:00
Junio C Hamano
c72d5bbf49 Merge branch 'ms/refs-list'
The "list" subcommand of "git refs" acts as a front-end for
"git for-each-ref".

* ms/refs-list:
  t: add test for git refs list subcommand
  t6300: refactor tests to be shareable
  builtin/refs: add list subcommand
  builtin/for-each-ref: factor out core logic into a helper
  builtin/for-each-ref: align usage string with the man page
  doc: factor out common option
2025-08-22 13:13:20 -07:00
Junio C Hamano
54fef16542 Merge branch 'jc/strbuf-split'
Arrays of strbuf is often a wrong data structure to use, and
strbuf_split*() family of functions that create them often have
better alternatives.

Update several code paths and replace strbuf_split*().

* jc/strbuf-split:
  trace2: do not use strbuf_split*()
  trace2: trim_trailing_newline followed by trim is a no-op
  sub-process: do not use strbuf_split*()
  environment: do not use strbuf_split*()
  config: do not use strbuf_split()
  notes: do not use strbuf_split*()
  merge-tree: do not use strbuf_split*()
  clean: do not use strbuf_split*() [part 2]
  clean: do not pass the whole structure when it is not necessary
  clean: do not use strbuf_split*() [part 1]
  clean: do not pass strbuf by value
  wt-status: avoid strbuf_split*()
2025-08-21 13:47:00 -07:00
Junio C Hamano
971ba42dd4 Merge branch 'jc/string-list-split'
string_list_split*() family of functions have been extended to
simplify common use cases.

* jc/string-list-split:
  string-list: split-then-remove-empty can be done while splitting
  string-list: optionally omit empty string pieces in string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally trim string pieces split by string_list_split*()
  string-list: unify string_list_split* functions
  string-list: align string_list_split() with its _in_place() counterpart
  string-list: report programming error with BUG
2025-08-21 13:46:59 -07:00
Junio C Hamano
5a404a70c7 Merge branch 'rs/describe-with-prio-queue'
"git describe" has been optimized by using better data structure.

* rs/describe-with-prio-queue:
  describe: use prio_queue_replace()
  describe: use prio_queue
2025-08-21 13:46:59 -07:00
Junio C Hamano
9a85fa8406 Merge branch 'ps/remote-rename-fix'
"git remote rename origin upstream" failed to move origin/HEAD to
upstream/HEAD when origin/HEAD is unborn and performed other
renames extremely inefficiently, which has been corrected.

* ps/remote-rename-fix:
  builtin/remote: only iterate through refs that are to be renamed
  builtin/remote: rework how remote refs get renamed
  builtin/remote: determine whether refs need renaming early on
  builtin/remote: fix sign comparison warnings
  refs: simplify logic when migrating reflog entries
  refs: pass refname when invoking reflog entry callback
2025-08-21 13:46:58 -07:00
Junio C Hamano
c3c8b6910a Merge branch 'ps/reflog-migrate-fixes'
"git refs migrate" to migrate the reflog entries from a refs
backend to another had a handful of bugs squashed.

* ps/reflog-migrate-fixes:
  refs: fix invalid old object IDs when migrating reflogs
  refs: stop unsetting REF_HAVE_OLD for log-only updates
  refs/files: detect race when generating reflog entry for HEAD
  refs: fix identity for migrated reflogs
  ident: fix type of string length parameter
  builtin/reflog: implement subcommand to write new entries
  refs: export `ref_transaction_update_reflog()`
  builtin/reflog: improve grouping of subcommands
  Documentation/git-reflog: convert to use synopsis type
2025-08-21 13:46:57 -07:00
Junio C Hamano
c8f660a7ca Merge branch 'lo/repo-info' into lo/repo-info-step-2
* lo/repo-info:
  repo: add the --format flag
  repo: add the field layout.shallow
  repo: add the field layout.bare
  repo: add the field references.format
  repo: declare the repo command
2025-08-20 17:18:35 -07:00
Jeff King
7c10e48e81 describe: pass commit to describe_commit()
There's a call in describe_commit() to lookup_commit_reference(), but we
don't check the return value. If it returns NULL, we'll segfault as we
immediately dereference the result.

In practice this can never happen, since all callers pass an oid which
came from a "struct commit" already. So we can make this more obvious
by just taking that commit struct in the first place.

Reported-by: Cheng <prophecheng@stu.pku.edu.cn>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 09:08:57 -07:00
Jeff King
8cfd4ac215 describe: handle blob traversal with no commits
When describing a blob, we traverse from HEAD, remembering each commit
we saw, and then checking each blob to report the containing commit.
But if we haven't seen any commits at all, we'll segfault (we store the
"current" commit as an oid initialized to the null oid, causing
lookup_commit_reference() to return NULL).

This shouldn't be able to happen normally. We always start our traversal
at HEAD, which must be a commit (a property which is enforced by the
refs code). But you can trigger the segfault like this:

  blob=$(echo foo | git hash-object -w --stdin)
  echo $blob >.git/HEAD
  git describe $blob

We can instead catch this case and return an empty result, which hits
the usual "we didn't find $blob while traversing HEAD" error.

This is a minor lie in that we did "find" the blob. And this even hints
at a bigger problem in this code: what if the traversal pointed to the
blob as _not_ part of a commit at all, but we had previously filled in
the recorded "current commit"? One could imagine this happening due to a
tag pointing directly to the blob in question.

But that can't happen, because we only traverse from HEAD, never from
any other refs. And the intent of the blob-describing code is to find
blobs within commits.

So I think this matches the original intent as closely as we can (and
again, this segfault cannot be triggered without corrupting your
repository!).

The test here does not use the formula above, which works only for the
files backend (and not reftables). Instead we use another loophole to
create the bogus state using only Git commands. See the comment in the
test for details.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-20 09:06:02 -07:00
Jeff King
c6478715a5 describe: catch unborn branch in describe_blob()
When describing a blob, we search for it by traversing from HEAD. We do
this by feeding the name HEAD to setup_revisions(). But if we are on an
unborn branch, this will fail with a confusing message:

  $ git describe $blob
  fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

It is OK for this to be an error (we cannot find $blob in an empty
traversal, so we'd eventually complain about that). But the error
message could be more helpful.

Let's resolve HEAD ourselves and pass the resolved object id to
setup_revisions(). If resolving fails, then we can print a more useful
message.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:43 -07:00
Jeff King
db2664b6f7 describe: error if blob not found
If describe_blob() does not find the blob in question, it returns an
empty strbuf, and we print an empty line. This differs from
describe_commit(), which always either returns an answer or calls die()
itself. As the blob function was bolted onto the command afterwards, I
think its behavior is not intentional, and it is just a bug that it does
not report an error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:43 -07:00
Jeff King
e715f77682 describe: pass oid struct by const pointer
We pass a "struct object_id" to describe_blob() by value. This isn't
wrong, as an oid is composed only of copy-able values. But it's unusual;
typically we pass structs by const pointer, including object_ids. Let's
do so.

It similarly makes sense for us to hold that pointer in the callback
data (rather than yet another copy of the oid).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-18 14:23:42 -07:00
Lucas Seiki Oshiro
a81224d128 repo: add the --format flag
Add the --format flag to git-repo-info. By using this flag, the users
can choose the format for obtaining the data they requested.

Given that this command can be used for generating input for other
applications and for being read by end users, it requires at least two
formats: one for being read by humans and other for being read by
machines. Some other Git commands also have two output formats, notably
git-config which was the inspiration for the two formats that were
chosen here:

- keyvalue, where the retrieved data is printed one per line, using =
  for delimiting the key and the value. This is the default format,
  targeted for end users.
- nul, where the retrieved data is separated by NUL characters, using
  the newline character for delimiting the key and the value. This
  format is targeted for being read by machines.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:41 -07:00
Lucas Seiki Oshiro
e52cd654c9 repo: add the field layout.shallow
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag `--is-shallow-repository` from git-rev-parse is used for
retrieving whether the repository is shallow. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.

Then, add a new field `layout.shallow` to the git-repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
acf2669b54 repo: add the field layout.bare
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag --is-bare-repository from git-rev-parse is used for retrieving
whether the current repository is bare. This way, it is used for
querying repository metadata, fitting in the purpose of git-repo-info.

Then, add a new field layout.bare to the git-repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
9adb8a7fd1 repo: add the field references.format
This commit is part of the series that introduces the new subcommand
git-repo-info.

The flag `--show-ref-format` from git-rev-parse is used for retrieving
the reference format (i.e. `files` or `reftable`). This way, it is
used for querying repository metadata, fitting in the purpose of
git-repo-info.

Add a new field `references.format` to the repo-info subcommand
containing that information.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:40 -07:00
Lucas Seiki Oshiro
ab94bb8000 repo: declare the repo command
Currently, `git rev-parse` covers a wide range of functionality not
directly related to parsing revisions, as its name suggests. Over time,
many features like parsing datestrings, options, paths, and others
were added to it because there wasn't a more appropriate command
to place them.

Create a new Git command called `repo`. `git repo` will be the main
command for obtaining the information about a repository (such as
metadata and metrics).

Also declare a subcommand for `repo` called `info`. `git repo info`
will bring the functionality of retrieving repository-related
information currently returned by `rev-parse`.

Add the required documentation and build changes to enable usage of
this subcommand.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Justin Tobler <jltobler@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-17 09:13:39 -07:00