The 'git survey' builtin provides several detail tables, such as "top
files by on-disk size". The size of these tables defaults to 10,
currently.
Allow the user to specify this number via a new --top=<N> option or the
new survey.top config key.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the moment, nothing is obvious about the reason for the use of the
path-walk API, but this will become more prevelant in future iterations. For
now, use the path-walk API to sum up the counts of each kind of object.
For example, this is the reachable object summary output for my local repo:
REACHABLE OBJECT SUMMARY
========================
Object Type | Count
------------+-------
Tags | 1343
Commits | 179344
Trees | 314350
Blobs | 184030
Signed-off-by: Derrick Stolee <stolee@gmail.com>
When 'git survey' provides information to the user, this will be presented
in one of two formats: plaintext and JSON. The JSON implementation will be
delayed until the functionality is complete for the plaintext format.
The most important parts of the plaintext format are headers specifying the
different sections of the report and tables providing concreted data.
Create a custom table data structure that allows specifying a list of
strings for the row values. When printing the table, check each column for
the maximum width so we can create a table of the correct size from the
start.
The table structure is designed to be flexible to the different kinds of
output that will be implemented in future changes.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
By default we will scan all references in "refs/heads/", "refs/tags/"
and "refs/remotes/".
Add command line opts let the use ask for all refs or a subset of them
and to include a detached HEAD.
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Start work on a new 'git survey' command to scan the repository
for monorepo performance and scaling problems. The goal is to
measure the various known "dimensions of scale" and serve as a
foundation for adding additional measurements as we learn more
about Git monorepo scaling problems.
The initial goal is to complement the scanning and analysis performed
by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool.
It is hoped that by creating a builtin command, we may be able to take
advantage of internal Git data structures and code that is not
accessible from GO to gain further insight into potential scaling
problems.
Co-authored-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
The final step, split from earlier attempt by Dscho, to loosen the
sideband restriction for now and tighten later at Git v3.0 boundary.
Will retract.
(this branch uses jc/neuter-sideband-fixup.)
* jc/neuter-sideband-post-3.0:
sideband: delay sanitizing by default to Git v3.0
The contributor guide has been updated to advise new contributors to
trim irrelevant quoted text when replying to review comments, matching
the existing advice given to reviewers.
* wy/doc-myfirstcontribution-trim-quotes:
MyFirstContribution: mention trimming quoted text in replies
The trailer sections in SubmittingPatches have been updated to
encourage use of standard trailers.
* kh/submittingpatches-trailers:
SubmittingPatches: note that trailer order matters
SubmittingPatches: be consistent with trailer markup
SubmittingPatches: document Based-on-patch-by trailer
SubmittingPatches: discourage common Linux trailers
SubmittingPatches: discuss non-ident trailers
SubmittingPatches: encourage trailer use for substantial help
git replay learns --linearize option to drop merge commits and
linearize the replayed history, mimicking git rebase
--no-rebase-merges.
* tc/replay-linearize:
replay: offer an option to linearize the commit topology
replay: add helper to put entry into mapped_commits
replay: refactor enum replay_mode into a bool
"git branch" command learned "--prune-merged" option to remove
local branches that have already been merged to the remote-tracking
branches they track.
* hn/branch-prune-merged:
branch: add --dry-run for --prune-merged
branch: add branch.<name>.pruneMerged opt-out
branch: add --prune-merged <branch>
branch: prepare delete_branches for a bulk caller
branch: let delete_branches warn instead of error on bulk refusal
branch: add --forked filter for --list mode
The `remote-object-info` command has been added to `git cat-file
--batch-command`, allowing clients to request object metadata
(currently size) from a remote server via protocol v2 without
downloading the entire object.
The client dynamically filters format placeholders based on
server-advertised capabilities and safely returns empty strings for
inapplicable or unsupported fields.
* ps/cat-file-remote-object-info:
cat-file: make remote-object-info allow-list dynamic
cat-file: validate remote atoms with allow_list
cat-file: add remote-object-info to batch-command
transport: add client support for object-info
serve: advertise object-info feature
fetch-pack: move fetch initialization
connect: refactor packet writing
fetch-pack: move function to connect.c
t1006: split test utility functions into new "lib-cat-file.sh"
cat-file: add declaration of variable i inside its for loop
git-compat-util: add strtoul_ul() with error handling
transport-helper: fix memory leak of helper on disconnect
Two new mechanisms, the GIT_CONFIG_INCLUDES environment variable and
the top-level --no-includes command-line option, have been introduced
to ignore configuration include directives.
* ds/config-no-includes:
git: add --no-includes top-level option
config: add GIT_CONFIG_INCLUDES
git-config.adoc: fix paragraph break
The experimental "git history" command has been taught a new "drop"
subcommand to remove a commit and replay its descendants onto its
parent.
* ps/history-drop:
builtin/history: implement "drop" subcommand
builtin/history: split handling of ref updates into two phases
reset: stop assuming that the caller passes in a clean index
reset: allow the caller to specify the current HEAD object
reset: introduce ability to skip updating HEAD
reset: introduce dry-run mode
reset: modernize flags passed to `reset_working_tree()`
reset: rename `reset_head()`
reset: drop `USE_THE_REPOSITORY_VARIABLE`
read-cache: split out function to drop unmerged entries to stage 0
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".
* jk/repo-info-path-keys:
repo: add path.gitdir with absolute and relative suffix formatting
repo: add path.commondir with absolute and relative suffix formatting
rev-parse: use append_formatted_path() for path formatting
path: introduce append_formatted_path() for shared path formatting
The pack-objects command now supports using reachability bitmaps and
delta-islands concurrently with the `--path-walk` option, allowing
faster packaging by falling back to path-walk when bitmaps cannot
fully satisfy the request.
* tb/pack-path-walk-bitmap-delta-islands:
pack-objects: support `--delta-islands` with `--path-walk`
pack-objects: extract `record_tree_depth()` helper
pack-objects: support reachability bitmaps with `--path-walk`
t/perf: drop p5311's lookup-table permutation
The -m/-F/-c/-C options to supply commit log message from outside the
editor are now supported for all "git commit --fixup" variations.
* ec/commit-fixup-options:
commit: allow -c/-C for all kinds of --fixup
commit: allow -m/-F for all kinds of --fixup
The [includeIf "condition"] conditional inclusion facility for
configuration files has learned to use the location of worktree
in its condition.
* cl/conditional-config-on-worktree-path:
config: add "worktree" and "worktree/i" includeIf conditions
config: refactor include_by_gitdir() into include_by_path()
"git checkout --track=..." learned to optionally fetch the branch
from the remote the new branch will work with.
* hn/checkout-track-fetch:
checkout: extend --track with a "fetch" mode to refresh start-point
branch: expose helpers for finding the remote owning a tracking ref
Configuration file locking now retries for a short period, avoiding
failures when multiple processes attempt to update the configuration
simultaneously.
* jt/config-lock-timeout:
config: retry acquiring config.lock, configurable via core.configLockTimeout
When fetching objects into a lazily cloned repository, .promisor
files are created with information meant to help debugging. "git
repack" has been taught to carry this information forward to
packfiles that are newly created.
Retracted.
cf. <agx_GPfBKpkSc3Gx@lorenzo-VM>
* lp/repack-propagate-promisor-debugging-info:
repack-promisor: add missing headers
t7703: test for promisor file content after geometric repack
t7700: test for promisor file content after repack
repack-promisor: preserve content of promisor files after repack
repack-promisor add helper to fill promisor file after repack
pack-write: add explanation to promisor file content
Documentation updates.
* kh/doc-trailers:
doc: interpret-trailers: document comment line treatment
doc: interpret-trailers: commit to “trailer block” term
doc: interpret-trailers: join new-trailers again
doc: interpret-trailers: add key format example
doc: interpret-trailers: explain key format
doc: interpret-trailers: explain the format after the intro
doc: interpret-trailers: not just for commit messages
doc: interpret-trailers: use “metadata” in Name as well
doc: interpret-trailers: replace “lines” with “metadata”
doc: interpret-trailers: stop fixating on RFC 822
Doc update for "git replay" to actually refer to its configuration
variables.
* kh/doc-replay-config:
doc: replay: move “default” to the right-hand side
doc: replay: use a nested description list
doc: replay: improve config description
doc: link to config for git-replay(1)
Various AsciiDoc markup fixes in 'git config' documentation and
related files to ensure lists and formatting are rendered correctly.
* ta/doc-config-adoc-fixes:
doc: git-config: escape erroneous highlight markup
doc: config/sideband: fix description list delimiter
doc: config: terminate runaway lists
Project-specific configuration for b4 has been introduced, and the
documentation has been updated to recommend using it as a
streamlined method for submitting patches.
* ps/doc-recommend-b4:
b4: introduce configuration for the Git project
MyFirstContribution: recommend the use of b4
MyFirstContribution: recommend shallow threading of cover letters
The handling of promisor-remote protocol capability has been
loosened to allow the other side to add to the list of promisor
remotes via the promisor.acceptFromServerURL configuration
variable.
* cc/promisor-auto-config-url-more:
doc: promisor: improve acceptFromServer entry
promisor-remote: auto-configure unknown remotes
promisor-remote: trust known remotes matching acceptFromServerUrl
promisor-remote: introduce promisor.acceptFromServerUrl
promisor-remote: add 'local_name' to 'struct promisor_info'
urlmatch: add url_normalize_pattern() helper
urlmatch: change 'allow_globs' arg to bool
t5710: simplify 'mkdir X' followed by 'git -C X init'
Various typos, grammatical errors, and duplicated words in both
documentation and code comments have been corrected.
* wy/docs-typofixes:
docs: fix typos and grammar
Wording used in "format-patch --subject-prefix" documentation
has been improved.
* lo/doc-format-patch-subject-prefix:
Documentation: remove redundant 'instead' in --subject-prefix
"git rev-list" (and "git log" family of commands) learned a new "--max-count-oldest"
that picks oldest N commits in the range instead of the usual newest.
* mf/revision-max-count-oldest:
bash-completions: add --max-count-oldest
revision.c: implement --max-count-oldest
Documentation and tests have been added to clarify that Git's internal
raw timestamp format requires a `@` prefix for values less than
100,000,000 to prevent ambiguity with other formats like YYYYMMDD.
* ls/doc-raw-timestamp-prefix:
doc: document and test `@` prefix for raw timestamps
Guidelines on how to write a cover letter for a multi-patch series
have been added to SubmittingPatches, which also got a new marker
to separate the section for typofixes.
* jc/submitting-patches-cover-letter:
SubmittingPatches: describe cover letter
SubmittingPatches: separate typofixes section
Scripts need a stable way to locate the git directory without
parsing rev-parse output or relying on its flag-driven path format
selection. There is no way to retrieve this path from git repo info
today.
Introduce path.gitdir.absolute and path.gitdir.relative keys,
consistent with the path.commondir keys added in the previous patch.
Reuse the test_repo_info_path helper introduced there to validate
both variants.
Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Scripts working with worktree setups need a reliable way to discover
the common directory, which diverges from the git directory when
multiple worktrees are in use. There is no way to retrieve this path
from git repo info today.
Introduce path.commondir.absolute and path.commondir.relative keys.
Exposing explicit format variants rather than a single key with a
default avoids ambiguity for scripts that require predictable output.
Add a test helper test_repo_info_path that creates isolated
repositories per test case to prevent state leaks, captures the repo
root before changing directories to avoid eval, and accepts an optional
init_command to cover environment variable overrides such as
GIT_COMMON_DIR and GIT_DIR.
Mentored-by: Justin Tobler <jltobler@gmail.com>
Mentored-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ReviewingGuidelines already advises reviewers to trim irrelevant quoted
context when replying. Give the same advice to new contributors in
MyFirstContribution, so our documentation is consistent about mailing
list reply etiquette.
Signed-off-by: Weijie Yuan <wy@wyuan.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Paired octothorpes are used in AsciiDoc to mark highlighted text,
<mark> being the equivalent HTML tag. To use the symbol as a literal
character, it can be escaped with backticks.
Do so in git-config.adoc.
While at it, tweak the text slightly to make it scan better.
Signed-off-by: Tuomas Ahola <taahol@utu.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many places in git-config(1) where paragraphs that should
logically come after a list are instead appended to the last item of
the list. This is a well-documented quirk of AsciiDoc, and can be
mitigated by enclosing the list in an open block:
--
* first item
* last item
--
+
New paragraph after the list.
Fix the issue accordingly.
Signed-off-by: Tuomas Ahola <taahol@utu.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common operation when editing the commit history is to drop a specific
commit from the history entirely, but this operation is not currently
covered by git-history(1).
A couple of noteworthy bits:
- This is the first git-history(1) command that will ultimately result
in changes to both the index and the working tree. We thus have to
add logic to merge resulting changes into those.
- It is still not possible to replay merge commits, so this limitation
is inherited for the new "drop" command.
- For now we refuse to drop root commits. While we _can_ indeed drop
root commits in the general case, there are edge cases where the
resulting history would become completely empty. This is thus left
to a subsequent patch series.
Other than that, most of the logic is rather straight-forward as we can
continue to build on the preexisting logic in git-history(1) for most of
the part.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It matters where you put the s-o-b; it should be last. You are signing
off on the patch as well as the whole message up to that point.
This also makes it clear who added what:
Acked-by: The Reviewer <r@example.org>
Signed-off-by: The Contributor <c@example.org>
Acked-by: The (Late) Reviewer <late@example.org>
Signed-off-by: The Maintainer <m@example.org>
The the first ack was added by the contributor and the second one was
added by the maintainer.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rest of this section and (most importantly) the list has decided to
use `<key>:`. So let’s use backticks (`) and a colon (:) throughout the
document.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This trailer comes up often enough and the use case is not fully covered
by the other trailers here. For example, it is sometimes better to use
this trailer instead of `Co-authored-by:`.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Linux Kernel regularly uses trailers (or “tags”) `Fixes` and
`Link`. Sometimes people submit patches to this project with them.
They have their use in that project but it is not clear what purpose
they would serve here.
For `Fixes`: Linux has many trees, and applying patches with
cherry-picks is common. A `Fixes` trailer in commit C2 pointing to
commit C1 helps the cherry-picker figure out that she probably needs
C2 if she wants to apply C1. See linux/d5d6281a (checkpatch: check for
missing Fixes tags, 2024-06-11):[1]
Why are stable patches encouraged to have a fixes tag? Some people
mark their stable patches as "# 5.10" etc. This is useful but a
Fixes tag is still a good idea. For example, the Fixes tag helps in
review. It helps people to not cherry-pick buggy patches without
also cherry-picking the fix.
In contrast the Git project has few trees (to my knowledge), and there
is much less need to cherry-pick fixes as opposed to either using
backmerges or rebasing all of the downstream tree’s commits on top of
git.git `master` from time to time.
This project does regularly mention what commits a patch/commit fixes,
but that is done inline in the commit message proper (c.f. the trailer
block of the message).
For `Link`: These are used both to link back to the patch submission as
well as with footnotes. In contrast this project has `refs/notes/amlog`
for linking back to the patch submissions, and footnotes are only used
in the commit message proper.
† 1: Commit linux/d5d6281a has “linux” in front of it since this commit
is from the Linux Kernel, not Git. Example of a Linux tree—as well
as an example of `Link`—is [2].
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/ [2]
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Briefly discuss trailers that do not credit people. This continues the
discussion from the previous commit about using trailers for *people*.
Using non-ident trailers can be relevant. The contributor should just be
encouraged to consider whether it is useful or not.
The larger trend here is to discourage using trailers as a dumping
ground for any kind of metadata in the spirit of “it doesn’t hurt”.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Trailers beyond the mandatory s-o-b are regularly used based on my
last two years of reading the mailing list. Moreover, reviewers might
encourage it.[1]
This is also in line with the project crediting both commit authors and
people mentioned in trailers each release; “Nobody is THE one making
contribution”.[2]
Adding trailers is already encouraged, but in the section `send-patches`.
Let’s replace “If you like” with outright encouragment in this section
so that all trailer discussion (except s-o-b; see `sign-off` section) is
contained in this section; a link to from `send-patches` makes this
information equally visible.
Now we need to make a heading for `commit-trailers` in order for the
HTML output to make sense.
At the same, it is important to temper this recommendation to a sign-
ificant enough contribution; in my experience beginners can be eager
to add a trailer for everyone who replies with an action point that is
followed up on.
Let’s also spell out that these trailers should follow the Git author/
committer format. One might naturally just write the name, but in that
case it will not be picked up by:
git shortlog --group=trailer:<key>
and normalization via `.mailmap` will not work.
Also introduce the list of common trailers as such. Granted, this is
already implied by the later paragraph about “create your own trailer”,
so this just frontloads this information.
† 1: https://lore.kernel.org/git/CAP8UFD0POvYDgGtEx8GBhvKkd8XzzWQsy8XxAKL9M3+uz3ka+w@mail.gmail.com/#:~:text=for%20at%20least
† 2: https://lore.kernel.org/git/xmqqzh248sy0.fsf@gitster.c.googlers.com/
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for "--word-diff" has been extended with a bit of
implementation detail of where these different words come from.
* mm/doc-word-diff:
doc: clarify that --word-diff operates on line-level hunks
The `git log -L` implementation has been refactored to use the
standard diff output pipeline, enabling pickaxe and diff-filter to
work as expected. Additionally, metadata-only diff formats like
--raw and --name-only are now supported with -L.
* mm/line-log-cleanup:
line-log: allow non-patch diff formats with -L
line-log: integrate -L output with the standard log-tree pipeline
revision: move -L setup before output_format-to-diff derivation
Comment lines have always been ignored but this is not documented.
This is mostly for completeness since this is unlikely to catch anyone
by surprise. But we really ought to be reasonably complete here since
it’s the only documentation page that documents trailers.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>