Every once in a while I need to verify that Microsoft Git's test suite
passes for changes that are not yet meant for public consumption, and
since it was (made) too difficult to keep up a working Azure Pipeline
definition, I have to use GitHub Actions in a private GitHub repository
for that purpose.
In these tests, basically all Dockerized CI jobs fail consistently. The
symptom is something like:
error: cannot create async thread: Resource temporarily unavailable
in the middle of a test, typically in the t5xxx-t6xxx range. The first
such error is immediately followed by plenty more of these errors, and
not a single test succeeds afterwards.
At first, I thought that maybe the massive parallelism I enjoy there is
the problem, and I thought that the cgroups limits might be shared
between the many containers that run on essentially the same physical
machine. But even reducing the matrix to just a single of those
Dockerized jobs runs into the very same problems.
The underlying reason seems to be a substantial difference in the hosted
runners that execute these Dockerized jobs: forcing the PID limit of the
container to a high number lets the jobs pass, even when running the
complete matrix of all 13 Dockerized jobs concurrently. But that's not
the only difference: The jobs seem to take a lot longer in these
containers than, say, in the containers made available to
https://github.com/git/git.
When forcing a PID limit of 64k in that private repository, the jobs
completed successfully, but they also took a lot longer, between 2x to
2.5x longer, i.e. painfully much longer. Reducing the PID limit to 16k,
the CI jobs still passed, but took an equally long amount of time.
Reducing the PID limit to 8k caused the errors to reappear.
Here are the numbers from three example runs, the first one forcing the
PID and nproc limit to 65536, the second one to 16384, the third run is
from the public git/git repository:
Job | 64k | 16k | reference
------------------------------|---------|---------|---------
almalinux-8 | 19m 3s | 16m 0s | 9m 36s
debian-11 | 20m 31s | 20m 3s | 8m 5s
fedora-breaking-changes-meson | 16m 29s | 19m 19s | 9m 40s
linux-asan-ubsan | 1h 10m | 1h 11m | 34m 36s
linux-breaking-changes | 25m 39s | 25m 58s | 13m 15s
linux-leaks | 1h 9m | 1h 10m | 33m 30s
linux-meson | 28m 9s | 27m 4s | 13m 45s
linux-musl-meson | 16m 32s | 13m 39s | 8m 6s
linux-reftable-leaks | 1h 13m | 1h 13m | 34m 34s
linux-reftable | 26m 2s | 25m 48s | 13m 31s
linux-sha256 | 26m 12s | 26m 3s | 12m 36s
linux-TEST-vars | 26m 5s | 25m 21s | 13m 25s
linux32 | 21m 16s | 19m 57s | 10m 44s
It does not look as if the PID limit is the reason for the longer
runtime, seeing as the 64k vs 16k timings deviate no more than as is
usual with GitHub workflows. So let's go for 16k.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Project-specific configuration for b4 has been introduced, and the
documentation has been updated to recommend using it as a
streamlined method for submitting patches.
* ps/doc-recommend-b4:
b4: introduce configuration for the Git project
Documentation/MyFirstContribution: recommend the use of b4
Documentation/MyFirstContribution: recommend shallow threading
When cURL follows a redirect, the WWW-Authenticate headers from the
redirect target were lost because credential_from_url() cleared the
credential state. This has been fixed by preserving the collected
headers across the redirect update.
* ap/http-redirect-wwwauth-fix:
http: preserve wwwauth_headers across redirects
A regression in the error diagnosis code for invalid .git files has
been fixed, avoiding a potential NULL-pointer crash when reporting
that a .git file does not point to a valid repository.
* jk/setup-gitfile-diag-fix:
read_gitfile_gently(): return non-repo path on error
The experimental "git history" command has been taught a new "drop"
subcommand to remove a commit and replay its descendants onto its
parent.
* ps/history-drop:
builtin/history: implement "drop" subcommand
builtin/history: split handling of ref updates into two phases
reset: stop assuming that the caller passes in a clean index
reset: allow the caller to specify the current HEAD object
reset: introduce ability to skip reference updates
reset: introduce dry-run mode
reset: modernize flags passed to `reset_head()`
reset: drop `USE_THE_REPOSITORY_VARIABLE`
read-cache: split out function to drop unmerged entries to stage 0
A common operation when editing the commit history is to drop a specific
commit from the history entirely, but this operation is not currently
covered by git-history(1).
A couple of noteworthy bits:
- This is the first git-history(1) command that will ultimately result
in changes to both the index and the working tree. We thus have to
add logic to merge resulting changes into those.
- It is still not possible to replay merge commits, so this limitation
is inherited for the new "drop" command.
- For now we refuse to drop root commits. While we _can_ indeed drop
root commits in the general case, there are edge cases where the
resulting history would become completely empty. This is thus left
to a subsequent patch series.
Other than that, most of the logic is rather straight-forward as we can
continue to build on the preexisting logic in git-history(1) for most of
the part.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `handle_reference_updates()` is used by git-history(1) to
update all references that refer to commits that have been rewritten. As
such, it performs two steps:
- It gathers the references that need to be updated in the first
place.
- It prepares and commits the reference transaction.
In a subsequent commit we'll want to handle those two steps separately.
Prepare for this by splitting up the function into two.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 652bd0211d (rebase: use 'skip_cache_tree_update' option, 2022-11-10),
we updated `reset_head()` to stop updating the index tree cache. This
was done as a performance optimization: the function is only called by
"sequencer.c" and "rebase.c", both of which assume a clean index before
they perform their operation, so we know that the end result will be a
clean index, too. Consequently, we can skip recomputing the cache as we
can instead use `prime_cache_tree()` directly.
In a subsequent commit we're about to add a new caller though where the
assumption doesn't hold anymore: the index may be dirty before calling
`reset_head()`, and consequently we cannot prime the cache with a given
tree anymore as the index and tree will mismatch.
Adapt the logic so that we only skip the cache tree update in case we're
doing a hard reset. While we could introduce logic that only skips the
update in case the incoming index was dirty already, that doesn't really
feel worth it: after all, the mentioned commit says itself that the
performance improvement was negligible anyway.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `reset_head()` we automatically derive the commit that the
callers wants to move from by reading the HEAD commit. Some callers may
already have resolved it, or they may want to move from a different
commit that doesn't match HEAD.
Introduce a new `oid_from` option that lets the caller specify the
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit we'll introduce a new caller to `reset_head()`
that really only wants to update the index and working tree, without
updating any references. Introduce a new flag that lets the caller
perform this operation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit we'll add add another caller to `reset_head()`
that wants to perform a dry-run check of whether it would be possible to
udpate the index and working tree when moving to a new commit. Introduce
a new flag that lets the caller perform this operation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flags passed to `reset_head()` are declared as defines. This has
fallen a bit out of practice nowadays, where we instead prefer to use
enums.
Modernize the code accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "reset.c" we still have references to `the_repository`, even though
the only entry point into the file already receives a repository as
parameter.
Update all uses of `the_repository` to instead use the passed-in repo
and drop `USE_THE_REPOSITORY_VARIABLE`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `repo_read_index_unmerged()` we read the index and then drop any
unmerged entries to stage 0. In a subsequent commit we'll want to
perform this operation on arbitrary indexes, not only the one of the
given repository.
Prepare for this by splitting out the functionality into a new function
that can act on an arbitrary index.
While at it, fix a signedness mismatch when iterating through the index
cache entries.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git repo info" command has been taught new keys to output both
absolute and relative paths for "gitdir" and "commondir", supported by
a new path-formatting helper extracted from "git rev-parse".
* jk/repo-info-path-keys:
repo: add path.commondir with absolute and relative suffix formatting
repo: add path.gitdir with absolute and relative suffix formatting
rev-parse: use strbuf_add_path for path formatting
path: add strbuf_add_path for formatting paths
The subprocess handshake during startup has been made gentler by using
packet_read_line_gently() instead of packet_read_line() to prevent the
parent Git process from dying abruptly when a configured subprocess
(e.g., a clean/smudge filter) fails to start.
* mm/subprocess-handshake-fix:
sub-process: use gentle handshake to avoid die() on startup failure
prio_queue_get() has been optimized by using a cascade-down approach
(promoting the smaller child at each level and sifting up the last
element from the leaf vacancy), which halves the number of comparisons
per extract-min operation in the common case.
* kk/prio-queue-cascade-sift:
prio-queue: use cascade-down for faster extract-min
The 'trust_executable_bit' (coming from 'core.filemode'
configuration) has been migrated into 'repo_config_values' to tie it
to a specific repository instance.
* ty/migrate-trust-executable-bit:
read-cache: pass 'istate' to stat/mode helper functions
environment: move 'trust_executable_bit' into repo_config_values
read-cache: move 'ce_mode_from_stat()' to 'read-cache.c'
read-cache: remove redundant extern declarations
The pack-objects command now supports using reachability bitmaps and
delta-islands concurrently with the `--path-walk` option, allowing
faster packaging by falling back to path-walk when bitmaps cannot
fully satisfy the request.
* tb/pack-path-walk-bitmap-delta-islands:
pack-objects: support `--delta-islands` with `--path-walk`
pack-objects: extract `record_tree_depth()` helper
pack-objects: support reachability bitmaps with `--path-walk`
t/perf: drop p5311's lookup-table permutation
A new `diff.<driver>.process` configuration has been introduced to
allow a long-running external process to act as a hunk provider to
allows external tools to control which lines Git considers changed
while leaving all output formatting (word diff, color, blame, etc.) to
Git's standard pipeline.
* mm/diff-process-hunks:
blame: consult diff process for no-hunk detection
diff: bypass diff process with --no-ext-diff and in format-patch
diff: add long-running diff process via diff.<driver>.process
sub-process: separate process lifecycle from hashmap management
userdiff: add diff.<driver>.process config
xdiff: support external hunks via xpparam_t
"git index-pack" has been optimized by retaining child bases in the
delta cache instead of immediately freeing them, letting the existing
cache limit policy decide eviction.
* ab/index-pack-retain-child-bases:
index-pack: retain child bases in delta cache
Various typos, grammatical errors, and duplicated words in both
documentation and code comments have been corrected.
* wy/docs-typofixes:
docs: fix typos and grammar
The handling of promisor-remote protocol capability has been
loosened to allow the other side to add to the list of promisor
remotes via the promisor.acceptFromServerURL configuration
variable.
* cc/promisor-auto-config-url-more:
doc: promisor: improve acceptFromServer entry
promisor-remote: auto-configure unknown remotes
promisor-remote: trust known remotes matching acceptFromServerUrl
promisor-remote: introduce promisor.acceptFromServerUrl
promisor-remote: add 'local_name' to 'struct promisor_info'
urlmatch: add url_normalize_pattern() helper
urlmatch: change 'allow_globs' arg to bool
t5710: simplify 'mkdir X' followed by 'git -C X init'
"git rebase --update-refs" has been taught to resolve local branch
symrefs to their referents before queuing updates. This correctly
skips aliases of the current branch and avoids duplicate updates for
underlying real branches, fixing failures when branch aliases (like a
default branch rename) are present.
* sn/rebase-update-refs-symrefs:
rebase: skip branch symref aliases
The -m/-F/-c/-C options to supply commit log message from outside the
editor are now supported for all "git commit --fixup" variations.
* ec/commit-fixup-options:
commit: allow -c/-C for all kinds of --fixup
commit: allow -m/-F for all kinds of --fixup
The [includeIf "condition"] conditional inclusion facility for
configuration files has learned to use the location of worktree
in its condition.
* cl/conditional-config-on-worktree-path:
config: add "worktree" and "worktree/i" includeIf conditions
config: refactor include_by_gitdir() into include_by_path()
When fetching from a transport that provides a self-contained pack,
pass the transport pointer to the post-fetch `check_connected()` call
to optimize connectivity check.
Retracted.
cf. <CAL71e4MrVqC1=AR6x0_8S=8kVqPdDkhgCZRb4etFsxTzd6s_8Q@mail.gmail.com>
* kk/fetch-store-ref-optimization:
fetch: pass transport to post-fetch connectivity check
The setup logic to discover and configure repositories has been
refactored, and the initialization of the object database has been
centralized.
* ps/setup-centralize-odb-creation:
setup: construct object database in `apply_repository_format()`
repository: stop reading loose object map twice on repo init
setup: stop initializing object database without repository
setup: stop creating the object database in `setup_git_env()`
repository: stop initializing the object database in `repo_set_gitdir()`
setup: deduplicate logic to apply repository format
setup: drop `setup_git_env()`
t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
"git rev-list" (and "git log" family of commands) learned a new "--max-count-oldest"
that picks oldest N commits in the range instead of the usual newest.
* mf/revision-max-count-oldest:
revision.c: implement --max-count-oldest
"git config foo.bar=baz" is not likely to be a request to read the
value of such a variable with '=' in its name; rather it is plausible
that the user meant "git config set foo.bar baz". Give advice when
giving an error message.
* hn/config-typo-advice:
config: improve diagnostic for "set" with missing value
config: add git_config_key_is_valid() for quiet validation
Advice shown by "git status" when the local branch is behind or has
diverged from its push branch has been updated to suggest "git pull
<remote> <branch>".
* hn/status-pull-advice-qualified:
remote: qualify "git pull" advice for non-upstream compareBranches
"git branch" command learned "--prune-merged" option to remove
local branches that have already been merged to the remote-tracking
branches they track.
* hn/branch-prune-merged:
branch: add --dry-run for --prune-merged
branch: add branch.<name>.pruneMerged opt-out
branch: add --prune-merged <branch>
branch: prepare delete_branches for a bulk caller
branch: let delete_branches warn instead of error on bulk refusal
branch: add --forked filter for --list mode
"git checkout --track=..." learned to optionally fetch the branch
from the remote the new branch will work with.
* hn/checkout-track-fetch:
checkout: extend --track with a "fetch" mode to refresh start-point
branch: expose helpers for finding the remote owning a tracking ref
Configuration file locking now retries for a short period, avoiding
failures when multiple processes attempt to update the configuration
simultaneously.
* jt/config-lock-timeout:
config: retry acquiring config.lock, configurable via core.configLockTimeout
The parse-options library learned to auto-correct misspelled
subcommand names.
* js/parseopt-subcommand-autocorrection:
SQUASH???
doc: document autocorrect API
parseopt: add tests for subcommand autocorrection
parseopt: enable subcommand autocorrection for git-remote and git-notes
parseopt: autocorrect mistyped subcommands
autocorrect: provide config resolution API
autocorrect: rename AUTOCORRECT_SHOW to AUTOCORRECT_HINT
autocorrect: use mode and delay instead of magic numbers
help: move tty check for autocorrection to autocorrect.c
help: make autocorrect handling reusable
parseopt: extract subcommand handling from parse_options_step()
The display of the rebase todo list in "git status" has been
improved to correctly abbreviate object IDs for more commands and
avoid misinterpreting refs as object IDs.
* pw/status-rebase-todo:
status: improve rebase todo list parsing
sequencer: factor out parsing of todo commands
When fetching objects into a lazily cloned repository, .promisor
files are created with information meant to help debugging. "git
repack" has been taught to carry this information forward to
packfiles that are newly created.
Retracted.
cf. <agx_GPfBKpkSc3Gx@lorenzo-VM>
* lp/repack-propagate-promisor-debugging-info:
repack-promisor: add missing headers
t7703: test for promisor file content after geometric repack
t7700: test for promisor file content after repack
repack-promisor: preserve content of promisor files after repack
repack-promisor add helper to fill promisor file after repack
pack-write: add explanation to promisor file content
When processing large history graphs on Debian or Ubuntu, "git
subtree" can die with a "recursion depth reached" error.
* cs/subtree-split-recursion:
contrib/subtree: reduce recursion during split
contrib/subtree: functionalize split traversal
contrib/subtree: reduce function side-effects
A handful of inappropriate uses of the_repository have been
rewritten to use the right repository structure instance in the
unpack-trees.c codepath.
* jd/unpack-trees-wo-the-repository:
unpack-trees: use repository from index instead of global
unpack-trees: use repository from index instead of global
Documentation and tests have been added to clarify that Git's internal
raw timestamp format requires a `@` prefix for values less than
100,000,000 to prevent ambiguity with other formats like YYYYMMDD.
* ls/doc-raw-timestamp-prefix:
doc: document and test `@` prefix for raw timestamps
A recent regression in t7527 that broke TAP output has been fixed,
some other test noise that also broke TAP output has been silenced,
and 'prove' is now configured to fail on invalid TAP output to
prevent future regressions.
* ps/t7527-fix-tap-output:
t: let prove fail when parsing invalid TAP output
t/lib-git-p4: silence output when killing p4d and its watchdog
t/test-lib: silence EBUSY errors on Windows during test cleanup
t7527: fix broken TAP output
Guidelines on how to write a cover letter for a multi-patch series
have been added to SubmittingPatches, which also got a new marker
to separate the section for typofixes.
* jc/submitting-patches-cover-letter:
SubmittingPatches: describe cover letter
SubmittingPatches: separate typofixes section
The 'git describe --contains --all' command has been fixed to
properly honor the '--match' and '--exclude' options by passing
them down to 'git name-rev' with the appropriate reference
prefixes.
* jk/describe-contains-all-match-fix:
describe: fix --exclude, --match with --contains and --all
Many core configuration variables have been migrated from global
variables into 'repo_config_values' to tie them to a specific
repository instance, avoiding cross-repository state leakage.
* ob/more-repo-config-values:
environment: move "warn_on_object_refname_ambiguity" into `struct repo_config_values`
environment: move "sparse_expect_files_outside_of_patterns" into `struct repo_config_values`
environment: move "core_sparse_checkout_cone" into `struct repo_config_values`
environment: move "precomposed_unicode" into `struct repo_config_values`
environment: move "pack_compression_level" into `struct repo_config_values`
environment: move `zlib_compression_level` into `struct repo_config_values`
environment: move "check_stat" into `struct repo_config_values`
environment: move "trust_ctime" into `struct repo_config_values`
"ort" merge backend handles merging corrupt trees better by
aborting when it should.
* en/ort-harden-against-corrupt-trees:
cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
merge-ort: abort merge when trees have duplicate entries
merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
merge-ort: drop unnecessary show_all_errors from collect_merge_info()
merge-ort: propagate callback errors from traverse_trees_wrapper()
Documentation updates.
* kh/doc-trailers:
doc: interpret-trailers: document comment line treatment
doc: interpret-trailers: commit to “trailer block” term
doc: interpret-trailers: add key format example
doc: interpret-trailers: explain key format
doc: interpret-trailers: explain the format after the intro
doc: interpret-trailers: not just for commit messages
doc: interpret-trailers: use “metadata” in Name as well
doc: interpret-trailers: replace “lines” with “metadata”
doc: interpret-trailers: stop fixating on RFC 822
The final step, split from earlier attempt by Dscho, to loosen the
sideband restriction for now and tighten later at Git v3.0 boundary.
* jc/neuter-sideband-post-3.0:
sideband: delay sanitizing by default to Git v3.0
In a history with more than one root commit, "git log --graph
--oneline" stuffed an unrelated commit immediately below a root
commit, which has been corrected by making the spot below a root
unavailable.
* ps/shift-root-in-graph:
graph: add indentation for commits preceded by a parentless commit