Part of "install-dependencies.sh" is to install some binaries required
for tests into a custom directory that gets added to the PATH. This
directory is located at "$HOME/path" and thus depends on the current
user that the script executes as.
This creates problems for GitLab CI, which installs dependencies as the
root user, but runs tests as a separate, unprivileged user. As their
respective home directories are different, we will end up using two
different custom path directories. Consequently, the unprivileged user
will not be able to find the binaries that were set up as root user.
Fix this issue by allowing CI to override the custom path, which allows
GitLab to set up a constant value that isn't derived from "$HOME".
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're downloading various executables required by our tests. Each of
these executables goes into its own directory, which is then appended to
the PATH variable. Consequently, whenever we add a new dependency and
thus a new directory, we would have to adapt to this change in several
places.
Refactor this to instead put all binaries into a single directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're about to merge the "install-docker-dependencies.sh" script into
"install-dependencies.sh". This will also move our Alpine-based jobs
over to use the latter script. This script uses the Bash shell though,
which is not available by default on Alpine Linux.
Refactor "install-dependencies.sh" to use "/bin/sh" instead of Bash.
This requires us to get rid of the pushd/popd invocations, which are
replaced by some more elaborate commands that download or extract
executables right to where they are needed.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "linux-gcc-default" job installs common Ubuntu packages. This is
already done in the distro-specific switch, so we basically duplicate
the effort here.
Drop the duplicate package installations and inline the variable that
contains those common packages.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our "install-dependencies.sh" script is executed by non-dockerized jobs
to install dependencies. These jobs don't run with "root" permissions,
but with a separate user. Consequently, we need to use sudo(8) there to
elevate permissions when installing packages.
We're about to merge "install-docker-dependencies.sh" into that script
though, and our Docker containers do run as "root". Using sudo(8) is
thus unnecessary there, even though it would be harmless. On some images
like Alpine Linux though there is no sudo(8) available by default, which
would consequently break the build.
Adapt the script to make "sudo" a no-op when running as "root" user.
This allows us to easily reuse the script for our dockerized jobs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose a distro name in dockerized jobs. This will be used in a
subsequent commit where we merge the installation scripts for dockerized
and non-dockerized jobs.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "runs_on_pool" environment variable is used by our CI scripts to
distinguish the different kinds of operating systems. It is quite
specific to GitHub Actions though and not really a descriptive name.
Rename the variable to "distro" to clarify its intent.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, the buffer type of Windows' `stdout` is unbuffered (_IONBF),
and there is no need to manually fflush `stdout`.
But some programs, such as the Windows Filtering Platform driver
provided by the security software, may change the buffer type of
`stdout` to full buffering. This nees `fflush(stdout)` to be called
manually, otherwise there will be no output to `stdout`.
Signed-off-by: MinarKotonoha <chengzhuo5@qq.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In 18c9cb7524 (builtin/clone: create the refdb with the correct object
format, 2023-12-12), we have changed git-clone(1) so that it delays
creation of the refdb until after it has learned about the remote's
object format. This change was required for the reftable backend, which
encodes the object format into the tables. So if we pre-initialized the
refdb with the default object format, but the remote uses a different
object format than that, then the resulting tables would have encoded
the wrong object format.
This change unfortunately breaks remote helpers which try to access the
repository that is about to be created. Because the refdb has not yet
been initialized at the point where we spawn the remote helper, we also
don't yet have "HEAD" or "refs/". Consequently, any Git commands ran by
the remote helper which try to access the repository would fail because
it cannot be discovered.
This is essentially a chicken-and-egg problem: we cannot initialize the
refdb because we don't know about the object format. But we cannot learn
about the object format because the remote helper may be unable to
access the partially-initialized repository.
Ideally, we would address this issue via capabilities. But the remote
helper protocol is not structured in a way that guarantees that the
capability announcement happens before the remote helper tries to access
the repository.
Instead, fix this issue by partially initializing the refdb up to the
point where it becomes discoverable by Git commands.
-----
Cherry-picked the commit 199f44cb2e to
provide the backport-PR requested by @dscho in #4843 ...
Commit message unchanged with the exception of the `Signed-off-by:`. The
contents of the commit are anyway unchanged. Hope this works for you.
In 18c9cb7524 (builtin/clone: create the refdb with the correct object
format, 2023-12-12), we have changed git-clone(1) so that it delays
creation of the refdb until after it has learned about the remote's
object format. This change was required for the reftable backend, which
encodes the object format into the tables. So if we pre-initialized the
refdb with the default object format, but the remote uses a different
object format than that, then the resulting tables would have encoded
the wrong object format.
This change unfortunately breaks remote helpers which try to access the
repository that is about to be created. Because the refdb has not yet
been initialized at the point where we spawn the remote helper, we also
don't yet have "HEAD" or "refs/". Consequently, any Git commands ran by
the remote helper which try to access the repository would fail because
it cannot be discovered.
This is essentially a chicken-and-egg problem: we cannot initialize the
refdb because we don't know about the object format. But we cannot learn
about the object format because the remote helper may be unable to
access the partially-initialized repository.
Ideally, we would address this issue via capabilities. But the remote
helper protocol is not structured in a way that guarantees that the
capability announcement happens before the remote helper tries to access
the repository.
Instead, fix this issue by partially initializing the refdb up to the
point where it becomes discoverable by Git commands.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Oliver Schneider <oschneider@exocad.com>
The `fsmonitor` repo-setting data is allocated lazily. It needs to be
released in `repo_clear()` along the rest of the allocated data in
`repository`.
This is needed because the next commit will merge v2.39.4, which will
add a `git checkout --recurse-modules` call (which would leak memory
without this here fix) to an otherwise leak-free test script.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Across only three files, comments and a single function name used
'commitish' rather than 'commit-ish' or 'committish' as the spelling.
The git glossary accepts a hyphen or a double-t, but not a single-t.
Despite the typo in a translation file, none of the typos appear in
user-visible locations.
Signed-off-by: Pi Fisher <Pi.L.D.Fisher@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.
That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.
Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.
Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.
This is a backport of 92a209bf24 (remote-curl: add Transfer-Encoding
header only for older curl, 2024-04-05) into the `maint-2.39` branch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Our documentation claims we support curl versions back to 7.19.5. But we
can no longer compile with that version since adding an unconditional
use of CURLOPT_RESOLVE in 511cfd3bff (http: add custom hostname to IP
address resolutions, 2022-05-16). That feature wasn't added to libcurl
until 7.21.3.
We could add #ifdefs to make this work back to 7.19.5. But given that
nobody noticed the compilation failure in the intervening two years, it
makes more sense to bump the version in the documentation to 7.21.3
(which is itself over 13 years old).
We could perhaps go forward even more (which would let us drop some
cruft from git-curl-compat.h), but this should be an obviously safe
jump, and we can move forward later.
Note that user-visible syntax for CURLOPT_RESOLVE has grown new features
in subsequent curl versions. Our documentation mentions "+" and "-"
entries, which require more recent versions than 7.21.3. We could
perhaps clarify that in our docs, but it's probably not worth cluttering
them with restrictions of ancient curl versions.
This is a backport of c28ee09503 (INSTALL: bump libcurl version to
7.21.3, 2024-04-02) into the `maint-2.39` branch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In get_active_slot(), we return a CURL handle that may have been used
before (reusing them is good because it lets curl reuse the same
connection across many requests). We set a few curl options back to
defaults that may have been modified by previous requests.
We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
defaults to "-1"). This usually doesn't matter because most POSTs will
set both fields together anyway. But there is one exception: when
handling a large request in remote-curl's post_rpc(), we don't set
_either_, and instead set a READFUNCTION to stream data into libcurl.
This can interact weirdly with a stale POSTFIELDSIZE setting, because
curl will assume it should read only some set number of bytes from our
READFUNCTION. However, it has worked in practice because we also
manually set a "Transfer-Encoding: chunked" header, which libcurl uses
as a clue to set the POSTFIELDSIZE to -1 itself.
So everything works, but we're better off resetting the size manually
for a few reasons:
- there was a regression in curl 8.7.0 where the chunked header
detection didn't kick in, causing any large HTTP requests made by
Git to fail. This has since been fixed (but not yet released). In
the issue, curl folks recommended setting it explicitly to -1:
https://github.com/curl/curl/issues/13229#issuecomment-2029826058
and it indeed works around the regression. So even though it won't
be strictly necessary after the fix there, this will help folks who
end up using the affected libcurl versions.
- it's consistent with what a new curl handle would look like. Since
get_active_slot() may or may not return a used handle, this reduces
the possibility of heisenbugs that only appear with certain request
patterns.
Note that the recommendation in the curl issue is to actually drop the
manual Transfer-Encoding header. Modern libcurl will add the header
itself when streaming from a READFUNCTION. However, that code wasn't
added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
support back to 7.19.5, so those older versions still need the manual
header.
This is a backport of 3242311742 (http: reset POSTFIELDSIZE when
clearing curl handle, 2024-04-02) into the `maint-2.39` branch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Doc update, as a preparation to enhance "git update-ref --stdin".
* kn/clarify-update-ref-doc:
githooks: use {old,new}-oid instead of {old,new}-value
update-ref: use {old,new}-oid instead of {old,new}value
Another "set -u" fix for the bash prompt (in contrib/) script.
* vs/complete-with-set-u-fix:
completion: protect prompt against unset SHOWUPSTREAM in nounset mode
completion: fix prompt with unset SHOWCONFLICTSTATE in nounset mode
Move the already existing newline characters out of a few translatable
strings, to help a bit with the translation efforts.
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CST6CDT and the like are POSIX timezone, with no rule for transition.
And POSIX doesn't enforce how to interpret the rule if it's omitted.
Some libc (e.g. glibc) resorted back to IANA (formerly Olson) db rules
for those timezones. Some libc (e.g. FreeBSD) uses a fixed rule.
Other libc (e.g. musl) interpret that as no transition at all [1].
In addition, distributions (notoriously Debian-derived, which uses IANA
db for CST6CDT and the like) started to split "legacy" timezones
like CST6CDT, EST5EDT into `tzdata-legacy', which will not be installed
by default [2].
In those cases, t9604 will run into failure.
Let's switch to POSIX timezone with rules to change timezone.
1: http://mm.icann.org/pipermail/tz/2024-March/058751.html
2: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043250
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use tabs to indent, not two or four spaces.
These days, even the test fixture preparation should be done inside
test_expect_success block.
Address these two style violations in this test.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test file assumes the "apply" backend is the default which is not
the case since 2ac0d6273f (rebase: change the default backend from "am"
to "merge", 2020-02-15). Make sure the "apply" backend is tested by
specifying it explicitly.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using a helper function makes the tests shorter and avoids running "git
cat-file" upstream of a pipe.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perform the setup in a dedicated test so the later tests can be run
independently. Also avoid running git upstream of a pipe and take
advantage of test_commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use advice_if_enabled() API to rewrite a simple pattern to
call advise() after checking advice_enabled().
* rj/use-adv-if-enabled:
add: use advise_if_enabled for ADVICE_ADD_EMBEDDED_REPO
add: use advise_if_enabled for ADVICE_ADD_EMPTY_PATHSPEC
add: use advise_if_enabled for ADVICE_ADD_IGNORED_FILE
"git pack-refs" learned the "--auto" option, which is a useful
addition to be triggered from "git gc --auto".
Acked-by: Karthik Nayak <karthik.188@gmail.com>
cf. <CAOLa=ZRAEA7rSUoYL0h-2qfEELdbPHbeGpgBJRqesyhHi9Q6WQ@mail.gmail.com>
* ps/pack-refs-auto:
builtin/gc: pack refs when using `git maintenance run --auto`
builtin/gc: forward git-gc(1)'s `--auto` flag when packing refs
t6500: extract objects with "17" prefix
builtin/gc: move `struct maintenance_run_opts`
builtin/pack-refs: introduce new "--auto" flag
builtin/pack-refs: release allocated memory
refs/reftable: expose auto compaction via new flag
refs: remove `PACK_REFS_ALL` flag
refs: move `struct pack_refs_opts` to where it's used
t/helper: drop pack-refs wrapper
refs/reftable: print errors on compaction failure
reftable/stack: gracefully handle failed auto-compaction due to locks
reftable/stack: use error codes when locking fails during compaction
reftable/error: discern locked/outdated errors
reftable/stack: fix error handling in `reftable_stack_init_addition()`
The test script had an incomplete and ineffective attempt to avoid
clobbering the testing user's real crontab (and its equivalents),
which has been completed.
* es/test-cron-safety:
test-lib: fix non-functioning GIT_TEST_MAINT_SCHEDULER fallback
"git add -p" and other "interactive hunk selection" UI has learned to
skip showing the hunk immediately after it has already been shown, and
an additional action to explicitly ask to reshow the current hunk.
* rj/add-p-explicit-reshow:
add-patch: do not print hunks repeatedly
add-patch: introduce 'p' in interactive-patch
Documentation rules has been explicitly described how to mark-up
literal parts and a few manual pages have been updated as examples.
* ja/doc-markup-updates:
doc: git-clone: do not autoreference the manpage in itself
doc: git-clone: apply new documentation formatting guidelines
doc: git-init: apply new documentation formatting guidelines
doc: allow literal and emphasis format in doc vs help tests
doc: rework CodingGuidelines with new formatting rules
The "hint:" messages given by the advice mechanism, when given a
message with a blank line, left a line with trailing whitespace,
which has been cleansed.
* jc/advice-sans-trailing-whitespace:
advice: omit trailing whitespace
"git apply" failed to extract the filename the patch applied to,
when the change was about an empty file created in or deleted from
a directory whose name ends with a SP, which has been corrected.
* jc/apply-parse-diff-git-header-names-fix:
t4126: fix "funny directory name" test on Windows (again)
t4126: make sure a directory with SP at the end is usable
apply: parse names out of "diff --git" more carefully
The tests for git-pack-refs(1) with the `core.sharedRepository` config
execute git-pack-refs(1) outside of the shell that has the expected
umask set. This is wrong because we want to test the behaviour of that
command with different umasks. The issue went unnoticed because most
distributions have a default umask of 0022, and we only ever test with
`--shared=true`, which re-adds the group write bit.
Fix the issue by moving git-pack-refs(1) into the umask'd shell and add
a bunch of test cases that exercise behaviour more thoroughly.
Note that we drop the check for whether `core.sharedRepository` was set
to the correct value to make the test setup a bit easier. We should be
able to rely on git-init(1) doing its thing correctly. Furthermore, to
help readability, we convert tests that pass `--shared=true` to instead
pass the equivalent `--shared=group`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two kinds of `--shared=` tests, one for git-init(1) and one for
git-pack-refs(1). Merge them into a reusable function such that we can
easily add additional testcases with different umasks and flags for the
`--shared=` switch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to the preceding commit, let's reuse the `compressed` array that
we use to store compressed data in. This results in a small reduction in
memory allocations when writing many refs.
Before:
HEAP SUMMARY:
in use at exit: 671,931 bytes in 151 blocks
total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 671,931 bytes in 151 blocks
total heap usage: 22,618,257 allocs, 22,618,106 frees, 1,236,351,528 bytes allocated
So while the reduction in allocations isn't really all that big, it's a
low hanging fruit and thus there isn't much of a reason not to pick it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While most reftable blocks are written to disk as-is, blocks for log
records are compressed with zlib. To compress them we use `compress2()`,
which is a simple wrapper around the more complex `zstream` interface
that would require multiple function invocations.
One downside of this interface is that `compress2()` will reallocate
internal state of the `zstream` interface on every single invocation.
Consequently, as we call `compress2()` for every single log block which
we are about to write, this can lead to quite some memory allocation
churn.
Refactor the code so that the block writer reuses a `zstream`. This
significantly reduces the number of bytes allocated when writing many
refs in a single transaction, as demonstrated by the following benchmark
that writes 100k refs in a single transaction.
Before:
HEAP SUMMARY:
in use at exit: 671,931 bytes in 151 blocks
total heap usage: 22,631,887 allocs, 22,631,736 frees, 1,854,670,793 bytes allocated
After:
HEAP SUMMARY:
in use at exit: 671,931 bytes in 151 blocks
total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable writer tracks the last key that it has written so that it
can properly compute the compressed prefix for the next record it is
about to write. This last key must be reset whenever we move on to write
the next block, which is done in `writer_reinit_block_writer()`. We do
this by calling `strbuf_release()` though, which needlessly deallocates
the underlying buffer.
Convert the code to use `strbuf_reset()` instead, which saves one
allocation per block we're about to write. This requires us to also
amend `reftable_writer_free()` to release the buffer's memory now as we
previously seemingly relied on `writer_reinit_block_writer()` to release
the memory for us. Releasing memory here is the right thing to do
anyway.
While at it, convert a callsite where we truncate the buffer by setting
its length to zero to instead use `strbuf_reset()`, too.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two code paths which release memory of the reftable writer:
- `reftable_writer_close()` releases internal state after it has
written data.
- `reftable_writer_free()` releases the block that was written to and
the writer itself.
Both code paths free different parts of the writer, and consequently the
caller must make sure to call both. And while callers mostly do this
already, this falls apart when a write failure causes the caller to skip
calling `reftable_write_close()`.
Introduce a new function `reftable_writer_release()` that releases all
internal state and call it from both paths. Like this it is fine for the
caller to not call `reftable_writer_close()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_flush_nonempty_block()` such that it conforms
better to it and add some documentation that explains some of its more
intricate behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_add_record()` such that it conforms better to it
and add some documentation that explains some of its more intricate
behaviour.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to write reflog entries we need to compute the committer's
identity as it gets encoded in the log record itself. The reftable
backend does this via `git_committer_info()` and `split_ident_line()` in
`fill_reftable_log_record()`, which use the Git config as well as
environment variables to figure out the identity.
While most callers would only call `fill_reftable_log_record()` once or
twice, `write_transaction_table()` will call it as many times as there
are queued ref updates. This can be quite a waste of effort when writing
many refs with reflog entries in a single transaction.
Refactor the code to pre-compute the committer information. This results
in a small speedup when writing 100000 refs in a single transaction:
Benchmark 1: update-ref: create many refs (HEAD~)
Time (mean ± σ): 2.895 s ± 0.020 s [User: 1.516 s, System: 1.374 s]
Range (min … max): 2.868 s … 2.983 s 100 runs
Benchmark 2: update-ref: create many refs (HEAD)
Time (mean ± σ): 2.845 s ± 0.017 s [User: 1.461 s, System: 1.379 s]
Range (min … max): 2.803 s … 2.913 s 100 runs
Summary
update-ref: create many refs (HEAD) ran
1.02 ± 0.01 times faster than update-ref: create many refs (HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit we have disabled name checks in the "reftable"
backend. These checks were responsible for verifying multiple things
when writing records to the reftable stack:
- Detecting file/directory conflicts. Starting with the preceding
commits this is now handled by the reftable backend itself via
`refs_verify_refname_available()`.
- Validating refnames. This is handled by `check_refname_format()` in
the generic ref transacton layer.
The code in the reftable library is thus not used anymore and likely to
bitrot over time. Remove it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the callback functions which write refs in the reftable backend
perform D/F conflict checks via `refs_verify_refname_available()`. But
in reality we perform these D/F conflict checks a second time in the
reftable library via `stack_check_addition()`.
Interestingly, the code in the reftable library is inferior compared to
the generic function:
- It is slower than `refs_verify_refname_available()`, even though
this can probably be optimized.
- It does not provide a proper error message to the caller, and thus
all the user would see is a generic "file/directory conflict"
message.
Disable the D/F conflict checks in the reftable library by setting the
`skip_name_check` write option. This results in a non-negligible speedup
when writing many refs. The following benchmark writes 100k refs in a
single transaction:
Benchmark 1: update-ref: create many refs (HEAD~)
Time (mean ± σ): 3.241 s ± 0.040 s [User: 1.854 s, System: 1.381 s]
Range (min … max): 3.185 s … 3.454 s 100 runs
Benchmark 2: update-ref: create many refs (HEAD)
Time (mean ± σ): 2.878 s ± 0.024 s [User: 1.506 s, System: 1.367 s]
Range (min … max): 2.838 s … 2.960 s 100 runs
Summary
update-ref: create many refs (HEAD~) ran
1.13 ± 0.02 times faster than update-ref: create many refs (HEAD)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>