Commit Graph

178040 Commits

Author SHA1 Message Date
Johannes Schindelin
cb897fc0ce t5505/t5516: fix white-space around redirectors
The convention in the Git project's shell scripts is to have white-space
_before_, but not _after_, the `>` (or `<`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:41 +00:00
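The redirector convention named in the commit above (white-space before, none after `>` or `<`) lends itself to a mechanical check. The following is only a rough sketch in Python: the function name `find_spaced_redirects` and the regular expression are assumptions made for illustration, quoting and here-documents are not handled, and this is not the lint that Git's test suite actually runs.

```python
import re

# Heuristic: a redirection operator followed by blanks and then a target.
# Assumption: quoting, comments and here-documents are not taken into account.
SPACED_REDIRECT = re.compile(r"(?:>>|>|<)[ \t]+\S")


def find_spaced_redirects(script: str) -> list:
    """Return 1-based line numbers where white-space follows `>`, `>>` or `<`."""
    return [
        number
        for number, line in enumerate(script.splitlines(), start=1)
        if SPACED_REDIRECT.search(line)
    ]
```

Calling `find_spaced_redirects()` on the text of a test script would give the lines to review by hand; a heuristic this simple can be expected to produce false positives inside quoted strings.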
Johannes Schindelin
c01688b027 t5505/t5516: allow running without .git/branches/ in the templates
When we commit the template directory as part of `make vcxproj`, the
`branches/` directory is not actually committed, as it is empty.

Two tests were not prepared for that situation.

This developer tried to get rid of the support for `.git/branches/` a
long time ago, but that effort did not bear fruit, so the best we can do
is to work around it in these tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:41 +00:00
Johannes Schindelin
38d854b6c9 Merge branch 'fixes-from-the-git-mailing-list'
These fixes have been sent to the Git mailing list but have not been
picked up by the Git project yet.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:41 +00:00
Johannes Schindelin
6053c36c84 Merge branch 'v2.53.0.windows.3'
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
Jeff King
6e17cb98c5 grep: prevent ^$ false match at end of file
In some implementations, `regexec_buf()` assumes that it is fed lines;
without `REG_NOTEOL` it thinks the end of the buffer is the end of a
line. Which makes sense, but trips up this case because we are not
feeding lines, but rather a whole buffer. So the final newline is not
the start of an empty line, but the true end of the buffer.

This causes an interesting bug:

  $ echo content >file.txt
  $ git grep --no-index -n '^$' file.txt
  file.txt:2:

This bug is fixed by making the end of the buffer consistently the end
of the final line.

The patch was applied from
https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/

Reported-by: Olly Betts <olly@survex.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
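The `^$` false match described in the grep commit above can be imitated outside of Git. The sketch below is an analogy in Python, not Git's code: Python's `re` module in `MULTILINE` mode also treats the position after the final newline as the start of a line, so matching against the whole buffer reports a phantom empty line, while matching line by line (treating the end of the buffer as the end of the final line) does not. Both function names are hypothetical.

```python
import re


def empty_lines_whole_buffer(buf: str) -> list:
    """Match ^$ against the whole buffer; return 1-based line numbers of matches."""
    pattern = re.compile(r"^$", re.MULTILINE)
    return [buf.count("\n", 0, m.start()) + 1 for m in pattern.finditer(buf)]


def empty_lines_per_line(buf: str) -> list:
    """Match ^$ against each line; the final newline only terminates the last line."""
    pattern = re.compile(r"^$")
    return [
        number
        for number, line in enumerate(buf.splitlines(), start=1)
        if pattern.search(line)
    ]
```

With the one-line file from the commit message (`"content\n"`), the first function reports line 2 and the second reports nothing.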
Johannes Schindelin
5834197c3c unix-socket: avoid leak when initialization fails
When a Unix socket is initialized, the current directory's path is
stored so that the cleanup code can `chdir()` back to where it was
before exit.

If the path that needs to be stored exceeds the default size of the
`sun_path` attribute of `struct sockaddr_un` (which is defined as a
108-sized byte array on Linux), a larger buffer needs to be allocated so
that it can hold the path, and it is the responsibility of the
`unix_sockaddr_cleanup()` function to release that allocated memory.

In Git's CI, this extra allocation is not necessary because the code is
checked out to `/home/runner/work/git/git`. Concatenate the path
`t/trash directory.t0301-credential-cache/.cache/git/credential/socket`
and a terminating NUL, and you end up with 96 bytes, 12 shy of the
default `sun_path` size.
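The byte count in the paragraph above can be re-derived with a few lines of Python. This is only a worked check of the arithmetic, assuming the two path components are joined by a single `/` and followed by one terminating NUL byte; the variable names are illustrative.

```python
SUN_PATH_SIZE = 108  # size of `sun_path` on Linux, as stated in the commit message

checkout = "/home/runner/work/git/git"
socket_rel = "t/trash directory.t0301-credential-cache/.cache/git/credential/socket"

# Assumption: one "/" separator between the parts, plus one terminating NUL.
needed = len(checkout) + len("/") + len(socket_rel) + 1
headroom = SUN_PATH_SIZE - needed
```

Here `needed` evaluates to 96 and `headroom` to 12, matching the figures quoted in the message.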

However, I use worktrees with slightly longer paths:
`/home/me/projects/git/yes/i/nest/worktrees/to/organize/them/` is more
in line with what I have. When I recently tried to locally reproduce a
failure of the `linux-leaks` CI job, this t0301 test failed (where it
had not failed in CI).

The reason: When `credential-cache` tries to reach its daemon initially
by calling `unix_sockaddr_init()`, it is expected that the daemon cannot
be reached (the idea is to spin up the daemon in that case and try
again). However, when this first call to `unix_sockaddr_init()` fails,
the code returns early from the `unix_stream_connect()` function
_without_ giving the cleanup code a chance to run, skipping the
deallocation of above-mentioned path.

The fix is easy: do not return early but instead go directly to the
cleanup code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
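The fix in the unix-socket commit above follows a general pattern: every exit path, including the failure path, has to pass through the cleanup code. As a loose analogy (Git's code is C and uses its own cleanup function; nothing below is taken from it), the same idea looks like this in Python with `try`/`finally`. The function name `connect_with_cleanup` and the `connect` callback are hypothetical.

```python
import os


def connect_with_cleanup(path: str, connect, log: list) -> bool:
    """Run `connect(path)` and always restore the saved working directory."""
    saved_cwd = os.getcwd()  # state that the cleanup step must undo
    try:
        connect(path)  # may raise, e.g. when no daemon is listening yet
        return True
    except OSError:
        return False  # the failure path still runs the `finally` block
    finally:
        os.chdir(saved_cwd)
        log.append("cleanup")
```

An early `return False` placed before the `try` block would be the Python counterpart of the bug: the cleanup would be skipped on exactly the path that is expected to fail first.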
Johannes Schindelin
85d1af077e Merge branch 'prevent-accidental-ntlm-exfiltration-via-symlinks'
This merges the fix for CVE-2026-32631 into the v2.53.x release branch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
Johannes Schindelin
f36839b55d Merge branch 'fix-ci'
This fixes two issues, one specific to running CI for embargoed releases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
Johannes Schindelin
ac153f71dd mingw: skip symlink type auto-detection for network share targets
On Windows, symbolic links come in two flavors: file symlinks and
directory symlinks.  Since Git was born on Linux where this distinction
does not exist, Git for Windows has to auto-detect the type by looking
at the target.  When the target does not yet exist at symlink creation
time, Git for Windows creates a "phantom" file symlink and later, once
checkout is complete, calls `CreateFileW()` on the target to check
whether it is actually a directory.

If the symlink target is a UNC path (e.g. `\\attacker\share`), this
auto-detection triggers an SMB connection to the remote host.  Windows
performs NTLM authentication by default for such connections, which
means a crafted repository can exfiltrate the cloning user's NTLMv2
hash to an attacker-controlled server without any user interaction
beyond `git clone -c core.symlinks=true <url>`.

There are ways to specify UNC paths that start with only a single
backslash (e.g. `\??\UNC\host\share`); all of them do start like
that, though, so let's use that as a tell-tale that we should skip
the auto-detection in `process_phantom_symlink()`. The symlink is
then left as a file symlink (the `mklink` default), and a warning is
emitted suggesting the user set the `symlink` gitattribute to `dir`
if a directory symlink is needed.  When the attribute is already set,
auto-detection is never invoked in the first place, so that code path
is unaffected.

This is the same class of vulnerability as CVE-2025-66413
(https://github.com/git-for-windows/git/security/advisories/GHSA-hv9c-4jm9-jh3x)
and follows the same general mitigation pattern that MinTTY adopted for
ANSI escape sequences referencing network share paths
(https://github.com/mintty/mintty/security/advisories/GHSA-jf4m-m6rv-p6c5).

Note that there are legitimate paths starting with a single backslash
that are _not_ network paths: drive-less absolute paths are interpreted
as relative to the current working directory's drive. In practice, these
are highly uncommon (and brittle, just one working directory change
away from breaking). In any case, the only consequence is that the
symlink type of those now has to be specified via Git attributes.

Reported-by: Justin Lee <jessdhoctor@gmail.com>
Addresses: CVE-2026-32631
Addresses: https://github.com/git-for-windows/git/security/advisories/GHSA-9j5h-h4m7-85hx
Assisted-by: Claude Opus 4.6
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
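The tell-tale used by the mingw commit above is deliberately simple: a symlink target that begins with a backslash is treated as a potential network path, and auto-detection is skipped. A minimal sketch of that predicate in Python follows; the function name is hypothetical, and the real check lives in Git for Windows' C code, which may differ in detail.

```python
def skips_symlink_auto_detection(target: str) -> bool:
    """True when the target starts with a backslash (the tell-tale from the commit)."""
    return target.startswith("\\")
```

As the commit message notes, this also catches drive-less absolute paths such as `\Users\me`, which are not network paths; for those, the symlink type has to be given via Git attributes.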
Johannes Schindelin
9f0ca65c06 ci(dockerized): reduce the PID limit for private repositories
Every once in a while I need to verify that Microsoft Git's test suite
passes for changes that are not yet meant for public consumption, and
since it was (made) too difficult to keep up a working Azure Pipeline
definition, I have to use GitHub Actions in a private GitHub repository
for that purpose.

In these tests, basically all Dockerized CI jobs fail consistently. The
symptom is something like:

  error: cannot create async thread: Resource temporarily unavailable

in the middle of a test, typically in the t5xxx-t6xxx range. The first
such error is immediately followed by plenty more of these errors, and
not a single test succeeds afterwards.

At first, I thought that maybe the massive parallelism I enjoy there is
the problem, and I thought that the cgroups limits might be shared
between the many containers that run on essentially the same physical
machine. But even reducing the matrix to just a single of those
Dockerized jobs runs into the very same problems.

The underlying reason seems to be a substantial difference in the hosted
runners that execute these Dockerized jobs: forcing the PID limit of the
container to a high number lets the jobs pass, even when running the
complete matrix of all 13 Dockerized jobs concurrently. But that's not
the only difference: The jobs seem to take a lot longer in these
containers than, say, in the containers made available to
https://github.com/git/git.

When forcing a PID limit of 64k in that private repository, the jobs
completed successfully, but they also took a lot longer, between 2x and
2.5x longer, i.e. painfully much longer. Reducing the PID limit to 16k,
the CI jobs still passed, but took an equally long amount of time.
Reducing the PID limit to 8k caused the errors to reappear.

Here are the numbers from three example runs, the first one forcing the
PID and nproc limit to 65536, the second one to 16384, the third run is
from the public git/git repository:

Job                           | 64k     | 16k     | reference
------------------------------|---------|---------|---------
almalinux-8                   | 19m 3s  | 16m 0s  | 9m 36s
debian-11                     | 20m 31s | 20m 3s  | 8m 5s
fedora-breaking-changes-meson | 16m 29s | 19m 19s | 9m 40s
linux-asan-ubsan              | 1h 10m  | 1h 11m  | 34m 36s
linux-breaking-changes        | 25m 39s | 25m 58s | 13m 15s
linux-leaks                   | 1h 9m   | 1h 10m  | 33m 30s
linux-meson                   | 28m 9s  | 27m 4s  | 13m 45s
linux-musl-meson              | 16m 32s | 13m 39s | 8m 6s
linux-reftable-leaks          | 1h 13m  | 1h 13m  | 34m 34s
linux-reftable                | 26m 2s  | 25m 48s | 13m 31s
linux-sha256                  | 26m 12s | 26m 3s  | 12m 36s
linux-TEST-vars               | 26m 5s  | 25m 21s | 13m 25s
linux32                       | 21m 16s | 19m 57s | 10m 44s

It does not look as if the PID limit is the reason for the longer
runtime, seeing as the 64k vs 16k timings deviate no more than as is
usual with GitHub workflows. So let's go for 16k.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:40 +00:00
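The slowdown factors mentioned in the CI commit above can be recomputed from its timing table. The sketch below copies three rows from that table and computes the ratio of the 64k run to the reference run; the helper name `to_seconds` and the choice of rows are illustrative assumptions.

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text: str) -> int:
    """Convert a duration such as '1h 10m' or '19m 3s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([hms])", text))


# (64k, 16k, reference) timings copied from the table in the commit message.
rows = {
    "almalinux-8": ("19m 3s", "16m 0s", "9m 36s"),
    "debian-11": ("20m 31s", "20m 3s", "8m 5s"),
    "linux-leaks": ("1h 9m", "1h 10m", "33m 30s"),
}

slowdown = {
    job: to_seconds(limit_64k) / to_seconds(reference)
    for job, (limit_64k, _limit_16k, reference) in rows.items()
}
```

For these three rows the ratios fall roughly between 2x and 2.5x, in line with the range given in the message.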
Git for Windows Build Agent
230e9cdd7d Start the merging-rebase to upstream/next
This commit starts the rebase of ea85ec817e to 8c84645362
2026-06-10 03:07:38 +00:00
Junio C Hamano
8c84645362 Sync with 'master'
2026-06-09 10:11:25 +09:00
Junio C Hamano
40de2e7b90 Merge branch 'ak/typofixes' into next
Typofixes.

* ak/typofixes:
  doc: fix typos via codespell
2026-06-09 10:11:13 +09:00
Junio C Hamano
3d0b057aee Merge branch 'ob/more-repo-config-values' into next
Many core configuration variables have been migrated from global
variables into 'repo_config_values' to tie them to a specific
repository instance, avoiding cross-repository state leakage.

* ob/more-repo-config-values:
  environment: move "warn_on_object_refname_ambiguity" into `struct repo_config_values`
  environment: move "sparse_expect_files_outside_of_patterns" into `struct repo_config_values`
  environment: move "core_sparse_checkout_cone" into `struct repo_config_values`
  environment: move "precomposed_unicode" into `struct repo_config_values`
  environment: move "pack_compression_level" into `struct repo_config_values`
  environment: move `zlib_compression_level` into `struct repo_config_values`
  environment: move "check_stat" into `struct repo_config_values`
  environment: move "trust_ctime" into `struct repo_config_values`
2026-06-09 10:11:13 +09:00
Junio C Hamano
aeaf2363f8 Merge branch 'am/doc-tech-hash-typofix' into next
Typofix.

* am/doc-tech-hash-typofix:
  doc: fix typo in GIT_ALTERNATE_OBJECT_DIRECTORIES
2026-06-09 10:11:12 +09:00
Junio C Hamano
58b2a20f6d Merge branch 'lo/doc-format-patch-subject-prefix' into next
Wording used in "format-patch --subject-prefix" documentation
has been improved.

* lo/doc-format-patch-subject-prefix:
  Documentation: remove redundant 'instead' in --subject-prefix
2026-06-09 10:11:12 +09:00
Junio C Hamano
a1f23cb38c Merge branch 'ps/setup-centralize-odb-creation' into next
The setup logic to discover and configure repositories has been
refactored, and the initialization of the object database has been
centralized.

* ps/setup-centralize-odb-creation:
  setup: construct object database in `apply_repository_format()`
  repository: stop reading loose object map twice on repo init
  setup: stop initializing object database without repository
  setup: stop creating the object database in `setup_git_env()`
  repository: stop initializing the object database in `repo_set_gitdir()`
  setup: deduplicate logic to apply repository format
  setup: drop `setup_git_env()`
  t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
2026-06-09 10:11:12 +09:00
Junio C Hamano
5149e69e3e Merge branch 'hn/config-typo-advice' into next
"git config foo.bar=baz" is not likely to be a request to read the
value of such a variable with '=' in its name; rather it is plausible
that the user meant "git config set foo.bar baz".  Give advice when
giving an error message.

* hn/config-typo-advice:
  config: improve diagnostic for "set" with missing value
  config: add git_config_key_is_valid() for quiet validation
2026-06-09 10:11:12 +09:00
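The advice described in the `hn/config-typo-advice` merge above amounts to spotting a `key=value` argument and suggesting the `set` form. A hedged sketch of that logic in Python is shown below; the function name `config_set_hint` is hypothetical, and the wording and conditions of Git's real diagnostic are not reproduced here.

```python
import shlex
from typing import Optional


def config_set_hint(arg: str) -> Optional[str]:
    """Suggest the `git config set` form for an argument shaped like key=value."""
    key, sep, value = arg.partition("=")
    if not sep or not key:
        return None  # no '=' present, or nothing before it
    return "git config set {} {}".format(key, shlex.quote(value))
```

For the example from the merge message, `config_set_hint("foo.bar=baz")` yields the suggested command `git config set foo.bar baz`.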
Junio C Hamano
076600fa21 Merge branch 'mf/revision-max-count-oldest' into next
"git rev-list" (and the "git log" family of commands) learned a new
"--max-count-oldest" option that picks the oldest N commits in the range
instead of the usual newest N.

* mf/revision-max-count-oldest:
  revision.c: implement --max-count-oldest
2026-06-09 10:11:11 +09:00
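The difference between the usual count limit and the new `--max-count-oldest` from the merge above can be pictured on a plain list. This is a conceptual sketch only, assuming the range is already available as a list ordered newest first; it says nothing about how `revision.c` implements the option or in which order Git prints the result.

```python
def newest_n(commits_newest_first: list, n: int) -> list:
    """The usual behaviour: keep the first (newest) n commits."""
    return commits_newest_first[:n]


def oldest_n(commits_newest_first: list, n: int) -> list:
    """The behaviour described for --max-count-oldest: keep the oldest n commits."""
    return commits_newest_first[-n:] if n > 0 else []
```

The explicit `n > 0` guard matters in this sketch because `some_list[-0:]` in Python returns the whole list.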
Junio C Hamano
7198b6bb9d Merge branch 'ls/doc-raw-timestamp-prefix' into next
Documentation and tests have been added to clarify that Git's internal
raw timestamp format requires a `@` prefix for values less than
100,000,000 to prevent ambiguity with other formats like YYYYMMDD.

* ls/doc-raw-timestamp-prefix:
  doc: document and test `@` prefix for raw timestamps
2026-06-09 10:11:11 +09:00
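The rule stated in the `ls/doc-raw-timestamp-prefix` merge above can be expressed as a small decision function. The sketch below is an illustration in Python under stated assumptions: it handles only the seconds value (no timezone part), the name `parse_raw_timestamp` is hypothetical, and it is not Git's date parser.

```python
from typing import Optional

AMBIGUITY_LIMIT = 100_000_000  # threshold quoted in the merge message


def parse_raw_timestamp(text: str) -> Optional[int]:
    """Return the seconds value, or None when the input is not an unambiguous raw timestamp."""
    if text.startswith("@"):
        digits = text[1:]
        return int(digits) if digits.isascii() and digits.isdigit() else None
    if text.isascii() and text.isdigit() and int(text) >= AMBIGUITY_LIMIT:
        return int(text)
    return None  # small bare numbers could be another format, e.g. YYYYMMDD
```

Under this rule a bare `20260609` is rejected as ambiguous, while `@20260609` is accepted as a raw timestamp.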
Junio C Hamano
42b2538a2a Merge branch 'jc/submitting-patches-cover-letter' into next
Guidelines on how to write a cover letter for a multi-patch series
have been added to SubmittingPatches, which also got a new marker
to separate the section for typofixes.

* jc/submitting-patches-cover-letter:
  SubmittingPatches: describe cover letter
  SubmittingPatches: separate typofixes section
2026-06-09 10:11:11 +09:00
Junio C Hamano
1ff279f340 The 13th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-09 10:04:51 +09:00
Junio C Hamano
18b6502b3a Merge branch 'jc/doc-monitor-ghci'
Encourage original authors to monitor the CI status.

* jc/doc-monitor-ghci:
  SubmittingPatches: proactively monitor GHCI pages
2026-06-09 10:04:51 +09:00
Junio C Hamano
4d96a1280b Merge branch 'ib/doc-push-default-simple'
The documentation for `push.default = simple` has been clarified to
better explain its behavior, making it clear that it pushes the
current branch to a same-named branch on the remote, and detailing
the upstream requirements for centralized workflows.

* ib/doc-push-default-simple:
  doc: clarify push.default=simple behavior
2026-06-09 10:04:51 +09:00
Junio C Hamano
a58e51dddf Merge branch 'gh/jump-auto-mode'
The 'git-jump' command (in contrib/) has been taught to automatically
pick a mode (merge, diff, or ws) when invoked without arguments.

* gh/jump-auto-mode:
  git-jump: pick a mode automatically when invoked without arguments
2026-06-09 10:04:51 +09:00
Junio C Hamano
2fd113ae07 Merge branch 'rs/strbuf-add-oid-hex'
Formatting object name in full hexadecimal form has been optimized
by using a new strbuf_add_oid_hex() helper function.

* rs/strbuf-add-oid-hex:
  hex: add and use strbuf_add_oid_hex()
2026-06-09 10:04:50 +09:00
Junio C Hamano
7eaa3c82a8 Merge branch 'rs/strbuf-add-uint'
Calls that add a decimal integer with strbuf_addf("%u") are common;
they have been optimized by using a custom formatter.

* rs/strbuf-add-uint:
  ls-tree: use strbuf_add_uint()
  ls-files: use strbuf_add_uint()
  cat-file: use strbuf_add_uint()
  strbuf: add strbuf_add_uint()
2026-06-09 10:04:50 +09:00
Junio C Hamano
2c677d20b6 Merge branch 'ua/push-remote-group'
"git push" learned to take a "remote group" name to push to, which
causes pushes to multiple places, just like "git fetch" would do.

* ua/push-remote-group:
  push: support pushing to a remote group
  remote: move remote group resolution to remote.c
  remote: fix sign-compare warnings in push_cas_option
2026-06-09 10:04:50 +09:00
Junio C Hamano
fca09c8fc2 Merge branch 'th/promisor-quiet-per-repo'
The "promisor.quiet" configuration variable was not used from
relevant submodules when commands like "grep --recurse-submodules"
triggered a lazy fetch, which has been corrected.

* th/promisor-quiet-per-repo:
  promisor-remote: fix promisor.quiet to use the correct repository
2026-06-09 10:04:50 +09:00
Junio C Hamano
1c0af131cc Merge branch 'tb/bitmap-build-performance'
Reachability bitmap generation has been significantly optimized. By
reordering tree traversal, caching object positions, and refining how
pseudo-merge bitmaps are constructed, the performance of "git repack
--write-midx-bitmaps" is improved, especially for large repositories
and when using pseudo-merges.

* tb/bitmap-build-performance:
  pack-bitmap: build pseudo-merge bitmaps after regular bitmaps
  pack-bitmap: remember pseudo-merge parents
  pack-bitmap: sort bitmaps before XORing
  pack-bitmap: cache object positions during fill
  pack-bitmap: consolidate `find_object_pos()` success path
  pack-bitmap: reuse stored selected bitmaps
  pack-bitmap: check subtree bits before recursing
  pack-bitmap: pass object position to `fill_bitmap_tree()`
2026-06-09 10:04:49 +09:00
Johannes Schindelin
33d828c333 entry: flush fscache after creating directories and writing files (#6250)
## Problem

`git checkout <tree> -- <pathspec>` with `checkout.workers > 1` and
`core.fscache=true` fails when restoring files into directories that do
not yet exist on disk. Two failure modes:

1. `fatal: cannot create directory at '...': Directory not empty` (exit
128)
2. `error: unable to stat just-written file '...'` (exit 255)

100% reproducible when two or more files share a not-yet-created parent
directory.

## Root Cause

The Windows fscache caches directory listings that become stale when
`create_directories()` creates new parent directories via `mkdir()` or
when `write_pc_item()` writes new files. With `workers=1`,
`write_entry()` calls `flush_fscache()` after each file, keeping the
cache in sync. With `workers>1`, `enqueue_checkout()` defers the write
(and the flush), leaving the cache stale for subsequent entries.

## Fix

Add `flush_fscache()` calls:
- In `create_directories()` after each successful `mkdir()`, so
`has_dirs_only_path()` sees the new directory
- In `write_pc_item()` before `lstat()` of the just-written file

On non-Windows platforms `flush_fscache()` is a no-op.

## Test

Adds a regression test to `t2080-parallel-checkout-basics.sh` (`MINGW`
prereq) that deterministically reproduces the bug: two files sharing a
nested parent directory, deleted in a second commit, then restored via
`git checkout <tree> -- <pathspec>` with `workers=2`.
2026-06-08 12:59:24 +02:00
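The root cause described in the fscache commit above, a cached directory listing that goes stale once a new directory is created, can be reproduced in miniature. The following Python sketch uses a toy cache class; `ListingCache` and `make_dir` are hypothetical names and only mimic the idea of flushing after `mkdir()`, not the Windows fscache itself.

```python
import os


class ListingCache:
    """Caches os.listdir() results per directory (a much simplified fscache)."""

    def __init__(self):
        self._listings = {}

    def exists(self, path: str) -> bool:
        parent, name = os.path.split(path)
        if parent not in self._listings:
            self._listings[parent] = set(os.listdir(parent))
        return name in self._listings[parent]

    def flush(self) -> None:
        self._listings.clear()


def make_dir(cache: ListingCache, path: str, flush_after_mkdir: bool) -> None:
    """Create a directory; optionally flush the cache so later lookups see it."""
    os.mkdir(path)
    if flush_after_mkdir:
        cache.flush()
```

When `exists()` has already cached the parent listing, a later `make_dir(..., flush_after_mkdir=False)` leaves the cache claiming the new directory is absent; with the flush, the next lookup re-reads the parent and finds it.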
Junio C Hamano
3af1d1dc61 Sync with 'master'
2026-06-08 00:35:52 +09:00
Andrew Kreimer
014c454799 doc: fix typos via codespell
There are some typos in the documentation, comments, etc.
Fix them via codespell, and then adjust the "dump" files
used by the subversion tests to match the updated contents.

Signed-off-by: Andrew Kreimer <algonell@gmail.com>
[dscho noticed and fixed the problems in svn test]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc did final assembling of the three patches]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-08 00:21:35 +09:00
Junio C Hamano
600fe74302 The 12th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-07 23:58:25 +09:00
Junio C Hamano
212d25596d Merge branch 'ja/doc-synopsis-style-again'
A batch of documentation pages has been updated to use the modern
synopsis style.

* ja/doc-synopsis-style-again:
  doc: convert git-imap-send synopsis and options to new style
  doc: convert git-apply synopsis and options to new style
  doc: convert git-am synopsis and options to new style
  doc: convert git-grep synopsis and options to new style
  doc: git bisect: clarify the usage of the synopsis vs actual command
  doc: convert git-bisect to synopsis style
2026-06-07 23:58:25 +09:00
Junio C Hamano
6390da42c7 Merge branch 'kk/commit-reach-optim'
The check for non-stale commits in the priority queue used by
`paint_down_to_common` and `ahead_behind` has been optimized by
replacing an O(N) scan with an O(1) counter, yielding performance
improvements in repositories with wide histories.

* kk/commit-reach-optim:
  commit-reach: replace queue_has_nonstale() scan with O(1) tracking
  commit-reach: deduplicate queue entries in paint_down_to_common
  object.h: fix stale entries in object flag allocation table
2026-06-07 23:58:25 +09:00
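The optimization summarized in the `kk/commit-reach-optim` merge above replaces a scan over the queue with a counter that is updated on every push and pop. A minimal sketch of that idea in Python follows. It assumes that an entry's stale flag is fixed at the moment it is pushed, which is a simplification; the class and method names are hypothetical and the code is not a port of `commit-reach.c`.

```python
import heapq


class StaleAwareQueue:
    """Max-priority queue that tracks how many queued entries are non-stale."""

    def __init__(self):
        self._heap = []
        self._seq = 0       # tie-breaker so items themselves are never compared
        self._nonstale = 0  # maintained incrementally on push and pop

    def push(self, priority: int, item, stale: bool) -> None:
        heapq.heappush(self._heap, (-priority, self._seq, item, stale))
        self._seq += 1
        if not stale:
            self._nonstale += 1

    def pop(self):
        _, _, item, stale = heapq.heappop(self._heap)
        if not stale:
            self._nonstale -= 1
        return item

    def has_nonstale(self) -> bool:
        """O(1): consult the counter."""
        return self._nonstale > 0

    def has_nonstale_scan(self) -> bool:
        """O(N): the kind of scan the counter replaces."""
        return any(not entry[3] for entry in self._heap)
```

Both `has_nonstale()` and `has_nonstale_scan()` should always agree; the counter variant simply avoids walking the queue on every check, which matters when the queue is wide.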
Junio C Hamano
de5383c2ce Merge branch 'aj/stash-patch-optimize-temporary-index'
"git stash -p" has been optimized by reusing cached index
entries in its temporary index, avoiding unnecessary lstat()
calls on unchanged files.

* aj/stash-patch-optimize-temporary-index:
  stash: reuse cached index entries in --patch temporary index
2026-06-07 23:58:25 +09:00
Junio C Hamano
92b870a675 Merge branch 'kh/free-commit-list'
Code clean-up.

* kh/free-commit-list:
  commit: remove deprecated functions
  *: replace deprecated free_commit_list
2026-06-07 23:58:24 +09:00
Junio C Hamano
7450009e6f Merge branch 'ds/restore-sparse-index'
'git restore --staged' has been optimized to avoid unnecessarily expanding
the sparse index when operating on paths within the sparse checkout
definition, by handling sparse directory entries at the tree level.

* ds/restore-sparse-index:
  restore: avoid sparse index expansion
  t1092: test 'git restore' with sparse index
2026-06-07 23:58:24 +09:00
Junio C Hamano
17204228cf Merge branch 'ar/receive-pack-worktree-env'
The GIT_WORK_TREE variable prepared to invoke the push-to-checkout
hook was leaking into the environment even when there was no hook
used and broke the default push-to-deploy (i.e., let "git checkout"
update the working tree only when the working tree is clean).

* ar/receive-pack-worktree-env:
  receive-pack: fix updateInstead with core.worktree
2026-06-07 23:58:24 +09:00
Alexander Monakov
d1b72b29e9 doc: fix typo in GIT_ALTERNATE_OBJECT_DIRECTORIES
One file accidentally spelled GIT_ALTERNATE_OBJECT_DIRECTORIES with
REPOSITORIES instead of DIRECTORIES. Fix the typo.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-06 11:04:03 +09:00
Lucas Seiki Oshiro
4a1eb9304a Documentation: remove redundant 'instead' in --subject-prefix
The documentation for --subject-prefix has two words "instead" in
the same sentence, making it a little bit confusing to read.

Change the order of the phrase to a more natural "Use [...]
instead of [...]" structure.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 22:44:36 +09:00
Patrick Steinhardt
42b9d3dc9d setup: construct object database in apply_repository_format()
With the preceding changes we now always construct the repository's
object database before applying the repository format. Remove this
duplication by constructing it in `apply_repository_format()` instead.

Note that we create the object database _after_ having set up the
repository's hash algorithm, but _before_ setting the compat hash
algorithm. This is intentional:

  - Constructing the object database may require knowledge of its
    intended object format.

  - Setting up the compatibility hash requires the object database to be
    initialized already, because we immediately read the loose object
    map.

The first point is sensible, the second maybe a little less so. Ideally,
it should be the responsibility of the object database itself to
initialize any data structures required for the compatibility hash. But
this would require further changes, so this is kept as-is for now.

Further note that this requires us to move handling of the environment
variables GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES into
the repository format, as well. This allows the caller more flexibility
around whether or not those environment variables are being honored, as
we want to respect them in "setup.c", but not in "repository.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:39 +09:00
Patrick Steinhardt
a84a9d4acd repository: stop reading loose object map twice on repo init
When initializing a repository via `repo_init()` we end up reading the
loose object map twice:

  - `apply_repository_format()` calls `repo_set_compat_hash_algo()`,
    which in turn calls `repo_read_loose_object_map()` if we have a
    compatibility hash configured.

  - `repo_init()` calls `repo_read_loose_object_map()` directly a second
    time.

Drop the second read of the loose object map in `repo_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:39 +09:00
Patrick Steinhardt
d87de311ff setup: stop initializing object database without repository
The function `setup_git_directory_gently()` is responsible for
discovering and setting up a Git repository based on various environment
variables and the current working directory. The result is thus a fully
usable Git repository.

One oddity of this function is that we may set up the object database
even in the case where we don't have a repository, namely in the case
where the `GIT_DIR_EXPLICIT` environment variable is set but points to a
non-existent repository. If so, we call `setup_git_env_internal()` with
the value of the environment variable so that the repository's Git
directory is configured, even if it points to a non-existent directory.

Historically though, this function didn't only configure the repository,
but also initialized the object database. We retained this behaviour
from a preceding commit, even though it really doesn't make much sense
in the first place -- there is no repository, so we don't have an object
database either. There seemingly isn't much of a reason to construct the
object database, as we typically won't try to read objects when we don't
have an object database.

There's one exception though: git-index-pack(1) may run outside of a
repository, which can be used to perform consistency checks for a
packfile. The code path is _almost_ working: we already know to call
`parse_object_buffer()`, which can read objects without an object
database being available. And that works for all object types except for
commits, because `parse_commit_buffer()` calls `parse_commit_graph()`,
and that function doesn't handle the case where we don't have an object
database.

Fix this instance to check for the object database instead of checking
for the Git directory having been initialized. With this fixed, we can
now stop constructing an object database completely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
aae4ebc895 setup: stop creating the object database in setup_git_env()
In the preceding commit we have stopped creating the object database in
`repo_set_gitdir()`. But the logic is still somewhat confusing as we
still end up creating it conditionally in `setup_git_dir()`, which is
called multiple times.

Drop the conditional logic and instead create the object database in all
places where we have discovered and configured a repository.

This leads to even more duplication than we already had in the preceding
commit, but an alert reader may notice that we now (almost) always call
`odb_new()` directly before having called `apply_repository_format()`.
The only exception to this is `setup_git_directory_gently()`, where we
also call the function when _not_ applying the repository format. This
will be fixed in the next commit, and once that's done we can then unify
creation of the object database into `apply_repository_format()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
6a2fbab4c9 repository: stop initializing the object database in repo_set_gitdir()
The function `repo_set_gitdir()` obviously sets the Git directory for a
given repository. Less obviously though, the function also configures a
couple of auxiliary settings.

One such thing is that we create the object database in this function.
This logic only happens conditionally though, as `set_git_dir()` may be
called multiple times during repository setup, and we don't want to
create the object database multiple times. This is somewhat tangled and
hard to follow.

Remove the logic from `repo_set_gitdir()` and instead initialize the
object database outside of it. This leads to some duplication right now,
but that duplication will be removed in a subsequent step where we will
start initializing the object database as part of applying the repo's
format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
3d884b0b56 setup: deduplicate logic to apply repository format
After having discovered the repository format we then apply it to the
repository so that it knows to use the proper repository extensions. The
logic to apply the format is duplicated across three callsites, which
makes it rather painful to add new extensions.

Introduce a new function `apply_repository_format()` that takes a repo
and applies a given format to it and adapt all callsites to use it.
This function is also the new caller of `verify_repository_format()` so
that we can ensure that we never apply an invalid repository format.
The verification we have in `read_and_verify_repository_format()` is
thus redundant now and dropped.

Rename `read_and_verify_repository_format()` accordingly. While at it,
also rename `check_repository_format()` to clarify that it doesn't only
_check_ the format, but that it also applies it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
452ad8db6d setup: drop setup_git_env()
The `setup_git_env()` function is a trivial wrapper around
`setup_git_env_internal()` and has a single call site only. Drop the
function.

While at it, drop stale documentation in "environment.h" that points to
this function, even though it has not been exposed to callers outside of
"setup.c" since 43ad1047a9 (setup: stop using `the_repository` in
`setup_git_env()`, 2026-03-27).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
027e3b3d38 t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
In subsequent commits we'll rework how we set up the repository. This is
a somewhat intricate and thus fragile sequence; there are many things that
can go subtly wrong, and there are lots of interesting interactions that
one can discover.

One such discovered edge case was the interaction between git-init(1)
and the "GIT_OBJECT_DIRECTORY" environment variable. When set, the
behaviour is that the object directory should be created at the path
that the variable points to. This behaviour is documented as such in
its man page:

  If the object storage directory is specified via the
  GIT_OBJECT_DIRECTORY environment variable then the sha1 directories
  are created underneath; otherwise, the default $GIT_DIR/objects
  directory is used.

Curiously enough though we don't seem to have any tests that exercise
this directly, and thus a subsequent commit inadvertently would have
broken this expectation.

Plug this test gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00