Commit Graph

562 Commits

Author SHA1 Message Date
Johannes Schindelin
392cdcf6ea Introduce helper to create symlinks that knows about index_state
On Windows, symbolic links actually have a type depending on the target:
it can be a file or a directory.

In certain circumstances, this poses problems, e.g. when a symbolic link
is supposed to point into a submodule that is not checked out, so there
is no way for Git to auto-detect the type.

To help with that, we will add support over the course of the next
commits to specify that symlink type via the Git attributes. This
requires an index_state, though, something that Git for Windows'
`symlink()` replacement cannot know about because the function signature
is defined by the POSIX standard and not ours to change.

So let's introduce a helper function to create symbolic links that
*does* know about the index_state.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2026-06-10 03:07:46 +00:00
Derrick Stolee
2f96c8bbc3 setup: properly use "%(prefix)/" when in WSL
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
2026-06-10 03:07:42 +00:00
Junio C Hamano
a1f23cb38c Merge branch 'ps/setup-centralize-odb-creation' into next
The setup logic to discover and configure repositories has been
refactored, and the initialization of the object database has been
centralized.

* ps/setup-centralize-odb-creation:
  setup: construct object database in `apply_repository_format()`
  repository: stop reading loose object map twice on repo init
  setup: stop initializing object database without repository
  setup: stop creating the object database in `setup_git_env()`
  repository: stop initializing the object database in `repo_set_gitdir()`
  setup: deduplicate logic to apply repository format
  setup: drop `setup_git_env()`
  t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY
2026-06-09 10:11:12 +09:00
Patrick Steinhardt
42b9d3dc9d setup: construct object database in apply_repository_format()
With the preceding changes we now always construct the repository's
object database before applying the repository format. Remove this
duplication by constructing it in `apply_repository_format()` instead.

Note that we create the object database _after_ having set up the
repository's hash algorithm, but _before_ setting the compat hash
algorithm. This is intentional:

  - Constructing the object database may require knowledge of its
    intended object format.

  - Setting up the compatibility hash requires the object database to be
    initialized already, because we immediately read the loose object
    map.

The first point is sensible, the second maybe a little less so. Ideally,
it should be the responsibility of the object database itself to
initialize any data structures required for the compatibility hash. But
this would require further changes, so this is kept as-is for now.

Further note that this requires us to move handling of the environment
variables GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES into
the repository format, as well. This allows the caller more flexibility
around whether or not those environment variables are being honored, as
we want to respect them in "setup.c", but not in "repository.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:39 +09:00
Patrick Steinhardt
d87de311ff setup: stop initializing object database without repository
The function `setup_git_directory_gently()` is responsible for
discovering and setting up a Git repository based on various environment
variables and the current working directory. The result is thus a fully
usable Git repository.

One oddity of this function is that we may set up the object database
even in the case where we don't have a repository, namely in the case
where the `GIT_DIR_EXPLICIT` environment variable is set but points to a
non-existent repository. If so, we call `setup_git_env_internal()` with
the value of the environment variable so that the repository's Git
directory is configured, even if it points to a non-existent directory.

Historically though, this function didn't only configure the repository,
but also initialized the object database. We retained this behaviour
from a preceding commit, even though it really doesn't make much sense
in the first place -- there is no repository, so we don't have an object
database either. There seemingly isn't much of a reason to construct the
object database, as we typically won't try to read objects when we don't
have an object database.

There's one exception though: git-index-pack(1) may run outside of a
repository, which can be used to perform consistency checks for a
packfile. The code path is _almost_ working: we already know to call
`parse_object_buffer()`, which can read objects without an object
database being available. And that works for all object types except for
commits, because `parse_commit_buffer()` calls `parse_commit_graph()`,
and that function doesn't handle the case where we don't have an object
database.

Fix this instance to check for the object database instead of checking
for the Git directory having been initialized. With this fixed, we can
now stop constructing an object database completely.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
aae4ebc895 setup: stop creating the object database in setup_git_env()
In the preceding commit we have stopped creating the object database in
`repo_set_gitdir()`. But the logic is still somewhat confusing as we
still end up creating it conditionally in `setup_git_dir()`, which is
called multiple times.

Drop the conditional logic and instead create the object database in all
places where we have discovered and configured a repository.

This leads to even more duplication than we already had in the preceding
commit, but an alert reader may notice that we now (almost) always call
`odb_new()` directly before having called `apply_repository_format()`.
The only exception to this is `setup_git_directory_gently()`, where we
also call the function when _not_ applying the repository format. This
will be fixed in the next commit, and once that's done we can then unify
creation of the object database into `apply_repository_format()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
6a2fbab4c9 repository: stop initializing the object database in repo_set_gitdir()
The function `repo_set_gitdir()` obviously sets the Git directory for a
given repository. Less obviously though, the function also configures a
couple of auxiliary settings.

One such thing is that we create the object database in this function.
This logic only happens conditionally though, as `set_git_dir()` may be
called multiple times during repository setup, and we don't want to
create the object database multiple times. This is somewhat tangled and
hard to follow.

Remove the logic from `repo_set_gitdir()` and instead initialize the
object database outside of it. This leads to some duplication right now,
but that duplication will be removed in a subsequent step where we will
start initializing the object database as part of applying the repo's
format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
3d884b0b56 setup: deduplicate logic to apply repository format
After having discovered the repository format we then apply it to the
repository so that it knows to use the proper repository extensions. The
logic to apply the format is duplicated across three callsites, which
makes it rather painfull to add new extensions.

Introduce a new function `apply_repository_format()` that takes a repo
and applies a given format to it and adapt all callsites to use it.
This function is also the new caller of `verify_repository_format()` so
that we can ensure that we never apply an invalid repository format.
The verification we have in `read_and_verify_repository_format()` is
thus redundant now and dropped.

Rename `read_and_verify_repository_format()` accordingly. While at it,
also rename `check_repository_format()` to clarify that it doesn't only
_check_ the format, but that it also applies it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Patrick Steinhardt
452ad8db6d setup: drop setup_git_env()
The `setup_git_env()` function is a trivial wrapper around
`setup_git_env_internal()` and has a single call site only. Drop the
function.

While at it, drop stale documentation in "environment.h" that points to
this function, even though it hasn't been exposed to callers outside of
"setup.c" since 43ad1047a9 (setup: stop using `the_repository` in
`setup_git_env()`, 2026-03-27) anymore.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-05 21:49:38 +09:00
Junio C Hamano
2fb444ae8a Merge branch 'ps/setup-wo-the-repository' into ps/setup-centralize-odb-creation
* ps/setup-wo-the-repository:
  setup: stop using `the_repository` in `init_db()`
  setup: stop using `the_repository` in `create_reference_database()`
  setup: stop using `the_repository` in `initialize_repository_version()`
  setup: stop using `the_repository` in `check_repository_format()`
  setup: stop using `the_repository` in `upgrade_repository_format()`
  setup: stop using `the_repository` in `setup_git_directory()`
  setup: stop using `the_repository` in `setup_git_directory_gently()`
  setup: stop using `the_repository` in `setup_git_env()`
  setup: stop using `the_repository` in `set_git_work_tree()`
  setup: stop using `the_repository` in `setup_work_tree()`
  setup: stop using `the_repository` in `enter_repo()`
  setup: stop using `the_repository` in `verify_non_filename()`
  setup: stop using `the_repository` in `verify_filename()`
  setup: stop using `the_repository` in `path_inside_repo()`
  setup: stop using `the_repository` in `prefix_path()`
  setup: stop using `the_repository` in `is_inside_work_tree()`
  setup: stop using `the_repository` in `is_inside_git_dir()`
  setup: replace use of `the_repository` in static functions
2026-05-22 09:27:58 +09:00
Junio C Hamano
9b7fa37559 Merge branch 'ps/maintenance-daemonize-lockfix' into next
"git maintenance" that goes background did not use the lockfile to
prevent multiple maintenance processes from running at the same
time, which has been corrected..

* ps/maintenance-daemonize-lockfix:
  run-command: honor "gc.auto" for auto-maintenance
  builtin/maintenance: fix locking with "--detach"
2026-05-21 15:51:33 +09:00
Junio C Hamano
d8fb5a7b3e Merge branch 'ps/setup-wo-the-repository' into next
Many uses of the_repository has been updated to use a more
appropriate struct repository instance in setup.c codepath.

* ps/setup-wo-the-repository:
  setup: stop using `the_repository` in `init_db()`
  setup: stop using `the_repository` in `create_reference_database()`
  setup: stop using `the_repository` in `initialize_repository_version()`
  setup: stop using `the_repository` in `check_repository_format()`
  setup: stop using `the_repository` in `upgrade_repository_format()`
  setup: stop using `the_repository` in `setup_git_directory()`
  setup: stop using `the_repository` in `setup_git_directory_gently()`
  setup: stop using `the_repository` in `setup_git_env()`
  setup: stop using `the_repository` in `set_git_work_tree()`
  setup: stop using `the_repository` in `setup_work_tree()`
  setup: stop using `the_repository` in `enter_repo()`
  setup: stop using `the_repository` in `verify_non_filename()`
  setup: stop using `the_repository` in `verify_filename()`
  setup: stop using `the_repository` in `path_inside_repo()`
  setup: stop using `the_repository` in `prefix_path()`
  setup: stop using `the_repository` in `is_inside_work_tree()`
  setup: stop using `the_repository` in `is_inside_git_dir()`
  setup: replace use of `the_repository` in static functions
2026-05-21 13:39:25 +09:00
Patrick Steinhardt
df69f40c34 setup: stop using the_repository in init_db()
Stop using `the_repository` in `init_db()` and instead accept
the repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
15053894cb setup: stop using the_repository in create_reference_database()
Stop using `the_repository` in `create_reference_database()` and instead
accept the repository as a parameter. The injection of `the_repository`
is thus bumped one level higher, where callers now pass it in
explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
779fbcd9eb setup: stop using the_repository in initialize_repository_version()
Stop using `the_repository` in `initialize_repository_version()` and
instead accept the repository as a parameter. The injection of
`the_repository` is thus bumped one level higher, where callers now pass
it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
602254dfb0 setup: stop using the_repository in check_repository_format()
Stop using `the_repository` in `check_repository_format()` and instead
accept the repository as a parameter. The injection of `the_repository`
is thus bumped one level higher, where callers now pass it in
explicitly.

Furthermore, the function is never used outside "setup.c". Drop its
declaration in "setup.h" and make it static. Note that this requires us
to reorder the function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
9cae7229c9 setup: stop using the_repository in upgrade_repository_format()
Stop using `the_repository` in `upgrade_repository_format()` and instead
accept the repository as a parameter. The injection of `the_repository`
is thus bumped one level higher, where callers now pass it in
explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
f9210dbc8a setup: stop using the_repository in setup_git_directory()
Stop using `the_repository` in `setup_git_directory()` and instead
accept the repository as a parameter. The injection of `the_repository`
is thus bumped one level higher, where callers now pass it in
explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:25 +09:00
Patrick Steinhardt
a80a8e3ea6 setup: stop using the_repository in setup_git_directory_gently()
Stop using `the_repository` in `setup_git_directory_gently()` and
instead accept the repository as a parameter. The injection of
`the_repository` is thus bumped one level higher, where callers now pass
it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
27b76d1862 setup: stop using the_repository in setup_git_env()
Stop using `the_repository` in `setup_git_env()` and instead accept the
repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Furthermore, the function is never used outside of "setup.c". Drop the
declaration in "environment.h" and make it static.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
7a6a82fba0 setup: stop using the_repository in set_git_work_tree()
Stop using `the_repository` in `set_git_work_tree()` and instead accept
the repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Similar as with the preceding commit, we track whether the worktree has
been initialized already via a global variable so that we can die in
case the repository is re-initialized with a different worktree path.
Store this info in the `struct repository` instead so that we correctly
handle this per repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
bd2851d84f setup: stop using the_repository in setup_work_tree()
Stop using `the_repository` in `setup_work_tree()` and instead accept
the repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Note that the function tracks two bits of information via global
variables. This of course doesn't make much sense anymore now that we
can set up worktrees for arbitrary repositories:

  - We track whether the worktree has already been initialized and, if
    so, we skip the call to `chdir_notify()` and setenv(3p). It does not
    make much sense to store this info in the repository, as we _would_
    want to update the environment when switching between worktrees back
    and forth.

    So instead of storing this info in the repository, we drop this
    state entirely and live with the fact that we may execute the logic
    twice. It should ultimately be idempotent though and thus not be
    much of a problem.

  - We track whether the worktree configuration is bogus. If so, and if
    later on some caller tries to setup the worktree, then we'll die
    instead. This is indeed information that we can move into the
    repository itself.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
ea1d0f886d setup: stop using the_repository in enter_repo()
Stop using `the_repository` in `enter_repo()` and instead accept the
repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
920dba4581 setup: stop using the_repository in verify_non_filename()
Stop using `the_repository` in `verify_non_filename()` and instead
accept the repository as a parameter. The injection of `the_repository`
is thus bumped one level higher, where callers now pass it in
explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
6e7e50cc7b setup: stop using the_repository in verify_filename()
Stop using `the_repository` in `verify_filename()` and instead accept
the repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
e6a380201e setup: stop using the_repository in path_inside_repo()
Stop using `the_repository` in `path_inside_repo()` and instead accept
the repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
2c46e933fa setup: stop using the_repository in prefix_path()
Stop using `the_repository` in `prefix_path()` and instead accept the
repository as a parameter. The injection of `the_repository` is thus
bumped one level higher, where callers now pass it in explicitly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:24 +09:00
Patrick Steinhardt
8da5ecdb4d setup: stop using the_repository in is_inside_work_tree()
Similar as with the preceding commit, `is_inside_work_tree()` determines
whether the current working directory is located inside the worktree of
`the_repository`. Perform the same refactoring by dropping the caching
mechanism and injecting the repository that shall be checked.

Note that, same as in the preceding commit, we're also resolving the
worktree path via `realpath()`. In theory this step is not necessary as
we always set the worktree path via `repo_set_worktree()`, and that
function already resolves the path for us. But resolving the path a
second time is unlikely to matter performance-wise, and it feels fragile
to rely on the repository's worktree path being absolute. We thus
perform the same extra step even though it's ultimately not required.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:23 +09:00
Patrick Steinhardt
ce70cbc294 setup: stop using the_repository in is_inside_git_dir()
The function `is_inside_git_dir()` verifies whether or not the current
working directory is located inside the gitdir of `the_repository`. This
is done by taking the gitdir path and verifying that it's a prefix of
the current working directory.

This information is cached so that we don't have to re-do this change
multiple times. Furthermore, we proactively set the value in multiple
locations so that we don't even have to perform the check when we have
discovered the repository.

While we could simply move the caching variable into the repository, the
current layout doesn't really feel sensible in the first place:

  - It can easily lead to false positives or negatives if at any point
    in time we may switch the current working directory.

  - We don't call the function in a hot loop, and neither is it overly
    expensive to compute.

Drop the caching infrastructure and instead compute the property ad-hoc
via an injected repository.

Note that there is one small gotcha: we often end up with relative
gitdir paths, and if so `is_inside_dir()` might fail. This wasn't an
issue before because of how we proactively set the cached value during
repository discovery. Now that we stop doing that it becomes a problem
though, which we work around by resolving the gitdir via `realpath()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:23 +09:00
Patrick Steinhardt
bb50ec6b26 setup: replace use of the_repository in static functions
Replace the use of `the_repository` in "setup.c" for all static
functions. For now, we simply add `the_repository` to invocations of
these functions. This will be addressed in subsequent commits, where
we'll move up `the_repository` one more layer to callers of "setup.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-19 19:36:23 +09:00
Patrick Steinhardt
6e95b07e5f builtin/maintenance: fix locking with "--detach"
When running git-maintenance(1), we create a lockfile that is supposed
to keep other maintenance processes from running at the same time. This
lockfile is broken though in case the "--detach" flag is passed: the
lockfile is created by the parent process and will be cleaned up either
manually or on exit. But when detaching, the parent will exit before all
of the background maintenance tasks have been run, and consequently the
lock only covers a smaller part of the whole maintenance process.

Fix this bug by reassigning all tempfiles from the parent process to the
child process when daemonizing so that it becomes the responsibility of
the child to clean them up.

Note that this is a broader fix, as we now always reassign tempfiles
when daemonizing. This is a natural consequence of the semantics of
`daemonize()` though, as it essentially promises to continue running the
current process in the background. It is thus sensible to have that
function perform the whole dance of assigning resources to the child
process, including tempfiles.

There's only a single other caller in "daemon.c", but that process
doesn't create any tempfiles before the call to `daemonize()` and is
thus not impacted by this change.

Reported-by: Jean-Christophe Manciot <actionmystique@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <stolee@gmail.com>
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-05-13 16:57:53 +09:00
Johannes Schindelin
985b38ca6c safe.bareRepository: default to "explicit" with WITH_BREAKING_CHANGES
When an attacker can convince a user to clone a crafted repository
that contains an embedded bare repository with malicious hooks, any Git
command the user runs after entering that subdirectory will discover
the bare repository and execute the hooks. The user does not even need
to run a Git command explicitly: many shell prompts run `git status`
in the background to display branch and dirty state information, and
`git status` in turn may invoke the fsmonitor hook if so configured,
making the user vulnerable the moment they `cd` into the directory. The
`safe.bareRepository` configuration variable (introduced in 8959555cee
(setup_git_directory(): add an owner check for the top-level directory,
2022-03-02)) already provides protection against this attack vector by
allowing users to set it to "explicit", but the default remained "all"
for backwards compatibility.

Since Git 3.0 is the natural point to change defaults to safer
values, flip the default from "all" to "explicit" when built with
`WITH_BREAKING_CHANGES`. This means Git will refuse to work with bare
repositories that are discovered implicitly by walking up the directory
tree. Bare repositories specified via `--git-dir` or `GIT_DIR` continue
to work, and directories that look like `.git`, worktrees, or submodule
directories are unaffected (the existing `is_implicit_bare_repo()`
whitelist handles those cases).

Users who rely on implicit bare repository discovery can restore the
previous behavior by setting `safe.bareRepository=all` in their global
or system configuration.

The test for the "safe.bareRepository in the repository" scenario
needed a more involved fix: it writes a `safe.bareRepository=all`
entry into the bare repository's own config to verify that repo-local
config does not override the protected (global) setting. Previously,
`test_config -C` was used to write that entry, but its cleanup runs `git
-C <bare-repo> config --unset`, which itself fails when the default is
"explicit" and the global config has already been cleaned up. Switching
to direct git config --file access avoids going through repository
discovery entirely.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-27 14:50:54 +09:00
Junio C Hamano
c563b12ce7 Merge branch 'ty/setup-error-tightening'
While discovering a ".git" directory, the code treats any stat()
failure as a sign that a filesystem entity .git does not exist
there, and ignores ".git" that is not a "gitdir" file or a
directory.  The code has been tightened to notice and report
filesystem corruption better.

* ty/setup-error-tightening:
  setup: improve error diagnosis for invalid .git files
2026-03-16 10:48:14 -07:00
Junio C Hamano
1d0a2acb78 Merge branch 'kn/ref-location'
Allow the directory in which reference backends store their data to
be specified.

* kn/ref-location:
  refs: add GIT_REFERENCE_BACKEND to specify reference backend
  refs: allow reference location in refstorage config
  refs: receive and use the reference storage payload
  refs: move out stub modification to generic layer
  refs: extract out `refs_create_refdir_stubs()`
  setup: don't modify repo in `create_reference_database()`
2026-03-04 10:52:59 -08:00
Tian Yuchen
1dd27bfbfd setup: improve error diagnosis for invalid .git files
'read_gitfile_gently()' treats any non-regular file as
'READ_GITFILE_ERR_NOT_A_FILE' and fails to discern between 'ENOENT'
and other stat failures. This flawed error reporting is noted by two
'NEEDSWORK' comments.

Address these comments by introducing two new error codes:
'READ_GITFILE_ERR_MISSING'(which groups the "file missing" scenarios
together) and 'READ_GITFILE_ERR_IS_A_DIR':

1. Update 'read_gitfile_error_die()' to treat 'IS_A_DIR', 'MISSING',
'NOT_A_FILE' and 'STAT_FAILED' as non-fatal no-ops. This accommodates
intentional non-repo scenarios (e.g., GIT_DIR=/dev/null).

2. Explicitly catch 'NOT_A_FILE' and 'STAT_FAILED' during
discovery and call 'die()' if 'die_on_error' is set.

3. Unconditionally pass '&error_code' to 'read_gitfile_gently()'.

4. Only invoke 'is_git_directory()' when we explicitly receive
   'READ_GITFILE_ERR_IS_A_DIR', avoiding redundant checks.

Additionally, audit external callers of 'read_gitfile_gently()' in
'submodule.c' and 'worktree.c' to accommodate the refined error codes.

Signed-off-by: Tian Yuchen <a3205153416@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-04 09:23:48 -08:00
Karthik Nayak
53592d68e8 refs: add GIT_REFERENCE_BACKEND to specify reference backend
Git allows setting a different object directory via
'GIT_OBJECT_DIRECTORY', but provides no equivalent for references. In
the previous commit we extended the 'extensions.refStorage' config to
also support an URI input for reference backend with location.

Let's also add a new environment variable 'GIT_REFERENCE_BACKEND' that
takes in the same input as the config variable. Having an environment
variable allows us to modify the reference backend and location on the
fly for individual Git commands.

The environment variable also allows usage of alternate reference
directories during 'git-clone(1)' and 'git-init(1)'. Add the config to
the repository when created with the environment variable set.

When initializing the repository with an alternate reference folder,
create the required stubs in the repositories $GIT_DIR. The inverse,
i.e. removal of the ref store doesn't clean up the stubs in the $GIT_DIR
since that would render it unusable. Removal of ref store is only used
when migrating between ref formats and cleanup of the $GIT_DIR doesn't
make sense in such a situation.

Helped-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:40:00 -08:00
Karthik Nayak
01dc84594e refs: allow reference location in refstorage config
The 'extensions.refStorage' config is used to specify the reference
backend for a given repository. Both the 'files' and 'reftable' backends
utilize the $GIT_DIR as the reference folder by default in
`get_main_ref_store()`.

Since the reference backends are pluggable, this means that they could
work with out-of-tree reference directories too. Extend the 'refStorage'
config to also support taking an URI input, where users can specify the
reference backend and the location.

Add the required changes to obtain and propagate this value to the
individual backends. Add the necessary documentation and tests.

Traditionally, for linked worktrees, references were stored in the
'$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference
storage path, it doesn't make sense to store the main worktree
references in the new path, and the linked worktree references in the
$GIT_DIR. So, let's store linked worktree references in
'$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the
necessary files and folders while also adding stubs in the $GIT_DIR path
to ensure that it is still considered a Git directory.

Ideally, we would want to pass in a `struct worktree *` to individual
backends, instead of passing the `gitdir`. This allows them to handle
worktree specific logic. Currently, that is not possible since the
worktree code is:

  - Tied to using the global `the_repository` variable.

  - Is not setup before the reference database during initialization of
    the repository.

Add a TODO in 'refs.c' to ensure we can eventually make that change.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:38:41 -08:00
Karthik Nayak
2c69ff4819 setup: don't modify repo in create_reference_database()
The `create_reference_database()` function is used to create the
reference database during initialization of a repository. The function
calls `repo_set_ref_storage_format()` to set the repositories reference
format. This is an unexpected side-effect of the function. More so
because the function is only called in two locations:

  1. During git-init(1) where the value is propagated from the `struct
     repository_format repo_fmt` value.

  2. During git-clone(1) where the value is propagated from the
     `the_repository` value.

The former is valid, however the flow already calls
`repo_set_ref_storage_format()`, so this effort is simply duplicated.
The latter sets the existing value in `the_repository` back to itself.
While this is okay for now, introduction of more fields in
`repo_set_ref_storage_format()` would cause issues, especially
dynamically allocated strings, where we would free/allocate the same
string back into `the_repostiory`.

To avoid all this confusion, clean up the function to no longer take in
and set the repo's reference storage format.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25 09:27:11 -08:00
Junio C Hamano
354b8d89ac Merge branch 'rs/clean-includes'
Clean up redundant includes of header files.

* rs/clean-includes:
  remove duplicate includes
2026-02-17 13:30:42 -08:00
René Scharfe
10c68d2577 remove duplicate includes
The following command reports that some header files are included twice:

   $ git grep '#include' '*.c' | sort | uniq -cd

Remove the second #include line in each case, as it has no effect.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-08 15:03:06 -08:00
Junio C Hamano
c3a5261dc0 Merge branch 'ar/submodule-gitdir-tweak'
Avoid local submodule repository directory paths overlapping with
each other by encoding submodule names before using them as path
components.

* ar/submodule-gitdir-tweak:
  submodule: detect conflicts with existing gitdir configs
  submodule: hash the submodule name for the gitdir path
  submodule: fix case-folding gitdir filesystem collisions
  submodule--helper: fix filesystem collisions by encoding gitdir paths
  builtin/credential-store: move is_rfc3986_unreserved to url.[ch]
  submodule--helper: add gitdir migration command
  submodule: allow runtime enabling extensions.submodulePathConfig
  submodule: introduce extensions.submodulePathConfig
  builtin/submodule--helper: add gitdir command
  submodule: always validate gitdirs inside submodule_name_to_gitdir
  submodule--helper: use submodule_name_to_gitdir in add_submodule
2026-02-05 15:41:58 -08:00
Adrian Ratiu
c349bad729 submodule: allow runtime enabling extensions.submodulePathConfig
Add a new config `init.defaultSubmodulePathConfig` which allows
enabling `extensions.submodulePathConfig` for new submodules by
default (those created via git init or clone).

Important: setting init.defaultSubmodulePathConfig = true does
not globally enable `extensions.submodulePathConfig`. Existing
repositories will still have the extension disabled and will
require migration (for example via git submodule--helper command
added in the next commit).

Suggested-by: Patrick Steinhardt <ps@pks.im>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 11:56:56 -08:00
Adrian Ratiu
4173df5187 submodule: introduce extensions.submodulePathConfig
The idea of this extension is to abstract away the submodule gitdir
path implementation: everyone is expected to use the config and not
worry about how the path is computed internally, either in git or
other implementations.

With this extension enabled, the submodule.<name>.gitdir repo config
becomes the single source of truth for all submodule gitdir paths.

The submodule.<name>.gitdir config is added automatically for all new
submodules when this extension is enabled.

Git will throw an error if the extension is enabled and a config is
missing, advising users how to migrate. Migration is manual for now.

E.g. to add a missing config entry for an existing "foo" module:
git config submodule.foo.gitdir .git/modules/foo

Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 11:56:55 -08:00
Johannes Schindelin
7997e36561 init: do parse _all_ core.* settings early
In Git for Windows, `has_symlinks` is set to 0 by default. Therefore, we
need to parse the config setting `core.symlinks` to know if it has been
set to `true`. In `git init`, we must do that before copying the
templates because they might contain symbolic links.

Even if the support for symbolic links on Windows has not made it to
upstream Git yet, we really should make sure that all the `core.*`
settings are parsed before proceeding, as they might very well change
the behavior of `git init` in a way the user intended.

This fixes https://github.com/git-for-windows/git/issues/3414

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 18:28:36 -08:00
Junio C Hamano
9d442ce2e2 Merge branch 'ps/object-source-management'
Code refactoring around object database sources.

* ps/object-source-management:
  odb: handle recreation of quarantine directories
  odb: handle changing a repository's commondir
  chdir-notify: add function to unregister listeners
  odb: handle initialization of sources in `odb_new()`
  http-push: stop setting up `the_repository` for each reference
  t/helper: stop setting up `the_repository` repeatedly
  builtin/index-pack: fix deferred fsck outside repos
  oidset: introduce `oidset_equal()`
  odb: move logic to disable ref updates into repo
  odb: refactor `odb_clear()` to `odb_free()`
  odb: adopt logic to close object databases
  setup: convert `set_git_dir()` to have file scope
  path: move `enter_repo()` into "setup.c"
2025-12-05 14:49:58 +09:00
Junio C Hamano
0534b78576 Merge branch 'jc/optional-path'
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.

* jc/optional-path:
  config: really treat missing optional path as not configured
  config: really pretend missing :(optional) value is not there
  config: mark otherwise unused function as file-scope static
2025-12-05 14:49:56 +09:00
Patrick Steinhardt
ac65c70663 odb: handle recreation of quarantine directories
In the preceding commit we have moved the logic that reparents object
database sources on chdir(3p) from "setup.c" into "odb.c". Let's also do
the same for any temporary quarantine directories so that the complete
reparenting logic is self-contained in "odb.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
2816b748e5 odb: handle changing a repository's commondir
The function `repo_set_gitdir()` is called in two situations:

  - To initialize the repository with its discovered location. As part
    of this we also set up the new object database.

  - To update the repository's discovered location in case the process
    changes its working directory so that we update relative paths. This
    means we also have to update any relative paths that are potentially
    used in the object database.

In the context of the object database we ideally wouldn't ever have to
worry about the second case: if all paths used by our object database
sources were absolute, then we wouldn't have to update them. But
unfortunately, the paths aren't only used to locate files owned by the
given source, but we also use them for reporting purposes. One such
example is `repo_get_object_directory()`, where we cannot just change
semantics to always return absolute paths, as that is likely to break
tooling out there.

One solution to this would be to have both a "display path" and an
"internal path". This would allow us to use internal paths for all
internal matters, but continue to use the potentially-relative display
paths so that we don't break compatibility. But converting the codebase
to honor this split is quite a messy endeavour, and it wouldn't even
help us with the goal to get rid of the need to update the display path
on chdir(3p).

Another solution would be to rework "setup.c" so that we never have to
update paths in the first place. In that case, we'd only initialize the
repository once we have figured out final locations for all directories.
This would be a significant simplification of that subsystem indeed, but
the current logic is so messy that it would take significant investments
to get there.

Meanwhile though, while object sources may still use relative paths, the
best thing we can do is to handle the reparenting of the object source
paths in the object database itself. This can be done by registering one
callback for each object database so that we get notified whenever the
current working directory changes, and we then perform the reparenting
ourselves.

Ideally, this wouldn't even happen on the object database level, but
instead handled by each object database source. But we don't yet have
proper pluggable object database sources, so this will need to be
handled at a later point in time.

The logic itself is rather simple:

  - We register the callback when creating the object database.

  - We unregister the callback when releasing it again.

  - We split up `set_git_dir_1()` so that it becomes possible to skip
    recreating the object database. This is required because the
    function is called both when the current working directory changes,
    but also when we set up the repository. Calling this function
    without skipping creation of the ODB will result in a bug in case
    it's already created.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
b67b2d9fb7 odb: move logic to disable ref updates into repo
Our object database sources have a field `disable_ref_updates`. This
field can obviously be set to disable reference updates, but it is
somewhat curious that this logic is hosted by the object database.

The reason for this is that it was primarily added to keep us from
accidentally updating references while an ODB transaction is ongoing.
Any objects part of the transaction have not yet been committed to disk,
so new references that point to them might get corrupted in case we
never end up committing the transaction. As such, whenever we create a
new transaction we set up a new temporary ODB source and mark it as
disabling reference updates.

This has one (and only one?) upside: once we have committed the
transaction, the temporary source will be dropped and thus we clean up
the disabled reference updates automatically. But other than that, it's
somewhat misdesigned:

  - We can have multiple ODB sources, but only the currently active
    source inhibits reference updates.

  - We're mixing concerns of the refdb with the ODB.

Arguably, the decision of whether we can update references or not should
be handled by the refdb. But that wouldn't be a great fit either, as
there can be one refdb per worktree. So we'd again have the same problem
that a "global" intent becomes localized to a specific instance.

Instead, move the setting into the repository. While at it, convert it
into a boolean.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:15:59 -08:00
Junio C Hamano
0bd16856ff config: really treat missing optional path as not configured
These callers expect that git_config_pathname() that returns 0 is a
signal that the variable they passed has a string they need to act
on.  But with the introduction of ":(optional)path" earlier, that is
no longer the case.  If the path specified by the configuration
variable is missing, their variable will get a NULL in it, and they
need to act on it (often, just refraining from copying it elsewhere).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24 17:00:47 -08:00