Commit Graph

176407 Commits

Author SHA1 Message Date
Junio C Hamano
3c485e86bf Merge branch 'ps/archive-prefix-doc' into jch
Doc update.

* ps/archive-prefix-doc:
  archive: document --prefix handling of absolute and parent paths
2026-04-09 11:22:15 -07:00
Junio C Hamano
efd38db523 Merge branch 'sp/refs-reduce-the-repository' into jch
Code clean-up to use the right repository instance in calls inside
the refs subsystem.

* sp/refs-reduce-the-repository:
  refs/reftable-backend: drop uses of the_repository
  refs: remove the_hash_algo global state
  refs: add struct repository parameter in get_files_ref_lock_timeout_ms()
2026-04-09 11:22:15 -07:00
Junio C Hamano
1930aa1cb8 Merge branch 'bc/ref-storage-default-doc-update' into jch
Doc update.

* bc/ref-storage-default-doc-update:
  docs: correct information about reftable
2026-04-09 11:22:15 -07:00
Junio C Hamano
1b4f703d67 Merge branch 'jc/neuter-sideband-fixup' into jch
Try to resurrect and reboot a stalled "avoid sending risky escape
sequences taken from sideband to the terminal" topic by Dscho.  The
plan is to keep it in 'next' long enough to see if anybody screams
with the "everything dropped except for ANSI color escape sequences"
default.

* jc/neuter-sideband-fixup:
  sideband: drop 'default' configuration
  sideband: offer to configure sanitizing on a per-URL basis
  sideband: add options to allow more control sequences to be passed through
  sideband: do allow ANSI color sequences by default
  sideband: introduce an "escape hatch" to allow control characters
  sideband: mask control characters
2026-04-09 11:22:14 -07:00
Junio C Hamano
60f07c4f5c A bit more for -rc2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:21:59 -07:00
Junio C Hamano
c343f9cdc2 Merge branch 'ds/rev-list-maximal-only-optim'
"git rev-list --maximal-only" has been optimized by borrowing the
logic used by "git show-branch --independent", which computes the
same kind of information much more efficiently.

* ds/rev-list-maximal-only-optim:
  rev-list: use reduce_heads() for --maximal-only
  p6011: add perf test for rev-list --maximal-only
  t6600: test --maximal-only and --independent
2026-04-09 11:21:59 -07:00
Junio C Hamano
8e04162c18 Merge branch 'kh/doc-config-list'
"git config list" is the official way to spell "git config -l" and
"git config --list".  Use it to update the documentation.

* kh/doc-config-list:
  doc: gitcvs-migration: rephrase “man page”
  doc: replace git config --list/-l with `list`
2026-04-09 11:21:59 -07:00
Junio C Hamano
3eabc358a9 Merge branch 'jk/c23-const-preserving-fixes-more'
Further work to adjust the codebase for C23 that changes functions
like strchr() that discarded constness when they return a pointer into
a const string to preserve constness.

* jk/c23-const-preserving-fixes-more:
  git-compat-util: fix CONST_OUTPARAM typo and indentation
  refs/files-backend: drop const to fix strchr() warning
  http: drop const to fix strstr() warning
  range-diff: drop const to fix strstr() warnings
  pkt-line: make packet_reader.line non-const
  skip_prefix(): check const match between in and out params
  pseudo-merge: fix disk reads from find_pseudo_merge()
  find_last_dir_sep(): convert inline function to macro
  run-command: explicitly cast away constness when assigning to void
  pager: explicitly cast away strchr() constness
  transport-helper: drop const to fix strchr() warnings
  http: add const to fix strchr() warnings
  convert: add const to fix strchr() warnings
2026-04-09 11:21:59 -07:00
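As a rough illustration of the kind of adjustment described in this topic (a hypothetical helper named `basename_of()`, not code from the series): when a search function such as strrchr() is handed a const string, keeping its result in a `const char *` compiles cleanly both before and after C23.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the part of "path" after the last '/',
 * or "path" itself when it contains no slash.  The search result is
 * stored in a const pointer, so constness is never discarded.
 */
static const char *basename_of(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}
```

The other direction mentioned in the commit list ("drop const") applies when the caller really does need to modify the string, in which case the input itself should not be const.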
Patrick Steinhardt
42f908c6b6 odb: generic in-memory source
Make the in-memory source generic.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:09 -07:00
Patrick Steinhardt
8179380518 odb/source-inmemory: stub out remaining functions
Stub out remaining functions that we either don't need or that are
basically no-ops.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
1919e90e70 odb/source-inmemory: implement freshen_object() callback
Implement the `freshen_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
e6126144ad odb/source-inmemory: implement count_objects() callback
Implement the `count_objects()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
2f473a0b51 odb/source-inmemory: implement find_abbrev_len() callback
Implement the `find_abbrev_len()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
fa93352328 odb/source-inmemory: implement for_each_object() callback
Implement the `for_each_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
2603ba2286 odb/source-inmemory: convert to use oidtree
The in-memory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:

  - The object lookup is O(n). This doesn't matter in practice because
    we only store a small number of objects.

  - We don't have an easy way to iterate over all objects in
    lexicographic order.

  - We don't have an easy way to compute unique object ID prefixes.

Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
ee610b535c oidtree: add ability to store data
The oidtree data structure is currently only used to store object IDs,
without any associated data. Consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.

But there are valid use cases where we want to both:

  - Store object IDs in a sorted order.

  - Associate arbitrary data with them.

Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
6c6c2cf4f9 cbtree: allow using arbitrary wrapper structures for nodes
The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicographic order, and so
that we can easily perform lookups with shortened object IDs.

In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches such a caller
could try to use, but none of them really work:

  - One may embed the `struct cb_node` in a custom structure. This does
    not work though as `struct cb_node` contains a flex array, and
    embedding such a struct in another struct is forbidden.

  - One may use a `union` over `struct cb_node` and one's own data type,
    which _is_ allowed even if the struct contains a flex array. This
    does not work though, as the compiler may align members of the
    struct so that the node key would not immediately start where the
    flex array starts.

  - One may allocate `struct cb_node` such that it has room for both its
    key and the custom data. This has the downside though that if the
    custom data is itself a pointer to allocated memory, then the leak
    checker will not consider the pointer to be alive anymore.

Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` in a wrapper struct.

Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64 bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
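The idea of locating the key through an explicit offset, rather than a flex array, can be sketched as follows. This is a minimal illustration only: `struct node`, `struct entry`, `struct tree`, `node_key()` and `entry_of()` are hypothetical names and do not reflect the actual cbtree interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree node; it no longer carries the key. */
struct node {
	struct node *child[2];
};

/* Hypothetical wrapper: the node is embedded next to key and payload. */
struct entry {
	struct node node;
	unsigned char key[4];
	int payload;
};

/* The tree only remembers where the key lives relative to the node. */
struct tree {
	size_t key_offset;
};

/* Find the key of a node by applying the offset stored in the tree. */
static const unsigned char *node_key(const struct tree *t,
				     const struct node *n)
{
	assert(t != NULL && n != NULL);
	return (const unsigned char *)n + t->key_offset;
}

/* Recover the wrapper from the embedded node, container_of() style. */
static struct entry *entry_of(struct node *n)
{
	return (struct entry *)((char *)n - offsetof(struct entry, node));
}
```

Because the node is an ordinary member of the wrapper, any pointer stored in the payload stays visible to a leak checker, which is the problem with the third approach listed in the commit message.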
Patrick Steinhardt
cfa00c26f6 odb/source-inmemory: implement write_object_stream() callback
Implement the `write_object_stream()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
3cf38cab06 odb/source-inmemory: implement write_object() callback
Implement the `write_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
85daa55ed3 odb/source-inmemory: implement write_object() callback
Implement the `write_object()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
3436407570 odb/source-inmemory: implement read_object_stream() callback
Implement the `read_object_stream()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:08 -07:00
Patrick Steinhardt
02b31495b7 odb/source-inmemory: implement read_object_info() callback
Implement the `read_object_info()` callback function for the in-memory
source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:07 -07:00
Patrick Steinhardt
0e06dbdd14 odb: fix unnecessary call to find_cached_object()
The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.

Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.

This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:

  - `odb_has_object()`
  - `odb_read_object_info_extended()`
  - `do_oid_object_info_extended()`
  - `find_cached_object()`

Drop the explicit call to `find_cached_object()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:07 -07:00
Patrick Steinhardt
06e49d9d29 odb/source-inmemory: implement free() callback
Implement the `free()` callback function for the "in-memory" source.

Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:07 -07:00
Patrick Steinhardt
3789d4f2be odb: introduce "in-memory" source
Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and serve a couple of use cases:

  - They contain evergreen objects that we expect to always exist, like
    for example the empty tree.

  - They can be used to store temporary objects that we don't want to
    persist to disk, which is used by git-blame(1) to create a fake
    worktree commit.

Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use it as a temporary object database source that
allows the user to write objects, but discard them after Git exits. So
while these cached objects behave almost like a source, they aren't used
as one.

This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.

For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 11:17:07 -07:00
Junio C Hamano
86adb3b430 Merge branch 'jt/odb-transaction-write' into ps/odb-in-memory
* jt/odb-transaction-write:
  odb/transaction: make `write_object_stream()` pluggable
  object-file: generalize packfile writes to use odb_write_stream
  object-file: avoid fd seekback by checking object size upfront
  object-file: remove flags from transaction packfile writes
  odb: update `struct odb_write_stream` read() callback
  odb/transaction: use pluggable `begin_transaction()`
  odb: split `struct odb_transaction` into separate header
2026-04-09 11:16:58 -07:00
Paul Tarjan
5637b48ef7 fsmonitor: convert shown khash to strset in do_handle_client
Replace the khash-based string set used for deduplicating pathnames
in do_handle_client() with a strset, which provides a cleaner
interface for the same purpose.

Since the paths are interned strings from the batch data, use
strdup_strings=0 to avoid unnecessary copies.

Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:29 -07:00
Paul Tarjan
3ef97728f5 fsmonitor: add tests for Linux
Add a smoke test that verifies the filesystem actually delivers
inotify events to the daemon.  On some configurations (e.g.,
overlayfs with older kernels), inotify watches succeed but events
are never delivered.  The daemon cookie wait will time out, but
every subsequent test would fail.  Skip the entire test file early
when this is detected.

Add a test that exercises rapid nested directory creation to verify
the daemon correctly handles the EEXIST race between recursive scan
and queued inotify events.  When IN_MASK_CREATE is available and a
directory watch is added during recursive registration, the kernel
may also deliver a queued IN_CREATE event for the same directory.
The second inotify_add_watch() returns EEXIST, which must be treated
as harmless.  An earlier version of the listener crashed in this
scenario.

Reduce --start-timeout from the default 60 seconds to 10 seconds so
that tests fail promptly when the daemon cannot start.

Harden the test helpers to work in environments without procps
(e.g., Fedora CI): fall back to reading /proc/$pid/stat for the
process group ID when ps is unavailable, guard stop_git() against
an empty pgid, and redirect stderr from kill to /dev/null to avoid
noise when processes have already exited.

Use set -m to enable job control in the submodule-pull test so that
the background git pull gets its own process group, preventing the
shell wait from blocking on the daemon.  setsid() in the previous
commit detaches the daemon itself, but the intermediate git pull
process still needs its own process group for the test shell to
manage it correctly.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:29 -07:00
Paul Tarjan
ec009fc4cb fsmonitor: add timeout to daemon stop command
The "fsmonitor--daemon stop" command polls in a loop waiting for the
daemon to exit after sending a "quit" command over IPC.  If the daemon
fails to shut down (e.g. it is stuck or wedged), this loop spins
forever.

Add a 30-second timeout so the stop command returns an error instead
of blocking indefinitely.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
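A bounded polling loop of the kind described above might look like the following sketch. The probe callback, the 50 ms pause between polls, and all function names are assumptions made for illustration; the real command checks the daemon over IPC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical probe: returns nonzero while the daemon is still alive. */
typedef int (*is_alive_fn)(void *data);

/* Example probe for demonstration: "alive" for a fixed number of polls. */
static int countdown_alive(void *data)
{
	int *remaining = data;
	return (*remaining)-- > 0;
}

/*
 * Poll until the daemon is gone or "timeout_sec" seconds have elapsed.
 * Returns 0 when the daemon exited and -1 on timeout.
 */
static int wait_for_exit(is_alive_fn is_alive, void *data, int timeout_sec)
{
	struct timespec pause = { 0, 50 * 1000 * 1000 }; /* 50 ms */
	time_t deadline;

	assert(is_alive != NULL);
	deadline = time(NULL) + timeout_sec;
	while (is_alive(data)) {
		if (time(NULL) >= deadline)
			return -1; /* give up instead of spinning forever */
		nanosleep(&pause, NULL);
	}
	return 0;
}
```

A caller would pass 30 as `timeout_sec` to match the timeout chosen in this commit and report an error when -1 comes back.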
Paul Tarjan
ec4e5b9310 fsmonitor: close inherited file descriptors and detach in daemon
When the fsmonitor daemon is spawned as a background process, it may
inherit file descriptors from its parent that it does not need.  In
particular, when the test harness or a CI system captures output through
pipes, the daemon can inherit duplicated pipe endpoints.  If the daemon
holds these open, the parent process never sees EOF and may appear to
hang.

Set close_fd_above_stderr on the child process at both daemon startup
paths: the explicit "fsmonitor--daemon start" command and the implicit
spawn triggered by fsmonitor-ipc when a client finds no running daemon.
Also suppress stdout and stderr on the implicit spawn path to prevent
the background daemon from writing to the client's terminal.

Additionally, call setsid() when the daemon starts with --detach to
create a new session and process group.  This prevents the daemon
from being part of the spawning shell's process group, which could
cause the shell's "wait" to block until the daemon exits.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
Paul Tarjan
bef1e78c35 run-command: add close_fd_above_stderr option
Add a close_fd_above_stderr flag to struct child_process.  When set,
the child closes file descriptors 3 and above between fork and exec
(skipping the child-notifier pipe), capped at sysconf(_SC_OPEN_MAX)
or 4096, whichever is smaller.  This prevents the child from
inheriting pipe endpoints or other descriptors from the parent
environment (e.g., the test harness).

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
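The descriptor-closing step can be sketched as below. This is an illustration under assumptions, not the run-command implementation: `max_fd_to_close()` and `close_fds_above_stderr()` are hypothetical names, and the `keep_fd` parameter stands in for the child-notifier pipe that must survive.

```c
#include <assert.h>
#include <unistd.h>

/* Upper bound on descriptors to visit: min(sysconf(_SC_OPEN_MAX), 4096). */
static int max_fd_to_close(void)
{
	long open_max = sysconf(_SC_OPEN_MAX);

	if (open_max < 0 || open_max > 4096)
		open_max = 4096;
	return (int)open_max;
}

/*
 * Close every descriptor above stderr except "keep_fd" (pass -1 to keep
 * nothing).  Meant to run in the child between fork() and exec().
 */
static void close_fds_above_stderr(int keep_fd)
{
	int fd, max = max_fd_to_close();

	assert(max > STDERR_FILENO);
	for (fd = STDERR_FILENO + 1; fd < max; fd++)
		if (fd != keep_fd)
			close(fd); /* EBADF for unused slots is harmless */
}
```

Capping the loop keeps the cost bounded on systems where the open-file limit is very large.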
Paul Tarjan
c9f60a68b8 fsmonitor: implement filesystem change listener for Linux
Implement the built-in fsmonitor daemon for Linux using the inotify
API, bringing it to feature parity with the existing Windows and macOS
implementations.

The implementation uses inotify rather than fanotify because fanotify
requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it
unsuitable for an unprivileged user-space daemon.  While inotify has
the limitation of requiring a separate watch on every directory (unlike
macOS's FSEvents, which can monitor an entire directory tree with a
single watch), it operates without elevated privileges and provides
the per-file event granularity needed for fsmonitor.

The listener uses inotify_init1(O_NONBLOCK) with a poll loop that
checks for events with a 50-millisecond timeout, keeping the inotify
queue well-drained to minimize the risk of overflows.  Bidirectional
hashmaps map between watch descriptors and directory paths for efficient
event resolution.  Directory renames are tracked using inotify's cookie
mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a
periodic check detects stale renames where the matching IN_MOVED_TO
never arrived, forcing a resync.

New directory creation triggers recursive watch registration to ensure
all subdirectories are monitored.  The IN_MASK_CREATE flag is used
where available to prevent modifying existing watches, with a fallback
for older kernels.  When IN_MASK_CREATE is available and
inotify_add_watch returns EEXIST, it means another thread or recursive
scan has already registered the watch, so it is safe to ignore.

Remote filesystem detection uses statfs() to identify network-mounted
filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers.
Mount point information is read from /proc/mounts and matched against
the statfs f_fsid to get accurate, human-readable filesystem type names
for logging.  When the .git directory is on a remote filesystem, the
IPC socket falls back to $HOME or a user-configured directory via the
fsmonitor.socketDir setting.

Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
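The watch registration and poll loop described above can be approximated with the following Linux-only sketch. It is not the daemon's code: `watch_dir()` and `wait_for_events()` are hypothetical helpers, the event mask is an assumption, and only a compile-time fallback for missing IN_MASK_CREATE is shown. The inotify descriptor is expected to come from `inotify_init1(IN_NONBLOCK)`.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

#ifndef IN_MASK_CREATE
#define IN_MASK_CREATE 0 /* older headers: the flag is not requested */
#endif

/* Assumed set of events; the real listener may watch a different set. */
#define WATCH_MASK (IN_CREATE | IN_DELETE | IN_MODIFY | \
		    IN_MOVED_FROM | IN_MOVED_TO | IN_MASK_CREATE)

/*
 * Register a watch on "dir".  Returns 0 on success or when the directory
 * is already watched (EEXIST with IN_MASK_CREATE), and -1 on other errors.
 */
static int watch_dir(int ifd, const char *dir)
{
	assert(dir != NULL);
	if (inotify_add_watch(ifd, dir, WATCH_MASK) >= 0)
		return 0;
	return errno == EEXIST ? 0 : -1;
}

/*
 * Wait up to 50 ms for events.  Returns 1 when the descriptor is
 * readable, 0 on timeout, and -1 on error.
 */
static int wait_for_events(int ifd)
{
	struct pollfd pfd = { .fd = ifd, .events = POLLIN };
	int ret;

	do {
		ret = poll(&pfd, 1, 50);
	} while (ret < 0 && errno == EINTR);
	return ret < 0 ? -1 : ret > 0;
}
```

Treating EEXIST as success mirrors the reasoning in the commit message: a second registration attempt for the same directory means an earlier one already succeeded.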
Paul Tarjan
92efd36e55 fsmonitor: rename fsm-settings-darwin.c to fsm-settings-unix.c
The fsmonitor settings logic in fsm-settings-darwin.c is not
Darwin-specific and will be reused by the upcoming Linux
implementation.  Rename it to fsm-settings-unix.c to reflect that it
is shared by all Unix platforms.

Update the build files (meson.build and CMakeLists.txt) to use
FSMONITOR_OS_SETTINGS for fsm-settings, matching the approach already
used for fsm-ipc.

Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
Paul Tarjan
fed191168e fsmonitor: rename fsm-ipc-darwin.c to fsm-ipc-unix.c
The fsmonitor IPC path logic in fsm-ipc-darwin.c is not
Darwin-specific and will be reused by the upcoming Linux
implementation.  Rename it to fsm-ipc-unix.c to reflect that it
is shared by all Unix platforms.

Introduce FSMONITOR_OS_SETTINGS (set to "unix" for non-Windows, "win32"
for Windows) as a separate variable from FSMONITOR_DAEMON_BACKEND so
that the build files can distinguish between platform-specific files
(listen, health, path-utils) and shared Unix files (ipc, settings).

Move fsm-ipc to the FSMONITOR_OS_SETTINGS section in the Makefile, and
switch fsm-path-utils to use FSMONITOR_DAEMON_BACKEND since path-utils
is platform-specific (there will be separate darwin and linux versions).

Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
Paul Tarjan
34cc34abb3 fsmonitor: use pthread_cond_timedwait for cookie wait
The cookie wait in with_lock__wait_for_cookie() uses an infinite
pthread_cond_wait() loop.  The existing comment notes the desire
to switch to pthread_cond_timedwait(), but the routine was not
available in git thread-utils.

On certain container or overlay filesystems, inotify watches may
succeed but events are never delivered.  In this case the daemon
would hang indefinitely waiting for the cookie event, which in
turn causes the client to hang.

Replace the infinite wait with a one-second timeout using
pthread_cond_timedwait().  If the timeout fires, report an
error and let the client proceed with a trivial (full-scan)
response rather than blocking forever.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:28 -07:00
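A one-second bounded wait with pthread_cond_timedwait() can be sketched as follows. `struct cookie_wait` and `wait_for_cookie()` are hypothetical names for illustration; the daemon's own state and error reporting differ.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical state; "seen" is set under "lock" when the cookie arrives. */
struct cookie_wait {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int seen;
};

/*
 * Wait up to one second for "seen" to become nonzero.  Returns 0 when the
 * cookie was seen, ETIMEDOUT on timeout, or another error code.
 */
static int wait_for_cookie(struct cookie_wait *w)
{
	struct timespec deadline;
	int err = 0;

	assert(w != NULL);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 1; /* the function expects an absolute deadline */

	pthread_mutex_lock(&w->lock);
	/* Loop to cope with spurious wakeups; stop on any error. */
	while (!w->seen && !err)
		err = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
	err = w->seen ? 0 : err;
	pthread_mutex_unlock(&w->lock);
	return err;
}
```

On ETIMEDOUT the caller can fall back to a trivial response, as the commit message describes, instead of blocking the client.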
Paul Tarjan
e6efea2aff compat/win32: add pthread_cond_timedwait
Add a pthread_cond_timedwait() implementation to the Windows pthread
compatibility layer using SleepConditionVariableCS() with a millisecond
timeout computed from the absolute deadline.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:27 -07:00
Paul Tarjan
20ea1e7e3e fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon
The `state.cookies` hashmap is initialized during daemon startup but
never freed during cleanup in the `done:` label of
fsmonitor_run_daemon().  The cookie entries also have names allocated
via strbuf_detach() that must be freed individually.

Iterate the hashmap to free each cookie name, then call
hashmap_clear_and_free() to release the entries and table.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:27 -07:00
Paul Tarjan
c0901a7cd1 fsmonitor: fix khash memory leak in do_handle_client
The `shown` kh_str_t was freed with kh_release_str() at a point in
the code only reachable in the non-trivial response path.  When the
client receives a trivial response, the code jumps to the `cleanup`
label, skipping the kh_release_str() call entirely and leaking the
hash table.

Fix this by initializing `shown` to NULL and moving the cleanup to the
`cleanup` label using kh_destroy_str(), which is safe to call on NULL.
This ensures the hash table is freed regardless of which code path is
taken.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:27 -07:00
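The pattern behind this fix, initializing the resource to NULL and releasing it only at the shared label, can be shown with plain malloc-style memory. This sketch uses a hypothetical `handle_request()` and free() in place of the khash calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical handler with several exit paths.  "shown" starts out NULL
 * and is released only at "cleanup", so every path frees it exactly
 * once; free(NULL) is a no-op.
 */
static int handle_request(int trivial, size_t nr_paths)
{
	char *shown = NULL;
	int ret = -1;

	if (!nr_paths)
		goto cleanup; /* exit before anything was allocated */

	shown = calloc(nr_paths, 1);
	if (!shown)
		goto cleanup;

	if (trivial) {
		ret = 0;
		goto cleanup; /* the kind of path that used to leak */
	}

	memset(shown, 1, nr_paths); /* stand-in for the real work */
	ret = (int)nr_paths;

cleanup:
	free(shown);
	return ret;
}
```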
Paul Tarjan
7ab80e6389 t9210, t9211: disable GIT_TEST_SPLIT_INDEX for scalar clone tests
index.skipHash (Scalar default) and split-index are incompatible:
the shared index gets a null OID when skipHash skips computing the
hash, and the null OID causes the shared index to not be loaded on
re-read.  This triggers a BUG assertion in fsmonitor when the
fsmonitor_dirty bitmap references more entries than the (now empty)
index has.

Disable GIT_TEST_SPLIT_INDEX in the scalar clone tests that hit
this: tests 12, 13, and 22 in t9210 (matching the existing
workaround in test 16), and all of t9211 (every test does scalar
clone).

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-09 10:59:27 -07:00
Phillip Wood
40c92ff457 xdiff: reduce the size of array
When the myers algorithm is selected the input files are pre-processed
to remove any common prefix and suffix and any lines that appear
in only one file. This requires a map to be created between the
lines that are processed by the myers algorithm and the lines in
the original file. That map does not include the common lines at the
beginning and end of the files but the array is allocated to be the
size of the whole file. Move the allocation into xdl_cleanup_records()
where the map is populated and we know how big it needs to be.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 14:11:55 -07:00
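To see why the map can be smaller than the file, consider this simplified sketch. It only trims the common prefix and suffix (it does not drop lines unique to one file), uses one integer per line instead of hashed records, and all names are hypothetical rather than taken from xdiff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical simplified file: one integer per line. */
struct lines {
	const int *rec;
	size_t nr;
};

/* Count leading lines shared by both files. */
static size_t common_prefix(const struct lines *a, const struct lines *b)
{
	size_t i = 0;

	while (i < a->nr && i < b->nr && a->rec[i] == b->rec[i])
		i++;
	return i;
}

/* Count trailing lines shared by both files, not overlapping the prefix. */
static size_t common_suffix(const struct lines *a, const struct lines *b,
			    size_t prefix)
{
	size_t i = 0;

	while (prefix + i < a->nr && prefix + i < b->nr &&
	       a->rec[a->nr - 1 - i] == b->rec[b->nr - 1 - i])
		i++;
	return i;
}

/*
 * Map each remaining line index back to its original line number.  The
 * array is sized for the middle section only; the caller frees it.
 */
static size_t *build_map(const struct lines *a, size_t prefix,
			 size_t suffix, size_t *nr_out)
{
	size_t i, nr, *map;

	assert(prefix + suffix <= a->nr);
	nr = a->nr - prefix - suffix;
	map = malloc((nr ? nr : 1) * sizeof(*map));
	if (!map)
		return NULL;
	for (i = 0; i < nr; i++)
		map[i] = prefix + i;
	*nr_out = nr;
	return map;
}
```

For a large file with a small change in the middle, the map shrinks from one slot per line to one slot per line in the changed region.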
Phillip Wood
8c9d203485 xprepare: simplify error handling
If either of the two allocations fails we want to take the same action
so use a single if statement. This saves a few lines and makes it
easier for the next commit to add a couple more allocations.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 14:11:55 -07:00
Phillip Wood
77c188e4a6 xdiff: cleanup xdl_clean_mmatch()
Remove the "s" parameter as, since the last commit, this function
is always called with s == 0. Also change parameter "e" to expect a
length, rather than the index of the last line to simplify the caller.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 14:11:53 -07:00
Phillip Wood
9232a7adf8 xdiff: reduce size of action arrays
When the myers algorithm is selected the input files are pre-processed
to remove any common prefix and suffix. Then any lines that appear
only in one side of the diff are marked as changed and frequently
occurring lines are marked as changed if they are adjacent to a
changed line. This step requires a couple of temporary arrays. As
the common prefix and suffix have already been removed, the arrays
only need to be big enough to hold the lines between them, not the
whole file. Reduce the size of the arrays and adjust the loops that
use them accordingly while taking care to keep indexing the arrays
in xdfile_t with absolute line numbers.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 14:09:32 -07:00
Junio C Hamano
c97ac79765 Merge branch 'en/xdiff-cleanup-3' into pw/xdiff-shrink-memory-consumption
* en/xdiff-cleanup-3:
  xdiff/xdl_cleanup_records: put braces around the else clause
  xdiff/xdl_cleanup_records: make setting action easier to follow
  xdiff/xdl_cleanup_records: make limits more clear
  xdiff/xdl_cleanup_records: use unambiguous types
  xdiff: use unambiguous types in xdl_bogo_sqrt()
  xdiff/xdl_cleanup_records: delete local recs pointer
2026-04-08 14:01:42 -07:00
Ezekiel Newren
0ee3c64b97 xdiff/xdl_cleanup_records: put braces around the else clause
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:46 -07:00
Ezekiel Newren
e7e8d80402 xdiff/xdl_cleanup_records: make setting action easier to follow
Rewrite nested ternaries with a clear if/else ladder for
action1/action2 to improve readability while preserving
behavior.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:45 -07:00
Ezekiel Newren
59cb212e84 xdiff/xdl_cleanup_records: make limits more clear
Make the handling of per-file limits and the minimal-case clearer.
  * Use explicit per-file limit variables (mlim1, mlim2) and initialize
    them.
  * The additional condition `!need_min` is redundant now, remove it.
Best viewed with --color-words.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:45 -07:00
Ezekiel Newren
042cefe77b xdiff/xdl_cleanup_records: use unambiguous types
Change the parameters of xdl_clean_mmatch() and the local variables
i, nm, mlim in xdl_cleanup_records() to use unambiguous types. Best
viewed with --color-words.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:45 -07:00
Ezekiel Newren
e85a4167dd xdiff: use unambiguous types in xdl_bogo_sqrt()
There is no real square root for a negative number, and size_t may not
be large enough for certain applications, so replace long with uint64_t.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:45 -07:00
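For readers unfamiliar with this kind of estimate, a shift-based square root approximation over an unsigned 64-bit type can be sketched as follows. The function name `bogo_sqrt()` is shortened here and the body is an illustration of the technique, not a copy of the xdiff source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cheap estimate of sqrt(n) using shifts only: the result doubles for
 * every two bits of n, so it is a power of two that is never below
 * sqrt(n).  With an unsigned type there is no negative input to handle.
 */
static uint64_t bogo_sqrt(uint64_t n)
{
	uint64_t i;

	for (i = 1; n > 0; n >>= 2)
		i <<= 1;
	return i;
}
```

Even for the largest 64-bit input the loop runs 32 times, so the result fits comfortably in the return type.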
Ezekiel Newren
da1a90eab0 xdiff/xdl_cleanup_records: delete local recs pointer
Simplify the first 2 for loops by directly indexing the xdfile.recs.
recs is unused in the last 2 for loops, remove it. Best viewed with
--color-words.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-04-08 13:57:45 -07:00