Code clean-up to use the right instance of a repository instance in
calls inside refs subsystem.
* sp/refs-reduce-the-repository:
refs/reftable-backend: drop uses of the_repository
refs: remove the_hash_algo global state
refs: add struct repository parameter in get_files_ref_lock_timeout_ms()
Try to resurrect and reboot a stalled "avoid sending risky escape
sequences taken from sideband to the terminal" topic by Dscho. The
plan is to keep it in 'next' long enough to see if anybody screams
with the "everything dropped except for ANSI color escape sequences"
default.
* jc/neuter-sideband-fixup:
sideband: drop 'default' configuration
sideband: offer to configure sanitizing on a per-URL basis
sideband: add options to allow more control sequences to be passed through
sideband: do allow ANSI color sequences by default
sideband: introduce an "escape hatch" to allow control characters
sideband: mask control characters
"git rev-list --maximal-only" has been optimized by borrowing the
logic used by "git show-branch --independent", which computes the
same kind of information much more efficiently.
* ds/rev-list-maximal-only-optim:
rev-list: use reduce_heads() for --maximal-only
p6011: add perf test for rev-list --maximal-only
t6600: test --maximal-only and --independent
"git config list" is the official way to spell "git config -l" and
"git config --list". Use it to update the documentation.
* kh/doc-config-list:
doc: gitcvs-migration: rephrase “man page”
doc: replace git config --list/-l with `list`
Further work to adjust the codebase for C23 that changes functions
like strchr() that discarded constness when they return a pointer into
a const string to preserve constness.
* jk/c23-const-preserving-fixes-more:
git-compat-util: fix CONST_OUTPARAM typo and indentation
refs/files-backend: drop const to fix strchr() warning
http: drop const to fix strstr() warning
range-diff: drop const to fix strstr() warnings
pkt-line: make packet_reader.line non-const
skip_prefix(): check const match between in and out params
pseudo-merge: fix disk reads from find_pseudo_merge()
find_last_dir_sep(): convert inline function to macro
run-command: explicitly cast away constness when assigning to void
pager: explicitly cast away strchr() constness
transport-helper: drop const to fix strchr() warnings
http: add const to fix strchr() warnings
convert: add const to fix strchr() warnings
Stub out remaining functions that we either don't need or that are
basically no-ops.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `freshen_object()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `count_objects()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `find_abbrev_len()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `for_each_object()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The in-memory source stores its objects in a simple array that we grow as
needed. This has a couple of downsides:
- The object lookup is O(n). This doesn't matter in practice because
we only store a small number of objects.
- We don't have an easy way to iterate over all objects in
lexicographic order.
- We don't have an easy way to compute unique object ID prefixes.
Refactor the code to use an oidtree instead. This is the same data
structure used by our loose object source, and thus it means we get a
bunch of functionality for free.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The oidtree data structure is currently only used to store object IDs,
without any associated data. So consequently, it can only really be used
to track which object IDs exist, and we can use the tree structure to
efficiently operate on OID prefixes.
But there are valid use cases where we want to both:
- Store object IDs in a sorted order.
- Associated arbitrary data with them.
Refactor the oidtree interface so that it allows us to store arbitrary
payloads within the respective nodes. This will be used in the next
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cbtree subsystem allows the user to store arbitrary data in a
prefix-free set of strings. This is used by us to store object IDs in a
way that we can easily iterate through them in lexicograph order, and so
that we can easily perform lookups with shortened object IDs.
In its current form, it is not easily possible to store arbitrary data
with the tree nodes. There are a couple of approaches such a caller
could try to use, but none of them really work:
- One may embed the `struct cb_node` in a custom structure. This does
not work though as `struct cb_node` contains a flex array, and
embedding such a struct in another struct is forbidden.
- One may use a `union` over `struct cb_node` and ones own data type,
which _is_ allowed even if the struct contains a flex array. This
does not work though, as the compiler may align members of the
struct so that the node key would not immediately start where the
flex array starts.
- One may allocate `struct cb_node` such that it has room for both its
key and the custom data. This has the downside though that if the
custom data is itself a pointer to allocated memory, then the leak
checker will not consider the pointer to be alive anymore.
Refactor the cbtree to drop the flex array and instead take in an
explicit offset for where to find the key, which allows the caller to
embed `struct cb_node` is a wrapper struct.
Note that this change has the downside that we now have a bit of padding
in our structure, which grows the size from 60 to 64 bytes on a 64 bit
system. On the other hand though, it allows us to get rid of the memory
copies that we previously had to do to ensure proper alignment. This
seems like a reasonable tradeoff.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `write_object_stream()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `write_object()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `write_object()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `read_object_stream()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `read_object_info()` callback function for the in-memory
source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `odb_pretend_object()` writes an object into the in-memory
object database source. The effect of this is that the object will now
become readable, but it won't ever be persisted to disk.
Before storing the object, we first verify whether the object already
exists. This is done by calling `odb_has_object()` to check all sources,
followed by `find_cached_object()` to check whether we have already
stored the object in our in-memory source.
This is unnecessary though, as `odb_has_object()` already checks the
in-memory source transitively via:
- `odb_has_object()`
- `odb_read_object_info_extended()`
- `do_oid_object_info_extended()`
- `find_cached_object()`
Drop the explicit call to `find_cached_object()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the `free()` callback function for the "in-memory" source.
Note that this requires us to define `struct cached_object_entry` in
"odb/source-inmemory.h", as it is accessed in both "odb.c" and
"odb/source-inmemory.c" now. This will be fixed in subsequent commits
though.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Next to our typical object database sources, each object database also
has an implicit source of "cached" objects. These cached objects only
exist in memory and some use cases:
- They contain evergreen objects that we expect to always exist, like
for example the empty tree.
- They can be used to store temporary objects that we don't want to
persist to disk, which is used by git-blame(1) to create a fake
worktree commit.
Overall, their use is somewhat restricted though. For example, we don't
provide the ability to use it as a temporary object database source that
allows the user to write objects, but discard them after Git exists. So
while these cached objects behave almost like a source, they aren't used
as one.
This is about to change over the following commits, where we will turn
cached objects into a new "in-memory" source. This will allow us to use
it exactly the same as any other source by providing the same common
interface as the "files" source.
For now, the in-memory source only hosts the cached objects and doesn't
provide any logic yet. This will change with subsequent commits, where
we move respective functionality into the source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the khash-based string set used for deduplicating pathnames
in do_handle_client() with a strset, which provides a cleaner
interface for the same purpose.
Since the paths are interned strings from the batch data, use
strdup_strings=0 to avoid unnecessary copies.
Suggested-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a smoke test that verifies the filesystem actually delivers
inotify events to the daemon. On some configurations (e.g.,
overlayfs with older kernels), inotify watches succeed but events
are never delivered. The daemon cookie wait will time out, but
every subsequent test would fail. Skip the entire test file early
when this is detected.
Add a test that exercises rapid nested directory creation to verify
the daemon correctly handles the EEXIST race between recursive scan
and queued inotify events. When IN_MASK_CREATE is available and a
directory watch is added during recursive registration, the kernel
may also deliver a queued IN_CREATE event for the same directory.
The second inotify_add_watch() returns EEXIST, which must be treated
as harmless. An earlier version of the listener crashed in this
scenario.
Reduce --start-timeout from the default 60 seconds to 10 seconds so
that tests fail promptly when the daemon cannot start.
Harden the test helpers to work in environments without procps
(e.g., Fedora CI): fall back to reading /proc/$pid/stat for the
process group ID when ps is unavailable, guard stop_git() against
an empty pgid, and redirect stderr from kill to /dev/null to avoid
noise when processes have already exited.
Use set -m to enable job control in the submodule-pull test so that
the background git pull gets its own process group, preventing the
shell wait from blocking on the daemon. setsid() in the previous
commit detaches the daemon itself, but the intermediate git pull
process still needs its own process group for the test shell to
manage it correctly.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fsmonitor--daemon stop" command polls in a loop waiting for the
daemon to exit after sending a "quit" command over IPC. If the daemon
fails to shut down (e.g. it is stuck or wedged), this loop spins
forever.
Add a 30-second timeout so the stop command returns an error instead
of blocking indefinitely.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the fsmonitor daemon is spawned as a background process, it may
inherit file descriptors from its parent that it does not need. In
particular, when the test harness or a CI system captures output through
pipes, the daemon can inherit duplicated pipe endpoints. If the daemon
holds these open, the parent process never sees EOF and may appear to
hang.
Set close_fd_above_stderr on the child process at both daemon startup
paths: the explicit "fsmonitor--daemon start" command and the implicit
spawn triggered by fsmonitor-ipc when a client finds no running daemon.
Also suppress stdout and stderr on the implicit spawn path to prevent
the background daemon from writing to the client's terminal.
Additionally, call setsid() when the daemon starts with --detach to
create a new session and process group. This prevents the daemon
from being part of the spawning shell's process group, which could
cause the shell's "wait" to block until the daemon exits.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a close_fd_above_stderr flag to struct child_process. When set,
the child closes file descriptors 3 and above between fork and exec
(skipping the child-notifier pipe), capped at sysconf(_SC_OPEN_MAX)
or 4096, whichever is smaller. This prevents the child from
inheriting pipe endpoints or other descriptors from the parent
environment (e.g., the test harness).
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the built-in fsmonitor daemon for Linux using the inotify
API, bringing it to feature parity with the existing Windows and macOS
implementations.
The implementation uses inotify rather than fanotify because fanotify
requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it
unsuitable for an unprivileged user-space daemon. While inotify has
the limitation of requiring a separate watch on every directory (unlike
macOS's FSEvents, which can monitor an entire directory tree with a
single watch), it operates without elevated privileges and provides
the per-file event granularity needed for fsmonitor.
The listener uses inotify_init1(O_NONBLOCK) with a poll loop that
checks for events with a 50-millisecond timeout, keeping the inotify
queue well-drained to minimize the risk of overflows. Bidirectional
hashmaps map between watch descriptors and directory paths for efficient
event resolution. Directory renames are tracked using inotify's cookie
mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a
periodic check detects stale renames where the matching IN_MOVED_TO
never arrived, forcing a resync.
New directory creation triggers recursive watch registration to ensure
all subdirectories are monitored. The IN_MASK_CREATE flag is used
where available to prevent modifying existing watches, with a fallback
for older kernels. When IN_MASK_CREATE is available and
inotify_add_watch returns EEXIST, it means another thread or recursive
scan has already registered the watch, so it is safe to ignore.
Remote filesystem detection uses statfs() to identify network-mounted
filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers.
Mount point information is read from /proc/mounts and matched against
the statfs f_fsid to get accurate, human-readable filesystem type names
for logging. When the .git directory is on a remote filesystem, the
IPC socket falls back to $HOME or a user-configured directory via the
fsmonitor.socketDir setting.
Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsmonitor settings logic in fsm-settings-darwin.c is not
Darwin-specific and will be reused by the upcoming Linux
implementation. Rename it to fsm-settings-unix.c to reflect that it
is shared by all Unix platforms.
Update the build files (meson.build and CMakeLists.txt) to use
FSMONITOR_OS_SETTINGS for fsm-settings, matching the approach already
used for fsm-ipc.
Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsmonitor IPC path logic in fsm-ipc-darwin.c is not
Darwin-specific and will be reused by the upcoming Linux
implementation. Rename it to fsm-ipc-unix.c to reflect that it
is shared by all Unix platforms.
Introduce FSMONITOR_OS_SETTINGS (set to "unix" for non-Windows, "win32"
for Windows) as a separate variable from FSMONITOR_DAEMON_BACKEND so
that the build files can distinguish between platform-specific files
(listen, health, path-utils) and shared Unix files (ipc, settings).
Move fsm-ipc to the FSMONITOR_OS_SETTINGS section in the Makefile, and
switch fsm-path-utils to use FSMONITOR_DAEMON_BACKEND since path-utils
is platform-specific (there will be separate darwin and linux versions).
Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cookie wait in with_lock__wait_for_cookie() uses an infinite
pthread_cond_wait() loop. The existing comment notes the desire
to switch to pthread_cond_timedwait(), but the routine was not
available in git thread-utils.
On certain container or overlay filesystems, inotify watches may
succeed but events are never delivered. In this case the daemon
would hang indefinitely waiting for the cookie event, which in
turn causes the client to hang.
Replace the infinite wait with a one-second timeout using
pthread_cond_timedwait(). If the timeout fires, report an
error and let the client proceed with a trivial (full-scan)
response rather than blocking forever.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a pthread_cond_timedwait() implementation to the Windows pthread
compatibility layer using SleepConditionVariableCS() with a millisecond
timeout computed from the absolute deadline.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `state.cookies` hashmap is initialized during daemon startup but
never freed during cleanup in the `done:` label of
fsmonitor_run_daemon(). The cookie entries also have names allocated
via strbuf_detach() that must be freed individually.
Iterate the hashmap to free each cookie name, then call
hashmap_clear_and_free() to release the entries and table.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `shown` kh_str_t was freed with kh_release_str() at a point in
the code only reachable in the non-trivial response path. When the
client receives a trivial response, the code jumps to the `cleanup`
label, skipping the kh_release_str() call entirely and leaking the
hash table.
Fix this by initializing `shown` to NULL and moving the cleanup to the
`cleanup` label using kh_destroy_str(), which is safe to call on NULL.
This ensures the hash table is freed regardless of which code path is
taken.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
index.skipHash (Scalar default) and split-index are incompatible:
the shared index gets a null OID when skipHash skips computing the
hash, and the null OID causes the shared index to not be loaded on
re-read. This triggers a BUG assertion in fsmonitor when the
fsmonitor_dirty bitmap references more entries than the (now empty)
index has.
Disable GIT_TEST_SPLIT_INDEX in the scalar clone tests that hit
this: tests 12, 13, and 22 in t9210 (matching the existing
workaround in test 16), and all of t9211 (every test does scalar
clone).
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the myers algorithm is selected the input files are pre-processed
to remove any common prefix and suffix and any lines that appear
in only one file. This requires a map to be created between the
lines that are processed by the myers algorithm and the lines in
the original file. That map does not include the common lines at the
beginning and end of the files but the array is allocated to be the
size of the whole file. Move the allocation into xdl_cleanup_records()
where the map is populated and we know how big it needs to be.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If either of the two allocations fail we want to take the same action
so use a single if statement. This saves a few lines and makes it
easier for the next commit to add a couple more allocations.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "s" parameter as, since the last commit, this function
is always called with s == 0. Also change parameter "e" to expect a
length, rather than the index of the last line to simplify the caller.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the myers algorithm is selected the input files are pre-processed
to remove any common prefix and suffix. Then any lines that appear
only in one side of the diff are marked as changed and frequently
occurring lines are marked as changed if they are adjacent to a
changed line. This step requires a couple of temporary arrays. As as
the common prefix and suffix have already been removed, the arrays
only need to be big enough to hold the lines between them, not the
whole file. Reduce the size of the arrays and adjust the loops that
use them accordingly while taking care to keep indexing the arrays
in xdfile_t with absolute line numbers.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/xdiff-cleanup-3:
xdiff/xdl_cleanup_records: put braces around the else clause
xdiff/xdl_cleanup_records: make setting action easier to follow
xdiff/xdl_cleanup_records: make limits more clear
xdiff/xdl_cleanup_records: use unambiguous types
xdiff: use unambiguous types in xdl_bogo_sqrt()
xdiff/xdl_cleanup_records: delete local recs pointer
Rewrite nested ternaries with a clear if/else ladder for
action1/action2 to improve readability while preserving
behavior.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the handling of per-file limits and the minimal-case clearer.
* Use explicit per-file limit variables (mlim1, mlim2) and initialize
them.
* The additional condition `!need_min` is redudant now, remove it.
Best viewed with --color-words.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parameters of xdl_clean_mmatch() and the local variables
i, nm, mlim in xdl_cleanup_records() to use unambiguous types. Best
viewed with --color-words.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no real square root for a negative number and size_t may not
be large enough for certain applications, replace long with uint64_t.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the first 2 for loops by directly indexing the xdfile.recs.
recs is unused in the last 2 for loops, remove it. Best viewed with
--color-words.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>