Just like the `hash-object --literally` code path, the `--stdin` code
path also needs to use `size_t` instead of `unsigned long` to represent
memory sizes, otherwise it would cause problems on platforms using the
LLP64 data model (such as Windows).
To limit the scope of the test case, the object is explicitly not
written to the object store, nor are any filters applied.
The `big` file from the previous test case is reused to save setup time;
To avoid relying on that side effect, it is generated if it does not
exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`).
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Continue walking the code path for the >4GB `hash-object --literally`
test to the hash algorithm step for LLP64 systems.
This patch lets the SHA1DC code use `size_t`, making it compatible with
LLP64 data models (as used e.g. by Windows).
The interested reader of this patch will note that we adjust the
signature of the `git_SHA1DCUpdate()` function without updating _any_
call site. This certainly puzzled at least one reviewer already, so here
is an explanation:
This function is never called directly, but always via the macro
`platform_SHA1_Update`, which is usually called via the macro
`git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly
in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`,
which is defined thusly:
static void git_hash_sha1_update(git_hash_ctx *ctx,
const void *data, size_t len)
{
git_SHA1_Update(&ctx->sha1, data, len);
}
i.e. it contains an implicit downcast from `size_t` to `unsigned long`
(before this here patch). With this patch, there is no downcast anymore.
With this patch, finally, the t1007-hash-object.sh "files over 4GB hash
literally" test case is fixed.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Continue walking the code path for the >4GB `hash-object --literally`
test. The `hash_object_file_literally()` function internally uses both
`hash_object_file()` and `write_object_file_prepare()`. Both function
signatures use `unsigned long` rather than `size_t` for the mem buffer
sizes. Use `size_t` instead, for LLP64 compatibility.
While at it, convert those function's object's header buffer length to
`size_t` for consistency. The value is already upcast to `uintmax_t` for
print format compatibility.
Note: The hash-object test still does not pass. A subsequent commit
continues to walk the call tree's lower level hash functions to identify
further fixes.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is
only 32 bits (for backward compatibility). Git's use of `unsigned long`
for file memory sizes in many places, rather than size_t, limits the
handling of large files on LLP64 systems (commonly given as `>4GB`).
Provide a minimum test for handling a >4GB file. The `hash-object`
command, with the `--literally` and without `-w` option avoids
writing the object, either loose or packed. This avoids the code paths
hitting the `bigFileThreshold` config test code, the zlib code, and the
pack code.
Subsequent patches will walk the test's call chain, converting types to
`size_t` (which is larger in LLP64 data models) where appropriate.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These fixes have been sent to the Git mailing list but have not been
picked up by the Git project yet.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a Unix socket is initialized, the current directory's path is
stored so that the cleanup code can `chdir()` back to where it was
before exit.
If the path that needs to be stored exceeds the default size of the
`sun_path` attribute of `struct sockaddr_un` (which is defined as a
108-sized byte array on Linux), a larger buffer needs to be allocated so
that it can hold the path, and it is the responsibility of the
`unix_sockaddr_cleanup()` function to release that allocated memory.
In Git's CI, this stack allocation is not necessary because the code is
checked out to `/home/runner/work/git/git`. Concatenate the path
`t/trash directory.t0301-credential-cache/.cache/git/credential/socket`
and a terminating NUL, and you end up with 96 bytes, 12 shy of the
default `sun_path` size.
However, I use worktrees with slightly longer paths:
`/home/me/projects/git/yes/i/nest/worktrees/to/organize/them/` is more
in line with what I have. When I recently tried to locally reproduce a
failure of the `linux-leaks` CI job, this t0301 test failed (where it
had not failed in CI).
The reason: When `credential-cache` tries to reach its daemon initially
by calling `unix_sockaddr_init()`, it is expected that the daemon cannot
be reached (the idea is to spin up the daemon in that case and try
again). However, when this first call to `unix_sockaddr_init()` fails,
the code returns early from the `unix_stream_connect()` function
_without_ giving the cleanup code a chance to run, skipping the
deallocation of above-mentioned path.
The fix is easy: do not return early but instead go directly to the
cleanup code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In some implementations, `regexec_buf()` assumes that it is fed lines;
Without `REG_NOTEOL` it thinks the end of the buffer is the end of a
line. Which makes sense, but trips up this case because we are not
feeding lines, but rather a whole buffer. So the final newline is not
the start of an empty line, but the true end of the buffer.
This causes an interesting bug:
$ echo content >file.txt
$ git grep --no-index -n '^$' file.txt
file.txt:2:
This bug is fixed by making the end of the buffer consistently the end
of the final line.
The patch was applied from
https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/
Reported-by: Olly Betts <olly@survex.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When fetching objects into a lazily cloned repository, .promisor
files are created with information meant to help debugging. "git
repack" has been taught to carry this information forward to
packfiles that are newly created.
* lp/repack-propagate-promisor-debugging-info:
SQUASH???
t7703: test for promisor file content after geometric repack
t7700: test for promisor file content after repack
repack-promisor: preserve content of promisor files after repack
pack-write: add helper to fill promisor file after repack
pack-write: add explanation to promisor file content
Add a new odb "in-memory" source that is meant to only hold
tentative objects (like the virtual blob object that represents the
working tree file used by "git blame").
Comments?
* ps/odb-in-memory:
odb: generic in-memory source
odb/source-inmemory: stub out remaining functions
odb/source-inmemory: implement `freshen_object()` callback
odb/source-inmemory: implement `count_objects()` callback
odb/source-inmemory: implement `find_abbrev_len()` callback
odb/source-inmemory: implement `for_each_object()` callback
odb/source-inmemory: convert to use oidtree
oidtree: add ability to store data
cbtree: allow using arbitrary wrapper structures for nodes
odb/source-inmemory: implement `write_object_stream()` callback
odb/source-inmemory: implement `write_object()` callback
odb/source-inmemory: implement `write_object()` callback
odb/source-inmemory: implement `read_object_stream()` callback
odb/source-inmemory: implement `read_object_info()` callback
odb: fix unnecessary call to `find_cached_object()`
odb/source-inmemory: implement `free()` callback
odb: introduce "in-memory" source
The fsmonitor daemon has been implemented for Linux.
* pt/fsmonitor-linux:
fsmonitor: convert shown khash to strset in do_handle_client
fsmonitor: add tests for Linux
fsmonitor: add timeout to daemon stop command
fsmonitor: close inherited file descriptors and detach in daemon
run-command: add close_fd_above_stderr option
fsmonitor: implement filesystem change listener for Linux
fsmonitor: rename fsm-settings-darwin.c to fsm-settings-unix.c
fsmonitor: rename fsm-ipc-darwin.c to fsm-ipc-unix.c
fsmonitor: use pthread_cond_timedwait for cookie wait
compat/win32: add pthread_cond_timedwait
fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon
fsmonitor: fix khash memory leak in do_handle_client
t9210, t9211: disable GIT_TEST_SPLIT_INDEX for scalar clone tests
Reduce the reference to the_repository in the refs subsystem.
Comments?
* sp/refs-with-less-the-repository:
refs/reftable-backend: drop uses of the_repository
refs: remove the_hash_algo global state
refs: add struct repository parameter in get_files_ref_lock_timeout_ms()
Rust support is enabled by default (but still allows opting out) in
Git 2.54.
* bc/rust-by-default-in-2.54:
Enable Rust by default
Linux: link against libdl
ci: install cargo on Alpine
docs: update version with default Rust support
In a history with more than one root commit, "git log --graph
--oneline" stuffed an unrelated commit immediately below a root
commit, which has been corrected by making the spot below a root
unavailable.
Comments?
* ps/shift-root-in-graph:
graph: add indentation for commits preceded by a parentless commit
Some tests assume that bare repository accesses are by default
allowed; rewrite some of them to avoid the assumption, rewrite
others to explicitly set safe.bareRepository to allow them.
Comments?
* js/adjust-tests-to-explicitly-access-bare-repo:
git p4 clone --bare: need to be explicit about the gitdir
t9700: stop relying on implicit bare repo discovery
t9210: pass `safe.bareRepository=all` to `scalar register`
t6020: use `-C` for worktree, `--git-dir` for bare repository
t5619: wrap `test_commit_bulk` in `GIT_DIR` subshell for bare repo
t5540/t5541: avoid accessing a bare repository via `-C <dir>`
t5509: specify bare repository path explicitly
t5505: export `GIT_DIR` after `git init --bare`
t5503: avoid discovering a bare repository
t2406: use `--git-dir=.` for bare repository worktree repair
t2400: explicitly specify bare repo for `git worktree add`
t1900: avoid using `-C <dir>` for a bare repository
t1020: use `--git-dir` instead of subshell for bare repo
t0056: allow implicit bare repo discovery for `-C` work-tree tests
t0003: use `--git-dir` for bare repo attribute tests
t0001: replace `cd`+`git` with `git --git-dir` in `check_config`
t0001: allow implicit bare repo discovery for aliased-command test
Many uses of the_repository has been updated to use a more
appropriate struct repository instance in setup.c codepath.
* ps/setup-wo-the-repository:
setup: stop using `the_repository` in `init_db()`
setup: stop using `the_repository` in `create_reference_database()`
setup: stop using `the_repository` in `initialize_repository_version()`
setup: stop using `the_repository` in `check_repository_format()`
setup: stop using `the_repository` in `upgrade_repository_format()`
setup: stop using `the_repository` in `setup_git_directory()`
setup: stop using `the_repository` in `setup_git_directory_gently()`
setup: stop using `the_repository` in `setup_git_env()`
setup: stop using `the_repository` in `set_git_work_tree()`
setup: stop using `the_repository` in `setup_work_tree()`
setup: stop using `the_repository` in `enter_repo()`
setup: stop using `the_repository` in `verify_non_filename()`
setup: stop using `the_repository` in `verify_filename()`
setup: stop using `the_repository` in `path_inside_repo()`
setup: stop using `the_repository` in `prefix_path()`
setup: stop using `the_repository` in `is_inside_git_dir()`
setup: stop using `the_repository` in `is_inside_worktree()`
setup: replace use of `the_repository` in static functions
The graph output from commands like "git log --graph" can now be
limited to a specified number of lanes, preventing overly wide output
in repositories with many branches.
* ps/graph-lane-limit:
graph: add truncation mark to capped lanes
graph: add --graph-lane-limit option
graph: limit the graph width to a hard-coded max
"git bisect" now uses the selected terms (e.g., old/new) more
consistently in its output.
* jr/bisect-custom-terms-in-output:
bisect: use selected alternate terms in status output
"git checkout -m another-branch" was invented to deal with local
changes to paths that are different between the current and the new
branch, but it gave only one chance to resolve conflicts. The command
was taught to create a stash to save the local changes.
* hn/git-checkout-m-with-stash:
checkout: -m (--merge) uses autostash when switching branches
sequencer: teach autostash apply to take optional conflict marker labels
sequencer: allow create_autostash to run silently
stash: add --ours-label, --theirs-label, --base-label for apply
"git name-rev" learned to use custom format instead of the object
name in an extended SHA-1 expression form.
Comments?
* kh/name-rev-custom-format:
name-rev: learn --format=<pretty>
name-rev: wrap both blocks in braces
The final step, split from earlier attempt by Dscho, to loosen the
sideband restriction for now and tighten later at Git v3.0 boundary.
* jc/neuter-sideband-post-3.0:
sideband: delay sanitizing by default to Git v3.0
"git clone" learns to pay attention to "clone.<url>.defaultObjectFilter"
configuration and behave as if the "--filter=<filter-spec>" option
was given on the command line.
* ab/clone-default-object-filter:
clone: add clone.<url>.defaultObjectFilter config
When processing large history graphs on Debian or Ubuntu, "git
subtree" can die with a "recursion depth reached" error.
Comments?
* cs/subtree-split-recursion:
contrib/subtree: reduce recursion during split
contrib/subtree: functionalize split traversal
contrib/subtree: reduce function side-effects
Shrink wasted memory in Myers diff that does not account for common
prefix and suffix removal.
* pw/xdiff-shrink-memory-consumption:
xdiff: reduce the size of array
xprepare: simplify error handling
xdiff: cleanup xdl_clean_mmatch()
xdiff: reduce size of action arrays
Preparation of the xdiff/ codebase to work with Rust.
* en/xdiff-cleanup-3:
xdiff/xdl_cleanup_records: put braces around the else clause
xdiff/xdl_cleanup_records: make setting action easier to follow
xdiff/xdl_cleanup_records: make limits more clear
xdiff/xdl_cleanup_records: use unambiguous types
xdiff: use unambiguous types in xdl_bogo_sqrt()
xdiff/xdl_cleanup_records: delete local recs pointer
* ar/parallel-hooks:
hook: allow hook.jobs=-1 to use all available CPU cores
hook: add hook.<event>.enabled switch
hook: move is_known_hook() to hook.c for wider use
hook: warn when hook.<friendly-name>.jobs is set
hook: add per-event jobs config
hook: add -j/--jobs option to git hook run
hook: mark non-parallelizable hooks
hook: allow pre-push parallel execution
hook: allow parallel hook execution
hook: parse the hook.jobs config
config: add a repo_config_get_uint() helper
repository: fix repo_init() memleak due to missing _clear()
The repacking code has been refactored and compaction of MIDX layers
have been implemented, and incremental strategy that does not require
all-into-one repacking has been introduced.
* tb/incremental-midx-part-3.3:
repack: allow `--write-midx=incremental` without `--geometric`
repack: introduce `--write-midx=incremental`
repack: implement incremental MIDX repacking
packfile: ensure `close_pack_revindex()` frees in-memory revindex
builtin/repack.c: convert `--write-midx` to an `OPT_CALLBACK`
repack-geometry: prepare for incremental MIDX repacking
repack-midx: extract `repack_fill_midx_stdin_packs()`
repack-midx: factor out `repack_prepare_midx_command()`
midx: expose `midx_layer_contains_pack()`
repack: track the ODB source via existing_packs
midx: support custom `--base` for incremental MIDX writes
midx: introduce `--checksum-only` for incremental MIDX writes
midx: use `strvec` for `keep_hashes`
strvec: introduce `strvec_init_alloc()`
midx: use `string_list` for retained MIDX files
midx-write: handle noop writes when converting incremental chains
The emulation layer we added for writev(3p) tries to be too faithful
to the spec that on systems with SSIZE_MAX set to lower than 64kB to
fit a single sideband packet would fail just like the real system
writev(), which makes our use of writev() for sideband messages
unworkable.
Let's revert them and reboot the effort after the release. The
reverted commits are:
$ git log -Swritev --oneline 8023abc632^..v2.52.0-rc1
89152af176 cmake: use writev(3p) wrapper as needed
26986f4cba sideband: use writev(3p) to send pktlines
1970fcef93 wrapper: introduce writev(3p) wrappers
3b9b2c2a29 compat/posix: introduce writev(3p) wrapper
8023abc632 is the merge of ps/upload-pack-buffer-more-writes topic to
the mainline.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When switching branches with "git checkout -m", local modifications
can block the switch. Teach the -m flow to create a temporary stash
before switching and reapply it after. On success, only "Applied
autostash." is shown. If reapplying causes conflicts, the stash is
kept and the user is told they can resolve and run "git stash drop",
or run "git reset --hard" and later "git stash pop" to recover their
changes.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add label1, label2, and label_ancestor parameters to the autostash
apply machinery so callers can pass custom conflict marker labels
through to "git stash apply --ours-label/--theirs-label/--base-label".
Introduce apply_autostash_ref_with_labels() for callers that want
to pass labels.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a silent parameter to create_autostash_internal and introduce
create_autostash_ref_silent so that callers can create an autostash
without printing the "Created autostash" message. Use stderr for
the message when not silent.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow callers of "git stash apply" to pass custom labels for conflict
markers instead of the default "Updated upstream" and "Stashed changes".
Document the new options and add a test.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The check that implements the logic to see if an in-core cache-tree
is fully ready to write out a tree object was broken, which has
been corrected.
* dl/cache-tree-fully-valid-fix:
cache-tree: fix inverted object existence check in cache_tree_fully_valid
The code path to update the configuration file has been taught to
use a short timeout to retry.
Comments?
* jt/config-lock-timeout:
config: retry acquiring config.lock for 100ms
The [includeIf "condition"] conditional inclusion facility for
configuration files has learned to use the location of worktree
in its condition.
Comments?
* cl/conditional-config-on-worktree-path:
config: add "worktree" and "worktree/i" includeIf conditions
config: refactor include_by_gitdir() into include_by_path()
"git cat-file --batch" learns an in-line command "mailmap"
that lets the user toggle use of mailmap.
* sa/cat-file-batch-mailmap-switch:
cat-file: add mailmap subcommand to --batch-command
"git push" learned to take a "remote group" name to push to, which
causes pushes to multiple places, just like "git fetch" would do.
* ua/push-remote-group:
SQUASH??? - futureproof against the attack of the "main"
Promisor remote handling has been refactored and fixed in
preparation for auto-configuration of advertised remotes.
* cc/promisor-auto-config-url:
t5710: use proper file:// URIs for absolute paths
promisor-remote: remove the 'accepted' strvec
promisor-remote: keep accepted promisor_info structs alive
promisor-remote: refactor accept_from_server()
promisor-remote: refactor has_control_char()
promisor-remote: refactor should_accept_remote() control flow
promisor-remote: reject empty name or URL in advertised remote
promisor-remote: clarify that a remote is ignored
promisor-remote: pass config entry to all_fields_match() directly
promisor-remote: try accepted remotes before others in get_direct()
The parse-options library learned to auto-correct misspelled
subcommand names.
* js/parseopt-subcommand-autocorrection:
doc: document autocorrect API
parseopt: add tests for subcommand autocorrection
parseopt: enable subcommand autocorrection for git-remote and git-notes
parseopt: autocorrect mistyped subcommands
autocorrect: provide config resolution API
autocorrect: rename AUTOCORRECT_SHOW to AUTOCORRECT_HINT
autocorrect: use mode and delay instead of magic numbers
help: move tty check for autocorrection to autocorrect.c
help: make autocorrect handling reusable
parseopt: extract subcommand handling from parse_options_step()
The mechanism to avoid recursive lazy-fetch from promisor remotes
was not propagated properly to child "git fetch" processes, which
has been corrected.
Comments?
* pt/promisor-lazy-fetch-no-recurse:
promisor-remote: prevent lazy-fetch recursion in child fetch