git-for-windows/git - git - Gitea: Self-hosted GitHub

mirror of https://github.com/git-for-windows/git.git synced 2026-06-16 13:04:57 -05:00

Author	SHA1	Message	Date
Junio C Hamano	9adfd2cb8f	Merge branch 'ps/setup-centralize-odb-creation' into ps/refs-avoid-chdir-notify-reparent * ps/setup-centralize-odb-creation: setup: construct object database in `apply_repository_format()` repository: stop reading loose object map twice on repo init setup: stop initializing object database without repository setup: stop creating the object database in `setup_git_env()` repository: stop initializing the object database in `repo_set_gitdir()` setup: deduplicate logic to apply repository format setup: drop `setup_git_env()` t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY	2026-06-11 05:09:19 -07:00
Junio C Hamano	1ff279f340	The 13th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-09 10:04:51 +09:00
Junio C Hamano	18b6502b3a	Merge branch 'jc/doc-monitor-ghci' Encourage original authors to monitor the CI status. * jc/doc-monitor-ghci: SubmittingPatches: proactively monitor GHCI pages	2026-06-09 10:04:51 +09:00
Junio C Hamano	4d96a1280b	Merge branch 'ib/doc-push-default-simple' The documentation for `push.default = simple` has been clarified to better explain its behavior, making it clear that it pushes the current branch to a same-named branch on the remote, and detailing the upstream requirements for centralized workflows. * ib/doc-push-default-simple: doc: clarify push.default=simple behavior	2026-06-09 10:04:51 +09:00
Junio C Hamano	a58e51dddf	Merge branch 'gh/jump-auto-mode' The 'git-jump' command (in contrib/) has been taught to automatically pick a mode (merge, diff, or ws) when invoked without arguments. * gh/jump-auto-mode: git-jump: pick a mode automatically when invoked without arguments	2026-06-09 10:04:51 +09:00
Junio C Hamano	2fd113ae07	Merge branch 'rs/strbuf-add-oid-hex' Formatting object name in full hexadecimal form has been optimized by using a new strbuf_add_oid_hex() helper function. * rs/strbuf-add-oid-hex: hex: add and use strbuf_add_oid_hex()	2026-06-09 10:04:50 +09:00
Junio C Hamano	7eaa3c82a8	Merge branch 'rs/strbuf-add-uint' Adding a decimal integer with strbuf_addf("%u") appears commonly; they have been optimized by using a custom formatter. * rs/strbuf-add-uint: ls-tree: use strbuf_add_uint() ls-files: use strbuf_add_uint() cat-file: use strbuf_add_uint() strbuf: add strbuf_add_uint()	2026-06-09 10:04:50 +09:00
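A minimal sketch of what a custom decimal formatter can look like, as opposed to going through a printf-style format string. `format_uint()` is a hypothetical name chosen for illustration; it is not the actual `strbuf_add_uint()` implementation, whose signature is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the decimal form of `value` into `out`
 * (NUL-terminated) and return the number of digits written, or 0 if
 * `out` is too small to hold the digits plus the terminator.
 */
size_t format_uint(char *out, size_t outsz, uintmax_t value)
{
	/* 3 characters per byte is a safe upper bound on decimal digits. */
	char tmp[3 * sizeof(uintmax_t)];
	size_t n = 0, i;

	assert(out != NULL);

	/* Produce digits least-significant first. */
	do {
		tmp[n++] = (char)('0' + value % 10);
		value /= 10;
	} while (value);

	if (n + 1 > outsz)
		return 0;

	/* Reverse into the caller's buffer. */
	for (i = 0; i < n; i++)
		out[i] = tmp[n - 1 - i];
	out[n] = '\0';
	return n;
}
```

Skipping the format-string parser is the whole point of such a helper: the loop above only divides and stores, which is cheap when the call sits in a hot output path.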
Junio C Hamano	2c677d20b6	Merge branch 'ua/push-remote-group' "git push" learned to take a "remote group" name to push to, which causes pushes to multiple places, just like "git fetch" would do. * ua/push-remote-group: push: support pushing to a remote group remote: move remote group resolution to remote.c remote: fix sign-compare warnings in push_cas_option	2026-06-09 10:04:50 +09:00
Junio C Hamano	fca09c8fc2	Merge branch 'th/promisor-quiet-per-repo' The "promisor.quiet" configuration variable was not used from relevant submodules when commands like "grep --recurse-submodules" triggered a lazy fetch, which has been corrected. * th/promisor-quiet-per-repo: promisor-remote: fix promisor.quiet to use the correct repository	2026-06-09 10:04:50 +09:00
Junio C Hamano	1c0af131cc	Merge branch 'tb/bitmap-build-performance' Reachability bitmap generation has been significantly optimized. By reordering tree traversal, caching object positions, and refining how pseudo-merge bitmaps are constructed, the performance of "git repack --write-midx-bitmaps" is improved, especially for large repositories and when using pseudo-merges. * tb/bitmap-build-performance: pack-bitmap: build pseudo-merge bitmaps after regular bitmaps pack-bitmap: remember pseudo-merge parents pack-bitmap: sort bitmaps before XORing pack-bitmap: cache object positions during fill pack-bitmap: consolidate `find_object_pos()` success path pack-bitmap: reuse stored selected bitmaps pack-bitmap: check subtree bits before recursing pack-bitmap: pass object position to `fill_bitmap_tree()`	2026-06-09 10:04:49 +09:00
Junio C Hamano	600fe74302	The 12th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-07 23:58:25 +09:00
Junio C Hamano	212d25596d	Merge branch 'ja/doc-synopsis-style-again' A batch of documentation pages has been updated to use the modern synopsis style. * ja/doc-synopsis-style-again: doc: convert git-imap-send synopsis and options to new style doc: convert git-apply synopsis and options to new style doc: convert git-am synopsis and options to new style doc: convert git-grep synopsis and options to new style doc: git bisect: clarify the usage of the synopsis vs actual command doc: convert git-bisect to synopsis style	2026-06-07 23:58:25 +09:00
Junio C Hamano	6390da42c7	Merge branch 'kk/commit-reach-optim' The check for non-stale commits in the priority queue used by `paint_down_to_common` and `ahead_behind` has been optimized by replacing an O(N) scan with an O(1) counter, yielding performance improvements in repositories with wide histories. * kk/commit-reach-optim: commit-reach: replace queue_has_nonstale() scan with O(1) tracking commit-reach: deduplicate queue entries in paint_down_to_common object.h: fix stale entries in object flag allocation table	2026-06-07 23:58:25 +09:00
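The idea of replacing an O(N) "is anything non-stale still queued?" scan with an O(1) counter can be sketched as follows. This is an illustration only, not the code from `commit-reach.c`: `struct item`, `struct tracked_queue`, and the helper names are hypothetical, and a fixed-size LIFO stands in for the real priority queue because ordering does not matter for the counting.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64 /* assumption: small fixed capacity for the sketch */

struct item {
	int stale;  /* set once the item no longer needs processing */
	int queued; /* set while the item sits in the queue */
};

struct tracked_queue {
	struct item *items[QUEUE_CAP];
	size_t nr;
	size_t nr_nonstale; /* number of queued items with stale == 0 */
};

/* Each item may be queued at most once, so the counter stays exact. */
void queue_put(struct tracked_queue *q, struct item *it)
{
	assert(q->nr < QUEUE_CAP && !it->queued);
	it->queued = 1;
	q->items[q->nr++] = it;
	if (!it->stale)
		q->nr_nonstale++;
}

struct item *queue_get(struct tracked_queue *q)
{
	struct item *it;

	if (!q->nr)
		return NULL;
	it = q->items[--q->nr];
	it->queued = 0;
	if (!it->stale)
		q->nr_nonstale--;
	return it;
}

/* Marking a queued item stale must adjust the counter exactly once. */
void mark_stale(struct tracked_queue *q, struct item *it)
{
	if (it->stale)
		return;
	it->stale = 1;
	if (it->queued)
		q->nr_nonstale--;
}

/* O(1) replacement for scanning every queued entry. */
int queue_has_nonstale(const struct tracked_queue *q)
{
	return q->nr_nonstale > 0;
}
```

The counter is only trustworthy if no entry is queued twice, which is why the sketch asserts on duplicates; the topic above likewise pairs the counter with a commit that deduplicates queue entries.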
Junio C Hamano	de5383c2ce	Merge branch 'aj/stash-patch-optimize-temporary-index' "git stash -p" has been optimized by reusing cached index entries in its temporary index, avoiding unnecessary lstat() calls on unchanged files. * aj/stash-patch-optimize-temporary-index: stash: reuse cached index entries in --patch temporary index	2026-06-07 23:58:25 +09:00
Junio C Hamano	92b870a675	Merge branch 'kh/free-commit-list' Code clean-up. * kh/free-commit-list: commit: remove deprecated functions *: replace deprecated free_commit_list	2026-06-07 23:58:24 +09:00
Junio C Hamano	7450009e6f	Merge branch 'ds/restore-sparse-index' 'git restore --staged' has been optimized to avoid unnecessarily expanding the sparse index when operating on paths within the sparse checkout definition, by handling sparse directory entries at the tree level. * ds/restore-sparse-index: restore: avoid sparse index expansion t1092: test 'git restore' with sparse index	2026-06-07 23:58:24 +09:00
Junio C Hamano	17204228cf	Merge branch 'ar/receive-pack-worktree-env' The GIT_WORK_TREE variable prepared to invoke the push-to-checkout hook was leaking into the environment even when there was no hook used and broke the default push-to-deploy (i.e., let "git checkout" update the working tree only when the working tree is clean). * ar/receive-pack-worktree-env: receive-pack: fix updateInstead with core.worktree	2026-06-07 23:58:24 +09:00
Patrick Steinhardt	42b9d3dc9d	setup: construct object database in `apply_repository_format()` With the preceding changes we now always construct the repository's object database before applying the repository format. Remove this duplication by constructing it in `apply_repository_format()` instead. Note that we create the object database _after_ having set up the repository's hash algorithm, but _before_ setting the compat hash algorithm. This is intentional: - Constructing the object database may require knowledge of its intended object format. - Setting up the compatibility hash requires the object database to be initialized already, because we immediately read the loose object map. The first point is sensible, the second maybe a little less so. Ideally, it should be the responsibility of the object database itself to initialize any data structures required for the compatibility hash. But this would require further changes, so this is kept as-is for now. Further note that this requires us to move handling of the environment variables GIT_OBJECT_DIRECTORY and GIT_ALTERNATE_OBJECT_DIRECTORIES into the repository format, as well. This allows the caller more flexibility around whether or not those environment variables are being honored, as we want to respect them in "setup.c", but not in "repository.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:39 +09:00
Patrick Steinhardt	a84a9d4acd	repository: stop reading loose object map twice on repo init When initializing a repository via `repo_init()` we end up reading the loose object map twice: - `apply_repository_format()` calls `repo_set_compat_hash_algo()`, which in turn calls `repo_read_loose_object_map()` if we have a compatibility hash configured. - `repo_init()` calls `repo_read_loose_object_map()` directly a second time. Drop the second read of the loose object map in `repo_init()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:39 +09:00
Patrick Steinhardt	d87de311ff	setup: stop initializing object database without repository The function `setup_git_directory_gently()` is responsible for discovering and setting up a Git repository based on various environment variables and the current working directory. The result is thus a fully usable Git repository. One oddity of this function is that we may set up the object database even in the case where we don't have a repository, namely in the case where the `GIT_DIR_EXPLICIT` environment variable is set but points to a non-existent repository. If so, we call `setup_git_env_internal()` with the value of the environment variable so that the repository's Git directory is configured, even if it points to a non-existent directory. Historically though, this function didn't only configure the repository, but also initialized the object database. We retained this behaviour from a preceding commit, even though it really doesn't make much sense in the first place -- there is no repository, so we don't have an object database either. There seemingly isn't much of a reason to construct the object database, as we typically won't try to read objects when we don't have an object database. There's one exception though: git-index-pack(1) may run outside of a repository, which can be used to perform consistency checks for a packfile. The code path is _almost_ working: we already know to call `parse_object_buffer()`, which can read objects without an object database being available. And that works for all object types except for commits, because `parse_commit_buffer()` calls `parse_commit_graph()`, and that function doesn't handle the case where we don't have an object database. Fix this instance to check for the object database instead of checking for the Git directory having been initialized. With this fixed, we can now stop constructing an object database completely. 
Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Patrick Steinhardt	aae4ebc895	setup: stop creating the object database in `setup_git_env()` In the preceding commit we have stopped creating the object database in `repo_set_gitdir()`. But the logic is still somewhat confusing as we still end up creating it conditionally in `setup_git_dir()`, which is called multiple times. Drop the conditional logic and instead create the object database in all places where we have discovered and configured a repository. This leads to even more duplication than we already had in the preceding commit, but an alert reader may notice that we now (almost) always call `odb_new()` directly before having called `apply_repository_format()`. The only exception to this is `setup_git_directory_gently()`, where we also call the function when _not_ applying the repository format. This will be fixed in the next commit, and once that's done we can then unify creation of the object database into `apply_repository_format()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Patrick Steinhardt	6a2fbab4c9	repository: stop initializing the object database in `repo_set_gitdir()` The function `repo_set_gitdir()` obviously sets the Git directory for a given repository. Less obviously though, the function also configures a couple of auxiliary settings. One such thing is that we create the object database in this function. This logic only happens conditionally though, as `set_git_dir()` may be called multiple times during repository setup, and we don't want to create the object database multiple times. This is somewhat tangled and hard to follow. Remove the logic from `repo_set_gitdir()` and instead initialize the object database outside of it. This leads to some duplication right now, but that duplication will be removed in a subsequent step where we will start initializing the object database as part of applying the repo's format. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Patrick Steinhardt	3d884b0b56	setup: deduplicate logic to apply repository format After having discovered the repository format we then apply it to the repository so that it knows to use the proper repository extensions. The logic to apply the format is duplicated across three callsites, which makes it rather painful to add new extensions. Introduce a new function `apply_repository_format()` that takes a repo and applies a given format to it and adapt all callsites to use it. This function is also the new caller of `verify_repository_format()` so that we can ensure that we never apply an invalid repository format. The verification we have in `read_and_verify_repository_format()` is thus redundant now and dropped. Rename `read_and_verify_repository_format()` accordingly. While at it, also rename `check_repository_format()` to clarify that it doesn't only _check_ the format, but that it also applies it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Patrick Steinhardt	452ad8db6d	setup: drop `setup_git_env()` The `setup_git_env()` function is a trivial wrapper around `setup_git_env_internal()` and has a single call site only. Drop the function. While at it, drop stale documentation in "environment.h" that points to this function, even though it hasn't been exposed to callers outside of "setup.c" since 43ad1047a9 (setup: stop using `the_repository` in `setup_git_env()`, 2026-03-27) anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Patrick Steinhardt	027e3b3d38	t0001: plug test gaps for git-init(1) with GIT_OBJECT_DIRECTORY In subsequent commits we'll rework how we set up the repository. This is a somewhat intricate and thus fragile sequence; there's many things that can go subtly wrong, and there are lots of interesting interactions that one can discover. One such discovered edge case was the interaction between git-init(1) and the "GIT_OBJECT_DIRECTORY" environment variable. When set, the behaviour is that the object directory should be created at the path that the variable points to. This behaviour is documented as such in its man page: If the object storage directory is specified via the GIT_OBJECT_DIRECTORY environment variable then the sha1 directories are created underneath; otherwise, the default $GIT_DIR/objects directory is used. Curiously enough though we don't seem to have any tests that exercise this directly, and thus a subsequent commit inadvertently would have broken this expectation. Plug this test gap. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-05 21:49:38 +09:00
Junio C Hamano	9ac3f193c0	The 11th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-06-02 16:15:29 +09:00
Junio C Hamano	95e5fbd0ef	Merge branch 'kh/doc-hook' Doc updates. * kh/doc-hook: doc: hook: don’t self-link via config include doc: config: include existing git-hook(1) section doc: hook: consistently capitalize Git doc: hook: remove stray backtick	2026-06-02 16:15:29 +09:00
Junio C Hamano	ffaa2eddd0	Merge branch 'ds/path-walk-filters' The "git pack-objects --path-walk" traversal has been integrated with several object filters, including blobless and sparse filters. * ds/path-walk-filters: path-walk: support `combine` filter path-walk: support `object:type` filter path-walk: support `tree:0` filter t6601: tag otherwise-unreachable trees pack-objects: support sparse:oid filter with path-walk path-walk: add pl_sparse_trees to control tree pruning path-walk: support blob size limit filter backfill: die on incompatible filter options path-walk: support blobless filter path-walk: always emit directly-requested objects t/perf: add pack-objects filter and path-walk benchmark pack-objects: pass --objects with --path-walk t5620: make test work with path-walk var	2026-06-02 16:15:29 +09:00
Junio C Hamano	15dc60dcd1	Merge branch 'ta/approxidate-noon-fix' "Friday noon" asked in the morning on Sunday was parsed to be one day before the specified time, which has been corrected. * ta/approxidate-noon-fix: approxidate: use deferred mday adjustments for "specials" approxidate: make "specials" respect fixed day-of-month t0006: add support for approxidate test date adjustment approxidate: make "today" wrap to midnight	2026-06-02 16:15:29 +09:00
Junio C Hamano	7b3ab91768	Merge branch 'jk/connect-service-enum' The "name" argument in git_connect() and related functions has been converted to a "service" enum to improve type safety and clarify its purpose. * jk/connect-service-enum: transport-helper: fix typo in BUG() message connect: use "service" enum for "name" argument	2026-06-02 16:15:28 +09:00
Junio C Hamano	1666c12652	The 10th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-31 10:00:39 +09:00
Junio C Hamano	25d6fff594	Merge branch 'sp/doc-range-diff-takes-notes' Docfix. * sp/doc-range-diff-takes-notes: Documentation/git-range-diff: add missing notes options in synopsis	2026-05-31 10:00:39 +09:00
Junio C Hamano	a096b19c57	Merge branch 'ps/gitlab-ci-macOS-improvements' Update GitLab CI jobs that exercise macOS. * ps/gitlab-ci-macOS-improvements: gitlab-ci: update macOS image gitlab-ci: upgrade macOS runners	2026-05-31 10:00:39 +09:00
Junio C Hamano	33da2f4d3b	Merge branch 'sa/cat-file-batch-mailmap-switch' "git cat-file --batch" learns an in-line command "mailmap" that lets the user toggle use of mailmap. * sa/cat-file-batch-mailmap-switch: cat-file: add mailmap subcommand to --batch-command	2026-05-31 10:00:38 +09:00
Junio C Hamano	f6c8fe189b	Merge branch 'jk/commit-graph-lazy-load-fallback' The logic to lazy-load trees from the commit-graph has been made more robust by falling back to reading the commit object when the commit-graph is no longer available. * jk/commit-graph-lazy-load-fallback: commit: fall back to full read when maybe_tree is NULL	2026-05-31 10:00:38 +09:00
Junio C Hamano	4d11b9c218	Merge branch 'pt/fsmonitor-linux' The fsmonitor daemon has been implemented for Linux. * pt/fsmonitor-linux: fsmonitor: convert shown khash to strset in do_handle_client fsmonitor: add tests for Linux fsmonitor: add timeout to daemon stop command fsmonitor: close inherited file descriptors and detach in daemon run-command: add close_fd_above_stderr option fsmonitor: implement filesystem change listener for Linux fsmonitor: rename fsm-settings-darwin.c to fsm-settings-unix.c fsmonitor: rename fsm-ipc-darwin.c to fsm-ipc-unix.c fsmonitor: use pthread_cond_timedwait for cookie wait compat/win32: add pthread_cond_timedwait fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon fsmonitor: fix khash memory leak in do_handle_client t9210, t9211: disable GIT_TEST_SPLIT_INDEX for scalar clone tests	2026-05-31 10:00:38 +09:00
Junio C Hamano	7af2503365	Merge branch 'ps/graph-lane-limit' The graph output from commands like "git log --graph" can now be limited to a specified number of lanes, preventing overly wide output in repositories with many branches. * ps/graph-lane-limit: graph: add truncation mark to capped lanes graph: add --graph-lane-limit option graph: limit the graph width to a hard-coded max	2026-05-31 10:00:38 +09:00
Junio C Hamano	d2c01318b0	Merge branch 'jr/bisect-custom-terms-in-output' "git bisect" now uses the selected terms (e.g., old/new) more consistently in its output. * jr/bisect-custom-terms-in-output: rev-parse: use selected alternate terms to look up refs bisect: print bisect terms in single quotes bisect: use selected alternate terms in status output	2026-05-31 10:00:37 +09:00
Junio C Hamano	1694d2bba5	Merge branch 'tc/generate-configlist-fix-for-older-ninja' Build update. * tc/generate-configlist-fix-for-older-ninja: generate-configlist: collapse depfile for older Ninja	2026-05-31 10:00:37 +09:00
Junio C Hamano	a0ce168def	Merge branch 'kk/tips-reachable-from-bases-optim' Revision traversal optimization. * kk/tips-reachable-from-bases-optim: t6600: add tests for duplicate tips in tips_reachable_from_bases() commit-reach: use object flags for tips_reachable_from_bases()	2026-05-31 10:00:37 +09:00
Junio C Hamano	e6a641e8a1	Merge branch 'ed/check-connected-close-err-fd' File descriptor leak fix. * ed/check-connected-close-err-fd:	2026-05-31 10:00:37 +09:00
Junio C Hamano	e9068a5b00	Merge branch 'ed/check-connected-close-err-fd-2.53' File descriptor leak fix (for 2.54 maintenance track). * ed/check-connected-close-err-fd-2.53: connected: close err_fd in promisor fast-path	2026-05-31 10:00:36 +09:00
Kristoffer Haugsbakk	83e7f3bd2b	commit: remove deprecated functions These functions were deprecated in a series of commits merged in `52882024` (Merge branch 'ps/commit-list-functions-renamed', 2026-02-13). The compatibility was for in-flight topics at the time. Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-29 05:11:03 +09:00
Kristoffer Haugsbakk	7dd898a92d	*: replace deprecated free_commit_list Replace `free_commit_list` with `commit_list_free`. The former was deprecated in `9f18d089` (commit: rename `free_commit_list()` to conform to coding guidelines, 2026-01-15). This allows us to remove all the deprecated functions in the next commit: • `copy_commit_list` • `reverse_commit_list` • `free_commit_list` Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-29 05:11:02 +09:00
Taylor Blau	49633dc88c	pack-bitmap: build pseudo-merge bitmaps after regular bitmaps When generating bitmaps, `bitmap_builder_init()` starts with an initial selection of commits to receive bitmap coverage, and then determines a set of "maximal" commits based on its input. Commit `089f751360` (pack-bitmap-write: build fewer intermediate bitmaps, 2020-12-08) has extensive details, but the gist is as follows: Each selected commit starts with one commit_mask bit in its "commit mask" bitmap. Then, we walk the first-parent history in topological order and OR each commit's mask into its (first) parent. Whenever that OR results in the parent having more bits set, the child is deemed to be non-maximal, and the frontier is pushed further back along the first parent history. That approach works extremely well for ordinary selected commits, whose first-parent histories often describe real sharing between the bitmaps we are going to write. It struggles, however, to efficiently generate pseudo-merge bitmaps. Unlike ordinary commits for which the above algorithm is designed, pseudo-merges don't represent any "real" commit in history, just a grouping of non-bitmapped reference tips. In that sense, their first parent is just a part of a larger set, and treating them like ordinary selected commits imposes a significant slow-down when generating bitmaps with pseudo-merges enabled. 
Consider partitioning all non-bitmapped reference tips into eight individual pseudo-merges via the following configuration:

    [bitmapPseudoMerge "all"]
        pattern=refs/
        threshold=now
        stableSize=10000000
        maxMerges=8

With that configuration, the cost of generating a bitmap from scratch rises significantly:

    +------------------+-----------------+---------------+---------------------+
    |                  | no pseudo-merge | pseudo-merges | Delta               |
    |                  |                 | (HEAD^)       |                     |
    +------------------+-----------------+---------------+---------------------+
    | elapsed          |         294.1 s |       575.0 s |   +280.9 s (+95.5%) |
    | cycles           |       1,365.5 B |     2,686.9 B | +1,321.4 B (+96.8%) |
    | instructions     |       1,389.8 B |     2,546.6 B | +1,156.8 B (+83.2%) |
    | CPI              |           0.983 |         1.055 |      +0.073 (+7.4%) |
    +------------------+-----------------+---------------+---------------------+

This is a particularly poor trade-off, because the time saved by these pseudo-merges during, e.g.,

    $ git rev-list --count --all --objects --use-bitmap-index

is only:

    $ hyperfine -L v true,false -n 'pseudo-merges: {v}' '
        GIT_TEST_USE_PSEUDO_MERGES={v} git.compile rev-list --count \
          --objects --all --use-bitmap-index
      '
    Benchmark 1: pseudo-merges: true
      Time (mean ± σ):      2.613 s ±  0.012 s    [User: 2.308 s, System: 0.305 s]
      Range (min … max):    2.594 s …  2.633 s    10 runs

    Benchmark 2: pseudo-merges: false
      Time (mean ± σ):     52.205 s ±  0.170 s    [User: 51.500 s, System: 0.697 s]
      Range (min … max):   51.956 s … 52.458 s    10 runs

    Summary
      pseudo-merges: true ran
       19.98 ± 0.11 times faster than pseudo-merges: false

In other words, we pay a nearly 5 minute penalty to generate pseudo-merge bitmaps, but only save ~50 seconds during traversal. The problem stems from injecting pseudo-merges into the bitmap builder as if they were normal commits. The maximal commit selection algorithm was simply not designed for that case, and performs predictably poorly. 
The only reason we reused the maximal commit selection routine for pseudo-merges alongside regular non-pseudo-merge commits is because we represent them both as commit objects (where the pseudo-merge commits just represent a made-up commit as opposed to one that actually exists in a repository's object store).

Instead, build the regular selected commit bitmaps first, considering only non-pseudo-merge commits in `bitmap_builder_init()`. Once those bitmaps have been stored, build each pseudo-merge bitmap separately and attach its parent and object bitmaps to the corresponding pseudo-merge entry before writing the extension.

This keeps the regular bitmap build shaped like the no-pseudo-merge case. The later pseudo-merge fill can still stop at stored selected ancestor bitmaps, so it does not have to rewalk each pseudo-merge closure from scratch. When an existing bitmap has the same pseudo-merge parent set, reuse and remap that whole pseudo-merge bitmap before falling back to fill_bitmap_commit(). This preserves the benefit of stable pseudo-merges while keeping the on-disk format and reader behavior unchanged.

As a result, the overhead cost for generating pseudo-merges in the above configuration is much smaller:

    +------------------+-----------------+---------------+-------------------+
    |                  | no pseudo-merge | pseudo-merges | Delta             |
    |                  |                 | (HEAD)        |                   |
    +------------------+-----------------+---------------+-------------------+
    | elapsed          |         294.1 s |       328.4 s |  +34.3 s (+11.7%) |
    | cycles           |       1,365.5 B |     1,529.3 B | +163.7 B (+12.0%) |
    | instructions     |       1,389.8 B |     1,552.8 B | +163.0 B (+11.7%) |
    | CPI              |           0.983 |         0.985 |    +0.002 (+0.2%) |
    +------------------+-----------------+---------------+-------------------+

Recall that at the start of this series, generating reachability bitmaps took 612.5 seconds without pseudo-merges. 
With this commit, it is still ~46.38% faster to generate reachability bitmaps with pseudo-merges than it was to generate bitmaps without them at the beginning of this series. The changes to implement this are mostly straightforward. We exclude pseudo-merge commits from the existing bitmap generation, and walk over them in a separate pass, by either reusing an existing on-disk pseudo-merge, or passing the pseudo-merge commit itself back to the existing routine in `fill_bitmap_commit()`. (Note that the routine to build pseudo-merge bitmaps is the same both before and after this change; the difference is only that we do not let pseudo-merges participate in determining the set of maximal commits.) The only wrinkle is that `fill_bitmap_commit()` must be taught to not expect that all tree objects have been parsed, which is the case for any portion of history reachable by one or more pseudo-merge(s), but not by any non-pseudo-merge commit selected for bitmapping. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:01 +09:00
Taylor Blau	b04d26607d	pack-bitmap: remember pseudo-merge parents write_pseudo_merges() currently builds an array of temporary bitmaps for the parent set of each pseudo-merge, then serializes those bitmaps later while writing the extension. Move those parent bitmaps onto the corresponding bitmapped_commit entries instead. This keeps the on-disk output unchanged, but gives the parent bitmap the same lifetime and access pattern that later changes will use when pseudo-merge object bitmaps are built before the write step. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:01 +09:00
Taylor Blau	dcccd99746	pack-bitmap: sort bitmaps before XORing Reachability bitmaps may be stored as XORs against nearby bitmaps, up to 10 away. However, when callers provide selected commits in an arbitrary order, the writer may miss good ancestor/descendant pairs and produce much larger bitmap files without changing query coverage. Sort the selected bitmaps in date order (from oldest to newest) before computing XOR offsets, leaving pseudo-merge bitmaps alone (which we will deal with separately in following commits). On our same testing repository from previous commits, this change shrank our selection of 1,261 bitmaps from ~635.46 MiB to 176.4 MiB for a ~72.24% reduction in the on-disk size of our *.bitmap file. The time to generate the smaller bitmap file decreased by ~3.69 seconds, though this is likely mostly noise. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:01 +09:00
Taylor Blau	c720bbcc53	pack-bitmap: cache object positions during fill The previous commits removed some redundant work from bitmap generation by avoiding unnecessary tree recursion and by reusing selected bitmaps that have already been computed. Even with those changes in place, there is still an extremely hot path from `fill_bitmap_commit()` and `fill_bitmap_tree()` to translate object IDs into their corresponding bit positions in order to generate their bitmaps. In a small repository, this overhead is not significant. However, in a very large repository (e.g., the one that we have been using as a benchmark over the past several commits with ~57M total objects), the overhead of locating object bit positions (often repeatedly) adds up significantly. Combat this by adding a small, direct-mapped cache to the bitmap writer which maps object IDs to their corresponding bit positions. Size the cache according to the number of objects being written, with fixed lower and upper bounds so small repositories do not pay for a large table and large repositories can avoid most repeated packlist and MIDX lookups. On my machine with (a somewhat outdated) GCC 15.2.0, each entry in the cache is 40 bytes wide:

    $ pahole -C bitmap_pos_cache_entry pack-bitmap-write.o
    struct bitmap_pos_cache_entry {
            struct object_id           oid;                  /*     0    36 */
            uint32_t                   pos;                  /*    36     4 */

            /* size: 40, cachelines: 1, members: 2 */
            /* last cacheline: 40 bytes */
    };

, and we will allocate up to 2^21 entries for a maximum total of 80 MiB of cache overhead. 
In our example repository from above and in earlier commits, this results in a ~9.4% reduction in runtime relative to the previous commit:

    +------------------+-------------+-------------+---------------------+
    |                  | HEAD^       | HEAD        | Delta               |
    +------------------+-------------+-------------+---------------------+
    | elapsed          |     324.8 s |     294.1 s |     -30.7 s (-9.4%) |
    | cycles           |   1,508.6 B |   1,365.5 B |    -143.0 B (-9.5%) |
    | instructions     |   1,436.6 B |   1,389.8 B |     -46.9 B (-3.3%) |
    | CPI              |       1.050 |       0.983 |      -0.068 (-6.4%) |
    +------------------+-------------+-------------+---------------------+

When generating bitmaps on this repository (to produce the above timings), the cache grew to its maximum size of 80 MiB, and resulted in 1.024B cache hits and 59.957M cache misses. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:01 +09:00
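A direct-mapped cache of the kind described in the commit above can be sketched like this. It is a simplified illustration under stated assumptions, not the code in `pack-bitmap-write.c`: the `pos_cache_*` names, the `valid` flag, the 32-byte raw object ID, and the use of the leading ID bytes as the slot index are all hypothetical choices made for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define OID_RAWSZ 32 /* assumption: SHA-256-sized raw object ID */

struct pos_cache_entry {
	unsigned char oid[OID_RAWSZ];
	uint32_t pos;
	int valid;
};

struct pos_cache {
	struct pos_cache_entry *entries;
	size_t nr; /* always a power of two */
};

void pos_cache_init(struct pos_cache *c, unsigned bits)
{
	assert(bits > 0 && bits < 31);
	c->nr = (size_t)1 << bits;
	c->entries = calloc(c->nr, sizeof(*c->entries));
	assert(c->entries != NULL);
}

/* Object IDs are hashes already, so their leading bytes pick the slot. */
size_t pos_cache_slot(const struct pos_cache *c, const unsigned char *oid)
{
	uint32_t h;

	memcpy(&h, oid, sizeof(h));
	return h & (c->nr - 1);
}

/* Returns 1 and fills `pos` on a hit, 0 on a miss. */
int pos_cache_get(const struct pos_cache *c, const unsigned char *oid,
		  uint32_t *pos)
{
	const struct pos_cache_entry *e = &c->entries[pos_cache_slot(c, oid)];

	if (!e->valid || memcmp(e->oid, oid, OID_RAWSZ))
		return 0;
	*pos = e->pos;
	return 1;
}

/* Direct-mapped: a colliding entry is simply overwritten. */
void pos_cache_put(struct pos_cache *c, const unsigned char *oid,
		   uint32_t pos)
{
	struct pos_cache_entry *e = &c->entries[pos_cache_slot(c, oid)];

	memcpy(e->oid, oid, OID_RAWSZ);
	e->pos = pos;
	e->valid = 1;
}

void pos_cache_release(struct pos_cache *c)
{
	free(c->entries);
	c->entries = NULL;
	c->nr = 0;
}
```

A caller would try `pos_cache_get()` first and fall back to the slower packlist or MIDX lookup on a miss, then record the result with `pos_cache_put()`. Because each object ID maps to exactly one slot, there is no probing or eviction policy to maintain, which keeps the per-lookup cost low at the price of occasional overwrites.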
Taylor Blau	ece3465d44	pack-bitmap: consolidate `find_object_pos()` success path Both sides of `find_object_pos()` report success in the same way by setting the optional `found` out-parameter and return the resolved bitmap position. Prepare for adding more bookkeeping around object-position lookups by storing the result in a local `pos` variable and sharing the success return path between the packlist and MIDX cases. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:00 +09:00
Taylor Blau	3ea5fe8482	pack-bitmap: reuse stored selected bitmaps When `fill_bitmap_commit()` reaches an ancestor that was selected for its own bitmap and processed earlier, its object closure is already stored in `writer->bitmaps` as an EWAH bitmap. As a result, walking through that commit's tree and parents again is redundant. Teach `fill_bitmap_commit()` to notice that case. For non-root commits in the walk, look for a stored selected bitmap and OR it into the bitmap being built. If one exists, skip the commit, its tree, and its parents. Building bitmaps from scratch on the same test repository from the previous commits yields a significant speed-up:

    +------------------+-------------+-------------+---------------------+
    |                  | HEAD^       | HEAD        | Delta               |
    +------------------+-------------+-------------+---------------------+
    | elapsed          |     562.8 s |     324.8 s |   -237.9 s (-42.3%) |
    | cycles           |   2,621.3 B |   1,508.6 B | -1,112.7 B (-42.4%) |
    | instructions     |   2,348.9 B |   1,436.6 B |   -912.3 B (-38.8%) |
    | CPI              |       1.116 |       1.050 |      -0.066 (-5.9%) |
    +------------------+-------------+-------------+---------------------+

In our testing repository, there are 1,261 commits selected for bitmap coverage, and 1,382 maximal commits induced as a result of that. Of the 1,382 calls made to `fill_bitmap_commit()` (one per maximal commit), 131 of them can be short-circuited at some point during their traversal as a consequence of this change. In large repositories where the cost of filling the bitmap for any individual commit is large, being able to short-circuit even ~9.5% of the calls to `fill_bitmap_commit()` results in a significant savings. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-05-28 05:23:00 +09:00


81160 Commits