git-for-windows/git - git - Gitea: Self-hosted GitHub

mirror of https://github.com/git-for-windows/git.git synced 2026-04-11 01:53:21 -05:00

Author	SHA1	Message	Date
Taylor Blau	b2ec8e90c2	midx: do not require packs to be sorted in lexicographic order The MIDX file format currently requires that pack files be identified by the lexicographic ordering of their names (that is, a pack having a checksum beginning with "abc" would have a numeric pack_int_id which is smaller than the same value for a pack beginning with "bcd"). As a result, it is impossible to combine adjacent MIDX layers together without permuting bits from bitmaps that are in more recent layer(s). To see why, consider the following example: \| packs \| preferred pack --------+-------------+--------------- MIDX #0 \| { X, Y, Z } \| Y MIDX #1 \| { A, B, C } \| B MIDX #2 \| { D, E, F } \| D , where MIDX #2's base MIDX is MIDX #1, and so on. Suppose that we want to combine MIDX layers #0 and #1, to create a new layer #0' containing the packs from both layers. With the original three MIDX layers, objects are laid out in the bitmap in the order they appear in their source pack, and the packs themselves are arranged according to the pseudo-pack order. In this case, that ordering is Y, X, Z, B, A, C. But recall that the pseudo-pack ordering is defined by the order that packs appear in the MIDX, with the exception of the preferred pack, which sorts ahead of all other packs regardless of its position within the MIDX. In the above example, that means that pack 'Y' could be placed anywhere (so long as it is designated as preferred), however, all other packs must be placed in the location listed above. Because that ordering isn't sorted lexicographically, it is impossible to compact MIDX layers in the above configuration without permuting the object-to-bit-position mapping. Changing this mapping would affect all bitmaps belonging to newer layers, rendering the bitmaps associated with MIDX #2 unreadable. One of the goals of MIDX compaction is that we are able to shrink the length of the MIDX chain without invalidating bitmaps that belong to newer layers, and the lexicographic ordering constraint is at odds with this goal. However, packs do not need to be lexicographically ordered within the MIDX. As far as I can gather, the only reason they are sorted lexically is to make it possible to perform a binary search over the pack names in a MIDX, necessary to make `midx_contains_pack()`'s performance logarithmic in the number of packs rather than linear. Relax this constraint by allowing MIDX writes to proceed with packs that are not arranged in lexicographic order. `midx_contains_pack()` will lazily instantiate a `pack_names_sorted` array on the MIDX, which will be used to implement the binary search over pack names. This change produces MIDXs which may not be correctly read with external tools or older versions of Git. Though older versions of Git know how to gracefully degrade and ignore any MIDX(s) they consider corrupt, external tools may not be as robust. To avoid unintentionally breaking any such tools, guard this change behind a version bump in the MIDX's on-disk format. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:33 -08:00
Taylor Blau	82c905ea6b	midx-write.c: introduce `struct write_midx_opts` In the MIDX writing code, there are four functions which perform some sort of MIDX write operation. They are: - write_midx_file() - write_midx_file_only() - expire_midx_packs() - midx_repack() All of these functions are thin wrappers over `write_midx_internal()`, which implements the bulk of these routines. As a result, the `write_midx_internal()` function takes six arguments. Future commits in this series will want to add additional arguments, and in general this function's signature will be the union of parameters among all possible ways to write a MIDX. Instead of adding yet more arguments to this function to support MIDX compaction, introduce a `struct write_midx_opts`, which has the same struct members as `write_midx_internal()`'s arguments. Adding additional fields to the `write_midx_opts` struct is preferable to adding additional arguments to `write_midx_internal()`. This is because the callers below all zero-initialize the struct, so each time we add a new piece of information, we do not have to pass the zero value for it in all other call-sites that do not care about it. For now, no functional changes are included in this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:33 -08:00
Taylor Blau	ac10f6aa40	midx-write.c: don't use `pack_perm` when assigning `bitmap_pos` In midx_pack_order(), we compute for each bitmapped pack the first bit to correspond to an object in that pack, along with how many bits were assigned to object(s) in that pack. Initially, each bitmap_nr value is set to zero, and each bitmap_pos value is set to the sentinel BITMAP_POS_UNKNOWN. This is done to ensure that there are no packs who have an unknown bit position but a somehow non-zero number of objects (cf. `write_midx_bitmapped_packs()` in midx-write.c). Once the pack order is fully determined, midx_pack_order() sets the bitmap_pos field for any bitmapped packs to zero if they are still listed as BITMAP_POS_UNKNOWN. However, we enumerate the bitmapped packs in order of `ctx->pack_perm`. This is fine for existing cases, since the only time the `ctx->pack_perm` array holds a value outside of the addressable range of `ctx->info` is when there are expired packs, which only occurs via 'git multi-pack-index expire', which does not support writing MIDX bitmaps. As a result, the range of ctx->pack_perm covers all values in [0, `ctx->nr`), so enumerating in this order isn't an issue. A future change necessary for compaction will complicate this further by introducing a wrapper around the `ctx->pack_perm` array, which turns the given `pack_int_id` into one that is relative to the lower end of the compaction range. As a result, indexing into `ctx->pack_perm` through this helper, say, with "0" will produce a crash when the lower end of the compaction range has >0 pack(s) in its base layer, since the subtraction will wrap around the 32-bit unsigned range, resulting in an uninitialized read. But the process is completely unnecessary in the first place: we are enumerating all values of `ctx->info`, and there is no reason to process them in a different order than they appear in memory. Index `ctx->info` directly to reflect that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:33 -08:00
Taylor Blau	8043782112	t/t5319-multi-pack-index.sh: fix copy-and-paste error in t5319.39 Commit `d4bf1d88b9` (multi-pack-index: verify missing pack, 2018-09-13) adds a new test to the MIDX test script to test how we handle missing packs. While the commit itself describes the test as "verify missing pack[s]", the test itself is actually called "verify packnames out of order", despite that not being what it tests. Likely this was a copy-and-paste of the test immediately above it of the same name. Correct this by renaming the test to match the commit message. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:33 -08:00
Taylor Blau	d0e91c128b	git-multi-pack-index(1): align SYNOPSIS with 'git multi-pack-index -h' Since `c39fffc1c9` (tests: start asserting that *.txt SYNOPSIS matches -h output, 2022-10-13), the manual page for 'git multi-pack-index' has a SYNOPSIS section which differs from 'git multi-pack-index -h'. Correct this while also documenting additional options accepted by the 'write' sub-command. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:32 -08:00
Taylor Blau	f775d5b1cf	git-multi-pack-index(1): remove non-existent incompatibility Since `fcb2205b77` (midx: implement support for writing incremental MIDX chains, 2024-08-06), the command-line options '--incremental' and '--bitmap' were declared to be incompatible with one another when running 'git multi-pack-index write'. However, since `27afc272c4` (midx: implement writing incremental MIDX bitmaps, 2025-03-20), that incompatibility no longer exists, despite the documentation saying so. Correct this by removing the stale reference to their incompatibility. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:32 -08:00
Taylor Blau	6b8fb17490	builtin/multi-pack-index.c: make '--progress' a common option All multi-pack-index sub-commands (write, verify, repack, and expire) support a '--progress' command-line option, despite not listing it as one of the common options in `common_opts`. As a result each sub-command declares its own `OPT_BIT()` for a "--progress" command-line option. Centralize this within the `common_opts` to avoid re-declaring it in each sub-command. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:32 -08:00
Taylor Blau	6e86f67924	midx: introduce `midx_get_checksum_hex()` When trying to print out, say, the hexadecimal representation of a MIDX's hash, our code will do something like: hash_to_hex_algop(midx_get_checksum_hash(m), m->source->odb->repo->hash_algo); , which is both cumbersome and repetitive. In fact, all but a handful of callers to `midx_get_checksum_hash()` do exactly the above. Reduce the repetitive nature of calling `midx_get_checksum_hash()` by having it return a pointer into a static buffer containing the above result. For the handful of callers that do need to compare the raw bytes and don't want to deal with an encoded copy (e.g., because they are passing it to hasheq() or similar), they may still rely on `midx_get_checksum_hash()` which returns the raw bytes. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:32 -08:00
Taylor Blau	de811c26bb	midx: rename `get_midx_checksum()` to `midx_get_checksum_hash()` Since `541204aabe` (Documentation: document naming schema for structs and their functions, 2024-07-30), we have adopted a naming convention for functions that would prefer a name like, say, `midx_get_checksum()` over `get_midx_checksum()`. Adopt this convention throughout the midx.h API. Since this function returns a raw (that is, non-hex encoded) hash, let's suffix the function with "_hash()" to make this clear. As a side effect, this prepares us for the subsequent change which will introduce a "_hex()" variant that encodes the checksum itself. Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:32 -08:00
Taylor Blau	00c0d84e6f	midx: mark `get_midx_checksum()` arguments as const To make clear that the function `get_midx_checksum()` does not do anything to modify its argument, mark the MIDX pointer as const. The following commit will rename this function altogether to make clear that it returns the raw bytes of the checksum, not a hex-encoded copy of it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 11:16:31 -08:00
D. Ben Knoble	ebeea3c471	build: regenerate config-list.h when Documentation changes The Meson-based build doesn't know when to rebuild config-list.h, so the header is sometimes stale. For example, an old build directory might have config-list.h from before `4173df5187` (submodule: introduce extensions.submodulePathConfig, 2026-01-12), which added submodule.<name>.gitdir to the list. Without it, t9902-completion.sh fails. Regenerating the config-list.h artifact from sources fixes the artifact and the test. Since Meson does not have (or want) builtin support for globbing like Make, teach generate-configlist.sh to also generate a list of Documentation files its output depends on, and incorporate that into the Meson build. We honor the undocumented GCC/Clang contract of outputting empty targets for all the dependencies (like they do with -MP). That is, generate lines like build/config-list.h: $SOURCE_DIR/Documentation/config.adoc $SOURCE_DIR/Documentation/config.adoc: We assume that if a user adds a new file under Documentation/config then they will also edit one of the existing files to include that new file, and that will trigger a rebuild. Also mark the generator script as a dependency. While we're at it, teach the Makefile to use the same "the script knows it's dependencies" logic. For Meson, combining the following commands helps debug dependencies: ninja -C <builddir> -t deps config-list.h ninja -C <builddir> -t browse config-list.h The former lists all the dependencies discovered from our output ".d" file (the config documentation) and the latter shows the dependency on the script itself, among other useful edges in the dependency graph. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 10:51:54 -08:00
Junio C Hamano	874cf0d49f	Merge branch 'jh/alias-i18n' into jh/alias-i18n-fixes * jh/alias-i18n: completion: fix zsh alias listing for subsection aliases alias: support non-alphanumeric names via subsection syntax alias: prepare for subsection aliases help: use list_aliases() for alias listing	2026-02-24 10:50:38 -08:00
Patrick Steinhardt	452b12c2e0	builtin/maintenance: use "geometric" strategy by default The git-gc(1) command has been introduced in the early days of Git in `30f610b7b0` (Create 'git gc' to perform common maintenance operations., 2006-12-27) as the main repository maintenance utility. And while the tool has of course evolved since then to cover new parts, the basic strategy it uses has never really changed much. It is safe to say that since 2006 the Git ecosystem has changed quite a bit. Repositories tend to be much larger nowadays than they have been almost 20 years ago, and large parts of the industry went crazy for monorepos (for various wildly different definitions of "monorepo"). So the maintenance strategy we used back then may not be the best fit nowadays anymore. Arguably, most of the maintenance tasks that git-gc(1) does are still perfectly fine today: repacking references, expiring various data structures and things like tend to not cause huge problems. But the big exception is the way we repack objects. git-gc(1) by default uses a split strategy: it performs incremental repacks by default, and then whenever we have too many packs we perform a large all-into-one repack. This all-into-one repack is what is causing problems nowadays, as it is an operation that is quite expensive. While it is wasteful in small- and medium-sized repositories, in large repos it may even be prohibitively expensive. We have eventually introduced git-maintenance(1) that was slated as a replacement for git-gc(1). In contrast to git-gc(1), it is much more flexible as it is structured around configurable tasks and strategies. So while its default "gc" strategy still uses git-gc(1) under the hood, it allows us to iterate. A second strategy it knows about is the "incremental" strategy, which we configure when registering a repository for scheduled maintenance. This strategy isn't really a full replacement for git-gc(1) though, as it doesn't know to expire unused data structures. In Git 2.52 we have thus introduced a new "geometric" strategy that is a proper replacement for the old git-gc(1). In contrast to the incremental/all-into-one split used by git-gc(1), the new "geometric" strategy maintains a geometric progression of packfiles, which significantly reduces the number of all-into-one repacks that we have to perform in large repositories. It is thus a much better fit for large repositories than git-gc(1). Note that the "geometric" strategy isn't perfect though: while we perform way less all-into-one repacks compared to git-gc(1), we still have to perform them eventually. But for the largest repositories out there this may not be an option either, as client machines might not be powerful enough to perform such a repack in the first place. These cases would thus still be covered by the "incremental" strategy. Switch the default strategy away from "gc" to "geometric", but retain the "incremental" strategy configured when registering background maintenance with `git maintenance register`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:20 -08:00
Patrick Steinhardt	d2fbe9af79	t7900: prepare for switch of the default strategy The t7900 test suite is exercising git-maintenance(1) and is thus of course heavily reliant on the exact maintenance strategy. This reliance comes in two flavors: - One test explicitly wants to verify that git-gc(1) is run as part of `git maintenance run`. This test is adapted by explicitly picking the "gc" strategy. - The other tests assume a specific shape of the object database, which is dependent on whether or not we run auto-maintenance before we come to the actual subject under test. These tests are adapted by disabling auto-maintenance. With these changes t7900 passes with both "gc" and "geometric" default strategies. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:20 -08:00
Patrick Steinhardt	38ae87c1ba	t6500: explicitly use "gc" strategy The test in t6500 explicitly wants to exercise git-gc(1) and is thus highly specific to the actual on-disk state of the repository and specifically of the object database. An upcoming change modifies the default maintenance strategy to be the "geometric" strategy though, which breaks a couple of assumptions. One fix would arguably be to disable auto-maintenance altogether, as we do want to explicitly verify git-gc(1) anyway. But as the whole test suite is about git-gc(1) in the first place it feels more sensible to configure the default maintenance strategy to be "gc". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:20 -08:00
Patrick Steinhardt	94f5d9f09e	t5510: explicitly use "gc" strategy One of the tests in t5510 wants to verify that auto-gc does not lock up when fetching into a repository. Adapt it to explicitly pick the "gc" strategy for auto-maintenance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:19 -08:00
Patrick Steinhardt	0894704369	t5400: explicitly use "gc" strategy In t5400 we verify that git-receive-pack(1) runs automated repository maintenance in the remote repository. The check is performed indirectly by observing an effect that git-gc(1) would have, namely to prune a temporary object from the object database. In a subsequent commit we're about to switch to the "geometric" strategy by default though, and here we stop observing that effect. Adapt the test to explicitly use the "gc" strategy to prepare for that upcoming change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:19 -08:00
Patrick Steinhardt	ea7d894f44	t34xx: don't expire reflogs where it matters We have a couple of tests in the t34xx range that rely on reflogs. This never really used to be a problem, but in a subsequent commit we will change the default maintenance strategy from "gc" to "geometric", and this will cause us to drop all reflogs in these tests. This may seem surprising and like a bug at first, but it's actually not. The main difference between these two strategies is that the "gc" strategy will skip all maintenance in case the object database is in a well-optimized state. The "geometric" strategy has separate subtasks though, and the conditions for each of these tasks is evaluated on a case by case basis. This means that even if the object database is in good shape, we may still decide to expire reflogs. So why is that a problem? The issue is that Git's test suite hardcodes the committer and author dates to a date in 2005. Interestingly though, these hardcoded dates not only impact the commits, but also the reflog entries. The consequence is that all newly written reflog entries are immediately considered stale as our reflog expiration threshold is in the range of weeks, only. It follows that executing `git reflog expire` will thus immediately purge all reflog entries. This hasn't been a problem in our test suite by pure chance, as the repository shapes simply didn't cause us to perform actual garbage collection. But with the upcoming "geometric" strategy we _will_ start to execute `git reflog expire`, thus surfacing this issue. Prepare for this by explicitly disabling reflog expiration in tests impacted by this upcoming change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:19 -08:00
Patrick Steinhardt	5e6c612720	t: disable maintenance where we verify object database structure We have a couple of tests that explicitly verify the structure of the object database. Naturally, this structure is dependent on whether or not we run repository maintenance: if it decides to optimize the object database the expected structure is likely to not materialize. Explicitly disable auto-maintenance in such tests so that we are not dependent on decisions made by our maintenance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:19 -08:00
Patrick Steinhardt	09505b1115	t: fix races caused by background maintenance Many Git commands spawn git-maintenance(1) to optimize the repository in the background. By default, performing the maintenance is for most of the part asynchronous: we fork the executable and then continue with the rest of our business logic. This is working as expected for our users, but this behaviour is somewhat problematic for our test suite as this is inherently racy. We have many tests that verify the on-disk state of repositories, and those tests may easily race with our background maintenance. In a similar fashion, we may end up with processes that "leak" out of a current test case. Until now this tends to not be much of a problem. Our maintenance uses git-gc(1) by default, which knows to bail out in case there aren't either too many packfiles or too many loose objects. So even if other data structures would need to be optimized, we won't do so unless the object database also needs optimizations. This is about to change though, as a subsequent commit will switch to the "geometric" maintenance strategy as a default. The consequence is that we will run required optimizations even if the object database is well-optimized. And this uncovers races between our test suite and background maintenance all over the place. Disabling maintenance outright in our test suite is not really an option, as it would result in significant divergence from the "real world" and reduce our test coverage. But we've got an alternative up our sleeves: we can ensure that garbage collection runs synchronously by overriding the "maintenance.autoDetach" configuration. Of course that also diverges from the real world, as we now stop testing that background maintenance interacts in a benign way with normal Git commands. But on the other hand this ensures that the maintenance itself does not for example lead to data loss in a more reproducible way. Another concern is that this would make execution of the test suite much slower. But a quick benchmark on my machine demonstrates that this does not seem to be the case: Benchmark 1: meson test (revision = HEAD~) Time (mean ± σ): 131.182 s ± 1.293 s [User: 853.737 s, System: 1160.479 s] Range (min … max): 130.001 s … 132.563 s 3 runs Benchmark 2: meson test (revision = HEAD) Time (mean ± σ): 129.554 s ± 0.507 s [User: 849.040 s, System: 1152.664 s] Range (min … max): 129.000 s … 129.994 s 3 runs Summary meson test (revision = HEAD) ran 1.01 ± 0.01 times faster than meson test (revision = HEAD~) Funny enough, it even seems as if this speeds up test execution ever so slightly, but that may just as well be noise. Introduce a new `GIT_TEST_MAINT_AUTO_DETACH` environment variable that allows us to override the auto-detach behaviour and set that variable in our tests. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:33:19 -08:00
Han Young	2d88ab078d	diffcore-break: avoid segfault with freed entries After we have freed the file pair, we should set the queue reference to null. When computing a diff in a partial clone, there is a chance that we could trigger a prefetch of missing objects when there are freed entries in the global diff queue due to break-rewrites detection. The segfault only occurs if an entry has been freed by break-rewrites and there is an entry to be prefetched. There is a new test in t4067 that trigger the segmentation fault that results in this case. The test explicitly fetch the necessary blobs to trigger the break rewrites, some blobs are left to be prefetched. The fix is to set the queue pointer to NULL after it is freed, the prefetch will skip NULL entries. Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-24 07:20:44 -08:00
Kristoffer Haugsbakk	5eee5d81e4	doc: diff-options.adoc: show format.noprefix for format-patch git-format-patch(1) uses `format.noprefix` and ignores `diff.noprefix`. The configuration variable `format.prefix` was added as an “escape hatch”, and “it’s unlikely that anybody really wants format. noprefix=true in the first place.”[1] Based on that there doesn’t seem to be a need to widely advertise this configuration variable. But in any case: the documentation for this option should not claim that it overrides a config that is always ignored. † 1: `8d5213de` (format-patch: add format.noprefix option, 2023-03-09) Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 15:55:19 -08:00
Kristoffer Haugsbakk	04657ac029	format-patch: make format.noprefix a boolean The config `format.noprefix` was added in `8d5213de` (format-patch: add format.noprefix option, 2023-03-09) to support no-prefix on paths. That was immediately after making git-format-patch(1) not respect `diff.noprefix`.[1] The intent was to mirror `diff.noprefix`. But this config was unintentionally[2] implemented by enabling no-prefix if any kind of value is set. † 1: `c169af8f` (format-patch: do not respect diff.noprefix, 2023-03-09) † 2: https://lore.kernel.org/all/20260211073553.GA1867915@coredump.intra.peff.net/ Let’s indeed mirror `diff.noprefix` by treating it as a boolean. This is a breaking change. And as far as breaking changes go it is pretty benign: • The documentation claims that this config is equivalent to `diff.noprefix`; this is just a bug fix if the documentation is what defines the application interface • Only users with non-boolean values will run into problems when we try to parse it as a boolean. But what would (1) make them suspect they could do that in the first place, and (2) have motivated them to do it? • Users who have set this to `false` and expect that to mean enable format.noprefix (current behavior) will now have the opposite experience. Which is not a reasonable setup. Let’s only offer a breaking change fig leaf by advising about the previous behavior before dying. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 15:55:19 -08:00
Junio C Hamano	b1af291b4a	Merge branch 'ps/object-info-bits-cleanup' into ps/odb-sources * ps/object-info-bits-cleanup: odb: convert `odb_has_object()` flags into an enum odb: convert object info flags into an enum odb: drop gaps in object info flag values builtin/fsck: fix flags passed to `odb_has_object()` builtin/backfill: fix flags passed to `odb_has_object()`	2026-02-23 13:48:48 -08:00
Junio C Hamano	703c97519d	Merge branch 'ps/odb-for-each-object' into ps/odb-sources * ps/odb-for-each-object: odb: drop unused `for_each_{loose,packed}_object()` functions reachable: convert to use `odb_for_each_object()` builtin/pack-objects: use `packfile_store_for_each_object()` odb: introduce mtime fields for object info requests treewide: drop uses of `for_each_{loose,packed}_object()` treewide: enumerate promisor objects via `odb_for_each_object()` builtin/fsck: refactor to use `odb_for_each_object()` odb: introduce `odb_for_each_object()` packfile: introduce function to iterate through objects packfile: extract function to iterate through objects of a store object-file: introduce function to iterate through objects object-file: extract function to read object info from path odb: fix flags parameter to be unsigned odb: rename `FOR_EACH_OBJECT_*` flags	2026-02-23 13:48:00 -08:00
Derrick Stolee	096aa60998	config: use an enum for type The --type=<X> option for 'git config' has previously been defined using macros, but using a typed enum is better for tracking the possible values. Move the definition up to make sure it is defined before a macro uses some of its terms. Update the initializer for config_display_options to explicitly set 'type' to TYPE_NONE even though this is implied by a zero value. This assists in knowing that the switch statement added in the previous change has a complete set of cases for a properly-valued enum. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	645f92a3e9	config: restructure format_config() The recent changes have replaced the bodies of most if/else-if cases with simple helper method calls. This makes it easy to adapt the structure into a clearer switch statement, leaving a simple if/else in the default case. Make things a little simpler to read by reducing the nesting depth via a new goto statement when we want to skip values. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	2d4ab5a885	config: format colors quietly Move the logic for formatting color config value into a helper method and use quiet parsing when needed. This removes error messages when parsing a list of config values that do not match color formats. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	db45e4908d	color: add color_parse_quietly() When parsing colors, a failed parse leads to an error message due to the result returning error(). To allow for quiet parsing, create color_parse_quietly(). This is in contrast to an ..._gently() version because the original does not die(), so both options are technically 'gentle'. To accomplish this, convert the implementation of color_parse_mem() into a static color_parse_mem_1() helper that adds a 'quiet' parameter. The color_parse_quietly() method can then use this. Since it is a near equivalent to color_parse(), move that method down in the file so they can be nearby while also appearing after color_parse_mem_1(). Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	9cb4a5e1ba	config: format expiry dates quietly Move the logic for formatting expiry date config values into a helper method and use quiet parsing when needed. Note that git_config_expiry_date() will show an error on a bad parse and not die() like most other git_config...() parsers. Thus, we use 'quietly' here instead of 'gently'. There is an unfortunate asymmetry in these two parsing methods, but we need to treat a positive response from parse_expiry_date() as an error or we will get incorrect values. This updates the behavior of 'git config list --type=expiry-date' to be quiet when attempting parsing on non-date values. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	bcfb9128c9	config: format paths gently Move the logic for formatting path config values into a helper method and use gentle parsing when needed. We need to be careful about how to handle the ':(optional)' macro, which as tested in t1311-config-optional.sh must allow for ignoring a missing path when other multiple values exist, but cause 'git config get' to fail if it is the only possible value and thus no result is output. In the case of our list, we need to omit those values silently. This necessitates the use of the 'gently' parameter here. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	9c7fc23c24	config: format bools or strings in helper Move the logic for formatting bool-or-string config values into a helper. This parsing has always been gentle, so this is not unlocking new behavior. This extraction is only to match the formatting of the other cases that do need a behavior change. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	5fb7bdcca9	config: format bools or ints gently Move the logic for formatting bool-or-int config values into a helper method and use gentle parsing when needed. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:41 -08:00
Derrick Stolee	53959a8ba2	config: format bools gently Move the logic for formatting bool config values into a helper method and use gentle parsing when needed. This makes 'git config list --type=bool' not fail when coming across a non-boolean value. Such unparseable values are filtered out quietly. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:40 -08:00
Derrick Stolee	d744923fef	config: format int64s gently Move the logic for formatting int64 config values into a helper method and use gentle parsing when needed. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:40 -08:00
Derrick Stolee	1ef1f9d53a	config: make 'git config list --type=<X>' work Previously, the --type=<X> argument to 'git config list' was ignored and did nothing. Now, we add the use of format_config() to the show_all_config() function so each key-value pair is attempted to be parsed. This is our first use of the 'gently' parameter with a nonzero value. When listing multiple values, our initial settings for the output format is different. Add a new init helper to specify the fact that keys should be shown and also add the default delimiters as they were unset in some cases. Our intention is that if there is an error in parsing, then the row is not output. This is necessary to avoid the caller needing to build their own validator to understand the difference between valid, canonicalized types and other raw string values. The raw values will always be available to the user if they do not specify the --type=<X> option. The current behavior is more complicated, including error messages on bad parsing or potentially complete failure of the command. We add tests at this point that demonstrate the current behavior so we can witness the fix in future changes that parse these values quietly and gently. This is a change in behavior! We are starting to respect an option that was previously ignored, leading to potential user confusion. This is probably still a good option, since the --type argument did not change behavior at all previously, so users can get the behavior they expect by removing the --type argument or adding the --no-type argument. t1300-config.sh is updated with the current behavior of this formatting logic to justify the upcoming refactoring of format_config() that will incrementally fix some of these cases to be more user-friendly. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:40 -08:00
Derrick Stolee	12210d0346	config: add 'gently' parameter to format_config() This parameter is set to 0 for all current callers and is UNUSED. However, we will start using this option in future changes and in a critical change that requires gentle parsing (not using die()) to try parsing all values in a list. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:40 -08:00
Derrick Stolee	25ede48b54	config: move show_all_config() In anticipation of using format_config() in this method, move show_all_config() lower in the file without changes. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:23:40 -08:00
Patrick Steinhardt	1dd4f1e43f	refs: replace `refs_for_each_fullref_in()` Replace calls to `refs_for_each_fullref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:19 -08:00
Patrick Steinhardt	96c35a9ba5	refs: replace `refs_for_each_namespaced_ref()` Replace calls to `refs_for_each_namespaced_ref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:19 -08:00
Patrick Steinhardt	3fc1ad03c6	refs: replace `refs_for_each_glob_ref()` Replace calls to `refs_for_each_glob_ref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:19 -08:00
Patrick Steinhardt	4091d29893	refs: replace `refs_for_each_glob_ref_in()` Replace calls to `refs_for_each_glob_ref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:19 -08:00
Patrick Steinhardt	5ef6d593f1	refs: replace `refs_for_each_rawref_in()` Replace calls to `refs_for_each_rawref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:19 -08:00
Patrick Steinhardt	67034081ad	refs: replace `refs_for_each_rawref()` Replace calls to `refs_for_each_rawref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	00be226f1f	refs: replace `refs_for_each_ref_in()` Replace calls to `refs_for_each_ref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	5507200b50	refs: improve verification for-each-ref options Improve verification of the passed-in for-each-ref options: - Require that the `refs` store must be given. It's arguably very surprising that we simply return successfully in case the ref store is a `NULL` pointer. - When expected to trim ref prefixes we will `BUG()` in case the refname would become empty or in case we're expected to trim a longer prefix than the refname is long. As such, this case is only guaranteed to _not_ `BUG()` in case the caller also specified a prefix. And furthermore, that prefix must end in a trailing slash, as otherwise it may produce an exact match that could lead us to trim to the empty string. An audit shows that there are no callsites that rely on either of these behaviours, so this should not result in a functional change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	f503bb7dc9	refs: generalize `refs_for_each_fullref_in_prefixes()` The function `refs_for_each_fullref_in_prefixes()` can be used to iterate over all references part of any of the user-provided prefixes. In contrast to the `prefix` parameter of `refs_for_each_ref_ext()` it knows to handle the case well where multiple of the passed-in prefixes start with a common prefix by computing longest common prefixes and then iterating over those. While we could move this logic into `refs_for_each_ref_ext()`, this one feels somewhat special as we perform multiple iterations. But what we _can_ do is to generalize how this function works: instead of accepting only a small handful of parameters, we can have it accept the full options structure. One obvious exception is that the caller must not provide a prefix via the options. But this case can be easily detected. Refactor the code accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	5387919327	refs: generalize `refs_for_each_namespaced_ref()` The function `refs_for_each_namespaced_ref()` iterates through all references that are part of the current ref namespace. This namespace can be configured by setting the `GIT_NAMESPACE` environment variable and is then retrieved by calling `get_git_namespace()`. If a namespace is configured, then we: - Obviously only yield refs that exist in this namespace. - Rewrite exclude patterns so that they work for the given namespace, if any namespace is currently configured. Port this logic to `refs_for_each_ref_ext()` by adding a new `namespace` field to the options structure. This gives callers more flexibility as they can decide by themselves whether they want to use the globally configured or an arbitrary other namespace. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	daf01b1366	refs: speed up `refs_for_each_glob_ref_in()` The function `refs_for_each_glob_ref_in()` can be used to iterate through all refs in a specific prefix with globbing. The logic to handle this is currently hosted by `refs_for_each_glob_ref_in()`, which sets up a callback function that knows to filter out refs that _don't_ match the given globbing pattern. The way we do this is somewhat inefficient though: even though the function is expected to only yield refs in the given prefix, we still end up iterating through _all_ references, regardless of whether or not their name matches the given prefix. Extend `refs_for_each_ref_ext()` so that it can handle patterns and adapt `refs_for_each_glob_ref_in()` to use it. This means we continue to use the same callback-based infrastructure to filter individual refs via the globbing pattern, but we can now also use the other functionality of the `_ext()` variant. Most importantly, this means that we now properly handle the prefix. This results in a performance improvement when using a prefix where a significant majority of refs exists outside of the prefix. The following benchmark is an extreme case, with 1 million refs that exist outside the prefix and a single ref that exists inside it: Benchmark 1: git rev-parse --branches=refs/heads/* (rev = HEAD~) Time (mean ± σ): 115.9 ms ± 0.7 ms [User: 113.0 ms, System: 2.4 ms] Range (min … max): 114.9 ms … 117.8 ms 25 runs Benchmark 2: git rev-parse --branches=refs/heads/* (rev = HEAD) Time (mean ± σ): 1.1 ms ± 0.1 ms [User: 0.3 ms, System: 0.7 ms] Range (min … max): 1.0 ms … 2.3 ms 2092 runs Summary git rev-parse --branches=refs/heads/* (rev = HEAD) ran 107.01 ± 6.49 times faster than git rev-parse --branches=refs/heads/* (rev = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00
Patrick Steinhardt	aefcc9b367	refs: introduce `refs_for_each_ref_ext` In the refs subsystem we have a proliferation of functions that all iterate through references. (Almost) all of these functions internally call `do_for_each_ref()` and provide slightly different arguments so that one can control different aspects of its behaviour. This approach doesn't really scale: every time there is a slightly different use case for iterating through refs we create another new function. This combinatorial explosion doesn't make a lot of sense: it leads to confusing interfaces and heightens the maintenance burden. Refactor the code to become more composable by: - Exposing `do_for_each_ref()` as `refs_for_each_ref_ext()`. - Introducing an options structure that lets the caller control individual options. This gives us a much better foundation to build on going forward. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-02-23 13:21:18 -08:00

... 8 9 10 11 12 ...

175431 Commits