Some of the fields in `struct object_info` are undocumented. Add these
missing comments.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have migrated all callers to learn how a
specific object is stored from the new object info source instead, and
hence the `whence` field is now unused. Drop it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `whence` field has become redundant now that callers can learn about
the exact source an object has been looked up from via the `struct
object_info_source::source` field.
Adapt callers to use the new field. Note that all callsites already set
up the `info.sourcep` request pointer, so the conversion is rather
straightforward.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced `struct object_info_source` as an opt-in
container for backend-specific information, but for now we only moved
preexisting data into this structure. Most importantly, the caller has
no way yet to learn about which source an object was actually looked up
from. Instead, callers have to rely on the `whence` enum, which only
identifies the kind of backend and cannot tell them the concrete source.
Add a `struct odb_source *source` field to the structure and populate it
from each backend's lookup path.
The `whence` enum is still set and used by callers; it will be removed
in a subsequent commit now that `sourcep->source` can identify the
backend on its own.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct object_info` carries two pieces of information
about how an object was looked up:
- The `whence` enum identifying the backend.
- The backend-tagged union `u` exposing backend-specific details
(currently only the packed-source case, which records the owning
pack, offset and packed object type).
The union is populated unconditionally, even though most callers don't
care about provenance at all.
Split the backend-specific union out into a new public type, `struct
object_info_source`, and make the object info structure carry it via
just another opt-in request pointer. As with all the other requestable
information, callers that need source info allocate a `struct
object_info_source` on the stack and point `sourcep` at it; callers that
don't care about it simply leave the field as a `NULL` pointer. Adapt
callers accordingly.
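The opt-in pattern described above can be sketched as follows. This is
a minimal illustration only: every `demo_*` name and the contents of the
structures are hypothetical stand-ins, not the actual definitions used
by Git.

```c
#include <stddef.h>

/* Hypothetical stand-in for the backend-specific details. */
struct demo_info_source {
	size_t offset; /* e.g. where the object lives inside its pack */
};

/* Hypothetical stand-in for the request structure. */
struct demo_object_info {
	size_t *sizep;                    /* optional: object size */
	struct demo_info_source *sourcep; /* optional: backend details */
};

/*
 * Hypothetical lookup. Only the fields the caller asked for are
 * populated; a NULL request pointer means "not interested".
 */
int demo_read_object_info(struct demo_object_info *oi)
{
	if (oi->sizep)
		*oi->sizep = 42;
	if (oi->sourcep)
		oi->sourcep->offset = 1234;
	return 0;
}
```

A caller that wants the source details places a `struct
demo_info_source` on the stack and points `sourcep` at it; a caller that
leaves `sourcep` as `NULL` pays nothing for information it does not use.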
Note that the `whence` enum is strictly speaking also backend-specific
information, so it would be another good candidate to be moved into the
`struct object_info_source`. For now though it is left alone, as it will
be replaced by a `struct odb_source` pointer in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an optional `struct odb_source_packed *source` parameter to
`packed_object_info()` and `packed_object_info_with_index_pos()`. This
parameter is unused at this point in time, but it will be used in a
follow-up commit so that we can record the source of a specific object.
Note that callers in "odb/source-packed.c" pass the already-available
source, but all other callers pass `NULL` instead. This is fine though,
as we only care about populating this info when called via the packed
store.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/odb-source-packed:
odb/source-packed: drop pointer to "files" parent source
midx: refactor interfaces to work on "packed" source
odb/source-packed: stub out remaining functions
odb/source-packed: wire up `freshen_object()` callback
odb/source-packed: wire up `find_abbrev_len()` callback
odb/source-packed: wire up `count_objects()` callback
odb/source-packed: wire up `for_each_object()` callback
odb/source-packed: wire up `read_object_stream()` callback
odb/source-packed: wire up `read_object_info()` callback
packfile: use higher-level interface to implement `has_object_pack()`
odb/source-packed: wire up `reprepare()` callback
odb/source-packed: wire up `close()` callback
odb/source-packed: start converting to a proper `struct odb_source`
odb/source-packed: store pointer to "files" instead of generic source
packfile: move packed source into "odb/" subsystem
packfile: split out packfile list logic
packfile: rename `struct packfile_store` to `odb_source_packed`
* js/objects-larger-than-4gb-on-windows-more:
odb: use size_t for object_info.sizep and the size APIs
packfile,delta: drop the `cast_size_t_to_ulong()` wrappers
pack-objects: use size_t for in-core object sizes
packfile: widen unpack_entry()'s size out-parameter to size_t
pack-objects(check_pack_inflate()): use size_t instead of unsigned long
patch-delta: use size_t for sizes
compat/msvc: use _chsize_s for ftruncate
A hotfix to an earlier attempt to update code paths that assumed
"unsigned long" was long enough for "size_t".
* js/objects-larger-than-4gb-on-windows:
zlib: properly clamp to uLong
On platforms where `unsigned long` and `size_t` differ in bit size, we
want to clamp the buffers we pass to zlib to the former's size, as per
d05d666977 (git-zlib: handle data streams larger than 4GB, 2026-05-08).
The logic introduced in that commit clamps by truncating to the low
bits, though, which fails to do what is needed here: if too many bytes
are available in the buffers, we need to clamp to the maximum value of
an `unsigned long`. Otherwise, we ask zlib to use buffers that are too
small, in the worst case using 0 as the size (think: a value whose 32
lowest bits are all zero).
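The difference between truncating and clamping can be illustrated with
fixed-width types. This is only a sketch: `uint64_t` stands in for a
64-bit `size_t`, `uint32_t` stands in for a 32-bit `unsigned long`, and
the `demo_*` helpers are hypothetical names, not functions from Git or
zlib.

```c
#include <stdint.h>

/* Truncation keeps only the 32 lowest bits of the value. */
uint32_t demo_truncate_to_32(uint64_t avail)
{
	return (uint32_t)avail;
}

/* Clamping saturates at the largest value the narrow type can hold. */
uint32_t demo_clamp_to_32(uint64_t avail)
{
	return avail > UINT32_MAX ? UINT32_MAX : (uint32_t)avail;
}
```

With exactly 2^32 bytes available, the truncating variant yields 0,
whereas the clamping variant yields the maximum representable value.
For values that fit into 32 bits both variants agree.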
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
compute_reachable_generation_numbers() in commit-graph used a 32-bit
integer to accumulate parent generations, which is OK for generation
number v1 (topological levels), but with generation number v2
(adjusted committer timestamps), it truncated timestamps beyond the
year 2106. Fixed by widening the accumulator to timestamp_t.
* en/commit-graph-timestamp-fix:
commit-graph: use timestamp_t for max parent generation accumulator
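The truncation described in this topic can be reproduced in isolation.
The sketch below is not the code from commit-graph: `demo_timestamp_t`
is a hypothetical 64-bit stand-in for `timestamp_t`, and both functions
are made up to contrast a narrow and a wide accumulator.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_timestamp_t; /* assumed 64-bit timestamp type */

/* Narrow accumulator: the high bits are dropped on assignment. */
uint32_t demo_max_generation_narrow(const demo_timestamp_t *parents,
				    size_t nr)
{
	uint32_t max = 0;
	for (size_t i = 0; i < nr; i++)
		if (parents[i] > max)
			max = (uint32_t)parents[i];
	return max;
}

/* Wide accumulator: the full value is preserved. */
demo_timestamp_t demo_max_generation_wide(const demo_timestamp_t *parents,
					  size_t nr)
{
	demo_timestamp_t max = 0;
	for (size_t i = 0; i < nr; i++)
		if (parents[i] > max)
			max = parents[i];
	return max;
}
```

Any value of 2^32 or more, which as a count of seconds since 1970
corresponds to the year 2106, no longer fits into the narrow
accumulator.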
The UNUSED macro in 'compat/posix.h' has been updated to use a
newly introduced GIT_CLANG_PREREQ macro for compiler version
checks, and the existing GIT_GNUC_PREREQ macro has been modernized
to use explicit major/minor comparisons rather than bit-shifting.
* dl/posix-unused-warning-clang:
compat/posix.h: simplify GIT_GNUC_PREREQ() comparison
compat/posix.h: clean up GIT_GNUC_PREREQ() and UNUSED
compat/posix.h: enable UNUSED warning messages for Clang
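The two styles of version check mentioned in this topic can be compared
side by side. The macros below are hypothetical and take the "current"
version as explicit arguments so that they can be exercised on any
compiler; they are a sketch of the idea, not the definitions found in
'compat/posix.h'.

```c
/* Bit-shifting style: pack major and minor into a single integer. */
#define DEMO_PREREQ_SHIFT(have_maj, have_min, maj, min) \
	((((have_maj) << 16) + (have_min)) >= (((maj) << 16) + (min)))

/* Explicit style: compare major first, then minor on a tie. */
#define DEMO_PREREQ_EXPLICIT(have_maj, have_min, maj, min) \
	((have_maj) > (maj) || \
	 ((have_maj) == (maj) && (have_min) >= (min)))
```

Both forms give the same answer for ordinary version numbers; the
explicit form spells out the ordering without relying on how the two
components are packed.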
`git ls-files --modified` and `git ls-files --deleted` have been
optimized to filter with pathspec before calling lstat() when there is
only a single pathspec item, avoiding unnecessary filesystem access
for entries that will not be shown.
* td/ls-files-pathspec-prefilter:
ls-files: filter pathspec before lstat
Various AsciiDoc markup fixes in 'git config' documentation and
related files to ensure lists and formatting are rendered correctly.
* ta/doc-config-adoc-fixes:
doc: git-config: escape erroneous highlight markup
doc: config/sideband: fix description list delimiter
doc: config: terminate runaway lists
'git describe' has been taught to pass the 'refs/tags/' prefix down to
the ref iterator when '--all' is not requested, avoiding unnecessary
iteration over non-tag refs.
* td/describe-tag-iteration:
describe: limit default ref iteration to tags
The TSAN race in transfer_debug() within transport-helper.c has been
resolved by initializing the debug flag early in
bidirectional_transfer_loop() before spawning worker threads, allowing
the removal of a TSAN suppression.
* ps/transport-helper-tsan-fix:
transport-helper: fix TSAN race in transfer_debug()
"git index-pack" has been optimized by retaining child bases in the
delta cache instead of immediately freeing them, letting the existing
cache limit policy decide eviction.
* ab/index-pack-retain-child-bases:
index-pack: retain child bases in delta cache
Over the last commits we have turned the packfile store into a proper
object database source that can be used as a standalone backend. As
such, it is no longer necessary to have it coupled to the "files" parent
source.
Remove the pointer to the owning "files" source so that the "packed"
source can be used as a standalone entity.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our interfaces used to interact with MIDXs all work on top of the
generic `struct odb_source`. This doesn't make much sense though: a MIDX
is strictly tied to the "packed" source, so passing in a generic source
gives the false sense that it may also work with a different type of
source.
Fix this conceptual weirdness and instead require the caller to pass in
a "packed" source explicitly. This also makes the next commit easier to
implement, where we drop the pointer to the "files" source in the
"packed" source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stub out remaining functions that we either don't need or that are
basically no-ops.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `packfile_store_freshen_object()` from "packfile.c" into
"odb/source-packed.c" and wire it up as the `freshen_object()` callback
of the "packed" source.
Note that this removes the last external caller of `find_pack_entry()`
from "packfile.c", which means that we can now make this function
static.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `packfile_store_find_abbrev_len()` and its associated helpers from
"packfile.c" into "odb/source-packed.c" and wire it up as the
`find_abbrev_len()` callback of the "packed" source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `packfile_store_count_objects()` from "packfile.c" into
"odb/source-packed.c" and wire it up as the `count_objects()` callback
of the "packed" source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `packfile_store_for_each_object()` and its associated helpers from
"packfile.c" into "odb/source-packed.c" and wire it up as the
`for_each_object()` callback of the "packed" source.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wire up the `read_object_stream()` callback for the packed source and
call it in the "files" source via the `odb_source_read_object_stream()`
interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the logic to read object info from a "packed" source into
"odb/source-packed.c" and wire it up as the `read_object_info()`
callback.
Note that we also move around the supporting `find_pack_entry()`, but we
still have to expose it to other callers that exist in "packfile.c".
This will be fixed in subsequent commits though, where all callers in
"packfile.c" will have been moved into "odb/source-packed.c", and at
that point we'll be able to make `find_pack_entry()` file-local again.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `has_object_pack()` we're checking whether a specific object exists
as part of a packfile. This is done by calling the low-level function
`find_pack_entry()`, but this function will eventually be moved into
"odb/source-packed.c" and made file-local.
Refactor the code to use `packfile_store_read_object_info()` instead.
This refactoring is functionally equivalent, as that function will call
`find_pack_entry()` itself and then return immediately when it is not
passed an object info pointer as parameter.
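Why the two approaches are equivalent can be shown with a toy model.
Everything below is hypothetical: the `demo_*` functions and the tiny
in-memory "pack" only mimic the control flow described above and are not
Git's implementation.

```c
#include <stddef.h>
#include <string.h>

struct demo_info {
	size_t *sizep; /* optional: filled in only when non-NULL */
};

/* Toy "pack": object IDs and their sizes. */
static const char *const demo_ids[] = { "1a2b", "3c4d" };
static const size_t demo_sizes[] = { 10, 20 };

/* Low-level lookup; returns the entry index or -1 when not found. */
int demo_find_entry(const char *id)
{
	for (size_t i = 0; i < sizeof(demo_ids) / sizeof(demo_ids[0]); i++)
		if (!strcmp(demo_ids[i], id))
			return (int)i;
	return -1;
}

/* Higher-level interface built on top of the low-level lookup. */
int demo_read_object_info(const char *id, struct demo_info *info)
{
	int pos = demo_find_entry(id);
	if (pos < 0)
		return -1;
	if (!info)
		return 0; /* existence check only, nothing to fill in */
	if (info->sizep)
		*info->sizep = demo_sizes[pos];
	return 0;
}

/* Existence check expressed via the higher-level interface. */
int demo_has_object(const char *id)
{
	return demo_read_object_info(id, NULL) == 0;
}
```

Because the higher-level function returns right after the lookup when no
info pointer is given, the existence check does the same work as calling
the low-level lookup directly.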
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the logic to prepare and reprepare the "packed" source into
"odb/source-packed.c" and wire it up as the `reprepare()` callback.
Note that "preparing" a source is not yet generic. Eventually, it would
probably make sense to turn the existing `reprepare()` callback into a
`prepare()` callback with an optional flag to force re-preparing. But
this step will be handled in a separate patch series.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wire up a new `close()` callback for the packed source and call it from
the "files" source via the generic `odb_source_close()` interface.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start converting `struct odb_source_packed` into a proper pluggable
`struct odb_source` by embedding the base struct and assigning it the
new `ODB_SOURCE_PACKED` type. Furthermore, wire up lifecycle management
of this source by implementing the `free` callback and taking ownership
of the chdir notifications.
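Embedding a base struct and tagging it with a type is a common C idiom,
sketched below. All `demo_*` names, the members of the structures and
the downcast helper are hypothetical; the real `struct odb_source` and
`struct odb_source_packed` have different contents.

```c
#include <stddef.h>
#include <stdlib.h>

enum demo_source_type {
	DEMO_SOURCE_FILES,
	DEMO_SOURCE_PACKED,
};

/* Generic part shared by every source (hypothetical layout). */
struct demo_source {
	enum demo_source_type type;
	void (*free)(struct demo_source *source);
};

/* Specialised source that embeds the generic part. */
struct demo_source_packed {
	struct demo_source base;
	size_t nr_packs;
};

/* Recover the specialised struct from a pointer to its embedded base. */
struct demo_source_packed *demo_downcast(struct demo_source *source)
{
	if (!source || source->type != DEMO_SOURCE_PACKED)
		return NULL;
	return (struct demo_source_packed *)
		((char *)source - offsetof(struct demo_source_packed, base));
}

/* The `free` callback releases the whole specialised struct. */
void demo_packed_free(struct demo_source *source)
{
	free(demo_downcast(source));
}

/* Allocate a packed source and hand it out as a generic source. */
struct demo_source *demo_packed_new(void)
{
	struct demo_source_packed *packed = calloc(1, sizeof(*packed));
	if (!packed)
		return NULL;
	packed->base.type = DEMO_SOURCE_PACKED;
	packed->base.free = demo_packed_free;
	return &packed->base;
}
```

Generic code only ever sees the base pointer and calls `free` through
it, while the type tag guards the downcast back to the specialised
struct.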
Note that the packed source is not yet functional as a standalone `struct
odb_source`, as it's missing all of the callback implementations. These
will be wired up in subsequent commits.
Further note that we're also registering a `chdir_notify` callback to
reparent our path. This wasn't previously necessary (and still isn't at
this point in time) because all paths are taken from the owning "files"
source, and that source already handles the reparenting for us. But a
subsequent commit will change that so that we're using the path of the
"packed" source, and once that happens we'll need it to be updated when
changing the working directory.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `struct odb_source_packed` holds a pointer to its owning parent
source. The way that Git is currently structured, this parent is always
the "files" source. In subsequent commits we're going to detangle that
so that the "packed" source doesn't have any owning parent source at
all, which makes it usable as a completely standalone source.
Detangling this mess is somewhat intricate though, and is made even more
intricate because it's not always clear which kind of source one is
holding at a specific point in time -- either the parent "files" source,
or the child "packed" source.
Make this relationship more explicit by storing a pointer to the "files"
source instead of storing a pointer to a generic `struct odb_source`.
This will help make subsequent steps a bit clearer by making it more
obvious whether we're using the generic "base" source or the owning
"files" source.
Note that this is only a temporary step. At the end of this series
we will have dropped the "files" pointer completely.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent patches we'll be turning `struct odb_source_packed` into a
proper `struct odb_source`. As a first step towards this goal, move its
struct out of "packfile.{c,h}" and into "odb/source-packed.{c,h}".
This detaches the implementation of the packfile object source from the
generic packfile code, following the same convention already used by the
"files" and "in-memory" sources.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit we're about to introduce the "packed" object database
source. This source will embed a packfile list, and consequently we'll
have to include "packfile.h" to make the struct definition available.
This will unfortunately lead to a cyclic dependency that we cannot
resolve with a forward declaration.
Split out the code that relates to the packfile list into a separate
compilation unit so that both "packfile.h" and "odb/source-packed.h" can
include it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not too long ago, we introduced the packfile store in b7983adb51
(packfile: introduce a new `struct packfile_store`, 2025-09-23). This
struct is responsible for managing all of our access to packfiles and is
used as one of the two sources of objects for the "files" source.
Back when I introduced this structure I didn't yet have a clear vision
that it would eventually also turn into a proper object database source,
or of what exactly that infrastructure would look like. Now though it's
becoming increasingly clear that it does make sense to treat it just the
same as any of our other ODB sources.
The consequence is that the naming is now a bit out-of-date: it's just
another source and will be turned into a proper `struct odb_source` over
the next couple of commits, but it's not named accordingly.
Rename the structure to `odb_source_packed` to align it with this goal
and to bring it in line with the other sources we already have.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without NO_RUST defined, the varint encoder/decoder lives in the
RUST_LIB, which needs to be linked. Symptom:
cc [... -o contrib/credential/osxkeychain/git-credential-osxkeychain [...]
Undefined symbols for architecture x86_64:
"_decode_varint", referenced from:
_read_untracked_extension in libgit.a[x86_64][63](dir.o)
_read_untracked_extension in libgit.a[x86_64][63](dir.o)
_read_one_dir in libgit.a[x86_64][63](dir.o)
_read_one_dir in libgit.a[x86_64][63](dir.o)
_load_cache_entry_block in libgit.a[x86_64][174](read-cache.o)
"_encode_varint", referenced from:
_write_untracked_extension in libgit.a[x86_64][63](dir.o)
_write_untracked_extension in libgit.a[x86_64][63](dir.o)
_write_untracked_extension in libgit.a[x86_64][63](dir.o)
_write_one_dir in libgit.a[x86_64][63](dir.o)
_write_one_dir in libgit.a[x86_64][63](dir.o)
_do_write_index in libgit.a[x86_64][174](read-cache.o)
ld: symbol(s) not found for architecture x86_64
While it is curious why these functions are needed at all (osxkeychain
does not read or write the index), the compile error is a real problem.
Instead of trying to play games to add `GITLIBS` while filtering out
`common-main.o`, replace the `$(LIB_FILE) $(EXTLIBS)` construct with the
much shorter `$(LIBS)` construct that _already_ filters out
`common-main.o` and adds the Rust library when needed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subprocess handshake during startup has been made gentler by using
packet_read_line_gently() instead of packet_read_line() to prevent the
parent Git process from dying abruptly when a configured subprocess
(e.g., a clean/smudge filter) fails to start.
* mm/subprocess-handshake-fix:
sub-process: use gentle handshake to avoid die() on startup failure
Various typos, grammatical errors, and duplicated words in both
documentation and code comments have been corrected.
* wy/docs-typofixes:
docs: fix typos and grammar
A handful of inappropriate uses of the_repository have been
rewritten to use the right repository structure instance in the
unpack-trees.c codepath.
* jd/unpack-trees-wo-the-repository:
unpack-trees: use repository from index instead of global
A recent regression in t7527 that broke TAP output has been fixed,
some other test noise that also broke TAP output has been silenced,
and 'prove' is now configured to fail on invalid TAP output to
prevent future regressions.
* ps/t7527-fix-tap-output:
t: let prove fail when parsing invalid TAP output
t/lib-git-p4: silence output when killing p4d and its watchdog
t/test-lib: silence EBUSY errors on Windows during test cleanup
t7810: turn MB_REGEX check into a lazy prereq
t7527: fix broken TAP output
ci: unify Linux images across GitLab and GitHub
gitlab-ci: add missing Linux jobs
gitlab-ci: rearrange Linux jobs to match GitHub's order
The 'git describe --contains --all' command has been fixed to
properly honor the '--match' and '--exclude' options by passing
them down to 'git name-rev' with the appropriate reference
prefixes.
* jk/describe-contains-all-match-fix:
describe: fix --exclude, --match with --contains and --all
Streaming revision walks have been optimized by using a priority queue
for date-sorting commits, speeding up walks in repositories with many
merges.
* kk/streaming-walk-pqueue:
revision: use priority queue for non-limited streaming walks
revision: introduce rev_walk_mode to clarify get_revision_1()
pack-objects: call release_revisions() after cruft traversal
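As an illustration of the data structure named in this topic, here is a
small binary max-heap keyed on commit date. It is a self-contained
sketch with hypothetical `demo_*` names and a fixed capacity, not the
priority queue implementation that Git ships.

```c
#include <stddef.h>

#define DEMO_QUEUE_CAP 64

struct demo_queue {
	unsigned long dates[DEMO_QUEUE_CAP];
	size_t nr;
};

static void demo_swap(unsigned long *a, unsigned long *b)
{
	unsigned long tmp = *a;
	*a = *b;
	*b = tmp;
}

/* Insert a commit date in O(log n); returns -1 when the queue is full. */
int demo_queue_put(struct demo_queue *q, unsigned long date)
{
	size_t i;

	if (q->nr == DEMO_QUEUE_CAP)
		return -1;
	i = q->nr++;
	q->dates[i] = date;
	while (i > 0) {
		size_t parent = (i - 1) / 2;
		if (q->dates[parent] >= q->dates[i])
			break;
		demo_swap(&q->dates[parent], &q->dates[i]);
		i = parent;
	}
	return 0;
}

/* Remove and return the newest date; the queue must not be empty. */
unsigned long demo_queue_get(struct demo_queue *q)
{
	unsigned long top = q->dates[0];
	size_t i = 0;

	q->dates[0] = q->dates[--q->nr];
	for (;;) {
		size_t left = 2 * i + 1, right = left + 1, largest = i;
		if (left < q->nr && q->dates[left] > q->dates[largest])
			largest = left;
		if (right < q->nr && q->dates[right] > q->dates[largest])
			largest = right;
		if (largest == i)
			break;
		demo_swap(&q->dates[i], &q->dates[largest]);
		i = largest;
	}
	return top;
}
```

Entries come back newest first regardless of insertion order, and each
insertion or removal touches only a logarithmic number of entries.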
"git rev-list" (and "git log" family of commands) learned a new "--max-count-oldest"
option that picks the oldest N commits in the range instead of the usual newest N.
* mf/revision-max-count-oldest:
bash-completions: add --max-count-oldest
revision.c: implement --max-count-oldest