mirror of
https://github.com/git-for-windows/git.git
synced 2026-06-27 00:58:30 -05:00
pack-objects: support --delta-islands with --path-walk
Since the inception of `--path-walk`, this option has had a documented
incompatibility with `--delta-islands`.
When discussing those original patches on the list, a message from
Stolee in [1] noted the following:
this could be remedied by [...] doing a separate walk to identify
islands using the normal method
In a related portion of the thread, Peff explains[2]:
The delta islands code already does its own tree walk to propagate
the bits down (it does rely on the base walk's show_commit() to
propagate through the commits).
Once each object has its island bitmaps, I think however you
choose to come up with delta candidates [...] you should be able
to use it. It's fundamentally just answering the question of "am
I allowed to delta between these two objects".
That is similar to what this patch does, and it turns out the cheaper
option is sufficient: perform the same island side effects from the
path-walk callback rather than doing a second walk.
Recall how delta-islands are computed during a normal repack:
- `show_commit()` calls `propagate_island_marks()` for each commit,
which merges the commit's island bitset onto its root tree object and
onto each of its parent commits.
- `show_object()` for a tree records the tree's depth derived from the
slash-separated pathname. Subsequent `resolve_tree_islands()` uses
that depth to walk trees in increasing-depth order, propagating each
tree's marks to its children.
- At delta-search time, `in_same_island()` enforces that a delta
target's island bitmap is a subset of its base's: every island that
reaches the target must also reach the base.
Path-walk's enumeration callback is `add_objects_by_path()`. It already
adds objects to `to_pack`, but until now did not perform the
island-related side effects. Two things are needed:
- For each commit batch, call `propagate_island_marks()` on commits,
exactly as `show_commit()` does.
We have to be careful about the order in which we call this function,
and we must see a commit before its parents in order to have
island marks to propagate.
The path-walk batch preserves that order. Path-walk appends commits
to its `OBJ_COMMIT` batch as they come back from the same
`get_revision()` loop the regular traversal uses, and
`add_objects_by_path()` iterates the batch in array order. So every
commit reaches `propagate_island_marks()` in the same sequence that
`show_commit()` would have seen it, and the descendant-first chain
that the algorithm relies on is intact.
Skip island propagation for excluded commits to match the regular
traversal, whose `show_commit()` callback is only invoked for
interesting commits. Boundary commits may still be present in
path-walk's callback so they can serve as thin-pack bases, but they
should not contribute island marks.
- For each tree batch, record the tree's depth from the path. Use the
`record_tree_depth()` helper from the previous commit so both
callbacks behave identically, including the max-depth-wins behavior
when a tree is reached via more than one path. The helper accepts
both the `show_object()` path shape ("foo", "foo/bar") and the
path-walk shape with a trailing slash ("foo/", "foo/bar/"), so depths
recorded from either traversal mode are directly comparable.
This is implicit in the implementation sketch from Peff above.
`resolve_tree_islands()` sorts trees by `oe->tree_depth` in
increasing-depth order before propagating marks down, so that a
parent tree's marks are finalized before its children inherit them.
Without recording the depth at path-walk time, every
path-walk-discovered tree would land at depth 0 in `to_pack`, the
sort would lose its ordering, and children could inherit marks from
parents whose own contributions had not yet been merged in.
With those two pieces in place, `resolve_tree_islands()` receives the
same island inputs from path-walk as it would from the regular
traversal, so the existing island checks can be reused unchanged.
Drop the documented incompatibility between `--path-walk` and
`--delta-islands`, and add t5320 coverage for path-walk island repacks
with and without bitmap writing, as well as the same-island case where a
delta remains allowed.
[1]: https://lore.kernel.org/git/9aa2471b-0850-4707-9733-d3b33609f5f2@gmail.com/
[2]: https://lore.kernel.org/git/20240911063203.GA1538586@coredump.intra.peff.net/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
committed by
Junio C Hamano
parent
264efee401
commit
7e6de2ac62
@@ -402,13 +402,13 @@ will be automatically changed to version `1`.
|
||||
of filenames that cause collisions in Git's default name-hash
|
||||
algorithm.
|
||||
+
|
||||
Incompatible with `--delta-islands`. When `--use-bitmap-index` is
|
||||
specified with `--path-walk`, a successful bitmap traversal is used for
|
||||
object enumeration, with path-walk remaining as the fallback traversal
|
||||
when the bitmap cannot satisfy the request. The `--path-walk` option
|
||||
supports the `--filter=<spec>` forms `blob:none`, `blob:limit=<n>`,
|
||||
`tree:0`, `object:type=<type>`, and `sparse:<oid>`. These supported filter
|
||||
types can be combined with the `combine:<spec>+<spec>` form.
|
||||
When `--use-bitmap-index` is specified with `--path-walk`, a successful
|
||||
bitmap traversal is used for object enumeration, with path-walk
|
||||
remaining as the fallback traversal when the bitmap cannot satisfy the
|
||||
request. The `--path-walk` option supports the `--filter=<spec>` forms
|
||||
`blob:none`, `blob:limit=<n>`, `tree:0`, `object:type=<type>`, and
|
||||
`sparse:<oid>`. These supported filter types can be combined with the
|
||||
`combine:<spec>+<spec>` form.
|
||||
|
||||
|
||||
DELTA ISLANDS
|
||||
|
||||
@@ -4747,13 +4747,29 @@ static int add_objects_by_path(const char *path,
|
||||
|
||||
add_object_entry(oid, type, path, exclude);
|
||||
|
||||
if (type == OBJ_COMMIT && write_bitmap_index) {
|
||||
if (type == OBJ_COMMIT) {
|
||||
struct commit *commit;
|
||||
|
||||
if (!write_bitmap_index && !use_delta_islands)
|
||||
continue;
|
||||
|
||||
commit = lookup_commit(the_repository, oid);
|
||||
if (!commit)
|
||||
die(_("could not find commit %s"), oid_to_hex(oid));
|
||||
index_commit_for_bitmap(commit);
|
||||
if (write_bitmap_index)
|
||||
index_commit_for_bitmap(commit);
|
||||
/*
|
||||
* Skip island propagation for boundary commits.
|
||||
* The regular traversal's show_commit() is only
|
||||
* called for interesting commits; matching that
|
||||
* here keeps path-walk from doing extra work that
|
||||
* would only be a no-op anyway (boundary commits
|
||||
* are not in island_marks).
|
||||
*/
|
||||
if (use_delta_islands && !exclude)
|
||||
propagate_island_marks(the_repository, commit);
|
||||
} else if (type == OBJ_TREE && use_delta_islands) {
|
||||
record_tree_depth(oid, path);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -5215,8 +5231,6 @@ int cmd_pack_objects(int argc,
|
||||
const char *option = NULL;
|
||||
if (!path_walk_filter_compatible(&filter_options))
|
||||
option = "--filter";
|
||||
else if (use_delta_islands)
|
||||
option = "--delta-islands";
|
||||
|
||||
if (option) {
|
||||
warning(_("cannot use %s with %s"),
|
||||
|
||||
@@ -53,6 +53,35 @@ test_expect_success 'separate islands disallows delta' '
|
||||
! is_delta_base $two $one
|
||||
'
|
||||
|
||||
test_expect_success 'path-walk island repack respects islands' '
|
||||
GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-islands" \
|
||||
git -c "pack.island=refs/heads/(.*)" repack -adfi \
|
||||
--path-walk 2>err &&
|
||||
test_region pack-objects path-walk trace.path-walk-islands &&
|
||||
test_grep ! "cannot use --delta-islands with --path-walk" err &&
|
||||
! is_delta_base $one $two &&
|
||||
! is_delta_base $two $one
|
||||
'
|
||||
|
||||
test_expect_success 'path-walk island bitmap repack respects islands' '
|
||||
GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-island-bitmap" \
|
||||
git -c "pack.island=refs/heads/(.*)" repack -a -d -f -i -b \
|
||||
--path-walk 2>err &&
|
||||
test_region pack-objects path-walk trace.path-walk-island-bitmap &&
|
||||
test_path_is_file .git/objects/pack/*.bitmap &&
|
||||
git rev-list --test-bitmap --use-bitmap-index one &&
|
||||
test_grep ! "cannot use --delta-islands with --path-walk" err &&
|
||||
! is_delta_base $one $two &&
|
||||
! is_delta_base $two $one
|
||||
'
|
||||
|
||||
test_expect_success 'path-walk same island allows delta' '
|
||||
GIT_TRACE2_EVENT="$(pwd)/trace.path-walk-same-island" \
|
||||
git -c "pack.island=refs/heads" repack -adfi --path-walk &&
|
||||
test_region pack-objects path-walk trace.path-walk-same-island &&
|
||||
is_delta_base $one $two
|
||||
'
|
||||
|
||||
test_expect_success 'same island allows delta' '
|
||||
git -c "pack.island=refs/heads" repack -adfi &&
|
||||
is_delta_base $one $two
|
||||
|
||||
Reference in New Issue
Block a user