Add path walk API and its use in 'git pack-objects' (#5171)

This is a follow-up to #5157 and is also motivated by the RFC in
gitgitgadget/git#1786.

We already have ways of walking all objects, but they focus on visiting a
single commit and then expanding to the not-yet-visited trees and blobs
reachable from that commit. This means that objects arrive without any
locality based on their path.

Add a new "path walk API" that focuses on walking objects in batches
according to their type and path. This will walk all annotated tags, all
commits, all root trees, and then start a depth-first search among all
paths in the repo to collect trees and blobs in batches.
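
For context, here is a minimal sketch of how a caller drives this API. It is
modeled on the get_object_list_path_walk() and add_objects_by_path() code in
the pack-objects diff below; the names print_batch and walk_example and the
exact include list are illustrative assumptions, not part of this change.

/*
 * Illustrative sketch only: set up a struct path_walk_info, point it at an
 * initialized rev_info, and register a callback that receives one batch of
 * object IDs per (type, path) pair.
 */
#include "git-compat-util.h"
#include "object.h"
#include "oid-array.h"
#include "path-walk.h"
#include "revision.h"

/* Hypothetical callback: report each batch of objects found at a path. */
static int print_batch(const char *path, struct oid_array *oids,
		       enum object_type type, void *data)
{
	size_t *total = data;

	printf("%s: %"PRIuMAX" objects of type %d\n",
	       path, (uintmax_t)oids->nr, (int)type);
	*total += oids->nr;
	return 0; /* a non-zero return aborts the walk */
}

/* Hypothetical driver: 'revs' must already be prepared via setup_revisions(). */
static void walk_example(struct rev_info *revs)
{
	struct path_walk_info info = PATH_WALK_INFO_INIT;
	size_t total = 0;

	info.revs = revs;
	info.path_fn = print_batch;
	info.path_fn_data = &total;

	if (walk_objects_by_path(&info))
		die("path walk failed");
}

Unlike a traverse_commit_list() callback, which sees one object at a time, the
callback here sees every object sharing a path at once, which is what lets
'git pack-objects' form contiguous delta-search regions.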

The most important application for this is being fast-tracked to Git for
Windows: `git pack-objects --path-walk`. This application of the path
walk API discovers the objects to pack via this batched walk and
automatically groups objects that appear at a common path so they can be
compared against each other as delta candidates.

This use completely avoids name-hash collisions (even the collisions
that sometimes occur with the new `--full-name-hash` option) and can be
much faster to compute, since the first pass of delta calculations does
not waste time on objects that are unlikely to delta well against each
other.
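
For readers unfamiliar with why the default name-hash collides: it folds each
character into a 32-bit value with a two-bit right shift per step, so only
roughly the last sixteen characters of a path influence the result. The
standalone approximation below (modeled on pack_name_hash() from
pack-objects.h, with hypothetical example paths, and not part of this change)
shows two unrelated files landing in the same bucket.

/*
 * Standalone approximation of Git's default pack name-hash, shown only to
 * illustrate why same-named files in different directories collide.
 */
#include <ctype.h>
#include <inttypes.h>
#include <stdio.h>

static uint32_t name_hash_like(const char *name)
{
	uint32_t c, hash = 0;

	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		/*
		 * Each later character shifts this one two bits further to
		 * the right, so anything more than about sixteen characters
		 * from the end stops contributing to the final value.
		 */
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}

int main(void)
{
	/* Hypothetical paths that differ only far from the end of the name. */
	printf("%08" PRIx32 "\n", name_hash_like("dir-a/package.json"));
	printf("%08" PRIx32 "\n", name_hash_like("dir-b/package.json"));
	/*
	 * Both calls print the same value, so the two blobs share a hash
	 * bucket even though their contents may be unrelated.
	 */
	return 0;
}

Grouping by the actual path, as the path-walk code does, sidesteps this hash
entirely, which is why the collisions stop mattering.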

Some statistics are available in the commit messages.
Derrick Stolee 2024-09-25 16:48:41 -04:00 committed by Johannes Schindelin
commit 81c7a40d30
23 changed files with 534 additions and 41 deletions

View File

@ -20,6 +20,10 @@ walking fewer objects.
+
* `pack.allowPackReuse=multi` may improve the time it takes to create a pack by
reusing objects from multiple packs instead of just one.
+
* `pack.usePathWalk` may speed up packfile creation and make the resulting
packfiles significantly smaller in the presence of certain filename collisions
with Git's default name-hash.
feature.manyFiles::
Enable config options that optimize for repos with many files in the

View File

@ -155,6 +155,14 @@ pack.useSparse::
commits contain certain types of direct renames. Default is
`true`.
pack.usePathWalk::
When true, git will default to using the '--path-walk' option in
'git pack-objects' when the '--revs' option is present. This
algorithm groups objects by path to maximize the ability to
compute delta chains across historical versions of the same
object. This may disable other options, such as using bitmaps to
enumerate objects.
pack.preferBitmapTips::
When selecting which commits will receive bitmaps, prefer a
commit at the tip of any reference that is a suffix of any value

View File

@ -16,7 +16,7 @@ SYNOPSIS
[--cruft] [--cruft-expiration=<time>]
[--stdout [--filter=<filter-spec>] | <base-name>]
[--shallow] [--keep-true-parents] [--[no-]sparse]
[--name-hash-version=<n>] < <object-list>
[--name-hash-version=<n>] [--path-walk] < <object-list>
DESCRIPTION
@ -375,6 +375,16 @@ many different directories. At the moment, this version is not allowed
when writing reachability bitmap files with `--write-bitmap-index` and it
will be automatically changed to version `1`.
--path-walk::
By default, `git pack-objects` walks objects in an order that
presents trees and blobs without regard to the path at which they
appear relative to a commit's root tree. The `--path-walk` option
enables a different walking algorithm that organizes trees and
blobs by path. This has the potential to improve delta compression
especially in the presence of filenames that cause collisions in
Git's default name-hash algorithm. Due to changing how the objects
are walked, this option is not compatible with `--delta-islands`,
`--shallow`, or `--filter`.
DELTA ISLANDS
-------------

View File

@ -11,7 +11,7 @@ SYNOPSIS
[verse]
'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]
[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]
[--write-midx] [--name-hash-version=<n>]
[--write-midx] [--name-hash-version=<n>] [--path-walk]
DESCRIPTION
-----------
@ -258,6 +258,18 @@ linkgit:git-multi-pack-index[1]).
Provide this argument to the underlying `git pack-objects` process.
See linkgit:git-pack-objects[1] for full details.
--path-walk::
This option passes the `--path-walk` option to the underlying
`git pack-objects` process (see linkgit:git-pack-objects[1]).
By default, `git pack-objects` walks objects in an order that
presents trees and blobs without regard to the path at which they
appear relative to a commit's root tree. The `--path-walk` option
enables a different walking algorithm that organizes trees and
blobs by path. This has the potential to improve delta compression
especially in the presence of filenames that cause collisions in
Git's default name-hash algorithm. Due to changing how the objects
are walked, this option is not compatible with `--delta-islands`
or `--filter`.
CONFIGURATION
-------------

View File

@ -70,3 +70,4 @@ Examples
See example usages in:
`t/helper/test-path-walk.c`,
`builtin/backfill.c`,
`builtin/pack-objects.c`

View File

@ -41,6 +41,9 @@
#include "promisor-remote.h"
#include "pack-mtimes.h"
#include "parse-options.h"
#include "blob.h"
#include "tree.h"
#include "path-walk.h"
/*
* Objects we are going to pack are collected in the `to_pack` structure.
@ -218,6 +221,7 @@ static int delta_search_threads;
static int pack_to_stdout;
static int sparse;
static int thin;
static int path_walk = -1;
static int num_preferred_base;
static struct progress *progress_state;
@ -3041,6 +3045,7 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
struct thread_params {
pthread_t thread;
struct object_entry **list;
struct packing_region *regions;
unsigned list_size;
unsigned remaining;
int window;
@ -3283,6 +3288,236 @@ static int add_ref_tag(const char *tag UNUSED, const char *referent UNUSED, cons
return 0;
}
static int should_attempt_deltas(struct object_entry *entry)
{
if (DELTA(entry))
return 0;
if (!entry->type_valid ||
oe_size_less_than(&to_pack, entry, 50))
return 0;
if (entry->no_try_delta)
return 0;
if (!entry->preferred_base) {
if (oe_type(entry) < 0)
die(_("unable to get type of object %s"),
oid_to_hex(&entry->idx.oid));
} else if (oe_type(entry) < 0) {
/*
* This object is not found, but we
* don't have to include it anyway.
*/
return 0;
}
return 1;
}
static void find_deltas_for_region(struct object_entry *list UNUSED,
struct packing_region *region,
unsigned int *processed)
{
struct object_entry **delta_list;
uint32_t delta_list_nr = 0;
ALLOC_ARRAY(delta_list, region->nr);
for (uint32_t i = 0; i < region->nr; i++) {
struct object_entry *entry = to_pack.objects + region->start + i;
if (should_attempt_deltas(entry))
delta_list[delta_list_nr++] = entry;
}
QSORT(delta_list, delta_list_nr, type_size_sort);
find_deltas(delta_list, &delta_list_nr, window, depth, processed);
free(delta_list);
}
static void find_deltas_by_region(struct object_entry *list,
struct packing_region *regions,
uint32_t start, uint32_t nr)
{
unsigned int processed = 0;
uint32_t progress_nr;
if (!nr)
return;
progress_nr = regions[nr - 1].start + regions[nr - 1].nr;
if (progress)
progress_state = start_progress(the_repository,
_("Compressing objects by path"),
progress_nr);
while (nr--)
find_deltas_for_region(list,
&regions[start++],
&processed);
display_progress(progress_state, progress_nr);
stop_progress(&progress_state);
}
static void *threaded_find_deltas_by_path(void *arg)
{
struct thread_params *me = arg;
progress_lock();
while (me->remaining) {
while (me->remaining) {
progress_unlock();
find_deltas_for_region(to_pack.objects,
me->regions,
me->processed);
progress_lock();
me->remaining--;
me->regions++;
}
me->working = 0;
pthread_cond_signal(&progress_cond);
progress_unlock();
/*
* We must not set ->data_ready before we wait on the
* condition because the main thread may have set it to 1
* before we get here. In order to be sure that new
* work is available if we see 1 in ->data_ready, it
* was initialized to 0 before this thread was spawned
* and we reset it to 0 right away.
*/
pthread_mutex_lock(&me->mutex);
while (!me->data_ready)
pthread_cond_wait(&me->cond, &me->mutex);
me->data_ready = 0;
pthread_mutex_unlock(&me->mutex);
progress_lock();
}
progress_unlock();
/* leave ->working 1 so that this doesn't get more work assigned */
return NULL;
}
static void ll_find_deltas_by_region(struct object_entry *list,
struct packing_region *regions,
uint32_t start, uint32_t nr)
{
struct thread_params *p;
int i, ret, active_threads = 0;
unsigned int processed = 0;
uint32_t progress_nr;
init_threaded_search();
if (!nr)
return;
progress_nr = regions[nr - 1].start + regions[nr - 1].nr;
if (delta_search_threads <= 1) {
find_deltas_by_region(list, regions, start, nr);
cleanup_threaded_search();
return;
}
if (progress > pack_to_stdout)
fprintf_ln(stderr, _("Path-based delta compression using up to %d threads"),
delta_search_threads);
CALLOC_ARRAY(p, delta_search_threads);
if (progress)
progress_state = start_progress(the_repository,
_("Compressing objects by path"),
progress_nr);
/* Partition the work amongst work threads. */
for (i = 0; i < delta_search_threads; i++) {
unsigned sub_size = nr / (delta_search_threads - i);
p[i].window = window;
p[i].depth = depth;
p[i].processed = &processed;
p[i].working = 1;
p[i].data_ready = 0;
p[i].regions = regions;
p[i].list_size = sub_size;
p[i].remaining = sub_size;
regions += sub_size;
nr -= sub_size;
}
/* Start work threads. */
for (i = 0; i < delta_search_threads; i++) {
if (!p[i].list_size)
continue;
pthread_mutex_init(&p[i].mutex, NULL);
pthread_cond_init(&p[i].cond, NULL);
ret = pthread_create(&p[i].thread, NULL,
threaded_find_deltas_by_path, &p[i]);
if (ret)
die(_("unable to create thread: %s"), strerror(ret));
active_threads++;
}
/*
* Now let's wait for work completion. Each time a thread is done
* with its work, we steal half of the remaining work from the
* thread with the largest number of unprocessed objects and give
it to that newly idle thread. This ensures good load balancing
* until the remaining object list segments are simply too short
* to be worth splitting anymore.
*/
while (active_threads) {
struct thread_params *target = NULL;
struct thread_params *victim = NULL;
unsigned sub_size = 0;
progress_lock();
for (;;) {
for (i = 0; !target && i < delta_search_threads; i++)
if (!p[i].working)
target = &p[i];
if (target)
break;
pthread_cond_wait(&progress_cond, &progress_mutex);
}
for (i = 0; i < delta_search_threads; i++)
if (p[i].remaining > 2*window &&
(!victim || victim->remaining < p[i].remaining))
victim = &p[i];
if (victim) {
sub_size = victim->remaining / 2;
target->regions = victim->regions + victim->remaining - sub_size;
victim->list_size -= sub_size;
victim->remaining -= sub_size;
}
target->list_size = sub_size;
target->remaining = sub_size;
target->working = 1;
progress_unlock();
pthread_mutex_lock(&target->mutex);
target->data_ready = 1;
pthread_cond_signal(&target->cond);
pthread_mutex_unlock(&target->mutex);
if (!sub_size) {
pthread_join(target->thread, NULL);
pthread_cond_destroy(&target->cond);
pthread_mutex_destroy(&target->mutex);
active_threads--;
}
}
cleanup_threaded_search();
free(p);
display_progress(progress_state, progress_nr);
stop_progress(&progress_state);
}
static void prepare_pack(int window, int depth)
{
struct object_entry **delta_list;
@ -3307,39 +3542,21 @@ static void prepare_pack(int window, int depth)
if (!to_pack.nr_objects || !window || !depth)
return;
if (path_walk)
ll_find_deltas_by_region(to_pack.objects, to_pack.regions,
0, to_pack.nr_regions);
ALLOC_ARRAY(delta_list, to_pack.nr_objects);
nr_deltas = n = 0;
for (i = 0; i < to_pack.nr_objects; i++) {
struct object_entry *entry = to_pack.objects + i;
if (DELTA(entry))
/* This happens if we decided to reuse existing
* delta from a pack. "reuse_delta &&" is implied.
*/
if (!should_attempt_deltas(entry))
continue;
if (!entry->type_valid ||
oe_size_less_than(&to_pack, entry, 50))
continue;
if (entry->no_try_delta)
continue;
if (!entry->preferred_base) {
if (!entry->preferred_base)
nr_deltas++;
if (oe_type(entry) < 0)
die(_("unable to get type of object %s"),
oid_to_hex(&entry->idx.oid));
} else {
if (oe_type(entry) < 0) {
/*
* This object is not found, but we
* don't have to include it anyway.
*/
continue;
}
}
delta_list[n++] = entry;
}
@ -4272,6 +4489,88 @@ static void mark_bitmap_preferred_tips(void)
}
}
static inline int is_oid_interesting(struct repository *repo,
struct object_id *oid)
{
struct object *o = lookup_object(repo, oid);
return o && !(o->flags & UNINTERESTING);
}
static int add_objects_by_path(const char *path,
struct oid_array *oids,
enum object_type type,
void *data)
{
size_t oe_start = to_pack.nr_objects;
size_t oe_end;
unsigned int *processed = data;
/*
* First, add all objects to the packing data, including the ones
* marked UNINTERESTING (translated to 'exclude') as they can be
* used as delta bases.
*/
for (size_t i = 0; i < oids->nr; i++) {
int exclude;
struct object_info oi = OBJECT_INFO_INIT;
struct object_id *oid = &oids->oid[i];
/* Skip objects that do not exist locally. */
if ((exclude_promisor_objects || arg_missing_action != MA_ERROR) &&
oid_object_info_extended(the_repository, oid, &oi,
OBJECT_INFO_FOR_PREFETCH) < 0)
continue;
exclude = !is_oid_interesting(the_repository, oid);
if (exclude && !thin)
continue;
add_object_entry(oid, type, path, exclude);
}
oe_end = to_pack.nr_objects;
/* We can skip delta calculations if it is a no-op. */
if (oe_end == oe_start || !window)
return 0;
ALLOC_GROW(to_pack.regions,
to_pack.nr_regions + 1,
to_pack.nr_regions_alloc);
to_pack.regions[to_pack.nr_regions].start = oe_start;
to_pack.regions[to_pack.nr_regions].nr = oe_end - oe_start;
to_pack.nr_regions++;
*processed += oids->nr;
display_progress(progress_state, *processed);
return 0;
}
static void get_object_list_path_walk(struct rev_info *revs)
{
struct path_walk_info info = PATH_WALK_INFO_INIT;
unsigned int processed = 0;
info.revs = revs;
info.path_fn = add_objects_by_path;
info.path_fn_data = &processed;
revs->tag_objects = 1;
/*
* Allow the --[no-]sparse option to be interesting here, if only
* for testing purposes. Paths with no interesting objects will not
* contribute to the resulting pack, but only create noisy preferred
* base objects.
*/
info.prune_all_uninteresting = sparse;
if (walk_objects_by_path(&info))
die(_("failed to pack objects via path-walk"));
}
static void get_object_list(struct rev_info *revs, int ac, const char **av)
{
struct setup_revision_opt s_r_opt = {
@ -4318,7 +4617,7 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
warn_on_object_refname_ambiguity = save_warning;
if (use_bitmap_index && !get_object_list_from_bitmap(revs))
if (use_bitmap_index && !path_walk && !get_object_list_from_bitmap(revs))
return;
if (use_delta_islands)
@ -4327,15 +4626,19 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
if (write_bitmap_index)
mark_bitmap_preferred_tips();
if (prepare_revision_walk(revs))
die(_("revision walk setup failed"));
mark_edges_uninteresting(revs, show_edge, sparse);
if (!fn_show_object)
fn_show_object = show_object;
traverse_commit_list(revs,
show_commit, fn_show_object,
NULL);
if (path_walk) {
get_object_list_path_walk(revs);
} else {
if (prepare_revision_walk(revs))
die(_("revision walk setup failed"));
mark_edges_uninteresting(revs, show_edge, sparse);
traverse_commit_list(revs,
show_commit, fn_show_object,
NULL);
}
if (unpack_unreachable_expiration) {
revs->ignore_missing_links = 1;
@ -4545,6 +4848,8 @@ int cmd_pack_objects(int argc,
N_("use the sparse reachability algorithm")),
OPT_BOOL(0, "thin", &thin,
N_("create thin packs")),
OPT_BOOL(0, "path-walk", &path_walk,
N_("use the path-walk API to walk objects when possible")),
OPT_BOOL(0, "shallow", &shallow,
N_("create packs suitable for shallow fetches")),
OPT_BOOL(0, "honor-pack-keep", &ignore_packed_keep_on_disk,
@ -4614,6 +4919,17 @@ int cmd_pack_objects(int argc,
if (pack_to_stdout != !base_name || argc)
usage_with_options(pack_usage, pack_objects_options);
if (path_walk < 0) {
if (use_bitmap_index > 0 ||
!use_internal_rev_list)
path_walk = 0;
else if (the_repository->gitdir &&
the_repository->settings.pack_use_path_walk)
path_walk = 1;
else
path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0);
}
if (depth < 0)
depth = 0;
if (depth >= (1 << OE_DEPTH_BITS)) {
@ -4630,7 +4946,27 @@ int cmd_pack_objects(int argc,
window = 0;
strvec_push(&rp, "pack-objects");
if (thin) {
if (path_walk && filter_options.choice) {
warning(_("cannot use --filter with --path-walk"));
path_walk = 0;
}
if (path_walk && use_delta_islands) {
warning(_("cannot use delta islands with --path-walk"));
path_walk = 0;
}
if (path_walk && shallow) {
warning(_("cannot use --shallow with --path-walk"));
path_walk = 0;
}
if (path_walk) {
strvec_push(&rp, "--boundary");
/*
* We must disable the bitmaps because we are removing
* the --objects / --objects-edge[-aggressive] options.
*/
use_bitmap_index = 0;
} else if (thin) {
use_internal_rev_list = 1;
strvec_push(&rp, shallow
? "--objects-edge-aggressive"

View File

@ -43,7 +43,7 @@ static char *packdir, *packtmp_name, *packtmp;
static const char *const git_repack_usage[] = {
N_("git repack [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]\n"
"[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]\n"
"[--write-midx] [--name-hash-version=<n>]"),
"[--write-midx] [--name-hash-version=<n>] [--path-walk]"),
NULL
};
@ -63,6 +63,7 @@ struct pack_objects_args {
int quiet;
int local;
int name_hash_version;
int path_walk;
struct list_objects_filter_options filter_options;
};
@ -313,6 +314,8 @@ static void prepare_pack_objects(struct child_process *cmd,
strvec_pushf(&cmd->args, "--no-reuse-object");
if (args->name_hash_version)
strvec_pushf(&cmd->args, "--name-hash-version=%d", args->name_hash_version);
if (args->path_walk)
strvec_pushf(&cmd->args, "--path-walk");
if (args->local)
strvec_push(&cmd->args, "--local");
if (args->quiet)
@ -1184,6 +1187,8 @@ int cmd_repack(int argc,
N_("pass --no-reuse-object to git-pack-objects")),
OPT_INTEGER(0, "name-hash-version", &po_args.name_hash_version,
N_("specify the name hash version to use for grouping similar objects by path")),
OPT_BOOL(0, "path-walk", &po_args.path_walk,
N_("(EXPERIMENTAL!) pass --path-walk to git-pack-objects")),
OPT_NEGBIT('n', NULL, &run_update_server_info,
N_("do not run git-update-server-info"), 1),
OPT__QUIET(&po_args.quiet, N_("be quiet")),

View File

@ -26,6 +26,7 @@ linux-TEST-vars)
export GIT_TEST_NO_WRITE_REV_INDEX=1
export GIT_TEST_CHECKOUT_WORKERS=2
export GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1
export GIT_TEST_PACK_PATH_WALK=1
;;
linux-clang)
export GIT_TEST_DEFAULT_HASH=sha1

View File

@ -120,11 +120,23 @@ struct object_entry {
unsigned ext_base:1; /* delta_idx points outside packlist */
};
/**
* A packing region is a section of the packing_data.objects array
* as given by a starting index and a number of elements.
*/
struct packing_region {
uint32_t start;
uint32_t nr;
};
struct packing_data {
struct repository *repo;
struct object_entry *objects;
uint32_t nr_objects, nr_alloc;
struct packing_region *regions;
uint32_t nr_regions, nr_regions_alloc;
int32_t *index;
uint32_t index_size;

View File

@ -54,11 +54,13 @@ void prepare_repo_settings(struct repository *r)
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
r->settings.pack_use_bitmap_boundary_traversal = 1;
r->settings.pack_use_multi_pack_reuse = 1;
r->settings.pack_use_path_walk = 1;
}
if (manyfiles) {
r->settings.index_version = 4;
r->settings.index_skip_hash = 1;
r->settings.core_untracked_cache = UNTRACKED_CACHE_WRITE;
r->settings.pack_use_path_walk = 1;
}
/* Commit graph config or default, does not cascade (simple) */
@ -73,6 +75,7 @@ void prepare_repo_settings(struct repository *r)
/* Boolean config or default, does not cascade (simple) */
repo_cfg_bool(r, "pack.usesparse", &r->settings.pack_use_sparse, 1);
repo_cfg_bool(r, "pack.usepathwalk", &r->settings.pack_use_path_walk, 0);
repo_cfg_bool(r, "core.multipackindex", &r->settings.core_multi_pack_index, 1);
repo_cfg_bool(r, "index.sparse", &r->settings.sparse_index, 0);
repo_cfg_bool(r, "index.skiphash", &r->settings.index_skip_hash, r->settings.index_skip_hash);

View File

@ -56,6 +56,7 @@ struct repo_settings {
enum untracked_cache_setting core_untracked_cache;
int pack_use_sparse;
int pack_use_path_walk;
enum fetch_negotiation_setting fetch_negotiation_algorithm;
int core_multi_pack_index;

View File

@ -212,6 +212,21 @@ static void add_children_by_path(struct repository *r,
free_tree_buffer(tree);
}
void mark_trees_uninteresting_dense(struct repository *r,
struct oidset *trees)
{
struct object_id *oid;
struct oidset_iter iter;
oidset_iter_init(trees, &iter);
while ((oid = oidset_iter_next(&iter))) {
struct tree *tree = lookup_tree(r, oid);
if (tree->object.flags & UNINTERESTING)
mark_tree_contents_uninteresting(r, tree);
}
}
void mark_trees_uninteresting_sparse(struct repository *r,
struct oidset *trees)
{

View File

@ -486,6 +486,7 @@ void put_revision_mark(const struct rev_info *revs,
void mark_parents_uninteresting(struct rev_info *revs, struct commit *commit);
void mark_tree_uninteresting(struct repository *r, struct tree *tree);
void mark_trees_uninteresting_dense(struct repository *r, struct oidset *trees);
void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees);
/**

View File

@ -170,6 +170,7 @@ static int set_recommended_config(int reconfigure)
{ "core.autoCRLF", "false" },
{ "core.safeCRLF", "false" },
{ "fetch.showForcedUpdates", "false" },
{ "pack.usePathWalk", "true" },
{ NULL, NULL },
};
int i;

View File

@ -415,6 +415,10 @@ GIT_TEST_PACK_SPARSE=<boolean> if disabled will default the pack-objects
builtin to use the non-sparse object walk. This can still be overridden by
the --sparse command-line argument.
GIT_TEST_PACK_PATH_WALK=<boolean> if enabled will default the pack-objects
builtin to use the path-walk API for the object walk. This can still be
overridden by the --no-path-walk command-line argument.
GIT_TEST_PRELOAD_INDEX=<boolean> exercises the preload-index code path
by overriding the minimum number of cache entries required per thread.

View File

@ -64,4 +64,29 @@ do
'
done
test_perf 'thin pack with --path-walk' '
git pack-objects --thin --stdout --revs --sparse --path-walk <in-thin >out
'
test_size 'thin pack size with --path-walk' '
test_file_size out
'
test_perf 'big pack with --path-walk' '
git pack-objects --stdout --revs --sparse --path-walk <in-big >out
'
test_size 'big pack size with --path-walk' '
test_file_size out
'
test_perf 'repack with --path-walk' '
git repack -adf --path-walk
'
test_size 'repack size with --path-walk' '
pack=$(ls .git/objects/pack/pack-*.pack) &&
test_file_size "$pack"
'
test_done

View File

@ -59,6 +59,12 @@ test_expect_success 'pack-objects should fetch from promisor remote and execute
test_expect_success 'clone from promisor remote does not lazy-fetch by default' '
rm -f script-executed &&
# The --path-walk feature of "git pack-objects" is not
# compatible with this kind of fetch from an incomplete repo.
GIT_TEST_PACK_PATH_WALK=0 &&
export GIT_TEST_PACK_PATH_WALK &&
test_must_fail git clone evil no-lazy 2>err &&
test_grep "lazy fetching disabled" err &&
test_path_is_missing script-executed

View File

@ -723,4 +723,25 @@ test_expect_success '--name-hash-version=2 and --write-bitmap-index are incompat
! test_grep "currently, --write-bitmap-index requires --name-hash-version=1" err
'
# Basic "repack everything" test
test_expect_success '--path-walk pack everything' '
git -C server rev-parse HEAD >in &&
GIT_PROGRESS_DELAY=0 git -C server pack-objects \
--stdout --revs --path-walk --progress <in >out.pack 2>err &&
grep "Compressing objects by path" err &&
git -C server index-pack --stdin <out.pack
'
# Basic "thin pack" test
test_expect_success '--path-walk thin pack' '
cat >in <<-EOF &&
$(git -C server rev-parse HEAD)
^$(git -C server rev-parse HEAD~2)
EOF
GIT_PROGRESS_DELAY=0 git -C server pack-objects \
--thin --stdout --revs --path-walk --progress <in >out.pack 2>err &&
grep "Compressing objects by path" err &&
git -C server index-pack --fix-thin --stdin <out.pack
'
test_done

View File

@ -59,6 +59,11 @@ test_expect_success 'indirectly clone patch_clone' '
git pull ../.git &&
test $(git rev-parse HEAD) = $B &&
# The --path-walk feature of "git pack-objects" is not
# compatible with this kind of fetch from an incomplete repo.
GIT_TEST_PACK_PATH_WALK=0 &&
export GIT_TEST_PACK_PATH_WALK &&
git pull ../patch_clone/.git &&
test $(git rev-parse HEAD) = $C
)

View File

@ -158,8 +158,9 @@ test_bitmap_cases () {
ls .git/objects/pack/ | grep bitmap >output &&
test_line_count = 1 output &&
# verify equivalent packs are generated with/without using bitmap index
packasha1=$(git pack-objects --no-use-bitmap-index --all packa </dev/null) &&
packbsha1=$(git pack-objects --use-bitmap-index --all packb </dev/null) &&
# Be careful to not use the path-walk option in either case.
packasha1=$(git pack-objects --no-use-bitmap-index --no-path-walk --all packa </dev/null) &&
packbsha1=$(git pack-objects --use-bitmap-index --no-path-walk --all packb </dev/null) &&
list_packed_objects packa-$packasha1.idx >packa.objects &&
list_packed_objects packb-$packbsha1.idx >packb.objects &&
test_cmp packa.objects packb.objects
@ -388,6 +389,14 @@ test_bitmap_cases () {
git init --bare client.git &&
(
cd client.git &&
# This test relies on reusing a delta, but if the
# path-walk machinery is engaged, the base object
# is considered too small to use during the
# dynamic computation, so is not used.
GIT_TEST_PACK_PATH_WALK=0 &&
export GIT_TEST_PACK_PATH_WALK &&
git config transfer.unpackLimit 1 &&
git fetch .. delta-reuse-old:delta-reuse-old &&
git fetch .. delta-reuse-new:delta-reuse-new &&

View File

@ -89,15 +89,18 @@ max_chain() {
# adjusted (or scrapped if the heuristics have become too unreliable)
test_expect_success 'packing produces a long delta' '
# Use --window=0 to make sure we are seeing reused deltas,
# not computing a new long chain.
pack=$(git pack-objects --all --window=0 </dev/null pack) &&
# not computing a new long chain. (Also avoid the --path-walk
# option as it may break delta chains.)
pack=$(git pack-objects --all --window=0 --no-path-walk </dev/null pack) &&
echo 9 >expect &&
max_chain pack-$pack.pack >actual &&
test_cmp expect actual
'
test_expect_success '--depth limits depth' '
pack=$(git pack-objects --all --depth=5 </dev/null pack) &&
# Avoid --path-walk to avoid breaking delta chains across path
# boundaries.
pack=$(git pack-objects --all --depth=5 --no-path-walk </dev/null pack) &&
echo 5 >expect &&
max_chain pack-$pack.pack >actual &&
test_cmp expect actual

View File

@ -7,6 +7,13 @@ test_description='pack-objects multi-pack reuse'
GIT_TEST_MULTI_PACK_INDEX=0
GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
# The --path-walk option does not consider the preferred pack
# at all for reusing deltas, so this variable changes the
# behavior of this test, if enabled.
GIT_TEST_PACK_PATH_WALK=0
export GIT_TEST_PACK_PATH_WALK
objdir=.git/objects
packdir=$objdir/pack

View File

@ -1103,6 +1103,7 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
git clone super4 super5 &&
(cd super5 &&
# This test var can mess with the stderr output checked in this test.
GIT_TEST_PACK_PATH_WALK=0 \
GIT_TEST_NAME_HASH_VERSION=1 \
git submodule update --quiet --init --depth=1 submodule3 >out 2>err &&
test_must_be_empty out &&
@ -1110,6 +1111,8 @@ test_expect_success 'submodule update --quiet passes quietness to fetch with a s
) &&
git clone super4 super6 &&
(cd super6 &&
# This test variable will create a "warning" message to stderr
GIT_TEST_PACK_PATH_WALK=0 \
git submodule update --init --depth=1 submodule3 >out 2>err &&
test_file_not_empty out &&
test_file_not_empty err