backfill: add --batch-size=<n> option

Users may want to specify a minimum batch size to suit their needs. This is
only a minimum: the path-walk API provides a list of OIDs that correspond to
the same path, so it is best to request all of those objects in a single
server request, which lets the server delta-compress them against each other.

We could consider limiting the request to have a maximum batch size in the
future.
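
The "minimum, not maximum" behavior described above can be sketched in C as
follows. This is an illustration only, not the code in builtin/backfill.c:
the struct and the functions sketch_add_path() and sketch_flush() are
hypothetical names, and a "request" is modeled as a counter rather than a
real fetch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the batching state; this is not the
 * struct used by the real builtin.
 */
struct batch_sketch {
	size_t min_batch_size; /* value of --batch-size=<n> */
	size_t pending;        /* objects queued but not yet requested */
	size_t requests;       /* number of requests issued so far */
	size_t last_request;   /* size of the most recent request */
};

/* Issue one request for everything queued so far (here: just count it). */
void sketch_flush(struct batch_sketch *b)
{
	if (!b->pending)
		return;
	b->last_request = b->pending;
	b->requests++;
	b->pending = 0;
}

/*
 * Called once per path with the number of missing objects at that path.
 * The whole set is queued before the threshold is checked, so a request
 * may exceed min_batch_size but never splits the objects of one path.
 */
void sketch_add_path(struct batch_sketch *b, size_t nr_missing)
{
	assert(b->min_batch_size > 0);
	b->pending += nr_missing;
	if (b->pending >= b->min_batch_size)
		sketch_flush(b);
}
```

With a minimum of 20 and three paths holding 12, 15 and 8 missing objects,
this sketch issues a first request of 27 objects (the second path pushes the
batch past the minimum) and, after a final sketch_flush(), a second request
of 8.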

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Derrick Stolee
2024-09-01 12:22:10 -04:00
committed by Johannes Schindelin
parent ebd1692609
commit 6bbc831ec6
3 changed files with 30 additions and 2 deletions


@@ -9,7 +9,7 @@ git-backfill - Download missing objects in a partial clone
 SYNOPSIS
 --------
 [verse]
-'git backfill' [<options>]
+'git backfill' [--batch-size=<n>]
 DESCRIPTION
 -----------
@@ -38,6 +38,14 @@ delta compression in the packfile sent by the server.
 By default, `git backfill` downloads all blobs reachable from the `HEAD`
 commit. This set can be restricted or expanded using various options.
+OPTIONS
+-------
+--batch-size=<n>::
+	Specify a minimum size for a batch of missing objects to request
+	from the server. This size may be exceeded by the last set of
+	blobs seen at a given path. Default batch size is 50,000.
 SEE ALSO
 --------
 linkgit:git-clone[1].


@@ -21,7 +21,7 @@
 #include "path-walk.h"
 static const char * const builtin_backfill_usage[] = {
-	N_("git backfill [<options>]"),
+	N_("git backfill [--batch-size=<n>]"),
 	NULL
 };
@@ -113,6 +113,8 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
 		.batch_size = 50000,
 	};
 	struct option options[] = {
+		OPT_INTEGER(0, "batch-size", &ctx.batch_size,
+			    N_("Minimum number of objects to request at a time")),
 		OPT_END(),
 	};


@@ -59,6 +59,24 @@ test_expect_success 'do partial clone 1, backfill gets all objects' '
 	test_line_count = 0 revs2
 '
+test_expect_success 'do partial clone 2, backfill batch size' '
+	git clone --no-checkout --filter=blob:none \
+		--single-branch --branch=main \
+		"file://$(pwd)/srv.bare" backfill2 &&
+	GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
+		-C backfill2 backfill --batch-size=20 &&
+	# Batches were used
+	test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
+	test_line_count = 2 matches &&
+	test_trace2_data promisor fetch_count 8 <batch-trace &&
+	# No more missing objects!
+	git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
+	test_line_count = 0 revs2
+'
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd