Commit Graph

3109 Commits

Author SHA1 Message Date
Johannes Schindelin
ddd0b3dfd9 credential-cache: handle ECONNREFUSED gracefully (#5329)
I should probably add some tests for this.
2025-03-14 22:30:42 +01:00
Johannes Schindelin
0ac2f22c29 Add experimental 'git survey' builtin (#5174)
This introduces `git survey` to Git for Windows ahead of upstream for
the express purpose of getting the path-based analysis in the hands of
more folks.

The inspiration of this builtin is
[`git-sizer`](https://github.com/github/git-sizer), but since that
command relies on `git cat-file --batch` to get the contents of objects,
it has limits to how much information it can provide.

This is mostly a rewrite of the `git survey` builtin that was introduced
into the `microsoft/git` fork in microsoft/git#667. That version had a
lot more bells and whistles, including an analysis much closer to what
`git-sizer` provides.

The biggest difference in this version is that this one is focused on
using the path-walk API in order to visit batches of objects based on a
common path. This allows identifying, for instance, the path that is
contributing the most to the on-disk size across all versions at that
path.

For example, here are the top ten paths contributing to my local Git
repository (which includes `microsoft/git` and `gitster/git`):

```
TOP FILES BY DISK SIZE
============================================================================
                                    Path | Count | Disk Size | Inflated Size
-----------------------------------------+-------+-----------+--------------
                       whats-cooking.txt |  1373 |  11637459 |      37226854
             t/helper/test-gvfs-protocol |     2 |   6847105 |      17233072
                      git-rebase--helper |     1 |   6027849 |      15269664
                          compat/mingw.c |  6111 |   5194453 |     463466970
             t/helper/test-parse-options |     1 |   3420385 |       8807968
                  t/helper/test-pkt-line |     1 |   3408661 |       8778960
      t/helper/test-dump-untracked-cache |     1 |   3408645 |       8780816
            t/helper/test-dump-fsmonitor |     1 |   3406639 |       8776656
                                po/vi.po |   104 |   1376337 |      51441603
                                po/de.po |   210 |   1360112 |      71198603
```

This kind of analysis has been helpful in identifying the reasons for
growth in a few internal monorepos. Those findings motivated the changes
in #5157 and #5171.

With this early version in Git for Windows, we can expand the reach of
the experimental tool in advance of it being contributed to the upstream
project.

Unfortunately, this will mean that in the next `microsoft/git` rebase,
Jeff Hostetler's version will need to be pulled out since there are
enough conflicts. These conflicts include how tables are stored and
generated, as the version in this PR is slightly more general to allow
for different kinds of data.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:41 +01:00
Johannes Schindelin
568dcdbea3 Lazy load libcurl, allowing for an SSL/TLS backend-specific libcurl (#4410)
As per
https://github.com/git-for-windows/git/issues/4350#issuecomment-1485041503,
the major block for upgrading Git for Windows' OpenSSL from v1.1 to v3
is the tricky part where such an upgrade would break `git fetch`/`git
clone` and `git push` because the libcurl depends on the OpenSSL DLL,
and the major version bump will _change_ the file name of said DLL.

To overcome that, the plan is to build libcurl flavors for each
supported SSL/TLS backend, aligning with the way MSYS2 builds libcurl,
then switch Git for Windows' SDK to the Secure Channel-flavored libcurl,
and teach Git to look for the specific flavor of libcurl corresponding
to the `http.sslBackend` setting (if that was configured).

Here is the PR to teach Git that trick.
2025-03-14 22:30:39 +01:00
Johannes Schindelin
99877ba018 Merge pull request #2974 from derrickstolee/maintenance-and-headless
Include Windows-specific maintenance and headless-git
2025-03-14 22:30:33 +01:00
Matthias Aßhauer
6791d352ff compat/mingw: handle WSA errors in strerror
We map WSAGetLastError() errors to errno errors in winsock_error_to_errno(),
but the MSVC strerror() implementation only produces "Unknown error" for
most of them. Produce some more meaningful error messages in these
cases.

Our builds for ARM64 link against the newer UCRT strerror() that does know
these errors, so we won't change the strerror() used there.

The wording of the messages is copied from glibc strerror() messages.

Reported-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:29 +01:00
Jeff Hostetler
60b7fa02de survey: stub in new experimental 'git-survey' command
Start work on a new 'git survey' command to scan the repository
for monorepo performance and scaling problems.  The goal is to
measure the various known "dimensions of scale" and serve as a
foundation for adding additional measurements as we learn more
about Git monorepo scaling problems.

The initial goal is to complement the scanning and analysis performed
by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool.
It is hoped that by creating a builtin command, we may be able to take
advantage of internal Git data structures and code that is not
accessible from GO to gain further insight into potential scaling
problems.

Co-authored-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
2025-03-14 22:30:27 +01:00
Johannes Schindelin
14f7f3324a http: support lazy-loading libcurl also on Windows
This implements the Windows-specific support code, because everything is
slightly different on Windows, even loading shared libraries.

Note: I specifically do _not_ use the code from
`compat/win32/lazyload.h` here because that code is optimized for
loading individual functions from various system DLLs, while we
specifically want to load _many_ functions from _one_ DLL here, and
distinctly not a system DLL (we expect libcurl to be located outside
`C:\Windows\system32`, something `INIT_PROC_ADDR` refuses to work with).
Also, the `curl_easy_getinfo()`/`curl_easy_setopt()` functions are
declared as vararg functions, which `lazyload.h` cannot handle. Finally,
we are about to optionally override the exact file name that is to be
loaded, which is a goal contrary to `lazyload.h`'s design.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:25 +01:00
Johannes Schindelin
905c9fdf9e http: optionally load libcurl lazily
This compile-time option allows to ask Git to load libcurl dynamically
at runtime.

Together with a follow-up patch that optionally overrides the file name
depending on the `http.sslBackend` setting, this kicks open the door for
installing multiple libcurl flavors side by side, and load the one
corresponding to the (runtime-)configured SSL/TLS backend.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:25 +01:00
Jeff Hostetler
5df9a4955d Makefile: clean up .ilk files when MSVC=1
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
2025-03-14 22:30:20 +01:00
Johannes Schindelin
0cd275e959 mimalloc: offer a build-time option to enable it
By defining `USE_MIMALLOC`, Git can now be compiled with that
nicely-fast and small allocator.

Note that we have to disable a couple `DEVELOPER` options to build
mimalloc's source code, as it makes heavy use of declarations after
statements, among other things that disagree with Git's conventions.

We even have to silence some GCC warnings in non-DEVELOPER mode. For
example, the `-Wno-array-bounds` flag is needed because in `-O2` builds,
trying to call `NtCurrentTeb()` (which `_mi_thread_id()` does on
Windows) causes the bogus warning about a system header, likely related
to https://sourceforge.net/p/mingw-w64/mailman/message/37674519/ and to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578:

C:/git-sdk-64-minimal/mingw64/include/psdk_inc/intrin-impl.h:838:1:
        error: array subscript 0 is outside array bounds of 'long long unsigned int[0]' [-Werror=array-bounds]
  838 | __buildreadseg(__readgsqword, unsigned __int64, "gs", "q")
      | ^~~~~~~~~~~~~~

Also: The `mimalloc` library uses C11-style atomics, therefore we must
require that standard when compiling with GCC if we want to use
`mimalloc` (instead of requiring "only" C99). This is what we do in the
CMake definition already, therefore this commit does not need to touch
`contrib/buildsystems/`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:18 +01:00
Johannes Schindelin
b30d3b28d5 Import the source code of mimalloc v2.1.2
This commit imports mimalloc's source code as per v2.1.2, fetched from
the tag at https://github.com/microsoft/mimalloc.

The .c files are from the src/ subdirectory, and the .h files from the
include/ and include/mimalloc/ subdirectories. We will subsequently
modify the source code to accommodate building within Git's context.

Since we plan on using the `mi_*()` family of functions, we skip the
C++-specific source code, some POSIX compliant functions to interact
with mimalloc, and the code that wants to support auto-magic overriding
of the `malloc()` function (mimalloc-new-delete.h, alloc-posix.c,
mimalloc-override.h, alloc-override.c, alloc-override-osx.c,
alloc-override-win.c and static.c).

To appease the `check-whitespace` job of Git's Continuous Integration,
this commit was washed one time via `git rebase --whitespace=fix`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2025-03-14 22:30:17 +01:00
Junio C Hamano
62c58891e1 Merge branch 'tz/doc-txt-to-adoc-fixes'
Fallouts from recent renaming of documentation files from .txt
suffix to the new .adoc suffix have been corrected.

* tz/doc-txt-to-adoc-fixes: (38 commits)
  xdiff: *.txt -> *.adoc fixes
  unpack-trees.c: *.txt -> *.adoc fixes
  transport.h: *.txt -> *.adoc fixes
  trace2/tr2_sysenv.c: *.txt -> *.adoc fixes
  trace2.h: *.txt -> *.adoc fixes
  t6434: *.txt -> *.adoc fixes
  t6012: *.txt -> *.adoc fixes
  t/helper/test-rot13-filter.c: *.txt -> *.adoc fixes
  simple-ipc.h: *.txt -> *.adoc fixes
  setup.c: *.txt -> *.adoc fixes
  refs.h: *.txt -> *.adoc fixes
  pseudo-merge.h: *.txt -> *.adoc fixes
  parse-options.h: *.txt -> *.adoc fixes
  object-name.c: *.txt -> *.adoc fixes
  list-objects-filter-options.h: *.txt -> *.adoc fixes
  fsck.h: *.txt -> *.adoc fixes
  diffcore.h: *.txt -> *.adoc fixes
  diff.h: *.txt -> *.adoc fixes
  contrib/long-running-filter: *.txt -> *.adoc fixes
  config.c: *.txt -> *.adoc fixes
  ...
2025-03-06 14:06:31 -08:00
Junio C Hamano
6024f321d4 Merge branch 'sk/unit-test-oid'
Convert a few unit tests to the clar framework.

* sk/unit-test-oid:
  t/unit-tests: convert oidtree test to use clar test framework
  t/unit-tests: convert oidmap test to use clar test framework
  t/unit-tests: convert oid-array test to use clar test framework
  t/unit-tests: implement clar specific oid helper functions
2025-03-05 10:37:43 -08:00
Todd Zullinger
d40da0bd4b Makefile: update reference to technical/racy-git.adoc
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03 13:49:19 -08:00
Junio C Hamano
ca39da6997 Merge branch 'ps/meson-contrib-bits'
Update meson-based build procedure to cover contrib/ and other
places as well.

* ps/meson-contrib-bits:
  ci: exercise credential helpers
  ci: fix propagating UTF-8 test locale in musl-based Meson job
  meson: wire up static analysis via Coccinelle
  meson: wire up git-contacts(1)
  meson: wire up credential helpers
  contrib/credential: fix compilation of "osxkeychain" helper
  contrib/credential: fix compiling "libsecret" helper
  contrib/credential: fix compilation of wincred helper with MSVC
  contrib/credential: fix "netrc" tests with out-of-tree builds
  GIT-BUILD-OPTIONS: propagate project's source directory
2025-03-03 08:53:03 -08:00
Junio C Hamano
2a1530a953 Merge branch 'ps/meson-contrib-bits' into tz/doc-txt-to-adoc-fixes
* ps/meson-contrib-bits:
  ci: exercise credential helpers
  ci: fix propagating UTF-8 test locale in musl-based Meson job
  meson: wire up static analysis via Coccinelle
  meson: wire up git-contacts(1)
  meson: wire up credential helpers
  contrib/credential: fix compilation of "osxkeychain" helper
  contrib/credential: fix compiling "libsecret" helper
  contrib/credential: fix compilation of wincred helper with MSVC
  contrib/credential: fix "netrc" tests with out-of-tree builds
  GIT-BUILD-OPTIONS: propagate project's source directory
2025-03-01 10:00:45 -08:00
Junio C Hamano
5ce6e0e242 Merge branch 'tb/new-make-fix'
Workaround the overly picky HT/SP rule in newer GNU Make.

* tb/new-make-fix:
  Makefile: remove accidental recipe prefix in conditional
2025-02-25 14:19:35 -08:00
Seyi Kuforiji
149585079f t/unit-tests: convert oidtree test to use clar test framework
Adapt oidtree test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidtree__initialize` handles the to set up of the global
oidtree variable and `test_oidtree__cleanup` frees the oidtree when all
tests are completed.

With this change, `check_each` stops at the first error encountered,
making it easier to address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:23 -08:00
Seyi Kuforiji
69bc044def t/unit-tests: convert oidmap test to use clar test framework
Adapt oidmap test script to clar framework by using clar assertions
where necessary. `cl_parse_any_oid()` ensures the hash algorithm is set
before parsing. This prevents issues from an uninitialized or invalid
hash algorithm.

Introduce 'test_oidmap__initialize` handles the to set up of the global
oidmap map with predefined key-value pairs, and `test_oidmap__cleanup`
frees the oidmap and its entries when all tests are completed.

The test loops through all entries to detect multiple errors. With this
change, it stops at the first error encountered, making it easier to
address it.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Seyi Kuforiji
869a1edf44 t/unit-tests: convert oid-array test to use clar test framework
Adapt oid-array test script to clar framework by using clar assertions
where necessary. Remove descriptions from macros to reduce
redundancy, and move test input arrays to global scope for reuse across
multiple test functions. Introduce `test_oid_array__initialize()` to
explicitly initialize the hash algorithm.

These changes streamline the test suite, making individual tests
self-contained and reducing redundant code.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Seyi Kuforiji
a16a2ee312 t/unit-tests: implement clar specific oid helper functions
`get_oid_arbitrary_hex()` and `init_hash_algo()` are both required for
oid-related tests to run without errors. In the current implementation,
both functions are defined and declared in the
`t/unit-tests/lib-oid.{c,h}` which is utilized by oid-related tests in
the homegrown unit tests structure.

Adapt functions in lib-oid.{c,h} to use clar. Both these functions
become available for oid-related test files implemented using the clar
testing framework, which requires them. This will be used by subsequent
commits.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25 10:31:22 -08:00
Junio C Hamano
e565f37553 Merge branch 'ds/backfill'
Lazy-loading missing files in a blobless clone on demand is costly
as it tends to be one-blob-at-a-time.  "git backfill" is introduced
to help bulk-download necessary files beforehand.

* ds/backfill:
  backfill: assume --sparse when sparse-checkout is enabled
  backfill: add --sparse option
  backfill: add --min-batch-size=<n> option
  backfill: basic functionality and tests
  backfill: add builtin boilerplate
2025-02-18 15:30:31 -08:00
Patrick Steinhardt
c5823641a6 GIT-BUILD-OPTIONS: propagate project's source directory
A couple of our tests require knowledge around where to find the
project's source directory in order to locate files required for the
test itself. Until now we have been wiring these up ad-hoc via new,
specialized variables catered to the specific usecase. This is quite
awkward though, as every test that potentially needs to locate paths
relative to the source directory needs to grow another variable.

Introduce a new "GIT_SOURCE_DIR" variable into GIT-BUILD-OPTIONS to stop
this proliferation. Remove existing variables that can be derived from
it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-18 11:40:02 -08:00
Junio C Hamano
0cc13007e5 Merge branch 'bc/doc-adoc-not-txt'
All the documentation .txt files have been renamed to .adoc to help
content aware editors.

* bc/doc-adoc-not-txt:
  Remove obsolete ".txt" extensions for AsciiDoc files
  doc: use .adoc extension for AsciiDoc files
  gitattributes: mark AsciiDoc files as LF-only
  editorconfig: add .adoc extension
  doc: update gitignore for .adoc extension
2025-02-14 17:53:47 -08:00
Taylor Blau
f23179924b Makefile: remove accidental recipe prefix in conditional
Back in 728b9ac0c3 (Makefile(s): avoid recipe prefix in conditional
statements, 2024-04-08), we prepared our Makefiles for a forthcoming
change in upstream Make that would ban the recipe prefix within a
conditional statement by replacing tabs (the prefix) with eight spaces.

In b9d6f64393 (compat/zlib: allow use of zlib-ng as backend,
2025-01-28), a handful of recipe prefix characters were introduced in a
conditional statement ('ifdef ZLIB_NG'), causing 'make' to fail on my
system, which uses GNU Make 4.4.90.

Remove the recipe prefix characters by replacing them with the same
script as is mentioned in 728b9ac0c3.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-13 13:16:34 -08:00
Junio C Hamano
a4af0b6288 Merge branch 'js/libgit-rust'
Foreign language interface for Rust into our code base has been added.

* js/libgit-rust:
  libgit: add higher-level libgit crate
  libgit-sys: also export some config_set functions
  libgit-sys: introduce Rust wrapper for libgit.a
  common-main: split init and exit code into new files
2025-02-12 10:08:53 -08:00
Junio C Hamano
aae91a86fb Merge branch 'ds/name-hash-tweaks'
"git pack-objects" and its wrapper "git repack" learned an option
to use an alternative path-hash function to improve delta-base
selection to produce a packfile with deeper history than window
size.

* ds/name-hash-tweaks:
  pack-objects: prevent name hash version change
  test-tool: add helper for name-hash values
  p5313: add size comparison test
  pack-objects: add GIT_TEST_NAME_HASH_VERSION
  repack: add --name-hash-version option
  pack-objects: add --name-hash-version option
  pack-objects: create new name-hash function version
2025-02-12 10:08:51 -08:00
Junio C Hamano
6f0b72205d Merge branch 'sk/unit-tests-0130'
Convert a handful of unit tests to work with the clar framework.

* sk/unit-tests-0130:
  t/unit-tests: convert strcmp-offset test to use clar test framework
  t/unit-tests: convert strbuf test to use clar test framework
  t/unit-tests: adapt example decorate test to use clar test framework
  t/unit-tests: convert hashmap test to use clar test framework
2025-02-10 10:18:31 -08:00
Junio C Hamano
9d0e81e2ae Merge branch 'ps/zlib-ng'
The code paths to interact with zlib has been cleaned up in
preparation for building with zlib-ng.

* ps/zlib-ng:
  ci: make "linux-musl" job use zlib-ng
  ci: switch linux-musl to use Meson
  compat/zlib: allow use of zlib-ng as backend
  git-zlib: cast away potential constness of `next_in` pointer
  compat/zlib: provide stubs for `deflateSetHeader()`
  compat/zlib: provide `deflateBound()` shim centrally
  git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
  compat: introduce new "zlib.h" header
  git-compat-util: drop `z_const` define
  compat: drop `uncompress2()` compatibility shim
2025-02-06 14:56:45 -08:00
Derrick Stolee
a3f79e9abd backfill: add builtin boilerplate
In anticipation of implementing 'git backfill', populate the necessary files
with the boilerplate of a new builtin. Mark the builtin as experimental at
this time, allowing breaking changes in the near future, if necessary.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-03 16:12:41 -08:00
Junio C Hamano
803b5acaa7 Merge branch 'ps/3.0-remote-deprecation'
Following the procedure we established to introduce breaking
changes for Git 3.0, allow an early opt-in for removing support of
$GIT_DIR/branches/ and $GIT_DIR/remotes/ directories to configure
remotes.

* ps/3.0-remote-deprecation:
  remote: announce removal of "branches/" and "remotes/"
  builtin/pack-redundant: remove subcommand with breaking changes
  ci: repurpose "linux-gcc" job for deprecations
  ci: merge linux-gcc-default into linux-gcc
  Makefile: wire up build option for deprecated features
2025-02-03 10:23:33 -08:00
Seyi Kuforiji
af8bf677c1 t/unit-tests: convert strcmp-offset test to use clar test framework
Adapt strcmp-offset test script to clar framework by using clar
assertions where necessary. Introduce `test_strcmp_offset__empty()` to
verify `check_strcmp_offset()` behavior when both input strings are
empty. This ensures the function correctly handles edge cases and
returns expected values.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
Seyi Kuforiji
4b995465b2 t/unit-tests: convert strbuf test to use clar test framework
Adapt strbuf test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:45 -08:00
Seyi Kuforiji
e0f807bdad t/unit-tests: adapt example decorate test to use clar test framework
Introduce `test_example_decorate__initialize()` to explicitly set up
object IDs and retrieve corresponding objects before tests run. This
ensures a consistent and predictable test state without relying on data
from previous tests.

Add `test_example_decorate__cleanup()` to clear decorations after each
test, preventing interference between tests and ensuring each runs in
isolation.

Adapt example decorate test script to clar framework by using clar
assertions where necessary. Previously, tests relied on data written by
earlier tests, leading to unintended dependencies between them. This
explicitly initializes the necessary state within
`test_example_decorate__readd`, ensuring it does not depend on prior
test executions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
Seyi Kuforiji
38b066ee76 t/unit-tests: convert hashmap test to use clar test framework
Adapts hashmap test script to clar framework by using clar assertions
where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31 14:58:44 -08:00
Calvin Wan
65c10aa8d5 libgit: add higher-level libgit crate
The C functions exported by libgit-sys do not provide an idiomatic Rust
interface. To make it easier to use these functions via Rust, add a
higher-level "libgit" crate, that wraps the lower-level configset API
with an interface that is more Rust-y.

This combination of $X and $X-sys crates is a common pattern for FFI in
Rust, as documented in "The Cargo Book" [1].

[1] https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages

Co-authored-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-29 15:06:50 -08:00
Junio C Hamano
f046ab2dd4 Merge branch 'ds/path-walk-1'
Introduce a new API to visit objects in batches based on a common
path, or by type.

* ds/path-walk-1:
  path-walk: drop redundant parse_tree() call
  path-walk: reorder object visits
  path-walk: mark trees and blobs as UNINTERESTING
  path-walk: visit tags and cached objects
  path-walk: allow consumer to specify object types
  t6601: add helper for testing path-walk API
  test-lib-functions: add test_cmp_sorted
  path-walk: introduce an object walk by path
2025-01-29 14:05:09 -08:00
Josh Steadmon
e7f8bf125c libgit-sys: introduce Rust wrapper for libgit.a
Introduce libgit-sys, a Rust wrapper crate that allows Rust code to call
functions in libgit.a. This initial patch defines build rules and an
interface that exposes user agent string getter functions as a proof of
concept. This library can be tested with `cargo test`. In later commits,
a higher-level library containing a more Rust-friendly interface will be
added at `contrib/libgit-rs`.

Symbols in libgit can collide with symbols from other libraries such as
libgit2. We avoid this by first exposing library symbols in
public_symbol_export.[ch]. These symbols are prepended with "libgit_" to
avoid collisions and set to visible using a visibility pragma. In
build.rs, Rust builds contrib/libgit-rs/libgit-sys/libgitpub.a, which also
contains libgit.a and other dependent libraries, with
-fvisibility=hidden to hide all symbols within those libraries that
haven't been exposed with a visibility pragma.

Co-authored-by: Kyle Lippincott <spectral@google.com>
Co-authored-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 14:45:47 -08:00
Josh Steadmon
3f8f2abe05 common-main: split init and exit code into new files
Currently, object files in libgit.a reference common_exit(), which is
contained in common-main.o. However, common-main.o also includes main(),
which references cmd_main() in git.o, which in turn depends on all the
builtin/*.o objects.

We would like to allow external users to link libgit.a without needing
to include so many extra objects. Enable this by splitting common_exit()
and check_bug_if_BUG() into a new file common-exit.c, and add
common-exit.o to LIB_OBJS so that these are included in libgit.a.

This split has previously been proposed ([1], [2]) to support fuzz tests
and unit tests by avoiding conflicting definitions for main(). However,
both of those issues were resolved by other methods of avoiding symbol
conflicts. Now we are trying to make libgit.a more self-contained, so
hopefully we can revisit this approach.

Additionally, move the initialization code out of main() into a new
init_git() function in its own file. Include this in libgit.a as well,
so that external users can share our setup code without calling our
main().

[1] https://lore.kernel.org/git/Yp+wjCPhqieTku3X@google.com/
[2] https://lore.kernel.org/git/20230517-unit-tests-v2-v2-1-21b5b60f4b32@google.com/

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 14:39:16 -08:00
Patrick Steinhardt
b9d6f64393 compat/zlib: allow use of zlib-ng as backend
The zlib-ng library is a hard fork of the old and venerable zlib
library. It describes itself as zlib replacement with optimizations for
"next generation" systems. As such, it contains several implementations
of central algorithms using for example SSE2, AVX2 and other vectorized
CPU intrinsics that supposedly speed up in- and deflating data.

And indeed, compiling Git against zlib-ng leads to a significant speedup
when reading objects. The following benchmark uses git-cat-file(1) with
`--batch --batch-all-objects` in the Git repository:

    Benchmark 1: zlib
      Time (mean ± σ):     52.085 s ±  0.141 s    [User: 51.500 s, System: 0.456 s]
      Range (min … max):   52.004 s … 52.335 s    5 runs

    Benchmark 2: zlib-ng
      Time (mean ± σ):     40.324 s ±  0.134 s    [User: 39.731 s, System: 0.490 s]
      Range (min … max):   40.135 s … 40.484 s    5 runs

    Summary
      zlib-ng ran
        1.29 ± 0.01 times faster than zlib

So we're looking at a ~25% speedup compared to zlib. This is of course
an extreme example, as it makes us read through all objects in the
repository. But regardless, it should be possible to see some sort of
speedup in most commands that end up accessing the object database.

The zlib-ng library provides a compatibility layer that makes it a
proper drop-in replacement for zlib: nothing needs to change in the
build system to support it. Unfortunately though, this mode isn't easy
to use on most systems because distributions do not allow you to install
zlib-ng in that way, as that would mean that the zlib library would be
globally replaced. Instead, many distributions provide a package that
installs zlib-ng without the compatibility layer. This version does
provide effectively the same APIs like zlib does, but all of the symbols
are prefixed with `zng_` to avoid symbol collisions.

Implement a new build option that allows us to link against zlib-ng
directly. If set, we redefine zlib symbols so that we use the `zng_`
prefixed versions thereof provided by that library. Like this, it
becomes possible to install both zlib and zlib-ng (without the compat
layer) and then pick whichever library one wants to link against for
Git.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:23 -08:00
Patrick Steinhardt
3656d57bbf compat: drop uncompress2() compatibility shim
Our compat library has an implementation of zlib's `uncompress2()`
function that gets used when linking against an old version of zlib
that doesn't yet have it. The last user of `uncompress2()` got removed
in 15a60b747e (reftable/block: open-code call to `uncompress2()`,
2024-04-08), so the compatibility code is not required anymore. Drop it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28 13:03:22 -08:00
Junio C Hamano
8d335468ec Merge branch 'sk/unit-tests'
Move a few more unit tests to the clar test framework.

* sk/unit-tests:
  t/unit-tests: convert reftable tree test to use clar test framework
  t/unit-tests: adapt priority queue test to use clar test framework
  t/unit-tests: convert mem-pool test to use clar test framework
  t/unit-tests: handle dashes in test suite filenames
2025-01-28 13:02:22 -08:00
Derrick Stolee
7f9870794f test-tool: add helper for name-hash values
Add a new test-tool helper, name-hash, to output the value of the
name-hash algorithms for the input list of strings, one per line.

Since the name-hash values can be stored in the .bitmap files, it is
important that these hash functions do not change across Git versions.
Add a simple test to t5310-pack-bitmaps.sh to provide some testing of
the current values. Due to how these functions are implemented, it would
be difficult to change them without disturbing these values. The paths
used for this test are carefully selected to demonstrate some of the
behavior differences of the two current name hash versions, including
which conditions will cause them to collide.

Create a performance test that uses test_size to demonstrate how
collisions occur for these hash algorithms. This test helps inform
someone as to the behavior of the name-hash algorithms for their repo
based on the paths at HEAD.

My copy of the Git repository shows modest statistics around the
collisions of the default name-hash algorithm:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                         4.5K
5314.2: distinct hash value: v1               4.1K
5314.3: maximum multiplicity: v1                13
5314.4: distinct hash value: v2               4.2K
5314.5: maximum multiplicity: v2                 9

Here, the maximum collision multiplicity is 13, but around 10% of paths
have a collision with another path.

In a more interesting example, the microsoft/fluentui [1] repo had these
statistics at time of committing:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                        19.5K
5314.2: distinct hash value: v1               8.2K
5314.3: maximum multiplicity: v1               279
5314.4: distinct hash value: v2              17.8K
5314.5: maximum multiplicity: v2                44

[1] https://github.com/microsoft/fluentui

That demonstrates that of the nearly twenty thousand path names, they
are assigned around eight thousand distinct values. 279 paths are
assigned to a single value, leading the packing algorithm to sort
objects from those paths together, by size.

With the v2 name hash function, the maximum multiplicity lowers to 44,
leaving some room for further improvement.

In a more extreme example, an internal monorepo had a much worse
collision rate:

Test                               this tree
--------------------------------------------------
5314.1: paths at head                       227.3K
5314.2: distinct hash value: v1              72.3K
5314.3: maximum multiplicity: v1             14.4K
5314.4: distinct hash value: v2             166.5K
5314.5: maximum multiplicity: v2               138

Here, we can see that the v2 name hash function provides somem
improvements, but there are still a number of collisions that could lead
to repacking problems at this scale.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-27 13:21:43 -08:00
Patrick Steinhardt
68f51871df builtin/pack-redundant: remove subcommand with breaking changes
The git-pack-redundant(1) subcommand has been castrated to require
the "--i-still-use-this" option to do anything since 4406522b
(pack-redundant: escalate deprecation warning to an error,
2023-03-23), which appeared in Git 2.41 and was announced for
removal with 53a92c9552 (Documentation/BreakingChanges: announce
removal of git-pack-redundant(1), 2024-09-02). Stop compiling the
subcommand in case the `WITH_BREAKING_CHANGES` build flag is set.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:34:05 -08:00
Patrick Steinhardt
c5bc9a7f94 Makefile: wire up build option for deprecated features
With 57ec9254eb (docs: introduce document to announce breaking changes,
2024-06-14), we have introduced a new document that tracks upcoming
breaking changes in the Git project. In 2454970930 (BreakingChanges:
early adopter option, 2024-10-11) we have amended the document a bit to
mention that any introduced breaking changes must be accompanied by
logic that allows us to enable the breaking change at compile-time.
While we already have two breaking changes lined up, neither of them has
such a switch because they predate those instructions.

Introduce the proposed `WITH_BREAKING_CHANGES` preprocessor macro and
wire it up with both our Makefiles and Meson. This does not yet wire up
the build flag for existing deprecations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-22 12:28:27 -08:00
brian m. carlson
1f010d6bdf doc: use .adoc extension for AsciiDoc files
We presently use the ".txt" extension for our AsciiDoc files.  While not
wrong, most editors do not associate this extension with AsciiDoc,
meaning that contributors don't get automatic editor functionality that
could be useful, such as syntax highlighting and prose linting.

It is much more common to use the ".adoc" extension for AsciiDoc files,
since this helps editors automatically detect files and also allows
various forges to provide rich (HTML-like) rendering.  Let's do that
here, renaming all of the files and updating the includes where
relevant.  Adjust the various build scripts and makefiles to use the new
extension as well.

Note that this should not result in any user-visible changes to the
documentation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21 12:56:06 -08:00
Seyi Kuforiji
ffbd3f98f9 t/unit-tests: convert reftable tree test to use clar test framework
Adapts reftable tree test script to clar framework by using clar
assertions where necessary.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:12 -08:00
Seyi Kuforiji
8b702f93dd t/unit-tests: adapt priority queue test to use clar test framework
Convert the prio-queue test script to clar framework by using clar
assertions where necessary. Test functions are created as a standalone
to test different cases.

update the type of the variable `j` from int to `size_t`, this ensures
compatibility with the type used for result_size, which is also size_t,
preventing a potential warning or error caused by comparisons between
signed and unsigned integers.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:12 -08:00
Seyi Kuforiji
c143dfa7ed t/unit-tests: convert mem-pool test to use clar test framework
Adapt the mem-pool test script to use clar framework by using clar
assertions where necessary.Test functions are created as a standalone to
test different test cases.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-17 14:35:11 -08:00
Seyi Kuforiji
43850dcf9c t/unit-tests: convert hash to use clar test framework
Adapt the hash test functions to clar framework by using clar
assertions where necessary. Following the consensus to convert
the unit-tests scripts found in the t/unit-tests folder to clar driven by
Patrick Steinhardt. Test functions are structured as a standalone to
test individual hash string and literal case.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 07:55:00 -08:00