Patrick Steinhardt f23ac77a43 commit: avoid parsing non-commits in lookup_commit_reference_gently()
The function `lookup_commit_reference_gently()` can be used to look up a
committish by object ID. As such, the function knows to peel for example
tag objects so that we eventually end up with the commit.

The function is used quite a lot throughout our tree. One such user is
"shallow.c" via `assign_shallow_commits_to_refs()`. The intent of this
function is to figure out whether a shallow push is missing any objects
that are required to satisfy the ref updates, and if so, which of the
ref updates is missing objects.

This is done by painting the tree with `UNINTERESTING`. We start
painting by calling `refs_for_each_ref()` so that we can mark all
existing referenced objects as the boundary of objects that we already
have, and which are supposed to be fully connected. The reference tips
are then parsed via `lookup_commit_reference_gently()`, and the commit
is then marked as uninteresting.

But references may not necessarily point to a committish, and if a lot
of them aren't then this step takes a lot of time. This is mostly due to
the way that `lookup_commit_reference_gently()` is implemented: before
we learn about the type of the object we already call `parse_object()`
on the object ID. This has two consequences:

  - We parse all objects, including trees and blobs, even though we
    don't even need the contents of them.

  - More importantly though, `parse_object()` will cause us to check
    whether the object ID matches its contents.

Combined this means that we deflate and hash every non-committish
object, and that of course ends up being both CPU- and memory-intensive.

Improve the logic so that we first use `peel_object()`. This function
won't parse the object for us, and thus it allows us to learn about the
object's type before we parse and return it.

The following benchmark pushes a single object from a shallow clone into
a repository that has 100,000 refs. These refs were created by listing
all objects via `git rev-list(1) --objects --all` and creating refs for
a subset of them, so lots of those refs will cover non-commit objects.

  Benchmark 1: git-receive-pack (rev = HEAD~)
    Time (mean ± σ):     62.571 s ±  0.413 s    [User: 58.331 s, System: 4.053 s]
    Range (min … max):   62.191 s … 63.010 s    3 runs

  Benchmark 2: git-receive-pack (rev = HEAD)
    Time (mean ± σ):     38.339 s ±  0.192 s    [User: 36.220 s, System: 1.992 s]
    Range (min … max):   38.176 s … 38.551 s    3 runs

  Summary
    git-receive-pack . </tmp/input (rev = HEAD) ran
      1.63 ± 0.01 times faster than git-receive-pack . </tmp/input (rev = HEAD~)

This leads to a sizeable speedup as we now skip reading and parsing
non-commit objects. Before this change we spent around 40% of the time
in `assign_shallow_commits_to_refs()`, after the change we only spend
around 1.2% of the time in there. Almost the entire remainder of the
time is spent in git-rev-list(1) to perform the connectivity checks.

Despite the speedup though, this also leads to a massive reduction in
allocations. Before:

  HEAP SUMMARY:
      in use at exit: 352,480,441 bytes in 97,185 blocks
    total heap usage: 2,793,820 allocs, 2,696,635 frees, 67,271,456,983 bytes allocated

And after:

  HEAP SUMMARY:
      in use at exit: 17,524,978 bytes in 22,393 blocks
    total heap usage: 33,313 allocs, 10,920 frees, 407,774,251 bytes allocated

Note that when all references refer to commits performance stays roughly
the same, as expected. The following benchmark was executed with 600k
commits:

  Benchmark 1: git-receive-pack (rev = HEAD~)
    Time (mean ± σ):      9.101 s ±  0.006 s    [User: 8.800 s, System: 0.520 s]
    Range (min … max):    9.095 s …  9.106 s    3 runs

  Benchmark 2: git-receive-pack (rev = HEAD)
    Time (mean ± σ):      9.128 s ±  0.094 s    [User: 8.820 s, System: 0.522 s]
    Range (min … max):    9.019 s …  9.188 s    3 runs

  Summary
    git-receive-pack (rev = HEAD~) ran
      1.00 ± 0.01 times faster than git-receive-pack (rev = HEAD)

This will be improved in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-19 09:34:16 -08:00
2026-01-23 13:34:36 -08:00
2026-01-23 13:34:37 -08:00
2025-12-19 17:57:26 +09:00
2026-01-23 13:34:36 -08:00
2025-08-02 22:44:58 -07:00
2025-07-01 07:46:22 -07:00
2025-10-08 12:17:55 -07:00
2026-01-23 13:34:36 -08:00
2025-10-26 16:34:39 -07:00
2025-12-30 12:58:19 +09:00
2025-12-07 07:28:13 +09:00
2026-01-09 18:36:16 -08:00
2026-01-09 06:37:02 -08:00
2025-09-16 18:00:25 -07:00
2025-09-16 18:00:25 -07:00
2025-12-25 08:29:29 +09:00
2025-12-25 08:29:28 +09:00
2025-07-23 08:15:18 -07:00
2025-12-30 10:53:47 +09:00
2025-11-03 06:49:55 -08:00
2025-09-12 08:59:52 -07:00
2025-12-05 14:49:56 +09:00
2026-02-01 18:15:01 -08:00
2026-01-23 13:34:36 -08:00
2025-07-23 08:15:18 -07:00
2025-12-16 11:08:35 +09:00
2025-11-25 12:15:59 -08:00
2025-11-25 12:15:59 -08:00
2025-07-15 15:18:18 -07:00
2025-11-19 17:41:03 -08:00
2025-11-19 17:41:03 -08:00
2025-07-01 14:58:24 -07:00
2025-07-01 14:46:37 -07:00
2025-08-21 13:46:59 -07:00
2026-01-23 13:34:37 -08:00
2025-11-04 07:48:07 -08:00
2025-08-21 13:46:58 -07:00
2025-11-19 10:55:42 -08:00
2025-12-23 11:33:15 +09:00
2025-07-15 15:18:18 -07:00
2025-07-01 14:58:24 -07:00
2025-12-25 08:29:28 +09:00
2025-12-25 08:29:28 +09:00
2025-12-29 22:02:54 +09:00
2025-07-23 08:15:18 -07:00
2026-01-09 18:36:17 -08:00
2026-01-09 18:36:17 -08:00
2025-12-16 11:08:35 +09:00
2025-12-07 07:28:11 +09:00
2025-11-12 14:04:04 -08:00

Build status

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.adoc to get started, then see Documentation/giteveryday.adoc for a useful minimum set of commands, and Documentation/git-<commandname>.adoc for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.adoc (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization l10) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email to git+subscribe@vger.kernel.org (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at https://lore.kernel.org/git/, https://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks
Description
A fork of Git containing Windows-specific patches.
Readme 559 MiB
2025-08-19 03:50:05 -05:00
Languages
C 51.5%
Shell 37.6%
Perl 4.2%
Tcl 3%
Python 0.8%
Other 2.7%