Commit Graph

10 Commits

Author SHA1 Message Date
Kristofer Karlsson
9f75e7a150 prio-queue: fold lazy_queue into prio_queue for automatic get+put fusion
Defer the actual removal in prio_queue_get() until the next
operation.  If that next operation is a prio_queue_put(), the
removal and insertion are fused into a single replace — writing
the new element at the root and sifting it down — which avoids
a full remove-rebalance-insert cycle.

This matches the dominant usage pattern in git's commit traversal:
get a commit, then put its parents.  The first parent insertion
after each get is now a replace operation automatically.

This generalizes the lazy_queue pattern from builtin/describe.c
(introduced in 08bb69d70f) into prio_queue itself.  Three callers
independently implemented the same get+put fusion:

  - builtin/describe.c had a full lazy_queue wrapper
  - commit.c:pop_most_recent_commit() used peek+replace
  - builtin/show-branch.c:join_revs() used peek+replace

All three now collapse to plain _get() and _put(), with the data
structure handling the fusion internally.  This simplifies callers
and means every prio_queue user gets the optimization for free
without needing to implement it manually.

Remove prio_queue_replace() since no external callers remain.

Benchmarked on a 1.8M-commit monorepo (30 interleaved runs,
paired t-test, Xeon @ 2.20GHz):

Code paths that previously did eager get+put (new optimization):

  Command                       base    patched  change      p
  merge-base --all A A~1000     3828ms  3725ms   -2.69%  0.0001
  rev-list --count A~1000..A    3055ms  2986ms   -2.27%  0.0601
  log --oneline A~1000..A       3408ms  3350ms   -1.71%  0.0482

Code paths that already had manual get+put fusion (expect
neutral — the optimization moves into prio_queue but the number
of heap operations stays the same):

  Command                       base    patched  change      p
  show-branch A A~1000          9156ms  9127ms   -0.32%  0.3470
  describe (4751 revs, 81K repo) 1983ms 1963ms  -1.02%  <0.001

No regressions in any scenario.

Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-09 11:11:46 -07:00
Kristofer Karlsson
3c57836988 prio-queue: rename .nr to .nr_ and add accessor helpers
Rename the .nr member to .nr_ so that callers outside prio-queue.c
that directly reference .nr get a compilation error.  This catches
both existing misuse and future in-flight topics.

Add prio_queue_size() for callers that need to know the element count
and prio_queue_for_each() for callers that need to walk all elements.

Convert all external .nr users:
 - Loop conditions: use prio_queue_size(), prio_queue_get(), or
   prio_queue_peek() as the loop condition
 - Array iterations: use prio_queue_for_each()

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-06-09 11:11:46 -07:00
René Scharfe
3d5091d232 prio-queue: add prio_queue_replace()
Add a function to replace the top element of the queue that basically
does the same as prio_queue_get() followed by prio_queue_put(), but
without the work by prio_queue_get() to rebalance the heap.  It can be
used to optimize loops that get one element and then immediately add
another one.  That's common e.g., with commit history traversal, where
we get out a commit and then put in its parents.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-22 07:28:35 -07:00
Patrick Steinhardt
44945dfe86 prio-queue: fix type of insertion_ctr
In 62e745ced2 (prio-queue: use size_t rather than int for size,
2024-12-20), we have converted `struct prio_queue` to use `size_t` to
track the number of entries in the queue as well as the allocated size
of the underlying array. There is one more counter though, namely the
insertion counter, that is still using an `unsigned` instead of a
`size_t`. This is unlikely to ever be a problem, but it makes one wonder
why some indices use `size_t` while others use `unsigned`. Furthermore,
the mentioned commit stated the intent to also adapt these variables,
but seemingly forgot to do so.

Fix the issue by converting those counters to use `size_t`, as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:10:41 -08:00
Jeff King
62e745ced2 prio-queue: use size_t rather than int for size
The alloc and nr fields of a prio-queue tell us how much memory is
allocated and used in the array. So the natural type for them is size_t,
which prevents overflow on 64-bit systems where "int" is still 32 bits.

This is unlikely to happen in practice, as we typically use it for
storing commits, and having 2^31 of those is rather a lot. But it's good
to keep our generic data structures as flexible as possible. And as we
start to enforce -Wsign-compare, it means that callers need to use
"int", too, and the problem proliferates. Let's fix it at the source.

The changes here can be put into a few groups:

  1. Changing the alloc/nr fields in the struct to size_t. This requires
     swapping out int for size_t in negotiator/skipping.c, as well as
     in prio_queue_get(), because those all iterate over the array.
     Building with -Wsign-compare complains about these.

  2. Other code that assigns or passes around indexes into the array
     (e.g., the swap() and compare() functions) won't trigger
     -Wsign-compare because we are simply truncating the values. These
     are caught by -Wconversion, but I've adjusted them here to
     future-proof us.

  3. In prio_queue_reverse() we compute "queue->nr - 1" without checking
     if anything is in the queue, which underflows now that nr is
     unsigned. We can fix that by returning early when the queue is
     empty (there is nothing to reverse).

  4. The insertion_ctr variable is currently unsigned, but can likewise
     grow (it is actually worse, because adding and removing an element
     many times will keep increasing the counter, even though "nr" does
     not). I've bumped that to size_t here, as well.

     But -Wconversion notes that computing the "cmp" result by
     subtracting the counters and assigning to "int" is a potential
     problem. And that's true even before this patch, since we use an
     unsigned counter (imagine comparing "2^32-1" and "0", which should
     be a high positive value, but instead is "-1" as a signed int).

     Since we only care about the sign (and not the magnitude) of the
     result, we could fix this by swapping out the subtraction for a
     ternary comparison. Probably the performance impact would be
     negligible, since we just called into a custom compare function and
     branched on its result anyway. But it's easy enough to do a
     branchless version by subtracting the comparison results.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-20 07:21:45 -08:00
Denton Liu
554544276a *.[ch]: remove extern from function declarations using spatch
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.

This was the Coccinelle patch used:

	@@
	type T;
	identifier f;
	@@
	- extern
	  T f(...);

and it was run with:

	$ git ls-files \*.{c,h} |
		grep -v ^compat/ |
		xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place

Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-05 15:20:06 +09:00
Derrick Stolee
aca4240f6a prio-queue: add 'peek' operation
When consuming a priority queue, it can be convenient to inspect
the next object that will be dequeued without actually dequeueing
it. Our existing library did not have such a 'peek' operation, so
add it as prio_queue_peek().

Add a reference-level comparison in t/helper/test-prio-queue.c
so this method is exercised by t0009-prio-queue.sh. Further, add
a test that checks the behavior when the compare function is NULL
(i.e. the queue becomes a stack).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02 12:14:21 +09:00
Jeff King
e8f91e3df8 prio-queue: make output stable with respect to insertion
If two items are added to a prio_queue and compare equal,
they currently come out in an apparently random order (this
order is deterministic for a particular sequence of
insertions and removals, but does not necessarily match the
insertion order). This makes it unlike using a date-ordered
commit_list, which is one of the main types we would like to
replace with it (because prio_queue does not suffer from
O(n) insertions).

We can make the priority queue stable by keeping an
insertion counter for each element, and using it to break
ties. This does increase the memory usage of the structure
(one int per element), but in practice it does not seem to
affect runtime. A best-of-five "git rev-list --topo-order"
on linux.git showed less than 1% difference (well within the
run-to-run noise).

In an ideal world, we would offer both stable and unstable
priority queues (the latter to try to maximize performance).
However, given the lack of a measurable performance
difference, it is not worth the extra code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-15 11:02:54 -07:00
Junio C Hamano
da24b1044f sort-in-topological-order: use prio-queue
Use the prio-queue data structure to implement a priority queue of
commits sorted by committer date, when handling --date-order.  The
structure can also be used as a simple LIFO stack, which is a good
match for --topo-order processing.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-11 15:15:21 -07:00
Junio C Hamano
b4b594a315 prio-queue: priority queue of pointers to structs
Traditionally we used a singly linked list of commits to hold a set
of in-flight commits while traversing history.  The most typical use
of the list is to add commits that are newly discovered to it, keep
the list sorted by commit timestamp, pick up the newest one from the
list, and keep digging.  The cost of keeping the singly linked list
sorted is nontrivial, and this typical use pattern better matches a
priority queue.

Introduce a prio-queue structure, that can be used either as a LIFO
stack, or a priority queue.  This will be used in the next patch to
hold in-flight commits during sort-in-topological-order.

Tests and the idea to make it usable for any "void *" pointers to
"things" are by Jeff King.  Bugs are mine.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-11 15:15:21 -07:00