git-for-windows/git - git - Gitea: Self-hosted GitHub

mirror of https://github.com/git-for-windows/git.git synced 2026-03-19 21:10:38 -05:00

Author	SHA1	Message	Date
Johannes Schindelin	ca318aa4a7	Add experimental 'git survey' builtin (#5174 ) This introduces `git survey` to Git for Windows ahead of upstream for the express purpose of getting the path-based analysis in the hands of more folks. The inspiration of this builtin is [`git-sizer`](https://github.com/github/git-sizer), but since that command relies on `git cat-file --batch` to get the contents of objects, it has limits to how much information it can provide. This is mostly a rewrite of the `git survey` builtin that was introduced into the `microsoft/git` fork in microsoft/git#667. That version had a lot more bells and whistles, including an analysis much closer to what `git-sizer` provides. The biggest difference in this version is that this one is focused on using the path-walk API in order to visit batches of objects based on a common path. This allows identifying, for instance, the path that is contributing the most to the on-disk size across all versions at that path. For example, here are the top ten paths contributing to my local Git repository (which includes `microsoft/git` and `gitster/git`): ``` TOP FILES BY DISK SIZE ============================================================================ Path \| Count \| Disk Size \| Inflated Size -----------------------------------------+-------+-----------+-------------- whats-cooking.txt \| 1373 \| 11637459 \| 37226854 t/helper/test-gvfs-protocol \| 2 \| 6847105 \| 17233072 git-rebase--helper \| 1 \| 6027849 \| 15269664 compat/mingw.c \| 6111 \| 5194453 \| 463466970 t/helper/test-parse-options \| 1 \| 3420385 \| 8807968 t/helper/test-pkt-line \| 1 \| 3408661 \| 8778960 t/helper/test-dump-untracked-cache \| 1 \| 3408645 \| 8780816 t/helper/test-dump-fsmonitor \| 1 \| 3406639 \| 8776656 po/vi.po \| 104 \| 1376337 \| 51441603 po/de.po \| 210 \| 1360112 \| 71198603 ``` This kind of analysis has been helpful in identifying the reasons for growth in a few internal monorepos. Those findings motivated the changes in #5157 and #5171. With this early version in Git for Windows, we can expand the reach of the experimental tool in advance of it being contributed to the upstream project. Unfortunately, this will mean that in the next `microsoft/git` rebase, @jeffhostetler's version will need to be pulled out since there are enough conflicts. These conflicts include how tables are stored and generated, as the version in this PR is slightly more general to allow for different kinds of data.	2024-12-17 12:04:02 +01:00
Johannes Schindelin	994196d4b4	Introduce 'git backfill' to get missing blobs in a partial clone (#5172 ) This change introduces the `git backfill` command which uses the path walk API to download missing blobs in a blobless partial clone. By downloading blobs that correspond to the same file path at the same time, we hope to maximize the potential benefits of delta compression against multiple versions. These downloads occur in a configurable batch size, presenting a mechanism to perform "resumable" clones: `git clone --filter=blob:none` gets the commits and trees, then `git backfill` will download all missing blobs. If `git backfill` is interrupted partway through, it can be restarted and will redownload only the missing objects. When combining blobless partial clones with sparse-checkout, `git backfill` will assume its `--sparse` option and download only the blobs within the sparse-checkout. Users may want to do this as the repo size will still be smaller than the full repo size, but commands like `git blame` or `git log -L` will not suffer from many one-by-one blob downloads. Future directions should consider adding a pathspec or file prefix to further focus which paths are being downloaded in a batch.	2024-12-17 12:04:02 +01:00
Derrick Stolee	fcf6f6f6d5	Add path walk API and its use in 'git pack-objects' (#5171 ) This is a follow up to #5157 as well as motivated by the RFC in gitgitgadget/git#1786. We have ways of walking all objects, but it is focused on visiting a single commit and then expanding the new trees and blobs reachable from that commit that have not been visited yet. This means that objects arrive without any locality based on their path. Add a new "path walk API" that focuses on walking objects in batches according to their type and path. This will walk all annotated tags, all commits, all root trees, and then start a depth-first search among all paths in the repo to collect trees and blobs in batches. The most important application for this is being fast-tracked to Git for Windows: `git pack-objects --path-walk`. This application of the path walk API discovers the objects to pack via this batched walk, and automatically groups objects that appear at a common path so they can be checked for delta comparisons. This use completely avoids any name-hash collisions (even the collisions that sometimes occur with the new `--full-name-hash` option) and can be much faster to compute since the first pass of delta calculations does not waste time on objects that are unlikely to be diffable. Some statistics are available in the commit messages.	2024-12-17 12:04:01 +01:00
Johannes Schindelin	1775126925	pack-objects: create new name-hash algorithm (#5157 ) This is an updated version of gitgitgadget/git#1785, intended for early consumption into Git for Windows. The idea here is to add a new `--full-name-hash` option to `git pack-objects` and `git repack`. This adjusts the name-hash value used for finding delta bases in such a way that uses the full path name with a lower likelihood of collisions than the default name-hash algorithm. In many repositories with name-hash collisions and many versions of those paths, this can significantly reduce the size of a full repack. It can also help in certain cases of `git push`, but only if the pack is already artificially inflated by name-hash collisions; cases that find "sibling" deltas as better choices become worse with `--full-name-hash`. Thus, this option is currently recommended for full repacks of large repos, and on client machines without reachability bitmaps. Some care is taken to ignore this option when using bitmaps, either writing bitmaps or using a bitmap walk during reads. The bitmap file format contains name-hash values, but no way to indicate which function is used, so compatibility is a concern for bitmaps. Future work could explore this idea. After this PR is merged, then the more-involved `--path-walk` option may be considered.	2024-12-17 12:04:01 +01:00
Johannes Schindelin	83d4166cce	Lazy load libcurl, allowing for an SSL/TLS backend-specific libcurl (#4410 ) As per https://github.com/git-for-windows/git/issues/4350#issuecomment-1485041503, the major block for upgrading Git for Windows' OpenSSL from v1.1 to v3 is the tricky part where such an upgrade would break `git fetch`/`git clone` and `git push` because the libcurl depends on the OpenSSL DLL, and the major version bump will _change_ the file name of said DLL. To overcome that, the plan is to build libcurl flavors for each supported SSL/TLS backend, aligning with the way MSYS2 builds libcurl, then switch Git for Windows' SDK to the Secure Channel-flavored libcurl, and teach Git to look for the specific flavor of libcurl corresponding to the `http.sslBackend` setting (if that was configured). Here is the PR to teach Git that trick.	2024-12-17 12:03:48 +01:00
Johannes Schindelin	d35ad5344f	Merge pull request #2974 from derrickstolee/maintenance-and-headless Include Windows-specific maintenance and headless-git	2024-12-17 12:03:36 +01:00
Jeff Hostetler	38cc6ef4d7	survey: stub in new experimental 'git-survey' command Start work on a new 'git survey' command to scan the repository for monorepo performance and scaling problems. The goal is to measure the various known "dimensions of scale" and serve as a foundation for adding additional measurements as we learn more about Git monorepo scaling problems. The initial goal is to complement the scanning and analysis performed by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool. It is hoped that by creating a builtin command, we may be able to take advantage of internal Git data structures and code that is not accessible from GO to gain further insight into potential scaling problems. Co-authored-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Jeff Hostetler <git@jeffhostetler.com> Signed-off-by: Derrick Stolee <stolee@gmail.com>	2024-12-17 00:44:39 +01:00
Derrick Stolee	92eaef30a7	backfill: add builtin boilerplate In anticipation of implementing 'git backfill', populate the necessary files with the boilerplate of a new builtin. RFC TODO: When preparing this for a full implementation, make sure it is based on the newest standards introduced by [1]. [1] https://lore.kernel.org/git/xmqqjzfq2f0f.fsf@gitster.g/T/#m606036ea2e75a6d6819d6b5c90e729643b0ff7f7 [PATCH 1/3] builtin: add a repository parameter for builtin functions Signed-off-by: Derrick Stolee <stolee@gmail.com>	2024-12-17 00:43:41 +01:00
Derrick Stolee	51143f56d1	t6601: add helper for testing path-walk API Add some tests based on the current behavior, doing interesting checks for different sets of branches, ranges, and the --boundary option. This sets a baseline for the behavior and we can extend it as new options are introduced. Signed-off-by: Derrick Stolee <stolee@gmail.com>	2024-12-17 00:43:38 +01:00
Derrick Stolee	dd1838d192	path-walk: introduce an object walk by path In anticipation of a few planned applications, introduce the most basic form of a path-walk API. It currently assumes that there are no UNINTERESTING objects, and does not include any complicated filters. It calls a function pointer on groups of tree and blob objects as grouped by path. This only includes objects the first time they are discovered, so an object that appears at multiple paths will not be included in two batches. There are many future adaptations that could be made, but they are left for future updates when consumers are ready to take advantage of those features. Signed-off-by: Derrick Stolee <stolee@gmail.com>	2024-12-16 21:04:46 +01:00
Derrick Stolee	e9e1bd09ef	test-tool: add helper for name-hash values Add a new test-tool helper, name-hash, to output the value of the name-hash algorithms for the input list of strings, one per line. Since the name-hash values can be stored in the .bitmap files, it is important that these hash functions do not change across Git versions. Add a simple test to t5310-pack-bitmaps.sh to provide some testing of the current values. Due to how these functions are implemented, it would be difficult to change them without disturbing these values. Create a performance test that uses test_size to demonstrate how collisions occur for these hash algorithms. This test helps inform someone as to the behavior of the name-hash algorithms for their repo based on the paths at HEAD. My copy of the Git repository shows modest statistics around the collisions of the default name-hash algorithm: Test this tree ----------------------------------------------------------------- 5314.1: paths at head 4.5K 5314.2: number of distinct name-hashes 4.1K 5314.3: number of distinct full-name-hashes 4.5K 5314.4: maximum multiplicity of name-hashes 13 5314.5: maximum multiplicity of fullname-hashes 1 Here, the maximum collision multiplicity is 13, but around 10% of paths have a collision with another path. In a more interesting example, the microsoft/fluentui [1] repo had these statistics at time of committing: Test this tree ----------------------------------------------------------------- 5314.1: paths at head 19.6K 5314.2: number of distinct name-hashes 8.2K 5314.3: number of distinct full-name-hashes 19.6K 5314.4: maximum multiplicity of name-hashes 279 5314.5: maximum multiplicity of fullname-hashes 1 [1] https://github.com/microsoft/fluentui That demonstrates that of the nearly twenty thousand path names, they are assigned around eight thousand distinct values. 279 paths are assigned to a single value, leading the packing algorithm to sort objects from those paths together, by size. In this repository, no collisions occur for the full-name-hash algorithm. In a more extreme example, an internal monorepo had a much worse collision rate: Test this tree ----------------------------------------------------------------- 5314.1: paths at head 221.6K 5314.2: number of distinct name-hashes 72.0K 5314.3: number of distinct full-name-hashes 221.6K 5314.4: maximum multiplicity of name-hashes 14.4K 5314.5: maximum multiplicity of fullname-hashes 2 Even in this repository with many more paths at HEAD, the collision rate was low and the maximum number of paths being grouped into a single bucket by the full-path-name algorithm was two. Signed-off-by: Derrick Stolee <stolee@gmail.com>	2024-12-16 21:04:46 +01:00
Johannes Schindelin	b8bf94e542	http: support lazy-loading libcurl also on Windows This implements the Windows-specific support code, because everything is slightly different on Windows, even loading shared libraries. Note: I specifically do _not_ use the code from `compat/win32/lazyload.h` here because that code is optimized for loading individual functions from various system DLLs, while we specifically want to load _many_ functions from _one_ DLL here, and distinctly not a system DLL (we expect libcurl to be located outside `C:\Windows\system32`, something `INIT_PROC_ADDR` refuses to work with). Also, the `curl_easy_getinfo()`/`curl_easy_setopt()` functions are declared as vararg functions, which `lazyload.h` cannot handle. Finally, we are about to optionally override the exact file name that is to be loaded, which is a goal contrary to `lazyload.h`'s design. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-12-16 21:04:44 +01:00
Johannes Schindelin	7e9e7dbfc8	http: optionally load libcurl lazily This compile-time option allows to ask Git to load libcurl dynamically at runtime. Together with a follow-up patch that optionally overrides the file name depending on the `http.sslBackend` setting, this kicks open the door for installing multiple libcurl flavors side by side, and load the one corresponding to the (runtime-)configured SSL/TLS backend. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-12-16 21:04:44 +01:00
Jeff Hostetler	5ebaabe6a0	Makefile: clean up .ilk files when MSVC=1 Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>	2024-12-16 20:58:17 +01:00
Johannes Schindelin	3f61f35355	mimalloc: offer a build-time option to enable it By defining `USE_MIMALLOC`, Git can now be compiled with that nicely-fast and small allocator. Note that we have to disable a couple `DEVELOPER` options to build mimalloc's source code, as it makes heavy use of declarations after statements, among other things that disagree with Git's conventions. We even have to silence some GCC warnings in non-DEVELOPER mode. For example, the `-Wno-array-bounds` flag is needed because in `-O2` builds, trying to call `NtCurrentTeb()` (which `_mi_thread_id()` does on Windows) causes the bogus warning about a system header, likely related to https://sourceforge.net/p/mingw-w64/mailman/message/37674519/ and to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578: C:/git-sdk-64-minimal/mingw64/include/psdk_inc/intrin-impl.h:838:1: error: array subscript 0 is outside array bounds of 'long long unsigned int[0]' [-Werror=array-bounds] 838 \| __buildreadseg(__readgsqword, unsigned __int64, "gs", "q") \| ^~~~~~~~~~~~~~ Also: The `mimalloc` library uses C11-style atomics, therefore we must require that standard when compiling with GCC if we want to use `mimalloc` (instead of requiring "only" C99). This is what we do in the CMake definition already, therefore this commit does not need to touch `contrib/buildsystems/`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-12-16 20:47:09 +01:00
Johannes Schindelin	bead1b4470	Import the source code of mimalloc v2.1.2 This commit imports mimalloc's source code as per v2.1.2, fetched from the tag at https://github.com/microsoft/mimalloc. The .c files are from the src/ subdirectory, and the .h files from the include/ and include/mimalloc/ subdirectories. We will subsequently modify the source code to accommodate building within Git's context. Since we plan on using the `mi_*()` family of functions, we skip the C++-specific source code, some POSIX compliant functions to interact with mimalloc, and the code that wants to support auto-magic overriding of the `malloc()` function (mimalloc-new-delete.h, alloc-posix.c, mimalloc-override.h, alloc-override.c, alloc-override-osx.c, alloc-override-win.c and static.c). To appease the `check-whitespace` job of Git's Continuous Integration, this commit was washed one time via `git rebase --whitespace=fix`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-12-16 20:47:09 +01:00
Junio C Hamano	29e5596eb8	Merge branch 'ps/build' Build procedure update plus introduction of Meson based builds. * ps/build: (24 commits) Introduce support for the Meson build system Documentation: add comparison of build systems t: allow overriding build dir t: better support for out-of-tree builds Documentation: extract script to generate a list of mergetools Documentation: teach "cmd-list.perl" about out-of-tree builds Documentation: allow sourcing generated includes from separate dir Makefile: simplify building of templates Makefile: write absolute program path into bin-wrappers Makefile: allow "bin-wrappers/" directory to exist Makefile: refactor generators to be PWD-independent Makefile: extract script to generate gitweb.js Makefile: extract script to generate gitweb.cgi Makefile: extract script to massage Python scripts Makefile: extract script to massage Shell scripts Makefile: use "generate-perl.sh" to massage Perl library Makefile: extract script to massage Perl scripts Makefile: consistently use PERL_PATH Makefile: generate doc versions via GIT-VERSION-GEN Makefile: generate "git.rc" via GIT-VERSION-GEN ...	2024-12-15 17:54:33 -08:00
Junio C Hamano	cd0a222f08	Merge branch 'es/oss-fuzz' Backport oss-fuzz tests for us to our codebase. * es/oss-fuzz: fuzz: port fuzz-url-decode-mem from OSS-Fuzz fuzz: port fuzz-parse-attr-line from OSS-Fuzz fuzz: port fuzz-credential-from-url-gently from OSS-Fuzz	2024-12-13 07:33:42 -08:00
Junio C Hamano	de9278127e	Merge branch 'ps/reftable-detach' Isolates the reftable subsystem from the rest of Git's codebase by using fewer pieces of Git's infrastructure. * ps/reftable-detach: reftable/system: provide thin wrapper for lockfile subsystem reftable/stack: drop only use of `get_locked_file_path()` reftable/system: provide thin wrapper for tempfile subsystem reftable/stack: stop using `fsync_component()` directly reftable/system: stop depending on "hash.h" reftable: explicitly handle hash format IDs reftable/system: move "dir.h" to its only user	2024-12-10 10:04:56 +09:00
Patrick Steinhardt	7e0730c8ba	t: better support for out-of-tree builds Our in-tree builds used by the Makefile use various different build directories scattered around different locations. The paths to those build directories have to be propagated to our tests such that they can find the contained files. This is done via a mixture of hardcoded paths in our test library and injected variables in our bin-wrappers or "GIT-BUILD-OPTIONS". The latter two mechanisms are preferable over using hardcoded paths. For one, we have all paths which are subject to change stored in a small set of central files instead of having the knowledge of build paths in many files. And second, it allows build systems which build files elsewhere to adapt those paths based on their own needs. This is especially nice in the context of build systems that use out-of-tree builds like CMake or Meson. Remove hardcoded knowledge of build paths from our test library and move it into our bin-wrappers and "GIT-BUILD-OPTIONS". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:13 +09:00
Patrick Steinhardt	d2407bb8dc	Makefile: write absolute program path into bin-wrappers Write the absolute program path into our bin-wrappers. This allows us to simplify the Meson build instructions we are about to introduce a bit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:12 +09:00
Patrick Steinhardt	95bcd6f0b7	Makefile: allow "bin-wrappers/" directory to exist The "bin-wrappers/" directory gets created by our build system and is populated with one script for each of our binaries. There isn't anything inherently wrong with the current layout, but it is somewhat hard to adapt for out-of-tree build systems. Adapt the layout such that our "bin-wrappers/" directory always exists and contains our "wrap-for-bin.sh" script to make things a little bit easier for subsequent steps. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:11 +09:00
Patrick Steinhardt	3f145a4fe3	Makefile: refactor generators to be PWD-independent We have multiple scripts that generate headers from other data. All of these scripts have the assumption built-in that they are executed in the current source directory, which makes them a bit unwieldy to use during out-of-tree builds. Refactor them to instead take the source directory as well as the output file as arguments. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:11 +09:00
Patrick Steinhardt	b7835b941b	Makefile: extract script to massage Python scripts Extract a script that massages Python scripts. This provides a couple of benefits: - The build logic is deduplicated across Make, CMake and Meson. - CMake learns to rewrite scripts as-needed at build time instead of only writing them at configure time. Furthermore, we will use this script when introducing Meson to deduplicate the logic across build systems. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:10 +09:00
Patrick Steinhardt	eb98cb835c	Makefile: extract script to massage Shell scripts Same as in the preceding commits, extract a script that allows us to unify how we massage shell scripts. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:10 +09:00
Patrick Steinhardt	ccfba9e0c4	Makefile: use "generate-perl.sh" to massage Perl library Extend "generate-perl.sh" such that it knows to also massage the Perl library files. There are two major differences: - We do not read in the Perl header. This is handled by matching on whether or not we have a Perl shebang. - We substitute some more variables, which we read in via our GIT-BUILD-OPTIONS. Adapt both our Makefile and the CMake build instructions to use this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:10 +09:00
Patrick Steinhardt	e4b488049a	Makefile: extract script to massage Perl scripts Extract the script to inject various build-time parameters into our Perl scripts into a standalone script. This is done such that we can reuse it in other build systems. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:09 +09:00
Patrick Steinhardt	c2a3b847ed	Makefile: consistently use PERL_PATH When injecting the Perl path into our scripts we sometimes use '@PERL@' while we othertimes use '@PERL_PATH@'. Refactor the code use the latter consistently, which makes it easier to reuse the same logic for multiple scripts. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:09 +09:00
Patrick Steinhardt	9bb10d27e7	Makefile: generate "git.rc" via GIT-VERSION-GEN The "git.rc" is used on Windows to embed information like the project name and version into the resulting executables. As such we need to inject the version information, which we do by using preprocessor defines. The logic to do so is non-trivial and needs to be kept in sync with the different build systems. Refactor the logic so that we generate "git.rc" via `GIT-VERSION-GEN`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:09 +09:00
Patrick Steinhardt	0c8d339514	Makefile: propagate Git version via generated header We set up a couple of preprocessor macros when compiling Git that propagate the version that Git was built from to `git version` et al. The way this is set up makes it harder than necessary to reuse the infrastructure across the different build systems. Refactor this such that we generate a "version-def.h" header via `GIT-VERSION-GEN` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:08 +09:00
Patrick Steinhardt	4838deab65	Makefile: refactor GIT-VERSION-GEN to be reusable Our "GIT-VERSION-GEN" script always writes the "GIT-VERSION-FILE" into the current directory, where the expectation is that it should exist in the source directory. But other build systems that support out-of-tree builds may not want to do that to keep the source directory pristine, even though CMake currently doesn't care. Refactor the script such that it won't write the "GIT-VERSION-FILE" directly anymore, but instead knows to replace @PLACEHOLDERS@ in an arbitrary input file. This allows us to simplify the logic in CMake to determine the project version, but can also be reused later on in order to generate other files that need to contain version information like our "git.rc" file. While at it, change the format of the version file by removing the spaces around the equals sign. Like this we can continue to include the file in our Makefiles, but can also start to source it in shell scripts in subsequent steps. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:08 +09:00
Patrick Steinhardt	dbe46c0feb	Makefile: consistently use @PLACEHOLDER@ to substitute We have a bunch of placeholders in our scripts that we replace at build time, for example by using sed(1). These placeholders come in three different formats: @PLACEHOLDER@, @@PLACEHOLDER@@ and ++PLACEHOLDER++. Next to being inconsistent it also creates a bit of a problem with CMake, which only supports the first syntax in its `configure_file()` function. To work around that we instead manually replace placeholders via string operations, which is a hassle and removes safeguards that CMake has to verify that we didn't forget to replace any placeholders. Besides that, other build systems like Meson also support the CMake syntax. Unify our codebase to consistently use the syntax supported by such build systems. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:08 +09:00
Patrick Steinhardt	4638e8806e	Makefile: use common template for GIT-BUILD-OPTIONS The "GIT-BUILD-OPTIONS" file is generated by our build systems to propagate built-in features and paths to our tests. The generation is done ad-hoc, where both our Makefile and the CMake build instructions simply echo a bunch of strings into the file. This makes it very hard to figure out what variables are expected to exist and what format they have, and the written variables can easily get out of sync between build systems. Introduce a new "GIT-BUILD-OPTIONS.in" template to address this issue. This has multiple advantages: - It demonstrates which built options exist in the first place. - It can serve as a spot to document the build options. - Some build systems complain when not all variables could be substituted, alerting us of mismatches. Others don't, but if we forgot to substitute such variables we now have a bogus string that will likely cause our tests to fail, if they have any meaning in the first place. Backfill values that we didn't yet set in our CMake build instructions. While at it, remove the `SUPPORTS_SIMPLE_IPC` variable that we only set up in CMake as it isn't used anywhere. This change requires us to adapt the setup of TEST_OUTPUT_DIRECTORY in "test-lib.sh" such that it does not get overwritten after sourcing when it has been set up via the environment. This is the only instance I could find where we rely on ordering on variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-12-07 07:52:08 +09:00
Patrick Steinhardt	01e49941d6	reftable/system: provide thin wrapper for tempfile subsystem We use the tempfile subsystem to write temporary tables, but given that we're in the process of converting the reftable library to become standalone we cannot use this subsystem directly anymore. While we could in theory convert the code to use mkstemp(3p) instead, we'd lose access to our infrastructure that automatically prunes tempfiles via atexit(3p) or signal handlers. Provide a thin wrapper for the tempfile subsystem instead. Like this, the compatibility shim is fully self-contained in "reftable/system.c". Downstream users of the reftable library would have to implement their own tempfile shims by replacing "system.c" with a custom version. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-19 12:23:10 +09:00
Patrick Steinhardt	5dac35bbde	Makefile: let clar header targets depend on their scripts The targets that generate clar headers depend on their source files, but not on the script that is actually generating the output. Fix the issue by adding the missing dependencies. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-18 09:59:26 +09:00
Patrick Steinhardt	9a91ab9400	t/unit-tests: convert "clar-generate.awk" into a shell script Convert "clar-generate.awk" into a shell script that invokes awk(1). This allows us to avoid the shell redirect in the build system, which may otherwise be a problem with build systems on platforms that use a different shell. While at it, wrap the overly long lines in the CMake build instructions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-18 09:59:25 +09:00
Junio C Hamano	1ee7dbde67	Merge branch 'ps/upgrade-clar' Buildfix and upgrade of Clar to a newer version. * ps/upgrade-clar: cmake: set up proper dependencies for generated clar headers cmake: fix compilation of clar-based unit tests Makefile: extract script to generate clar declarations Makefile: adjust sed command for generating "clar-decls.h" t/unit-tests: update clar to 206accb	2024-11-08 12:56:28 +09:00
Taylor Blau	268fd2fe58	Merge branch 'ps/platform-compat-fixes' Various platform compatibility fixes split out of the larger effort to use Meson as the primary build tool. * ps/platform-compat-fixes: t6006: fix prereq handling with `test_format ()` http: fix build error on FreeBSD builtin/credential-cache: fix missing parameter for stub function t7300: work around platform-specific behaviour with long paths on MinGW t5500, t5601: skip tests which exercise paths with '[::1]' on Cygwin t3404: work around platform-specific behaviour on macOS 10.15 t1401: make invocation of tar(1) work with Win32-provided one t/lib-gpg: fix setup of GNUPGHOME in MinGW t/lib-gitweb: test against the build version of gitweb t/test-lib: wire up NO_ICONV prerequisite t/test-lib: fix quoting of TEST_RESULTS_SAN_FILE	2024-11-01 12:53:17 -04:00
Taylor Blau	dca32b8288	Merge branch 'pb/clar-build-fix' Build fix. * pb/clar-build-fix: Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o	2024-10-25 14:02:25 -04:00
Patrick Steinhardt	67f75dfe1b	Makefile: extract script to generate clar declarations Extract the script to generate function declarations for the clar unit testing framework into a standalone script. This is done such that we can reuse it in other build systems. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-21 16:53:07 -04:00
Alejandro R. Sedeño	a779c8e8d5	Makefile: adjust sed command for generating "clar-decls.h" This moves the end-of-line marker out of the captured group, matching the start-of-line marker and for some reason fixing generation of "clar-decls.h" on some older, more esoteric platforms. Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-21 16:53:07 -04:00
Eric Sesterhenn	751d063f27	fuzz: port fuzz-url-decode-mem from OSS-Fuzz Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several additional fuzz tests have been contributed directly to OSS-Fuzz; however, these tests are vulnerable to bitrot because they are not built during Git's CI runs, and thus breaking changes are much less likely to be noticed by Git contributors. Port one of these tests back to the Git project: fuzz-url-decode-mem This test was originally written by Eric Sesterhenn as part of a security audit of Git [2]. It was then contributed to the OSS-Fuzz repo in commit c58ac4492 (Git fuzzing: uncomment the existing and add new targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon) have verified with both Eric and Jaroslav that they're OK with moving this test to the Git project. [1] https://github.com/google/oss-fuzz [2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com> Co-authored-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-16 18:14:11 -04:00
Eric Sesterhenn	72686d4e5e	fuzz: port fuzz-parse-attr-line from OSS-Fuzz Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several additional fuzz tests have been contributed directly to OSS-Fuzz; however, these tests are vulnerable to bitrot because they are not built during Git's CI runs, and thus breaking changes are much less likely to be noticed by Git contributors. Port one of these tests back to the Git project: fuzz-parse-attr-line This test was originally written by Eric Sesterhenn as part of a security audit of Git [2]. It was then contributed to the OSS-Fuzz repo in commit c58ac4492 (Git fuzzing: uncomment the existing and add new targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon) have verified with both Eric and Jaroslav that they're OK with moving this test to the Git project. [1] https://github.com/google/oss-fuzz [2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com> Co-authored-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-16 18:14:11 -04:00
Eric Sesterhenn	966253db75	fuzz: port fuzz-credential-from-url-gently from OSS-Fuzz Git's fuzz tests are run continuously as part of OSS-Fuzz [1]. Several additional fuzz tests have been contributed directly to OSS-Fuzz; however, these tests are vulnerable to bitrot because they are not built during Git's CI runs, and thus breaking changes are much less likely to be noticed by Git contributors. Port one of these tests back to the Git project: fuzz-credential-from-url-gently This test was originally written by Eric Sesterhenn as part of a security audit of Git [2]. It was then contributed to the OSS-Fuzz repo in commit c58ac4492 (Git fuzzing: uncomment the existing and add new targets. (#11486), 2024-02-21) by Jaroslav Lobačevski. I (Josh Steadmon) have verified with both Eric and Jaroslav that they're OK with moving this test to the Git project. [1] https://github.com/google/oss-fuzz [2] https://ostif.org/wp-content/uploads/2023/01/X41-OSTIF-Gitlab-Git-Security-Audit-20230117-public.pdf Co-authored-by: Jaroslav Lobačevski <jarlob@gmail.com> Co-authored-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-16 18:14:11 -04:00
Patrick Steinhardt	df383b5842	t/test-lib: wire up NO_ICONV prerequisite The iconv library is used by Git to reencode files, commit messages and other things. As such it is a rather integral part, but given that many platforms nowadays use UTF-8 everywhere you can live without support for reencoding in many situations. It is thus optional to build Git with iconv, and some of our platforms wired up in "config.mak.uname" disable it. But while we support building without it, running our test suite with "NO_ICONV=Yes" causes many test failures. Wire up a new test prerequisite ICONV that gets populated via our GIT-BUILD-OPTIONS. Annotate failing tests accordingly. Note that this commit does not do a deep dive into every single test to assess whether the failure is expected or not. Most of the tests do smell like the expected kind of failure though. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2024-10-16 17:00:49 -04:00
Philippe Blain	ea3422662d	Makefile: fix dependency for $(UNIT_TEST_DIR)/clar/clar.o The clar source file '$(UNIT_TEST_DIR)/clar/clar.c' includes the generated 'clar.suite', but this dependency is not taken into account by our Makefile, so that it is possible for a parallel build to fail if Make tries to build 'clar.o' before 'clar.suite' is generated. Correctly specify the dependency. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-10-11 11:08:08 -07:00
Junio C Hamano	5575c713c2	Merge branch 'ps/reftable-alloc-failures' The reftable library is now prepared to expect that the memory allocation function given to it may fail to allocate and to deal with such an error. * ps/reftable-alloc-failures: (26 commits) reftable/basics: fix segfault when growing `names` array fails reftable/basics: ban standard allocator functions reftable: introduce `REFTABLE_FREE_AND_NULL()` reftable: fix calls to free(3P) reftable: handle trivial allocation failures reftable/tree: handle allocation failures reftable/pq: handle allocation failures when adding entries reftable/block: handle allocation failures reftable/blocksource: handle allocation failures reftable/iter: handle allocation failures when creating indexed table iter reftable/stack: handle allocation failures in auto compaction reftable/stack: handle allocation failures in `stack_compact_range()` reftable/stack: handle allocation failures in `reftable_new_stack()` reftable/stack: handle allocation failures on reload reftable/reader: handle allocation failures in `reader_init_iter()` reftable/reader: handle allocation failures for unindexed reader reftable/merged: handle allocation failures in `merged_table_init_iter()` reftable/writer: handle allocation failures in `reftable_new_writer()` reftable/writer: handle allocation failures in `writer_index_hash()` reftable/record: handle allocation failures when decoding records ...	2024-10-10 14:22:25 -07:00
Patrick Steinhardt	a5a15a4514	reftable/basics: merge "publicbasics" into "basics" The split between "basics" and "publicbasics" is somewhat arbitrary and not in line with how we typically structure code in the reftable library. While we do indeed split up headers into a public and internal part, we don't do that for the compilation unit itself. Furthermore, the declarations for "publicbasics.c" are in "reftable-malloc.h", which isn't in line with our naming schema, either. Fix these inconsistencies by: - Merging "publicbasics.c" into "basics.c". - Renaming "reftable-malloc.h" to "reftable-basics.h" as the public header. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-10-02 07:53:51 -07:00
Junio C Hamano	ead0a050e2	Merge branch 'tb/weak-sha1-for-tail-sum' The checksum at the tail of files are now computed without collision detection protection. This is safe as the consumer of the information to protect itself from replay attacks checks for hash collisions independently. * tb/weak-sha1-for-tail-sum: csum-file.c: use unsafe SHA-1 implementation when available Makefile: allow specifying a SHA-1 for non-cryptographic uses hash.h: scaffolding for _unsafe hashing variants sha1: do not redefine `platform_SHA_CTX` and friends pack-objects: use finalize_object_file() to rename pack/idx/etc finalize_object_file(): implement collision check finalize_object_file(): refactor unlink_or_warn() placement finalize_object_file(): check for name collision before renaming	2024-10-02 07:46:27 -07:00
Taylor Blau	06c92dafb8	Makefile: allow specifying a SHA-1 for non-cryptographic uses Introduce _UNSAFE variants of the OPENSSL_SHA1, BLK_SHA1, and APPLE_COMMON_CRYPTO_SHA1 compile-time knobs which indicate which SHA-1 implementation is to be used for non-cryptographic uses. There are a couple of small implementation notes worth mentioning: - There is no way to select the collision detecting SHA-1 as the "fast" fallback, since the fast fallback is only for non-cryptographic uses, and is meant to be faster than our collision-detecting implementation. - There are no similar knobs for SHA-256, since no collision attacks are presently known and thus no collision-detecting implementations actually exist. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-27 11:27:47 -07:00

1 2 3 4 5 ...

3060 Commits