Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
This topic branch addresses the following vulnerability:
- **CVE-2025-66413**:
When a user clones a repository from an attacker-controlled server,
Git may attempt NTLM authentication and disclose the user's NTLMv2 hash
to the remote server. Since NTLM hashing is weak, the captured hash can
potentially be brute-forced to recover the user's credentials. This is
addressed by disabling NTLM authentication by default.
(https://github.com/git-for-windows/git/security/advisories/GHSA-hv9c-4jm9-jh3x)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This introduces `git survey` to Git for Windows ahead of upstream for
the express purpose of getting the path-based analysis in the hands of
more folks.
The inspiration of this builtin is
[`git-sizer`](https://github.com/github/git-sizer), but since that
command relies on `git cat-file --batch` to get the contents of objects,
it has limits to how much information it can provide.
This is mostly a rewrite of the `git survey` builtin that was introduced
into the `microsoft/git` fork in microsoft/git#667. That version had a
lot more bells and whistles, including an analysis much closer to what
`git-sizer` provides.
The biggest difference in this version is that this one is focused on
using the path-walk API in order to visit batches of objects based on a
common path. This allows identifying, for instance, the path that is
contributing the most to the on-disk size across all versions at that
path.
For example, here are the top ten paths contributing to my local Git
repository (which includes `microsoft/git` and `gitster/git`):
```
TOP FILES BY DISK SIZE
============================================================================
Path | Count | Disk Size | Inflated Size
-----------------------------------------+-------+-----------+--------------
whats-cooking.txt | 1373 | 11637459 | 37226854
t/helper/test-gvfs-protocol | 2 | 6847105 | 17233072
git-rebase--helper | 1 | 6027849 | 15269664
compat/mingw.c | 6111 | 5194453 | 463466970
t/helper/test-parse-options | 1 | 3420385 | 8807968
t/helper/test-pkt-line | 1 | 3408661 | 8778960
t/helper/test-dump-untracked-cache | 1 | 3408645 | 8780816
t/helper/test-dump-fsmonitor | 1 | 3406639 | 8776656
po/vi.po | 104 | 1376337 | 51441603
po/de.po | 210 | 1360112 | 71198603
```
This kind of analysis has been helpful in identifying the reasons for
growth in a few internal monorepos. Those findings motivated the changes
in #5157 and #5171.
With this early version in Git for Windows, we can expand the reach of
the experimental tool in advance of it being contributed to the upstream
project.
Unfortunately, this will mean that in the next `microsoft/git` rebase,
Jeff Hostetler's version will need to be pulled out since there are
enough conflicts. These conflicts include how tables are stored and
generated, as the version in this PR is slightly more general to allow
for different kinds of data.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'git survey' builtin provides several detail tables, such as "top
files by on-disk size". The size of these tables defaults to 10,
currently.
Allow the user to specify this number via a new --top=<N> option or the
new survey.top config key.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
NTLM authentication is relatively weak. This is the case even with the
default setting of modern Windows versions, where NTLMv1 and LanManager
are disabled and only NTLMv2 is enabled: NTLMv2 hashes of even
reasonably complex 8-character passwords can be broken in a matter of
days, given enough compute resources.
Even worse: On Windows, NTLM authentication uses Security Support
Provider Interface ("SSPI"), which provides the credentials without
requiring the user to type them in.
Which means that an attacker could talk an unsuspecting user into
cloning from a server that is under the attacker's control and extracts
the user's NTLMv2 hash without their knowledge.
For that reason, let's disallow NTLM authentication by default.
NTLM authentication is quite simple to set up, though, and therefore
there are still some on-prem Azure DevOps setups out there whose users
and/or automation rely on this type of authentication. To give them an
escape hatch, introduce the `http.<url>.allowNTLMAuth` config setting
that can be set to `true` to opt back into using NTLM for a specific
remote repository.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the moment, nothing is obvious about the reason for the use of the
path-walk API, but this will become more prevelant in future iterations. For
now, use the path-walk API to sum up the counts of each kind of object.
For example, this is the reachable object summary output for my local repo:
REACHABLE OBJECT SUMMARY
========================
Object Type | Count
------------+-------
Tags | 1343
Commits | 179344
Trees | 314350
Blobs | 184030
Signed-off-by: Derrick Stolee <stolee@gmail.com>
When 'git survey' provides information to the user, this will be presented
in one of two formats: plaintext and JSON. The JSON implementation will be
delayed until the functionality is complete for the plaintext format.
The most important parts of the plaintext format are headers specifying the
different sections of the report and tables providing concreted data.
Create a custom table data structure that allows specifying a list of
strings for the row values. When printing the table, check each column for
the maximum width so we can create a table of the correct size from the
start.
The table structure is designed to be flexible to the different kinds of
output that will be implemented in future changes.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
By default we will scan all references in "refs/heads/", "refs/tags/"
and "refs/remotes/".
Add command line opts let the use ask for all refs or a subset of them
and to include a detached HEAD.
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Start work on a new 'git survey' command to scan the repository
for monorepo performance and scaling problems. The goal is to
measure the various known "dimensions of scale" and serve as a
foundation for adding additional measurements as we learn more
about Git monorepo scaling problems.
The initial goal is to complement the scanning and analysis performed
by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool.
It is hoped that by creating a builtin command, we may be able to take
advantage of internal Git data structures and code that is not
accessible from GO to gain further insight into potential scaling
problems.
Co-authored-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Atomic append on windows is only supported on local disk files, and it may
cause errors in other situations, e.g. network file system. If that is the
case, this config option should be used to turn atomic append off.
Co-Authored-By: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: 孙卓识 <sunzhuoshi@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds support for a new http.sslAutoClientCert config value.
In cURL 7.77 or later the schannel backend does not automatically send
client certificates from the Windows Certificate Store anymore.
This config value is only used if http.sslBackend is set to "schannel",
and can be used to opt in to the old behavior and force cURL to send
client certificates.
This fixes https://github.com/git-for-windows/git/issues/3292
Signed-off-by: Pascal Muller <pascalmuller@gmail.com>
The native Windows HTTPS backend is based on Secure Channel which lets
the caller decide how to handle revocation checking problems caused by
missing information in the certificate or offline CRL distribution
points.
Unfortunately, cURL chose to handle these problems differently than
OpenSSL by default: while OpenSSL happily ignores those problems
(essentially saying "¯\_(ツ)_/¯"), the Secure Channel backend will error
out instead.
As a remedy, the "no revoke" mode was introduced, which turns off
revocation checking altogether. This is a bit heavy-handed. We support
this via the `http.schannelCheckRevoke` setting.
In https://github.com/curl/curl/pull/4981, we contributed an opt-in
"best effort" strategy that emulates what OpenSSL seems to do.
In Git for Windows, we actually want this to be the default. This patch
makes it so, introducing it as a new value for the
`http.schannelCheckRevoke" setting, which now becmes a tristate: it
accepts the values "false", "true" or "best-effort" (defaulting to the
last one).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Since commit 0c499ea60f (send-pack: demultiplex a sideband stream with
status data, 2010-02-05) the send-pack builtin uses the side-band-64k
capability if advertised by the server.
Unfortunately this breaks pushing over the dump git protocol if used
over a network connection.
The detailed reasons for this breakage are (by courtesy of Jeff Preshing,
quoted from https://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ):
MinGW wraps Windows sockets in CRT file descriptors in order to
mimic the functionality of POSIX sockets. This causes msvcrt.dll
to treat sockets as Installable File System (IFS) handles,
calling ReadFile, WriteFile, DuplicateHandle and CloseHandle on
them. This approach works well in simple cases on recent
versions of Windows, but does not support all usage patterns. In
particular, using this approach, any attempt to read & write
concurrently on the same socket (from one or more processes)
will deadlock in a scenario where the read waits for a response
from the server which is only invoked after the write. This is
what send_pack currently attempts to do in the use_sideband
codepath.
The new config option `sendpack.sideband` allows to override the
side-band-64k capability of the server, and thus makes the dumb git
protocol work.
Other transportation methods like ssh and http/https still benefit from
the sideband channel, therefore the default value of `sendpack.sideband`
is still true.
Signed-off-by: Thomas Braun <thomas.braun@byte-physics.de>
Signed-off-by: Oliver Schneider <oliver@assarbad.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When fetching objects into a lazily cloned repository, .promisor
files are created with information meant to help debugging. "git
repack" has been taught to carry this information forward to
packfiles that are newly created.
* lp/repack-propagate-promisor-debugging-info:
SQUASH???
t7703: test for promisor file content after geometric repack
t7700: test for promisor file content after repack
repack-promisor: preserve content of promisor files after repack
pack-write: add helper to fill promisor file after repack
pack-write: add explanation to promisor file content
The fsmonitor daemon has been implemented for Linux.
* pt/fsmonitor-linux:
fsmonitor: convert shown khash to strset in do_handle_client
fsmonitor: add tests for Linux
fsmonitor: add timeout to daemon stop command
fsmonitor: close inherited file descriptors and detach in daemon
run-command: add close_fd_above_stderr option
fsmonitor: implement filesystem change listener for Linux
fsmonitor: rename fsm-settings-darwin.c to fsm-settings-unix.c
fsmonitor: rename fsm-ipc-darwin.c to fsm-ipc-unix.c
fsmonitor: use pthread_cond_timedwait for cookie wait
compat/win32: add pthread_cond_timedwait
fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon
fsmonitor: fix khash memory leak in do_handle_client
t9210, t9211: disable GIT_TEST_SPLIT_INDEX for scalar clone tests
Rust support is enabled by default (but still allows opting out) in
Git 2.54.
* bc/rust-by-default-in-2.54:
Enable Rust by default
Linux: link against libdl
ci: install cargo on Alpine
docs: update version with default Rust support
The graph output from commands like "git log --graph" can now be
limited to a specified number of lanes, preventing overly wide output
in repositories with many branches.
* ps/graph-lane-limit:
graph: add truncation mark to capped lanes
graph: add --graph-lane-limit option
graph: limit the graph width to a hard-coded max
"git checkout -m another-branch" was invented to deal with local
changes to paths that are different between the current and the new
branch, but it gave only one chance to resolve conflicts. The command
was taught to create a stash to save the local changes.
* hn/git-checkout-m-with-stash:
checkout: -m (--merge) uses autostash when switching branches
sequencer: teach autostash apply to take optional conflict marker labels
sequencer: allow create_autostash to run silently
stash: add --ours-label, --theirs-label, --base-label for apply
"git name-rev" learned to use custom format instead of the object
name in an extended SHA-1 expression form.
Comments?
* kh/name-rev-custom-format:
name-rev: learn --format=<pretty>
name-rev: wrap both blocks in braces
The final step, split from earlier attempt by Dscho, to loosen the
sideband restriction for now and tighten later at Git v3.0 boundary.
* jc/neuter-sideband-post-3.0:
sideband: delay sanitizing by default to Git v3.0
"git clone" learns to pay attention to "clone.<url>.defaultObjectFilter"
configuration and behave as if the "--filter=<filter-spec>" option
was given on the command line.
* ab/clone-default-object-filter:
clone: add clone.<url>.defaultObjectFilter config
* ar/parallel-hooks:
hook: allow hook.jobs=-1 to use all available CPU cores
hook: add hook.<event>.enabled switch
hook: move is_known_hook() to hook.c for wider use
hook: warn when hook.<friendly-name>.jobs is set
hook: add per-event jobs config
hook: add -j/--jobs option to git hook run
hook: mark non-parallelizable hooks
hook: allow pre-push parallel execution
hook: allow parallel hook execution
hook: parse the hook.jobs config
config: add a repo_config_get_uint() helper
repository: fix repo_init() memleak due to missing _clear()
The repacking code has been refactored and compaction of MIDX layers
have been implemented, and incremental strategy that does not require
all-into-one repacking has been introduced.
* tb/incremental-midx-part-3.3:
repack: allow `--write-midx=incremental` without `--geometric`
repack: introduce `--write-midx=incremental`
repack: implement incremental MIDX repacking
packfile: ensure `close_pack_revindex()` frees in-memory revindex
builtin/repack.c: convert `--write-midx` to an `OPT_CALLBACK`
repack-geometry: prepare for incremental MIDX repacking
repack-midx: extract `repack_fill_midx_stdin_packs()`
repack-midx: factor out `repack_prepare_midx_command()`
midx: expose `midx_layer_contains_pack()`
repack: track the ODB source via existing_packs
midx: support custom `--base` for incremental MIDX writes
midx: introduce `--checksum-only` for incremental MIDX writes
midx: use `strvec` for `keep_hashes`
strvec: introduce `strvec_init_alloc()`
midx: use `string_list` for retained MIDX files
midx-write: handle noop writes when converting incremental chains
When switching branches with "git checkout -m", local modifications
can block the switch. Teach the -m flow to create a temporary stash
before switching and reapply it after. On success, only "Applied
autostash." is shown. If reapplying causes conflicts, the stash is
kept and the user is told they can resolve and run "git stash drop",
or run "git reset --hard" and later "git stash pop" to recover their
changes.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow callers of "git stash apply" to pass custom labels for conflict
markers instead of the default "Updated upstream" and "Stashed changes".
Document the new options and add a test.
Signed-off-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The [includeIf "condition"] conditional inclusion facility for
configuration files has learned to use the location of worktree
in its condition.
Comments?
* cl/conditional-config-on-worktree-path:
config: add "worktree" and "worktree/i" includeIf conditions
config: refactor include_by_gitdir() into include_by_path()
"git cat-file --batch" learns an in-line command "mailmap"
that lets the user toggle use of mailmap.
* sa/cat-file-batch-mailmap-switch:
cat-file: add mailmap subcommand to --batch-command
Promisor remote handling has been refactored and fixed in
preparation for auto-configuration of advertised remotes.
* cc/promisor-auto-config-url:
t5710: use proper file:// URIs for absolute paths
promisor-remote: remove the 'accepted' strvec
promisor-remote: keep accepted promisor_info structs alive
promisor-remote: refactor accept_from_server()
promisor-remote: refactor has_control_char()
promisor-remote: refactor should_accept_remote() control flow
promisor-remote: reject empty name or URL in advertised remote
promisor-remote: clarify that a remote is ignored
promisor-remote: pass config entry to all_fields_match() directly
promisor-remote: try accepted remotes before others in get_direct()
Try to resurrect and reboot a stalled "avoid sending risky escape
sequences taken from sideband to the terminal" topic by Dscho. The
plan is to keep it in 'next' long enough to see if anybody screams
with the "everything dropped except for ANSI color escape sequences"
default.
* jc/neuter-sideband-fixup:
sideband: drop 'default' configuration
sideband: offer to configure sanitizing on a per-URL basis
sideband: add options to allow more control sequences to be passed through
sideband: do allow ANSI color sequences by default
sideband: introduce an "escape hatch" to allow control characters
sideband: mask control characters
"git config list" is the official way to spell "git config -l" and
"git config --list". Use it to update the documentation.
* kh/doc-config-list:
doc: gitcvs-migration: rephrase “man page”
doc: replace git config --list/-l with `list`
Implement the built-in fsmonitor daemon for Linux using the inotify
API, bringing it to feature parity with the existing Windows and macOS
implementations.
The implementation uses inotify rather than fanotify because fanotify
requires either CAP_SYS_ADMIN or CAP_PERFMON capabilities, making it
unsuitable for an unprivileged user-space daemon. While inotify has
the limitation of requiring a separate watch on every directory (unlike
macOS's FSEvents, which can monitor an entire directory tree with a
single watch), it operates without elevated privileges and provides
the per-file event granularity needed for fsmonitor.
The listener uses inotify_init1(O_NONBLOCK) with a poll loop that
checks for events with a 50-millisecond timeout, keeping the inotify
queue well-drained to minimize the risk of overflows. Bidirectional
hashmaps map between watch descriptors and directory paths for efficient
event resolution. Directory renames are tracked using inotify's cookie
mechanism to correlate IN_MOVED_FROM and IN_MOVED_TO event pairs; a
periodic check detects stale renames where the matching IN_MOVED_TO
never arrived, forcing a resync.
New directory creation triggers recursive watch registration to ensure
all subdirectories are monitored. The IN_MASK_CREATE flag is used
where available to prevent modifying existing watches, with a fallback
for older kernels. When IN_MASK_CREATE is available and
inotify_add_watch returns EEXIST, it means another thread or recursive
scan has already registered the watch, so it is safe to ignore.
Remote filesystem detection uses statfs() to identify network-mounted
filesystems (NFS, CIFS, SMB, FUSE, etc.) via their magic numbers.
Mount point information is read from /proc/mounts and matched against
the statfs f_fsid to get accurate, human-readable filesystem type names
for logging. When the .git directory is on a remote filesystem, the
IPC socket falls back to $HOME or a user-configured directory via the
fsmonitor.socketDir setting.
Based-on-patch-by: Eric DeCosta <edecosta@mathworks.com>
Based-on-patch-by: Marziyeh Esipreh <marziyeh.esipreh@gmail.com>
Signed-off-by: Paul Tarjan <github@paulisageek.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify that --prefix is used as given and is not normalized,
and may include leading slashes or parent directory components.
Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The experimental `git replay` command learned the `--ref=<ref>` option
to allow specifying which ref to update, overriding the default behavior.
* tc/replay-ref:
replay: allow to specify a ref with option --ref
replay: use stuck form in documentation and help message
builtin/replay: mark options as not negatable
Various code clean-up around odb subsystem.
* ps/odb-cleanup:
odb: drop unneeded headers and forward decls
odb: rename `odb_has_object()` flags
odb: use enum for `odb_write_object` flags
odb: rename `odb_write_object()` flags
treewide: use enum for `odb_for_each_object()` flags
CodingGuidelines: document our style for flags
Our description of the reftable format is that it is experimental and
subject to change, but that is no longer true. Remove this statement so
as not to mislead users.
In addition, the documentation says that the files format is the
default, but that is not true if breaking changes mode is on. Correct
this information with a conditional.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handling of signed commits and tags in fast-import has been made more
configurable.
* jt/fast-import-signed-modes:
fast-import: add 'abort-if-invalid' mode to '--signed-tags=<mode>'
fast-import: add 'sign-if-invalid' mode to '--signed-tags=<mode>'
fast-import: add 'strip-if-invalid' mode to '--signed-tags=<mode>'
fast-import: add 'abort-if-invalid' mode to '--signed-commits=<mode>'
fast-export: check for unsupported signing modes earlier
The way the "git log -L<range>:<file>" feature is bolted onto the
log/diff machinery is being reworked a bit to make the feature
compatible with more diff options, like -S/G.
* mm/line-log-use-standard-diff-output:
doc: note that -L supports patch formatting and pickaxe options
t4211: add tests for -L with standard diff options
line-log: route -L output through the standard diff pipeline
line-log: fix crash when combined with pickaxe options
When a server advertises promisor remotes and the client accepts some
of them, those remotes carry the server's intent: 'fetch missing
objects preferably from here', and the client agrees with that for the
remotes it accepts.
However promisor_remote_get_direct() actually iterates over all
promisor remotes in list order, which is the order they appear in the
config files (except perhaps for the one appearing in the
`extensions.partialClone` config variable which is tried last).
This means an existing, but not accepted, promisor remote, could be
tried before the accepted ones, which does not reflect the intent of
the agreement between client and server.
If the client doesn't care about what the server suggests, it should
accept nothing and rely on its remotes as they are already configured.
To better reflect the agreement between client and server, let's make
promisor_remote_get_direct() try the accepted promisor remotes before
the non-accepted ones.
Concretely, let's extract a try_promisor_remotes() helper and call it
twice from promisor_remote_get_direct():
- first with an `accepted_only=true` argument to try only the accepted
remotes,
- then with `accepted_only=false` to fall back to any remaining remote.
Ensuring that accepted remotes are preferred will be even more
important if in the future a mechanism is developed to allow the
client to auto-configure remotes that the server advertises. This will
in particular avoid fetching from the server (which is already
configured as a promisor remote) before trying the auto-configured
remotes, as these new remotes would likely appear at the end of the
config file, and as the server might not appear in the
`extensions.partialClone` config variable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>