Add a macro to mark code sections that only read from the file system,
along with a config option and documentation.
This facilitates implementation of relatively simple file system level
caches without the need to synchronize with the file system.
Enable read-only sections for 'git status' and preload_index.
Signed-off-by: Karsten Blees <blees@dcon.de>
This patch introduces support to set special NTFS attributes that are
interpreted by the Windows Subsystem for Linux as file mode bits, UID
and GID.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch addresses the following vulnerability:
- **CVE-2025-66413**:
When a user clones a repository from an attacker-controlled server,
Git may attempt NTLM authentication and disclose the user's NTLMv2 hash
to the remote server. Since NTLM hashing is weak, the captured hash can
potentially be brute-forced to recover the user's credentials. This is
addressed by disabling NTLM authentication by default.
(https://github.com/git-for-windows/git/security/advisories/GHSA-hv9c-4jm9-jh3x)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This introduces `git survey` to Git for Windows ahead of upstream for
the express purpose of getting the path-based analysis in the hands of
more folks.
The inspiration of this builtin is
[`git-sizer`](https://github.com/github/git-sizer), but since that
command relies on `git cat-file --batch` to get the contents of objects,
it has limits to how much information it can provide.
This is mostly a rewrite of the `git survey` builtin that was introduced
into the `microsoft/git` fork in microsoft/git#667. That version had a
lot more bells and whistles, including an analysis much closer to what
`git-sizer` provides.
The biggest difference in this version is that this one is focused on
using the path-walk API in order to visit batches of objects based on a
common path. This allows identifying, for instance, the path that is
contributing the most to the on-disk size across all versions at that
path.
For example, here are the top ten paths contributing to my local Git
repository (which includes `microsoft/git` and `gitster/git`):
```
TOP FILES BY DISK SIZE
============================================================================
Path | Count | Disk Size | Inflated Size
-----------------------------------------+-------+-----------+--------------
whats-cooking.txt | 1373 | 11637459 | 37226854
t/helper/test-gvfs-protocol | 2 | 6847105 | 17233072
git-rebase--helper | 1 | 6027849 | 15269664
compat/mingw.c | 6111 | 5194453 | 463466970
t/helper/test-parse-options | 1 | 3420385 | 8807968
t/helper/test-pkt-line | 1 | 3408661 | 8778960
t/helper/test-dump-untracked-cache | 1 | 3408645 | 8780816
t/helper/test-dump-fsmonitor | 1 | 3406639 | 8776656
po/vi.po | 104 | 1376337 | 51441603
po/de.po | 210 | 1360112 | 71198603
```
This kind of analysis has been helpful in identifying the reasons for
growth in a few internal monorepos. Those findings motivated the changes
in #5157 and #5171.
With this early version in Git for Windows, we can expand the reach of
the experimental tool in advance of it being contributed to the upstream
project.
Unfortunately, this will mean that in the next `microsoft/git` rebase,
Jeff Hostetler's version will need to be pulled out since there are
enough conflicts. These conflicts include how tables are stored and
generated, as the version in this PR is slightly more general to allow
for different kinds of data.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The Windows Subsystem for Linux (WSL) version 2 allows to use `chmod` on
NTFS volumes provided that they are mounted with metadata enabled (see
https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/
for details), for example:
$ chmod 0755 /mnt/d/test/a.sh
In order to facilitate better collaboration between the Windows
version of Git and the WSL version of Git, we can make the Windows
version of Git also support reading and writing NTFS file modes
in a manner compatible with WSL.
Since this slightly slows down operations where lots of files are
created (such as an initial checkout), this feature is only enabled when
`core.WSLCompat` is set to true. Note that you also have to set
`core.fileMode=true` in repositories that have been initialized without
enabling WSL compatibility.
There are several ways to enable metadata loading for NTFS volumes
in WSL, one of which is to modify `/etc/wsl.conf` by adding:
```
[automount]
enabled = true
options = "metadata,umask=027,fmask=117"
```
And reboot WSL.
It can also be enabled temporarily by this incantation:
$ sudo umount /mnt/c &&
sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111
It's important to note that this modification is compatible with, but
does not depend on WSL. The helper functions in this commit can operate
independently and functions normally on devices where WSL is not
installed or properly configured.
Signed-off-by: xungeng li <xungeng@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, symbolic links have a type: a "file symlink" must point at
a file, and a "directory symlink" must point at a directory. If the
type of symlink does not match its target, it doesn't work.
Git does not record the type of symlink in the index or in a tree. On
checkout it'll guess the type, which only works if the target exists
at the time the symlink is created. This may often not be the case,
for example when the link points at a directory inside a submodule.
By specifying `symlink=file` or `symlink=dir` the user can specify what
type of symlink Git should create, so Git doesn't have to rely on
unreliable heuristics.
Signed-off-by: Bert Belder <bertbelder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The 'git survey' builtin provides several detail tables, such as "top
files by on-disk size". The size of these tables defaults to 10,
currently.
Allow the user to specify this number via a new --top=<N> option or the
new survey.top config key.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
NTLM authentication is relatively weak. This is the case even with the
default setting of modern Windows versions, where NTLMv1 and LanManager
are disabled and only NTLMv2 is enabled: NTLMv2 hashes of even
reasonably complex 8-character passwords can be broken in a matter of
days, given enough compute resources.
Even worse: On Windows, NTLM authentication uses Security Support
Provider Interface ("SSPI"), which provides the credentials without
requiring the user to type them in.
Which means that an attacker could talk an unsuspecting user into
cloning from a server that is under the attacker's control and extracts
the user's NTLMv2 hash without their knowledge.
For that reason, let's disallow NTLM authentication by default.
NTLM authentication is quite simple to set up, though, and therefore
there are still some on-prem Azure DevOps setups out there whose users
and/or automation rely on this type of authentication. To give them an
escape hatch, introduce the `http.<url>.allowNTLMAuth` config setting
that can be set to `true` to opt back into using NTLM for a specific
remote repository.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the moment, nothing is obvious about the reason for the use of the
path-walk API, but this will become more prevelant in future iterations. For
now, use the path-walk API to sum up the counts of each kind of object.
For example, this is the reachable object summary output for my local repo:
REACHABLE OBJECT SUMMARY
========================
Object Type | Count
------------+-------
Tags | 1343
Commits | 179344
Trees | 314350
Blobs | 184030
Signed-off-by: Derrick Stolee <stolee@gmail.com>
When 'git survey' provides information to the user, this will be presented
in one of two formats: plaintext and JSON. The JSON implementation will be
delayed until the functionality is complete for the plaintext format.
The most important parts of the plaintext format are headers specifying the
different sections of the report and tables providing concreted data.
Create a custom table data structure that allows specifying a list of
strings for the row values. When printing the table, check each column for
the maximum width so we can create a table of the correct size from the
start.
The table structure is designed to be flexible to the different kinds of
output that will be implemented in future changes.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
By default we will scan all references in "refs/heads/", "refs/tags/"
and "refs/remotes/".
Add command line opts let the use ask for all refs or a subset of them
and to include a detached HEAD.
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Start work on a new 'git survey' command to scan the repository
for monorepo performance and scaling problems. The goal is to
measure the various known "dimensions of scale" and serve as a
foundation for adding additional measurements as we learn more
about Git monorepo scaling problems.
The initial goal is to complement the scanning and analysis performed
by the GO-based 'git-sizer' (https://github.com/github/git-sizer) tool.
It is hoped that by creating a builtin command, we may be able to take
advantage of internal Git data structures and code that is not
accessible from GO to gain further insight into potential scaling
problems.
Co-authored-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Atomic append on windows is only supported on local disk files, and it may
cause errors in other situations, e.g. network file system. If that is the
case, this config option should be used to turn atomic append off.
Co-Authored-By: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: 孙卓识 <sunzhuoshi@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds support for a new http.sslAutoClientCert config value.
In cURL 7.77 or later the schannel backend does not automatically send
client certificates from the Windows Certificate Store anymore.
This config value is only used if http.sslBackend is set to "schannel",
and can be used to opt in to the old behavior and force cURL to send
client certificates.
This fixes https://github.com/git-for-windows/git/issues/3292
Signed-off-by: Pascal Muller <pascalmuller@gmail.com>
The native Windows HTTPS backend is based on Secure Channel which lets
the caller decide how to handle revocation checking problems caused by
missing information in the certificate or offline CRL distribution
points.
Unfortunately, cURL chose to handle these problems differently than
OpenSSL by default: while OpenSSL happily ignores those problems
(essentially saying "¯\_(ツ)_/¯"), the Secure Channel backend will error
out instead.
As a remedy, the "no revoke" mode was introduced, which turns off
revocation checking altogether. This is a bit heavy-handed. We support
this via the `http.schannelCheckRevoke` setting.
In https://github.com/curl/curl/pull/4981, we contributed an opt-in
"best effort" strategy that emulates what OpenSSL seems to do.
In Git for Windows, we actually want this to be the default. This patch
makes it so, introducing it as a new value for the
`http.schannelCheckRevoke" setting, which now becmes a tristate: it
accepts the values "false", "true" or "best-effort" (defaulting to the
last one).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Since commit 0c499ea60f (send-pack: demultiplex a sideband stream with
status data, 2010-02-05) the send-pack builtin uses the side-band-64k
capability if advertised by the server.
Unfortunately this breaks pushing over the dump git protocol if used
over a network connection.
The detailed reasons for this breakage are (by courtesy of Jeff Preshing,
quoted from https://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ):
MinGW wraps Windows sockets in CRT file descriptors in order to
mimic the functionality of POSIX sockets. This causes msvcrt.dll
to treat sockets as Installable File System (IFS) handles,
calling ReadFile, WriteFile, DuplicateHandle and CloseHandle on
them. This approach works well in simple cases on recent
versions of Windows, but does not support all usage patterns. In
particular, using this approach, any attempt to read & write
concurrently on the same socket (from one or more processes)
will deadlock in a scenario where the read waits for a response
from the server which is only invoked after the write. This is
what send_pack currently attempts to do in the use_sideband
codepath.
The new config option `sendpack.sideband` allows to override the
side-band-64k capability of the server, and thus makes the dumb git
protocol work.
Other transportation methods like ssh and http/https still benefit from
the sideband channel, therefore the default value of `sendpack.sideband`
is still true.
Signed-off-by: Thomas Braun <thomas.braun@byte-physics.de>
Signed-off-by: Oliver Schneider <oliver@assarbad.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The documentation for "--word-diff" has been extended with a bit of
implementation detail of where these different words come from.
* mm/doc-word-diff:
doc: clarify that --word-diff operates on line-level hunks
The `git log -L` implementation has been refactored to use the
standard diff output pipeline, enabling pickaxe and diff-filter to
work as expected. Additionally, metadata-only diff formats like
--raw and --name-only are now supported with -L.
* mm/line-log-cleanup:
line-log: allow non-patch diff formats with -L
line-log: integrate -L output with the standard log-tree pipeline
revision: move -L setup before output_format-to-diff derivation
The documentation for `push.default = simple` has been clarified to
better explain its behavior, making it clear that it pushes the
current branch to a same-named branch on the remote, and detailing
the upstream requirements for centralized workflows.
* ib/doc-push-default-simple:
doc: clarify push.default=simple behavior
"git push" learned to take a "remote group" name to push to, which
causes pushes to multiple places, just like "git fetch" would do.
* ua/push-remote-group:
push: support pushing to a remote group
remote: move remote group resolution to remote.c
remote: fix sign-compare warnings in push_cas_option
A batch of documentation pages has been updated to use the modern
synopsis style.
* ja/doc-synopsis-style-again:
doc: convert git-imap-send synopsis and options to new style
doc: convert git-apply synopsis and options to new style
doc: convert git-am synopsis and options to new style
doc: convert git-grep synopsis and options to new style
doc: git bisect: clarify the usage of the synopsis vs actual command
doc: convert git-bisect to synopsis style
The "git pack-objects --path-walk" traversal has been integrated
with several object filters, including blobless and sparse filters.
* ds/path-walk-filters:
path-walk: support `combine` filter
path-walk: support `object:type` filter
path-walk: support `tree:0` filter
t6601: tag otherwise-unreachable trees
pack-objects: support sparse:oid filter with path-walk
path-walk: add pl_sparse_trees to control tree pruning
path-walk: support blob size limit filter
backfill: die on incompatible filter options
path-walk: support blobless filter
path-walk: always emit directly-requested objects
t/perf: add pack-objects filter and path-walk benchmark
pack-objects: pass --objects with --path-walk
t5620: make test work with path-walk var
"Friday noon" asked in the morning on Sunday was parsed to be one
day before the specified time, which has been corrected.
* ta/approxidate-noon-fix:
approxidate: use deferred mday adjustments for "specials"
approxidate: make "specials" respect fixed day-of-month
t0006: add support for approxidate test date adjustment
approxidate: make "today" wrap to midnight
"git cat-file --batch" learns an in-line command "mailmap"
that lets the user toggle use of mailmap.
* sa/cat-file-batch-mailmap-switch:
cat-file: add mailmap subcommand to --batch-command
The fsmonitor daemon has been implemented for Linux.
* pt/fsmonitor-linux:
fsmonitor: convert shown khash to strset in do_handle_client
fsmonitor: add tests for Linux
fsmonitor: add timeout to daemon stop command
fsmonitor: close inherited file descriptors and detach in daemon
run-command: add close_fd_above_stderr option
fsmonitor: implement filesystem change listener for Linux
fsmonitor: rename fsm-settings-darwin.c to fsm-settings-unix.c
fsmonitor: rename fsm-ipc-darwin.c to fsm-ipc-unix.c
fsmonitor: use pthread_cond_timedwait for cookie wait
compat/win32: add pthread_cond_timedwait
fsmonitor: fix hashmap memory leak in fsmonitor_run_daemon
fsmonitor: fix khash memory leak in do_handle_client
t9210, t9211: disable GIT_TEST_SPLIT_INDEX for scalar clone tests
The graph output from commands like "git log --graph" can now be
limited to a specified number of lanes, preventing overly wide output
in repositories with many branches.
* ps/graph-lane-limit:
graph: add truncation mark to capped lanes
graph: add --graph-lane-limit option
graph: limit the graph width to a hard-coded max
Now that -L flows through log_tree_diff_flush() and diff_flush(),
metadata-only diff formats work because they only read filepair
fields (status, mode, path, oid) already set on the pre-computed
pairs.
Expand the allowlist in setup_revisions() to also accept --raw,
--name-only, --name-status, and --summary. Diff stat formats
(--stat, --numstat, --shortstat, --dirstat) remain blocked because
they call compute_diffstat() on full blob content and would show
whole-file statistics rather than range-scoped ones.
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --word-diff documentation describes the output modes and
word-regex mechanics but does not explain that word-diff operates
within the hunks produced by the line-level diff rather than
performing an independent word-stream comparison. This can
surprise users when the line-level alignment causes word-level
changes to appear even though the words in both files are
identical.
Add an implementation note explaining the two-stage relationship
and that the output may change if Git acquires a different
implementation in the future.
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repacking code has been refactored and compaction of MIDX layers
have been implemented, and incremental strategy that does not require
all-into-one repacking has been introduced.
* tb/incremental-midx-part-3.3:
repack: allow `--write-midx=incremental` without `--geometric`
repack: introduce `--write-midx=incremental`
repack: implement incremental MIDX repacking
packfile: ensure `close_pack_revindex()` frees in-memory revindex
builtin/repack.c: convert `--write-midx` to an `OPT_CALLBACK`
repack-geometry: prepare for incremental MIDX repacking
repack-midx: extract `repack_fill_midx_stdin_packs()`
repack-midx: factor out `repack_prepare_midx_command()`
midx: expose `midx_layer_contains_pack()`
repack: track the ODB source via existing_packs
midx: support custom `--base` for incremental MIDX writes
midx: introduce `--no-write-chain-file` for incremental MIDX writes
midx: use `strvec` for `keep_hashes`
midx: build `keep_hashes` array in order
midx: use `strset` for retained MIDX files
midx-write: handle noop writes when converting incremental chains
The negotiation tip options in "git fetch" have been reworked to
allow requiring certain refs to be sent as "have" lines, and to
restrict negotiation to a specific set of refs.
* ds/fetch-negotiation-options:
send-pack: pass negotiation config in push
remote: add remote.*.negotiationInclude config
fetch: add --negotiation-include option for negotiation
negotiator: add have_sent() interface
remote: add remote.*.negotiationRestrict config
transport: rename negotiation_tips
fetch: add --negotiation-restrict option
t5516: fix test order flakiness
Doc updates.
* pb/doc-diff-format-updates:
diff-format.adoc: mode and hash are 0* for unmerged paths from index only
diff-format.adoc: 'git diff-files' prints two lines for unmerged files
diff-format.adoc: remove mention of diff-tree specific output
The documentation for the 'simple' push mode currently singles out
the centralized workflow, which can cause confusion about its
behavior in other scenarios, such as triangular workflows.
Clarify that 'simple' always pushes the current branch to a branch
of the same name, but only enforces the strict upstream tracking
requirement when pushing back to the same remote being pulled from.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ivan Baluta <ivanbaluta.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert git-imap-send from [verse]/single-quote style to the modern
synopsis-block style:
- Replace [verse] with [synopsis] in SYNOPSIS block
- Backtick-quote all OPTIONS terms
- Backtick-quote all config keys in config/imap.adoc
- Backtick-quote bare config key references in prose
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert git-apply from [verse]/single-quote style to the modern
synopsis-block style:
- Replace [verse] with [synopsis] in SYNOPSIS block
- Backtick-quote all OPTIONS terms and config keys in config/apply.adoc
- Convert single-quoted inline commands ('git apply', 'diff', etc.)
- Wrap standalone placeholders in underscores (<n>, <root>, <action>)
- Backtick-quote `*.rej` and GNU `patch` tool references
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert git-am from [verse]/single-quote style to the modern
synopsis-block style:
- Replace [verse] with [synopsis] in SYNOPSIS block
- Backtick-quote all OPTIONS terms
- Convert inline man page refs
- Convert inline command refs
- Convert prose placeholders:
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>