It does not quite work because it produces DOS line endings which the
shell does not like at all.
This lets t0200-gettext-basic.sh, t0204-gettext-reencode-sanity.sh,
t3406-rebase-message.sh, t3903-stash.sh, t7400-submodule-basic.sh,
t7401-submodule-summary.sh, t7406-submodule-update.sh and
t7407-submodule-foreach.sh pass in Git for Windows' SDK.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This solves two problems:
- we now have proper localisation even on Windows
- we sidestep the infamous "BUG: your vsnprintf is broken (returned -1)"
message when running "git init" (which otherwise prevents the entire
test suite from running) because libintl.h overrides vsnprintf() with
libintl_vsnprintf() [*1*]
The latter issue is rather crucial, as *no* test passes in Git for
Windows without this fix.
Footnote *1*: gettext_git=http://git.savannah.gnu.org/cgit/gettext.git
$gettext_git/tree/gettext-runtime/intl/libgnuintl.in.h#n380
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Beginning of the upstreaming process of Git for Windows effort.
* js/msys2:
mingw: uglify (a, 0) definitions to shut up warnings
mingw: squash another warning about a cast
mingw: avoid warnings when casting HANDLEs to int
mingw: avoid redefining S_* constants
compat/winansi: support compiling with MSys2
compat/mingw: support MSys2-based MinGW build
nedmalloc: allow compiling with MSys2's compiler
config.mak.uname: supporting 64-bit MSys2
config.mak.uname: support MSys2
When the result of a (a, 0) expression is not used, MSys2's GCC version
finds it necessary to complain with a warning:
right-hand operand of comma expression has no effect
Let's just pretend to use the 0 value and have a peaceful and quiet life
again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSys2's compiler is correct that casting a "void *" to a "DWORD" loses
precision, but in the case of pthread_exit() we know that the value
fits into a DWORD.
Just like casting handles to DWORDs, let's work around this issue by
casting to "intrptr_t" first, and immediately cast to the final type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HANDLE is defined internally as a void *, but in many cases it is
actually guaranteed to be a 32-bit integer. In these cases, GCC should
not warn about a cast of a pointer to an integer of a different type
because we know exactly what we are doing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When compiling with MSys2's compiler, these constants are already defined.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The excellent MSys2 project brings a substantially updated MinGW
environment including newer GCC versions and new headers. To support
compiling Git, let's special-case the new MinGW (tell-tale: the
_MINGW64_VERSION_MAJOR constant is defined).
Note: this commit only addresses compile failures, not compile warnings
(that task is left for a future patch).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With MSys2's GCC, `ReadWriteBarrier` is already defined, and FORCEINLINE
unfortunately gets defined incorrectly.
Let's work around both problems, using the MSys2-specific
__MINGW64_VERSION_MAJOR constant to guard the FORCEINLINE definition so
as not to affect other platforms.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This just makes things compile, the test suite needs extra tender loving
care in addition to this change. We will address these issues in later
commits.
While at it, also allow building MSys2 Git (i.e. a Git that uses MSys2's
POSIX emulation layer).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a long time, Git for Windows lagged behind Git's 2.x releases because
the Git for Windows developers wanted to let that big jump coincide with
a well-needed jump away from MSys to MSys2.
To understand why this is such a big issue, it needs to be noted that
many parts of Git are not written in portable C, but instead Git relies
on a POSIX shell and Perl to be available.
To support the scripts, Git for Windows has to ship a minimal POSIX
emulation layer with Bash and Perl thrown in, and when the Git for
Windows effort started in August 2007, this developer settled on using
MSys, a stripped down version of Cygwin. Consequently, the original name
of the project was "msysGit" (which, sadly, caused a *lot* of confusion
because few Windows users know about MSys, and even less care).
To compile the C code of Git for Windows, MSys was used, too: it sports
two versions of the GNU C Compiler: one that links implicitly to the
POSIX emulation layer, and another one that targets the plain Win32 API
(with a few convenience functions thrown in). Git for Windows'
executables are built using the latter, and therefore they are really
just Win32 programs. To discern executables requiring the POSIX
emulation layer from the ones that do not, the latter are called MinGW
(Minimal GNU for Windows) when the former are called MSys executables.
This reliance on MSys incurred challenges, too, though: some of our
changes to the MSys runtime -- necessary to support Git for Windows
better -- were not accepted upstream, so we had to maintain our own
fork. Also, the MSys runtime was not developed further to support e.g.
UTF-8 or 64-bit, and apart from lacking a package management system
until much later (when mingw-get was introduced), many packages provided
by the MSys/MinGW project lag behind the respective source code
versions, in particular Bash and OpenSSL. For a while, the Git for
Windows project tried to remedy the situation by trying to build newer
versions of those packages, but the situation quickly became untenable,
especially with problems like the Heartbleed bug requiring swift action
that has nothing to do with developing Git for Windows further.
Happily, in the meantime the MSys2 project (https://msys2.github.io/)
emerged, and was chosen to be the base of the Git for Windows 2.x. Just
like MSys, MSys2 is a stripped down version of Cygwin, but it is
actively kept up-to-date with Cygwin's source code. Thereby, it already
supports Unicode internally, and it also offers the 64-bit support that
we yearned for since the beginning of the Git for Windows project.
MSys2 also ported the Pacman package management system from Arch Linux
and uses it heavily. This brings the same convenience to which Linux
users are used to from `yum` or `apt-get`, and to which MacOSX users are
used to from Homebrew or MacPorts, or BSD users from the Ports system,
to MSys2: a simple `pacman -Syu` will update all installed packages to
the newest versions currently available.
MSys2 is also *very* active, typically providing package updates
multiple times per week.
It still required a two-month effort to bring everything to a state
where Git's test suite passes, many more months until the first official
Git for Windows 2.x was released, and a couple of patches still await
their submission to the respective upstream projects. Yet without MSys2,
the modernization of Git for Windows would simply not have happened.
This commit lays the ground work to supporting MSys2-based Git builds.
Assisted-by: Waldek Maleska <weakcamel@users.github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no more callers that use this mode, and none
likely to be added (as our xdl_merge() eliminates the common
use of it for generating 3-way merge bases).
This is effectively a revert of a9ed376 (xdiff: generate
"anti-diffs" aka what is common to two files, 2006-06-28),
though of course trying to revert that ancient commit
directly produces many textual conflicts.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merge_blobs sees an add/add conflict, it tries to
create a virtual base object for the 3-way merge that
consists of the common lines of each file. It inherited this
strategy from merge-one-file in 0c79938 (Improved three-way
blob merging code, 2006-06-28), and the point is to minimize
the size of the conflict hunks. That commit talks about "if
libxdiff were to ever grow a compatible three-way merge, it
could probably be directly plugged in".
That has long since happened. So as with merge-one-file in
the previous commit, this extra step is no longer necessary.
Our 3-way merge code is smart enough to do the minimizing
itself if we simply feed it an empty base, which is what the
more modern merge-recursive strategy already does.
Not only does this let us drop some code, but it removes an
overflow bug in generate_common_file(). We allocate a buffer
as large as the smallest of the two blobs, under the
assumption that there cannot be more common content than
what is in the smaller blob. However, xdiff may feed us
more: if neither file ends in a newline, it feeds us the
"\nNo newline at end of file" marker as common content, and
we write it into the output. If the differences between the
files are small than that string, we overflow the output
buffer. This patch solves it by simply dropping the buggy
code entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see an add/add conflict on a file, we generate the
conflicted content by doing a 3-way merge with a "virtual"
base consisting of the common lines of the two sides. This
strategy dates back to cb93c19 (merge-one-file: use common
as base, instead of emptiness., 2005-11-09).
Back then, the next step was to call rcs merge to generate
the 3-way conflicts. Using the virtual base produced much
better results, as rcs merge does not attempt to minimize
the hunks. As a result, you'd get a conflict with the
entirety of the files on either side.
Since then, though, we've switched to using git-merge-file,
which uses xdiff's "zealous" merge. This will find the
minimal hunks even with just the simple, empty base.
Let's switch to using that empty base. It's simpler, more
efficient, and reduces our dependencies (we no longer need a
working diff binary). It's also how the merge-recursive
strategy handles this same case.
We can almost get rid of git-sh-setup's create_virtual_base,
but we don't here, for two reasons:
1. The functions in git-sh-setup are part of our public
interface, so it's possible somebody is depending on
it. We'd at least need to deprecate it first.
2. It's also used by mergetool's p4merge driver. It's
unknown whether its 3-way merge is as capable as git's;
if not, then it is benefiting from the function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.
We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.
Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.
We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers of this left, as the last one was
dropped in the previous patch. And there are not likely to
be new ones, as the function has been around since 2010
without gaining any new callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a commit with sha1 "1234abcd" and subject "foo", this
function produces a struct with three strings:
1. "foo"
2. "1234abcd... foo"
3. "parent of 1234abcd... foo"
It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.
Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.
While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.
In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.
Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().
Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for "<sha1> <ref>" versus just "<ref>". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).
This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.
Note that there's one subtlety here, though. We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.
As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions transform an existing argv into one suitable
for exec-ing or spawning via git or a shell. We can use an
argv_array in each to avoid dealing with manual counting and
allocation.
This also makes the memory allocation more clear and fixes
some leaks. In prepare_shell_cmd, we would sometimes
allocate a new string with "$@" in it and sometimes not,
meaning the caller could not correctly free it. On the
non-Windows side, we are in a child process which will
exec() or exit() immediately, so the leak isn't a big deal.
On Windows, though, we use spawn() from the parent process,
and leak a string for each shell command we run. On top of
that, the Windows code did not free the allocated argv array
at all (but does for the prepare_git_cmd case!).
By switching both of these functions to write into an
argv_array, we can consistently free the result as
appropriate.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.
There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.
In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:
1. We no longer have to manually compute the correct final
array size (so it's one less thing we can screw up).
2. In many cases we had to make a separate pass to count,
then allocate, then fill in the array. Now we can do it
in one pass, making the code shorter and easier to
follow.
3. argv_array handles memory ownership for us, making it
more obvious when things should be free()d and and when
not.
Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usual pattern for an argv array is to initialize it,
push in some strings, and then clear it when done. Very
occasionally, though, we must do other exotic things with
the memory, like freeing the list but keeping the strings.
Let's provide a detach function so that callers can make use
of our API to build up the array, and then take ownership of
it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.
This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.
Ideally we could provide a more flexible version that could
handle multiple strings, like:
FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);
But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow. Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.
xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Paths that have been told the index about with "add -N" are not
quite yet in the index, but a few commands behaved as if they
already are in a harmful way.
* nd/ita-cleanup:
grep: make it clear i-t-a entries are ignored
add and use a convenience macro ce_intent_to_add()
blame: remove obsolete comment
The documentation for "git clean" has been corrected; it mentioned
that .git/modules/* are removed by giving two "-f", which has never
been the case.
* mm/clean-doc-fix:
Documentation/git-clean.txt: don't mention deletion of .git/modules/*
The vimdiff backend for "git mergetool" has been tweaked to arrange
and number buffers in the order that would match the expectation of
majority of people who read left to right, then top down and assign
buffers 1 2 3 4 "mentally" to local base remote merge windows based
on that order.
* dw/mergetool-vim-window-shuffle:
mergetool: reorder vim/gvim buffers in three-way diffs
The memory allocation scheme for the textconv interface is a
bit tricky, and not well documented. It was originally
designed as an internal part of diff.c (matching
fill_mmfile), but gradually was made public.
Refactoring it is difficult, but we can at least improve the
situation by documenting the intended flow and enforcing it
with an in-code assertion.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'ks/svn-pathnameencoding-4' of git://git.bogomips.org/git-svn:
git-svn: apply "svn.pathnameencoding" before URL encoding
git-svn: enable "svn.pathnameencoding" on dcommit
git-svn: hoist out utf8 prep from t9129 to lib-git-svn