Commit graph

1042 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason 7d9c80f626 tr2: remove NEEDSWORK comment for "non-procfs" implementations
I'm fairly sure that there is no way on Linux to inspect the process
tree without using procfs, any tool such as ps(1), top(1) etc. that
shows this sort of information ultimately looks the information up in
procfs.

So let's remove this comment added in 2f732bf15e (tr2: log parent
process name, 2021-07-21), it's setting us up for an impossible task.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Carlo Marcelo Arenas Belón 27e0c3c6cf win32: allow building with pedantic mode enabled
In preparation to building with pedantic mode enabled, change a couple
of places where the current mingw gcc compiler provided with the SDK
reports issues.

A full fix for the incompatible use of (void *) to store function
pointers has been punted, with the minimal change to instead use a
generic function pointer (FARPROC), and therefore the (hopefully)
temporary need to disable incompatible pointer warnings.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03 11:40:30 -07:00
Junio C Hamano 6f64eeab60 Merge branch 'es/trace2-log-parent-process-name'
trace2 logs learned to show parent process name to see in what
context Git was invoked.

* es/trace2-log-parent-process-name:
  tr2: log parent process name
  tr2: make process info collection platform-generic
2021-08-24 15:32:40 -07:00
René Scharfe 95b4ff3931 compat: let git_mmap use malloc(3) directly
xmalloc() dies on error, allows zero-sized allocations and enforces
GIT_ALLOC_LIMIT for testing.  Our mmap replacement doesn't need any of
that.  Let's cut out the wrapper, reject zero-sized requests as required
by POSIX and use malloc(3) directly.  Allocation errors were needlessly
handled by git_mmap() before; this code becomes reachable now.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:43:45 -07:00
Thomas Bétous 3e7d4888e5 mingw: align symlinks-related rmdir() behavior with Linux
When performing a rebase, rmdir() is called on the folder .git/logs. On
Unix rmdir() exits without deleting anything in case .git/logs is a
symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
and RemoveDirectoryW) do not behave the same and remove the folder if it
is symlinked even if it is not empty.

This creates issues when folders in .git/ are symlinks which is
especially the case when git-repo[1] is used: It replaces `.git/logs/`
with a symlink.

One such issue is that the _target_ of that symlink is removed e.g.
during a `git rebase`, where `delete_reflog("REBASE_HEAD")` will not
only try to remove `.git/logs/REBASE_HEAD` but then recursively try to
remove the parent directories until an error occurs, a technique that
obviously relies on `rmdir()` refusing to remove a symlink.

This was reported in https://github.com/git-for-windows/git/issues/2967.

This commit updates mingw_rmdir() so that its behavior is the same as
Linux rmdir() in case of symbolic links.

To verify that Git does not regress on the reported issue, this patch
adds a regression test for the `git rebase` symptom, even if the same
`rmdir()` behavior is quite likely to cause potential problems in other
Git commands as well.

[1]: git-repo is a python tool built on top of Git which helps manage
many Git repositories. It stores all the .git/ folders in a central
place by taking advantage of symbolic links.
More information: https://gerrit.googlesource.com/git-repo/

Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02 15:10:58 -07:00
Emily Shaffer 2f732bf15e tr2: log parent process name
It can be useful to tell who invoked Git - was it invoked manually by a
user via CLI or script? By an IDE?  In some cases - like 'repo' tool -
we can influence the source code and set the GIT_TRACE2_PARENT_SID
environment variable from the caller process. In 'repo''s case, that
parent SID is manipulated to include the string "repo", which means we
can positively identify when Git was invoked by 'repo' tool. However,
identifying parents that way requires both that we know which tools
invoke Git and that we have the ability to modify the source code of
those tools. It cannot scale to keep up with the various IDEs and
wrappers which use Git, most of which we don't know about. Learning
which tools and wrappers invoke Git, and how, would give us insight to
decide where to improve Git's usability and performance.

Unfortunately, there's no cross-platform reliable way to gather the name
of the parent process. If procfs is present, we can use that; otherwise
we will need to discover the name another way. However, the process ID
should be sufficient to look up the process name on most platforms, so
that code may be shareable.

Git for Windows gathers similar information and logs it as a "data_json"
event. However, since "data_json" has a variable format, it is difficult
to parse effectively in some languages; instead, let's pursue a
dedicated "cmd_ancestry" event to record information about the ancestry
of the current process and a consistent, parseable way.

Git for Windows also gathers information about more than one generation
of parent. In Linux further ancestry info can be gathered with procfs,
but it's unwieldy to do so. In the interest of later moving Git for
Windows ancestry logging to the 'cmd_ancestry' event, and in the
interest of later adding more ancestry to the Linux implementation - or
of adding this functionality to other platforms which have an easier
time walking the process tree - let's make 'cmd_ancestry' accept an
array of parentage.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:35:20 -07:00
Emily Shaffer b7e6a41622 tr2: make process info collection platform-generic
To pave the way for non-Windows platforms to define
trace2_collect_process_info(), reorganize the stub-or-definition schema
to something which doesn't directly reference Windows.

Platforms which want to collect parent process information in the
future should:

 1. Add an implementation to compat/ (e.g. compat/somearch/procinfo.c)
 2. Add that object to COMPAT_OBJS to config.mak.uname
    (e.g. COMPAT_OBJS += compat/somearch/procinfo.o)
 3. Define HAVE_PLATFORM_PROCINFO in config.mak.uname

In the Windows case, this definition lives in
compat/win32/trace2_win32_process_info.c, which is already conditionally
added to COMPAT_OBJS; so let's add HAVE_PLATFORM_PROCINFO to hint to the
build that compat/stub/procinfo.c should not be used.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 13:35:20 -07:00
Junio C Hamano 6aae0e2ad2 Merge branch 'jh/simple-ipc-sans-pthread'
The "simple-ipc" did not compile without pthreads support, but the
build procedure was not properly account for it.

* jh/simple-ipc-sans-pthread:
  simple-ipc: correct ifdefs when NO_PTHREADS is defined
2021-05-22 18:29:01 +09:00
Jeff Hostetler 6aac70a870 simple-ipc: correct ifdefs when NO_PTHREADS is defined
Simple IPC always requires threads (in addition to various
platform-specific IPC support).  Fix the ifdefs in the Makefile
to define SUPPORTS_SIMPLE_IPC when appropriate.

Previously, the Unix version of the code would only verify that
Unix domain sockets were available.

This problem was reported here:
https://lore.kernel.org/git/YKN5lXs4AoK%2FJFTO@coredump.intra.peff.net/T/#m08be8f1942ea8a2c36cfee0e51cdf06489fdeafc

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 07:55:00 +09:00
Junio C Hamano 4090b6973b Merge branch 'js/access-nul-emulation-on-windows'
Portability fix.

* js/access-nul-emulation-on-windows:
  msvc: avoid calling `access("NUL", flags)`
2021-04-20 17:23:37 -07:00
Johannes Schindelin 9160068ac6 msvc: avoid calling access("NUL", flags)
Apparently this is not supported with Microsoft's Universal C Runtime.
So let's not actually do that.

Instead, just return success because we _know_ that we expect the `NUL`
device to be present.

Side note: it is possible to turn off the "Null device driver" and
thereby disable `NUL`. Too many things are broken if this driver is
disabled, therefore it is not worth bothering to try to detect its
presence when `access()` is called.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 12:05:32 -07:00
Junio C Hamano 4c6ac2da2c Merge branch 'tb/precompose-prefix-simplify'
Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.

* tb/precompose-prefix-simplify:
  macOS: precompose startup_info->prefix
  precompose_utf8: make precompose_string_if_needed() public
2021-04-13 15:28:51 -07:00
Torsten Bögershausen 5020774aef precompose_utf8: make precompose_string_if_needed() public
commit 5c327502 (MacOS: precompose_argv_prefix(), 2021-02-03) uses
the function precompose_string_if_needed() internally.  It is only
used from precompose_argv_prefix() and therefore static in
compat/precompose_utf8.c

Expose this function, it will be used in the next commit.

While there, allow passing a NULL pointer, which will return NULL.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 17:30:04 -07:00
Junio C Hamano 861794b60d Merge branch 'jh/simple-ipc'
A simple IPC interface gets introduced to build services like
fsmonitor on top.

* jh/simple-ipc:
  t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
  simple-ipc: add Unix domain socket implementation
  unix-stream-server: create unix domain socket under lock
  unix-socket: disallow chdir() when creating unix domain sockets
  unix-socket: add backlog size option to unix_stream_listen()
  unix-socket: eliminate static unix_stream_socket() helper function
  simple-ipc: add win32 implementation
  simple-ipc: design documentation for new IPC mechanism
  pkt-line: add options argument to read_packetized_to_strbuf()
  pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
  pkt-line: do not issue flush packets in write_packetized_*()
  pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-04-02 14:43:14 -07:00
Jeff Hostetler 7cd5dbcaba simple-ipc: add Unix domain socket implementation
Create Unix domain socket based implementation of "simple-ipc".

A set of `ipc_client` routines implement a client library to connect
to an `ipc_server` over a Unix domain socket, send a simple request,
and receive a single response.  Clients use blocking IO on the socket.

A set of `ipc_server` routines implement a thread pool to listen for
and concurrently service client connections.

The server creates a new Unix domain socket at a known location.  If a
socket already exists with that name, the server tries to determine if
another server is already listening on the socket or if the socket is
dead.  If socket is busy, the server exits with an error rather than
stealing the socket.  If the socket is dead, the server creates a new
one and starts up.

If while running, the server detects that its socket has been stolen
by another server, it automatically exits.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 11:52:54 -07:00
Jeff Hostetler 59c7b88198 simple-ipc: add win32 implementation
Create Windows implementation of "simple-ipc" using named pipes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:50 -07:00
Junio C Hamano 56a57652ef Sync with Git 2.30.2 for CVE-2021-21300
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 16:09:07 -08:00
Junio C Hamano 921846fa22 Merge branch 'jk/open-returns-eintr'
Work around platforms whose open() is reported to return EINTR (it
shouldn't, as we do our signals with SA_RESTART).

* jk/open-returns-eintr:
  config.mak.uname: enable OPEN_RETURNS_EINTR for macOS Big Sur
  Makefile: add OPEN_RETURNS_EINTR knob
2021-03-04 15:34:45 -08:00
Jeff King 2b08101204 Makefile: add OPEN_RETURNS_EINTR knob
On some platforms, open() reportedly returns EINTR when opening regular
files and we receive a signal (usually SIGALRM from our progress meter).
This shouldn't happen, as open() should be a restartable syscall, and we
specify SA_RESTART when setting up the alarm handler. So it may actually
be a kernel or libc bug for this to happen. But it has been reported on
at least one version of Linux (on a network filesystem):

  https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/

as well as on macOS starting with Big Sur even on a regular filesystem.

We can work around it by retrying open() calls that get EINTR, just as
we do for read(), etc. Since we don't ever _want_ to interrupt an open()
call, we can get away with just redefining open, rather than insisting
all callsites use xopen().

We actually do have an xopen() wrapper already (and it even does this
retry, though there's no indication of it being an observed problem back
then; it seems simply to have been lifted from xread(), etc). But it is
used hardly anywhere, and isn't suitable for general use because it will
die() on error. In theory we could combine the two, but it's awkward to
do so because of the variable-args interface of open().

This patch adds a Makefile knob for enabling the workaround. It's not
enabled by default for any platforms in config.mak.uname yet, as we
don't have enough data to decide how common this is (I have not been
able to reproduce on either Linux or Big Sur myself). It may be worth
enabling preemptively anyway, since the cost is pretty low (if we don't
see an EINTR, it's just an extra conditional).

However, note that we must not enable this on Windows. It doesn't do
anything there, and the macro overrides the existing mingw_open()
redirection. I've added a preemptive #undef here in the mingw header
(which is processed first) to just quietly disable it (we could also
make it an #error, but there is little point in being so aggressive).

Reported-by: Aleksey Kliger <alklig@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:15:51 -08:00
Johannes Schindelin e4e68081bb Sync with 2.29.3
* maint-2.29:
  Git 2.29.3
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:51:12 +01:00
Johannes Schindelin d7bdabe52f Sync with 2.28.1
* maint-2.28:
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:14 +01:00
Johannes Schindelin 3f01e56686 Sync with 2.27.1
* maint-2.27:
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:09 +01:00
Johannes Schindelin 2d1142a3e8 Sync with 2.26.3
* maint-2.26:
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:04 +01:00
Johannes Schindelin 8f80393c14 Sync with 2.25.5
* maint-2.25:
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:59 +01:00
Johannes Schindelin 97d1dcb1ef Sync with 2.24.4
* maint-2.24:
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:55 +01:00
Johannes Schindelin 92ac04b8ee Sync with 2.23.4
* maint-2.23:
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:50 +01:00
Johannes Schindelin 4bd06fd490 Sync with 2.22.5
* maint-2.22:
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:45 +01:00
Johannes Schindelin bcf08f33d8 Sync with 2.21.4
* maint-2.21:
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:41 +01:00
Johannes Schindelin b1726b1a38 Sync with 2.20.5
* maint-2.20:
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:35 +01:00
Johannes Schindelin 804963848e Sync with 2.19.6
* maint-2.19:
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:17 +01:00
Johannes Schindelin fb049fd85b Sync with 2.18.5
* maint-2.18:
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:47 +01:00
Johannes Schindelin 9b77cec89b Sync with 2.17.6
* maint-2.17:
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:42 +01:00
Matheus Tavares 684dd4c2b4 checkout: fix bug that makes checkout follow symlinks in leading path
Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache on an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).

But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.

When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and thrusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.

Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.

Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.

Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.

This addresses CVE-2021-21300.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2021-02-12 15:47:02 +01:00
Torsten Bögershausen 5c327502db MacOS: precompose_argv_prefix()
The following sequence leads to a "BUG" assertion running under MacOS:

  DIR=git-test-restore-p
  Adiarnfd=$(printf 'A\314\210')
  DIRNAME=xx${Adiarnfd}yy
  mkdir $DIR &&
  cd $DIR &&
  git init &&
  mkdir $DIRNAME &&
  cd $DIRNAME &&
  echo "Initial" >file &&
  git add file &&
  echo "One more line" >>file &&
  echo y | git restore -p .

 Initialized empty Git repository in /tmp/git-test-restore-p/.git/
 BUG: pathspec.c:495: error initializing pathspec_item
 Cannot close git diff-index --cached --numstat
 [snip]

The command `git restore` is run from a directory inside a Git repo.
Git needs to split the $CWD into 2 parts:
The path to the repo and "the rest", if any.
"The rest" becomes a "prefix" later used inside the pathspec code.

As an example, "/path/to/repo/dir-inside-repå" would determine
"/path/to/repo" as the root of the repo, the place where the
configuration file .git/config is found.

The rest becomes the prefix ("dir-inside-repå"), from where the
pathspec machinery expands the ".", more about this later.
If there is a decomposed form, (making the decomposing visible like this),
"dir-inside-rep°a" doesn't match "dir-inside-repå".

Git commands need to:

 (a) read the configuration variable "core.precomposeunicode"
 (b) precocompose argv[]
 (c) precompose the prefix, if there was any

The first commit,
76759c7dff "git on Mac OS and precomposed unicode"
addressed (a) and (b).

The call to precompose_argv() was added into parse-options.c,
because that seemed to be a good place when the patch was written.

Commands that don't use parse-options need to do (a) and (b) themselfs.

The commands `diff-files`, `diff-index`, `diff-tree` and `diff`
learned (a) and (b) in
commit 90a78b83e0 "diff: run arguments through precompose_argv"

Branch names (or refs in general) using decomposed code points
resulting in decomposed file names had been fixed in
commit 8e712ef6fc "Honor core.precomposeUnicode in more places"

The bug report from above shows 2 things:
- more commands need to handle precomposed unicode
- (c) should be implemented for all commands using pathspecs

Solution:
precompose_argv() now handles the prefix (if needed), and is renamed into
precompose_argv_prefix().

Inside this function the config variable core.precomposeunicode is read
into the global variable precomposed_unicode, as before.
This reading is skipped if precomposed_unicode had been read before.

The original patch for preocomposed unicode, 76759c7dff, placed
precompose_argv() into parse-options.c

Now add it into git.c::run_builtin() as well.  Existing precompose
calls in diff-files.c and others may become redundant, and if we
audit the callflows that reach these places to make sure that they
can never be reached without going through the new call added to
run_builtin(), we might be able to remove these existing ones.

But in this commit, we do not bother to do so and leave these
precompose callsites as they are.  Because precompose() is
idempotent and can be called on an already precomposed string
safely, this is safer than removing existing calls without fully
vetting the callflows.

There is certainly room for cleanups - this change intends to be a bug fix.
Cleanups needs more tests in e.g. t/t3910-mac-os-precompose.sh, and should
be done in future commits.

[1] git-bugreport-2021-01-06-1209.txt (git can't deal with special characters)
[2] https://lore.kernel.org/git/A102844A-9501-4A86-854D-E3B387D378AA@icloud.com/

Reported-by: Daniel Troger <random_n0body@icloud.com>
Helped-By: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:09:37 -08:00
Junio C Hamano ccbde2c4f4 Merge branch 'da/vs-build-iconv-fix'
Build update.

* da/vs-build-iconv-fix:
  ci(vs-build): stop passing the iconv library location explicitly
2020-12-14 10:21:38 -08:00
Dennis Ameling e66590348a ci(vs-build): stop passing the iconv library location explicitly
Something changed in `vcpkg` (which we use in our Visual C++ build to
provide the dependencies such as libcurl) and our `vs-build` job started
failing in CI. The reason is that we had a work-around in place to help
CMake find iconv, and this work-around is neither needed nor does it
work anymore.

For the full discussion with the vcpkg project, see this comment:
https://github.com/microsoft/vcpkg/issues/14780#issuecomment-735368280

Signed-off-by: Dennis Ameling <dennis@dennisameling.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-04 12:03:15 -08:00
Daniel Gurney 0c038fc65a compat/bswap.h: don't assume MSVC is little-endian
In 1af265f0 (compat/bswap.h: simplify MSVC endianness
detection, 2020-11-08) we attempted to simplify code by assuming MSVC
builds will be for little-endian machines, since only unusably old
versions of MSVC supported big-endian MIPS and m68k architectures.

However, it's possible that MSVC could be ported to build for a
big-endian architecture again, so the simplification wasn't as
future-proof as hoped.

So let's go back to the old way of detecting MSVC, and then checking
architecture from a list of little-endian architecture macros.

Note that MSVC does not treat ARM64 as bi-endian, so we can safely treat
it as little-endian.

Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Daniel Gurney <dgurney99@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 11:24:47 -08:00
Daniel Gurney 1af265f0a0 compat/bswap.h: simplify MSVC endianness detection
Modern MSVC or Windows versions don't support big-endian, so it's
unnecessary to consider architectures when using it.

This also makes ARM64 MSVC builds succeed.

Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Daniel Gurney <dgurney99@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-09 13:01:10 -08:00
Denton Liu fcedb379fd compat/mingw.h: drop extern from function declaration
In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), `extern` on function declarations were declared to
be redundant and thus removed from the codebase. An `extern` was
accidentally reintroduced in 08809c09aa (mingw: add a helper function to
attach GDB to the current process, 2020-02-13).

Remove this spurious `extern`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 09:55:20 -07:00
Junio C Hamano 86cca370e1 Merge branch 'jk/drop-unaligned-loads'
Compilation fix around type punning.

* jk/drop-unaligned-loads:
  Revert "fast-export: use local array to store anonymized oid"
  bswap.h: drop unaligned loads
2020-10-04 12:49:06 -07:00
Jeff King c578e29ba0 bswap.h: drop unaligned loads
Our put_be32() routine and its variants (get_be32(), put_be64(), etc)
has two implementations: on some platforms we cast memory in place and
use nothl()/htonl(), which can cause unaligned memory access. And on
others, we pick out the individual bytes using bitshifts.

This introduces extra complexity, and sometimes causes compilers to
generate warnings about type-punning. And it's not clear there's any
performance advantage.

This split goes back to 660231aa97 (block-sha1: support for
architectures with memory alignment restrictions, 2009-08-12). The
unaligned versions were part of the original block-sha1 code in
d7c208a92e (Add new optimized C 'block-sha1' routines, 2009-08-05),
which says it is:

   Based on the mozilla SHA1 routine, but doing the input data accesses a
   word at a time and with 'htonl()' instead of loading bytes and shifting.

Back then, Linus provided timings versus the mozilla code which showed a
27% improvement:

  https://lore.kernel.org/git/alpine.LFD.2.01.0908051545000.3390@localhost.localdomain/

However, the unaligned loads were either not the useful part of that
speedup, or perhaps compilers and processors have changed since then.
Here are times for computing the sha1 of 4GB of random data, with and
without -DNO_UNALIGNED_LOADS (and BLK_SHA1=1, of course). This is with
gcc 10, -O2, and the processor is a Core i9-9880H.

  [stock]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):      6.638 s ±  0.081 s    [User: 6.269 s, System: 0.368 s]
    Range (min … max):    6.550 s …  6.841 s    10 runs

  [-DNO_UNALIGNED_LOADS]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):      6.418 s ±  0.015 s    [User: 6.058 s, System: 0.360 s]
    Range (min … max):    6.394 s …  6.447 s    10 runs

And here's the same test run on an AMD A8-7600, using gcc 8.

  [stock]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):     11.721 s ±  0.113 s    [User: 10.761 s, System: 0.951 s]
    Range (min … max):   11.509 s … 11.861 s    10 runs

  [-DNO_UNALIGNED_LOADS]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):     11.744 s ±  0.066 s    [User: 10.807 s, System: 0.928 s]
    Range (min … max):   11.637 s … 11.863 s    10 runs

So the unaligned loads don't seem to help much, and actually make things
worse. It's possible there are platforms where they provide more
benefit, but:

  - the non-x86 platforms for which we use this code are old and obscure
    (powerpc and s390).

  - the main caller that cares about performance is block-sha1. But
    these days it is rarely used anyway, in favor of sha1dc (which is
    already much slower, and nobody seems to have cared that much).

Let's just drop unaligned versions entirely in the name of simplicity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:30:09 -07:00
Orgad Shaneh 3384a1ef78 vcbuild: fix batch file name in README
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-03 10:19:48 -07:00
Orgad Shaneh c2f3ef8d8f vcbuild: fix library name for expat with make MSVC=1
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-03 10:19:42 -07:00
Junio C Hamano 5a0482662f Merge branch 'jh/mingw-unlink'
"unlink" emulation on MinGW has been optimized.

* jh/mingw-unlink:
  mingw: improve performance of mingw_unlink()
2020-08-19 16:14:53 -07:00
Jeff Hostetler 680e0b4524 mingw: improve performance of mingw_unlink()
Update mingw_unlink() to first try to delete the file with existing
permissions before trying to force it.

Windows throws an error when trying to delete a read-only file.  The
mingw_unlink() compatibility wrapper always tries to _wchmod(666) the
file before calling _wunlink() to avoid that error.  However, since
most files in the worktree are already writable, this is usually
wasted effort.

Update mingw_unlink() to just call DeleteFileW() directly and if that
succeeds return.  If that fails, fall back into the existing code path
to update the permissions and use _wunlink() to get the existing
error code mapping.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-17 11:27:16 -07:00
Jeff King ef8d7ac42a strvec: convert more callers away from argv_array name
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).

This patch converts remaining files from the first half of the alphabet,
to keep the diff to a manageable size.

The conversion was done purely mechanically with:

  git ls-files '*.c' '*.h' |
  xargs perl -i -pe '
    s/ARGV_ARRAY/STRVEC/g;
    s/argv_array/strvec/g;
  '

and then selectively staging files with "git add '[abcdefghjkl]*'".
We'll deal with any indentation/style fallouts separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28 15:02:18 -07:00
Junio C Hamano 9906d5f8e9 Merge branch 'js/msvc-build-fix'
Workaround breakage in MSVC build, where "curl-config --cflags"
gives settings appropriate for GCC build.

* js/msvc-build-fix:
  msvc: fix "REG_STARTEND" issue
2020-06-17 21:54:03 -07:00
Johannes Schindelin bb0e43d8a1 msvc: fix "REG_STARTEND" issue
In 897d68e7af (Makefile: use curl-config --cflags, 2020-03-26), we
taught the build process to use `curl-config --cflags` to make sure that
it can find cURL's headers.

In the MSVC build, this is completely bogus because we're running in a
Git for Windows SDK whose `curl-config` supports the _GCC_ build.

Let's just ignore each and every `-I<path>` option where `<path>` points
to GCC/Clang specific headers.

Reported by Jeff Hostetler in
https://github.com/microsoft/git/issues/275.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-04 15:52:21 -07:00
Junio C Hamano 7b304ab16c Merge branch 'cb/no-more-gmtime'
Code clean-up by removing a compatibility implementation of a
function we no longer use.

* cb/no-more-gmtime:
  compat: remove gmtime
2020-05-20 08:33:27 -07:00
Carlo Marcelo Arenas Belón 84b0115f0d compat: remove gmtime
ccd469450a (date.c: switch to reentrant {gm,local}time_r, 2019-11-28)
removes the only gmtime() call we had and moves to gmtime_r() which
doesn't have the same portability problems.

Remove the compat gmtime code since it is no longer needed, and confirm
by successfull running t4212 in FreeBSD 9.3 amd64 (the oldest I could
get a hold off).

Further work might be needed to ensure 32bit time_t systems (like FreeBSD
i386) will handle correctly the overflows tested in t4212, but that is
orthogonal to this change, and it doesn't change the current behaviour
as neither gmtime() or gmtime_r() will ever return NULL on those systems
because time_t is unsigned.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-14 13:52:27 -07:00
Junio C Hamano dd094c2b75 Merge branch 'es/bugreport'
The "bugreport" tool.

* es/bugreport:
  bugreport: drop extraneous includes
  bugreport: add compiler info
  bugreport: add uname info
  bugreport: gather git version and build info
  bugreport: add tool to generate debugging info
  help: move list_config_help to builtin/help
2020-05-01 13:39:59 -07:00
Đoàn Trần Công Danh 3bc1f9e48c compat/regex: move stdlib.h up in inclusion chain
In Linux with musl libc, we have this inclusion chain:

compat/regex/regex.c:69
`-> compat/regex/regex_internal.h
   `-> /usr/include/stdlib.h
      `-> /usr/include/features.h
      `-> /usr/include/alloca.h

In that inclusion chain, `<features.h>` claims it's _BSD_SOURCE
compatible when it's NOT asked to be either
{_POSIX,_GNU,_XOPEN,_BSD}_SOURCE, or __STRICT_ANSI__.
And, `<stdlib.h>` will include `<alloca.h>` to be compatible with
software written for GNU and BSD. Thus, redefine `alloca` macro,
which was defined before at compat/regex/regex.c:66.

Considering this is only compat code, we've taken from other project,
it's not our business to decide which source should we adhere to.

Include `<stdlib.h>` early to prevent the redefinition of alloca.
This also remove a potential warning about alloca not defined on:
	#undef alloca

Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-27 11:21:16 -07:00
Junio C Hamano a41b41ca74 Merge branch 'js/mingw-isilon-nfs'
* js/mingw-isilon-nfs:
  mingw: cope with the Isilon network file system
2020-04-22 13:42:58 -07:00
Junio C Hamano b3eb70e0f8 Merge branch 'js/mingw-fixes'
Misc fixes for Windows.

* js/mingw-fixes:
  mingw: help debugging by optionally executing bash with strace
  mingw: do not treat `COM0` as a reserved file name
  mingw: use modern strftime implementation if possible
2020-04-22 13:42:56 -07:00
Emily Shaffer 69bcbbceb7 bugreport: add compiler info
To help pinpoint the source of a regression, it is useful to know some
info about the compiler which the user's Git client was built with. By
adding a generic get_compiler_info() in 'compat/' we can choose which
relevant information to share per compiler; to get started, let's
demonstrate the version of glibc if the user built with 'gcc'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-16 15:23:42 -07:00
Emily Shaffer 709df95b78 help: move list_config_help to builtin/help
Starting in 3ac68a93fd, help.o began to depend on builtin/branch.o,
builtin/clean.o, and builtin/config.o. This meant that help.o was
unusable outside of the context of the main Git executable.

To make help.o usable by other commands again, move list_config_help()
into builtin/help.c (where it makes sense to assume other builtin libraries
are present).

When command-list.h is included but a member is not used, we start to
hear a compiler warning. Since the config list is generated in a fairly
different way than the command list, and since commands and config
options are semantically different, move the config list into its own
header and move the generator into its own script and build rule.

For reasons explained in 976aaedc (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29), some build
artifacts we consider non-source files cannot be generated in the
Visual Studio environment, and we already have some Makefile tweaks
to help Visual Studio to use generated command-list.h header file.
Do the same to a new generated file, config-list.h, introduced by
this change.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
2020-04-16 15:22:16 -07:00
Nathan Sanders 23eafd924a mingw: cope with the Isilon network file system
On certain network filesystems (currently encountered with Isilon, but
in theory more network storage solutions could be causing the same
issue), when the directory in question is missing,
`raceproof_create_file()` fails with an `ERROR_INVALID_PARAMETER`
instead of an `ERROR_PATH_NOT_FOUND`.

Since it is highly unlikely that we produce such an error by mistake
(the parameters we pass are fairly benign), we can be relatively certain
that the directory is missing in this instance. So let's just translate
that error automagically.

This fixes https://github.com/git-for-windows/git/issues/1345.

Signed-off-by: Nathan Sanders <spekbukkem@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-10 10:34:05 -07:00
Johannes Schindelin 3efc128cd5 mingw: help debugging by optionally executing bash with strace
MSYS2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.

Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.

That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.

Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-10 10:21:07 -07:00
Johannes Schindelin b6852e1979 mingw: do not treat COM0 as a reserved file name
In 4dc42c6c18 (mingw: refuse paths containing reserved names,
2019-12-21), we started disallowing file names that are reserved, e.g.
`NUL`, `CONOUT$`, etc.

This included `COM<n>` where `<n>` is a digit. Unfortunately, this
includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to
the official documentation, `COM0` is mentioned in the "NT Namespaces"
section but it is explicitly _omitted_ from the list of reserved names:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions

Tests corroborate this: it is totally possible to write a file called
`com0.c` on Windows 10, but not `com1.c`.

So let's tighten the code to disallow only the reserved `COM<n>` file
names, but to allow `COM0` again.

This fixes https://github.com/git-for-windows/git/issues/2470.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-08 12:15:51 -07:00
Matthias Aßhauer a748f3f3dc mingw: use modern strftime implementation if possible
Microsoft introduced a new "Universal C Runtime Library" (UCRT) with
Visual Studio 2015. The UCRT comes with a new strftime() implementation
that supports more date formats. We link git against the older
"Microsoft Visual C Runtime Library" (MSVCRT), so to use the UCRT
strftime() we need to load it from ucrtbase.dll using
DECLARE_PROC_ADDR()/INIT_PROC_ADDR().

Most supported Windows systems should have recieved the UCRT via Windows
update, but in some cases only MSVCRT might be available. In that case
we fall back to using that implementation.

With this change, it is possible to use e.g. the `%g` and `%V` date
format specifiers, e.g.

	git show -s --format=%cd --date=format:‘%g.%V’ HEAD

Without this change, the user would see this error message on Windows:

	fatal: invalid strftime format: '‘%g.%V’'

This fixes https://github.com/git-for-windows/git/issues/2495

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-08 12:15:50 -07:00
Andras Kucsma 05ac8582bc run-command: trigger PATH lookup properly on Cygwin
On Cygwin, the codepath for POSIX-like systems is taken in
run-command.c::start_command(). The prepare_cmd() helper
function is called to decide if the command needs to be looked
up in the PATH. The logic there is to do the PATH-lookup if
and only if it does not have any slash '/' in it. If this test
passes we end up attempting to run the command by appending the
string after each colon-separated component of PATH.

The Cygwin environment supports both Windows and POSIX style
paths, so both forwardslahes '/' and back slashes '\' can be
used as directory separators for any external program the user
supplies.

Examples for path strings which are being incorrectly searched
for in the PATH instead of being executed as is:

- "C:\Program Files\some-program.exe"
- "a\b\c.exe"

To handle these, the PATH lookup detection logic in prepare_cmd()
is taught to know about this Cygwin quirk, by introducing
has_dir_sep(path) helper function to abstract away the difference
between true POSIX and Cygwin systems.

Signed-off-by: Andras Kucsma <r0maikx02b@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-27 11:06:17 -07:00
Junio C Hamano 32fc2c6dd6 Merge branch 'js/mingw-open-in-gdb' into maint
Dev support.

* js/mingw-open-in-gdb:
  mingw: add a helper function to attach GDB to the current process
2020-03-17 15:02:25 -07:00
Junio C Hamano 2d7247af6f Merge branch 'am/mingw-poll-fix' into maint
MinGW's poll() emulation has been improved.

* am/mingw-poll-fix:
  mingw: workaround for hangs when sending STDIN
2020-03-17 15:02:24 -07:00
Junio C Hamano a7a2e12b6e Merge branch 'jk/clang-sanitizer-fixes' into maint
C pedantry ;-) fix.

* jk/clang-sanitizer-fixes:
  obstack: avoid computing offsets from NULL pointer
  xdiff: avoid computing non-zero offset from NULL pointer
  avoid computing zero offsets from NULL pointer
  merge-recursive: use subtraction to flip stage
  merge-recursive: silence -Wxor-used-as-pow warning
2020-03-17 15:02:21 -07:00
Junio C Hamano 1ac37deba2 Merge branch 'am/mingw-poll-fix'
MinGW's poll() emulation has been improved.

* am/mingw-poll-fix:
  mingw: workaround for hangs when sending STDIN
2020-03-09 11:21:20 -07:00
Junio C Hamano ff41848e99 Merge branch 'rs/micro-cleanups'
Code cleanup.

* rs/micro-cleanups:
  use strpbrk(3) to search for characters from a given set
  quote: use isalnum() to check for alphanumeric characters
2020-03-02 15:07:20 -08:00
Alexandr Miloslavskiy 94f4d01932 mingw: workaround for hangs when sending STDIN
Explanation
-----------
The problem here is flawed `poll()` implementation. When it tries to
see if pipe can be written without blocking, it eventually calls
`NtQueryInformationFile()` and tests `WriteQuotaAvailable`. However,
the meaning of quota was misunderstood. The value of quota is reduced
when either some data was written to a pipe, *or* there is a pending
read on the pipe. Therefore, if there is a pending read of size >= than
the pipe's buffer size, poll() will think that pipe is not writable and
will hang forever, usually that means deadlocking both pipe users.

I have studied the problem and found that Windows pipes track two values:
`QuotaUsed` and `BytesInQueue`. The code in `poll()` apparently wants to
know `BytesInQueue` instead of quota. Unfortunately, `BytesInQueue` can
only be requested from read end of the pipe, while `poll()` receives
write end.

The git's implementation of `poll()` was copied from gnulib, which also
contains a flawed implementation up to today.

I also had a look at implementation in cygwin, which is also broken in a
subtle way. It uses this code in `pipe_data_available()`:
	fpli.WriteQuotaAvailable = (fpli.OutboundQuota - fpli.ReadDataAvailable)
However, `ReadDataAvailable` always returns 0 for the write end of the pipe,
turning the code into an obfuscated version of returning pipe's total
buffer size, which I guess will in turn have `poll()` always say that pipe
is writable. The commit that introduced the code doesn't say anything about
this change, so it could be some debugging code that slipped in.

These are the typical sizes used in git:
0x2000 - default read size in `strbuf_read()`
0x1000 - default read size in CRT, used by `strbuf_getwholeline()`
0x2000 - pipe buffer size in compat\mingw.c

As a consequence, as soon as child process uses `strbuf_read()`,
`poll()` in parent process will hang forever, deadlocking both
processes.

This results in two observable behaviors:
1) If parent process begins sending STDIN quickly (and usually that's
   the case), then first `poll()` will succeed and first block will go
   through. MAX_IO_SIZE_DEFAULT is 8MB, so if STDIN exceeds 8MB, then
   it will deadlock.
2) If parent process waits a little bit for any reason (including OS
   scheduler) and child is first to issue `strbuf_read()`, then it will
   deadlock immediately even on small STDINs.

The problem is illustrated by `git stash push`, which will currently
read the entire patch into memory and then send it to `git apply` via
STDIN. If patch exceeds 8MB, git hangs on Windows.

Possible solutions
------------------
1) Somehow obtain `BytesInQueue` instead of `QuotaUsed`
   I did a pretty thorough search and didn't find any ways to obtain
   the value from write end of the pipe.
2) Also give read end of the pipe to `poll()`
   That can be done, but it will probably invite some dirty code,
   because `poll()`
   * can accept multiple pipes at once
   * can accept things that are not pipes
   * is expected to have a well known signature.
3) Make `poll()` always reply "writable" for write end of the pipe
   Afterall it seems that cygwin (accidentally?) does that for years.
   Also, it should be noted that `pump_io_round()` writes 8MB blocks,
   completely ignoring the fact that pipe's buffer size is only 8KB,
   which means that pipe gets clogged many times during that single
   write. This may invite a deadlock, if child's STDERR/STDOUT gets
   clogged while it's trying to deal with 8MB of STDIN. Such deadlocks
   could be defeated with writing less than pipe's buffer size per
   round, and always reading everything from STDOUT/STDERR before
   starting next round. Therefore, making `poll()` always reply
   "writable" shouldn't cause any new issues or block any future
   solutions.
4) Increase the size of the pipe's buffer
   The difference between `BytesInQueue` and `QuotaUsed` is the size
   of pending reads. Therefore, if buffer is bigger than size of reads,
   `poll()` won't hang so easily. However, I found that for example
   `strbuf_read()` will get more and more hungry as it reads large inputs,
   eventually surpassing any reasonable pipe buffer size.

Chosen solution
---------------
Make `poll()` always reply "writable" for write end of the pipe.
Hopefully one day someone will find a way to implement it properly.

Reproduction
------------
printf "%8388608s" X >large_file.txt
git stash push --include-untracked -- large_file.txt

I have decided not to include this as test to avoid slowing down the
test suite. I don't expect the specific problem to come back, and
chances are that `git stash push` will be reworked to avoid sending the
entire patch via STDIN.

Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-27 14:23:29 -08:00
René Scharfe 2ce6d075fa use strpbrk(3) to search for characters from a given set
We can check if certain characters are present in a string by calling
strchr(3) on each of them, or we can pass them all to a single
strpbrk(3) call.  The latter is shorter, less repetitive and slightly
more efficient, so let's do that instead.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-24 09:30:31 -08:00
Junio C Hamano e154451a2f Merge branch 'js/mingw-open-in-gdb'
Dev support.

* js/mingw-open-in-gdb:
  mingw: add a helper function to attach GDB to the current process
2020-02-17 13:22:18 -08:00
Junio C Hamano 54bbadaeca Merge branch 'jk/asan-build-fix' into maint
Work around test breakages caused by custom regex engine used in
libasan, when address sanitizer is used with more recent versions
of gcc and clang.

* jk/asan-build-fix:
  Makefile: use compat regex with SANITIZE=address
2020-02-14 12:42:29 -08:00
Johannes Schindelin 08809c09aa mingw: add a helper function to attach GDB to the current process
When debugging Git, the criss-cross spawning of processes can make
things quite a bit difficult, especially when a Unix shell script is
thrown in the mix that calls a `git.exe` that then segfaults.

To help debugging such things, we introduce the `open_in_gdb()` function
which can be called at a code location where the segfault happens (or as
close as one can get); This will open a new MinTTY window with a GDB
that already attached to the current process.

Inspired by Derrick Stolee.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14 10:02:07 -08:00
Junio C Hamano b783391018 Merge branch 'jk/clang-sanitizer-fixes'
C pedantry ;-) fix.

* jk/clang-sanitizer-fixes:
  obstack: avoid computing offsets from NULL pointer
  xdiff: avoid computing non-zero offset from NULL pointer
  avoid computing zero offsets from NULL pointer
  merge-recursive: use subtraction to flip stage
  merge-recursive: silence -Wxor-used-as-pow warning
2020-02-12 12:41:36 -08:00
Junio C Hamano 76c57fedfa Merge branch 'js/add-p-leftover-bits'
The final leg of rewriting "add -i/-p" in C.

* js/add-p-leftover-bits:
  ci: include the built-in `git add -i` in the `linux-gcc` job
  built-in add -p: handle Escape sequences more efficiently
  built-in add -p: handle Escape sequences in interactive.singlekey mode
  built-in add -p: respect the `interactive.singlekey` config setting
  terminal: add a new function to read a single keystroke
  terminal: accommodate Git for Windows' default terminal
  terminal: make the code of disable_echo() reusable
  built-in add -p: handle diff.algorithm
  built-in add -p: support interactive.diffFilter
  t3701: adjust difffilter test
2020-02-05 14:34:58 -08:00
Junio C Hamano 808dab2b58 Merge branch 'jk/asan-build-fix'
Work around test breakages caused by custom regex engine used in
libasan, when address sanitizer is used with more recent versions
of gcc and clang.

* jk/asan-build-fix:
  Makefile: use compat regex with SANITIZE=address
2020-01-30 14:17:09 -08:00
Jeff King cf82bff73f obstack: avoid computing offsets from NULL pointer
As with the previous two commits, UBSan with clang-11 complains about
computing offsets from a NULL pointer. The failures in t4013 (and
elsewhere) look like this:

  kwset.c:102:23: runtime error: applying non-zero offset 107820859019600 to null pointer
  ...
  not ok 79 - git log -SF master # magic is (not used)

That line is not enlightening:

  ... = obstack_alloc(&kwset->obstack, sizeof (struct trie));

because obstack is implemented almost entirely in macros, and the actual
problem is five macros deep (I temporarily converted them to inline
functions to get better compiler errors, which was tedious but worked
reasonably well).

The actual problem is in these pointer-alignment macros:

  /* If B is the base of an object addressed by P, return the result of
     aligning P to the next multiple of A + 1.  B and P must be of type
     char *.  A + 1 must be a power of 2.  */

  #define __BPTR_ALIGN(B, P, A) ((B) + (((P) - (B) + (A)) & ~(A)))

  /* Similar to _BPTR_ALIGN (B, P, A), except optimize the common case
     where pointers can be converted to integers, aligned as integers,
     and converted back again.  If PTR_INT_TYPE is narrower than a
     pointer (e.g., the AS/400), play it safe and compute the alignment
     relative to B.  Otherwise, use the faster strategy of computing the
     alignment relative to 0.  */

  #define __PTR_ALIGN(B, P, A)                                                \
    __BPTR_ALIGN (sizeof (PTR_INT_TYPE) < sizeof (void *) ? (B) : (char *) 0, \
                  P, A)

If we have a sufficiently-large integer pointer type, then we do the
computation using a NULL pointer constant. That turns __BPTR_ALIGN()
into something like:

  NULL + (P - NULL + A) & ~A

and UBSan is complaining about adding the full value of P to that
initial NULL. We can fix this by doing our math as an integer type, and
then casting the result back to a pointer. The problem case only happens
when we know that the integer type is large enough, so there should be
no issue with truncation.

Another option would be just simplify out all the 0's from
__BPTR_ALIGN() for the NULL-pointer case. That probably wouldn't work
for a platform where the NULL pointer isn't all-zeroes, but Git already
wouldn't work on such a platform (due to our use of memset to set
pointers in structs to NULL). But I tried here to keep as close to the
original as possible.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-28 23:13:25 -08:00
Junio C Hamano 232378479e Sync with maint
* maint:
  msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x
2020-01-16 15:18:46 -08:00
Jeff King f65d07fffa Makefile: use compat regex with SANITIZE=address
Recent versions of the gcc and clang Address Sanitizer produce test
failures related to regexec(). This triggers with gcc-10 and clang-8
(but not gcc-9 nor clang-7). Running:

  make CC=gcc-10 SANITIZE=address test

results in failures in t4018, t3206, and t4062.

The cause seems to be that when built with ASan, we use a different
version of regexec() than normal. And this version doesn't understand
the REG_STARTEND flag. Here's my evidence supporting that.

The failure in t4062 is an ASan warning:

  expecting success of 4062.2 '-G matches':
  	git diff --name-only -G "^(0{64}){64}$" HEAD^ >out &&
  	test 4096-zeroes.txt = "$(cat out)"

  =================================================================
  ==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220
  READ of size 4097 at 0x7fa76f672000 thread T0
      #0 0x7fa7726f75b5  (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5)
      #1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117
      #2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52
      #3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166
      [...]

In this case we're looking in a buffer which was mmap'd via
reuse_worktree_file(), and whose size is 4096 bytes. But libasan's
regex tries to look at byte 4097 anyway! If we tweak Git like this:

  diff --git a/diff.c b/diff.c
  index 8e2914c031..cfae60c120 100644
  --- a/diff.c
  +++ b/diff.c
  @@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate,
           */
          if (ce_uptodate(ce) ||
              (!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0)))
  -               return 1;
  +               return 0;

          return 0;
   }

to use a regular buffer (with a trailing NUL) instead of an mmap, then
the complaint goes away.

The other failures are actually diff output with an incorrect funcname
header. If I instrument xdiff to show the funcname matching like so:

  diff --git a/xdiff-interface.c b/xdiff-interface.c
  index 8509f9ea22..f6c3dc1986 100644
  --- a/xdiff-interface.c
  +++ b/xdiff-interface.c
  @@ -197,6 +197,7 @@ struct ff_regs {
   	struct ff_reg {
   		regex_t re;
   		int negate;
  +		char *printable;
   	} *array;
   };

  @@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len,

   	for (i = 0; i < regs->nr; i++) {
   		struct ff_reg *reg = regs->array + i;
  -		if (!regexec_buf(&reg->re, line, len, 2, pmatch, 0)) {
  +		int ret = regexec_buf(&reg->re, line, len, 2, pmatch, 0);
  +		warning("regexec %s:\n  regex: %s\n  buf: %.*s",
  +			ret == 0 ? "matched" : "did not match",
  +			reg->printable,
  +			(int)len, line);
  +		if (!ret) {
   			if (reg->negate)
   				return -1;
   			break;
  @@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
   			expression = value;
   		if (regcomp(&reg->re, expression, cflags))
   			die("Invalid regexp to look for hunk header: %s", expression);
  +		reg->printable = xstrdup(expression);
   		free(buffer);
   		value = ep + 1;
   	}

then when compiling with ASan and gcc-10, running the diff from t4018.66
produces this:

  $ git diff -U1 cpp-skip-access-specifiers
  warning: regexec did not match:
    regex: ^[     ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
    buf: private:
  warning: regexec matched:
    regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
    buf: private:
  diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
  index 4d4a9db..ebd6f42 100644
  --- a/cpp-skip-access-specifiers
  +++ b/cpp-skip-access-specifiers
  @@ -6,3 +6,3 @@ private:
          void DoSomething();
          int ChangeMe;
  };
          void DoSomething();
  -       int ChangeMe;
  +       int IWasChanged;
   };

That first regex should match (and is negated, so it should be telling
us _not_ to match "private:"). But it wouldn't if regexec() is looking
at the whole buffer, and not just the length-limited line we've fed to
regexec_buf(). So this is consistent again with REG_STARTEND being
ignored.

The correct output (compiling without ASan, or gcc-9 with Asan) looks
like this:

  warning: regexec matched:
    regex: ^[     ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
    buf: private:
  [...more lines that we end up not using...]
  warning: regexec matched:
    regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
    buf: class RIGHT : public Baseclass
  diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
  index 4d4a9db..ebd6f42 100644
  --- a/cpp-skip-access-specifiers
  +++ b/cpp-skip-access-specifiers
  @@ -6,3 +6,3 @@ class RIGHT : public Baseclass
          void DoSomething();
  -       int ChangeMe;
  +       int IWasChanged;
   };

So it really does seem like libasan's regex engine is ignoring
REG_STARTEND. We should be able to work around it by compiling with
NO_REGEX, which would use our local regexec(). But to make matters even
more interesting, this isn't enough by itself.

Because ASan has support from the compiler, it doesn't seem to intercept
our call to regexec() at the dynamic library level. It actually
recognizes when we are compiling a call to regexec() and replaces it
with ASan-specific code at that point. And unlike most of our other
compat code, where we might have git_mmap() or similar, the actual
symbol name in the compiled compat/regex code is regexec(). So just
compiling with NO_REGEX isn't enough; we still end up in libasan!

We can work around that by having the preprocessor replace regexec with
git_regexec (both in the callers and in the actual implementation), and
we truly end up with a call to our custom regex code, even when
compiling with ASan. That's probably a good thing to do anyway, as it
means anybody looking at the symbols later (e.g., in a debugger) would
have a better indication of which function is which. So we'll do the
same for the other common regex functions (even though just regexec() is
enough to fix this ASan problem).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16 14:19:39 -08:00
Johannes Schindelin b6d4d82bd5 msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x
With the upgrade, the library names changed from libeay32/ssleay32 to
libcrypto/libssl.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16 12:18:23 -08:00
Johannes Schindelin 12acdf573a built-in add -p: handle Escape sequences more efficiently
When `interactive.singlekey = true`, we react immediately to keystrokes,
even to Escape sequences (e.g. when pressing a cursor key).

The problem with Escape sequences is that we do not really know when
they are done, and as a heuristic we poll standard input for half a
second to make sure that we got all of it.

While waiting half a second is not asking for a whole lot, it can become
quite annoying over time, therefore with this patch, we read the
terminal capabilities (if available) and extract known Escape sequences
from there, then stop polling immediately when we detected that the user
pressed a key that generated such a known sequence.

This recapitulates the remaining part of b5cc003253 (add -i: ignore
terminal escape sequences, 2011-05-17).

Note: We do *not* query the terminal capabilities directly. That would
either require a lot of platform-specific code, or it would require
linking to a library such as ncurses.

Linking to a library in the built-ins is something we try very hard to
avoid (we even kicked the libcurl dependency to a non-built-in remote
helper, just to shave off a tiny fraction of a second from Git's startup
time). And the platform-specific code would be a maintenance nightmare.

Even worse: in Git for Windows' case, we would need to query MSYS2
pseudo terminals, which `git.exe` simply cannot do (because it is
intentionally *not* an MSYS2 program).

To address this, we simply spawn `infocmp -L -1` and parse its output
(which works even in Git for Windows, because that helper is included in
the end-user facing installations).

This is done only once, as in the Perl version, but it is done only when
the first Escape sequence is encountered, not upon startup of `git add
-i`; This saves on startup time, yet makes reacting to the first Escape
sequence slightly more sluggish. But it allows us to keep the
terminal-related code encapsulated in the `compat/terminal.c` file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15 12:06:17 -08:00
Johannes Schindelin e118f06396 built-in add -p: handle Escape sequences in interactive.singlekey mode
This recapitulates part of b5cc003253 (add -i: ignore terminal escape
sequences, 2011-05-17):

    add -i: ignore terminal escape sequences

    On the author's terminal, the up-arrow input sequence is ^[[A, and
    thus fat-fingering an up-arrow into 'git checkout -p' is quite
    dangerous: git-add--interactive.perl will ignore the ^[ and [
    characters and happily treat A as "discard everything".

    As a band-aid fix, use Term::Cap to get all terminal capabilities.
    Then use the heuristic that any capability value that starts with ^[
    (i.e., \e in perl) must be a key input sequence.  Finally, given an
    input that starts with ^[, read more characters until we have read a
    full escape sequence, then return that to the caller.  We use a
    timeout of 0.5 seconds on the subsequent reads to avoid getting stuck
    if the user actually input a lone ^[.

    Since none of the currently recognized keys start with ^[, the net
    result is that the sequence as a whole will be ignored and the help
    displayed.

Note that we leave part for later which uses "Term::Cap to get all
terminal capabilities", for several reasons:

1. it is actually not really necessary, as the timeout of 0.5 seconds
   should be plenty sufficient to catch Escape sequences,

2. it is cleaner to keep the change to special-case Escape sequences
   separate from the change that reads all terminal capabilities to
   speed things up, and

3. in practice, relying on the terminal capabilities is a bit overrated,
   as the information could be incomplete, or plain wrong. For example,
   in this developer's tmux sessions, the terminal capabilities claim
   that the "cursor up" sequence is ^[M, but the actual sequence
   produced by the "cursor up" key is ^[[A.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15 12:06:17 -08:00
Johannes Schindelin a5e46e6b01 terminal: add a new function to read a single keystroke
Typically, input on the command-line is line-based. It is actually not
really easy to get single characters (or better put: keystrokes).

We provide two implementations here:

- One that handles `/dev/tty` based systems as well as native Windows.
  The former uses the `tcsetattr()` function to put the terminal into
  "raw mode", which allows us to read individual keystrokes, one by one.
  The latter uses `stty.exe` to do the same, falling back to direct
  Win32 Console access.

  Thanks to the refactoring leading up to this commit, this is a single
  function, with the platform-specific details hidden away in
  conditionally-compiled code blocks.

- A fall-back which simply punts and reads back an entire line.

Note that the function writes the keystroke into an `strbuf` rather than
a `char`, in preparation for reading Escape sequences (e.g. when the
user hit an arrow key). This is also required for UTF-8 sequences in
case the keystroke corresponds to a non-ASCII letter.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15 12:06:17 -08:00
Johannes Schindelin 9ea416cb51 terminal: accommodate Git for Windows' default terminal
Git for Windows' Git Bash runs in MinTTY by default, which does not have
a Win32 Console instance, but uses MSYS2 pseudo terminals instead.

This is a problem, as Git for Windows does not want to use the MSYS2
emulation layer for Git itself, and therefore has no direct way to
interact with that pseudo terminal.

As a workaround, use the `stty` utility (which is included in Git for
Windows, and which *is* an MSYS2 program, so it knows how to deal with
the pseudo terminal).

Note: If Git runs in a regular CMD or PowerShell window, there *is* a
regular Win32 Console to work with. This is not a problem for the MSYS2
`stty`: it copes with this scenario just fine.

Also note that we introduce support for more bits than would be
necessary for a mere `disable_echo()` here, in preparation for the
upcoming `enable_non_canonical()` function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15 12:06:17 -08:00
Johannes Schindelin 94ac3c31f7 terminal: make the code of disable_echo() reusable
We are about to introduce the function `enable_non_canonical()`, which
shares almost the complete code with `disable_echo()`.

Let's prepare for that, by refactoring out that shared code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15 12:06:17 -08:00
Junio C Hamano 13432fc6dd Merge branch 'js/mingw-reserved-filenames'
Forbid pathnames that the platform's filesystem cannot represent on
MinGW.

* js/mingw-reserved-filenames:
  mingw: refuse paths containing reserved names
  mingw: short-circuit the conversion of `/dev/null` to UTF-16
2020-01-02 12:38:30 -08:00
Johannes Schindelin 4dc42c6c18 mingw: refuse paths containing reserved names
There are a couple of reserved names that cannot be file names on
Windows, such as `AUX`, `NUL`, etc. For an almost complete list, see
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

If one would try to create a directory named `NUL`, it would actually
"succeed", i.e. the call would return success, but nothing would be
created.

Worse, even adding a file extension to the reserved name does not make
it a valid file name. To understand the rationale behind that behavior,
see https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073

Let's just disallow them all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-21 16:09:07 -08:00
Johannes Schindelin 98d9b23e90 mingw: short-circuit the conversion of /dev/null to UTF-16
In the next commit, we want to disallow accessing any path that contains
any segment that is equivalent to `NUL`. In particular, we want to
disallow accessing `NUL` (e.g. to prevent any repository from being
checked out that contains a file called `NUL`, as that is not a valid
file name on Windows).

However, there are legitimate use cases within Git itself to write to
the Null device. As Git is really a Linux project, it does not abstract
that idea, though, but instead uses `/dev/null` to describe this
intention.

So let's side-step the validation _specifically_ in the case that we
want to write to (or read from) `/dev/null`, via a dedicated short-cut
in the code that skips the call to `validate_win32_path()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-21 16:09:06 -08:00
Junio C Hamano 4755a34c47 Merge branch 'dd/time-reentrancy'
Avoid gmtime() and localtime() and prefer their reentrant
counterparts.

* dd/time-reentrancy:
  mingw: use {gm,local}time_s as backend for {gm,local}time_r
  archive-zip.c: switch to reentrant localtime_r
  date.c: switch to reentrant {gm,local}time_r
2019-12-16 13:08:31 -08:00
Junio C Hamano 55d607d85b Merge branch 'js/mingw-inherit-only-std-handles'
Work around a issue where a FD that is left open when spawning a
child process and is kept open in the child can interfere with the
operation in the parent process on Windows.

* js/mingw-inherit-only-std-handles:
  mingw: forbid translating ERROR_SUCCESS to an errno value
  mingw: do set `errno` correctly when trying to restrict handle inheritance
  mingw: restrict file handle inheritance only on Windows 7 and later
  mingw: spawned processes need to inherit only standard handles
  mingw: work around incorrect standard handles
  mingw: demonstrate that all file handles are inherited by child processes
2019-12-10 13:11:42 -08:00
Junio C Hamano 7034cd094b Sync with Git 2.24.1 2019-12-09 22:17:55 -08:00
Johannes Schindelin 67af91c47a Sync with 2.23.1
* maint-2.23: (44 commits)
  Git 2.23.1
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  ...
2019-12-06 16:31:39 +01:00
Johannes Schindelin 7fd9fd94fb Sync with 2.22.2
* maint-2.22: (43 commits)
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  ...
2019-12-06 16:31:30 +01:00
Johannes Schindelin 5421ddd8d0 Sync with 2.21.1
* maint-2.21: (42 commits)
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  ...
2019-12-06 16:31:23 +01:00
Johannes Schindelin 7d8b676992 mingw: sh arguments need quoting in more circumstances
Previously, we failed to quote characters such as '*', '(' and the
likes. Let's fix this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06 16:31:15 +01:00
Johannes Schindelin 04522edbd4 mingw: fix quoting of empty arguments for sh
When constructing command-lines to spawn processes, it is an unfortunate
but necessary decision to quote arguments differently: MSYS2 has
different dequoting rules (inherited from Cygwin) than the rest of
Windows.

To accommodate that, Git's Windows compatibility layer has two separate
quoting helpers, one for MSYS2 (which it uses exclusively when spawning
`sh`) and the other for regular Windows executables.

The MSYS2 one had an unfortunate bug where a `,` somehow slipped in,
instead of the `;`. As a consequence, empty arguments would not be
enclosed in a pair of double quotes, but the closing double quote was
skipped.

Let's fix this.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06 16:31:14 +01:00
Johannes Schindelin 49f7a76d57 mingw: use MSYS2 quoting even when spawning shell scripts
At the point where `mingw_spawn_fd()` is called, we already have a full
path to the script interpreter in that scenario, and we pass it in as
the executable to run, while the `argv` reflect what the script should
receive as command-line.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06 16:31:14 +01:00
Johannes Schindelin e2ba3d6f6d mingw: detect when MSYS2's sh is to be spawned more robustly
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06 16:31:14 +01:00
Johannes Schindelin fc346cb292 Sync with 2.20.2
* maint-2.20: (36 commits)
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  ...
2019-12-06 16:31:12 +01:00
Johannes Schindelin d851d94151 Sync with 2.19.3
* maint-2.19: (34 commits)
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  ...
2019-12-06 16:30:49 +01:00
Johannes Schindelin 7c9fbda6e2 Sync with 2.18.2
* maint-2.18: (33 commits)
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  ...
2019-12-06 16:30:38 +01:00
Johannes Schindelin 14af7ed5a9 Sync with 2.17.3
* maint-2.17: (32 commits)
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  ...
2019-12-06 16:29:15 +01:00
Johannes Schindelin bdfef0492c Sync with 2.16.6
* maint-2.16: (31 commits)
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  ...
2019-12-06 16:27:36 +01:00
Johannes Schindelin 9ac92fed5b Sync with 2.15.4
* maint-2.15: (29 commits)
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  clone --recurse-submodules: prevent name squatting on Windows
  is_ntfs_dotgit(): only verify the leading segment
  ...
2019-12-06 16:27:18 +01:00
Johannes Schindelin d3ac8c3f27 Sync with 2.14.6
* maint-2.14: (28 commits)
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  clone --recurse-submodules: prevent name squatting on Windows
  is_ntfs_dotgit(): only verify the leading segment
  test-path-utils: offer to run a protectNTFS/protectHFS benchmark
  ...
2019-12-06 16:26:55 +01:00
Johannes Schindelin 2ddcccf97a Merge branch 'win32-accommodate-funny-drive-names'
While the only permitted drive letters for physical drives on Windows
are letters of the US-English alphabet, this restriction does not apply
to virtual drives assigned via `subst <letter>: <path>`.

To prevent targeted attacks against systems where "funny" drive letters
such as `1` or `!` are assigned, let's handle them as regular drive
letters on Windows.

This fixes CVE-2019-1351.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:09 +01:00
Johannes Schindelin 65d30a19de Merge branch 'win32-filenames-cannot-have-trailing-spaces-or-periods'
On Windows, filenames cannot have trailing spaces or periods, when
opening such paths, they are stripped automatically. Read: you can open
the file `README` via the file name `README . . .`. This ambiguity can
be used in combination with other security bugs to cause e.g. remote
code execution during recursive clones. This patch series fixes that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:09 +01:00
Johannes Schindelin f82a97eb91 mingw: handle subst-ed "DOS drives"
Over a decade ago, in 25fe217b86 (Windows: Treat Windows style path
names., 2008-03-05), Git was taught to handle absolute Windows paths,
i.e. paths that start with a drive letter and a colon.

Unbeknownst to us, while drive letters of physical drives are limited to
letters of the English alphabet, there is a way to assign virtual drive
letters to arbitrary directories, via the `subst` command, which is
_not_ limited to English letters.

It is therefore possible to have absolute Windows paths of the form
`1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode
letters can also be used, e.g. `ä:\tschibät.sch`.

While it can be sensibly argued that users who set up such funny drive
letters really seek adverse consequences, the Windows Operating System
is known to be a platform where many users are at the mercy of
administrators who have their very own idea of what constitutes a
reasonable setup.

Therefore, let's just make sure that such funny paths are still
considered absolute paths by Git, on Windows.

In addition to Unicode characters, pretty much any character is a valid
drive letter, as far as `subst` is concerned, even `:` and `"` or even a
space character. While it is probably the opposite of smart to use them,
let's safeguard `is_dos_drive_prefix()` against all of them.

Note: `[::1]:repo` is a valid URL, but not a valid path on Windows.
As `[` is now considered a valid drive letter, we need to be very
careful to avoid misinterpreting such a string as valid local path in
`url_is_local_not_ssh()`. To do that, we use the just-introduced
function `is_valid_path()` (which will label the string as invalid file
name because of the colon characters).

This fixes CVE-2019-1351.

Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:07 +01:00
Johannes Schindelin d2c84dad1c mingw: refuse to access paths with trailing spaces or periods
When creating a directory on Windows whose path ends in a space or a
period (or chains thereof), the Win32 API "helpfully" trims those. For
example, `mkdir("abc ");` will return success, but actually create a
directory called `abc` instead.

This stems back to the DOS days, when all file names had exactly 8
characters plus exactly 3 characters for the file extension, and the
only way to have shorter names was by padding with spaces.

Sadly, this "helpful" behavior is a bit inconsistent: after a successful
`mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because
the directory `abc ` does not actually exist).

Even if it would work, we now have a serious problem because a Git
repository could contain directories `abc` and `abc `, and on Windows,
they would be "merged" unintentionally.

As these paths are illegal on Windows, anyway, let's disallow any
accesses to such paths on that Operating System.

For practical reasons, this behavior is still guarded by the
config setting `core.protectNTFS`: it is possible (and at least two
regression tests make use of it) to create commits without involving the
worktree. In such a scenario, it is of course possible -- even on
Windows -- to create such file names.

Among other consequences, this patch disallows submodules' paths to end
in spaces on Windows (which would formerly have confused Git enough to
try to write into incorrect paths, anyway).

While this patch does not fix a vulnerability on its own, it prevents an
attack vector that was exploited in demonstrations of a number of
recently-fixed security bugs.

The regression test added to `t/t7417-submodule-path-url.sh` reflects
that attack vector.

Note that we have to adjust the test case "prevent git~1 squatting on
Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue.
It tries to clone two submodules whose names differ only in a trailing
period character, and as a consequence their git directories differ in
the same way. Previously, when Git tried to clone the second submodule,
it thought that the git directory already existed (because on Windows,
when you create a directory with the name `b.` it actually creates `b`),
but with this patch, the first submodule's clone will fail because of
the illegal name of the git directory. Therefore, when cloning the
second submodule, Git will take a different code path: a fresh clone
(without an existing git directory). Both code paths fail to clone the
second submodule, both because the the corresponding worktree directory
exists and is not empty, but the error messages are worded differently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:06 +01:00
Johannes Schindelin 817ddd64c2 mingw: refuse to access paths with illegal characters
Certain characters are not admissible in file names on Windows, even if
Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they
are, e.g. `:`, `<`, `>`, etc

Let's disallow those characters explicitly in Windows builds of Git.

Note: just like trailing spaces or periods, it _is_ possible on Windows
to create commits adding files with such illegal characters, as long as
the operation leaves the worktree untouched. To allow for that, we
continue to guard `is_valid_win32_path()` behind the config setting
`core.protectNTFS`, so that users _can_ continue to do that, as long as
they turn the protections off via that config setting.

Among other problems, this prevents Git from trying to write to an "NTFS
Alternate Data Stream" (which refers to metadata stored alongside a
file, under a special name: "<filename>:<stream-name>"). This fix
therefore also prevents an attack vector that was exploited in
demonstrations of a number of recently-fixed security bugs.

Further reading on illegal characters in Win32 filenames:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:06 +01:00
Johannes Schindelin 6d8684161e mingw: fix quoting of arguments
We need to be careful to follow proper quoting rules. For example, if an
argument contains spaces, we have to quote them. Double-quotes need to
be escaped. Backslashes need to be escaped, but only if they are
followed by a double-quote character.

We need to be _extra_ careful to consider the case where an argument
ends in a backslash _and_ needs to be quoted: in this case, we append a
double-quote character, i.e. the backslash now has to be escaped!

The current code, however, fails to recognize that, and therefore can
turn an argument that ends in a single backslash into a quoted argument
that now ends in an escaped double-quote character. This allows
subsequent command-line parameters to be split and part of them being
mistaken for command-line options, e.g. through a maliciously-crafted
submodule URL during a recursive clone.

Technically, we would not need to quote _all_ arguments which end in a
backslash _unless_ the argument needs to be quoted anyway. For example,
`test\` would not need to be quoted, while `test \` would need to be.

To keep the code simple, however, and therefore easier to reason about
and ensure its correctness, we now _always_ quote an argument that ends
in a backslash.

This addresses CVE-2019-1350.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:36:51 +01:00
Johannes Schindelin 3ba3720b3f mingw: forbid translating ERROR_SUCCESS to an errno value
Johannes Sixt pointed out that the `err_win_to_posix()` function
mishandles `ERROR_SUCCESS`: it maps it to `ENOSYS`.

The only purpose of this function is to map Win32 API errors to `errno`
ones, and there is actually no equivalent to `ERROR_SUCCESS`: the idea
of `errno` is that it will only be set in case of an error, and left
alone in case of success.

Therefore, as pointed out by Junio Hamano, it is a bug to call this
function when there was not even any error to map.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-02 11:05:36 -08:00
Doan Tran Cong Danh 0109d676f9 mingw: use {gm,local}time_s as backend for {gm,local}time_r
Since Windows doesn't provide gmtime_r(3) and localtime_r(3),
we're providing a compat version by using non-reentrant gmtime(3) and
localtime(3) as backend. Then, we copy the returned data into the
buffer.

By doing that, in case of failure, we will dereference a NULL pointer
returned by gmtime(3), and localtime(3), and we always return a valid
pointer instead of NULL.

Drop the memcpy(3) by using gmtime_s(), and use localtime_s() as the
backend on Windows, and make sure we will return NULL in case of
failure.

Cc: Johannes Sixt <j6t@kdbg.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-01 22:26:25 -08:00
Junio C Hamano 9da3948781 Merge branch 'rs/use-copy-array-in-mingw-shell-command-preparation'
Code cleanup.

* rs/use-copy-array-in-mingw-shell-command-preparation:
  mingw: use COPY_ARRAY for copying array
2019-12-01 09:04:39 -08:00
Junio C Hamano d3096d2ba6 Merge branch 'en/doc-typofix'
Docfix.

* en/doc-typofix:
  Fix spelling errors in no-longer-updated-from-upstream modules
  multimail: fix a few simple spelling errors
  sha1dc: fix trivial comment spelling error
  Fix spelling errors in test commands
  Fix spelling errors in messages shown to users
  Fix spelling errors in names of tests
  Fix spelling errors in comments of testcases
  Fix spelling errors in code comments
  Fix spelling errors in documentation outside of Documentation/
  Documentation: fix a bunch of typos, both old and new
2019-12-01 09:04:35 -08:00
Johannes Schindelin 4d0375ca24 mingw: do set errno correctly when trying to restrict handle inheritance
In 9a780a384d (mingw: spawned processes need to inherit only standard
handles, 2019-11-22), we taught the Windows-specific part to restrict
which file handles are passed on to the spawned processes.

Since this logic seemed to be a bit fragile across Windows versions (we
_still_ support Windows Vista in Git for Windows, for example), a
fall-back was added to try spawning the process again, this time without
restricting which file handles are to be inherited by the spawned
process.

In the common case (i.e. when the process could not be spawned for
reasons _other_ than the file handle inheritance), the fall-back attempt
would still fail, of course.

Crucially, one thing we missed in that code path was to set `errno`
appropriately.

This should have been caught by t0061.2 which expected `errno` to be
`ENOENT` after trying to start a process for a non-existing executable,
but `errno` was set to `ENOENT` prior to the `CreateProcessW()` call:
while looking for the config settings for trace2, Git tries to access
`xdg_config` and `user_config` via `access_or_die()`, and as neither of
those config files exists when running the test case (because in Git's
test suite, `HOME` points to the test directory), the `errno` has the
expected value, but for the wrong reasons.

Let's fix that by making sure that `errno` is set correctly. It even
appears that `errno` was set in the _wrong_ case previously:
`CreateProcessW()` returns non-zero upon success, but `errno` was set
only in the non-zero case.

It would be nice if we could somehow fix t0061 to make sure that this
does not regress again. One approach that seemed like it should work,
but did not, was to set `errno` to 0 in the test helper that is used by
t0061.2.

However, when `mingw_spawnvpe()` wants to see whether the file in
question is a script, it calls `parse_interpreter()`, which in turn
tries to `open()` the file. Obviously, this call fails, and sets `errno`
to `ENOENT`, deep inside the call chain started from that test helper.

Instead, we force re-set `errno` at the beginning of the function
`mingw_spawnve_fd()`, which _should_ be safe given that callers of that
function will want to look at `errno` if -1 was returned. And if that
`errno` is 0 ("No error"), regression tests like t0061.2 will kick in.

Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30 14:09:09 -08:00
Johannes Schindelin ac33519ddf mingw: restrict file handle inheritance only on Windows 7 and later
Turns out that it don't work so well on Vista, see
https://github.com/git-for-windows/git/issues/1742 for details.

According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it
*should* work on Windows Vista and later.

But apparently there are issues on Windows Vista when pipes are
involved. Given that Windows Vista is past its end of life (official
support ended on April 11th, 2017), let's not spend *too* much time on
this issue and just disable the file handle inheritance restriction on
any Windows version earlier than Windows 7.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23 11:17:01 +09:00
Johannes Schindelin 9a780a384d mingw: spawned processes need to inherit only standard handles
By default, CreateProcess() does not inherit any open file handles,
unless the bInheritHandles parameter is set to TRUE. Which we do need to
set because we need to pass in stdin/stdout/stderr to talk to the child
processes. Sadly, this means that all file handles (unless marked via
O_NOINHERIT) are inherited.

This lead to problems in VFS for Git, where a long-running read-object
hook is used to hydrate missing objects, and depending on the
circumstances, might only be called *after* Git opened a file handle.

Ideally, we would not open files without O_NOINHERIT unless *really*
necessary (i.e. when we want to pass the opened file handle as standard
handle into a child process), but apparently it is all-too-easy to
introduce incorrect open() calls: this happened, and prevented updating
a file after the read-object hook was started because the hook still
held a handle on said file.

Happily, there is a solution: as described in the "Old New Thing"
https://blogs.msdn.microsoft.com/oldnewthing/20111216-00/?p=8873 there
is a way, starting with Windows Vista, that lets us define precisely
which handles should be inherited by the child process.

And since we bumped the minimum Windows version for use with Git for
Windows to Vista with v2.10.1 (i.e. a *long* time ago), we can use this
method. So let's do exactly that.

We need to make sure that the list of handles to inherit does not
contain duplicates; Otherwise CreateProcessW() would fail with
ERROR_INVALID_ARGUMENT.

While at it, stop setting errno to ENOENT unless it really is the
correct value.

Also, fall back to not limiting handle inheritance under certain error
conditions (e.g. on Windows 7, which is a lot stricter in what handles
you can specify to limit to).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23 11:17:01 +09:00
Johannes Schindelin c5a03b1e29 mingw: work around incorrect standard handles
For some reason, when being called via TortoiseGit the standard handles,
or at least what is returned by _get_osfhandle(0) for standard input,
can take on the value (HANDLE)-2 (which is not a legal value, according
to the documentation).

Even if this value is not documented anywhere, CreateProcess() seems to
work fine without complaints if hStdInput set to this value.

In contrast, the upcoming code to restrict which file handles get
inherited by spawned processes would result in `ERROR_INVALID_PARAMETER`
when including such handle values in the list.

To help this, special-case the value (HANDLE)-2 returned by
_get_osfhandle() and replace it with INVALID_HANDLE_VALUE, which will
hopefully let the handle inheritance restriction work even when called
from TortoiseGit.

This fixes https://github.com/git-for-windows/git/issues/1481

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23 11:17:01 +09:00
René Scharfe 51bd6be32d mingw: use COPY_ARRAY for copying array
Use the macro COPY_ARRAY to copy array elements.  The result is shorter
and safer, as it infers the element type automatically and does a (very)
basic type compatibility check for its first two arguments.

Coccinelle and contrib/coccinelle/array.cocci did not generate this
conversion due to the offset of 1 at both source and destination and
because the source is a const pointer; the semantic patch cautiously
handles only pure pointers and array references of the same type.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-13 11:29:22 +09:00
Elijah Newren 03670c8b23 Fix spelling errors in no-longer-updated-from-upstream modules
We have several modules originally taken from some upstream source,
and which as far as I can tell we no longer update from the upstream
anymore.  As such, I have not submitted these spelling fixes to any
external projects but just include them directly here.

Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-10 16:00:55 +09:00
Elijah Newren 15beaaa3d1 Fix spelling errors in code comments
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-10 16:00:54 +09:00
Junio C Hamano f2db52c46b Merge branch 'js/mingw-needs-hiding-fix'
Fix for a (rather old) buffer-overrun bug.

* js/mingw-needs-hiding-fix:
  mingw: avoid a buffer overrun in `needs_hiding()`
2019-10-30 15:13:13 +09:00
Johannes Schindelin 60e6569a12 mingw: avoid a buffer overrun in needs_hiding()
When this function is passed a path with a trailing slash, it runs right
over the end of that path.

Let's fix this.

Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-28 12:38:51 +09:00
Junio C Hamano 2d74d28ee0 Merge branch 'dl/compat-cleanup'
Code formatting micronit fix.

* dl/compat-cleanup:
  pthread.h: manually align parameter lists
2019-10-18 11:40:47 +09:00
Junio C Hamano 6d5291be45 Merge branch 'js/azure-pipelines-msvc'
CI updates.

* js/azure-pipelines-msvc:
  ci: also build and test with MS Visual Studio on Azure Pipelines
  ci: really use shallow clones on Azure Pipelines
  tests: let --immediate and --write-junit-xml play well together
  test-tool run-command: learn to run (parts of) the testsuite
  vcxproj: include more generated files
  vcxproj: only copy `git-remote-http.exe` once it was built
  msvc: work around a bug in GetEnvironmentVariable()
  msvc: handle DEVELOPER=1
  msvc: ignore some libraries when linking
  compat/win32/path-utils.h: add #include guards
  winansi: use FLEX_ARRAY to avoid compiler warning
  msvc: avoid using minus operator on unsigned types
  push: do not pretend to return `int` from `die_push_simple()`
2019-10-15 13:48:00 +09:00
Denton Liu 9cad1c4488 pthread.h: manually align parameter lists
In previous patches, extern was mechanically removed from function
declarations without care to formatting, causing parameter lists to be
misaligned. Manually format changed sections such that the parameter
lists are realigned.

Viewing this patch with 'git diff -w' should produce no output.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-11 14:59:35 +09:00
Junio C Hamano 772cad0afb Merge branch 'js/diff-rename-force-stable-sort'
The rename detection logic sorts a list of rename source candidates
by similarity to pick the best candidate, which means that a tie
between sources with the same similarity is broken by the original
location in the original candidate list (which is sorted by path).
Force the sorting by similarity done with a stable sort, which is
not promised by system supplied qsort(3), to ensure consistent
results across platforms.

* js/diff-rename-force-stable-sort:
  diffcore_rename(): use a stable sort
  Move git_sort(), a stable sort, into into libgit.a
2019-10-09 14:00:59 +09:00
Johannes Schindelin 61d1d92aa4 msvc: work around a bug in GetEnvironmentVariable()
The return value of that function is 0 both for variables that are
unset, as well as for variables whose values are empty. To discern those
two cases, one has to call `GetLastError()`, whose return value is
`ERROR_ENVVAR_NOT_FOUND` and `ERROR_SUCCESS`, respectively.

Except that it is not actually set to `ERROR_SUCCESS` in the latter
case, apparently, but the last error value seems to be simply unchanged.

To work around this, let's just re-set the last error value just before
inspecting the environment variable.

This fixes a problem that triggers failures in t3301-notes.sh (where we
try to override config settings by passing empty values for certain
environment variables).

This problem is hidden in the MINGW build by the fact that older
MSVC runtimes (such as the one used by MINGW builds) have a `calloc()`
that re-sets the last error value in case of success, while newer
runtimes set the error value only if `NULL` is returned by that
function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-06 09:07:44 +09:00
Johannes Schindelin e4347c9434 msvc: handle DEVELOPER=1
We frequently build Git using the `DEVELOPER=1` make setting as a
shortcut to enable all kinds of more stringent compiler warnings.

Those compiler warnings are relatively specific to GCC, though, so let's
try our best to translate them to the equivalent options to pass to MS
Visual C++'s `cl.exe`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-06 09:07:44 +09:00
Johannes Schindelin ed712ef8d5 msvc: ignore some libraries when linking
To build with MSVC, we "translate" GCC options to MSVC options, and part
of those options refer to the libraries to link into the final
executable. Currently, this part looks somewhat like this on Windows:

	-lcurl -lnghttp2 -lidn2 -lssl -lcrypto -lssl -lcrypto -lgdi32
	-lcrypt32 -lwldap32 -lz -lws2_32 -lexpat

Some of those are direct dependencies (such as curl and ssl) and others
are indirect (nghttp2 and idn2, for example, are dependencies of curl,
but need to be linked in for reasons).

We already handle the direct dependencies, e.g. `-liconv` is already
handled as adding `libiconv.lib` to the list of libraries to link
against.

Let's just ignore the remaining `-l*` options so that MSVC does not have
to warn us that it ignored e.g. the `/lnghttp2` option. We do that by
extending the clause that already "eats" the `-R*` options to also eat
the `-l*` options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-06 09:07:44 +09:00
Johannes Schindelin 5b8f9e2417 compat/win32/path-utils.h: add #include guards
This adds the common guards that allow headers to be #include'd multiple
times.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-06 09:07:44 +09:00
Johannes Schindelin 41616ef618 winansi: use FLEX_ARRAY to avoid compiler warning
MSVC would complain thusly:

    C4200: nonstandard extension used: zero-sized array in struct/union

Let's just use the `FLEX_ARRAY` constant that we introduced for exactly
this type of scenario.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-06 09:07:44 +09:00
Johannes Schindelin 97fff61012 Move git_sort(), a stable sort, into into libgit.a
The `qsort()` function is not guaranteed to be stable, i.e. it does not
promise to maintain the order of items it is told to consider equal. In
contrast, the `git_sort()` function we carry in `compat/qsort.c` _is_
stable, by virtue of implementing a merge sort algorithm.

In preparation for using a stable sort in Git's rename detection, move
the stable sort into `libgit.a` so that it is compiled in
unconditionally, and rename it to `git_stable_qsort()`.

Note: this also makes the hack obsolete that was introduced in
fe21c6b285 (mingw: reencode environment variables on the fly (UTF-16
<-> UTF-8), 2018-10-30), where we included `compat/qsort.c` directly in
`compat/mingw.c` to use the stable sort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-02 14:44:51 +09:00
Junio C Hamano 6f21347f11 Merge branch 'ar/mingw-run-external-with-non-ascii-path'
Windows update.

* ar/mingw-run-external-with-non-ascii-path:
  mingw: fix launching of externals from Unicode paths
2019-09-30 13:19:26 +09:00
Junio C Hamano 00bb74453d Merge branch 'dl/compat-cleanup'
Code cleanup.

* dl/compat-cleanup:
  compat/*.[ch]: remove extern from function declarations using spatch
  mingw: apply array.cocci rule
2019-09-30 13:19:23 +09:00
Denton Liu 7027f508c7 compat/*.[ch]: remove extern from function declarations using spatch
In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), we removed externs from function declarations using
spatch but we intentionally excluded files under compat/ since some are
directly copied from an upstream and we should avoid churning them so
that manually merging future updates will be simpler.

In the last commit, we determined the files which taken from an upstream
so we can exclude them and run spatch on the remainder.

This was the Coccinelle patch used:

	@@
	type T;
	identifier f;
	@@
	- extern
	  T f(...);

and it was run with:

	$ git ls-files compat/\*\*.{c,h} |
		xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
	$ git checkout -- \
		compat/regex/ \
		compat/inet_ntop.c \
		compat/inet_pton.c \
		compat/nedmalloc/ \
		compat/obstack.{c,h} \
		compat/poll/

Coccinelle has some trouble dealing with `__attribute__` and varargs so
we ran the following to ensure that no remaining changes were left
behind:

	$ git ls-files compat/\*\*.{c,h} |
		xargs sed -i'' -e 's/^\(\s*\)extern \([^(]*([^*]\)/\1\2/'
	$ git checkout -- \
		compat/regex/ \
		compat/inet_ntop.c \
		compat/inet_pton.c \
		compat/nedmalloc/ \
		compat/obstack.{c,h} \
		compat/poll/

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05 11:05:45 -07:00
Denton Liu 552fc5016f mingw: apply array.cocci rule
After running Coccinelle on all sources inside compat/ that were created
by us[1], it was found that compat/mingw.c violated an array.cocci rule
in two places and, thus, a patch was generated. Apply this patch so that
all compat/ sources created by us follows all cocci rules.

[1]: Do not run Coccinelle on files that are taken from some upstream
because in case we need to pull updates from them, we would like to have
diverged as little as possible in order to make merging updates simpler.

The following sources were determined to have been taken from some
upstream:

* compat/regex/
* compat/inet_ntop.c
* compat/inet_pton.c
* compat/nedmalloc/
* compat/obstack.{c,h}
* compat/poll/

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05 11:05:03 -07:00
Adam Roben 4e1a641ee3 mingw: fix launching of externals from Unicode paths
If Git were installed in a path containing non-ASCII characters,
commands such as `git am` and `git submodule`, which are implemented as
externals, would fail to launch with the following error:

> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?

This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 85faec9d3a (Win32: Unicode file name support (except dirent),
2012-03-15).

Note that the only problem in this function was calling
`GetFileAttributes()` instead of `GetFileAttributesW()`. The calls to
`access()` were fine because `access()` is a macro which resolves to
`mingw_access()`, which already handles Unicode correctly. But
`lookup_prog()` was changed to use `_waccess()` directly so that we only
convert the path to UTF-16 once.

To make things work correctly, we have to maintain UTF-8 and UTF-16
versions in tandem in `lookup_prog()`.

Signed-off-by: Adam Roben <adam@roben.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-26 10:06:03 -07:00
Junio C Hamano 4336fdb2ef Merge branch 'rs/nedalloc-fixlets'
Compilation fix.

* rs/nedalloc-fixlets:
  nedmalloc: avoid compiler warning about unused value
  nedmalloc: do assignments only after the declaration section
2019-08-22 12:34:11 -07:00
René Scharfe 70597e8386 nedmalloc: avoid compiler warning about unused value
Cast the evaluated value of the macro INITIAL_LOCK to void to instruct
the compiler that we're not interested in said value nor the following
warning:

In file included from compat/nedmalloc/nedmalloc.c:63:
compat/nedmalloc/malloc.c.h: In function ‘init_user_mstate’:
compat/nedmalloc/malloc.c.h:1706:62: error: right-hand operand of comma expression has no effect [-Werror=unused-value]
 1706 | #define INITIAL_LOCK(sl)      (memset(sl, 0, sizeof(MLOCK_T)), 0)
      |                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
compat/nedmalloc/malloc.c.h:5020:3: note: in expansion of macro ‘INITIAL_LOCK’
 5020 |   INITIAL_LOCK(&m->mutex);
      |   ^~~~~~~~~~~~

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-07 11:54:55 -07:00
René Scharfe c9b9c09dae nedmalloc: do assignments only after the declaration section
Avoid the following compiler warning:

In file included from compat/nedmalloc/nedmalloc.c:63:
compat/nedmalloc/malloc.c.h: In function ‘pthread_release_lock’:
compat/nedmalloc/malloc.c.h:1759:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
 1759 |     volatile unsigned int* lp = &sl->l;
      |     ^~~~~~~~

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-07 11:54:45 -07:00
Junio C Hamano c62bc49139 Merge branch 'js/visual-studio'
Support building Git with Visual Studio

The bits about .git/branches/* have been dropped from the series.
We may want to drop the support for it, but until that happens, the
tests should rely on the existence of the support to pass.

* js/visual-studio: (23 commits)
  git: avoid calling aliased builtins via their dashed form
  bin-wrappers: append `.exe` to target paths if necessary
  .gitignore: ignore Visual Studio's temporary/generated files
  .gitignore: touch up the entries regarding Visual Studio
  vcxproj: also link-or-copy builtins
  msvc: add a Makefile target to pre-generate the Visual Studio solution
  contrib/buildsystems: add a backend for modern Visual Studio versions
  contrib/buildsystems: handle options starting with a slash
  contrib/buildsystems: also handle -lexpat
  contrib/buildsystems: handle libiconv, too
  contrib/buildsystems: handle the curl library option
  contrib/buildsystems: error out on unknown option
  contrib/buildsystems: optionally capture the dry-run in a file
  contrib/buildsystems: redirect errors of the dry run into a log file
  contrib/buildsystems: ignore gettext stuff
  contrib/buildsystems: handle quoted spaces in filenames
  contrib/buildsystems: fix misleading error message
  contrib/buildsystems: ignore irrelevant files in Generators/
  contrib/buildsystems: ignore invalidcontinue.obj
  Vcproj.pm: urlencode '<' and '>' when generating VC projects
  ...
2019-08-02 13:12:02 -07:00
Johannes Schindelin 976aaedca0 msvc: add a Makefile target to pre-generate the Visual Studio solution
The entire idea of generating the VS solution makes only sense if we
generate it via Continuous Integration; otherwise potential users would
still have to download the entire Git for Windows SDK.

If we pre-generate the Visual Studio solution, Git can be built entirely
within Visual Studio, and the test scripts can be run in a regular Git
for Windows (e.g. the Portable Git flavor, which does not include a full
GCC toolchain and therefore weighs only about a tenth of Git for
Windows' SDK).

So let's just add a target in the Makefile that can be used to generate
said solution; The generated files will then be committed so that they
can be pushed to a branch ready to check out by Visual Studio users.

To make things even more useful, we also generate and commit other files
that are required to run the test suite, such as templates and
bin-wrappers: with this, developers can run the test suite in a regular
Git Bash after building the solution in Visual Studio.

Note: for this build target, we do not actually need to initialize the
`vcpkg` system, so we don't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-29 14:51:43 -07:00
Junio C Hamano 2f72ebfcd0 Merge branch 'js/mingw-spawn-with-spaces-in-path' into maint
Window 7 update ;-)

* js/mingw-spawn-with-spaces-in-path:
  mingw: support spawning programs containing spaces in their names
2019-07-29 12:38:17 -07:00
Junio C Hamano a5194d806c Merge branch 'js/mingw-spawn-with-spaces-in-path'
Window 7 update ;-)

* js/mingw-spawn-with-spaces-in-path:
  mingw: support spawning programs containing spaces in their names
2019-07-25 13:59:23 -07:00
Junio C Hamano fc613d2d6e Merge branch 'kb/mingw-set-home'
Windows port update.

* kb/mingw-set-home:
  mingw: initialize HOME on startup
2019-07-19 11:30:23 -07:00
Johannes Schindelin eb7c786314 mingw: support spawning programs containing spaces in their names
On some older Windows versions (e.g. Windows 7), the CreateProcessW()
function does not really support spaces in its first argument,
lpApplicationName. But it supports passing NULL as lpApplicationName,
which makes it figure out the application from the (possibly quoted)
first argument of lpCommandLine.

Let's use that trick (if we are certain that the first argument matches
the executable's path) to support launching programs whose path contains
spaces.

We will abuse the test-fake-ssh.exe helper to verify that this works and
does not regress.

This fixes https://github.com/git-for-windows/git/issues/692

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-16 12:47:37 -07:00
Junio C Hamano 0a2ff7c6b5 Merge branch 'js/mingw-use-utf8'
Windows update.

* js/mingw-use-utf8:
  mingw: fix possible buffer overrun when calling `GetUserNameW()`
  mingw: use Unicode functions explicitly
  mingw: get pw_name in UTF-8 format
2019-07-11 15:16:49 -07:00
Junio C Hamano 9b9b24bd57 Merge branch 'cb/windows-manifest'
Windows update.

* cb/windows-manifest:
  mingw: embed a manifest to trick UAC into Doing The Right Thing
2019-07-11 15:16:47 -07:00
Junio C Hamano 88b1075759 Merge branch 'jh/msvc'
Support to build with MSVC has been updated.

* jh/msvc:
  msvc: ignore .dll and incremental compile output
  msvc: avoid debug assertion windows in Debug Mode
  msvc: do not pretend to support all signals
  msvc: add pragmas for common warnings
  msvc: add a compile-time flag to allow detailed heap debugging
  msvc: support building Git using MS Visual C++
  msvc: update Makefile to allow for spaces in the compiler path
  msvc: fix detect_msys_tty()
  msvc: define ftello()
  msvc: do not re-declare the timespec struct
  msvc: mark a variable as non-const
  msvc: define O_ACCMODE
  msvc: include sigset_t definition
  msvc: fix dependencies of compat/msvc.c
  mingw: replace mingw_startup() hack
  obstack: fix compiler warning
  cache-tree/blame: avoid reusing the DEBUG constant
  t0001 (mingw): do not expect a specific order of stdout/stderr
  Mark .bat files as requiring CR/LF endings
  mingw: fix a typo in the msysGit-specific section
2019-07-09 15:25:45 -07:00
Karsten Blees e12a955660 mingw: initialize HOME on startup
HOME initialization was historically duplicated in many different places,
including /etc/profile, launch scripts such as git-bash.vbs and gitk.cmd,
and (although slightly broken) in the git-wrapper.

Even unrelated projects such as GitExtensions and TortoiseGit need to
implement the same logic to be able to call git directly.

Initialize HOME in git's own startup code so that we can eventually retire
all the duplicate initialization code.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-08 14:55:02 -07:00
Johannes Schindelin 697bdd22b8 mingw: fix possible buffer overrun when calling GetUserNameW()
In 39a98e9b68 (mingw: get pw_name in UTF-8 format, 2019-06-27), this
developer missed the fact that the `GetUserNameW()` function takes the
number of characters as `len` parameter, not the number of bytes.

Reported-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-08 12:09:12 -07:00
Johannes Schindelin 94238859b9 mingw: use Unicode functions explicitly
Many Win32 API functions actually exist in two variants: one with
the `A` suffix that takes ANSI parameters (`char *` or `const char *`)
and one with the `W` suffix that takes Unicode parameters (`wchar_t *`
or `const wchar_t *`).

The ANSI variant assumes that the strings are encoded according to
whatever is the current locale. This is not what Git wants to use on
Windows: we assume that `char *` variables point to strings encoded in
UTF-8.

There is a pseudo UTF-8 locale on Windows, but it does not work
as one might expect. In addition, if we overrode the user's locale, that
would modify the behavior of programs spawned by Git (such as editors,
difftools, etc), therefore we cannot use that pseudo locale.

Further, it is actually highly encouraged to use the Unicode versions
instead of the ANSI versions, so let's do precisely that.

Note: when calling the Win32 API functions _without_ any suffix, it
depends whether the `UNICODE` constant is defined before the relevant
headers are #include'd. Without that constant, the ANSI variants are
used. Let's be explicit and avoid that ambiguity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:56:15 -07:00
Johannes Schindelin 39a98e9b68 mingw: get pw_name in UTF-8 format
Previously, we would have obtained the user name encoded in whatever the
current code page is.

Note: the "user name" here does not denote the full name but instead the
short logon name.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:56:13 -07:00
Cesar Eduardo Barros fe90397604 mingw: embed a manifest to trick UAC into Doing The Right Thing
On Windows >= Vista, not having an application manifest with a
requestedExecutionLevel can cause several kinds of confusing behavior.

The first and more obvious behavior is "Installer Detection" of the
"User Account Control" (also known as "UAC") feature, where Windows
sometimes decides (by looking at things like the file name and even
sequences of bytes within the executable) that an executable is an
installer and should run elevated (causing the well-known popup dialog
to appear). In Git's context, subcommands such as "git patch-id" or "git
update-index" fall prey to this behavior.

The second and more confusing behavior is "File Virtualization". It
means that when files are written without having write permission, it
does not fail (as expected), but they are instead redirected to
somewhere else. When the files are read, the original contents are
returned, though, not the ones that were just written somewhere else.
Even more confusing, not all write accesses are redirected; Trying to
write to write-protected .exe files, for example, will fail instead of
redirecting.

In addition to being unwanted behavior, File Virtualization causes
dramatic slowdowns in Git (see for instance
http://code.google.com/p/msysgit/issues/detail?id=320).

A third unwanted behavior of Windows >= Vista is that it lies about the
Windows version when calling `GetWindowsVersionEx()`.

There are two ways to prevent these unwanted behaviors: Either you embed
an application manifest (which really is an XML document conforming to a
specific schema) within all your executables, or you add an external
manifest (a file with the same name followed by `.manifest`) to all your
executables. Since Git's builtins are hardlinked (or copied), it is
simpler and more robust to embed a manifest.

Recent enough MSVC compilers already embed a working internal manifest,
and building with mingw-w64 (which is the case in Git for Windows' SDK)
does it, too, but for MinGW you have to do so by hand.

In any case, it is better to be explicit about this manifest, that way
changes in the compiler toolchain won't surprise us (as mingw-w64 once
did when it broke `GetWindowsVersionEx()` by mistake).

References:
  - New UAC Technologies for Windows Vista
    http://msdn.microsoft.com/en-us/library/bb756960.aspx
  - Create and Embed an Application Manifest (UAC)
    http://msdn.microsoft.com/en-us/library/bb756929.aspx

Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:55:45 -07:00
Johannes Schindelin f6a6393771 msvc: avoid debug assertion windows in Debug Mode
For regular debugging, it is pretty helpful when a debug assertion in a
running application triggers a window that offers to start the debugger.

However, when running the test suite, it is not so helpful, in
particular when the debug assertions are then suppressed anyway because
we disable the invalid parameter checking (via invalidcontinue.obj, see
the comment in config.mak.uname about that object for more information).

So let's simply disable that window in Debug Mode (it is already
disabled in Release Mode).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 10:46:58 -07:00
Jeff Hostetler 446df60367 msvc: do not pretend to support all signals
This special-cases various signals that are not supported on Windows,
such as SIGPIPE. These cause the UCRT to throw asserts (at least in
debug mode).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 10:46:58 -07:00
Philip Oakley b7bd9a7338 msvc: add pragmas for common warnings
MSVC can be overzealous about some warnings. Disable them.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 10:46:57 -07:00
Jeff Hostetler 556702f86c msvc: add a compile-time flag to allow detailed heap debugging
MS Visual C comes with a few neat features we can use to analyze the
heap consumption (i.e. leaks, max memory, etc).

With this patch, we introduce support via the build-time flag
`USE_MSVC_CRTDBG`.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 10:46:57 -07:00
Jeff Hostetler dce7d29551 msvc: support building Git using MS Visual C++
With this patch, Git can be built using the Microsoft toolchain, via:

	make MSVC=1 [DEBUG=1]

Third party libraries are built from source using the open source
"vcpkg" tool set. See https://github.com/Microsoft/vcpkg

On a first build, the vcpkg tools and the third party libraries are
automatically downloaded and built. DLLs for the third party libraries
are copied to the top-level (and t/helper) directory to facilitate
debugging. See compat/vcbuild/README.

A series of .bat files are invoked by the Makefile to find the location
of the installed version of Visual Studio and the associated compiler
tools (essentially replicating the environment setup performed by a
"Developer Command Prompt"). This should find the most recent VS2015 or
VS2017 installation. Output from these scripts are used by the Makefile
to define compiler and linker pathnames and -I and -L arguments.

The build produces .pdb files for both debug and release builds.

Note: This commit was squashed from an organic series of commits
developed between 2016 and 2018 in Git for Windows' `master` branch.
This combined commit eliminates the obsolete commits related to fetching
NuGet packages for third party libraries. It is difficult to use NuGet
packages for C/C++ sources because they may be built by earlier versions
of the MSVC compiler and have CRT version and linking issues.

Additionally, the C/C++ NuGet packages that we were using tended to not
be updated concurrently with the sources.  And in the case of cURL and
OpenSSL, this could expose us to security issues.

Helped-by: Yue Lin Ho <b8732003@student.nsysu.edu.tw>
Helped-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 10:46:57 -07:00
Jeff Hostetler 5f3ff78081 msvc: fix detect_msys_tty()
The ntstatus.h header is only available in MINGW.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Jeff Hostetler f6f470fed4 msvc: define ftello()
It is just called differently in MSVC's headers.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Jeff Hostetler 172e54e2d7 msvc: do not re-declare the timespec struct
VS2015's headers already declare that struct.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Jeff Hostetler 12fb9bd85e msvc: mark a variable as non-const
VS2015 complains when using a const pointer in memcpy()/free().

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Philip Oakley a822fb853a msvc: define O_ACCMODE
This constant is not defined in MSVC's headers.

In UCRT's fcntl.h, _O_RDONLY, _O_WRONLY and _O_RDWR are defined as 0, 1
and 2, respectively. Yes, that means that UCRT breaks with the tradition
that O_RDWR == O_RDONLY | O_WRONLY.

It is a perfectly legal way to define those constants, though, therefore
we need to take care of defining O_ACCMODE accordingly.

This is particularly important in order to keep our "open() can set
errno to EISDIR" emulation working: it tests that (flags & O_ACCMODE) is
not identical to O_RDONLY before going on to test specifically whether
the file for which open() reported EACCES is, in fact, a directory.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Philip Oakley 7ca46634bc msvc: include sigset_t definition
On MSVC (VS2008) sigset_t is not defined.

Signed-off-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Johannes Schindelin 396ff7547d mingw: replace mingw_startup() hack
Git for Windows has special code to retrieve the command-line parameters
(and even the environment) in UTF-16 encoding, so that they can be
converted to UTF-8. This is necessary because Git for Windows wants to
use UTF-8 encoded strings throughout its code, and the main() function
does not get the parameters in that encoding.

To do that, we used the __wgetmainargs() function, which is not even a
Win32 API function, but provided by the MINGW "runtime" instead.

Obviously, this method would not work with any compiler other than GCC,
and in preparation for compiling with Visual C++, we would like to avoid
precisely that.

Lucky us, there is a much more elegant way: we can simply implement the
UTF-16 variant of `main()`: `wmain()`.

To make that work, we need to link with -municode. The command-line
parameters are passed to `wmain()` encoded in UTF-16, as desired, and
this method also works with GCC, and also with Visual C++ after
adjusting the MSVC linker flags to force it to use `wmain()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Johannes Schindelin 96a0679441 obstack: fix compiler warning
MS Visual C suggests that the construct

	condition ? (int) i : (ptrdiff_t) d

is incorrect. Let's fix this by casting to ptrdiff_t also for the
positive arm of the conditional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Johannes Schindelin 4fa42df972 winansi: simplify loading the GetCurrentConsoleFontEx() function
We introduced helper macros to simplify loading functions dynamically.
Might just as well use them.

This also side-steps a compiler warning when building with GCC v8.x: it
would complain about casting between incompatible function pointers.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-13 09:34:16 -07:00
Johannes Schindelin 8d4679fe09 poll (mingw): allow compiling with GCC 8 and DEVELOPER=1
The return type of the `GetProcAddress()` function is `FARPROC` which
evaluates to `long long int (*)()`, i.e. it cannot be cast to the
correct function signature by GCC 8.

To work around that, we first cast to `void *` and go on with our merry
lives.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-13 09:34:16 -07:00
Junio C Hamano b3e0981f28 Merge branch 'tt/no-ipv6-fallback-for-winxp'
Code cleanup.

* tt/no-ipv6-fallback-for-winxp:
  mingw: remove obsolete IPv6-related code
2019-05-19 16:45:32 +09:00
Junio C Hamano 40bef4992e Merge branch 'cc/access-on-aix-workaround'
Workaround for standard-compliant but less-than-useful behaviour of
access(2) for the root user.

* cc/access-on-aix-workaround:
  git-compat-util: work around for access(X_OK) under root
2019-05-13 23:50:35 +09:00
Tanushree Tumane b9f0193b25 mingw: remove obsolete IPv6-related code
To support IPv6, Git provided fall back functions for Windows versions
that did not support IPv6. However, as Git dropped support for Windows
XP and prior, those functions are not needed anymore.

Remove those fallbacks by reverting fe3b2b7b82 (Enable support for
IPv6 on MinGW, 2009-11-24) and using the functions directly (without
'ipv6_' prefix).

Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-07 18:42:28 +09:00
Clément Chigot 400caafb2b git-compat-util: work around for access(X_OK) under root
On AIX, access(X_OK) may succeed when run as root even if the
execution isn't possible. This behavior is allowed by POSIX
which says:

  ... for a process with appropriate privileges, an implementation
  may indicate success for X_OK even if execute permission is not
  granted to any user.

It can lead hook programs to have their execution refused:

   git commit -m content
   fatal: cannot exec '.git/hooks/pre-commit': Permission denied

Add NEED_ACCESS_ROOT_HANDLER in order to use an access helper function.
It checks with stat if any executable flags is set when the current user
is root.

Signed-off-by: Clément Chigot <clement.chigot@atos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-25 17:49:44 +09:00
Jeff Hostetler 26c6f251d7 trace2: report peak memory usage of the process
Teach Windows version of git to report peak memory usage
during exit() processing.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16 13:37:07 +09:00
Jeff Hostetler a089724958 trace2: refactor setting process starting time
Create trace2_initialize_clock() and call from main() to capture
process start time in isolation and before other sub-systems are
ready.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16 13:37:06 +09:00
Junio C Hamano 2274fc75c3 Merge branch 'jk/guard-bswap-header'
The include file compat/bswap.h has been updated so that it is safe
to (accidentally) include it more than once.

* jk/guard-bswap-header:
  compat/bswap: add include header guards
2019-03-11 16:16:25 +09:00
Jeff King 33aa579a55 compat/bswap: add include header guards
Our compat/bswap.h lacks the usual preprocessor guards against multiple
inclusion. This usually isn't an issue since it only gets included from
git-compat-util.h, which has its own guards. But it would produce
redeclaration errors if any file included it separately.

Our hdr-check target would complain about this, except that it currently
skips items in compat/ entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07 07:42:14 +09:00
Jeff Hostetler 353d3d77f4 trace2: collect Windows-specific process information
Add platform-specific interface to log information about the current
process.

On Windows, this interface is used to indicate whether the git process
is running under a debugger and list names of the process ancestors.

Information for other platforms is left for a future effort.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22 15:27:59 -08:00
Jeff Hostetler ee4512ed48 trace2: create new combined trace facility
Create a new unified tracing facility for git.  The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.

In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.

Trace2 defines 3 output targets.  These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".  These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).

* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
  summary data.

* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
  It extends the output with columns for the command process, thread,
  repo, absolute and relative elapsed times.  It reports events for
  child process start/stop, thread start/stop, and per-thread function
  nesting.

* GIT_TR2_EVENT is a new structured format. It writes event data as a
  series of JSON records.

Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22 15:27:59 -08:00
Johannes Schindelin ca1b411648 mingw: safe-guard a bit more against getenv() problems
Running up to v2.21.0, we fixed two bugs that were made prominent by the
Windows-specific change to retain copies of only the 30 latest getenv()
calls' returned strings, invalidating any copies of previous getenv()
calls' return values.

While this really shines a light onto bugs of the form where we hold
onto getenv()'s return values without copying them, it is also a real
problem for users.

And even if Jeff King's patches merged via 773e408881 (Merge branch
'jk/save-getenv-result', 2019-01-29) provide further work on that front,
we are far from done. Just one example: on Windows, we unset environment
variables when spawning new processes, which potentially invalidates
strings that were previously obtained via getenv(), and therefore we
have to duplicate environment values that are somehow involved in
spawning new processes (e.g. GIT_MAN_VIEWER in show_man_page()).

We do not have a chance to investigate, let address, all of those issues
in time for v2.21.0, so let's at least help Windows users by increasing
the number of getenv() calls' return values that are kept valid. The
number 64 was determined by looking at the average number of getenv()
calls per process in the entire test suite run on Windows (which is
around 40) and then adding a bit for good measure. And it is a power of
two (which would have hit yesterday's theme perfectly).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-15 10:25:28 -08:00
Junio C Hamano 8593e8a618 Merge branch 'js/mingw-host-cpu'
Windows update.

* js/mingw-host-cpu:
  mingw: use a more canonical method to fix the CPU reporting
2019-02-13 18:18:43 -08:00
Junio C Hamano 1db999ce8d Merge branch 'nd/fileno-may-be-macro'
* nd/fileno-may-be-macro:
  git-compat-util: work around fileno(fp) that is a macro
2019-02-13 18:18:41 -08:00
Johannes Schindelin bb02e7a560 mingw: use a more canonical method to fix the CPU reporting
In `git version --build-options`, we report also the CPU, but in Git for
Windows we actually cross-compile the 32-bit version in a 64-bit Git for
Windows, so we cannot rely on the auto-detected value.

In 3815f64b0d (mingw: fix CPU reporting in `git version
--build-options`, 2019-02-07), we fixed this by a Windows-only
workaround, making use of magic pre-processor constants, which works in
GCC, but most likely not all C compilers.

As pointed out by Eric Sunshine, there is a better way, anyway: to set
the Makefile variable HOST_CPU explicitly for cross-compiled Git. So
let's do that!

This reverts commit 3815f64b0d partially.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-13 13:46:58 -08:00
Duy Nguyen 18a4f6be6b git-compat-util: work around fileno(fp) that is a macro
On various BSD's, fileno(fp) is implemented as a macro that directly
accesses the fields in the FILE * object, which breaks a function that
accepts a "void *fp" parameter and calls fileno(fp) and expect it to
work.

Work it around by adding a compile-time knob FILENO_IS_A_MACRO that
inserts a real helper function in the middle of the callchain.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-12 10:01:59 -08:00
Junio C Hamano 6951c5fd9f Merge branch 'js/mingw-host-cpu'
Windows update.

* js/mingw-host-cpu:
  mingw: fix CPU reporting in `git version --build-options`
2019-02-08 20:44:53 -08:00
Johannes Schindelin 3815f64b0d mingw: fix CPU reporting in git version --build-options
We cannot rely on `uname -m` in Git for Windows' SDK to tell us what
architecture we are compiling for, as we can compile both 32-bit and
64-bit `git.exe` from a 64-bit SDK, but the `uname -m` in that SDK will
always report `x86_64`.

So let's go back to our original design. And make it explicitly
Windows-specific.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-07 13:05:37 -08:00
Junio C Hamano 57cbc53d3e Merge branch 'js/vsts-ci'
Prepare to run test suite on Azure Pipeline.

* js/vsts-ci: (22 commits)
  test-date: drop unused parameter to getnanos()
  ci: parallelize testing on Windows
  ci: speed up Windows phase
  tests: optionally skip bin-wrappers/
  t0061: workaround issues with --with-dashes and RUNTIME_PREFIX
  tests: add t/helper/ to the PATH with --with-dashes
  mingw: try to work around issues with the test cleanup
  tests: include detailed trace logs with --write-junit-xml upon failure
  tests: avoid calling Perl just to determine file sizes
  README: add a build badge (status of the Azure Pipelines build)
  mingw: be more generous when wrapping up the setitimer() emulation
  ci: use git-sdk-64-minimal build artifact
  ci: add a Windows job to the Azure Pipelines definition
  Add a build definition for Azure DevOps
  ci/lib.sh: add support for Azure Pipelines
  tests: optionally write results as JUnit-style .xml
  test-date: add a subcommand to measure times in shell scripts
  ci: use a junction on Windows instead of a symlink
  ci: inherit --jobs via MAKEFLAGS in run-build-and-tests
  ci/lib.sh: encapsulate Travis-specific things
  ...
2019-02-06 22:05:26 -08:00
Junio C Hamano 0fa3cc77ee Merge branch 'tb/utf-16-le-with-explicit-bom'
A new encoding UTF-16LE-BOM has been invented to force encoding to
UTF-16 with BOM in little endian byte order, which cannot be directly
generated by using iconv.

* tb/utf-16-le-with-explicit-bom:
  Support working-tree-encoding "UTF-16LE-BOM"
2019-02-06 22:05:21 -08:00
Junio C Hamano f5dd919064 Merge branch 'js/mingw-unc-path-w-backslashes'
In Git for Windows, "git clone \\server\share\path" etc. that uses
UNC paths from command line had bad interaction with its shell
emulation.

* js/mingw-unc-path-w-backslashes:
  mingw: special-case arguments to `sh`
  mingw (t5580): document bug when cloning from backslashed UNC paths
2019-02-05 14:26:13 -08:00
Junio C Hamano 33bea7358f Merge branch 'sg/obstack-cast-function-type-fix'
The compat/obstack code had casts that -Wcast-function-type
compilation option found questionable.

* sg/obstack-cast-function-type-fix:
  compat/obstack: fix -Wcast-function-type warnings
2019-02-05 14:26:11 -08:00
Torsten Bögershausen aab2a1ae48 Support working-tree-encoding "UTF-16LE-BOM"
Users who want UTF-16 files in the working tree set the .gitattributes
like this:
test.txt working-tree-encoding=UTF-16

The unicode standard itself defines 3 allowed ways how to encode UTF-16.
The following 3 versions convert all back to 'g' 'i' 't' in UTF-8:

a) UTF-16, without BOM, big endian:
$ printf "\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

b) UTF-16, with BOM, little endian:
$ printf "\377\376g\000i\000t\000" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

c) UTF-16, with BOM, big endian:
$ printf "\376\377\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000    g   i   t

Git uses libiconv to convert from UTF-8 in the index into ITF-16 in the
working tree.
After a checkout, the resulting file has a BOM and is encoded in "UTF-16",
in the version (c) above.
This is what iconv generates, more details follow below.

iconv (and libiconv) can generate UTF-16, UTF-16LE or UTF-16BE:

d) UTF-16
$ printf 'git' | iconv -f UTF-8 -t UTF-16 | od -c
0000000  376 377  \0   g  \0   i  \0   t

e) UTF-16LE
$ printf 'git' | iconv -f UTF-8 -t UTF-16LE | od -c
0000000    g  \0   i  \0   t  \0

f)  UTF-16BE
$ printf 'git' | iconv -f UTF-8 -t UTF-16BE | od -c
0000000   \0   g  \0   i  \0   t

There is no way to generate version (b) from above in a Git working tree,
but that is what some applications need.
(All fully unicode aware applications should be able to read all 3 variants,
but in practise we are not there yet).

When producing UTF-16 as an output, iconv generates the big endian version
with a BOM. (big endian is probably chosen for historical reasons).

iconv can produce UTF-16 files with little endianess by using "UTF-16LE"
as encoding, and that file does not have a BOM.

Not all users (especially under Windows) are happy with this.
Some tools are not fully unicode aware and can only handle version (b).

Today there is no way to produce version (b) with iconv (or libiconv).
Looking into the history of iconv, it seems as if version (c) will
be used in all future iconv versions (for compatibility reasons).

Solve this dilemma and introduce a Git-specific "UTF-16LE-BOM".
libiconv can not handle the encoding, so Git pick it up, handles the BOM
and uses libiconv to convert the rest of the stream.
(UTF-16BE-BOM is added for consistency)

Rported-by: Adrián Gimeno Balaguer <adrigibal@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-31 10:27:52 -08:00
Johannes Schindelin 72d63b2f45 mingw: be more generous when wrapping up the setitimer() emulation
Every once in a while, the Azure Pipeline fails with some semi-random

	error: timer thread did not terminate timely

This error message means that the thread that is used to emulate the
setitimer() function did not terminate within 1,000 milliseconds.

The most likely explanation (and therefore the one we should assume to
be true, according to Occam's Razor) is that the timeout of one second
is simply not enough because we try to run so many tasks in parallel.

So let's give it ten seconds instead of only one. That should be enough.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-29 09:26:46 -08:00
Junio C Hamano 9f2eba2b90 Merge branch 'rb/hpe'
Portability updates for the HPE NonStop platform.

* rb/hpe:
  compat/regex/regcomp.c: define intptr_t and uintptr_t on NonStop
  git-compat-util.h: add FLOSS headers for HPE NonStop
  config.mak.uname: support for modern HPE NonStop config.
  transport-helper: drop read/write errno checks
  transport-helper: use xread instead of read
2019-01-18 13:49:54 -08:00
Johannes Schindelin 9e9da23c27 mingw: special-case arguments to sh
The MSYS2 runtime does its best to emulate the command-line wildcard
expansion and de-quoting which would be performed by the calling Unix
shell on Unix systems.

Those Unix shell quoting rules differ from the quoting rules applying to
Windows' cmd and Powershell, making it a little awkward to quote
command-line parameters properly when spawning other processes.

In particular, git.exe passes arguments to subprocesses that are *not*
intended to be interpreted as wildcards, and if they contain
backslashes, those are not to be interpreted as escape characters, e.g.
when passing Windows paths.

Note: this is only a problem when calling MSYS2 executables, not when
calling MINGW executables such as git.exe. However, we do call MSYS2
executables frequently, most notably when setting the use_shell flag in
the child_process structure.

There is no elegant way to determine whether the .exe file to be
executed is an MSYS2 program or a MINGW one. But since the use case of
passing a command line through the shell is so prevalent, we need to
work around this issue at least when executing sh.exe.

Let's introduce an ugly, hard-coded test whether argv[0] is "sh", and
whether it refers to the MSYS2 Bash, to determine whether we need to
quote the arguments differently than usual.

That still does not fix the issue completely, but at least it is
something.

Incidentally, this also fixes the problem where `git clone \\server\repo`
failed due to incorrect handling of the backslashes when handing the path
to the git-upload-pack process.

Further, we need to take care to quote not only whitespace and
backslashes, but also curly brackets. As aliases frequently go through
the MSYS2 Bash, and as aliases frequently get parameters such as
HEAD@{yesterday}, this is really important. As an early version of this
patch broke this, let's make sure that this does not regress by adding a
test case for that.

Helped-by: Kim Gybels <kgybels@infogroep.be>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-18 13:12:14 -08:00
SZEDER Gábor 764473d257 compat/obstack: fix -Wcast-function-type warnings
GCC 8 introduced the new -Wcast-function-type warning, which is
implied by -Wextra (which, in turn is enabled in our DEVELOPER flags).
When building Git with GCC 8 and this warning enabled on a non-glibc
platform [1], one is greeted with a screenful of compiler
warnings/errors:

  compat/obstack.c: In function '_obstack_begin':
  compat/obstack.c:162:17: error: cast between incompatible function types from 'void * (*)(long int)' to 'struct _obstack_chunk * (*)(void *, long int)' [-Werror=cast-function-type]
     h->chunkfun = (struct _obstack_chunk * (*)(void *, long)) chunkfun;
                   ^
  compat/obstack.c:163:16: error: cast between incompatible function types from 'void (*)(void *)' to 'void (*)(void *, struct _obstack_chunk *)' [-Werror=cast-function-type]
     h->freefun = (void (*) (void *, struct _obstack_chunk *)) freefun;
                  ^
  compat/obstack.c:116:8: error: cast between incompatible function types from 'struct _obstack_chunk * (*)(void *, long int)' to 'struct _obstack_chunk * (*)(long int)' [-Werror=cast-function-type]
      : (*(struct _obstack_chunk *(*) (long)) (h)->chunkfun) ((size)))
          ^
  compat/obstack.c:168:22: note: in expansion of macro 'CALL_CHUNKFUN'
     chunk = h->chunk = CALL_CHUNKFUN (h, h -> chunk_size);
                        ^~~~~~~~~~~~~
  <snip>

'struct obstack' stores pointers to two functions to allocate and free
"chunks", and depending on how obstack is used, these functions take
either one parameter (like standard malloc() and free() do; this is
how we use it in 'kwset.c') or two parameters.  Presumably to reduce
memory footprint, a single field is used to store the function pointer
for both signatures, and then it's casted to the appropriate signature
when the function pointer is accessed.  These casts between function
pointers with different number of parameters are what trigger those
compiler errors.

Modify 'struct obstack' to use unions to store function pointers with
different signatures, and then use the union member with the
appropriate signature when accessing these function pointers.  This
eliminates the need for those casts, and thus avoids this compiler
error.

[1] Compiling 'compat/obstack.c' on a platform with glibc is sort of
    a noop, see the comment before '#  define ELIDE_CODE', so this is
    not an issue on common Linux distros.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-17 11:13:38 -08:00
Junio C Hamano 25d90d1cb7 Merge branch 'tb/use-common-win32-pathfuncs-on-cygwin'
Cygwin update.

* tb/use-common-win32-pathfuncs-on-cygwin:
  git clone <url> C:\cygwin\home\USER\repo' is working (again)
2019-01-14 15:29:32 -08:00
Randall S. Becker 0bdaacf66f compat/regex/regcomp.c: define intptr_t and uintptr_t on NonStop
The system definition header files on HPE NonStop do not define
intptr_t and uintptr_t as do other platforms. These typedefs
are added specifically wrapped in a __TANDEM ifdef.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-03 14:16:20 -08:00
Torsten Bögershausen 1cadad6f65 git clone <url> C:\cygwin\home\USER\repo' is working (again)
A regression for cygwin users was introduced with commit 05b458c,
 "real_path: resolve symlinks by hand".

In the the commit message we read:
  The current implementation of real_path uses chdir() in order to resolve
    symlinks.  Unfortunately this isn't thread-safe as chdir() affects a
      process as a whole...

The old (and non-thread-save) OS calls chdir()/pwd() had been
replaced by a string operation.
The cygwin layer "knows" that "C:\cygwin" is an absolute path,
but the new string operation does not.

"git clone <url> C:\cygwin\home\USER\repo" fails like this:
fatal: Invalid path '/home/USER/repo/C:\cygwin\home\USER\repo'

The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]

Extract the needed code into compat/win32/path-utils.[ch] and use it
for cygwin as well.

Reported-by: Steven Penny <svnpenn@gmail.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-26 15:26:17 -08:00
Junio C Hamano c0e286a3d2 Merge branch 'ss/msvc-strcasecmp'
MSVC update.

* ss/msvc-strcasecmp:
  msvc: directly use MS version (_stricmp) of strcasecmp
2018-12-01 21:41:45 +09:00
Junio C Hamano e1372c37de Merge branch 'js/mingw-o-append' into maint
Further fix for O_APPEND emulation on Windows

* js/mingw-o-append:
  mingw: fix mingw_open_append to work with named pipes
  t0051: test GIT_TRACE to a windows named pipe
2018-11-21 22:57:44 +09:00
Sven Strickroth d27525e519 msvc: directly use MS version (_stricmp) of strcasecmp
This also removes an implicit conversion from size_t (unsigned) to int (signed).

_stricmp as well as _strnicmp are both available since VS2012.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-20 10:57:13 +09:00
Junio C Hamano b8a9209193 Merge branch 'js/mingw-msdn-url'
The URL to an MSDN page in a comment has been updated.

* js/mingw-msdn-url:
  mingw: replace an obsolete link with the superseding one
2018-11-18 18:23:58 +09:00
Junio C Hamano c56aa1d280 Merge branch 'js/mingw-create-hard-link'
Windows update.

* js/mingw-create-hard-link:
  mingw: use `CreateHardLink()` directly
2018-11-18 18:23:58 +09:00
Johannes Schindelin 2ef2ae2917 mingw: replace an obsolete link with the superseding one
The MSDN documentation has been superseded by Microsoft Docs (which is
backed by a repository on GitHub containing many, many files in Markdown
format).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-16 14:57:54 +09:00
Loo Rong Jie c73b7ad548 win32: replace pthread_cond_*() with much simpler code
The Win32 CONDITION_VARIABLE has better performance and is easier to
maintain, as the code is a lot shorter now (the semantics of the
CONDITION_VARIABLE matches the pthread_cond_t very well).

Note: CONDITION_VARIABLE is not available in Windows XP and below,
but the declared minimal Windows version required to build and run
Git for Windows is Windows Vista (which is also beyond its
end-of-life, but for less long than Windows XP), so that's okay.

Signed-off-by: Loo Rong Jie <loorongjie@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-14 15:14:22 +09:00
Johannes Schindelin 1e1a876bb2 mingw: use CreateHardLink() directly
The function `CreateHardLink()` is available in all supported Windows
versions (even since Windows XP), so there is no more need to resolve it
at runtime.

Helped-by: Max Kirillov <max@max630.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-14 14:41:15 +09:00
Junio C Hamano 1961efecae Merge branch 'sh/mingw-safer-compat-poll'
Windows fix.

* sh/mingw-safer-compat-poll:
  poll: use GetTickCount64() to avoid wrap-around issues
2018-11-13 22:37:24 +09:00
Junio C Hamano 0474cd19ef Merge branch 'js/mingw-utf8-env'
Windows fix.

* js/mingw-utf8-env:
  mingw: reencode environment variables on the fly (UTF-16 <-> UTF-8)
  t7800: fix quoting
2018-11-13 22:37:21 +09:00
Junio C Hamano 6c268fdda9 Merge branch 'js/mingw-perl5lib'
Windows fix.

* js/mingw-perl5lib:
  mingw: unset PERL5LIB by default
  config: move Windows-specific config settings into compat/mingw.c
  config: allow for platform-specific core.* config settings
  config: rename `dummy` parameter to `cb` in git_default_config()
2018-11-13 22:37:20 +09:00
Junio C Hamano fbfdc07511 Merge branch 'js/mingw-isatty-and-dup2'
Windows fix.

* js/mingw-isatty-and-dup2:
  mingw: fix isatty() after dup2()
2018-11-13 22:37:20 +09:00
Junio C Hamano b9c3f062b6 Merge branch 'js/mingw-ns-filetime'
Windows port learned to use nano-second resolution file timestamps.

* js/mingw-ns-filetime:
  mingw: implement nanosecond-precision file times
  mingw: replace MSVCRT's fstat() with a Win32-based implementation
  mingw: factor out code to set stat() data
2018-11-06 15:50:21 +09:00
Steve Hoelzer e8dfcace31 poll: use GetTickCount64() to avoid wrap-around issues
The value of timeout starts as an int value, and for this reason it
cannot overflow unsigned long long aka ULONGLONG. The unsigned version
of this initial value is available in orig_timeout. The difference
(orig_timeout - elapsed) cannot wrap around because it is protected by
a conditional (as can be seen in the patch text). Hence, the ULONGLONG
difference can only have values that are smaller than the initial
timeout value and truncation to int cannot overflow.

Signed-off-by: Steve Hoelzer <shoelzer@gmail.com>
[j6t: improved both implementation and log message]
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-05 13:02:42 +09:00
Johannes Schindelin ff8978d533 mingw: fix isatty() after dup2()
Since a9b8a09c3c (mingw: replace isatty() hack, 2016-12-22), we handle
isatty() by special-casing the stdin/stdout/stderr file descriptors,
caching the return value. However, we missed the case where dup2()
overrides the respective file descriptor.

That poses a problem e.g. where the `show` builtin asks for a pager very
early, the `setup_pager()` function sets the pager depending on the
return value of `isatty()` and then redirects stdout. Subsequently,
`cmd_log_init_finish()` calls `setup_pager()` *again*. What should
happen now is that `isatty()` reports that stdout is *not* a TTY and
consequently stdout should be left alone.

Let's override dup2() to handle this appropriately.

This fixes https://github.com/git-for-windows/git/issues/1077

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:48:59 +09:00
Johannes Schindelin 0e218f91c2 mingw: unset PERL5LIB by default
Git for Windows ships with its own Perl interpreter, and insists on
using it, so it will most likely wreak havoc if PERL5LIB is set before
launching Git.

Let's just unset that environment variables when spawning processes.

To make this feature extensible (and overrideable), there is a new
config setting `core.unsetenvvars` that allows specifying a
comma-separated list of names to unset before spawning processes.

Reported by Gabriel Fuhrmann.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:46:32 +09:00
Johannes Schindelin bdfbb0ea93 config: move Windows-specific config settings into compat/mingw.c
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:46:27 +09:00
Johannes Schindelin 70fc5793df config: allow for platform-specific core.* config settings
In the Git for Windows project, we have ample precendent for config
settings that apply to Windows, and to Windows only.

Let's formalize this concept by introducing a platform_core_config()
function that can be #define'd in a platform-specific manner.

This will allow us to contain platform-specific code better, as the
corresponding variables no longer need to be exported so that they can
be defined in environment.c and be set in config.c

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:46:21 +09:00
Johannes Schindelin fe21c6b285 mingw: reencode environment variables on the fly (UTF-16 <-> UTF-8)
On Windows, the authoritative environment is encoded in UTF-16. In Git
for Windows, we convert that to UTF-8 (because UTF-16 is *such* a
foreign idea to Git that its source code is unprepared for it).

Previously, out of performance concerns, we converted the entire
environment to UTF-8 in one fell swoop at the beginning, and upon
putenv() and run_command() converted it back.

Having a private copy of the environment comes with its own perils: when
a library used by Git's source code tries to modify the environment, it
does not really work (in Git for Windows' case, libcurl, see
https://github.com/git-for-windows/git/compare/bcad1e6d58^...bcad1e6d58^2
for a glimpse of the issues).

Hence, it makes our environment handling substantially more robust if we
switch to on-the-fly-conversion in `getenv()`/`putenv()` calls. Based
on an initial version in the MSVC context by Jeff Hostetler, this patch
makes it so.

Surprisingly, this has a *positive* effect on speed: at the time when
the current code was written, we tested the performance, and there were
*so many* `getenv()` calls that it seemed better to convert everything
in one go. In the meantime, though, Git has obviously been cleaned up a
bit with regards to `getenv()` calls so that the Git processes spawned
by the test suite use an average of only 40 `getenv()`/`putenv()` calls
over the process lifetime.

Speaking of the entire test suite: the total time spent in the
re-encoding in the current code takes about 32.4 seconds (out of 113
minutes runtime), whereas the code introduced in this patch takes only
about 8.2 seconds in total. Not much, but it proves that we need not be
concerned about the performance impact introduced by this patch.

Helped-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 11:43:14 +09:00
Junio C Hamano a70d7827ac Merge branch 'cb/compat-mmap-is-private-read-only'
Code tightening.

* cb/compat-mmap-is-private-read-only:
  compat: make sure git_mmap is not expected to write
2018-10-30 15:43:49 +09:00
Junio C Hamano 97ffca6cc7 Merge branch 'js/mingw-load-sys-dll'
The way DLLs are loaded on the Windows port has been improved.

* js/mingw-load-sys-dll:
  mingw: load system libraries the recommended way
2018-10-30 15:43:48 +09:00
Junio C Hamano cc6748720f Merge branch 'js/mingw-getcwd'
The way the Windows port figures out the current directory has been
improved.

* js/mingw-getcwd:
  mingw: fix getcwd when the parent directory cannot be queried
  mingw: ensure `getcwd()` reports the correct case
2018-10-30 15:43:48 +09:00
Junio C Hamano 16ce0b92bd Merge branch 'js/mingw-default-ident'
The logic to select the default user name and e-mail on Windows has
been improved.

* js/mingw-default-ident:
  mingw: use domain information for default email
  getpwuid(mingw): provide a better default for the user name
  getpwuid(mingw): initialize the structure only once
2018-10-26 14:22:15 +09:00
Carlo Marcelo Arenas Belón bf1e6da791 compat: make sure git_mmap is not expected to write
in f48000fc ("Yank writing-back support from gitfakemmap.", 2005-10-08)
support for writting back changes was removed but the specific prot
flag that would be used was not checked for

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-25 18:51:03 +09:00
Anton Serbulov 4745feeb8b mingw: fix getcwd when the parent directory cannot be queried
`GetLongPathName()` function may fail when it is unable to query
the parent directory of a path component to determine the long name
for that component. It happens, because it uses `FindFirstFile()`
function for each next short part of path. The `FindFirstFile()`
requires `List Directory` and `Synchronize` desired access for a calling
process.

In case of lacking such permission for some part of path,
the `GetLongPathName()` returns 0 as result and `GetLastError()`
returns ERROR_ACCESS_DENIED.

`GetFinalPathNameByHandle()` function can help in such cases, because
it requires `Read Attributes` and `Synchronize` desired access to the
target path only.

The `GetFinalPathNameByHandle()` function was introduced on
`Windows Server 2008/Windows Vista`. So we need to load it dynamically.

`CreateFile()` parameters:
    `lpFileName` = path to the current directory
    `dwDesiredAccess` = 0 (it means `Read Attributes` and `Synchronize`)
    `dwShareMode` = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
                    (it prevents `Sharing Violation`)
    `lpSecurityAttributes` = NULL (default security attributes)
    `dwCreationDisposition` = OPEN_EXISTING
                              (required to obtain a directory handle)
    `dwFlagsAndAttributes` = FILE_FLAG_BACKUP_SEMANTICS
                             (required to obtain a directory handle)
    `hTemplateFile` = NULL (when opening an existing file or directory,
                            `CreateFile` ignores this parameter)

The string that is returned by `GetFinalPathNameByHandle()` function
uses the \\?\ syntax. To skip the prefix and convert backslashes
to slashes, the `normalize_ntpath()` mingw function will be used.

Note: `GetFinalPathNameByHandle()` function returns a final path.
It is the path that is returned when a path is fully resolved.
For example, for a symbolic link named "C:\tmp\mydir" that points to
"D:\yourdir", the final path would be "D:\yourdir".

Signed-off-by: Anton Serbulov <aserbulov@plesk.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 14:48:57 +09:00
Johannes Schindelin 937974fc65 mingw: ensure getcwd() reports the correct case
When switching the current working directory, say, in PowerShell, it is
quite possible to use a different capitalization than the one that is
recorded on disk. While doing the same in `cmd.exe` adjusts the
capitalization magically, that does not happen in PowerShell so that
`getcwd()` returns the current directory in a different way than is
recorded on disk.

Typically this creates no problems except when you call

	git log .

in a subdirectory called, say, "GIT/" but you switched to "Git/" and
your `getcwd()` reports the latter, then Git won't understand that you
wanted to see the history as per the `GIT/` subdirectory but it thinks you
wanted to see the history of some directory that may have existed in the
past (but actually never did).

So let's be extra careful to adjust the capitalization of the current
directory before working with it.

Reported by a few PowerShell power users ;-)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 14:48:38 +09:00
Johannes Schindelin c6f050a434 mingw: load system libraries the recommended way
When we access IPv6-related functions, we load the corresponding system
library using the `LoadLibrary()` function, which is not the recommended
way to load system libraries.

In practice, it does not make a difference: the `ws2_32.dll` library
containing the IPv6 functions is already loaded into memory, so
LoadLibrary() simply reuses the already-loaded library.

Still, recommended way is recommended way, so let's use that instead.

While at it, also adjust the code in contrib/ that loads system libraries.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 14:48:00 +09:00
Karsten Blees d7e8c87421 mingw: implement nanosecond-precision file times
We no longer use any of MSVCRT's stat-functions, so there's no need to
stick to a CRT-compatible 'struct stat' either.

Define and use our own POSIX-2013-compatible 'struct stat' with nanosecond-
precision file times.

Note: This can cause performance issues when using Git variants with
different file time resolutions, as the timestamps are stored in the Git
index: after updating the index with a Git variant that uses
second-precision file times, a nanosecond-aware Git will think that
pretty much every single file listed in the index is out of date.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 13:18:14 +09:00
Karsten Blees d75e697353 mingw: replace MSVCRT's fstat() with a Win32-based implementation
fstat() is the only stat-related CRT function for which we don't have a
full replacement yet (and thus the only reason to stick with MSVCRT's
'struct stat' definition).

Fully implement fstat(), in preparation of implementing a POSIX 2013
compatible 'struct stat' with nanosecond-precision file times.

This allows us also to implement some clever code to handle pipes and
character devices in our own way.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 13:18:14 +09:00
Johannes Schindelin 7bf1983886 mingw: factor out code to set stat() data
In our fstat() emulation, we convert the file metadata from Win32 data
structures to an emulated POSIX structure.

To structure the code better, let's factor that part out into its own
function.

Note: it would be tempting to try to unify this code with the part of
do_lstat() that does the same thing, but they operate on different data
structures: BY_HANDLE_FILE_INFORMATION vs WIN32_FILE_ATTRIBUTE_DATA. So
unfortunately, they cannot be unified.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 13:18:14 +09:00
Johannes Schindelin 501afcb8b0 mingw: use domain information for default email
When a user is registered in a Windows domain, it is really easy to
obtain the email address. So let's do that.

Suggested by Lutz Roeder.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-16 12:59:57 +09:00
Johannes Schindelin 564be791f3 getpwuid(mingw): provide a better default for the user name
We do have the excellent GetUserInfoEx() function to obtain more
detailed information of the current user (if the user is part of a
Windows domain); Let's use it.

Suggested by Lutz Roeder.

To avoid the cost of loading Secur32.dll (even lazily, loading DLLs
takes a non-neglibile amount of time), we use the established technique
to load DLLs only when, and if, needed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-16 12:59:57 +09:00
Johannes Schindelin 55b6513e73 getpwuid(mingw): initialize the structure only once
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-16 12:59:57 +09:00
Johannes Schindelin d7e357fb9c compat/poll: prepare for targeting Windows Vista
Windows Vista (and later) actually have a working poll(), but we still
cannot use it because it only works on sockets.

So let's detect when we are targeting Windows Vista and undefine those
constants, and define `pollfd` so that we can declare our own pollfd
struct.

We also need to make sure that we override those constants *after*
`winsock2.h` has been `#include`d (otherwise we would not really
override those constants).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-04 05:39:56 -07:00
Junio C Hamano e3d4ff037d Merge branch 'js/mingw-o-append'
Further fix for O_APPEND emulation on Windows

* js/mingw-o-append:
  mingw: fix mingw_open_append to work with named pipes
  t0051: test GIT_TRACE to a windows named pipe
2018-09-24 10:30:47 -07:00
Jeff Hostetler eeaf7ddac7 mingw: fix mingw_open_append to work with named pipes
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-11 13:54:54 -07:00
Junio C Hamano 5ade034464 Merge branch 'en/incl-forward-decl'
Code hygiene improvement for the header files.

* en/incl-forward-decl:
  Remove forward declaration of an enum
  compat/precompose_utf8.h: use more common include guard style
  urlmatch.h: fix include guard
  Move definition of enum branch_track from cache.h to branch.h
  alloc: make allocate_alloc_state and clear_alloc_state more consistent
  Add missing includes and forward declarations
2018-08-20 12:41:32 -07:00
Junio C Hamano d18d09bb81 Merge branch 'js/mingw-o-append'
Among the three codepaths we use O_APPEND to open a file for
appending, one used for writing GIT_TRACE output requires O_APPEND
implementation that behaves sensibly when multiple processes are
writing to the same file.  POSIX emulation used in the Windows port
has been updated to improve in this area.

* js/mingw-o-append:
  mingw: enable atomic O_APPEND
2018-08-20 11:33:52 -07:00
Elijah Newren f790d81252 compat/precompose_utf8.h: use more common include guard style
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15 11:52:09 -07:00
Johannes Sixt d641097589 mingw: enable atomic O_APPEND
The Windows CRT implements O_APPEND "manually": on write() calls, the
file pointer is set to EOF before the data is written. Clearly, this is
not atomic. And in fact, this is the root cause of failures observed in
t5552-skipping-fetch-negotiator.sh and t5503-tagfollow.sh, where
different processes write to the same trace file simultanously; it also
occurred in t5400-send-pack.sh, but there it was worked around in
71406ed4d6 ("t5400: avoid concurrent writes into a trace file",
2017-05-18).

Fortunately, Windows does support atomic O_APPEND semantics using the
file access mode FILE_APPEND_DATA. Provide an implementation that does.

This implementation is minimal in such a way that it only implements
the open modes that are actually used in the Git code base. Emulation
for other modes can be added as necessary later. To become aware of
the necessity early, the unusal error ENOSYS is reported if an
unsupported mode is encountered.

Diagnosed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 15:22:19 -07:00
Junio C Hamano b7d510e8d5 Merge branch 'nd/command-list'
Build doc update for Windows.

* nd/command-list:
  vcbuild/README: update to accommodate for missing common-cmds.h
2018-07-24 14:50:50 -07:00
Johannes Schindelin ede8d89bb1 vcbuild/README: update to accommodate for missing common-cmds.h
In 60f487ac0e (Remove common-cmds.h, 2018-05-10), we forgot to adjust
this README when removing the common-cmds.h file.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:30:34 -07:00
Johannes Schindelin 86e254584b mingw/msvc: use the new-style RUNTIME_PREFIX helper
This change also allows us to stop overriding argv[0] with the absolute
path of the executable, allowing us to preserve e.g. the case of the
executable's file name.

This fixes https://github.com/git-for-windows/git/issues/1496 partially.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 18:10:28 +09:00
Johannes Schindelin 9ee0540a40 mingw: abort on invalid strftime formats
On Windows, strftime() does not silently ignore invalid formats, but
warns about them and then returns 0 and sets errno to EINVAL.

Unfortunately, Git does not expect such a behavior, as it disagrees
with strftime()'s semantics on Linux. As a consequence, Git
misinterprets the return value 0 as "I need more space" and grows the
buffer. As the larger buffer does not fix the format, the buffer grows
and grows and grows until we are out of memory and abort.

Ideally, we would switch off the parameter validation just for
strftime(), but we cannot even override the invalid parameter handler
via _set_thread_local_invalid_parameter_handler() using MINGW because
that function is not declared. Even _set_invalid_parameter_handler(),
which *is* declared, does not help, as it simply does... nothing.

So let's just bite the bullet and override strftime() for MINGW and
abort on an invalid format string. While this does not provide the
best user experience, it is the best we can do.

See https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx for more
details.

This fixes https://github.com/git-for-windows/git/issues/863

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-19 10:53:17 -07:00
Junio C Hamano e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Junio C Hamano b2a276830f Merge branch 'tz/fsf-address-update' into maint
Replace the mailing address of FSF to a URL, as FSF prefers.

* tz/fsf-address-update:
  Replace Free Software Foundation address in license notices
  Replace Free Software Foundation address in license notices
2017-11-21 14:05:32 +09:00
Junio C Hamano f13b8ec25e Merge branch 'tz/fsf-address-update'
* tz/fsf-address-update:
  Replace Free Software Foundation address in license notices
  Replace Free Software Foundation address in license notices
2017-11-15 12:14:34 +09:00
Todd Zullinger 484257925f Replace Free Software Foundation address in license notices
The mailing address for the FSF has changed over the years.  Rather than
updating the address across all files, refer readers to gnu.org, as the
GNU GPL documentation now suggests for license notices.  The mailing
address is retained in the full license files (COPYING and LGPL-2.1).

The old address is still present in t/diff-lib/COPYING.  This is
intentional, as the file is used in tests and the contents are not
expected to change.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-09 13:21:21 +09:00
Johannes Schindelin 1a172e4ac1 mingw: optionally redirect stderr/stdout via the same handle
The "2>&1" notation in Powershell and in Unix shells implies that stderr
is redirected to the same handle into which stdout is already written.

Let's use this special value to allow the same trick with
GIT_REDIRECT_STDERR and GIT_REDIRECT_STDOUT: if the former's value is
`2>&1`, then stderr will simply be written to the same handle as stdout.

The functionality was suggested by Jeff Hostetler.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-02 11:19:43 +09:00
Johannes Schindelin 3f944424ac mingw: add experimental feature to redirect standard handles
Particularly when calling Git from applications, such as Visual Studio's
Team Explorer, it is important that stdin/stdout/stderr are closed
properly. However, when spawning processes on Windows, those handles
must be marked as inheritable if we want to use them, but that flag is a
global flag and may very well be used by other spawned processes which
then do not know to close those handles.

Let's introduce a set of environment variables (GIT_REDIRECT_STDIN and
friends) that specify paths to files, or even better, named pipes (which
are similar to Unix sockets) and that are used by the spawned Git
process.  This helps work around above-mentioned issue: those named
pipes will be opened in a non-inheritable way upon startup, and no
handles are passed around (and therefore no inherited handles need to be
closed by any spawned child).

This feature shipped with Git for Windows (marked as experimental) since
v2.11.0(2), so it has seen some serious testing in the meantime.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-02 11:19:41 +09:00
Junio C Hamano 96d4b17bd6 Merge branch 'rb/compat-poll-fix' into maint
Backports a moral equivalent of 2015 fix to the poll emulation from
the upstream gnulib to fix occasional breakages on HPE NonStop.

* rb/compat-poll-fix:
  poll.c: always set revents, even if to zero
2017-10-18 14:19:05 +09:00
Junio C Hamano 7245ee3d6c Merge branch 'ds/avoid-overflow-in-midpoint-computation'
Code clean-up.

* ds/avoid-overflow-in-midpoint-computation:
  cleanup: fix possible overflow errors in binary search
2017-10-11 14:52:24 +09:00