Commit graph

184 commits

Author SHA1 Message Date
Glen Choo a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Johannes Schindelin 17d3883fe9 setup: prepare for more detailed "dubious ownership" messages
When verifying the ownership of the Git directory, we sometimes would
like to say a bit more about it, e.g. when using a platform-dependent
code path (think: Windows has the permission model that is so different
from Unix'), but only when it is a appropriate to actually say
something.

To allow for that, collect that information and hand it back to the
caller (whose responsibility it is to show it or not).

Note: We do not actually fill in any platform-dependent information yet,
this commit just adds the infrastructure to be able to do so.

Based-on-an-idea-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 09:25:40 -07:00
Junio C Hamano 83937e9592 Merge branch 'ns/batch-fsync'
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.

* ns/batch-fsync:
  core.fsyncmethod: performance tests for batch mode
  t/perf: add iteration setup mechanism to perf-lib
  core.fsyncmethod: tests for batch mode
  test-lib-functions: add parsing helpers for ls-files and ls-tree
  core.fsync: use batch mode and sync loose objects by default on Windows
  unpack-objects: use the bulk-checkin infrastructure
  update-index: use the bulk-checkin infrastructure
  builtin/add: add ODB transaction around add_files_to_cache
  cache-tree: use ODB transaction around writing a tree
  core.fsyncmethod: batched disk flushes for loose-objects
  bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
  bulk-checkin: rename 'state' variable and separate 'plugged' boolean
2022-06-03 14:30:34 -07:00
Junio C Hamano f1b50ec6f8 Git 2.35.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmI8+U4ACgkQsLXohpav
 5ss7GA//YwNiPt8TY2bghKyt2qNiKKNl6sEq1MC0r4w68q5hGgmDjByvWA1K/4W5
 JLfhauBZB6ucx9zrcr6v6nr+a9/y4MC/bEq0Mbw16asyipyrZ0/e4FLsY4A8lVcw
 5vX81LMj5vV7NwHVJiHQ7Qxcyu5ZBCV0UyqIJiIwqXquXMF6UE22dLsraCopIJ3Z
 lLhqf2XgDVSfICvi55e6xgzeVHknJ0CtN8+nOvDmeZmTkjpGK9xPUlHoD9zn8/kN
 Fyfn8fdAwn3+0Yw9HF1i78+WL/btuHebIpCFt0DNHOX0SkBTMpwwMIv0hu83yVb2
 mMfqhDwzkdUWeZsne5gtF2ZunF1hWa0e9a9bZ3IgHojZ1BFMzGusIPR6K//IWKrJ
 PQUdqb7i1lD4IZePrPseN6dPKQQskbBSsw0zSLOBYIhFc4AK5VoZIDHDkVUtMbLH
 Y/eAViGGSfX6WfRTTiZvyZOqJg06fS2z/aQBfO6oKw1J9iTJDUW+5R/IZHqZcLo1
 xe+P1r4mJzsRLspOODJvhJxIpE3aoW0H3/88nUiA3FMz7Qt9aPsgDwtl7p3WyZwu
 bP+FLuoRNEvb1mgz1Y7qXz5/agz/8CxfQFR7oJLi/qGjX6xXVLd1ZIVKiy04awbw
 AEWEWsm64uSOMH3tOzH2J7dfpykSADxNMEzt2SVrRH/UIVvlRa4=
 =f9iS
 -----END PGP SIGNATURE-----

Merge tag 'v2.35.2'
2022-04-11 16:44:45 -07:00
Neeraj Singh 8a94d83349 core.fsync: use batch mode and sync loose objects by default on Windows
Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch mode
in upstream Git so that we can get broad coverage of the new code
upstream.

We don't actually do fsyncs in the most of the test suite, since
GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
surrounding batch mode code since GIT_TEST_FSYNC merely makes the
maybe_fsync wrapper always appear to succeed.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-06 13:13:26 -07:00
Johannes Schindelin 6a2381a3e5 Sync with 2.30.3
* maint-2.30:
  Git 2.30.3
  setup_git_directory(): add an owner check for the top-level directory
  Add a function to determine whether a path is owned by the current user
2022-03-24 00:24:29 +01:00
Johannes Schindelin bdc77d1d68 Add a function to determine whether a path is owned by the current user
This function will be used in the next commit to prevent
`setup_git_directory()` from discovering a repository in a directory
that is owned by someone other than the current user.

Note: We cannot simply use `st.st_uid` on Windows just like we do on
Linux and other Unix-like platforms: according to
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/stat-functions
this field is always zero on Windows (because Windows' idea of a user ID
does not fit into a single numerical value). Therefore, we have to do
something a little involved to replicate the same functionality there.

Also note: On Windows, a user's home directory is not actually owned by
said user, but by the administrator. For all practical purposes, it is
under the user's control, though, therefore we pretend that it is owned
by the user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-21 13:16:26 +01:00
Neeraj Singh abf38abec2 core.fsyncmethod: add writeout-only mode
This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.

The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.

Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.

When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the
caller is expected to do an ordinary fsync as needed.

On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change updates fsync to do
fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
with existing behavior on Apple platforms by setting the default value
of the new core.fsyncMethod option.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10 15:10:22 -08:00
Jeff King 2b08101204 Makefile: add OPEN_RETURNS_EINTR knob
On some platforms, open() reportedly returns EINTR when opening regular
files and we receive a signal (usually SIGALRM from our progress meter).
This shouldn't happen, as open() should be a restartable syscall, and we
specify SA_RESTART when setting up the alarm handler. So it may actually
be a kernel or libc bug for this to happen. But it has been reported on
at least one version of Linux (on a network filesystem):

  https://lore.kernel.org/git/c8061cce-71e4-17bd-a56a-a5fed93804da@neanderfunk.de/

as well as on macOS starting with Big Sur even on a regular filesystem.

We can work around it by retrying open() calls that get EINTR, just as
we do for read(), etc. Since we don't ever _want_ to interrupt an open()
call, we can get away with just redefining open, rather than insisting
all callsites use xopen().

We actually do have an xopen() wrapper already (and it even does this
retry, though there's no indication of it being an observed problem back
then; it seems simply to have been lifted from xread(), etc). But it is
used hardly anywhere, and isn't suitable for general use because it will
die() on error. In theory we could combine the two, but it's awkward to
do so because of the variable-args interface of open().

This patch adds a Makefile knob for enabling the workaround. It's not
enabled by default for any platforms in config.mak.uname yet, as we
don't have enough data to decide how common this is (I have not been
able to reproduce on either Linux or Big Sur myself). It may be worth
enabling preemptively anyway, since the cost is pretty low (if we don't
see an EINTR, it's just an extra conditional).

However, note that we must not enable this on Windows. It doesn't do
anything there, and the macro overrides the existing mingw_open()
redirection. I've added a preemptive #undef here in the mingw header
(which is processed first) to just quietly disable it (we could also
make it an #error, but there is little point in being so aggressive).

Reported-by: Aleksey Kliger <alklig@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:15:51 -08:00
Denton Liu fcedb379fd compat/mingw.h: drop extern from function declaration
In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), `extern` on function declarations were declared to
be redundant and thus removed from the codebase. An `extern` was
accidentally reintroduced in 08809c09aa (mingw: add a helper function to
attach GDB to the current process, 2020-02-13).

Remove this spurious `extern`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 09:55:20 -07:00
Junio C Hamano e154451a2f Merge branch 'js/mingw-open-in-gdb'
Dev support.

* js/mingw-open-in-gdb:
  mingw: add a helper function to attach GDB to the current process
2020-02-17 13:22:18 -08:00
Johannes Schindelin 08809c09aa mingw: add a helper function to attach GDB to the current process
When debugging Git, the criss-cross spawning of processes can make
things quite a bit difficult, especially when a Unix shell script is
thrown in the mix that calls a `git.exe` that then segfaults.

To help debugging such things, we introduce the `open_in_gdb()` function
which can be called at a code location where the segfault happens (or as
close as one can get); This will open a new MinTTY window with a GDB
that already attached to the current process.

Inspired by Derrick Stolee.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-14 10:02:07 -08:00
Junio C Hamano 13432fc6dd Merge branch 'js/mingw-reserved-filenames'
Forbid pathnames that the platform's filesystem cannot represent on
MinGW.

* js/mingw-reserved-filenames:
  mingw: refuse paths containing reserved names
  mingw: short-circuit the conversion of `/dev/null` to UTF-16
2020-01-02 12:38:30 -08:00
Johannes Schindelin 4dc42c6c18 mingw: refuse paths containing reserved names
There are a couple of reserved names that cannot be file names on
Windows, such as `AUX`, `NUL`, etc. For an almost complete list, see
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

If one would try to create a directory named `NUL`, it would actually
"succeed", i.e. the call would return success, but nothing would be
created.

Worse, even adding a file extension to the reserved name does not make
it a valid file name. To understand the rationale behind that behavior,
see https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073

Let's just disallow them all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-21 16:09:07 -08:00
Junio C Hamano 7034cd094b Sync with Git 2.24.1 2019-12-09 22:17:55 -08:00
Johannes Schindelin 67af91c47a Sync with 2.23.1
* maint-2.23: (44 commits)
  Git 2.23.1
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  ...
2019-12-06 16:31:39 +01:00
Johannes Schindelin 7fd9fd94fb Sync with 2.22.2
* maint-2.22: (43 commits)
  Git 2.22.2
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  ...
2019-12-06 16:31:30 +01:00
Johannes Schindelin 5421ddd8d0 Sync with 2.21.1
* maint-2.21: (42 commits)
  Git 2.21.1
  mingw: sh arguments need quoting in more circumstances
  mingw: fix quoting of empty arguments for `sh`
  mingw: use MSYS2 quoting even when spawning shell scripts
  mingw: detect when MSYS2's sh is to be spawned more robustly
  t7415: drop v2.20.x-specific work-around
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  ...
2019-12-06 16:31:23 +01:00
Johannes Schindelin fc346cb292 Sync with 2.20.2
* maint-2.20: (36 commits)
  Git 2.20.2
  t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  ...
2019-12-06 16:31:12 +01:00
Johannes Schindelin d851d94151 Sync with 2.19.3
* maint-2.19: (34 commits)
  Git 2.19.3
  Git 2.18.2
  Git 2.17.3
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  ...
2019-12-06 16:30:49 +01:00
Johannes Schindelin bdfef0492c Sync with 2.16.6
* maint-2.16: (31 commits)
  Git 2.16.6
  test-drop-caches: use `has_dos_drive_prefix()`
  Git 2.15.4
  Git 2.14.6
  mingw: handle `subst`-ed "DOS drives"
  mingw: refuse to access paths with trailing spaces or periods
  mingw: refuse to access paths with illegal characters
  unpack-trees: let merged_entry() pass through do_add_entry()'s errors
  quote-stress-test: offer to test quoting arguments for MSYS2 sh
  t6130/t9350: prepare for stringent Win32 path validation
  quote-stress-test: allow skipping some trials
  quote-stress-test: accept arguments to test via the command-line
  tests: add a helper to stress test argument quoting
  mingw: fix quoting of arguments
  Disallow dubiously-nested submodule git directories
  protect_ntfs: turn on NTFS protection by default
  path: also guard `.gitmodules` against NTFS Alternate Data Streams
  is_ntfs_dotgit(): speed it up
  mingw: disallow backslash characters in tree objects' file names
  path: safeguard `.git` against NTFS Alternate Streams Accesses
  ...
2019-12-06 16:27:36 +01:00
Johannes Schindelin f82a97eb91 mingw: handle subst-ed "DOS drives"
Over a decade ago, in 25fe217b86 (Windows: Treat Windows style path
names., 2008-03-05), Git was taught to handle absolute Windows paths,
i.e. paths that start with a drive letter and a colon.

Unbeknownst to us, while drive letters of physical drives are limited to
letters of the English alphabet, there is a way to assign virtual drive
letters to arbitrary directories, via the `subst` command, which is
_not_ limited to English letters.

It is therefore possible to have absolute Windows paths of the form
`1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode
letters can also be used, e.g. `ä:\tschibät.sch`.

While it can be sensibly argued that users who set up such funny drive
letters really seek adverse consequences, the Windows Operating System
is known to be a platform where many users are at the mercy of
administrators who have their very own idea of what constitutes a
reasonable setup.

Therefore, let's just make sure that such funny paths are still
considered absolute paths by Git, on Windows.

In addition to Unicode characters, pretty much any character is a valid
drive letter, as far as `subst` is concerned, even `:` and `"` or even a
space character. While it is probably the opposite of smart to use them,
let's safeguard `is_dos_drive_prefix()` against all of them.

Note: `[::1]:repo` is a valid URL, but not a valid path on Windows.
As `[` is now considered a valid drive letter, we need to be very
careful to avoid misinterpreting such a string as valid local path in
`url_is_local_not_ssh()`. To do that, we use the just-introduced
function `is_valid_path()` (which will label the string as invalid file
name because of the colon characters).

This fixes CVE-2019-1351.

Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:07 +01:00
Johannes Schindelin 817ddd64c2 mingw: refuse to access paths with illegal characters
Certain characters are not admissible in file names on Windows, even if
Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they
are, e.g. `:`, `<`, `>`, etc

Let's disallow those characters explicitly in Windows builds of Git.

Note: just like trailing spaces or periods, it _is_ possible on Windows
to create commits adding files with such illegal characters, as long as
the operation leaves the worktree untouched. To allow for that, we
continue to guard `is_valid_win32_path()` behind the config setting
`core.protectNTFS`, so that users _can_ continue to do that, as long as
they turn the protections off via that config setting.

Among other problems, this prevents Git from trying to write to an "NTFS
Alternate Data Stream" (which refers to metadata stored alongside a
file, under a special name: "<filename>:<stream-name>"). This fix
therefore also prevents an attack vector that was exploited in
demonstrations of a number of recently-fixed security bugs.

Further reading on illegal characters in Win32 filenames:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:06 +01:00
Johannes Schindelin d2c84dad1c mingw: refuse to access paths with trailing spaces or periods
When creating a directory on Windows whose path ends in a space or a
period (or chains thereof), the Win32 API "helpfully" trims those. For
example, `mkdir("abc ");` will return success, but actually create a
directory called `abc` instead.

This stems back to the DOS days, when all file names had exactly 8
characters plus exactly 3 characters for the file extension, and the
only way to have shorter names was by padding with spaces.

Sadly, this "helpful" behavior is a bit inconsistent: after a successful
`mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because
the directory `abc ` does not actually exist).

Even if it would work, we now have a serious problem because a Git
repository could contain directories `abc` and `abc `, and on Windows,
they would be "merged" unintentionally.

As these paths are illegal on Windows, anyway, let's disallow any
accesses to such paths on that Operating System.

For practical reasons, this behavior is still guarded by the
config setting `core.protectNTFS`: it is possible (and at least two
regression tests make use of it) to create commits without involving the
worktree. In such a scenario, it is of course possible -- even on
Windows -- to create such file names.

Among other consequences, this patch disallows submodules' paths to end
in spaces on Windows (which would formerly have confused Git enough to
try to write into incorrect paths, anyway).

While this patch does not fix a vulnerability on its own, it prevents an
attack vector that was exploited in demonstrations of a number of
recently-fixed security bugs.

The regression test added to `t/t7417-submodule-path-url.sh` reflects
that attack vector.

Note that we have to adjust the test case "prevent git~1 squatting on
Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue.
It tries to clone two submodules whose names differ only in a trailing
period character, and as a consequence their git directories differ in
the same way. Previously, when Git tried to clone the second submodule,
it thought that the git directory already existed (because on Windows,
when you create a directory with the name `b.` it actually creates `b`),
but with this patch, the first submodule's clone will fail because of
the illegal name of the git directory. Therefore, when cloning the
second submodule, Git will take a different code path: a fresh clone
(without an existing git directory). Both code paths fail to clone the
second submodule, both because the the corresponding worktree directory
exists and is not empty, but the error messages are worded differently.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05 15:37:06 +01:00
Elijah Newren 15beaaa3d1 Fix spelling errors in code comments
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-10 16:00:54 +09:00
Denton Liu 7027f508c7 compat/*.[ch]: remove extern from function declarations using spatch
In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), we removed externs from function declarations using
spatch but we intentionally excluded files under compat/ since some are
directly copied from an upstream and we should avoid churning them so
that manually merging future updates will be simpler.

In the last commit, we determined the files which taken from an upstream
so we can exclude them and run spatch on the remainder.

This was the Coccinelle patch used:

	@@
	type T;
	identifier f;
	@@
	- extern
	  T f(...);

and it was run with:

	$ git ls-files compat/\*\*.{c,h} |
		xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
	$ git checkout -- \
		compat/regex/ \
		compat/inet_ntop.c \
		compat/inet_pton.c \
		compat/nedmalloc/ \
		compat/obstack.{c,h} \
		compat/poll/

Coccinelle has some trouble dealing with `__attribute__` and varargs so
we ran the following to ensure that no remaining changes were left
behind:

	$ git ls-files compat/\*\*.{c,h} |
		xargs sed -i'' -e 's/^\(\s*\)extern \([^(]*([^*]\)/\1\2/'
	$ git checkout -- \
		compat/regex/ \
		compat/inet_ntop.c \
		compat/inet_pton.c \
		compat/nedmalloc/ \
		compat/obstack.{c,h} \
		compat/poll/

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05 11:05:45 -07:00
Jeff Hostetler 172e54e2d7 msvc: do not re-declare the timespec struct
VS2015's headers already declare that struct.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Johannes Schindelin 396ff7547d mingw: replace mingw_startup() hack
Git for Windows has special code to retrieve the command-line parameters
(and even the environment) in UTF-16 encoding, so that they can be
converted to UTF-8. This is necessary because Git for Windows wants to
use UTF-8 encoded strings throughout its code, and the main() function
does not get the parameters in that encoding.

To do that, we used the __wgetmainargs() function, which is not even a
Win32 API function, but provided by the MINGW "runtime" instead.

Obviously, this method would not work with any compiler other than GCC,
and in preparation for compiling with Visual C++, we would like to avoid
precisely that.

Lucky us, there is a much more elegant way: we can simply implement the
UTF-16 variant of `main()`: `wmain()`.

To make that work, we need to link with -municode. The command-line
parameters are passed to `wmain()` encoded in UTF-16, as desired, and
this method also works with GCC, and also with Visual C++ after
adjusting the MSVC linker flags to force it to use `wmain()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20 14:03:05 -07:00
Tanushree Tumane b9f0193b25 mingw: remove obsolete IPv6-related code
To support IPv6, Git provided fall back functions for Windows versions
that did not support IPv6. However, as Git dropped support for Windows
XP and prior, those functions are not needed anymore.

Remove those fallbacks by reverting fe3b2b7b82 (Enable support for
IPv6 on MinGW, 2009-11-24) and using the functions directly (without
'ipv6_' prefix).

Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-07 18:42:28 +09:00
Jeff Hostetler ee4512ed48 trace2: create new combined trace facility
Create a new unified tracing facility for git.  The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.

In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.

Trace2 defines 3 output targets.  These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".  These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).

* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
  summary data.

* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
  It extends the output with columns for the command process, thread,
  repo, absolute and relative elapsed times.  It reports events for
  child process start/stop, thread start/stop, and per-thread function
  nesting.

* GIT_TR2_EVENT is a new structured format. It writes event data as a
  series of JSON records.

Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22 15:27:59 -08:00
Junio C Hamano 8593e8a618 Merge branch 'js/mingw-host-cpu'
Windows update.

* js/mingw-host-cpu:
  mingw: use a more canonical method to fix the CPU reporting
2019-02-13 18:18:43 -08:00
Johannes Schindelin bb02e7a560 mingw: use a more canonical method to fix the CPU reporting
In `git version --build-options`, we report also the CPU, but in Git for
Windows we actually cross-compile the 32-bit version in a 64-bit Git for
Windows, so we cannot rely on the auto-detected value.

In 3815f64b0d (mingw: fix CPU reporting in `git version
--build-options`, 2019-02-07), we fixed this by a Windows-only
workaround, making use of magic pre-processor constants, which works in
GCC, but most likely not all C compilers.

As pointed out by Eric Sunshine, there is a better way, anyway: to set
the Makefile variable HOST_CPU explicitly for cross-compiled Git. So
let's do that!

This reverts commit 3815f64b0d partially.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-13 13:46:58 -08:00
Junio C Hamano 6951c5fd9f Merge branch 'js/mingw-host-cpu'
Windows update.

* js/mingw-host-cpu:
  mingw: fix CPU reporting in `git version --build-options`
2019-02-08 20:44:53 -08:00
Johannes Schindelin 3815f64b0d mingw: fix CPU reporting in git version --build-options
We cannot rely on `uname -m` in Git for Windows' SDK to tell us what
architecture we are compiling for, as we can compile both 32-bit and
64-bit `git.exe` from a 64-bit SDK, but the `uname -m` in that SDK will
always report `x86_64`.

So let's go back to our original design. And make it explicitly
Windows-specific.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-07 13:05:37 -08:00
Junio C Hamano 25d90d1cb7 Merge branch 'tb/use-common-win32-pathfuncs-on-cygwin'
Cygwin update.

* tb/use-common-win32-pathfuncs-on-cygwin:
  git clone <url> C:\cygwin\home\USER\repo' is working (again)
2019-01-14 15:29:32 -08:00
Torsten Bögershausen 1cadad6f65 git clone <url> C:\cygwin\home\USER\repo' is working (again)
A regression for cygwin users was introduced with commit 05b458c,
 "real_path: resolve symlinks by hand".

In the the commit message we read:
  The current implementation of real_path uses chdir() in order to resolve
    symlinks.  Unfortunately this isn't thread-safe as chdir() affects a
      process as a whole...

The old (and non-thread-save) OS calls chdir()/pwd() had been
replaced by a string operation.
The cygwin layer "knows" that "C:\cygwin" is an absolute path,
but the new string operation does not.

"git clone <url> C:\cygwin\home\USER\repo" fails like this:
fatal: Invalid path '/home/USER/repo/C:\cygwin\home\USER\repo'

The solution is to implement has_dos_drive_prefix(), skip_dos_drive_prefix()
is_dir_sep(), offset_1st_component() and convert_slashes() for cygwin
in the same way as it is done in 'Git for Windows' in compat/mingw.[ch]

Extract the needed code into compat/win32/path-utils.[ch] and use it
for cygwin as well.

Reported-by: Steven Penny <svnpenn@gmail.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-26 15:26:17 -08:00
Junio C Hamano 0474cd19ef Merge branch 'js/mingw-utf8-env'
Windows fix.

* js/mingw-utf8-env:
  mingw: reencode environment variables on the fly (UTF-16 <-> UTF-8)
  t7800: fix quoting
2018-11-13 22:37:21 +09:00
Junio C Hamano 6c268fdda9 Merge branch 'js/mingw-perl5lib'
Windows fix.

* js/mingw-perl5lib:
  mingw: unset PERL5LIB by default
  config: move Windows-specific config settings into compat/mingw.c
  config: allow for platform-specific core.* config settings
  config: rename `dummy` parameter to `cb` in git_default_config()
2018-11-13 22:37:20 +09:00
Junio C Hamano fbfdc07511 Merge branch 'js/mingw-isatty-and-dup2'
Windows fix.

* js/mingw-isatty-and-dup2:
  mingw: fix isatty() after dup2()
2018-11-13 22:37:20 +09:00
Junio C Hamano b9c3f062b6 Merge branch 'js/mingw-ns-filetime'
Windows port learned to use nano-second resolution file timestamps.

* js/mingw-ns-filetime:
  mingw: implement nanosecond-precision file times
  mingw: replace MSVCRT's fstat() with a Win32-based implementation
  mingw: factor out code to set stat() data
2018-11-06 15:50:21 +09:00
Johannes Schindelin ff8978d533 mingw: fix isatty() after dup2()
Since a9b8a09c3c (mingw: replace isatty() hack, 2016-12-22), we handle
isatty() by special-casing the stdin/stdout/stderr file descriptors,
caching the return value. However, we missed the case where dup2()
overrides the respective file descriptor.

That poses a problem e.g. where the `show` builtin asks for a pager very
early, the `setup_pager()` function sets the pager depending on the
return value of `isatty()` and then redirects stdout. Subsequently,
`cmd_log_init_finish()` calls `setup_pager()` *again*. What should
happen now is that `isatty()` reports that stdout is *not* a TTY and
consequently stdout should be left alone.

Let's override dup2() to handle this appropriately.

This fixes https://github.com/git-for-windows/git/issues/1077

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:48:59 +09:00
Johannes Schindelin 70fc5793df config: allow for platform-specific core.* config settings
In the Git for Windows project, we have ample precendent for config
settings that apply to Windows, and to Windows only.

Let's formalize this concept by introducing a platform_core_config()
function that can be #define'd in a platform-specific manner.

This will allow us to contain platform-specific code better, as the
corresponding variables no longer need to be exported so that they can
be defined in environment.c and be set in config.c

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 12:46:21 +09:00
Johannes Schindelin fe21c6b285 mingw: reencode environment variables on the fly (UTF-16 <-> UTF-8)
On Windows, the authoritative environment is encoded in UTF-16. In Git
for Windows, we convert that to UTF-8 (because UTF-16 is *such* a
foreign idea to Git that its source code is unprepared for it).

Previously, out of performance concerns, we converted the entire
environment to UTF-8 in one fell swoop at the beginning, and upon
putenv() and run_command() converted it back.

Having a private copy of the environment comes with its own perils: when
a library used by Git's source code tries to modify the environment, it
does not really work (in Git for Windows' case, libcurl, see
https://github.com/git-for-windows/git/compare/bcad1e6d58^...bcad1e6d58^2
for a glimpse of the issues).

Hence, it makes our environment handling substantially more robust if we
switch to on-the-fly-conversion in `getenv()`/`putenv()` calls. Based
on an initial version in the MSVC context by Jeff Hostetler, this patch
makes it so.

Surprisingly, this has a *positive* effect on speed: at the time when
the current code was written, we tested the performance, and there were
*so many* `getenv()` calls that it seemed better to convert everything
in one go. In the meantime, though, Git has obviously been cleaned up a
bit with regards to `getenv()` calls so that the Git processes spawned
by the test suite use an average of only 40 `getenv()`/`putenv()` calls
over the process lifetime.

Speaking of the entire test suite: the total time spent in the
re-encoding in the current code takes about 32.4 seconds (out of 113
minutes runtime), whereas the code introduced in this patch takes only
about 8.2 seconds in total. Not much, but it proves that we need not be
concerned about the performance impact introduced by this patch.

Helped-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31 11:43:14 +09:00
Karsten Blees d7e8c87421 mingw: implement nanosecond-precision file times
We no longer use any of MSVCRT's stat-functions, so there's no need to
stick to a CRT-compatible 'struct stat' either.

Define and use our own POSIX-2013-compatible 'struct stat' with nanosecond-
precision file times.

Note: This can cause performance issues when using Git variants with
different file time resolutions, as the timestamps are stored in the Git
index: after updating the index with a Git variant that uses
second-precision file times, a nanosecond-aware Git will think that
pretty much every single file listed in the index is out of date.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24 13:18:14 +09:00
Johannes Schindelin 501afcb8b0 mingw: use domain information for default email
When a user is registered in a Windows domain, it is really easy to
obtain the email address. So let's do that.

Suggested by Lutz Roeder.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-16 12:59:57 +09:00
Johannes Schindelin 9ee0540a40 mingw: abort on invalid strftime formats
On Windows, strftime() does not silently ignore invalid formats, but
warns about them and then returns 0 and sets errno to EINVAL.

Unfortunately, Git does not expect such a behavior, as it disagrees
with strftime()'s semantics on Linux. As a consequence, Git
misinterprets the return value 0 as "I need more space" and grows the
buffer. As the larger buffer does not fix the format, the buffer grows
and grows and grows until we are out of memory and abort.

Ideally, we would switch off the parameter validation just for
strftime(), but we cannot even override the invalid parameter handler
via _set_thread_local_invalid_parameter_handler() using MINGW because
that function is not declared. Even _set_invalid_parameter_handler(),
which *is* declared, does not help, as it simply does... nothing.

So let's just bite the bullet and override strftime() for MINGW and
abort on an invalid format string. While this does not provide the
best user experience, it is the best we can do.

See https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx for more
details.

This fixes https://github.com/git-for-windows/git/issues/863

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-19 10:53:17 -07:00
Junio C Hamano 7d26aa3230 Merge branch 'js/bs-is-a-dir-sep-on-windows'
"foo\bar\baz" in "git fetch foo\bar\baz", even though there is no
slashes in it, cannot be a nickname for a remote on Windows, as
that is likely to be a pathname on a local filesystem.

* js/bs-is-a-dir-sep-on-windows:
  Windows: do not treat a path with backslashes as a remote's nick name
  mingw.h: permit arguments with side effects for is_dir_sep
2017-06-02 15:05:58 +09:00
Johannes Sixt e20b5b5909 mingw.h: permit arguments with side effects for is_dir_sep
Taking git-compat-util.h's cue (which uses an inline function to back
is_dir_sep()), let's use an inline function to back also the Windows
version of is_dir_sep(). This avoids problems when calling the function
with arguments that do more than just provide a single character, e.g.
incrementing a pointer. Example:

    is_dir_sep(*p++)

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-23 21:42:14 +09:00
Johannes Schindelin cbb3f3c9b1 mingw: intercept isatty() to handle /dev/null as Git expects it
When Git's source code calls isatty(), it really asks whether the
respective file descriptor is connected to an interactive terminal.

Windows' _isatty() function, however, determines whether the file
descriptor is associated with a character device. And NUL, Windows'
equivalent of /dev/null, is a character device.

Which means that for years, Git mistakenly detected an associated
interactive terminal when being run through the test suite, which
almost always redirects stdin, stdout and stderr to /dev/null.

This bug only became obvious, and painfully so, when the new
bisect--helper entered the `pu` branch and made the automatic build & test
time out because t6030 was waiting for an answer.

For details, see

	https://msdn.microsoft.com/en-us/library/f4s0ddew.aspx

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-11 16:15:46 -08:00
Junio C Hamano c0e8b3b444 Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile' into maint
The tempfile (hence its user lockfile) API lets the caller to open
a file descriptor to a temporary file, write into it and then
finalize it by first closing the filehandle and then either
removing or renaming the temporary file.  When the process spawns a
subprocess after obtaining the file descriptor, and if the
subprocess has not exited when the attempt to remove or rename is
made, the last step fails on Windows, because the subprocess has
the file descriptor still open.  Open tempfile with O_CLOEXEC flag
to avoid this (on Windows, this is mapped to O_NOINHERIT).

* bw/mingw-avoid-inheriting-fd-to-lockfile:
  mingw: ensure temporary file handles are not inherited by child processes
  t6026-merge-attr: child processes must not inherit index.lock handles
2016-09-08 21:35:56 -07:00