Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
Move functions that are not about pure string manipulation out of
strbuf.[ch]
* cw/strbuf-cleanup:
strbuf: remove global variable
path: move related function to path
object-name: move related functions to object-name
credential-store: move related functions to credential-store file
abspath: move related functions to abspath
strbuf: clarify dependency
strbuf: clarify API boundary
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move path-related function from strbuf.[ch] to path.[ch] so that strbuf
is focused on string manipulation routines with minimal dependencies.
repository.h is no longer a necessary dependency after moving this
function out.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache.h's nature of a dumping ground of includes prevented it from
being included in some compat/ files, forcing us into a workaround
of having a double forward declaration of the read_in_full() function
(see commit 14086b0a13 ("compat/pread.c: Add a forward declaration to
fix a warning", 2007-11-17)). Now that we have moved functions like
read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered
with unrelated and scary #defines, get rid of the extra forward
declaration and just have compat/pread.c include wrapper.h.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By moving several declarations to setup.h, the previous patch made it
possible to remove the include of cache.h in several source files. Do
so.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another step towards letting us remove the include of cache.h in
strbuf.c. It does mean that we also need to add includes of abspath.h
in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More work towards -Wunused.
* jk/unused-post-2.39-part2: (21 commits)
help: mark unused parameter in git_unknown_cmd_config()
run_processes_parallel: mark unused callback parameters
userformat_want_item(): mark unused parameter
for_each_commit_graft(): mark unused callback parameter
rewrite_parents(): mark unused callback parameter
fetch-pack: mark unused parameter in callback function
notes: mark unused callback parameters
prio-queue: mark unused parameters in comparison functions
for_each_object: mark unused callback parameters
list-objects: mark unused callback parameters
mark unused parameters in signal handlers
run-command: mark error routine parameters as unused
mark "pointless" data pointers in callbacks
ref-filter: mark unused callback parameters
http-backend: mark unused parameters in virtual functions
http-backend: mark argc/argv unused
object-name: mark unused parameters in disambiguate callbacks
serve: mark unused parameters in virtual functions
serve: use repository pointer to get config
ls-refs: drop config caching
...
Both the object_array_filter() and trie_find() functions use callback
functions that let the caller specify which elements match. These
callbacks take a void pointer in case the caller wants to pass in extra
data. But in each case, the single user of these functions just passes
NULL, and the callback ignores the extra pointer.
We could just remove these unused parameters from the callback interface
entirely. But it's good practice to provide such a pointer, as it guides
future callers of the function in the right direction (rather than
tempting them to access global data). Plus it's consistent with other
generic callback interfaces.
So let's instead annotate the unused parameters, in order to silence the
compiler's -Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Julien Moutinho reports that in an environment where directory does
not have BSD group semantics and requires the g+s to be set (aka
FORCE_DIR_SET_GID), but the system forbids chmod() to touch the g+s
bit, adjust_shared_perm() fails even when the repository is for
private use with perm = 0600, because we unconditionally try to set
the g+s bit.
When we grant extra access based on group membership (i.e. the
directory has either g+r or g+w bit set), which group the directory
and its contents are owned by matters. But otherwise (e.g. perm is
set to 0600, in Julien's case), flipping g+s bit is not necessary.
Reported-by: Julien Moutinho <julm+git@sourcephile.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preliminary code refactoring around transport and bundle code.
* ds/bundle-uri:
bundle.h: make "fd" version of read_bundle_header() public
remote: allow relative_url() to return an absolute url
remote: move relative_url()
http: make http_get_file() external
fetch-pack: move --keep=* option filling to a function
fetch-pack: add a deref_without_lazy_fetch_extended()
dir API: add a generalized path_match_flags() function
connect.c: refactor sending of agent & object-format
Add a path_match_flags() function and have the two sets of
starts_with_dot_{,dot_}slash() functions added in
63e95beb08 (submodule: port resolve_relative_url from shell to C,
2016-04-15) and a2b26ffb1a (fsck: convert gitmodules url to URL
passed to curl, 2020-04-18) be thin wrappers for it.
As the latter of those notes the fsck version was copied from the
initial builtin/submodule--helper.c version.
Since the code added in a2b26ffb1a was doing really doing the same as
win32_is_dir_sep() added in 1cadad6f65 (git clone <url>
C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move
the latter to git-compat-util.h is a is_xplatform_dir_sep(). We can
then call either it or the platform-specific is_dir_sep() from this
new function.
Let's likewise change code in various other places that was hardcoding
checks for "'/' || '\\'" with the new is_xplatform_dir_sep(). As can
be seen in those callers some of them still concern themselves with
':' (Mac OS classic?), but let's leave the question of whether that
should be consolidated for some other time.
As we expect to make wider use of the "native" case in the future,
define and use two starts_with_dot_{,dot_}slash_native() convenience
wrappers. This makes the diff in builtin/submodule--helper.c much
smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.33:
Git 2.33.2
Git 2.32.1
Git 2.31.2
GIT-VERSION-GEN: bump to v2.33.1
Git 2.30.3
setup_git_directory(): add an owner check for the top-level directory
Add a function to determine whether a path is owned by the current user
* maint-2.31:
Git 2.31.2
Git 2.30.3
setup_git_directory(): add an owner check for the top-level directory
Add a function to determine whether a path is owned by the current user
When determining the length of the longest ancestor of a given path with
respect to to e.g. `GIT_CEILING_DIRECTORIES`, we special-case the root
directory by returning 0 (i.e. we pretend that the path `/` does not end
in a slash by virtually stripping it).
That is the correct behavior because when normalizing paths, the root
directory is special: all other directory paths have their trailing
slash stripped, but not the root directory's path (because it would
become the empty string, which is not a legal path).
However, this special-casing of the root directory in
`longest_ancestor_length()` completely forgets about Windows-style root
directories, e.g. `C:\`. These _also_ get normalized with a trailing
slash (because `C:` would actually refer to the current directory on
that drive, not necessarily to its root directory).
In fc56c7b34b (mingw: accomodate t0060-path-utils for MSYS2,
2016-01-27), we almost got it right. We noticed that
`longest_ancestor_length()` expects a slash _after_ the matched prefix,
and if the prefix already ends in a slash, the normalized path won't
ever match and -1 is returned.
But then that commit went astray: The correct fix is not to adjust the
_tests_ to expect an incorrect -1 when that function is fed a prefix
that ends in a slash, but instead to treat such a prefix as if the
trailing slash had been removed.
Likewise, that function needs to handle the case where it is fed a path
that ends in a slash (not only a prefix that ends in a slash): if it
matches the prefix (plus trailing slash), we still need to verify that
the path does not end there, otherwise the prefix is not actually an
ancestor of the path but identical to it (and we need to return -1 in
that case).
With these two adjustments, we no longer need to play games in t0060
where we only add `$rootoff` if the passed prefix is different from the
MSYS2 pseudo root, instead we also add it for the MSYS2 pseudo root
itself. We do have to be careful to skip that logic entirely for Windows
paths, though, because they do are not subject to that MSYS2 pseudo root
treatment.
This patch fixes the scenario where a user has set
`GIT_CEILING_DIRECTORIES=C:\`, which would be ignored otherwise.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
"git maintenance" scheduler learned to use systemd timers as a
possible backend.
* lh/systemd-timers:
maintenance: add support for systemd timers on Linux
maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
cache.h: Introduce a generic "xdg_config_home_for(…)" function
Current implementation of `xdg_config_home(filename)` returns
`$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted
between the `XDG_CONFIG_HOME` environment variable and the parameter.
This patch introduces a `xdg_config_home_for(subdir, filename)` function
which is more generic. It only concatenates "$XDG_CONFIG_HOME", or
"$HOME/.config" if the former isn’t defined, with the parameters,
without adding `git` in between.
`xdg_config_home(filename)` is now implemented by calling
`xdg_config_home_for("git", filename)` but this new generic function can
be used to compute the configuration directory of other programs.
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since Git learned to detect its install location at runtime, there
was the slightly awkward problem that it was impossible to specify paths
relative to said location.
For example, if a version of Git was shipped with custom SSL
certificates to use, there was no portable way to specify
`http.sslCAInfo`.
In Git for Windows, the problem was "solved" for years by interpreting
paths starting with a slash as relative to the runtime prefix.
However, this is not correct: such paths _are_ legal on Windows, and
they are interpreted as absolute paths in the same drive as the current
directory.
After a lengthy discussion, and an even lengthier time to mull over the
problem and its best solution, and then more discussions, we eventually
decided to introduce support for the magic sequence `%(prefix)/`. If a
path starts with this, the remainder is interpreted as relative to the
detected (runtime) prefix. If built without runtime prefix support, Git
will simply interpolate the compiled-in prefix.
If a user _wants_ to specify a path starting with the magic sequence,
they can prefix the magic sequence with `./` and voilà, the path won't
be expanded.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not immediately clear what `expand_user_path()` means, so let's
rename it to `interpolate_path()`. This also opens the path for
interpolating more than just a home directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `real_home` parameter only has an effect when expanding paths
starting with `~/`, not when expanding paths starting with `~<user>/`.
Let's make that clear.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 395de250d9 (Expand ~ and ~user in core.excludesfile,
commit.template, 2009-11-17), the `user_path()` function was refactored
into the `expand_user_path()`. During that refactoring, the `buf`
parameter was lost, but the code comment above said function still talks
about it. Let's remove that stale part of the comment.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have tests that cover various filesystem-specific spellings of
".gitmodules", because we need to reliably identify that path for some
security checks. These are from dc2d9ba318 (is_{hfs,ntfs}_dotgitmodules:
add tests, 2018-05-12), with the actual code coming from e7cb0b4455
(is_ntfs_dotgit: match other .git files, 2018-05-11) and 0fc333ba20
(is_hfs_dotgit: match other .git files, 2018-05-02).
Those latter two commits also added similar matching functions for
.gitattributes and .gitignore. These ended up not being used in the
final series, and are currently dead code. But in preparation for them
being used in some fsck checks, let's make sure they actually work by
throwing a few basic tests at them. Likewise, let's cover .mailmap
(which does need matching code added).
I didn't bother with the whole battery of tests that we cover for
.gitmodules. These functions are all based on the same generic matcher,
so it's sufficient to test most of the corner cases just once.
Note that the ntfs magic prefix names in the tests come from the
algorithm described in e7cb0b4455 (and are different for each file).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of questions users might ask while resolving
conflicts:
* What changes have been made since the previous (first) parent?
* What changes are staged?
* What is still unstaged? (or what is still conflicted?)
* What changes did I make to resolve conflicts so far?
The first three of these have simple answers:
* git diff HEAD
* git diff --cached
* git diff
There was no way to answer the final question previously. Adding one
is trivial in merge-ort, since it works by creating a tree representing
what should be written to the working copy complete with conflict
markers. Simply write that tree to .git/AUTO_MERGE, allowing users to
answer the fourth question with
* git diff AUTO_MERGE
I avoided using a name like "MERGE_AUTO", because that would be
merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD,
CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on
which type of operation the merge was part of.
Ensure that paths which clean out other temporary operation-specific
files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory)
also clean out this AUTO_MERGE file.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check for existence and delete CHERRY_PICK_HEAD through ref functions.
This will help cherry-pick work with alternate ref storage backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rebase, one can pass the `--autostash` option to cause the worktree
to be automatically stashed before continuing with the rebase. This
option is missing in merge, however.
Implement the `--autostash` option and corresponding `merge.autoStash`
option in merge which stashes before merging and then pops after.
This option is useful when a developer has some local changes on a topic
branch but they realize that their work depends on another branch.
Previously, they had to run something like
git fetch ...
git stash push
git merge FETCH_HEAD
git stash pop
but now, that is reduced to
git fetch ...
git merge --autostash FETCH_HEAD
When an autostash is generated, it is automatically reapplied to the
worktree only in three explicit situations:
1. An incomplete merge is commit using `git commit`.
2. A merge completes successfully.
3. A merge is aborted using `git merge --abort`.
In all other situations where the merge state is removed using
remove_merge_branch_state() such as aborting a merge via
`git reset --hard`, the autostash is saved into the stash reflog
instead keeping the worktree clean.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Suggested-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SHA-256 transition continues.
* bc/sha-256-part-1-of-4: (22 commits)
fast-import: add options for rewriting submodules
fast-import: add a generic function to iterate over marks
fast-import: make find_marks work on any mark set
fast-import: add helper function for inserting mark object entries
fast-import: permit reading multiple marks files
commit: use expected signature header for SHA-256
worktree: allow repository version 1
init-db: move writing repo version into a function
builtin/init-db: add environment variable for new repo hash
builtin/init-db: allow specifying hash algorithm on command line
setup: allow check_repository_format to read repository format
t/helper: make repository tests hash independent
t/helper: initialize repository if necessary
t/helper/test-dump-split-index: initialize git repository
t6300: make hash algorithm independent
t6300: abstract away SHA-1-specific constants
t: use hash-specific lookup tables to define test constants
repository: require a build flag to use SHA-256
hex: add functions to parse hex object IDs in any algorithm
hex: introduce parsing variants taking hash algorithms
...
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.
There was an unfinished effort to abolish this [1].
Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.
This patch uses a local `strbuf` for most places where `real_path()` was
previously called.
However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
read_gitfile_gently()
get_superproject_working_tree()
[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`real_path()` returns result from a shared buffer, inviting subtle
reentrance bugs. One of these bugs occur when invoked this way:
set_git_dir(real_path(git_dir))
In this case, `real_path()` has reentrance:
real_path
read_gitfile_gently
repo_set_gitdir
setup_git_env
set_git_dir_1
set_git_dir
Later, `set_git_dir()` uses its now-dead parameter:
!is_absolute_path(path)
Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we will want to not only check the repository format, but
extract the information that we've gained. To do so, allow
check_repository_format to take a pointer to struct repository_format.
Allow passing NULL for this argument if we're not interested in the
information, and pass NULL for all existing callers. A future patch
will make use of this information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a "dst" buffer to write into, but there's no matching "len"
parameter. The hidden assumption is that normalizing always makes things
smaller, so we're OK as long as "dst" is at least as big as "src". Let's
document that explicitly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.23: (44 commits)
Git 2.23.1
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
...
* maint-2.21: (42 commits)
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
...
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
* maint-2.19: (34 commits)
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
...
* maint-2.18: (33 commits)
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
...
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
Previously, this function was written without focusing on speed,
intending to make reviewing the code as easy as possible, to avoid any
bugs in this critical code.
Turns out: we can do much better on both accounts. With this patch, we
make it as fast as this developer can make it go:
- We avoid the call to `is_dir_sep()` and make all the character
comparisons explicit.
- We avoid the cost of calling `strncasecmp()` and unroll the test for
`.git` and `git~1`, not even using `tolower()` because it is faster to
compare against two constant values.
- We look for `.git` and `.git~1` first thing, and return early if not
found.
- We also avoid calling a separate function for detecting chains of
spaces and periods.
Each of these improvements has a noticeable impact on the speed of
`is_ntfs_dotgit()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We just safe-guarded `.git` against NTFS Alternate Data Stream-related
attack vectors, and now it is time to do the same for `.gitmodules`.
Note: In the added regression test, we refrain from verifying all kinds
of variations between short names and NTFS Alternate Data Streams: as
the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it
is enough to test one in order to know that all of them are guarded
against.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Probably inspired by HFS' resource streams, NTFS supports "Alternate
Data Streams": by appending `:<stream-name>` to the file name,
information in addition to the file contents can be written and read,
information that is copied together with the file (unless copied to a
non-NTFS location).
These Alternate Data Streams are typically used for things like marking
an executable as having just been downloaded from the internet (and
hence not necessarily being trustworthy).
In addition to a stream name, a stream type can be appended, like so:
`:<stream-name>:<stream-type>`. Unless specified, the default stream
type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In
other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the
`.git` directory!
In our work in Git v2.2.1 to protect Git on NTFS drives under
`core.protectNTFS`, we focused exclusively on NTFS short names, unaware
of the fact that NTFS Alternate Data Streams offer a similar attack
vector.
Let's fix this.
Seeing as it is better to be safe than sorry, we simply disallow paths
referring to *any* NTFS Alternate Data Stream of `.git`, not just
`::$INDEX_ALLOCATION`. This also simplifies the implementation.
This closes CVE-2019-1352.
Further reading about NTFS Alternate Data Streams:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The config setting `core.protectNTFS` is specifically designed to work
not only on Windows, but anywhere, to allow for repositories hosted on,
say, Linux servers to be protected against NTFS-specific attack vectors.
As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated
paths (but does not do the same for paths separated by forward slashes),
under the assumption that the backslash might not be a valid directory
separator on the _current_ Operating System.
However, the two callers, `verify_path()` and `fsck_tree()`, are
supposed to feed only individual path segments to the `is_ntfs_dotgit()`
function.
This causes a lot of duplicate scanning (and very inefficient scanning,
too, as the inner loop of `is_ntfs_dotgit()` was optimized for
readability rather than for speed.
Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of
splitting the paths by backslashes as directory separators on the
callers of said function.
Consequently, the `verify_path()` function, which already splits the
path by directory separators, now treats backslashes as directory
separators _explicitly_ when `core.protectNTFS` is turned on, even on
platforms where the backslash is _not_ a directory separator.
Note that we have to repeat some code in `verify_path()`: if the
backslash is not a directory separator on the current Operating System,
we want to allow file names like `\`, but we _do_ want to disallow paths
that are clearly intended to cause harm when the repository is cloned on
Windows.
The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now
needs to look for backslashes in tree entries' names specifically when
`core.protectNTFS` is turned on. While it would be tempting to
completely disallow backslashes in that case (much like `fsck` reports
names containing forward slashes as "full paths"), this would be
overzealous: when `core.protectNTFS` is turned on in a non-Windows
setup, backslashes are perfectly valid characters in file names while we
_still_ want to disallow tree entries that are clearly designed to
exploit NTFS-specific behavior.
This simplification will make subsequent changes easier to implement,
such as turning `core.protectNTFS` on by default (not only on Windows)
or protecting against attack vectors involving NTFS Alternate Data
Streams.
Incidentally, this change allows for catching malicious repositories
that contain tree entries of the form `dir\.gitmodules` already on the
server side rather than only on the client side (and previously only on
Windows): in contrast to `is_ntfs_dotgit()`, the
`is_ntfs_dotgitmodules()` function already expects the caller to split
the paths by directory separators.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>