Commit graph

658 commits

Author SHA1 Message Date
Joanna Wang
1164c7232e attr: enable attr pathspec magic for git-add and git-stash
Allow users to limit or exclude files based on file attributes
during git-add and git-stash.

For example, the chromium project would like to use

    $ git add . ':(exclude,attr:submodule)'

as submodules are managed by an external tool, forbidding end users
to record changes with "git add".  Allowing "git add" to often
records changes that users do not want in their commits.

This commit does not change any attr magic implementation. It is
only adding attr as an allowed pathspec in git-add and git-stash,
which was previously blocked by GUARD_PATHSPEC and a pathspec mask
in parse_pathspec()).

However, we fix a bug in prefix_magic() where attr values were
unintentionally removed.  This was triggerable when parse_pathspec()
is called with PATHSPEC_PREFIX_ORIGIN as a flag, which was the case
for git-stash (Bug originally filed here [*])

Furthermore, while other commands hit this code path it did not
result in unexpected behavior because this bug only impacts the
pathspec->items->original field which is NOT used to filter
paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands.  (See add_pathspecs()
usage and implementation in stash.c)

It is possible that when the attr pathspec feature was first added
in b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used
with git-add, PATHSPEC_ATTR was added as a mask to git-add's
invocation of parse_pathspec() 84d938b732 (add: do not accept
pathspec magic 'attr', 2018-09-18).  However, this user-friendly
error message was never added for git-stash.

[Reference]
 * https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)

Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-04 17:00:27 +09:00
Victoria Dye
aa79636fe7 dir.[ch]: add 'follow_symlink' arg to 'get_dtype'
Add a 'follow_symlink' boolean option to 'get_type()'. If 'follow_symlink'
is enabled, DT_LNK (in addition to DT_UNKNOWN) d_types triggers the
stat-based d_type resolution, using 'stat' instead of 'lstat' to get the
type of the followed symlink. Note that symlinks are not followed
recursively, so a symlink pointing to another symlink will still resolve to
DT_LNK.

Update callers in 'diagnose.c' to specify 'follow_symlink = 0' to preserve
current behavior.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
Victoria Dye
6dc1004333 dir.[ch]: expose 'get_dtype'
Move 'get_dtype()' from 'diagnose.c' to 'dir.c' and add its declaration to
'dir.h' so that it is accessible to callers in other files. The function and
its documentation are moved verbatim except for a small addition to the
description clarifying what the 'path' arg represents.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
Junio C Hamano
13ed10efd4 Merge branch 'jc/pathspec-match-with-common-prefix'
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.

* jc/pathspec-match-with-common-prefix:
  dir: match "attr" pathspec magic with correct paths
  t6135: attr magic with path pattern
2023-07-17 11:30:43 -07:00
Junio C Hamano
f4a8fde057 dir: match "attr" pathspec magic with correct paths
The match_pathspec_item() function takes "prefix" value, allowing a
caller to chop off the common leading prefix of pathspec pattern
strings from the path and only use the remainder of the path to
match the pathspec patterns (after chopping the same leading prefix
of them, of course).

This "common leading prefix" optimization has two main features:

 * discard the entries in the in-core index that are outside of the
   common leading prefix; if you are doing "ls-files one/a one/b",
   we know all matches must be from "one/", so first the code
   discards all entries outside the "one/" directory from the
   in-core index.  This allows us to work on a smaller dataset.

 * allow skipping the comparison of the leading bytes when matching
   pathspec with path.  When "ls-files" finds the path "one/a/1" in
   the in-core index given "one/a" and "one/b" as the pathspec,
   knowing that common leading prefix "one/" was found lets the
   pathspec matchinery not to bother comparing "one/" part, and
   allows it to feed "a/1" down, as long as the pathspec element
   "one/a" gets corresponding adjustment to "a".

When the "attr" pathspec magic is in effect, however, the current
code breaks down.

The attributes, other than the ones that are built-in and the ones
that come from the $GIT_DIR/info/attributes file and the top-level
.gitattributes file, are lazily read from the filesystem on-demand,
as we encounter each path and ask if it matches the pathspec.  For
example, if you say "git ls-files "(attr:label)sub/" in a repository
with a file "sub/file" that is given the 'label' attribute in
"sub/.gitattributes":

 * The common prefix optimization finds that "sub/" is the common
   prefix and prunes the in-core index so that it has only entries
   inside that directory.  This is desirable.

 * The code then walks the in-core index, finds "sub/file", and
   eventually asks do_match_pathspec() if it matches the given
   pathspec.

 * do_match_pathspec() calls match_pathspec_item() _after_ stripping
   the common prefix "sub/" from the path, giving it "file", plus
   the length of the common prefix (4-bytes), so that the pathspec
   element "(attr:label)sub/" can be treated as if it were "(attr:label)".

The last one is what breaks the match in the current code, as the
pathspec subsystem ends up asking the attribute subsystem to find
the attribute attached to the path "file".  We need to ask about the
attributes on "sub/file" when calling match_pathspec_attrs(); this
can be done by looking at "prefix" bytes before the beginning of
"name", which is the same trick already used by another piece of the
code in the same match_pathspec_item() function.

Unfortunately this was not discovered so far because the code works
with slightly different arguments, e.g.

 $ git ls-files "(attr:label)sub"
 $ git ls-files "(attr:label)sub/" "no/such/dir/"

would have reported "sub/file" as a path with the 'label' attribute
just fine, because neither would trigger the common prefix
optimization.

Reported-by: Matthew Hughes <mhughes@uw.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-08 14:36:31 -07:00
Calvin Wan
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Calvin Wan
da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Elijah Newren
68d686460f fsmonitor-ll.h: split this header out of fsmonitor.h
This creates a new fsmonitor-ll.h with most of the functions from
fsmonitor.h, though it leaves three inline functions where they were.
Two-thirds of the files that previously included fsmonitor.h did not
need those three inline functions or the six extra includes those inline
functions required, so this allows them to only include the lower level
header.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren
a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren
c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren
baf889c2cd sparse-index.h: move declarations for sparse-index.c from cache.h
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren
f5653856c2 name-hash.h: move declarations for name-hash.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren
d4a4f9291d commit.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
Elijah Newren
cb2a51356d symlinks.h: move declarations for symlinks.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
Elijah Newren
87bed17907 object-file.h: move declarations for object-file.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren
73359a9b43 treewide: be explicit about dependence on convert.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren
74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
Junio C Hamano
e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason
bc726bd075 cocci: apply the "object-store.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Elijah Newren
e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
0b027f6ca7 abspath.h: move absolute path functions from cache.h
This is another step towards letting us remove the include of cache.h in
strbuf.c.  It does mean that we also need to add includes of abspath.h
in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
Elijah Newren
f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Junio C Hamano
88cc8ed8bc Merge branch 'en/header-cleanup'
Code clean-up to clarify the rule that "git-compat-util.h" must be
the first to be included.

* en/header-cleanup:
  diff.h: remove unnecessary include of object.h
  Remove unnecessary includes of builtin.h
  treewide: replace cache.h with more direct headers, where possible
  replace-object.h: move read_replace_refs declaration from cache.h to here
  object-store.h: move struct object_info from cache.h
  dir.h: refactor to no longer need to include cache.h
  object.h: stop depending on cache.h; make cache.h depend on object.h
  ident.h: move ident-related declarations out of cache.h
  pretty.h: move has_non_ascii() declaration from commit.h
  cache.h: remove dependence on hex.h; make other files include it explicitly
  hex.h: move some hex-related declarations from cache.h
  hash.h: move some oid-related declarations from cache.h
  alloc.h: move ALLOC_GROW() functions from cache.h
  treewide: remove unnecessary cache.h includes in source files
  treewide: remove unnecessary cache.h includes
  treewide: remove unnecessary git-compat-util.h includes in headers
  treewide: ensure one of the appropriate headers is sourced first
2023-03-17 14:03:09 -07:00
Elijah Newren
5fdf285e62 dir: separate public from internal portion of dir_struct
In order to make it clearer to callers what portions of dir_struct are
public API, and avoid errors from them setting fields that are meant as
internal API, split the fields used for internal implementation reasons
into a separate embedded struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 08:29:50 -08:00
Elijah Newren
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Junio C Hamano
ab85a7de6d Merge branch 'ws/single-file-cone'
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.

* ws/single-file-cone:
  dir: check for single file cone patterns
2023-01-16 12:07:47 -08:00
William Sprent
5842710dc2 dir: check for single file cone patterns
The sparse checkout documentation states that the cone mode pattern set
is limited to patterns that either recursively include directories or
patterns that match all files in a directory. In the sparse checkout
file, the former manifest in the form:

    /A/B/C/

while the latter become a pair of patterns either in the form:

    /A/B/
    !/A/B/*/

or in the special case of matching the toplevel files:

    /*
    !/*/

The 'add_pattern_to_hashsets()' function contains checks which serve to
disable cone-mode when non-cone patterns are encountered. However, these
do not catch when the pattern list attempts to match a single file or
directory, e.g. a pattern in the form:

    /A/B/C

This causes sparse-checkout to exhibit unexpected behaviour when such a
pattern is in the sparse-checkout file and cone mode is enabled.
Concretely, with the pattern like the above, sparse-checkout, in
non-cone mode, will only include the directory or file located at
'/A/B/C'. However, with cone mode enabled, sparse-checkout will instead
just manifest the toplevel files but not any file located at '/A/B/C'.

Relatedly, issues occur when supplying the same kind of filter when
partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will
correctly just include the objects that match the non-cone pattern
matching. Which means that checking out the newly cloned repo with the
same filter, but with cone mode enabled, fails due to missing objects.

To fix these issues, add a cone mode pattern check that asserts that
every pattern is either a directory match or the pattern '/*'. Add a
test to verify the new pattern check and modify another to reflect that
non-directory patterns are caught earlier.

Signed-off-by: William Sprent <williams@unity3d.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05 11:14:28 +09:00
Ævar Arnfjörð Bjarmason
083fd1a264 dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
When the "ident" member of the structure was added in
1e8fef609e (untracked cache: guard and disable on system changes,
2015-03-08) this function wasn't updated to free it. Let's do so.

Let's also free the "exclude_per_dir" memory we've been leaking
since[1], while making sure not to free() the constant ".gitignore"
string we add by default[2].

As we now have three struct members we're freeing let's change
free_untracked_cache() to return early if "uc" isn't defined. We won't
hand it to free() now, but that was just for convenience, once we're
dealing with >=2 struct members this pattern is more convenient.

1. f9e6c64958 (untracked cache: load from UNTR index extension,
   2015-03-08)
2. 039bc64e88 (core.excludesfile clean-up, 2007-11-14)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Junio C Hamano
2790ba84b6 Merge branch 'rs/use-fspathncmp'
Code clean-up.

* rs/use-fspathncmp:
  dir: use fspathncmp() in pl_hashmap_cmp()
2022-10-17 14:56:35 -07:00
Jeff King
f7669676d0 dir: use fspathncmp() in pl_hashmap_cmp()
Call fspathncmp() instead of open-coding it.  This shortens the code and
makes it less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-08 22:09:03 -07:00
Junio C Hamano
dd407f1c7c Merge branch 'ab/unused-annotation'
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.

* ab/unused-annotation:
  git-compat-util.h: use "deprecated" for UNUSED variables
  git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14 12:56:39 -07:00
Junio C Hamano
a6b42ec0c6 Merge branch 'jk/unused-annotation'
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.

* jk/unused-annotation:
  is_path_owned_by_current_uid(): mark "report" parameter as unused
  run-command: mark unused async callback parameters
  mark unused read_tree_recursive() callback parameters
  hashmap: mark unused callback parameters
  config: mark unused callback parameters
  streaming: mark unused virtual method parameters
  transport: mark bundle transport_options as unused
  refs: mark unused virtual method parameters
  refs: mark unused reflog callback parameters
  refs: mark unused each_ref_fn parameters
  git-compat-util: add UNUSED macro
2022-09-14 12:56:39 -07:00
Ævar Arnfjörð Bjarmason
5cf88fd8b0 git-compat-util.h: use "UNUSED", not "UNUSED(var)"
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75d (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.

Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.

This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:48 -07:00
Jeff King
77651c032c match_pathname(): drop unused "flags" parameter
This field has not been used since the function was introduced in
b559263216 (exclude: split pathname matching code into a separate
function, 2012-10-15), though there was a brief period where it was
erroneously used and then reverted in ed4958477b (dir: fix pattern
matching on dirs, 2021-09-24) and 5ceb663e92 (dir: fix
directory-matching bug, 2021-11-02).

It's possible we'd eventually add a flag that makes it useful here, but
there are only a handful of callers. It would be easy to add back if
necessary, and in the meantime this makes the function interface less
misleading.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:20:56 -07:00
Jeff King
02c3c59e62 hashmap: mark unused callback parameters
Hashmap comparison functions must conform to a particular callback
interface, but many don't use all of their parameters. Especially the
void cmp_data pointer, but some do not use keydata either (because they
can easily form a full struct to pass when doing lookups). Let's mark
these to make -Wunused-parameter happy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:55 -07:00
Junio C Hamano
f070ec4cb5 Merge branch 'gg/worktree-from-the-above' into maint
In a non-bare repository, the behavior of Git when the
core.worktree configuration variable points at a directory that has
a repository as its subdirectory, regressed in Git 2.27 days.
source: <20220616234433.225-1-gg.oss@outlook.com>
source: <20220616231956.154-1-gg.oss@outlook.com>

* gg/worktree-from-the-above:
  dir: minor refactoring / clean-up
  dir: traverse into repository
2022-07-27 13:00:29 -07:00
Junio C Hamano
dc6315e1fc Merge branch 'gg/worktree-from-the-above'
In a non-bare repository, the behavior of Git when the
core.worktree configuration variable points at a directory that has
a repository as its subdirectory, regressed in Git 2.27 days.

* gg/worktree-from-the-above:
  dir: minor refactoring / clean-up
  dir: traverse into repository
2022-07-14 15:03:58 -07:00
Goss Geppert
d6c9a71755 dir: minor refactoring / clean-up
Narrow the scope of the `nested_repo` variable and conditional return
statement to the block where the variable is set.

Signed-off-by: Goss Geppert <ggossdev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21 22:47:33 -07:00
Goss Geppert
27128996b8 dir: traverse into repository
Since 8d92fb2927 (dir: replace exponential algorithm with a linear one,
2020-04-01) traversing into a repository's directory tree when the
traversal began outside the repository's standard location has failed
because the encountered repository was identified as a nested foreign
repository.

Prior to this commit, the failure to traverse into a repository's
default worktree location was observable from a user's perspective under
either of the following conditions (there may be others):

    1) Set the `core.worktree` location to a parent directory of the
       default worktree; or
    2) Use the `--git_dir` option while the working directory is outside
       the repository's default worktree location

Under either of these conditions, symptoms of the failure to traverse
into the repository's default worktree location include the inability to
add files to the index or get a list of untracked files via ls-files.

This commit adds a check to determine whether a nested repository that
is encountered in recursing a path is actually `the_repository`.  If so,
we simply treat the directory as if it doesn't contain a nested
repository.

The commit includes test-cases to reduce the likelihood of future
regressions.

Signed-off-by: Goss Geppert <ggossdev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-21 22:47:33 -07:00
Junio C Hamano
b3b2ddced2 Merge branch 'ds/bundle-uri'
Preliminary code refactoring around transport and bundle code.

* ds/bundle-uri:
  bundle.h: make "fd" version of read_bundle_header() public
  remote: allow relative_url() to return an absolute url
  remote: move relative_url()
  http: make http_get_file() external
  fetch-pack: move --keep=* option filling to a function
  fetch-pack: add a deref_without_lazy_fetch_extended()
  dir API: add a generalized path_match_flags() function
  connect.c: refactor sending of agent & object-format
2022-06-03 14:30:34 -07:00
Junio C Hamano
db5b7c3e46 Merge branch 'js/ci-gcc-12-fixes'
Fixes real problems noticed by gcc 12 and works around false
positives.

* js/ci-gcc-12-fixes:
  dir.c: avoid "exceeds maximum object size" error with GCC v12.x
  nedmalloc: avoid new compile error
  compat/win32/syslog: fix use-after-realloc
2022-05-31 19:10:35 -07:00
Johannes Schindelin
2acf4cf001 dir.c: avoid "exceeds maximum object size" error with GCC v12.x
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:

dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
       exceeds maximum object size 9223372036854775807
       [-Werror=stringop-overread]
    CC ewah/bitmap.o
 3087 |         if (memchr(start, '/', end - start) == NULL
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.

See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.

Let's just add a safety check, primarily for GCC's benefit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-24 15:58:41 -07:00
Junio C Hamano
538dc459a0 Merge branch 'ep/maint-equals-null-cocci'
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.

* ep/maint-equals-null-cocci:
  tree-wide: apply equals-null.cocci
  tree-wide: apply equals-null.cocci
  contrib/coccinnelle: add equals-null.cocci
2022-05-20 15:26:59 -07:00
Ævar Arnfjörð Bjarmason
9fd512c8d6 dir API: add a generalized path_match_flags() function
Add a path_match_flags() function and have the two sets of
starts_with_dot_{,dot_}slash() functions added in
63e95beb08 (submodule: port resolve_relative_url from shell to C,
2016-04-15) and a2b26ffb1a (fsck: convert gitmodules url to URL
passed to curl, 2020-04-18) be thin wrappers for it.

As the latter of those notes the fsck version was copied from the
initial builtin/submodule--helper.c version.

Since the code added in a2b26ffb1a was doing really doing the same as
win32_is_dir_sep() added in 1cadad6f65 (git clone <url>
C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move
the latter to git-compat-util.h is a is_xplatform_dir_sep(). We can
then call either it or the platform-specific is_dir_sep() from this
new function.

Let's likewise change code in various other places that was hardcoding
checks for "'/' || '\\'" with the new is_xplatform_dir_sep(). As can
be seen in those callers some of them still concern themselves with
':' (Mac OS classic?), but let's leave the question of whether that
should be consolidated for some other time.

As we expect to make wider use of the "native" case in the future,
define and use two starts_with_dot_{,dot_}slash_native() convenience
wrappers. This makes the diff in builtin/submodule--helper.c much
smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-16 15:02:09 -07:00
Junio C Hamano
2b0a58d164 Merge branch 'ep/maint-equals-null-cocci' for maint-2.35
* ep/maint-equals-null-cocci:
  tree-wide: apply equals-null.cocci
  contrib/coccinnelle: add equals-null.cocci
2022-05-02 10:06:04 -07:00
Junio C Hamano
afe8a9070b tree-wide: apply equals-null.cocci
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02 09:50:37 -07:00
Tao Klerks
e6a653554b untracked-cache: support '--untracked-files=all' if configured
Untracked cache was originally designed to only work with
"--untracked-files=normal", and is bypassed when
"--untracked-files=all" is requested, but this causes performance
issues for UI tooling that wants to see "all" on a frequent basis.

On the other hand, the conditions that altogether prevented
applicability to the "all" mode no longer seem to apply, after
several major refactors in recent years; this possibility was
discussed in
81153d02-8e7a-be59-e709-e90cd5906f3a@jeffhostetler.com and
CABPp-BFiwzzUgiTj_zu+vF5x20L0=1cf25cHwk7KZQj2YkVzXw@mail.gmail.com,
and somewhat confirmed experimentally by several users using a
version of this patch to use untracked cache with -uall for about a
year.

When 'git status' runs without using the untracked cache, on a large
repo, on windows, with fsmonitor, it can run very slowly. This can
make GUIs that need to use "-uall" (and therefore currently bypass
untracked cache) unusable when fsmonitor is enabled, on such large
repos.

To partially address this, align the supported directory flags for the
stored untracked cache data with the git config. If a user specifies
an '--untracked-files=' commandline parameter that does not align with
their 'status.showuntrackedfiles' config value, then the untracked
cache will be ignored - as it is for other unsupported situations like
when a pathspec is specified.

If the previously stored flags no longer match the current
configuration, but the currently-applicable flags do match the current
configuration, then discard the previously stored untracked cache
data.

For most users there will be no change in behavior. Users who need
'--untracked-files=all' to perform well will now have the option of
setting "status.showuntrackedfiles" to "all" for better / more
consistent performance.

Users who need '--untracked-files=all' to perform well for their
tooling AND prefer to avoid the verbosity of "all" when running
git status explicitly without options... are out of luck for now (no
change).

Users who have the "status.showuntrackedfiles" config set to "all"
and yet frequently explicitly call
'git status --untracked-files=normal' (and use the untracked cache)
are the only ones who will be disadvantaged by this change. Their
"--untracked-files=normal" calls will, after this change, no longer
use the untracked cache.

Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-01 10:16:42 -07:00
Junio C Hamano
190f9bf62a Merge branch 'vd/sparse-read-tree'
"git read-tree" has been made to be aware of the sparse-index
feature.

* vd/sparse-read-tree:
  read-tree: make three-way merge sparse-aware
  read-tree: make two-way merge sparse-aware
  read-tree: narrow scope of index expansion for '--prefix'
  read-tree: integrate with sparse index
  read-tree: expand sparse checkout test coverage
  read-tree: explicitly disallow prefixes with a leading '/'
  status: fix nested sparse directory diff in sparse index
  sparse-index: prevent repo root from becoming sparse
2022-03-16 17:53:08 -07:00