Commit graph

33 commits

Author SHA1 Message Date
Victoria Dye
5305474ec4 ref-cache.c: fix prefix matching in ref iteration
Update 'cache_ref_iterator_advance' to skip over refs that are not matched
by the given prefix.

Currently, a ref entry is considered "matched" if the entry name is fully
contained within the prefix:

* prefix: "refs/heads/v1"
* entry: "refs/heads/v1.0"

OR if the prefix is fully contained in the entry name:

* prefix: "refs/heads/v1.0"
* entry: "refs/heads/v1"

The first case is always correct, but the second is only correct if the ref
cache entry is a directory, for example:

* prefix: "refs/heads/example"
* entry: "refs/heads/"

Modify the logic in 'cache_ref_iterator_advance' to reflect these
expectations:

1. If 'overlaps_prefix' returns 'PREFIX_EXCLUDES_DIR', then the prefix and
   ref cache entry do not overlap at all. Skip this entry.
2. If 'overlaps_prefix' returns 'PREFIX_WITHIN_DIR', then the prefix matches
   inside this entry if it is a directory. Skip if the entry is not a
   directory, otherwise iterate over it.
3. Otherwise, 'overlaps_prefix' returned 'PREFIX_CONTAINS_DIR', indicating
   that the cache entry (directory or not) is fully contained by or equal to
   the prefix. Iterate over this entry.

Note that condition 2 relies on the names of directory entries having the
appropriate trailing slash. The existing function documentation of
'create_dir_entry' explicitly calls out the trailing slash requirement, so
this is a safe assumption to make.

This bug generally doesn't have any user-facing impact, since it requires:

1. using a non-empty prefix without a trailing slash in an iteration like
   'for_each_fullref_in',
2. the callback to said iteration not reapplying the original filter (as
   for-each-ref does) to ensure unmatched refs are skipped, and
3. the repository having one or more refs that match part of, but not all
   of, the prefix.

However, there are some niche scenarios that meet those criteria
(specifically, 'rev-parse --bisect' and '(log|show|shortlog) --bisect'). Add
tests covering those cases to demonstrate the fix in this patch.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-10-09 15:53:13 -07:00
Elijah Newren
d1cbe1e6d8 hash-ll.h: split out of hash.h to remove dependency on repository.h
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository->hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
Elijah Newren
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Junio C Hamano
538dc459a0 Merge branch 'ep/maint-equals-null-cocci'
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.

* ep/maint-equals-null-cocci:
  tree-wide: apply equals-null.cocci
  tree-wide: apply equals-null.cocci
  contrib/coccinnelle: add equals-null.cocci
2022-05-20 15:26:59 -07:00
Junio C Hamano
afe8a9070b tree-wide: apply equals-null.cocci
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02 09:50:37 -07:00
Ævar Arnfjörð Bjarmason
e2f8acb6a0 refs: use designated initializers for "struct ref_iterator_vtable"
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17 10:36:11 -07:00
Junio C Hamano
162a13b855 Merge branch 'jt/no-abuse-alternate-odb-for-submodules'
Follow through the work to use the repo interface to access
submodule objects in-process, instead of abusing the alternate
object database interface.

* jt/no-abuse-alternate-odb-for-submodules:
  submodule: trace adding submodule ODB as alternate
  submodule: pass repo to check_has_commit()
  object-file: only register submodule ODB if needed
  merge-{ort,recursive}: remove add_submodule_odb()
  refs: peeling non-the_repository iterators is BUG
  refs: teach arbitrary repo support to iterators
  refs: plumb repo into ref stores
2021-10-25 16:06:56 -07:00
Jonathan Tan
8788195c88 refs: peeling non-the_repository iterators is BUG
There is currently no support for peeling the current ref of an iterator
iterating over a non-the_repository ref store, and none is needed. Thus,
for now, BUG() if that happens.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:06:06 -07:00
Ævar Arnfjörð Bjarmason
750036c8f7 refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry()
Remove the now-unused "incomplete" parameter from create_dir_entry(),
all its callers specify it as "1", so let's drop the "incomplete=0"
case. The last caller to use it was search_for_subdir(), but that code
was removed in the preceding commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason
5e4546d599 refs/ref-cache.c: remove "mkdir" parameter from find_containing_dir()
Remove the "mkdir" parameter from the find_containing_dir() function,
the add_ref_entry() function removed in the preceding commit was its
last user.

Since "mkdir" is always "0" we can also remove the parameter from
search_for_subdir(), which in turn means that we can delete most of
that function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason
6a99fa2e9e refs/ref-cache.[ch]: remove unused add_ref_entry()
This function has not been used since 9dd389f3d8 (packed_ref_store:
get rid of the `ref_cache` entirely, 2017-09-25).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason
34e8a20d76 refs/ref-cache.[ch]: remove unused remove_entry_from_dir()
This function was missed in 9939b33d6a (packed-backend: rip out some
now-unused code, 2017-09-08), and has been orphaned since then. Let's
delete it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 15:12:04 -07:00
Han-Wen Nienhuys
617480d75b refs: make explicit that ref_iterator_peel returns boolean
Use -1 as error return value throughout.

This removes spurious differences in the GIT_TRACE_REFS output, depending on the
ref storage backend active.

Before, the cached ref_iterator (but only that iterator!) would return
peel_object() output directly. No callers relied on the peel_status values
beyond success/failure. All calls to these functions go through
peel_iterated_oid(), which returns peel_object() as a fallback, but also
squashing the error values.

The iteration interface already passes REF_ISSYMREF and REF_ISBROKEN through the
flags argument, so the additional error values in enum peel_status provide no
value.

The ref iteration interface provides a separate peel() function because certain
formats (eg. packed-refs and reftable) can store the peeled object next to the
tag SHA1. Passing the peeled SHA1 as an optional argument to each_ref_fn maps
more naturally to the implementation of ref databases. Changing the code in this
way is left for a future refactoring.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 07:54:12 +09:00
René Scharfe
ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
Jeff King
9001dc2a74 convert "oidcmp() != 0" to "!oideq()"
This is the flip side of the previous two patches: checking
for a non-zero oidcmp() can be more strictly expressed as
inequality. Like those patches, we write "!= 0" in the
coccinelle transformation, which covers by isomorphism the
more common:

  if (oidcmp(E1, E2))

As with the previous two patches, this patch can be achieved
almost entirely by running "make coccicheck"; the only
differences are manual line-wrap fixes to match the original
code.

There is one thing to note for anybody replicating this,
though: coccinelle 1.0.4 seems to miss the case in
builtin/tag.c, even though it's basically the same as all
the others. Running with 1.0.7 does catch this, so
presumably it's just a coccinelle bug that was fixed in the
interim.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29 11:32:49 -07:00
Johannes Schindelin
033abf97fc Replace all die("BUG: ...") calls by BUG() ones
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).

The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.

Let's just convert all remaining ones in one fell swoop.

This trick was performed by this invocation:

	sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 19:06:13 +09:00
SZEDER Gábor
f919ffebed Use MOVE_ARRAY
Use the helper macro MOVE_ARRAY to move arrays.  This is shorter and
safer, as it automatically infers the size of elements.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci in
Travis CI's static analysis build job.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-22 11:32:51 -08:00
Michael Haggerty
78fb457968 refs: update some more docs to use "oid" rather than "sha1"
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-06 10:31:08 +09:00
brian m. carlson
ac2ed0d7d5 refs: convert peel_object to struct object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:52 +09:00
Michael Haggerty
a6e19bcdad ref_cache: remove support for storing peeled values
Now that the `packed-refs` backend doesn't use `ref_cache`, there is
nobody left who might want to store peeled values of references in
`ref_cache`. So remove that feature.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
8738a8a4df ref_iterator: keep track of whether the iterator output is ordered
References are iterated over in order by refname, but reflogs are not.
Some consumers of reference iteration care about the difference. Teach
each `ref_iterator` to keep track of whether its output is ordered.

`overlay_ref_iterator` is one of the picky consumers. Add a sanity
check in `overlay_ref_iterator_begin()` to verify that its inputs are
ordered.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Ævar Arnfjörð Bjarmason
88ce3ef636 *.[ch] refactoring: make use of the FREE_AND_NULL() macro
Replace occurrences of `free(ptr); ptr = NULL` which weren't caught by
the coccinelle rule. These fall into two categories:

 - free/NULL assignments one after the other which coccinelle all put
   on one line, which is functionally equivalent code, but very ugly.

 - manually spotted occurrences where the NULL assignment isn't right
   after the free() call.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:09 -07:00
Michael Haggerty
f23092f19e cache_ref_iterator_begin(): avoid priming unneeded directories
When iterating over references, reference priming is used to make sure
that loose references are read into the ref-cache before packed
references, to avoid races. It used to be that the prefix passed to
reference iterators almost always ended in `/`, for example
`refs/heads/`. In that case, the priming code would read all loose
references under `find_containing_dir("refs/heads/")`, which is
"refs/heads/". That's just what we want.

But now that `ref-filter` knows how to pass refname prefixes to
`for_each_fullref_in()`, the prefix might come from user input; for
example,

    git for-each-ref refs/heads

Since the argument doesn't include a trailing slash, the reference
iteration code would prime all of the loose references under
`find_containing_dir("refs/heads")`, which is "refs/". Thus we would
unnecessarily read tags, remote-tracking references, etc., when the
user is only interested in branches.

It is a bit awkward to get around this problem. We can't just append a
slash to the argument, because we don't know ab initio whether an
argument like `refs/tags/release` corresponds to a single tag or to a
directory containing tags.

Moreover, until now a `prefix_ref_iterator` was used to make the final
decision about which references fall within the prefix (the
`cache_ref_iterator` only did a rough cut). This is also inefficient,
because the `prefix_ref_iterator` can't know, for example, that while
you are in a subdirectory that is completely within the prefix, you
don't have to do the prefix check.

So:

* Move the responsibility for doing the prefix check directly to
  `cache_ref_iterator`. This means that `cache_ref_iterator_begin()`
  never has to wrap its return value in a `prefix_ref_iterator`.

* Teach `cache_ref_iterator_begin()` (and `prime_ref_dir()`) to be
  stricter about what they iterate over and what directories they
  prime.

* Teach `cache_ref_iterator` to keep track of whether the current
  `cache_ref_iterator_level` is fully within the prefix. If so, skip
  the prefix checks entirely.

The main benefit of these optimizations is for loose references, since
packed references are always read all at once.

Note that after this change, `prefix_ref_iterator` is only ever used
for its trimming feature and not for its "prefix" feature. But I'm not
ripping out the latter yet, because it might be useful for another
patch series that I'm working on.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 21:21:21 +09:00
Michael Haggerty
c1da06c6f1 create_ref_entry(): remove check_name option
Only one caller was using it, so move the check to that caller.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-23 14:29:56 +09:00
brian m. carlson
4417df8c49 refs/files-backend: convert many internals to struct object_id
Convert many of the internals of the files backend to use struct
object_id.  Avoid converting public APIs (except one change to
refs/ref-cache.c) to limit the scope of the changes.

Convert one use of get_sha1_hex to parse_oid_hex, and rely on the fact
that a strbuf will be NUL-terminated and that parse_oid_hex will fail on
truncated input to avoid the need to check the length.

This is a requirement to convert parse_object later on.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
Michael Haggerty
f890db83ee do_for_each_entry_in_dir(): delete function
Its only remaining caller was itself.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
059ae35a48 cache_ref_iterator_begin(): make function smarter
Change `cache_ref_iterator_begin()` to take two new arguments:

* `prefix` -- to iterate only over references with the specified
  prefix.

* `prime_dir` -- to "prime" (i.e., pre-load) the cache before starting
  the iteration.

The new functionality makes it possible for
`files_ref_iterator_begin()` to be made more ignorant of the internals
of `ref_cache`, and `find_containing_dir()` and `prime_ref_dir()` to
be made private.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
5c7bba77b2 do_for_each_entry_in_dir(): eliminate offset argument
It was never used.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
e3bf2989ca refs: handle "refs/bisect/" in loose_fill_ref_dir()
That "refs/bisect/" has to be handled specially when filling the
ref_cache for loose references is a peculiarity of the files backend,
and the ref-cache code shouldn't need to know about it. So move this
code to the callback function, `loose_fill_ref_dir()`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
df30875987 ref-cache: use a callback function to fill the cache
It is a leveling violation for `ref_cache` to know about
`files_ref_store` or that it should call `read_loose_refs()` to lazily
fill cache directories. So instead, have its constructor take as an
argument a callback function that it should use for lazy-filling, and
change `files_ref_store` to supply a pointer to function
`read_loose_refs` (renamed to `loose_fill_ref_dir`) when creating the
ref cache for its loose refs.

This means that we can generify the type of the back-pointer in
`struct ref_cache` from `files_ref_store` to `ref_store`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
e00d1a4ff7 refs: record the ref_store in ref_cache, not ref_dir
Instead of keeping a pointer to the `ref_store` in every `ref_dir`
entry, store it once in `struct ref_cache`, and change `struct
ref_dir` to include a pointer to its containing `ref_cache` instead.
This makes it easier to add to the information that is accessible from
a `ref_dir` without increasing the size of every `ref_dir` instance.

Note that previously, every `ref_dir` pointed at the containing
`files_ref_store` regardless of whether it was a part of the loose or
packed reference cache. Now we have to be sure to initialize the
instances to point at the correct containing `ref_cache`. So change
`create_dir_entry()` to take a `ref_cache` parameter, and change its
callers to pass the correct `ref_cache` depending on the purpose of
the new `dir_entry`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
7c22bc8a18 ref-cache: introduce a new type, ref_cache
For now, it just wraps a `ref_entry *` that points at the root of the
tree. Soon it will hold more information.

Add two new functions, `create_ref_cache()` and `free_ref_cache()`.
Make `free_ref_entry()` private.

Change files-backend to use this type to hold its caches.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:46 -07:00
Michael Haggerty
958f964691 refs: split ref_cache code into separate files
The `ref_cache` code is currently too tightly coupled to
`files-backend`, making the code harder to understand and making it
awkward for new code to use `ref_cache` (as we indeed have planned).
Start loosening that coupling by splitting `ref_cache` into a separate
module.

This commit moves code, adds declarations, and changes the visibility
of some functions, but doesn't change any code.

The modules are still too tightly coupled, but the situation will be
improved in subsequent commits.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 21:32:45 -07:00