1
0
mirror of https://github.com/git/git synced 2024-07-05 00:58:49 +00:00
Commit Graph

1116 Commits

Author SHA1 Message Date
Junio C Hamano
8282f95928 Merge branch 'ps/not-so-many-refs-are-special'
Define "special ref" as a very narrow set that consists of
FETCH_HEAD and MERGE_HEAD, and clarify everything else that used to
be classified as such are actually just pseudorefs.

* ps/not-so-many-refs-are-special:
  Documentation: add "special refs" to the glossary
  refs: redefine special refs
  refs: convert MERGE_AUTOSTASH to become a normal pseudo-ref
  sequencer: introduce functions to handle autostashes via refs
  refs: convert AUTO_MERGE to become a normal pseudo-ref
  sequencer: delete REBASE_HEAD in correct repo when picking commits
  sequencer: clean up pseudo refs with REF_NO_DEREF
2024-01-29 16:03:00 -08:00
Junio C Hamano
dc8ce995a2 Merge branch 'ps/worktree-refdb-initialization'
Instead of manually creating refs/ hierarchy on disk upon a
creation of a secondary worktree, which is only usable via the
files backend, use the refs API to populate it.

* ps/worktree-refdb-initialization:
  builtin/worktree: create refdb via ref backend
  worktree: expose interface to look up worktree by name
  builtin/worktree: move setup of commondir file earlier
  refs/files: skip creation of "refs/{heads,tags}" for worktrees
  setup: move creation of "refs/" into the files backend
  refs: prepare `refs_init_db()` for initializing worktree refs
2024-01-26 08:54:46 -08:00
Patrick Steinhardt
2cd33f4428 refs: redefine special refs
Now that our list of special refs really only contains refs which have
actually-special semantics, let's redefine what makes a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
Patrick Steinhardt
3f921c7591 refs: convert MERGE_AUTOSTASH to become a normal pseudo-ref
Similar to the preceding conversion of the AUTO_MERGE pseudo-ref, let's
convert the MERGE_AUTOSTASH ref to become a normal pseudo-ref as well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
Patrick Steinhardt
fd7c6ffa9e refs: convert AUTO_MERGE to become a normal pseudo-ref
In 70c70de616 (refs: complete list of special refs, 2023-12-14) we have
inrtoduced a new `is_special_ref()` function that classifies some refs
as being special. The rule is that special refs are exclusively read and
written via the filesystem directly, whereas normal refs exclucsively go
via the refs API.

The intent of that commit was to record the status quo so that we know
to route reads of such special refs consistently. Eventually, the list
should be reduced to its bare minimum of refs which really are special,
namely FETCH_HEAD and MERGE_HEAD.

Follow up on this promise and convert the AUTO_MERGE ref to become a
normal pseudo-ref by using the refs API to both read and write it
instead of accessing the filesystem directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-19 11:10:41 -08:00
Junio C Hamano
32c6fc3e30 Merge branch 'ps/refstorage-extension'
Introduce a new extension "refstorage" so that we can mark a
repository that uses a non-default ref backend, like reftable.

* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
2024-01-16 10:11:57 -08:00
Patrick Steinhardt
2e573d61ff refs: prepare refs_init_db() for initializing worktree refs
The purpose of `refs_init_db()` is to initialize the on-disk files of a
new ref database. The function is quite inflexible right now though, as
callers can neither specify the `struct ref_store` nor can they pass any
flags.

Refactor the interface to accept both of these. This will be required so
that we can start initializing per-worktree ref databases via the ref
backend instead of open-coding the initialization in "worktree.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
Junio C Hamano
dbf668a1b7 Merge branch 'ps/pseudo-refs'
Assorted changes around pseudoref handling.

* ps/pseudo-refs:
  bisect: consistently write BISECT_EXPECTED_REV via the refdb
  refs: complete list of special refs
  refs: propagate errno when reading special refs fails
  wt-status: read HEAD and ORIG_HEAD via the refdb
2024-01-02 13:51:30 -08:00
Patrick Steinhardt
173761e21b setup: start tracking ref storage format
In order to discern which ref storage format a repository is supposed to
use we need to start setting up and/or discovering the format. This
needs to happen in two separate code paths.

  - The first path is when we create a repository via `init_db()`. When
    we are re-initializing a preexisting repository we need to retain
    the previously used ref storage format -- if the user asked for a
    different format then this indicates an error and we error out.
    Otherwise we either initialize the repository with the format asked
    for by the user or the default format, which currently is the
    "files" backend.

  - The second path is when discovering repositories, where we need to
    read the config of that repository. There is not yet any way to
    configure something other than the "files" backend, so we can just
    blindly set the ref storage format to this backend.

Wire up this logic so that we have the ref storage format always readily
available when needed. As there is only a single backend and because it
is not configurable we cannot yet verify that this tracking works as
expected via tests, but tests will be added in subsequent commits. To
countermand this ommission now though, raise a BUG() in case the ref
storage format is not set up properly in `ref_store_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
Patrick Steinhardt
0fcc285c5e refs: refactor logic to look up storage backends
In order to look up ref storage backends, we're currently using a linked
list of backends, where each backend is expected to set up its `next`
pointer to the next ref storage backend. This is kind of a weird setup
as backends need to be aware of other backends without much of a reason.

Refactor the code so that the array of backends is centrally defined in
"refs.c", where each backend is now identified by an integer constant.
Expose functions to translate from those integer constants to the name
and vice versa, which will be required by subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
Patrick Steinhardt
0a06892ddd bisect: consistently write BISECT_EXPECTED_REV via the refdb
We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.

Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:27 -08:00
Patrick Steinhardt
70c70de616 refs: complete list of special refs
We have some references that are more special than others. The reason
for them being special is that they either do not follow the usual
format of references, or that they are written to the filesystem
directly by the respective owning subsystem and thus circumvent the
reference backend.

This works perfectly fine right now because the reffiles backend will
know how to read those refs just fine. But with the prospect of gaining
a new reference backend implementation we need to be a lot more careful
here:

  - We need to make sure that we are consistent about how those refs are
    written. They must either always be written via the filesystem, or
    they must always be written via the reference backend. Any mixture
    will lead to inconsistent state.

  - We need to make sure that such special refs are always handled
    specially when reading them.

We're already mostly good with regard to the first item, except for
`BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
But the current list of special refs is missing some refs that really
should be treated specially. Right now, we only treat `FETCH_HEAD` and
`MERGE_HEAD` specially here.

Introduce a new function `is_special_ref()` that contains all current
instances of special refs to fix the reading path.

Note that this is only a temporary measure where we record and rectify
the current state. Ideally, the list of special refs should in the end
only contain `FETCH_HEAD` and `MERGE_HEAD` again because they both may
reference multiple objects and can contain annotations, so they indeed
are special.

Based-on-patch-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:27 -08:00
Patrick Steinhardt
668cdc043f refs: propagate errno when reading special refs fails
Some refs in Git are more special than others due to reasons explained
in the next commit. These refs are read via `refs_read_special_head()`,
but this function doesn't behave the same as when we try to read a
normal ref. Most importantly, we do not propagate `failure_errno` in the
case where the reference does not exist, which is behaviour that we rely
on in many parts of Git.

Fix this bug by propagating errno when `strbuf_read_file()` fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:26 -08:00
Patrick Steinhardt
d6f8e72982 refs: deduplicate code to delete references
Both the files and the packed-refs reference backends now use the same
generic transactions-based code to delete references. Let's pull these
implementations up into `refs_delete_refs()` to deduplicate the code.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-17 10:12:12 +09:00
Junio C Hamano
c5fcd34e1b Merge branch 'jk/unused-parameter'
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.

* jk/unused-parameter:
  t/helper: mark unused callback void data parameters
  tag: mark unused parameters in each_tag_name_fn callbacks
  rev-parse: mark unused parameter in for_each_abbrev callback
  replace: mark unused parameter in each_mergetag_fn callback
  replace: mark unused parameter in ref callback
  merge-tree: mark unused parameter in traverse callback
  fsck: mark unused parameters in various fsck callbacks
  revisions: drop unused "opt" parameter in "tweak" callbacks
  count-objects: mark unused parameter in alternates callback
  am: mark unused keep_cr parameters
  http-push: mark unused parameter in xml callback
  http: mark unused parameters in curl callbacks
  do_for_each_ref_helper(): mark unused repository parameter
  test-ref-store: drop unimplemented reflog-expire command
2023-07-25 12:05:24 -07:00
Junio C Hamano
39fe402d67 Merge branch 'tb/refs-exclusion-and-packed-refs'
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.

* tb/refs-exclusion-and-packed-refs:
  ls-refs.c: avoid enumerating hidden refs where possible
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references
  refs.h: implement `hidden_refs_to_excludes()`
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: add trace2 counters for jump list
  refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
  refs/packed-backend.c: refactor `find_reference_location()`
  refs: plumb `exclude_patterns` argument throughout
  builtin/for-each-ref.c: add `--exclude` option
  ref-filter.c: parameterize match functions over patterns
  ref-filter: add `ref_filter_clear()`
  ref-filter: clear reachable list pointers after freeing
  ref-filter.h: provide `REF_FILTER_INIT`
  refs.c: rename `ref_filter`
2023-07-21 13:47:26 -07:00
Jeff King
1779deed39 do_for_each_ref_helper(): mark unused repository parameter
This function gets a repository parameter because it's a callback for
do_for_each_repo_ref_iterator(). But it's just a wrapper that passes
along each call to a regular each_ref_fn callback, and the latter
doesn't accept a repository argument.

Probably in the long run all of the each_ref_fn callbacks should get a
repository parameter, too. But changing that now would require updates
all over the code base. Until that happens, let's annotate this wrapper
callback to quiet the compiler's -Wunused-parameter warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
Taylor Blau
15af64dcfd refs.h: implement hidden_refs_to_excludes()
In subsequent commits, we'll teach `receive-pack` and `upload-pack` to
use the new jump list feature in the packed-refs iterator by ignoring
references which are mentioned via its respective hideRefs lists.

However, the packed-ref jump lists cannot handle un-hiding rules (that
begin with '!'), or namespace comparisons (that begin with '^'). Add a
convenience function to the refs.h API to detect when either of these
conditions are met, and returns an appropriate value to pass as excluded
patterns.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau
e6bf24d39a refs.h: let for_each_namespaced_ref() take excluded patterns
A future commit will want to call `for_each_namespaced_ref()` with
a list of excluded patterns.

We could introduce a variant of that function, say,
`for_each_namespaced_ref_exclude()` which takes the extra parameter, and
reimplement the original function in terms of that. But all but one
caller (in `http-backend.c`) will supply the new parameter, so add the
new parameter to `for_each_namespaced_ref()` itself instead of
introducing a new function.

For now, supply NULL for the list of excluded patterns at all callers to
avoid changing behavior, which we will do in a future change.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau
c45841fff8 revision.h: store hidden refs in a strvec
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.

Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.

Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:

    ref = xstrdup(value);
    len = strlen(ref);
    while (len && ref[len - 1] == '/')
            ref[--len] = '\0';
    strvec_push(hide_refs, ref);
    free(ref);

but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau
b269ac53c0 refs: plumb exclude_patterns argument throughout
The subsequent patch will want to access an optional `excluded_patterns`
array within `refs/packed-backend.c` that will cull out certain
references matching any of the given patterns on a best-effort basis.

To do so, the refs subsystem needs to be updated to pass this value
across a number of different locations.

Prepare for a future patch by introducing this plumbing now, passing
NULLs at top-level APIs in order to make that patch less noisy and more
easily readable.

Signed-off-by: Taylor Blau <me@ttaylorr.co>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Jeff King
bf1377a12b refs.c: rename ref_filter
The refs machinery has its own implementation of a `ref_filter` (used by
`for-each-ref`), which is distinct from the `ref-filter.h` API (also
used by `for-each-ref`, among other things).

Rename the one within refs.c to more clearly indicate its purpose.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Calvin Wan
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Calvin Wan
da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Elijah Newren
a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren
dd77d58795 git-compat-util.h: remove unneccessary include of wildmatch.h
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch().  The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.

Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren
c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
John Cai
826ae79fca pack-refs: teach --exclude option to exclude refs from being packed
At GitLab, we have a system that creates ephemeral internal refs that
don't live long before getting deleted. Having an option to exclude
certain refs from a packed-refs file allows these internal references to
be deleted much more efficiently.

Add an --exclude option to the pack-refs builtin, and use the ref
exclusions API to exclude certain refs from being packed into the final
packed-refs file

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:14 -07:00
Elijah Newren
e93fc5d721 treewide: remove cache.h inclusion due to object-name.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren
dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren
6c6ddf92d5 treewide: be explicit about dependence on advice.h
Dozens of files made use of advice functions, without explicitly
including advice.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
advice.h if they are using it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Junio C Hamano
e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason
c7c33f50bd post-cocci: adjust comments for recent repo_* migration
In preceding commits we changed many calls to macros that were
providing a "the_repository" argument to invoke corresponding repo_*()
function instead. Let's follow-up and adjust references to those in
comments, which coccinelle didn't (and inherently can't) catch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
Elijah Newren
e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren
a6dc3d364c treewide: remove unnecessary cache.h inclusion from a few headers
Ever since a64215b6cd ("object.h: stop depending on cache.h; make
cache.h depend on object.h", 2023-02-24), we have a few headers that
could have replaced their include of cache.h with an include of
object.h.  Make that change now.

Some C files had to start including cache.h after this change (or some
smaller header it had brought in), because the C files were depending
on things from cache.h but were only formerly implicitly getting
cache.h through one of these headers being modified in this patch.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:50 -07:00
Junio C Hamano
88cc8ed8bc Merge branch 'en/header-cleanup'
Code clean-up to clarify the rule that "git-compat-util.h" must be
the first to be included.

* en/header-cleanup:
  diff.h: remove unnecessary include of object.h
  Remove unnecessary includes of builtin.h
  treewide: replace cache.h with more direct headers, where possible
  replace-object.h: move read_replace_refs declaration from cache.h to here
  object-store.h: move struct object_info from cache.h
  dir.h: refactor to no longer need to include cache.h
  object.h: stop depending on cache.h; make cache.h depend on object.h
  ident.h: move ident-related declarations out of cache.h
  pretty.h: move has_non_ascii() declaration from commit.h
  cache.h: remove dependence on hex.h; make other files include it explicitly
  hex.h: move some hex-related declarations from cache.h
  hash.h: move some oid-related declarations from cache.h
  alloc.h: move ALLOC_GROW() functions from cache.h
  treewide: remove unnecessary cache.h includes in source files
  treewide: remove unnecessary cache.h includes
  treewide: remove unnecessary git-compat-util.h includes in headers
  treewide: ensure one of the appropriate headers is sourced first
2023-03-17 14:03:09 -07:00
Junio C Hamano
dda83e69d0 Merge branch 'jk/shorten-unambiguous-ref-wo-sscanf'
sscanf(3) used in "git symbolic-ref --short" implementation found
to be not working reliably on macOS in UTF-8 locales.  Rewrite the
code to avoid sscanf() altogether to work it around.

* jk/shorten-unambiguous-ref-wo-sscanf:
  shorten_unambiguous_ref(): avoid sscanf()
  shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
  shorten_unambiguous_ref(): avoid integer truncation
2023-02-27 10:08:57 -08:00
Elijah Newren
41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
Elijah Newren
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Jeff King
613bef56b8 shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.

This has a few downsides:

  - sscanf("%s") reportedly misbehaves on macOS with some input and
    locale combinations, returning a partial or garbled string. See
    this thread:

      https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/

  - scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
    rule would never pull "origin" out of "refs/remotes/origin/HEAD".
    Instead it always produced "origin/HEAD", which is redundant with
    the "refs/remotes/%s" rule.

  - scanf in general is an error-prone interface. For example, scanning
    for "%s" will copy bytes into a destination string, which must have
    been correctly sized ahead of time to avoid a buffer overflow. In
    this case, the code is OK (the buffer is pessimistically sized to
    match the original string, which should give us a maximum). But in
    general, we do not want to encourage people to use scanf at all.

So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).

We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as can skip the
preprocessing step and its tricky memory management entirely.

The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.

There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):

  - the first covers the real-world case which misbehaved on macOS.
    Setting LC_ALL is required to trigger the problem there (since
    otherwise our tests use LC_ALL=C), and hopefully is at worst simply
    ignored on other systems (and doesn't cause libc to complain, etc,
    on systems without that locale).

  - the second covers the "origin/HEAD" case as discussed above, which
    is now fixed

  - the remainder are for "weird" cases that work both before and after
    this patch, but would be easy to get wrong with off-by-one problems
    in the parsing (and came out of discussions and earlier iterations
    of the patch that did get them wrong).

  - absent here are tests of boring, expected-to-work cases like
    "refs/heads/foo", etc. Those are covered all over the test suite
    both explicitly (for-each-ref's refname:short) and implicitly (in
    the output of git-status, etc).

Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
Jeff King
8f416f65c9 shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
The ref_rev_parse_rules[] array is terminated with a NULL entry, and we
count it and store the result in the local nr_rules variable. But we
don't need to do so; since the array is a constant, we can compute its
size directly. The original code probably didn't do that because it was
written as part of for-each-ref, and saw the array only as a pointer. It
was migrated in 7c2b3029df (make get_short_ref a public function,
2009-04-07) and could have been updated then, but that subtlety was not
noticed.

We even have a constant that represents this value already, courtesy of
60650a48c0 (remote: make refspec follow the same disambiguation rule as
local refs, 2018-08-01), though again, nobody noticed at the time that
it could be used here, too.

The current count-up isn't a big deal, as we need to preprocess that
array anyway. But it will become more cumbersome as we refactor the
shortening code. So let's get rid of it and just use the constant
everywhere.

Note that there are two things here that aren't just simple text
replacements:

  1. We also use nr_rules to see if a previous call has initialized the
     static pre-processing variables. We can just use the scanf_fmts
     pointer to do the same thing, as it is non-NULL only after we've
     done that initialization.

  2. If nr_rules is zero after we've counted it up, we bail from the
     function. This code is unreachable, though, as the set of rules is
     hard-coded and non-empty. And that becomes even more apparent now
     that we are using the constant. So we can drop this conditional
     completely (and ironically, the code would have the same output if
     it _did_ trigger, as we'd simply skip the loop entirely and return
     the whole refname).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
Jeff King
dd5e4d3976 shorten_unambiguous_ref(): avoid integer truncation
We parse the shortened name "foo" out of the full refname
"refs/heads/foo", and then assign the result of strlen(short_name) to an
int, which may truncate or wrap to negative.

In practice, this should never happen, as it requires a 2GB refname. And
even somebody trying to do something malicious should at worst end up
with a confused answer (we use the size only to feed back as a
placeholder length to strbuf_addf() to see if there are any collisions
in the lookup rules).

And it may even be impossible to trigger this, as we parse the string
with sscanf(), and stdio formatting functions are not known for handling
large strings well. I didn't test, but I wouldn't be surprised if
sscanf() on many platforms simply reports no match here.

But even if it is not a problem in practice so far, it is worth fixing
for two reasons:

  1. We'll shortly be replacing the sscanf() call with a real parser
     which will handle arbitrary-sized strings.

  2. Assigning strlen() to an int is an anti-pattern that requires
     people to look twice when auditing for real overflow problems.

So we'll make this a size_t. Unfortunately we still have to cast to int
eventually for the strbuf_addf() call, but at least we can localize the
cast there, and check that it will be valid. I used our new cast helper
here, which will just bail completely. That should be OK, as anybody
with a 2GB refname is up to no good, but if we really wanted to, we
could detect it manually and just refuse to shorten the refname.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00
Jeff King
91e2ab1587 ls-refs: use repository parameter to iterate refs
The ls_refs() function (for the v2 protocol command of the same name)
takes a repository parameter (like all v2 commands), but ignores it. It
should use it to access the refs.

This isn't a bug in practice, since we only call this function when
serving upload-pack from the main repository. But it's an awkward
gotcha, and it causes -Wunused-parameter to complain.

The main reason we don't use the repository parameter is that the ref
iteration interface we call doesn't have a "refs_" variant that takes a
ref_store. However we can easily add one. In fact, since there is only
one other caller (in ref-filter.c), there is no need to maintain the
non-repository wrapper; that caller can just use the_repository. It's
still a long way from consistently using a repository object, but it's
one small step in the right direction.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 22:16:22 +09:00
Patrick Steinhardt
9b67eb6fbe refs: get rid of global list of hidden refs
We're about to add a new argument to git-rev-list(1) that allows it to
add all references that are visible when taking `transfer.hideRefs` et
al into account. This will require us to potentially parse multiple sets
of hidden refs, which is not easily possible right now as there is only
a single, global instance of the list of parsed hidden refs.

Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
take the list of hidden references as input and adjust callers to keep a
local list, instead. This allows us to easily use multiple hidden-ref
lists. Furthermore, it allows us to properly free this list before we
exit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:51 -05:00
Patrick Steinhardt
5eeb9aa208 refs: fix memory leak when parsing hideRefs config
When parsing the hideRefs configuration, we first duplicate the config
value so that we can modify it. We then subsequently append it to the
`hide_refs` string list, which is initialized with `strdup_strings`
enabled. As a consequence we again reallocate the string, but never
free the first duplicate and thus have a memory leak.

While we never clean up the static `hide_refs` variable anyway, this is
no excuse to make the leak worse by leaking every value twice. We are
also about to change the way this variable will be handled so that we do
indeed start to clean it up. So let's fix the memory leak by using the
`string_list_append_nodup()` so that we pass ownership of the allocated
string to `hide_refs`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:51 -05:00
Han-Wen Nienhuys
71e5473493 refs: unify parse_worktree_ref() and ref_type()
The logic to handle worktree refs (worktrees/NAME/REF and
main-worktree/REF) existed in two places:

* ref_type() in refs.c

* parse_worktree_ref() in worktree.c

Collapse this logic together in one function parse_worktree_ref():
this avoids having to cross-check the result of parse_worktree_ref()
and ref_type().

Introduce enum ref_worktree_type, which is slightly different from
enum ref_type. The latter is a misleading name (one would think that
'ref_type' would have the symref option).

Instead, enum ref_worktree_type only makes explicit how a refname
relates to a worktree. From this point of view, HEAD and
refs/bisect/abc are the same: they specify the current worktree
implicitly.

The files-backend must avoid packing refs/bisect/* and friends into
packed-refs, so expose is_per_worktree_ref() separately.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 11:11:11 -07:00
Junio C Hamano
dd407f1c7c Merge branch 'ab/unused-annotation'
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.

* ab/unused-annotation:
  git-compat-util.h: use "deprecated" for UNUSED variables
  git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14 12:56:39 -07:00