1
0
mirror of https://github.com/git/git synced 2024-06-30 22:54:27 +00:00
Commit Graph

349 Commits

Author SHA1 Message Date
Junio C Hamano
5f14d20984 Merge branch 'kn/update-ref-symref'
"git update-ref --stdin" learned to handle transactional updates of
symbolic-refs.

* kn/update-ref-symref:
  update-ref: add support for 'symref-update' command
  reftable: pick either 'oid' or 'target' for new updates
  update-ref: add support for 'symref-create' command
  update-ref: add support for 'symref-delete' command
  update-ref: add support for 'symref-verify' command
  refs: specify error for regular refs with `old_target`
  refs: create and use `ref_update_expects_existing_old_ref()`
2024-06-20 15:45:12 -07:00
Junio C Hamano
4216329457 Merge branch 'ps/no-writable-strings'
Building with "-Werror -Wwrite-strings" is now supported.

* ps/no-writable-strings: (27 commits)
  config.mak.dev: enable `-Wwrite-strings` warning
  builtin/merge: always store allocated strings in `pull_twohead`
  builtin/rebase: always store allocated string in `options.strategy`
  builtin/rebase: do not assign default backend to non-constant field
  imap-send: fix leaking memory in `imap_server_conf`
  imap-send: drop global `imap_server_conf` variable
  mailmap: always store allocated strings in mailmap blob
  revision: always store allocated strings in output encoding
  remote-curl: avoid assigning string constant to non-const variable
  send-pack: always allocate receive status
  parse-options: cast long name for OPTION_ALIAS
  http: do not assign string constant to non-const field
  compat/win32: fix const-correctness with string constants
  pretty: add casts for decoration option pointers
  object-file: make `buf` parameter of `index_mem()` a constant
  object-file: mark cached object buffers as const
  ident: add casts for fallback name and GECOS
  entry: refactor how we remove items for delayed checkouts
  line-log: always allocate the output prefix
  line-log: stop assigning string constant to file parent buffer
  ...
2024-06-17 15:55:58 -07:00
Patrick Steinhardt
b567004b4b global: improve const correctness when assigning string constants
We're about to enable `-Wwrite-strings`, which changes the type of
string constants to `const char[]`. Fix various sites where we assign
such constants to non-const variables.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:30:48 -07:00
Karthik Nayak
ed3272720e update-ref: add support for 'symref-create' command
Add 'symref-create' command to the '--stdin' mode 'git-update-ref' to
allow creation of symbolic refs in a transaction. The 'symref-create'
command takes in a <new-target>, which the created <ref> will point to.

Also, support the 'core.prefersymlinkrefs' config, wherein if the config
is set and the filesystem supports symlinks, we create the symbolic ref
as a symlink. We fallback to creating a regular symref if creating the
symlink is unsuccessful.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:45 -07:00
Karthik Nayak
2343720967 update-ref: add support for 'symref-delete' command
Add a new command 'symref-delete' to allow deletions of symbolic refs in
a transaction via the '--stdin' mode of the 'git-update-ref' command.
The 'symref-delete' command can, when given an <old-target>, delete the
provided <ref> only when it points to <old-target>.

This command is only compatible with the 'no-deref' mode because we
optionally want to check the 'old_target' of the ref being deleted.
De-referencing a symbolic ref would provide a regular ref and we already
have the 'delete' command for regular refs.

While users can also use 'git symbolic-ref -d' to delete symbolic refs,
the 'symref-delete' command in 'git-update-ref' allows users to do so
within a transaction, which promises atomicity of the operation and can
be batched with other commands.

When no 'old_target' is provided it can also delete regular refs,
similar to how the 'delete' command can delete symrefs when no 'old_oid'
is provided.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
Karthik Nayak
1451ac734f update-ref: add support for 'symref-verify' command
The 'symref-verify' command allows users to verify if a provided <ref>
contains the provided <old-target> without changing the <ref>. If
<old-target> is not provided, the command will verify that the <ref>
doesn't exist.

The command allows users to verify symbolic refs within a transaction,
and this means users can perform a set of changes in a transaction only
when the verification holds good.

Since we're checking for symbolic refs, this command will only work with
the 'no-deref' mode. This is because any dereferenced symbolic ref will
point to an object and not a ref and the regular 'verify' command can be
used in such situations.

Add required tests for symref support in 'verify'. Since we're here,
also add reflog checks for the pre-existing 'verify' tests, there is no
divergence from behavior, but we never tested to ensure that reflog
wasn't affected by the 'verify' command.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07 10:25:44 -07:00
Patrick Steinhardt
6d6a3a99c7 refs: implement logic to migrate between ref storage formats
With the introduction of the new "reftable" backend, users may want to
migrate repositories between the backends without having to recreate the
whole repository. Add the logic to do so.

The implementation is generic and works with arbitrary ref storage
formats so that a backend does not need to implement any migration
logic. It does have a few limitations though:

  - We do not migrate repositories with worktrees, because worktrees
    have separate ref storages. It makes the overall affair more complex
    if we have to migrate multiple storages at once.

  - We do not migrate reflogs, because we have no interfaces to write
    many reflog entries.

  - We do not lock the repository for concurrent access, and thus
    concurrent writes may end up with weird in-between states. There is
    no way to fully lock the "files" backend for writes due to its
    format, and thus we punt on this topic altogether and defer to the
    user to avoid those from happening.

In other words, this version is a minimum viable product for migrating a
repository's ref storage format. It works alright for bare repos, which
often have neither worktrees nor reflogs. But it will not work for many
other repositories without some preparations. These limitations are not
set into stone though, and ideally we will eventually address them over
time.

The logic is not yet used by anything, and thus there are no tests for
it. Those will be added in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:33 -07:00
Patrick Steinhardt
64a6dd8ffc refs: implement removal of ref storages
We're about to introduce logic to migrate ref storages. One part of the
migration will be to delete the files that are part of the old ref
storage format. We don't yet have a way to delete such data generically
across ref backends though.

Implement a new `delete` callback and expose it via a new
`ref_storage_delete()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:33 -07:00
Patrick Steinhardt
fbd1a693c7 refs: allow to skip creation of reflog entries
The ref backends do not have any way to disable the creation of reflog
entries. This will be required for upcoming ref format migration logic
so that we do not create any entries that didn't exist in the original
ref database.

Provide a new `REF_SKIP_CREATE_REFLOG` flag that allows the caller to
disable reflog entry creation.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
Patrick Steinhardt
318efb966b refs: convert ref storage format to an enum
The ref storage format is tracked as a simple unsigned integer, which
makes it harder than necessary to discover what that integer actually is
or where its values are defined.

Convert the ref storage format to instead be an enum.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06 09:04:31 -07:00
Junio C Hamano
16a592f132 Merge branch 'ps/pseudo-ref-terminology'
Terminology to call various ref-like things are getting
straightened out.

* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-28 11:17:06 -07:00
Junio C Hamano
0ff6d23a0f Merge branch 'ps/pseudo-ref-terminology' into ps/ref-storage-migration
* ps/pseudo-ref-terminology:
  refs: refuse to write pseudorefs
  ref-filter: properly distinuish pseudo and root refs
  refs: pseudorefs are no refs
  refs: classify HEAD as a root ref
  refs: do not check ref existence in `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: rename `is_pseudoref()` to `is_root_ref()`
  Documentation/glossary: define root refs as refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: redefine pseudorefs as special refs
2024-05-23 09:14:32 -07:00
Junio C Hamano
e55f364398 Merge branch 'ps/refs-without-the-repository-updates' into ps/ref-storage-migration
* ps/refs-without-the-repository-updates:
  refs/packed: remove references to `the_hash_algo`
  refs/files: remove references to `the_hash_algo`
  refs/files: use correct repository
  refs: remove `dwim_log()`
  refs: drop `git_default_branch_name()`
  refs: pass repo when peeling objects
  refs: move object peeling into "object.c"
  refs: pass ref store when detecting dangling symrefs
  refs: convert iteration over replace refs to accept ref store
  refs: retrieve worktree ref stores via associated repository
  refs: refactor `resolve_gitlink_ref()` to accept a repository
  refs: pass repo when retrieving submodule ref store
  refs: track ref stores via strmap
  refs: implement releasing ref storages
  refs: rename `init_db` callback to avoid confusion
  refs: adjust names for `init` and `init_db` callbacks
2024-05-23 09:14:08 -07:00
Junio C Hamano
4beb7a3b06 Merge branch 'kn/ref-transaction-symref'
Updates to symbolic refs can now be made as a part of ref
transaction.

* kn/ref-transaction-symref:
  refs: remove `create_symref` and associated dead code
  refs: rename `refs_create_symref()` to `refs_update_symref()`
  refs: use transaction in `refs_create_symref()`
  refs: add support for transactional symref updates
  refs: move `original_update_refname` to 'refs.c'
  refs: support symrefs in 'reference-transaction' hook
  files-backend: extract out `create_symref_lock()`
  refs: accept symref values in `ref_transaction_update()`
2024-05-20 11:20:04 -07:00
Patrick Steinhardt
2bb444b196 refs: remove dwim_log()
Remove `dwim_log()` in favor of `repo_dwim_log()` so that we can get rid
of one more dependency on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt
97abaab5f6 refs: drop git_default_branch_name()
The `git_default_branch_name()` function is a thin wrapper around
`repo_default_branch_name()` with two differences:

  - We implicitly rely on `the_repository`.

  - We cache the default branch name.

None of the callsites of `git_default_branch_name()` are hot code paths
though, so the caching of the branch name is not really required.

Refactor the callsites to use `repo_default_branch_name()` instead and
drop `git_default_branch_name()`, thus getting rid of one more case
where we rely on `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt
30aaff437f refs: pass repo when peeling objects
Both `peel_object()` and `peel_iterated_oid()` implicitly rely on
`the_repository` to look up objects. Despite the fact that we want to
get rid of `the_repository`, it also leads to some restrictions in our
ref iterators when trying to retrieve the peeled value for a repository
other than `the_repository`.

Refactor these functions such that both take a repository as argument
and remove the now-unnecessary restrictions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:39 -07:00
Patrick Steinhardt
330a2ae60b refs: pass ref store when detecting dangling symrefs
Both `warn_dangling_symref()` and `warn_dangling_symrefs()` derive the
ref store via `the_repository`. Adapt them to instead take in the ref
store as a parameter. While at it, rename the functions to have a `ref_`
prefix to align them with other functions that take a ref store.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt
8378c9d27b refs: convert iteration over replace refs to accept ref store
The function `for_each_replace_ref()` is a bit of an oddball across the
refs interfaces as it accepts a pointer to the repository instead of a
pointer to the ref store. The only reason for us to accept a repository
is so that we can eventually pass it back to the callback function that
the caller has provided. This is somewhat arbitrary though, as callers
that need the repository can instead make it accessible via the callback
payload.

Refactor the function to instead accept the ref store and adjust callers
accordingly. This allows us to get rid of some of the boilerplate that
we had to carry to pass along the repository and brings us in line with
the other functions that iterate through refs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt
e19488a60a refs: refactor resolve_gitlink_ref() to accept a repository
In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to
look up the submodule ref store. Now that we can look up submodule ref
stores for arbitrary repositories we can improve this function to
instead accept a repository as parameter for which we want to resolve
the gitlink.

Do so and adjust callers accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:38 -07:00
Patrick Steinhardt
965f8991e5 refs: pass repo when retrieving submodule ref store
Looking up submodule ref stores has two deficiencies:

  - The initialized subrepo will be attributed to `the_repository`.

  - The submodule ref store will be tracked in a global map.

This makes it impossible to have submodule ref stores for a repository
other than `the_repository`.

Modify the function to accept the parent repository as parameter and
move the global map into `struct repository`. Like this it becomes
possible to look up submodule ref stores for arbitrary repositories.

Note that this also adds a new reference to `the_repository` in
`resolve_gitlink_ref()`, which is part of the refs interfaces. This will
get adjusted in the next patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
Patrick Steinhardt
71c871b48d refs: implement releasing ref storages
Ref storages are typically only initialized once for `the_repository`
and then never released. Until now we got away with that without causing
memory leaks because `the_repository` stays reachable, and because the
ref backend is reachable via `the_repository` its memory basically never
leaks.

This is about to change though because of the upcoming migration logic,
which will create a secondary ref storage. In that case, we will either
have to release the old or new ref storage to avoid leaks.

Implement a new `release` callback and expose it via a new
`ref_storage_release()` function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:37 -07:00
Patrick Steinhardt
ed93ea1602 refs: rename init_db callback to avoid confusion
Reference backends have two callbacks `init` and `init_db`. The
similarity of these two callbacks has repeatedly confused me whenever I
was looking at them, where I always had to look up which of them does
what.

Rename the `init_db` callback to `create_on_disk`, which should
hopefully be clearer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17 10:33:36 -07:00
Patrick Steinhardt
f1701f279a ref-filter: properly distinuish pseudo and root refs
The ref-filter interfaces currently define root refs as either a
detached HEAD or a pseudo ref. Pseudo refs aren't root refs though, so
let's properly distinguish those ref types.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
Patrick Steinhardt
31951c2248 refs: classify HEAD as a root ref
Root refs are those refs that live in the root of the ref hierarchy.
Our old and venerable "HEAD" reference falls into this category, but we
don't yet classify it as such in `is_root_ref()`.

Adapt the function to also treat "HEAD" as a root ref. This change is
safe to do for all current callers:

  - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    calling `is_root_ref()`.

  - The "files" and "reftable" backends explicitly call both
    `is_root_ref()` and `is_headref()` together.

This also aligns behaviour or `is_root_ref()` and `is_headref()` such
that we stop checking for ref existence. This changes semantics for our
backends:

  - In the reftable backend we already know that the ref must exist
    because `is_headref()` is called as part of the ref iterator. The
    existence check is thus redundant, and the change is safe to do.

  - In the files backend we use it when populating root refs, where we
    would skip adding the "HEAD" file if it was not possible to resolve
    it. The new behaviour is to instead mark "HEAD" as broken, which
    will cause us to emit warnings in various places.

As there are no callers of `is_headref()` left afer the refactoring, we
can absorb it completely into `is_root_ref()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
Patrick Steinhardt
afcd067dad refs: do not check ref existence in is_root_ref()
Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.

This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.

Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.

We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.

This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:

  - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    It's clear that the intent is to classify based on the ref name,
    only.

  - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
    filter root refs. Again, using existence checks is pointless here as
    the iterator has just surfaced the ref, so we know it does exist.

  - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
    determine whether it should add a ref to the root directory of its
    iterator. This had the effect that we skipped over any files that
    are either a symbolic ref, or which are not a ref at all.

    The new behaviour is to include symbolic refs know, which aligns us
    with the adapted terminology. Furthermore, files which look like
    root refs but aren't are now mark those as "broken". As broken refs
    are not surfaced by our tooling, this should not lead to a change in
    user-visible behaviour, but may cause us to emit warnings. This
    feels like the right thing to do as we would otherwise just silently
    ignore corrupted root refs completely.

So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.

Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:52 -07:00
Patrick Steinhardt
f6936e62a5 refs: rename is_pseudoref() to is_root_ref()
Rename `is_pseudoref()` to `is_root_ref()` to adapt to the newly defined
terminology in our gitglossary(7).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 07:30:51 -07:00
Patrick Steinhardt
c8f815c208 refs: remove functions without ref store
The preceding commit has rewritten all callers of ref-related functions
to use the equivalents that accept a `struct ref_store`. Consequently,
the respective variants without the ref store are now unused. Remove
them.

There are likely patch series in-flight that use the now-removed
functions. To help the authors, the old implementations have been added
to "refs.c" in an ifdef'd section as a reference for how to migrate each
of the respective callers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Patrick Steinhardt
54876c6dfb refs: add exclude_patterns parameter to for_each_fullref_in()
The `for_each_fullref_in()` function is supposedly the ref-store-less
equivalent of `refs_for_each_fullref_in()`, but the latter has gained a
new parameter `exclude_patterns` over time. Bring these two functions
back in sync again by adding the parameter to the former function, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Patrick Steinhardt
39a9ef8fc4 refs: introduce missing functions that accept a struct ref_store
While most of the functions in "refs.h" have a variant that accepts a
`struct ref_store`, some don't. Callers of these functions are thus
forced to implicitly rely on `the_repository` to figure out the ref
store that is to be used.

Introduce those missing functions to address this shortcoming.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:58 -07:00
Karthik Nayak
f151dfe3c9 refs: rename refs_create_symref() to refs_update_symref()
The `refs_create_symref()` function is used to update/create a symref.
But it doesn't check the old target of the symref, if existing. It force
updates the symref. In this regard, the name `refs_create_symref()` is a
bit misleading. So let's rename it to `refs_update_symref()`. This is
akin to how 'git-update-ref(1)' also allows us to create apart from
update.

While we're here, rename the arguments in the function to clarify what
they actually signify and reduce confusion.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:50 -07:00
Karthik Nayak
1bc4cc3fc4 refs: accept symref values in ref_transaction_update()
The function `ref_transaction_update()` obtains ref information and
flags to create a `ref_update` and add them to the transaction at hand.

To extend symref support in transactions, we need to also accept the
old and new ref targets and process it. This commit adds the required
parameters to the function and modifies all call sites.

The two parameters added are `new_target` and `old_target`. The
`new_target` is used to denote what the reference should point to when
the transaction is applied. Some functions allow this parameter to be
NULL, meaning that the reference is not changed.

The `old_target` denotes the value the reference must have before the
update. Some functions allow this parameter to be NULL, meaning that the
old value of the reference is not checked.

We also update the internal function `ref_transaction_add_update()`
similarly to take the two new parameters.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 08:51:49 -07:00
Patrick Steinhardt
f89356db4a refs/reftable: expose auto compaction via new flag
Under normal circumstances, the "reftable" backend will automatically
perform compaction after appending to the stack. It is thus not
necessary and may even be considered wasteful to run git-pack-refs(1) in
"reftable"-backed repositories as it will cause the backend to compact
all tables into a single one. We do exactly that though when running
`git maintenance run --auto` or `git gc --auto`, which gets spawned by
Git after running some specific commands.

The `--auto` mode is typically only executing optimizations as needed.
To do so, we already use several heuristics for the various different
data structures in Git to determine whether to optimize them or not.
We do not use any heuristics for refs though and instead always optimize
them.

Introduce a new `PACK_REFS_AUTO` flag that can be passed to the backend.
When not handled by the backend we will continue to behave the exact
same as we do right now, that is we optimize refs unconditionally. This
is done for the "files" backend for now to retain current behaviour,
even though we may eventually also want to introduce heuristics here.
For the "reftable" backend though we already do have auto-compaction, so
we can easily reuse that logic to implement the new auto-packing flag.

Note that under normal circumstances, this should always end up being a
no-op. After all, we already invoke the code for every single addition
to the stack. But there are special cases where it can still be helpful
to execute the auto-compaction code explicitly:

  - Concurrent writers may cause compaction to not run due to locks.

  - Callers may decide to disable compaction altogether and then pack
    refs at a later point due to various reasons.

  - Other implementations of the reftable format may do compaction
    differently or even not at all.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
Patrick Steinhardt
35aeabd6c2 refs: remove PACK_REFS_ALL flag
The intent of the `PACK_REFS_ALL` flag is to ask the backend to compact
all refs instead of only a subset of them. Thus, this flag gets passed
down to `refs_pack_refs()` via `struct pack_refs_opts::flags`.

But starting with 4fe42f326e (pack-refs: teach pack-refs --include
option, 2023-05-12), the flag's semantics have changed. Instead of being
handled by the respective backends, this flag is now getting handled by
the callers of `refs_pack_refs()` which will add a single glob ("*") to
the list of refs-to-be-packed. Thus, the flag serves no purpose to the
ref backends anymore.

Remove the flag and replace it with a local variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
Patrick Steinhardt
0f65c7a676 refs: move struct pack_refs_opts to where it's used
The declaration of `struct pack_refs_opts` is in a seemingly random
place. Move it so that it's located right next to its flags and
functions that use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-25 09:54:07 -07:00
Junio C Hamano
d037212d97 Merge branch 'kn/for-all-refs'
"git for-each-ref" learned "--include-root-refs" option to show
even the stuff outside the 'refs/' hierarchy.

* kn/for-all-refs:
  for-each-ref: add new option to include root refs
  ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR'
  refs: introduce `refs_for_each_include_root_refs()`
  refs: extract out `loose_fill_ref_dir_regular_file()`
  refs: introduce `is_pseudoref()` and `is_headref()`
2024-03-05 09:44:44 -08:00
Junio C Hamano
74522bbd98 Merge branch 'jk/reflog-special-cases-fix'
The logic to access reflog entries by date and number had ugly
corner cases at the boundaries, which have been cleaned up.

* jk/reflog-special-cases-fix:
  read_ref_at(): special-case ref@{0} for an empty reflog
  get_oid_basic(): special-case ref@{n} for oldest reflog entry
  Revert "refs: allow @{n} to work with n-sized reflog"
2024-03-05 09:44:42 -08:00
Jeff King
5edd126720 read_ref_at(): special-case ref@{0} for an empty reflog
The previous commit special-cased get_oid_basic()'s handling of ref@{n}
for a reflog with n entries. But its special case doesn't work for
ref@{0} in an empty reflog, because read_ref_at() dies when it notices
the empty reflog!

We can make this work by special-casing this in read_ref_at(). It's
somewhat gross, for two reasons:

  1. We have no reflog entry to describe in the "msg" out-parameter. So
     we have to leave it uninitialized or make something up.

  2. Likewise, we have no oid to put in the "oid" out-parameter. Leaving
     it untouched is actually the best thing here, as all of the callers
     will have initialized it with the current ref value via
     repo_dwim_log(). This is rather subtle, but it is how things worked
     in 6436a20284 (refs: allow @{n} to work with n-sized reflog,
     2021-01-07) before we reverted it.

The key difference from 6436a20284 here is that we'll return "1" to
indicate that we _didn't_ find the requested reflog entry. Coupled with
the special-casing in get_oid_basic() in the previous commit, that's
enough to make looking up ref@{0} work, and we can flip 6436a20284's
test back to expect_success.

It also means that the call in show-branch which segfaulted with
6436a20284 (and which is now tested in t3202) remains OK. The caller
notices that we could not find any reflog entry, and so it breaks out of
its loop, showing nothing. This is different from the current behavior
of producing an error, but it's just as reasonable (and is exactly what
we'd do if you asked it to walk starting at ref@{1} but there was only 1
entry).

Thus nobody should actually look at the reflog entry info we return. But
we'll still put in some fake values just to be on the safe side, since
this is such a subtle and confusing interface. Likewise, we'll document
what's going on in a comment above the function declaration. If this
were a function with a lot of callers, the footgun would probably not be
worth it. But it has only ever had two callers in its 18-year existence,
and it seems unlikely to grow more. So let's hold our noses and let
users enjoy the convenience of a simulated ref@{0}.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 10:05:35 -08:00
Karthik Nayak
d0f00c1ac1 refs: introduce refs_for_each_include_root_refs()
Introduce a new ref iteration flag `DO_FOR_EACH_INCLUDE_ROOT_REFS`,
which will be used to iterate over regular refs plus pseudorefs and
HEAD.

Refs which fall outside the `refs/` and aren't either pseudorefs or HEAD
are more of a grey area. This is because we don't block the users from
creating such refs but they are not officially supported.

Introduce `refs_for_each_include_root_refs()` which calls
`do_for_each_ref()` with this newly introduced flag.

In `refs/files-backend.c`, introduce a new function
`add_pseudoref_and_head_entries()` to add pseudorefs and HEAD to the
`ref_dir`. We then finally call `add_pseudoref_and_head_entries()`
whenever the `DO_FOR_EACH_INCLUDE_ROOT_REFS` flag is set. Any new ref
backend will also have to implement similar changes on its end.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
Karthik Nayak
1eba2240f8 refs: introduce is_pseudoref() and is_headref()
Introduce two new functions `is_pseudoref()` and `is_headref()`. This
provides the necessary functionality for us to add pseudorefs and HEAD
to the loose ref cache in the files backend, allowing us to build
tooling to print these refs.

The `is_pseudoref()` function internally calls `is_pseudoref_syntax()`
but adds onto it by also checking to ensure that the pseudoref either
ends with a "_HEAD" suffix or matches a list of exceptions. After which
we also parse the contents of the pseudoref to ensure that it conforms
to the ref format.

We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
because the function is also used by `is_current_worktree_ref()` and
making it stricter to match only known pseudorefs might have unintended
consequences due to files like 'BISECT_START' which isn't a pseudoref
but sometimes contains object ID.

Keeping this in mind, we leave `is_pseudoref_syntax()` as is and create
`is_pseudoref()` which is stricter. Ideally we'd want to move the new
syntax checks to `is_pseudoref_syntax()` but a prerequisite for this
would be to actually remove the exception list by converting those
pseudorefs to also contain a '_HEAD' suffix and perhaps move bisect
related files like 'BISECT_START' to a new directory similar to the
'rebase-merge' directory.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-23 10:36:27 -08:00
Patrick Steinhardt
31f898397b refs: drop unused params from the reflog iterator callback
The ref and reflog iterators share much of the same underlying code to
iterate over the corresponding entries. This results in some weird code
because the reflog iterator also exposes an object ID as well as a flag
to the callback function. Neither of these fields do refer to the reflog
though -- they refer to the corresponding ref with the same name. This
is quite misleading. In practice at least the object ID cannot really be
implemented in any other way as a reflog does not have a specific object
ID in the first place. This is further stressed by the fact that none of
the callbacks except for our test helper make use of these fields.

Split up the infrastucture so that ref and reflog iterators use separate
callback signatures. This allows us to drop the nonsensical fields from
the reflog iterator.

Note that internally, the backends still use the same shared infra to
iterate over both types. As the backends should never end up being
called directly anyway, this is not much of a problem and thus kept
as-is for simplicity's sake.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-21 09:58:06 -08:00
Junio C Hamano
dc8ce995a2 Merge branch 'ps/worktree-refdb-initialization'
Instead of manually creating refs/ hierarchy on disk upon a
creation of a secondary worktree, which is only usable via the
files backend, use the refs API to populate it.

* ps/worktree-refdb-initialization:
  builtin/worktree: create refdb via ref backend
  worktree: expose interface to look up worktree by name
  builtin/worktree: move setup of commondir file earlier
  refs/files: skip creation of "refs/{heads,tags}" for worktrees
  setup: move creation of "refs/" into the files backend
  refs: prepare `refs_init_db()` for initializing worktree refs
2024-01-26 08:54:46 -08:00
Junio C Hamano
32c6fc3e30 Merge branch 'ps/refstorage-extension'
Introduce a new extension "refstorage" so that we can mark a
repository that uses a non-default ref backend, like reftable.

* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
2024-01-16 10:11:57 -08:00
Patrick Steinhardt
2eb1d0c452 refs/files: skip creation of "refs/{heads,tags}" for worktrees
The files ref backend will create both "refs/heads" and "refs/tags" in
the Git directory. While this logic makes sense for normal repositories,
it does not for worktrees because those refs are "common" refs that
would always be contained in the main repository's ref database.

Introduce a new flag telling the backend that it is expected to create a
per-worktree ref database and skip creation of these dirs in the files
backend when the flag is set. No other backends (currently) need
worktree-specific logic, so this is the only required change to start
creating per-worktree ref databases via `refs_init_db()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
Patrick Steinhardt
2e573d61ff refs: prepare refs_init_db() for initializing worktree refs
The purpose of `refs_init_db()` is to initialize the on-disk files of a
new ref database. The function is quite inflexible right now though, as
callers can neither specify the `struct ref_store` nor can they pass any
flags.

Refactor the interface to accept both of these. This will be required so
that we can start initializing per-worktree ref databases via the ref
backend instead of open-coding the initialization in "worktree.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
Patrick Steinhardt
0fcc285c5e refs: refactor logic to look up storage backends
In order to look up ref storage backends, we're currently using a linked
list of backends, where each backend is expected to set up its `next`
pointer to the next ref storage backend. This is kind of a weird setup
as backends need to be aware of other backends without much of a reason.

Refactor the code so that the array of backends is centrally defined in
"refs.c", where each backend is now identified by an integer constant.
Expose functions to translate from those integer constants to the name
and vice versa, which will be required by subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
Junio C Hamano
7122f4f747 refs.h: HEAD is not that special
In-code comment explains pseudorefs but used a wrong nomenclature
"special ref".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-15 14:08:28 -08:00
Junio C Hamano
39fe402d67 Merge branch 'tb/refs-exclusion-and-packed-refs'
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.

* tb/refs-exclusion-and-packed-refs:
  ls-refs.c: avoid enumerating hidden refs where possible
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references
  refs.h: implement `hidden_refs_to_excludes()`
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: add trace2 counters for jump list
  refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
  refs/packed-backend.c: refactor `find_reference_location()`
  refs: plumb `exclude_patterns` argument throughout
  builtin/for-each-ref.c: add `--exclude` option
  ref-filter.c: parameterize match functions over patterns
  ref-filter: add `ref_filter_clear()`
  ref-filter: clear reachable list pointers after freeing
  ref-filter.h: provide `REF_FILTER_INIT`
  refs.c: rename `ref_filter`
2023-07-21 13:47:26 -07:00
Taylor Blau
15af64dcfd refs.h: implement hidden_refs_to_excludes()
In subsequent commits, we'll teach `receive-pack` and `upload-pack` to
use the new jump list feature in the packed-refs iterator by ignoring
references which are mentioned via its respective hideRefs lists.

However, the packed-ref jump lists cannot handle un-hiding rules (that
begin with '!'), or namespace comparisons (that begin with '^'). Add a
convenience function to the refs.h API to detect when either of these
conditions are met, and returns an appropriate value to pass as excluded
patterns.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau
e6bf24d39a refs.h: let for_each_namespaced_ref() take excluded patterns
A future commit will want to call `for_each_namespaced_ref()` with
a list of excluded patterns.

We could introduce a variant of that function, say,
`for_each_namespaced_ref_exclude()` which takes the extra parameter, and
reimplement the original function in terms of that. But all but one
caller (in `http-backend.c`) will supply the new parameter, so add the
new parameter to `for_each_namespaced_ref()` itself instead of
introducing a new function.

For now, supply NULL for the list of excluded patterns at all callers to
avoid changing behavior, which we will do in a future change.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00