git/refs/packed-backend.h
Patrick Steinhardt c3459ae9ef refs/files: use heuristic to decide whether to repack with --auto
The `--auto` flag for git-pack-refs(1) allows the ref backend to decide
whether or not a repack is in order. This switch has been introduced
mostly with the "reftable" backend in mind, which already knows to
auto-compact its tables during normal operations. When the flag is set,
then it will use the same auto-compaction mechanism and thus end up
doing nothing in most cases.

The "files" backend does not have any such heuristic yet and instead
packs any loose references unconditionally. So we rewrite the complete
"packed-refs" file even if there's only a single loose reference to be
packed.

Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using
`git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is
unconditionally executed via our auto maintenance, so we end up repacking
references every single time auto maintenance kicks in. And while that
commit already mentioned that the "files" backend unconditionally packs
refs now, the author obviously didn't quite think about the consequences
thereof. So while the idea was sound, we really should have added a
heuristic to the "files" backend before implementing it.

Introduce a heuristic that decides whether or not it is worth packing
loose references. The important factors to decide here are the number of
loose references in comparison to the overall size of the "packed-refs"
file. The bigger the "packed-refs" file, the longer it takes to rewrite
it and thus we scale up the limit of allowed loose references before we
repack.
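
As a rough illustration of that scaling, the limit could grow with the
logarithm of the estimated number of packed refs. The sketch below is
self-contained and does not use any Git internals. All `sketch_*` names
are hypothetical, and the constants (about 100 bytes per packed ref, a
scale factor of 5, a floor of 16 loose refs) are assumptions picked for
illustration; none of them are stated in this message.

```c
#include <stddef.h>

/* Assumed constants, for illustration only. */
#define SKETCH_BYTES_PER_REF 100 /* rough size of one "packed-refs" entry */
#define SKETCH_SCALE 5           /* growth of the limit per doubling */
#define SKETCH_MIN_LIMIT 16      /* never repack for fewer loose refs */

/* Hypothetical helper: floor(log2(n)) for n > 0, and 0 for n == 0. */
static unsigned int sketch_log2(size_t n)
{
	unsigned int bits = 0;

	while (n > 1) {
		n >>= 1;
		bits++;
	}
	return bits;
}

/*
 * Hypothetical helper: number of loose refs we tolerate before a repack
 * is considered worthwhile, given the "packed-refs" file size in bytes.
 */
static size_t sketch_loose_ref_limit(size_t packed_size)
{
	size_t estimated_refs = packed_size / SKETCH_BYTES_PER_REF;
	size_t limit = (size_t)sketch_log2(estimated_refs) * SKETCH_SCALE;

	return limit < SKETCH_MIN_LIMIT ? SKETCH_MIN_LIMIT : limit;
}

/* Returns 1 if the loose refs should be packed, 0 otherwise. */
static int sketch_should_pack(size_t packed_size, size_t loose_refs)
{
	return loose_refs >= sketch_loose_ref_limit(packed_size);
}
```

Under these assumed constants, a missing or empty "packed-refs" file
(size 0) yields the floor of 16 loose refs, while a file of 102400
bytes (roughly 1024 refs) raises the limit to 50.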

As is the nature of heuristics, this mechanism isn't obviously
"correct", but should rather be seen as a tradeoff between how many
resources we spend packing refs and how inefficient the ref store
becomes. For what it's worth, we have successfully been using the exact
same heuristic in Gitaly for several years by now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00


#ifndef REFS_PACKED_BACKEND_H
#define REFS_PACKED_BACKEND_H

struct repository;
struct ref_transaction;

/*
 * Support for storing references in a `packed-refs` file.
 *
 * Note that this backend doesn't check for D/F conflicts, because it
 * doesn't care about them. But usually it should be wrapped in a
 * `files_ref_store` that prevents D/F conflicts from being created,
 * even among packed refs.
 */

struct ref_store *packed_ref_store_init(struct repository *repo,
					const char *gitdir,
					unsigned int store_flags);

/*
 * Lock the packed-refs file for writing. Flags is passed to
 * hold_lock_file_for_update(). Return 0 on success. On errors, write
 * an error message to `err` and return a nonzero value.
 */
int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err);

void packed_refs_unlock(struct ref_store *ref_store);
int packed_refs_is_locked(struct ref_store *ref_store);

/*
 * Obtain the size of the `packed-refs` file. Reports `0` as size in case there
 * is no packed-refs file. Returns 0 on success, negative otherwise.
 */
int packed_refs_size(struct ref_store *ref_store,
		     size_t *out);

/*
 * Return true if `transaction` really needs to be carried out against
 * the specified packed_ref_store, or false if it can be skipped
 * (i.e., because it is an obvious NOOP). `ref_store` must be locked
 * before calling this function.
 */
int is_packed_transaction_needed(struct ref_store *ref_store,
				 struct ref_transaction *transaction);

#endif /* REFS_PACKED_BACKEND_H */