Commit graph

5 commits

Author SHA1 Message Date
Taylor Blau a72dfab8b8 pseudo-merge.c: ensure pseudo-merge groups are closed
When generating pseudo-merge bitmaps, it's possible that concurrent
reference updates may reveal some pseudo-merge candidates which reach
objects that are not contained in the bitmap's pack or pseudo-pack
order (in the case of MIDX bitmaps).

The latter case is relatively easy to demonstrate: if we generate a MIDX
bitmap with only half of the repository packed, then the unpacked
contents are not part of the MIDX's object order.

If we happen to select one or more commit(s) from the unpacked portion
of the repository for inclusion in a pseudo-merge, we'll get the
following message when trying to generate its bitmap:

    $ git multi-pack-index write --bitmap
    [...]
    Selecting pseudo-merge commits: 100% (1/1), done.
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object ... is missing)
    Building bitmaps:  50% (1/2), done.
    error: could not write multi-pack bitmap

The attempted bitmap write then fails, leaving the repository without a
current bitmap.

Rectify this by treating a commit as a pseudo-merge candidate only if it
appears somewhere in the packing order.

This is sufficient, since we know that the original packing order is
closed under reachability, so if a commit appears in that list as a
potential pseudo-merge candidate, we know that everything reachable from
it also appears in the list (and thus the candidate is a good one).
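
As a rough illustration of the filtering described above (not the code
in pseudo-merge.c), the sketch below keeps only those candidates that
can be found in a sorted packing order. The names toy_oid,
in_packing_order() and filter_candidates() are hypothetical, and a
32-bit integer stands in for a real object ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object ID; not Git's struct object_id. */
typedef uint32_t toy_oid;

static int toy_oid_cmp(const void *a, const void *b)
{
	toy_oid x = *(const toy_oid *)a, y = *(const toy_oid *)b;
	return (x > y) - (x < y);
}

/* Returns 1 if oid appears in the sorted packing order, 0 otherwise. */
int in_packing_order(const toy_oid *sorted_order, size_t nr, toy_oid oid)
{
	return bsearch(&oid, sorted_order, nr, sizeof(*sorted_order),
		       toy_oid_cmp) != NULL;
}

/*
 * Compacts candidates in place, keeping only those present in the
 * packing order. Returns the number of candidates kept.
 */
size_t filter_candidates(toy_oid *candidates, size_t nr_candidates,
			 const toy_oid *sorted_order, size_t nr_order)
{
	size_t i, kept = 0;

	assert(candidates || !nr_candidates);
	for (i = 0; i < nr_candidates; i++)
		if (in_packing_order(sorted_order, nr_order, candidates[i]))
			candidates[kept++] = candidates[i];
	return kept;
}
```

Because the packing order is closed under reachability, checking the
candidate commit itself is enough; nothing reachable from it needs to be
looked up separately.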

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:32:28 -07:00
Taylor Blau 25b78668de pseudo-merge.c: do not generate empty pseudo-merge commits
The previous commit demonstrated it is possible to generate empty
pseudo-merge commits, which is not useful as such pseudo-merges carry no
information.

Ensure that we only generate non-empty groups by not pushing a new
commit onto the bitmap_writer when that commit has no parents.
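
A minimal sketch of that guard, assuming a toy writer with a fixed-size
array of groups. The types toy_group and toy_writer and the function
toy_push_group() are hypothetical and only mirror the idea of refusing
to record a pseudo-merge that has no parents.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_GROUPS 16

/* Hypothetical pseudo-merge group: only the parent count matters here. */
struct toy_group {
	size_t nr_parents;
};

/* Hypothetical writer holding the groups selected so far. */
struct toy_writer {
	struct toy_group groups[TOY_MAX_GROUPS];
	size_t nr_groups;
};

/*
 * Adds group to the writer unless it is empty or the writer is full.
 * Returns 1 if the group was added, 0 if it was skipped.
 */
int toy_push_group(struct toy_writer *w, struct toy_group group)
{
	assert(w != NULL);
	if (!group.nr_parents)
		return 0; /* an empty pseudo-merge carries no information */
	if (w->nr_groups == TOY_MAX_GROUPS)
		return 0;
	w->groups[w->nr_groups++] = group;
	return 1;
}
```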

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:29:15 -07:00
Taylor Blau 42f80e361c t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
Demonstrate that it is possible to generate empty pseudo-merge commits
in certain cases.

In the instance below, we generate one non-empty pseudo-merge
(containing commit "base"), and one empty pseudo-merge group
(corresponding to the unstable commits within that group).

(In my testing, the pseudo-merge machinery seems to handle empty groups
just fine, but generating them is pointless as they carry no
information.)

This commit (introducing a deliberate "test_expect_failure") is split
out from the actual fix (which will appear in the following commit) to
demonstrate that the failure is correctly induced.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:26:35 -07:00
Taylor Blau 7252d9a036 pseudo-merge: implement support for finding existing merges
This patch implements support for reusing existing pseudo-merge commits
while writing bitmaps, whenever an existing pseudo-merge bitmap has
exactly the same set of parents as one that we are about to write.

Note that unstable pseudo-merges are likely to change between
consecutive repacks, and so are generally poor candidates for reuse.
However, stable pseudo-merges (see the configuration option
'bitmapPseudoMerge.<name>.stableThreshold') are by definition unlikely
to change between runs (as they represent long-running branches).

Because there is no index from a *set* of pseudo-merge parents to a
matching pseudo-merge bitmap, we have to construct the bitmap
corresponding to the set of parents for each pending pseudo-merge commit
and see if a matching bitmap exists.

This is technically quadratic in the number of pseudo-merges, but is OK
in practice for a couple of reasons:

  - non-matching pseudo-merge bitmaps are rejected quickly as soon as
    they differ in a single bit

  - already-matched pseudo-merge bitmaps are discarded from subsequent
    rounds of search

  - the number of pseudo-merges is generally small, even for large
    repositories

In order to do this, implement (a) a function that finds a matching
pseudo-merge given some uncompressed bitset describing its parents, (b)
a function that computes the bitset of parents for a given pseudo-merge
commit, and (c) a call to that function before computing the set of
reachable objects for some pending pseudo-merge.
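
The search can be pictured roughly as below. This is a sketch under
simplifying assumptions, not the actual implementation: parent sets are
fixed-width uncompressed bitsets of PM_WORDS words, comparison stops at
the first differing word (rather than the first differing bit), and the
names pm_entry, pm_bits_equal() and pm_find_match() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM_WORDS 4 /* hypothetical fixed bitset width: 4 * 64 bits */

struct pm_entry {
	uint64_t parents[PM_WORDS]; /* one bit per parent commit position */
	int matched;                /* set once reused, skipped afterwards */
};

/* Returns 1 if both bitsets are identical; stops at the first mismatch. */
static int pm_bits_equal(const uint64_t *a, const uint64_t *b)
{
	size_t i;

	for (i = 0; i < PM_WORDS; i++)
		if (a[i] != b[i])
			return 0;
	return 1;
}

/*
 * Scans the existing pseudo-merges for one whose parent bitset equals
 * want. A match is marked so that later searches skip it. Returns the
 * index of the match, or -1 if there is none.
 */
int pm_find_match(struct pm_entry *existing, size_t nr, const uint64_t *want)
{
	size_t i;

	assert(existing || !nr);
	for (i = 0; i < nr; i++) {
		if (existing[i].matched)
			continue;
		if (pm_bits_equal(existing[i].parents, want)) {
			existing[i].matched = 1;
			return (int)i;
		}
	}
	return -1;
}
```

Calling pm_find_match() once per pending pseudo-merge gives the
quadratic behavior mentioned above, softened by the early exit in
pm_bits_equal() and by skipping entries that have already matched.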

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:44 -07:00
Taylor Blau 11d45a6e6a pack-bitmap.c: use pseudo-merges during traversal
Now that all of the groundwork has been laid to support reading and
using pseudo-merges, make use of that work in this commit by teaching
the pack-bitmap machinery to use pseudo-merge(s) when available during
traversal.

The basic operation is as follows:

  - When enumerating objects on either side of a reachability query,
    first see if any subset of the roots satisfies some pseudo-merge
    bitmap. If it does, apply that pseudo-merge bitmap.

  - If any pseudo-merge bitmap(s) were applied in the previous step, OR
    them into the result[^1]. Then repeat the process over all
    pseudo-merge bitmaps (we'll refer to this as "cascading"
    pseudo-merges). Once this is done, OR in the resulting bitmap.

  - If there is no fill-in traversal to be done, return the bitmap for
    that side of the reachability query. If there is fill-in traversal,
    then for each commit we encounter via show_commit(), check to see if
    any unsatisfied pseudo-merge containing that commit as one of its
    parents has become satisfied by the presence of that commit.

    If so, OR in the object set from that pseudo-merge bitmap, and then
    cascade. If not, continue traversal.
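
The "satisfy, apply, cascade" loop in the steps above can be sketched as
follows. This is a toy model and not the pack-bitmap.c code: it assumes
at most 64 objects (one bit each), and the names cascade_pm and
cascade_apply() are hypothetical. Note that only the objects of
satisfied pseudo-merges are OR'd into the result; the roots themselves
are consulted but never copied in, in line with footnote [^1].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: at most 64 objects, one bit per object. */
struct cascade_pm {
	uint64_t parents; /* bits of the commits merged by this pseudo-merge */
	uint64_t objects; /* bits of everything reachable from those parents */
	int applied;
};

/*
 * Applies every pseudo-merge whose parents are all present in
 * (roots | *result), OR-ing its objects into *result, and repeats until
 * a full pass applies nothing new. Returns the number applied.
 */
size_t cascade_apply(struct cascade_pm *pms, size_t nr,
		     uint64_t roots, uint64_t *result)
{
	size_t i, applied = 0;
	int changed;

	assert(result != NULL);
	do {
		changed = 0;
		for (i = 0; i < nr; i++) {
			uint64_t have = roots | *result;

			if (pms[i].applied)
				continue;
			if (pms[i].parents & ~have)
				continue; /* some parent is still missing */
			*result |= pms[i].objects;
			pms[i].applied = 1;
			applied++;
			changed = 1;
		}
	} while (changed);
	return applied;
}
```

In this model a pseudo-merge that is unsatisfied on the first pass can
become satisfied once another pseudo-merge contributes the missing
parent, which is the cascading effect described above.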

A similar implementation is present in the boundary-based bitmap
traversal routines.

[^1]: Importantly, we cannot OR in the entire set of roots along with
  the objects reachable from whatever pseudo-merge bitmaps were
  satisfied.  This may leave some dangling bits corresponding to any
  unsatisfied root(s) getting OR'd into the resulting bitmap, tricking
  other parts of the traversal into thinking we already have a
  reachability closure over those commit(s) when we do not.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-24 11:40:43 -07:00