git/t/t5326-multi-pack-bitmaps.sh
Taylor Blau 95e8383bac midx.c: make changing the preferred pack safe
The previous patch demonstrates a bug where a MIDX's auxiliary object
order can become out of sync with a MIDX bitmap.

This is because of two confounding factors:

  - First, the object order is stored in a file which is named according
    to the multi-pack index's checksum, and the MIDX does not store the
    object order. This means that the object order can change without
    altering the checksum.

  - Second, the .rev file is moved into place with finalize_object_file(),
    which link(2)'s the file into place instead of renaming it. For us,
    that means that a modified .rev file will not be moved into place if
    the MIDX's checksum was unchanged.
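
The second factor can be reproduced outside of Git with plain ln(1),
which relies on link(2). This is only a minimal sketch: the directory,
the file names, and the "abc123" checksum are hypothetical stand-ins,
not real MIDX data, and it does not call finalize_object_file() itself.

```shell
# Hypothetical stand-ins: "abc123" plays the role of an unchanged MIDX
# checksum, and the two files hold made-up "object orders".
dir=$(mktemp -d)
dst="$dir/multi-pack-index-abc123.rev"
tmp="$dir/tmp_rev"

echo "old object order" >"$dst"
echo "new object order" >"$tmp"

# Without -f, ln(1) refuses to replace an existing destination because
# link(2) fails with EEXIST, so the stale contents survive.
if ln "$tmp" "$dst" 2>/dev/null
then
	link_result=replaced
else
	link_result=kept-old
fi

contents=$(cat "$dst")
echo "$link_result: $contents"
rm -rf "$dir"
```

A rename(2)-style move (mv) would have overwritten the destination;
linking leaves the old file in place whenever the name is unchanged.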

The fix is to force the MIDX's checksum to change when the preferred
pack changes but the set of packs contained in the MIDX does not. In
other words, when the object order changes, the MIDX's checksum needs to
change with it (regardless of whether the MIDX is tracking the same or
different packs).

This prevents a race whereby changing the object order (but not the
packs themselves) allows a reader to see the new .rev file with the old
MIDX or, similarly, to see the new bitmap with the old object order.

But why can't we just stop hardlinking the .rev into place instead of
adding additional data to the MIDX? Suppose that's what we did. Then
when we go to generate the new bitmap, we'll load the old MIDX bitmap,
along with the MIDX that it references. That's fine, since the new MIDX
isn't moved into place until after the new bitmap is generated. But the
new object order *has* been moved into place. So we'll read the old
bitmaps in the new order when generating the new bitmap file, meaning
that without this secondary change, bitmap generation itself would
become a victim of the race described here.
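
To see why reading old bitmaps in the new order is harmful, consider a
toy model (an illustration only, not Git's on-disk bitmap format): a
bitmap is a string of bits, and bit N refers to the Nth object in the
object order. The function name and the objects "a", "b", "c" below are
hypothetical.

```shell
# decode_bitmap <bits> <object>...
# Print the objects whose position in the given order has its bit set.
decode_bitmap () {
	bits=$1
	result=
	shift
	for obj in "$@"
	do
		case "$bits" in
		1*) result="$result$obj " ;;
		esac
		bits=${bits#?}
	done
	echo "${result% }"
}

# The same bits, read against the old and the new object order.
old_decoded=$(decode_bitmap 101 a b c)
new_decoded=$(decode_bitmap 101 c a b)
echo "old order: $old_decoded"
echo "new order: $new_decoded"
```

In this model the bits are unchanged, yet they select a different set of
objects once the order underneath them changes.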

This can all be prevented by forcing the MIDX's checksum to change when
the object order does. By embedding the entire object order into the
MIDX, we do just that. That is, the MIDX's checksum will change in
response to any perturbation of the underlying object order. In t5326,
this will cause the MIDX's checksum to update (even without changing the
set of packs in the MIDX), preventing the stale read problem.
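
The effect can be sketched with a toy model, assuming a "MIDX" is just a
small text file and its "checksum" is the CRC printed by cksum(1). The
file names and contents are hypothetical and do not reflect the real
MIDX layout.

```shell
# Toy model, not the real MIDX format.
dir=$(mktemp -d)
midx_sum () { cksum <"$1" | cut -d" " -f1; }

# Before the fix: only the set of packs is hashed, so preferring pack A
# or pack B produces byte-identical files.
printf 'packs: A B\n' >"$dir/old-prefer-A"
printf 'packs: A B\n' >"$dir/old-prefer-B"

# After the fix: the object order is part of the hashed contents.
printf 'packs: A B\norder: A B\n' >"$dir/new-prefer-A"
printf 'packs: A B\norder: B A\n' >"$dir/new-prefer-B"

if test "$(midx_sum "$dir/old-prefer-A")" = "$(midx_sum "$dir/old-prefer-B")"
then
	without_order=same
else
	without_order=different
fi

if test "$(midx_sum "$dir/new-prefer-A")" = "$(midx_sum "$dir/new-prefer-B")"
then
	with_order=same
else
	with_order=different
fi

echo "without embedded order: $without_order"
echo "with embedded order: $with_order"
rm -rf "$dir"
```

Because the checksum names the .rev and .bitmap files, a different
checksum means a different file name, so nothing stale can be left in
place under the name a reader expects.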

Note that this makes it safe to continue to link(2) the MIDX .rev file
into place, since it is now impossible to have a .rev file that is
out-of-sync with the MIDX whose checksum it references. (But we will do
away with MIDX .rev files later in this series anyway, so this is
somewhat of a moot point.)

In theory, it is possible to store a "fingerprint" of the full object
order here, so long as that fingerprint changes at least as often as the
full object order does. Some possibilities here include storing the
identity of the preferred pack, along with the mtimes of the
non-preferred packs in a consistent order. But storing a limited part of
the information makes it difficult to reason about whether or not there
are gaps between the two that would cause us to get bitten by this bug
again.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-27 12:07:52 -08:00


#!/bin/sh
test_description='exercise basic multi-pack bitmap functionality'
. ./test-lib.sh
. "${TEST_DIRECTORY}/lib-bitmap.sh"
# We'll be writing our own midx and bitmaps, so avoid getting confused by the
# automatic ones.
GIT_TEST_MULTI_PACK_INDEX=0
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
objdir=.git/objects
midx=$objdir/pack/multi-pack-index
# midx_pack_source <obj>
midx_pack_source () {
test-tool read-midx --show-objects .git/objects | grep "^$1 " | cut -f2
}
setup_bitmap_history
test_expect_success 'enable core.multiPackIndex' '
git config core.multiPackIndex true
'
test_expect_success 'create single-pack midx with bitmaps' '
git repack -ad &&
git multi-pack-index write --bitmap &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev
'
basic_bitmap_tests
test_expect_success 'create new additional packs' '
for i in $(test_seq 1 16)
do
test_commit "$i" &&
git repack -d || return 1
done &&
git checkout -b other2 HEAD~8 &&
for i in $(test_seq 1 8)
do
test_commit "side-$i" &&
git repack -d || return 1
done &&
git checkout second
'
test_expect_success 'create multi-pack midx with bitmaps' '
git multi-pack-index write --bitmap &&
ls $objdir/pack/pack-*.pack >packs &&
test_line_count = 25 packs &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev
'
basic_bitmap_tests
test_expect_success '--no-bitmap is respected when bitmaps exist' '
git multi-pack-index write --bitmap &&
test_commit respect--no-bitmap &&
git repack -d &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev &&
git multi-pack-index write --no-bitmap &&
test_path_is_file $midx &&
test_path_is_missing $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_missing $midx-$(midx_checksum $objdir).rev
'
test_expect_success 'setup midx with base from later pack' '
# Write a and b so that "a" is a delta on top of base "b", since Git
# prefers to delete contents out of a base rather than add to a shorter
# object.
test_seq 1 128 >a &&
test_seq 1 130 >b &&
git add a b &&
git commit -m "initial commit" &&
a=$(git rev-parse HEAD:a) &&
b=$(git rev-parse HEAD:b) &&
# In the first pack, "a" is stored as a delta to "b".
p1=$(git pack-objects .git/objects/pack/pack <<-EOF
$a
$b
EOF
) &&
# In the second pack, "a" is missing, and "b" is neither a delta nor a
# base for any other object.
p2=$(git pack-objects .git/objects/pack/pack <<-EOF
$b
$(git rev-parse HEAD)
$(git rev-parse HEAD^{tree})
EOF
) &&
git prune-packed &&
# Use the second pack as the preferred source, so that "b" occurs
# earlier in the MIDX object order, rendering "a" unusable for pack
# reuse.
git multi-pack-index write --bitmap --preferred-pack=pack-$p2.idx &&
have_delta $a $b &&
test $(midx_pack_source $a) != $(midx_pack_source $b)
'
rev_list_tests 'full bitmap with backwards delta'
test_expect_success 'clone with bitmaps enabled' '
git clone --no-local --bare . clone-reverse-delta.git &&
test_when_finished "rm -fr clone-reverse-delta.git" &&
git rev-parse HEAD >expect &&
git --git-dir=clone-reverse-delta.git rev-parse HEAD >actual &&
test_cmp expect actual
'
bitmap_reuse_tests() {
from=$1
to=$2
test_expect_success "setup pack reuse tests ($from -> $to)" '
rm -fr repo &&
git init repo &&
(
cd repo &&
test_commit_bulk 16 &&
git tag old-tip &&
git config core.multiPackIndex true &&
if test "MIDX" = "$from"
then
git repack -Ad &&
git multi-pack-index write --bitmap
else
git repack -Adb
fi
)
'
test_expect_success "build bitmap from existing ($from -> $to)" '
(
cd repo &&
test_commit_bulk --id=further 16 &&
git tag new-tip &&
if test "MIDX" = "$to"
then
git repack -d &&
git multi-pack-index write --bitmap
else
git repack -Adb
fi
)
'
test_expect_success "verify resulting bitmaps ($from -> $to)" '
(
cd repo &&
git for-each-ref &&
git rev-list --test-bitmap refs/tags/old-tip &&
git rev-list --test-bitmap refs/tags/new-tip
)
'
}
bitmap_reuse_tests 'pack' 'MIDX'
bitmap_reuse_tests 'MIDX' 'pack'
bitmap_reuse_tests 'MIDX' 'MIDX'
test_expect_success 'missing object closure fails gracefully' '
rm -fr repo &&
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit loose &&
test_commit packed &&
# Do not pass "--revs"; we want a pack without the "loose"
# commit.
git pack-objects $objdir/pack/pack <<-EOF &&
$(git rev-parse packed)
EOF
test_must_fail git multi-pack-index write --bitmap 2>err &&
grep "doesn.t have full closure" err &&
test_path_is_missing $midx
)
'
test_expect_success 'setup partial bitmaps' '
test_commit packed &&
git repack &&
test_commit loose &&
git multi-pack-index write --bitmap 2>err &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev
'
basic_bitmap_tests HEAD~
test_expect_success 'removing a MIDX clears stale bitmaps' '
rm -fr repo &&
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit base &&
git repack &&
git multi-pack-index write --bitmap &&
# Write a MIDX and bitmap; remove the MIDX but leave the bitmap.
stale_bitmap=$midx-$(midx_checksum $objdir).bitmap &&
stale_rev=$midx-$(midx_checksum $objdir).rev &&
rm $midx &&
# Then write a new MIDX.
test_commit new &&
git repack &&
git multi-pack-index write --bitmap &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev &&
test_path_is_missing $stale_bitmap &&
test_path_is_missing $stale_rev
)
'
test_expect_success 'pack.preferBitmapTips' '
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit_bulk --message="%s" 103 &&
git log --format="%H" >commits.raw &&
sort <commits.raw >commits &&
git log --format="create refs/tags/%s %H" HEAD >refs &&
git update-ref --stdin <refs &&
git multi-pack-index write --bitmap &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test_path_is_file $midx-$(midx_checksum $objdir).rev &&
test-tool bitmap list-commits | sort >bitmaps &&
comm -13 bitmaps commits >before &&
test_line_count = 1 before &&
perl -ne "printf(\"create refs/tags/include/%d \", $.); print" \
<before | git update-ref --stdin &&
rm -fr $midx-$(midx_checksum $objdir).bitmap &&
rm -fr $midx-$(midx_checksum $objdir).rev &&
rm -fr $midx &&
git -c pack.preferBitmapTips=refs/tags/include \
multi-pack-index write --bitmap &&
test-tool bitmap list-commits | sort >bitmaps &&
comm -13 bitmaps commits >after &&
! test_cmp before after
)
'
test_expect_success 'writing a bitmap with --refs-snapshot' '
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit one &&
test_commit two &&
git rev-parse one >snapshot &&
git repack -ad &&
# First, write a MIDX which sees both refs/tags/one and
# refs/tags/two (causing both of those commits to receive
# bitmaps).
git multi-pack-index write --bitmap &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test-tool bitmap list-commits | sort >bitmaps &&
grep "$(git rev-parse one)" bitmaps &&
grep "$(git rev-parse two)" bitmaps &&
rm -fr $midx-$(midx_checksum $objdir).bitmap &&
rm -fr $midx-$(midx_checksum $objdir).rev &&
rm -fr $midx &&
# Then again, but with a refs snapshot which only sees
# refs/tags/one.
git multi-pack-index write --bitmap --refs-snapshot=snapshot &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test-tool bitmap list-commits | sort >bitmaps &&
grep "$(git rev-parse one)" bitmaps &&
! grep "$(git rev-parse two)" bitmaps
)
'
test_expect_success 'write a bitmap with --refs-snapshot (preferred tips)' '
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit_bulk --message="%s" 103 &&
git log --format="%H" >commits.raw &&
sort <commits.raw >commits &&
git log --format="create refs/tags/%s %H" HEAD >refs &&
git update-ref --stdin <refs &&
git multi-pack-index write --bitmap &&
test_path_is_file $midx &&
test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
test-tool bitmap list-commits | sort >bitmaps &&
comm -13 bitmaps commits >before &&
test_line_count = 1 before &&
(
grep -vf before commits.raw &&
# mark missing commits as preferred
sed "s/^/+/" before
) >snapshot &&
rm -fr $midx-$(midx_checksum $objdir).bitmap &&
rm -fr $midx-$(midx_checksum $objdir).rev &&
rm -fr $midx &&
git multi-pack-index write --bitmap --refs-snapshot=snapshot &&
test-tool bitmap list-commits | sort >bitmaps &&
comm -13 bitmaps commits >after &&
! test_cmp before after
)
'
test_expect_success 'hash-cache values are propagated from pack bitmaps' '
rm -fr repo &&
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit base &&
test_commit base2 &&
git repack -adb &&
test-tool bitmap dump-hashes >pack.raw &&
test_file_not_empty pack.raw &&
sort pack.raw >pack.hashes &&
test_commit new &&
git repack &&
git multi-pack-index write --bitmap &&
test-tool bitmap dump-hashes >midx.raw &&
sort midx.raw >midx.hashes &&
# ensure that every namehash in the pack bitmap can be found in
# the midx bitmap (i.e., that there are no oid-namehash pairs
# unique to the pack bitmap).
comm -23 pack.hashes midx.hashes >dropped.hashes &&
test_must_be_empty dropped.hashes
)
'
test_expect_success 'changing the preferred pack does not corrupt bitmaps' '
rm -fr repo &&
git init repo &&
test_when_finished "rm -fr repo" &&
(
cd repo &&
test_commit A &&
test_commit B &&
git rev-list --objects --no-object-names HEAD^ >A.objects &&
git rev-list --objects --no-object-names HEAD^.. >B.objects &&
A=$(git pack-objects $objdir/pack/pack <A.objects) &&
B=$(git pack-objects $objdir/pack/pack <B.objects) &&
cat >indexes <<-EOF &&
pack-$A.idx
pack-$B.idx
EOF
git multi-pack-index write --bitmap --stdin-packs \
--preferred-pack=pack-$A.pack <indexes &&
git rev-list --test-bitmap A &&
git multi-pack-index write --bitmap --stdin-packs \
--preferred-pack=pack-$B.pack <indexes &&
git rev-list --test-bitmap A
)
'
test_done