git/t/t5200-update-server-info.sh
Jeff King 8b5763e8fa midx: avoid duplicate packed_git entries
When we scan a pack directory to load the idx entries we find into the
packed_git list, we skip any of them that are contained in a midx. We
then load them later lazily if we actually need to access the
corresponding pack, referencing them both from the midx struct and the
packed_git list.

The lazy-load in the midx code checks to see if the midx already
mentions the pack, but doesn't otherwise check the packed_git list. This
makes sense, since we should have added any pack to both lists.

But there's a loophole! If we call close_object_store(), that frees the
midx entirely, but _not_ the packed_git structs, which we must keep
around for Reasons[1]. If we then try to look up more objects, we'll
auto-load the midx again, which won't realize that we've already loaded
those packs, and will create duplicate entries in the packed_git list.

This is possibly inefficient, because it means we may open and map the
pack redundantly. But it can also lead to weird user-visible behavior.
The case I found is in "git repack", which closes and reopens the midx
after repacking and then calls update_server_info(). We end up writing
the duplicate entries into objects/info/packs.

We could obviously de-dup them while writing that file, but it seems
like a violation of more core assumptions that we end up with these
duplicate structs at all.

We can avoid the duplicates reasonably efficiently by checking their
names in the pack_map hash. This annoyingly does require a little more
than a straight hash lookup due to the naming conventions, but it should
only happen when we are about to actually open a pack. I don't think one
extra malloc will be noticeable there.

[1] I'm not entirely sure of all the details, except that we generally
    assume the packed_git structs never go away. We noted this
    restriction in the comment added by 6f1e9394e2 (object: fix leaking
    packfiles when closing object store, 2024-08-08), but it's somewhat
    vague. At any rate, if you try freeing the structs in
    close_object_store(), you can observe segfaults all over the test
    suite. So it might be fixable, but it's not trivial.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 17:35:46 -04:00

50 lines
1.2 KiB
Bash
Executable file

#!/bin/sh
test_description='Test git update-server-info'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'setup' 'test_commit file'
test_expect_success 'create info/refs' '
git update-server-info &&
test_path_is_file .git/info/refs
'
test_expect_success 'modify and store mtime' '
test-tool chmtime =0 .git/info/refs &&
test-tool chmtime --get .git/info/refs >a
'
test_expect_success 'info/refs is not needlessly overwritten' '
git update-server-info &&
test-tool chmtime --get .git/info/refs >b &&
test_cmp a b
'
test_expect_success 'info/refs can be forced to update' '
git update-server-info -f &&
test-tool chmtime --get .git/info/refs >b &&
! test_cmp a b
'
test_expect_success 'info/refs updates when changes are made' '
test-tool chmtime =0 .git/info/refs &&
test-tool chmtime --get .git/info/refs >b &&
test_cmp a b &&
git update-ref refs/heads/foo HEAD &&
git update-server-info &&
test-tool chmtime --get .git/info/refs >b &&
! test_cmp a b
'
test_expect_success 'midx does not create duplicate pack entries' '
git repack -d --write-midx &&
git repack -d &&
grep ^P .git/objects/info/packs >packs &&
uniq -d <packs >dups &&
test_must_be_empty dups
'
test_done