git/t/t1401-symbolic-ref.sh
Jeff King 613bef56b8 shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.

This has a few downsides:

  - sscanf("%s") reportedly misbehaves on macOS with some input and
    locale combinations, returning a partial or garbled string. See
    this thread:

      https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/

  - scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
    rule would never pull "origin" out of "refs/remotes/origin/HEAD".
    Instead it always produced "origin/HEAD", which is redundant with
    the "refs/remotes/%s" rule.

  - scanf in general is an error-prone interface. For example, scanning
    for "%s" will copy bytes into a destination string, which must have
    been correctly sized ahead of time to avoid a buffer overflow. In
    this case, the code is OK (the buffer is pessimistically sized to
    match the original string, which should give us a maximum). But in
    general, we do not want to encourage people to use scanf at all.
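
The greedy behavior from the second point can be reproduced with a small
sketch. The helper name scan_remote_head() is hypothetical and does not
come from git's source; it only illustrates how sscanf() treats "%s"
followed by a literal suffix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from git's source): try to pull the remote
 * name out of a "refs/remotes/<name>/HEAD" refname with sscanf().
 * `out` must hold at least strlen(refname) + 1 bytes, since "%s" may
 * copy everything after the prefix.
 * Returns the number of conversions sscanf() reports.
 */
int scan_remote_head(const char *refname, char *out, size_t outlen)
{
	assert(outlen > strlen(refname));
	return sscanf(refname, "refs/remotes/%s/HEAD", out);
}
```

Called on "refs/remotes/origin/HEAD", this reports one conversion and
leaves "origin/HEAD" in the buffer: "%s" consumes every non-whitespace
byte, so the trailing "/HEAD" literal never gets a chance to match, and
that failure does not reduce the conversion count.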

So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).

We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
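
As a rough illustration of the prefix/suffix idea (a minimal sketch, not
the code added by this patch), the function below matches a refname
against a rule that is assumed to contain exactly one "%.*s". The name
match_rule() and its signature are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: match `refname` against `rule`, which is assumed to
 * contain exactly one "%.*s" placeholder. On success, return a pointer
 * into `refname` where the placeholder's content starts and store its
 * length in `*len`. Return NULL if the rule does not match.
 */
const char *match_rule(const char *refname, const char *rule, size_t *len)
{
	const char *ph = strstr(rule, "%.*s");
	const char *suffix;
	size_t ref_len = strlen(refname);
	size_t prefix_len, suffix_len;

	assert(ph); /* every rule must carry the placeholder */
	prefix_len = (size_t)(ph - rule);
	suffix = ph + strlen("%.*s");
	suffix_len = strlen(suffix);

	/* the refname must be long enough to hold both prefix and suffix */
	if (ref_len < prefix_len + suffix_len)
		return NULL;
	/* bytes before the placeholder must match exactly */
	if (strncmp(refname, rule, prefix_len))
		return NULL;
	/* bytes after the placeholder must match the end of the refname */
	if (strcmp(refname + ref_len - suffix_len, suffix))
		return NULL;

	*len = ref_len - prefix_len - suffix_len;
	return refname + prefix_len;
}
```

With the rule "refs/remotes/%.*s/HEAD" and the refname
"refs/remotes/origin/HEAD", this yields a pointer to "origin/HEAD" with a
length of 6, i.e. just "origin". No buffer is allocated or written; the
result points into the caller's string.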

The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
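
A minimal sketch of that ptr/len flow, using hypothetical helper names
(expand_rule() and copy_match() are not git functions): the matched span
is fed straight back into a "%.*s" rule via snprintf(), and a
NUL-terminated copy is made only once, at the point of return.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: expand a lookup rule containing one "%.*s" with
 * the first `len` bytes of `name`. Returns snprintf()'s result.
 */
int expand_rule(char *out, size_t outlen, const char *rule,
		const char *name, size_t len)
{
	/* "%.*s" takes an int precision followed by the string pointer */
	return snprintf(out, outlen, rule, (int)len, name);
}

/*
 * Hypothetical helper: make the single NUL-terminated copy handed back
 * to the caller. Returns NULL if allocation fails; the caller frees it.
 */
char *copy_match(const char *name, size_t len)
{
	char *ret = malloc(len + 1);

	if (!ret)
		return NULL;
	memcpy(ret, name, len);
	ret[len] = '\0';
	return ret;
}
```

Because the span is never NUL-terminated in place, the same pointer and
length can be tried against each rule in turn without touching the
original refname.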

There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):

  - the first covers the real-world case which misbehaved on macOS.
    Setting LC_ALL is required to trigger the problem there (since
    otherwise our tests use LC_ALL=C), and hopefully is at worst simply
    ignored on other systems (and doesn't cause libc to complain, etc,
    on systems without that locale).

  - the second covers the "origin/HEAD" case as discussed above, which
    is now fixed.

  - the remainder are for "weird" cases that work both before and after
    this patch, but would be easy to get wrong with off-by-one problems
    in the parsing (and came out of discussions and earlier iterations
    of the patch that did get them wrong).

  - absent here are tests of boring, expected-to-work cases like
    "refs/heads/foo", etc. Those are covered all over the test suite
    both explicitly (for-each-ref's refname:short) and implicitly (in
    the output of git-status, etc).

Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 08:53:17 -08:00

#!/bin/sh
test_description='basic symbolic-ref tests'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
# If the tests munging HEAD fail, they can break detection of
# the git repo, meaning that further tests will operate on
# the surrounding git repo instead of the trash directory.
reset_to_sane() {
rm -rf .git &&
"$TAR" xf .git.tar
}
test_expect_success 'setup' '
git symbolic-ref HEAD refs/heads/foo &&
test_commit file &&
"$TAR" cf .git.tar .git/
'
test_expect_success 'symbolic-ref read/write roundtrip' '
git symbolic-ref HEAD refs/heads/read-write-roundtrip &&
echo refs/heads/read-write-roundtrip >expect &&
git symbolic-ref HEAD >actual &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref refuses non-ref for HEAD' '
test_must_fail git symbolic-ref HEAD foo
'
reset_to_sane
test_expect_success 'symbolic-ref refuses bare sha1' '
test_must_fail git symbolic-ref HEAD $(git rev-parse HEAD)
'
reset_to_sane
test_expect_success 'HEAD cannot be removed' '
test_must_fail git symbolic-ref -d HEAD
'
reset_to_sane
test_expect_success 'symbolic-ref can be deleted' '
git symbolic-ref NOTHEAD refs/heads/foo &&
git symbolic-ref -d NOTHEAD &&
git rev-parse refs/heads/foo &&
test_must_fail git symbolic-ref NOTHEAD
'
reset_to_sane
test_expect_success 'symbolic-ref can delete dangling symref' '
git symbolic-ref NOTHEAD refs/heads/missing &&
git symbolic-ref -d NOTHEAD &&
test_must_fail git rev-parse refs/heads/missing &&
test_must_fail git symbolic-ref NOTHEAD
'
reset_to_sane
test_expect_success 'symbolic-ref fails to delete missing FOO' '
echo "fatal: Cannot delete FOO, not a symbolic ref" >expect &&
test_must_fail git symbolic-ref -d FOO >actual 2>&1 &&
test_cmp expect actual
'
reset_to_sane
test_expect_success 'symbolic-ref fails to delete real ref' '
echo "fatal: Cannot delete refs/heads/foo, not a symbolic ref" >expect &&
test_must_fail git symbolic-ref -d refs/heads/foo >actual 2>&1 &&
git rev-parse --verify refs/heads/foo &&
test_cmp expect actual
'
reset_to_sane
test_expect_success 'create large ref name' '
# make 256+ character ref; some systems may not handle that,
# so be gentle
long=0123456789abcdef &&
long=$long/$long/$long/$long &&
long=$long/$long/$long/$long &&
long_ref=refs/heads/$long &&
tree=$(git write-tree) &&
commit=$(echo foo | git commit-tree $tree) &&
if git update-ref $long_ref $commit; then
	test_set_prereq LONG_REF
else
	echo >&2 "long refs not supported"
fi
'
test_expect_success LONG_REF 'symbolic-ref can point to large ref name' '
git symbolic-ref HEAD $long_ref &&
echo $long_ref >expect &&
git symbolic-ref HEAD >actual &&
test_cmp expect actual
'
test_expect_success LONG_REF 'we can parse long symbolic ref' '
echo $commit >expect &&
git rev-parse --verify HEAD >actual &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref reports failure in exit code' '
test_when_finished "rm -f .git/HEAD.lock" &&
>.git/HEAD.lock &&
test_must_fail git symbolic-ref HEAD refs/heads/whatever
'
test_expect_success 'symbolic-ref writes reflog entry' '
git checkout -b log1 &&
test_commit one &&
git checkout -b log2 &&
test_commit two &&
git checkout --orphan orphan &&
git symbolic-ref -m create HEAD refs/heads/log1 &&
git symbolic-ref -m update HEAD refs/heads/log2 &&
cat >expect <<-\EOF &&
update
create
EOF
git log --format=%gs -g -2 >actual &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref does not create ref d/f conflicts' '
git checkout -b df &&
test_commit df &&
test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df &&
git pack-refs --all --prune &&
test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df
'
test_expect_success 'symbolic-ref can overwrite pointer to invalid name' '
test_when_finished reset_to_sane &&
head=$(git rev-parse HEAD) &&
git symbolic-ref HEAD refs/heads/outer &&
test_when_finished "git update-ref -d refs/heads/outer/inner" &&
git update-ref refs/heads/outer/inner $head &&
git symbolic-ref HEAD refs/heads/unrelated
'
test_expect_success 'symbolic-ref can resolve d/f name (EISDIR)' '
test_when_finished reset_to_sane &&
head=$(git rev-parse HEAD) &&
git symbolic-ref HEAD refs/heads/outer/inner &&
test_when_finished "git update-ref -d refs/heads/outer" &&
git update-ref refs/heads/outer $head &&
echo refs/heads/outer/inner >expect &&
git symbolic-ref HEAD >actual &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref can resolve d/f name (ENOTDIR)' '
test_when_finished reset_to_sane &&
head=$(git rev-parse HEAD) &&
git symbolic-ref HEAD refs/heads/outer &&
test_when_finished "git update-ref -d refs/heads/outer/inner" &&
git update-ref refs/heads/outer/inner $head &&
echo refs/heads/outer >expect &&
git symbolic-ref HEAD >actual &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref refuses invalid target for non-HEAD' '
test_must_fail git symbolic-ref refs/heads/invalid foo..bar
'
test_expect_success 'symbolic-ref allows top-level target for non-HEAD' '
git symbolic-ref refs/heads/top-level FETCH_HEAD &&
git update-ref FETCH_HEAD HEAD &&
test_cmp_rev top-level HEAD
'
test_expect_success 'symbolic-ref pointing at another' '
git update-ref refs/heads/maint-2.37 HEAD &&
git symbolic-ref refs/heads/maint refs/heads/maint-2.37 &&
git checkout maint &&
git symbolic-ref HEAD >actual &&
echo refs/heads/maint-2.37 >expect &&
test_cmp expect actual &&
git symbolic-ref --no-recurse HEAD >actual &&
echo refs/heads/maint >expect &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref --short handles complex utf8 case' '
name="测试-加-增加-加-增加" &&
git symbolic-ref TEST_SYMREF "refs/heads/$name" &&
# In the real world, we saw problems with this case only
# when the locale includes UTF-8. Set it here to try to make things as
# hard as possible for us to pass, but in practice we should do the
# right thing regardless (and of course some platforms may not even
# have this locale).
LC_ALL=en_US.UTF-8 git symbolic-ref --short TEST_SYMREF >actual &&
echo "$name" >expect &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref --short handles name with suffix' '
git symbolic-ref TEST_SYMREF "refs/remotes/origin/HEAD" &&
git symbolic-ref --short TEST_SYMREF >actual &&
echo "origin" >expect &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref --short handles almost-matching name' '
git symbolic-ref TEST_SYMREF "refs/headsXfoo" &&
git symbolic-ref --short TEST_SYMREF >actual &&
echo "headsXfoo" >expect &&
test_cmp expect actual
'
test_expect_success 'symbolic-ref --short handles name with percent' '
git symbolic-ref TEST_SYMREF "refs/heads/%foo" &&
git symbolic-ref --short TEST_SYMREF >actual &&
echo "%foo" >expect &&
test_cmp expect actual
'
test_done