git/t/t0063-string-list.sh

144 lines
3.1 KiB
Bash
Raw Normal View History

#!/bin/sh
#
# Copyright (c) 2012 Michael Haggerty
#
test_description='Test string list functionality'
tests: add a test mode for SANITIZE=leak, run it in CI While git can be compiled with SANITIZE=leak, we have not run regression tests under that mode. Memory leaks have only been fixed as one-offs without structured regression testing. This change adds CI testing for it. We'll now build and small set of whitelisted t00*.sh tests under Linux with a new job called "linux-leaks". The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test mode. When running in that mode, we'll assert that we were compiled with SANITIZE=leak. We'll then skip all tests, except those that we've opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true". A test setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to selectively skip tests even under "GIT_TEST_PASSING_SANITIZE_LEAK=true". In the preceding commit we started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true". This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true: $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh 1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations as follow-up change, but let's start small to begin with. In ci/run-build-and-tests.sh we make use of the default "*" case to run "make test" without any GIT_TEST_* modes. SANITIZE=leak is known to fail in combination with GIT_TEST_SPLIT_INDEX=true in t0016-oidmap.sh, and we're likely to have other such failures in various GIT_TEST_* modes. Let's focus on getting the base tests passing, we can expand coverage to GIT_TEST_* modes later. It would also be possible to implement a more lightweight version of this by only relying on setting "LSAN_OPTIONS". See <YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and <YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of that. I've opted for this approach of adding a GIT_TEST_* mode instead because it's consistent with how we handle other special test modes. Being able to add a "!SANITIZE_LEAK" prerequisite and calling "test_done" early if it isn't satisfied also means that we can more incrementally add regression tests without being forced to fix widespread and hard-to-fix leaks at the same time. We have tests that do simple checking of some tool we're interested in, but later on in the script might be stressing trace2, or common sources of leaks like "git log" in combination with the tool (e.g. the commit-graph tests). To be clear having a prerequisite could also be accomplished by using "LSAN_OPTIONS" directly. On the topic of "LSAN_OPTIONS": It would be nice to have a mode to aggregate all failures in our various scripts, see [2] for a start at doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on that for now, it can be added later. As of writing this we've got major regressions between master..seen, i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04) have regressed recently. See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about the lack of this sort of test mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false positives, 2017-09-08) for the initial addition of SANITIZE=leak. See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19), 7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent 936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the past history of "one-off" SANITIZE=leak (and more) fixes. As noted in [5] we can't support this on OSX yet until Clang 14 is released, at that point we'll probably want to resurrect that "osx-leaks" job. 1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer 2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/ 3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/ 4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/ 5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23 09:20:46 +00:00
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_split () {
cat >expected &&
test_expect_success "split $1 at $2, max $3" "
test-tool string-list split '$1' '$2' '$3' >actual &&
test_cmp expected actual &&
test-tool string-list split_in_place '$1' '$2' '$3' >actual &&
test_cmp expected actual
"
}
string-list: multi-delimiter `string_list_split_in_place()` Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 22:20:10 +00:00
test_split_in_place() {
cat >expected &&
test_expect_success "split (in place) $1 at $2, max $3" "
test-tool string-list split_in_place '$1' '$2' '$3' >actual &&
test_cmp expected actual
"
}
test_split "foo:bar:baz" ":" "-1" <<EOF
3
[0]: "foo"
[1]: "bar"
[2]: "baz"
EOF
test_split "foo:bar:baz" ":" "0" <<EOF
1
[0]: "foo:bar:baz"
EOF
test_split "foo:bar:baz" ":" "1" <<EOF
2
[0]: "foo"
[1]: "bar:baz"
EOF
test_split "foo:bar:baz" ":" "2" <<EOF
3
[0]: "foo"
[1]: "bar"
[2]: "baz"
EOF
test_split "foo:bar:" ":" "-1" <<EOF
3
[0]: "foo"
[1]: "bar"
[2]: ""
EOF
test_split "" ":" "-1" <<EOF
1
[0]: ""
EOF
test_split ":" ":" "-1" <<EOF
2
[0]: ""
[1]: ""
EOF
string-list: multi-delimiter `string_list_split_in_place()` Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 22:20:10 +00:00
test_split_in_place "foo:;:bar:;:baz:;:" ":;" "-1" <<EOF
10
[0]: "foo"
[1]: ""
[2]: ""
[3]: "bar"
[4]: ""
[5]: ""
[6]: "baz"
[7]: ""
[8]: ""
[9]: ""
EOF
test_split_in_place "foo:;:bar:;:baz" ":;" "0" <<EOF
1
[0]: "foo:;:bar:;:baz"
EOF
test_split_in_place "foo:;:bar:;:baz" ":;" "1" <<EOF
2
[0]: "foo"
[1]: ";:bar:;:baz"
EOF
test_split_in_place "foo:;:bar:;:baz" ":;" "2" <<EOF
3
[0]: "foo"
[1]: ""
[2]: ":bar:;:baz"
EOF
test_split_in_place "foo:;:bar:;:" ":;" "-1" <<EOF
7
[0]: "foo"
[1]: ""
[2]: ""
[3]: "bar"
[4]: ""
[5]: ""
[6]: ""
EOF
test_expect_success "test filter_string_list" '
test "x-" = "x$(test-tool string-list filter - y)" &&
test "x-" = "x$(test-tool string-list filter no y)" &&
test yes = "$(test-tool string-list filter yes y)" &&
test yes = "$(test-tool string-list filter no:yes y)" &&
test yes = "$(test-tool string-list filter yes:no y)" &&
test y1:y2 = "$(test-tool string-list filter y1:y2 y)" &&
test y2:y1 = "$(test-tool string-list filter y2:y1 y)" &&
test "x-" = "x$(test-tool string-list filter x1:x2 y)"
'
test_expect_success "test remove_duplicates" '
test "x-" = "x$(test-tool string-list remove_duplicates -)" &&
test "x" = "x$(test-tool string-list remove_duplicates "")" &&
test a = "$(test-tool string-list remove_duplicates a)" &&
test a = "$(test-tool string-list remove_duplicates a:a)" &&
test a = "$(test-tool string-list remove_duplicates a:a:a:a:a)" &&
test a:b = "$(test-tool string-list remove_duplicates a:b)" &&
test a:b = "$(test-tool string-list remove_duplicates a:a:b)" &&
test a:b = "$(test-tool string-list remove_duplicates a:b:b)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:b:c)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:a:b:c)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:b:b:c)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:b:c:c)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:a:b:b:c:c)" &&
test a:b:c = "$(test-tool string-list remove_duplicates a:a:a:b:b:b:c:c:c)"
'
test_done