mirror of
https://github.com/git/git
synced 2024-09-29 21:27:13 +00:00
5421e7c3a1
In an earlier commit, a bug was described where it's possible for Git to produce non-murmur3 hashes when the platform's "char" type is signed, and there are paths with characters whose highest bit is set (i.e. all characters >= 0x80). That patch allows the caller to control which version of Bloom filters are read and written. However, even on platforms with a signed "char" type, it is possible to reuse existing Bloom filters if and only if there are no changed paths in any commit's first parent tree-diff whose characters have their highest bit set. When this is the case, we can reuse the existing filter without having to compute a new one. This is done by marking trees which are known to have (or not have) any such paths. When a commit's root tree is verified to not have any such paths, we mark it as such and declare that the commit's Bloom filter is reusable. Note that this heuristic only goes in one direction. If neither a commit nor its first parent have any paths in their trees with non-ASCII characters, then we know for certain that a path with non-ASCII characters will not appear in a tree-diff against that commit's first parent. The reverse isn't necessarily true: just because the tree-diff doesn't contain any such paths does not imply that no such paths exist in either tree. So we end up recomputing some Bloom filters that we don't strictly have to (i.e. their bits are the same no matter which version of murmur3 we use). But culling these out is impossible, since we'd have to perform the full tree-diff, which is the same effort as computing the Bloom filter from scratch. But because we can cache our results in each tree's flag bits, we can often avoid recomputing many filters, thereby reducing the time it takes to run $ git commit-graph write --changed-paths --reachable when upgrading from v1 to v2 Bloom filters. To benchmark this, let's generate a commit-graph in linux.git with v1 changed-paths in generation order[^1]: $ git clone git@github.com:torvalds/linux.git $ cd linux $ git commit-graph write --reachable --changed-paths $ graph=".git/objects/info/commit-graph" $ mv $graph{,.bak} Then let's time how long it takes to go from v1 to v2 filters (with and without the upgrade path enabled), resetting the state of the commit-graph each time: $ git config commitGraph.changedPathsVersion 2 $ hyperfine -p 'cp -f $graph.bak $graph' -L v 0,1 \ 'GIT_TEST_UPGRADE_BLOOM_FILTERS={v} git.compile commit-graph write --reachable --changed-paths' On linux.git (where there aren't any non-ASCII paths), the timings indicate that this patch represents a speed-up over recomputing all Bloom filters from scratch: Benchmark 1: GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths Time (mean ± σ): 124.873 s ± 0.316 s [User: 124.081 s, System: 0.643 s] Range (min … max): 124.621 s … 125.227 s 3 runs Benchmark 2: GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths Time (mean ± σ): 79.271 s ± 0.163 s [User: 74.611 s, System: 4.521 s] Range (min … max): 79.112 s … 79.437 s 3 runs Summary 'GIT_TEST_UPGRADE_BLOOM_FILTERS=1 git.compile commit-graph write --reachable --changed-paths' ran 1.58 ± 0.01 times faster than 'GIT_TEST_UPGRADE_BLOOM_FILTERS=0 git.compile commit-graph write --reachable --changed-paths' On git.git, we do have some non-ASCII paths, giving us a more modest improvement from 4.163 seconds to 3.348 seconds, for a 1.24x speed-up. On my machine, the stats for git.git are: - 8,285 Bloom filters computed from scratch - 10 Bloom filters generated as empty - 4 Bloom filters generated as truncated due to too many changed paths - 65,114 Bloom filters were reused when transitioning from v1 to v2. [^1]: Note that this is is important, since `--stdin-packs` or `--stdin-commits` orders commits in the commit-graph by their pack position (with `--stdin-packs`) or in the raw input (with `--stdin-commits`). Since we compute Bloom filters in the same order that commits appear in the graph, we must see a commit's (first) parent before we process the commit itself. This is only guaranteed to happen when sorting commits by their generation number. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
777 lines
24 KiB
Bash
Executable file
777 lines
24 KiB
Bash
Executable file
#!/bin/sh
|
|
|
|
test_description='git log for a path with Bloom filters'
|
|
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
|
|
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
|
|
|
|
. ./test-lib.sh
|
|
. "$TEST_DIRECTORY"/lib-chunk.sh
|
|
|
|
GIT_TEST_COMMIT_GRAPH=0
|
|
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
|
|
|
|
test_expect_success 'setup test - repo, commits, commit graph, log outputs' '
|
|
git init &&
|
|
mkdir A A/B A/B/C &&
|
|
test_commit c1 A/file1 &&
|
|
test_commit c2 A/B/file2 &&
|
|
test_commit c3 A/B/C/file3 &&
|
|
test_commit c4 A/file1 &&
|
|
test_commit c5 A/B/file2 &&
|
|
test_commit c6 A/B/C/file3 &&
|
|
test_commit c7 A/file1 &&
|
|
test_commit c8 A/B/file2 &&
|
|
test_commit c9 A/B/C/file3 &&
|
|
test_commit c10 file_to_be_deleted &&
|
|
git checkout -b side HEAD~4 &&
|
|
test_commit side-1 file4 &&
|
|
git checkout main &&
|
|
git merge side &&
|
|
test_commit c11 file5 &&
|
|
mv file5 file5_renamed &&
|
|
git add file5_renamed &&
|
|
git commit -m "rename" &&
|
|
rm file_to_be_deleted &&
|
|
git add . &&
|
|
git commit -m "file removed" &&
|
|
git commit --allow-empty -m "empty" &&
|
|
git commit-graph write --reachable --changed-paths &&
|
|
|
|
test_oid_cache <<-EOF
|
|
oid_version sha1:1
|
|
oid_version sha256:2
|
|
EOF
|
|
'
|
|
|
|
graph_read_expect () {
|
|
NUM_CHUNKS=6
|
|
cat >expect <<- EOF
|
|
header: 43475048 1 $(test_oid oid_version) $NUM_CHUNKS 0
|
|
num_commits: $1
|
|
chunks: oid_fanout oid_lookup commit_metadata generation_data bloom_indexes bloom_data
|
|
options: bloom(1,10,7) read_generation_data
|
|
EOF
|
|
test-tool read-graph >actual &&
|
|
test_cmp expect actual
|
|
}
|
|
|
|
test_expect_success 'commit-graph write wrote out the bloom chunks' '
|
|
graph_read_expect 16
|
|
'
|
|
|
|
# Turn off any inherited trace2 settings for this test.
|
|
sane_unset GIT_TRACE2 GIT_TRACE2_PERF GIT_TRACE2_EVENT
|
|
sane_unset GIT_TRACE2_PERF_BRIEF
|
|
sane_unset GIT_TRACE2_CONFIG_PARAMS
|
|
|
|
setup () {
|
|
rm -f "$TRASH_DIRECTORY/trace.perf" &&
|
|
git -c core.commitGraph=false log --pretty="format:%s" $1 >log_wo_bloom &&
|
|
GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace.perf" git -c core.commitGraph=true log --pretty="format:%s" $1 >log_w_bloom
|
|
}
|
|
|
|
test_bloom_filters_used () {
|
|
log_args=$1
|
|
bloom_trace_prefix="statistics:{\"filter_not_present\":${2:-0},\"maybe\""
|
|
setup "$log_args" &&
|
|
grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" &&
|
|
test_cmp log_wo_bloom log_w_bloom &&
|
|
test_path_is_file "$TRASH_DIRECTORY/trace.perf"
|
|
}
|
|
|
|
test_bloom_filters_not_used () {
|
|
log_args=$1
|
|
setup "$log_args" &&
|
|
|
|
if grep -q "statistics:{\"filter_not_present\":" "$TRASH_DIRECTORY/trace.perf"
|
|
then
|
|
# if the Bloom filter system is initialized, ensure that no
|
|
# filters were used
|
|
data="statistics:{"
|
|
# unusable filters (e.g., those computed with a
|
|
# different value of commitGraph.changedPathsVersion)
|
|
# are counted in the filter_not_present bucket, so any
|
|
# value is OK there.
|
|
data="$data\"filter_not_present\":[0-9][0-9]*,"
|
|
data="$data\"maybe\":0,"
|
|
data="$data\"definitely_not\":0,"
|
|
data="$data\"false_positive\":0}"
|
|
|
|
grep -q "$data" "$TRASH_DIRECTORY/trace.perf"
|
|
fi &&
|
|
test_cmp log_wo_bloom log_w_bloom
|
|
}
|
|
|
|
for path in A A/B A/B/C A/file1 A/B/file2 A/B/C/file3 file4 file5 file5_renamed file_to_be_deleted
|
|
do
|
|
for option in "" \
|
|
"--all" \
|
|
"--full-history" \
|
|
"--full-history --simplify-merges" \
|
|
"--simplify-merges" \
|
|
"--simplify-by-decoration" \
|
|
"--follow" \
|
|
"--first-parent" \
|
|
"--topo-order" \
|
|
"--date-order" \
|
|
"--author-date-order" \
|
|
"--ancestry-path side..main"
|
|
do
|
|
test_expect_success "git log option: $option for path: $path" '
|
|
test_bloom_filters_used "$option -- $path" &&
|
|
test_config commitgraph.readChangedPaths false &&
|
|
test_bloom_filters_not_used "$option -- $path"
|
|
'
|
|
done
|
|
done
|
|
|
|
test_expect_success 'git log -- folder works with and without the trailing slash' '
|
|
test_bloom_filters_used "-- A" &&
|
|
test_bloom_filters_used "-- A/"
|
|
'
|
|
|
|
test_expect_success 'git log for path that does not exist. ' '
|
|
test_bloom_filters_used "-- path_does_not_exist"
|
|
'
|
|
|
|
test_expect_success 'git log with --walk-reflogs does not use Bloom filters' '
|
|
test_bloom_filters_not_used "--walk-reflogs -- A"
|
|
'
|
|
|
|
test_expect_success 'git log -- multiple path specs does not use Bloom filters' '
|
|
test_bloom_filters_not_used "-- file4 A/file1"
|
|
'
|
|
|
|
test_expect_success 'git log -- "." pathspec at root does not use Bloom filters' '
|
|
test_bloom_filters_not_used "-- ."
|
|
'
|
|
|
|
test_expect_success 'git log with wildcard that resolves to a single path uses Bloom filters' '
|
|
test_bloom_filters_used "-- *4" &&
|
|
test_bloom_filters_used "-- *renamed"
|
|
'
|
|
|
|
test_expect_success 'git log with wildcard that resolves to a multiple paths does not uses Bloom filters' '
|
|
test_bloom_filters_not_used "-- *" &&
|
|
test_bloom_filters_not_used "-- file*"
|
|
'
|
|
|
|
test_expect_success 'setup - add commit-graph to the chain without Bloom filters' '
|
|
test_commit c14 A/anotherFile2 &&
|
|
test_commit c15 A/B/anotherFile2 &&
|
|
test_commit c16 A/B/C/anotherFile2 &&
|
|
git commit-graph write --reachable --split --no-changed-paths &&
|
|
test_line_count = 2 .git/objects/info/commit-graphs/commit-graph-chain
|
|
'
|
|
|
|
test_expect_success 'use Bloom filters even if the latest graph does not have Bloom filters' '
|
|
# Ensure that the number of empty filters is equal to the number of
|
|
# filters in the latest graph layer to prove that they are loaded (and
|
|
# ignored).
|
|
test_bloom_filters_used "-- A/B" 3
|
|
'
|
|
|
|
test_expect_success 'setup - add commit-graph to the chain with Bloom filters' '
|
|
test_commit c17 A/anotherFile3 &&
|
|
git commit-graph write --reachable --changed-paths --split &&
|
|
test_line_count = 3 .git/objects/info/commit-graphs/commit-graph-chain
|
|
'
|
|
|
|
test_bloom_filters_used_when_some_filters_are_missing () {
|
|
log_args=$1
|
|
bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":6,\"definitely_not\":10"
|
|
setup "$log_args" &&
|
|
grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" &&
|
|
test_cmp log_wo_bloom log_w_bloom
|
|
}
|
|
|
|
test_expect_success 'Use Bloom filters if they exist in the latest but not all commit graphs in the chain.' '
|
|
test_bloom_filters_used_when_some_filters_are_missing "-- A/B"
|
|
'
|
|
|
|
test_expect_success 'persist filter settings' '
|
|
test_when_finished rm -rf .git/objects/info/commit-graph* &&
|
|
rm -rf .git/objects/info/commit-graph* &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=9 \
|
|
GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY=15 \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":512" trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2-auto.txt" \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15,\"max_changed_paths\":512" trace2-auto.txt
|
|
'
|
|
|
|
test_max_changed_paths () {
|
|
grep "\"max_changed_paths\":$1" $2
|
|
}
|
|
|
|
test_filter_not_computed () {
|
|
grep "\"key\":\"filter-not-computed\",\"value\":\"$1\"" $2
|
|
}
|
|
|
|
test_filter_computed () {
|
|
grep "\"key\":\"filter-computed\",\"value\":\"$1\"" $2
|
|
}
|
|
|
|
test_filter_trunc_empty () {
|
|
grep "\"key\":\"filter-trunc-empty\",\"value\":\"$1\"" $2
|
|
}
|
|
|
|
test_filter_trunc_large () {
|
|
grep "\"key\":\"filter-trunc-large\",\"value\":\"$1\"" $2
|
|
}
|
|
|
|
test_filter_upgraded () {
|
|
grep "\"key\":\"filter-upgraded\",\"value\":\"$1\"" $2
|
|
}
|
|
|
|
test_expect_success 'correctly report changes over limit' '
|
|
git init limits &&
|
|
(
|
|
cd limits &&
|
|
mkdir d &&
|
|
mkdir d/e &&
|
|
|
|
for i in $(test_seq 1 2)
|
|
do
|
|
printf $i >d/file$i.txt &&
|
|
printf $i >d/e/file$i.txt || return 1
|
|
done &&
|
|
|
|
mkdir mode &&
|
|
printf bash >mode/script.sh &&
|
|
|
|
mkdir foo &&
|
|
touch foo/bar &&
|
|
touch foo.txt &&
|
|
|
|
git add d foo foo.txt mode &&
|
|
git commit -m "files" &&
|
|
|
|
# Commit has 7 file and 4 directory adds
|
|
GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS=10 \
|
|
GIT_TRACE2_EVENT="$(pwd)/trace" \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
test_max_changed_paths 10 trace &&
|
|
test_filter_computed 1 trace &&
|
|
test_filter_trunc_large 1 trace &&
|
|
|
|
for path in $(git ls-tree -r --name-only HEAD)
|
|
do
|
|
git -c commitGraph.readChangedPaths=false log \
|
|
-- $path >expect &&
|
|
git log -- $path >actual &&
|
|
test_cmp expect actual || return 1
|
|
done &&
|
|
|
|
# Make a variety of path changes
|
|
printf new1 >d/e/file1.txt &&
|
|
printf new2 >d/file2.txt &&
|
|
rm d/e/file2.txt &&
|
|
rm -r foo &&
|
|
printf text >foo &&
|
|
mkdir f &&
|
|
printf new1 >f/file1.txt &&
|
|
|
|
# including a mode-only change (counts as modified)
|
|
git update-index --chmod=+x mode/script.sh &&
|
|
|
|
git add foo d f &&
|
|
git commit -m "complicated" &&
|
|
|
|
# start from scratch and rebuild
|
|
rm -f .git/objects/info/commit-graph &&
|
|
GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS=10 \
|
|
GIT_TRACE2_EVENT="$(pwd)/trace-edit" \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
test_max_changed_paths 10 trace-edit &&
|
|
test_filter_computed 2 trace-edit &&
|
|
test_filter_trunc_large 2 trace-edit &&
|
|
|
|
for path in $(git ls-tree -r --name-only HEAD)
|
|
do
|
|
git -c commitGraph.readChangedPaths=false log \
|
|
-- $path >expect &&
|
|
git log -- $path >actual &&
|
|
test_cmp expect actual || return 1
|
|
done &&
|
|
|
|
# start from scratch and rebuild
|
|
rm -f .git/objects/info/commit-graph &&
|
|
GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS=11 \
|
|
GIT_TRACE2_EVENT="$(pwd)/trace-update" \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
test_max_changed_paths 11 trace-update &&
|
|
test_filter_computed 2 trace-update &&
|
|
test_filter_trunc_large 0 trace-update &&
|
|
|
|
for path in $(git ls-tree -r --name-only HEAD)
|
|
do
|
|
git -c commitGraph.readChangedPaths=false log \
|
|
-- $path >expect &&
|
|
git log -- $path >actual &&
|
|
test_cmp expect actual || return 1
|
|
done
|
|
)
|
|
'
|
|
|
|
test_expect_success 'correctly report commits with no changed paths' '
|
|
git init empty &&
|
|
test_when_finished "rm -fr empty" &&
|
|
(
|
|
cd empty &&
|
|
|
|
git commit --allow-empty -m "initial commit" &&
|
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable --changed-paths &&
|
|
test_filter_computed 1 trace.event &&
|
|
test_filter_not_computed 0 trace.event &&
|
|
test_filter_trunc_empty 1 trace.event &&
|
|
test_filter_trunc_large 0 trace.event
|
|
)
|
|
'
|
|
|
|
test_expect_success 'Bloom generation is limited by --max-new-filters' '
|
|
(
|
|
cd limits &&
|
|
test_commit c2 filter &&
|
|
test_commit c3 filter &&
|
|
test_commit c4 no-filter &&
|
|
|
|
rm -f trace.event &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable --split=replace \
|
|
--changed-paths --max-new-filters=2 &&
|
|
|
|
test_filter_computed 2 trace.event &&
|
|
test_filter_not_computed 3 trace.event &&
|
|
test_filter_trunc_empty 0 trace.event &&
|
|
test_filter_trunc_large 0 trace.event
|
|
)
|
|
'
|
|
|
|
test_expect_success 'Bloom generation backfills previously-skipped filters' '
|
|
# Check specifying commitGraph.maxNewFilters over "git config" works.
|
|
test_config -C limits commitGraph.maxNewFilters 1 &&
|
|
(
|
|
cd limits &&
|
|
|
|
rm -f trace.event &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable --changed-paths \
|
|
--split=replace &&
|
|
test_filter_computed 1 trace.event &&
|
|
test_filter_not_computed 4 trace.event &&
|
|
test_filter_trunc_empty 0 trace.event &&
|
|
test_filter_trunc_large 0 trace.event
|
|
)
|
|
'
|
|
|
|
test_expect_success '--max-new-filters overrides configuration' '
|
|
git init override &&
|
|
test_when_finished "rm -fr override" &&
|
|
test_config -C override commitGraph.maxNewFilters 2 &&
|
|
(
|
|
cd override &&
|
|
test_commit one &&
|
|
test_commit two &&
|
|
|
|
rm -f trace.event &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable --changed-paths \
|
|
--max-new-filters=1 &&
|
|
test_filter_computed 1 trace.event &&
|
|
test_filter_not_computed 1 trace.event &&
|
|
test_filter_trunc_empty 0 trace.event &&
|
|
test_filter_trunc_large 0 trace.event
|
|
)
|
|
'
|
|
|
|
test_expect_success 'Bloom generation backfills empty commits' '
|
|
git init empty &&
|
|
test_when_finished "rm -fr empty" &&
|
|
(
|
|
cd empty &&
|
|
for i in $(test_seq 1 6)
|
|
do
|
|
git commit --allow-empty -m "$i" || return 1
|
|
done &&
|
|
|
|
# Generate Bloom filters for empty commits 1-6, two at a time.
|
|
for i in $(test_seq 1 3)
|
|
do
|
|
rm -f trace.event &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable \
|
|
--changed-paths --max-new-filters=2 &&
|
|
test_filter_computed 2 trace.event &&
|
|
test_filter_not_computed 4 trace.event &&
|
|
test_filter_trunc_empty 2 trace.event &&
|
|
test_filter_trunc_large 0 trace.event || return 1
|
|
done &&
|
|
|
|
# Finally, make sure that once all commits have filters, that
|
|
# none are subsequently recomputed.
|
|
rm -f trace.event &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace.event" \
|
|
git commit-graph write --reachable \
|
|
--changed-paths --max-new-filters=2 &&
|
|
test_filter_computed 0 trace.event &&
|
|
test_filter_not_computed 6 trace.event &&
|
|
test_filter_trunc_empty 0 trace.event &&
|
|
test_filter_trunc_large 0 trace.event
|
|
)
|
|
'
|
|
|
|
graph=.git/objects/info/commit-graph
|
|
graphdir=.git/objects/info/commit-graphs
|
|
chain=$graphdir/commit-graph-chain
|
|
|
|
test_expect_success 'setup for mixed Bloom setting tests' '
|
|
repo=mixed-bloom-settings &&
|
|
|
|
git init $repo &&
|
|
for i in one two three
|
|
do
|
|
test_commit -C $repo $i file || return 1
|
|
done
|
|
'
|
|
|
|
test_expect_success 'ensure Bloom filters with incompatible settings are ignored' '
|
|
# Compute Bloom filters with "unusual" settings.
|
|
git -C $repo rev-parse one >in &&
|
|
GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=3 git -C $repo commit-graph write \
|
|
--stdin-commits --changed-paths --split <in &&
|
|
layer=$(head -n 1 $repo/$chain) &&
|
|
|
|
# A commit-graph layer without Bloom filters "hides" the layers
|
|
# below ...
|
|
git -C $repo rev-parse two >in &&
|
|
git -C $repo commit-graph write --stdin-commits --no-changed-paths \
|
|
--split=no-merge <in &&
|
|
|
|
# Another commit-graph layer that has Bloom filters, but with
|
|
# standard settings, and is thus incompatible with the base
|
|
# layer written above.
|
|
git -C $repo rev-parse HEAD >in &&
|
|
git -C $repo commit-graph write --stdin-commits --changed-paths \
|
|
--split=no-merge <in &&
|
|
|
|
test_line_count = 3 $repo/$chain &&
|
|
|
|
# Ensure that incompatible Bloom filters are ignored.
|
|
git -C $repo -c core.commitGraph=false log --oneline --no-decorate -- file \
|
|
>expect 2>err &&
|
|
git -C $repo log --oneline --no-decorate -- file >actual 2>err &&
|
|
test_cmp expect actual &&
|
|
grep "disabling Bloom filters for commit-graph layer .$layer." err
|
|
'
|
|
|
|
test_expect_success 'merge graph layers with incompatible Bloom settings' '
|
|
# Ensure that incompatible Bloom filters are ignored when
|
|
# merging existing layers.
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C $repo commit-graph write --reachable --changed-paths 2>err &&
|
|
grep "disabling Bloom filters for commit-graph layer .$layer." err &&
|
|
grep "{\"hash_version\":1,\"num_hashes\":7,\"bits_per_entry\":10,\"max_changed_paths\":512" trace2.txt &&
|
|
|
|
test_path_is_file $repo/$graph &&
|
|
test_dir_is_empty $repo/$graphdir &&
|
|
|
|
git -C $repo -c core.commitGraph=false log --oneline --no-decorate -- \
|
|
file >expect &&
|
|
trace_out="$(pwd)/trace.perf" &&
|
|
GIT_TRACE2_PERF="$trace_out" \
|
|
git -C $repo log --oneline --no-decorate -- file >actual 2>err &&
|
|
|
|
test_cmp expect actual &&
|
|
grep "statistics:{\"filter_not_present\":0," trace.perf &&
|
|
test_must_be_empty err
|
|
'
|
|
|
|
# chosen to be the same under all Unicode normalization forms
|
|
CENT=$(printf "\302\242")
|
|
|
|
test_expect_success 'ensure Bloom filter with incompatible versions are ignored' '
|
|
rm "$repo/$graph" &&
|
|
|
|
git -C $repo log --oneline --no-decorate -- $CENT >expect &&
|
|
|
|
# Compute v1 Bloom filters for commits at the bottom.
|
|
git -C $repo rev-parse HEAD^ >in &&
|
|
git -C $repo commit-graph write --stdin-commits --changed-paths \
|
|
--split <in &&
|
|
|
|
# Compute v2 Bloomfilters for the rest of the commits at the top.
|
|
git -C $repo rev-parse HEAD >in &&
|
|
git -C $repo -c commitGraph.changedPathsVersion=2 commit-graph write \
|
|
--stdin-commits --changed-paths --split=no-merge <in &&
|
|
|
|
test_line_count = 2 $repo/$chain &&
|
|
|
|
git -C $repo log --oneline --no-decorate -- $CENT >actual 2>err &&
|
|
test_cmp expect actual &&
|
|
|
|
layer="$(head -n 1 $repo/$chain)" &&
|
|
cat >expect.err <<-EOF &&
|
|
warning: disabling Bloom filters for commit-graph layer $SQ$layer$SQ due to incompatible settings
|
|
EOF
|
|
test_cmp expect.err err &&
|
|
|
|
# Merge the two layers with incompatible bloom filter versions,
|
|
# ensuring that the v2 filters are used.
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C $repo -c commitGraph.changedPathsVersion=2 commit-graph write --reachable --changed-paths 2>err &&
|
|
grep "disabling Bloom filters for commit-graph layer .$layer." err &&
|
|
grep "{\"hash_version\":2,\"num_hashes\":7,\"bits_per_entry\":10,\"max_changed_paths\":512" trace2.txt
|
|
'
|
|
|
|
get_first_changed_path_filter () {
|
|
test-tool read-graph bloom-filters >filters.dat &&
|
|
head -n 1 filters.dat
|
|
}
|
|
|
|
test_expect_success 'set up repo with high bit path, version 1 changed-path' '
|
|
git init highbit1 &&
|
|
test_commit -C highbit1 c1 "$CENT" &&
|
|
git -C highbit1 commit-graph write --reachable --changed-paths
|
|
'
|
|
|
|
test_expect_success 'setup check value of version 1 changed-path' '
|
|
(
|
|
cd highbit1 &&
|
|
echo "52a9" >expect &&
|
|
get_first_changed_path_filter >actual
|
|
)
|
|
'
|
|
|
|
# expect will not match actual if char is unsigned by default. Write the test
|
|
# in this way, so that a user running this test script can still see if the two
|
|
# files match. (It will appear as an ordinary success if they match, and a skip
|
|
# if not.)
|
|
if test_cmp highbit1/expect highbit1/actual
|
|
then
|
|
test_set_prereq SIGNED_CHAR_BY_DEFAULT
|
|
fi
|
|
test_expect_success SIGNED_CHAR_BY_DEFAULT 'check value of version 1 changed-path' '
|
|
# Only the prereq matters for this test.
|
|
true
|
|
'
|
|
|
|
test_expect_success 'setup make another commit' '
|
|
# "git log" does not use Bloom filters for root commits - see how, in
|
|
# revision.c, rev_compare_tree() (the only code path that eventually calls
|
|
# get_bloom_filter()) is only called by try_to_simplify_commit() when the commit
|
|
# has one parent. Therefore, make another commit so that we perform the tests on
|
|
# a non-root commit.
|
|
test_commit -C highbit1 anotherc1 "another$CENT"
|
|
'
|
|
|
|
test_expect_success 'version 1 changed-path used when version 1 requested' '
|
|
(
|
|
cd highbit1 &&
|
|
test_bloom_filters_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'version 1 changed-path not used when version 2 requested' '
|
|
(
|
|
cd highbit1 &&
|
|
git config --add commitGraph.changedPathsVersion 2 &&
|
|
test_bloom_filters_not_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'version 1 changed-path used when autodetect requested' '
|
|
(
|
|
cd highbit1 &&
|
|
git config --add commitGraph.changedPathsVersion -1 &&
|
|
test_bloom_filters_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'when writing another commit graph, preserve existing version 1 of changed-path' '
|
|
test_commit -C highbit1 c1double "$CENT$CENT" &&
|
|
git -C highbit1 commit-graph write --reachable --changed-paths &&
|
|
(
|
|
cd highbit1 &&
|
|
git config --add commitGraph.changedPathsVersion -1 &&
|
|
echo "options: bloom(1,10,7) read_generation_data" >expect &&
|
|
test-tool read-graph >full &&
|
|
grep options full >actual &&
|
|
test_cmp expect actual
|
|
)
|
|
'
|
|
|
|
test_expect_success 'set up repo with high bit path, version 2 changed-path' '
|
|
git init highbit2 &&
|
|
git -C highbit2 config --add commitGraph.changedPathsVersion 2 &&
|
|
test_commit -C highbit2 c2 "$CENT" &&
|
|
git -C highbit2 commit-graph write --reachable --changed-paths
|
|
'
|
|
|
|
test_expect_success 'check value of version 2 changed-path' '
|
|
(
|
|
cd highbit2 &&
|
|
echo "c01f" >expect &&
|
|
get_first_changed_path_filter >actual &&
|
|
test_cmp expect actual
|
|
)
|
|
'
|
|
|
|
test_expect_success 'setup make another commit' '
|
|
# "git log" does not use Bloom filters for root commits - see how, in
|
|
# revision.c, rev_compare_tree() (the only code path that eventually calls
|
|
# get_bloom_filter()) is only called by try_to_simplify_commit() when the commit
|
|
# has one parent. Therefore, make another commit so that we perform the tests on
|
|
# a non-root commit.
|
|
test_commit -C highbit2 anotherc2 "another$CENT"
|
|
'
|
|
|
|
test_expect_success 'version 2 changed-path used when version 2 requested' '
|
|
(
|
|
cd highbit2 &&
|
|
test_bloom_filters_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'version 2 changed-path not used when version 1 requested' '
|
|
(
|
|
cd highbit2 &&
|
|
git config --add commitGraph.changedPathsVersion 1 &&
|
|
test_bloom_filters_not_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'version 2 changed-path used when autodetect requested' '
|
|
(
|
|
cd highbit2 &&
|
|
git config --add commitGraph.changedPathsVersion -1 &&
|
|
test_bloom_filters_used "-- another$CENT"
|
|
)
|
|
'
|
|
|
|
test_expect_success 'when writing another commit graph, preserve existing version 2 of changed-path' '
|
|
test_commit -C highbit2 c2double "$CENT$CENT" &&
|
|
git -C highbit2 commit-graph write --reachable --changed-paths &&
|
|
(
|
|
cd highbit2 &&
|
|
git config --add commitGraph.changedPathsVersion -1 &&
|
|
echo "options: bloom(2,10,7) read_generation_data" >expect &&
|
|
test-tool read-graph >full &&
|
|
grep options full >actual &&
|
|
test_cmp expect actual
|
|
)
|
|
'
|
|
|
|
test_expect_success 'when writing commit graph, do not reuse changed-path of another version' '
|
|
git init doublewrite &&
|
|
test_commit -C doublewrite c "$CENT" &&
|
|
|
|
git -C doublewrite config --add commitGraph.changedPathsVersion 1 &&
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C doublewrite commit-graph write --reachable --changed-paths &&
|
|
test_filter_computed 1 trace2.txt &&
|
|
test_filter_upgraded 0 trace2.txt &&
|
|
|
|
git -C doublewrite commit-graph write --reachable --changed-paths &&
|
|
for v in -2 3
|
|
do
|
|
git -C doublewrite config --add commitGraph.changedPathsVersion $v &&
|
|
git -C doublewrite commit-graph write --reachable --changed-paths 2>err &&
|
|
cat >expect <<-EOF &&
|
|
warning: attempting to write a commit-graph, but ${SQ}commitGraph.changedPathsVersion${SQ} ($v) is not supported
|
|
EOF
|
|
test_cmp expect err || return 1
|
|
done &&
|
|
|
|
git -C doublewrite config --add commitGraph.changedPathsVersion 2 &&
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C doublewrite commit-graph write --reachable --changed-paths &&
|
|
test_filter_computed 1 trace2.txt &&
|
|
test_filter_upgraded 0 trace2.txt &&
|
|
|
|
(
|
|
cd doublewrite &&
|
|
echo "c01f" >expect &&
|
|
get_first_changed_path_filter >actual &&
|
|
test_cmp expect actual
|
|
)
|
|
'
|
|
|
|
test_expect_success 'when writing commit graph, reuse changed-path of another version where possible' '
|
|
git init upgrade &&
|
|
|
|
test_commit -C upgrade base no-high-bits &&
|
|
|
|
git -C upgrade config --add commitGraph.changedPathsVersion 1 &&
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C upgrade commit-graph write --reachable --changed-paths &&
|
|
test_filter_computed 1 trace2.txt &&
|
|
test_filter_upgraded 0 trace2.txt &&
|
|
|
|
git -C upgrade config --add commitGraph.changedPathsVersion 2 &&
|
|
>trace2.txt &&
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
|
git -C upgrade commit-graph write --reachable --changed-paths &&
|
|
test_filter_computed 0 trace2.txt &&
|
|
test_filter_upgraded 1 trace2.txt
|
|
'
|
|
|
|
corrupt_graph () {
|
|
test_when_finished "rm -rf $graph" &&
|
|
git commit-graph write --reachable --changed-paths &&
|
|
corrupt_chunk_file $graph "$@"
|
|
}
|
|
|
|
check_corrupt_graph () {
|
|
corrupt_graph "$@" &&
|
|
git -c core.commitGraph=false log -- A/B/file2 >expect.out &&
|
|
git -c core.commitGraph=true log -- A/B/file2 >out 2>err &&
|
|
test_cmp expect.out out
|
|
}
|
|
|
|
test_expect_success 'Bloom reader notices too-small data chunk' '
|
|
check_corrupt_graph BDAT clear 00000000 &&
|
|
echo "warning: ignoring too-small changed-path chunk" \
|
|
"(4 < 12) in commit-graph file" >expect.err &&
|
|
test_cmp expect.err err
|
|
'
|
|
|
|
test_expect_success 'Bloom reader notices out-of-bounds filter offsets' '
|
|
check_corrupt_graph BIDX 12 FFFFFFFF &&
|
|
# use grep to avoid depending on exact chunk size
|
|
grep "warning: ignoring out-of-range offset (4294967295) for changed-path filter at pos 3 of .git/objects/info/commit-graph" err
|
|
'
|
|
|
|
test_expect_success 'Bloom reader notices too-small index chunk' '
|
|
# replace the index with a single entry, making most
|
|
# lookups out-of-bounds
|
|
check_corrupt_graph BIDX clear 00000000 &&
|
|
echo "warning: commit-graph changed-path index chunk" \
|
|
"is too small" >expect.err &&
|
|
test_cmp expect.err err
|
|
'
|
|
|
|
test_expect_success 'Bloom reader notices out-of-order index offsets' '
|
|
# we do not know any real offsets, but we can pick
|
|
# something plausible; we should not get to the point of
|
|
# actually reading from the bogus offsets anyway.
|
|
corrupt_graph BIDX 4 0000000c00000005 &&
|
|
echo "warning: ignoring decreasing changed-path index offsets" \
|
|
"(12 > 5) for positions 1 and 2 of .git/objects/info/commit-graph" >expect.err &&
|
|
git -c core.commitGraph=false log -- A/B/file2 >expect.out &&
|
|
git -c core.commitGraph=true log -- A/B/file2 >out 2>err &&
|
|
test_cmp expect.out out &&
|
|
test_cmp expect.err err
|
|
'
|
|
|
|
test_done
|