gitformat-commit-graph: describe version 2 of BDAT

The code change to Git to support version 2 will be done in subsequent
commits.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Jonathan Tan 2024-06-25 13:39:34 -04:00 committed by Junio C Hamano
parent cf73936ddf
commit 23e91c0ca3

View file

@ -142,13 +142,16 @@ All multi-byte numbers are in network byte order.
==== Bloom Filter Data (ID: {'B', 'D', 'A', 'T'}) [Optional]
* It starts with header consisting of three unsigned 32-bit integers:
- Version of the hash algorithm being used. We currently only support
value 1 which corresponds to the 32-bit version of the murmur3 hash
- Version of the hash algorithm being used. We currently support
value 2 which corresponds to the 32-bit version of the murmur3 hash
implemented exactly as described in
https://en.wikipedia.org/wiki/MurmurHash#Algorithm and the double
hashing technique using seed values 0x293ae76f and 0x7e646e2 as
described in https://doi.org/10.1007/978-3-540-30494-4_26 "Bloom Filters
in Probabilistic Verification"
in Probabilistic Verification". Version 1 Bloom filters have a bug that appears
when char is signed and the repository has path names that have characters >=
0x80; Git supports reading and writing them, but this ability will be removed
in a future version of Git.
- The number of times a path is hashed and hence the number of bit positions
that cumulatively determine whether a file is present in the commit.
- The minimum number of bits 'b' per entry in the Bloom filter. If the filter