git/t/t1051-large-conversion.sh
Matt Cooper e9aa762cc7 odb: teach read_blob_entry to use size_t
There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.

But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.

By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-03 11:22:27 -07:00

102 lines
2.5 KiB
Bash
Executable file

#!/bin/sh
test_description='test conversion filters on large files'
. ./test-lib.sh
set_attr() {
test_when_finished 'rm -f .gitattributes' &&
echo "* $*" >.gitattributes
}
check_input() {
git read-tree --empty &&
git add small large &&
git cat-file blob :small >small.index &&
git cat-file blob :large | head -n 1 >large.index &&
test_cmp small.index large.index
}
check_output() {
rm -f small large &&
git checkout small large &&
head -n 1 large >large.head &&
test_cmp small large.head
}
test_expect_success 'setup input tests' '
printf "\$Id: foo\$\\r\\n" >small &&
cat small small >large &&
git config core.bigfilethreshold 20 &&
git config filter.test.clean "sed s/.*/CLEAN/"
'
test_expect_success 'autocrlf=true converts on input' '
test_config core.autocrlf true &&
check_input
'
test_expect_success 'eol=crlf converts on input' '
set_attr eol=crlf &&
check_input
'
test_expect_success 'ident converts on input' '
set_attr ident &&
check_input
'
test_expect_success 'user-defined filters convert on input' '
set_attr filter=test &&
check_input
'
test_expect_success 'setup output tests' '
echo "\$Id\$" >small &&
cat small small >large &&
git add small large &&
git config core.bigfilethreshold 7 &&
git config filter.test.smudge "sed s/.*/SMUDGE/"
'
test_expect_success 'autocrlf=true converts on output' '
test_config core.autocrlf true &&
check_output
'
test_expect_success 'eol=crlf converts on output' '
set_attr eol=crlf &&
check_output
'
test_expect_success 'user-defined filters convert on output' '
set_attr filter=test &&
check_output
'
test_expect_success 'ident converts on output' '
set_attr ident &&
rm -f small large &&
git checkout small large &&
sed -n "s/Id: .*/Id: SHA/p" <small >small.clean &&
head -n 1 large >large.head &&
sed -n "s/Id: .*/Id: SHA/p" <large.head >large.clean &&
test_cmp small.clean large.clean
'
# This smudge filter prepends 5GB of zeros to the file it checks out. This
# ensures that smudging doesn't mangle large files on 64-bit Windows.
test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
'files over 4GB convert on output' '
test_commit test small "a small file" &&
small_size=$(test_file_size small) &&
test_config filter.makelarge.smudge \
"test-tool genzeros $((5*1024*1024*1024)) && cat" &&
echo "small filter=makelarge" >.gitattributes &&
rm small &&
git checkout -- small &&
size=$(test_file_size small) &&
test "$size" -eq $((5 * 1024 * 1024 * 1024 + $small_size))
'
test_done