2021-01-23 19:58:19 +00:00
|
|
|
#!/bin/sh
|
|
|
|
|
|
|
|
test_description='compare full workdir to sparse workdir'
|
|
|
|
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
GIT_TEST_SPLIT_INDEX=0
|
2021-03-30 13:11:00 +00:00
|
|
|
GIT_TEST_SPARSE_INDEX=
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
. ./test-lib.sh
|
|
|
|
|
|
|
|
test_expect_success 'setup' '
|
|
|
|
git init initial-repo &&
|
|
|
|
(
|
2021-03-30 13:10:49 +00:00
|
|
|
GIT_TEST_SPARSE_INDEX=0 &&
|
2021-01-23 19:58:19 +00:00
|
|
|
cd initial-repo &&
|
|
|
|
echo a >a &&
|
|
|
|
echo "after deep" >e &&
|
|
|
|
echo "after folder1" >g &&
|
|
|
|
echo "after x" >z &&
|
2022-03-17 15:55:34 +00:00
|
|
|
mkdir folder1 folder2 deep before x &&
|
|
|
|
echo "before deep" >before/a &&
|
|
|
|
echo "before deep again" >before/b &&
|
2021-07-14 13:12:28 +00:00
|
|
|
mkdir deep/deeper1 deep/deeper2 deep/before deep/later &&
|
2021-01-23 19:58:19 +00:00
|
|
|
mkdir deep/deeper1/deepest &&
|
2021-12-06 14:10:36 +00:00
|
|
|
mkdir deep/deeper1/deepest2 &&
|
|
|
|
mkdir deep/deeper1/deepest3 &&
|
2021-01-23 19:58:19 +00:00
|
|
|
echo "after deeper1" >deep/e &&
|
|
|
|
echo "after deepest" >deep/deeper1/e &&
|
|
|
|
cp a folder1 &&
|
|
|
|
cp a folder2 &&
|
|
|
|
cp a x &&
|
|
|
|
cp a deep &&
|
2021-07-14 13:12:28 +00:00
|
|
|
cp a deep/before &&
|
2021-01-23 19:58:19 +00:00
|
|
|
cp a deep/deeper1 &&
|
|
|
|
cp a deep/deeper2 &&
|
2021-07-14 13:12:28 +00:00
|
|
|
cp a deep/later &&
|
2021-01-23 19:58:19 +00:00
|
|
|
cp a deep/deeper1/deepest &&
|
2021-12-06 14:10:36 +00:00
|
|
|
cp a deep/deeper1/deepest2 &&
|
|
|
|
cp a deep/deeper1/deepest3 &&
|
|
|
|
cp -r deep/deeper1/ deep/deeper2 &&
|
2021-07-14 13:12:28 +00:00
|
|
|
mkdir deep/deeper1/0 &&
|
|
|
|
mkdir deep/deeper1/0/0 &&
|
|
|
|
touch deep/deeper1/0/1 &&
|
|
|
|
touch deep/deeper1/0/0/0 &&
|
|
|
|
>folder1- &&
|
|
|
|
>folder1.x &&
|
|
|
|
>folder10 &&
|
|
|
|
cp -r deep/deeper1/0 folder1 &&
|
|
|
|
cp -r deep/deeper1/0 folder2 &&
|
|
|
|
echo >>folder1/0/0/0 &&
|
|
|
|
echo >>folder2/0/1 &&
|
2021-01-23 19:58:19 +00:00
|
|
|
git add . &&
|
|
|
|
git commit -m "initial commit" &&
|
|
|
|
git checkout -b base &&
|
|
|
|
for dir in folder1 folder2 deep
|
|
|
|
do
|
2021-09-08 11:23:57 +00:00
|
|
|
git checkout -b update-$dir base &&
|
2021-01-23 19:58:19 +00:00
|
|
|
echo "updated $dir" >$dir/a &&
|
|
|
|
git commit -a -m "update $dir" || return 1
|
|
|
|
done &&
|
|
|
|
|
|
|
|
git checkout -b rename-base base &&
|
2021-07-14 13:12:27 +00:00
|
|
|
cat >folder1/larger-content <<-\EOF &&
|
2021-01-23 19:58:19 +00:00
|
|
|
matching
|
|
|
|
lines
|
|
|
|
help
|
|
|
|
inexact
|
|
|
|
renames
|
|
|
|
EOF
|
|
|
|
cp folder1/larger-content folder2/ &&
|
|
|
|
cp folder1/larger-content deep/deeper1/ &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "add interesting rename content" &&
|
|
|
|
|
|
|
|
git checkout -b rename-out-to-out rename-base &&
|
|
|
|
mv folder1/a folder2/b &&
|
|
|
|
mv folder1/larger-content folder2/edited-content &&
|
|
|
|
echo >>folder2/edited-content &&
|
2021-07-14 13:12:28 +00:00
|
|
|
echo >>folder2/0/1 &&
|
|
|
|
echo stuff >>deep/deeper1/a &&
|
2021-01-23 19:58:19 +00:00
|
|
|
git add . &&
|
|
|
|
git commit -m "rename folder1/... to folder2/..." &&
|
|
|
|
|
|
|
|
git checkout -b rename-out-to-in rename-base &&
|
|
|
|
mv folder1/a deep/deeper1/b &&
|
2021-07-14 13:12:28 +00:00
|
|
|
echo more stuff >>deep/deeper1/a &&
|
|
|
|
rm folder2/0/1 &&
|
|
|
|
mkdir folder2/0/1 &&
|
|
|
|
echo >>folder2/0/1/1 &&
|
2021-01-23 19:58:19 +00:00
|
|
|
mv folder1/larger-content deep/deeper1/edited-content &&
|
|
|
|
echo >>deep/deeper1/edited-content &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "rename folder1/... to deep/deeper1/..." &&
|
|
|
|
|
|
|
|
git checkout -b rename-in-to-out rename-base &&
|
|
|
|
mv deep/deeper1/a folder1/b &&
|
2021-07-14 13:12:28 +00:00
|
|
|
echo >>folder2/0/1 &&
|
|
|
|
rm -rf folder1/0/0 &&
|
|
|
|
echo >>folder1/0/0 &&
|
2021-01-23 19:58:19 +00:00
|
|
|
mv deep/deeper1/larger-content folder1/edited-content &&
|
|
|
|
echo >>folder1/edited-content &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "rename deep/deeper1/... to folder1/..." &&
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
git checkout -b df-conflict-1 base &&
|
|
|
|
rm -rf folder1 &&
|
|
|
|
echo content >folder1 &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "dir to file" &&
|
|
|
|
|
|
|
|
git checkout -b df-conflict-2 base &&
|
|
|
|
rm -rf folder2 &&
|
|
|
|
echo content >folder2 &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "dir to file" &&
|
|
|
|
|
|
|
|
git checkout -b fd-conflict base &&
|
|
|
|
rm a &&
|
|
|
|
mkdir a &&
|
|
|
|
echo content >a/a &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "file to dir" &&
|
|
|
|
|
2021-07-29 14:52:03 +00:00
|
|
|
for side in left right
|
|
|
|
do
|
|
|
|
git checkout -b merge-$side base &&
|
|
|
|
echo $side >>deep/deeper2/a &&
|
|
|
|
echo $side >>folder1/a &&
|
|
|
|
echo $side >>folder2/a &&
|
|
|
|
git add . &&
|
|
|
|
git commit -m "$side" || return 1
|
|
|
|
done &&
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
git checkout -b deepest base &&
|
|
|
|
echo "updated deepest" >deep/deeper1/deepest/a &&
|
2021-12-06 14:10:36 +00:00
|
|
|
echo "updated deepest2" >deep/deeper1/deepest2/a &&
|
|
|
|
echo "updated deepest3" >deep/deeper1/deepest3/a &&
|
2021-01-23 19:58:19 +00:00
|
|
|
git commit -a -m "update deepest" &&
|
|
|
|
|
|
|
|
git checkout -f base &&
|
|
|
|
git reset --hard
|
|
|
|
)
|
|
|
|
'
|
|
|
|
|
|
|
|
init_repos () {
|
|
|
|
rm -rf full-checkout sparse-checkout sparse-index &&
|
|
|
|
|
|
|
|
# create repos in initial state
|
|
|
|
cp -r initial-repo full-checkout &&
|
|
|
|
git -C full-checkout reset --hard &&
|
|
|
|
|
|
|
|
cp -r initial-repo sparse-checkout &&
|
|
|
|
git -C sparse-checkout reset --hard &&
|
2021-03-30 13:10:49 +00:00
|
|
|
|
|
|
|
cp -r initial-repo sparse-index &&
|
|
|
|
git -C sparse-index reset --hard &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
# initialize sparse-checkout definitions
|
2021-03-30 13:10:49 +00:00
|
|
|
git -C sparse-checkout sparse-checkout init --cone &&
|
|
|
|
git -C sparse-checkout sparse-checkout set deep &&
|
2021-03-30 13:11:00 +00:00
|
|
|
git -C sparse-index sparse-checkout init --cone --sparse-index &&
|
|
|
|
test_cmp_config -C sparse-index true index.sparse &&
|
|
|
|
git -C sparse-index sparse-checkout set deep
|
2021-01-23 19:58:19 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
run_on_sparse () {
|
|
|
|
(
|
|
|
|
cd sparse-checkout &&
|
2021-05-25 20:52:34 +00:00
|
|
|
GIT_PROGRESS_DELAY=100000 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
|
2021-03-30 13:10:49 +00:00
|
|
|
) &&
|
|
|
|
(
|
|
|
|
cd sparse-index &&
|
2021-05-25 20:52:34 +00:00
|
|
|
GIT_PROGRESS_DELAY=100000 "$@" >../sparse-index-out 2>../sparse-index-err
|
2021-01-23 19:58:19 +00:00
|
|
|
)
|
|
|
|
}
|
|
|
|
|
|
|
|
run_on_all () {
|
|
|
|
(
|
|
|
|
cd full-checkout &&
|
2021-05-25 20:52:34 +00:00
|
|
|
GIT_PROGRESS_DELAY=100000 "$@" >../full-checkout-out 2>../full-checkout-err
|
2021-01-23 19:58:19 +00:00
|
|
|
) &&
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_sparse "$@"
|
2021-01-23 19:58:19 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
test_all_match () {
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all "$@" &&
|
2021-01-23 19:58:19 +00:00
|
|
|
test_cmp full-checkout-out sparse-checkout-out &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_cmp full-checkout-out sparse-index-out &&
|
|
|
|
test_cmp full-checkout-err sparse-checkout-err &&
|
|
|
|
test_cmp full-checkout-err sparse-index-err
|
2021-01-23 19:58:19 +00:00
|
|
|
}
|
|
|
|
|
2021-03-30 13:10:49 +00:00
|
|
|
test_sparse_match () {
|
|
|
|
run_on_sparse "$@" &&
|
|
|
|
test_cmp sparse-checkout-out sparse-index-out &&
|
|
|
|
test_cmp sparse-checkout-err sparse-index-err
|
|
|
|
}
|
|
|
|
|
2021-09-24 15:39:03 +00:00
|
|
|
test_sparse_unstaged () {
|
|
|
|
file=$1 &&
|
|
|
|
for repo in sparse-checkout sparse-index
|
|
|
|
do
|
|
|
|
# Skip "unmerged" paths
|
|
|
|
git -C $repo diff --staged --diff-filter=u -- "$file" >diff &&
|
|
|
|
test_must_be_empty diff || return 1
|
|
|
|
done
|
|
|
|
}
|
|
|
|
|
2022-05-23 13:48:37 +00:00
|
|
|
# Usage: test_sprase_checkout_set "<c1> ... <cN>" "<s1> ... <sM>"
|
|
|
|
# Verifies that "git sparse-checkout set <c1> ... <cN>" succeeds and
|
|
|
|
# leaves the sparse index in a state where <s1> ... <sM> are sparse
|
|
|
|
# directories (and <c1> ... <cN> are not).
|
|
|
|
test_sparse_checkout_set () {
|
|
|
|
CONE_DIRS=$1 &&
|
|
|
|
SPARSE_DIRS=$2 &&
|
2022-05-23 13:48:38 +00:00
|
|
|
git -C sparse-index sparse-checkout set --skip-checks $CONE_DIRS &&
|
2021-12-22 14:20:54 +00:00
|
|
|
git -C sparse-index ls-files --sparse --stage >cache &&
|
2022-05-23 13:48:37 +00:00
|
|
|
|
|
|
|
# Check that the directories outside of the sparse-checkout cone
|
|
|
|
# have sparse directory entries.
|
|
|
|
for dir in $SPARSE_DIRS
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
do
|
|
|
|
TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
|
2021-12-22 14:20:54 +00:00
|
|
|
grep "040000 $TREE 0 $dir/" cache \
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|| return 1
|
|
|
|
done &&
|
|
|
|
|
2022-05-23 13:48:37 +00:00
|
|
|
# Check that the directories in the sparse-checkout cone
|
|
|
|
# are not sparse directory entries.
|
|
|
|
for dir in $CONE_DIRS
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
do
|
2022-05-23 13:48:38 +00:00
|
|
|
# Allow TREE to not exist because
|
|
|
|
# $dir does not exist at HEAD.
|
|
|
|
TREE=$(git -C sparse-index rev-parse HEAD:$dir) ||
|
2022-05-23 13:48:37 +00:00
|
|
|
! grep "040000 $TREE 0 $dir/" cache \
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|| return 1
|
2022-05-23 13:48:37 +00:00
|
|
|
done
|
|
|
|
}
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|
2022-05-23 13:48:37 +00:00
|
|
|
test_expect_success 'sparse-index contents' '
|
|
|
|
init_repos &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|
2022-05-23 13:48:37 +00:00
|
|
|
# Remove deep, add three other directories.
|
|
|
|
test_sparse_checkout_set \
|
|
|
|
"folder1 folder2 x" \
|
|
|
|
"before deep" &&
|
|
|
|
|
|
|
|
# Remove folder1, add deep
|
|
|
|
test_sparse_checkout_set \
|
|
|
|
"deep folder2 x" \
|
|
|
|
"before folder1" &&
|
|
|
|
|
|
|
|
# Replace deep with deep/deeper2 (dropping deep/deeper1)
|
|
|
|
# Add folder1
|
|
|
|
test_sparse_checkout_set \
|
|
|
|
"deep/deeper2 folder1 folder2 x" \
|
|
|
|
"before deep/deeper1" &&
|
2021-03-30 13:11:00 +00:00
|
|
|
|
2022-05-23 13:48:38 +00:00
|
|
|
# Replace deep/deeper2 with deep/deeper1
|
|
|
|
# Replace folder1 with folder1/0/0
|
|
|
|
# Replace folder2 with non-existent folder2/2/3
|
|
|
|
# Add non-existent "bogus"
|
|
|
|
test_sparse_checkout_set \
|
|
|
|
"bogus deep/deeper1 folder1/0/0 folder2/2/3 x" \
|
|
|
|
"before deep/deeper2 folder2/0" &&
|
|
|
|
|
|
|
|
# Drop down to only files at root
|
|
|
|
test_sparse_checkout_set \
|
|
|
|
"" \
|
|
|
|
"before deep folder1 folder2 x" &&
|
2021-03-30 13:11:00 +00:00
|
|
|
|
2021-12-22 14:20:54 +00:00
|
|
|
# Disabling the sparse-index replaces tree entries with full ones
|
2021-03-30 13:11:00 +00:00
|
|
|
git -C sparse-index sparse-checkout init --no-sparse-index &&
|
2021-12-22 14:20:54 +00:00
|
|
|
test_sparse_match git ls-files --stage --sparse
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
'
|
|
|
|
|
2021-03-30 13:10:51 +00:00
|
|
|
test_expect_success 'expanded in-memory index matches full index' '
|
|
|
|
init_repos &&
|
2021-12-22 14:20:54 +00:00
|
|
|
test_sparse_match git ls-files --stage
|
2021-03-30 13:10:51 +00:00
|
|
|
'
|
|
|
|
|
2022-03-01 20:24:24 +00:00
|
|
|
test_expect_success 'root directory cannot be sparse' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# Remove all in-cone files and directories from the index, collapse index
|
|
|
|
# with `git sparse-checkout reapply`
|
|
|
|
git -C sparse-index rm -r . &&
|
|
|
|
git -C sparse-index sparse-checkout reapply &&
|
|
|
|
|
|
|
|
# Verify sparse directories still present, root directory is not sparse
|
|
|
|
cat >expect <<-EOF &&
|
2022-03-17 15:55:34 +00:00
|
|
|
before/
|
2022-03-01 20:24:24 +00:00
|
|
|
folder1/
|
|
|
|
folder2/
|
|
|
|
x/
|
|
|
|
EOF
|
|
|
|
git -C sparse-index ls-files --sparse >actual &&
|
|
|
|
test_cmp expect actual
|
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_expect_success 'status with options' '
|
|
|
|
init_repos &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match ls &&
|
2021-01-23 19:58:19 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git status --porcelain=v2 -z -u &&
|
|
|
|
test_all_match git status --porcelain=v2 -uno &&
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all touch README.md &&
|
2021-01-23 19:58:19 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git status --porcelain=v2 -z -u &&
|
|
|
|
test_all_match git status --porcelain=v2 -uno &&
|
|
|
|
test_all_match git add README.md &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git status --porcelain=v2 -z -u &&
|
|
|
|
test_all_match git status --porcelain=v2 -uno
|
|
|
|
'
|
|
|
|
|
2022-03-01 20:24:25 +00:00
|
|
|
test_expect_success 'status with diff in unexpanded sparse directory' '
|
|
|
|
init_repos &&
|
|
|
|
test_all_match git checkout rename-base &&
|
|
|
|
test_all_match git reset --soft rename-out-to-out &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
2021-07-14 13:12:36 +00:00
|
|
|
test_expect_success 'status reports sparse-checkout' '
|
|
|
|
init_repos &&
|
|
|
|
git -C sparse-checkout status >full &&
|
|
|
|
git -C sparse-index status >sparse &&
|
|
|
|
test_i18ngrep "You are in a sparse checkout with " full &&
|
|
|
|
test_i18ngrep "You are in a sparse checkout." sparse
|
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_expect_success 'add, commit, checkout' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all ../edit-contents README.md &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git add README.md &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git commit -m "Add README.md" &&
|
|
|
|
|
|
|
|
test_all_match git checkout HEAD~1 &&
|
|
|
|
test_all_match git checkout - &&
|
|
|
|
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all ../edit-contents README.md &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git add -A &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git commit -m "Extend README.md" &&
|
|
|
|
|
|
|
|
test_all_match git checkout HEAD~1 &&
|
|
|
|
test_all_match git checkout - &&
|
|
|
|
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all ../edit-contents deep/newfile &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git status --porcelain=v2 -uno &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git add . &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git commit -m "add deep/newfile" &&
|
|
|
|
|
|
|
|
test_all_match git checkout HEAD~1 &&
|
|
|
|
test_all_match git checkout -
|
|
|
|
'
|
|
|
|
|
unpack-trees: use traverse_path instead of name
The sparse_dir_matches_path() method compares a cache entry that is a
sparse directory entry against a 'struct traverse_info *info' and a
'struct name_entry *p' to see if the cache entry has exactly the right
name for those other inputs.
This method was introduced in 523506d (unpack-trees: unpack sparse
directory entries, 2021-07-14), but included a significant mistake. The
path comparisons used 'info->name' instead of 'info->traverse_path'.
Since 'info->name' only stores a single tree entry name while
'info->traverse_path' stores the full path from root, this method does
not work when 'info' is in a subdirectory of a directory. Replacing the
right strings and their corresponding lengths make the method work
properly.
The previous change included a failing test that exposes this issue.
That test now passes. The critical detail is that as we go deep into
unpack_trees(), the logic for merging a sparse directory entry with a
tree entry during 'git checkout' relies on this
sparse_dir_matches_path() in order to avoid calling
traverse_trees_recursive() during unpack_callback() in this hunk:
if (!is_sparse_directory_entry(src[0], names, info) &&
traverse_trees_recursive(n, dirmask, mask & ~dirmask,
names, info) < 0) {
return -1;
}
For deep paths, the short-circuit never occurred and
traverse_trees_recursive() was being called incorrectly and that was
causing other strange issues. Specifically, the error message from the
now-passing test previously included this:
error: Your local changes to the following files would be overwritten by checkout:
deep/deeper1/deepest2/a
deep/deeper1/deepest3/a
Please commit your changes or stash them before you switch branches.
Aborting
These messages occurred because the 'current' cache entry in
twoway_merge() was showing as NULL because the index did not contain
entries for the paths contained within the sparse directory entries. We
instead had 'oldtree' given as the entry at HEAD and 'newtree' as the
entry in the target tree. This led to reject_merge() listing these
paths.
Now that sparse_dir_matches_path() works the same for deep paths as it
does for shallow depths, the rest of the logic kicks in to properly
handle modifying the sparse directory entries as designed.
Reported-by: Gustave Granroth <gus.gran@gmail.com>
Reported-by: Mike Marcelais <michmarc@exchange.microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-06 14:10:37 +00:00
|
|
|
test_expect_success 'deep changes during checkout' '
|
2021-12-06 14:10:36 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_sparse_match git sparse-checkout set deep/deeper1/deepest &&
|
|
|
|
test_all_match git checkout deepest &&
|
|
|
|
test_all_match git checkout base
|
|
|
|
'
|
|
|
|
|
unpack-trees: increment cache_bottom for sparse directories
Correct tracking of the 'cache_bottom' for cases where sparse directories
are present in the index.
BACKGROUND
----------
The 'unpack_trees_options.cache_bottom' is a variable that tracks the
in-progress "bottom" of the cache as 'unpack_trees()' iterates through the
contents of the index. Most importantly, this value informs the sequential
return values of 'next_cache_entry()' which, in the "diff cache" usage of
'unpack_callback()', are either unpacked as-is or are passed into the diff
machinery.
The 'cache_bottom' is intended to track the position of the first entry in
the index that has not yet been diffed or unpacked. It is advanced in two
main ways: either it is incremented when an index entry is marked as "used"
(in 'mark_ce_used()'), indicating that it was unpacked or diffed, or when a
directory is unpacked, in which case it is increased by an amount equaling
the number of index entries inside that tree.
In 17a1bb570b (unpack-trees: preserve cache_bottom, 2021-07-14), it was
identified that sparse directories posed a problem to the above
'cache_bottom' advancement logic - because a sparse directory was both an
index entry that could be "used" and a directory that can be unpacked, the
'cache_bottom' would be incremented too many times. To solve this problem,
the 'mark_ce_used()' advancement of 'cache_bottom' was skipped for sparse
directories.
INCORRECT CACHE_BOTTOM TRACKING
-------------------------------
Skipping the 'cache_bottom' advancement for sparse directories in
'mark_ce_used()' breaks down in two cases:
1. When the 'unpack_trees()' operation is *not* a "cache diff" (because the
directory contents-based incrementing of 'cache_bottom' does not happen).
2. When a cache diff is performed with a pathspec (because
'unpack_index_entry()' will unpack a sparse directory not matched by the
pathspec without performing the directory contents-based increment).
The former luckily does not appear to affect 'git' behavior, likely because
'cache_bottom' is largely unused (non-"cache diff" 'unpack_trees()' uses
'find_index_entry()' - rather than 'next_cache_entry()' - to find the index
entries to unpack).
The latter, however, causes 'cache_bottom' to "lag behind" its intended
position by an amount equal to the number of sparse directories unpacked so
far with 'unpack_index_entry()'. If a repository is structured such that any
sparse directories are ordered lexicographically *after* any
pathspec-matching directories, though, this issue won't present any adverse
behavior.
This was the case with the 't1092-sparse-checkout-compatibility.sh' tests
before the addition of the 'before/' sparse directory (ordered *before* the
in-cone 'deep/' directory), therefore sidestepping the issue. Once the
'before/' directory was added, though, 'cache_bottom' began to lag behind
its intended position, causing 'next_cache_entry()' to return index entries
it had already processed and, ultimately, an incorrect diff.
CORRECTING CACHE_BOTTOM
-----------------------
The problems observed in 't1092' come from 'cache_bottom' lagging behind in
cases where the cache tree-based advancement doesn't occur. To solve this,
then, the fix in 17a1bb570b is "reversed"; rather than skipping
'cache_bottom' advancement in 'mark_ce_used()', we skip the directory
contents-based advancement for sparse directories. Now, every index entry
can be accounted for in 'cache_bottom':
* if you're working with a single index entry, 'cache_bottom' is incremented
in 'mark_ce_used()'
* if you're working with a directory that contains index entries (but is not
one itself), 'cache_bottom' is incremented by the number of entries in
that directory.
Finally, change the 'test_expect_failure' tests in 't1092' failing due to
this bug back to 'test_expect_success'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17 15:55:35 +00:00
|
|
|
test_expect_success 'add outside sparse cone' '
|
2021-09-24 15:39:03 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
run_on_sparse mkdir folder1 &&
|
|
|
|
run_on_sparse ../edit-contents folder1/a &&
|
|
|
|
run_on_sparse ../edit-contents folder1/newfile &&
|
|
|
|
test_sparse_match test_must_fail git add folder1/a &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/a &&
|
2021-09-24 15:39:06 +00:00
|
|
|
test_sparse_match test_must_fail git add folder1/newfile &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/newfile
|
2021-09-24 15:39:03 +00:00
|
|
|
'
|
|
|
|
|
2021-06-29 02:13:04 +00:00
|
|
|
test_expect_success 'commit including unstaged changes' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-file <<-\EOF &&
|
|
|
|
echo $1 >$2
|
|
|
|
EOF
|
|
|
|
|
|
|
|
run_on_all ../edit-file 1 a &&
|
|
|
|
run_on_all ../edit-file 1 deep/a &&
|
|
|
|
|
|
|
|
test_all_match git commit -m "-a" -a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
run_on_all ../edit-file 2 a &&
|
|
|
|
run_on_all ../edit-file 2 deep/a &&
|
|
|
|
|
|
|
|
test_all_match git commit -m "--include" --include deep/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git commit -m "--include" --include a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
run_on_all ../edit-file 3 a &&
|
|
|
|
run_on_all ../edit-file 3 deep/a &&
|
|
|
|
|
|
|
|
test_all_match git commit -m "--amend" -a --amend &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
unpack-trees: increment cache_bottom for sparse directories
Correct tracking of the 'cache_bottom' for cases where sparse directories
are present in the index.
BACKGROUND
----------
The 'unpack_trees_options.cache_bottom' is a variable that tracks the
in-progress "bottom" of the cache as 'unpack_trees()' iterates through the
contents of the index. Most importantly, this value informs the sequential
return values of 'next_cache_entry()' which, in the "diff cache" usage of
'unpack_callback()', are either unpacked as-is or are passed into the diff
machinery.
The 'cache_bottom' is intended to track the position of the first entry in
the index that has not yet been diffed or unpacked. It is advanced in two
main ways: either it is incremented when an index entry is marked as "used"
(in 'mark_ce_used()'), indicating that it was unpacked or diffed, or when a
directory is unpacked, in which case it is increased by an amount equaling
the number of index entries inside that tree.
In 17a1bb570b (unpack-trees: preserve cache_bottom, 2021-07-14), it was
identified that sparse directories posed a problem to the above
'cache_bottom' advancement logic - because a sparse directory was both an
index entry that could be "used" and a directory that can be unpacked, the
'cache_bottom' would be incremented too many times. To solve this problem,
the 'mark_ce_used()' advancement of 'cache_bottom' was skipped for sparse
directories.
INCORRECT CACHE_BOTTOM TRACKING
-------------------------------
Skipping the 'cache_bottom' advancement for sparse directories in
'mark_ce_used()' breaks down in two cases:
1. When the 'unpack_trees()' operation is *not* a "cache diff" (because the
directory contents-based incrementing of 'cache_bottom' does not happen).
2. When a cache diff is performed with a pathspec (because
'unpack_index_entry()' will unpack a sparse directory not matched by the
pathspec without performing the directory contents-based increment).
The former luckily does not appear to affect 'git' behavior, likely because
'cache_bottom' is largely unused (non-"cache diff" 'unpack_trees()' uses
'find_index_entry()' - rather than 'next_cache_entry()' - to find the index
entries to unpack).
The latter, however, causes 'cache_bottom' to "lag behind" its intended
position by an amount equal to the number of sparse directories unpacked so
far with 'unpack_index_entry()'. If a repository is structured such that any
sparse directories are ordered lexicographically *after* any
pathspec-matching directories, though, this issue won't present any adverse
behavior.
This was the case with the 't1092-sparse-checkout-compatibility.sh' tests
before the addition of the 'before/' sparse directory (ordered *before* the
in-cone 'deep/' directory), therefore sidestepping the issue. Once the
'before/' directory was added, though, 'cache_bottom' began to lag behind
its intended position, causing 'next_cache_entry()' to return index entries
it had already processed and, ultimately, an incorrect diff.
CORRECTING CACHE_BOTTOM
-----------------------
The problems observed in 't1092' come from 'cache_bottom' lagging behind in
cases where the cache tree-based advancement doesn't occur. To solve this,
then, the fix in 17a1bb570b is "reversed"; rather than skipping
'cache_bottom' advancement in 'mark_ce_used()', we skip the directory
contents-based advancement for sparse directories. Now, every index entry
can be accounted for in 'cache_bottom':
* if you're working with a single index entry, 'cache_bottom' is incremented
in 'mark_ce_used()'
* if you're working with a directory that contains index entries (but is not
one itself), 'cache_bottom' is incremented by the number of entries in
that directory.
Finally, change the 'test_expect_failure' tests in 't1092' failing due to
this bug back to 'test_expect_success'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17 15:55:35 +00:00
|
|
|
test_expect_success 'status/add: outside sparse cone' '
|
2021-07-14 13:12:29 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# folder1 is at HEAD, but outside the sparse cone
|
|
|
|
run_on_sparse mkdir folder1 &&
|
|
|
|
cp initial-repo/folder1/a sparse-checkout/folder1/a &&
|
|
|
|
cp initial-repo/folder1/a sparse-index/folder1/a &&
|
|
|
|
|
|
|
|
test_sparse_match git status &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
run_on_all ../edit-contents folder1/a &&
|
2021-07-14 13:12:29 +00:00
|
|
|
run_on_all ../edit-contents folder1/new &&
|
|
|
|
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
|
2021-07-29 14:52:04 +00:00
|
|
|
# Adding the path outside of the sparse-checkout cone should fail.
|
2021-07-14 13:12:29 +00:00
|
|
|
test_sparse_match test_must_fail git add folder1/a &&
|
2021-09-24 15:39:03 +00:00
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/a &&
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
test_all_match git add --refresh folder1/a &&
|
|
|
|
test_must_be_empty sparse-checkout-err &&
|
2021-09-24 15:39:03 +00:00
|
|
|
test_sparse_unstaged folder1/a &&
|
2021-09-24 15:39:06 +00:00
|
|
|
test_sparse_match test_must_fail git add folder1/new &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/new &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_sparse_match git add --sparse folder1/a &&
|
|
|
|
test_sparse_match git add --sparse folder1/new &&
|
2021-07-29 14:52:04 +00:00
|
|
|
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse . &&
|
2021-07-14 13:12:29 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git commit -m folder1/new &&
|
2021-07-29 14:52:04 +00:00
|
|
|
test_all_match git rev-parse HEAD^{tree} &&
|
2021-07-14 13:12:29 +00:00
|
|
|
|
|
|
|
run_on_all ../edit-contents folder1/newer &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse folder1/ &&
|
2021-07-14 13:12:29 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
2021-07-29 14:52:04 +00:00
|
|
|
test_all_match git commit -m folder1/newer &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree}
|
2021-07-14 13:12:29 +00:00
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_expect_success 'checkout and reset --hard' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout update-folder1 &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git checkout update-deep &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git checkout -b reset-test &&
|
|
|
|
test_all_match git reset --hard deepest &&
|
|
|
|
test_all_match git reset --hard update-folder1 &&
|
|
|
|
test_all_match git reset --hard update-folder2
|
|
|
|
'
|
|
|
|
|
2021-12-06 15:55:59 +00:00
|
|
|
test_expect_success 'diff --cached' '
|
2021-01-23 19:58:19 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>README.md
|
|
|
|
EOF
|
2021-03-30 13:10:46 +00:00
|
|
|
run_on_all ../edit-contents &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git diff &&
|
2021-12-06 15:55:59 +00:00
|
|
|
test_all_match git diff --cached &&
|
2021-01-23 19:58:19 +00:00
|
|
|
test_all_match git add README.md &&
|
|
|
|
test_all_match git diff &&
|
2021-12-06 15:55:59 +00:00
|
|
|
test_all_match git diff --cached
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
# NEEDSWORK: sparse-checkout behaves differently from full-checkout when
|
|
|
|
# running this test with 'df-conflict-2' after 'df-conflict-1'.
|
2021-07-14 13:12:28 +00:00
|
|
|
test_expect_success 'diff with renames and conflicts' '
|
2021-01-23 19:58:19 +00:00
|
|
|
init_repos &&
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
for branch in rename-out-to-out \
|
|
|
|
rename-out-to-in \
|
|
|
|
rename-in-to-out \
|
|
|
|
df-conflict-1 \
|
|
|
|
fd-conflict
|
2021-01-23 19:58:19 +00:00
|
|
|
do
|
|
|
|
test_all_match git checkout rename-base &&
|
2021-07-14 13:12:28 +00:00
|
|
|
test_all_match git checkout $branch -- . &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
2021-12-06 15:55:59 +00:00
|
|
|
test_all_match git diff --cached --no-renames &&
|
|
|
|
test_all_match git diff --cached --find-renames || return 1
|
2021-07-14 13:12:28 +00:00
|
|
|
done
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'diff with directory/file conflicts' '
|
|
|
|
init_repos &&
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
for branch in rename-out-to-out \
|
|
|
|
rename-out-to-in \
|
|
|
|
rename-in-to-out \
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
df-conflict-1 \
|
|
|
|
df-conflict-2 \
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
fd-conflict
|
2021-07-14 13:12:28 +00:00
|
|
|
do
|
|
|
|
git -C full-checkout reset --hard &&
|
|
|
|
test_sparse_match git reset --hard &&
|
|
|
|
test_all_match git checkout $branch &&
|
|
|
|
test_all_match git checkout rename-base -- . &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
2021-12-06 15:55:59 +00:00
|
|
|
test_all_match git diff --cached --no-renames &&
|
|
|
|
test_all_match git diff --cached --find-renames || return 1
|
2021-01-23 19:58:19 +00:00
|
|
|
done
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'log with pathspec outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git log -- a &&
|
|
|
|
test_all_match git log -- folder1/a &&
|
|
|
|
test_all_match git log -- folder2/a &&
|
|
|
|
test_all_match git log -- deep/a &&
|
|
|
|
test_all_match git log -- deep/deeper1/a &&
|
|
|
|
test_all_match git log -- deep/deeper1/deepest/a &&
|
|
|
|
|
|
|
|
test_all_match git checkout update-folder1 &&
|
|
|
|
test_all_match git log -- folder1/a
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'blame with pathspec inside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
2021-12-06 15:56:01 +00:00
|
|
|
for file in a \
|
|
|
|
deep/a \
|
|
|
|
deep/deeper1/a \
|
|
|
|
deep/deeper1/deepest/a
|
|
|
|
do
|
|
|
|
test_all_match git blame $file
|
|
|
|
done
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
2021-12-06 15:56:01 +00:00
|
|
|
# Without a revision specified, blame will error if passed any file that
|
|
|
|
# is not present in the working directory (even if the file is tracked).
|
|
|
|
# Here we just verify that this is also true with sparse checkouts.
|
|
|
|
test_expect_success 'blame with pathspec outside sparse definition' '
|
2021-01-23 19:58:19 +00:00
|
|
|
init_repos &&
|
2021-12-06 15:56:01 +00:00
|
|
|
test_sparse_match git sparse-checkout set &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
2021-12-06 15:56:01 +00:00
|
|
|
for file in a \
|
|
|
|
deep/a \
|
|
|
|
deep/deeper1/a \
|
|
|
|
deep/deeper1/deepest/a
|
|
|
|
do
|
|
|
|
test_sparse_match test_must_fail git blame $file &&
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
fatal: Cannot lstat '"'"'$file'"'"': No such file or directory
|
|
|
|
EOF
|
|
|
|
# We compare sparse-checkout-err and sparse-index-err in
|
|
|
|
# `test_sparse_match`. Given we know they are the same, we
|
|
|
|
# only check the content of sparse-index-err here.
|
|
|
|
test_cmp expect sparse-index-err
|
|
|
|
done
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
reset: preserve skip-worktree bit in mixed reset
Change `update_index_from_diff` to set `skip-worktree` when applicable for
new index entries. When `git reset --mixed <tree-ish>` is run, entries in
the index with differences between the pre-reset HEAD and reset <tree-ish>
are identified and handled with `update_index_from_diff`. For each file, a
new cache entry in inserted into the index, created from the <tree-ish> side
of the reset (without changing the working tree). However, the newly-created
entry must have `skip-worktree` explicitly set in either of the following
scenarios:
1. the file is in the current index and has `skip-worktree` set
2. the file is not in the current index but is outside of a defined sparse
checkout definition
Not setting the `skip-worktree` bit leads to likely-undesirable results for
a user. It causes `skip-worktree` settings to disappear on the
"diff"-containing files (but *only* the diff-containing files), leading to
those files now showing modifications in `git status`. For example, when
running `git reset --mixed` in a sparse checkout, some file entries outside
of sparse checkout could show up as deleted, despite the user never deleting
anything (and not wanting them on-disk anyway).
Additionally, add a test to `t7102` to ensure `skip-worktree` is preserved
in a basic `git reset --mixed` scenario and update a failure-documenting
test from 19a0acc (t1092: test interesting sparse-checkout scenarios,
2021-01-23) with new expected behavior.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 14:39:17 +00:00
|
|
|
test_expect_success 'checkout and reset (mixed)' '
|
2021-01-23 19:58:19 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout -b reset-test update-deep &&
|
|
|
|
test_all_match git reset deepest &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
|
reset: preserve skip-worktree bit in mixed reset
Change `update_index_from_diff` to set `skip-worktree` when applicable for
new index entries. When `git reset --mixed <tree-ish>` is run, entries in
the index with differences between the pre-reset HEAD and reset <tree-ish>
are identified and handled with `update_index_from_diff`. For each file, a
new cache entry in inserted into the index, created from the <tree-ish> side
of the reset (without changing the working tree). However, the newly-created
entry must have `skip-worktree` explicitly set in either of the following
scenarios:
1. the file is in the current index and has `skip-worktree` set
2. the file is not in the current index but is outside of a defined sparse
checkout definition
Not setting the `skip-worktree` bit leads to likely-undesirable results for
a user. It causes `skip-worktree` settings to disappear on the
"diff"-containing files (but *only* the diff-containing files), leading to
those files now showing modifications in `git status`. For example, when
running `git reset --mixed` in a sparse checkout, some file entries outside
of sparse checkout could show up as deleted, despite the user never deleting
anything (and not wanting them on-disk anyway).
Additionally, add a test to `t7102` to ensure `skip-worktree` is preserved
in a basic `git reset --mixed` scenario and update a failure-documenting
test from 19a0acc (t1092: test interesting sparse-checkout scenarios,
2021-01-23) with new expected behavior.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 14:39:17 +00:00
|
|
|
# Because skip-worktree is preserved, resetting to update-folder1
|
|
|
|
# will show worktree changes for folder1/a in full-checkout, but not
|
|
|
|
# in sparse-checkout or sparse-index.
|
|
|
|
git -C full-checkout reset update-folder1 >full-checkout-out &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match git reset update-folder1 &&
|
reset: preserve skip-worktree bit in mixed reset
Change `update_index_from_diff` to set `skip-worktree` when applicable for
new index entries. When `git reset --mixed <tree-ish>` is run, entries in
the index with differences between the pre-reset HEAD and reset <tree-ish>
are identified and handled with `update_index_from_diff`. For each file, a
new cache entry in inserted into the index, created from the <tree-ish> side
of the reset (without changing the working tree). However, the newly-created
entry must have `skip-worktree` explicitly set in either of the following
scenarios:
1. the file is in the current index and has `skip-worktree` set
2. the file is not in the current index but is outside of a defined sparse
checkout definition
Not setting the `skip-worktree` bit leads to likely-undesirable results for
a user. It causes `skip-worktree` settings to disappear on the
"diff"-containing files (but *only* the diff-containing files), leading to
those files now showing modifications in `git status`. For example, when
running `git reset --mixed` in a sparse checkout, some file entries outside
of sparse checkout could show up as deleted, despite the user never deleting
anything (and not wanting them on-disk anyway).
Additionally, add a test to `t7102` to ensure `skip-worktree` is preserved
in a basic `git reset --mixed` scenario and update a failure-documenting
test from 19a0acc (t1092: test interesting sparse-checkout scenarios,
2021-01-23) with new expected behavior.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-27 14:39:17 +00:00
|
|
|
grep "M folder1/a" full-checkout-out &&
|
|
|
|
! grep "M folder1/a" sparse-checkout-out &&
|
|
|
|
run_on_sparse test_path_is_missing folder1
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
2021-11-29 15:52:39 +00:00
|
|
|
test_expect_success 'checkout and reset (merge)' '
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
init_repos &&
|
|
|
|
|
2021-11-29 15:52:39 +00:00
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_all_match git checkout -b reset-test update-deep &&
|
|
|
|
run_on_all ../edit-contents a &&
|
|
|
|
test_all_match git reset --merge deepest &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset --hard update-deep &&
|
|
|
|
run_on_all ../edit-contents deep/a &&
|
|
|
|
test_all_match test_must_fail git reset --merge deepest
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'checkout and reset (keep)' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_all_match git checkout -b reset-test update-deep &&
|
|
|
|
run_on_all ../edit-contents a &&
|
|
|
|
test_all_match git reset --keep deepest &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset --hard update-deep &&
|
|
|
|
run_on_all ../edit-contents deep/a &&
|
|
|
|
test_all_match test_must_fail git reset --keep deepest
|
|
|
|
'
|
|
|
|
|
unpack-trees: increment cache_bottom for sparse directories
Correct tracking of the 'cache_bottom' for cases where sparse directories
are present in the index.
BACKGROUND
----------
The 'unpack_trees_options.cache_bottom' is a variable that tracks the
in-progress "bottom" of the cache as 'unpack_trees()' iterates through the
contents of the index. Most importantly, this value informs the sequential
return values of 'next_cache_entry()' which, in the "diff cache" usage of
'unpack_callback()', are either unpacked as-is or are passed into the diff
machinery.
The 'cache_bottom' is intended to track the position of the first entry in
the index that has not yet been diffed or unpacked. It is advanced in two
main ways: either it is incremented when an index entry is marked as "used"
(in 'mark_ce_used()'), indicating that it was unpacked or diffed, or when a
directory is unpacked, in which case it is increased by an amount equaling
the number of index entries inside that tree.
In 17a1bb570b (unpack-trees: preserve cache_bottom, 2021-07-14), it was
identified that sparse directories posed a problem to the above
'cache_bottom' advancement logic - because a sparse directory was both an
index entry that could be "used" and a directory that can be unpacked, the
'cache_bottom' would be incremented too many times. To solve this problem,
the 'mark_ce_used()' advancement of 'cache_bottom' was skipped for sparse
directories.
INCORRECT CACHE_BOTTOM TRACKING
-------------------------------
Skipping the 'cache_bottom' advancement for sparse directories in
'mark_ce_used()' breaks down in two cases:
1. When the 'unpack_trees()' operation is *not* a "cache diff" (because the
directory contents-based incrementing of 'cache_bottom' does not happen).
2. When a cache diff is performed with a pathspec (because
'unpack_index_entry()' will unpack a sparse directory not matched by the
pathspec without performing the directory contents-based increment).
The former luckily does not appear to affect 'git' behavior, likely because
'cache_bottom' is largely unused (non-"cache diff" 'unpack_trees()' uses
'find_index_entry()' - rather than 'next_cache_entry()' - to find the index
entries to unpack).
The latter, however, causes 'cache_bottom' to "lag behind" its intended
position by an amount equal to the number of sparse directories unpacked so
far with 'unpack_index_entry()'. If a repository is structured such that any
sparse directories are ordered lexicographically *after* any
pathspec-matching directories, though, this issue won't present any adverse
behavior.
This was the case with the 't1092-sparse-checkout-compatibility.sh' tests
before the addition of the 'before/' sparse directory (ordered *before* the
in-cone 'deep/' directory), therefore sidestepping the issue. Once the
'before/' directory was added, though, 'cache_bottom' began to lag behind
its intended position, causing 'next_cache_entry()' to return index entries
it had already processed and, ultimately, an incorrect diff.
CORRECTING CACHE_BOTTOM
-----------------------
The problems observed in 't1092' come from 'cache_bottom' lagging behind in
cases where the cache tree-based advancement doesn't occur. To solve this,
then, the fix in 17a1bb570b is "reversed"; rather than skipping
'cache_bottom' advancement in 'mark_ce_used()', we skip the directory
contents-based advancement for sparse directories. Now, every index entry
can be accounted for in 'cache_bottom':
* if you're working with a single index entry, 'cache_bottom' is incremented
in 'mark_ce_used()'
* if you're working with a directory that contains index entries (but is not
one itself), 'cache_bottom' is incremented by the number of entries in
that directory.
Finally, change the 'test_expect_failure' tests in 't1092' failing due to
this bug back to 'test_expect_success'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-17 15:55:35 +00:00
|
|
|
test_expect_success 'reset with pathspecs inside sparse definition' '
|
2021-11-29 15:52:39 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_all_match git checkout -b reset-test update-deep &&
|
|
|
|
run_on_all ../edit-contents deep/a &&
|
|
|
|
|
|
|
|
test_all_match git reset base -- deep/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset base -- nonexistent-file &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset deepest -- deep &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
# Although the working tree differs between full and sparse checkouts after
|
|
|
|
# reset, the state of the index is the same.
|
|
|
|
test_expect_success 'reset with pathspecs outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
test_all_match git checkout -b reset-test base &&
|
|
|
|
|
|
|
|
test_sparse_match git reset update-folder1 -- folder1 &&
|
|
|
|
git -C full-checkout reset update-folder1 -- folder1 &&
|
2022-01-11 18:04:58 +00:00
|
|
|
test_all_match git ls-files -s -- folder1 &&
|
2021-11-29 15:52:39 +00:00
|
|
|
|
|
|
|
test_sparse_match git reset update-folder2 -- folder2/a &&
|
|
|
|
git -C full-checkout reset update-folder2 -- folder2/a &&
|
2022-01-11 18:04:58 +00:00
|
|
|
test_all_match git ls-files -s -- folder2/a
|
2021-11-29 15:52:39 +00:00
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'reset with wildcard pathspec' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git reset update-deep -- deep\* &&
|
|
|
|
test_all_match git ls-files -s -- deep &&
|
|
|
|
|
|
|
|
test_all_match git reset deepest -- deep\*\*\* &&
|
|
|
|
test_all_match git ls-files -s -- deep &&
|
|
|
|
|
|
|
|
# The following `git reset`s result in updating the index on files with
|
|
|
|
# `skip-worktree` enabled. To avoid failing due to discrepencies in reported
|
|
|
|
# "modified" files, `test_sparse_match` reset is performed separately from
|
|
|
|
# "full-checkout" reset, then the index contents of all repos are verified.
|
|
|
|
|
|
|
|
test_sparse_match git reset update-folder1 -- \*/a &&
|
|
|
|
git -C full-checkout reset update-folder1 -- \*/a &&
|
|
|
|
test_all_match git ls-files -s -- deep/a folder1/a &&
|
|
|
|
|
|
|
|
test_sparse_match git reset update-folder2 -- folder\* &&
|
|
|
|
git -C full-checkout reset update-folder2 -- folder\* &&
|
|
|
|
test_all_match git ls-files -s -- folder10 folder1 folder2 &&
|
|
|
|
|
|
|
|
test_sparse_match git reset base -- folder1/\* &&
|
|
|
|
git -C full-checkout reset base -- folder1/\* &&
|
|
|
|
test_all_match git ls-files -s -- folder1
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
'
|
|
|
|
|
update-index: add tests for sparse-checkout compatibility
Introduce tests for a variety of `git update-index` use cases, including
performance scenarios. Tests are intended to exercise `update-index` with
options that change the commands interaction with the index (e.g.,
`--again`) and with files/directories inside and outside a sparse checkout
cone.
Of note is that these tests clearly establish the behavior of `git
update-index --add` with untracked, outside-of-cone files. Unlike `git add`,
which fails with an error when provided with such files, `update-index`
succeeds in adding them to the index. Additionally, the `skip-worktree` flag
is *not* automatically added to the new entry. Although this is pre-existing
behavior, there are a couple of reasons to avoid changing it in favor of
consistency with e.g. `git add`:
* `update-index` is low-level command for modifying the index; while it can
perform operations similar to those of `add`, it traditionally has fewer
"guardrails" preventing a user from doing something they may not want to
do (in this case, adding an outside-of-cone, non-`skip-worktree` file to
the index)
* `update-index` typically only exits with an error code if it is incapable
of performing an operation (e.g., if an internal function call fails);
adding a new file outside the sparse checkout definition is still a valid
operation, albeit an inadvisable one
* `update-index` does not implicitly set flags (e.g., `skip-worktree`) when
creating new index entries with `--add`; if flags need to be updated,
options like `--[no-]skip-worktree` allow a user to intentionally set them
All this to say that, while there are valid reasons to consider changing the
treatment of outside-of-cone files in `update-index`, there are also
sufficient reasons for leaving it as-is.
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-11 18:05:04 +00:00
|
|
|
test_expect_success 'update-index modify outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
# Create & modify folder1/a
|
|
|
|
# Note that this setup is a manual way of reaching the erroneous
|
|
|
|
# condition in which a `skip-worktree` enabled, outside-of-cone file
|
|
|
|
# exists on disk. It is used here to ensure `update-index` is stable
|
|
|
|
# and behaves predictably if such a condition occurs.
|
|
|
|
run_on_sparse mkdir -p folder1 &&
|
|
|
|
run_on_sparse cp ../initial-repo/folder1/a folder1/a &&
|
|
|
|
run_on_all ../edit-contents folder1/a &&
|
|
|
|
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
# If file has skip-worktree enabled, but the file is present, it is
|
|
|
|
# treated the same as if skip-worktree is disabled
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git update-index folder1/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
update-index: add tests for sparse-checkout compatibility
Introduce tests for a variety of `git update-index` use cases, including
performance scenarios. Tests are intended to exercise `update-index` with
options that change the commands interaction with the index (e.g.,
`--again`) and with files/directories inside and outside a sparse checkout
cone.
Of note is that these tests clearly establish the behavior of `git
update-index --add` with untracked, outside-of-cone files. Unlike `git add`,
which fails with an error when provided with such files, `update-index`
succeeds in adding them to the index. Additionally, the `skip-worktree` flag
is *not* automatically added to the new entry. Although this is pre-existing
behavior, there are a couple of reasons to avoid changing it in favor of
consistency with e.g. `git add`:
* `update-index` is low-level command for modifying the index; while it can
perform operations similar to those of `add`, it traditionally has fewer
"guardrails" preventing a user from doing something they may not want to
do (in this case, adding an outside-of-cone, non-`skip-worktree` file to
the index)
* `update-index` typically only exits with an error code if it is incapable
of performing an operation (e.g., if an internal function call fails);
adding a new file outside the sparse checkout definition is still a valid
operation, albeit an inadvisable one
* `update-index` does not implicitly set flags (e.g., `skip-worktree`) when
creating new index entries with `--add`; if flags need to be updated,
options like `--[no-]skip-worktree` allow a user to intentionally set them
All this to say that, while there are valid reasons to consider changing the
treatment of outside-of-cone files in `update-index`, there are also
sufficient reasons for leaving it as-is.
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-11 18:05:04 +00:00
|
|
|
|
|
|
|
# When skip-worktree is disabled (even on files outside sparse cone), file
|
|
|
|
# is updated in the index
|
|
|
|
test_sparse_match git update-index --no-skip-worktree folder1/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git update-index folder1/a &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'update-index --add outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
# Create folder1, add new file
|
|
|
|
run_on_sparse mkdir -p folder1 &&
|
|
|
|
run_on_all ../edit-contents folder1/b &&
|
|
|
|
|
|
|
|
# The *untracked* out-of-cone file is added to the index because it does
|
|
|
|
# not have a `skip-worktree` bit to signal that it should be ignored
|
|
|
|
# (unlike in `git add`, which will fail due to the file being outside
|
|
|
|
# the sparse checkout definition).
|
|
|
|
test_all_match git update-index --add folder1/b &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
# NEEDSWORK: `--remove`, unlike the rest of `update-index`, does not ignore
|
|
|
|
# `skip-worktree` entries by default and will remove them from the index.
|
|
|
|
# The `--ignore-skip-worktree-entries` flag must be used in conjunction with
|
|
|
|
# `--remove` to ignore the `skip-worktree` entries and prevent their removal
|
|
|
|
# from the index.
|
|
|
|
test_expect_success 'update-index --remove outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# When --ignore-skip-worktree-entries is _not_ specified:
|
|
|
|
# out-of-cone, not-on-disk files are removed from the index
|
|
|
|
test_sparse_match git update-index --remove folder1/a &&
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
D folder1/a
|
|
|
|
EOF
|
|
|
|
test_sparse_match git diff --cached --name-status &&
|
|
|
|
test_cmp expect sparse-checkout-out &&
|
|
|
|
|
|
|
|
# Reset the state
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
|
|
|
|
# When --ignore-skip-worktree-entries is specified, out-of-cone
|
|
|
|
# (skip-worktree) files are ignored
|
|
|
|
test_sparse_match git update-index --remove --ignore-skip-worktree-entries folder1/a &&
|
|
|
|
test_sparse_match git diff --cached --name-status &&
|
|
|
|
test_must_be_empty sparse-checkout-out &&
|
|
|
|
|
|
|
|
# Reset the state
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
|
|
|
|
# --force-remove supercedes --ignore-skip-worktree-entries, removing
|
|
|
|
# a skip-worktree file from the index (and disk) when both are specified
|
|
|
|
# with --remove
|
|
|
|
test_sparse_match git update-index --force-remove --ignore-skip-worktree-entries folder1/a &&
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
D folder1/a
|
|
|
|
EOF
|
|
|
|
test_sparse_match git diff --cached --name-status &&
|
|
|
|
test_cmp expect sparse-checkout-out
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'update-index with directories' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# update-index will exit silently when provided with a directory name
|
|
|
|
# containing a trailing slash
|
|
|
|
test_all_match git update-index deep/ folder1/ &&
|
|
|
|
grep "Ignoring path deep/" sparse-checkout-err &&
|
|
|
|
grep "Ignoring path folder1/" sparse-checkout-err &&
|
|
|
|
|
|
|
|
# When update-index is given a directory name WITHOUT a trailing slash, it will
|
|
|
|
# behave in different ways depending on the status of the directory on disk:
|
|
|
|
# * if it exists, the command exits with an error ("add individual files instead")
|
|
|
|
# * if it does NOT exist (e.g., in a sparse-checkout), it is assumed to be a
|
|
|
|
# file and either triggers an error ("does not exist and --remove not passed")
|
|
|
|
# or is ignored completely (when using --remove)
|
|
|
|
test_all_match test_must_fail git update-index deep &&
|
|
|
|
run_on_all test_must_fail git update-index folder1 &&
|
|
|
|
test_must_fail git -C full-checkout update-index --remove folder1 &&
|
|
|
|
test_sparse_match git update-index --remove folder1 &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'update-index --again file outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout -b test-reupdate &&
|
|
|
|
|
|
|
|
# Update HEAD without modifying the index to introduce a difference in
|
|
|
|
# folder1/a
|
|
|
|
test_sparse_match git reset --soft update-folder1 &&
|
|
|
|
|
|
|
|
# Because folder1/a differs in the index vs HEAD,
|
|
|
|
# `git update-index --no-skip-worktree --again` will effectively perform
|
|
|
|
# `git update-index --no-skip-worktree folder1/a` and remove the skip-worktree
|
|
|
|
# flag from folder1/a
|
|
|
|
test_sparse_match git update-index --no-skip-worktree --again &&
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
D folder1/a
|
|
|
|
EOF
|
|
|
|
test_sparse_match git diff --name-status &&
|
|
|
|
test_cmp expect sparse-checkout-out
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'update-index --cacheinfo' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
deep_a_oid=$(git -C full-checkout rev-parse update-deep:deep/a) &&
|
|
|
|
folder2_oid=$(git -C full-checkout rev-parse update-folder2:folder2) &&
|
|
|
|
folder1_a_oid=$(git -C full-checkout rev-parse update-folder1:folder1/a) &&
|
|
|
|
|
|
|
|
test_all_match git update-index --cacheinfo 100644 $deep_a_oid deep/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Cannot add sparse directory, even in sparse index case
|
|
|
|
test_all_match test_must_fail git update-index --add --cacheinfo 040000 $folder2_oid folder2/ &&
|
|
|
|
|
|
|
|
# Sparse match only: the new outside-of-cone entry is added *without* skip-worktree,
|
|
|
|
# so `git status` reports it as "deleted" in the worktree
|
|
|
|
test_sparse_match git update-index --add --cacheinfo 100644 $folder1_a_oid folder1/a &&
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
MD folder1/a
|
|
|
|
EOF
|
|
|
|
test_sparse_match git status --short -- folder1/a &&
|
|
|
|
test_cmp expect sparse-checkout-out &&
|
|
|
|
|
|
|
|
# To return folder1/a to "normal" for a sparse checkout (ignored &
|
|
|
|
# outside-of-cone), add the skip-worktree flag.
|
|
|
|
test_sparse_match git update-index --skip-worktree folder1/a &&
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
S folder1/a
|
|
|
|
EOF
|
|
|
|
test_sparse_match git ls-files -t -- folder1/a &&
|
|
|
|
test_cmp expect sparse-checkout-out
|
|
|
|
'
|
|
|
|
|
2022-03-01 20:24:27 +00:00
|
|
|
for MERGE_TREES in "base HEAD update-folder2" \
|
|
|
|
"update-folder1 update-folder2" \
|
|
|
|
"update-folder2"
|
|
|
|
do
|
|
|
|
test_expect_success "'read-tree -mu $MERGE_TREES' with files outside sparse definition" '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# Although the index matches, without --no-sparse-checkout, outside-of-
|
|
|
|
# definition files will not exist on disk for sparse checkouts
|
|
|
|
test_all_match git read-tree -mu $MERGE_TREES &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_path_is_missing sparse-checkout/folder2 &&
|
|
|
|
test_path_is_missing sparse-index/folder2 &&
|
|
|
|
|
|
|
|
test_all_match git read-tree --reset -u HEAD &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git read-tree -mu --no-sparse-checkout $MERGE_TREES &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_cmp sparse-checkout/folder2/a sparse-index/folder2/a &&
|
|
|
|
test_cmp sparse-checkout/folder2/a full-checkout/folder2/a
|
|
|
|
|
|
|
|
'
|
|
|
|
done
|
|
|
|
|
|
|
|
test_expect_success 'read-tree --merge with edit/edit conflicts in sparse directories' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# Merge of multiple changes to same directory (but not same files) should
|
|
|
|
# succeed
|
|
|
|
test_all_match git read-tree -mu base rename-base update-folder1 &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
|
|
|
|
test_all_match git read-tree -mu rename-base update-folder2 &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
|
|
|
|
test_all_match test_must_fail git read-tree -mu base update-folder1 rename-out-to-in &&
|
|
|
|
test_all_match test_must_fail git read-tree -mu rename-out-to-in update-folder1
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'read-tree --prefix' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# If files differing between the index and target <commit-ish> exist
|
|
|
|
# inside the prefix, `read-tree --prefix` should fail
|
|
|
|
test_all_match test_must_fail git read-tree --prefix=deep/ deepest &&
|
|
|
|
test_all_match test_must_fail git read-tree --prefix=folder1/ update-folder1 &&
|
|
|
|
|
|
|
|
# If no differing index entries exist matching the prefix,
|
|
|
|
# `read-tree --prefix` updates the index successfully
|
|
|
|
test_all_match git rm -rf deep/deeper1/deepest/ &&
|
|
|
|
test_all_match git read-tree --prefix=deep/deeper1/deepest -u deepest &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git rm -rf --sparse folder1/ &&
|
|
|
|
test_all_match git read-tree --prefix=folder1/ -u update-folder1 &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git rm -rf --sparse folder2/0 &&
|
|
|
|
test_all_match git read-tree --prefix=folder2/0/ -u rename-out-to-out &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'read-tree --merge with directory-file conflicts' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout -b test-branch rename-base &&
|
|
|
|
|
|
|
|
# Although the index matches, without --no-sparse-checkout, outside-of-
|
|
|
|
# definition files will not exist on disk for sparse checkouts
|
|
|
|
test_sparse_match git read-tree -mu rename-out-to-out &&
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
test_path_is_missing sparse-checkout/folder2 &&
|
|
|
|
test_path_is_missing sparse-index/folder2 &&
|
|
|
|
|
|
|
|
test_sparse_match git read-tree --reset -u HEAD &&
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_sparse_match git read-tree -mu --no-sparse-checkout rename-out-to-out &&
|
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
test_cmp sparse-checkout/folder2/0/1 sparse-index/folder2/0/1
|
|
|
|
'
|
|
|
|
|
2021-09-08 11:23:59 +00:00
|
|
|
test_expect_success 'merge, cherry-pick, and rebase' '
|
2021-01-23 19:58:19 +00:00
|
|
|
init_repos &&
|
|
|
|
|
2021-10-16 09:07:09 +00:00
|
|
|
for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
|
2021-09-08 11:23:59 +00:00
|
|
|
do
|
|
|
|
test_all_match git checkout -B temp update-deep &&
|
|
|
|
test_all_match git $OPERATION update-folder1 &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree} &&
|
|
|
|
test_all_match git $OPERATION update-folder2 &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree} || return 1
|
|
|
|
done
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
2021-09-24 15:39:08 +00:00
|
|
|
test_expect_success 'merge with conflict outside cone' '
|
2021-07-29 14:52:03 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout -b merge-tip merge-left &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match test_must_fail git merge -m merge merge-right &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Resolve the conflict in different ways:
|
|
|
|
# 1. Revert to the base
|
|
|
|
test_all_match git checkout base -- deep/deeper2/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# 2. Add the file with conflict markers
|
2021-09-24 15:39:07 +00:00
|
|
|
test_sparse_match test_must_fail git add folder1/a &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/a &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse folder1/a &&
|
2021-07-29 14:52:03 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# 3. Rename the file to another sparse filename and
|
|
|
|
# accept conflict markers as resolved content.
|
|
|
|
run_on_all mv folder2/a folder2/z &&
|
2021-09-24 15:39:07 +00:00
|
|
|
test_sparse_match test_must_fail git add folder2 &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder2/z &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse folder2 &&
|
2021-07-29 14:52:03 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git merge --continue &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree}
|
|
|
|
'
|
|
|
|
|
2021-09-24 15:39:08 +00:00
|
|
|
test_expect_success 'cherry-pick/rebase with conflict outside cone' '
|
2021-09-08 11:24:01 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
for OPERATION in cherry-pick rebase
|
|
|
|
do
|
|
|
|
test_all_match git checkout -B tip &&
|
|
|
|
test_all_match git reset --hard merge-left &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match test_must_fail git $OPERATION merge-right &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Resolve the conflict in different ways:
|
|
|
|
# 1. Revert to the base
|
|
|
|
test_all_match git checkout base -- deep/deeper2/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# 2. Add the file with conflict markers
|
2021-09-24 15:39:06 +00:00
|
|
|
# NEEDSWORK: Even though the merge conflict removed the
|
|
|
|
# SKIP_WORKTREE bit from the index entry for folder1/a, we should
|
|
|
|
# warn that this is a problematic add.
|
2021-09-24 15:39:07 +00:00
|
|
|
test_sparse_match test_must_fail git add folder1/a &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder1/a &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse folder1/a &&
|
2021-09-08 11:24:01 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# 3. Rename the file to another sparse filename and
|
|
|
|
# accept conflict markers as resolved content.
|
2021-09-24 15:39:06 +00:00
|
|
|
# NEEDSWORK: This mode now fails, because folder2/z is
|
|
|
|
# outside of the sparse-checkout cone and does not match an
|
|
|
|
# existing index entry with the SKIP_WORKTREE bit cleared.
|
2021-09-08 11:24:01 +00:00
|
|
|
run_on_all mv folder2/a folder2/z &&
|
2021-09-24 15:39:07 +00:00
|
|
|
test_sparse_match test_must_fail git add folder2 &&
|
|
|
|
grep "Disable or modify the sparsity rules" sparse-checkout-err &&
|
|
|
|
test_sparse_unstaged folder2/z &&
|
2021-09-24 15:39:08 +00:00
|
|
|
test_all_match git add --sparse folder2 &&
|
2021-09-08 11:24:01 +00:00
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git $OPERATION --continue &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree} || return 1
|
|
|
|
done
|
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_expect_success 'merge with outside renames' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
for type in out-to-out out-to-in in-to-out
|
|
|
|
do
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
test_all_match git checkout -f -b merge-$type update-deep &&
|
|
|
|
test_all_match git merge -m "$type" rename-$type &&
|
|
|
|
test_all_match git rev-parse HEAD^{tree} || return 1
|
|
|
|
done
|
|
|
|
'
|
|
|
|
|
sparse-index: skip indexes with unmerged entries
The sparse-index format is designed to be compatible with merge
conflicts, even those outside the sparse-checkout definition. The reason
is that when converting a full index to a sparse one, a cache entry with
nonzero stage will not be collapsed into a sparse directory entry.
However, this behavior was not tested, and a different behavior within
convert_to_sparse() fails in this scenario. Specifically,
cache_tree_update() will fail when unmerged entries exist.
convert_to_sparse_rec() uses the cache-tree data to recursively walk the
tree structure, but also to compute the OIDs used in the
sparse-directory entries.
Add an index scan to convert_to_sparse() that will detect if these merge
conflict entries exist and skip the conversion before trying to update
the cache-tree. This is marked as NEEDSWORK because this can be removed
with a suitable update to cache_tree_update() or a similar method that
can construct a cache-tree with invalid nodes, but still allow creating
the nodes necessary for creating sparse directory entries.
It is possible that in the future we will not need to make such an
update, since if we do not expand a sparse-index into a full one, this
conversion does not need to happen. Thus, this can be deferred until the
merge machinery is made to integrate with the sparse-index.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:12:25 +00:00
|
|
|
# Sparse-index fails to convert the index in the
|
|
|
|
# final 'git cherry-pick' command.
|
|
|
|
test_expect_success 'cherry-pick with conflicts' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-conflict <<-\EOF &&
|
|
|
|
echo $1 >conflict
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_all_match git checkout -b to-cherry-pick &&
|
|
|
|
run_on_all ../edit-conflict ABC &&
|
|
|
|
test_all_match git add conflict &&
|
|
|
|
test_all_match git commit -m "conflict to pick" &&
|
|
|
|
|
|
|
|
test_all_match git checkout -B base HEAD~1 &&
|
|
|
|
run_on_all ../edit-conflict DEF &&
|
|
|
|
test_all_match git add conflict &&
|
|
|
|
test_all_match git commit -m "conflict in base" &&
|
|
|
|
|
|
|
|
test_all_match test_must_fail git cherry-pick to-cherry-pick
|
|
|
|
'
|
|
|
|
|
2022-05-10 23:32:27 +00:00
|
|
|
test_expect_success 'stash' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
# Stash a sparse directory (folder1)
|
|
|
|
test_all_match git checkout -b test-branch rename-base &&
|
|
|
|
test_all_match git reset --soft rename-out-to-out &&
|
|
|
|
test_all_match git stash &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Apply the sparse directory stash without reinstating the index
|
|
|
|
test_all_match git stash apply -q &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Reset to state where stash can be applied
|
|
|
|
test_sparse_match git sparse-checkout reapply &&
|
|
|
|
test_all_match git reset --hard rename-out-to-out &&
|
|
|
|
|
|
|
|
# Apply the sparse directory stash *with* reinstating the index
|
|
|
|
test_all_match git stash apply --index -q &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Reset to state where we will get a conflict applying the stash
|
|
|
|
test_sparse_match git sparse-checkout reapply &&
|
|
|
|
test_all_match git reset --hard update-folder1 &&
|
|
|
|
|
|
|
|
# Apply the sparse directory stash with conflicts
|
|
|
|
test_all_match test_must_fail git stash apply --index -q &&
|
|
|
|
test_all_match test_must_fail git stash apply -q &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# Reset to base branch
|
|
|
|
test_sparse_match git sparse-checkout reapply &&
|
|
|
|
test_all_match git reset --hard base &&
|
|
|
|
|
|
|
|
# Stash & unstash an untracked file outside of the sparse checkout
|
|
|
|
# definition.
|
|
|
|
run_on_sparse mkdir -p folder1 &&
|
|
|
|
run_on_all ../edit-contents folder1/new &&
|
|
|
|
test_all_match git stash -u &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
test_all_match git stash pop -q &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
2022-01-11 18:05:01 +00:00
|
|
|
test_expect_success 'checkout-index inside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
run_on_all rm -f deep/a &&
|
|
|
|
test_all_match git checkout-index -- deep/a &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
echo test >>new-a &&
|
|
|
|
run_on_all cp ../new-a a &&
|
|
|
|
test_all_match test_must_fail git checkout-index -- a &&
|
|
|
|
test_all_match git checkout-index -f -- a &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'checkout-index outside sparse definition' '
|
|
|
|
init_repos &&
|
|
|
|
|
2022-01-11 18:05:02 +00:00
|
|
|
# Without --ignore-skip-worktree-bits, outside-of-cone files will trigger
|
|
|
|
# an error
|
|
|
|
test_sparse_match test_must_fail git checkout-index -- folder1/a &&
|
|
|
|
test_i18ngrep "folder1/a has skip-worktree enabled" sparse-checkout-err &&
|
|
|
|
test_path_is_missing folder1/a &&
|
|
|
|
|
|
|
|
# With --ignore-skip-worktree-bits, outside-of-cone files are checked out
|
|
|
|
test_sparse_match git checkout-index --ignore-skip-worktree-bits -- folder1/a &&
|
2022-01-11 18:05:01 +00:00
|
|
|
test_cmp sparse-checkout/folder1/a sparse-index/folder1/a &&
|
|
|
|
test_cmp sparse-checkout/folder1/a full-checkout/folder1/a &&
|
|
|
|
|
|
|
|
run_on_sparse rm -rf folder1 &&
|
|
|
|
echo test >new-a &&
|
|
|
|
run_on_sparse mkdir -p folder1 &&
|
|
|
|
run_on_all cp ../new-a folder1/a &&
|
|
|
|
|
2022-01-11 18:05:02 +00:00
|
|
|
test_all_match test_must_fail git checkout-index --ignore-skip-worktree-bits -- folder1/a &&
|
|
|
|
test_all_match git checkout-index -f --ignore-skip-worktree-bits -- folder1/a &&
|
2022-01-11 18:05:01 +00:00
|
|
|
test_cmp sparse-checkout/folder1/a sparse-index/folder1/a &&
|
|
|
|
test_cmp sparse-checkout/folder1/a full-checkout/folder1/a
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'checkout-index with folders' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# Inside checkout definition
|
|
|
|
test_all_match test_must_fail git checkout-index -f -- deep/ &&
|
|
|
|
|
|
|
|
# Outside checkout definition
|
2022-01-11 18:05:03 +00:00
|
|
|
# Note: although all tests fail (as expected), the messaging differs. For
|
|
|
|
# non-sparse index checkouts, the error is that the "file" does not appear
|
|
|
|
# in the index; for sparse checkouts, the error is explicitly that the
|
|
|
|
# entry is a sparse directory.
|
|
|
|
run_on_all test_must_fail git checkout-index -f -- folder1/ &&
|
|
|
|
test_cmp full-checkout-err sparse-checkout-err &&
|
|
|
|
! test_cmp full-checkout-err sparse-index-err &&
|
|
|
|
grep "is a sparse directory" sparse-index-err
|
2022-01-11 18:05:01 +00:00
|
|
|
'
|
|
|
|
|
2022-01-11 18:05:02 +00:00
|
|
|
test_expect_success 'checkout-index --all' '
|
2022-01-11 18:05:01 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout-index --all &&
|
2022-01-11 18:05:02 +00:00
|
|
|
test_sparse_match test_path_is_missing folder1 &&
|
|
|
|
|
|
|
|
# --ignore-skip-worktree-bits will cause `skip-worktree` files to be
|
|
|
|
# checked out, causing the outside-of-cone `folder1` to exist on-disk
|
|
|
|
test_all_match git checkout-index --ignore-skip-worktree-bits --all &&
|
|
|
|
test_all_match test_path_exists folder1
|
2022-01-11 18:05:01 +00:00
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_expect_success 'clean' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
echo bogus >>.gitignore &&
|
|
|
|
run_on_all cp ../.gitignore . &&
|
|
|
|
test_all_match git add .gitignore &&
|
2021-03-30 13:10:46 +00:00
|
|
|
test_all_match git commit -m "ignore bogus files" &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
run_on_sparse mkdir folder1 &&
|
2022-01-11 18:05:00 +00:00
|
|
|
run_on_all mkdir -p deep/untracked-deep &&
|
2021-01-23 19:58:19 +00:00
|
|
|
run_on_all touch folder1/bogus &&
|
2022-01-11 18:05:00 +00:00
|
|
|
run_on_all touch folder1/untracked &&
|
|
|
|
run_on_all touch deep/untracked-deep/bogus &&
|
|
|
|
run_on_all touch deep/untracked-deep/untracked &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_all_match git clean -f &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match ls &&
|
|
|
|
test_sparse_match ls folder1 &&
|
2022-01-11 18:05:00 +00:00
|
|
|
run_on_all test_path_exists folder1/bogus &&
|
|
|
|
run_on_all test_path_is_missing folder1/untracked &&
|
|
|
|
run_on_all test_path_exists deep/untracked-deep/bogus &&
|
|
|
|
run_on_all test_path_exists deep/untracked-deep/untracked &&
|
|
|
|
|
|
|
|
test_all_match git clean -fd &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
test_sparse_match ls &&
|
|
|
|
test_sparse_match ls folder1 &&
|
|
|
|
run_on_all test_path_exists folder1/bogus &&
|
|
|
|
run_on_all test_path_exists deep/untracked-deep/bogus &&
|
|
|
|
run_on_all test_path_is_missing deep/untracked-deep/untracked &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git clean -xf &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match ls &&
|
|
|
|
test_sparse_match ls folder1 &&
|
2022-01-11 18:05:00 +00:00
|
|
|
run_on_all test_path_is_missing folder1/bogus &&
|
|
|
|
run_on_all test_path_exists deep/untracked-deep/bogus &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
|
|
|
test_all_match git clean -xdf &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match ls &&
|
|
|
|
test_sparse_match ls folder1 &&
|
2022-01-11 18:05:00 +00:00
|
|
|
run_on_all test_path_is_missing deep/untracked-deep/bogus &&
|
2021-01-23 19:58:19 +00:00
|
|
|
|
sparse-index: convert from full to sparse
If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.
For now, we avoid converting the index to a sparse index if:
1. the index is split.
2. the index is already sparse.
3. sparse-checkout is disabled.
4. sparse-checkout does not use cone mode.
Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.
Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compability.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.
The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 13:10:55 +00:00
|
|
|
test_sparse_match test_path_is_dir folder1
|
2021-01-23 19:58:19 +00:00
|
|
|
'
|
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
for builtin in show rev-parse
|
|
|
|
do
|
|
|
|
test_expect_success "$builtin (cached blobs/trees)" "
|
|
|
|
init_repos &&
|
2022-04-26 20:43:16 +00:00
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
test_all_match git $builtin :a &&
|
|
|
|
test_all_match git $builtin :deep/a &&
|
|
|
|
test_sparse_match git $builtin :folder1/a &&
|
2022-04-26 20:43:16 +00:00
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
# The error message differs depending on whether
|
|
|
|
# the directory exists in the worktree.
|
|
|
|
test_all_match test_must_fail git $builtin :deep/ &&
|
|
|
|
test_must_fail git -C full-checkout $builtin :folder1/ &&
|
|
|
|
test_sparse_match test_must_fail git $builtin :folder1/ &&
|
2022-04-26 20:43:17 +00:00
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
# Change the sparse cone for an extra case:
|
|
|
|
run_on_sparse git sparse-checkout set deep/deeper1 &&
|
2022-04-26 20:43:19 +00:00
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
# deep/deeper2 is a sparse directory in the sparse index.
|
|
|
|
test_sparse_match test_must_fail git $builtin :deep/deeper2/ &&
|
2022-04-26 20:43:19 +00:00
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
# deep/deeper2/deepest is not in the sparse index, but
|
|
|
|
# will trigger an index expansion.
|
|
|
|
test_sparse_match test_must_fail git $builtin :deep/deeper2/deepest/
|
|
|
|
"
|
|
|
|
done
|
2022-04-26 20:43:16 +00:00
|
|
|
|
2021-03-30 13:10:56 +00:00
|
|
|
test_expect_success 'submodule handling' '
|
|
|
|
init_repos &&
|
|
|
|
|
2021-09-24 15:39:06 +00:00
|
|
|
test_sparse_match git sparse-checkout add modules &&
|
2021-03-30 13:10:56 +00:00
|
|
|
test_all_match mkdir modules &&
|
|
|
|
test_all_match touch modules/a &&
|
|
|
|
test_all_match git add modules &&
|
|
|
|
test_all_match git commit -m "add modules directory" &&
|
|
|
|
|
|
|
|
run_on_all git submodule add "$(pwd)/initial-repo" modules/sub &&
|
|
|
|
test_all_match git commit -m "add submodule" &&
|
|
|
|
|
|
|
|
# having a submodule prevents "modules" from collapse
|
2021-09-24 15:39:06 +00:00
|
|
|
test_sparse_match git sparse-checkout set deep/deeper1 &&
|
2021-12-22 14:20:54 +00:00
|
|
|
git -C sparse-index ls-files --sparse --stage >cache &&
|
|
|
|
grep "100644 .* modules/a" cache &&
|
|
|
|
grep "160000 $(git -C initial-repo rev-parse HEAD) 0 modules/sub" cache
|
2021-03-30 13:10:56 +00:00
|
|
|
'
|
|
|
|
|
2021-10-27 14:39:18 +00:00
|
|
|
# When working with a sparse index, some commands will need to expand the
|
|
|
|
# index to operate properly. If those commands also write the index back
|
|
|
|
# to disk, they need to convert the index to sparse before writing.
|
|
|
|
# This test verifies that both of these events are logged in trace2 logs.
|
2021-03-30 13:10:58 +00:00
|
|
|
test_expect_success 'sparse-index is expanded and converted back' '
|
|
|
|
init_repos &&
|
|
|
|
|
2021-11-29 13:47:46 +00:00
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
2021-10-27 14:39:18 +00:00
|
|
|
git -C sparse-index reset -- folder1/a &&
|
2021-03-30 13:11:00 +00:00
|
|
|
test_region index convert_to_sparse trace2.txt &&
|
2021-12-22 14:20:53 +00:00
|
|
|
test_region index ensure_full_index trace2.txt &&
|
|
|
|
|
|
|
|
# ls-files expands on read, but does not write.
|
|
|
|
rm trace2.txt &&
|
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
|
|
|
|
git -C sparse-index ls-files &&
|
2021-07-14 13:12:37 +00:00
|
|
|
test_region index ensure_full_index trace2.txt
|
|
|
|
'
|
2021-03-30 13:11:00 +00:00
|
|
|
|
2021-11-23 00:20:33 +00:00
|
|
|
test_expect_success 'index.sparse disabled inline uses full index' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# When index.sparse is disabled inline with `git status`, the
|
|
|
|
# index is expanded at the beginning of the execution then never
|
|
|
|
# converted back to sparse. It is then written to disk as a full index.
|
|
|
|
rm -f trace2.txt &&
|
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
|
|
|
|
git -C sparse-index -c index.sparse=false status &&
|
|
|
|
! test_region index convert_to_sparse trace2.txt &&
|
|
|
|
test_region index ensure_full_index trace2.txt &&
|
|
|
|
|
|
|
|
# Since index.sparse is set to true at a repo level, the index
|
|
|
|
# is converted from full to sparse when read, then never expanded
|
|
|
|
# over the course of `git status`. It is written to disk as a sparse
|
|
|
|
# index.
|
|
|
|
rm -f trace2.txt &&
|
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
|
|
|
|
git -C sparse-index status &&
|
|
|
|
test_region index convert_to_sparse trace2.txt &&
|
|
|
|
! test_region index ensure_full_index trace2.txt &&
|
|
|
|
|
|
|
|
# Now that the index has been written to disk as sparse, it is not
|
|
|
|
# converted to sparse (or expanded to full) when read by `git status`.
|
|
|
|
rm -f trace2.txt &&
|
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
|
|
|
|
git -C sparse-index status &&
|
|
|
|
! test_region index convert_to_sparse trace2.txt &&
|
|
|
|
! test_region index ensure_full_index trace2.txt
|
|
|
|
'
|
|
|
|
|
2021-06-29 02:13:04 +00:00
|
|
|
ensure_not_expanded () {
|
2021-07-14 13:12:37 +00:00
|
|
|
rm -f trace2.txt &&
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
if test -z "$WITHOUT_UNTRACKED_TXT"
|
|
|
|
then
|
|
|
|
echo >>sparse-index/untracked.txt
|
|
|
|
fi &&
|
2021-09-08 11:23:58 +00:00
|
|
|
|
|
|
|
if test "$1" = "!"
|
|
|
|
then
|
|
|
|
shift &&
|
|
|
|
test_must_fail env \
|
2021-11-29 13:47:46 +00:00
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
2021-09-08 11:23:58 +00:00
|
|
|
git -C sparse-index "$@" || return 1
|
|
|
|
else
|
2021-11-29 13:47:46 +00:00
|
|
|
GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
|
2021-09-08 11:23:58 +00:00
|
|
|
git -C sparse-index "$@" || return 1
|
|
|
|
fi &&
|
2021-07-14 13:12:37 +00:00
|
|
|
test_region ! index ensure_full_index trace2.txt
|
2021-06-29 02:13:04 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
test_expect_success 'sparse-index is not expanded' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
ensure_not_expanded status &&
|
2021-12-22 14:20:53 +00:00
|
|
|
ensure_not_expanded ls-files --sparse &&
|
2021-06-29 02:13:04 +00:00
|
|
|
ensure_not_expanded commit --allow-empty -m empty &&
|
|
|
|
echo >>sparse-index/a &&
|
|
|
|
ensure_not_expanded commit -a -m a &&
|
|
|
|
echo >>sparse-index/a &&
|
|
|
|
ensure_not_expanded commit --include a -m a &&
|
|
|
|
echo >>sparse-index/deep/deeper1/a &&
|
2021-06-29 02:13:06 +00:00
|
|
|
ensure_not_expanded commit --include deep/deeper1/a -m deeper &&
|
|
|
|
ensure_not_expanded checkout rename-out-to-out &&
|
|
|
|
ensure_not_expanded checkout - &&
|
|
|
|
ensure_not_expanded switch rename-out-to-out &&
|
|
|
|
ensure_not_expanded switch - &&
|
2021-11-29 15:52:41 +00:00
|
|
|
ensure_not_expanded reset --hard &&
|
2021-06-29 02:13:06 +00:00
|
|
|
ensure_not_expanded checkout rename-out-to-out -- deep/deeper1 &&
|
2021-11-29 15:52:41 +00:00
|
|
|
ensure_not_expanded reset --hard &&
|
2021-07-29 14:52:04 +00:00
|
|
|
ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1 &&
|
|
|
|
|
|
|
|
echo >>sparse-index/README.md &&
|
|
|
|
ensure_not_expanded add -A &&
|
|
|
|
echo >>sparse-index/extra.txt &&
|
2021-07-29 14:52:05 +00:00
|
|
|
ensure_not_expanded add extra.txt &&
|
|
|
|
echo >>sparse-index/untracked.txt &&
|
2021-09-08 11:23:57 +00:00
|
|
|
ensure_not_expanded add . &&
|
|
|
|
|
2022-01-11 18:05:03 +00:00
|
|
|
ensure_not_expanded checkout-index -f a &&
|
|
|
|
ensure_not_expanded checkout-index -f --all &&
|
2021-11-29 15:52:41 +00:00
|
|
|
for ref in update-deep update-folder1 update-folder2 update-deep
|
|
|
|
do
|
|
|
|
echo >>sparse-index/README.md &&
|
|
|
|
ensure_not_expanded reset --hard $ref || return 1
|
|
|
|
done &&
|
|
|
|
|
2021-11-29 15:52:42 +00:00
|
|
|
ensure_not_expanded reset --mixed base &&
|
2021-11-29 15:52:41 +00:00
|
|
|
ensure_not_expanded reset --hard update-deep &&
|
|
|
|
ensure_not_expanded reset --keep base &&
|
|
|
|
ensure_not_expanded reset --merge update-deep &&
|
|
|
|
ensure_not_expanded reset --hard &&
|
|
|
|
|
2021-11-29 15:52:42 +00:00
|
|
|
ensure_not_expanded reset base -- deep/a &&
|
|
|
|
ensure_not_expanded reset base -- nonexistent-file &&
|
|
|
|
ensure_not_expanded reset deepest -- deep &&
|
|
|
|
|
|
|
|
# Although folder1 is outside the sparse definition, it exists as a
|
|
|
|
# directory entry in the index, so the pathspec will not force the
|
|
|
|
# index to be expanded.
|
|
|
|
ensure_not_expanded reset deepest -- folder1 &&
|
|
|
|
ensure_not_expanded reset deepest -- folder1/ &&
|
|
|
|
|
|
|
|
# Wildcard identifies only in-cone files, no index expansion
|
|
|
|
ensure_not_expanded reset deepest -- deep/\* &&
|
|
|
|
|
|
|
|
# Wildcard identifies only full sparse directories, no index expansion
|
|
|
|
ensure_not_expanded reset deepest -- folder\* &&
|
|
|
|
|
2022-01-11 18:05:00 +00:00
|
|
|
ensure_not_expanded clean -fd &&
|
|
|
|
|
2021-09-08 11:23:57 +00:00
|
|
|
ensure_not_expanded checkout -f update-deep &&
|
|
|
|
test_config -C sparse-index pull.twohead ort &&
|
|
|
|
(
|
|
|
|
sane_unset GIT_TEST_MERGE_ALGORITHM &&
|
2021-09-08 11:24:01 +00:00
|
|
|
for OPERATION in "merge -m merge" cherry-pick rebase
|
|
|
|
do
|
|
|
|
ensure_not_expanded merge -m merge update-folder1 &&
|
|
|
|
ensure_not_expanded merge -m merge update-folder2 || return 1
|
|
|
|
done
|
2021-09-08 11:23:57 +00:00
|
|
|
)
|
2021-03-30 13:10:58 +00:00
|
|
|
'
|
|
|
|
|
2021-09-08 11:23:58 +00:00
|
|
|
test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
for side in right left
|
|
|
|
do
|
|
|
|
git -C sparse-index checkout -b expand-$side base &&
|
|
|
|
echo $side >sparse-index/deep/a &&
|
|
|
|
git -C sparse-index commit -a -m "$side" || return 1
|
|
|
|
done &&
|
|
|
|
|
|
|
|
(
|
|
|
|
sane_unset GIT_TEST_MERGE_ALGORITHM &&
|
|
|
|
git -C sparse-index config pull.twohead ort &&
|
|
|
|
ensure_not_expanded ! merge -m merged expand-right
|
|
|
|
)
|
|
|
|
'
|
|
|
|
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
test_expect_success 'sparse-index is not expanded: stash' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
echo >>sparse-index/a &&
|
|
|
|
ensure_not_expanded stash &&
|
|
|
|
ensure_not_expanded stash list &&
|
|
|
|
ensure_not_expanded stash show stash@{0} &&
|
stash: apply stash using 'merge_ort_nonrecursive()'
Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.
There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:31 +00:00
|
|
|
ensure_not_expanded stash apply stash@{0} &&
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
ensure_not_expanded stash drop stash@{0} &&
|
|
|
|
|
|
|
|
echo >>sparse-index/deep/new &&
|
2022-05-10 23:32:30 +00:00
|
|
|
ensure_not_expanded stash -u &&
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
(
|
|
|
|
WITHOUT_UNTRACKED_TXT=1 &&
|
2022-05-10 23:32:32 +00:00
|
|
|
ensure_not_expanded stash pop
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
) &&
|
|
|
|
|
|
|
|
ensure_not_expanded stash create &&
|
|
|
|
oid=$(git -C sparse-index stash create) &&
|
|
|
|
ensure_not_expanded stash store -m "test" $oid &&
|
|
|
|
ensure_not_expanded reset --hard &&
|
stash: apply stash using 'merge_ort_nonrecursive()'
Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.
There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:31 +00:00
|
|
|
ensure_not_expanded stash pop
|
stash: integrate with sparse index
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-10 23:32:28 +00:00
|
|
|
'
|
|
|
|
|
2021-12-06 15:56:00 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: diff' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
write_script edit-contents <<-\EOF &&
|
|
|
|
echo text >>$1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
# Add file within cone
|
|
|
|
test_sparse_match git sparse-checkout set deep &&
|
|
|
|
run_on_all ../edit-contents deep/testfile &&
|
|
|
|
test_all_match git add deep/testfile &&
|
|
|
|
run_on_all ../edit-contents deep/testfile &&
|
|
|
|
|
|
|
|
test_all_match git diff &&
|
|
|
|
test_all_match git diff --cached &&
|
|
|
|
ensure_not_expanded diff &&
|
|
|
|
ensure_not_expanded diff --cached &&
|
|
|
|
|
|
|
|
# Add file outside cone
|
|
|
|
test_all_match git reset --hard &&
|
|
|
|
run_on_all mkdir newdirectory &&
|
|
|
|
run_on_all ../edit-contents newdirectory/testfile &&
|
|
|
|
test_sparse_match git sparse-checkout set newdirectory &&
|
|
|
|
test_all_match git add newdirectory/testfile &&
|
|
|
|
run_on_all ../edit-contents newdirectory/testfile &&
|
|
|
|
test_sparse_match git sparse-checkout set &&
|
|
|
|
|
|
|
|
test_all_match git diff &&
|
|
|
|
test_all_match git diff --cached &&
|
|
|
|
ensure_not_expanded diff &&
|
|
|
|
ensure_not_expanded diff --cached &&
|
|
|
|
|
|
|
|
# Merge conflict outside cone
|
|
|
|
# The sparse checkout will report a warning that is not in the
|
|
|
|
# full checkout, so we use `run_on_all` instead of
|
|
|
|
# `test_all_match`
|
|
|
|
run_on_all git reset --hard &&
|
|
|
|
test_all_match git checkout merge-left &&
|
|
|
|
test_all_match test_must_fail git merge merge-right &&
|
|
|
|
|
|
|
|
test_all_match git diff &&
|
|
|
|
test_all_match git diff --cached &&
|
|
|
|
ensure_not_expanded diff &&
|
|
|
|
ensure_not_expanded diff --cached
|
|
|
|
'
|
|
|
|
|
2022-04-26 20:43:20 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: show and rev-parse' '
|
2022-04-26 20:43:17 +00:00
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
ensure_not_expanded show :a &&
|
2022-04-26 20:43:20 +00:00
|
|
|
ensure_not_expanded show :deep/a &&
|
|
|
|
ensure_not_expanded rev-parse :a &&
|
|
|
|
ensure_not_expanded rev-parse :deep/a
|
2022-04-26 20:43:17 +00:00
|
|
|
'
|
|
|
|
|
2022-01-11 18:05:05 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: update-index' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
deep_a_oid=$(git -C full-checkout rev-parse update-deep:deep/a) &&
|
|
|
|
ensure_not_expanded update-index --cacheinfo 100644 $deep_a_oid deep/a &&
|
|
|
|
|
|
|
|
echo "test" >sparse-index/README.md &&
|
|
|
|
echo "test2" >sparse-index/a &&
|
|
|
|
rm -f sparse-index/deep/a &&
|
|
|
|
|
|
|
|
ensure_not_expanded update-index --add README.md &&
|
|
|
|
ensure_not_expanded update-index a &&
|
2022-01-11 18:05:06 +00:00
|
|
|
ensure_not_expanded update-index --remove deep/a &&
|
|
|
|
|
|
|
|
ensure_not_expanded reset --soft update-deep &&
|
|
|
|
ensure_not_expanded update-index --add --remove --again
|
2022-01-11 18:05:05 +00:00
|
|
|
'
|
|
|
|
|
2021-12-06 15:56:01 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: blame' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
for file in a \
|
|
|
|
deep/a \
|
|
|
|
deep/deeper1/a \
|
|
|
|
deep/deeper1/deepest/a
|
|
|
|
do
|
|
|
|
ensure_not_expanded blame $file
|
|
|
|
done
|
|
|
|
'
|
|
|
|
|
2021-12-22 14:20:52 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: fetch/pull' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
git -C sparse-index remote add full "file://$(pwd)/full-checkout" &&
|
|
|
|
ensure_not_expanded fetch full &&
|
|
|
|
git -C full-checkout commit --allow-empty -m "for pull merge" &&
|
|
|
|
git -C sparse-index commit --allow-empty -m "for pull merge" &&
|
|
|
|
ensure_not_expanded pull full base
|
|
|
|
'
|
|
|
|
|
2022-03-01 20:24:28 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: read-tree' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
ensure_not_expanded checkout -b test-branch update-folder1 &&
|
2022-03-01 20:24:31 +00:00
|
|
|
for MERGE_TREES in "base HEAD update-folder2" \
|
|
|
|
"base HEAD rename-base" \
|
|
|
|
"base update-folder2" \
|
2022-03-01 20:24:30 +00:00
|
|
|
"base rename-base" \
|
|
|
|
"update-folder2"
|
2022-03-01 20:24:28 +00:00
|
|
|
do
|
|
|
|
ensure_not_expanded read-tree -mu $MERGE_TREES &&
|
|
|
|
ensure_not_expanded reset --hard || return 1
|
2022-03-01 20:24:29 +00:00
|
|
|
done &&
|
|
|
|
|
|
|
|
rm -rf sparse-index/deep/deeper2 &&
|
|
|
|
ensure_not_expanded add . &&
|
|
|
|
ensure_not_expanded commit -m "test" &&
|
|
|
|
|
|
|
|
ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest
|
2022-03-01 20:24:28 +00:00
|
|
|
'
|
|
|
|
|
2021-12-22 14:20:53 +00:00
|
|
|
test_expect_success 'ls-files' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
# Use a smaller sparse-checkout for reduced output
|
|
|
|
test_sparse_match git sparse-checkout set &&
|
|
|
|
|
|
|
|
# Behavior agrees by default. Sparse index is expanded.
|
|
|
|
test_all_match git ls-files &&
|
|
|
|
|
|
|
|
# With --sparse, the sparse index data changes behavior.
|
|
|
|
git -C sparse-index ls-files --sparse >actual &&
|
|
|
|
|
|
|
|
cat >expect <<-\EOF &&
|
|
|
|
a
|
2022-03-17 15:55:34 +00:00
|
|
|
before/
|
2021-12-22 14:20:53 +00:00
|
|
|
deep/
|
|
|
|
e
|
|
|
|
folder1-
|
|
|
|
folder1.x
|
|
|
|
folder1/
|
|
|
|
folder10
|
|
|
|
folder2/
|
|
|
|
g
|
|
|
|
x/
|
|
|
|
z
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_cmp expect actual &&
|
|
|
|
|
|
|
|
# With --sparse and no sparse index, nothing changes.
|
|
|
|
git -C sparse-checkout ls-files >dense &&
|
|
|
|
git -C sparse-checkout ls-files --sparse >sparse &&
|
|
|
|
test_cmp dense sparse &&
|
|
|
|
|
|
|
|
# Set up a strange condition of having a file edit
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
# outside of the sparse-checkout cone. We want to verify
|
|
|
|
# that all modes handle this the same, and detect the
|
|
|
|
# modification.
|
2021-12-22 14:20:53 +00:00
|
|
|
write_script edit-content <<-\EOF &&
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
mkdir -p folder1 &&
|
2021-12-22 14:20:53 +00:00
|
|
|
echo content >>folder1/a
|
|
|
|
EOF
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
run_on_all ../edit-content &&
|
2021-12-22 14:20:53 +00:00
|
|
|
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
test_all_match git ls-files --modified &&
|
2021-12-22 14:20:53 +00:00
|
|
|
|
|
|
|
git -C sparse-index ls-files --sparse --modified >sparse-index-out &&
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
cat >expect <<-\EOF &&
|
|
|
|
folder1/a
|
|
|
|
EOF
|
|
|
|
test_cmp expect sparse-index-out &&
|
2021-12-22 14:20:53 +00:00
|
|
|
|
|
|
|
# Add folder1 to the sparse-checkout cone and
|
|
|
|
# check that ls-files shows the expanded files.
|
|
|
|
test_sparse_match git sparse-checkout add folder1 &&
|
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-14 15:59:41 +00:00
|
|
|
test_all_match git ls-files --modified &&
|
2021-12-22 14:20:53 +00:00
|
|
|
|
|
|
|
test_all_match git ls-files &&
|
|
|
|
git -C sparse-index ls-files --sparse >actual &&
|
|
|
|
|
|
|
|
cat >expect <<-\EOF &&
|
|
|
|
a
|
2022-03-17 15:55:34 +00:00
|
|
|
before/
|
2021-12-22 14:20:53 +00:00
|
|
|
deep/
|
|
|
|
e
|
|
|
|
folder1-
|
|
|
|
folder1.x
|
|
|
|
folder1/0/0/0
|
|
|
|
folder1/0/1
|
|
|
|
folder1/a
|
|
|
|
folder10
|
|
|
|
folder2/
|
|
|
|
g
|
|
|
|
x/
|
|
|
|
z
|
|
|
|
EOF
|
|
|
|
|
|
|
|
test_cmp expect actual &&
|
|
|
|
|
|
|
|
# Double-check index expansion is avoided
|
|
|
|
ensure_not_expanded ls-files --sparse
|
|
|
|
'
|
|
|
|
|
2022-05-23 13:48:46 +00:00
|
|
|
test_expect_success 'sparse index is not expanded: sparse-checkout' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
ensure_not_expanded sparse-checkout set deep/deeper2 &&
|
|
|
|
ensure_not_expanded sparse-checkout set deep/deeper1 &&
|
|
|
|
ensure_not_expanded sparse-checkout set deep &&
|
|
|
|
ensure_not_expanded sparse-checkout add folder1 &&
|
|
|
|
ensure_not_expanded sparse-checkout set deep/deeper1 &&
|
|
|
|
ensure_not_expanded sparse-checkout set folder2 &&
|
|
|
|
|
|
|
|
# Demonstrate that the checks that "folder1/a" is a file
|
|
|
|
# do not cause a sparse-index expansion (since it is in the
|
|
|
|
# sparse-checkout cone).
|
|
|
|
echo >>sparse-index/folder2/a &&
|
|
|
|
git -C sparse-index add folder2/a &&
|
|
|
|
|
|
|
|
ensure_not_expanded sparse-checkout add folder1 &&
|
|
|
|
|
|
|
|
# Skip checks here, since deep/deeper1 is inside a sparse directory
|
|
|
|
# that must be expanded to check whether `deep/deeper1` is a file
|
|
|
|
# or not.
|
|
|
|
ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 &&
|
|
|
|
ensure_not_expanded sparse-checkout set
|
|
|
|
'
|
|
|
|
|
2021-07-14 13:12:40 +00:00
|
|
|
# NEEDSWORK: a sparse-checkout behaves differently from a full checkout
|
|
|
|
# in this scenario, but it shouldn't.
|
2021-07-14 13:12:38 +00:00
|
|
|
test_expect_success 'reset mixed and checkout orphan' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_all_match git checkout rename-out-to-in &&
|
|
|
|
|
|
|
|
# Sparse checkouts do not agree with full checkouts about
|
|
|
|
# how to report a directory/file conflict during a reset.
|
|
|
|
# This command would fail with test_all_match because the
|
|
|
|
# full checkout reports "T folder1/0/1" while a sparse
|
|
|
|
# checkout reports "D folder1/0/1". This matches because
|
|
|
|
# the sparse checkouts skip "adding" the other side of
|
|
|
|
# the conflict.
|
|
|
|
test_sparse_match git reset --mixed HEAD~1 &&
|
2021-12-22 14:20:54 +00:00
|
|
|
test_sparse_match git ls-files --stage &&
|
2021-07-14 13:12:38 +00:00
|
|
|
test_sparse_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# At this point, sparse-checkouts behave differently
|
|
|
|
# from the full-checkout.
|
|
|
|
test_sparse_match git checkout --orphan new-branch &&
|
2021-12-22 14:20:54 +00:00
|
|
|
test_sparse_match git ls-files --stage &&
|
2021-07-14 13:12:38 +00:00
|
|
|
test_sparse_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
|
|
|
test_expect_success 'add everything with deep new file' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
run_on_sparse git sparse-checkout set deep/deeper1/deepest &&
|
|
|
|
|
|
|
|
run_on_all touch deep/deeper1/x &&
|
|
|
|
test_all_match git add . &&
|
|
|
|
test_all_match git status --porcelain=v2
|
|
|
|
'
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
# NEEDSWORK: 'git checkout' behaves incorrectly in the case of
|
|
|
|
# directory/file conflicts, even without sparse-checkout. Use this
|
|
|
|
# test only as a documentation of the incorrect behavior, not a
|
|
|
|
# measure of how it _should_ behave.
|
|
|
|
test_expect_success 'checkout behaves oddly with df-conflict-1' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_sparse_match git sparse-checkout disable &&
|
|
|
|
|
|
|
|
write_script edit-content <<-\EOF &&
|
|
|
|
echo content >>folder1/larger-content
|
|
|
|
git add folder1
|
|
|
|
EOF
|
|
|
|
|
|
|
|
run_on_all ../edit-content &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
git -C sparse-checkout sparse-checkout init --cone &&
|
|
|
|
git -C sparse-index sparse-checkout init --cone --sparse-index &&
|
|
|
|
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# This checkout command should fail, because we have a staged
|
|
|
|
# change to folder1/larger-content, but the destination changes
|
|
|
|
# folder1 to a file.
|
|
|
|
git -C full-checkout checkout df-conflict-1 \
|
|
|
|
1>full-checkout-out \
|
|
|
|
2>full-checkout-err &&
|
|
|
|
git -C sparse-checkout checkout df-conflict-1 \
|
|
|
|
1>sparse-checkout-out \
|
|
|
|
2>sparse-checkout-err &&
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
git -C sparse-index checkout df-conflict-1 \
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
1>sparse-index-out \
|
|
|
|
2>sparse-index-err &&
|
|
|
|
|
|
|
|
# Instead, the checkout deletes the folder1 file and adds the
|
|
|
|
# folder1/larger-content file, leaving all other paths that were
|
|
|
|
# in folder1/ as deleted (without any warning).
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
D folder1
|
|
|
|
A folder1/larger-content
|
|
|
|
EOF
|
|
|
|
test_cmp expect full-checkout-out &&
|
|
|
|
test_cmp expect sparse-checkout-out &&
|
|
|
|
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
# The sparse-index reports no output
|
|
|
|
test_must_be_empty sparse-index-out &&
|
|
|
|
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
# stderr: Switched to branch df-conflict-1
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
test_cmp full-checkout-err sparse-checkout-err &&
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
test_cmp full-checkout-err sparse-checkout-err
|
|
|
|
'
|
|
|
|
|
|
|
|
# NEEDSWORK: 'git checkout' behaves incorrectly in the case of
|
|
|
|
# directory/file conflicts, even without sparse-checkout. Use this
|
|
|
|
# test only as a documentation of the incorrect behavior, not a
|
|
|
|
# measure of how it _should_ behave.
|
|
|
|
test_expect_success 'checkout behaves oddly with df-conflict-2' '
|
|
|
|
init_repos &&
|
|
|
|
|
|
|
|
test_sparse_match git sparse-checkout disable &&
|
|
|
|
|
|
|
|
write_script edit-content <<-\EOF &&
|
|
|
|
echo content >>folder2/larger-content
|
|
|
|
git add folder2
|
|
|
|
EOF
|
|
|
|
|
|
|
|
run_on_all ../edit-content &&
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
git -C sparse-checkout sparse-checkout init --cone &&
|
|
|
|
git -C sparse-index sparse-checkout init --cone --sparse-index &&
|
|
|
|
|
|
|
|
test_all_match git status --porcelain=v2 &&
|
|
|
|
|
|
|
|
# This checkout command should fail, because we have a staged
|
|
|
|
# change to folder1/larger-content, but the destination changes
|
|
|
|
# folder1 to a file.
|
|
|
|
git -C full-checkout checkout df-conflict-2 \
|
|
|
|
1>full-checkout-out \
|
|
|
|
2>full-checkout-err &&
|
|
|
|
git -C sparse-checkout checkout df-conflict-2 \
|
|
|
|
1>sparse-checkout-out \
|
|
|
|
2>sparse-checkout-err &&
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
git -C sparse-index checkout df-conflict-2 \
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
1>sparse-index-out \
|
|
|
|
2>sparse-index-err &&
|
|
|
|
|
|
|
|
# The full checkout deviates from the df-conflict-1 case here!
|
|
|
|
# It drops the change to folder1/larger-content and leaves the
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
# folder1 path as-is on disk. The sparse-index behaves the same.
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
test_must_be_empty full-checkout-out &&
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
test_must_be_empty sparse-index-out &&
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
|
|
|
|
# In the sparse-checkout case, the checkout deletes the folder1
|
|
|
|
# file and adds the folder1/larger-content file, leaving all other
|
|
|
|
# paths that were in folder1/ as deleted (without any warning).
|
|
|
|
cat >expect <<-EOF &&
|
|
|
|
D folder2
|
|
|
|
A folder2/larger-content
|
|
|
|
EOF
|
|
|
|
test_cmp expect sparse-checkout-out &&
|
|
|
|
|
|
|
|
# Switched to branch df-conflict-1
|
unpack-trees: resolve sparse-directory/file conflicts
When running unpack_trees() with a sparse index, we attempt to operate
on the index without expanding the sparse directory entries. Thus, we
operate by manipulating entire directories and passing them to the
unpack function. In the case of the 'git checkout' command, this is the
twoway_merge() function.
There are several cases in twoway_merge() that handle different
situations. One new one to add is the case of a directory/file conflict
where the directory is sparse. Before the sparse index, such a conflict
would appear as a list of file additions and deletions. Now,
twoway_merge() initializes 'current', 'oldtree', and 'newtree' from
src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is
equal to the df_conflict_entry. The way to determine that we have a
directory/file conflict is to test that 'current' and 'newtree' disagree
on being sparse directory entries.
When we are in this case, we want to resolve the situation by calling
merged_entry(). This allows replacing the 'current' entry with the
'newtree' entry. This is important for cases where we want to run 'git
checkout' across the conflict and have the new HEAD represent the new
file type at that path. The first NEEDSWORK comment dropped in t1092
demonstrates this necessary behavior.
However, we still are in a confusing state when 'current' corresponds to
a staged change within a sparse directory that is not present at HEAD.
This should be atypical, because it requires adding a change outside of
the sparse-checkout cone, but it is possible. Since we are unable to
determine that this is a staged change within twoway_merge(), we cannot
add a case to reject the merge at this point. I believe this is due to
the use of df_conflict_entry in the place of 'oldtree' instead of using
the valud at HEAD, which would provide some perspective to this
decision. Any change that would allow this differentiation for staged
entries would need to involve information further up in unpack_trees().
That work should be done, sometime, because we are further confusing the
behavior of a directory/file conflict when staging a change in the
directory. The two cases 'checkout behaves oddly with df-conflict-?' in
t1092 demonstrate that even without a sparse-checkout, Git is not
consistent in its behavior. Neither of the two options seems correct,
either. This change makes the sparse-index behave differently than the
typcial sparse-checkout case, but it does match the full checkout
behavior in the df-conflict-2 case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:41 +00:00
|
|
|
test_cmp full-checkout-err sparse-checkout-err &&
|
|
|
|
test_cmp full-checkout-err sparse-index-err
|
t1092: document bad 'git checkout' behavior
Add new branches to the test repo that demonstrate directory/file
conflicts in different ways. Since the directory 'folder1/' has
adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes
searches for 'folder1/' to land in a different place in the index than a
search for 'folder1'. This causes a change in behavior when working with
the df-conflict-1 and df-conflict-2 branches, whose only difference is
that the first uses 'folder1' as the conflict and the other uses
'folder2' which does not have these adjacent files.
We can extend two tests that compare the behavior across different 'git
checkout' commands, and we see already that the behavior will be
different in some cases and not in others. The difference between the
two test loops is that one uses 'git reset --hard' between iterations.
Further, we isolate the behavior of creating a staged change within a
directory and then checking out a branch where that directory is
replaced with a file. A full checkout behaves differently across these
two cases, while a sparse-checkout cone behaves consistently. In both
cases, the behavior is wrong. In one case, the staged change is dropped
entirely. The other case the staged change is kept, replacing the file
at that location, but none of the other files in the directory are kept.
Likely, the correct behavior in this case is to reject the checkout and
report the conflict, leaving HEAD in its previous location. None of the
cases behave this way currently. Use comments to demonstrate that the
tested behavior is only a documentation of the current, incorrect
behavior to ensure we do not _accidentally_ change it. Instead, we would
prefer to change it on purpose with a future change.
At this point, the sparse-index does not handle these 'git checkout'
commands correctly. Or rather, it _does_ reject the 'git checkout' when
we have the staged change, but for the wrong reason. It also rejects the
'git checkout' commands when there is no staged change and we want to
replace a directory with a file. A fix for that unstaged case will
follow in the next change, but that will make the sparse-index agree
with the full checkout case in these documented incorrect behaviors.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 20:14:40 +00:00
|
|
|
'
|
|
|
|
|
2021-01-23 19:58:19 +00:00
|
|
|
test_done
|