Correctness and performance update to "diff --color-moved" feature.
* pw/diff-color-moved-fix:
diff --color-moved: intern strings
diff: use designated initializers for emitted_diff_symbol
diff --color-moved-ws=allow-indentation-change: improve hash lookups
diff --color-moved: stop clearing potential moved blocks
diff --color-moved: shrink potential moved blocks as we go
diff --color-moved: unify moved block growth functions
diff --color-moved: call comparison function directly
diff --color-moved-ws=allow-indentation-change: simplify and optimize
diff: simplify allow-indentation-change delta calculation
diff --color-moved: avoid false short line matches and bad zebra coloring
diff --color-moved=zebra: fix alternate coloring
diff --color-moved: rewind when discarding pmb
diff --color-moved: factor out function
diff --color-moved: clear all flags on blocks that are too short
diff --color-moved: add perf tests
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As libxdiff does not have a whitespace flag to ignore the indentation
the code for --color-moved-ws=allow-indentation-change uses
XDF_IGNORE_WHITESPACE and then filters out any hash lookups where
there are non-indentation changes. This filtering is inefficient as
we have to perform another string comparison.
By using the offset data that we have already computed to skip the
indentation we can avoid using XDF_IGNORE_WHITESPACE and safely remove
the extra checks which improves the performance by 11% and paves the
way for the elimination of string comparisons in the next commit.
This change slightly increases the run time of other --color-moved
modes. This could be avoided by using different comparison functions
for the different modes but after the next two commits there is no
measurable benefit in doing so.
There is a change in behavior for lines that begin with a form-feed or
vertical-tab character. Since b46054b374 ("xdiff: use
git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as
whitespace characters. This means that lines starting with those
characters are never considered to be blank and never match a line
that does not start with the same character. After this patch a line
matching "^[\f\v\r]*[ \t]*$" is considered to be blank by
--color-moved-ws=allow-indentation-change and lines beginning
"^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This
changes the output of git show for d18f76dccf ("compat/regex: use the
regex engine from gawk for compat", 2010-08-17) as some lines in the
pre-image before a moved block that contain '\f' are now considered
moved as well as they match a blank line before the moved lines in the
post-image. This commit updates one of the tests to reflect this
change.
Test HEAD^ HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.86(0.82+0.04) 0.88(0.84+0.04) +2.3%
4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.94+0.03) 0.86(0.81+0.05) -11.3%
4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.09) 1.16(1.06+0.09) +0.0%
4002.5: log --color-moved --no-color-moved-ws 1.32(1.26+0.06) 1.33(1.27+0.05) +0.8%
4002.6: log --color-moved-ws=allow-indentation-change 1.35(1.29+0.06) 1.33(1.24+0.08) -1.5%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When marking moved lines it is possible for a block of potential
matched lines to extend past a change in sign when there is a sequence
of added lines whose text matches the text of a sequence of deleted
and added lines. Most of the time either `match` will be NULL or
`pmb_advance_or_null()` will fail when the loop encounters a change of
sign but there are corner cases where `match` is non-NULL and
`pmb_advance_or_null()` successfully advances the moved block despite
the change in sign.
One consequence of this is highlighting a short line as moved when it
should not be. For example
-moved line # Correctly highlighted as moved
+short line # Wrongly highlighted as moved
context
+moved line # Correctly highlighted as moved
+short line
context
-short line
The other consequence is coloring a moved addition following a moved
deletion in the wrong color. In the example below the first "+moved
line 3" should be highlighted as newMoved not newMovedAlternate.
-moved line 1 # Correctly highlighted as oldMoved
-moved line 2 # Correctly highlighted as oldMovedAlternate
+moved line 3 # Wrongly highlighted as newMovedAlternate
context # Everything else is highlighted correctly
+moved line 2
+moved line 3
context
+moved line 1
-moved line 3
These false matches are more likely when using --color-moved-ws with
the exception of --color-moved-ws=allow-indentation-change which ties
the sign of the current whitespace delta to the sign of the line to
avoid this problem. The fix is to check that the sign of the new line
being matched is the same as the sign of the line that started the
block of potential matches.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b0a2ba4776 ("diff --color-moved=zebra: be stricter with color
alternation", 2018-11-23) sought to avoid using the alternate colors
unless there are two adjacent moved blocks of the same
sign. Unfortunately it contains two bugs that prevented it from fixing
the problem properly. Firstly `last_symbol` is reset at the start of
each iteration of the loop losing the symbol of the last line and
secondly when deciding whether to use the alternate color it should be
checking if the current line is the same sign of the last line, not a
different sign. The combination of the two errors means that we still
use the alternate color when we should do but we also use it when we
shouldn't. This is most noticable when using
--color-moved-ws=allow-indentation-change with hunks like
-this line gets indented
+ this line gets indented
where the post image is colored with newMovedAlternate rather than
newMoved. While this does not matter much, the next commit will change
the coloring to be correct in this case, so lets fix the bug here to
make it clear why the output is changing and add a regression test.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "diff-lib" to "lib-diff". With this rename and preceding
commits there is no remaining t/*lib* which doesn't follow the
convention of being called t/lib-*.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff -I<pattern> -exit-code" should exit with 0 status when
all the changes match the ignored pattern, but it didn't.
* jc/diff-I-status-fix:
diff: correct interaction between --exit-code and -I<pattern>
Just like "git diff -w --exit-code" should exit with 0 when ignoring
whitespace differences results in no changes shown, if ignoring
certain changes with "git diff -I<pattern> --exit-code" result in an
empty patch, we should exit with 0.
The test suite did not cover the interaction between "--exit-code"
and "-w"; add one while adding a new test for "--exit-code" + "-I".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not need to hard-code the actual branch name, as we can use the
`test_commit` function to simplify the code and use the tag it
generates, thereby being a lot more precise in what we want.
Strangely enough, this test case would have succeeded even with an
overridden default branch name, obviously for the wrong reason. Let's
verify that it passes for the expected reason, by looking for a
tell-tale in Git's output.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When options such as --ignore-space-change are in use, files with
modifications can have no interesting textual changes worth showing. In
such cases, "git diff --stat" shows 0 lines of additions and deletions.
Teach "git diff --stat" not to show such a path in its output, which
would be more natural.
However, we don't want to prevent the display of all files that have 0
effective diffs since they could be the result of a rename, permission
change, or other similar operation that may still be of interest so we
special case additions and deletions as they are always interesting.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test that includes an actual function line in the test file to
check if context is expanded to include the whole function, and add an
ignored change before function context to check if that one stays hidden
while the originally ignored change within function context is shown.
This differs from the existing test, which is concerned with the case
where there is no function line at all in the file (and we might look
past the beginning of the file).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff" machinery learned not to lose added/removed blank lines
in the context when --ignore-blank-lines and --function-context are
used at the same time.
* rs/xdiff-ignore-ws-w-func-context:
xdiff: unignore changes in function context
Changes involving only blank lines are hidden with --ignore-blank-lines,
unless they appear in the context lines of other changes. This is
handled by xdl_get_hunk() for context added by --inter-hunk-context, -u
and -U.
Function context for -W and --function-context added by xdl_emit_diff()
doesn't pay attention to such ignored changes; it relies fully on
xdl_get_hunk() and shows just the post-image of ignored changes
appearing in function context. That's inconsistent and confusing.
Improve the result of using --ignore-blank-lines and --function-context
together by fully showing ignored changes if they happen to fall within
function context.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of rolling our own method to write out some lines into a file,
use the existing test_write_lines().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no git commands upstream.
The other way is when a command is in a non-assignment command
substitution. The return code will be lost in favour of the surrounding
command's. Rewrite instances of this so that git commands are either run
on their own or in an assignment-only command substitution.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal diff machinery can be made to read out of bounds while
looking for --funcion-context line in a corner case, which has been
corrected.
* jk/xdiff-clamp-funcname-context-index:
xdiff: clamp function context indices in post-image
After finding a function line for --function-context in the pre-image,
xdl_emit_diff() calculates the equivalent line in the post-image. It
assumes that the lines between changes are the same on both sides. If
the option --ignore-blank-lines was also given then this is not
necessarily true.
Clamp the calculation results for start and end of the function context
to prevent out-of-bounds array accesses.
Note that this _just_ fixes the case where our mismatch sends us off the
beginning of the file. There are likely other cases where our assumption
causes us to go to the wrong line within the file. Nobody has developed
a test case yet, and the ultimate fix is likely more complicated than
this patch. But this at least prevents a segfault in the meantime.
Credit for finding the bug goes to "Liu Wei of Tencent Security Xuanwu
Lab".
Reported-by: 刘炜 <lw17qhdz@gmail.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using --color-moved-ws=allow-indentation-change allow lines with
the same indentation change to be grouped across blank lines. For now
this only works if the blank lines have been moved as well, not for
blocks that have just had their indentation changed.
This completes the changes to the implementation of
--color-moved=allow-indentation-change. Running
git diff --color-moved=allow-indentation-change v2.18.0 v2.19.0
now takes 5.0s. This is a saving of 41% from 8.5s for the optimized
version of the previous implementation and 66% from the original which
took 14.6s.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently diff --color-moved-ws=allow-indentation-change does not
support indentation that contains a mix of tabs and spaces. For
example in commit 546f70f377 ("convert.h: drop 'extern' from function
declaration", 2018-06-30) the function parameters in the following
lines are not colored as moved [1].
-extern int stream_filter(struct stream_filter *,
- const char *input, size_t *isize_p,
- char *output, size_t *osize_p);
+int stream_filter(struct stream_filter *,
+ const char *input, size_t *isize_p,
+ char *output, size_t *osize_p);
This commit changes the way the indentation is handled to track the
visual size of the indentation rather than the characters in the
indentation. This has the benefit that any whitespace errors do not
interfer with the move detection (the whitespace errors will still be
highlighted according to --ws-error-highlight). During the discussion
of this feature there were concerns about the correct detection of
indentation for python. However those concerns apply whether or not
we're detecting moved lines so no attempt is made to determine if the
indentation is 'pythonic'.
[1] Note that before the commit to fix the erroneous coloring of moved
lines each line was colored as a different block, since that commit
they are uncolored.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently when using --color-moved=zebra the color of moved blocks
depends on the number of lines separating them. This means that adding
an odd number of unmoved lines between blocks that are already separated
by one or more unmoved lines will change the color of subsequent moved
blocks. This does not make much sense as the blocks were already
separated by unmoved lines and causes problems when adding lines to test
cases.
Fix this by only using the alternate colors for adjacent moved blocks.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'diff --color-moved-ws=allow-indentation-change' can highlight lines
that have internal whitespace changes rather than indentation
changes. For example in commit 1a07e59c3e ("Update messages in
preparation for i18n", 2018-07-21) the lines
- die (_("must end with a color"));
+ die(_("must end with a color"));
are highlighted as moved when they should not be. Modify an existing
test to show the problem that will be fixed in the next commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes the error handling for the options --color-moved-ws
and --color-moved-ws to be like the rest of the options.
Move the die() call out of parse_color_moved_ws into the parsing
of command line options. As the function returns a bit field, change
its signature to return an unsigned instead of an int; add a new bit
to signal errors. Once the error is signaled, we discard the other
bits, such that it doesn't matter if the error bit overlaps with any
other bit.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test fixes.
* sg/test-must-be-empty:
tests: use 'test_must_be_empty' instead of 'test_cmp <empty> <out>'
tests: use 'test_must_be_empty' instead of 'test_cmp /dev/null <out>'
tests: use 'test_must_be_empty' instead of 'test ! -s'
tests: use 'test_must_be_empty' instead of '! test -s'
Using 'test_must_be_empty' is shorter and more idiomatic than
>empty &&
test_cmp empty out
as it saves the creation of an empty file. Furthermore, sometimes the
expected empty file doesn't have such a descriptive name like 'empty',
and its creation is far away from the place where it's finally used
for comparison (e.g. in 't7600-merge.sh', where two expected empty
files are created in the 'setup' test, but are used only about 500
lines later).
These cases were found by instrumenting 'test_cmp' to error out the
test script when it's used to compare empty files, and then converted
manually.
Note that even after this patch there still remain a lot of cases
where we use 'test_cmp' to check empty files:
- Sometimes the expected output is not hard-coded in the test, but
'test_cmp' is used to ensure that two similar git commands produce
the same output, and that output happens to be empty, e.g. the
test 'submodule update --merge - ignores --merge for new
submodules' in 't7406-submodule-update.sh'.
- Repetitive common tasks, including preparing the expected results
and running 'test_cmp', are often extracted into a helper
function, and some of this helper's callsites expect no output.
- For the same reason as above, the whole 'test_expect_success'
block is within a helper function, e.g. in 't3070-wildmatch.sh'.
- Or 'test_cmp' is invoked in a loop, e.g. the test 'cvs update
(-p)' in 't9400-git-cvsserver-server.sh'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".
* es/diff-color-moved-fix:
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
Change various tests that use an idiom of the form:
>expect &&
test_cmp expect actual
To instead use:
test_must_be_empty actual
The test_must_be_empty() wrapper was introduced in ca8d148daf ("test:
test_must_be_empty helper", 2013-06-09). Many of these tests have been
added after that time. This was mostly found with, and manually pruned
from:
git grep '^\s+>.*expect.* &&$' t
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff. Some cases came up where different treatment would
have been nice.
Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=<modes>
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.
As we change the default, we'll adjust the tests.
For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.
As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.
By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new "blocks" mode provides a middle ground between plain and zebra.
It is as intuitive (few colors) as plain, but still has the requirement
for a minimum of lines/characters to count a block as moved.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
(https://public-inbox.org/git/87o9j0uljo.fsf@evledraar.gmail.com/)
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t4015 we have a pattern of
git diff [<options, related to color>] |
grep -v "index" |
test_decode_color >actual &&
to produce output that we want to test against. This pattern was introduced
in 86b452e276 (diff.c: add dimming to moved line detection, 2017-06-30)
as then the focus on getting the colors right. However the pattern used
is not best practice as we do care about the exit code of Git. So let's
not have Git as the upstream of a pipe. Piping the output of grep to
some function is fine as we assume grep to be un-flawed in our test suite.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent update to the submodule configuration code broke "diff-tree"
by accidentally stopping to read from the index upfront.
* bw/submodule-config-cleanup:
diff-tree: read the index so attribute checks work in bare repositories
A regression was introduced in 557a5998d (submodule: remove
gitmodules_config, 2017-08-03) to how attribute processing was handled
in bare repositories when running the diff-tree command.
By default the attribute system will first try to read ".gitattribute"
files from the working tree and then falls back to reading them from the
index if there isn't a copy checked out in the worktree. Prior to
557a5998d the index was read as a side effect of the call to
'gitmodules_config()' which ensured that the index was already populated
before entering the attribute subsystem.
Since the call to 'gitmodules_config()' was removed the index is no
longer being read so when the attribute system tries to read from the
in-memory index it doesn't find any ".gitattribute" entries effectively
ignoring any configured attributes.
Fix this by explicitly reading the index during the setup of diff-tree.
Reported-by: Ben Boeckel <ben.boeckel@kitware.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff" family of commands learned to ignore differences in
carriage return at the end of line.
* jc/ignore-cr-at-eol:
diff: --ignore-cr-at-eol
xdiff: reassign xpparm_t.flags bits
A new option --ignore-cr-at-eol tells the diff machinery to treat a
carriage-return at the end of a (complete) line as if it does not
exist.
Just like other "--ignore-*" options to ignore various kinds of
whitespace differences, this will help reviewing the real changes
you made without getting distracted by spurious CRLF<->LF conversion
made by your editor program.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
[jch: squashed in command line completion by Dscho]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code for handling whitespace with --color-moved
represents partial strings as a pair of pointers. There are
two possible conventions for the end pointer:
1. It points to the byte right after the end of the
string.
2. It points to the final byte of the string.
But we seem to use both conventions in the code:
a. we assign the initial pointers from the NUL-terminated
string using (1)
b. we eat trailing whitespace by checking the second
pointer for isspace(), which needs (2)
c. the next_byte() function checks for end-of-string with
"if (cp > endp)", which is (2)
d. in next_byte() we skip past internal whitespace with
"while (cp < end)", which is (1)
This creates fewer bugs than you might think, because there
are some subtle interactions. Because of (a) and (c), we
always return the NUL-terminator from next_byte(). But all
of the callers of next_byte() happen to handle that
gracefully.
Because of the mismatch between (d) and (c), next_byte()
could accidentally return a whitespace character right at
endp. But because of the interaction of (a) and (b), we fail
to actually chomp trailing whitespace, meaning our endp
_always_ points to a NUL, canceling out the problem.
But that does leave (b) as a real bug: when ignoring
whitespace only at the end-of-line, we don't correctly trim
it, and fail to match up lines.
We can fix the whole thing by moving consistently to one
convention. Since convention (1) is idiomatic in our code
base, we'll pick that one.
The existing "-w" and "-b" tests continue to pass, and a new
"--ignore-space-at-eol" shows off the breakage we're fixing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit fa5ba2c1dd (diff: fix infinite loop with
--color-moved --ignore-space-change, 2017-10-12) added a
test to make sure that "--color-moved -b" doesn't run
forever, but the test in question doesn't actually have any
moved lines in it.
Let's scrap that test and add a variant of the existing
"--color-moved -w" test, but this time we'll check that we
find the move with whitespace changes, but not arbitrary
whitespace additions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We test that lines with whitespace changes are not found by
"--color-moved" by default, but are found if "-w" is added.
Let's add one more twist: a line that has non-whitespace
changes should not be marked as a pure move.
This is perhaps an obvious case for us to get right (and we
do), but as we add more whitespace tests, they will form a
pattern of "make sure this case is a move and this other
case is not".
Note that we have to add a line to our moved block, since
having a too-small block doesn't trigger the "moved"
heuristics. And we also add a line of context to ensure
that there's more context lines than moved lines (so the
diff shows us moving the lines up, rather than moving the
context down).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for testing several different whitespace
options, let's split out the setup and cleanup steps of the
whitespace test.
While we're here, let's also switch to using "<<-" to indent
our here-documents properly, and use q_to_tab to more
explicitly mark where we expect whitespace to appear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recently added "--color-moved" feature of "diff" fell into
infinite loop when ignoring whitespace changes, which has been
fixed.
* sb/diff-color-move:
diff: fix infinite loop with --color-moved --ignore-space-change
The --color-moved code uses next_byte() to advance through
the blob contents. When the user has asked to ignore
whitespace changes, we try to collapse any whitespace change
down to a single space.
However, we enter the conditional block whenever we see the
IGNORE_WHITESPACE_CHANGE flag, even if the next byte isn't
whitespace.
This means that the combination of "--color-moved and
--ignore-space-change" was completely broken. Worse, because
we return from next_byte() without having advanced our
pointer, the function makes no forward progress in the
buffer and loops infinitely.
Fix this by entering the conditional only when we actually
see whitespace. We can apply this also to the
IGNORE_WHITESPACE change. That code path isn't buggy
(because it falls through to returning the next
non-whitespace byte), but it makes the logic more clear if
we only bother to look at whitespace flags after seeing that
the next byte is whitespace.
Reported-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the argument order for test_cmp. When given the expected
result first the diff shows the actual output with '+' and the
expectation with '-', which is the convention for our tests.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/ui-color-always-to-auto-maint:
color: make "always" the same as "auto" in config
provide --color option for all ref-filter users
t3205: use --color instead of color.branch=always
t3203: drop "always" color test
t6006: drop "always" color config tests
t7502: use diff.noprefix for --verbose test
t7508: use test_terminal for color output
t3701: use test-terminal to collect color output
t4015: prefer --color to -c color.diff=always
test-terminal: set TERM=vt100
The tests for --color-moved write their output to a file,
but doing so suppresses color output under "auto". Right now
this is solved by running the whole script under
"color.diff=always". In preparation for the behavior of
"always" changing, let's explicitly enable color.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>