Commit graph

65633 commits

Author SHA1 Message Date
Eric Sunshine fca965a3bc t/lib-pager: use sane_unset() to avoid breaking &&-chain
This test intentionally breaks the &&-chain following `unset` since it
doesn't know if `unset` will succeed or fail and doesn't want a local
`unset` failure to abort the test overall. We can do better by using
sane_unset() which can be linked into the &&-chain as usual.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:35:47 -08:00
Elijah Newren 324b170b88 t2501: simplify the tests since we can now assume desired behavior
We no longer are dealing with a mixture of previous and desired
behavior, so simplify the tests a bit.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren 580a5d7f75 dir: new flag to remove_dir_recurse() to spare the original_cwd
remove_dir_recurse(), and its non-static wrapper called
remove_dir_recursively(), both take flags for modifying its behavior.
As with the previous commits, we would generally like to protect
the original_cwd, but we want to forced user commands (e.g. 'git rm -rf
...') or other special cases to remove it.  Add a flag for this purpose.
After reading through every caller of remove_dir_recursively() in the
current codebase, there was only one that should be adjusted and that
one only in a very unusual circumstance.  Add a pair of new testcases to
highlight that very specific case involving submodules && --git-dir &&
--work-tree.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren 63bbe8beb7 dir: avoid incidentally removing the original_cwd in remove_path()
Modern git often tries to avoid leaving empty directories around when
removing files.  Originally, it did not bother.  This behavior started
with commit 80e21a9ed8 (merge-recursive::removeFile: remove empty
directories, 2005-11-19), stating the reason simply as:

    When the last file in a directory is removed as the result of a
    merge, try to rmdir the now-empty directory.

This was reimplemented in C and renamed to remove_path() in commit
e1b3a2cad7 ("Build-in merge-recursive", 2008-02-07), but was still
internal to merge-recursive.

This trend towards removing leading empty directories continued with
commit d9b814cc97 (Add builtin "git rm" command, 2006-05-19), which
stated the reasoning as:

    The other question is what to do with leading directories. The old
    "git rm" script didn't do anything, which is somewhat inconsistent.
    This one will actually clean up directories that have become empty
    as a result of removing the last file, but maybe we want to have a
    flag to decide the behaviour?

remove_path() in dir.c was added in 4a92d1bfb7 (Add remove_path: a
function to remove as much as possible of a path, 2008-09-27), because
it was noted that we had two separate implementations of the same idea
AND both were buggy.  It described the purpose of the function as

    a function to remove as much as possible of a path

Why remove as much as possible?  Well, at the time we probably would
have said something like:

  * removing leading directories makes things feel tidy
  * removing leading directories doesn't hurt anything so long as they
    had no files in them.

But I don't believe those reasons hold when the empty directory happens
to be the current working directory we inherited from our parent
process.  Leaving the parent process in a deleted directory can cause
user confusion when subsequent processes fail: any git command, for
example, will immediately fail with

    fatal: Unable to read current working directory: No such file or directory

Other commands may similarly get confused.  Modify remove_path() so that
the empty leading directories it also deletes does not include the
current working directory we inherited from our parent process.  I have
looked through every caller of remove_path() in the current codebase to
make sure that all should take this change.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren 0fce211ccc stash: do not attempt to remove startup_info->original_cwd
Since stash spawns a `clean` subprocess, make sure we run that from the
startup_info->original_cwd directory, so that the `clean` processs knows
to protect that directory.  Also, since the `clean` command might no
longer run from the toplevel, pass the ':/' magic pathspec to ensure we
still clean from the toplevel.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren bc3ae46b42 rebase: do not attempt to remove startup_info->original_cwd
Since rebase spawns a `checkout` subprocess, make sure we run that from
the startup_info->original_cwd directory, so that the checkout process
knows to protect that directory.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren c65744e7d7 clean: do not attempt to remove startup_info->original_cwd
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren 00fcce285d symlinks: do not include startup_info->original_cwd in dir removal
symlinks has a pair of schedule_dir_for_removal() and
remove_scheduled_dirs() functions that ensure that directories made
empty by removing other files also themselves get removed.  However, we
want to exclude startup_info->original_cwd and leave it around.  This
avoids the user getting confused by subsequent git commands (and non-git
commands) that would otherwise report confusing messages about being
unable to read the current working directory.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:13 -08:00
Elijah Newren 0b0ee3388c unpack-trees: add special cwd handling
When running commands such as `git reset --hard` from a subdirectory, if
that subdirectory is in the way of adding needed files, bail with an
error message.

Note that this change looks kind of like it duplicates the new lines of
code from the previous commit in verify_clean_subdirectory().  However,
when we are preserving untracked files, we would rather any error
messages about untracked files being in the way take precedence over
error messages about a subdirectory that happens to be the_original_cwd
being in the way.  But in the UNPACK_RESET_OVERWRITE_UNTRACKED case,
there is no untracked checking to be done, so we simply add a special
case near the top of verify_absent_1.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:12 -08:00
Elijah Newren b817e54533 unpack-trees: refuse to remove startup_info->original_cwd
In the past, when a directory needs to be removed to make room for a
file, we have always errored out when that directory contains any
untracked (but not ignored) files.  Add an extra condition on that: also
error out if the directory is the current working directory we inherited
from our parent process.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:12 -08:00
Elijah Newren e6f8861bd4 setup: introduce startup_info->original_cwd
Removing the current working directory causes all subsequent git
commands run from that directory to get confused and fail with a message
about being unable to read the current working directory:

    $ git status
    fatal: Unable to read current working directory: No such file or directory

Non-git commands likely have similar warnings or even errors, e.g.

    $ bash -c 'echo hello'
    shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
    hello

This confuses end users, particularly since the command they get the
error from is not the one that caused the problem; the problem came from
the side-effect of some previous command.

We would like to avoid removing the current working directory of our
parent process; towards this end, introduce a new variable,
startup_info->original_cwd, that tracks the current working directory
that we inherited from our parent process.  For convenience of later
comparisons, we prefer that this new variable store a path relative to
the toplevel working directory (thus much like 'prefix'), except without
the trailing slash.

Subsequent commits will make use of this new variable.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:12 -08:00
Elijah Newren 8a0d52dfd8 t2501: add various tests for removing the current working directory
Numerous commands will remove directories left empty as a "convenience"
after removing files within them.  That is normally fine, but removing
the current working directory can be rather inconvenient since it can
cause confusion for the user when they run subsequent commands.  For
example, after one git process has removed the current working
directory, git status/log/diff will all abort with the message:

    fatal: Unable to read current working directory: No such file or directory

We also have code paths that, when a file needs to be placed where a
directory is (due to e.g. checkout, merge, reset, whatever), will check
if this is okay and error out if not.  These rules include:
  * all tracked files under that directory are intended to be removed by
    the operation
  * none of the tracked files under that directory have uncommitted
    modification
  * there are no untracked files under that directory
However, if we end up remove the current working directory, we can cause
user confusion when they run subsequent commands, so we would prefer if
there was a fourth rule added to this list: avoid removing the current
working directory.

Since there are several code paths that can result in the current
working directory being removed, add several tests of various different
codepaths.  To make it clearer what the difference between the current
behavior and the behavior at the end of the series, code both of them
into the tests and have the appropriate behavior be selected by a flag.
Subsequent commits will toggle the flag from current to desired
behavior.

Also add a few tests suggested during the review of earlier rounds of
this patch series.

Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:33:12 -08:00
Phillip Wood 72962e8b3c diff --color-moved: intern strings
Taking inspiration from xdl_classify_record() assign an id to each
addition and deletion such that lines that match for the current
--color-moved-ws mode share the same unique id. This reduces the
number of hash lookups a little (calculating the ids still involves
one hash lookup per line) but the main benefit is that when growing
blocks of potentially moved lines we can replace string comparisons
which involve chasing a pointer with a simple integer comparison. On a
large diff this commit reduces the time to run 'diff --color-moved' by
37% compared to the previous commit and 31% compared to master, for
'diff --color-moved-ws=allow-indentation-change' the reduction is 28%
compared to the previous commit and 96% compared to master. There is
little change in the performance of 'git log --patch' as the diffs are
smaller.

Test                                                                  HEAD^              HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.33+0.05)    0.38(0.33+0.05)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.88(0.81+0.06)    0.55(0.50+0.04) -37.5%
4002.3: diff --color-moved-ws=allow-indentation-change large change   0.85(0.79+0.06)    0.61(0.54+0.06) -28.2%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16(1.07+0.08)    1.15(1.09+0.05)  -0.9%
4002.5: log --color-moved --no-color-moved-ws                         1.31(1.22+0.08)    1.29(1.19+0.09)  -1.5%
4002.6: log --color-moved-ws=allow-indentation-change                 1.32(1.24+0.08)    1.31(1.18+0.13)  -0.8%

Test                                                                  master             HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.33+0.05)   0.38(0.33+0.05)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.80 (0.75+0.04)   0.55(0.50+0.04) -31.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  14.20(14.15+0.05)   0.61(0.54+0.06) -95.7%
4002.4: log --no-color-moved --no-color-moved-ws                      1.15 (1.05+0.09)   1.15(1.09+0.05)  +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.30 (1.19+0.11)   1.29(1.19+0.09)  -0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.70 (1.63+0.06)   1.31(1.18+0.13) -22.9%

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood b4a5c5c419 diff: use designated initializers for emitted_diff_symbol
This makes it clearer which fields are being explicitly initialized
and will simplify the next commit where we add a new field to the
struct.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood 25e61909e9 diff --color-moved-ws=allow-indentation-change: improve hash lookups
As libxdiff does not have a whitespace flag to ignore the indentation
the code for --color-moved-ws=allow-indentation-change uses
XDF_IGNORE_WHITESPACE and then filters out any hash lookups where
there are non-indentation changes. This filtering is inefficient as
we have to perform another string comparison.

By using the offset data that we have already computed to skip the
indentation we can avoid using XDF_IGNORE_WHITESPACE and safely remove
the extra checks which improves the performance by 11% and paves the
way for the elimination of string comparisons in the next commit.

This change slightly increases the run time of other --color-moved
modes. This could be avoided by using different comparison functions
for the different modes but after the next two commits there is no
measurable benefit in doing so.

There is a change in behavior for lines that begin with a form-feed or
vertical-tab character. Since b46054b374 ("xdiff: use
git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as
whitespace characters. This means that lines starting with those
characters are never considered to be blank and never match a line
that does not start with the same character. After this patch a line
matching "^[\f\v\r]*[ \t]*$" is considered to be blank by
--color-moved-ws=allow-indentation-change and lines beginning
"^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This
changes the output of git show for d18f76dccf ("compat/regex: use the
regex engine from gawk for compat", 2010-08-17) as some lines in the
pre-image before a moved block that contain '\f' are now considered
moved as well as they match a blank line before the moved lines in the
post-image. This commit updates one of the tests to reflect this
change.

Test                                                                  HEAD^             HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.33+0.05)   0.38(0.33+0.05)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.86(0.82+0.04)   0.88(0.84+0.04)  +2.3%
4002.3: diff --color-moved-ws=allow-indentation-change large change   0.97(0.94+0.03)   0.86(0.81+0.05) -11.3%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16(1.07+0.09)   1.16(1.06+0.09)  +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.32(1.26+0.06)   1.33(1.27+0.05)  +0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.35(1.29+0.06)   1.33(1.24+0.08)  -1.5%

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood eec7f53b31 diff --color-moved: stop clearing potential moved blocks
moved_block_clear() was introduced in 74d156f4a1 ("diff
--color-moved-ws: fix double free crash", 2018-10-04) to free the
memory that was allocated when initializing a potential moved
block. However since 21536d077f ("diff --color-moved-ws: modify
allow-indentation-change", 2018-11-23) initializing a potential moved
block no longer allocates any memory. Up until the last commit we were
relying on moved_block_clear() to set the `match` pointer to NULL when
a block stopped matching, but since that commit we do not clear a
moved block that does not match so it does not make sense to clear
them elsewhere.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood 0e488f1732 diff --color-moved: shrink potential moved blocks as we go
Rather than setting `match` to NULL and then looping over the list of
potential matched blocks for a second time to remove blocks with no
matches just filter out the blocks with no matches as we go.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood ff046a0066 diff --color-moved: unify moved block growth functions
After the last two commits pmb_advance_or_null() and
pmb_advance_or_null_multi_match() differ only in the comparison they
perform. Lets simplify the code by combining them into a single
function.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:06 -08:00
Phillip Wood 08fba1076f diff --color-moved: call comparison function directly
This change will allow us to easily combine pmb_advance_or_null() and
pmb_advance_or_null_multi_match() in the next commit. Calling
xdiff_compare_lines() directly rather than using a function pointer
from the hash map has little effect on the run time.

Test                                                                  HEAD^             HEAD
-------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38(0.35+0.03)   0.38(0.32+0.06) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.87(0.83+0.04)   0.87(0.80+0.06) +0.0%
4002.3: diff --color-moved-ws=allow-indentation-change large change   0.97(0.92+0.04)   0.97(0.93+0.04) +0.0%
4002.4: log --no-color-moved --no-color-moved-ws                      1.17(1.06+0.10)   1.16(1.10+0.05) -0.9%
4002.5: log --color-moved --no-color-moved-ws                         1.32(1.24+0.08)   1.31(1.22+0.09) -0.8%
4002.6: log --color-moved-ws=allow-indentation-change                 1.36(1.25+0.10)   1.35(1.25+0.10) -0.7%

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood 52d14e166d diff --color-moved-ws=allow-indentation-change: simplify and optimize
If we already have a block of potentially moved lines then as we move
down the diff we need to check if the next line of each potentially
moved line matches the current line of the diff. The implementation of
--color-moved-ws=allow-indentation-change was needlessly performing
this check on all the lines in the diff that matched the current line
rather than just the current line. To exacerbate the problem finding
all the other lines in the diff that match the current line involves a
fuzzy lookup so we were wasting even more time performing a second
comparison to filter out the non-matching lines. Fixing this reduces
time to run
  git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0
by 93% compared to master and simplifies the code.

Test                                                                  HEAD^              HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.35+0.03)   0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.86 (0.80+0.06)   0.87(0.83+0.04)  +1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  19.01(18.93+0.06)   0.97(0.92+0.04) -94.9%
4002.4: log --no-color-moved --no-color-moved-ws                      1.16 (1.06+0.09)   1.17(1.06+0.10)  +0.9%
4002.5: log --color-moved --no-color-moved-ws                         1.32 (1.25+0.07)   1.32(1.24+0.08)  +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                 1.71 (1.64+0.06)   1.36(1.25+0.10) -20.5%

Test                                                                  master             HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.38 (0.33+0.05)   0.38(0.35+0.03)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.80 (0.75+0.04)   0.87(0.83+0.04)  +8.7%
4002.3: diff --color-moved-ws=allow-indentation-change large change  14.20(14.15+0.05)   0.97(0.92+0.04) -93.2%
4002.4: log --no-color-moved --no-color-moved-ws                      1.15 (1.05+0.09)   1.17(1.06+0.10)  +1.7%
4002.5: log --color-moved --no-color-moved-ws                         1.30 (1.19+0.11)   1.32(1.24+0.08)  +1.5%
4002.6: log --color-moved-ws=allow-indentation-change                 1.70 (1.63+0.06)   1.36(1.25+0.10) -20.0%

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood 76e32d6193 diff: simplify allow-indentation-change delta calculation
Now that we reliably end a block when the sign changes we don't need
the whitespace delta calculation to rely on the sign.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood eb89352504 diff --color-moved: avoid false short line matches and bad zebra coloring
When marking moved lines it is possible for a block of potential
matched lines to extend past a change in sign when there is a sequence
of added lines whose text matches the text of a sequence of deleted
and added lines. Most of the time either `match` will be NULL or
`pmb_advance_or_null()` will fail when the loop encounters a change of
sign but there are corner cases where `match` is non-NULL and
`pmb_advance_or_null()` successfully advances the moved block despite
the change in sign.

One consequence of this is highlighting a short line as moved when it
should not be. For example

-moved line  # Correctly highlighted as moved
+short line  # Wrongly highlighted as moved
 context
+moved line  # Correctly highlighted as moved
+short line
 context
-short line

The other consequence is coloring a moved addition following a moved
deletion in the wrong color. In the example below the first "+moved
line 3" should be highlighted as newMoved not newMovedAlternate.

-moved line 1 # Correctly highlighted as oldMoved
-moved line 2 # Correctly highlighted as oldMovedAlternate
+moved line 3 # Wrongly highlighted as newMovedAlternate
 context      # Everything else is highlighted correctly
+moved line 2
+moved line 3
 context
+moved line 1
-moved line 3

These false matches are more likely when using --color-moved-ws with
the exception of --color-moved-ws=allow-indentation-change which ties
the sign of the current whitespace delta to the sign of the line to
avoid this problem. The fix is to check that the sign of the new line
being matched is the same as the sign of the line that started the
block of potential matches.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood eb315457f6 diff --color-moved=zebra: fix alternate coloring
b0a2ba4776 ("diff --color-moved=zebra: be stricter with color
alternation", 2018-11-23) sought to avoid using the alternate colors
unless there are two adjacent moved blocks of the same
sign. Unfortunately it contains two bugs that prevented it from fixing
the problem properly. Firstly `last_symbol` is reset at the start of
each iteration of the loop losing the symbol of the last line and
secondly when deciding whether to use the alternate color it should be
checking if the current line is the same sign of the last line, not a
different sign. The combination of the two errors means that we still
use the alternate color when we should do but we also use it when we
shouldn't. This is most noticable when using
--color-moved-ws=allow-indentation-change with hunks like

-this line gets indented
+    this line gets indented

where the post image is colored with newMovedAlternate rather than
newMoved. While this does not matter much, the next commit will change
the coloring to be correct in this case, so lets fix the bug here to
make it clear why the output is changing and add a regression test.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood 0990658bf8 diff --color-moved: rewind when discarding pmb
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood 7dfe427107 diff --color-moved: factor out function
This code is quite heavily indented and having it in its own function
simplifies an upcoming change.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood bea084ba41 diff --color-moved: clear all flags on blocks that are too short
If a block of potentially moved lines is not long enough then the
DIFF_SYMBOL_MOVED_LINE flag is cleared on the matching lines so they
are not marked as moved. To avoid problems when we start rewinding
after an unsuccessful match in a couple of commits time make sure all
the move related flags are cleared, not just DIFF_SYMBOL_MOVED_LINE.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Phillip Wood f73613ac33 diff --color-moved: add perf tests
Add some tests so we can monitor changes to the performance of the
move detection code. The tests record the performance --color-moved
and --color-moved-ws=allow-indentation-change for a large diff and a
sequence of smaller diffs. The range of commits used for the large
diff can be customized by exporting TEST_REV_A and TEST_REV_B when
running the test.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-09 13:24:05 -08:00
Junio C Hamano deefc2d9f6 flex-array: simplify compiler-specific workaround
We use "type array[];" syntax for the flex-array member at the end
of a struct under C99 or later, except when we are building with
older SUNPRO_C compilers.  As we find more vendor compilers that
claim to grok C99 but not understand the flex-array syntax, the
existing "If we are using C99, but not with these compilers..."
conditional will keep growing.

Make it more manageable by listing vendor-specific exceptions
earlier, with the expectation that new exceptions will not be
combined into existing ones to make the condition longer, and
instead will be implemented as a new "#elif" in the cascade of
similar to old SUNPRO_C, we can just add a single line

    #elif defined(_MSC_VER)

immediately before "#elif defined(__GNUC__)" to cause us to fallback
to the safer but a bit wasteful version.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-08 17:45:16 -08:00
Neeraj Singh ecd81dfc79 tmp-objdir: disable ref updates when replacing the primary odb
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.

Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.

Peff's test case [1] was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.

[1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-08 14:06:46 -08:00
Neeraj Singh b3cecf49ea tmp-objdir: new API for creating temporary writable databases
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it.  The
subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.

Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.

Based-on-patch-by: Elijah Newren <newren@gmail.com>

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-08 14:06:36 -08:00
Jeff King 5f46385309 config.mak.dev: specify -std=gnu99 for gcc/clang
The point of DEVELOPER=1 is to turn up the warnings so we can catch
portability or correctness mistakes at the compiler level. But since
modern compilers tend to default to modern standards like gnu17, we
might miss warnings about older standards, even though we expect Git to
build with compilers that use them.

So it's helpful for developer builds to set the -std argument to our
lowest-common denominator. Traditionally this was c89, but since we're
moving to assuming c99 in 7bc341e21b (git-compat-util: add a test
balloon for C99 support, 2021-12-01) that seems like a good spot to
land. And as explained in that commit, we want "gnu99" because we still
want to take advantage of some extensions when they're available.

The new argument kicks in only for clang and gcc (which we know to
support "-std=" and "gnu" standards). And only for compiler versions
which default to a newer standard. That will avoid accidentally
silencing any build problems that non-developers would run into on older
compilers that default to c89.

My digging found that the default switched to gnu11 in gcc 5.1.0.
Clang's documentation is less clear, but has done so since at least
clang-7. So that's what I put in the conditional here. It's OK to err on
the side of not-enabling this for older compilers. Most developers (as
well as CI) are using much more recent versions, so any warnings will
eventually surface.

A concrete example is anonymous unions, which became legal in c11.
Without this patch, "gcc -pedantic" will not complain about them, but
will if we add in "-std=gnu99".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-08 13:16:44 -08:00
Junio C Hamano e95566d909 Merge branch 'bc/require-c99' into jk/limit-developers-to-gnu99
* bc/require-c99:
  git-compat-util: add a test balloon for C99 support
2021-12-08 13:16:32 -08:00
Ævar Arnfjörð Bjarmason 17baeaf82d pull, fetch: fix segfault in --set-upstream option
Fix a segfault in the --set-upstream option added in
24bc1a1292 (pull, fetch: add --set-upstream option, 2019-08-19) added
in v2.24.0.

The code added there did not do the same checking we do for "git
branch" itself since 8efb8899cf (branch: segfault fixes and
validation, 2013-02-23), which in turn fixed the same sort of segfault
I'm fixing now in "git branch --set-upstream-to", see
6183d826ba (branch: introduce --set-upstream-to, 2012-08-20).

The warning message I'm adding here is an amalgamation of the error
added for "git branch" in 8efb8899cf, and the error output
install_branch_config() itself emits, i.e. it trims "refs/heads/" from
the name and says "branch X on remote", not "branch refs/heads/X on
remote".

I think it would make more sense to simply die() here, but in the
other checks for --set-upstream added in 24bc1a1292 we issue a
warning() instead. Let's do the same here for consistency for now.

There was an earlier submitted alternate way of fixing this in [1],
due to that patch breaking threading with the original report at [2] I
didn't notice it before authoring this version. I think the more
detailed warning message here is better, and we should also have tests
for this behavior.

The --no-rebase option to "git pull" is needed as of the recently
merged 7d0daf3f12 (Merge branch 'en/pull-conflicting-options',
2021-08-30).

1. https://lore.kernel.org/git/20210706162238.575988-1-clemens@endorphin.org/
2. https://lore.kernel.org/git/CAG6gW_uHhfNiHGQDgGmb1byMqBA7xa8kuH1mP-wAPEe5Tmi2Ew@mail.gmail.com/

Reported-by: Clemens Fruhwirth <clemens@endorphin.org>
Reported-by: Jan Pokorný <poki@fnusa.cz>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 15:19:28 -08:00
Eric Wong 2c68f577fc cbtree: remove broken and unused cb_unlink
cb_unlink is broken once a node is no longer self-referential
due to subsequent insertions.  This is a consequence of an
intrusive implementation and I'm not sure if it's easily fixable
while retaining our cache-friendly intrusive property (I've
tried for several hours in another project).

In any case, we're not using cb_unlink anywhere in our codebase,
just get rid of it to avoid misleading future readers.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 15:18:35 -08:00
Ævar Arnfjörð Bjarmason f5c39c3268 config API: use get_error_routine(), not vreportf()
Change the git_die_config() function added in 5a80e97c82 (config: add
`git_die_config()` to the config-set API, 2014-08-07) to use the
public callbacks in the usage.[ch] API instead of the the underlying
vreportf() function.

In preceding commits the rest of the vreportf() users outside of
usage.c was migrated to die_message(), so we can now make it "static".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:16 -08:00
Ævar Arnfjörð Bjarmason 24f6e6d626 usage.c + gc: add and use a die_message_errno()
Change the "error: " output when we exit with 128 due to gc.log errors
to use a "fatal: " prefix instead. To do this add a
die_message_errno() a sibling function to the die_errno() added in a
preceding commit.

Before this we'd expect report_last_gc_error() to return -1 from
error_errno() in this case. It already treated a status of 0 and 1
specially. Let's just document that anything that's not 0 or 1 should
be returned.

We could also retain the "ret < 0" behavior here without hardcoding
128 by returning -128, and having the caller do a "return -ret", but I
think this makes more sense, and preserves the path from
die_message*()'s return value to the "return" without hardcoding
"128".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:16 -08:00
Ævar Arnfjörð Bjarmason 0faf84d97d gc: return from cmd_gc(), don't call exit()
A minor code cleanup. Let's "return" from cmd_gc() instead of calling
exit(). See 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08) for other such cases.

While we're at it add a \n to separate the variable declaration from
the rest of the code in this block. Both of these changes make a
subsequent change smaller and easier to read.

This change isn't really needed for that subsequent change, but now
someone viewing that future behavior change won't need to wonder why
we're either still calling exit() here, or fixing it while we're at
it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:16 -08:00
Ævar Arnfjörð Bjarmason adcd4d4c6f usage.c API users: use die_message() for error() + exit 128
Continue the migration of code that printed a message and exited with
128. In this case the caller used "error()", so we'll be changing the
output from "error: " to "fatal: ". This change is intentional and
desired.

This code is dying, so it should emit "fatal", the only reason it
didn't do so was because before the existence of "die_message()" it
would have needed to craft its own "fatal: " message.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:15 -08:00
Ævar Arnfjörð Bjarmason e081a7c3b7 usage.c API users: use die_message() for "fatal :" + exit 128
Change code that printed its own "fatal: " message and exited with a
status code of 128 to use the die_message() function added in a
preceding commit.

This change also demonstrates why the return value of
die_message_routine() needed to be that of "report_fn". We have
callers such as the run-command.c::child_err_spew() which would like
to replace its error routine with the return value of
"get_die_message_routine()".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:15 -08:00
Ævar Arnfjörð Bjarmason 18568ee8f8 usage.c: add a die_message() routine
We have code in various places that would like to call die(), but
wants to defer the exit(128) it would invoke, e.g. to print an
additional message, or adjust the exit code. Add a die_message()
helper routine to bridge this gap in the API.

Functionally this behaves just like the error() routine, except it'll
print a "fatal: " prefix, and it will return with 128 instead of -1,
this is so that caller can pass the return value to "exit()", instead
of having to hardcode "exit(128)".

Note that as with the other routines the "die_message_builtin" needs
to return "void" and otherwise conform to the "report_fn"
signature.

As we'll see in a subsequent commit callers will want to replace
e.g. their default "die_routine" with a "die_message_routine".

For now we're just adding the routine and making die_builtin() in
usage.c itself use it. In order to do that we need to add a
get_die_message_routine() function, which works like the other
get_*_routine() functions in usage.c. There is no
set_die_message_rotine(), as it hasn't been needed yet. We can add it
if we ever need it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:25:15 -08:00
Han-Wen Nienhuys 9912391402 t1430: create valid symrefs using test-helper
This still leaves some other direct filesystem access. Currently, the files
backend does not allow invalidly named symrefs. Fixes for this are currently in
the 'seen' branch

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:20 -08:00
Han-Wen Nienhuys e39ceeb475 t1430: remove refs using test-tool
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:19 -08:00
Han-Wen Nienhuys 3c966c7b4e refs: introduce REF_SKIP_REFNAME_VERIFICATION flag
Use this flag with the test-helper in t1430, to avoid direct writes to the ref
database.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:19 -08:00
Han-Wen Nienhuys e9706a188f refs: introduce REF_SKIP_OID_VERIFICATION flag
This lets the ref-store test helper write non-existent or unparsable objects
into the ref storage.

Use this to make t1006 and t3800 independent of the files storage backend.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:19 -08:00
Han-Wen Nienhuys 0464d0a134 refs: update comment.
REF_IS_PRUNING is right below this comment, so it clearly does not belong in
this comment. This was apparently introduced in commit 5ac95fee (Nov 5, 2017
"refs: tidy up and adjust visibility of the `ref_update` flags").

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:19 -08:00
Han-Wen Nienhuys df25a19d72 test-ref-store: plug memory leak in cmd_delete_refs
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:19 -08:00
Han-Wen Nienhuys cd2d40fb7f test-ref-store: parse symbolic flag constants
This lets tests use REF_XXXX constants instead of hardcoded integers. The flag
names should be separated by a ','.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:18 -08:00
Han-Wen Nienhuys 93db6eef04 test-ref-store: remove force-create argument for create-reflog
Nobody uses force_create=0, so this flag is unnecessary.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 13:15:18 -08:00
Ævar Arnfjörð Bjarmason eafd6e7e55 object.c: use BUG(...) no die("BUG: ...") in lookup_object_by_type()
Adjust code added in 7463064b28 (object.h: add
lookup_object_by_type() function, 2021-06-22) to use the BUG()
function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 12:33:58 -08:00
Ævar Arnfjörð Bjarmason a78537a0f2 pathspec: use BUG(...) not die("BUG:%s:%d....", <file>, <line>)
Change code that was added in 8f4f8f4579 (guard against new pathspec
magic in pathspec matching code, 2013-07-14) to use the BUG() macro
instead of emitting a "fatal" message with the "__FILE__"-name and
"__LINE__"-numbers.

The original code predated the existence of the BUG() function, which
was added in d8193743e0 (usage.c: add BUG() function, 2017-05-12).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-07 12:31:17 -08:00