"checkout --merge -- path" and "update-index --unresolve path" did
not resurrect conflicted state that was resolved to remove path,
but now they do.
* jc/unresolve-removal:
checkout: allow "checkout -m path" to unmerge removed paths
checkout/restore: add basic tests for --merge
checkout/restore: refuse unmerging paths unless checking out of the index
update-index: remove stale fallback code for "--unresolve"
update-index: use unmerge_index_entry() to support removal
resolve-undo: allow resurrecting conflicted state that resolved to deletion
update-index: do not read HEAD and MERGE_HEAD unconditionally
The implementation of "get_sha1_hex()" that reads a hexadecimal
string that spells a full object name has been extended to cope
with any hash function used in the repository, but the "sha1" in
its name survived. Rename it to get_hash_hex(), a name that is
more consistent within its friends like get_hash_hex_algop().
* jc/retire-get-sha1-hex:
hex: retire get_sha1_hex()
"git checkout -m -- path" uses the unmerge_marked_index() API, whose
implementation is incapable of unresolving a path that was resolved
as removed. Extend the unmerge_index() API function so that we can
mark the ce_flags member of the cache entries we add to the index as
unmerged, and replace use of unmerge_marked_index() with it.
Now, together with its unmerge_index_entry_at() helper function,
unmerge_marked_index() function is no longer called by anybody, and
can safely be removed.
This makes two known test failures in t2070 and t7201 to succeed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The naming convention around get_sha1_hex() and its friends is
awkward these days, after "struct object_id" was introduced.
There are three public functions around this area:
* get_sha1_hex() - use the implied the_hash_algo, fill uchar *
* get_oid_hex() - use the implied the_hash_algo, fill oid *
* get_oid_hex_algop() - use the passed algop, fill oid *
Between the latter two, the "_algop" suffix signals whether the
the_hash_algo is used as the implied algorithm or the caller should
pass an algorithm explicitly. That is very much understandable and
is a good convention.
Between the former two, however, the "SHA1" vs "OID" in the names
differentiate in what type of variable the result is stored.
We could argue that it makes sense to use "SHA1" to mean "flat byte
buffer" to honor the historical practice in the days before "struct
object_id" was invented, but the natural fourth friend of the above
group would take an algop and fill a flat byte buffer, and it would
be strange to name it get_sha1_hex_algop(). Do we use the passed in
algo, or are we limited to SHA-1 ;-)?
In fact, such a function exists, albeit as a private helper function
used by the implementation of these functions, and is named a lot
more sensibly: get_hash_hex_algop().
Correct the misnomer of get_sha1_hex() and use "hash", instead of
"sha1", as "flat byte buffer that stores binary (as opposed to
hexadecimal) representation of the hash".
The four (2x2) friends now become:
* get_hash_hex() - use the implied the_hash_algo, fill uchar *
* get_oid_hex() - use the implied the_hash_algo, fill oid *
* get_hash_hex_algop() - use the passed algop, fill uchar *
* get_oid_hex_algop() - use the passed algop, fill oid *
As there are only two remaining calls to get_sha1_hex() in the
codebase right now, the blast radious of this change is fairly
small.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch]. I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.
Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix. Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.
Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group. But giving them the common "ll-" prefix would group them. Using
"-ll" as a suffix thus seems just much more logical to me. Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.
[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.
Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first). This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h. Also move some related inline
functions from cache.h to read-cache.h. The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had a handful of headers including cache.h that didn't need to
anymore. Drop those includes and replace them with includes of
smaller files, or forward declarations. However, note that two .c
files now need to directly include cache.h, though they should have
been including it all along given they are directly using structs
defined in it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/remove-implicit-use-of-the-repository:
libs: use "struct repository *" argument, not "the_repository"
post-cocci: adjust comments for recent repo_* migration
cocci: apply the "revision.h" part of "the_repository.pending"
cocci: apply the "rerere.h" part of "the_repository.pending"
cocci: apply the "refs.h" part of "the_repository.pending"
cocci: apply the "promisor-remote.h" part of "the_repository.pending"
cocci: apply the "packfile.h" part of "the_repository.pending"
cocci: apply the "pretty.h" part of "the_repository.pending"
cocci: apply the "object-store.h" part of "the_repository.pending"
cocci: apply the "diff.h" part of "the_repository.pending"
cocci: apply the "commit.h" part of "the_repository.pending"
cocci: apply the "commit-reach.h" part of "the_repository.pending"
cocci: apply the "cache.h" part of "the_repository.pending"
cocci: add missing "the_repository" macros to "pending"
cocci: sort "the_repository" rules by header
cocci: fix incorrect & verbose "the_repository" rules
cocci: remove dead rule from "the_repository.pending.cocci"
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another step towards letting us remove the include of cache.h in
strbuf.c. It does mean that we also need to add includes of abspath.h
in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dozens of files made use of gettext functions, without explicitly
including gettext.h. This made it more difficult to find which files
could remove a dependence on cache.h. Make C files explicitly include
gettext.h if they are using it.
However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places. It does mean that we also need to add
includes of alloc.h in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred. Message printing
probably does not belong in a "low-level merge" anyway.
This commit continues printing the message as-is, just from the callers
instead of within ll_merge(). Future changes will start handling the
message differently in the merge-ort codepath.
There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.
Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them. After modifying all of them, I then
changed the struct to an enum.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many places in the code were doing
while ((d = readdir(dir)) != NULL) {
if (is_dot_or_dotdot(d->d_name))
continue;
...process d...
}
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
...process d...
}
This helper particularly simplifies checks for empty directories.
Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms. (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We store a struct for each directory we access under .git/rr-cache. The
structs are kept in an array sorted by the binary hash associated with
their name (and we do lookups with a binary search).
This works OK, but there are a few small downsides:
- the amount of code isn't huge, but it's more than we'd need using one
of our other stock data structures
- the insertion into a sorted array is quadratic (though in practice
it's unlikely anybody has enough conflicts for this to matter)
- it's intimately tied to the representation of an object hash. This
isn't a big deal, as the conflict ids we generate use the same hash,
but it produces a few awkward bits (e.g., we are the only user of
hash_pos() that is not using object_id).
Let's instead just treat the directory names as strings, and store them
in a strmap. This is less code, and removes the use of hash_pos().
Insertion is now non-quadratic, though we probably use a bit more
memory. Besides the hash table overhead, and storing hex bytes instead
of a binary hash, we actually store each name twice. Other code expects
to access the name of a rerere_dir struct from the struct itself, so we
need a copy there. But strmap keeps its own copy of the name, as well.
Using a bare hashmap instead of strmap means we could use the name for
both, but at the cost of extra code (e.g., our own comparison function).
Likewise, strmap has a feature to use a pointer to the in-struct name at
the cost of a little extra code. I didn't do either here, as simple code
seemed more important than squeezing out a few bytes of efficiency.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check only that get_sha1_hex() doesn't complain, which means we'd
match an all-hex name with trailing cruft after it. This probably
doesn't matter much in practice, since there shouldn't be anything else
in the rr-cache directory, but it could possibly cause us to mix up sha1
and sha256 entries (which also shouldn't be intermingled, but could be
leftovers from a repository conversion).
Note that "get_sha1_hex()" is a confusing historical name. It is
actually using the_hash_algo, so it would be sha256 in a sha256 repo.
We'll switch to using parse_oid_hex(), because that conveniently
advances our pointer. But it also gets rid of the sha1 name. Arguably
it's a little funny to use "object_id" here for something that isn't
actually naming an object, but it's unlikely to be a problem (and is
contained in a single function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rerere_gc(), we walk over the .git/rr_cache directory and create a
struct for each entry we find. We feed any name we get from readdir() to
find_rerere_dir(), which then calls get_sha1_hex() on it (since we use
the binary hash as a lookup key). If that fails (i.e., the directory
name is not what we expected), it returns NULL. But the comment in
find_rerere_dir() says "BUG".
It _would_ be a bug for the call from new_rerere_id_hex(), the only
other code path, to fail here; it's generating the hex internally. But
the call in rerere_gc() is using it say "is this a plausible directory
name".
Let's instead have rerere_gc() do its own "is this plausible" check.
That has two benefits:
- we can now reliably BUG() inside find_rerere_dir(), which would
catch bugs in the other code path (and we now will never return NULL
from the function, which makes it easier to see that a rerere_id
struct will always have a non-NULL "collection" field).
- it makes the use of the binary hash an implementation detail of
find_rerere_dir(), not known by callers. That will free us up to
change it in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change all remnants of "sha1" in hash-lookup.c and .h and rename them to
reflect that we're not just able to handle SHA-1 these days.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename this function to reflect that we're not just able to handle SHA-1
these days. There are a few instances of "sha1" left in sha1-lookup.[ch]
after this, but those will be addressed in the next commit.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the uses of sha1_to_hex in this function with hash_to_hex to
allow the use of SHA-256 as well. Rename a variable since it is no
longer limited to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hold_locked_index() assumes the index path at $GIT_DIR/index. This is
not good for places that take an arbitrary index_state instead of
the_index, which is basically everywhere except builtin/.
Replace it with repo_hold_locked_index(). hold_locked_index() remains
as a wrapper around repo_hold_locked_index() to reduce changes in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More codepaths are moving away from hardcoded hash sizes.
* bc/hash-transition-part-15:
rerere: convert to use the_hash_algo
submodule: make zero-oid comparison hash function agnostic
apply: rename new_sha1_prefix and old_sha1_prefix
apply: replace hard-coded constants
tag: express constant in terms of the_hash_algo
transport: use parse_oid_hex instead of a constant
upload-pack: express constants in terms of the_hash_algo
refs/packed-backend: express constants using the_hash_algo
packfile: express constants in terms of the_hash_algo
pack-revindex: express constants in terms of the_hash_algo
builtin/fetch-pack: remove constants with parse_oid_hex
builtin/mktree: remove hard-coded constant
builtin/repack: replace hard-coded constants
pack-bitmap-write: use GIT_MAX_RAWSZ for allocation
object_id.cocci: match only expressions of type 'struct object_id'
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
Since this data is stored in the .git directory, it makes sense for us
to use the same hash algorithm for it as for everything else. Convert
the remaining uses of SHA-1 to use the_hash_algo. Use GIT_MAX_RAWSZ for
allocations. Rename various struct members, local variables, and a
function to be named "hash" instead of "sha1".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reason rerere(), rerere_forget() and rerere_remaining() take a
struct repository instead of struct index_state is not obvious from
the patch:
Deep in update_paths() and find_conflict(), hold_locked_index() and
read_index() are called. These functions assumes the index path at
$GIT_DIR/index which is not always true when you take an arbitrary
index state. Taking a repository will allow us to point to the right
index path later when we replace them with repo_ versions.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes to "git rerere" corner cases, especially when conflict
markers cannot be parsed in the file.
* tg/rerere:
rerere: recalculate conflict ID when unresolved conflict is committed
rerere: teach rerere to handle nested conflicts
rerere: return strbuf from handle path
rerere: factor out handle_conflict function
rerere: only return whether a path has conflicts or not
rerere: fix crash with files rerere can't handle
rerere: add documentation for conflict normalization
rerere: mark strings for translation
rerere: wrap paths in output in sq
rerere: lowercase error messages
rerere: unify error messages when read_cache fails
check_one_conflict() compares `i` to `active_nr` in two places to avoid
buffer overruns, but left out an important third location.
The code did used to have a check here comparing i to active_nr, back
before commit fb70a06da2 ("rerere: fix an off-by-one non-bug",
2015-06-28), however the code at the time used an 'if' rather than a
'while' meaning back then that this loop could not have read past the
end of the array, making the check unnecessary and it was removed.
Unfortunately, in commit 5eda906b28 ("rerere: handle conflicts with
multiple stage #1 entries", 2015-07-24), the 'if' was changed to a
'while' and the check comparing i and active_nr was not re-instated,
leading to this problem.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently when a user doesn't resolve a conflict, commits the results,
and does an operation which creates another conflict, rerere will use
the ID of the previously unresolved conflict for the new conflict.
This is because the conflict is kept in the MERGE_RR file, which
'rerere' reads every time it is invoked.
After the new conflict is solved, rerere will record the resolution
with the ID of the old conflict. So in order to replay the conflict,
both merges would have to be re-done, instead of just the last one, in
order for rerere to be able to automatically resolve the conflict.
Instead of that, assign a new conflict ID if there are still conflicts
in a file and the file had conflicts at a previous step. This ID
matches the conflict we actually resolved at the corresponding step.
Note that there are no backwards compatibility worries here, as rerere
would have failed to even normalize the conflict before this patch
series.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently rerere can't handle nested conflicts and will error out when
it encounters such conflicts. Do that by recursively calling the
'handle_conflict' function to normalize the conflict.
Note that a conflict like this would only be produced if a user
commits a file with conflict markers, and gets a conflict including
that in a susbsequent operation.
The conflict ID calculation here deserves some explanation:
As we are using the same handle_conflict function, the nested conflict
is normalized the same way as for non-nested conflicts, which means
the ancestor in the diff3 case is stripped out, and the parts of the
conflict are ordered alphabetically.
The conflict ID is however is only calculated in the top level
handle_conflict call, so it will include the markers that 'rerere'
adds to the output. e.g. say there's the following conflict:
<<<<<<< HEAD
1
=======
<<<<<<< HEAD
3
=======
2
>>>>>>> branch-2
>>>>>>> branch-3~
it would be recorde as follows in the preimage:
<<<<<<<
1
=======
<<<<<<<
2
=======
3
>>>>>>>
>>>>>>>
and the conflict ID would be calculated as
sha1(1<NUL><<<<<<<
2
=======
3
>>>>>>><NUL>)
Stripping out vs. leaving the conflict markers in place in the inner
conflict should have no practical impact, but it simplifies the
implementation.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently we write the conflict to disk directly in the handle_path
function. To make it re-usable for nested conflicts, instead of
writing the conflict out directly, store it in a strbuf and let the
caller write it out.
This does mean some slight increase in memory usage, however that
increase is limited to the size of the largest conflict we've
currently processed. We already keep one copy of the conflict in
memory, and it shouldn't be too large, so the increase in memory usage
seems acceptable.
As a bonus this lets us get replace the rerere_io_putconflict function
with a trivial two line function.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out the handle_conflict function, which handles a single
conflict in a path. This is in preparation for a subsequent commit,
where this function will be re-used.
Note that this does change the behaviour of 'git rerere' slightly.
Where previously we'd consider all files where an unmatched conflict
marker is found as invalid, we now only consider files invalid when
the "ours" conflict marker ("<<<<<<< <text>") is unmatched, not when
other conflict markers (e.g. "=======") is unmatched.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently return the exact number of conflict hunks a certain path
has from the 'handle_paths' function. However all of its callers only
care whether there are conflicts or not or if there is an error.
Return only that information, and document that only that information
is returned. This will simplify the code in the subsequent steps.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently when a user does a conflict resolution and ends it (in any
way that calls 'git rerere' again) with a file 'rerere' can't handle,
subsequent rerere operations that are interested in that path, such as
'rerere clear' or 'rerere forget <path>' will fail, or even worse in
the case of 'rerere clear' segfault.
Such states include nested conflicts, or a conflict marker that
doesn't have any match.
This is because 'git rerere' calculates a conflict file and writes it
to the MERGE_RR file. When the user then changes the file in any way
rerere can't handle, and then calls 'git rerere' on it again to record
the conflict resolution, the handle_file function fails, and removes
the 'preimage' file in the rr-cache in the process, while leaving the
ID in the MERGE_RR file.
Now when 'rerere clear' is run, it reads the ID from the MERGE_RR
file, however the 'fit_variant' function for the ID is never called as
the 'preimage' file does not exist anymore. This means
'collection->status' in 'has_rerere_resolution' is NULL, and the
command will crash.
To fix this, remove the rerere ID from the MERGE_RR file in the case
when we can't handle it, just after the 'preimage' file was removed
and remove the corresponding variant from .git/rr-cache/. Removing it
unconditionally is fine here, because if the user would have resolved
the conflict and ran rerere, the entry would no longer be in the
MERGE_RR file, so we wouldn't have this problem in the first place,
while if the conflict was not resolved.
Currently there is nothing left in this folder, as the 'preimage'
was already deleted by the 'handle_file' function, so 'remove_variant'
is a no-op. Still call the function, to make sure we clean everything
up, in case we add some other files corresponding to a variant in the
future.
Note that other variants that have the same conflict ID will not be
touched.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>