Commit graph

652 commits

Author SHA1 Message Date
Junio C Hamano 5efabc7ed9 Merge branch 'ew/hashmap'
Code clean-up of the hashmap API, both users and implementation.

* ew/hashmap:
  hashmap_entry: remove first member requirement from docs
  hashmap: remove type arg from hashmap_{get,put,remove}_entry
  OFFSETOF_VAR macro to simplify hashmap iterators
  hashmap: introduce hashmap_free_entries
  hashmap: hashmap_{put,remove} return hashmap_entry *
  hashmap: use *_entry APIs for iteration
  hashmap_cmp_fn takes hashmap_entry params
  hashmap_get{,_from_hash} return "struct hashmap_entry *"
  hashmap: use *_entry APIs to wrap container_of
  hashmap_get_next returns "struct hashmap_entry *"
  introduce container_of macro
  hashmap_put takes "struct hashmap_entry *"
  hashmap_remove takes "const struct hashmap_entry *"
  hashmap_get takes "const struct hashmap_entry *"
  hashmap_add takes "struct hashmap_entry *"
  hashmap_get_next takes "const struct hashmap_entry *"
  hashmap_entry_init takes "struct hashmap_entry *"
  packfile: use hashmap_entry in delta_base_cache_entry
  coccicheck: detect hashmap_entry.hash assignment
  diff: use hashmap_entry_init on moved_entry.ent
2019-10-15 13:48:02 +09:00
Junio C Hamano 280bd44551 Merge branch 'en/merge-recursive-cleanup'
The merge-recursive machinery is one of the most complex parts of
the system that accumulated cruft over time.  This large series
cleans up the implementation quite a bit.

* en/merge-recursive-cleanup: (26 commits)
  merge-recursive: fix the fix to the diff3 common ancestor label
  merge-recursive: fix the diff3 common ancestor label for virtual commits
  merge-recursive: alphabetize include list
  merge-recursive: add sanity checks for relevant merge_options
  merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_*
  merge-recursive: split internal fields into a separate struct
  merge-recursive: avoid losing output and leaking memory holding that output
  merge-recursive: comment and reorder the merge_options fields
  merge-recursive: consolidate unnecessary fields in merge_options
  merge-recursive: move some definitions around to clean up the header
  merge-recursive: rename merge_options argument to opt in header
  merge-recursive: rename 'mrtree' to 'result_tree', for clarity
  merge-recursive: use common name for ancestors/common/base_list
  merge-recursive: fix some overly long lines
  cache-tree: share code between functions writing an index as a tree
  merge-recursive: don't force external callers to do our logging
  merge-recursive: remove useless parameter in merge_trees()
  merge-recursive: exit early if index != head
  Ensure index matches head before invoking merge machinery, round N
  merge-recursive: remove another implicit dependency on the_repository
  ...
2019-10-15 13:47:59 +09:00
Elijah Newren b657047719 merge-recursive: fix the fix to the diff3 common ancestor label
In commit 8e4ec337 ("merge-recursive: fix the diff3 common ancestor
label for virtual commits", 2019-10-01), which was a fix to commit
743474cbfa ("merge-recursive: provide a better label for diff3
common ancestor", 2019-08-17), the label for the common ancestor was
changed from always being

         "merged common ancestors"

to instead be based on the number of merge bases and whether the merge
base was a real commit or a virtual one:

    >=2: "merged common ancestors"
      1, via merge_recursive_generic: "constructed merge base"
      1, otherwise: <abbreviated commit hash>
      0: "<empty tree>"

The handling for "constructed merge base" worked by allowing
opt->ancestor to be set in merge_recursive_generic(), so we paid
attention to the setting of that variable in merge_recursive_internal().
Now, for the outer merge, the code flow was simply the following:

	ancestor_name = "merged merge bases"
	loop over merge_bases: merge_recursive_internal()

The first merge base not needing recursion would determine its own
ancestor_name however necessary and thus run

	ancestor_name = $SOMETHING
	empty loop over merge_bases...
	opt->ancestor = ancestor_name
	merge_trees_internal()

Now, the next set of merge_bases that would need to be merged after this
particular merge had completed would note that opt->ancestor has been
set to something (to a local ancestor_name variable that has since been
popped off the stack), and thus it would run:

	... else if (opt->ancestor) {
		ancestor_name = opt->ancestor;  /* OOPS! */
	}
	loop over merge_bases: merge_recursive_internal()
	opt->ancestor = ancestor_name
	merge_trees_internal()

This resulted in garbage strings being printed for the virtual merge
bases, which was visible in git.git by just merging commit b744c3af07
into commit 6d8cb22a4f.  There are two ways to fix this: set
opt->ancestor to NULL after using it to avoid re-use, or add a
!opt->priv->call_depth check to the if block for using a pre-defined
opt->ancestor.  Apply both fixes.
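
The two fixes can be illustrated with a minimal sketch. Everything below is hypothetical: `struct sketch_opts` and `sketch_merge()` are pared-down stand-ins, not the real merge_options or merge_recursive_internal(), and the "base #N" label merely plays the role of a name built in a local buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, pared-down stand-in for the real merge options. */
struct sketch_opts {
	const char *ancestor; /* label for the merge-base side of conflicts */
	int call_depth;       /* 0 for the outermost merge, >0 when recursing */
};

/* Picks a label, "uses" it by copying it to used_label, then clears it. */
static void sketch_merge(struct sketch_opts *opt, int num_bases,
			 char *used_label, size_t len)
{
	char local_name[32]; /* lives only as long as this call */
	const char *ancestor_name;

	assert(opt && used_label && len > 0);

	if (opt->ancestor && !opt->call_depth) {
		/* Fix 2: honor a preset label only at the outermost level. */
		ancestor_name = opt->ancestor;
	} else if (num_bases >= 2) {
		ancestor_name = "merged common ancestors";
	} else {
		snprintf(local_name, sizeof(local_name), "base #%d", num_bases);
		ancestor_name = local_name;
	}

	opt->ancestor = ancestor_name;
	/* The real code would merge trees here; we only record the label. */
	snprintf(used_label, len, "%s", opt->ancestor);

	/* Fix 1: do not leave a pointer to local_name behind. */
	opt->ancestor = NULL;
}
```

Either fix alone would stop the stale pointer from being read; applying both also keeps a label supplied by the caller from leaking into inner merges.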

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-08 11:36:27 +09:00
Eric Wong 404ab78e39 hashmap: remove type arg from hashmap_{get,put,remove}_entry
Since these macros already take a `keyvar' pointer of a known type,
we can rely on OFFSETOF_VAR to get the correct offset without
relying on non-portable `__typeof__' and `offsetof'.

Argument order is also rearranged, so `keyvar' and `member' are
sequential as they are used as: `keyvar->member'

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:12 +09:00
Eric Wong 23dee69f53 OFFSETOF_VAR macro to simplify hashmap iterators
While we cannot rely on a `__typeof__' operator being portable
to use with `offsetof', we can calculate the pointer offset
using an existing pointer and the address of a member using
pointer arithmetic for compilers without `__typeof__'.

This allows us to simplify usage of hashmap iterator macros
by not having to specify a type when a pointer of that type
is already given.

In the future, list iterator macros (e.g. list_for_each_entry)
may also be implemented using OFFSETOF_VAR to save hackers the
trouble of using container_of/list_entry macros and without
relying on non-portable `__typeof__'.
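
As a rough illustration of the pointer-arithmetic fallback described above, here is a sketch under stated assumptions: `SKETCH_OFFSETOF_VAR`, `struct entry`, `struct item` and `item_from_entry()` are names made up for this example and are not claimed to match the macro or structs in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an embedded hashmap entry; field names are illustrative. */
struct entry {
	struct entry *next;
	unsigned int hash;
};

struct item {
	int value;
	struct entry ent; /* deliberately not the first member */
};

/* Offset of `member`, computed from an existing pointer; no type name needed. */
#define SKETCH_OFFSETOF_VAR(ptr, member) \
	((uintptr_t)&(ptr)->member - (uintptr_t)(ptr))

/* Step back from an embedded entry to the struct that contains it. */
static struct item *item_from_entry(struct entry *e, const struct item *any_item)
{
	size_t off = SKETCH_OFFSETOF_VAR(any_item, ent);

	/* The variable-based offset agrees with the type-based one. */
	assert(off == offsetof(struct item, ent));
	return (struct item *)((char *)e - off);
}
```

The sketch passes a pointer to a valid object as `any_item`, so taking the member's address is well defined here.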

v3: use `__typeof__' to avoid clang warnings

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:11 +09:00
Eric Wong c8e424c9c9 hashmap: introduce hashmap_free_entries
`hashmap_free_entries' behaves like `container_of' and passes
the offset of the hashmap_entry struct to the internal
`hashmap_free_' function, allowing the function to free any
struct pointer regardless of where the hashmap_entry field
is located.

`hashmap_free' no longer takes any arguments aside from
the hashmap itself.
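
The idea of handing an offset to an internal free routine might look roughly like this. It is a sketch only: `struct node`, `struct record`, `free_chain()` and `FREE_CHAIN` are hypothetical names, and a plain linked chain stands in for the hashmap's buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the embedded entry struct. */
struct node {
	struct node *next;
};

struct record {
	int id;
	struct node link; /* not the first member */
};

/*
 * Frees every container in the chain.  entry_offset says where the
 * node sits inside its container.  Returns the number of frees.
 */
static size_t free_chain(struct node *head, size_t entry_offset)
{
	size_t n = 0;

	while (head) {
		struct node *next = head->next;

		free((char *)head - entry_offset);
		head = next;
		n++;
	}
	return n;
}

/* Wrapper that derives the offset, in the spirit of container_of. */
#define FREE_CHAIN(head, type, member) \
	free_chain((head), offsetof(type, member))

/* Allocates a record and links it in front of head. */
static struct node *push_record(struct node *head, int id)
{
	struct record *r = malloc(sizeof(*r));

	assert(r);
	r->id = id;
	r->link.next = head;
	return &r->link;
}
```

Because the offset travels with the call, the free routine never needs to know the container type.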

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:11 +09:00
Eric Wong 87571c3f71 hashmap: use *_entry APIs for iteration
Inspired by list_for_each_entry in the Linux kernel.
Once again, these are somewhat compromised usability-wise
by compilers lacking __typeof__ support.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:11 +09:00
Eric Wong 939af16eac hashmap_cmp_fn takes hashmap_entry params
Another step in eliminating the requirement of hashmap_entry
being the first member of a struct.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:11 +09:00
Eric Wong f23a465132 hashmap_get{,_from_hash} return "struct hashmap_entry *"
Update callers to use hashmap_get_entry, hashmap_get_entry_from_hash
or container_of as appropriate.

This is another step towards eliminating the requirement of
hashmap_entry being the first field in a struct.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:11 +09:00
Eric Wong 26b455f21e hashmap_put takes "struct hashmap_entry *"
This is less error-prone than "void *" as the compiler now
detects invalid types being passed.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:10 +09:00
Eric Wong 28ee794128 hashmap_remove takes "const struct hashmap_entry *"
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:10 +09:00
Eric Wong b6c5241606 hashmap_get takes "const struct hashmap_entry *"
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:10 +09:00
Eric Wong b94e5c1df6 hashmap_add takes "struct hashmap_entry *"
This is less error-prone than "void *" as the compiler now
detects invalid types being passed.

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:10 +09:00
Eric Wong d22245a2e3 hashmap_entry_init takes "struct hashmap_entry *"
C compilers do type checking to make life easier for us.  So
rely on that and update all hashmap_entry_init callers to take
"struct hashmap_entry *" to avoid future bugs while improving
safety and readability.
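
A small sketch of why the typed parameter helps, using hypothetical stand-ins (`struct entry`, `struct item`, `entry_init()`) rather than the real hashmap declarations:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the embedded entry struct. */
struct entry {
	struct entry *next;
	unsigned int hash;
};

struct item {
	int value;
	struct entry ent;
};

/*
 * With a typed parameter, passing a "struct item *" by mistake is an
 * incompatible-pointer diagnostic.  A "void *" parameter would have
 * accepted it silently.
 */
static void entry_init(struct entry *e, unsigned int hash)
{
	assert(e != NULL);
	e->next = NULL;
	e->hash = hash;
}
```

A caller writes `entry_init(&it.ent, hash)`; writing `entry_init(&it, hash)` by accident is something the compiler can now flag.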

Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07 10:20:09 +09:00
Elijah Newren 8e4ec3376e merge-recursive: fix the diff3 common ancestor label for virtual commits
In commit 743474cbfa ("merge-recursive: provide a better label for
diff3 common ancestor", 2019-08-17), the label for the common ancestor
was changed from always being

         "merged common ancestors"

to instead be based on the number of merge bases:

    >=2: "merged common ancestors"
      1: <abbreviated commit hash>
      0: "<empty tree>"

Unfortunately, this did not take into account that when we have a single
merge base, that merge base could be fake or constructed.  In such
cases, this resulted in a label of "00000000".  Of course, the previous
label of "merged common ancestors" was also misleading for this case.
Since we have an API that is explicitly about creating fake merge base
commits in merge_recursive_generic(), we should provide a better label
when using that API with one merge base.  So, when
merge_recursive_generic() is called with one merge base, set the label
to:

         "constructed merge base"

Note that callers of merge_recursive_generic() include the builtin
commands git-am (in combination with git apply --build-fake-ancestor),
git-merge-recursive, and git-stash.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-02 14:59:29 +09:00
Jonathan Tan 83e3ad3b12 merge-recursive: symlink's descendants not in way
When the working tree has:
 - bar (directory)
 - bar/file (file)
 - foo (symlink to .)

(note that lstat() for "foo/bar" would tell us that it is a directory)

and the user merges a commit that deletes the foo symlink and instead
contains:
 - bar (directory, as above)
 - bar/file (file, as above)
 - foo (directory)
 - foo/bar (file)

the merge should happen without requiring user intervention. However,
this does not happen.

This is because dir_in_way(), when checking the working tree, thinks
that "foo/bar" is a directory. But a symlink should be treated much the
same as a file: since dir_in_way() is only checking to see if there is a
directory in the way, we don't want symlinks in leading paths to
sometimes cause dir_in_way() to return true.

Teach dir_in_way() to also check for symlinks in leading paths before
reporting whether a directory is in the way.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-20 10:15:57 -07:00
Elijah Newren 4615a8cb5b merge-recursive: alphabetize include list
Other than cache.h which needs to appear first, and merge-recursive.h
which I want to be second so that we are more likely to notice if
merge-recursive.h has any missing includes, the rest of the list is
long and easier to look through if it's alphabetical.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren 45ef16f77a merge-recursive: add sanity checks for relevant merge_options
There are lots of options that callers can set, yet most have a limited
range of valid values, some options are meant for output (e.g.
opt->obuf, which is expected to start empty), and callers are expected
to not set opt->priv.  Add several sanity checks to ensure callers
provide sane values.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren f3081dae01 merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_*
I want to implement the same outward facing API as found within
merge-recursive.h in a different merge strategy.  However, that makes
names like MERGE_RECURSIVE_{NORMAL,OURS,THEIRS} look a little funny;
rename to MERGE_VARIANT_{NORMAL,OURS,THEIRS}.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren 5bf7e5779e merge-recursive: split internal fields into a separate struct
merge_options has several internal fields that should not be set or read
by external callers.  This just complicates the API.  Move them into an
opaque merge_options_internal struct that is defined only in
merge-recursive.c and keep these out of merge-recursive.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren e95e481f9e merge-recursive: avoid losing output and leaking memory holding that output
If opt->buffer_output is less than 2, then merge_trees(),
merge_recursive(), and merge_recursive_generic() are all supposed to
flush the opt->obuf output buffer to stdout and release any memory it
holds.  merge_trees() did not do this.  Move the logic that handles this
for merge_recursive_internal() to merge_finalize() so that all three
methods handle this requirement.

Note that this bug didn't cause any problems currently, because there
are only two callers of merge_trees() right now (a git grep for
'merge_trees(' is misleading because builtin/merge-tree.c also defines a
'merge_tree' function that is unrelated), and only one of those is
called with buffer_output less than 2 (builtin/checkout.c), but it set
opt->verbosity to 0, for which there is only currently one non-error
message that would be shown: "Already up to date!".  However, that one
message can only occur when the merge is utterly trivial (the merge base
tree exactly matches the merge tree), and builtin/checkout.c already
attempts a trivial merge via unpack_trees() before falling back to
merge_trees().

Also, if opt->buffer_output is 2, then the caller is responsible for
showing any output in opt->obuf and for freeing it.  This
requirement might be easy to overlook, so add a comment to
merge-recursive.h pointing it out.  (There are currently two callers
that set buffer_output to 2, both in sequencer.c, and both of which
handle this correctly.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren a779fb829b merge-recursive: comment and reorder the merge_options fields
The merge_options struct had lots of fields, making it a little
imposing, but the options naturally fall into multiple different groups.
Grouping similar options and adding a comment or two makes it easier to
read, easier for new folks to figure out which options are related, and
thus easier for them to find the options they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren 8599ab4574 merge-recursive: consolidate unnecessary fields in merge_options
We provided users with the ability to state whether they wanted rename
detection, and to put a limit on how much CPU would be spent.  Both of
these fields had multiple configuration parameters for setting them,
with one being a fallback and the other being an override.  However,
instead of implementing the logic for how to combine the multiple
source locations into the appropriate setting at config loading time,
we loaded and tracked both values and then made the code combine them
every time it wanted to check the overall value.  This had a few
minor drawbacks:
  * it seems more complicated than necessary
  * it runs the risk of people using the independent settings in the
    future and breaking the intent of how the options are used
    together
  * it makes merge_options more complicated than necessary for other
    potential users of the API

Fix these problems by moving the logic for combining the pairs of
options into a single value; make it apply at time-of-config-loading
instead of each-time-of-use.
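
Combining an override and a fallback once, at load time, could be sketched as follows. The function name, the "negative means unset" convention and the built-in default are all assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>

/*
 * Resolve one effective value from an override, a fallback and a
 * built-in default.  A negative value means "not configured".
 */
static int resolve_setting(int override_value, int fallback_value,
			   int builtin_default)
{
	assert(builtin_default >= 0);
	if (override_value >= 0)
		return override_value;
	if (fallback_value >= 0)
		return fallback_value;
	return builtin_default;
}
```

Callers would then store only the resolved value and never consult the two raw settings again.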

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren 7c0a6c8e47 merge-recursive: move some definitions around to clean up the header
No substantive code changes (view this with diff --color-moved), but
a few small code cleanups:
  * Move structs and an inline function only used by merge-recursive.c
    into merge-recursive.c
  * Re-order function declarations to be more logical
  * Add or fix some explanatory comments

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren bab56877e0 merge-recursive: rename 'mrtree' to 'result_tree', for clarity
It is not at all clear what 'mr' was supposed to stand for, at least not
to me.  Pick a clearer name for this variable.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:04 -07:00
Elijah Newren ff1bfa2cd5 merge-recursive: use common name for ancestors/common/base_list
merge_trees(), merge_recursive(), and merge_recursive_generic() in
their function headers used four different names for the merge base or
list of merge bases they were passed:
  * 'common'
  * 'ancestors'
  * 'ca'
  * 'base_list'
They were able to refer to it four different ways instead of only three
by using a different name in the signature for the .c file than the .h
file.  Change all of these to 'merge_base' or 'merge_bases'.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 4d7101e25c merge-recursive: fix some overly long lines
No substantive code change, just add some line breaks to fix lines that
have grown in length due to various refactorings.  Most remaining lines
of excessive length in merge-recursive include error messages and it's
not clear that splitting those improves things.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 724dd767b2 cache-tree: share code between functions writing an index as a tree
write_tree_from_memory() appeared to be a merge-recursive special that
basically duplicated write_index_as_tree().  The two have a different
signature, but the bigger difference was just that write_index_as_tree()
would always unconditionally read the index off of disk instead of
working on the current in-memory index.  So:

  * split out common code into write_index_as_tree_internal()

  * rename write_tree_from_memory() to write_inmemory_index_as_tree(),
    make it call write_index_as_tree_internal(), and move it to
    cache-tree.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 345480d1ed merge-recursive: don't force external callers to do our logging
Alternatively, you can view this as "make the merge functions behave
more similarly."  merge-recursive has three different entry points:
merge_trees(), merge_recursive(), and merge_recursive_generic().  Two of
these would call diff_warn_rename_limit(), but merge_trees() didn't.
This led to callers of merge_trees() needing to manually call
diff_warn_rename_limit() themselves.  Move this to the new
merge_finalize() function to make sure that all three entry points run
this function.

Note that there are two external callers of merge_trees(), one in
sequencer.c and one in builtin/checkout.c.  The one in sequencer.c is
cleaned up by this patch and just transfers where the call to
diff_warn_rename_limit() is made; the one in builtin/checkout.c is for
switching to a different commit and in the very rare case where the
warning might be triggered, it would probably be helpful to include
(e.g. if someone is modifying a file that has been renamed in moving to
the other commit, but there are so many renames between the commits that
the limit kicks in and none are detected, it may help to have an
explanation about why they got a delete/modify conflict instead of a
proper content merge in a renamed file).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren b4db8a2b76 merge-recursive: remove useless parameter in merge_trees()
merge_trees() took a results parameter that would only be written when
opt->call_depth was positive, which is never the case now that
merge_trees_internal() has been split from merge_trees().  Remove the
misleading and unused parameter from merge_trees().

While at it, add some comments explaining how the output of
merge_trees() and merge_recursive() differ.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 98a1d3d888 merge-recursive: exit early if index != head
We had a rule to enforce that the index matches head, but it was found
at the beginning of merge_trees() and would only trigger when
opt->call_depth was 0.  Since merge_recursive() doesn't call
merge_trees() until after returning from recursing, this meant that the
check wasn't triggered by merge_recursive() until it had first finished
all the intermediate merges to create virtual merge bases.  That is a
potentially huge amount of computation (and writing of intermediate
merge results into the .git/objects directory) before it errors out and
says, in effect, "Sorry, I can't do any merging because you have some
local changes that would be overwritten."

Further, not enforcing this requirement earlier allowed other bugs (such
as an unintentional unconditional dropping and reloading of the index in
merge_recursive() even when no recursion was necessary), to mask bugs in
other callers (which were fixed in the commit prior to this one).

Make sure we do the index == head check at the beginning of the merge,
and error out immediately if it fails.  While we're at it, fix a small
leak in the show-the-error codepath.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 10f751c06b merge-recursive: remove another implicit dependency on the_repository
Commit d7cf3a96e9 ("merge-recursive.c: remove implicit dependency on
the_repository", 2019-01-12) and follow-ups like commit 34e7771bc6
("Use the right 'struct repository' instead of the_repository",
2019-06-27), removed most implicit uses of the_repository.  Convert
calls to get_commit_tree() to instead use repo_get_commit_tree() to get
rid of another.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren f836bf3937 merge-recursive: future-proof update_file_flags() against memory leaks
There is a 'free_buf' label to which all but one of the error paths in
update_file_flags() jump; that error case involves a NULL buf and is
thus not a memory leak.  However, make that error case execute the same
deallocation code anyway so that if anyone adds any additional memory
allocations or deallocations, then all error paths correctly deallocate
resources.
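
The single-cleanup-label pattern described here can be sketched with a hypothetical function; `copy_nonempty()` is invented for this example and has nothing to do with update_file_flags() beyond sharing the shape of its error handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies a non-empty src into dst.  Returns 0 on success, -1 on error. */
static int copy_nonempty(const char *src, char *dst, size_t dstlen)
{
	int ret = -1;
	char *buf = NULL;

	assert(dst && dstlen > 0);

	if (!src || !*src)
		goto free_buf; /* buf is still NULL; free(NULL) is a no-op */

	buf = malloc(strlen(src) + 1);
	if (!buf)
		goto free_buf;
	strcpy(buf, src);

	if (strlen(buf) >= dstlen)
		goto free_buf; /* too long for dst */
	strcpy(dst, buf);
	ret = 0;

free_buf:
	/* Every path, success or error, releases buf here. */
	free(buf);
	return ret;
}
```

Routing even the NULL-buffer error through the label costs nothing today and keeps later allocations from being leaked on that path.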

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Derrick Stolee 8e01251694 merge-recursive: introduce an enum for detect_directory_renames values
Improve code readability by introducing an enum to replace the
not-quite-boolean values taken on by detect_directory_renames.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 743474cbfa merge-recursive: provide a better label for diff3 common ancestor
In commit 7ca56aa076 ("merge-recursive: add a label for ancestor",
2010-03-20), a label was added for the '||||||' line to make it have
the more informative heading '|||||| merged common ancestors', with
the statement:

    It would be nicer to use a more informative label.  Perhaps someone
    will provide one some day.

This chosen label was perfectly reasonable when recursiveness kicks in,
i.e. when there are multiple merge bases.  (I can't think of a better
label in such cases.)  But it is actually somewhat misleading when there
is a unique merge base or no merge base.  Change this based on the
number of merge bases:
    >=2: "merged common ancestors"
    1:   <abbreviated commit hash>
    0:   "<empty tree>"
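
The three cases above amount to a small selection function. The following is only a sketch: `ancestor_label()` and its parameters are hypothetical, and the abbreviated hash is passed in as a ready-made string rather than computed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the diff3 ancestor label for the given number of merge bases
 * into buf and returns buf.  abbrev_hash is only read when there is
 * exactly one merge base.
 */
static const char *ancestor_label(int num_bases, const char *abbrev_hash,
				  char *buf, size_t buflen)
{
	assert(num_bases >= 0 && buf && buflen > 0);

	if (num_bases == 0)
		snprintf(buf, buflen, "<empty tree>");
	else if (num_bases == 1)
		snprintf(buf, buflen, "%s", abbrev_hash);
	else
		snprintf(buf, buflen, "merged common ancestors");
	return buf;
}
```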

Tests have also been added to check that we get the right ancestor name
for each of the three cases.

Also, since merge_recursive() and merge_trees() have polar opposite
pre-conditions for opt->ancestor, document merge_recursive()'s
pre-condition with an assertion.  (An assertion was added to
merge_trees() already a few commits ago.)  The differences in
pre-conditions stem from two factors: (1) merge_trees() does not recurse
and thus does not have multiple sub-merges to worry about -- each of
which would require a different value for opt->ancestor, (2)
merge_trees() is only passed trees rather than commits and thus cannot
internally guess as good of a label.  Thus, while external callers of
merge_trees() are required to provide a non-NULL opt->ancestor,
merge_recursive() expects to set this value itself.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-19 10:08:03 -07:00
Elijah Newren 139ef37a2f merge-recursive: enforce opt->ancestor != NULL when calling merge_trees()
We always want our conflict hunks to be labelled so that users can know
where each came from.  The previous commit fixed the one caller in the
codebase which was not setting opt->ancestor (and thus not providing a
label for the "merge base" conflict hunk in diff3-style conflict
markers); add an assertion to prevent future codepaths from also
overlooking this requirement.

Enforcing this requirement also allows us to simplify the code for
labelling the conflict hunks by no longer checking if the ancestor label
is NULL.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-16 12:47:20 -07:00
Elijah Newren d8523ca1b9 merge-recursive: be consistent with assert
In commit 8daec1df03 ("merge-recursive: switch from (oid,mode) pairs
to a diff_filespec", 2019-04-05), an assertion on a->path && b->path
was added for code readability to document that these both needed to be
non-NULL at this point in the code.  However, the subsequent lines also
read o->path, so it should be included in the assert.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-16 12:47:20 -07:00
Junio C Hamano 0bdaec1d3d Merge branch 'en/disable-dir-rename-in-recursive-merge'
"merge-recursive" hit a BUG() when building a virtual merge base
detected a directory rename.

* en/disable-dir-rename-in-recursive-merge:
  merge-recursive: avoid directory rename detection in recursive case
2019-08-08 14:26:10 -07:00
Elijah Newren ff6d54771a merge-recursive: avoid directory rename detection in recursive case
Ever since commit 8c8e5bd6eb ("merge-recursive: switch directory
rename detection default", 2019-04-05), the default handling with
directory rename detection was to report a conflict and leave unstaged
entries in the index.  However, when creating a virtual merge base in
the recursive case, we absolutely need a tree, and the only way a tree
can be written is if we have no unstaged entries -- otherwise we hit a
BUG().

There are a few fixes possible here which at least fix the BUG(), but
none of them seem optimal for other reasons; see the comments with the
new testcase 13e in t6043 for details (which testcase triggered a BUG()
prior to this patch).  As such, just opt for a very conservative and
simple choice that is still relatively reasonable: have the recursive
case treat 'conflict' as 'false' for opt->detect_directory_renames.
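
In code, the conservative choice could look something like the sketch below. The enum and function names are hypothetical and chosen for this example; only the rule itself (inner merges treat 'conflict' as 'false') comes from the message above.

```c
#include <assert.h>

/* Illustrative three-state setting for directory rename detection. */
enum dir_rename_mode {
	DIR_RENAMES_NONE = 0,
	DIR_RENAMES_CONFLICT = 1,
	DIR_RENAMES_TRUE = 2
};

/*
 * Inner (recursive) merges must be able to write a tree, so they
 * downgrade "conflict" to "none".  The outermost merge keeps the
 * configured mode.
 */
static enum dir_rename_mode effective_mode(enum dir_rename_mode configured,
					   int call_depth)
{
	assert(call_depth >= 0);
	if (call_depth > 0 && configured == DIR_RENAMES_CONFLICT)
		return DIR_RENAMES_NONE;
	return configured;
}
```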

Reported-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-06 10:42:36 -07:00
Nguyễn Thái Ngọc Duy 34e7771bc6 Use the right 'struct repository' instead of the_repository
There are a couple of places where 'struct repository' is already passed
around, but the_repository is still used. Use the right repo.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:45:17 -07:00
Nguyễn Thái Ngọc Duy 90d3405196 match-trees.c: remove the_repo from shift_tree*()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:45:17 -07:00
Nguyễn Thái Ngọc Duy 50ddb089ff tree-walk.c: remove the_repo from get_tree_entry()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:45:17 -07:00
Junio C Hamano 20fbf7dd42 Merge branch 'en/merge-directory-renames-fix'
Recent code restructuring of merge-recursive engine introduced a
regression dealing with rename/add conflict.

* en/merge-directory-renames-fix:
  merge-recursive: restore accidentally dropped setting of path
2019-06-06 14:03:36 -07:00
Elijah Newren 481de8a293 merge-recursive: restore accidentally dropped setting of path
In commit 8daec1df03 ("merge-recursive: switch from (oid,mode) pairs
to a diff_filespec", 2019-04-05), we actually switched from
(oid,mode,path) triplets to a diff_filespec -- but most callsites in the
patch only needed to worry about oid and mode so the commit message
focused on that.  The oversight in the commit message apparently spilled
over to the code as well; one of the dozen or so callsites accidentally
dropped the setting of the path in the conversion.  Restore the path
setting in that location.

Also, this pointed out that our testsuite was lacking a good rename/add
test, at least one that involved the need for merge content with the
rename.  Add such a test, and since rename/add vs. add/rename could
possibly be important, redo the merge the opposite direction to make
sure we don't have issues with the direction of the merge.  These
testcases failed before restoring the setting of path, but with the
paths appropriately set the testcases both pass.

Reported-by: Ben Humphreys <behumphreys@atlassian.com>
Based-on-patch-by: SZEDER Gábor <szeder.dev@gmail.com>
Tested-by: Ben Humphreys <behumphreys@atlassian.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-05 09:30:40 -07:00
Junio C Hamano 0b179f3175 Merge branch 'nd/sha1-name-c-wo-the-repository'
Further code clean-up to allow the lowest level of name-to-object
mapping layer to work with a passed-in repository other than the
default one.

* nd/sha1-name-c-wo-the-repository: (34 commits)
  sha1-name.c: remove the_repo from get_oid_mb()
  sha1-name.c: remove the_repo from other get_oid_*
  sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name
  submodule-config.c: use repo_get_oid for reading .gitmodules
  sha1-name.c: add repo_get_oid()
  sha1-name.c: remove the_repo from get_oid_with_context_1()
  sha1-name.c: remove the_repo from resolve_relative_path()
  sha1-name.c: remove the_repo from diagnose_invalid_index_path()
  sha1-name.c: remove the_repo from handle_one_ref()
  sha1-name.c: remove the_repo from get_oid_1()
  sha1-name.c: remove the_repo from get_oid_basic()
  sha1-name.c: remove the_repo from get_describe_name()
  sha1-name.c: remove the_repo from get_oid_oneline()
  sha1-name.c: add repo_interpret_branch_name()
  sha1-name.c: remove the_repo from interpret_branch_mark()
  sha1-name.c: remove the_repo from interpret_nth_prior_checkout()
  sha1-name.c: remove the_repo from get_short_oid()
  sha1-name.c: add repo_for_each_abbrev()
  sha1-name.c: store and use repo in struct disambiguate_state
  sha1-name.c: add repo_find_unique_abbrev_r()
  ...
2019-05-09 00:37:25 +09:00
Junio C Hamano 96379f043f Merge branch 'en/merge-directory-renames'
"git merge-recursive" backend recently learned a new heuristics to
infer file movement based on how other files in the same directory
moved.  As this is an inherently less robust heuristic than the one
based on the content similarity of the file itself (rather than
based on what its neighbours are doing), it sometimes gives an
outcome unexpected by the end users.  This has been toned down to
leave the renamed paths in higher/conflicted stages in the index so
that the user can examine and confirm the result.

* en/merge-directory-renames:
  merge-recursive: switch directory rename detection default
  merge-recursive: give callers of handle_content_merge() access to contents
  merge-recursive: track information associated with directory renames
  t6043: fix copied test description to match its purpose
  merge-recursive: switch from (oid,mode) pairs to a diff_filespec
  merge-recursive: cleanup handle_rename_* function signatures
  merge-recursive: track branch where rename occurred in rename struct
  merge-recursive: remove ren[12]_other fields from rename_conflict_info
  merge-recursive: shrink rename_conflict_info
  merge-recursive: move some struct declarations together
  merge-recursive: use 'ci' for rename_conflict_info variable name
  merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf'
  merge-recursive: rename diff_filespec 'one' to 'o'
  merge-recursive: rename merge_options argument from 'o' to 'opt'
  Use 'unsigned short' for mode, like diff_filespec does
2019-05-09 00:37:22 +09:00
Nguyễn Thái Ngọc Duy a133c40b23 commit.cocci: refactor code, avoid double rewrite
"maybe" pointer in 'struct commit' is tricky because it can be lazily
initialized to take advantage of commit-graph if available. This makes
it not safe to access directly.

This leads to a rule in commit.cocci to rewrite 'x->maybe_tree' to
'get_commit_tree(x)'. But that rule alone could incorrectly rewrite
assignments, e.g. from

    x->maybe_tree = yes

to

    get_commit_tree(x) = yes

Because of this we have a second rule to revert this effect. Szeder
found out that we could do better by performing the assignment rewrite
rule first; what remains is then read-only access, which is handled by
the current first rule.

For this to work, we need to transform "x->maybe_tree = y" to something
that does NOT contain "x->maybe_tree" to avoid the original first
rule. This is where set_commit_tree() comes in.
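
The reader/writer split can be sketched in C as below. The names get_commit_tree() and set_commit_tree() come from the message above; the struct layouts and the `lazily_loaded` object are simplified assumptions, not git's definitions.

```c
#include <stddef.h>

/* Simplified, assumed layouts; git's real structs carry much more. */
struct tree {
	int id;
};

struct commit {
	struct tree *maybe_tree; /* may stay NULL until first read */
};

/* Stands in for a tree that would be found via the commit-graph. */
static struct tree lazily_loaded = { 42 };

/* Read access: fill in the pointer on first use, then return it. */
static struct tree *get_commit_tree(struct commit *c)
{
	if (!c->maybe_tree)
		c->maybe_tree = &lazily_loaded;
	return c->maybe_tree;
}

/* Write access: callers assign through this helper, never the field. */
static void set_commit_tree(struct commit *c, struct tree *t)
{
	c->maybe_tree = t;
}
```

Once every assignment has been rewritten to set_commit_tree(x, y), any remaining "x->maybe_tree" in calling code is a read, so rewriting it to get_commit_tree(x) cannot produce an invalid assignment target.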

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16 18:56:51 +09:00
Elijah Newren 8c8e5bd6eb merge-recursive: switch directory rename detection default
When all of x/a, x/b, and x/c have moved to z/a, z/b, and z/c on one
branch, there is a question about whether x/d added on a different
branch should remain at x/d or appear at z/d when the two branches are
merged.  There are different possible viewpoints here:

  A) The file was placed at x/d; it's unrelated to the other files in
     x/ so it doesn't matter that all the files from x/ moved to z/ on
     one branch; x/d should still remain at x/d.

  B) x/d is related to the other files in x/, and x/ was renamed to z/;
     therefore x/d should be moved to z/d.

Since there was no ability to detect directory renames prior to
git-2.18, users experienced (A) regardless of context.  Choice (B) was
implemented in git-2.18, with no option to go back to (A), and has been
in use since.  However, one user reported that the merge results did not
match their expectations, making the change of default problematic,
especially since there was no notice printed when directory rename
detection moved files.

Note that there is also a third possibility here:

  C) There are different answers depending on the context and content
     that cannot be determined by git, so this is a conflict.  Use a
     higher stage in the index to record the conflict and notify the
     user of the potential issue instead of silently selecting a
     resolution for them.

Add an option for users to specify their preference for whether to use
directory rename detection, and default to (C).  Even when directory
rename detection is on, add notice messages about files moved into new
directories.
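
The three viewpoints can be sketched as a small decision function. This is only an illustration under stated assumptions: the enum, the function name place_new_file(), and the convention of returning 1 to mean "record as a conflict" are hypothetical and not taken from merge-recursive.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical modes corresponding to viewpoints (A), (C) and (B). */
enum dir_rename_mode {
	DIR_RENAME_NONE,     /* (A): leave the file where it was added */
	DIR_RENAME_CONFLICT, /* (C): move it, but flag a conflict */
	DIR_RENAME_TRUE      /* (B): move it without flagging a conflict */
};

/*
 * Decide where a file added at `path` ends up when `old_dir` was renamed
 * to `new_dir` on the other branch.  Writes the resulting path to `out`
 * and returns 1 if the result should be recorded as a conflict.
 */
static int place_new_file(enum dir_rename_mode mode, const char *path,
			  const char *old_dir, const char *new_dir,
			  char *out, size_t outlen)
{
	size_t n = strlen(old_dir);
	int under_old = strncmp(path, old_dir, n) == 0 && path[n] == '/';

	if (mode == DIR_RENAME_NONE || !under_old) {
		snprintf(out, outlen, "%s", path);
		return 0;
	}
	/* Swap the directory prefix: x/d becomes z/d. */
	snprintf(out, outlen, "%s%s", new_dir, path + n);
	return mode == DIR_RENAME_CONFLICT;
}
```

For x/d with x/ renamed to z/, the sketch yields x/d under (A), z/d under (B), and z/d flagged as a conflict under (C); a path outside x/ is left alone in every mode.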

As a sidenote, x/d did not have to be a new file here; it could have
already existed at some other path and been renamed to x/d, with
directory rename detection just renaming it again to z/d.  Thus, it's
not just new files, but also a modification to all rename types (normal
renames, rename/add, rename/delete, rename/rename(1to1),
rename/rename(1to2), and rename/rename(2to1)).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00
Elijah Newren e62d11239c merge-recursive: give callers of handle_content_merge() access to contents
Pass a merge_file_info struct to handle_content_merge() so that the
callers can access the oid and mode of the result afterward.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00
Elijah Newren 6d169fd321 merge-recursive: track information associated with directory renames
Directory rename detection was previously applied silently.  In order to
allow printing information about paths that changed or printing a
conflict notification (and only doing so near other potential conflict
messages associated with the paths), save this information inside the
rename struct for later use.  A subsequent patch will make use of the
additional information.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00