Commit graph

28 commits

Author SHA1 Message Date
Eric Wong df799f5d99 t1512: test ambiguous cat-file --batch and --batch-output
Test the new "ambiguous" result from cat-file --batch and
--batch-check.  This is in t1512 instead of t1006 since
we need a repo with ambiguous object_id names.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-31 10:52:37 -08:00
Eric Sunshine f2deabfcb6 t1000-t1999: fix broken &&-chains
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-16 14:38:47 -07:00
Junio C Hamano 9472b13201 Merge branch 'bc/hash-independent-tests'
Many tests hardcode the raw object names, which would change once
we migrate away from SHA-1.  While some of them must test against
exact object names, most of them do not have to use hardcoded
constants in the test.  The latter kind of tests have been updated
to test the moral equivalent of the original without hardcoding the
actual object names.

* bc/hash-independent-tests: (28 commits)
  t5300: abstract away SHA-1-specific constants
  t4208: abstract away SHA-1-specific constants
  t4045: abstract away SHA-1-specific constants
  t4042: abstract away SHA-1-specific constants
  t4205: sort log output in a hash-independent way
  t/lib-diff-alternative: abstract away SHA-1-specific constants
  t4030: abstract away SHA-1-specific constants
  t4029: abstract away SHA-1-specific constants
  t4029: fix test indentation
  t4022: abstract away SHA-1-specific constants
  t4020: abstract away SHA-1-specific constants
  t4014: abstract away SHA-1-specific constants
  t4008: abstract away SHA-1-specific constants
  t4007: abstract away SHA-1-specific constants
  t3905: abstract away SHA-1-specific constants
  t3702: abstract away SHA-1-specific constants
  t3103: abstract away SHA-1-specific constants
  t2203: abstract away SHA-1-specific constants
  t: skip pack tests if not using SHA-1
  t4044: skip test if not using SHA-1
  ...
2018-05-30 21:51:28 +09:00
brian m. carlson d7a2fc8249 t1512: skip test if not using SHA-1
This test relies on objects with colliding short names which are
necessarily dependent on the hash used.  Skip the test if we're not
using SHA-1.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-14 11:02:01 +09:00
Ævar Arnfjörð Bjarmason 5cc044e025 get_short_oid: sort ambiguous objects by type, then SHA-1
Change the output emitted when an ambiguous object is encountered so
that we show tags first, then commits, followed by trees, and finally
blobs. Within each type we show objects in hashcmp() order. Before
this change the objects were only ordered by hashcmp().

The reason for doing this is that the output looks better as a result,
e.g. the v2.17.0 tag before this change on "git show e8f2" would
display:

    hint: The candidates are:
    hint:   e8f2093055 tree
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f25a3a50 tree
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2650052 tag v2.17.0
    hint:   e8f2867228 blob
    hint:   e8f28d537c tree
    hint:   e8f2a35526 blob
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2cf6ec0 tree

Now we'll instead show:

    hint:   e8f2650052 tag v2.17.0
    hint:   e8f21caf94 commit 2013-06-24 - bash prompt: print unique detached HEAD abbreviated object name
    hint:   e8f26250fa commit 2017-02-03 - Merge pull request #996 from jeffhostetler/jeffhostetler/register_rename_src
    hint:   e8f2bc0c06 commit 2015-05-10 - Documentation: note behavior for multiple remote.url entries
    hint:   e8f2093055 tree
    hint:   e8f25a3a50 tree
    hint:   e8f28d537c tree
    hint:   e8f2cf6ec0 tree
    hint:   e8f21d02f7 blob
    hint:   e8f21d577c blob
    hint:   e8f2867228 blob
    hint:   e8f2a35526 blob

Since we show the commit data in the output that's nicely aligned once
we sort by object type. The decision to show tags before commits is
pretty arbitrary. I don't want to order by object_type since there
tags come last after blobs, which doesn't make sense if we want to
show the most important things first.

I could display them after commits, but it's much less likely that
we'll display a tag, so if there is one it makes sense to show it
prominently at the top.

A note on the implementation: Derrick rightly pointed out[1] that
we're bending over backwards here in get_short_oid() to first
de-duplicate the list, and then emit it, but could simply do it in one
step.

The reason for that is that oid_array_for_each_unique() doesn't
actually require that the array be sorted by oid_array_sort(), it just
needs to be sorted in some order that guarantees that all objects with
the same ID are adjacent to one another, which (barring a hash
collision, which'll be someone else's problem) the sort_ambiguous()
function does.

I agree that would be simpler for this code, and had forgotten why I
initially wrote it like this[2]. But on further reflection I think
it's better to do more work here just so we're not underhandedly using
the oid-array API where we lie about the list being sorted. That would
break any subsequent use of oid_array_lookup() in subtle ways.

I could get around that by hacking the API itself to support this
use-case and documenting it, which I did as a WIP patch in [3], but I
think it's too much code smell just for this one call site. It's
simpler for the API to just introduce a oid_array_for_each() function
to eagerly spew out the list without sorting or de-duplication, and
then do the de-duplication and sorting in two passes.

1. https://public-inbox.org/git/20180501130318.58251-1-dstolee@microsoft.com/
2. https://public-inbox.org/git/876047ze9v.fsf@evledraar.gmail.com/
3. https://public-inbox.org/git/874ljrzctc.fsf@evledraar.gmail.com/

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-11 14:43:23 +09:00
Vasco Almeida 3bb464437e t1512: become resilient to GETTEXT_POISON build
The concerned message was marked for translation by 0c99171
("get_short_sha1: mark ambiguity error for translation", 2016-09-26).

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-12 10:39:01 -07:00
Jeff King 5b33cb1fd7 get_short_sha1: make default disambiguation configurable
When we find ambiguous short sha1s, we may get a
disambiguation rule from our caller's context. But if we
don't, we fall back to treating all sha1s the same, even
though most projects will tend to refer only to commits by
their short sha1s.

This patch introduces a configuration option that lets the
user pick a different fallback (e.g., only commits). It's
possible that we may want to make this the default, but it's
a good idea to start as a config option for two reasons:

  1. It lets people experiment with this and see if it's a
     good idea (i.e., the "tend to" above is an assumption;
     we don't really know if this will break some obscure
     cases).

  2. Even if we do flip the default, it gives people an
     escape hatch if it causes problems (you can sometimes
     override it by asking for "1234^{tree}", but not all
     combinations are possible).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27 10:29:56 -07:00
Jeff King 1ffa26c461 get_short_sha1: list ambiguous objects on error
When the user gives us an ambiguous short sha1, we print an
error and refuse to resolve it. In some cases, the next step
is for them to feed us more characters (e.g., if they were
retyping or cut-and-pasting from a full sha1). But in other
cases, that might be all they have. For example, an old
commit message may have used a 7-character hex that was
unique at the time, but is now ambiguous.  Git doesn't
provide any information about the ambiguous objects it
found, so it's hard for the user to find out which one they
probably meant.

This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant. Here's what
it looks like on git.git:

  $ git rev-parse b2e1
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  hint:   b2e1759 blob
  hint:   b2e18954 blob
  hint:   b2e1895c blob
  fatal: ambiguous argument 'b2e1': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

We show the tagname for tags, and the date and subject for
commits. For trees and blobs, in theory we could dig in the
history to find the paths at which they were present. But
that's very expensive (on the order of 30s for the kernel),
and it's not likely to be all that helpful. Most short
references are to commits, so the useful information is
typically going to be that the object in question _isn't_ a
commit. So it's silly to spend a lot of CPU preemptively
digging up the path; the user can do it themselves if they
really need to.

And of course it's somewhat ironic that we abbreviate the
sha1s in the disambiguation hint. But full sha1s would cause
annoying line wrapping for the commit lines, and presumably
the user is going to just re-issue their command immediately
with the corrected sha1.

We also restrict the list to those that match any
disambiguation hint. E.g.:

  $ git rev-parse b2e1:foo
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  fatal: Invalid object name 'b2e1'.

does not bother reporting the blobs, because they cannot
work as a treeish.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:55:31 -07:00
Jeff King fad6b9e590 for_each_abbrev: drop duplicate objects
If an object appears multiple times in the object database
(e.g., in both loose and packed form, or in two separate
packs), the disambiguation machinery may see it more than
once. The get_short_sha1() function handles this already,
but for_each_abbrev() blindly fires the callback for each
instance it finds.

We can fix this by collecting the output in a sha1 array and
de-duplicating it.  As a bonus, the sort done for the
de-duplication means that our output will be stable,
regardless of the order in which the objects are found.

Note that the old code normalized the callback's output to
0/1 to store in the 1-bit ds->ambiguous flag (which both
halted the iteration and was returned from the
for_each_abbrev function). Now that we are using sha1_array,
we can return the real value. In practice, it doesn't matter
as the sole caller only ever returns 0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:41 -07:00
Jeff King 5d5def2aa5 get_short_sha1: parse tags when looking for treeish
The treeish disambiguation function tries to peel tags, but
it does so by calling:

  deref_tag(lookup_object(sha1), ...);

This will only work if we have previously looked at the tag
and created a "struct tag" for it. Since parsing revision
arguments typically happens before anything else, this is
usually not the case, and we would fail to peel the tag (we
are lucky that deref_tag() gracefully handles the NULL and
does not segfault).

Instead, we can use parse_object(). Note that this is the
same fix done by 94d75d1 (get_short_sha1(): correctly
disambiguate type-limited abbreviation, 2013-07-01), but
that commit fixed only the committish disambiguator, and
left the bug in the treeish one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Jeff King 8a10fea49b get_sha1: propagate flags to child functions
The get_sha1() function is actually implementation by many
sub-functions, but we do not always pass our flags around to
all of those functions. As a result, we may forget that our
caller asked us to resolve with GET_SHA1_QUIETLY and output
messages. The two triggerable cases are:

  1. Resolving treeish:path will resolve the "treeish"
     portion using GET_SHA1_TREEISH, dropping all other
     flags.

  2. The peel_onion() function did not take flags at all
     but recurses to get_sha1_1(), which does.

The solution for both is to bitwise-OR their new flags with
the existing ones (after dropping any mutually exclusive
disambiguation flags).

This bug can trigger with "git rev-parse --quiet", which
asks for quiet resolution. But it can also happen in a more
vanilla code path when we do a follow-up ONLY_TO_DIE
invocation of get_sha1(), and that's what the tests check.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Jeff King 7243ffdd78 get_sha1: avoid repeating ourselves via ONLY_TO_DIE
When the revision code cannot parse an argument like
"HEAD:foo", it will call maybe_die_on_misspelt_object_name(),
which re-runs get_sha1() with an extra ONLY_TO_DIE flag. We
then spend more effort to generate a better error message.

Unfortunately, a side effect is that our second call may
repeat the same error messages from the original get_sha1()
call. You can see this with:

  $ git show 0017
  error: short SHA1 0017 is ambiguous.
  error: short SHA1 0017 is ambiguous.
  fatal: ambiguous argument '0017': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

where the second "error:" line comes from the ONLY_TO_DIE
call.

To fix this, we can make ONLY_TO_DIE imply QUIETLY. This is
a little odd, because the whole point of ONLY_TO_DIE is to
output error messages. But what we want to do is tell the
rest of the get_sha1() code (particularly get_sha1_1()) that
the _regular_ messages should be quiet, but the only-to-die
ones should not.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Elia Pinto dcfbb2aa89 t/t1512-rev-parse-disambiguation.sh: use the $( ... ) construct for command substitution
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
	perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg'  "${_f}"
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-27 15:44:49 -08:00
Junio C Hamano eb40e51597 Merge branch 'jc/t1512-fix'
A test that should have failed but didn't revealed a bug that needs
to be corrected.

* jc/t1512-fix:
  get_short_sha1(): correctly disambiguate type-limited abbreviation
  t1512: correct leftover constants from earlier edition
2013-07-11 13:06:11 -07:00
Junio C Hamano 2c57f7c9a2 t1512: correct leftover constants from earlier edition
The earliest iteration of this test script used a magic string
110282 as the common prefix for ambiguous object names, but the
final edition switched the common prefix to 0000000000 (10 "0"s).

Unfortunately, instances of the original prefix were left in the
comments and a few tests.  Replace them with the correct constants.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-01 21:54:27 -07:00
Nguyễn Thái Ngọc Duy 798c35fcd8 get_sha1: warn about full or short object names that look like refs
When we get 40 hex digits, we immediately assume it's an SHA-1. This
is the right thing to do because we have no way else to specify an
object. If there is a ref with the same object name, it will be
ignored. Warn the user about this case because the ref with full
object name is likely a mistake, for example

    git checkout -b $empty_var $(git rev-parse something)

advice.object_name_warning is not documented because frankly people
should not be aware about it until they encounter this situation.

While at there, warn about ambiguation with abbreviated SHA-1 too.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-29 11:31:36 -07:00
Junio C Hamano 31ffd0c0c1 t1512: match the "other" object names
The test creates 16 objects that share the same prefix, and two other
objects that do not.  Tweak the test so that the other two share the
same prefix that is different from the one that is shared by the 16.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-13 12:43:43 -07:00
Junio C Hamano 4c654f5f27 t1512: ignore whitespaces in wc -l output
Some implementations of sed (e.g. MacOS X) have whitespaces in the
output of "wc -l" that reads from the standard input.  Ignore these
whitespaces by not quoting the command substitution to be compared
with the constant "16".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-11 16:30:49 -07:00
Junio C Hamano 957d74062c rev-parse --disambiguate=<prefix>
The new option allows you to feed an ambiguous prefix and enumerate
all the objects that share it as a prefix of their object names.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:23 -07:00
Junio C Hamano c036c4c5e4 rev-parse: A and B in "rev-parse A..B" refer to committish
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:23 -07:00
Junio C Hamano 13243c2c7a reset: the command takes committish
This is not strictly correct, in that resetting selected index
entries from corresponding paths out of a given tree without moving
HEAD is a valid operation, and in such case a tree-ish would suffice.

But the existing code already requires a committish in the codepath,
so let's be consistent with it for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:23 -07:00
Junio C Hamano 75f5ac04a2 commit-tree: the command wants a tree and commits
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano da3ac0c149 apply: --build-fake-ancestor expects blobs
The "index" line read from the patch to reconstruct a partial
preimage tree records the object names of blob objects.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano daba53aeaf sha1_name.c: add support for disambiguating other types
This teaches the revision parser that in "$name:$path" (used for a
blob object name), "$name" must be a tree-ish.

There are many more places where we know what types of objects are
called for.  This patch adds support for "commit", "treeish", "tree",
and "blob", which could be used in the following contexts:

 - "git apply --build-fake-ancestor" reads the "index" lines from
   the patch; they must name blob objects (not even "blob-ish");

 - "git commit-tree" reads a tree object name (not "tree-ish"), and
   zero or more commit object names (not "committish");

 - "git reset $rev" wants a committish; "git reset $rev -- $path"
   wants a treeish.

They will come in later patches in the series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano d5f6b1d756 revision.c: the "log" family, except for "show", takes committish
Add a field to setup_revision_opt structure and allow these callers
to tell the setup_revisions command parsing machinery that short SHA1
it encounters are meant to name committish.

This step does not go all the way to connect the setup_revisions()
to sha1_name.c yet.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano cd74e4733d sha1_name.c: introduce get_sha1_committish()
Many callers know that the user meant to name a committish by
syntactical positions where the object name appears.  Calling this
function allows the machinery to disambiguate shorter-than-unique
abbreviated object names between committish and others.

Note that this does NOT error out when the named object is not a
committish. It is merely to give a hint to the disambiguation
machinery.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano e2643617d7 sha1_name.c: many short names can only be committish
We know that the token "$name" that appear in "$name^{commit}",
"$name^4", "$name~4" etc. can only name a committish (either a
commit or a tag that peels to a commit).  Teach get_short_sha1() to
take advantage of that knowledge when disambiguating an abbreviated
SHA-1 given as an object name.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano 6269b6b676 sha1_name.c: get_describe_name() by definition groks only commits
Teach get_describe_name() to pass the disambiguation hint down the
callchain to get_short_sha1().

Also add tests to show various syntactic elements that we could take
advantage of the object type information to help disambiguration of
abbreviated object names.  Many of them are marked as broken, and
some of them will be fixed in later patches in this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00