git/builtin/describe.c

707 lines
18 KiB
C
Raw Normal View History

#define USE_THE_INDEX_VARIABLE
#include "builtin.h"
#include "config.h"
#include "environment.h"
#include "gettext.h"
#include "hex.h"
#include "lockfile.h"
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
#include "commit.h"
#include "tag.h"
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
#include "blob.h"
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
#include "refs.h"
#include "exec-cmd.h"
#include "object-name.h"
#include "parse-options.h"
#include "read-cache-ll.h"
#include "revision.h"
Teach "git describe" --dirty option With the --dirty option, git describe works on HEAD but append s"-dirty" iff the contents of the work tree differs from HEAD. E.g. $ git describe --dirty v1.6.5-15-gc274db7 $ echo >> Makefile $ git describe --dirty v1.6.5-15-gc274db7-dirty The --dirty option can also be used to specify what is appended, instead of the default string "-dirty". $ git describe --dirty=.mod v1.6.5-15-gc274db7.mod Many build scripts use `git describe` to produce a version number based on the description of HEAD (on which the work tree is based) + saying that if the build contains uncommitted changes. This patch helps the writing of such scripts since `git describe --dirty` does directly the intended thing. Three possiblities were considered while discussing this new feature: 1. Describe the work tree by default and describe HEAD only if "HEAD" is explicitly specified Pro: does the right thing by default (both for users and for scripts) Pro: other git commands that works on the work tree by default Con: breaks existing scripts used by the Linux kernel and other projects 2. Use --worktree instead of --dirty Pro: does what it says: "git describe --worktree" describes the work tree Con: other commands do not require a --worktree option when working on the work tree (it often is the default mode for them) Con: unusable with an optional value: "git describe --worktree=.mod" is quite unintuitive. 3. Use --dirty as in this patch Pro: makes sense to specify an optional value (what the dirty mark is) Pro: does not have any of the big cons of previous alternatives * does not break scripts * is not inconsistent with other git commands This patch takes the third approach. Signed-off-by: Jean Privat <jean@pryen.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 13:35:22 +00:00
#include "diff.h"
#include "hashmap.h"
#include "setup.h"
#include "strvec.h"
#include "run-command.h"
#include "object-store-ll.h"
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
#include "list-objects.h"
#include "commit-slab.h"
#include "wildmatch.h"
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
#define MAX_TAGS (FLAG_BITS - 1)
#define DEFAULT_CANDIDATES 10
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
define_commit_slab(commit_names, struct commit_name *);
static const char * const describe_usage[] = {
N_("git describe [--all] [--tags] [--contains] [--abbrev=<n>] [<commit-ish>...]"),
N_("git describe [--all] [--tags] [--contains] [--abbrev=<n>] --dirty[=<mark>]"),
N_("git describe <blob>"),
NULL
};
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
static int debug; /* Display lots of verbose info */
static int all; /* Any valid ref can be used */
static int tags; /* Allow lightweight tags */
static int longformat;
static int first_parent;
static int abbrev = -1; /* unspecified */
static int max_candidates = DEFAULT_CANDIDATES;
static struct hashmap names;
static int have_util;
static struct string_list patterns = STRING_LIST_INIT_NODUP;
static struct string_list exclude_patterns = STRING_LIST_INIT_NODUP;
static int always;
static const char *suffix, *dirty, *broken;
static struct commit_names commit_names;
Teach "git describe" --dirty option With the --dirty option, git describe works on HEAD but append s"-dirty" iff the contents of the work tree differs from HEAD. E.g. $ git describe --dirty v1.6.5-15-gc274db7 $ echo >> Makefile $ git describe --dirty v1.6.5-15-gc274db7-dirty The --dirty option can also be used to specify what is appended, instead of the default string "-dirty". $ git describe --dirty=.mod v1.6.5-15-gc274db7.mod Many build scripts use `git describe` to produce a version number based on the description of HEAD (on which the work tree is based) + saying that if the build contains uncommitted changes. This patch helps the writing of such scripts since `git describe --dirty` does directly the intended thing. Three possiblities were considered while discussing this new feature: 1. Describe the work tree by default and describe HEAD only if "HEAD" is explicitly specified Pro: does the right thing by default (both for users and for scripts) Pro: other git commands that works on the work tree by default Con: breaks existing scripts used by the Linux kernel and other projects 2. Use --worktree instead of --dirty Pro: does what it says: "git describe --worktree" describes the work tree Con: other commands do not require a --worktree option when working on the work tree (it often is the default mode for them) Con: unusable with an optional value: "git describe --worktree=.mod" is quite unintuitive. 3. Use --dirty as in this patch Pro: makes sense to specify an optional value (what the dirty mark is) Pro: does not have any of the big cons of previous alternatives * does not break scripts * is not inconsistent with other git commands This patch takes the third approach. Signed-off-by: Jean Privat <jean@pryen.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 13:35:22 +00:00
/* diff-index command arguments to check if working tree is dirty. */
static const char *diff_index_args[] = {
"diff-index", "--quiet", "HEAD", "--", NULL
};
struct commit_name {
struct hashmap_entry entry;
struct object_id peeled;
struct tag *tag;
unsigned prio:2; /* annotated tag = 2, tag = 1, head = 0 */
unsigned name_checked:1;
describe: force long format for a name based on a mislocated tag An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20 17:34:36 +00:00
unsigned misnamed:1;
struct object_id oid;
char *path;
};
static const char *prio_names[] = {
N_("head"), N_("lightweight"), N_("annotated"),
};
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
static int commit_name_neq(const void *cmp_data UNUSED,
const struct hashmap_entry *eptr,
const struct hashmap_entry *entry_or_key,
const void *peeled)
{
const struct commit_name *cn1, *cn2;
cn1 = container_of(eptr, const struct commit_name, entry);
cn2 = container_of(entry_or_key, const struct commit_name, entry);
return !oideq(&cn1->peeled, peeled ? peeled : &cn2->peeled);
}
static inline struct commit_name *find_commit_name(const struct object_id *peeled)
{
return hashmap_get_entry_from_hash(&names, oidhash(peeled), peeled,
struct commit_name, entry);
}
static int replace_name(struct commit_name *e,
int prio,
const struct object_id *oid,
struct tag **tag)
{
if (!e || e->prio < prio)
return 1;
if (e->prio == 2 && prio == 2) {
/* Multiple annotated tags point to the same commit.
* Select one to keep based upon their tagger date.
*/
struct tag *t;
if (!e->tag) {
t = lookup_tag(the_repository, &e->oid);
if (!t || parse_tag(t))
return 1;
e->tag = t;
}
t = lookup_tag(the_repository, oid);
if (!t || parse_tag(t))
return 0;
*tag = t;
if (e->tag->date < t->date)
return 1;
}
return 0;
}
static void add_to_known_names(const char *path,
const struct object_id *peeled,
int prio,
const struct object_id *oid)
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
{
struct commit_name *e = find_commit_name(peeled);
struct tag *tag = NULL;
if (replace_name(e, prio, oid, &tag)) {
if (!e) {
e = xmalloc(sizeof(struct commit_name));
oidcpy(&e->peeled, peeled);
hashmap_entry_init(&e->entry, oidhash(peeled));
hashmap_add(&names, &e->entry);
e->path = NULL;
}
e->tag = tag;
e->prio = prio;
e->name_checked = 0;
describe: force long format for a name based on a mislocated tag An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20 17:34:36 +00:00
e->misnamed = 0;
oidcpy(&e->oid, oid);
free(e->path);
e->path = xstrdup(path);
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
}
}
static int get_name(const char *path, const struct object_id *oid,
int flag UNUSED, void *cb_data UNUSED)
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
{
int is_tag = 0;
struct object_id peeled;
int is_annotated, prio;
const char *path_to_match = NULL;
if (skip_prefix(path, "refs/tags/", &path_to_match)) {
is_tag = 1;
} else if (all) {
if ((exclude_patterns.nr || patterns.nr) &&
!skip_prefix(path, "refs/heads/", &path_to_match) &&
!skip_prefix(path, "refs/remotes/", &path_to_match)) {
/* Only accept reference of known type if there are match/exclude patterns */
return 0;
}
} else {
/* Reject anything outside refs/tags/ unless --all */
return 0;
}
/*
* If we're given exclude patterns, first exclude any tag which match
* any of the exclude pattern.
*/
if (exclude_patterns.nr) {
struct string_list_item *item;
for_each_string_list_item(item, &exclude_patterns) {
if (!wildmatch(item->string, path_to_match, 0))
return 0;
}
}
/*
* If we're given patterns, accept only tags which match at least one
* pattern.
*/
if (patterns.nr) {
int found = 0;
struct string_list_item *item;
for_each_string_list_item(item, &patterns) {
if (!wildmatch(item->string, path_to_match, 0)) {
found = 1;
break;
}
}
if (!found)
return 0;
}
/* Is it annotated? */
refs: switch peel_ref() to peel_iterated_oid() The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char *refname, const struct object_id *oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 19:44:43 +00:00
if (!peel_iterated_oid(oid, &peeled)) {
is_annotated = !oideq(oid, &peeled);
} else {
oidcpy(&peeled, oid);
is_annotated = 0;
}
/*
* By default, we only use annotated tags, but with --tags
* we fall back to lightweight ones (even without --tags,
* we still remember lightweight ones, only to give hints
* in an error message). --all allows any refs to be used.
*/
if (is_annotated)
prio = 2;
else if (is_tag)
prio = 1;
else
prio = 0;
add_to_known_names(all ? path + 5 : path + 10, &peeled, prio, oid);
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
return 0;
}
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
struct possible_tag {
struct commit_name *name;
int depth;
int found_order;
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
unsigned flag_within;
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
};
static int compare_pt(const void *a_, const void *b_)
{
struct possible_tag *a = (struct possible_tag *)a_;
struct possible_tag *b = (struct possible_tag *)b_;
if (a->depth != b->depth)
return a->depth - b->depth;
if (a->found_order != b->found_order)
return a->found_order - b->found_order;
return 0;
}
Compute accurate distances in git-describe before output. My prior change to git-describe attempts to print the distance between the input commit and the best matching tag, but this distance was usually only an estimate as we always aborted revision walking as soon as we overflowed the configured limit on the number of possible tags (as set by --candidates). Displaying an estimated distance is not very useful and can just be downright confusing. Most users (heck, most Git developers) don't immediately understand why this distance differs from the output of common tools such as `git rev-list | wc -l`. Even worse, the estimated distance could change in the future (including decreasing despite no rebase occuring) if we find more possible tags earlier on during traversal. (This could happen if more tags are merged into the branch between queries.) These factors basically make an estimated distance useless. Fortunately we are usually most of the way through an accurate distance computation by the time we abort (due to reaching the current --candidates limit). This means we can simply finish counting out the revisions still in our commit queue to present the accurate distance at the end. The number of commits remaining in the commit queue is probably less than the number of commits already traversed, so finishing out the count is not likely to take very long. This final distance will then always match the output of `git rev-list | wc -l`. We can easily reduce the total number of commits that need to be walked at the end by stopping as soon as all of the commits in the commit queue are flagged as being merged into the already selected best possible tag. If that's true then there are no remaining unseen commits which can contribute to our best possible tag's depth counter, so further traversal is useless. Basic testing on my Mac OS X system shows there is no noticable performance difference between this accurate distance counting version of git-describe and the prior version of git-describe, at least when run on git.git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-27 06:54:21 +00:00
static unsigned long finish_depth_computation(
struct commit_list **list,
struct possible_tag *best)
{
unsigned long seen_commits = 0;
while (*list) {
struct commit *c = pop_commit(list);
struct commit_list *parents = c->parents;
seen_commits++;
if (c->object.flags & best->flag_within) {
struct commit_list *a = *list;
while (a) {
struct commit *i = a->item;
if (!(i->object.flags & best->flag_within))
break;
a = a->next;
}
if (!a)
break;
} else
best->depth++;
while (parents) {
struct commit *p = parents->item;
repo_parse_commit(the_repository, p);
Compute accurate distances in git-describe before output. My prior change to git-describe attempts to print the distance between the input commit and the best matching tag, but this distance was usually only an estimate as we always aborted revision walking as soon as we overflowed the configured limit on the number of possible tags (as set by --candidates). Displaying an estimated distance is not very useful and can just be downright confusing. Most users (heck, most Git developers) don't immediately understand why this distance differs from the output of common tools such as `git rev-list | wc -l`. Even worse, the estimated distance could change in the future (including decreasing despite no rebase occuring) if we find more possible tags earlier on during traversal. (This could happen if more tags are merged into the branch between queries.) These factors basically make an estimated distance useless. Fortunately we are usually most of the way through an accurate distance computation by the time we abort (due to reaching the current --candidates limit). This means we can simply finish counting out the revisions still in our commit queue to present the accurate distance at the end. The number of commits remaining in the commit queue is probably less than the number of commits already traversed, so finishing out the count is not likely to take very long. This final distance will then always match the output of `git rev-list | wc -l`. We can easily reduce the total number of commits that need to be walked at the end by stopping as soon as all of the commits in the commit queue are flagged as being merged into the already selected best possible tag. If that's true then there are no remaining unseen commits which can contribute to our best possible tag's depth counter, so further traversal is useless. Basic testing on my Mac OS X system shows there is no noticable performance difference between this accurate distance counting version of git-describe and the prior version of git-describe, at least when run on git.git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-27 06:54:21 +00:00
if (!(p->object.flags & SEEN))
commit_list_insert_by_date(p, list);
Compute accurate distances in git-describe before output. My prior change to git-describe attempts to print the distance between the input commit and the best matching tag, but this distance was usually only an estimate as we always aborted revision walking as soon as we overflowed the configured limit on the number of possible tags (as set by --candidates). Displaying an estimated distance is not very useful and can just be downright confusing. Most users (heck, most Git developers) don't immediately understand why this distance differs from the output of common tools such as `git rev-list | wc -l`. Even worse, the estimated distance could change in the future (including decreasing despite no rebase occuring) if we find more possible tags earlier on during traversal. (This could happen if more tags are merged into the branch between queries.) These factors basically make an estimated distance useless. Fortunately we are usually most of the way through an accurate distance computation by the time we abort (due to reaching the current --candidates limit). This means we can simply finish counting out the revisions still in our commit queue to present the accurate distance at the end. The number of commits remaining in the commit queue is probably less than the number of commits already traversed, so finishing out the count is not likely to take very long. This final distance will then always match the output of `git rev-list | wc -l`. We can easily reduce the total number of commits that need to be walked at the end by stopping as soon as all of the commits in the commit queue are flagged as being merged into the already selected best possible tag. If that's true then there are no remaining unseen commits which can contribute to our best possible tag's depth counter, so further traversal is useless. Basic testing on my Mac OS X system shows there is no noticable performance difference between this accurate distance counting version of git-describe and the prior version of git-describe, at least when run on git.git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-27 06:54:21 +00:00
p->object.flags |= c->object.flags;
parents = parents->next;
}
}
return seen_commits;
}
static void append_name(struct commit_name *n, struct strbuf *dst)
{
if (n->prio == 2 && !n->tag) {
n->tag = lookup_tag(the_repository, &n->oid);
if (!n->tag || parse_tag(n->tag))
die(_("annotated tag %s not available"), n->path);
}
if (n->tag && !n->name_checked) {
describe: force long format for a name based on a mislocated tag An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20 17:34:36 +00:00
if (strcmp(n->tag->tag, all ? n->path + 5 : n->path)) {
warning(_("tag '%s' is externally known as '%s'"),
n->path, n->tag->tag);
n->misnamed = 1;
}
n->name_checked = 1;
}
if (n->tag) {
if (all)
strbuf_addstr(dst, "tags/");
strbuf_addstr(dst, n->tag->tag);
} else {
strbuf_addstr(dst, n->path);
}
}
static void append_suffix(int depth, const struct object_id *oid, struct strbuf *dst)
{
strbuf_addf(dst, "-%d-g%s", depth,
repo_find_unique_abbrev(the_repository, oid, abbrev));
}
static void describe_commit(struct object_id *oid, struct strbuf *dst)
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
{
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
struct commit *cmit, *gave_up_on = NULL;
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
struct commit_list *list;
struct commit_name *n;
struct possible_tag all_matches[MAX_TAGS];
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
unsigned int match_cnt = 0, annotated_cnt = 0, cur_match;
unsigned long seen_commits = 0;
unsigned int unannotated_cnt = 0;
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
cmit = lookup_commit_reference(the_repository, oid);
n = find_commit_name(&cmit->object.oid);
if (n && (tags || all || n->prio == 2)) {
/*
* Exact match to an existing ref.
*/
append_name(n, dst);
describe: force long format for a name based on a mislocated tag An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20 17:34:36 +00:00
if (n->misnamed || longformat)
append_suffix(0, n->tag ? get_tagged_oid(n->tag) : oid, dst);
if (suffix)
strbuf_addstr(dst, suffix);
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
return;
}
if (!max_candidates)
die(_("no tag exactly matches '%s'"), oid_to_hex(&cmit->object.oid));
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (debug)
fprintf(stderr, _("No exact match on refs or tags, searching to describe\n"));
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (!have_util) {
struct hashmap_iter iter;
struct commit *c;
struct commit_name *n;
init_commit_names(&commit_names);
hashmap_for_each_entry(&names, &iter, n,
entry /* member name */) {
c = lookup_commit_reference_gently(the_repository,
&n->peeled, 1);
if (c)
*commit_names_at(&commit_names, c) = n;
}
have_util = 1;
}
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
list = NULL;
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
cmit->object.flags = SEEN;
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
commit_list_insert(cmit, &list);
while (list) {
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
struct commit *c = pop_commit(&list);
struct commit_list *parents = c->parents;
struct commit_name **slot;
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
seen_commits++;
slot = commit_names_peek(&commit_names, c);
n = slot ? *slot : NULL;
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
if (n) {
if (!tags && !all && n->prio < 2) {
unannotated_cnt++;
} else if (match_cnt < max_candidates) {
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
struct possible_tag *t = &all_matches[match_cnt++];
t->name = n;
t->depth = seen_commits - 1;
t->flag_within = 1u << match_cnt;
t->found_order = match_cnt;
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
c->object.flags |= t->flag_within;
if (n->prio == 2)
annotated_cnt++;
}
else {
gave_up_on = c;
break;
}
}
for (cur_match = 0; cur_match < match_cnt; cur_match++) {
struct possible_tag *t = &all_matches[cur_match];
if (!(c->object.flags & t->flag_within))
t->depth++;
}
2020-02-26 17:48:53 +00:00
/* Stop if last remaining path already covered by best candidate(s) */
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (annotated_cnt && !list) {
2020-02-26 17:48:53 +00:00
int best_depth = INT_MAX;
unsigned best_within = 0;
for (cur_match = 0; cur_match < match_cnt; cur_match++) {
struct possible_tag *t = &all_matches[cur_match];
if (t->depth < best_depth) {
best_depth = t->depth;
best_within = t->flag_within;
} else if (t->depth == best_depth) {
best_within |= t->flag_within;
}
}
if ((c->object.flags & best_within) == best_within) {
if (debug)
fprintf(stderr, _("finished search at %s\n"),
oid_to_hex(&c->object.oid));
break;
}
}
while (parents) {
struct commit *p = parents->item;
repo_parse_commit(the_repository, p);
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (!(p->object.flags & SEEN))
commit_list_insert_by_date(p, &list);
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
p->object.flags |= c->object.flags;
parents = parents->next;
if (first_parent)
break;
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
}
}
if (!match_cnt) {
struct object_id *cmit_oid = &cmit->object.oid;
if (always) {
strbuf_add_unique_abbrev(dst, cmit_oid, abbrev);
if (suffix)
strbuf_addstr(dst, suffix);
return;
}
if (unannotated_cnt)
die(_("No annotated tags can describe '%s'.\n"
"However, there were unannotated tags: try --tags."),
oid_to_hex(cmit_oid));
else
die(_("No tags can describe '%s'.\n"
"Try --always, or create some tags."),
oid_to_hex(cmit_oid));
}
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
QSORT(all_matches, match_cnt, compare_pt);
Compute accurate distances in git-describe before output. My prior change to git-describe attempts to print the distance between the input commit and the best matching tag, but this distance was usually only an estimate as we always aborted revision walking as soon as we overflowed the configured limit on the number of possible tags (as set by --candidates). Displaying an estimated distance is not very useful and can just be downright confusing. Most users (heck, most Git developers) don't immediately understand why this distance differs from the output of common tools such as `git rev-list | wc -l`. Even worse, the estimated distance could change in the future (including decreasing despite no rebase occuring) if we find more possible tags earlier on during traversal. (This could happen if more tags are merged into the branch between queries.) These factors basically make an estimated distance useless. Fortunately we are usually most of the way through an accurate distance computation by the time we abort (due to reaching the current --candidates limit). This means we can simply finish counting out the revisions still in our commit queue to present the accurate distance at the end. The number of commits remaining in the commit queue is probably less than the number of commits already traversed, so finishing out the count is not likely to take very long. This final distance will then always match the output of `git rev-list | wc -l`. We can easily reduce the total number of commits that need to be walked at the end by stopping as soon as all of the commits in the commit queue are flagged as being merged into the already selected best possible tag. If that's true then there are no remaining unseen commits which can contribute to our best possible tag's depth counter, so further traversal is useless. Basic testing on my Mac OS X system shows there is no noticable performance difference between this accurate distance counting version of git-describe and the prior version of git-describe, at least when run on git.git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-27 06:54:21 +00:00
if (gave_up_on) {
commit_list_insert_by_date(gave_up_on, &list);
Compute accurate distances in git-describe before output. My prior change to git-describe attempts to print the distance between the input commit and the best matching tag, but this distance was usually only an estimate as we always aborted revision walking as soon as we overflowed the configured limit on the number of possible tags (as set by --candidates). Displaying an estimated distance is not very useful and can just be downright confusing. Most users (heck, most Git developers) don't immediately understand why this distance differs from the output of common tools such as `git rev-list | wc -l`. Even worse, the estimated distance could change in the future (including decreasing despite no rebase occuring) if we find more possible tags earlier on during traversal. (This could happen if more tags are merged into the branch between queries.) These factors basically make an estimated distance useless. Fortunately we are usually most of the way through an accurate distance computation by the time we abort (due to reaching the current --candidates limit). This means we can simply finish counting out the revisions still in our commit queue to present the accurate distance at the end. The number of commits remaining in the commit queue is probably less than the number of commits already traversed, so finishing out the count is not likely to take very long. This final distance will then always match the output of `git rev-list | wc -l`. We can easily reduce the total number of commits that need to be walked at the end by stopping as soon as all of the commits in the commit queue are flagged as being merged into the already selected best possible tag. If that's true then there are no remaining unseen commits which can contribute to our best possible tag's depth counter, so further traversal is useless. Basic testing on my Mac OS X system shows there is no noticable performance difference between this accurate distance counting version of git-describe and the prior version of git-describe, at least when run on git.git. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-27 06:54:21 +00:00
seen_commits--;
}
seen_commits += finish_depth_computation(&list, &all_matches[0]);
free_commit_list(list);
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (debug) {
static int label_width = -1;
if (label_width < 0) {
int i, w;
for (i = 0; i < ARRAY_SIZE(prio_names); i++) {
w = strlen(_(prio_names[i]));
if (label_width < w)
label_width = w;
}
}
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
for (cur_match = 0; cur_match < match_cnt; cur_match++) {
struct possible_tag *t = &all_matches[cur_match];
fprintf(stderr, " %-*s %8d %s\n",
label_width, _(prio_names[t->name->prio]),
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
t->depth, t->name->path);
}
fprintf(stderr, _("traversed %lu commits\n"), seen_commits);
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (gave_up_on) {
fprintf(stderr,
_("more than %i tags found; listed %i most recent\n"
"gave up search at %s\n"),
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
max_candidates, max_candidates,
oid_to_hex(&gave_up_on->object.oid));
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
}
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
}
append_name(all_matches[0].name, dst);
describe: force long format for a name based on a mislocated tag An annotated tag has two names: where it sits in the refs/tags hierarchy and the tagname recorded in the "tag" field in the object itself. They usually should match. Since 212945d4 ("Teach git-describe to verify annotated tag names before output", 2008-02-28), a commit described using an annotated tag bases its name on the tagname from the object. While this was a deliberate design decision to make it easier to converse about tags with others, even if the tags happen to be fetched to a different name than it was given by its creator, it had one downside. The output from "git describe", at least in the modern Git, should be usable as an object name to name the exact commit given to the "git describe" command. Using the tagname, when two names differ, breaks this property, when describing a commit that is directly pointed at by such a tag. An annotated tag Bob made as "v1.0" may sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from "git describe v1.0-bob^0" would say "v1.0", but there may not be any tag at "refs/tags/v1.0" locally or there may be another tag that points at a different object. Note that this won't be a problem if a commit being described is not directly pointed at by such a mislocated tag. In the example in the previous paragraph, describing a commit whose parent is v1.0-bob would result in "v1.0" (i.e. the tagname taken from the tag object) followed by "-1-gXXXXX" where XXXXX is the abbreviated object name, and a string that ends with "-g" followed by a hexadecimal string is an object name for the object whose name begins with hexadecimal string (as long as it is unique), so it does not matter if the leading part is "v1.0" or "v1.0-bob". Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when the name we give is based on a mislocated annotated tag to ensure that the output can be used as the object name for the object originally given to the command to fix the issue. While at it, remove an overly cautious dead code to protect against an annotated tag object without the tagname. Such a tag is filtered out much earlier in the codeflow, and will not reach this part of the code. Helped-by: Matheus Tavares <matheus.bernardino@usp.br> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-20 17:34:36 +00:00
if (all_matches[0].name->misnamed || abbrev)
append_suffix(all_matches[0].depth, &cmit->object.oid, dst);
if (suffix)
strbuf_addstr(dst, suffix);
}
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
struct process_commit_data {
struct object_id current_commit;
struct object_id looking_for;
struct strbuf *dst;
struct rev_info *revs;
};
static void process_commit(struct commit *commit, void *data)
{
struct process_commit_data *pcd = data;
pcd->current_commit = commit->object.oid;
}
static void process_object(struct object *obj, const char *path, void *data)
{
struct process_commit_data *pcd = data;
if (oideq(&pcd->looking_for, &obj->oid) && !pcd->dst->len) {
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
reset_revision_walk();
describe_commit(&pcd->current_commit, pcd->dst);
strbuf_addf(pcd->dst, ":%s", path);
free_commit_list(pcd->revs->commits);
pcd->revs->commits = NULL;
}
}
static void describe_blob(struct object_id oid, struct strbuf *dst)
{
struct rev_info revs;
struct strvec args = STRVEC_INIT;
struct process_commit_data pcd = { *null_oid(), oid, dst, &revs};
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
strvec_pushl(&args, "internal: The first arg is not parsed",
"--objects", "--in-commit-order", "--reverse", "HEAD",
NULL);
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
repo_init_revisions(the_repository, &revs, NULL);
if (setup_revisions(args.nr, args.v, &revs, NULL) > 1)
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
BUG("setup_revisions could not handle all args?");
if (prepare_revision_walk(&revs))
die("revision walk setup failed");
traverse_commit_list(&revs, process_commit, process_object, &pcd);
reset_revision_walk();
release_revisions(&revs);
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
}
static void describe(const char *arg, int last_one)
{
struct object_id oid;
struct commit *cmit;
struct strbuf sb = STRBUF_INIT;
if (debug)
fprintf(stderr, _("describe %s\n"), arg);
if (repo_get_oid(the_repository, arg, &oid))
die(_("Not a valid object name %s"), arg);
cmit = lookup_commit_reference_gently(the_repository, &oid, 1);
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
if (cmit)
describe_commit(&oid, &sb);
else if (oid_object_info(the_repository, &oid, NULL) == OBJ_BLOB)
builtin/describe.c: describe a blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16 02:00:39 +00:00
describe_blob(oid, &sb);
else
die(_("%s is neither a commit nor blob"), arg);
puts(sb.buf);
Chose better tag names in git-describe after merges. Recently git.git itself encountered a situation on its master and next branches where git-describe stopped reporting 'v1.5.0-rc0-gN' and instead started reporting 'v1.4.4.4-gN'. This appeared to be a backward jump in version numbering. maint o-------------------4 \ \ master o-o-o-o-o-o-o-5-o-C-o-W The issue is that commit C in the diagram claims it is version 1.5.0, as the tag v1.5.0 is placed on commit 5. Yet commit W claims it is version 1.4.4.4 as the tag v1.5.0 has an older tag date than the v1.4.4.4 tag. As it turns out this situation is very common. A bug fix applied to maint and later merged into master occurs frequently enough that it should Just Work Right(tm). Rather than taking the first tag that gets found git-describe will now generate a list of all possible tags and select the one which has the most number of commits in common with HEAD (or whatever revision the user requested the description of). This rule is based on the principle shown in the diagram above. There are a large number of commits on the primary development branch 'master' which do not appear in the 'maint' branch, and many of these are already tagged as part of v1.5.0-rc0. Additionally these commits are not in v1.4.4.4, as they are part of the v1.5.0 release still being developed. The v1.5.0-rc0 tag is more descriptive of W than v1.4.4.4 is, and therefore should be used. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 11:39:47 +00:00
Improve git-describe performance by reducing revision listing. My prior version of git-describe ran very slowly on even reasonably sized projects like git.git and linux.git as it tended to identify a large number of possible tags and then needed to generate the revision list for each of those tags to sort them and select the best tag to describe the input commit. All we really need is the number of commits in the input revision which are not in the tag. We can generate these counts during the revision walking and tag matching loop by assigning a color to each tag and coloring the commits as we walk them. This limits us to identifying no more than 26 possible tags, as there is limited space available within the flags field of struct commit. The limitation of 26 possible tags is hopefully not going to be a problem in real usage, as most projects won't create 26 maintenance releases and merge them back into a development trunk after the development trunk was tagged with a release candidate tag. If that does occur git-describe will start to revert to its old behavior of using the newer maintenance release tag to describe the development trunk, rather than the development trunk's own tag. The suggested workaround would be to retag the development trunk's tip. However since even 26 possible tags can take a while to generate a description for on some projects I'm defaulting the limit to 10 but offering the user --candidates to increase the number of possible matches if they need a more accurate result. I specifically chose 10 for the default as it seems unlikely projects will have more than 10 maintenance releases merged into a development trunk before retagging the development trunk, and it seems to perform about the same on linux.git as v1.4.4.4 git-describe. A large amount of debugging information was also added during the development of this change, so I've left it in to be toggled on with --debug. It may be useful to the end user to help them understand why git-describe took one particular tag over another. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-13 22:30:53 +00:00
if (!last_one)
clear_commit_marks(cmit, -1);
strbuf_release(&sb);
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
}
static int option_parse_exact_match(const struct option *opt, const char *arg,
int unset)
{
BUG_ON_OPT_ARG(arg);
max_candidates = unset ? DEFAULT_CANDIDATES : 0;
return 0;
}
int cmd_describe(int argc, const char **argv, const char *prefix)
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
{
int contains = 0;
struct option options[] = {
OPT_BOOL(0, "contains", &contains, N_("find the tag that comes after the commit")),
OPT_BOOL(0, "debug", &debug, N_("debug search strategy on stderr")),
OPT_BOOL(0, "all", &all, N_("use any ref")),
OPT_BOOL(0, "tags", &tags, N_("use any tag, even unannotated")),
OPT_BOOL(0, "long", &longformat, N_("always use long format")),
OPT_BOOL(0, "first-parent", &first_parent, N_("only follow first parent")),
OPT__ABBREV(&abbrev),
OPT_CALLBACK_F(0, "exact-match", NULL, NULL,
N_("only output exact matches"),
PARSE_OPT_NOARG, option_parse_exact_match),
OPT_INTEGER(0, "candidates", &max_candidates,
N_("consider <n> most recent tags (default: 10)")),
OPT_STRING_LIST(0, "match", &patterns, N_("pattern"),
N_("only consider tags matching <pattern>")),
OPT_STRING_LIST(0, "exclude", &exclude_patterns, N_("pattern"),
N_("do not consider tags matching <pattern>")),
OPT_BOOL(0, "always", &always,
N_("show abbreviated commit object as fallback")),
{OPTION_STRING, 0, "dirty", &dirty, N_("mark"),
N_("append <mark> on dirty working tree (default: \"-dirty\")"),
PARSE_OPT_OPTARG, NULL, (intptr_t) "-dirty"},
{OPTION_STRING, 0, "broken", &broken, N_("mark"),
N_("append <mark> on broken working tree (default: \"-broken\")"),
PARSE_OPT_OPTARG, NULL, (intptr_t) "-broken"},
OPT_END(),
};
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
git_config(git_default_config, NULL);
argc = parse_options(argc, argv, prefix, options, describe_usage, 0);
if (abbrev < 0)
abbrev = DEFAULT_ABBREV;
if (max_candidates < 0)
max_candidates = 0;
else if (max_candidates > MAX_TAGS)
max_candidates = MAX_TAGS;
save_commit_buffer = 0;
if (longformat && abbrev == 0)
die(_("options '%s' and '%s' cannot be used together"), "--long", "--abbrev=0");
if (contains) {
struct string_list_item *item;
struct strvec args;
strvec_init(&args);
strvec_pushl(&args, "name-rev",
"--peel-tag", "--name-only", "--no-undefined",
NULL);
if (always)
strvec_push(&args, "--always");
if (!all) {
strvec_push(&args, "--tags");
for_each_string_list_item(item, &patterns)
strvec_pushf(&args, "--refs=refs/tags/%s", item->string);
for_each_string_list_item(item, &exclude_patterns)
strvec_pushf(&args, "--exclude=refs/tags/%s", item->string);
}
if (argc)
strvec_pushv(&args, argv);
else
strvec_push(&args, "HEAD");
return cmd_name_rev(args.nr, args.v, prefix);
}
hashmap_init(&names, commit_name_neq, NULL, 0);
for_each_rawref(get_name, NULL);
hashmap: add API to disable item counting when threaded This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ Add API to hashmap to disable item counting and thus automatic rehashing. Also include API to later re-enable them. When item counting is disabled, the map.size field is invalid. So to prevent accidents, the field has been renamed and an accessor function hashmap_get_size() has been added. All direct references to this field have been been updated. And the name of the field changed to map.private_size to communicate this. Here is the relevant output from ThreadSanitizer showing the problem: WARNING: ThreadSanitizer: data race (pid=10554) Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 lazy_dir_thread_proc name-hash.c:471 #5 <null> <null> Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 handle_range_dir name-hash.c:380 #5 handle_range_1 name-hash.c:415 #6 lazy_dir_thread_proc name-hash.c:471 #7 <null> <null> Martin gives instructions for running TSan on test t3008 in this post: https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4EXXOmCKvsG5MSuzxsA@mail.gmail.com/ Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 15:43:48 +00:00
if (!hashmap_get_size(&names) && !always)
die(_("No names found, cannot describe anything."));
if (argc == 0) {
if (broken) {
struct child_process cp = CHILD_PROCESS_INIT;
strvec_pushv(&cp.args, diff_index_args);
cp.git_cmd = 1;
cp.no_stdin = 1;
cp.no_stdout = 1;
if (!dirty)
dirty = "-dirty";
switch (run_command(&cp)) {
case 0:
suffix = NULL;
break;
case 1:
suffix = dirty;
break;
default:
/* diff-index aborted abnormally */
suffix = broken;
}
} else if (dirty) {
struct lock_file index_lock = LOCK_INIT;
struct rev_info revs;
struct strvec args = STRVEC_INIT;
diff: drop useless return from run_diff_{files,index} functions Neither of these functions ever returns a value other than zero. Instead, they expect unrecoverable errors to exit immediately, and things like "--exit-code" are stored inside the diff_options struct to be handled later via diff_result_code(). Some callers do check the return values, but many don't bother. Let's drop the useless return values, which are misleading callers about how the functions work. This could be seen as a step in the wrong direction, as we might want to eventually "lib-ify" these to more cleanly return errors up the stack, in which case we'd have to add the return values back in. But there are some benefits to doing this now: 1. In the current code, somebody could accidentally add a "return -1" to one of the functions, which would be erroneously ignored by many callers. By removing the return code, the compiler can notice the mismatch and force the developer to decide what to do. Obviously the other option here is that we could start consistently checking the error code in every caller. But it would be dead code, and we wouldn't get any compile-time help in catching new cases. 2. It communicates the situation to callers, who may want to choose a different function. These functions are really thin wrappers for doing git-diff-files and git-diff-index within the process. But callers who care about recovering from an error here are probably better off using the underlying library functions, many of which do return errors. If somebody eventually wants to teach these functions to propagate errors, they'll have to switch back to returning a value, effectively reverting this patch. But at least then they will be starting with a level playing field: they know that they will need to inspect each caller to see how it should handle the error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 20:18:55 +00:00
int fd;
setup_work_tree();
prepare_repo_settings(the_repository);
the_repository->settings.command_requires_full_index = 0;
repo_read_index(the_repository);
refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED,
NULL, NULL, NULL);
fd = repo_hold_locked_index(the_repository,
&index_lock, 0);
if (0 <= fd)
repo_update_index_if_able(the_repository, &index_lock);
repo_init_revisions(the_repository, &revs, prefix);
strvec_pushv(&args, diff_index_args);
if (setup_revisions(args.nr, args.v, &revs, NULL) != 1)
BUG("malformed internal diff-index command line");
diff: drop useless return from run_diff_{files,index} functions Neither of these functions ever returns a value other than zero. Instead, they expect unrecoverable errors to exit immediately, and things like "--exit-code" are stored inside the diff_options struct to be handled later via diff_result_code(). Some callers do check the return values, but many don't bother. Let's drop the useless return values, which are misleading callers about how the functions work. This could be seen as a step in the wrong direction, as we might want to eventually "lib-ify" these to more cleanly return errors up the stack, in which case we'd have to add the return values back in. But there are some benefits to doing this now: 1. In the current code, somebody could accidentally add a "return -1" to one of the functions, which would be erroneously ignored by many callers. By removing the return code, the compiler can notice the mismatch and force the developer to decide what to do. Obviously the other option here is that we could start consistently checking the error code in every caller. But it would be dead code, and we wouldn't get any compile-time help in catching new cases. 2. It communicates the situation to callers, who may want to choose a different function. These functions are really thin wrappers for doing git-diff-files and git-diff-index within the process. But callers who care about recovering from an error here are probably better off using the underlying library functions, many of which do return errors. If somebody eventually wants to teach these functions to propagate errors, they'll have to switch back to returning a value, effectively reverting this patch. But at least then they will be starting with a level playing field: they know that they will need to inspect each caller to see how it should handle the error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 20:18:55 +00:00
run_diff_index(&revs, 0);
if (!diff_result_code(&revs.diffopt))
suffix = NULL;
else
suffix = dirty;
release_revisions(&revs);
}
2006-01-16 06:25:35 +00:00
describe("HEAD", 1);
Teach "git describe" --dirty option With the --dirty option, git describe works on HEAD but append s"-dirty" iff the contents of the work tree differs from HEAD. E.g. $ git describe --dirty v1.6.5-15-gc274db7 $ echo >> Makefile $ git describe --dirty v1.6.5-15-gc274db7-dirty The --dirty option can also be used to specify what is appended, instead of the default string "-dirty". $ git describe --dirty=.mod v1.6.5-15-gc274db7.mod Many build scripts use `git describe` to produce a version number based on the description of HEAD (on which the work tree is based) + saying that if the build contains uncommitted changes. This patch helps the writing of such scripts since `git describe --dirty` does directly the intended thing. Three possiblities were considered while discussing this new feature: 1. Describe the work tree by default and describe HEAD only if "HEAD" is explicitly specified Pro: does the right thing by default (both for users and for scripts) Pro: other git commands that works on the work tree by default Con: breaks existing scripts used by the Linux kernel and other projects 2. Use --worktree instead of --dirty Pro: does what it says: "git describe --worktree" describes the work tree Con: other commands do not require a --worktree option when working on the work tree (it often is the default mode for them) Con: unusable with an optional value: "git describe --worktree=.mod" is quite unintuitive. 3. Use --dirty as in this patch Pro: makes sense to specify an optional value (what the dirty mark is) Pro: does not have any of the big cons of previous alternatives * does not break scripts * is not inconsistent with other git commands This patch takes the third approach. Signed-off-by: Jean Privat <jean@pryen.org> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 13:35:22 +00:00
} else if (dirty) {
die(_("option '%s' and commit-ishes cannot be used together"), "--dirty");
} else if (broken) {
die(_("option '%s' and commit-ishes cannot be used together"), "--broken");
} else {
while (argc-- > 0)
describe(*argv++, argc == 0);
}
Add a "git-describe" command It shows you the most recent tag that is reachable from a particular commit is. Maybe this is something that "git-name-rev" should be taught to do, instead of having a separate command for it. Regardless, I find it useful. What it does is to take any random commit, and "name" it by looking up the most recent commit that is tagged and reachable from that commit. If the match is exact, it will just print out that ref-name directly. Otherwise it will print out the ref-name, followed by the 8-character "short SHA". IOW, with something like Junios current tree, I get: [torvalds@g5 git]$ git-describe parent refs/tags/v1.0.4-g2414721b ie the current head of my "parent" branch (ie Junio) is based on v1.0.4, but since it has a few commits on top of that, it has added the git hash of the thing to the end: "-g" + 8-char shorthand for the commit 2414721b194453f058079d897d13c4e377f92dc6. Doing a "git-describe" on a tag-name will just show the full tag path: [torvalds@g5 git]$ git-describe v1.0.4 refs/tags/v1.0.4 unless there are _other_ tags pointing to that commit, in which case it will just choose one at random. This is useful for two things: - automatic version naming in Makefiles, for example. We could use it in git itself: when doing "git --version", we could use this to give a much more useful description of exactly what version was installed. - for any random commit (say, you use "gitk <pathname>" or "git-whatchanged" to look at what has changed in some file), you can figure out what the last version of the repo was. Ie, say I find a bug in commit 39ca371c45b04cd50d0974030ae051906fc516b6, I just do: [torvalds@g5 linux]$ git-describe 39ca371c45b04cd50d0974030ae051906fc516b6 refs/tags/v2.6.14-rc4-g39ca371c and I now know that it was _not_ in v2.6.14-rc4, but was presumably in v2.6.14-rc5. The latter is useful when you want to see what "version timeframe" a commit happened in. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-24 21:50:45 +00:00
return 0;
}