7 commits

Author SHA1 Message Date
Shawn O. Pearce 8713ab3079 Improve git-describe performance by reducing revision listing.
My prior version of git-describe ran very slowly on even reasonably
sized projects like git.git and linux.git as it tended to identify
a large number of possible tags and then needed to generate the
revision list for each of those tags to sort them and select the
best tag to describe the input commit.

All we really need is the number of commits in the input revision
which are not in the tag.  We can generate these counts during
the revision walking and tag matching loop by assigning a color to
each tag and coloring the commits as we walk them.  This limits us
to identifying no more than 26 possible tags, as there is limited
space available within the flags field of struct commit.
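
As a rough illustration of the coloring idea described above, the following Python sketch walks history newest-first, gives each tag it meets its own bit, passes a commit's bits down to its parents, and bumps a tag's counter for every walked commit that lacks that tag's bit. This is a minimal sketch under stated assumptions, not git's actual C code: the function name `count_commits_not_in_tags`, the dictionary-based graph, and the example commits and dates are all hypothetical. Python integers stand in for the flags field of struct commit, so the 26-tag ceiling does not appear here, and the walk assumes every parent has an older date than its children.

```python
import heapq

def count_commits_not_in_tags(parents, dates, tags, start, max_candidates=10):
    """Return {tag_name: commits reachable from start but not from the tag}."""
    flags = {start: 0}                 # per-commit bitmask of candidate tags containing it
    heap = [(-dates[start], start)]    # newest commit is popped first
    candidates = []                    # entries are [tag_name, bit, depth]
    seen_commits = 0
    while heap:
        _, c = heapq.heappop(heap)
        seen_commits += 1
        if c in tags and len(candidates) < max_candidates:
            bit = 1 << len(candidates)
            flags[c] |= bit
            # Everything walked before c is newer than c, so it is not in this tag.
            candidates.append([tags[c], bit, seen_commits - 1])
        for cand in candidates:
            if not flags[c] & cand[1]:
                cand[2] += 1           # c is not colored by this tag
        for p in parents.get(c, ()):
            if p not in flags:
                flags[p] = 0
                heapq.heappush(heap, (-dates[p], p))
            flags[p] |= flags[c]       # parents inherit their children's colors
    return {name: depth for name, _, depth in candidates}

# Hypothetical history: a maint commit "f4" is merged into master at "M".
parents = {"W": ["M"], "M": ["C", "f4"], "C": ["m3"], "m3": ["t5"],
           "t5": ["m2"], "m2": ["m1"], "m1": ["r"], "f4": ["r"], "r": []}
dates = {"r": 0, "m1": 1, "m2": 2, "t5": 3, "m3": 4, "C": 5,
         "f4": 6, "M": 7, "W": 8}
tags = {"t5": "v1.5.0-rc0", "f4": "v1.4.4.4"}
counts = count_commits_not_in_tags(parents, dates, tags, "W")
```

The tag with the smallest count is the closest description of the input commit. In this example the maintenance tag has the newer date, yet it ends up with the larger count.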

The limitation of 26 possible tags is hopefully not going to be a
problem in real usage, as most projects won't create 26 maintenance
releases and merge them back into a development trunk after the
development trunk was tagged with a release candidate tag.  If that
does occur git-describe will start to revert to its old behavior of
using the newer maintenance release tag to describe the development
trunk, rather than the development trunk's own tag.  The suggested
workaround would be to retag the development trunk's tip.

However, since even 26 possible tags can take a while to describe
on some projects, I'm defaulting the limit to 10 but offering the
user --candidates to increase the number of possible
matches if they need a more accurate result.  I specifically chose
10 for the default as it seems unlikely projects will have more
than 10 maintenance releases merged into a development trunk before
retagging the development trunk, and it seems to perform about the
same on linux.git as v1.4.4.4 git-describe.

A large amount of debugging information was also added during
the development of this change, so I've left it in to be toggled
on with --debug.  It may be useful to the end user to help them
understand why git-describe took one particular tag over another.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-14 21:17:27 -08:00
Shawn O. Pearce 910c0d7b5e Use binary searching on large buckets in git-describe.
If a project has a really huge number of tags (such as several
thousand tags) then we are likely to have nearly a hundred tags in
some buckets.  Scanning those buckets as linked lists could take
a large amount of time if done repeatedly during history traversal.

Since we are searching for a unique commit SHA1 we can sort all
tags by commit SHA1 and perform a binary search within the bucket.
Once we identify a particular tag as matching this commit we walk
backwards within the bucket matches to make sure we pick up the
highest priority tag for that commit, as the binary search may
have landed us in the middle of a set of tags which point at the
same commit.
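
The search described above might look roughly like the Python sketch below. It is only an illustration: the tuple layout `(sha, priority, name)`, the function name `find_tag`, and the example bucket are hypothetical, and the bucket is assumed to be sorted by SHA1 with the highest-priority tag first within each run of equal SHA1s.

```python
def find_tag(bucket, sha):
    """Binary-search a sorted bucket; return the highest-priority tag name or None."""
    lo, hi = 0, len(bucket) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if bucket[mid][0] < sha:
            lo = mid + 1
        elif bucket[mid][0] > sha:
            hi = mid - 1
        else:
            # The search may land anywhere inside a run of tags that point
            # at the same commit, so walk backwards to the start of the run.
            while mid > 0 and bucket[mid - 1][0] == sha:
                mid -= 1
            return bucket[mid][2]
    return None

# Hypothetical bucket: three tags share the commit "aa01".
entries = [("aa01", 0, "lightweight"), ("aa01", 2, "v1.0"), ("aa01", 1, "other"),
           ("aa05", 2, "v2.0"), ("aa09", 2, "v3.0")]
bucket = sorted(entries, key=lambda e: (e[0], -e[1]))
```

With this bucket the first probe lands on the last of the three "aa01" entries, and the backwards walk moves it to the entry with the highest priority.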

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-14 21:17:27 -08:00
Shawn O. Pearce c3e3cd4bf8 Hash tags by commit SHA1 in git-describe.
If a project has a very large number of tags then git-describe
will spend a good part of its time looping over the tags, testing
them one at a time to determine whether each matches a given commit.
For 10 tags this is not a big deal, but for hundreds of tags the
time could become considerable if we don't find an exact match for
the input commit and we need to walk back along the history chain.
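
The fix named in the subject line, hashing tags by the commit they point at, can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual change: the names `build_buckets` and `lookup` are hypothetical, SHA1s are shortened hex strings, and keying buckets on the first two hex digits (one byte) is an assumed choice.

```python
from collections import defaultdict

def build_buckets(tags):
    """Group (name, sha) pairs into buckets keyed by the first byte of the SHA1."""
    buckets = defaultdict(list)
    for name, sha in tags:
        buckets[sha[:2]].append((sha, name))
    return buckets

def lookup(buckets, sha):
    """Scan only the one bucket that could hold this commit."""
    for candidate_sha, name in buckets.get(sha[:2], ()):
        if candidate_sha == sha:
            return name
    return None

# Hypothetical tags with shortened SHA1s.
buckets = build_buckets([("v1.0", "8713ab30"), ("v1.1", "910c0d7b"), ("v2.0", "87ffee00")])
```

Each commit visited during the history walk is then compared against one bucket instead of every tag in the repository.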

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-14 21:17:27 -08:00
Shawn O. Pearce dccd0c2abd Always prefer annotated tags in git-describe.
Several people have suggested that it's always better to describe
a commit using an annotated tag, and to only use a lightweight tag
if absolutely no annotated tag matches the input commit.
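
One way to express this rule is sketched below in Python. It is a hypothetical illustration, not the code from this commit: the function name `pick_tag` and the `(name, is_annotated, depth)` tuples are assumptions, and breaking ties by the smallest depth is an added assumption as well.

```python
def pick_tag(candidates):
    """Prefer annotated tags; fall back to lightweight tags only if none match."""
    annotated = [c for c in candidates if c[1]]
    pool = annotated or candidates      # lightweight tags are a last resort
    if not pool:
        return None
    return min(pool, key=lambda c: c[2])[0]
```

For example, `pick_tag([("wip", False, 0), ("v1.0", True, 3)])` selects the annotated `v1.0` even though the lightweight tag is closer.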

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-14 21:17:27 -08:00
Junio C Hamano 94d23673e3 plug a few leaks in revision walking used in describe.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-11 18:05:53 -08:00
Shawn O. Pearce 80dbae03b0 Choose better tag names in git-describe after merges.
Recently git.git itself encountered a situation on its master and
next branches where git-describe stopped reporting 'v1.5.0-rc0-gN'
and instead started reporting 'v1.4.4.4-gN'.  This appeared to be
a backward jump in version numbering.

  maint     o-------------------4
            \                    \
  master     o-o-o-o-o-o-o-5-o-C-o-W

The issue is that commit C in the diagram claims it is version
1.5.0-rc0, as the tag v1.5.0-rc0 is placed on commit 5.  Yet commit W
claims it is version 1.4.4.4 as the tag v1.5.0-rc0 has an older tag
date than the v1.4.4.4 tag.

As it turns out this situation is very common.  A bug fix applied
to maint and later merged into master occurs frequently enough that
it should Just Work Right(tm).

Rather than taking the first tag that gets found git-describe will
now generate a list of all possible tags and select the one which
has the most commits in common with HEAD (or whatever
revision the user requested the description of).

This rule is based on the principle shown in the diagram above.
There are a large number of commits on the primary development branch
'master' which do not appear in the 'maint' branch, and many of
these are already tagged as part of v1.5.0-rc0.  Additionally these
commits are not in v1.4.4.4, as they are part of the v1.5.0 release
still being developed.  The v1.5.0-rc0 tag is more descriptive of
W than v1.4.4.4 is, and therefore should be used.
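
The selection rule can be sketched with plain set arithmetic in Python. This is a minimal illustration under stated assumptions, not the revision-listing code the commit adds: the helper names `ancestors` and `best_tag`, the dictionary-based graph, and the example commits (loosely modelled on the diagram above) are hypothetical.

```python
def ancestors(parents, start):
    """Return the set of commits reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def best_tag(parents, tags, head):
    """Pick the reachable tag that shares the most commits with head."""
    head_set = ancestors(parents, head)
    best_name, best_common = None, -1
    for commit, name in tags.items():
        if commit not in head_set:
            continue                    # this tag cannot describe head
        common = len(ancestors(parents, commit) & head_set)
        if common > best_common:
            best_name, best_common = name, common
    return best_name

# Hypothetical history: maint commit "f4" is merged into master at "M".
parents = {"W": ["M"], "M": ["C", "f4"], "C": ["m3"], "m3": ["t5"],
           "t5": ["m2"], "m2": ["m1"], "m1": ["r"], "f4": ["r"], "r": []}
tags = {"t5": "v1.5.0-rc0", "f4": "v1.4.4.4"}
```

Here `best_tag(parents, tags, "W")` prefers the release candidate tag because it shares four commits with W, while the maintenance tag shares only two.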

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-11 18:05:53 -08:00
Shawn O. Pearce 9a0eaf83ea Make git-describe a builtin.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-10 08:27:01 -08:00
Renamed from describe.c