git/builtin
Jeff King b773ddea2c pack-objects: walk tag chains for --include-tag
When pack-objects is given --include-tag, it peels each tag
ref down to a non-tag object, and if that non-tag object is
going to be packed, we include the tag, too. But what
happens if we have a chain of tags (e.g., tag "A" points to
tag "B", which points to commit "C")?

We'll peel down to "C" and realize that we want to include
tag "A", but we do not ever consider tag "B", leading to a
broken pack (assuming "B" was not otherwise selected).
Instead, we have to walk the whole chain, adding any tags we
find to the pack.
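
As a rough illustration of "walk the whole chain", here is a minimal sketch on a toy object model. It is not the code in pack-objects.c: the names toy_object, peel, add_tag_chain, and maybe_include_tag are hypothetical, and the in_pack flag stands in for whatever bookkeeping the real packer uses.

```c
#include <assert.h>
#include <stddef.h>

enum obj_type { OBJ_COMMIT, OBJ_TAG };

/* Hypothetical stand-in for an object; not git's real struct object. */
struct toy_object {
	enum obj_type type;
	struct toy_object *tagged; /* what a tag points at; NULL for a commit */
	int in_pack;               /* nonzero once selected for the pack */
};

/* Peel a tag all the way down to the first non-tag object. */
struct toy_object *peel(struct toy_object *o)
{
	while (o && o->type == OBJ_TAG)
		o = o->tagged;
	return o;
}

/*
 * Add "tag" and every intermediate tag below it to the pack.
 * Returns the number of objects newly added.
 */
int add_tag_chain(struct toy_object *tag)
{
	int added = 0;
	struct toy_object *o;

	for (o = tag; o && o->type == OBJ_TAG; o = o->tagged) {
		if (!o->in_pack) {
			o->in_pack = 1;
			added++;
		}
	}
	return added;
}

/*
 * Toy version of the --include-tag rule: if the fully peeled object
 * is being packed, include the whole chain, not just the outer tag.
 */
int maybe_include_tag(struct toy_object *tag_ref)
{
	struct toy_object *peeled;

	assert(tag_ref && tag_ref->type == OBJ_TAG);
	peeled = peel(tag_ref);
	if (!peeled || !peeled->in_pack)
		return 0;
	return add_tag_chain(tag_ref);
}
```

With "A" -> "B" -> "C" and "C" already in the pack, calling maybe_include_tag() on "A" marks both "A" and "B". Marking only the outer tag would reproduce the broken pack described above.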

Interestingly, it doesn't seem possible to trigger this
problem with "git fetch", but you can with "git clone
--single-branch". The reason is that we generate the correct
pack when the client explicitly asks for "A" (because we do
a real reachability analysis there), and "fetch" is more
willing to do so. There are basically two cases:

  1. If "C" is already a ref tip, then the client can deduce
     that it needs "A" itself (via find_non_local_tags), and
     will ask for it explicitly rather than relying on the
     include-tag capability. Everything works.

  2. If "C" is not already a ref tip, then we hope for
     include-tag to send us the correct tag. But it doesn't;
     it generates a broken pack. However, the next step is
     to do a follow-up run of find_non_local_tags(),
     followed by fetch_refs() to backfill any tags we
     learned about.

     In the normal case, fetch_refs() calls quickfetch(),
     which does a connectivity check and sees we have no
     new objects to fetch. We just write the refs.

     But for the broken-pack case, the connectivity check
     fails, and quickfetch will follow up with the remote,
     asking explicitly for each of the ref tips. This picks
     up the missing object in a new pack.

For a regular "git clone", we are similarly OK, because we
explicitly request all of the tag refs, and get a correct
pack. But with "--single-branch", we kick in tag
auto-following via "include-tag", but do _not_ do a
follow-up backfill. We just take whatever the server sent us
via include-tag and write out tag refs for any tag objects
we were sent. So prior to c6807a4 (clone: open a shortcut
for connectivity check, 2013-05-26), we actually claimed the
clone was a success, but the result was silently
corrupted!  Since c6807a4, index-pack's connectivity
check catches this case, and we correctly complain.

The included test directly checks that pack-objects does not
generate a broken pack, but also confirms that "clone
--single-branch" does not hit the bug.

Note that tag chains introduce another interesting question:
if we are packing the tag "B" but not the commit "C", should
"A" be included?

Both before and after this patch, we do not include "A",
because the initial peel_ref() check only knows about the
bottom-most level, "C". To realize that "B" is involved at
all, we would have to switch to an incremental peel, in
which we examine each tagged object, asking if it is being
packed (and including the outer tag if so).
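
The difference between the two selection strategies can be sketched on a toy model. This is only an illustration under assumed names (toy_object, wanted_by_full_peel, wanted_by_incremental_peel are hypothetical). In particular, the real peel_ref() can answer from packed-refs without opening any object, whereas this sketch simply follows pointers.

```c
#include <stddef.h>

enum obj_type { OBJ_COMMIT, OBJ_TAG };

/* Hypothetical stand-in for an object; not git's real struct object. */
struct toy_object {
	enum obj_type type;
	struct toy_object *tagged; /* what a tag points at; NULL for a commit */
	int in_pack;               /* nonzero if selected for the pack */
};

/* Bottom-only check: consult nothing but the fully peeled object. */
int wanted_by_full_peel(const struct toy_object *tag)
{
	const struct toy_object *o = tag;

	while (o && o->type == OBJ_TAG)
		o = o->tagged;
	return o && o->in_pack;
}

/*
 * Incremental check: examine each tagged object in turn and report
 * success as soon as one of them is being packed.
 */
int wanted_by_incremental_peel(const struct toy_object *tag)
{
	const struct toy_object *o = tag;

	while (o && o->type == OBJ_TAG) {
		o = o->tagged;
		if (o && o->in_pack)
			return 1;
	}
	return 0;
}
```

For a chain "A" -> "B" -> "C" where "B" is packed but "C" is not, the bottom-only check rejects "A" while the incremental check accepts it. The incremental version has to visit every link of every chain, which is the cost discussed next.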

But that runs contrary to the optimizations in peel_ref(),
which avoid accessing the objects at all, in favor of using
the value we pull from packed-refs. It's OK to walk the
whole chain once we know we're going to include the tag (we
have to access it anyway, so the effort is proportional to
the pack we're generating). But for the initial selection,
we have to look at every ref. If we're only packing a few
objects, we'd still have to parse every single referenced
tag object just to confirm that it isn't part of a tag
chain.

This could be addressed if packed-refs stored the complete
tag chain for each peeled ref (in most cases, this would be
the same cost as now, as each "chain" is only a single
link). But given the size of that project, it's out of scope
for this fix (and probably nobody cares enough anyway, as
it's such an obscure situation). This commit limits itself
to just avoiding the creation of a broken pack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:31 -07:00
add.c add: add --chmod=+x / --chmod=-x options 2016-06-07 17:43:39 -07:00
am.c Merge branch 'jk/reset-ident-time-per-commit' into maint 2016-08-12 09:16:56 -07:00
annotate.c
apply.c Merge branch 'rs/apply-name-terminate' 2016-06-03 14:38:04 -07:00
archive.c
bisect--helper.c
blame.c Merge branch 'mh/blame-worktree' into maint 2016-08-08 14:21:32 -07:00
branch.c Merge branch 'va/i18n-misc-updates' into maint 2016-05-26 13:17:20 -07:00
bundle.c
cat-file.c Merge branch 'nd/pack-ofs-4gb-limit' into maint 2016-08-08 14:21:36 -07:00
check-attr.c give "nbuf" strbuf a more meaningful name 2016-02-01 13:43:02 -08:00
check-ignore.c give "nbuf" strbuf a more meaningful name 2016-02-01 13:43:02 -08:00
check-mailmap.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
check-ref-format.c use xmallocz to avoid size arithmetic 2016-02-22 14:51:09 -08:00
checkout-index.c checkout-index: disallow "--no-stage" option 2016-02-01 13:43:49 -08:00
checkout.c add: add --chmod=+x / --chmod=-x options 2016-06-07 17:43:39 -07:00
clean.c Merge branch 'jk/tighten-alloc' 2016-02-26 13:37:16 -08:00
clone.c Merge branch 'sb/clone-shallow-passthru' into maint 2016-07-11 10:44:12 -07:00
column.c column: read lines with strbuf_getline() 2016-01-15 10:35:07 -08:00
commit-tree.c Merge branch 'jc/commit-tree-ignore-commit-gpgsign' 2016-05-13 13:18:27 -07:00
commit.c Merge branch 'os/no-verify-skips-commit-msg-too' into maint 2016-08-10 11:55:25 -07:00
config.c Merge branch 'jk/config-get-urlmatch' into maint 2016-04-14 18:57:43 -07:00
count-objects.c
credential.c
describe.c Remove get_object_hash. 2015-11-20 08:02:05 -05:00
diff-files.c diff: run arguments through precompose_argv 2016-05-13 14:35:49 -07:00
diff-index.c diff: run arguments through precompose_argv 2016-05-13 14:35:49 -07:00
diff-tree.c Merge branch 'ar/diff-args-osx-precompose' into maint 2016-06-06 14:27:35 -07:00
diff.c Merge branch 'ar/diff-args-osx-precompose' into maint 2016-06-06 14:27:35 -07:00
fast-export.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
fetch-pack.c fetch-pack: fix object_id of exact sha1 2016-03-01 11:19:19 -08:00
fetch.c Merge branch 'km/fetch-do-not-free-remote-name' into maint 2016-07-11 10:44:10 -07:00
fmt-merge-msg.c use strbuf_addstr() instead of strbuf_addf() with "%s" 2016-08-05 15:09:25 -07:00
for-each-ref.c ref-filter: add option to match literal pattern 2015-09-17 10:02:49 -07:00
fsck.c Merge branch 'nd/pack-ofs-4gb-limit' into maint 2016-08-08 14:21:36 -07:00
gc.c Merge branch 'ew/gc-auto-pack-limit-fix' into maint 2016-07-28 11:25:56 -07:00
get-tar-commit-id.c usage: do not insist that standard input must come from a file 2015-10-16 15:27:52 -07:00
grep.c Merge branch 'nd/ita-cleanup' into maint 2016-07-28 11:25:51 -07:00
hash-object.c Merge branch 'jk/options-cleanup' 2016-02-10 14:20:08 -08:00
help.c builtin/help.c: use warning_errno() 2016-05-09 12:29:08 -07:00
index-pack.c index-pack: correct "offset" type in unpack_entry_data() 2016-07-13 09:15:08 -07:00
init-db.c Merge branch 'jk/check-repository-format' into maint 2016-05-02 14:24:04 -07:00
interpret-trailers.c interpret-trailers: add option for in-place editing 2016-01-14 12:22:17 -08:00
log.c Merge branch 'xy/format-patch-base' 2016-05-23 14:54:31 -07:00
ls-files.c Merge branch 'jk/options-cleanup' 2016-02-10 14:20:08 -08:00
ls-remote.c ls-remote: add support for showing symrefs 2016-01-19 10:07:56 -08:00
ls-tree.c convert trivial sprintf / strcpy calls to xsnprintf 2015-09-25 10:18:18 -07:00
mailinfo.c mailinfo: libify 2015-10-21 15:59:34 -07:00
mailsplit.c builtin/mailsplit.c: use error_errno() 2016-05-09 12:29:08 -07:00
merge-base.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
merge-file.c builtin/merge-file.c: use error_errno() 2016-05-09 12:29:08 -07:00
merge-index.c use sha1_to_hex_r() instead of strcpy 2015-10-05 11:08:05 -07:00
merge-ours.c
merge-recursive.c convert trivial sprintf / strcpy calls to xsnprintf 2015-09-25 10:18:18 -07:00
merge-tree.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
merge.c Merge branch 'en/merge-trivial-fix' 2016-04-25 15:17:15 -07:00
mktag.c usage: do not insist that standard input must come from a file 2015-10-16 15:27:52 -07:00
mktree.c Merge branch 'jk/tighten-alloc' 2016-02-26 13:37:16 -08:00
mv.c Merge branch 'sb/mv-submodule-fix' into HEAD 2016-05-18 14:40:05 -07:00
name-rev.c Merge branch 'js/name-rev-use-oldest-ref' into maint 2016-05-31 14:08:26 -07:00
notes.c Merge branch 'sb/misc-cleanups' into HEAD 2016-05-18 14:40:15 -07:00
pack-objects.c pack-objects: walk tag chains for --include-tag 2016-09-07 11:45:31 -07:00
pack-redundant.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
pack-refs.c
patch-id.c Merge branch 'rs/patch-id-use-skip-prefix' 2016-06-03 14:38:03 -07:00
prune-packed.c
prune.c Merge branch 'jk/repository-extension' into maint 2015-11-03 15:32:25 -08:00
pull.c Merge branch 'va/i18n-misc-updates' 2016-05-17 14:38:23 -07:00
push.c Merge branch 'mm/push-default-warning' 2016-02-26 13:37:25 -08:00
read-tree.c convert trivial sprintf / strcpy calls to xsnprintf 2015-09-25 10:18:18 -07:00
receive-pack.c Merge branch 'dt/pre-refs-backend' 2016-04-25 15:17:15 -07:00
reflog.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
remote-ext.c typofix: assorted typofixes in comments, documentation and messages 2016-05-06 13:16:37 -07:00
remote-fd.c
remote.c i18n: remote: add comment for translators 2016-05-09 12:20:40 -07:00
repack.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
replace.c Merge branch 'js/replace-edit-use-editor-configuration' into maint 2016-05-06 14:53:24 -07:00
rerere.c Sync with 2.6.1 2015-10-05 13:20:08 -07:00
reset.c Merge branch 'js/find-commit-subject-ignore-leading-blanks' into maint 2016-07-28 11:25:50 -07:00
rev-list.c Merge branch 'jk/rev-list-count-with-bitmap' into maint 2016-06-27 09:56:24 -07:00
rev-parse.c use strbuf_addstr() for adding constant strings to a strbuf 2016-08-01 13:42:10 -07:00
revert.c parse_options: allocate a new array when concatenating 2016-07-06 10:11:08 -07:00
rm.c Merge branch 'rs/rm-strbuf-optim' into maint 2016-08-10 11:55:24 -07:00
send-pack.c Merge branch 'sk/send-pack-all-fix' into maint 2016-04-29 14:15:57 -07:00
shortlog.c Merge branch 'jk/shortlog' 2016-01-28 16:10:14 -08:00
show-branch.c Merge branch 'rs/show-branch-argv-array' into maint 2015-12-11 11:14:14 -08:00
show-ref.c show-ref: stop using PARSE_OPT_NO_INTERNAL_HELP 2015-11-20 08:02:07 -05:00
stripspace.c stripspace: call U+0020 a "space" instead of a "blank" 2016-01-29 16:02:34 -08:00
submodule--helper.c submodule: remove bashism from shell script 2016-06-01 11:32:53 -07:00
symbolic-ref.c symbolic-ref: propagate error code from create_symref() 2015-12-21 12:03:03 -08:00
tag.c Merge branch 'st/verify-tag' 2016-04-29 12:59:09 -07:00
unpack-file.c convert trivial sprintf / strcpy calls to xsnprintf 2015-09-25 10:18:18 -07:00
unpack-objects.c Remove get_object_hash. 2015-11-20 08:02:05 -05:00
update-index.c builtin/update-index.c: prefer "err" to "errno" in process_lstat_error 2016-05-09 12:29:08 -07:00
update-ref.c tag, update-ref: improve description of option "create-reflog" 2015-09-11 09:50:02 -07:00
update-server-info.c
upload-archive.c builtin/upload-archive.c: use error_errno() 2016-05-09 12:29:08 -07:00
var.c
verify-commit.c
verify-pack.c
verify-tag.c verify-tag: move tag verification code to tag.c 2016-04-22 14:06:46 -07:00
worktree.c Merge branch 'nd/worktree-various-heads' 2016-05-23 14:54:29 -07:00
write-tree.c