git/builtin
Ævar Arnfjörð Bjarmason 96e41f58fe fsck: report invalid object type-path combinations
Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.

Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.

Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2
    $ mv objects/e6/ objects/e7

Would emit ("[...]" used to abbreviate the OIDs):

    git fsck
    error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
    error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]

Now we'll instead emit:

    error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]

Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ mv objects/83 objects/84

As noted in an earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:

    $ git fsck
    fatal: invalid object type

Now we'll instead emit sensible error messages:

    $ git fsck
    error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
    error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]

In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.

We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:

    $ git hash-object --stdin -w -t blob </dev/null
    e69de29bb2
    $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ git fsck
    error: garbage at end of loose object 'e69d[...]'
    error: unable to unpack contents of ./objects/e6/9d[...]
    error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]

There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:

    $ git hash-object --stdin -w -t garbage --literally </dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ /usr/bin/git fsck
    fatal: invalid object type
    $ ~/g/git/git fsck
    error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
    error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    [...]

I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly more obscure cases the worst that can happen is that
we'll get slightly nonsensical or inapplicable error messages.

There's other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01 15:06:01 -07:00
..
add.c Merge branch 'ow/no-dryrun-in-add-i' 2021-05-14 08:26:09 +09:00
am.c am: learn to process quoted lines that ends with CRLF 2021-05-10 15:06:22 +09:00
annotate.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
apply.c
archive.c
bisect--helper.c bisect--helper: use BISECT_TERMS in 'bisect skip' command 2021-04-30 09:56:42 +09:00
blame.c Merge branch 'rs/blame-optim' 2021-02-25 16:43:29 -08:00
branch.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
bugreport.c builtin/bugreport: don't leak prefixed filename 2021-04-28 09:25:45 +09:00
bundle.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
cat-file.c Merge branch 'cc/cat-file-usage-update' into master 2020-07-09 14:00:41 -07:00
check-attr.c
check-ignore.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c make_transient_cache_entry(): optionally alloc from mem_pool 2021-05-05 12:25:25 +09:00
checkout-index.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
checkout.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
clean.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
clone.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
column.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
commit-graph.c builtin/*: update usage format 2021-01-06 15:10:49 -08:00
commit-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
commit.c Merge branch 'ds/sparse-index-protections' 2021-04-30 13:50:26 +09:00
config.c config: unify code paths to get global config paths 2021-04-19 14:16:59 -07:00
count-objects.c
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c unix-socket: disallow chdir() when creating unix domain sockets 2021-03-15 14:32:51 -07:00
credential-store.c crendential-store: use timeout when locking file 2020-11-25 12:30:18 -08:00
credential.c credential: load default config 2020-10-16 12:30:45 -07:00
describe.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-tree.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
difftool.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c fsck: report invalid object type-path combinations 2021-10-01 15:06:01 -07:00
fast-import.c Use the final_oid_fn to finalize hashing of object IDs 2021-04-27 16:31:38 +09:00
fetch-pack.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
fetch.c Merge branch 'jt/push-negotiation' 2021-05-16 21:05:22 +09:00
fmt-merge-msg.c Lib-ify fmt-merge-msg 2020-03-24 15:04:43 -07:00
for-each-ref.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
for-each-repo.c for-each-repo: do nothing on empty config 2021-01-07 19:12:02 -08:00
fsck.c fsck: report invalid object type-path combinations 2021-10-01 15:06:01 -07:00
gc.c maintenance: fix two memory leaks 2021-05-12 07:00:45 +09:00
get-tar-commit-id.c
grep.c Merge branch 'bc/hash-transition-interop-part-1' 2021-05-10 16:59:46 +09:00
hash-object.c
help.c help: drop usage of 'common' and 'useful' for guides 2020-08-04 18:34:01 -07:00
index-pack.c fsck: report invalid object type-path combinations 2021-10-01 15:06:01 -07:00
init-db.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
interpret-trailers.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
log.c Merge branch 'so/log-diff-merge' 2021-04-30 13:50:26 +09:00
ls-files.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
ls-remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
ls-tree.c tree.h API: simplify read_tree_recursive() signature 2021-03-20 16:09:26 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c
merge-base.c rebase: --fork-point regression fix 2020-02-11 09:59:39 -08:00
merge-file.c
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c
merge-recursive.c Ensure index matches head before invoking merge machinery, round N 2019-08-19 10:08:03 -07:00
merge-tree.c merge-base, xdiff: zero out xpparam_t structures 2020-10-20 12:53:26 -07:00
merge.c merge: fix swapped "up to date" message components 2021-05-03 14:14:58 +09:00
mktag.c fsck: report invalid object type-path combinations 2021-10-01 15:06:01 -07:00
mktree.c
multi-pack-index.c midx: allow marking a pack as preferred 2021-04-01 13:07:37 -07:00
mv.c git mv foo FOO ; git mv foo bar gave an assert 2021-03-03 17:07:12 -08:00
name-rev.c oid_pos(): access table through const pointers 2021-01-28 12:03:26 -08:00
notes.c use CALLOC_ARRAY 2021-03-13 16:00:09 -08:00
pack-objects.c Merge branch 'jk/pack-objects-negative-options-fix' 2021-05-11 15:27:23 +09:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c patch-id: use oid_to_hex() to print multiple object IDs 2019-12-09 12:26:40 -08:00
prune-packed.c Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
prune.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
pull.c pull: display default warning only when non-ff 2020-12-15 17:39:42 -08:00
push.c Merge branch 'jc/push-delete-nothing' 2021-02-25 16:43:33 -08:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
rebase.c Merge branch 'bc/hash-transition-interop-part-1' 2021-05-10 16:59:46 +09:00
receive-pack.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
reflog.c reflog expire --stale-fix: be generous about missing objects 2021-02-11 09:21:52 -08:00
remote-ext.c strvec: convert builtin/ callers away from argv_array name 2020-07-28 15:02:18 -07:00
remote-fd.c
remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
repack.c repack: avoid loosening promisor objects in partial clones 2021-04-28 13:36:13 +09:00
replace.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
rerere.c
reset.c reset: free instead of leaking unneeded ref 2021-03-14 15:57:59 -07:00
rev-list.c rev-list: allow filtering of provided items 2021-04-19 14:09:11 -07:00
rev-parse.c rev-parse: add option for absolute or relative path formatting 2020-12-12 23:35:51 -08:00
revert.c sequencer: fix edit handling for cherry-pick and revert messages 2021-03-31 14:10:50 -07:00
rm.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
send-pack.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
shortlog.c Merge branch 'ab/mailmap' 2021-01-25 14:19:19 -08:00
show-branch.c Merge branch 'jt/interpret-branch-name-fallback' 2020-09-09 13:53:09 -07:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c refs: switch peel_ref() to peel_iterated_oid() 2021-01-21 15:51:31 -08:00
sparse-checkout.c Merge branch 'ds/sparse-index-protections' 2021-04-30 13:50:26 +09:00
stash.c Merge branch 'dl/stash-show-untracked-fixup' 2021-05-16 21:05:24 +09:00
stripspace.c
submodule--helper.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
unpack-file.c
unpack-objects.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
update-index.c update-index: ensure full index 2021-04-14 13:47:29 -07:00
update-ref.c update-ref: disallow "start" for ongoing transactions 2020-11-16 13:44:01 -08:00
update-server-info.c
upload-archive.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
upload-pack.c
var.c
verify-commit.c Merge branch 'jk/no-system-includes-in-dot-c' 2019-07-31 14:38:56 -07:00
verify-pack.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
verify-tag.c
worktree.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
write-tree.c