git/builtin
Taylor Blau a7d493833f builtin/pack-objects.c: --cruft with expiration
In a previous patch, pack-objects learned how to generate a cruft pack
so long as no objects are dropped.

This patch teaches pack-objects to handle the case where a non-never
`--cruft-expiration` value is passed. This case is slightly more
complicated than before, because we want pack-objects to save
unreachable objects that would otherwise have been pruned whenever
another recent (i.e., non-prunable) unreachable object reaches them.
We'll call these objects "unreachable but reachable-from-recent".

Here is how pack-objects handles `--cruft-expiration`:

  - Instead of adding all objects outside of the kept pack(s) into the
    packing list, only handle the ones whose mtime is within the grace
    period.

  - Construct a reachability traversal whose tips are the
    unreachable-but-recent objects.

  - Then, walk along that traversal, stopping if we reach an object in
    the kept pack. At each step along the traversal, we add the object
    we are visiting to the packing list.

In the majority of these cases, any object we visit in this traversal
will already be in our packing list. But we will sometimes encounter
reachable-from-recent cruft objects, which we want to retain even if
they aged out of the grace period.
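
The three steps above can be sketched as a toy model. This is not the
code in builtin/pack-objects.c: the `Obj` class, the `select_cruft`
function, the integer timestamps, and the choice to treat
`mtime >= expiration` as "within the grace period" are all assumptions
made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical stand-in for an object known to the repository."""
    oid: str
    mtime: int
    refs: list = field(default_factory=list)  # oids this object points to
    in_kept_pack: bool = False


def select_cruft(objects, expiration):
    """Return the set of oids a toy cruft pack would contain.

    `objects` maps oid -> Obj. Assumption: an object is "recent" when
    its mtime is >= expiration.
    """
    # Step 1: start from non-kept objects still inside the grace period.
    recent = [o.oid for o in objects.values()
              if not o.in_kept_pack and o.mtime >= expiration]
    packing = set(recent)

    # Steps 2 and 3: walk from the recent tips, stopping at kept packs.
    stack = list(recent)
    seen = set()
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        obj = objects[oid]
        if obj.in_kept_pack:
            continue  # halt here: neither pack it nor descend further
        packing.add(oid)  # usually already present, sometimes a rescue
        stack.extend(obj.refs)
    return packing


objects = {
    "A": Obj("A", mtime=100, refs=["T", "K"]),      # recent tip
    "T": Obj("T", mtime=10, refs=["B"]),            # old, reached from A
    "B": Obj("B", mtime=5),                         # old, reached via T
    "K": Obj("K", mtime=1, refs=["X"], in_kept_pack=True),
    "X": Obj("X", mtime=1),                         # only reached through K
    "Z": Obj("Z", mtime=2),                         # old, reached by nothing
}
result = select_cruft(objects, expiration=50)
```

In this example `T` and `B` are the reachable-from-recent objects that
get rescued, while `Z` is dropped and the walk never looks past `K`.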

The most subtle point of this process is that we actually don't need to
bother to update the rescued object's mtime. Even though we will write
an .mtimes file with a value that is older than the expiration window,
the rescued object will continue to survive cruft repacks so long as
any objects which reach it haven't aged out.

That is, a future repack will also exclude that object from the initial
packing list, only to discover it later on when doing the reachability
traversal.
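
A small simulation may make this concrete. It is a sketch under
assumptions, not git's implementation: `cruft_repack` and `repack` are
hypothetical helpers, timestamps are plain integers, and "recent" is
taken to mean `mtime >= expiration`.

```python
def cruft_repack(mtimes, refs, expiration):
    """Return the oids that survive one toy cruft repack."""
    # Initial packing list: only objects inside the grace period.
    tips = [oid for oid, t in mtimes.items() if t >= expiration]
    kept, stack = set(), list(tips)
    # Reachability traversal from the recent tips.
    while stack:
        oid = stack.pop()
        if oid in kept or oid not in mtimes:
            continue
        kept.add(oid)
        stack.extend(refs.get(oid, ()))
    return kept


def repack(mtimes, refs, expiration):
    """Write a new toy .mtimes table, copying old mtimes unchanged."""
    survivors = cruft_repack(mtimes, refs, expiration)
    return {oid: mtimes[oid] for oid in survivors}


mtimes = {"commit": 100, "tree": 10}   # "tree" is far older than "commit"
refs = {"commit": ["tree"]}

first = repack(mtimes, refs, expiration=50)    # tree rescued via commit
second = repack(first, refs, expiration=60)    # rescued again, same mtime
third = repack(second, refs, expiration=150)   # commit aged out as well
```

The stale mtime of `tree` is carried through `first` and `second`
untouched, yet it is rediscovered by the traversal each time. Only once
`commit` itself ages out (in `third`) do both objects disappear.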

Finally, stopping early once an object is found in a kept pack is safe
to do because the kept packs ordinarily represent which packs will
survive after repacking. If it were _not_ safe to halt the traversal
early, some ancestor object would have to be missing, which implies
repository corruption (i.e., the complete set of reachable objects
isn't present).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26 15:48:26 -07:00
add.c add: remove support for git-legacy-stash 2022-01-27 18:00:15 -08:00
am.c Merge branch 'ab/date-mode-release' 2022-02-25 15:47:36 -08:00
annotate.c
apply.c
archive.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
bisect--helper.c Merge branch 'ac/usage-string-fixups' 2022-03-06 21:25:32 -08:00
blame.c Merge branch 'ja/i18n-common-messages' 2022-02-25 15:47:35 -08:00
branch.c Merge branch 'gc/branch-recurse-submodules' 2022-02-18 13:53:29 -08:00
bugreport.c hook-list.h: add a generated list of hooks, like config-list.h 2021-09-27 09:44:54 -07:00
bundle.c bundle: call strvec_clear() on allocated strvec 2022-03-04 13:24:18 -08:00
cat-file.c Merge branch 'jc/cat-file-batch-default-format-optim' 2022-03-23 14:09:31 -07:00
check-attr.c
check-ignore.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c pkt-line.[ch]: remove unused packet_read_line_buf() 2021-10-15 13:09:40 -07:00
checkout-index.c checkout-index: integrate with sparse index 2022-01-13 13:49:45 -08:00
checkout.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
clean.c Merge branch 'vd/sparse-clean-etc' 2022-02-17 16:25:05 -08:00
clone.c Merge branch 'ds/partial-bundles' 2022-03-21 15:14:24 -07:00
column.c column: fix parsing of the '--nl' option 2021-08-26 14:36:27 -07:00
commit-graph.c commit-graph: fix memory leak in misused string_list API 2022-03-04 13:24:18 -08:00
commit-tree.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
commit.c hooks: fix an obscure TOCTOU "did we just run a hook?" race 2022-03-07 13:00:53 -08:00
config.c Merge branch 'mf/fix-type-in-config-h' 2022-03-16 17:53:07 -07:00
count-objects.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c credential-cache: check for windows specific errors 2021-09-14 09:30:54 -07:00
credential-store.c Use a better name for the function interpolating paths 2021-07-26 12:17:16 -07:00
credential.c doc: fix git credential synopsis 2021-10-28 09:57:09 -07:00
describe.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c diff-index: restore -c/--cc options handling 2021-09-07 11:11:35 -07:00
diff-tree.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
diff.c builtin/diff.c: fix "git-diff" usage string typo 2022-02-02 11:30:53 -08:00
difftool.c i18n: factorize more 'incompatible options' messages 2022-02-04 13:58:28 -08:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
fast-import.c Merge branch 'ns/core-fsyncmethod' 2022-03-25 16:38:24 -07:00
fetch-pack.c Merge branch 'rc/fetch-refetch' 2022-04-04 10:56:23 -07:00
fetch.c Merge branch 'rc/fetch-refetch' 2022-04-04 10:56:23 -07:00
fmt-merge-msg.c merge: allow to pretend a merge is made into a different branch 2021-12-20 14:55:02 -08:00
for-each-ref.c for-each-ref: delay parsing of --sort=<atom> options 2021-10-20 14:33:07 -07:00
for-each-repo.c builtin/for-each-repo: remove unnecessary argv copy to plug leak 2021-07-26 12:19:20 -07:00
fsck.c run-command API users: use strvec_pushl(), not argv construction 2021-11-25 22:15:07 -08:00
fsmonitor--daemon.c fsmonitor--daemon: use a cookie file to sync with file system 2022-03-25 16:04:17 -07:00
gc.c builtin/gc.c: delete duplicate include 2022-03-13 22:23:16 +00:00
get-tar-commit-id.c
grep.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
hash-object.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
help.c Merge branch 'ab/help-fixes' 2022-03-09 13:38:24 -08:00
hook.c git hook run: add an --ignore-missing flag 2022-01-07 15:19:34 -08:00
index-pack.c Merge branch 'ns/core-fsyncmethod' 2022-03-25 16:38:24 -07:00
init-db.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
interpret-trailers.c
log.c Merge branch 'ab/grep-patterntype' 2022-02-25 15:47:36 -08:00
ls-files.c ls-files: support --recurse-submodules --stage 2022-02-23 16:41:55 -08:00
ls-remote.c ls-remote & transport API: release "struct transport_ls_refs_options" 2022-02-06 18:02:34 -08:00
ls-tree.c Merge branch 'tl/ls-tree-oid-only' 2022-04-06 15:21:59 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c am/apply: warn if we end up reading patches from terminal 2022-03-03 14:00:32 -08:00
merge-base.c merge-base: free() allocated "struct commit **" list 2022-03-04 13:24:17 -08:00
merge-file.c xdiff: implement a zealous diff3, or "zdiff3" 2021-12-01 14:45:58 -08:00
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
merge-recursive.c gettext API users: don't explicitly cast ngettext()'s "n" 2022-03-07 11:57:52 -08:00
merge-tree.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
merge.c hooks: fix an obscure TOCTOU "did we just run a hook?" race 2022-03-07 13:00:53 -08:00
mktag.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
mktree.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
multi-pack-index.c builtin/multi-pack-index.c: don't leak concatenated options 2021-10-28 15:32:14 -07:00
mv.c mv: refuse to move sparse paths 2021-09-28 10:31:02 -07:00
name-rev.c name-rev: use generation numbers if available 2022-03-13 18:39:29 +00:00
notes.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
pack-objects.c builtin/pack-objects.c: --cruft with expiration 2022-05-26 15:48:26 -07:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c patch-id: fix scan_hunk_header on diffs with 1 line of before/after 2022-02-02 11:24:23 -08:00
prune-packed.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
prune.c Merge branch 'ns/tmp-objdir' 2022-01-03 16:24:15 -08:00
pull.c Merge branch 'ja/i18n-common-messages' 2022-02-25 15:47:35 -08:00
push.c i18n: factorize "invalid value" messages 2022-02-04 13:58:28 -08:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c read-tree: make three-way merge sparse-aware 2022-03-01 12:36:01 -08:00
rebase.c rebase: set REF_HEAD_DETACH in checkout_up_to_date() 2022-03-18 09:48:53 -07:00
receive-pack.c Merge branch 'ab/string-list-count-in-size-t' 2022-03-16 17:53:09 -07:00
reflog.c reflog: fix 'show' subcommand's argv 2022-03-28 15:45:46 -07:00
remote-ext.c
remote-fd.c
remote.c Merge branch 'tb/rename-remote-progress' 2022-03-16 17:53:08 -07:00
repack.c pack-mtimes: support reading .mtimes files 2022-05-26 15:48:26 -07:00
replace.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
rerere.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
reset.c reset: show --no-refresh in the short-help 2022-03-24 13:36:21 -07:00
rev-list.c Merge branch 'ds/partial-bundles' 2022-03-21 15:14:24 -07:00
rev-parse.c refs: drop "broken" flag from for_each_fullref_in() 2021-09-27 12:36:45 -07:00
revert.c Merge branch 'ds/mergies-with-sparse-index' 2021-09-20 15:20:45 -07:00
rm.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
send-pack.c i18n: factorize "invalid value" messages 2022-02-04 13:58:28 -08:00
shortlog.c string-list API: change "nr" and "alloc" to "size_t" 2022-03-07 12:02:04 -08:00
show-branch.c date API: create a date.h, split from cache.h 2022-02-16 09:40:00 -08:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c refs: switch peel_ref() to peel_iterated_oid() 2021-01-21 15:51:31 -08:00
sparse-checkout.c Merge branch 'ep/remove-duplicated-includes' 2022-03-23 14:09:30 -07:00
stash.c Merge branch 'vd/stash-silence-reset' 2022-03-30 18:01:10 -07:00
stripspace.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
submodule--helper.c i18n: fix some badly formatted i18n strings 2022-04-11 14:13:46 -07:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
unpack-file.c
unpack-objects.c object-file API: have hash_object_file() take "enum object_type" 2022-02-25 17:16:32 -08:00
update-index.c fsmonitor: config settings are repository-specific 2022-03-25 16:04:15 -07:00
update-ref.c update-ref: fix streaming of status updates 2021-09-03 11:35:15 -07:00
update-server-info.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
upload-archive.c upload-archive: use regular "struct child_process" pattern 2021-11-25 22:15:07 -08:00
upload-pack.c upload-pack: document and rename --advertise-refs 2021-08-05 08:59:37 -07:00
var.c var: add GIT_DEFAULT_BRANCH variable 2021-11-03 13:25:36 -07:00
verify-commit.c
verify-pack.c
verify-tag.c
worktree.c Merge branch 'pw/worktree-list-with-z' 2022-04-04 10:56:25 -07:00
write-tree.c