Commit graph

45 commits

Author SHA1 Message Date
Junio C Hamano
58ecb2e383 Merge branch 'tb/gc-recent-object-hook'
"git pack-objects" learned to invoke a new hook program that
enumerates extra objects to be used as anchoring points to keep
otherwise unreachable objects in cruft packs.

* tb/gc-recent-object-hook:
  gc: introduce `gc.recentObjectsHook`
  reachable.c: extract `obj_is_recent()`
2023-06-23 11:21:17 -07:00
Taylor Blau
4dc16e2cb0 gc: introduce gc.recentObjectsHook
This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.

When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.

However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:

  - Point references at the reachability tips of any objects you
    consider precious, which may be undesirable or infeasible if there
    are many such objects.

  - Track them via the reflog, which may be undesirable since the
    reflog's lifetime is limited to that of the reference it's tracking
    (and callers may want to keep those unreachable objects around for
    longer).

  - Extend the grace period, which may keep around other objects that
    the caller *does* want to discard.

  - Manually modify the mtimes of objects you want to keep. If those
    objects are already loose, this is easy enough to do (you can just
    enumerate and `touch -m` each one).

    But if they are packed, you will either end up modifying the mtimes
    of *all* objects in that pack, or be forced to write out a loose
    copy of that object, both of which may be undesirable. Even worse,
    if they are in a cruft pack, that requires modifying its `*.mtimes`
    file by hand, since there is no exposed plumbing for this.

  - Force the caller to construct the pack of objects they want
    to keep themselves, and then mark the pack as kept by adding a
    ".keep" file. This works, but is burdensome for the caller, and
    having extra packs is awkward as you roll forward your cruft pack.

This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.

The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.

This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".

Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:

  - If generating a cruft pack, the caller is able to retain additional
    objects via the cruft pack, even if they would have otherwise been
    pruned due to their age.

  - If not generating a cruft pack, the caller is likewise able to
    retain additional objects as loose.

A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.

But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.

Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 14:12:20 -07:00
Jeff King
8ddfce7144 t: drop "verbose" helper function
We have a small helper function called "verbose", with the idea that you
can write:

  verbose foo

to get a message to stderr when the "foo" command fails, even if it does
not produce any output itself. This goes back to 8ad1652418 (t5304: use
helper to report failure of "test foo = bar", 2014-10-10). It does work,
but overall it has not been a big success for two reasons:

  1. Test writers have to remember to put it there (and the resulting
     test code is longer as a result).

  2. It doesn't handle the opposite case (we expect "foo" to fail, but
     it succeeds), leading to inconsistencies in tests (which you can
     see in many hunks of this patch, e.g. ones involving "has_cr").

Most importantly, we added a136f6d8ff (test-lib.sh: support -x option
for shell-tracing, 2014-10-10) at the same time, and it does roughly the
same thing. The output is not quite as succinct as "verbose", and you
have to watch out for stray shell-traces ending up in stderr. But it
solves both of the problems above, and has clearly become the preferred
tool.

Let's consider the "verbose" function a failed experiment and remove the
last few callers (which are all many years old, and have been dwindling
as we remove them from scripts we touch for other reasons). It will be
one less thing for new test writers to see and wonder if they should be
using themselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08 14:50:28 -07:00
Taylor Blau
b934207a22 t/t5304-prune.sh: prepare for gc --cruft by default
Many of the tests in t5304 run `git gc`, and rely on its behavior that
unreachable-but-recent objects are written out loose. This is sensible,
since t5304 deals specifically with this kind of pruning.

If left unattended, however, this test would break when the default
behavior of a bare "git gc" is adjusted to generate a cruft pack by
default.

Ensure that these tests continue to work as-is (and continue to provide
coverage of loose object pruning) by passing `--no-cruft` explicitly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
Junio C Hamano
87daf40750 Merge branch 'ab/config-multi-and-nonbool'
Assorted config API updates.

* ab/config-multi-and-nonbool:
  for-each-repo: with bad config, don't conflate <path> and <cmd>
  config API: add "string" version of *_value_multi(), fix segfaults
  config API users: test for *_get_value_multi() segfaults
  for-each-repo: error on bad --config
  config API: have *_multi() return an "int" and take a "dest"
  versioncmp.c: refactor config reading next commit
  config API: add and use a "git_config_get()" family of functions
  config tests: add "NULL" tests for *_get_value_multi()
  config tests: cover blind spots in git_die_config() tests
2023-04-06 13:38:29 -07:00
Ævar Arnfjörð Bjarmason
258902ce07 config tests: cover blind spots in git_die_config() tests
There were no tests checking for the output of the git_die_config()
function in the config API, added in 5a80e97c82 (config: add
`git_die_config()` to the config-set API, 2014-08-07). We only tested
"test_must_fail", but didn't assert the output.

We need tests for this because a subsequent commit will alter the
return value of git_config_get_value_multi(), which is used to get the
config values in the git_die_config() function. This test coverage
helps to build confidence in that subsequent change.

These tests cover different interactions with git_die_config():

- The "notes.mergeStrategy" test in
  "t/t3309-notes-merge-auto-resolve.sh" is a case where a function
  outside of config.c (git_config_get_notes_strategy()) calls
  git_die_config().

- The "gc.pruneExpire" test in "t5304-prune.sh" is a case where
  git_config_get_expiry() calls git_die_config(), covering a different
  "type" than the "string" test for "notes.mergeStrategy".

- The "fetch.negotiationAlgorithm" test in
  "t/t5552-skipping-fetch-negotiator.sh" is a case where
  git_config_get_string*() calls git_die_config().

We also cover both the "from command-line config" and "in file..at
line" cases here.

The clobbering of existing ".git/config" files here is so that we're
not implicitly testing the line count of the default config.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
Eric Wong
6974765352 prune: quiet ENOENT on missing directories
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories.  Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 15:58:54 +09:00
Elijah Newren
f222bd34ff tests: remove leftover untracked files
Remove untracked files that are unwanted after they are done being used.

While the set of cases in this patch is certainly far from
comprehensive, it was motivated by some work to see what the fallout
would be if we were to make the removal of untracked files as a side
effect of other commands into an error.  Some cases were a bit more
involved than the testcase changes included in this patch, but the ones
included here represent the simple cases.  While this patch is not that
important since we are not changing the behavior of those other commands
into an error in the near term, I thought these changes were useful
anyway as an explicit documentation of the intent that these untracked
files are no longer useful.

Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-12 16:42:40 -07:00
Junio C Hamano
c9780bb2ca Merge branch 'hn/prep-tests-for-reftable'
Preliminary clean-up of tests before the main reftable changes
hits the codebase.

* hn/prep-tests-for-reftable: (22 commits)
  t1415: set REFFILES for test specific to storage format
  t4202: mark bogus head hash test with REFFILES
  t7003: check reflog existence only for REFFILES
  t7900: stop checking for loose refs
  t1404: mark tests that muck with .git directly as REFFILES.
  t2017: mark --orphan/logAllRefUpdates=false test as REFFILES
  t1414: mark corruption test with REFFILES
  t1407: require REFFILES for for_each_reflog test
  test-lib: provide test prereq REFFILES
  t5304: use "reflog expire --all" to clear the reflog
  t5304: restyle: trim empty lines, drop ':' before >
  t7003: use rev-parse rather than FS inspection
  t5000: inspect HEAD using git-rev-parse
  t5000: reformat indentation to the latest fashion
  t1301: fix typo in error message
  t1413: use tar to save and restore entire .git directory
  t1401-symbolic-ref: avoid direct filesystem access
  t1401: use tar to snapshot and restore repo state
  t5601: read HEAD using rev-parse
  t9300: check ref existence using test-helper rather than a file system check
  ...
2021-07-13 16:52:50 -07:00
Han-Wen Nienhuys
d491f5ea07 t5304: use "reflog expire --all" to clear the reflog
This test checks that unreachable objects are really removed. For the test to
work, it has to ensure that no reflog retain any reachable objects.

Previously, it did this by manipulating the file system to remove reflog in the
first test, and relying on git not updating the reflog if the relevant logfile
doesn't exist in follow-up tests.

Now, explicitly clear the reflog using 'reflog expire'. This reduces the
dependency between test functions. It also is more amenable to use with
reftable, which has no concept of (non)-existence of a reflog

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
Han-Wen Nienhuys
1fa9cf6ea1 t5304: restyle: trim empty lines, drop ':' before >
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:01:54 +09:00
Jeff King
2ba582ba4c prune: save reachable-from-recent objects with bitmaps
We pass our prune expiration to mark_reachable_objects(), which will
traverse not only the reachable objects, but consider any recent ones as
tips for reachability; see d3038d22f9 (prune: keep objects reachable
from recent objects, 2014-10-15) for details.

However, this interacts badly with the bitmap code path added in
fde67d6896 (prune: use bitmaps for reachability traversal, 2019-02-13).
If we hit the bitmap-optimized path, we return immediately to avoid the
regular traversal, accidentally skipping the "also traverse recent"
code.

Instead, we should do an if-else for the bitmap versus regular
traversal, and then follow up with the "recent" traversal in either
case. This reuses the "rev_info" for a bitmap and then a regular
traversal, but that should work OK (the bitmap code clears the pending
array in the usual way, just like a regular traversal would).

Note that I dropped the comment above the regular traversal here.  It
has little explanatory value, and makes the if-else logic much harder to
read.

Here are a few variants that I rejected:

  - it seems like both the reachability and recent traversals could be
    done in a single traversal. This was rejected by d3038d22f9 (prune:
    keep objects reachable from recent objects, 2014-10-15), though the
    balance may be different when using bitmaps. However, there's a
    subtle correctness issue, too: we use revs->ignore_missing_links for
    the recent traversal, but not the reachability one.

  - we could try using bitmaps for the recent traversal, too, which
    could possibly improve performance. But it would require some fixes
    in the bitmap code, which uses ignore_missing_links for its own
    purposes. Plus it would probably not help all that much in practice.
    We use the reachable tips to generate bitmaps, so those objects are
    likely not covered by bitmaps (unless they just became unreachable).
    And in general, we expect the set of unreachable objects to be much
    smaller anyway, so there's less to gain.

The test in t5304 detects the bug and confirms the fix.

I also beefed up the tests in t6501, which covers the mtime-checking
code more thoroughly, to handle the bitmap case (in addition to just
"loose" and "packed" cases). Interestingly, this test doesn't actually
detect the bug, because it is running "git gc", and not "prune"
directly. And "gc" will call "repack" first, which does not suffer the
same bug. So the old-but-reachable-from-recent objects get scooped up
into the new pack along with the actually-recent objects, which gives
both a recent mtime. But it seemed prudent to get more coverage of the
bitmap case for related code.

Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-29 10:38:25 +09:00
Johannes Schindelin
966b4be276 t5[0-4]*: adjust the references to the default branch name "main"
Carefully excluding t5310, which is developed independently of the
current patch series at the time of writing, we now use `main` as
default branch in t5[0-4]*. This trick was performed via

	$ (cd t &&
	   sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
		-e 's/Master/Main/g' -- t5[0-4]*.sh &&
	   git checkout HEAD -- t5310\*)

This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:18 -08:00
Johannes Schindelin
334afbc76f tests: mark tests relying on the current default for init.defaultBranch
In addition to the manual adjustment to let the `linux-gcc` CI job run
the test suite with `master` and then with `main`, this patch makes sure
that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts
that currently rely on the initial branch name being `master by default.

To determine which test scripts to mark up, the first step was to
force-set the default branch name to `master` in

- all test scripts that contain the keyword `master`,

- t4211, which expects `t/t4211/history.export` with a hard-coded ref to
  initialize the default branch,

- t5560 because it sources `t/t556x_common` which uses `master`,

- t8002 and t8012 because both source `t/annotate-tests.sh` which also
  uses `master`)

This trick was performed by this command:

	$ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\
	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
	' $(git grep -l master t/t[0-9]*.sh) \
	t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh

After that, careful, manual inspection revealed that some of the test
scripts containing the needle `master` do not actually rely on a
specific default branch name: either they mention `master` only in a
comment, or they initialize that branch specificially, or they do not
actually refer to the current default branch. Therefore, the
aforementioned modification was undone in those test scripts thusly:

	$ git checkout HEAD -- \
		t/t0027-auto-crlf.sh t/t0060-path-utils.sh \
		t/t1011-read-tree-sparse-checkout.sh \
		t/t1305-config-include.sh t/t1309-early-config.sh \
		t/t1402-check-ref-format.sh t/t1450-fsck.sh \
		t/t2024-checkout-dwim.sh \
		t/t2106-update-index-assume-unchanged.sh \
		t/t3040-subprojects-basic.sh t/t3301-notes.sh \
		t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \
		t/t3436-rebase-more-options.sh \
		t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \
		t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \
		t/t5511-refspec.sh t/t5526-fetch-submodules.sh \
		t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \
		t/t5548-push-porcelain.sh \
		t/t5552-skipping-fetch-negotiator.sh \
		t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \
		t/t5614-clone-submodules-shallow.sh \
		t/t7508-status.sh t/t7606-merge-custom.sh \
		t/t9302-fast-import-unpack-limit.sh

We excluded one set of test scripts in these commands, though: the range
of `git p4` tests. The reason? `git p4` stores the (foreign) remote
branch in the branch called `p4/master`, which is obviously not the
default branch. Manual analysis revealed that only five of these tests
actually require a specific default branch name to pass; They were
modified thusly:

	$ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\
	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
	' t/t980[0167]*.sh t/t9811*.sh

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 15:44:17 -08:00
Jeff King
fe6f2b081f t5304: add a test for pruning with bitmaps
Commit fde67d6896 (prune: use bitmaps for reachability traversal,
2019-02-13) uses bitmaps for pruning when they're available, but only
covers this functionality in the t/perf tests. This makes a kind of
sense, since the point is that the behaviour is indistinguishable before
and after the patch, just faster.

But since the bitmap code path is not exercised at all in the regular
test suite, it leaves us open to a regression where the behavior does in
fact change. The most thorough way to test that would be running the
whole suite with bitmaps enabled. But we don't yet have a way to do
that, and anyway it's expensive to do so. Let's at least add a basic
test that exercises this path and make sure we prune an object we should
(and not one that we shouldn't).

That would hopefully catch the most obvious breakages early.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-19 14:22:32 +09:00
Jeff King
cc80c95f42 t5304: rename "sha1" variables to "oid"
Let's make the script less jarring to read in a post-sha1 world by
using more hash-agnostic variable names.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14 15:25:56 -08:00
Jeff King
d55a30bb1d prune: lazily perform reachability traversal
The general strategy of "git prune" is to do a full reachability walk,
then for each loose object see if we found it in our walk. But if we
don't have any loose objects, we don't need to do the expensive walk in
the first place.

This patch postpones that walk until the first time we need to see its
results.

Note that this is really a specific case of a more general optimization,
which is that we could traverse only far enough to find the object under
consideration (i.e., stop the traversal when we find it, then pick up
again when asked about the next object, etc). That could save us in some
instances from having to do a full walk. But it's actually a bit tricky
to do with our traversal code, and you'd need to do a full walk anyway
if you have even a single unreachable object (which you generally do, if
any objects are actually left after running git-repack).

So in practice this lazy-load of the full walk catches one easy but
common case (i.e., you've just repacked via git-gc, and there's nothing
unreachable).

The perf script is fairly contrived, but it does show off the
improvement:

  Test                            HEAD^             HEAD
  -------------------------------------------------------------------------
  5304.4: prune with no objects   3.66(3.60+0.05)   0.00(0.00+0.00) -100.0%

and would let us know if we accidentally regress this optimization.

Note also that we need to take special care with prune_shallow(), which
relies on us having performed the traversal. So this optimization can
only kick in for a non-shallow repository. Since this is easy to get
wrong and is not covered by existing tests, let's add an extra test to
t5304 that covers this case explicitly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14 15:25:32 -08:00
SZEDER Gábor
1c5e94f459 tests: use 'test_must_be_empty' instead of 'test_cmp <empty> <out>'
Using 'test_must_be_empty' is shorter and more idiomatic than

  >empty &&
  test_cmp empty out

as it saves the creation of an empty file.  Furthermore, sometimes the
expected empty file doesn't have such a descriptive name like 'empty',
and its creation is far away from the place where it's finally used
for comparison (e.g. in 't7600-merge.sh', where two expected empty
files are created in the 'setup' test, but are used only about 500
lines later).

These cases were found by instrumenting 'test_cmp' to error out the
test script when it's used to compare empty files, and then converted
manually.

Note that even after this patch there still remain a lot of cases
where we use 'test_cmp' to check empty files:

  - Sometimes the expected output is not hard-coded in the test, but
    'test_cmp' is used to ensure that two similar git commands produce
    the same output, and that output happens to be empty, e.g. the
    test 'submodule update --merge  - ignores --merge  for new
    submodules' in 't7406-submodule-update.sh'.

  - Repetitive common tasks, including preparing the expected results
    and running 'test_cmp', are often extracted into a helper
    function, and some of this helper's callsites expect no output.

  - For the same reason as above, the whole 'test_expect_success'
    block is within a helper function, e.g. in 't3070-wildmatch.sh'.

  - Or 'test_cmp' is invoked in a loop, e.g. the test 'cvs update
    (-p)' in 't9400-git-cvsserver-server.sh'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21 11:48:36 -07:00
Junio C Hamano
3915f9a4fa Merge branch 'jc/parseopt-expiry-errors'
"git gc --prune=nonsense" spent long time repacking and then
silently failed when underlying "git prune --expire=nonsense"
failed to parse its command line.  This has been corrected.

* jc/parseopt-expiry-errors:
  parseopt: handle malformed --expire arguments more nicely
  gc: do not upcase error message shown with die()
2018-05-08 15:59:33 +09:00
Junio C Hamano
8ab5aa4bd8 parseopt: handle malformed --expire arguments more nicely
A few commands that parse --expire=<time> command line option behave
sillily when given nonsense input.  For example

    $ git prune --no-expire
    Segmentation falut
    $ git prune --expire=npw; echo $?
    129

Both come from parse_opt_expiry_date_cb().

The former is because the function is not prepared to see arg==NULL
(for "--no-expire", it is a norm; "--expire" at the end of the
command line could be made to pass NULL, if it is told that the
argument is optional, but we don't so we do not have to worry about
that case).

The latter is because it does not check the value returned from the
underlying parse_expiry_date().

This seems to be a recent regression introduced while we attempted
to avoid spewing the entire usage message when given a correct
option but with an invalid value at 3bb0923f ("parse-options: do not
show usage upon invalid option value", 2018-03-22).  Before that, we
didn't fail silently but showed a full usage help (which arguably is
not all that better).

Also catch this error early when "git gc --prune=<expiration>" is
misspelled by doing a dummy parsing before the main body of "gc"
that is time consuming even begins.  Otherwise, we'd spend time to
pack objects and then later have "git prune" first notice the error.
Aborting "gc" in the middle that way is not harmful but is ugly and
can be avoided.

Helped-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-23 22:36:59 +09:00
Nguyễn Thái Ngọc Duy
0e496492d2 t/helper: merge test-chmtime into test-tool
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27 08:45:47 -07:00
Nguyễn Thái Ngọc Duy
acd9544a8f revision.c: --reflog add HEAD reflog from all worktrees
Note that add_other_reflogs_to_pending() is a bit inefficient, since
it scans reflog for all refs of each worktree, including shared refs,
so the shared ref's reflog is scanned over and over again.

We could update refs API to pass "per-worktree only" flag to avoid
that. But long term we should be able to obtain a "per-worktree only"
ref store and would need to revert the changes in reflog iteration
API. So let's just wait until then.

add_reflogs_to_pending() is called by reachable.c so by default "git
prune" will examine reflog from all worktrees.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:58:47 -07:00
Nguyễn Thái Ngọc Duy
d0c39a49cc revision.c: --all adds HEAD from all worktrees
Unless single_worktree is set, --all now adds HEAD from all worktrees.

Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.

A new FIXME is added because we would need something like

    int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);

in addition to other_head_refs() to handle it, which might require

    int get_submodule_worktrees(const char *submodule, int flags);

It could be a separate topic to reduce the scope of this one.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:56:43 -07:00
Nguyễn Thái Ngọc Duy
be489d02d2 revision.c: --indexed-objects add objects from all worktrees
This is the result of single_worktree flag never being set (no way to up
until now). To get objects from current index only, set single_worktree.

The other add_index_objects_to_pending's caller is mark_reachable_objects()
(e.g. "git prune") which also mark objects from all indexes.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:44:41 -07:00
Elia Pinto
213ea1161c t/t5304-prune.sh: use the $( ... ) construct for command substitution
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.

The backquoted form is the traditional method for command
substitution, and is supported by POSIX.  However, all but the
simplest uses become complicated quickly.  In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.

The patch was generated by:

for _f in $(find . -name "*.sh")
do
	perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg'  "${_f}"
done

and then carefully proof-read.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-28 13:37:02 -08:00
Doug Kelly
478f34d2b6 gc: remove garbage .idx files from pack dir
Add a custom report_garbage handler to collect and remove
garbage .idx files from the pack directory.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-04 11:30:22 -08:00
Doug Kelly
e6d65c9a47 t5304: test cleaning pack garbage
Pack garbage, noticeably stale .idx files, can be cleaned up during
a garbage collection.  This tests to ensure such garbage is properly
cleaned up.

Note that the prior test for checking pack garbage with count-objects
left some stale garbage after the test exited.  This has also been
corrected.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-04 11:30:10 -08:00
Jeff King
f813e9ea5f for_each_packed_object: automatically open pack index
When for_each_packed_object is called, we call
prepare_packed_git() to make sure we have the actual list of
packs. But the latter does not actually open the pack
indices, meaning that pack->nr_objects may simply be 0 if
the pack has not otherwise been used since the program
started.

In practice, this didn't come up for the current callers,
because they iterate the packed objects only after iterating
all reachable objects (so for it to matter you would have to
have a pack consisting only of unreachable objects). But it
is a dangerous and confusing interface that should be fixed
for future callers.

Note that we do not end the iteration when a pack cannot be
opened, but we do return an error. That lets you complete
the iteration even in actively-repacked repository where an
.idx file may racily go away, but it also lets callers know
that they may not have gotten the complete list (which the
current reachability-check caller does care about).

We have to tweak one of the prune tests due to the changed
return value; an earlier test creates bogus .idx files and
does not clean them up. Having to make this tweak is a good
thing; it means we will not prune in a broken repository,
and the test confirms that we do not negatively impact a
more lenient caller, count-objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 14:53:58 -07:00
Jonathon Mah
b0a4264277 sha1_file: fix iterating loose alternate objects
The string in 'base' contains a path suffix to a specific object;
when its value is used, the suffix must either be filled (as in
stat_sha1_file, open_sha1_file, check_and_freshen_nonlocal) or
cleared (as in prepare_packed_git) to avoid junk at the end.

660c889e (sha1_file: add for_each iterators for loose and packed
objects, 2014-10-15) introduced loose_from_alt_odb(), but this did
neither and treated 'base' as a complete path to the "base" object
directory, instead of a pointer to the "base" of the full path
string.

The trailing path after 'base' is still initialized to NUL, hiding
the bug in some common cases.  Additionally the descendent
for_each_file_in_obj_subdir() function swallows ENOENT, so an error
only shows if the alternate's path was last filled with a valid
object (where statting /path/to/existing/00/0bjectfile/00 fails).

Signed-off-by: Jonathon Mah <me@JonathonMah.com>
Helped-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-09 14:14:56 -08:00
Jeff King
8ad1652418 t5304: use helper to report failure of "test foo = bar"
For small outputs, we sometimes use:

  test "$(some_cmd)" = "something we expect"

instead of a full test_cmp. The downside of this is that
when it fails, there is no output at all from the script.
Let's introduce a small helper to make tests easier to
debug.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13 11:27:40 -07:00
Jeff King
f1dd90bd19 t5304: use test_path_is_* instead of "test -f"
This is slightly more robust (checking "! test -f" would not
notice a directory of the same name, though that is not
likely to happen here). It also makes debugging easier, as
the test script will output a message on failure.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13 11:27:07 -07:00
Max Kirillov
c40fdd01dd reachable.c: add HEAD to reachability starting commits
HEAD is not explicitly used as a starting commit for
calculating reachability, so if it's detached and reflogs
are disabled it may be pruned.

Add tests which demonstrate it. Test 'prune: prune former HEAD after checking
out branch' also reverts changes to repository.

Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:47:44 -07:00
Justin Lebar
235e8d5914 code and test: fix misuses of "nor"
Signed-off-by: Justin Lebar <jlebar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 15:29:33 -07:00
Nguyễn Thái Ngọc Duy
eab3296c7e prune: clean .git/shallow after pruning objects
This patch teaches "prune" to remove shallow roots that are no longer
reachable from any refs (e.g. when the relevant refs are removed).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:19 -08:00
Nguyễn Thái Ngọc Duy
543c5caa6c count-objects: report garbage files in pack directory too
prepare_packed_git_one() is modified to allow count-objects to hook a
report function to so we don't need to duplicate the pack searching
logic in count-objects.c. When report_pack_garbage is NULL, the
overhead is insignificant.

The garbage is reported with warning() instead of error() in packed
garbage case because it's not an error to have garbage. Loose garbage
is still reported as errors and will be converted to warnings later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-15 08:13:13 -08:00
Michael J Gruber
e3b02bc953 t3306,t5304: avoid clock skew issues
On systems where the local time and file modification time may be out of
sync (e.g. test directory on NFS) t3306 and t5305 can fail because prune
compares times such as "now" (client time) with file modification times
(server times for remote file systems). I.e., these are spurious test
failures.

Avoid this by setting the relevant modification times to the local time.

Noticed on a system with as little as 2s time skew.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-14 10:47:18 -07:00
Adam Simpkins
cbf731ed4e prune: honor --expire=never
Previously, prune treated an expiration time of 0 to mean that no
expire argument was supplied, and everything should be pruned.  As a
result, "prune --expire=never" would prune all unreachable objects,
regardless of their timestamp.

prune can be called with --expire=never automatically by gc, when the
gc.pruneExpire configuration is set to "never".

Signed-off-by: Adam Simpkins <simpkins@facebook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-28 10:28:05 -08:00
Clemens Buchacher
98df233e2d test local clone by copying
Test the effect of an earlier change by f7835a2 (preserve mtime of local
clone, 2009-09-12) to keep stale loose object files stale in the new
repository when a local clone is made by copying files in .git/
directory.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 13:22:29 -07:00
Johannes Schindelin
58e9d9d472 gc: make --prune useful again by accepting an optional parameter
With this patch, "git gc --no-prune" will not prune any loose (and
dangling) object, and "git gc --prune=5.minutes.ago" will prune
all loose objects older than 5 minutes.

This patch benefitted from suggestions by Thomas Rast and Jan Krï¿œger.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-14 21:14:07 -08:00
Brandon Casey
21926fe885 t5304-prune: adjust file mtime based on system time rather than file mtime
test-chmtime can adjust the mtime of a file based on the file's mtime, or
based on the system time. For files accessed over NFS, the file's mtime is
set by the NFS server, and as such may vary a great deal from the NFS
client's system time if the clocks of the client and server are out of
sync. Since these tests are testing the expire feature of git-prune, an
incorrect mtime could cause a file to be expired or not expired incorrectly
and produce a test failure.

Avoid this NFS pitfall by modifying the calls to test-chmtime so that the
mtime is adjusted based on the system time, rather than the file's mtime.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13 18:18:23 -07:00
Junio C Hamano
2b2828b452 Fix executable bits in t/ scripts
Pointed out by Ramsay Jones.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-04 01:38:41 -07:00
Junio C Hamano
fe308f5373 builtin-prune: protect objects listed on the command line
Finally, this resurrects the documented behaviour to protect other
objects listed on the command line from getting pruned.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-27 15:39:57 -07:00
Michele Ballabio
0c62705a0d Add tests for git-prune
It seems that git prune changed behaviour with respect to revisions added
from command line, probably when it became a builtin. Currently, it prints
a short usage and exits: instead, it should take those revisions into
account and not prune them. So add a couple of test to point this out.

We'll be fixing this by switching to parse_options(), so add tests to
detect bogus command line parameters as well, to keep ourselves from
introducing regressions.

Signed-off-by: Michele Ballabio <barra_cuda@katamail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-27 13:55:15 -07:00
Johannes Schindelin
25ee9731c1 gc: call "prune --expire 2.weeks.ago" by default
The only reason we did not call "prune" in git-gc was that it is an
inherently dangerous operation: if there is a commit going on, you will
prune loose objects that were just created, and are, in fact, needed by the
commit object just about to be created.

Since it is dangerous, we told users so.  That led to many users not even
daring to run it when it was actually safe. Besides, they are users, and
should not have to remember such details as when to call git-gc with
--prune, or to call git-prune directly.

Of course, the consequence was that "git gc --auto" gets triggered much
more often than we would like, since unreferenced loose objects (such as
left-overs from a rebase or a reset --hard) were never pruned.

Alas, git-prune recently learnt the option --expire <minimum-age>, which
makes it a much safer operation.  This allows us to call prune from git-gc,
with a grace period of 2 weeks for the unreferenced loose objects (this
value was determined in a discussion on the git list as a safe one).

If you want to override this grace period, just set the config variable
gc.pruneExpire to a different value; an example would be

	[gc]
		pruneExpire = 6.months.ago

or even "never", if you feel really paranoid.

Note that this new behaviour makes "--prune" be a no-op.

While adding a test to t5304-prune.sh (since it really tests the implicit
call to "prune"), also the original test for "prune --expire" was moved
there from t1410-reflog.sh, where it did not belong.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2008-03-12 23:47:01 -07:00
David Steven Tweed
8464010f97 Make git prune remove temporary packs that look like write failures
Write errors when repacking (eg, due to out-of-space conditions)
can leave temporary packs (and possibly other files beginning
with "tmp_") lying around which no existing
codepath removes and which aren't obvious to the casual user.
These can also be multi-megabyte files wasting noticeable space.
Unfortunately there's no way to definitely tell in builtin-prune
that a tmp_ file is not being used by a concurrent process,
such as a fetch. However, it is documented that pruning should
only be done on a quiet repository and --expire is honoured
(using code from Johannes Schindelin, along with a test case
he wrote) so that its safety is the same as that of loose
object pruning.

Since they might be signs of a problem (unlike orphaned loose
objects) the names of any removed files are printed.

Signed-off-by: David Tweed (david.tweed@gmail.com)
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-11 12:22:58 -08:00