Rename detection logic ignored the final line of a file if it is an
incomplete line.
* en/diffcore-delta-final-line-fix:
diffcore-delta: avoid ignoring final 'line' of file
Update to a new feature recently added, "git show-ref --exists".
* tc/show-ref-exists-fix:
builtin/show-ref: treat directory as non-existing in --exists
This test is leak-free since it was added in e137fe3b29 (unit tests: add
TAP unit test framework, 2023-11-09)
Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
TEST_PASSES_SANITIZE_LEAK must be set before sourcing test-lib.sh, as we
say in t/README:
GIT_TEST_PASSING_SANITIZE_LEAK=true skips those tests that haven't
declared themselves as leak-free by setting
"TEST_PASSES_SANITIZE_LEAK=true" before sourcing "test-lib.sh". This
test mode is used by the "linux-leaks" CI target.
GIT_TEST_PASSING_SANITIZE_LEAK=check checks that our
"TEST_PASSES_SANITIZE_LEAK=true" markings are current. Rather than
skipping those tests that haven't set "TEST_PASSES_SANITIZE_LEAK=true"
before sourcing "test-lib.sh" this mode runs them with
"--invert-exit-code". This is used to check that there's a one-to-one
mapping between "TEST_PASSES_SANITIZE_LEAK=true" and those tests that
pass under "SANITIZE=leak". This is especially useful when testing a
series that fixes various memory leaks with "git rebase -x".
In a recent commit we fixed a test where it was set after sourcing
test-lib.sh, leading to confusing results.
To prevent future oversights, let's add a simple check to ensure the
value for TEST_PASSES_SANITIZE_LEAK remains unchanged at test_done().
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test does not leak since a96015a517 (pack-bitmap: plug leak in
find_objects(), 2023-12-14) when the annotation
TEST_PASSES_SANITIZE_LEAK=true was also added.
Unfortunately it was added after test-lib.sh is sourced, which makes
GIT_TEST_PASSING_SANITIZE_LEAK=check error:
$ make SANITIZE=leak GIT_TEST_PASSING_SANITIZE_LEAK=check test T=t6113-rev-list-bitmap-filters.sh
...
make[2]: Entering directory '/tmp/git/git/t'
*** t6113-rev-list-bitmap-filters.sh ***
in GIT_TEST_PASSING_SANITIZE_LEAK=check mode, setting --invert-exit-code for TEST_PASSES_SANITIZE_LEAK != true
ok 1 - set up bitmapped repo
ok 2 - filters fallback to non-bitmap traversal
ok 3 - blob:none filter
ok 4 - blob:none filter with specified blob
ok 5 - blob:limit filter
ok 6 - blob:limit filter with specified blob
ok 7 - tree:0 filter
ok 8 - tree:0 filter with specified blob, tree
ok 9 - tree:1 filter
ok 10 - object:type filter
ok 11 - object:type filter with --filter-provided-objects
ok 12 - combine filter
ok 13 - combine filter with --filter-provided-objects
ok 14 - bitmap traversal with --unpacked
# passed all 14 test(s)
1..14
# faking up non-zero exit with --invert-exit-code
make[2]: *** [Makefile:68: t6113-rev-list-bitmap-filters.sh] Error 1
make[2]: Leaving directory '/tmp/git/git/t'
make[1]: *** [Makefile:55: test] Error 2
make[1]: Leaving directory '/tmp/git/git/t'
make: *** [Makefile:3212: test] Error 2
Let's move the annotation before sourcing test-lib.sh, to make
GIT_TEST_PASSING_SANITIZE_LEAK=check happy.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test is leak-free since it was added in af626ac0e0 (pack-bitmap:
enable reuse from all bitmapped packs, 2023-12-14).
Let's mark it as leak-free to make sure it stays that way (and to reduce
noise when looking for other leak-free scripts after we fix some leaks).
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both t1409 and t3210 exercise parts of git-pack-refs(1). Given that we
must check the on-disk files to verify whether the backend has indeed
packed refs as expected those test suites are deeply tied to the actual
backend that is in use.
Mark the test suites to depend on the REFFILES backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 10f5c52656 (submodule: avoid auto-discovery in
prepare_submodule_repo_env(), 2016-09-01) we fixed a bug when doing a
recursive fetch with submodule in the case where the submodule is broken
due to whatever reason. The test to exercise that the fix works breaks
the submodule by deleting its `HEAD` reference, which will cause us to
not detect the directory as a Git repository.
While this is perfectly fine in theory, this way of breaking the repo
becomes problematic with the current efforts to introduce another refdb
backend into Git. The new reftable backend has a stub HEAD file that
always contains "ref: refs/heads/.invalid" so that tools continue to be
able to detect such a repository. But as the reftable backend will never
delete this file even when asked to delete `HEAD` the current way to
delete the `HEAD` reference will stop working.
Adapt the code to instead delete the objects database. Going back with
this new way to cause breakage confirms that it triggers the infinite
recursion just the same, and there are no equivalent ongoing efforts to
replace the object database with an alternate backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 59c35fac54 (refs/packed-backend.c: implement jump lists to avoid
excluded pattern(s), 2023-07-10) we have implemented logic to handle
excluded refs more efficiently in the "packed" ref backend. This logic
allows us to skip emitting refs completely which we know to not be of
any interest to the caller, which can avoid quite some allocations and
object lookups.
This was wired up via a new `exclude_patterns` parameter passed to the
backend's ref iterator. The backend only needs to handle them on a best
effort basis though, and in fact we only handle it for the "packed-refs"
file, but not for loose references. Consequently, all callers must still
filter emitted refs with those exclude patterns.
The result is that handling exclude patterns is completely optional in
the ref backend, and any future backends may or may not implement it.
Let's thus mark the test for t1419 to depend on the REFFILES prereq.
An alternative would be to introduce a new prereq that tells us whether
the backend under test supports exclude patterns or not. But this does
feel a bit overblown:
- It would either map to the REFFILES prereq, in which case it feels
overengineered because the prereq is only ever relevant to t1419.
- Otherwise, it could auto-detect whether the backend supports exclude
patterns. But this could lead to silent failures in case the support
for this feature breaks at any point in time.
It should thus be good enough to just use the REFFILES prereq for now.
If future backends ever grow support for exclude patterns we can easily
add their respective prereq as another condition for this test suite to
execute.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1302 we exercise logic around "core.repositoryFormatVersion" and
extensions. These tests are not particularly robust against extensions
like the newly introduced "refStorage" extension as we tend to clobber
the repository's config file. We thus overwrite any extensions that were
set, which may render the repository inaccessible in case it has to be
accessed with a non-default ref storage.
Refactor the tests to be more robust:
- Check the DEFAULT_REPO_FORMAT prereq to determine the expected
repository format version. This helps to ensure that we only need to
update the prereq in a central place when new extensions are added.
Furthermore, this allows us to stop seeding the now-unneeded object
ID cache that was only used to figure out the repository version.
- Use a separate repository to rewrite ".git/config" to test
combinations of the repository format version and extensions. This
ensures that we don't break the main test repository. While we could
rewrite these tests to not overwrite preexisting extensions, it
feels cleaner like this so that we can test extensions standalone
without interference from the environment.
- Do not rewrite ".git/config" when exercising the "preciousObjects"
extension.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1301 we verify whether reflog files written by the "files" ref
backend correctly honor permissions when "core.sharedRepository" is set.
The test logic is thus specific to the reffiles backend and will not
work with any other backends.
Mark the test accordingly with the REFFILES prereq.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t1300 test suite exercises the git-config(1) tool. To do so, the
test overwrites ".git/config" to contain custom contents in several
places with code like the following:
```
cat > .git/config <<\EOF
...
EOF
```
While this is easy enough to do, it may create problems when using a
non-default repository format because this causes us to overwrite the
repository format version as well as any potential extensions. With the
upcoming "reftable" ref backend the result is that Git would try to
access refs via the "files" backend even though the repository has been
initialized with the "reftable" backend, which will cause failures when
trying to access any refs.
Ideally, we would rewrite the whole test suite to not depend on state
written by previous tests, but that would result in a lot of changes in
this test suite. Instead, we only refactor tests which access the refdb
to be more robust by using their own separate repositories, which allows
us to be more careful and not discard required extensions.
Note that we also have to touch up how the CUSTOM_CONFIG_FILE gets
accessed. This environment variable contains the relative path to a
custom config file which we're setting up. But because we are now using
subrepositories, this relative path will not be found anymore because
our working directory changes. This issue is addressed by storing the
absolute path to the file in CUSTOM_CONFIG_FILE instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running this:
$ touch foo bar
$ chmod +x foo
$ git -c diff.external=echo diff --ext-diff --no-index foo bar
results in a segfault. The issue is that run_diff_cmd() passes a NULL
"xfrm_msg" variable to run_external_diff(), which feeds it to
strvec_push(), causing the segfault. The bug dates back to 82fbf269b9
(run_external_diff: use an argv_array for the command line, 2014-04-19),
though it mostly only ever worked accidentally. Before then, we just
stuck the NULL pointer into a "const char **" array, so our NULL ended
up acting as an extra end-of-argv sentinel (which was OK, because it was
the last thing in the array).
Curiously, though, this is only a problem with --no-index. We set up
xfrm_msg by calling fill_metainfo(). This result may be empty, or may
have text like "index 1234..5678\n", "rename from foo\nrename from
bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the
"other" variable is not NULL. That variable is set when the paths of the
two sides of the diff pair aren't the same (in which case the
destination path becomes "other"). So normally it would kick in only for
a rename, in which case xfrm_msg should not be NULL (it would have the
rename information in it).
But with a "--no-index" of two blobs, we of course have two different
pathnames, and thus end up with a non-NULL "other" filename (which is
always just a repeat of the file2-name), but possibly a NULL xfrm_msg.
So how to fix it? I have a feeling that --no-index always passing
"other" to the external diff command is probably a bug. There was no
rename, and the name is always redundant with existing information we
pass (and this may even cause us to pass a useless "xfrm_msg" that
contains an "index 1234..5678" line). So one option would be to change
that behavior. We don't seem to have ever documented the "other" or
"xfrm_msg" parameters for external diffs.
But I'm not sure what fallout we might have from changing that behavior
now. So this patch takes the less-risky option, and simply teaches
run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
fixes the segfault now, and if we want to change the --no-index "other"
behavior on top, it will handle that, too.
Reported-by: Wilfred Hughes <me@wilfred.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tighten URL checks fsck makes in a URL recorded for submodules.
* vd/fsck-submodule-url-test:
submodule-config.c: strengthen URL fsck check
t7450: test submodule urls
test-submodule: remove command line handling for check-name
submodule-config.h: move check_submodule_url
When $HOME/.gitignore is missing but XDG config file available, we
should write into the latter, not former. "git gc" and "git
maintenance" wrote into a wrong "global config" file, which have
been corrected.
* kh/maintenance-use-xdg-when-it-should:
maintenance: use XDG config if it exists
config: factor out global config file retrieval
config: rename global config function
config: format newlines
A few tests to "git commit -o <pathspec>" and "git commit -i
<pathspec>" has been added.
* gt/test-commit-o-i-options:
t7501: add tests for --amend --signoff
t7501: add tests for --include and --only
CI for GitLab learned to drive macOS jobs.
* ps/gitlab-ci-macos:
ci: add macOS jobs to GitLab CI
ci: make p4 setup on macOS more robust
ci: handle TEST_OUTPUT_DIRECTORY when printing test failures
Makefile: detect new Homebrew location for ARM-based Macs
t7527: decrease likelihood of racing with fsmonitor daemon
Completion update to prepare for reftable
* ps/completion-with-reftable-fix:
completion: treat dangling symrefs as existing pseudorefs
completion: silence pseudoref existence check
completion: improve existence check for pseudo-refs
t9902: verify that completion does not print anything
completion: discover repo path in `__git_pseudoref_exists ()`
Tweak a few tests not to manually modify the reference database
(hence easier to work with other backends like reftable).
* jt/tests-with-reftable:
t5541: remove lockfile creation
t1401: remove lockfile creation
Custom merge drivers need access to the names of the revisions they
are working on, so that the merge conflict markers they introduce
can refer to those revisions. The placeholders '%S', '%X' and '%Y'
are introduced to this end.
Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch fixes a bug where fetch over http (or any helper) using the
v0 protocol may sometimes fail to auto-follow tags. The bug comes from
61c7711cfe (sha1-file: use loose object cache for quick existence check,
2018-11-12). But to explain why (and why this is the right fix), let's
take a step back.
After fetching a pack, the object database has changed, but we may still
hold in-memory caches that are now out of date. Traditionally this was
just the packed_git list, but 61c7711cfe started using a loose-object
cache, as well.
Usually these caches are invalidated automatically. When an expected
object cannot be found, the low-level object lookup routines call
reprepare_packed_git(), which re-scans the set of packs (and thanks to
some preparatory patches ahead of 61c7711cfe, throws away the loose
object cache). But not all calls do this! In some cases we expect that
the object might not exist, and pass OBJECT_INFO_QUICK to tell the
low-level routines not to bother re-scanning. And the tag auto-following
code is one such caller, since we are asking about oids that the other
side has (but we might not have locally).
To deal with this, we explicitly call reprepare_packed_git() ourselves
after fetching a pack; this goes all the way back to 48ec3e5c07
(Incorporate fetched packs in future object traversal, 2008-06-15). But
that only helps if we call fetch_pack() in the main fetch process. When
we're using a transport helper, it happens in a separate sub-process,
and the parent process is left with old values. So this is only a
problem with protocols which require a separate helper process (like
http).
This patch fixes it by teaching the parent process in the transport
helper relationship to make that same reprepare call after the helper
finishes fetching.
You might be left with some lingering questions, like:
1. Why only the v0 protocol, and not v2? It's because in v2 the child
helper doesn't actually run fetch_pack(); it merely establishes a
tunnel over which the main process can talk to the remote side (so
the fetch_pack() and reprepare happen in the main process).
2. Wouldn't we have the same bug even before the 61c7711cfe added
the loose object cache? For example, when we store the fetch as a
pack locally, wouldn't our packed_git list still be out of date?
If we store a pack, everything works because other parts of the
fetch process happen to trigger a call to reprepare_packed_git().
In particular, before storing whatever ref was originally
requested, we'll make sure we have the pointed-to object, and that
call happens without the QUICK flag. So in that case we'll see that
we don't know about it, reprepare, and then repeat our lookup. And
now we _do_ know about the pack, and further calls with QUICK will
find its contents.
Whereas when we unpack the result into loose objects, we never get
that same invalidation trigger. We didn't have packs before, and we
don't after. But when we do the loose object lookup, we find the
object. There's no way to realize that we didn't have the object
before the pack, and that having it now means things have changed
(in theory we could do a superfluous cache lookup to see that it
was missing from the old cache; but depending on the tags the other
side showed us, we might not even have filled in that part of the
cache earlier).
3. Why does the included test use "--depth 1"? This is important
because without it, we happen to invalidate the cache as a side
effect of other parts of the fetch process. What happens in a
non-shallow fetch is something like this:
1. we call find_non_local_tags() once before actually getting the
pack, to see if there are any tags we can fill in from what we
already have. This fills in the cache (which is obviously
missing objects we're about to fetch).
2. before fetching the actual pack, fetch_and_consume_refs()
calls check_exist_and_connected(), to see if we even need to
fetch a pack at all. This doesn't use QUICK (though arguably
it could, as it's purely an optimization). And since it sees
there are objects we are indeed missing, that triggers a
reprepare_packed_git() call, which throws out the loose object
cache.
3. after fetching, now we call find_non_local_tags() again. And
since step (2) invalidated our loose object cache, we find
the new objects and create the tags.
So everything works, but mostly due to luck. Whereas in a fetch
with --depth, we skip step 2 entirely, and thus the out-of-date
cache is still in place for step 3, giving us the wrong answer.
So the test works with a small "--depth 1" fetch, which makes sure that
we don't store the pack from the other side, and that we don't trigger
the accidental cache invalidation. And of course it forces the use of
v0 along with using the http protocol.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move a few tests into t0601 since they specifically test the packed-refs
file and thus are specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move two tests into t0600 since they write loose reflog refs manually
and thus are specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, the calls to cut(1) are only used to verify that the
contents of the reflog entry look as expected. By replacing these with
git-reflog(1) calls, we can make this test ref-backend agnostic.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0600 with other reffiles specific tests since it
checks for loose refs and is specific to the reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test into t0601 with other reffiles pack-refs specific tests
since it checks for individual loose refs and thus is specific to the
reffiles backend.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move these tests to t0600 with other reffiles specific tests since they
do things like take a lock on an individual ref, and write directly into
the reflog refs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0600 with the rest of the tests that are specific to
reffiles. This test reaches into reflog directories manually, and so are
specific to reffiles.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this test to t0601 with other reffiles specific pack-refs tests
since it is reffiles specific in that it looks into the loose refs
directory for an assertion.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests modify loose refs manually and are specific to the reffiles
backend. Move these to t0600 to be part of a test suite of reffiles
specific tests.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test can be re-written to use Git commands rather than writing a
manual ref in the reflog. This way this test no longer needs the
REFFILES prerequisite.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests are compatible with the reftable backend and thus do not
need the REFFILES prerequisite. Even though 53af25e4
(t1405: mark test that checks existence as REFFILES, 2022-01-31) and
53af25e4 (t1405: mark test that checks existence as REFFILES,
2022-01-31) marked these tests to require REFFILES, the reftable backend
in its current state does indeed work with these tests.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move t3210 to t0601, since these tests are reffiles specific in that
they modify loose refs manually. This is part of the effort to
categorize these tests together based on the ref backend they test. When
we upstream the reftable backend, we can add more tests to t06xx. This
way, all tests that test specific ref backend behavior will be grouped
together.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test fails in the GitHub Actions CI pipeline, we mark it up using
special GitHub syntax so it stands out when looking at the run log. We
also mark up "fixed" test cases, and skip passing tests since we want to
concentrate on the failures.
The finalize_test_case_output function in
test-lib-github-workflow-markup.sh which performs this markup is however
missing a fourth case: "broken" tests, i.e. tests using
'test_expect_failure' to document a known bug. This leads to these
"broken" tests appearing along with any failed tests, potentially
confusing the reader who might not be aware that "broken" is the status
for 'test_expect_failure' tests that indeed failed, and wondering what
their commits "broke".
Also skip these "broken" tests so that only failures and fixed tests
stand out.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
tests Git's implementation of a priority queue. Migrate the
test over to the new unit testing framework to simplify debugging
and reduce test run-time. Refactor the required logic and add
a new test case in addition to porting over the original ones in
shell.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After successfully connecting to the smart transport by calling
process_connect_service() in connect_helper(), run do_take_over() to
replace the old vtable with a new one which has methods ready for the
smart transport connection. This fixes the exit code of git-archive
in test case "archive remote http repository" of t5003.
The connect_helper() function is used as the connect method of the
vtable in "transport-helper.c", and it is called by transport_connect()
in "transport.c" to setup a connection. The only place that we call
transport_connect() so far is in "builtin/archive.c". Without running
do_take_over(), it may fail to call transport_disconnect() in
run_remote_archiver() of "builtin/archive.c". This is because for a
stateless connection and a service like "git-upload-archive", the
remote helper may receive a SIGPIPE signal and exit early. Call
do_take_over() to have a graceful disconnect method, so that we still
call transport_disconnect() even if the remote helper exits early.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new rpc-service "upload-archive" in http-backend to add server side
support for remote archive over HTTP/HTTPS protocols.
Also add new test cases in t5003. In the test case "archive remote http
repository", git-archive exits with a non-0 exit code even though we
create the archive correctly. It will be fixed in a later commit.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The safe.bareRepository setting can be set to 'explicit' to disallow
implicit uses of bare repositories, preventing an attack [1] where an
artificial and malicious bare repository is embedded in another git
repository. Unfortunately, some tooling uses myrepo/.git/ as the cwd
when executing commands, and this is blocked when
safe.bareRepository=explicit. Blocking is unnecessary, as git already
prevents nested .git directories.
Teach git to not reject uses of git inside of the .git directory: check
if cwd is .git (or a subdirectory of it) and allow it even if
safe.bareRepository=explicit.
[1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md
Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The exit code of the preceding command in a pipe is disregarded. So
if that preceding command is a Git command that fails, the test would
not fail. Instead, by saving the output of that Git command to a file,
and removing the pipe, we make sure the test will fail if that Git
command fails.
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
baa4adc66a (parse-options: disable option abbreviation with
PARSE_OPT_KEEP_UNKNOWN, 2019-01-27) turned off support for abbreviated
options when the flag PARSE_OPT_KEEP_UNKNOWN is given, as any shortened
option could also be an abbreviation for one of the unknown options.
The code for handling abbreviated options is guarded by an if, but it
can also be reached via goto. baa4adc66a only blocked the first way.
Add the condition to the other ones as well.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0024 has multiple command invocations on a single line, which
goes against the style described in CodingGuidelines, thus fix
that.
Also, use the -C flag to give the destination when using $TAR,
therefore, not requiring a subshell.
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace pipe with redirection operator '>' to store the output
to a temporary file after 'git archive' command since the pipe
will swallow the command's exit code and a crash won't
necessarily be noticed.
Also fix an unwanted space after redirection '>' to match the
style described in CodingGuidelines.
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" learned to pay attention to "fetch.all" configuration
variable, which pretends as if "--all" was passed from the command
line when no remote parameter was given.
* tb/fetch-all-configuration:
fetch: add new config option fetch.all
Update the validation of "curl URL" submodule URLs (i.e. those that specify
an "http[s]" or "ftp[s]" protocol) in 'check_submodule_url()' to catch more
invalid URLs. The existing validation using 'credential_from_url_gently()'
parses certain URLs incorrectly, leading to invalid submodule URLs passing
'git fsck' checks. Conversely, 'url_normalize()' - used to validate remote
URLs in 'remote_get()' - correctly identifies the invalid URLs missed by
'credential_from_url_gently()'.
To catch more invalid cases, replace 'credential_from_url_gently()' with
'url_normalize()' followed by a 'url_decode()' and a check for newlines
(mirroring 'check_url_component()' in the 'credential_from_url_gently()'
validation).
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to 't7450-bad-git-dotfiles.sh' to check the validity of different
submodule URLs. To verify this directly (without setting up test
repositories & submodules), add a 'check-url' subcommand to 'test-tool
submodule' that calls 'check_submodule_url' in the same way that
'check-name' calls 'check_submodule_name'.
Add two tests to separately address cases where the URL check correctly
filters out invalid URLs and cases where the check misses invalid URLs. Mark
the latter ("url check misses invalid cases") with 'test_expect_failure' to
indicate that this is currently broken, which will be fixed in the next step.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hash_chars() would hash lines to integers, and store them in a spanhash,
but cut lines at 64 characters. Thus, whenever it reached 64 characters
or a newline, it would create a new spanhash. The problem is, the final
part of the file might not end 64 characters after the previous 'line'
and might not end with a newline. This could, for example, cause an
85-byte file with 12 lines and only the first character in the file
differing to appear merely 23% similar rather than the expected 97%.
Ensure the last line is included, and add a testcase that would have
caught this problem.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git maintenance register` registers the repository in the user's global
config. `$XDG_CONFIG_HOME/git/config` is supposed to be used if
`~/.gitconfig` does not exist. However, this command creates a
`~/.gitconfig` file and writes to that one even though the XDG variant
exists.
This used to work correctly until 50a044f1e4 (gc: replace config
subprocesses with API calls, 2022-09-27), when the command started calling
the config API instead of git-config(1).
Also change `unregister` accordingly.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7527, we test that the builtin fsmonitor daemon works well in
various edge cases. One of these tests is frequently failing because
events reported by the fsmonitor--daemon are missing an expected event.
This failure is essentially a race condition: we do not wait for the
daemon to flush out all events before we ask it to quit. Consequently,
it can happen that we miss some expected events.
In other testcases we counteract this race by sending a simple query to
the daemon. Quoting a comment:
We run a simple query after modifying the filesystem just to introduce
a bit of a delay so that the trace logging from the daemon has time to
get flushed to disk.
Now this workaround is not a "proper" fix as we do not wait for all
events to have been synchronized in a deterministic way. But this fix
seems to be sufficient for all the other tests to pass, so it must not
be all that bad.
Convert the failing test to do the same. While the test was previously
failing in about 50% of the test runs, I couldn't reproduce the failure
after the change anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9080a7f178 (builtin/show-ref: add new mode to check for reference
existence, 2023-10-31) added the option --exists to git-show-ref(1).
When you use this option against a ref that doesn't exist, but it is
a parent directory of an existing ref, you get the following error:
$ git show-ref --exists refs/heads
error: failed to look up reference: Is a directory
when the ref-files backend is in use. To be more clear to user,
hide the error about having found a directory. What matters to the
user is that the named ref does not exist. Instead, print the same
error as when the ref was not found:
error: reference does not exist
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'check-name' subcommand to 'test-tool submodule' is documented as being
able to take a command line argument '<name>'. However, this does not work -
and has never worked - because 'argc > 0' triggers the usage message in
'cmd__submodule_check_name()'. To simplify the helper and avoid future
confusion around proper use of the subcommand, remove any references to
command line arguments for 'check-name' in usage strings and handling in
'check_name()'.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for amending the commit to add Signed-off-by trailer. And
also to check if it does not add another trailer if one already exists.
Currently, there are tests for --signoff separately in t7501, however,
they are not tested with --amend.
Therefore, these tests belong with other similar tests of --amend in
t7501-commit-basic-functionality.
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for --only (-o) and --include (-i). This include testing
with or without staged changes for both -i and -o. Also to test
for committing untracked files with -i, -o and without -i/-o.
Some tests already exist in t7501 for testing --only, however,
it is only tested in combination with --amend and --allow-empty
and on to-be-born branch. The addition of these tests check, when
the pathspec is provided without using -only, that only the files
matching the pathspec get committed. This behavior is same when
we provide --only and it is checked by the tests.
(as --only is the default mode of operation when pathspec is
provided.)
As for --include, there is no prior test for checking if --include
also commits staged changes, thus add test for that. Along with
the tests also document a potential bug, in which, when provided
with -i and a pathspec that does not match any tracked path,
commit does not fail if there are staged changes. And when there
are no staged changes commit fails. However, no error is returned
to stderr in either of the cases. This is described in the TODO
comment before the relevent testcase.
And also add a test for checking incompatibilty when using -o and
-i together.
Thus, these tests belong in t7501 with other similar existing tests,
as described in the case of --only.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using advise_if_enabled() to display an advice will automatically
include instructions on how to disable the advice, alongside the main
advice:
hint: use --reapply-cherry-picks to include skipped commits
hint: Disable this message with "git config advice.skippedCherryPicks false"
To do so, we provide a knob which can be used to disable the advice.
But also to tell us the opposite: to show the advice.
Let's not include the deactivation instructions for an advice if the
user explicitly sets its visibility.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clearing in-core repository (happens during e.g., "git fetch
--recurse-submodules" with commit graph enabled) made in-core
commit object in an inconsistent state by discarding the necessary
data from commit-graph too early, which has been corrected.
* jk/commit-graph-slab-clear-fix:
commit-graph: retain commit slab when closing NULL commit_graph
Introduce a new extension "refstorage" so that we can mark a
repository that uses a non-default ref backend, like reftable.
* ps/refstorage-extension:
t9500: write "extensions.refstorage" into config
builtin/clone: introduce `--ref-format=` value flag
builtin/init: introduce `--ref-format=` value flag
builtin/rev-parse: introduce `--show-ref-format` flag
t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
setup: introduce GIT_DEFAULT_REF_FORMAT envvar
setup: introduce "extensions.refStorage" extension
setup: set repository's formats on init
setup: start tracking ref storage format
refs: refactor logic to look up storage backends
worktree: skip reading HEAD when repairing worktrees
t: introduce DEFAULT_REPO_FORMAT prereq
The `__git_pseudoref_exists ()` helper function back to git-rev-parse(1)
in case the reftable backend is in use. This is not in the same spirit
as the simple existence check that the "files" backend does though,
because there we only check for the pseudo-ref to exist with `test -f`.
With git-rev-parse(1) we not only check for existence, but also verify
that the pseudo-ref resolves to an object, which may not be the case
when the pseudo-ref points to an unborn branch.
Fix this issue by using `git show-ref --exists` instead. Note that we do
not have to silence stdout anymore as git-show-ref(1) will not print
anything.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 44dbb3bf29 (completion: support pseudoref existence checks for
reftables, 2023-12-19), we have extended the Bash completion script to
support future ref backends better by using git-rev-parse(1) to check
for pseudo-ref existence. This conversion has introduced a bug, because
even though we pass `--quiet` to git-rev-parse(1) it would still output
the resolved object ID of the ref in question if it exists.
Fix this by redirecting its stdout to `/dev/null` and add a test that
catches this behaviour. Note that the test passes even without the fix
for the "files" backend because we parse pseudo refs via the filesystem
directly in that case. But the test will fail with the "reftable"
backend.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Bash completion script must not print anything to either stdout or
stderr. Instead, it is only expected to populate certain variables.
Tighten our `test_completion ()` test helper to verify this requirement.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recent codebase update (8bf6fbd00d (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.
This commit migrates the unit tests for C character classification
functions (isdigit(), isspace(), etc) from the legacy approach
using the test-tool command `test-tool ctype` in t/helper/test-ctype.c
to the new unit testing framework (t/unit-tests/test-lib.h).
The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sideband demultiplexer fixes.
* jx/sideband-chomp-newline-fix:
pkt-line: do not chomp newlines for sideband messages
pkt-line: memorize sideband fragment in reader
test-pkt-line: add option parser for unpack-sideband
Streaming spans of packfile data used to be done only from a
single, primary, pack in a repository with multiple packfiles. It
has been extended to allow reuse from other packfiles, too.
* tb/multi-pack-verbatim-reuse: (26 commits)
t/perf: add performance tests for multi-pack reuse
pack-bitmap: enable reuse from all bitmapped packs
pack-objects: allow setting `pack.allowPackReuse` to "single"
t/test-lib-functions.sh: implement `test_trace2_data` helper
pack-objects: add tracing for various packfile metrics
pack-bitmap: prepare to mark objects from multiple packs for reuse
pack-revindex: implement `midx_pair_to_pack_pos()`
pack-revindex: factor out `midx_key_to_pack_pos()` helper
midx: implement `midx_preferred_pack()`
git-compat-util.h: implement checked size_t to uint32_t conversion
pack-objects: include number of packs reused in output
pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse
pack-objects: prepare `write_reused_pack()` for multi-pack reuse
pack-objects: pass `bitmapped_pack`'s to pack-reuse functions
pack-objects: keep track of `pack_start` for each reuse pack
pack-objects: parameterize pack-reuse routines over a single pack
pack-bitmap: return multiple packs via `reuse_partial_packfile_from_bitmap()`
pack-bitmap: simplify `reuse_partial_packfile_from_bitmap()` signature
ewah: implement `bitmap_is_empty()`
pack-bitmap: pass `bitmapped_pack` struct to pack-reuse functions
...
The builtin_objectmode attribute is populated for each path
without adding anything in .gitattributes files, which would be
useful in magic pathspec, e.g., ":(attr:builtin_objectmode=100755)"
to limit to executables.
* jw/builtin-objectmode-attr:
attr: add builtin objectmode values support
To create error conditions, some tests set up reference locks by
directly creating its lockfile. While this works for the files reference
backend, this approach is incompatible with the reftable backend.
Refactor the test to create a d/f conflict via git-update-ref(1) instead
so that the test is reference backend agnostic.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To create error conditions, some tests set up reference locks by
directly creating its lockfile. While this works for the files reference
backend, this approach is incompatible with the reftable backend.
Refactor the test to create a d/f conflict via git-symbolic-ref(1)
instead so that the test is reference backend agnostic.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Piping the output of git commands like git-ls-files to another
command (grep in this case) hides the exit code returned by
these commands. Prevent this by storing the output of git-ls-files
to a temporary file and then "grep-ping" from that file. Replace
grep with test_grep as the latter is more verbose when it fails.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git sparse-checkout (add|set) --[no-]cone --end-of-options" did
not handle "--end-of-options" correctly after a recent update.
* en/sparse-checkout-eoo:
sparse-checkout: be consistent with end of options markers
Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
"git archive --list extra garbage" silently ignored excess command
line parameters, which has been corrected.
* jc/archive-list-with-extra-args:
archive: "--list" does not take further options
Introduce a boolean configuration option fetch.all which allows to
fetch all available remotes by default. The config option can be
overridden by explicitly specifying a remote or by using --no-all.
The behavior for --all is unchanged and calling git-fetch with --all
and a remote will still result in an error.
Additionally, describe the configuration variable in the config
documentation and implement new tests to cover the expected behavior.
Also add --no-all to the command-line documentation of git-fetch.
Signed-off-by: Tamino Bauknecht <dev@tb6.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a regression introduced in ac6d45d11f (commit-graph: move
slab-clearing to close_commit_graph(), 2023-10-03), in which running:
git -c fetch.writeCommitGraph=true fetch --recurse-submodules
multiple times in a freshly cloned repository causes a segfault. What
happens in the second (and subsequent) runs is this:
1. We make a "struct commit" for any ref tips which we're storing
(even if we already have them, they still go into FETCH_HEAD).
Because the first run will have created a commit graph, we'll find
those commits in the graph.
The commit struct is therefore created with a NULL "maybe_tree"
entry, because we can load its oid from the graph later. But to do
that we need to remember that we got the commit from the graph,
which is recorded in a global commit_graph_data_slab object.
2. Because we're using --recurse-submodules, we'll try to fetch each
of the possible submodules. That implies creating a separate
"struct repository" in-process for each submodule, which will
require a later call to repo_clear().
The call to repo_clear() calls raw_object_store_clear(), which in
turn calls close_object_store(), which in turn calls
close_commit_graph(). And the latter frees the commit graph data
slab.
3. Later, when trying to write out a new commit graph, we'll ask for
their tree oid via get_commit_tree_oid(), which will see that the
object is parsed but with a NULL maybe_tree field. We'd then
usually pull it from the graph file, but because the slab was
cleared, we don't realize that we can do so! We end up returning
NULL and segfaulting.
(It seems questionable that we'd write a graph entry for such a
commit anyway, since we know we already have one. I didn't
double-check, but that may simply be another side effect of having
cleared the slab).
The bug is in step (2) above. We should not be clearing the slab when
cleaning up the submodule repository structs. Prior to ac6d45d11f, we
did not do so because it was done inside a helper function that returned
early when it saw NULL. So the behavior change from that commit is that
we'll now _always_ clear the slab via repo_clear(), even if the
repository being closed did not have a commit graph (and thus would have
a NULL commit_graph struct).
The most immediate fix is to add in a NULL check in close_commit_graph(),
making it a true noop when passed in an object_store with a NULL
commit_graph (it's OK to just return early, since the rest of its code
is already a noop when passed NULL). That restores the pre-ac6d45d11f
behavior. And that's what this patch does, along with a test that
exercises it (we already have a test that uses submodules along with
fetch.writeCommitGraph, but the bug only triggers when there is a
subsequent fetch and when that fetch uses --recurse-submodules).
So that fixes the regression in the least-risky way possible.
I do think there's some fragility here that we might want to follow up
on. We have a global commit_graph_data_slab that contains graph
positions, and our global commit structs depend on the that slab
remaining valid. But close_commit_graph() is just about closing _one_
object store's graph. So it's dangerous to call that function and clear
the slab without also throwing away any "struct commit" we might have
parsed that depends on it.
Which at first glance seems like a bug we could already trigger. In the
situation described here, there is no commit graph in the submodule
repository, so our commit graph is NULL (in fact, in our test script
there is no submodule repo at all, so we immediately return from
repo_init() and call repo_clear() only to free up memory). But what
would happen if there was one? Wouldn't we see a non-NULL commit_graph
entry, and then clear the global slab anyway?
The answer is "no", but for very bizarre reasons. Remember that
repo_clear() calls raw_object_store_clear(), which then calls
close_object_store() and thus close_commit_graph(). But before it does
so, raw_object_store_clear() does something else: it frees the commit
graph and sets it to NULL! So by this code path we'll _never_ see a
non-NULL commit_graph struct, and thus never clear the slab.
So it happens to work out. But it still seems questionable to me that we
would clear a global slab (which might still be in use) when closing the
commit graph. This clearing comes from 957ba814bf (commit-graph: when
closing the graph, also release the slab, 2021-09-08), and was fixing a
case where we really did need it to be closed (and in that case we
presumably call close_object_store() more directly).
So I suspect there may still be a bug waiting to happen there, as any
object loaded before the call to close_object_store() may be stranded
with a bogus maybe_tree entry (and thus looking at it after the call
might cause an error). But I'm not sure how to trigger it, nor what the
fix should look like (you probably would need to "unparse" any objects
pulled from the graph). And so this patch punts on that for now in favor
of fixing the recent regression in the most direct way, which should not
have any other fallouts.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To compute the expected on-disk size of packed objects, we sort the
output of show-index by pack offset and then compute the difference
between adjacent entries using awk. This works but has a few readability
problems:
1. Reading the index in pack order means don't find out the size of an
oid's entry until we see the _next_ entry. So we have to save it to
print later.
We can instead iterate in reverse order, so we compute each oid's
size as we see it.
2. Since the awk invocation is inside a text_expect block, we can't
easily use single-quotes to hold the script. So we use
double-quotes, but then have to escape the dollar signs in the awk
script.
We can swap this out for a shell loop instead (which is made much
easier by the first change).
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Assorted changes around pseudoref handling.
* ps/pseudo-refs:
bisect: consistently write BISECT_EXPECTED_REV via the refdb
refs: complete list of special refs
refs: propagate errno when reading special refs fails
wt-status: read HEAD and ORIG_HEAD via the refdb
Doc updates to clarify what an "unborn branch" means.
* jc/orphan-unborn:
orphan/unborn: fix use of 'orphan' in end-user facing messages
orphan/unborn: add to the glossary and use them consistently
"git status" is taught to show both the branch being bisected and
being rebased when both are in effect at the same time.
* rj/status-bisect-while-rebase:
status: fix branch shown when not only bisecting
In t9500 we're writing a custom configuration that sets up gitweb. This
requires us to manually ensure that the repository format is configured
as required, including both the repository format version and
extensions. With the introduction of the "extensions.refStorage"
extension we need to update the test to also write this new one.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new `--ref-format` value flag for git-clone(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new `--ref-format` value flag for git-init(1) that allows
the user to specify the ref format that is to be used for a newly
initialized repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new `--show-ref-format` to git-rev-parse(1) that causes it
to print the ref format used by a repository.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new GIT_TEST_DEFAULT_REF_FORMAT environment variable that
lets developers run the test suite with a different default ref format
without impacting the ref format used by non-test Git invocations. This
is modeled after GIT_TEST_DEFAULT_OBJECT_FORMAT, which does the same
thing for the repository's object format.
Adapt the setup of the `REFFILES` test prerequisite to be conditionally
set based on the default ref format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new GIT_DEFAULT_REF_FORMAT environment variable that lets
users control the default ref format used by both git-init(1) and
git-clone(1). This is modeled after GIT_DEFAULT_OBJECT_FORMAT, which
does the same thing for the repository's object format.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new "extensions.refStorage" extension that allows us to
specify the ref storage format used by a repository. For now, the only
supported format is the "files" format, but this list will likely soon
be extended to also support the upcoming "reftable" format.
There have been discussions on the Git mailing list in the past around
how exactly this extension should look like. One alternative [1] that
was discussed was whether it would make sense to model the extension in
such a way that backends are arbitrarily stackable. This would allow for
a combined value of e.g. "loose,packed-refs" or "loose,reftable", which
indicates that new refs would be written via "loose" files backend and
compressed into "packed-refs" or "reftable" backends, respectively.
It is arguable though whether this flexibility and the complexity that
it brings with it is really required for now. It is not foreseeable that
there will be a proliferation of backends in the near-term future, and
the current set of existing formats and formats which are on the horizon
can easily be configured with the much simpler proposal where we have a
single value, only.
Furthermore, if we ever see that we indeed want to gain the ability to
arbitrarily stack the ref formats, then we can adapt the current
extension rather easily. Given that Git clients will refuse any unknown
value for the "extensions.refStorage" extension they would also know to
ignore a stacked "loose,packed-refs" in the future.
So let's stick with the easy proposal for the time being and wire up the
extension.
[1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A limited number of tests require repositories to have the default
repository format or otherwise they would fail to run, e.g. because they
fail to detect the correct hash function. While the hash function is the
only extension right now that creates problems like this, we are about
to add a second extension for the ref format.
Introduce a new DEFAULT_REPO_FORMAT prereq that can easily be amended
whenever we add new format extensions. Next to making any such changes
easier on us, the prerequisite's name should also help to clarify the
intent better.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gives all paths builtin objectmode values based on the paths' modes
(one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:builtin_objectmode=160000)' could filter for submodules without
needing to have `builtin_objectmode=160000` to be set in .gitattributes
for every submodule path.
These values are also reflected in `git check-attr` results.
If the git_attr_direction is set to GIT_ATTR_INDEX or GIT_ATTR_CHECKIN
and a path is not found in the index, the value will be unspecified.
This patch also reserves the builtin_* attribute namespace for objectmode
and any future builtin attributes. Any user defined attributes using this
reserved namespace will result in a warning. This is a breaking change for
any existing builtin_* attributes.
Pathspecs with some builtin_* attribute name (excluding builtin_objectmode)
will behave like any attribute where there are no user specified values.
Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Memory pool allocations that require a new block and would fill at
least half of it are handled specially. Before 158dfeff3d (mem-pool:
add life cycle management functions, 2018-07-02) they used to be
allocated outside of the pool. This patch made mem_pool_alloc() create
a bespoke block instead, to allow releasing it when the pool gets
discarded.
Unfortunately mem_pool_alloc() returns a pointer to the start of such a
bespoke block, i.e. to the struct mp_block at its top. When the caller
writes to it, the management information gets corrupted. This affects
mem_pool_discard() and -- if there are no other blocks in the pool --
also mem_pool_alloc().
Return the payload pointer of bespoke blocks, just like for smaller
allocations, to protect the management struct.
Also update next_free to mark the block as full. This is only strictly
necessary for the first allocated block, because subsequent ones are
inserted after the current block and never considered for further
allocations, but it's easier to just do it in all cases.
Add a basic unit test to demonstrate the issue by using
mem_pool_calloc() with a tiny block size, which forces the creation of a
bespoke block.
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone" has been prepared to allow cloning a repository with
non-default hash function into a repository that uses the reftable
backend.
* ps/clone-into-reftable-repository:
builtin/clone: create the refdb with the correct object format
builtin/clone: skip reading HEAD when retrieving remote
builtin/clone: set up sparse checkout later
builtin/clone: fix bundle URIs with mismatching object formats
remote-curl: rediscover repository when fetching refs
setup: allow skipping creation of the refdb
setup: extract function to create the refdb
"git fetch --atomic" issued an unnecessary empty error message,
which has been corrected.
* jx/fetch-atomic-error-message-fix:
fetch: no redundant error message for atomic fetch
t5574: test porcelain output of atomic fetch
The code to parse the From e-mail header has been updated to avoid
recursion.
* jk/mailinfo-iterative-unquote-comment:
mailinfo: avoid recursion when unquoting From headers
t5100: make rfc822 comment test more careful
Code clean-up for sanity checking of command line options for "git
show-ref".
* rs/show-ref-incompatible-options:
show-ref: use die_for_incompatible_opt3()
"git checkout -B <branch> [<start-point>]" allowed a branch that is
in use in another worktree to be updated and checked out, which
might be a bit unexpected. The rule has been tightened, which is a
breaking change. "--ignore-other-worktrees" option is required to
unbreak you, if you are used to the current behaviour that "-B"
overrides the safety.
* jc/checkout-B-branch-in-use:
checkout: forbid "-B <branch>" from touching a branch used elsewhere
checkout: refactor die_if_checked_out() caller
When applying a patch that adds an executable file, git apply
ignores the core.fileMode setting (core.fileMode in git config
specifies whether the executable bit on files in the working tree
should be honored or not) resulting in warnings like:
warning: script.sh has type 100644, expected 100755
even when core.fileMode is set to false, which is undesired. This
is extra true for systems like Windows.
Fix this by inferring the correct file mode from either the existing
index entry, and when it is unavailable, assuming that the file mode
was OK by pretending it had the mode that the preimage wants to see,
when core.filemode is set to false. Add a test case that verifies
the change and prevents future regression.
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
93851746 (parse-options: decouple "--end-of-options" and "--",
2023-12-06) updated the world order to make callers of parse-options
that set PARSE_OPT_KEEP_UNKNOWN_OPT responsible for deciding what to
do with "--end-of-options" they may see after parse_options() returns.
This made a previous bug in sparse-checkout more visible; namely,
that
git sparse-checkout [add|set] --[no-]cone --end-of-options ...
would simply treat "--end-of-options" as one of the paths to include in
the sparse-checkout. But this was already problematic before; namely,
git sparse-checkout [add|set| --[no-]cone --sikp-checks ...
would not give an error on the mis-typed "--skip-checks" but instead
simply treat "--sikp-checks" as a path or pattern to include in the
sparse-checkout, which is highly unfriendly.
This behavior began when the command was converted to parse-options in
7bffca95ea (sparse-checkout: add '--stdin' option to set subcommand,
2019-11-21). Back then it was just called KEEP_UNKNOWN. Later it was
renamed to KEEP_UNKNOWN_OPT in 99d86d60e5 (parse-options:
PARSE_OPT_KEEP_UNKNOWN only applies to --options, 2022-08-19) to clarify
that it was only about dashed options; we always keep non-option
arguments. Looking at that original patch, both Peff and I think that
the author was simply confused about the mis-named option, and really
just wanted to keep the non-option arguments. We never should have used
the flag all along (and the other cases were cargo-culted within the
file).
Remove the erroneous PARSE_OPT_KEEP_UNKNOWN_OPT flag now to fix this
bug. Note that this does mean that anyone who might have been using
git sparse-checkout [add|set] [--[no-]cone] --foo --bar
to request paths or patterns '--foo' and '--bar' will now have to use
git sparse-checkout [add|set] [--[no-]cone] -- --foo --bar
That makes sparse-checkout more consistent with other git commands,
provides users much friendlier error messages and behavior, and is
consistent with the all-caps warning in git-sparse-checkout.txt that
this command "is experimental...its behavior...will likely change". :-)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though. Have those
source files explicitly include the headers they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back when we added this placeholder in a4ac106178 (cat-file: add
%(objectsize:disk) format atom, 2013-07-10), there were no tests,
claiming "[...]the exact numbers returned are volatile and subject to
zlib and packing decisions".
But we can use a little shell hackery to get the expected numbers
ourselves. To a certain degree this is just re-implementing what Git is
doing under the hood, but it is still worth doing. It makes sure we
exercise the %(objectsize:disk) code at all, and having the two
implementations agree gives us more confidence.
Note that our shell code assumes that no object appears twice (either in
two packs, or as both loose and packed), as then the results really are
undefined. That's OK for our purposes, and the test will notice if that
assumption is violated (the shell version would produce duplicate lines
that Git's output does not have).
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive --list blah" should notice an extra command line
parameter that goes unused. Make it so.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/clone-into-reftable-repository:
builtin/clone: create the refdb with the correct object format
builtin/clone: skip reading HEAD when retrieving remote
builtin/clone: set up sparse checkout later
builtin/clone: fix bundle URIs with mismatching object formats
remote-curl: rediscover repository when fetching refs
setup: allow skipping creation of the refdb
setup: extract function to create the refdb
"git bisect reset" has been taught to clean up state files and refs
even when BISECT_START file is gone.
* jk/bisect-reset-fix:
bisect: always clean on reset
"git $cmd --end-of-options --rev -- --path" for some $cmd failed
to interpret "--rev" as a rev, and "--path" as a path. This was
fixed for many programs like "reset" and "checkout".
* jk/end-of-options:
parse-options: decouple "--end-of-options" and "--"
Clean-up code that handles combinations of incompatible options.
* rs/incompatible-options-messages:
worktree: simplify incompatibility message for --orphan and commit-ish
worktree: standardize incompatibility messages
clean: factorize incompatibility message
revision, rev-parse: factorize incompatibility messages about - -exclude-hidden
revision: use die_for_incompatible_opt3() for - -graph/--reverse/--walk-reflogs
repack: use die_for_incompatible_opt3() for -A/-k/--cruft
push: use die_for_incompatible_opt4() for - -delete/--tags/--all/--mirror
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.
* jc/revision-parse-int:
revision: parse integer arguments to --max-count, --skip, etc., more carefully
Tests update.
* ps/ref-tests-update-more:
t6301: write invalid object ID via `test-tool ref-store`
t5551: stop writing packed-refs directly
t5401: speed up creation of many branches
t4013: simplify magic parsing and drop "failure"
t3310: stop checking for reference existence via `test -f`
t1417: make `reflog --updateref` tests backend agnostic
t1410: use test-tool to create empty reflog
t1401: stop treating FETCH_HEAD as real reference
t1400: split up generic reflog tests from the reffile-specific ones
t0410: mark tests to require the reffiles backend
Command line completion (in contrib/) learned to complete path
arguments to the "add/set" subcommands of "git sparse-checkout"
better.
* en/complete-sparse-checkout:
completion: avoid user confusion in non-cone mode
completion: avoid misleading completions in cone mode
completion: fix logic for determining whether cone mode is active
completion: squelch stray errors in sparse-checkout completion
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.
* jh/trace2-redact-auth:
t0212: test URL redacting in EVENT format
t0211: test URL redacting in PERF format
trace2: redact passwords from https:// URLs by default
trace2: fix signature of trace2_def_param() macro
"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.
* ad/merge-file-diff-algo:
merge-file: add --diff-algorithm option
Clean-up code that handles combinations of incompatible options.
* rs/i18n-cannot-be-used-together:
i18n: factorize even more 'incompatible options' messages
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications. The default has been
flipped to disable this pessimization.
* ps/commit-graph-less-paranoid:
commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
Introduce "git replay", a tool meant on the server side without
working tree to recreate a history.
* cc/git-replay:
replay: stop assuming replayed branches do not diverge
replay: add --contained to rebase contained branches
replay: add --advance or 'cherry-pick' mode
replay: use standard revision ranges
replay: make it a minimal server side command
replay: remove HEAD related sanity check
replay: remove progress and info output
replay: add an important FIXME comment about gpg signing
replay: change rev walking options
replay: introduce pick_regular_commit()
replay: die() instead of failing assert()
replay: start using parse_options API
replay: introduce new builtin
t6429: remove switching aspects of fast-rebase
Simplify API implementation to delete references by eliminating
duplication.
* ps/ref-deletion-updates:
refs: remove `delete_refs` callback from backends
refs: deduplicate code to delete references
refs/files: use transactions to delete references
t5510: ensure that the packed-refs file needs locking
When calling "packet_read_with_status()" to parse pkt-line encoded
packets, we can turn on the flag "PACKET_READ_CHOMP_NEWLINE" to chomp
newline character for each packet for better line matching. But when
receiving data and progress information using sideband, we should turn
off the flag "PACKET_READ_CHOMP_NEWLINE" to prevent mangling newline
characters from data and progress information.
When both the server and the client support "sideband-all" capability,
we have a dilemma that newline characters in negotiation packets should
be removed, but the newline characters in the progress information
should be left intact.
Add new flag "PACKET_READ_USE_SIDEBAND" for "packet_read_with_status()"
to prevent mangling newline characters in sideband messages.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we turn on the "use_sideband" field of the packet_reader,
"packet_reader_read()" will call the function "demultiplex_sideband()"
to parse and consume sideband messages. Sideband fragment which does not
end with "\r" or "\n" will be saved in the sixth parameter "scratch"
and it can be reused and be concatenated when parsing another sideband
message.
In "packet_reader_read()" function, the local variable "scratch" can
only be reused by subsequent sideband messages. But if there is a
payload message between two sideband fragments, the first fragment
which is saved in the local variable "scratch" will be lost.
To solve this problem, we can add a new field "scratch" in
packet_reader to memorize the sideband fragment across different calls
of "packet_reader_read()".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can use the test helper program "test-tool pkt-line" to test pkt-line
related functions. E.g.:
* Use "test-tool pkt-line send-split-sideband" to generate sideband
messages.
* Pipe these generated sideband messages to command "test-tool pkt-line
unpack-sideband" to test packet_reader_read() function.
In order to make a complete test of the packet_reader_read() function,
add option parser for command "test-tool pkt-line unpack-sideband".
* To remove newlines in sideband messages, we can use:
$ test-tool pkt-line unpack-sideband --chomp-newline
* To preserve newlines in sideband messages, we can use:
$ test-tool pkt-line unpack-sideband --no-chomp-newline
* To parse sideband messages using "demultiplex_sideband()" inside the
function "packet_reader_read()", we can use:
$ test-tool pkt-line unpack-sideband --reader-use-sideband
We also add new example sideband packets in send_split_sideband() and
add several new test cases in t0070. Among these test cases, we pipe
output of the "send-split-sideband" subcommand to the "unpack-sideband"
subcommand. We found two issues:
1. The two splitted sideband messages "Hello," and " world!\n" should
be concatenated together. But when we turn on use_sideband field of
reader to parse sideband messages, the first part of the splitted
message ("Hello,") is lost.
2. The newline characters in sideband 2 (progress info) and sideband 3
(error message) should be preserved, but they are both trimmed.
Will fix the above two issues in subsequent commits.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recent commit 2e87fca189 (test framework: further deprecate
test_i18ngrep, 2023-10-31), the test_i18ngrep function was
deprecated, and all the callers were updated to call the test_grep
function instead. But test_grep inherited an error message that
still refers to test_i18ngrep by mistake. Correct it so that a
broken call to the test_grep will identify itself as such.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an error occurs during an atomic fetch, a redundant error message
will appear at the end of do_fetch(). It was introduced in b3a804663c
(fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).
Because a failure message is displayed before setting retcode in the
function do_fetch(), calling error() on the err message at the end of
this function may result in redundant or empty error message to be
displayed.
We can remove the redundant error() function, because we know that
the function ref_transaction_abort() never fails. While we can find a
common pattern for calling ref_transaction_abort() by running command
"git grep -A1 ref_transaction_abort", e.g.:
if (ref_transaction_abort(transaction, &error))
error("abort: %s", error.buf);
Following this pattern, we can tolerate the return value of the function
ref_transaction_abort() being changed in the future. We also delay the
output of the err message to the end of do_fetch() to reduce redundant
code. With these changes, the test case "fetch porcelain output
(atomic)" in t5574 will also be fixed.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case "fetch porcelain output" checks output of the fetch
command. The error output must be empty with the follow assertion:
test_must_be_empty stderr
But this assertion fails if using atomic fetch. Refactor this test case
to use different fetch options by splitting it into three test cases.
1. "setup for fetch porcelain output".
2. "fetch porcelain output": for non-atomic fetch.
3. "fetch porcelain output (atomic)": for atomic fetch.
Add new command "test_commit ..." in the first test case, so that if we
run these test cases individually (--run=4-6), "git rev-parse HEAD~"
command will work properly. Run the above test cases, we can find that
one test case has a known breakage, as shown below:
ok 4 - setup for fetch porcelain output
ok 5 - fetch porcelain output # TODO known breakage vanished
not ok 6 - fetch porcelain output (atomic) # TODO known breakage
The failed test case has an error message with only the error prompt but
no message body, as follows:
'stderr' is not empty, it contains:
error:
In a later commit, we will fix this issue.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "check-chainlint" target runs automatically when running tests and
performs self-checks to verify that the chainlinter itself produces the
expected output. Originally, the chainlinter was implemented via sed,
but the infrastructure has been rewritten in fb41727b7e (t: retire
unused chainlint.sed, 2022-09-01) to use a Perl script instead.
The rewrite caused some slight whitespace changes in the output that are
ultimately not of much importance. In order to be able to assert that
the actual chainlinter errors match our expectations we thus have to
ignore whitespace characters when diffing them. As the `-w` flag is not
in POSIX we try to use `git diff -w --no-index` before we fall back to
`diff -w -u`.
To accomodate for cases where the host system has no Git installation we
use the locally-compiled version of Git. This can result in problems
though when the Git project's repository is using extensions that the
locally-compiled version of Git doesn't understand. It will refuse to
run and thus cause the checks to fail.
Instead of improving the detection logic, fix our ".expect" files so
that we do not need any post-processing at all anymore. This allows us
to drop the `-w` flag when diffing so that we can always use diff(1)
now.
Note that we keep some of the post-processing of `chainlint.pl` output
intact to strip leading line numbers generated by the script. Having
these would cause a rippling effect whenever we add a new test that
sorts into the middle of existing tests and would require us to
renumerate all subsequent lines, which seems rather pointless.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To ensure that we don't regress either the size or runtime performance
of multi-pack reuse, add a performance test to measure both of these.
The test partitions the objects in GIT_TEST_PERF_LARGE_REPO into 1, 10,
and 100 packs, and then tries to perform a "clone" at each stage with
both single- and multi-pack reuse enabled.
Note that the `repack_into_n_chunks()` function in this new test script
differs from the existing `repack_into_n()`. The former partitions the
repository into N equal-sized chunks, while the latter produces N packs
of five commits each (plus their objects), and then another pack with
the remainder.
On git.git, I can produce the following results on my machine:
Test this tree
--------------------------------------------------------------------------------
5332.3: clone for 1-pack scenario (single-pack reuse) 1.57(2.99+0.15)
5332.4: clone size for 1-pack scenario (single-pack reuse) 231.8M
5332.5: clone for 1-pack scenario (multi-pack reuse) 1.79(2.96+0.21)
5332.6: clone size for 1-pack scenario (multi-pack reuse) 231.7M
5332.9: clone for 10-pack scenario (single-pack reuse) 3.89(16.75+0.35)
5332.10: clone size for 10-pack scenario (single-pack reuse) 209.9M
5332.11: clone for 10-pack scenario (multi-pack reuse) 1.56(2.99+0.17)
5332.12: clone size for 10-pack scenario (multi-pack reuse) 224.4M
5332.15: clone for 100-pack scenario (single-pack reuse) 8.24(54.31+0.59)
5332.16: clone size for 100-pack scenario (single-pack reuse) 278.3M
5332.17: clone for 100-pack scenario (multi-pack reuse) 2.13(2.44+0.33)
5332.18: clone size for 100-pack scenario (multi-pack reuse) 357.9M
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that both the pack-bitmap and pack-objects code are prepared to
handle marking and using objects from multiple bitmapped packs for
verbatim reuse, allow marking objects from all bitmapped packs as
eligible for reuse.
Within the `reuse_partial_packfile_from_bitmap()` function, we no longer
only mark the pack whose first object is at bit position zero for reuse,
and instead mark any pack contained in the MIDX as a reuse candidate.
Provide a handful of test cases in a new script (t5332) exercising
interesting behavior for multi-pack reuse to ensure that we performed
all of the previous steps correctly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a helper function which looks for a specific (category, key,
value) tuple in the output of a trace2 event stream.
We will use this function in a future patch to ensure that the expected
number of objects are reused from an expected number of packs.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a binary search over the objects in a MIDX's bitmap
(i.e. in pseudo-pack order), the reader reconstructs the pseudo-pack
ordering using a combination of (a) the preferred pack, (b) the pack's
lexical position in the MIDX based on pack names, and (c) the object
offset within the pack.
In order to perform this binary search, the reader must know the
identity of the preferred pack. This could be stored in the MIDX, but
isn't for historical reasons, mostly because it can easily be inferred
at read-time by looking at the object in the first bit position and
finding out which pack it was selected from in the MIDX, like so:
nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0));
In midx_to_pack_pos() which performs this binary search, we look up the
identity of the preferred pack before each search. This is relatively
quick, since it involves two table-driven lookups (one in the MIDX's
revindex for `pack_pos_to_midx()`, and another in the MIDX's object
table for `nth_midxed_pack_int_id()`).
But since the preferred pack does not change after the MIDX is written,
it is safe to cache this value on the MIDX itself.
Write a helper to do just that, and rewrite all of the existing
call-sites that care about the identity of the preferred pack in terms
of this new helper.
This will prepare us for a subsequent patch where we will need to binary
search through the MIDX's pseudo-pack order multiple times.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a multi-pack bitmap is used to implement verbatim pack reuse (that
is, when verbatim chunks from an on-disk packfile are copied
directly[^1]), it does so by using its "preferred pack" as the source
for pack-reuse.
This allows repositories to pack the majority of their objects into a
single (often large) pack, and then use it as the single source for
verbatim pack reuse. This increases the amount of objects that are
reused verbatim (and consequently, decrease the amount of time it takes
to generate many packs). But this performance comes at a cost, which is
that the preferred packfile must pace its growth with that of the entire
repository in order to maintain the utility of verbatim pack reuse.
As repositories grow beyond what we can reasonably store in a single
packfile, the utility of verbatim pack reuse diminishes. Or, at the very
least, it becomes increasingly more expensive to maintain as the pack
grows larger and larger.
It would be beneficial to be able to perform this same optimization over
multiple packs, provided some modest constraints (most importantly, that
the set of packs eligible for verbatim reuse are disjoint with respect
to the subset of their objects being sent).
If we assume that the packs which we treat as candidates for verbatim
reuse are disjoint with respect to any of their objects we may output,
we need to make only modest modifications to the verbatim pack-reuse
code itself. Most notably, we need to remove the assumption that the
bits in the reachability bitmap corresponding to objects from the single
reuse pack begin at the first bit position.
Future patches will unwind these assumptions and reimplement their
existing functionality as special cases of the more general assumptions
(e.g. that reuse bits can start anywhere within the bitset, but happen
to start at 0 for all existing cases).
This patch does not yet relax any of those assumptions. Instead, it
implements a foundational data-structure, the "Bitampped Packs" (`BTMP`)
chunk of the multi-pack index. The `BTMP` chunk's contents are described
in detail here. Importantly, the `BTMP` chunk contains information to
map regions of a multi-pack index's reachability bitmap to the packs
whose objects they represent.
For now, this chunk is only written, not read (outside of the test-tool
used in this patch to test the new chunk's behavior). Future patches
will begin to make use of this new chunk.
[^1]: Modulo patching any `OFS_DELTA`'s that cross over a region of the
pack that wasn't used verbatim.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `find_objects()` function creates an object_list for any tips of the
reachability query which do not have corresponding bitmaps.
The object_list is not used outside of `find_objects()`, but we never
free it with `object_list_free()`, resulting in a leak. Let's plug that
leak by calling `object_list_free()`, which results in t6113 becoming
leak-free.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing "From" headers in an email, mailinfo "unquotes" quoted
strings and rfc822 parenthesized comments. For quoted strings, we
actually remove the double-quotes, so:
From: "A U Thor" <someone@example.com>
become:
Author: A U Thor
Email: someone@example.com
But for comments, we leave the outer parentheses in place, so:
From: A U (this is a comment) Thor <someone@example.com>
becomes:
Author: A U (this is a comment) Thor
Email: someone@example.com
So what is the comment "unquoting" actually doing? In our code, being in
a comment section has exactly two effects:
1. We'll unquote backslash-escaped characters inside a comment
section.
2. We _won't_ unquote double-quoted strings inside a comment section.
Our test for comments in t5100 checks this:
From: "A U Thor" <somebody@example.com> (this is \(really\) a comment (honestly))
So it is covering (1), but not (2). Let's add in a quoted string to
cover this.
Moreover, because the comment appears at the end of the From header,
there's nothing to confirm that we correctly found the end of the
comment section (and not just the end-of-string). Let's instead move it
to the beginning of the header, which means we can confirm that the
existing quoted string is detected (which will only happen if we know
we've left the comment block).
As expected, the test continues to pass, but this will give us more
confidence as we refactor the code in the next patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.
Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some refs in Git are more special than others due to reasons explained
in the next commit. These refs are read via `refs_read_special_head()`,
but this function doesn't behave the same as when we try to read a
normal ref. Most importantly, we do not propagate `failure_errno` in the
case where the reference does not exist, which is behaviour that we rely
on in many parts of Git.
Fix this bug by propagating errno when `strbuf_read_file()` fails.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout -B <branch> [<start-point>]", being a "forced" version
of "-b", switches to the <branch>, after optionally resetting its
tip to the <start-point>, even if the <branch> is in use in another
worktree, which is somewhat unexpected.
Protect the <branch> using the same logic that forbids "git checkout
<branch>" from touching a branch that is in use elsewhere.
This is a breaking change that may deserve backward compatibliity
warning in the Release Notes. The "--ignore-other-worktrees" option
can be used as an escape hatch if the finger memory of existing
users depend on the current behaviour of "-B".
Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f4ee22b526 (ref-filter: add tests for objectsize:disk, 2018-12-24)
hard-coded the expected object sizes. Coincidentally the size of commit
and tag is the same with zlib at the default compression level.
1f5f8f3e85 (t6300: abstract away SHA-1-specific constants, 2020-02-22)
encoded the sizes as a single value, which coincidentally also works
with sha256.
Different compression libraries like zlib-ng may arrive at different
values. Get them from the file system instead of hard-coding them to
make switching the compression library (or changing the compression
level) easier.
Reported-by: Ondrej Pohorelsky <opohorel@redhat.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing a header like a "From" line, mailinfo uses
unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized
comments. It takes a NUL-terminated string on input, and loops over the
"in" pointer until it sees the NUL. When it finds the start of an
interesting block, it delegates to helper functions which also increment
"in", and return the updated pointer.
But there's a bug here: the helpers find the NUL with a post-increment
in the loop condition, like:
while ((c = *in++) != 0)
So when they do see a NUL (rather than the correct termination of the
quote or comment section), they return "in" as one _past_ the NUL
terminator. And thus the outer loop in unquote_quoted_pair() does not
realize we hit the NUL, and keeps reading past the end of the buffer.
We should instead make sure to return "in" positioned at the NUL, so
that the caller knows to stop their loop, too. A hacky way to do this is
to return "in - 1" after leaving the inner loop. But a slightly cleaner
solution is to avoid incrementing "in" until we are sure it contained a
non-NUL byte (i.e., doing it inside the loop body).
The two tests here show off the problem. Since we check the output,
they'll _usually_ report a failure in a normal build, but it depends on
what garbage bytes are found after the heap buffer. Building with
SANITIZE=address reliably notices the problem. The outcome (both the
exit code and the exact bytes) are just what Git happens to produce for
these cases today, and shouldn't be taken as an endorsement. It might be
reasonable to abort on an unterminated string, for example. The priority
for this patch is fixing the out-of-bounds memory access.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're currently creating the reference database with a potentially
incorrect object format when the remote repository's object format is
different from the local default object format. This works just fine for
now because the files backend never records the object format anywhere.
But this logic will fail with any new reference backend that encodes
this information in some form either on-disk or in-memory.
The preceding commits have reshuffled code in git-clone(1) so that there
is no code path that will access the reference database before we have
detected the remote's object format. With these refactorings we can now
defer initialization of the reference database until after we have
learned the remote's object format and thus initialize it with the
correct format from the get-go.
These refactorings are required to make git-clone(1) work with the
upcoming reftable backend when cloning repositories with the SHA256
object format.
This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
the resulting hash format of the empty cloned repository to match the
default hash, but now we always end up with a sha1 repository. The
problem is that for dumb HTTP fetches, we have no easy way to figure out
the remote's hash function except for deriving it based on the hash
length of refs in `info/refs`. But as the remote repository is empty we
cannot rely on this detection mechanism.
Before the change in this commit we already initialized the repository
with the default hash function and then left it as-is. With this patch
we always use the hash function detected via the remote, where we fall
back to "sha1" in case we cannot detect it.
Neither the old nor the new behaviour are correct as we second-guess the
remote hash function in both cases. But given that this is a rather
unlikely edge case (we use the dumb HTTP protocol, the remote repository
uses SHA256 and the remote repository is empty), let's simply adapt the
test to assert the new behaviour. If we want to properly address this
edge case in the future we will have to extend the dumb HTTP protocol so
that we can properly detect the hash function for empty repositories.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create the reference database in git-clone(1) quite early before
connecting to the remote repository. Given that we do not yet know about
the object format that the remote repository uses at that point in time
the consequence is that the refdb may be initialized with the wrong
object format.
This is not a problem in the context of the files backend as we do not
encode the object format anywhere, and furthermore the only reference
that we write between initializing the refdb and learning about the
object format is the "HEAD" symref. It will become a problem though once
we land the reftable backend, which indeed does require to know about
the proper object format at the time of creation. We thus need to
rearrange the logic in git-clone(1) so that we only initialize the refdb
once we have learned about the actual object format.
As a first step, move listing of remote references to happen earlier,
which also allow us to set up the hash algorithm of the repository
earlier now. While we aim to execute this logic as late as possible
until after most of the setup has happened already, detection of the
object format and thus later the setup of the reference database must
happen before any other logic that may spawn Git commands or otherwise
these Git commands may not recognize the repository as such.
The first Git step where we expect the repository to be fully initalized
is when we fetch bundles via bundle URIs. Funny enough, the comments
there also state that "the_repository must match the cloned repo", which
is indeed not necessarily the case for the hash algorithm right now. So
in practice it is the right thing to detect the remote's object format
before downloading bundle URIs anyway, and not doing so causes clones
with bundle URIs to fail when the local default object format does not
match the remote repository's format.
Unfortunately though, this creates a new issue: downloading bundles may
take a long time, so if we list refs beforehand they might've grown
stale meanwhile. It is not clear how to solve this issue except for a
second reference listing though after we have downloaded the bundles,
which may be an expensive thing to do.
Arguably though, it's preferable to have a staleness issue compared to
being unable to clone a repository altogether.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard message for reporting the use of multiple mutually
exclusive options by calling die_for_incompatible_opt3() instead of
rolling our own. This has the benefits of showing only the actually
given options, reducing the number of strings to translate and making
the UI slightly more consistent.
Adjust the test to no longer insist on a specific order of the
reported options, as this implementation detail does not affect the
usefulness of the error message.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email". Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.
* tz/send-email-negatable-options:
send-email: avoid duplicate specification warnings
perl: bump the required Perl version to 5.8.1 from 5.8.0
"git rebase --autosquash" is now enabled for non-interactive rebase,
but it is still incompatible with the apply backend.
* ak/rebase-autosquash:
rebase: rewrite --(no-)autosquash documentation
rebase: support --autosquash without -i
rebase: fully ignore rebase.autoSquash without -i
"git for-each-ref --no-sort" still sorted the refs alphabetically
which paid non-trivial cost. It has been redefined to show output
in an unspecified order, to allow certain optimizations to take
advantage of.
* vd/for-each-ref-unsorted-optimization:
t/perf: add perf tests for for-each-ref
ref-filter.c: use peeled tag for '*' format fields
for-each-ref: clean up documentation of --format
ref-filter.c: filter & format refs in the same callback
ref-filter.c: refactor to create common helper functions
ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
ref-filter.h: add functions for filter/format & format-only
ref-filter.h: move contains caches into filter
ref-filter.h: add max_count and omit_empty to ref_format
ref-filter.c: really don't sort when using --no-sort
Test and shell scripts clean-up.
* ps/ban-a-or-o-operator-with-test:
Makefile: stop using `test -o` when unlinking duplicate executables
contrib/subtree: convert subtree type check to use case statement
contrib/subtree: stop using `-o` to test for number of args
global: convert trivial usages of `test <expr> -a/-o <expr>`
"git format-patch --encode-email-headers" ignored the option when
preparing the cover letter, which has been corrected.
* ss/format-patch-use-encode-headers-for-cover-letter:
format-patch: fix ignored encode_email_headers for cover letter
Update ref-related tests.
* ps/ref-tests-update:
t: mark several tests that assume the files backend with REFFILES
t7900: assert the absence of refs via git-for-each-ref(1)
t7300: assert exact states of repo
t4207: delete replace references via git-update-ref(1)
t1450: convert tests to remove worktrees via git-worktree(1)
t: convert tests to not access reflog via the filesystem
t: convert tests to not access symrefs via the filesystem
t: convert tests to not write references via the filesystem
t: allow skipping expected object ID in `ref-store update-ref`