Commit graph

20788 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason 1c7e239bd0 config API users: test for *_get_value_multi() segfaults
As we'll discuss in the subsequent commit these tests all
show *_get_value_multi() API users unable to handle there being a
value-less key in the config, which is represented with a "NULL" for
that entry in the "string" member of the returned "struct
string_list", causing a segfault.

These added tests exhaustively test for that issue, as we'll see in a
subsequent commit we'll need to change all of the API users
of *_get_value_multi(). These cases were discovered by triggering each
one individually, and then adding these tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason f7b2ff9516 for-each-repo: error on bad --config
As noted in 6c62f01552 (for-each-repo: do nothing on empty config,
2021-01-08) this command wants to ignore a non-existing config key,
but let's not conflate that with bad config.

Before this, all these added tests would pass with an exit code of 0.

We could preserve the comment added in 6c62f01552, but now that we're
directly using the documented repo_config_get_value_multi() value it's
just narrating something that should be obvious from the API use, so
let's drop it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason a428619309 config API: have *_multi() return an "int" and take a "dest"
Have the "git_configset_get_value_multi()" function and its siblings
return an "int" and populate a "**dest" parameter like every other
git_configset_get_*()" in the API.

As we'll take advantage of in subsequent commits, this fixes a blind
spot in the API where it wasn't possible to tell whether a list was
empty from whether a config key existed. For now we don't make use of
those new return values, but faithfully convert existing API users.

Most of this is straightforward, commentary on cases that stand out:

- To ensure that we'll properly use the return values of this function
  in the future we're using the "RESULT_MUST_BE_USED" macro introduced
  in [1].

  As git_die_config() now has to handle this return value let's have
  it BUG() if it can't find the config entry. As tested for in a
  preceding commit we can rely on getting the config list in
  git_die_config().

- The loops after getting the "list" value in "builtin/gc.c" could
  also make use of "unsorted_string_list_has_string()" instead of using
  that loop, but let's leave that for now.

- In "versioncmp.c" we now use the return value of the functions,
  instead of checking if the lists are still non-NULL.

1. 1e8697b5c4 (submodule--helper: check repo{_submodule,}_init()
   return values, 2022-09-01),

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason b83efcecaf config API: add and use a "git_config_get()" family of functions
We already have the basic "git_config_get_value()" function and its
"repo_*" and "configset" siblings to get a given "key" and assign the
last key found to a provided "value".

But some callers don't care about that value, but just want to use the
return value of the "get_value()" function to check whether the key
exist (or another non-zero return value).

The immediate motivation for this is that a subsequent commit will
need to change all callers of the "*_get_value_multi()" family of
functions. In two cases here we (ab)used it to check whether we had
any values for the given key, but didn't care about the return value.

The rest of the callers here used various other config API functions
to do the same, all of which resolved to the same underlying functions
to provide the answer.

Some of these were using either git_config_get_string() or
git_config_get_string_tmp(), see fe4c750fb1 (submodule--helper: fix a
configure_added_submodule() leak, 2022-09-01) for a recent example. We
can now use a helper function that doesn't require a throwaway
variable.

We could have changed git_configset_get_value_multi() (and then
git_config_get_value() etc.) to accept a "NULL" as a "dest" for all
callers, but let's avoid changing the behavior of existing API
users. Having an "unused" value that we throw away internal to
config.c is cheap.

A "NULL as optional dest" pattern is also more fragile, as the intent
of the caller might be misinterpreted if he were to accidentally pass
"NULL", e.g. when "dest" is passed in from another function.

Another name for this function could have been
"*_config_key_exists()", as suggested in [1]. That would work for all
of these callers, and would currently be equivalent to this function,
as the git_configset_get_value() API normalizes all non-zero return
values to a "1".

But adding that API would set us up to lose information, as e.g. if
git_config_parse_key() in the underlying configset_find_element()
fails we'd like to return -1, not 1.

Let's change the underlying configset_find_element() function to
support this use-case, we'll make further use of it in a subsequent
commit where the git_configset_get_value_multi() function itself will
expose this new return value.

This still leaves various inconsistencies and clobbering or ignoring
of the return value in place. E.g here we're modifying
configset_add_value(), but ever since it was added in [2] we've been
ignoring its "int" return value, but as we're changing the
configset_find_element() it uses, let's have it faithfully ferry that
"ret" along.

Let's also use the "RESULT_MUST_BE_USED" macro introduced in [3] to
assert that we're checking the return value of
configset_find_element().

We're leaving the same change to configset_add_value() for some future
series. Once we start paying attention to its return value we'd need
to ferry it up as deep as do_config_from(), and would need to make
least read_{,very_}early_config() and git_protected_config() return an
"int" instead of "void". Let's leave that for now, and focus on
the *_get_*() functions.

1. 3c8687a73e (add `config_set` API for caching config-like files, 2014-07-28)
2. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/
3. 1e8697b5c4 (submodule--helper: check repo{_submodule,}_init()
   return values, 2022-09-01),

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason e7587a8f53 config tests: add "NULL" tests for *_get_value_multi()
A less well known edge case in the config format is that keys can be
value-less, a shorthand syntax for "true" boolean keys. I.e. these two
are equivalent as far as "--type=bool" is concerned:

	[a]key
	[a]key = true

But as far as our parser is concerned the values for these two are
NULL, and "true". I.e. for a sequence like:

	[a]key=x
	[a]key
	[a]key=y

We get a "struct string_list" with "string" members with ".string"
values of:

	{ "x", NULL, "y" }

This behavior goes back to the initial implementation of
git_config_bool() in 17712991a5 (Add ".git/config" file parser,
2005-10-10).

When parts of the config_set API were tested for in [1] they didn't
add coverage for 3/4 of the "(NULL)" cases handled in
"t/helper/test-config.c". We'd test that case for "get_value", but not
"get_value_multi", "configset_get_value" and
"configset_get_value_multi".

We now cover all of those cases, which in turn expose the details of
how this part of the config API works.

1. 4c715ebb96 (test-config: add tests for the config_set API,
   2014-07-28)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason 258902ce07 config tests: cover blind spots in git_die_config() tests
There were no tests checking for the output of the git_die_config()
function in the config API, added in 5a80e97c82 (config: add
`git_die_config()` to the config-set API, 2014-08-07). We only tested
"test_must_fail", but didn't assert the output.

We need tests for this because a subsequent commit will alter the
return value of git_config_get_value_multi(), which is used to get the
config values in the git_die_config() function. This test coverage
helps to build confidence in that subsequent change.

These tests cover different interactions with git_die_config():

- The "notes.mergeStrategy" test in
  "t/t3309-notes-merge-auto-resolve.sh" is a case where a function
  outside of config.c (git_config_get_notes_strategy()) calls
  git_die_config().

- The "gc.pruneExpire" test in "t5304-prune.sh" is a case where
  git_config_get_expiry() calls git_die_config(), covering a different
  "type" than the "string" test for "notes.mergeStrategy".

- The "fetch.negotiationAlgorithm" test in
  "t/t5552-skipping-fetch-negotiator.sh" is a case where
  git_config_get_string*() calls git_die_config().

We also cover both the "from command-line config" and "in file..at
line" cases here.

The clobbering of existing ".git/config" files here is so that we're
not implicitly testing the line count of the default config.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason bab821646a cocci: apply the "pretty.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"pretty.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason ecb5091fd4 cocci: apply the "commit.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason cb338c23d6 cocci: apply the "commit-reach.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit-reach.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Ævar Arnfjörð Bjarmason d850b7a545 cocci: apply the "cache.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Michael J Gruber 3dc0b7f0dc t3070: make chain lint tester happy
1f2e05f0b7 ("wildmatch: fix exponential behavior", 2023-03-20)
introduced a new test with a background process. Backgrounding
necessarily gives a result of 0, so that a seemingly broken && chain is
not really broken.

Adjust t3070 slightly so that our chain lint test recognizes the
construct for what it is and does not raise a false positive.

Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 17:02:38 -07:00
Patrick Steinhardt d3af1c193d commit-graph: fix truncated generation numbers
In 80c928d947 (commit-graph: simplify compute_generation_numbers(),
2023-03-20), the code to compute generation numbers was simplified to
use the same infrastructure as is used to compute topological levels.
This refactoring introduced a bug where the generation numbers are
truncated when they exceed UINT32_MAX because we explicitly cast the
computed generation number to `uint32_t`. This is not required though:
both the computed value and the field of `struct commit_graph_data` are
of the same type `timestamp_t` already, so casting to `uint32_t` will
cause truncation.

This cast can cause us to miscompute generation data overflows:

    1. Given a commit with no parents and committer date
       `UINT32_MAX + 1`.

    2. We compute its generation number as `UINT32_MAX + 1`, but
       truncate it to `1`.

    3. We calculate the generation offset via `$generation - $date`,
       which is thus `1 - (UINT32_MAX + 1)`. The computation underflows
       and we thus end up with an offset that is bigger than the maximum
       allowed offset.

As a result, we'd be writing generation data overflow information into
the commit-graph that is bogus and ultimately not even required.

Fix this bug by removing the needless cast.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:52:06 -07:00
William Sprent 00408adeac builtin/sparse-checkout: add check-rules command
There exists no direct way to interrogate git about which paths are
matched by a given set of sparsity rules. It is possible to get this
information from git, but it includes checking out the commit that
contains the paths, applying the sparse checkout patterns and then using
something like 'git ls-files -t' to check if the skip worktree bit is
set. This works in some case, but there are cases where it is awkward or
infeasible to generate a checkout for this purpose.

Exposing the pattern matching of sparse checkout enables more tooling to
be built and avoids a situation where tools that want to reason about
sparse checkouts start containing parallel implementation of the rules.
To accommodate this, add a 'check-rules' subcommand to the
'sparse-checkout' builtin along the lines of the 'git check-ignore' and
'git check-attr' commands. The new command accepts a list of paths on
stdin and outputs just the ones the match the sparse checkout.

To allow for use in a bare repository and to allow for interrogating
about other patterns than the current ones, include a '--rules-file'
option which allows the caller to explicitly pass sparse checkout rules
in the format accepted by 'sparse-checkout set --stdin'.

To allow for reuse of the handling of input patterns for the
'--rules-file' flag, modify 'add_patterns_from_input()' to be able to
read from a 'FILE' instead of just stdin.

To allow for reuse of the logic which decides whether or not rules
should be interpreted as cone-mode patterns, split that part out of
'update_modes()' such that can be called without modifying the config.

An alternative could have been to create a new 'check-sparsity' command.
However, placing it under 'sparse-checkout' allows for a) more easily
re-using the sparse checkout pattern matching and cone/non-code mode
handling, and b) keeps the documentation for the command next to the
experimental warning and the cone-mode discussion.

Signed-off-by: William Sprent <williams@unity3d.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:51:12 -07:00
William Sprent 24fc2cde64 builtin/sparse-checkout: remove NEED_WORK_TREE flag
In preparation for adding a sub-command to 'sparse-checkout' that can be
run in a bare repository, remove the 'NEED_WORK_TREE' flag from its
entry in the 'commands' array of 'git.c'.

To avoid that this changes any behaviour, add calls to
'setup_work_tree()' to all of the 'sparse-checkout' sub-commands and add
tests that verify that 'sparse-checkout <cmd>' still fail with a clear
error message telling the user that the command needs a work tree.

Signed-off-by: William Sprent <williams@unity3d.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:43:51 -07:00
Johannes Schindelin 3457b50e8c t3701: we don't need no Perl for add -i anymore
This should have been removed in `ab/retire-scripted-add-p` but wasn't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:40:12 -07:00
Johannes Schindelin 3b7a4475b0 split-index; stop abusing the base_oid to strip the "link" extension
When a split-index is in effect, the `$GIT_DIR/index` file needs to
contain a "link" extension that contains all the information about the
split-index, including the information about the shared index.

However, in some cases Git needs to suppress writing that "link"
extension (i.e. to fall back to writing a full index) even if the
in-memory index structure _has_ a `split_index` configured. This is the
case e.g. when "too many not shared" index entries exist.

In such instances, the current code sets the `base_oid` field of said
`split_index` structure to all-zero to indicate that `do_write_index()`
should skip writing the "link" extension.

This can lead to problems later on, when the in-memory index is still
used to perform other operations and eventually wants to write a
split-index, detects the presence of the `split_index` and reuses that,
too (under the assumption that it has been initialized correctly and
still has a non-null `base_oid`).

Let's stop zeroing out the `base_oid` to indicate that the "link"
extension should not be written.

One might be tempted to simply call `discard_split_index()` instead,
under the assumption that Git decided to write a non-split index and
therefore the `split_index` structure might no longer be wanted.
However, that is not possible because that would release index entries
in `split_index->base` that are likely to still be in use. Therefore we
cannot do that.

The next best thing we _can_ do is to introduce a bit field to indicate
specifically which index extensions (not) to write. So that's what we do
here.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:39 -07:00
Johannes Schindelin 3704fed5ea split-index & fsmonitor: demonstrate a bug
This commit adds a new test case that demonstrates a bug in the
split-index code that is triggered under certain circumstances when the
FSMonitor is enabled, and its symptom manifests in the form of one of
the following error messages:

    BUG: fsmonitor.c:20: fsmonitor_dirty has more entries than the index (2 > 1)

    BUG: unpack-trees.c:776: pos <n> doesn't point to the first entry of <dir>/ in index

    error: invalid path ''
    error: The following untracked working tree files would be overwritten by reset:
            initial.t

Which of these error messages appears depends on timing-dependent
conditions.

Technically the root cause lies with a bug in the split-index code that
has nothing to do with FSMonitor, but for the sake of this new test case
it was the easiest way to trigger the bug.

The bug is this: Under specific conditions, Git needs to skip writing
the "link" extension (which is the index extension containing the
information pertaining to the split-index). To do that, the `base_oid`
attribute of the `split_index` structure in the in-memory index is
zeroed out, and `do_write_index()` specifically checks for a "null"
`base_oid` to understand that the "link" extension should not be
written. However, this violates the consistency of the in-memory index
structure, but that does not cause problems in most cases because the
process exits without using the in-memory index structure anymore,
anyway.

But: _When_ the in-memory index is still used (which is the case e.g. in
`git rebase`), subsequent writes of `the_index` are at risk of writing
out a bogus index file, one that _should_ have a "link" extension but
does not. In many cases, the `SPLIT_INDEX_ORDERED` flag _happens_ to be
set for subsequent writes, forcing the shared index to be written, which
re-initializes `base_oid` to a non-bogus state, and all is good.

When it is _not_ set, however, all kinds of mayhem ensue, resulting in
above-mentioned error messages, and often enough putting worktrees in a
totally broken state where the only recourse is to manually delete the
`index` and the `index.lock` files and then call `git reset` manually.
Not something to ask users to do.

The reason why it is comparatively easy to trigger the bug with
FSMonitor is that there is _another_ bug in the FSMonitor code:
`mark_fsmonitor_valid()` sets `cache_changed` to 1, i.e. treating that
variable as a Boolean. But it is a bit field, and 1 happens to be the
`SOMETHING_CHANGED` bit that forces the "link" extension to be skipped
when writing the index, among other things.

"Comparatively easy" is a relative term in this context, for sure. The
essence of how the new test case triggers the bug is as following:

1. The `git rebase` invocation will first reset the worktree to
   a commit that contains only the `one.t` file, and then execute a
   rebase script that starts with the following commands (commit hashes
   skipped):

   label onto

   reset initial
   pick two
   label two

   reset two
   pick three
   [...]

2. Before executing the `label` command, a split index is written, as
   well as the shared index.

3. The `reset initial` command in the rebase script writes out a new
   split index but skips writing the shared index, as intended.

4. The `pick two` command updates the worktree and refreshes the index,
   marking the `two.t` entry as valid via the FSMonitor, which sets the
   `SOMETHING_CHANGED` bit in `cache_changed`, which in turn causes the
   `base_oid` attribute to be zeroed out and a full (non-split) index
   to be written (making sure _not_ to write the "link" extension).

5. Now, the `reset two` command will leave the worktree alone, but
   still write out a new split index, not writing the shared index
   (because `base_oid` is still zeroed out, and there is no index entry
   update requiring it to be written, either).

6. When it is turn to run `pick three`, the index is read, but it is
   too short: It only contains a single entry when there should be two,
   because the "link" extension is missing from the written-out index
   file.

There are three bugs at play, actually, which will be fixed over the
course of the next commits:

- The `base_oid` attribute should not be zeroed out to indicate when
  the "link" extension should not be written, as it puts the in-memory
  index structure into an inconsistent state.

- The FSMonitor should not overwrite bits in `cache_changed`.

- The `unpack_trees()` function tries to reuse the `split_index`
  structure from the source index, if any, but does not propagate the
  `SPLIT_INDEX_ORDERED` flag.

While a fix for the second bug would let this test case pass, there are
other conditions where the `SOMETHING_CHANGED` bit is set. Therefore,
the bug that most crucially needs to be fixed is the first one.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:40:39 -07:00
Alex Henrie 6605fb70cb rebase: add a config option for --rebase-merges
The purpose of the new option is to accommodate users who would like
--rebase-merges to be on by default and to facilitate turning on
--rebase-merges by default without configuration in a future version of
Git.

Name the new option rebase.rebaseMerges, even though it is a little
redundant, for consistency with the name of the command line option and
to be clear when scrolling through values in the [rebase] section of
.gitconfig.

Support setting rebase.rebaseMerges to the nonspecific value "true" for
users who don't need to or don't want to learn about the difference
between rebase-cousins and no-rebase-cousins.

Make --rebase-merges without an argument on the command line override
any value of rebase.rebaseMerges in the configuration, for consistency
with other command line flags with optional arguments that have an
associated config option.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:32:49 -07:00
Alex Henrie 7e5dcec3ca rebase: add documentation and test for --no-rebase-merges
As far as I can tell, --no-rebase-merges has always worked, but has
never been documented. It is especially important to document it before
a rebase.rebaseMerges option is introduced so that users know how to
override the config option on the command line. It's also important to
clarify that --rebase-merges without an argument is not the same as
--no-rebase-merges and not passing --rebase-merges is not the same as
passing --rebase-merges=no-rebase-cousins.

A test case is necessary to make sure that --no-rebase-merges keeps
working after its code is refactored in the following patches of this
series. The test case is a little contrived: It's unlikely that a user
would type both --rebase-merges and --no-rebase-merges at the same time.
However, if an alias is defined which includes --rebase-merges, the user
might decide to add --no-rebase-merges to countermand that part of the
alias but leave alone other flags set by the alias.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:32:49 -07:00
René Scharfe 1aaed69d11 t5000: use check_mtime()
fd2da4b1ea (archive: add --mtime, 2023-02-18) added a helper function
for checking the file modification time of an extracted entry.  Use it
for the older mtime test as well to shorten the code and piggyback on
the archive extraction done to validate file contents.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 09:13:30 -07:00
Jacob Keller 1a3119ed06 blame: allow --contents to work with non-HEAD commit
The --contents option can be used with git blame to blame the file as if
it had the contents from the specified file. This is akin to copying the
contents into the working tree and then running git blame. This option
has been supported since 1cfe77333f ("git-blame: no rev means start
from the working tree file.")

The --contents option always blames the file as if it was based on the
current HEAD commit. If you try to pass a revision while using
--contents, you get the following error:

  fatal: cannot use --contents with final commit object name

This is because the blame process generates a fake working tree commit
which always uses the HEAD object as its sole parent.

Enhance fake_working_tree_commit to take the object ID to use for the
parent instead of always using the HEAD object. Then, always generate a
fake commit when we have contents provided, even if we have a final
object. Remove the check to disallow --contents and a final revision.

Note that the behavior of generating a fake working commit is still
skipped when a revision is provided but --contents is not provided.
Generating such a commit in that case would combine the currently
checked out file contents with the provided revision, which breaks
normal blame behavior and produces unexpected results.

This enables use of --contents with an arbitrary revision, rather than
forcing the use of the local HEAD commit. This makes the --contents
option significantly more flexible, as it is no longer required to check
out the working tree to the desired commit before using --contents.

Reword the documentation so that its clear that --contents can be used
with <rev>.

Add tests for the --contents option to the annotate-tests.sh test
script.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-24 12:05:22 -07:00
Jeff King aa548459a0 fast-export: de-obfuscate --anonymize-map handling
When we handle an --anonymize-map option, we parse the orig/anon pair,
and then feed the "orig" string to anonymize_str(), along with a
generator function that duplicates the "anon" string to be cached in the
map.

This works, because anonymize_str() says "ah, there is no mapping yet
for orig; I'll add one from the generator". But there are some
downsides:

  1. It's a bit too clever, as it's not obvious what the code is trying
     to do or why it works.

  2. It requires allowing generator functions to take an extra void
     pointer, which is not something any of the normal callers of
     anonymize_str() want.

  3. It does the wrong thing if the same token is provided twice.
     When there are conflicting options, like:

       git fast-export --anonymize \
         --anonymize-map=foo:one \
	 --anonymize-map=foo:two

     we usually let the second one override the first. But by using
     anonymize_str(), which has first-one-wins logic, we do the
     opposite.

So instead of relying on anonymize_str(), let's directly add the entry
ourselves. We can tweak the tests to show that we handle overridden
options correctly now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-22 15:37:09 -07:00
Junio C Hamano ba235249c0 Merge branch 'fc/test-aggregation-clean-up'
Code clean-up for test framework.

* fc/test-aggregation-clean-up:
  test: don't print aggregate-results command
  test: simplify counts aggregation
2023-03-21 14:18:56 -07:00
Junio C Hamano 1071deae00 Merge branch 'aj/ls-files-format-fix'
Fix for a "ls-files --format="%(path)" that produced nonsense
output, which was a bug in 2.38.

* aj/ls-files-format-fix:
  ls-files: fix "--format" output of relative paths
2023-03-21 14:18:55 -07:00
Junio C Hamano 15108de2fa Merge branch 'jk/format-patch-ignore-noprefix'
"git format-patch" honors the src/dst prefixes set to nonstandard
values with configuration variables like "diff.noprefix", causing
receiving end of the patch that expects the standard -p1 format to
break.  Teach "format-patch" to ignore end-user configuration and
always use the standard prefixes.

This is a backward compatibility breaking change.

* jk/format-patch-ignore-noprefix:
  rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch
  format-patch: add format.noprefix option
  format-patch: do not respect diff.noprefix
  diff: add --default-prefix option
  t4013: add tests for diff prefix options
  diff: factor out src/dst prefix setup
2023-03-21 14:18:55 -07:00
Elijah Newren d48be35ca6 write-or-die.h: move declarations for write-or-die.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren 61a7b98264 treewide: remove cache.h inclusion due to setup.h changes
By moving several declarations to setup.h, the previous patch made it
possible to remove the include of cache.h in several source files.  Do
so.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren 9875058870 treewide: remove cache.h inclusion due to environment.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren 32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren a64acf7298 treewide: remove unnecessary includes of cache.h
The last several commits were geared at replacing the include of cache.h
in strbuf.c with an include of git-compat-util.h.  Unfortunately, I had
to drop a patch moving some functions from cache.h to object-name.h, due
to excessive conflicts with other in-flight topics.

However, even without that patch, the series of patches so far allows us
to modify a number of C files to replace an include of cache.h with
git-compat-util.h.  Do that to reduce our dependencies.

(If we could have kept our object-name.h patch in this series, it would
have also let us reduce the includes in checkout.c and fmt-merge-msg.c
in addition to strbuf.c).

Just to ensure that nothing else was bringing in cache.h, all of the
affected files have been checked to ensure that
    gcc -E -I. $SOURCE_FILE | grep '"cache.h"'
found no hits and that
    make DEVELOPER=1 ${OBJECT_FILE_FOR_SOURCE_FILE}
successfully compiles without warnings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren 0b027f6ca7 abspath.h: move absolute path functions from cache.h
This is another step towards letting us remove the include of cache.h in
strbuf.c.  It does mean that we also need to add includes of abspath.h
in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
Elijah Newren 4f6728d52d treewide: remove unnecessary cache.h inclusion from several sources
A number of files were apparently including cache.h solely to get
gettext.h.  By making those files explicitly include gettext.h, we can
already drop the include of cache.h in these files.  On top of that,
there were some files using cache.h that didn't need to for any reason.
Remove these unnecessary includes.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren 553d4d70d1 treewide: remove unnecessary inclusion of gettext.h
Looking at things from the opposite angle of the last patch, we had a
few files that were including gettext.h and perhaps needed it at some
point in history, but no longer require it.  Remove the include.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren a6dc3d364c treewide: remove unnecessary cache.h inclusion from a few headers
Ever since a64215b6cd ("object.h: stop depending on cache.h; make
cache.h depend on object.h", 2023-02-24), we have a few headers that
could have replaced their include of cache.h with an include of
object.h.  Make that change now.

Some C files had to start including cache.h after this change (or some
smaller header it had brought in), because the C files were depending
on things from cache.h but were only formerly implicitly getting
cache.h through one of these headers being modified in this patch.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:50 -07:00
Derrick Stolee cbfe360b14 commit-reach: add tips_reachable_from_bases()
Both 'git for-each-ref --merged=<X>' and 'git branch --merged=<X>' use
the ref-filter machinery to select references or branches (respectively)
that are reachable from a set of commits presented by one or more
--merged arguments. This happens within reach_filter(), which uses the
revision-walk machinery to walk history in a standard way.

However, the commit-reach.c file is full of custom searches that are
more efficient, especially for reachability queries that can terminate
early when reachability is discovered. Add a new
tips_reachable_from_bases() method to commit-reach.c and call it from
within reach_filter() in ref-filter.c. This affects both 'git branch'
and 'git for-each-ref' as tested in p1500-graph-walks.sh.

For the Linux kernel repository, we take an already-fast algorithm and
make it even faster:

Test                                            HEAD~1  HEAD
-------------------------------------------------------------------
1500.5: contains: git for-each-ref --merged     0.13    0.02 -84.6%
1500.6: contains: git branch --merged           0.14    0.02 -85.7%
1500.7: contains: git tag --merged              0.15    0.03 -80.0%

(Note that we remove the iterative 'git rev-list' test from p1500
because it no longer makes sense as a comparison to 'git for-each-ref'
and would just waste time running it for these comparisons.)

The algorithm is implemented in commit-reach.c in the method
tips_reachable_from_base(). This method takes a string_list of tips and
assigns the 'util' for each item with the value 1 if the base commit can
reach those tips.

Like other reachability queries in commit-reach.c, the fastest way to
search for "can A reach B?" is to do a depth-first search up to the
generation number of B, preferring to explore first parents before later
parents. While we must walk all reachable commits up to that generation
number when the answer is "no", the depth-first search can answer "yes"
much faster than other approaches in most cases.

This search becomes trickier when there are multiple targets for the
depth-first search. The commits with lower generation number are more
likely to be within the history of the start commit, but we don't want
to waste time searching commits of low generation number if the commit
target with lowest generation number has already been found.

The trick here is to take the input commits and sort them by generation
number in ascending order. Track the index within this order as
min_generation_index. When we find a commit, if its index in the list is
equal to min_generation_index, then we can increase the generation
number boundary of our search to the next-lowest value in the list.

With this mechanism, the number of commits to search is minimized with
respect to the depth-first search heuristic. We will walk all commits up
to the minimum generation number of a commit that is _not_ reachable
from the start, but we will walk only the necessary portion of the
depth-first search for the reachable commits of lower generation.

Add extra tests for this behavior in t6600-test-reach.sh as the
interesting data shape of that repository can sometimes demonstrate
corner case bugs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee 49abcd21da for-each-ref: add ahead-behind format atom
The previous change implemented the ahead_behind() method, including an
algorithm to compute the ahead/behind values for a number of commit tips
relative to a number of commit bases. Now, integrate that algorithm as
part of 'git for-each-ref' hidden behind a new format atom,
ahead-behind. This naturally extends to 'git branch' and 'git tag'
builtins, as well.

This format allows specifying multiple bases, if so desired, and all
matching references are compared against all of those bases. For this
reason, failing to read a reference provided from these atoms results in
an error.

In order to translate the ahead_behind() method information to the
format output code in ref-filter.c, we must populate arrays of
ahead_behind_count structs. In struct ref_array, we store the full array
that will be passed to ahead_behind(). In struct ref_array_item, we
store an array of pointers that point to the relvant items within the
full array. In this way, we can pull all relevant ahead/behind values
directly when formatting output for a specific item. It also ensures the
lifetime of the ahead_behind_count structs matches the time that the
array is being used.

Add specific tests of the ahead/behind counts in t6600-test-reach.sh, as
it has an interesting repository shape. In particular, its merging
strategy and its use of different commit-graphs would demonstrate over-
counting if the ahead_behind() method did not already account for that
possibility.

Also add tests for the specific for-each-ref, branch, and tag builtins.
In the case of 'git tag', there are intersting cases that happen when
some of the selected tips are not commits. This requires careful logic
around commits_nr in the second loop of filter_ahead_behind(). Also, the
test in t7004 is carefully located to avoid being dependent on the GPG
prereq. It also avoids using the test_commit helper, as that will add
ticks to the time and disrupt the expected timestamps in later tag
tests.

Also add performance tests in a new p1300-graph-walks.sh script. This
will be useful for more uses in the future, but for now compare the
ahead-behind counting algorithm in 'git for-each-ref' to the naive
implementation by running 'git rev-list --count' processes for each
input.

For the Git source code repository, the improvement is already obvious:

Test                                            this tree
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref   0.07(0.07+0.00)
1500.3: ahead-behind counts: git branch         0.07(0.06+0.00)
1500.4: ahead-behind counts: git tag            0.07(0.06+0.00)
1500.5: ahead-behind counts: git rev-list       1.32(1.04+0.27)

But the standard performance benchmark is the Linux kernel repository,
which demosntrates a significant improvement:

Test                                            this tree
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref   0.27(0.24+0.02)
1500.3: ahead-behind counts: git branch         0.27(0.24+0.03)
1500.4: ahead-behind counts: git tag            0.28(0.27+0.01)
1500.5: ahead-behind counts: git rev-list       4.57(4.03+0.54)

The 'git rev-list' test exists in this change as a demonstration, but it
will be removed in the next change to avoid wasting time on this
comparison.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee 2ee11f7261 commit-graph: return generation from memory
The commit_graph_generation() method used to report a value of
GENERATION_NUMBER_INFINITY if the commit_graph_data_slab had an instance
for the given commit but the graph_pos indicated the commit was not in
the commit-graph file.

However, an upcoming change will introduce the ability to set generation
values in-memory without writing the commit-graph file. Thus, we can no
longer trust 'graph_pos' to indicate whether or not the generation
member can be trusted.

Instead, trust the 'generation' member if the commit has a value in the
slab _and_ the 'generation' member is non-zero. Otherwise, treat it as
GENERATION_NUMBER_INFINITY.

This only makes a difference for a very old case for the commit-graph:
the very first Git release to write commit-graph files wrote zeroes in
the topological level positions. If we are parsing a commit-graph with
all zeroes, those commits will now appear to have
GENERATION_NUMBER_INFINITY (as if they were not parsed from the
commit-graph).

I attempted several variations to work around the need for providing an
uninitialized 'generation' member, but this was the best one I found. It
does require a change to a verification test in t5318 because it reports
a different error than the one about non-zero generation numbers.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee b2c51b7590 for-each-ref: explicitly test no matches
The for-each-ref builtin can take a list of ref patterns, but if none
match, it still succeeds (but with no output). Add an explicit test that
demonstrates that behavior.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:32 -07:00
Derrick Stolee b73dec5530 for-each-ref: add --stdin option
When a user wishes to input a large list of patterns to 'git
for-each-ref' (likely a long list of exact refs) there are frequently
system limits on the number of command-line arguments.

Add a new --stdin option to instead read the patterns from standard
input. Add tests that check that any unrecognized arguments are
considered an error when --stdin is provided. Also, an empty pattern
list is interpreted as the complete ref set.

When reading from stdin, we populate the filter.name_patterns array
dynamically as opposed to pointing to the 'argv' array directly. This is
simple when using a strvec, as it is NULL-terminated in the same way. We
then free the memory directly from the strvec.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:32 -07:00
Phillip Wood 1f2e05f0b7 wildmatch: fix exponential behavior
When dowild() cannot match a '*' or '/**/' wildcard then it must return
WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe
this results in unnecessary backtracking and the time taken for a failed
match increases exponentially with the number of wildcards in the
pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH
for a failed match resulting in long match times for patterns containing
multiple wildcards as can be seen in the following benchmark.
(Note that the timings in the Benchmark 1 are really measuring the time
to execute test-tool rather than the time to match the pattern)

Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"
  Time (mean ± σ):      22.8 ms ±   1.7 ms    [User: 12.1 ms, System: 10.6 ms]
  Range (min … max):    19.4 ms …  26.9 ms    113 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"
  Time (mean ± σ):      5.244 s ±  0.228 s    [User: 5.229 s, System: 0.010 s]
  Range (min … max):    4.969 s …  5.707 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran
  230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"'

The security implications are limited as it only affects operations that
are potentially DoS vectors. For example by creating a blob containing
such a pattern a malicious user can exploit this behavior to use large
amounts of CPU time on a remote server by pushing the blob and then
creating a new clone with --filter=sparse:oid. However this filter type
is usually disabled as it is known to consume large amounts of CPU time
even without this bug.

The WM_MATCH changed in the first hunk of this patch comes from the
original implementation imported from rsync in 5230f605e1 (Import
wildmatch from rsync, 2012-10-15). Compared to the others converted here
it is fairly harmless as it only triggers at the end of the pattern and
so will only cause a single unnecessary backtrack. The others introduced
by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for
"*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause
exponential behavior.

A new test is added to protect against future regressions.

[1] https://research.swtch.com/glob

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 10:58:53 -07:00
Andrei Rybak a93cbe8d78 t1507: assert output of rev-parse
Tests in t1507-rev-parse-upstream.sh compare files "expect" and "actual"
to assert the output of "git rev-parse", "git show", and "git log".
However, two of the tests '@{reflog}-parsing does not look beyond colon'
and '@{upstream}-parsing does not look beyond colon' don't inspect the
contents of the created files.

Assert output of "git rev-parse" in tests in t1507-rev-parse-upstream.sh
to improve test coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
Andrei Rybak 7deec9442f t1404: don't create unused file
Some tests in file t1404-update-ref-errors.sh create file "unchanged" as
the expected side for a test_cmp assertion at the end of the test for
output of "git for-each-ref".  Test 'no bogus intermediate values during
delete' also creates a file named "unchanged" using "git for-each-ref".
However, the file isn't used for any assertions in the test.  Instead,
"git rev-parse" is used to compare the reference with variable $D.

Don't create unused file "unchanged" in test 'no bogus intermediate
values during delete' of t1404-update-ref-errors.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
Andrei Rybak 94f07b5544 t1400: assert output of update-ref
In t1400-update-ref.sh test 'transaction can create and delete' creates
files "expect" and "actual", but doesn't compare them.  Similarly, test
'transaction cannot restart ongoing transaction' redirects output of
"git update-ref" to file "actual", but doesn't check its contents with
any assertions.

Assert output of "git update-ref" in tests to improve test coverage in
t1400-update-ref.sh.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:42 -07:00
Andrei Rybak 17ae7f758e t1302: don't create unused file
Test 'gitdir selection on unsupported repo' in t1302-repo-version.sh
writes output of a "git config" invocation to file "actual".  However,
the test doesn't have any assertions for the file.  The file was used by
this test until commit b9605bc4f2 (config: only read .git/config from
configured repos, 2016-09-12), before which "git config" was expected to
print the bogus value of "core.repositoryformatversion" to standard
output.

Don't redirect output of "git config" to file "actual" in test 'gitdir
selection on unsupported repo'.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
Andrei Rybak f4b98e17cf t1010: don't create unused files
Builtin "git mktree" writes the the object name of the tree object built
to the standard output.  Tests 'mktree refuses to read ls-tree -r output
(1)' and 'mktree refuses to read ls-tree -r output (2)' in
"t1010-mktree.sh" redirect output of "git mktree" to a file, but don't
use its contents in assertions.

Don't redirect output of "git mktree" to file "actual" in tests that
assert that an invocation of "git mktree" must fail.

Output of "git mktree" is empty when it refuses to build a tree object.
So, alternatively, the test could assert that the output is empty.
However, there isn't a good reason for the user to expect the command to
be silent in such cases, so we shouldn't enforce it.  The user shouldn't
use the output of a failing command anyway.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
Andrei Rybak 4e273368ce t1006: assert error output of cat-file
Test "cat-file $arg1 $arg2 error on missing full OID" in
t1006-cat-file.sh compares files "expect.err" and "err.actual" to assert
the expected error output of "git cat-file".  A similar test in the same
file named "cat-file $arg1 $arg2 error on missing short OID" also
creates these two files, but doesn't use them in assertions.

Assert error output of "git cat-file" in test "cat-file $arg1 $arg2
error on missing short OID" of t1006-cat-file.sh to improve test
coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
Andrei Rybak 8fc184c0eb t1005: assert output of ls-files
Test 'reset should work' in t1005-read-tree-reset.sh compares two files
"expect" and "actual" to assert the expected output of "git ls-files".
Several other tests in the same file also create files "expect" and
"actual", but don't use them in assertions.

Assert output of "git ls-files" in t1005-read-tree-reset.sh to improve
test coverage.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 09:11:41 -07:00
Junio C Hamano a9f4a01760 Merge branch 'jk/add-p-unmerged-fix'
"git add -p" while the index is unmerged sometimes failed to parse
the diff output it internally produces and died, which has been
corrected.

* jk/add-p-unmerged-fix:
  add-patch: handle "* Unmerged path" lines
2023-03-19 15:03:13 -07:00
Junio C Hamano 947604ddb7 Merge branch 'ew/fetch-no-write-fetch-head-fix'
* ew/fetch-no-write-fetch-head-fix:
  fetch: pass --no-write-fetch-head to subprocesses
2023-03-19 15:03:13 -07:00
Junio C Hamano fc1a4ce043 Merge branch 'ab/fix-strategy-opts-parsing'
The code to parse "git rebase -X<opt>" was not prepared to see an
unparsable option string, which has been corrected.

* ab/fix-strategy-opts-parsing:
  sequencer.c: fix overflow & segfault in parse_strategy_opts()
2023-03-19 15:03:12 -07:00
Junio C Hamano 5c92a451be Merge branch 'jk/format-patch-change-format-for-empty-commits'
"git format-patch" learned to write a log-message only output file
for empty commits.

* jk/format-patch-change-format-for-empty-commits:
  format-patch: output header for empty commits
2023-03-19 15:03:12 -07:00
Junio C Hamano 95de376349 Merge branch 'jk/bundle-use-dash-for-stdfiles'
"git bundle" learned that "-" is a common way to say that the input
comes from the standard input and/or the output goes to the
standard output.  It used to work only for output and only from the
root level of the working tree.

* jk/bundle-use-dash-for-stdfiles:
  parse-options: use prefix_filename_except_for_dash() helper
  parse-options: consistently allocate memory in fix_filename()
  bundle: don't blindly apply prefix_filename() to "-"
  bundle: document handling of "-" as stdin
  bundle: let "-" mean stdin for reading operations
2023-03-19 15:03:12 -07:00
Junio C Hamano 12201fd756 Merge branch 'jk/bundle-progress'
Simplify UI to control progress meter given by "git bundle" command.

* jk/bundle-progress:
  bundle: turn on --all-progress-implied by default
2023-03-19 15:03:11 -07:00
Junio C Hamano 96a806f87a Merge branch 'rj/avoid-switching-to-already-used-branch'
A few subcommands have been taught to stop users from working on a
branch that is being used in another worktree linked to the same
repository.

* rj/avoid-switching-to-already-used-branch:
  switch: reject if the branch is already checked out elsewhere (test)
  rebase: refuse to switch to a branch already checked out elsewhere (test)
  branch: fix die_if_checked_out() when ignore_current_worktree
  worktree: introduce is_shared_symref()
2023-03-19 15:03:11 -07:00
Junio C Hamano c79786c486 Merge branch 'rj/bisect-already-used-branch'
Allow "git bisect reset" to check out the original branch when the
branch is already checked out in a different worktree linked to the
same repository.

* rj/bisect-already-used-branch:
  bisect: fix "reset" when branch is checked out elsewhere
2023-03-19 15:03:11 -07:00
Junio C Hamano 4a25b911cd Merge branch 'zh/push-to-delete-onelevel-ref'
"git push" has been taught to allow deletion of refs with one-level
names to help repairing a repository who acquired such a ref by
mistake.  In general, we don't encourage use of such a ref, and
creation or update to such a ref is rejected as before.

* zh/push-to-delete-onelevel-ref:
  push: allow delete single-level ref
  receive-pack: fix funny ref error messsage
2023-03-19 15:03:10 -07:00
Junio C Hamano 67076b85b8 Merge branch 'ak/restore-both-incompatible-with-conflicts'
"git restore" supports options like "--ours" that are only
meaningful during a conflicted merge, but these options are only
meaningful when updating the working tree files.  These options are
marked to be incompatible when both "--staged" and "--worktree" are
in effect.

* ak/restore-both-incompatible-with-conflicts:
  restore: fault --staged --worktree with merge opts
2023-03-19 15:03:10 -07:00
Junio C Hamano 6f54213718 Merge branch 'ab/avoid-losing-exit-codes-in-tests'
Test clean-up.

* ab/avoid-losing-exit-codes-in-tests:
  tests: don't lose misc "git" exit codes
  tests: don't lose exit status with "test <op> $(git ...)"
  tests: don't lose "git" exit codes in "! ( git ... | grep )"
  tests: don't lose exit status with "(cd ...; test <op> $(git ...))"
  t/lib-patch-mode.sh: fix ignored exit codes
  auto-crlf tests: don't lose exit code in loops and outside tests
2023-03-19 15:03:10 -07:00
Jeff King eaa0fd6584 git_connect(): fix corner cases in downgrading v2 to v0
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:

  1. it checks the "prog" variable, which is actually the path to
     receive-pack on the remote side. By default this is just
     "git-receive-pack", but it could be an arbitrary string (like
     "/path/to/git receive-pack", etc). We'd accidentally stay in v2
     mode in this case.

  2. besides "receive-pack" and "upload-pack", there's one other value
     we'd expect: "upload-archive" for handling "git archive --remote".
     Like receive-pack, this doesn't understand v2, and should use the
     v0 protocol.

In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.

We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).

We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.

The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).

Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:

  - transport_connect() is called only for git-upload-archive. For
    non-http git remotes, that resolves to the virtual connect_git()
    function (which then calls git_connect(); confused yet?). So
    plumbing through "name" in connect_git() covers that.

  - for regular fetches and pushes, callers use higher-level functions
    like transport_fetch_refs(). For non-http git remotes, that means
    calling git_connect() under the hood via connect_setup(). And that
    uses the "for_push" flag to decide which name to use.

  - likewise, plumbing like fetch-pack and send-pack may call
    git_connect() directly; they each know which name to use.

  - for remote helpers (including http), we already have separate
    parameters for "name" and "exec" (another name for "prog"). In
    process_connect_service(), we feed the "name" to the helper via
    "connect" or "stateless-connect" directives.

    There's also a "servpath" option, which can be used to tell the
    helper about the "exec" path. But no helpers we implement support
    it! For http it would be useless anyway (no reasonable server
    implementation will allow you to send a shell command to run the
    server). In theory it would be useful for more obscure helpers like
    remote-ext, but even there it is not implemented.

    It's tempting to get rid of it simply to reduce confusion, but we
    have publicly documented it since it was added in fa8c097cc9
    (Support remote helpers implementing smart transports, 2009-12-09),
    so it's possible some helper in the wild is using it.

  - So for v2, helpers (again, including http) are mainly used via
    stateless-connect, driven by the main program. But they do still
    need to decide whether to do a v2 probe. And so there's similar
    logic in remote-curl.c's discover_refs() that looks for
    "git-receive-pack". But it's not buggy in the same way. Since it
    doesn't support servpath, it is always dealing with a "service"
    string like "git-receive-pack". And since it doesn't support
    straight "connect", it can't be used for "upload-archive".

    So we could leave that spot alone. But I've updated it here to match
    the logic we're changing in connect_git(). That seems like the least
    confusing thing for somebody who has to touch both of these spots
    later (say, to add v2 push support). I didn't add a new test to make
    sure this doesn't break anything; we already have several tests (in
    t5551 and elsewhere) that make sure we are using v2 over http.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 15:15:59 -07:00
Junio C Hamano 4d87411ffe Merge branch 'ew/fetch-hiderefs'
A new "fetch.hideRefs" option can be used to exclude specified refs
from "rev-list --objects --stdin --not --all" traversal for
checking object connectivity, most useful when there are many
unrelated histories in a single repository.

* ew/fetch-hiderefs:
  fetch: support hideRefs to speed up connectivity checks
2023-03-17 14:03:10 -07:00
Junio C Hamano 92c56da096 Merge branch 'mc/credential-helper-www-authenticate'
Allow information carried on the WWW-AUthenticate header to be
passed to the credential helpers.

* mc/credential-helper-www-authenticate:
  credential: add WWW-Authenticate header to cred requests
  http: read HTTP WWW-Authenticate response headers
  t5563: add tests for basic and anoymous HTTP access
2023-03-17 14:03:10 -07:00
Junio C Hamano af5388d2dd Merge branch 'jc/gpg-lazy-init'
Instead of forcing each command to choose to honor GPG related
configuration variables, make the subsystem lazily initialize
itself.

* jc/gpg-lazy-init:
  drop pure pass-through config callbacks
  gpg-interface: lazily initialize and read the configuration
2023-03-17 14:03:10 -07:00
Junio C Hamano d0732a8120 Merge branch 'jk/unused-post-2.39-part2'
More work towards -Wunused.

* jk/unused-post-2.39-part2: (21 commits)
  help: mark unused parameter in git_unknown_cmd_config()
  run_processes_parallel: mark unused callback parameters
  userformat_want_item(): mark unused parameter
  for_each_commit_graft(): mark unused callback parameter
  rewrite_parents(): mark unused callback parameter
  fetch-pack: mark unused parameter in callback function
  notes: mark unused callback parameters
  prio-queue: mark unused parameters in comparison functions
  for_each_object: mark unused callback parameters
  list-objects: mark unused callback parameters
  mark unused parameters in signal handlers
  run-command: mark error routine parameters as unused
  mark "pointless" data pointers in callbacks
  ref-filter: mark unused callback parameters
  http-backend: mark unused parameters in virtual functions
  http-backend: mark argc/argv unused
  object-name: mark unused parameters in disambiguate callbacks
  serve: mark unused parameters in virtual functions
  serve: use repository pointer to get config
  ls-refs: drop config caching
  ...
2023-03-17 14:03:09 -07:00
Junio C Hamano 88cc8ed8bc Merge branch 'en/header-cleanup'
Code clean-up to clarify the rule that "git-compat-util.h" must be
the first to be included.

* en/header-cleanup:
  diff.h: remove unnecessary include of object.h
  Remove unnecessary includes of builtin.h
  treewide: replace cache.h with more direct headers, where possible
  replace-object.h: move read_replace_refs declaration from cache.h to here
  object-store.h: move struct object_info from cache.h
  dir.h: refactor to no longer need to include cache.h
  object.h: stop depending on cache.h; make cache.h depend on object.h
  ident.h: move ident-related declarations out of cache.h
  pretty.h: move has_non_ascii() declaration from commit.h
  cache.h: remove dependence on hex.h; make other files include it explicitly
  hex.h: move some hex-related declarations from cache.h
  hash.h: move some oid-related declarations from cache.h
  alloc.h: move ALLOC_GROW() functions from cache.h
  treewide: remove unnecessary cache.h includes in source files
  treewide: remove unnecessary cache.h includes
  treewide: remove unnecessary git-compat-util.h includes in headers
  treewide: ensure one of the appropriate headers is sourced first
2023-03-17 14:03:09 -07:00
Junio C Hamano f17d232f14 Merge branch 'en/dir-api-cleanup'
Code clean-up to clarify directory traversal API.

* en/dir-api-cleanup:
  unpack-trees: add usage notices around df_conflict_entry
  unpack-trees: special case read-tree debugging as internal usage
  unpack-trees: rewrap a few overlong lines from previous patch
  unpack-trees: mark fields only used internally as internal
  unpack_trees: start splitting internal fields from public API
  sparse-checkout: avoid using internal API of unpack-trees, take 2
  sparse-checkout: avoid using internal API of unpack-trees
  unpack-trees: clean up some flow control
  dir: mark output only fields of dir_struct as such
  dir: add a usage note to exclude_per_dir
  dir: separate public from internal portion of dir_struct
  unpack-trees: heed requests to overwrite ignored files
  t2021: fix platform-specific leftover cruft
2023-03-17 14:03:08 -07:00
Junio C Hamano 2d019f46b0 Merge branch 'jk/fsck-indices-in-worktrees'
"git fsck" learned to check the index files in other worktrees,
just like "git gc" honors them as anchoring points.

* jk/fsck-indices-in-worktrees:
  fsck: check even zero-entry index files
  fsck: mention file path for index errors
  fsck: check index files in all worktrees
  fsck: factor out index fsck
2023-03-17 14:03:08 -07:00
Felipe Contreras 7ee1af8cb8 completion: prompt: use generic colors
When the prompt command mode was introduced in 1bfc51ac81 (Allow
__git_ps1 to be used in PROMPT_COMMAND, 2012-10-10), the assumption was
that it was necessary in order to properly add colors to PS1 in bash,
but this wasn't true.

It's true that the \[ \] markers add the information needed to properly
calculate the width of the prompt, and they have to be added directly to
PS1, a function returning them doesn't work.

But that is because bash coverts the \[ \] markers in PS1 to \001 \002,
which is what readline ultimately needs in order to calculate the width.

We don't need bash to do this conversion, we can use \001 \002
ourselves, and then the prompt command mode is not necessary to display
colors.

This is what functions returning colors are supposed to do [1].

[1] http://mywiki.wooledge.org/BashFAQ/053

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Joakim Petersen <joak-pet@online.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-16 15:58:22 -07:00
Felipe Contreras dfbfdc521d object-name: fix quiet @{u} parsing
Currently `git rev-parse --quiet @{u}` is not actually quiet when
upstream isn't configured:

  fatal: no upstream configured for branch 'foo'

Make it so.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-16 10:44:56 -07:00
Adam Johnson cfb62dd006 ls-files: fix "--format" output of relative paths
Fix a bug introduced with the "--format" option in
ce74de93 (ls-files: introduce "--format" option, 2022-07-23),
where relative paths were computed using the output buffer,
which could lead to random garbage data in the output.

Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-10 09:16:16 -08:00
Felipe Contreras 90ff7c9898 test: don't print aggregate-results command
There's no value in it.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 14:57:57 -08:00
Felipe Contreras 5d1d62e875 test: simplify counts aggregation
When the list of files as input was implemented in 6508eedf67
(t/aggregate-results: accomodate systems with small max argument list
length, 2010-06-01), a much simpler solution wasn't considered.

Let's just pass the directory as an argument.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 14:57:55 -08:00
Eric Wong 15184ae9da fetch: pass --no-write-fetch-head to subprocesses
It seems a user would expect this option would work regardless
of whether it's fetching from a single remote, many remotes,
or recursing into submodules.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 11:06:39 -08:00
Jeff King 28d1122f9c add-patch: handle "* Unmerged path" lines
When we generate a diff with --cached, unmerged entries have no oid for
their index entry:

  $ git diff-index --abbrev --cached HEAD
  :100644 000000 f719efd 0000000 U	my-conflict

So when we are asked to produce a patch, since we only have one side, we
just emit a special message:

  $ git diff-index --cached -p HEAD
  * Unmerged path my-conflict

This confuses interactive-patch modes that look at cached diffs. For
example:

  $ git reset -p
  BUG: add-patch.c:498: diff starts with unexpected line:
  * Unmerged path my-conflict

Making things even more confusing, you'll get that error only if the
unmerged entry is alphabetically the first changed file. Otherwise, we
simply stick the unrecognized line to the end of the previous hunk.
There it's mostly harmless, as it eventually gets fed back to "git
apply", which happily ignores it. But it's still shown to the user
attached to the hunk, which is wrong.

So let's handle these lines as a noop. There's not really anything
useful to do with a conflicted merge in this case, and that's what we do
for other cases like "add -p". There we get a "diff --cc" line, which we
accept as starting a new file, but we refuse to use any of its hunks
(their headers start with "@@@" and not "@@ ", so we silently ignore
them).

It seems like simply recognizing the line and continuing in our parsing
loop would work. But we actually need to run the rest of the loop body
to handle matching up our colored/filtered output. But that code assumes
that we have some active file_diff we're working on. So instead, we'll
just insert a dummy entry into our array. This ends up the same as if we
saw a "diff --cc" line (a file with no hunks).

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 10:06:18 -08:00
Jeff King 8d5213decf format-patch: add format.noprefix option
The previous commit dropped support for diff.noprefix in format-patch.
While this will do the right thing in most cases (where sending patches
without a prefix was an accidental side effect of the sender preferring
to see their local patches without prefixes), it left no good option for
a project or workflow where you really do want to send patches without
prefixes. You'd be stuck using "--no-prefix" for every invocation.

So let's add a config option specific to format-patch that enables this
behavior. That gives people who have such a workflow a way to get what
they want, but makes it hard to accidentally trigger it.

A more backwards-compatible way of doing the transition would be to have
format.noprefix default to diff.noprefix when it's not set. But that
doesn't really help the "accidental" problem; people would have to
manually set format.noprefix=false. And it's unlikely that anybody
really wants format.noprefix=true in the first place. I'm adding it here
mostly as an escape hatch, not because anybody has expressed any
interest in it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:37:27 -08:00
Jeff King c169af8f7a format-patch: do not respect diff.noprefix
The output of format-patch respects diff.noprefix, but this usually ends
up being a hassle for people receiving the patch, as they have to
manually specify "-p0" in order to apply it.

I don't think there was any specific intention for it to behave this
way. The noprefix option is handled by git_diff_ui_config(), and
format-patch exists in a gray area between plumbing and porcelain.
People do look at the output, and we'd expect it to colorize things,
respect their choice of algorithm, and so on. But this particular option
creates problems for the receiver (in theory so does diff.mnemonicprefix,
but since we are always formatting commits, the mnemonic prefixes will
always be "a/" and "b/").

So let's disable it. The slight downsides are:

  - people who have set diff.noprefix presumably like to see their
    patches without prefixes. If they use format-patch to review their
    series, they'll see prefixes. On the other hand, it is probably a
    good idea for them to look at what will actually get sent out.

    We could try to play games here with "is stdout a tty", as we do for
    color. But that's not a completely reliable signal, and it's
    probably not worth the trouble. If you want to see the patch with
    the usual bells and whistles, then you are better off using "git
    log" or "git show".

  - if a project really does have a workflow that likes prefix-less
    patches, and the receiver is prepared to use "-p0", then the sender
    now has to manually say "--no-prefix" for each format-patch
    invocation. That doesn't seem _too_ terrible given that the receiver
    has to manually say "-p0" for each git-am invocation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:23 -08:00
Jeff King b39a569729 diff: add --default-prefix option
You can change the output of prefixes with diff.noprefix and
diff.mnemonicprefix, but there's no easy way to override them from the
command-line. We do have "--no-prefix", but there's no way to get back
to the default prefix. So let's add an option to do that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:21 -08:00
Jeff King 7c03d0db88 t4013: add tests for diff prefix options
We don't have any specific test coverage of diff's various prefix
options. We do incidentally invoke them in a few places, but it's worth
having a more thorough set of tests that covers all of the effects we
expect to see, and that the options kick in at the appropriate times.

This will be especially useful as the next patch adds more options.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-09 08:32:19 -08:00
Ævar Arnfjörð Bjarmason 15a4cc912e sequencer.c: fix overflow & segfault in parse_strategy_opts()
The split_cmdline() function introduced in [1] returns an "int". If
it's negative it signifies an error. The option parsing in [2] didn't
account for this, and assigned the value directly to the "size_t
xopts_nr". We'd then attempt to loop over all of these elements, and
access uninitialized memory.

There's a few things that use this for option parsing, but one way to
trigger it is with a bad value to "-X <strategy-option>", e.g:

	git rebase -X"bad argument\""

In another context this might be a security issue, but in this case
someone who's already able to inject arguments directly to our
commands would be past other defenses, making this potential
escalation a moot point.

As the example above & test case shows the error reporting leaves
something to be desired. The function will loop over the
whitespace-split values, but when it encounters an error we'll only
report the first element, which is OK, not the second "argument\""
whose quote is unbalanced.

This is an inherent limitation of the current API, and the issue
affects other API users. Let's not attempt to fix that now. If and
when that happens these tests will need to be adjusted to assert the
new output.

1. 2b11e3170e (If you have a config containing something like this:,
   2006-06-05)
2. ca6c6b45dd (sequencer (rebase -i): respect strategy/strategy_opts
   settings, 2017-01-02)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-08 14:14:42 -08:00
Jeff King 7ce4088ab7 parse-options: consistently allocate memory in fix_filename()
When handling OPT_FILENAME(), we have to stick the "prefix" (if any) in
front of the filename to make up for the fact that Git has chdir()'d to
the top of the repository. We can do this with prefix_filename(), but
there are a few special cases we handle ourselves.

Unfortunately the memory allocation is inconsistent here; if we do make
it to prefix_filename(), we'll allocate a string which the caller must
free to avoid a leak. But if we hit our special cases, we'll return the
string as-is, and a caller which tries to free it will crash. So there's
no way to win.

Let's consistently allocate, so that callers can do the right thing.

There are now three cases to care about in the function (and hence a
three-armed if/else):

  1. we got a NULL input (and should leave it as NULL, though arguably
     this is the sign of a bug; let's keep the status quo for now and we
     can pick at that scab later)

  2. we hit a special case that means we leave the name intact; we
     should duplicate the string. This includes our special "-"
     matching. Prior to this patch, it also included empty prefixes and
     absolute filenames. But we can observe that prefix_filename()
     already handles these, so we don't need to detect them.

  3. everything else goes to prefix_filename()

I've dropped the "const" from the "char **file" parameter to indicate
that we're allocating, though in practice it's not really important.
This is all being shuffled through a void pointer via opt->value before
it hits code which ever looks at the string. And it's even a bit weird,
because we are really taking _in_ a const string and using the same
out-parameter for a non-const string. A better function signature would
be:

  static char *fix_filename(const char *prefix, const char *file);

but that would mean the caller dereferences the double-pointer (and the
NULL check is currently handled inside this function). So I took the
path of least-change here.

Note that we have to fix several callers in this commit, too, or we'll
break the leak-checking tests. These are "new" leaks in the sense that
they are now triggered by the test suite, but these spots have always
been leaky when Git is run in a subdirectory of the repository. I fixed
all of the cases that trigger with GIT_TEST_PASSING_SANITIZE_LEAK. There
may be others in scripts that have other leaks, but we can fix them
later along with those other leaks (and again, you _couldn't_ fix them
before this patch, so this is the necessary first step).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:14:45 -08:00
Junio C Hamano a8bfa99d44 bundle: don't blindly apply prefix_filename() to "-"
A user can specify a filename to a command from the command line,
either as the value given to a command line option, or a command
line argument.  When it is given as a relative filename, in the
user's mind, it is relative to the directory "git" was started from,
but by the time the filename is used, "git" would almost always have
chdir()'ed up to the root level of the working tree.

The given filename, if it is relative, needs to be prefixed with the
path to the current directory, and it typically is done by calling
prefix_filename() helper function.  For commands that can also take
"-" to use the standard input or the standard output, however, this
needs to be done with care.

"git bundle create" uses the next word on the command line as the
output filename, and can take "-" to mean "write to the standard
output".  It blindly called prefix_filename(), so running it in a
subdirectory did not quite work as expected.

Introduce a new helper, prefix_filename_except_for_dash(), and use
it to help "git bundle create" codepath.

Reported-by: Michael Henry
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:12:56 -08:00
Jeff King bf8b1e04ff bundle: let "-" mean stdin for reading operations
For writing, "bundle create -" indicates that the bundle should be
written to stdout. But there's no matching handling of "-" for reading
operations. This is inconsistent, and a little inflexible (though one
can always use "/dev/stdin" on systems that support it).

However, it's easy to change. Once upon a time, the bundle-reading code
required a seekable descriptor, but that was fixed long ago in
e9ee84cf28 (bundle: allowing to read from an unseekable fd,
2011-10-13). So we just need to handle "-" explicitly when opening the
file.

We _could_ do this by handling "-" in read_bundle_header(), which the
reading functions all call already. But that is probably a bad idea.
It's also used by low-level code like the transport functions, and we
may want to be more careful there. We do not know that stdin is even
available to us, and certainly we would not want to get confused by a
configured URL that happens to point to "-".

So instead, let's add a helper to builtin/bundle.c. Since both the
bundle code and some of the callers refer to the bundle by name for
error messages, let's use the string "<stdin>" to make the output a bit
nicer to read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 13:12:55 -08:00
Jeff King 8b95521edb bundle: turn on --all-progress-implied by default
In 79862b6b77 (bundle-create: progress output control, 2019-11-10),
"bundle create" learned about the --all-progress and
--all-progress-implied options, which were copied from pack-objects.
I think these were a mistake.

In pack-objects, "all-progress-implied" is about switching the behavior
between a regular on-disk "git repack" and the use of pack-objects for
push/fetch (where a fetch does not want progress from the server during
the write stage; the client will print progress as it receives the
data). But there's no such distinction for bundles. Prior to
79862b6b77, we always printed the write stage. Afterwards, a vanilla:

  git bundle create foo.bundle

omits the write progress, appearing to hang (especially if your
repository is large or your disk is slow). That seems like a regression.

It's possible that the flexibility to disable the write-phase progress
_could_ be useful for bundle. E.g., if you did something like:

  ssh some-host git bundle create foo.bundle |
  git bundle unbundle

But if you are running both in real-time, why are you using bundles in
the first place? You're better off doing a real fetch.

But even if we did want to support that, it should be the exception, and
vanilla "bundle create" should display the full progress. So we'd want
to name the option "--no-write-progress" or something.

The "--all-progress" option itself is even worse. It exists in
pack-objects only for historical reasons. It's a mistake because it
implies "--progress", and we added "--all-progress-implied" to fix that.
There is no reason to propagate that mistake to new commands.

Likewise, the documentation for these options was pulled from
pack-objects. But it doesn't make any sense in this context. It talks
about "--stdout", but that is not even an option that git-bundle
supports.

This patch flips the default for "--all-progress-implied" back to
"true", fixing the regression in 79862b6b77. This turns that option
into a noop, and means that "--all-progress" is really the same as
"--progress". We _could_ drop them completely, but since they've been
shipped with Git since v2.25.0, it's polite to continue accepting them.

I didn't implement any sort of "--no-write-progress" here. I'm not at
all convinced it's necessary, and the discussion from the original
thread:

  https://lore.kernel.org/git/20191110204126.30553-2-robbat2@gentoo.org/

shows that that the main focus was on getting --progress and --quiet
support, and not any kind of clever "real-time bundle over the network"
feature. But technically this patch is making it impossible to do
something that you _could_ do post-79862b6b77c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-06 09:51:06 -08:00
John Keeping 94c4289435 format-patch: output header for empty commits
When formatting an empty commit, it is surprising that a totally empty
file is generated.  Set the flag to always print the header, matching
the behaviour of git-log.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-03 09:13:52 -08:00
ZheNing Hu 7c3c55026c push: allow delete single-level ref
We discourage the creation/update of single-level refs
because some upper-layer applications only work in specified
reference namespaces, such as "refs/heads/*" or "refs/tags/*",
these single-level refnames may not be recognized. However,
we still hope users can delete them which have been created
by mistake.

Therefore, when updating branches on the server with
"git receive-pack", by checking whether it is a branch deletion
operation, it will determine whether to allow the update of
a single-level refs. This avoids creating/updating such
single-level refs, but allows them to be deleted.

On the client side, "git push" also does not properly fill in
the old-oid of single-level refs, which causes the server-side
"git receive-pack" to think that the ref's old-oid has changed
when deleting single-level refs, this causes the push to be
rejected. So the solution is to fix the client to be able to
delete single-level refs by properly filling old-oid.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-01 08:08:10 -08:00
Junio C Hamano 4240e0f6c0 Merge branch 'ar/test-lib-remove-stale-comment'
Test library clean-up.

* ar/test-lib-remove-stale-comment:
  test-lib: drop comment about test_description
2023-02-28 16:38:47 -08:00
Junio C Hamano 8760a2b3c6 Merge branch 'zy/t9700-style'
Test style fixes.

* zy/t9700-style:
  t9700: modernize test scripts
2023-02-28 16:38:47 -08:00
Junio C Hamano a2d2b5229e Merge branch 'pw/rebase-i-parse-fix'
Fixes to code that parses the todo file used in "rebase -i".

* pw/rebase-i-parse-fix:
  rebase -i: fix parsing of "fixup -C<commit>"
  rebase -i: match whole word in is_command()
2023-02-28 16:38:47 -08:00
Junio C Hamano b2893ea403 Merge branch 'jk/http-test-fixes'
Various fix-ups on HTTP tests.

* jk/http-test-fixes:
  t5559: make SSL/TLS the default
  t5559: fix test failures with LIB_HTTPD_SSL
  t/lib-httpd: enable HTTP/2 "h2" protocol, not just h2c
  t/lib-httpd: respect $HTTPD_PROTO in expect_askpass()
  t5551: drop curl trace lines without headers
  t5551: handle v2 protocol in cookie test
  t5551: simplify expected cookie file
  t5551: handle v2 protocol in upload-pack service test
  t5551: handle v2 protocol when checking curl trace
  t5551: stop forcing clone to run with v0 protocol
  t5551: handle HTTP/2 when checking curl trace
  t5551: lower-case headers in expected curl trace
  t5551: drop redundant grep for Accept-Language
  t5541: simplify and move "no empty path components" test
  t5541: stop marking "used receive-pack service" test as v0 only
  t5541: run "used receive-pack service" test earlier
2023-02-28 16:38:47 -08:00
Matthew John Cheetham 5f2117b24f credential: add WWW-Authenticate header to cred requests
Add the value of the WWW-Authenticate response header to credential
requests. Credential helpers that understand and support HTTP
authentication and authorization can use this standard header (RFC 2616
Section 14.47 [1]) to generate valid credentials.

WWW-Authenticate headers can contain information pertaining to the
authority, authentication mechanism, or extra parameters/scopes that are
required.

The current I/O format for credential helpers only allows for unique
names for properties/attributes, so in order to transmit multiple header
values (with a specific order) we introduce a new convention whereby a
C-style array syntax is used in the property name to denote multiple
ordered values for the same property.

In this case we send multiple `wwwauth[]` properties where the order
that the repeated attributes appear in the conversation reflects the
order that the WWW-Authenticate headers appeared in the HTTP response.

Add a set of tests to exercise the HTTP authentication header parsing
and the interop with credential helpers. Credential helpers will receive
WWW-Authenticate information in credential requests.

[1] https://datatracker.ietf.org/doc/html/rfc2616#section-14.47

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:40:40 -08:00
Matthew John Cheetham 988aad99b4 t5563: add tests for basic and anoymous HTTP access
Add a test showing simple anoymous HTTP access to an unprotected
repository, that results in no credential helper invocations.
Also add a test demonstrating simple basic authentication with
simple credential helper support.

Leverage a no-parsed headers (NPH) CGI script so that we can directly
control the HTTP responses to simulate a multitude of good, bad and ugly
remote server implementations around auth.

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-27 10:40:40 -08:00
Junio C Hamano 630501ceef Merge branch 'jc/countermand-format-attach'
The format.attach configuration variable lacked a way to override a
value defined in a lower-priority configuration file (e.g. the
system one) by redefining it in a higher-priority configuration
file.  Now, setting format.attach to an empty string means show the
patch inline in the e-mail message, without using MIME attachment.

This is a backward incompatible change.

* jc/countermand-format-attach:
  format.attach: allow empty value to disable multi-part messages
2023-02-27 10:08:57 -08:00
Junio C Hamano dda83e69d0 Merge branch 'jk/shorten-unambiguous-ref-wo-sscanf'
sscanf(3) used in "git symbolic-ref --short" implementation found
to be not working reliably on macOS in UTF-8 locales.  Rewrite the
code to avoid sscanf() altogether to work it around.

* jk/shorten-unambiguous-ref-wo-sscanf:
  shorten_unambiguous_ref(): avoid sscanf()
  shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
  shorten_unambiguous_ref(): avoid integer truncation
2023-02-27 10:08:57 -08:00
Junio C Hamano 7dc55a04d8 Merge branch 'mh/credential-password-expiry'
The credential subsystem learned that a password may have an
explicit expiration.

* mh/credential-password-expiry:
  credential: new attribute password_expiry_utc
2023-02-27 10:08:57 -08:00
Junio C Hamano 5e572aaa5d Merge branch 'rs/archive-mtime'
"git archive HEAD^{tree}" records the paths with the current
timestamp in the archive, making it harder to obtain a stable
output.  The command learned the --mtime option to specify an
arbitrary timestamp (e.g. --mtime="@0 +0000" for the epoch).

* rs/archive-mtime:
  archive: add --mtime
2023-02-27 10:08:57 -08:00
Junio C Hamano b8840a72e2 Merge branch 'tb/drop-dir-iterator-follow-symlink-bit'
Remove leftover and unused code.

* tb/drop-dir-iterator-follow-symlink-bit:
  t0066: drop setup of "dir5"
  dir-iterator: drop unused `DIR_ITERATOR_FOLLOW_SYMLINKS`
2023-02-27 10:08:57 -08:00
Junio C Hamano 63f74cfbcc Merge branch 'tl/range-diff-custom-abbrev'
"git range-diff" learned --abbrev=<num> option.

* tl/range-diff-custom-abbrev:
  range-diff: let '--abbrev' option takes effect
2023-02-27 10:08:56 -08:00
Junio C Hamano 93c12724f1 Merge branch 'ap/t2015-style-update'
Test clean-up.

* ap/t2015-style-update:
  t2015-checkout-unborn.sh: changes the style for cd
2023-02-27 10:08:56 -08:00