Commit graph

61697 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
2708ce62d2 branch: sort detached HEAD based on a flag
Change the ref-filter sorting of detached HEAD to check the
FILTER_REFS_DETACHED_HEAD flag, instead of relying on the ref
description filled-in by get_head_description() to start with "(",
which in turn we expect to ASCII-sort before any other reference.

For context, we'd like the detached line to appear first at the start
of "git branch -l", e.g.:

    $ git branch -l
    * (HEAD detached at <hash>)
      master

This doesn't change that, but improves on a fix made in
28438e84e0 (ref-filter: sort detached HEAD lines firstly, 2019-06-18)
and gives the Chinese translation the ability to use its preferred
punctuation marks again.

In Chinese the fullwidth versions of punctuation like "()" are
typically written as (U+FF08 fullwidth left parenthesis), (U+FF09
fullwidth right parenthesis) instead[1]. This form is used in both
po/zh_{CN,TW}.po in most cases where "()" is translated in a string.

Aside from that improvement to the Chinese translation, it also just
makes for cleaner code that we mark any special cases in the ref_array
we're sorting with flags and make the sort function aware of them,
instead of piggy-backing on the general-case of strcmp() doing the
right thing.

As seen in the amended tests this made reverse sorting a bit more
consistent. Before this we'd sometimes sort this message in the
middle, now it's consistently at the beginning or end, depending on
whether we're doing a normal or reverse sort. Having it at the end
doesn't make much sense either, but at least it behaves consistently
now. A follow-up commit will make this behavior under reverse sorting
even better.

I'm removing the "TRANSLATORS" comments that were in the old code
while I'm at it. Those were added in d4919bb288 (ref-filter: move
get_head_description() from branch.c, 2017-01-10). I think it's
obvious from context, string and translation memory in typical
translation tools that these are the same or similar string.

1. https://en.wikipedia.org/wiki/Chinese_punctuation#Marks_similar_to_European_punctuation

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason
7c269a7b16 ref-filter: move ref_sorting flags to a bitfield
Change the reverse/ignore_case/version sort flags in the ref_sorting
struct into a bitfield. Having three of them was already a bit
unwieldy, but it would be even more so if another flag needed a
function like ref_sorting_icase_all() introduced in
76f9e569ad (ref-filter: apply --ignore-case to all sorting keys,
2020-05-03).

A follow-up change will introduce such a flag, so let's move this over
to a bitfield. Instead of using the usual '#define' pattern I'm using
the "enum" pattern from builtin/rebase.c's b4c8eb024a (builtin
rebase: support --quiet, 2018-09-04).

Perhaps there's a more idiomatic way of doing the "for each in list
amend mask" pattern than this "mask/on" variable combo. This function
doesn't allow us to e.g. do any arbitrary changes to the bitfield for
multiple flags, but I think in this case that's fine. The common case
is that we're calling this with a list of one.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason
d0947483a3 ref-filter: move "cmp_fn" assignment into "else if" arm
Further amend code changed in 7c5045fc18 (ref-filter: apply fallback
refname sort only after all user sorts, 2020-05-03) to move an
assignment only used in the "else if" arm to happen there. Before that
commit the cmp_fn would be used outside of it.

We could also just skip the "cmp_fn" assignment and use
strcasecmp/strcmp directly in a ternary statement here, but this is
probably more readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason
75c50e599c ref-filter: add braces to if/else if/else chain
Per the CodingGuidelines add braces to an if/else if/else chain where
only the "else" had braces. This is in preparation for a subsequent
change where the "else if" will have lines added to it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Jeff King
6aed56736b fsck: reject .gitmodules git:// urls with newlines
The previous commit taught the clone/fetch client side to reject a
git:// URL with a newline in it. Let's also catch these when fscking a
.gitmodules file, which will give an earlier warning.

Note that it would be simpler to just complain about newline in _any_
URL, but an earlier tightening for http/ftp made sure we kept allowing
newlines for unknown protocols (and this is covered in the tests). So
we'll stick to that precedent.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 14:25:44 -08:00
Jeff King
a02ea57717 git_connect_git(): forbid newlines in host and path
When we connect to a git:// server, we send an initial request that
looks something like:

  002dgit-upload-pack repo.git\0host=example.com

If the repo path contains a newline, then it's included literally, and
we get:

  002egit-upload-pack repo
  .git\0host=example.com

This works fine if you really do have a newline in your repository name;
the server side uses the pktline framing to parse the string, not
newlines. However, there are many _other_ protocols in the wild that do
parse on newlines, such as HTTP. So a carefully constructed git:// URL
can actually turn into a valid HTTP request. For example:

  git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 %0d%0aHost:localhost%0d%0a%0d%0a

becomes:

  0050git-upload-pack /
  GET / HTTP/1.1
  Host:localhost

  host=localhost:1234

on the wire. Again, this isn't a problem for a real Git server, but it
does mean that feeding a malicious URL to Git (e.g., through a
submodule) can cause it to make unexpected cross-protocol requests.
Since repository names with newlines are presumably quite rare (and
indeed, we already disallow them in git-over-http), let's just disallow
them over this protocol.

Hostnames could likewise inject a newline, but this is unlikely a
problem in practice; we'd try resolving the hostname with a newline in
it, which wouldn't work. Still, it doesn't hurt to err on the side of
caution there, since we would not expect them to work in the first
place.

The ssh and local code paths are unaffected by this patch. In both cases
we're trying to run upload-pack via a shell, and will quote the newline
so that it makes it intact. An attacker can point an ssh url at an
arbitrary port, of course, but unless there's an actual ssh server
there, we'd never get as far as sending our shell command anyway.  We
_could_ similarly restrict newlines in those protocols out of caution,
but there seems little benefit to doing so.

The new test here is run alongside the git-daemon tests, which cover the
same protocol, but it shouldn't actually contact the daemon at all.  In
theory we could make the test more robust by setting up an actual
repository with a newline in it (so that our clone would succeed if our
new check didn't kick in). But a repo directory with newline in it is
likely not portable across all filesystems. Likewise, we could check
git-daemon's log that it was not contacted at all, but we do not
currently record the log (and anyway, it would make the test racy with
the daemon's log write). We'll just check the client-side stderr to make
sure we hit the expected code path.

Reported-by: Harold Kim <h.kim@flatt.tech>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 14:25:44 -08:00
Junio C Hamano
72c4083ddf The first batch in 2.31 cycle
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 23:33:44 -08:00
Junio C Hamano
d3aff11c3e Merge branch 'es/perf-export-fix'
Tweak unneeded recursion from a test framework helper function.

* es/perf-export-fix:
  t/perf: avoid unnecessary test_export() recursion
2021-01-06 23:33:44 -08:00
Junio C Hamano
cf4b0714f7 Merge branch 'fc/t6030-bisect-reset-removes-auxiliary-files'
A 3-year old test that was not testing anything useful has been
corrected.

* fc/t6030-bisect-reset-removes-auxiliary-files:
  test: bisect-porcelain: fix location of files
2021-01-06 23:33:44 -08:00
Junio C Hamano
8664fcb83b Merge branch 'es/worktree-repair-both-moved'
"git worktree repair" learned to deal with the case where both the
repository and the worktree moved.

* es/worktree-repair-both-moved:
  worktree: teach `repair` to fix multi-directional breakage
2021-01-06 23:33:44 -08:00
Junio C Hamano
45a177069f Merge branch 'en/merge-ort-recursive'
The ORT merge strategy learned to synthesize virtual ancestor tree
by recursively merging multiple merge bases together, just like the
recursive backend has done for years.

* en/merge-ort-recursive:
  merge-ort: implement merge_incore_recursive()
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: copy a few small helper functions from merge-recursive.c
  commit: move reverse_commit_list() from merge-recursive
2021-01-06 23:33:44 -08:00
Junio C Hamano
d3fa84d528 Merge branch 'fc/pull-merge-rebase'
When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation.  Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.

* fc/pull-merge-rebase:
  pull: display default warning only when non-ff
  pull: correct condition to trigger non-ff advice
  pull: get rid of unnecessary global variable
  pull: give the advice for choosing rebase/merge much later
  pull: refactor fast-forward check
2021-01-06 23:33:44 -08:00
Junio C Hamano
85cf82ff01 Merge branch 'en/merge-ort-2'
More "ORT" merge strategy.

* en/merge-ort-2:
  merge-ort: add modify/delete handling and delayed output processing
  merge-ort: add die-not-implemented stub handle_content_merge() function
  merge-ort: add function grouping comments
  merge-ort: add a paths_to_free field to merge_options_internal
  merge-ort: add a path_conflict field to merge_options_internal
  merge-ort: add a clear_internal_opts helper
  merge-ort: add a few includes
2021-01-06 23:33:44 -08:00
Junio C Hamano
f9d29daba6 Merge branch 'en/merge-ort-impl'
The merge backend "done right" starts to emerge.

* en/merge-ort-impl:
  merge-ort: free data structures in merge_finalize()
  merge-ort: add implementation of record_conflicted_index_entries()
  tree: enable cmp_cache_name_compare() to be used elsewhere
  merge-ort: add implementation of checkout()
  merge-ort: basic outline for merge_switch_to_result()
  merge-ort: step 3 of tree writing -- handling subdirectories as we go
  merge-ort: step 2 of tree writing -- function to create tree object
  merge-ort: step 1 of tree writing -- record basenames, modes, and oids
  merge-ort: have process_entries operate in a defined order
  merge-ort: add a preliminary simple process_entries() implementation
  merge-ort: avoid recursing into identical trees
  merge-ort: record stage and auxiliary info for every path
  merge-ort: compute a few more useful fields for collect_merge_info
  merge-ort: avoid repeating fill_tree_descriptor() on the same tree
  merge-ort: implement a very basic collect_merge_info()
  merge-ort: add an err() function similar to one from merge-recursive
  merge-ort: use histogram diff
  merge-ort: port merge_start() from merge-recursive
  merge-ort: add some high-level algorithm structure
  merge-ort: setup basic internal data structures
2021-01-06 23:33:43 -08:00
Junio C Hamano
c256631065 Merge branch 'tb/pack-bitmap'
Various improvements to the codepath that writes out pack bitmaps.

* tb/pack-bitmap: (24 commits)
  pack-bitmap-write: better reuse bitmaps
  pack-bitmap-write: relax unique revwalk condition
  pack-bitmap-write: use existing bitmaps
  pack-bitmap: factor out 'add_commit_to_bitmap()'
  pack-bitmap: factor out 'bitmap_for_commit()'
  pack-bitmap-write: ignore BITMAP_FLAG_REUSE
  pack-bitmap-write: build fewer intermediate bitmaps
  pack-bitmap.c: check reads more aggressively when loading
  pack-bitmap-write: rename children to reverse_edges
  t5310: add branch-based checks
  commit: implement commit_list_contains()
  bitmap: implement bitmap_is_subset()
  pack-bitmap-write: fill bitmap with commit history
  pack-bitmap-write: pass ownership of intermediate bitmaps
  pack-bitmap-write: reimplement bitmap writing
  ewah: add bitmap_dup() function
  ewah: implement bitmap_or()
  ewah: make bitmap growth less aggressive
  ewah: factor out bitmap growth
  rev-list: die when --test-bitmap detects a mismatch
  ...
2021-01-06 23:33:43 -08:00
Junio C Hamano
b62bbd3580 Merge branch 'ab/trailers-extra-format'
The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.

* ab/trailers-extra-format:
  pretty format %(trailers): add a "key_value_separator"
  pretty format %(trailers): add a "keyonly"
  pretty-format %(trailers): fix broken standalone "valueonly"
  pretty format %(trailers) doc: avoid repetition
  pretty format %(trailers) test: split a long line
2021-01-06 23:33:43 -08:00
Junio C Hamano
c977ff4407 Merge branch 'pk/subsub-fetch-fix-take-2'
"git fetch --recurse-submodules" fix (second attempt).

* pk/subsub-fetch-fix-take-2:
  submodules: fix of regression on fetching of non-init subsub-repo
2021-01-06 23:33:43 -08:00
Philippe Blain
80f5a16798 mergetool--lib: fix '--tool-help' to correctly show available tools
Commit 83bbf9b92e (mergetool--lib: improve support for vimdiff-style tool
variants, 2020-07-29) introduced a regression in the output of `git mergetool
--tool-help` and `git difftool --tool-help` [1].

In function 'show_tool_names' in git-mergetool--lib.sh, we loop over the
supported mergetools and their variants and accumulate them in the variable
'variants', separating them with a literal '\n'.

The code then uses 'echo $variants' to turn these '\n' into newlines, but this
behaviour is not portable, it just happens to work in some shells, like
dash(1)'s 'echo' builtin.

For shells in which 'echo' does not turn '\n' into newlines, the end
result is that the only tools that are shown are the existing variants
(except the last variant alphabetically), since the variants are
separated by actual newlines in '$variants' because of the several
'echo' calls in mergetools/{bc,vimdiff}::list_tool_variants.

Fix this bug by embedding an actual line feed into `variants` in
show_tool_names(). While at it, replace `sort | uniq` by `sort -u`.

To prevent future regressions, add a simple test that checks that a few
known tools are correctly shown (let's avoid counting the total number
of tools to lessen the maintenance burden when new tools are added or if
'--tool-help' learns additional logic, like hiding tools depending on
the current platform).

[1] https://lore.kernel.org/git/CADtb9DyozjgAsdFYL8fFBEWmq7iz4=prZYVUdH9W-J5CKVS4OA@mail.gmail.com/

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Based-on-patch-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 18:31:27 -08:00
Matheus Tavares
ea8bbf2a4e t4129: don't fail if setgid is set in the test directory
The last test of t4129 creates a directory and expects its setgid bit
(g+s) to be off. But this makes the test fail when the parent directory
has the bit set, as setgid's state is inherited by newly created
subdirectories.

One way to solve this problem is to allow the presence of this bit when
comparing the return of `test_modebits` with the expected value. But
then we may have the same problem in the future when other tests start
using `test_modebits` on directories (currently t4129 is the only one)
and forget about setgid. Instead, let's make the helper function more
robust with respect to the state of the setgid bit in the test directory
by removing this bit from the returning value. There should be no
problem with existing callers as no one currently expects this bit to be
on.

Note that the sticky bit (+t) and the setuid bit (u+s) are not
inherited, so we don't have to worry about those.

Reported-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:59:17 -08:00
Ævar Arnfjörð Bjarmason
08bf6a8bc3 branch tests: add to --sort tests
Further stress the --sort callback in ref-filter.c. The implementation
uses certain short-circuiting logic, let's make sure it behaves the
same way on e.g. name & version sort. Improves a test added in
aedcb7dc75 (branch.c: use 'ref-filter' APIs, 2015-09-23).

I don't think all of this output makes sense, but let's test for the
behavior as-is, we can fix bugs in it in a later commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:16:56 -08:00
Ævar Arnfjörð Bjarmason
ffdd02a55d branch: change "--local" to "--list" in comment
There has never been a "git branch --local", this is just a typo for
"--list". Fixes a comment added in 23e714df91 (branch: roll
show_detached HEAD into regular ref_list, 2015-09-23).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:15:39 -08:00
ZheNing Hu
e73fe3dd02 builtin/*: update usage format
According to the guidelines in parse-options.h,
we should not end in a full stop or start with
a capital letter. Fix old error and usage
messages to match this expectation.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:10:49 -08:00
Junio C Hamano
4ca7994b2a parse-options: format argh like error messages
"Keep it homogeneous across the repository" is in general a
guideline that can be used to converge to a good practice, but
we can be a bit more prescriptive in this case.  Just like the
messages we give die(_("...")) are formatted without the final
full stop and without the initial capitalization, most of the
argument help text are already formatted that way, and we want
to encourage that as the house style.

Noticed-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:10:27 -08:00
Ævar Arnfjörð Bjarmason
06ce79152b mktag: add a --[no-]strict option
Now that mktag has been migrated to use the fsck machinery to check
its input, it makes sense to teach it to run in the equivalent of "git
fsck"'s default mode.

For cases where mktag is used to (re)create a tag object using data
from an existing and malformed tag object, the validation may
optionally have to be loosened. Teach the command to take the
"--[no-]strict" option to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 14:22:24 -08:00
Martin Ågren
6a8c89d053 read-cache: try not to peek into struct {lock_,temp}file
Similar to the previous commits, try to avoid peeking into the `struct
lock_file`. We also have some `struct tempfile`s -- let's avoid looking
into those as well.

Note that `do_write_index()` takes a tempfile and that when we call it,
we either have a tempfile which we can easily hand down, or we have a
lock file, from which we need to somehow obtain the internal tempfile.
So we need to leave that one instance of peeking-into. Nevertheless,
this commit leaves us not relying on exactly how the path of the
tempfile / lock file is stored internally.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Martin Ågren
7f0dc7998b refs/files-backend: don't peek into struct lock_file
Similar to the previous commits, avoid peeking into the `struct
lock_file`. Use the lock file API instead. Note how we obtain the path
to the lock file if `fdopen_lock_file()` failed and that this is not a
problem: as documented in lockfile.h, failure to "fdopen" does not roll
back the lock file and we're free to, e.g., query it for its path.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Martin Ågren
acd7160201 midx: don't peek into struct lock_file
Similar to the previous commits, avoid peeking into the `struct
lock_file`. Use the lock file API instead.

The two functions we're calling here double-check that the tempfile is
indeed "active", which is arguably overkill considering how we took the
lock on the line immediately above. More importantly, this future-proofs
us against, e.g., other code appearing between these two lines or the
lock file and/or tempfile internals changing.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Martin Ågren
a52cdce936 commit-graph: don't peek into struct lock_file
Similar to the previous commit, avoid peeking into the `struct
lock_file`. Use the lock file API instead.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Martin Ågren
d4a4976648 builtin/gc: don't peek into struct lock_file
A `struct lock_file` is pretty much just a wrapper around a tempfile.
But it's easy enough to avoid relying on this. Use the wrappers that the
lock file API provides rather than peeking at the temp file or even into
*its* internals.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Taylor Blau
cc2d43be2b p7519: allow running without watchman prereq
p7519 measures the performance of the fsmonitor code. To do this, it
uses the installed copy of Watchman. If Watchman isn't installed, a noop
integration script is installed in its place.

When in the latter mode, it is expected that the script should not write
a "last update token": in fact, it doesn't write anything at all since
the script is blank.

Commit 33226af42b (t/perf/fsmonitor: improve error message if typoing
hook name, 2020-10-26) made sure that running 'git update-index
--fsmonitor' did not write anything to stderr, but this is not the case
when using the empty Watchman script, since Git will complain that:

    $ which watchman
    watchman not found
    $ cat .git/hooks/fsmonitor-empty
    $ git -c core.fsmonitor=.git/hooks/fsmonitor-empty update-index --fsmonitor
    warning: Empty last update token.

Prior to 33226af42b, the output wasn't checked at all, which allowed
this noop mode to work. But, 33226af42b breaks p7519 when running it
without a 'watchman(1)' on your system.

Handle this by only checking that the stderr is empty only when running
with a real watchman executable. Otherwise, assert that the error
message is the expected one when running in the noop mode.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:48:25 -08:00
Ævar Arnfjörð Bjarmason
2aa9425fbe mktag: mark strings for translation
Mark the errors mktag might emit for translation. This is a plumbing
command, but the errors it emits are intended to be human-readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
3f390a366c mktag: convert to parse-options
Convert the "mktag" command to use parse-options.h instead of its own
ad-hoc argc handling. This doesn't matter much in practice since it
doesn't support any options, but removes another special-case in our
codebase, and makes it easier to add options to it in the future.

It does marginally improve the situation for programs that want to
execute git commands in a consistent manner and e.g. always use
--end-of-options. E.g. "gitaly" does that, and has a blacklist of
built-ins that don't support --end-of-options. This is one less
special case for it and other similar programs to support.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
9a1a3a4d4c mktag: allow omitting the header/body \n separator
Change mktag's acceptance rules to accept an empty body without an
empty line after the header again. This fixes an ancient unintended
dregression in "mktag".

When "mktag" was introduced in ec4465adb3 (Add "tag" objects that can
be used to sign other objects., 2005-04-25) the input checks were much
looser. When it was documented it 6cfec03680 (mktag: minimally update
the description., 2007-06-10) it was clearly intended for this \n to
be optional:

    The message, when [it] exists, is separated by a blank line from
    the header.

But then in e0aaf781f6 (mktag.c: improve verification of tagger field
and tests, 2008-03-27) this was made an error, seemingly by
accident. It was just a result of the general header checks, and all
the tests after that patch have a trailing empty line (but did not
before).

Let's allow this again, and tweak the test semantics changed in
e0aaf781f6 to remove the redundant empty line. New tests added in
previous commits of mine already added an explicit test for allowing
the empty line between header and body.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acfc01332b mktag: allow turning off fsck.extraHeaderEntry
In earlier commits mktag learned to use the fsck machinery, at which
point we needed to add fsck.extraHeaderEntry so it could be as strict
about extra headers as it's been ever since it was implemented.

But it's not nice to need to switch away from "mktag" to "hash-object"
+ manual "fsck" just because you'd like to have an extra header. So
let's support turning it off by getting "fsck.*" variables from the
config.

Pedantically speaking it's still not possible to make "mktag" behave
just like "hash-object -t tag" does, since we're unconditionally going
to check the referenced object in verify_object_in_tag(), which is our
own check, and not one that exists in fsck.c.

But the spirit of "this works like fsck" is preserved, in that if you
created such a tag with "hash-object" and did a full "fsck" on the
repository it would also error out about that invalid object, it just
wouldn't emit the same message as fsck does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
1f3299fda9 fsck: make fsck_config() re-usable
Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This
allows for re-using it in other tools that expose fsck logic and want
to support its configuration variables.

A logical continuation of this change would be to use a common
function for all of {fetch,receive}.fsck.* and fsck.*. See
5d477a334a (fsck (receive-pack): allow demoting errors to warnings,
2015-06-22) and my own 1362df0d41 (fetch: implement fetch.fsck.*,
2018-07-27) for the relevant code.

However, those routines want to not parse the fsck.skipList into OIDs,
but rather pass them along with the --strict option to another
process. It would be possible to refactor that whole thing so we
support e.g. a "fetch." prefix, then just keep track of the skiplist
as a filename instead of parsing it, and learn to spew that all out
from our internal structures into something we can append to the
--strict option.

But instead I'm planning to re-use this in "mktag", which'll just
re-use these "fsck.*" variables as-is.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acf9de4c94 mktag: use fsck instead of custom verify_tag()
Change the validation logic in "mktag" to use fsck's fsck_tag()
instead of its own custom parser. Curiously the logic for both dates
back to the same commit[1]. Let's unify them so we're not maintaining
two sets functions to verify that a tag is OK.

The behavior of fsck_tag() and the old "mktag" code being removed here
is different in few aspects.

I think it makes sense to remove some of those checks, namely:

 A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag
    code disallowed values larger than 1400.

    Yes there's currently no timezone with a greater offset[2], but
    since we allow any number of non-offical timezones (e.g. +1234)
    passing this through seems fine. Git also won't break in the
    future if e.g. French Polynesia decides it needs to outdo the Line
    Islands when it comes to timezone extravagance.

 B. fsck allows missing author names such as "tagger <email>", mktag
    wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but
    not "tagger [1 space] <email>"). Now we allow all of these.

 C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
    allows it.

In some ways fsck_tag() is stricter than "mktag" was, namely:

 D. fsck disallows zero-padded dates, but mktag didn't care. So
    e.g. the timestamp "0000000000 +0000" produces an error now. A
    test in "t1006-cat-file.sh" relied on this, it's been changed to
    use "hash-object" (without fsck) instead.

There was one check I deemed worth keeping by porting it over to
fsck_tag():

 E. "mktag" did not allow any custom headers, and by extension (as an
    empty commit is allowed) also forbade an extra stray trailing
    newline after the headers it knew about.

    Add a new check in the "ignore" category to fsck and use it. This
    somewhat abuses the facility added in efaba7cc77 (fsck:
    optionally ignore specific fsck issues completely, 2015-06-22).

    This is somewhat of hack, but probably the least invasive change
    we can make here. The fsck command will shuffle these categories
    around, e.g. under --strict the "info" becomes a "warn" and "warn"
    becomes "error". Existing users of fsck's (and others,
    e.g. index-pack) --strict option rely on this.

    So we need to put something into a category that'll be ignored by
    all existing users of the API. Pretending that
    fsck.extraHeaderEntry=error ("ignore" by default) was set serves
    to do this for us.

1. ec4465adb3 (Add "tag" objects that can be used to sign other
   objects., 2005-04-25)

2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
40ef015a27 mktag: use puts(str) instead of printf("%s\n", str)
This introduces no functional change, but refactors the print-out of
the hash at the end to do the same thing with less code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
dfe3948728 mktag: remove redundant braces in one-line body "if"
This minor stylistic churn is usually something we'd avoid, but if we
don't do this then the file after changes in subsequent commits will
only have this minor style inconsistency, so let's change this while
we're at it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
0c439117bb mktag: use default strbuf_read() hint
Change the hardcoded hint of 2^12 to 0. The default strbuf hint is
perfectly fine here, and the only reason we were hardcoding it is
because it survived migration from a pre-strbuf fixed-sized buffer.

See fd17f5b5f7 (Replace all read_fd use with strbuf_read, and get rid
of it., 2007-09-10) for that migration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
692654dca0 mktag tests: test verify_object() with replaced objects
Add tests to demonstrate what "mktag" does in the face of replaced
objects.

There was an existing test for replaced objects fed to "mktag" added
in cc400f5011 (mktag: call "check_sha1_signature" with the
replacement sha1, 2009-01-23), but that one only tests a
commit->commit mapping. Not a mapping to a different type as like
we're also testing for here. We could remove the "mktag" test in
t6050-replace.sh now if the created tag wasn't being used by a
subsequent "fsck" test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
30f882c16d mktag tests: improve verify_object() test coverage
The verify_object() function in "mktag.c" is tasked with ensuring that
our tag refers to a valid object.

The existing test for this might fail because it was also testing that
"type taggg" didn't refer to a valid object type (it should be "type
tag"), or because we referred to a valid object but got the type
wrong.

Let's split these tests up, so we're testing all combinations of a
non-existing object and in invalid/wrong "type" lines.

We need to provide GIT_TEST_GETTEXT_POISON=false here because the
"invalid object type" error is emitted by
parse_loose_header_extended(), which has that message already marked
for translation. Another option would be to use test_i18ngrep, but I
prefer always running the test, not skipping it under gettext poison
testing.

I'm not testing this in combination with "git replace". That'll be
done in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
ca9a1ed969 mktag tests: test "hash-object" compatibility
Change all the successful "mktag" tests to test that "hash-object"
produces the same hash for the input, and that fsck passes for
both.

This tests e.g. that "mktag" doesn't trim its input or otherwise munge
it in a way that "hash-object" doesn't.

Since we're doing an "fsck --strict" here at the end let's incorporate
the creation of the "mytag" name into this test, removing the
special-case at the end of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
47c95e77d1 mktag tests: stress test whitespace handling
Add tests for a couple of whitespace edge cases around the header/body
boundary.

I consider the requirement for a blank line before the empty body a
bug, it's a long-standing regression which goes against the command's
documented behavior. This bug will be addressed in a follow-up change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
3b9e4dd3a3 mktag tests: run "fsck" after creating "mytag"
Change the last test in the file to run an "fsck --strict" after
creating the tag at the end.

We're just doing this for good measure to check that fsck behaves as
expected now that there's finally a reference for our valid tag. Other
tests going to be checking this elsewhere, but it's nice to cover all
the edge cases in this test to make it as self-contained as possible.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
5c2303e0c7 mktag tests: don't create "mytag" twice
Change a test added in e0aaf781f6 (mktag.c: improve verification of
tagger field and tests, 2008-03-27) to not create "mytag", which
should only be created and verified at the end in an earlier test
added in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While we're at it let's prevent a similar logic error from creeping
into the test by asserting that "mytag" doesn't exist before we create
it. Let's do this by moving the test to use "update-ref", instead of
our own homebrew ad-hoc refstore update.

We're not really testing for anything yet by creating the tag at the
end here. A subsequent commit will change that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
317c176279 mktag tests: don't redirect stderr to a file needlessly
Remove the redirection of stderr to "message" in the valid tag
test. This pattern seems to have been copy/pasted from the failure
case in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While I'm at it do the same for the "replace" tests. The tag creation
I'm changing here seems to have been copy/pasted from the "mktag"
tests to those tests in cc400f5011 (mktag: call
"check_sha1_signature" with the replacement sha1, 2009-01-23).

Nobody examines the contents of the resulting "message" file, so the
net result is that error messages cannot be seen in "sh t3800-mktag.sh
-v" output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
0d35ccb5e0 mktag tests: remove needless SHA-1 hardcoding
Change the tests amended in acb49d1cc8 (t3800: make hash-size
independent, 2019-08-18) even more to make them independent of either
SHA-1 or SHA-256.

Some of these tests were failing for the wrong reasons. The first one
being modified here would fail because the line starts with "xxxxxx"
instead of "object", the rest of the line doesn't matter.

Let's just put a valid hash on the rest of the line anyway to narrow
the test down for just the s/object/xxxxxx/ case.

The second one being modified here would fail under
GIT_TEST_DEFAULT_HASH=sha256 because <some sha-1 length garbage> is an
invalid SHA-256, but we should really be testing <some sha-256 length
garbage> when under SHA-256.

This doesn't really matter since we should be able to trust other
parts of the code to validate things in the 0-9a-f range, but let's
keep it for good measure.

There's a later test which tests an invalid SHA which looks like a
valid one, to stress the "We refuse to tag something we can't
verify[...]" logic in mktag.c.

But here we're testing for a SHA-length string which contains
characters outside of the /[0-9a-f]/i set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
b5ca549c93 mktag tests: use "test_commit" helper
Replace ad-hoc setup of a single commit in the "mktag" tests with our
standard helper pattern. The old setup dated back to 446c6faec6 (New
tests and en-passant modifications to mktag., 2006-07-29) before the
helper existed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
aba5377f69 mktag tests: don't needlessly use a subshell
The use of a subshell dates back to e9b20943b7 (t/t3800: do not use a
temporary file to hold expected result., 2008-01-04). It's not needed
anymore, if it ever was.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
18430ed363 mktag doc: update to explain why to use this
Change the mktag documentation to compare itself to the similar
"hash-object -t tag" command. Before this someone reading the
documentation wouldn't have much of an idea what the difference
was.

Let's allude to our own validation logic, and cross-link the "mktag"
and "hash-object" documentation to aid discover-ability. A follow-up
change to migrate "mktag" to use "fsck" validation will make the part
about validation logic clearer.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00