Commit graph

11527 commits

Author SHA1 Message Date
Junio C Hamano 3821eb6c3d reset: reject --no-(mixed|soft|hard|merge|keep) option
"git reset --no-mixed" behaved exactly like "git reset --mixed",
which was nonsense.

If there were only two kinds, e.g. "mixed" vs "separate", it might
have made sense to make "git reset --no-mixed" behave identically to
"git reset --separate" and vice-versa, but because we have many
types of reset, let's just forbid "--no-mixed" and negated form of
other types.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 22:02:53 -07:00
Junio C Hamano 68cbb20e73 show-branch: reject --[no-](topo|date)-order
"git show-branch --no-topo-order" behaved exactly the same way as
"git show-branch --topo-order" did, which was nonsense.  This was
because we choose between topo- and date- by setting a variable to
either REV_SORT_IN_GRAPH_ORDER or REV_SORT_BY_COMMIT_DATE with
OPT_SET_INT() and REV_SORT_IN_GRAPH_ORDER happens to be 0.  The
OPT_SET_INT() macro assigns 0 to the target variable in respose to
the negated form of its option.

"--no-date-order" by luck behaves identically to "--topo-order"
exactly for the same reason, and it sort-of makes sense right now,
but the "sort-of makes sense" will quickly break down once we add a
third way to sort.  Not-A may be B when there are only two choices
between A and B, but once your choices become among A, B, and C,
not-A does not mean B.

Just mark these two ordering options to reject negation, and add a
test, which was missing.  "git show-branch --no-reflog" is also
unnegatable, so throw in a test for that while we are at it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 22:00:39 -07:00
Junio C Hamano d86a8f386d remote: simplify "remote add --tags" help text
The help text for the --tags option was split into two option[]
entries, which was a hacky way to give two lines of help text (the
second entry did not have either short or long help, and there was
no way to invoke its entry---it was there only for the help text).

As we now support multi-line text in the option help, let's make
the second line of the help a proper second line and remove the
hacky second entry.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 16:39:02 -07:00
Junio C Hamano 83bb8e5a06 show-branch: --no-sparse should give dense output
"git show-branch --no-sparse" behaved exactly the same way as "git
show-branch --sparse", which did not make any sense.  This was
because it used a variable "dense" initialized to 1 by default to
give "non sparse" behaviour, and OPT_SET_INT() to set the varilable
to 0 in response to the "--sparse" option.  Unfortunately,
OPT_SET_INT() sets 0 to the given variable when the option is
negated.

Flip the polarity of the variable "dense" by renaming it to "sparse"
and initializing it to 0, and have OPT_SET_INT() set the variable to
1 when "--sparse" is given.  This way, "--no-sparse" would set 0 to
the variable and would give us the "dense" behaviour.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-19 09:16:37 -07:00
Junio C Hamano ae2c912c04 parse-options: introduce OPT_IPVERSION()
The command line option parsing for "git clone", "git fetch", and
"git push" have duplicated implementations of parsing "--ipv4" and
"--ipv6" options, by having two OPT_SET_INT() for "ipv4" and "ipv6".

Introduce a new OPT_IPVERSION() macro and use it in these three
commands.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 14:35:54 -07:00
Junio C Hamano e12cb98e1e branch: reject "--no-all" and "--no-remotes" early
As the command line parser for "git branch --all" forgets to use
PARSE_OPT_NONEG, it accepted "git branch --no-all", and then passed
a nonsense value to the underlying machinery, leading to a fatal
error "filter_refs: invalid type".  The "--remotes" option had
exactly the same issue.

Catch the unsupported options early in the option parser.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 12:19:53 -07:00
Junio C Hamano 947ebd62a0 am: simplify parsing of "--[no-]keep-cr"
Command line options "--keep-cr" and its negation trigger
OPT_SET_INT_F(PARSE_OPT_NONEG) to set a variable to 1 and 0
respectively.  Using OPT_SET_INT() to implement the positive variant
that sets the variable to 1 without specifying PARSE_OPT_NONEG gives
us the negative variant to set it to 0 for free.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 12:19:31 -07:00
René Scharfe 991c552916 ls-tree: fix --no-full-name
Since 61fdbcf98b (ls-tree: migrate to parse-options, 2009-11-13) git
ls-tree has accepted the option --no-full-name, but it does the same
as --full-name, contrary to convention.  That's because it's defined
using OPT_SET_INT with a value of 0, where the negative variant sets
0 as well.

Turn --no-full-name into the opposite of --full-name by using OPT_BOOL
instead and storing the option's status directly in a variable named
"full_name" instead of in negated form in "chomp_prefix".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-18 09:38:24 -07:00
Junio C Hamano c6a5e1a22e Merge branch 'tb/repack-cleanup'
The recent change to "git repack" made it react less nicely when a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.

* tb/repack-cleanup:
  builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
  builtin/repack.c: only repack `.pack`s that exist
2023-07-18 07:28:53 -07:00
Junio C Hamano 6016ee0a71 Merge branch 'tb/fsck-no-progress'
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.

* tb/fsck-no-progress:
  commit-graph.c: avoid duplicated progress output during `verify`
  commit-graph.c: pass progress to `verify_one_commit_graph()`
  commit-graph.c: iteratively verify commit-graph chains
  commit-graph.c: extract `verify_one_commit_graph()`
  fsck: suppress MIDX output with `--no-progress`
  fsck: suppress commit-graph output with `--no-progress`
2023-07-18 07:28:53 -07:00
Junio C Hamano ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
Junio C Hamano 0e074fb4e5 Merge branch 'rs/ls-tree-prefix-simplify'
Code simplification.

* rs/ls-tree-prefix-simplify:
  ls-tree: simplify prefix handling
2023-07-17 11:30:42 -07:00
Jeff King cc2f810172 tag: mark unused parameters in each_tag_name_fn callbacks
We iterate over the set of input tag names using callbacks. But not all
operations need the same inputs, so some parameters go unused (but of
course not the same ones for each operation). Mark the unused ones to
avoid -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King 1e6459efca rev-parse: mark unused parameter in for_each_abbrev callback
We don't need to use the "data" parameter in this instance. Let's mark
it to avoid -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King 4c7b06f208 replace: mark unused parameter in each_mergetag_fn callback
We don't look at the "commit" parameter to our callback, as our
"mergetag_data" pointer contains the original name "ref", which we use
instead. But we can't get rid of it, since other for_each_mergetag
callbacks do use it. Let's mark the parameter to avoid
-Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King 80d4e5f3a5 replace: mark unused parameter in ref callback
We don't look at the "flags" parameter, which is natural for something
that is just printing the contents of the replace refs. But let's mark
it to appease -Wunused-parameter.

This probably should have been part of 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but I missed it as this one is a
repo_each_ref_fn, which takes an extra repository argument.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King ee550abcce merge-tree: mark unused parameter in traverse callback
Our threeway_callback() does not bother to look at its "n" parameter. It
is static in this file and used only by trivial_merge_trees(), which
always passes 3 trees (hence the name "threeway"). It also does not look
at "dirmask". This is OK, as it handles directories specifically by
looking at the mode bits.

Other traverse_info callbacks need these, so we can't get drop them from
the interface. But let's annotate these ones to avoid complaints from
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King 0b4e9013f1 fsck: mark unused parameters in various fsck callbacks
There are a few callback functions which are used with the fsck code,
but it's natural that not all callbacks need all parameters. For
reporting, even something as obvious as "the oid of the object which had
a problem" is not always used, as some callers are only checking a
single object in the first place. And for both reporting and walking,
things like void data pointers and the fsck_options aren't always
necessary.

But since each such parameter is used by _some_ callback, we have to
keep them in the interface. Mark the unused ones in specific callbacks
to avoid triggering -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King cc88afad62 revisions: drop unused "opt" parameter in "tweak" callbacks
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.

But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.

We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King 506d35f13d count-objects: mark unused parameter in alternates callback
Callbacks to for_each_altodb() get a void data pointer, but we don't
need it here. Mark it as unused to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:24:00 -07:00
Jeff King a8a8e75e9e am: mark unused keep_cr parameters
When parsing the input, we have a "keep_cr" parameter to tell us how to
handle line endings. But this doesn't apply to stgit or hg patches
(which are not mailbox formats where we have to worry about that), so we
ignore the parameter entirely in those functions.

Let's mark these as unused so that -Wunused-parameter does not complain
about them.

Note that we could just drop these parameters entirely. They are
necessary to conform to the mail_conv_fn interface used by
split_mail_conv(), but these two callbacks are the only ones used with
that function. The other formats (which _do_ care about keep_cr) use
split_mail_mbox(). But it's conceivable that we'd eventually add another
format that does care about this option, so let's leave it as part of
the generic interface.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 17:23:59 -07:00
Alex Henrie c577d65158 push: don't imply that integration is always required before pushing
In a narrow but common case, the user is the only author of a branch and
doesn't mind overwriting the corresponding branch on the remote. This
workflow is especially common on GitHub, GitLab, and Gerrit, which keep
a permanent record of every version of a branch that is pushed while a
pull request is open for that branch. On those platforms, force-pushing
is encouraged and is analogous to emailing a new version of a patchset.

When giving advice about divergent branches, tell the user about
`git pull`, but don't unconditionally instruct the user to do it. A less
prescriptive message will help prevent users from thinking that they are
required to create an integrated history instead of simply replacing the
previous history. Also, don't put `git pull` in an awkward
parenthetical, because `git pull` can always be used to reconcile
branches and is the normal way to do so.

Due to the difficulty of knowing which command for force-pushing is best
suited to the user's situation, no specific advice is given about
force-pushing. Instead, the user is directed to the Git documentation to
read about possible ways forward that do not involve integration.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 09:14:58 -07:00
Alex Henrie b6f3da5132 wt-status: don't show divergence advice when committing
When the user is in the middle of making a commit, they are not yet at
the point where they are ready to think about integrating their local
branch with the corresponding remote branch or force-pushing over the
remote branch. Don't include advice on how to deal with divergent
branches in the commit template, to avoid giving the impression that the
divergence needs to be dealt with immediately. Similar advice will be
printed when it is most relevant, that is, if the user does try to push
without first reconciling the two branches.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-13 09:14:58 -07:00
Taylor Blau def390d593 builtin/repack.c: avoid dir traversal in collect_pack_filenames()
When repacking, the function `collect_pack_filenames()` is responsible
for collecting the set of existing packs in the repository, and
partitioning them into "kept" (if the pack has a ".keep" file or was
given via `--keep-pack`) and "nonkept" (otherwise) lists.

This function comes from the original C port of git-repack.sh from back
in a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15),
where it first appears as `get_non_kept_pack_filenames()`. At the time,
the implementation was a fairly direct translation from the relevant
portion of git-repack.sh, which looped over the results of

    find "$PACKDIR" -type f -name '*.pack'

either ignoring the pack as kept, or adding it to the list of existing
packs.

So the choice to directly translate this function in terms of
`readdir()` in a1bbc6c017 made sense. At the time, it was possible to
refine the C version in terms of packed_git structs, but was never done.

However, manually enumerating a repository's packs via `readdir()` is
confusing and error-prone. It leads to frustrating inconsistencies
between which packs Git considers to be part of a repository (i.e.,
could be found in the list of packs from `get_all_packs()`), and which
packs `collect_pack_filenames()` considers to meet the same criteria.

This bit us in 73320e49ad (builtin/repack.c: only collect fully-formed
packs, 2023-06-07), and again in the previous commit.

Prevent these issues from biting us in the future by implementing the
`collect_pack_filenames()` function by looping over an array of pointers
to `packed_git` structs, ensuring that we use the same criteria to
determine the set of available packs.

One gotcha here is that we have to ignore non-local packs, since the
original version of `collect_pack_filenames()` only looks at the local
pack directory to collect existing packs.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-11 13:07:51 -07:00
Derrick Stolee 0af067276e builtin/repack.c: only repack .packs that exist
In 73320e49ad (builtin/repack.c: only collect fully-formed packs,
2023-06-07), we switched the check for which packs to collect by
starting at the .idx files and looking for matching .pack files. This
avoids trying to repack pack-files that have not had their pack-indexes
installed yet.

However, it does cause maintenance to halt if we find the (problematic,
but not insurmountable) case of a .idx file without a corresponding
.pack file. In an environment where packfile maintenance is a critical
function, such a hard stop is costly and requires human intervention to
resolve (by deleting the .idx file).

This was not the case before. We successfully repacked through this
scenario until the recent change to scan for .idx files.

Further, if we are actually in a case where objects are missing, we
detect this at a different point during the reachability walk.

In other cases, Git prepares its list of packfiles by scanning .idx
files and then only adds it to the packfile list if the corresponding
.pack file exists. It even does so without a warning! (See
add_packed_git() in packfile.c for details.)

This case is much less likely to occur than the failures seen before
73320e49ad. Packfiles are "installed" by writing the .pack file before
the .idx and that process can be interrupted. Packfiles _should_ be
deleted by deleting the .idx first, followed by the .pack file, but
unlink_pack_path() does not do this: it deletes the .pack _first_,
allowing a window where this process could be interrupted. We leave the
consideration of changing this order as a separate concern. Knowing that
this condition is possible from interrupted Git processes and not other
tools lends some weight that Git should be more flexible around this
scenario.

Add a check to see if the .pack file exists before adding it to the list
for repacking. This will stop a number of maintenance failures seen in
production but fixed by deleting the .idx files.

This brings us closer to the case before 73320e49ad in that 'git
repack' will not fail when there is an orphaned .idx file, at least, not
due to the way we scan for packfiles. In the case that the .pack file
was erroneously deleted without copies of its objects in other installed
packfiles, then 'git repack' will fail due to the reachable object walk.

This does resolve the case where automated repacks will no longer be
halted on this case. The tests in t7700 show both these successful
scenarios and the case of failing if the .pack was truly required.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-11 13:07:50 -07:00
Taylor Blau cc2a1f98ac builtin/receive-pack.c: avoid enumerating hidden references
Now that `refs_for_each_fullref_in()` has the ability to avoid
enumerating references matching certain pattern(s), use that to avoid
visiting hidden refs when constructing the ref advertisement via
receive-pack.

Note that since this exclusion is best-effort, we still need
`show_ref_cb()` to check whether or not each reference is hidden or not
before including it in the advertisement.

As was the case when applying this same optimization to `upload-pack`,
`receive-pack`'s reference advertisement phase can proceed much quicker
by avoiding enumerating references that will not be part of the
advertisement.

(Below, we're still using linux.git with one hidden refs/pull/N ref per
commit):

    $ hyperfine -L v ,.compile 'git{v} -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
    Benchmark 1: git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
      Time (mean ± σ):      89.1 ms ±   1.7 ms    [User: 82.0 ms, System: 7.0 ms]
      Range (min … max):    87.7 ms …  95.5 ms    31 runs

    Benchmark 2: git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
      Time (mean ± σ):       4.5 ms ±   0.2 ms    [User: 0.5 ms, System: 3.9 ms]
      Range (min … max):     4.1 ms …   5.6 ms    508 runs

    Summary
      'git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git' ran
       20.00 ± 1.05 times faster than 'git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau c45841fff8 revision.h: store hidden refs in a strvec
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.

Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.

Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:

    ref = xstrdup(value);
    len = strlen(ref);
    while (len && ref[len - 1] == '/')
            ref[--len] = '\0';
    strvec_push(hide_refs, ref);
    free(ref);

but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:56 -07:00
Taylor Blau 8255dd8a3d builtin/for-each-ref.c: add --exclude option
When using `for-each-ref`, it is sometimes convenient for the caller to
be able to exclude certain parts of the references.

For example, if there are many `refs/__hidden__/*` references, the
caller may want to emit all references *except* the hidden ones.
Currently, the only way to do this is to post-process the output, like:

    $ git for-each-ref --format='%(refname)' | grep -v '^refs/hidden/'

Which is do-able, but requires processing a potentially large quantity
of references.

Teach `git for-each-ref` a new `--exclude=<pattern>` option, which
excludes references from the results if they match one or more excluded
patterns.

This patch provides a naive implementation where the `ref_filter` still
sees all references (including ones that it will discard) and is left to
check whether each reference matches any excluded pattern(s) before
emitting them.

By culling out references we know the caller doesn't care about, we can
avoid allocating memory for their storage, as well as spending time
sorting the output (among other things). Even the naive implementation
provides a significant speed-up on a modified copy of linux.git (that
has a hidden ref pointing at each commit):

    $ hyperfine \
      'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"' \
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/'
    Benchmark 1: git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"
      Time (mean ± σ):     820.1 ms ±   2.0 ms    [User: 703.7 ms, System: 152.0 ms]
      Range (min … max):   817.7 ms … 823.3 ms    10 runs

    Benchmark 2: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/
      Time (mean ± σ):     106.6 ms ±   1.1 ms    [User: 99.4 ms, System: 7.1 ms]
      Range (min … max):   104.7 ms … 109.1 ms    27 runs

    Summary
      'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/' ran
        7.69 ± 0.08 times faster than 'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"'

Subsequent patches will improve on this by avoiding visiting excluded
sections of the `packed-refs` file in certain cases.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Jeff King b571fb9800 ref-filter: add ref_filter_clear()
We did not bother to clean up at all in `git branch` or `git tag`, and
`git for-each-ref` only cleans up a couple of members.

Add and call `ref_filter_clear()` when cleaning up a `struct
ref_filter`. Running this patch (without any test changes) indicates a
couple of now leak-free tests. This was found by running:

    $ make SANITIZE=leak
    $ make -C t GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_OPTS=--immediate

(Note that the `reachable_from` and `unreachable_from` lists should be
cleaned as they are used. So this is just covering any case where we
might bail before running the reachability check.)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Jeff King b9f7daa6ef ref-filter.h: provide REF_FILTER_INIT
Provide a sane initialization value for `struct ref_filter`, which in a
subsequent patch will be used to initialize a new field.

In the meantime, ensure that the `ref_filter` struct used in the
test-helper's `cmd__reach()` is zero-initialized. The lack of
initialization is OK, since `commit_contains()` only looks at the single
`with_commit_tag_algo` field that *is* initialized directly above.

So this does not fix a bug, but rather prevents one from biting us in
the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 14:48:55 -07:00
Taylor Blau 39bdd30377 fsck: suppress MIDX output with --no-progress
In a similar spirit as the previous commit, address a bug where `git
fsck` produces output when calling `git multi-pack-index verify` even
when invoked with `--no-progress`.

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying OID order in multi-pack-index: 100% (605677/605677), done.
    Sorting objects by packfile: 100% (605678/605678), done.
    Verifying object offsets: 100% (605678/605678), done.

The three lines produced by `git fsck` come from `git multi-pack-index
verify`, but should be squelched due to `--no-progress`.

The MIDX machinery learned to generate these progress messages as early
as 430efb8a74 (midx: add progress indicators in multi-pack-index
verify, 2019-03-21), but did not respect `--progress` or `--no-progress`
until ad60096d1c (midx: honor the MIDX_PROGRESS flag in
verify_midx_file, 2019-10-21).

But the `git multi-pack-index verify` step was added to fsck in
66ec0390e7 (fsck: verify multi-pack-index, 2018-09-13), pre-dating any
of the above patches.

Pass `--[no-]progress` as appropriate to ensure that we don't produce
output when told not to.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:40 -07:00
Taylor Blau eda206f611 fsck: suppress commit-graph output with --no-progress
Since e0fd51e1d7 (fsck: verify commit-graph, 2018-06-27), `fsck` runs
`git commit-graph verify` to check the integrity of any commit-graph(s).

Originally, the `git commit-graph verify` step would always print to
stdout/stderr, regardless of whether or not `fsck` was invoked with
`--[no-]progress` or not. But in 7371612255 (commit-graph: add
--[no-]progress to write and verify, 2019-08-26), the commit-graph
machinery learned the `--[no-]progress` option, though `fsck` was not
updated to pass this new flag (or not).

This led to seeing output from running `git fsck`, even with
`--no-progress` on repositories that have a commit-graph:

    $ git.compile fsck --connectivity-only --no-progress --no-dangling
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

Ensure that `fsck` passes `--[no-]progress` as appropriate when calling
`git commit-graph verify`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:37 -07:00
Junio C Hamano b00ec259e7 Merge branch 'jk/fsck-indices-in-worktrees'
Code clarification.

* jk/fsck-indices-in-worktrees:
  fsck: avoid misleading variable name
2023-07-08 11:23:08 -07:00
Junio C Hamano 7f5ad0ca8d Merge branch 'js/empty-index-fixes'
A few places failed to differenciate the case where the index is
truly empty (nothing added) and we haven't yet read from the
on-disk index file, which have been corrected.

* js/empty-index-fixes:
  commit -a -m: allow the top-level tree to become empty again
  split-index: accept that a base index can be empty
  do_read_index(): always mark index as initialized unless erroring out
2023-07-08 11:23:07 -07:00
Junio C Hamano 0ad927e9e0 tree-walk: lose base_offset that is never used in tree_entry_interesting
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.

The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:27:28 -07:00
René Scharfe 7b7203e78a ls-tree: simplify prefix handling
git ls-tree has two prefixes: The one handed to cmd_ls_tree(), i.e. the
current subdirectory in the repository (if any) and the "display" prefix
used by the show_tree_*() functions.  The option --full-name clears the
last one, i.e. it shows full paths, and --full-tree clears both, i.e. it
acts as if the command was started in the root of the repository.

The show_tree_*() functions use the ls_tree_options members chomp_prefix
and ls_tree_prefix to determine their prefix values.  Calculate it once
in cmd_ls_tree() instead, once the main prefix value is finalized.

This allows chomp_prefix to become a local variable.  Stop using
strlen(3) to determine its initial value -- we only care whether we got
a non-empty string, not precisely how long it is.

Rename ls_tree_prefix to prefix to demonstrate that we converted all
users and because the ls_tree_ part is no longer necessary since
030a3d5d9e (ls-tree: use a "struct options", 2023-01-12) turned it from
a global variable to a struct member.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 11:57:13 -07:00
Junio C Hamano b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
Junio C Hamano a9cc3b8fc7 Merge branch 'tl/notes-separator'
'git notes append' was taught '--separator' to specify string to insert
between paragraphs.

* tl/notes-separator:
  notes: introduce "--no-separator" option
  notes.c: introduce "--[no-]stripspace" option
  notes.c: append separator instead of insert by pos
  notes.c: introduce '--separator=<paragraph-break>' option
  t3321: add test cases about the notes stripspace behavior
  notes.c: use designated initializers for clarity
  notes.c: cleanup 'strbuf_grow' call in 'append_edit'
2023-07-06 11:54:47 -07:00
Junio C Hamano 67e7305e64 Merge branch 'cw/strbuf-cleanup'
Move functions that are not about pure string manipulation out of
strbuf.[ch]

* cw/strbuf-cleanup:
  strbuf: remove global variable
  path: move related function to path
  object-name: move related functions to object-name
  credential-store: move related functions to credential-store file
  abspath: move related functions to abspath
  strbuf: clarify dependency
  strbuf: clarify API boundary
2023-07-06 11:54:46 -07:00
Junio C Hamano da269af920 Merge branch 'rs/strbuf-expand-step'
Code clean-up around strbuf_expand() API.

* rs/strbuf-expand-step:
  strbuf: simplify strbuf_expand_literal_cb()
  replace strbuf_expand() with strbuf_expand_step()
  replace strbuf_expand_dict_cb() with strbuf_expand_step()
  strbuf: factor out strbuf_expand_step()
  pretty: factor out expand_separator()
2023-07-06 11:54:45 -07:00
Calvin Wan 91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Calvin Wan da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Calvin Wan fda5d9595d git-compat-util: move strbuf.c funcs to its header
While functions like starts_with() probably should not belong in the
boundaries of the strbuf library, this commit focuses on first splitting
out headers from git-compat-util.h.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:18 -07:00
Junio C Hamano 89d62d5e8e Merge branch 'bc/more-git-var'
Add more "git var" for toolsmiths to learn various locations Git is
configured with either via the configuration or hardcoded defaults.

* bc/more-git-var:
  var: add config file locations
  var: add attributes files locations
  attr: expose and rename accessor functions
  var: adjust memory allocation for strings
  var: format variable structure with C99 initializers
  var: add support for listing the shell
  t: add a function to check executable bit
  var: mark unused parameters in git_var callbacks
2023-07-04 16:08:18 -07:00
Junio C Hamano a1264a08a1 Merge branch 'en/header-split-cache-h-part-3'
Header files cleanup.

* en/header-split-cache-h-part-3: (28 commits)
  fsmonitor-ll.h: split this header out of fsmonitor.h
  hash-ll, hashmap: move oidhash() to hash-ll
  object-store-ll.h: split this header out of object-store.h
  khash: name the structs that khash declares
  merge-ll: rename from ll-merge
  git-compat-util.h: remove unneccessary include of wildmatch.h
  builtin.h: remove unneccessary includes
  list-objects-filter-options.h: remove unneccessary include
  diff.h: remove unnecessary include of oidset.h
  repository: remove unnecessary include of path.h
  log-tree: replace include of revision.h with simple forward declaration
  cache.h: remove this no-longer-used header
  read-cache*.h: move declarations for read-cache.c functions from cache.h
  repository.h: move declaration of the_index from cache.h
  merge.h: move declarations for merge.c from cache.h
  diff.h: move declaration for global in diff.c from cache.h
  preload-index.h: move declarations for preload-index.c from elsewhere
  sparse-index.h: move declarations for sparse-index.c from cache.h
  name-hash.h: move declarations for name-hash.c from cache.h
  run-command.h: move declarations for run-command.c from cache.h
  ...
2023-06-29 16:43:21 -07:00
Eric Sunshine 6e6a529b57 fsck: avoid misleading variable name
When reporting a problem, `git fsck` emits a message such as:

    missing blob 1234abcd (:file)

However, this can be ambiguous when the problem is detected in the index
of a worktree other than the one in which `git fsck` was invoked. To
address this shortcoming, 592ec63b38 (fsck: mention file path for index
errors, 2023-02-24) enhanced the output to mention the path of the index
when the problem is detected in some other worktree:

    missing blob 1234abcd (.git/worktrees/wt/index:file)

Unfortunately, the variable in fsck_index() which controls whether the
index path should be shown is misleadingly named "is_main_index" which
can be misunderstood as referring to the main worktree (i.e. the one
housing the .git/ repository) rather than to the current worktree (i.e.
the one in which `git fsck` was invoked). Avoid such potential confusion
by choosing a name more reflective of its actual purpose.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 13:58:57 -07:00
Johannes Schindelin 2ee045eea1 commit -a -m: allow the top-level tree to become empty again
In 03267e8656 (commit: discard partial cache before (re-)reading it,
2022-11-08), a memory leak was plugged by discarding any partial index
before re-reading it.

The problem with this memory leak fix is that it was based on an
incomplete understanding of the logic introduced in 7168624c35 (Do not
generate full commit log message if it is not going to be used,
2007-11-28).

That logic was introduced to add a shortcut when committing without
editing the commit message interactively. A part of that logic was to
ensure that the index was read into memory:

	if (!active_nr && read_cache() < 0)
		die(...)

Translation to English: If the index has not yet been read, read it, and
if that fails, error out.

That logic was incorrect, though: It used `!active_nr` as an indicator
that the index was not yet read. Usually this is not a problem because
in the vast majority of instances, the index contains at least one
entry.

And it was natural to do it this way because at the time that condition
was introduced, the `index_state` structure had no explicit flag to
indicate that it was initialized: This flag was only introduced in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), but that commit did not adjust the code path
where no index file was found and a new, pristine index was initialized.

Now, when the index does not contain any entry (which is quite
common in Git's test suite because it starts quite a many repositories
from scratch), subsequent calls to `do_read_index()` will mistake the
index not to be initialized, and read it again unnecessarily.

This is a problem because after initializing the empty index e.g. the
`cache_tree` in that index could have been initialized before a
subsequent call to `do_read_index()` wants to ensure an initialized
index. And if that subsequent call mistakes the index not to have been
initialized, it would lead to leaked memory.

The correct fix for that memory leak is to adjust the condition so that
it does not mistake `active_nr == 0` to mean that the index has not yet
been read.

Using the `initialized` flag instead, we avoid that mistake, and as a
bonus we can fix a bug at the same time that was introduced by the
memory leak fix: When deleting all tracked files and then asking `git
commit -a -m ...` to commit the result, Git would internally update the
index, then discard and re-read the index undoing the update, and fail
to commit anything.

This fixes https://github.com/git-for-windows/git/issues/4462

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 12:20:04 -07:00
Glen Choo 8868b1ebfb config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.

In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.

Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.

The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
Glen Choo 26b669324b config.c: pass ctx with CLI config
Pass config_context when parsing CLI config. To provide the .kvi member,
refactor out kvi_from_param() from the logic that caches CLI config in
configsets. Now that config_context and config_context.kvi is always
present when config machinery calls config callbacks, plumb "kvi" so
that we can remove all calls of current_config_scope() except for
trace2/*.c (which will be handled in a later commit), and remove all
other current_config_*() (the functions themselves and their calls).
Note that this results in .kvi containing a different, more complete
set of information than the mocked up "struct config_source" in
git_config_from_parameters().

Plumbing "kvi" reveals a few places where we've been doing the wrong
thing:

* git_config_parse_parameter() hasn't been setting config source
  information, so plumb "kvi" there too.

* Several sites in builtin/config.c have been calling current_config_*()
  functions outside of config callbacks (indirectly, via the
  format_config() helper), which means they're reading state that isn't
  set correctly:

  * "git config --get-urlmatch --show-scope" iterates config to collect
    values, but then attempts to display the scope after config
    iteration, causing the "unknown" scope to be shown instead of the
    config file's scope. It's clear that this wasn't intended: we knew
    that "--get-urlmatch" couldn't show config source metadata, which is
    why "--show-origin" was marked incompatible with "--get-urlmatch"
    when it was introduced [1]. It was most likely a mistake that we
    allowed "--show-scope" to sneak through.

    Fix this by copying the "kvi" value in the collection phase so that
    it can be read back later. This means that we can now support "git
    config --get-urlmatch --show-origin", but that is left unchanged
    for now.

  * "git config --default" doesn't have config source metadata when
    displaying the default value, so "--show-scope" also results in
    "unknown", and "--show-origin" results in a BUG(). Fix this by
    treating the default value as if it came from the command line (e.g.
    like we do with "git -c" or "git config --file"), using
    kvi_from_param().

[1] https://lore.kernel.org/git/20160205112001.GA13397@sigill.intra.peff.net/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Glen Choo 6021e1d158 config.c: pass ctx in configsets
Pass config_context to config callbacks in configset_iter(), trivially
setting the .kvi member to the cached key_value_info. Then, in config
callbacks that are only used with configsets, use the .kvi member to
replace calls to current_config_*(), and delete current_config_line()
because it has no remaining callers.

This leaves builtin/config.c and config.c as the only remaining users of
current_config_*().

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Glen Choo a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Glen Choo 97eeeea2dc config: inline git_color_default_config
git_color_default_config() is a shorthand for calling two other config
callbacks. There are no other non-static functions that do this and it
will complicate our refactoring of config_fn_t so inline it instead.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:38 -07:00
brian m. carlson ed773a18c6 var: add config file locations
Much like with attributes files, sometimes programs would like to know
the location of configuration files at the global or system levels.
However, it isn't always clear where these may live, especially for the
system file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

Since other parties cannot intuitively know how Git was compiled and
where it looks for these files, help them by providing variables that
can be queried.  Because we have multiple paths for global config
values, print them in order from highest to lowest priority, and be sure
to split on newlines so that "git var -l" produces two entries for the
global value.

However, be careful not to split all values on newlines, since our
editor values could well contain such characters, and we don't want to
split them in such a case.

Note in the documentation that some values may contain multiple paths
and that callers should be prepared for that fact.  This helps people
write code that will continue to work in the event we allow multiple
items elsewhere in the future.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson 576a37fccb var: add attributes files locations
Currently, there are some programs which would like to read and parse
the gitattributes files at the global or system levels.  However, it's
not always obvious where these files live, especially for the system
file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.

It's not reasonable to expect all callers of Git to intuitively know
where the Git distributor or user has configured these locations to
be, so add some entries to allow us to determine their location.  Honor
the GIT_ATTR_NOSYSTEM environment variable if one is specified.  Expose
the accessor functions in a way that we can reuse them from within the
var code.

In order to make our paths consistent on Windows and also use the same
form as paths use in "git rev-parse", let's normalize the path before we
return it.  This results in Windows-style paths that use slashes, which
is convenient for making our tests function in a consistent way across
platforms.  Note that this requires that some of our values be freed, so
let's add a flag about whether the value needs to be freed and use it
accordingly.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson cdd489eaf9 var: adjust memory allocation for strings
Right now, all of our values are constants whose allocation is managed
elsewhere.  However, in the future, we'll have some variables whose
memory we will need to free.  To keep things consistent, let's make each
of our functions allocate its own memory and make the caller responsible
for freeing it.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson f74c90dcf7 var: format variable structure with C99 initializers
Right now, we have only two items in our variable struct.  However, in
the future, we're going to add two more items.  To help keep our diffs
nice and tidy and make this structure easier to read, switch to use
C99-style initializers for our data.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:06 -07:00
brian m. carlson 1e65721227 var: add support for listing the shell
On most Unix systems, finding a suitable shell is easy: one simply uses
"sh" with an appropriate PATH value.  However, in many Windows
environments, the shell is shipped alongside Git, and it may or may not
be in PATH, even if Git is.

In such an environment, it can be very helpful to query Git for the
shell it's using, since other tools may want to use the same shell as
well.  To help them out, let's add a variable, GIT_SHELL_PATH, that
points to the location of the shell.

On Unix, we know our shell must be executable to be functional, so
assume that the distributor has correctly configured their environment,
and use that as a basic test.  On Git for Windows, we know that our
shell will be one of a few fixed values, all of which end in "sh" (such
as "bash").  This seems like it might be a nice test on Unix as well,
since it is customary for all shells to end in "sh", but there probably
exist such systems that don't have such a configuration, so be careful
here not to break them.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
Jeff King 4db16f58c7 var: mark unused parameters in git_var callbacks
We abstract the set of variables into a table, with a "read" callback to
provide the value of each. Each callback takes a "flag" argument, but
most callbacks don't make use of it.

This flag is a bit odd. It may be set to IDENT_STRICT, which make sense
for ident-based callbacks, but is just confusing for things like
GIT_EDITOR.

At first glance, it seems like this is just a hack to let us directly
stick the generic git_committer_info() and git_author_info() functions
into our table. And we'd be better off to wrap them with local functions
which pass IDENT_STRICT, and have our callbacks take no option at all.

But that doesn't quite work. We pass IDENT_STRICT when the caller asks
for a specific variable, but otherwise do not (so that "git var -l" does
not bail if the committer ident cannot be formed).

So we really do need to pass in the flag to each invocation, even if the
individual callback doesn't care about it. Let's mark the unused ones so
that -Wunused-parameter does not complain. And while we're here, let's
rename them so that it's clear that the flag values we get will be from
the IDENT_* set. That may prevent confusion for future readers of the
code.

Another option would be to define our own local "strict" flag for the
callbacks, and then have wrappers that translate that to IDENT_STRICT
where it matters. But that would be more boilerplate for little gain
(most functions would still ignore the "strict" flag anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-27 11:31:05 -07:00
Junio C Hamano e224f26892 Merge branch 'tb/collect-pack-filenames-fix'
Avoid breakage of "git pack-objects --cruft" due to inconsistency
between the way the code enumerates packfiles in the repository.

* tb/collect-pack-filenames-fix:
  builtin/repack.c: only collect fully-formed packs
2023-06-26 09:29:50 -07:00
Junio C Hamano 8d5c5a05d7 Merge branch 'jk/commit-use-no-divider-with-interpret-trailers'
When "git commit --trailer=..." invokes the interpret-trailers
machinery, it knows what it feeds to interpret-trailers is a full
log message without any patch, but failed to express that by
passing the "--no-divider" option, which has been corrected.

* jk/commit-use-no-divider-with-interpret-trailers:
  commit: pass --no-divider to interpret-trailers
2023-06-26 09:29:49 -07:00
Junio C Hamano 4e4fc50cf7 Merge branch 'rj/leakfixes'
Leakfixes

* rj/leakfixes:
  tests: mark as passing with SANITIZE=leak
  config: fix a leak in git_config_copy_or_rename_section_in_file
  branch: fix a leak in cmd_branch
  branch: fix a leak in setup_tracking
  rev-parse: fix a leak with --abbrev-ref
  branch: fix a leak in setup_tracking
  branch: fix a leak in check_tracking_branch
  branch: fix a leak in inherit_tracking
  branch: fix a leak in dwim_and_setup_tracking
  remote: fix a leak in query_matches_negative_refspec
  config: fix a leak in git_config_copy_or_rename_section_in_file
2023-06-23 11:21:17 -07:00
Junio C Hamano a813d9e239 Merge branch 'sl/worktree-sparse'
"git worktree" learned to work better with sparse index feature.

* sl/worktree-sparse:
  worktree: integrate with sparse-index
2023-06-23 11:21:16 -07:00
Junio C Hamano 5ee8fcdabc Merge branch 'mh/credential-erase-improvements'
* mh/credential-erase-improvements:
  credential: erase all matching credentials
  credential: avoid erasing distinct password
2023-06-23 11:21:16 -07:00
Junio C Hamano 644591bd06 Merge branch 'ds/add-i-color-configuration-fix'
The reimplemented "git add -i" did not honor color.ui configuration.

* ds/add-i-color-configuration-fix:
  add: test use of brackets when color is disabled
  add: check color.ui for interactive add
2023-06-22 16:29:06 -07:00
Junio C Hamano a9ea4c23dc Merge branch 'ps/cat-file-null-output'
"git cat-file --batch" and friends learned "-Z" that uses NUL
delimiter for both input and output.

* ps/cat-file-null-output:
  cat-file: add option '-Z' that delimits input and output with NUL
  cat-file: simplify reading from standard input
  strbuf: provide CRLF-aware helper to read until a specified delimiter
  t1006: modernize test style to use `test_cmp`
  t1006: don't strip timestamps from expected results
2023-06-22 16:29:06 -07:00
Junio C Hamano d9f9f6b358 Merge branch 'ds/disable-replace-refs'
Introduce a mechanism to disable replace refs globally and per
repository.

* ds/disable-replace-refs:
  repository: create read_replace_refs setting
  replace-objects: create wrapper around setting
  repository: create disable_replace_refs()
2023-06-22 16:29:06 -07:00
Junio C Hamano 4dd0469328 Merge branch 'ja/worktree-orphan'
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.

* ja/worktree-orphan:
  worktree add: emit warn when there is a bad HEAD
  worktree add: extend DWIM to infer --orphan
  worktree add: introduce "try --orphan" hint
  worktree add: add --orphan flag
  t2400: add tests to verify --quiet
  t2400: refactor "worktree add" opt exclusion tests
  t2400: cleanup created worktree in test
  worktree add: include -B in usage docs
2023-06-22 16:29:05 -07:00
Elijah Newren 68d686460f fsmonitor-ll.h: split this header out of fsmonitor.h
This creates a new fsmonitor-ll.h with most of the functions from
fsmonitor.h, though it leaves three inline functions where they were.
Two-thirds of the files that previously included fsmonitor.h did not
need those three inline functions or the six extra includes those inline
functions required, so this allows them to only include the lower level
header.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren 6723899932 merge-ll: rename from ll-merge
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch].  I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.

Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix.  Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.

Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
  ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group.  But giving them the common "ll-" prefix would group them.  Using
"-ll" as a suffix thus seems just much more logical to me.  Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.

[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren dd77d58795 git-compat-util.h: remove unneccessary include of wildmatch.h
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch().  The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.

Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren 88e4e18325 builtin.h: remove unneccessary includes
This also made it clear that a few .c files under builtin/ were
depending upon some headers but had forgotten to #include them.  Add the
missing direct includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren df6e874496 diff.h: remove unnecessary include of oidset.h
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers.  Add those now.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 0fd2e21571 log-tree: replace include of revision.h with simple forward declaration
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren bc5c5ec044 cache.h: remove this no-longer-used header
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.

Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first).  This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 08c46a499a read-cache*.h: move declarations for read-cache.c functions from cache.h
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h.  Also move some related inline
functions from cache.h to read-cache.h.  The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 750324ddb8 merge.h: move declarations for merge.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren fbffdfb11c preload-index.h: move declarations for preload-index.c from elsewhere
We already have a preload-index.c file; move the declarations for the
functions in that file into a new preload-index.h.  These were
previously split between cache.h and repository.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren baf889c2cd sparse-index.h: move declarations for sparse-index.c from cache.h
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren f5653856c2 name-hash.h: move declarations for name-hash.c from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 6cee5ebc7a read-cache: move shared add/checkout/commit code
The function add_files_to_cache(), plus associated helper functions,
were defined in builtin/add.c, but also shared with builtin/checkout.c
and builtin/commit.c.  Move these shared functions to read-cache.c.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 50c37ee839 add: modify add_files_to_cache() to avoid globals
The function add_files_to_cache() is used by all three of builtin/{add,
checkout, commit}.c.  That suggests this is common library code, and
should be moved somewhere else, like read-cache.c.  However, the
function and its helpers made use of two global variables that made
straight code movement difficult:
  * the_index
  * include_sparse
The latter was perhaps more problematic since it was only accessible in
builtin/add.c but was still affecting builtin/checkout.c and
builtin/commit.c without this fact being very clear from the code.  I'm
not sure if the other two callers would want to add a `--sparse` flag
similar to add.c to get non-default behavior, but exposing this
dependence will help if we ever decide we do want to add such a flag.

Modify add_files_to_cache() and its helpers to accept the necessary
arguments instead of relying on globals.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren 1a40e7be6c read-cache: move shared commit and ls-files code
The function overlay_tree_on_index(), plus associated helper functions,
were defined in builtin/ls-files.c, but also shared with
builtin/commit.c.  Move these shared functions to read-cache.c.

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren e8cf8ef507 setup: adopt shared init-db & clone code
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.

Move these functions, plus their several helpers only used by these
functions, to setup.[ch].

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren fc81735057 init-db, clone: change unnecessary global into passed parameter
Much like the parent commit, this commit was prompted by a desire to
move the functions which builtin/init-db.c and builtin/clone.c share out
of the former file and into setup.c.  A secondary issue that made it
difficult was the init_shared_repository global variable; replace it
with a simple parameter that is passed to the relevant functions.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:34 -07:00
Elijah Newren c2f76965d0 init-db: remove unnecessary global variable
This commit was prompted by a desire to move the functions which
builtin/init-db.c and builtin/clone.c share out of the former file and
into setup.c.  One issue that made it difficult was the
init_is_bare_repository global variable.

init_is_bare_repository's sole use in life it to cache a value in
init_db(), and then be used in create_default_files().  This is a bit
odd since init_db() directly calls create_default_files(), and is the
only caller of that function.  Convert the global to a simple function
parameter instead.

(Of course, this doesn't fix the fact that this value is then ignored by
create_default_files(), as noted in a big TODO comment in that function,
but it at least includes no behavioral change other than getting rid of
a very questionable global variable.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:34 -07:00
Elijah Newren 0f7443bdc7 init-db: document existing bug with core.bare in template config
The comments in create_default_files() talks about reading config from
the config file in the specified `--templates` directory, which leads to
the question of whether core.bare could be set in such a config file and
thus whether the code is doing the right thing.  It turns out, that it
doesn't; it unconditionally ignores core.bare in the config file in any
--templates directory.  It is not clear to me that fixing it can be done
within this function; it seems to occur too late:
  * create_default_files() is called by init_db()
  * init_db() is called by both builtin/{clone.c,init-db.c}
  * both callers of init_db() call set_git_work_tree() before init_db()
and in order to actual affect whether a repository is bear, we'd need to
somewhere reset these values, not just the is_bare_repository_cfg
setting.

I do not want to open this can of worms at this time; I'm trying to
clean up some headers, for which I need to move some functions, for
which I need to clean up some globals, and that's far enough down the
rabbit hole.  So, simply document the issue with a careful TODO comment
and a few testcases.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 11:14:33 -07:00
Teng Long 3d6a316464 notes: introduce "--no-separator" option
Sometimes, the user may want to add or append multiple notes
without any separator to be added between them.

Disscussion:

  https://public-inbox.org/git/3f86a553-246a-4626-b1bd-bacd8148318a@app.fastmail.com/

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:01 -07:00
Teng Long c4e2aa7d45 notes.c: introduce "--[no-]stripspace" option
This commit introduces a new option "--[no-]stripspace" to git notes
append, git notes edit, and git notes add. This option allows users to
control whether the note message need to stripped out.

For the consideration of backward compatibility, let's look at the
behavior about "stripspace" in "git notes" command:

1. "Edit Message" case: using the default editor to edit the note
message.

    In "edit" case, the edited message will always be stripped out, the
    implementation which can be found in the "prepare_note_data()". In
    addition, the "-c" option supports to reuse an existing blob as a
    note message, then open the editor to make a further edition on it,
    the edited message will be stripped.

    This commit doesn't change the default behavior of "edit" case by
    using an enum "notes_stripspace", only when "--no-stripspace" option
    is specified, the note message will not be stripped out. If you do
    not specify the option or you specify "--stripspace", clearly, the
    note message will be stripped out.

2. "Assign Message" case: using the "-m"/"-F"/"-C" option to specify the
note message.

    In "assign" case, when specify message by "-m" or "-F", the message
    will be stripped out by default, but when specify message by "-C",
    the message will be copied verbatim, in other word, the message will
    not be stripped out. One more thing need to note is "the order of
    the options matter", that is, if you specify "-C" before "-m" or
    "-F", the reused message by "-C" will be stripped out together,
    because everytime concat "-m" or "-F" message, the concated message
    will be stripped together. Oppositely, if you specify "-m" or "-F"
    before "-C", the reused message by "-C" will not be stripped out.

    This commit doesn't change the default behavior of "assign" case by
    extending the "stripspace" field in "struct note_msg", so we can
    distinguish the different behavior of "-m"/"-F" and "-C" options
    when we need to parse and concat the message.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:01 -07:00
Teng Long b7d87ad537 notes.c: append separator instead of insert by pos
Rename "insert_separator" to "append_separator" and also remove the
"postion" argument, this serves two purpose:

The first is that when specifying more than one "-m" ( like "-F", etc)
to "git notes add" or "git notes append", the order of them matters,
which means we need to append the each separator and message in turn,
so we don't have to make the caller specify the position, the "append"
operation is enough and clear.

The second is that when we execute the "git notes append" subcommand,
we need to combine the "prev_note" and "current_note" to get the
final result. Before, we inserted a newline character at the beginning
of "current_note". Now, we will append a newline to the end of
"prev_note" instead, this will give the consisitent results.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
Teng Long 90bc19b3ae notes.c: introduce '--separator=<paragraph-break>' option
When adding new notes or appending to an existing notes, we will
insert a blank line between the paragraphs, like:

     $ git notes add -m foo -m bar
     $ git notes show HEAD
     foo

     bar

The default behavour sometimes is not enough, the user may want
to use a custom delimiter between paragraphs, like when
specifying '-m', '-F', '-C', '-c' options. So this commit
introduce a new '--separator' option for 'git notes add' and
'git notes append', for example when executing:

    $ git notes add -m foo -m bar --separator="-"
    $ git notes show HEAD
    foo
    -
    bar

a newline is added to the value given to --separator if it
does not end with one already. So when executing:

      $ git notes add -m foo -m bar --separator="-"
and
      $ export LF="
      "
      $ git notes add -m foo -m bar --separator="-$LF"

Both the two exections produce the same result.

The reason we use a "strbuf" array to concat but not "string_list", is
that the binary file content may contain '\0' in the middle, this will
cause the corrupt result if using a string to save.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
Teng Long 3d27ae0712 notes.c: use designated initializers for clarity
The "struct note_data d = { 0, 0, NULL, STRBUF_INIT };" style could be
replaced with designated initializer for clarity.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
Teng Long ef48fcc432 notes.c: cleanup 'strbuf_grow' call in 'append_edit'
Let's cleanup the unnecessary 'strbuf_grow' call in 'append_edit'. This
"strbuf_grow(&d.buf, size + 1);" is prepared for insert a blank line if
needed, but actually when inserting, "strbuf_insertstr(&d.buf, 0,
"\n");" will do the "grow" for us.

348f199b (builtin-notes: Refactor handling of -F option to allow
combining -m and -F, 2010-02-13) added these to mimic the code
introduced by 2347fae5 (builtin-notes: Add "append" subcommand for
appending to note objects, 2010-02-13) that reads in previous note
before the message.  And the resulting code with explicit sizing is
carried to this day.

In the context of reading an existing note in, exact sizing may have
made sense, but because the resulting note needs cleansing with
stripspace() when appending with this option, such an exact sizing
does not buy us all that much in practice.

It may help avoiding overallocation due to ALLOC_GROW() slop, but
nobody can feed so many long messages for it to matter from the
command line.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 08:51:00 -07:00
Junio C Hamano de00f4b7f3 Merge branch 'jk/log-follow-with-non-literal-pathspec'
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf.

* jk/log-follow-with-non-literal-pathspec:
  diff: detect pathspec magic not supported by --follow
  diff: factor out --follow pathspec check
  pathspec: factor out magic-to-name function
2023-06-20 15:53:13 -07:00
Junio C Hamano 7cb4274d26 Merge branch 'vd/worktree-config-is-per-repository'
The value of config.worktree is per-repository, but has been kept
in a singleton global variable per process. This has been OK as
most Git operations interacted with a single repository at a time,
but not right for operations like recursive "grep" that want to
access multiple repositories from a single process without forking.

The global variable has been eliminated and made into a member in
the per-repository data structure.

* vd/worktree-config-is-per-repository:
  repository: move 'repository_format_worktree_config' to repo scope
  config: pass 'repo' directly to 'config_with_options()'
  config: use gitdir to get worktree config
2023-06-20 15:53:13 -07:00
Junio C Hamano 9cd234e646 Merge branch 'tb/submodule-null-deref-fix'
"git submodule" code trusted the data coming from the config (and
the in-tree .gitmodules file) too much without validating, leading
to NULL dereference if the user mucks with a repository (e.g.
submodule.<name>.url is removed).  This has been corrected.

* tb/submodule-null-deref-fix:
  builtin/submodule--helper.c: handle missing submodule URLs
2023-06-20 15:53:13 -07:00
Junio C Hamano ae19633021 Merge branch 'tl/quote-problematic-arg-for-clarity'
Error message fix.

* tl/quote-problematic-arg-for-clarity:
  surround %s with quotes when failed to lookup commit
2023-06-20 15:53:10 -07:00
Junio C Hamano 06cff0c8d4 Merge branch 'ps/fetch-cleanups'
Code clean-up.

* ps/fetch-cleanups:
  fetch: use `fetch_config` to store "submodule.fetchJobs" value
  fetch: use `fetch_config` to store "fetch.parallel" value
  fetch: use `fetch_config` to store "fetch.recurseSubmodules" value
  fetch: use `fetch_config` to store "fetch.showForcedUpdates" value
  fetch: use `fetch_config` to store "fetch.pruneTags" value
  fetch: use `fetch_config` to store "fetch.prune" value
  fetch: pass through `fetch_config` directly
  fetch: drop unneeded NULL-check for `remote_ref`
  fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
2023-06-20 15:53:10 -07:00
René Scharfe 4416b86c6b strbuf: simplify strbuf_expand_literal_cb()
Now that strbuf_expand_literal_cb() is no longer used as a callback,
drop its "_cb" name suffix and unused context parameter.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
René Scharfe 6f1e2d5279 replace strbuf_expand() with strbuf_expand_step()
Avoid the overhead of passing context to a callback function of
strbuf_expand() by using strbuf_expand_step() in a loop instead.  It
requires explicit handling of %% and unrecognized placeholders, but is
simpler, more direct and avoids void pointers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
René Scharfe 44ccb337f1 strbuf: factor out strbuf_expand_step()
Extract the part of strbuf_expand that finds the next placeholder into a
new function.  It allows to build parsers without callback functions and
the overhead imposed by them.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-18 12:55:30 -07:00
Rubén Justo 2935a97836 branch: fix a leak in cmd_branch
In 98e7ab6d42 (for-each-ref: delay parsing of --sort=<atom> options,
2021-10-20) a new string_list was introduced to accumulate any
"branch.sort" setting.

That string_list is cleared in ref_sorting_options(), which is only
called when processing the "--list" sub-command.  Therefore, with other
sub-command, while having any sort option set, a leak is produced, e.g.:

   $ git config branch.sort invalid_sort_option
   $ git branch --edit-description

   Direct leak of 384 byte(s) in 1 object(s) allocated from:
       ... in xrealloc wrapper.c
       ... in string_list_append_nodup string-list.c
       ... in string_list_append string-list.c
       ... in git_branch_config builtin/branch.c
       ... in configset_iter config.c
       ... in repo_config config.c
       ... in git_config config.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

   Indirect leak of 20 byte(s) in 1 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in string_list_append string-list.c
       ... in git_branch_config builtin/branch.c
       ... in configset_iter config.c
       ... in repo_config config.c
       ... in git_config config.c
       ... in cmd_branch builtin/branch.c
       ... in run_builtin git.c

We don't have a common clean-up section in cmd_branch().  To avoid
refactoring, keep the fix simple, and while we find a better solution
which hopefuly will avoid entirely that string_list, when no sort
options are needed; let's squelch the leak sanitizer using UNLEAK().

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:48 -07:00
Rubén Justo 05e717d556 rev-parse: fix a leak with --abbrev-ref
To handle "--abbrev-ref" we use shorten_unambiguous_ref().  This
function takes a refname and returns a shortened refname, which is a
newly allocated string that needs to be freed.

Unfortunately, the refname variable is reused to receive the shortened
one.  Therefore, we lose the original refname, which needs to be freed
as well, producing a leak.

This leak can be reviewed with:

   $ for a in {1..10}; do git branch foo_${a}; done
   $ git rev-parse --abbrev-ref refs/heads/foo_{1..10}

   Direct leak of 171 byte(s) in 10 object(s) allocated from:
       ... in xstrdup wrapper.c
       ... in expand_ref refs.c
       ... in repo_dwim_ref refs.c
       ... in show_rev builtin/rev-parse.c
       ... in cmd_rev_parse builtin/rev-parse.c
       ... in run_builtin git.c

We have this leak since a45d34691e (rev-parse: --abbrev-ref option to
shorten ref name, 2009-04-13) when "--abbrev-ref" was introduced.

Let's fix it.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-17 09:02:47 -07:00
Jeff King be3d654343 commit: pass --no-divider to interpret-trailers
When git-commit sees any "--trailer" options, it passes the
COMMIT_EDITMSG file through git-interpret-trailers. But it does so
without passing --no-divider, which means that interpret-trailers will
look for a "---" divider to signal the end of the commit message.

That behavior doesn't make any sense in this context; we know we have a
complete and solitary commit message, not something we have to further
parse. And as a result, we'll do the wrong thing if the commit message
contains a "---" marker (which otherwise is not syntactically
significant), inserting any new trailers at the wrong spot.

We can fix this by passing --no-divider. This is the exact situation for
which it was added in 1688c9a489 (interpret-trailers: allow suppressing
"---" divider, 2018-08-22). As noted in the message for that commit, it
just adds the mechanism, and further patches were needed to trigger it
from various callers.  We did that back then in a few spots, like
ffce7f590f (sequencer: ignore "---" divider when parsing trailers,
2018-08-22), but obviously missed this one.

Reported-by: <eric.frederich@siemens.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-16 21:47:40 -07:00
M Hickford 6c26da8404 credential: erase all matching credentials
`credential reject` sends the erase action to each helper, but the
exact behaviour of erase isn't specified in documentation or tests.
Some helpers (such as credential-store and credential-libsecret) delete
all matching credentials, others (such as credential-cache) delete at
most one matching credential.

Test that helpers erase all matching credentials. This behaviour is
easiest to reason about. Users expect that `echo
"url=https://example.com" | git credential reject` or `echo
"url=https://example.com\nusername=tim" | git credential reject` erase
all matching credentials.

Fix credential-cache.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:41 -07:00
M Hickford aeb21ce22e credential: avoid erasing distinct password
Test that credential helpers do not erase a password distinct from the
input. Such calls can happen when multiple credential helpers are
configured.

Fixes for credential-cache and credential-store.

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-15 13:26:39 -07:00
Junio C Hamano 32fe7fff0c Merge branch 'zh/ls-files-format-atoms'
Some atoms that can be used in "--format=<format>" for "git ls-tree"
were not supported by "git ls-files", even though they were relevant
in the context of the latter.

* zh/ls-files-format-atoms:
  ls-files: align format atoms with ls-tree
2023-06-13 12:29:46 -07:00
Junio C Hamano ca9c063c18 Merge branch 'sl/diff-tree-sparse'
"git diff-tree" has been taught to take advantage of the
sparse-index feature.

* sl/diff-tree-sparse:
  diff-tree: integrate with sparse index
2023-06-13 12:29:46 -07:00
Junio C Hamano e490bea8a6 Merge branch 'jk/format-patch-message-id-unleak'
Leakfix.

* jk/format-patch-message-id-unleak:
  format-patch: free elements of rev.ref_message_ids list
  format-patch: free rev.message_id when exiting
2023-06-13 12:29:46 -07:00
Junio C Hamano cbc882ea38 Merge branch 'jc/pack-ref-exclude-include'
"git pack-refs" learns "--include" and "--exclude" to tweak the ref
hierarchy to be packed using pattern matching.

* jc/pack-ref-exclude-include:
  pack-refs: teach pack-refs --include option
  pack-refs: teach --exclude option to exclude refs from being packed
  docs: clarify git-pack-refs --all will pack all refs
2023-06-13 12:29:45 -07:00
Junio C Hamano 6d2a88c728 Merge branch 'kh/keep-tag-editmsg-upon-failure'
"git tag" learned to leave the "$GIT_DIR/TAG_EDITMSG" file when the
command failed, so that the user can salvage what they typed.

* kh/keep-tag-editmsg-upon-failure:
  tag: keep the message file in case ref transaction fails
  t/t7004-tag: add regression test for successful tag creation
  doc: tag: document `TAG_EDITMSG`
2023-06-13 12:29:44 -07:00
Taylor Blau 73320e49ad builtin/repack.c: only collect fully-formed packs
To partition the set of packs based on which ones are "kept" (either
they have a .keep file, or were otherwise marked via the `--keep-pack`
option) and "non-kept" ones (anything else), `git repack` uses its
`collect_pack_filenames()` function.

Ordinarily, we would rely on a convenience function such as
`get_all_packs()` to enumerate and partition the set of packs. But
`collect_pack_filenames()` uses `readdir()` directly to read the
contents of the "$GIT_DIR/objects/pack" directory, and adds each entry
ending in ".pack" to the appropriate list (either kept, or non-kept as
above).

This is subtly racy, since `collect_pack_filenames()` may see a pack
that is not fully staged (i.e., it is missing its ".idx" file).
Ordinarily, this doesn't cause a problem. But it can cause issues when
generating a cruft pack.

This is because `git repack` feeds (among other things) the list of
existing kept packs down to `git pack-objects --cruft` to indicate that
any kept packs will not be removed from the repository (so that the
cruft pack machinery can avoid packing objects that appear in those
packs as cruft).

But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`.
So if a ".pack" file exists (necessary to get that pack to appear to
`collect_pack_filenames()`), but doesn't have a corresponding ".idx"
file (necessary to get that pack to appear via `get_all_packs()`), we'll
complain with:

    fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack'

Fix the above by teaching `collect_pack_filenames()` to only collect
packs with their corresponding `*.idx` files in place, indicating that
those packs have been fully staged.

There are a couple of things worth noting:

  - Since each entry in the `extra_keep` list (which contains the
    `--keep-pack` names) has a `*.pack` suffix, we'll have to swap the
    suffix from ".pack" to ".idx", and compare that instead.

  - Since we use the the `fname_kept_list` to figure out which packs to
    delete (with `git repack -d`), we would have previously deleted a
    `*.pack` with no index (since the existince of a ".pack" file is
    necessary and sufficient to include that pack in the list of
    existing non-kept packs).

    Now we will leave it alone (since that pack won't appear in the
    list). This is far more correct behavior, since we don't want
    to race with a pack being staged. Deleting a partially staged pack
    is unlikely, however, since the window of time between staging a
    pack and moving its .idx file into place is miniscule.

    Note that this window does *not* include the time it takes to
    receive and index the pack, since the incoming data goes into
    "$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack"
    and is thus ignored by collect_pack_filenames().

In the future, this function should probably be rewritten as a callback
to `for_each_file_in_pack_dir()`, but this is the simplest change we
could do in the short-term.

Reported-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:54:38 -07:00
Calvin Wan 787cb8a48a strbuf: remove global variable
As a library that only interacts with other primitives, strbuf should
not utilize the comment_line_char global variable within its
functions. Therefore, add an additional parameter for functions that use
comment_line_char and refactor callers to pass it in instead.
strbuf_stripspace() removes the skip_comments boolean and checks if
comment_line_char is a non-NUL character to determine whether to skip
comments or not.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
Calvin Wan f89854362c credential-store: move related functions to credential-store file
is_rfc3986_unreserved() and is_rfc3986_reserved_or_unreserved() are only
called from builtin/credential-store.c and they are only relevant to that
file so move those functions and make them static.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:49:36 -07:00
Derrick Stolee d24eda4e03 repository: create disable_replace_refs()
Several builtins depend on being able to disable the replace references
so we actually operate on each object individually. These currently do
so by directly mutating the 'read_replace_refs' global.

A future change will move this global into a different place, so it will
be necessary to change all of these lines. However, we can simplify that
transition by abstracting the purpose of these global assignments with a
method call.

We will need to keep this read_replace_refs global forever, as we want
to make sure that we never use replace refs throughout the life of the
process if this method is called. Future changes may present a
repository-scoped version of the variable to represent that repository's
core.useReplaceRefs config value, but a zero-valued read_replace_refs
will always override such a setting.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
Patrick Steinhardt f79e18849b cat-file: add option '-Z' that delimits input and output with NUL
In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with
`-z`, 2022-07-22), we have introduced a new mode to read the input via
NUL-delimited records instead of newline-delimited records. This allows
the user to query for revisions that have newlines in their path
component. While unusual, such queries are perfectly valid and thus it
is clear that we should be able to support them properly.

Unfortunately, the commit only changed the input to be NUL-delimited,
but didn't change the output at the same time. While this is fine for
queries that are processed successfully, it is less so for queries that
aren't. In the case of missing commits for example the result can become
entirely unparsable:

```
$ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" |
    git cat-file --batch -z
7ce4f05bae blob 10
1234567890

commit missing
```

This is of course a crafted query that is intentionally gaming the
deficiency, but more benign queries that contain newlines would have
similar problems.

Ideally, we should have also changed the output to be NUL-delimited when
`-z` is specified to avoid this problem. As the input is NUL-delimited,
it is clear that the output in this case cannot ever contain NUL
characters by itself. Furthermore, Git does not allow NUL characters in
revisions anyway, further stressing the point that using NUL-delimited
output is safe. The only exception is of course the object data itself,
but as git-cat-file(1) prints the size of the object data clients should
read until that specified size has been consumed.

But even though `-z` has only been introduced a few releases ago in Git
v2.38.0, changing the output format retroactively to also NUL-delimit
output would be a backwards incompatible change. And while one could
make the argument that the output is inherently broken already, we need
to assume that there are existing users out there that use it just fine
given that revisions containing newlines are quite exotic.

Instead, introduce a new option `-Z` that switches to NUL-delimited
input and output. While this new option could arguably only switch the
output format to be NUL-delimited, the consequence would be that users
have to always specify both `-z` and `-Z` when the input may contain
newlines. On the other hand, if the user knows that there never will be
newlines in the input, they don't have to use either of those options.
There is thus no usecase that would warrant treating input and output
format separately, which is why we instead opt to "do the right thing"
and have `-Z` mean to NUL-terminate both formats.

The old `-z` option is marked as deprecated with a hint that its output
may become unparsable. It is thus hidden both from the synopsis as well
as the command's help output.

Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:46 -07:00
Patrick Steinhardt 3217f52a49 cat-file: simplify reading from standard input
The batch modes of git-cat-file(1) read queries from stantard input that
are either newline- or NUL-delimited. This code was introduced via
db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`,
2022-07-22), which notes that:

"""
The refactoring here is slightly unfortunate, since we turn loops like:

     while (strbuf_getline(&buf, stdin) != EOF)

 into:

     while (1) {
         int ret;
         if (opt->nul_terminated)
             ret = strbuf_getline_nul(&input, stdin);
         else
             ret = strbuf_getline(&input, stdin);

         if (ret == EOF)
             break;
     }
"""

The commit proposed introducing a helper function that is easier to use,
which is just what we have done in the preceding commit. Refactor the
code to use this new helper to simplify the loop.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:23:24 -07:00
Shuqi Liang 8fac776f44 worktree: integrate with sparse-index
The index is read in 'worktree.c' at two points:

1.The 'validate_no_submodules' function, which checks if there are any
submodules present in the worktree.

2.The 'check_clean_worktree' function, which verifies if a worktree is
'clean', i.e., there are no untracked or modified but uncommitted files.
This is done by running the 'git status' command, and an error message
is thrown if the worktree is not clean. Given that 'git status' is
already sparse-aware, the function is also sparse-aware.

Hence we can just set the requires-full-index to false for
"git worktree".

Add tests that verify that 'git worktree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~20% execution time reduction for
'git worktree' using a sparse index:

(Note:the p2000 test results didn't reflect the huge speedup because of
the index reading time is minuscule comparing to the filesystem
operations.)

Test                                       before  after
-----------------------------------------------------------------------
2000.102: git worktree add....(full-v3)    3.15    2.82  -10.5%
2000.103: git worktree add....(full-v4)    3.14    2.84  -9.6%
2000.104: git worktree add....(sparse-v3)  2.59    2.14  -16.4%
2000.105: git worktree add....(sparse-v4)  2.10    1.57  -25.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 12:13:58 -07:00
Derrick Stolee 7cf3b49f47 add: check color.ui for interactive add
When 'git add -i' and 'git add -p' were converted to a builtin, they
introduced a color bug: the 'color.ui' config setting is ignored.

The included test demonstrates an example that is similar to the
previous test, which focuses on customizing colors. Here, we are
demonstrating that colors are not being used at all by comparing the raw
output and the color-decoded version of that output.

The fix is simple, to use git_color_default_config() as the fallback for
git_add_config(). A more robust change would instead encapsulate the
git_use_color_default global in methods that would check the config
setting if it has not been initialized yet. Some ideas are being
discussed on this front [1], but nothing has been finalized.

[1] https://lore.kernel.org/git/pull.1539.git.1685716420.gitgitgadget@gmail.com/

This test case naturally bisects down to 0527ccb1b5 (add -i: default to
the built-in implementation, 2021-11-30), but the fix makes it clear
that this would be broken even if we added the config to use the builtin
earlier than this.

Reported-by: Greg Alexander <gitgreg@galexander.org>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 10:49:16 -07:00
Jeff King 9eac5954e8 diff: factor out --follow pathspec check
In --follow mode, we require exactly one pathspec. We check this
condition in two places:

  - in diff_setup_done(), we complain if --follow is used with an
    inapropriate pathspec

  - in git-log's revision "tweak" function, we enable log.follow only if
    the pathspec allows it

The duplication isn't a big deal right now, since the logic is so
simple. But in preparation for it becoming more complex, let's pull it
into a shared function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 10:34:25 +09:00
Teng Long e4cf013468 surround %s with quotes when failed to lookup commit
The output may become confusing to recognize if the user
accidentally gave an extra opening space, like:

   $ git commit --fixup=" 6d6360b67e99c2fd82d64619c971fdede98ee74b"
   fatal: could not lookup commit  6d6360b67e99c2fd82d64619c971fdede98ee74b

and it will be better if we surround the %s specifier with single quotes.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03 09:01:10 +09:00
Victoria Dye 3867f6d650 repository: move 'repository_format_worktree_config' to repo scope
Move 'repository_format_worktree_config' out of the global scope and into
the 'repository' struct. This change is similar to how
'repository_format_partial_clone' was moved in ebaf3bcf1a (repository: move
global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository'
struct and updating 'setup.c' & 'repository.c' functions to assign the value
appropriately.

The primary goal of this change is to be able to load the worktree config of
a submodule depending on whether that submodule - not its superproject - has
'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()'
has access to the newly repo-scoped configuration, add a 'struct repository'
argument to 'do_git_config_sequence()' and pass it the 'repo' value from
'config_with_options()'.

Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to
verify 'extensions.worktreeConfig' is read an used independently by
superprojects and submodules.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:41 +09:00
Victoria Dye 9b6b06c159 config: pass 'repo' directly to 'config_with_options()'
Add a 'struct repository' argument to 'config_with_options()' and remove the
'repo' field from 'struct git_config_source'.

A 'struct repository' instance was originally added to the config source in
e3e8bf046e (submodule-config: pass repo upon blob config read, 2021-08-16)
to improve how submodule blob config content was accessed. At the time, this
was the only use for a 'repository' instance, so it was naturally added only
where it was needed: to 'struct git_config_source'. However, in upcoming
patches, 'config_with_options()' will need the repository instance to access
extension information (regardless of whether a 'config_source' exists). To
make the 'struct repository' instance more easily accessible, move it into
the function's arguments.

Update all callers of 'config_with_options()' to pass the appropriate 'repo'
value:

* in 'builtin/config.c', use 'the_repository'
* in 'submodule--config.c', use the 'repo' arg in 'config_from_gitmodules()'
* in 'read_[very_]early_config()' & 'read_protected_config()', set 'repo' to
  NULL (repository instances aren't available there)
* in 'populate_remote_urls()', use the repo instance that has been added to
  the 'struct config_include_data'
* in 'repo_read_config()', use the given 'repo' arg

Finally, note that this patch eliminates the fallback to 'the_repository'
that previously existed for the 'config_source' repo instance if it was
NULL. The fallback is no longer necessary, as the 'repo' is set explicitly
in all cases where it is needed.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:40 +09:00
Taylor Blau fbc806acd1 builtin/submodule--helper.c: handle missing submodule URLs
In e0a862fdaf (submodule helper: convert relative URL to absolute URL if
needed, 2018-10-16), `prepare_to_clone_next_submodule()` lost the
ability to handle URL-less submodules, due to a change from:

    if (repo_get_config_string_const(the_repostiory, sb.buf, &url))
        url = sub->url;

to

    if (repo_get_config_string_const(the_repostiory, sb.buf, &url)) {
        if (starts_with_dot_slash(sub->url) ||
            starts_with_dot_dot_slash(sub->url)) {
                /* ... */
            }
    }

, which will segfault when `sub->url` is NULL, since both
`starts_with_dot_slash()` does not guard its arguments as non-NULL.

Guard the checks to both of the above functions by first checking
whether `sub->url` is non-NULL. There is no need to check whether `sub`
itself is NULL, since we already perform this check earlier in
`prepare_to_clone_next_submodule()`.

By adding a NULL-ness check on `sub->url`, we'll fall into the 'else'
branch, setting `url` to `sub->url` (which is NULL). Before attempting
to invoke `git submodule--helper clone`, check whether `url` is NULL,
and die() if it is.

Reported-by: Tribo Dar <3bodar@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-25 05:26:59 +09:00
ZheNing Hu 4d28c4f75f ls-files: align format atoms with ls-tree
"git ls-files --format" can be used to format the output of
multiple file entries in the index, while "git ls-tree --format"
can be used to format the contents of a tree object. However,
the current set of %(objecttype), "(objectsize)", and
"%(objectsize:padded)" atoms supported by "git ls-files --format"
is a subset of what is available in "git ls-tree --format".

Users sometimes need to establish a unified view between the index
and tree, which can help with comparison or conversion between the two.

Therefore, this patch adds the missing atoms to "git ls-files --format".
"%(objecttype)" can be used to retrieve the object type corresponding
to a file in the index, "(objectsize)" can be used to retrieve the
object size corresponding to a file in the index, and "%(objectsize:padded)"
is the same as "(objectsize)", except with padded format.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-23 20:12:57 +09:00
Jeff King c6d26a9dda format-patch: free elements of rev.ref_message_ids list
When we are showing multiple patches with format-patch, we have to
repeatedly overwrite the rev.message_id field. We take care to avoid
leaking the old value by either freeing it, or adding it to
ref_message_ids, a string list of ids to reference in subsequent
messages.

But unfortunately we do leak the value via that string list. We try
to clear the string list, courtesy of 89f45cf4eb (format-patch: don't
leak "extra_headers" or "ref_message_ids", 2022-04-13). But since it was
initialized as "nodup", the string list doesn't realize it owns the
strings, and it leaks them.

We have two options here:

  1. Continue to init with "nodup", but then tweak the value of
     ref_message_ids.strdup_strings just before clearing.

  2. Init with "dup", but use "append_nodup" when transferring ownership
     of strings to the list. Clearing just works.

I picked the second here, as I think it calls attention to the tricky
part (transferring ownership via the nodup call).

There's one other related fix we have to do, though. We also insert the
result of clean_message_id() into the list. This _sometimes_ allocates
and sometimes does not, depending on whether we have to remove cruft
from the end of the string. Let's teach it to consistently return an
allocated string, so that the caller knows it must be freed.

There's no new test here, as the leak can already be seen in t4014.44 (as
well as others in that script). We can't mark all of t4014 as leak-free,
though, as there are other unrelated leaks that it triggers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-19 09:42:26 -07:00
Jeff King cfa120947e format-patch: free rev.message_id when exiting
We may allocate a message-id string via gen_message_id(), but we never
free it, causing a small leak. This can be demonstrated by running t9001
with a leak-checking build. The offending test is the one touched by
3ece9bf0f9 (send-email: clear the $message_id after validation,
2023-05-17), but the leak is much older than that. The test was simply
unlucky enough to trigger the leaking code path for the first time.

We can fix this by freeing the string at the end of the function. We can
also re-mark the test script as leak-free, effectively reverting
20bd08aefb (t9001: mark the script as no longer leak checker clean,
2023-05-17).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-18 18:33:04 -07:00
Shuqi Liang 48c5fbfb89 diff-tree: integrate with sparse index
The index is read in 'cmd_diff_tree' at two points:

1. The first index read was added in fd66bcc31f (diff-tree: read the
index so attribute checks work in bare repositories, 2017-12-06) to deal
with reading '.gitattributes' content. 77efbb366a (attr: be careful
about sparse directories, 2021-09-08) established that, in a sparse
index, we do _not_ try to load a '.gitattributes' file from within a
sparse directory.

2. The second index access point is involved in rename detection,
specifically when reading from stdin.This was initially added in
f0c6b2a2fd ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where
'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'.
That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in
ff7fe37b05 (diff.c: move read_index() code back to the caller,
2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as
of 6e0b8ed6d3 (diff.c: do not use a separate "size cache"., 2007-05-07)
and nothing about 'detect_rename' otherwise indicates index usage.

Hence we can just set the requires-full-index to false for "diff-tree".

Add tests that verify that 'git diff-tree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~98% execution time reduction for
'git diff-tree' using a sparse index:

Test                                                before  after
-----------------------------------------------------------------------
2000.94: git diff-tree HEAD (full-v3)                0.05   0.04 -20.0%
2000.95: git diff-tree HEAD (full-v4)                0.06   0.05 -16.7%
2000.96: git diff-tree HEAD (sparse-v3)              0.59   0.01 -98.3%
2000.97: git diff-tree HEAD (sparse-v4)              0.61   0.01 -98.4%
2000.98: git diff-tree HEAD -- f2/f4/a (full-v3)     0.05   0.05 +0.0%
2000.99: git diff-tree HEAD -- f2/f4/a (full-v4)     0.05   0.04 -20.0%
2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3)  0.58   0.01 -98.3%
2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4)  0.55   0.01 -98.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-18 10:40:33 -07:00
Jacob Abel 926c40d04b worktree add: emit warn when there is a bad HEAD
Add a warning to `worktree add` when the command tries to reference
HEAD, there exist valid local branches, and the HEAD points to a
non-existent reference.

Current Behavior:
% git -C foo worktree list
/path/to/repo/foo     dadc8e6dac [main]
/path/to/repo/foo_wt  0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

New Behavior:
% git -C foo worktree list
/path/to/repo/foo     dadc8e6dac [main]
/path/to/repo/foo_wt  0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
warning: HEAD points to an invalid (or orphaned) reference.
HEAD path: '/path/to/repo/foo/.git/worktrees/foo_wt/HEAD'
HEAD contents: 'ref: refs/heads/badref'
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:25 -07:00
Jacob Abel 128e5496b3 worktree add: extend DWIM to infer --orphan
Extend DWIM to try to infer `--orphan` when in an empty repository. i.e.
a repository with an invalid/unborn HEAD, no local branches, and if
`--guess-remote` is used then no remote branches.

This behavior is equivalent to `git switch -c` or `git checkout -b` in
an empty repository.

Also warn the user (overriden with `-f`/`--force`) when they likely
intend to checkout a remote branch to the worktree but have not yet
fetched from the remote. i.e. when using `--guess-remote` and there is a
remote but no local or remote refs.

Current Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git workree add --guess-remote ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%

New Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git workree add --guess-remote ../main
fatal: No local or remote refs exist despite at least one remote
present, stopping; use 'add -f' to overide or fetch a remote first
% git workree add --guess-remote -f ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:25 -07:00
Jacob Abel 35f0383ca6 worktree add: introduce "try --orphan" hint
Add a new advice/hint in `git worktree add` for when the user
tries to create a new worktree from a reference that doesn't exist.

Current Behavior:

% git init foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
fatal: invalid reference: HEAD
%

New Behavior:

% git init --bare foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint:   git worktree add --orphan newbranch/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git -C foo worktree add -b newbranch2 new_wt/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint:   git worktree add --orphan -b newbranch2 new_wt/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
Jacob Abel 7ab8918985 worktree add: add --orphan flag
Add support for creating an orphan branch when adding a new worktree.
The functionality of this flag is equivalent to git switch's --orphan
option.

Current Behavior:
% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%

% git init bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l

% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: not a valid object name: 'HEAD'
%

New Behavior:

% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%

% git init --bare bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l

% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: invalid reference: HEAD
% git -C bar.git worktree add --orphan -b main/
Preparing worktree (new branch 'main')
% git -C bar.git worktree add --orphan -b newbranch worktreedir/
Preparing worktree (new branch 'newbranch')
%

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
Jacob Abel b71f919dda worktree add: include -B in usage docs
Document `-B` next to where `-b` is already documented to bring the
usage docs in line with other commands such as git checkout.

Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 15:55:24 -07:00
Junio C Hamano 67a3b2b39f Merge branch 'jc/attr-source-tree'
"git --attr-source=<tree> cmd $args" is a new way to have any
command to read attributes not from the working tree but from the
given tree object.

* jc/attr-source-tree:
  attr: teach "--attr-source=<tree>" global option to "git"
2023-05-17 10:11:41 -07:00
Patrick Steinhardt f7e063f326 fetch: use fetch_config to store "submodule.fetchJobs" value
Move the parsed "submodule.fetchJobs" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt ac197cc094 fetch: use fetch_config to store "fetch.parallel" value
Move the parsed "fetch.parallel" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt 56e8bb4fb4 fetch: use fetch_config to store "fetch.recurseSubmodules" value
Move the parsed "fetch.recurseSubmodules" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt ba28b2ca5d fetch: use fetch_config to store "fetch.showForcedUpdates" value
Move the parsed "fetch.showForcedUpdaets" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt 2b472cfeac fetch: use fetch_config to store "fetch.pruneTags" value
Move the parsed "fetch.pruneTags" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt b779a25e05 fetch: use fetch_config to store "fetch.prune" value
Move the parsed "fetch.prune" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt d1adf85b0a fetch: pass through fetch_config directly
The `fetch_config` structure currently only has a single member, which
is the `display_format`. We're about extend it to contain all parsed
config values and will thus need it available in most of the code.

Prepare for this change by passing through the `fetch_config` directly
instead of only passing its single member.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt 6bc7a37e79 fetch: drop unneeded NULL-check for remote_ref
Drop the `NULL` check for `remote_ref` in `update_local_ref()`. The
function only has a single caller, and that caller always passes in a
non-`NULL` value.

This fixes a false positive in Coverity.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Patrick Steinhardt a40449bcd4 fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
With 50957937f9 (fetch: introduce `display_format` enum, 2023-05-10), a
new enumeration was introduced to determine the display format that is
to be used by git-fetch(1). The `DISPLAY_FORMAT_UNKNOWN` value isn't
ever used though, and neither do we rely on the explicit `0` value for
initialization anywhere.

Remove the enum value.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-17 09:55:33 -07:00
Kristoffer Haugsbakk 08c12ec1d0 tag: keep the message file in case ref transaction fails
The ref transaction can fail after the user has written their tag
message. In particular, if there exists a tag `foo/bar` and `git tag -a
foo` is said then the command will only fail once it tries to write
`refs/tags/foo`, which is after the file has been unlinked.

Hold on to the message file for a little longer so that it is not
unlinked before the fatal error occurs.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-16 11:38:14 -07:00
Junio C Hamano 15ba44f1b4 Merge branch 'ps/fetch-output-format'
"git fetch" learned the "--porcelain" option that emits what it did
in a machine-parseable format.

* ps/fetch-output-format:
  fetch: introduce machine-parseable "porcelain" output format
  fetch: move option related variables into main function
  fetch: lift up parsing of "fetch.output" config variable
  fetch: introduce `display_format` enum
  fetch: refactor calculation of the display table width
  fetch: print left-hand side when fetching HEAD:foo
  fetch: add a test to exercise invalid output formats
  fetch: split out tests for output format
  fetch: fix `--no-recurse-submodules` with multi-remote fetches
2023-05-15 13:59:07 -07:00
Junio C Hamano 5ca11547bb Merge branch 'sl/diff-files-sparse'
Teach "diff-files" not to expand sparse-index unless needed.

* sl/diff-files-sparse:
  diff-files: integrate with sparse index
  t1092: add tests for `git diff-files`
2023-05-15 13:59:06 -07:00
Junio C Hamano 80754c5cc0 Merge branch 'ds/merge-tree-use-config'
Allow git forges to disable replace-refs feature while running "git
merge-tree".

* ds/merge-tree-use-config:
  merge-tree: load default git config
2023-05-15 13:59:06 -07:00
Junio C Hamano f37da97723 Merge branch 'tl/push-branches-is-an-alias-for-all'
"git push --all" gained an alias "git push --branches".

* tl/push-branches-is-an-alias-for-all:
  t5583: fix shebang line
  push: introduce '--branches' option
2023-05-15 13:59:05 -07:00
Junio C Hamano be2fd0edb1 Merge branch 'jc/name-rev-deprecate-stdin-further'
The "--stdin" option of "git name-rev" has been replaced with
the "--annotate-stdin" option more than a year ago.  We stop
advertising it in the "git name-rev -h" output.

* jc/name-rev-deprecate-stdin-further:
  name-rev: make --stdin hidden
2023-05-15 13:59:05 -07:00
Junio C Hamano cd2b740ca9 Merge branch 'ds/fsck-bitmap'
"git fsck" learned to detect bit-flip breakages in the reachability
bitmap files.

* ds/fsck-bitmap:
  fsck: use local repository
  fsck: verify checksums of all .bitmap files
2023-05-15 13:59:04 -07:00
Junio C Hamano d3f2e4ab13 Merge branch 'rj/branch-unborn-in-other-worktrees'
Error messages given when working on an unborn branch that is
checked out in another worktree have been improved.

* rj/branch-unborn-in-other-worktrees:
  branch: avoid unnecessary worktrees traversals
  branch: rename orphan branches in any worktree
  branch: description for orphan branch errors
  branch: use get_worktrees() in copy_or_rename_branch()
  branch: test for failures while renaming branches
2023-05-15 13:59:03 -07:00
John Cai 4fe42f326e pack-refs: teach pack-refs --include option
Allow users to be more selective over which refs to pack by adding an
--include option to git-pack-refs.

The existing options allow some measure of selectivity. By default
git-pack-refs packs all tags. --all can be used to include all refs,
and the previous commit added the ability to exclude certain refs with
--exclude.

While these knobs give the user some selection over which refs to pack,
it could be useful to give more control. For instance, a repository may
have a set of branches that are rarely updated and would benefit from
being packed. --include would allow the user to easily include a set of
branches to be packed while leaving everything else unpacked.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:14 -07:00
John Cai 826ae79fca pack-refs: teach --exclude option to exclude refs from being packed
At GitLab, we have a system that creates ephemeral internal refs that
don't live long before getting deleted. Having an option to exclude
certain refs from a packed-refs file allows these internal references to
be deleted much more efficiently.

Add an --exclude option to the pack-refs builtin, and use the ref
exclusions API to exclude certain refs from being packed into the final
packed-refs file

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-12 14:54:14 -07:00
Derrick Stolee b6551feadf merge-tree: load default git config
The 'git merge-tree' command handles creating root trees for merges
without using the worktree. This is a critical operation in many Git
hosts, as they typically store bare repositories.

This builtin does not load the default Git config, which can have
several important ramifications.

In particular, one config that is loaded by default is
core.useReplaceRefs. This is typically disabled in Git hosts due to
the ability to spoof commits in strange ways.

Since this config is not loaded specifically during merge-tree, users
were previously able to use refs/replace/ references to make pull
requests that looked valid but introduced malicious content. The
resulting merge commit would have the correct commit history, but the
malicious content would exist in the root tree of the merge.

The fix is simple: load the default Git config in cmd_merge_tree().
This may also fix other behaviors that are effected by reading default
config. The only possible downside is a little extra computation time
spent reading config. The config parsing is placed after basic argument
parsing so it does not slow down usage errors.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 12:20:44 -07:00
Patrick Steinhardt dd781e3856 fetch: introduce machine-parseable "porcelain" output format
The output of git-fetch(1) is obviously designed for consumption by
users, only: we neatly columnize data, we abbreviate reference names, we
print neat arrows and we don't provide information about actual object
IDs that have changed. This makes the output format basically unusable
in the context of scripted invocations of git-fetch(1) that want to
learn about the exact changes that the command performs.

Introduce a new machine-parseable "porcelain" output format that is
supposed to fix this shortcoming. This output format is intended to
provide information about every reference that is about to be updated,
the old object ID that the reference has been pointing to and the new
object ID it will be updated to. Furthermore, the output format provides
the same flags as the human-readable format to indicate basic conditions
for each reference update like whether it was a fast-forward update, a
branch deletion, a rejected update or others.

The output format is quite simple:

```
<flag> <old-object-id> <new-object-id> <local-reference>\n
```

We assume two conditions which are generally true:

    - The old and new object IDs have fixed known widths and cannot
      contain spaces.

    - References cannot contain newlines.

With these assumptions, the output format becomes unambiguously
parseable. Furthermore, given that this output is designed to be
consumed by scripts, the machine-readable data is printed to stdout
instead of stderr like the human-readable output is. This is mostly done
so that other data printed to stderr, like error messages or progress
meters, don't interfere with the parseable data.

A notable ommission here is that the output format does not include the
remote from which a reference was fetched, which might be important
information especially in the context of multi-remote fetches. But as
such a format would require us to print the remote for every single
reference update due to parallelizable fetches it feels wasteful for the
most likely usecase, which is when fetching from a single remote.

In a similar spirit, a second restriction is that this cannot be used
with `--recurse-submodules`. This is because any reference updates would
be ambiguous without also printing the repository in which the update
happens.

Considering that both multi-remote and submodule fetches are user-facing
features, using them in conjunction with `--porcelain` that is intended
for scripting purposes is likely not going to be useful in the majority
of cases. With that in mind these restrictions feel acceptable. If
usecases for either of these come up in the future though it is easy
enough to add a new "porcelain-v2" format that adds this information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt cdc034a0ac fetch: move option related variables into main function
The options of git-fetch(1) which we pass to `parse_options()` are
declared globally in `builtin/fetch.c`. This means we're forced to use
global variables for all the options, which is more likely to cause
confusion than explicitly passing state around.

Refactor the code to move the options into `cmd_fetch()`. Move variables
that were previously forced to be declared globally and which are only
used by `cmd_fetch()` into function-local scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt 58afbe885c fetch: lift up parsing of "fetch.output" config variable
Parsing the display format happens inside of `display_state_init()`. As
we only need to check for a simple config entry, this is a natural
location to put this code as it means that display-state logic is neatly
contained in a single location.

We're about to introduce a new "porcelain" output format though that is
intended to be parseable by machines, for example inside of a script.
This format can be enabled by passing the `--porcelain` switch to
git-fetch(1). As a consequence, we'll have to add a second callsite that
influences the output format, which will become awkward to handle.

Refactor the code such that callers are expected to pass the display
format that is to be used into `display_state_init()`. This allows us to
lift up the code into the main function, where we can then hook it into
command line options parser in a follow-up commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt 50957937f9 fetch: introduce display_format enum
We currently have two different display formats in git-fetch(1) with the
"full" and "compact" formats. This is tracked with a boolean value that
simply denotes whether the display format is supposed to be compacted
or not. This works reasonably well while there are only two formats, but
we're about to introduce another format that will make this a bit more
awkward to use.

Introduce a `enum display_format` that is more readily extensible.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt 9539638a2b fetch: refactor calculation of the display table width
When displaying reference updates, we try to print the references in a
neat table. As the table's width is determined its contents we thus need
to precalculate the overall width before we can start printing updated
references.

The calculation is driven by `display_state_init()`, which invokes
`refcol_width()` for every reference that is to be printed. This split
is somewhat confusing. For one, we filter references that shall be
attributed to the overall width in both places. And second, we
needlessly recalculate the maximum line length based on the terminal
columns and display format for every reference.

Refactor the code so that the complete width calculations are neatly
contained in `refcol_width()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt 1c31764dda fetch: print left-hand side when fetching HEAD:foo
`store_updated_refs()` parses the remote reference for two purposes:

    - It gets used as a note when writing FETCH_HEAD.

    - It is passed through to `display_ref_update()` to display
      updated references in the following format:

      ```
       * branch               master          -> master
      ```

In most cases, the parsed remote reference is the prettified reference
name and can thus be used for both cases. But if the remote reference is
HEAD, the parsed remote reference becomes empty. This is intended when
we write the FETCH_HEAD, where we skip writing the note in that case.
But when displaying the updated references this leads to inconsistent
output where the left-hand side of reference updates is missing in some
cases:

```
$ git fetch origin HEAD HEAD:explicit-head :implicit-head main
From https://github.com/git/git
 * branch                  HEAD       -> FETCH_HEAD
 * [new ref]                          -> explicit-head
 * [new ref]                          -> implicit-head
 * branch                  main       -> FETCH_HEAD
```

This behaviour has existed ever since the table-based output has been
introduced for git-fetch(1) via 165f390250 (git-fetch: more terse fetch
output, 2007-11-03) and was never explicitly documented either in the
commit message or in any of our tests. So while it may not be a bug per
se, it feels like a weird inconsistency and not like it was a concious
design decision.

The logic of how we compute the remote reference name that we ultimately
pass to `display_ref_update()` is not easy to follow. There are three
different cases here:

    - When the remote reference name is "HEAD" we set the remote
      reference name to the empty string. This is the case that causes
      the left-hand side to go missing, where we would indeed want to
      print "HEAD" instead of the empty string. This is what
      `prettify_refname()` would return.

    - When the remote reference name has a well-known prefix then we
      strip this prefix. This matches what `prettify_refname()` does.

    - Otherwise, we keep the fully qualified reference name. This also
      matches what `prettify_refname()` does.

As the return value of `prettify_refname()` would do the correct thing
for us in all three cases, we can thus fix the inconsistency by passing
through the full remote reference name to `display_ref_update()`, which
learns to call `prettify_refname()`. At the same time, this also
simplifies the code a bit.

Note that this patch also changes formatting of the block that computes
the "kind" (which is the category like "branch" or "tag") and "what"
(which is the prettified reference name like "master" or "v1.0")
variables. This is done on purpose so that it is part of the diff,
hopefully making the change easier to comprehend.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:25 -07:00
Patrick Steinhardt 5667141e3b fetch: fix --no-recurse-submodules with multi-remote fetches
When running `git fetch --no-recurse-submodules`, the exectation is that
we don't fetch any submodules. And while this works for fetches of a
single remote, it doesn't when fetching multiple remotes at once. The
result is that we do recurse into submodules even though the user has
explicitly asked us not to.

This is because while we pass on `--recurse-submodules={yes,on-demand}`
if specified by the user, we don't pass on `--no-recurse-submodules` to
the subprocess spawned to perform the submodule fetch.

Fix this by also forwarding this flag as expected.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-10 10:35:24 -07:00
Junio C Hamano 2ca91d1ee0 Merge branch 'mh/credential-oauth-refresh-token'
The credential subsystem learns to help OAuth framework.

* mh/credential-oauth-refresh-token:
  credential: new attribute oauth_refresh_token
2023-05-10 10:23:29 -07:00
Junio C Hamano 461eea3fb8 Merge branch 'ob/messages-capitalize-exception'
Message update.

* ob/messages-capitalize-exception:
  messages: capitalization and punctuation exceptions
2023-05-09 16:45:46 -07:00
Junio C Hamano ccd12a3d6c Merge branch 'en/header-split-cache-h-part-2'
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
2023-05-09 16:45:46 -07:00
Shuqi Liang 8c30be9176 diff-files: integrate with sparse index
Remove full index requirement for `git diff-files`. Refactor the
ensure_expanded and ensure_not_expanded functions by introducing a
common helper function, ensure_index_state. Add test to ensure the index
is no expanded in `git diff-files`.

The `p2000` tests demonstrate a ~96% execution time reduction for 'git
diff-files' and a ~97% execution time reduction for 'git diff-files'
for a file using a sparse index:

Test                                               before  after
-----------------------------------------------------------------------
2000.94: git diff-files (full-v3)                  0.09    0.08 -11.1%
2000.95: git diff-files (full-v4)                  0.09    0.09 +0.0%
2000.96: git diff-files (sparse-v3)                0.52    0.02 -96.2%
2000.97: git diff-files (sparse-v4)                0.51    0.02 -96.1%
2000.98: git diff-files -- f2/f4/a (full-v3)       0.06    0.07 +16.7%
2000.99: git diff-files -- f2/f4/a (full-v4)       0.08    0.08 +0.0%
2000.100: git diff-files -- f2/f4/a (sparse-v3)    0.46    0.01 -97.8%
2000.101: git diff-files -- f2/f4/a (sparse-v4)    0.51    0.02 -96.1%

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09 14:26:36 -07:00
Teng Long 425b4d7f47 push: introduce '--branches' option
The '--all' option of git-push built-in cmd support to push all branches
(refs under refs/heads) to remote. Under the usage, a user can easlily
work in some scenarios, for example, branches synchronization and batch
upload.

The '--all' was introduced for a long time, meanwhile, git supports to
customize the storage location under "refs/". when a new git user see
the usage like, 'git push origin --all', we might feel like we're
pushing _all_ the refs instead of just branches without looking at the
documents until we found the related description of it or '--mirror'.

To ensure compatibility, we cannot rename '--all' to another name
directly, one way is, we can try to add a new option '--heads' which be
identical with the functionality of '--all' to let the user understand
the meaning of representation more clearly. Actually, We've more or less
named options this way already, for example, in 'git-show-ref' and 'git
ls-remote'.

At the same time, we fix a related issue about the wrong help
information of '--all' option in code and add some test cases in
t5523, t5543 and t5583.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:36:43 -07:00
John Cai 44451a2e5e attr: teach "--attr-source=<tree>" global option to "git"
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish,
2023-01-14) taught "git check-attr" the "--source=<tree>" option to
allow it to read attribute files from a tree-ish, but did so only
for the command.  Just like "check-attr" users wanted a way to use
attributes from a tree-ish and not from the working tree files,
users of other commands (like "git diff") would benefit from the
same.

Undo most of the UI change the commit made, while keeping the
internal logic to read attributes from a given tree-ish. Expose the
internal logic via a new "--attr-source=<tree>" command line option
given to "git", so that it can be used with any git command that
runs as part of the main git process.

Additionally, add an environment variable GIT_ATTR_SOURCE that is set
when --attr-source is passed in, so that subprocesses use the same value
for the attributes source tree.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:34:09 -07:00
John Cai 9019d7dceb name-rev: make --stdin hidden
In 34ae3b70 (name-rev: deprecate --stdin in favor of --annotate-stdin),
we renamed --stdin to --annotate-stdin for the sake of a clearer name
for the option, and added text that indicates --stdin is deprecated. The
next step is to hide --stdin completely.

Make the option hidden. Also, update documentation to remove all
mentions of --stdin.

Signed-off-by: "John Cai" <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-06 14:32:20 -07:00
Junio C Hamano d699e27bd4 Merge branch 'tb/ban-strtok'
Mark strtok() and strtok_r() to be banned.

* tb/ban-strtok:
  banned.h: mark `strtok()` and `strtok_r()` as banned
  t/helper/test-json-writer.c: avoid using `strtok()`
  t/helper/test-oidmap.c: avoid using `strtok()`
  t/helper/test-hashmap.c: avoid using `strtok()`
  string-list: introduce `string_list_setlen()`
  string-list: multi-delimiter `string_list_split_in_place()`
2023-05-02 10:13:35 -07:00
Derrick Stolee cf9cd8b55c fsck: use local repository
In 0d30feef3c (fsck: create scaffolding for rev-index checks,
2023-04-17) and later 5a6072f631 (fsck: validate .rev file header,
2023-04-17), the check_pack_rev_indexes() method was created with a
'struct repository *r' parameter. However, this parameter was unused and
instead 'the_repository' was used in its place.

Fix this situation with the obvious replacement.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 08:48:23 -07:00
Derrick Stolee 756f1bcd29 fsck: verify checksums of all .bitmap files
If a filesystem-level corruption occurs in a .bitmap file, Git can react
poorly. This could take the form of a run-time error due to failing to
parse an EWAH bitmap or be more subtle such as returning the wrong set
of objects to a fetch or clone.

A natural first response to either of these kinds of errors is to run
'git fsck' to see if any files are corrupt. This currently ignores all
.bitmap files.

Add checks to 'git fsck' for all .bitmap files that are currently
associated with a multi-pack-index or pack file. Verify their checksums
using the hashfile API.

We iterate through all multi-pack-indexes and pack-files to be sure to
check all .bitmap files, not just the one that would be read by the
process. For example, a multi-pack-index bitmap overrules a pack-bitmap.
However, if the multi-pack-index is removed, the pack-bitmap may be
selected instead. Be thorough to include every file that could become
active in such a way. This includes checking files in alternates.

There is potential that we could extend this effort to check the
structure of the reachability bitmaps themselves, but it is very
expensive to do so. At minimum, it's as expensive as generating the
bitmaps in the first place, and that's assuming that we don't use the
trivial algorithm of verifying each bitmap individually. The trivial
algorithm will result in quadratic behavior (number of objects times
number of bitmapped commits) while the bitmap building operation
constructs a lattice of commits to build bitmaps incrementally and then
generate the final bitmaps from a subset of those commits.

If we were to extend 'git fsck' to check .bitmap file contents more
closely like this, then we would likely want to hide it behind an option
that signals the user is more willing to do expensive operations such as
this.

For testing, set up a repository with a pack-bitmap _and_ a
multi-pack-index bitmap. This requires some file movement to avoid
deleting the pack-bitmap during the repack that creates the
multi-pack-index bitmap. We can then verify that 'git fsck' is checking
all files, not just the "active" bitmap.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02 08:48:22 -07:00
Junio C Hamano fc23c397c7 Merge branch 'tb/enable-cruft-packs-by-default'
When "gc" needs to retain unreachable objects, packing them into
cruft packs (instead of exploding them into loose object files) has
been offered as a more efficient option for some time.  Now the use
of cruft packs has been made the default and no longer considered
an experimental feature.

* tb/enable-cruft-packs-by-default:
  repository.h: drop unused `gc_cruft_packs`
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: add additional test cases
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  builtin/repack.c: fix incorrect reference to '-C'
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-28 16:03:03 -07:00
Oswald Buddenhagen b734fe49fd messages: capitalization and punctuation exceptions
These are conscious violations of the usual rules for error messages,
based on this reasoning:

 - If an error message is directly followed by another sentence, it
   needs to be properly terminated with a period, lest the grammar
   looks broken and becomes hard to read.

 - That second sentence isn't actually an error message any more, so
   it should abide to conventional language rules for good looks and
   legibility. Arguably, these should be converted to advice
   messages (which the user can squelch, too), but that's a much
   bigger effort to get right.

 - Neither of these apply to the first hunk in do_exec(), but this
   two-line message looks just too much like a real sentence to not
   terminate it. Also, leaving it alone would make it asymmetrical
   to the other hunk.

Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-28 12:06:27 -07:00
Junio C Hamano a02675ad90 Merge branch 'ds/fsck-pack-revindex'
"git fsck" learned to validate the on-disk pack reverse index files.

* ds/fsck-pack-revindex:
  fsck: validate .rev file header
  fsck: check rev-index position values
  fsck: check rev-index checksums
  fsck: create scaffolding for rev-index checks
2023-04-27 16:00:59 -07:00
Junio C Hamano 849c8b3dbf Merge branch 'tb/pack-revindex-on-disk'
The on-disk reverse index that allows mapping from the pack offset
to the object name for the object stored at the offset has been
enabled by default.

* tb/pack-revindex-on-disk:
  t: invert `GIT_TEST_WRITE_REV_INDEX`
  config: enable `pack.writeReverseIndex` by default
  pack-revindex: introduce `pack.readReverseIndex`
  pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK
  pack-revindex: make `load_pack_revindex` take a repository
  t5325: mark as leak-free
  pack-write.c: plug a leak in stage_tmp_packfiles()
2023-04-27 16:00:59 -07:00
Junio C Hamano 36628c56ed Merge branch 'ps/fix-geom-repack-with-alternates'
Geometric repacking ("git repack --geometric=<n>") in a repository
that borrows from an alternate object database had various corner
case bugs, which have been corrected.

* ps/fix-geom-repack-with-alternates:
  repack: disable writing bitmaps when doing a local repack
  repack: honor `-l` when calculating pack geometry
  t/helper: allow chmtime to print verbosely without modifying mtime
  pack-objects: extend test coverage of `--stdin-packs` with alternates
  pack-objects: fix error when same packfile is included and excluded
  pack-objects: fix error when packing same pack twice
  pack-objects: split out `--stdin-packs` tests into separate file
  repack: fix generating multi-pack-index with only non-local packs
  repack: fix trying to use preferred pack in alternates
  midx: fix segfault with no packs and invalid preferred pack
2023-04-25 13:56:20 -07:00
Junio C Hamano 80d268f309 Merge branch 'jk/protocol-cap-parse-fix'
The code to parse capability list for v0 on-wire protocol fell into
an infinite loop when a capability appears multiple times, which
has been corrected.

* jk/protocol-cap-parse-fix:
  v0 protocol: use size_t for capability length/offset
  t5512: test "ls-remote --heads --symref" filtering with v0 and v2
  t5512: allow any protocol version for filtered symref test
  t5512: add v2 support for "ls-remote --symref" test
  v0 protocol: fix sha1/sha256 confusion for capabilities^{}
  t5512: stop referring to "v1" protocol
  v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25 13:56:20 -07:00
Junio C Hamano 0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
Taylor Blau 52acddf36c string-list: multi-delimiter string_list_split_in_place()
Enhance `string_list_split_in_place()` to accept multiple characters as
delimiters instead of a single character.

Instead of using `strchr(2)` to locate the first occurrence of the given
delimiter character, `string_list_split_in_place_multi()` uses
`strcspn(2)` to move past the initial segment of characters comprised of
any characters in the delimiting set.

When only a single delimiting character is provided, `strpbrk(2)` (which
is implemented with `strcspn(2)`) has equivalent performance to
`strchr(2)`. Modern `strcspn(2)` implementations treat an empty
delimiter or the singleton delimiter as a special case and fall back to
calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)`
this way.

This change is one step to removing `strtok(2)` from the tree. Note that
`string_list_split_in_place()` is not a strict replacement for
`strtok()`, since it will happily turn sequential delimiter characters
into empty entries in the resulting string_list. For example:

    string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1)

would yield a string list of:

    ["foo", "", "", "bar", "", "", "baz"]

Callers that wish to emulate the behavior of strtok(2) more directly
should call `string_list_remove_empty_items()` after splitting.

To avoid regressions for the new multi-character delimter cases, update
t0063 in this patch as well.

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35
[2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 16:01:28 -07:00
Elijah Newren d4a4f9291d commit.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
Elijah Newren d1cbe1e6d8 hash-ll.h: split out of hash.h to remove dependency on repository.h
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository->hash_algo).  However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id.  Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.

This allows hash.h and object.h to be fairly small, minimal headers.  It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
Elijah Newren b388633c5c pkt-line.h: move declarations for pkt-line.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:32 -07:00
Elijah Newren d5fff46f40 copy.h: move declarations for copy.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
Elijah Newren 623b80bef2 server-info.h: move declarations for server-info.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
Elijah Newren cb2a51356d symlinks.h: move declarations for symlinks.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:31 -07:00
Junio C Hamano b64894c206 Merge branch 'ow/ref-filter-omit-empty'
"git branch --format=..." and "git format-patch --format=..."
learns "--omit-empty" to hide refs that whose formatting result
becomes an empty string from the output.

* ow/ref-filter-omit-empty:
  branch, for-each-ref, tag: add option to omit empty lines
2023-04-21 15:35:05 -07:00
Junio C Hamano 7ac228c994 Merge branch 'rn/sparse-describe'
"git describe --dirty" learns to work better with sparse-index.

* rn/sparse-describe:
  describe: enable sparse index for describe
2023-04-21 15:35:04 -07:00
M Hickford a5c76569e7 credential: new attribute oauth_refresh_token
Git authentication with OAuth access token is supported by every popular
Git host including GitHub, GitLab and BitBucket [1][2][3]. Credential
helpers Git Credential Manager (GCM) and git-credential-oauth generate
OAuth credentials [4][5]. Following RFC 6749, the application prints a
link for the user to authorize access in browser. A loopback redirect
communicates the response including access token to the application.

For security, RFC 6749 recommends that OAuth response also includes
expiry date and refresh token [6]. After expiry, applications can use
the refresh token to generate a new access token without user
reauthorization in browser. GitLab and BitBucket set the expiry at two
hours [2][3]. (GitHub doesn't populate expiry or refresh token.)

However the Git credential protocol has no attribute to store the OAuth
refresh token (unrecognised attributes are silently discarded). This
means that the user has to regularly reauthorize the helper in browser.
On a browserless system, this is particularly intrusive, requiring a
second device.

Introduce a new attribute oauth_refresh_token. This is especially
useful when a storage helper and a read-only OAuth helper are configured
together. Recall that `credential fill` calls each helper until it has a
non-expired password.

```
[credential]
	helper = storage  # eg. cache or osxkeychain
	helper = oauth
```

The OAuth helper can use the stored refresh token forwarded by
`credential fill` to generate a fresh access token without opening the
browser. See
https://github.com/hickford/git-credential-oauth/pull/3/files
for an implementation tested with this patch.

Add support for the new attribute to credential-cache. Eventually, I
hope to see support in other popular storage helpers.

Alternatives considered: ask helpers to store all unrecognised
attributes. This seems excessively complex for no obvious gain.
Helpers would also need extra information to distinguish between
confidential and non-confidential attributes.

Workarounds: GCM abuses the helper get/store/erase contract to store the
refresh token during credential *get* as the password for a fictitious
host [7] (I wrote this hack). This workaround is only feasible for a
monolithic helper with its own storage.

[1] https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/
[2] https://docs.gitlab.com/ee/api/oauth2.html#access-git-over-https-with-access-token
[3] https://support.atlassian.com/bitbucket-cloud/docs/use-oauth-on-bitbucket-cloud/#Cloning-a-repository-with-an-access-token
[4] https://github.com/GitCredentialManager/git-credential-manager
[5] https://github.com/hickford/git-credential-oauth
[6] https://datatracker.ietf.org/doc/html/rfc6749#section-5.1
[7] 66b94e489a/src/shared/GitLab/GitLabHostProvider.cs (L207)

Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-21 09:38:30 -07:00
Junio C Hamano a4a4db8cf7 Merge branch 'gc/better-error-when-local-clone-fails-with-symlink'
"git clone --local" stops copying from an original repository that
has symbolic links inside its $GIT_DIR; an error message when that
happens has been updated.

* gc/better-error-when-local-clone-fails-with-symlink:
  clone: error specifically with --local and symlinked objects
2023-04-20 14:33:36 -07:00
Junio C Hamano 08bd076ce4 Merge branch 'rs/get-tar-commit-id-use-defined-const'
Code clean-up to replace a hardcoded constant with a CPP macro.

* rs/get-tar-commit-id-use-defined-const:
  get-tar-commit-id: use TYPEFLAG_GLOBAL_HEADER instead of magic value
2023-04-20 14:33:36 -07:00
Taylor Blau e3e24de1bf builtin/gc.c: make gc.cruftPacks enabled by default
Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects
via loose, 2022-05-20), `git gc` learned the `--cruft` option and
`gc.cruftPacks` configuration to opt-in to writing cruft packs when
collecting or pruning unreachable objects.

Cruft packs were introduced with the merge in a50036da1a (Merge branch
'tb/cruft-packs', 2022-06-03). They address the problem of "loose object
explosions", where Git will write out many individual loose objects when
there is a large number of unreachable objects that have not yet aged
past `--prune=<date>`.

Instead of keeping track of those unreachable yet recent objects via
their loose object file's mtime, cruft packs collect all unreachable
objects into a single pack with a corresponding `*.mtimes` file that
acts as a table to store the mtimes of all unreachable objects. This
prevents the need to store unreachable objects as loose as they age out
of the repository, and avoids the problem of loose object explosions.

Beyond avoiding loose object explosions, cruft packs also act as a more
efficient mechanism to store unreachable objects as they age out of a
repository. This is because pairs of similar unreachable objects serve
as delta bases for one another.

In 5b92477f89, the feature was introduced as experimental. Since then,
GitHub has been running these patches in every repository generating
hundreds of millions of cruft packs along the way. The feature is
battle-tested, and avoids many pathological cases such as above. Users
who either run `git gc` manually, or via `git maintenance` can benefit
from having cruft packs.

As such, enable cruft pack generation to take place by default (by
making `gc.cruftPacks` have the default of "true" rather than "false).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:48 -07:00
Taylor Blau 05b9013b71 builtin/gc.c: ignore cruft packs with --keep-largest-pack
When cruft packs were implemented, we never adjusted the code for `git
gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
packs. This option and configuration option share a common
implementation, but including cruft packs is wrong in both cases:

  - Running `git gc --keep-largest-pack` in a repository where the
    largest pack is the cruft pack itself will make it impossible for
    `git gc` to prune objects, since the cruft pack itself is kept.

  - The same is true for `gc.bigPackThreshold`, if the size of the cruft
    pack exceeds the limit set by the caller.

In the future, it is possible that `gc.bigPackThreshold` could be used
to write a separate cruft pack containing any new unreachable objects
that entered the repository since the last time a cruft pack was
written.

There are some complexities to doing so, mainly around handling
pruning objects that are in an existing cruft pack that is above the
threshold (which would either need to be rewritten, or else delay
pruning). Rewriting a substantially similar cruft pack isn't ideal, but
it is significantly better than the status-quo.

If users have large cruft packs that they don't want to rewrite, they
can mark them as `*.keep` packs. But in general, if a repository has a
cruft pack that is so large it is slowing down GC's, it should probably
be pruned anyway.

In the meantime, ignore cruft packs in the common implementation for
both of these options, and add a pair of tests to prevent any future
regressions here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
Taylor Blau c512f31109 builtin/repack.c: fix incorrect reference to '-C'
When cruft packs were originally being developed, `-C` was designated as
the short-form for `--cruft` (as in `git repack -C`).

This was dropped due to confusion with Git's top-level `-C` option
before submitting to the list. But the reference to it in
`--cruft-expiration`'s help text was never updated. Fix that dangling
reference in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-18 14:56:47 -07:00
Junio C Hamano 3c957e6d39 Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling'
Clean-up of the code path that deals with merge strategy option
handling in "git rebase".

* pw/rebase-cleanup-merge-strategy-option-handling:
  rebase: remove a couple of redundant strategy tests
  rebase -m: fix serialization of strategy options
  rebase -m: cleanup --strategy-option handling
  sequencer: use struct strvec to store merge strategy options
  rebase: stop reading and writing unnecessary strategy state
2023-04-17 18:05:13 -07:00
Junio C Hamano 66bf8f1943 Merge branch 'cm/branch-delete-error-message-update'
"git branch -d origin/master" would say "no such branch", but it is
likely a missed "-r" if refs/remotes/origin/master exists.  The
command has been taught to give such a hint in its error message.

* cm/branch-delete-error-message-update:
  branch: improve error log on branch not found by checking remotes refs
2023-04-17 18:05:12 -07:00
Junio C Hamano 9d8370d445 Merge branch 'tk/mergetool-gui-default-config'
"git mergetool" and "git difftool" learns a new configuration
guiDefault to optionally favor configured guitool over non-gui-tool
automatically when $DISPLAY is set.

* tk/mergetool-gui-default-config:
  mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
2023-04-17 18:05:11 -07:00
Junio C Hamano d47ee0a565 Merge branch 'sl/sparse-write-tree'
"git write-tree" learns to work better with sparse-index.

* sl/sparse-write-tree:
  write-tree: integrate with sparse index
2023-04-17 18:05:11 -07:00
Derrick Stolee 5a6072f631 fsck: validate .rev file header
While parsing a .rev file, we check the header information to be sure it
makes sense. This happens before doing any additional validation such as
a checksum or value check. In order to differentiate between a bad
header and a non-existent file, we need to update the API for loading a
reverse index.

Make load_pack_revindex_from_disk() non-static and specify that a
positive value means "the file does not exist" while other errors during
parsing are negative values. Since an invalid header prevents setting up
the structures we would use for further validations, we can stop at that
point.

The place where we can distinguish between a missing file and a corrupt
file is inside load_revindex_from_disk(), which is used both by pack
rev-indexes and multi-pack-index rev-indexes. Some tests in t5326
demonstrate that it is critical to take some conditions to allow
positive error signals.

Add tests that check the three header values.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:05 -07:00
Derrick Stolee 0d30feef3c fsck: create scaffolding for rev-index checks
The 'fsck' builtin checks many of Git's on-disk data structures, but
does not currently validate the pack rev-index files (a .rev file to
pair with a .pack and .idx file).

Before doing a more-involved check process, create the scaffolding
within builtin/fsck.c to have a new error type and add that error type
when the API method verify_pack_revindex() returns an error. That method
does nothing currently, but we will add checks to it in later changes.

For now, check that 'git fsck' succeeds without any errors in the normal
case. Future checks will be paired with tests that corrupt the .rev file
appropriately.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-17 14:39:04 -07:00
Jeff King 7ce4c8f752 v0 protocol: use size_t for capability length/offset
When parsing server capabilities, we use "int" to store lengths and
offsets. At first glance this seems like a spot where our parser may be
confused by integer overflow if somebody sent us a malicious response.

In practice these strings are all bounded by the 64k limit of a
pkt-line, so using "int" is OK. However, it makes the code simpler to
audit if they just use size_t everywhere. Note that because we take
these parameters as pointers, this also forces many callers to update
their declared types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:13 -07:00
Patrick Steinhardt d85cd18777 repack: disable writing bitmaps when doing a local repack
In order to write a bitmap, we need to have full coverage of all objects
that are about to be packed. In the traditional non-multi-pack-index
world this meant we need to do a full repack of all objects into a
single packfile. But in the new multi-pack-index world we can get away
with writing bitmaps when we have multiple packfiles as long as the
multi-pack-index covers all objects.

This is not always the case though. When asked to perform a repack of
local objects, only, then we cannot guarantee to have full coverage of
all objects regardless of whether we do a full repack or a repack with a
multi-pack-index. The end result is that writing the bitmap will fail in
both worlds:

    $ git multi-pack-index write --stdin-packs --bitmap <packfiles
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing)
    error: could not write multi-pack bitmap

Now there are two different ways to fix this. The first one would be to
amend git-multi-pack-index(1) to disable writing bitmaps when we notice
that we don't have full object coverage.

    - We don't have enough information in git-multi-pack-index(1) in
      order to tell whether the local repository _should_ have full
      coverage. Because even when connected to an alternate object
      directory, it may be the case that we still have all objects
      around in the main object database.

    - git-multi-pack-index(1) is quite a low-level tool. Automatically
      disabling functionality that it was asked to provide does not feel
      like the right thing to do.

We can easily fix it at a higher level in git-repack(1) though. When
asked to only include local objects via `-l` and when connected to an
alternate object directory then we will override the user's ask and
disable writing bitmaps with a warning. This is similar to what we do in
git-pack-objects(1), where we also disable writing bitmaps in case we
omit an object from the pack.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
Patrick Steinhardt 932c16c04b repack: honor -l when calculating pack geometry
When the user passes `-l` to git-repack(1), then they essentially ask us
to only repack objects part of the local object database while ignoring
any packfiles part of an alternate object database. And we in fact honor
this bit when doing a geometric repack as the resulting packfile will
only ever contain local objects.

What we're missing though is that we don't take locality of packfiles
into account when computing whether the geometric sequence is intact or
not. So even though we would only ever roll up local packfiles anyway,
we could end up trying to repack because of non-local packfiles. This
does not make much sense, and in the worst case it can cause us to try
and do the geometric repack over and over again because we're never able
to restore the geometric sequence.

Fix this bug by honoring whether the user has passed `-l`. If so, we
skip adding any non-local packfiles to the pack geometry.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:52 -07:00
Patrick Steinhardt 752b465c3c pack-objects: fix error when same packfile is included and excluded
When passing the same packfile both as included and excluded via the
`--stdin-packs` option, then we will return an error because the
excluded packfile cannot be found. This is because we will only set the
`util` pointer for the included packfile list if it was found, so that
we later die when we notice that it's in fact not set for the excluded
packfile list.

Fix this bug by always setting the `util` pointer for both the included
and excluded list entries.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
Patrick Steinhardt 732194b5f2 pack-objects: fix error when packing same pack twice
When passed the same packfile twice via `--stdin-packs` we return an
error that the packfile supposedly was not found. This is because when
reading packs into the list of included or excluded packfiles, we will
happily re-add packfiles even if they are part of the lists already. And
while the list can now contain duplicates, we will only set the `util`
pointer of the first list entry to the `packed_git` structure. We notice
that at a later point when checking that all list entries have their
`util` pointer set and die with an error.

While this is kind of a nonsensical request, this scenario can be hit
when doing geometric repacks. When a repository is connected to an
alternate object directory and both have the exact same packfile then
both would get added to the geometric sequence. And when we then decide
to perform the repack, we will invoke git-pack-objects(1) with the same
packfile twice.

Fix this bug by removing any duplicates from both the included and
excluded packs.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
Patrick Steinhardt 51861340f8 repack: fix generating multi-pack-index with only non-local packs
When writing the multi-pack-index with geometric repacking we will add
all packfiles to the index that are part of the geometric sequence. This
can potentially also include packfiles borrowed from an alternate object
directory. But given that a multi-pack-index can only ever include packs
that are part of the main object database this does not make much sense
whatsoever.

In the edge case where all packfiles are contained in the alternate
object database and the local repository has none itself this bug can
cause us to invoke git-multi-pack-index(1) with only non-local packfiles
that it ultimately cannot find. This causes it to return an error and
thus causes the geometric repack to fail.

Fix the code to skip non-local packfiles.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
Patrick Steinhardt 3d74a2337c repack: fix trying to use preferred pack in alternates
When doing a geometric repack with multi-pack-indices, then we ask
git-multi-pack-index(1) to use the largest packfile as the preferred
pack. It can happen though that the largest packfile is not part of the
main object database, but instead part of an alternate object database.
The result is that git-multi-pack-index(1) will not be able to find the
preferred pack and print a warning. It then falls back to use the first
packfile that the multi-pack-index shall reference.

Fix this bug by only considering packfiles as preferred pack that are
local. This is the right thing to do given that a multi-pack-index
should never reference packfiles borrowed from an alternate.

While at it, rename the function `get_largest_active_packfile()` to
`get_preferred_pack()` to better document its intent.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 10:27:51 -07:00
Øystein Walle aabfdc9514 branch, for-each-ref, tag: add option to omit empty lines
If the given format string expands to the empty string, a newline is
still printed. This makes using the output linewise more tedious. For
example, git update-ref --stdin does not accept empty lines.

Add options to "git branch", "git for-each-ref", and "git tag" to
not print these empty lines.  The default behavior remains the same.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 08:07:45 -07:00
Taylor Blau 9f7f10a282 t: invert GIT_TEST_WRITE_REV_INDEX
Back in e8c58f894b (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we
added a test knob to conditionally enable writing a ".rev" file when
indexing a pack. At the time, this was used to ensure that the test
suite worked even when ".rev" files were written, which served as a
stress-test for the on-disk reverse index implementation.

Now that reading from on-disk ".rev" files is enabled by default, the
test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning.

We could get rid of the option entirely, but there would be no
convenient way to test Git when ".rev" files *aren't* in place.

Instead of getting rid of the option, invert its meaning to instead
disable writing ".rev" files, thereby running the test suite in a mode
where the reverse index is generated from scratch.

This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some
spelling of "true", we are still running and exercising Git's behavior
when forced to generate reverse indexes from scratch. Do so by setting
it in the linux-TEST-vars CI run to ensure that we are maintaining good
coverage of this now-legacy code.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
Taylor Blau a8dd7e05b1 config: enable pack.writeReverseIndex by default
Back in e37d0b8730 (builtin/index-pack.c: write reverse indexes,
2021-01-25), Git learned how to read and write a pack's reverse index
from a file instead of in-memory.

A pack's reverse index is a mapping from pack position (that is, the
order that objects appear together in a ".pack")  to their position in
lexical order (that is, the order that objects are listed in an ".idx"
file).

Reverse indexes are consulted often during pack-objects, as well as
during auxiliary operations that require mapping between pack offsets,
pack order, and index index.

They are useful in GitHub's infrastructure, where we have seen a
dramatic increase in performance when writing ".rev" files[1]. In
particular:

  - an ~80% reduction in the time it takes to serve fetches on a popular
    repository, Homebrew/homebrew-core.

  - a ~60% reduction in the peak memory usage to serve fetches on that
    same repository.

  - a collective savings of ~35% in CPU time across all pack-objects
    invocations serving fetches across all repositories in a single
    datacenter.

Reverse indexes are also beneficial to end-users as well as forges. For
example, the time it takes to generate a pack containing the objects for
the 10 most recent commits in linux.git (representing a typical push) is
significantly faster when on-disk reverse indexes are available:

    $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null'
    Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     543.0 ms ±  20.3 ms    [User: 616.2 ms, System: 58.8 ms]
      Range (min … max):   521.0 ms … 577.9 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     245.0 ms ±  11.4 ms    [User: 335.6 ms, System: 31.3 ms]
      Range (min … max):   226.0 ms … 259.6 ms    13 runs

    Summary
      'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran
	2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null'

The same is true of writing a pack containing the objects for the 30
most-recent commits:

    $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null'
    Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     866.5 ms ±  16.2 ms    [User: 1414.5 ms, System: 97.0 ms]
      Range (min … max):   839.3 ms … 886.9 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null
      Time (mean ± σ):     581.6 ms ±  10.2 ms    [User: 1181.7 ms, System: 62.6 ms]
      Range (min … max):   567.5 ms … 599.3 ms    10 runs

    Summary
      'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran
	1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null'

...and savings on trivial operations like computing the on-disk size of
a single (packed) object are even more dramatic:

    $ git rev-parse HEAD >in
    $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in'
    Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):     305.8 ms ±  11.4 ms    [User: 264.2 ms, System: 41.4 ms]
      Range (min … max):   290.3 ms … 331.1 ms    10 runs

    Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in
      Time (mean ± σ):       4.0 ms ±   0.3 ms    [User: 1.7 ms, System: 2.3 ms]
      Range (min … max):     1.6 ms …   4.6 ms    1155 runs

    Summary
      'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran
       76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in'

In the more than two years since e37d0b8730 was merged, Git's
implementation of on-disk reverse indexes has been thoroughly tested,
both from users enabling `pack.writeReverseIndexes`, and from GitHub's
deployment of the feature. The latter has been running without incident
for more than two years.

This patch changes Git's behavior to write on-disk reverse indexes by
default when indexing a pack, which should make the above operations
faster for everybody's Git installation after a repack.

(The previous commit explains some potential drawbacks of using on-disk
reverse indexes in certain limited circumstances, that essentially boil
down to a trade-off between time to generate, and time to access. For
those limited cases, the `pack.readReverseIndex` escape hatch can be
used).

[1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-13 07:55:46 -07:00
Junio C Hamano 96f4113ac0 Merge branch 'jc/clone-object-format-from-void'
"git clone" from an empty repository learned to propagate the
choice of the hash algorithm from the source repository to the
newly created repository.

* jc/clone-object-format-from-void:
  clone: propagate object-format when cloning from void
2023-04-11 13:49:13 -07:00
Junio C Hamano d02343b599 Merge branch 'ws/sparse-check-rules'
"git sparse-checkout" command learns a debugging aid for the sparse
rule definitions.

* ws/sparse-check-rules:
  builtin/sparse-checkout: add check-rules command
  builtin/sparse-checkout: remove NEED_WORK_TREE flag
2023-04-11 13:49:12 -07:00
Elijah Newren 65156bb7ec treewide: remove double forward declaration of read_in_full
cache.h's nature of a dumping ground of includes prevented it from
being included in some compat/ files, forcing us into a workaround
of having a double forward declaration of the read_in_full() function
(see commit 14086b0a13 ("compat/pread.c: Add a forward declaration to
fix a warning", 2007-11-17)).  Now that we have moved functions like
read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered
with unrelated and scary #defines, get rid of the extra forward
declaration and just have compat/pread.c include wrapper.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
Elijah Newren ca4eed708d pager.h: move declarations for pager.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren 4e120823a3 editor: move editor-related functions and declarations into common file
cache.h and strbuf.[ch] had editor-related functions.  Move these into
editor.[ch].

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren 87bed17907 object-file.h: move declarations for object-file.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren d88dbaa718 git-zlib: move declarations for git-zlib functions from cache.h
Move functions from cache.h for zlib.c into a new header file.  Since
adding a "zlib.h" would cause issues with the real zlib, rename zlib.c
to git-zlib.c while we are at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 5bc07225e5 treewide: be explicit about dependence on mem-pool.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 6f2d743043 treewide: be explicit about dependence on oid-array.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 75f273d9b7 treewide: be explicit about dependence on pack-revindex.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 73359a9b43 treewide: be explicit about dependence on convert.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 6c6ddf92d5 treewide: be explicit about dependence on advice.h
Dozens of files made use of advice functions, without explicitly
including advice.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
advice.h if they are using it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
Glen Choo 4e33535ea9 clone: error specifically with --local and symlinked objects
6f054f9fb3 (builtin/clone.c: disallow --local clones with
symlinks, 2022-07-28) gives a good error message when "git clone
--local" fails when the repo to clone has symlinks in
"$GIT_DIR/objects". In bffc762f87 (dir-iterator: prevent top-level
symlinks without FOLLOW_SYMLINKS, 2023-01-24), we later extended this
restriction to the case where "$GIT_DIR/objects" is itself a symlink,
but we didn't update the error message then - bffc762f87's tests show
that we print a generic "failed to start iterator over" message.

This is exacerbated by the fact that Documentation/git-clone.txt
mentions neither restriction, so users are left wondering if this is
intentional behavior or not.

Fix this by adding a check to builtin/clone.c: when doing a local clone,
perform an extra check to see if "$GIT_DIR/objects" is a symlink, and if
so, assume that that was the reason for the failure and report the
relevant information. Ideally, dir_iterator_begin() would tell us that
the real failure reason is the presence of the symlink, but (as far as I
can tell) there isn't an appropriate errno value for that.

Also, update Documentation/git-clone.txt to reflect that this
restriction exists.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:46:09 -07:00
Phillip Wood 4a8bc9860a rebase -m: cleanup --strategy-option handling
When handling "--strategy-option" rebase collects the commands into a
struct string_list, then concatenates them into a string, prepending "--"
to each one before splitting the string and removing the "--" prefix.
This is an artifact of the scripted rebase and the need to support
"rebase --preserve-merges". Now that "--preserve-merges" no-longer
exists we can cleanup the way the argument is handled.

The tests for a bad strategy option are adjusted now that
parse_strategy_opts() is no-longer called when starting a rebase. The
fact that it only errors out when running "git rebase --continue" is a
mixed blessing but the next commit will fix the root cause of the
parsing problem so lets not worry about that here.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
Phillip Wood fb60b9f37f sequencer: use struct strvec to store merge strategy options
The sequencer stores the merge strategy options in an array of strings
which allocated with ALLOC_GROW(). Using "struct strvec" avoids manually
managing the memory of that array and simplifies the code.

Aside from memory allocation the changes to the sequencer are largely
mechanical, changing xopts_nr to xopts.nr and xopts[i] to xopts.v[i]. A
new option parsing macro OPT_STRVEC() is also added to collect the
strategy options.  Hopefully this can be used to simplify the code in
builtin/merge.c in the future.

Note that there is a change of behavior to "git cherry-pick" and "git
revert" as passing “--no-strategy-option” will now clear any previous
strategy options whereas before this change it did nothing.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
Phillip Wood 461434a013 rebase: stop reading and writing unnecessary strategy state
The state files for "--strategy" and "--strategy-option" are written and
read twice, once by builtin/rebase.c and then by sequencer.c. This is an
artifact of the scripted rebase and the need to support "rebase
--preserve-merges". Now that "--preserve-merges" no-longer exists we
only need to read and write these files in sequencer.c. This enables us
to remove a call to free() in read_strategy_opts() that was added by
f1f4ebf432 (sequencer.c: fix "opts->strategy" leak in
read_strategy_opts(), 2022-11-08) as this commit fixes the root cause of
that leak.

There is further scope for removing duplication in the reading and
writing of state files between builtin/rebase.c and sequencer.c but that
is left for a follow up series.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:53:19 -07:00
René Scharfe c870de6502 get-tar-commit-id: use TYPEFLAG_GLOBAL_HEADER instead of magic value
Use the same macro in the archive reader code as on the writer side in
archive-tar.c to document the connection.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-10 09:22:34 -07:00
Junio C Hamano 89833fc249 Merge branch 'ds/fetch-bundle-uri-with-all'
"git fetch --all" does not have to download and handle the same
bundleURI over and over, which has been corrected.

* ds/fetch-bundle-uri-with-all:
  fetch: download bundles once, even with --all
2023-04-06 13:38:32 -07:00
Junio C Hamano 6047b28eb7 Merge branch 'en/header-split-cleanup'
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 13:38:31 -07:00
Junio C Hamano 72871b198f Merge branch 'ab/remove-implicit-use-of-the-repository'
Code clean-up around the use of the_repository.

* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06 13:38:30 -07:00
Junio C Hamano 87daf40750 Merge branch 'ab/config-multi-and-nonbool'
Assorted config API updates.

* ab/config-multi-and-nonbool:
  for-each-repo: with bad config, don't conflate <path> and <cmd>
  config API: add "string" version of *_value_multi(), fix segfaults
  config API users: test for *_get_value_multi() segfaults
  for-each-repo: error on bad --config
  config API: have *_multi() return an "int" and take a "dest"
  versioncmp.c: refactor config reading next commit
  config API: add and use a "git_config_get()" family of functions
  config tests: add "NULL" tests for *_get_value_multi()
  config tests: cover blind spots in git_die_config() tests
2023-04-06 13:38:29 -07:00
Junio C Hamano e9dffbc7f1 Merge branch 'ps/fetch-ref-update-reporting'
Clean-up of the code path that reports what "git fetch" did to each
ref.

* ps/fetch-ref-update-reporting:
  fetch: centralize printing of reference updates
  fetch: centralize logic to print remote URL
  fetch: centralize handling of per-reference format
  fetch: pass the full local reference name to `format_display`
  fetch: move output format into `display_state`
  fetch: move reference width calculation into `display_state`
2023-04-06 13:38:28 -07:00
Junio C Hamano 955abf5f72 Merge branch 'jk/unused-post-2.40-part2'
Code clean-up for "-Wunused-parameter" build.

* jk/unused-post-2.40-part2:
  parse-options: drop parse_opt_unknown_cb()
  t/helper: mark unused argv/argc arguments
  mark "argv" as unused when we check argc
  builtins: mark unused prefix parameters
  builtins: annotate always-empty prefix parameters
  builtins: always pass prefix to parse_options()
  fast-import: fix file access when run from subdir
2023-04-06 13:38:28 -07:00
Junio C Hamano 7727da99df Merge branch 'ds/ahead-behind'
"git for-each-ref" learns '%(ahead-behind:<base>)' that computes the
distances from a single reference point in the history with bunch
of commits in bulk.

* ds/ahead-behind:
  commit-reach: add tips_reachable_from_bases()
  for-each-ref: add ahead-behind format atom
  commit-reach: implement ahead_behind() logic
  commit-graph: introduce `ensure_generations_valid()`
  commit-graph: return generation from memory
  commit-graph: simplify compute_generation_numbers()
  commit-graph: refactor compute_topological_levels()
  for-each-ref: explicitly test no matches
  for-each-ref: add --stdin option
2023-04-06 13:38:21 -07:00
Clement Mabileau 4c643fb321 branch: improve error log on branch not found by checking remotes refs
New git users may want to locally delete remote-tracking branches but
don't really understand how they are distinguished from branches by git.
Then one may naively try:
`git branch -d foo/bar` and get a correct error `branch foo/bar not
found` but hard to understand for a newbie, this patch aims to guide one
in such case.

when failing to delete a branch with `git branch -d <branch>` because
of branch not found, try to find a **remote refs** matching `<branch>`
and if so, add an hint:
`Did you forget --remote?` to the error message

Signed-off-by: Clement Mabileau <mabileau.clement@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 13:11:26 -07:00
Tao Klerks 42943b950e mergetool: new config guiDefault supports auto-toggling gui by DISPLAY
When no merge.tool or diff.tool is configured or manually selected, the
selection of a default tool is sensitive to the DISPLAY variable; in a
GUI session a gui-specific tool will be proposed if found, and
otherwise a terminal-based one. This "GUI-optimizing" behavior is
important because a GUI can make a huge difference to a user's ability
to understand and correctly complete a non-trivial conflicting merge.

Some time ago the merge.guitool and diff.guitool config options were
introduced to enable users to configure both a GUI tool, and a non-GUI
tool (with fallback if no GUI tool configured), in the same environment.

Unfortunately, the --gui argument introduced to support the selection of
the guitool is still explicit. When using configured tools, there is no
equivalent of the no-tool-configured "propose a GUI tool if we are in a GUI
environment" behavior.

As proposed in <xmqqmtb8jsej.fsf@gitster.g>, introduce new configuration
options, difftool.guiDefault and mergetool.guiDefault, supporting a special
value "auto" which causes the corresponding tool or guitool to be selected
depending on the presence of a non-empty DISPLAY value. Also support "true"
to say "default to the guitool (unless --no-gui is passed on the
commandline)", and "false" as the previous default behavior when these new
configuration options are not specified.

Signed-off-by: Tao Klerks <tao@klerks.biz>
Acked-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 21:03:29 -07:00
Junio C Hamano 8b214c2e9d clone: propagate object-format when cloning from void
A user could prepare an empty repository and set it to use SHA256 as
the object format.  The new repository created by "git clone" from
such a repository however would not record that it is expecting
objects in the same SHA256 format.  This works as expected if the
source repository is not empty.

Just like we started copying the name of the primary branch from the
remote repository even if it is unborn in 3d8314f8 (clone: propagate
empty remote HEAD even with other branches, 2022-07-07), lift the
code that records the object format out of the block executed only
when cloning from an instantiated repository, so that it works also
when cloning from an empty repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-05 14:17:00 -07:00
Junio C Hamano 5e4070e128 Merge branch 'jk/really-deprecate-pack-redundant'
"git pack-redundant" gave a warning when run, as the command has
outlived its usefulness long ago and is nominated for future
removal.  Now we escalate to give an error.

* jk/really-deprecate-pack-redundant:
  pack-redundant: escalate deprecation warning to an error
2023-04-04 14:28:29 -07:00
Junio C Hamano 9142fce9b0 Merge branch 'ah/rebase-merges-config'
Streamline --rebase-merges command line option handling and
introduce rebase.merges configuration variable.

* ah/rebase-merges-config:
  rebase: add a config option for --rebase-merges
  rebase: deprecate --rebase-merges=""
  rebase: add documentation and test for --no-rebase-merges
2023-04-04 14:28:28 -07:00
Junio C Hamano 7e13d654c2 Merge branch 'jk/fast-export-cleanup'
Code clean-up.

* jk/fast-export-cleanup:
  fast-export: drop unused parameter from anonymize_commit_message()
  fast-export: drop data parameter from anonymous generators
  fast-export: de-obfuscate --anonymize-map handling
  fast-export: factor out anonymized_entry creation
  fast-export: simplify initialization of anonymized hashmaps
  fast-export: drop const when storing anonymized values
2023-04-04 14:28:27 -07:00
Shuqi Liang 1a65b41b38 write-tree: integrate with sparse index
Update 'git write-tree' to allow using the sparse-index in memory
without expanding to a full one.

The recursive algorithm for update_one() was already updated in 2de37c5
(cache-tree: integrate with sparse directory entries, 2021-03-03) to
handle sparse directory entries in the index. Hence we can just set the
requires-full-index to false for "write-tree".

The `p2000` tests demonstrate a ~96% execution time reduction for 'git
write-tree' using a sparse index:

Test                                           before  after
-----------------------------------------------------------------
2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-04 12:50:54 -07:00
Junio C Hamano e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Raghul Nanth A 748b8d669a describe: enable sparse index for describe
git describe compares the index with the working tree when (and only
when) it is run with the "--dirty" flag. This is done by the
run_diff_index() function. The function has been made aware of the
sparse-index in the series that led to 8d2c3732 (Merge branch
'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the
requires-full-index to false for "describe".

Performance metrics

  Test                                                     HEAD~1            HEAD
  -------------------------------------------------------------------------------------------------
  2000.2: git describe --dirty (full-v3)                   0.08(0.09+0.01)   0.08(0.06+0.03) +0.0%
  2000.3: git describe --dirty (full-v4)                   0.09(0.07+0.03)   0.08(0.05+0.04) -11.1%
  2000.4: git describe --dirty (sparse-v3)                 0.88(0.82+0.06)   0.02(0.01+0.05) -97.7%
  2000.5: git describe --dirty (sparse-v4)                 0.68(0.60+0.08)   0.02(0.02+0.04) -97.1%
  2000.6: echo >>new && git describe --dirty (full-v3)     0.08(0.04+0.05)   0.08(0.05+0.04) +0.0%
  2000.7: echo >>new && git describe --dirty (full-v4)     0.08(0.07+0.03)   0.08(0.05+0.04) +0.0%
  2000.8: echo >>new && git describe --dirty (sparse-v3)   0.75(0.69+0.07)   0.02(0.03+0.03) -97.3%
  2000.9: echo >>new && git describe --dirty (sparse-v4)   0.81(0.73+0.09)   0.02(0.01+0.05) -97.5%

Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03 11:30:23 -07:00
Derrick Stolee 25bccb4b79 fetch: download bundles once, even with --all
When fetch.bundleURI is set, 'git fetch' downloads bundles from the
given bundle URI before fetching from the specified remote. However,
when using non-file remotes, 'git fetch --all' will launch 'git fetch'
subprocesses which then read fetch.bundleURI and fetch the bundle list
again. We do not expect the bundle list to have new information during
these multiple runs, so avoid these extra calls by un-setting
fetch.bundleURI in the subprocess arguments.

Be careful to skip fetching bundles for the empty bundle string.
Fetching bundles from the empty list presents some interesting test
failures.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-31 10:07:33 -07:00
Junio C Hamano dbb4102f7b Merge branch 'sg/parse-options-h-users'
Code clean-up to include and/or uninclude parse-options.h file as
needed.

* sg/parse-options-h-users:
  treewide: remove unnecessary inclusions of parse-options.h from headers
  treewide: include parse-options.h in source files
2023-03-30 13:47:11 -07:00
Jeff King 6ba21fa65c mark "argv" as unused when we check argc
A few commands don't take any options at all, and confirm this by
checking argc. After that they have no need to look at argv, but we're
still stuck with it by convention. Let's annotate these cases so that
the compiler doesn't complain with -Wunused-parameter.

Note that in scalar and get-tar-commit-id, we're forced to keep argv by
calling convention (the functions must match cmd_main() and builtin
cmd_foo() conventions, respectively). In diff, these are subcommand
modes that we call individually, so we _could_ just drop the argv
parameters entirely. But it's weird to pass argc without argv, and it
implies that the caller knows that the subcommands aren't interested in
further arguments. It's less confusing to just keep them and silence the
compiler warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
Jeff King 5247b762d0 builtins: mark unused prefix parameters
All builtins receive a "prefix" parameter, but it is only useful if they
need to adjust filenames given by the user on the command line. For
builtins that do not even call parse_options(), they often don't look at
the prefix at all, and -Wunused-parameter complains.

Let's annotate those to silence the compiler warning. I gave a quick
scan of each of these cases, and it seems like they don't have anything
they _should_ be using the prefix for (i.e., there is no hidden bug that
we are missing). The only questionable cases I saw were:

  - in git-unpack-file, we create a tempfile which will always be at the
    root of the repository, even if the command is run from a subdir.
    Arguably this should be created in the subdir from which we're run
    (as we report the path only as a relative name). However, nobody has
    complained, and I'm hesitant to change something that is deep
    plumbing going back to April 2005 (though I think within our
    scripts, the sole caller in git-merge-one-file would be OK, as it
    moves to the toplevel itself).

  - in fetch-pack, local-filesystem remotes are taken as relative to the
    project root, not the current directory. So:

       git init server.git
       [...put stuff in server.git...]
       git init client.git
       cd client.git
       mkdir subdir
       cd subdir
       git fetch-pack ../../server.git ...

    won't work, as we quietly move to the top of the repository before
    interpreting the path (so "../server.git" would work). This is
    weird, but again, nobody has complained and this is how it has
    always worked. And this is how "git fetch" works, too. Plus it
    raises questions about how a configured remote like:

      git config remote.origin.url ../server.git

    should behave. I can certainly come up with a reasonable set of
    behavior, but it may not be worth stirring up complications in a
    plumbing tool.

So I've left the behavior untouched in both of those cases. If anybody
really wants to revisit them, it's easy enough to drop the UNUSED
marker. This commit is just about removing them as obstacles to turning
on -Wunused-parameter all the time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00
Jeff King 7915691377 builtins: annotate always-empty prefix parameters
It's usually a bad idea for a builtin's cmd_foo() to ignore the "prefix"
argument it gets, as it needs to prepend that string when accessing any
paths given by the user.

But if a builtin does not ask for the git wrapper to run repository
setup (via the RUN_SETUP or RUN_SETUP_GENTLY flags), then we know the
prefix will always be NULL (it is adjusting for the chdir() done during
repo setup, but there cannot be one if we did not set up the repo). In
those cases it's OK to ignore "prefix", but it's worth annotating for a
few reasons:

  1. It serves as documentation to somebody reading the code about what
     we expect.

  2. If the flags in git.c ever change, the run-time assertion may help
     detect the problem (though only if the command is run from a
     subdirectory of the repository).

  3. It notes to the compiler that we are OK ignoring "prefix". In
     particular, this silences -Wunused-parameter. It _could_ also help
     the compiler generate better code (because it will know the prefix
     is NULL), but in practice this is quite unlikely to matter.

Note that I've only added this annotation to commands which triggered
-Wunused-parameter. It would be correct to add it to any builtin which
doesn't ask for RUN_SETUP, but most of the rest of them do the sensible
thing with "prefix" by passing it to parse_options(). So they're much
more likely to just work if they ever switched to RUN_SETUP, and aren't
worth annotating.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 14:11:24 -07:00