Commit graph

10513 commits

Author SHA1 Message Date
SZEDER Gábor fa83cc834d parse-options: add support for parsing subcommands
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.

Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.

The approach is guided by the following observations:

  - Most subcommands [1] are implemented in dedicated functions, and
    most of those functions [2] either have a signature matching the
    'int cmd_foo(int argc, const char **argc, const char *prefix)'
    signature of builtin commands or can be trivially converted to
    that signature, because they miss only that last prefix parameter
    or have no parameters at all.

  - Subcommand arguments only have long form, and they have no double
    dash prefix, no negated form, and no description, and they don't
    take any arguments, and can't be abbreviated.

  - There must be exactly one subcommand among the arguments, or zero
    if the command has a default operation mode.

  - All arguments following the subcommand are considered to be
    arguments of the subcommand, and, conversely, arguments meant for
    the subcommand may not preceed the subcommand.

So in the end subcommand declaration and parsing would look something
like this:

    parse_opt_subcommand_fn *fn = NULL;
    struct option builtin_commit_graph_options[] = {
        OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"),
                   N_("the object directory to store the graph")),
        OPT_SUBCOMMAND("verify", &fn, graph_verify),
        OPT_SUBCOMMAND("write", &fn, graph_write),
        OPT_END(),
    };
    argc = parse_options(argc, argv, prefix, options,
                         builtin_commit_graph_usage, 0);
    return fn(argc, argv, prefix);

Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer.  parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed.  If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.

If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag.  In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged.  Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).

Some thoughts about the implementation:

  - The same pointer to 'fn' must be specified as 'value' for each
    OPT_SUBCOMMAND, because there can be only one set of mutually
    exclusive subcommands; parse_options() will BUG() otherwise.

    There are other ways to tell parse_options() where to put the
    function associated with the subcommand given on the command line,
    but I didn't like them:

      - Change parse_options()'s signature by adding a pointer to
        subcommand function to be set to the function associated with
        the given subcommand, affecting all callsites, even those that
        don't have subcommands.

      - Introduce a specific parse_options_and_subcommand() variant
        with that extra funcion parameter.

  - I decided against automatically calling the subcommand function
    from within parse_options(), because:

      - There are commands that have to perform additional actions
        after option parsing but before calling the function
        implementing the specified subcommand.

      - The return code of the subcommand is usually the return code
        of the git command, but preserving the return code of the
        automatically called subcommand function would have made the
        API awkward.

  - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
    option flag: we have two subcommands that are purposefully
    excluded from completion ('git remote rm' and 'git stash save'),
    so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
    flag.

  - Some of the 'parse_opt_flags' don't make sense with subcommands,
    and using them is probably just an oversight or misunderstanding.
    Therefore parse_options() will BUG() when invoked with any of the
    following flags while the options array contains at least one
    OPT_SUBCOMMAND:

      - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
        arguments when encountering a "--" argument, so it doesn't
        make sense to expect and keep one before a subcommand, because
        it would prevent the parsing of the subcommand.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
        might be meaningful for the command's default operation mode,
        e.g. to disambiguate refs and pathspecs.

      - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
        tells parse_options() to stop as soon as it encouners a
        non-option argument, but subcommands are by definition not
        options...  so how could they be parsed, then?!

      - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
        unknown --options and then pass them to a different command or
        subsystem.  Surely if a command has subcommands, then this
        functionality should rather be delegated to one of those
        subcommands, and not performed by the command itself.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
        --options to the default operation mode.

  - If the command with subcommands has a default operation mode, then
    all arguments to the command must preceed the arguments of the
    subcommand.

    AFAICT we don't have any commands where this makes a difference,
    because in those commands either only the command accepts any
    arguments ('notes' and 'remote'), or only the default subcommand
    ('reflog' and 'stash'), but never both.

  - The 'argv' array passed to subcommand functions currently starts
    with the name of the subcommand.  Keep this behavior.  AFAICT no
    subcommand functions depend on the actual content of 'argv[0]',
    but the parse_options() call handling their options expects that
    the options start at argv[1].

  - To support handling subcommands programmatically in our Bash
    completion script, 'git cmd --git-completion-helper' will now list
    both subcommands and regular --options, if any.  This means that
    the completion script will have to separate subcommands (i.e.
    words without a double dash prefix) from --options on its own, but
    that's rather easy to do, and it's not much work either, because
    the number of subcommands a command might have is rather low, and
    those commands accept only a single --option or none at all.  An
    alternative would be to introduce a separate option that lists
    only subcommands, but then the completion script would need not
    one but two git invocations and command substitutions for commands
    with subcommands.

    Note that this change doesn't affect the behavior of our Bash
    completion script, because when completing the --option of a
    command with subcommands, e.g. for 'git notes --<TAB>', then all
    subcommands will be filtered out anyway, as none of them will
    match the word to be completed starting with that double dash
    prefix.

[1] Except 'git rerere', because many of its subcommands are
    implemented in the bodies of the if-else if statements parsing the
    command's subcommand argument.

[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
    because some of the functions implementing their subcommands take
    special parameters.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
SZEDER Gábor 99d86d60e5 parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown
arguments instead of erroring out".  This is a bit misleading, as this
flag only applies to unknown --options, while non-option arguments are
kept even without this flag.

Update the description to clarify this, and rename the flag to
PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at
the flag name.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
Junio C Hamano fddd8b4801 Merge branch 'll/disk-usage-humanise'
"git rev-list --disk-usage" learned to take an optional value
"human" to show the reported value in human-readable format, like
"3.40MiB".

* ll/disk-usage-humanise:
  rev-list: support human-readable output for `--disk-usage`
2022-08-18 13:07:05 -07:00
Junio C Hamano 9b9445cfde Merge branch 'sy/sparse-rm'
"git rm" has become more aware of the sparse-index feature.

* sy/sparse-rm:
  rm: integrate with sparse-index
  rm: expand the index only when necessary
  pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
  t1092: add tests for `git-rm`
2022-08-18 13:07:05 -07:00
Junio C Hamano 80ffc849bd Merge branch 'vd/sparse-reset-checkout-fixes'
Fixes to sparse index compatibility work for "reset" and "checkout"
commands.

* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-18 13:07:04 -07:00
Junio C Hamano c0f6dd49f1 Merge branch 'ab/tech-docs-to-help'
Expose a lot of "tech docs" via "git help" interface.

* ab/tech-docs-to-help:
  docs: move http-protocol docs to man section 5
  docs: move cruft pack docs to gitformat-pack
  docs: move pack format docs to man section 5
  docs: move signature docs to man section 5
  docs: move index format docs to man section 5
  docs: move protocol-related docs to man section 5
  docs: move commit-graph format docs to man section 5
  git docs: add a category for file formats, protocols and interfaces
  git docs: add a category for user-facing file, repo and command UX
  git help doc: use "<doc>" instead of "<guide>"
  help.c: remove common category behavior from drop_prefix() behavior
  help.c: refactor drop_prefix() to use a "switch" statement"
2022-08-14 23:19:28 -07:00
Victoria Dye aac0e8ffee builtin/bugreport.c: create '--diagnose' option
Create a '--diagnose' option for 'git bugreport' to collect additional
information about the repository and write it to a zipped archive.

The '--diagnose' option behaves effectively as an alias for simultaneously
running 'git bugreport' and 'git diagnose'. In the documentation, users are
explicitly recommended to attach the diagnostics alongside a bug report to
provide additional context to readers, ideally reducing some back-and-forth
between reporters and those debugging the issue.

Note that '--diagnose' may take an optional string arg (either 'stats' or
'all'). If specified without the arg, the behavior corresponds to running
'git diagnose' without '--mode'. As with 'git diagnose', this default is
intended to help reduce unintentional leaking of sensitive information).
Users can also explicitly specify '--diagnose=(stats|all)' to generate the
respective archive created by 'git diagnose --mode=(stats|all)'.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Victoria Dye 7ecf193f7d builtin/diagnose.c: add '--mode' option
Create '--mode=<mode>' option in 'git diagnose' to allow users to optionally
select non-default diagnostic information to include in the output archive.
Additionally, document the currently-available modes, emphasizing the
importance of not sharing a '--mode=all' archive publicly due to the
presence of sensitive information.

Note that the option parsing callback - 'option_parse_diagnose()' - is added
to 'diagnose.c' rather than 'builtin/diagnose.c' so that it may be reused in
future callers configuring a diagnostics archive.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Victoria Dye 6783fd3cef builtin/diagnose.c: create 'git diagnose' builtin
Create a 'git diagnose' builtin to generate a standalone zip archive of
repository diagnostics.

The "diagnose" functionality was originally implemented for Scalar in
aa5c79a331 (scalar: implement `scalar diagnose`, 2022-05-28). However, the
diagnostics gathered are not specific to Scalar-cloned repositories and
can be useful when diagnosing issues in any Git repository.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Junio C Hamano 83489a5b20 Merge branch 'ab/plug-revisions-leak'
Plug a bit more leaks in the revisions API.

* ab/plug-revisions-leak:
  revisions API: don't leak memory on argv elements that need free()-ing
  bisect.c: partially fix bisect_rev_setup() memory leak
  log: refactor "rev.pending" code in cmd_show()
  log: fix a memory leak in "git show <revision>..."
  test-fast-rebase helper: use release_revisions() (again)
  bisect.c: add missing "goto" for release_revisions()
2022-08-12 13:19:08 -07:00
Junio C Hamano 8faaf690f7 Merge branch 'lt/symbolic-ref-sanity'
"git symbolic-ref symref non..sen..se" is now diagnosed as an error.

* lt/symbolic-ref-sanity:
  symbolic-ref: refuse to set syntactically invalid target
2022-08-12 13:19:08 -07:00
Li Linchao 9096451acd rev-list: support human-readable output for --disk-usage
The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
This is very useful for people inspect their git repo's objects usage
infomation, but the resulting number is quit hard for a human to read.

Teach git rev-list to output a human readable result when using
'--disk-usage'.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 13:45:23 -07:00
Junio C Hamano b0fd38a515 Merge branch 'jc/string-list-cleanup' into maint
Code clean-up.
source: <xmqq7d471dns.fsf@gitster.g>

* jc/string-list-cleanup:
  builtin/remote.c: use the right kind of STRING_LIST_INIT
2022-08-10 21:52:36 -07:00
Junio C Hamano 340a6120e5 Merge branch 'mt/checkout-count-fix' into maint
"git checkout" miscounted the paths it updated, which has been
corrected.
source: <cover.1657799213.git.matheus.bernardino@usp.br>

* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-08-10 21:52:34 -07:00
Junio C Hamano a6aeb2fef9 Merge branch 'jc/resolve-undo' into maint
The resolve-undo information in the index was not protected against
GC, which has been corrected.
source: <xmqq35f7kzad.fsf@gitster.g>

* jc/resolve-undo:
  fsck: do not dereference NULL while checking resolve-undo data
  revision: mark blobs needed for resolve-undo as reachable
2022-08-10 21:52:32 -07:00
Derrick Stolee e21e663cd1 clone: --bundle-uri cannot be combined with --depth
A previous change added the '--bundle-uri' option, but did not check
if the --depth parameter was included. Since bundles are not compatible
with shallow clones, provide an error message to the user who is
attempting this combination.

I am leaving this as its own change, separate from the one that
implements '--bundle-uri', because this is more of an advisory for the
user. There is nothing wrong with bootstrapping with bundles and then
fetching a shallow clone. However, that is likely going to involve too
much work for the client _and_ the server. The client will download all
of this bundle information containing the full history of the
repository only to ignore most of it. The server will get a shallow
fetch request, but with a list of haves that might cause a more painful
computation of that shallow pack-file.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee 5556891961 clone: add --bundle-uri option
Cloning a remote repository is one of the most expensive operations in
Git. The server can spend a lot of CPU time generating a pack-file for
the client's request. The amount of data can clog the network for a long
time, and the Git protocol is not resumable. For users with poor network
connections or are located far away from the origin server, this can be
especially painful.

Add a new '--bundle-uri' option to 'git clone' to bootstrap a clone from
a bundle. If the user is aware of a bundle server, then they can tell
Git to bootstrap the new repository with these bundles before fetching
the remaining objects from the origin server.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Shaoxuan Yuan ede241c715 rm: integrate with sparse-index
Enable the sparse index within the `git-rm` command.

The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.

Test                              HEAD~1            HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3)     0.41(0.37+0.05)   0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4)     0.38(0.34+0.05)   0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3)   0.57(0.56+0.01)   0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4)   0.57(0.55+0.02)   0.03(0.03+0.00) -94.7%

----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].

`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.

For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:

        rm 'folder1/'

Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:

        rm 'folder1/0/0/0'
        rm 'folder1/0/1'
        rm 'folder1/a'

Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".

Modify a previous test so such difference is not considered as an error.

[1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan bcf96cfca6 rm: expand the index only when necessary
Remove the `ensure_full_index()` method so `git-rm` does not always
expand the index when the expansion is unnecessary, i.e. when
<pathspec> does not have any possibilities to match anything outside
of sparse-checkout definition.

Expand the index when the <pathspec> needs an expanded index, i.e. the
<pathspec> contains wildcard that may need a full-index or the
<pathspec> is simply outside of sparse-checkout definition.

Notice that the test 'rm pathspec expands index when necessary' in
t1092 *is* testing this code change behavior, though it will be marked
as 'test_expect_success' only in the next patch, where we officially
mark `command_requires_full_index = 0`, so the index does not expand
unless we tell it to do so.

Notice that because we also want `ensure_full_index` to record the
stdout and stderr from Git command, a corresponding modification
is also included in this patch. The reason we want the "sparse-index-out"
and "sparse-index-err", is that we need to make sure there is no error
from Git command itself, so we can rely on the `test_region` result
and determine if the index is expanded or not.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan b29ad38322 pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
Method pathspec_needs_expanded_index() in reset.c from 4d1cfc1351
(reset: make --mixed sparse-aware, 2021-11-29) is reusable when we
need to verify if the index needs to be expanded when the command
is utilizing a pathspec rather than a literal path.

Move it to pathspec.h for reusability.

Add a few items to the function so it can better serve its purpose as
a standalone public function:

* Add a check in front so if the index is not sparse, return early since
  no expansion is needed.

* It now takes an arbitrary 'struct index_state' pointer instead of
  using `the_index` and `active_cache`.

* Add documentation to the function.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Junio C Hamano 3f61790678 Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm
* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-08 13:23:06 -07:00
Victoria Dye 49ff3cb90f checkout: fix nested sparse directory diff in sparse index
Add the 'recursive' diff flag to the local changes reporting done by 'git
checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded
sparse directories will not be recursed into to report the diff of each
file's contents, resulting in the reported local changes including
"modified" sparse directories.

The same issue was found and fixed for 'git status' in 2c521b0e49 (status:
fix nested sparse directory diff in sparse index, 2022-03-01)

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:49 -07:00
Junio C Hamano 1e92768aa1 Merge branch 'tb/cat-file-z'
Operating modes like "--batch" of "git cat-file" command learned to
take NUL-terminated input, instead of one-item-per-line.

* tb/cat-file-z:
  builtin/cat-file.c: support NUL-delimited input with `-z`
  t1006: extract --batch-command inputs to variables
2022-08-05 15:52:14 -07:00
Junio C Hamano de28459136 Merge branch 'jk/clone-unborn-confusion' into maint
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.
source: <YsdyLS4UFzj0j/wB@coredump.intra.peff.net>

* jk/clone-unborn-confusion:
  clone: move unborn head creation to update_head()
  clone: use remote branch if it matches default HEAD
  clone: propagate empty remote HEAD even with other branches
  clone: drop extra newline from warning message
2022-08-05 15:51:35 -07:00
Derrick Stolee 992f25d713 fetch: use ref_namespaces during prefetch
The "refs/prefetch/" namespace is used by 'git fetch --prefetch' as a
replacement of the destination of the refpsec for a remote. Git also
removes refspecs that include tags.

Instead of using string literals for the 'refs/tags/ and
'refs/prefetch/' namespaces, use the entries in the ref_namespaces
array.

This kind of change could be done in many places around the codebase,
but we are isolating only to this change because of the way the
refs/prefetch/ namespace somewhat motivated the creation of the
ref_namespaces array.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
Derrick Stolee 863a8ae97b maintenance: stop writing log.excludeDecoration
This reverts commit 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19).

The previous change created a default decoration filter that does not
include refs/prefetch/, so this modification of the config is no longer
needed.

One issue that can happen from this point on is that users who ran the
prefetch task on previous versions of Git will still have a
log.excludeDecoration value and that will prevent the new default
decoration filter from being active. Thus, when we add the refs/bundle/
namespace as part of the bundle URI feature, those users will see
refs/bundle/ decorations.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
Derrick Stolee 3e103ed23f log: create log.initialDecorationSet=all
The previous change introduced the --clear-decorations option for users
who do not want their decorations limited to a narrow set of ref
namespaces.

Add a config option that is equivalent to specifying --clear-decorations
by default.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee 748706d713 log: add --clear-decorations option
The previous changes introduced a new default ref filter for decorations
in the 'git log' command. This can be overridden using
--decorate-refs=HEAD and --decorate-refs=refs/, but that is cumbersome
for users.

Instead, add a --clear-decorations option that resets all previous
filters to a blank filter that accepts all refs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee 92156291ca log: add default decoration filter
When a user runs 'git log', they expect a certain set of helpful
decorations. This includes:

* The HEAD ref
* Branches (refs/heads/)
* Stashes (refs/stash)
* Tags (refs/tags/)
* Remote branches (refs/remotes/)
* Replace refs (refs/replace/ or $GIT_REPLACE_REF_BASE)

Each of these namespaces was selected due to existing test cases that
verify these namespaces appear in the decorations. In particular,
stashes and replace refs can have custom colors from the
color.decorate.<slot> config option.

While one test checks for a decoration from notes, it only applies to
the tip of refs/notes/commit (or its configured ref name). Notes form
their own kind of decoration instead. Modify the expected output for the
tests in t4013 that expect this note decoration.  There are several
tests throughout the codebase that verify that --decorate-refs,
--decorate-refs-exclude, and log.excludeDecoration work as designed and
the tests continue to pass without intervention.

However, there are other refs that are less helpful to show as
decoration:

* Prefetch refs (refs/prefetch/)
* Rebase refs (refs/rebase-merge/ and refs/rebase-apply/)
* Bundle refs (refs/bundle/) [!]

[!] The bundle refs are part of a parallel series that bootstraps a repo
    from a bundle file, storing the bundle's refs into the repo's
    refs/bundle/ namespace.

In the case of prefetch refs, 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19) added logic to add
refs/prefetch/ to the log.excludeDecoration config option. Additional
feedback pointed out that having such a side-effect can be confusing and
perhaps not helpful to users. Instead, we should hide these ref
namespaces that are being used by Git for internal reasons but are not
helpful for the users to see.

The way to provide a seamless user experience without setting the config
is to modify the default decoration filters to match our expectation of
what refs the user actually wants to see.

In builtin/log.c, after parsing the --decorate-refs and
--decorate-refs-exclude options from the command-line, call
set_default_decoration_filter(). This method populates the exclusions
from log.excludeDecoration, then checks if the list of pattern
modifications are empty. If none are specified, then the default set is
restricted to the set of inclusions mentioned earlier (HEAD, branches,
etc.).  A previous change introduced the ref_namespaces array, which
includes all of these currently-used namespaces. The 'decoration' value
is non-zero when that namespace is associated with a special coloring
and fits into the list of "expected" decorations as described above,
which makes the implementation of this filter very simple.

Note that the logic in ref_filter_match() in log-tree.c follows this
matching pattern:

 1. If there are exclusion patterns and the ref matches one, then ignore
    the decoration.

 2. If there are inclusion patterns and the ref matches one, then
    definitely include the decoration.

 3. If there are config-based exclusions from log.excludeDecoration and
    the ref matches one, then ignore the decoration.

With this logic in mind, we need to ensure that we do not populate our
new defaults if any of these filters are manually set. Specifically, if
a user runs

	git -c log.excludeDecoration=HEAD log

then we expect the HEAD decoration to not appear. If we left the default
inclusions in the set, then HEAD would match that inclusion before
reaching the config-based exclusions.

A potential alternative would be to check the list of default inclusions
at the end, after the config-based exclusions. This would still create a
behavior change for some uses of --decorate-refs-exclude=<X>, and could
be overwritten somewhat with --decorate-refs=refs/ and
--decorate-refs=HEAD. However, it no longer becomes possible to include
refs outside of the defaults while also excluding some using
log.excludeDecoration.

Another alternative would be to exclude the known namespaces that are
not intended to be shown. This would reduce the visible effect of the
change for expert users who use their own custom ref namespaces. The
implementation change would be very simple to swap due to our use of
ref_namespaces:

	int i;
	struct string_list *exclude = decoration_filter->exclude_ref_pattern;

	/*
	 * No command-line or config options were given, so
	 * populate with sensible defaults.
	 */
	for (i = 0; i < NAMESPACE__COUNT; i++) {
		if (ref_namespaces[i].decoration)
			continue;

		string_list_append(exclude, ref_namespaces[i].ref);
	}

The main downside of this approach is that we expect to add new hidden
namespaces in the future, and that means that Git versions will be less
stable in how they behave as those namespaces are added.

It is critical that we provide ways for expert users to disable this
behavior change via command-line options and config keys. These changes
will be implemented in a future change.

Add a test that checks that the defaults are not added when
--decorate-refs is specified. We verify this by showing that HEAD is not
included as it normally would.  Also add a test that shows that the
default filter avoids the unwanted decorations from refs/prefetch,
refs/rebase-merge,
and refs/bundle.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee 97e61e0f9c refs: use ref_namespaces for replace refs base
The git_replace_ref_base global is used to store the value of the
GIT_REPLACE_REF_BASE environment variable or the default of
"refs/replace/". This is initialized within setup_git_env().

The ref_namespaces array is a new centralized location for information
such as the ref namespace used for replace refs. Instead of having this
namespace stored in two places, use the ref_namespaces array instead.

For simplicity, create a local git_replace_ref_base variable wherever
the global was previously used.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Ævar Arnfjörð Bjarmason 844739ba27 git docs: add a category for file formats, protocols and interfaces
Create a new "File formats, protocols and other developer interfaces"
section in the main "git help git" manual page and start moving the
documentation that now lives in "Documentation/technical/*.git" over
to it. This complements the newly added and adjacent "Repository,
command and file interfaces" section.

This makes the technical documentation more accessible and
discoverable. Before this we wouldn't install it by default, and had
no ability to build man page versions of them. The links to them from
our existing documentation link to the generated HTML version of these
docs.

So let's start moving those over, starting with just the
"bundle-format.txt" documentation added in 7378ec90e1 (doc: describe
Git bundle format, 2020-02-07). We'll now have a new
gitformat-bundle(5) man page. Subsequent commits will move more git
internal format documentation over.

Unfortunately the syntax of the current Documentation/technical/*.txt
is not the same (when it comes to section headings etc.) as our
Documentation/*.txt documentation, so change the relevant bits of
syntax as we're moving this over.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason d976c5100f git docs: add a category for user-facing file, repo and command UX
Create a new "Repository, command and file interfaces" section in the
main "git help git" manual page. Move things that belong under this
new criteria from the generic "Guides" section.

The "Guides" section was added in f442f28a81 (git.txt: add list of
guides, 2020-08-05). It makes sense to have e.g. "giteveryday(7)" and
"gitfaq(7)" listed under "Guides".

But placing e.g. "gitignore(5)" in it is stretching the meaning of
what a "guide" is, ideally that section should list things similar to
"giteveryday(7)" and "gitcore-tutorial(7)".

An alternate name that was considered for this new section was "User
formats", for consistency with the nomenclature used for man section 5
in general. My man(1) lists it as "File formats and conventions,
e.g. /etc/passwd".

So calling this "git help --formats" or "git help --user-formats"
would make sense for e.g. gitignore(5), but would be stretching it
somewhat for githooks(5), and would seem really suspect for the likes
of gitcli(7).

Let's instead pick a name that's closer to the generic term "User
interface", which is really what this documentation discusses: General
user-interface documentation that doesn't obviously belong elsewhere.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason dba1e5392f git help doc: use "<doc>" instead of "<guide>"
Replace the use of "<guide>" originally introduced (as "GUIDE") in
a133737b80 (doc: include --guide option description for "git help",
2013-04-02) with the more generic "<doc>". The "<doc>" placeholder is
more generic, and one we'll be able to use as we introduce new
documentation categories.

Let's also add "<doc>" to the "git help -h" output, when it was made
to use parse_option() in in 41eb33bd0c (help: use parseopt,
2008-02-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Junio C Hamano 30c6495e1e Merge branch 'jc/string-list-cleanup'
Code clean-up.

* jc/string-list-cleanup:
  builtin/remote.c: use the right kind of STRING_LIST_INIT
2022-08-03 13:36:09 -07:00
Junio C Hamano 966ff64a30 Merge branch 'en/merge-restore-to-pristine'
When "git merge" finds that it cannot perform a merge, it should
restore the working tree to the state before the command was
initiated, but in some corner cases it didn't.

* en/merge-restore-to-pristine:
  merge: do not exit restore_state() prematurely
  merge: ensure we can actually restore pre-merge state
  merge: make restore_state() restore staged state too
  merge: fix save_state() to work when there are stat-dirty files
  merge: do not abort early if one strategy fails to handle the merge
  merge: abort if index does not match HEAD for trivial merges
  merge-resolve: abort if index does not match HEAD
  merge-ort-wrappers: make printed message match the one from recursive
2022-08-03 13:36:09 -07:00
Junio C Hamano 87098a047b Merge branch 'sa/cat-file-mailmap'
"git cat-file" learned an option to use the mailmap when showing
commit and tag objects.

* sa/cat-file-mailmap:
  cat-file: add mailmap support
  ident: rename commit_rewrite_person() to apply_mailmap_to_header()
  ident: move commit_rewrite_person() to ident.c
  revision: improve commit_rewrite_person()
2022-08-03 13:36:08 -07:00
Junio C Hamano 8e56affcb5 Merge branch 'zh/ls-files-format'
"git ls-files" learns the "--format" option to tweak its output.

* zh/ls-files-format:
  ls-files: introduce "--format" option
2022-08-03 13:36:08 -07:00
Ævar Arnfjörð Bjarmason f92dbdbc6a revisions API: don't leak memory on argv elements that need free()-ing
Add a "free_removed_argv_elements" member to "struct
setup_revision_opt", and use it to fix several memory leaks.

We have various memory leaks in APIs that take and munge "const
char **argv", e.g. parse_options(). Sometimes these APIs are given the
"argv" we get to the "main" function, in which case we don't leak
memory, but other times we're giving it the "v" member of a "struct
strvec" we created.

There's several potential ways to fix those sort of leaks, we could
add a "nodup" mode to "struct strvec", which would work for the cases
where we push constant strings to it. But that wouldn't work as soon
as we used strvec_pushf(), or otherwise needed to duplicate or create
a string for that "struct strvec".

Let's instead make it the responsibility of the revisions API. If it's
going to clobber elements of argv it can also free() them, which it
will now do if instructed to do so via "free_removed_argv_elements".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:12:36 -07:00
Ævar Arnfjörð Bjarmason f89d085b3f log: refactor "rev.pending" code in cmd_show()
Refactor the juggling of "rev.pending" and our replacement for it
amended in the preceding commit so that:

 * We use an "unsigned int" instead of an "int" for "i", this matches
   the types of "struct rev_info" itself.

 * We don't need the "count" and "objects" variables introduced in
   5d7eeee2ac (git-show: grok blobs, trees and tags, too, 2006-12-14).

   They were originally added since we'd clobber rev.pending in the
   loop without restoring it. Since the preceding commit we are
   restoring it when we handle OBJ_COMMIT, so the main for-loop can
   refer to "rev.pending" didrectly.

 * We use the "memcpy a &blank" idiom introduced in
   5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
   macro, 2021-07-01).

   This is more obvious than relying on us enumerating all of the
   relevant members of the "struct object_array" that we need to
   clear.

 * We comment on why we don't need an object_array_clear() here, see
   the analysis in [1].

1. https://lore.kernel.org/git/YuQtJ2DxNKX%2Fy70N@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:54:20 -07:00
Ævar Arnfjörð Bjarmason 055e57b7b2 log: fix a memory leak in "git show <revision>..."
Fix a memory leak in code added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14). As we iterate over a "<revision>..."
command-line and encounter ad OBJ_COMMIT we want to use our "struct
rev_info", but with a "pending" array of one element: the one commit
we're showing in the loop.

To do this 5d7eeee2ac saved away a pointer to rev.pending.objects and
rev.pending.nr for its iteration. We'd then clobber those (and alloc)
when we needed to show an OBJ_COMMIT.

We'd therefore leak the "rev.pending" we started out with, and only
free the new "rev.pending" in the "OBJ_COMMIT" case arm as
prepare_revision_walk() would draw it down.

Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we
save away the "rev.pending" before clearing it. We then add a single
commit to it, which our indirect invocation of prepare_revision_walk()
will remove. After that we restore the "rev.pending".

Our "rev.pending" will then get free'd by the release_revisions()
added in f6bfea0ad0 (revisions API users: use release_revisions() in
builtin/log.c, 2022-04-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:16:28 -07:00
Linus Torvalds 04ede97211 symbolic-ref: refuse to set syntactically invalid target
You can feed absolute garbage to symbolic-ref as a target like:

  git symbolic-ref HEAD refs/heads/foo..bar

While this doesn't technically break the repo entirely (our "is it a git
directory" detector looks only for "refs/" at the start), we would never
resolve such a ref, as the ".." is invalid within a refname.

Let's flag these as invalid at creation time to help the caller realize
that what they're asking for is bogus.

A few notes:

  - We use REFNAME_ALLOW_ONELEVEL here, which lets:

     git update-ref refs/heads/foo FETCH_HEAD

    continue to work. It's unclear whether anybody wants to do something
    so odd, but it does work now, so this is erring on the conservative
    side. There's a test to make sure we didn't accidentally break this,
    but don't take that test as an endorsement that it's a good idea, or
    something we might not change in the future.

  - The test in t4202-log.sh checks how we handle such an invalid ref on
    the reading side, so it has to be updated to touch the HEAD file
    directly.

  - We need to keep our HEAD-specific check for "does it start with
    refs/". The ALLOW_ONELEVEL flag means we won't be enforcing that for
    other refs, but HEAD is special here because of the checks in
    validate_headref().

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-01 12:17:13 -07:00
Junio C Hamano acdb1e1053 Merge branch 'mt/checkout-count-fix'
"git checkout" miscounted the paths it updated, which has been
corrected.
source: <cover.1657799213.git.matheus.bernardino@usp.br>

* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-08-01 09:58:38 -07:00
Junio C Hamano 3d8e3dc4fc Merge branch 'ds/rebase-update-ref'
"git rebase -i" learns to update branches whose tip appear in the
rebased range with "--update-refs" option.
source: <pull.1247.v5.git.1658255624.gitgitgadget@gmail.com>

* ds/rebase-update-ref:
  sequencer: notify user of --update-refs activity
  sequencer: ignore HEAD ref under --update-refs
  rebase: add rebase.updateRefs config option
  sequencer: rewrite update-refs as user edits todo list
  rebase: update refs from 'update-ref' commands
  rebase: add --update-refs option
  sequencer: add update-ref command
  sequencer: define array with enum values
  rebase-interactive: update 'merge' description
  branch: consider refs under 'update-refs'
  t2407: test branches currently using apply backend
  t2407: test bisect and rebase as black-boxes
2022-08-01 09:58:38 -07:00
Junio C Hamano 54ec7b817d Merge branch 'ro/mktree-allow-missing-fix' into maint
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.
source: <748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com>

* ro/mktree-allow-missing-fix:
  mktree: do not check type of remote objects
2022-07-27 13:00:30 -07:00
Junio C Hamano 494d31e9d6 Merge branch 'jk/diff-files-cleanup-fix' into maint
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.
source: <Ys0c0ePxPOqZ/5ck@coredump.intra.peff.net>

* jk/diff-files-cleanup-fix:
  diff-files: move misplaced cleanup label
2022-07-27 13:00:27 -07:00
ZheNing Hu ce74de931d ls-files: introduce "--format" option
Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-23 10:53:55 -07:00
Elijah Newren c23fc075c6 merge: do not exit restore_state() prematurely
Previously, if the user:

* Had no local changes before starting the merge
* A merge strategy makes changes to the working tree/index but returns
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was a null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren 034195ef92 merge: ensure we can actually restore pre-merge state
Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren aa77ce88ed merge: make restore_state() restore staged state too
There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which report they
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.  Also while at it, switch to using strvec
so folks don't have to count the arguments to ensure we avoided an
off-by-one error, and so it's easier to add additional arguments to
the command.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren 1369f1475b merge: fix save_state() to work when there are stat-dirty files
When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren 8f240b8bbb merge: do not abort early if one strategy fails to handle the merge
builtin/merge is setup to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren e4cdfe84a0 merge: abort if index does not match HEAD for trivial merges
As noted in the last commit and the links therein (especially commit
9822175d2b ("Ensure index matches head before invoking merge machinery,
round N", 2019-08-17), we have had a very long history of problems with
failing to enforce the requirement that index matches HEAD when starting
a merge.

The "trivial merge" logic in builtin/merge.c is yet another such case
we previously missed.  Add a check for it to ensure it aborts if the
index does not match HEAD, and add a testcase where this fix is needed.

Note that the fix here would also incidentally be an alternative fix
for the testcase added in the last patch, but the fix in the last patch
is still needed when multiple merge strategies are in use, so tweak the
testcase from the previous commit so that it continues to exercise the
codepath added in the last commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Taylor Blau db9d67f2e9 builtin/cat-file.c: support NUL-delimited input with -z
When callers are using `cat-file` via one of the stdin-driven `--batch`
modes, all input is newline-delimited. This presents a problem when
callers wish to ask about, e.g. tree-entries that have a newline
character present in their filename.

To support this niche scenario, introduce a new `-z` mode to the
`--batch`, `--batch-check`, and `--batch-command` suite of options that
instructs `cat-file` to treat its input as NUL-delimited, allowing the
individual commands themselves to have newlines present.

The refactoring here is slightly unfortunate, since we turn loops like:

    while (strbuf_getline(&buf, stdin) != EOF)

into:

    while (1) {
        int ret;
        if (opt->nul_terminated)
            ret = strbuf_getline_nul(&input, stdin);
        else
            ret = strbuf_getline(&input, stdin);

        if (ret == EOF)
            break;
    }

It's tempting to think that we could use `strbuf_getwholeline()` and
specify either `\n` or `\0` as the terminating character. But for input
on platforms that include a CR character preceeding the LF, this
wouldn't quite be the same, since `strbuf_getline(...)` will trim any
trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

In the future, we could clean this up further by introducing a variant
of `strbuf_getwholeline()` that addresses the aforementioned gap, but
that approach felt too heavy-handed for this pair of uses.

Some tests are added in t1006 to ensure that `cat-file` produces the
same output in `--batch`, `--batch-check`, and `--batch-command` modes
with and without the new `-z` option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:42:06 -07:00
Junio C Hamano dd7c820d9e Merge branch 'js/shortlog-sort-stably'
"git shortlog -n" relied on the underlying qsort() to be stable,
which shouldn't have.  Fixed.

* js/shortlog-sort-stably:
  shortlog: use a stable sort
2022-07-22 15:04:02 -07:00
Junio C Hamano 1e11fab59c builtin/remote.c: use the right kind of STRING_LIST_INIT
Since 4a4b4cda (builtin-remote: Make "remote -v" display push urls,
2009-06-13), the string_list that was initialized with 0 in its
strdup_string member is immediately made to strdup its key strings
by flipping the strdup_string member to true.  When 183113a5
(string_list: Add STRING_LIST_INIT macro and make use of it.,
2010-07-04) has introduced STRING_LIST_INIT macros, it mechanically
replaced the initialization to STRING_LIST_INIT_NODUP.

Instead, just use the other initialization macro to make it strdup
the key from the beginning.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-20 21:46:21 -07:00
Junio C Hamano 7c683389d6 Merge branch 'jk/diff-files-cleanup-fix'
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.

* jk/diff-files-cleanup-fix:
  diff-files: move misplaced cleanup label
2022-07-19 16:40:18 -07:00
Junio C Hamano cf92cb29e9 Merge branch 'jk/clone-unborn-confusion'
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.

* jk/clone-unborn-confusion:
  clone: move unborn head creation to update_head()
  clone: use remote branch if it matches default HEAD
  clone: propagate empty remote HEAD even with other branches
  clone: drop extra newline from warning message
2022-07-19 16:40:17 -07:00
Junio C Hamano 418aef9055 Merge branch 'jc/resolve-undo'
The resolve-undo information in the index was not protected against
GC, which has been corrected.

* jc/resolve-undo:
  fsck: do not dereference NULL while checking resolve-undo data
  revision: mark blobs needed for resolve-undo as reachable
2022-07-19 16:40:16 -07:00
Derrick Stolee 3113fedaeb rebase: add rebase.updateRefs config option
The previous change added the --update-refs command-line option.  For
users who always want this mode, create the rebase.updateRefs config
option which behaves the same way as rebase.autoSquash does with the
--autosquash option.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
Derrick Stolee 900b50c242 rebase: add --update-refs option
When working on a large feature, it can be helpful to break that feature
into multiple smaller parts that become reviewed in sequence. During
development or during review, a change to one part of the feature could
affect multiple of these parts. An interactive rebase can help adjust
the multi-part "story" of the branch.

However, if there are branches tracking the different parts of the
feature, then rebasing the entire list of commits can create commits not
reachable from those "sub branches". It can take a manual step to update
those branches.

Add a new --update-refs option to 'git rebase -i' that adds 'update-ref
<ref>' steps to the todo file whenever a commit that is being rebased is
decorated with that <ref>. At the very end, the rebase process updates
all of the listed refs to the values stored during the rebase operation.

Be sure to iterate after any squashing or fixups are placed. Update the
branch only after those squashes and fixups are complete. This allows a
--fixup commit at the tip of the feature to apply correctly to the sub
branch, even if it is fixing up the most-recent commit in that part.

This change update the documentation and builtin to accept the
--update-refs option as well as updating the todo file with the
'update-ref' commands. Tests are added to ensure that these todo
commands are added in the correct locations.

This change does _not_ include the actual behavior of tracking the
updated refs and writing the new ref values at the end of the rebase
process. That is deferred to a later change.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-19 12:49:04 -07:00
Junio C Hamano afbe62d84c Merge branch 'sg/multi-pack-index-parse-options-fix'
The way "git multi-pack" uses parse-options API has been improved.

* sg/multi-pack-index-parse-options-fix:
  multi-pack-index: simplify handling of unknown --options
2022-07-18 13:31:58 -07:00
Junio C Hamano 7f8d098b1b Merge branch 'ab/cocci-unused'
Add Coccinelle rules to detect the pattern of initializing and then
finalizing a structure without using it in between at all, which
happens after code restructuring and the compilers fail to
recognize as an unused variable.

* ab/cocci-unused:
  cocci: generalize "unused" rule to cover more than "strbuf"
  cocci: add and apply a rule to find "unused" strbufs
  cocci: have "coccicheck{,-pending}" depend on "coccicheck-test"
  cocci: add a "coccicheck-test" target and test *.cocci rules
  Makefile & .gitignore: ignore & clean "git.res", not "*.res"
  Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
2022-07-18 13:31:57 -07:00
Junio C Hamano 6d003858e5 Merge branch 'gc/submodule-use-super-prefix'
Another step to rewrite more parts of "git submodule" in C.

* gc/submodule-use-super-prefix:
  submodule--helper: remove display path helper
  submodule--helper update: use --super-prefix
  submodule--helper: remove unused SUPPORT_SUPER_PREFIX flags
  submodule--helper: use correct display path helper
  submodule--helper: don't recreate recursive prefix
  submodule--helper update: use display path helper
  submodule--helper tests: add missing "display path" coverage
2022-07-18 13:31:56 -07:00
Junio C Hamano 44357f64f6 Merge branch 'ab/leakfix'
Plug various memory leaks.

* ab/leakfix:
  pull: fix a "struct oid_array" memory leak
  cat-file: fix a common "struct object_context" memory leak
  gc: fix a memory leak
  checkout: avoid "struct unpack_trees_options" leak
  merge-file: fix memory leaks on error path
  merge-file: refactor for subsequent memory leak fix
  cat-file: fix a memory leak in --batch-command mode
  revert: free "struct replay_opts" members
  submodule.c: free() memory from xgetcwd()
  clone: fix memory leak in wanted_peer_refs()
  check-ref-format: fix trivial memory leak
2022-07-18 13:31:54 -07:00
Junio C Hamano f01315ef7d Merge branch 'jc/builtin-mv-move-array'
Apply Coccinelle rule to turn raw memmove() into MOVE_ARRAY() cpp
macro, which would improve maintainability and readability.

* jc/builtin-mv-move-array:
  builtin/mv.c: use the MOVE_ARRAY() macro instead of memmove()
2022-07-18 13:31:53 -07:00
Siddharth Asthana ec031da9f9 cat-file: add mailmap support
git-cat-file is used by tools like GitLab to get commit tag contents
that are then displayed to users. This content which has author,
committer or tagger information, could benefit from passing through the
mailmap mechanism before being sent or displayed.

This patch adds --[no-]use-mailmap command line option to the git
cat-file command. It also adds --[no-]mailmap option as an alias to
--[no-]use-mailmap.

This patch also introduces new test cases to test the mailmap mechanism in
git cat-file command.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-18 12:55:53 -07:00
Junio C Hamano 361cbe6d6d Merge branch 'ab/submodule-cleanup'
Further preparation to turn git-submodule.sh into a builtin.

* ab/submodule-cleanup:
  git-sh-setup.sh: remove "say" function, change last users
  git-submodule.sh: use "$quiet", not "$GIT_QUIET"
  submodule--helper: eliminate internal "--update" option
  submodule--helper: understand --checkout, --merge and --rebase synonyms
  submodule--helper: report "submodule" as our name in some "-h" output
  submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
  submodule update: remove "-v" option
  submodule--helper: have --require-init imply --init
  git-submodule.sh: remove unused top-level "--branch" argument
  git-submodule.sh: make the "$cached" variable a boolean
  git-submodule.sh: remove unused $prefix variable
  git-submodule.sh: remove unused sanitize_submodule_env()
2022-07-14 15:04:00 -07:00
Junio C Hamano 0455aad1e3 Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path between directories that are "in cone" (i.e. expected
to be materialized in the working tree) and "out of cone"
(i.e. expected to be hidden).  The handling of such cases has been
improved.

* sy/mv-out-of-cone:
  mv: add check_dir_in_index() and solve general dir check issue
  mv: use flags mode for update_mode
  mv: check if <destination> exists in index to handle overwriting
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: decouple if/else-if checks using goto
  mv: update sparsity after moving from out-of-cone to in-cone
  t1092: mv directory from out-of-cone to in-cone
  t7002: add tests for moving out-of-cone file/directory
2022-07-14 15:04:00 -07:00
Junio C Hamano 73b9ef6ab1 Merge branch 'hx/unpack-streaming'
Allow large objects read from a packstream to be streamed into a
loose object file straight, without having to keep it in-core as a
whole.

* hx/unpack-streaming:
  unpack-objects: use stream_loose_object() to unpack large objects
  core doc: modernize core.bigFileThreshold documentation
  object-file.c: add "stream_loose_object()" to handle large object
  object-file.c: factor out deflate part of write_loose_object()
  object-file.c: refactor write_loose_object() to several steps
  unpack-objects: low memory footprint for get_data() in dry_run mode
2022-07-14 15:03:59 -07:00
Junio C Hamano be733e1200 Merge branch 'en/merge-tree'
"git merge-tree" learned a new mode where it takes two commits and
computes a tree that would result in the merge commit, if the
histories leading to these two commits were to be merged.

* en/merge-tree:
  git-merge-tree.txt: add a section on potentional usage mistakes
  merge-tree: add a --allow-unrelated-histories flag
  merge-tree: allow `ls-files -u` style info to be NUL terminated
  merge-ort: optionally produce machine-readable output
  merge-ort: store more specific conflict information
  merge-ort: make `path_messages` a strmap to a string_list
  merge-ort: store messages in a list, not in a single strbuf
  merge-tree: provide easy access to `ls-files -u` style info
  merge-tree: provide a list of which files have conflicts
  merge-ort: remove command-line-centric submodule message from merge-ort
  merge-ort: provide a merge_get_conflicted_files() helper function
  merge-tree: support including merge messages in output
  merge-ort: split out a separate display_update_messages() function
  merge-tree: implement real merges
  merge-tree: add option parsing and initial shell for real merge function
  merge-tree: move logic for existing merge into new function
  merge-tree: rename merge_trees() to trivial_merge_trees()
2022-07-14 15:03:59 -07:00
Johannes Schindelin df534dcbaa shortlog: use a stable sort
When sorting the output of `git shortlog` by count, a list of authors in
alphabetical order is then sorted by contribution count. Obviously, the
idea is to maintain the alphabetical order for items with identical
contribution count.

At the moment, this job is performed by `qsort()`. As that function is
not guaranteed to implement a stable sort algorithm, this can lead to
inconsistent and/or surprising behavior: items with identical
contribution count could lose their alphabetical sub-order.

The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and under certain circumstances this even causes a test
failure in t4201.21 "shortlog can match multiple groups", where two
authors both are listed with 2 contributions, and are listed in inverse
alphabetical order.

Let's instead use the stable sort provided by `git_stable_qsort()` to
avoid this inconsistency.

This is a companion to 2049b8dc65 (diffcore_rename(): use a stable sort,
2019-09-30).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 11:24:11 -07:00
Matheus Tavares 611c7785e8 checkout: fix two bugs on the final count of updated entries
At the end of `git checkout <pathspec>`, we get a message informing how
many entries were updated in the working tree. However, this number can
be inaccurate for two reasons:

1) Delayed entries currently get counted twice.
2) Failed entries are included in the count.

The first problem happens because the counter is first incremented
before inserting the entry in the delayed checkout queue, and once again
when finish_delayed_checkout() calls checkout_entry(). And the second
happens because the counter is incremented too early in
checkout_entry(), before the entry was in fact checked out. Fix that by
moving the count increment further down in the call stack and removing
the duplicate increment on delayed entries. Note that we have to keep
a per-entry reference for the counter (both on parallel checkout and
delayed checkout) because not all entries are always accumulated at the
same counter. See checkout_worktree(), at builtin/checkout.c for an
example.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 10:19:28 -07:00
Junio C Hamano 8c4f65e0bf Merge branch 'cl/grep-max-count'
"git grep -m<max-hits>" is a way to limit the hits shown per file.

* cl/grep-max-count:
  grep: add --max-count command line option
2022-07-13 14:54:55 -07:00
Junio C Hamano 33f448b5fc Merge branch 'jk/remote-show-with-negative-refspecs'
"git remote show [-n] frotz" now pays attention to negative
pathspec.

* jk/remote-show-with-negative-refspecs:
  remote: handle negative refspecs in git remote show
2022-07-13 14:54:54 -07:00
Junio C Hamano 6fccbdaa51 Merge branch 'ro/mktree-allow-missing-fix'
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.

* ro/mktree-allow-missing-fix:
  mktree: do not check type of remote objects
2022-07-13 14:54:53 -07:00
Junio C Hamano 7fefa1b68e Merge branch 'ds/branch-checked-out' into ds/rebase-update-ref
* ds/branch-checked-out:
  branch: drop unused worktrees variable
  fetch: stop passing around unused worktrees variable
  branch: fix branch_checked_out() leaks
  branch: use branch_checked_out() when deleting refs
  fetch: use new branch_checked_out() and add tests
  branch: check for bisects and rebases
  branch: add branch_checked_out() helper
2022-07-12 08:38:42 -07:00
Jeff King 04393ae7f7 diff-files: move misplaced cleanup label
Commit 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13) converted an early return in
cmd_diff_files() into a goto. But it put the cleanup label too early: if
read_cache_preload() returns an error, we'll set result to "-1", but
then jump to calling run_diff_files(), overwriting our result.

We should jump past the call to run_diff_files(). Likewise, we should go
past diff_result_code(), which is expecting to see a code from an actual
diff, not a negative error code.

In practice, I suspect this bug cannot actually be triggered, because
read_cache_preload() does not seem to ever return an error. Its return
value (eventually) comes from do_read_index(), which gives the number of
cache entries found, and calls die() on error. Still, it makes sense to
fix the inadvertent change from 0139c58ab9 first, and we can look into
the overall error handling of read_cache() separately (which is present
in many other callsites).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-12 07:17:28 -07:00
Junio C Hamano e0ad13977a fsck: do not dereference NULL while checking resolve-undo data
When we found an invalid object recorded in the resolve-undo data,
we would have ended up dereferencing NULL while fsck.  Reporting the
problem and going on to the next object is the right thing to do
here.

Noticed by SZEDER Gábor.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 16:26:33 -07:00
Junio C Hamano c2d01098fb Merge branch 'ds/branch-checked-out'
Introduce a helper to see if a branch is already being worked on
(hence should not be newly checked out in a working tree), which
performs much better than the existing find_shared_symref() to
replace many uses of the latter.

* ds/branch-checked-out:
  branch: drop unused worktrees variable
  fetch: stop passing around unused worktrees variable
  branch: fix branch_checked_out() leaks
  branch: use branch_checked_out() when deleting refs
  fetch: use new branch_checked_out() and add tests
  branch: check for bisects and rebases
  branch: add branch_checked_out() helper
2022-07-11 15:38:51 -07:00
Jeff King daf7898abb clone: move unborn head creation to update_head()
Prior to 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05),
creation of the local HEAD was always done in update_head(). That commit
added code to handle an unborn head in an empty repository, and just did
all symref creation and config setup there.

This makes the code flow a little bit confusing, especially as new
corner cases have been covered (like the previous commit to match our
default branch name to a non-HEAD remote branch).

Let's move the creation of the unborn symref into update_head(). This
matches the other HEAD-creation cases, and now the logic is consistently
separated: the main cmd_clone() function only examines the situation and
sets variables based on what it finds, and update_head() actually
performs the update.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 13:32:37 -07:00
SZEDER Gábor cc74afb83f multi-pack-index: simplify handling of unknown --options
Although parse_options() can handle unknown --options just fine, none
of 'git multi-pack-index's subcommands rely on it, but do it on their
own: they invoke parse_options() with the PARSE_OPT_KEEP_UNKNOWN flag,
then check whether there are any unparsed arguments left, and print
usage and quit if necessary.

Drop that PARSE_OPT_KEEP_UNKNOWN flag to let parse_options() handle
unknown options instead, which has the additional benefit that it
prints not only the usage but an "error: unknown option `foo'" message
as well.

Do leave the unparsed arguments check to catch any unexpected
non-option arguments, though, e.g. 'git multi-pack-index write foo'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-10 14:53:48 -07:00
Junio C Hamano eee227ad8e builtin/mv.c: use the MOVE_ARRAY() macro instead of memmove()
The variables 'source', 'destination', and 'submodule_gitfile' are
all of type "const char **", and an element of such an array is of
"type const char *", but these memmove() calls were written as if
these variables are of type "char **".

Once these memmove() calls are fixed to use the correct type to
compute the number of bytes to be moved, e.g.

-      memmove(source + i, source + i + 1, n * sizeof(char *));
+      memmove(source + i, source + i + 1, n * sizeof(const char *));

existing contrib/coccinelle/array.cocci rules can recognize them as
candidates for turning into MOVE_ARRAY().

While at it, use CALLOC_ARRAY() instead of xcalloc() to allocate the
modes[] array that is involved in the change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-09 18:38:57 -07:00
Jeff King cc8fcd1e1a clone: use remote branch if it matches default HEAD
Usually clone tries to use the same local HEAD as the remote (unless the
user has given --branch explicitly). Even if the remote HEAD is detached
or unborn, we can detect those situations with modern versions of Git.
If the remote is too old to support the "unborn" extension (or it has
been disabled via config), then we can't know the name of the remote's
unborn HEAD, and we fall back whatever the local default branch name is
configured to be.

But that leads to one weird corner case. It's rare because it needs a
number of factors:

  - the remote has an unborn HEAD

  - the remote is too old to support "unborn", or has disabled it

  - the remote has another branch "foo"

  - the local default branch name is "foo"

In that case you end up with a local clone on an unborn "foo" branch,
disconnected completely from the remote's "foo". This is rare in
practice, but the result is quite confusing.

When choosing "foo", we can double check whether the remote has such a
name, and if so, start our local "foo" at the same spot, rather than
making it unborn.

Note that this causes a test failure in t5605, which is cloning from a
bundle that doesn't contain HEAD (so it behaves like a remote that
doesn't support "unborn"), but has a single "main" branch. That test
expects that we end up in the weird "unborn main" case, where we don't
actually check out the remote branch of the same name. Even though we
have to update the test, this seems like an argument in favor of this
patch: checking out main is what I'd expect from such a bundle.

So this patch updates the test for the new behavior and adds an adjacent
one that checks what the original was going for: if there's no HEAD and
the bundle _doesn't_ have a branch that matches our local default name,
then we end up with nothing checked out.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-07 20:57:54 -07:00
Jeff King 3d8314f8d1 clone: propagate empty remote HEAD even with other branches
Unless "--branch" was given, clone generally tries to match the local
HEAD to the remote one. For most repositories, this is easy: the remote
tells us which branch HEAD was pointing to, and we call our local
checkout() function on that branch.

When cloning an empty repository, it's a little more tricky: we have
special code that checks the transport's "unborn" extension, or falls back
to our local idea of what the default branch should be. In either case,
we point the new HEAD to that, and set up the branch.* config.

But that leaves one case unhandled: when the remote repository _isn't_
empty, but its HEAD is unborn. The checkout() function is smart enough
to realize we didn't fetch the remote HEAD and it bails with a warning.
But we'll have ignored any information the remote gave us via the unborn
extension. This leads to nonsense outcomes:

  - If the remote has its HEAD pointing to an unborn "foo" and contains
    another branch "bar", cloning will get branch "bar" but leave the
    local HEAD pointing at "master" (or whatever our local default is),
    which is useless. The project does not use "master" as a branch.

  - Worse, if the other branch "bar" is instead called "master" (but
    again, the remote HEAD is not pointing to it), then we end up with a
    local unborn branch "master", which is not connected to the remote
    "master" (it shares no history, and there's no branch.* config).

Instead, we should try to use the remote's HEAD, even if its unborn, to
be consistent with the other cases.

The reason this case was missed is that cmd_clone() handles empty and
non-empty repositories on two different sides of a conditional:

  if (we have any refs) {
      fetch refs;
      check for --branch;
      otherwise, try to point our head at remote head;
      otherwise, our head is NULL;
  } else {
      check for --branch;
      otherwise, try to use "unborn" extension;
      otherwise, fall back to our default name name;
  }

So the smallest change would be to repeat the "unborn" logic at the end
of the first block. But we can note some other overlaps and
inconsistencies:

  - both sides have to handle --branch (though note that it's always an
    error for the empty repo case, since an empty repo by definition
    does not have a matching branch)

  - the fall back to the default name is much more explicit in the
    empty-repo case. The non-empty case eventually ends up bailing
    from checkout() with a warning, which produces a similar result, but
    fails to set up the branch config we do in the empty case.

So let's pull the HEAD setup out of this conditional entirely. This
de-duplicates some of the code and the result is easy to follow, because
helper functions like find_ref_by_name() do the right thing even in the
empty-repo case (i.e., by returning NULL).

There are two subtleties:

  - for a remote with a detached HEAD, it will advertise an oid for HEAD
    (which we store in our "remote_head" variable), but we won't find a
    matching refname (so our "remote_head_points_at" is NULL). In this
    case we make a local detached HEAD to match. Right now this happens
    implicitly by reaching update_head() with a non-NULL remote_head
    (since we skip all of the unborn-fallback). We'll now need to
    account for it explicitly before doing the fallback.

  - for an empty repo, we issue a warning to the user that they've
    cloned an empty repo. The text of that warning doesn't make sense
    for a non-empty repo with an unborn HEAD, so we'll have to
    differentiate the two cases there. We could just use different text,
    but instead let's allow the code to continue down to checkout(),
    which will issue an appropriate warning, like:

      remote HEAD refers to nonexistent ref, unable to checkout

    Continuing down to checkout() will make it easier to do more fixes
    on top (see below).

Note that this patch fixes the case where the other side reports an
unborn head to us using the protocol extension. It _doesn't_ fix the
case where the other side doesn't tell us, we locally guess "master",
and the other side happens to have a "master" which its HEAD doesn't
point. But it doesn't make anything worse there, and it should actually
make it easier to fix that problem on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-07 20:57:54 -07:00
Jeff King f77710c504 clone: drop extra newline from warning message
We don't need to put a "\n" in calls to warning(), since it adds one
itself (and the user sees an extra blank line). Drop it, and while we're
here, drop the full-stop from the message, which goes against our
guidelines.

This bug dates all the way back to 8434c2f1af (Build in clone,
2008-04-27), but presumably nobody noticed because it's hard to trigger:
you have to clone a repository whose HEAD is unborn, but which is not
otherwise empty.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-07 20:57:54 -07:00
Ævar Arnfjörð Bjarmason 06f5f8940c cocci: generalize "unused" rule to cover more than "strbuf"
Generalize the newly added "unused.cocci" rule to find more than just
"struct strbuf", let's have it find the same unused patterns for
"struct string_list", as well as other code that uses
similar-looking *_{release,clear,free}() and {release,clear,free}_*()
functions.

We're intentionally loose in accepting e.g. a "strbuf_init(&sb)"
followed by a "string_list_clear(&sb, 0)".  It's assumed that the
compiler will catch any such invalid code, i.e. that our
constructors/destructors don't take a "void *".

See [1] for example of code that would be covered by the
"get_worktrees()" part of this rule. We'd still need work that the
series is based on (we were passing "worktrees" to a function), but
could now do the change in [1] automatically.

1. https://lore.kernel.org/git/Yq6eJFUPPTv%2Fzc0o@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
Ævar Arnfjörð Bjarmason 4f40f6cb73 cocci: add and apply a rule to find "unused" strbufs
Add a coccinelle rule to remove "struct strbuf" initialization
followed by calling "strbuf_release()" function, without any uses of
the strbuf in the same function.

See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's
intended to find and replace.

The inclusion of "contrib/scalar/scalar.c" is because "spatch" was
manually run on it (we don't usually run spatch on contrib).

Per the "buggy code" comment we also match a strbuf_init() before the
xmalloc(), but we're not seeking to be so strict as to make checks
that the compiler will catch for us redundant. Saying we'll match
either "init" or "xmalloc" lines makes the rule simpler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 12:24:43 -07:00
Shaoxuan Yuan b91a2b6594 mv: add check_dir_in_index() and solve general dir check issue
Originally, moving a <source> directory which is not on-disk due
to its existence outside of sparse-checkout cone, "giv mv" command
errors out with "bad source".

Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic, so that such <source> directory makes
"giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan 24ea81d9ac mv: use flags mode for update_mode
As suggested by Derrick [1], move the in-line definition of
"enum update_mode" to the top of the file and make it use "flags"
mode (each state is a different bit in the word).

Change the flag assignments from '=' (single assignment) to '|='
(additive). Also change flag evaluation from '==' to '&', etc.

[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan 8a26a3915f mv: check if <destination> exists in index to handle overwriting
Originally, moving a sparse file into cone can result in unwarned
overwrite of existing entry. The expected behavior is that if the
<destination> exists in the entry, user should be prompted to supply
a [-f|--force] to carry out the operation, or the operation should
fail.

Add a check mechanism to do that.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan 6645b03ca5 mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
Originally, moving a <source> file which is not on-disk but exists in
index as a SKIP_WORKTREE enabled cache entry, "giv mv" command errors
out with "bad source".

Change the checking logic, so that such <source>
file makes "giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan 7889755bae mv: decouple if/else-if checks using goto
Previous if/else-if chain are highly nested and hard to develop/extend.

Refactor to decouple this if/else-if chain by using goto to jump ahead.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan 707fa2f76a mv: update sparsity after moving from out-of-cone to in-cone
Originally, "git mv" a sparse file from out-of-cone to
in-cone does not update the moved file's sparsity (remove its
SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
not checked out in the working tree.

Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
   corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
   the updated sparsity.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Ævar Arnfjörð Bjarmason ece3974ba6 pull: fix a "struct oid_array" memory leak
Fix a memory leak introduced in 44c175c7a4 (pull: error on no merge
candidates, 2015-06-18). As a result we can mark several tests as
passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".

Removing the "int ret = 0" assignment added here in a6d7eb2c7a (pull:
optionally rebase submodules (remote submodule changes only),
2017-06-23) is not a logic error, it could always have been left
uninitialized (as "int ret"), now that we'll use the "ret" from the
upper scope we can drop the assignment in the "opt_rebase" branch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason 27472b5195 cat-file: fix a common "struct object_context" memory leak
Fix a memory leak where "cat-file" will leak the "path" member. See
e5fba602e5 (textconv: support for cat_file, 2010-06-15) for the code
that introduced the offending get_oid_with_context() call (called
get_sha1_with_context() at the time).

As a result we can mark several tests as passing with SANITIZE=leak
using "TEST_PASSES_SANITIZE_LEAK=true".

As noted in dc944b65f1 (get_sha1_with_context: dynamically allocate
oc->path, 2017-05-19) callers must free the "path" member. That same
commit added the relevant free() to this function, but we weren't
catching cases where we'd return early.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason 55916bba0f gc: fix a memory leak
Fix a memory leak in code added in 41abfe15d9 (maintenance: add
pack-refs task, 2021-02-09), we need to call strvec_clear() on the
"struct strvec" that we initialized.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason 33d0dda633 checkout: avoid "struct unpack_trees_options" leak
In 1c41d2805e (unpack_trees_options: free messages when done,
2018-05-21) we started calling clear_unpack_trees_porcelain() on this
codepath, but missed this error path.

We could call clear_unpack_trees_porcelain() just before we error()
and return when unmerged_cache() fails, but the more correct fix is to
not have the unmerged_cache() check happen in the middle of our
"topts" setup.

Before 23cbf11b5c (merge-recursive: porcelain messages for checkout,
2010-08-11) we would not malloc() to setup our "topts", which is when
this started to leak on the error path.

Before that this code wasn't conflating the setup of "topts" and the
unmerged_cache() call in any meaningful way. The initial version in
782c2d65c2 (Build in checkout, 2008-02-07) just does a "memset" of
it, and initializes a single struct member.

Then in 8ccba008ee (unpack-trees: allow Porcelain to give different
error messages, 2008-05-17) we added the initialization of the error
message, which as noted above finally started leaking in 23cbf11b5c.

Let's fix the memory leak, and avoid future issues by initializing the
"topts" with a helper function. There are no functional changes here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason e72e12cc02 merge-file: fix memory leaks on error path
Fix a memory leak in "merge-file", we need to loop over the "mmfs"
array and free() what we've got so far when we error out. As a result
we can mark a test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason 480a0e30a7 merge-file: refactor for subsequent memory leak fix
Refactor the code in builtin/merge-file.c to:

 * Use the initializer to zero out "mmfs", and use modern C syntax for
   the rest.

 * Refactor the the inner loop to use a variable and "if/else if"
   pattern followed by "return". This will make a change to change it to
   a "goto cleanup" pattern smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00
Ævar Arnfjörð Bjarmason d90dafbe31 cat-file: fix a memory leak in --batch-command mode
Fix a memory leak introduced in 440c705ea6 (cat-file: add
--batch-command mode, 2022-02-18). The free_cmds() function was only
called on "queued_nr" if we had a "flush" command. As the "without
flush for blob info" test added in the same commit shows we can't rely
on that, so let's call free_cmds() again at the end.

Since "nr" follows the usual pattern of being set to 0 if we've
free()'d the memory already it's OK to call it twice, even in cases
where we are doing a "flush".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 11:43:43 -07:00