Commit graph

10457 commits

Author SHA1 Message Date
Junio C Hamano
c068a3b8ee Merge branch 'ds/decorate-filter-tweak'
The namespaces used by "log --decorate" from "refs/" hierarchy by
default has been tightened.

* ds/decorate-filter-tweak:
  fetch: use ref_namespaces during prefetch
  maintenance: stop writing log.excludeDecoration
  log: create log.initialDecorationSet=all
  log: add --clear-decorations option
  log: add default decoration filter
  log-tree: use ref_namespaces instead of if/else-if
  refs: use ref_namespaces for replace refs base
  refs: add array of ref namespaces
  t4207: test coloring of grafted decorations
  t4207: modernize test
  refs: allow "HEAD" as decoration filter
2022-08-29 14:55:11 -07:00
Junio C Hamano
f00ddc9f48 Merge branch 'vd/scalar-generalize-diagnose'
The "diagnose" feature to create a zip archive for diagnostic
material has been lifted from "scalar" and made into a feature of
"git bugreport".

* vd/scalar-generalize-diagnose:
  scalar: update technical doc roadmap
  scalar-diagnose: use 'git diagnose --mode=all'
  builtin/bugreport.c: create '--diagnose' option
  builtin/diagnose.c: add '--mode' option
  builtin/diagnose.c: create 'git diagnose' builtin
  diagnose.c: add option to configure archive contents
  scalar-diagnose: move functionality to common location
  scalar-diagnose: move 'get_disk_info()' to 'compat/'
  scalar-diagnose: add directory to archiver more gently
  scalar-diagnose: avoid 32-bit overflow of size_t
  scalar-diagnose: use "$GIT_UNZIP" in test
2022-08-25 14:42:32 -07:00
Junio C Hamano
fddd8b4801 Merge branch 'll/disk-usage-humanise'
"git rev-list --disk-usage" learned to take an optional value
"human" to show the reported value in human-readable format, like
"3.40MiB".

* ll/disk-usage-humanise:
  rev-list: support human-readable output for `--disk-usage`
2022-08-18 13:07:05 -07:00
Junio C Hamano
9b9445cfde Merge branch 'sy/sparse-rm'
"git rm" has become more aware of the sparse-index feature.

* sy/sparse-rm:
  rm: integrate with sparse-index
  rm: expand the index only when necessary
  pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
  t1092: add tests for `git-rm`
2022-08-18 13:07:05 -07:00
Junio C Hamano
80ffc849bd Merge branch 'vd/sparse-reset-checkout-fixes'
Fixes to sparse index compatibility work for "reset" and "checkout"
commands.

* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-18 13:07:04 -07:00
Junio C Hamano
c0f6dd49f1 Merge branch 'ab/tech-docs-to-help'
Expose a lot of "tech docs" via "git help" interface.

* ab/tech-docs-to-help:
  docs: move http-protocol docs to man section 5
  docs: move cruft pack docs to gitformat-pack
  docs: move pack format docs to man section 5
  docs: move signature docs to man section 5
  docs: move index format docs to man section 5
  docs: move protocol-related docs to man section 5
  docs: move commit-graph format docs to man section 5
  git docs: add a category for file formats, protocols and interfaces
  git docs: add a category for user-facing file, repo and command UX
  git help doc: use "<doc>" instead of "<guide>"
  help.c: remove common category behavior from drop_prefix() behavior
  help.c: refactor drop_prefix() to use a "switch" statement"
2022-08-14 23:19:28 -07:00
Victoria Dye
aac0e8ffee builtin/bugreport.c: create '--diagnose' option
Create a '--diagnose' option for 'git bugreport' to collect additional
information about the repository and write it to a zipped archive.

The '--diagnose' option behaves effectively as an alias for simultaneously
running 'git bugreport' and 'git diagnose'. In the documentation, users are
explicitly recommended to attach the diagnostics alongside a bug report to
provide additional context to readers, ideally reducing some back-and-forth
between reporters and those debugging the issue.

Note that '--diagnose' may take an optional string arg (either 'stats' or
'all'). If specified without the arg, the behavior corresponds to running
'git diagnose' without '--mode'. As with 'git diagnose', this default is
intended to help reduce unintentional leaking of sensitive information).
Users can also explicitly specify '--diagnose=(stats|all)' to generate the
respective archive created by 'git diagnose --mode=(stats|all)'.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Victoria Dye
7ecf193f7d builtin/diagnose.c: add '--mode' option
Create '--mode=<mode>' option in 'git diagnose' to allow users to optionally
select non-default diagnostic information to include in the output archive.
Additionally, document the currently-available modes, emphasizing the
importance of not sharing a '--mode=all' archive publicly due to the
presence of sensitive information.

Note that the option parsing callback - 'option_parse_diagnose()' - is added
to 'diagnose.c' rather than 'builtin/diagnose.c' so that it may be reused in
future callers configuring a diagnostics archive.

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Victoria Dye
6783fd3cef builtin/diagnose.c: create 'git diagnose' builtin
Create a 'git diagnose' builtin to generate a standalone zip archive of
repository diagnostics.

The "diagnose" functionality was originally implemented for Scalar in
aa5c79a331 (scalar: implement `scalar diagnose`, 2022-05-28). However, the
diagnostics gathered are not specific to Scalar-cloned repositories and
can be useful when diagnosing issues in any Git repository.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-12 13:20:02 -07:00
Junio C Hamano
83489a5b20 Merge branch 'ab/plug-revisions-leak'
Plug a bit more leaks in the revisions API.

* ab/plug-revisions-leak:
  revisions API: don't leak memory on argv elements that need free()-ing
  bisect.c: partially fix bisect_rev_setup() memory leak
  log: refactor "rev.pending" code in cmd_show()
  log: fix a memory leak in "git show <revision>..."
  test-fast-rebase helper: use release_revisions() (again)
  bisect.c: add missing "goto" for release_revisions()
2022-08-12 13:19:08 -07:00
Junio C Hamano
8faaf690f7 Merge branch 'lt/symbolic-ref-sanity'
"git symbolic-ref symref non..sen..se" is now diagnosed as an error.

* lt/symbolic-ref-sanity:
  symbolic-ref: refuse to set syntactically invalid target
2022-08-12 13:19:08 -07:00
Li Linchao
9096451acd rev-list: support human-readable output for --disk-usage
The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
This is very useful for people inspect their git repo's objects usage
infomation, but the resulting number is quit hard for a human to read.

Teach git rev-list to output a human readable result when using
'--disk-usage'.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 13:45:23 -07:00
Shaoxuan Yuan
ede241c715 rm: integrate with sparse-index
Enable the sparse index within the `git-rm` command.

The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.

Test                              HEAD~1            HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3)     0.41(0.37+0.05)   0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4)     0.38(0.34+0.05)   0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3)   0.57(0.56+0.01)   0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4)   0.57(0.55+0.02)   0.03(0.03+0.00) -94.7%

----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].

`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.

For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:

        rm 'folder1/'

Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:

        rm 'folder1/0/0/0'
        rm 'folder1/0/1'
        rm 'folder1/a'

Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".

Modify a previous test so such difference is not considered as an error.

[1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan
bcf96cfca6 rm: expand the index only when necessary
Remove the `ensure_full_index()` method so `git-rm` does not always
expand the index when the expansion is unnecessary, i.e. when
<pathspec> does not have any possibilities to match anything outside
of sparse-checkout definition.

Expand the index when the <pathspec> needs an expanded index, i.e. the
<pathspec> contains wildcard that may need a full-index or the
<pathspec> is simply outside of sparse-checkout definition.

Notice that the test 'rm pathspec expands index when necessary' in
t1092 *is* testing this code change behavior, though it will be marked
as 'test_expect_success' only in the next patch, where we officially
mark `command_requires_full_index = 0`, so the index does not expand
unless we tell it to do so.

Notice that because we also want `ensure_full_index` to record the
stdout and stderr from Git command, a corresponding modification
is also included in this patch. The reason we want the "sparse-index-out"
and "sparse-index-err", is that we need to make sure there is no error
from Git command itself, so we can rely on the `test_region` result
and determine if the index is expanded or not.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan
b29ad38322 pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
Method pathspec_needs_expanded_index() in reset.c from 4d1cfc1351
(reset: make --mixed sparse-aware, 2021-11-29) is reusable when we
need to verify if the index needs to be expanded when the command
is utilizing a pathspec rather than a literal path.

Move it to pathspec.h for reusability.

Add a few items to the function so it can better serve its purpose as
a standalone public function:

* Add a check in front so if the index is not sparse, return early since
  no expansion is needed.

* It now takes an arbitrary 'struct index_state' pointer instead of
  using `the_index` and `active_cache`.

* Add documentation to the function.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Junio C Hamano
3f61790678 Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm
* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-08 13:23:06 -07:00
Victoria Dye
49ff3cb90f checkout: fix nested sparse directory diff in sparse index
Add the 'recursive' diff flag to the local changes reporting done by 'git
checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded
sparse directories will not be recursed into to report the diff of each
file's contents, resulting in the reported local changes including
"modified" sparse directories.

The same issue was found and fixed for 'git status' in 2c521b0e49 (status:
fix nested sparse directory diff in sparse index, 2022-03-01)

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:49 -07:00
Junio C Hamano
1e92768aa1 Merge branch 'tb/cat-file-z'
Operating modes like "--batch" of "git cat-file" command learned to
take NUL-terminated input, instead of one-item-per-line.

* tb/cat-file-z:
  builtin/cat-file.c: support NUL-delimited input with `-z`
  t1006: extract --batch-command inputs to variables
2022-08-05 15:52:14 -07:00
Derrick Stolee
992f25d713 fetch: use ref_namespaces during prefetch
The "refs/prefetch/" namespace is used by 'git fetch --prefetch' as a
replacement of the destination of the refpsec for a remote. Git also
removes refspecs that include tags.

Instead of using string literals for the 'refs/tags/ and
'refs/prefetch/' namespaces, use the entries in the ref_namespaces
array.

This kind of change could be done in many places around the codebase,
but we are isolating only to this change because of the way the
refs/prefetch/ namespace somewhat motivated the creation of the
ref_namespaces array.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
Derrick Stolee
863a8ae97b maintenance: stop writing log.excludeDecoration
This reverts commit 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19).

The previous change created a default decoration filter that does not
include refs/prefetch/, so this modification of the config is no longer
needed.

One issue that can happen from this point on is that users who ran the
prefetch task on previous versions of Git will still have a
log.excludeDecoration value and that will prevent the new default
decoration filter from being active. Thus, when we add the refs/bundle/
namespace as part of the bundle URI feature, those users will see
refs/bundle/ decorations.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:13 -07:00
Derrick Stolee
3e103ed23f log: create log.initialDecorationSet=all
The previous change introduced the --clear-decorations option for users
who do not want their decorations limited to a narrow set of ref
namespaces.

Add a config option that is equivalent to specifying --clear-decorations
by default.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee
748706d713 log: add --clear-decorations option
The previous changes introduced a new default ref filter for decorations
in the 'git log' command. This can be overridden using
--decorate-refs=HEAD and --decorate-refs=refs/, but that is cumbersome
for users.

Instead, add a --clear-decorations option that resets all previous
filters to a blank filter that accepts all refs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee
92156291ca log: add default decoration filter
When a user runs 'git log', they expect a certain set of helpful
decorations. This includes:

* The HEAD ref
* Branches (refs/heads/)
* Stashes (refs/stash)
* Tags (refs/tags/)
* Remote branches (refs/remotes/)
* Replace refs (refs/replace/ or $GIT_REPLACE_REF_BASE)

Each of these namespaces was selected due to existing test cases that
verify these namespaces appear in the decorations. In particular,
stashes and replace refs can have custom colors from the
color.decorate.<slot> config option.

While one test checks for a decoration from notes, it only applies to
the tip of refs/notes/commit (or its configured ref name). Notes form
their own kind of decoration instead. Modify the expected output for the
tests in t4013 that expect this note decoration.  There are several
tests throughout the codebase that verify that --decorate-refs,
--decorate-refs-exclude, and log.excludeDecoration work as designed and
the tests continue to pass without intervention.

However, there are other refs that are less helpful to show as
decoration:

* Prefetch refs (refs/prefetch/)
* Rebase refs (refs/rebase-merge/ and refs/rebase-apply/)
* Bundle refs (refs/bundle/) [!]

[!] The bundle refs are part of a parallel series that bootstraps a repo
    from a bundle file, storing the bundle's refs into the repo's
    refs/bundle/ namespace.

In the case of prefetch refs, 96eaffebbf (maintenance: set
log.excludeDecoration durin prefetch, 2021-01-19) added logic to add
refs/prefetch/ to the log.excludeDecoration config option. Additional
feedback pointed out that having such a side-effect can be confusing and
perhaps not helpful to users. Instead, we should hide these ref
namespaces that are being used by Git for internal reasons but are not
helpful for the users to see.

The way to provide a seamless user experience without setting the config
is to modify the default decoration filters to match our expectation of
what refs the user actually wants to see.

In builtin/log.c, after parsing the --decorate-refs and
--decorate-refs-exclude options from the command-line, call
set_default_decoration_filter(). This method populates the exclusions
from log.excludeDecoration, then checks if the list of pattern
modifications are empty. If none are specified, then the default set is
restricted to the set of inclusions mentioned earlier (HEAD, branches,
etc.).  A previous change introduced the ref_namespaces array, which
includes all of these currently-used namespaces. The 'decoration' value
is non-zero when that namespace is associated with a special coloring
and fits into the list of "expected" decorations as described above,
which makes the implementation of this filter very simple.

Note that the logic in ref_filter_match() in log-tree.c follows this
matching pattern:

 1. If there are exclusion patterns and the ref matches one, then ignore
    the decoration.

 2. If there are inclusion patterns and the ref matches one, then
    definitely include the decoration.

 3. If there are config-based exclusions from log.excludeDecoration and
    the ref matches one, then ignore the decoration.

With this logic in mind, we need to ensure that we do not populate our
new defaults if any of these filters are manually set. Specifically, if
a user runs

	git -c log.excludeDecoration=HEAD log

then we expect the HEAD decoration to not appear. If we left the default
inclusions in the set, then HEAD would match that inclusion before
reaching the config-based exclusions.

A potential alternative would be to check the list of default inclusions
at the end, after the config-based exclusions. This would still create a
behavior change for some uses of --decorate-refs-exclude=<X>, and could
be overwritten somewhat with --decorate-refs=refs/ and
--decorate-refs=HEAD. However, it no longer becomes possible to include
refs outside of the defaults while also excluding some using
log.excludeDecoration.

Another alternative would be to exclude the known namespaces that are
not intended to be shown. This would reduce the visible effect of the
change for expert users who use their own custom ref namespaces. The
implementation change would be very simple to swap due to our use of
ref_namespaces:

	int i;
	struct string_list *exclude = decoration_filter->exclude_ref_pattern;

	/*
	 * No command-line or config options were given, so
	 * populate with sensible defaults.
	 */
	for (i = 0; i < NAMESPACE__COUNT; i++) {
		if (ref_namespaces[i].decoration)
			continue;

		string_list_append(exclude, ref_namespaces[i].ref);
	}

The main downside of this approach is that we expect to add new hidden
namespaces in the future, and that means that Git versions will be less
stable in how they behave as those namespaces are added.

It is critical that we provide ways for expert users to disable this
behavior change via command-line options and config keys. These changes
will be implemented in a future change.

Add a test that checks that the defaults are not added when
--decorate-refs is specified. We verify this by showing that HEAD is not
included as it normally would.  Also add a test that shows that the
default filter avoids the unwanted decorations from refs/prefetch,
refs/rebase-merge,
and refs/bundle.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Derrick Stolee
97e61e0f9c refs: use ref_namespaces for replace refs base
The git_replace_ref_base global is used to store the value of the
GIT_REPLACE_REF_BASE environment variable or the default of
"refs/replace/". This is initialized within setup_git_env().

The ref_namespaces array is a new centralized location for information
such as the ref namespace used for replace refs. Instead of having this
namespace stored in two places, use the ref_namespaces array instead.

For simplicity, create a local git_replace_ref_base variable wherever
the global was previously used.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-05 14:13:12 -07:00
Ævar Arnfjörð Bjarmason
844739ba27 git docs: add a category for file formats, protocols and interfaces
Create a new "File formats, protocols and other developer interfaces"
section in the main "git help git" manual page and start moving the
documentation that now lives in "Documentation/technical/*.git" over
to it. This complements the newly added and adjacent "Repository,
command and file interfaces" section.

This makes the technical documentation more accessible and
discoverable. Before this we wouldn't install it by default, and had
no ability to build man page versions of them. The links to them from
our existing documentation link to the generated HTML version of these
docs.

So let's start moving those over, starting with just the
"bundle-format.txt" documentation added in 7378ec90e1 (doc: describe
Git bundle format, 2020-02-07). We'll now have a new
gitformat-bundle(5) man page. Subsequent commits will move more git
internal format documentation over.

Unfortunately the syntax of the current Documentation/technical/*.txt
is not the same (when it comes to section headings etc.) as our
Documentation/*.txt documentation, so change the relevant bits of
syntax as we're moving this over.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason
d976c5100f git docs: add a category for user-facing file, repo and command UX
Create a new "Repository, command and file interfaces" section in the
main "git help git" manual page. Move things that belong under this
new criteria from the generic "Guides" section.

The "Guides" section was added in f442f28a81 (git.txt: add list of
guides, 2020-08-05). It makes sense to have e.g. "giteveryday(7)" and
"gitfaq(7)" listed under "Guides".

But placing e.g. "gitignore(5)" in it is stretching the meaning of
what a "guide" is, ideally that section should list things similar to
"giteveryday(7)" and "gitcore-tutorial(7)".

An alternate name that was considered for this new section was "User
formats", for consistency with the nomenclature used for man section 5
in general. My man(1) lists it as "File formats and conventions,
e.g. /etc/passwd".

So calling this "git help --formats" or "git help --user-formats"
would make sense for e.g. gitignore(5), but would be stretching it
somewhat for githooks(5), and would seem really suspect for the likes
of gitcli(7).

Let's instead pick a name that's closer to the generic term "User
interface", which is really what this documentation discusses: General
user-interface documentation that doesn't obviously belong elsewhere.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason
dba1e5392f git help doc: use "<doc>" instead of "<guide>"
Replace the use of "<guide>" originally introduced (as "GUIDE") in
a133737b80 (doc: include --guide option description for "git help",
2013-04-02) with the more generic "<doc>". The "<doc>" placeholder is
more generic, and one we'll be able to use as we introduce new
documentation categories.

Let's also add "<doc>" to the "git help -h" output, when it was made
to use parse_option() in in 41eb33bd0c (help: use parseopt,
2008-02-24).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-04 14:12:23 -07:00
Junio C Hamano
30c6495e1e Merge branch 'jc/string-list-cleanup'
Code clean-up.

* jc/string-list-cleanup:
  builtin/remote.c: use the right kind of STRING_LIST_INIT
2022-08-03 13:36:09 -07:00
Junio C Hamano
966ff64a30 Merge branch 'en/merge-restore-to-pristine'
When "git merge" finds that it cannot perform a merge, it should
restore the working tree to the state before the command was
initiated, but in some corner cases it didn't.

* en/merge-restore-to-pristine:
  merge: do not exit restore_state() prematurely
  merge: ensure we can actually restore pre-merge state
  merge: make restore_state() restore staged state too
  merge: fix save_state() to work when there are stat-dirty files
  merge: do not abort early if one strategy fails to handle the merge
  merge: abort if index does not match HEAD for trivial merges
  merge-resolve: abort if index does not match HEAD
  merge-ort-wrappers: make printed message match the one from recursive
2022-08-03 13:36:09 -07:00
Junio C Hamano
87098a047b Merge branch 'sa/cat-file-mailmap'
"git cat-file" learned an option to use the mailmap when showing
commit and tag objects.

* sa/cat-file-mailmap:
  cat-file: add mailmap support
  ident: rename commit_rewrite_person() to apply_mailmap_to_header()
  ident: move commit_rewrite_person() to ident.c
  revision: improve commit_rewrite_person()
2022-08-03 13:36:08 -07:00
Junio C Hamano
8e56affcb5 Merge branch 'zh/ls-files-format'
"git ls-files" learns the "--format" option to tweak its output.

* zh/ls-files-format:
  ls-files: introduce "--format" option
2022-08-03 13:36:08 -07:00
Ævar Arnfjörð Bjarmason
f92dbdbc6a revisions API: don't leak memory on argv elements that need free()-ing
Add a "free_removed_argv_elements" member to "struct
setup_revision_opt", and use it to fix several memory leaks.

We have various memory leaks in APIs that take and munge "const
char **argv", e.g. parse_options(). Sometimes these APIs are given the
"argv" we get to the "main" function, in which case we don't leak
memory, but other times we're giving it the "v" member of a "struct
strvec" we created.

There's several potential ways to fix those sort of leaks, we could
add a "nodup" mode to "struct strvec", which would work for the cases
where we push constant strings to it. But that wouldn't work as soon
as we used strvec_pushf(), or otherwise needed to duplicate or create
a string for that "struct strvec".

Let's instead make it the responsibility of the revisions API. If it's
going to clobber elements of argv it can also free() them, which it
will now do if instructed to do so via "free_removed_argv_elements".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:12:36 -07:00
Ævar Arnfjörð Bjarmason
f89d085b3f log: refactor "rev.pending" code in cmd_show()
Refactor the juggling of "rev.pending" and our replacement for it
amended in the preceding commit so that:

 * We use an "unsigned int" instead of an "int" for "i", this matches
   the types of "struct rev_info" itself.

 * We don't need the "count" and "objects" variables introduced in
   5d7eeee2ac (git-show: grok blobs, trees and tags, too, 2006-12-14).

   They were originally added since we'd clobber rev.pending in the
   loop without restoring it. Since the preceding commit we are
   restoring it when we handle OBJ_COMMIT, so the main for-loop can
   refer to "rev.pending" didrectly.

 * We use the "memcpy a &blank" idiom introduced in
   5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
   macro, 2021-07-01).

   This is more obvious than relying on us enumerating all of the
   relevant members of the "struct object_array" that we need to
   clear.

 * We comment on why we don't need an object_array_clear() here, see
   the analysis in [1].

1. https://lore.kernel.org/git/YuQtJ2DxNKX%2Fy70N@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:54:20 -07:00
Ævar Arnfjörð Bjarmason
055e57b7b2 log: fix a memory leak in "git show <revision>..."
Fix a memory leak in code added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14). As we iterate over a "<revision>..."
command-line and encounter ad OBJ_COMMIT we want to use our "struct
rev_info", but with a "pending" array of one element: the one commit
we're showing in the loop.

To do this 5d7eeee2ac saved away a pointer to rev.pending.objects and
rev.pending.nr for its iteration. We'd then clobber those (and alloc)
when we needed to show an OBJ_COMMIT.

We'd therefore leak the "rev.pending" we started out with, and only
free the new "rev.pending" in the "OBJ_COMMIT" case arm as
prepare_revision_walk() would draw it down.

Let's fix this memory leak. Now when we encounter an OBJ_COMMIT we
save away the "rev.pending" before clearing it. We then add a single
commit to it, which our indirect invocation of prepare_revision_walk()
will remove. After that we restore the "rev.pending".

Our "rev.pending" will then get free'd by the release_revisions()
added in f6bfea0ad0 (revisions API users: use release_revisions() in
builtin/log.c, 2022-04-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:16:28 -07:00
Linus Torvalds
04ede97211 symbolic-ref: refuse to set syntactically invalid target
You can feed absolute garbage to symbolic-ref as a target like:

  git symbolic-ref HEAD refs/heads/foo..bar

While this doesn't technically break the repo entirely (our "is it a git
directory" detector looks only for "refs/" at the start), we would never
resolve such a ref, as the ".." is invalid within a refname.

Let's flag these as invalid at creation time to help the caller realize
that what they're asking for is bogus.

A few notes:

  - We use REFNAME_ALLOW_ONELEVEL here, which lets:

     git update-ref refs/heads/foo FETCH_HEAD

    continue to work. It's unclear whether anybody wants to do something
    so odd, but it does work now, so this is erring on the conservative
    side. There's a test to make sure we didn't accidentally break this,
    but don't take that test as an endorsement that it's a good idea, or
    something we might not change in the future.

  - The test in t4202-log.sh checks how we handle such an invalid ref on
    the reading side, so it has to be updated to touch the HEAD file
    directly.

  - We need to keep our HEAD-specific check for "does it start with
    refs/". The ALLOW_ONELEVEL flag means we won't be enforcing that for
    other refs, but HEAD is special here because of the checks in
    validate_headref().

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-01 12:17:13 -07:00
Junio C Hamano
acdb1e1053 Merge branch 'mt/checkout-count-fix'
"git checkout" miscounted the paths it updated, which has been
corrected.
source: <cover.1657799213.git.matheus.bernardino@usp.br>

* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-08-01 09:58:38 -07:00
Junio C Hamano
3d8e3dc4fc Merge branch 'ds/rebase-update-ref'
"git rebase -i" learns to update branches whose tip appear in the
rebased range with "--update-refs" option.
source: <pull.1247.v5.git.1658255624.gitgitgadget@gmail.com>

* ds/rebase-update-ref:
  sequencer: notify user of --update-refs activity
  sequencer: ignore HEAD ref under --update-refs
  rebase: add rebase.updateRefs config option
  sequencer: rewrite update-refs as user edits todo list
  rebase: update refs from 'update-ref' commands
  rebase: add --update-refs option
  sequencer: add update-ref command
  sequencer: define array with enum values
  rebase-interactive: update 'merge' description
  branch: consider refs under 'update-refs'
  t2407: test branches currently using apply backend
  t2407: test bisect and rebase as black-boxes
2022-08-01 09:58:38 -07:00
Junio C Hamano
54ec7b817d Merge branch 'ro/mktree-allow-missing-fix' into maint
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.
source: <748f39a9-65aa-2110-cf92-7ddf81b5f507@roku.com>

* ro/mktree-allow-missing-fix:
  mktree: do not check type of remote objects
2022-07-27 13:00:30 -07:00
Junio C Hamano
494d31e9d6 Merge branch 'jk/diff-files-cleanup-fix' into maint
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.
source: <Ys0c0ePxPOqZ/5ck@coredump.intra.peff.net>

* jk/diff-files-cleanup-fix:
  diff-files: move misplaced cleanup label
2022-07-27 13:00:27 -07:00
ZheNing Hu
ce74de931d ls-files: introduce "--format" option
Add a new option "--format" that outputs index entries
informations in a custom format, taking inspiration
from the option with the same name in the `git ls-tree`
command.

"--format" cannot used with "-s", "-o", "-k", "-t",
" --resolve-undo","--deduplicate" and "--eol".

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-23 10:53:55 -07:00
Elijah Newren
c23fc075c6 merge: do not exit restore_state() prematurely
Previously, if the user:

* Had no local changes before starting the merge
* A merge strategy makes changes to the working tree/index but returns
  with exit status 2

Then we'd call restore_state() to clean up the changes and either let
the next merge strategy run (if there is one), or exit telling the user
that no merge strategy could handle the merge.  Unfortunately,
restore_state() did not clean up the changes as expected; that function
was a no-op if the stash was a null, and the stash would be null if
there were no local changes before starting the merge.  So, instead of
"Rewinding the tree to pristine..." as the code claimed, restore_state()
would leave garbage around in the index and working tree (possibly
including conflicts) for either the next merge strategy or for the user
after aborting the merge.  And in the case of aborting the merge, the
user would be unable to run "git merge --abort" to get rid of the
unintended leftover conflicts, because the merge control files were not
written as it was presumed that we had restored to a clean state
already.

Fix the main problem by making sure that restore_state() only skips the
stash application if the stash is null rather than skipping the whole
function.

However, there is a secondary problem -- since merge.c forks
subprocesses to do the cleanup, the in-memory index is left out-of-sync.
While there was a refresh_cache(REFRESH_QUIET) call that attempted to
correct that, that function would not handle cases where the previous
merge strategy added conflicted entries.  We need to drop the index and
re-read it to handle such cases.

(Alternatively, we could stop forking subprocesses and instead call some
appropriate function to do the work which would update the in-memory
index automatically.  For now, just do the simple fix.)

Also, add a testcase checking this, one for which the octopus strategy
fails on the first commit it attempts to merge, and thus which it
cannot handle at all and must completely bail on (as per the "exit 2"
code path of commit 98efc8f3d8 ("octopus: allow manual resolve on the
last round.", 2006-01-13)).

Reported-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren
034195ef92 merge: ensure we can actually restore pre-merge state
Merge strategies can:
  * succeed with a clean merge
  * succeed with a conflicted merge
  * fail to handle the given type of merge

If one is thinking in terms of automatic mergeability, they would use
the word "fail" instead of "succeed" for the second bullet, but I am
focusing here on ability of the merge strategy to handle the given
inputs, not on whether the given inputs are mergeable.  The third
category is about the merge strategy failing to know how to handle the
given data; examples include:

  * Passing more than 2 branches to 'recursive' or 'ort'
  * Passing 2 or fewer branches to 'octopus'
  * Trying to do more complicated merges with 'resolve' (I believe
    directory/file conflicts will cause it to bail.)
  * Octopus running into a merge conflict for any branch OTHER than
    the final one (see the "exit 2" codepath of commit 98efc8f3d8
    ("octopus: allow manual resolve on the last round.", 2006-01-13))

That final one is particularly interesting, because it shows that the
merge strategy can muck with the index and working tree, and THEN bail
and say "sorry, this strategy cannot handle this type of merge; use
something else".

Further, we do not currently expect the individual strategies to clean
up after themselves, but instead expect builtin/merge.c to do so.  For
it to be able to, it needs to save the state before trying the merge
strategy so it can have something to restore to.  Therefore, remove the
shortcut bypassing the save_state() call.

There is another bug on the restore_state() side of things, so no
testcase will be added until the next commit when we have addressed that
issue as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren
aa77ce88ed merge: make restore_state() restore staged state too
There are multiple issues at play here:

  1) If `git merge` is invoked with staged changes, it should abort
     without doing any merging, and the user's working tree and index
     should be the same as before merge was invoked.
  2) Merge strategies are responsible for enforcing the index == HEAD
     requirement. (See 9822175d2b ("Ensure index matches head before
     invoking merge machinery, round N", 2019-08-17) for some history
     around this.)
  3) Merge strategies can bail saying they are not an appropriate
     handler for the merge in question (possibly allowing other
     strategies to be used instead).
  4) Merge strategies can make changes to the index and working tree,
     and have no expectation to clean up after themselves, *even* if
     they bail out and say they are not an appropriate handler for
     the merge in question.  (The `octopus` merge strategy does this,
     for example.)
  5) Because of (3) and (4), builtin/merge.c stashes state before
     trying merge strategies and restores it afterward.

Unfortunately, if users had staged changes before calling `git merge`,
builtin/merge.c could do the following:

   * stash the changes, in order to clean up after the strategies
   * try all the merge strategies in turn, each of which report they
     cannot function due to the index not matching HEAD
   * restore the changes via "git stash apply"

But that last step would have the net effect of unstaging the user's
changes.  Fix this by adding the "--index" option to "git stash apply".
While at it, also squelch the stash apply output; we already report
"Rewinding the tree to pristine..." and don't need a detailed `git
status` report afterwards.  Also while at it, switch to using strvec
so folks don't have to count the arguments to ensure we avoided an
off-by-one error, and so it's easier to add additional arguments to
the command.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren
1369f1475b merge: fix save_state() to work when there are stat-dirty files
When there are stat-dirty files, but no files are modified,
`git stash create` exits with unsuccessful status.  This causes merge
to fail.  Copy some code from sequencer.c's create_autostash to refresh
the index first to avoid this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren
8f240b8bbb merge: do not abort early if one strategy fails to handle the merge
builtin/merge is setup to allow multiple strategies to be specified,
and it will find the "best" result and use it.  This is defeated if
some of the merge strategies abort early when they cannot handle the
merge.  Fix the logic that calls recursive and ort to not do such an
early abort, but instead return "2" or "unhandled" so that the next
strategy can try to handle the merge.

Coming up with a testcase for this is somewhat difficult, since
recursive and ort both handle nearly any two-headed merge (there is
a separate code path that checks for non-two-headed merges and
already returns "2" for them).  So use a somewhat synthetic testcase
of having the index not match HEAD before the merge starts, since all
merge strategies will abort for that.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Elijah Newren
e4cdfe84a0 merge: abort if index does not match HEAD for trivial merges
As noted in the last commit and the links therein (especially commit
9822175d2b ("Ensure index matches head before invoking merge machinery,
round N", 2019-08-17), we have had a very long history of problems with
failing to enforce the requirement that index matches HEAD when starting
a merge.

The "trivial merge" logic in builtin/merge.c is yet another such case
we previously missed.  Add a check for it to ensure it aborts if the
index does not match HEAD, and add a testcase where this fix is needed.

Note that the fix here would also incidentally be an alternative fix
for the testcase added in the last patch, but the fix in the last patch
is still needed when multiple merge strategies are in use, so tweak the
testcase from the previous commit so that it continues to exercise the
codepath added in the last commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:45:23 -07:00
Taylor Blau
db9d67f2e9 builtin/cat-file.c: support NUL-delimited input with -z
When callers are using `cat-file` via one of the stdin-driven `--batch`
modes, all input is newline-delimited. This presents a problem when
callers wish to ask about, e.g. tree-entries that have a newline
character present in their filename.

To support this niche scenario, introduce a new `-z` mode to the
`--batch`, `--batch-check`, and `--batch-command` suite of options that
instructs `cat-file` to treat its input as NUL-delimited, allowing the
individual commands themselves to have newlines present.

The refactoring here is slightly unfortunate, since we turn loops like:

    while (strbuf_getline(&buf, stdin) != EOF)

into:

    while (1) {
        int ret;
        if (opt->nul_terminated)
            ret = strbuf_getline_nul(&input, stdin);
        else
            ret = strbuf_getline(&input, stdin);

        if (ret == EOF)
            break;
    }

It's tempting to think that we could use `strbuf_getwholeline()` and
specify either `\n` or `\0` as the terminating character. But for input
on platforms that include a CR character preceeding the LF, this
wouldn't quite be the same, since `strbuf_getline(...)` will trim any
trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

In the future, we could clean this up further by introducing a variant
of `strbuf_getwholeline()` that addresses the aforementioned gap, but
that approach felt too heavy-handed for this pair of uses.

Some tests are added in t1006 to ensure that `cat-file` produces the
same output in `--batch`, `--batch-check`, and `--batch-command` modes
with and without the new `-z` option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-22 21:42:06 -07:00
Junio C Hamano
dd7c820d9e Merge branch 'js/shortlog-sort-stably'
"git shortlog -n" relied on the underlying qsort() to be stable,
which shouldn't have.  Fixed.

* js/shortlog-sort-stably:
  shortlog: use a stable sort
2022-07-22 15:04:02 -07:00
Junio C Hamano
1e11fab59c builtin/remote.c: use the right kind of STRING_LIST_INIT
Since 4a4b4cda (builtin-remote: Make "remote -v" display push urls,
2009-06-13), the string_list that was initialized with 0 in its
strdup_string member is immediately made to strdup its key strings
by flipping the strdup_string member to true.  When 183113a5
(string_list: Add STRING_LIST_INIT macro and make use of it.,
2010-07-04) has introduced STRING_LIST_INIT macros, it mechanically
replaced the initialization to STRING_LIST_INIT_NODUP.

Instead, just use the other initialization macro to make it strdup
the key from the beginning.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-20 21:46:21 -07:00
Junio C Hamano
7c683389d6 Merge branch 'jk/diff-files-cleanup-fix'
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.

* jk/diff-files-cleanup-fix:
  diff-files: move misplaced cleanup label
2022-07-19 16:40:18 -07:00