Commit graph

66103 commits

Author SHA1 Message Date
Junio C Hamano 991b4d47f0 Merge branch 'ps/avoid-unnecessary-hook-invocation-with-packed-refs'
Because a deletion of ref would need to remove it from both the
loose ref store and the packed ref store, a delete-ref operation
that logically removes one ref may end up invoking ref-transaction
hook twice, which has been corrected.

* ps/avoid-unnecessary-hook-invocation-with-packed-refs:
  refs: skip hooks when deleting uncovered packed refs
  refs: do not execute reference-transaction hook on packing refs
  refs: demonstrate excessive execution of the reference-transaction hook
  refs: allow skipping the reference-transaction hook
  refs: allow passing flags when beginning transactions
  refs: extract packed_refs_delete_refs() to allow control of transaction
2022-02-18 13:53:27 -08:00
Junio C Hamano bcd020f88e Merge branch 'pw/use-in-process-checkout-in-rebase'
Use an internal call to reset_head() helper function instead of
spawning "git checkout" in "rebase", and update code paths that are
involved in the change.

* pw/use-in-process-checkout-in-rebase:
  rebase -m: don't fork git checkout
  rebase --apply: set ORIG_HEAD correctly
  rebase --apply: fix reflog
  reset_head(): take struct rebase_head_opts
  rebase: cleanup reset_head() calls
  create_autostash(): remove unneeded parameter
  reset_head(): make default_reflog_action optional
  reset_head(): factor out ref updates
  reset_head(): remove action parameter
  rebase --apply: don't run post-checkout hook if there is an error
  rebase: do not remove untracked files on checkout
  rebase: pass correct arguments to post-checkout hook
  t5403: refactor rebase post-checkout hook tests
  rebase: factor out checkout for up to date branch
2022-02-18 13:53:27 -08:00
Junio C Hamano 867b520301 Merge branch 'cb/clear-quarantine-early-on-all-ref-update-errors'
"receive-pack" checks if it will do any ref updates (various
conditions could reject a push) before received objects are taken
out of the temporary directory used for quarantine purposes, so
that a push that is known-to-fail will not leave crufts that a
future "gc" needs to clean up.

* cb/clear-quarantine-early-on-all-ref-update-errors:
  receive-pack: purge temporary data if no command is ready to run
2022-02-18 13:53:27 -08:00
Taylor Blau e8d56ca863 CODE_OF_CONDUCT.md: update PLC members list
As part of our code of conduct, we maintain a list of active members on
the Project Leadership Committee, which serves a couple of purposes. The
details are in 3f9ef874a7 (CODE_OF_CONDUCT: mention individual
project-leader emails, 2019-09-26), but the gist is as follows:

  - It makes it clear that people with a CoC complaint may contact
    members individually as opposed to the general PLC list (in case the
    subject of their complaint has to do with one of the committee
    members).

  - It also serves as the de-facto list of people on the PLC, which
    isn't committed anywhere else in the tree.

As of [1], Peff is no longer a member of Git's Project Leadership
Committee. Let's update the list of active members accordingly [2].

This also gives us a convenient opportunity to thank Peff for his many
years of service on the PLC, during which he helped the Git community in
more ways than we can easily list here.

[1]: https://lore.kernel.org/git/YboaAe4LWySOoAe7@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/CAP8UFD2XxP9r3PJ4GQjxUbV=E1ASDq1NDgB-h+S=v-bZQ7DYwQ@mail.gmail.com/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 12:36:29 -08:00
John Cai 440c705ea6 cat-file: add --batch-command mode
Add a new flag --batch-command that accepts commands and arguments
from stdin, similar to git-update-ref --stdin.

At GitLab, we use a pair of long running cat-file processes when
accessing object content. One for iterating over object metadata with
--batch-check, and the other to grab object contents with --batch.

However, if we had --batch-command, we wouldn't need to keep both
processes around, and instead just have one --batch-command process
where we can flip between getting object info, and getting object
contents. Since we have a pair of cat-file processes per repository,
this means we can get rid of roughly half of long lived git cat-file
processes. Given there are many repositories being accessed at any given
time, this can lead to huge savings.

git cat-file --batch-command

will enter an interactive command mode whereby the user can enter in
commands and their arguments that get queued in memory:

<command1> [arg1] [arg2] LF
<command2> [arg1] [arg2] LF

When --buffer mode is used, commands will be queued in memory until a
flush command is issued that execute them:

flush LF

The reason for a flush command is that when a consumer process (A)
talks to a git cat-file process (B) and interactively writes to and
reads from it in --buffer mode, (A) needs to be able to control when
the buffer is flushed to stdout.

Currently, from (A)'s perspective, the only way is to either

1. kill (B)'s process
2. send an invalid object to stdin.

1. is not ideal from a performance perspective as it will require
spawning a new cat-file process each time, and 2. is hacky and not a
good long term solution.

With this mechanism of queueing up commands and letting (A) issue a
flush command, process (A) can control when the buffer is flushed and
can guarantee it will receive all of the output when in --buffer mode.
--batch-command also will not allow (B) to flush to stdout until a flush
is received.

This patch adds the basic structure for adding command which can be
extended in the future to add more commands. It also adds the following
two commands (on top of the flush command):

contents <object> LF
info <object> LF

The contents command takes an <object> argument and prints out the object
contents.

The info command takes an <object> argument and prints out the object
metadata.

These can be used in the following way with --buffer:

info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
flush LF
info <object> LF
flush LF

When used without --buffer:

info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
info <object> LF

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 11:21:46 -08:00
John Cai 4cf5d53b62 cat-file: add remove_timestamp helper
maybe_remove_timestamp() takes arguments, but it would be useful to have
a function that reads from stdin and strips the timestamp. This would
allow tests to pipe data into a function to remove timestamps, and
wouldn't have to always assign a variable. This is especially helpful
when the data is multiple lines.

Keep maybe_remove_timestamp() the same, but add a remove_timestamp
helper that reads from stdin.

The tests in the next patch will make use of this.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 11:21:46 -08:00
John Cai ac4e58cab9 cat-file: introduce batch_mode enum to replace print_contents
A future patch introduces a new --batch-command flag. Including --batch
and --batch-check, we will have a total of three batch modes. print_contents
is the only boolean on the batch_options sturct used to distinguish
between the different modes. This makes the code harder to read.

To reduce potential confusion, replace print_contents with an enum to
help readability and clarity.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 11:21:46 -08:00
John Cai a2c75526d2 cat-file: rename cmdmode to transform_mode
In the next patch, we will add an enum on the batch_options struct that
indicates which type of batch operation will be used: --batch,
--batch-check and the soon to be  --batch-command that will read
commands from stdin. --batch-command mode might get confused with
the cmdmode flag.

There is value in renaming cmdmode in any case. cmdmode refers to how
the result output of the blob will be transformed, either according to
--filter or --textconv. So transform_mode is a more descriptive name
for the flag.

Rename cmdmode to transform_mode in cat-file.c

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 11:21:46 -08:00
Junio C Hamano e2ac9141e6 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-17 16:25:06 -08:00
Junio C Hamano 037dbe8ed7 Merge branch 'ab/complete-show-all-commands'
The command line completion script (in contrib/) learned to
complete all Git subcommands, including the ones that are normally
hidden, when GIT_COMPLETION_SHOW_ALL_COMMANDS is used.

* ab/complete-show-all-commands:
  completion: add a GIT_COMPLETION_SHOW_ALL_COMMANDS
  completion tests: re-source git-completion.bash in a subshell
2022-02-17 16:25:06 -08:00
Junio C Hamano 6cfe518967 Merge branch 'sy/modernize-t-lib-read-tree-m-3way'
Style updates on a test script helper.

* sy/modernize-t-lib-read-tree-m-3way:
  t/lib-read-tree-m-3way: indent with tabs
  t/lib-read-tree-m-3way: modernize style
2022-02-17 16:25:05 -08:00
Junio C Hamano a4ec347888 Merge branch 'po/doc-check-ignore-markup-fix'
Typofix.

* po/doc-check-ignore-markup-fix:
  doc: check-ignore: code-quote an exclamation mark
2022-02-17 16:25:05 -08:00
Junio C Hamano ff6f1695a3 Merge branch 'js/scalar-global-options'
Scalar update.

* js/scalar-global-options:
  scalar: accept -C and -c options before the subcommand
2022-02-17 16:25:05 -08:00
Junio C Hamano 2f45f3e2bc Merge branch 'vd/sparse-clean-etc'
"git update-index", "git checkout-index", and "git clean" are
taught to work better with the sparse checkout feature.

* vd/sparse-clean-etc:
  update-index: reduce scope of index expansion in do_reupdate
  update-index: integrate with sparse index
  update-index: add tests for sparse-checkout compatibility
  checkout-index: integrate with sparse index
  checkout-index: add --ignore-skip-worktree-bits option
  checkout-index: expand sparse checkout compatibility tests
  clean: integrate with sparse index
  reset: reorder wildcard pathspec conditions
  reset: fix validation in sparse index test
2022-02-17 16:25:05 -08:00
Junio C Hamano 708cbef33a Merge branch 'jz/rev-list-exclude-first-parent-only'
"git log" and friends learned an option --exclude-first-parent-only
to propagate UNINTERESTING bit down only along the first-parent
chain, just like --first-parent option shows commits that lack the
UNINTERESTING bit only along the first-parent chain.

* jz/rev-list-exclude-first-parent-only:
  git-rev-list: add --exclude-first-parent-only flag
2022-02-17 16:25:05 -08:00
Junio C Hamano d077db1df0 Merge branch 'jz/patch-id-hunk-header-parsing-fix'
Unlike "git apply", "git patch-id" did not handle patches with
hunks that has only 1 line in either preimage or postimage, which
has been corrected.

* jz/patch-id-hunk-header-parsing-fix:
  patch-id: fix scan_hunk_header on diffs with 1 line of before/after
  patch-id: fix antipatterns in tests
2022-02-17 16:25:04 -08:00
Junio C Hamano 75ff34bcf7 Merge branch 'hn/reftable-tests'
Prepare more test scripts for the introduction of reftable.

* hn/reftable-tests:
  t5312: prepare for reftable
  t1405: mark test that checks existence as REFFILES
  t1405: explictly delete reflogs for reftable
2022-02-17 16:25:04 -08:00
Junio C Hamano 0ac270cf7c Merge branch 'tk/subtree-merge-not-ff-only'
When "git subtree" wants to create a merge, it used "git merge" and
let it be affected by end-user's "merge.ff" configuration, which
has been corrected.

* tk/subtree-merge-not-ff-only:
  subtree: force merge commit
2022-02-17 16:25:04 -08:00
Elijah Newren 4a3d86e1bb merge-ort: make informational messages from recursive merges clearer
This is another simple change with a long explanation...

merge-recursive and merge-ort are both based on the same recursive idea:
if there is more than one merge base, merge the merge bases (which may
require first merging the merge bases of the merges bases, etc.).  The
depth of the inner merge is recorded via a variable called "call_depth",
which we'll bring up again later.  Naturally, the inner merges
themselves can have conflicts and various messages generated about those
files.

merge-recursive immediately prints to stdout as it goes, at the risk of
printing multiple conflict notices for the same path separated far apart
from each other with many intervenining conflict notices for other paths
between them.  And this is true even if there are no inner merges
involved.  An example of this was given in [1] and apparently caused
some confusion:

    CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
    ...dozens of conflicts for OTHER paths...
    CONFLICT (content): Merge conflicts in B

In contrast, merge-ort collects messages and stores them by path so that
it can print them grouped by path.  Thus, the same case handled by
merge-ort would have output of the form:

    CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
    CONFLICT (content): Merge conflicts in B
    ...dozens of conflicts for OTHER paths...

This is generally helpful, but does make a separate bug more
problematic.  In particular, while merge-recursive might report the
following for a recursive merge:

      Auto-merging dir.c
      Auto-merging midx.c
      CONFLICT (content): Merge conflict in midx.c
    Auto-merging diff.c
    Auto-merging dir.c
    CONFLICT (content): Merge conflict in dir.c

merge-ort would instead report:

    Auto-merging diff.c
    Auto-merging dir.c
    Auto-merging dir.c
    CONFLICT (content): Merge conflict in dir.c
    Auto-merging midx.c
    CONFLICT (content): Merge conflict in midx.c

The fact that messages for the same file are together is probably
helpful in general, but with the indentation missing for the inner
merge it unfortunately serves to confuse.  This probably would lead
users to wonder:

  * Why is Git reporting that "dir.c" is being merged twice?
  * If midx.c has conflicts, why do I not see any when I open up the
    file and why are no conflicts shown in the index?

Fix this output confusion by changing the output to clearly
differentiate the messages for outer merges from the ones for inner
merges, changing the above output from merge-ort to:

    Auto-merging diff.c
      From inner merge:  Auto-merging dir.c
    Auto-merging dir.c
    CONFLICT (content): Merge conflict in dir.c
      From inner merge:  Auto-merging midx.c
      From inner merge:  CONFLICT (content): Merge conflict in midx.c

(Note: the number of spaces after the 'From inner merge:' is
2*call_depth).

One other thing to note here, that I didn't notice until typing up this
commit message, is that merge-recursive does not print any messages from
the inner merges by default; the extra verbosity has to be requested.
merge-ort currently has no verbosity controls and always prints these.
We may also want to change that, but for now, just make the output
clearer with these extra markings and indentation.

[1] https://lore.kernel.org/git/CAGyf7-He4in8JWUh9dpAwvoPkQz9hr8nCBpxOxhZEd8+jtqTpg@mail.gmail.com/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-17 15:11:00 -08:00
René Scharfe 97169fc361 grep: fix triggering PCRE2_NO_START_OPTIMIZE workaround
PCRE2 bug 2642 was fixed in version 10.36.  Our 95ca1f987e (grep/pcre2:
better support invalid UTF-8 haystacks, 2021-01-24) worked around it on
older versions by setting the flag PCRE2_NO_START_OPTIMIZE.  797c359978
(grep/pcre2: use compile-time PCREv2 version test, 2021-02-18) switched
it around to set the flag on 10.36 and higher instead, while it claimed
to use "the same test done at compile-time".

Switch the condition back to apply the workaround on PCRE2 versions
_before_ 10.36.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-17 14:56:56 -08:00
Derrick Stolee 26b8946421 dir: force untracked cache with core.untrackedCache
The GIT_FORCE_UNTRACKED_CACHE environment variable writes the untracked
cache more frequently than the core.untrackedCache config variable. This
is due to how read_directory() handles the creation of an untracked
cache.

Before this change, Git would not create the untracked cache extension
for an index that did not already have one. Users would need to run a
command such as 'git update-index --untracked-cache' before the index
would actually contain an untracked cache.

In particular, users noticed that the untracked cache would not appear
even with core.untrackedCache=true. Some users reported setting
GIT_FORCE_UNTRACKED_CACHE=1 in their engineering system environment to
ensure the untracked cache would be created.

The decision to not write the untracked cache without an environment
variable tracks back to fc9ecbeb9 (dir.c: don't flag the index as dirty
for changes to the untracked cache, 2018-02-05). The motivation of that
change is that writing the index is expensive, and if the untracked
cache is the only thing that needs to be written, then it is more
expensive than the benefit of the cache. However, this also means that
the untracked cache never gets populated, so the user who enabled it via
config does not actually get the extension until running 'git
update-index --untracked-cache' manually or using the environment
variable.

We have had a version of this change in the microsoft/git fork for a few
major releases now. It has been working well to get users into a good
state. Yes, that first index write is slow, but the remaining index
writes are much faster than they would be without this change.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-17 14:47:13 -08:00
Junio C Hamano 45fe28c951 The fourth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 15:14:30 -08:00
Junio C Hamano 834625bd6f Merge branch 'ab/do-not-hide-failures-in-git-dot-pm'
Git.pm update.

* ab/do-not-hide-failures-in-git-dot-pm:
  perl Git.pm: don't ignore signalled failure in _cmd_close()
2022-02-16 15:14:30 -08:00
Junio C Hamano b9f791aee6 Merge branch 'js/no-more-legacy-stash'
Removal of unused code and doc.

* js/no-more-legacy-stash:
  stash: stop warning about the obsolete `stash.useBuiltin` config setting
  stash: remove documentation for `stash.useBuiltin`
  add: remove support for `git-legacy-stash`
  git-sh-setup: remove remnant bits referring to `git-legacy-stash`
2022-02-16 15:14:30 -08:00
Junio C Hamano 9a160990ef Merge branch 'js/diff-filter-negation-fix'
"git diff --diff-filter=aR" is now parsed correctly.

* js/diff-filter-negation-fix:
  diff-filter: be more careful when looking for negative bits
  diff.c: move the diff filter bits definitions up a bit
  docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
2022-02-16 15:14:30 -08:00
Junio C Hamano 70ff41ffcf Merge branch 'en/fetch-negotiation-default-fix'
Interaction between fetch.negotiationAlgorithm and
feature.experimental configuration variables has been corrected.

* en/fetch-negotiation-default-fix:
  repo-settings: rename the traditional default fetch.negotiationAlgorithm
  repo-settings: fix error handling for unknown values
  repo-settings: fix checking for fetch.negotiationAlgorithm=default
2022-02-16 15:14:30 -08:00
Junio C Hamano 00e38ba6d8 Merge branch 'ab/auto-detect-zlib-compress2'
The build procedure has been taught to notice older version of zlib
and enable our replacement uncompress2() automatically.

* ab/auto-detect-zlib-compress2:
  compat: auto-detect if zlib has uncompress2()
2022-02-16 15:14:30 -08:00
Junio C Hamano f2cb46a6b3 Merge branch 'tb/midx-bitmap-corruption-fix'
A bug that made multi-pack bitmap and the object order out-of-sync,
making the .midx data corrupt, has been fixed.

* tb/midx-bitmap-corruption-fix:
  pack-bitmap.c: gracefully fallback after opening pack/MIDX
  midx: read `RIDX` chunk when present
  t/lib-bitmap.sh: parameterize tests over reverse index source
  t5326: move tests to t/lib-bitmap.sh
  t5326: extract `test_rev_exists`
  t5326: drop unnecessary setup
  pack-revindex.c: instrument loading on-disk reverse index
  midx.c: make changing the preferred pack safe
  t5326: demonstrate bitmap corruption after permutation
2022-02-16 15:14:29 -08:00
Junio C Hamano 90b7153806 Merge branch 'en/remerge-diff'
"git log --remerge-diff" shows the difference from mechanical merge
result and the result that is actually recorded in a merge commit.

* en/remerge-diff:
  diff-merges: avoid history simplifications when diffing merges
  merge-ort: mark conflict/warning messages from inner merges as omittable
  show, log: include conflict/warning messages in --remerge-diff headers
  diff: add ability to insert additional headers for paths
  merge-ort: format messages slightly different for use in headers
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: capture and print ll-merge warnings in our preferred fashion
  ll-merge: make callers responsible for showing warnings
  log: clean unneeded objects during `log --remerge-diff`
  show, log: provide a --remerge-diff capability
2022-02-16 15:14:29 -08:00
Junio C Hamano 34230514b8 Merge branch 'hn/reftable-coverity-fixes'
Problems identified by Coverity in the reftable code have been
corrected.

* hn/reftable-coverity-fixes:
  reftable: add print functions to the record types
  reftable: make reftable_record a tagged union
  reftable: remove outdated file reftable.c
  reftable: implement record equality generically
  reftable: make reftable-record.h function signatures const correct
  reftable: handle null refnames in reftable_ref_record_equal
  reftable: drop stray printf in readwrite_test
  reftable: order unittests by complexity
  reftable: all xxx_free() functions accept NULL arguments
  reftable: fix resource warning
  reftable: ignore remove() return value in stack_test.c
  reftable: check reftable_stack_auto_compact() return value
  reftable: fix resource leak blocksource.c
  reftable: fix resource leak in block.c error path
  reftable: fix OOB stack write in print functions
2022-02-16 15:14:28 -08:00
Junio C Hamano dd77ff8181 Merge branch 'll/doc-mktree-typofix'
Typofix.

* ll/doc-mktree-typofix:
  fix typo in git-mktree.txt
2022-02-16 15:14:26 -08:00
Junio C Hamano 9d2f9a6188 Merge branch 'ld/sparse-index-bash-completion'
The command line completion (in contrib/) learns to complete
arguments to give to "git sparse-checkout" command.

* ld/sparse-index-bash-completion:
  completion: handle unusual characters for sparse-checkout
  completion: improve sparse-checkout cone mode directory completion
  completion: address sparse-checkout issues
2022-02-16 15:14:26 -08:00
Ævar Arnfjörð Bjarmason 6ee36364eb diff.[ch]: have diff_free() free options->parseopts
The "struct option" added in 4a28847839 (diff.c: prepare to use
parse_options() for parsing, 2019-01-27) would be free'd in the case
of diff_setup_done() being called.

But not all codepaths that allocate it reach that,
e.g. "t6427-diff3-conflict-markers.sh" will now free memory that it
didn't free before. By using FREE_AND_NULL() here (which
diff_setup_done() also does) we ensure that we free the memory, and
that we won't have double-free's.

Before this running:

    ./t6427-diff3-conflict-markers.sh -vixd --run=7

Would report:

    SUMMARY: LeakSanitizer: 7823 byte(s) leaked in 6 allocation(s).

But now we'll report:

    SUMMARY: LeakSanitizer: 703 byte(s) leaked in 5 allocation(s).

I.e. the largest leak in that particular test has now been addressed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 13:50:37 -08:00
Ævar Arnfjörð Bjarmason 244c27242f diff.[ch]: have diff_free() call clear_pathspec(opts.pathspec)
Have the diff_free() function call clear_pathspec(). Since the
diff_flush() function calls this all its callers can be simplified to
rely on it instead.

When I added the diff_free() function in e900d494dc (diff: add an API
for deferred freeing, 2021-02-11) I simply missed this, or wasn't
interested in it. Let's consolidate this now. This means that any
future callers (and I've got revision.c in mind) that embed a "struct
diff_options" can simply call diff_free() instead of needing know that
it has an embedded pathspec.

This does fix a bunch of leaks, but I can't mark any test here as
passing under the SANITIZE=leak testing mode because in
886e1084d7 (builtin/: add UNLEAKs, 2017-10-01) an UNLEAK(rev) was
added, which plasters over the memory
leak. E.g. "t4011-diff-symlink.sh" would report fewer leaks with this
fix, but because of the UNLEAK() reports none.

I'll eventually loop around to removing that UNLEAK(rev) annotation as
I'll fix deeper issues with the revisions API leaking. This is one
small step on the way there, a new freeing function in revisions.c
will want to call this diff_free().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 13:50:13 -08:00
Phillip Wood 43ad3af380 xdiff: handle allocation failure when merging
Other users of xdiff such as libgit2 need to be able to handle
allocation failures. These allocation failures were previously
ignored.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:16 -08:00
Phillip Wood 4a37b80e88 xdiff: refactor a function
Use the standard "goto out" pattern rather than repeating very similar
code after checking for each error. This will simplify the next commit
that starts handling allocation failures that are currently ignored.
On error xdl_do_diff() frees the environment so we need to take care
to avoid a double free in that case. xdl_build_script() does not
assign a result unless it is successful so there is no possibility of
a double free if it fails.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:15 -08:00
Phillip Wood 61f883965f xdiff: handle allocation failure in patience diff
Other users of libxdiff such as libgit2 need to be able to handle
allocation failures. As NULL is a valid return value the function
signature is changed to be able report allocation failures.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:13 -08:00
Phillip Wood 9df0fc3d57 xdiff: fix a memory leak
Although the patience and histogram algorithms initialize the
environment they do not free it if there is an error. In contrast for
the Myers algorithm the environment is initalized in xdl_do_diff() and
it is freed if there is an error. Fix this by always initializing the
environment in xdl_do_diff() and freeing it there if there is an
error. Remove the comment in do_patience_diff() about the environment
being freed by xdl_diff() as it is not accurate because (a) xdl_diff()
does not do that if there is an error and (b) xdl_diff() is not the
only caller.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 10:58:05 -08:00
Ævar Arnfjörð Bjarmason 974c919d36 date API: add and use a date_mode_release()
Fix a memory leak in the parse_date_format() function by providing a
new date_mode_release() companion function.

By using this in "t/helper/test-date.c" we can mark the
"t0006-date.sh" test as passing when git is compiled with
SANITIZE=leak, and whitelist it to run under
"GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
"TEST_PASSES_SANITIZE_LEAK=true" to the test itself.

The other tests that expose this memory leak (i.e. take the
"mode->type == DATE_STRFTIME" branch in parse_date_format()) are
"t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
easily fixed leak in "ref-filter.c", and brings the failures in
"t6300-for-each-ref.sh" down from 51 to 48.

Fixing the remaining leaks will have to wait until there's a
release_revisions() in "revision.c", as they have to do with leaks via
"struct rev_info".

There is also a leak in "builtin/blame.c" due to its call to
parse_date_format() to parse the "blame.date" configuration. However
as it declares a file-level "static struct date_mode blame_date_mode"
to track the data, LSAN will not report it as a leak. It's possible to
get valgrind(1) to complain about it with e.g.:

    valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md

But let's focus on things LSAN complains about, and are thus
observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to
fixing memory leaks in "builtin/blame.c", but as doing so would
require some re-arrangement of cmd_blame() let's leave it for some
other time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason 2bacb83466 date API: add basic API docs
Add basic API doc comments to date.h, and while doing so move the the
parse_date_format() function adjacent to show_date(). This way all the
"struct date_mode" functions are grouped together. Documenting the
rest is one of our #leftoverbits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason f184289832 date API: provide and use a DATE_MODE_INIT
Provide and use a DATE_MODE_INIT macro. Most of the users of struct
date_mode" use it via pretty.h's "struct pretty_print_context" which
doesn't have an initialization macro, so we're still bound to being
initialized to "{ 0 }" by default.

But we can change the couple of callers that directly declared a
variable on the stack to instead use the initializer, and thus do away
with the "mode.local = 0" added in add00ba2de (date: make "local"
orthogonal to date format, 2015-09-03).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason 88c7b4c3c8 date API: create a date.h, split from cache.h
Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.

The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c), its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.

Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.

We could simply include this new header in "cache.h", but as this
change shows these functions weren't common enough to warrant
including in it in the first place. By moving them out of cache.h
changes to this API will no longer cause a (mostly) full re-build of
the project when "make" is run.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason f6c71f81f9 cache.h: remove always unused show_date_human() declaration
There has never been a show_date_human() function on the "master"
branch in git.git. This declaration was added in b841d4ff43 (Add
`human` format to test-tool, 2019-01-28).

A look at the ML history reveals that it was leftover cruft from an
earlier version of that commit[1].

1. https://lore.kernel.org/git/20190118061805.19086-5-ischis2@cox.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason 04bf052eef grep: simplify config parsing and option parsing
Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4 (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variable, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that as the previously added tests indicate this cannot be done
on-the-fly as we see the config variables, without introducing more
state keeping. I.e. if we see:

    -c grep.extendedRegexp=false
    -c grep.patternType=default
    -c extendedRegexp=true

We need to select ERE, since grep.patternType=default unselects that
variable, which normally has higher precedence, but we also need to
select BRE in cases of:

    -c grep.extendedRegexp=true \
    -c grep.extendedRegexp=false

Which would not be the case for this, which select ERE:

    -c grep.patternType=extended \
    -c grep.extendedRegexp=false

Therefore we cannot do this on-the-fly in grep_config without also
introducing tracking variables for not only the pattern type, but what
the source of that pattern type was.

So we need to decide on the pattern after our config was fully
parsed. Let's do that by deferring the decision on the pattern type
until it's time to compile it in compile_regexp().

By that time we've not only parsed the config, but also handled the
command-line options. Those will set "opt.pattern_type_option" (*not*
"opt.extended_regexp_option"!).

At that point all we need to do is see if "grep.patternType" was
UNSPECIFIED in the end (including an explicit "=default"), if so we'll
use the "grep.extendedRegexp" configuration, if any.

See my 07a3d41173 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason ae807d778f grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason 321ee43628 grep.h: make "grep_opt.pattern_type_option" use its enum
Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason 72365bb499 grep API: call grep_config() after grep_init()
The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show the previous behavior needed
to be carefully explained as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e0 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b401 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason b8db6ed826 grep.c: don't pass along NULL callback value
Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577 (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value, this will make
it clear which functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason 9725c8dda2 built-ins: trust the "prefix" from run_builtin()
Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2 (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fc (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a subsequent change to the
"prefix_length" code in grep.c in a subsequent commit, and since we're
going to the trouble of doing that let's leave behind an assert() to
promise this to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d8 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2 (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d8 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00
Ævar Arnfjörð Bjarmason a5c0ed3d83 grep tests: add missing "grep.patternType" config tests
Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail, as well as tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.

See [1] for the source of some of these tests, and their
initial (pseudocode) implementation, and [2] for a later discussion
about a breakage due to missing testing (which had been noted in [1]
all along).

1. https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/
2. https://lore.kernel.org/git/xmqqpmoczwtu.fsf@gitster.g/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-15 18:00:50 -08:00