Commit graph

22849 commits

Author SHA1 Message Date
Patrick Steinhardt 71360809ec t: do not pass GIT_TEST_OPTS to unit tests with prove
When using the prove target, we append GIT_TEST_OPTS to the arguments
that we execute each of the tests with. This doesn't only include the
intended test scripts, but also ends up passing the arguments to our
unit tests. This is unintentional though as they do not even know to
interpret those arguments, and is inconsistent with how we execute unit
tests without prove.

This isn't much of an issue because our current set of unit tests mostly
ignore their arguments anyway. With the introduction of clar-based unit
tests this is about to become an issue though, as these do parse their
command line argument to alter behaviour.

Prepare for this by passing GIT_TEST_OPTS to "run-test.sh" via an
environment variable. Like this, we can conditionally forward it to our
test scripts, only.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:41:36 -07:00
Patrick Steinhardt c3459ae9ef refs/files: use heuristic to decide whether to repack with --auto
The `--auto` flag for git-pack-refs(1) allows the ref backend to decide
whether or not a repack is in order. This switch has been introduced
mostly with the "reftable" backend in mind, which already knows to
auto-compact its tables during normal operations. When the flag is set,
then it will use the same auto-compaction mechanism and thus end up
doing nothing in most cases.

The "files" backend does not have any such heuristic yet and instead
packs any loose references unconditionally. So we rewrite the complete
"packed-refs" file even if there's only a single loose reference to be
packed.

Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using
`git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is
unconditionally executed via our auto maintenance, so we end up repacking
references every single time auto maintenance kicks in. And while that
commit already mentioned that the "files" backend unconditionally packs
refs now, the author obviously didn't quite think about the consequences
thereof. So while the idea was sound, we really should have added a
heuristic to the "files" backend before implementing it.

Introduce a heuristic that decides whether or not it is worth to pack
loose references. The important factors to decide here are the number of
loose references in comparison to the overall size of the "packed-refs"
file. The bigger the "packed-refs" file, the longer it takes to rewrite
it and thus we scale up the limit of allowed loose references before we
repack.

As is the nature of heuristics, this mechansim isn't obviously
"correct", but should rather be seen as a tradeoff between how much
resources we spend packing refs and how inefficient the ref store
becomes. For all I can say, we have successfully been using the exact
same heuristic in Gitaly for several years by now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
Patrick Steinhardt bd51dca36e t0601: merge tests for auto-packing of refs
We have two tests in t0601 which exercise the same underlying logic,
once via `git pack-refs --auto` and once via `git maintenance run
--auto`. Merge these two tests into one such that it becomes easier to
extend test coverage for both commands at the same time.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
Patrick Steinhardt b2dbf97f47 builtin/index-pack: fix segfaults when running outside of a repo
It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.

The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.

For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.

Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 07:40:00 -07:00
Phillip Wood bf6ab087d1 rebase: apply and cleanup autostash when rebase fails to start
If "git rebase" fails to start after stashing the user's uncommitted
changes then it forgets to restore the stashed changes and remove the
state directory. To make matters worse, running "git rebase --abort" to
apply the stashed changes and cleanup the state directory fails because
the state directory only contains the "autostash" file and is missing
the "head-name" and "onto" files required by read_basic_state().

Fix this by applying the autostash and removing the state directory if
the pre-rebase hook or initial checkout fail. This matches what
finish_rebase() does at the end of a successful rebase. If the user
modifies any files after the autostash is created it is possible there
will be conflicts when the autostash is applied. In that case
apply_autostash() saves the stash in a new entry under refs/stash and so
it is safe to remove the state directory containing the autostash file.

New tests are added to check the autostash is applied and the state
directory is removed if the rebase fails to start. Checks are also added
to some existing tests in order to ensure there is no state directory
left behind when a rebase fails to start and no autostash has been
created.

Reported-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03 11:24:43 -07:00
Junio C Hamano 17636cdf3b Merge branch 'ps/reftable-concurrent-compaction'
The code path for compacting reftable files saw some bugfixes
against concurrent operation.

* ps/reftable-concurrent-compaction:
  reftable/stack: fix segfault when reload with reused readers fails
  reftable/stack: reorder swapping in the reloaded stack contents
  reftable/reader: keep readers alive during iteration
  reftable/reader: introduce refcounting
  reftable/stack: fix broken refnames in `write_n_ref_tables()`
  reftable/reader: inline `reader_close()`
  reftable/reader: inline `init_reader()`
  reftable/reader: rename `reftable_new_reader()`
  reftable/stack: inline `stack_compact_range_stats()`
  reftable/blocksource: drop malloc block source
2024-09-03 09:15:03 -07:00
Junio C Hamano 8c1c63d525 Merge branch 'ps/leakfixes-part-5'
Even more leak fixes.

* ps/leakfixes-part-5:
  transport: fix leaking negotiation tips
  transport: fix leaking arguments when fetching from bundle
  builtin/fetch: fix leaking transaction with `--atomic`
  remote: fix leaking peer ref when expanding refmap
  remote: fix leaks when matching refspecs
  remote: fix leaking config strings
  builtin/fetch-pack: fix leaking refs
  sideband: fix leaks when configuring sideband colors
  builtin/send-pack: fix leaking refspecs
  transport: fix leaking OID arrays in git:// transport data
  t/helper: fix leaking multi-pack-indices in "read-midx"
  builtin/repack: fix leaks when computing packs to repack
  midx-write: fix leaking hashfile on error cases
  builtin/archive: fix leaking `OPT_FILENAME()` value
  builtin/upload-archive: fix leaking args passed to `write_archive()`
  builtin/merge-tree: fix leaking `-X` strategy options
  pretty: fix leaking key/value separator buffer
  pretty: fix memory leaks when parsing pretty formats
  convert: fix leaks when resetting attributes
  mailinfo: fix leaking header data
2024-09-03 09:15:00 -07:00
Ghanshyam Thakkar a680635e05 t: port helper/test-oid-array.c to unit-tests/t-oid-array.c
helper/test-oid-array.c along with t0064-oid-array.sh test the
oid-array.h API, which provides storage and processing
efficiency over large lists of object identifiers.

Migrate them to the unit testing framework for better runtime
performance and efficiency. As we don't initialize a repository
in these tests, the hash algo that functions like oid_array_lookup()
use is not initialized, therefore call repo_set_hash_algo() to
initialize it. And init_hash_algo():lib-oid.c can aid in this
process, so make it public.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-01 20:43:38 -07:00
Junio C Hamano 11fd53a6c2 Merge branch 'ds/sparse-diff-index'
The underlying machinery for "git diff-index" has long been made to
expand the sparse index as needed, but the command fully expanded
the sparse index upfront, which now has been taught not to do.

* ds/sparse-diff-index:
  diff-index: integrate with the sparse index
2024-08-29 11:08:17 -07:00
Junio C Hamano 839b808325 Merge branch 'cp/unit-test-reftable-block'
Another test for reftable library ported to the unit test framework.

* cp/unit-test-reftable-block:
  t-reftable-block: mark unused argv/argc
  t-reftable-block: add tests for index blocks
  t-reftable-block: add tests for obj blocks
  t-reftable-block: add tests for log blocks
  t-reftable-block: remove unnecessary variable 'j'
  t-reftable-block: use xstrfmt() instead of xstrdup()
  t-reftable-block: use block_iter_reset() instead of block_iter_close()
  t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
  t-reftable-block: use reftable_record_equal() instead of check_str()
  t-reftable-block: release used block reader
  t: harmonize t-reftable-block.c with coding guidelines
  t: move reftable/block_test.c to the unit testing framework
2024-08-29 11:08:16 -07:00
Junio C Hamano d4d677704d Merge branch 'ps/reftable-drop-generic'
The code in the reftable library has been cleaned up by discarding
unused "generic" interface.

* ps/reftable-drop-generic:
  reftable: mark unused parameters in empty iterator functions
  reftable/generic: drop interface
  t/helper: refactor to not use `struct reftable_table`
  t/helper: use `hash_to_hex_algop()` to print hashes
  t/helper: inline printing of reftable records
  t/helper: inline `reftable_table_print()`
  t/helper: inline `reftable_stack_print_directory()`
  t/helper: inline `reftable_reader_print_file()`
  t/helper: inline `reftable_dump_main()`
  reftable/dump: drop unused `compact_stack()`
  reftable/generic: move generic iterator code into iterator interface
  reftable/iter: drop double-checking logic
  reftable/stack: open-code reading refs
  reftable/merged: stop using generic tables in the merged table
  reftable/merged: rename `reftable_new_merged_table()`
  reftable/merged: expose functions to initialize iterators
2024-08-29 11:08:16 -07:00
Junio C Hamano a9bc27fb18 Merge branch 'gt/unit-test-urlmatch-normalization'
Another rewrite of test.

* gt/unit-test-urlmatch-normalization:
  t: migrate t0110-urlmatch-normalization to the new framework
2024-08-28 10:31:27 -07:00
Junio C Hamano 029c870ab5 Merge branch 'mt/rebase-x-quiet'
"git rebase -x --quiet" was not quiet, which was corrected.

* mt/rebase-x-quiet:
  rebase --exec: respect --quiet
2024-08-28 10:31:26 -07:00
Jeff King 08e83b5ec5 t-reftable-block: mark unused argv/argc
This is conceptually the same as the cases in df9d638c24 (unit-tests:
ignore unused argc/argv, 2024-08-17), but this unit test was migrated
from the reftable tests in a parallel branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 10:09:32 -07:00
Jeff King 8c90b41f0a t-reftable-readwrite: mark unused parameter in callback function
This spot was originally marked in in 4695c3f3a9 (reftable: mark unused
parameters in virtual functions, 2024-08-17), but was copied in
5b539a5361 (t: move reftable/readwrite_test.c to the unit testing
framework, 2024-08-13).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:17 -07:00
Jacob Keller 241499aba0 send-email: add mailmap support via sendemail.mailmap and --mailmap
In some cases, a user may be generating a patch for an old commit which
now has an out-of-date author or other identity. For example, consider a
team member who contributes to an internal fork of an upstream project,
but leaves before this change is submitted upstream.

In this case, the team members company address may no longer be valid,
and will thus bounce when sending email.

This can be manually avoided by editing the generated patch files, or by
carefully using --suppress-<cc|to> options. This requires a lot of
manual intervention and is easy to forget.

Git has support for mapping old email addresses and names to a canonical
name and address via the .mailmap file (and its associated mailmap.file,
mailmap.blob, and log.mailmap options).

Teach git send-email to enable mailmap support for all addresses. This
ensures that addresses point to the canonical real name and email
address.

Add the sendemail.mailmap configuration option and its associated
--mailmap (and --use-mailmap for compatibility with git log) options.
For now, the default behavior is to disable the mailmap in order to
avoid any surprises or breaking any existing setups.

These options support per-identity configuration via the
sendemail.identity configuration blocks. This enables identity-specific
configuration in cases where users may not want to enable support.

In addition, support send-email specific mailmap data via
sendemail.mailmap.file, sendemail.mailmap.blob and their
identity-specific variants.

The intention of these options is to enable mapping addresses which are
no longer valid to a current project or team maintainer. Such mappings
may change the actual person being referred to, and may not make sense
in a traditional mailmap file which is intended for updating canonical
name and address for the same individual.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:29 -07:00
Jacob Keller 3a27e991f2 check-mailmap: accept "user@host" contacts
git check-mailmap splits each provided contact using split_ident_line.
This function requires that the contact either be of the form "Name
<user@host>" or of the form "<user@host>". In particular, if the mail
portion of the contact is not surrounded by angle brackets,
split_ident_line will reject it.

This results in git check-mailmap rejecting attempts to translate simple
email addresses:

  $ git check-mailmap user@host
  fatal: unable to parse contact: user@host

This limits the usability of check-mailmap as it requires placing angle
brackets around plain email addresses.

In particular, attempting to use git check-mailmap to support mapping
addresses in git send-email is not straight forward. The sanitization
and validation functions in git send-email strip angle brackets from
plain email addresses. It is not trivial to add brackets prior to
invoking git check-mailmap.

Instead, modify check_mailmap() to allow such strings as contacts. In
particular, treat any line which cannot be split by split_ident_line as
a simple email address.

No attempt is made to actually parse the address line, or validate that
it is actually an email address. Implementing such validation is not
trivial. Besides, we weren't validating the address between angle
brackets before anyways.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:28 -07:00
Taylor Blau 125c32605a builtin/pack-objects.c: translate bit positions during pack-reuse
When reusing chunks verbatim from an existing source pack, the function
write_reused_pack() first attempts to reuse whole words (via the
function `write_reused_pack_verbatim()`), and then individual bits (via
`write_reused_pack_one()`).

In the non-MIDX case, all of this code works fine. Likewise, in the MIDX
case, processing bits individually from the first (preferred) pack works
fine. However, processing subsequent packs in the MIDX case is broken
when there are duplicate objects among the set of MIDX'd packs.

This is because we treat the individual bit positions as valid pack
positions within the source pack(s), which does not account for gaps in
the source pack, like we see when the MIDX must break ties between
duplicate objects which appear in multiple packs.

The broken code looks like:

    for (; i < reuse_packfile_bitmap->word_alloc; i++) {
            for (offset = 0; offset < BITS_IN_EWORD, offset++) {
                    /* ... */

                    write_reused_pack_one(reuse_packfile->p,
                                          pos + offset - reuse_packfile->bitmap_pos,
                                          f, pack_start, &w_curs);
            }
    }

, where the second argument is incorrect and does not account for gaps.

Instead, make sure that we translate bit positions in the MIDX's
pseudo-pack order to pack positions in the respective source packs by:

  - Translating the bit position (pseudo-pack order) to a MIDX position
    (lexical order).

  - Use the MIDX position to obtain the offset at which the given object
    occurs in the source pack.

  - Then translate that offset back into a pack relative position within
    the source pack by calling offset_to_pack_pos().

After doing this, then we can safely use the result as a pack position.
Note that when doing single-pack reuse, as well as reusing objects from
the MIDX's preferred pack, such translation is not necessary, since
either ties are broken in favor of the preferred pack, or there are no
ties to break at all (in the case of non-MIDX bitmaps).

Failing to do this can result in strange failure modes. One example that
can occur when misinterpreting bits in the above fashion is that Git
thinks it's supposed to send a delta that the caller does not want.
Under this (incorrect) assumption, we try to look up the delta's base
(so that we can patch any OFS_DELTAs if necessary). We do this using
find_reused_offset().

But if we try and call that function for an offset belonging to an
object we did not send, we'll get back garbage. This can result in us
computing a negative fixup value, which results in memory corruption
when trying to write the (patched) OFS_DELTA header.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
Taylor Blau bbc393a9f3 t/t5332-multi-pack-reuse.sh: verify pack generation with --strict
In our tests for multi-pack reuse, we have two helper functions:

  - test_pack_objects_reused_all(), and
  - test_pack_objects_reused()

which invoke pack-objects (either with `--all`, or the supplied tips via
stdin, respectively) and ensure that (a) the number of reused objects,
and (b) the number of packs which those objects were reused from both
match the expected values.

Both functions discard the output of pack-objects and assert only on the
contents of the trace2 stream.

However, if we store the pack and attempt to index it with `--strict`,
we find that a number of our tests are broken, indicating a bug within
multi-pack reuse.

That bug will be addressed in a subsequent commit. But let's first
harden these tests by trying to index the resulting pack, marking the
tests which fail appropriately.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
Junio C Hamano 3222718ad7 Merge branch 'ds/for-each-ref-is-base'
'git for-each-ref' learned a new "--format" atom to find the branch
that the history leading to a given commit "%(is-base:<commit>)" is
likely based on.

* ds/for-each-ref-is-base:
  p1500: add is-base performance tests
  for-each-ref: add 'is-base' token
  commit: add gentle reference lookup method
  commit-reach: add get_branch_base_for_tip
2024-08-26 11:32:24 -07:00
Junio C Hamano 3dd2a2feca Merge branch 'jk/send-email-translate-aliases'
"git send-email" learned "--translate-aliases" option that reads
addresses from the standard input and emits the result of applying
aliases on them to the standard output.

* jk/send-email-translate-aliases:
  send-email: teach git send-email option to translate aliases
  t9001-send-email.sh: update alias list used for pine test
  t9001-send-email.sh: fix quoting for mailrc --dump-aliases test
2024-08-26 11:32:23 -07:00
Junio C Hamano 2b30d66c43 Merge branch 'jk/mark-unused-parameters'
Mark unused parameters as UNUSED to squelch -Wunused warnings.

* jk/mark-unused-parameters:
  t-hashmap: stop calling setup() for t_intern() test
  scalar: mark unused parameters in dummy function
  daemon: mark unused parameters in non-posix fallbacks
  setup: mark unused parameter in config callback
  test-mergesort: mark unused parameters in trivial callback
  t-hashmap: mark unused parameters in callback function
  reftable: mark unused parameters in virtual functions
  reftable: drop obsolete test function declarations
  reftable: ignore unused argc/argv in test functions
  unit-tests: ignore unused argc/argv
  t/helper: mark more unused argv/argc arguments
  oss-fuzz: mark unused argv/argc argument
  refs: mark unused parameters in do_for_each_reflog_helper()
  refs: mark unused parameters in ref_store fsck callbacks
  update-ref: mark more unused parameters in parser callbacks
  imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-26 11:32:23 -07:00
Junio C Hamano 1f4d89dfce Merge branch 'tb/pseudo-merge-bitmap-fixes'
We created a useless pseudo-merge reachability bitmap that is about
0 commits, and attempted to include commits that are not in packs,
which made no sense.  These bugs have been corrected.

* tb/pseudo-merge-bitmap-fixes:
  pseudo-merge.c: ensure pseudo-merge groups are closed
  pseudo-merge.c: do not generate empty pseudo-merge commits
  t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
  pack-bitmap-write.c: select pseudo-merges even for small bitmaps
  pack-bitmap: drop redundant args from `bitmap_writer_finish()`
  pack-bitmap: drop redundant args from `bitmap_writer_build()`
  pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
  pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-26 11:32:21 -07:00
Junio C Hamano 6e6f68b59b Merge branch 'ps/maintenance-detach-fix-more'
A tests for "git maintenance" that were broken on Windows have been
corrected.

* ps/maintenance-detach-fix-more:
  builtin/maintenance: fix loose objects task emitting pack hash
  t7900: exercise detaching via trace2 regions
  t7900: fix flaky test due to leaking background job
2024-08-26 11:32:20 -07:00
Junio C Hamano 1e8962ee08 Merge branch 'ps/maintenance-detach-fix'
Maintenance tasks other than "gc" now properly go background when
"git maintenance" runs them.

* ps/maintenance-detach-fix:
  run-command: fix detaching when running auto maintenance
  builtin/maintenance: add a `--detach` flag
  builtin/gc: add a `--detach` flag
  builtin/gc: stop processing log file on signal
  builtin/gc: fix leaking config values
  builtin/gc: refactor to read config into structure
  config: fix constness of out parameter for `git_config_get_expiry()`
2024-08-26 11:32:20 -07:00
Junio C Hamano 5072ad8260 Merge branch 'xx/diff-tree-remerge-diff-fix' into maint-2.46
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-26 11:10:25 -07:00
Junio C Hamano 164cffa35c Merge branch 'rs/t-example-simplify' into maint-2.46
Unit test simplification.

* rs/t-example-simplify:
  t-example-decorate: remove test messages
2024-08-26 11:10:24 -07:00
Junio C Hamano c93649f98a Merge branch 'jc/safe-directory' into maint-2.46
Follow-up on 2.45.1 regression fix.

* jc/safe-directory:
  safe.directory: setting safe.directory="." allows the "current" directory
  safe.directory: normalize the configured path
  safe.directory: normalize the checked path
  safe.directory: preliminary clean-up
2024-08-26 11:10:24 -07:00
Junio C Hamano 24a64ea0eb Merge branch 'kl/test-fixes' into maint-2.46
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-26 11:10:21 -07:00
Junio C Hamano 710ef8a945 Merge branch 'jc/reflog-expire-lookup-commit-fix' into maint-2.46
"git reflog expire" failed to honor annotated tags when computing
reachable commits.

* jc/reflog-expire-lookup-commit-fix:
  Revert "reflog expire: don't use lookup_commit_reference_gently()"
2024-08-26 11:10:21 -07:00
Junio C Hamano 528a762ca6 Merge branch 'jc/leakfix-mailmap' into maint-2.46
Leakfix.

* jc/leakfix-mailmap:
  mailmap: plug memory leak in read_mailmap_blob()
2024-08-26 11:10:20 -07:00
Junio C Hamano 88639e5d4c Merge branch 'jc/leakfix-hashfile' into maint-2.46
Leakfix.

* jc/leakfix-hashfile:
  csum-file: introduce discard_hashfile()
2024-08-26 11:10:19 -07:00
Junio C Hamano a5e4f53baf Merge branch 'jc/jl-git-no-advice-fix' into maint-2.46
Remove leftover debugging cruft from a test script.

* jc/jl-git-no-advice-fix:
  t0018: remove leftover debugging cruft
2024-08-26 11:10:19 -07:00
Junio C Hamano 5613c83f30 Merge branch 'tb/config-fixed-value-with-valueless-true' into maint-2.46
"git config --value=foo --fixed-value section.key newvalue" barfed
when the existing value in the configuration file used the
valueless true syntax, which has been corrected.

* tb/config-fixed-value-with-valueless-true:
  config.c: avoid segfault with --fixed-value and valueless config
2024-08-26 11:10:18 -07:00
Junio C Hamano a991ffff92 Merge branch 'ps/ls-remote-out-of-repo-fix' into maint-2.46
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-26 11:10:18 -07:00
Junio C Hamano 62c5b88157 Merge branch 'ps/stash-keep-untrack-empty-fix'
A corner case bug in "git stash" was fixed.

* ps/stash-keep-untrack-empty-fix:
  builtin/stash: fix `--keep-index --include-untracked` with empty HEAD
2024-08-23 09:02:36 -07:00
Junio C Hamano 2cf9c2206c Merge branch 'ps/hash-and-ref-format-from-config'
The default object hash and ref backend format used to be settable
only with explicit command line option to "git init" and
environment variables, but now they can be configured in the user's
global and system wide configuration.

* ps/hash-and-ref-format-from-config:
  setup: make ref storage format configurable via config
  setup: make object format configurable via config
  setup: merge configuration of repository formats
  t0001: delete repositories when object format tests finish
  t0001: exercise initialization with ref formats more thoroughly
2024-08-23 09:02:36 -07:00
Junio C Hamano 668843e6d8 Merge branch 'cp/unit-test-reftable-readwrite'
* cp/unit-test-reftable-readwrite:
  t-reftable-readwrite: add test for known error
  t-reftable-readwrite: use 'for' in place of infinite 'while' loops
  t-reftable-readwrite: use free_names() instead of a for loop
  t: move reftable/readwrite_test.c to the unit testing framework
2024-08-23 09:02:35 -07:00
Junio C Hamano 5e56a39e6a Merge branch 'ps/config-wo-the-repository'
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.

* ps/config-wo-the-repository:
  config: hide functions using `the_repository` by default
  global: prepare for hiding away repo-less config functions
  config: don't depend on `the_repository` with branch conditions
  config: don't have setters depend on `the_repository`
  config: pass repo to functions that rename or copy sections
  config: pass repo to `git_die_config()`
  config: pass repo to `git_config_get_expiry_in_days()`
  config: pass repo to `git_config_get_expiry()`
  config: pass repo to `git_config_get_max_percent_split_change()`
  config: pass repo to `git_config_get_split_index()`
  config: pass repo to `git_config_get_index_threads()`
  config: expose `repo_config_clear()`
  config: introduce missing setters that take repo as parameter
  path: hide functions using `the_repository` by default
  path: stop relying on `the_repository` in `worktree_git_path()`
  path: stop relying on `the_repository` when reporting garbage
  hooks: remove implicit dependency on `the_repository`
  editor: do not rely on `the_repository` for interactive edits
  path: expose `do_git_common_path()` as `repo_common_pathv()`
  path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-23 09:02:34 -07:00
Junio C Hamano 1b6b2bfae5 Merge branch 'ps/leakfixes-part-4'
More leak fixes.

* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-23 09:02:33 -07:00
Patrick Steinhardt d857469d85 reftable/reader: introduce refcounting
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:47 -07:00
Patrick Steinhardt a0218203cd reftable/reader: rename reftable_new_reader()
Rename the `reftable_new_reader()` function to `reftable_reader_new()`
to match our coding guidelines.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 08:04:46 -07:00
Derrick Stolee b44c926c9f diff-index: integrate with the sparse index
The sparse index allows focusing the index data structure on the files
present in the sparse-checkout, leaving only tree entries for
directories not within the sparse-checkout. Each builtin needs a
repository setting to indicate that it has been tested with the sparse
index before Git will allow the index to be loaded into memory in its
sparse form. This is a safety precaution.

There are still some builtins that haven't been integrated due to the
complexity of the integration and the lack of significant use. However,
'git diff-index' was neglected only because of initial data showing low
usage. The diff machinery was already integrated and there is no more
work to be done there but add some tests to be sure 'git diff-index'
behaves as expected.

For this purpose, we can follow the testing pattern used in 51ba65b5c3
(diff: enable and test the sparse index, 2021-12-06). One difference
here is that we only verify that the sparse index case agrees with the
full index case, but do not generate the expected output. The 'git diff'
tests use the '--name-status' option to ease the creation of the
expected output, but that's not an option for 'diff-index'. Since the
underlying diff machinery is the same, a simple comparison is sufficient
to give some coverage.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:29:14 -07:00
Patrick Steinhardt 13b23d2da5 transport: fix leaking negotiation tips
We do not free negotiation tips in the transport's smart options. Fix
this by freeing them on disconnect.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt c92abe71df builtin/fetch: fix leaking transaction with --atomic
With the `--atomic` flag, we use a single ref transaction to commit all
ref updates in git-fetch(1). The lifetime of transactions is somewhat
weird: while `ref_transaction_abort()` will free the transaction, a call
to `ref_transaction_commit()` won't. We thus have to manually free the
transaction in the successful case.

Adapt the code to free the transaction in the exit path to plug the
resulting memory leak. As `ref_transaction_abort()` already freed the
transaction for us, we have to unset the transaction when we hit that
code path to not cause a double free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt 8960819e73 remote: fix leaking peer ref when expanding refmap
When expanding remote refs via the refspec in `get_expanded_map()`, we
first copy the remote ref and then override its peer ref with the
expanded name. This may cause a memory leak though in case the peer ref
is already set, as this field is being copied by `copy_ref()`, as well.

Fix the leak by freeing the peer ref before we re-assign the field.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt 5e9e04a064 remote: fix leaks when matching refspecs
In `match_explicit()`, we try to match a source ref with a destination
ref according to a refspec item. This matching sometimes requires us to
allocate a new source spec so that it looks like we expect. And while we
in some end up assigning this allocated ref as `peer_ref`, which hands
over ownership of it to the caller, in other cases we don't. We neither
free it though, causing a memory leak.

Fix the leak by creating a common exit path where we can easily free the
source ref in case it is allocated and hasn't been handed over to the
caller.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt 2a2d5da1f2 sideband: fix leaks when configuring sideband colors
We read a bunch of configs in `use_sideband_colors()` to configure the
colors that Git should use. We never free the strings read from the
config though, causing memory leaks.

Refactor the code to use `git_config_get_string_tmp()` instead, which
does not allocate memory. As we throw the strings away after parsing
them anyway there is no need to use allocated strings.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt a09efb74e3 builtin/send-pack: fix leaking refspecs
We never free data associated with the assembled refspec in
git-send-pack(1), causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt ca52234183 transport: fix leaking OID arrays in git:// transport data
The transport data for the "git://" protocol contains two OID arrays
that we never free, creating a memory leak. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt fb24460e1d t/helper: fix leaking multi-pack-indices in "read-midx"
Several of the subcommands of `test-helper read-midx` do not close the
MIDX that they have opened, leading to memory leaks. Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt 479601e9f4 builtin/archive: fix leaking OPT_FILENAME() value
The "--output" switch is an `OPT_FILENAME()` option, which allocates
memory when specified by the user. But while we free the string when
executed without the "--remote" switch, we don't otherwise because we
return via a separate exit path that doesn't know to free it.

Fix this by creating a common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt ff0935b96e builtin/merge-tree: fix leaking -X strategy options
The `-X` switch for git-merge-tree(1) will push each option into a local
`xopts` vector that we then end up parsing. The vector never gets freed
though, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt 82ea7e59b2 pretty: fix leaking key/value separator buffer
The `format_set_trailers_options()` function is responsible for parsing
a custom pretty format for trailers. It puts the parsed options into a
`struct process_trailer_options` structure, while the allocated memory
required for this will be put into separate caller-provided arguments.
It is thus the caller's responsibility to free the memory not via the
options structure, but via the other parameters.

While we do this alright for the separator and filter keys, we do not
free the memory associated with the key/value separator. Fix this to
plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt 643c6f576c convert: fix leaks when resetting attributes
When resetting parsed gitattributes, we free the list of convert drivers
parsed from the config. We only free some of the drivers' fields though
and thus have memory leaks.

Fix this by freeing all allocated convert driver fields to plug these
memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:03 -07:00
Patrick Steinhardt e5530f9c5c mailinfo: fix leaking header data
We populate the `mailinfo` arrays `p_hdr_data` and `s_hdr_data` with
data parsed from the mail headers. These arrays may end up being only
partially populated with gaps in case some of the headers do not parse
properly. This causes memory leaks because `strbuf_list_free()` will
stop iterating once it hits the first `NULL` pointer in the backing
array.

Fix this by open-coding a variant of `strbuf_list_free()` that knows to
iterate through all headers.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:03 -07:00
Patrick Steinhardt 6014639837 reftable/generic: drop interface
The `reftable_table` interface provides a generic infrastructure that
can abstract away whether the underlying table is a single table, or a
merged table. This abstraction can make it rather hard to reason about
the code. We didn't ever use it to implement the reftable backend, and
with the preceding patches in this patch series we in fact don't use it
at all anymore. Furthermore, it became somewhat useless with the recent
refactorings that made it possible to seek reftable iterators multiple
times, as these now provide generic access to tables for us. The
interface is thus redundant and only brings unnecessary complexity with
it.

Remove the `struct reftable_table` interface and its associated
functions.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt 89191232b8 t/helper: refactor to not use struct reftable_table
The `struct reftable_table` interface in our "reftable" test helper gets
used such that we can easily print either a single table, or a merged
stack. This generic interface is about to go away.

Prepare the code for this change by using merged tables instead. When
printing the stack we've already got one. When using a single table, we
can create a merged table from it to adapt.

This removes the last user of the generic interface.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt 1f39dd2ae5 t/helper: use hash_to_hex_algop() to print hashes
The "reftable" test helper uses a hand-crafted version to convert from a
raw hash to its hex variant. This was done because this code used to be
part of the reftable library, where we do not use most functions from
the Git core.

Now that the code is integrated into the "dump-reftable" helper though,
that limitation went away. Let's thus use `hash_to_hex_algop()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:48 -07:00
Patrick Steinhardt 42c424d69d t/helper: inline printing of reftable records
Move printing of reftable records into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 64a5b7a8ca t/helper: inline reftable_table_print()
Move `reftable_table_print()` into the "dump-reftable" helper. This
follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt ca74ef6ffb t/helper: inline reftable_stack_print_directory()
Move `reftable_stack_print_directory()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Note that this requires us to remove the tests for this functionality in
`reftable/stack_test.c`. The test does not really add much anyway,
because all it verifies is that we do not crash or run into an error,
and it specifically doesn't check the outputted data. Also, as the code
is now part of the test helper, it doesn't make much sense to have a
unit test for it in the first place.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 22f519a9a0 t/helper: inline reftable_reader_print_file()
Move `reftable_reader_print_file()` into the "dump-reftable" helper.
This follows the same reasoning as the preceding commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt 2b06b28fd6 t/helper: inline reftable_dump_main()
The printing functionality part of `reftable/dump.c` is really only used
by our "dump-reftable" test helper. It is certainly not generic logic
that is useful to anybody outside of Git, and the format it generates is
quite specific. Still, parts of it are used in our test suite and the
output may be useful to take a peek into reftable stacks, tables and
blocks. So while it does not make sense to expose this as part of the
reftable library, it does make sense to keep it around.

Inline the `reftable_dump_main()` function into the "dump-reftable" test
helper. This clarifies that its format is subject to change and not part
of our public interface. Furthermore, this allows us to iterate on the
implementation in subsequent patches.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:47 -07:00
Patrick Steinhardt b8ca235ca5 reftable/merged: stop using generic tables in the merged table
The merged table provides access to a reftable stack by merging the
contents of those tables into a virtual table. These subtables are being
tracked via `struct reftable_table`, which is a generic interface for
accessing either a single reftable or a merged reftable. So in theory,
it would be possible for the merged table to merge together other merged
tables.

This is somewhat nonsensical though: we only ever set up a merged table
over normal reftables, and there is no reason to do otherwise. This
generic interface thus makes the code way harder to follow and reason
about than really necessary. The abstraction layer may also have an
impact on performance, even though the extra set of vtable function
calls probably doesn't really matter.

Refactor the merged tables to use a `struct reftable_reader` for each of
the subtables instead, which gives us direct access to the underlying
tables. Adjust names accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Patrick Steinhardt 6631ed3ce7 reftable/merged: rename reftable_new_merged_table()
Rename `reftable_new_merged_table()` to `reftable_merged_table_new()`
such that the name matches our coding style.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 07:59:46 -07:00
Junio C Hamano eb630683c2 Merge branch 'jk/apply-patch-mode-check-fix'
Test fix.

* jk/apply-patch-mode-check-fix:
  t4129: fix racy index when calling chmod after git-add
2024-08-21 12:02:25 -07:00
Junio C Hamano b772c9cf2e Merge branch 'ps/bundle-outside-repo-fix'
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
  bundle: default to SHA1 when reading bundle headers
  builtin/bundle: have unbundle check for repo before opening its bundle
2024-08-21 12:02:24 -07:00
Patrick Steinhardt 8311e3b551 builtin/maintenance: fix loose objects task emitting pack hash
The "loose-objects" maintenance tasks executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.

Fix this issue by disabling stdout of the child process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:22 -07:00
Patrick Steinhardt 51a0b8a2a7 t7900: exercise detaching via trace2 regions
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, that git-maintenance(1) outputs anything at all in these
tests is a bug in the first place that we'll fix in a subsequent commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:02 -07:00
Chandra Pratap 772408fe75 t-reftable-block: add tests for index blocks
In the current testing setup, block operations are left unexercised
for index blocks. Add a test that exercises these operations for
index blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap 1528c481d7 t-reftable-block: add tests for obj blocks
In the current testing setup, block operations are left unexercised
for obj blocks. Add a test that exercises these operations for obj
blocks.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap 5cba56173b t-reftable-block: add tests for log blocks
In the current testing setup, block operations are only exercised
for ref blocks. Add another test that exercises these operations
for log blocks as well.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap abcddcef3d t-reftable-block: remove unnecessary variable 'j'
Currently, there are two variables for array indices, 'i' and 'j'.
The variable 'j' is used only once and can be easily replaced with
'i'. Get rid of 'j' and replace its occurence with 'i'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap 29ee6d5a20 t-reftable-block: use xstrfmt() instead of xstrdup()
Use xstrfmt() to assign a formatted string to a ref record's
refname instead of xstrdup(). This helps save the overhead of
a local 'char' buffer as well as makes the test more compact.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap 31216ee28a t-reftable-block: use block_iter_reset() instead of block_iter_close()
block_iter_reset() restores a block iterator to its state at the time
of initialization without freeing any memory while block_iter_close()
deallocates the memory for the iterator.

In the current testing setup, a block iterator is allocated and
deallocated for every iteration of a loop, which hurts performance.
Improve upon this by using block_iter_reset() at the start of each
iteration instead. This has the added benifit of testing
block_iter_reset(), which currently remains untested.

Similarly, remove reftable_record_release() for a reftable record
that is still in use.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:41 -07:00
Chandra Pratap c25cbcd352 t-reftable-block: use reftable_record_key() instead of strbuf_addstr()
In the current testing setup, the record key required for many block
iterator functions is manually stored in a strbuf struct and then
passed to these functions. This is not ideal when there exists a
dedicated function to encode a record's key into a strbuf, namely
reftable_record_key(). Use this function instead of manual encoding.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap e638e9c8f3 t-reftable-block: use reftable_record_equal() instead of check_str()
In the current testing setup, operations like read and write for
reftable blocks as defined by reftable/block.{c, h} are verified by
comparing only the keys of input and output reftable records. This is
not ideal because there can exist inequal reftable records with the
same key. Use the dedicated function for record comparison,
reftable_record_equal(), instead of key-based comparison.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap 353672f9f8 t-reftable-block: release used block reader
Used block readers must be released using block_reader_release() to
prevent the occurence of a memory leak. Make test_block_read_write()
conform to this statement.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap 6853b931bd t: harmonize t-reftable-block.c with coding guidelines
Harmonize the newly ported test unit-tests/t-reftable-block.c
with the following guidelines:
- Single line 'for' statements must omit curly braces.
- Structs must be 0-initialized with '= { 0 }' instead of '= { NULL }'.
- Array sizes and indices should preferably be of type 'size_t'and
  not 'int'.
- Return code variable should preferably be named 'ret', not 'n'.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Chandra Pratap 546cc0d64e t: move reftable/block_test.c to the unit testing framework
reftable/block_test.c exercises the functions defined in
reftable/block.{c, h}. Migrate reftable/block_test.c to the unit
testing framework. Migration involves refactoring the tests
to use the unit testing framework instead of reftable's test
framework and renaming the tests to follow the unit-tests'
naming conventions.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 09:41:40 -07:00
Matheus Tavares 4bdd6b7bf2 rebase --exec: respect --quiet
rebase --exec doesn't obey --quiet and ends up printing messages about
the command being executed:

  git rebase HEAD~3 --quiet --exec true
  Executing: true
  Executing: true
  Executing: true

Let's fix that by omitting the "Executing" messages when using --quiet.

Furthermore, the sequencer code includes a few calls to
term_clear_line(), which prints a special character sequence to erase
the previous line displayed on stderr (even when nothing was printed
yet). For an user running the command interactively, the net effect of
calling this function with or without --quiet is the same as the
characters are invisible in the terminal. However, when redirecting the
output to a file or piping to another command, the presence of these
invisible characters is noticeable, and it may break user expectation as
--quiet is not being respected.

We could skip the term_clear_line() calls when --quiet is used, like we
are doing with the "Executing" messages, but it makes much more sense to
condition the line cleaning upon stderr being TTY, since these
characters are really only useful for TTY outputs.

The added test checks for both these two changes.

Reported-by: Lincoln Yuji <lincolnyuji@hotmail.com>
Reported-by: Rodrigo Siqueira <siqueirajordao@riseup.net>
Signed-off-by: Matheus Tavares <matheus.tavb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 08:57:51 -07:00
Junio C Hamano 2df380c280 Merge branch 'ps/leakfixes-part-4' into ps/leakfixes-part-5
* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-20 10:15:27 -07:00
Ghanshyam Thakkar 05026637f3 t: migrate t0110-urlmatch-normalization to the new framework
helper/test-urlmatch-normalization along with
t0110-urlmatch-normalization test the `url_normalize()` function from
'urlmatch.h'. Migrate them to the unit testing framework for better
performance. And also add different test_msg()s for better debugging.

In the migration, last two of the checks from `t_url_general_escape()`
were slightly changed compared to the shell script. This involves
changing

'\'' -> '
'\!' -> !

in the urls of those checks. This is because in C strings, we don't
need to escape "'" and "!". Other than these two, all the urls were
pasted verbatim from the shell script.

Another change is the removal of a MINGW prerequisite from one of the
test. It was there because[1] on Windows, the command line is a
Unicode string, it is not possible to pass arbitrary bytes to a
program. But in unit tests we don't have this limitation.

And since we can construct strings with arbitrary bytes in C, let's
also remove the test files which contain URLs with arbitrary bytes in
the 't/t0110' directory and instead embed those URLs in the unit test
code itself.

[1]: https://lore.kernel.org/git/53CAC8EF.6020707@gmail.com/

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 10:08:28 -07:00
Jeff King a6bcb3ca01 t-hashmap: stop calling setup() for t_intern() test
Commit f24a9b78a9 (t-hashmap: mark unused parameters in callback
function, 2024-08-17) noted that the t_intern() does not need its
hashmap parameter, but we have to keep it to conform to the function
pointer interface of setup().

But since the only thing setup() does is create and tear down the
hashmap, we can just skip calling setup() entirely for this case, and
drop the unused parameters. This simplifies the code a bit.

Helped-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-20 08:33:18 -07:00
Junio C Hamano ee218ee952 Merge branch 'ps/transport-leakfix-test-updates'
Test updates.

* ps/transport-leakfix-test-updates:
  transport: mark more tests leak-free
2024-08-19 11:07:38 -07:00
Junio C Hamano b9497848df Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2024-08-19 11:07:37 -07:00
Junio C Hamano 53129a0680 Merge branch 'jc/tests-no-useless-tee'
Test fixes.

* jc/tests-no-useless-tee:
  tests: drop use of 'tee' that hides exit status
2024-08-19 11:07:37 -07:00
Junio C Hamano 4dbca805e0 Merge branch 'rs/unit-tests-test-run'
Unit-test framework has learned a simple control structure to allow
embedding test statements in-line instead of having to create a new
function to contain them.

* rs/unit-tests-test-run:
  t-strvec: use if_test
  t-reftable-basics: use if_test
  t-ctype: use if_test
  unit-tests: add if_test
  unit-tests: show location of checks outside of tests
  t0080: use here-doc test body
2024-08-19 11:07:36 -07:00
Patrick Steinhardt 759b453f9f t7900: fix flaky test due to leaking background job
One of the recently-added tests in t7900 exercises git-maintanance(1)
with the `--detach` flag, which causes it to perform maintenance in the
background. We do not wait for the backgrounded process to exit though,
which causes the process to leak outside of the test, leading to racy
behaviour.

Fix this by synchronizing with the process via a separate file
descriptor. This is the same workaround as we use in t6500, see the
function `run_and_wait_for_auto_gc ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-19 09:36:23 -07:00
Jacob Keller c038a6f1d7 send-email: teach git send-email option to translate aliases
git send-email has support for converting shorthand alias names to
canonical email addresses via the alias file. It supports a wide variety
of alias file formats based on popular email program file formats.

Other programs, such as b4, would like the ability to convert aliases in
the same way as git send-email without needing to re-implement the logic
for understanding the many file formats.

Teach git send-email a new option, --translate-aliases, which will
enable this functionality. Similar to --dump-aliases, this option works
like a new mode of operation for git send-email.

When run with --translate-aliases, git send-email reads from standard
input and converts any provided alias into its canonical name and email
according to the alias file. Each expanded name and address is printed
to standard output, one per line.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 10:03:06 -07:00
Jeff King f288a57789 test-mergesort: mark unused parameters in trivial callback
The mode_copy() function does nothing, but since it's used as a function
pointer within "struct mode", it has to conform to the interface. Mark
it to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:23 -07:00
Jeff King f24a9b78a9 t-hashmap: mark unused parameters in callback function
The t_intern() setup function doesn't operate on a hashmap, so it
ignores its parameters. But we can't drop them since it is passed as a
pointer to setup(), so we have to match the other setup functions. Mark
them to silence -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:13 -07:00
Jeff King 4695c3f3a9 reftable: mark unused parameters in virtual functions
The reftable code uses a lot of virtual function pointers, but many of
the concrete implementations do not need all of the parameters.

For the most part these are obviously fine to just mark as UNUSED (e.g.,
the empty_iterator functions unsurprisingly do not do anything). Here
are a few cases where I dug a little deeper (but still ended up just
marking them UNUSED):

  - the iterator exclude_patterns is best-effort and optional (though it
    would be nice to support in the long run as an optimization)

  - ignoring the ref_store in many transaction functions is unexpected,
    but works because the ref_transaction itself carries enough
    information to do what we need.

  - ignoring "err" for in some cases (e.g., transaction abort) is OK
    because we do not return any errors. It is a little odd for
    reftable_be_create_reflog(), though, since we do return errors
    there. We should perhaps be creating string error messages at this
    layer, but I've punted on that for now.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
Jeff King df9d638c24 unit-tests: ignore unused argc/argv
All of the unit test programs have their own cmd_main() function, but
none of them actually look at the argc/argv that is passed in.

In the long run we may want them to handle options for the test harness.
But we'd probably do that with a shared harness cmd_main(), dispatching
to the individual tests. In the meantime, let's annotate the unused
parameters to avoid triggering -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:12 -07:00
Jeff King 7046c85cce t/helper: mark more unused argv/argc arguments
This is a continuation of 126e3b3d2a (t/helper: mark unused argv/argc
arguments, 2023-03-28) to cover a few new cases:

 - test-example-tap was added since that commit

 - test-hashmap used to accept the "ignorecase" argument on the command
   line. But since most of its logic was moved to a unit-test in
   3469a23659 (t: port helper/test-hashmap.c to unit-tests/t-hashmap.c,
   2024-08-03), it now ignores its argv entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:11 -07:00
Junio C Hamano b3d175409d Merge branch 'sj/ref-fsck'
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.

* sj/ref-fsck:
  fsck: add ref name check for files backend
  files-backend: add unified interface for refs scanning
  builtin/refs: add verify subcommand
  refs: set up ref consistency check infrastructure
  fsck: add refs report function
  fsck: add a unified interface for reporting fsck messages
  fsck: make "fsck_error" callback generic
  fsck: rename objects-related fsck error functions
  fsck: rename "skiplist" to "skip_oids"
2024-08-16 12:51:51 -07:00
Junio C Hamano d07bb0cd2a Merge branch 'ps/p4-tests-updates' into maint-2.46
Perforce tests have been updated.
cf. <na5mwletzpnacietbc7pzqcgb622mvrwgrkjgjosysz3gvjcso@gzxxi7d7icr7>

* ps/p4-tests-updates:
  t98xx: mark Perforce tests as memory-leak free
  ci: update Perforce version to r23.2
  t98xx: fix Perforce tests with p4d r23 and newer
2024-08-16 12:50:56 -07:00
Junio C Hamano e6698fbfa9 Merge branch 'ks/unit-test-comment-typofix' into maint-2.46
Typofix.

* ks/unit-test-comment-typofix:
  unit-tests/test-lib: fix typo in check_pointer_eq() description
2024-08-16 12:50:56 -07:00
Junio C Hamano c09721cb63 Merge branch 'dd/notes-empty-no-edit-by-default' into maint-2.46
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-16 12:50:55 -07:00
Junio C Hamano 72a50fa03b Merge branch 'pw/add-patch-with-suppress-blank-empty' into maint-2.46
"git add -p" by users with diff.suppressBlankEmpty set to true
failed to parse the patch that represents an unmodified empty line
with an empty line (not a line with a single space on it), which
has been corrected.

* pw/add-patch-with-suppress-blank-empty:
  add-patch: use normalize_marker() when recounting edited hunk
  add-patch: handle splitting hunks with diff.suppressBlankEmpty
2024-08-16 12:50:54 -07:00
Junio C Hamano bb250b5378 Merge branch 'jc/checkout-no-op-switch-errors' into maint-2.46
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-08-16 12:50:51 -07:00
Patrick Steinhardt d2511eeae5 setup: make ref storage format configurable via config
Similar to the preceding commit, introduce a new "init.defaultRefFormat"
config that allows the user to globally set the ref storage format used
by newly created repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:22 -07:00
Patrick Steinhardt 0c22e09b73 setup: make object format configurable via config
The object format for repositories can either be configured explicitly
by passing the `--object-format=` option to git-init(1) or git-clone(1),
or globally by setting the `GIT_DEFAULT_HASH` environment variable.
While the former makes sense, setting random environment variables is
not really a good user experience in case someone decides to only use
SHA256 repositories.

It is only natural to expect for a user that things like this can also
be configured via their config. As such, introduce a new config
"init.defaultObjectFormat", similar to "init.defaultBranch", that allows
the user to configure the default object format when creating new repos.

The precedence order now is the following, where the first one wins:

  1. The `--object-format=` switch.

  2. The `GIT_DEFAULT_HASH` environment variable.

  3. The `init.defaultObjectFormat` config variable.

This matches the typical precedence order we use in Git. We typically
let the environment override the config such that the latter can easily
be overridden on an ephemeral basis, for example by scripts.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
Patrick Steinhardt 7689f6cbd1 t0001: delete repositories when object format tests finish
The object format tests create one-shot repositories that are only used
by the respective test, but never delete them. This makes it hard to
pick a proper repository name in subsequent tests, as more and more
names are taken already.

Delete these repositories via `test_when_finished`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
Patrick Steinhardt 05d20915bc t0001: exercise initialization with ref formats more thoroughly
While our object format tests for git-init(1) exercise tests with all
known formats in t0001, the tests for the ref format don't. This leads
to some missing test coverage for interesting cases, like whether or not
a non-default ref storage format causes us to bump the repository format
version. We also don't test for the precedence of the `--ref-format=`
and the `GIT_DEFAULT_REF_FORMAT=` environment variable.

Extend the test suite to cover more scenarios related to the ref format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:55:21 -07:00
Patrick Steinhardt e3209bd4df builtin/stash: fix --keep-index --include-untracked with empty HEAD
It was reported that creating a stash with `--keep-index
--include-untracked` causes an error when HEAD points to a commit whose
tree is empty:

    $ git stash push --keep-index --include-untracked
    error: pathspec ':/' did not match any file(s) known to git

This error comes from `git checkout --no-overlay $i_tree -- :/`, which
we execute to reset the working tree to the state in our index. As the
tree generated from the index is empty in our case, ':/' does not match
any files and thus causes git-checkout(1) to error out.

Fix the issue by skipping the checkout when the index tree is empty. As
explained in the in-code comment, this should be the correct thing to do
as there is nothing that we'd have to reset in the first place.

Reported-by: Piotr Siupa <piotrsiupa@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:50:33 -07:00
Patrick Steinhardt 98077d06b2 run-command: fix detaching when running auto maintenance
In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each others feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users which
use the git-gc(1) task, only. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change, it will
remain in the foreground unless they pass the `--detach` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt a6affd3343 builtin/maintenance: add a --detach flag
Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt c7185df01b builtin/gc: add a --detach flag
When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt 0ce44e2293 builtin/gc: fix leaking config values
We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.

Note that there is one small gotcha here with the "--prune" option. Next
to passing a string, this option also accepts the "--no-prune" option
that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Junio C Hamano 0da7673a51 Merge branch 'xx/diff-tree-remerge-diff-fix'
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-15 13:22:16 -07:00
Junio C Hamano e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
Junio C Hamano 88457a6151 Merge branch 'ps/submodule-ref-format'
Support to specify ref backend for submodules has been enhanced.

* ps/submodule-ref-format:
  object: fix leaking packfiles when closing object store
  submodule: fix leaking seen submodule names
  submodule: fix leaking fetch tasks
  builtin/submodule: allow "add" to use different ref storage format
  refs: fix ref storage format for submodule ref stores
  builtin/clone: propagate ref storage format to submodules
  builtin/submodule: allow cloning with different ref storage format
  git-submodule.sh: break overly long command lines
2024-08-15 13:22:14 -07:00
Junio C Hamano 6891103f72 Merge branch 'ag/t7004-modernize'
Coding style fixes to a test script.

* ag/t7004-modernize:
  t7004: make use of write_script
  t7004: use single quotes instead of double quotes
  t7004: begin the test body on the same line as test_expect_success
  t7004: description on the same line as test_expect_success
  t7004: do not prepare things outside test_expect_success
  t7004: use indented here-doc
  t7004: one command per line
  t7004: remove space after redirect operators
2024-08-15 13:22:14 -07:00
Junio C Hamano 69b737999c Merge branch 'ps/reftable-stack-compaction'
The code paths to compact multiple reftable files have been updated
to correctly deal with multiple compaction triggering at the same
time.

* ps/reftable-stack-compaction:
  reftable/stack: handle locked tables during auto-compaction
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: simplify tracking of table locks
  reftable/stack: update stats on failed full compaction
  reftable/stack: test compaction with already-locked tables
  reftable/stack: extract function to setup stack with N tables
  reftable/stack: refactor function to gather table sizes
2024-08-15 13:22:13 -07:00
Junio C Hamano a3d71f2076 Merge branch 'gt/unit-test-hashmap'
An existing test of hashmap API has been rewritten with the
unit-test framework.

* gt/unit-test-hashmap:
  t: port helper/test-hashmap.c to unit-tests/t-hashmap.c
2024-08-15 13:22:12 -07:00
Junio C Hamano f6df5e2d05 Merge branch 'jc/t3206-test-when-finished-fix'
Test clean-up.

* jc/t3206-test-when-finished-fix:
  t3206: test_when_finished before dirtying operations, not after
2024-08-15 13:22:12 -07:00
Junio C Hamano 402f36f33e Merge branch 'rs/t-example-simplify'
Unit test simplification.

* rs/t-example-simplify:
  t-example-decorate: remove test messages
2024-08-15 13:22:11 -07:00
Junio C Hamano 0ed3dde067 Merge branch 'jc/safe-directory'
Follow-up on 2.45.1 regression fix.

* jc/safe-directory:
  safe.directory: setting safe.directory="." allows the "current" directory
  safe.directory: normalize the configured path
  safe.directory: normalize the checked path
  safe.directory: preliminary clean-up
2024-08-15 13:22:11 -07:00
Taylor Blau a72dfab8b8 pseudo-merge.c: ensure pseudo-merge groups are closed
When generating pseudo-merge bitmaps, it's possible that concurrent
reference updates may reveal some pseudo-merge candidates which reach
objects that are not contained in the bitmap's pack or pseudo-pack
order (in the case of MIDX bitmaps).

The latter case is relatively easy to demonstrate: if we generate a MIDX
bitmap with only half of the repository packed, then the unpacked
contents are not part of the MIDX's object order.

If we happen to select one or more commit(s) from the unpacked portion
of the repository for inclusion in a pseudo-merge, we'll get the
following message when trying to generate its bitmap:

    $ git multi-pack-index write --bitmap
    [...]
    Selecting pseudo-merge commits: 100% (1/1), done.
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object ... is missing)
    Building bitmaps:  50% (1/2), done.
    error: could not write multi-pack bitmap

, and the attempted bitmap write will fail, leaving the repository
without a current bitmap.

Rectify this by ensuring that the commits which are pseudo-merge
candidates can only be so if they appear somewhere in the packing order.

This is sufficient, since we know that the original packing order is
closed under reachability, so if a commit appears in that list as a
potential pseudo-merge candidate, we know that everything reachable from
it also appears in the list (and thus the candidate is a good one).

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:32:28 -07:00
Taylor Blau 25b78668de pseudo-merge.c: do not generate empty pseudo-merge commits
The previous commit demonstrated it is possible to generate empty
pseudo-merge commits, which is not useful as such pseudo-merges carry no
information.

Ensure that we only generate non-empty groups by not pushing a new
commit onto the bitmap_writer when that commit has no parents.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:29:15 -07:00
Taylor Blau 42f80e361c t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
Demonstrate that it is possible to generate empty pseudo-merge commits
in certain cases.

In the below instance, we generate one non-empty pseudo-merge
(containing commit "base"), and one empty pseudo-merge group
(corresponding to the unstable commits within that group).

(In my testing, the pseudo-merge machinery seems to handle empty groups
just fine, but generating them is pointless as they carry no
information.)

This commit (introducing a deliberate "test_expect_failure") is split
out from the actual fix (which will appear in the following commit) to
demonstrate that the failure is correctly induced.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:26:35 -07:00
Jeff King 49e5cc5b26 t4129: fix racy index when calling chmod after git-add
This patch fixes a racy test failure in t4129.

The deletion test added by e95d515141 (apply: canonicalize modes read
from patches, 2024-08-05) wants to make sure that git-apply does not
complain about a non-canonical mode in the patch, even if that mode does
not match the working tree file. So it does this:

	echo content >non-canon &&
	git add non-canon &&
	chmod 666 non-canon &&

This is wrong, because running chmod will update the ctime on the file,
making it stat-dirty and causing git-apply to refuse to apply the patch.
But this only happens sometimes, since it depends on the timestamps
crossing a second boundary (but it triggers pretty quickly when run with
--stress).

We can fix this by doing the chmod before updating the index. The order
isn't important here, as the mode will be canonicalized to 100644 in the
index anyway (in fact, the chmod is not even that important in the first
place, since git-apply will only look at the index; I only added it as
an extra confirmation that git-apply would not be confused by it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 09:41:11 -07:00
Junio C Hamano 9b4df82634 Merge branch 'ps/reftable-stack-compaction' into ps/reftable-drop-generic
* ps/reftable-stack-compaction:
  reftable/stack: handle locked tables during auto-compaction
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: simplify tracking of table locks
  reftable/stack: update stats on failed full compaction
  reftable/stack: test compaction with already-locked tables
  reftable/stack: extract function to setup stack with N tables
  reftable/stack: refactor function to gather table sizes
2024-08-15 08:22:03 -07:00
Junio C Hamano d639123742 Merge branch 'tb/t7704-deflake'
A test that fails on an unusually slow machine was found, and made
less likely to cause trouble by lengthening the expiry value it
uses.

* tb/t7704-deflake:
  t/t7704-repack-cruft.sh: avoid failures during long-running tests
2024-08-14 14:54:58 -07:00
Junio C Hamano 7b11e20bff Merge branch 'cp/unit-test-reftable-tree'
A test in reftable library has been rewritten using the unit test
framework.

* cp/unit-test-reftable-tree:
  t-reftable-tree: improve the test for infix_walk()
  t-reftable-tree: add test for non-existent key
  t-reftable-tree: split test_tree() into two sub-test functions
  t: move reftable/tree_test.c to the unit testing framework
  reftable: remove unnecessary curly braces in reftable/tree.c
2024-08-14 14:54:56 -07:00
Junio C Hamano 61fd5de05f Merge branch 'kl/test-fixes'
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-14 14:54:55 -07:00
Junio C Hamano 494c9788e4 Merge branch 'jc/reflog-expire-lookup-commit-fix'
"git reflog expire" failed to honor annotated tags when computing
reachable commits.

* jc/reflog-expire-lookup-commit-fix:
  Revert "reflog expire: don't use lookup_commit_reference_gently()"
2024-08-14 14:54:55 -07:00
Junio C Hamano c147b41f4c Merge branch 'jc/leakfix-mailmap'
Leakfix.

* jc/leakfix-mailmap:
  mailmap: plug memory leak in read_mailmap_blob()
2024-08-14 14:54:54 -07:00
Junio C Hamano dfaa04f3c6 Merge branch 'jc/leakfix-hashfile'
Leakfix.

* jc/leakfix-hashfile:
  csum-file: introduce discard_hashfile()
2024-08-14 14:54:53 -07:00
Junio C Hamano 44773b9f70 Merge branch 'jc/patch-id'
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
2024-08-14 14:54:53 -07:00
Junio C Hamano 5a74eb07ca Merge branch 'jc/jl-git-no-advice-fix'
Remove leftover debugging cruft from a test script.

* jc/jl-git-no-advice-fix:
  t0018: remove leftover debugging cruft
2024-08-14 14:54:51 -07:00
Junio C Hamano 4cf2f1be56 Merge branch 'tb/config-fixed-value-with-valueless-true'
"git config --value=foo --fixed-value section.key newvalue" barfed
when the existing value in the configuration file used the
valueless true syntax, which has been corrected.

* tb/config-fixed-value-with-valueless-true:
  config.c: avoid segfault with --fixed-value and valueless config
2024-08-14 14:54:51 -07:00
Junio C Hamano 0b2c4bc3ff Merge branch 'jk/apply-patch-mode-check-fix'
The patch parser in 'git apply' has been a bit more lenient against
unexpected mode bits, like 100664, recorded on extended header lines.

* jk/apply-patch-mode-check-fix:
  apply: canonicalize modes read from patches
2024-08-14 14:54:50 -07:00
Junio C Hamano 760348212b Merge branch 'ps/ls-remote-out-of-repo-fix'
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-14 14:54:49 -07:00
Junio C Hamano 4bad0119f2 Merge branch 'rh/http-proxy-path'
The value of http.proxy can have "path" at the end for a socks
proxy that listens to a unix-domain socket, but we started to
discard it when we taught proxy auth code path to use the
credential helpers, which has been corrected.

* rh/http-proxy-path:
  http: do not ignore proxy path
2024-08-14 14:54:49 -07:00
Junio C Hamano d65332f241 Merge branch 'cp/unit-test-reftable-pq'
The tests for "pq" part of reftable library got rewritten to use
the unit test framework.

* cp/unit-test-reftable-pq:
  t-reftable-pq: add tests for merged_iter_pqueue_top()
  t-reftable-pq: add test for index based comparison
  t-reftable-pq: make merged_iter_pqueue_check() callable by reference
  t-reftable-pq: make merged_iter_pqueue_check() static
  t: move reftable/pq_test.c to the unit testing framework
  reftable: change the type of array indices to 'size_t' in reftable/pq.c
  reftable: remove unnecessary curly braces in reftable/pq.c
2024-08-14 14:54:48 -07:00
Junio C Hamano 4385f8a52d Merge branch 'ps/leakfixes-part-3'
More leakfixes.

* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-14 14:54:47 -07:00
Jacob Keller bbc04b0094 t9001-send-email.sh: update alias list used for pine test
The set of aliases used for the pine --dump-aliases test do not
perfectly mesh with the way the pine address book is defined. While
technically all valid, there are some oddities including bob's name
being partially split so that the actual address is returned as
"Bobbyton <bob@example.com>". A strict reading of the pine documentation
indicates that the address should either be of the form
"address@domain" or a comma separated list of address, name/address
pairs, or other aliases enclosed by ().

The parsing implementation in git-send-email is not as strict, but it
makes sense to ensure the test data used is. Although the --dump-aliases
test does not make use of the address data, it is helpful to avoid
giving future developers the wrong impression of the file format.

Also add an alias which translates to multiple addresses using the ()
format.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 12:13:00 -07:00
Derrick Stolee 4b707a6e99 p1500: add is-base performance tests
The previous two changes introduced a commit walking heuristic for finding
the most likely base branch for a given source. This algorithm walks
first-parent histories until reaching a collision.

This walk _should_ be very fast. Exceptions include cases where a
commit-graph file does not exist, leading to a full walk of all reachable
commits to compute generation numbers, or a case where no collision in the
first-parent history exists, leading to a walk of all first-parent history
to the root commits.

The p1500 test script guarantees a complete commit-graph file during its
setup, so we will not test that scenario. Do create a new root commit in an
effort to test the scenario of parallel first-parent histories.

Even with the extra root commit, these tests take no longer than 0.02
seconds on my machine for the Git repository. However, the results are
slightly more interesting in a copy of the Linux kernel repository:

Test
---------------------------------------------------------------
1500.2: ahead-behind counts: git for-each-ref              0.12
1500.3: ahead-behind counts: git branch                    0.12
1500.4: ahead-behind counts: git tag                       0.12
1500.5: contains: git for-each-ref --merged                0.04
1500.6: contains: git branch --merged                      0.04
1500.7: contains: git tag --merged                         0.04
1500.8: is-base check: test-tool reach (refs)              0.03
1500.9: is-base check: test-tool reach (tags)              0.03
1500.10: is-base check: git for-each-ref                   0.03
1500.11: is-base check: git for-each-ref (disjoint-base)   0.07

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:06 -07:00
Derrick Stolee 9c1732ca11 for-each-ref: add 'is-base' token
The previous change introduced the get_branch_base_for_tip() method in
commit-reach.c. The motivation of that change was about using a heuristic to
deteremine the base branch for a source commit from a list of candidate
commit tips. This change makes that algorithm visible to users via a new
atom in the 'git for-each-ref' format. This change is very similar to the
chang in 49abcd21da (for-each-ref: add ahead-behind format atom,
2023-03-20).

Introduce the 'is-base:<source>' atom, which will indicate that the
algorithm should be computed and the result of the algorithm is reported
using an indicator of the form '(<source>)'. For example, using
'%(is-base:HEAD)' would result in one line having the token '(HEAD)'.

Use the sorted order of refs included in the ref filter to break ties in the
algorithm's heuristic. In the previous change, the motivating examples
include using an L0 trunk, long-lived L1 branches, and temporary release
branches. A caller could communicate the ordered preference among these
categories using the input refpecs and avoiding a different sort mechanism.
This sorting behavior is tested in the test scripts.

It is important to include this atom as a special case to
can_do_iterative_format() to match the expectations created in bd98f9774e
(ref-filter.c: filter & format refs in the same callback, 2023-11-14). The
ahead-behind atom was one of the special cases, and this similarly requires
using an algorithm across all input refs before starting the format of any
single ref.

In the test script, the format tokens use colons or lack whitespace to avoid
Git complaining about trailing whitespace errors.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:06 -07:00
Derrick Stolee e32eaf73b0 commit-reach: add get_branch_base_for_tip
Add a new reachability algorithm that intends to discover (from a heuristic)
which branch was used as the starting point for a given commit. Add focused
tests using the 'test-tool reach' command.

In repositories that use pull requests (or merge requests) to advance one or
more "protected" branches, the history of that reference can be recovered by
following the first-parent history in most cases. Most are completed using
no-fast-forward merges, though squash merges are quite common. Less common
is rebase-and-merge, which still validates this assumption. Finally, the
case that breaks this assumption is the fast-forward update (with potential
rebasing).  Even in this case, the previous commit commonly appears in the
first-parent history of the branch.

Similar assumptions can be made for a topic branch created by a single user
with the intention to merge back into another branch. Using 'git commit',
'git merge', and 'git cherry-pick' from HEAD will default to having the
first-parent commit be the previous commit at HEAD. This history changes
only with commands such as 'git reset' or 'git rebase', where the command
names also imply that the branch is starting from a new location.

With this movement of branches in mind, the following heuristic is proposed
as a way to determine the base branch for a given source branch:

  Among a list of candidate base branches, select the candidate that
  minimizes the number of commits in the first-parent history of the source
  that are not in the first-parent history of the candidate.

Prior third-party solutions to this problem have used this optimization
criteria, but have relied upon extracting the first-parent history and
comparing those lists as tables instead of using commit-graph walks.

Given current command-line interface options, this optimization criteria is
not easy to detect directly. Even using the command

  git rev-list --count --first-parent <base>..<source>

does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'. This
may lead to one asking if we should instead be using the full reachability
of the candidate and only the first-parent history of the source. This,
unfortunately, does not work for repositories that use long-lived branches
and automation to merge across those branches.

In extremely large repositories, merging into a single trunk may not be
feasible.  This is usually due to the desired frequency of updates
(thousands of engineers doing daily work) combined with the time required to
perform a validation build.  These factors combine to create significant
risk of semantic merge conflicts, leading to build breaks on the trunk. In
response, repository maintainers can create a single Level Zero (L0) trunk
and multiple Level One (L1) branches. By partitioning the engineers by
organization, these engineers may see lower risk of semantic merge conflicts
as well as be protected against build breaks in other L1 branches. The key
to making this system work is a semi-automated process of merging L1
branches into the L0 trunk and vice-versa.  In a large enough organization,
these L1 branches may further split into L2 or L3 branches, but the same
principles apply for merging across deeper levels.

If these automated merges use a typical merge with the second parent
bringing in the "new" content, then each L0 and L1 branch can track its
previous positions by following first-parent history, which appear as
parallel paths (until reaching the first place where the branches diverged).
If we also walk to second parents, then the histories overlap significantly
and cannot be distinguished except for very-recent changes.

For this reason, the first-parent condition should be symmetrical across the
base and source branches.

Another common case for desiring the result of this optimization method is
the use of release branches. When releasing a version of a repository, a
branch can be used to track that release. Any updates that are worth fixing
in that release can be merged to the release branch and shipped with only
the necessary fixes without any new features introduced in the trunk branch.
The 'maint-2.<X>' branches represent this pattern in the Git project. The
microsoft/git fork uses 'vfs-2.<X>.<Y>' branches to track the changes that
are custom to that fork on top of each upstream Git release 2.<X>.<Y>. This
application doesn't need the symmetrical first-parent condition, but the use
of first-parent histories does not change the results for these branches.

To determine the base branch from a list of candidates, create a new method
in commit-reach.c that performs a single* commit-graph walk. The core
concept is to walk first-parents starting at the candidate bases and the
source, tracking the "best" base to reach a given commit. Use generation
numbers to ensure that a commit is walked at most once and all children have
been explored before visiting it.  When reaching a commit that is reachable
from both a base and the source, we will then have a guarantee that this is
the closest intersection of first-parent histories. Track the best base to
reach that commit and return it as a result. In rare cases involving
multiple root commits, the first-parent history of the source may never
intersect any of the candidates and thus a null result is returned.

* There are up to two walks, since we require all commits to have a computed
  generation number in order to avoid incorrect results. This is similar to
  the need for computed generation numbers in ahead_behind() as implemented
  in fd67d149bd (commit-reach: implement ahead_behind() logic, 2023-03-20).

In order to track the "best" base, use a new commit slab that stores an
integer.  This value defaults to zero upon initialization, so use -1 to
track that the source commit can reach this commit and use 'i + 1' to track
that the ith base can reach this commit. When multiple bases can reach a
commit, minimize the index to break ties. This allows the caller to specify
an order to the bases that determines some amount of preference when the
heuristic does not result in a unique result.

The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the source.
This is noticed when viewing the first parent and seeing that it has a slab
value that differs in sign (negative or positive). In this case, the
collision commit is stored in the method variable 'branch_point' and its
slab value is set to -1. The index of the best base (so far) is stored in
the method variable 'best_index'. It is possible that there are multiple
commits that have the branch_point as its first parent, leading to multiple
updates of best_index.  The result is determined when 'branch_point' is
visited in the commit walk, giving the guarantee that all commits that could
reach 'branch_point' were visited.

Several interesting cases of collisions and different results are tested in
the t6600-test-reach.sh script. Recall that this script also tests the
algorithm in three possible states involving the commit-graph file and how
many commits are written in the file. This provides some coverage of the
need (and lack of need) for the ensure_generations_valid() method.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:10:05 -07:00
Patrick Steinhardt 77d4b3dd73 builtin/diff: free symmetric diff members
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:02 -07:00
Patrick Steinhardt 36f971f861 diff: free state populated via options
The `objfind` and `anchors` members of `struct diff_options` are
populated via option parsing, but are never freed in `diff_free()`. Fix
this to plug those memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 0aaca0ec09 builtin/log: fix leak when showing converted blob contents
In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 38678e5df5 userdiff: fix leaking memory for configured diff drivers
The userdiff structures may be initialized either statically on the
stack or dynamically via configuration keys. In the latter case we end
up leaking memory because we didn't have any infrastructure to discern
those strings which have been allocated statically and those which have
been allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 1bc158e750 builtin/format-patch: fix various trivial memory leaks
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 6b15d9ca7f diff: fix leak when parsing invalid ignore regex option
When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
Patrick Steinhardt 4dfd4f1dfe unpack-trees: clear index when not propagating it
When provided a pointer to a destination index, then `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00