Commit graph

12262 commits

Author SHA1 Message Date
Patrick Steinhardt 860b678016 builtin/repack: fix leaking line buffer when packing promisors
In `repack_promisor_objects()` we read output from git-pack-objects(1)
line by line, using `strbuf_getline_lf()`. We never free the line
buffer, causing a memory leak. Plug it.

This leak is being hit in t5616, but plugging it alone is not
sufficient to make the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:12 -07:00
Patrick Steinhardt b8849e236f gpg-interface: fix misdesigned signing key interfaces
The interfaces to retrieve signing keys and their IDs are misdesigned as
they return string constants even though they indeed allocate memory,
which leads to memory leaks. Refactor the code to instead always return
allocated strings and let the callers free them accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
Patrick Steinhardt 1e8cb17ac5 builtin/submodule--helper: fix leaking refs on push-check
In the push-check subcommand of the submodule helper we acquire a list
of local refs, but never free that list. Fix this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:11 -07:00
Patrick Steinhardt 7eb6f02c55 builtin/push: fix leaking refspec query result
When appending a refspec via `refspec_append_mapped()` we leak the
result of `query_refspecs()`. The overall logic around refspec queries
is quite weird, as callers are expected to either set the `src` or `dst`
pointers, and then the (allocated) result will be in the respective
other struct member.

As we have the `src` member set, plugging the memory leak is thus as
easy as just freeing the `dst` member. While at it, use designated
initializers to initialize the structure.

This leak was exposed by t5516, but fixing it is not sufficient to make
the whole test suite leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05 08:49:10 -07:00
Kevin Lyles e65b0c7c36 builtin/cat-file: mark 'git cat-file' sparse-index compatible
This change affects how 'git cat-file' works with the index when
specifying an object with the ":<path>" syntax (which will give file
contents from the index).

'git cat-file' expands a sparse index to a full index any time contents
are requested from the index by specifying an object with the ":<path>"
syntax. This is true even when the requested file is part of the sparse
index, and results in much slower 'git cat-file' operations when working
within the sparse index.

Mark 'git cat-file' as not needing a full index, so that you only pay
the cost of expanding the sparse index to a full index when you request
a file outside of the sparse index.

Add tests to ensure both that:
- 'git cat-file' returns the correct file contents whether or not the
  file is in the sparse index
- 'git cat-file' expands to the full index any time you request
  something outside of the sparse index

Signed-off-by: Kevin Lyles <klyles+github@epic.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 09:19:04 -07:00
Patrick Steinhardt b2dbf97f47 builtin/index-pack: fix segfaults when running outside of a repo
It was reported that git-verify-pack(1) has started to crash with Git
v2.46.0 when run outside of a repository. This is another fallout from
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), where we have stopped setting the default hash algorithm
for `the_repository`. Consequently, code that relies on `the_hash_algo`
will now crash when it hasn't explicitly been initialized, which may be
the case when running outside of a Git repository.

The crash is not in git-verify-pack(1) but instead in git-index-pack(1),
which gets called by the former. Ideally, both of these programs should
be able to identify the hash algorithm used by the packfile and index
without having to rely on external information. But unfortunately, the
format for neither of them is completely self-describing, so it is not
possible to derive that information. This is a design issue that we
should address by introducing a new packfile version that encodes its
object hash.

For now though the more important fix is to not make either of these
programs crash anymore, which we do by falling back to SHA1 when the
object hash is unconfigured. This pessimizes reading packfiles which
use a different hash than SHA1, but restores previous behaviour.

Reported-by: Ilya K <me@0upti.me>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 07:40:00 -07:00
Phillip Wood bf6ab087d1 rebase: apply and cleanup autostash when rebase fails to start
If "git rebase" fails to start after stashing the user's uncommitted
changes then it forgets to restore the stashed changes and remove the
state directory. To make matters worse, running "git rebase --abort" to
apply the stashed changes and cleanup the state directory fails because
the state directory only contains the "autostash" file and is missing
the "head-name" and "onto" files required by read_basic_state().

Fix this by applying the autostash and removing the state directory if
the pre-rebase hook or initial checkout fail. This matches what
finish_rebase() does at the end of a successful rebase. If the user
modifies any files after the autostash is created it is possible there
will be conflicts when the autostash is applied. In that case
apply_autostash() saves the stash in a new entry under refs/stash and so
it is safe to remove the state directory containing the autostash file.

New tests are added to check the autostash is applied and the state
directory is removed if the rebase fails to start. Checks are also added
to some existing tests in order to ensure there is no state directory
left behind when a rebase fails to start and no autostash has been
created.

Reported-by: Brian Lyles <brianmlyles@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-03 11:24:43 -07:00
Junio C Hamano bd3abe0417 Merge branch 'jc/config-doc-update'
Docfix.

* jc/config-doc-update:
  git-config.1: fix description of --regexp in synopsis
  git-config.1: --get-all description update
2024-09-03 09:15:04 -07:00
Junio C Hamano bb4248452e Merge branch 'rs/remote-leakfix'
Leakfix.

* rs/remote-leakfix:
  remote: plug memory leaks at early returns
2024-09-03 09:15:03 -07:00
Junio C Hamano dd903659cd Merge branch 'js/fetch-push-trace2-annotation'
More trace2 events at key points on push and fetch code paths have
been added.

* js/fetch-push-trace2-annotation:
  send-pack: add new tracing regions for push
  fetch: add top-level trace2 regions
  trace2: implement trace2_printf() for event target
2024-09-03 09:15:02 -07:00
Junio C Hamano 8c1c63d525 Merge branch 'ps/leakfixes-part-5'
Even more leak fixes.

* ps/leakfixes-part-5:
  transport: fix leaking negotiation tips
  transport: fix leaking arguments when fetching from bundle
  builtin/fetch: fix leaking transaction with `--atomic`
  remote: fix leaking peer ref when expanding refmap
  remote: fix leaks when matching refspecs
  remote: fix leaking config strings
  builtin/fetch-pack: fix leaking refs
  sideband: fix leaks when configuring sideband colors
  builtin/send-pack: fix leaking refspecs
  transport: fix leaking OID arrays in git:// transport data
  t/helper: fix leaking multi-pack-indices in "read-midx"
  builtin/repack: fix leaks when computing packs to repack
  midx-write: fix leaking hashfile on error cases
  builtin/archive: fix leaking `OPT_FILENAME()` value
  builtin/upload-archive: fix leaking args passed to `write_archive()`
  builtin/merge-tree: fix leaking `-X` strategy options
  pretty: fix leaking key/value separator buffer
  pretty: fix memory leaks when parsing pretty formats
  convert: fix leaks when resetting attributes
  mailinfo: fix leaking header data
2024-09-03 09:15:00 -07:00
Jeff King 3cdddcf6b2 gc: drop MAYBE_UNUSED annotation from used parameter
The "opts" parameter is always used, so marking it with MAYBE_UNUSED is
just confusing.

This annotation goes back to 41abfe15d9 (maintenance: add pack-refs
task, 2021-02-09), when it really was unused. Back then we did not have
the UNUSED macro that would complain if the code changed to use the
parameter. So when we started using it in bfc2f9eb8e (builtin/gc:
forward git-gc(1)'s `--auto` flag when packing refs, 2024-03-25), nobody
noticed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-29 13:56:46 -07:00
Junio C Hamano 11fd53a6c2 Merge branch 'ds/sparse-diff-index'
The underlying machinery for "git diff-index" has long been made to
expand the sparse index as needed, but the command fully expanded
the sparse index upfront, which now has been taught not to do.

* ds/sparse-diff-index:
  diff-index: integrate with the sparse index
2024-08-29 11:08:17 -07:00
Jeff King 551e4de8e1 gc: mark unused config parameter in virtual functions
Commit d1ae15d68b (builtin/gc: refactor to read config into structure,
2024-08-16) added a new parameter to the maintenance_task virtual
functions, but most of them don't need to look at it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-28 09:51:17 -07:00
Jacob Keller f54ca6ae72 check-mailmap: add options for additional mailmap sources
The git check-mailmap command reads the mailmap from either the default
.mailmap location and then from the mailmap.blob and mailmap.file
configurations.

A following change to git send-email will want to support new
configuration options based on the configured identity. The
identity-based configuration and options only make sense in the context
of git send-email.

Expose the read_mailmap_file and read_mailmap_blob functions from
mailmap.c.  Teach git check-mailmap the --mailmap-file and
--mailmap-blob options which load the additional mailmap sources.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:29 -07:00
Jacob Keller 3a27e991f2 check-mailmap: accept "user@host" contacts
git check-mailmap splits each provided contact using split_ident_line.
This function requires that the contact either be of the form "Name
<user@host>" or of the form "<user@host>". In particular, if the mail
portion of the contact is not surrounded by angle brackets,
split_ident_line will reject it.

This results in git check-mailmap rejecting attempts to translate simple
email addresses:

  $ git check-mailmap user@host
  fatal: unable to parse contact: user@host

This limits the usability of check-mailmap as it requires placing angle
brackets around plain email addresses.

In particular, attempting to use git check-mailmap to support mapping
addresses in git send-email is not straight forward. The sanitization
and validation functions in git send-email strip angle brackets from
plain email addresses. It is not trivial to add brackets prior to
invoking git check-mailmap.

Instead, modify check_mailmap() to allow such strings as contacts. In
particular, treat any line which cannot be split by split_ident_line as
a simple email address.

No attempt is made to actually parse the address line, or validate that
it is actually an email address. Implementing such validation is not
trivial. Besides, we weren't validating the address between angle
brackets before anyways.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:51:28 -07:00
Taylor Blau d3e7db2b82 builtin/pack-objects.c: do not open-code MAX_PACK_OBJECT_HEADER
The function `write_reused_pack_one()` defines an header to store the
OFS_DELTA header, but uses the constant "10" instead of
"MAX_PACK_OBJECT_HEADER" (as is done elsewhere in the same patch, circa
bb514de356 (pack-objects: improve partial packfile reuse, 2019-12-18)).

Declare the `ofs_header` field to be sized according to
`MAX_PACK_OBJECT_HEADER` (which is 10, as defined in "pack.h") instead
of the constant 10.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:27 -07:00
Taylor Blau 125c32605a builtin/pack-objects.c: translate bit positions during pack-reuse
When reusing chunks verbatim from an existing source pack, the function
write_reused_pack() first attempts to reuse whole words (via the
function `write_reused_pack_verbatim()`), and then individual bits (via
`write_reused_pack_one()`).

In the non-MIDX case, all of this code works fine. Likewise, in the MIDX
case, processing bits individually from the first (preferred) pack works
fine. However, processing subsequent packs in the MIDX case is broken
when there are duplicate objects among the set of MIDX'd packs.

This is because we treat the individual bit positions as valid pack
positions within the source pack(s), which does not account for gaps in
the source pack, like we see when the MIDX must break ties between
duplicate objects which appear in multiple packs.

The broken code looks like:

    for (; i < reuse_packfile_bitmap->word_alloc; i++) {
            for (offset = 0; offset < BITS_IN_EWORD, offset++) {
                    /* ... */

                    write_reused_pack_one(reuse_packfile->p,
                                          pos + offset - reuse_packfile->bitmap_pos,
                                          f, pack_start, &w_curs);
            }
    }

, where the second argument is incorrect and does not account for gaps.

Instead, make sure that we translate bit positions in the MIDX's
pseudo-pack order to pack positions in the respective source packs by:

  - Translating the bit position (pseudo-pack order) to a MIDX position
    (lexical order).

  - Use the MIDX position to obtain the offset at which the given object
    occurs in the source pack.

  - Then translate that offset back into a pack relative position within
    the source pack by calling offset_to_pack_pos().

After doing this, then we can safely use the result as a pack position.
Note that when doing single-pack reuse, as well as reusing objects from
the MIDX's preferred pack, such translation is not necessary, since
either ties are broken in favor of the preferred pack, or there are no
ties to break at all (in the case of non-MIDX bitmaps).

Failing to do this can result in strange failure modes. One example that
can occur when misinterpreting bits in the above fashion is that Git
thinks it's supposed to send a delta that the caller does not want.
Under this (incorrect) assumption, we try to look up the delta's base
(so that we can patch any OFS_DELTAs if necessary). We do this using
find_reused_offset().

But if we try and call that function for an offset belonging to an
object we did not send, we'll get back garbage. This can result in us
computing a negative fixup value, which results in memory corruption
when trying to write the (patched) OFS_DELTA header.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-27 14:50:26 -07:00
Junio C Hamano 1609470409 git-config.1: fix description of --regexp in synopsis
The synopsis says --regexp=<regexp> but the --regexp option is a
Boolean that says "the name given is not literal, but a pattern to
match the name".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26 11:49:37 -07:00
Junio C Hamano 2b30d66c43 Merge branch 'jk/mark-unused-parameters'
Mark unused parameters as UNUSED to squelch -Wunused warnings.

* jk/mark-unused-parameters:
  t-hashmap: stop calling setup() for t_intern() test
  scalar: mark unused parameters in dummy function
  daemon: mark unused parameters in non-posix fallbacks
  setup: mark unused parameter in config callback
  test-mergesort: mark unused parameters in trivial callback
  t-hashmap: mark unused parameters in callback function
  reftable: mark unused parameters in virtual functions
  reftable: drop obsolete test function declarations
  reftable: ignore unused argc/argv in test functions
  unit-tests: ignore unused argc/argv
  t/helper: mark more unused argv/argc arguments
  oss-fuzz: mark unused argv/argc argument
  refs: mark unused parameters in do_for_each_reflog_helper()
  refs: mark unused parameters in ref_store fsck callbacks
  update-ref: mark more unused parameters in parser callbacks
  imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-26 11:32:23 -07:00
Junio C Hamano 1f4d89dfce Merge branch 'tb/pseudo-merge-bitmap-fixes'
We created a useless pseudo-merge reachability bitmap that is about
0 commits, and attempted to include commits that are not in packs,
which made no sense.  These bugs have been corrected.

* tb/pseudo-merge-bitmap-fixes:
  pseudo-merge.c: ensure pseudo-merge groups are closed
  pseudo-merge.c: do not generate empty pseudo-merge commits
  t/t5333-pseudo-merge-bitmaps.sh: demonstrate empty pseudo-merge groups
  pack-bitmap-write.c: select pseudo-merges even for small bitmaps
  pack-bitmap: drop redundant args from `bitmap_writer_finish()`
  pack-bitmap: drop redundant args from `bitmap_writer_build()`
  pack-bitmap: drop redundant args from `bitmap_writer_build_type_index()`
  pack-bitmap: initialize `bitmap_writer_init()` with packing_data
2024-08-26 11:32:21 -07:00
Junio C Hamano 6e6f68b59b Merge branch 'ps/maintenance-detach-fix-more'
A tests for "git maintenance" that were broken on Windows have been
corrected.

* ps/maintenance-detach-fix-more:
  builtin/maintenance: fix loose objects task emitting pack hash
  t7900: exercise detaching via trace2 regions
  t7900: fix flaky test due to leaking background job
2024-08-26 11:32:20 -07:00
Junio C Hamano 1e8962ee08 Merge branch 'ps/maintenance-detach-fix'
Maintenance tasks other than "gc" now properly go background when
"git maintenance" runs them.

* ps/maintenance-detach-fix:
  run-command: fix detaching when running auto maintenance
  builtin/maintenance: add a `--detach` flag
  builtin/gc: add a `--detach` flag
  builtin/gc: stop processing log file on signal
  builtin/gc: fix leaking config values
  builtin/gc: refactor to read config into structure
  config: fix constness of out parameter for `git_config_get_expiry()`
2024-08-26 11:32:20 -07:00
Junio C Hamano 5072ad8260 Merge branch 'xx/diff-tree-remerge-diff-fix' into maint-2.46
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-26 11:10:25 -07:00
Junio C Hamano 24a64ea0eb Merge branch 'kl/test-fixes' into maint-2.46
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-26 11:10:21 -07:00
Junio C Hamano a991ffff92 Merge branch 'ps/ls-remote-out-of-repo-fix' into maint-2.46
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-26 11:10:18 -07:00
René Scharfe 4e7aa344f2 remote: plug memory leaks at early returns
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-23 14:20:07 -07:00
Junio C Hamano 62c5b88157 Merge branch 'ps/stash-keep-untrack-empty-fix'
A corner case bug in "git stash" was fixed.

* ps/stash-keep-untrack-empty-fix:
  builtin/stash: fix `--keep-index --include-untracked` with empty HEAD
2024-08-23 09:02:36 -07:00
Junio C Hamano 5e56a39e6a Merge branch 'ps/config-wo-the-repository'
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.

* ps/config-wo-the-repository:
  config: hide functions using `the_repository` by default
  global: prepare for hiding away repo-less config functions
  config: don't depend on `the_repository` with branch conditions
  config: don't have setters depend on `the_repository`
  config: pass repo to functions that rename or copy sections
  config: pass repo to `git_die_config()`
  config: pass repo to `git_config_get_expiry_in_days()`
  config: pass repo to `git_config_get_expiry()`
  config: pass repo to `git_config_get_max_percent_split_change()`
  config: pass repo to `git_config_get_split_index()`
  config: pass repo to `git_config_get_index_threads()`
  config: expose `repo_config_clear()`
  config: introduce missing setters that take repo as parameter
  path: hide functions using `the_repository` by default
  path: stop relying on `the_repository` in `worktree_git_path()`
  path: stop relying on `the_repository` when reporting garbage
  hooks: remove implicit dependency on `the_repository`
  editor: do not rely on `the_repository` for interactive edits
  path: expose `do_git_common_path()` as `repo_common_pathv()`
  path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-23 09:02:34 -07:00
Junio C Hamano 1b6b2bfae5 Merge branch 'ps/leakfixes-part-4'
More leak fixes.

* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-23 09:02:33 -07:00
Josh Steadmon a45ab54987 fetch: add top-level trace2 regions
At $DAYJOB we experienced some slow fetch operations and needed some
additional data to help diagnose the issue.

Add top-level trace2 regions for the various modes of operation of
`git-fetch`. None of these regions are in recursive code, so any
enclosed trace messages should only see their nesting level increase by
one.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 15:02:31 -07:00
Derrick Stolee b44c926c9f diff-index: integrate with the sparse index
The sparse index allows focusing the index data structure on the files
present in the sparse-checkout, leaving only tree entries for
directories not within the sparse-checkout. Each builtin needs a
repository setting to indicate that it has been tested with the sparse
index before Git will allow the index to be loaded into memory in its
sparse form. This is a safety precaution.

There are still some builtins that haven't been integrated due to the
complexity of the integration and the lack of significant use. However,
'git diff-index' was neglected only because of initial data showing low
usage. The diff machinery was already integrated and there is no more
work to be done there but add some tests to be sure 'git diff-index'
behaves as expected.

For this purpose, we can follow the testing pattern used in 51ba65b5c3
(diff: enable and test the sparse index, 2021-12-06). One difference
here is that we only verify that the sparse index case agrees with the
full index case, but do not generate the expected output. The 'git diff'
tests use the '--name-status' option to ease the creation of the
expected output, but that's not an option for 'diff-index'. Since the
underlying diff machinery is the same, a simple comparison is sufficient
to give some coverage.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:29:14 -07:00
Patrick Steinhardt 7720460ccf transport: fix leaking arguments when fetching from bundle
In `fetch_refs_from_bundle()` we assemble a vector of arguments to pass
to `unbundle()`, but never free it. And in theory we wouldn't have to
because `unbundle()` already knows to free the vector for us. But it
fails to do so when it exits early due to `verify_bundle()` failing.

The calling convention that the arguments are freed by the callee and
not the caller feels somewhat weird. Refactor the code such that it is
instead the responsibility of the caller to free the vector, adapting
the only two callsites where we pass extra arguments. This also fixes
the memory leak.

This memory leak gets hit in t5510, but fixing it isn't sufficient to
make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt c92abe71df builtin/fetch: fix leaking transaction with --atomic
With the `--atomic` flag, we use a single ref transaction to commit all
ref updates in git-fetch(1). The lifetime of transactions is somewhat
weird: while `ref_transaction_abort()` will free the transaction, a call
to `ref_transaction_commit()` won't. We thus have to manually free the
transaction in the successful case.

Adapt the code to free the transaction in the exit path to plug the
resulting memory leak. As `ref_transaction_abort()` already freed the
transaction for us, we have to unset the transaction when we hit that
code path to not cause a double free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt 46e440694f builtin/fetch-pack: fix leaking refs
We build several ref lists in git-fetch-pack(1), but never free them.
Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt a09efb74e3 builtin/send-pack: fix leaking refspecs
We never free data associated with the assembled refspec in
git-send-pack(1), causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt bda97cb119 builtin/repack: fix leaks when computing packs to repack
When writing an MIDX in git-repack(1) we first collect all the pack
names that we want to add to it in a string list. This list is marked as
`NODUP`, which indicates that it will neither duplicate nor own strings
added to it. In `write_midx_included_packs()` we then `insert()` strings
via `xstrdup()` or `strbuf_detach()`, but the resulting strings will not
be owned by anything and thus leak.

Fix this issue by marking the list as `DUP` and using a local buffer to
compute the pack names.

This leak is hit in t5319, but plugging it is not sufficient to make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt 479601e9f4 builtin/archive: fix leaking OPT_FILENAME() value
The "--output" switch is an `OPT_FILENAME()` option, which allocates
memory when specified by the user. But while we free the string when
executed without the "--remote" switch, we don't otherwise because we
return via a separate exit path that doesn't know to free it.

Fix this by creating a common exit path.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt 149c9e200c builtin/upload-archive: fix leaking args passed to write_archive()
In git-upload-archive(1), we pass an array of arguments to
`write_archive()` to tell it what exactly to do. We don't ever clear the
vector though, causing a memory leak. Furthermore though, the call to
`write_archive()` may cause contents of the array to be modified, which
would cause us to leak memory to allocated strings held by it.

Fix the issue by having `write_archive()` create a shallow copy of
`argv` before parsing the arguments. Like this, we won't modify the
caller's array and can easily `strvec_clear()` it to plug these memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Patrick Steinhardt ff0935b96e builtin/merge-tree: fix leaking -X strategy options
The `-X` switch for git-merge-tree(1) will push each option into a local
`xopts` vector that we then end up parsing. The vector never gets freed
though, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:04 -07:00
Junio C Hamano b772c9cf2e Merge branch 'ps/bundle-outside-repo-fix'
"git bundle unbundle" outside a repository triggered a BUG()
unnecessarily, which has been corrected.

* ps/bundle-outside-repo-fix:
  bundle: default to SHA1 when reading bundle headers
  builtin/bundle: have unbundle check for repo before opening its bundle
2024-08-21 12:02:24 -07:00
Patrick Steinhardt 8311e3b551 builtin/maintenance: fix loose objects task emitting pack hash
The "loose-objects" maintenance tasks executes git-pack-objects(1) to
pack all loose objects into a new packfile. This command ends up
printing the hash of the packfile to stdout though, which clutters the
output of `git maintenance run`.

Fix this issue by disabling stdout of the child process.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:22 -07:00
Patrick Steinhardt 51a0b8a2a7 t7900: exercise detaching via trace2 regions
In t7900, we exercise the `--detach` logic by checking whether the
command ended up writing anything to its output or not. This supposedly
works because we close stdin, stdout and stderr when daemonizing. But
one, it breaks on platforms where daemonize is a no-op, like Windows.
And second, that git-maintenance(1) outputs anything at all in these
tests is a bug in the first place that we'll fix in a subsequent commit.

Introduce a new trace2 region around the detach which allows us to more
explicitly check whether the detaching logic was executed. This is a
much more direct way to exercise the logic, provides a potentially
useful signal to tracing logs and also works alright on platforms which
do not have the ability to daemonize.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
[jc: dropped a stale in-code comment from a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-21 11:33:02 -07:00
Junio C Hamano 2df380c280 Merge branch 'ps/leakfixes-part-4' into ps/leakfixes-part-5
* ps/leakfixes-part-4: (22 commits)
  builtin/diff: free symmetric diff members
  diff: free state populated via options
  builtin/log: fix leak when showing converted blob contents
  userdiff: fix leaking memory for configured diff drivers
  builtin/format-patch: fix various trivial memory leaks
  diff: fix leak when parsing invalid ignore regex option
  unpack-trees: clear index when not propagating it
  sequencer: release todo list on error paths
  merge-ort: unconditionally release attributes index
  builtin/fast-export: plug leaking tag names
  builtin/fast-export: fix leaking diff options
  builtin/fast-import: plug trivial memory leaks
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/rebase: fix leaking `commit.gpgsign` value
  config: fix leaking comment character config
  submodule-config: fix leaking name entry when traversing submodules
  read-cache: fix leaking hashfile when writing index fails
  bulk-checkin: fix leaking state TODO
  object-name: fix leaking symlink paths in object context
  object-file: fix memory leak when reading corrupted headers
  ...
2024-08-20 10:15:27 -07:00
Junio C Hamano b9497848df Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2024-08-19 11:07:37 -07:00
Jeff King 9dc1e748ef update-ref: mark more unused parameters in parser callbacks
This is a continuation of 44ad082968 (update-ref: mark unused parameter
in parser callbacks, 2023-08-29), as we've grown a few more virtual
functions since then.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-17 09:46:10 -07:00
Junio C Hamano b3d175409d Merge branch 'sj/ref-fsck'
"git fsck" infrastructure has been taught to also check the sanity
of the ref database, in addition to the object database.

* sj/ref-fsck:
  fsck: add ref name check for files backend
  files-backend: add unified interface for refs scanning
  builtin/refs: add verify subcommand
  refs: set up ref consistency check infrastructure
  fsck: add refs report function
  fsck: add a unified interface for reporting fsck messages
  fsck: make "fsck_error" callback generic
  fsck: rename objects-related fsck error functions
  fsck: rename "skiplist" to "skip_oids"
2024-08-16 12:51:51 -07:00
Junio C Hamano c09721cb63 Merge branch 'dd/notes-empty-no-edit-by-default' into maint-2.46
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-16 12:50:55 -07:00
Junio C Hamano b74d885b11 Merge branch 'tn/doc-commit-fix' into maint-2.46
Docfix.

* tn/doc-commit-fix:
  doc: remove dangling closing parenthesis
2024-08-16 12:50:54 -07:00
Junio C Hamano bb250b5378 Merge branch 'jc/checkout-no-op-switch-errors' into maint-2.46
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-08-16 12:50:51 -07:00
Patrick Steinhardt e3209bd4df builtin/stash: fix --keep-index --include-untracked with empty HEAD
It was reported that creating a stash with `--keep-index
--include-untracked` causes an error when HEAD points to a commit whose
tree is empty:

    $ git stash push --keep-index --include-untracked
    error: pathspec ':/' did not match any file(s) known to git

This error comes from `git checkout --no-overlay $i_tree -- :/`, which
we execute to reset the working tree to the state in our index. As the
tree generated from the index is empty in our case, ':/' does not match
any files and thus causes git-checkout(1) to error out.

Fix the issue by skipping the checkout when the index tree is empty. As
explained in the in-code comment, this should be the correct thing to do
as there is nothing that we'd have to reset in the first place.

Reported-by: Piotr Siupa <piotrsiupa@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:50:33 -07:00
Patrick Steinhardt 98077d06b2 run-command: fix detaching when running auto maintenance
In the past, we used to execute `git gc --auto` as part of our automatic
housekeeping routines. As git-gc(1) may require quite some time to
perform the housekeeping, it knows to detach itself and run in the
background so that the user can continue their work.

Eventually, we refactored our automatic housekeeping to instead use the
more flexible git-maintenance(1) command. The upside of this new infra
is that the user can configure which maintenance tasks are performed, at
least to a certain degree. So while it continues to run git-gc(1) by
default, it can also be adapted to e.g. use git-multi-pack-index(1) for
maintenance of the object database.

The auto-detach of the new infra is somewhat broken though once the user
configures non-standard tasks. The problem is essentially that we detach
at the wrong level in the process hierarchy: git-maintenance(1) never
detaches itself, but instead it continues to be git-gc(1) which does.

When configured to only run the git-gc(1) maintenance task, then the
result is basically the same as before. But when configured to run other
tasks, then git-maintenance(1) will wait for these to run to completion.
Even worse, it may be that git-gc(1) runs concurrently with other
housekeeping tasks, stomping on each others feet.

Fix this bug by asking git-gc(1) to not detach when it is being invoked
via git-maintenance(1). Instead, git-maintenance(1) now respects a new
config "maintenance.autoDetach", the equivalent of "gc.autoDetach", and
detaches itself into the background when running as part of our auto
maintenance. This should continue to behave the same for all users which
use the git-gc(1) task, only. For others though, it means that we now
properly perform all tasks in the background. The default behaviour of
git-maintenance(1) when executed by the user does not change, it will
remain in the foreground unless they pass the `--detach` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt a6affd3343 builtin/maintenance: add a --detach flag
Same as the preceding commit, add a `--[no-]detach` flag to the
git-maintenance(1) command. This will be used in a subsequent commit to
fix backgrounding of that command when configured with a non-standard
set of tasks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:26 -07:00
Patrick Steinhardt c7185df01b builtin/gc: add a --detach flag
When running `git gc --auto`, the command will by default detach and
continue running in the background. This behaviour can be tweaked via
the `gc.autoDetach` config, but not via a command line switch. We need
that in a subsequent commit though, where git-maintenance(1) will want
to ask its git-gc(1) child process to not detach anymore.

Add a `--[no-]detach` flag that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt 9b6b994f90 builtin/gc: stop processing log file on signal
When detaching, git-gc(1) will redirect its stderr to a "gc.log" log
file, which is then used to surface errors of a backgrounded process to
the user. To ensure that the file is properly managed on abnormal exit
paths, we install both signal and exit handlers that try to either
commit the underlying lock file or roll it back in case there wasn't any
error.

This logic is severly broken when handling signals though, as we end up
calling all kinds of functions that are not signal safe. This includes
malloc(3P) via `git_path()`, fprintf(3P), fflush(3P) and many more
functions. The consequence can be anything, from deadlocks to crashes.
Unfortunately, we cannot really do much about this without a larger
refactoring.

The least-worst thing we can do is to not set up the signal handler in
the first place. This will still cause us to remove the lockfile, as the
underlying tempfile subsystem already knows to unlink locks when
receiving a signal. But it may cause us to remove the lock even in the
case where it would have contained actual errors, which is a change in
behaviour.

The consequence is that "gc.log" will not be committed, and thus
subsequent calls to `git gc --auto` won't bail out because of this.
Arguably though, it is better to retry garbage collection rather than
having the process run into a potentially-corrupted state.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt 0ce44e2293 builtin/gc: fix leaking config values
We're leaking config values in git-gc(1) when those values are tracked
as strings. Introduce a new `gc_config_release()` function that releases
this memory to plug those leaks and release old values before populating
the config fields via `git_config_string()` et al.

Note that there is one small gotcha here with the "--prune" option. Next
to passing a string, this option also accepts the "--no-prune" option
that overrides the default or configured value. We thus need to discern
between the option not having been passed by the user and the negative
variant of it. This is done by using a simple sentinel value that lets
us discern these cases.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt d1ae15d68b builtin/gc: refactor to read config into structure
The git-gc(1) command knows to read a bunch of config keys to tweak its
own behaviour. The values are parsed into global variables, which makes
it hard to correctly manage the lifecycle of values that may require a
memory allocation.

Refactor the code to use a `struct gc_config` that gets populated and
passed around. For one, this makes previously-implicit dependencies on
these config values clear. Second, it will allow us to properly manage
the lifecycle in the next commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:25 -07:00
Patrick Steinhardt a70a9bf6ee config: fix constness of out parameter for git_config_get_expiry()
The type of the out parameter of `git_config_get_expiry()` is a pointer
to a constant string, which creates the impression that ownership of the
returned data wasn't transferred to the caller. This isn't true though
and thus quite misleading.

Adapt the parameter to be of type `char **` and adjust callers
accordingly. While at it, refactor `get_shared_index_expire_date()` to
drop the static `shared_index_expire` variable. It is only used in that
function, and furthermore we would only hit the code where we parse the
expiry date a single time because we already use a static `prepared`
variable to track whether we did parse it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16 09:46:24 -07:00
Junio C Hamano 0da7673a51 Merge branch 'xx/diff-tree-remerge-diff-fix'
"git rev-list ... | git diff-tree -p --remerge-diff --stdin" should
behave more or less like "git log -p --remerge-diff" but instead it
crashed, forgetting to prepare a temporary object store needed.

* xx/diff-tree-remerge-diff-fix:
  diff-tree: fix crash when used with --remerge-diff
2024-08-15 13:22:16 -07:00
Junio C Hamano e7f86cb69d Merge branch 'jc/refs-symref-referent'
The refs API has been taught to give symref target information to
the users of ref iterators, allowing for-each-ref and friends to
avoid an extra ref_resolve_* API call per a symbolic ref.

* jc/refs-symref-referent:
  ref-filter: populate symref from iterator
  refs: add referent to each_ref_fn
  refs: keep track of unresolved reference value in iterators
2024-08-15 13:22:15 -07:00
Junio C Hamano 88457a6151 Merge branch 'ps/submodule-ref-format'
Support to specify ref backend for submodules has been enhanced.

* ps/submodule-ref-format:
  object: fix leaking packfiles when closing object store
  submodule: fix leaking seen submodule names
  submodule: fix leaking fetch tasks
  builtin/submodule: allow "add" to use different ref storage format
  refs: fix ref storage format for submodule ref stores
  builtin/clone: propagate ref storage format to submodules
  builtin/submodule: allow cloning with different ref storage format
  git-submodule.sh: break overly long command lines
2024-08-15 13:22:14 -07:00
Taylor Blau 11a08e8332 pack-bitmap: drop redundant args from bitmap_writer_finish()
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_finish()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:23:15 -07:00
Taylor Blau f00dda4849 pack-bitmap: drop redundant args from bitmap_writer_build()
In a similar fashion as the previous commit, drop a redundant argument
from the `bitmap_writer_build()` function.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:22:27 -07:00
Taylor Blau 125ee4ae80 pack-bitmap: drop redundant args from bitmap_writer_build_type_index()
The previous commit ensures that the bitmap_writer's "to_pack" field is
initialized early on, so the "to_pack" and "index_nr" arguments to
`bitmap_writer_build_type_index()` are redundant.

Drop them and adjust the callers accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:20:24 -07:00
Taylor Blau 01e9d12939 pack-bitmap: initialize bitmap_writer_init() with packing_data
In order to determine its object order, the pack-bitmap machinery keeps
a 'struct packing_data' corresponding to the pack or pseudo-pack (when
writing a MIDX bitmap) being written.

The to_pack field is provided to the bitmap machinery by callers of
bitmap_writer_build() and assigned to the bitmap_writer struct at that
point.

But a subsequent commit will want to have access to that data earlier on
during commit selection. Prepare for that by adding a 'to_pack' argument
to 'bitmap_writer_init()', and initializing the field during that
function.

Subsequent commits will clean up other functions which take
now-redundant arguments (like nr_objects, which is equivalent to
pdata->objects_nr, or pdata itself).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-15 11:18:04 -07:00
Junio C Hamano 61fd5de05f Merge branch 'kl/test-fixes'
A flakey test and incorrect calls to strtoX() functions have been
fixed.

* kl/test-fixes:
  t6421: fix test to work when repo dir contains d0
  set errno=0 before strtoX calls
2024-08-14 14:54:55 -07:00
Junio C Hamano 44773b9f70 Merge branch 'jc/patch-id'
The patch parser in "git patch-id" has been tightened to avoid
getting confused by lines that look like a patch header in the log
message.

* jc/patch-id:
  patch-id: tighten code to detect the patch header
  patch-id: rewrite code that detects the beginning of a patch
  patch-id: make get_one_patchid() more extensible
  patch-id: call flush_current_id() only when needed
  t4204: patch-id supports various input format
2024-08-14 14:54:53 -07:00
Junio C Hamano 760348212b Merge branch 'ps/ls-remote-out-of-repo-fix'
A recent update broke "git ls-remote" used outside a repository,
which has been corrected.

* ps/ls-remote-out-of-repo-fix:
  builtin/ls-remote: fall back to SHA1 outside of a repo
2024-08-14 14:54:49 -07:00
Junio C Hamano 4385f8a52d Merge branch 'ps/leakfixes-part-3'
More leakfixes.

* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-14 14:54:47 -07:00
Patrick Steinhardt 77d4b3dd73 builtin/diff: free symmetric diff members
We populate a `struct symdiff` in case the user has requested a
symmetric diff. Part of this is to populate a `skip` bitmap that
indicates which commits shall be ignored in the diff. But while this
bitmap is dynamically allocated, we never free it.

Fix this by introducing and calling a new `symdiff_release()` function
that does this for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:02 -07:00
Patrick Steinhardt 0aaca0ec09 builtin/log: fix leak when showing converted blob contents
In `show_blob_object()`, we proactively call `textconv_object()`. In
case we have a textconv driver for this blob we will end up showing the
converted contents, otherwise we'll show the un-converted contents of it
instead.

When the object has been converted we never free the buffer containing
the converted contents. Fix this to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt 1bc158e750 builtin/format-patch: fix various trivial memory leaks
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt a0b82622cb builtin/fast-export: plug leaking tag names
When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 8ed4e96b5b builtin/fast-export: fix leaking diff options
Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
such that its contents aren't getting freed inside of `handle_commit()`.
We never unset that flag though, which means that the structure's
allocated resources will ultimately leak.

Fix this by unsetting the flag after the loop such that we release its
resources via `release_revisions()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 0662f0dacb builtin/fast-import: plug trivial memory leaks
Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 187b623eef builtin/notes: fix leaking struct notes_tree when merging notes
We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt 1ca57bea4a builtin/rebase: fix leaking commit.gpgsign value
In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt 648abbe22d config: fix leaking comment character config
When the comment line character has been specified multiple times in the
configuration, then `git_default_core_config()` will cause a memory leak
because it unconditionally copies the string into `comment_line_str`
without free'ing the previous value. In fact, it can't easily free the
value in the first place because it may contain a string constant.

Refactor the code such that we track allocated comment character strings
via a separate non-constant variable `comment_line_str_to_free`. Adapt
sites that set `comment_line_str` to set both and free the old value
that was stored in `comment_line_str_to_free`.

This memory leak is being hit in t3404. As there are still other memory
leaks in that file we cannot yet mark it as passing with leak checking
enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt 7298bcc573 builtin/bundle: have unbundle check for repo before opening its bundle
The `git bundle unbundle` subcommand requires a repository to unbundle
the contents into. As thus, the subcommand checks whether we have a
startup repository in the first place, and if not it dies.

This check happens after we have already opened the bundle though. This
causes a segfault when running outside of a repository starting with
c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07) because we have no hash function set up, but we do try to
parse refs advertised by the bundle's header.

The next commit will fix that underlying issue by defaulting to the SHA1
object format for bundles, which will also fix the described segfault here.
But as we know that we will die anyway, we can do better than that and
avoid some vain work by moving the check for a repository before we try
to open the bundle.

Reported-by: ArcticLampyrid <ArcticLampyrid@outlook.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:26:20 -07:00
Patrick Steinhardt 76fc9906f2 config: pass repo to functions that rename or copy sections
Refactor functions that rename or copy config sections to accept a
`struct repository` such that we can get rid of the implicit dependency
on `the_repository`. Rename the functions accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
Patrick Steinhardt 0c2c37d16b config: pass repo to git_die_config()
Refactor `git_die_config()` to accept a `struct repository` such that we
can get rid of the implicit dependency on `the_repository`. Rename the
function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:04 -07:00
Patrick Steinhardt 87aace129e config: pass repo to git_config_get_expiry()
Refactor `git_config_get_expiry()` to accept a `struct repository` such
that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
Patrick Steinhardt be7537e6a9 config: pass repo to git_config_get_split_index()
Refactor `git_config_get_split_index()` to accept a `struct repository`
such that we can get rid of the implicit dependency on `the_repository`.
Rename the function accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:03 -07:00
Patrick Steinhardt a973f60dc7 path: stop relying on the_repository in worktree_git_path()
When not provided a worktree, then `worktree_git_path()` will fall back
to returning a path relative to the main repository. In this case, we
implicitly rely on `the_repository` to derive the path. Remove this
dependency by passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Patrick Steinhardt 78f2210b3c path: stop relying on the_repository when reporting garbage
We access `the_repository` in `report_linked_checkout_garbage()` both
directly and indirectly via `get_git_dir()`. Remove this dependency by
instead passing a `struct repository` as parameter.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Patrick Steinhardt 169c979771 hooks: remove implicit dependency on the_repository
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Junio C Hamano 4460e052e0 remerge-diff: clean up temporary objdir at a central place
After running a diff between two things, or a series of diffs while
walking the history, the diff computation is concluded by a call to
diff_result_code() to extract the exit status of the diff machinery.

The function can work on "struct diffopt", but all the callers
historically and currently pass "struct diffopt" that is embedded in
the "struct rev_info" that is used to hold the remerge_diff bit and
the remerge_objdir variable that points at the temporary object
directory in use.

Redefine diff_result_code() to take the whole "struct rev_info" to
give it an access to these members related to remerge-diff, so that
it can get rid of the temporary object directory for any and all
callers that used the feature.  We can lose the equivalent code to
do so from the code paths for individual commands, diff-tree, diff,
and log.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:40 -07:00
Junio C Hamano 245cac5c33 remerge-diff: lazily prepare temporary objdir on demand
It is error prone for each caller that sets revs.remerge_diff bit
to be responsible for preparing a temporary object directory and
rotate it into the list of alternate object stores, making it the
primary object store.

Instead, remove the code to set up and arrange the temporary object
directory from the current callers and implement it in the code that
runs remerge-diff logic.  The code to undo the futzing of the list
of alternate object store is still spread across the callers, but we
will deal with it in future steps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 15:42:35 -07:00
John Cai e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
Xing Xin a77554ea09 diff-tree: fix crash when used with --remerge-diff
When using "git-diff-tree" to get the tree diff for merge commits with
the diff format set to `remerge`, a bug is triggered as shown below:

  $ git diff-tree -r --remerge-diff 363337e6eb
  363337e6eb
  BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!?

This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check
added in commit 7b90ab467a (log: clean unneeded objects during log
--remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when
attempting to clean up temporary objects generated during the remerge
process.

After some further digging, I find that the remerge-related diff options
were introduced in db757e8b8d (show, log: provide a --remerge-diff
capability, 2022-02-02), which also affect the setup of `rev_info` for
"git-diff-tree", but were not accounted for in the original
implementation (inferred from the commit message).

Elijah Newren, the author of the remerge diff feature, notes that other
callers of `log-tree.c:log_tree_commit` (the only caller of
`log-tree.c:do_remerge_diff`) also exist, but:

  `builtin/am.c`: manually sets all flags; remerge_diff is not among them
  `sequencer.c`: manually sets all flags; remerge_diff is not among them

so `builtin/diff-tree.c` really is the only caller that was overlooked
when remerge-diff functionality was added.

This commit resolves the crash by adding `remerge_objdir` setup logic to
`builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`.
It also includes the necessary cleanup for `remerge_objdir`.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:07:44 -07:00
Junio C Hamano 028cf22904 Merge branch 'dd/notes-empty-no-edit-by-default'
"git notes add -m '' --allow-empty" and friends that take prepared
data to create notes should not invoke an editor, but it started
doing so since Git 2.42, which has been corrected.

* dd/notes-empty-no-edit-by-default:
  notes: do not trigger editor when adding an empty note
2024-08-08 10:41:19 -07:00
shejialuo bf061d26c7 builtin/refs: add verify subcommand
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency and also this subcommand will
be used as the entry point of checking refs for "git-fsck(1)".

Add "verbose" field into "fsck_options" to indicate whether we should
print verbose messages when checking refs and objects consistency.

Remove bit-field for "strict" field, this is because we cannot take
address of a bit-field which makes it unhandy to set member variables
when parsing the command line options.

The "git-fsck(1)" declares "fsck_options" variable with "static"
identifier which avoids complaint by the leak-checker. However, in
"git-refs verify", we need to do memory clean manually. Thus add
"fsck_options_clear" function in "fsck.c" to provide memory clean
operation.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:53 -07:00
shejialuo 0ec5dfe8c4 fsck: make "fsck_error" callback generic
The "fsck_error" callback is designed to report the objects-related
error messages. It accepts two parameter "oid" and "object_type" which
is not generic. In order to provide a unified callback which can report
either objects or refs, remove the objects-related parameters and add
the generic parameter "void *fsck_report".

Create a new "fsck_object_report" structure which incorporates the
removed parameters "oid" and "object_type". Then change the
corresponding references to adapt to new "fsck_error" callback.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
shejialuo 8cd4a447b8 fsck: rename objects-related fsck error functions
The names of objects-related fsck error functions are generic. It's OK
when there is only object database check. However, we are going to
introduce refs database check report function. To avoid ambiguity,
rename object-related fsck error functions to explicitly indicate these
functions are used to report objects-related messages.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:36:52 -07:00
Patrick Steinhardt c369fc46d0 builtin/submodule: allow "add" to use different ref storage format
Same as with "clone", users may want to add a submodule to a repository
with a non-default ref storage format. Wire up a new `--ref-format=`
option that works the same as for `git submodule clone`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:22:21 -07:00
Patrick Steinhardt 69814846ab builtin/clone: propagate ref storage format to submodules
When recursively cloning a repository with a non-default ref storage
format, e.g. by passing the `--ref-format=` option, then only the
top-level repository will end up using that ref storage format, and
all recursively cloned submodules will instead use the default format.

While mixed-format constellations are expected to work alright, the
outcome still is somewhat surprising as we have essentially ignored
the user's request.

Fix this by propagating the requested ref format to cloned submodules.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:21:39 -07:00
Patrick Steinhardt 5ac781ad62 builtin/submodule: allow cloning with different ref storage format
As submodules are proper self-contained repositories, it is perfectly
valid for them to have a different ref storage format than their parent
repository. There is no obvious way for users to ask for the ref storage
format when initializing submodules though. Whether the setup of such
mixed-ref-storage-format constellations is all that useful remains to be
seen. But there is no good reason to not expose such an option, and we
will require it in a subsequent patch.

Introduce a new `--ref-format=` option for git-submodule(1) that allows
the user to pick the ref storage format. This option will also be used
in a subsequent commit, where we start to propagate the same flag from
git-clone(1) to cloning submodules with the `--recursive` switch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-08 09:20:49 -07:00
Junio C Hamano 90b801d8ff Merge branch 'ps/leakfixes-part-3' into ps/leakfixes-part-4
* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-06 12:40:41 -07:00
Taylor Blau fcb2205b77 midx: implement support for writing incremental MIDX chains
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00
Taylor Blau 9552c3595a t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
Two years ago, commit ff1e653c8e (midx: respect
'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new
environment variable which caused the test suite to write MIDX bitmaps
after any 'git repack' invocation.

At the time, this was done to help flush out any bugs with MIDX bitmaps
that weren't explicitly covered in the t5326-multi-pack-bitmap.sh
script.

Two years later, that flag has served us well and is no longer providing
meaningful coverage, as the script in t5326 has matured substantially
and covers many more interesting cases than it did back when ff1e653c8e
was originally written.

Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable
as it is no longer serving a useful purpose. More importantly, removing
this variable clears the way for us to introduce a new one to help
similarly flush out bugs related to incremental MIDX chains.

Because these incremental MIDX chains are (for now) incompatible with
MIDX bitmaps, we cannot have both.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:38 -07:00