Commit graph

71874 commits

Author SHA1 Message Date
Junio C Hamano 19fa15fb2d Merge branch 'jk/end-of-options' into maint-2.43
"git $cmd --end-of-options --rev -- --path" for some $cmd failed
to interpret "--rev" as a rev, and "--path" as a path.  This was
fixed for many programs like "reset" and "checkout".

* jk/end-of-options:
  parse-options: decouple "--end-of-options" and "--"
2024-02-08 16:22:02 -08:00
Junio C Hamano 4b50f86141 Merge branch 'jc/revision-parse-int' into maint-2.43
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.

* jc/revision-parse-int:
  revision: parse integer arguments to --max-count, --skip, etc., more carefully
2024-02-08 16:22:02 -08:00
Junio C Hamano 7c05241877 Merge branch 'jp/use-diff-index-in-pre-commit-sample' into maint-2.43
The sample pre-commit hook that tries to catch introduction of new
paths that use potentially non-portable characters did not notice
an existing path getting renamed to such a problematic path, when
rename detection was enabled.

* jp/use-diff-index-in-pre-commit-sample:
  hooks--pre-commit: detect non-ASCII when renaming
2024-02-08 16:22:02 -08:00
Junio C Hamano 13031f6689 Merge branch 'jh/trace2-redact-auth' into maint-2.43
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.

* jh/trace2-redact-auth:
  t0212: test URL redacting in EVENT format
  t0211: test URL redacting in PERF format
  trace2: redact passwords from https:// URLs by default
  trace2: fix signature of trace2_def_param() macro
2024-02-08 16:22:01 -08:00
Junio C Hamano efbae0583b Merge branch 'js/update-urls-in-doc-and-comment' into maint-2.43
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.

* js/update-urls-in-doc-and-comment:
  doc: refer to internet archive
  doc: update links for andre-simon.de
  doc: switch links to https
  doc: update links to current pages
2024-02-08 16:22:01 -08:00
Junio C Hamano 50b8f513a2 Merge branch 'ps/commit-graph-less-paranoid' into maint-2.43
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications.  The default has been
flipped to disable this pessimization.

* ps/commit-graph-less-paranoid:
  commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
2024-02-08 16:22:01 -08:00
Junio C Hamano f8e2ad965a Merge branch 'tz/send-email-negatable-options' into maint-2.43
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email".  Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.

* tz/send-email-negatable-options:
  send-email: avoid duplicate specification warnings
  perl: bump the required Perl version to 5.8.1 from 5.8.0
2024-02-08 16:22:01 -08:00
Junio C Hamano c8bcf66bf7 Merge branch 'js/ci-discard-prove-state' into maint-2.43
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.

* js/ci-discard-prove-state:
  ci: avoid running the test suite _twice_
  ci: add support for GitLab CI
  ci: install test dependencies for linux-musl
  ci: squelch warnings when testing with unusable Git repo
  ci: unify setup of some environment variables
  ci: split out logic to set up failed test artifacts
  ci: group installation of Docker dependencies
  ci: make grouping setup more generic
  ci: reorder definitions for grouping functions
2024-02-08 16:22:00 -08:00
Junio C Hamano 6931049c32 ssh signing: signal an error with a negative return value
The other backend for the sign_buffer() function followed our usual
"an error is signalled with a negative return" convention, but the
SSH signer did not.  Even though we already fixed the caller that
assumed only a negative return value is an error, tighten the callee
to signal an error with a negative return as well.  This way, the
callees will be strict on what they produce, while the callers will
be lenient in what they accept.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 21:31:42 -08:00
Junio C Hamano 841dbd40a3 bisect: document command line arguments for "bisect start"
The syntax commonly used for alternatives is --opt-(a|b), not
--opt-{a,b}.

List bad/new and good/old consistently in this order, to be
consistent with the description for "git bisect terms".  Clarify
<term> to either <term-old> or <term-new> to make them consistent
with the description of "git bisect (good|bad)" subcommands.

Suggested-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 13:46:01 -08:00
Junio C Hamano 47ac5f6e1a bisect: document "terms" subcommand more fully
The documentation for "git bisect terms", although it did not hide
any information, was a bit incomplete and forced readers to fill in
the blanks to get the complete picture.

Acked-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 13:46:01 -08:00
Junio C Hamano abfbff61ef tag: fix sign_buffer() call to create a signed tag
The command "git tag -s" internally calls sign_buffer() to make a
cryptographic signature using the chosen backend like GPG and SSH.
The internal helper functions used by "git tag" implementation seem
to use a "negative return values are errors, zero or positive return
values are not" convention, and there are places (e.g., verify_tag()
that calls gpg_verify_tag()) that these internal helper functions
translate return values that signal errors to conform to this
convention, but do_sign() that calls sign_buffer() forgets to do so.

Fix it, so that a failed call to sign_buffer() that can return the
exit status from pipe_command() will not be overlooked.

Reported-by: Sergey Kosukhin <skosukhin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 10:47:25 -08:00
Philippe Blain 78307f1a89 .github/PULL_REQUEST_TEMPLATE.md: add a note about single-commit PRs
Contributors using Gitgitgadget continue to send single-commit PRs with
their commit message text duplicated below the three-dash line,
increasing the signal-to-noise ratio for reviewers.

This is because Gitgitgadget copies the pull request description as an
in-patch commentary, for single-commit PRs, and _GitHub_ defaults to
prefilling the pull request description with the commit message, for
single-commit PRs (followed by the content of the pull request
template).

Add a note in the pull request template mentioning that for
single-commit PRs, the PR description should thus be kept empty, in the
hope that contributors read it and act on it.

This partly addresses:
https://github.com/gitgitgadget/gitgitgadget/issues/340

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:22:55 -08:00
Patrick Steinhardt d2058cb2f0 builtin/stash: report failure to write to index
The git-stash(1) command needs to write to the index for many of its
operations. When the index is locked by a concurrent writer it will thus
fail to operate, which is expected. What is not expected though is that
we do not print any error message at all in this case. The user can thus
easily miss the fact that the command didn't do what they expected it to
do and would be left wondering why that is.

Fix this bug and report failures to write to the index. Add tests for
the subcommands which hit the respective code paths.

While at it, unify error messages when writing to the index fails. The
chosen error message is already used in "builtin/stash.c".

Reported-by: moti sd <motisd8@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 12:08:38 -08:00
Philippe Blain 30bced3a67 imap-send: add missing "strbuf.h" include under NO_CURL
Building with NO_CURL is currently broken since imap-send.c uses things
defined in "strbuf.h" wihtout including it.

The inclusion of that header was removed in eea0e59ffb (treewide: remove
unnecessary includes in source files, 2023-12-23), which failed to
notice that "strbuf.h" was transitively included in imap-send.c via
"http.h", but only if USE_CURL_FOR_IMAP_SEND is defined. Add back the
missing include. Note that it was explicitely added in 3307f7dde2
(imap-send: include strbuf.h, 2023-05-17) after a similar breakage in
ba3d1c73da (treewide: remove unnecessary cache.h includes, 2023-02-24) -
see the thread starting at [1].

It can be verified by inspection that this is the only case where a
header we include is dependent on a Makefile knob in the files modified
in eea0e59ffb.

[1] https://lore.kernel.org/git/20230517070632.71884-1-list@eworm.de/

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-01 18:20:41 -08:00
Johannes Schindelin 19ed0dff8f win32: special-case ENOSPC when writing to a pipe
Since c6d3cce6f3 (pipe_command(): handle ENOSPC when writing to a
pipe, 2022-08-17), one `write()` call that results in an `errno` value
`ENOSPC` (which typically indicates out of disk space, which makes
little sense in the context of a pipe) is treated the same as `EAGAIN`.

However, contrary to expectations, as diagnosed in
https://github.com/python/cpython/issues/101881#issuecomment-1428667015,
when writing to a non-blocking pipe on Windows, an `errno` value of
`ENOSPC` means something else: the write _fails_. Completely. Because
more data was provided than the internal pipe buffer can handle.
Somewhat surprising, considering that `write()` is allowed to write less
than the specified amount, e.g. by writing only as much as fits in that
buffer. But it doesn't, it writes no byte at all in that instance.

Let's handle this by manually detecting when an `ENOSPC` indicates that
a pipe's buffer is smaller than what needs to be written, and re-try
using the pipe's buffer size as `size` parameter.

It would be plausible to try writing the entire buffer in a loop,
feeding pipe buffer-sized chunks, but experiments show that trying to
write more than one buffer-sized chunk right after that will immediately
fail because the buffer is unlikely to be drained as fast as `write()`
could write again. And the whole point of a non-blocking pipe is to be
non-blocking.

Which means that the logic that determines the pipe's buffer size
unfortunately has to be run potentially many times when writing large
amounts of data to a non-blocking pipe, as there is no elegant way to
cache that information between `write()` calls. It's the best we can do,
though, so it has to be good enough.

This fix is required to let t3701.60 (handle very large filtered diff)
pass with the MSYS2 runtime provided by the MSYS2 project: Without this
patch, the failed write would result in an infinite loop. This patch is
not required with Git for Windows' variant of the MSYS2 runtime only
because Git for Windows added an ugly work-around specifically to avoid
a hang in that test case.

The diff is slightly chatty because it extends an already-existing
conditional that special-cases a _different_ `errno` value for pipes,
and because this patch needs to account for the fact that
`_get_osfhandle()` potentially overwrites `errno`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-30 13:59:16 -08:00
Junio C Hamano de65079d7b reftable/pq_test: comment style fix
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
Junio C Hamano 9d2cdd8ae8 merge-ort.c: comment style fix
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
Junio C Hamano 777f783841 builtin/worktree: comment style fixes
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 14:08:52 -08:00
Jeff King 85a9a63c92 diff: handle NULL meta-info when spawning external diff
Running this:

  $ touch foo bar
  $ chmod +x foo
  $ git -c diff.external=echo diff --ext-diff --no-index foo bar

results in a segfault. The issue is that run_diff_cmd() passes a NULL
"xfrm_msg" variable to run_external_diff(), which feeds it to
strvec_push(), causing the segfault. The bug dates back to 82fbf269b9
(run_external_diff: use an argv_array for the command line, 2014-04-19),
though it mostly only ever worked accidentally.  Before then, we just
stuck the NULL pointer into a "const char **" array, so our NULL ended
up acting as an extra end-of-argv sentinel (which was OK, because it was
the last thing in the array).

Curiously, though, this is only a problem with --no-index. We set up
xfrm_msg by calling fill_metainfo(). This result may be empty, or may
have text like "index 1234..5678\n", "rename from foo\nrename from
bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the
"other" variable is not NULL. That variable is set when the paths of the
two sides of the diff pair aren't the same (in which case the
destination path becomes "other"). So normally it would kick in only for
a rename, in which case xfrm_msg should not be NULL (it would have the
rename information in it).

But with a "--no-index" of two blobs, we of course have two different
pathnames, and thus end up with a non-NULL "other" filename (which is
always just a repeat of the file2-name), but possibly a NULL xfrm_msg.

So how to fix it? I have a feeling that --no-index always passing
"other" to the external diff command is probably a bug. There was no
rename, and the name is always redundant with existing information we
pass (and this may even cause us to pass a useless "xfrm_msg" that
contains an "index 1234..5678" line). So one option would be to change
that behavior. We don't seem to have ever documented the "other" or
"xfrm_msg" parameters for external diffs.

But I'm not sure what fallout we might have from changing that behavior
now. So this patch takes the less-risky option, and simply teaches
run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes
it agnostic to whether "other" and "xfrm_msg" always come as a pair. It
fixes the segfault now, and if we want to change the --no-index "other"
behavior on top, it will handle that, too.

Reported-by: Wilfred Hughes <me@wilfred.me.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 10:37:44 -08:00
Taylor Blau 36c9c44fa4 pack-bitmap: drop unused reuse_objects
This variable is no longer used for doing verbatim pack-reuse (or
anywhere within pack-bitmap.c) since d2ea031046 (pack-bitmap: don't rely
on bitmap_git->reuse_objects, 2019-12-18).

Remove it to avoid an unused struct member.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 09:26:17 -08:00
James Touton 9023198280 git-p4: use raw string literals for regular expressions
Fixes several Python diagnostics about invalid escape sequences. The
diagnostics appear for me in Python 3.12, and may appear in earlier
versions. The fix is to use raw string literals so that backslashes are
not interpreted as introducing escape sequences. Raw string literals
are already in use in this file, so adding more does not impact
toolchain compatibility.

Signed-off-by: James Touton <bekenn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-29 09:25:16 -08:00
Junio C Hamano 976d0251ce CoC: whitespace fix
Fix two lines with trailing whitespaces.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23 10:40:10 -08:00
René Scharfe 457f96252f parse-options: simplify positivation handling
We accept the positive version of options whose long name starts with
"no-" and are defined without the flag PARSE_OPT_NONEG.  E.g. git clone
has an explicitly defined --no-checkout option and also implicitly
accepts --checkout to override it.

parse_long_opt() handles that by restarting the option matching with the
positive version when it finds that only the current option definition
starts with "no-", but not the user-supplied argument.  This code is
located almost at the end of the matching logic.

Avoid the need for a restart by moving the code up.  We don't have to
check the positive arg against the negative long_name at all -- the
"no-" prefix of the latter makes a match impossible.  Skip it and toggle
OPT_UNSET right away to simplify the control flow.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:17:12 -08:00
Junio C Hamano af3d2c160f Docs: majordomo@vger.kernel.org has been decomissioned
Update the instruction for subscribing to the Git mailing list
we have on a few documentation pages.

Reported-by: Kyle Lippincott <spectral@google.com>
Helped-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 10:09:07 -08:00
René Scharfe 5825268db1 parse-options: fully disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
baa4adc66a (parse-options: disable option abbreviation with
PARSE_OPT_KEEP_UNKNOWN, 2019-01-27) turned off support for abbreviated
options when the flag PARSE_OPT_KEEP_UNKNOWN is given, as any shortened
option could also be an abbreviation for one of the unknown options.

The code for handling abbreviated options is guarded by an if, but it
can also be reached via goto.  baa4adc66a only blocked the first way.
Add the condition to the other ones as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 09:55:43 -08:00
Elijah Newren 1c5bc6971e diffcore-delta: avoid ignoring final 'line' of file
hash_chars() would hash lines to integers, and store them in a spanhash,
but cut lines at 64 characters.  Thus, whenever it reached 64 characters
or a newline, it would create a new spanhash.  The problem is, the final
part of the file might not end 64 characters after the previous 'line'
and might not end with a newline.  This could, for example, cause an
85-byte file with 12 lines and only the first character in the file
differing to appear merely 23% similar rather than the expected 97%.
Ensure the last line is included, and add a testcase that would have
caught this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 19:10:11 -08:00
Toon Claes 0aabeaa562 builtin/show-ref: treat directory as non-existing in --exists
9080a7f178 (builtin/show-ref: add new mode to check for reference
existence, 2023-10-31) added the option --exists to git-show-ref(1).

When you use this option against a ref that doesn't exist, but it is
a parent directory of an existing ref, you get the following error:

    $ git show-ref --exists refs/heads
    error: failed to look up reference: Is a directory

when the ref-files backend is in use.  To be more clear to user,
hide the error about having found a directory.  What matters to the
user is that the named ref does not exist.  Instead, print the same
error as when the ref was not found:

    error: reference does not exist

Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-18 11:17:25 -08:00
Nikolay Borisov f10031fadd rebase: fix documentation about used shell in -x
The shell used when using the -x option is erroneously documented to be
the one pointed to by the $SHELL environmental variable. This was true
when rebase was implemented as a shell script but this is no longer
true.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-17 16:14:57 -08:00
Nikolay Edigaryev 8f50984cf4 rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
'--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
filter objects in traverse_commit_list, 2017-11-21) and later expanded
to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
2020-02-14)

The logic that was introduced in these commits (and that still persists
to this day) omits blobs larger than _or equal_ to n bytes or units.

However, the documentation (Documentation/rev-list-options.txt) states:

>The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
bytes or units. n may be zero.

Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
logic, so it seems it is the documentation that needs fixing, not the
code.

This changes the explanation to be similar to
Documentation/git-clone.txt, which is correct.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-16 08:53:13 -08:00
Linus Arver f10b0989b8 strvec: use correct member name in comments
In d70a9eb611 (strvec: rename struct fields, 2020-07-28), we renamed the
"argv" member to "v". In the same patch we also did the following rename
in strvec.c:

    -void strvec_pushv(struct strvec *array, const char **argv)
    +void strvec_pushv(struct strvec *array, const char **items)

and it appears that this s/argv/items operation was erroneously applied
to strvec.h.

Rename "items" to "v".

Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 13:38:07 -08:00
Illia Bobyr 25aec06326 rebase: clarify --reschedule-failed-exec default
Documentation should mention the default behavior.

It is better to explain the persistent nature of the
--reschedule-failed-exec flag from the user standpoint, rather than from
the implementation standpoint.

Signed-off-by: Illia Bobyr <illia.bobyr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 09:41:25 -08:00
Jeff King 993d38a066 index-pack: spawn threads atomically
The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

In the meantime, it's pretty easy to avoid the race by making creation
of the worker threads "atomic". That is, we'll spawn all of them before
letting any of them start to work. That's easy to do because we already
have a work_lock() mutex for handing out that work. If the main process
takes it, then all of the threads will immediately block until we've
finished spawning and released it.

This shouldn't make any practical difference for non-LSan runs. The
thread spawning is quick, and could happen before any worker thread gets
scheduled anyway.

Probably other spots that use threads are subject to the same issues.
But since we have to manually insert locking (and since this really is
kind of a hack), let's not bother with them unless somebody experiences
a similar racy false-positive in practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 08:40:56 -08:00
Jeff King d70f554cdf commit-graph: retain commit slab when closing NULL commit_graph
This fixes a regression introduced in ac6d45d11f (commit-graph: move
slab-clearing to close_commit_graph(), 2023-10-03), in which running:

  git -c fetch.writeCommitGraph=true fetch --recurse-submodules

multiple times in a freshly cloned repository causes a segfault. What
happens in the second (and subsequent) runs is this:

  1. We make a "struct commit" for any ref tips which we're storing
     (even if we already have them, they still go into FETCH_HEAD).

     Because the first run will have created a commit graph, we'll find
     those commits in the graph.

     The commit struct is therefore created with a NULL "maybe_tree"
     entry, because we can load its oid from the graph later. But to do
     that we need to remember that we got the commit from the graph,
     which is recorded in a global commit_graph_data_slab object.

  2. Because we're using --recurse-submodules, we'll try to fetch each
     of the possible submodules. That implies creating a separate
     "struct repository" in-process for each submodule, which will
     require a later call to repo_clear().

     The call to repo_clear() calls raw_object_store_clear(), which in
     turn calls close_object_store(), which in turn calls
     close_commit_graph(). And the latter frees the commit graph data
     slab.

  3. Later, when trying to write out a new commit graph, we'll ask for
     their tree oid via get_commit_tree_oid(), which will see that the
     object is parsed but with a NULL maybe_tree field. We'd then
     usually pull it from the graph file, but because the slab was
     cleared, we don't realize that we can do so! We end up returning
     NULL and segfaulting.

     (It seems questionable that we'd write a graph entry for such a
     commit anyway, since we know we already have one. I didn't
     double-check, but that may simply be another side effect of having
     cleared the slab).

The bug is in step (2) above. We should not be clearing the slab when
cleaning up the submodule repository structs. Prior to ac6d45d11f, we
did not do so because it was done inside a helper function that returned
early when it saw NULL. So the behavior change from that commit is that
we'll now _always_ clear the slab via repo_clear(), even if the
repository being closed did not have a commit graph (and thus would have
a NULL commit_graph struct).

The most immediate fix is to add in a NULL check in close_commit_graph(),
making it a true noop when passed in an object_store with a NULL
commit_graph (it's OK to just return early, since the rest of its code
is already a noop when passed NULL). That restores the pre-ac6d45d11f
behavior. And that's what this patch does, along with a test that
exercises it (we already have a test that uses submodules along with
fetch.writeCommitGraph, but the bug only triggers when there is a
subsequent fetch and when that fetch uses --recurse-submodules).

So that fixes the regression in the least-risky way possible.

I do think there's some fragility here that we might want to follow up
on. We have a global commit_graph_data_slab that contains graph
positions, and our global commit structs depend on the that slab
remaining valid. But close_commit_graph() is just about closing _one_
object store's graph. So it's dangerous to call that function and clear
the slab without also throwing away any "struct commit" we might have
parsed that depends on it.

Which at first glance seems like a bug we could already trigger. In the
situation described here, there is no commit graph in the submodule
repository, so our commit graph is NULL (in fact, in our test script
there is no submodule repo at all, so we immediately return from
repo_init() and call repo_clear() only to free up memory). But what
would happen if there was one? Wouldn't we see a non-NULL commit_graph
entry, and then clear the global slab anyway?

The answer is "no", but for very bizarre reasons. Remember that
repo_clear() calls raw_object_store_clear(), which then calls
close_object_store() and thus close_commit_graph(). But before it does
so, raw_object_store_clear() does something else: it frees the commit
graph and sets it to NULL! So by this code path we'll _never_ see a
non-NULL commit_graph struct, and thus never clear the slab.

So it happens to work out. But it still seems questionable to me that we
would clear a global slab (which might still be in use) when closing the
commit graph. This clearing comes from 957ba814bf (commit-graph: when
closing the graph, also release the slab, 2021-09-08), and was fixing a
case where we really did need it to be closed (and in that case we
presumably call close_object_store() more directly).

So I suspect there may still be a bug waiting to happen there, as any
object loaded before the call to close_object_store() may be stranded
with a bogus maybe_tree entry (and thus looking at it after the call
might cause an error). But I'm not sure how to trigger it, nor what the
fix should look like (you probably would need to "unparse" any objects
pulled from the graph). And so this patch punts on that for now in favor
of fixing the recent regression in the most direct way, which should not
have any other fallouts.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-05 08:35:26 -08:00
Chandra Pratap 556e68032f write-or-die: make GIT_FLUSH a Boolean environment variable
Among Git's environment variables, the ones marked as "Boolean"
accept values in a way similar to Boolean configuration variables,
i.e. values like 'yes', 'on', 'true' and positive numbers are
taken as "on" and values like 'no', 'off', 'false' are taken as
"off".

GIT_FLUSH can be used to force Git to use non-buffered I/O when
writing to stdout. It can only accept two values, '1' which causes
Git to flush more often and '0' which makes all output buffered.
Make GIT_FLUSH accept more values besides '0' and '1' by turning it
into a Boolean environment variable, modifying the required logic.
Update the related documentation.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-04 10:32:21 -08:00
Sam Delmerico ee9895b0ff push: region_leave trace for negotiate_using_fetch
There were two region_enter events for negotiate_using_fetch instead of
one enter and one leave. This commit replaces the second region_enter
event with a region_leave.

Signed-off-by: Sam Delmerico <delmerico@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 15:32:44 -08:00
Maarten van der Schrieck 9cd30af991 Documentation: fix statement about rebase.instructionFormat
Since commit 62db5247 (rebase -i: generate the script via
rebase--helper, 2017-07-14), the short hash is given in
rebase-todo. Specifying rebase.instructionFormat does not alter this
behavior, contrary to what the documentation implies.

Signed-off-by: Maarten van der Schrieck <maarten@thingsconnected.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 11:21:15 -08:00
René Scharfe 54d8a2531b t1006: prefer shell loop to awk for packed object sizes
To compute the expected on-disk size of packed objects, we sort the
output of show-index by pack offset and then compute the difference
between adjacent entries using awk. This works but has a few readability
problems:

  1. Reading the index in pack order means don't find out the size of an
     oid's entry until we see the _next_ entry. So we have to save it to
     print later.

     We can instead iterate in reverse order, so we compute each oid's
     size as we see it.

  2. Since the awk invocation is inside a text_expect block, we can't
     easily use single-quotes to hold the script. So we use
     double-quotes, but then have to escape the dollar signs in the awk
     script.

     We can swap this out for a shell loop instead (which is made much
     easier by the first change).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-03 09:26:53 -08:00
Chandra Pratap 03bcc93769 sideband.c: remove redundant 'NEEDSWORK' tag
Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-28 07:59:12 -08:00
Josh Soref 291873e5d6 SubmittingPatches: hyphenate non-ASCII
Git documentation does this with the exception of ancient release notes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
Josh Soref 7818951623 SubmittingPatches: clarify GitHub artifact format
GitHub wraps artifacts generated by workflows in a .zip file.

Internally, workflows can package anything they like in them.

A recently generated failure artifact had the form:

windows-artifacts.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
 76001695  12-19-2023 01:35   artifacts.tar.gz
 11005650  12-19-2023 01:35   tracked.tar.gz
---------                     -------
 87007345                     2 files

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
Josh Soref 0771a3b55c SubmittingPatches: clarify GitHub visual
GitHub has two general forms for its states, sometimes they're a simple
colored object (e.g. green check or red x), and sometimes there's also a
colored container (e.g. green box or red circle) which contains that
object (e.g. check or x).

That's a lot of words to try to describe things, but in general, the key
for a failure is that it's recognized as an `x` and that it's associated
with the color red -- the color of course is problematic for people who
are red-green color-blind, but that's why they are paired with distinct
shapes.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:24 -08:00
Josh Soref 08e2e6f8d2 SubmittingPatches: provide tag naming advice
Current statistics show a strong preference to only capitalize the first
letter in a hyphenated tag, but that some guidance would be helpful:

git log |
  perl -ne 'next unless /^\s+(?:Signed-[oO]ff|Acked)-[bB]y:/;
    s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n
   2 Signed-off-By:
   4 Signed-Off-by:
  22 Acked-By:
  47 Signed-Off-By:
2202 Acked-by:
95315 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Josh Soref c771ef6f77 SubmittingPatches: update extra tags list
Add items with at least 100 uses in the past three years:
- Co-authored-by
- Helped-by
- Mentored-by
- Suggested-by

git log --since=3.years|
  perl -ne 'next unless /^\s+[A-Z][a-z]+-\S+:/;s/^\s+//;s/:.*/:/;print'|
  sort|uniq -c|sort -n|grep '[0-9][0-9] '
  14 Based-on-patch-by:
  14 Original-patch-by:
  17 Tested-by:
 100 Suggested-by:
 121 Co-authored-by:
 163 Mentored-by:
 274 Reported-by:
 290 Acked-by:
 450 Helped-by:
 602 Reviewed-by:
14111 Signed-off-by:

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Josh Soref ac9fff2bf1 SubmittingPatches: discourage new trailers
There seems to be consensus amongst the core Git community on a working
set of common trailers, and there are non-trivial costs to people
inventing new trailers (research to discover what they mean/how they
differ from existing trailers) such that inventing new ones is generally
unwarranted and not something to be recommended to new contributors.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Josh Soref 127106294a SubmittingPatches: drop ref to "What's in git.git"
"What's in git.git" was last seen in 2010:
  https://lore.kernel.org/git/?q=%22what%27s+in+git.git%22
  https://lore.kernel.org/git/7vaavikg72.fsf@alter.siamese.dyndns.org/

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Josh Soref e6397c5cc8 CodingGuidelines: write punctuation marks
- Match style in Release Notes

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Josh Soref 2d194548cb CodingGuidelines: move period inside parentheses
The contents within parenthesis should be omittable without resulting
in broken text.

Eliding the parenthesis left a period to end a run without any content.

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-27 21:33:23 -08:00
Junio C Hamano 53ded839ae sparse-checkout: use default patterns for 'set' only !stdin
"git sparse-checkout set ---no-cone" uses default patterns when none
is given from the command line, but it should do so ONLY when
--stdin is not being used.  Right now, add_patterns_from_input()
called when reading from the standard input is sloppy and does not
check if there are extra command line parameters that the command
will silently ignore, but that will change soon and not setting this
unnecessary and unused default patterns start to matter when it gets
fixed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:15:58 -08:00
Elijah Newren d57c671a51 treewide: remove unnecessary includes in source files
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:33 -08:00