Commit graph

68575 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
9f3df6c048 run-command.c: use "opts->processes", not "pp->max_processes"
Neither the "processes" nor "max_processes" members ever change after
their initialization, and they're always equivalent, but some existing
code used "pp->max_processes" when we were already passing the "opts"
to the function, let's use the "opts" directly instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
Ævar Arnfjörð Bjarmason
2aa8d2259f run-command.c: don't copy "data" to "struct parallel_processes"
As with the *_fn members removed in a preceding commit, let's not copy
the "data" member of the "struct run_process_parallel_opts" over to
the "struct parallel_processes". Now that we're passing the "opts"
down there's no reason to do so.

This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:42 -07:00
Ævar Arnfjörð Bjarmason
357f8e6e18 run-command.c: don't copy "ungroup" to "struct parallel_processes"
As with the *_fn members removed in the preceding commit, let's not
copy the "ungroup" member of the "struct run_process_parallel_opts"
over to the "struct parallel_processes". Now that we're passing the
"opts" down there's no reason to do so.

This makes the code easier to follow, as we have a "const" attribute
on the "struct run_process_parallel_opts", but not "struct
parallel_processes". We do not alter the "ungroup" argument, so
storing it in the non-const structure would make this control flow
less obvious.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
fa93951d79 run-command.c: don't copy *_fn to "struct parallel_processes"
The only remaining reason for copying the callbacks in the "struct
run_process_parallel_opts" over to the "struct parallel_processes" was
to avoid two if/else statements in case the "start_failure" and
"task_finished" callbacks were NULL.

Let's handle those cases in pp_start_one() and pp_collect_finished()
instead, and avoid the default_* stub functions, and the need to copy
this data around.

Organizing the code like this made more sense before the "struct
run_parallel_parallel_opts" existed, as we'd have needed to pass each
of these as a separate parameter.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
e39c9de860 run-command.c: make "struct parallel_processes" const if possible
Add a "const" to two "struct parallel_processes" parameters where
we're not modifying anything in "pp". For kill_children() we'll call
it from both the signal handler, and from run_processes_parallel()
itself. Adding a "const" there makes it clear that we don't need to
modify any state when killing our children.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
36d69bf77e run-command API: move *_tr2() users to "run_processes_parallel()"
Have the users of the "run_processes_parallel_tr2()" function use
"run_processes_parallel()" instead. In preceding commits the latter
was refactored to take a "struct run_process_parallel_opts" argument,
since the only reason for "run_processes_parallel_tr2()" to exist was
to take arguments that are now a part of that struct we can do away
with it.

See ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) for the addition of the "*_tr2()" variant of the function,
it was used by every caller except "t/helper/test-run-command.c"..

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
6e5ba0bae4 run-command API: have run_process_parallel() take an "opts" struct
As noted in fd3aaf53f7 (run-command: add an "ungroup" option to
run_process_parallel(), 2022-06-07) which added the "ungroup" passing
it to "run_process_parallel()" via the global
"run_processes_parallel_ungroup" variable was a compromise to get the
smallest possible regression fix for "maint" at the time.

This follow-up to that is a start at passing that parameter and others
via a new "struct run_process_parallel_opts", as the earlier
version[1] of what became fd3aaf53f7 did.

Since we need to change all of the occurrences of "n" to
"opt->SOMETHING" let's take the opportunity and rename the terse "n"
to "processes". We could also have picked "max_processes", "jobs",
"threads" etc., but as the API is named "run_processes_parallel()"
let's go with "processes".

Since the new "run_processes_parallel()" function is able to take an
optional "tr2_category" and "tr2_label" via the struct we can at this
point migrate all of the users of "run_processes_parallel_tr2()" over
to it.

But let's not migrate all the API users yet, only the two users that
passed the "ungroup" parameter via the
"run_processes_parallel_ungroup" global

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
c333e6f3a8 run-command.c: use designated init for pp_init(), add "const"
Use a designated initializer to initialize those parts of pp_init()
that don't need any conditionals for their initialization, this sets
us on a path to pp_init() itself into mostly a validation and
allocation function.

Since we're doing that we can add "const" to some of the members of
the "struct parallel_processes", which helps to clarify and
self-document this code. E.g. we never alter the "data" pointer we
pass t user callbacks, nor (after the preceding change to stop
invoking online_cpus()) do we change "max_processes", the same goes
for the "ungroup" option.

We can also do away with a call to strbuf_init() in favor of macro
initialization, and to rely on other fields being NULL'd or zero'd.

Making members of a struct "const" rather that the pointer to the
struct itself is usually painful, as e.g. it precludes us from
incrementally setting up the structure. In this case we only set it up
with the assignment in run_process_parallel() and pp_init(), and don't
pass the struct pointer around as "const", so making individual
members "const" is worth the potential hassle for extra safety.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
51243f9f0f run-command API: don't fall back on online_cpus()
When a "jobs = 0" is passed let's BUG() out rather than fall back on
online_cpus(). The default behavior was added when this API was
implemented in c553c72eed (run-command: add an asynchronous parallel
child processor, 2015-12-15).

Most of our code in-tree that scales up to "online_cpus()" by default
calls that function by itself. Keeping this default behavior just for
the sake of two callers means that we'd need to maintain this one spot
where we're second-guessing the config passed down into pp_init().

The preceding commit has an overview of the API callers that passed
"jobs = 0". There were only two of them (actually three, but they
resolved to these two config parsing codepaths).

The "fetch.parallel" caller already had a test for the
"fetch.parallel=0" case added in 0353c68818 (fetch: do not run a
redundant fetch from submodule, 2022-05-16), but there was no such
test for "submodule.fetchJobs". Let's add one here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:41 -07:00
Ævar Arnfjörð Bjarmason
6a48b428b4 run-command API: make "n" parameter a "size_t"
Make the "n" variable added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) a "size_t". As
we'll see in a subsequent commit we do pass "0" here, but never "jobs
< 0".

We could have made it an "unsigned int", but as we're having to change
this let's not leave another case in the codebase where a size_t and
"unsigned int" size differ on some platforms. In this case it's likely
to never matter, but it's easier to not need to worry about it.

After this and preceding changes:

	make run-command.o DEVOPTS=extra-all CFLAGS=-Wno-unused-parameter

Only has one (and new) -Wsigned-compare warning relevant to a
comparison about our "n" or "{nr,max}_processes": About using our
"n" (size_t) in the same expression as online_cpus() (int). A
subsequent commit will adjust & deal with online_cpus() and that
warning.

The only users of the "n" parameter are:

 * builtin/fetch.c: defaults to 1, reads from the "fetch.parallel"
   config. As seen in the code that parses the config added in
   d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
   2019-10-05) will die if the git_config_int() return value is < 0.

   It will however pass us n = 0, as we'll see in a subsequent commit.

 * submodule.c: defaults to 1, reads from "submodule.fetchJobs"
   config. Read via code originally added in a028a1930c (fetching
   submodules: respect `submodule.fetchJobs` config option, 2016-02-29).

   It now piggy-backs on the the submodule.fetchJobs code and
   validation added in f20e7c1ea2 (submodule: remove
   submodule.fetchjobs from submodule-config parsing, 2017-08-02).

   Like builtin/fetch.c it will die if the git_config_int() return
   value is < 0, but like builtin/fetch.c it will pass us n = 0.

 * builtin/submodule--helper.c: defaults to 1. Read via code
   originally added in 2335b870fa (submodule update: expose parallelism
   to the user, 2016-02-29).

   Since f20e7c1ea2 (submodule: remove submodule.fetchjobs from
   submodule-config parsing, 2017-08-02) it shares a config parser and
   semantics with the submodule.c caller.

 * hook.c: hardcoded to 1, see 96e7225b31 (hook: add 'run'
   subcommand, 2021-12-22).

 * t/helper/test-run-command.c: can be -1 after parsing the arguments,
   but will then be overridden to online_cpus() before passing it to
   this API. See be5d88e112 (test-tool run-command: learn to run (parts
   of) the testsuite, 2019-10-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
Ævar Arnfjörð Bjarmason
910e2b372f run-command tests: use "return", not "exit"
Change the "run-command" test helper to "return" instead of calling
"exit", see 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08)

Because we'd previously gotten past the SANITIZE=leak check by using
exit() here we need to move to "goto cleanup" pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
Ævar Arnfjörð Bjarmason
7dd5762d9f run-command API: have "run_processes_parallel{,_tr2}()" return void
Change the "run_processes_parallel{,_tr2}()" functions to return void,
instead of int. Ever since c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) they have
unconditionally returned 0.

To get a "real" return value out of this function the caller needs to
get it via the "task_finished_fn" callback, see the example in hook.c
added in 96e7225b31 (hook: add 'run' subcommand, 2021-12-22).

So the "result = " and "if (!result)" code added to "builtin/fetch.c"
d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
2019-10-05) has always been redundant, we always took that "if"
path. Likewise the "ret =" in "t/helper/test-run-command.c" added in
be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) wasn't used, instead we got the return value
from the "if (suite.failed.nr > 0)" block seen in the context.

Subsequent commits will alter this API interface, getting rid of this
always-zero return value makes it easier to understand those changes.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
Ævar Arnfjörð Bjarmason
a083f94c21 run-command test helper: use "else if" pattern
Adjust the cmd__run_command() to use an "if/else if" chain rather than
mutually exclusive "if" statements. This non-functional change makes a
subsequent commit smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:12:40 -07:00
Sotir Danailov
a2634646eb docs: git-send-email: difference between ssl and tls smtp-encryption
New explanation for the difference between these values.
It's hard to understand what they do based only on the names.
New description of used default ports.

Signed-off-by: Sotir Danailov <sndanailov@wired4ever.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 11:08:37 -07:00
Derrick Stolee
8628a842bd bundle-uri: suppress stderr from remote-https
When downloading bundles from a git-remote-https subprocess, the bundle
URI logic wants to be opportunistic and download as much as possible and
work with what did succeed. This is particularly important in the "any"
mode, where any single bundle success will work.

If the URI is not available, the git-remote-https process will die()
with a "fatal:" error message, even though that error is not actually
fatal to the super process. Since stderr is passed through, it looks
like a fatal error to the user.

Suppress stderr to avoid these errors from bubbling to the surface. The
bundle URI API adds its own warning() messages on these failures.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
Derrick Stolee
70334fc3eb bundle-uri: quiet failed unbundlings
When downloading a list of bundles in "all" mode, Git has no
understanding of the dependencies between the bundles. Git attempts to
unbundle the bundles in some order, but some may not pass the
verify_bundle() step because of missing prerequisites. This is passed as
error messages to the user, even when they eventually succeed in later
attempts after their dependent bundles are unbundled.

Add a new VERIFY_BUNDLE_QUIET flag to verify_bundle() that avoids the
error messages from the missing prerequisite commits. The method still
returns the number of missing prerequisit commits, allowing callers to
unbundle() to notice that the bundle failed to apply.

Use this flag in bundle-uri.c and test that the messages go away for
'git clone --bundle-uri' commands.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
Derrick Stolee
89bd7fedf9 bundle: add flags to verify_bundle()
The verify_bundle() method has a 'verbose' option, but we will want to
extend this method to have more granular control over its output. First,
replace this 'verbose' option with a new 'flags' option with a single
possible value: VERIFY_BUNDLE_VERBOSE.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
Derrick Stolee
c23f592117 bundle-uri: fetch a list of bundles
When the content at a given bundle URI is not understood as a bundle
(based on inspecting the initial content), then Git currently gives up
and ignores that content. Independent bundle providers may want to split
up the bundle content into multiple bundles, but still make them
available from a single URI.

Teach Git to attempt parsing the bundle URI content as a Git config file
providing the key=value pairs for a bundle list. Git then looks at the
mode of the list to see if ANY single bundle is sufficient or if ALL
bundles are required. The content at the selected URIs are downloaded
and the content is inspected again, creating a recursive process.

To guard the recursion against malformed or malicious content, limit the
recursion depth to a reasonable four for now. This can be converted to a
configured value in the future if necessary. The value of four is twice
as high as expected to be useful (a bundle list is unlikely to point to
more bundle lists).

To test this scenario, create an interesting bundle topology where three
incremental bundles are built on top of a single full bundle. By using a
merge commit, the two middle bundles are "independent" in that they do
not require each other in order to unbundle themselves. They each only
need the base bundle. The bundle containing the merge commit requires
both of the middle bundles, though. This leads to some interesting
decisions when unbundling, especially when we later implement heuristics
that promote downloading bundles until the prerequisite commits are
satisfied.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
Derrick Stolee
c96060b0ce bundle: properly clear all revision flags
The verify_bundle() method checks two things for a bundle's
prerequisites:

 1. Are these objects in the object store?
 2. Are these objects reachable from our references?

In this second question, multiple uses of verify_bundle() in the same
process can report an invalid bundle even though it is correct. The
reason is due to not clearing all of the commit marks on the commits
previously walked.

The revision walk machinery was first introduced in-process by
fb9a54150d (git-bundle: avoid fork() in verify_bundle(), 2007-02-22).
This implementation used "-1" as the set of flags to clear. The next
meaningful change came in 2b064697a5 (revision traversal: retire
BOUNDARY_SHOW, 2007-03-05), which introduced the PREREQ_MARK flag
instead of a flag normally controlled by the revision-walk machinery.

In 86a0a408b9 (commit: factor out
clear_commit_marks_for_object_array, 2011-10-01), the loop over the
array of commits was replaced with a new
clear_commit_marks_for_object_array(), but simultaneously the "-1" value
was replaced with "ALL_REV_FLAGS", which stopped un-setting the
PREREQ_MARK flag. This means that if multiple commits were marked by the
PREREQ_MARK in a previous run of verify_bundle(), then this loop could
terminate early due to 'i' going to zero:

	while (i && (commit = get_revision(&revs)))
		if (commit->object.flags & PREREQ_MARK)
			i--;

The flag clearing work was changed again in 63647391e6 (bundle: avoid
using the rev_info flag leak_pending, 2017-12-25), but that was only
cosmetic and did not change the behavior.

It may seem that it would be sufficient to add the PREREQ_MARK flag to
the clear_commit_marks() call in its current location. However, we
actually need to do it in the "cleanup:" step, since the first loop
checking "Are these objects in the object store?" might add the
PREREQ_MARK flag to some objects and then terminate without performing a
walk due to one missing object. By clearing the flags in all cases, we
avoid this issue when running verify_bundle() multiple times in the same
process.

Moving this loop to the cleanup step alone would cause a segfault when
running 'git bundle verify' outside of a repository, but this is because
of that error condition using "goto cleanup" when returning is perfectly
safe. Nothing has been initialized at that point, so we can return
immediately without causing any leaks.

This behavior is verified carefully by a test that will be added soon
when Git learns to download bundle lists in a 'git clone --bundle-uri'
command.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:25 -07:00
Derrick Stolee
20c1e2a68b bundle-uri: limit recursion depth for bundle lists
The next change will start allowing us to parse bundle lists that are
downloaded from a provided bundle URI. Those lists might point to other
lists, which could proceed to an arbitrary depth (and even create
cycles). Restructure fetch_bundle_uri() to have an internal version that
has a recursion depth. Compare that to a new max_bundle_uri_depth
constant that is twice as high as we expect this depth to be for any
legitimate use of bundle list linking.

We can consider making max_bundle_uri_depth a configurable value if
there is demonstrated value in the future.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Derrick Stolee
738e5245fa bundle-uri: parse bundle list in config format
When a bundle provider wants to operate independently from a Git remote,
they want to provide a single, consistent URI that users can use in
their 'git clone --bundle-uri' commands. At this point, the Git client
expects that URI to be a single bundle that can be unbundled and used to
bootstrap the rest of the clone from the Git server. This single bundle
cannot be re-used to assist with future incremental fetches.

To allow for the incremental fetch case, teach Git to understand a
bundle list that could be advertised at an independent bundle URI. Such
a bundle list is likely to be inspected by human readers, even if only
by the bundle provider creating the list. For this reason, we can take
our expected "key=value" pairs and instead format them using Git config
format.

Create bundle_uri_parse_config_format() to parse a file in config format
and convert that into a 'struct bundle_list' filled with its
understanding of the contents.

Be careful to use error_action CONFIG_ERROR_ERROR when calling
git_config_from_file_with_options() because the default action for
git_config_from_file() is to die() on a parsing error.  The current
warning isn't particularly helpful if it arises to a user, but it will
be made more verbose at a higher layer later.

Update 'test-tool bundle-uri' to take this config file format as input.
It uses a filename instead of stdin because there is no existing way to
parse a FILE pointer in the config machinery. Using
git_config_from_mem() is overly complicated and more likely to introduce
bugs than this simpler version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Ævar Arnfjörð Bjarmason
d796cedbe8 bundle-uri: unit test "key=value" parsing
Create a new 'test-tool bundle-uri' test helper. This helper will assist
in testing logic deep in the bundle URI feature.

This change introduces the 'parse-key-values' subcommand, which parses
an input file as a list of lines. These are fed into
bundle_uri_parse_line() to test how we construct a 'struct bundle_list'
from that data. The list is then output to stdout as if the key-value
pairs were a Git config file.

We use an input file instead of stdin because of a future change to
parse in config-file format that works better as an input file.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Ævar Arnfjörð Bjarmason
9424e373fd bundle-uri: create "key=value" line parsing
When advertising a bundle list over Git's protocol v2, we will use
packet lines. Each line will be of the form "key=value" representing a
bundle list. Connect the API necessary for Git's transport to the
key-value pair parsing created in the previous change.

We are not currently implementing this protocol v2 functionality, but
instead preparing to expose this parsing to be unit-testable.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Derrick Stolee
bff03c47f7 bundle-uri: create base key-value pair parsing
There will be two primary ways to advertise a bundle list: as a list of
packet lines in Git's protocol v2 and as a config file served from a
bundle URI. Both of these fundamentally use a list of key-value pairs.
We will use the same set of key-value pairs across these formats.

Create a new bundle_list_update() method that is currently unusued, but
will be used in the next change. It inspects each key to see if it is
understood and then applies it to the given bundle_list. Here are the
keys that we teach Git to understand:

* bundle.version: This value should be an integer. Git currently
  understands only version 1 and will ignore the list if the version is
  any other value. This version can be increased in the future if we
  need to add new keys that Git should not ignore. We can add new
  "heuristic" keys without incrementing the version.

* bundle.mode: This value should be one of "all" or "any". If this
  mode is not understood, then Git will ignore the list. This mode
  indicates whether Git needs all of the bundle list items to make a
  complete view of the content or if any single item is sufficient.

The rest of the keys use a bundle identifier "<id>" as part of the key
name. Keys using the same "<id>" describe a single bundle list item.

* bundle.<id>.uri: This stores the URI of the bundle item. This
  currently is expected to be an absolute URI, but will be relaxed to be
  a relative URI in the future.

While parsing, return an error if a URI key is repeated, since we can
make that restriction with bundle lists.

Make the git_parse_int() method global so we can parse the integer
version value carefully.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Derrick Stolee
0634f717a3 bundle-uri: create bundle_list struct and helpers
It will likely be rare where a user uses a single bundle URI and expects
that URI to point to a bundle. Instead, that URI will likely be a list
of bundles provided in some format. Alternatively, the Git server could
advertise a list of bundles.

In anticipation of these two ways of advertising multiple bundles,
create a data structure that represents such a list. This will be
populated using a common API, but for now focus on what data can be
represented.

Each list contains a number of remote_bundle_info structs. These contain
an 'id' that is used to uniquely identify them in the list, and also a
'uri' that contains the location of its data. Finally, there is a strbuf
containing the filename used when Git downloads the contents to disk.

The list itself stores these remote_bundle_info structs in a hashtable
using 'id' as the key. The order of the structs in the input is
considered unimportant, but future modifications to the format and these
data structures will place ordering possibilities on the set. The list
also has a few "global" properties, including the version (used when
parsing the list) and the mode. The mode is one of these two options:

1. BUNDLE_MODE_ALL: all listed URIs are intended to be combined
   together. The client should download all of the advertised data to
   have a complete copy of the data.

2. BUNDLE_MODE_ANY: any one listed item is sufficient to have a complete
   copy of the data. The client can choose arbitrarily from these
   options. In the future, the client may use pings to find the closest
   URI among geodistributed replicas, or use some other heuristic
   information added to the format.

This API is currently unused, but will soon be expanded with parsing
logic and then be consumed by the bundle URI download logic.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Derrick Stolee
23b6d00ba7 bundle-uri: use plain string in find_temp_filename()
The find_temp_filename() method was created in 53a50892be (bundle-uri:
create basic file-copy logic, 2022-08-09) and uses odb_mkstemp() to
create a temporary filename. The odb_mkstemp() method uses a strbuf in
its interface, but we do not need to continue carrying a strbuf
throughout the bundle URI code.

Convert the find_temp_filename() method to use a 'char *' and modify its
only caller. This makes sense that we don't actually need to modify this
filename directly later, so using a strbuf is overkill.

This change will simplify the data structure for tracking a bundle list
to use plain strings instead of strbufs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 09:13:24 -07:00
Junio C Hamano
d420dda057 The second batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 10:36:12 -07:00
Junio C Hamano
c68bd3ec22 Merge branch 'rs/gc-pack-refs-simplify'
Code clean-up.

* rs/gc-pack-refs-simplify:
  gc: simplify maintenance_task_pack_refs()
2022-10-11 10:36:12 -07:00
Junio C Hamano
39c1578c5e Merge branch 'nb/doc-mergetool-typofix'
Typofix.

* nb/doc-mergetool-typofix:
  mergetool.txt: typofix 'overwriten' -> 'overwritten'
2022-10-11 10:36:12 -07:00
Junio C Hamano
b0416d8f4a Merge branch 'jk/sequencer-missing-author-name-check'
Typofix in code.

* jk/sequencer-missing-author-name-check:
  sequencer: detect author name errors in read_author_script()
2022-10-11 10:36:12 -07:00
Junio C Hamano
644195e02f Merge branch 'pw/ssh-sign-report-errors'
The codepath to sign learned to report errors when it fails to read
from "ssh-keygen".

* pw/ssh-sign-report-errors:
  ssh signing: return an error when signature cannot be read
2022-10-11 10:36:11 -07:00
Junio C Hamano
601bb23876 Merge branch 'pw/mailinfo-b-fix'
Fix a logic in "mailinfo -b" that miscomputed the length of a
substring, which lead to an out-of-bounds access.

* pw/mailinfo-b-fix:
  mailinfo -b: fix an out of bounds access
2022-10-11 10:36:11 -07:00
Junio C Hamano
654f5cedbc Merge branch 'rs/test-httpd-in-C-locale'
Force C locale while running tests around httpd to make sure we can
find expected error messages in the log.

* rs/test-httpd-in-C-locale:
  t/lib-httpd: pass LANG and LC_ALL to Apache
2022-10-11 10:36:11 -07:00
Junio C Hamano
d54f0c5a44 Merge branch 'ds/bundle-uri-docfix'
Doc formatting fix.

* ds/bundle-uri-docfix:
  bundle-uri: fix technical doc issues
2022-10-11 10:36:10 -07:00
Ævar Arnfjörð Bjarmason
438c2f859b CodingGuidelines: recommend against unportable C99 struct syntax
Per 33665d98e6 (reftable: make assignments portable to AIX xlc
v12.01, 2022-03-28) forms like ".a.b = *c" can be replaced by using
".a = { .b = *c }" instead.

We'll probably allow these sooner than later, but since the workaround
is trivial let's note it among the C99 features we'd like to hold off
on for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:55:01 -07:00
Ævar Arnfjörð Bjarmason
db84376f98 grep.c: remove "extended" in favor of "pattern_expression", fix segfault
Since 79d3696cfb (git-grep: boolean expression on pattern matching.,
2006-06-30) the "pattern_expression" member has been used for complex
queries (AND/OR...), with "pattern_list" being used for the simple OR
queries. Since then we've used both "pattern_expression" and its
associated boolean "extended" member to see if we have a complex
expression.

Since f41fb662f5 (revisions API: have release_revisions() release
"grep_filter", 2022-04-13) we've had a subtle bug relating to that: If
we supplied options that were only used for "complex queries", but
didn't supply the query itself we'd set "opt->extended", but would
have a NULL "pattern_expression". As a result these would segfault as
we tried to call "free_grep_patterns()" from "release_revisions()":

	git -P log -1 --invert-grep
	git -P log -1 --all-match

The root cause of this is that we were conflating the state management
we needed in "compile_grep_patterns()" itself with whether or not we
had an "opt->pattern_expression" later on.

In this cases as we're going through "compile_grep_patterns()" we have
no "opt->pattern_list" but have "opt->no_body_match" or
"opt->all_match". So we'd set "opt->extended = 1", but not "return" on
"opt->extended" as that's an "else if" in the same "if" statement.

That behavior is intentional and required, as the common case is that
we have an "opt->pattern_list" that we're about to parse into the
"opt->pattern_expression".

But we don't need to keep track of this "extended" flag beyond the
state management in compile_grep_patterns() itself. It needs it, but
once we're out of that function we can rely on
"opt->pattern_expression" being non-NULL instead for using these
extended patterns.

As 79d3696cfb itself shows we've assumed that there's a one-to-one
mapping between the two since the very beginning. I.e. "match_line()"
would check "opt->extended" to see if it should call "match_expr()",
and the first thing we do in that function is assume that we have a
"opt->pattern_expression". We'd then call "match_expr_eval()", which
would have died if that "opt->pattern_expression" was NULL.

The "die" was added in c922b01f54 (grep: fix segfault when "git grep
'('" is given, 2009-04-27), and can now be removed as it's now clearly
unreachable. We still do the right thing in the case that prompted
that fix:

	git grep '('
	fatal: unmatched parenthesis

Arguably neither the "--invert-grep" option added in [1] nor the
earlier "--all-match" option added in [2] were intended to be used
stand-alone, and another approach[3] would be to error out in those
cases. But since we've been treating them as a NOOP when given without
--grep for a long time let's keep doing that.

We could also return in "free_pattern_expr()" if the argument is
non-NULL, as an alternative fix for this segfault does [4]. That would
be more elegant in making the "free_*()" function behave like
"free()", but it would also remove a sanity check: The
"free_pattern_expr()" function calls itself recursively, and only the
top-level is allowed to be NULL, let's not conflate those two
conditions.

1. 22dfa8a23d (log: teach --invert-grep option, 2015-01-12)
2. 0ab7befa31 (grep --all-match, 2006-09-27)
3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/
4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com

Reported-by: orygaw <orygaw@protonmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:48:54 -07:00
René Scharfe
e3733b646d archive: deduplicate verbose printing
94bc671a1f (Add directory pattern matching to attributes, 2012-12-08)
moved the code for adding the trailing slash to names of directories and
submodules up.  This left both branches of the if statement starting
with the same conditional fprintf call.  Deduplicate it.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11 08:35:10 -07:00
Jeff King
c4f9490790 fsmonitor: fix leak of warning message
The fsm_settings__get_incompatible_msg() function returns an allocated
string.  So we can't pass its result directly to warning(); we must hold
on to the pointer and free it to avoid a leak.

The leak here is small and fixed size, but Coverity complained, and
presumably SANITIZE=leaks would eventually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 22:16:56 -07:00
Rubén Justo
0dc4e5c574 branch: support for shortcuts like @{-1}, completed
branch command with options "edit-description", "set-upstream-to" and
"unset-upstream" expects a branch name.  Since ae5a6c3684 (checkout:
implement "@{-N}" shortcut name for N-th last branch, 2009-01-17) a
branch can be specified using shortcuts like @{-1}.  Those shortcuts
need to be resolved when considering the arguments.

We can modify the description of the previously checked out branch with:

$ git branch --edit--description @{-1}

We can modify the upstream of the previously checked out branch with:

$ git branch --set-upstream-to upstream @{-1}
$ git branch --unset-upstream @{-1}

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 16:28:59 -07:00
Ævar Arnfjörð Bjarmason
d7d850e2b9 CodingGuidelines: mention C99 features we can't use
The C99 section of the CodingGuidelines is a good overview of what we
can use, but is sorely lacking in what we can't use. Something that
comes up occasionally is the portability of %z.

Per [1] we couldn't use it for the longest time due to MSVC not
supporting it, but nowadays by requiring C99 we rely on the MSVC
version that does, but we can't use it yet because a C library that
MinGW uses doesn't support it.

1. https://lore.kernel.org/git/a67e0fd8-4a14-16c9-9b57-3430440ef93c@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:12 -07:00
Ævar Arnfjörð Bjarmason
82dd01d81b CodingGuidelines: allow declaring variables in for loops
Since 44ba10d671 (revision: use C99 declaration of variable in for()
loop, 2021-11-14) released with v2.35.0 we've had a variable declared
with in a for loop.

Since then we've had inadvertent follow-ups to that with at least
cb2607759e (merge-ort: store more specific conflict information,
2022-06-18) released with v2.38.0.

As November 2022 is within the window of this upcoming release,
let's update the guideline to allow this.  We can have the promised
"revisit" discussion while this patch cooks, and drop it if it turns
out that it is still premature, which is not expected to happen at
this moment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
Ævar Arnfjörð Bjarmason
442c27dde7 CodingGuidelines: mention dynamic C99 initializer elements
The first use of variables in initializer elements appears to have
been 2b6854c863 (Cleanup variables in cat-file, 2007-04-21) released
with v1.5.2.

Some of those caused portability issues, and e.g. that "cat-file" use
was changed in 66dbfd55e3 (Rewrite dynamic structure initializations
to runtime assignment, 2010-05-14) which went out with v1.7.2.

But curiously 66dbfd55e3 missed some of them, e.g. an archive.c use
added in d5f53d6d6f (archive: complain about path specs that don't
match anything, 2009-12-12), and another one in merge-index.c (later
builtin/merge-index.c) in 0077138cd9 (Simplify some instances of
run_command() by using run_command_v_opt()., 2009-06-08).

As far as I can tell there's been no point since 2b6854c863 in 2007
where a compiler that didn't support this has been able to compile
git. Presumably 66dbfd55e3 was an attempt to make headway with wider
portability that ultimately wasn't completed.

In any case, we are thoroughly reliant on this syntax at this point,
so let's update the guidelines, see
https://lore.kernel.org/git/xmqqy1tunjgp.fsf@gitster.g/ for the
initial discussion.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
Ævar Arnfjörð Bjarmason
e88a2d02dc CodingGuidelines: update for C99
Since 7bc341e21b (git-compat-util: add a test balloon for C99
support, 2021-12-01) we've had a hard dependency on C99, but the prose
in CodingGuidelines was written under the assumption that we were
using C89 with a few C99 features.

As the updated prose notes we'd still like to hold off on novel C99
features, but let's make it clear that we target that C version, and
then enumerate new C99 features that are safe to use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 13:41:11 -07:00
Phillip Wood
a677d3c416 t3435: remove redundant test case
rebase --preserve-merges no longer exists so there is no point in
carrying this failing test case.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 11:18:18 -07:00
Jeff Hostetler
54795d37d9 config.mak.dev: disable suggest braces error on old clang versions
Add the "-Wno-missing-braces" option when building with an old version
of clang to suppress the "suggest braces around initialization" error
in developer mode.

For example, using an old version of clang gives the following errors
(when in DEVELOPER=1 mode):

$ make builtin/merge-file.o
    CC builtin/merge-file.o
builtin/merge-file.c:29:23: error: suggest braces around initialization \
			    of subobject [-Werror,-Wmissing-braces]
        mmfile_t mmfs[3] = { 0 };
                             ^
                             {}
builtin/merge-file.c:31:20: error: suggest braces around initialization \
			    of subobject [-Werror,-Wmissing-braces]
        xmparam_t xmp = { 0 };
                          ^
                          {}
2 errors generated.

This example compiles without error/warning with updated versions of
clang.  Since this is an obsolete error, use the -Wno-missing-braces
option to silence the warning when using an older compiler.  This
avoids the need to update the code to use "{{0}}" style
initializations.

Upstream clang version 8 has the problem.  It was fixed in version 9.

The version of clang distributed by Apple with XCode has its own
unique set of version numbers.  Apple clang version 11 has the
problem.  It was fixed in version 12.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 11:15:31 -07:00
Junio C Hamano
e85701b4af The (real) first batch for 2.39
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-10 10:09:09 -07:00
Junio C Hamano
19118cb857 Merge branch 'js/merge-ort-in-read-only-repo'
In read-only repositories, "git merge-tree" tried to come up with a
merge result tree object, which it failed (which is not wrong) and
led to a segfault (which is bad), which has been corrected.

* js/merge-ort-in-read-only-repo:
  merge-ort: return early when failing to write a blob
  merge-ort: fix segmentation fault in read-only repositories
2022-10-10 10:08:43 -07:00
Junio C Hamano
a215853545 Merge branch 'tb/midx-repack-ignore-cruft-packs'
"git multi-pack-index repack/expire" used to repack unreachable
cruft into a new pack, which have been corrected.

* tb/midx-repack-ignore-cruft-packs:
  midx.c: avoid cruft packs with non-zero `repack --batch-size`
  midx.c: remove unnecessary loop condition
  midx.c: replace `xcalloc()` with `CALLOC_ARRAY()`
  midx.c: avoid cruft packs with `repack --batch-size=0`
  midx.c: prevent `expire` from removing the cruft pack
  Documentation/git-multi-pack-index.txt: clarify expire behavior
  Documentation/git-multi-pack-index.txt: fix typo
2022-10-10 10:08:43 -07:00
Junio C Hamano
38bb92cf46 Merge branch 'hn/parse-worktree-ref'
Code and semantics cleaning.

* hn/parse-worktree-ref:
  refs: unify parse_worktree_ref() and ref_type()
2022-10-10 10:08:43 -07:00
Junio C Hamano
dc154c39f7 Merge branch 'ja/rebase-i-avoid-amending-self'
"git rebase -i" can mistakenly attempt to apply a fixup to a commit
itself, which has been corrected.

* ja/rebase-i-avoid-amending-self:
  sequencer: avoid dropping fixup commit that targets self via commit-ish
2022-10-10 10:08:43 -07:00