Commit graph

379 commits

Author SHA1 Message Date
Junio C Hamano 25ff15d108 Merge branch 'jk/unused-post-2.42'
Unused parameters to functions are marked as such, and/or removed,
in order to bring us closer to -Wunused-parameter clean.

* jk/unused-post-2.42: (22 commits)
  update-ref: mark unused parameter in parser callbacks
  gc: mark unused descriptors in scheduler callbacks
  bundle-uri: mark unused parameters in callbacks
  fetch: mark unused parameter in ref_transaction callback
  credential: mark unused parameter in urlmatch callback
  grep: mark unused parmaeters in pcre fallbacks
  imap-send: mark unused parameters with NO_OPENSSL
  worktree: mark unused parameters in noop repair callback
  negotiator/noop: mark unused callback parameters
  add-interactive: mark unused callback parameters
  grep: mark unused parameter in output function
  test-trace2: mark unused argv/argc parameters
  trace2: mark unused config callback parameter
  trace2: mark unused us_elapsed_absolute parameters
  stash: mark unused parameter in diff callback
  ls-tree: mark unused parameter in callback
  commit-graph: mark unused data parameters in generation callbacks
  worktree: mark unused parameters in each_ref_fn callback
  pack-bitmap: mark unused parameters in show_object callback
  ref-filter: mark unused parameters in parser callbacks
  ...
2023-09-07 15:06:07 -07:00
Jeff King e1cba404db commit-graph: mark unused data parameters in generation callbacks
The compute_generation_info code uses function pointers to abstract the
get/set generation operations. Some callers don't need the extra void
data pointer, which should be annotated to appease -Wunused-parameter.

Note that we can drop the assignment of the "data" parameter in
compute_generation_numbers(), as we've just shown that neither of the
callbacks it uses will access it. This matches the caller in
ensure_generations_valid(), which already does not bother to set "data".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29 17:56:24 -07:00
Taylor Blau db6044d762 commit-graph: avoid repeated mixed generation number warnings
When validating that a commit-graph has either all zero, or all non-zero
generation numbers, we emit a warning on both the rising and falling
edge of transitioning between the two.

So if we are unfortunate enough to see a commit-graph which has a
repeating sequence of zero, then non-zero generation numbers, we'll
generate many warnings that contain more or less the same information.

Avoid this by keeping track of a single example for a commit with zero-
and non-zero generation, and emit a single warning at the end of
verification if both are non-NULL.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:16:23 -07:00
Jeff King cc9c9a00a5 commit-graph: verify swapped zero/non-zero generation cases
In verify_one_commit_graph(), we have code that complains when a commit
is found with a generation number of zero, and then later with a
non-zero number. It works like this:

  1. When we see an entry with generation zero, we set the
     generation_zero flag to GENERATION_ZERO_EXISTS.

  2. When we later see an entry with a non-zero generation, we complain
     if the flag is GENERATION_ZERO_EXISTS.

There's a matching GENERATION_NUMBER_EXISTS value, which in theory would
be used to find the case that we see the entries in the opposite order:

  1. When we see an entry with a non-zero generation, we set the
     generation_zero flag to GENERATION_NUMBER_EXISTS.

  2. When we later see an entry with a zero generation, we complain if
     the flag is GENERATION_NUMBER_EXISTS.

But that doesn't work; step 2 is implemented, but there is no step 1. We
never use NUMBER_EXISTS at all, and Coverity rightly complains that step
2 is dead code.

We can fix that by implementing that step 1.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:07:33 -07:00
Taylor Blau 868c991155 commit-graph: introduce commit_graph_generation_from_graph()
In 2ee11f7261 (commit-graph: return generation from memory, 2023-03-20),
the `commit_graph_generation()` function stopped returning zeros when
asked to locate the generation number of a given commit.

This was done at the time to prepare for a later change which set
generation values in memory, meaning that we could no longer rely on
`graph_pos` alone to tell us whether or not to trust the generation
number returned by this function.

In 2ee11f7261, it was noted that this change only impacted very old
commit-graphs, which were written with all commits having generation
number 0. Indeed, zero is not a valid generation number, so we should
never expect to see that value outside of the aforementioned case.

The test fallout in 2ee11f7261 indicated that we were no longer able to
fsck a specific old case of commit-graph corruption, where we see a
non-zero generation number after having seen a generation number of 0
earlier.

Introduce a variant of `commit_graph_generation()` which behaves like
that function did prior to 2ee11f7261, known as
`commit_graph_generation_from_graph()`. Then use this function in the
context of `verify_one_commit_graph()`, where we only want to trust the
values from the graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21 16:07:33 -07:00
Junio C Hamano 4488bb3bed Merge branch 'tb/object-access-overflow-protection'
Various offset computation in the code that accesses the packfiles
and other data in the object layer has been hardened against
arithmetic overflow, especially on 32-bit systems.

* tb/object-access-overflow-protection:
  commit-graph.c: prevent overflow in `verify_commit_graph()`
  commit-graph.c: prevent overflow in `write_commit_graph()`
  commit-graph.c: prevent overflow in `merge_commit_graph()`
  commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
  commit-graph.c: prevent overflow in `load_tree_for_commit()`
  commit-graph.c: prevent overflow in `fill_commit_in_graph()`
  commit-graph.c: prevent overflow in `fill_commit_graph_info()`
  commit-graph.c: prevent overflow in `load_oid_from_graph()`
  commit-graph.c: prevent overflow in add_graph_to_chain()
  commit-graph.c: prevent overflow in `write_commit_graph_file()`
  pack-bitmap.c: ensure that eindex lookups don't overflow
  midx.c: prevent overflow in `fill_included_packs_batch()`
  midx.c: prevent overflow in `write_midx_internal()`
  midx.c: store `nr`, `alloc` variables as `size_t`'s
  midx.c: prevent overflow in `nth_midxed_offset()`
  midx.c: prevent overflow in `nth_midxed_object_oid()`
  midx.c: use `size_t`'s for fanout nr and alloc
  packfile.c: use checked arithmetic in `nth_packed_object_offset()`
  packfile.c: prevent overflow in `load_idx()`
  packfile.c: prevent overflow in `nth_packed_object_id()`
2023-07-25 12:05:23 -07:00
Junio C Hamano 6016ee0a71 Merge branch 'tb/fsck-no-progress'
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.

* tb/fsck-no-progress:
  commit-graph.c: avoid duplicated progress output during `verify`
  commit-graph.c: pass progress to `verify_one_commit_graph()`
  commit-graph.c: iteratively verify commit-graph chains
  commit-graph.c: extract `verify_one_commit_graph()`
  fsck: suppress MIDX output with `--no-progress`
  fsck: suppress commit-graph output with `--no-progress`
2023-07-18 07:28:53 -07:00
Taylor Blau 9a25cad7e0 commit-graph.c: prevent overflow in verify_commit_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an OID out of an existing commit-graph during
verification.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 588af1bfd3 commit-graph.c: prevent overflow in write_commit_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an existing OID while writing a new commit-graph.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau d76e0a744d commit-graph.c: prevent overflow in merge_commit_graph()
When merging two commit graphs, ensure that we don't attempt to merge
two graphs which, when combined, have more total commits than the 32-bit
unsigned maximum.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 19565d093d commit-graph.c: prevent overflow in split_graph_merge_strategy()
In a similar spirit as previous commits, ensure that we don't overflow
when choosing how to split and merge different layers of the
commit-graph.

In particular, avoid a potential overflow between `size_mult` and
`num_commits`, as well as a potential overflow between the number of
commits currently in the merged graph, and the number of commits in the
graph about to be merged.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 51c31a6408 commit-graph.c: prevent overflow in load_tree_for_commit()
In a similar spirit as previous commits, ensure that we don't overflow
when computing an offset into the commit_data chunk when the (relative)
graph position exceeds 2^32-1/GRAPH_DATA_WIDTH.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 50a71c2942 commit-graph.c: prevent overflow in fill_commit_in_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when the lex_index of the commit we are trying to fill out exceeds
2^32-1/(g->hash_len+16).

The other hunk touched in this patch is not susceptible to overflow,
since an explicit cast is made to a 64-bit unsigned value. For clarity
and consistency with the rest of the commits in this series, avoid a
tricky to reason about cast, and use `st_mult()` directly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 2740ed1c76 commit-graph.c: prevent overflow in fill_commit_graph_info()
In a similar spirit as previous commits, ensure that we don't overflow
in a few spots within `fill_commit_graph_info()`:

  - First, when computing an offset into the commit data chunk, which
    can occur when the `lex_index` of the item we're looking up exceeds
    2^32-1/GRAPH_DATA_WIDTH.

  - A similar issue when computing the generation date offset for
    commits with `lex_index` greater than 2^32-1/4. Note that in
    practice this will never overflow, since the left-hand operand is
    from calling `sizeof(...)` and is thus already a `size_t`. But wrap
    that in an `st_mult()` to make it clear that we intend to perform
    this computation using 64-bit operands.

  - Finally, a nearly identical issue as above when computing an offset
    into the `generation_data_overflow` chunk.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 0bd8f30a0e commit-graph.c: prevent overflow in load_oid_from_graph()
In a similar spirit as previous commits, ensure that we don't overflow
when trying to compute an offset into the `chunk_oid_lookup` table when
the `lex_index` of the item we're trying to look up exceeds
`2^32-1/g->hash_len`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 209250ef38 commit-graph.c: prevent overflow in add_graph_to_chain()
The commit-graph uses a fanout table with 4-byte entries to store the
number of commits at each shard of the commit-graph. So it is OK to have
a commit graph with as many as 2^32-1 stored commits. But we risk
overflowing any computation which may exceed the 32-bit (unsigned)
maximum when those computations are (incorrectly) performed using 32-bit
operands.

There are a couple of spots in `add_graph_to_chain()` where we could
potentially overflow the result:

  - First, when comparing the list of existing entries in the
    commit-graph chain. It is unlikely that this should ever overflow,
    since it would require having roughly 2^32-1/g->hash_len
    commit-graphs in the chain. But let's guard that computation with a
    `st_mult()` just to be safe.

  - Second, when computing the number of commits in the graph added to
    the front of the chain. This value is also a 32-bit unsigned, but we
    should make sure that it does not grow beyond the maximum value.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 48f3f8cf37 commit-graph.c: prevent overflow in write_commit_graph_file()
When writing a commit-graph, we use the chunk-format API to write out
each individual chunk of the commit-graph. Each chunk of the
commit-graph is tracked via a call to `add_chunk()`, along with the
expected size of that chunk.

Similar to an earlier commit which handled the identical issue in the
MIDX machinery, guard against overflow when dealing with a commit-graph
with a large number of entries to avoid corrupting the contents of the
commit-graph itself.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-14 09:32:03 -07:00
Taylor Blau 9281cd07f0 commit-graph.c: avoid duplicated progress output during verify
When `git commit-graph verify` was taught how to verify commit-graph
chains in 3da4b609bb (commit-graph: verify chains with --shallow mode,
2019-06-18), it produced one line of progress per layer of the
commit-graph chain.

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (4356/4356), done.
    Verifying commits in commit graph: 100% (131912/131912), done.

This could be somewhat confusing to users, who may wonder why there are
multiple occurrences of "Verifying commits in commit graph".

There are likely good arguments on whether or not there should be
one line of progress output per commit-graph layer. On the one hand, the
existing output shows us verifying each individual layer of the chain.
But on the other hand, the fact that a commit-graph may be stored among
multiple layers is an implementation detail that the caller need not be
aware of.

Clarify this by showing a single progress meter regardless of the number
of layers in the commit-graph chain. After this patch, the output
reflects the logical contents of a commit-graph chain, instead of
showing one line of output per commit-graph layer:

    $ git.compile commit-graph verify
    Verifying commits in commit graph: 100% (136268/136268), done.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:45 -07:00
Taylor Blau 7248857b6e commit-graph.c: pass progress to verify_one_commit_graph()
This is the final step to prepare for consolidating the output of `git
commit-graph verify`. Instead of having each call to
`verify_one_commit_graph()` initialize its own progress struct, have the
caller pass one in instead.

This patch does not alter the output of `git commit-graph verify`, but
the next commit will consolidate the output.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:43 -07:00
Taylor Blau f5facaa465 commit-graph.c: iteratively verify commit-graph chains
Now that we have a function which can verify a single layer of a
commit-graph chain, implement `verify_commit_graph()` in terms of
iterating over commit-graphs along their `->base_graph` pointers.

This further prepares us to consolidate the progress output of `git
commit-graph verify`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:42 -07:00
Taylor Blau eb319d6771 commit-graph.c: extract verify_one_commit_graph()
When the `verify_commit_graph()` function was extended to support
commit-graph chains via 3da4b609bb (commit-graph: verify chains with
--shallow mode, 2019-06-18), it did so by recursively calling itself on
each layer of the commit-graph chain.

In practice this poses no issues, since commit-graph chains do not loop,
and there are few enough of them that adding additional frames to the
stack is not a problem.

A future commit will consolidate the progress output from `git
commit-graph verify` when verifying chained commit-graphs to print a
single line instead of one progress meter per commit-graph layer.
Prepare for this by extracting a routine to verify a single layer of a
commit-graph.

Note that `verify_commit_graph()` is still recursive after this patch,
but this will change in the subsequent patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-10 10:02:41 -07:00
Calvin Wan da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Junio C Hamano a1264a08a1 Merge branch 'en/header-split-cache-h-part-3'
Header files cleanup.

* en/header-split-cache-h-part-3: (28 commits)
  fsmonitor-ll.h: split this header out of fsmonitor.h
  hash-ll, hashmap: move oidhash() to hash-ll
  object-store-ll.h: split this header out of object-store.h
  khash: name the structs that khash declares
  merge-ll: rename from ll-merge
  git-compat-util.h: remove unneccessary include of wildmatch.h
  builtin.h: remove unneccessary includes
  list-objects-filter-options.h: remove unneccessary include
  diff.h: remove unnecessary include of oidset.h
  repository: remove unnecessary include of path.h
  log-tree: replace include of revision.h with simple forward declaration
  cache.h: remove this no-longer-used header
  read-cache*.h: move declarations for read-cache.c functions from cache.h
  repository.h: move declaration of the_index from cache.h
  merge.h: move declarations for merge.c from cache.h
  diff.h: move declaration for global in diff.c from cache.h
  preload-index.h: move declarations for preload-index.c from elsewhere
  sparse-index.h: move declarations for sparse-index.c from cache.h
  name-hash.h: move declarations for name-hash.c from cache.h
  run-command.h: move declarations for run-command.c from cache.h
  ...
2023-06-29 16:43:21 -07:00
Junio C Hamano d9f9f6b358 Merge branch 'ds/disable-replace-refs'
Introduce a mechanism to disable replace refs globally and per
repository.

* ds/disable-replace-refs:
  repository: create read_replace_refs setting
  replace-objects: create wrapper around setting
  repository: create disable_replace_refs()
2023-06-22 16:29:06 -07:00
Elijah Newren a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Derrick Stolee f1178380ac replace-objects: create wrapper around setting
The 'read_replace_objects' constant is initialized by git_default_config
(if core.useReplaceRefs is disabled) and within setup_git_env (if
GIT_NO_REPLACE_OBJECTS) is set. To ensure that this variable cannot be
set accidentally in other places, wrap it in a replace_refs_enabled()
method.

Since we still assign this global in config.c, we are not able to remove
the global scope of this variable and make it a static within
replace-object.c. This will happen in a later change which will also
prevent the variable from being read before it is initialized.

Centralizing read access to the variable is an important first step.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12 13:34:55 -07:00
Junio C Hamano ccd12a3d6c Merge branch 'en/header-split-cache-h-part-2'
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
2023-05-09 16:45:46 -07:00
Junio C Hamano 0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
Elijah Newren d4a4f9291d commit.h: reduce unnecessary includes
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24 12:47:33 -07:00
Elijah Newren b6fdc44c84 treewide: remove cache.h inclusion due to object-file.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren 87bed17907 object-file.h: move declarations for object-file.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren 6f2d743043 treewide: be explicit about dependence on oid-array.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Junio C Hamano 6047b28eb7 Merge branch 'en/header-split-cleanup'
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 13:38:31 -07:00
Junio C Hamano 72871b198f Merge branch 'ab/remove-implicit-use-of-the-repository'
Code clean-up around the use of the_repository.

* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06 13:38:30 -07:00
Junio C Hamano 119e82a515 Merge branch 'ps/ahead-behind-truncation-fix'
Fix unnecessary truncation of generation numbers used in-core.

* ps/ahead-behind-truncation-fix:
  commit-graph: fix truncated generation numbers
2023-04-06 13:38:27 -07:00
Junio C Hamano 7727da99df Merge branch 'ds/ahead-behind'
"git for-each-ref" learns '%(ahead-behind:<base>)' that computes the
distances from a single reference point in the history with bunch
of commits in bulk.

* ds/ahead-behind:
  commit-reach: add tips_reachable_from_bases()
  for-each-ref: add ahead-behind format atom
  commit-reach: implement ahead_behind() logic
  commit-graph: introduce `ensure_generations_valid()`
  commit-graph: return generation from memory
  commit-graph: simplify compute_generation_numbers()
  commit-graph: refactor compute_topological_levels()
  for-each-ref: explicitly test no matches
  for-each-ref: add --stdin option
2023-04-06 13:38:21 -07:00
Junio C Hamano e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason 4a93b899c1 libs: use "struct repository *" argument, not "the_repository"
As can easily be seen from grepping in our sources, we had these uses
of "the_repository" in various library code in cases where the
function in question was already getting a "struct repository *"
argument. Let's use that argument instead.

Out of these changes only the changes to "cache-tree.c",
"commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly
applied before the migration away from the "repo_*()" wrapper macros
in the preceding commits.

The rest aren't new, as we'd previously implicitly refer to
"the_repository", but it's now more obvious that we were doing the
wrong thing all along, and should have used the parameter instead.

The change to change "get_index_format_default(the_repository)" in
"read-cache.c" to use the "r" variable instead should arguably have
been part of [1], or in the subsequent cleanup in [2]. Let's do it
here, as can be seen from the initial code in [3] it's not important
that we use "the_repository" there, but would prefer to always use the
current repository.

This change excludes the "the_repository" use in "upload-pack.c"'s
upload_pack_advertise(), as the in-flight [4] makes that change.

1. ee1f0c242e (read-cache: add index.skipHash config option,
   2023-01-06)
2. 6269f8eaad (treewide: always have a valid "index_state.repo"
   member, 2023-01-17)
3. 7211b9e753 (repo-settings: consolidate some config settings,
   2019-08-13)
4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net>

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:46 -07:00
Ævar Arnfjörð Bjarmason afe27c8894 cocci: apply the "packfile.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"packfile.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason ecb5091fd4 cocci: apply the "commit.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Patrick Steinhardt d3af1c193d commit-graph: fix truncated generation numbers
In 80c928d947 (commit-graph: simplify compute_generation_numbers(),
2023-03-20), the code to compute generation numbers was simplified to
use the same infrastructure as is used to compute topological levels.
This refactoring introduced a bug where the generation numbers are
truncated when they exceed UINT32_MAX because we explicitly cast the
computed generation number to `uint32_t`. This is not required though:
both the computed value and the field of `struct commit_graph_data` are
of the same type `timestamp_t` already, so casting to `uint32_t` will
cause truncation.

This cast can cause us to miscompute generation data overflows:

    1. Given a commit with no parents and committer date
       `UINT32_MAX + 1`.

    2. We compute its generation number as `UINT32_MAX + 1`, but
       truncate it to `1`.

    3. We calculate the generation offset via `$generation - $date`,
       which is thus `1 - (UINT32_MAX + 1)`. The computation underflows
       and we thus end up with an offset that is bigger than the maximum
       allowed offset.

As a result, we'd be writing generation data overflow information into
the commit-graph that is bogus and ultimately not even required.

Fix this bug by removing the needless cast.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27 10:52:06 -07:00
Elijah Newren ec2f026961 csum-file.h: remove unnecessary inclusion of cache.h
With the change in the last commit to move several functions to
write-or-die.h, csum-file.h no longer needs to include cache.h.
However, removing that include forces several other C files, which
directly or indirectly dependend upon csum-file.h's inclusion of
cache.h, to now be more explicit about their dependencies.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:55 -07:00
Elijah Newren d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Taylor Blau c08645b353 commit-graph: introduce ensure_generations_valid()
Use the just-introduced compute_reachable_generation_numbers_1() to
implement a function which dynamically computes topological levels (or
corrected commit dates) for out-of-graph commits.

This will be useful for the ahead-behind algorithm we are about to
introduce, which needs accurate topological levels on _all_ commits
reachable from the tips in order to avoid over-counting.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee 2ee11f7261 commit-graph: return generation from memory
The commit_graph_generation() method used to report a value of
GENERATION_NUMBER_INFINITY if the commit_graph_data_slab had an instance
for the given commit but the graph_pos indicated the commit was not in
the commit-graph file.

However, an upcoming change will introduce the ability to set generation
values in-memory without writing the commit-graph file. Thus, we can no
longer trust 'graph_pos' to indicate whether or not the generation
member can be trusted.

Instead, trust the 'generation' member if the commit has a value in the
slab _and_ the 'generation' member is non-zero. Otherwise, treat it as
GENERATION_NUMBER_INFINITY.

This only makes a difference for a very old case for the commit-graph:
the very first Git release to write commit-graph files wrote zeroes in
the topological level positions. If we are parsing a commit-graph with
all zeroes, those commits will now appear to have
GENERATION_NUMBER_INFINITY (as if they were not parsed from the
commit-graph).

I attempted several variations to work around the need for providing an
uninitialized 'generation' member, but this was the best one I found. It
does require a change to a verification test in t5318 because it reports
a different error than the one about non-zero generation numbers.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee 80c928d947 commit-graph: simplify compute_generation_numbers()
The previous change introduced the generic algorithm
compute_reachable_generation_numbers() and used it as the core
functionality of compute_topological_levels(). Now, use it as the core
functionality of compute_generation_numbers().

The main difference here is that we use generation version 2, which is
used in to toggle the logic in compute_generation_from_max() for
computing the corrected commit date based on the corrected commit dates
of the parent commits (and the commit date of the current commit). It
also uses different methods for (get|set)_generation in the vtable in
order to store and access the value in the correct places.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Derrick Stolee 368d19b0b7 commit-graph: refactor compute_topological_levels()
This patch extracts the common code used to compute topological levels
and corrected committer dates into a common routine,
compute_reachable_generation_numbers(). For ease of reading, it only
modifies compute_topological_levels() to use this new routine, leaving
compute_generation_numbers() to be modified in the next change.

This new routine dispatches to call the necessary functions to get and
set the generation number for a given commit through a vtable (the
compute_generation_info struct).

Computing the generation number itself is done in
compute_generation_from_max(), which dispatches its implementation based
on the generation version requested, or issuing a BUG() for unrecognized
generation versions. This does not use a vtable because the logic
depends only on the generation number version, not where the data is
being loaded from or being stored to. This is a subtle point that will
make more sense in a future change that modifies the in-memory
generation values instead of just preparing values for writing to a
commit-graph file.

This change looks like it adds a lot of new code. However, two upcoming
changes will be quite small due to the work being done in this change.

Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 12:17:33 -07:00
Elijah Newren 41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00