In an effort to align write_midx_internal() with the chunk-format API,
continue to group necessary data into "struct write_midx_context". This
change collects the "struct pack_midx_entry *entries" list and its count
into the context.
Update write_midx_oid_fanout() and write_midx_oid_lookup() to take the
context directly, as these are easy conversions with this new data.
Only the callers of write_midx_object_offsets() and
write_midx_large_offsets() are updated here, since additional data in
the context before those methods can match chunk_write_fn.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an effort to align the write_midx_internal() to use the chunk-format
API, start converting chunk writing methods to match chunk_write_fn. The
first case is to convert write_midx_pack_names() to take "void *data".
We already have the necessary data in "struct write_midx_context", so
this conversion is rather mechanical.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an effort to streamline our chunk-based file formats, align some of
the code structure in write_midx_internal() to be similar to the
patterns in write_commit_graph_file().
Specifically, let's create a "struct write_midx_context" that can be
used as a data parameter to abstract function types.
This change only renames "struct pack_info" to "struct
write_midx_context" and the names of instances from "packs" to "ctx". In
future changes, we will expand the data inside "struct
write_midx_context" and align our chunk-writing method with the
chunk-format API.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph write logic is ready to make use of the chunk-format
write API. Each chunk write method is already in the correct prototype.
We only need to use the 'struct chunkfile' pointer and the correct API
calls.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of combining the logic from the commit-graph and
multi-pack-index file formats, create a new chunk-format API. Use a
'struct chunkfile' pointer to keep track of data that has been
registered for writes. This struct is anonymous outside of
chunk-format.c to ensure no user attempts to interfere with the data.
The next change will use this API in commit-graph.c, but the general
approach is:
1. initialize the chunkfile with init_chunkfile(f).
2. add chunks in the intended writing order with add_chunk().
3. write any header information to the hashfile f.
4. write the chunkfile data using write_chunkfile().
5. free the chunkfile struct using free_chunkfile().
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both AsciiDoc and Asciidoctor are eager to pick up the e-mail addresses
in this document and turn them into references at the bottom of the
manpage / clickable links. We don't really need that for these dummy
addresses. Spell "@" as "@" to make them not do this. In the open
block, we can instead avoid this by indenting the contents, similar to
the earlier blocks.
Fix a backtick which should have been a single quote mark. With all the
quoting that is going on around here, this mistake trips up the parsing
and rendering quite a bit.
Before this commit, we have the same failure mode with AsciiDoc 8.6.10
and Asciidoctor 1.5.5, and this change makes both of them happy.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we write `<name>`s with the "s" tucked on to the closing backtick,
we end up rendering the backticks literally. Rephrase this sentence
slightly to render this as monospace.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment character is specified by the config variable
'core.commentchar'. Any lines starting with this character is considered
a comment and should not be included in the final commit message.
Teach git-gui to filter out lines in the commit message that start with
the comment character using git-stripspace. If the config is not set,
'#' is taken as the default. Also add a message educating users about
the comment character.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The error message given when a configuration variable that is
expected to have a boolean value has been improved.
* ak/config-bad-bool-error:
config: improve error message for boolean config
"git reflog expire --stale-fix" can be used to repair the reflog by
removing entries that refer to objects that have been pruned away,
but was not careful to tolerate missing objects.
* js/reflog-expire-stale-fix:
reflog expire --stale-fix: be generous about missing objects
When certain features (e.g. grafts) used in the repository are
incompatible with the use of the commit-graph, we used to silently
turned commit-graph off; we now tell the user what we are doing.
* js/commit-graph-warning:
commit-graph: when incompatible with graphs, indicate why
Test to make sure "git rev-parse one-thing one-thing" gives
the same thing twice (when one-thing is --since=X).
* ew/rev-parse-since-test:
t1500: ensure current --since= behavior remains
Avoid individual tests in t5411 from getting affected by each other
by forcing them to use separate output files during the test.
* jx/t5411-unique-filenames:
t5411: refactor check of refs using test_cmp_refs
t5411: use different out file to prevent overwriting
Fix "git fsck --name-objects" which apparently has not been used by
anybody who is motivated enough to report breakage.
* js/fsck-name-objects-fix:
fsck --name-objects: be more careful parsing generation numbers
t1450: robustify `remove_object()`
The .mailmap is documented to be read only from the root level of a
working tree, but a stray file in a bare repository also was read
by accident, which has been corrected.
* jk/mailmap-only-at-root:
mailmap: only look for .mailmap in work tree
"git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects. The "--cached" and
"--untracked" options have been marked as mutually incompatible.
* mt/grep-cached-untracked:
grep: error out if --untracked is used with --cached
"git mergetool" feeds three versions (base, local and remote) of
a conflicted path unmodified. The command learned to optionally
prepare these files with unconflicted parts already resolved.
* sh/mergetool-hideresolved:
mergetool: add per-tool support and overrides for the hideResolved flag
mergetool: break setup_tool out into separate initialization function
mergetool: add hideResolved configuration
Even though invocations of "die()" were logged to the trace2
system, "BUG()"s were not, which has been corrected.
* jt/trace2-BUG:
usage: trace2 BUG() invocations
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.
* js/range-diff-one-side-only:
range-diff: offer --left-only/--right-only options
range-diff: move the diffopt initialization down one layer
range-diff: combine all options in a single data structure
range-diff: simplify code spawning `git log`
range-diff: libify the read_patches() function again
range-diff: avoid leaking memory in two error code paths
There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.
* js/range-diff-wo-dotdot:
range-diff(docs): explain how to specify commit ranges
range-diff/format-patch: handle commit ranges other than A..B
range-diff/format-patch: refactor check for commit range
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository. The protocol v2 learned how to do so.
* jt/clone-unborn-head:
clone: respect remote unborn HEAD
connect, transport: encapsulate arg in struct
ls-refs: report unborn targets of symrefs
Piecemeal of rewrite of "git bisect" in C continues.
* mr/bisect-in-c-4:
bisect--helper: retire `--check-and-set-terms` subcommand
bisect--helper: reimplement `bisect_skip` shell function in C
bisect--helper: retire `--bisect-auto-next` subcommand
bisect--helper: use `res` instead of return in BISECT_RESET case option
bisect--helper: retire `--bisect-write` subcommand
bisect--helper: reimplement `bisect_replay` shell function in C
bisect--helper: reimplement `bisect_log` shell function in C
Fix incremental update of commit-graph file around corrected commit
date data.
* ds/commit-graph-genno-fix:
commit-graph: prepare commit graph
commit-graph: be extra careful about mixed generations
commit-graph: compute generations separately
commit-graph: validate layers for generation data
commit-graph: always parse before commit_graph_data_at()
commit-graph: use repo_parse_commit
The commit-graph learned to use corrected commit dates instead of
the generation number to help topological revision traversal.
* ak/corrected-commit-date:
doc: add corrected commit date info
commit-reach: use corrected commit dates in paint_down_to_common()
commit-graph: use generation v2 only if entire chain does
commit-graph: implement generation data chunk
commit-graph: implement corrected commit date
commit-graph: return 64-bit generation number
commit-graph: add a slab to store topological levels
t6600-test-reach: generalize *_three_modes
commit-graph: consolidate fill_commit_graph_info
revision: parse parent in indegree_walk_step()
commit-graph: fix regression when computing Bloom filters
Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used in.
Before the preceding commit they used to be needed earlier, but now we
can move them to be adjacent to the other PCREv2 functions.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().
This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.
As noted in the preceding commit the approach 513f2b0bbd (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.
The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.
As noted in 6d423dd542 (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.
So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.
See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.
1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue work started in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.
The reason 513f2b0bbd worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it targeted the allocation freed via free(), as
opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
and pcre2_maketables_free() pair.
For most of the rest we continued allocating with stock malloc()
inside PCREv2 itself, but didn't segfault because we'd use its
corresponding free().
In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd. But then
we wouldn't use the desired allocator, let's just use it instead.
Before this patch we'd on e.g.:
grep --threads=1 -iP æ.*var.*xyz
Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() calls. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.
My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:
"most of the time (like when using UTF-8) the chartable (and
therefore the global context) is not needed (even when using
alternate allocators)"
That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().
Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().
That's not what PCREv2's compile context is. To quote PCREv2's
documentation:
"This context just contains pointers to (and data for) external
memory management functions that are called from several places in
the PCRE2 library."
Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd took of always
creating it, but then mostly not using it.
Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.
1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables().
At first sight it's strange that 10da030ab7 (grep: avoid leak of
chartables in PCRE2, 2019-10-16) which added the free() call here
doesn't make use of the pcre2_free() the author introduced in the
preceding commit in 513f2b0bbd (grep: make PCRE2 aware of custom
allocator, 2019-10-16).
The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.
Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable.
At some distant point in the future we'll be able to drop the version
guard, as nobody will be running a version older than 10.34.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987e (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.
It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".
You need to manually change the definition in the source file similar
to the DEBUG_MAILMAP, there's no Makefile knob for this.
This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer.
This doesn't introduce any functional change, it just makes a
subsequent commit's diff easier to read. Changes code added in
513f2b0bbd (grep: make PCRE2 aware of custom allocator, 2019-10-16).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct a comment added in 513f2b0bbd (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].
1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6 (grep: add support for PCRE v2, 2017-06-01).
In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.
I think this was probably something left over from an earlier
development version of mine.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop an assignment added in b65abcafc7 (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01).
There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option.
Furthermore the "struct grep_opt" is (mostly) used for the options the
user supplied, let's avoid the pattern of needlessly assigning to it.
With my recent removal of the PCREv1 backend in 7599730b7e (Remove
support for v1 of the PCRE library, 2021-01-24) there's even less
confusion around what we call where in these codepaths, which is one
more reason to remove this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git Protocol version 2[1] defines 0002 as a Message Packet that indicates
the end of a response for stateless connections.
Change the naming of the 0002 Packet to 'Response End' to match the
parsing introduced in Wireshark's MR !1922 for consistency. A subsequent
MR in Wireshark will address additional mismatches.
[1] kernel.org/pub/software/scm/git/docs/technical/protocol-v2.html
[2] gitlab.com/wireshark/wireshark/-/merge_requests/1922
Signed-off-by: Joey Salazar <jgsal@protonmail.com>
Reviewed-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's not immediately obvious why --disk-usage might be a useful thing.
These examples show off a few of the real-world cases I've used it for.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently don't show any examples of using git-rev-list at all. Let's
add some pretty elementary examples. They likely seem obvious to anybody
who has worked with the tool for a while, but my purpose here is
two-fold:
- they may be enlightening to people who haven't used the tool a lot
to give a general flavor of how it is meant to be used
- they can serve as a starting point for adding more interesting
examples (we can do that without the basic ones, of course, but I
think it makes sense to show off the building blocks)
This set is far from exhaustive, but again, the purpose is to be a
starting point for further additions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-log(1) -- but not in git-shortlog(1) or git-rev-list(1) -- we
include a bonus paragraph in the description of `--first-parent`. But
we forgot to add a lone "+" for a list continuation, and we shouldn't
be indenting this second paragraph. As a result, we get a different
indentation and the `backticks` render literally.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git blame --color-by-age`, the determine_line_heat() is called to
select how to color the output based on the commit's author date. It
uses the get_commit_info() to parse the information into a `commit_info`
structure, however, this is actually unnecessary because the
determine_line_heat() caller also does the same.
Instead, let's change the determine_line_heat() to take a `commit_info`
structure and remove the internal call to get_commit_info() thus
cleaning up and optimizing the code path.
Enabling Git's trace2 API in order to record the execution time for
every call to determine_line_heat() function:
+ trace2_region_enter("blame", "determine_line_heat", the_repository);
determine_line_heat(ent, &default_color);
+ trace2_region_enter("blame", "determine_line_heat", the_repository);
Then, running `git blame` for "kernel/fork.c" in linux.git and summing
all the execution time for every call (around 1.3k calls) resulted in
2.6x faster execution (best out 3):
git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
git built from 328c109303 + this change = 16ms
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow restricting the tags used by the placeholder %(describe) with the
options match and exclude. E.g. the following command describes the
current commit using official version tags, without those for release
candidates:
$ git log -1 --format='%(describe:match=v[0-9]*,exclude=*rc*)'
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a format placeholder for describe output. Implement it by actually
calling git describe, which is simple and guarantees correctness. It's
intended to be used with $Format:...$ in files with the attribute
export-subst and git archive. It can also be used with git log etc.,
even though that's going to be slow due to the fork for each commit.
Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Isolate and document initialization of `istate->fsmonitor_last_update`.
This field should contain a fsmonitor-specific opaque token, but we
need to initialize it before we can actually talk to a fsmonitor process,
so we create a generic default value.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow fsmonitor to report directory changes by reporting paths with a
trailing slash.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's measure the time taken to request and receive FSMonitor data
via the hook API and the size of the response.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Report the number of files in the working directory that were read and
their hashes verified in `refresh_index()`.
FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files. Let's measure this.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Report the total number of calls made to lstat() inside of refresh_index().
FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files. This can be seen in
`refresh_index()`. Let's measure this.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Report the total number of calls made to lstat() inside preload_index().
FSMonitor improves the performance of commands like `git status` by
avoiding scanning the disk for changed files. This can be seen in
`preload_index()`. Let's measure this.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>