Change the "define_categories()" and "define_category_names()" functions
to take the already-parsed output of "category_list()" as an argument,
which brings our number of passes over "command-list.txt" from three
to two.
Then have "category_list()" itself take the output of "command_list()"
as an argument, bringing the number of times we parse the file to one.
Compared to the pre-image this speeds us up quite a bit:
$ git show HEAD~:generate-cmdlist.sh >generate-cmdlist.sh.old
$ hyperfine --warmup 10 -L v ,.old 'sh generate-cmdlist.sh{v} command-list.txt'
Benchmark #1: sh generate-cmdlist.sh command-list.txt
Time (mean ± σ): 22.9 ms ± 0.3 ms [User: 15.8 ms, System: 9.6 ms]
Range (min … max): 22.5 ms … 24.0 ms 125 runs
Benchmark #2: sh generate-cmdlist.sh.old command-list.txt
Time (mean ± σ): 30.1 ms ± 0.4 ms [User: 24.4 ms, System: 17.5 ms]
Range (min … max): 29.5 ms … 32.3 ms 96 runs
Summary
'sh generate-cmdlist.sh command-list.txt' ran
1.32 ± 0.02 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the "grep" we run to exclude certain programs from the
generated output with a pure-shell loop that strips out the comments,
and sees if the "cmd" we're reading is on a list of excluded
programs. This uses a trick similar to test_have_prereq() in
test-lib-functions.sh.
On my *nix system this makes things quite a bit slower compared to
HEAD~:
o
'sh generate-cmdlist.sh.old command-list.txt' ran
1.56 ± 0.11 times faster than 'sh generate-cmdlist.sh command-list.txt'
18.00 ± 0.19 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
But when I tried running generate-cmdlist.sh 100 times in CI I found
that it helped across the board even on OSX & Linux. I tried testing
it in CI with this ad-hoc few-liner:
for i in $(seq -w 0 11 | sort -nr)
do
git show HEAD~$i:generate-cmdlist.sh >generate-cmdlist-HEAD$i.sh &&
git add generate-cmdlist* &&
cp t/t0000-generate-cmdlist.sh t/t00$i-generate-cmdlist.sh || : &&
perl -pi -e "s/HEAD0/HEAD$i/g" t/t00$i-generate-cmdlist.sh &&
git add t/t00*.sh
done && git commit -m"generated it"
Here HEAD~02 and the t0002* file refers to this change, and HEAD~03
and t0003* file to the preceding commit, the relevant results were:
linux-gcc:
[12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
[12:05:30] t0003-generate-cmdlist.sh .. ok 32 ms ( 0.00 usr 0.00 sys + 2.66 cusr 1.81 csys = 4.47 CPU)
osx-gcc:
[11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
[11:58:16] t0003-generate-cmdlist.sh .. ok 92127 ms ( 0.02 usr 0.01 sys + 22.54 cusr 14.27 csys = 36.84 CPU)
vs-test:
[12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
[12:03:20] t0003-generate-cmdlist.sh .. ok 32 s ( 0.00 usr 0.02 sys + 13.25 cusr 26.10 csys = 39.37 CPU)
I.e. even on *nix running 100 of these in a loop was up to ~2x faster
in absolute runtime, I suspect it's due factors that are exacerbated
in the CI, e.g. much slower process startup due to some platform
limits, or a slower FS.
The "cut -d" change here is because we're not emitting the
40-character aligned output anymore, i.e. we'll get the output from
command_list() now, not an as-is line from command-list.txt.
This also makes the parsing more reliable, as we could tweak the
whitespace alignment without breaking this parser. Let's reword a
now-inaccurate comment in "command-list.txt" describing that previous
alignment limitation. We'll still need the "### command-list [...]"
line due to the "Documentation/cmd-list.perl" logic added in
11c6659d85 (command-list: prepare machinery for upcoming "common
groups" section, 2015-05-21).
There was a proposed change subsequent to this one[3] which continued
moving more logic into the "command_list() function, i.e. replaced the
"cut | tr | grep" chain in "category_list()" with an argument to
"command_list()".
That change might have had a bit of an effect, but not as much as the
preceding commit, so I decided to drop it. The relevant performance
numbers from it were:
linux-gcc:
[12:05:33] t0001-generate-cmdlist.sh .. ok 13 ms ( 0.00 usr 0.00 sys + 3.33 cusr 2.78 csys = 6.11 CPU)
[12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
osx-gcc:
[11:58:03] t0001-generate-cmdlist.sh .. ok 78416 ms ( 0.02 usr 0.01 sys + 11.78 cusr 6.22 csys = 18.03 CPU)
[11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
vs-test:
[12:03:20] t0001-generate-cmdlist.sh .. ok 34 s ( 0.00 usr 0.03 sys + 12.42 cusr 19.55 csys = 32.00 CPU)
[12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
As above HEAD~2 and t0002* are testing the code in this commit (and
the line is the same), but HEAD~1 and t0001* are testing that dropped
change in [3].
1. https://lore.kernel.org/git/cover-v2-00.10-00000000000-20211022T193027Z-avarab@gmail.com/
2. https://lore.kernel.org/git/patch-v2-08.10-83318d6c0da-20211022T193027Z-avarab@gmail.com/
3. https://lore.kernel.org/git/patch-v2-10.10-e10a43756d1-20211022T193027Z-avarab@gmail.com/
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the "sed" invocation in get_synopsis() with a pure-shell
version. This speeds up generate-cmdlist.sh significantly. Compared to
HEAD~ (old) and "master" we are, according to hyperfine(1):
'sh generate-cmdlist.sh command-list.txt' ran
12.69 ± 5.01 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
18.34 ± 3.03 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a preceding commit we changed the print_command_list() loop to use
printf's auto-repeat feature. Let's now get rid of get_category_line()
entirely by not sorting the categories.
This will change the output of the generated code from e.g.:
- { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_complete | CAT_plumbingmanipulators },
To:
+ { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_plumbingmanipulators | CAT_complete },
I.e. the categories are no longer sorted, but as they're OR'd together
it won't matter for the end result.
This speeds up the generate-cmdlist.sh a bit. Comparing HEAD~ (old)
and "master" to this code:
'sh generate-cmdlist.sh command-list.txt' ran
1.07 ± 0.33 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
1.15 ± 0.36 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is just a small code reduction. There is a small probability that
the new code breaks when the category list is empty. But that would be
noticed during the compile step.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This doesn't matter for performance, but let's not include the empty
lines in our sorting. This makes the intent of the code clearer.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This isn't for optimization as the get_categories() is a purely shell
function, but rather for ease of readability, let's just inline these
two lines. We'll be changing this code some more in subsequent commits
to make this worth it.
Rename the get_categories() function to get_category_line(), since
that's what it's doing now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function get_categories() is invoked in a loop over all commands.
As it runs several processes, this takes an awful lot of time on
Windows. To reduce the number of processes, move the process that
filters empty lines to the other invoker of the function, where it is
needed. The invocation of get_categories() in the loop does not need
the empty line filtered away because the result is word-split by the
shell, which eliminates the empty line automatically.
Furthermore, use sort -u instead of sort | uniq to remove yet another
process.
[Ævar: on Linux this seems to speed things up a bit, although with
hyperfine(1) the results are fuzzy enough to land within the
confidence interval]:
$ git show HEAD~:generate-cmdlist.sh >generate-cmdlist.sh.old
$ hyperfine --warmup 1 -L s ,.old -p 'make clean' 'sh generate-cmdlist.sh{s} command-list.txt'
Benchmark #1: sh generate-cmdlist.sh command-list.txt
Time (mean ± σ): 371.3 ms ± 64.2 ms [User: 430.4 ms, System: 72.5 ms]
Range (min … max): 320.5 ms … 517.7 ms 10 runs
Benchmark #2: sh generate-cmdlist.sh.old command-list.txt
Time (mean ± σ): 489.9 ms ± 185.4 ms [User: 724.7 ms, System: 141.3 ms]
Range (min … max): 346.0 ms … 885.3 ms 10 runs
Summary
'sh generate-cmdlist.sh command-list.txt' ran
1.32 ± 0.55 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add " " before a "|" at the end of a line in generate-cmdlist.sh for
consistency with other code in the file. Some of the surrounding code
will be modified in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
tr(1) of ANSI/POSIX environment, aka APE, don't support \n literal.
It's handles only octal(\ooo) or hexadecimal(\xhhhh) numbers.
And its sed(1)'s label is limited to maximum seven characters.
Therefore I replaced some labels to drop a character.
* close -> cl
* continue -> cont (cnt is used for count)
* line -> ln
* hered -> hdoc
* shell -> sh
* string -> str
Signed-off-by: Kyohei Kadota <lufia@lufia.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting in 3ac68a93fd, help.o began to depend on builtin/branch.o,
builtin/clean.o, and builtin/config.o. This meant that help.o was
unusable outside of the context of the main Git executable.
To make help.o usable by other commands again, move list_config_help()
into builtin/help.c (where it makes sense to assume other builtin libraries
are present).
When command-list.h is included but a member is not used, we start to
hear a compiler warning. Since the config list is generated in a fairly
different way than the command list, and since commands and config
options are semantically different, move the config list into its own
header and move the generator into its own script and build rule.
For reasons explained in 976aaedc (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29), some build
artifacts we consider non-source files cannot be generated in the
Visual Studio environment, and we already have some Makefile tweaks
to help Visual Studio to use generated command-list.h header file.
Do the same to a new generated file, config-list.h, introduced by
this change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
When built with NO_CURL or with NO_UNIX_SOCKETS, some commands are
skipped from the build. It does not make sense to list them in the
output of `git help -a`, so let's just not.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
config.txt is going to be broken down in smaller pieces and put under
Documentation/config directory. Update build rules to take these files
into account.
A dummy file is added to make sure wildcard expansion is predictable
(depending on shell setting it could expand to nothing or becomes a
path if config directory is empty). The file will be deleted once the
move is over.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script uses Documentation/config.txt as input for "git help
--config" and "git config" completion but it misses the fact that
config.txt includes other txt files. Include all *config.txt as input
when scanning for config keys. This could produce false positives, but
as long as we stick to the blah-config.txt naming convention, we
should be ok.
While at there, move diff.* from config.txt to diff-config.txt where
all other diff config keys are.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes it helps to list all available config vars so the user can
search for something they want. The config man page can also be used
but it's harder to search if you want to focus on the variable name,
for example.
This is not the best way to collect the available config since it's
not precise. Ideally we should have a centralized list of config in C
code (pretty much like 'struct option'), but that's a lot more work.
This will do for now.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to select any group of commands by a category defined
in command-list.txt. This is an internal/hidden option so we don't
have to be picky about the category name or worried about exposing too
much.
This will be used later by git-completion.bash to retrieve certain
command groups.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the last patch, common-cmds.h is no longer used (and it was
actually broken). Remove all related code. command-list.h will take
its place from now on.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit added code generation for all_cmd_desc[] which
includes almost everything we need to generate common command list.
Convert help code to use that array instead and drop common_cmds[] array.
The description of each common command group is removed from
command-list.txt. This keeps this file format simpler. common-cmds.h
will not be generated correctly after this change due to the
command-list.txt format change. But it does not matter and
common-cmds.h will be removed.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current generate-cmds.sh generates just enough to print "git help"
output. That is, it only extracts help text for common commands.
The script is now updated to extract help text for all commands and
keep command classification a new file, command-list.h. This will be
useful later:
- "git help -a" could print a short summary of all commands instead of
just the common ones.
- "git" could produce a list of commands of one or more category. One
of its use is to reduce another command classification embedded in
git-completion.bash.
The new file can be generated but is not used anywhere yet. The plan
is we migrate away from common-cmds.h. Then we can kill off
common-cmds.h build rules and generation code (and also delete
duplicate content in command-list.h which we keep for now to not mess
generate-cmds.sh up too much).
PS. The new fixed column requirement on command-list.txt is
technically not needed. But it helps simplify the code a bit at this
stage. We could lift this restriction later if we want to.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it easier to reuse the same code in another place (very
soon).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-determinism makes it harder for build tools to discover when a
target needs to be rebuilt.
generate-cmdlist.sh stores the full path in a comment:
/* Automatically generated by /build/git-agojiD/git-2.15.0/generate-cmdlist.sh */
Use the file name alone instead.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
527ec39 (generate-cmdlist: parse common group commands, 2015-05-21)
replaced generate-cmdlist.sh with a more functional Perl version,
generate-cmdlist.perl. The Perl version gleans named tags from a new
"common groups" section in command-list.txt and recognizes those
tags in "command list" section entries in place of the old 'common'
tag. This allows git-help to, not only recognize, but also group
common commands.
Although the tests require Perl, 527ec39 creates an unconditional
dependence upon Perl in the build system itself, which can not be
overridden with NO_PERL. Such a dependency may be undesirable; for
instance, the 'git-lite' package in the FreeBSD ports tree is
intended as a minimal Git installation (which may, for example, be
useful on servers needing only local clone and update capability),
which, historically, has not depended upon Perl[1].
Therefore, revive generate-cmdlist.sh and extend it to recognize
"common groups" and its named tags. Retire generate-cmdlist.perl.
[1]: http://thread.gmane.org/gmane.comp.version-control.git/275905/focus=276132
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parse the group block to create the array of group descriptions:
static char *common_cmd_groups[] = {
N_("starting a working area"),
N_("working on the current change"),
N_("working with others"),
N_("examining the history and state"),
N_("growing, marking and tweaking your history"),
};
then map each element of common_cmds[] to a group via its index:
static struct cmdname_help common_cmds[] = {
{"add", N_("Add file contents to the index"), 1},
{"branch", N_("List, create, or delete branches"), 4},
{"checkout", N_("Checkout a branch or paths to the ..."), 4},
{"clone", N_("Clone a repository into a new directory"), 0},
{"commit", N_("Record changes to the repository"), 4},
...
};
so that 'git help' can print those commands grouped by theme.
Only commands tagged with an attribute from the group block are emitted to
common_cmds[].
[commit message by Sébastien Guimmara <sebastien.guimmara@gmail.com>]
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Sébastien Guimmara <sebastien.guimmara@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch also marks most common commands' synopsis for translation
so that "git help" gives a friendly listing.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I found that the patched 4 files were different when this
filter is applied.
expand -i | unexpand --first-only
This patch contains the corrected files.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:
struct foo {
int bar;
char *baz;
};
Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.
Linus sayeth:
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "common" man pages there is luckily no "NAME" anywhere except at
beginning of documents. If there is another "NAME", sed could
mis-select it and lead to common-cmds.h corruption. So better nail it
at beginning of line, which would reduce corruption chance.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The categorized list of commands in git(7) and the list of common
commands in "git help" output were maintained separately, which was
insane. This consolidates them to a single command-list.txt file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove apply, archive, cherry-pick, prune, revert, and show-branch, so
"git help" is less intimidating.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While 'init-db' still is and probably will always remain a valid git
command for obvious backward compatibility reasons, it would be a good
idea to move shipped tools and docs to using 'init' instead.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since git-show is pure Porcelain, it is the ideal candidate to
pretty print other things than commits, too.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-archive is a command to make TAR and ZIP archives of a git tree.
It helps prevent a proliferation of git-{format}-tree commands.
Instead of directly calling git-{tar,zip}-tree command, it defines
a very simple API, that archiver should implement and register in
"git-archive.c". This API is made up by 2 functions whose prototype
is defined in "archive.h" file.
- The first one is used to parse 'extra' parameters which have
signification only for the specific archiver. That would allow
different archive backends to have different kind of options.
- The second one is used to ask to an archive backend to build
the archive given some already resolved parameters.
The main reason for making this API is to avoid using
git-{tar,zip}-tree commands, hence making them useless. Maybe it's
time for them to die ?
It also implements remote operations by defining a very simple
protocol: it first sends the name of the specific uploader followed
the repository name (git-upload-tar git://example.org/repo.git).
Then it sends options. It's done by sending a sequence of one
argument per packet, with prefix "argument ", followed by a flush.
The remote protocol is implemented in "git-archive.c" for client
side and is triggered by "--remote=<repo>" option. For example,
to fetch a TAR archive in a remote repo, you can issue:
$ git archive --format=tar --remote=git://xxx/yyy/zzz.git HEAD
We choose to not make a new command "git-fetch-archive" for example,
avoind one more GIT command which should be nice for users (less
commands to remember, keeps existing --remote option).
Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Acked-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>