git/Documentation/git-grep.txt

369 lines
11 KiB
Text
Raw Normal View History

git-grep(1)
===========
NAME
----
git-grep - Print lines matching a pattern
SYNOPSIS
--------
[verse]
'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp]
[-v | --invert-match] [-h|-H] [--full-name]
[-E | --extended-regexp] [-G | --basic-regexp]
[-P | --perl-regexp]
[-F | --fixed-strings] [-n | --line-number] [--column]
[-l | --files-with-matches] [-L | --files-without-match]
[(-O | --open-files-in-pager) [<pager>]]
[-z | --null]
[ -o | --only-matching ] [-c | --count] [--all-match] [-q | --quiet]
[--max-depth <depth>] [--[no-]recursive]
[--color[=<when>] | --no-color]
[--break] [--heading] [-p | --show-function]
[-A <post-context>] [-B <pre-context>] [-C <context>]
[-W | --function-context]
[--threads <num>]
[-f <file>] [-e] <pattern>
[--and|--or|--not|(|)|-e <pattern>...]
[--recurse-submodules] [--parent-basename <basename>]
[ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | <tree>...]
[--] [<pathspec>...]
DESCRIPTION
-----------
Look for specified patterns in the tracked files in the work tree, blobs
registered in the index file, or blobs in given tree objects. Patterns
are lists of one or more search expressions separated by newline
characters. An empty string as search expression matches all lines.
CONFIGURATION
-------------
grep.lineNumber::
If set to true, enable `-n` option by default.
grep.column::
If set to true, enable the `--column` option by default.
grep.patternType::
Set the default matching behavior. Using a value of 'basic', 'extended',
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
value 'default' will return to the default matching behavior.
grep.extendedRegexp::
If set to true, enable `--extended-regexp` option by default. This
option is ignored when the `grep.patternType` option is set to a value
other than 'default'.
grep.threads::
grep: use no. of cores as the default no. of threads When --threads is not specified, git-grep will use 8 threads by default. This fixed number may be too many for machines with fewer cores and too little for machines with more cores. So, instead, use the number of logical cores available in the machine, which seems to result in the best overall performance: The following measurements correspond to the mean elapsed times for 30 git-grep executions in chromium's repository[1] with a 95% confidence interval (each set of 30 were performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is '(static|extern) (int|double) \*'. | Working tree | Object Store ------|-------------------------------|-------------------------------- #ths | Regex 1 | Regex 2 | Regex 1 | Regex 2 ------|---------------|---------------|----------------|--------------- 32 | 2.92s ± 0.01 | 3.72s ± 0.21 | 5.36s ± 0.01 | 6.07s ± 0.01 16 | 2.84s ± 0.01 | 3.57s ± 0.21 | 5.05s ± 0.01 | 5.71s ± 0.01 > 8 | 2.53s ± 0.00 | 3.24s ± 0.21 | 4.86s ± 0.01 | 5.48s ± 0.01 4 | 2.43s ± 0.02 | 3.22s ± 0.20 | 5.22s ± 0.02 | 6.03s ± 0.02 2 | 3.06s ± 0.20 | 4.52s ± 0.01 | 7.52s ± 0.01 | 9.06s ± 0.01 1 | 6.16s ± 0.01 | 9.25s ± 0.02 | 14.10s ± 0.01 | 17.22s ± 0.01 The above tests were performed in a desktop running Debian 10.0 with Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of RAM and a 7200 rpm, SATA 3.1 HDD. Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM: | Working tree | Object Store ------|--------------------------------|-------------------------------- #ths | Regex 1 | Regex 2 | Regex 1 | Regex 2 ------|---------------|----------------|----------------|--------------- 32 | 3.29s ± 0.21 | 4.30s ± 0.01 | 6.30s ± 0.01 | 7.30s ± 0.02 16 | 3.19s ± 0.20 | 4.14s ± 0.02 | 5.91s ± 0.01 | 6.83s ± 0.01 > 8 | 2.90s ± 0.04 | 3.82s ± 0.20 | 5.70s ± 0.02 | 6.53s ± 0.01 4 | 2.84s ± 0.02 | 3.77s ± 0.20 | 6.19s ± 0.02 | 7.18s ± 0.02 2 | 3.73s ± 0.21 | 5.57s ± 0.02 | 9.28s ± 0.01 | 11.22s ± 0.01 1 | 7.48s ± 0.02 | 11.36s ± 0.03 | 17.75s ± 0.01 | 21.87s ± 0.08 [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16 02:40:00 +00:00
Number of grep worker threads to use. If unset (or set to 0), Git will
use as many threads as the number of logical cores available.
grep.fullName::
If set to true, enable `--full-name` option by default.
grep.fallbackToNoIndex::
If set to true, fall back to git grep --no-index if git grep
is executed outside of a git repository. Defaults to false.
OPTIONS
-------
--cached::
Instead of searching tracked files in the working tree, search
blobs registered in the index file.
--no-index::
Search files in the current directory that is not managed by Git.
--untracked::
In addition to searching in the tracked files in the working
tree, search also in untracked files.
--no-exclude-standard::
Also search in ignored files by not honoring the `.gitignore`
mechanism. Only useful with `--untracked`.
--exclude-standard::
Do not pay attention to ignored files specified via the `.gitignore`
mechanism. Only useful when searching files in the current directory
with `--no-index`.
--recurse-submodules::
doc: --recurse-submodules mostly applies to active submodules The documentation refers to "initialized" or "populated" submodules, to explain which submodules are affected by '--recurse-submodules', but the real terminology here is 'active' submodules. Update the documentation accordingly. Some terminology: - Active is defined in gitsubmodules(7), it only involves the configuration variables 'submodule.active', 'submodule.<name>.active' and 'submodule.<name>.url'. The function submodule.c::is_submodule_active checks that a submodule is active. - Populated means that the submodule's working tree is present (and the gitfile correctly points to the submodule repository), i.e. either the superproject was cloned with ` --recurse-submodules`, or the user ran `git submodule update --init`, or `git submodule init [<path>]` and `git submodule update [<path>]` separately which populated the submodule working tree. This does not involve the 3 configuration variables above. - Initialized (at least in the context of the man pages involved in this patch) means both "populated" and "active" as defined above, i.e. what `git submodule update --init` does. The --recurse-submodules option mostly affects active submodules. An exception is `git fetch` where the option affects populated submodules. As a consequence, in `git pull --recurse-submodules` the fetch affects populated submodules, but the resulting working tree update only affects active submodules. In the documentation of `git-pull`, let's distinguish between the fetching part which affects populated submodules, and the updating of worktrees, which only affects active submodules. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-06 13:57:09 +00:00
Recursively search in each submodule that is active and
checked out in the repository. When used in combination with the
<tree> option the prefix of all submodule output will be the name of
the parent project's <tree> object. This option has no effect
if `--no-index` is given.
-a::
--text::
Process binary files as if they were text.
--textconv::
Honor textconv filter settings.
--no-textconv::
Do not honor textconv filter settings.
This is the default.
-i::
--ignore-case::
Ignore case differences between the patterns and the
files.
-I::
Don't match the pattern in binary files.
--max-depth <depth>::
For each <pathspec> given on command line, descend at most <depth>
levels of directories. A value of -1 means no limit.
This option is ignored if <pathspec> contains active wildcards.
In other words if "a*" matches a directory named "a*",
"*" is matched literally so --max-depth is still effective.
-r::
--recursive::
Same as `--max-depth=-1`; this is the default.
--no-recursive::
Same as `--max-depth=0`.
-w::
--word-regexp::
Match the pattern only at word boundary (either begin at the
beginning of a line, or preceded by a non-word character; end at
the end of a line or followed by a non-word character).
-v::
--invert-match::
Select non-matching lines.
-h::
-H::
By default, the command shows the filename for each
match. `-h` option is used to suppress this output.
`-H` is there for completeness and does not do anything
except it overrides `-h` given earlier on the command
line.
--full-name::
When run from a subdirectory, the command usually
outputs paths relative to the current directory. This
option forces paths to be output relative to the project
top directory.
-E::
--extended-regexp::
-G::
--basic-regexp::
Use POSIX extended/basic regexp for patterns. Default
is to use basic regexp.
-P::
--perl-regexp::
Use Perl-compatible regular expressions for patterns.
+
Support for these types of regular expressions is an optional
compile-time dependency. If Git wasn't compiled with support for them
providing this option will cause it to die.
-F::
--fixed-strings::
Use fixed strings for patterns (don't interpret pattern
as a regex).
-n::
--line-number::
Prefix the line number to matching lines.
--column::
Prefix the 1-indexed byte-offset of the first match from the start of the
matching line.
-l::
--files-with-matches::
--name-only::
-L::
--files-without-match::
Instead of showing every matched line, show only the
names of files that contain (or do not contain) matches.
For better compatibility with 'git diff', `--name-only` is a
synonym for `--files-with-matches`.
-O[<pager>]::
--open-files-in-pager[=<pager>]::
Open the matching files in the pager (not the output of 'grep').
If the pager happens to be "less" or "vi", and the user
specified only one pattern, the first file is positioned at
the first match automatically. The `pager` argument is
optional; if specified, it must be stuck to the option
without a space. If `pager` is unspecified, the default pager
will be used (see `core.pager` in linkgit:git-config[1]).
-z::
--null::
Use \0 as the delimiter for pathnames in the output, and print
them verbatim. Without this option, pathnames with "unusual"
characters are quoted as explained for the configuration
variable core.quotePath (see linkgit:git-config[1]).
-o::
--only-matching::
Print only the matched (non-empty) parts of a matching line, with each such
part on a separate output line.
-c::
--count::
Instead of showing every matched line, show the number of
lines that match.
--color[=<when>]::
Show colored matches.
The value must be always (the default), never, or auto.
--no-color::
Turn off match highlighting, even when the configuration file
gives the default to color output.
Same as `--color=never`.
--break::
Print an empty line between matches from different files.
--heading::
Show the filename above the matches in that file instead of
at the start of each shown line.
-p::
--show-function::
Show the preceding line that contains the function name of
the match, unless the matching line is a function name itself.
The name is determined in the same way as `git diff` works out
patch hunk headers (see 'Defining a custom hunk-header' in
linkgit:gitattributes[5]).
-<num>::
-C <num>::
--context <num>::
Show <num> leading and trailing lines, and place a line
containing `--` between contiguous groups of matches.
-A <num>::
--after-context <num>::
Show <num> trailing lines, and place a line containing
`--` between contiguous groups of matches.
-B <num>::
--before-context <num>::
Show <num> leading lines, and place a line containing
`--` between contiguous groups of matches.
-W::
--function-context::
Show the surrounding text from the previous line containing a
function name up to the one before the next function name,
effectively showing the whole function in which the match was
found. The function names are determined in the same way as
`git diff` works out patch hunk headers (see 'Defining a
custom hunk-header' in linkgit:gitattributes[5]).
--threads <num>::
Number of grep worker threads to use.
See `grep.threads` in 'CONFIGURATION' for more information.
-f <file>::
Read patterns from <file>, one per line.
grep: make the behavior for NUL-byte in patterns sane The behavior of "grep" when patterns contained a NUL-byte has always been haphazard, and has served the vagaries of the implementation more than anything else. A pattern containing a NUL-byte can only be provided via "-f <file>". Since pickaxe (log search) has no such flag the NUL-byte in patterns has only ever been supported by "grep" (and not "log --grep"). Since 9eceddeec6 ("Use kwset in grep", 2011-08-21) patterns containing "\0" were considered fixed. In 966be95549 ("grep: add tests to fix blind spots with \0 patterns", 2017-05-20) I added tests for this behavior. Change the behavior to do the obvious thing, i.e. don't silently discard a regex pattern and make it implicitly fixed just because they contain a NUL-byte. Instead die if the backend in question can't handle them, e.g. --basic-regexp is combined with such a pattern. This is desired because from a user's point of view it's the obvious thing to do. Whether we support BRE/ERE/Perl syntax is different from whether our implementation is limited by C-strings. These patterns are obscure enough that I think this behavior change is OK, especially since we never documented the old behavior. Doing this also makes it easier to replace the kwset backend with something else, since we'll no longer strictly need it for anything we can't easily use another fixed-string backend for. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-01 21:20:57 +00:00
+
Passing the pattern via <file> allows for providing a search pattern
containing a \0.
+
Not all pattern types support patterns containing \0. Git will error
out if a given pattern type can't support such a pattern. The
`--perl-regexp` pattern type when compiled against the PCRE v2 backend
has the widest support for these types of patterns.
+
In versions of Git before 2.23.0 patterns containing \0 would be
silently considered fixed. This was never documented, there were also
odd and undocumented interactions between e.g. non-ASCII patterns
containing \0 and `--ignore-case`.
+
In future versions we may learn to support patterns containing \0 for
more search backends, until then we'll die when the pattern type in
question doesn't support them.
-e::
The next parameter is the pattern. This option has to be
used for patterns starting with `-` and should be used in
scripts passing user input to grep. Multiple patterns are
combined by 'or'.
--and::
--or::
--not::
( ... )::
Specify how multiple patterns are combined using Boolean
expressions. `--or` is the default operator. `--and` has
higher precedence than `--or`. `-e` has to be used for all
patterns.
--all-match::
When giving multiple pattern expressions combined with `--or`,
this flag is specified to limit the match to files that
have lines to match all of them.
-q::
--quiet::
Do not output matched lines; instead, exit with status 0 when
there is a match and with non-zero status when there isn't.
<tree>...::
Instead of searching tracked files in the working tree, search
blobs in the given trees.
\--::
Signals the end of options; the rest of the parameters
are <pathspec> limiters.
<pathspec>...::
If given, limit the search to paths matching at least one pattern.
Both leading paths match and glob(7) patterns are supported.
+
For more details about the <pathspec> syntax, see the 'pathspec' entry
in linkgit:gitglossary[7].
EXAMPLES
--------
docs: stop using asciidoc no-inline-literal In asciidoc 7, backticks like `foo` produced a typographic effect, but did not otherwise affect the syntax. In asciidoc 8, backticks introduce an "inline literal" inside which markup is not interpreted. To keep compatibility with existing documents, asciidoc 8 has a "no-inline-literal" attribute to keep the old behavior. We enabled this so that the documentation could be built on either version. It has been several years now, and asciidoc 7 is no longer in wide use. We can now decide whether or not we want inline literals on their own merits, which are: 1. The source is much easier to read when the literal contains punctuation. You can use `master~1` instead of `master{tilde}1`. 2. They are less error-prone. Because of point (1), we tend to make mistakes and forget the extra layer of quoting. This patch removes the no-inline-literal attribute from the Makefile and converts every use of backticks in the documentation to an inline literal (they must be cleaned up, or the example above would literally show "{tilde}" in the output). Problematic sites were found by grepping for '`.*[{\\]' and examined and fixed manually. The results were then verified by comparing the output of "html2text" on the set of generated html pages. Doing so revealed that in addition to making the source more readable, this patch fixes several formatting bugs: - HTML rendering used the ellipsis character instead of literal "..." in code examples (like "git log A...B") - some code examples used the right-arrow character instead of '->' because they failed to quote - api-config.txt did not quote tilde, and the resulting HTML contained a bogus snippet like: <tt><sub></tt> foo <tt></sub>bar</tt> which caused some parsers to choke and omit whole sections of the page. - git-commit.txt confused ``foo`` (backticks inside a literal) with ``foo'' (matched double-quotes) - mentions of `A U Thor <author@example.com>` used to erroneously auto-generate a mailto footnote for author@example.com - the description of --word-diff=plain incorrectly showed the output as "[-removed-] and {added}", not "{+added+}". - using "prime" notation like: commit `C` and its replacement `C'` confused asciidoc into thinking that everything between the first backtick and the final apostrophe were meant to be inside matched quotes - asciidoc got confused by the escaping of some of our asterisks. In particular, `credential.\*` and `credential.<url>.\*` properly escaped the asterisk in the first case, but literally passed through the backslash in the second case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 08:51:57 +00:00
`git grep 'time_t' -- '*.[ch]'`::
Looks for `time_t` in all tracked .c and .h files in the working
directory and its subdirectories.
docs: stop using asciidoc no-inline-literal In asciidoc 7, backticks like `foo` produced a typographic effect, but did not otherwise affect the syntax. In asciidoc 8, backticks introduce an "inline literal" inside which markup is not interpreted. To keep compatibility with existing documents, asciidoc 8 has a "no-inline-literal" attribute to keep the old behavior. We enabled this so that the documentation could be built on either version. It has been several years now, and asciidoc 7 is no longer in wide use. We can now decide whether or not we want inline literals on their own merits, which are: 1. The source is much easier to read when the literal contains punctuation. You can use `master~1` instead of `master{tilde}1`. 2. They are less error-prone. Because of point (1), we tend to make mistakes and forget the extra layer of quoting. This patch removes the no-inline-literal attribute from the Makefile and converts every use of backticks in the documentation to an inline literal (they must be cleaned up, or the example above would literally show "{tilde}" in the output). Problematic sites were found by grepping for '`.*[{\\]' and examined and fixed manually. The results were then verified by comparing the output of "html2text" on the set of generated html pages. Doing so revealed that in addition to making the source more readable, this patch fixes several formatting bugs: - HTML rendering used the ellipsis character instead of literal "..." in code examples (like "git log A...B") - some code examples used the right-arrow character instead of '->' because they failed to quote - api-config.txt did not quote tilde, and the resulting HTML contained a bogus snippet like: <tt><sub></tt> foo <tt></sub>bar</tt> which caused some parsers to choke and omit whole sections of the page. - git-commit.txt confused ``foo`` (backticks inside a literal) with ``foo'' (matched double-quotes) - mentions of `A U Thor <author@example.com>` used to erroneously auto-generate a mailto footnote for author@example.com - the description of --word-diff=plain incorrectly showed the output as "[-removed-] and {added}", not "{+added+}". - using "prime" notation like: commit `C` and its replacement `C'` confused asciidoc into thinking that everything between the first backtick and the final apostrophe were meant to be inside matched quotes - asciidoc got confused by the escaping of some of our asterisks. In particular, `credential.\*` and `credential.<url>.\*` properly escaped the asterisk in the first case, but literally passed through the backslash in the second case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 08:51:57 +00:00
`git grep -e '#define' --and \( -e MAX_PATH -e PATH_MAX \)`::
Looks for a line that has `#define` and either `MAX_PATH` or
`PATH_MAX`.
`git grep --all-match -e NODE -e Unexpected`::
Looks for a line that has `NODE` or `Unexpected` in
files that have lines that match both.
`git grep solution -- :^Documentation`::
Looks for `solution`, excluding files in `Documentation`.
grep: re-enable threads in non-worktree case They were disabled at 53b8d93 ("grep: disable threading in non-worktree case", 12-12-2011), due to observable performance drops (to the point that using a single thread would be faster than multiple threads). But now that zlib inflation can be performed in parallel we can regain the speedup, so let's re-enable threads in non-worktree grep. Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*' ("Regex 2") at chromium's repository[1] I got: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 17.2920s | 20.9624s 2 | 9.6512s | 11.3184s 4 | 6.7723s | 7.6268s 8** | 6.2886s | 6.9843s These are all means of 30 executions after 2 warmup runs. All tests were executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and SSD, running Manjaro Linux. But to make sure the optimization also performs well on HDD, the tests were repeated on another machine with an i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III, 5400 rpm), also running Manjaro Linux: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 18.4035s | 22.5368s 2 | 12.5063s | 14.6409s 4** | 10.9136s | 12.7106s ** Note that in these cases we relied on hyper-threading, and that's probably why we don't see a big difference in time. Unfortunately, multithreaded git-grep might be slow in the non-worktree case when --textconv is used and there're too many text conversions. Probably the reason for this is that the object read lock is used to protect fill_textconv() and therefore there is a mutual exclusion between textconv execution and object reading. Because both are time-consuming operations, not being able to perform them in parallel can cause performance drops. To inform the users about this (and other threading details), let's also add a "NOTES ON THREADS" section to Documentation/git-grep.txt. [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16 02:39:58 +00:00
NOTES ON THREADS
----------------
The `--threads` option (and the grep.threads configuration) will be ignored when
`--open-files-in-pager` is used, forcing a single-threaded execution.
When grepping the object store (with `--cached` or giving tree objects), running
with multiple threads might perform slower than single threaded if `--textconv`
is given and there're too many text conversions. So if you experience low
performance in this case, it might be desirable to use `--threads=1`.
GIT
---
Part of the linkgit:git[1] suite