Commit graph

59 commits

Author SHA1 Message Date
Jeff King 335ec3bf41 grep: allow to use textconv filters
Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:

ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)

"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping,
even when requested by --textconv.

Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-10 10:27:31 -07:00
Antoine Pelisse 3ce3ffb840 fix clang -Wtautological-compare with unsigned enum
Create a GREP_HEADER_FIELD_MIN so we can check that the field value is
sane and silence the clang warning.

Clang warning happens because the enum is unsigned (this is
implementation-defined, and there is no negative fields) and the check
is then tautological.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-25 07:35:55 -08:00
Jeff King e034d1bb92 Merge branch 'nd/grep-true-path'
"git grep -e pattern <tree>" asked the attribute system to read
"<tree>:.gitattributes" file in the working tree, which was
nonsense.

* nd/grep-true-path:
  grep: stop looking at random places for .gitattributes
2012-10-29 04:13:16 -04:00
Nguyễn Thái Ngọc Duy 55c61688ea grep: stop looking at random places for .gitattributes
grep searches for .gitattributes using "name" field in struct
grep_source but that field is not real on-disk path name. For example,
"grep pattern rev" fills the field with "rev:path", and Git looks for
.gitattributes in the (non-existent but exploitable) path "rev:path"
instead of "path".

This patch passes real paths down to grep_source_load_driver() when:

 - grep on work tree
 - grep on the index
 - grep a commit (or a tag if it points to a commit)

so that these cases look up .gitattributes at proper paths.
.gitattributes lookup is disabled in all other cases.

Initial-work-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 08:24:44 -07:00
Junio C Hamano c5c31d3381 grep: move pattern-type bits support to top-level grep.[ch]
Switching between -E/-G/-P/-F correctly needs a lot more than just
flipping opt->regflags bit these days, and we have a nice helper
function buried in builtin/grep.c for the sole use of "git grep".

Extract it so that "log --grep" family can also use it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 23:21:29 -07:00
Junio C Hamano 7687a0541e grep: move the configuration parsing logic to grep.[ch]
The configuration handling is a library-ish part of this program,
that is not specific to "git grep" command.  It should be reusable
by "log" and others.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 16:17:50 -07:00
Junio C Hamano baa6378ff2 log --grep-reflog: reject the option without -g
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 12:07:04 -07:00
Nguyễn Thái Ngọc Duy 72fd13f71c revision: add --grep-reflog to filter commits by reflog messages
Similar to --author/--committer which filters commits by author and
committer header fields. --grep-reflog adds a fake "reflog" header to
commit and a grep filter to search on that line.

All rules to --author/--committer apply except no timestamp stripping.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 11:41:14 -07:00
Nguyễn Thái Ngọc Duy ad4813b3c2 grep: prepare for new header field filter
grep supports only author and committer headers, which have the same
special treatment that later headers may or may not have. Check for
field type and only strip_timestamp() when the field is either author
or committer.

GREP_HEADER_FIELD_MAX is put in the grep_header_field enum to be
calculated automatically, correctly, as long as it's at the end of the
enum.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 11:40:58 -07:00
Junio C Hamano 3d7535e424 Merge branch 'jc/maint-log-grep-all-match'
Fix a long-standing bug in "git log --grep" when multiple "--grep"
are used together with "--all-match" and "--author" or "--committer".

* jc/maint-log-grep-all-match:
  t7810-grep: test --all-match with multiple --grep and --author options
  t7810-grep: test interaction of multiple --grep and --author options
  t7810-grep: test multiple --author with --all-match
  t7810-grep: test multiple --grep with and without --all-match
  t7810-grep: bring log --grep tests in common form
  grep.c: mark private file-scope symbols as static
  log: document use of multiple commit limiting options
  log --grep/--author: honor --all-match honored for multiple --grep patterns
  grep: show --debug output only once
  grep: teach --debug option to dump the parse tree
2012-09-18 14:37:54 -07:00
Junio C Hamano 07a7d656dd grep.c: mark private file-scope symbols as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 23:35:39 -07:00
Junio C Hamano 17bf35a3c7 grep: teach --debug option to dump the parse tree
Our "grep" allows complex boolean expressions to be formed to match
each individual line with operators like --and, '(', ')' and --not.
Introduce the "--debug" option to show the parse tree to help people
who want to debug and enhance it.

Also "log" learns "--grep-debug" option to do the same.  The command
line parser to the log family is a lot more limited than the general
"git grep" parser, but it has special handling for header matching
(e.g. "--author"), and a parse tree is valuable when working on it.

Note that "--all-match" is *not* any individual node in the parse
tree.  It is an instruction to the evaluator to check all the nodes
in the top-level backbone have matched and reject a document as
non-matching otherwise.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-14 10:10:35 -07:00
J Smith 84befcd0a4 grep: add a grep.patternType configuration setting
The grep.extendedRegexp configuration setting enables the -E flag on grep
by default but there are no equivalents for the -G, -F and -P flags.

Rather than adding an additional setting for grep.fooRegexp for current
and future pattern matching options, add a grep.patternType setting that
can accept appropriate values for modifying the default grep pattern
matching behavior. The current values are "basic", "extended", "fixed",
"perl" and "default" for setting -G, -E, -F, -P and the default behavior
respectively.

When grep.patternType is set to a value other than "default", the
grep.extendedRegexp setting is ignored. The value of "default" restores
the current default behavior, including the grep.extendedRegexp
behavior.

Signed-off-by: J Smith <dark.panda@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 09:58:02 -07:00
Junio C Hamano 2147cb2762 Merge branch 'rs/maint-grep-F' into maint
"git grep -e '$pattern'", unlike the case where the patterns are read from
a file, did not treat individual lines in the given pattern argument as
separate regular expressions as it should.

By René Scharfe
* rs/maint-grep-F:
  grep: stop leaking line strings with -f
  grep: support newline separated pattern list
  grep: factor out do_append_grep_pat()
  grep: factor out create_grep_pat()
2012-06-01 13:01:41 -07:00
Junio C Hamano fca9e0013e Merge branch 'rs/maint-grep-F'
"git grep -e '$pattern'", unlike the case where the patterns are read from
a file, did not treat individual lines in the given pattern argument as
separate regular expressions as it should.
2012-05-25 12:04:19 -07:00
René Scharfe 526a858a99 grep: support newline separated pattern list
Currently, patterns that contain newline characters don't match anything
when given to git grep.  Regular grep(1) interprets patterns as lists of
newline separated search strings instead.

Implement this functionality by creating and inserting extra grep_pat
structures for patterns consisting of multiple lines when appending to
the pattern lists.  For simplicity, all pattern strings are duplicated.
The original pattern is truncated in place to make it contain only the
first line.

Requested-by: Torne (Richard Coles) <torne@google.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20 15:25:46 -07:00
Jeff King 41b59bfcb1 grep: respect diff attributes for binary-ness
There is currently no way for users to tell git-grep that a
particular path is or is not a binary file; instead, grep
always relies on its auto-detection (or the user specifying
"-a" to treat all binary-looking files like text).

This patch teaches git-grep to use the same attribute lookup
that is used by git-diff. We could add a new "grep" flag,
but that is unnecessarily complex and unlikely to be useful.
Despite the name, the "-diff" attribute (or "diff=foo" and
the associated diff.foo.binary config option) are really
about describing the contents of the path. It's simply
historical that diff was the only thing that cared about
these attributes in the past.

And if this simple approach turns out to be insufficient, we
still have a backwards-compatible path forward: we can add a
separate "grep" attribute, and fall back to respecting
"diff" if it is unset.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:08 -08:00
Jeff King 94ad9d9e07 grep: cache userdiff_driver in grep_source
Right now, grep only uses the userdiff_driver for one thing:
looking up funcname patterns for "-p" and "-W".  As new uses
for userdiff drivers are added to the grep code, we want to
minimize attribute lookups, which can be expensive.

It might seem at first that this would also optimize multiple
lookups when the funcname pattern for a file is needed
multiple times. However, the compiled funcname pattern is
already cached in struct grep_opt's "priv" member, so
multiple lookups are already suppressed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:08 -08:00
Jeff King c876d6da88 grep: drop grep_buffer's "name" parameter
Before the grep_source interface existed, grep_buffer was
used by two types of callers:

  1. Ones which pulled a file into a buffer, and then wanted
     to supply the file's name for the output (i.e.,
     git grep).

  2. Ones which really just wanted to grep a buffer (i.e.,
     git log --grep).

Callers in set (1) should now be using grep_source. Callers
in set (2) always pass NULL for the "name" parameter of
grep_buffer. We can therefore get rid of this now-useless
parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:08 -08:00
Jeff King e1327023ea grep: refactor the concept of "grep source" into an object
The main interface to the low-level grep code is
grep_buffer, which takes a pointer to a buffer and a size.
This is convenient and flexible (we use it to grep commit
bodies, files on disk, and blobs by sha1), but it makes it
hard to pass extra information about what we are grepping
(either for correctness, like overriding binary
auto-detection, or for optimizations, like lazily loading
blob contents).

Instead, let's encapsulate the idea of a "grep source",
including the buffer, its size, and where the data is coming
from. This is similar to the diff_filespec structure used by
the diff code (unsurprising, since future patches will
implement some of the same optimizations found there).

The diffstat is slightly scarier than the actual patch
content. Most of the modified lines are simply replacing
access to raw variables with their counterparts that are now
in a "struct grep_source". Most of the added lines were
taken from builtin/grep.c, which partially abstracted the
idea of grep sources (for file vs sha1 sources).

Instead of dropping the now-redundant code, this patch
leaves builtin/grep.c using the traditional grep_buffer
interface (which now wraps the grep_source interface). That
makes it easy to test that there is no change of behavior
(yet).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:07 -08:00
Jeff King b3aeb285d0 grep: move sha1-reading mutex into low-level code
The multi-threaded git-grep code needs to serialize access
to the thread-unsafe read_sha1_file call. It does this with
a mutex that is local to builtin/grep.c.

Let's instead push this down into grep.c, where it can be
used by both builtin/grep.c and grep.c. This will let us
safely teach the low-level grep.c code tricks that involve
reading from the object db.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:07 -08:00
Jeff King 78db6ea9dc grep: make locking flag global
The low-level grep code traditionally didn't care about
threading, as it doesn't do any threading itself and didn't
call out to other non-thread-safe code.  That changed with
0579f91 (grep: enable threading with -p and -W using lazy
attribute lookup, 2011-12-12), which pushed the lookup of
funcname attributes (which is not thread-safe) into the
low-level grep code.

As a result, the low-level code learned about a new global
"grep_attr_mutex" to serialize access to the attribute code.
A multi-threaded caller (e.g., builtin/grep.c) is expected
to initialize the mutex and set "use_threads" in the
grep_opt structure. The low-level code only uses the lock if
use_threads is set.

However, putting the use_threads flag into the grep_opt
struct is not the most logical place. Whether threading is
in use is not something that matters for each call to
grep_buffer, but is instead global to the whole program
(i.e., if any thread is doing multi-threaded grep, every
other thread, even if it thinks it is doing its own
single-threaded grep, would need to use the locking).  In
practice, this distinction isn't a problem for us, because
the only user of multi-threaded grep is "git-grep", which
does nothing except call grep.

This patch turns the opt->use_threads flag into a global
flag. More important than the nit-picking semantic argument
above is that this means that the locking functions don't
need to actually have access to a grep_opt to know whether
to lock. Which in turn can make adding new locks simpler, as
we don't need to pass around a grep_opt.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 10:36:07 -08:00
Thomas Rast 0579f91dd7 grep: enable threading with -p and -W using lazy attribute lookup
Lazily load the userdiff attributes in match_funcname().  Use a
separate mutex around this loading to protect the (not thread-safe)
attributes machinery.  This lets us re-enable threading with -p and
-W while reducing the overhead caused by looking up attributes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-16 15:47:10 -08:00
Fredrik Kuivinen 9eceddeec6 Use kwset in grep
Benchmarks for the hot cache case:

before:
$ perf stat --repeat=5 git grep qwerty > /dev/null

Performance counter stats for 'git grep qwerty' (5 runs):

        3,478,085 cache-misses             #      2.322 M/sec   ( +-   2.690% )
       11,356,177 cache-references         #      7.582 M/sec   ( +-   2.598% )
        3,872,184 branch-misses            #      0.363 %       ( +-   0.258% )
    1,067,367,848 branches                 #    712.673 M/sec   ( +-   2.622% )
    3,828,370,782 instructions             #      0.947 IPC     ( +-   0.033% )
    4,043,832,831 cycles                   #   2700.037 M/sec   ( +-   0.167% )
            8,518 page-faults              #      0.006 M/sec   ( +-   3.648% )
              847 CPU-migrations           #      0.001 M/sec   ( +-   3.262% )
            6,546 context-switches         #      0.004 M/sec   ( +-   2.292% )
      1497.695495 task-clock-msecs         #      3.303 CPUs    ( +-   2.550% )

       0.453394396  seconds time elapsed   ( +-   0.912% )

after:
$ perf stat --repeat=5 git grep qwerty > /dev/null

Performance counter stats for 'git grep qwerty' (5 runs):

        2,989,918 cache-misses             #      3.166 M/sec   ( +-   5.013% )
       10,986,041 cache-references         #     11.633 M/sec   ( +-   4.899% )  (scaled from 95.06%)
        3,511,993 branch-misses            #      1.422 %       ( +-   0.785% )
      246,893,561 branches                 #    261.433 M/sec   ( +-   3.967% )
    1,392,727,757 instructions             #      0.564 IPC     ( +-   0.040% )
    2,468,142,397 cycles                   #   2613.494 M/sec   ( +-   0.110% )
            7,747 page-faults              #      0.008 M/sec   ( +-   3.995% )
              897 CPU-migrations           #      0.001 M/sec   ( +-   2.383% )
            6,535 context-switches         #      0.007 M/sec   ( +-   1.993% )
       944.384228 task-clock-msecs         #      3.177 CPUs    ( +-   0.268% )

       0.297257643  seconds time elapsed   ( +-   0.450% )

So we gain about 35% by using the kwset code.

As a side effect of using kwset two grep tests are fixed by this
patch. The first is fixed because kwset can deal with case-insensitive
search containing NULs, something strcasestr cannot do. The second one
is fixed because we consider patterns containing NULs as fixed strings
(regcomp cannot accept patterns with NULs).

Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-20 22:33:58 -07:00
René Scharfe ba8ea7496f grep: add option to show whole function as context
Add a new option, -W, to show the whole surrounding function of a match.

It uses the same regular expressions as -p and diff to find the beginning
of sections.

Currently it will not display comments in front of a function, but those
that are following one.  Despite this shortcoming it is already useful,
e.g. to simply see a more complete applicable context or to extract whole
functions.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01 16:09:15 -07:00
René Scharfe 1d84f72ef1 grep: add --heading
With --heading, the filename is printed once before matches from that
file instead of at the start of each line, giving more screen space to
the actual search results.

This option is taken from ack (http://betterthangrep.com/).  And now
git grep can dress up like it:

	$ git config alias.ack "grep --break --heading --line-number"

	$ git ack -e --heading
	Documentation/git-grep.txt
	154:--heading::

	t/t7810-grep.sh
	785:test_expect_success 'grep --heading' '
	786:    git grep --heading -e char -e lo_w hello.c hello_world >actual &&
	808:    git grep --break --heading -n --color \

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05 18:15:27 -07:00
René Scharfe a8f0e7649e grep: add --break
With --break, an empty line is printed between matches from different
files, increasing readability.  This option is taken from ack
(http://betterthangrep.com/).

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05 18:15:26 -07:00
Michał Kiedrowicz 63e7e9d8b6 git-grep: Learn PCRE
This patch teaches git-grep the --perl-regexp/-P options (naming
borrowed from GNU grep) in order to allow specifying PCRE regexes on the
command line.

PCRE has a number of features which make them more handy to use than
POSIX regexes, like consistent escaping rules, extended character
classes, ungreedy matching etc.

git isn't build with PCRE support automatically. USE_LIBPCRE environment
variable must be enabled (like `make USE_LIBPCRE=YesPlease`).

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09 16:29:33 -07:00
Junio C Hamano 5aaeb733f5 log --author: take union of multiple "author" requests
In the olden days,

    log --author=me --committer=him --grep=this --grep=that

used to be turned into:

    (OR (HEADER-AUTHOR me)
        (HEADER-COMMITTER him)
        (PATTERN this)
        (PATTERN that))

showing my patches that do not have any "this" nor "that", which was
totally useless.

80235ba ("log --author=me --grep=it" should find intersection, not union,
2010-01-17) improved it greatly to turn the same into:

    (ALL-MATCH
      (HEADER-AUTHOR me)
      (HEADER-COMMITTER him)
      (OR (PATTERN this) (PATTERN that)))

That is, "show only patches by me and committed by him, that have either
this or that", which is a lot more natural thing to ask.

We however need to be a bit more clever when the user asks more than one
"author" (or "committer"); because a commit has only one author (and one
committer), they ought to be interpreted as asking for union to be useful.
The current implementation simply added another author/committer pattern
at the same top-level for ALL-MATCH to insist on matching all, finding
nothing.

Turn

    log --author=me --author=her \
    	--committer=him --committer=you \
	--grep=this --grep=that

into

    (ALL-MATCH
      (OR (HEADER-AUTHOR me) (HEADER-AUTHOR her))
      (OR (HEADER-COMMITTER him) (HEADER-COMMITTER you))
      (OR (PATTERN this) (PATTERN that)))

instead.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-13 01:11:55 -07:00
Junio C Hamano 8d676d85f7 Merge branch 'gv/portable'
* gv/portable:
  test-lib: use DIFF definition from GIT-BUILD-OPTIONS
  build: propagate $DIFF to scripts
  Makefile: Tru64 portability fix
  Makefile: HP-UX 10.20 portability fixes
  Makefile: HPUX11 portability fixes
  Makefile: SunOS 5.6 portability fix
  inline declaration does not work on AIX
  Allow disabling "inline"
  Some platforms lack socklen_t type
  Make NO_{INET_NTOP,INET_PTON} configured independently
  Makefile: some platforms do not have hstrerror anywhere
  git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
  test_cmp: do not use "diff -u" on platforms that lack one
  fixup: do not unconditionally disable "diff -u"
  tests: use "test_cmp", not "diff", when verifying the result
  Do not use "diff" found on PATH while building and installing
  enums: omit trailing comma for portability
  Makefile: -lpthread may still be necessary when libc has only pthread stubs
  Rewrite dynamic structure initializations to runtime assignment
  Makefile: pass CPPFLAGS through to fllow customization

Conflicts:
	Makefile
	wt-status.h
2010-06-21 06:02:44 -07:00
Gary V. Vaughan 4b05548fc0 enums: omit trailing comma for portability
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.

enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.

Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:27 -07:00
René Scharfe ed40a0951c grep: support NUL chars in search strings for -F
Search patterns in a file specified with -f can contain NUL characters.
The current code ignores all characters on a line after a NUL.

Pass the actual length of the line all the way from the pattern file to
fixmatch() and use it for case-sensitive fixed string matching.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-24 11:22:07 -07:00
Junio C Hamano f1aa782a3b Merge branch 'ml/color-grep'
* ml/color-grep:
  grep: Colorize selected, context, and function lines
  grep: Colorize filename, line number, and separator
  Add GIT_COLOR_BOLD_* and GIT_COLOR_BG_*
2010-03-20 11:29:36 -07:00
Mark Lodato 00588bb5cd grep: Colorize selected, context, and function lines
Colorize non-matching text of selected lines, context lines, and
function name lines.  The default for all three is no color, but they
can be configured using color.grep.<slot>.  The first two are similar
to the corresponding options in GNU grep, except that GNU grep applies
the color to the entire line, not just non-matching text.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08 00:30:59 -08:00
Mark Lodato 55f638bdc6 grep: Colorize filename, line number, and separator
Colorize the filename, line number, and separator in git grep output, as
GNU grep does.  The colors are customizable through color.grep.<slot>.
The default is to only color the separator (in cyan), since this gives
the biggest legibility increase without overwhelming the user with
colors.  GNU grep also defaults cyan for the separator, but defaults to
magenta for the filename and to green for the line number, as well.

There is one difference from GNU grep: When a binary file matches
without -a, GNU grep does not color the <file> in "Binary file <file>
matches", but we do.

Like GNU grep, if --null is given, the null separators are not colored.

For config.txt, use a a sub-list to describe the slots, rather than
a single paragraph with parentheses, since this is much more readable.

Remove the cast to int for `rm_eo - rm_so` since it is not necessary.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08 00:30:44 -08:00
Junio C Hamano 6b45b8c088 Merge branch 'jc/grep-author-all-match-implicit'
* jc/grep-author-all-match-implicit:
  "log --author=me --grep=it" should find intersection, not union
2010-03-02 12:44:06 -08:00
Fredrik Kuivinen 5b594f457a Threaded grep
Make git grep use threads when it is available.

The results below are best of five runs in the Linux repository (on a
box with two cores).

With the patch:

git grep qwerty
1.58user 0.55system 0:01.16elapsed 183%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+5774minor)pagefaults 0swaps

Without:

git grep qwerty
1.59user 0.43system 0:02.02elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+3716minor)pagefaults 0swaps

And with a pattern with quite a few matches:

With the patch:

$ /usr/bin/time git grep void
5.61user 0.56system 0:03.44elapsed 179%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+5587minor)pagefaults 0swaps

Without:

$ /usr/bin/time git grep void
5.36user 0.51system 0:05.87elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+3693minor)pagefaults 0swaps

In either case we gain about 40% by the threading.

Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-26 09:20:07 -08:00
Junio C Hamano 80235ba79e "log --author=me --grep=it" should find intersection, not union
Historically, any grep filter in "git log" family of commands were taken
as restricting to commits with any of the words in the commit log message.
However, the user almost always want to find commits "done by this person
on that topic".  With "--all-match" option, a series of grep patterns can
be turned into a requirement that all of them must produce a match, but
that makes it impossible to ask for "done by me, on either this or that"
with:

	log --author=me --committer=him --grep=this --grep=that

because it will require both "this" and "that" to appear.

Change the "header" parser of grep library to treat the headers specially,
and parse it as:

	(all-match-OR (HEADER-AUTHOR me)
		      (HEADER-COMMITTER him)
		      (OR
		      	(PATTERN this)
			(PATTERN that) ) )

Even though the "log" command line parser doesn't give direct access to
the extended grep syntax to group terms with parentheses, this change will
cover the majority of the case the users would want.

This incidentally revealed that one test in t7002 was bogus.  It ran:

	log --author=Thor --grep=Thu --format='%s'

and expected (wrongly) "Thu" to match "Thursday" in the author/committer
date, but that would never match, as the timestamp in raw commit buffer
does not have the name of the day-of-the-week.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-25 19:28:13 -08:00
Junio C Hamano bbc09c22b9 grep: rip out support for external grep
We still allow people to pass --[no-]ext-grep on the command line,
but the option is ignored.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-13 01:04:54 -08:00
Brian Collins 5183bf6727 grep: Allow case insensitive search of fixed-strings
"git grep" currently an error when you combine the -F and -i flags.
This isn't in line with how GNU grep handles it.

This patch allows the simultaneous use of those flags.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Brian Collins <bricollins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-16 16:06:46 -08:00
Junio C Hamano 5b590d783a Merge branch 'maint'
* maint:
  GIT 1.6.4.3
  svn: properly escape arguments for authors-prog
  http.c: remove verification of remote packs
  grep: accept relative paths outside current working directory
  grep: fix exit status if external_grep() punts

Conflicts:
	GIT-VERSION-GEN
	RelNotes
2009-09-13 01:30:53 -07:00
Junio C Hamano 45c58ba00a Merge branch 'cb/maint-1.6.3-grep-relative-up' into maint
* cb/maint-1.6.3-grep-relative-up:
  grep: accept relative paths outside current working directory
  grep: fix exit status if external_grep() punts

Conflicts:
	t/t7002-grep.sh
2009-09-13 01:24:20 -07:00
Clemens Buchacher 493b7a08d8 grep: accept relative paths outside current working directory
"git grep" would barf at relative paths pointing outside the current
working directory (or subdirectories thereof). Use quote_path_relative(),
which can handle such cases just fine.

[jc: added tests.]

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-07 15:03:04 -07:00
Michał Kiedrowicz a91f453f64 grep: Add --max-depth option.
It is useful to grep directories non-recursively, e.g. when one wants to
look for all files in the toplevel directory, but not in any subdirectory,
or in Documentation/, but not in Documentation/technical/.

This patch adds support for --max-depth <depth> option to git-grep. If it is
given, git-grep descends at most <depth> levels of directories below paths
specified on the command line.

Note that if path specified on command line contains wildcards, this option
makes no sense, e.g.

    $ git grep -l --max-depth 0 GNU -- 'contrib/*'

(note the quotes) will search all files in contrib/, even in
subdirectories, because '*' matches all files.

Documentation updates, bash-completion and simple test cases are also
provided.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-22 21:54:54 -07:00
René Scharfe 60ecac98ed grep -p: support user defined regular expressions
Respect the userdiff attributes and config settings when looking for
lines with function definitions in git grep -p.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:50 -07:00
René Scharfe 2944e4e614 grep: add option -p/--show-function
The new option -p instructs git grep to print the previous function
definition as a context line, similar to diff -p.  Such context lines
are marked with an equal sign instead of a dash.  This option
complements the existing context options -A, -B, -C.

Function definitions are detected using the same heuristic that diff
uses.  User defined regular expressions are not supported, yet.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:49 -07:00
René Scharfe 046802d015 grep: print context hunk marks between files
Print a hunk mark before matches from a new file are shown, in addition
to the current behaviour of printing them if lines have been skipped.

The result is easier to read, as (presumably unrelated) matches from
different files are separated by a hunk mark.  GNU grep does the same.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:46 -07:00
René Scharfe 5dd06d3879 grep: move context hunk mark handling into show_line()
Move last_shown into struct grep_opt, to make it available in
show_line(), and then make the function handle the printing of hunk
marks for context lines in a central place.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:45 -07:00
René Scharfe 3e230fa1b2 grep: use parseopt
Convert git-grep to parseopt.

The bitfields in struct grep_opt are converted to full ints,
increasing its size.  This shouldn't be a problem as there is only a
single instance in memory.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-09 00:29:56 -07:00
René Scharfe a94982ef39 grep: add support for coloring with external greps
Add the config variable color.grep.external, which can be used to
switch on coloring of external greps.  To enable auto coloring with
GNU grep, one needs to set color.grep.external to --color=always to
defeat the pager started by git grep.  The value of the config
variable will be passed to the external grep only if it would
colorize internal grep's output, so automatic terminal detected
works.  The default is to not pass any option, because the external
grep command could be a program without color support.

Also set the environment variables GREP_COLOR and GREP_COLORS to
pass the configured color for matches to the external grep.  This
works with GNU grep; other variables could be added as needed.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 11:34:59 -08:00