Commit graph

32 commits

Author SHA1 Message Date
Michał Kiedrowicz
63e7e9d8b6 git-grep: Learn PCRE
This patch teaches git-grep the --perl-regexp/-P options (naming
borrowed from GNU grep) in order to allow specifying PCRE regexes on the
command line.

PCRE regexes have a number of features which make them handier to use than
POSIX regexes, like consistent escaping rules, extended character
classes, ungreedy matching etc.
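
As a rough illustration of ungreedy matching, here is a minimal sketch using Python's `re` module, which is not PCRE but shares the Perl-style `+?` quantifier. The helper name `first_tag` and the sample text are made up for this example.

```python
import re

def first_tag(text: str, greedy: bool) -> str:
    """Return the first "<...>" span, using a greedy or an ungreedy quantifier."""
    pattern = r"<.+>" if greedy else r"<.+?>"
    found = re.search(pattern, text)
    return found.group(0) if found else ""

sample = "<b>bold</b> and <i>italic</i>"
print(first_tag(sample, greedy=True))   # runs on to the last ">"
print(first_tag(sample, greedy=False))  # stops at the first ">"
```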

git isn't built with PCRE support automatically. The USE_LIBPCRE environment
variable must be set (e.g. `make USE_LIBPCRE=YesPlease`).

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09 16:29:33 -07:00
Junio C Hamano
5aaeb733f5 log --author: take union of multiple "author" requests
In the olden days,

    log --author=me --committer=him --grep=this --grep=that

used to be turned into:

    (OR (HEADER-AUTHOR me)
        (HEADER-COMMITTER him)
        (PATTERN this)
        (PATTERN that))

showing my patches that do not have any "this" nor "that", which was
totally useless.

80235ba ("log --author=me --grep=it" should find intersection, not union,
2010-01-17) improved it greatly to turn the same into:

    (ALL-MATCH
      (HEADER-AUTHOR me)
      (HEADER-COMMITTER him)
      (OR (PATTERN this) (PATTERN that)))

That is, "show only patches by me and committed by him, that have either
this or that", which is a lot more natural thing to ask.

We however need to be a bit more clever when the user asks for more than one
"author" (or "committer"); because a commit has only one author (and one
committer), they ought to be interpreted as asking for union to be useful.
The current implementation simply added another author/committer pattern
at the same top-level for ALL-MATCH to insist on matching all, finding
nothing.

Turn

    log --author=me --author=her \
        --committer=him --committer=you \
        --grep=this --grep=that

into

    (ALL-MATCH
      (OR (HEADER-AUTHOR me) (HEADER-AUTHOR her))
      (OR (HEADER-COMMITTER him) (HEADER-COMMITTER you))
      (OR (PATTERN this) (PATTERN that)))

instead.
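
The expression tree above can be sketched as an "all of the groups, any within a group" check. This is only an illustration in Python, not the grep library's C code; `commit_matches` and the dictionary layout of a commit are assumptions made for the example.

```python
import re

def commit_matches(commit, authors=(), committers=(), greps=()):
    """ALL-MATCH over one OR group per kind of option (hypothetical helper)."""
    groups = [
        (authors, commit["author"]),
        (committers, commit["committer"]),
        (greps, commit["message"]),
    ]
    # Every non-empty group must be satisfied (ALL-MATCH), and a group
    # is satisfied when any one of its patterns matches (OR).
    return all(
        any(re.search(pattern, text) for pattern in patterns)
        for patterns, text in groups
        if patterns
    )

commit = {"author": "her", "committer": "you", "message": "fix that bug"}
print(commit_matches(commit, authors=["me", "her"],
                     committers=["him", "you"], greps=["this", "that"]))
```

With a single top-level ALL-MATCH over all four header patterns, the same commit could never match, since it has only one author and one committer.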

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-13 01:11:55 -07:00
Junio C Hamano
8d676d85f7 Merge branch 'gv/portable'
* gv/portable:
  test-lib: use DIFF definition from GIT-BUILD-OPTIONS
  build: propagate $DIFF to scripts
  Makefile: Tru64 portability fix
  Makefile: HP-UX 10.20 portability fixes
  Makefile: HPUX11 portability fixes
  Makefile: SunOS 5.6 portability fix
  inline declaration does not work on AIX
  Allow disabling "inline"
  Some platforms lack socklen_t type
  Make NO_{INET_NTOP,INET_PTON} configured independently
  Makefile: some platforms do not have hstrerror anywhere
  git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
  test_cmp: do not use "diff -u" on platforms that lack one
  fixup: do not unconditionally disable "diff -u"
  tests: use "test_cmp", not "diff", when verifying the result
  Do not use "diff" found on PATH while building and installing
  enums: omit trailing comma for portability
  Makefile: -lpthread may still be necessary when libc has only pthread stubs
  Rewrite dynamic structure initializations to runtime assignment
  Makefile: pass CPPFLAGS through to allow customization

Conflicts:
	Makefile
	wt-status.h
2010-06-21 06:02:44 -07:00
Gary V. Vaughan
4b05548fc0 enums: omit trailing comma for portability
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.

enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.

Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:27 -07:00
René Scharfe
ed40a0951c grep: support NUL chars in search strings for -F
Search patterns in a file specified with -f can contain NUL characters.
The current code ignores all characters on a line after a NUL.

Pass the actual length of the line all the way from the pattern file to
fixmatch() and use it for case-sensitive fixed string matching.
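
To illustrate why the length has to be carried along, here is a small Python sketch. It is not the C implementation; `fixmatch_line` and `truncated_at_nul` are hypothetical names that contrast a length-aware match with what a NUL-terminated view of the same pattern would see.

```python
def fixmatch_line(line: bytes, pattern: bytes) -> bool:
    """Fixed-string match that uses the full pattern, embedded NULs included."""
    return pattern in line

def truncated_at_nul(pattern: bytes) -> bytes:
    """The part of the pattern a NUL-terminated string view would stop at."""
    return pattern.split(b"\0", 1)[0]

pattern = b"foo\0bar"
print(fixmatch_line(b"only foo here", pattern))                    # full pattern
print(fixmatch_line(b"only foo here", truncated_at_nul(pattern)))  # truncated pattern
```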

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-24 11:22:07 -07:00
Junio C Hamano
f1aa782a3b Merge branch 'ml/color-grep'
* ml/color-grep:
  grep: Colorize selected, context, and function lines
  grep: Colorize filename, line number, and separator
  Add GIT_COLOR_BOLD_* and GIT_COLOR_BG_*
2010-03-20 11:29:36 -07:00
Mark Lodato
00588bb5cd grep: Colorize selected, context, and function lines
Colorize non-matching text of selected lines, context lines, and
function name lines.  The default for all three is no color, but they
can be configured using color.grep.<slot>.  The first two are similar
to the corresponding options in GNU grep, except that GNU grep applies
the color to the entire line, not just non-matching text.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08 00:30:59 -08:00
Mark Lodato
55f638bdc6 grep: Colorize filename, line number, and separator
Colorize the filename, line number, and separator in git grep output, as
GNU grep does.  The colors are customizable through color.grep.<slot>.
The default is to only color the separator (in cyan), since this gives
the biggest legibility increase without overwhelming the user with
colors.  GNU grep also defaults cyan for the separator, but defaults to
magenta for the filename and to green for the line number, as well.

There is one difference from GNU grep: When a binary file matches
without -a, GNU grep does not color the <file> in "Binary file <file>
matches", but we do.

Like GNU grep, if --null is given, the null separators are not colored.

For config.txt, use a sub-list to describe the slots, rather than
a single paragraph with parentheses, since this is much more readable.

Remove the cast to int for `rm_eo - rm_so` since it is not necessary.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08 00:30:44 -08:00
Junio C Hamano
6b45b8c088 Merge branch 'jc/grep-author-all-match-implicit'
* jc/grep-author-all-match-implicit:
  "log --author=me --grep=it" should find intersection, not union
2010-03-02 12:44:06 -08:00
Fredrik Kuivinen
5b594f457a Threaded grep
Make git grep use threads when they are available.

The results below are best of five runs in the Linux repository (on a
box with two cores).

With the patch:

git grep qwerty
1.58user 0.55system 0:01.16elapsed 183%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+5774minor)pagefaults 0swaps

Without:

git grep qwerty
1.59user 0.43system 0:02.02elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+3716minor)pagefaults 0swaps

And with a pattern with quite a few matches:

With the patch:

$ /usr/bin/time git grep void
5.61user 0.56system 0:03.44elapsed 179%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+5587minor)pagefaults 0swaps

Without:

$ /usr/bin/time git grep void
5.36user 0.51system 0:05.87elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+800outputs (0major+3693minor)pagefaults 0swaps

In either case we gain about 40% by the threading.
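
The "about 40%" figure can be checked from the elapsed times quoted above. A minimal Python sketch of the arithmetic (the function name `elapsed_reduction` is made up here):

```python
def elapsed_reduction(without: float, with_patch: float) -> float:
    """Fraction of wall-clock time saved relative to the unthreaded run."""
    return (without - with_patch) / without

# Elapsed seconds copied from the timings above: (without, with the patch)
runs = {"qwerty": (2.02, 1.16), "void": (5.87, 3.44)}
for pattern, (without, with_patch) in runs.items():
    saved = elapsed_reduction(without, with_patch)
    print(f"{pattern}: {saved:.1%} less elapsed time")
```

Both runs come out a little above 40%.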

Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-26 09:20:07 -08:00
Junio C Hamano
80235ba79e "log --author=me --grep=it" should find intersection, not union
Historically, any grep filter in "git log" family of commands was taken
as restricting to commits with any of the words in the commit log message.
However, the user almost always wants to find commits "done by this person
on that topic".  With "--all-match" option, a series of grep patterns can
be turned into a requirement that all of them must produce a match, but
that makes it impossible to ask for "done by me, on either this or that"
with:

	log --author=me --committer=him --grep=this --grep=that

because it will require both "this" and "that" to appear.

Change the "header" parser of grep library to treat the headers specially,
and parse it as:

	(all-match-OR (HEADER-AUTHOR me)
		      (HEADER-COMMITTER him)
		      (OR
		      	(PATTERN this)
			(PATTERN that) ) )

Even though the "log" command line parser doesn't give direct access to
the extended grep syntax to group terms with parentheses, this change will
cover the majority of the cases the users would want.

This incidentally revealed that one test in t7002 was bogus.  It ran:

	log --author=Thor --grep=Thu --format='%s'

and expected (wrongly) "Thu" to match "Thursday" in the author/committer
date, but that would never match, as the timestamp in raw commit buffer
does not have the name of the day-of-the-week.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-25 19:28:13 -08:00
Junio C Hamano
bbc09c22b9 grep: rip out support for external grep
We still allow people to pass --[no-]ext-grep on the command line,
but the option is ignored.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-13 01:04:54 -08:00
Brian Collins
5183bf6727 grep: Allow case insensitive search of fixed-strings
"git grep" currently gives an error when you combine the -F and -i flags.
This isn't in line with how GNU grep handles it.

This patch allows the simultaneous use of those flags.
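
What combining the two flags means can be sketched in Python as below. This is an illustration only and says nothing about how the C code does it; `fixed_match` is a hypothetical helper.

```python
def fixed_match(line: str, needle: str, ignore_case: bool = False) -> bool:
    """Literal substring search; with ignore_case, compare case-folded copies."""
    if ignore_case:
        line, needle = line.lower(), needle.lower()
    return needle in line

print(fixed_match("see A.B here", "a.b", ignore_case=True))  # case differs only
print(fixed_match("see aXb here", "a.b", ignore_case=True))  # "." stays literal
```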

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Brian Collins <bricollins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-16 16:06:46 -08:00
Junio C Hamano
5b590d783a Merge branch 'maint'
* maint:
  GIT 1.6.4.3
  svn: properly escape arguments for authors-prog
  http.c: remove verification of remote packs
  grep: accept relative paths outside current working directory
  grep: fix exit status if external_grep() punts

Conflicts:
	GIT-VERSION-GEN
	RelNotes
2009-09-13 01:30:53 -07:00
Junio C Hamano
45c58ba00a Merge branch 'cb/maint-1.6.3-grep-relative-up' into maint
* cb/maint-1.6.3-grep-relative-up:
  grep: accept relative paths outside current working directory
  grep: fix exit status if external_grep() punts

Conflicts:
	t/t7002-grep.sh
2009-09-13 01:24:20 -07:00
Clemens Buchacher
493b7a08d8 grep: accept relative paths outside current working directory
"git grep" would barf at relative paths pointing outside the current
working directory (or subdirectories thereof). Use quote_path_relative(),
which can handle such cases just fine.

[jc: added tests.]

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-07 15:03:04 -07:00
Michał Kiedrowicz
a91f453f64 grep: Add --max-depth option.
It is useful to grep directories non-recursively, e.g. when one wants to
look for all files in the toplevel directory, but not in any subdirectory,
or in Documentation/, but not in Documentation/technical/.

This patch adds support for --max-depth <depth> option to git-grep. If it is
given, git-grep descends at most <depth> levels of directories below paths
specified on the command line.
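
The depth rule can be sketched as counting directory levels below the given path. This Python sketch is an assumption-laden illustration, not the patch's code: `within_max_depth` is a made-up name, paths use "/" separators, and wildcards and negative depths are not handled.

```python
def within_max_depth(path: str, base: str, max_depth: int) -> bool:
    """True if `path` lies at most `max_depth` directory levels below `base`."""
    base = base.strip("/")
    path = path.strip("/")
    if base:
        if not path.startswith(base + "/"):
            return False
        rest = path[len(base) + 1:]
    else:
        rest = path  # an empty base stands for the toplevel directory
    # Each "/" left in the remainder is one directory level descended.
    return rest.count("/") <= max_depth

print(within_max_depth("Documentation/git-grep.txt", "Documentation", 0))
print(within_max_depth("Documentation/technical/api.txt", "Documentation", 0))
```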

Note that if a path specified on the command line contains wildcards, this option
makes no sense, e.g.

    $ git grep -l --max-depth 0 GNU -- 'contrib/*'

(note the quotes) will search all files in contrib/, even in
subdirectories, because '*' matches all files.

Documentation updates, bash-completion and simple test cases are also
provided.

Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-22 21:54:54 -07:00
René Scharfe
60ecac98ed grep -p: support user defined regular expressions
Respect the userdiff attributes and config settings when looking for
lines with function definitions in git grep -p.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:50 -07:00
René Scharfe
2944e4e614 grep: add option -p/--show-function
The new option -p instructs git grep to print the previous function
definition as a context line, similar to diff -p.  Such context lines
are marked with an equal sign instead of a dash.  This option
complements the existing context options -A, -B, -C.

Function definitions are detected using the same heuristic that diff
uses.  User defined regular expressions are not supported, yet.
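
A rough Python sketch of the idea follows. It assumes the heuristic is "a line that starts with a letter, '_' or '$'", and it simplifies the output to `file=line` for the function context and `file:line` for the match; the helper names are made up.

```python
def looks_like_function(line: str) -> bool:
    """Assumed heuristic: the line starts with a letter, '_' or '$'."""
    return bool(line) and (line[0].isalpha() or line[0] in "_$")

def show_function(name, lines, match_index):
    """Output for one match: the preceding function line (if any), then the match."""
    out = []
    # Walk backwards from the match to the closest function-looking line.
    for i in range(match_index - 1, -1, -1):
        if looks_like_function(lines[i]):
            out.append(f"{name}={lines[i]}")
            break
    out.append(f"{name}:{lines[match_index]}")
    return out

source = ["#include <stdio.h>", "", "int main(void)", "{", '\tputs("hi");', "}"]
print("\n".join(show_function("hello.c", source, 4)))
```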

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:49 -07:00
René Scharfe
046802d015 grep: print context hunk marks between files
Print a hunk mark before matches from a new file are shown, in addition
to the current behaviour of printing them if lines have been skipped.

The result is easier to read, as (presumably unrelated) matches from
different files are separated by a hunk mark.  GNU grep does the same.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:46 -07:00
René Scharfe
5dd06d3879 grep: move context hunk mark handling into show_line()
Move last_shown into struct grep_opt, to make it available in
show_line(), and then make the function handle the printing of hunk
marks for context lines in a central place.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:45 -07:00
René Scharfe
3e230fa1b2 grep: use parseopt
Convert git-grep to parseopt.

The bitfields in struct grep_opt are converted to full ints,
increasing its size.  This shouldn't be a problem as there is only a
single instance in memory.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-09 00:29:56 -07:00
René Scharfe
a94982ef39 grep: add support for coloring with external greps
Add the config variable color.grep.external, which can be used to
switch on coloring of external greps.  To enable auto coloring with
GNU grep, one needs to set color.grep.external to --color=always to
defeat the pager started by git grep.  The value of the config
variable will be passed to the external grep only if it would
colorize internal grep's output, so automatic terminal detection
works.  The default is to not pass any option, because the external
grep command could be a program without color support.

Also set the environment variables GREP_COLOR and GREP_COLORS to
pass the configured color for matches to the external grep.  This
works with GNU grep; other variables could be added as needed.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 11:34:59 -08:00
René Scharfe
7e8f59d577 grep: color patterns in output
Coloring matches makes them easier to spot in the output.

Add two options and two parameters: color.grep (to turn coloring on
or off), color.grep.match (to set the color of matches), --color
and --no-color (to turn coloring on or off, respectively).

The output of external greps is not changed.

This patch is based on earlier ones by Nguyễn Thái Ngọc Duy and
Thiago Alves.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 11:34:59 -08:00
René Scharfe
d7eb527d73 grep: remove grep_opt argument from match_expr_eval()
The only use of the struct grep_opt argument of match_expr_eval()
is to pass the option word_regexp to match_one_pattern().  By adding
a pattern flag for it we can reduce the number of function arguments
of these two functions, as a cleanup and preparation for adding more
in the next patch.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 11:34:56 -08:00
René Scharfe
c822255cfc grep: don't call regexec() for fixed strings
Add the new flag "fixed" to struct grep_pat and set it if the pattern
doesn't contain any regex control characters, in addition to when the
flag -F/--fixed-strings was specified.

This gives a nice speed up on msysgit, where regexec() seems to be
extra slow.  Before (best of five runs):

	$ time git grep grep v1.6.1 >/dev/null

	real    0m0.552s
	user    0m0.000s
	sys     0m0.000s

	$ time git grep -F grep v1.6.1 >/dev/null

	real    0m0.170s
	user    0m0.000s
	sys     0m0.015s

With the patch:

	$ time git grep grep v1.6.1 >/dev/null

	real    0m0.173s
	user    0m0.000s
	sys     0m0.000s

The difference is much smaller on Linux, but still measurable.
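
The shortcut can be sketched in Python as follows. The set of "regex control characters" below is an assumption made for the example, and `is_fixed` and `match_line` are hypothetical names, not the functions in grep.c.

```python
import re

# Assumed set of characters that make a pattern a "real" regex.
REGEX_SPECIAL = set("$()*+.?[\\^{|")

def is_fixed(pattern: str) -> bool:
    """True when the pattern can be searched for as a plain string."""
    return not any(ch in REGEX_SPECIAL for ch in pattern)

def match_line(line: str, pattern: str, fixed_strings: bool = False) -> bool:
    """Skip the regex engine when -F was given or the pattern has no specials."""
    if fixed_strings or is_fixed(pattern):
        return pattern in line
    return re.search(pattern, line) is not None

print(is_fixed("grep"), is_fixed("gr.p"))
```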

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-09 21:35:56 -08:00
Raphael Zimmerer
83caecca2f git grep: Add "-z/--null" option as in GNU's grep.
Here's a trivial patch that adds "-z" and "--null" options to "git
grep". It was discussed on the mailing-list that git's "-z"
convention should be used instead of GNU grep's "-Z".
So things like 'git grep -l -z "$FOO" | xargs -0 sed -i "s/$FOO/$BOO/"'
do work now.
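
For consumers other than xargs -0, the NUL-terminated output is easy to split safely. A minimal Python sketch, assuming the output of "git grep -l -z" has already been captured as bytes (`parse_null_separated` is a made-up helper):

```python
from typing import List

def parse_null_separated(output: bytes) -> List[str]:
    """Split NUL-terminated file names; names may contain spaces or newlines."""
    return [
        name.decode("utf-8", "surrogateescape")
        for name in output.split(b"\0")
        if name  # drop the empty piece after the final NUL
    ]

print(parse_null_separated(b"a b.txt\0dir/c\nd.txt\0"))
```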

Signed-off-by: Raphael Zimmerer <killekulla@rdrz.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-01 09:14:54 -07:00
Junio C Hamano
a4d7d2c6db log --author/--committer: really match only with name part
When we tried to find commits done by AUTHOR, the first implementation
tried to pattern match a line with "^author .*AUTHOR", which later was
enhanced to strip leading caret and look for "^author AUTHOR" when the
search pattern was anchored at the left end (i.e. --author="^AUTHOR").

This had a few problems:

 * When looking for fixed strings (e.g. "git log -F --author=x --grep=y"),
   the regexp internally used "^author .*x" would never match anything;

 * To match at the end (e.g. "git log --author='google.com>$'"), the
   generated regexp has to also match the trailing timestamp part the
   commit header lines have.  Also, in order to determine if the '$' at
   the end means "match at the end of the line" or just a literal dollar
   sign (probably backslash-quoted), we would need to parse the regexp
   ourselves.

An earlier alternative tried to make sure that a line matches "^author "
(to limit by field name) and the user supplied pattern at the same time.
While it solved the -F problem by introducing a special override for
matching the "^author ", it did not solve the trailing timestamp nor tail
match problem.  It also would have matched every commit if --author=author
was asked for, not because the author's email part had this string, but
because every commit header line that talks about the author begins with
that field name, regardless of who wrote it.

Instead of piling more hacks on top of hacks, this rethinks the grep
machinery that is used to look for strings in the commit header, and makes
sure that (1) field name matches literally at the beginning of the line,
followed by a SP, and (2) the user supplied pattern is matched against the
remainder of the line, excluding the trailing timestamp data.
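
The two rules can be sketched in Python as below. This is an illustration under assumptions, not the C code: `match_header` is a hypothetical name, and the trailing timestamp is assumed to be everything after the last ">" on the line.

```python
import re

def match_header(line: str, field: str, pattern: str) -> bool:
    """Match `pattern` against the name/email part of one commit header line."""
    prefix = field + " "
    # (1) the field name must match literally at the start, followed by a SP
    if not line.startswith(prefix):
        return False
    rest = line[len(prefix):]
    # (2) drop the trailing timestamp, assumed to follow the closing ">"
    end = rest.rfind(">")
    if end != -1:
        rest = rest[:end + 1]
    return re.search(pattern, rest) is not None

header = "author Junio C Hamano <gitster@pobox.com> 1220591716 -0700"
print(match_header(header, "author", r"pobox\.com>$"))
print(match_header(header, "author", "author"))
```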

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-04 22:21:56 -07:00
Junio C Hamano
0ab7befa31 grep --all-match
This lets you say:

	git grep --all-match -e A -e B -e C

to find lines that match A or B or C, but only in files that
contain all of A, B and C.

This is different from

	git grep -e A --and -e B --and -e C

in that the latter looks for a single line that has all of these
at the same time.
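
The difference between the two forms can be sketched in Python over the lines of a single file. The names `grep_all_match` and `grep_and` are made up for this illustration.

```python
import re

def grep_all_match(lines, patterns):
    """Lines hitting any pattern, but only if every pattern hits somewhere in the file."""
    file_has_all = all(
        any(re.search(p, line) for line in lines) for p in patterns
    )
    if not file_has_all:
        return []
    return [line for line in lines if any(re.search(p, line) for p in patterns)]

def grep_and(lines, patterns):
    """Lines on which all patterns hit at the same time."""
    return [line for line in lines if all(re.search(p, line) for p in patterns)]

content = ["A here", "B here", "C and A", "nothing"]
print(grep_all_match(content, ["A", "B", "C"]))
print(grep_and(content, ["A", "B", "C"]))
```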

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 23:59:09 -07:00
Junio C Hamano
b48fb5b6a9 grep: free expressions and patterns when done.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 16:27:10 -07:00
Junio C Hamano
480c1ca6fd Update grep internal for grepping only in head/body
This further updates the built-in grep engine so that we can say
something like "this pattern should match only in head".  This
can be used to simplify grepping in the log messages.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 12:39:46 -07:00
Junio C Hamano
83b5d2f5b0 builtin-grep: make pieces of it available as library.
This makes three functions and associated option structures from
builtin-grep available from other parts of the system.

 * options to drive the built-in grep engine are stored in struct
   grep_opt;

 * pattern strings and extended grep expressions are added to
   struct grep_opt with append_grep_pattern();

 * when finished calling append_grep_pattern(), call
   compile_grep_patterns() to prepare for execution;

 * call grep_buffer() to find matches in the in-core buffer.

This also adds an internal option "status_only" to grep_opt,
which suppresses any output from grep_buffer().  Callers of the
function as library can use it to check if there is a match
without producing any output.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 11:14:38 -07:00