I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
quote_path_relative() used to take a counted string as its parameter
(the string to be quoted). With an earlier change, it now uses
relative_path() that does not take a counted string, and we have
been passing only the pointer to the string since then.
Remove the length parameter from quote_path_relative() to show that
this parameter was redundant. All the changed lines show that the
caller passed either -1 (to ask the function run strlen() on the
string), or the length of the string, so the earlier conversion was
safe.
All the callers of quote_path_relative() that used to take counted string
have been audited to make sure that they are passing length of the actual
string (or -1 to ask the callee run strlen())
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make "grep" honor the "--textconv" option also for the object case, i.e.
when used with an argument "rev:path".
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:
ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)
"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping,
even when requested by --textconv.
Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not that we do not actively encourage having annotated tags outside
refs/tags/ hierarchy, but they were not advertised correctly to the
ls-remote and fetch with recent version of Git.
* jk/fully-peeled-packed-ref:
pack-refs: add fully-peeled trait
pack-refs: write peeled entry for non-tags
use parse_object_or_die instead of die("bad object")
avoid segfaults on parse_object failure
Some call-sites do:
o = parse_object(sha1);
if (!o)
die("bad object %s", some_name);
We can now handle that as a one-liner, and get more
consistent output.
In the third case of this patch, it looks like we are losing
information, as the existing message also outputs the sha1
hex; however, parse_object will already have written a more
specific complaint about the sha1, so there is no point in
repeating it here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike other commands that take both revs and pathspecs without "--"
disamiguators only when the boundary is clear, "git grep" treated
what can be interpreted as a rev as-is, without making sure that it
could also have meant a pathspec. E.g.
$ git grep -e foo master
when 'master' is in the working tree, should have triggered an
ambiguity error, but it didn't, and searched in the tree of the
commit named by 'master'.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep -e pattern <tree>" asked the attribute system to read
"<tree>:.gitattributes" file in the working tree, which was
nonsense.
* nd/grep-true-path:
grep: stop looking at random places for .gitattributes
grep searches for .gitattributes using "name" field in struct
grep_source but that field is not real on-disk path name. For example,
"grep pattern rev" fills the field with "rev:path", and Git looks for
.gitattributes in the (non-existent but exploitable) path "rev:path"
instead of "path".
This patch passes real paths down to grep_source_load_driver() when:
- grep on work tree
- grep on the index
- grep a commit (or a tag if it points to a commit)
so that these cases look up .gitattributes at proper paths.
.gitattributes lookup is disabled in all other cases.
Initial-work-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switching between -E/-G/-P/-F correctly needs a lot more than just
flipping opt->regflags bit these days, and we have a nice helper
function buried in builtin/grep.c for the sole use of "git grep".
Extract it so that "log --grep" family can also use it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The configuration handling is a library-ish part of this program,
that is not specific to "git grep" command. It should be reusable
by "log" and others.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grep_config() function takes one instance of grep_opt as its
callback parameter, and populates it by running git_config().
This has three practical implications:
- You have to have an instance of grep_opt already when you call
the configuration, but that is not necessarily always true. You
may be trying to initialize the grep_filter member of rev_info,
but are not ready to call init_revisions() on it yet.
- It is not easy to enhance grep_config() in such a way to make it
cascade to other callback functions to grab other variables in
one call of git_config(); grep_config() can be cascaded into from
other callbacks, but it has to be at the leaf level of a cascade.
- If you ever need to use more than one instance of grep_opt, you
will have to open and read the configuration file(s) every time
you initialize them.
Rearrange the configuration mechanism and model it after how diff
configuration variables are handled. An early call to git_config()
reads and remembers the values taken from the configuration in the
default "template", and a separate call to grep_init() uses this
template to instantiate a grep_opt.
The next step will be to move some of this out of this file so that
the other user of the grep machinery (i.e. "log") can use it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-log-grep-all-match-1:
grep.c: make two symbols really file-scope static this time
t7810-grep: test --all-match with multiple --grep and --author options
t7810-grep: test interaction of multiple --grep and --author options
t7810-grep: test multiple --author with --all-match
t7810-grep: test multiple --grep with and without --all-match
t7810-grep: bring log --grep tests in common form
grep.c: mark private file-scope symbols as static
log: document use of multiple commit limiting options
log --grep/--author: honor --all-match honored for multiple --grep patterns
grep: show --debug output only once
grep: teach --debug option to dump the parse tree
Fix a long-standing bug in "git log --grep" when multiple "--grep"
are used together with "--all-match" and "--author" or "--committer".
* jc/maint-log-grep-all-match:
t7810-grep: test --all-match with multiple --grep and --author options
t7810-grep: test interaction of multiple --grep and --author options
t7810-grep: test multiple --author with --all-match
t7810-grep: test multiple --grep with and without --all-match
t7810-grep: bring log --grep tests in common form
grep.c: mark private file-scope symbols as static
log: document use of multiple commit limiting options
log --grep/--author: honor --all-match honored for multiple --grep patterns
grep: show --debug output only once
grep: teach --debug option to dump the parse tree
When threaded grep is in effect, the patterns are duplicated and
recompiled for each thread. Avoid "--debug" output during the
recompilation so that the output is given once instead of "1+nthreads"
times.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our "grep" allows complex boolean expressions to be formed to match
each individual line with operators like --and, '(', ')' and --not.
Introduce the "--debug" option to show the parse tree to help people
who want to debug and enhance it.
Also "log" learns "--grep-debug" option to do the same. The command
line parser to the log family is a lot more limited than the general
"git grep" parser, but it has special handling for header matching
(e.g. "--author"), and a parse tree is valuable when working on it.
Note that "--all-match" is *not* any individual node in the parse
tree. It is an instruction to the evaluator to check all the nodes
in the top-level backbone have matched and reject a document as
non-matching otherwise.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A lot of i18n mark-up for the help text from "git <cmd> -h".
* nd/i18n-parseopt-help: (66 commits)
Use imperative form in help usage to describe an action
Reduce translations by using same terminologies
i18n: write-tree: mark parseopt strings for translation
i18n: verify-tag: mark parseopt strings for translation
i18n: verify-pack: mark parseopt strings for translation
i18n: update-server-info: mark parseopt strings for translation
i18n: update-ref: mark parseopt strings for translation
i18n: update-index: mark parseopt strings for translation
i18n: tag: mark parseopt strings for translation
i18n: symbolic-ref: mark parseopt strings for translation
i18n: show-ref: mark parseopt strings for translation
i18n: show-branch: mark parseopt strings for translation
i18n: shortlog: mark parseopt strings for translation
i18n: rm: mark parseopt strings for translation
i18n: revert, cherry-pick: mark parseopt strings for translation
i18n: rev-parse: mark parseopt strings for translation
i18n: reset: mark parseopt strings for translation
i18n: rerere: mark parseopt strings for translation
i18n: status: mark parseopt strings for translation
i18n: replace: mark parseopt strings for translation
...
The grep.extendedRegexp configuration setting enables the -E flag on grep
by default but there are no equivalents for the -G, -F and -P flags.
Rather than adding an additional setting for grep.fooRegexp for current
and future pattern matching options, add a grep.patternType setting that
can accept appropriate values for modifying the default grep pattern
matching behavior. The current values are "basic", "extended", "fixed",
"perl" and "default" for setting -G, -E, -F, -P and the default behavior
respectively.
When grep.patternType is set to a value other than "default", the
grep.extendedRegexp setting is ignored. The value of "default" restores
the current default behavior, including the grep.extendedRegexp
behavior.
Signed-off-by: J Smith <dark.panda@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff COPYING HEAD:COPYING" gave a nonsense error message that
claimed that the treeish HEAD did not have COPYING in it.
* mm/verify-filename-fix:
verify_filename(): ask the caller to chose the kind of diagnosis
sha1_name: do not trigger detailed diagnosis for file arguments
verify_filename() can be called in two different contexts. Either we
just tried to interpret a string as an object name, and it fails, so
we try looking for a working tree file (i.e. we finished looking at
revs that come earlier on the command line, and the next argument
must be a pathname), or we _know_ that we are looking for a
pathname, and shouldn't even try interpreting the string as an
object name.
For example, with this change, we get:
$ git log COPYING HEAD:inexistant
fatal: HEAD:inexistant: no such path in the working tree.
Use '-- <path>...' to specify paths that do not exist locally.
$ git log HEAD:inexistant
fatal: Path 'inexistant' does not exist in 'HEAD'
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep -e '$pattern'", unlike the case where the patterns are read from
a file, did not treat individual lines in the given pattern argument as
separate regular expressions as it should.
By René Scharfe
* rs/maint-grep-F:
grep: stop leaking line strings with -f
grep: support newline separated pattern list
grep: factor out do_append_grep_pat()
grep: factor out create_grep_pat()
"git grep -e '$pattern'", unlike the case where the patterns are read from
a file, did not treat individual lines in the given pattern argument as
separate regular expressions as it should.
When reading patterns from a file, we pass the lines as allocated string
buffers to append_grep_pat() and never free them. That's not a problem
because they are needed until the program ends anyway.
However, now that the function duplicates the pattern string, we can
reuse the strbuf after calling that function. This simplifies the code
a bit and plugs a minor memory leak.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PARSE_OPT_NEGHELP is confusing because short options defined with that
flag do the opposite of what the helptext says. It is also not needed
anymore now that options starting with no- can be negated by removing
that prefix. Convert its only two users to OPT_NEGBIT() and OPT_BOOL()
and then remove support for PARSE_OPT_NEGHELP.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/grep-binary-attribute:
grep: pre-load userdiff drivers when threaded
grep: load file data after checking binary-ness
grep: respect diff attributes for binary-ness
grep: cache userdiff_driver in grep_source
grep: drop grep_buffer's "name" parameter
convert git-grep to use grep_source interface
grep: refactor the concept of "grep source" into an object
grep: move sha1-reading mutex into low-level code
grep: make locking flag global
When the userdiff_config function was introduced in be58e70
(diff: unify external diff and funcname parsing code,
2008-10-05), it used a return value convention unlike any
other config callback. Like other callbacks, it used "-1" to
signal error. But it returned "1" to indicate that it found
something, and "0" otherwise; other callbacks simply
returned "0" to indicate that no error occurred.
This distinction was necessary at the time, because the
userdiff namespace overlapped slightly with the color
configuration namespace. So "diff.color.foo" could mean "the
'foo' slot of diff coloring" or "the 'foo' component of the
"color" userdiff driver". Because the color-parsing code
would die on an unknown color slot, we needed the userdiff
code to indicate that it had matched the variable, letting
us bypass the color-parsing code entirely.
Later, in 8b8e862 (ignore unknown color configuration,
2009-12-12), the color-parsing code learned to silently
ignore unknown slots. This means we no longer need to
protect userdiff-matched variables from reaching the
color-parsing code.
We can therefore change the userdiff_config calling
convention to a more normal one. This drops some code from
each caller, which is nice. But more importantly, it reduces
the cognitive load for readers who may wonder why
userdiff_config is unlike every other config callback.
There's no need to add a new test confirming that this
works; t4020 already contains a test that sets
diff.color.external.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level grep_source code will automatically load the
userdiff driver to see whether a file is binary. However,
when we are threaded, it will load the drivers in a
non-deterministic order, handling each one as its assigned
thread happens to be scheduled.
Meanwhile, the attribute lookup code (which underlies the
userdiff driver lookup) is optimized to handle paths in
sequential order (because they tend to share the same
gitattributes files). Multi-threading the lookups destroys
the locality and makes this optimization less effective.
We can fix this by pre-loading the userdiff driver in the
main thread, before we hand off the file to a worker thread.
My best-of-five for "git grep foo" on the linux-2.6
repository went from:
real 0m0.391s
user 0m1.708s
sys 0m0.584s
to:
real 0m0.360s
user 0m1.576s
sys 0m0.572s
Not a huge speedup, but it's quite easy to do. The only
trick is that we shouldn't perform this optimization if "-a"
was used, in which case we won't bother checking whether
the files are binary at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grep_source interface (as opposed to grep_buffer) will
eventually gives us a richer interface for telling the
low-level grep code about our buffers. Eventually this will
lead to things like better binary-file handling. For now, it
lets us drop a lot of now-redundant code.
The conversion is mostly straight-forward. One thing to note
is that the memory ownership rules for "struct grep_source"
are different than the "struct work_item" found here (the
former will copy things like the filename, rather than
taking ownership). Therefore you will also see some slight
tweaking of when filename buffers are released.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-threaded git-grep code needs to serialize access
to the thread-unsafe read_sha1_file call. It does this with
a mutex that is local to builtin/grep.c.
Let's instead push this down into grep.c, where it can be
used by both builtin/grep.c and grep.c. This will let us
safely teach the low-level grep.c code tricks that involve
reading from the object db.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level grep code traditionally didn't care about
threading, as it doesn't do any threading itself and didn't
call out to other non-thread-safe code. That changed with
0579f91 (grep: enable threading with -p and -W using lazy
attribute lookup, 2011-12-12), which pushed the lookup of
funcname attributes (which is not thread-safe) into the
low-level grep code.
As a result, the low-level code learned about a new global
"grep_attr_mutex" to serialize access to the attribute code.
A multi-threaded caller (e.g., builtin/grep.c) is expected
to initialize the mutex and set "use_threads" in the
grep_opt structure. The low-level code only uses the lock if
use_threads is set.
However, putting the use_threads flag into the grep_opt
struct is not the most logical place. Whether threading is
in use is not something that matters for each call to
grep_buffer, but is instead global to the whole program
(i.e., if any thread is doing multi-threaded grep, every
other thread, even if it thinks it is doing its own
single-threaded grep, would need to use the locking). In
practice, this distinction isn't a problem for us, because
the only user of multi-threaded grep is "git-grep", which
does nothing except call grep.
This patch turns the opt->use_threads flag into a global
flag. More important than the nit-picking semantic argument
above is that this means that the locking functions don't
need to actually have access to a grep_opt to know whether
to lock. Which in turn can make adding new locks simpler, as
we don't need to pass around a grep_opt.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In threaded mode, git-grep emits file breaks (enabled with context, -W
and --break) into the accumulation buffers even if they are not
required. The output collection thread then uses skip_first_line to
skip the first such line in the output, which would otherwise be at
the very top.
This is wrong when the user also specified -l/-L/-c, in which case
every line is relevant. While arguably giving these options together
doesn't make any sense, git-grep has always quietly accepted it. So
do not skip anything in these cases.
Signed-off-by: Albert Yale <surfingalbert@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Measurements by various people have shown that grepping in parallel is
not beneficial when the object store is involved. For example, with a
simple regex:
Threads | --cached case | worktree case
----------------------------------------------------------------
8 (default) | 2.88u 0.21s 0:02.94real | 0.19u 0.32s 0:00.16real
4 | 2.89u 0.29s 0:02.99real | 0.16u 0.34s 0:00.17real
2 | 2.83u 0.36s 0:02.87real | 0.18u 0.32s 0:00.26real
NO_PTHREADS | 2.16u 0.08s 0:02.25real | 0.12u 0.17s 0:00.31real
This happens because all the threads contend on read_sha1_mutex almost
all of the time. A more complex regex allows the threads to do more
work in parallel, but as Jeff King found out, the "super boost" (much
higher clock when only one core is active) feature of recent CPUs
still causes the unthreaded case to win by a large margin.
So until the pack machinery allows unthreaded access, we disable
grep's threading in all but the worktree case.
Helped-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lazily load the userdiff attributes in match_funcname(). Use a
separate mutex around this loading to protect the (not thread-safe)
attributes machinery. This lets us re-enable threading with -p and
-W while reducing the overhead caused by looking up attributes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/misc-cleanups:
unpack_object_header_buffer(): clear the size field upon error
tree_entry_interesting: make use of local pointer "item"
tree_entry_interesting(): give meaningful names to return values
read_directory_recursive: reduce one indentation level
get_tree_entry(): do not call find_tree_entry() on an empty tree
tree-walk.c: do not leak internal structure in tree_entry_len()
It is a basic code hygiene to avoid magic constants that are unnamed.
Besides, this helps extending the value later on for "interesting, but
cannot decide if the entry truely matches yet" (ie. prefix matches)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree_entry_len() does not simply take two random arguments and return
a tree length. The two pointers must point to a tree item structure,
or struct name_entry. Passing random pointers will return incorrect
value.
Force callers to pass struct name_entry instead of two pointers (with
hope that they don't manually construct struct name_entry themselves)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather nasty things happen when a mutex is not initialized but locked
nevertheless. Now, when we're not running in a threaded manner, the mutex
is not initialized, which is correct. But then we went and used the mutex
anyway, which -- at least on Windows -- leads to a hard crash (ordinarily
it would be called a segmentation fault, but in Windows speak it is an
access violation).
This problem was identified by our faithful tests when run in the msysGit
environment.
To avoid having to wrap the line due to the 80 column limit, we use
the name "WHEN_THREADED" instead of "IF_USE_THREADS" because it is one
character shorter. Which is all we need in this case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-grep-untracked-exclude:
grep: teach --untracked and --exclude-standard options
grep --no-index: don't use git standard exclusions
grep: do not use --index in the short usage output
Conflicts:
Documentation/git-grep.txt
builtin/grep.c
* jk/color-and-pager:
want_color: automatically fallback to color.ui
diff: don't load color config in plumbing
config: refactor get_colorbool function
color: delay auto-color decision until point of use
git_config_colorbool: refactor stdout_is_tty handling
diff: refactor COLOR_DIFF from a flag into an int
setup_pager: set GIT_PAGER_IN_USE
t7006: use test_config helpers
test-lib: add helper functions for config
t7006: modernize calls to unset
Conflicts:
builtin/commit.c
parse-options.c
All of the "do we want color" flags default to -1 to
indicate that we don't have any color configured. This value
is handled in one of two ways:
1. In porcelain, we check early on whether the value is
still -1 after reading the config, and set it to the
value of color.ui (which defaults to 0).
2. In plumbing, it stays untouched as -1, and want_color
defaults it to off.
This works fine, but means that every porcelain has to check
and reassign its color flag. Now that want_color gives us a
place to put this check in a single spot, we can do that,
simplifying the calling code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually this function figures out for itself whether stdout
is a tty. However, it has an extra parameter just to allow
git-config to override the auto-detection for its
--get-colorbool option.
Instead of an extra parameter, let's just use a global
variable. This makes calling easier in the common case, and
will make refactoring the colorbool code much simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take long option names for -A (--after-context), -B (--before-context)
and -C (--context) from GNU grep and add a similar long option name
for -W (--function-context).
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new option, -W, to show the whole surrounding function of a match.
It uses the same regular expressions as -p and diff to find the beginning
of sections.
Currently it will not display comments in front of a function, but those
that are following one. Despite this shortcoming it is already useful,
e.g. to simply see a more complete applicable context or to extract whole
functions.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --heading, the filename is printed once before matches from that
file instead of at the start of each line, giving more screen space to
the actual search results.
This option is taken from ack (http://betterthangrep.com/). And now
git grep can dress up like it:
$ git config alias.ack "grep --break --heading --line-number"
$ git ack -e --heading
Documentation/git-grep.txt
154:--heading::
t/t7810-grep.sh
785:test_expect_success 'grep --heading' '
786: git grep --heading -e char -e lo_w hello.c hello_world >actual &&
808: git grep --break --heading -n --color \
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --break, an empty line is printed between matches from different
files, increasing readability. This option is taken from ack
(http://betterthangrep.com/).
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 431d6e7b (grep: enable threading for context line printing)
split the printing of the "--\n" mark between results from different
files out into two places: show_line() in grep.c for the non-threaded
case and work_done() in builtin/grep.c for the threaded case. Commit
55f638bd (grep: Colorize filename, line number, and separator) updated
the former, but not the latter, so the separators between files are
not colored if threads are used.
This patch merges the two. In the threaded case, hunk marks are now
printed by show_line() for every file, including the first one, and the
very first mark is simply skipped in work_done(). This ensures that the
output is properly colored and works just as well.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/grep-pcre:
git-grep: Fix problems with recently added tests
git-grep: Update tests (mainly for -P)
Makefile: Pass USE_LIBPCRE down in GIT-BUILD-OPTIONS
git-grep: update tests now regexp type is "last one wins"
git-grep: do not die upon -F/-P when grep.extendedRegexp is set.
git-grep: Bail out when -P is used with -F or -E
grep: Add basic tests
configure: Check for libpcre
git-grep: Learn PCRE
grep: Extract compile_regexp_failed() from compile_regexp()
grep: Fix a typo in a comment
grep: Put calls to fixmatch() and regmatch() into patmatch()
contrib/completion: --line-number to git grep
Documentation: Add --line-number to git-grep synopsis
* jc/magic-pathspec:
setup.c: Fix some "symbol not declared" sparse warnings
t3703: Skip tests using directory name ":" on Windows
revision.c: leave a note for "a lone :" enhancement
t3703, t4208: add test cases for magic pathspec
rev/path disambiguation: further restrict "misspelled index entry" diag
fix overslow :/no-such-string-ever-existed diagnostics
fix overstrict :<path> diagnosis
grep: use get_pathspec() correctly
pathspec: drop "lone : means no pathspec" from get_pathspec()
Revert "magic pathspec: add ":(icase)path" to match case insensitively"
magic pathspec: add ":(icase)path" to match case insensitively
magic pathspec: futureproof shorthand form
magic pathspec: add tentative ":/path/from/top/level" pathspec support
When there is no remaining string in argv, get_pathspec(prefix, argv)
will return a two-element array that has prefix as the first element,
so there is no need to re-roll that logic in the code that uses
get_pathspec().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous one made "git grep -P" fail when grep.extendedRegexp is
enabled. That is a no-starter. The option on the command line should
just make the command ignore the configured default. The handling of "-F"
in the existing code has the same problem.
Instead of saying -G/-F/-E/-P incompatible with each other, just allow
the last one win. That way, you can have "[alias] gr = grep -P" and
use Pcre for everyday work e.g. "git gr ':i?foo'", and append -G to the
aliased command line to override it e.g. "git gr -G '[Ff][Oo][Oo]'".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch makes git-grep die() when -P is used on command line together
with -E/--extended-regexp or -F/--fixed-strings.
This also makes it bail out when grep.extendedRegexp is enabled.
But `git grep -G -P pattern` and `git grep -E -G -P pattern` still work
because -G and -E set opts.regflags during parse_options() and there is
no way to detect `-G` or `-E -G`.
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch teaches git-grep the --perl-regexp/-P options (naming
borrowed from GNU grep) in order to allow specifying PCRE regexes on the
command line.
PCRE has a number of features which make them more handy to use than
POSIX regexes, like consistent escaping rules, extended character
classes, ungreedy matching etc.
git isn't build with PCRE support automatically. USE_LIBPCRE environment
variable must be enabled (like `make USE_LIBPCRE=YesPlease`).
Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/struct-pathspec:
pathspec: rename per-item field has_wildcard to use_wildcard
Improve tree_entry_interesting() handling code
Convert read_tree{,_recursive} to support struct pathspec
Reimplement read_tree_recursive() using tree_entry_interesting()
* load_file() returns a void pointer but is using 0 for the return
value
* builtin/receive-pack.c forgot to include builtin.h
* packet_trace_prefix can be marked static
* ll_merge takes a pointer for its last argument, not an int
* crc32 expects a pointer as the second argument but Z_NULL is defined
to be 0 (see 38f4d13 sparse fix: Using plain integer as NULL pointer,
2006-11-18 for more info)
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add two configration variables grep.extendedRegexp and grep.lineNumbers to
allow the user to skip typing -E and -n on the command line, respectively.
Scripts that are meant to be used by random users and/or in random
repositories now have use -G and/or --no-line-number options as
appropriately to override the settings in the repository or user's
~/.gitconfig settings. Just because the script didn't say "git grep -n" no
longer guarantees that the output from the command will not have line
numbers.
Signed-off-by: Joe Ratterman <jratt0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a synonym for the existing '-n' option, matching GNU grep.
Signed-off-by: Joe Ratterman <jratt0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t_e_i() can return -1 or 2 to early shortcut a search. Current code
may use up to two variables to handle it. One for saving return value
from t_e_i temporarily, one for saving return code 2.
The second variable is not needed. If we make sure the first variable
does not change until the next t_e_i() call, then we can do something
like this:
int ret = 0;
while (...) {
if (ret != 2) {
ret = t_e_i();
if (ret < 0) /* no longer interesting */
break;
if (ret == 0) /* skip this round */
continue;
}
/* ret > 0, interesting */
}
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support the well-know convention of reading standard input instead of a
named file if "-" (dash) is specified. GNU grep does the same.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Prepare draft release notes to 1.7.4.2
gitweb: highlight: replace tabs with spaces
make_absolute_path: return the input path if it points to our buffer
valgrind: ignore SSE-based strlen invalid reads
diff --submodule: split into bite-sized pieces
cherry: split off function to print output lines
branch: split off function that writes tracking info and commit subject
standardize brace placement in struct definitions
compat: make gcc bswap an inline function
enums: omit trailing comma for portability
Conflicts:
RelNotes
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:
struct foo {
int bar;
char *baz;
};
Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.
Linus sayeth:
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gettextize the "--open-files-in-pager only works on the worktree"
message. A test in t7811-grep-open.sh explicitly checked for this
message. Change it to skip under GETTEXT_POISON=YesPlease.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though fill_directory() takes pathspec, the returned set of paths
is not guaranteed to be free of paths outside the pathspec. Perhaps we
would need to change that, but the current API is that the caller needs
to further filter them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All files that include this header file use the same four line
incantation:
#ifndef NO_PTHREADS
#include <pthread.h>
#include "thread-utils.h"
#endif
Move the responsibility for that gymnastics to the header file from the
files that include it. This approach makes it easier to later declare new
services that are related to threading in thread-utils.h and have them
available to all the threading code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allows better help text to be defined than "be quiet". Also make use
of the macro in a place that already had a different description. No
object code changes intended.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/paginate-fix:
t7006 (pager): add missing TTY prerequisites
merge-file: run setup_git_directory_gently() sooner
var: run setup_git_directory_gently() sooner
ls-remote: run setup_git_directory_gently() sooner
index-pack: run setup_git_directory_gently() sooner
config: run setup_git_directory_gently() sooner
bundle: run setup_git_directory_gently() sooner
apply: run setup_git_directory_gently() sooner
grep: run setup_git_directory_gently() sooner
shortlog: run setup_git_directory_gently() sooner
git wrapper: allow setup_git_directory_gently() be called earlier
setup: remember whether repository was found
git wrapper: introduce startup_info struct
Conflicts:
builtin/index-pack.c
git grep already runs a repository search unconditionally,
even when the --no-index option is supplied; running such a
search earlier is not very risky.
Just like with shortlog, without this change, the
“[pager] grep” configuration is not respected at all.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With a .gitconfig like this:
[color]
ui = auto
[color "grep"]
filename = magenta
if stdout is a terminal, the grep machinery will output the color
sequence \e[36m before each filename in its output.
In the case of "git grep -O foo", output is argv for the pager.
Disable color when calling the grep machinery in this case.
Signed-off-by: Nazri Ramliy <ayiehere@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An evil merge to adjust the series to cleaned-up API.
From: Julian Phillips <julian@quantumfyre.co.uk>
Subject: [PATCH v2 7/7] grep: fix string_list_append calls
Date: Sat, 26 Jun 2010 00:41:39 +0100
Message-ID: <20100625234140.18927.35025.julian@quantumfyre.co.uk>
* jp/string-list-api-cleanup:
string_list: Fix argument order for string_list_append
string_list: Fix argument order for string_list_lookup
string_list: Fix argument order for string_list_insert_at_index
string_list: Fix argument order for string_list_insert
string_list: Fix argument order for for_each_string_list
string_list: Fix argument order for print_string_list
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rs/grep-binary:
grep: support NUL chars in search strings for -F
grep: use REG_STARTEND for all matching if available
grep: continue case insensitive fixed string search after NUL chars
grep: use memmem() for fixed string search
grep: --name-only over binary
grep: --count over binary
grep: grep: refactor handling of binary mode options
grep: add test script for binary file handling
Suppose you want to edit all files that contain a specific search term.
Of course, you can do something totally trivial such as
git grep -z -e <term> | xargs -0r vi +/<term>
but maybe you are happy that the same will be achieved by
git grep -Ovi <term>
now.
[jn: rebased and added tests]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Paolo Bonzini <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds an option to open the matching files in the pager, and if the
pager happens to be "less" (or "vi") and there is only one grep pattern,
it also jumps to the first match right away.
The short option was chose as '-O' to avoid clashes with GNU grep's
options (as suggested by Junio).
So, 'git grep -O abc' is a short form for 'less +/abc $(grep -l abc)'
except that it works also with spaces in file names, and it does not
start the pager if there was no matching file.
[jn: rebased and added tests; with error handling fix from Junio
squashed in]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were three awfully similar code paths ending the threaded grep. It
is better to avoid duplicated code, though.
This change might very well prevent a race, where the grep patterns were
free()d before waiting that all threads finished.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify cmd_grep by splitting off the loop that finds matches in a
list of trees. So now the main part of cmd_grep looks like:
if (!use_index) {
int hit = grep_directory(&opt, paths);
if (use_threads)
hit |= wait_all();
return !hit;
}
if (!list.nr) {
if (!cached)
setup_work_tree();
int hit = grep_cache(&opt, paths, cached);
if (use_threads)
hit |= wait_all;
return !hit;
}
hit = grep_objects(&opt, path, &list);
if (use_threads)
hit |= wait_all();
return !hit;
and is ripe for further refactoring.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Search patterns in a file specified with -f can contain NUL characters.
The current code ignores all characters on a line after a NUL.
Pass the actual length of the line all the way from the pattern file to
fixmatch() and use it for case-sensitive fixed string matching.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>