Commit graph

153 commits

Author SHA1 Message Date
brian m. carlson 5ac913c6eb resolve-undo: convert struct resolve_undo_info to object_id
Convert the sha1 member of this struct to be an array of struct
object_id instead.  This change is needed to convert find_unique_abbrev.

Convert some instances of hard-coded constants to use the_hash_algo as
well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:47 -07:00
Junio C Hamano e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Ben Peart d8c71db866 ls-files: Add support in ls-files to display the fsmonitor valid bit
Add a new command line option (-f) to ls-files to have it use lowercase
letters for 'fsmonitor valid' files

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Jeff King 0e5bba53af add UNLEAK annotation for reducing leak false positives
It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int cmd_foo(...)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int cmd_foo(...)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  2. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  3. Likewise, it works even if the variable has _already_
     been freed. We're just copying the pointer bytes. If
     the block has been freed, the leak-checker will skip
     over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.  In
normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-08 15:43:17 +09:00
Junio C Hamano 614ea03a71 Merge branch 'bw/submodule-config-cleanup'
Code clean-up to avoid mixing values read from the .gitmodules file
and values read from the .git/config file.

* bw/submodule-config-cleanup:
  submodule: remove gitmodules_config
  unpack-trees: improve loading of .gitmodules
  submodule-config: lazy-load a repository's .gitmodules file
  submodule-config: move submodule-config functions to submodule-config.c
  submodule-config: remove support for overlaying repository config
  diff: stop allowing diff to have submodules configured in .git/config
  submodule: remove submodule_config callback routine
  unpack-trees: don't respect submodule.update
  submodule: don't rely on overlayed config when setting diffopts
  fetch: don't overlay config with submodule-config
  submodule--helper: don't overlay config in update-clone
  submodule--helper: don't overlay config in remote_submodule_branch
  add, reset: ensure submodules can be added or reset
  submodule: don't use submodule_from_name
  t7411: check configuration parsing errors
2017-08-26 22:55:08 -07:00
Brandon Williams ff6f1f564c submodule-config: lazy-load a repository's .gitmodules file
In order to use the submodule-config subsystem, callers first need to
initialize it by calling 'repo_read_gitmodules()' or
'gitmodules_config()' (which just redirects to
'repo_read_gitmodules()').  There are a couple of callers who need to
load an explicit revision of the repository's .gitmodules file (grep) or
need to modify the .gitmodules file so they would need to load it before
modify the file (checkout), but the majority of callers are simply
reading the .gitmodules file present in the working tree.  For the
common case it would be nice to avoid the boilerplate of initializing
the submodule-config system before using it, so instead let's perform
lazy-loading of the submodule-config system.

Remove the calls to reading the gitmodules file from ls-files to show
that lazy-loading the .gitmodules file works.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-03 13:11:01 -07:00
Brandon Williams 1b796ace7b submodule-config: move submodule-config functions to submodule-config.c
Migrate the functions used to initialize the submodule-config to
submodule-config.c so that the callback routine used in the
initialization process can be static and prevent it from being used
outside of initializing the submodule-config through the main API.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-03 13:11:01 -07:00
René Scharfe 168e63554c ls-files: don't try to prune an empty index
Exit early when asked to prune an index that contains no entries to
begin with.  This avoids pointer arithmetic on istate->cache, which is
possibly NULL in that case.

Found with Clang's UBSan.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:55:21 -07:00
René Scharfe f331ab9d4c use MOVE_ARRAY
Simplify the code for moving members inside of an array and make it more
robust by using the helper macro MOVE_ARRAY.  It calculates the size
based on the specified number of elements for us and supports NULL
pointers when that number is zero.  Raw memmove(3) calls with NULL can
cause the compiler to (over-eagerly) optimize out later NULL checks.

This patch was generated with contrib/coccinelle/array.cocci and spatch
(Coccinelle).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:54:56 -07:00
Brandon Williams 188dce131f ls-files: use repository object
Convert ls-files to use a repository struct and recurse submodules
inprocess.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:24:34 -07:00
Junio C Hamano 25bf951381 Merge branches 'bw/ls-files-sans-the-index' and 'bw/config-h' into bw/repo-object
* bw/ls-files-sans-the-index:
  ls-files: factor out tag calculation
  ls-files: factor out debug info into a function
  ls-files: convert show_files to take an index
  ls-files: convert show_ce_entry to take an index
  ls-files: convert prune_cache to take an index
  ls-files: convert ce_excluded to take an index
  ls-files: convert show_ru_info to take an index
  ls-files: convert show_other_files to take an index
  ls-files: convert show_killed_files to take an index
  ls-files: convert write_eolinfo to take an index
  ls-files: convert overlay_tree_on_cache to take an index
  tree: convert read_tree to take an index parameter
  convert: convert renormalize_buffer to take an index
  convert: convert convert_to_git to take an index
  convert: convert convert_to_git_filter_fd to take an index
  convert: convert crlf_to_git to take an index
  convert: convert get_cached_convert_stats_ascii to take an index

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
  alias: use the early config machinery to expand aliases
  t7006: demonstrate a problem with aliases in subdirectories
  t1308: relax the test verifying that empty alias values are disallowed
  help: use early config when autocorrecting aliases
  config: report correct line number upon error
  discover_git_directory(): avoid setting invalid git_dir
2017-06-23 18:24:00 -07:00
Brandon Williams b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Brandon Williams a84f3e59eb ls-files: factor out tag calculation
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:52 -07:00
Brandon Williams 5306ccf9e9 ls-files: factor out debug info into a function
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:52 -07:00
Brandon Williams f587c8dcde ls-files: convert show_files to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:52 -07:00
Brandon Williams ff020a8ab0 ls-files: convert show_ce_entry to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 6510ae173a ls-files: convert prune_cache to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 1d35e3bf05 ls-files: convert ce_excluded to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 2d407e2da1 ls-files: convert show_ru_info to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 23d6846b23 ls-files: convert show_other_files to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 23d6236a07 ls-files: convert show_killed_files to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 1985fd68c6 ls-files: convert write_eolinfo to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 312c984a02 ls-files: convert overlay_tree_on_cache to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams 85ab50f938 tree: convert read_tree to take an index parameter
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Brandon Williams a7609c54b3 convert: convert get_cached_convert_stats_ascii to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Junio C Hamano 3c5a78280f Merge branch 'bw/pathspec-sans-the-index'
Simplify parse_pathspec() codepath and stop it from looking at the
default in-core index.

* bw/pathspec-sans-the-index:
  pathspec: convert find_pathspecs_matching_against_index to take an index
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP
  ls-files: prevent prune_cache from overeagerly pruning submodules
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag
  submodule: add die_in_unpopulated_submodule function
  pathspec: provide a more descriptive die message
2017-05-30 11:16:40 +09:00
Junio C Hamano 6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
Brandon Williams cbca060e10 ls-files: prevent prune_cache from overeagerly pruning submodules
Since (ae8d08242 pathspec: pass directory indicator to
match_pathspec_item()) the path matching logic has been able to cope
with submodules without needing to strip off a trailing slash if a path
refers to a submodule.

ls-files is the only caller of 'parse_pathspec()' which relies on the
behavior of the PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag because it
uses the result to construct a common prefix of all provided pathspecs
which is then used to prune the index of all entries which don't have
that prefix.  Since submodules entries in the index don't have a
trailing slash 'prune_cache()' will be overeager and prune a submodule
'sub' if the common prefix is 'sub/'.  To correct this behavior, only
prune entries which don't match up to, but not including, a trailing
slash of the common prefix.

This is in preparation to remove the
PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP flag in a later patch.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-12 14:23:46 +09:00
brian m. carlson a9dbc17910 tree: convert parse_tree_indirect to struct object_id
Convert parse_tree_indirect to take a pointer to struct object_id.
Update all the callers.  This transformation was achieved using the
following semantic patch and manual updates to the declaration and
definition.  Update builtin/checkout.c manually as well, since it uses a
ternary expression not handled by the semantic patch.

@@
expression E1;
@@
- parse_tree_indirect(E1.hash)
+ parse_tree_indirect(&E1)

@@
expression E1;
@@
- parse_tree_indirect(E1->hash)
+ parse_tree_indirect(E1)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson 6f37eb7d85 builtin/ls-files: convert overlay_tree_on_cache to object_id
This is another caller of parse_tree_indirect.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
Brandon Williams 0d32c183b6 dir: convert fill_directory to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams a0bba65b10 dir: convert is_excluded to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Jacob Keller 2cfe66a8ee ls-files: fix path used when recursing into submodules
Don't assume that the current working directory is the root of the
repository. Correctly generate the path for the recursing child
processes by building it from the work_tree() root instead. Otherwise if
we run ls-files using --git-dir or --work-tree it will not work
correctly as it attempts to change directory into a potentially invalid
location. Best case, it doesn't exist and we produce an error. Worst
case we cd into the wrong location and unknown behavior occurs.

Add a new test which highlights this possibility.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 18:01:41 -07:00
Jacob Keller 2e5d6503bd ls-files: fix recurse-submodules with nested submodules
Since commit e77aa336f1 ("ls-files: optionally recurse into
submodules", 2016-10-07) ls-files has known how to recurse into
submodules when displaying files.

Unfortunately this fails for certain cases, including when nesting more
than one submodule, called from within a submodule that itself has
submodules, or when the GIT_DIR environemnt variable is set.

Prior to commit b58a68c1c1 ("setup: allow for prefix to be passed to
git commands", 2017-03-17) this resulted in an error indicating that
--prefix and --super-prefix were incompatible.

After this commit, instead, the process loops forever with a GIT_DIR set
to the parent and continuously reads the parent submodule files and
recursing forever.

Fix this by preparing the environment properly for submodules when
setting up the child process. This is similar to how other commands such
as grep behave.

This was not caught by the original tests because the scenario is
avoided if the submodules are created separately and not stored as the
standard method of putting the submodule git directory under
.git/modules/<name>. We can update the test to show the failure by the
addition of "git submodule absorbgitdirs" to the test case. However,
note that this new test would run forever without the necessary fix in
this patch.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-17 19:04:16 -07:00
Brandon Williams b2dfeb7c00 ls-files: fix bug when recursing with relative pathspec
When using the --recurse-submodules flag with a relative pathspec which
includes "..", an error is produced inside the child process spawned for a
submodule.  When creating the pathspec struct in the child, the ".." is
interpreted to mean "go up a directory" which causes an error stating that the
path ".." is outside of the repository.

While it is true that ".." is outside the scope of the submodule, it is
confusing to a user who originally invoked the command where ".." was indeed
still inside the scope of the superproject.  Since the child process launched
for the submodule has some context that it is operating underneath a
superproject, this error could be avoided.

This patch fixes the bug by passing the 'prefix' to the child process.  Now
each child process that works on a submodule has two points of reference to the
superproject: (1) the 'super_prefix' which is the path from the root of the
superproject down to root of the submodule and (2) the 'prefix' which is the
path from the root of the superproject down to the directory where the user
invoked the git command.

With these two pieces of information a child process can correctly interpret
the pathspecs provided by the user as well as being able to properly format its
output relative to the directory the user invoked the original command from.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 11:54:50 -07:00
Brandon Williams e4770f67d1 ls-files: fix typo in variable name
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 11:54:50 -07:00
René Scharfe 96f6d3f61a ls-files: move only kept cache entries in prune_cache()
prune_cache() first identifies those entries at the start of the sorted
array that can be discarded.  Then it moves the rest of the entries up.
Last it identifies the unwanted trailing entries among the moved ones
and cuts them off.

Change the order: Identify both start *and* end of the range to keep
first and then move only those entries to the top.  The resulting code
is slightly shorter and a bit more efficient.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-13 12:06:10 -08:00
René Scharfe 7b4158a8d8 ls-files: pass prefix length explicitly to prune_cache()
The function prune_cache() relies on the fact that it is only called on
max_prefix and sneakily uses the matching global variable max_prefix_len
directly.  Tighten its interface by passing both the string and its
length as parameters.  While at it move the NULL check into the function
to collect all cache-pruning related logic in one place.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-13 12:06:08 -08:00
Junio C Hamano 1c2b1f7018 Merge branch 'bw/ls-files-recurse-submodules'
"git ls-files" learned "--recurse-submodules" option that can be
used to get a listing of tracked files across submodules (i.e. this
only works with "--cached" option, not for listing untracked or
ignored files).  This would be a useful tool to sit on the upstream
side of a pipe that is read with xargs to work on all working tree
files from the top-level superproject.

* bw/ls-files-recurse-submodules:
  ls-files: add pathspec matching for submodules
  ls-files: pass through safe options for --recurse-submodules
  ls-files: optionally recurse into submodules
  git: make super-prefix option
2016-10-26 13:14:44 -07:00
Brandon Williams 75a6315f74 ls-files: add pathspec matching for submodules
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule.  The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally.  If any wildcard character is encountered we'll simply punt
and produce a false positive match.  More accurate matching will be done
once inside the submodule.  This is due to the superproject not knowing
what files could exist in the submodule.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
Brandon Williams 07c01b9fd9 ls-files: pass through safe options for --recurse-submodules
Pass through some known-safe options when recursing into submodules.
(--cached, -v, -t, -z, --debug, --eol)

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
Brandon Williams e77aa336f1 ls-files: optionally recurse into submodules
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules.  This is done by forking off a
process to recursively call ls-files on all submodules. Use top-level
--super-prefix option to pass a path to the submodule which it can
use to prepend to output or pathspec matching logic.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
brian m. carlson 99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Johannes Schindelin ef1177d18e die("bug"): report bugs consistently
The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".

As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Junio C Hamano 722c924445 Merge branch 'jk/options-cleanup'
Various clean-ups to the command line option parsing.

* jk/options-cleanup:
  apply, ls-files: simplify "-z" parsing
  checkout-index: disallow "--no-stage" option
  checkout-index: handle "--no-index" option
  checkout-index: handle "--no-prefix" option
  checkout-index: simplify "-z" option parsing
  give "nbuf" strbuf a more meaningful name
2016-02-10 14:20:08 -08:00
Jeff King 1f3c79a9d6 apply, ls-files: simplify "-z" parsing
As a short option, we cannot handle negation. Thus a callback
handling "unset" is overkill, and we can just use OPT_SET_INT
instead to handle setting the option.

Anybody who adds "--nul" synonym to this later would need to be
careful not to break "--no-nul", which should mean that lines are
terminated with LF at the end.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-01 14:14:20 -08:00
Torsten Bögershausen a7630bd427 ls-files: add eol diagnostics
When working in a cross-platform environment, a user may want to
check if text files are stored normalized in the repository and
if .gitattributes are set appropriately.

Make it possible to let Git show the line endings in the index and
in the working tree and the effective text/eol attributes.

The end of line ("eolinfo") are shown like this:

    "-text"        binary (or with bare CR) file
    "none"         text file without any EOL
    "lf"           text file with LF
    "crlf"         text file with CRLF
    "mixed"        text file with mixed line endings.

The effective text/eol attribute is one of these:

    "", "-text", "text", "text=auto", "text eol=lf", "text eol=crlf"

git ls-files --eol gives an output like this:

    i/none   w/none   attr/text=auto      t/t5100/empty
    i/-text  w/-text  attr/-text          t/test-binary-2.png
    i/lf     w/lf     attr/text eol=lf    t/t5100/rfc2047-info-0007
    i/lf     w/crlf   attr/text eol=crlf  doit.bat
    i/mixed  w/mixed  attr/               locale/XX.po

to show what eol convention is used in the data in the index ('i'),
and in the working tree ('w'), and what attribute is in effect,
for each path that is shown.

Add test cases in t0027.

Helped-By: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-18 19:48:43 -08:00
Junio C Hamano 8b54c23437 ps_matched: xcalloc() takes nmemb and then element size
Even though multiplication is commutative, the order of arguments
should be xcalloc(nmemb, size).  ps_matched is an array of 1-byte
element whose size is the same as the number of pathspec elements.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-20 09:57:38 -07:00
Junio C Hamano 574ee8ae86 Merge branch 'jc/report-path-error-to-dir'
Code clean-up.

* jc/report-path-error-to-dir:
  report_path_error(): move to dir.c
2015-03-26 11:57:13 -07:00
Junio C Hamano 777c55a616 report_path_error(): move to dir.c
The expected call sequence is for the caller to use match_pathspec()
repeatedly on a set of pathspecs, accumulating the "hits" in a
separate array, and then call this function to diagnose a pathspec
that never matched anything, as that can indicate a typo from the
command line, e.g. "git commit Maekfile".

Many builtin commands use this function from builtin/ls-files.c,
which is not a very healthy arrangement.  ls-files might have been
the first command to feel the need for such a helper, but the need
is shared by everybody who uses the "match and then report" pattern.

Move it to dir.c where match_pathspec() is defined.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-24 14:12:10 -07:00
Alex Henrie 9c9b4f2f8b standardize usage info string format
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14 09:32:04 -08:00
Alex Henrie ad5fe3771b grammofix in user-facing messages
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-02 12:00:30 -07:00
Nguyễn Thái Ngọc Duy ae8d082421 pathspec: pass directory indicator to match_pathspec_item()
This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which
makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce
the same output. Previously only the version without trailing slash
returns the difference (if any).

That's the effect of new ce_path_match(). dir_path_match() is not
executed by the new tests. And it should not introduce regressions.

Previously if path "dir/" is passed in with pathspec "dir/", they
obviously match. With new dir_path_match(), the path becomes
_directory_ "dir" vs pathspec "dir/", which is not executed by the old
code path in m_p_i(). The new code path is executed and produces the
same result.

The other case is pathspec "dir" and path "dir/" is now turned to
"dir" (with DO_MATCH_DIRECTORY). Still the same result before or after
the patch.

So why change? Because of the next patch about clean.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy 854b09592c pathspec: rename match_pathspec_depth() to match_pathspec()
A long time ago, for some reason I was not happy with
match_pathspec(). I created a better version, match_pathspec_depth()
that was suppose to replace match_pathspec()
eventually. match_pathspec() has finally been gone since 6 months
ago. Use the shorter name for match_pathspec_depth().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:14 -08:00
Nguyễn Thái Ngọc Duy ebb32893ba pathspec: convert some match_pathspec_depth() to dir_path_match()
This helps reduce the number of match_pathspec_depth() call sites and
show how m_p_d() is used. And it usage is:

 - match against an index entry (ce_path_match or match_pathspec_depth
   in ls-files)

 - match against a dir_entry from read_directory (dir_path_match and
   match_pathspec_depth in clean.c, which will be converted later)

 - resolve-undo (rerere.c and ls-files.c)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:09 -08:00
Junio C Hamano 4c4d9d9b65 Merge branch 'jc/ls-files-killed-optim'
"git ls-files -k" needs to crawl only the part of the working tree
that may overlap the paths in the index to find killed files, but
shared code with the logic to find all the untracked files, which
made it unnecessarily inefficient.

* jc/ls-files-killed-optim:
  dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage
  t3010: update to demonstrate "ls-files -k" optimization pitfalls
  ls-files -k: a directory only can be killed if the index has a non-directory
  dir.c: use the cache_* macro to access the current index
2013-09-11 15:03:28 -07:00
Junio C Hamano b02f5aeda6 Merge branch 'jl/submodule-mv'
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-09 14:36:15 -07:00
Junio C Hamano 2eac2a4cc4 ls-files -k: a directory only can be killed if the index has a non-directory
"ls-files -o" and "ls-files -k" both traverse the working tree down
to find either all untracked paths or those that will be "killed"
(removed from the working tree to make room) when the paths recorded
in the index are checked out.  It is necessary to traverse the
working tree fully when enumerating all the "other" paths, but when
we are only interested in "killed" paths, we can take advantage of
the fact that paths that do not overlap with entries in the index
can never be killed.

The treat_one_path() helper function, which is called during the
recursive traversal, is the ideal place to implement an
optimization.

When we are looking at a directory P in the working tree, there are
three cases:

 (1) P exists in the index.  Everything inside the directory P in
     the working tree needs to go when P is checked out from the
     index.

 (2) P does not exist in the index, but there is P/Q in the index.
     We know P will stay a directory when we check out the contents
     of the index, but we do not know yet if there is a directory
     P/Q in the working tree to be killed, so we need to recurse.

 (3) P does not exist in the index, and there is no P/Q in the index
     to require P to be a directory, either.  Only in this case, we
     know that everything inside P will not be killed without
     recursing.

Note that this helper is called by treat_leading_path() that decides
if we need to traverse only subdirectories of a single common
leading directory, which is essential for this optimization to be
correct.  This caller checks each level of the leading path
component from shallower directory to deeper ones, and that is what
allows us to only check if the path appears in the index.  If the
call to treat_one_path() weren't there, given a path P/Q/R, the real
traversal may start from directory P/Q/R, even when the index
records P as a regular file, and we would end up having to check if
any leading subpath in P/Q/R, e.g. P, appears in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-15 13:50:34 -07:00
Stefan Beller d5d09d4754 Replace deprecated OPT_BOOLEAN by OPT_BOOL
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.

This patch does not change semantics of any command intentionally.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 11:32:19 -07:00
Junio C Hamano 988f98f61f Merge branch 'jx/clean-interactive'
Add "interactive" mode to "git clean".

The early part to refactor relative path related helper functions
looked sensible.

* jx/clean-interactive:
  test: run testcases with POSIX absolute paths on Windows
  test: add t7301 for git-clean--interactive
  git-clean: add documentation for interactive git-clean
  git-clean: add ask each interactive action
  git-clean: add select by numbers interactive action
  git-clean: add filter by pattern interactive action
  git-clean: use a git-add-interactive compatible UI
  git-clean: add colors to interactive git-clean
  git-clean: show items of del_list in columns
  git-clean: add support for -i/--interactive
  git-clean: refactor git-clean into two phases
  write_name{_quoted_relative,}(): remove redundant parameters
  quote_path_relative(): remove redundant parameter
  quote.c: substitute path_relative with relative_path
  path.c: refactor relative_path(), not only strip prefix
  test: add test cases for relative_path
2013-07-22 11:24:11 -07:00
Nguyễn Thái Ngọc Duy 9a08727443 remove init_pathspec() in favor of parse_pathspec()
While at there, move free_pathspec() to pathspec.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 827f4d6c21 convert common_prefix() to use struct pathspec
The code now takes advantage of nowildcard_len field.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 7327d3d1b7 convert {read,fill}_directory to take struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:08 -07:00
Nguyễn Thái Ngọc Duy 17ddc66e70 convert report_path_error to take struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:08 -07:00
Nguyễn Thái Ngọc Duy 9e06d6ed76 ls-files: convert to use parse_pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:07 -07:00
Nguyễn Thái Ngọc Duy 64acde94ef move struct pathspec and related functions to pathspec.[ch]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano 079424a2cf Merge branch 'mh/ref-races'
"git pack-refs" that races with new ref creation or deletion have
been susceptible to lossage of refs under right conditions, which
has been tightened up.

* mh/ref-races:
  for_each_ref: load all loose refs before packed refs
  get_packed_ref_cache: reload packed-refs file when it changes
  add a stat_validity struct
  Extract a struct stat_data from cache_entry
  packed_ref_cache: increment refcount when locked
  do_for_each_entry(): increment the packed refs cache refcount
  refs: manage lifetime of packed refs cache via reference counting
  refs: implement simple transactions for the packed-refs file
  refs: wrap the packed refs cache in a level of indirection
  pack_refs(): split creation of packed refs and entry writing
  repack_without_ref(): split list curation and entry writing
2013-06-30 15:40:05 -07:00
Jiang Xin e9a820cefd write_name{_quoted_relative,}(): remove redundant parameters
After substitute path_relative() in quote.c with relative_path()
from path.c, parameters (such as len and prefix_len) are redundant
in function write_name() and write_name_quoted_relative().  The
callers have already been audited that the strings they pass are
properly NUL terminated and the length they give are the length of
the string (or -1 that asks the length to be counted by the callee).

Remove these now-redundant parameters.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-26 11:22:06 -07:00
Jiang Xin 39598f9983 quote_path_relative(): remove redundant parameter
quote_path_relative() used to take a counted string as its parameter
(the string to be quoted).  With an earlier change, it now uses
relative_path() that does not take a counted string, and we have
been passing only the pointer to the string since then.

Remove the length parameter from quote_path_relative() to show that
this parameter was redundant.  All the changed lines show that the
caller passed either -1 (to ask the function run strlen() on the
string), or the length of the string, so the earlier conversion was
safe.

All the callers of quote_path_relative() that used to take counted string
have been audited to make sure that they are passing length of the actual
string (or -1 to ask the callee run strlen())

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-26 11:16:48 -07:00
Jiang Xin ad66df2df1 quote.c: substitute path_relative with relative_path
Substitute the function path_relative in quote.c with the function
relative_path. Function relative_path can be treated as an enhanced
and more robust version of path_relative.

Outputs of path_relative and it's replacement (relative_path) are the
same for the following cases:

    path      prefix     output of path_relative  output of relative_path
    ========  =========  =======================  =======================
    /a/b/c/   /a/b/      c/                       c/
    /a/b/c    /a/b/      c                        c
    /a/       /a/b/      ../                      ../
    /         /a/b/      ../../                   ../../
    /a/c      /a/b/      ../c                     ../c
    /x/y      /a/b/      ../../x/y                ../../x/y
    a/b/c/    a/b/       c/                       c/
    a/        a/b/       ../                      ../
    x/y       a/b/       ../../x/y                ../../x/y
    /a/b      (empty)    /a/b                     /a/b
    /a/b      (null)     /a/b                     /a/b
    a/b       (empty)    a/b                      a/b
    a/b       (null)     a/b                      a/b

But if both of the path and the prefix are the same, or the returned
relative path should be the current directory, the outputs of both
functions are different. Function relative_path returns "./", while
function path_relative returns empty string.

    path      prefix     output of path_relative  output of relative_path
    ========  =========  =======================  =======================
    /a/b/     /a/b/      (empty)                  ./
    a/b/      a/b/       (empty)                  ./
    (empty)   (null)     (empty)                  ./
    (empty)   (empty)    (empty)                  ./

But the callers of path_relative can handle such cases, or never
encounter this issue at all, because:

 * In function quote_path_relative, if the output of path_relative is
   empty, append "./" to it, like:

       if (!out->len)
           strbuf_addstr(out, "./");

 * Another caller is write_name_quoted_relative, which is only used
   by builtin/ls-files.c. git-ls-files only show files, so path of
   files will never be identical with the prefix of a directory.

The following differences show that path_relative does not handle
extra slashes properly:

    path      prefix     output of path_relative  output of relative_path
    ========  =========  =======================  =======================
    /a//b//c/ //a/b//    ../../../../a//b//c/     c/
    a/b//c    a//b       ../b//c                  c

And if prefix has no trailing slash, path_relative does not work
properly either.  But since prefix always has a trailing slash, it's
not a problem.

    path      prefix     output of path_relative  output of relative_path
    ========  =========  =======================  =======================
    /a/b/c/   /a/b       b/c/                     c/
    /a/b      /a/b       b                        ./
    /a/b/     /a/b       b/                       ./
    /a        /a/b/      ../../a                  ../
    a/b/c/    a/b        b/c/                     c/
    a/b/      a/b        b/                       ./
    a         a/b        ../a                     ../
    x/y       a/b/       ../x/y                   ../../x/y
    a/c       a/b        c                        ../c
    /a/       /a/b       (empty)                  ../
    (empty)   /a/b       ../../                   ./

One tricky part in this conversion is write_name() function in
ls-files.c.  It takes a counted string, <name, len>, that is to be
made relative to <prefix, prefix_len> and then quoted.  Because
write_name_quoted_relative() still takes these two parameters as
counted string, but ignores the count and treat these two as
NUL-terminated strings, this conversion needs to be audited for its
callers:

 - For <name, len>, all three callers of write_name() passes a
   NUL-terminated string and its true length, so this patch makes
   "len" unused.

 - For <prefix, prefix_len>, prefix could be a string that is longer
   than empty while prefix_len could be 0 when "--full-name" option
   is used.  This is fixed by checking prefix_len in write_name()
   and calling write_name_quoted_relative() with NULL when
   prefix_len is set to 0.  Again, this makes "prefix_len" given to
   write_name_quoted_relative() unused, without introducing a bug.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-26 11:13:50 -07:00
Michael Haggerty c21d39d7c7 Extract a struct stat_data from cache_entry
Add public functions fill_stat_data() and match_stat_data() to work
with it.  This infrastructure will later be used to check the validity
of other types of file.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-20 15:50:17 -07:00
René Scharfe 0b437a18bd use logical OR (||) instead of binary OR (|) in logical context
The compiler can short-circuit the evaluation of conditions strung
together with logical OR operators instead of computing the resulting
bitmask with binary ORs.  More importantly, this patch makes the
intent of the changed code clearer, because the logical context (as
opposed to binary context) becomes immediately obvious.

While we're at it, simplify the check for patch->is_rename in
builtin/apply.c a bit; it can only be 0 or 1, so we don't need a
comparison operator.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-13 14:47:07 -07:00
Karsten Blees b07bc8c8c3 dir.c: replace is_path_excluded with now equivalent is_excluded API
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:01 -07:00
Junio C Hamano a39b15b4f6 Merge branch 'as/check-ignore'
Add a new command "git check-ignore" for debugging .gitignore
files.

The variable names may want to get cleaned up but that can be done
in-tree.

* as/check-ignore:
  clean.c, ls-files.c: respect encapsulation of exclude_list_groups
  t0008: avoid brace expansion
  add git-check-ignore sub-command
  setup.c: document get_pathspec()
  add.c: extract new die_if_path_beyond_symlink() for reuse
  add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
  pathspec.c: rename newly public functions for clarity
  add.c: move pathspec matchers into new pathspec.c for reuse
  add.c: remove unused argument from validate_pathspec()
  dir.c: improve docs for match_pathspec() and match_pathspec_depth()
  dir.c: provide clear_directory() for reclaiming dir_struct memory
  dir.c: keep track of where patterns came from
  dir.c: use a single struct exclude_list per source of excludes

Conflicts:
	builtin/ls-files.c
	dir.c
2013-01-23 21:19:10 -08:00
Adam Spiers 72aeb18772 clean.c, ls-files.c: respect encapsulation of exclude_list_groups
Consumers of the dir.c traversal API should avoid assuming knowledge
of the internal implementation of exclude_list_groups.  Therefore
when adding items to an exclude list, it should be accessed via the
pointer returned from add_exclude_list(), rather than by referencing
a location within dir.exclude_list_groups[EXC_CMDL].

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 09:43:35 -08:00
Junio C Hamano d912b0e44f Merge branch 'as/dir-c-cleanup'
Refactor and generally clean up the directory traversal API
implementation.

* as/dir-c-cleanup:
  dir.c: rename free_excludes() to clear_exclude_list()
  dir.c: refactor is_path_excluded()
  dir.c: refactor is_excluded()
  dir.c: refactor is_excluded_from_list()
  dir.c: rename excluded() to is_excluded()
  dir.c: rename excluded_from_list() to is_excluded_from_list()
  dir.c: rename path_excluded() to is_path_excluded()
  dir.c: rename cryptic 'which' variable to more consistent name
  Improve documentation and comments regarding directory traversal API
  api-directory-listing.txt: update to match code
2013-01-10 13:47:25 -08:00
Adam Spiers c04318e46a dir.c: keep track of where patterns came from
For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).

For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1.  So for example the 2nd pattern provided via
--exclude would be numbered -2.  This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:26:37 -08:00
Adam Spiers c082df2453 dir.c: use a single struct exclude_list per source of excludes
Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:25:06 -08:00
Adam Spiers 9013089c4a dir.c: rename path_excluded() to is_path_excluded()
Start adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was agreed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:45 -08:00
Nguyễn Thái Ngọc Duy 170260ae90 pathspec: save the non-wildcard length part
We mark pathspec with wildcards with the field use_wildcard. We
could do better by saving the length of the non-wildcard part, which
can be used for optimizations such as f9f6e2c (exclude: do strcmp as
much as possible before fnmatch - 2012-06-07).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-19 13:08:28 -08:00
Nguyễn Thái Ngọc Duy 377adc3aaf i18n: ls-files: mark parseopt strings for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-20 12:23:18 -07:00
Junio C Hamano 782cd4c0f6 path_excluded(): update API to less cache-entry centric
It was stupid of me to make the API too much cache-entry specific;
the caller may want to check arbitrary pathname without having a
corresponding cache-entry to see if a path is ignored.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-05 21:22:36 -07:00
Junio C Hamano eb41775ecc ls-files -i: pay attention to exclusion of leading paths
"git ls-files --exclude=t/ -i" does not show paths in directory t/
that have been added to the index, but it should.

The excluded() API was designed for callers who walk the tree from
the top, checking each level of the directory hierarchy as it
descends if it is excluded, and not even bothering to recurse into
an excluded directory.  This would allow us optimize for a common
case by not having to check if the exclude pattern "foo/" matches
when looking at "foo/bar", because the caller should have noticed
that "foo" is excluded and did not even bother to read "foo/bar"
out of opendir()/readdir() to call it.

The code for "ls-files -i" however walks the index linearly, feeding
paths without checking if the leading directory is already excluded.

Introduce a helper function path_excluded() to let this caller
properly call excluded() check for higher hierarchies as necessary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-03 16:05:42 -07:00
Junio C Hamano ca3ef81ad7 Merge branch 'cb/common-prefix-unification'
* cb/common-prefix-unification:
  rename pathspec_prefix() to common_prefix() and move to dir.[ch]
  consolidate pathspec_prefix and common_prefix
  remove prefix argument from pathspec_prefix
2011-10-10 15:56:17 -07:00
Clemens Buchacher f950eb9560 rename pathspec_prefix() to common_prefix() and move to dir.[ch]
Also make common_prefix_len() static as this refactoring makes dir.c
itself the only caller of this helper function.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-12 14:38:32 -07:00
Clemens Buchacher 5879f5684c remove prefix argument from pathspec_prefix
Passing a prefix to a function that is supposed to find the prefix is
strange. And it's really only used if the pathspec is NULL. Make the
callers handle this case instead.

As we are always returning a fresh copy of a string (or NULL), change the
type of the returned value to non-const "char *".

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-06 12:50:10 -07:00
Junio C Hamano 6133e4da54 Merge branch 'cb/maint-ls-files-error-report'
* cb/maint-ls-files-error-report:
  ls-files: fix pathspec display on error
2011-08-23 15:34:31 -07:00
Clemens Buchacher 0f64bfa956 ls-files: fix pathspec display on error
The following sequence of commands reveals an issue with error
reporting of relative paths:

 $ mkdir sub
 $ cd sub
 $ git ls-files --error-unmatch ../bbbbb
 error: pathspec 'b' did not match any file(s) known to git.
 $ git commit --error-unmatch ../bbbbb
 error: pathspec 'b' did not match any file(s) known to git.

This bug is visible only if the normalized path (i.e., the relative
path from the repository root) is longer than the prefix.
Otherwise, the code skips over the normalized path and reads from
an unused memory location which still contains a leftover of the
original command line argument.

So instead, use the existing facilities to deal with relative paths
correctly.

Also fix inconsistency between "checkout" and "commit", e.g.

    $ cd Documentation
    $ git checkout nosuch.txt
    error: pathspec 'Documentation/nosuch.txt' did not match...
    $ git commit nosuch.txt
    error: pathspec 'nosuch.txt' did not match...

by propagating the prefix down the codepath that reports the error.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-11 13:04:16 -07:00
Clemens Buchacher 8894d53580 commit: allow partial commits with relative paths
In order to do partial commits, git-commit overlays a tree on the
cache and checks pathspecs against the result. Currently, the
overlaying is done using "prefix" which prevents relative pathspecs
with ".." and absolute pathspec from matching when they refer to
files not under "prefix" and absent from the index, but still in
the tree (i.e.  files staged for removal).

The point of providing a prefix at all is performance optimization.
If we say there is no common prefix for the files of interest, then
we have to read the entire tree into the index.

But even if we cannot use the working directory as a prefix, we can
still figure out if there is a common prefix for all given paths,
and use that instead. The pathspec_prefix() routine from ls-files.c
does exactly that.

Any use of global variables is removed from pathspec_prefix() so
that it can be called from commit.c.

Reported-by: Reuben Thomas <rrt@sc3d.org>
Analyzed-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-02 14:20:35 -07:00
Junio C Hamano 33e0f62ba9 pathspec: rename per-item field has_wildcard to use_wildcard
As the point of the last change is to allow use of strings as
literals no matter what characters are in them, "has_wildcard"
does not match what we use this field for anymore.

It is used to decide if the wildcard matching should be used, so
rename it to match the usage better.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-05 09:30:36 -07:00
Nguyễn Thái Ngọc Duy f0096c06bc Convert read_tree{,_recursive} to support struct pathspec
This patch changes behavior of the two functions. Previously it does
prefix matching only. Now it can also do wildcard matching.

All callers are updated. Some gain wildcard matching (archive,
checkout), others reset pathspec_item.has_wildcard to retain old
behavior (ls-files, ls-tree as they are plumbing).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-25 09:20:33 -07:00
Junio C Hamano 6758af89e4 Merge branch 'jn/git-cmd-h-bypass-setup'
* jn/git-cmd-h-bypass-setup:
  update-index -h: show usage even with corrupt index
  merge -h: show usage even with corrupt index
  ls-files -h: show usage even with corrupt index
  gc -h: show usage even with broken configuration
  commit/status -h: show usage even with broken configuration
  checkout-index -h: show usage even in an invalid repository
  branch -h: show usage even in an invalid repository

Conflicts:
	builtin/merge.c
2010-12-12 21:49:50 -08:00
Nguyễn Thái Ngọc Duy cbb3167ef8 ls-files -h: show usage even with corrupt index
Part of a campaign to avoid git <command> -h being distracted by
access to the repository.  A caller hoping to use "git ls-files"
with an alternate index as part of a repair operation may well use
"git ls-files -h" to show usage while planning it out.

[jn: with rewritten log message and tests]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-22 11:04:53 -07:00
Štěpán Němec 0adda9362a Use parentheses and `...' where appropriate
Remove some stray usage of other bracket types and asterisks for the
same purpose.

Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:31:07 -07:00
Junio C Hamano c7e375de42 Merge branch 'ar/string-list-foreach'
* ar/string-list-foreach:
  Convert the users of for_each_string_list to for_each_string_list_item macro
  Add a for_each_string_list_item macro
2010-08-18 12:14:38 -07:00
Thomas Rast 8497421715 ls-files: learn a debugging dump format
Teach git-ls-files a new option --debug that just tacks all available
data from the cache onto each file's line.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 13:16:16 -07:00
Alex Riesen 8a57c6e943 Convert the users of for_each_string_list to for_each_string_list_item macro
The rule for selecting the candidates for conversion is: if the callback
function returns only 0 (the condition for for_each_string_list to exit
early), than it can be safely converted to the macro.

A notable exception are the callers in builtin/remote.c. If converted, the
readability in the file will suffer greately. Besides, the code is not very
performance critical (at the moment, at least): it does output formatting of
the list of remotes.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-05 11:44:35 -07:00
Junio C Hamano a53deac89e Merge branch 'jp/string-list-api-cleanup'
* jp/string-list-api-cleanup:
  string_list: Fix argument order for string_list_append
  string_list: Fix argument order for string_list_lookup
  string_list: Fix argument order for string_list_insert_at_index
  string_list: Fix argument order for string_list_insert
  string_list: Fix argument order for for_each_string_list
  string_list: Fix argument order for print_string_list
2010-06-30 11:55:38 -07:00
Julian Phillips b684e97736 string_list: Fix argument order for for_each_string_list
Update the definition and callers of for_each_string_list to use the
string_list as the first argument.  This helps make the string_list
API easier to use by being more consistent.

Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-27 10:06:51 -07:00