435 commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy 76e6b090a0 untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
This can be used to double-check whether results with the untracked cache
are correct, compared to the vanilla version. The untracked cache remains in
the index, but is not used.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy 1bbb3dba3f untracked cache: mark index dirty if untracked cache is updated
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy c9ccb5d327 untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
This could be used to verify correct behavior in tests

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy ed4efab1b1 untracked cache: avoid racy timestamps
When a directory is updated within the same second that its timestamp
was last saved, we cannot detect the update by checking timestamps.
Assume the worst (something was updated). See
29e4d36 (Racy GIT - 2005-12-20) for more information.
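
The rule above can be sketched in Python as follows. This is only an illustration under the assumption of whole-second timestamps; the function name `dir_cache_usable` and its parameters are hypothetical and do not appear in git.

```python
def dir_cache_usable(cached_mtime: int, current_mtime: int, cache_saved_at: int) -> bool:
    """Return True only if the cached listing of a directory can be trusted.

    All arguments are whole-second timestamps (an assumption of this sketch).
    """
    if current_mtime != cached_mtime:
        # The directory changed after its mtime was recorded.
        return False
    if cached_mtime >= cache_saved_at:
        # Racy: the directory was modified in the same second the cache was
        # saved, so a later change could leave the mtime untouched.
        return False
    return True
```

Treating the racy case as invalid costs one extra directory scan, but never returns stale results.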

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy e931371a8f untracked cache: invalidate at index addition or removal
Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update the untracked cache
right away instead of invalidating it and waiting for the next
read_directory() to deal with it. But that may need some more work in
unpack-trees.c. So keep it simple as a first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy f9e6c64958 untracked cache: load from UNTR index extension
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy 83c094ad0d untracked cache: save to an index extension
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy 27b099ae87 untracked cache: don't open non-existent .gitignore
This cuts down a significant number of open(.gitignore) because most
directories usually don't have .gitignore files.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy 26cb0182b8 untracked cache: mark what dirs should be recursed/saved
If we redo this thing in a functional style, we would have one struct
untracked_dir as input tree and another as output. The input is used
for verification. The output is a brand new tree, reflecting current
worktree.

But that means recreating a lot of dir nodes even if a lot could be
shared between input and output trees in good cases. So we go with the
messy but efficient way, combining both input and output trees into
one. We need a way to know which node in this combined tree belongs to
the output. This is the purpose of this "recurse" flag.

"valid" bit can't be used for this because it's about data of the node
except the subdirs. When we invalidate a directory, we want to keep
cached data of the subdirs intact even though we don't really know
which subdirs still exist (yet). Then we check the worktree to see which
subdirs actually remain on disk. Those will have the 'recurse' bit set
again. If cached data for those is still valid, we may be able to
avoid computing exclude files for them. Subdirs that have been deleted
will have 'recurse' remain clear, and their 'valid' bits do not
matter.
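
A rough Python sketch of how the two bits could interact in the combined tree. The class `UntrackedDir` and the helper `mark_existing_subdirs` are hypothetical names chosen for illustration; the real structure lives in git's C code and differs in detail.

```python
from dataclasses import dataclass, field


@dataclass
class UntrackedDir:
    name: str
    valid: bool = False      # cached data of this node (not its subdirs) is usable
    recurse: bool = False    # node belongs to the "output" tree of the current run
    untracked: list = field(default_factory=list)
    dirs: dict = field(default_factory=dict)


def mark_existing_subdirs(node, subdirs_on_disk):
    """Set 'recurse' on subdirs that still exist; leave the rest clear."""
    for child in node.dirs.values():
        child.recurse = False
    for name in subdirs_on_disk:
        child = node.dirs.get(name)
        if child is None:
            child = UntrackedDir(name)   # new directory: nothing cached yet
            node.dirs[name] = child
        child.recurse = True             # 'valid' is deliberately left untouched
    return [c for c in node.dirs.values() if c.recurse]
```

Nodes for deleted subdirs stay in the tree with 'recurse' clear, so they are simply not part of the output.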

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy 91a2288b5f untracked cache: record/validate dir mtime and reuse cached output
The main readdir loop in read_directory_recursive() is replaced with a
new one that checks if cached results of a directory are still valid.

If a file is added or removed from the index, the containing directory
is invalidated (but not its subdirs). If a directory's mtime changes,
the same happens. If a .gitignore is updated, the containing directory
and all subdirs are invalidated recursively. If dir_struct#flags or
other conditions change, the cache is ignored.

If a directory is invalidated, we opendir/readdir/closedir and run the
exclude machinery on that directory listing as usual. If untracked
cache is also enabled, we'll update the cache along the way. If a
directory is validated, we simply pull the untracked listing out from
the cache. The cache also records the list of direct subdirs that we
have to recurse in. Fully excluded directories are seen as "untracked
files".

In the best case when no dirs are invalidated, read_directory()
becomes a series of

  stat(dir), open(.gitignore), fstat(), read(), close() and optionally
  hash_sha1_file()

For comparison, standard read_directory() is a sequence of

  opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the
  expensive last_exclude_matching() and closedir().

We already try not to open(.gitignore) if we know it does not exist,
so open/fstat/read/close sequence does not apply to every
directory. The sequence could be reduced further, as noted in
prep_exclude() in another patch. So in theory, the entire best-case
read_directory sequence could be reduced to a series of stat() and
nothing else.
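
The validated/invalidated split described above can be sketched like this. The function `list_untracked` and the callables `stat_mtime` and `scan` are hypothetical stand-ins for stat() and for the opendir/readdir/exclude pass; this is not how dir.c is structured.

```python
def list_untracked(path, cache, stat_mtime, scan):
    """Return untracked entries of 'path', reusing 'cache' when the mtime matches.

    cache maps path -> (mtime, untracked list).
    """
    mtime = stat_mtime(path)
    entry = cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]            # validated: no readdir, no exclude matching
    untracked = scan(path)         # invalidated: do the expensive work
    cache[path] = (mtime, untracked)
    return untracked
```

In the best case only `stat_mtime` is called per directory, which mirrors the "series of stat()" goal stated above.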

This is not a silver bullet approach. When you compile a C file, for
example, the old .o file is removed and a new one with the same name
created, effectively invalidating the containing directory's cache
(but not its subdirectories). If your build process touches every
directory, this cache adds extra overhead for nothing, so it's a good
idea to separate generated files from tracked files. Editors may use
the same strategy for saving files. And of course you're out of luck
running your repo on an unsupported filesystem and/or operating system.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy cf7c61484f untracked cache: make a wrapper around {open,read,close}dir()
This allows us to feed different info to read_directory_recursive()
based on untracked cache in the next patch.

Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy 5ebf79ad4b untracked cache: invalidate dirs recursively if .gitignore changes
It's easy to see that if an existing .gitignore changes, its SHA-1
would be different and invalidate_gitignore() is called.

If .gitignore is removed, add_excludes() will treat it like an empty
.gitignore, which again should invalidate the cached directory data.

If .gitignore is added, lookup_untracked() already fills initial
.gitignore SHA-1 as "empty file", so again invalidate_gitignore() is
called.
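
The three cases reduce to one comparison if a missing .gitignore is hashed as an empty file. A small Python sketch, assuming the hash is computed the way git hashes a blob; the names `blob_sha1` and `gitignore_changed` are hypothetical.

```python
import hashlib


def blob_sha1(content: bytes) -> str:
    """SHA-1 of content hashed the way git hashes a blob object."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# A missing .gitignore is treated like an empty one.
EMPTY_SHA1 = blob_sha1(b"")


def gitignore_changed(cached_sha1, content):
    """content is the file's bytes, or None if .gitignore does not exist."""
    current = blob_sha1(content if content is not None else b"")
    return current != cached_sha1
```

Whenever `gitignore_changed` returns True, the directory and all of its subdirs would have to be invalidated.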

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy ccad261f07 untracked cache: initial untracked cache validation
Make sure the starting conditions and all global exclude files are
good to go. If not, either disable untracked cache completely, or wipe
out the cache and start fresh.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy 0dcb8d7fe0 untracked cache: record .gitignore information and dir hierarchy
The idea is if we can capture all input and (non-recursive) output of
read_directory_recursive(), and can verify later that all the input is
the same, then the second r_d_r() should produce the same output as in
the first run.

The requirement for this to work is stat info of a directory MUST
change if an entry is added to or removed from that directory (and
should not change often otherwise). If your OS and filesystem do not
meet this requirement, untracked cache is not for you. Most file
systems on *nix should be fine. On Windows, NTFS is fine while FAT may
not be [1] even though FAT on Linux seems to be fine.

The list of input of r_d_r() is in the big comment block in dir.h. In
short, the output of a directory (not counting subdirs) mainly depends
on stat info of the directory in question, all .gitignore leading to
it and the check_only flag when r_d_r() is called recursively. This
patch records all this info (and the output) as r_d_r() runs.

Two hash_sha1_file() are required for $GIT_DIR/info/exclude and
core.excludesfile unless their stat data matches. hash_sha1_file() is
only needed when .gitignore files in the worktree are modified,
otherwise their SHA-1 in index is used (see the previous patch).

We could store stat data for .gitignore files so we don't have to
rehash them if their content is different from the index, but I think
.gitignore files are rarely modified, so it is not worth the extra cache
data (and the hashing penalty in read-cache.c:verify_hdr(), as we will be
storing this as an index extension).

The implication is, if you change .gitignore, you had better add it to the
index soon or you lose all the benefit of untracked cache because a
modified .gitignore invalidates all subdirs recursively. This is
especially bad for .gitignore at root.

This cached output is about untracked files only, not ignored files,
because the number of untracked files is usually small, so the cache
overhead is small, while the number of ignored files could go really high
(e.g. *.o files mixing with source code).

[1] "Description of NTFS date and time stamps for files and folders"
    http://support.microsoft.com/kb/299648

Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: David Turner <dturner@twopensource.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:14 -07:00
Nguyễn Thái Ngọc Duy 55fe6f51f4 dir.c: optionally compute sha-1 of a .gitignore file
This is not used anywhere yet. But the goal is to compare quickly if a
.gitignore file has changed when we have the SHA-1 of both old (cached
somewhere) and new (from index or a tree) versions.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:08 -07:00
Junio C Hamano 1758d236a2 Merge branch 'nd/dir-prep-exclude-cleanup'
Code clean-up.

* nd/dir-prep-exclude-cleanup:
  dir.c: remove the second declaration of "stk" in prep_exclude()
2014-10-24 15:00:05 -07:00
Nguyễn Thái Ngọc Duy 03e11a715b dir.c: remove the second declaration of "stk" in prep_exclude()
This "stk" shadows the first declaration at the top. There's currently
no bad effect. But let's avoid it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-21 11:22:00 -07:00
Junio C Hamano f655651e09 Merge branch 'rs/strbuf-getcwd'
Reduce the use of fixed sized buffer passed to getcwd() calls
by introducing xgetcwd() helper.

* rs/strbuf-getcwd:
  use strbuf_add_absolute_path() to add absolute paths
  abspath: convert absolute_path() to strbuf
  use xgetcwd() to set $GIT_DIR
  use xgetcwd() to get the current directory or die
  wrapper: add xgetcwd()
  abspath: convert real_path_internal() to strbuf
  abspath: use strbuf_getcwd() to remember original working directory
  setup: convert setup_git_directory_gently_1 et al. to strbuf
  unix-sockets: use strbuf_getcwd()
  strbuf: add strbuf_getcwd()
2014-09-02 13:28:44 -07:00
René Scharfe 56b9f6e738 use xgetcwd() to get the current directory or die
Convert several calls of getcwd() and die() to use xgetcwd() instead.
This way we get rid of fixed-size buffers (which can be too small
depending on the used file system) and gain consistent error messages.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-26 11:06:06 -07:00
Nguyễn Thái Ngọc Duy aceb9429b3 prep_exclude: remove the artificial PATH_MAX limit
This fixes a segfault in git-status with long paths on Windows,
where PATH_MAX is only 260.

This also fixes the problem of silently ignoring .gitignore if the
full path exceeds PATH_MAX. Now add_excludes_from_file() will report
if it gets ENAMETOOLONG.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14 15:24:34 -07:00
Nguyễn Thái Ngọc Duy d961baa846 dir.c: coding style fix
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14 15:24:34 -07:00
Jeremiah Mahler ccdd4a0f3c cleanup duplicate name_compare() functions
We often represent our strings as a counted string, i.e. a pair of
the pointer to the beginning of the string and its length, and the
string may not be NUL terminated to that length.

To compare a pair of such counted strings, unpack-trees.c and
read-cache.c implement their own name_compare() functions
identically.  In addition, the cache_name_compare() function in
read-cache.c is nearly identical.  The only difference is when one
string is the prefix of the other string, in which case
name_compare() returns -1/+1 to show which one is longer, and
cache_name_compare() returns the difference of the lengths to show
the same information.

Unify these three functions by using the implementation from
cache_name_compare().  This does not make any difference to the
existing and future callers, as they must be paying attention only
to the sign of the returned value (and not the magnitude) because
the original implementations of these two functions return values
returned by memcmp(3) when the one string is not a prefix of the
other string, and the only thing memcmp(3) guarantees its callers is
the sign of the returned value, not the magnitude.
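
A Python sketch of the unified comparison, for illustration only. It follows the description above (length difference when one string is a prefix of the other); `bytes` objects stand in for counted strings, and callers should look only at the sign of the result.

```python
def name_compare(name1: bytes, name2: bytes) -> int:
    """Compare two counted strings; only the sign of the result is meaningful."""
    n = min(len(name1), len(name2))
    a, b = name1[:n], name2[:n]
    if a != b:
        # Byte-wise comparison over the common length, like memcmp(3).
        return -1 if a < b else 1
    # One is a prefix of the other (or they are equal): the longer sorts later.
    return len(name1) - len(name2)
```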

Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:12:14 -07:00
Pasha Bolokhov e61a6c1d82 dir.c:trim_trailing_spaces(): fix for " \ " sequence
Discard the unnecessary 'nr_spaces' variable, remove 'strlen()' and
improve the 'if' structure.  Switch to pointers instead of integers
to control the loop.

Rarer occurrences of 'text  \    ' with a backslash
in between spaces are now handled correctly.  Namely, the code in
7e2e4b37 (dir: ignore trailing spaces in exclude patterns, 2014-02-09)
does not reset 'last_space' when a backslash is encountered and the above
line stays intact as a result.
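
The intended behavior can be sketched in Python as below. This is an illustration of the rule (unescaped trailing spaces are dropped, and a backslash resets the tracking of the last run of spaces), not a transcription of the C function.

```python
def trim_trailing_spaces(pattern: str) -> str:
    """Drop unescaped trailing spaces from an exclude pattern."""
    last_space = None   # index where the current run of spaces started
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == " ":
            if last_space is None:
                last_space = i
        else:
            if ch == "\\":
                i += 1                  # skip the escaped character
                if i >= len(pattern):
                    return pattern      # lone trailing backslash: leave as-is
            last_space = None           # any non-space (or escape) resets the run
        i += 1
    return pattern if last_space is None else pattern[:last_space]
```

For the input 'text  \    ' the escaped space right after the backslash is kept and only the three spaces after it are removed.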

Add a test at the end of t/t0008-ignores.sh to exhibit this behavior.

Signed-off-by: Pasha Bolokhov <pasha.bolokhov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-02 15:48:48 -07:00
Junio C Hamano 8ba87adad6 Merge branch 'cb/aix'
* cb/aix:
  tests: don't rely on strerror text when testing rmdir failure
  dir.c: make git_fnmatch() not inline
2014-04-03 12:38:38 -07:00
Charles Bailey 1f26ce615a dir.c: make git_fnmatch() not inline
Now that it calls a static inline function, it cannot be an inline
definition with external linkage. Remove inline and make it an
external definition.

Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 11:50:15 -07:00
Junio C Hamano fe9122a352 Merge branch 'dd/use-alloc-grow'
Replace open-coded reallocation with ALLOC_GROW() macro.

* dd/use-alloc-grow:
  sha1_file.c: use ALLOC_GROW() in pretend_sha1_file()
  read-cache.c: use ALLOC_GROW() in add_index_entry()
  builtin/mktree.c: use ALLOC_GROW() in append_to_tree()
  attr.c: use ALLOC_GROW() in handle_attr_line()
  dir.c: use ALLOC_GROW() in create_simplify()
  reflog-walk.c: use ALLOC_GROW()
  replace_object.c: use ALLOC_GROW() in register_replace_object()
  patch-ids.c: use ALLOC_GROW() in add_commit()
  diffcore-rename.c: use ALLOC_GROW()
  diff.c: use ALLOC_GROW()
  commit.c: use ALLOC_GROW() in register_commit_graft()
  cache-tree.c: use ALLOC_GROW() in find_subtree()
  bundle.c: use ALLOC_GROW() in add_to_ref_list()
  builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-18 13:50:21 -07:00
Junio C Hamano 650c90a185 Merge branch 'nd/no-more-fnmatch'
We started using wildmatch() in place of fnmatch(3); complete the
process and stop using fnmatch(3).

* nd/no-more-fnmatch:
  actually remove compat fnmatch source code
  stop using fnmatch (either native or compat)
  Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch"
  use wildmatch() directly without fnmatch() wrapper
2014-03-14 14:25:31 -07:00
Junio C Hamano dfcd354cdf Merge branch 'nd/gitignore-trailing-whitespace'
Trailing whitespaces in .gitignore files, unless they are quoted for
fnmatch(3), e.g. "path\ ", are warned about and ignored.

Strictly speaking, this is a backward incompatible change, but very
unlikely to bite any sane user and adjusting should be obvious and
easy.

* nd/gitignore-trailing-whitespace:
  t0008: skip trailing space test on Windows
  dir: ignore trailing spaces in exclude patterns
  dir: warn about trailing spaces in exclude patterns
2014-03-14 14:23:37 -07:00
Dmitry S. Dolzhenko 9af49f822b dir.c: use ALLOC_GROW() in create_simplify()
Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 14:54:29 -08:00
Nguyễn Thái Ngọc Duy ae8d082421 pathspec: pass directory indicator to match_pathspec_item()
This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which
makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce
the same output. Previously only the version without trailing slash
returns the difference (if any).

That's the effect of new ce_path_match(). dir_path_match() is not
executed by the new tests. And it should not introduce regressions.

Previously if path "dir/" is passed in with pathspec "dir/", they
obviously match. With new dir_path_match(), the path becomes
_directory_ "dir" vs pathspec "dir/", which is not executed by the old
code path in m_p_i(). The new code path is executed and produces the
same result.

The other case is pathspec "dir" with path "dir/", which is now turned
into "dir" (with DO_MATCH_DIRECTORY). The result is still the same before
and after the patch.

So why change? Because of the next patch about clean.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy 68690fdd0b match_pathspec: match pathspec "foo/" against directory "foo"
Currently we do not support matching pathspec "foo/" against directory
"foo". That is because match_pathspec() has no way to tell that "foo" is a
directory, and matching "foo/" against _file_ "foo" is wrong.

Now that callers can tell match_pathspec() whether "foo" is a directory, we
can make an exception for this case. The code is not executed yet, though,
because no callers pass the flag.
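
A simplified Python sketch of the exception, restricted to literal (non-glob) pathspecs. The function `match_literal_pathspec` and its `is_dir` parameter are hypothetical; `is_dir` plays the role of the new flag callers can pass.

```python
def match_literal_pathspec(pathspec: str, name: str, is_dir: bool) -> bool:
    """Literal match of 'name' against 'pathspec'."""
    if pathspec == name:
        return True
    if name.startswith(pathspec) and (
        pathspec.endswith("/") or name[len(pathspec)] == "/"
    ):
        return True   # pathspec names a leading directory of 'name'
    # The exception: "foo/" matches "foo" only when "foo" is known to be a
    # directory; matching it against a file "foo" would be wrong.
    if is_dir and pathspec.endswith("/") and pathspec[:-1] == name:
        return True
    return False
```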

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy 42b0874a7e dir.c: prepare match_pathspec_item for taking more flags
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy 854b09592c pathspec: rename match_pathspec_depth() to match_pathspec()
A long time ago, for some reason I was not happy with
match_pathspec(). I created a better version, match_pathspec_depth(),
that was supposed to replace match_pathspec()
eventually. match_pathspec() has finally been gone for 6 months
now. Use the shorter name for match_pathspec_depth().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:14 -08:00
Nguyễn Thái Ngọc Duy eb07894fe0 use wildmatch() directly without fnmatch() wrapper
Make it clear that we don't use fnmatch() anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-20 14:15:46 -08:00
Nguyễn Thái Ngọc Duy 7e2e4b37d3 dir: ignore trailing spaces in exclude patterns
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 11:49:53 -08:00
Nguyễn Thái Ngọc Duy 16402b992e dir: warn about trailing spaces in exclude patterns
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 11:49:53 -08:00
Junio C Hamano d0956cfa8e Merge branch 'mh/safe-create-leading-directories'
Code clean-up and protection against concurrent write access to the
ref namespace.

* mh/safe-create-leading-directories:
  rename_tmp_log(): on SCLD_VANISHED, retry
  rename_tmp_log(): limit the number of remove_empty_directories() attempts
  rename_tmp_log(): handle a possible mkdir/rmdir race
  rename_ref(): extract function rename_tmp_log()
  remove_dir_recurse(): handle disappearing files and directories
  remove_dir_recurse(): tighten condition for removing unreadable dir
  lock_ref_sha1_basic(): if locking fails with ENOENT, retry
  lock_ref_sha1_basic(): on SCLD_VANISHED, retry
  safe_create_leading_directories(): add new error value SCLD_VANISHED
  cmd_init_db(): when creating directories, handle errors conservatively
  safe_create_leading_directories(): introduce enum for return values
  safe_create_leading_directories(): always restore slash at end of loop
  safe_create_leading_directories(): split on first of multiple slashes
  safe_create_leading_directories(): rename local variable
  safe_create_leading_directories(): add explicit "slash" pointer
  safe_create_leading_directories(): reduce scope of local variable
  safe_create_leading_directories(): fix format of "if" chaining
2014-01-27 10:45:33 -08:00
Michael Haggerty 863808cd1a remove_dir_recurse(): handle disappearing files and directories
If a file or directory that we are trying to remove disappears (e.g.,
because another process has pruned it), do not consider it an error.

However, if REMOVE_DIR_KEEP_TOPLEVEL is set, and the toplevel
directory is missing, then consider it an error (like before).
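
A Python sketch of the described policy, using only standard `os` calls. The function below borrows the name `remove_dir_recurse` for readability, and `keep_toplevel` stands in for REMOVE_DIR_KEEP_TOPLEVEL; it is an illustration, not a port of the C code.

```python
import os


def remove_dir_recurse(path: str, keep_toplevel: bool = False) -> bool:
    """Remove 'path' recursively; return True on success.

    Entries that vanish while we work are not errors, except that a missing
    top-level directory is an error when keep_toplevel is set.
    """
    try:
        entries = os.listdir(path)
    except FileNotFoundError:
        return not keep_toplevel
    except OSError:
        return False
    for name in entries:
        child = os.path.join(path, name)
        try:
            if os.path.isdir(child) and not os.path.islink(child):
                if not remove_dir_recurse(child):
                    return False
            else:
                os.unlink(child)
        except FileNotFoundError:
            continue   # another process removed it first; that is fine
        except OSError:
            return False
    if keep_toplevel:
        return True
    try:
        os.rmdir(path)
    except FileNotFoundError:
        pass           # the directory itself disappeared: also fine
    except OSError:
        return False
    return True
```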

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-21 13:46:47 -08:00
Michael Haggerty ecb2c282c0 remove_dir_recurse(): tighten condition for removing unreadable dir
If opendir() fails on the top-level directory, it makes sense to try
to delete it anyway--but only if the failure was due to EACCES.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-21 13:46:32 -08:00
Nguyễn Thái Ngọc Duy ef79b1f870 Support pathspec magic :(exclude) and its short form :!
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 13:00:39 -08:00
Eric Sunshine de372b1b46 dir: revert work-around for retired dangerous behavior
directory_exists_in_index_icase() dangerously assumed that it could
access one character beyond the end of its directory argument, and that
that character would unconditionally be '/'.  2eac2a4c (ls-files -k: a
directory only can be killed if the index has a non-directory,
2013-08-15) added a caller which did not respect this undocumented
assumption, and 680be044 (dir.c::test_one_path(): work around
directory_exists_in_index_icase() breakage, 2013-08-23) added a
work-around which temporarily appends a '/' before invoking
directory_exists_in_index_icase().

Since the dangerous behavior of directory_exists_in_index_icase() has
been eliminated, the work-around is now redundant, so retire it (but not
the tests added by the same commit).

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17 10:08:27 -07:00
Eric Sunshine d28eec2673 name-hash: stop storing trailing '/' on paths in index_state.dir_hash
When 5102c617 (Add case insensitivity support for directories when using
git status, 2010-10-03) added directories to the name-hash there was
only a single hash table in which both real cache entries and leading
directory prefixes were registered. To distinguish between the two types
of entries, directories were stored with a trailing '/'.

2092678c (name-hash.c: fix endless loop with core.ignorecase=true,
2013-02-28), however, moved directories to a separate hash table
(index_state.dir_hash) but retained the (now) redundant trailing '/',
thus callers continue to bear the burden of ensuring the slash's
presence before searching the index for a directory. Eliminate this
redundancy by storing paths in the dir-hash without the trailing '/'.
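
As an illustration, a directory table keyed without the trailing '/' could look like this in Python. Both functions are hypothetical stand-ins (the second merely reuses the name index_dir_exists), and lowercasing is an assumed simplification of case-insensitive lookup.

```python
def build_dir_hash(index_paths):
    """Collect every leading directory of the indexed paths, lowercased and
    stored without a trailing '/'."""
    dir_hash = set()
    for path in index_paths:
        parts = path.lower().split("/")[:-1]
        for depth in range(1, len(parts) + 1):
            dir_hash.add("/".join(parts[:depth]))
    return dir_hash


def index_dir_exists(dir_hash, name):
    # Callers pass "dir", not "dir/": nothing beyond the name is consulted.
    return name.lower() in dir_hash
```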

An important benefit of this change is that it eliminates undocumented
and dangerous behavior of dir.c:directory_exists_in_index_icase() in
which it assumes not only that it can validly access one character
beyond the end of its incoming directory argument, but also that that
character will unconditionally be a '/'. This perilous behavior was
"tolerated" because the string passed in by its lone caller always had a
'/' in that position, however, things broke [1] when 2eac2a4c (ls-files
-k: a directory only can be killed if the index has a non-directory,
2013-08-15) added a new caller which failed to respect the undocumented
assumption.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/232727

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17 10:08:07 -07:00
Eric Sunshine ebbd7439b1 employ new explicit "exists in index?" API
Each caller of index_name_exists() knows whether it is looking for a
directory or a file, and can avoid the unnecessary indirection of
index_name_exists() by instead calling index_dir_exists() or
index_file_exists() directly.

Invoking the appropriate search function explicitly will allow a
subsequent patch to relieve callers of the artificial burden of having
to add a trailing '/' to the pathname given to index_dir_exists().

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17 10:07:37 -07:00
Junio C Hamano 4c4d9d9b65 Merge branch 'jc/ls-files-killed-optim'
"git ls-files -k" needs to crawl only the part of the working tree
that may overlap the paths in the index to find killed files, but
shared code with the logic to find all the untracked files, which
made it unnecessarily inefficient.

* jc/ls-files-killed-optim:
  dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage
  t3010: update to demonstrate "ls-files -k" optimization pitfalls
  ls-files -k: a directory only can be killed if the index has a non-directory
  dir.c: use the cache_* macro to access the current index
2013-09-11 15:03:28 -07:00
Junio C Hamano b02f5aeda6 Merge branch 'jl/submodule-mv'
"git mv A B" when moving a submodule A does "the right thing",
including relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-09 14:36:15 -07:00
Eric Sunshine 680be044d9 dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage
directory_exists_in_index() takes pathname and its length, but its
helper function directory_exists_in_index_icase() reads one byte
beyond the end of the pathname and expects there to be a '/'.

This needs to be fixed, as that one-byte-beyond-the-end location may
not even be readable, possibly by not registering directories to
name hashes with trailing slashes.  In the meantime, update the new
caller added recently to treat_one_path() to make sure that the path
buffer it gives the function is one byte longer than the path it is
asking the function about by appending a slash to it.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-23 16:26:59 -07:00
Junio C Hamano 2eac2a4cc4 ls-files -k: a directory only can be killed if the index has a non-directory
"ls-files -o" and "ls-files -k" both traverse the working tree down
to find either all untracked paths or those that will be "killed"
(removed from the working tree to make room) when the paths recorded
in the index are checked out.  It is necessary to traverse the
working tree fully when enumerating all the "other" paths, but when
we are only interested in "killed" paths, we can take advantage of
the fact that paths that do not overlap with entries in the index
can never be killed.

The treat_one_path() helper function, which is called during the
recursive traversal, is the ideal place to implement an
optimization.

When we are looking at a directory P in the working tree, there are
three cases:

 (1) P exists in the index.  Everything inside the directory P in
     the working tree needs to go when P is checked out from the
     index.

 (2) P does not exist in the index, but there is P/Q in the index.
     We know P will stay a directory when we check out the contents
     of the index, but we do not know yet if there is a directory
     P/Q in the working tree to be killed, so we need to recurse.

 (3) P does not exist in the index, and there is no P/Q in the index
     to require P to be a directory, either.  Only in this case, we
     know that everything inside P will not be killed without
     recursing.
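
The three cases above can be sketched as a small decision function. The name `killed_action`, the string results, and the use of a plain set for the index are assumptions made for illustration.

```python
def killed_action(directory: str, index_paths: set) -> str:
    """Decide what to do with working-tree directory P when looking for killed paths."""
    if directory in index_paths:
        return "kill-all"     # (1) P itself is in the index
    prefix = directory + "/"
    if any(p.startswith(prefix) for p in index_paths):
        return "recurse"      # (2) some P/Q is in the index
    return "skip"             # (3) nothing under P can be killed
```

Only case (3) lets the traversal skip a directory without descending into it, which is where the saving comes from.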

Note that this helper is called by treat_leading_path() that decides
if we need to traverse only subdirectories of a single common
leading directory, which is essential for this optimization to be
correct.  This caller checks each level of the leading path
component from shallower directory to deeper ones, and that is what
allows us to only check if the path appears in the index.  If the
call to treat_one_path() weren't there, given a path P/Q/R, the real
traversal may start from directory P/Q/R, even when the index
records P as a regular file, and we would end up having to check if
any leading subpath in P/Q/R, e.g. P, appears in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-15 13:50:34 -07:00
Junio C Hamano 7126102742 dir.c: use the cache_* macro to access the current index
These codepaths always start from the_index and use index_*
functions, but there is no reason to do so.  Use the compatibility
cache_* macro to access the current in-core index like everybody
else.

While at it, fix typo in the comment for a function to check if a
path within a directory appears in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-15 12:08:45 -07:00
Junio C Hamano d3aeb31dc4 Merge branch 'nd/const-struct-cache-entry'
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
2013-07-22 11:24:01 -07:00
Nguyễn Thái Ngọc Duy 93d9353716 parse_pathspec: accept :(icase)path syntax
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 12:14:38 -07:00
Nguyễn Thái Ngọc Duy bd30c2e484 pathspec: support :(glob) syntax
:(glob)path differs from plain pathspec in that it uses wildmatch with
WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The
difference lies in how '*' (and '**') is processed.

With the introduction of :(glob) and :(literal) and their global
options --[no]glob-pathspecs, the user can:

 - make everything literal by default via --noglob-pathspecs
   (--literal-pathspecs cannot be used for this purpose, as it
   disables _all_ pathspec magic)

 - individually turn on globbing with :(glob)

 - make everything globbing by default via --glob-pathspecs

 - individually turn off globbing with :(literal)

The implication is that there is no way to request the default
matching behavior (i.e. fnmatch without FNM_PATHNAME) explicitly: you
get either the new globbing or literal matching. The old fnmatch
behavior is considered deprecated and its use is discouraged.
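
To illustrate how '*' is treated differently, here is a rough Python
model. It is a sketch under stated assumptions, not wildmatch itself:
glob_match() handles only literals, '?', '*' and a "**/" component,
and legacy_match() relies on Python's fnmatch module, whose '*' also
crosses '/' much like fnmatch(3) without FNM_PATHNAME. Both function
names are made up for this example.

```python
import fnmatch
import re


def glob_to_regex(pattern):
    """Translate a small subset of :(glob) syntax into a regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more whole directories
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")      # '*' stops at a slash
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")       # '?' does not match a slash either
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def glob_match(pattern, path):
    """Model of :(glob) matching for the subset handled above."""
    return glob_to_regex(pattern).match(path) is not None


def legacy_match(pattern, path):
    """Model of the old behaviour: '*' may match across '/'."""
    return fnmatch.fnmatchcase(path, pattern)


for pattern in ("*.c", "**/*.c"):
    print(pattern,
          legacy_match(pattern, "src/main.c"),
          glob_match(pattern, "src/main.c"))
```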

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:10 -07:00
Nguyễn Thái Ngọc Duy 5c6933d201 pathspec: support :(literal) syntax for noglob pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 341003e715 kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy b3920bbdc5 rename field "raw" to "_raw" in struct pathspec
This patch is essentially a no-op. It helps catch new uses of this
field, though. The field was introduced as an intermediate step for
the pathspec conversion and will be removed eventually. At this stage
no more access sites should be introduced.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 84b8b5d1fa remove match_pathspec() in favor of match_pathspec_depth()
match_pathspec_depth() was created to replace match_pathspec() (see
61cf282 (pathspec: add match_pathspec_depth() - 2010-12-15)). It took
more than two years, but the replacement finally happens :-)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 9a08727443 remove init_pathspec() in favor of parse_pathspec()
While at it, move free_pathspec() to pathspec.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 827f4d6c21 convert common_prefix() to use struct pathspec
The code now takes advantage of nowildcard_len field.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 7327d3d1b7 convert {read,fill}_directory to take struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:08 -07:00
Nguyễn Thái Ngọc Duy 8f4f8f4579 guard against new pathspec magic in pathspec matching code
GUARD_PATHSPEC() marks pathspec-sensitive code, basically all those
that touch anything in 'struct pathspec' except fields "nr" and
"original". GUARD_PATHSPEC() is not supposed to fail. It's mainly to
help the designers catch unsupported codepaths.
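
The idea of such a guard can be sketched as a bitmask check. The
flag names and values below are hypothetical and are not the ones
defined in git's headers; guard_pathspec() is likewise an
illustrative stand-in for the macro.

```python
from enum import IntFlag


class Magic(IntFlag):
    # Hypothetical bit values, chosen only for this example.
    LITERAL = 1
    GLOB = 2
    ICASE = 4


def guard_pathspec(magic, supported):
    """Fail loudly if `magic` has a bit the calling code does not support."""
    unsupported = int(magic) & ~int(supported)
    if unsupported:
        raise RuntimeError(f"BUG: unsupported magic {unsupported:#x}")


# A code path that understands :(glob) and :(literal) only.
guard_pathspec(Magic.GLOB, Magic.GLOB | Magic.LITERAL)
```

A code path that later receives :(icase) without having been taught
about it would trip the guard instead of silently mismatching.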

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:07 -07:00
Nguyễn Thái Ngọc Duy 6330a17199 parse_pathspec: add special flag for max_depth feature
match_pathspec_depth() and tree_entry_interesting() check max_depth
field in order to support "git grep --max-depth". The feature
activation is tied to "recursive" field, which led to some unwanted
activation, e.g. 5c8eeb8 (diff-index: enable recursive pathspec
matching in unpack_trees - 2012-01-15).

This patch decouples the activation from "recursive" field, puts it in
"magic" field instead. This makes sure that only "git grep" can
activate this feature. And because parse_pathspec knows when the
feature is not used, it does not need to sort pathspec (required for
max_depth to work correctly). A small win for non-grep cases.

Even though a new magic flag is introduced, no magic syntax is. The
magic can only be enabled by the caller of parse_pathspec(). We might
someday want to support ":(maxdepth:10)src". It all depends on actual
use cases.

max_depth feature cannot be enabled via init_pathspec() anymore. But
that's ok because init_pathspec() is on its way to /dev/null.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy d2ce133195 parse_pathspec: save original pathspec for reporting
We usually use pathspec_item's match field for pathspec error
reporting. However "match" (or "raw") does not show the magic part,
which will play a more important role later on. Preserve exact user
input for reporting.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy 87323bdace add parse_pathspec() that converts cmdline args to struct pathspec
Currently to fill a struct pathspec, we do:

   const char **paths;
   paths = get_pathspec(prefix, argv);
   ...
   init_pathspec(&pathspec, paths);

"paths" can only carry bare strings, which loses information from
command line arguments such as pathspec magic or the prefix part's
length for each argument.

parse_pathspec() is introduced to combine the two calls into one. The
plan is to gradually replace all get_pathspec() and init_pathspec()
calls with parse_pathspec(). get_pathspec() now becomes a thin wrapper
around parse_pathspec().

parse_pathspec() allows the caller to reject the pathspec magics that
it does not support. When a new pathspec magic is introduced, we can
enable it per command after making sure that all underlying code has no
problem with the new magic.

"flags" parameter is currently unused. But it would allow callers to
pass certain instructions to parse_pathspec, for example forcing
literal pathspec when no magic is used.

With the introduction of parse_pathspec, there are now two functions
that can initialize struct pathspec: init_pathspec and
parse_pathspec. Any semantic changes in struct pathspec must be
reflected in both functions. init_pathspec() will be phased out in
favor of parse_pathspec().
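
As a rough model of what a single parsed item carries, consider the
Python sketch below. It is not the C API: PathspecItem,
parse_pathspec_item() and the "unsupported" parameter are
hypothetical names, only the long ":(name,...)" form is handled, and
the set of known magic is limited to names mentioned in this series.

```python
from dataclasses import dataclass

KNOWN_MAGIC = frozenset({"glob", "literal", "icase"})


@dataclass
class PathspecItem:
    original: str      # exact user input, kept for error reporting
    match: str         # prefix plus path, with the magic stripped
    magic: frozenset   # magic names found in the argument


def parse_pathspec_item(arg, prefix="", unsupported=frozenset()):
    """Split one command-line argument into magic and path parts."""
    magic = frozenset()
    rest = arg
    if arg.startswith(":("):
        end = arg.find(")")
        if end < 0:
            raise ValueError(f"missing ')' in pathspec magic: {arg!r}")
        magic = frozenset(n for n in arg[2:end].split(",") if n)
        rest = arg[end + 1:]
    unknown = magic - KNOWN_MAGIC
    if unknown:
        raise ValueError(f"unknown pathspec magic in {arg!r}: {sorted(unknown)}")
    # The caller can refuse magic that its code paths cannot handle yet.
    rejected = magic & frozenset(unsupported)
    if rejected:
        raise ValueError(f"{arg!r}: magic not supported here: {sorted(rejected)}")
    return PathspecItem(original=arg, match=prefix + rest, magic=magic)


item = parse_pathspec_item(":(glob,icase)*.c", prefix="src/")
print(item)
```

A bare string array cannot carry the magic set or the original
spelling, which is the information loss described above.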

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy 64acde94ef move struct pathspec and related functions to pathspec.[ch]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably the most
interesting because they modify on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. The others just use ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valuable than the summary above. But if
anyone wants to re-identify the above sites, applying this patch and
then this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano 26c986e118 treat_directory(): do not declare submodules to be untracked
When the working tree walker encounters a directory, it asks the
function treat_directory() if it should descend into it, show it as
an untracked directory, or do something else.  When the directory is
the top of the submodule working tree, we used to say "That is an
untracked directory", which was bogus.

It is an entity that is tracked in the index of the repository we
are looking at, and it is not to be descended into.  Return
path_none, not path_untracked, to report that.

The existing case that path_untracked is returned for a newly
discovered submodule that is not tracked in the index (this only
happens when DIR_NO_GITLINKS option is not used) is unchanged, but
that is exactly because the submodule is not tracked in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-01 14:23:24 -07:00
Junio C Hamano 3684101a65 Merge branch 'kb/status-ignored-optim-2'
Fix 1.8.3 regressions in the .gitignore path exclusion logic.

* kb/status-ignored-optim-2:
  dir.c: fix ignore processing within not-ignored directories
2013-06-03 12:58:56 -07:00
Karsten Blees c3c327deea dir.c: fix ignore processing within not-ignored directories
As of 95c6f271 "dir.c: unify is_excluded and is_path_excluded APIs", the
is_excluded API no longer recurses into directories that match an ignore
pattern, and returns the directory's ignored state for all contained paths.

This is OK for normal ignore patterns, i.e. ignoring a directory affects
the entire contents recursively.

Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e.
the entire contents is "not-ignored" recursively, regardless of ignore
patterns that match the contents directly.

In prep_exclude, skip recursing into a directory only if it is really
ignored (i.e. the ignore pattern is not negated).
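
The rule can be sketched in Python as follows. This is a simplified
model, not prep_exclude itself: patterns are (text, negated) pairs
matched against the basename only, the last matching pattern wins,
and the helper names are invented for the example.

```python
import fnmatch


def last_match(name, patterns):
    """Return the last (text, negated) pattern matching `name`, or None."""
    for text, negated in reversed(patterns):
        if fnmatch.fnmatchcase(name, text):
            return (text, negated)
    return None


def is_ignored(path, patterns):
    """Decide whether `path` is ignored, walking directories top-down."""
    parts = path.split("/")
    for directory in parts[:-1]:
        hit = last_match(directory, patterns)
        if hit is not None and not hit[1]:
            return True   # really ignored directory: contents follow it
        # A negated ('!dir') or unmatched directory decides nothing
        # about its contents, so keep looking deeper.
    hit = last_match(parts[-1], patterns)
    return hit is not None and not hit[1]


# Equivalent of a .gitignore containing "*.o" followed by "!dir".
patterns = [("*.o", False), ("dir", True)]
print(is_ignored("dir/a.o", patterns))
```

With the regression, the walk would have stopped at "dir" and
declared "dir/a.o" not ignored; here the "*.o" pattern still applies.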

Signed-off-by: Karsten Blees <blees@dcon.de>
Tested-by: Øystein Walle <oystwa@gmail.com>
Reviewed-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 14:54:38 -07:00
Junio C Hamano 7ebb906ddd Merge branch 'jn/config-ignore-inaccessible'
When $HOME is misconfigured to point at an unreadable directory, we
used to complain and die. This loosens the check.

* jn/config-ignore-inaccessible:
  config: allow inaccessible configuration under $HOME
2013-05-29 14:30:10 -07:00
Karsten Blees 0aaf62b6e0 dir.c: git-status --ignored: don't scan the work tree twice
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so let's introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:36:42 -07:00
Karsten Blees defd7c7b37 dir.c: git-status --ignored: don't scan the work tree three times
'git-status --ignored' recursively scans directories up to three times:

 1. To collect untracked files.

 2. To collect ignored files.

 3. When collecting ignored files, to check that an untracked directory
    that potentially contains ignored files doesn't also contain untracked
    files (i.e. isn't already listed as untracked).

Let's get rid of case 3 first.

Currently, read_directory_recursive returns a boolean indicating whether a
directory contains the requested files or not (actually, it returns the
number of files, but no caller actually needs that), and DIR_SHOW_IGNORED
specifies what we're looking for.

To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.

Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.

Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").

Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.

In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.

In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.

The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).
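
The "maximum path_treatment" rule can be modelled as below. This is a
sketch, not the C implementation: the numeric values are assumptions
(only their relative order matters here), the directory tree is a
nested dict, and the "visited" list exists only to make the early
abort observable.

```python
from enum import IntEnum


class PathTreatment(IntEnum):
    NONE = 0
    EXCLUDED = 1
    UNTRACKED = 2


def scan(tree, check_only=False, visited=None):
    """Return the maximum treatment of all entries below `tree`."""
    state = PathTreatment.NONE
    for name, entry in tree.items():
        if visited is not None:
            visited.append(name)
        if isinstance(entry, dict):
            sub = scan(entry, check_only, visited)   # sub-directory
        else:
            sub = entry                              # plain file
        state = max(state, sub)
        # In check_only mode nothing can beat UNTRACKED, so stop early.
        if check_only and state == PathTreatment.UNTRACKED:
            break
    return state


tree = {
    "a.o": PathTreatment.EXCLUDED,
    "sub": {"new.c": PathTreatment.UNTRACKED, "b.o": PathTreatment.EXCLUDED},
    "z": PathTreatment.NONE,
}
seen = []
print(scan(tree, check_only=True, visited=seen).name, seen)
```

A directory holding both ignored and untracked files therefore
reports as untracked, with no separate combined value needed.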

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:01 -07:00
Karsten Blees 8aaf8d7728 dir.c: git-status: avoid is_excluded checks for tracked files
Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).

Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.

This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.

There's just one line left in treat_file, so move this to treat_one_path
as well.
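
The reordering amounts to doing the cheap test first. A minimal
Python sketch, assuming the index is a set of paths and exclude
patterns are basename globs; the return labels and the "stats"
counter are illustrative and not part of dir.c:

```python
import fnmatch


def treat_one_path(path, index, exclude_patterns, stats):
    """Classify one path, consulting the index before the patterns."""
    if path in index:                      # hash lookup
        return "tracked"
    stats["exclude_checks"] += 1
    basename = path.rsplit("/", 1)[-1]
    for pattern in exclude_patterns:       # linear scan over patterns
        if fnmatch.fnmatchcase(basename, pattern):
            return "excluded"
    return "untracked"


index = {"Makefile", "src/main.c"}
patterns = ["*.o", "*.a"]
stats = {"exclude_checks": 0}
for p in ("Makefile", "src/main.c", "src/main.o", "notes.txt"):
    print(p, treat_one_path(p, index, patterns, stats))
print(stats)
```

In a mostly tracked working tree, nearly every path returns at the
first test and never reaches the pattern scan.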

Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):

       |    status      | status --ignored
       | linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 |  1.583 | 0.321 |  2.579
after  | 0.156 |  0.988 | 0.202 |  1.279
gain   | 1.397 |  1.602 | 1.589 |  2.016

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:01 -07:00
Karsten Blees b07bc8c8c3 dir.c: replace is_path_excluded with now equivalent is_excluded API
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:01 -07:00
Karsten Blees 95c6f27164 dir.c: unify is_excluded and is_path_excluded APIs
The is_excluded and is_path_excluded APIs are very similar, except for a
few noteworthy differences:

is_excluded doesn't handle ignored directories, results for paths within
ignored directories are incorrect. This is probably based on the premise
that recursive directory scans should stop at ignored directories, which
is no longer true (in certain cases, read_directory_recursive currently
calls is_excluded *and* is_path_excluded to get correct ignored state).

is_excluded caches parsed .gitignore files of the last directory in struct
dir_struct. If the directory changes, it finds a common parent directory
and is very careful to drop only as much state as necessary. On the other
hand, is_excluded will also read and parse .gitignore files in already
ignored directories, which are completely irrelevant.

is_path_excluded correctly handles ignored directories by checking if any
component in the path is excluded. As it uses is_excluded internally, this
unfortunately forces is_excluded to drop and re-read all .gitignore files,
as there is no common parent directory for the root dir.

is_path_excluded tracks state in a separate struct path_exclude_check,
which is essentially a wrapper of dir_struct with two more fields. However,
as is_path_excluded also modifies dir_struct, it is not possible to e.g.
use multiple path_exclude_check structures with the same dir_struct in
parallel. The additional structure just unnecessarily complicates the API.

Teach is_excluded / prep_exclude about ignored directories: whenever
entering a new directory, first check if the entire directory is excluded.
Remember the excluded state in dir_struct. Don't traverse into already
ignored directories (i.e. don't read irrelevant .gitignore files).
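
A simplified model of this scheme is sketched below. It is not the C
code: ExcludeChecker and its fields are hypothetical, .gitignore
contents are supplied as an in-memory dict keyed by directory,
patterns are basename globs, and negated patterns are left out.

```python
import fnmatch


class ExcludeChecker:
    def __init__(self, base_patterns, gitignores):
        self.base_patterns = list(base_patterns)  # e.g. command-line excludes
        self.gitignores = gitignores              # directory -> list of patterns
        self.dir_state = {}                       # directory -> (excluded, patterns)
        self.files_read = []                      # directories whose file was read

    @staticmethod
    def _matches(name, patterns):
        return any(fnmatch.fnmatchcase(name, p) for p in patterns)

    def _read(self, directory):
        self.files_read.append(directory)
        return self.gitignores.get(directory, [])

    def _enter(self, directory):
        """Return (excluded, patterns in effect) for `directory`, memoized."""
        if directory not in self.dir_state:
            if directory == "":
                state = (False, self.base_patterns + self._read(""))
            else:
                parent, _, name = directory.rpartition("/")
                parent_excluded, inherited = self._enter(parent)
                if parent_excluded or self._matches(name, inherited):
                    # Already ignored: do not read its .gitignore at all.
                    state = (True, inherited)
                else:
                    state = (False, inherited + self._read(directory))
            self.dir_state[directory] = state
        return self.dir_state[directory]

    def is_excluded(self, path):
        directory, _, name = path.rpartition("/")
        excluded, patterns = self._enter(directory)
        return excluded or self._matches(name, patterns)


checker = ExcludeChecker([], {"": ["build"], "build": ["*.tmp"], "src": ["*.o"]})
print(checker.is_excluded("build/x/y.c"), checker.files_read)
```

The remembered per-directory state is what lets repeated calls avoid
re-reading the same files.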

Directories could also be excluded by exclude patterns specified on the
command line or .git/info/exclude, so we cannot simply skip prep_exclude
entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
Move this check to just before actually reading the file.

is_path_excluded is now equivalent to is_excluded, so we can simply
redirect to it (the public API is cleaned up in the next patch).

The performance impact of the additional ignored check per directory is
hardly noticeable when reading directories recursively (e.g. 'git status').
However, performance of git commands using the is_path_excluded API (e.g.
'git ls-files --cached --ignored --exclude-standard') is greatly improved
as this no longer re-reads .gitignore files on each call.

Here's some performance data from the linux and WebKit repos (best of 10
runs on a Debian Linux on SSD, core.preloadIndex=true):

       | ls-files -ci   |    status      | status --ignored
       | linux | WebKit | linux | WebKit | linux | WebKit
-------+-------+--------+-------+--------+-------+---------
before | 0.506 |  6.539 | 0.212 |  1.555 | 0.323 |  2.541
after  | 0.080 |  1.191 | 0.218 |  1.583 | 0.321 |  2.579
gain   | 6.325 |  5.490 | 0.972 |  0.982 | 1.006 |  0.985
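
The "gain" row appears to be the ratio before/after, so values above
1 are speedups and values just below 1 are small slowdowns. The short
check below recomputes the row from the timings in the table; the
dictionary layout is just a convenience for this example.

```python
timings = {
    # column: (before, after, reported gain)
    "ls-files -ci / linux": (0.506, 0.080, 6.325),
    "ls-files -ci / WebKit": (6.539, 1.191, 5.490),
    "status / linux": (0.212, 0.218, 0.972),
    "status / WebKit": (1.555, 1.583, 0.982),
    "status --ignored / linux": (0.323, 0.321, 1.006),
    "status --ignored / WebKit": (2.541, 2.579, 0.985),
}


def gain(before, after):
    """Speedup factor: how many times faster 'after' is than 'before'."""
    return before / after


for column, (before, after, reported) in timings.items():
    print(f"{column:26s} {gain(before, after):.3f} (reported {reported:.3f})")
```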

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:00 -07:00
Karsten Blees 6cd5c582dc dir.c: move prep_exclude
Move prep_exclude in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:00 -07:00
Karsten Blees 46aa2f95d2 dir.c: factor out parts of last_exclude_matching for later reuse
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:00 -07:00
Karsten Blees 5bd8e2d894 dir.c: git-clean -d -X: don't delete tracked directories
The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c:
Make git-status --ignored more consistent" has a few unwanted side effects:

 - git-clean -d -X: deletes ignored tracked directories. git-clean should
   never delete tracked content.

 - git-ls-files --ignored --other --directory: lists ignored tracked
   directories instead of "other" directories.

 - git-status --ignored: lists ignored tracked directories while contained
   files may be listed as modified. Paths listed by git-status should be
   disjoint (except in long format where a path may be listed in both the
   staged and unstaged section).

Additionally, the current behaviour violates documentation in gitignore(5)
("Specifies intentionally *untracked* files to ignore") and Documentation/
technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include
a directory that is *not tracked*.").

In dir.c::treat_directory, remove the special handling of ignored tracked
directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects
"other" (i.e. untracked) directories. In dir.c::dir_add_name, check that
added paths are untracked even if DIR_SHOW_IGNORED is set.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:00 -07:00
Karsten Blees be8a84c526 dir.c: make 'git-status --ignored' work within leading directories
'git-status --ignored path/' doesn't list ignored files and directories
within 'path' if some component of 'path' is classified as untracked.

Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading
directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag
from aborting at the top level untracked directory.

As a side effect, this also eliminates a recursive directory scan per
leading directory level, as treat_directory() can no longer call
read_directory_recursive() when called from treat_leading_path().

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:59 -07:00
Karsten Blees c94ab01026 dir.c: git-status --ignored: don't list empty directories as ignored
'git-status --ignored' lists empty untracked directories as ignored, even
though they don't have any ignored files.

When checking if a directory is already listed as untracked (i.e. shouldn't
be listed as ignored as well), don't assume that the directory has only
ignored files if it doesn't have untracked files, as the directory may be
empty.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:59 -07:00
Karsten Blees 184d2a8e96 dir.c: git-ls-files --directories: don't hide empty directories
'git-ls-files --ignored --directories' hides empty directories even though
--no-empty-directory was not specified.

Treat the DIR_HIDE_EMPTY_DIRECTORIES flag independently from
DIR_SHOW_IGNORED to make all git-ls-files options work as expected.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:59 -07:00
Karsten Blees 0104c9e781 dir.c: git-status --ignored: don't list empty ignored directories
'git-status --ignored' lists ignored tracked directories without any
ignored files if a tracked file happens to match an exclude pattern.

Always exclude tracked files.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:58 -07:00
Karsten Blees 289ff5598f dir.c: git-status --ignored: don't list files in ignored directories
'git-status --ignored' lists both the ignored directory and the ignored
files if the files are in a tracked sub directory.

When recursing into sub directories in read_directory_recursive, pass on
the check_only parameter so that we don't accidentally add the files.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:58 -07:00
Karsten Blees 560bb7a7a1 dir.c: git-status --ignored: don't drop ignored directories
'git-status --ignored' drops ignored directories if they contain untracked
files in an untracked sub directory.

Fix it by getting exact (recursive) excluded status in treat_directory.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:33:58 -07:00
Jonathan Nieder 4698c8feb1 config: allow inaccessible configuration under $HOME
The changes v1.7.12.1~2^2~4 (config: warn on inaccessible files,
2012-08-21) and v1.8.1.1~22^2~2 (config: treat user and xdg config
permission problems as errors, 2012-10-13) were intended to prevent
important configuration (think "[transfer] fsckobjects") from being
ignored when the configuration is unintentionally unreadable (for
example with EIO on a flaky filesystem, or with ENOMEM due to a DoS
attack).  Usually ~/.gitconfig and ~/.config/git are readable by the
current user, and if they aren't then it would be easy to fix those
permissions, so the damage from adding this check should have been
minimal.

Unfortunately the access() check often trips when git is being run as
a server.  A daemon (such as inetd or git-daemon) starts as "root",
creates a listening socket, and then drops privileges, meaning that
when git commands are invoked they cannot access $HOME and die with

 fatal: unable to access '/root/.config/git/config': Permission denied

Any patch to fix this would have one of three problems:

  1. We annoy sysadmins who need to take an extra step to handle HOME
     when dropping privileges (the current behavior, or any other
     proposal that they have to opt into).

  2. We annoy sysadmins who want to set HOME when dropping privileges,
     either by making what they want to do impossible, or making them
     set an extra variable or option to accomplish what used to work
     (e.g., a patch to git-daemon to set HOME when --user is passed).

  3. We loosen the check, so some cases which might be noteworthy are
     not caught.

This patch is of type (3).

Treat user and xdg configuration that are inaccessible due to
permissions (EACCES) as though no user configuration was provided at
all.

An alternative method would be to check if $HOME is readable, but that
would not help in cases where the user who dropped privileges had a
globally readable HOME with only .config or .gitconfig being private.

This does not change the behavior when /etc/gitconfig or .git/config
is unreadable (since those are more serious configuration errors),
nor when ~/.gitconfig or ~/.config/git is unreadable due to problems
other than permissions.
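
The loosened check can be sketched like this. The model opens the
file instead of calling access(), and read_user_config() is a
hypothetical helper rather than anything in config.c; the point is
only which errno values count as "no configuration".

```python
import errno


def read_user_config(path):
    """Return the file's text, or None if it is missing or off-limits."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return fh.read()
    except OSError as err:
        # A missing file and a permission problem are both treated
        # as "no user configuration was provided".
        if err.errno in (errno.ENOENT, errno.EACCES):
            return None
        # Anything else (EIO, ENOMEM, ...) is still reported.
        raise
```

Errors other than these two still propagate, matching the intent
that only permission problems are forgiven.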

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 07:26:50 -07:00
Junio C Hamano 0f3d66c6dc Merge branch 'jk/rm-removed-paths'
A handful of test cases and a corner case bugfix for "git rm".

* jk/rm-removed-paths:
  t3600: document failure of rm across symbolic links
  t3600: test behavior of reverse-d/f conflict
  rm: do not complain about d/f conflicts during deletion
2013-04-07 14:33:14 -07:00
Junio C Hamano 6466fbbeef Sync with 1.8.1.6 2013-04-07 13:17:50 -07:00
Junio C Hamano 4bbb830a35 Merge branch 'jc/directory-attrs-regression-fix' into maint-1.8.1
A pattern "dir" (without trailing slash) in the attributes file
stopped matching a directory "dir" by mistake with an earlier change
that wanted to allow pattern "dir/" to also match.

* jc/directory-attrs-regression-fix:
  t: check that a pattern without trailing slash matches a directory
  dir.c::match_pathname(): pay attention to the length of string parameters
  dir.c::match_pathname(): adjust patternlen when shifting pattern
  dir.c::match_basename(): pay attention to the length of string parameters
  attr.c::path_matches(): special case paths that end with a slash
  attr.c::path_matches(): the basename is part of the pathname
2013-04-07 08:45:03 -07:00
Jeff King 9a6728d4d1 rm: do not complain about d/f conflicts during deletion
If we used to have an index entry "d/f", but "d" has been
replaced by a non-directory entry, the user may still want
to run "git rm" to delete the stale index entry. They could
use "git rm --cached" to just touch the index, but "git rm"
should also work: we explicitly try to handle the case that
the file has already been removed from the working tree.

However, because unlinking "d/f" in this case will not yield
ENOENT, but rather ENOTDIR, we do not notice that the file
is already gone. Instead, we report it as an error.

The simple solution is to treat ENOTDIR in this case exactly
like ENOENT; all we want to know is whether the file is
already gone, and if a leading path is no longer a
directory, then by definition the sub-path is gone.
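
In Python terms the fix looks roughly like the sketch below;
remove_if_present() is an illustrative helper, not a function in
git.

```python
import errno
import os


def remove_if_present(path):
    """Unlink `path`; return True if removed, False if already gone."""
    try:
        os.unlink(path)
        return True
    except OSError as err:
        # ENOENT: the file is not there.
        # ENOTDIR: a leading component is no longer a directory, so
        # the file cannot be there either.
        if err.errno in (errno.ENOENT, errno.ENOTDIR):
            return False
        raise
```

With "d" replaced by a regular file, asking to remove "d/f" now
counts as "already gone" instead of an error.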

Reported-by: jpinheiro <7jpinheiro@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-04 12:28:47 -07:00
Junio C Hamano f30366b27a Merge branch 'jc/directory-attrs-regression-fix'
Fix 1.8.1.x regression that stopped matching "dir" (without
trailing slash) to a directory "dir".

* jc/directory-attrs-regression-fix:
  t: check that a pattern without trailing slash matches a directory
  dir.c::match_pathname(): pay attention to the length of string parameters
  dir.c::match_pathname(): adjust patternlen when shifting pattern
  dir.c::match_basename(): pay attention to the length of string parameters
  attr.c::path_matches(): special case paths that end with a slash
  attr.c::path_matches(): the basename is part of the pathname
2013-04-03 09:34:09 -07:00
Jeff King ab3aebc15c dir.c::match_pathname(): pay attention to the length of string parameters
This function takes two counted strings: a <pattern, patternlen> pair
and a <pathname, pathlen> pair. But we end up feeding the result to
fnmatch, which expects NUL-terminated strings.

We can fix this by calling the fnmatch_icase_mem function, which
handles re-allocating into a NUL-terminated string if necessary.

While we're at it, we can avoid even calling fnmatch in some cases. In
addition to patternlen, we get "prefix", the size of the pattern that
contains no wildcard characters. We do a straight match of the prefix
part first, and then use fnmatch to cover the rest. But if there are
no wildcards in the pattern at all, we do not even need to call
fnmatch; we would simply be comparing two empty strings.
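
The control flow can be modelled in Python as follows. This is a
sketch with caveats: Python strings are already counted, so the
NUL-termination issue has no equivalent here; Python's fnmatch lets
'*' cross '/', which the real matcher may not; and
no_wildcard_len() is a hypothetical stand-in for however "prefix" is
computed by the caller.

```python
import fnmatch


def no_wildcard_len(pattern):
    """Length of the leading part of `pattern` with no glob characters."""
    for i, ch in enumerate(pattern):
        if ch in "*?[\\":
            return i
    return len(pattern)


def match_pathname(pathname, pattern, prefix):
    """Match using a literal comparison first and fnmatch for the rest."""
    if pattern.startswith("/"):
        # Skip the leading slash and keep the count in sync with it.
        pattern = pattern[1:]
        prefix -= 1
    if prefix:
        if pathname[:prefix] != pattern[:prefix]:
            return False
        pattern = pattern[prefix:]
        pathname = pathname[prefix:]
        if not pattern and not pathname:
            return True    # no wildcards at all: fnmatch is not needed
    return fnmatch.fnmatchcase(pathname, pattern)


pattern = "/Documentation/*.txt"
print(match_pathname("Documentation/git.txt", pattern, no_wildcard_len(pattern)))
```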

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 21:48:18 -07:00
Jeff King 982ac87316 dir.c::match_pathname(): adjust patternlen when shifting pattern
If we receive a pattern that starts with "/", we shift it
forward to avoid looking at the "/" part. Since the prefix
and patternlen parameters are counts of what is in the
pattern, we must decrement them as we increment the pointer.

We remembered to handle prefix, but not patternlen. This
didn't cause any bugs, though, because the patternlen
parameter is not actually used. Since it will be used in
future patches, let's correct this oversight.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 21:48:18 -07:00
Junio C Hamano 0b6e56dfe6 dir.c::match_basename(): pay attention to the length of string parameters
The function takes two counted strings (<basename, basenamelen> and
<pattern, patternlen>) as parameters, together with prefix (the
length of the prefix in pattern that is to be matched literally
without globbing against the basename) and EXC_* flags that tell it
how to match the pattern against the basename.

However, it did not pay attention to the length of these counted
strings.  Update them to do the following:

 * When the entire pattern is to be matched literally, the pattern
   matches the basename only when the lengths of them are the same,
   and they match up to that length.

 * When the pattern is "*" followed by a string to be matched
   literally, make sure that the basenamelen is equal or longer than
   the "literal" part of the pattern, and the tail of the basename
   string matches that literal part.

 * Otherwise, use the new fnmatch_icase_mem helper to make sure we
   only look at the counted part of the strings.  Because these
   counted strings are full strings most of the time, we check for
   termination to avoid unnecessary allocation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-28 21:48:12 -07:00
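
The three cases listed in the entry above might look like this in outline. It is a hedged Python sketch rather than the actual match_basename(): the function name and the boolean `endswith` parameter (standing in for the EXC_FLAG_ENDSWITH flag mentioned elsewhere in this log) are assumptions, and `fnmatch.fnmatchcase` replaces the fnmatch_icase_mem helper.

```python
from fnmatch import fnmatchcase


def match_basename_sketch(basename: str, pattern: str, prefix: int,
                          endswith: bool = False) -> bool:
    """Match a basename against a pattern with a literal prefix length."""
    if prefix == len(pattern):
        # The whole pattern is literal: lengths and contents must agree.
        return basename == pattern
    if endswith:
        # The pattern is "*" followed by a literal tail.
        tail = pattern[1:]
        return len(basename) >= len(tail) and basename.endswith(tail)
    # General case: fall back to the glob matcher.
    return fnmatchcase(basename, pattern)
```

The length comparison in the first two branches is the point of the change: a literal pattern "Makefile" must not match "Makefile.in" just because the first eight characters agree.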
Junio C Hamano 85fd059a89 Merge branch 'ap/status-ignored-in-ignored-directory' into maint
Output from "git status --ignored" did not work well when used with
"--untracked".

* ap/status-ignored-in-ignored-directory:
  status: always report ignored tracked directories
  git-status: Test --ignored behavior
  dir.c: Make git-status --ignored more consistent
2013-01-28 11:10:25 -08:00
Junio C Hamano 9ecd9f5dc3 Merge branch 'nd/retire-fnmatch'
Replace our use of fnmatch(3) with a more feature-rich wildmatch.
A handful of patches at the bottom have been moved to nd/wildmatch to
graduate as part of that branch, before this series solidifies.

We may want to mark USE_WILDMATCH as an experimental curiosity a
bit more clearly (i.e. should not be enabled in production
environment, because it will make the behaviour between builds
unpredictable).

* nd/retire-fnmatch:
  Makefile: add USE_WILDMATCH to use wildmatch as fnmatch
  wildmatch: advance faster in <asterisk> + <literal> patterns
  wildmatch: make a special case for "*/" with FNM_PATHNAME
  test-wildmatch: add "perf" command to compare wildmatch and fnmatch
  wildmatch: support "no FNM_PATHNAME" mode
  wildmatch: make dowild() take arbitrary flags
  wildmatch: rename constants and update prototype
2013-01-25 12:34:55 -08:00
Junio C Hamano a39b15b4f6 Merge branch 'as/check-ignore'
Add a new command "git check-ignore" for debugging .gitignore
files.

The variable names may want to get cleaned up but that can be done
in-tree.

* as/check-ignore:
  clean.c, ls-files.c: respect encapsulation of exclude_list_groups
  t0008: avoid brace expansion
  add git-check-ignore sub-command
  setup.c: document get_pathspec()
  add.c: extract new die_if_path_beyond_symlink() for reuse
  add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
  pathspec.c: rename newly public functions for clarity
  add.c: move pathspec matchers into new pathspec.c for reuse
  add.c: remove unused argument from validate_pathspec()
  dir.c: improve docs for match_pathspec() and match_pathspec_depth()
  dir.c: provide clear_directory() for reclaiming dir_struct memory
  dir.c: keep track of where patterns came from
  dir.c: use a single struct exclude_list per source of excludes

Conflicts:
	builtin/ls-files.c
	dir.c
2013-01-23 21:19:10 -08:00
Junio C Hamano 0a9a787fca Merge branch 'ap/status-ignored-in-ignored-directory'
Output from "git status --ignored" showed an unexpected interaction
with "--untracked".

* ap/status-ignored-in-ignored-directory:
  status: always report ignored tracked directories
  git-status: Test --ignored behavior
  dir.c: Make git-status --ignored more consistent
2013-01-14 08:15:43 -08:00
Junio C Hamano d912b0e44f Merge branch 'as/dir-c-cleanup'
Refactor and generally clean up the directory traversal API
implementation.

* as/dir-c-cleanup:
  dir.c: rename free_excludes() to clear_exclude_list()
  dir.c: refactor is_path_excluded()
  dir.c: refactor is_excluded()
  dir.c: refactor is_excluded_from_list()
  dir.c: rename excluded() to is_excluded()
  dir.c: rename excluded_from_list() to is_excluded_from_list()
  dir.c: rename path_excluded() to is_path_excluded()
  dir.c: rename cryptic 'which' variable to more consistent name
  Improve documentation and comments regarding directory traversal API
  api-directory-listing.txt: update to match code
2013-01-10 13:47:25 -08:00
Junio C Hamano 2adf7247ec Merge branch 'nd/wildmatch'
Allows pathname patterns in .gitignore and .gitattributes files
with double-asterisks "foo/**/bar" to match any number of directory
hierarchies.

* nd/wildmatch:
  wildmatch: replace variable 'special' with better named ones
  compat/fnmatch: respect NO_FNMATCH* even on glibc
  wildmatch: fix "**" special case
  t3070: Disable some failing fnmatch tests
  test-wildmatch: avoid Windows path mangling
  Support "**" wildcard in .gitignore and .gitattributes
  wildmatch: make /**/ match zero or more directories
  wildmatch: adjust "**" behavior
  wildmatch: fix case-insensitive matching
  wildmatch: remove static variable force_lower_case
  wildmatch: make wildmatch's return value compatible with fnmatch
  t3070: disable unreliable fnmatch tests
  Integrate wildmatch to git
  wildmatch: follow Git's coding convention
  wildmatch: remove unnecessary functions
  Import wildmatch from rsync
  ctype: support iscntrl, ispunct, isxdigit and isprint
  ctype: make sane_ctype[] const array

Conflicts:
	Makefile
2013-01-10 13:47:20 -08:00
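
The "foo/**/bar" behaviour summarised in the entry above can be illustrated with a small translation to a regular expression. This is only a sketch under simplifying assumptions: the helper name `double_star_match` is made up, only "/**/" and a single "*" are handled, and other uses of "**" (leading or trailing) are not treated specially.

```python
import re


def double_star_match(pattern: str, path: str) -> bool:
    """Match path against a glob where "/**/" spans zero or more directories."""
    regex = ""
    i = 0
    while i < len(pattern):
        if pattern.startswith("/**/", i):
            # One slash, optionally followed by any number of "dir/" parts.
            regex += "/(?:.*/)?"
            i += 4
        elif pattern[i] == "*":
            # A single asterisk stays within one path component.
            regex += "[^/]*"
            i += 1
        else:
            regex += re.escape(pattern[i])
            i += 1
    return re.fullmatch(regex, path) is not None
```

Under this reading "foo/**/bar" matches "foo/bar" and "foo/a/b/bar", while "foo/*/bar" matches exactly one intermediate directory.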
Antoine Pelisse a45fb697f1 status: always report ignored tracked directories
When enumerating paths that are ignored, paths the index knows
about are not included in the result.  The "index knows about"
check is done by consulting the name hash, not the actual
contents of the index:

 - When core.ignorecase is false, directory names are not in the
   name hash, and ignored ones are shown as ignored (directories
   can never be tracked anyway).

 - When core.ignorecase is true, however, the name hash keeps
   track of the names of directories, in order to detect
   additions of the paths under different cases.  This causes
   ignored directories to be mistakenly excluded when
   enumerating ignored paths.

Stop excluding directories that are in the name hash when
looking for ignored files in dir_add_name(); the names that are
actually in the index are excluded much earlier in the callchain
in treat_file(), so this fix will not make them mistakenly
identified as ignored.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-07 11:06:29 -08:00
Adam Spiers 52ed1894b0 dir.c: improve docs for match_pathspec() and match_pathspec_depth()
Fix a grammatical issue in the description of these functions, and
make it more obvious how and why seen[] can be reused across multiple
invocations.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:26:37 -08:00
Adam Spiers 270be81604 dir.c: provide clear_directory() for reclaiming dir_struct memory
By the end of a directory traversal, a dir_struct instance will
typically contain pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:26:37 -08:00
Adam Spiers c04318e46a dir.c: keep track of where patterns came from
For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).

For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1.  So for example the 2nd pattern provided via
--exclude would be numbered -2.  This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:26:37 -08:00
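
The numbering convention described in the entry above (positive line numbers for patterns read from files, negative sequence numbers for patterns given on the command line) can be sketched as below. The class and function names are illustrative assumptions, not the C structures from dir.c, and the file parsing is reduced to skipping blank lines and "#" comments.

```python
from dataclasses import dataclass, field


@dataclass
class Exclude:
    pattern: str
    srcpos: int  # > 0: line number in a file; < 0: position on the command line


@dataclass
class ExcludeList:
    src: str  # file name, or a string describing the command-line source
    excludes: list = field(default_factory=list)


def from_file(src, lines):
    """Build a list from file lines, keeping the original 1-based line numbers."""
    el = ExcludeList(src)
    for lineno, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text and not text.startswith("#"):
            el.excludes.append(Exclude(text, lineno))
    return el


def from_command_line(src, patterns):
    """Build a list from command-line patterns, numbered -1, -2, ..."""
    el = ExcludeList(src)
    for seq, pattern in enumerate(patterns, start=1):
        el.excludes.append(Exclude(pattern, -seq))
    return el


def describe(el, ex):
    """Report where a pattern came from; the sign tells the two sources apart."""
    if ex.srcpos > 0:
        return f"{el.src}:{ex.srcpos}: {ex.pattern}"
    return f"{el.src} #{-ex.srcpos}: {ex.pattern}"
```

A consumer only needs to look at the sign of `srcpos` to tell whether `src` names a file or describes a command-line option.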
Adam Spiers c082df2453 dir.c: use a single struct exclude_list per source of excludes
Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:25:06 -08:00
Junio C Hamano 971e829cd8 Merge branch 'jk/pathspec-literal'
Allow scripts to feed literal paths to commands that take
pathspecs, by disabling wildcard globbing.

* jk/pathspec-literal:
  add global --literal-pathspecs option

Conflicts:
	dir.c
2013-01-05 23:42:07 -08:00
Antoine Pelisse 721ac4edde dir.c: Make git-status --ignored more consistent
The current behavior of git-status is inconsistent and misleading,
especially when used with the --untracked-files=all option:

 - files ignored in untracked directories will be missing from
   status output.

 - untracked files in committed yet ignored directories are also
   missing.

 - with --untracked-files=normal, untracked directories that
   contain only ignored files are dropped too.

Make the behavior more consistent across all possible use cases:

 - "--ignored --untracked-files=normal" does not show individual
   files, only the top directory.  It shows untracked directories
   that contain only ignored files, and ignored tracked directories
   with untracked files.

 - "--ignored --untracked-files=all" shows all ignored files, either
   because they are in an ignored directory (tracked or untracked), or
   because they are explicitly ignored.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-01 16:24:45 -08:00
Nguyễn Thái Ngọc Duy c41244e702 wildmatch: support "no FNM_PATHNAME" mode
So far, wildmatch() has always honoured directory boundary and there
was no way to turn it off. Make it behave more like fnmatch() by
requiring all callers that want the FNM_PATHNAME behaviour to pass
that in the equivalent flag WM_PATHNAME. Callers that do not specify
WM_PATHNAME will get wildcards like ? and * in their patterns matched
against '/', just like not passing FNM_PATHNAME to fnmatch().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-01 15:32:37 -08:00
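
The difference between the two modes described in the entry above can be shown with a toy matcher. This is an assumption-laden Python sketch, not wildmatch() itself: the function name and the boolean `pathname` parameter (standing in for the WM_PATHNAME flag) are invented for the example, and only "*" and "?" are supported.

```python
import re


def wildmatch_sketch(pattern: str, text: str, pathname: bool) -> bool:
    """Match text against a glob that uses only '*' and '?'."""
    # With pathname=True, wildcards stop at directory boundaries.
    any_char = "[^/]" if pathname else "."
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(any_char + "*")
        elif ch == "?":
            parts.append(any_char)
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), text, flags=re.DOTALL) is not None
```

Without the flag, "*.c" matches "dir/main.c" because the asterisk is allowed to swallow the slash; with it, the same pattern no longer matches.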
Nguyễn Thái Ngọc Duy 9b3497cab9 wildmatch: rename constants and update prototype
- All exported constants now have a prefix WM_
- Do not rely on FNM_* constants, use the WM_ counterparts
- Remove TRUE and FALSE to follow Git's coding style
- While at it, turn flags type from int to unsigned int
- Add an (unused yet) argument to carry extra information
  so that we don't have to change the prototype again later
  when we need to pass other stuff to wildmatch

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-01 15:32:36 -08:00
Adam Spiers f619881251 dir.c: rename free_excludes() to clear_exclude_list()
It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.

http://article.gmane.org/gmane.comp.version-control.git/206128

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:47 -08:00
Adam Spiers a35341a86e dir.c: refactor is_path_excluded()
In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching_path() which returns the last
exclude_list element which matched, or NULL if no match was found.
is_path_excluded() becomes a wrapper around this, and just returns 0
or 1 depending on whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Adam Spiers f4cd69a674 dir.c: refactor is_excluded()
In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching() which returns the last exclude_list
element which matched, or NULL if no match was found.  is_excluded()
becomes a wrapper around this, and just returns 0 or 1 depending on
whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Adam Spiers 578cd7c3ea dir.c: refactor is_excluded_from_list()
The excluded function uses a new helper function called
last_exclude_matching_from_list() to perform the inner loop over all of
the exclude patterns.  The helper just tells us whether the path is
included, excluded, or undecided.

However, it may be useful to know _which_ pattern was triggered.  So
let's pass out the entire exclude match, which contains the status
information we were already passing out.

Further patches can make use of this.

This is a modified forward port of a patch from 2009 by Jeff King:
http://article.gmane.org/gmane.comp.version-control.git/108815

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
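
The refactoring pattern used in the three entries above (a helper that returns the matching element, wrapped by a function that reduces it to a status) can be sketched as follows. The Python names mirror the C function names from the log, but the bodies are assumptions: patterns are matched with `fnmatch.fnmatchcase`, and the return values 1, 0 and -1 stand for excluded, included and undecided.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional, Sequence


class Exclude(NamedTuple):
    pattern: str
    negative: bool = False  # True for a "!pattern" entry


def last_exclude_matching_from_list(path: str,
                                    excludes: Sequence[Exclude]) -> Optional[Exclude]:
    """Return the last entry that matches path, or None."""
    # Later entries override earlier ones, so scan from the end.
    for ex in reversed(excludes):
        if fnmatchcase(path, ex.pattern):
            return ex
    return None


def is_excluded_from_list(path: str, excludes: Sequence[Exclude]) -> int:
    """Thin wrapper: 1 = excluded, 0 = included, -1 = undecided."""
    ex = last_exclude_matching_from_list(path, excludes)
    if ex is None:
        return -1
    return 0 if ex.negative else 1
```

Callers that only need a yes/no answer keep using the wrapper, while a diagnostic tool can call the helper directly to learn which pattern decided the outcome.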
Adam Spiers 6d24e7a807 dir.c: rename excluded() to is_excluded()
Continue adopting clearer names for exclude functions.  This is_*
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Adam Spiers 0795805053 dir.c: rename excluded_from_list() to is_excluded_from_list()
Continue adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Also adjust their callers as necessary.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Adam Spiers 9013089c4a dir.c: rename path_excluded() to is_path_excluded()
Start adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was agreed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:45 -08:00
Adam Spiers 840fc334e9 dir.c: rename cryptic 'which' variable to more consistent name
'el' is only *slightly* less cryptic, but is already used as the
variable name for a struct exclude_list pointer in numerous other
places, so this reduces the number of cryptic variable names in use by
one :-)

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:45 -08:00
Adam Spiers 95a68344af Improve documentation and comments regarding directory traversal API
The directory traversal API has a few potentially confusing properties.  These
comments clarify a few key aspects and will hopefully make it easier
to understand for other newcomers in the future.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:45 -08:00
Jeff King 823ab40fd4 add global --literal-pathspecs option
Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.

However, when programmatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.

This patch introduces a global command-line option (i.e.,
one for "git" itself, not for specific commands) to turn
this behavior off. It also has a matching environment
variable, which can make it easier if you are a script or
porcelain interface that is going to issue many such
commands.

This option cannot turn off globbing for particular
pathspecs. That could eventually be done with a ":(noglob)"
magic pathspec prefix. However, that level of granularity is
more cumbersome to use for many cases, and doing ":(noglob)"
right would mean converting the whole codebase to use
"struct pathspec", as the usual "const char **pathspec"
cannot represent extra per-item flags.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-19 14:58:59 -08:00
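
The problem described in the entry above comes down to the difference between glob matching and exact comparison. The sketch below is a deliberately simplified Python model: `pathspec_matches` and its `literal` parameter are hypothetical, and real pathspec matching also handles leading directories and depth limits, which are left out here.

```python
from fnmatch import fnmatchcase


def pathspec_matches(pathspec: str, path: str, literal: bool = False) -> bool:
    """Match one path against one pathspec, with or without globbing."""
    if literal:
        # Literal mode: glob characters have no special meaning.
        return path == pathspec
    return fnmatchcase(path, pathspec)
```

A file that is really named "f*" is matched in both modes, but only the literal mode avoids also matching "foo".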
Nguyễn Thái Ngọc Duy 8c6abbcd27 pathspec: apply "*.c" optimization from exclude
When a pattern contains only a single asterisk as wildcard,
e.g. "foo*bar", after literally comparing the leading part "foo" with
the string, we can compare the tail of the string and make sure it
matches "bar", instead of running fnmatch() on "*bar" against the
remainder of the string.

-O2 build on linux-2.6, without the patch:

$ time git rev-list --quiet HEAD -- '*.c'

real    0m40.770s
user    0m40.290s
sys     0m0.256s

With the patch

$ time ~/w/git/git rev-list --quiet HEAD -- '*.c'

real    0m34.288s
user    0m33.997s
sys     0m0.205s

The above command is not supposed to be widely popular. It's chosen
because it exercises pathspec matching a lot. The point is it cuts
down matching time for popular patterns like *.c, which could be used
as pathspec in other places.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-26 11:13:13 -08:00
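
The single-asterisk shortcut described in the entry above amounts to a prefix check plus a suffix check. A minimal Python sketch, assuming a hypothetical helper `match_one_star` that is only ever given patterns with exactly one "*" and no other wildcard:

```python
def match_one_star(pattern: str, text: str) -> bool:
    """Match a glob containing exactly one '*' and no other wildcard."""
    head, tail = pattern.split("*")
    # The two literal parts must both fit without overlapping.
    if len(text) < len(head) + len(tail):
        return False
    return text.startswith(head) and text.endswith(tail)
```

The length check matters: without it "ab*ba" would wrongly match "aba", where the head and the tail would have to share a character.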
Nguyễn Thái Ngọc Duy 5d74762d87 pathspec: do exact comparison on the leading non-wildcard part
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-26 11:12:51 -08:00
Nguyễn Thái Ngọc Duy 170260ae90 pathspec: save the non-wildcard length part
We mark pathspec with wildcards with the field use_wildcard. We
could do better by saving the length of the non-wildcard part, which
can be used for optimizations such as f9f6e2c (exclude: do strcmp as
much as possible before fnmatch - 2012-06-07).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-19 13:08:28 -08:00
Jeff King 5f836422ab Merge branch 'nd/attr-match-optim-more'
Start laying the foundation to build the "wildmatch" after we can
agree on its desired semantics.

* nd/attr-match-optim-more:
  attr: more matching optimizations from .gitignore
  gitignore: make pattern parsing code a separate function
  exclude: split pathname matching code into a separate function
  exclude: fix a bug in prefix compare optimization
  exclude: split basename matching code into a separate function
  exclude: stricten a length check in EXC_FLAG_ENDSWITH case
2012-11-09 12:42:25 -05:00
Nguyễn Thái Ngọc Duy 237ec6e40d Support "**" wildcard in .gitignore and .gitattributes
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:58:19 -07:00
Nguyễn Thái Ngọc Duy 82dce998c2 attr: more matching optimizations from .gitignore
.gitattributes and .gitignore share the same pattern syntax but have
separate matching implementations. Over the years, ignore's
implementation has accumulated more optimizations while attr's has
stayed the same.

This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:

* "!pattern" syntax is forbidden in .gitattributes.  As an attribute
  can be unset (i.e. set to a special value "false") or made back to
  unspecified (i.e. not even set to "false"), it is unclear which of
  the two "!pattern attr" would mean.

* we support attaching attributes to directories, but git-core
  internally does not currently make use of attributes on
  directories.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:17 -07:00
Nguyễn Thái Ngọc Duy 84460eec8d gitignore: make pattern parsing code a separate function
This function can later be reused by attr.c. Also turn to_exclude
field into a flag.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:16 -07:00
Nguyễn Thái Ngọc Duy b559263216 exclude: split pathname matching code into a separate function
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:16 -07:00
Nguyễn Thái Ngọc Duy a3ea4d7199 exclude: fix a bug in prefix compare optimization
When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.

Don't be too optimistic and let fnmatch() do the job.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:16 -07:00
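
The bug described in the entry above can be reproduced in a few lines. Both functions below are illustrative Python stand-ins, not the code from dir.c: `buggy_match` models the over-optimistic early return, `fixed_match` always lets the glob matcher decide on the remainder, and `prefix` is the length of the wildcard-free part of the pattern.

```python
from fnmatch import fnmatchcase


def buggy_match(name: str, pattern: str, prefix: int) -> bool:
    if name[:prefix] != pattern[:prefix]:
        return False
    if len(name) == prefix:
        # Too optimistic: the rest of the pattern is never looked at.
        return True
    return fnmatchcase(name[prefix:], pattern[prefix:])


def fixed_match(name: str, pattern: str, prefix: int) -> bool:
    if name[:prefix] != pattern[:prefix]:
        return False
    # Let the glob matcher decide, even when nothing is left of the name.
    return fnmatchcase(name[prefix:], pattern[prefix:])
```

With the pattern "three/a.3[abc]" (prefix length 9), the path "three/a.3" is accepted by the buggy version but correctly rejected by the fixed one, because "[abc]" still requires one more character.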
Nguyễn Thái Ngọc Duy 593cb8802e exclude: split basename matching code into a separate function
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:16 -07:00
Nguyễn Thái Ngọc Duy 692663303f exclude: stricten a length check in EXC_FLAG_ENDSWITH case
This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:16 -07:00
Junio C Hamano 68bdfd7cdc Merge commit 'f9f6e2c' into nd/attr-match-optim-more
* commit 'f9f6e2c':
  exclude: do strcmp as much as possible before fnmatch
  dir.c: get rid of the wildcard symbol set in no_wildcard()
  Unindent excluded_from_list()
2012-10-05 12:45:30 -07:00
Junio C Hamano 55b38a48e2 warn_on_inaccessible(): a helper to warn on inaccessible paths
The previous series introduced warnings to multiple places, but it
could become tiring to see the warning on the same path over and
over again during a single run of Git.  Making just one function
responsible for issuing this warning, we could later choose to keep
track of which paths we issued a warning (it would involve a hash
table of paths after running them through real_path() or something)
in order to reduce noise.

Right now we do not know if the noise reduction is necessary, but it
still would be a good code reduction/sharing anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-21 14:52:07 -07:00
Jeff King 6966073102 gitignore: report access errors of exclude files
When we try to access gitignore files, we check for their
existence with a call to "access". We silently ignore
missing files. However, if a file is not readable, this may
be a configuration error; let's warn the user.

For $GIT_DIR/info/exclude or core.excludesfile, we can just
use access_or_warn. However, for per-directory files we
actually try to open them, so we must add a custom warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-21 14:46:47 -07:00
Junio C Hamano 0d94427ef8 Merge branch 'mm/config-xdg'
Finishing touches to the XDG support (new feature for 1.7.12) and
tests.

* mm/config-xdg:
  t1306: check that XDG_CONFIG_HOME works
  ignore: make sure we have an xdg path before using it
  attr: make sure we have an xdg path before using it
  test-lib.sh: unset XDG_CONFIG_HOME
2012-07-25 15:47:05 -07:00
Matthieu Moy 6283a376c4 ignore: make sure we have an xdg path before using it
Commit e3ebc35 (config: fix several access(NULL) calls, 2012-07-12) was
fixing access(NULL) calls when trying to access $HOME/.config/git/config,
but missed the ones when trying to access $HOME/.config/git/ignore. Fix
and test this.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-24 08:59:07 -07:00
Junio C Hamano cd733f4f71 Merge branch 'jc/ls-files-i-dir' into maint
"git ls-files --exclude=t -i" did not consider anything under t/ as
excluded, as it did not pay attention to exclusion of leading paths
while walking the index.  The other two users of excluded() are also
updated.

* jc/ls-files-i-dir:
  dir.c: make excluded() file scope static
  unpack-trees.c: use path_excluded() in check_ok_to_remove()
  builtin/add.c: use path_excluded()
  path_excluded(): update API to less cache-entry centric
  ls-files -i: micro-optimize path_excluded()
  ls-files -i: pay attention to exclusion of leading paths
2012-07-11 12:44:35 -07:00
Junio C Hamano d02d7ac303 Merge branch 'mm/config-xdg'
Teach git to read various information from $XDG_CONFIG_HOME/git/ to allow
the user to avoid cluttering $HOME.

* mm/config-xdg:
  config: write to $XDG_CONFIG_HOME/git/config file when appropriate
  Let core.attributesfile default to $XDG_CONFIG_HOME/git/attributes
  Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore
  config: read (but not write) from $XDG_CONFIG_HOME/git/config file
2012-07-09 09:00:36 -07:00
Junio C Hamano 653111f99c Merge branch 'nd/exclude-workaround-top-heavy'
Attempt to optimize matching with an exclude pattern with a deep
directory hierarchy by taking the part that specifies leading path
without wildcard literally.
2012-06-28 15:19:57 -07:00
Huynh Khoi Nguyen Nguyen dc79687e0b Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore
To use the feature of core.excludesfile, the user needs:

 1. to create such a file,

 2. and add configuration variable to point at it.

Instead, we can make this a one-step process by choosing a default value
which points to a filename in the user's $HOME, that is unlikely to
already exist on the system, and only use the presence of the file as a
cue that the user wants to use that feature.

And we use "${XDG_CONFIG_HOME:-$HOME/.config}/git/ignore" as such a
file, in the same directory as the newly added configuration file
("${XDG_CONFIG_HOME:-$HOME/.config}/git/config").  The use of this
directory is in line with the XDG specification as a location to store
such application-specific files.

Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr>
Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr>
Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr>
Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr>
Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-25 09:06:15 -07:00
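
The default location described in the entry above can be computed as in the following sketch. It is a Python illustration under stated assumptions, not Git's implementation: the function name `default_excludes_file` is made up, an empty XDG_CONFIG_HOME is treated like an unset one, and None is returned when neither variable gives a usable base directory.

```python
import os


def default_excludes_file(environ=None):
    """Return the default ignore-file path, or None if it cannot be derived."""
    env = os.environ if environ is None else environ
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        base = xdg
    else:
        home = env.get("HOME")
        if not home:
            # Neither variable is usable, so there is no default path.
            return None
        base = os.path.join(home, ".config")
    return os.path.join(base, "git", "ignore")
```

The file is then used only if it actually exists, which is what makes the feature a one-step process for the user.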
Junio C Hamano 1966babf6e Merge branch 'jc/ls-files-i-dir'
"git ls-files --exclude=t -i" did not consider anything under t/
as excluded, as it did not pay attention to exclusion of leading
paths while walking the index.  The other two users of excluded() are
also updated.

* jc/ls-files-i-dir:
  dir.c: make excluded() file scope static
  unpack-trees.c: use path_excluded() in check_ok_to_remove()
  builtin/add.c: use path_excluded()
  path_excluded(): update API to less cache-entry centric
  ls-files -i: micro-optimize path_excluded()
  ls-files -i: pay attention to exclusion of leading paths
2012-06-21 14:42:07 -07:00
Nguyễn Thái Ngọc Duy f9f6e2ce26 exclude: do strcmp as much as possible before fnmatch
This also avoids calling fnmatch() if the non-wildcard prefix is
longer than the basename.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-07 11:33:38 -07:00
Nguyễn Thái Ngọc Duy fcd631ed84 dir.c: get rid of the wildcard symbol set in no_wildcard()
Elsewhere in this file is_glob_special() is also used to check for
wildcards, which is defined in ctype. Make no_wildcard() also use this
function (indirectly via simple_length())

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-07 11:33:37 -07:00
Junio C Hamano 0d316f0cef dir.c: make excluded() file scope static
Now that there are no longer any external callers of this interface,
we can make it static.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-05 22:26:12 -07:00
Junio C Hamano 782cd4c0f6 path_excluded(): update API to less cache-entry centric
It was stupid of me to make the API too much cache-entry specific;
the caller may want to check arbitrary pathname without having a
corresponding cache-entry to see if a path is ignored.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-05 21:22:36 -07:00
Junio C Hamano 93921b07e9 ls-files -i: micro-optimize path_excluded()
As we know a caller that does not recurse is calling us in the index
order, we can remember the last directory we found to be excluded
and see if the path we are looking at is still inside it, in which
case we can just answer that it is excluded.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-03 16:08:25 -07:00
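
The caching idea in the entry above can be sketched like this. The class below is a hypothetical Python model, not the C code: `is_excluded` is any callable that answers for a single path, `last_excluded_dir` plays the role of the remembered directory, and `calls` exists only to make the saving visible.

```python
class PathExcludeCheck:
    """Check paths in index order, remembering the last excluded directory."""

    def __init__(self, is_excluded):
        self.is_excluded = is_excluded  # callable: path -> bool
        self.last_excluded_dir = None
        self.calls = 0  # number of is_excluded() calls made so far

    def path_excluded(self, path: str) -> bool:
        cached = self.last_excluded_dir
        if cached is not None and path.startswith(cached + "/"):
            # Still inside the directory already known to be excluded.
            return True
        parts = path.split("/")
        for i in range(1, len(parts) + 1):
            leading = "/".join(parts[:i])
            self.calls += 1
            if self.is_excluded(leading):
                if i < len(parts):
                    # Only a leading directory is worth remembering.
                    self.last_excluded_dir = leading
                return True
        return False
```

Because the index is sorted, all entries under an excluded directory arrive consecutively, so a single remembered directory is enough to skip the repeated checks.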
Junio C Hamano eb41775ecc ls-files -i: pay attention to exclusion of leading paths
"git ls-files --exclude=t/ -i" does not show paths in directory t/
that have been added to the index, but it should.

The excluded() API was designed for callers who walk the tree from
the top, checking each level of the directory hierarchy as it
descends if it is excluded, and not even bothering to recurse into
an excluded directory.  This would allow us to optimize for a common
case by not having to check if the exclude pattern "foo/" matches
when looking at "foo/bar", because the caller should have noticed
that "foo" is excluded and did not even bother to read "foo/bar"
out of opendir()/readdir() to call it.

The code for "ls-files -i" however walks the index linearly, feeding
paths without checking if the leading directory is already excluded.

Introduce a helper function path_excluded() to let this caller
properly call excluded() check for higher hierarchies as necessary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-03 16:05:42 -07:00
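The leading-path check described in eb41775ecc can be sketched as below. This is an illustration under assumptions, not the helper that was added to dir.c: path_excluded_sketch(), the callback signature, and the toy predicate toy_excluded() (which treats only the path "t" as excluded) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for excluded(): only the exact path "t" is excluded. */
int toy_excluded(const char *path, size_t len)
{
	return len == 1 && path[0] == 't';
}

/*
 * Hypothetical sketch: a caller that walks paths linearly (like
 * "ls-files -i" walking the index) checks every leading directory
 * before checking the full path.
 */
int path_excluded_sketch(const char *path,
			 int (*excluded)(const char *, size_t))
{
	const char *slash;

	assert(path && excluded);
	for (slash = strchr(path, '/'); slash; slash = strchr(slash + 1, '/'))
		if (excluded(path, (size_t)(slash - path)))
			return 1;	/* a leading directory is excluded */
	return excluded(path, strlen(path));
}
```

The micro-optimization in 93921b07e9 would sit on top of this: because paths arrive in index order, the last excluded directory could be remembered and compared against the next path before running the loop again.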
Nguyễn Thái Ngọc Duy 35a94d44af Unindent excluded_from_list()
Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME
code block as this block is getting complex and may need more
indentation.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-29 10:58:54 -07:00
René Scharfe 2b189435f3 dir: simplify fill_directory()
Now that read_directory_recursive() (reached through read_directory())
respects the string length limit we provide, we don't need to create a
NUL-limited copy of the common prefix anymore.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-11 14:31:32 -07:00
René Scharfe 1528d247e5 dir: respect string length argument of read_directory_recursive()
A directory name is passed to read_directory_recursive() as a
length-limited string, through the parameters base and baselen.
Surprisingly, base must be a NUL-terminated string as well, as it is
passed to opendir(), ignoring baselen.

Fix this by postponing the call to opendir() until the length-limited
string is added to a strbuf, which provides a NUL in the right place.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-11 14:31:27 -07:00
Junio C Hamano bef369219a Merge branch 'rs/maint-dir-strbuf' into rs/dir-strbuf
By René Scharfe
* rs/maint-dir-strbuf:
  dir: convert to strbuf
2012-05-08 09:43:40 -07:00
René Scharfe 49dc2cc2c9 dir: convert to strbuf
The functions read_directory_recursive() and treat_leading_path() both
use buffers sized to fit PATH_MAX characters.  The latter can be made to
overrun its buffer, e.g. like this:

	$ a=0123456789abcdef
	$ a=$a$a$a$a$a$a$a$a
	$ a=$a$a$a$a$a$a$a$a
	$ a=$a$a$a$a$a$a$a$a
	$ git add $a/a

Instead of trying to add a check and potentially forgetting to address
similar cases, convert the involved functions and their helpers to use
struct strbuf.  The patch is surprisingly large because the helpers
treat_path() and treat_one_path() modify the buffer as well and thus need
to be converted, too.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-08 09:13:00 -07:00
Junio C Hamano ae2f203ef7 clean: preserve nested git worktree in subdirectories
remove_dir_recursively() has a check to avoid removing the directory it
was asked to remove without recursing into it and report success when the
directory is the top level of a working tree of a nested git repository,
to protect such a repository from "clean -f" (without double -f). If a
working tree of a nested git repository is in a subdirectory of a toplevel
project, however, this protection did not apply by mistake; we forgot to
pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal
codepath.

This requires us to also teach the higher level not to remove the
directory it is asked to remove, when the recursed invocation did not
remove the directory it was asked to remove due to a nested git
repository, as it is not an error to leave the parent directories of such
a nested repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-15 11:25:34 -07:00
Junio C Hamano c844a80356 remove_dir_recursively(): Add flag for skipping removal of toplevel dir
Add the REMOVE_DIR_KEEP_TOPLEVEL flag to remove_dir_recursively() for
deleting everything inside the given directory, but _not_ the given
directory itself.

Note that this does not pass the REMOVE_DIR_KEEP_NESTED_GIT flag, if set,
to the recursive invocations of remove_dir_recursively().  It is likely to
be a bug that has been present since REMOVE_DIR_KEEP_NESTED_GIT was
introduced (a0f4afb), but this commit keeps the same behaviour for now.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-15 11:12:25 -07:00
Nguyễn Thái Ngọc Duy 02cb67530e read_directory_recursive: reduce one indentation level
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-27 11:27:57 -07:00
Clemens Buchacher f950eb9560 rename pathspec_prefix() to common_prefix() and move to dir.[ch]
Also make common_prefix_len() static as this refactoring makes dir.c
itself the only caller of this helper function.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-12 14:38:32 -07:00
Junio C Hamano 4a085b16f4 consolidate pathspec_prefix and common_prefix
The implementation from pathspec_prefix (slightly modified) replaces the
current common_prefix, because it also respects glob characters.

Based on a patch by Clemens Buchacher.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-06 12:54:19 -07:00
Junio C Hamano 1273738f05 Merge branch 'nd/struct-pathspec'
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
2011-05-06 10:50:06 -07:00
Junio C Hamano c67e367c50 Merge branch 'nd/maint-setup'
* nd/maint-setup:
  Kill off get_relative_cwd()
  setup: return correct prefix if worktree is '/'

Conflicts:
	dir.c
	setup.c
2011-05-02 15:58:30 -07:00
Junio C Hamano 1de0746d84 Merge branch 'ar/clean-rmdir-empty'
* ar/clean-rmdir-empty:
  clean: unreadable directory may still be rmdir-able if it is empty
2011-04-27 11:36:41 -07:00
Junio C Hamano 33e0f62ba9 pathspec: rename per-item field has_wildcard to use_wildcard
As the point of the last change is to allow use of strings as
literals no matter what characters are in them, "has_wildcard"
does not match what we use this field for anymore.

It is used to decide if the wildcard matching should be used, so
rename it to match the usage better.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-05 09:30:36 -07:00
Alex Riesen 0235017eaf clean: unreadable directory may still be rmdir-able if it is empty
As a last ditch effort, try rmdir(2) when we cannot read the directory
to be removed.  It may be an empty directory that we can remove without
any permission, as long as we can modify its parent directory.

Noticed by Linus.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-01 11:16:21 -07:00
Nguyễn Thái Ngọc Duy b892913d51 Kill off get_relative_cwd()
Function dir_inside_of() does something similar (correctly), but looks
easier to understand and does not bundle cwd to its business. Given
get_relative_cwd's only user is is_inside_dir, we can kill it for
good.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-28 17:02:57 -07:00
Nguyễn Thái Ngọc Duy 9b125da490 setup: return correct prefix if worktree is '/'
The same old problem reappears after setup code is reworked.  We tend
to assume there is at least one path component in a path and forget
that path can be simply '/'.

Reported-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-28 17:01:15 -07:00
Carlos Martín Nieto e2a57aac8a Name make_*_path functions more accurately
Rename the make_*_path functions so it's clearer what they do, in
particular make clear what the difference between make_absolute_path and
make_nonrelative_path is by renaming them real_path and absolute_path
respectively. make_relative_path has an understandable name and is
renamed to relative_path to maintain the name convention.

The function calls have been replaced 1-to-1 in their usage.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-17 16:08:30 -07:00
Nguyễn Thái Ngọc Duy 61cf282045 pathspec: add match_pathspec_depth()
match_pathspec_depth() is a clone of match_pathspec() except that it
can take depth limit. Computation is a bit lighter compared to
match_pathspec() because it's usually precomputed and stored in struct
pathspec.

In the long term, match_pathspec() and match_one() should be removed in
favor of this function.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy d38f28093e tree_entry_interesting(): support wildcard matching
never_interesting optimization is disabled if there is any wildcard
pathspec, even if it only matches exactly on trees.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy 86e4ca69e3 tree_entry_interesting(): fix depth limit with overlapping pathspecs
Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth
limit 1. In current code, pathspecs are checked in input order. When
'a/b' is checked against pathspec 'a', it fails depth limit and
therefore is excluded, although it should match 'a/b' pathspec.

This patch reorders all pathspecs alphabetically, then teaches
tree_entry_interesting() to check against the deepest pathspec first,
so depth limit of a shallower pathspec won't affect a deeper one.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy bc96cc87db tree_entry_interesting(): support depth limit
This is needed to replace pathspec_matches() in builtin/grep.c.

max_depth == -1 means infinite depth. Depth limit is only effective
when pathspec.recursive == 1. When pathspec.recursive == 0, the
behavior depends on match functions: non-recursive for
tree_entry_interesting() and recursive for match_pathspec{,_depth}

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
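The depth-limit rule from bc96cc87db (max_depth == -1 means infinite depth) can be illustrated with a small slash-counting check. This is a sketch only: within_depth_sketch() and its parameters are names chosen for this example, and the handling of pathspec.recursive is left out.

```c
#include <assert.h>

/*
 * Hypothetical sketch: starting at "depth", count the '/' separators
 * in name[0..namelen) and report whether the entry stays within
 * max_depth.  max_depth == -1 means "no limit".
 */
int within_depth_sketch(const char *name, int namelen,
			int depth, int max_depth)
{
	const char *cp = name, *end = name + namelen;

	assert(name && namelen >= 0);
	if (max_depth == -1)
		return 1;	/* infinite depth */
	while (cp < end) {
		if (*cp++ != '/')
			continue;
		if (++depth > max_depth)
			return 0;
	}
	return 1;
}
```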
Nguyễn Thái Ngọc Duy 0602f3e916 Add struct pathspec
The old pathspec structure remains as pathspec.raw[]. New things are
stored in pathspec.items[]. There's no guarantee that the pathspec
order in raw[] is exactly as in items[].

raw[] is external (source) data and is untouched by pathspec
manipulation functions. It eases migration from old const char ** to
this new struct.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 12:19:19 -08:00
Junio C Hamano e39212ab08 Merge branch 'nd/maint-fix-add-typo-detection'
* nd/maint-fix-add-typo-detection:
  Revert "excluded_1(): support exclude files in index"
  unpack-trees: fix sparse checkout's "unable to match directories"
  unpack-trees: move all skip-worktree checks back to unpack_trees()
  dir.c: add free_excludes()
  cache.h: realign and use (1 << x) form for CE_* constants
2010-12-22 14:40:26 -08:00
Junio C Hamano 20cb8e2025 Merge branch 'nd/maint-relative'
* nd/maint-relative:
  get_cwd_relative(): do not misinterpret root path
2010-12-16 12:49:48 -08:00
Junio C Hamano 5e738ae820 Merge branch 'jj/icase-directory'
* jj/icase-directory:
  Support case folding in git fast-import when core.ignorecase=true
  Support case folding for git add when core.ignorecase=true
  Add case insensitivity support when using git ls-files
  Add case insensitivity support for directories when using git status
  Case insensitivity support for .gitignore via core.ignorecase
  Add string comparison functions that respect the ignore_case variable.
  Makefile & configure: add a NO_FNMATCH_CASEFOLD flag
  Makefile & configure: add a NO_FNMATCH flag

Conflicts:
	Makefile
	config.mak.in
	configure.ac
	fast-import.c
2010-12-03 16:10:34 -08:00
Nguyễn Thái Ngọc Duy 9e082734b3 Revert "excluded_1(): support exclude files in index"
This reverts commit c84de70781.
The commit provided a workaround for matching directories in
index. But it is no longer needed.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-30 17:30:28 -08:00
Nguyễn Thái Ngọc Duy 0fd0e2417d dir.c: add free_excludes()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 13:34:55 -08:00
Nguyễn Thái Ngọc Duy fbbb4e19be get_cwd_relative(): do not misinterpret root path
Commit 490544b (get_cwd_relative(): do not misinterpret suffix as
subdirectory) handles case where:

dir = "/path/work";
cwd = "/path/work-xyz";

When it comes to the end of get_cwd_relative(), dir is at '\0' and cwd
is at '-'. The rest of cwd, "-xyz", clearly cannot be the relative
path from dir to cwd. However there is another case where:

dir = "/";          /* or even "c:/" */
cwd = "/path/to/here";

In this special case, while *cwd == 'p', which is not a path
separator, the rest of cwd, "path/to/here", can be returned as a
relative path from dir to cwd.

Handle this case and make t1509 pass again.

Reported-by: Albert Strasheim <fullung@gmail.com>
Reported-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-23 16:15:16 -08:00
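The two cases discussed in fbbb4e19be and 490544b can be put side by side in a short sketch. This is not the code of get_cwd_relative(); relative_to_dir() is a hypothetical name, and only '/' is treated as a path separator here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return the part of cwd that lies below dir,
 * or NULL if cwd is neither dir itself nor inside it.
 */
const char *relative_to_dir(const char *dir, const char *cwd)
{
	const char *start = cwd;

	assert(dir && cwd);
	while (*dir && *dir == *cwd) {
		dir++;
		cwd++;
	}
	if (*dir)
		return NULL;	/* dir is not a prefix of cwd */
	if (*cwd == '\0')
		return cwd;	/* same directory: empty relative path */
	if (*cwd == '/')
		return cwd + 1;	/* "/path/work" vs "/path/work/sub" */
	if (cwd > start && cwd[-1] == '/')
		return cwd;	/* dir ended in '/', e.g. "/" or "c:/" */
	return NULL;		/* "/path/work" vs "/path/work-xyz" */
}
```

The last two branches are the ones the commit messages are about: a trailing suffix such as "-xyz" is rejected, while a dir that itself ends in a separator still yields a valid relative path.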
Nguyễn Thái Ngọc Duy ae3cdfe112 dir.c: fix EXC_FLAG_MUSTBEDIR match in sparse checkout
Commit c84de70 (excluded_1(): support exclude files in index -
2009-08-20) tries to work around the fact that there is no
directory/file information in index entries, therefore
EXC_FLAG_MUSTBEDIR match would fail.

Unfortunately the workaround is flawed. This fixes it.

Reported-by: Thomas Rinderknecht <thomasr@sailguy.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-08 11:39:23 -08:00
Joshua Jensen 21444f1805 Add case insensitivity support when using git ls-files
When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and
MyDir/fileb.txt is added, running git ls-files mydir only shows
mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt.
Running git ls-files mYdIR shows nothing.

With this patch running git ls-files for mydir, MyDir, and mYdIR shows
mydir/filea.txt and MyDir/fileb.txt.

Wildcards are not handled case insensitively in this patch. Example:
MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git
ls-files mydir/a* does not.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:59 -07:00
Joshua Jensen 5102c6173c Add case insensitivity support for directories when using git status
When using a case preserving but case insensitive file system, directory
case can differ but still refer to the same physical directory.  git
status reports the directory with the alternate case as an Untracked
file.  (That is, when mydir/filea.txt is added to the repository and
then the directory on disk is renamed from mydir/ to MyDir/, git status
shows MyDir/ as being untracked.)

Support has been added in name-hash.c for hashing directories with a
terminating slash into the name hash. When index_name_exists() is called
with a directory (a name with a terminating slash), the name is not
found via the normal cache_name_compare() call, but it is found in the
slow_same_name() function.

Additionally, in dir.c, directory_exists_in_index_icase() allows newly
added directories deeper in the directory chain to be identified.

Ultimately, it would be better if the file list was read in case
insensitive alphabetical order from disk, but this change seems to
suffice for now.

The end result is the directory is looked up in a case insensitive
manner and does not show in the Untracked files list.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
Joshua Jensen 10d4b02b99 Case insensitivity support for .gitignore via core.ignorecase
This is especially beneficial when using Windows and Perforce and the
git-p4 bridge. Internally, Perforce preserves a given file's full path
including its case at the time it was added to the Perforce repository.
When syncing a file down via Perforce, missing directories are created,
if necessary, using the case as stored with the filename. Unfortunately,
two files in the same directory can have differing cases for their
respective paths, such as /diRa/file1.c and /DirA/file2.c. Depending on
sync order, DirA/ may get created instead of diRa/.

It is possible to handle directory names in a case insensitive manner
without this patch, but it is highly inconvenient, requiring each
character to be specified like so: [Bb][Uu][Ii][Ll][Dd]. With this patch, the
gitignore exclusions honor the core.ignorecase=true configuration
setting and make the process less error prone. The above is specified
like so: Build

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
Joshua Jensen 8cf2a84e9d Add string comparison functions that respect the ignore_case variable.
Multiple locations within this patch series alter a case sensitive
string comparison call such as strcmp() to be a call to a string
comparison call that selects case comparison based on the global
ignore_case variable. Behaviorally, when core.ignorecase=false, the
*_icase() versions are functionally equivalent to their C runtime
counterparts.  When core.ignorecase=true, the *_icase() versions perform
a case insensitive comparison.

Like Linus' earlier ignorecase patch, these may ignore filename
conventions on certain file systems. By isolating filename comparisons
to certain functions, support for those filename conventions may be more
easily met.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
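The *_icase() wrappers described in 8cf2a84e9d can be sketched like this. The global ignore_case is named in the commit message; the exact set of wrappers and their signatures shown here are assumptions for illustration, and strcasecmp()/strncasecmp() are the POSIX functions from <strings.h>.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strcasecmp(), strncasecmp() (POSIX) */

/* Stand-in for the global that mirrors core.ignorecase. */
int ignore_case;

/* Case-sensitive unless ignore_case is set. */
int strcmp_icase(const char *a, const char *b)
{
	assert(a && b);
	return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

/* Length-limited variant with the same switch. */
int strncmp_icase(const char *a, const char *b, size_t count)
{
	assert(a && b);
	return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
}
```

With ignore_case unset the wrappers behave exactly like strcmp() and strncmp(), which matches the "functionally equivalent" claim in the commit message.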
Pat Notz 9d14017ada dir.c: squelch false uninitialized memory warning
GCC 4.4.4 on MacOS incorrectly warns about potential use of uninitialized memory.

Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-27 11:43:12 -07:00
Jens Lehmann 108da0db12 git add: Add the "--ignore-missing" option for the dry run
Sometimes it is useful to know if a file or directory will be ignored
before it is added to the work tree. An example is "git submodule add",
where it would be really nice to be able to fail with an appropriate
error message before the submodule is cloned and checked out.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-12 15:13:54 -07:00
Junio C Hamano 2c177a1ca1 Merge branch 'jc/maint-simpler-common-prefix'
* jc/maint-simpler-common-prefix:
  common_prefix: simplify and fix scanning for prefixes
2010-06-22 09:45:23 -07:00
Junio C Hamano 8d676d85f7 Merge branch 'gv/portable'
* gv/portable:
  test-lib: use DIFF definition from GIT-BUILD-OPTIONS
  build: propagate $DIFF to scripts
  Makefile: Tru64 portability fix
  Makefile: HP-UX 10.20 portability fixes
  Makefile: HPUX11 portability fixes
  Makefile: SunOS 5.6 portability fix
  inline declaration does not work on AIX
  Allow disabling "inline"
  Some platforms lack socklen_t type
  Make NO_{INET_NTOP,INET_PTON} configured independently
  Makefile: some platforms do not have hstrerror anywhere
  git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
  test_cmp: do not use "diff -u" on platforms that lack one
  fixup: do not unconditionally disable "diff -u"
  tests: use "test_cmp", not "diff", when verifying the result
  Do not use "diff" found on PATH while building and installing
  enums: omit trailing comma for portability
  Makefile: -lpthread may still be necessary when libc has only pthread stubs
  Rewrite dynamic structure initializations to runtime assignment
  Makefile: pass CPPFLAGS through to fllow customization

Conflicts:
	Makefile
	wt-status.h
2010-06-21 06:02:44 -07:00
Junio C Hamano 42f9852f3c common_prefix: simplify and fix scanning for prefixes
common_prefix() scans backwards from the far end of each 'next'
pathspec, starting from 'len', shortening the 'prefix' using 'path' as
a reference.

However, there is a small opportunity for an out-of-bounds access
because len is unconditionally set to prefix-1 after a "direct match"
test failed.  This means that if 'next' is shorter than prefix+2, we
read past it.

Instead of a minimal fix, simplify the loop: scan *forward* over the
'next' entry, remembering the last '/' where it matched the prefix
known so far.  This is far easier to read and also has the advantage
that we only scan over each entry once.

Acked-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-16 12:13:12 -07:00
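The forward scan described in 42f9852f3c can be sketched as below. This is a simplified illustration, not the function in dir.c: common_dir_prefix_len() is a hypothetical name, and glob characters in the pathspecs are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: length of the longest leading directory prefix
 * (including its trailing '/') shared by all n paths, or 0 if none.
 */
size_t common_dir_prefix_len(const char **paths, size_t n)
{
	const char *slash;
	size_t max, i;

	assert(paths);
	if (n == 0)
		return 0;
	/* Start with the directory part of the first path. */
	slash = strrchr(paths[0], '/');
	if (!slash)
		return 0;
	max = (size_t)(slash - paths[0]) + 1;

	for (i = 1; i < n && max > 0; i++) {
		size_t j, len = 0;

		/* Scan forward; never read past max or the end of paths[i]. */
		for (j = 0; j < max; j++) {
			char c = paths[i][j];

			if (c == '\0' || c != paths[0][j])
				break;
			if (c == '/')
				len = j + 1;	/* last '/' that still matched */
		}
		max = len;
	}
	return max;
}
```

Each entry is scanned once, and the loop stops at the entry's terminating NUL, so a short entry cannot cause the out-of-bounds read mentioned in the commit message.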
Gary V. Vaughan 4b05548fc0 enums: omit trailing comma for portability
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.

enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.

Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:27 -07:00
Clemens Buchacher 490544b128 get_cwd_relative(): do not misinterpret suffix as subdirectory
If the current working directory is the same as the work tree path
plus a suffix, e.g. 'work' and 'work-xyz', then the suffix '-xyz'
would be interpreted as a subdirectory of 'work'.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-28 15:02:50 -07:00
Junio C Hamano 4e7d08a229 Merge branch 'jk/maint-add-ignored-dir'
* jk/maint-add-ignored-dir:
  tests for "git add ignored-dir/file" without -f
  dir: fix COLLECT_IGNORED on excluded prefixes
  t0050: mark non-working test as such
2010-03-20 11:29:36 -07:00
Jeff King 29209cbe58 dir: fix COLLECT_IGNORED on excluded prefixes
As we walk the directory tree, if we see an ignored path, we
want to add it to the ignored list only if it matches any
pathspec that we were given. We used to check for the
pathspec to appear explicitly. E.g., if we see "subdir/file"
and it is excluded, we check to see if we have "subdir/file"
in our pathspec.

However, this interacts badly with the optimization to avoid
recursing into ignored subdirectories. If "subdir" as a
whole is ignored, then we never recurse, and consider only
whether "subdir" itself is in our pathspec.  It would not
match a pathspec of "subdir/file" explicitly, even though it
is the reason that subdir/file would be excluded.

This manifests itself to the user as "git add subdir/file"
failing to correctly note that the pathspec was ignored.

This patch extends the in_pathspec logic to include the prefix
directory case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-13 23:23:08 -08:00
Junio C Hamano 7c0be4da5c Merge branch 'jk/maint-rmdir-fix' into maint
* jk/maint-rmdir-fix:
  rm: fix bug in recursive subdirectory removal
2010-02-19 01:31:37 -08:00
Jeff King 3fc0d131c5 rm: fix bug in recursive subdirectory removal
If we remove a path in a/deep/subdirectory, we should try to
remove as many trailing components as possible (i.e.,
subdirectory, then deep, then a). However, the test for the
return value of rmdir was reversed, so we only ever deleted
at most one level.

The fix is in remove_path, so "apply" and "merge-recursive"
also are fixed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 22:22:22 -08:00
Nguyễn Thái Ngọc Duy 45d76f1718 Fix memory corruption when .gitignore does not end by \n
Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1())
tried not to append '\n' at the end because the next commit
may return a buffer that does not have extra space for that.

Unfortunately it left this assignment in the loop:

  buf[i - (i && buf[i-1] == '\r')] = 0;

that can corrupt memory if "buf" is not '\n' terminated. But even if
it does not corrupt memory, the last line would not be
NUL-terminated, leading to errors later inside add_exclude().

This patch fixes it by reverting the faulty commit and making
sure "buf" is always \n terminated.

While at it, free unused memory properly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-20 20:01:52 -08:00
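The invariant restored by 45d76f1718 can be illustrated with a small line splitter. This is a sketch under assumptions, not the code of add_excludes_from_file_1(): split_lines() is a hypothetical name, and the caller is assumed to provide one spare byte at buf[len] so that a missing final '\n' can be appended.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: split buf (len bytes, with one spare byte at
 * buf[len]) into NUL-terminated lines in place, stripping "\n" and
 * "\r\n".  Non-empty lines are stored in lines[]; returns their count.
 */
size_t split_lines(char *buf, size_t len, char **lines, size_t max_lines)
{
	char *entry = buf;
	size_t i, n = 0;

	assert(buf && lines);
	/* Make sure the last line ends in '\n' so the loop handles it. */
	if (len == 0 || buf[len - 1] != '\n')
		buf[len++] = '\n';

	for (i = 0; i < len; i++) {
		if (buf[i] != '\n')
			continue;
		/* Overwrite '\n', or the '\r' in front of it for CRLF. */
		buf[i - (i && buf[i - 1] == '\r')] = 0;
		if (entry[0] != '\0' && n < max_lines)
			lines[n++] = entry;
		entry = buf + i + 1;
	}
	return n;
}
```

Because the terminating assignment only runs when buf[i] is '\n', every stored line is NUL-terminated, including the last one in a file that lacks a trailing newline.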
Junio C Hamano 3af59e6f31 Merge branch 'jc/ls-files-ignored-pathspec'
* jc/ls-files-ignored-pathspec:
  ls-files: fix overeager pathspec optimization
  read_directory(): further split treat_path()
  read_directory_recursive(): refactor handling of a single path into a separate function
  t3001: test ls-files -o ignored/dir
2010-01-20 14:43:54 -08:00
Junio C Hamano 73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano 48ffef966c ls-files: fix overeager pathspec optimization
Given pathspecs that share a common prefix, ls-files optimized its call
into recursive directory reader by starting at the common prefix
directory.

If you have a directory "t" with an untracked file "t/junk" in it, but the
top-level .gitignore file told us to ignore "t/", this resulted in:

    $ git ls-files -o --exclude-standard
    $ git ls-files -o --exclude-standard t/
    t/junk
    $ git ls-files -o --exclude-standard t/junk
    t/junk
    $ cd t && git ls-files -o --exclude-standard
    junk

We could argue that you are overriding the ignore file by giving a
pathspec that matches or being in that directory, but it is somewhat
unexpected.  Worse yet, these behave differently:

    $ git ls-files -o --exclude-standard t/ .
    $ git ls-files -o --exclude-standard t/
    t/junk

This patch changes the optimization so that it notices when the common
prefix directory that it starts reading from is an ignored one.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:14:50 -08:00
Junio C Hamano 16e2cfa909 read_directory(): further split treat_path()
The next caller I'll be adding won't have an access to struct dirent
because it won't be reading from a directory stream.  Split the main
part of the function further into a separate function to make it usable
by a caller without passing a dirent as long as it knows what type is
feeding the function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:13:47 -08:00
Junio C Hamano 53cc5356fb read_directory_recursive(): refactor handling of a single path into a separate function
Primarily because I want to reuse it in a separate function later,
but this de-dents a huge function by one tabstop which by itself is
an improvement as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:13:47 -08:00
Nguyễn Thái Ngọc Duy cb09753423 dir.c: export excluded_1() and add_excludes_from_file_1()
These functions are used to handle .gitignore. They are now exported
so that sparse checkout can reuse them.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy c84de70781 excluded_1(): support exclude files in index
The index does not really have "directories"; attempts to match "foo/"
against the index will fail unless someone tries to reconstruct
directories from a list of files.

Observing that dtype in this function can never be NULL (otherwise
it would segfault), dtype NULL will be used to say "hey.. you are
matching against index" and behave properly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy c28b3d6e7b Read .gitignore from index if it is skip-worktree
This adds index as a prerequisite for directory listing (with
exclude).  At the moment directory listing is used by "git clean",
"git add", "git ls-files" and "git status"/"git commit" and
unpack_trees()-related commands.  These commands have been
checked/modified to populate index before doing directory listing.

add_excludes_from_file() does not enable this feature, because it
is used to read .git/info/exclude and some explicit files specified
by "git ls-files".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy b5041c5f3b Avoid writing to buffer in add_excludes_from_file_1()
In the next patch, the buffer that is being used within
add_excludes_from_file_1() comes from another function and does not
have extra space to put \n at the end.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Junio C Hamano a0f4afbe87 clean: require double -f options to nuke nested git repository and work tree
When you have an embedded git work tree in your work tree (be it
an orphaned submodule, or an independent checkout of an unrelated
project), "git clean -d -f" blindly descended into it and removed
everything.  This is rarely what the user wants.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 12:22:30 -07:00
Linus Torvalds 443e061a41 Avoid using 'lstat()' to figure out directories
If we have an up-to-date index entry for a file in that directory, we
can know that the directories leading up to that file must be
directories.  No need to do an lstat() on the directory.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 20:05:19 -07:00