262 commits

Author SHA1 Message Date
Joshua Jensen 21444f1805 Add case insensitivity support when using git ls-files
When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and
MyDir/fileb.txt is added, running git ls-files mydir only shows
mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt.
Running git ls-files mYdIR shows nothing.

With this patch running git ls-files for mydir, MyDir, and mYdIR shows
mydir/filea.txt and MyDir/fileb.txt.

Wildcards are not handled case insensitively in this patch. Example:
MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git
ls-files mydir/a* does not.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:59 -07:00
Joshua Jensen 5102c6173c Add case insensitivity support for directories when using git status
When using a case preserving but case insensitive file system, directory
case can differ but still refer to the same physical directory.  git
status reports the directory with the alternate case as an Untracked
file.  (That is, when mydir/filea.txt is added to the repository and
then the directory on disk is renamed from mydir/ to MyDir/, git status
shows MyDir/ as being untracked.)

Support has been added in name-hash.c for hashing directories with a
terminating slash into the name hash. When index_name_exists() is called
with a directory (a name with a terminating slash), the name is not
found via the normal cache_name_compare() call, but it is found in the
slow_same_name() function.

Additionally, in dir.c, directory_exists_in_index_icase() allows newly
added directories deeper in the directory chain to be identified.

Ultimately, it would be better if the file list was read in case
insensitive alphabetical order from disk, but this change seems to
suffice for now.

The end result is the directory is looked up in a case insensitive
manner and does not show in the Untracked files list.

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
Joshua Jensen 10d4b02b99 Case insensitivity support for .gitignore via core.ignorecase
This is especially beneficial when using Windows and Perforce and the
git-p4 bridge. Internally, Perforce preserves a given file's full path
including its case at the time it was added to the Perforce repository.
When syncing a file down via Perforce, missing directories are created,
if necessary, using the case as stored with the filename. Unfortunately,
two files in the same directory can have differing cases for their
respective paths, such as /diRa/file1.c and /DirA/file2.c. Depending on
sync order, DirA/ may get created instead of diRa/.

It is possible to handle directory names in a case insensitive manner
without this patch, but it is highly inconvenient, requiring each
character to be specified like so: [Bb][Uu][Ii][Ll][Dd]. With this patch, the
gitignore exclusions honor the core.ignorecase=true configuration
setting and make the process less error prone. The above is specified
like so: Build

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
Joshua Jensen 8cf2a84e9d Add string comparison functions that respect the ignore_case variable.
Multiple locations within this patch series alter a case sensitive
string comparison call such as strcmp() to be a call to a string
comparison call that selects case comparison based on the global
ignore_case variable. Behaviorally, when core.ignorecase=false, the
*_icase() versions are functionally equivalent to their C runtime
counterparts.  When core.ignorecase=true, the *_icase() versions perform
a case insensitive comparison.

Like Linus' earlier ignorecase patch, these may ignore filename
conventions on certain file systems. By isolating filename comparisons
to certain functions, support for those filename conventions may be more
easily met.
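
The comparison helpers described above can be sketched as follows. This is a minimal illustration, not the patch itself: the helper names follow the *_icase() naming used in the message, the static ignore_case variable is a stand-in for git's global (which is set from core.ignorecase), and the case-folding calls are the POSIX strcasecmp()/strncasecmp().

```c
#include <string.h>
#include <strings.h>	/* strcasecmp(), strncasecmp() (POSIX) */

/* Stand-in for git's global flag; assumed to mirror core.ignorecase. */
static int ignore_case = 0;

/* Behaves like strcmp() unless ignore_case is set. */
static int strcmp_icase(const char *a, const char *b)
{
	return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

/* Behaves like strncmp() unless ignore_case is set. */
static int strncmp_icase(const char *a, const char *b, size_t count)
{
	return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
}
```

Keeping the choice inside one pair of helpers means a later change to the folding rules (for file systems with other filename conventions) would touch only these functions.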

Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-06 11:19:58 -07:00
Pat Notz 9d14017ada dir.c: squelch false uninitialized memory warning
GCC 4.4.4 on MacOS incorrectly warns about potential use of uninitialized memory.

Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-27 11:43:12 -07:00
Jens Lehmann 108da0db12 git add: Add the "--ignore-missing" option for the dry run
Sometimes it is useful to know if a file or directory will be ignored
before it is added to the work tree. An example is "git submodule add",
where it would be really nice to be able to fail with an appropriate
error message before the submodule is cloned and checked out.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-12 15:13:54 -07:00
Junio C Hamano 2c177a1ca1 Merge branch 'jc/maint-simpler-common-prefix'
* jc/maint-simpler-common-prefix:
  common_prefix: simplify and fix scanning for prefixes
2010-06-22 09:45:23 -07:00
Junio C Hamano 8d676d85f7 Merge branch 'gv/portable'
* gv/portable:
  test-lib: use DIFF definition from GIT-BUILD-OPTIONS
  build: propagate $DIFF to scripts
  Makefile: Tru64 portability fix
  Makefile: HP-UX 10.20 portability fixes
  Makefile: HPUX11 portability fixes
  Makefile: SunOS 5.6 portability fix
  inline declaration does not work on AIX
  Allow disabling "inline"
  Some platforms lack socklen_t type
  Make NO_{INET_NTOP,INET_PTON} configured independently
  Makefile: some platforms do not have hstrerror anywhere
  git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
  test_cmp: do not use "diff -u" on platforms that lack one
  fixup: do not unconditionally disable "diff -u"
  tests: use "test_cmp", not "diff", when verifying the result
  Do not use "diff" found on PATH while building and installing
  enums: omit trailing comma for portability
  Makefile: -lpthread may still be necessary when libc has only pthread stubs
  Rewrite dynamic structure initializations to runtime assignment
  Makefile: pass CPPFLAGS through to fllow customization

Conflicts:
	Makefile
	wt-status.h
2010-06-21 06:02:44 -07:00
Junio C Hamano 42f9852f3c common_prefix: simplify and fix scanning for prefixes
common_prefix() scans backwards from the far end of each 'next'
pathspec, starting from 'len', shortening the 'prefix' using 'path' as
a reference.

However, there is a small opportunity for an out-of-bounds access
because len is unconditionally set to prefix-1 after a "direct match"
test failed.  This means that if 'next' is shorter than prefix+2, we
read past it.

Instead of a minimal fix, simplify the loop: scan *forward* over the
'next' entry, remembering the last '/' where it matched the prefix
known so far.  This is far easier to read and also has the advantage
that we only scan over each entry once.
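
A forward scan of this kind might look like the sketch below. The function name and signature are hypothetical and chosen for illustration; the real common_prefix() works on a pathspec array, whereas this compares a single 'next' entry against the reference path.

```c
#include <stddef.h>

/*
 * Return the length of the longest directory prefix (ending in '/')
 * that `next` shares with `path`, looking at no more than `max` bytes.
 * The loop stops at the first mismatch or at the end of `next`, so it
 * never reads past either string's terminator.
 */
static size_t shared_dir_prefix(const char *path, const char *next, size_t max)
{
	size_t i, len = 0;

	for (i = 0; i < max; i++) {
		char c = next[i];

		if (c == '\0' || c != path[i])
			break;
		if (c == '/')
			len = i + 1;	/* remember the last matching '/' */
	}
	return len;
}
```

Calling it once per pathspec entry, feeding the previous result back in as `max`, shrinks the prefix monotonically while visiting each byte at most once.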

Acked-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-16 12:13:12 -07:00
Gary V. Vaughan 4b05548fc0 enums: omit trailing comma for portability
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.

enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.

Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:27 -07:00
Clemens Buchacher 490544b128 get_cwd_relative(): do not misinterpret suffix as subdirectory
If the current working directory is the same as the work tree path
plus a suffix, e.g. 'work' and 'work-xyz', then the suffix '-xyz'
would be interpreted as a subdirectory of 'work'.
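
The check that avoids this can be sketched as below. The function name is hypothetical, and the sketch assumes `dir` is given without a trailing slash; it is an illustration of the boundary test, not the code from the patch.

```c
#include <string.h>

/*
 * Return the part of `cwd` below `dir`, "" if both name the same
 * directory, or NULL if `cwd` is not inside `dir`.
 */
static const char *path_inside(const char *dir, const char *cwd)
{
	size_t n = strlen(dir);

	if (strncmp(dir, cwd, n) != 0)
		return NULL;
	if (cwd[n] == '\0')
		return cwd + n;		/* same directory */
	if (cwd[n] == '/')
		return cwd + n + 1;	/* a real subdirectory */
	return NULL;			/* e.g. "work-xyz" only shares a prefix with "work" */
}
```

The key point is that a matching prefix is accepted only when the next character is the end of the string or a path separator.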

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-28 15:02:50 -07:00
Junio C Hamano 4e7d08a229 Merge branch 'jk/maint-add-ignored-dir'
* jk/maint-add-ignored-dir:
  tests for "git add ignored-dir/file" without -f
  dir: fix COLLECT_IGNORED on excluded prefixes
  t0050: mark non-working test as such
2010-03-20 11:29:36 -07:00
Jeff King 29209cbe58 dir: fix COLLECT_IGNORED on excluded prefixes
As we walk the directory tree, if we see an ignored path, we
want to add it to the ignored list only if it matches any
pathspec that we were given. We used to check for the
pathspec to appear explicitly. E.g., if we see "subdir/file"
and it is excluded, we check to see if we have "subdir/file"
in our pathspec.

However, this interacts badly with the optimization to avoid
recursing into ignored subdirectories. If "subdir" as a
whole is ignored, then we never recurse, and consider only
whether "subdir" itself is in our pathspec.  It would not
match a pathspec of "subdir/file" explicitly, even though it
is the reason that subdir/file would be excluded.

This manifests itself to the user as "git add subdir/file"
failing to correctly note that the pathspec was ignored.

This patch extends the in_pathspec logic to include the prefix
directory case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-13 23:23:08 -08:00
Junio C Hamano 7c0be4da5c Merge branch 'jk/maint-rmdir-fix' into maint
* jk/maint-rmdir-fix:
  rm: fix bug in recursive subdirectory removal
2010-02-19 01:31:37 -08:00
Jeff King 3fc0d131c5 rm: fix bug in recursive subdirectory removal
If we remove a path in a/deep/subdirectory, we should try to
remove as many trailing components as possible (i.e.,
subdirectory, then deep, then a). However, the test for the
return value of rmdir was reversed, so we only ever deleted
at most one level.

The fix is in remove_path, so "apply" and "merge-recursive"
also are fixed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 22:22:22 -08:00
Nguyễn Thái Ngọc Duy 45d76f1718 Fix memory corruption when .gitignore does not end by \n
Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1())
tried not to append '\n' at the end because the next commit
may return a buffer that does not have extra space for that.

Unfortunately it left this assignment in the loop:

  buf[i - (i && buf[i-1] == '\r')] = 0;

that can corrupt memory if "buf" is not '\n' terminated. But even if
it does not corrupt memory, the last line would not be
NUL-terminated, leading to errors later inside add_exclude().

This patch fixes it by reverting the faulty commit and making
sure "buf" is always \n terminated.

While at it, free unused memory properly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-20 20:01:52 -08:00
Junio C Hamano 3af59e6f31 Merge branch 'jc/ls-files-ignored-pathspec'
* jc/ls-files-ignored-pathspec:
  ls-files: fix overeager pathspec optimization
  read_directory(): further split treat_path()
  read_directory_recursive(): refactor handling of a single path into a separate function
  t3001: test ls-files -o ignored/dir
2010-01-20 14:43:54 -08:00
Junio C Hamano 73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano 48ffef966c ls-files: fix overeager pathspec optimization
Given pathspecs that share a common prefix, ls-files optimized its call
into recursive directory reader by starting at the common prefix
directory.

If you have a directory "t" with an untracked file "t/junk" in it, but the
top-level .gitignore file told us to ignore "t/", this resulted in:

    $ git ls-files -o --exclude-standard
    $ git ls-files -o --exclude-standard t/
    t/junk
    $ git ls-files -o --exclude-standard t/junk
    t/junk
    $ cd t && git ls-files -o --exclude-standard
    junk

We could argue that you are overriding the ignore file by giving a
pathspec that matches or being in that directory, but it is somewhat
unexpected.  Worse yet, these behave differently:

    $ git ls-files -o --exclude-standard t/ .
    $ git ls-files -o --exclude-standard t/
    t/junk

This patch changes the optimization so that it notices when the common
prefix directory that it starts reading from is an ignored one.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:14:50 -08:00
Junio C Hamano 16e2cfa909 read_directory(): further split treat_path()
The next caller I'll be adding won't have access to struct dirent
because it won't be reading from a directory stream.  Split the main
part of the function further into a separate function to make it usable
by a caller without passing a dirent as long as it knows what type is
feeding the function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:13:47 -08:00
Junio C Hamano 53cc5356fb read_directory_recursive(): refactor handling of a single path into a separate function
Primarily because I want to reuse it in a separate function later,
but this de-dents a huge function by one tabstop which by itself is
an improvement as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-08 23:13:47 -08:00
Nguyễn Thái Ngọc Duy cb09753423 dir.c: export excluded_1() and add_excludes_from_file_1()
These functions are used to handle .gitignore. They are now exported
so that sparse checkout can reuse them.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy c84de70781 excluded_1(): support exclude files in index
The index does not really have "directories", so attempts to match "foo/"
against the index will fail unless someone tries to reconstruct directories
from a list of files.

Observing that dtype in this function can never be NULL (otherwise
it would segfault), dtype NULL will be used to say "hey.. you are
matching against index" and behave properly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy c28b3d6e7b Read .gitignore from index if it is skip-worktree
This adds index as a prerequisite for directory listing (with
exclude).  At the moment directory listing is used by "git clean",
"git add", "git ls-files" and "git status"/"git commit" and
unpack_trees()-related commands.  These commands have been
checked/modified to populate index before doing directory listing.

add_excludes_from_file() does not enable this feature, because it
is used to read .git/info/exclude and some explicit files specified
by "git ls-files".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy b5041c5f3b Avoid writing to buffer in add_excludes_from_file_1()
In the next patch, the buffer that is being used within
add_excludes_from_file_1() comes from another function and does not
have extra space to put \n at the end.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Junio C Hamano a0f4afbe87 clean: require double -f options to nuke nested git repository and work tree
When you have an embedded git work tree in your work tree (be it
an orphaned submodule, or an independent checkout of an unrelated
project), "git clean -d -f" blindly descended into it and removed
everything.  This is rarely what the user wants.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 12:22:30 -07:00
Linus Torvalds 443e061a41 Avoid using 'lstat()' to figure out directories
If we have an up-to-date index entry for a file in that directory, we
can know that the directories leading up to that file must be
directories.  No need to do an lstat() on the directory.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 20:05:19 -07:00
Linus Torvalds caa6b7825a Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry
On filesystems without d_type, we can look at the cache entry first.
Doing an lstat() can be expensive.

Reported by Dmitry Potapov for Cygwin.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 01:11:30 -07:00
Linus Torvalds dba2e2037f Simplify read_directory[_recursive]() arguments
Stop the insanity with separate 'path' and 'base' arguments that must
match.  We don't need that crazy interface any more, since we cleaned up
handling of 'path' in commit da4b3e8c28.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 01:11:28 -07:00
Linus Torvalds 1d8842d921 Add 'fill_directory()' helper function for directory traversal
Most of the users of "read_directory()" actually want a much simpler
interface than the whole complex (but rather powerful) one.

In fact 'git add' had already largely abstracted out the core interface
issues into a private "fill_directory()" function that was largely
applicable almost as-is to a number of callers.  Yes, 'git add' wants to
do some extra work of its own, specific to the add semantics, but we can
easily split that out, and use the core as a generic function.

This function does exactly that, and now that much simplified
'fill_directory()' function can be shared with a number of callers,
while also ensuring that the rather more complex calling conventions of
read_directory() are used by fewer call-sites.

This also makes the 'common_prefix()' helper function private to dir.c,
since all callers are now in that file.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 01:11:26 -07:00
Thomas Rast d824cbba02 Convert existing die(..., strerror(errno)) to die_errno()
Change calls to die(..., strerror(errno)) to use the new die_errno().

In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Jeff King 6e4f981ffb git-add: no need for -f when resolving a conflict in already tracked path
When a path F that matches ignore pattern has a conflict, "git add F"
insisted the -f option be given, which did not make sense.  It would have
required -f when the path was originally added, but when resolving a
conflict, it already is tracked.

So this should work (and does):

  $ echo file >.gitignore
  $ echo content >file
  $ git add -f file ;# need -f because we are adding new path
  $ echo more content >>file
  $ git add file ;# don't need -f; it is not actually an "other" file

This is handled under the hood by the COLLECT_IGNORED option to
read_directory. When that code finds an ignored file, it checks the
index to make sure it is not actually a tracked file. However, the test
it uses does not take into account unmerged entries, and considers them
to still be ignored. "git ls-files" uses a more elaborate test and gets
the right answer and the same test should be used here.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 15:59:16 -07:00
Linus Torvalds da4b3e8c28 dir.c: clean up handling of 'path' parameter in read_directory_recursive()
Right now we pass two different pathnames ('path' and 'base') down to
read_directory_recursive(), and the only real reason for that is that we
want to allow an empty 'base' parameter, but when we do so, we need the
pathname to "opendir()" to be "." rather than the empty string.

And rather than handle that confusion in the caller, we can just fix
read_directory_recursive() to handle the case of an empty path itself,
by just passing opendir() a "." ourselves if the path is empty.

This would allow us to then drop one of the pathnames entirely from the
calling convention, but rather than do that, we'll start separating them
out as a "filesystem pathname" (the one we use for filesystem accesses)
and a "git internal base name" (which is the name that we use for git
internally).

That will eventually allow us to do things like handle different
encodings (eg the filesystem pathnames might be Latin1, while git itself
would use UTF-8 for filename information).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-16 22:41:46 -07:00
Junio C Hamano 8146f19762 Merge branch 'maint'
* maint:
  improve error message in config.c
  t4018-diff-funcname: add cpp xfuncname pattern to syntax test
  Work around BSD whose typeof(tv.tv_sec) != time_t
  git-am.txt: reword extra headers in message body
  git-am.txt: Use date or value instead of time or timestamp
  git-am.txt: add an 'a', say what 'it' is, simplify a sentence
  dir.c: Fix two minor grammatical errors in comments
  git-svn: fix a sloppy Getopt::Long usage
2009-05-05 22:52:17 -07:00
Junio C Hamano 41f64ad34b Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  dir.c: Fix two minor grammatical errors in comments
2009-05-05 22:51:31 -07:00
Allan Caffee 2c5b011503 dir.c: Fix two minor grammatical errors in comments
Signed-off-by: Allan Caffee <allan.caffee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-05 22:04:16 -07:00
Felipe Contreras 4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Junio C Hamano de2e3b04cd Merge branch 'mv/parseopt-ls-files'
* mv/parseopt-ls-files:
  ls-files: fix broken --no-empty-directory
  t3000: use test_cmp instead of diff
  parse-opt: migrate builtin-ls-files.
  Turn the flags in struct dir_struct into a single variable

Conflicts:
	builtin-ls-files.c
	t/t3000-ls-files-others.sh
2009-03-20 14:30:51 -07:00
Junio C Hamano a9bfe81309 Merge branch 'kb/checkout-optim'
* kb/checkout-optim:
  Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
  checkout bugfix: use stat.mtime instead of stat.ctime in two places
  Makefile: Set compiler switch for USE_NSEC
  Create USE_ST_TIMESPEC and turn it on for Darwin
  Not all systems use st_[cm]tim field for ns resolution file timestamp
  Record ns-timestamps if possible, but do not use it without USE_NSEC
  write_index(): update index_state->timestamp after flushing to disk
  verify_uptodate(): add ce_uptodate(ce) test
  make USE_NSEC work as expected
  fix compile error when USE_NSEC is defined
  check_updates(): effective removal of cache entries marked CE_REMOVE
  lstat_cache(): print a warning if doing ping-pong between cache types
  show_patch_diff(): remove a call to fstat()
  write_entry(): use fstat() instead of lstat() when file is open
  write_entry(): cleanup of some duplicated code
  create_directories(): remove some memcpy() and strchr() calls
  unlink_entry(): introduce schedule_dir_for_removal()
  lstat_cache(): swap func(length, string) into func(string, length)
  lstat_cache(): generalise longest_match_lstat_cache()
  lstat_cache(): small cleanup and optimisation
2009-03-17 18:54:31 -07:00
Junio C Hamano 5f7b338310 Merge branch 'fg/maint-exclude-bq' into maint
* fg/maint-exclude-bq:
  Support "\" in non-wildcard exclusion entries
2009-03-11 13:53:53 -07:00
Junio C Hamano 1456d964fa Merge branch 'fg/exclude-bq'
* fg/exclude-bq:
  Support "\" in non-wildcard exclusion entries
2009-03-05 15:41:39 -08:00
Johannes Schindelin 7c4c97c0ac Turn the flags in struct dir_struct into a single variable
By having flags represented as bits in the new member variable 'flags',
it will be easier to use parse_options when dir_struct is involved.
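
As a rough illustration of the idea, the sketch below packs several on/off options into one unsigned member. The enum and struct names are hypothetical and are not the ones used in dir.h.

```c
/* Hypothetical option bits; the names are illustrative only. */
enum dir_flag {
	OPT_SHOW_IGNORED = 1 << 0,
	OPT_SHOW_OTHER_DIRS = 1 << 1,
	OPT_HIDE_EMPTY_DIRS = 1 << 2
};

struct dir_options {
	unsigned int flags;	/* one bit per option instead of one int member each */
};

static void set_flag(struct dir_options *o, unsigned int bit)
{
	o->flags |= bit;
}

static void clear_flag(struct dir_options *o, unsigned int bit)
{
	o->flags &= ~bit;
}

static int has_flag(const struct dir_options *o, unsigned int bit)
{
	return (o->flags & bit) != 0;
}
```

With a single integer to point at, an option parser can set or clear a bit without needing a separate variable per option.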

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18 11:04:19 -08:00
Finn Arne Gangstad dd482eeac2 Support "\" in non-wildcard exclusion entries
"\" was treated differently in exclude rules depending on whether a
wildcard match was done. For wildcard rules, "\" was de-escaped in
fnmatch, but this was not done for other rules since they used strcmp
instead.  A file named "#foo" would not be excluded by "\#foo", but would
be excluded by "\#foo*".

We now treat all rules with "\" as wildcard rules.

Another solution could be to de-escape all non-wildcard rules as we
read them, but we would have to do the de-escaping exactly as fnmatch
does it to avoid inconsistencies.
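
The chosen approach can be sketched as follows: any rule containing a glob character or a backslash goes through fnmatch(), which de-escapes "\" itself, and everything else is compared literally. The function names are hypothetical; only the POSIX fnmatch() and the standard strpbrk()/strcmp() calls are real.

```c
#include <fnmatch.h>
#include <string.h>

/* A rule needs fnmatch() if it contains a glob character or a backslash. */
static int rule_needs_fnmatch(const char *rule)
{
	return strpbrk(rule, "*?[\\") != NULL;
}

/* Return non-zero if `name` is excluded by `rule`. */
static int rule_matches(const char *rule, const char *name)
{
	if (rule_needs_fnmatch(rule))
		return fnmatch(rule, name, 0) == 0;
	return strcmp(rule, name) == 0;
}
```

Under this sketch the rule "\#foo" matches a file named "#foo" the same way "\#foo*" does, which is the inconsistency the message describes.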

Signed-off-by: Finn Arne Gangstad <finnag@pvv.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-12 11:36:43 -08:00
Kjetil Barvik 571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Junio C Hamano d64d4835b8 Merge branch 'cb/add-pathspec'
* cb/add-pathspec:
  remove pathspec_match, use match_pathspec instead
  clean up pathspec matching
2009-01-25 17:13:11 -08:00
Junio C Hamano d9fde065bd Merge branch 'rs/ctype'
* rs/ctype:
  Add is_regex_special()
  Change NUL char handling of isspecial()
  Reformat ctype.c
  Add ctype test

Conflicts:
	Makefile
2009-01-21 16:51:03 -08:00
René Scharfe 8cc3299262 Change NUL char handling of isspecial()
Replace isspecial() by the new macro is_glob_special(), which is more,
well, specialized.  The former included the NUL char in its character
class, while the latter only includes characters that are special to
file name globbing.

The new name contains underscores because they enhance readability
considerably now that it's made up of three words.  Renaming the
function is necessary to document its changed scope.

The call sites of isspecial() are updated to check explicitly for NUL.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 18:30:37 -08:00
Clemens Buchacher 0b50922abf remove pathspec_match, use match_pathspec instead
Both versions have the same functionality. This removes any
redundancy.

This also makes two extensions to match_pathspec:

- If pathspec is NULL, return 1. This reflects the behavior of git
  commands, for which no paths usually means "match all paths".

- If seen is NULL, do not use it.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-14 19:18:44 -08:00
Clemens Buchacher 1c7c1d179e clean up pathspec matching
If pathspec already matched exactly, it cannot match any more.
Originally, we had to continue anyways, because we did not
differentiate between exact, recursive and globbing matches.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-14 19:18:37 -08:00
Alexander Potashev 55892d2398 Allow cloning to an existing empty directory
The die() message is updated accordingly.

The previous behaviour was to only allow cloning when the destination
directory doesn't exist.

[jc: added trivial tests]

Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 13:26:29 -08:00
Alexander Potashev 8ca12c0d62 add is_dot_or_dotdot inline function
A new inline function is_dot_or_dotdot is used to check if the
directory name is either "." or "..". It returns a non-zero value if
the given string is "." or "..". It's applicable to a lot of Git
source code.
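
A helper with this behaviour could be written as in the sketch below; it follows the description above and is not copied from the patch.

```c
/* Return non-zero if `name` is exactly "." or "..". */
static inline int is_dot_or_dotdot(const char *name)
{
	return name[0] == '.' &&
	       (name[1] == '\0' ||
		(name[1] == '.' && name[2] == '\0'));
}
```

A typical use is skipping these two entries while iterating over readdir() results.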

Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 13:21:57 -08:00
Nanako Shiraishi 159b321270 dir.c: make dir_add_name() and dir_add_ignored() static
These functions are not used by any other file.

Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 17:46:09 -07:00
Shawn O. Pearce 5a139ba483 Merge branch 'maint' into bc/master-diff-hunk-header-fix
* maint: (41 commits)
  Clarify commit error message for unmerged files
  Use strchrnul() instead of strchr() plus manual workaround
  Use remove_path from dir.c instead of own implementation
  Add remove_path: a function to remove as much as possible of a path
  git-submodule: Fix "Unable to checkout" for the initial 'update'
  Clarify how the user can satisfy stash's 'dirty state' check.
  Remove empty directories in recursive merge
  Documentation: clarify the details of overriding LESS via core.pager
  Update release notes for 1.6.0.3
  checkout: Do not show local changes when in quiet mode
  for-each-ref: Fix --format=%(subject) for log message without newlines
  git-stash.sh: don't default to refs/stash if invalid ref supplied
  maint: check return of split_cmdline to avoid bad config strings
  builtin-prune.c: prune temporary packs in <object_dir>/pack directory
  Do not perform cross-directory renames when creating packs
  Use dashless git commands in setgitperms.perl
  git-remote: do not use user input in a printf format string
  make "git remote" report multiple URLs
  Start draft release notes for 1.6.0.3
  git-repack uses --no-repack-object, not --no-repack-delta.
  ...

Conflicts:
	RelNotes
2008-09-29 10:52:34 -07:00
Alex Riesen 4a92d1bfb7 Add remove_path: a function to remove as much as possible of a path
The function has two potential users, both of which managed to get their
implementations wrong (the one in builtin-rm.c has a memleak, and
builtin-merge-recursive.c scribbles over its const argument).

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-29 08:37:07 -07:00
Brandon Casey 63e8aea74e dir.c: Avoid c99 array initialization
The following syntax:

        char foo[] = {
                [0] = 1,
                [7] = 2,
                [15] = 3
        };

is a c99 construct which some compilers do not support even though they
support other c99 constructs. This construct can be avoided by folding
these 'special' test cases into the sane_ctype array and making use of
the related infrastructure.
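
As a rough illustration of the portability concern (this is not the
change made to dir.c, which folds the cases into sane_ctype), the table
above can also be filled in with plain assignments that pre-C99
compilers accept. The names foo_table and init_foo_table are
hypothetical:

```c
#include <string.h>

/* Hypothetical table equivalent to { [0] = 1, [7] = 2, [15] = 3 }. */
static char foo_table[16];

/* Fill the table without C99 designated initializers. */
static void init_foo_table(void)
{
	memset(foo_table, 0, sizeof(foo_table));
	foo_table[0] = 1;
	foo_table[7] = 2;
	foo_table[15] = 3;
}
```

The cost of this variant is that init_foo_table() has to run once
before the table is read.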

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-28 21:49:51 -07:00
Junio C Hamano 4a871de896 Merge branch 'jc/add-stop-at-symlink'
* jc/add-stop-at-symlink:
  add: refuse to add working tree items beyond symlinks
  update-index: refuse to add working tree items beyond symlinks
2008-08-20 23:42:18 -07:00
Kevin Ballard ea335b56d4 Fix escaping of glob special characters in pathspecs
match_one implements an optimized pathspec match where it only uses
fnmatch if it detects glob special characters in the pattern. Unfortunately
it didn't treat \ as a special character, so attempts to escape a glob
special character would fail even though fnmatch() supports it.
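
A minimal sketch of the idea, assuming a simplified matcher that
compares whole strings (the real match_one also handles leading
directory prefixes). The function names are hypothetical; fnmatch() is
the POSIX call:

```c
#include <fnmatch.h>
#include <string.h>

/* Return 1 if c means something to fnmatch(), including the escape '\'. */
static int is_glob_special_sketch(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

/* Return 1 on a match; fall back to fnmatch() only when it is needed. */
static int match_one_sketch(const char *pattern, const char *name)
{
	const char *p;

	for (p = pattern; *p; p++)
		if (is_glob_special_sketch(*p))
			return fnmatch(pattern, name, 0) == 0;

	/* No special characters at all: a plain comparison is enough. */
	return strcmp(pattern, name) == 0;
}
```

Without '\\' in the list, a pattern that contains only an escaped
character would take the strcmp() path and never match the unescaped
name.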

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13 17:11:03 -07:00
Junio C Hamano 725b06050a add: refuse to add working tree items beyond symlinks
This is the same fix for the issue of adding "sym/path" when "sym" is a
symbolic link that points at a directory "dir" with "path" in it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-04 23:31:23 -07:00
Junio C Hamano 380a742679 Merge branch 'lt/case-insensitive'
* lt/case-insensitive:
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filesystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-10 18:14:28 -07:00
Linus Torvalds 88ea8112b4 Optimize match_pathspec() to avoid fnmatch()
"git add *" is actually fundamentally different from "git add .", and
yeah, you should generally use the latter.

The reason? The argument list is actually something different from what
you think it is. For git, it's a "pathspec", so what actually happens is
that in *both* cases, it will really traverse the whole tree, and then
match every file it finds against the pathspec.

So think of the arguments not as a file list, but as a random bunch of
patterns to match against the files you have!

Which is why the cost is actually approximately O(n*m), where "n" is the
size of the working tree, and "m" is the number of pathspecs.

So the reason "git add ." is fast is actually that "m" in that case is
just 1 (just one trivial pattern), and then "git add *" is slow because
"m" is large (lots of complicated patterns). In both cases, 'n' is the
same (== the whole set of files in your working tree).

Anyway, here's a trivial patch that doesn't change this fundamental fact,
but that avoids doing anything *expensive* until we've done some cheap
initial tests. It may or may not help your test-case, but it's pretty
simple and it matches the other git optimizations in this area (ie
"conceptually handle the general case, but optimize the simple cases where
we can exit early")

Notice how this patch doesn't actually change the fundamental O(n*m)
behaviour, but it makes it much cheaper by generally avoiding the
expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not
needed.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-26 17:48:17 -07:00
Shawn Bohrer f2d0df7148 git clean: Don't automatically remove directories when run within subdirectory
When git clean is run from a subdirectory it should follow the normal
policy and only remove directories if they are passed in as a pathspec,
or -d is specified.

The fix is to send len which could be shorter than ent->len because we
have stripped the trailing '/' that read_directory adds. Additionally,
match_one() was modified to allow a name[] that is not NUL terminated.
This allows us to check if the name matched the pathspec exactly
instead of recursively.

Signed-off-by: Shawn Bohrer <shawn.bohrer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-14 23:14:58 -07:00
Linus Torvalds 0a9b88b7de Add 'core.ignorecase' option
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds cd2fef59ed Make hash_name_lookup able to do case-independent lookups
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Jim Meyering 8e0f70033b Avoid unnecessary "if-before-free" tests.
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:

        if (some_expression)
                free (some_expression);

with the now-equivalent:

        free (some_expression);

It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:

    http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html

FYI, the change below was prepared by running the following:

  git ls-files -z | xargs -0 \
  perl -0x3b -pi -e \
    's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'

Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }".  But that's ok, since there were none like
that in git sources.

Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code.  That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this

  if (x)
    free (x);
  else
    foo ();

into this:

  free (x);
  else
    foo ();

There were none of those here, either.

If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
 option to make it detect free-like functions with different names]

  http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free

Addendum:
  Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 14:14:40 -08:00
Junio C Hamano 987e315a6b Merge branch 'jc/gitignore-ends-with-slash'
* jc/gitignore-ends-with-slash:
  gitignore: lazily find dtype
  gitignore(5): Allow "foo/" in ignore list to match directory "foo"
2008-02-16 17:57:06 -08:00
Junio C Hamano 6831a88ac0 gitignore: lazily find dtype
When we process "foo/" entries in gitignore files on a system
that does not have d_type member in "struct dirent", the earlier
implementation ran lstat(2) separately when matching with
entries that came from the command line, in-tree .gitignore
files, and $GIT_DIR/info/excludes file.

This optimizes it by delaying the lstat(2) call until it becomes
absolutely necessary.

The initial idea for this change was by Jeff King, but I
optimized it further to pass pointers around.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00
Junio C Hamano d6b8fc303b gitignore(5): Allow "foo/" in ignore list to match directory "foo"
A pattern "foo/" in the exclude list did not match directory
"foo", but a pattern "foo" did.  This attempts to extend the
exclude mechanism so that "foo/" matches the directory "foo"
while still not matching a regular file or a symbolic link "foo".
In order to differentiate between a directory and a non-directory,
this passes down the type of the path being checked to the
excluded() function.
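
The intended rule can be sketched as follows, assuming literal patterns
only (no globbing, no leading directories). The function name and its
is_dir parameter are hypothetical stand-ins for the type that is passed
down:

```c
#include <string.h>

/*
 * Return 1 if the pattern excludes `name`.  A pattern that ends in '/'
 * only applies when `is_dir` is nonzero.
 */
static int excluded_sketch(const char *pattern, const char *name, int is_dir)
{
	size_t len = strlen(pattern);

	if (len > 0 && pattern[len - 1] == '/') {
		if (!is_dir)
			return 0;
		len--; /* compare without the trailing slash */
	}
	return strlen(name) == len && strncmp(pattern, name, len) == 0;
}
```

With this rule "foo/" excludes a directory named "foo" but leaves a
regular file or symbolic link of the same name alone, while a plain
"foo" still excludes both.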

A downside is that the recursive directory walk may need to run
lstat(2) more often on systems whose "struct dirent" does not give
the type of the entry; earlier it did not have to do so for an
excluded path, but we now need to figure out if a path is a
directory before deciding to exclude it.  This is especially bad
because an idea similar to the earlier CE_UPTODATE optimization
to reduce number of lstat(2) calls would by definition not apply
to the codepaths involved, as (1) directories will not be
registered in the index, and (2) excluded paths will not be in
the index anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00
Linus Torvalds cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" become much much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except populate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.
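
To make that last point concrete, here is a minimal sketch of a hash
that folds case before mixing each byte.  The function name and the
constants are illustrative assumptions, not necessarily what name
hashing in git ends up using:

```c
#include <ctype.h>

/* Hash a name by the upper-case value of each byte. */
static unsigned int hash_name_icase_sketch(const char *name)
{
	unsigned int hash = 0x123;

	while (*name) {
		unsigned char c = (unsigned char)toupper((unsigned char)*name++);
		hash = hash * 101 + c;
	}
	return hash;
}
```

Names that differ only in ASCII case land in the same bucket, so a
case-insensitive comparison is only needed among the entries of that
bucket, while the stored names stay exactly as they were added.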

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
Linus Torvalds 7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.
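
A small sketch of the conversion, assuming a hypothetical two-field
entry (the real cache_entry has many more fields).  The struct and
function names are made up for illustration; ntohl() and htonl() are
the standard byte-order helpers:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical on-disk record: fields stored in network byte order. */
struct ondisk_entry_sketch {
	uint32_t mode;
	uint32_t size;
};

/* Hypothetical in-core record: host byte order plus a temporary flag. */
struct incore_entry_sketch {
	uint32_t mode;
	uint32_t size;
	unsigned int temp_flag; /* has no slot in the on-disk format */
};

/* Convert once when reading, so later accesses need no ntohl(). */
static void entry_from_disk(struct incore_entry_sketch *dst,
			    const struct ondisk_entry_sketch *src)
{
	dst->mode = ntohl(src->mode);
	dst->size = ntohl(src->size);
	dst->temp_flag = 0;
}

/* Convert once when writing; in-core-only flags are simply dropped. */
static void entry_to_disk(struct ondisk_entry_sketch *dst,
			  const struct incore_entry_sketch *src)
{
	dst->mode = htonl(src->mode);
	dst->size = htonl(src->size);
}
```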

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
李鸿 6ba78238a8 Fix a memory leak
Signed-off-by: Li Hong <leehong@pku.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-16 12:50:08 -08:00
Junio C Hamano 31cbb5d961 Merge branch 'kh/commit'
* kh/commit: (33 commits)
  git-commit --allow-empty
  git-commit: Allow to amend a merge commit that does not change the tree
  quote_path: fix collapsing of relative paths
  Make git status usage say git status instead of git commit
  Fix --signoff in builtin-commit differently.
  git-commit: clean up die messages
  Do not generate full commit log message if it is not going to be used
  Remove git-status from list of scripts as it is builtin
  Fix off-by-one error when truncating the diff out of the commit message.
  builtin-commit.c: export GIT_INDEX_FILE for launch_editor as well.
  Add a few more tests for git-commit
  builtin-commit: Include the diff in the commit message when verbose.
  builtin-commit: fix partial-commit support
  Fix add_files_to_cache() to take pathspec, not user specified list of files
  Export three helper functions from ls-files
  builtin-commit: run commit-msg hook with correct message file
  builtin-commit: do not color status output shown in the message template
  file_exists(): dangling symlinks do exist
  Replace "runstatus" with "status" in the tests
  t7501-commit: Add test for git commit <file> with dirty index.
  ...
2007-12-04 17:16:33 -08:00
Junio C Hamano 63d285c849 per-directory-exclude: lazily read .gitignore files
Operations that walk directories or trees, which potentially need to
consult the .gitignore files, used to always try to open the .gitignore
file every time they entered a new directory, even when they ended up
not needing to call excluded() function to see if a path in the
directory is ignored.  This was done by push/pop exclude_per_directory()
functions that managed the data in a stack.

This changes the directory walking API to remove the need to call these
two functions.  Instead, the directory walk data structure caches the
data used by excluded() function the last time, and lazily reuses it as
much as possible.  Among the data the last check used, the ones from
deeper directories that the path we are checking is outside are
discarded, data from the common leading directories are reused, and then
the directories between the common directory and the directory the path
being checked is in are checked for .gitignore file.  This is very
similar to the way gitattributes are handled.

This API change also fixes "ls-files -c -i", which called excluded()
without setting up the gitignore data via the old push/pop functions.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-29 02:19:14 -08:00
Junio C Hamano 686a4a06b6 dir.c: minor clean-up
Replace handcrafted reallocation with ALLOC_GROW().
Reindent "file_exists()" helper function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-29 01:11:46 -08:00
Junio C Hamano a50f9fc5fe file_exists(): dangling symlinks do exist
This function is used to see if a path given by the user does exist
on the filesystem.  A symbolic link that does not point anywhere does
exist but running stat() on it would yield an error, and it incorrectly
said it does not exist.
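
The difference can be sketched like this, assuming the fix amounts to
asking about the link itself rather than its target.  The function name
is hypothetical:

```c
#include <sys/stat.h>

/* Return 1 if the path itself exists, even if it is a dangling symlink. */
static int file_exists_sketch(const char *path)
{
	struct stat sb;

	/* lstat() inspects the link itself; stat() would follow it and fail. */
	return lstat(path, &sb) == 0;
}
```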

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-22 17:05:04 -08:00
Junio C Hamano 41a7aa588f Fix per-directory exclude handling for "git add"
In "dir_struct", each exclusion element in the exclusion stack records a
base string (pointer to the beginning with length) so that we can tell
where it came from, but this pointer is just pointing at the parameter
that is given by the caller to the push_exclude_per_directory()
function.

While read_directory_recursive() runs, calls to excluded() make use of
the data in the exclusion elements, including this base string.  The
caller of read_directory_recursive() is not supposed to free the
buffer it gave to push_exclude_per_directory() earlier, until it
returns.

The test case Bruce Stephens gave in the mailing list discussion
was simplified and added to the t3700 test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-16 01:16:22 -08:00
Junio C Hamano 039bc64e88 core.excludesfile clean-up
There are inconsistencies in the way commands currently handle
the core.excludesfile configuration variable.  The problem is
the variable is too new to be noticed by anything other than
git-add and git-status.

 * git-ls-files does not notice any of the "ignore" files by
   default, as it predates the standardized set of ignore files.
   The calling scripts established the convention to use
   .git/info/exclude, .gitignore, and later core.excludesfile.

 * git-add and git-status know about it because they call
   add_excludes_from_file() directly with their own notion of
   which standard set of ignore files to use.  This is just a
   stupid duplication of code that needs to be updated every time
   the definition of the standard set of ignore files is
   changed.

 * git-read-tree takes --exclude-per-directory=<gitignore>,
   not because the flexibility was needed.  Again, this was
   because the option predates the standardization of the ignore
   files.

 * git-merge-recursive uses hardcoded per-directory .gitignore
   and nothing else.  git-clean (scripted version) does not
   honor core.* because its call to underlying ls-files does not
   know about it.  git-clean in C (parked in 'pu') doesn't either.

We probably could change git-ls-files to use the standard set
when no excludes are specified on the command line and ignore
processing was asked, or something like that, but that will be a
change in semantics and might break people's scripts in a subtle
way.  I am somewhat reluctant to make such a change.

On the other hand, I think it makes perfect sense to fix
git-read-tree, git-merge-recursive and git-clean to follow the
same rule as other commands.  I cannot think of a valid use case
to give an exclude-per-directory that is nonstandard to
read-tree command, outside a "negative" test in the t1004 test
script.

This patch is the first step to untangle this mess.

The next step would be to teach read-tree, merge-recursive and
clean (in C) to use setup_standard_excludes().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:08:04 -08:00
Junio C Hamano f3fa183802 Style: place opening brace of a function definition at column 1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-08 15:35:32 -08:00
Lars Knoll 68492fc73b Speedup scanning for excluded files.
Try to avoid a lot of work scanning for excluded files,
by caching some more information when setting up the exclusion
data structure.

Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and
reduces the number of instructions executed (as measured by valgrind) by a
factor of 2.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-29 17:03:11 -07:00
Junio C Hamano d90a7fda35 Merge branch 'db/fetch-pack'
* db/fetch-pack: (60 commits)
  Define compat version of mkdtemp for systems lacking it
  Avoid scary errors about tagged trees/blobs during git-fetch
  fetch: if not fetching from default remote, ignore default merge
  Support 'push --dry-run' for http transport
  Support 'push --dry-run' for rsync transport
  Fix 'push --all branch...' error handling
  Fix compilation when NO_CURL is defined
  Added a test for fetching remote tags when there is not tags.
  Fix a crash in ls-remote when refspec expands into nothing
  Remove duplicate ref matches in fetch
  Restore default verbosity for http fetches.
  fetch/push: readd rsync support
  Introduce remove_dir_recursively()
  bundle transport: fix an alloc_ref() call
  Allow abbreviations in the first refspec to be merged
  Prevent send-pack from segfaulting when a branch doesn't match
  Cleanup unnecessary break in remote.c
  Cleanup style nit of 'x == NULL' in remote.c
  Fix memory leaks when disconnecting transport instances
  Ensure builtin-fetch honors {fetch,transfer}.unpackLimit
  ...
2007-10-24 21:59:50 -07:00
Linus Torvalds 07134421fc Fix directory scanner to correctly ignore files without d_type
On Fri, 19 Oct 2007, Todd T. Fries wrote:
> If DT_UNKNOWN exists, then we have to do a stat() of some form to
> find out the right type.

That happened in the case of a pathname that was ignored, and we did
not ask for "dir->show_ignored". That test used to be *together*
with the "DTYPE(de) != DT_DIR", but splitting the two tests up
means that we can do that (common) test before we even bother to
calculate the real dtype.

Of course, that optimization only matters for systems that don't
have, or don't fill in DTYPE properly.

I also clarified the real relationship between "exclude" and
"dir->show_ignored". It used to do

	if (exclude != dir->show_ignored) {
		..

which wasn't exactly obvious, because it triggers for two different
cases:

 - the path is marked excluded, but we are not interested in ignored
   files: ignore it

 - the path is *not* excluded, but we *are* interested in ignored
   files: ignore it unless it's a directory (in which case we might
   have ignored files inside the directory and need to recurse
   into it).

so this splits them into those two cases, since the first case
doesn't even care about the type.

I also made the DT_UNKNOWN case a separate helper function,
and added some commentary to the cases.
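
The two cases can be sketched as a small decision function.  The enum
and function names are hypothetical, and is_dir is passed in directly
here, whereas the point of the split is that real code only has to work
out the type in the second case:

```c
/* Possible outcomes for one directory entry in this sketch. */
enum entry_action { ENTRY_SKIP, ENTRY_RECURSE_ONLY, ENTRY_KEEP };

static enum entry_action classify_entry(int excluded, int show_ignored,
					int is_dir)
{
	/* Case 1: excluded and ignored files are not wanted; type is irrelevant. */
	if (excluded && !show_ignored)
		return ENTRY_SKIP;

	/* Case 2: not excluded, but only ignored files are wanted. */
	if (!excluded && show_ignored)
		return is_dir ? ENTRY_RECURSE_ONLY : ENTRY_SKIP;

	/* Otherwise the entry is one the caller asked for. */
	return ENTRY_KEEP;
}
```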

		Linus

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:44:40 -04:00
Johannes Schindelin 7155b727c9 Introduce remove_dir_recursively()
There was a function called remove_empty_dir_recursive() buried
in refs.c.  Expose a slightly enhanced version in dir.h: it can now
optionally remove a non-empty directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-30 00:04:39 -07:00
Johannes Schindelin 420acb31ac get_relative_cwd(): clarify why it handles dir == NULL
The comment did not make a good case why it makes sense.
Clarify, and remove stale comment about the caller being lazy.
The behaviour on NULL input is pretty much intentional.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 11:34:13 -07:00
Johannes Schindelin e663674722 Add functions get_relative_cwd() and is_inside_dir()
The function get_relative_cwd() works just like getcwd(), except that it
takes an absolute path as an additional parameter, returning the prefix
of the current working directory relative to the given path.  If the
cwd is not a subdirectory of the given path, it returns NULL.

is_inside_dir() is just a trivial wrapper over get_relative_cwd().
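
The path arithmetic involved can be sketched as below.  Unlike the real
functions, this hypothetical version takes the cwd as a parameter
instead of calling getcwd(), and it assumes both paths are absolute and
have no trailing slash:

```c
#include <string.h>

/*
 * Return the part of `cwd` below `dir` ("" if they are equal), or NULL
 * if `cwd` is not inside `dir`.
 */
static const char *relative_to_sketch(const char *cwd, const char *dir)
{
	size_t len = strlen(dir);

	if (strncmp(cwd, dir, len) != 0)
		return NULL;
	if (cwd[len] == '\0')
		return cwd + len;     /* same directory */
	if (cwd[len] == '/')
		return cwd + len + 1; /* strict subdirectory */
	return NULL;                  /* "/repo2" is not inside "/repo" */
}

/* Trivial wrapper, in the spirit of is_inside_dir(). */
static int is_inside_sketch(const char *cwd, const char *dir)
{
	return relative_to_sketch(cwd, dir) != NULL;
}
```

The check on the character after the prefix is what keeps a sibling
directory with a longer name from being mistaken for a subdirectory.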

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:30 -07:00
Jeff King 25fd2f7a31 Fix ALLOC_GROW calls with obsolete semantics
ALLOC_GROW now expects the 'nr' argument to be "how much you
want" and not "how much you have". This fixes all cases
where we weren't previously adding anything to the 'nr'.
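
For reference, a sketch of a macro with the "how much you want"
semantics.  The _sketch names and the growth formula are assumptions
made for this example rather than a copy of the macro in git, and it
aborts on allocation failure for brevity:

```c
#include <stdlib.h>

/* Growth policy for this sketch: roughly 1.5x plus some headroom. */
#define alloc_nr_sketch(x) (((x) + 16) * 3 / 2)

/*
 * Make sure array `x` can hold at least `nr` elements, where `nr` is
 * the number of elements wanted, not the number currently stored.
 */
#define ALLOC_GROW_SKETCH(x, nr, alloc) \
	do { \
		if ((nr) > (alloc)) { \
			if (alloc_nr_sketch(alloc) < (nr)) \
				(alloc) = (nr); \
			else \
				(alloc) = alloc_nr_sketch(alloc); \
			(x) = realloc((x), (alloc) * sizeof(*(x))); \
			if (!(x)) \
				abort(); \
		} \
	} while (0)
```

A caller about to append one element therefore writes
ALLOC_GROW_SKETCH(items, nr + 1, alloc) and then stores into
items[nr++]; passing plain nr is the obsolete usage this commit fixes.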

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-16 18:00:07 -07:00
Jeff King e96980ef81 builtin-add: simplify (and increase accuracy of) exclude handling
Previously, the code would always set up the excludes, and then manually
pick through the pathspec we were given, assuming that non-added but
existing paths were just ignored. This was mostly correct, but would
erroneously mark a totally empty directory as 'ignored'.

Instead, we now use the collect_ignored option of dir_struct, which
unambiguously tells us whether a path was ignored. This simplifies the
code, and means empty directories are now just not mentioned at all.

Furthermore, we now conditionally ask dir_struct to respect excludes,
depending on whether the '-f' flag has been set. This means we don't have
to pick through the result, checking for an 'ignored' flag; ignored entries
were either added or not in the first place.

We can safely get rid of the special 'ignored' flags to dir_entry, which
were not used anywhere else.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 00:41:52 -07:00
Jeff King 2abd31b078 dir_struct: add collect_ignored option
When set, this option will cause read_directory to keep
track of which entries were ignored. While this shouldn't
effect functionality in most cases, it can make warning
messages to the user much more useful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 00:41:52 -07:00
Jeff King 6815e56933 refactor dir_add_name
This is in preparation for keeping two entry lists in the
dir object.

This patch adds and uses the ALLOC_GROW() macro, which
implements the commonly used idiom of growing a dynamic
array using the alloc_nr function (not just in dir.c, but
everywhere).

We also move creation of a dir_entry to dir_entry_new.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-12 23:00:31 -07:00
Martin Waitz 302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Michael Spang b991625611 dir.c: Omit non-excluded directories with dir->show_ignored
This makes "git-ls-files --others --directory --ignored" behave
as documented and consequently also fixes "git-clean -d -X".
Previously, git-clean would remove non-excluded directories
even when using the -X option.

Signed-off-by: Michael Spang <mspang@uwaterloo.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-07 15:29:29 -07:00
Junio C Hamano 520d7e278c Merge branch 'maint'
* maint:
  Documentation/git-reset.txt: suggest git commit --amend in example.
  Build RPM with ETC_GITCONFIG=/etc/gitconfig
  Ignore all man sections as they are generated files.
  Fix typo in git-am: s/Was is/Was it/
  Reverse the order of -b and --track in the man page.
  dir.c(common_prefix): Fix two bugs

Conflicts:

	git.spec.in
2007-04-24 00:08:16 -07:00
Johannes Schindelin c7f34c180b dir.c(common_prefix): Fix two bugs
The function common_prefix() is used to find the common subdirectory of
a couple of pathnames. When checking if the next pathname matches up with
the prefix, it incorrectly checked the whole path, not just the prefix
(including the slash). Thus, the expensive part of the loop was always
executed.

The other bug is more serious: if the first and the last pathname in the
list have a longer common prefix than the common prefix for _all_ pathnames
in the list, the longer one would be chosen. This bug was probably hidden
by the fact that bash's wildcard expansion sorts the results, and the code
just so happens to work with sorted input.
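
A sketch of a prefix computation that does not depend on input order.
This is an independent illustration with a hypothetical name, not the
patched common_prefix(); it shrinks a candidate prefix until every
pathname starts with it:

```c
#include <string.h>

/*
 * Return the length of the longest leading directory (including the
 * trailing '/') shared by all `n` paths, or 0 if there is none.
 */
static size_t common_dir_prefix_len(const char **paths, int n)
{
	size_t len = 0, i;
	int k;

	if (n <= 0)
		return 0;

	/* Start with the directory part of the first path. */
	for (i = 0; paths[0][i]; i++)
		if (paths[0][i] == '/')
			len = i + 1;

	/* Shrink the candidate until every path starts with it. */
	for (k = 1; k < n && len > 0; k++) {
		while (len > 0 && strncmp(paths[0], paths[k], len) != 0) {
			/* Drop the last directory component and retry. */
			len--;
			while (len > 0 && paths[0][len - 1] != '/')
				len--;
		}
	}
	return len;
}
```

Because the candidate only ever gets shorter, a pathname that matched a
longer prefix still matches the final one, so one pass over the list is
enough even when the first and last entries share more than the rest.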

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-23 01:44:00 -07:00
Linus Torvalds ab22aed3b7 Don't show gitlink directories when we want "other" files
When "show_other_directories" is set, that implies that we are looking
for untracked files, which obviously means that we should ignore
directories that are marked as gitlinks in the index.

This fixes "git status" in a superproject, that would otherwise always
report that subprojects were "Untracked files:"

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-12 16:23:25 -07:00
Linus Torvalds 095952585c Teach directory traversal about subprojects
This is the promised cleaned-up version of teaching directory traversal
(ie the "read_directory()" logic) about subprojects. That teaches "git add"
to add/update subprojects.

It now knows to look at the index file to see if a directory is marked as
a subproject, and uses that information to decide whether it should be
recursed into or not.

It also generally cleans up the handling of directory entries when
traversing the working tree, by splitting up the decision-making process
into small functions of their own, and adding a fair number of comments.

Finally, it teaches "add_file_to_cache()" that directory names can have
slashes at the end, since the directory traversal adds them to make the
difference between a file and a directory clear (it always did that, but
my previous too-ugly-to-apply subproject patch had a totally different
path for subproject directories and avoided the slash for that case).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 19:09:55 -07:00
Linus Torvalds 5d5cea67af Avoid overflowing name buffer in deep directory structures
This just makes sure that when we do a read_directory(), we check
that the filename fits in the buffer we allocated (with a bit of
slop)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-09 22:30:05 -07:00
Linus Torvalds 9fc42d6091 Optimize directory listing with pathspec limiter.
The way things are set up, you can now pass a "pathspec" to the
"read_directory()" function. If you pass NULL, it acts exactly
like it used to do (read everything). If you pass a non-NULL
pointer, it will simplify it into a "these are the prefixes
without any special characters", and stop any readdir() early if
the path in question doesn't match any of the prefixes.

NOTE! This does *not* obviate the need for the caller to do the *exact*
pathspec match later. It's a first-level filter on "read_directory()", but
it does not do the full pathspec thing. Maybe it should. But in the
meantime, builtin-add.c really does need to do first

	read_directory(dir, .., pathspec);
	if (pathspec)
		prune_directory(dir, pathspec, baselen);

ie the "prune_directory()" part will do the *exact* pathspec pruning,
while the "read_directory()" will use the pathspec just to do some quick
high-level pruning of the directories it will recurse into.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-31 17:41:32 -07:00
Shawn O. Pearce dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
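The wrap detection that the commit above leaves for the future could look roughly like the following. This is a sketch under stated assumptions, not git's xsize_t(): it takes an int64_t instead of off_t so that it behaves the same on every platform, it reports failure through a return code rather than dying, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a file length fit in an unsigned size type
 * whose maximum value is `size_max`?  Passing UINT32_MAX models a
 * platform where sizeof(size_t) == 4.
 */
static int sketch_offset_fits(int64_t len, uint64_t size_max)
{
	return len >= 0 && (uint64_t)len <= size_max;
}

/*
 * Hypothetical checked conversion: stores the converted value in *out
 * and returns 0, or returns -1 if the value would wrap on this platform.
 */
static int sketch_checked_size(int64_t len, size_t *out)
{
	assert(out != NULL);
	if (!sketch_offset_fits(len, SIZE_MAX))
		return -1;
	*out = (size_t)len;
	return 0;
}
```

On a platform with a 64-bit size_t the range check only rejects negative lengths; the 4 GiB case is caught only where SIZE_MAX is 2^32-1, which is the situation the commit message describes.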
Andy Whitcroft 93d26e4cb9 short i/o: fix calls to read to use xread or read_in_full
We have a number of badly checked read() calls.  Often we expect
read() either to read exactly the size we requested or to fail; this
fails to handle interrupts and short reads.  Add a read_in_full()
providing those semantics.  Otherwise we need, at a minimum, to check
for EINTR and EAGAIN; where this is appropriate, use xread().

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
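The two behaviours named in the commit above can be illustrated as follows. This is a minimal sketch of the idea, not the implementation in git's wrapper code: the `sketch_` names are hypothetical, and the simple retry loop would spin on a non-blocking descriptor that keeps returning EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Retry read() when it is interrupted or would block (the xread() idea). */
static ssize_t sketch_xread(int fd, void *buf, size_t len)
{
	ssize_t nr;

	for (;;) {
		nr = read(fd, buf, len);
		if (nr < 0 && (errno == EINTR || errno == EAGAIN))
			continue;
		return nr;
	}
}

/*
 * Keep reading until `len` bytes have arrived, EOF is reached, or a
 * real error occurs.  Returns the number of bytes read, or -1 on error.
 */
static ssize_t sketch_read_in_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t total = 0;

	assert(buf != NULL);
	while (total < len) {
		ssize_t nr = sketch_xread(fd, p + total, len - total);

		if (nr < 0)
			return -1;
		if (nr == 0)
			break;	/* EOF before the buffer was filled */
		total += (size_t)nr;
	}
	return (ssize_t)total;
}
```

A caller that expects exactly `len` bytes compares the return value against `len`; a smaller non-negative result means the input ended early.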
Junio C Hamano 4d06f8ac43 Fix 'git add' with .gitignore
When '*.ig' is ignored, and you have two files f.ig and d.ig/foo
in the working tree,

	$ git add .

correctly ignored f.ig but failed to ignore d.ig/foo.  This was
caused by a thinko in an earlier commit 4888c534, when we tried
to allow adding otherwise ignored files.

After reverting that commit, this takes a much simpler approach.
When we have an unmatched pathspec that talks about an existing
pathname, we know it is an ignored path the user tried to add,
so we include it in the set of paths directory walker returned.

This does not let you say "git add -f D" on an ignored directory
D and add everything under D.  People can submit a patch to
further allow it if they want to, but I think it is a saner
behaviour to require explicit paths to be spelled out in such a
case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:01:31 -08:00
Junio C Hamano c889763bf3 Revert "read_directory: show_both option."
This reverts commit 4888c53409.
2006-12-29 10:08:19 -08:00
Junio C Hamano 4888c53409 read_directory: show_both option.
This teaches the internal read_directory() routine to return
both interesting and ignored pathnames.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-25 03:29:08 -08:00
Junio C Hamano e813d50e35 match_pathspec() -- return how well the spec matched
This updates the return value from match_pathspec() so that the
caller can distinguish between an exact match, a leading pathname
match (i.e. file "foo/bar" matches a pathspec "foo"), and a
filename glob match.  This can be used to prevent "rm dir" from
removing "dir/file" without explicitly asking for recursive
behaviour with -r flag, for example.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-25 03:29:08 -08:00
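A graded match result of the kind the commit above describes might be sketched like this. The enum names, their numeric values, and the single-pathspec helper are all hypothetical; the real match_pathspec() works on a list of pathspecs and is not reproduced here.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical result codes; the names and values are illustrative only. */
enum sketch_match {
	SKETCH_NO_MATCH = 0,
	SKETCH_MATCH_LEADING,	/* "foo" matches "foo/bar" */
	SKETCH_MATCH_GLOB,	/* "*.c" matches "dir.c" */
	SKETCH_MATCH_EXACT	/* "foo" matches "foo" */
};

/* Classify how one pathspec matches one pathname. */
static enum sketch_match sketch_match_one(const char *spec, const char *name)
{
	size_t len = strlen(spec);

	assert(name != NULL);
	if (!strncmp(spec, name, len)) {
		if (name[len] == '\0')
			return SKETCH_MATCH_EXACT;
		/* Only a match that ends on a directory boundary counts. */
		if (len == 0 || spec[len - 1] == '/' || name[len] == '/')
			return SKETCH_MATCH_LEADING;
	}
	if (!fnmatch(spec, name, 0))
		return SKETCH_MATCH_GLOB;
	return SKETCH_NO_MATCH;
}
```

A command such as "rm" could then refuse to act on a SKETCH_MATCH_LEADING result unless the user asked for recursion, which is the use case the commit message gives.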
Junio C Hamano 85023577a8 simplify inclusion of system header files.
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header files that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 09:51:35 -08:00
Junio C Hamano f8a9d42872 read-tree: further loosen "working file will be lost" check.
This follows up commit ed93b449 where we removed overcautious
"working file will be lost" check.

A new option "--exclude-per-directory=.gitignore" can be used to
tell the "git-read-tree" command that the user does not mind
losing contents in untracked files in the working tree, if they
need to be overwritten by a merge (either a two-way "switch
branches" merge, or a three-way merge).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-05 23:25:52 -08:00
Johannes Schindelin 07ccbff89b runstatus: do not recurse into subdirectories if not needed
This speeds up git-status when an untracked subdirectory contains
a huge number of files.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 21:36:54 -07:00
Jeff King c91f0d92ef git-commit.sh: convert run_status to a C builtin
This creates a new git-runstatus which should do roughly the same thing
as the run_status function from git-commit.sh. Except for color support,
the main focus has been to keep the output identical, so that it can be
verified as correct and then used as a C platform for other improvements to
the status printing code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-08 16:46:35 -07:00
Jonas Fonseca c470701a98 Use fstat instead of fseek
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-27 20:49:35 -07:00
Jonas Fonseca 83572c1a91 Use xrealloc instead of realloc
Change places that use realloc, without a proper error path, to instead use
xrealloc. Drop an erroneous error path in the daemon code that used errno
in the die message in favour of the simpler xrealloc.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 17:54:06 -07:00
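The pattern behind the commit above is a realloc() wrapper that never returns NULL, so callers without an error path cannot silently continue with a lost buffer. The following is a sketch of that pattern, not git's xrealloc(): the error message, the exit code, and the `sketch_grow()` caller are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reallocate or terminate; callers never see a NULL return. */
static void *sketch_xrealloc(void *ptr, size_t size)
{
	/* Ask for at least one byte, since realloc(ptr, 0) is not portable. */
	void *ret = realloc(ptr, size ? size : 1);

	if (!ret) {
		fprintf(stderr, "fatal: out of memory, realloc failed\n");
		exit(128);	/* exit code chosen for this sketch */
	}
	return ret;
}

/* Hypothetical caller: grow an int array to hold at least `needed` items. */
static int *sketch_grow(int *items, size_t *alloc, size_t needed)
{
	if (needed > *alloc) {
		*alloc = needed * 2;
		items = sketch_xrealloc(items, *alloc * sizeof(*items));
	}
	assert(*alloc >= needed);
	return items;
}
```

Because the wrapper terminates on failure, `sketch_grow()` can assign the result straight back to `items` without the usual temporary pointer and NULL check.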
Jonas Fonseca 095c424d08 Use PATH_MAX instead of MAXPATHLEN
According to sys/param.h, MAXPATHLEN is a "BSD name" for values defined
in <limits.h>. Besides, PATH_MAX seems to be more commonly used.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 17:52:58 -07:00
Pavel Roskin a9486b02ec Avoid C99 comments, use old-style C comments instead.
This doesn't make the code uglier or harder to read, yet it makes the
code more portable.  This also simplifies checking for other potential
incompatibilities.  "gcc -std=c89 -pedantic" can flag many incompatible
constructs as warnings, but C99 comments will cause it to emit an error.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-10 00:47:13 -07:00
Linus Torvalds 3c6a370b0e Move pathspec matching from builtin-add.c into dir.c
I'll use it for builtin-rm.c too.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-19 16:14:50 -07:00
Linus Torvalds b4189aa848 Clean up git-ls-file directory walking library interface
This moves the code to add the per-directory ignore files for the base
directory into the library routine.

That not only allows us to turn the function push_exclude_per_directory()
static again, it also simplifies the library interface a lot (the caller
no longer needs to worry about any of the per-directory exclude files at
all).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-17 01:56:55 -07:00
Linus Torvalds 453ec4bdf4 libify git-ls-files directory traversal
This moves the core directory traversal and filename exclusion logic
into the general git library, making it available for other users
directly.

If we ever want to do "git commit" or "git add" as a built-in (and we
do), we want to be able to handle most of git-ls-files as a library.

NOTE! Not all of git-ls-files is libified by this.  The index matching
and pathspec prefix calculation are still in ls-files.c, but this is a
big part of it.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-17 01:56:40 -07:00