Commit graph

88 commits

Author SHA1 Message Date
Jeff King eedd5949f5 Merge branch 'jk/submodule-name-verify-fix' into jk/submodule-name-verify-fsck
* jk/submodule-name-verify-fix:
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_path: drop clever fallthrough
  skip_prefix: add icase-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  path: match NTFS short names for more .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths

Note that this includes two bits of evil-merge:

 - there's a new call to verify_path() that doesn't actually
   have a mode available. It should be OK to pass "0" here,
   since we're just manipulating the untracked cache, not an
   actual index entry.

 - the lstat() in builtin/update-index.c:update_one() needs
   to be updated to handle the fsmonitor case (without this
   it still behaves correctly, but does an unnecessary
   lstat).
2018-05-21 23:54:28 -04:00
Jeff King 10ecfa7649 verify_path: disallow symlinks in .gitmodules
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:

  1. It won't be portable to systems without symlinks.

  2. It may behave inconsistently, since Git may look at
     this file in the index or a tree without bothering to
     resolve any symbolic links. We don't do this _yet_, but
     the config infrastructure is there and it's planned for
     the future.

With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:

  a. A symlinked .gitmodules file may circumvent any fsck
     checks of the content.

  b. Git may read and write from the on-disk file without
     sanity checking the symlink target. So for example, if
     you link ".gitmodules" to "../oops" and run "git
     submodule add", we'll write to the file "oops" outside
     the repository.

Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.

Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Jeff King eb12dd0c76 update-index: stat updated files earlier
In the update_one(), we check verify_path() on the proposed
path before doing anything else. In preparation for having
verify_path() look at the file mode, let's stat the file
earlier, so we can check the mode accurately.

This is made a bit trickier by the fact that this function
only does an lstat in a few code paths (the ones that flow
down through process_path()). So we can speculatively do the
lstat() here and pass the results down, and just use a dummy
mode for cases where we won't actually be updating the index
from the filesystem.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Junio C Hamano e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
brian m. carlson a98e6101f0 refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
brian m. carlson 34c290a6fc refs: convert read_ref and read_ref_full to object_id
All but two of the call sites already have parameters using the hash
parameter of struct object_id, so convert them to take a pointer to the
struct directly.  Also convert refs_read_refs_full, the underlying
implementation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:50 +09:00
Ben Peart 9d406cba45 update-index: add fsmonitor support to update-index
Add support in update-index to manually add/remove the fsmonitor
extension via --[no-]fsmonitor flags.

Add support in update-index to manually set/clear the fsmonitor
valid bit via --[no-]fsmonitor-valid flags.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Ben Peart 883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Ben Peart 7c545be9a1 update-index: add a new --force-write-index option
At times, it makes sense to avoid the cost of writing out the index
when the only changes can easily be recomputed on demand. This causes
problems when trying to write test cases to verify that state as they
can't guarantee the state has been persisted to disk.

Add a new option (--force-write-index) to update-index that will
ensure the index is written out even if the cache_changed flag is not
set.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:39:40 +09:00
Junio C Hamano 09595ab381 Merge branch 'jk/leak-checkers'
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives.  Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.

* jk/leak-checkers:
  add UNLEAK annotation for reducing leak false positives
  set_git_dir: handle feeding gitdir to itself
  repository: free fields before overwriting them
  reset: free allocated tree buffers
  reset: make tree counting less confusing
  config: plug user_config leak
  update-index: fix cache entry leak in add_one_file()
  add: free leaked pathspec after add_files_to_cache()
  test-lib: set LSAN_OPTIONS to abort by default
  test-lib: --valgrind should not override --verbose-log
2017-09-19 10:47:55 +09:00
Jeff King baddc96b2c update-index: fix cache entry leak in add_one_file()
When we fail to add the cache entry to the index, we end up
just leaking the struct. We should follow the pattern of the
early-return above and free it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King bfffb48c5d stop leaking lock structs in some simple cases
Now that it's safe to declare a "struct lock_file" on the
stack, we can do so (and avoid an intentional leak). These
leaks were found by running t0000 and t0001 under valgrind
(though certainly other similar leaks exist and just don't
happen to be exercised by those tests).

Initializing the lock_file's inner tempfile with NULL is not
strictly necessary in these cases, but it's a good practice
to model.  It means that if we were to call a function like
rollback_lock_file() on a lock that was never taken in the
first place, it becomes a quiet noop (rather than undefined
behavior).

Likewise, it's always safe to rollback_lock_file() on a file
that has already been committed or deleted, since that
operation is a noop on an inactive lockfile (and that's why
the case in config.c can drop the "if (lock)" check as we
move away from using a pointer).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Patryk Obara 98e019b067 sha1_file: convert index_path to struct object_id
Convert all remaining callers as well.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-20 21:51:38 -07:00
Junio C Hamano f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Brandon Williams b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano 93dd544f54 Merge branch 'jc/noent-notdir'
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it.  We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).

The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious).  Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.

* jc/noent-notdir:
  treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
  compat-util: is_missing_file_error()
2017-06-13 13:47:07 -07:00
Junio C Hamano c7054209d6 treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from

  $ git grep -e ENOENT --and -e ENOTDIR

There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both.  Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-30 09:29:00 +09:00
Junio C Hamano 77a24b7dc4 Merge branch 'cc/untracked'
Code cleanup.

* cc/untracked:
  update-index: fix xgetcwd() related memory leak
2017-04-11 00:21:51 -07:00
Christian Couder c105f563d1 update-index: fix xgetcwd() related memory leak
As xgetcwd() returns an allocated buffer, we should free this
buffer when we don't need it any more.

This was found by Coverity.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-30 10:33:31 -07:00
Christian Couder 6cc1053375 update-index: warn in case of split-index incoherency
When users are using `git update-index --(no-)split-index`, they
may expect the split-index feature to be used or not according to
the option they just used, but this might not be the case if the
new "core.splitIndex" config variable has been set. In this case
let's warn about what will happen and why.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-01 13:24:21 -08:00
Christian Couder cef4fc7ebe split-index: add {add,remove}_split_index() functions
Also use the functions in cmd_update_index() in
builtin/update-index.c.

These functions will be used in a following commit to tweak
our use of the split-index feature depending on the setting
of a configuration variable.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-01 13:24:21 -08:00
Junio C Hamano b3e83cc752 hold_locked_index(): align error handling with hold_lockfile_for_update()
Callers of the hold_locked_index() function pass 0 when they want to
prepare to write a new version of the index file without wishing to
die or emit an error message when the request fails (e.g. somebody
else already held the lock), and pass 1 when they want the call to
die upon failure.

This option is called LOCK_DIE_ON_ERROR by the underlying lockfile
API, and the hold_locked_index() function translates the paramter to
LOCK_DIE_ON_ERROR when calling the hold_lock_file_for_update().

Replace these hardcoded '1' with LOCK_DIE_ON_ERROR and stop
translating.  Callers other than the ones that are replaced with
this change pass '0' to the function; no behaviour change is
intended with this patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Among the callers of hold_locked_index() that passes 0:

 - diff.c::refresh_index_quietly() at the end of "git diff" is an
   opportunistic update; it leaks the lockfile structure but it is
   just before the program exits and nobody should care.

 - builtin/describe.c::cmd_describe(),
   builtin/commit.c::cmd_status(),
   sequencer.c::read_and_refresh_cache() are all opportunistic
   updates and they are OK.

 - builtin/update-index.c::cmd_update_index() takes a lock upfront
   but we may end up not needing to update the index (i.e. the
   entries may be fully up-to-date), in which case we do not need to
   issue an error upon failure to acquire the lock.  We do diagnose
   and die if we indeed need to update, so it is OK.

 - wt-status.c::require_clean_work_tree() IS BUGGY.  It asks
   silence, does not check the returned value.  Compare with
   callsites like cmd_describe() and cmd_status() to notice that it
   is wrong to call update_index_if_able() unconditionally.
2016-12-07 11:31:59 -08:00
Junio C Hamano ebc63580a1 Merge branch 'tg/add-chmod+x-fix'
"git add --chmod=+x <pathspec>" added recently only toggled the
executable bit for paths that are either new or modified. This has
been corrected to flip the executable bit for all paths that match
the given pathspec.

* tg/add-chmod+x-fix:
  t3700-add: do not check working tree file mode without POSIXPERM
  t3700-add: create subdirectory gently
  add: modify already added files when --chmod is given
  read-cache: introduce chmod_index_entry
  update-index: add test for chmod flags
2016-09-26 16:09:20 -07:00
Junio C Hamano 1fe6f5fb0a Merge branch 'va/i18n'
More i18n.

* va/i18n:
  i18n: update-index: mark warnings for translation
  i18n: show-branch: mark plural strings for translation
  i18n: show-branch: mark error messages for translation
  i18n: receive-pack: mark messages for translation
  notes: spell first word of error messages in lowercase
  i18n: notes: mark error messages for translation
  i18n: merge-recursive: mark verbose message for translation
  i18n: merge-recursive: mark error messages for translation
  i18n: config: mark error message for translation
  i18n: branch: mark option description for translation
  i18n: blame: mark error messages for translation
2016-09-21 15:15:28 -07:00
Vasco Almeida 43073f8984 i18n: update-index: mark warnings for translation
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:17:32 -07:00
Thomas Gummerer d9d7096662 read-cache: introduce chmod_index_entry
As there are chmod options for both add and update-index, introduce a
new chmod_index_entry function to do the work.  Use it in update-index,
while it will be used in add in the next patch.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 12:13:54 -07:00
Thomas Gummerer 22433ce461 update-index: add test for chmod flags
Currently there is no test checking the expected behaviour when multiple
chmod flags with different arguments are passed.  As argument handling
is not in line with other git commands it's easy to miss and
accidentally change the current behaviour.

While there, fix the argument type of chmod_path, which takes an int,
but had a char passed in.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-14 15:03:49 -07:00
brian m. carlson 71445a0fef builtin/update-index: convert file to struct object_id
Convert all functions to use struct object_id, and replace instances of
hardcoded 40, 41, and 42 with appropriate references to GIT_SHA1_HEXSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:43 -07:00
brian m. carlson 99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Johannes Schindelin ef1177d18e die("bug"): report bugs consistently
The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".

As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Junio C Hamano ed6e8038f9 pathspec: rename free_pathspec() to clear_pathspec()
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-02 14:09:22 -07:00
Nguyễn Thái Ngọc Duy 23d05364fc builtin/update-index.c: prefer "err" to "errno" in process_lstat_error
"errno" is already passed in as "err". Here we should use err instead of
errno. errno is probably a copy/paste mistake in e011054 (Teach
git-update-index about gitlinks - 2007-04-12)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Junio C Hamano 722c924445 Merge branch 'jk/options-cleanup'
Various clean-ups to the command line option parsing.

* jk/options-cleanup:
  apply, ls-files: simplify "-z" parsing
  checkout-index: disallow "--no-stage" option
  checkout-index: handle "--no-index" option
  checkout-index: handle "--no-prefix" option
  checkout-index: simplify "-z" option parsing
  give "nbuf" strbuf a more meaningful name
2016-02-10 14:20:08 -08:00
Junio C Hamano 0e35fcb412 Merge branch 'cc/untracked'
Update the untracked cache subsystem and change its primary UI from
"git update-index" to "git config".

* cc/untracked:
  t7063: add tests for core.untrackedCache
  test-dump-untracked-cache: don't modify the untracked cache
  config: add core.untrackedCache
  dir: simplify untracked cache "ident" field
  dir: add remove_untracked_cache()
  dir: add {new,add}_untracked_cache()
  update-index: move 'uc' var declaration
  update-index: add untracked cache notifications
  update-index: add --test-untracked-cache
  update-index: use enum for untracked cache options
  dir: free untracked cache when removing it
2016-02-10 14:20:06 -08:00
Jeff King 0d4cc1b45b give "nbuf" strbuf a more meaningful name
It's a common pattern in our code to read paths from stdin,
separated either by newlines or NULs, and unquote as
necessary. In each of these five cases we use "nbuf" to
temporarily store the unquoted value. Let's give it the more
meaningful name "unquoted", which makes it easier to
understand the purpose of the variable.

While we're at it, let's also static-initialize all of our
strbufs. It's not wrong to call strbuf_init, but it
increases the cognitive load on the reader, who might wonder
"do we sometimes avoid initializing them?  why?".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-01 13:43:02 -08:00
Christian Couder 435ec090ec config: add core.untrackedCache
When we know that mtime on directory as given by the environment
is usable for the purpose of untracked cache, we may want the
untracked cache to be always used without any mtime test or
kernel name check being performed.

Also when we know that mtime is not usable for the purpose of
untracked cache, for example because the repo is shared over a
network file system, we may want the untracked-cache to be
automatically removed from the index.

Allow the user to express such preference by setting the
'core.untrackedCache' configuration variable, which can take
'keep', 'false', or 'true' and default to 'keep'.

When read_index_from() is called, it now adds or removes the
untracked cache in the index to respect the value of this
variable. So it does nothing if the value is `keep` or if the
variable is unset; it adds the untracked cache if the value is
`true`; and it removes the cache if the value is `false`.

`git update-index --[no-|force-]untracked-cache` still adds the
untracked cache to, or removes it, from the index, but this
shows a warning if it goes against the value of
core.untrackedCache, because the next time the index is read
the untracked cache will be added or removed if the
configuration is set to do so.

Also `--untracked-cache` used to check that the underlying
operating system and file system change `st_mtime` field of a
directory if files are added or deleted in that directory. But
because those tests take a long time, `--untracked-cache` no
longer performs them. Instead, there is now
`--test-untracked-cache` to perform the tests. This change
makes `--untracked-cache` the same as `--force-untracked-cache`.

This last change is backward incompatible and should be
mentioned in the release notes.

Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

read-cache: Duy'sfixup

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-27 12:30:00 -08:00
Christian Couder 07b29bfd8d dir: add remove_untracked_cache()
Factor out code into remove_untracked_cache(), which will be used
in a later commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:40:11 -08:00
Christian Couder 4a4ca4796d dir: add {new,add}_untracked_cache()
Factor out code into new_untracked_cache() and
add_untracked_cache(), which will be used
in later commits.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:58 -08:00
Christian Couder e7c0c5354b update-index: move 'uc' var declaration
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:46 -08:00
Christian Couder 6d19db1491 update-index: add untracked cache notifications
Attempting to flip the untracked-cache feature on for a random index
file with

    cd /random/unrelated/place
    git --git-dir=/somewhere/else/.git update-index --untracked-cache

would not work as you might expect. Because flipping the feature on
in the index also records the location of the corresponding working
tree (/random/unrelated/place in the above example), when the index
is subsequently used to keep track of files in the working tree in
/somewhere/else, the feature is disabled.

With this patch "git update-index --[test-]untracked-cache" tells the
user in which directory tests are performed. This makes it easy to
spot any problem.

Also in verbose mode, let's tell the user when the cache is enabled
or disabled.

Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:34 -08:00
Christian Couder eaab83d0e5 update-index: add --test-untracked-cache
It is nice to just be able to test if untracked cache is
supported without enabling it.

Helped-by: David Turner <dturner@twopensource.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:22 -08:00
Christian Couder 113e641318 update-index: use enum for untracked cache options
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:13 -08:00
Junio C Hamano 7e07ed8418 update-index: there are only two possible line terminations
The program by default reads LF terminated lines, with an option to
use NUL terminated records.  Instead of pretending that there can be
other useful values for line_termination, use a boolean variable,
nul_term_line, to tell if NUL terminated records are used, and
switch between strbuf_getline_{lf,nul} based on it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:12:58 -08:00
Christian Couder 9624a22ac6 dir: free untracked cache when removing it
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-29 13:38:41 -08:00
Junio C Hamano 38ccaf93bb Merge branch 'nd/untracked-cache'
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.

* nd/untracked-cache: (24 commits)
  git-status.txt: advertisement for untracked cache
  untracked cache: guard and disable on system changes
  mingw32: add uname()
  t7063: tests for untracked cache
  update-index: test the system before enabling untracked cache
  update-index: manually enable or disable untracked cache
  status: enable untracked cache
  untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
  untracked cache: mark index dirty if untracked cache is updated
  untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
  untracked cache: avoid racy timestamps
  read-cache.c: split racy stat test to a separate function
  untracked cache: invalidate at index addition or removal
  untracked cache: load from UNTR index extension
  untracked cache: save to an index extension
  ewah: add convenient wrapper ewah_serialize_strbuf()
  untracked cache: don't open non-existent .gitignore
  untracked cache: mark what dirs should be recursed/saved
  untracked cache: record/validate dir mtime and reuse cached output
  untracked cache: make a wrapper around {open,read,close}dir()
  ...
2015-05-26 13:24:46 -07:00
Stefan Beller d7a643b73f prefix_path(): unconditionally free results in the callers
As of d089ebaa (setup: sanitize absolute and funny paths in
get_pathspec(), 2008-01-28), prefix_path() always returns a
newly allocated string, so callers should free its result.

Additionally, drop the const from variables to which the result of
the prefix_path() is assigned, so they can be free()'d without
having to cast-away the constness.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05 10:31:51 -07:00
Stefan Beller 1b7cb8969c update-index: fix a memleak
`old` is not used outside the loop and would get lost
once we reach the goto.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-22 12:26:31 -07:00
Nguyễn Thái Ngọc Duy 1e8fef609e untracked cache: guard and disable on system changes
If the user enables untracked cache, then

 - move worktree to an unsupported filesystem
 - or simply upgrade OS
 - or move the whole (portable) disk from one machine to another
 - or access a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases, you can fool it if you try
hard. The point is to stop accidents.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:18 -07:00
Nguyễn Thái Ngọc Duy f64cb88d35 update-index: test the system before enabling untracked cache
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:18 -07:00
Nguyễn Thái Ngọc Duy 9e5972413b update-index: manually enable or disable untracked cache
Overall time saving on "git status" is about 40% in the best case
scenario, removing ..collect_untracked() as the most time consuming
function. read and refresh index operations are now at the top (which
should drop when index-helper and/or watchman support is added). More
numbers and analysis below.

webkit.git
==========

169k files. 6k dirs. Lots of test data (i.e. not touched most of the
time)

Base status
-----------

Index version 4 in split index mode and cache-tree populated. No
untracked cache. It shows how time is consumed by "git status". The
same settings are used for other repos below.

18:28:10.199679 builtin/commit.c:1394   performance: 0.000000451 s: cmd_status:setup
18:28:10.474847 read-cache.c:1407       performance: 0.274873831 s: read_index
18:28:10.475295 read-cache.c:1407       performance: 0.000000656 s: read_index
18:28:10.728443 preload-index.c:131     performance: 0.253147487 s: read_index_preload
18:28:10.741422 read-cache.c:1254       performance: 0.012868340 s: refresh_index
18:28:10.752300 wt-status.c:623         performance: 0.010421357 s: wt_status_collect_changes_worktree
18:28:10.762069 wt-status.c:629         performance: 0.009644748 s: wt_status_collect_changes_index
18:28:11.601019 wt-status.c:632         performance: 0.838859547 s: wt_status_collect_untracked
18:28:11.605939 builtin/commit.c:1421   performance: 0.004835004 s: cmd_status:update_index
18:28:11.606580 trace.c:415             performance: 1.407878388 s: git command: 'git' 'status'

Populating status
-----------------

This is after enabling untracked cache and the cache is still empty.
We see a slight increase in .._collect_untracked() and update_index
(because new cache has to be written to $GIT_DIR/index).

18:28:18.915213 builtin/commit.c:1394   performance: 0.000000326 s: cmd_status:setup
18:28:19.197364 read-cache.c:1407       performance: 0.281901416 s: read_index
18:28:19.197754 read-cache.c:1407       performance: 0.000000546 s: read_index
18:28:19.451355 preload-index.c:131     performance: 0.253599607 s: read_index_preload
18:28:19.464400 read-cache.c:1254       performance: 0.012935336 s: refresh_index
18:28:19.475115 wt-status.c:623         performance: 0.010236920 s: wt_status_collect_changes_worktree
18:28:19.486022 wt-status.c:629         performance: 0.010801685 s: wt_status_collect_changes_index
18:28:20.362660 wt-status.c:632         performance: 0.876551366 s: wt_status_collect_untracked
18:28:20.396199 builtin/commit.c:1421   performance: 0.033447969 s: cmd_status:update_index
18:28:20.396939 trace.c:415             performance: 1.482695902 s: git command: 'git' 'status'

Populated status
----------------

After the cache is populated, wt_status_collect_untracked() drops 82%
from 0.838s to 0.144s. Overall time drops 45%. Top offenders are now
read_index() and read_index_preload().

18:28:20.408605 builtin/commit.c:1394   performance: 0.000000457 s: cmd_status:setup
18:28:20.692864 read-cache.c:1407       performance: 0.283980458 s: read_index
18:28:20.693273 read-cache.c:1407       performance: 0.000000661 s: read_index
18:28:20.958814 preload-index.c:131     performance: 0.265540254 s: read_index_preload
18:28:20.972375 read-cache.c:1254       performance: 0.013437429 s: refresh_index
18:28:20.983959 wt-status.c:623         performance: 0.011146646 s: wt_status_collect_changes_worktree
18:28:20.993948 wt-status.c:629         performance: 0.009879094 s: wt_status_collect_changes_index
18:28:21.138125 wt-status.c:632         performance: 0.144084737 s: wt_status_collect_untracked
18:28:21.173678 builtin/commit.c:1421   performance: 0.035463949 s: cmd_status:update_index
18:28:21.174251 trace.c:415             performance: 0.766707355 s: git command: 'git' 'status'

gentoo-x86.git
==============

This repository is a strange one with a balanced, wide and shallow
worktree (about 100k files and 23k dirs) and no .gitignore in
worktree. .._collect_untracked() time drops 88%, total time drops 56%.

Base status
-----------
18:20:40.828642 builtin/commit.c:1394   performance: 0.000000496 s: cmd_status:setup
18:20:41.027233 read-cache.c:1407       performance: 0.198130532 s: read_index
18:20:41.027670 read-cache.c:1407       performance: 0.000000581 s: read_index
18:20:41.171716 preload-index.c:131     performance: 0.144045594 s: read_index_preload
18:20:41.179171 read-cache.c:1254       performance: 0.007320424 s: refresh_index
18:20:41.185785 wt-status.c:623         performance: 0.006144638 s: wt_status_collect_changes_worktree
18:20:41.192701 wt-status.c:629         performance: 0.006780184 s: wt_status_collect_changes_index
18:20:41.991723 wt-status.c:632         performance: 0.798927029 s: wt_status_collect_untracked
18:20:41.994664 builtin/commit.c:1421   performance: 0.002852772 s: cmd_status:update_index
18:20:41.995458 trace.c:415             performance: 1.168427502 s: git command: 'git' 'status'
Populating status
-----------------
18:20:48.968848 builtin/commit.c:1394   performance: 0.000000380 s: cmd_status:setup
18:20:49.172918 read-cache.c:1407       performance: 0.203734214 s: read_index
18:20:49.173341 read-cache.c:1407       performance: 0.000000562 s: read_index
18:20:49.320013 preload-index.c:131     performance: 0.146671391 s: read_index_preload
18:20:49.328039 read-cache.c:1254       performance: 0.007921957 s: refresh_index
18:20:49.334680 wt-status.c:623         performance: 0.006172020 s: wt_status_collect_changes_worktree
18:20:49.342526 wt-status.c:629         performance: 0.007731746 s: wt_status_collect_changes_index
18:20:50.257510 wt-status.c:632         performance: 0.914864222 s: wt_status_collect_untracked
18:20:50.338371 builtin/commit.c:1421   performance: 0.080776477 s: cmd_status:update_index
18:20:50.338900 trace.c:415             performance: 1.371462446 s: git command: 'git' 'status'
Populated status
----------------
18:20:50.351160 builtin/commit.c:1394   performance: 0.000000571 s: cmd_status:setup
18:20:50.577358 read-cache.c:1407       performance: 0.225917338 s: read_index
18:20:50.577794 read-cache.c:1407       performance: 0.000000617 s: read_index
18:20:50.734140 preload-index.c:131     performance: 0.156345564 s: read_index_preload
18:20:50.745717 read-cache.c:1254       performance: 0.011463075 s: refresh_index
18:20:50.755176 wt-status.c:623         performance: 0.008877929 s: wt_status_collect_changes_worktree
18:20:50.763768 wt-status.c:629         performance: 0.008471633 s: wt_status_collect_changes_index
18:20:50.854885 wt-status.c:632         performance: 0.090988721 s: wt_status_collect_untracked
18:20:50.857765 builtin/commit.c:1421   performance: 0.002789097 s: cmd_status:update_index
18:20:50.858411 trace.c:415             performance: 0.508647673 s: git command: 'git' 'status'

linux-2.6
=========

Reference repo. Not too big. .._collect_status() drops 84%. Total time
drops 42%.

Base status
-----------
18:34:09.870122 builtin/commit.c:1394   performance: 0.000000385 s: cmd_status:setup
18:34:09.943218 read-cache.c:1407       performance: 0.072871177 s: read_index
18:34:09.943614 read-cache.c:1407       performance: 0.000000491 s: read_index
18:34:10.004364 preload-index.c:131     performance: 0.060748102 s: read_index_preload
18:34:10.008190 read-cache.c:1254       performance: 0.003714285 s: refresh_index
18:34:10.012087 wt-status.c:623         performance: 0.002775446 s: wt_status_collect_changes_worktree
18:34:10.016054 wt-status.c:629         performance: 0.003862140 s: wt_status_collect_changes_index
18:34:10.214747 wt-status.c:632         performance: 0.198604837 s: wt_status_collect_untracked
18:34:10.216102 builtin/commit.c:1421   performance: 0.001244166 s: cmd_status:update_index
18:34:10.216817 trace.c:415             performance: 0.347670735 s: git command: 'git' 'status'
Populating status
-----------------
18:34:16.595102 builtin/commit.c:1394   performance: 0.000000456 s: cmd_status:setup
18:34:16.666600 read-cache.c:1407       performance: 0.070992413 s: read_index
18:34:16.667012 read-cache.c:1407       performance: 0.000000606 s: read_index
18:34:16.729375 preload-index.c:131     performance: 0.062362492 s: read_index_preload
18:34:16.732565 read-cache.c:1254       performance: 0.003075517 s: refresh_index
18:34:16.736148 wt-status.c:623         performance: 0.002422201 s: wt_status_collect_changes_worktree
18:34:16.739990 wt-status.c:629         performance: 0.003746618 s: wt_status_collect_changes_index
18:34:16.948505 wt-status.c:632         performance: 0.208426710 s: wt_status_collect_untracked
18:34:16.961744 builtin/commit.c:1421   performance: 0.013151887 s: cmd_status:update_index
18:34:16.962233 trace.c:415             performance: 0.368537535 s: git command: 'git' 'status'
Populated status
----------------
18:34:16.970026 builtin/commit.c:1394   performance: 0.000000631 s: cmd_status:setup
18:34:17.046235 read-cache.c:1407       performance: 0.075904673 s: read_index
18:34:17.046644 read-cache.c:1407       performance: 0.000000681 s: read_index
18:34:17.113564 preload-index.c:131     performance: 0.066920253 s: read_index_preload
18:34:17.117281 read-cache.c:1254       performance: 0.003604055 s: refresh_index
18:34:17.121115 wt-status.c:623         performance: 0.002508345 s: wt_status_collect_changes_worktree
18:34:17.125089 wt-status.c:629         performance: 0.003871636 s: wt_status_collect_changes_index
18:34:17.156089 wt-status.c:632         performance: 0.030895703 s: wt_status_collect_untracked
18:34:17.169861 builtin/commit.c:1421   performance: 0.013686404 s: cmd_status:update_index
18:34:17.170391 trace.c:415             performance: 0.201474531 s: git command: 'git' 'status'

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:18 -07:00