Commit graph

214 commits

Author SHA1 Message Date
Ben Peart 883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Martin Ågren dcb572ab94 object_array: use object_array_clear(), not free()
Instead of freeing `foo.objects` for an object array `foo` (sometimes
conditionally), call `object_array_clear(&foo)`. This means we don't
poke as much into the implementation, which is already a good thing, but
also that we release the individual entries as well, thereby fixing at
least one memory-leak (in diff-lib.c).

If someone is holding on to a pointer to an element's `name` or `path`,
that is now a dangling pointer, i.e., we'd be turning an unpleasant
situation into an outright bug. To the best of my understanding no such
long-term pointers are being taken.

The way we handle `study` in builting/reflog.c still looks like it might
leak. That will be addressed in the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:06:01 +09:00
Junio C Hamano 50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Junio C Hamano a6f38c109b Merge branch 'bw/object-id'
Conversion from uchar[20] to struct object_id continues.

* bw/object-id: (33 commits)
  diff: rename diff_fill_sha1_info to diff_fill_oid_info
  diffcore-rename: use is_empty_blob_oid
  tree-diff: convert path_appendnew to object_id
  tree-diff: convert diff_tree_paths to struct object_id
  tree-diff: convert try_to_follow_renames to struct object_id
  builtin/diff-tree: cleanup references to sha1
  diff-tree: convert diff_tree_sha1 to struct object_id
  notes-merge: convert write_note_to_worktree to struct object_id
  notes-merge: convert verify_notes_filepair to struct object_id
  notes-merge: convert find_notes_merge_pair_ps to struct object_id
  notes-merge: convert merge_from_diffs to struct object_id
  notes-merge: convert notes_merge* to struct object_id
  tree-diff: convert diff_root_tree_sha1 to struct object_id
  combine-diff: convert find_paths_* to struct object_id
  combine-diff: convert diff_tree_combined to struct object_id
  diff: convert diff_flush_patch_id to struct object_id
  patch-ids: convert to struct object_id
  diff: finish conversion for prepare_temp_file to struct object_id
  diff: convert reuse_worktree_file to struct object_id
  diff: convert fill_filespec to struct object_id
  ...
2017-06-19 12:38:44 -07:00
Ævar Arnfjörð Bjarmason 6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Junio C Hamano 93dd544f54 Merge branch 'jc/noent-notdir'
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it.  We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).

The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious).  Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.

* jc/noent-notdir:
  treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
  compat-util: is_missing_file_error()
2017-06-13 13:47:07 -07:00
Brandon Williams f9704c2d82 diff: convert fill_filespec to struct object_id
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
Brandon Williams 94a0097a41 diff: convert diff_change to struct object_id
Convert diff_change to take a struct object_id.  In addition convert the
function pointer type 'change_fn_t' to also take a struct object_id.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
Brandon Williams 55497b8c9e diff: convert run_diff_files to struct object_id
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
Brandon Williams c26022ea8f diff: convert diff_addremove to struct object_id
Convert diff_addremove to take a struct object_id.  In addtion convert
the function pointer type 'add_remove_fn_t' to also take a struct
object_id.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
Brandon Williams fcf2cfb54b diff: convert diff_index_show_file to struct object_id
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:06 +09:00
Brandon Williams 362d765915 diff: convert get_stat_data to struct object_id
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:06 +09:00
Junio C Hamano c7054209d6 treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from

  $ git grep -e ENOENT --and -e ENOTDIR

There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both.  Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-30 09:29:00 +09:00
brian m. carlson a9dbc17910 tree: convert parse_tree_indirect to struct object_id
Convert parse_tree_indirect to take a pointer to struct object_id.
Update all the callers.  This transformation was achieved using the
following semantic patch and manual updates to the declaration and
definition.  Update builtin/checkout.c manually as well, since it uses a
ternary expression not handled by the semantic patch.

@@
expression E1;
@@
- parse_tree_indirect(E1.hash)
+ parse_tree_indirect(&E1)

@@
expression E1;
@@
- parse_tree_indirect(E1->hash)
+ parse_tree_indirect(E1)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson 944cffbd18 diff-lib: convert do_diff_cache to struct object_id
This is needed to convert parse_tree_indirect.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
Nguyễn Thái Ngọc Duy 018ec3c820 commit: fix empty commit creation when there's no changes but ita entries
If i-t-a entries are present and there is no change between the index
and HEAD i-t-a entries, index_differs_from() still returns "dirty, new
entries" (aka, the resulting commit is not empty), but cache-tree will
skip i-t-a entries and produce the exact same tree of current
commit.

index_differs_from() is supposed to catch this so we can abort
git-commit (unless --no-empty is specified). Update it to optionally
ignore i-t-a entries when doing a diff between the index and HEAD so
that it would return "no change" in this case and abort commit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24 10:48:23 -07:00
Nguyễn Thái Ngọc Duy 425a28e0a4 diff-lib: allow ita entries treated as "not yet exist in index"
When comparing the index and the working tree to show which paths are
new, and comparing the tree recorded in the HEAD and the index to see if
committing the contents recorded in the index would result in an empty
commit, we would want the former comparison to say "these are new paths"
and the latter to say "there is no change" for paths that are marked as
intent-to-add.

We made a similar attempt at d95d728a ("diff-lib.c: adjust position of
i-t-a entries in diff", 2015-03-16), which redefined the semantics of
these two comparison modes globally, which was a disaster and had to be
reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a
entries in diff"", 2015-06-23).

To make sure we do not repeat the same mistake, introduce a new internal
diffopt option so that this different semantics can be asked for only by
callers that ask it, while making sure other unaudited callers will get
the same comparison result.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24 10:47:28 -07:00
brian m. carlson 99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
brian m. carlson ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson 7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Junio C Hamano b5496cbd22 Merge branch 'nd/diff-i-t-a'
* nd/diff-i-t-a:
  Revert "diff-lib.c: adjust position of i-t-a entries in diff"
2015-06-25 10:47:46 -07:00
Junio C Hamano 78cc1a540b Revert "diff-lib.c: adjust position of i-t-a entries in diff"
This reverts commit d95d728aba.

It turns out that many other commands that need to interact with the
result of running diff-files and diff-index, e.g.  "git apply", "git
rm", etc., need to be adjusted to the new world order it brings in.
For example, it would break this sequence to correct a whitespace
breakage in the parts you changed:

	git add -N file
	git diff --cached file | git apply --cached --whitespace=fix
	git checkout file

In the old world order, "diff" showed a patch to modify an existing
empty file by adding its full contents, and "apply" updated the
index by modifying the existing empty blob (which is what an
Intent-to-Add entry records in the index) with that patch.

In the new world order, "diff" shows a patch to create a new file
with its full contents, but because "apply" thinks that the i-t-a
entry already exists in the index, it refused to accept a creation.

Adjusting "apply" to this new world order is easy, but we need to
assess the extent of the damage to the rest of the system the new
world order brought in before going forward and adjust them all,
after which we can resurrect the commit being reverted here.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 10:37:21 -07:00
Junio C Hamano d0c692263f Merge branch 'nd/diff-i-t-a'
After "git add -N", the path appeared in output of "git diff HEAD"
and "git diff --cached HEAD", leading "git status" to classify it
as "Changes to be committed".  Such a path, however, is not yet to
be scheduled to be committed.  "git diff" showed the change to the
path as modification, not as a "new file", in the header of its
output.

Treat such paths as "yet to be added to the index but Git already
know about them"; "git diff HEAD" and "git diff --cached HEAD"
should not talk about them, and "git diff" should show them as new
files yet to be added to the index.

* nd/diff-i-t-a:
  diff-lib.c: adjust position of i-t-a entries in diff
2015-05-19 13:17:49 -07:00
Junio C Hamano a916cb5fb4 Merge branch 'bc/object-id'
Identify parts of the code that knows that we use SHA-1 hash to
name our objects too much, and use (1) symbolic constants instead
of hardcoded 20 as byte count and/or (2) use struct object_id
instead of unsigned char [20] for object names.

* bc/object-id:
  apply: convert threeway_stage to object_id
  patch-id: convert to use struct object_id
  commit: convert parts to struct object_id
  diff: convert struct combine_diff_path to object_id
  bulk-checkin.c: convert to use struct object_id
  zip: use GIT_SHA1_HEXSZ for trailers
  archive.c: convert to use struct object_id
  bisect.c: convert leaf functions to use struct object_id
  define utility functions for object IDs
  define a structure for object IDs
2015-05-05 21:00:23 -07:00
Nguyễn Thái Ngọc Duy d95d728aba diff-lib.c: adjust position of i-t-a entries in diff
Entries added by "git add -N" are reminder for the user so that they
don't forget to add them before committing. These entries appear in
the index even though they are not real. Their presence in the index
leads to a confusing "git status" like this:

    On branch master
    Changes to be committed:
            new file:   foo

    Changes not staged for commit:
            modified:   foo

If you do a "git commit", "foo" will not be included even though
"status" reports it as "to be committed". This patch changes the
output to become

    On branch master
    Changes not staged for commit:
            new file:   foo

    no changes added to commit

The two hunks in diff-lib.c adjust "diff-index" and "diff-files" so
that i-t-a entries appear as new files in diff-files and nothing in
diff-index.

Due to this change, diff-files may start to report "new files" for the
first time. "add -u" needs to be told about this or it will die in
denial, screaming "new files can't exist! Reality is wrong." Luckily,
it's the only one among run_diff_files() callers that needs fixing.

Now in the new world order, a hierarchy in the index that contain
i-t-a paths is written out as a tree object as if these i-t-a
entries do not exist, and comparing the index with such a tree
object that would result from writing out the hierarchy will result
in no difference.  Update a test in t2203 that expected the i-t-a
entries to appear as "added to the index" in the comparison to
instead expect no output.

An earlier change eec3e7e4 (cache-tree: invalidate i-t-a paths after
generating trees, 2012-12-16) becomes an unnecessary pessimization
in the new world order---a cache-tree in the index that corresponds
to a hierarchy with i-t-a paths can now be marked as valid and
record the object name of the tree that results from writing a tree
object out of that hierarchy, as it will compare equal to that tree.

Reverting the commit is left for the future, though, as it is purely
a performance issue and no longer affects correctness.

Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-23 13:42:33 -07:00
brian m. carlson 1ff57c13c5 diff: convert struct combine_diff_path to object_id
Also, convert a constant to GIT_SHA1_HEXSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13 22:43:13 -07:00
Junio C Hamano 0b86fe8923 run_diff_files(): clarify computation of sha1 validity
Remove the need to have duplicated "if there is a change then feed
null_sha1 and otherwise sha1 from the cache entry" for the "new"
side of the diff by introducing two temporary variables to point
at the object name of the old and the new side of the blobs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-04 14:27:01 -08:00
Junio C Hamano 414405969e Merge branch 'jk/diff-files-assume-unchanged'
* jk/diff-files-assume-unchanged:
  run_diff_files: do not look at uninitialized stat data
2014-06-16 10:06:12 -07:00
Jeff King 5304810044 run_diff_files: do not look at uninitialized stat data
If we try to diff an index entry marked CE_VALID (because it
was marked with --assume-unchanged), we do not bother even
running stat() on the file to see if it was removed. This
started long ago with 540e694 (Prevent diff machinery from
examining assume-unchanged entries on worktree, 2009-08-11).

However, the subsequent code may look at our "struct stat"
and expect to find actual data; currently it will find
whatever cruft was left on the stack. This can cause
problems in two situations:

  1. We call match_stat_with_submodule with the stat data,
     so a submodule may be erroneously marked as changed.

  2. If --find-copies-harder is in effect, we pass all
     entries, even unchanged ones, to diff_change, so it can
     list them as rename/copy sources. Since we found no
     change, we assume that function will realize it and not
     actually display any diff output. However, we end up
     feeding it a bogus mode, leading it to sometimes claim
     there was a mode change.

We can fix both by splitting the CE_VALID and regular code
paths, and making sure only to look at the stat information
in the latter. Furthermore, we push the declaration of our
"struct stat" down into the code paths that actually set it,
so we cannot accidentally access it uninitialized in future
code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-15 09:35:33 -07:00
Junio C Hamano 2687ffdeb7 Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion'
Remove a confusing and deprecated "-q" option from "git diff-files";
"git diff-files --diff-filter=d" can be used instead.
2014-03-07 15:17:41 -08:00
Junio C Hamano 6376463c37 Merge branch 'ks/combine-diff'
Teach combine-diff to honour the path-output-order imposed by
diffcore-order, and optimize how matching paths are found in
the N-way diffs made with parents.

* ks/combine-diff:
  tests: add checking that combine-diff emits only correct paths
  combine-diff: simplify intersect_paths() further
  combine-diff: combine_diff_path.len is not needed anymore
  combine-diff: optimize combine_diff_path sets intersection
  diff test: add tests for combine-diff with orderfile
  diffcore-order: export generic ordering interface
2014-03-05 15:06:26 -08:00
Kirill Smelkov af82c7880f combine-diff: combine_diff_path.len is not needed anymore
The field was used in order to speed-up name comparison and also to
mark removed paths by setting it to 0.

Because the updated code does significantly less strcmp and also
just removes paths from the list and free right after we know a path
will not be needed, it is not needed anymore.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:44:57 -08:00
Nguyễn Thái Ngọc Duy 429bb40abd pathspec: convert some match_pathspec_depth() to ce_path_match()
This helps reduce the number of match_pathspec_depth() call sites and
show how match_pathspec_depth() is used.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:36:52 -08:00
Junio C Hamano b02f5aeda6 Merge branch 'jl/submodule-mv'
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-09 14:36:15 -07:00
Junio C Hamano 01a2a03c56 Merge branch 'jc/diff-filter-negation'
Teach "git diff --diff-filter" to express "I do not want to see
these classes of changes" more directly by listing only the
unwanted ones in lowercase (e.g. "--diff-filter=d" will show
everything but deletion) and deprecate "diff-files -q" which did
the same thing as "--diff-filter=d".

* jc/diff-filter-negation:
  diff: deprecate -q option to diff-files
  diff: allow lowercase letter to specify what change class to exclude
  diff: reject unknown change class given to --diff-filter
  diff: preparse --diff-filter string argument
  diff: factor out match_filter()
  diff: pass the whole diff_options to diffcore_apply_filter()
2013-09-09 14:28:35 -07:00
Junio C Hamano c48f6816f0 diff: remove "diff-files -q" in a version of Git in a distant future
This was inherited from "show-diff -q" that was invented to tell
comparison between the index and the working tree to ignore only
removals in 2005.

These days, it is spelled as "--diff-filter=d".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19 15:22:29 -07:00
Junio C Hamano 95a7c546b0 diff: deprecate -q option to diff-files
This reimplements the ancient "-q" option to "git diff-files" that
was inherited from "show-diff -q" in terms of "--diff-filter=d".  We
will be deprecating the "-q" option, so let's issue a warning when
we do so.

Incidentally this also tentatively fixes "git diff --no-index" to
honor "-q" and hide deletions; the use will get the same warning.

We should remove the support for "-q" in a future version but it is
not that urgent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19 15:20:47 -07:00
Nguyễn Thái Ngọc Duy 9a08727443 remove init_pathspec() in favor of parse_pathspec()
While at there, move free_pathspec() to pathspec.c

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy 6330a17199 parse_pathspec: add special flag for max_depth feature
match_pathspec_depth() and tree_entry_interesting() check max_depth
field in order to support "git grep --max-depth". The feature
activation is tied to "recursive" field, which led to some unwanted
activation, e.g. 5c8eeb8 (diff-index: enable recursive pathspec
matching in unpack_trees - 2012-01-15).

This patch decouples the activation from "recursive" field, puts it in
"magic" field instead. This makes sure that only "git grep" can
activate this feature. And because parse_pathspec knows when the
feature is not used, it does not need to sort pathspec (required for
max_depth to work correctly). A small win for non-grep cases.

Even though a new magic flag is introduced, no magic syntax is. The
magic can be only enabled by parse_pathspec() caller. We might someday
want to support ":(maxdepth:10)src." It all depends on actual use
cases.

max_depth feature cannot be enabled via init_pathspec() anymore. But
that's ok because init_pathspec() is on its way to /dev/null.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
René Scharfe 5828e8352c diff-lib, read-tree, unpack-trees: mark cache_entry array paramters const
Change the type merge_fn_t to accept the array of cache_entry pointers
as const pointers to const pointers.  This documents the fact that the
merge functions don't modify the cache_entry contents or replace any of
the pointers in the array.

Only a single cast is necessary in unpack_nondirectories because adding
two const modifiers at once is not allowed in C.  The cast is safe in
that it doesn't mask any modfication; call_unpack_fn only needs the
array for reading.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:14 -07:00
René Scharfe eb9ae4b505 diff-lib, read-tree, unpack-trees: mark cache_entry pointers const
Add const to struct cache_entry pointers throughout the tree which are
only used for reading.  This allows callers to pass in const pointers.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:14 -07:00
Jeff King e54501004a diff: do not use null sha1 as a sentinel value
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29 15:04:32 -07:00
Nguyen Thai Ngoc Duy 4838237cb7 diff-index: enable recursive pathspec matching in unpack_trees
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, max_depth field is there so that "git grep" can limit the
output for files found in limited depth of tree traversal. Also in order
to show just the surface level differences in "git diff-tree", recursive
field stops us from descending into deeper level of the tree structure
when it is set to false, and this also affects pathspec matching when
we have wildcards in the pathspec.

The diff-index has always wanted the recursive behaviour, and wanted to
match pathspecs without any depth limit. But we forgot to do so when we
updated tree_entry_interesting() logic to unify the pathspec matching
logic.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-16 14:17:18 -08:00
Junio C Hamano 9a33b691aa Merge branch 'cn/eradicate-working-copy'
* cn/eradicate-working-copy:
  Remove 'working copy' from the documentation and C code
2011-10-05 12:36:26 -07:00
Junio C Hamano 1b840a5662 Merge branch 'jc/diff-index-unpack'
* jc/diff-index-unpack:
  diff-index: pass pathspec down to unpack-trees machinery
  unpack-trees: allow pruning with pathspec
  traverse_trees(): allow pruning with pathspec
2011-10-05 12:35:53 -07:00
Carlos Martín Nieto f7d650c06e Remove 'working copy' from the documentation and C code
The git term is 'working tree', so replace the most public references
to 'working copy'.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-21 14:26:38 -07:00
Junio C Hamano 2f88c19700 diff-index: pass pathspec down to unpack-trees machinery
And finally, pass the pathspec down through unpack_trees() to traverse_trees()
callchain.

Before and after applying this series, looking for changes in the kernel
repository with a fairly narrow pathspec becomes somewhat faster.

  (without patch)
  $ /usr/bin/time git diff --raw v2.6.27 -- net/ipv6 >/dev/null
  0.48user 0.05system 0:00.53elapsed 100%CPU (0avgtext+0avgdata 163296maxresident)k
  0inputs+952outputs (0major+11163minor)pagefaults 0swaps

  (with patch)
  $ /usr/bin/time git diff --raw v2.6.27 -- net/ipv6 >/dev/null
  0.01user 0.00system 0:00.02elapsed 104%CPU (0avgtext+0avgdata 43856maxresident)k
  0inputs+24outputs (0major+3688minor)pagefaults 0swaps

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-29 15:09:17 -07:00
Junio C Hamano 54945edbbf Merge branch 'jc/diff-index-refactor'
* jc/diff-index-refactor:
  diff-lib: refactor run_diff_index() and do_diff_cache()
  diff-lib: simplify do_diff_cache()
2011-08-08 12:33:33 -07:00
Junio C Hamano 1df561fb48 Merge branch 'jc/maint-reset-unmerged-path'
* jc/maint-reset-unmerged-path:
  reset [<commit>] paths...: do not mishandle unmerged paths
2011-08-01 15:00:08 -07:00
Junio C Hamano bf979c07c7 diff-lib: refactor run_diff_index() and do_diff_cache()
The latter is meant to be an API for internal callers that want to inspect
the resulting diff-queue, while the former is an implementation of "git
diff-index" command. Extract the common logic into a single helper
function and make them thin wrappers around it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13 21:58:21 -07:00
Junio C Hamano fe549c21fc diff-lib: simplify do_diff_cache()
Since 34110cd (Make 'unpack_trees()' have a separate source and
destination index, 2008-03-06), we can run unpack_trees() without munging
the index at all, but do_diff_cache() tried ever so carefully to work
around the old behaviour of the function.

We can just tell unpack_trees() not to touch the original index and there
is no need to clean-up whatever the previous round has done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13 21:58:15 -07:00
Junio C Hamano ff00b682f2 reset [<commit>] paths...: do not mishandle unmerged paths
Because "diff --cached HEAD" showed an incorrect blob object name on the
LHS of the diff, we ended up updating the index entry with bogus value,
not what we read from the tree.

Noticed by John Nowak.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13 21:39:18 -07:00
Junio C Hamano 456a4c08b8 Merge branch 'jk/diff-not-so-quick'
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
	diff.c
2011-06-06 11:40:14 -07:00
Junio C Hamano b4194828dc diff-index --quiet: learn the "stop feeding the backend early" logic
A negative return from the unpack callback function usually means unpack
failed for the entry and signals the unpack_trees() machinery to fail the
entire merge operation, immediately and there is no other way for the
callback to tell the machinery to exit early without reporting an error.

This is what we usually want to make a merge all-or-nothing operation, but
the machinery is also used for diff-index codepath by using a custom
unpack callback function. And we do sometimes want to exit early without
failing, namely when we are under --quiet and can short-cut the diff upon
finding the first difference.

Add "exiting_early" field to unpack_trees_options structure, to signal the
unpack_trees() machinery that the negative return value is not signaling
an error but an early return from the unpack_trees() machinery. As this by
definition hasn't unpacked everything, discard the resulting index just
like the failure codepath.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 11:24:12 -07:00
Junio C Hamano 2d11f21c36 Merge remote-tracking branch 'ko/maint' into jc/diff-index-quick-exit-early
* ko/maint: (4352 commits)
  git-submodule.sh: separate parens by a space to avoid confusing some shells
  Documentation/technical/api-diff.txt: correct name of diff_unmerge()
  read_gitfile_gently: use ssize_t to hold read result
  remove tests of always-false condition
  rerere.c: diagnose a corrupt MERGE_RR when hitting EOF between TAB and '\0'
  Git 1.7.5.3
  init/clone: remove short option -L and document --separate-git-dir
  do not read beyond end of malloc'd buffer
  git-svn: Fix git svn log --show-commit
  Git 1.7.5.2
  provide a copy of the LGPLv2.1
  test core.gitproxy configuration
  copy_gecos: fix not adding nlen to len when processing "&"
  Update draft release notes to 1.7.5.2
  Documentation/git-fsck.txt: fix typo: unreadable -> unreachable
  send-pack: avoid deadlock on git:// push with failed pack-objects
  connect: let callers know if connection is a socket
  connect: treat generic proxy processes like ssh processes
  sideband_demux(): fix decl-after-stmt
  t3503: test cherry picking and reverting root commits
  ...

Conflicts:
	diff.c
2011-05-31 10:57:32 -07:00
Junio C Hamano 28b9264dd6 diff: futureproof "stop feeding the backend early" logic
Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
has the stop-early optimization. We may later add other types of diffcore
transformation that require to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 09:21:36 -07:00
Junio C Hamano a613b534bc Merge branch 'jc/fix-diff-files-unmerged' into maint
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-13 10:41:54 -07:00
Junio C Hamano 22dbeee715 Merge branch 'jc/fix-diff-files-unmerged'
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-06 10:52:58 -07:00
Junio C Hamano 095ce9538b diff-files: show unmerged entries correctly
Earlier, e9c8409 (diff-index --cached --raw: show tree entry on the LHS
for unmerged entries., 2007-01-05) taught the command to show the object
name and the mode from the entry coming from the tree side when comparing
a tree with an unmerged index.

This is a belated companion patch that teaches diff-files to show the mode
from the entry coming from the working tree side, when comparing an
unmerged index and the working tree.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:35:13 -07:00
Junio C Hamano fa7b290895 diff: remove often unused parameters from diff_unmerge()
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair.  Now the function is fixed to return the filepair it
queued, we can make the caller on the special case codepath to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Junio C Hamano 7d0cf357a3 Merge branch 'jc/maint-diff-q-filter'
* jc/maint-diff-q-filter:
  diff --quiet: disable optimization when --diff-filter=X is used
2011-03-23 14:55:17 -07:00
Junio C Hamano 2cfe8a68cc diff --quiet: disable optimization when --diff-filter=X is used
The code notices that the caller does not want any detail of the changes
and only wants to know if there is a change or not by specifying --quiet.
And it breaks out of the loop when it knows it already found any change.

When you have a post-process filter (e.g. --diff-filter), however, the
path we found to be different in the previous round and set HAS_CHANGES
bit may end up being uninteresting, and there may be no output at the end.
The optimization needs to be disabled for such case.

Note that the f245194 (diff: change semantics of "ignore whitespace"
options, 2009-05-22) already disables this optimization by refraining
from setting HAS_CHANGES when post-process filters that need to inspect
the contents of the files (e.g. -S, -w) in diff_change() function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 15:57:19 -07:00
Nguyễn Thái Ngọc Duy eb9cb55b94 Convert ce_path_match() to use struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy afe069d166 struct rev_info: convert prune_data to struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy 66f136252f Convert struct diff_options to use struct pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 12:28:15 -08:00
Jens Lehmann aee9c7d654 Submodules: Add the new "ignore" config option for diff and status
The new "ignore" config option controls the default behavior for "git
status" and the diff family. It specifies under what circumstances they
consider submodules as modified and can be set separately for each
submodule.

The command line option "--ignore-submodules=" has been extended to accept
the new parameter "none" for both status and diff.

Users that chose submodules to get rid of long work tree scanning times
might want to set the "dirty" option for those submodules. This brings
back the pre 1.7.0 behavior, where submodule work trees were never
scanned for modifications. By using "--ignore-submodules=none" on the
command line the status and diff commands can be told to do a full scan.

This option can be set to the following values (which have the same name
and meaning as for the "--ignore-submodules" option of status and diff):

"all": All changes to the submodule will be ignored.

"dirty": Only differences of the commit recorded in the superproject and
	the submodules HEAD will be considered modifications, all changes
	to the work tree of the submodule will be ignored. When using this
	value, the submodule will not be scanned for work tree changes at
	all, leading to a performance benefit on large submodules.

"untracked": Only untracked files in the submodules work tree are ignored,
	a changed HEAD and/or modified files in the submodule will mark it
	as modified.

"none" (which is the default): Either untracked or modified files in a
	submodules work tree or a difference between the subdmodules HEAD
	and the commit recorded in the superproject will make it show up
	as changed. This value is added as a new parameter for the
	"--ignore-submodules" option of the diff family and "git status"
	so the user can override the settings in the configuration.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:01:52 -07:00
Jens Lehmann dd44d419d3 Add optional parameters to the diff option "--ignore-submodules"
In some use cases it is not desirable that the diff family considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree. An example for that are scripts
which just want to check for submodule commits while ignoring any changes
to the work tree. Also users having large submodules known not to change
might want to use this option, as the - sometimes substantial - time it
takes to scan the submodule work tree(s) is saved.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:33:17 -07:00
Junio C Hamano b6a7a06aa6 Merge branch 'jl/submodule-diff-dirtiness'
* jl/submodule-diff-dirtiness:
  git status: ignoring untracked files must apply to submodules too
  git status: Fix false positive "new commits" output for dirty submodules
  Refactor dirty submodule detection in diff-lib.c
  git status: Show detailed dirty status of submodules in long format
  git diff --submodule: Show detailed dirty status of submodules
2010-03-24 16:25:43 -07:00
Jens Lehmann 3bfc450476 git status: ignoring untracked files must apply to submodules too
Since 1.7.0 submodules are considered dirty when they contain untracked
files. But when git status is called with the "-uno" option, the user
asked to ignore untracked files, so they must be ignored in submodules
too. To achieve this, the new flag DIFF_OPT_IGNORE_UNTRACKED_IN_SUBMODULES
is introduced.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-13 21:56:35 -08:00
Jens Lehmann 85adbf2f75 git status: Fix false positive "new commits" output for dirty submodules
Testing if the output "new commits" should appear in the long format of
"git status" is done by comparing the hashes of the diffpair. This always
resulted in printing "new commits" for submodules that contained untracked
or modified content, even if they did not contain new commits. The reason
was that match_stat_with_submodule() did set the "changed" flag for dirty
submodules, resulting in two->sha1 being set to the null_sha1 at the call
sites, which indicates that new commits are present. This is changed so
that when no new commits are present, the same object names are in the
sha1 field for both sides of the filepair, and the working tree side will
have the "dirty_submodule" flag set when appropriate. For a submodule to
be seen as modified even when it just has a dirty work tree, some
conditions had to be extended to also check for the "dirty_submodule"
flag.

Unfortunately the test case that should have found this bug had been
changed incorrectly too. It is fixed and extended to test for other
combinations too.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:24 -08:00
Jens Lehmann ae6d5c1b6f Refactor dirty submodule detection in diff-lib.c
Moving duplicated code into the new function match_stat_with_submodule().
Replacing the implicit activation of detailed checks for the dirtiness of
submodules when DIFF_FORMAT_PATCH was selected with explicitly setting
the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done().

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:17 -08:00
Junio C Hamano 32962c9bd5 revision: introduce setup_revision_opt
So far the last parameter to setup_revisions() was to specify the default
ref when the command line did not give any (typically "HEAD").  This changes
it to take a pointer to a structure so that we can add other information without
touching too many codepaths in later patches.

There is no functionality change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-09 01:11:18 -08:00
Jens Lehmann 9297f77e6d git status: Show detailed dirty status of submodules in long format
Since 1.7.0 there are three reasons a submodule is considered modified
against the work tree: It contains new commits, modified content or
untracked content. Lets show all reasons in the long format of git status,
so the user can better asses the nature of the modification. This change
does not affect the short and porcelain formats.

Two new members are added to "struct wt_status_change_data" to store the
information gathered by run_diff_files(). wt-status.c uses the new flag
DIFF_OPT_DIRTY_SUBMODULES to tell diff-lib.c it wants to get detailed
dirty information about submodules.

A hint line for submodules is printed in the dirty header when dirty
submodules are present.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-08 15:49:23 -08:00
Jens Lehmann c7e1a73641 git diff --submodule: Show detailed dirty status of submodules
When encountering a dirty submodule while doing "git diff --submodule"
print an extra line for new untracked content and another for modified
but already tracked content. And if the HEAD of the submodule is equal
to the ref diffed against in the superproject, drop the output which
would just show the same SHA1s and no commit message headlines.

To achieve that, the dirty_submodule bitfield is expanded to two bits.
The output of "git status" inside the submodule is parsed to set the
according bits.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-04 22:16:33 -08:00
Junio C Hamano d539de9f25 Merge branch 'jl/diff-submodule-ignore'
* jl/diff-submodule-ignore:
  Teach diff --submodule that modified submodule directory is dirty
  git diff: Don't test submodule dirtiness with --ignore-submodules
  Make ce_uptodate() trustworthy again
2010-01-26 22:53:13 -08:00
Jens Lehmann 4d34477f4c git diff: Don't test submodule dirtiness with --ignore-submodules
The diff family suppresses the output of submodule changes when
requested but checks them nonetheless. But since recently submodules
get examined for their dirtiness, which is rather expensive. There is
no need to do that when the --ignore-submodules option is used, as
the gathered information is never used anyway.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-24 21:03:23 -08:00
Junio C Hamano 026680f881 Merge branch 'jc/fix-tree-walk'
* jc/fix-tree-walk:
  read-tree --debug-unpack
  unpack-trees.c: look ahead in the index
  unpack-trees.c: prepare for looking ahead in the index
  Aggressive three-way merge: fix D/F case
  traverse_trees(): handle D/F conflict case sanely
  more D/F conflict tests
  tests: move convenience regexp to match object names to test-lib.sh

Conflicts:
	builtin-read-tree.c
	unpack-trees.c
	unpack-trees.h
2010-01-24 17:35:58 -08:00
Junio C Hamano 125fd98434 Make ce_uptodate() trustworthy again
The rule has always been that a cache entry that is ce_uptodate(ce)
means that we already have checked the work tree entity and we know
there is no change in the work tree compared to the index, and nobody
should have to double check.  Note that false ce_uptodate(ce) does not
mean it is known to be dirty---it only means we don't know if it is
clean.

There are a few codepaths (refresh-index and preload-index are among
them) that mark a cache entry as up-to-date based solely on the return
value from ie_match_stat(); this function uses lstat() to see if the
work tree entity has been touched, and for a submodule entry, if its
HEAD points at the same commit as the commit recorded in the index of
the superproject (a submodule that is not even cloned is considered
clean).

A submodule is no longer considered unmodified merely because its HEAD
matches the index of the superproject these days, in order to prevent
people from forgetting to commit in the submodule and updating the
superproject index with the new submodule commit, before commiting the
state in the superproject.  However, the patch to do so didn't update
the codepath that marks cache entries up-to-date based on the updated
definition and instead worked it around by saying "we don't trust the
return value of ce_uptodate() for submodules."

This makes ce_uptodate() trustworthy again by not marking submodule
entries up-to-date.

The next step _could_ be to introduce a few "in-core" flag bits to
cache_entry structure to record "this entry is _known_ to be dirty",
call is_submodule_modified() from ie_match_stat(), and use these new
bits to avoid running this rather expensive check more than once, but
that can be a separate patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-24 00:15:29 -08:00
Jens Lehmann e3d42c4773 Performance optimization for detection of modified submodules
In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time can be reused instead.

New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.

When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18 17:28:21 -08:00
Jens Lehmann ee6fc514f2 Show submodules as modified when they contain a dirty work tree
Until now a submodule only then showed up as modified in the supermodule
when the last commit in the submodule differed from the one in the index
or the diffed against commit of the superproject. A dirty work tree
containing new untracked or modified files in a submodule was
undetectable when looking at it from the superproject.

Now git status and git diff (against the work tree) in the superproject
will also display submodules as modified when they contain untracked or
modified files, even if the compared ref matches the HEAD of the
submodule.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16 16:40:50 -08:00
Junio C Hamano 73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano 730f72840c unpack-trees.c: look ahead in the index
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and

 (1) if the name appears in the index as a leaf node, it is also
     fed to the n_way_merge() callback function;

 (2) if the name is a directory in the index, i.e. there are entries in
     that are underneath it, then nothing is fed to the n_way_merge()
     callback function;

 (3) otherwise, if the name comes before the first eligible entry in the
     index, the index entry is first unpacked alone.

When traverse_trees_recursive() descends into a subdirectory, the
cache_bottom pointer is moved to walk index entries within that directory.

All of these are omitted for diff-index, which does not even want to be
fed an index entry and a tree entry with D/F conflicts.

This fixes 3-way read-tree and exposes a bug in other parts of the system
in t6035, test #5.  The test prepares these three trees:

 O = HEAD^
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob e69de29bb2    a/x

 A = HEAD
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb    a/x

 B = master
    120000 blob a36b77384451ea1de7bd340ffca868249626bc52    a/b
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/x

With a clean index that matches HEAD, running

    git read-tree -m -u --aggressive $O $A $B

now yields

    120000 a36b77384451ea1de7bd340ffca868249626bc52 3       a/b
    100644 e69de29bb2 0       a/b-2/c/d
    100644 e69de29bb2 1       a/b/c/d
    100644 e69de29bb2 2       a/b/c/d
    100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0       a/x

which is correct.  "master" created "a/b" symlink that did not exist,
and removed "a/b/c/d" while HEAD did not do touch either path.

Before this series, read-tree did not notice the situation and resolved
addition of "a/b" and removal of "a/b/c/d" independently.  If A = HEAD had
another path "a/b/c/e" added, this merge should conflict but instead it
silently resolved "a/b" and then immediately overwrote it to add
"a/b/c/e", which was quite bogus.

Tests in t1012 start to work with this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 15:00:14 -08:00
Junio C Hamano da165f470e unpack-trees.c: prepare for looking ahead in the index
This prepares but does not yet implement a look-ahead in the index entries
when traverse-trees.c decides to give us tree entries in an order that
does not match what is in the index.

A case where a look-ahead in the index is necessary happens when merging
branch B into branch A while the index matches the current branch A, using
a tree O as their common ancestor, and these three trees looks like this:

   O        A       B
   t                t
   t-i      t-i     t-i
   t-j      t-j
            t/1
            t/2

The traverse_trees() function gets "t", "t-i" and "t" from trees O, A and
B first, and notices that A may have a matching "t" behind "t-i" and "t-j"
(indeed it does), and tells A to give that entry instead.  After unpacking
blob "t" from tree B (as it hasn't changed since O in B and A removed it,
it will result in its removal), it descends into directory "t/".

The side that walked index in parallel to the tree traversal used to be
implemented with one pointer, o->pos, that points at the next index entry
to be processed.  When this happens, the pointer o->pos still points at
"t-i" that is the first entry.  We should be able to skip "t-i" and "t-j"
and locate "t/1" from the index while the recursive invocation of
traverse_trees() walks and match entries found there, and later come back
to process "t-i".

While that look-ahead is not implemented yet, this adds a flag bit,
CE_UNPACKED, to mark the entries in the index that has already been
processed.  o->pos pointer has been renamed to o->cache_bottom and it
points at the first entry that may still need to be processed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 14:59:54 -08:00
Junio C Hamano 3cc3fb7df6 Merge branch 'jc/1.7.0-diff-whitespace-only-status'
* jc/1.7.0-diff-whitespace-only-status:
  diff.c: fix typoes in comments
  Make test case number unique
  diff: Rename QUIET internal option to QUICK
  diff: change semantics of "ignore whitespace" options

Conflicts:
	diff.h
2009-12-26 14:03:18 -08:00
Junio C Hamano da8ba5e7da diff-lib.c: fix misleading comments on oneway_diff()
20a16eb (unpack_trees(): fix diff-index regression., 2008-03-10) adjusted
diff-index to the new world order since 34110cd (Make 'unpack_trees()'
have a separate source and destination index, 2008-03-06).  Callbacks are
expected to return anything non-negative as "success", and instead of
reporting how many index entries they have processed, they are expected to
advance o->pos themselves.  The code did so, but a stale comment was left
behind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-11 16:40:43 -07:00
Junio C Hamano 433233e0b6 Merge branch 'jc/shortstatus'
* jc/shortstatus:
  git commit --dry-run -v: show diff in color when asked
  Documentation/git-commit.txt: describe --dry-run
  wt-status: collect untracked files in a separate "collect" phase
  Make git_status_config() file scope static to builtin-commit.c
  wt-status: move wt_status_colors[] into wt_status structure
  wt-status: move many global settings to wt_status structure
  commit: --dry-run
  status: show worktree status of conflicted paths separately
  wt-status.c: rework the way changes to the index and work tree are summarized
  diff-index: keep the original index intact
  diff-index: report unmerged new entries
2009-08-28 19:38:19 -07:00
Nguyễn Thái Ngọc Duy b4d1690df1 Teach Git to respect skip-worktree bit (reading part)
grep: turn on --cached for files that is marked skip-worktree
ls-files: do not check for deleted file that is marked skip-worktree
update-index: ignore update request if it's skip-worktree, while still allows removing
diff*: skip worktree version

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Nguyễn Thái Ngọc Duy 540e694b13 Prevent diff machinery from examining assume-unchanged entries on worktree
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-11 23:17:55 -07:00
Junio C Hamano 26da1d7867 diff-index: keep the original index intact
When comparing the index and a tree, we used to read the contents of the
tree into stage #1 of the index and compared them with stage #0.  In order
not to lose sight of entries originally unmerged in the index, we hoisted
them to stage #3 before reading the tree.

Commit d1f2d7e (Make run_diff_index() use unpack_trees(), not read_tree(),
2008-01-19) changed all this.  These days, we instead use unpack_trees()
API to traverse the tree and compare the contents with the index, without
modifying the index at all.  There is no reason to hoist the unmerged
entries to stage #3 anymore.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-05 02:21:11 -07:00
Junio C Hamano 29796c6ccf diff-index: report unmerged new entries
Since an earlier change to diff-index by d1f2d7e (Make run_diff_index()
use unpack_trees(), not read_tree(), 2008-01-19), we stopped reporting an
unmerged path that does not exist in the tree, but we should.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-05 02:21:11 -07:00
Junio C Hamano 90b1994170 diff: Rename QUIET internal option to QUICK
The option "QUIET" primarily meant "find if we have _any_ difference as
quick as possible and report", which means we often do not even have to
look at blobs if we know the trees are different by looking at the higher
level (e.g. "diff-tree A B").  As a side effect, because there is no point
showing one change that we happened to have found first, it also enables
NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet.

Rename the internal option to QUICK to reflect this better; it also makes
grepping the source tree much easier, as there are other kinds of QUIET
option everywhere.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 10:22:39 -07:00
Junio C Hamano c28a17f270 Merge branch 'jc/cache-tree'
* jc/cache-tree:
  Avoid "diff-index --cached" optimization under --find-copies-harder
  Optimize "diff-index --cached" using cache-tree
  t4007: modernize the style
  cache-tree.c::cache_tree_find(): simplify internal API
  write-tree --ignore-cache-tree
2009-06-20 21:47:30 -07:00
Junio C Hamano a0919ced8a Avoid "diff-index --cached" optimization under --find-copies-harder
When find-copies-harder is in effect, the diff frontends are expected to
feed all paths, not just changed paths, to the diffcore, so that copy
sources can be picked up.  In such a case, not descending into subtrees
using the cache-tree information is simply wrong.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25 11:35:29 -07:00
Junio C Hamano b65982b608 Optimize "diff-index --cached" using cache-tree
When running "diff-index --cached" after making a change to only a small
portion of the index, there is no point unpacking unchanged subtrees into
the index recursively, only to find that all entries match anyway.  Tweak
unpack_trees() logic that is used to read in the tree object to catch the
case where the tree entry we are looking at matches the index as a whole
by looking at the cache-tree.

As an exercise, after modifying a few paths in the kernel tree, here are
a few numbers on my Athlon 64X2 3800+:

    (without patch, hot cache)
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+9407minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+2446minor)pagefaults 0swaps

Cold cache numbers are very impressive, but it does not matter very much
in practice:

    (without patch, cold cache)
    $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches'
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
    247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps

    (with patch, cold cache)
    $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches'
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k
    18440inputs+0outputs (79major+2369minor)pagefaults 0swaps

This of course helps "git status" as well.

    (without patch, hot cache)
    $ /usr/bin/time ../git.git/git-status >/dev/null
    0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+10970minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-status >/dev/null
    0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+3921minor)pagefaults 0swaps

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25 11:35:29 -07:00
Junio C Hamano 3ed24211d4 Merge branch 'lt/maint-diff-reduce-lstat'
* lt/maint-diff-reduce-lstat:
  Teach 'git checkout' to preload the index contents
  Avoid unnecessary 'lstat()' calls in 'get_stat_data()'
2009-05-23 01:40:33 -07:00
Linus Torvalds 658dd48c85 Avoid unnecessary 'lstat()' calls in 'get_stat_data()'
When we ask get_stat_data() to get the mode and size of an index entry,
we can avoid the lstat() call if we have marked the index entry as being
uptodate due to earlier lstat() calls.

This avoids a lot of unnecessary lstat() calls in eg 'git checkout',
where the last phase shows the differences to the working tree
(requiring a diff), but earlier phases have already verified the index.

On the kernel repo (with a fast machine and everything cached), this
changes timings of a nul 'git checkout' from

 - Before (best of ten):

	0.14user 0.05system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+13237minor)pagefaults 0swaps

 - After
	0.11user 0.03system 0:00.15elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+13235minor)pagefaults 0swaps

so it can obviously be noticeable, although equally obviously it's not a
show-stopper on this particular machine. The difference is likely larger
on slower machines, or with operating systems that don't do as good a job
of name caching.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-09 20:42:19 -07:00
Junio C Hamano a9bfe81309 Merge branch 'kb/checkout-optim'
* kb/checkout-optim:
  Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
  checkout bugfix: use stat.mtime instead of stat.ctime in two places
  Makefile: Set compiler switch for USE_NSEC
  Create USE_ST_TIMESPEC and turn it on for Darwin
  Not all systems use st_[cm]tim field for ns resolution file timestamp
  Record ns-timestamps if possible, but do not use it without USE_NSEC
  write_index(): update index_state->timestamp after flushing to disk
  verify_uptodate(): add ce_uptodate(ce) test
  make USE_NSEC work as expected
  fix compile error when USE_NSEC is defined
  check_updates(): effective removal of cache entries marked CE_REMOVE
  lstat_cache(): print a warning if doing ping-pong between cache types
  show_patch_diff(): remove a call to fstat()
  write_entry(): use fstat() instead of lstat() when file is open
  write_entry(): cleanup of some duplicated code
  create_directories(): remove some memcpy() and strchr() calls
  unlink_entry(): introduce schedule_dir_for_removal()
  lstat_cache(): swap func(length, string) into func(string, length)
  lstat_cache(): generalise longest_match_lstat_cache()
  lstat_cache(): small cleanup and optimisation
2009-03-17 18:54:31 -07:00
Stephan Beyer 75f3ff2eea Generalize and libify index_is_dirty() to index_differs_from(...)
index_is_dirty() in builtin-revert.c checks if the index is dirty.
This patch generalizes this function to check if the index differs
from a revision, i.e. the former index_is_dirty() behavior can now be
achieved by index_differs_from("HEAD", 0).

The second argument "diff_flags" allows to set further diff option
flags like DIFF_OPT_IGNORE_SUBMODULES. See DIFF_OPT_* macros in diff.h
for a list.

index_differs_from() seems to be useful for more than builtin-revert.c,
so it is moved into diff-lib.c and also used in builtin-commit.c.

Yet to mention:

 - "rev.abbrev = 0;" can be safely removed.
   This has no impact on performance or functioning of neither
   setup_revisions() nor run_diff_index().

 - rev.pending.objects is free()d because this fixes a leak.
   (Also see 295dd2ad "Fix memory leak in traverse_commit_list")

Mentored-by: Daniel Barkalow <barkalow@iabervon.org>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-10 22:25:39 -08:00
Kjetil Barvik 571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik ff7e6aad6d Cleanup of unused symcache variable inside diff-lib.c
Commit c40641b77b, 'Optimize
symlink/directory detection' by Linus Torvalds, removed the 'char
*symcache' parameter to the has_symlink_leading_path() function.  This
made all variables currently named 'symcache' inside diff-lib.c
unnecessary.

This also let us throw away the 'struct oneway_unpack_data', and
instead directly use the 'struct rev_info *revs' member, which
was the only member left after removal of the 'symcache[] array'
member.  The 'struct oneway_unpack_data' was introduced by the
following commit:

  948dd346  "diff-files: careful when inspecting work tree items"

Impact: cleanup
        PATH_MAX bytes less memory stack usage in some cases

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 15:56:55 -08:00