Commit graph

591 commits

Author SHA1 Message Date
Jameson Miller a849735bfb block alloc: add lifecycle APIs for cache_entry structs
It has been observed that the time spent loading an index with a large
number of entries is partly dominated by malloc() calls. This change
is in preparation for using memory pools to reduce the number of
malloc() calls made to allocate cahce entries when loading an index.

Add an API to allocate and discard cache entries, abstracting the
details of managing the memory backing the cache entries. This commit
does actually change how memory is managed - this will be done in a
later commit in the series.

This change makes the distinction between cache entries that are
associated with an index and cache entries that are not associated with
an index. A main use of cache entries is with an index, and we can
optimize the memory management around this. We still have other cases
where a cache entry is not persisted with an index, and so we need to
handle the "transient" use case as well.

To keep the congnitive overhead of managing the cache entries, there
will only be a single discard function. This means there must be enough
information kept with the cache entry so that we know how to discard
them.

A summary of the main functions in the API is:

make_cache_entry: create cache entry for use in an index. Uses specified
                  parameters to populate cache_entry fields.

make_empty_cache_entry: Create an empty cache entry for use in an index.
                        Returns cache entry with empty fields.

make_transient_cache_entry: create cache entry that is not used in an
                            index. Uses specified parameters to populate
                            cache_entry fields.

make_empty_transient_cache_entry: create cache entry that is not used in
                                  an index. Returns cache entry with
                                  empty fields.

discard_cache_entry: A single function that knows how to discard a cache
                     entry regardless of how it was allocated.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-03 10:58:27 -07:00
Junio C Hamano e47dbece39 Merge branch 'ma/unpack-trees-free-msgs'
Leak plugging.

* ma/unpack-trees-free-msgs:
  unpack_trees_options: free messages when done
  argv-array: return the pushed string from argv_push*()
  merge-recursive: provide pair of `unpack_trees_{start,finish}()`
  merge: setup `opts` later in `checkout_fast_forward()`
2018-05-30 21:51:29 +09:00
Junio C Hamano 42c8ce1c49 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (42 commits)
  merge-one-file: compute empty blob object ID
  add--interactive: compute the empty tree value
  Update shell scripts to compute empty tree object ID
  sha1_file: only expose empty object constants through git_hash_algo
  dir: use the_hash_algo for empty blob object ID
  sequencer: use the_hash_algo for empty tree object ID
  cache-tree: use is_empty_tree_oid
  sha1_file: convert cached object code to struct object_id
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  merge: convert empty tree constant to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/receive-pack: avoid hard-coded constants for push certs
  diff: specify abbreviation size in terms of the_hash_algo
  upload-pack: replace use of several hard-coded constants
  ...
2018-05-30 14:04:10 +09:00
Junio C Hamano 50f08db594 Merge branch 'js/use-bug-macro'
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.

* js/use-bug-macro:
  BUG_exit_code: fix sparse "symbol not declared" warning
  Convert remaining die*(BUG) messages
  Replace all die("BUG: ...") calls by BUG() ones
  run-command: use BUG() to report bugs, not die()
  test-tool: help verifying BUG() code paths
2018-05-30 14:04:07 +09:00
Junio C Hamano d658196f3c Merge branch 'en/unpack-trees-split-index-fix'
The split-index feature had a long-standing and dormant bug in
certain use of the in-core merge machinery, which has been fixed.

* en/unpack-trees-split-index-fix:
  unpack_trees: fix breakage when o->src_index != o->dst_index
2018-05-23 14:38:22 +09:00
Junio C Hamano c67de747f4 Merge branch 'en/rename-directory-detection-reboot'
Rename detection logic in "diff" family that is used in "merge" has
learned to guess when all of x/a, x/b and x/c have moved to z/a,
z/b and z/c, it is likely that x/d added in the meantime would also
want to move to z/d by taking the hint that the entire directory
'x' moved to 'z'.  A bug causing dirty files involved in a rename
to be overwritten during merge has also been fixed as part of this
work.  Incidentally, this also avoids updating a file in the
working tree after a (non-trivial) merge whose result matches what
our side originally had.

* en/rename-directory-detection-reboot: (36 commits)
  merge-recursive: fix check for skipability of working tree updates
  merge-recursive: make "Auto-merging" comment show for other merges
  merge-recursive: fix remainder of was_dirty() to use original index
  merge-recursive: fix was_tracked() to quit lying with some renamed paths
  t6046: testcases checking whether updates can be skipped in a merge
  merge-recursive: avoid triggering add_cacheinfo error with dirty mod
  merge-recursive: move more is_dirty handling to merge_content
  merge-recursive: improve add_cacheinfo error handling
  merge-recursive: avoid spurious rename/rename conflict from dir renames
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: fix remaining directory rename + dirty overwrite cases
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: avoid clobbering untracked files with directory renames
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: when comparing files, don't include trees
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: add computation of collisions due to dir rename & merging
  merge-recursive: check for directory level conflicts
  merge-recursive: add get_directory_renames()
  merge-recursive: make a helper function for cleanup for handle_renames
  ...
2018-05-23 14:38:19 +09:00
Martin Ågren 1c41d2805e unpack_trees_options: free messages when done
The strings allocated in `setup_unpack_trees_porcelain()` are never
freed. Provide a function `clear_unpack_trees_porcelain()` to do so and
call it where we use `setup_unpack_trees_porcelain()`. The only
non-trivial user is `unpack_trees_start()`, where we should place the
new call in `unpack_trees_finish()`.

We keep the string pointers in an array, mixing pointers to static
memory and memory that we allocate on the heap. We also keep several
copies of the individual pointers. So we need to make sure that we do
not free what we must not free and that we do not double-free. Let a
separate argv_array take ownership of all the strings we create so that
we can easily free them.

Zero the whole array of string pointers to make sure that we do not
leave any dangling pointers.

Note that we only take responsibility for the memory allocated in
`setup_unpack_trees_porcelain()` and not any other members of the
`struct unpack_trees_options`.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 11:59:31 +09:00
Stefan Beller cbd53a2193 object-store: move object access functions to object-store.h
This should make these functions easier to find and cache.h less
overwhelming to read.

In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file

As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise.  It would be better to #include wherever
identifiers from the header are used.  That can happen later
when we have better tooling for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Elijah Newren 64b1abe962 merge-recursive: fix overwriting dirty files involved in renames
This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of dirty files that are involved in directory rename
detection will be added in a subsequent commit.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08 16:11:00 +09:00
Junio C Hamano 0c7ecb7c31 Merge branch 'sb/submodule-move-nested'
Moving a submodule that itself has submodule in it with "git mv"
forgot to make necessary adjustment to the nested sub-submodules;
now the codepath learned to recurse into the submodules.

* sb/submodule-move-nested:
  submodule: fixup nested submodules after moving the submodule
  submodule-config: remove submodule_from_cache
  submodule-config: add repository argument to submodule_from_{name, path}
  submodule-config: allow submodule_free to handle arbitrary repositories
  grep: remove "repo" arg from non-supporting funcs
  submodule.h: drop declaration of connect_work_tree_and_git_dir
2018-05-08 15:59:17 +09:00
Johannes Schindelin 033abf97fc Replace all die("BUG: ...") calls by BUG() ones
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).

The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.

Let's just convert all remaining ones in one fell swoop.

This trick was performed by this invocation:

	sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 19:06:13 +09:00
brian m. carlson 75691ea345 Update struct index_state to use struct object_id
Adjust struct index_state to use struct object_id instead of unsigned
char [20].

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:50 +09:00
Elijah Newren 7db118303a unpack_trees: fix breakage when o->src_index != o->dst_index
Currently, all callers of unpack_trees() set o->src_index == o->dst_index.
The code in unpack_trees() does not correctly handle them being different.
There are two separate issues:

First, there is the possibility of memory corruption.  Since
unpack_trees() creates a temporary index in o->result and then discards
o->dst_index and overwrites it with o->result, in the special case that
o->src_index == o->dst_index, it is safe to just reuse o->src_index's
split_index for o->result.  However, when src and dst are different,
reusing o->src_index's split_index for o->result will cause the
split_index to be shared.  If either index then has entries replaced or
removed, it will result in the other index referring to free()'d memory.

Second, we can drop the index extensions.  Previously, we were moving
index extensions from o->dst_index to o->result.  Since o->src_index is
the one that will have the necessary extensions (o->dst_index is likely to
be a new index temporary index created to store the results), we should be
moving the index extensions from there.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 10:21:28 +09:00
Junio C Hamano 8b026edac3 Revert "Merge branch 'en/rename-directory-detection'"
This reverts commit e4bb62fa1e, reversing
changes made to 468165c1d8.

The topic appears to inflict severe regression in renaming merges,
even though the promise of it was that it would improve them.

We do not yet know which exact change in the topic was wrong, but in
the meantime, let's play it safe and revert it out of 'master'
before real Git-using projects are harmed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 18:07:11 +09:00
Junio C Hamano e4bb62fa1e Merge branch 'en/rename-directory-detection'
Rename detection logic in "diff" family that is used in "merge" has
learned to guess when all of x/a, x/b and x/c have moved to z/a,
z/b and z/c, it is likely that x/d added in the meantime would also
want to move to z/d by taking the hint that the entire directory
'x' moved to 'z'.  A bug causing dirty files involved in a rename
to be overwritten during merge has also been fixed as part of this
work.

* en/rename-directory-detection: (29 commits)
  merge-recursive: ensure we write updates for directory-renamed file
  merge-recursive: avoid spurious rename/rename conflict from dir renames
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: fix remaining directory rename + dirty overwrite cases
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: avoid clobbering untracked files with directory renames
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: when comparing files, don't include trees
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: add computation of collisions due to dir rename & merging
  merge-recursive: check for directory level conflicts
  merge-recursive: add get_directory_renames()
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: move the get_renames() function
  directory rename detection: tests for handling overwriting dirty files
  directory rename detection: tests for handling overwriting untracked files
  ...
2018-04-10 08:25:43 +09:00
Junio C Hamano c2a499e6c3 Merge branch 'jh/partial-clone'
Hotfix.

* jh/partial-clone:
  upload-pack: disable object filtering when disabled by config
  unpack-trees: release oid_array after use in check_updates()
2018-03-29 15:39:59 -07:00
Stefan Beller f793b895fd submodule-config: allow submodule_free to handle arbitrary repositories
At some point we may want to rename the function so that it describes what
it actually does as 'submodule_free' doesn't quite describe that this
clears a repository's submodule cache.  But that's beyond the scope of
this series.

While at it remove the extern key word from its declaration.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29 09:44:50 -07:00
René Scharfe 9f242a1336 unpack-trees: release oid_array after use in check_updates()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-25 10:51:46 -07:00
Junio C Hamano 169c9c0169 Merge branch 'bw/c-plus-plus'
Avoid using identifiers that clash with C++ keywords.  Even though
it is not a goal to compile Git with C++ compilers, changes like
this help use of code analysis tools that targets C++ on our
codebase.

* bw/c-plus-plus: (37 commits)
  replace: rename 'new' variables
  trailer: rename 'template' variables
  tempfile: rename 'template' variables
  wrapper: rename 'template' variables
  environment: rename 'namespace' variables
  diff: rename 'template' variables
  environment: rename 'template' variables
  init-db: rename 'template' variables
  unpack-trees: rename 'new' variables
  trailer: rename 'new' variables
  submodule: rename 'new' variables
  split-index: rename 'new' variables
  remote: rename 'new' variables
  ref-filter: rename 'new' variables
  read-cache: rename 'new' variables
  line-log: rename 'new' variables
  imap-send: rename 'new' variables
  http: rename 'new' variables
  entry: rename 'new' variables
  diffcore-delta: rename 'new' variables
  ...
2018-03-06 14:54:07 -08:00
Elijah Newren e0052f4613 merge-recursive: fix overwriting dirty files involved in renames
This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of dirty files that are involved in directory rename
detection will be added in a subsequent commit.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27 14:11:58 -08:00
Junio C Hamano cf44c1e0a2 Merge branch 'nd/fix-untracked-cache-invalidation'
Some bugs around "untracked cache" feature have been fixed.

* nd/fix-untracked-cache-invalidation:
  dir.c: ignore paths containing .git when invalidating untracked cache
  dir.c: stop ignoring opendir() error in open_cached_dir()
  dir.c: fix missing dir invalidation in untracked code
  dir.c: avoid stat() in valid_cached_dir()
  status: add a failing test showing a core.untrackedCache bug
2018-02-27 10:33:50 -08:00
Brandon Williams 69caed593e unpack-trees: rename 'new' variables
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22 10:08:05 -08:00
Junio C Hamano 6bed209a20 Merge branch 'jh/partial-clone'
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.

* jh/partial-clone:
  t5616: test bulk prefetch after partial fetch
  fetch: inherit filter-spec from partial clone
  t5616: end-to-end tests for partial clone
  fetch-pack: restore save_commit_buffer after use
  unpack-trees: batch fetching of missing blobs
  clone: partial clone
  partial-clone: define partial clone settings in config
  fetch: support filters
  fetch: refactor calculation of remote list
  fetch-pack: test support excluding large blobs
  fetch-pack: add --no-filter
  fetch-pack, index-pack, transport: partial clone
  upload-pack: add object filtering for partial clone
2018-02-13 13:39:04 -08:00
Nguyễn Thái Ngọc Duy 0cacebf099 dir.c: ignore paths containing .git when invalidating untracked cache
read_directory() code ignores all paths named ".git" even if it's not
a valid git repository. See treat_path() for details. Since ".git" is
basically invisible to read_directory(), when we are asked to
invalidate a path that contains ".git", we can safely ignore it
because the slow path would not consider it anyway.

This helps when fsmonitor is used and we have a real ".git" repo at
worktree top. Occasionally .git/index will be updated and if the
fsmonitor hook does not filter it, untracked cache is asked to
invalidate the path ".git/index".

Without this patch, we invalidate the root directory unncessarily,
which:

- makes read_directory() fall back to slow path for root directory
  (slower)

- makes the index dirty (because UNTR extension is updated). Depending
  on the index size, writing it down could also be slow.

A note about the new "safe_path" knob. Since this new check could be
relatively expensive, avoid it when we know it's not needed. If the
path comes from the index, it can't contain ".git". If it does
contain, we may be screwed up at many more levels, not just this one.

Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-07 12:27:02 -08:00
Stefan Beller ad17312e11 unpack-trees: oneway_merge to update submodules
When there is a one way merge, each submodule needs to be one way merged
as well, if we're asked to recurse into submodules.

In case of a submodule, check if it is up-to-date, otherwise set the
flag CE_UPDATE, which will trigger an update of it in the phase updating
the tree later.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05 12:35:35 -08:00
Jonathan Tan c0c578b33c unpack-trees: batch fetching of missing blobs
When running checkout, first prefetch all blobs that are to be updated
but are missing. This means that only one pack is downloaded during such
operations, instead of one per missing blob.

This operates only on the blob level - if a repository has a missing
tree, they are still fetched one at a time.

This does not use the delayed checkout mechanism introduced in commit
2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) due to significant conceptual differences - in particular,
for partial clones, we already know what needs to be fetched based on
the contents of the local repo alone, whereas for status=delayed, it is
the filter process that tells us what needs to be checked in the end.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:58:51 -08:00
Junio C Hamano e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
brian m. carlson a98e6101f0 refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
brian m. carlson 1053fe829c Convert remaining callers of resolve_gitlink_ref to object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
Ben Peart 883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Junio C Hamano 7fbbd3ec0f Merge branch 'ls/convert-filter-progress'
The codepath to call external process filter for smudge/clean
operation learned to show the progress meter.

* ls/convert-filter-progress:
  convert: display progress for filtered objects that have been delayed
2017-09-10 17:08:22 +09:00
Junio C Hamano 8e36002add Merge branch 'ma/up-to-date'
Message and doc updates.

* ma/up-to-date:
  treewide: correct several "up-to-date" to "up to date"
  Documentation/user-manual: update outdated example output
2017-09-10 17:08:22 +09:00
Junio C Hamano 614ea03a71 Merge branch 'bw/submodule-config-cleanup'
Code clean-up to avoid mixing values read from the .gitmodules file
and values read from the .git/config file.

* bw/submodule-config-cleanup:
  submodule: remove gitmodules_config
  unpack-trees: improve loading of .gitmodules
  submodule-config: lazy-load a repository's .gitmodules file
  submodule-config: move submodule-config functions to submodule-config.c
  submodule-config: remove support for overlaying repository config
  diff: stop allowing diff to have submodules configured in .git/config
  submodule: remove submodule_config callback routine
  unpack-trees: don't respect submodule.update
  submodule: don't rely on overlayed config when setting diffopts
  fetch: don't overlay config with submodule-config
  submodule--helper: don't overlay config in update-clone
  submodule--helper: don't overlay config in remote_submodule_branch
  add, reset: ensure submodules can be added or reset
  submodule: don't use submodule_from_name
  t7411: check configuration parsing errors
2017-08-26 22:55:08 -07:00
Lars Schneider 52f1d62eb4 convert: display progress for filtered objects that have been delayed
In 2841e8f ("convert: add "status=delayed" to filter process protocol",
2017-06-30) we taught the filter process protocol to delayed responses.
These responses are processed after the "Checking out files" phase.
If the processing takes noticeable time, then the user might think Git
is stuck.

Display the progress of the delayed responses to let the user know that
Git is still processing objects. This works very well for objects that
can be filtered quickly. If filtering of an individual object takes
noticeable time, then the user might still think that Git is stuck.
However, in that case the user would at least know what Git is doing.

It would be technical more correct to display "Checking out files whose
content filtering has been delayed". For brevity we only print
"Filtering content".

The finish_delayed_checkout() call was moved below the stop_progress()
call in unpack-trees.c to ensure that the "Checking out files" progress
is properly stopped before the "Filtering content" progress starts in
finish_delayed_checkout().

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Suggested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 12:41:20 -07:00
Junio C Hamano d33a433236 Merge branch 'jc/simplify-progress'
The API to start showing progress meter after a short delay has
been simplified.

* jc/simplify-progress:
  progress: simplify "delayed" progress API
2017-08-24 10:20:02 -07:00
Junio C Hamano 11bd95604a Merge branch 'rs/object-id'
Conversion from uchar[20] to struct object_id continues.

* rs/object-id:
  tree-walk: convert fill_tree_descriptor() to object_id
2017-08-24 10:20:02 -07:00
Martin Ågren 7560f547e6 treewide: correct several "up-to-date" to "up to date"
Follow the Oxford style, which says to use "up-to-date" before the noun,
but "up to date" after it. Don't change plumbing (specifically
send-pack.c, but transport.c (git push) also has the same string).

This was produced by grepping for "up-to-date" and "up to date". It
turned out we only had to edit in one direction, removing the hyphens.

Fix a typo in Documentation/git-diff-index.txt while we're there.

Reported-by: Jeffrey Manian <jeffrey.manian@gmail.com>
Reported-by: STEVEN WHITE <stevencharleswhitevoices@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 12:17:22 -07:00
Junio C Hamano 5aa0b6c506 Merge branch 'bw/grep-recurse-submodules'
"git grep --recurse-submodules" has been reworked to give a more
consistent output across submodule boundary (and do its thing
without having to fork a separate process).

* bw/grep-recurse-submodules:
  grep: recurse in-process using 'struct repository'
  submodule: merge repo_read_gitmodules and gitmodules_config
  submodule: check for unmerged .gitmodules outside of config parsing
  submodule: check for unstaged .gitmodules outside of config parsing
  submodule: remove fetch.recursesubmodules from submodule-config parsing
  submodule: remove submodule.fetchjobs from submodule-config parsing
  config: add config_from_gitmodules
  cache.h: add GITMODULES_FILE macro
  repository: have the_repository use the_index
  repo_read_index: don't discard the index
2017-08-22 10:29:01 -07:00
Junio C Hamano 8aade107dd progress: simplify "delayed" progress API
We used to expose the full power of the delayed progress API to the
callers, so that they can specify, not just the message to show and
expected total amount of work that is used to compute the percentage
of work performed so far, the percent-threshold parameter P and the
delay-seconds parameter N.  The progress meter starts to show at N
seconds into the operation only if we have not yet completed P per-cent
of the total work.

Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there
are oddballs that chose more random-looking values like 95%.

For a smoother workload, (50%, 1s) would allow us to start showing
the progress meter earlier than (0%, 2s), while keeping the chance
of not showing progress meter for long running operation the same as
the latter.  For a task that would take 2s or more to complete, it
is likely that less than half of it would complete within the first
second, if the workload is smooth.  But for a spiky workload whose
earlier part is easier, such a setting is likely to fail to show the
progress meter entirely and (0%, 2s) is more appropriate.

But that is merely a theory.  Realistically, it is of dubious value
to ask each codepath to carefully consider smoothness of their
workload and specify their own setting by passing two extra
parameters.  Let's simplify the API by dropping both parameters and
have everybody use (0%, 2s).

Oh, by the way, the percent-threshold parameter and the structure
member were consistently misspelled, which also is now fixed ;-)

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-19 14:01:34 -07:00
René Scharfe 5c377d3d59 tree-walk: convert fill_tree_descriptor() to object_id
All callers of fill_tree_descriptor() have been converted to object_id
already, so convert that function as well.  As a nice side-effect we get
rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already
does them for us.

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-14 12:38:54 -07:00
Junio C Hamano 51b8aecabe Merge branch 'ls/filter-process-delayed'
The filter-process interface learned to allow a process with long
latency give a "delayed" response.

* ls/filter-process-delayed:
  convert: add "status=delayed" to filter process protocol
  convert: refactor capabilities negotiation
  convert: move multiple file filter error handling to separate function
  convert: put the flags field before the flag itself for consistent style
  t0021: write "OUT <size>" only on success
  t0021: make debug log file name configurable
  t0021: keep filter log files on comparison
2017-08-11 13:27:00 -07:00
Brandon Williams 3302871320 unpack-trees: improve loading of .gitmodules
When recursing submodules 'check_updates()' needs to have strict control
over the submodule-config subsystem to ensure that the gitmodules file
has been read before checking cache entries which are marked for
removal as well ensuring the proper gitmodules file is read before
updating cache entries.

Because of this let's not rely on callers of 'check_updates()' to read
the gitmodules file before calling 'check_updates()' and handle the
reading explicitly.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-03 13:11:02 -07:00
Brandon Williams 7463e2ec3e unpack-trees: don't respect submodule.update
The 'submodule.update' config was historically used and respected by the
'submodule update' command because update handled a variety of different
ways it updated a submodule.  As we begin teaching other commands about
submodules it makes more sense for the different settings of
'submodule.update' to be handled by the individual commands themselves
(checkout, rebase, merge, etc) so it shouldn't be respected by the
native checkout command.

Also remove the overlaying of the repository's config (via using
'submodule_config()') from the commands which use the unpack-trees
logic (checkout, read-tree, reset).

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-03 13:11:01 -07:00
Brandon Williams 4c0eeafe47 cache.h: add GITMODULES_FILE macro
Add a macro to be used when specifying the '.gitmodules' file and
convert any existing hard coded '.gitmodules' file strings to use the
new macro.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02 14:26:46 -07:00
Junio C Hamano 487fe1ffcd Merge branch 'ls/filter-process-delayed' into jt/subprocess-handshake
* ls/filter-process-delayed:
  convert: add "status=delayed" to filter process protocol
  convert: refactor capabilities negotiation
  convert: move multiple file filter error handling to separate function
  convert: put the flags field before the flag itself for consistent style
  t0021: write "OUT <size>" only on success
  t0021: make debug log file name configurable
  t0021: keep filter log files on comparison
2017-07-26 12:56:19 -07:00
Lars Schneider 2841e8f81c convert: add "status=delayed" to filter process protocol
Some `clean` / `smudge` filters may require a significant amount of
time to process a single blob (e.g. the Git LFS smudge filter might
perform network requests). During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol, introduced in edcc8581 ("convert: add
filter.<driver>.process option", 2016-10-16), to accept the status
"delayed" as response to a filter request. Upon this response Git
continues with the checkout operation. After the checkout operation Git
calls "finish_delayed_checkout" which queries the filter for remaining
blobs. If the filter is still working on the completion, then the filter
is expected to block. If the filter has completed all remaining blobs
then an empty response is expected.

Git has a multiple code paths that checkout a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations
for now. The optimization is most effective in these code paths as all
files of the tree are processed.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:50:41 -07:00
Junio C Hamano f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Brandon Williams b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano fa0624f79f Merge branch 'dt/unpack-save-untracked-cache-extension'
When "git checkout", "git merge", etc. manipulates the in-core
index, various pieces of information in the index extensions are
discarded from the original state, as it is usually not the case
that they are kept up-to-date and in-sync with the operation on the
main index.  The untracked cache extension is copied across these
operations now, which would speed up "git status" (as long as the
cache is properly invalidated).

* dt/unpack-save-untracked-cache-extension:
  unpack-trees: preserve index extensions
2017-05-30 11:16:45 +09:00
Junio C Hamano 4eeed27e16 Merge branch 'bw/dir-c-stops-relying-on-the-index'
API update.

* bw/dir-c-stops-relying-on-the-index:
  dir: convert fill_directory to take an index
  dir: convert read_directory to take an index
  dir: convert read_directory_recursive to take an index
  dir: convert open_cached_dir to take an index
  dir: convert is_excluded to take an index
  dir: convert prep_exclude to take an index
  dir: convert add_excludes to take an index
  dir: convert is_excluded_from_list to take an index
  dir: convert last_exclude_matching_from_list to take an index
  dir: convert dir_add* to take an index
  dir: convert get_dtype to take index
  dir: convert directory_exists_in_index to take index
  dir: convert read_skip_worktree_file_from_index to take an index
  dir: stop using the index compatibility macros
2017-05-29 12:34:41 +09:00
Junio C Hamano 5f074ca7e8 Merge branch 'sb/reset-recurse-submodules'
"git reset" learned "--recurse-submodules" option.

* sb/reset-recurse-submodules:
  builtin/reset: add --recurse-submodules switch
  submodule.c: submodule_move_head works with broken submodules
  submodule.c: uninitialized submodules are ignored in recursive commands
  entry.c: submodule recursing: respect force flag correctly
2017-05-29 12:34:40 +09:00
David Turner edf3b90553 unpack-trees: preserve index extensions
Make git checkout (and other unpack_tree operations) preserve the
untracked cache. This is valuable for two reasons:

1. Often, an unpack_tree operation will not touch large parts of the
working tree, and thus most of the untracked cache will continue to be
valid.

2. Even if the untracked cache were entirely invalidated by such an
operation, the user has signaled their intention to have such a cache,
and we don't want to throw it away.

[jes: backed out the watchman-specific parts]

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-20 18:26:45 +09:00
Brandon Williams 2c1eb10454 dir: convert read_directory to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams a0bba65b10 dir: convert is_excluded to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams 473e39307d dir: convert add_excludes to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams fba92be8f7 dir: convert is_excluded_from_list to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Junio C Hamano 8868ba1962 Merge branch 'jh/unpack-trees-micro-optim'
In a 2- and 3-way merge of trees, more than one source trees often
end up sharing an identical subtree; optimize by not reading the
same tree multiple times in such a case.

* jh/unpack-trees-micro-optim:
  unpack-trees: avoid duplicate ODB lookups during checkout
2017-04-23 22:07:48 -07:00
Stefan Beller cd279e2e1b entry.c: submodule recursing: respect force flag correctly
In case of a non-forced worktree update, the submodule movement is tested
in a dry run first, such that it doesn't matter if the actual update is
done via the force flag. However for correctness, we want to give the
flag as specified by the user. All callers have been inspected and updated
if needed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 21:18:29 -07:00
Jeff Hostetler d12a8cf0af unpack-trees: avoid duplicate ODB lookups during checkout
Teach traverse_trees_recursive() to not do redundant ODB
lookups when both directories refer to the same OID.

In operations such as read-tree and checkout, there will
likely be many peer directories that have the same OID when
the differences between the commits are relatively small.
In these cases we can avoid hitting the ODB multiple times
for the same OID.

This patch handles n=2 and n=3 cases and simply copies the
data rather than repeating the fill_tree_descriptor().

================
On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.

(avg) before: 22.699
(avg) after:  21.955
===============

================
On Linux using p0006-read-tree-checkout.sh with linux.git:

Test                                                          HEAD^              HEAD
-------------------------------------------------------------------------------------------------------
0006.2: read-tree br_base br_ballast (57994)                  0.24(0.20+0.03)    0.24(0.22+0.01) +0.0%
0006.3: switch between br_base br_ballast (57994)             10.58(6.23+2.86)   10.67(5.94+2.87) +0.9%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.60(0.44+0.17)    0.57(0.44+0.14) -5.0%
0006.5: switch between aliases (57994)                        0.59(0.48+0.13)    0.57(0.44+0.15) -3.4%
================

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-15 02:50:44 -07:00
Stefan Beller 6362ed0b29 unpack-trees.c: align submodule error message to the other error messages
As the place holder in the error message is for multiple submodules,
we don't want to encapsulate the string place holder in single quotes.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-29 15:47:36 -07:00
Stefan Beller a7bc845a9a unpack-trees: check if we can perform the operation for submodules
In a later patch we'll support submodule entries to be
in sync with the tree in working tree changing commands,
such as checkout or read-tree.

When a new submodule entry changes in the tree, we need to
check if there are conflicts (directory/file conflicts)
for the tree. Add this check for submodules to be
performed before the working tree is touched.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16 14:07:16 -07:00
Stefan Beller d6b1230067 unpack-trees: pass old oid to verify_clean_submodule
The check (which uses the old oid) is yet to be implemented, but this part
is just a refactor, so it can go separately first.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16 14:07:16 -07:00
Junio C Hamano 2243d229f7 Merge branch 'sb/unpack-trees-super-prefix'
"git read-tree" and its underlying unpack_trees() machinery learned
to report problematic paths prefixed with the --super-prefix option.

* sb/unpack-trees-super-prefix:
  unpack-trees: support super-prefix option
  t1001: modernize style
  t1000: modernize style
  read-tree: use OPT_BOOL instead of OPT_SET_INT
2017-02-03 11:25:18 -08:00
Stefan Beller 3d415425c7 unpack-trees: support super-prefix option
In the future we want to support working tree operations within submodules,
e.g. "git checkout --recurse-submodules", which will update the submodule
to the commit as recorded in its superproject. In the submodule the
unpack-tree operation is carried out as usual, but the reporting to the
user needs to prefix any path with the superproject. The mechanism for
this is the super-prefix. (see 74866d757, git: make super-prefix option)

Add support for the super-prefix option for commands that unpack trees
by wrapping any path output in unpacking trees in the newly introduced
super_prefixed function. This new function prefixes any path with the
super-prefix if there is one.  Assuming the submodule case doesn't happen
in the majority of the cases, we'd want to have a fast behavior for no
super prefix, i.e. no reallocation/copying, but just returning path.

Another aspect of introducing the `super_prefixed` function is to consider
who owns the memory and if this is the right place where the path gets
modified. As the super prefix ought to change the output behavior only and
not the actual unpack tree part, it is fine to be that late in the line.
As we get passed in 'const char *path', we cannot change the path itself,
which means in case of a super prefix we have to copy over the path.
We need two static buffers in that function as the error messages
contain at most two paths.

For testing purposes enable it in read-tree, which has no output
of paths other than an unpack-trees.c. These are all converted in
this patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-25 12:33:33 -08:00
Junio C Hamano 0c0e0fd0ca Merge branch 'sb/unpack-trees-cleanup'
Code cleanup.

* sb/unpack-trees-cleanup:
  unpack-trees: factor progress setup out of check_updates
  unpack-trees: remove unneeded continue
  unpack-trees: move checkout state into check_updates
2017-01-18 15:12:17 -08:00
Stefan Beller 384f1a167b unpack-trees: factor progress setup out of check_updates
This makes check_updates shorter and easier to understand.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10 11:53:33 -08:00
Stefan Beller c4bfc7728b unpack-trees: remove unneeded continue
The continue is the last statement in the loop, so not needed.
This situation arose in 700e66d66 (2010-07-30, unpack-trees: let
read-tree -u remove index entries outside sparse area) when statements
after the continue were removed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10 11:51:19 -08:00
Stefan Beller 30ac275b1c unpack-trees: move checkout state into check_updates
The checkout state was introduced via 16da134b1f
(read-trees: refactor the unpack_trees() part, 2006-07-30). An attempt to
refactor the checkout state was done in b56aa5b268 (unpack-trees: pass
checkout state explicitly to check_updates(), 2016-09-13), but we can
go even further.

The `struct checkout state` is not used in unpack_trees apart from
initializing it, so move it into the function that makes use of it,
which is `check_updates`.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10 11:51:14 -08:00
Junio C Hamano 0f30315ba1 Merge branch 'sb/unpack-trees-grammofix'
* sb/unpack-trees-grammofix:
  unpack-trees: fix grammar for untracked files in directories
2016-12-19 14:45:31 -08:00
Stefan Beller 584f99c87b unpack-trees: fix grammar for untracked files in directories
Noticed-by: David Turner <dturner@twosigma.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-05 12:17:02 -08:00
René Scharfe 68e3d6292f introduce CHECKOUT_INIT
Add a static initializer for struct checkout and use it throughout the
code base.  It's shorter, avoids a memset(3) call and makes sure the
base_dir member is initialized to a valid (empty) string.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-22 13:42:18 -07:00
Junio C Hamano e9c6d3dcc2 Merge branch 'rs/unpack-trees-reduce-file-scope-global'
Code cleanup.

* rs/unpack-trees-reduce-file-scope-global:
  unpack-trees: pass checkout state explicitly to check_updates()
2016-09-21 15:15:26 -07:00
Junio C Hamano 4af9a7d344 Merge branch 'bc/object-id'
The "unsigned char sha1[20]" to "struct object_id" conversion
continues.  Notable changes in this round includes that ce->sha1,
i.e. the object name recorded in the cache_entry, turns into an
object_id.

It had merge conflicts with a few topics in flight (Christian's
"apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's
"status --porcelain-v2").  Extra sets of eyes double-checking for
mismerges are highly appreciated.

* bc/object-id:
  builtin/reset: convert to use struct object_id
  builtin/commit-tree: convert to struct object_id
  builtin/am: convert to struct object_id
  refs: add an update_ref_oid function.
  sha1_name: convert get_sha1_mb to struct object_id
  builtin/update-index: convert file to struct object_id
  notes: convert init_notes to use struct object_id
  builtin/rm: convert to use struct object_id
  builtin/blame: convert file to use struct object_id
  Convert read_mmblob to take struct object_id.
  notes-merge: convert struct notes_merge_pair to struct object_id
  builtin/checkout: convert some static functions to struct object_id
  streaming: make stream_blob_to_fd take struct object_id
  builtin: convert textconv_object to use struct object_id
  builtin/cat-file: convert some static functions to struct object_id
  builtin/cat-file: convert struct expand_data to use struct object_id
  builtin/log: convert some static functions to use struct object_id
  builtin/blame: convert struct origin to use struct object_id
  builtin/apply: convert static functions to struct object_id
  cache: convert struct cache_entry to use struct object_id
2016-09-19 13:47:19 -07:00
René Scharfe b56aa5b268 unpack-trees: pass checkout state explicitly to check_updates()
Add a parameter for the struct checkout variable to check_updates()
instead of using a static global variable.  Passing it explicitly makes
object ownership and usage more easily apparent.  And we get rid of a
static variable; those can be problematic in library-like code.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 16:26:12 -07:00
Alex Henrie a1c8044662 unpack-trees: do not capitalize "working"
In English, only proper nouns are capitalized.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-08 12:17:23 -07:00
brian m. carlson 99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Alex Henrie c2691e2add unpack-trees: fix English grammar in do-this-before-that messages
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 08:29:36 -07:00
Junio C Hamano 40cfc95856 Merge branch 'nd/error-errno'
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.

* nd/error-errno: (41 commits)
  wrapper.c: use warning_errno()
  vcs-svn: use error_errno()
  upload-pack.c: use error_errno()
  unpack-trees.c: use error_errno()
  transport-helper.c: use error_errno()
  sha1_file.c: use {error,die,warning}_errno()
  server-info.c: use error_errno()
  sequencer.c: use error_errno()
  run-command.c: use error_errno()
  rerere.c: use error_errno() and warning_errno()
  reachable.c: use error_errno()
  mailmap.c: use error_errno()
  ident.c: use warning_errno()
  http.c: use error_errno() and warning_errno()
  grep.c: use error_errno()
  gpg-interface.c: use error_errno()
  fast-import.c: use error_errno()
  entry.c: use error_errno()
  editor.c: use error_errno()
  diff-no-index.c: use error_errno()
  ...
2016-05-17 14:38:28 -07:00
Junio C Hamano e5e7a9115d Merge branch 'va/i18n-misc-updates'
Mark several messages for translation.

* va/i18n-misc-updates:
  i18n: unpack-trees: avoid substituting only a verb in sentences
  i18n: builtin/pull.c: split strings marked for translation
  i18n: builtin/pull.c: mark placeholders for translation
  i18n: git-parse-remote.sh: mark strings for translation
  i18n: branch: move comment for translators
  i18n: branch: unmark string for translation
  i18n: builtin/rm.c: remove a comma ',' from string
  i18n: unpack-trees: mark strings for translation
  i18n: builtin/branch.c: mark option for translation
  i18n: index-pack: use plural string instead of normal one
2016-05-17 14:38:23 -07:00
Vasco Almeida 2e3926b948 i18n: unpack-trees: avoid substituting only a verb in sentences
Instead of reusing the same set of message templates for checkout
and other actions and substituting the verb with "%s", prepare
separate message templates for each known action. That would make
it easier for translation into languages where the same verb may
conjugate differently depending on the message we are giving.

See gettext documentation for details:
http://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-12 16:28:43 -07:00
Nguyễn Thái Ngọc Duy 43c728e2c2 unpack-trees.c: use error_errno()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
brian m. carlson 7d924c9139 struct name_entry: use struct object_id instead of unsigned char sha1[20]
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-25 14:23:42 -07:00
Vasco Almeida ed47fdf7fa i18n: unpack-trees: mark strings for translation
Mark strings seen by the user inside setup_unpack_trees_porcelain() and
display_error_msgs() functions for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-12 10:29:16 -07:00
David Turner a6720955f1 unpack-trees: fix accidentally quadratic behavior
While unpacking trees (e.g. during git checkout), when we hit a cache
entry that's past and outside our path, we cut off iteration.

This provides about a 45% speedup on git checkout between master and
master^20000 on Twitter's monorepo.  Speedup in general will depend on
repostitory structure, number of changes, and packfile packing
decisions.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-22 13:03:10 -08:00
David Turner d9c2bd560e do_compare_entry: use already-computed path
In traverse_trees, we generate the complete traverse path for a
traverse_info.  Later, in do_compare_entry, we used to go do a bunch
of work to compare the traverse_info to a cache_entry's name without
computing that path.  But since we already have that path, we don't
need to do all that work.  Instead, we can just put the generated
path into the traverse_info, and do the comparison more directly.

We copy the path because prune_traversal might mutate `base`. This
doesn't happen in any codepaths where do_compare_entry is called,
but it's better to be safe.

This makes git checkout much faster -- about 25% on Twitter's
monorepo.  Deeper directory trees are likely to benefit more than
shallower ones.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-05 13:39:46 -08:00
Jeff King 75faa45ae0 replace trivial malloc + sprintf / strcpy calls with xstrfmt
It's a common pattern to do:

  foo = xmalloc(strlen(one) + strlen(two) + 1 + 1);
  sprintf(foo, "%s %s", one, two);

(or possibly some variant with strcpy()s or a more
complicated length computation).  We can switch these to use
xstrfmt, which is shorter, involves less error-prone manual
computation, and removes many sprintf and strcpy calls which
make it harder to audit the code for real buffer overflows.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano f0bc854623 Sync with 2.5.2 2015-09-09 14:30:35 -07:00
Junio C Hamano 3d3caf0b78 Sync with 2.4.9 2015-09-04 10:43:23 -07:00
Junio C Hamano 8267cd11d6 Sync with 2.2.3 2015-09-04 10:29:28 -07:00
Jeff King f514ef9787 verify_absent: allow filenames longer than PATH_MAX
When unpack-trees wants to know whether a path will
overwrite anything in the working tree, we use lstat() to
see if there is anything there. But if we are going to write
"foo/bar", we can't just lstat("foo/bar"); we need to look
for leading prefixes (e.g., "foo"). So we use the lstat cache
to find the length of the leading prefix, and copy the
filename up to that length into a temporary buffer (since
the original name is const, we cannot just stick a NUL in
it).

The copy we make goes into a PATH_MAX-sized buffer, which
will overflow if the prefix is longer than PATH_MAX. How
this happens is a little tricky, since in theory PATH_MAX is
the biggest path we will have read from the filesystem. But
this can happen if:

  - the compiled-in PATH_MAX does not accurately reflect
    what the filesystem is capable of

  - the leading prefix is not _quite_ what is on disk; it
    contains the next element from the name we are checking.
    So if we want to write "aaa/bbb/ccc/ddd" and "aaa/bbb"
    exists, the prefix of interest is "aaa/bbb/ccc". If
    "aaa/bbb" approaches PATH_MAX, then "ccc" can overflow
    it.

So this can be triggered, but it's hard to do. In
particular, you cannot just "git clone" a bogus repo. The
verify_absent checks happen before unpack-trees writes
anything to the filesystem, so there are never any leading
prefixes during the initial checkout, and the bug doesn't
trigger. And by definition, these files are larger than
PATH_MAX, so writing them will fail, and clone will
complain (though it may write a partial path, which will
cause a subsequent "git checkout" to hit the bug).

We can fix it by creating the temporary path on the heap.
The extra malloc overhead is not important, as we are
already making at least one stat() call (and probably more
for the prefix discovery).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-04 08:50:50 -07:00
Junio C Hamano 8c9155e031 Merge branch 'jk/git-path'
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4.  Their uses have been reduced.

* jk/git-path:
  memoize common git-path "constant" files
  get_repo_path: refactor path-allocation
  find_hook: keep our own static buffer
  refs.c: remove_empty_directories can take a strbuf
  refs.c: avoid git_path assignment in lock_ref_sha1_basic
  refs.c: avoid repeated git_path calls in rename_tmp_log
  refs.c: simplify strbufs in reflog setup and writing
  path.c: drop git_path_submodule
  refs.c: remove extra git_path calls from read_loose_refs
  remote.c: drop extraneous local variable from migrate_file
  prefer mkpathdup to mkpath in assignments
  prefer git_pathdup to git_path in some possibly-dangerous cases
  add_to_alternates_file: don't add duplicate entries
  t5700: modernize style
  cache.h: complete set of git_path_submodule helpers
  cache.h: clarify documentation for git_path, et al
2015-08-19 14:48:56 -07:00
Junio C Hamano 4f66e44300 Merge branch 'as/sparse-checkout-removal' into maint
"sparse checkout" misbehaved for a path that is excluded from the
checkout when switching between branches that differ at the path.

* as/sparse-checkout-removal:
  unpack-trees: don't update files with CE_WT_REMOVE set
2015-08-19 14:41:27 -07:00
Junio C Hamano 9ad8474b98 Merge branch 'dt/unpack-trees-cache-tree-revalidate'
The code to perform multi-tree merges has been taught to repopulate
the cache-tree upon a successful merge into the index, so that
subsequent "diff-index --cached" (hence "status") and "write-tree"
(hence "commit") will go faster.

The same logic in "git checkout" may now be removed, but that is a
separate issue.

* dt/unpack-trees-cache-tree-revalidate:
  unpack-trees: populate cache-tree on successful merge
2015-08-12 14:09:57 -07:00
Jeff King fcd12db6af prefer git_pathdup to git_path in some possibly-dangerous cases
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.

None of the cases changed here has a known bug, but they're
worth converting away from git_path because:

  1. It's easy to use git_pathdup in these cases.

  2. They use constructs (like assignment) that make it
     hard to tell whether they're safe or not.

The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:12 -07:00
Junio C Hamano 8348bf1b69 Merge branch 'as/sparse-checkout-removal'
"sparse checkout" misbehaved for a path that is excluded from the
checkout when switching between branches that differ at the path.

* as/sparse-checkout-removal:
  unpack-trees: don't update files with CE_WT_REMOVE set
2015-08-03 11:01:28 -07:00
Brian Degenhardt 52fca2184d unpack-trees: populate cache-tree on successful merge
When we unpack trees into an existing index, we discard the old
index and replace it with the new, merged index.  Ensure that this
index has its cache-tree populated.  This will make subsequent git
status and commit commands faster.

Signed-off-by: Brian Degenhardt <bmd@bmdhacks.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-28 13:43:13 -07:00
David Turner 7d782416cb unpack-trees: don't update files with CE_WT_REMOVE set
Don't update files in the worktree from cache entries which are
flagged with CE_WT_REMOVE.

When a user does a sparse checkout, git removes files that are
marked with CE_WT_REMOVE (because they are out-of-scope for the
sparse checkout). If those files are also marked CE_UPDATE (for
instance, because they differ in the branch that is being checked
out and the outgoing branch), git would previously recreate them.
This patch prevents them from being recreated.

These erroneously-created files would also interfere with merges,
causing pre-merge revisions of out-of-scope files to appear in the
worktree.

apply_sparse_checkout() is the function where all "action"
manipulation (add, delete, update files..) for sparse checkout
occurs; it should not ask to delete and update both at the same
time.

Signed-off-by: Anatole Shaw <git-devel@omni.poc.net>
Signed-off-by: David Turner <dturner@twopensource.com>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 13:19:20 -07:00
Nguyễn Thái Ngọc Duy e931371a8f untracked cache: invalidate at index addition or removal
Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update untracked cache
right away instead of invalidating it and wait for read_directory()
next time to deal with it. But that may need some more work in
unpack-trees.c. So stay simple as the first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Junio C Hamano 3f1509809e Sync with v2.2.1
* maint:
  Git 2.2.1
  Git 2.1.4
  Git 2.0.5
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-18 12:30:53 -08:00
Junio C Hamano 58f1d950e3 Sync with v2.0.5
* maint-2.0:
  Git 2.0.5
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:42:28 -08:00
Junio C Hamano 5e519fb8b0 Sync with v1.9.5
* maint-1.9:
  Git 1.9.5
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:28:54 -08:00
Junio C Hamano 6898b79721 Sync with v1.8.5.6
* maint-1.8.5:
  Git 1.8.5.6
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:20:31 -08:00
Junio C Hamano 2aa9100846 Merge branch 'dotgit-case-maint-1.8.5' into maint-1.8.5
* dotgit-case-maint-1.8.5:
  fsck: complain about NTFS ".git" aliases in trees
  read-cache: optionally disallow NTFS .git variants
  path: add is_ntfs_dotgit() helper
  fsck: complain about HFS+ ".git" aliases in trees
  read-cache: optionally disallow HFS+ .git variants
  utf8: add is_hfs_dotgit() helper
  fsck: notice .git case-insensitively
  t1450: refactor ".", "..", and ".git" fsck tests
  verify_dotfile(): reject .git case-insensitively
  read-tree: add tests for confusing paths like ".." and ".git"
  unpack-trees: propagate errors adding entries to the index
2014-12-17 11:11:15 -08:00
Jeff King 4616918013 unpack-trees: propagate errors adding entries to the index
When unpack_trees tries to write an entry to the index,
add_index_entry may report an error to stderr, but we ignore
its return value. This leads to us returning a successful
exit code for an operation that partially failed. Let's make
sure to propagate this code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17 10:57:53 -08:00
Junio C Hamano c09988ad94 Merge branch 'jc/unpack-trees-plug-leak'
* jc/unpack-trees-plug-leak:
  unpack_trees: plug leakage of o->result
2014-12-12 14:31:33 -08:00
Junio C Hamano a16cc8b247 unpack_trees: plug leakage of o->result
Most of the time the caller specifies to which destination variable
the resulting index_state should be assigned by passing a non-NULL
pointer in o->dst_index to receive that state, but for a caller that
gives a NULL o->dst_index, the resulting index simply leaked.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-17 13:34:07 -08:00
Junio C Hamano a28e876b9d Merge branch 'jn/unpack-trees-checkout-m-carry-deletion' into maint
* jn/unpack-trees-checkout-m-carry-deletion:
  checkout -m: attempt merge when deletion of path was staged
  unpack-trees: use 'cuddled' style for if-else cascade
  unpack-trees: simplify 'all other failures' case
2014-09-19 14:05:12 -07:00
Junio C Hamano f28763d756 Merge branch 'jn/unpack-trees-checkout-m-carry-deletion'
"git checkout -m" did not switch to another branch while carrying
the local changes forward when a path was deleted from the index.

* jn/unpack-trees-checkout-m-carry-deletion:
  checkout -m: attempt merge when deletion of path was staged
  unpack-trees: use 'cuddled' style for if-else cascade
  unpack-trees: simplify 'all other failures' case
2014-09-11 10:33:36 -07:00
Jonathan Nieder 6a143aa2b2 checkout -m: attempt merge when deletion of path was staged
twoway_merge() is missing an o->gently check in the case where a file
that needs to be modified is missing from the index but present in the
old and new trees.  As a result, in this case 'git checkout -m' errors
out instead of trying to perform a merge.

Fix it by checking o->gently.  While at it, inline the o->gently check
into reject_merge to prevent future call sites from making the same
mistake.

Noticed by code inspection.  The test for the motivating case was
added by JC.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-25 15:17:34 -07:00
Jonathan Nieder 6c1db1b388 unpack-trees: use 'cuddled' style for if-else cascade
Match the predominant style in git by following K&R style for if/else
cascades.  Documentation/CodingStyle from linux.git explains:

  Note that the closing brace is empty on a line of its own, _except_ in
  the cases where it is followed by a continuation of the same statement,
  ie a "while" in a do-statement or an "else" in an if-statement, like
  this:

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

  Rationale: K&R.

  Also, note that this brace-placement also minimizes the number of empty
  (or almost empty) lines, without any loss of readability.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-13 10:32:12 -07:00
Stefan Beller 0ecd180a27 unpack-trees: simplify 'all other failures' case
In the 'if (current)' block of twoway_merge, we handle the boring
errors by checking if the entry from the old tree, current index, and
new tree are present, to get a pathname for the error message from one
of them:

	if (oldtree)
		return o->gently ? -1 : reject_merge(oldtree, o);
	if (current)
		return o->gently ? -1 : reject_merge(current, o);
	if (newtree)
		return o->gently ? -1 : reject_merge(newtree, o);
	return -1;

Since this is guarded by 'if (current)', the second test is guaranteed
to succeed.  Moreover, any of the three entries, if present, would
have the same path because there is no rename detection in this code
path.   Even if some day in the future the entries' paths differ, the
'current' path used in the index and worktree would presumably be the
most recognizable for the end user.

Simplify by just using 'current'.

Noticed by coverity, Id:290002

Signed-off-by: Stefan Beller <stefanbeller@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-13 10:32:08 -07:00
Junio C Hamano 788cef81d4 Merge branch 'nd/split-index'
An experiment to use two files (the base file and incremental
changes relative to it) to represent the index to reduce I/O cost
of rewriting a large index when only small part of the working tree
changes.

* nd/split-index: (32 commits)
  t1700: new tests for split-index mode
  t2104: make sure split index mode is off for the version test
  read-cache: force split index mode with GIT_TEST_SPLIT_INDEX
  read-tree: note about dropping split-index mode or index version
  read-tree: force split-index mode off on --index-output
  rev-parse: add --shared-index-path to get shared index path
  update-index --split-index: do not split if $GIT_DIR is read only
  update-index: new options to enable/disable split index mode
  split-index: strip pathname of on-disk replaced entries
  split-index: do not invalidate cache-tree at read time
  split-index: the reading part
  split-index: the writing part
  read-cache: mark updated entries for split index
  read-cache: save deleted entries in split index
  read-cache: mark new entries for split index
  read-cache: split-index mode
  read-cache: save index SHA-1 after reading
  entry.c: update cache_changed if refresh_cache is set in checkout_entry()
  cache-tree: mark istate->cache_changed on prime_cache_tree()
  cache-tree: mark istate->cache_changed on cache tree update
  ...
2014-07-16 11:25:40 -07:00
Junio C Hamano 3b8e8af187 Merge branch 'jk/xstrfmt'
* jk/xstrfmt:
  setup_git_env(): introduce git_path_from_env() helper
  unique_path: fix unlikely heap overflow
  walker_fetch: fix minor memory leak
  merge: use argv_array when spawning merge strategy
  sequencer: use argv_array_pushf
  setup_git_env: use git_pathdup instead of xmalloc + sprintf
  use xstrfmt to replace xmalloc + strcpy/strcat
  use xstrfmt to replace xmalloc + sprintf
  use xstrdup instead of xmalloc + strcpy
  use xstrfmt in favor of manual size calculations
  strbuf: add xstrfmt helper
2014-07-09 11:34:05 -07:00
Jeremiah Mahler ccdd4a0f3c cleanup duplicate name_compare() functions
We often represent our strings as a counted string, i.e. a pair of
the pointer to the beginning of the string and its length, and the
string may not be NUL terminated to that length.

To compare a pair of such counted strings, unpack-trees.c and
read-cache.c implement their own name_compare() functions
identically.  In addition, the cache_name_compare() function in
read-cache.c is nearly identical.  The only difference is when one
string is the prefix of the other string, in which case
name_compare() returns -1/+1 to show which one is longer, and
cache_name_compare() returns the difference of the lengths to show
the same information.

Unify these three functions by using the implementation from
cache_name_compare().  This does not make any difference to the
existing and future callers, as they must be paying attention only
to the sign of the returned value (and not the magnitude) because
the original implementations of these two functions return values
returned by memcmp(3) when the one string is not a prefix of the
other string, and the only thing memcmp(3) guarantees its callers is
the sign of the returned value, not the magnitude.

Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:12:14 -07:00
Jeff King fa3f60b783 use xstrfmt in favor of manual size calculations
In many parts of the code, we do an ugly and error-prone
malloc like:

  const char *fmt = "something %s";
  buf = xmalloc(strlen(foo) + 10 + 1);
  sprintf(buf, fmt, foo);

This makes the code brittle, and if we ever get the
allocation wrong, is a potential heap overflow. Let's
instead favor xstrfmt, which handles the allocation
automatically, and makes the code shorter and more readable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 12:25:17 -07:00
Nguyễn Thái Ngọc Duy 078a58e825 read-cache: mark updated entries for split index
The large part of this patch just follows CE_ENTRY_CHANGED
marks. replace_index_entry() is updated to update
split_index->base->cache[] as well so base->cache[] does not reference
to a freed entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:40 -07:00
Nguyễn Thái Ngọc Duy 5fc2fc8fa2 read-cache: split-index mode
This split-index mode is designed to keep write cost proportional to
the number of changes the user has made, not the size of the work
tree. (Read cost is another matter, to be dealt separately.)

This mode stores index info in a pair of $GIT_DIR/index and
$GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over
time while "index" is smaller and updated often. Format details are in
index-format.txt, although not everything is implemented in this
patch.

Shared indexes are not automatically removed, because it's unclear if
the shared index is needed by any (even temporary) indexes by just
looking at it. After a while you'll collect stale shared indexes. The
good news is one shared index is useable for long, until
$GIT_DIR/index becomes too big and sluggish that the new shared index
must be created.

The safest way to clean shared indexes is to turn off split index
mode, so shared files are all garbage, delete them all, then turn on
split index mode again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy e93021b20a read-cache: save index SHA-1 after reading
Also update SHA-1 after writing. If we do not do that, the second
read_index() will see "initialized" variable already set and not read
.git/index again, which is fine, except istate->sha1 now has a stale
value.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy d4a2024aef entry.c: update cache_changed if refresh_cache is set in checkout_entry()
Other fill_stat_cache_info() is on new entries, which should set
CE_ENTRY_ADDED in cache_changed, so we're safe.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy a5400efe29 cache-tree: mark istate->cache_changed on cache tree invalidation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Nguyễn Thái Ngọc Duy a5c446f116 unpack-trees: be specific what part of the index has changed
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Junio C Hamano 0963008cbf Merge branch 'nd/i18n-progress'
Mark the progress indicators from various time-consuming commands
for i18n/l10n.

* nd/i18n-progress:
  i18n: mark all progress lines for translation
2014-03-14 14:26:31 -07:00
Junio C Hamano d637d1b9a8 Merge branch 'kb/fast-hashmap'
Improvements to our hash table to get it to meet the needs of the
msysgit fscache project, with some nice performance improvements.

* kb/fast-hashmap:
  name-hash: retire unused index_name_exists()
  hashmap.h: use 'unsigned int' for hash-codes everywhere
  test-hashmap.c: drop unnecessary #includes
  .gitignore: test-hashmap is a generated file
  read-cache.c: fix memory leaks caused by removed cache entries
  builtin/update-index.c: cleanup update_one
  fix 'git update-index --verbose --again' output
  remove old hash.[ch] implementation
  name-hash.c: remove cache entries instead of marking them CE_UNHASHED
  name-hash.c: use new hash map implementation for cache entries
  name-hash.c: remove unreferenced directory entries
  name-hash.c: use new hash map implementation for directories
  diffcore-rename.c: use new hash map implementation
  diffcore-rename.c: simplify finding exact renames
  diffcore-rename.c: move code around to prepare for the next patch
  buitin/describe.c: use new hash map implementation
  add a hashtable implementation that supports O(1) removal
  submodule: don't access the .gitmodules cache entry after removing it
2014-02-27 14:01:09 -08:00
Nguyễn Thái Ngọc Duy 754dbc43f0 i18n: mark all progress lines for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 09:08:37 -08:00
Junio C Hamano 4766036ecd Merge branch 'jk/two-way-merge-corner-case-fix' into maint
"git am --abort" sometimes complained about not being able to write
a tree with an 0{40} object in it.

* jk/two-way-merge-corner-case-fix:
  t1005: add test for "read-tree --reset -u A B"
  t1005: reindent
  unpack-trees: fix "read-tree -u --reset A B" with conflicted index
2013-12-17 11:32:17 -08:00
Antoine Pelisse fc2b621454 Prevent buffer overflows when path is too long
Some buffers created with PATH_MAX length are not checked when being
written, and can overflow if PATH_MAX is not big enough to hold the
path.

Replace those buffers by strbufs so that their size is automatically
grown if necessary. They are created as static local variables to avoid
reallocating memory on each call. Note that prefix_filename() returns
this static buffer so each callers should copy or use the string
immediately (this is currently true).

Reported-by: Wataru Noguchi <wnoguchi.0727@gmail.com>
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 14:06:19 -08:00
Junio C Hamano 3979580265 Merge branch 'jk/two-way-merge-corner-case-fix'
Fix a rather longstanding corner-case bug in twoway "reset to
there" merge, which is most often seen in "git am --abort".

* jk/two-way-merge-corner-case-fix:
  t1005: add test for "read-tree --reset -u A B"
  t1005: reindent
  unpack-trees: fix "read-tree -u --reset A B" with conflicted index
2013-12-05 12:59:25 -08:00
Karsten Blees 419a597f64 name-hash.c: remove cache entries instead of marking them CE_UNHASHED
The new hashmap implementation supports remove, so really remove unused
cache entries from the name hashmap instead of just marking them.

The CE_UNHASHED flag and CE_STATE_MASK are no longer needed.

Keep the CE_HASHED flag to prevent adding entries twice.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18 13:04:24 -08:00
Karsten Blees 8b013788a1 name-hash.c: use new hash map implementation for cache entries
Note: the "ce->next = NULL;" in unpack-trees.c::do_add_entry can safely be
removed, as ce->next (now ce->ent.next) is always properly initialized in
name-hash.c::hash_index_entry.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18 13:04:24 -08:00
Jeff King b018ff6085 unpack-trees: fix "read-tree -u --reset A B" with conflicted index
When we call "read-tree --reset -u HEAD ORIG_HEAD", the first thing we
do with the index is to call read_cache_unmerged.  Originally that
would read the index, leaving aside any unmerged entries.  However, as
of d1a43f2 (reset --hard/read-tree --reset -u: remove unmerged new
paths, 2008-10-15), it actually creates a new cache entry to serve as
a placeholder, so that we later know to update the working tree.

However, we later noticed that the sha1 of that unmerged entry was
just copied from some higher stage, leaving you with random content in
the index.  That was fixed by e11d7b5 ("reset --merge": fix unmerged
case, 2009-12-31), which instead puts the null sha1 into the newly
created entry, and sets a CE_CONFLICTED flag. At the same time, it
teaches the unpack-trees machinery to pay attention to this flag, so
that oneway_merge throws away the current value.

However, it did not update the code paths for twoway_merge, which is
where we end up in the two-way read-tree with --reset. We notice that
the HEAD and ORIG_HEAD versions are the same, and say "oh, we can just
reuse the current version". But that's not true. The current version
is bogus.

Notice this case and make sure we do not keep the bogus entry; either
we do not have that path in the tree we are moving to (i.e. remove
it), or we want to have the cache entry we created for the tree we are
moving to (i.e. resolve by explicitly saying the "newtree" version is
what we want).

[jc: this is from the almost year-old $gmane/212316]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-04 10:13:27 -08:00
Eric Sunshine ebbd7439b1 employ new explicit "exists in index?" API
Each caller of index_name_exists() knows whether it is looking for a
directory or a file, and can avoid the unnecessary indirection of
index_name_exists() by instead calling index_dir_exists() or
index_file_exists() directly.

Invoking the appropriate search function explicitly will allow a
subsequent patch to relieve callers of the artificial burden of having
to add a trailing '/' to the pathname given to index_dir_exists().

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17 10:07:37 -07:00
Felipe Contreras e28f764159 unpack-trees: plug a memory leak
Before overwriting the destination index, first let's discard its
contents.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Лежанкин Иван <abyss.7@gmail.com> wrote:
Reviewed-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-13 14:37:30 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano bd21822572 Merge branch 'rs/unpack-trees-tree-walk-conflict-field'
Code clean-up.

* rs/unpack-trees-tree-walk-conflict-field:
  unpack-trees: don't shift conflicts left and right
2013-06-24 13:48:44 -07:00
René Scharfe 603d249853 unpack-trees: don't shift conflicts left and right
If o->merge is set, the struct traverse_info member conflicts is shifted
left in unpack_callback, then passed through traverse_trees_recursive
to unpack_nondirectories, where it is shifted right before use.  Stop
the shifting and just pass the conflict bit mask as is.  Rename the
member to df_conflicts to prove that it isn't used anywhere else.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-17 09:24:47 -07:00
René Scharfe 5d80ef5a6e unpack-trees: free cache_entry array members for merges
The merge functions duplicate entries as needed and they don't free
them.  Release them in unpack_nondirectories, the same function
where they were allocated, after we're done.

As suggested by Felipe, use the same loop style (zero-based for loop)
for freeing as for allocating.

Improved-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:15 -07:00
René Scharfe 5828e8352c diff-lib, read-tree, unpack-trees: mark cache_entry array paramters const
Change the type merge_fn_t to accept the array of cache_entry pointers
as const pointers to const pointers.  This documents the fact that the
merge functions don't modify the cache_entry contents or replace any of
the pointers in the array.

Only a single cast is necessary in unpack_nondirectories because adding
two const modifiers at once is not allowed in C.  The cast is safe in
that it doesn't mask any modfication; call_unpack_fn only needs the
array for reading.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:14 -07:00
René Scharfe eb9ae4b505 diff-lib, read-tree, unpack-trees: mark cache_entry pointers const
Add const to struct cache_entry pointers throughout the tree which are
only used for reading.  This allows callers to pass in const pointers.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:14 -07:00
René Scharfe f2fa354205 unpack-trees: create working copy of merge entry in merged_entry
Duplicate the merge entry right away and work with that instead of
modifying the entry we got and duplicating it only at the end of
the function.  Then mark that pointer const to document that we
don't modify the referenced cache_entry.

This change is safe because all existing merge functions call
merged_entry just before returning (or not at all), i.e. they don't
care about changes to the referenced cache_entry after the call.
unpack_nondirectories and unpack_index_entry, which call the merge
functions through call_unpack_fn, aren't interested in such changes
neither.

The change complicates merged_entry a bit because we have to free the
copy if we error out, but allows callers to pass a const pointer.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:13 -07:00
René Scharfe a33bd4d34d unpack-trees: factor out dup_entry
While we're add it, mark the struct cache_entry pointer of add_entry
const because we only read from it and this allows callers to pass in
const pointers.

Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 15:31:13 -07:00
Karsten Blees b07bc8c8c3 dir.c: replace is_path_excluded with now equivalent is_excluded API
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 12:34:01 -07:00
Junio C Hamano a39b15b4f6 Merge branch 'as/check-ignore'
Add a new command "git check-ignore" for debugging .gitignore
files.

The variable names may want to get cleaned up but that can be done
in-tree.

* as/check-ignore:
  clean.c, ls-files.c: respect encapsulation of exclude_list_groups
  t0008: avoid brace expansion
  add git-check-ignore sub-command
  setup.c: document get_pathspec()
  add.c: extract new die_if_path_beyond_symlink() for reuse
  add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
  pathspec.c: rename newly public functions for clarity
  add.c: move pathspec matchers into new pathspec.c for reuse
  add.c: remove unused argument from validate_pathspec()
  dir.c: improve docs for match_pathspec() and match_pathspec_depth()
  dir.c: provide clear_directory() for reclaiming dir_struct memory
  dir.c: keep track of where patterns came from
  dir.c: use a single struct exclude_list per source of excludes

Conflicts:
	builtin/ls-files.c
	dir.c
2013-01-23 21:19:10 -08:00
Junio C Hamano d912b0e44f Merge branch 'as/dir-c-cleanup'
Refactor and generally clean up the directory traversal API
implementation.

* as/dir-c-cleanup:
  dir.c: rename free_excludes() to clear_exclude_list()
  dir.c: refactor is_path_excluded()
  dir.c: refactor is_excluded()
  dir.c: refactor is_excluded_from_list()
  dir.c: rename excluded() to is_excluded()
  dir.c: rename excluded_from_list() to is_excluded_from_list()
  dir.c: rename path_excluded() to is_path_excluded()
  dir.c: rename cryptic 'which' variable to more consistent name
  Improve documentation and comments regarding directory traversal API
  api-directory-listing.txt: update to match code
2013-01-10 13:47:25 -08:00
Adam Spiers c082df2453 dir.c: use a single struct exclude_list per source of excludes
Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06 14:25:06 -08:00
Adam Spiers f619881251 dir.c: rename free_excludes() to clear_exclude_list()
It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.

http://article.gmane.org/gmane.comp.version-control.git/206128

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:47 -08:00
Adam Spiers 0795805053 dir.c: rename excluded_from_list() to is_excluded_from_list()
Continue adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Also adjust their callers as necessary.

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Adam Spiers 9013089c4a dir.c: rename path_excluded() to is_path_excluded()
Start adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was agreed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:45 -08:00
Martin von Zweigbergk 686b2de0ce oneway_merge(): only lstat() when told to update worktree
Although the subject line of 613f027 (read-tree -u one-way merge fix
to check out locally modified paths., 2006-05-15) mentions "read-tree
-u", it did not seem to check whether -u was in effect. Not checking
whether -u is in effect makes e.g. "read-tree --reset" lstat() the
worktree, even though the worktree stat should not matter for that
operation.

This speeds up e.g. "git reset" a little on the linux-2.6 repo (best
of five, warm cache):

        Before      After
real    0m0.288s    0m0.233s
user    0m0.190s    0m0.150s
sys     0m0.090s    0m0.080s

Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-20 13:07:22 -08:00
Junio C Hamano 30ea575876 Merge branch 'tg/ce-namelen-field'
Split lower bits of ce_flags field and creates a new ce_namelen
field in the in-core index structure.

* tg/ce-namelen-field:
  Strip namelen out of ce_flags into a ce_namelen field
2012-07-23 20:55:21 -07:00
Junio C Hamano cd733f4f71 Merge branch 'jc/ls-files-i-dir' into maint
"git ls-files --exclude=t -i" did not consider anything under t/ as
excluded, as it did not pay attention to exclusion of leading paths
while walking the index.  Other two users of excluded() are also
updated.

* jc/ls-files-i-dir:
  dir.c: make excluded() file scope static
  unpack-trees.c: use path_excluded() in check_ok_to_remove()
  builtin/add.c: use path_excluded()
  path_excluded(): update API to less cache-entry centric
  ls-files -i: micro-optimize path_excluded()
  ls-files -i: pay attention to exclusion of leading paths
2012-07-11 12:44:35 -07:00
Thomas Gummerer b60e188c51 Strip namelen out of ce_flags into a ce_namelen field
Strip the name length from the ce_flags field and move it
into its own ce_namelen field in struct cache_entry. This
will both give us a tiny bit of a performance enhancement
when working with long pathnames and is a refactoring for
more readability of the code.

It enhances readability, by making it more clear what
is a flag, and where the length is stored and make it clear
which functions use stages in comparisions and which only
use the length.

It also makes CE_NAMEMASK private, so that users don't
mistakenly write the name length in the flags.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-11 09:42:45 -07:00
Thomas Gummerer 68c4f6a577 Replace strlen() with ce_namelen()
Replace strlen(ce->name) with ce_namelen() in a couple
of places which gives us some additional bits of
performance.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-08 19:49:34 -07:00
Junio C Hamano 1966babf6e Merge branch 'jc/ls-files-i-dir'
"git ls-files --exclude=t -i" did not consider anything under t/
as excluded, as it did not pay attention to exclusion of leading
paths while walking the index.  Other two users of excluded() are
also updated.

* jc/ls-files-i-dir:
  dir.c: make excluded() file scope static
  unpack-trees.c: use path_excluded() in check_ok_to_remove()
  builtin/add.c: use path_excluded()
  path_excluded(): update API to less cache-entry centric
  ls-files -i: micro-optimize path_excluded()
  ls-files -i: pay attention to exclusion of leading paths
2012-06-21 14:42:07 -07:00
Junio C Hamano 589570dbe7 unpack-trees.c: use path_excluded() in check_ok_to_remove()
This function is responsible for determining if a path that is not
tracked is ignored and allow "checkout" to overwrite it as needed.
It used excluded() without checking if higher level directory in the
path is ignored; correct it to use path_excluded() for this check.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * There are uses of lower-level interface excluded_from_list() in
   the codepath for narrow-checkout hack; they are supposed to be
   already checking each level as they descend, and are not touched
   with this patch.
2012-06-05 22:21:42 -07:00
Junio C Hamano adc7052bb6 Merge branch 'maint'
By Jens Lehmann (1) and Johannes Sixt (1)
* maint:
  Consistently use "superproject" instead of "supermodule"
  t3404: begin "exchange commits with -p" test with correct preconditions
2012-05-20 15:45:35 -07:00
Jens Lehmann cb8ad289c6 Consistently use "superproject" instead of "supermodule"
We fairly consistently say "superproject" and never "supermodule" these
days. But there are seven occurrences of "supermodule" left in the current
work tree. Three appear in Release Notes for 1.5.3 and 1.7.7, three in
test names and one in a C-code comment.

Replace all occurrences of "supermodule" outside of the Release Notes
(which shouldn't be changed after the fact) with "superproject" for
consistency.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20 14:58:38 -07:00
Junio C Hamano 2fa4fff4b9 Merge branch 'pw/message-cleanup'
Many error/warning messages had extra trailing newlines that are
unnecessary.

By Pete Wyckoff
* pw/message-cleanup:
  remove blank filename in error message
  remove superfluous newlines in error messages
2012-05-02 13:53:35 -07:00
Junio C Hamano d4a5d872c0 Merge branch 'jc/index-v4'
Trivially shrinks the on-disk size of the index file to save both I/O and
checksum overhead.

The topic should give a solid base to build on further updates, with the
code refactoring in its earlier parts, and the backward compatibility
mechanism in its later parts.

* jc/index-v4:
  index-v4: document the entry format
  unpack-trees: preserve the index file version of original
  update-index: upgrade/downgrade on-disk index version
  read-cache.c: write prefix-compressed names in the index
  read-cache.c: read prefix-compressed names in index on-disk version v4
  read-cache.c: move code to copy incore to ondisk cache to a helper function
  read-cache.c: move code to copy ondisk to incore cache to a helper function
  read-cache.c: report the header version we do not understand
  read-cache.c: make create_from_disk() report number of bytes it consumed
  read-cache.c: allow unaligned mapping of the index file
  cache.h: hide on-disk index details
  varint: make it available outside the context of pack
2012-05-02 13:51:13 -07:00
Pete Wyckoff 82247e9bd5 remove superfluous newlines in error messages
The error handling routines add a newline.  Remove
the duplicate ones in error messages.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-30 15:45:51 -07:00
Junio C Hamano 9170c7ab28 unpack-trees: preserve the index file version of original
Otherwise "git checkout $other_branch" (or even "git checkout HEAD")
would end up writing the index out in the default format.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-27 16:03:29 -07:00
René Scharfe 6ff264ee05 unpack-trees: plug minor memory leak
The allocations made by unpack_nondirectories() using create_ce_entry()
are never freed.

In the non-merge case, we duplicate them using add_entry() and later
only look at the first allocated element (src[0]), perhaps even only
by mistake.  Split out the actual addition from add_entry() into the
new helper do_add_entry() and call this non-duplicating function
instead of add_entry() to avoid the leak.

Valgrind reports this for the command "git archive v1.7.9" without
the patch:

  ==13372== LEAK SUMMARY:
  ==13372==    definitely lost: 230,986 bytes in 2,325 blocks
  ==13372==    indirectly lost: 0 bytes in 0 blocks
  ==13372==      possibly lost: 98 bytes in 1 blocks
  ==13372==    still reachable: 2,259,198 bytes in 3,243 blocks
  ==13372==         suppressed: 0 bytes in 0 blocks

And with the patch applied:

  ==13375== LEAK SUMMARY:
  ==13375==    definitely lost: 65 bytes in 1 blocks
  ==13375==    indirectly lost: 0 bytes in 0 blocks
  ==13375==      possibly lost: 0 bytes in 0 blocks
  ==13375==    still reachable: 2,364,417 bytes in 3,245 blocks
  ==13375==         suppressed: 0 bytes in 0 blocks

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 16:36:23 -07:00
René Scharfe 97e5954bdc unpack-trees: don't perform any index operation if we're not merging
src[0] points to the index entry in the merge case and to the first
tree to unpack in the non-merge case.  We only want to mark the index
entry, so check first if we're merging.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 16:36:18 -07:00
Nguyễn Thái Ngọc Duy 0de1633783 tree-walk.c: do not leak internal structure in tree_entry_len()
tree_entry_len() does not simply take two random arguments and return
a tree length. The two pointers must point to a tree item structure,
or struct name_entry. Passing random pointers will return incorrect
value.

Force callers to pass struct name_entry instead of two pointers (with
hope that they don't manually construct struct name_entry themselves)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-27 11:08:26 -07:00
Junio C Hamano 08ec3b5e4d Merge branch 'nd/maint-sparse-errors'
* nd/maint-sparse-errors:
  Add explanation why we do not allow to sparse checkout to empty working tree
  sparse checkout: show error messages when worktree shaping fails
2011-10-13 19:03:18 -07:00
Junio C Hamano 5fbef463a1 Merge branch 'mg/maint-doc-sparse-checkout'
* mg/maint-doc-sparse-checkout:
  git-read-tree.txt: correct sparse-checkout and skip-worktree description
  git-read-tree.txt: language and typography fixes
  unpack-trees: print "Aborting" to stderr
2011-10-05 12:36:25 -07:00
Junio C Hamano 1b840a5662 Merge branch 'jc/diff-index-unpack'
* jc/diff-index-unpack:
  diff-index: pass pathspec down to unpack-trees machinery
  unpack-trees: allow pruning with pathspec
  traverse_trees(): allow pruning with pathspec
2011-10-05 12:35:53 -07:00
Nguyễn Thái Ngọc Duy a7bc906f2e Add explanation why we do not allow to sparse checkout to empty working tree
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-22 11:35:48 -07:00
Nguyễn Thái Ngọc Duy 17d26a4d04 sparse checkout: show error messages when worktree shaping fails
verify_* functions can queue errors up and to be printed later at
label return_failed. In case of errors, do not go to label "done"
directly because all queued messages would be dropped on the floor.

Found-by: Joshua Jensen <jjensen@workspacewhiz.com>
Tracked-down-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-22 11:35:44 -07:00
Michael J Gruber 6f90969ba8 unpack-trees: print "Aborting" to stderr
display_error_msgs() prints all the errors to stderr already (if any),
followed by "Aborting" (if any) to stdout. Make the latter go to stderr
instead.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-21 15:05:53 -07:00
Junio C Hamano 40e372563c unpack-trees: allow pruning with pathspec
Use the pathspec pruning of traverse_trees() from unpack_trees(). Again,
the unpack_trees() machinery is primarily meant for merging two (or more)
trees, and because a merge is a full tree operation, it didn't support any
pruning with pathspec, and this codepath probably should not be enabled
while running a merge, but the caller in diff-lib.c::diff_cache() should
be able to take advantage of it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-29 15:08:31 -07:00
Junio C Hamano b35acb5345 Merge branch 'maint'
* maint:
  Break down no-lstat() condition checks in verify_uptodate()
  t7400: fix bogus test failure with symlinked trash
  Documentation: clarify the invalidated tree entry format
2011-07-31 18:57:32 -07:00
Nguyễn Thái Ngọc Duy d5b6629904 Break down no-lstat() condition checks in verify_uptodate()
Make it easier to grok under what conditions we can skip lstat().

While at there, shorten ie_match_stat() line for the sake of my eyes.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-31 18:42:38 -07:00
Junio C Hamano 57e4d61686 Merge branch 'jc/diff-index-quick-exit-early'
* jc/diff-index-quick-exit-early:
  diff-index --quiet: learn the "stop feeding the backend early" logic

Conflicts:
	unpack-trees.h
2011-06-29 17:03:11 -07:00
Junio C Hamano b4194828dc diff-index --quiet: learn the "stop feeding the backend early" logic
A negative return from the unpack callback function usually means unpack
failed for the entry and signals the unpack_trees() machinery to fail the
entire merge operation, immediately and there is no other way for the
callback to tell the machinery to exit early without reporting an error.

This is what we usually want to make a merge all-or-nothing operation, but
the machinery is also used for diff-index codepath by using a custom
unpack callback function. And we do sometimes want to exit early without
failing, namely when we are under --quiet and can short-cut the diff upon
finding the first difference.

Add "exiting_early" field to unpack_trees_options structure, to signal the
unpack_trees() machinery that the negative return value is not signaling
an error but an early return from the unpack_trees() machinery. As this by
definition hasn't unpacked everything, discard the resulting index just
like the failure codepath.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 11:24:12 -07:00
Jens Lehmann 2c9078d05b unpack-trees: add the dry_run flag to unpack_trees_options
Until now there was no way to test if unpack_trees() with update=1 would
succeed without really updating the work tree. The reason for that is that
setting update to 0 does skip the tests for new files and deactivates the
sparse handling, thereby making that unsuitable as a dry run.

Add the new dry_run flag to struct unpack_trees_options unpack_trees().
Setting that together with the update flag will check if the work tree
update would be successful without doing it for real.

The only class of problems that is not detected at the moment are file
system conditions like ENOSPC or missing permissions. Also the index
entries of updated files are not as they would be after a real checkout
because lstat() isn't run as the files aren't updated for real.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-25 14:32:02 -07:00
Nguyễn Thái Ngọc Duy 28911091c1 sparse checkout: do not eagerly decide the fate for whole directory
Sparse-setting code follows closely how files are excluded in
read_directory(), every entry (including directories) are fed to
excluded_from_list() to decide if the entry is suitable. Directories
are treated no different than files. If a directory is matched (or
not), the whole directory is considered matched (or not) and the
process moves on.

This generally works as long as there are no patterns to exclude parts
of the directory. In case of sparse checkout code, the following patterns

  t
  !t/t0000-basic.sh

will produce a worktree with full directory "t" even if t0000-basic.sh
is requested to stay out.

By the same reasoning, if a directory is to be excluded, any rules to
re-include certain files within that directory will be ignored.

Fix it by always checking files against patterns. If no pattern can be
used to decide whether an entry is in our out
(ie. excluded_from_list() returns -1), the entry will be
included/excluded the same as their parent directory.

Noticed-by: <skillzero@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-10 09:04:26 -07:00
Stephen Boyd c2e86addb8 Fix sparse warnings
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 10:16:54 -07:00
Junio C Hamano 4e363b11bd Revert "unpack_trees(): skip trees that are the same in all input"
This reverts commit 83c90314aa, which
seems to have broken merge to report conflicts when there should be
none.
2011-02-15 10:47:04 -08:00
Junio C Hamano 7551391478 Merge branch 'jc/unpack-trees'
* jc/unpack-trees:
  unpack_trees(): skip trees that are the same in all input
  unpack-trees.c: cosmetic fix

Conflicts:
	unpack-trees.c
2011-02-09 16:41:17 -08:00
Junio C Hamano 5bb20ece6b Merge branch 'jn/unpack-lstat-failure-report'
* jn/unpack-lstat-failure-report:
  unpack-trees: handle lstat failure for existing file
  unpack-trees: handle lstat failure for existing directory
2011-02-09 16:41:16 -08:00
Jonathan Nieder a93e530184 unpack-trees: handle lstat failure for existing file
When check_leading_path notices a file in the way of a new entry to be
checked out, verify_absent uses (1) the mode to determine whether it
is a directory (2) the rest of the stat information to check if this
is actually an old entry, disguised by a change in filename (e.g.,
README -> Readme) that is significant to git but insignificant to the
underlying filesystem.  If lstat fails, these checks are performed
with an uninitialied stat structure, producing essentially random
results.

Better to just error out when lstat fails.

The easiest way to reproduce this is to remove a file after the
check_leading_path call and before the lstat in verify_absent.  An
lstat failure other than ENOENT in check_leading_path would also
trigger the same code path.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-01-13 11:26:09 -08:00
Jonathan Nieder 92fda79ed0 unpack-trees: handle lstat failure for existing directory
When check_leading_path notices no file in the way of the new entry to
be checked out, verify_absent checks whether there is a directory
there or nothing at all.  If that lstat call fails (for example due to
ENOMEM), it assumes ENOENT, meaning a directory with untracked files
would be clobbered in that case.

Check errno after calling lstat, and for conditions other than ENOENT,
just error out.

This is a theoretical race condition.  lstat has to succeed moments
before it fails for there to be trouble.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-01-13 11:25:32 -08:00
Junio C Hamano 83c90314aa unpack_trees(): skip trees that are the same in all input
unpack_trees() merges two trees (the current HEAD and the destination
commit) when switching to another branch, checking and updating the index
entry where the destination differs from the current HEAD.  It merges three
trees (the common ancestor, the current HEAD and the other commit) when
performing a three-way merge, checking and updating the index entry when
the merge result differs from the current HEAD.  It does so by walking the
input trees in parallel all the way down to the leaves.

One common special case is a directory is identical across the trees
involved in the merge.  In such a case, we do not have to descend into the
directory at all---we know that the end result is to keep the entries in
the current index.

This optimization cannot be applied in a few special cases in
unpack_trees(), though.  We need to descend into the directory and update
the index entries from the target tree in the following cases:

 - When resetting (e.g. "git reset --hard"); and

 - When checking out a tree for the first time into an empty working tree
   (e.g. "git read-tree -m -u HEAD HEAD" with missing .git/index).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-01-04 13:03:32 -08:00
Junio C Hamano e39212ab08 Merge branch 'nd/maint-fix-add-typo-detection'
* nd/maint-fix-add-typo-detection:
  Revert "excluded_1(): support exclude files in index"
  unpack-trees: fix sparse checkout's "unable to match directories"
  unpack-trees: move all skip-worktree checks back to unpack_trees()
  dir.c: add free_excludes()
  cache.h: realign and use (1 << x) form for CE_* constants
2010-12-22 14:40:26 -08:00
Junio C Hamano 84563a624e unpack-trees.c: cosmetic fix
Make the parts a bit more readable before touching them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-22 10:02:13 -08:00
Clemens Buchacher 3055d78f97 use persistent memory for rejected paths
An aborted merge prints the list of rejected paths as part of the
error message. Since commit f66caaf9 (do not overwrite files in
leading path), some of those paths do not have static buffers, so
we have to keep a copy. Use string_list's to accomplish this.

This changes the order of the list to the order in which the paths
are processed. Previously, it was reversed.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-14 08:55:13 -08:00
Clemens Buchacher b1735b1ab7 do not overwrite files in leading path
If the work tree contains an untracked file x, and
unpack-trees wants to checkout a path x/*, the
file x is removed unconditionally.

Instead, apply the same checks that are normally
used for untracked files, and abort if the file
cannot be removed.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-14 08:55:12 -08:00
Clemens Buchacher 6838d1ad6b add function check_ok_to_remove()
This wraps some inline code into the function check_ok_to_remove(),
which will later be used for leading path components as well.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-14 08:55:10 -08:00
Nguyễn Thái Ngọc Duy 9037026d34 unpack-trees: fix sparse checkout's "unable to match directories"
Matching index entries against an excludes file currently has two
problems.

First, there's no function to do it.  Code paths (like sparse
checkout) that wanted to try it would iterate over index entries and
for each index entry pass that path to excluded_from_list().  But that
is not how excluded_from_list() works; one is supposed to feed in each
ancester of a path before a given path to find out if it was excluded
because of some parent or grandparent matching a

  bigsubdirectory/

pattern despite the path not matching any .gitignore pattern directly.

Second, it's inefficient.  The excludes mechanism is supposed to let
us block off vast swaths of the filesystem as uninteresting; separately
checking every index entry doesn't fit that model.

Introduce a new function to take care of both these problems.  This
traverses the index in depth-first order (well, that's what order the
index is in) to mark un-excluded entries.

Maybe some day the in-core index format will be restructured to make
this sort of operation easier.  Or maybe we will want to try some
binary search based thing.  The interface is simple enough to allow
all those things.  Example:

  clear_ce_flags(the_index.cache, the_index.cache_nr,
                 CE_CANDIDATE, CE_CLEARME, exclude_list);

would clear the CE_CLEARME flag on all index entries with
CE_CANDIDATE flag and not matched by exclude_list.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-30 17:28:09 -08:00
Nguyễn Thái Ngọc Duy 2431afbf1b unpack-trees: move all skip-worktree checks back to unpack_trees()
Earlier, the will_have_skip_worktree() checks are done in various
places, which makes it hard to traverse the index tree-alike, required
by excluded_from_list(). This patch moves all the checks into two
loops in unpack_trees().

Entries in index in this operation can be classified into two
groups: ones already in index before unpack_trees() is called and ones
added to index after traverse_trees() is called.

In both groups, before checking file status on worktree, the future
skip-worktree bit must be checked, so that if an entry will be outside
worktree, worktree should not be checked.

For the first group, the future skip-worktree bit is precomputed and
stored as CE_NEW_SKIP_WORKTREE in the first loop before
traverse_trees() is called so that *way_merge() function does not need
to compute it again.

For the second group, because we don't know what entries will be in
this group until traverse_trees() finishes, operations that need
future skip-worktree check is delayed until CE_NEW_SKIP_WORKTREE is
computed in the second loop. CE_ADDED is used to mark entries in the
second group.

CE_ADDED and CE_NEW_SKIP_WORKTREE are temporary flags used in
unpack_trees().  CE_ADDED is only used by add_to_index(), which should
not be called while unpack_trees() is running.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 13:35:12 -08:00
Nguyễn Thái Ngọc Duy 0fd0e2417d dir.c: add free_excludes()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 13:34:55 -08:00
Clemens Buchacher 7980872d4e use persistent memory for rejected paths
An aborted merge prints the list of rejected paths as part of the
error message. Since commit f66caaf9 (do not overwrite files in
leading path), some of those paths do not have static buffers, so
we have to keep a copy. Use string_list's to accomplish this.

This changes the order of the list to the order in which the paths
are processed. Previously, it was reversed.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-15 15:05:34 -08:00
Clemens Buchacher f66caaf9c8 do not overwrite files in leading path
If the work tree contains an untracked file x, and
unpack-trees wants to checkout a path x/*, the
file x is removed unconditionally.

Instead, apply the same checks that are normally
used for untracked files, and abort if the file
cannot be removed.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
2010-10-13 14:34:09 -07:00
Clemens Buchacher a9307f5a68 add function check_ok_to_remove()
This wraps some inline code into the function check_ok_to_remove(),
which will later be used for leading path components as well.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
2010-10-13 14:34:08 -07:00
Junio C Hamano c208e05bd9 Merge branch 'dg/local-mod-error-messages'
* dg/local-mod-error-messages:
  t7609-merge-co-error-msgs: test non-fast forward case too.
  Move "show_all_errors = 1" to setup_unpack_trees_porcelain()
  setup_unpack_trees_porcelain: take the whole options struct as parameter
  Move set_porcelain_error_msgs to unpack-trees.c and rename it

Conflicts:
	merge-recursive.c
2010-09-03 22:23:49 -07:00
Matthieu Moy 5e65ee35dd Move "show_all_errors = 1" to setup_unpack_trees_porcelain()
Not only this makes the code clearer since setting up the porcelain error
message is meant to work with show_all_errors, but this fixes a call to
setup_unpack_trees_porcelain() in git_merge_trees() which did not set
show_all_errors.

add_rejected_path() used to double-check whether it was running in
plumbing mode. This check was ineffective since it was setting
show_all_errors too late for traverse_trees() to see it, and is made
useless by this patch. Remove it.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-03 09:31:51 -07:00
Matthieu Moy e294030fe8 setup_unpack_trees_porcelain: take the whole options struct as parameter
This is a preparation patch to let setup_unpack_trees_porcelain set
show_all_errors itself.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-03 09:31:41 -07:00
Matthieu Moy dc1166e685 Move set_porcelain_error_msgs to unpack-trees.c and rename it
The function is currently dealing only with error messages, but the
intent of calling it is really to notify the unpack-tree mechanics that
it is running in porcelain mode.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-03 09:31:28 -07:00
Junio C Hamano c3b9325fa6 Merge branch 'nd/fix-sparse-checkout'
* nd/fix-sparse-checkout:
  unpack-trees: mark new entries skip-worktree appropriately
  unpack-trees: do not check for conflict entries too early
  unpack-trees: let read-tree -u remove index entries outside sparse area
  unpack-trees: only clear CE_UPDATE|CE_REMOVE when skip-worktree is always set
  t1011 (sparse checkout): style nitpicks
2010-08-21 23:28:05 -07:00
Junio C Hamano 2eb54692d1 Merge branch 'dg/local-mod-error-messages'
* dg/local-mod-error-messages:
  t7609: test merge and checkout error messages
  unpack_trees: group error messages by type
  merge-recursive: distinguish "removed" and "overwritten" messages
  merge-recursive: porcelain messages for checkout
  Turn unpack_trees_options.msgs into an array + enum

Conflicts:
	t/t3400-rebase.sh
2010-08-21 23:26:46 -07:00
Matthieu Moy e6c111b4c0 unpack_trees: group error messages by type
When an error is encountered, it calls add_rejected_file() which either
- directly displays the error message and stops if in plumbing mode
  (i.e. if show_all_errors is not initialized at 1)
- or stores it so that it will be displayed at the end with display_error_msgs(),

Storing the files by error type permits to have a list of files for
which there is the same error instead of having a serie of almost
identical errors.

As each bind_overlap error combines a file and an old file, a list cannot be
done, therefore, theses errors are not stored but directly displayed.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 10:36:06 -07:00
Matthieu Moy 08402b0409 merge-recursive: distinguish "removed" and "overwritten" messages
To limit the number of possible error messages, the error messages for
the case would_lose_untracked_file and would_lose_orphaned in
unpack_trees_options.msgs were handled with a single string,
parameterized by an action string ("overwritten" or "removed").

Instead, we consider them as two different cases, with unparameterized
string. This will make it easier to make separate lists sorted by error
types later.

Only the bind_overlap case still takes two %s parameters, but that's
unavoidable.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 10:36:05 -07:00
Diane Gasselin 23cbf11b5c merge-recursive: porcelain messages for checkout
A porcelain message was first added in checkout.c in the commit
8ccba008 (Junio C Hamano, Sat May 17 21:03:49 2008, unpack-trees:
allow Porcelain to give different error messages) to give better feedback
in the case of merge errors.

This patch adapts the porcelain messages for the case of checkout
instead. This way, when having a checkout error, "merge" no longer
appears in the error message.

While we're there, we add an advice in the case of
would_lose_untracked_file.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 10:36:03 -07:00
Matthieu Moy 08353ebbab Turn unpack_trees_options.msgs into an array + enum
The list of error messages was introduced as a structure, but an array
indexed over an enum is more flexible, since it allows one to store a
type of error message (index in the array) in a variable.

This change needs to rename would_lose_untracked ->
would_lose_untracked_file to avoid a clash with the function
would_lose_untracked in merge-recursive.c.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 10:36:00 -07:00
Jonathan Nieder 1ce584b058 read-tree: stop leaking tree objects
The underlying problem is that the fill_tree_descriptor()
API is easy to misuse, and this patch does not fix that.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 09:58:18 -07:00
Nguyễn Thái Ngọc Duy 74da98f9c7 unpack-trees: mark new entries skip-worktree appropriately
Sparse checkout narrows worktree down based on the skip-worktree bit
before and after $GIT_DIR/info/sparse-checkout application. If it does
not have that bit before but does after, a narrow is detected and the
file will be removed from worktree.

New files added by merge, however, does not have skip-worktree bit. If
those files appear to be outside checkout area, the same rule applies:
the file gets removed from worktree even though they don't exist in
worktree.

Just pretend they have skip-worktree before in that case, so the rule
is ignored.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 12:16:02 -07:00
Nguyễn Thái Ngọc Duy 711f151a7b unpack-trees: do not check for conflict entries too early
The idea of sparse checkout is conflict entries should always stay
in worktree, regardless $GIT_DIR/info/sparse-checkout. Therefore,
ce_stage(ce) usually means no CE_SKIP_WORKTREE. This is true when all
entries have been merged into the index, and identical staged entries
collapsed.

However, will_have_skip_worktree() since f1f523e (unpack-trees():
ignore worktree check outside checkout area) is also used earlier in
verify_* functions, where entries have not been merged to index yet
and ce_stage() is not zero. Checking ce_stage() then may provoke
unnecessary verification on entries outside checkout area and error
out.

This fixes part of test case "read-tree adds to worktree, dirty case".
The error

error: Untracked working tree file 'sub/added' would be overwritten by merge.

is now gone and (unfortunately) replaced by another error, which will
be addressed in the next patch.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 12:16:01 -07:00
Nguyễn Thái Ngọc Duy 700e66d661 unpack-trees: let read-tree -u remove index entries outside sparse area
To avoid touching the worktree outside a sparse checkout,
when the update flag is enabled unpack_trees() clears the
CE_UPDATE and CE_REMOVE flags on entries that do not match the
sparse pattern before actually committing any updates to the
index file or worktree.

The effect on the index was unintentional; sparse checkout was
never meant to prevent index updates outside the area checked
out.  And the result is very confusing: for example, after a
failed merge, currently "git reset --hard" does not reset the
state completely but an additional "git reset --mixed" will.

So stop clearing the CE_REMOVE flag.  Instead, maintain a
CE_WT_REMOVE flag to separately track whether a particular
file removal should apply to the worktree in addition to the
index or not.

The CE_WT_REMOVE flag is used already to mark files that
should be removed because of a narrowing checkout area.  That
usage will still apply; do not clear the CE_WT_REMOVE flag
in that case (detectable because the CE_REMOVE flag is not
set).

This bug masked some other bugs illustrated by the test
suite, which will be addressed by later patches.

Reported-by: Frédéric Brière <fbriere@fbriere.net>
Fixes: http://bugs.debian.org/583699

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 12:16:01 -07:00
Nguyễn Thái Ngọc Duy eec3fc0309 unpack-trees: only clear CE_UPDATE|CE_REMOVE when skip-worktree is always set
The purpose of this clearing is, as explained in comment, because
verify_*() may set those bits before apply_sparse_checkout() is
called. By that time, it's not clear whether an entry will stay in
checkout area or out. After $GIT_DIR/info/sparse-checkout is applied,
we know what entries will be in finally. It's time to clean unwanted
bits.

That works perfectly when checkout area remains unchanged. When
checkout area changes, apply_sparse_checkout() may set CE_UPDATE
or CE_WT_REMOVE to widen/narrow checkout area. Doing the clearing
after apply_sparse_checkout() may clear those widening/narrowing
bits unexpectedly.

So, only do that on entries that are not affected by checkout area
changes (i.e. skip-worktree bit does not change after
apply_sparse_checkout).

This code does not actually fix anything though, just
future-proof. The removed code and the narrow/widen code inside
apply_sparse_checkout are currently independent (narrow code never
sets CE_REMOVE, widen code sets CE_UPDATE, but ce_skip_worktree()
would be false).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 12:15:53 -07:00
Junio C Hamano 3919d40cfb Merge branch 'bd/maint-unpack-trees-parawalk-fix'
* bd/maint-unpack-trees-parawalk-fix:
  unpack-trees: Make index lookahead less pessimal
2010-06-22 09:45:22 -07:00
Junio C Hamano 8d676d85f7 Merge branch 'gv/portable'
* gv/portable:
  test-lib: use DIFF definition from GIT-BUILD-OPTIONS
  build: propagate $DIFF to scripts
  Makefile: Tru64 portability fix
  Makefile: HP-UX 10.20 portability fixes
  Makefile: HPUX11 portability fixes
  Makefile: SunOS 5.6 portability fix
  inline declaration does not work on AIX
  Allow disabling "inline"
  Some platforms lack socklen_t type
  Make NO_{INET_NTOP,INET_PTON} configured independently
  Makefile: some platforms do not have hstrerror anywhere
  git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
  test_cmp: do not use "diff -u" on platforms that lack one
  fixup: do not unconditionally disable "diff -u"
  tests: use "test_cmp", not "diff", when verifying the result
  Do not use "diff" found on PATH while building and installing
  enums: omit trailing comma for portability
  Makefile: -lpthread may still be necessary when libc has only pthread stubs
  Rewrite dynamic structure initializations to runtime assignment
  Makefile: pass CPPFLAGS through to fllow customization

Conflicts:
	Makefile
	wt-status.h
2010-06-21 06:02:44 -07:00
Brian Downing e53e6b4433 unpack-trees: Make index lookahead less pessimal
When traversing trees with an index, the current index pointer
(o->cache_bottom) occasionally has to be temporarily advanced forwards to
match the traversal order of the tree, which is not the same as the sort
order of the index.  The existing algorithm that did this (introduced in
730f72840c) would get "stuck" when the
cache_bottom was popped and then repeatedly check the same index entries
over and over.  This represents a serious performance regression for
large repositories compared to the old "broken" traversal order.

This commit makes a simple change to mitigate this.  Whenever
find_cache_pos sees that the current pos is also the cache_bottom, and
it has already been unpacked, it advances the cache_bottom as well as
the current pos.  This prevents the above "sticking" behavior without
dramatically changing the algorithm.

In addition, this commit moves the unpacked check above the
ce_in_traverse_path() check.  The simple bitmask check is cheaper, and
in the case described above will be firing quite a bit to advance the
cache_bottom after a tree pop.

This yields considerable performance improvements for large trees.
The following are the number of function calls for "git diff HEAD" on
the Linux kernel tree, with 33,307 files:

   Symbol               Calls Before   Calls After
   -------------------  ------------   -----------
   unpack_callback            35,332        35,332
   find_cache_pos             37,357        37,357
   ce_in_traverse_path     4,979,473        37,357
   do_compare_entry        6,828,181       251,925
   df_name_compare         6,828,181       251,925

And on a repository of 187,456 files:

   Symbol               Calls Before   Calls After
   -------------------  ------------   -----------
   unpack_callback           197,958       197,958
   find_cache_pos            208,460       208,460
   ce_in_traverse_path    37,308,336       208,460
   do_compare_entry      156,950,469     2,690,626
   df_name_compare       156,950,469     2,690,626

On the latter repository, user time for "git diff HEAD" was reduced from
5.58 to 0.42 seconds.  This is compared to 0.30 seconds before the
traversal order fix was implemented.

Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-18 08:06:18 -07:00
Junio C Hamano 40e9b27dec Merge branch 'cb/assume-unchanged-fix'
* cb/assume-unchanged-fix:
  Documentation: git-add does not update files marked "assume unchanged"
  do not overwrite files marked "assume unchanged"
2010-06-13 11:21:11 -07:00
Gary V. Vaughan 66dbfd55e3 Rewrite dynamic structure initializations to runtime assignment
Unfortunately, there are still plenty of production systems with
vendor compilers that choke unless all compound declarations can be
determined statically at compile time, for example hpux10.20 (I can
provide a comprehensive list of our supported platforms that exhibit
this problem if necessary).

This patch simply breaks apart any compound declarations with dynamic
initialisation expressions, and moves the initialisation until after
the last declaration in the same block, in all the places necessary to
have the offending compilers accept the code.

Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 16:59:26 -07:00
Clemens Buchacher aecda37c66 do not overwrite files marked "assume unchanged"
A merge will fail gracefully if it needs to update files marked
"assume unchanged", but other similar commands will not. In
particular, checkout and rebase will silently overwrite changes to
such files.

This is a regression introduced in commit 1dcafcc0 (verify_uptodate():
add ce_uptodate(ce) test), which avoids lstat's during a merge, if the
index entry is up-to-date. If the CE_VALID flag is set, however, we
cannot trust CE_UPTODATE.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-01 12:00:44 -07:00
Peter Collingbourne 80d706afed Introduce remove_or_warn function
This patch introduces the remove_or_warn function which is a
generalised version of the {unlink,rmdir}_or_warn functions.  It takes
an additional parameter indicating the mode of the file to be removed.

The patch also modifies certain functions to use remove_or_warn
where appropriate, and adds a test case for a bug fixed by the use
of remove_or_warn.

Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-28 09:53:04 -07:00
Junio C Hamano 026680f881 Merge branch 'jc/fix-tree-walk'
* jc/fix-tree-walk:
  read-tree --debug-unpack
  unpack-trees.c: look ahead in the index
  unpack-trees.c: prepare for looking ahead in the index
  Aggressive three-way merge: fix D/F case
  traverse_trees(): handle D/F conflict case sanely
  more D/F conflict tests
  tests: move convenience regexp to match object names to test-lib.sh

Conflicts:
	builtin-read-tree.c
	unpack-trees.c
	unpack-trees.h
2010-01-24 17:35:58 -08:00
Junio C Hamano 26b9f5cc99 Merge branch 'pc/uninteresting-submodule-disappear-upon-switch-branches'
* pc/uninteresting-submodule-disappear-upon-switch-branches:
  Remove empty directories when checking out a commit with fewer submodules
2010-01-18 18:12:57 -08:00
Junio C Hamano dc96c5ee70 Merge branch 'cc/reset-more'
* cc/reset-more:
  t7111: check that reset options work as described in the tables
  Documentation: reset: add some missing tables
  Fix bit assignment for CE_CONFLICTED
  "reset --merge": fix unmerged case
  reset: use "unpack_trees()" directly instead of "git read-tree"
  reset: add a few tests for "git reset --merge"
  Documentation: reset: add some tables to describe the different options
  reset: improve mixed reset error message when in a bare repo
2010-01-13 11:58:56 -08:00
Junio C Hamano 73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Peter Collingbourne c5e558a80a Remove empty directories when checking out a commit with fewer submodules
Change the unlink_entry function to use rmdir to remove submodule
directories.  Currently we try to use unlink, which will never succeed.

Of course rmdir will only succeed for empty (i.e. not checked out)
submodule directories.  Behaviour if a submodule is checked out stays
essentially the same: print a warning message and keep the submodule
directory.

Signed-off-by: Peter Collingbourne <peter@pcc.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-11 19:50:51 -08:00
Junio C Hamano ba655da537 read-tree --debug-unpack
A debugging patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 15:00:14 -08:00
Junio C Hamano 730f72840c unpack-trees.c: look ahead in the index
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and

 (1) if the name appears in the index as a leaf node, it is also
     fed to the n_way_merge() callback function;

 (2) if the name is a directory in the index, i.e. there are entries in
     that are underneath it, then nothing is fed to the n_way_merge()
     callback function;

 (3) otherwise, if the name comes before the first eligible entry in the
     index, the index entry is first unpacked alone.

When traverse_trees_recursive() descends into a subdirectory, the
cache_bottom pointer is moved to walk index entries within that directory.

All of these are omitted for diff-index, which does not even want to be
fed an index entry and a tree entry with D/F conflicts.

This fixes 3-way read-tree and exposes a bug in other parts of the system
in t6035, test #5.  The test prepares these three trees:

 O = HEAD^
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob e69de29bb2    a/x

 A = HEAD
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb    a/x

 B = master
    120000 blob a36b77384451ea1de7bd340ffca868249626bc52    a/b
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/x

With a clean index that matches HEAD, running

    git read-tree -m -u --aggressive $O $A $B

now yields

    120000 a36b77384451ea1de7bd340ffca868249626bc52 3       a/b
    100644 e69de29bb2 0       a/b-2/c/d
    100644 e69de29bb2 1       a/b/c/d
    100644 e69de29bb2 2       a/b/c/d
    100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0       a/x

which is correct.  "master" created "a/b" symlink that did not exist,
and removed "a/b/c/d" while HEAD did not do touch either path.

Before this series, read-tree did not notice the situation and resolved
addition of "a/b" and removal of "a/b/c/d" independently.  If A = HEAD had
another path "a/b/c/e" added, this merge should conflict but instead it
silently resolved "a/b" and then immediately overwrote it to add
"a/b/c/e", which was quite bogus.

Tests in t1012 start to work with this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 15:00:14 -08:00
Junio C Hamano da165f470e unpack-trees.c: prepare for looking ahead in the index
This prepares but does not yet implement a look-ahead in the index entries
when traverse-trees.c decides to give us tree entries in an order that
does not match what is in the index.

A case where a look-ahead in the index is necessary happens when merging
branch B into branch A while the index matches the current branch A, using
a tree O as their common ancestor, and these three trees looks like this:

   O        A       B
   t                t
   t-i      t-i     t-i
   t-j      t-j
            t/1
            t/2

The traverse_trees() function gets "t", "t-i" and "t" from trees O, A and
B first, and notices that A may have a matching "t" behind "t-i" and "t-j"
(indeed it does), and tells A to give that entry instead.  After unpacking
blob "t" from tree B (as it hasn't changed since O in B and A removed it,
it will result in its removal), it descends into directory "t/".

The side that walked index in parallel to the tree traversal used to be
implemented with one pointer, o->pos, that points at the next index entry
to be processed.  When this happens, the pointer o->pos still points at
"t-i" that is the first entry.  We should be able to skip "t-i" and "t-j"
and locate "t/1" from the index while the recursive invocation of
traverse_trees() walks and match entries found there, and later come back
to process "t-i".

While that look-ahead is not implemented yet, this adds a flag bit,
CE_UNPACKED, to mark the entries in the index that has already been
processed.  o->pos pointer has been renamed to o->cache_bottom and it
points at the first entry that may still need to be processed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 14:59:54 -08:00
Junio C Hamano cee2d6ae63 Aggressive three-way merge: fix D/F case
When the ancestor used to have a blob "P", your tree removed it, and the
tree you are merging with also removed it, the agressive three-way cleanly
merges to remove that blob.  If the other tree added a new blob "P/Q"
while removing "P", it should also merge cleanly to remove "P" and create
"P/Q" (since neither the ancestor nor your tree could have had it, so it
is a typical "created in one").

The "aggressive" rule is not new anymore.  Reword the stale comment.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-03 23:25:13 -08:00
Junio C Hamano e11d7b5969 "reset --merge": fix unmerged case
Commit 9e8ecea (Add 'merge' mode to 'git reset', 2008-12-01) disallowed
"git reset --merge" when there was unmerged entries.  But it wished if
unmerged entries were reset as if --hard (instead of --merge) has been
used.  This makes sense because all "mergy" operations makes sure that
any path involved in the merge does not have local modifications before
starting, so resetting such a path away won't lose any information.

The previous commit changed the behavior of --merge to accept resetting
unmerged entries if they are reset to a different state than HEAD, but it
did not reset the changes in the work tree, leaving the conflict markers
in the resulting file in the work tree.

Fix it by doing three things:

 - Update the documentation to match the wish of original "reset --merge"
   better, namely, "An unmerged entry is a sign that the path didn't have
   any local modification and can be safely resetted to whatever the new
   HEAD records";

 - Update read_index_unmerged(), which reads the index file into the cache
   while dropping any higher-stage entries down to stage #0, not to copy
   the object name from the higher stage entry.  The code used to take the
   object name from the a stage entry ("base" if you happened to have
   stage #1, or "ours" if both sides added, etc.), which essentially meant
   that you are getting random results depending on what the merge did.

   The _only_ reason we want to keep a previously unmerged entry in the
   index at stage #0 is so that we don't forget the fact that we have
   corresponding file in the work tree in order to be able to remove it
   when the tree we are resetting to does not have the path.  In order to
   differentiate such an entry from ordinary cache entry, the cache entry
   added by read_index_unmerged() is marked as CE_CONFLICTED.

 - Update merged_entry() and deleted_entry() so that they pay attention to
   cache entries marked as CE_CONFLICTED.  They are previously unmerged
   entries, and the files in the work tree that correspond to them are
   resetted away by oneway_merge() to the version from the tree we are
   resetting to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-03 16:01:05 -08:00
Nguyễn Thái Ngọc Duy 56cac48c35 ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
Previously CE_MATCH_IGNORE_VALID flag is used by both valid and
skip-worktree bits. While the two bits have similar behaviour, sharing
this flag means "git update-index --really-refresh" will ignore
skip-worktree while it should not. Instead another flag is
introduced to ignore skip-worktree bit, CE_MATCH_IGNORE_VALID only
applies to valid bit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-14 14:03:58 -08:00
Junio C Hamano 39add7a36f Merge branch 'jc/fix-tree-walk' (early part)
* 'jc/fix-tree-walk' (early part):
  unpack_callback(): use unpack_failed() consistently
  unpack-trees: typofix
  diff-lib.c: fix misleading comments on oneway_diff()
2009-11-20 23:55:50 -08:00
Felipe Contreras a75d7b5409 Use 'fast-forward' all over the place
It's a compound word.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-24 23:50:28 -07:00
Junio C Hamano 353c5eeb5c unpack_callback(): use unpack_failed() consistently
When unpack_index_entry() failed, consistently call unpack_failed(),
instead of silently returning -1.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-11 16:40:43 -07:00
Junio C Hamano 6caa7b5553 unpack-trees: typofix
I am not good at subject-verb concordance.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-11 16:40:43 -07:00
Nguyễn Thái Ngọc Duy 9e1afb1675 sparse checkout: inhibit empty worktree
The way sparse checkout works, users may empty their worktree
completely, because of non-matching sparse-checkout spec, or empty
spec. I believe this is not desired. This patch makes Git refuse to
produce such worktree.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:14:42 -07:00
Nguyễn Thái Ngọc Duy f1f523eae9 unpack-trees(): ignore worktree check outside checkout area
verify_absent() and verify_uptodate() are used to ensure worktree
is safe to be updated, then CE_REMOVE or CE_UPDATE will be set.
Finally check_updates() bases on CE_REMOVE, CE_UPDATE and the
recently added CE_WT_REMOVE to update working directory accordingly.

The entries that are checked may eventually be left out of checkout
area (done later in apply_sparse_checkout()). We don't want to update
outside checkout area. This patch teaches Git to assume "good",
skip these checks when it's sure those entries will be outside checkout
area, and clear CE_REMOVE|CE_UPDATE that could be set due to this
assumption.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:14:41 -07:00
Nguyễn Thái Ngọc Duy e800ec9d72 unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:14:41 -07:00
Nguyễn Thái Ngọc Duy 08aefc9e47 unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
This patch introduces core.sparseCheckout, which will control whether
sparse checkout support is enabled in unpack_trees()

It also loads sparse-checkout file that will be used in the next patch.
I split it out so the next patch will be shorter, easier to read.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:14:41 -07:00
Nguyễn Thái Ngọc Duy 35a5aa79d0 unpack-trees.c: generalize verify_* functions
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy e663db2f44 unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
CE_REMOVE now removes both worktree and index versions. Sparse
checkout must be able to remove worktree version while keep the
index intact when checkout area is narrowed.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy 32f54ca317 unpack-trees(): carry skip-worktree bit over in merged_entry()
In this code path, we would remove "old" and replace it with "merge".
"old" may have skip-worktree bit, so re-add it to "merge".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:33 -07:00
Nguyễn Thái Ngọc Duy 5203083694 Teach Git to respect skip-worktree bit (writing part)
This part is mainly to remove CE_VALID shortcuts (and as a
consequence, ce_uptodate() shortcuts as it may be turned on by
CE_VALID) in writing code path if skip-worktree is used. Various tests
are added to avoid future breakages.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Junio C Hamano 58b1ef2f0f Merge branch 'maint'
* maint:
  checkout -f: deal with a D/F conflict entry correctly
  sha1_name.c: avoid unnecessary strbuf_release
  refs.c: release file descriptor on error return
2009-07-18 16:57:47 -07:00
Junio C Hamano 78d3b06e0f checkout -f: deal with a D/F conflict entry correctly
When we switch branches with "checkout -f", unpack_trees() feeds two
cache_entries to oneway_merge() function in its src[] array argument.  The
zeroth entry comes from the current index, and the first entry represents
what the merge result should be, taken from the tree recorded in the
commit we are switching to.

When we have a blob (either regular file or a symlink) in the index and in
the work tree at path "foo", and the switched-to tree has "foo/bar",
i.e. "foo" becomes a directory, src[0] is obviously that blob currently
registered at "foo".  Even though we do not have anything at "foo" in the
switched-to tree, src[1] is _not_ NULL in this case.

The unpack_trees() machinery places a special marker df_conflict_entry
to signal that no blob exists at "foo", but it will become a directory
that may have somthing underneath it (namely "foo/bar"), so a usual 3-way
merge can notice the situation.

But oneway_merge() codepath failed to notice this and passed the special
marker directly to merged_entry().  This happens to remove the "foo" in
the end because the df_conflict_entry does not have any name (hence the
"error" message) and its addition in add_index_entry() is rejected, but it
is wrong.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-18 16:57:30 -07:00
Linus Torvalds 05c1da2f5e Fix extraneous lstat's in 'git checkout -f'
In our 'oneway_merge()' we always do an 'lstat()' to see if we might
need to mark the entry for updating.

But we really shouldn't need to do that when the cache entry is already
marked as being ce_uptodate(), and this makes us do unnecessary lstat()
calls if we have index preloading enabled.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-14 15:17:01 -07:00
Brandon Casey 0039ba7e5e unpack-trees.c: work around run-time array initialization flaw on IRIX 6.5
The c99 MIPSpro Compiler version 7.4.4m on IRIX 6.5 does not properly
initialize run-time initialized arrays.  An array which is initialized with
fewer elements than the length of the array should have the unitialized
elements initialized to zero.  This compiler only initializes the remaining
elements when the last element is a static parameter.  So work around it
by adding a "NULL" initialization parameter.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-10 23:50:29 -07:00
Linus Torvalds dba2e2037f Simplify read_directory[_recursive]() arguments
Stop the insanity with separate 'path' and 'base' arguments that must
match.  We don't need that crazy interface any more, since we cleaned up
handling of 'path' in commit da4b3e8c28.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-09 01:11:28 -07:00
Linus Torvalds 2af202be3d Fix various sparse warnings in the git source code
There are a few remaining ones, but this fixes the trivial ones. It boils
down to two main issues that sparse complains about:

 - warning: Using plain integer as NULL pointer

   Sparse doesn't like you using '0' instead of 'NULL'. For various good
   reasons, not the least of which is just the visual confusion. A NULL
   pointer is not an integer, and that whole "0 works as NULL" is a
   historical accident and not very pretty.

   A few of these remain: zlib is a total mess, and Z_NULL is just a 0.
   I didn't touch those.

 - warning: symbol 'xyz' was not declared. Should it be static?

   Sparse wants to see declarations for any functions you export. A lack
   of a declaration tends to mean that you should either add one, or you
   should mark the function 'static' to show that it's in file scope.

   A few of these remain: I only did the ones that should obviously just
   be made static.

That 'wt_status_submodule_summary' one is debatable. It has a few related
flags (like 'wt_status_use_color') which _are_ declared, and are used by
builtin-commit.c. So maybe we'd like to export it at some point, but it's
not declared now, and not used outside of that file, so 'static' it is in
this patch.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-20 21:52:55 -07:00
Junio C Hamano b65982b608 Optimize "diff-index --cached" using cache-tree
When running "diff-index --cached" after making a change to only a small
portion of the index, there is no point unpacking unchanged subtrees into
the index recursively, only to find that all entries match anyway.  Tweak
unpack_trees() logic that is used to read in the tree object to catch the
case where the tree entry we are looking at matches the index as a whole
by looking at the cache-tree.

As an exercise, after modifying a few paths in the kernel tree, here are
a few numbers on my Athlon 64X2 3800+:

    (without patch, hot cache)
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+9407minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+2446minor)pagefaults 0swaps

Cold cache numbers are very impressive, but it does not matter very much
in practice:

    (without patch, cold cache)
    $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches'
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
    247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps

    (with patch, cold cache)
    $ su root sh -c 'echo 3 >/proc/sys/vm/drop_caches'
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k
    18440inputs+0outputs (79major+2369minor)pagefaults 0swaps

This of course helps "git status" as well.

    (without patch, hot cache)
    $ /usr/bin/time ../git.git/git-status >/dev/null
    0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+10970minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-status >/dev/null
    0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+3921minor)pagefaults 0swaps

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25 11:35:29 -07:00
Alex Riesen 691f1a28bf replace direct calls to unlink(2) with unlink_or_warn
This helps to notice when something's going wrong, especially on
systems which lock open files.

I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
  called from such a function
- it is in a static function, returning void and the function is only
  called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 18:37:41 -07:00
Junio C Hamano 66985e6629 unpack-trees: do not muck with attributes when we are not checking out
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-17 21:05:49 -07:00
Junio C Hamano 6ba8b079cb Merge branch 'jc/attributes-checkout'
* jc/attributes-checkout:
  Add a test for checking whether gitattributes is honored by checkout.
  Read attributes from the index that is being checked out
2009-03-26 00:27:33 -07:00
Junio C Hamano 7d4e3a72fb Merge branch 'jc/maint-1.6.0-read-tree-overlay'
* jc/maint-1.6.0-read-tree-overlay:
  read-tree A B C: do not create a bogus index and do not segfault
2009-03-17 18:58:55 -07:00
Junio C Hamano 06f33c1735 Read attributes from the index that is being checked out
Traditionally we used .gitattributes file from the work tree if exists,
and otherwise read from the index as a fallback.  When switching to a
branch that has an updated .gitattributes file, and entries in it give
different attributes to other paths being checked out, we should instead
read from the .gitattributes in the index.

This breaks a use case of fixing incorrect entries in the .gitattributes
in the work tree (without adding it to the index) and checking other paths
out, though.

    $ edit .gitattributes ;# mark foo.dat as binary
    $ rm foo.dat
    $ git checkout foo.dat

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-13 22:51:43 -07:00
Junio C Hamano aab3b9a1aa read-tree A B C: do not create a bogus index and do not segfault
"git read-tree A B C..." without the "-m" (merge) option is a way to read
these trees on top of each other to get an overlay of them.

An ancient commit ee6566e (Rewrite read-tree, 2005-09-05) passed the
ADD_CACHE_SKIP_DFCHECK flag when calling add_index_entry() to add the
paths obtained from these trees to the index, but it is an incorrect use
of the flag.  The flag is meant to be used by callers who know the
addition of the entry does not introduce a D/F conflict to the index in
order to avoid the overhead of checking.

This bug resulted in a bogus index that records both "x" and "x/z" as a
blob after reading three trees that have paths ("x"), ("x", "y"), and
("x/z", "y") respectively.  34110cd (Make 'unpack_trees()' have a separate
source and destination index, 2008-03-06) refactored the callsites of
add_index_entry() incorrectly and added more codepaths that use this flag
when it shouldn't be used.

Also, 0190457 (Move 'unpack_trees()' over to 'traverse_trees()' interface,
2008-03-05) introduced a bug to call add_index_entry() for the tree that
does not have the path in it, passing NULL as a cache entry.  This caused
reading multiple trees, one of which has path "x" but another doesn't, to
segfault.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-12 17:06:07 -07:00
Kjetil Barvik c06ff4908b Record ns-timestamps if possible, but do not use it without USE_NSEC
Traditionally, the lack of USE_NSEC meant "do not record nor use the
nanosecond resolution part of the file timestamps".  To avoid problems on
filesystems that lose the ns part when the metadata is flushed to the disk
and then later read back in, disabling USE_NSEC has been a good idea in
general.

If you are on a filesystem without such an issue, it does not hurt to read
and store them in the cached stat data in the index entries even if your
git is compiled without USE_NSEC.  The index left with such a version of
git can be read by git compiled with USE_NSEC and it can make use of the
nanosecond part to optimize the check to see if the path on the filesystem
hsa been modified since we last looked at.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:25:16 -08:00
Kjetil Barvik 1dcafcc0e6 verify_uptodate(): add ce_uptodate(ce) test
If we inside verify_uptodate() can already tell from the ce entry that
it is already uptodate by testing it with ce_uptodate(ce), there is no
need to call lstat(2) and ie_match_stat() afterwards.

And, reading from the commit log message from:

    commit eadb583134
    Author: Junio C Hamano <gitster@pobox.com>
    Date:   Fri Jan 18 23:45:24 2008 -0800

    Avoid running lstat(2) on the same cache entry.

this also seems to be correct usage of the ce_uptodate() macro
introduced by that patch.

This will avoid lots of lstat(2) calls in some cases, for example
by running the 'git checkout' command.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 21:39:51 -08:00
Kjetil Barvik fba2f38a2c make USE_NSEC work as expected
Since the filesystem ext4 is now defined as stable in Linux v2.6.28,
and ext4 supports nanonsecond resolution timestamps natively, it is
time to make USE_NSEC work as expected.

This will make racy git situations less likely to happen.  For 'git
checkout' this means it will be less likely that we have to open, read
the contents of the file into RAM, and check if file is really
modified or not.  The result sould be a litle less used CPU time, less
pagefaults and a litle faster program, at least for 'git checkout'.

Since the number of possible racy git situations would increase when
disks gets faster, this patch would be more and more helpfull as times
go by.  For a fast Solid State Disk, this patch should be helpfull.

Note that, when file operations starts to take less than 1 nanosecond,
one would again start to get more racy git situations.

For more info on racy git, see Documentation/technical/racy-git.txt
For more info on ext4, see http://kernelnewbies.org/Ext4

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 21:39:48 -08:00
Kjetil Barvik 36419c8ee4 check_updates(): effective removal of cache entries marked CE_REMOVE
Below is oprofile output from GIT command 'git chekcout -q my-v2.6.25'
(move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):

CPU: Core 2, speed 1999.95 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
                         mask of 0x00 (Unhalted core cycles) count 20000
Counted INST_RETIRED_ANY_P events (number of instructions retired) with a
                           unit mask of 0x00 (No unit mask) count 20000
CPU_CLK_UNHALT...|INST_RETIRED:2...|
  samples|      %|  samples|      %|
------------------------------------
   409247 100.000    342878 100.000 git
        CPU_CLK_UNHALT...|INST_RETIRED:2...|
          samples|      %|  samples|      %|
        ------------------------------------
           260476 63.6476    257843 75.1996 libz.so.1.2.3
           100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
            30850  7.5382      7874  2.2964 libc-2.9.so
            14775  3.6103      8390  2.4469 git
             2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
              191  0.0467        32  0.0093 libpthread-2.9.so
               58  0.0142        36  0.0105 ld-2.9.so
                1 2.4e-04         0       0 libldap-2.3.so.0.2.31

Detail list of the top 20 function entries (libz counted in one blob):

CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
samples  %        samples  %        image name               symbol name
260476   63.6862  257843   75.2725  libz.so.1.2.3            /lib/libz.so.1.2.3
16587     4.0555  3636      1.0615  libc-2.9.so              memcpy
7710      1.8851  277       0.0809  libc-2.9.so              memmove
3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux d_validate
3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux __getblk
3174      0.7760  1813      0.5293  libc-2.9.so              _int_malloc
2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux copy_to_user
2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux __link_path_walk
2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux ext4_mark_iloc_dirty
2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux find_first_zero_bit
2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux do_get_write_access
2070      0.5061  514       0.1501  git                      cache_name_compare
2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux rcu_irq_exit
2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux __ext4_get_inode_loc
2020      0.4939  4325      1.2626  libcrypto.so.0.9.8       /usr/lib/libcrypto.so.0.9.8
1965      0.4804  1384      0.4040  git                      patch_delta
1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux rcu_sched_grace_period
1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux sysfs_slab_alias
1659      0.4056  290       0.0847  git                      find_pack_entry_one
1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux ext4_writepage_trans_blocks

Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles
per instruction, and compared to the total cycles spent inside the
source code of GIT for this command, all the memmove() calls
translates to (7710 * 100) / 14775 = 52.2% of this.

Retesting with a GIT program compiled for gcov usage, I found out that
the memmove() calls came from remove_index_entry_at() in read-cache.c,
where we have:

        memmove(istate->cache + pos,
                istate->cache + pos + 1,
                (istate->cache_nr - pos) * sizeof(struct cache_entry *));

remove_index_entry_at() is called 4902 times from check_updates() in
unpack-trees.c, and each time called we move each cache_entry pointers
(from the removed one) one step to the left.

Since we have 28828 entries in the cache this time, and if we on
average move half of them each time, we in total move approximately
4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if
each pointer is 8 bytes (64 bit).

OK, is seems that the function check_updates() is called 28 times, so
the estimated guess above had been more correct if check_updates() had
been called only once, but the point is: we get lots of bytes moved.

To fix this, and use an O(N) algorithm instead, where N is the number
of cache_entries, we delete/remove all entries in one loop through all
entries.

From a retest, the new remove_marked_cache_entries() from the patch
below, ended up with the following output line from oprofile:

46        0.0105  15        0.0041  git                      remove_marked_cache_entries

If we can trust the numbers from oprofile in this case, we saved
approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077
seconds CPU time with this fix for this particular test.  And notice
that now the CPU did only 46 / 15 = 3.1 cycles/instruction.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18 17:11:21 -08:00
Kjetil Barvik 7847892716 unlink_entry(): introduce schedule_dir_for_removal()
Currently inside unlink_entry() if we get a successful removal of one
file with unlink(), we try to remove the leading directories each and
every time.  So if one directory containing 200 files is moved to an
other location we get 199 failed calls to rmdir() and 1 successful
call.

To fix this and avoid some unnecessary calls to rmdir(), we schedule
each directory for removal and wait much longer before we do the real
call to rmdir().

Since the unlink_entry() function is called with alphabetically sorted
names, this new function end up being very effective to avoid
unnecessary calls to rmdir().  In some cases over 95% of all calls to
rmdir() is removed with this patch.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik 571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Junio C Hamano ddebfd1f27 Merge branch 'maint'
* maint:
  merge: fix out-of-bounds memory access
2009-01-31 17:42:26 -08:00
Junio C Hamano 6ac92294b3 Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  merge: fix out-of-bounds memory access
2009-01-31 17:42:17 -08:00
René Scharfe c7cddc1a2f merge: fix out-of-bounds memory access
The parameter n of unpack_callback() can have a value of up to
MAX_UNPACK_TREES.  The check at the top of unpack_trees() (its only
(indirect) caller) makes sure it cannot exceed this limit.

unpack_callback() passes it and the array src to unpack_nondirectories(),
which has this loop:

	for (i = 0; i < n; i++) {
		/* ... */
		src[i + o->merge] = o->df_conflict_entry;

o->merge can be 0 or 1, so unpack_nondirectories() potentially accesses
the array src at index MAX_UNPACK_TREES.  This patch makes it big enough.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-31 10:39:55 -08:00
Junio C Hamano 0990e7aaaa Merge branch 'kb/lstat-cache'
* kb/lstat-cache:
  lstat_cache(): introduce clear_lstat_cache() function
  lstat_cache(): introduce invalidate_lstat_cache() function
  lstat_cache(): introduce has_dirs_only_path() function
  lstat_cache(): introduce has_symlink_or_noent_leading_path() function
  lstat_cache(): more cache effective symlink/directory detection
2009-01-25 17:13:34 -08:00
Junio C Hamano 692be9f365 Merge branch 'cb/maint-unpack-trees-absense' into maint
* cb/maint-unpack-trees-absense:
  unpack-trees: remove redundant path search in verify_absent
  unpack-trees: fix path search bug in verify_absent
  unpack-trees: handle failure in verify_absent
2009-01-23 19:06:38 -08:00
Kjetil Barvik 09c9306658 lstat_cache(): introduce has_symlink_or_noent_leading_path() function
In some cases, especially inside the unpack-trees.c file, and inside
the verify_absent() function, we can avoid some unnecessary calls to
lstat(), if the lstat_cache() function can also be told to keep track
of non-existing directories.

So we update the lstat_cache() function to handle this new fact,
introduce a new wrapper function, and the result is that we save lots
of lstat() calls for a removed directory which previously contained
lots of files, when we call this new wrapper of lstat_cache() instead
of the old one.

We do similar changes inside the unlink_entry() function, since if we
can already say that the leading directory component of a pathname
does not exist, it is not necessary to try to remove a pathname below
it!

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:49 -08:00
Junio C Hamano f39adc250c Merge branch 'cb/maint-unpack-trees-absense'
* cb/maint-unpack-trees-absense:
  unpack-trees: remove redundant path search in verify_absent
  unpack-trees: fix path search bug in verify_absent
  unpack-trees: handle failure in verify_absent
2009-01-13 23:09:20 -08:00
Clemens Buchacher 7b9e3ce025 unpack-trees: remove redundant path search in verify_absent
Since the only caller, verify_absent, relies on the fact that o->pos
points to the next index entry anyways, there is no need to recompute
its position.

Furthermore, if a nondirectory entry were found, this would return too
early, because there could still be an untracked directory in the way.
This is currently not a problem, because verify_absent is only called
if the index does not have this entry.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05 12:48:43 -08:00
Clemens Buchacher 837e5fe95d unpack-trees: fix path search bug in verify_absent
Commit 0cf73755 (unpack-trees.c: assume submodules are clean during
check-out) changed an argument to verify_absent from 'path' to 'ce',
which is however shadowed by a local variable of the same name.

The bug triggers if verify_absent is used on a tree entry, for which
the index contains one or more subsequent directories of the same
length. The affected subdirectories are removed from the index. The
testcase included in this commit bisects to 55218834 (checkout: do not
lose staged removal), which reveals the bug in this case, but is
otherwise unrelated.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05 12:46:35 -08:00
Clemens Buchacher 6b9315d5a1 unpack-trees: handle failure in verify_absent
Commit 203a2fe1 (Allow callers of unpack_trees() to handle failure)
changed the "die on error" behavior to "return failure code".
verify_absent did not handle errors returned by
verify_clean_subdirectory, however.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05 12:45:38 -08:00
Daniel Lowe 9db56f71b9 Fix non-literal format in printf-style calls
These were found using gcc 4.3.2-1ubuntu11 with the warning:

    warning: format not a string literal and no format arguments

Incorporated suggestions from Brandon Casey <casey@nrlssc.navy.mil>.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-11 14:43:59 -08:00
Jeff King 13494ed14c correct cache_entry allocation
Most cache_entry structs are allocated by using the
cache_entry_size macro, which rounds the size of the struct
up to the nearest multiple of 8 bytes (presumably to avoid
memory fragmentation).

There is one exception: the special "conflict entry" is
allocated with an empty name, and so is explicitly given
just one extra byte to hold the NUL.

However, later code doesn't realize that this particular
struct has been allocated differently, and happily tries
reading and copying it based on the ce_size macro, which
assumes the 8-byte alignment.

This can lead to reading uninitalized data, though since
that data is simply padding, there shouldn't be any problem
as a result. Still, it makes sense to hold the padding
assumption so as not to surprise later maintainers.

This fixes valgrind errors in t1005, t3030, t4002, and
t4114.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-01 23:46:34 -07:00
Junio C Hamano 5521883490 checkout: do not lose staged removal
The logic to checkout a different commit implements the safety to never
lose user's local changes.  For example, switching from a commit to
another commit, when you have changed a path that is different between
them, need to merge your changes to the version from the switched-to
commit, which you may not necessarily be able to resolve easily.  By
default, "git checkout" refused to switch branches, to give you a chance
to stash your local changes (or use "-m" to merge, accepting the risks of
getting conflicts).

This safety, however, had one deliberate hole since early June 2005.  When
your local change was to remove a path (and optionally to stage that
removal), the command checked out the path from the switched-to commit
nevertheless.

This was to allow an initial checkout to happen smoothly (e.g. an initial
checkout is done by starting with an empty index and switching from the
commit at the HEAD to the same commit).  We can tighten the rule slightly
to allow this special case to pass, without losing sight of removal
explicitly done by the user, by noticing if the index is truly empty when
the operation begins.

For historical background, see:

    http://thread.gmane.org/gmane.comp.version-control.git/4641/focus=4646

This case is marked as *0* in the message, which both Linus and I said "it
feels somewhat wrong but otherwise we cannot start from an empty index".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-09 22:55:22 -07:00
Junio C Hamano 913e0e99b6 unpack_trees(): protect the handcrafted in-core index from read_cache()
unpack_trees() rebuilds the in-core index from scratch by allocating a new
structure and finishing it off by copying the built one to the final
index.

The resulting in-core index is Ok for most use, but read_cache() does not
recognize it as such.  The function is meant to be no-op if you already
have loaded the index, until you call discard_cache().

This change the way read_cache() detects an already initialized in-core
index, by introducing an extra bit, and marks the handcrafted in-core
index as initialized, to avoid this problem.

A better fix in the longer term would be to change the read_cache() API so
that it will always discard and re-read from the on-disk index to avoid
confusion.  But there are higher level API that have relied on the current
semantics, and they and their users all need to get converted, which is
outside the scope of 'maint' track.

An example of such a higher level API is write_cache_as_tree(), which is
used by git-write-tree as well as later Porcelains like git-merge, revert
and cherry-pick.  In the longer term, we should remove read_cache() from
there and add one to cmd_write_tree(); other callers expect that the
in-core index they prepared is what gets written as a tree so no other
change is necessary for this particular codepath.

The original version of this patch marked the index by pointing an
otherwise wasted malloc'ed memory with o->result.alloc, but this version
uses Linus's idea to use a new "initialized" bit, which is conceptually
much cleaner.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-23 18:09:27 -07:00
Junio C Hamano 2e2b887d1c unpack_trees(): allow callers to differentiate worktree errors from merge errors
Instead of uniformly returning -1 on any error, this teaches
unpack_trees() to return -2 when the merge itself is Ok but worktree
refuses to get updated.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-29 17:35:21 -07:00
Junio C Hamano 8ccba008ee unpack-trees: allow Porcelain to give different error messages
The plumbing output is sacred as it is an API.  We _could_ change it if it
is broken in such a way that it cannot convey necessary information fully,
but we just do not _reword_ for the sake of rewording.  If somebody does
not like it, s/he is complaining too late.  S/he should have been here in
early May 2005 and make the language used by the API closer to what humans
read.  S/he wasn't here.  Too bad, and it is too late.

And people who complain should look at a bigger picture.  Look at what was
suggested by one of them and think for five seconds:

     $ git checkout mytopic
    -fatal: Entry 'frotz' not uptodate. Cannot merge.
    +fatal: Entry 'frotz' has local changes. Cannot merge.

If you do not see something wrong with this output, your brain has already
been rotten with use of git for too long a time.  Nobody asked us to
"merge" but why are we talking about "Cannot merge"?

This patch introduces a mechanism to allow Porcelains to specify messages
that are different from the ones that is given by the underlying plumbing
implementation of read-tree, so that we can reword the message Porcelains give
without disrupting the output from the plumbing.

    $ git-checkout pu
    error: You have local changes to 'Makefile'; cannot switch branches.

There are other places that ask unpack_trees() to n-way merge, detect
issues  and let it issue error message on its own, but I did this as a
demonstration and replaced only one message.

Yes I know about C99 structure initializers.  I'd love to use them but we
try to be nice to compilers without it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-19 19:30:13 -07:00
Linus Torvalds c40641b77b Optimize symlink/directory detection
This is the base for making symlink detection in the middle fo a pathname
saner and (much) more efficient.

Under various loads, we want to verify that the full path leading up to a
filename is a real directory tree, and that when we successfully do an
'lstat()' on a filename, we don't get a false positive due to a symlink in
the middle of the path that git should have seen as a symlink, not as a
normal path component.

The 'has_symlink_leading_path()' function already did this, and cached
a single level of symlink information, but didn't cache the _lack_ of a
symlink, so the normal behaviour was actually the wrong way around, and we
ended up doing an 'lstat()' on each path component to check that it was a
real directory.

This caches the last detected full directory and symlink entries, and
speeds up especially deep directory structures a lot by avoiding to
lstat() all the directories leading up to each entry in the index.

[ This can - and should - probably be extended upon so that we eventually
  never do a bare 'lstat()' on any path entries at *all* when checking the
  index, but always check the full path carefully. Right now we do not
  generally check the whole path for all our normal quick index
  revalidation.

  We should also make sure that we're careful about all the invalidation,
  ie when we remove a link and replace it by a directory we should
  invalidate the symlink cache if it matches (and vice versa for the
  directory cache).

  But regardless, the basic function needs to be sane to do that. The old
  'has_symlink_leading_path()' was not capable enough - or indeed the code
  readable enough - to really do that sanely. So I'm pushing this as not
  just an optimization, but as a base for further work. ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:31 -07:00
Linus Torvalds 1fa6ead492 Make unpack-tree update removed files before any updated files
This is immaterial on sane filesystems, but if you have a broken (aka
case-insensitive) filesystem, and the objective is to remove the file
'abc' and replace it with the file 'Abc', then we must make sure to do
the removal first.

Otherwise, you'd first update the file 'Abc' - which would just
overwrite the file 'abc' due to the broken case-insensitive filesystem -
and then remove file 'abc' - which would now brokenly remove the just
updated file 'Abc' on that broken filesystem.

By doing removals first, this won't happen.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds 32260ad5db Make branch merging aware of underlying case-insensitive filsystems
If we find an unexpected file, see if that filename perhaps exists in a
case-insensitive way in the index, and whether the file matches that. If
so, ignore it as a known pre-existing file of a different name.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds cd2fef59ed Make hash_name_lookup able to do case-independent lookups
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds df292c791a Make "index_name_exists()" return the cache_entry it found
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.

Perhaps more importantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Junio C Hamano c4758d3c93 Fix read-tree not to discard errors
This fixes the issue identified with recently added tests to t1004

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-18 22:17:22 -07:00
Linus Torvalds 7f8ab8dc07 Don't update unchanged merge entries
In commit 34110cd4e3 ("Make 'unpack_trees()'
have a separate source and destination index") I introduced a really
stupid bug in that it would always add merged entries with the CE_UPDATE
flag set. That caused us to always re-write the file, even when it was
already up-to-date in the source index.

Not only is that really stupid from a performance angle, but more
importantly it's actively wrong: if we have dirty state in the tree when
we merge, overwriting it with the result of the merge will incorrectly
overwrite that dirty state.

This trivially fixes the problem - simply don't set the CE_UPDATE flag
when the merge result matches the old state.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-16 14:25:53 -07:00
Linus Torvalds fac4b32887 Fix recent 'unpack_trees()'-related changes breaking 'git stash'
On Sat, 15 Mar 2008, SZEDER G?bor wrote:
>
> The testcase usually fails during the first 25 run, but sometimes it
> runs more than 100 times before failing.

Damn, this series has had more subtle issues than I ever expected.

'git stash' creates its saved working tree object with:

        # state of the working tree
        w_tree=$( (
                rm -f "$TMP-index" &&
                cp -p ${GIT_INDEX_FILE-"$GIT_DIR/index"} "$TMP-index" &&
                GIT_INDEX_FILE="$TMP-index" &&
                export GIT_INDEX_FILE &&
                git read-tree -m $i_tree &&
                git add -u &&
                git write-tree &&
                rm -f "$TMP-index"
        ) ) ||
                die "Cannot save the current worktree state"

which creates a new index file with the updates, and writes the tree from
that.

We have this logic where we compare the timestamp of the index with the
timestamp of the files and we then write them out "smudged" if they are
the same, and it basically depends on the fact that the date on the index
file is compared with the date encoded in the stat information itself.

And what is going on is:

 - we create a new index file with that "cp". We are careful to preserve
   the timestamps by using "-p", so this one should be all ok.

 - then we *update* that index by resetting it to the tree with git
   read-tree, but now we do *not* preserve the timestamp on this new copy
   any more, even though we copy over all the timestamps on the files that
   are indexed from the stat information!

Now, we always had that problem when re-writing the index, but we had this
clever workaround in the writing part: if the source had racily clean
entries, then when we wrote those out (and thus can't depend on the index
fiel timestamp showing that they are racily clean any more!), we would
smudge them when writing.

IOW, we handle this issue by having write_index() do this:

	for (i = 0; i < entries; i++) {
		...
		if (is_racy_timestamp(istate, ce))
			ce_smudge_racily_clean_entry(ce);
		..

when writing out entries. And that all took care of it, because now when
we wrote the new index, we'd change the timestamp on the index, yes, but
we'd smudge the entries we wrote out, so now the resulting index would
still show that file as not-up-to-date any more.

But with commit 34110cd4e3 ("Make
'unpack_trees()' have a separate source and destination index"), this
logic no longer triggers, because we now write out the "result" index, and
that one never got its timestamp updated from the source index, so it had
lost all that "is_racy_timestamp()" information!

This trivial patch fixes it. It looks trivial, and it's a simple fix, but
boy did it take me way too much thinking and explaining to myself to
explain why there was a problem in the first place!

The trivial fix is to just copy the index timestamp from the source index
into the result index. But we only do this if we *have* a source index, of
course, and if we will even bother to use the result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-14 23:35:55 -07:00
Junio C Hamano ca885a4fe6 read-tree() and unpack_trees(): use consistent limit
read-tree -m can read up to MAX_TREES, which was arbitrarily set to 8 since
August 2007 (4 is needed to deal with 2 merge-base case).

However, the updated unpack_trees() code had an advertised limit of 4
(which it enforced).  In reality the code was prepared to take only 3
trees and giving 4 caused it to stomp on its stack.  Rename the MAX_TREES
constant to MAX_UNPACK_TREES, move it to the unpack-trees.h common header
file, and use it from both places to avoid future confusion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-13 23:56:36 -07:00
Linus Torvalds 20a16eb33e unpack_trees(): fix diff-index regression.
When skip_unmerged option is not given, unpack_trees() should not just
skip unmerged cache entries but keep them in the result for the caller to
sort them out.

For callers other than diff-index, the incoming index should never be
unmerged, but diff-index is a special case caller.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-10 23:51:13 -07:00
Junio C Hamano 542c264b01 traverse_trees_recursive(): propagate merge errors up
There were few places where merge errors detected deeper in the call chain
were ignored and not propagated up the callchain to the caller.

Most notably, this caused switching branches with "git checkout" to ignore
a path modified in a work tree are different between the HEAD version and
the commit being switched to, which it internally notices but ignores it,
resulting in an incorrect two-way merge and loss of the change in the work
tree.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-10 01:26:23 -07:00
Linus Torvalds 1caeacc1f2 unpack_trees(): minor memory leak fix in unused destination index
This adds a "discard_index(&o->result)" to the failure path, to reclaim
memory from an in-core index we built but ended up not using.

The *big* memory leak comes from the fact that we leak the cache_entry
things left and right. That's a very traditional and deliberate leak:
because we used to build up the cache entries by just mapping them
directly in from the index file (and we emulate that in modern times
by allocating them from one big array), we can't actually free them
one-by-one.

So doing the "discard_index()" will free the hash tables etc, which is
good, and it will free the "istate->alloc" but that is never set on the
result because we don't get the result from the index read. So we don't
actually free the individual cache entries themselves that got created
from the trees.

That's not something new, btw. We never did. But some day we should just
add a flag to the cache_entry() that it's a "free one by one" kind, and
then we could/should do it. In the meantime, this one-liner will fix
*some* of the memory leaks, but not that old traditional one.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 01:03:45 -08:00
Linus Torvalds 34110cd4e3 Make 'unpack_trees()' have a separate source and destination index
We will always unpack into our own internal index, but we will take the
source from wherever specified, and we will optionally write the result
to a specified index (optionally, because not everybody even _wants_ any
result: the index diffing really wants to just walk the tree and index
in parallel).

This ends up removing a fair number more lines than it adds, for the
simple reason that we can now skip all the crud that tried to be
oh-so-careful about maintaining our position in the index as we were
traversing and modifying it.  Since we don't actually modify the source
index any more, we can just update the 'o->pos' pointer without worrying
about whether an index entry got removed or replaced or added to.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 01:03:38 -08:00
Linus Torvalds bc052d7f43 Make 'unpack_trees()' take the index to work on as an argument
This is just a very mechanical conversion, and makes everybody set it to
'&the_index' before calling, but at least it makes it more explicit
where we work with the index.

The next stage would be to split that index usage up into a 'source' and
a 'destination' index, so that we can unpack into a different index than
we started out from.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:48 -08:00
Linus Torvalds 01904572a5 Move 'unpack_trees()' over to 'traverse_trees()' interface
This not only deletes more code than it adds, it gets rid of a
singularly hard-to-understand function (unpack_trees_rec()), and
replaces it with a set of smaller and simpler functions that use the
generic tree traversal mechanism to walk over one or more git trees in
parallel.

It's still not the most wonderful interface, and by no means is the new
code easy to understand either, but it's at least a bit less opaque.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:47 -08:00
Junio C Hamano 5a4d707a6d Merge branch 'db/checkout'
* db/checkout: (21 commits)
  checkout: error out when index is unmerged even with -m
  checkout: show progress when checkout takes long time while switching branches
  Add merge-subtree back
  checkout: updates to tracking report
  builtin-checkout.c: Remove unused prefix arguments in switch_branches path
  checkout: work from a subdirectory
  checkout: tone down the "forked status" diagnostic messages
  Clean up reporting differences on branch switch
  builtin-checkout.c: fix possible usage segfault
  checkout: notice when the switched branch is behind or forked
  Build in checkout
  Move code to clean up after a branch change to branch.c
  Library function to check for unmerged index entries
  Use diff -u instead of diff in t7201
  Move create_branch into a library file
  Build-in merge-recursive
  Add "skip_unmerged" option to unpack_trees.
  Discard "deleted" cache entries after using them to update the working tree
  Send unpack-trees debugging output to stderr
  Add flag to make unpack_trees() not print errors.
  ...

Conflicts:

	Makefile
2008-02-27 12:53:26 -08:00
Linus Torvalds e85486450e Be more verbose when checkout takes a long time
So I find it irritating when git thinks for a long time without telling me
what's taking so long. And by "long time" I definitely mean less than two
seconds, which is already way too long for me.

This hits me when doing a large pull and the checkout takes a long time,
or when just switching to another branch that is old and again checkout
takes a while.

Now, git read-tree already had support for the "-v" flag that does nice
updates about what's going on, but it was delayed by two seconds, and if
the thing had already done more than half by then it would be quiet even
after that, so in practice it meant that we migth be quiet for up to four
seconds. Much too long.

So this patch changes the timeout to just one second, which makes it much
more palatable to me.

The other thing this patch does is that "git checkout" now doesn't disable
the "-v" flag when doing its thing, and only disables the output when
given the -q flag.  When allowing "checkout -m" to fall back to a 3-way
merge, the users will see the error message from straight "checkout",
so we will tell them that we do fall back to make them look less scary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-24 10:01:13 -08:00
Linus Torvalds eb7a2f1d50 Use helper function for copying index entry information
We used to just memcpy() the index entry when we copied the stat() and
SHA1 hash information, which worked well enough back when the index
entry was just an exact bit-for-bit representation of the information on
disk.

However, these days we actually have various management information in
the cache entry too, and we should be careful to not overwrite it when
we copy the stat information from another index entry.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Junio C Hamano 987e315a6b Merge branch 'jc/gitignore-ends-with-slash'
* jc/gitignore-ends-with-slash:
  gitignore: lazily find dtype
  gitignore(5): Allow "foo/" in ignore list to match directory "foo"
2008-02-16 17:57:06 -08:00
Daniel Barkalow 4e7c4571b8 Add "skip_unmerged" option to unpack_trees.
This option allows the caller to reset everything that isn't unmerged,
leaving the unmerged things to be resolved. If, after a merge of
"working" and "HEAD", this is used with "HEAD" (reset, !update), the
result will be that all of the changes from "local" are in the working
tree but not added to the index (either with the index clean but
unchanged, or with the index unmerged, depending on whether there are
conflicts).

This will be used in checkout -m.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Daniel Barkalow 33ecf7eb61 Discard "deleted" cache entries after using them to update the working tree
Way back in read-tree.c, we used a mode 0 cache entry to indicate that
an entry had been deleted, so that the update code would remove the
working tree file, and we would just skip it when writing out the
index file afterward.

These days, unpack_trees is a library function, and it is still
leaving these entries in the active cache. Furthermore, unpack_trees
doesn't correctly ignore those entries, and who knows what other code
wouldn't expect them to be there, but just isn't yet called after a
call to unpack_trees. To avoid having other code trip over these
entries, have check_updates() remove them after it removes the working
tree files.

While we're at it, simplify the loop in check_updates(), and avoid
passing global variables as parameters to check_updates(): there is
only one call site anyway.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Daniel Barkalow b05c6dff8a Send unpack-trees debugging output to stderr
This is to keep git-stash from getting confused if you're debugging
unpack-trees.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Daniel Barkalow 17e4642667 Add flag to make unpack_trees() not print errors.
(This applies only to errors where a plausible operation is impossible due
to the particular data, not to errors resulting from misuse of the merge
functions.)

This will allow builtin-checkout to suppress merge errors if it's
going to try more merging methods.

Additionally, if unpack_trees() returns with an error, but without
printing anything, it will roll back any changes to the index (by
rereading the index, currently). This obviously could be done by the
caller, but chances are that the caller would forget and debugging
this is difficult. Also, future implementations may give unpack_trees() a
more efficient way of undoing its changes than the caller could.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Daniel Barkalow 203a2fe117 Allow callers of unpack_trees() to handle failure
Return an error from unpack_trees() instead of calling die(), and exit
with an error in read-tree, builtin-commit, and diff-lib. merge-recursive
already expected an error return from unpack_trees, so it doesn't need to
be changed. The merge function can return negative to abort.

This will be used in builtin-checkout -m.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Junio C Hamano 6831a88ac0 gitignore: lazily find dtype
When we process "foo/" entries in gitignore files on a system
that does not have d_type member in "struct dirent", the earlier
implementation ran lstat(2) separately when matching with
entries that came from the command line, in-tree .gitignore
files, and $GIT_DIR/info/excludes file.

This optimizes it by delaying the lstat(2) call until it becomes
absolutely necessary.

The initial idea for this change was by Jeff King, but I
optimized it further to pass pointers to around.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00
Junio C Hamano d6b8fc303b gitignore(5): Allow "foo/" in ignore list to match directory "foo"
A pattern "foo/" in the exclude list did not match directory
"foo", but a pattern "foo" did.  This attempts to extend the
exclude mechanism so that it would while not matching a regular
file or a symbolic link "foo".  In order to differentiate a
directory and non directory, this passes down the type of path
being checked to excluded() function.

A downside is that the recursive directory walk may need to run
lstat(2) more often on systems whose "struct dirent" do not give
the type of the entry; earlier it did not have to do so for an
excluded path, but we now need to figure out if a path is a
directory before deciding to exclude it.  This is especially bad
because an idea similar to the earlier CE_UPTODATE optimization
to reduce number of lstat(2) calls would by definition not apply
to the codepaths involved, as (1) directories will not be
registered in the index, and (2) excluded paths will not be in
the index anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00