Commit graph

232 commits

Author SHA1 Message Date
Junio C Hamano 64ca23afda discard_cache: reset lazy name_hash bit
We forgot to reset name_hash_initialized bit when discarding the in-core index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-23 13:05:10 -07:00
Junio C Hamano 16ce2e4c8f index: future proof for "extended" index entries
We do not have any more bits in the on-disk index flags word, but we would
need to have more in the future.  Use the last remaining bits as a signal
to tell us that the index entry we are looking at is an extended one.

Since we do not understand the extended format yet, we will just error out
when we see it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-17 00:22:45 -07:00
Junio C Hamano c70115b4b1 Teach gitlinks to ie_modified() and ce_modified_check_fs()
The ie_modified() function is the workhorse for refresh_cache_entry(),
i.e. checking if an index entry that is stat-dirty actually has changes.

After running quicker check to compare cached stat information with
results from the latest lstat(2) to answer "has modification" early, the
code goes on to check if there really is a change by comparing the staged
data with what is on the filesystem by asking ce_modified_check_fs().
However, this function always said "no change" for any gitlinks that has a
directory at the corresponding path.  This made ie_modified() to miss
actual changes in the subproject.

The patch fixes this first by modifying an existing short-circuit logic
before calling the ce_modified_check_fs() function.  It knows that for any
filesystem entity to which ie_match_stat() says its data has changed, if
its cached size is nonzero then the contents cannot match, which is a
correct optimization only for blob objects.  We teach gitlink objects to
this special case, as we already know that any gitlink that
ie_match_stat() says is modified is indeed modified at this point in the
codepath.

With the above change, we could leave ce_modified_check_fs() broken, but
it also futureproofs the code by teaching it to use ce_compare_gitlink(),
instead of assuming (incorrectly) that any directory is unchanged.

Originally noticed by Alex Riesen on Cygwin.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-30 00:09:22 -07:00
Alex Riesen 1ce4790bf5 Make use of stat.ctime configurable
A new configuration variable 'core.trustctime' is introduced to
allow ignoring st_ctime information when checking if paths
in the working tree has changed, because there are situations where
it produces too much false positives.  Like when file system crawlers
keep changing it when scanning and using the ctime for marking scanned
files.

The default is to notice ctime changes.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-28 23:26:25 -07:00
Petr Baudis 81dc2307d0 git-mv: Keep moved index entries inact
The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

Also the code used to refuse renaming overwriting symlink with a regular
file and vice versa; there is no need for that.

A few new tests have been added to the testsuite to reflect this change.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-27 15:05:19 -07:00
Junio C Hamano 041aee31be builtin-add.c: restructure the code for maintainability
A private function add_files_to_cache() in builtin-add.c was borrowed by
checkout and commit re-implementors without getting properly refactored to
more library-ish place.  This does the refactoring.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-25 21:14:21 -07:00
Junio C Hamano d14e7407b3 "needs update" considered harmful
"git update-index --refresh", "git reset" and "git add --refresh" have
reported paths that have local modifications as "needs update" since the
beginning of git.

Although this is logically correct in that you need to update the index at
that path before you can commit that change, it is now becoming more and
more clear, especially with the continuous push for user friendliness
since 1.5.0 series, that the message is suboptimal.  After all, the change
may be something the user might want to get rid of, and "updating" would
be absolutely a wrong thing to do if that is the case.

I prepared two alternatives to solve this.  Both aim to reword the message
to more neutral "locally modified".

This patch is a more intrusive variant that changes the message for only
Porcelain commands ("add" and "reset") while keeping the plumbing
"update-index" intact.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-20 17:21:32 -07:00
Junio C Hamano 3bf0dd1f4e read-cache.c: typofix
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-16 18:48:58 -07:00
Miklos Vajna e46bbcf6e8 Move read_cache_unmerged() to read-cache.c
builtin-read-tree has a read_cache_unmerged() which is useful for other
builtins, for example builtin-merge uses it as well. Move it to
read-cache.c to avoid code duplication.

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-30 22:45:51 -07:00
Junio C Hamano 159e639e5b Merge branch 'lt/racy-empty'
* lt/racy-empty:
  racy-git: an empty blob has a fixed object name
2008-06-22 14:34:20 -07:00
Linus Torvalds f49c2c22fe racy-git: an empty blob has a fixed object name
We use size=0 as the magic token to say the entry is known to be racily
clean, but a sequence that does:

 - update the path with a non-empty blob and write the index;
 - update an unrelated path and write the index -- this smudges
   the above entry;
 - truncate the path to size zero.

would make both the size field for the path in the index and the size on
the filesystem zero.  We should not mistake it as a clean index entry.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-19 14:14:45 -07:00
Marius Storm-Olsen aa9349d449 Add shortcut in refresh_cache_ent() for marked entries.
When a cache entry has been marked as CE_VALID, the user has
promised us that any change in the work tree does not matter.
Just mark the entry as up-to-date, and continue.

Signed-off-by: Marius Storm-Olsen <marius@trolltech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-31 14:18:20 -07:00
Junio C Hamano 0166592495 Merge branch 'jc/add-n-u'
* jc/add-n-u:
  Make git add -n and git -u -n output consistent
  "git-add -n -u" should not add but just report

Conflicts:

	builtin-add.c
	builtin-mv.c
	cache.h
	read-cache.c
2008-05-25 14:03:50 -07:00
Junio C Hamano 7e83003029 Merge branch 'js/ignore-submodule'
* js/ignore-submodule:
  Ignore dirty submodule states during rebase and stash
  Teach update-index about --ignore-submodules
  diff options: Introduce --ignore-submodules
2008-05-25 13:37:08 -07:00
Junio C Hamano 38ed1d89f7 "git-add -n -u" should not add but just report
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-21 12:04:41 -07:00
Johannes Schindelin 5fdeacb0ca Teach update-index about --ignore-submodules
Like with the diff machinery, update-index should sometimes just
ignore submodules (e.g. to determine a clean state before a rebase).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-15 16:12:43 -07:00
Alex Riesen 960b8ad1b1 Make the exit code of add_file_to_index actually useful
Update the programs which used the function (as add_file_to_cache).

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-12 20:54:46 -07:00
Linus Torvalds d177cab048 Avoid some unnecessary lstat() calls
The commit sequence used to do

	if (file_exists(p->path))
		add_file_to_cache(p->path, 0);

where both "file_exists()" and "add_file_to_cache()" needed to do a
lstat() on the path to do their work.

This cuts down 'lstat()' calls for the partial commit case by two
for each path we know about (because we do this twice per path).

Just move the lstat() to the caller instead (that's all that
"file_exists()" really does), and pass the stat information down to the
add_to_cache() function.

This essentially makes 'add_to_index()' the core function that adds a path
to the index, getting the index pointer, the pathname and the stat
information as arguments. There are then shorthand helper functions that
use this core function:

 - 'add_to_cache()' is just 'add_to_index()' with the default index

 - 'add_file_to_cache/index()' is the same, but does the lstat() call
   itself, so you can pass just the pathname if you don't already have the
   stat information available.

So old users of the 'add_file_to_xyzzy()' are essentially left unchanged,
and this just exposes the more generic helper function that can take
existing stat information into account.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:30 -07:00
Junio C Hamano 2855e70ad1 Merge branch 'py/diff-submodule'
* py/diff-submodule:
  is_racy_timestamp(): do not check timestamp for gitlinks
  diff-lib.c: rename check_work_tree_entity()
  diff: a submodule not checked out is not modified
  Add t7506 to test submodule related functions for git-status
  t4027: test diff for submodule with empty directory
2008-05-10 18:16:25 -07:00
Junio C Hamano 380a742679 Merge branch 'lt/case-insensitive'
* lt/case-insensitive:
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filsystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-10 18:14:28 -07:00
Junio C Hamano 050288d52d is_racy_timestamp(): do not check timestamp for gitlinks
Because we do not even check the timestamp to determie if a gitlink
is up to date or not, triggering the racy-timestamp check for gitlinks
does not make sense.

This fixes the recently added test in t7506.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:27 -07:00
Junio C Hamano e06c43c795 write_index(): optimize ce_smudge_racily_clean_entry() calls with CE_UPTODATE
When writing the index out, we need to check the work tree again to see if
an entry whose timestamp indicates that it could be "racily clean", in
order to smudge it if it is stat-clean but with modified contents.

However, we can skip this step for entries marked with CE_UPTODATE,
which are known to be the really clean (i.e. the one we already have
checked when we prepared the index).  This will reduce lstat(2) calls
necessary in git-status.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-12 19:42:17 -07:00
Linus Torvalds 1102952b45 Make git-add behave more sensibly in a case-insensitive environment
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.

In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a

	git add .

(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").

However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds 6835550def When adding files to the index, add support for case-independent matches
This simplifies the matching case of "I already have this file and it is
up-to-date" and makes it do the right thing in the face of
case-insensitive aliases.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds 96872bc200 Move name hashing functions into a file of its own
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.

It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds d1f128b050 Add 'const' where appropriate to index handling functions
This is in an effort to make the source index of 'unpack_trees()' as
being const, and thus making the compiler help us verify that we only
access it for reading.

The constification also extended to some of the hashing helpers that get
called indirectly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:48 -08:00
Linus Torvalds 0ab9e1e8cd Add 'df_name_compare()' helper function
This new helper is identical to base_name_compare(), except it compares
conflicting directory/file entries as equal in order to help handling DF
conflicts (thus the name).

Note that while a directory name compares as equal to a regular file
with the new helper, they then individually compare _differently_ to a
filename that has a dot after the basename (because '\0' < '.' < '/').

So a directory called "foo/" will compare equal to a file "foo", even
though "foo.c" will compare after "foo" and before "foo/"

This will be used by routines that want to traverse the git namespace
but then handle conflicting entries together when possible.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:46 -08:00
Junio C Hamano 5a4d707a6d Merge branch 'db/checkout'
* db/checkout: (21 commits)
  checkout: error out when index is unmerged even with -m
  checkout: show progress when checkout takes long time while switching branches
  Add merge-subtree back
  checkout: updates to tracking report
  builtin-checkout.c: Remove unused prefix arguments in switch_branches path
  checkout: work from a subdirectory
  checkout: tone down the "forked status" diagnostic messages
  Clean up reporting differences on branch switch
  builtin-checkout.c: fix possible usage segfault
  checkout: notice when the switched branch is behind or forked
  Build in checkout
  Move code to clean up after a branch change to branch.c
  Library function to check for unmerged index entries
  Use diff -u instead of diff in t7201
  Move create_branch into a library file
  Build-in merge-recursive
  Add "skip_unmerged" option to unpack_trees.
  Discard "deleted" cache entries after using them to update the working tree
  Send unpack-trees debugging output to stderr
  Add flag to make unpack_trees() not print errors.
  ...

Conflicts:

	Makefile
2008-02-27 12:53:26 -08:00
Linus Torvalds d070e3a31b Name hash fixups: export (and rename) remove_hash_entry
This makes the name hash removal function (which really just sets the
bit that disables lookups of it) available to external routines, and
makes read_cache_unmerged() use it when it drops an unmerged entry from
the index.

It's renamed to remove_index_entry(), and we drop the (unused) 'istate'
argument.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds a22c637124 Fix name re-hashing semantics
We handled the case of removing and re-inserting cache entries badly,
which is something that merging commonly needs to do (removing the
different stages, and then re-inserting one of them as the merged
state).

We even had a rather ugly special case for this failure case, where
replace_index_entry() basically turned itself into a no-op if the new
and the old entries were the same, exactly because the hash routines
didn't handle it on their own.

So what this patch does is to not just have the UNHASHED bit, but a
HASHED bit too, and when you insert an entry into the name hash, that
involves:

 - clear the UNHASHED bit, because now it's valid again for lookup
   (which is really all that UNHASHED meant)

 - if we're being lazy, we're done here (but we still want to clear the
   UNHASHED bit regardless of lazy mode, since we can become unlazy
   later, and so we need the UNHASHED bit to always be set correctly,
   even if we never actually insert the entry into the hash list)

 - if it was already hashed, we just leave it on the list

 - otherwise mark it HASHED and insert it into the list

this all means that unhashing and rehashing a name all just works
automatically.  Obviously, you cannot change the name of an entry (that
would be a serious bug), but nothing can validly do that anyway (you'd
have to allocate a new struct cache_entry anyway since the name length
could change), so that's not a new limitation.

The code actually gets simpler in many ways, although the lazy hashing
does mean that there are a few odd cases (ie something can be marked
unhashed even though it was never on the hash in the first place, and
isn't actually marked hashed!).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Daniel Barkalow 94a5728cfb Library function to check for unmerged index entries
It's small, but it was in three places already, so it should be in the
library.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Junio C Hamano 9cb76b8cdc lazy index hashing
This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 23:01:13 -08:00
Linus Torvalds cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" becomes much much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except puopulate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
Junio C Hamano 6d91da6d3c read-cache.c: introduce is_racy_timestamp() helper
This moves a common boolean expression into a helper function,
and makes the comparison between filesystem timestamp and index
timestamp done in the function in line with the other places.
st.st_mtime should be casted to (unsigned int) when compared to
an index timestamp ce_mtime.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:26:40 -08:00
Junio C Hamano 077c48df8a read-cache.c: fix a couple more CE_REMOVE conversion
It is a D/F conflict if you want to add "foo/bar" to the index
when "foo" already exists.  Also it is a conflict if you want to
add a file "foo" when "foo/bar" exists.

An exception is when the existing entry is there only to mark "I
used to be here but I am being removed".  This is needed for
operations such as "git read-tree -m -u" that update the index
and then reflect the result to the work tree --- we need to
remember what to remove somewhere, and we use the index for
that.  In such a case, an existing file "foo" is being removed
and we can create "foo/" directory and hang "bar" underneath it
without any conflict.

We used to use (ce->ce_mode == 0) to mark an entry that is being
removed, but (CE_REMOVE & ce->ce_flags) is used for that purpose
these days.  An earlier commit forgot to convert the logic in
the code that checks D/F conflict condition.

The old code knew that "to be removed" entries cannot be at
higher stage and actively checked that condition, but it was an
unnecessary check.  This patch removes the extra check as well.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:24:21 -08:00
Linus Torvalds 7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano eadb583134 Avoid running lstat(2) on the same cache entry.
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entry that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano 7fec10b7f4 index: be careful when handling long names
We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano c78a24986d Merge branch 'jc/maint-add-sync-stat'
* jc/maint-add-sync-stat:
  t2200: test more cases of "add -u"
  git-add: make the entry stat-clean after re-adding the same contents
  ce_match_stat, run_diff_files: use symbolic constants for readability

Conflicts:

	builtin-add.c
2007-11-14 14:15:40 -08:00
Junio C Hamano fb63d7f889 git-add: make the entry stat-clean after re-adding the same contents
Earlier in commit 0781b8a9b2
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() were taught not to re-add the path
if it already matches the index.

The change meant well, but was not executed quite right.  It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".

This was wrong.  There are three possible comparison results
between the index and the file in the work tree:

 - with lstat(2) we _know_ they are different.  E.g. if the
   length or the owner in the cached stat information is
   different from the length we just obtained from lstat(2), we
   can tell the file is modified without looking at the actual
   contents.

 - with lstat(2) we _know_ they are the same.  The same length,
   the same owner, the same everything (but this has a twist, as
   described below).

 - we cannot tell from lstat(2) information alone and need to go
   to the filesystem to actually compare.

The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:

    $ echo hello >file
    $ git add file
    $ echo aeiou >file ;# the same length

If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified.  The path is called 'racily clean'.  We need to
reliably detect racily clean paths are in fact modified.

To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.

That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean.  With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:

    $ echo hello >file
    $ git add file
    $ echo hello >file ;# the same contents
    $ git add file

The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.

There was another problem with "git add -u".  The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked "git diff-files" machinery
run_diff_files() function (which is "git diff-files") to list
the paths that are modified.  But "git diff-files" machinery
uses the same ie_modified() call so that it does not report
racily clean _and_ actually clean paths as modified, which is
not what we want.

The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes
"git-add -u" codepath to use that option, to discover and re-add
racily clean _and_ actually clean paths.

We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.

This patch applies to maint and all the way up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:37:39 -08:00
Junio C Hamano 4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
Shawn O. Pearce e75c55844f Merge branch 'maint'
* maint:
  Yet more 1.5.3.5 fixes mentioned in release notes
  cvsserver: Use exit 1 instead of die when req_Root fails.
  git-blame shouldn't crash if run in an unmerged tree
  git-config: print error message if the config file cannot be read
  fixing output of non-fast-forward output of post-receive-email
2007-10-18 03:11:17 -04:00
Linus Torvalds cd8ae20195 git-blame shouldn't crash if run in an unmerged tree
If we are in the middle of resolving a merge conflict there may be
one or more files whose entries in the index represent an unmerged
state (index entries in the higher-order stages).

Attempting to run git-blame on any file in such a working directory
resulted in "fatal: internal error: ce_mode is 0" as we use the magic
marker for an unmerged entry is 0 (set up by things like diff-lib.c's
do_diff_cache() and builtin-read-tree.c's read_tree_unmerged())
and the ce_match_stat_basic() function gets upset about this.

I'm not entirely sure that the whole "ce_mode = 0" case is a good
idea to begin with, and maybe the right thing to do is to remove
that horrid freakish special case, but removing the internal error
seems to be the simplest fix for now.

                Linus

[sp: Thanks to Björn Steinbrink for the test case]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-18 02:31:30 -04:00
Carlos Rica 102c2338da Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 13:42:10 -07:00
Pierre Habouzit 1dffb8fa80 Small cache_tree_write refactor.
This function cannot fail, make it void. Also make write_one act on a
const char* instead of a char*.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 02:27:06 -07:00
Junio C Hamano 58f6fb53dd Merge branch 'jc/cachetree' into cr/reset
* jc/cachetree:
  Simplify cache API
  git-format-patch --in-reply-to: accept <message@id> with angle brackets
  git-add -u: do not barf on type changes
  Remove duplicate note about removing commits with git-filter-branch
  git-clone: improve error message if curl program is missing or not executable
  git.el: Allow the add and remove commands to be applied to ignored files.
  git.el: Allow selecting whether to display uptodate/unknown/ignored files.
  git.el: Keep the status buffer sorted by filename.
  hooks--update: Explicitly check for all zeros for a deleted ref.
2007-09-14 01:19:30 -07:00
Junio C Hamano 09d5dc32fb Simplify cache API
Earlier, add_file_to_index() invalidated the path in the cache-tree
but remove_file_from_cache() did not, and the user of the latter
needed to invalidate the entry himself.  This led to a few bugs due to
missed invalidate calls already.  This patch makes the management of
cache-tree less error prone by making more invalidate calls from lower
level cache API functions.

The rules are:

 - If you are going to write the index, you should either maintain
   cache_tree correctly.

   - If you cannot, alternatively you can remove the entire cache_tree
     by calling cache_tree_free() before you call write_cache().

   - When you modify the index, cache_tree_invalidate_path() should be
     called with the path you are modifying, to discard the entry from
     the cache-tree structure.

 - The following cache API functions exported from read-cache.c (and
   the macro whose names have "cache" instead of "index")
   automatically call cache_tree_invalidate_path() for you:

   - remove_file_from_index();
   - add_file_to_index();
   - add_index_entry();

   You can modify the index bypassing the above API functions
   (e.g. find an existing cache entry from the index and modify it in
   place).  You need to call cache_tree_invalidate_path() yourself in
   such a case.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-14 01:02:21 -07:00
Carlos Rica 6640f88165 Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-12 13:25:07 -07:00
Alexandre Julliard d616813d75 git-add: Add support for --refresh option.
This allows to refresh only a subset of the project files, based on
the specified pathspecs.

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-13 12:58:38 -07:00
Junio C Hamano af3785dc5a Optimize "diff --cached" performance.
The read_tree() function is called only from the call chain to
run "git diff --cached" (this includes the internal call made by
git-runstatus to run_diff_index()).  The function vacates stage
without any funky "merge" magic.  The caller then goes and
compares stage #1 entries from the tree with stage #0 entries
from the original index.

When adding the cache entries this way, it used the general
purpose add_cache_entry().  This function looks for an existing
entry to replace or if there is none to find where to insert the
new entry, resolves D/F conflict and all the other things.

For the purpose of reading entries into an empty stage, none of
that processing is needed.  We can instead append everything and
then sort the result at the end.

This commit changes read_tree() to first make sure that there is
no existing cache entries at specified stage, and if that is the
case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag
(new), and then sort the resulting cache using qsort().

This new flag tells add_cache_entry() to omit all the checks
such as "Does this path already exist?  Does adding this path
remove other existing entries because it turns a directory to a
file?" and instead append the given cache entry straight at the
end of the active cache.  The caller of course is expected to
sort the resulting cache at the end before using the result.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10 11:44:23 -07:00
Junio C Hamano 0781b8a9b2 add_file_to_index: skip rehashing if the cached stat already matches
An earlier commit 366bfcb6 broke git-add by moving read_cache()
call down, because it wanted the directory walking code to grab
paths that are already in the index.  The change serves its
purpose, but introduces a regression because the responsibility
of avoiding unnecessary reindexing by matching the cached stat
is shifted nowhere.

This makes it the job of add_file_to_index() function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-30 17:49:50 -07:00
Johannes Schindelin 2031427167 git add: respect core.filemode with unmerged entries
When a merge left unmerged entries, git add failed to pick up the
file mode from the index, when core.filemode == 0. If more than one
unmerged entry is there, the order of stage preference is 2, 1, 3.

Noticed by Johannes Sixt.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-01 13:26:05 -07:00
Junio C Hamano a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Martin Waitz 302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Luiz Fernando N. Capitulino 3511a3774e read_cache_from(): small simplification
This change 'opens' the code block which maps the index file into
memory, making the code clearer and easier to read.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:44:27 -07:00
Junio C Hamano 4aab5b46f4 Make read-cache.c "the_index" free.
This makes all low-level functions defined in read-cache.c to
take an explicit index_state structure as their first parameter,
to specify which index to work on.  These functions
traditionally operated on "the_index" and were named foo_cache();
the counterparts this patch introduces are called foo_index().

The traditional foo_cache() functions are made into macros that
give "the_index" to their corresponding foo_index() functions.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano 228e94f935 Move index-related variables into a structure.
This defines a index_state structure and moves index-related
global variables into it.  Currently there is one instance of
it, the_index, and everybody accesses it, so there is no code
change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Linus Torvalds a8ee75bc7a Fix gitlink index entry filesystem matching
The code to match up index entries with the filesystem was stupidly
broken.  We shouldn't compare the filesystem stat() information with
S_IFDIRLNK, since that's purely a git-internal value, and not what the
filesystem uses (on the filesystem, it's just a regular directory).

Also, don't bother to make the stat() time comparisons etc for DIRLNK
entries in ce_match_stat_basic(), since we do an exact match for these
things, and the hints in the stat data simply doesn't matter.

This fixes "git status" with submodules that haven't been checked out in
the supermodule.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 03:14:12 -07:00
Linus Torvalds 095952585c Teach directory traversal about subprojects
This is the promised cleaned-up version of teaching directory traversal
(ie the "read_directory()" logic) about subprojects. That makes "git add"
understand to add/update subprojects.

It now knows to look at the index file to see if a directory is marked as
a subproject, and use that as information as whether it should be recursed
into or not.

It also generally cleans up the handling of directory entries when
traversing the working tree, by splitting up the decision-making process
into small functions of their own, and adding a fair number of comments.

Finally, it teaches "add_file_to_cache()" that directory names can have
slashes at the end, since the directory traversal adds them to make the
difference between a file and a directory clear (it always did that, but
my previous too-ugly-to-apply subproject patch had a totally different
path for subproject directories and avoided the slash for that case).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 19:09:55 -07:00
Linus Torvalds 1833a92548 Fix thinko in subproject entry sorting
This fixes a total thinko in my original series: subprojects do *not* sort
like directories, because the index is sorted purely by full pathname, and
since a subproject shows up in the index as a normal NUL-terminated
string, it never has the issues with sorting with the '/' at the end.

So if you have a subproject "proj" and a file "proj.c", the subproject
sorts alphabetically before the file in the index (and must thus also sort
that way in a tree object, since trees sort as the index).

In contrast, it you have two files "proj/file" and "proj.c", the "proj.c"
will sort alphabetically before "proj/file" in the index. The index
itself, of course, does not actually contain an entry "proj/", but in the
*tree* that gets written out, the tree entry "proj" will sort after the
file entry "proj.c", which is the only real magic sorting rule.

In other words: the magic sorting rule only affects tree entries, and
*only* affects tree entries that point to other trees (ie are of the type
S_IFDIR).

Anyway, that thinko just means that we should remove the special case to
make S_ISDIRLNK entries sort like S_ISDIR entries. They don't.  They sort
like normal files.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 17:21:12 -07:00
Linus Torvalds f35a6d3bce Teach core object handling functions about gitlinks
This teaches the really fundamental core SHA1 object handling routines
about gitlinks.  We can compare trees with gitlinks in them (although we
can not actually generate patches for them yet - just raw git diffs),
and they show up as commits in "git ls-tree".

We also know to compare gitlinks as if they were directories (ie the
normal "sort as trees" rules apply).

[jc: amended a cut&paste error]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 13:50:43 -07:00
Junio C Hamano 640ee0d1cd Merge branch 'jc/read-tree-df' (early part)
* 'jc/read-tree-df' (early part):
  Fix switching to a branch with D/F when current branch has file D.
  Fix twoway_merge that passed d/f conflict marker to merged_entry().
  Fix read-tree --prefix=dir/.
  unpack-trees: get rid of *indpos parameter.
  unpack_trees.c: pass unpack_trees_options structure to keep_entry() as well.
  add_cache_entry(): removal of file foo does not conflict with foo/bar
2007-04-07 23:52:40 -07:00
Junio C Hamano fd1c3bf053 Rename add_file_to_index() to add_file_to_cache()
This function was not called "add_file_to_cache()" only because
an ancient program, update-cache, used that name as an internal
function name that does something slightly different.  Now that
is gone, we can take over the better name.

The plan is to name all functions that operate on the default
index xxx_cache().  Later patches create a variant of them that
take an explicit parameter xxx_index(), and then turn
xxx_cache() functions into macros that use "the_index".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano ec0cc70469 Propagate cache error internal to refresh_cache() via parameter.
The function refresh_cache() is the only user of cache_errno
that switches its behaviour based on what internal function
refresh_cache_entry() finds; pass the error status back in a
parameter passed down to it, to get rid of the global variable
cache_errno.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano 0424138d57 Fix bogus error message from merge-recursive error path
This error message should not usually trigger, but the function
make_cache_entry() called by add_cacheinfo() can return early
without calling into refresh_cache_entry() that sets cache_errno.

Also the error message had a wrong function name reported, and
it did not say anything about which path failed either.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano 21cd8d00b6 add_cache_entry(): removal of file foo does not conflict with foo/bar
Similarly, removal of file foo/bar does not conflict with a file foo.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-04 00:19:28 -07:00
Shawn O. Pearce dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
Johannes Sixt 78a8d641c1 Add core.symlinks to mark filesystems that do not support symbolic links.
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.

This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system.  A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).

Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02 16:58:05 -08:00
Junio C Hamano 53bca91a7d index_fd(): pass optional path parameter as hint for blob conversion
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Junio C Hamano edaec3fbe8 index_fd(): use enum object_type instead of type name string.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Nicolas Pitre 21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Junio C Hamano 185c975faa Do not take mode bits from index after type change.
When we do not trust executable bit from lstat(2), we copied
existing ce_mode bits without checking if the filesystem object
is a regular file (which is the only thing we apply the "trust
executable bit" business) nor if the blob in the index is a
regular file (otherwise, we should do the same as registering a
new regular file, which is to default non-executable).

Noticed by Johannes Sixt.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-16 22:56:06 -08:00
Linus Torvalds 2cdf9509df write-cache: do not leak the serialized cache-tree data.
It is not used after getting written, and just is leaking every time
we write the index out.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-11 12:25:16 -08:00
Andy Whitcroft 93822c2239 short i/o: fix calls to write to use xwrite or write_in_full
We have a number of badly checked write() calls.  Often we are
expecting write() to write exactly the size we requested or fail,
this fails to handle interrupts or short writes.  Switch to using
the new write_in_full().  Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xwrite().

Note, the changes to config handling are much larger and handled
in the next patch in the sequence.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Shawn O. Pearce 5fe5c8300d Cleanup read_cache_from error handling.
When I converted the mmap() call to xmmap() I failed to cleanup the
way this routine handles errors and left some crufty code behind.
This is a small cleanup, suggested by Johannes.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce c4712e4553 Replace mmap with xmmap, better handling MAP_FAILED.
In some cases we did not even bother to check the return value of
mmap() and just assume it worked.  This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.

We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location.  If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt.  If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Junio C Hamano 81a361be3b Fix check_file_directory_conflict().
When replacing an existing file A with a directory A that has a
file A/B in it in the index, 'update-index --replace --add A/B'
did not properly remove the file to make room for the new
directory.

There was a trivial logic error, most likely a cut & paste one,
dating back to quite early days of git.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:44 -08:00
Junio C Hamano c33ab0dd10 git-add: remove conflicting entry when adding.
When replacing an existing file A with a directory A that has a
file A/B in it in the index, 'git add' did not succeed because
it forgot to pass the allow-replace flag to add_cache_entry().

It might be safer to leave this as an error and require the user
to explicitly remove the existing A first before adding A/B
since it is an unusual case, but doing that automatically is
much easier to use.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:43 -08:00
Junio C Hamano 790fa0e297 update-index: make D/F conflict error a bit more verbose.
When you remove a directory D that has a tracked file D/F out of the
way to create a file D and try to "git update-index --add D", it used
to say "cannot add" which was not very helpful.  This issues an extra
error message to explain the situation before the final "fatal" message.

Since D/F conflicts are relatively rare event, extra verbosity would
not make things too noisy.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:43 -08:00
Junio C Hamano 2bbaaed9ee trust-executable-bit: fix breakage for symlinks
An earlier commit f28b34a broke symlinks when trust-executable-bit
is not set because it incorrectly assumed that everything was a
regular file.

Reported by Juergen Ruehle.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-22 16:36:49 -08:00
Rene Scharfe a6e8a76770 sparse fix: non-ANSI function declaration
The declaration of discard_cache() in cache.h already has its "void".

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-18 11:40:00 -08:00
Shawn Pearce fd28b34afd Ignore executable bit when adding files if filemode=0.
If the user has configured core.filemode=0 then we shouldn't set
the execute bit in the index when adding a new file as the user
has indicated that the local filesystem can't be trusted.

This means that when adding files that should be marked executable
in a repository with core.filemode=0 the user must perform a
'git update-index --chmod=+x' on the file before committing the
addition.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-26 22:42:52 -07:00
Junio C Hamano 1e49cb8ad4 Merge branch 'js/c-merge-recursive'
* js/c-merge-recursive: (21 commits)
  discard_cache(): discard index, even if no file was mmap()ed
  merge-recur: do not die unnecessarily
  merge-recur: try to merge older merge bases first
  merge-recur: if there is no common ancestor, fake empty one
  merge-recur: do not setenv("GIT_INDEX_FILE")
  merge-recur: do not call git-write-tree
  merge-recursive: fix rename handling
  .gitignore: git-merge-recur is a built file.
  merge-recur: virtual commits shall never be parsed
  merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
  merge-recur: fix thinko in unique_path()
  Makefile: git-merge-recur depends on xdiff libraries.
  merge-recur: Explain why sha_eq() and struct stage_data cannot go
  merge-recur: Cleanup last mixedCase variables...
  merge-recur: Fix compiler warning with -pedantic
  merge-recur: Remove dead code
  merge-recur: Get rid of debug code
  merge-recur: Convert variable names to lower_case
  Cumulative update of merge-recursive in C
  recur vs recursive: help testing without touching too many stuff.
  ...

This is an evil merge that removes TEST script from the toplevel.
2006-08-27 20:33:46 -07:00
David Rientjes a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
Junio C Hamano 8e3abd4c97 Merge branch 'jc/racy'
* jc/racy:
  Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
  Add check program "git-check-racy"
  Documentation/technical/racy-git.txt
  avoid nanosleep(2)
2006-08-16 14:00:34 -07:00
Junio C Hamano 0fc82cff12 Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
The work-around should not be needed.  Even if it turns out we
would want it later, git will remember the patch for us ;-).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 22:12:54 -07:00
Junio C Hamano 42f774063d Add check program "git-check-racy"
This will help counting the racily clean paths, but it should be
useless for daily use.  Do not even enable it in the makefile.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:38:07 -07:00
David Rientjes 96f1e58f52 remove unnecessary initializations
[jc: I needed to hand merge the changes to the updated codebase,
 so the result needs to be checked.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:22:20 -07:00
Junio C Hamano 789a09b487 avoid nanosleep(2)
On Solaris nanosleep(2) is not available in libc; you need to
link with -lrt to get it.

The purpose of the loop is to wait until the next filesystem
timestamp granularity, and the code uses subsecond sleep in the
hope that it can shorten the delay to 0.5 seconds on average
instead of a full second.  It is probably not worth depending on
an extra library for this.

We might want to yank out the whole "racy-git avoidance is
costly later at runtime, so let's delay writing the index out"
codepath later, but that is a separate issue and needs some
testing on large trees to figure it out.  After playing with the
kernel tree, I have a feeling that the whole thing may not be
worth it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 03:39:47 -07:00
David Rientjes 968a1d65f4 read-cache.c cleanup
Removes conditional returns.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-14 18:41:12 -07:00
Junio C Hamano eed94a570e Merge branch 'master' into js/c-merge-recursive
Adjust to hold_lock_file_for_update() change on the master.
2006-08-12 18:35:14 -07:00
Johannes Schindelin 4147d801db discard_cache(): discard index, even if no file was mmap()ed
Since add_cacheinfo() can be called without a mapped index file,
discard_cache() _has_ to discard the entries, even when
cache_mmap == NULL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-10 14:30:01 -07:00
Junio C Hamano 6015c28b1d read-cache: tweak racy-git delay logic
Instead of looping over the entries and writing out, use a
separate loop after all entries have been written out to check
how many entries are racily clean.  Make sure that the newly
created index file gets the right timestamp when we check by
flushing the buffered data by ce_write().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-08 17:17:04 -07:00
Junio C Hamano b7e58b17b5 Racy git: avoid having to be always too careful
Immediately after a bulk checkout, most of the paths in the
working tree would have the same timestamp as the index file,
and this would force ce_match_stat() to take slow path for all
of them.  When writing an index file out, if many of the paths
have very new (read: the same timestamp as the index file being
written out) timestamp, we are better off delaying the return
from the command, to make sure that later command to touch the
working tree files will leave newer timestamps than recorded in
the index, thereby avoiding to take the slow path.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-07 01:58:53 -07:00
Linus Torvalds 7f8508e8d3 Fix double "close()" in ce_compare_data
Doing an "strace" on "git diff" shows that we close() a file descriptor
twice (getting EBADFD on the second one) when we end up in ce_compare_data
if the index does not match the checked-out stat information.

The "index_fd()" function will already have closed the fd for us, so we
should not close it again.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31 11:55:56 -07:00
Junio C Hamano c1a788acee Merge branch 'js/read-tree' into js/c-merge-recursive
* js/read-tree: (107 commits)
  read-tree: move merge functions to the library
  read-trees: refactor the unpack_trees() part
  tar-tree: illustrate an obscure feature better
  git.c: allow alias expansion without a git directory
  setup_git_directory_gently: do not barf when GIT_DIR is given.
  Build on Debian GNU/kFreeBSD
  Call setup_git_directory() much earlier
  Call setup_git_directory() early
  Display an error from update-ref if target ref name is invalid.
  Fix http-fetch
  t4103: fix binary patch application test.
  git-apply -R: binary patches are irreversible for now.
  Teach git-apply about '-R'
  Makefile: ssh-pull.o depends on ssh-fetch.c
  log and diff family: honor config even from subdirectories
  git-reset: detect update-ref error and report it.
  lost-found: use fsck-objects --full
  Teach git-http-fetch the --stdin switch
  Teach git-local-fetch the --stdin switch
  Make pull() support fetching multiple targets at once
  ...
2006-07-30 23:42:10 -07:00
Johannes Schindelin 11be42a476 Make git-mv a builtin
This also moves add_file_to_index() to read-cache.c. Oh, and while
touching builtin-add.c, it also removes a duplicate git_config() call.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Johannes Schindelin 8fd2cb4069 Extract helper bits from c-merge-recursive work
This backports the pieces that are not uncooked from the merge-recursive
WIP we have seen earlier, to be used in git-mv rewritten in C.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Johannes Schindelin 6d297f8137 Status update on merge-recursive in C
This is just an update for people being interested. Alex and me were
busy with that project for a few days now. While it has progressed nicely,
there are quite a couple TODOs in merge-recursive.c, just search for "TODO".

For impatient people: yes, it passes all the tests, and yes, according
to the evil test Alex did, it is faster than the Python script.

But no, it is not yet finished. Biggest points are:

- there are still three external calls
- in the end, it should not be necessary to write the index more than once
  (just before exiting)
- a lot of things can be refactored to make the code easier and shorter

BTW we cannot just plug in git-merge-tree yet, because git-merge-tree
does not handle renames at all.

This patch is meant for testing, and as such,

- it compile the program to git-merge-recur
- it adjusts the scripts and tests to use git-merge-recur instead of
  git-merge-recursive
- it provides "TEST", a script to execute the tests regarding -recursive
- it inlines the changes to read-cache.c (read_cache_from(), discard_cache()
  and refresh_cache_entry())

Brought to you by Alex Riesen and Dscho

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 23:10:19 -07:00
Pavel Roskin a9486b02ec Avoid C99 comments, use old-style C comments instead.
This doesn't make the code uglier or harder to read, yet it makes the
code more portable.  This also simplifies checking for other potential
incompatibilities.  "gcc -std=c89 -pedantic" can flag many incompatible
constructs as warnings, but C99 comments will cause it to emit an error.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-10 00:47:13 -07:00