Commit graph

3138 commits

Author SHA1 Message Date
Junio C Hamano fe9122a352 Merge branch 'dd/use-alloc-grow'
Replace open-coded reallocation with ALLOC_GROW() macro.

* dd/use-alloc-grow:
  sha1_file.c: use ALLOC_GROW() in pretend_sha1_file()
  read-cache.c: use ALLOC_GROW() in add_index_entry()
  builtin/mktree.c: use ALLOC_GROW() in append_to_tree()
  attr.c: use ALLOC_GROW() in handle_attr_line()
  dir.c: use ALLOC_GROW() in create_simplify()
  reflog-walk.c: use ALLOC_GROW()
  replace_object.c: use ALLOC_GROW() in register_replace_object()
  patch-ids.c: use ALLOC_GROW() in add_commit()
  diffcore-rename.c: use ALLOC_GROW()
  diff.c: use ALLOC_GROW()
  commit.c: use ALLOC_GROW() in register_commit_graft()
  cache-tree.c: use ALLOC_GROW() in find_subtree()
  bundle.c: use ALLOC_GROW() in add_to_ref_list()
  builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-18 13:50:21 -07:00
Junio C Hamano 15520a858f Merge branch 'jk/clean-d-pathspec'
"git clean -d pathspec" did not use the given pathspec correctly
and ended up cleaning too much.

* jk/clean-d-pathspec:
  clean: simplify dir/not-dir logic
  clean: respect pathspecs with "-d"
2014-03-18 13:47:57 -07:00
Fabian Ruch c45a18e88f add: use struct argv_array in run_add_interactive()
run_add_interactive() in builtin/add.c manually computes array bounds
and allocates a static args array to build the add--interactive command
line, which is error-prone. Use the argv-array helper functions instead.

Signed-off-by: Fabian Ruch <bafain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18 12:47:29 -07:00
Benoit Pierre 0a3beb0e2e merge: fix GIT_EDITOR override for commit hook
Don't set GIT_EDITOR to ":" when calling prepare-commit-msg hook if the
editor is going to be called (e.g. with "merge -e").

Signed-off-by: Benoit Pierre <benoit.pierre@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18 11:25:38 -07:00
Benoit Pierre 15048f8a9a commit: fix patch hunk editing with "commit -p -m"
Don't change git environment: move the GIT_EDITOR=":" override to the
hook command subprocess, like it's already done for GIT_INDEX_FILE.

Signed-off-by: Benoit Pierre <benoit.pierre@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18 11:25:12 -07:00
Junio C Hamano de983a0a18 index-pack: report error using the correct variable
We feed a string pointer that is potentially NULL to die() when
showing the message.  Don't.

Noticed-by: Nguyễn Thái Ngọc Duy  <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17 15:08:36 -07:00
Jeff King 373c67da1d pack-objects: turn off bitmaps when skipping objects
The pack bitmap format requires that we have a single bit
for each object in the pack, and that each object's bitmap
represents its complete set of reachable objects. Therefore
we have no way to represent the bitmap of an object which
references objects outside the pack.

We notice this problem while generating the bitmaps, as we
try to find the offset of a particular object and realize
that we do not have it. In this case we die, and neither the
bitmap nor the pack is generated. This is correct, but
perhaps a little unfriendly. If you have bitmaps turned on
in the config, many repacks will fail which would otherwise
succeed. E.g., incremental repacks, repacks with "-l" when
you have alternates, ".keep" files.

Instead, this patch notices early that we are omitting some
objects from the pack and turns off bitmaps (with a
warning). Note that this is not strictly correct, as it's
possible that the object being omitted is not reachable from
any other object in the pack. In practice, this is almost
never the case, and there are two advantages to doing it
this way:

  1. The code is much simpler, as we do not have to cleanly
     abort the bitmap-generation process midway through.

  2. We do not waste time partially generating bitmaps only
     to find out that some object deep in the history is not
     being packed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17 15:02:39 -07:00
Jeff King 78d2214eb4 pack-objects: show reused packfile objects in "Counting objects"
When we are sending a pack for push or fetch, we may reuse a
chunk of packfile without even parsing it. The progress
meter then looks like this:

  Reusing existing pack: 3440489, done.
  Counting objects: 3, done.

The first line shows that we are reusing a large chunk of
objects, and then we further count any objects not included
in the reused portion with an actual traversal.

These are all implementation details that the user does not
need to care about. Instead, we can show the reused objects
in the normal "counting..." progress meter (which will
simply go much faster than normal), and then continue to add
to it as we traverse.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17 15:01:27 -07:00
Jeff King 657673f125 pack-objects: show progress for reused packfiles
When the "--all-progress" option is in effect, pack-objects
shows a progress report for the "writing" phase. If the
repository has bitmaps and we are reusing a packfile, the
user sees no progress update until the whole packfile is
sent.  Since this is typically the bulk of what is being
written, it can look like git hangs during this phase, even
though the transfer is proceeding.

This generally only happens with "git push" from a
repository with bitmaps. We do not use "--all-progress" for
fetch (since the result is going to index-pack on the
client, which takes care of progress reporting). And for
regular repacks to disk, we do not reuse packfiles.

We already have the progress meter setup during
write_reused_pack; we just need to call display_progress
whiel we are writing out the pack. The progress meter is
attached to our output descriptor, so it automatically
handles the throughput measurements.

However, we need to update the object count as we go, since
that is what feeds the percentage we show. We aren't
actually parsing the packfile as we send it, so we have no
idea how many objects we have sent; we only know that at the
end of N bytes, we will have sent M objects. So we cheat a
little and assume each object is M/N bytes (i.e., the mean
of the objects we are sending). While this isn't strictly
true, it actually produces a more pleasing progress meter
for the user, as it moves smoothly and predictably (and
nobody really cares about the object count; they care about
the percentage, and the object count is a proxy for that).

One alternative would be to actually show two progress
meters: one for the reused pack, and one for the rest of the
objects. That would more closely reflect the data we have
(the first would be measured in bytes, and the second
measured in objects). But it would also be more complex and
annoying to the user; rather than seeing one progress meter
counting up to 100%, they would finish one meter, then start
another one at zero.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17 15:01:25 -07:00
brian m. carlson fb8a4e8079 mv: prevent mismatched data when ignoring errors.
We shrink the source and destination arrays, but not the modes or
submodule_gitfile arrays, resulting in potentially mismatched data.  Shrink
all the arrays at the same time to prevent this.  Add tests to ensure the
problem does not recur.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-17 11:38:41 -07:00
Junio C Hamano 56e2874a81 Merge branch 'sh/write-pack-file-warning-message-fix'
A warning from "git pack-objects" were generated by referring to an
incorrect variable when forming the filename that we had trouble
with.

* sh/write-pack-file-warning-message-fix:
  write_pack_file: use correct variable in diagnostic
2014-03-14 14:27:17 -07:00
Junio C Hamano 3e30cb0fbf Merge branch 'mh/replace-refs-variable-rename'
* mh/replace-refs-variable-rename:
  Document some functions defined in object.c
  Add docstrings for lookup_replace_object() and do_lookup_replace_object()
  rename read_replace_refs to check_replace_refs
2014-03-14 14:27:06 -07:00
Junio C Hamano 0963008cbf Merge branch 'nd/i18n-progress'
Mark the progress indicators from various time-consuming commands
for i18n/l10n.

* nd/i18n-progress:
  i18n: mark all progress lines for translation
2014-03-14 14:26:31 -07:00
Junio C Hamano b37f81b7b6 Merge branch 'jh/note-trees-record-blobs'
"git notes -C <blob>" should not take an object that is not a blob.

* jh/note-trees-record-blobs:
  notes: disallow reusing non-blob as a note object
2014-03-14 14:25:39 -07:00
Junio C Hamano 650c90a185 Merge branch 'nd/no-more-fnmatch'
We started using wildmatch() in place of fnmatch(3); complete the
process and stop using fnmatch(3).

* nd/no-more-fnmatch:
  actually remove compat fnmatch source code
  stop using fnmatch (either native or compat)
  Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch"
  use wildmatch() directly without fnmatch() wrapper
2014-03-14 14:25:31 -07:00
Junio C Hamano 6eb593a764 Merge branch 'nd/reset-setup-worktree'
"git reset" needs to refresh the index when working in a working
tree (it can also be used to match the index to the HEAD in an
otherwise bare repository), but it failed to set up the working
tree properly, causing GIT_WORK_TREE to be ignored.

* nd/reset-setup-worktree:
  reset: optionally setup worktree and refresh index on --mixed
2014-03-14 14:25:03 -07:00
Junio C Hamano 08f36302b5 Merge branch 'ks/config-file-stdin'
"git config" learned to read from the standard input when "-" is
given as the value to its "--file" parameter (attempting an
operation to update the configuration in the standard input of
course is rejected).

* ks/config-file-stdin:
  config: teach "git config --file -" to read from the standard input
  config: change git_config_with_options() interface
  builtin/config.c: rename check_blob_write() -> check_write()
  config: disallow relative include paths from blobs
2014-03-14 14:24:40 -07:00
Junio C Hamano 7aab05d2b4 Merge branch 'jk/janitorial-fixes'
* jk/janitorial-fixes:
  open_istream(): do not dereference NULL in the error case
  builtin/mv: don't use memory after free
  utf8: use correct type for values in interval table
  utf8: fix iconv error detection
  notes-utils: handle boolean notes.rewritemode correctly
2014-03-14 14:24:37 -07:00
Junio C Hamano 28b68216de Merge branch 'jc/check-attr-honor-working-tree'
"git check-attr" when (trying to) work on a repository with a
working tree did not work well when the working tree was specified
via --work-tree (and obviously with --git-dir).

The command also works in a bare repository but it reads from the
(possibly stale, irrelevant and/or nonexistent) index, which may
need to be fixed to read from HEAD, but that is a completely
separate issue.  As a related tangent to this separate issue, we
may want to also fix "check-ignore", which refuses to work in a
bare repository, to also operate in a bare one.

* jc/check-attr-honor-working-tree:
  check-attr: move to the top of working tree when in non-bare repository
  t0003: do not chdir the whole test process
2014-03-14 14:06:00 -07:00
Jeff King a42fcd15d8 cat-file: restore warn_on_object_refname_ambiguity flag
Commit 25fba78 turned off the object/refname ambiguity check
during "git cat-file --batch" operations. However, this is a
global flag, so let's restore it when we are done.

This shouldn't make any practical difference, as cat-file
exits immediately afterwards, but is good code hygeine and
would prevent an unnecessary surprise if somebody starts to
call cmd_cat_file later.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-13 11:56:17 -07:00
John Keeping 89ccc1b09c builtin/mv: fix out of bounds write
When commit a88c915 (mv: move submodules using a gitfile, 2013-07-30)
added the submodule_gitfile array, it was not added to the block that
enlarges the arrays when we are moving a directory so that we do not
have to worry about it being a directory when we perform the actual
move.  After this, the loop continues over the enlarged set of sources.

Since we assume that submodule_gitfile has size argc, if any of the
items in the source directory are submodules we are guaranteed to write
beyond the end of submodule_gitfile.

Fix this by realloc'ing submodule_gitfile at the same time as the other
arrays.

Reported-by: Guillaume Gelin <contact@ramnes.eu>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 14:44:21 -07:00
Nguyễn Thái Ngọc Duy b790e0f67c upload-pack: send shallow info over stdin to pack-objects
Before cdab485 (upload-pack: delegate rev walking in shallow fetch to
pack-objects - 2013-08-16) upload-pack does not write to the source
repository. cdab485 starts to write $GIT_DIR/shallow_XXXXXX if it's a
shallow fetch, so the source repo must be writable.

git:// servers do not need write access to repos and usually don't
have it, which means cdab485 breaks shallow clone over git://

Instead of using a temporary file as the media for shallow points, we
can send them over stdin to pack-objects as well. Prepend shallow
SHA-1 with --shallow so pack-objects knows what is what.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 13:32:10 -07:00
Jeff King 1f2e108887 clean: simplify dir/not-dir logic
When we get a list of paths from read_directory, we further
prune it to create the final list of items to remove. The
code paths for directories and non-directories repeat the
same "add to list" code.

This patch restructures the code so that we don't repeat
ourselves. Also, by following a "if (condition) continue"
pattern like the pathspec check above, it makes it more
obvious that the conditional is about excluding directories
under certain circumstances.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 12:14:25 -07:00
Jeff King cf424f5fd8 clean: respect pathspecs with "-d"
git-clean uses read_directory to fill in a `struct dir` with
potential hits. However, read_directory does not actually
check against our pathspec. It uses a simplified version
that may turn up false positives. As a result, we need to
check that any hits match our pathspec. We do so reliably
for non-directories. For directories, if "-d" is not given
we check that the pathspec matched exactly (i.e., we are
even stricter, and require an explicit "git clean foo" to
clean "foo/"). But if "-d" is given, rather than relaxing
the exact match to allow a recursive match, we do not check
the pathspec at all.

This regression was introduced in 113f10f (Make git-clean a
builtin, 2007-11-11).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 12:13:42 -07:00
Junio C Hamano e3f1185946 Merge branch 'cc/starts-n-ends-with-endgame'
prefixcmp/suffixcmp are gone.
2014-03-07 15:18:28 -08:00
Junio C Hamano 289ca27dee Merge branch 'gj/push-more-verbose-advice' 2014-03-07 15:17:20 -08:00
Junio C Hamano 160c4b183c Merge branch 'jc/add-2.0-ignore-removal'
"git add <pathspec>" is the same as "git add -A <pathspec>" now,
i.e. it does not ignore removals from the directory specified.
2014-03-07 15:14:47 -08:00
Junio C Hamano 053a6b1807 Merge branch 'jn/add-2.0-u-A-sans-pathspec'
"git add -u" and "git add -A" without any pathspec is a tree-wide
operation now, even when they are run in a subdirectory of the
working tree.
2014-03-07 15:14:02 -08:00
Junio C Hamano 009055f3ec Merge branch 'jc/push-2.0-default-to-simple'
Finally update the "git push" default behaviour to "simple".
2014-03-07 15:13:15 -08:00
Junio C Hamano b0bc1365c2 tag: grok "--with" as synonym to "--contains"
Just like "git branch" can be told to list the branches that has the
named commit by "git branch --with <commit>", teach the same
short-hand to "git tag", so that "git tag --with <commit>" shows the
releases with the named commit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-07 12:52:02 -08:00
Junio C Hamano 4c4ac4db2c Merge branch 'nd/daemonize-gc'
Allow running "gc --auto" in the background.

* nd/daemonize-gc:
  gc: config option for running --auto in background
  daemon: move daemonize() to libgit.a
2014-03-05 15:06:39 -08:00
Dmitry S. Dolzhenko 66d9f38bc7 builtin/mktree.c: use ALLOC_GROW() in append_to_tree()
Helped-by: He Sun <sunheehnus@gmail.com>
Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 14:54:45 -08:00
Dmitry S. Dolzhenko 25e1940709 builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 14:44:11 -08:00
Jeff King ee34a2bead repack: add repack.packKeptObjects config var
The git-repack command always passes `--honor-pack-keep`
to pack-objects. This has traditionally been a good thing,
as we do not want to duplicate those objects in a new pack,
and we are not going to delete the old pack.

However, when bitmaps are in use, it is important for a full
repack to include all reachable objects, even if they may be
duplicated in a .keep pack. Otherwise, we cannot generate
the bitmaps, as the on-disk format requires the set of
objects in the pack to be fully closed.

Even if the repository does not generally have .keep files,
a simultaneous push could cause a race condition in which a
.keep file exists at the moment of a repack. The repack may
try to include those objects in one of two situations:

  1. The pushed .keep pack contains objects that were
     already in the repository (e.g., blobs due to a revert of
     an old commit).

  2. Receive-pack updates the refs, making the objects
     reachable, but before it removes the .keep file, the
     repack runs.

In either case, we may prefer to duplicate some objects in
the new, full pack, and let the next repack (after the .keep
file is cleaned up) take care of removing them.

This patch introduces both a command-line and config option
to disable the `--honor-pack-keep` option.  By default, it
is triggered when pack.writeBitmaps (or `--write-bitmap-index`
is turned on), but specifying it explicitly can override the
behavior (e.g., in cases where you prefer .keep files to
bitmaps, but only when they are present).

Note that this option just disables the pack-objects
behavior. We still leave packs with a .keep in place, as we
do not necessarily know that we have duplicated all of their
objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 12:21:49 -08:00
Sun He 5889271114 finish_tmp_packfile():use strbuf for pathname construction
The old version fixes a maximum length on the buffer, which could be a problem
if one is not certain of the length of get_object_directory().
Using strbuf can avoid the protential bug.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Sun He <sunheehnus@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 12:15:10 -08:00
Junio C Hamano 2156a98045 Merge branch 'sh/write-pack-file-warning-message-fix' into sh/finish-tmp-packfile
* sh/write-pack-file-warning-message-fix:
  write_pack_file: use correct variable in diagnostic
2014-03-03 12:13:20 -08:00
Sun He 0eea5a6e91 write_pack_file: use correct variable in diagnostic
'pack_tmp_name' is the subject of the utime() check, so report it in the
warning, not the uninitialized 'tmpname'

Signed-off-by: Sun He <sunheehnus@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 10:43:40 -08:00
Nguyễn Thái Ngọc Duy 9ef176b55c tag: support --sort=<spec>
--sort=version:refname (or --sort=v:refname for short) sorts tags as
if they are versions. --sort=-refname reverses the order (with or
without ":version").

versioncmp() is copied from string/strverscmp.c in glibc commit
ee9247c38a8def24a59eb5cfb7196a98bef8cfdc, reformatted to Git coding
style. The implementation is under LGPL-2.1 and according to [1] I can
relicense it to GPLv2.

[1] http://www.gnu.org/licenses/gpl-faq.html#AllCompatibility

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27 14:04:05 -08:00
Junio C Hamano 0f9e62e084 Merge branch 'jk/pack-bitmap'
Borrow the bitmap index into packfiles from JGit to speed up
enumeration of objects involved in a commit range without having to
fully traverse the history.

* jk/pack-bitmap: (26 commits)
  ewah: unconditionally ntohll ewah data
  ewah: support platforms that require aligned reads
  read-cache: use get_be32 instead of hand-rolled ntoh_l
  block-sha1: factor out get_be and put_be wrappers
  do not discard revindex when re-preparing packfiles
  pack-bitmap: implement optional name_hash cache
  t/perf: add tests for pack bitmaps
  t: add basic bitmap functionality tests
  count-objects: recognize .bitmap in garbage-checking
  repack: consider bitmaps when performing repacks
  repack: handle optional files created by pack-objects
  repack: turn exts array into array-of-struct
  repack: stop using magic number for ARRAY_SIZE(exts)
  pack-objects: implement bitmap writing
  rev-list: add bitmap mode to speed up object lists
  pack-objects: use bitmaps when packing objects
  pack-objects: split add_object_entry
  pack-bitmap: add support for bitmap indexes
  documentation: add documentation for the bitmap format
  ewah: compressed bitmap implementation
  ...
2014-02-27 14:01:48 -08:00
Junio C Hamano 6784fab0ac Merge branch 'dk/blame-janitorial'
Code clean-up.

* dk/blame-janitorial:
  builtin/blame.c::find_copy_in_blob: no need to scan for region end
  blame.c: prepare_lines should not call xrealloc for every line
  builtin/blame.c::prepare_lines: fix allocation size of sb->lineno
  builtin/blame.c: eliminate same_suspect()
  builtin/blame.c: struct blame_entry does not need a prev link
2014-02-27 14:01:46 -08:00
Junio C Hamano 62bef66fe7 Merge branch 'bc/gpg-sign-everywhere'
Teach "--gpg-sign" option to many commands that create commits.

* bc/gpg-sign-everywhere:
  pull: add the --gpg-sign option.
  rebase: add the --gpg-sign option
  rebase: parse options in stuck-long mode
  rebase: don't try to match -M option
  rebase: remove useless arguments check
  am: add the --gpg-sign option
  am: parse options in stuck-long mode
  git-sh-setup.sh: add variable to use the stuck-long mode
  cherry-pick, revert: add the --gpg-sign option
2014-02-27 14:01:44 -08:00
Junio C Hamano 8336832ad9 Merge branch 'nd/reset-intent-to-add'
* nd/reset-intent-to-add:
  reset: support "--mixed --intent-to-add" mode
2014-02-27 14:01:40 -08:00
Junio C Hamano 043478308f Merge branch 'ep/varscope'
Shrink lifetime of variables by moving their definitions to an
inner scope where appropriate.

* ep/varscope:
  builtin/gc.c: reduce scope of variables
  builtin/fetch.c: reduce scope of variable
  builtin/commit.c: reduce scope of variables
  builtin/clean.c: reduce scope of variable
  builtin/blame.c: reduce scope of variables
  builtin/apply.c: reduce scope of variables
  bisect.c: reduce scope of variable
2014-02-27 14:01:30 -08:00
Junio C Hamano 28006fb046 Merge branch 'ds/rev-parse-required-args'
"git rev-parse --default" without the required option argument did
not diagnose it as an error.

* ds/rev-parse-required-args:
  rev-parse: check i before using argv[i] against argc
2014-02-27 14:01:23 -08:00
Junio C Hamano cbaeafc325 Merge branch 'nd/submodule-pathspec-ending-with-slash'
Allow "git cmd path/", when the 'path' is where a submodule is
bound to the top-level working tree, to match 'path', despite the
extra and unnecessary trailing slash.

* nd/submodule-pathspec-ending-with-slash:
  clean: use cache_name_is_other()
  clean: replace match_pathspec() with dir_path_match()
  pathspec: pass directory indicator to match_pathspec_item()
  match_pathspec: match pathspec "foo/" against directory "foo"
  dir.c: prepare match_pathspec_item for taking more flags
  pathspec: rename match_pathspec_depth() to match_pathspec()
  pathspec: convert some match_pathspec_depth() to dir_path_match()
  pathspec: convert some match_pathspec_depth() to ce_path_match()
2014-02-27 14:01:15 -08:00
Junio C Hamano d637d1b9a8 Merge branch 'kb/fast-hashmap'
Improvements to our hash table to get it to meet the needs of the
msysgit fscache project, with some nice performance improvements.

* kb/fast-hashmap:
  name-hash: retire unused index_name_exists()
  hashmap.h: use 'unsigned int' for hash-codes everywhere
  test-hashmap.c: drop unnecessary #includes
  .gitignore: test-hashmap is a generated file
  read-cache.c: fix memory leaks caused by removed cache entries
  builtin/update-index.c: cleanup update_one
  fix 'git update-index --verbose --again' output
  remove old hash.[ch] implementation
  name-hash.c: remove cache entries instead of marking them CE_UNHASHED
  name-hash.c: use new hash map implementation for cache entries
  name-hash.c: remove unreferenced directory entries
  name-hash.c: use new hash map implementation for directories
  diffcore-rename.c: use new hash map implementation
  diffcore-rename.c: simplify finding exact renames
  diffcore-rename.c: move code around to prepare for the next patch
  buitin/describe.c: use new hash map implementation
  add a hashtable implementation that supports O(1) removal
  submodule: don't access the .gitmodules cache entry after removing it
2014-02-27 14:01:09 -08:00
Junio C Hamano 810273bc33 Merge branch 'nv/commit-gpgsign-config'
Introduce commit.gpgsign configuration variable to force every
commit to be GPG signed.  The variable cannot be overriden from the
command line of some of the commands that create commits except for
"git commit" and "git commit-tree", but I am not convinced that it
is a good idea to sprinkle support for --no-gpg-sign everywhere,
which in turn means that this configuration variable may not be
such a good idea.

* nv/commit-gpgsign-config:
  test the commit.gpgsign config option
  commit-tree: add and document --no-gpg-sign
  commit-tree: add the commit.gpgsign option to sign all commits
2014-02-27 14:01:03 -08:00
Jeff King 0179c945fc shallow: automatically clean up shallow tempfiles
We sometimes write tempfiles of the form "shallow_XXXXXX"
during fetch/push operations with shallow repositories.
Under normal circumstances, we clean up the result when we
are done. However, we do no take steps to clean up after
ourselves when we exit due to die() or signal death.

This patch teaches the tempfile creation code to register
handlers to clean up after ourselves. To handle this, we
change the ownership semantics of the filename returned by
setup_temporary_shallow. It now keeps a copy of the filename
itself, and returns only a const pointer to it.

We can also do away with explicit tempfile removal in the
callers. They all exit not long after finishing with the
file, so they can rely on the auto-cleanup, simplifying the
code.

Note that we keep things simple and maintain only a single
filename to be cleaned. This is sufficient for the current
caller, but we future-proof it with a die("BUG").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27 12:07:13 -08:00
David Kastrup 3ee8944fa5 builtin/blame.c::find_copy_in_blob: no need to scan for region end
The region end can be looked up just like its beginning.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-25 09:51:24 -08:00
Nguyễn Thái Ngọc Duy 75df1f434f commit: add --cleanup=scissors
Since 1a72cfd (commit -v: strip diffs and submodule shortlogs from the
commit message - 2013-12-05) we have a less fragile way to cut out
"git status" at the end of a commit message but it's only enabled for
stripping submodule shortlogs.

Add new cleanup option that reuses the same mechanism for the entire
"git status" without accidentally removing lines starting with '#'.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-25 09:35:20 -08:00
Junio C Hamano 55ca3f99ae commit-tree: add and document --no-gpg-sign
Document how to override commit.gpgsign configuration that is set to
true per "git commit" invocation (parse-options machinery lets us
say "--no-gpg-sign" to do so).

"git commit-tree" does not use parse-options, so manually add the
corresponding option for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:51:35 -08:00
Nicolas Vigier d95bfb12b8 commit-tree: add the commit.gpgsign option to sign all commits
If you want to GPG sign all your commits, you have to add the -S option
all the time. The commit.gpgsign config option allows to sign all
commits automatically.

Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:50:56 -08:00
Nguyễn Thái Ngọc Duy 2e70c01799 clean: use cache_name_is_other()
cmd_clean() has the exact same code of index_name_is_other(). Reduce
code duplication.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:24 -08:00
Nguyễn Thái Ngọc Duy 05b85022c9 clean: replace match_pathspec() with dir_path_match()
This instance was left out when many match_pathspec() call sites that
take input from dir_entry were converted to dir_path_match() because
it passed a path with the trailing slash stripped out to match_pathspec()
while the others did not. Stripping for all call sites back then would
be a regression because match_pathspec() did not know how to match
pathspec foo/ against _directory_ foo (the stripped version of path
"foo/").

match_pathspec() knows how to do it now. And dir_path_match() strips
the trailing slash also. Use the new function, because the stripping
code is removed in the next patch.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:24 -08:00
Nguyễn Thái Ngọc Duy ae8d082421 pathspec: pass directory indicator to match_pathspec_item()
This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which
makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce
the same output. Previously only the version without trailing slash
returns the difference (if any).

That's the effect of new ce_path_match(). dir_path_match() is not
executed by the new tests. And it should not introduce regressions.

Previously if path "dir/" is passed in with pathspec "dir/", they
obviously match. With new dir_path_match(), the path becomes
_directory_ "dir" vs pathspec "dir/", which is not executed by the old
code path in m_p_i(). The new code path is executed and produces the
same result.

The other case is pathspec "dir" and path "dir/" is now turned to
"dir" (with DO_MATCH_DIRECTORY). Still the same result before or after
the patch.

So why change? Because of the next patch about clean.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy 854b09592c pathspec: rename match_pathspec_depth() to match_pathspec()
A long time ago, for some reason I was not happy with
match_pathspec(). I created a better version, match_pathspec_depth()
that was suppose to replace match_pathspec()
eventually. match_pathspec() has finally been gone since 6 months
ago. Use the shorter name for match_pathspec_depth().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:14 -08:00
Nguyễn Thái Ngọc Duy ebb32893ba pathspec: convert some match_pathspec_depth() to dir_path_match()
This helps reduce the number of match_pathspec_depth() call sites and
show how m_p_d() is used. And it usage is:

 - match against an index entry (ce_path_match or match_pathspec_depth
   in ls-files)

 - match against a dir_entry from read_directory (dir_path_match and
   match_pathspec_depth in clean.c, which will be converted later)

 - resolve-undo (rerere.c and ls-files.c)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:37:09 -08:00
Nguyễn Thái Ngọc Duy 429bb40abd pathspec: convert some match_pathspec_depth() to ce_path_match()
This helps reduce the number of match_pathspec_depth() call sites and
show how match_pathspec_depth() is used.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:36:52 -08:00
David Kastrup 352bbbd9f2 blame.c: prepare_lines should not call xrealloc for every line
Making a single preparation run for counting the lines will avoid memory
fragmentation.  Also, fix the allocated memory size which was wrong
when sizeof(int *) != sizeof(int), and would have been too small
for sizeof(int *) < sizeof(int), admittedly unlikely.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:32:41 -08:00
David Kastrup 62cf3ca95a builtin/blame.c::prepare_lines: fix allocation size of sb->lineno
If we are calling xrealloc on every single line, the least we can do
is get the right allocation size.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:32:41 -08:00
David Kastrup 0a88f08e28 builtin/blame.c: eliminate same_suspect()
Since the origin pointers are "interned" and reference-counted, comparing
the pointers rather than the content is enough.  The only uninterned
origins are cached values kept in commit->util, but same_suspect is not
called on them.

Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:32:21 -08:00
Nguyễn Thái Ngọc Duy 754dbc43f0 i18n: mark all progress lines for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 09:08:37 -08:00
Michael Haggerty afc711b8e1 rename read_replace_refs to check_replace_refs
The semantics of this flag was changed in commit

    e1111cef23 inline lookup_replace_object() calls

but wasn't renamed at the time to minimize code churn.  Rename it now,
and add a comment explaining its use.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-20 14:16:55 -08:00
Nguyễn Thái Ngọc Duy eb07894fe0 use wildmatch() directly without fnmatch() wrapper
Make it clear that we don't use fnmatch() anymore.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-20 14:15:46 -08:00
Johan Herland ce8daa1eb8 notes: disallow reusing non-blob as a note object
Currently "git notes add -C $object" will read the raw bytes from $object,
and then copy those bytes into the note object, which is hardcoded to be
of type blob. This means that if the given $object is a non-blob (e.g.
tree or commit), the raw bytes from that object is copied into a blob
object. This is probably not useful, and certainly not what any sane
user would expect. So disallow it, by erroring out if the $object passed
to the -C option is not a blob.

The fix also applies to the -c option (in which the user is prompted to
edit/verify the note contents in a text editor), and also when -c/-C is
passed to "git notes append" (which appends the $object contents to an
existing note object). In both cases, passing a non-blob $object does not
make sense.

Also add a couple of tests demonstrating expected behavior.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-20 14:14:33 -08:00
Kirill A. Shutemov 3caec73b55 config: teach "git config --file -" to read from the standard input
The patch extends git config --file interface to allow read config from
stdin.

Editing stdin or setting value in stdin is an error.

Include by absolute path is allowed in stdin config, but not by relative
path.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 16:12:14 -08:00
Kirill A. Shutemov c8985ce053 config: change git_config_with_options() interface
We're going to have more options for config source.

Let's alter git_config_with_options() interface to accept struct with
all source options.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 16:12:13 -08:00
Kirill A. Shutemov 6aea9f0fdd builtin/config.c: rename check_blob_write() -> check_write()
The function will be reused to check for other conditions which prevent
write.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 16:12:11 -08:00
John Keeping d954828d45 builtin/mv: don't use memory after free
If 'src' already ends with a slash, then add_slash() will just return
it, meaning that 'free(src_with_slash)' is actually 'free(src)'.  Since
we use 'src' later, this will result in use-after-free.

In fact, this cannot happen because 'src' comes from
internal_copy_pathspec() without the KEEP_TRAILING_SLASH flag, so any
trailing '/' will have been stripped; but static analysis tools are not
clever enough to realise this and so warn that 'src' could be used after
having been free'd.  Fix this by checking that 'src_w_slash' is indeed
newly allocated memory.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 15:51:56 -08:00
Nguyễn Thái Ngọc Duy b7756d41dc reset: optionally setup worktree and refresh index on --mixed
Refreshing index requires work tree.  So we have two options: always
set up work tree (and refuse to reset if failing to do so), or make
refreshing index optional.

As refreshing index is not the main task, it makes more sense to make
it optional. This allows us to still work in a bare repository to update
what is in the index.

Reported-by: Patrick Palka <patrick@parcs.ath.cx>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 14:40:23 -08:00
Junio C Hamano 2cd861672e Merge branch 'bm/merge-base-octopus-dedup' into maint
"git merge-base --octopus" used to leave cleaning up suboptimal
result to the caller, but now it does the clean-up itself.

* bm/merge-base-octopus-dedup:
  merge-base --octopus: reduce the result from get_octopus_merge_bases()
  merge-base: separate "--independent" codepath into its own helper
2014-02-13 13:38:59 -08:00
Junio C Hamano 90791e3416 Merge branch 'sb/repack-in-c' into maint
"git repack --max-pack-size=8g" stopped being parsed correctly when
the command was reimplemented in C.

* sb/repack-in-c:
  repack: propagate pack-objects options as strings
  repack: make parsed string options const-correct
  repack: fix typo in max-pack-size option
2014-02-13 13:38:09 -08:00
Nguyễn Thái Ngọc Duy 9f673f9477 gc: config option for running --auto in background
`gc --auto` takes time and can block the user temporarily (but not any
less annoyingly). Make it run in background on systems that support
it. The only thing lost with running in background is printouts. But
gc output is not really interesting. You can keep it in foreground by
changing gc.autodetach.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 10:46:37 -08:00
Junio C Hamano cdbf623254 check-attr: move to the top of working tree when in non-bare repository
Lasse Makholm noticed that running "git check-attr" from a place
totally unrelated to $GIT_DIR and $GIT_WORK_TREE does not give
expected results.  I think it is because the command does not say it
wants to call setup_work_tree().

We still need to support use cases where only a bare repository is
involved, so unconditionally requiring a working tree would not work
well.  Instead, make a call only in a non-bare repository.

We may want to see if we want to do a similar fix in the opposite
direction to check-ignore.  The command unconditionally requires a
working tree, but it should be usable in a bare repository just like
check-attr attempts to be.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-06 10:19:33 -08:00
Nguyễn Thái Ngọc Duy b4b313f94a reset: support "--mixed --intent-to-add" mode
When --mixed is used, entries could be removed from index if the
target ref does not have them. When "reset" is used in preparation for
commit spliting (in a dirty worktree), it could be hard to track what
files to be added back. The new option --intent-to-add simplifies it
by marking all removed files intent-to-add.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
2014-02-05 16:44:51 -08:00
Junio C Hamano 3c864743a6 Merge branch 'js/lift-parent-count-limit' into maint
There is no reason to have a hardcoded upper limit of the number of
parents for an octopus merge, created via the graft mechanism, but
there was.

* js/lift-parent-count-limit:
  Remove the line length limit for graft files
2014-02-05 14:03:01 -08:00
Junio C Hamano ee5788e306 Merge branch 'nd/add-empty-fix' into maint
"git add -A" (no other arguments) in a totally empty working tree
used to emit an error.

* nd/add-empty-fix:
  add: don't complain when adding empty project root
2014-02-05 14:02:44 -08:00
Junio C Hamano a118beeddf Merge branch 'jl/commit-v-strip-marker' into maint
"git commit -v" appends the patch to the log message before
editing, and then removes the patch when the editor returned
control. However, the patch was not stripped correctly when the
first modified path was a submodule.

* jl/commit-v-strip-marker:
  commit -v: strip diffs and submodule shortlogs from the commit message
2014-02-05 14:01:09 -08:00
Junio C Hamano 1a111957b3 Merge branch 'tb/clone-ssh-with-colon-for-port' into maint
Remote repository URL expressed in scp-style host:path notation are
parsed more carefully (e.g. "foo/bar:baz" is local, "[::1]:/~user" asks
to connect to user's home directory on host at address ::1.

* tb/clone-ssh-with-colon-for-port:
  git_connect(): use common return point
  connect.c: refactor url parsing
  git_connect(): refactor the port handling for ssh
  git fetch: support host:/~repo
  t5500: add test cases for diag-url
  git fetch-pack: add --diag-url
  git_connect: factor out discovery of the protocol and its parts
  git_connect: remove artificial limit of a remote command
  t5601: add tests for ssh
  t5601: remove clear_ssh, refactor setup_ssh_wrapper
2014-02-05 13:59:16 -08:00
Junio C Hamano bf03d6e92d Merge branch 'nd/transport-positive-depth-only' into maint
"git fetch --depth=0" was a no-op, and was silently ignored.
Diagnose it as an error.

* nd/transport-positive-depth-only:
  clone,fetch: catch non positive --depth option value
2014-02-05 13:58:52 -08:00
Junio C Hamano 2171c0c36f Merge branch 'tb/repack-fix-renames' (early part)
Finishing touches to the "rewrite repack in C" series.

* 'tb/repack-fix-renames' (early part):
  repack.c: rename and unlink pack file if it exists
2014-02-05 12:02:29 -08:00
Torsten Bögershausen 9d7fbfd204 repack.c: rename and unlink pack file if it exists
When a repo was fully repacked, and is repacked again, we may run
into the situation that "new" packfiles have the same name as
already existing ones (traditionally packfiles have been named after
the list of names of objects in them, so repacking all the objects
in a single pack would have produced a packfile with the same name).

The logic is to rename the existing ones into filename like
"old-XXX", create the new ones and then remove the "old-" ones.
When something went wrong in the middle, this sequence is rolled
back by renaming the "old-" files back.

The renaming into "old-" did not work as intended, because
file_exists() was done on "XXX", not "pack-XXX".  Also when rolling
back the change, the code tried to rename "old-pack-XXX" but the
saved ones are named "old-XXX", so this couldn't have worked.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-05 11:58:49 -08:00
Elia Pinto 4f1c0b21e9 builtin/gc.c: reduce scope of variables
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:05 -08:00
Elia Pinto bf7e645c90 builtin/fetch.c: reduce scope of variable
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:05 -08:00
Elia Pinto e23fd15ada builtin/commit.c: reduce scope of variables
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:05 -08:00
Elia Pinto e666b89d76 builtin/clean.c: reduce scope of variable
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:05 -08:00
Elia Pinto ac39b27786 builtin/blame.c: reduce scope of variables
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:05 -08:00
Elia Pinto e36f3a8a6f builtin/apply.c: reduce scope of variables
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-31 10:44:04 -08:00
David Sharp a43219f2aa rev-parse: check i before using argv[i] against argc
The --prefix, --default, and --resolve-git-dir options to
git-rev-parse require an argument, but when given no argument,
the code uses the NULL read from argv[argc] without checking,
leading to a segfault.

Instead, check first and die() with an error message.

Signed-off-by: David Sharp <dhsharp@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-28 14:10:06 -08:00
Nicolas Vigier 3253553e12 cherry-pick, revert: add the --gpg-sign option
Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-27 15:15:52 -08:00
Junio C Hamano 4110639865 Merge branch 'sb/repack-in-c'
"git repack --max-pack-size=8g" stopped being parsed correctly when
the command was reimplemented in C.

* sb/repack-in-c:
  repack: propagate pack-objects options as strings
  repack: make parsed string options const-correct
  repack: fix typo in max-pack-size option
2014-01-27 10:45:49 -08:00
Junio C Hamano d0956cfa8e Merge branch 'mh/safe-create-leading-directories'
Code clean-up and protection against concurrent write access to the
ref namespace.

* mh/safe-create-leading-directories:
  rename_tmp_log(): on SCLD_VANISHED, retry
  rename_tmp_log(): limit the number of remote_empty_directories() attempts
  rename_tmp_log(): handle a possible mkdir/rmdir race
  rename_ref(): extract function rename_tmp_log()
  remove_dir_recurse(): handle disappearing files and directories
  remove_dir_recurse(): tighten condition for removing unreadable dir
  lock_ref_sha1_basic(): if locking fails with ENOENT, retry
  lock_ref_sha1_basic(): on SCLD_VANISHED, retry
  safe_create_leading_directories(): add new error value SCLD_VANISHED
  cmd_init_db(): when creating directories, handle errors conservatively
  safe_create_leading_directories(): introduce enum for return values
  safe_create_leading_directories(): always restore slash at end of loop
  safe_create_leading_directories(): split on first of multiple slashes
  safe_create_leading_directories(): rename local variable
  safe_create_leading_directories(): add explicit "slash" pointer
  safe_create_leading_directories(): reduce scope of local variable
  safe_create_leading_directories(): fix format of "if" chaining
2014-01-27 10:45:33 -08:00
Junio C Hamano 7b4e2b7e6a Merge branch 'ef/mingw-write'
* ef/mingw-write:
  mingw: remove mingw_write
  prefer xwrite instead of write
2014-01-27 10:44:59 -08:00
Jeff King b861e235bc repack: propagate pack-objects options as strings
In the original shell version of git-repack, any options
destined for pack-objects were left as strings, and passed
as a whole. Since the C rewrite in commit a1bbc6c (repack:
rewrite the shell script in C, 2013-09-15), we now parse
these values to integers internally, then reformat the
integers when passing the option to pack-objects.

This has the advantage that we catch format errors earlier
(i.e., when repack is invoked, rather than when pack-objects
is invoked).

It has three disadvantages, though:

  1. Our internal data types may not be the right size. In
     the case of "--window-memory" and "--max-pack-size",
     these are "unsigned long" in pack-objects, but we can
     only represent a regular "int".

  2. Our parsing routines might not be the same as those of
     pack-objects. For the two options above, pack-objects
     understands "100m" to mean "100 megabytes", but repack
     does not.

  3. We have to keep a sentinel value to know whether it is
     worth passing the option along. In the case of
     "--window-memory", we currently do not pass it if the
     value is "0". But that is a meaningful value to
     pack-objects, where it overrides any configured value.

We can fix all of these by simply passing the strings from
the user along to pack-objects verbatim. This does not
actually fix anything for "--depth" or "--window", but these
are converted, too, for consistency.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-23 10:34:53 -08:00
Jeff King aa8bd519db repack: make parsed string options const-correct
When we use OPT_STRING to parse an option, we get back a
pointer into the argv array, which should be "const char *".
The compiler doesn't notice because it gets passed through a
"void *" in the option struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-23 10:34:51 -08:00
Jeff King 44b96ecaa8 repack: fix typo in max-pack-size option
When we see "--max-pack-size", we accidentally propagated
this to pack-objects as "--max_pack_size", which does not
work at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-23 10:34:49 -08:00
David Kastrup a0f58c5830 builtin/blame.c: struct blame_entry does not need a prev link
Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-22 11:28:01 -08:00
Junio C Hamano 92251b1b5b Merge branch 'nd/shallow-clone'
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).

* nd/shallow-clone: (31 commits)
  t5537: fix incorrect expectation in test case 10
  shallow: remove unused code
  send-pack.c: mark a file-local function static
  git-clone.txt: remove shallow clone limitations
  prune: clean .git/shallow after pruning objects
  clone: use git protocol for cloning shallow repo locally
  send-pack: support pushing from a shallow clone via http
  receive-pack: support pushing to a shallow clone via http
  smart-http: support shallow fetch/clone
  remote-curl: pass ref SHA-1 to fetch-pack as well
  send-pack: support pushing to a shallow clone
  receive-pack: allow pushes that update .git/shallow
  connected.c: add new variant that runs with --shallow-file
  add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
  receive/send-pack: support pushing from a shallow clone
  receive-pack: reorder some code in unpack()
  fetch: add --update-shallow to accept refs that update .git/shallow
  upload-pack: make sure deepening preserves shallow roots
  fetch: support fetching from a shallow repository
  clone: support remote shallow repository
  ...
2014-01-17 12:21:20 -08:00
Erik Faye-Lund 7edc02f4de prefer xwrite instead of write
Our xwrite wrapper already deals with a few potential hazards, and
are as such more robust. Prefer it instead of write to get the
robustness benefits everywhere.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-and-improved-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17 12:09:26 -08:00
Junio C Hamano 272220ff67 Merge branch 'mm/mv-file-to-no-such-dir-with-slash'
Finishing touches to do the same on windows.

* mm/mv-file-to-no-such-dir-with-slash:
  mv: let 'git mv file no-such-dir/' error out on Windows, too
2014-01-13 11:33:51 -08:00
Junio C Hamano 3b72885bd8 Merge branch 'km/gc-eperm' into maint
A "gc" process running as a different user should be able to stop a
new "gc" process from starting.

* km/gc-eperm:
  gc: notice gc processes run by other users
2014-01-13 11:23:04 -08:00
Junio C Hamano ada6ebb6e9 Merge branch 'mm/mv-file-to-no-such-dir-with-slash' into maint
"git mv A B/", when B does not exist as a directory, should error
out, but it didn't.

* mm/mv-file-to-no-such-dir-with-slash:
  mv: let 'git mv file no-such-dir/' error out on Windows, too
  mv: let 'git mv file no-such-dir/' error out
2014-01-13 11:22:48 -08:00
Junio C Hamano be941a2c34 Merge branch 'jk/rev-parse-double-dashes' into maint
"git rev-parse <revs> -- <paths>" did not implement the usual
disambiguation rules the commands in the "git log" family used in
the same way.

* jk/rev-parse-double-dashes:
  rev-parse: be more careful with munging arguments
  rev-parse: correctly diagnose revision errors before "--"
2014-01-13 11:22:38 -08:00
Junio C Hamano 6845e8a62d Merge branch 'jk/cat-file-regression-fix' into maint
"git cat-file --batch=", an admittedly useless command, did not
behave very well.

* jk/cat-file-regression-fix:
  cat-file: handle --batch format with missing type/size
  cat-file: pass expand_data to print_object_or_die
2014-01-13 11:22:21 -08:00
Johannes Sixt a893346930 mv: let 'git mv file no-such-dir/' error out on Windows, too
The previous commit c57f628 (mv: let 'git mv file no-such-dir/' error out)
relies on that rename("file", "no-such-dir/") fails if the directory does not
exist (note the trailing slash).  This does not work as expected on Windows:
This rename() call does not fail, but renames "file" to "no-such-dir" (not to
"no-such-dir/file"). Insert an explicit check for this case to force an error.

This changes the error message from

   $ git mv file no-such-dir/
   fatal: renaming 'file' failed: Not a directory

to

   $ git mv file no-such-dir/
   fatal: destination directory does not exist, source=file, destination=no-such-dir/

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-10 11:28:12 -08:00
Junio C Hamano 74ca49330a Merge branch 'ss/builtin-cleanup'
"git help $cmd" unnecessarily enumerated potential command names
from the filesystem, even when $cmd is known to be a built-in.

Ideas for further optimization, primarily by killing the use of
is_in_cmdlist(), were suggested in the discussion, but they can
come as follow-ups on top of this series.

* ss/builtin-cleanup:
  builtin/help.c: speed up is_git_command() by checking for builtin commands first
  builtin/help.c: call load_command_list() only when it is needed
  git.c: consistently use the term "builtin" instead of "internal command"
2014-01-10 10:33:48 -08:00
Junio C Hamano 3b9d69ec22 Merge branch 'js/lift-parent-count-limit'
There is no reason to have a hardcoded upper limit of the number of
parents for an octopus merge, created via the graft mechanism.

* js/lift-parent-count-limit:
  Remove the line length limit for graft files
2014-01-10 10:33:36 -08:00
Junio C Hamano d5d1678b9c Merge branch 'bm/merge-base-octopus-dedup'
"git merge-base --octopus" used to leave cleaning up suboptimal
result to the caller, but now it does the clean-up itself.

* bm/merge-base-octopus-dedup:
  merge-base --octopus: reduce the result from get_octopus_merge_bases()
  merge-base: separate "--independent" codepath into its own helper
2014-01-10 10:33:33 -08:00
Junio C Hamano 55869681f1 Merge branch 'km/gc-eperm'
A "gc" process running as a different user should be able to stop a
new "gc" process from starting.

* km/gc-eperm:
  gc: notice gc processes run by other users
2014-01-10 10:33:30 -08:00
Junio C Hamano b2132068c6 Merge branch 'jk/oi-delta-base'
Teach "cat-file --batch" to show delta-base object name for a
packed object that is represented as a delta.

* jk/oi-delta-base:
  cat-file: provide %(deltabase) batch format
  sha1_object_info_extended: provide delta base sha1s
2014-01-10 10:33:11 -08:00
Junio C Hamano f06a5e607d Merge branch 'jk/sha1write-void'
Code clean-up.

* jk/sha1write-void:
  do not pretend sha1write returns errors
2014-01-10 10:33:09 -08:00
Junio C Hamano 4ba46c2847 Merge branch 'nd/add-empty-fix'
"git add -A" (no other arguments) in a totally empty working tree
used to emit an error.

* nd/add-empty-fix:
  add: don't complain when adding empty project root
2014-01-10 10:33:03 -08:00
Junio C Hamano 666b4c2670 Merge branch 'tm/fetch-prune'
Fetching 'frotz' branch with "git fetch", while having
'frotz/nitfol' remote-tracking branch from an earlier fetch, would
error out, primarily because the command has not been told to
remove anything on our side. In such a case, "git fetch --prune"
can be used to remove 'frotz/nitfol' to make room to fetch and
store 'frotz' remote-tracking branch.

* tm/fetch-prune:
  fetch --prune: Run prune before fetching
  fetch --prune: always print header url
2014-01-10 10:32:50 -08:00
Junio C Hamano 061614b309 Merge branch 'mh/path-max'
A few places where we relied on a fixed length buffer to hold
pathnames in these two programs have been converted to use strbuf.

* mh/path-max:
  builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX
  prune-packed: use strbuf to avoid having to worry about PATH_MAX
2014-01-10 10:32:21 -08:00
Junio C Hamano b0504a9519 Merge branch 'cc/replace-object-info'
read_sha1_file() that is the workhorse to read the contents given
an object name honoured object replacements, but there is no
corresponding mechanism to sha1_object_info() that is used to
obtain the metainfo (e.g. type & size) about the object, leading
callers to weird inconsistencies.

* cc/replace-object-info:
  replace info: rename 'full' to 'long' and clarify in-code symbols
  Documentation/git-replace: describe --format option
  builtin/replace: unset read_replace_refs
  t6050: add tests for listing with --format
  builtin/replace: teach listing using short, medium or full formats
  sha1_file: perform object replacement in sha1_object_info_extended()
  t6050: show that git cat-file --batch fails with replace objects
  sha1_object_info_extended(): add an "unsigned flags" parameter
  sha1_file.c: add lookup_replace_object_extended() to pass flags
  replace_object: don't check read_replace_refs twice
  rename READ_SHA1_FILE_REPLACE flag to LOOKUP_REPLACE_OBJECT
2014-01-10 10:32:10 -08:00
Junio C Hamano 010d81ae35 Merge branch 'nd/negative-pathspec'
Introduce "negative pathspec" magic, to allow "git log -- . ':!dir'" to
tell us "I am interested in everything but 'dir' directory".

* nd/negative-pathspec:
  pathspec.c: support adding prefix magic to a pathspec with mnemonic magic
  Support pathspec magic :(exclude) and its short form :!
  glossary-content.txt: rephrase magic signature part
2014-01-10 10:31:48 -08:00
Jeff King 648027c4c8 cat-file: fix a minor memory leak in batch_objects
We should always have been freeing our strbuf, but doing so
consistently was annoying until the refactoring in the
previous patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 14:31:52 -08:00
Jeff King 07e2383945 cat-file: refactor error handling of batch_objects
This just pulls the return value for the function out of the
inner loop, so we can break out of the loop rather than do
an early return. This will make it easier to put any cleanup
for the function in one place.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 14:31:10 -08:00
Sebastian Schuberth c6127fa3e2 builtin/help.c: speed up is_git_command() by checking for builtin commands first
Since 2dce956 is_git_command() is a bit slow as it does file I/O in
the call to list_commands_in_dir(). Avoid the file I/O by adding an
early check for the builtin commands.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 11:26:31 -08:00
Sebastian Schuberth a3c5263438 builtin/help.c: call load_command_list() only when it is needed
This avoids list_commands_in_dir() being called when not needed which is
quite slow due to file I/O in order to list matching files in a directory.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 11:26:10 -08:00
Michael Haggerty f3565c0ca5 cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories_const() returns a non-zero value on
error.  The old code at this calling site recognized a couple of
particular error values, and treated all other return values as
success.  Instead, be more conservative: recognize the errors we are
interested in, but treat any other nonzero values as failures.  This
is more robust in case somebody adds another possible return value
without telling us.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:34:22 -08:00
Michael Haggerty 0be0521b23 safe_create_leading_directories(): introduce enum for return values
Instead of returning magic integer values (which a couple of callers
go to the trouble of distinguishing), return values from an enum.  Add
a docstring.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:34:21 -08:00
Ramsay Jones feefdf62c1 shallow: remove unused code
Commit 58babfff ("shallow.c: the 8 steps to select new commits for
.git/shallow", 05-12-2013) added a function to implement step 5 of
the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'.
This function implements an optional optimization step in the new
shallow commit selection algorithm. However, this function has no
callers. (The commented out call sites would need to change, in
order to provide information required by the function.)

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:05:40 -08:00
Tom Miller 10a6cc8890 fetch --prune: Run prune before fetching
When we have a remote-tracking branch named "frotz/nitfol" from a
previous fetch, and the upstream now has a branch named "frotz",
fetch would fail to remove "frotz/nitfol" with a "git fetch --prune"
from the upstream. git would inform the user to use "git remote
prune" to fix the problem.

Change the way "fetch --prune" works by moving the pruning operation
before the fetching operation. This way, instead of warning the user
of a conflict, it autmatically fixes it.

Signed-off-by: Tom Miller <jackerran@gmail.com>
Tested-by: Thomas Rast <tr@thomasrast.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-03 10:18:40 -08:00
Tom Miller 4b3b33a747 fetch --prune: always print header url
If "fetch --prune" is run with no new refs to fetch, but it has refs
to prune. Then, the header url is not printed as it would if there were
new refs to fetch.

Output before this patch:

	$ git fetch --prune remote-with-no-new-refs
	 x [deleted]         (none)     -> origin/world

Output after this patch:

	$ git fetch --prune remote-with-no-new-refs
	From https://github.com/git/git
	 x [deleted]         (none)     -> origin/test

Signed-off-by: Tom Miller <jackerran@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-03 10:13:39 -08:00
Kyle J. McKay ed7eda8b38 gc: notice gc processes run by other users
Since 64a99eb4 git gc refuses to run without the --force option if
another gc process on the same repository is already running.

However, if the repository is shared and user A runs git gc on the
repository and while that gc is still running user B runs git gc on
the same repository the gc process run by user A will not be noticed
and the gc run by user B will go ahead and run.

The problem is that the kill(pid, 0) test fails with an EPERM error
since user B is not allowed to signal processes owned by user A
(unless user B is root).

Update the test to recognize an EPERM error as meaning the process
exists and another gc should not be run (unless --force is given).

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-02 16:15:29 -08:00
Christian Couder 663a8566be replace info: rename 'full' to 'long' and clarify in-code symbols
Enum names SHORT/MEDIUM/FULL were too broad to be descriptive.  And
they clashed with built-in symbols on platforms like Windows.
Clarify by giving them REPLACE_FORMAT_ prefix.

Rename 'full' format in "git replace --format=<name>" to 'long', to
match others (i.e. 'short' and 'medium').

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:33:11 -08:00
Junio C Hamano 44484662d8 Merge branch 'maint'
* maint:
  for-each-ref: remove unused variable
2013-12-30 12:27:01 -08:00
Ramkumar Ramachandra b9cf14d43b for-each-ref: remove unused variable
No code ever used this symbol since the command was introduced at
9f613ddd (Add git-for-each-ref: helper for language bindings,
2006-09-15).

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:23:51 -08:00
Vicent Marti ae4f07fbcc pack-bitmap: implement optional name_hash cache
When we use pack bitmaps rather than walking the object
graph, we end up with the list of objects to include in the
packfile, but we do not know the path at which any tree or
blob objects would be found.

In a recently packed repository, this is fine. A fetch would
use the paths only as a heuristic in the delta compression
phase, and a fully packed repository should not need to do
much delta compression.

As time passes, though, we may acquire more objects on top
of our large bitmapped pack. If clients fetch frequently,
then they never even look at the bitmapped history, and all
works as usual. However, a client who has not fetched since
the last bitmap repack will have "have" tips in the
bitmapped history, but "want" newer objects.

The bitmaps themselves degrade gracefully in this
circumstance. We manually walk the more recent bits of
history, and then use bitmaps when we hit them.

But we would also like to perform delta compression between
the newer objects and the bitmapped objects (both to delta
against what we know the user already has, but also between
"new" and "old" objects that the user is fetching). The lack
of pathnames makes our delta heuristics much less effective.

This patch adds an optional cache of the 32-bit name_hash
values to the end of the bitmap file. If present, a reader
can use it to match bitmapped and non-bitmapped names during
delta compression.

Here are perf results for p5310:

Test                      origin/master       HEAD^                      HEAD
-------------------------------------------------------------------------------------------------
5310.2: repack to disk    36.81(37.82+1.43)   47.70(48.74+1.41) +29.6%   47.75(48.70+1.51) +29.7%
5310.3: simulated clone   30.78(29.70+2.14)   1.08(0.97+0.10) -96.5%     1.07(0.94+0.12) -96.5%
5310.4: simulated fetch   3.16(6.10+0.08)     3.54(10.65+0.06) +12.0%    1.70(3.07+0.06) -46.2%
5310.6: partial bitmap    36.76(43.19+1.81)   6.71(11.25+0.76) -81.7%    4.08(6.26+0.46) -88.9%

You can see that the time spent on an incremental fetch goes
down, as our delta heuristics are able to do their work.
And we save time on the partial bitmap clone for the same
reason.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Vicent Marti 5cf2741c5a repack: consider bitmaps when performing repacks
Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.

This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.

Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.writebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.

Alternatively, `git repack` can be called with the `-b` switch to
explicitly generate bitmap indexes if you are experimenting
and don't want them on all the time.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Jeff King b77fcd1edc repack: handle optional files created by pack-objects
We ask pack-objects to pack to a set of temporary files, and
then rename them into place. Some files that pack-objects
creates may be optional (like a .bitmap file), in which case
we would not want to call rename(). We already call stat()
and make the chmod optional if the file cannot be accessed.
We could simply skip the rename step in this case, but that
would be a minor regression in noticing problems with
non-optional files (like the .pack and .idx files).

Instead, we can now annotate extensions as optional, and
skip them if they don't exist (and otherwise rely on
rename() to barf).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Jeff King 42a02d8529 repack: turn exts array into array-of-struct
This is slightly more verbose, but will let us annotate the
extensions with further options in future commits.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Jeff King b328c2166e repack: stop using magic number for ARRAY_SIZE(exts)
We have a static array of extensions, but hardcode the size
of the array in our loops. Let's pull out this magic number,
which will make it easier to change.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:23 -08:00
Vicent Marti 7cc8f97108 pack-objects: implement bitmap writing
This commit extends more the functionality of `pack-objects` by allowing
it to write out a `.bitmap` index next to any written packs, together
with the `.idx` index that currently gets written.

If bitmap writing is enabled for a given repository (either by calling
`pack-objects` with the `--write-bitmap-index` flag or by having
`pack.writebitmaps` set to `true` in the config) and pack-objects is
writing a packfile that would normally be indexed (i.e. not piping to
stdout), we will attempt to write the corresponding bitmap index for the
packfile.

Bitmap index writing happens after the packfile and its index has been
successfully written to disk (`finish_tmp_packfile`). The process is
performed in several steps:

    1. `bitmap_writer_set_checksum`: this call stores the partial
       checksum for the packfile being written; the checksum will be
       written in the resulting bitmap index to verify its integrity

    2. `bitmap_writer_build_type_index`: this call uses the array of
       `struct object_entry` that has just been sorted when writing out
       the actual packfile index to disk to generate 4 type-index bitmaps
       (one for each object type).

       These bitmaps have their nth bit set if the given object is of
       the bitmap's type. E.g. the nth bit of the Commits bitmap will be
       1 if the nth object in the packfile index is a commit.

       This is a very cheap operation because the bitmap writing code has
       access to the metadata stored in the `struct object_entry` array,
       and hence the real type for each object in the packfile.

    3. `bitmap_writer_reuse_bitmaps`: if there exists an existing bitmap
       index for one of the packfiles we're trying to repack, this call
       will efficiently rebuild the existing bitmaps so they can be
       reused on the new index. All the existing bitmaps will be stored
       in a `reuse` hash table, and the commit selection phase will
       prioritize these when selecting, as they can be written directly
       to the new index without having to perform a revision walk to
       fill the bitmap. This can greatly speed up the repack of a
       repository that already has bitmaps.

    4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
       a given `pack-objects` run, the sequence of commits generated
       during the Counting Objects phase will be stored in an array.

       We then use that array to build up the list of selected commits.
       Writing a bitmap in the index for each object in the repository
       would be cost-prohibitive, so we use a simple heuristic to pick
       the commits that will be indexed with bitmaps.

       The current heuristics are a simplified version of JGit's
       original implementation. We select a higher density of commits
       depending on their age: the 100 most recent commits are always
       selected, after that we pick 1 commit of each 100, and the gap
       increases as the commits grow older. On top of that, we make sure
       that every single branch that has not been merged (all the tips
       that would be required from a clone) gets their own bitmap, and
       when selecting commits between a gap, we tend to prioritize the
       commit with the most parents.

       Do note that there is no right/wrong way to perform commit
       selection; different selection algorithms will result in
       different commits being selected, but there's no such thing as
       "missing a commit". The bitmap walker algorithm implemented in
       `prepare_bitmap_walk` is able to adapt to missing bitmaps by
       performing manual walks that complete the bitmap: the ideal
       selection algorithm, however, would select the commits that are
       more likely to be used as roots for a walk in the future (e.g.
       the tips of each branch, and so on) to ensure a bitmap for them
       is always available.

    5. `bitmap_writer_build`: this is the computationally expensive part
       of bitmap generation. Based on the list of commits that were
       selected in the previous step, we perform several incremental
       walks to generate the bitmap for each commit.

       The walks begin from the oldest commit, and are built up
       incrementally for each branch. E.g. consider this dag where A, B,
       C, D, E, F are the selected commits, and a, b, c, e are a chunk
       of simplified history that will not receive bitmaps.

            A---a---B--b--C--c--D
                     \
                      E--e--F

       We start by building the bitmap for A, using A as the root for a
       revision walk and marking all the objects that are reachable
       until the walk is over. Once this bitmap is stored, we reuse the
       bitmap walker to perform the walk for B, assuming that once we
       reach A again, the walk will be terminated because A has already
       been SEEN on the previous walk.

       This process is repeated for C, and D, but when we try to
       generate the bitmaps for E, we can reuse neither the current walk
       nor the bitmap we have generated so far.

       What we do now is resetting both the walk and clearing the
       bitmap, and performing the walk from scratch using E as the
       origin. This new walk, however, does not need to be completed.
       Once we hit B, we can lookup the bitmap we have already stored
       for that commit and OR it with the existing bitmap we've composed
       so far, allowing us to limit the walk early.

       After all the bitmaps have been generated, another iteration
       through the list of commits is performed to find the best XOR
       offsets for compression before writing them to disk. Because of
       the incremental nature of these bitmaps, XORing one of them with
       its predecesor results in a minimal "bitmap delta" most of the
       time. We can write this delta to the on-disk bitmap index, and
       then re-compose the original bitmaps by XORing them again when
       loaded.

       This is a phase very similar to pack-object's `find_delta` (using
       bitmaps instead of objects, of course), except the heuristics
       have been greatly simplified: we only check the 10 bitmaps before
       any given one to find best compressing one. This gives good
       results in practice, because there is locality in the ordering of
       the objects (and therefore bitmaps) in the packfile.

     6. `bitmap_writer_finish`: the last step in the process is
	serializing to disk all the bitmap data that has been generated
	in the two previous steps.

	The bitmap is written to a tmp file and then moved atomically to
	its final destination, using the same process as
	`pack-write.c:write_idx_file`.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Vicent Marti aa32939fea rev-list: add bitmap mode to speed up object lists
The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only thing required are the SHA1s of the objects during
the list (i.e., not the path names at which trees and blobs were found).

Calling `git rev-list --objects --use-bitmap-index [committish]` will
perform an object iteration based on a bitmap result instead of actually
walking the object graph.

These are some example timings for `torvalds/linux` (warm cache,
best-of-five):

    $ time git rev-list --objects master > /dev/null

    real    0m34.191s
    user    0m33.904s
    sys     0m0.268s

    $ time git rev-list --objects --use-bitmap-index master > /dev/null

    real    0m1.041s
    user    0m0.976s
    sys     0m0.064s

Likewise, using `git rev-list --count --use-bitmap-index` will speed up
the counting operation by building the resulting bitmap and performing a
fast popcount (number of bits set on the bitmap) on the result.

Here are some sample timings of different ways to count commits in
`torvalds/linux`:

    $ time git rev-list master | wc -l
        399882

        real    0m6.524s
        user    0m6.060s
        sys     0m3.284s

    $ time git rev-list --count master
        399882

        real    0m4.318s
        user    0m4.236s
        sys     0m0.076s

    $ time git rev-list --use-bitmap-index --count master
        399882

        real    0m0.217s
        user    0m0.176s
        sys     0m0.040s

This also respects negative refs, so you can use it to count
a slice of history:

        $ time git rev-list --count v3.0..master
        144843

        real    0m1.971s
        user    0m1.932s
        sys     0m0.036s

        $ time git rev-list --use-bitmap-index --count v3.0..master
        real    0m0.280s
        user    0m0.220s
        sys     0m0.056s

Though note that the closer the endpoints, the less it helps. In the
traversal case, we have fewer commits to cross, so we take less time.
But the bitmap time is dominated by generating the pack revindex, which
is constant with respect to the refs given.

Note that you cannot yet get a fast --left-right count of a symmetric
difference (e.g., "--count --left-right master...topic"). The slow part
of that walk actually happens during the merge-base determination when
we parse "master...topic". Even though a count does not actually need to
know the real merge base (it only needs to take the symmetric difference
of the bitmaps), the revision code would require some refactoring to
handle this case.

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same. This can be used to
exercise the bitmap code, and also to verify that the contents of the
.bitmap file are sane.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Vicent Marti 6b8fda2db1 pack-objects: use bitmaps when packing objects
In this patch, we use the bitmap API to perform the `Counting Objects`
phase in pack-objects, rather than a traditional walk through the object
graph. For a reasonably-packed large repo, the time to fetch and clone
is often dominated by the full-object revision walk during the Counting
Objects phase. Using bitmaps can reduce the CPU time required on the
server (and therefore start sending the actual pack data with less
delay).

For bitmaps to be used, the following must be true:

  1. We must be packing to stdout (as a normal `pack-objects` from
     `upload-pack` would do).

  2. There must be a .bitmap index containing at least one of the
     "have" objects that the client is asking for.

  3. Bitmaps must be enabled (they are enabled by default, but can be
     disabled by setting `pack.usebitmaps` to false, or by using
     `--no-use-bitmap-index` on the command-line).

If any of these is not true, we fall back to doing a normal walk of the
object graph.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository) that show the speedup produced by various
methods:

    [existing graph traversal]
    $ time git pack-objects --all --stdout --no-use-bitmap-index \
			    </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m44.111s
    user    0m42.396s
    sys     0m3.544s

    [bitmaps only, without partial pack reuse; note that
     pack reuse is automatic, so timing this required a
     patch to disable it]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Counting objects: 3237103, done.
    Compressing objects: 100% (508752/508752), done.
    Total 3237103 (delta 2699584), reused 3237103 (delta 2699584)

    real    0m5.413s
    user    0m5.604s
    sys     0m1.804s

    [bitmaps with pack reuse (what you get with this patch)]
    $ time git pack-objects --all --stdout </dev/null >/dev/null
    Reusing existing pack: 3237103, done.
    Total 3237103 (delta 0), reused 0 (delta 0)

    real    0m1.636s
    user    0m1.460s
    sys     0m0.172s

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Jeff King ce2bc42456 pack-objects: split add_object_entry
This function actually does three things:

  1. Check whether we've already added the object to our
     packing list.

  2. Check whether the object meets our criteria for adding.

  3. Actually add the object to our packing list.

It's a little hard to see these three phases, because they
happen linearly in the rather long function. Instead, this
patch breaks them up into three separate helper functions.

The result is a little easier to follow, though it
unfortunately suffers from some optimization
interdependencies between the stages (e.g., during step 3 we
use the packing list index from step 1 and the packfile
information from step 2).

More importantly, though, the various parts can be
composed differently, as they will be in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 12:19:22 -08:00
Junio C Hamano 8f29299136 merge-base --octopus: reduce the result from get_octopus_merge_bases()
Scripts that use "merge-base --octopus" could do the reducing
themselves, but most of them are expected to want to get the reduced
results without having to do any work themselves.

Tests are taken from a message by Василий Макаров
<einmalfel@gmail.com>

Signed-off-by: Junio C Hamano <gitster@pobox.com>

---

 We might want to vet the existing callers of the underlying
 get_octopus_merge_bases() and find out if _all_ of them are doing
 anything extra (like deduping) because the machinery can return
 duplicate results. And if that is the case, then we may want to
 move the dedupling down the callchain instead of having it here.
2013-12-30 11:58:54 -08:00
Junio C Hamano e2f5df4244 merge-base: separate "--independent" codepath into its own helper
It piggybacks on an unrelated handle_octopus() function only because
there are some similarities between the way they need to preprocess
their input and output their result.  There is nothing similar in
the true logic between these two operations.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30 11:37:49 -08:00
Johannes Schindelin e228c1736f Remove the line length limit for graft files
Support for grafts predates Git's strbuf, and hence it is understandable
that there was a hard-coded line length limit of 1023 characters (which
was chosen a bit awkwardly, given that it is *exactly* one byte short of
aligning with the 41 bytes occupied by a commit name and the following
space or new-line character).

While regular commit histories hardly win comprehensibility in general
if they merge more than twenty-two branches in one go, it is not Git's
business to limit grafts in such a way.

In this particular developer's case, the use case that requires
substantially longer graft lines to be supported is the visualization of
the commits' order implied by their changes: commits are considered to
have an implicit relationship iff exchanging them in an interactive
rebase would result in merge conflicts.

Thusly implied branches tend to be very shallow in general, and the
resulting thicket of implied branches is usually very wide; It is
actually quite common that *most* of the commits in a topic branch have
not even one implied parent, so that a final merge commit has about as
many implied parents as there are commits in said branch.

[jc: squashed in tests by Jonathan]

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-27 16:46:25 -08:00
Junio C Hamano 73b063130b Merge branch 'tg/diff-no-index-refactor'
"git diff ../else/where/A ../else/where/B" when ../else/where is
clearly outside the repository, and "git diff --no-index A B", do
not have to look at the index at all, but we used to read the index
unconditionally.

* tg/diff-no-index-refactor:
  diff: avoid some nesting
  diff: add test for --no-index executed outside repo
  diff: don't read index when --no-index is given
  diff: move no-index detection to builtin/diff.c
2013-12-27 14:58:17 -08:00
Junio C Hamano 604ada435b Merge branch 'jk/cat-file-regression-fix'
"git cat-file --batch=", an admittedly useless command, did not
behave very well.

* jk/cat-file-regression-fix:
  cat-file: handle --batch format with missing type/size
  cat-file: pass expand_data to print_object_or_die
2013-12-27 14:58:11 -08:00
Junio C Hamano e9ecee0423 Merge branch 'jk/rev-parse-double-dashes'
"git rev-parse <revs> -- <paths>" did not implement the usual
disambiguation rules the commands in the "git log" family used in
the same way.

* jk/rev-parse-double-dashes:
  rev-parse: be more careful with munging arguments
  rev-parse: correctly diagnose revision errors before "--"
2013-12-27 14:58:01 -08:00
Junio C Hamano 7cdebd8a20 Merge branch 'jc/push-refmap'
Make "git push origin master" update the same ref that would be
updated by our 'master' when "git push origin" (no refspecs) is run
while the 'master' branch is checked out, which makes "git push"
more symmetric to "git fetch" and more usable for the triangular
workflow.

* jc/push-refmap:
  push: also use "upstream" mapping when pushing a single ref
  push: use remote.$name.push as a refmap
  builtin/push.c: use strbuf instead of manual allocation
2013-12-27 14:57:50 -08:00
Jeff King 65ea9c3c3d cat-file: provide %(deltabase) batch format
It can be useful for debugging or analysis to see which
objects are stored as delta bases on top of others. This
information is available by running `git verify-pack`, but
that is extremely expensive (and is harder than necessary to
parse).

Instead, let's make it available as a cat-file query format,
which makes it fast and simple to get the bases for a subset
of the objects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 11:54:26 -08:00
Jeff King 9af270e8c2 do not pretend sha1write returns errors
The sha1write function returns an int, but it will always be
"0". The failure-prone parts of the function happen in the
"flush" callback, which cannot pass an error back to us. So
we just end up calling die() during the flush.

Let's just drop the return value altogether, as it only
confuses callers into thinking that it might be useful.

Only one call site actually checked the return value. We can
drop that check, since it just led to a die() anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 11:50:20 -08:00
Nguyễn Thái Ngọc Duy 64ed07cee0 add: don't complain when adding empty project root
This behavior was added in 07d7bed (add: don't complain when adding
empty project root - 2009-04-28) then broken by 84b8b5d (remove
match_pathspec() in favor of match_pathspec_depth() -
2013-07-14). Reinstate it.

Noticed-by: Thomas Ferris Nicolaisen <tfnico@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26 10:46:26 -08:00
Jeff King 4454e9cb59 builtin/prune.c: use strbuf to avoid having to worry about PATH_MAX
While at it, rename prune_tmp_object(), which used to be a helper to
remove temporary files that were created to become loose object
files, to prune_tmp_file(), as the function is also used to remove
any random cruft whose name begins with tmp_ directly in .git/object
or .git/object/pack directories these days.

Noticed-by:  Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-18 15:53:56 -08:00
Junio C Hamano 7794a680e6 Sync with 1.8.5.2
* maint:
  Git 1.8.5.2
  cmd_repack(): remove redundant local variable "nr_packs"
2013-12-17 14:12:17 -08:00
Junio C Hamano 1945e8ac85 Merge branch 'tb/clone-ssh-with-colon-for-port'
Be more careful when parsing remote repository URL given in the
scp-style host:path notation.

* tb/clone-ssh-with-colon-for-port:
  git_connect(): use common return point
  connect.c: refactor url parsing
  git_connect(): refactor the port handling for ssh
  git fetch: support host:/~repo
  t5500: add test cases for diag-url
  git fetch-pack: add --diag-url
  git_connect: factor out discovery of the protocol and its parts
  git_connect: remove artificial limit of a remote command
  t5601: add tests for ssh
  t5601: remove clear_ssh, refactor setup_ssh_wrapper
2013-12-17 12:03:32 -08:00
Junio C Hamano 88cb2f96ac Merge branch 'nd/transport-positive-depth-only'
"git fetch --depth=0" was a no-op, and was silently
ignored. Diagnose it as an error.

* nd/transport-positive-depth-only:
  clone,fetch: catch non positive --depth option value
2013-12-17 12:03:29 -08:00
Junio C Hamano ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Junio C Hamano 14a9c5f261 Merge branch 'jl/commit-v-strip-marker'
"git commit -v" appends the patch to the log message before
editing, and then removes the patch when the editor returned
control. However, the patch was not stripped correctly when the
first modified path was a submodule.

* jl/commit-v-strip-marker:
  commit -v: strip diffs and submodule shortlogs from the commit message
2013-12-17 11:47:18 -08:00
Junio C Hamano fb230b3523 Merge branch 'mm/mv-file-to-no-such-dir-with-slash'
* mm/mv-file-to-no-such-dir-with-slash:
  mv: let 'git mv file no-such-dir/' error out
2013-12-17 11:47:08 -08:00
Junio C Hamano 4d1826d1d9 Merge branch 'fc/trivial'
* fc/trivial:
  remote: fix status with branch...rebase=preserve
  fetch: add missing documentation
  t: trivial whitespace cleanups
  abspath: trivial style fix
2013-12-17 11:46:32 -08:00
Junio C Hamano c8b928d770 Merge branch 'nd/magic-pathspec' into maint
"git diff -- ':(icase)makefile'" was unnecessarily rejected at the
command line parser.

* nd/magic-pathspec:
  diff: restrict pathspec limitations to diff b/f case only
2013-12-17 11:21:34 -08:00
Michael Haggerty 3e7b066e22 cmd_repack(): remove redundant local variable "nr_packs"
Its value is the same as the number of entries in the "names"
string_list, so just use "names.nr" in its place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 10:54:41 -08:00
Junio C Hamano c235d960cb prune-packed: use strbuf to avoid having to worry about PATH_MAX
A/very/long/path/to/.git that becomes exactly PATH_MAX bytes long
after suffixed with /objects/??/??38-hex??, would have overflown
the on-stack pathname[] buffer.

Noticed-by:  Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-17 10:43:30 -08:00
Thomas Gummerer aad90e85f8 diff: avoid some nesting
Avoid some nesting in builtin/diff.c, to make the code easier to read.
There are no functional changes.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 13:13:05 -08:00
Junio C Hamano 577aed296a Merge branch 'jk/remove-deprecated'
* jk/remove-deprecated:
  stop installing git-tar-tree link
  peek-remote: remove deprecated alias of ls-remote
  lost-found: remove deprecated command
  tar-tree: remove deprecated command
  repo-config: remove deprecated alias for "git config"
2013-12-12 14:18:34 -08:00
Junio C Hamano e66ef7ae6f Merge branch 'mh/fetch-tags-in-addition-to-normal-refs'
The "--tags" option to "git fetch" used to be literally a synonym to
a "refs/tags/*:refs/tags/*" refspec, which meant that (1) as an
explicit refspec given from the command line, it silenced the lazy
"git fetch" default that is configured, and (2) also as an explicit
refspec given from the command line, it interacted with "--prune"
to remove any tag that the remote we are fetching from does not
have.

This demotes it to an option; with it, we fetch all tags in
addition to what would be fetched without the option, and it does
not interact with the decision "--prune" makes to see what
remote-tracking refs the local has are missing the remote
counterpart.

* mh/fetch-tags-in-addition-to-normal-refs: (23 commits)
  fetch: improve the error messages emitted for conflicting refspecs
  handle_duplicate(): mark error message for translation
  ref_remote_duplicates(): extract a function handle_duplicate()
  ref_remove_duplicates(): simplify loop logic
  t5536: new test of refspec conflicts when fetching
  ref_remove_duplicates(): avoid redundant bisection
  git-fetch.txt: improve description of tag auto-following
  fetch-options.txt: simplify ifdef/ifndef/endif usage
  fetch, remote: properly convey --no-prune options to subprocesses
  builtin/remote.c:update(): use struct argv_array
  builtin/remote.c: reorder function definitions
  query_refspecs(): move some constants out of the loop
  fetch --prune: prune only based on explicit refspecs
  fetch --tags: fetch tags *in addition to* other stuff
  fetch: only opportunistically update references based on command line
  get_expanded_map(): avoid memory leak
  get_expanded_map(): add docstring
  builtin/fetch.c: reorder function definitions
  get_ref_map(): rename local variables
  api-remote.txt: correct section "struct refspec"
  ...
2013-12-12 14:14:10 -08:00
Thomas Gummerer 6df5762db3 diff: don't read index when --no-index is given
git diff --no-index ... currently reads the index, during setup, when
calling gitmodules_config().  This results in worse performance when the
index is not actually needed.  This patch avoids calling
gitmodules_config() when the --no-index option is given.  The times for
executing "git diff --no-index" in the WebKit repository are improved as
follows:

Test                      HEAD~3            HEAD
------------------------------------------------------------------
4001.1: diff --no-index   0.24(0.15+0.09)   0.01(0.00+0.00) -95.8%

An additional improvement of this patch is that "git diff --no-index" no
longer breaks when the index file is corrupt, which makes it possible to
use it for investigating the broken repository.

To improve the possible usage as investigation tool for broken
repositories, setup_git_directory_gently() is also not called when the
--no-index option is given.

Also add a test to guard against future breakages, and a performance
test to show the improvements.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 12:23:02 -08:00
Thomas Gummerer 470faf9654 diff: move no-index detection to builtin/diff.c
Currently the --no-index option is parsed in diff_no_index().  Move the
detection if a no-index diff should be executed to builtin/diff.c, where
we can use it for executing diff_no_index() conditionally.  This will
also allow us to execute other operations conditionally, which will be
done in the next patch.

There are no functional changes.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 12:23:02 -08:00
Christian Couder 769a4fa463 builtin/replace: unset read_replace_refs
When checking to see if some objects are of the same type
and when displaying the type of objects, git replace uses
the sha1_object_info() function.

Unfortunately this function by default respects replace
refs, so instead of the type of a replaced object, it
gives the type of the replacement object which might
be different.

To fix this bug, and because git replace should work at a
level before replacement takes place, let's unset the
read_replace_refs global variable at the beginning of
cmd_replace().

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:53:49 -08:00
Christian Couder 44f9f850e8 builtin/replace: teach listing using short, medium or full formats
By default when listing replace refs, only the sha1 of the
replaced objects are shown.

In many cases, it is much nicer to be able to list all the
sha1 of the replaced objects along with the sha1 of the
replacment objects.

And in other cases it might be interesting to also show the
types of the replaced and replacement objects.

This patch introduce a new --format=<fmt> option where
<fmt> can be any of the following:

	'short': this is the same as when no --format
		option is used, that is only the sha1 of
		the replaced objects are shown
	'medium': this also lists the sha1 of the
		replacement objects
	'full': this shows the sha1 and the type of both
		the replaced and the replacement objects

Some documentation and some tests will follow.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:53:49 -08:00
Christian Couder de7b5d6218 sha1_object_info_extended(): add an "unsigned flags" parameter
This parameter is not used yet, but it will be used to tell
sha1_object_info_extended() if it should perform object
replacement or not.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:53:48 -08:00
Jeff King 6554dfa97a cat-file: handle --batch format with missing type/size
Commit 98e2092 taught cat-file to stream blobs with --batch,
which requires that we look up the object type before
loading it into memory.  As a result, we now print the
object header from information in sha1_object_info, and the
actual contents from the read_sha1_file. We double-check
that the information we printed in the header matches the
content we are about to show.

Later, commit 93d2a60 allowed custom header lines for
--batch, and commit 5b08640 made type lookups optional. As a
result, specifying a header line without the type or size
means that we will not look up those items at all.

This causes our double-checking to erroneously die with an
error; we think the type or size has changed, when in fact
it was simply left at "0".

For the size, we can fix this by only doing the consistency
double-check when we have retrieved the size via
sha1_object_info. In the case that we have not retrieved the
value, that means we also did not print it, so there is
nothing for us to check that we are consistent with.

We could do the same for the type. However, besides our
consistency check, we also care about the type in deciding
whether to stream or not. So instead of handling the case
where we do not know the type, this patch instead makes sure
that we always trigger a type lookup when we are printing,
so that even a format without the type will stream as we
would in the normal case.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:31:25 -08:00
Jeff King 370c9268d1 cat-file: pass expand_data to print_object_or_die
We currently individually pass the sha1, type, and size
fields calculated by sha1_object_info. However, if we pass
the whole struct, the called function can make more
intelligent decisions about which fields were actually
filled by sha1_object_info.

This patch takes that first refactoring step, passing the
whole struct, so further patches can make those decisions
with less noise in their diffs. There should be no
functional change to this patch (aside from a minor typo fix
in the error message).

As a side effect, we can rename the local variables in the
function to "type" and "size", since the names are no longer
taken.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12 11:27:21 -08:00
Nguyễn Thái Ngọc Duy eab3296c7e prune: clean .git/shallow after pruning objects
This patch teaches "prune" to remove shallow roots that are no longer
reachable from any refs (e.g. when the relevant refs are removed).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:19 -08:00
Nguyễn Thái Ngọc Duy 0d7d285f0e clone: use git protocol for cloning shallow repo locally
clone_local() does not handle $SRC/shallow. It could be made so, but
it's simpler to use fetch-pack/upload-pack instead.

This used to be caught by the check in upload-pack, which is triggered
by transport_get_remote_refs(), even in local clone case. The check is
now gone and check_everything_connected() should catch the result
incomplete repo. But check_everything_connected() will soon be skipped
in local clone case, opening a door to corrupt repo. This patch should
close that door.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy f2c681cf12 send-pack: support pushing from a shallow clone via http
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy c29a7b8b3f receive-pack: support pushing to a shallow clone via http
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy 16094885ca smart-http: support shallow fetch/clone
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy 58f2ed051f remote-curl: pass ref SHA-1 to fetch-pack as well
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy b016918b2f send-pack: support pushing to a shallow clone
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy 0a1bc12b6e receive-pack: allow pushes that update .git/shallow
The basic 8 steps to update .git/shallow does not fully apply here
because the user may choose to accept just a few refs (while fetch
always accepts all refs). The steps are modified a bit.

1-6. same as before. After calling assign_shallow_commits_to_refs at
   step 6, each shallow commit has a bitmap that marks all refs that
   require it.

7. mark all "ours" shallow commits that are reachable from any
   refs. We will need to do the original step 7 on them later.

8. go over all shallow commit bitmaps, mark refs that require new
   shallow commits.

9. setup a strict temporary shallow file to plug all the holes, even
   if it may cut some of our history short. This file is used by all
   hooks. The hooks could use --shallow-file=$GIT_DIR/shallow to
   overcome this and reach everything in current repo.

10. go over the new refs one by one. For each ref, do the reachability
   test if it needs a shallow commit on the list from step 7. Remove
   it if it's reachable from our refs. Gather all required shallow
   commits, run check_everything_connected() with the new ref, then
   install them to .git/shallow.

This mode is disabled by default and can be turned on with
receive.shallowupdate

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:18 -08:00
Nguyễn Thái Ngọc Duy 5dbd767601 receive/send-pack: support pushing from a shallow clone
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy 31c42bff35 receive-pack: reorder some code in unpack()
This is the preparation for adding --shallow-file to both
unpack-objects and index-pack. To sum up:

 - struct argv_array used instead of const char **

 - status/code, ip/child, unpacker/keeper are moved out to function
   top level

 - successful flow now ends at the end of the function

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy 48d25cae22 fetch: add --update-shallow to accept refs that update .git/shallow
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy 4820a33baa fetch: support fetching from a shallow repository
This patch just put together pieces from the 8 steps patch. We stop at
step 7 and reject refs that require new shallow commits.

Note that, by rejecting refs that require new shallow commits, we
leave dangling objects in the repo, which become "object islands" by
the next "git fetch" of the same source.

If the first fetch our "ours" set is zero and we do practically
nothing at step 7, "ours" is full at the next fetch and we may need to
walk through commits for reachability test. Room for improvement.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy beea4152d9 clone: support remote shallow repository
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.

Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.

One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).

If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.

Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.

A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy b06dcd7d68 connect.c: teach get_remote_heads to parse "shallow" lines
No callers pass a non-empty pointer as shallow_points at this
stage. As a result, all clients still refuse to talk to shallow
repository on the other end.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy ad491366de make the sender advertise shallow commits to the receiver
If either receive-pack or upload-pack is called on a shallow
repository, shallow commits (*) will be sent after the ref
advertisement (but before the packet flush), so that the receiver has
the full "shape" of the sender's commit graph. This will be needed for
the receiver to update its .git/shallow if necessary.

This breaks the protocol for all clients trying to push to a shallow
repo, or fetch from one. Which is basically the same end result as
today's "is_repository_shallow() && die()" in receive-pack and
upload-pack. New clients will be made aware of shallow upstream and
can make use of this information.

The sender must send all shallow commits that are sent in the
following pack. It may send more shallow commits than necessary.

upload-pack for example may choose to advertise no shallow commits if
it knows in advance that the pack it's going to send contains no
shallow commits. But upload-pack is the server, so we choose the
cheaper way, send full .git/shallow and let the client deal with it.

Smart HTTP is not affected by this patch. Shallow support on
smart-http comes later separately.

(*) A shallow commit is a commit that terminates the revision
    walker. It is usually put in .git/shallow in order to keep the
    revision walker from going out of bound because there is no
    guarantee that objects behind this commit is available.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy 606e435a0a clone: prevent --reference to a shallow repository
If we borrow objects from another repository, we should also pay
attention to their $GIT_DIR/shallow (and even info/grafts). But
current alternates code does not.

Reject alternate repos that are shallow because we do not do it
right. In future the alternate code may be updated to check
$GIT_DIR/shallow properly so that this restriction could be lifted.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy 0b854bcc2a send-pack: forbid pushing from a shallow repository
send-pack can send a pack with loose ends to the server.  receive-pack
before 6d4bb38 (fetch: verify we have everything we need before
updating our ref - 2011-09-01) does not detect this and keeps the pack
anyway, which corrupts the repository, at least from fsck point of
view.

send-pack will learn to safely push from a shallow repository later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy 13eb4626c4 remote.h: replace struct extra_have_objects with struct sha1_array
The latter can do everything the former can and is used in many more
places.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:15 -08:00
Torsten Bögershausen 5610b7c0c6 git fetch-pack: add --diag-url
The main purpose is to trace the URL parser called by git_connect() in
connect.c

The main features of the parser can be listed as this:

- parse out host and path for URLs with a scheme (git:// file:// ssh://)
- parse host names embedded by [] correctly
- extract the port number, if present
- separate URLs like "file" (which are local)
  from URLs like "host:repo" which should use ssh

Add the new parameter "--diag-url" to "git fetch-pack", which prints
the value for protocol, host and path to stderr and exits.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 14:54:47 -08:00
Jeff King 62f162f8e7 rev-parse: be more careful with munging arguments
When rev-parse looks at whether an argument like "foo..bar" or
"foobar^@" is a difference or parent-shorthand, it internally
munges the arguments so that it can pass the individual rev
arguments to get_sha1(). However, we do not consistently un-munge
the result.

For cases where we do not match (e.g., "doesnotexist..HEAD"), we
would then want to try to treat the argument as a filename.
try_difference gets() this right, and always unmunges in this case.
However, try_parent_shorthand() never unmunges, leading to incorrect
error messages, or even incorrect results:

  $ git rev-parse foobar^@
  foobar
  fatal: ambiguous argument 'foobar': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

  $ >foobar
  $ git rev-parse foobar^@
  foobar

For cases where we do match, neither function unmunges. This does
not currently matter, since we are done with the argument. However,
a future patch will do further processing, and this prepares for
it. In addition, it's simply a confusing interface for some cases to
modify the const argument, and others not to.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 14:39:16 -08:00
Felipe Contreras 0a54f70905 remote: fix status with branch...rebase=preserve
Commit 66713ef (pull: allow pull to preserve merges when rebasing)
didn't include an update so 'git remote status' parses branch.<name>.rebase=preserve
correctly, let's do that.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 14:12:24 -08:00
Jeff King 1418567381 rev-parse: correctly diagnose revision errors before "--"
Rev-parse understands that a "--" may separate revisions and
filenames, and that anything after the "--" is taken as-is.
However, it does not understand that anything before the
token must be a revision (which is the usual rule
implemented by the setup_revisions parser).

Since rev-parse prefers revisions to files when parsing
before the "--", we end up with the correct result (if such
an argument is a revision, we parse it as one, and if it is
not, it is an error either way).  However, we misdiagnose
the errors:

  $ git rev-parse foobar -- >/dev/null
  fatal: ambiguous argument 'foobar': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

  $ >foobar
  $ git rev-parse foobar -- >/dev/null
  fatal: bad flag '--' used after filename

In both cases, we should know that the real error is that
"foobar" is meant to be a revision, but could not be
resolved.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09 11:01:23 -08:00
Nguyễn Thái Ngọc Duy ef79b1f870 Support pathspec magic :(exclude) and its short form :!
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 13:00:39 -08:00
Nguyễn Thái Ngọc Duy 5594bcad21 clone,fetch: catch non positive --depth option value
Instead of simply ignoring the value passed to --depth option when
it is zero or negative, catch and report it as an error to let
people know that they were using the option incorrectly.

Original-patch-by: Andrés G. Aragoneses <knocte@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 12:57:10 -08:00
Junio C Hamano e2bcd4f779 Merge branch 'nd/magic-pathspec'
"git diff -- ':(icase)makefile'" were rejected unnecessarily.
This needs to be merged to 'maint' later.

* nd/magic-pathspec:
  diff: restrict pathspec limitations to diff b/f case only
2013-12-06 11:09:41 -08:00
Junio C Hamano cb6bd5722f Merge branch 'rr/for-each-ref-decoration'
Add a few formatting directives to "git for-each-ref --format=...",
to paint them in color, etc.

* rr/for-each-ref-decoration:
  for-each-ref: avoid color leakage
  for-each-ref: introduce %(color:...) for color
  for-each-ref: introduce %(upstream:track[short])
  for-each-ref: introduce %(HEAD) asterisk marker
  t6300 (for-each-ref): don't hardcode SHA-1 hexes
  t6300 (for-each-ref): clearly demarcate setup
2013-12-06 11:07:21 -08:00
Jens Lehmann 1a72cfd7fa commit -v: strip diffs and submodule shortlogs from the commit message
When using the '-v' option of "git commit" the diff added to the commit
message temporarily for editing is stripped off after the user exited the
editor by searching for "\ndiff --git " and truncating the commmit message
there if it is found.

But this approach has two problems:

- when the commit message itself contains a line starting with
  "diff --git" it will be truncated there prematurely; and

- when the "diff.submodule" setting is set to "log", the diff may
  start with "Submodule <hash1>..<hash2>", which will be left in
  the commit message while it shouldn't.

Fix that by introducing a special scissor separator line starting with the
comment character ('#' or the core.commentChar config if set) followed by
two lines describing what it is for. The scissor line - which will not be
translated - is used to reliably detect the start of the diff so it can be
chopped off from the commit message, no matter what the user enters there.

Turn a known test failure fixed by this change into a successful test;
also add one for a diff starting with a submodule log and another one for
proper handling of the comment char.

Reported-by: Ari Pollak <ari@debian.org>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:39:11 -08:00
Christian Couder 5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Christian Couder 3fb5aead29 builtin/remote: remove postfixcmp() and use suffixcmp() instead
Commit 8cc5b290 (git merge -X<option>, 25 Nov 2009) introduced
suffixcmp() with nearly the same implementation as postfixcmp()
that already existed since commit 211c8968 (Make git-remote a
builtin, 29 Feb 2008).

The only difference between the two implementations is that,
when the string is smaller than the suffix, one implementation
returns 1 while the other one returns -1.

But, as postfixcmp() is only used to compare for equality, the
distinction does not matter and does not affect the correctness of
this patch.

As postfixcmp() has always been static in builtin/remote.c
and is used nowhere else, it makes more sense to remove it
and use suffixcmp() instead in builtin/remote.c, rather than
to remove suffixcmp().

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:12:52 -08:00
Junio C Hamano 15a42a10ec Sync with 1.8.5 2013-12-05 14:11:20 -08:00
Junio C Hamano b00d2440f7 Merge branch 'gj/push-more-verbose-advice' (early part)
* 'gj/push-more-verbose-advice' (early part):
  push: enhance unspecified push default warning
2013-12-05 14:03:32 -08:00