Commit graph

5183 commits

Author SHA1 Message Date
Junio C Hamano 2f664566c5 Merge branch 'jk/tighten-alloc'
Small code and comment clean-up.

* jk/tighten-alloc:
  receive-pack: use FLEX_ALLOC_MEM in queue_command()
  correct FLEXPTR_* example in comment
2016-08-17 14:07:46 -07:00
Stefan Beller f7415b4d71 clone: implement optional references
In a later patch we want to try to create alternates for submodules,
but they might not exist in the referenced superproject. So add a way
to skip the non existing references and report them.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-15 15:28:45 -07:00
Stefan Beller 5e40800df2 clone: clarify option_reference as required
In the next patch we introduce optional references; To better distinguish
between optional and required references we rename the variable.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-15 15:28:42 -07:00
Stefan Beller 9eeea7d2bc clone: factor out checking for an alternate path
In a later patch we want to determine if a path is suitable as an
alternate from other commands than builtin/clone. Move the checking
functionality of `add_one_reference` to `compute_alternate_path` that is
defined in cache.h.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-15 15:28:01 -07:00
Stefan Beller 779b88a91f checkout: do not mention detach advice for explicit --detach option
When a user asked for a detached HEAD specifically with `--detach`,
we do not need to give advice on what a detached HEAD state entails as
we can assume they know what they're getting into as they asked for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-15 15:01:45 -07:00
René Scharfe ddd0bfac7c receive-pack: use FLEX_ALLOC_MEM in queue_command()
Use the macro FLEX_ALLOC_MEM instead of open-coding it.  This shortens
and simplifies the code a bit.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:49:30 -07:00
Stefan Beller 5f50f33e87 submodule--helper update-clone: allow multiple references
Allow the user to pass in multiple references to update_clone.
Currently this is only internal API, but once the shell script is
replaced by a C version, this is needed.

This fixes an API bug between the shell script and the helper.
Currently the helper accepts "--reference" "--reference=foo"
as a OPT_STRING whose value happens to be "--reference=foo", and
then uses

        if (suc->reference)
                argv_array_push(&child->args, suc->reference)

where suc->reference _is_ "--reference=foo" when invoking the
underlying "git clone", it cancels out.

With this change we omit one of the "--reference" arguments when
passing references from the shell script to the helper.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-12 15:00:17 -07:00
Stefan Beller 965dbea09a submodule--helper module-clone: allow multiple references
Allow users to pass in multiple references, just as clone accepts multiple
references as well.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-12 15:00:17 -07:00
Junio C Hamano dd610aeda6 Merge branch 'kw/patch-ids-optim'
When "git rebase" tries to compare set of changes on the updated
upstream and our own branch, it computes patch-id for all of these
changes and attempts to find matches. This has been optimized by
lazily computing the full patch-id (which is expensive) to be
compared only for changes that touch the same set of paths.

* kw/patch-ids-optim:
  rebase: avoid computing unnecessary patch IDs
  patch-ids: add flag to create the diff patch id using header only data
  patch-ids: replace the seen indicator with a commit pointer
  patch-ids: stop using a hand-rolled hashmap implementation
2016-08-12 09:47:39 -07:00
Junio C Hamano 2c44b7a53b Merge branch 'js/mv-dir-to-new-directory'
"git mv dir non-existing-dir/" did not work in some environments
the same way as existing mainstream platforms.  The code now moves
"dir" to "non-existing-dir", without relying on rename("A", "B/")
that strips the trailing slash of '/'.

* js/mv-dir-to-new-directory:
  git mv: do not keep slash in `git mv dir non-existing-dir/`
2016-08-12 09:47:37 -07:00
Junio C Hamano 0a315befa7 Merge branch 'rs/use-strbuf-add-unique-abbrev'
A small code clean-up.

* rs/use-strbuf-add-unique-abbrev:
  use strbuf_add_unique_abbrev() for adding short hashes
2016-08-12 09:47:37 -07:00
Junio C Hamano b32d7c524b Merge branch 'rs/merge-add-strategies-simplification'
A small code clean-up.

* rs/merge-add-strategies-simplification:
  merge: use string_list_split() in add_strategies()
2016-08-12 09:47:36 -07:00
Junio C Hamano 18f3ce8841 Merge branch 'rs/child-process-init'
A small code clean-up.

* rs/child-process-init:
  use CHILD_PROCESS_INIT to initialize automatic variables
2016-08-12 09:47:36 -07:00
Junio C Hamano 2f9c615efb Merge branch 'sb/submodule-clone-retry'
Fix-up to an error codepath in a topic already in 'master'.

* sb/submodule-clone-retry:
  submodule--helper: use parallel processor correctly
2016-08-12 09:47:34 -07:00
Junio C Hamano f4fd627661 Merge branch 'jk/reset-ident-time-per-commit' into maint
Not-so-recent rewrite of "git am" that started making internal
calls into the commit machinery had an unintended regression, in
that no matter how many seconds it took to apply many patches, the
resulting committer timestamp for the resulting commits were all
the same.

* jk/reset-ident-time-per-commit:
  am: reset cached ident date for each patch
2016-08-12 09:16:56 -07:00
Kevin Willford b3dfeebb92 rebase: avoid computing unnecessary patch IDs
The `rebase` family of Git commands avoid applying patches that were
already integrated upstream. They do that by using the revision walking
option that computes the patch IDs of the two sides of the rebase
(local-only patches vs upstream-only ones) and skipping those local
patches whose patch ID matches one of the upstream ones.

In many cases, this causes unnecessary churn, as already the set of
paths touched by a given commit would suffice to determine that an
upstream patch has no local equivalent.

This hurts performance in particular when there are a lot of upstream
patches, and/or large ones.

Therefore, let's introduce the concept of a "diff-header-only" patch ID,
compare those first, and only evaluate the "full" patch ID lazily.

Please note that in contrast to the "full" patch IDs, those
"diff-header-only" patch IDs are prone to collide with one another, as
adjacent commits frequently touch the very same files. Hence we now
have to be careful to allow multiple hash entries with the same hash.
We accomplish that by using the hashmap_add() function that does not even
test for hash collisions.  This also allows us to evaluate the full patch ID
lazily, i.e. only when we found commits with matching diff-header-only
patch IDs.

We add a performance test that demonstrates ~1-6% improvement.  In
practice this will depend on various factors such as how many upstream
changes and how big those changes are along with whether file system
caches are cold or warm.  As Git's test suite has no way of catching
performance regressions, we also add a regression test that verifies
that the full patch ID computation is skipped when the diff-header-only
computation suffices.

Signed-off-by: Kevin Willford <kcwillford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 14:39:16 -07:00
Christian Couder ccceb7bb13 builtin/apply: make write_out_results() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", write_out_results() should return -1 instead of
calling exit().

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 434389deb1 builtin/apply: make write_out_one_result() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", write_out_one_result() should just return what
remove_file() and create_file() are returning instead of calling
exit().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 8f5b5445d7 builtin/apply: make create_file() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", create_file() should just return what
add_conflicted_stages_file() and add_index_file() are returning
instead of calling exit().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 69e1609f81 builtin/apply: make add_index_file() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", add_index_file() should return -1 instead of
calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder a902edceeb builtin/apply: make add_conflicted_stages_file() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", add_conflicted_stages_file() should return -1
instead of calling die().

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 6e8df31469 builtin/apply: make remove_file() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", remove_file() should return -1 instead of
calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder fe41b80225 builtin/apply: make build_fake_ancestor() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", build_fake_ancestor() should return -1 instead
of calling die().

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 119ab159e6 builtin/apply: change die_on_unsafe_path() to check_unsafe_path()
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", die_on_unsafe_path() should return a negative
integer instead of calling die(), so while doing that let's change
its name to check_unsafe_path().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder dbf1b5fb6a builtin/apply: make gitdiff_*() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", gitdiff_*() functions should return -1 instead
of calling die().

A previous patch made it possible for gitdiff_*() functions to
return -1 in case of error. Let's take advantage of that to
make gitdiff_verify_name() return -1 on error, and to have
gitdiff_oldname() and gitdiff_newname() directly return
what gitdiff_verify_name() returns.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 70af7662d4 builtin/apply: make gitdiff_*() return 1 at end of header
The gitdiff_*() functions that are called as p->fn() in parse_git_header()
should return 1 instead of -1 in case of end of header or unrecognized
input, as these are not real errors. It just instructs the parser to break
out.

This makes it possible for gitdiff_*() functions to return -1 in case of a
real error. This will be done in a following patch.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 9724e6ff48 builtin/apply: make parse_traditional_patch() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", parse_traditional_patch() should return -1
instead of calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder fef7ba5353 builtin/apply: make apply_all_patches() return 128 or 1 on error
To finish libifying the apply functionality, apply_all_patches() should not
die() or exit() in case of error, but return either 128 or 1, so that it
gives the same exit code as when die() or exit(1) is called. This way
scripts relying on the exit code don't need to be changed.

While doing that we must take care that file descriptors are properly closed
and, if needed, reset to a sensible value.

Also, according to the lockfile API, when finished with a lockfile, one
should either commit it or roll it back.

This is even more important now that the same lockfile can be passed
to init_apply_state() many times to be reused by series of calls to
the apply lib functions.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder b6446d54ec builtin/apply: move check_apply_state() to apply.c
To libify `git apply` functionality we must make check_apply_state()
usable outside "builtin/apply.c".

Let's do that by moving it into "apply.c".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder f36538d88b builtin/apply: make check_apply_state() return -1 instead of die()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", check_apply_state() should return -1 instead of
calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder 2f5a6d1218 apply: make init_apply_state() return -1 instead of exit()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", init_apply_state() should return -1 instead of
calling exit().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder bb493a5c14 builtin/apply: move init_apply_state() to apply.c
To libify `git apply` functionality we must make init_apply_state()
usable outside "builtin/apply.c".

Let's do that by moving it into a new "apply.c".

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder f95fdc256b builtin/apply: make parse_ignorewhitespace_option() return -1 instead of die()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", parse_ignorewhitespace_option() should return
-1 instead of calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:47 -07:00
Christian Couder aaf6c447aa builtin/apply: make parse_whitespace_option() return -1 instead of die()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_whitespace_option() should return -1 instead
of calling die().

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder dae197f753 builtin/apply: make parse_single_patch() return -1 on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_single_patch() should return a negative
integer instead of calling die().

Let's do that by using error() and let's adjust the related test
cases accordingly.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder b654b34c1c builtin/apply: make parse_chunk() return a negative integer on error
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing or exit()ing.

To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_chunk() should return a negative integer
instead of calling die() or exit().

As parse_chunk() is called only by apply_patch() which already
returns either -1 or -128 when an error happened, let's make it also
return -1 or -128.

This makes it compatible with what find_header() and parse_binary()
already return.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder 5950851e44 builtin/apply: make find_header() return -128 instead of die()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.

To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, let's make find_header() return -128 instead of
calling die().

We could make it return -1, unfortunately find_header() already
returns -1 when no header is found.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder 3bee345d7b builtin/apply: read_patch_file() return -1 instead of die()ing
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing. Let's do that by returning -1 instead of
die()ing in read_patch_file().

Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder f07a9f7643 builtin/apply: make apply_patch() return -1 or -128 instead of die()ing
To libify `git apply` functionality we have to signal errors
to the caller instead of die()ing.

As a first step in this direction, let's make apply_patch() return
-1 or -128 in case of errors instead of dying. For now its only
caller apply_all_patches() will exit(128) when apply_patch()
return -128 and it will exit(1) when it returns -1.

We exit() with code 128 because that was what die() was doing
and we want to keep the distinction between exiting with code 1
and exiting with code 128.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder 71501a71d0 apply: move 'struct apply_state' to apply.h
To libify `git apply` functionality we must make 'struct apply_state'
usable outside "builtin/apply.c".

Let's do that by creating a new "apply.h" and moving
'struct apply_state' there.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:46 -07:00
Christian Couder 4d5acae0ca apply: make some names more specific
To prepare for some structs and constants being moved from
builtin/apply.c to apply.h, we should give them some more
specific names to avoid possible name collisions in the global
namespace.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 12:41:09 -07:00
Jeff King 07e7dbf0db gc: default aggressive depth to 50
This commit message is long and has lots of background and
numbers. The summary is: the current default of 250 doesn't
save much space, and costs CPU. It's not a good tradeoff.
Read on for details.

The "--aggressive" flag to git-gc does three things:

  1. use "-f" to throw out existing deltas and recompute from
     scratch

  2. use "--window=250" to look harder for deltas

  3. use "--depth=250" to make longer delta chains

Items (1) and (2) are good matches for an "aggressive"
repack. They ask the repack to do more computation work in
the hopes of getting a better pack. You pay the costs during
the repack, and other operations see only the benefit.

Item (3) is not so clear. Allowing longer chains means fewer
restrictions on the deltas, which means potentially finding
better ones and saving some space. But it also means that
operations which access the deltas have to follow longer
chains, which affects their performance. So it's a tradeoff,
and it's not clear that the tradeoff is even a good one.

The existing "250" numbers for "--aggressive" come
originally from this thread:

  http://public-inbox.org/git/alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org/

where Linus says:

  So when I said "--depth=250 --window=250", I chose those
  numbers more as an example of extremely aggressive
  packing, and I'm not at all sure that the end result is
  necessarily wonderfully usable. It's going to save disk
  space (and network bandwidth - the delta's will be re-used
  for the network protocol too!), but there are definitely
  downsides too, and using long delta chains may
  simply not be worth it in practice.

There are some numbers in that thread, but they're mostly
focused on the improved window size, and measure the
improvement from --depth=250 and --window=250 together.
E.g.:

  http://public-inbox.org/git/9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com/

talks about the improved run-time of "git-blame", which
comes from the reduced pack size. But most of that reduction
is coming from --window=250, whereas most of the extra costs
come from --depth=250. There's a link in that thread showing
that increasing the depth beyond 50 doesn't seem to help
much with the size:

  https://vcscompare.blogspot.com/2008/06/git-repack-parameters.html

but again, no discussion of the timing impact.

In an earlier thread from Ted Ts'o which discussed setting
the non-aggressive default (from 10 to 50):

  http://public-inbox.org/git/20070509134958.GA21489%40thunk.org/

we have more numbers, with the conclusion that going past 50
does not help size much, and hurts the speed of normal
operations.

So from that, we might guess that 50 is actually a sweet
spot, even for aggressive, if we interpret aggressive to
"spend time now to make a better pack". It is not clear that
"--depth=250" is actually a better pack. It may be slightly
_smaller_, but it carries a run-time penalty.

Here are some more recent timings I did to verify that. They
show three things:

  - the size of the resulting pack (so disk saved to store,
    bandwidth saved on clones/fetches)

  - the cost of "rev-list --objects --all", which shows the
    effect of the delta chains on trees (commits typically
    don't delta, and the command doesn't touch the blobs at
    all)

  - the cost of "log -Sfoo", which will additionally access
    each blob

All cases were repacked with "git repack -adf --depth=$d
--window=250" (so basically, what would happen if we tweaked
the "gc --aggressive" default depth).

The timings are all wall-clock best-of-3. The machine itself
has plenty of RAM compared to the repositories (which is
probably typical of most workstations these days), so we're
really measuring CPU usage, as the whole thing will be in
disk cache after the first run.

The core.deltaBaseCacheLimit is at its default of 96MiB.
It's possible that tweaking it would have some impact on the
tests, as some of them (especially "log -S" on a large repo)
are likely to overflow that. But bumping that carries a
run-time memory cost, so for these tests, I focused on what
we could do just with the on-disk pack tradeoffs.

Each test is done for four depths: 250 (the current value),
50 (the current default that tested well previously), 100
(to show something on the larger side, which previous tests
showed was not a good tradeoff), and 10 (the very old
default, which previous tests showed was worse than 50).

Here are the numbers for linux.git:

   depth |  size |  %    | rev-list |  %     | log -Sfoo |   %
  -------+-------+-------+----------+--------+-----------+-------
    250  | 967MB |  n/a  | 48.159s  |   n/a  | 378.088   |   n/a
    100  | 971MB | +0.4% | 41.471s  | -13.9% | 342.060   |  -9.5%
     50  | 979MB | +1.2% | 37.778s  | -21.6% | 311.040s  | -17.7%
     10  | 1.1GB | +6.6% | 32.518s  | -32.5% | 279.890s  | -25.9%

and for git.git:

   depth |  size |  %    | rev-list |  %     | log -Sfoo |   %
  -------+-------+-------+----------+--------+-----------+-------
    250  |  48MB |  n/a  |  2.215s  |   n/a  |  20.922s  |   n/a
    100  |  49MB | +0.5% |  2.140s  |  -3.4% |  17.736s  | -15.2%
     50  |  49MB | +1.7% |  2.099s  |  -5.2% |  15.418s  | -26.3%
     10  |  53MB | +9.3% |  2.001s  |  -9.7% |  12.677s  | -39.4%

You can see that that the CPU savings for regular operations improves as we
decrease the depth. The savings are less for "rev-list" on a smaller repository
than they are for blob-accessing operations, or even rev-list on a larger
repository. This may mean that a larger delta cache would help (though setting
core.deltaBaseCacheLimit by itself doesn't).

But we can also see that the space savings are not that great as the depth goes
higher. Saving 5-10% between 10 and 50 is probably worth the CPU tradeoff.
Saving 1% to go from 50 to 100, or another 0.5% to go from 100 to 250 is
probably not.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 11:53:19 -07:00
Jeff Hostetler d9fc746cd7 status: print branch info with --porcelain=v2 --branch
Expand porcelain v2 output to include branch and tracking
branch information. This includes the commit id, the branch,
the upstream branch, and the ahead and behind counts.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 11:15:40 -07:00
Jeff Hostetler 1ecdecce62 status: collect per-file data for --porcelain=v2
Collect extra per-file data for porcelain V2 format.

The output of `git status --porcelain` leaves out many
details about the current status that clients might like
to have.  This can force them to be less efficient as they
may need to launch secondary commands (and try to match
the logic within git) to accumulate this extra information.
For example, a GUI IDE might want the file mode to display
the correct icon for a changed item (without having to stat
it afterwards).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 11:14:43 -07:00
Jeff King c9af708b1a pack-objects: use mru list when iterating over packs
In the original implementation of want_object_in_pack(), we
always looked for the object in every pack, so the order did
not matter for performance.

As of the last few patches, however, we can now often break
out of the loop early after finding the first instance, and
avoid looking in the other packs at all. In this case, pack
order can make a big difference, because we'd like to find
the objects by looking at as few packs as possible.

This patch switches us to the same packed_git_mru list that
is now used by normal object lookups.

Here are timings for p5303 on linux.git:

Test                      HEAD^                HEAD
------------------------------------------------------------------------
5303.3: rev-list (1)      31.31(31.07+0.23)    31.28(31.00+0.27) -0.1%
5303.4: repack (1)        40.35(38.84+2.60)    40.53(39.31+2.32) +0.4%
5303.6: rev-list (50)     31.37(31.15+0.21)    31.41(31.16+0.24) +0.1%
5303.7: repack (50)       58.25(68.54+2.03)    47.28(57.66+1.89) -18.8%
5303.9: rev-list (1000)   31.91(31.57+0.33)    31.93(31.64+0.28) +0.1%
5303.10: repack (1000)    304.80(376.00+3.92)  87.21(159.54+2.84) -71.4%

The rev-list numbers are unchanged, which makes sense (they
are not exercising this code at all). The 50- and 1000-pack
repack cases show considerable improvement.

The single-pack repack case doesn't, of course; there's
nothing to improve. In fact, it gives us a baseline for how
fast we could possibly go. You can see that though rev-list
can approach the single-pack case even with 1000 packs,
repack doesn't. The reason is simple: the loop we are
optimizing is only part of what the repack is doing. After
the "counting" phase, we do delta compression, which is much
more expensive when there are multiple packs, because we
have fewer deltas we can reuse (you can also see that these
numbers come from a multicore machine; the CPU times are
much higher than the wall-clock times due to the delta
phase).

So the good news is that in cases with many packs, we used
to be dominated by the "counting" phase, and now we are
dominated by the delta compression (which is faster, and
which we have already parallelized).

Here are similar numbers for git.git:

Test                      HEAD^               HEAD
---------------------------------------------------------------------
5303.3: rev-list (1)      1.55(1.51+0.02)     1.54(1.53+0.00) -0.6%
5303.4: repack (1)        1.82(1.80+0.08)     1.82(1.78+0.09) +0.0%
5303.6: rev-list (50)     1.58(1.57+0.00)     1.58(1.56+0.01) +0.0%
5303.7: repack (50)       2.50(3.12+0.07)     2.31(2.95+0.06) -7.6%
5303.9: rev-list (1000)   2.22(2.20+0.02)     2.23(2.19+0.03) +0.5%
5303.10: repack (1000)    10.47(16.78+0.22)   7.50(13.76+0.22) -28.4%

Not as impressive in terms of percentage, but still
measurable wins.  If you look at the wall-clock time
improvements in the 1000-pack case, you can see that linux
improved by roughly 10x as many seconds as git. That's
because it has roughly 10x as many objects, and we'd expect
this improvement to scale linearly with the number of
objects (since the number of packs is kept constant). It's
just that the "counting" phase is a smaller percentage of
the total time spent for a git.git repack, and hence the
percentage win is smaller.

The implementation itself is a straightforward use of the
MRU code. We only bother marking a pack as used when we know
that we are able to break early out of the loop, for two
reasons:

  1. If we can't break out early, it does no good; we have
     to visit each pack anyway, so we might as well avoid
     even the minor overhead of managing the cache order.

  2. The mru_mark() function reorders the list, which would
     screw up our traversal. So it is only safe to mark when
     we are about to break out of the loop. We could record
     the found pack and mark it after the loop finishes, of
     course, but that's more complicated and it doesn't buy
     us anything due to (1).

Note that this reordering does have a potential impact on
the final pack, as we store only a single "found" pack for
each object, even if it is present in multiple packs. In
principle, any copy is acceptable, as they all refer to the
same content. But in practice, they may differ in whether
they are stored as deltas, against which base, etc. This may
have an impact on delta reuse, and even the delta search
(since we skip pairs that were already in the same pack).

It's not clear whether this change of order would hurt or
even help average cases, though. The most likely reason to
have duplicate objects is from the completion of thin packs
(e.g., you have some objects in a "base" pack, then receive
several pushes; the packs you receive may be thin on the
wire, with deltas that refer to bases outside the pack, but
we complete them with duplicate base objects when indexing
them).

In such a case the current code would always find the thin
duplicates (because we currently walk the packs in reverse
chronological order). Whereas with this patch, some of those
duplicates would be found in the base pack instead.

In my tests repacking a real-world case of linux.git with
3600 thin-pack pushes (on top of a large "base" pack), the
resulting pack was about 0.04% larger with this patch. On
the other hand, because we were more likely to hit the base
pack, there were more opportunities for delta reuse, and we
had 50,000 fewer objects to examine in the delta search.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 10:44:23 -07:00
Jeff King 4cf2143e02 pack-objects: break delta cycles before delta-search phase
We do not allow cycles in the delta graph of a pack (i.e., A
is a delta of B which is a delta of A) for the obvious
reason that you cannot actually access any of the objects in
such a case.

There's a last-ditch attempt to notice cycles during the
write phase, during which we issue a warning to the user and
write one of the objects out in full. However, this is
"last-ditch" for two reasons:

  1. By this time, it's too late to find another delta for
     the object, so the resulting pack is larger than it
     otherwise could be.

  2. The warning is there because this is something that
     _shouldn't_ ever happen. If it does, then either:

       a. a pack we are reusing deltas from had its own
          cycle

       b. we are reusing deltas from multiple packs, and
          we found a cycle among them (i.e., A is a delta of
          B in one pack, but B is a delta of A in another,
          and we choose to use both deltas).

       c. there is a bug in the delta-search code

     So this code serves as a final check that none of these
     things has happened, warns the user, and prevents us
     from writing a bogus pack.

Right now, (2b) should never happen because of the static
ordering of packs in want_object_in_pack(). If two objects
have a delta relationship, then they must be in the same
pack, and therefore we will find them from that same pack.

However, a future patch would like to change that static
ordering, which will make (2b) a common occurrence. In
preparation, we should be able to handle those kinds of
cycles better. This patch does by introducing a
cycle-breaking step during the get_object_details() phase,
when we are deciding which deltas can be reused. That gives
us the chance to feed the objects into the delta search as
if the cycle did not exist.

We'll leave the detection and warning in the write_object()
phase in place, as it still serves as a check for case (2c).

This does mean we will stop warning for (2a). That case is
caused by bogus input packs, and we ideally would warn the
user about it.  However, since those cycles show up after
picking reusable deltas, they look the same as (2b) to us;
our new code will break the cycles early and the last-ditch
check will never see them.

We could do analysis on any cycles that we find to
distinguish the two cases (i.e., it is a bogus pack if and
only if every delta in the cycle is in the same pack), but
we don't need to. If there is a cycle inside a pack, we'll
run into problems not only reusing the delta, but accessing
the object data at all. So when we try to dig up the actual
size of the object, we'll hit that same cycle and kick in
our usual complain-and-try-another-source code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 10:44:13 -07:00
Jeff King 27b5c1a065 provide an initializer for "struct object_info"
An all-zero initializer is fine for this struct, but because
the first element is a pointer, call sites need to know to
use "NULL" instead of "0". Otherwise some static checkers
like "sparse" will complain; see d099b71 (Fix some sparse
warnings, 2013-07-18) for example.  So let's provide an
initializer to make this easier to get right.

But let's also comment that memset() to zero is explicitly
OK[1]. One of the callers embeds object_info in another
struct which is initialized via memset (expand_data in
builtin/cat-file.c). Since our subset of C doesn't allow
assignment from a compound literal, handling this in any
other way is awkward, so we'd like to keep the ability to
initialize by memset(). By documenting this property, it
should make anybody who wants to change the initializer
think twice before doing so.

There's one other caller of interest. In parse_sha1_header(),
we did not initialize the struct fully in the first place.
This turned out not to be a bug because the sub-function it
calls does not look at any other fields except the ones we
did initialize. But that assumption might not hold in the
future, so it's a dangerous construct. This patch switches
it to initializing the whole struct, which protects us
against unexpected reads of the other fields.

[1] Obviously using memset() to initialize a pointer
    violates the C standard, but we long ago decided that it
    was an acceptable tradeoff in the real world.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 10:42:23 -07:00
Junio C Hamano 11b53957ac Merge branch 'sb/submodule-update-dot-branch'
A few updates to "git submodule update".

Use of "| wc -l" break with BSD variant of 'wc'.

* sb/submodule-update-dot-branch:
  t7406: fix breakage on OSX
  submodule update: allow '.' for branch value
  submodule--helper: add remote-branch helper
  submodule-config: keep configured branch around
  submodule--helper: fix usage string for relative-path
  submodule update: narrow scope of local variable
  submodule update: respect depth in subsequent fetches
  t7406: future proof tests with hard coded depth
2016-08-10 12:33:20 -07:00
Junio C Hamano 1a5f1a3f25 Merge branch 'js/am-3-merge-recursive-direct'
"git am -3" calls "git merge-recursive" when it needs to fall back
to a three-way merge; this call has been turned into an internal
subroutine call instead of spawning a separate subprocess.

* js/am-3-merge-recursive-direct:
  merge-recursive: flush output buffer even when erroring out
  merge_trees(): ensure that the callers release output buffer
  merge-recursive: offer an option to retain the output in 'obuf'
  merge-recursive: write the commit title in one go
  merge-recursive: flush output buffer before printing error messages
  am -3: use merge_recursive() directly again
  merge-recursive: switch to returning errors instead of dying
  merge-recursive: handle return values indicating errors
  merge-recursive: allow write_tree_from_memory() to error out
  merge-recursive: avoid returning a wholesale struct
  merge_recursive: abort properly upon errors
  prepare the builtins for a libified merge_recursive()
  merge-recursive: clarify code in was_tracked()
  die(_("BUG")): avoid translating bug messages
  die("bug"): report bugs consistently
  t5520: verify that `pull --rebase` shows the helpful advice when failing
2016-08-10 12:33:20 -07:00
Junio C Hamano 7a3ea66633 Merge branch 'js/commit-slab-decl-fix'
* js/commit-slab-decl-fix:
  commit-slab.h: avoid duplicated global static variables
  config.c: avoid duplicated global static variables
2016-08-10 12:33:20 -07:00
Junio C Hamano db40a62239 Merge branch 'jt/format-patch-from-config'
"git format-patch" learned format.from configuration variable to
specify the default settings for its "--from" option.

* jt/format-patch-from-config:
  format-patch: format.from gives the default for --from
2016-08-10 12:33:18 -07:00
Junio C Hamano 24fbe00490 Merge branch 'jk/reset-ident-time-per-commit'
Not-so-recent rewrite of "git am" that started making internal
calls into the commit machinery had an unintended regression, in
that no matter how many seconds it took to apply many patches, the
resulting committer timestamp for the resulting commits were all
the same.

* jk/reset-ident-time-per-commit:
  am: reset cached ident date for each patch
2016-08-10 12:33:17 -07:00
Junio C Hamano 574a31b5b7 Merge branch 'rs/use-strbuf-addstr' into maint
* rs/use-strbuf-addstr:
  use strbuf_addstr() instead of strbuf_addf() with "%s"
  use strbuf_addstr() for adding constant strings to a strbuf
2016-08-10 11:55:34 -07:00
Junio C Hamano d9d7ab3b1d Merge branch 'os/no-verify-skips-commit-msg-too' into maint
"git commit --help" said "--no-verify" is only about skipping the
pre-commit hook, and failed to say that it also skipped the
commit-msg hook.

* os/no-verify-skips-commit-msg-too:
  commit: describe that --no-verify skips the commit-msg hook in the help text
2016-08-10 11:55:25 -07:00
Junio C Hamano b7fb136bf6 Merge branch 'rs/rm-strbuf-optim' into maint
The use of strbuf in "git rm" to build filename to remove was a bit
suboptimal, which has been fixed.

* rs/rm-strbuf-optim:
  rm: reuse strbuf for all remove_dir_recursively() calls
2016-08-10 11:55:24 -07:00
Junio C Hamano 60b84ba26c Merge branch 'jk/parse-options-concat' into maint
Users of the parse_options_concat() API function need to allocate
extra slots in advance and fill them with OPT_END() when they want
to decide the set of supported options dynamically, which makes the
code error-prone and hard to read.  This has been corrected by tweaking
the API to allocate and return a new copy of "struct option" array.

* jk/parse-options-concat:
  parse_options: allocate a new array when concatenating
2016-08-10 11:55:24 -07:00
Stefan Beller 2201ee09b5 submodule--helper: use parallel processor correctly
When developing another patch series I had a temporary state in
which git-clone would segfault, when the call was prepared in
prepare_to_clone_next_submodule. This lead to the call failing,
i.e. in `update_clone_task_finished` the task was scheduled to be
tried again.  The second call to prepare_to_clone_next_submodule
would return 0, as the segfaulted clone did create the .git file
already, such that was not considered to need to be cloned again. I
was seeing the "BUG: ce was a submodule before?\n" message, which
was the correct behavior at the time as my local code was
buggy. When trying to debug this failure, I tried to use printing
messages into the strbuf that is passed around, but these messages
were never printed as the die(..) doesn't flush the `err` strbuf.

When implementing the die() in 665b35ecc (2016-06-09, "submodule--helper:
initial clone learns retry logic"), I considered this condition to be
a severe condition, which should lead to an immediate abort as we do not
trust ourselves any more. However the queued messages in `err` are valuable
so let's not toss them out by immediately dying, but a graceful return.

Another thing to note: The error message itself was misleading. A return
value of 0 doesn't indicate the passed in `ce` is not a submodule any more,
but just that we do not consider cloning it any more.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-09 14:54:16 -07:00
Johannes Sixt dc29ddebb9 config.c: avoid duplicated global static variables
Repeating the definition of a static variable seems to be valid in C.
Nevertheless, it is bad style because it can cause confusion, definitely
when it becomes necessary to change the type.

d64ec16 (git config: reorganize to use parseopt, 2009-02-21) added two
static variables near the top of the file config.c without removing the
definitions of the two variables that occurs later in the file.

The two variables were needed earlier in the file in the newly
introduced parseopt structure. These references were removed later in
d0e08d6 (config: fix parsing of "git config --get-color some.key -1",
2014-11-20).

Remove the redundant, younger, definitions near the top of the file and
keep the original definitions that occur later.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-09 10:19:24 -07:00
Junio C Hamano 19492555ca Merge branch 'jk/parseopt-string-list'
A small memory leak in the command line parsing of "git blame"
has been plugged.

* jk/parseopt-string-list:
  blame: drop strdup of string literal
2016-08-08 14:48:44 -07:00
Junio C Hamano 940622bc8b Merge branch 'rs/use-strbuf-addstr'
* rs/use-strbuf-addstr:
  use strbuf_addstr() instead of strbuf_addf() with "%s"
  use strbuf_addstr() for adding constant strings to a strbuf
2016-08-08 14:48:41 -07:00
Junio C Hamano 78849622ec Merge branch 'jk/pack-objects-optim'
"git pack-objects" has a few options that tell it not to pack
objects found in certain packfiles, which require it to scan .idx
files of all available packs.  The codepaths involved in these
operations have been optimized for a common case of not having any
non-local pack and/or any .kept pack.

* jk/pack-objects-optim:
  pack-objects: compute local/ignore_pack_keep early
  pack-objects: break out of want_object loop early
  find_pack_entry: replace last_found_pack with MRU cache
  add generic most-recently-used list
  sha1_file: drop free_pack_by_name
  t/perf: add tests for many-pack scenarios
2016-08-08 14:48:39 -07:00
Junio C Hamano 768ededa9c Merge branch 'va/i18n'
More i18n marking.

* va/i18n:
  i18n: config: unfold error messages marked for translation
  i18n: notes: mark comment for translation
2016-08-08 14:48:38 -07:00
Junio C Hamano 0d3279962a Merge branch 'jk/reflog-date'
The reflog output format is documented better, and a new format
--date=unix to report the seconds-since-epoch (without timezone)
has been added.

* jk/reflog-date:
  date: clarify --date=raw description
  date: add "unix" format
  date: document and test "raw-local" mode
  doc/pretty-formats: explain shortening of %gd
  doc/pretty-formats: describe index/time formats for %gd
  doc/rev-list-options: explain "-g" output formats
  doc/rev-list-options: clarify "commit@{Nth}" for "-g" option
2016-08-08 14:48:37 -07:00
Junio C Hamano a220e2bbbf Merge branch 'pb/commit-editmsg-path' into maint
Code clean-up.

* pb/commit-editmsg-path:
  builtin/commit.c: memoize git-path for COMMIT_EDITMSG
2016-08-08 14:21:38 -07:00
Junio C Hamano aa9136a87e Merge branch 'nd/pack-ofs-4gb-limit' into maint
"git pack-objects" and "git index-pack" mostly operate with off_t
when talking about the offset of objects in a packfile, but there
were a handful of places that used "unsigned long" to hold that
value, leading to an unintended truncation.

* nd/pack-ofs-4gb-limit:
  fsck: use streaming interface for large blobs in pack
  pack-objects: do not truncate result in-pack object size on 32-bit systems
  index-pack: correct "offset" type in unpack_entry_data()
  index-pack: report correct bad object offsets even if they are large
  index-pack: correct "len" type in unpack_data()
  sha1_file.c: use type off_t* for object_info->disk_sizep
  pack-objects: pass length to check_pack_crc() without truncation
2016-08-08 14:21:36 -07:00
Junio C Hamano 327b3f8459 Merge branch 'mh/blame-worktree' into maint
"git blame file" allowed the lineage of lines in the uncommitted,
unadded contents of "file" to be inspected, but it refused when
"file" did not appear in the current commit.  When "file" was
created by renaming an existing file (but the change has not been
committed), this restriction was unnecessarily tight.

* mh/blame-worktree:
  t/t8003-blame-corner-cases.sh: Use here documents
  blame: allow to blame paths freshly added to the index
2016-08-08 14:21:32 -07:00
Johannes Schindelin 189d035e67 git mv: do not keep slash in git mv dir non-existing-dir/
When calling `rename("dir", "non-existing-dir/")` on Linux, it silently
succeeds, stripping the trailing slash of the second argument.

This is all good and dandy but this behavior disagrees with the specs at

http://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html

that state clearly regarding the 2nd parameter (called `new`):

	If the `new` argument does not resolve to an existing directory
	entry for a file of type directory and the `new` argument
	contains at least one non- <slash> character and ends with one
	or more trailing <slash> characters after all symbolic links
	have been processed, `rename()` shall fail.

Of course, we would like `git mv dir non-existing-dir/` to succeed (and
rename the directory "dir" to "non-existing-dir"). Let's be extra
careful to remove the trailing slash in that case.

This lets t7001-mv.sh pass in Bash on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-08 10:43:20 -07:00
René Scharfe 1eb47f167d use strbuf_add_unique_abbrev() for adding short hashes
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer.  This is shorter and a bit more efficient.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-06 10:33:57 -07:00
Jeff Hostetler c4f596b98e status: support --porcelain[=<version>]
Update --porcelain argument to take optional version parameter
to allow multiple porcelain formats to be supported in the future.

The token "v1" is the default value and indicates the traditional
porcelain format.  (The token "1" is an alias for that.)

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:46:42 -07:00
Jeff Hostetler be7e795efe status: cleanup API to wt_status_print
Refactor the API between builtin/commit.c and wt-status.[ch].

Hide the details of the various wt_*status_print() routines inside
wt-status.c behind a single (new) wt_status_print() routine.
Eliminate the switch statements from builtin/commit.c.
Allow details of new status formats to be isolated within wt-status.c

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:46:08 -07:00
Jeff Hostetler 957a0fe2e5 status: rename long-format print routines
Rename the various wt_status_print*() routines to be
wt_longstatus_print*() to make it clear that these
routines are only concerned with the normal/long
status output and reduce developer confusion as other
status formats are added in the future.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:45:47 -07:00
René Scharfe 02a8cfa478 merge: use string_list_split() in add_strategies()
Call string_list_split() for cutting a space separated list into pieces
instead of reimplementing it based on struct strategy.  The attr member
of struct strategy was not used split_merge_strategies(); it was a pure
string operation.  Also be nice and clean up once we're done splitting;
the old code didn't bother freeing any of the allocated memory.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:11:06 -07:00
René Scharfe 542aa25d97 use CHILD_PROCESS_INIT to initialize automatic variables
Initialize struct child_process variables already when they're defined.
That's shorter and saves a function call.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:10:05 -07:00
René Scharfe bc57b9c0cc use strbuf_addstr() instead of strbuf_addf() with "%s"
Call strbuf_addstr() for adding a simple string to a strbuf instead of
using the heavier strbuf_addf().  This is shorter and documents the
intent more clearly.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:09:25 -07:00
Junio C Hamano 1e9a4856fb Merge branch 'sb/submodule-clone-retry'
An earlier tweak to make "submodule update" retry a failing clone
of submodules was buggy and caused segfault, which has been fixed.

* sb/submodule-clone-retry:
  submodule-helper: fix indexing in clone retry error reporting path
  git-submodule: forward exit code of git-submodule--helper more faithfully
2016-08-04 14:39:17 -07:00
Stefan Beller 4d7bc52b17 submodule update: allow '.' for branch value
Gerrit has a "superproject subscription" feature[1], that triggers a
commit in a superproject that is subscribed to its submodules.
Conceptually this Gerrit feature can be done on the client side with
Git via (except for raciness, error handling etc):

  while [ true ]; do
    git -C <superproject> submodule update --remote --force
    git -C <superproject> commit -a -m "Update submodules"
    git -C <superproject> push
  done

for each branch in the superproject. To ease the configuration in Gerrit
a special value of "." has been introduced for the submodule.<name>.branch
to mean the same branch as the superproject[2], such that you can create a
new branch on both superproject and the submodule and this feature
continues to work on that new branch.

Now we find projects in the wild with such a .gitmodules file.
The .gitmodules used in these Gerrit projects do not conform
to Gits understanding of how .gitmodules should look like.
This teaches Git to deal gracefully with this syntax as well.

The redefinition of "." does no harm to existing projects unaware of
this change, as "." is an invalid branch name in Git, so we do not
expect such projects to exist.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03 16:13:22 -07:00
Stefan Beller 92bbe7ccf1 submodule--helper: add remote-branch helper
In a later patch we want to enhance the logic for the branch selection.
Rewrite the current logic to be in C, so we can directly use C when
we enhance the logic.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03 16:11:35 -07:00
Junio C Hamano a58a8e3f71 Merge branch 'jk/push-progress'
"git push" and "git clone" learned to give better progress meters
to the end user who is waiting on the terminal.

* jk/push-progress:
  receive-pack: send keepalives during quiet periods
  receive-pack: turn on connectivity progress
  receive-pack: relay connectivity errors to sideband
  receive-pack: turn on index-pack resolving progress
  index-pack: add flag for showing delta-resolution progress
  clone: use a real progress meter for connectivity check
  check_connected: add progress flag
  check_connected: relay errors to alternate descriptor
  check_everything_connected: use a struct with named options
  check_everything_connected: convert to argv_array
  rev-list: add optional progress reporting
  check_everything_connected: always pass --quiet to rev-list
2016-08-03 15:10:28 -07:00
Junio C Hamano d083d420b7 Merge branch 'jk/parse-options-concat'
Users of the parse_options_concat() API function need to allocate
extra slots in advance and fill them with OPT_END() when they want
to decide the set of supported options dynamically, which makes the
code error-prone and hard to read.  This has been corrected by tweaking
the API to allocate and return a new copy of "struct option" array.

* jk/parse-options-concat:
  parse_options: allocate a new array when concatenating
2016-08-03 15:10:25 -07:00
Junio C Hamano cf27c7996e Merge branch 'sb/push-options'
"git push" learned to accept and pass extra options to the
receiving end so that hooks can read and react to them.

* sb/push-options:
  add a test for push options
  push: accept push options
  receive-pack: implement advertising and receiving push options
  push options: {pre,post}-receive hook learns about push options
2016-08-03 15:10:24 -07:00
Junio C Hamano a35031240b Merge branch 'os/no-verify-skips-commit-msg-too'
"git commit --help" said "--no-verify" is only about skipping the
pre-commit hook, and failed to say that it also skipped the
commit-msg hook.

* os/no-verify-skips-commit-msg-too:
  commit: describe that --no-verify skips the commit-msg hook in the help text
2016-08-03 15:10:22 -07:00
Eric Sunshine aa59e14b23 blame: drop strdup of string literal
This strdup was added as part of 58dbfa2 (blame: accept
multiple -L ranges, 2013-08-06) to be consistent with
parse_opt_string_list(), which appends to the same list.

But as of 7a7a517 (parse_opt_string_list: stop allocating
new strings, 2016-06-13), we should stop using strdup (to
match parse_opt_string_list, and for all the reasons
described in that commit; namely that it does nothing useful
and causes us to leak the memory).

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03 08:52:46 -07:00
Jeff King 4d9c7e6f45 am: reset cached ident date for each patch
When we compute the date to go in author/committer lines of
commits, or tagger lines of tags, we get the current date
once and then cache it for the rest of the program.  This is
a good thing in some cases, like "git commit", because it
means we do not racily assign different times to the
author/committer fields of a single commit object.

But as more programs start to make many commits in a single
process (e.g., the recently builtin "git am"), it means that
you'll get long strings of commits with identical committer
timestamps (whereas before, we invoked "git commit" many
times and got true timestamps).

This patch addresses it by letting callers reset the cached
time, which means they'll get a fresh time on their next
call to git_committer_info() or git_author_info(). The first
caller to do so is "git am", which resets the time for each
patch it applies.

It would be nice if we could just do this automatically
before filling in the ident fields of commit and tag
objects. Unfortunately, it's hard to know where a particular
logical operation begins and ends.

For instance, if commit_tree_extended() were to call
reset_ident_date() before getting the committer/author
ident, that doesn't quite work; sometimes the author info is
passed in to us as a parameter, and it may or may not have
come from a previous call to ident_default_date(). So in
those cases, we lose the property that the committer and the
author timestamp always match.

You could similarly put a date-reset at the end of
commit_tree_extended(). That actually works in the current
code base, but it's fragile. It makes the assumption that
after commit_tree_extended() finishes, the caller has no
other operations that would logically want to fall into the
same timestamp.

So instead we provide the tool to easily do the reset, and
let the high-level callers use it to annotate their own
logical operations.

There's no automated test, because it would be inherently
racy (it depends on whether the program takes multiple
seconds to run). But you can see the effect with something
like:

  # make a fake 100-patch series
  top=$(git rev-parse HEAD)
  bottom=$(git rev-list --first-parent -100 HEAD | tail -n 1)
  git log --format=email --reverse --first-parent \
          --binary -m -p $bottom..$top >patch

  # now apply it; this presumably takes multiple seconds
  git checkout --detach $bottom
  git am <patch

  # now count the number of distinct committer times;
  # prior to this patch, there would only be one, but
  # now we'd typically see several.
  git log --format=%ct $bottom.. | sort -u

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Helped-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 14:49:41 -07:00
Stefan Beller 2de26ae1dc submodule--helper: fix usage string for relative-path
Internally we call the underscore version of relative_path, but externally
we present an API with no underscores.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 14:41:53 -07:00
René Scharfe 02962d3684 use strbuf_addstr() for adding constant strings to a strbuf
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls.

In http-push.c it becomes easier to see what's going on without having
to verfiy that the definition of PROPFIND_ALL_REQUEST doesn't contain
any format specifiers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 13:42:10 -07:00
Josh Triplett 6bc6b6c0dc format-patch: format.from gives the default for --from
This helps users who would prefer format-patch to default to --from,
and makes it easier to change the default in the future.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 13:13:02 -07:00
Johannes Schindelin 548009c0d5 merge_trees(): ensure that the callers release output buffer
The recursive merge machinery accumulates its output in an output
buffer, to be flushed at the end of merge_recursive(). At this point,
we forgot to release the output buffer.

When calling merge_trees() (i.e. the non-recursive part of the recursive
merge) directly, the output buffer is never flushed because the caller
may be merge_recursive() which wants to flush the output itself.

For the same reason, merge_trees() cannot release the output buffer: it
may still be needed.

Forgetting to release the output buffer did not matter much when running
git-checkout, or git-merge-recursive, because we exited after the
operation anyway. Ever since cherry-pick learned to pick a commit range,
however, this memory leak had the potential of becoming a problem.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01 11:45:30 -07:00
Jeff King 56dfeb6263 pack-objects: compute local/ignore_pack_keep early
In want_object_in_pack(), we can exit early from our loop if
neither "local" nor "ignore_pack_keep" are set. If they are,
however, we must examine each pack to see if it has the
object and is non-local or has a ".keep".

It's quite common for there to be no non-local or .keep
packs at all, in which case we know ahead of time that
looking further will be pointless. We can pre-compute this
by simply iterating over the list of packs ahead of time,
and dropping the flags if there are no packs that could
match.

Another similar strategy would be to modify the loop in
want_object_in_pack() to notice that we have already found
the object once, and that we are looping only to check for
"local" and "keep" attributes. If a pack has neither of
those, we can skip the call to find_pack_entry_one(), which
is the expensive part of the loop.

This has two advantages:

  - it isn't all-or-nothing; we still get some improvement
    when there's a small number of kept or non-local packs,
    and a large number of non-kept local packs

  - it eliminates any possible race where we add new
    non-local or kept packs after our initial scan. In
    practice, I don't think this race matters; we already
    cache the packed_git information, so somebody who adds a
    new pack or .keep file after we've started will not be
    noticed at all, unless we happen to need to call
    reprepare_packed_git() because a lookup fails.

    In other words, we're already racy, and the race is not
    a big deal (losing the race means we might include an
    object in the pack that would not otherwise be, which is
    an acceptable outcome).

However, it also has a disadvantage: we still loop over the
rest of the packs for each object to check their flags. This
is much less expensive than doing the object lookup, but
still not free. So if we wanted to implement that strategy
to cover the non-all-or-nothing cases, we could do so in
addition to this one (so you get the most speedup in the
all-or-nothing case, and the best we can do in the other
cases). But given that the all-or-nothing case is likely the
most common, it is probably not worth the trouble, and we
can revisit this later if evidence points otherwise.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-29 11:05:08 -07:00
Jeff King cd37996795 pack-objects: break out of want_object loop early
When pack-objects collects the list of objects to pack
(either from stdin, or via its internal rev-list), it
filters each one through want_object_in_pack().

This function loops through each existing packfile, looking
for the object. When we find it, we mark the pack/offset
combo for later use. However, we can't just return "yes, we
want it" at that point. If --honor-pack-keep is in effect,
we must keep looking to find it in _all_ packs, to make sure
none of them has a .keep. Likewise, if --local is in effect,
we must make sure it is not present in any non-local pack.

As a result, the sum effort of these calls is effectively
O(nr_objects * nr_packs). In an ordinary repository, we have
only a handful of packs, and this doesn't make a big
difference. But in pathological cases, it can slow the
counting phase to a crawl.

This patch notices the case that we have neither "--local"
nor "--honor-pack-keep" in effect and breaks out of the loop
early, after finding the first instance. Note that our worst
case is still "objects * packs" (i.e., we might find each
object in the last pack we look in), but in practice we will
often break out early. On an "average" repo, my git.git with
8 packs, this shows a modest 2% (a few dozen milliseconds)
improvement in the counting-objects phase of "git
pack-objects --all <foo" (hackily instrumented by sticking
exit(0) right after list_objects).

But in a much more pathological case, it makes a bigger
difference. I ran the same command on a real-world example
with ~9 million objects across 1300 packs. The counting time
dropped from 413s to 45s, an improvement of about 89%.

Note that this patch won't do anything by itself for a
normal "git gc", as it uses both --honor-pack-keep and
--local.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-29 11:05:07 -07:00
Junio C Hamano c81d283675 Merge branch 'dk/blame-move-no-reason-for-1-line-context' into maint
"git blame -M" missed a single line that was moved within the file.

* dk/blame-move-no-reason-for-1-line-context:
  blame: require 0 context lines while finding moved lines with -M
2016-07-28 11:26:01 -07:00
Junio C Hamano 174f9e622f Merge branch 'js/am-call-theirs-theirs-in-fallback-3way' into maint
One part of "git am" had an oddball helper function that called
stuff from outside "his" as opposed to calling what we have "ours",
which was not gender-neutral and also inconsistent with the rest of
the system where outside stuff is usuall called "theirs" in
contrast to "ours".

* js/am-call-theirs-theirs-in-fallback-3way:
  am: counteract gender bias
2016-07-28 11:25:59 -07:00
Junio C Hamano 87be95b6f9 Merge branch 'ew/gc-auto-pack-limit-fix' into maint
"gc.autoPackLimit" when set to 1 should not trigger a repacking
when there is only one pack, but the code counted poorly and did
so.

* ew/gc-auto-pack-limit-fix:
  gc: fix off-by-one error with gc.autoPackLimit
2016-07-28 11:25:56 -07:00
Junio C Hamano c12c71fabb Merge branch 'nd/ita-cleanup' into maint
Git does not know what the contents in the index should be for a
path added with "git add -N" yet, so "git grep --cached" should not
show hits (or show lack of hits, with -L) in such a path, but that
logic does not apply to "git grep", i.e. searching in the working
tree files.  But we did so by mistake, which has been corrected.

* nd/ita-cleanup:
  grep: fix grepping for "intent to add" files
  t7810-grep.sh: fix a whitespace inconsistency
  t7810-grep.sh: fix duplicated test name
2016-07-28 11:25:51 -07:00
Junio C Hamano 4966b58f3e Merge branch 'js/find-commit-subject-ignore-leading-blanks' into maint
A helper function that takes the contents of a commit object and
finds its subject line did not ignore leading blank lines, as is
commonly done by other codepaths.  Make it ignore leading blank
lines to match.

* js/find-commit-subject-ignore-leading-blanks:
  reset --hard: skip blank lines when reporting the commit subject
  sequencer: use skip_blank_lines() to find the commit subject
  commit -C: skip blank lines at the beginning of the message
  commit.c: make find_commit_subject() more robust
  pretty: make the skip_blank_lines() function public
2016-07-28 11:25:50 -07:00
Junio C Hamano ad2d777604 Merge branch 'nd/pack-ofs-4gb-limit'
"git pack-objects" and "git index-pack" mostly operate with off_t
when talking about the offset of objects in a packfile, but there
were a handful of places that used "unsigned long" to hold that
value, leading to an unintended truncation.

* nd/pack-ofs-4gb-limit:
  fsck: use streaming interface for large blobs in pack
  pack-objects: do not truncate result in-pack object size on 32-bit systems
  index-pack: correct "offset" type in unpack_entry_data()
  index-pack: report correct bad object offsets even if they are large
  index-pack: correct "len" type in unpack_data()
  sha1_file.c: use type off_t* for object_info->disk_sizep
  pack-objects: pass length to check_pack_crc() without truncation
2016-07-28 10:34:42 -07:00
Junio C Hamano 2c608e0f7c Merge branch 'nd/worktree-lock'
"git worktree prune" protected worktrees that are marked as
"locked" by creating a file in a known location.  "git worktree"
command learned a dedicated command pair to create and remove such
a file, so that the users do not have to do this with editor.

* nd/worktree-lock:
  worktree.c: find_worktree() search by path suffix
  worktree: add "unlock" command
  worktree: add "lock" command
  worktree.c: add is_worktree_locked()
  worktree.c: add is_main_worktree()
  worktree.c: add find_worktree()
2016-07-28 10:34:42 -07:00
Vasco Almeida 996ee6d27a i18n: notes: mark comment for translation
Mark comment displayed when editing a note for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-28 09:09:18 -07:00
Jeff King 642833db78 date: add "unix" format
We already have "--date=raw", which is a Unix epoch
timestamp plus a contextual timezone (either the author's or
the local). But one may not care about the timezone and just
want the epoch timestamp by itself. It's not hard to parse
the two apart, but if you are using a pretty-print format,
you may want git to show the "finished" form that the user
will see.

We can accomodate this by adding a new date format, "unix",
which is basically "raw" without the timezone.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-27 14:15:51 -07:00
Orgad Shaneh def480fe99 commit: describe that --no-verify skips the commit-msg hook in the help text
This brings the short help in line with the documentation.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 13:44:55 -07:00
Johannes Schindelin 3f338f43b0 am -3: use merge_recursive() directly again
Last October, we had to change this code to run `git merge-recursive`
in a child process: git-am wants to print some helpful advice when the
merge failed, but the code in question was not prepared to return, it
die()d instead.

We are finally at a point when the code *is* prepared to return errors,
and can avoid the child process again.

This reverts commit c63d4b2 (am -3: do not let failed merge from
completing the error codepath, 2015-10-09), with the necessary changes
to adjust for the fact that Git's source code changed in the meantime
(such as: using OIDs instead of hashes in the recursive merge, and a
removed gender bias).

Note: the code now calls merge_recursive_generic() again. Unlike
merge_trees() and merge_recursive(), this function returns 0 upon success,
as most of Git's functions. Therefore, the error value -1 naturally is
handled correctly, and we do not have to take care of it specifically.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Johannes Schindelin f241ff0d0a prepare the builtins for a libified merge_recursive()
Previously, callers of merge_trees() or merge_recursive() expected that
code to die() with an error message. This used to be okay because we
called those commands from scripts, and had a chance to print out a
message in case the command failed fatally (read: with exit code 128).

As scripting incurs its own set of problems (portability, speed,
idiosyncrasies of different shells, limited data structures leading to
inefficient code), we are converting more and more of these scripts into
builtins, using library functions directly.

We already tried to use merge_recursive() directly in the builtin
git-am, for example. Unfortunately, we had to roll it back temporarily
because some of the code in merge-recursive.c still deemed it okay to
call die(), when the builtin am code really wanted to print out a useful
advice after the merge failed fatally. In the next commits, we want to
fix that.

The code touched by this commit expected merge_trees() to die() with
some useful message when there is an error condition, but merge_trees()
is going to be improved by converting all die() calls to return error()
instead (i.e. return value -1 after printing out the message as before),
so that the caller can react more flexibly.

This is a step to prepare for the version of merge_trees() that no
longer dies,  even if we just imitate the previous behavior by calling
exit(128): this is what callers of e.g. `git merge` have come to expect.

Note that the callers of the sequencer (revert and cherry-pick) already
fail fast even for the return value -1; The only difference is that they
now get a chance to say "<command> failed".

A caller of merge_trees() might want handle error messages themselves
(or even suppress them). As this patch is already complex enough, we
leave that change for a later patch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Johannes Schindelin ef1177d18e die("bug"): report bugs consistently
The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".

As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Junio C Hamano 37e9c7f5e1 Merge branch 'mh/blame-worktree'
"git blame file" allowed the lineage of lines in the uncommitted,
unadded contents of "file" to be inspected, but it refused when
"file" did not appear in the current commit.  When "file" was
created by renaming an existing file (but the change has not been
committed), this restriction was unnecessarily tight.

* mh/blame-worktree:
  t/t8003-blame-corner-cases.sh: Use here documents
  blame: allow to blame paths freshly added to the index
2016-07-25 14:13:45 -07:00
Junio C Hamano 9db3979784 Merge branch 'js/fsck-name-object'
When "git fsck" reports a broken link (e.g. a tree object contains
a blob that does not exist), both containing object and the object
that is referred to were reported with their 40-hex object names.
The command learned the "--name-objects" option to show the path to
the containing object from existing refs (e.g. "HEAD~24^2:file.txt").

* js/fsck-name-object:
  fsck: optionally show more helpful info for broken links
  fsck: give the error function a chance to see the fsck_options
  fsck_walk(): optionally name objects on the go
  fsck: refactor how to describe objects
2016-07-25 14:13:44 -07:00
Junio C Hamano 03f25e85d9 Merge branch 'rs/rm-strbuf-optim'
The use of strbuf in "git rm" to build filename to remove was a bit
suboptimal, which has been fixed.

* rs/rm-strbuf-optim:
  rm: reuse strbuf for all remove_dir_recursively() calls
2016-07-25 14:13:36 -07:00
Junio C Hamano 87492cb24d Merge branch 'mh/ref-iterators'
The API to iterate over all the refs (i.e. for_each_ref(), etc.)
has been revamped.

* mh/ref-iterators:
  for_each_reflog(): reimplement using iterators
  dir_iterator: new API for iterating over a directory tree
  for_each_reflog(): don't abort for bad references
  do_for_each_ref(): reimplement using reference iteration
  refs: introduce an iterator interface
  ref_resolves_to_object(): new function
  entry_resolves_to_object(): rename function from ref_resolves_to_object()
  get_ref_cache(): only create an instance if there is a submodule
  remote rm: handle symbolic refs correctly
  delete_refs(): add a flags argument
  refs: use name "prefix" consistently
  do_for_each_ref(): move docstring to the header file
  refs: remove unnecessary "extern" keywords
2016-07-25 14:13:33 -07:00
Junio C Hamano 6b34ce90a7 Merge branch 'mh/split-under-lock'
Further preparatory work on the refs API before the pluggable
backend series can land.

* mh/split-under-lock: (33 commits)
  lock_ref_sha1_basic(): only handle REF_NODEREF mode
  commit_ref_update(): remove the flags parameter
  lock_ref_for_update(): don't resolve symrefs
  lock_ref_for_update(): don't re-read non-symbolic references
  refs: resolve symbolic refs first
  ref_transaction_update(): check refname_is_safe() at a minimum
  unlock_ref(): move definition higher in the file
  lock_ref_for_update(): new function
  add_update(): initialize the whole ref_update
  verify_refname_available(): adjust constness in declaration
  refs: don't dereference on rename
  refs: allow log-only updates
  delete_branches(): use resolve_refdup()
  ref_transaction_commit(): correctly report close_ref() failure
  ref_transaction_create(): disallow recursive pruning
  refs: make error messages more consistent
  lock_ref_sha1_basic(): remove unneeded local variable
  read_raw_ref(): move docstring to header file
  read_raw_ref(): improve docstring
  read_raw_ref(): rename symref argument to referent
  ...
2016-07-25 14:13:32 -07:00
Johannes Sixt eb09121b74 submodule-helper: fix indexing in clone retry error reporting path
'git submodule--helper update-clone' has logic to retry failed clones
a second time. For this purpose, there is a list of submodules to clone,
and a second list that is filled with the submodules to retry. Within
these lists, the submodules are identified by an index as if both lists
were just appended.

This works nicely except when the second clone attempt fails as well. To
report an error, the identifying index must be adjusted by an offset so
that it can be used as an index into the second list. However, the
calculation uses the logical total length of the lists so that the result
always points one past the end of the second list.

Pick the correct index.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-22 13:43:53 -07:00
Jeff King 83558686ce receive-pack: send keepalives during quiet periods
After a client has sent us the complete pack, we may spend
some time processing the data and running hooks. If the
client asked us to be quiet, receive-pack won't send any
progress data during the index-pack or connectivity-check
steps. And hooks may or may not produce their own progress
output. In these cases, the network connection is totally
silent from both ends.

Git itself doesn't care about this (it will wait forever),
but other parts of the system (e.g., firewalls,
load-balancers, etc) might hang up the connection. So we'd
like to send some sort of keepalive to let the network and
the client side know that we're still alive and processing.

We can use the same trick we did in 05e9515 (upload-pack:
send keepalive packets during pack computation, 2013-09-08).
Namely, we will send an empty sideband data packet every `N`
seconds that we do not relay any stderr data over the
sideband channel. As with 05e9515, this means that we won't
bother sending keepalives when there's actual progress data,
but will kick in when it has been disabled (or if there is a
lull in the progress data).

The concept is simple, but the details are subtle enough
that they need discussing here.

Before the client sends us the pack, we don't want to do any
keepalives. We'll have sent our ref advertisement, and we're
waiting for them to send us the pack (and tell us that they
support sidebands at all).

While we're receiving the pack from the client (or waiting
for it to start), there's no need for keepalives; it's up to
them to keep the connection active by sending data.
Moreover, it would be wrong for us to do so. When we are the
server in the smart-http protocol, we must treat our
connection as half-duplex. So any keepalives we send while
receiving the pack would potentially be buffered by the
webserver. Not only does this make them useless (since they
would not be delivered in a timely manner), but it could
actually cause a deadlock if we fill up the buffer with
keepalives. (It wouldn't be wrong to send keepalives in this
phase for a full-duplex connection like ssh; it's simply
pointless, as it is the client's responsibility to speak).

As soon as we've gotten all of the pack data, then the
client is waiting for us to speak, and we should start
keepalives immediately. From here until the end of the
connection, we send one any time we are not otherwise
sending data.

But there's a catch. Receive-pack doesn't know the moment
we've gotten all the data. It passes the descriptor to
index-pack, who reads all of the data, and then starts
resolving the deltas. We have to communicate that back.

To make this work, we instruct the sideband muxer to enable
keepalives in three phases:

  1. In the beginning, not at all.

  2. While reading from index-pack, wait for a signal
     indicating end-of-input, and then start them.

  3. Afterwards, always.

The signal from index-pack in phase 2 has to come over the
stderr channel which the muxer is reading. We can't use an
extra pipe because the portable run-command interface only
gives us stderr and stdout.

Stdout is already used to pass the .keep filename back to
receive-pack. We could also send a signal there, but then we
would find out about it in the main thread. And the
keepalive needs to be done by the async muxer thread (since
it's the one writing sideband data back to the client). And
we can't reliably signal the async thread from the main
thread, because the async code sometimes uses threads and
sometimes uses forked processes.

Therefore the signal must come over the stderr channel,
where it may be interspersed with other random
human-readable messages from index-pack. This patch makes
the signal a single NUL byte.  This is easy to parse, should
not appear in any normal stderr output, and we don't have to
worry about any timing issues (like seeing half the signal
bytes in one read(), and half in a subsequent one).

This is a bit ugly, but it's simple to code and should work
reliably.

Another option would be to stop using an async thread for
muxing entirely, and just poll() both stderr and stdout of
index-pack from the main thread. This would work for
index-pack (because we aren't doing anything useful in the
main thread while it runs anyway). But it would make the
connectivity check and the hook muxers much more
complicated, as they need to simultaneously feed the
sub-programs while reading their stderr.

The index-pack phase is the only one that needs this
signaling, so it could simply behave differently than the
other two. That would mean having two separate
implementations of copy_to_sideband (and the keepalive
code), though. And it still doesn't get rid of the
signaling; it just means we can write a nicer message like
"END_OF_INPUT" or something on stdout, since we don't have
to worry about separating it from the stderr cruft.

One final note: this signaling trick is only done with
index-pack, not with unpack-objects. There's no point in
doing it for the latter, because by definition it only kicks
in for a small number of objects, where keepalives are not
as useful (and this conveniently lets us avoid duplicating
the implementation).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King 6b4cd2f827 receive-pack: turn on connectivity progress
When we receive a large push, the server side of the
connection may spend a lot of time (30s or more for a full
push of linux.git) walking the object graph without
producing any output. Let's give the user some indication
that we're actually working.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King d415092ac4 receive-pack: relay connectivity errors to sideband
If the connectivity check encounters a problem when
receiving a push, the error output goes to receive-pack's
stderr, whose destination depends on the protocol used
(ssh tends to send it to the user, though without a "remote"
prefix; http will generally eat it in the server's error
log).

The information should consistently go back to the user, as
there is a reasonable chance their client is buggy and
generating a bad pack.

We can do so by muxing it over the sideband as we do with
other sub-process stderr.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King d06303bb9a receive-pack: turn on index-pack resolving progress
When we receive a large push, the server side may have to
spend a lot of CPU processing the incoming packfile.

During the "receiving" phase, we are typically network
bound, and the client is writing its own progress to the
user. But during the delta resolution phase, we may spend
minutes (e.g., for a full push of linux.git) without
making any indication to the user that the connection has
not hung.

Let's ask index-pack to produce progress output for this
phase (unless the client asked us to be quiet, of course).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King e376f17fd1 index-pack: add flag for showing delta-resolution progress
The index-pack command has two progress meters: one for
"receiving objects", and one for "resolving deltas". You get
neither by default, or both with "-v".

But for a push through receive-pack, we would want only the
"resolving deltas" phase, _not_ the "receiving objects"
progress. There are two reasons for this.

One is simply that existing clients are already printing
"writing objects" progress at the same time.  Arguably
"receiving" from the far end is more useful, because it
tells you what has actually gotten there, as opposed to what
might be stuck in a buffer somewhere between the client and
server. But that would require a protocol extension to tell
clients not to print their progress. Possible, but
complexity for little gain.

The second reason is much more important. In a full-duplex
connection like git-over-ssh, we can print progress while
the pack is incoming, and it will immediately get to the
client. But for a half-duplex connection like git-over-http,
we should not say anything until we have received the full
request.  Anything we write is subject to being stuck in a
buffer by the webserver.  Worse, we can end up in a deadlock
if that buffer fills up.

So our best bet is to avoid writing anything that isn't a
small fixed size until we've received the full pack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King 38e590ea12 clone: use a real progress meter for connectivity check
Because the initial connectivity check for a cloned
repository can be slow, 0781aa4 (clone: let the user know
when check_everything_connected is run, 2013-05-03) added a
"fake" progress meter; we simply say "Checking connectivity"
when it starts, and "done" at the end, with nothing between.

Since check_connected() now knows how to do a real progress
meter, we can drop our fake one and use that one instead.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:09 -07:00
Jeff King 7043c7071c check_everything_connected: use a struct with named options
The number of variants of check_everything_connected has
grown over the years, so that the "real" function takes
several possibly-zero, possibly-NULL arguments. We hid the
complexity behind some wrapper functions, but this doesn't
scale well when we want to add new options.

If we add more wrapper variants to handle the new options,
then we can get a combinatorial explosion when those options
might be used together (right now nobody wants to use both
"shallow" and "transport" together, so we get by with just a
few wrappers).

If instead we add new parameters to each function, each of
which can have a default value, then callers who want the
defaults end up with confusing invocations like:

  check_everything_connected(fn, 0, data, -1, 0, NULL);

where it is unclear which parameter is which (and every
caller needs updated when we add new options).

Instead, let's add a struct to hold all of the optional
parameters. This is a little more verbose for the callers
(who have to declare the struct and fill it in), but it
makes their code much easier to follow, because every option
is named as it is set (and unused options do not have to be
mentioned at all).

Note that we could also stick the iteration function and its
callback data into the option struct, too. But since those
are required for each call, by avoiding doing so, we can let
very simple callers just pass "NULL" for the options and not
worry about the struct at all.

While we're touching each site, let's also rename the
function to check_connected(). The existing name was quite
long, and not all of the wrappers even used the full name.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:10:53 -07:00
Jeff King 434ea3cdad rev-list: add optional progress reporting
It's easy to ask rev-list to do a traversal that may takes
many seconds (e.g., by calling "--objects --all"). In theory
you can monitor its progress by the output you get to
stdout, but this isn't always easy.

Some operations, like "--count", don't make any output until
the end.

And some callers, like check_everything_connected(), are
using it just for the error-checking of the traversal, and
throw away stdout entirely.

This patch adds a "--progress" option which can be used to
give some eye-candy for a user waiting for a long traversal.
This is just a rev-list option and not a regular traversal
option, because it needs cooperation from the callbacks in
builtin/rev-list.c to do the actual count.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:10:44 -07:00
Junio C Hamano 3d55eea805 Merge branch 'js/am-call-theirs-theirs-in-fallback-3way'
One part of "git am" had an oddball helper function that called
stuff from outside "his" as opposed to calling what we have "ours",
which was not gender-neutral and also inconsistent with the rest of
the system where outside stuff is usuall called "theirs" in
contrast to "ours".

* js/am-call-theirs-theirs-in-fallback-3way:
  am: counteract gender bias
2016-07-19 13:22:23 -07:00
Junio C Hamano 2b6456b808 Merge branch 'jk/write-file'
General code clean-up around a helper function to write a
single-liner to a file.

* jk/write-file:
  branch: use write_file_buf instead of write_file
  use write_file_buf where applicable
  write_file: add format attribute
  write_file: add pointer+len variant
  write_file: use xopen
  write_file: drop "gently" form
  branch: use non-gentle write_file for branch description
  am: ignore return value of write_file()
  config: fix bogus fd check when setting up default config
2016-07-19 13:22:23 -07:00
Junio C Hamano 96e08010ee Merge branch 'jk/printf-format'
Code clean-up to avoid using a variable string that compilers may
feel untrustable as printf-style format given to write_file()
helper function.

* jk/printf-format:
  commit.c: remove print_commit_list()
  avoid using sha1_to_hex output as printf format
  walker: let walker_say take arbitrary formats
2016-07-19 13:22:22 -07:00
Junio C Hamano 566fdaf611 Merge branch 'nd/fetch-ref-summary'
Improve the look of the way "git fetch" reports what happened to
each ref that was fetched.

* nd/fetch-ref-summary:
  fetch: reduce duplicate in ref update status lines with placeholder
  fetch: align all "remote -> local" output
  fetch: change flag code for displaying tag update and deleted ref
  fetch: refactor ref update status formatting code
  git-fetch.txt: document fetch output
2016-07-19 13:22:21 -07:00
Junio C Hamano a63d31b4d3 Merge branch 'bc/cocci'
Conversion from unsigned char sha1[20] to struct object_id
continues.

* bc/cocci:
  diff: convert prep_temp_blob() to struct object_id
  merge-recursive: convert merge_recursive_generic() to object_id
  merge-recursive: convert leaf functions to use struct object_id
  merge-recursive: convert struct merge_file_info to object_id
  merge-recursive: convert struct stage_data to use object_id
  diff: rename struct diff_filespec's sha1_valid member
  diff: convert struct diff_filespec to struct object_id
  coccinelle: apply object_id Coccinelle transformations
  coccinelle: convert hashcpy() with null_sha1 to hashclr()
  contrib/coccinelle: add basic Coccinelle transforms
  hex: add oid_to_hex_r()
2016-07-19 13:22:16 -07:00
Junio C Hamano 63641fb071 Merge branch 'js/log-to-diffopt-file'
The commands in the "log/diff" family have had an FILE* pointer in the
data structure they pass around for a long time, but some codepaths
used to always write to the standard output.  As a preparatory step
to make "git format-patch" available to the internal callers, these
codepaths have been updated to consistently write into that FILE*
instead.

* js/log-to-diffopt-file:
  mingw: fix the shortlog --output=<file> test
  diff: do not color output when --color=auto and --output=<file> is given
  t4211: ensure that log respects --output=<file>
  shortlog: respect the --output=<file> setting
  format-patch: use stdout directly
  format-patch: avoid freopen()
  format-patch: explicitly switch off color when writing to files
  shortlog: support outputting to streams other than stdout
  graph: respect the diffopt.file setting
  line-log: respect diffopt's configured output file stream
  log-tree: respect diffopt's configured output file stream
  log: prepare log/log-tree to reuse the diffopt.close_file attribute
2016-07-19 13:22:15 -07:00
Junio C Hamano 7418a6b1a0 Merge branch 'dk/blame-move-no-reason-for-1-line-context'
"git blame -M" missed a single line that was moved within the file.

* dk/blame-move-no-reason-for-1-line-context:
  blame: require 0 context lines while finding moved lines with -M
2016-07-19 13:22:13 -07:00
Johannes Schindelin 90cf590f53 fsck: optionally show more helpful info for broken links
When reporting broken links between commits/trees/blobs, it would be
quite helpful at times if the user would be told how the object is
supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly
that: name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that
should not be, the user might want to remove the corresponding reflog
entry. This option helps them find that entry: `git fsck` will now
report something like this:

	broken link from    tree b5eb6ff...  (refs/stash@{<date>}~37:)
	              to    blob ec5cf80...

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 15:15:59 -07:00
Mike Hommey 3b75ee9327 blame: allow to blame paths freshly added to the index
When blaming files, changes in the work tree are taken into account
and displayed as being "Not Committed Yet".

However, when blaming a file that is not known to the current HEAD,
git blame fails with `no such path 'foo' in HEAD`, even when the file
was git add'ed.

Allowing such a blame is useful when the new file added to the index
(not yet committed) was created by renaming an existing file.  It
also is useful when the new file was created from pieces already in
HEAD, moved or copied from other files and blaming with copy
detection (i.e. "-C").

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 14:33:38 -07:00
Johannes Schindelin 1cd772cc41 fsck: give the error function a chance to see the fsck_options
We will need this in the next commit, where fsck will be taught to
optionally name the objects when reporting issues about them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:35:00 -07:00
Johannes Schindelin 993a21b0a0 fsck: refactor how to describe objects
In many places, we refer to objects via their SHA-1s. Let's abstract
that into a function.

For the moment, it does nothing else than what we did previously: print
out the 40-digit hex string. But that will change over the course of the
next patches.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-18 11:35:00 -07:00
Stefan Beller f6a4e61fbb push: accept push options
This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:41 -07:00
Stefan Beller c714e45f87 receive-pack: implement advertising and receiving push options
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:40 -07:00
Stefan Beller 77a9745d19 push options: {pre,post}-receive hook learns about push options
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable
============================================
+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
======================================
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==========================================================
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
======================================================
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:17 -07:00
Junio C Hamano 42bd66816b Merge branch 'nd/ita-cleanup'
Git does not know what the contents in the index should be for a
path added with "git add -N" yet, so "git grep --cached" should not
show hits (or show lack of hits, with -L) in such a path, but that
logic does not apply to "git grep", i.e. searching in the working
tree files.  But we did so by mistake, which has been corrected.

* nd/ita-cleanup:
  grep: fix grepping for "intent to add" files
  t7810-grep.sh: fix a whitespace inconsistency
  t7810-grep.sh: fix duplicated test name
2016-07-13 11:24:18 -07:00
Junio C Hamano 97865e83c7 Merge branch 'ew/gc-auto-pack-limit-fix'
"gc.autoPackLimit" when set to 1 should not trigger a repacking
when there is only one pack, but the code counted poorly and did
so.

* ew/gc-auto-pack-limit-fix:
  gc: fix off-by-one error with gc.autoPackLimit
2016-07-13 11:24:12 -07:00
Junio C Hamano 2703572b3a Merge branch 'va/i18n-even-more'
More markings of messages for i18n, with updates to various tests
to pass GETTEXT_POISON tests.

One patch from the original submission dropped due to conflicts
with jk/upload-pack-hook, which is still in flux.

* va/i18n-even-more: (38 commits)
  t5541: become resilient to GETTEXT_POISON
  i18n: branch: mark comment when editing branch description for translation
  i18n: unmark die messages for translation
  i18n: submodule: escape shell variables inside eval_gettext
  i18n: submodule: join strings marked for translation
  i18n: init-db: join message pieces
  i18n: remote: allow translations to reorder message
  i18n: remote: mark URL fallback text for translation
  i18n: standardise messages
  i18n: sequencer: add period to error message
  i18n: merge: change command option help to lowercase
  i18n: merge: mark messages for translation
  i18n: notes: mark options for translation
  i18n: notes: mark strings for translation
  i18n: transport-helper.c: change N_() call to _()
  i18n: bisect: mark strings for translation
  t5523: use test_i18ngrep for negation
  t4153: fix negated test_i18ngrep call
  t9003: become resilient to GETTEXT_POISON
  tests: unpack-trees: update to use test_i18n* functions
  ...
2016-07-13 11:24:10 -07:00
Nguyễn Thái Ngọc Duy ec9d224903 fsck: use streaming interface for large blobs in pack
For blobs, we want to make sure the on-disk data is not corrupted
(i.e. can be inflated and produce the expected SHA-1). Blob content is
opaque, there's nothing else inside to check for.

For really large blobs, we may want to avoid unpacking the entire blob
in memory, just to check whether it produces the same SHA-1. On 32-bit
systems, we may not have enough virtual address space for such memory
allocation. And even on 64-bit where it's not a problem, allocating a
lot more memory could result in kicking other parts of systems to swap
file, generating lots of I/O and slowing everything down.

For this particular operation, not unpacking the blob and letting
check_sha1_signature, which supports streaming interface, do the job
is sufficient. check_sha1_signature() is not shown in the diff,
unfortunately. But if will be called when "data_valid && !data" is
false.

We will call the callback function "fn" with NULL as "data". The only
callback of this function is fsck_obj_buffer(), which does not touch
"data" at all if it's a blob.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:29 -07:00
Nguyễn Thái Ngọc Duy af92a645d3 pack-objects: do not truncate result in-pack object size on 32-bit systems
A typical diff will not show what's going on and you need to see full
functions. The core code is like this, at the end of of write_one()

	e->idx.offset = *offset;
	size = write_object(f, e, *offset);
	if (!size) {
		e->idx.offset = recursing;
		return WRITE_ONE_BREAK;
	}
	written_list[nr_written++] = &e->idx;

	/* make sure off_t is sufficiently large not to wrap */
	if (signed_add_overflows(*offset, size))
		die("pack too large for current definition of off_t");
	*offset += size;

Here we can see that the in-pack object size is returned by
write_object (or indirectly by write_reuse_object). And it's used to
calculate object offsets, which end up in the pack index file,
generated at the end.

If "size" overflows (on 32-bit sytems, unsigned long is 32-bit while
off_t can be 64-bit), we got wrong offsets and produce incorrect .idx
file, which may make it look like the .pack file is corrupted.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:17 -07:00
Nguyễn Thái Ngọc Duy da49a7da3a index-pack: correct "offset" type in unpack_entry_data()
unpack_entry_data() receives an off_t value from unpack_raw_entry(),
which could be larger than unsigned long on 32-bit systems with large
file support. Correct the type so truncation does not happen. This
only affects bad object reporting though.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:15:08 -07:00
Nguyễn Thái Ngọc Duy fd3e67474c index-pack: report correct bad object offsets even if they are large
Use the right type for offsets in this case, off_t, which makes a
difference on 32-bit systems with large file support, and change
formatting code accordingly.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:14:47 -07:00
Nguyễn Thái Ngọc Duy 7171a0b0cf index-pack: correct "len" type in unpack_data()
On 32-bit systems with large file support, one entry could be larger
than 4GB and overflow "len". Correct it so we can unpack a full entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:14:38 -07:00
Nguyễn Thái Ngọc Duy 166df26f28 sha1_file.c: use type off_t* for object_info->disk_sizep
This field, filled by sha1_object_info() contains the on-disk size of
an object, which could go over 4GB limit of unsigned long on 32-bit
systems. Use off_t for it instead and update all callers.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13 09:14:20 -07:00
René Scharfe deb8e15a19 rm: reuse strbuf for all remove_dir_recursively() calls
Don't throw the memory allocated for remove_dir_recursively() away after
a single call, use it for the other entries as well instead.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-12 15:09:21 -07:00
Nguyễn Thái Ngọc Duy 211c61c6cf pack-objects: pass length to check_pack_crc() without truncation
On 32 bit systems with large file support, unsigned long is 32-bit
while the two offsets in the subtraction expression (pack-objects has
the exact same expression as in sha1_file.c but not shown in diff) are
in 64-bit. If an in-pack object is larger than 2^32 len/datalen is
truncated and we get a misleading "error: bad packed object CRC for
..." as a result.

Use off_t for len and datalen. check_pack_crc() already accepts this
argument as off_t and can deal with 4+ GB.

Noticed-by: Christoph Michelbach <michelbach94@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-12 10:14:29 -07:00
Junio C Hamano 1a88ca99db Merge branch 'sb/clone-shallow-passthru' into maint
Fix an unintended regression in v2.9 that breaks "clone --depth"
that recurses down to submodules by forcing the submodules to also
be cloned shallowly, which many server instances that host upstream
of the submodules are not prepared for.

* sb/clone-shallow-passthru:
  clone: do not let --depth imply --shallow-submodules
2016-07-11 10:44:12 -07:00
Junio C Hamano 1401236842 Merge branch 'km/fetch-do-not-free-remote-name' into maint
The ownership rule for the piece of memory that hold references to
be fetched in "git fetch" was screwy, which has been cleaned up.

* km/fetch-do-not-free-remote-name:
  builtin/fetch.c: don't free remote->name after fetch
2016-07-11 10:44:10 -07:00
Junio C Hamano 369dc4081c Merge branch 'mj/log-show-signature-conf'
"git log" learns log.showSignature configuration variable, and a
command line option "--no-show-signature" to countermand it.

* mj/log-show-signature-conf:
  log: add log.showSignature configuration variable
  log: add "--no-show-signature" command line option
  t4202: refactor test
2016-07-11 10:31:08 -07:00
Junio C Hamano 62e5e83f8d Merge branch 'js/find-commit-subject-ignore-leading-blanks'
A helper function that takes the contents of a commit object and
finds its subject line did not ignore leading blank lines, as is
commonly done by other codepaths.  Make it ignore leading blank
lines to match.

* js/find-commit-subject-ignore-leading-blanks:
  reset --hard: skip blank lines when reporting the commit subject
  sequencer: use skip_blank_lines() to find the commit subject
  commit -C: skip blank lines at the beginning of the message
  commit.c: make find_commit_subject() more robust
  pretty: make the skip_blank_lines() function public
2016-07-11 10:31:08 -07:00
Junio C Hamano bb2d8a817d Merge branch 'sb/submodule-clone-retry'
"git submodule update" that drives many "git clone" could
eventually hit flaky servers/network conditions on one of the
submodules; the command learned to retry the attempt.

* sb/submodule-clone-retry:
  submodule update: continue when a clone fails
  submodule--helper: initial clone learns retry logic
2016-07-11 10:31:04 -07:00
Nguyễn Thái Ngọc Duy 6d308627ca worktree: add "unlock" command
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 15:31:04 -07:00
Nguyễn Thái Ngọc Duy 58142c09a4 worktree: add "lock" command
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 15:31:04 -07:00
Johannes Schindelin 715a51bcaf am: counteract gender bias
Since 47f0b6d5 (Fall back to three-way merge when applying a patch.,
2005-10-06), i.e. for almost 11 years already, we used a male form
to describe "the other tree".

While it was unintended, this gave the erroneous impression as if
the Git developers thought of users as male, and were unaware of the
important role in software development played by female actors such
as Ada Lovelace, Grace Hopper and Margaret Hamilton. In fact, the
first professional software developers were all female.

Let's change those unfortunate references to the gender neutral
"their tree".  Doing so also makes the fallback_merge_recursive(),
which is an oddball, more in line with the other parts of the system
where we contrast what we have vs what we obtain from others by
saying "ours" vs "theirs".  This inconsistency was also unintended.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 14:39:48 -07:00
Jeff King dabd35f4cd avoid using sha1_to_hex output as printf format
We know that it should not contain any percent-signs, but
it's a good habit not to feed non-literals to printf
formatters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 10:11:27 -07:00
Jeff King 7eb6e10c9d branch: use write_file_buf instead of write_file
If we already have a strbuf, then using write_file_buf is a
little nicer to read (no wondering whether "%s" will eat
your NULs), and it's more efficient (no extra formatting
step).

We don't care about the newline magic of write_file(), as we
have our own multi-line content.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 09:47:29 -07:00
Jeff King e78d5d4993 use write_file_buf where applicable
There are several places where we open a file, write some
content from a strbuf, and close it. These can be simplified
with write_file_buf(). As a bonus, many of these did not
catch write problems at close() time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 09:47:29 -07:00
Jeff King 3d75bba28d branch: use non-gentle write_file for branch description
We use write_file_gently() to do this job currently.
However, if we see an error, we simply complain via
error_errno() and then end up exiting with an error code.

By switching to the non-gentle form, the function will die
for us, with a better error. It is more specific about which
syscall caused the error, and that mentions the
actual filename we're trying to write.

Our exit code for the error case does switch from "1" to
"128", but that's OK; it wasn't a meaningful documented code
(and in fact it was odd that it was a different exit code
than most other error conditions).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 09:47:28 -07:00
René Scharfe 1dad879a7b am: ignore return value of write_file()
write_file() either returns 0 or dies, so there is no point in checking
its return value.  The callers of the wrappers write_state_text(),
write_state_count() and write_state_bool() consequently already ignore
their return values.  Stop pretending we care and make them void.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 09:47:28 -07:00
Jeff King aabbd3f3c9 config: fix bogus fd check when setting up default config
Since 9830534 (config --global --edit: create a template
file if needed, 2014-07-25), an edit of the global config
file will try to open() it with O_EXCL, and wants to handle
three cases:

  1. We succeeded; the user has no config file, and we
     should fill in the default template.

  2. We got EEXIST; they have a file already, proceed as usual.

  3. We got another error; we should complain.

However, the check for case 1 does "if (fd)", which will
generally _always_ be true (except for the oddball case that
somehow our stdin got closed and opening really did give us
a new descriptor 0).

So in the EEXIST case, we tried to write the default config
anyway! Fortunately, this turns out to be a noop, since we
just end up writing to and closing "-1", which does nothing.

But in case 3, we would fail to notice any other errors, and
just silently continue (given that we don't actually notice
write errors for the template either, it's probably not that
big a deal; we're about to spawn the editor, so it would
notice any problems. But the code is clearly _trying_ to hit
cover this case and failing).

We can fix it easily by using "fd >= 0" for case 1.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 09:47:28 -07:00
Junio C Hamano 9f1027d18a Merge branch 'sb/clone-shallow-passthru'
Fix an unintended regression in v2.9 that breaks "clone --depth"
that recurses down to submodules by forcing the submodules to also
be cloned shallowly, which many server instances that host upstream
of the submodules are not prepared for.

* sb/clone-shallow-passthru:
  clone: do not let --depth imply --shallow-submodules
2016-07-06 13:38:13 -07:00
Junio C Hamano 979f030359 Merge branch 'jk/repack-keep-unreachable'
"git repack" learned the "--keep-unreachable" option, which sends
loose unreachable objects to a pack instead of leaving them loose.
This helps heuristics based on the number of loose objects
(e.g. "gc --auto").

* jk/repack-keep-unreachable:
  repack: extend --keep-unreachable to loose objects
  repack: add --keep-unreachable option
  repack: document --unpack-unreachable option
2016-07-06 13:38:11 -07:00
Junio C Hamano e25a4ded8a Merge branch 'ew/mboxrd-format-am'
Teach format-patch and mailsplit (hence "am") how a line that
happens to begin with "From " in the e-mail message is quoted with
">", so that these lines can be restored to their original shape.

* ew/mboxrd-format-am:
  am: support --patch-format=mboxrd
  mailsplit: support unescaping mboxrd messages
  pretty: support "mboxrd" output format
2016-07-06 13:38:11 -07:00
Junio C Hamano 7a738b40f6 Merge branch 'nd/worktree-cleanup-post-head-protection'
Further preparatory clean-up for "worktree" feature continues.

* nd/worktree-cleanup-post-head-protection:
  worktree: simplify prefixing paths
  worktree: avoid 0{40}, too many zeroes, hard to read
  worktree.c: use is_dot_or_dotdot()
  git-worktree.txt: keep subcommand listing in alphabetical order
  worktree.c: rewrite mark_current_worktree() to avoid strbuf
  completion: support git-worktree
2016-07-06 13:38:11 -07:00
Junio C Hamano 845351c99b Merge branch 'km/fetch-do-not-free-remote-name'
The ownership rule for the piece of memory that hold references to
be fetched in "git fetch" was screwy, which has been cleaned up.

* km/fetch-do-not-free-remote-name:
  builtin/fetch.c: don't free remote->name after fetch
2016-07-06 13:38:08 -07:00
Junio C Hamano b8b6365a8a Merge branch 'jk/string-list-static-init'
Instead of taking advantage of a struct string_list that is
allocated with all NULs happens to be STRING_LIST_INIT_NODUP kind,
initialize them explicitly as such, to document their behaviour
better.

* jk/string-list-static-init:
  use string_list initializer consistently
  blame,shortlog: don't make local option variables static
  interpret-trailers: don't duplicate option strings
  parse_opt_string_list: stop allocating new strings
2016-07-06 13:38:08 -07:00
Junio C Hamano 7758b02b44 Merge branch 'pb/commit-editmsg-path'
Code clean-up.

* pb/commit-editmsg-path:
  builtin/commit.c: memoize git-path for COMMIT_EDITMSG
2016-07-06 13:38:06 -07:00
Junio C Hamano f838198357 Merge branch 'jc/deref-tag' into maint
Code clean-up.

* jc/deref-tag:
  blame, line-log: do not loop around deref_tag()
2016-07-06 13:06:46 -07:00
Junio C Hamano c8b080af71 Merge branch 'et/add-chmod-x' into maint
"git update-index --add --chmod=+x file" may be usable as an escape
hatch, but not a friendly thing to force for people who do need to
use it regularly.  "git add --chmod=+x file" can be used instead.

* et/add-chmod-x:
  add: add --chmod=+x / --chmod=-x options
2016-07-06 13:06:39 -07:00
Nguyễn Thái Ngọc Duy bc437d1020 fetch: reduce duplicate in ref update status lines with placeholder
In the "remote -> local" line, if either ref is a substring of the
other, the common part in the other string is replaced with "*". For
example

    abc                -> origin/abc
    refs/pull/123/head -> pull/123

become

    abc         -> origin/*
    refs/*/head -> pull/123

Activated with fetch.output=compact.

For the record, this output is not perfect. A single giant ref can
push all refs very far to the right and likely be wrapped around. We
may have a few options:

 - exclude these long lines smarter

 - break the line after "->", exclude it from column width calculation

 - implement a new format, { -> origin/}foo, which makes the problem
   go away at the cost of a bit harder to read

 - reverse all the arrows so we have "* <- looong-ref", again still
   hard to read.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:48:25 -07:00
Nguyễn Thái Ngọc Duy 6bc91f23a6 fetch: align all "remote -> local" output
We do align "remote -> local" output by allocating 10 columns to
"remote". That produces aligned output only for short refs. An extra
pass is performed to find the longest remote ref name (that does not
produce a line longer than terminal width) to produce better aligned
output.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:48:25 -07:00
Jeff King 023ff39b29 parse_options: allocate a new array when concatenating
In exactly one callers (builtin/revert.c), we build up the
options list dynamically from multiple arrays. We do so by
manually inserting "filler" entries into one array, and then
copying the other array into the allocated space.

This is tedious and error-prone, as you have to adjust the
filler any time the second array is modified (although we do
at least check and die() when the counts do not match up).

Instead, let's just allocate a new array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 10:11:08 -07:00
Charles Bailey b8e47d1acf grep: fix grepping for "intent to add" files
This reverts commit 4d5520053 (grep: make it clear i-t-a entries are
ignored, 2015-12-27) and adds an alternative fix to maintain the -L
--cached behavior.

4d5520053 caused 'git grep' to no longer find matches in new files in
the working tree where the corresponding index entry had the "intent to
add" bit set, despite the fact that these files are tracked.

The content in the index of a file for which the "intent to add" bit is
set is considered indeterminate and not empty. For most grep queries we
want these to behave the same, however for -L --cached (files without a
match) we don't want to respond positively for "intent to add" files as
their contents are indeterminate. This is in contrast to files with
empty contents in the index (no lines implies no matches for any grep
query expression) which should be reported in the output of a grep -L
--cached invocation.

Add tests to cover this case and a few related cases which previously
lacked coverage.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:27:41 -07:00
Johannes Schindelin 054a5aee6f reset --hard: skip blank lines when reporting the commit subject
When there are blank lines at the beginning of a commit message, the
pretty printing machinery already skips them when showing a commit
subject (or the complete commit message). We shall henceforth do the
same when reporting the commit subject after the user called

	git reset --hard <commit>

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-29 15:03:36 -07:00
Johannes Schindelin 84e213a30a commit -C: skip blank lines at the beginning of the message
Consistent with the pretty-printing machinery, we skip leading blank
lines (if any) of existing commit messages.

While Git itself only produces commit objects with a single empty line
between commit header and commit message, it is legal to have more than
one blank line (i.e. lines containing only white space, or no
characters) at the beginning of the commit message, and the
pretty-printing code already handles that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-29 14:56:37 -07:00
brian m. carlson 4e8161a82e merge-recursive: convert merge_recursive_generic() to object_id
Convert this function and the git merge-recursive subcommand to use
struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28 11:39:02 -07:00
brian m. carlson a0d12c4433 diff: convert struct diff_filespec to struct object_id
Convert struct diff_filespec's sha1 member to use a struct object_id
called "oid" instead.  The following Coccinelle semantic patch was used
to implement this, followed by the transformations in object_id.cocci:

@@
struct diff_filespec o;
@@
- o.sha1
+ o.oid.hash

@@
struct diff_filespec *p;
@@
- p->sha1
+ p->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28 11:39:02 -07:00
brian m. carlson c368dde924 coccinelle: apply object_id Coccinelle transformations
Apply the set of semantic patches from contrib/coccinelle to convert
some leftover places using struct object_id's hash member to instead
use the wrapper functions that take struct object_id natively.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28 11:39:02 -07:00
brian m. carlson f449198e58 coccinelle: convert hashcpy() with null_sha1 to hashclr()
hashcpy with null_sha1 as the source is equivalent to hashclr.  In
addition to being simpler, using hashclr may give the compiler a chance
to optimize better.  Convert instances of hashcpy with the source
argument of null_sha1 to hashclr.

This transformation was implemented using the following semantic patch:

@@
expression E1;
@@
-hashcpy(E1, null_sha1);
+hashclr(E1);

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28 11:39:02 -07:00
Nguyễn Thái Ngọc Duy 2cb040baa6 fetch: change flag code for displaying tag update and deleted ref
This makes the fetch flag code consistent with push, where '-' means
deleted ref.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 10:58:02 -07:00
Nguyễn Thái Ngọc Duy d0b39a03cd fetch: refactor ref update status formatting code
This makes it easier to change the formatting later. And it makes sure
translators cannot mess up format specifiers and break Git.

There are a couple call sites where the length of the second column is
TRANSPORT_SUMMARY_WIDTH instead of calculated by TRANSPORT_SUMMARY(),
which is enforced now. The result should be the same because these call
sites do not contain characters outside ASCII range.

The two strbuf_addf() calls instead of one is mostly to reduce
diff-noise in a future patch where "ref -> ref" is reformatted
differently.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 10:58:02 -07:00
Junio C Hamano 8579c4ebee Merge branch 'lf/receive-pack-auto-gc-to-client'
Allow messages that are generated by auto gc during "git push" on
the receiving end to be explicitly passed back to the sending end
over sideband, so that they are shown with "remote: " prefix to
avoid confusing the users.

* lf/receive-pack-auto-gc-to-client:
  receive-pack: send auto-gc output over sideband 2
2016-06-27 09:56:52 -07:00
Junio C Hamano 2a5618ec78 Merge branch 'jc/deref-tag'
Code clean-up.

* jc/deref-tag:
  blame, line-log: do not loop around deref_tag()
2016-06-27 09:56:50 -07:00
Junio C Hamano c49fd57bf4 Merge branch 'et/add-chmod-x'
"git update-index --add --chmod=+x file" may be usable as an escape
hatch, but not a friendly thing to force for people who do need to
use it regularly.  "git add --chmod=+x file" can be used instead.

* et/add-chmod-x:
  add: add --chmod=+x / --chmod=-x options
2016-06-27 09:56:49 -07:00
Junio C Hamano 0bbda4bac7 Merge branch 'cc/apply-introduce-state'
The "git apply" standalone program is being libified; this is the
first step to move many state variables into a structure that can
be explicitly (re)initialized to make the machinery callable more
than once.

The next step that moves some remaining state variables into the
structure and turns die()s into an error return that propagates up
to the caller is not queued yet but in flight.  It would be good to
review the above first and give the remainder of the series a solid
base to build on.

* cc/apply-introduce-state: (50 commits)
  builtin/apply: remove misleading comment on lock_file field
  builtin/apply: move 'newfd' global into 'struct apply_state'
  builtin/apply: add 'lock_file' pointer into 'struct apply_state'
  builtin/apply: move applying patches into apply_all_patches()
  builtin/apply: move 'state' check into check_apply_state()
  builtin/apply: move 'symlink_changes' global into 'struct apply_state'
  builtin/apply: move 'fn_table' global into 'struct apply_state'
  builtin/apply: move 'state_linenr' global into 'struct apply_state'
  builtin/apply: move 'max_change' and 'max_len' into 'struct apply_state'
  builtin/apply: move 'ws_ignore_action' into 'struct apply_state'
  builtin/apply: move 'ws_error_action' into 'struct apply_state'
  builtin/apply: move 'applied_after_fixing_ws' into 'struct apply_state'
  builtin/apply: move 'squelch_whitespace_errors' into 'struct apply_state'
  builtin/apply: remove whitespace_option arg from set_default_whitespace_mode()
  builtin/apply: move 'whitespace_option' into 'struct apply_state'
  builtin/apply: move 'whitespace_error' global into 'struct apply_state'
  builtin/apply: move 'root' global into 'struct apply_state'
  builtin/apply: move 'p_value_known' global into 'struct apply_state'
  builtin/apply: move 'p_value' global into 'struct apply_state'
  builtin/apply: move 'has_include' global into 'struct apply_state'
  ...
2016-06-27 09:56:42 -07:00
Junio C Hamano df5a925523 Merge branch 'jk/rev-list-count-with-bitmap' into maint
"git rev-list --count" whose walk-length is limited with "-n"
option did not work well with the counting optimized to look at the
bitmap index.

* jk/rev-list-count-with-bitmap:
  rev-list: disable bitmaps when "-n" is used with listing objects
  rev-list: "adjust" results of "--count --use-bitmap-index -n"
2016-06-27 09:56:24 -07:00
Eric Wong 5f4e3bf536 gc: fix off-by-one error with gc.autoPackLimit
This matches the documentation and allows gc.autoPackLimit=1
to maintain a single pack without attempting a repack on every
"git gc --auto" invocation.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 08:28:47 -07:00
Johannes Schindelin 7f7d712bcf shortlog: respect the --output=<file> setting
Thanks to the diff option parsing, we already know about this option.
We just have to make use of it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 15:20:47 -07:00
Johannes Schindelin 36a4d905c3 format-patch: use stdout directly
Earlier, we freopen()ed stdout in order to write patches to files.
That forced us to duplicate stdout (naming it "realstdout") because we
*still* wanted to be able to report the file names.

As we do not abuse stdout that way anymore, we no longer need to
duplicate stdout, either.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 15:20:47 -07:00
Johannes Schindelin 95235f5ba1 format-patch: avoid freopen()
We just taught the relevant functions to respect the diffopt.file field,
to allow writing somewhere else than stdout. Let's make use of it.

Technically, we do not need to avoid that call in a builtin: we assume
that builtins (as opposed to library functions) are stand-alone programs
that may do with their (global) state. Yet, we want to be able to reuse
that code in properly lib-ified code, e.g. when converting scripts into
builtins.

Further, while we did not *have* to touch the cmd_show() and cmd_cherry()
code paths (because they do not want to write anywhere but stdout as of
yet), it just makes sense to be consistent, making it easier and safer to
move the code later.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 15:20:47 -07:00
Johannes Schindelin 11f4eb1984 format-patch: explicitly switch off color when writing to files
The --color=auto handling is done by seeing if file descriptor 1
(the standard output) is connected to a terminal.  format-patch
used freopen() to reuse the standard output stream even when sending
its output to an on-disk file, and this check is appropriate.

In the next step, however, we will stop reusing "FILE *stdout", and
instead start using arbitrary file descriptor obtained by doing an
fopen(3) ourselves.  The check --color=auto does will become useless,
as we no longer are writing to the standard output stream.

But then, we do not need to guess to begin with. As argued in the commit
message of 7787570c (format-patch: ignore ui.color, 2011-09-13), we do not
allow the ui.color setting to affect format-patch's output. The only time,
therefore, that we allow color sequences to be written to the output files
is when the user specified the --color=always command-line option explicitly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 15:15:55 -07:00
Johannes Schindelin 0a7b357737 shortlog: support outputting to streams other than stdout
This will be needed to avoid freopen() in `git format-patch`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 14:25:49 -07:00
Johannes Schindelin 6ea57703f6 log: prepare log/log-tree to reuse the diffopt.close_file attribute
We are about to teach the log-tree machinery to reuse the diffopt.file
field to output to a file stream other than stdout, in line with the
diff machinery already writing to diffopt.file.

However, we might want to write something after the diff in
log_tree_commit() (e.g. with the --show-linear-break option), therefore
we must not let the diff machinery close the file (as per
diffopt.close_file.

This means that log_tree_commit() itself must override the
diffopt.close_file flag and close the file, and if log_tree_commit() is
called in a loop, the caller is responsible to do the same.

Note: format-patch has an `--output-directory` option. Due to the fact
that format-patch's options are parsed first, and that the parse-options
machinery accepts uniquely abbreviated options, the diff options
`--output` (and `-o`) are shadowed. Therefore close_file is not set to 1
so that cmd_format_patch() does *not* need to handle the close_file flag
differently, even if it calls log_tree_commit() in a loop.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 13:50:45 -07:00
Mehul Jain fce04c3ca6 log: add log.showSignature configuration variable
Users may want to always use "--show-signature" while using git-log and
related commands.

When log.showSignature is set to true, git-log and related commands will
behave as if "--show-signature" was given to them.

Note that this config variable is meant to affect git-log, git-show,
git-whatchanged and git-reflog. Other commands like git-format-patch,
git-rev-list are not to be affected by this config variable.

Signed-off-by: Mehul Jain <mehul.jain2029@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-24 13:01:13 -07:00
Michael Haggerty 29a7cf9644 remote rm: handle symbolic refs correctly
In the modern world of reference backends, it is not OK to delete a
symref by unlink()ing the file directly. This must be done via the refs
API.

We do so by adding the symref to the list of references to delete along
with the non-symbolic references, then calling delete_refs() with the
new flags option set to REF_NODEREF.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-20 11:38:18 -07:00
Michael Haggerty c5f04dddb6 delete_refs(): add a flags argument
This will be useful for passing REF_NODEREF through.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-20 11:38:18 -07:00
Junio C Hamano 18a74a092b clone: do not let --depth imply --shallow-submodules
In v2.9.0, we prematurely flipped the default to force cloning
submodules shallowly, when the superproject is getting cloned
shallowly.  This is likely to fail when the upstream repositories
submodules are cloned from a repository that is not prepared to
serve histories that ends at a commit that is not at the tip of a
branch, and we know the world is not yet ready.

Use a safer default to clone the submodules fully, unless the user
tells us that she knows that the upstream repository of the
submodules are willing to cooperate with "--shallow-submodules"
option.

Noticed-by: Vadim Eisenberg <VADIME@il.ibm.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-20 11:35:28 -07:00
Junio C Hamano 6d8c5454b6 Merge branch 'jk/rev-list-count-with-bitmap'
"git rev-list --count" whose walk-length is limited with "-n"
option did not work well with the counting optimized to look at the
bitmap index.

* jk/rev-list-count-with-bitmap:
  rev-list: disable bitmaps when "-n" is used with listing objects
  rev-list: "adjust" results of "--count --use-bitmap-index -n"
2016-06-20 11:01:03 -07:00
Junio C Hamano 1958a17fe4 Merge branch 'jc/clear-pathspec'
We usually call a function that clears the contents a data
structure X without freeing the structure itself clear_X(), and
call a function that does clear_X() and also frees it free_X().
free_pathspec() function has been renamed to clear_pathspec()
to avoid confusion.

* jc/clear-pathspec:
  pathspec: rename free_pathspec() to clear_pathspec()
2016-06-20 11:01:02 -07:00
Junio C Hamano 6d41eb685a Merge branch 'jg/dash-is-last-branch-in-worktree-add'
"git worktree add" learned that '-' can be used as a short-hand for
"@{-1}", the previous branch.

* jg/dash-is-last-branch-in-worktree-add:
  worktree: allow "-" short-hand for @{-1} in add command
2016-06-20 11:01:02 -07:00
Junio C Hamano 3807098cd6 Merge branch 'sb/submodule-recommend-shallowness'
An upstream project can make a recommendation to shallowly clone
some submodules in the .gitmodules file it ships.

* sb/submodule-recommend-shallowness:
  submodule update: learn `--[no-]recommend-shallow` option
  submodule-config: keep shallow recommendation around
2016-06-20 11:01:01 -07:00
Junio C Hamano 73bc4b4928 Merge branch 'ah/no-verify-signature-with-pull-rebase'
"git pull --rebase --verify-signature" learned to warn the user
that "--verify-signature" is a no-op when rebasing.

* ah/no-verify-signature-with-pull-rebase:
  pull: warn on --verify-signatures with --rebase
2016-06-20 11:01:00 -07:00
Vasco Almeida b18237f4e3 i18n: branch: mark comment when editing branch description for translation
When one issues git branch --edit-description branch_name, a edit with
that message commented out is opened. Mark that message for translation
in to order to be localized.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida c30364d080 i18n: init-db: join message pieces
Join message displayed during repository initialization in one entire
sentence.  That would improve translations since it's easier translate
an entire sentence than translating each piece.

Update Icelandic translation to reflect the changes.  The Icelandic
translation of these messages is used with test
t0204-gettext-reencode-sanity.sh and not updating the translation would
fail the test.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida a1b467a4ee i18n: remote: allow translations to reorder message
Before this patch, translations couldn't place the branch name
where it was better fit in the message "and with remote <branch_name>".
Allow translations that, instead of forcing the branch name to display
right of the message.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida 7ba7b9abcc i18n: remote: mark URL fallback text for translation
Marks fallback text for translation that may be displayed in git remote
show output.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida e923a8abe9 i18n: standardise messages
Standardise messages in order to save translators some work.

Nuances fixed in this commit:
"failed to read %s"
"read of %s failed"

"detach the HEAD at named commit"
"detach HEAD at named commit"

"removing '%s' failed"
"failed to remove '%s'"

"index file corrupt"
"corrupt index file"

"failed to read %s"
"read of %s failed"

"detach the HEAD at named commit"
"detach HEAD at named commit"

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida c8bb9d2e5a i18n: merge: change command option help to lowercase
Change command option description to lowercase, matching pull
counterpart option. Translators would have to translate such message
only once.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida bef4830e88 i18n: merge: mark messages for translation
Mark messages shown to the user for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:46:10 -07:00
Vasco Almeida b34c77e33e i18n: notes: mark options for translation
Mark options description of git prune for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:45:49 -07:00
Vasco Almeida 5313827f7e i18n: notes: mark strings for translation
Mark strings of messages for the user as translatable.

Update tests t3310-notes-merge-manual-resolve.sh and
t3320-notes-merge-worktrees.sh to reflect new translatable messages.

Tests that grep for .git/NOTES_MERGE_WORKTREE reflect the translatable
string "Automatic notes merge failed. Fix conflicts in %s and [...]".

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:45:49 -07:00
Vasco Almeida 8785c42532 i18n: advice: internationalize message for conflicts
Mark message for translation telling the user she has conflicts to
resolve. Expose each particular use case, in order to enable translating
entire sentences which would facilitate translating into other
languages.

Change "Pull" to lowercase to match other instances. Update test
t5520-pull.sh, that relied on the old error message, to use the new one.

Although we loose in source code conciseness, we would gain better
translations because translators can 1) translate the entire sentence,
including those terms concerning Git (committing, merging, etc) 2) have
leeway to adapt to their languages.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:45:48 -07:00
Vasco Almeida 070b7e4416 i18n: builtin/remote.c: fix mark for translation
The second string inside _() was not being extracted for translation by
xgettext, meaning that, although the string was passed to gettext, there
was no translation available.

Mark each individual string instead of marking the result of ternary if.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-17 15:45:47 -07:00
Jeff King e26a8c4721 repack: extend --keep-unreachable to loose objects
If you use "repack -adk" currently, we will pack all objects
that are already packed into the new pack, and then drop the
old packs. However, loose unreachable objects will be left
as-is. In theory these are meant to expire eventually with
"git prune". But if you are using "repack -k", you probably
want to keep things forever and therefore do not run "git
prune" at all. Meaning those loose objects may build up over
time and end up fooling any object-count heuristics (such as
the one done by "gc --auto", though since git-gc does not
support "repack -k", this really applies to whatever custom
scripts people might have driving "repack -k").

With this patch, we instead stuff any loose unreachable
objects into the pack along with the already-packed
unreachable objects. This may seem wasteful, but it is
really no more so than using "repack -k" in the first place.
We are at a slight disadvantage, in that we have no useful
ordering for the result, or names to hand to the delta code.
However, this is again no worse than what "repack -k" is
already doing for the packed objects. The packing of these
objects doesn't matter much because they should not be
accessed frequently (unless they actually _do_ become
referenced, but then they would get moved to a different
part of the packfile during the next repack).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 13:57:45 -07:00
Jeff King 905f27b86a repack: add --keep-unreachable option
The usual way to do a full repack (and what is done by
git-gc) is to run "repack -Ad --unpack-unreachable=<when>",
which will loosen any unreachable objects newer than
"<when>", and drop any older ones.

This is a safer alternative to "repack -ad", because
"<when>" becomes a grace period during which we will not
drop any new objects that are about to be referenced.
However, it isn't perfectly safe. It's always possible that
a process is about to reference an old object. Even if that
process were to take care to update the timestamp on the
object, there is no atomicity with a simultaneously running
"repack" process.

So while unlikely, there is a small race wherein we may drop
an object that is in the process of being referenced. If you
do automated repacking on a large number of active
repositories, you may hit it eventually, and the result is a
corrupted repository.

It would be nice to fix that race in the long run, but it's
complicated.  In the meantime, there is a much simpler
strategy for automated repository maintenance: do not drop
objects at all. We already have a "--keep-unreachable"
option in pack-objects; we just need to plumb it through
from git-repack.

Note that this _isn't_ plumbed through from git-gc, so at
this point it's strictly a tool for people doing their own
advanced repository maintenance strategy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 13:57:42 -07:00
Junio C Hamano 31da121f2d blame, line-log: do not loop around deref_tag()
These callers appear to expect that deref_tag() is to peel one layer
of a tag, but the function does not work that way; it has its own
loop to unwrap tags until an object that is not a tag appears.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 13:38:14 -07:00
Junio C Hamano e1d09701a4 blame: dwim "blame --reverse OLD" as "blame --reverse OLD.."
Instead of always requiring both ends of a range, we could DWIM
"OLD", which could be a misspelt "OLD..", to be a range that ends at
the current commit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 12:13:07 -07:00
Junio C Hamano d993ce1ed2 blame: improve diagnosis for "--reverse NEW"
"git blame --reverse OLD..NEW -- PATH" tells us to start from the
contents in PATH at OLD and observe how each line is changed while
the history develops up to NEW, and report for each line the latest
commit up to which the line survives in the original form.

If you say "git blame --reverse NEW -- PATH" by mistake, we complain
about the missing OLD, but we phrased it as "No commit to dig down
to?"  In this case, however, we are digging up from OLD, so say so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 12:12:15 -07:00
Keith McGuigan b7410f616e builtin/fetch.c: don't free remote->name after fetch
Make fetch's string_list of remote names own all of its string items
(strdup'ing when necessary) so that it can deallocate them safely
when clearing.

Signed-off-by: Keith McGuigan <kmcguigan@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14 11:58:05 -07:00
Nguyễn Thái Ngọc Duy cccf74e2da fetch, upload-pack: --deepen=N extends shallow boundary by N commits
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.

This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.

Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..

The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.

(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).

Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy 859e5df916 clone: define shallow clone boundary with --shallow-exclude
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy a45a260086 fetch: define shallow boundary with --shallow-exclude
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy 994c2aaf31 clone: define shallow clone boundary based on time with --shallow-since
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy 508ea88226 fetch: define shallow boundary with --shallow-since
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy 45a3e52641 fetch-pack: use skip_prefix() instead of starts_with()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Jeff King 2721ce21e4 use string_list initializer consistently
There are two types of string_lists: those that own the
string memory, and those that don't. You can tell the
difference by the strdup_strings flag, and one should use
either STRING_LIST_INIT_DUP, or STRING_LIST_INIT_NODUP as an
initializer.

Historically, the normal all-zeros initialization has
corresponded to the NODUP case. Many sites use no
initializer at all, and that works as a shorthand for that
case. But for a reader of the code, it can be hard to
remember which is which. Let's be more explicit and actually
have each site declare which type it means to use.

This is a fairly mechanical conversion; I assumed each site
was correct as-is, and just switched them all to NODUP.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 10:37:51 -07:00
Junio C Hamano 7013220d2b Merge branch 'jk/parseopt-string-list' into jk/string-list-static-init
* jk/parseopt-string-list:
  blame,shortlog: don't make local option variables static
  interpret-trailers: don't duplicate option strings
  parse_opt_string_list: stop allocating new strings
2016-06-13 10:37:48 -07:00
Jeff King 64093fc06a blame,shortlog: don't make local option variables static
There's no need for these option variables to be static,
except that they are referenced by the options array itself,
which is static. But having all of this static is simply
unnecessary and confusing (and inconsistent with most other
commands, which either use a static global option list or a
true function-local one).

Note that in some cases we may need to actually initialize
the variables (since we cannot rely on BSS to do so). This
is a net improvement to readability, though, as we can use
the more verbose initializers for our string_lists.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 10:33:33 -07:00
Jeff King 7c4b169585 interpret-trailers: don't duplicate option strings
There's no need to do so; the argv strings will last until
the end of the program.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 10:33:14 -07:00
Stefan Beller 665b35eccd submodule--helper: initial clone learns retry logic
Each submodule that is attempted to be cloned, will be retried once in
case of failure after all other submodules were cloned. This helps to
mitigate ephemeral server failures and increases chances of a reliable
clone of a repo with hundreds of submodules immensely.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 10:28:38 -07:00
Michael Haggerty 8bb0455367 delete_branches(): use resolve_refdup()
The return value of resolve_ref_unsafe() is not guaranteed to stay
around as long as we need it, so use resolve_refdup() instead.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2016-06-13 11:23:49 +02:00
Pranit Bauva e51b0dfc97 builtin/commit.c: memoize git-path for COMMIT_EDITMSG
This is a follow up commit for f932729c (memoize common git-path
"constant" files, 10-Aug-2015).

The many function calls to git_path() are replaced by
git_path_commit_editmsg() and which thus eliminates the need to repeatedly
compute the location of "COMMIT_EDITMSG".

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-09 10:03:10 -07:00
Edward Thomson 4e55ed32db add: add --chmod=+x / --chmod=-x options
The executable bit will not be detected (and therefore will not be
set) for paths in a repository with `core.filemode` set to false,
though the users may still wish to add files as executable for
compatibility with other users who _do_ have `core.filemode`
functionality.  For example, Windows users adding shell scripts may
wish to add them as executable for compatibility with users on
non-Windows.

Although this can be done with a plumbing command
(`git update-index --add --chmod=+x foo`), teaching the `git-add`
command allows users to set a file executable with a command that
they're already familiar with.

Signed-off-by: Edward Thomson <ethomson@edwardthomson.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-07 17:43:39 -07:00
Junio C Hamano 05781d37fa Merge branch 'ar/diff-args-osx-precompose' into maint
Many commands normalize command line arguments from NFD to NFC
variant of UTF-8 on OSX, but commands in the "diff" family did
not, causing "git diff $path" to complain that no such path is
known to Git.  They have been taught to do the normalization.

* ar/diff-args-osx-precompose:
  diff: run arguments through precompose_argv
2016-06-06 14:27:35 -07:00
Junio C Hamano 283badc38e Merge branch 'sb/submodule-helper-relative-path'
A bash-ism "local" has been removed from "git submodule" scripted
Porcelain.

* sb/submodule-helper-relative-path:
  submodule: remove bashism from shell script
2016-06-06 14:18:55 -07:00
Junio C Hamano f6136f3c39 Merge branch 'sb/submodule-helper-list-signal-unmatch-via-exit-status'
The way how "submodule--helper list" signals unmatch error to its
callers has been updated.

* sb/submodule-helper-list-signal-unmatch-via-exit-status:
  submodule--helper: offer a consistent API
2016-06-06 14:18:55 -07:00
Junio C Hamano a7d4c49a82 builtin/apply: remove misleading comment on lock_file field
Just like pointer field like prefix, the piece of memory pointed at
by lock_file field is not owned by the apply_state structure.  It is
true that the caller needs to be careful about the lifetime rule for
lockfile instances, but that is none of this API's business.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 13:11:02 -07:00
Eric Wong d9925d1a71 am: support --patch-format=mboxrd
Combined with "git format-patch --pretty=mboxrd", this should
allow us to round-trip commit messages with embedded mbox
"From " lines without corruption.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 11:40:15 -07:00
Eric Wong c88098d7f1 mailsplit: support unescaping mboxrd messages
This will allow us to parse the output of --pretty=mboxrd
and the output of other mboxrd generators.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 11:14:43 -07:00
Eric Wong 9f23e04061 pretty: support "mboxrd" output format
This output format prevents format-patch output from breaking
readers if somebody copy+pasted an mbox into a commit message.

Unlike the traditional "mboxo" format, "mboxrd" is designed to
be fully-reversible.  "mboxrd" also gracefully degrades to
showing extra ">" in existing "mboxo" readers.

This degradation is preferable to breaking message splitting
completely, a problem I've seen in "mboxcl" due to having
multiple, non-existent, or inaccurate Content-Length headers.

"mboxcl2" is a non-starter since it's inherits the problems
of "mboxcl" while being completely incompatible with existing
tooling based around mailsplit.

ref: http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 11:14:14 -07:00
Lukas Fleischer 860a2ebecd receive-pack: send auto-gc output over sideband 2
Redirect auto-gc output to the sideband such that it is visible to all
clients. As a side effect, all auto-gc error messages are now prefixed
with "remote: " before being printed to stderr on the client-side which
makes it easier to understand that those error messages originate from
the server.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 10:58:55 -07:00
Junio C Hamano bf523da2a2 Merge branch 'rs/apply-name-terminate'
Code clean-up.

* rs/apply-name-terminate:
  apply: remove unused parameters from name_terminate()
2016-06-03 14:38:04 -07:00
Junio C Hamano 29e54b019f Merge branch 'rs/patch-id-use-skip-prefix'
Code clean-up.

* rs/patch-id-use-skip-prefix:
  patch-id: use starts_with() and skip_prefix()
2016-06-03 14:38:03 -07:00
Christian Couder a1bc3dd464 builtin/apply: move 'newfd' global into 'struct apply_state'
To libify the apply functionality the 'newfd' variable should
not be static and global to the file. Let's move it into
'struct apply_state'.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-03 10:31:56 -07:00
Christian Couder 8f31fac365 builtin/apply: add 'lock_file' pointer into 'struct apply_state'
We cannot have a 'struct lock_file' allocated on the stack, as lockfile.c
keeps a linked list of all created lock_file structures.

Also 'struct apply_state' users might later want the same 'struct lock_file'
instance to be reused by different series of calls to the apply api.

So let's add a 'struct lock_file *lock_file' pointer into 'struct apply_state'
and have the user of 'struct apply_state' allocate memory for the actual
'struct lock_file' instance.

Let's also add an argument to init_apply_state(), so that the caller can
easily supply a pointer to the allocated instance.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-03 10:30:16 -07:00
Jeff King fb85db84dc rev-list: disable bitmaps when "-n" is used with listing objects
You can ask rev-list to use bitmaps to speed up an --objects
traversal, which should generally give you your answers much
faster.

Likewise, you can ask rev-list to limit such a traversal
with `-n`, in which case we'll show only a limited set of
commits (and only the tree and commit objects directly
reachable from those commits).

But if you do both together, the results are nonsensical. We
end up limiting any fallback traversal we do to _find_ the
bitmaps, but the actual set of objects we output will be
picked arbitrarily from the union of any bitmaps we do find,
and will involve the objects of many more commits.

It's possible that somebody might want this as a "show me
what you can, but limit the amount of work you do" flag.
But as with the prior commit clamping "--count", the results
are basically non-deterministic; you'll get the values from
some commits between `n` and the total number, and you can't
tell which.

And unlike the `--count` case, we can't easily generate the
"real" value from the bitmap values (you can't just walk
back `-n` commits and subtract out the reachable objects
from the boundary commits; the bitmaps for `X` record its
total reachability, so you don't know which objects are
directly from `X` itself, which from `X^`, and so on).

So let's just fallback to the non-bitmap code path in this
case, so we always give a sane answer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-03 09:01:02 -07:00
Jeff King 5c9f9bf313 rev-list: "adjust" results of "--count --use-bitmap-index -n"
If you ask rev-list for:

    git rev-list --count --use-bitmap-index HEAD

we optimize out the actual traversal and just give you the
number of bits set in the commit bitmap. This is faster,
which is good.

But if you ask to limit the size of the traversal, like:

    git rev-list --count --use-bitmap-index -n 100 HEAD

we'll still output the full bitmapped number we found. On
the surface, that might even seem OK. You explicitly asked
to use the bitmap index, and it was cheap to compute the
real answer, so we gave it to you.

But there's something much more complicated going on under
the hood. If we don't have a bitmap directly for HEAD, then
we have to actually traverse backwards, looking for a
bitmapped commit. And _that_ traversal is bounded by our
`-n` count.

This is a good thing, because it bounds the work we have to
do, which is probably what the user wanted by asking for
`-n`. But now it makes the output quite confusing. You might
get many values:

  - your `-n` value, if we walked back and never found a
    bitmap (or fewer if there weren't that many commits)

  - the actual full count, if we found a bitmap root for
    every path of our traversal with in the `-n` limit

  - any number in between! We might have walked back and
    found _some_ bitmaps, but then cut off the traversal
    early with some commits not accounted for in the result.

So you cannot even see a value higher than your `-n` and say
"OK, bitmaps kicked in, this must be the real full count".
The only sane thing is for git to just clamp the value to a
maximum of the `-n` value, which means we should output the
exact same results whether bitmaps are in use or not.

The test in t5310 demonstrates this by using `-n 1`.
Without this patch we fail in the full-bitmap case (where we
do not have to traverse at all) but _not_ in the
partial-bitmap case (where we have to walk down to find an
actual bitmap). With this patch, both cases just work.

I didn't implement the crazy in-between case, just because
it's complicated to set up, and is really a subset of the
full-count case, which we do cover.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-03 09:00:59 -07:00
Junio C Hamano ed6e8038f9 pathspec: rename free_pathspec() to clear_pathspec()
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-02 14:09:22 -07:00
Stefan Beller 44431df024 submodule: remove bashism from shell script
Junio pointed out `relative_path` was using bashisms via the
local variables. As the longer term goal is to rewrite most of the
submodule code in C, do it now.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 11:32:53 -07:00
Stefan Beller b0f4b40846 submodule--helper: offer a consistent API
In 48308681 (2016-02-29, git submodule update: have a dedicated helper
for cloning), the helper communicated errors back only via exit code,
and dance with printing '#unmatched' in case of error was left to
git-submodule.sh as it uses the output of the helper and pipes it into
shell commands. This change makes the helper consistent by never
printing '#unmatched' in the helper but always handling these piping
issues in the actual shell script.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 11:31:49 -07:00
Christian Couder 91b769c48f builtin/apply: move applying patches into apply_all_patches()
To libify the apply functionality we should provide a function to
apply many patches. Let's move the code to do that into a new
apply_all_patches() function.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 10:10:16 -07:00
Christian Couder c84a86c995 builtin/apply: move 'state' check into check_apply_state()
To libify the apply functionality we should provide a function
to check that the values in a 'struct apply_state' instance are
coherent. Let's move the code to do that into a new
check_apply_state() function.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 10:10:16 -07:00
Christian Couder 2f63cea963 builtin/apply: move 'symlink_changes' global into 'struct apply_state'
To libify the apply functionality the 'symlink_changes' variable should
not be static and global to the file. Let's move it into
'struct apply_state'.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 10:10:16 -07:00
Christian Couder 71dac5cef5 builtin/apply: move 'fn_table' global into 'struct apply_state'
To libify the apply functionality the 'fn_table' variable should
not be static and global to the file. Let's move it into
'struct apply_state'.

As fn_table is cleared at the end of apply_patch(), it is not
necessary to clear it in clear_apply_state().

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 10:10:16 -07:00
Christian Couder d7263d097c builtin/apply: move 'state_linenr' global into 'struct apply_state'
To libify the apply functionality the 'state_linenr' variable should
not be static and global to the file. Let's move it into
'struct apply_state'.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-01 10:10:16 -07:00