Commit graph

152 commits

Author SHA1 Message Date
Jonathan Tan
da470981de fetch-pack: grow stateless RPC windows exponentially
When updating large repositories, the LARGE_FLUSH limit (that is, the
limit at which the window growth strategy switches from exponential to
linear) is reached quite quickly. Use a conservative exponential growth
strategy when that limit is reached instead (and increase LARGE_FLUSH so
that there is no regression in window size).

This optimization is only applied during stateless RPCs to avoid the
issue raised and fixed in commit 44d8dc54 (Fix potential local
deadlock during fetch-pack, 2011-03-29).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19 13:27:22 -07:00
Jeff King
df85757244 fetch-pack: isolate sigpipe in demuxer thread
In commit 9ff18fa (fetch-pack: ignore SIGPIPE in sideband
demuxer, 2016-02-24), we started using sigchain_push() to
ignore SIGPIPE in the async demuxer thread. However, this is
rather clumsy, as it ignores SIGPIPE for the entire process,
including the main thread. At the time we didn't have any
per-thread signal support, but we now we do. Let's use it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-20 13:33:56 -07:00
Jeff King
9ff18faf2f fetch-pack: ignore SIGPIPE in sideband demuxer
If the other side feeds us a bogus pack, index-pack (or
unpack-objects) may die early, before consuming all of its
input. As a result, the sideband demuxer may get SIGPIPE
(racily, depending on whether our data made it into the pipe
buffer or not). If this happens and we are compiled with
pthread support, it will take down the main thread, too.

This isn't the end of the world, as the main process will
just die() anyway when it sees index-pack failed. But it
does mean we don't get a chance to say "fatal: index-pack
failed" or similar. And it also means that we racily fail
t5504, as we sometimes die() and sometimes are killed by
SIGPIPE.

So let's ignore SIGPIPE while demuxing the sideband. We are
already careful to check the return value of write(), so we
won't waste time writing to a broken pipe. The caller will
notice the error return from the async thread, though in
practice we don't even get that far, as we die() as soon as
we see that index-pack failed.

The non-sideband case is already fine; we let index-pack
read straight from the socket, so there is no SIGPIPE at
all. Technically the non-threaded async case is also OK
without this (the forked async process gets SIGPIPE), but
it's not worth distinguishing from the threaded case here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-25 13:51:47 -08:00
brian m. carlson
ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
f2fd0760f6 Convert struct object to object_id
struct object is one of the major data structures dealing with object
IDs.  Convert it to use struct object_id instead of an unsigned char
array.  Convert get_object_hash to refer to the new member as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
f4e54d02b8 Convert struct ref to use object_id.
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Jeff King
984a43b902 fetch-pack: use argv_array for index-pack / unpack-objects
This cleans up a magic number that must be kept in sync with
the rest of the code (the number of argv slots). It also
lets us drop some fixed buffers and an sprintf (since we
can now use argv_array_pushf).

We do still have to keep one fixed buffer for calling
gethostname, but at least now the size computations for it
are much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
f932729cc7 memoize common git-path "constant" files
One of the most common uses of git_path() is to pass a
constant, like git_path("MERGE_MSG"). This has two
drawbacks:

  1. The return value is a static buffer, and the lifetime
     is dependent on other calls to git_path, etc.

  2. There's no compile-time checking of the pathname. This
     is OK for a one-off (after all, we have to spell it
     correctly at least once), but many of these constant
     strings appear throughout the code.

This patch introduces a series of functions to "memoize"
these strings, which are essentially globals for the
lifetime of the program. We compute the value once, take
ownership of the buffer, and return the cached value for
subsequent calls.  cache.h provides a helper macro for
defining these functions as one-liners, and defines a few
common ones for global use.

Using a macro is a little bit gross, but it does nicely
document the purpose of the functions. If we need to touch
them all later (e.g., because we learned how to change the
git_dir variable at runtime, and need to invalidate all of
the stored values), it will be much easier to have the
complete list.

Note that the shared-global functions have separate, manual
declarations. We could do something clever with the macros
(e.g., expand it to a declaration in some places, and a
declaration _and_ a definition in path.c). But there aren't
that many, and it's probably better to stay away from
too-magical macros.

Likewise, if we abandon the C preprocessor in favor of
generating these with a script, we could get much fancier.
E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz".
But the small amount of saved typing is probably not worth
the resulting confusion to readers who want to grep for the
function's definition.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:14 -07:00
Junio C Hamano
58eb0122f3 Merge branch 'me/fetch-into-shallow-safety'
"git fetch --depth=<depth>" and "git clone --depth=<depth>" issued
a shallow transfer request even to an upload-pack that does not
support the capability.

* me/fetch-into-shallow-safety:
  fetch-pack: check for shallow if depth given
2015-07-01 14:02:33 -07:00
Mike Edgar
eb86a507a1 fetch-pack: check for shallow if depth given
When a repository is first fetched as a shallow clone, either by
git-clone or by fetching into an empty repo, the server's capabilities
are not currently consulted. The client will send shallow requests even
if the server does not understand them, and the resulting error may be
unhelpful to the user. This change pre-emptively checks so we can exit
with a helpful error if necessary.

Signed-off-by: Mike Edgar <adgar@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-17 12:03:58 -07:00
Junio C Hamano
5455ee0573 Merge branch 'bc/object-id'
for_each_ref() callback functions were taught to name the objects
not with "unsigned char sha1[20]" but with "struct object_id".

* bc/object-id: (56 commits)
  struct ref_lock: convert old_sha1 member to object_id
  warn_if_dangling_symref(): convert local variable "junk" to object_id
  each_ref_fn_adapter(): remove adapter
  rev_list_insert_ref(): remove unneeded arguments
  rev_list_insert_ref_oid(): new function, taking an object_oid
  mark_complete(): remove unneeded arguments
  mark_complete_oid(): new function, taking an object_oid
  clear_marks(): rewrite to take an object_id argument
  mark_complete(): rewrite to take an object_id argument
  send_ref(): convert local variable "peeled" to object_id
  upload-pack: rewrite functions to take object_id arguments
  find_symref(): convert local variable "unused" to object_id
  find_symref(): rewrite to take an object_id argument
  write_one_ref(): rewrite to take an object_id argument
  write_refs_to_temp_dir(): convert local variable sha1 to object_id
  submodule: rewrite to take an object_id argument
  shallow: rewrite functions to take object_id arguments
  handle_one_ref(): rewrite to take an object_id argument
  add_info_ref(): rewrite to take an object_id argument
  handle_one_reflog(): rewrite to take an object_id argument
  ...
2015-06-05 12:17:37 -07:00
Michael Haggerty
c38cd1c89d rev_list_insert_ref(): remove unneeded arguments
Now that the function is not being used as an each_ref_sha1_fn, we can
delete the unused arguments in its signature.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:39 -07:00
Michael Haggerty
b1b49c6eb6 rev_list_insert_ref_oid(): new function, taking an object_oid
This function can be used with for_each_ref() without having to be
wrapped.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:39 -07:00
Michael Haggerty
6e20a51a80 mark_complete(): remove unneeded arguments
Now that the function is not being used as an each_ref_sha1_fn, we can
delete the unused arguments in its signature.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:38 -07:00
Michael Haggerty
f8ee4d8522 mark_complete_oid(): new function, taking an object_oid
This function can be used with for_each_ref() without having to be
wrapped.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:38 -07:00
Michael Haggerty
c50fb6cee6 clear_marks(): rewrite to take an object_id argument
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:38 -07:00
Michael Haggerty
2b2a5be394 each_ref_fn: change to take an object_id parameter
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
Fredrik Medley
68ee628932 upload-pack: optionally allow fetching reachable sha1
With uploadpack.allowReachableSHA1InWant configuration option set on the
server side, "git fetch" can make a request with a "want" line that names
an object that has not been advertised (likely to have been obtained out
of band or from a submodule pointer). Only objects reachable from the
branch tips, i.e. the union of advertised branches and branches hidden by
transfer.hideRefs, will be processed. Note that there is an associated
cost of having to walk back the history to check the reachability.

This feature can be used when obtaining the content of a certain commit,
for which the sha1 is known, without the need of cloning the whole
repository, especially if a shallow fetch is used. Useful cases are e.g.
repositories containing large files in the history, fetching only the
needed data for a submodule checkout, when sharing a sha1 without telling
which exact branch it belongs to and in Gerrit, if you think in terms of
commits instead of change numbers. (The Gerrit case has already been
solved through allowTipSHA1InWant as every Gerrit change has a ref.)

Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-22 18:25:36 -07:00
Fredrik Medley
7199c093ad upload-pack: prepare to extend allow-tip-sha1-in-want
To allow future extensions, e.g. allowing non-tip sha1, replace the
boolean allow_tip_sha1_in_want variable with the flag-style
allow_request_with_bare_object_name variable.

Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-22 18:25:35 -07:00
Jeff King
32d0462f8d fetch-pack: remove dead assignment to ref->new_sha1
In everything_local(), we used to assign the current ref's value
found in ref->old_sha1 to ref->new_sha1 when we already have all the
necessary objects to complete the history leading to that
commit.  This copying was broken at 49bb805e (Do not ask for
objects known to be complete., 2005-10-19) and ever since we
instead stuffed a random bytes in ref->new_sha1 here.  No
code complained or failed due to this breakage.

It turns out that no code path that comes after this
assignment even looks at ref->new_sha1 at all.

 - The only caller of everything_local(), do_fetch_pack(),
   returns this list of refs, whose element has bogus
   new_sha1 values, to its caller.  It does not look at the
   elements itself, but does pass them to find_common, which
   looks only at the name and old_sha1 fields.

 - The only caller of do_fetch_pack(), fetch_pack(), returns this
   list to its caller.  It does not look at the elements nor act on
   them.

 - One of the two callers of fetch_pack() is cmd_fetch_pack(), the
   top-level that implements "git fetch-pack".  The only thing it
   looks at in the elements of the returned ref list is the old_sha1
   and name fields.

 - The other caller of fetch_pack() is fetch_refs_via_pack() in the
   transport layer, which is a helper that implements "git fetch".
   It only cares about whether the returned list is empty (i.e.
   failed to fetch anything).

Just drop the bogus assignment, that is not even necessary.  The
remote-tracking refs are updated based on a different list and not
using the ref list being manipulated by this code path; the caller
do_fetch_pack() created a copy of that real ref list and passed the
copy down to this function, and modifying the elements here does not
affect anything.

Noticed-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-19 14:11:52 -07:00
Jeff King
c3c17bf107 filter_ref: make a copy of extra "sought" entries
If the server supports allow_tip_sha1_in_want, we add any
unmatched raw-sha1 entries in our "sought" list of refs to
the list of refs we will ask the other side for. We do so by
inserting the original "struct ref" directly into our list,
rather than making a copy. This has several problems.

The most minor problem is that one cannot ever free the
resulting list; it contains structs that are copies of the
remote refs (made earlier by fetch_pack) along with sought
refs that are referenced elsewhere.

But more importantly that we set the ref->next pointer to
NULL, chopping off the remainder of any existing list that
the ref was a part of. We get the set of "sought" refs in
an array rather than a linked list, but that array is often
in turn generated from a list.  The test modification in
t5516 demonstrates this. Rather than fetching just an exact
sha1, we fetch that sha1 plus another ref:

  - we build a linked list of refs to fetch when do_fetch
    calls get_ref_map; the exact sha1 is first, followed by
    the named ref ("refs/heads/extra" in this case).

  - we pass that linked list to transport_fetch_ref, which
    squashes it into an array of pointers

  - that array goes to fetch_pack, which calls filter_ref.
    There we generate the want list from a mix of what the
    remote side has advertised, and the "sought" entry for
    the exact sha1. We set the sought entry's "next" pointer
    to NULL.

  - after we return from transport_fetch_refs, we then try
    to update the refs by following the linked list. But our
    list is now truncated, and we do not update
    refs/heads/extra at all.

We can fix this by making a copy of the ref. There's nothing
that fetch_pack does to it that must be reflected in the
original "sought" list (and indeed, if that were the case we
would have a serious bug, because it is only exact-sha1
entries which are treated this way).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-19 14:11:11 -07:00
Jeff King
b7916422c7 filter_ref: avoid overwriting ref->old_sha1 with garbage
If the server supports allow_tip_sha1_in_want, then
fetch-pack's filter_refs function tries to check whether a
ref is a request for a straight sha1 by running:

  if (get_sha1_hex(ref->name, ref->old_sha1))
	  ...

I.e., we are using get_sha1_hex to ask "is this ref name a
sha1?". If it is true, then the contents of ref->old_sha1
will end up unchanged. But if it is false, then get_sha1_hex
makes no guarantees about what it has written. With a ref
name like "abcdefoo", we would overwrite 3 bytes of
ref->old_sha1 before realizing that it was not a sha1.

This is likely not a problem in practice, as anything in
refs->name (besides a sha1) will start with "refs/", meaning
that we would notice on the first character that there is a
problem. Still, we are making assumptions about the state
left in the output when get_sha1_hex returns an error (e.g.,
it could start from the end of the string, or error check
the values only once they were placed in the output). It's
better to be defensive.

We could just check that we have exactly 40 characters of
sha1. But let's be even more careful and make sure that we
have a 40-char hex refname that matches what is in old_sha1.
This is perhaps overly defensive, but spells out our
assumptions clearly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-19 13:52:54 -07:00
Michael Haggerty
697cc8efd9 lockfile.h: extract new header file for the functions in lockfile.c
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).

Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:56:14 -07:00
Junio C Hamano
825fd93767 Merge branch 'rs/child-process-init'
Code clean-up.

* rs/child-process-init:
  run-command: inline prepare_run_command_v_opt()
  run-command: call run_command_v_opt_cd_env() instead of duplicating it
  run-command: introduce child_process_init()
  run-command: introduce CHILD_PROCESS_INIT
2014-09-11 10:33:27 -07:00
René Scharfe
d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Tanay Abhra
f44af51d13 fetchpack.c: replace git_config() with git_config_get_*() family
Use `git_config_get_*()` family instead of `git_config()` to take advantage of
the config-set API which provides a cleaner control flow.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 13:33:27 -07:00
Junio C Hamano
e91ae32a01 Merge branch 'jk/skip-prefix'
* jk/skip-prefix:
  http-push: refactor parsing of remote object names
  imap-send: use skip_prefix instead of using magic numbers
  use skip_prefix to avoid repeated calculations
  git: avoid magic number with skip_prefix
  fetch-pack: refactor parsing in get_ack
  fast-import: refactor parsing of spaces
  stat_opt: check extra strlen call
  daemon: use skip_prefix to avoid magic numbers
  fast-import: use skip_prefix for parsing input
  use skip_prefix to avoid repeating strings
  use skip_prefix to avoid magic numbers
  transport-helper: avoid reading past end-of-string
  fast-import: fix read of uninitialized argv memory
  apply: use skip_prefix instead of raw addition
  refactor skip_prefix to return a boolean
  avoid using skip_prefix as a boolean
  daemon: mark some strings as const
  parse_diff_color_slot: drop ofs parameter
2014-07-09 11:33:28 -07:00
Jeff King
82e56767aa fetch-pack: refactor parsing in get_ack
There are several uses of the magic number "line+45" when
parsing ACK lines from the server, and it's rather unclear
why 45 is the correct number. We can make this more clear by
keeping a running pointer as we parse, using skip_prefix to
jump past the first "ACK ", then adding 40 to jump past
get_sha1_hex (which is still magical, but hopefully 40 is
less magical to readers of git code).

Note that this actually puts us at line+44. The original
required some character between the sha1 and further ACK
flags (it is supposed to be a space, but we never enforced
that). We start our search for flags at line+44, which
meanas we are slightly more liberal than the old code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:45:19 -07:00
Jeff King
ae021d8791 use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:

  if (starts_with(foo, "bar"))
	  foo += 3;

This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes.  We can use skip_prefix to avoid the magic
numbers here.

Note that some of these conversions could be much shorter.
For example:

  if (starts_with(arg, "--foo=")) {
	  bar = arg + 6;
	  continue;
  }

could become:

  if (skip_prefix(arg, "--foo=", &bar))
	  continue;

However, I have left it as:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = v;
	  continue;
  }

to visually match nearby cases which need to actually
process the string. Like:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = atoi(v);
	  continue;
  }

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:45 -07:00
René Scharfe
50e19a8358 Use starts_with() for C strings instead of memcmp()
Convert three cases of checking for a constant prefix using memcmp() to
starts_with().  This way there is no need for magic string length
constants and we avoid running over the end of the string should it be
shorter than the prefix.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-09 14:38:12 -07:00
Junio C Hamano
b407d40933 Merge branch 'nd/log-show-linear-break'
Attempts to show where a single-strand-of-pearls break in "git log"
output.

* nd/log-show-linear-break:
  log: add --show-linear-break to help see non-linear history
  object.h: centralize object flag allocation
2014-04-03 12:38:11 -07:00
Nguyễn Thái Ngọc Duy
208acbfb82 object.h: centralize object flag allocation
While the field "flags" is mainly used by the revision walker, it is
also used in many other places. Centralize the whole flag allocation to
one place for a better overview (and easier to move flags if we have
too).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-25 15:09:24 -07:00
Junio C Hamano
3e14384b12 Merge branch 'jk/shallow-update-fix'
Serving objects from a shallow repository needs to write a
new file to hold the temporary shallow boundaries but it was not
cleaned when we exit due to die() or a signal.

* jk/shallow-update-fix:
  shallow: verify shallow file after taking lock
  shallow: automatically clean up shallow tempfiles
  shallow: use stat_validity to check for up-to-date file
2014-03-21 12:33:29 -07:00
Jeff King
0179c945fc shallow: automatically clean up shallow tempfiles
We sometimes write tempfiles of the form "shallow_XXXXXX"
during fetch/push operations with shallow repositories.
Under normal circumstances, we clean up the result when we
are done. However, we do no take steps to clean up after
ourselves when we exit due to die() or signal death.

This patch teaches the tempfile creation code to register
handlers to clean up after ourselves. To handle this, we
change the ownership semantics of the filename returned by
setup_temporary_shallow. It now keeps a copy of the filename
itself, and returns only a const pointer to it.

We can also do away with explicit tempfile removal in the
callers. They all exit not long after finishing with the
file, so they can rely on the auto-cleanup, simplifying the
code.

Note that we keep things simple and maintain only a single
filename to be cleaned. This is sufficient for the current
caller, but we future-proof it with a die("BUG").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27 12:07:13 -08:00
Nguyễn Thái Ngọc Duy
ff62eca7d1 fetch-pack: fix deepen shallow over smart http with no-done cap
In smart http, upload-pack adds new shallow lines at the beginning of
each rpc response. Only shallow lines from the first rpc call are
useful. After that they are thrown away. It's designed this way
because upload-pack is stateless and has no idea when its shallow
lines are helpful or not.

So after refs are negotiated with multi_ack_detailed and the server
thinks it learned enough, it sends "ACK obj-id ready", terminates the
rpc call and waits for the final rpc round. The client sends "done".
The server sends another response, which also has shallow lines at
the beginning, and the last "ACK obj-id" line.

When no-done is active, the last round is cut out, the server sends
"ACK obj-id ready" and "ACK obj-id" in the same rpc
response. fetch-pack is updated to recognize this and not send
"done". However it still tries to consume shallow lines, which are
never sent.

Update the code, make sure to skip consuming shallow lines when
no-done is enabled.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10 10:21:33 -08:00
Junio C Hamano
f583ace157 Merge branch 'jk/allow-fetch-onelevel-refname'
"git clone" would fail to clone from a repository that has a ref
directly under "refs/", e.g. "refs/stash", because different
validation paths do different things on such a refname.  Loosen the
client side's validation to allow such a ref.

* jk/allow-fetch-onelevel-refname:
  fetch-pack: do not filter out one-level refs
2014-01-27 10:44:14 -08:00
Junio C Hamano
92251b1b5b Merge branch 'nd/shallow-clone'
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).

* nd/shallow-clone: (31 commits)
  t5537: fix incorrect expectation in test case 10
  shallow: remove unused code
  send-pack.c: mark a file-local function static
  git-clone.txt: remove shallow clone limitations
  prune: clean .git/shallow after pruning objects
  clone: use git protocol for cloning shallow repo locally
  send-pack: support pushing from a shallow clone via http
  receive-pack: support pushing to a shallow clone via http
  smart-http: support shallow fetch/clone
  remote-curl: pass ref SHA-1 to fetch-pack as well
  send-pack: support pushing to a shallow clone
  receive-pack: allow pushes that update .git/shallow
  connected.c: add new variant that runs with --shallow-file
  add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
  receive/send-pack: support pushing from a shallow clone
  receive-pack: reorder some code in unpack()
  fetch: add --update-shallow to accept refs that update .git/shallow
  upload-pack: make sure deepening preserves shallow roots
  fetch: support fetching from a shallow repository
  clone: support remote shallow repository
  ...
2014-01-17 12:21:20 -08:00
Jeff King
4c22408111 fetch-pack: do not filter out one-level refs
Currently fetching a one-level ref like "refs/foo" does not
work consistently. The outer "git fetch" program filters the
list of refs, checking each against check_refname_format.
Then it feeds the result to do_fetch_pack to actually
negotiate the haves/wants and get the pack. The fetch-pack
code does its own filter, and it behaves differently.

The fetch-pack filter looks for refs in "refs/", and then
feeds everything _after_ the slash (i.e., just "foo") into
check_refname_format.  But check_refname_format is not
designed to look at a partial refname. It complains that the
ref has only one component, thinking it is at the root
(i.e., alongside "HEAD"), when in reality we just fed it a
partial refname.

As a result, we omit a ref like "refs/foo" from the pack
request, even though "git fetch" then tries to store the
resulting ref.  If we happen to get the object anyway (e.g.,
because the ref is contained in another ref we are
fetching), then the fetch succeeds. But if it is a unique
object, we fail when trying to update "refs/foo".

We can fix this by just passing the whole refname into
check_refname_format; we know the part we were omitting is
"refs/", which is acceptable in a refname. This at least
makes the checks consistent with each other.

This problem happens most commonly with "refs/stash", which
is the only one-level ref in wide use. However, our test
does not use "refs/stash", as we may later want to restrict
it specifically (not because it is one-level, but because
of the semantics of stashes).

We may also want to do away with the multiple levels of
filtering (which can cause problems when they are out of
sync), or even forbid one-level refs entirely. However,
those decisions can come later; this fixes the most
immediate problem, which is the mismatch between the two.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:37:24 -08:00
Ramsay Jones
feefdf62c1 shallow: remove unused code
Commit 58babfff ("shallow.c: the 8 steps to select new commits for
.git/shallow", 05-12-2013) added a function to implement step 5 of
the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'.
This function implements an optional optimization step in the new
shallow commit selection algorithm. However, this function has no
callers. (The commented out call sites would need to change, in
order to provide information required by the function.)

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06 09:05:40 -08:00
Junio C Hamano
ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Nguyễn Thái Ngọc Duy
48d25cae22 fetch: add --update-shallow to accept refs that update .git/shallow
The same steps are done as in when --update-shallow is not given. The
only difference is we now add all shallow commits in "ours" and
"theirs" to .git/shallow (aka "step 8").

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
4820a33baa fetch: support fetching from a shallow repository
This patch just put together pieces from the 8 steps patch. We stop at
step 7 and reject refs that require new shallow commits.

Note that, by rejecting refs that require new shallow commits, we
leave dangling objects in the repo, which become "object islands" by
the next "git fetch" of the same source.

If the first fetch our "ours" set is zero and we do practically
nothing at step 7, "ours" is full at the next fetch and we may need to
walk through commits for reachability test. Room for improvement.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
beea4152d9 clone: support remote shallow repository
Cloning from a shallow repository does not follow the "8 steps for new
.git/shallow" because if it does we need to get through step 6 for all
refs. That means commit walking down to the bottom.

Instead the rule to create .git/shallow is simpler and, more
importantly, cheap: if a shallow commit is found in the pack, it's
probably used (i.e. reachable from some refs), so we add it. Others
are dropped.

One may notice this method seems flawed by the word "probably". A
shallow commit may not be reachable from any refs at all if it's
attached to an object island (a group of objects that are not
reachable by any refs).

If that object island is not complete, a new fetch request may send
more objects to connect it to some ref. At that time, because we
incorrectly installed the shallow commit in this island, the user will
not see anything after that commit (fsck is still ok). This is not
desired.

Given that object islands are rare (C Git never sends such islands for
security reasons) and do not really harm the repository integrity, a
tradeoff is made to surprise the user occasionally but work faster
everyday.

A new option --strict could be added later that follows exactly the 8
steps. "git prune" can also learn to remove dangling objects _and_ the
shallow commits that are attached to them from .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy
a796ccee51 fetch-pack.c: move shallow update code out of fetch_pack()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy
1a30f5a2f2 shallow.c: extend setup_*_shallow() to accept extra shallow commits
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10 16:14:16 -08:00
Christian Couder
5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano
5bb62059f2 Merge branch 'jk/robustify-parse-commit'
* jk/robustify-parse-commit:
  checkout: do not die when leaving broken detached HEAD
  use parse_commit_or_die instead of custom message
  use parse_commit_or_die instead of segfaulting
  assume parse_commit checks for NULL commit
  assume parse_commit checks commit->object.parsed
  log_tree_diff: die when we fail to parse a commit
2013-12-05 12:54:01 -08:00
Junio C Hamano
832ee79ab8 Merge branch 'jl/pack-transfer-avoid-double-close'
The codepath that send_pack() calls pack_objects() mistakenly
closed the same file descriptor twice, leading to potentially
closing a wrong file descriptor that was opened in the meantime.

* jl/pack-transfer-avoid-double-close:
  Clear fd after closing to avoid double-close error
2013-10-30 12:10:45 -07:00
Jeff King
0064053bd7 assume parse_commit checks commit->object.parsed
The parse_commit function will check the "parsed" flag of
the object and do nothing if it is set. There is no need
for callers to check the flag themselves, and doing so only
clutters the code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-24 15:43:50 -07:00