Commit graph

183 commits

Author SHA1 Message Date
Jeff King fea6c47f2f receive-pack: fix misleading namespace/.have comment
The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since
85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King ab6eea6f7b receive-pack: use oidset to de-duplicate .have lines
If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.

The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:

 - an unknown object is as large as the largest object type
   (a commit), which is bigger than an oidset entry

 - we can free the memory after our ref advertisement, but
   "struct object" entries persist forever (and the
   receive-pack may hang around for a long time, as the
   bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King 2429d63a46 for_each_alternate_ref: pass name/oid instead of ref struct
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.

The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Junio C Hamano 140d41ae87 Merge branch 'rs/receive-pack-cleanup'
Code clean-up.

* rs/receive-pack-cleanup:
  receive-pack: call string_list_clear() unconditionally
2017-02-02 13:36:57 -08:00
René Scharfe 4432dd6b5b receive-pack: call string_list_clear() unconditionally
string_list_clear() handles empty lists just fine, so remove the
redundant check.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-30 15:08:58 -08:00
Alex Henrie 2ddaa42783 receive-pack: improve English grammar of denyCurrentBranch message
The article "the" is required here.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-05 14:50:54 -08:00
Junio C Hamano dbaa6bdce2 Merge branch 'ls/filter-process'
The smudge/clean filter API expect an external process is spawned
to filter the contents for each path that has a filter defined.  A
new type of "process" filter API has been added to allow the first
request to run the filter for a path to spawn a single process, and
all filtering need is served by this single process for multiple
paths, reducing the process creation overhead.

* ls/filter-process:
  contrib/long-running-filter: add long running filter example
  convert: add filter.<driver>.process option
  convert: prepare filter.<driver>.process option
  convert: make apply_filter() adhere to standard Git error handling
  pkt-line: add functions to read/write flush terminated packet streams
  pkt-line: add packet_write_gently()
  pkt-line: add packet_flush_gently()
  pkt-line: add packet_write_fmt_gently()
  pkt-line: extract set_packet_header()
  pkt-line: rename packet_write() to packet_write_fmt()
  run-command: add clean_on_exit_handler
  run-command: move check_pipe() from write_or_die to run_command
  convert: modernize tests
  convert: quote filter names in error messages
2016-10-31 13:15:21 -07:00
Jeff King ef2ed5013c find_unique_abbrev: use 4-buffer ring
Some code paths want to format multiple abbreviated sha1s in
the same output line. Because we use a single static buffer
for our return value, they have to either break their output
into several calls or allocate their own arrays and use
find_unique_abbrev_r().

Intead, let's mimic sha1_to_hex() and use a ring of several
buffers, so that the return value stays valid through
multiple calls. This shortens some of the callers, and makes
it harder to for them to make a silly mistake.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26 13:30:51 -07:00
Junio C Hamano 25ab004c53 Merge branch 'jk/quarantine-received-objects'
In order for the receiving end of "git push" to inspect the
received history and decide to reject the push, the objects sent
from the sending end need to be made available to the hook and
the mechanism for the connectivity check, and this was done
traditionally by storing the objects in the receiving repository
and letting "git gc" to expire it.  Instead, store the newly
received objects in a temporary area, and make them available by
reusing the alternate object store mechanism to them only while we
decide if we accept the check, and once we decide, either migrate
them to the repository or purge them immediately.

* jk/quarantine-received-objects:
  tmp-objdir: do not migrate files starting with '.'
  tmp-objdir: put quarantine information in the environment
  receive-pack: quarantine objects until pre-receive accepts
  tmp-objdir: introduce API for temporary object directories
  check_connected: accept an env argument
2016-10-17 13:25:20 -07:00
Lars Schneider 81c634e94f pkt-line: rename packet_write() to packet_write_fmt()
packet_write() should be called packet_write_fmt() because it is a
printf-like function that takes a format string as first parameter.

packet_write_fmt() should be used for text strings only. Arbitrary
binary data should use a new packet_write() function that is introduced
in a subsequent patch.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 11:36:50 -07:00
Jeff King 722ff7f876 receive-pack: quarantine objects until pre-receive accepts
When a client pushes objects to us, index-pack checks the
objects themselves and then installs them into place. If we
then reject the push due to a pre-receive hook, we cannot
just delete the packfile; other processes may be depending
on it. We have to do a normal reachability check at this
point via `git gc`.

But such objects may hang around for weeks due to the
gc.pruneExpire grace period. And worse, during that time
they may be exploded from the pack into inefficient loose
objects.

Instead, this patch teaches receive-pack to put the new
objects into a "quarantine" temporary directory. We make
these objects available to the connectivity check and to the
pre-receive hook, and then install them into place only if
it is successful (and otherwise remove them as tempfiles).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:54:02 -07:00
Jeff King 16ddcd403b sha1_array: let callbacks interrupt iteration
The callbacks for iterating a sha1_array must have a void
return.  This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:41 -07:00
Junio C Hamano 1fe6f5fb0a Merge branch 'va/i18n'
More i18n.

* va/i18n:
  i18n: update-index: mark warnings for translation
  i18n: show-branch: mark plural strings for translation
  i18n: show-branch: mark error messages for translation
  i18n: receive-pack: mark messages for translation
  notes: spell first word of error messages in lowercase
  i18n: notes: mark error messages for translation
  i18n: merge-recursive: mark verbose message for translation
  i18n: merge-recursive: mark error messages for translation
  i18n: config: mark error message for translation
  i18n: branch: mark option description for translation
  i18n: blame: mark error messages for translation
2016-09-21 15:15:28 -07:00
Vasco Almeida 8ba35a2dc6 i18n: receive-pack: mark messages for translation
Mark messages refuse_unconfigured_deny_msg and
refuse_unconfigured_deny_delete_current_msg for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:17:32 -07:00
Jeff King c08db5a2d0 receive-pack: allow a maximum input size to be specified
Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.

Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24 12:31:05 -07:00
Junio C Hamano 2f664566c5 Merge branch 'jk/tighten-alloc'
Small code and comment clean-up.

* jk/tighten-alloc:
  receive-pack: use FLEX_ALLOC_MEM in queue_command()
  correct FLEXPTR_* example in comment
2016-08-17 14:07:46 -07:00
René Scharfe ddd0bfac7c receive-pack: use FLEX_ALLOC_MEM in queue_command()
Use the macro FLEX_ALLOC_MEM instead of open-coding it.  This shortens
and simplifies the code a bit.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:49:30 -07:00
Junio C Hamano a58a8e3f71 Merge branch 'jk/push-progress'
"git push" and "git clone" learned to give better progress meters
to the end user who is waiting on the terminal.

* jk/push-progress:
  receive-pack: send keepalives during quiet periods
  receive-pack: turn on connectivity progress
  receive-pack: relay connectivity errors to sideband
  receive-pack: turn on index-pack resolving progress
  index-pack: add flag for showing delta-resolution progress
  clone: use a real progress meter for connectivity check
  check_connected: add progress flag
  check_connected: relay errors to alternate descriptor
  check_everything_connected: use a struct with named options
  check_everything_connected: convert to argv_array
  rev-list: add optional progress reporting
  check_everything_connected: always pass --quiet to rev-list
2016-08-03 15:10:28 -07:00
Jeff King 83558686ce receive-pack: send keepalives during quiet periods
After a client has sent us the complete pack, we may spend
some time processing the data and running hooks. If the
client asked us to be quiet, receive-pack won't send any
progress data during the index-pack or connectivity-check
steps. And hooks may or may not produce their own progress
output. In these cases, the network connection is totally
silent from both ends.

Git itself doesn't care about this (it will wait forever),
but other parts of the system (e.g., firewalls,
load-balancers, etc) might hang up the connection. So we'd
like to send some sort of keepalive to let the network and
the client side know that we're still alive and processing.

We can use the same trick we did in 05e9515 (upload-pack:
send keepalive packets during pack computation, 2013-09-08).
Namely, we will send an empty sideband data packet every `N`
seconds that we do not relay any stderr data over the
sideband channel. As with 05e9515, this means that we won't
bother sending keepalives when there's actual progress data,
but will kick in when it has been disabled (or if there is a
lull in the progress data).

The concept is simple, but the details are subtle enough
that they need discussing here.

Before the client sends us the pack, we don't want to do any
keepalives. We'll have sent our ref advertisement, and we're
waiting for them to send us the pack (and tell us that they
support sidebands at all).

While we're receiving the pack from the client (or waiting
for it to start), there's no need for keepalives; it's up to
them to keep the connection active by sending data.
Moreover, it would be wrong for us to do so. When we are the
server in the smart-http protocol, we must treat our
connection as half-duplex. So any keepalives we send while
receiving the pack would potentially be buffered by the
webserver. Not only does this make them useless (since they
would not be delivered in a timely manner), but it could
actually cause a deadlock if we fill up the buffer with
keepalives. (It wouldn't be wrong to send keepalives in this
phase for a full-duplex connection like ssh; it's simply
pointless, as it is the client's responsibility to speak).

As soon as we've gotten all of the pack data, then the
client is waiting for us to speak, and we should start
keepalives immediately. From here until the end of the
connection, we send one any time we are not otherwise
sending data.

But there's a catch. Receive-pack doesn't know the moment
we've gotten all the data. It passes the descriptor to
index-pack, who reads all of the data, and then starts
resolving the deltas. We have to communicate that back.

To make this work, we instruct the sideband muxer to enable
keepalives in three phases:

  1. In the beginning, not at all.

  2. While reading from index-pack, wait for a signal
     indicating end-of-input, and then start them.

  3. Afterwards, always.

The signal from index-pack in phase 2 has to come over the
stderr channel which the muxer is reading. We can't use an
extra pipe because the portable run-command interface only
gives us stderr and stdout.

Stdout is already used to pass the .keep filename back to
receive-pack. We could also send a signal there, but then we
would find out about it in the main thread. And the
keepalive needs to be done by the async muxer thread (since
it's the one writing sideband data back to the client). And
we can't reliably signal the async thread from the main
thread, because the async code sometimes uses threads and
sometimes uses forked processes.

Therefore the signal must come over the stderr channel,
where it may be interspersed with other random
human-readable messages from index-pack. This patch makes
the signal a single NUL byte.  This is easy to parse, should
not appear in any normal stderr output, and we don't have to
worry about any timing issues (like seeing half the signal
bytes in one read(), and half in a subsequent one).

This is a bit ugly, but it's simple to code and should work
reliably.

Another option would be to stop using an async thread for
muxing entirely, and just poll() both stderr and stdout of
index-pack from the main thread. This would work for
index-pack (because we aren't doing anything useful in the
main thread while it runs anyway). But it would make the
connectivity check and the hook muxers much more
complicated, as they need to simultaneously feed the
sub-programs while reading their stderr.

The index-pack phase is the only one that needs this
signaling, so it could simply behave differently than the
other two. That would mean having two separate
implementations of copy_to_sideband (and the keepalive
code), though. And it still doesn't get rid of the
signaling; it just means we can write a nicer message like
"END_OF_INPUT" or something on stdout, since we don't have
to worry about separating it from the stderr cruft.

One final note: this signaling trick is only done with
index-pack, not with unpack-objects. There's no point in
doing it for the latter, because by definition it only kicks
in for a small number of objects, where keepalives are not
as useful (and this conveniently lets us avoid duplicating
the implementation).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King 6b4cd2f827 receive-pack: turn on connectivity progress
When we receive a large push, the server side of the
connection may spend a lot of time (30s or more for a full
push of linux.git) walking the object graph without
producing any output. Let's give the user some indication
that we're actually working.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King d415092ac4 receive-pack: relay connectivity errors to sideband
If the connectivity check encounters a problem when
receiving a push, the error output goes to receive-pack's
stderr, whose destination depends on the protocol used
(ssh tends to send it to the user, though without a "remote"
prefix; http will generally eat it in the server's error
log).

The information should consistently go back to the user, as
there is a reasonable chance their client is buggy and
generating a bad pack.

We can do so by muxing it over the sideband as we do with
other sub-process stderr.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King d06303bb9a receive-pack: turn on index-pack resolving progress
When we receive a large push, the server side may have to
spend a lot of CPU processing the incoming packfile.

During the "receiving" phase, we are typically network
bound, and the client is writing its own progress to the
user. But during the delta resolution phase, we may spend
minutes (e.g., for a full push of linux.git) without
making any indication to the user that the connection has
not hung.

Let's ask index-pack to produce progress output for this
phase (unless the client asked us to be quiet, of course).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King 7043c7071c check_everything_connected: use a struct with named options
The number of variants of check_everything_connected has
grown over the years, so that the "real" function takes
several possibly-zero, possibly-NULL arguments. We hid the
complexity behind some wrapper functions, but this doesn't
scale well when we want to add new options.

If we add more wrapper variants to handle the new options,
then we can get a combinatorial explosion when those options
might be used together (right now nobody wants to use both
"shallow" and "transport" together, so we get by with just a
few wrappers).

If instead we add new parameters to each function, each of
which can have a default value, then callers who want the
defaults end up with confusing invocations like:

  check_everything_connected(fn, 0, data, -1, 0, NULL);

where it is unclear which parameter is which (and every
caller needs updated when we add new options).

Instead, let's add a struct to hold all of the optional
parameters. This is a little more verbose for the callers
(who have to declare the struct and fill it in), but it
makes their code much easier to follow, because every option
is named as it is set (and unused options do not have to be
mentioned at all).

Note that we could also stick the iteration function and its
callback data into the option struct, too. But since those
are required for each call, by avoiding doing so, we can let
very simple callers just pass "NULL" for the options and not
worry about the struct at all.

While we're touching each site, let's also rename the
function to check_connected(). The existing name was quite
long, and not all of the wrappers even used the full name.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:10:53 -07:00
Stefan Beller c714e45f87 receive-pack: implement advertising and receiving push options
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:40 -07:00
Stefan Beller 77a9745d19 push options: {pre,post}-receive hook learns about push options
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable
============================================
+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
======================================
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==========================================================
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
======================================================
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:17 -07:00
Junio C Hamano 8579c4ebee Merge branch 'lf/receive-pack-auto-gc-to-client'
Allow messages that are generated by auto gc during "git push" on
the receiving end to be explicitly passed back to the sending end
over sideband, so that they are shown with "remote: " prefix to
avoid confusing the users.

* lf/receive-pack-auto-gc-to-client:
  receive-pack: send auto-gc output over sideband 2
2016-06-27 09:56:52 -07:00
Lukas Fleischer 860a2ebecd receive-pack: send auto-gc output over sideband 2
Redirect auto-gc output to the sideband such that it is visible to all
clients. As a side effect, all auto-gc error messages are now prefixed
with "remote: " before being printed to stderr on the client-side which
makes it easier to understand that those error messages originate from
the server.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 10:58:55 -07:00
Junio C Hamano edc2f715bd Merge branch 'dt/pre-refs-backend'
Code restructuring around the "refs" area to prepare for pluggable
refs backends.

* dt/pre-refs-backend: (24 commits)
  refs: on symref reflog expire, lock symref not referrent
  refs: move resolve_ref_unsafe into common code
  show_head_ref(): check the result of resolve_ref_namespace()
  check_aliased_update(): check that dst_name is non-NULL
  checkout_paths(): remove unneeded flag variable
  cmd_merge(): remove unneeded flag variable
  fsck_head_link(): remove unneeded flag variable
  read_raw_ref(): change flags parameter to unsigned int
  files-backend: inline resolve_ref_1() into resolve_ref_unsafe()
  read_raw_ref(): manage own scratch space
  files-backend: break out ref reading
  resolve_ref_1(): eliminate local variable "bad_name"
  resolve_ref_1(): reorder code
  resolve_ref_1(): eliminate local variable
  resolve_ref_unsafe(): ensure flags is always set
  resolve_ref_unsafe(): use for loop to count up to MAXDEPTH
  resolve_missing_loose_ref(): simplify semantics
  t1430: improve test coverage of deletion of badly-named refs
  t1430: test for-each-ref in the presence of badly-named refs
  t1430: don't rely on symbolic-ref for creating broken symrefs
  ...
2016-04-25 15:17:15 -07:00
Michael Haggerty ded8393610 check_aliased_update(): check that dst_name is non-NULL
If there is an error in resolve_ref_unsafe(), it returns NULL. We check
for this case, but not until after calling strip_namespace(). Instead,
call strip_namespace() *after* the NULL check.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-10 11:35:37 -07:00
Sidhant Sharma [:tk] 1b68387e02 builtin/receive-pack.c: use parse_options API
Make receive-pack use the parse_options API,
bringing it more in line with send-pack and push.

Helped-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Sidhant Sharma [:tk] <tigerkid001@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-01 13:38:45 -08:00
Jeff King 50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King b32fa95fd8 convert trivial cases to ALLOC_ARRAY
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King 850d2fec53 convert manual allocations to argv_array
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
     array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
     then allocate, then fill in the array. Now we can do it
     in one pass, making the code shorter and easier to
     follow.

  3. argv_array handles memory ownership for us, making it
     more obvious when things should be free()d and and when
     not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:50:32 -08:00
Junio C Hamano f748e69167 Merge branch 'js/close-packs-before-gc' into maint
Many codepaths that run "gc --auto" before exiting kept packfiles
mapped and left the file descriptors to them open, which was not
friendly to systems that cannot remove files that are open.  They
now close the packs before doing so.

* js/close-packs-before-gc:
  receive-pack: release pack files before garbage-collecting
  merge: release pack files before garbage-collecting
  am: release pack files before garbage-collecting
  fetch: release pack files before garbage-collecting
2016-02-05 14:54:13 -08:00
Junio C Hamano 6e29ac2302 Merge branch 'jk/clang-pedantic' into maint
A few unportable C construct have been spotted by clang compiler
and have been fixed.

* jk/clang-pedantic:
  bswap: add NO_UNALIGNED_LOADS define
  avoid shifting signed integers 31 bits
2016-02-05 14:54:09 -08:00
Johannes Schindelin d5621020c1 receive-pack: release pack files before garbage-collecting
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-13 11:36:28 -08:00
Jeff King 9a93c6686f avoid shifting signed integers 31 bits
We sometimes use 32-bit unsigned integers as bit-fields.
It's fine to access the MSB, because it's unsigned. However,
doing so as "1 << 31" is wrong, because the constant "1" is
a signed int, and we shift into the sign bit, causing
undefined behavior.

We can fix this by using "1U" as the constant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-04 09:51:16 -08:00
brian m. carlson f4e54d02b8 Convert struct ref to use object_id.
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Lukas Fleischer 78a766ab6e hideRefs: add support for matching full refs
In addition to matching stripped refs, one can now add hideRefs
patterns that the full (unstripped) ref is matched against. To
distinguish between stripped and full matches, those new patterns
must be prefixed with a circumflex (^).

This commit also removes support for the undocumented and unintended
hideRefs settings ".have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-05 11:25:02 -08:00
Jeff King b26cb7c777 receive-pack: simplify keep_arg computation
To generate "--keep=receive-pack $pid on $host", we write
progressively into a single buffer, which requires keeping
track of how much we've written so far. But since the result
is destined to go into our argv array, we can simply use
argv_array_pushf.

Unfortunately we still have to have a fixed-size buffer for
the gethostname() call, but at least it now doesn't involve
any extra size computation. And as a bonus, we drop an
sprintf and a strcpy call.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King d59f765ac9 use sha1_to_hex_r() instead of strcpy
Before sha1_to_hex_r() existed, a simple way to get hex
sha1 into a buffer was with:

  strcpy(buf, sha1_to_hex(sha1));

This isn't wrong (assuming the buf is 41 characters), but it
makes auditing the code base for bad strcpy() calls harder,
as these become false positives.

Let's convert them to sha1_to_hex_r(), and likewise for
some calls to find_unique_abbrev(). While we're here, we'll
double-check that all of the buffers are correctly sized,
and use the more obvious GIT_SHA1_HEXSZ constant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King b7115a350b receive-pack: convert strncpy to xsnprintf
This strncpy is pointless; we pass the strlen() of the src
string, meaning that it works just like a memcpy. Worse,
though, is that the size has no relation to the destination
buffer, meaning it is a potential overflow.  In practice,
it's not. We pass only short constant strings like
"warning: " and "error: ", which are much smaller than the
destination buffer.

We can make this much simpler by just using xsnprintf, which
will check for overflow and return the size for our next
vsnprintf, without us having to run a separate strlen().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano 0baebca51e Merge branch 'jx/do-not-crash-receive-pack-wo-head'
An attempt to delete a ref by pushing into a repositorywhose HEAD
symbolic reference points at an unborn branch that cannot be
created due to ref D/F conflict (e.g. refs/heads/a/b exists, HEAD
points at refs/heads/a) failed.

* jx/do-not-crash-receive-pack-wo-head:
  receive-pack: crash when checking with non-exist HEAD
2015-08-03 11:01:31 -07:00
Jiang Xin b112b14d78 receive-pack: crash when checking with non-exist HEAD
If HEAD of a repository points to a conflict reference, such as:

* There exist a reference named 'refs/heads/jx/feature1', but HEAD
  points to 'refs/heads/jx', or

* There exist a reference named 'refs/heads/feature', but HEAD points
  to 'refs/heads/feature/bad'.

When we push to delete a reference for this repo, such as:

        git push /path/to/bad-head-repo.git :some/good/reference

The git-receive-pack process will crash.

This is because if HEAD points to a conflict reference, the function
`resolve_refdup("HEAD", ...)` does not return a valid reference name,
but a null buffer.  Later matching the delete reference against the null
buffer will cause git-receive-pack crash.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-22 14:18:22 -07:00
Johannes Schindelin cd94c6f91e fsck: git receive-pack: support excluding objects from fsck'ing
The optional new config option `receive.fsck.skipList` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 14:27:37 -07:00
Johannes Schindelin 5d477a334a fsck (receive-pack): allow demoting errors to warnings
For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missingemail=warn

The value is actually a comma-separated list.

In case that the same key is listed in multiple receive.fsck.<msg-id>
lines in the config, the latter configuration wins (this can happen for
example when both $HOME/.gitconfig and .git/config contain message type
settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 14:27:34 -07:00
Michael Haggerty ce2a987329 show_ref_cb(): rewrite to take an object_id argument
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:29 -07:00
Michael Haggerty 2b2a5be394 each_ref_fn: change to take an object_id parameter
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
Junio C Hamano 68a2e6a2c8 Merge branch 'nd/multiple-work-trees'
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.

* nd/multiple-work-trees: (41 commits)
  prune --worktrees: fix expire vs worktree existence condition
  t1501: fix test with split index
  t2026: fix broken &&-chain
  t2026 needs procondition SANITY
  git-checkout.txt: a note about multiple checkout support for submodules
  checkout: add --ignore-other-wortrees
  checkout: pass whole struct to parse_branchname_arg instead of individual flags
  git-common-dir: make "modules/" per-working-directory directory
  checkout: do not fail if target is an empty directory
  t2025: add a test to make sure grafts is working from a linked checkout
  checkout: don't require a work tree when checking out into a new one
  git_path(): keep "info/sparse-checkout" per work-tree
  count-objects: report unused files in $GIT_DIR/worktrees/...
  gc: support prune --worktrees
  gc: factor out gc.pruneexpire parsing code
  gc: style change -- no SP before closing parenthesis
  checkout: clean up half-prepared directories in --to mode
  checkout: reject if the branch is already checked out elsewhere
  prune: strategies for linked checkouts
  checkout: support checking out into a new working directory
  ...
2015-05-11 14:23:39 -07:00
Junio C Hamano fa9aaa8f10 Merge branch 'jc/update-instead-into-void'
A push into an unborn branch, with "receive.denyCurrentBranch" set
to "updateInstead", did not check out the working tree as expected.

* jc/update-instead-into-void:
  push-to-deploy: allow pushing into an unborn branch and updating it
2015-04-14 11:49:10 -07:00