1
0
mirror of https://github.com/git/git synced 2024-07-05 00:58:49 +00:00
Commit Graph

434 Commits

Author SHA1 Message Date
Junio C Hamano
5cb7c73589 Merge branch 'ds/close-object-store'
The commit-graph file is now part of the "files that the runtime
may keep open file descriptors on, all of which would need to be
closed when done with the object store", and the file descriptor to
an existing commit-graph file now is closed before "gc" finalizes a
new instance to replace it.

* ds/close-object-store:
  packfile: rename close_all_packs to close_object_store
  packfile: close commit-graph in close_all_packs
  commit-graph: use raw_object_store when closing
2019-07-09 15:25:37 -07:00
Jeff King
709dfa6990 object-store.h: move for_each_alternate_ref() from transport.h
There's nothing inherently transport-related about enumerating the
alternate ref tips. The code has lived in transport.[ch] because the
only use so far had been advertising available tips during transport.
But it could be used for more, and a future patch will teach rev-list to
access these refs.

Let's move it alongside the other alt-odb code, declaring it in
object-store.h with the implementation in sha1-file.c.

This lets us drop the inclusion of transport.h from receive-pack, which
perhaps shows how it was misplaced (though receive-pack is about
transporting objects, transport.h is mostly about the client side).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-01 09:47:29 -07:00
Derrick Stolee
2d511cfc0b packfile: rename close_all_packs to close_object_store
The close_all_packs() method is now responsible for more than just pack-files.
It also closes the commit-graph and the multi-pack-index. Rename the function
to be more descriptive of its larger role. The name also fits because the
input parameter is a raw_object_store.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-12 11:33:54 -07:00
Jeff King
c95fc72f47 receive-pack: drop unused "commands" from prepare_shallow_update()
We pass in the list of proposed ref updates to prepare_shallow_update(),
but that function doesn't actually need it (and never has since its
inception in 0a1bc12b6e). Only its caller, update_shallow_info(), needs
to look at the command list.

Let's drop the unused parameter to reduce confusion.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-13 14:22:54 +09:00
Junio C Hamano
4751a1673c Merge branch 'ab/receive-pack-use-after-free-fix'
Memfix.

* ab/receive-pack-use-after-free-fix:
  receive-pack: fix use-after-free bug
2019-03-07 09:59:57 +09:00
Jeff Hostetler
6206286e49 trace2:data: add trace2 hook classification
Classify certain child processes as hooks.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22 15:28:21 -08:00
Ævar Arnfjörð Bjarmason
9903623761 receive-pack: fix use-after-free bug
The resolve_ref_unsafe() function can, and sometimes will in the case
of this codepath, return the char * passed to it to the caller. In
this case we construct a strbuf, free it, and then continue using the
dst_name after that free().

The code being fixed dates back to da3efdb17b ("receive-pack: detect
aliased updates which can occur with symrefs", 2010-04-19). When it
was originally added it didn't have this bug, it was introduced when
it was subsequently modified to use strbuf in 6b01ecfe22 ("ref
namespaces: Support remote repositories via upload-pack and
receive-pack", 2011-07-08).

This is theoretically a security issue, the C standard makes no
guarantees that a value you use after free() hasn't been poked at or
changed by something else on the system, but in practice modern OSs
will have mapped the relevant page to this process, so nothing else
would have used it. We do no further allocations between the free()
and use-after-free, so we ourselves didn't corrupt or change the
value.

Jeff investigated that and found: "It probably would be an issue if
the allocation were larger. glibc at least will use mmap()/munmap()
after some cutoff[1], in which case we'd get a segfault from hitting
the unmapped page. But for small allocations, it just bumps brk() and
the memory is still available for further allocations after
free(). [...] If you had a sufficiently large refname you might be
able to trigger the bug [...]. I tried to push such a ref. I had to
manually make a packed-refs file with the long name to avoid
filesystem limits (though probably you could have a long a/b/c/ name
on ext4).  But the result can't actually be pushed, because it all has
to fit into a 64k pkt-line as part of the push protocol.".

An a alternative and more succinct way of implementing this would have
been to do the strbuf_release() at the end of check_aliased_update()
and use "goto out" instead of the early "return" statements. Hopefully
this approach of using a helper instead makes it easier to follow.

1. Jeff: "Weirdly, the mmap() cutoff on my glibc system is 135168
   bytes. Which is...2^17 + 2^12? 33 pages? I'm sure there's a good
   reason for that, but I didn't dig into it."

Reported-by: 王健强 <jianqiang.wang@securitygossip.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-20 15:02:12 -08:00
Masaya Suzuki
2d103c31c2 pack-protocol.txt: accept error packets in any context
In the Git pack protocol definition, an error packet may appear only in
a certain context. However, servers can face a runtime error (e.g. I/O
error) at an arbitrary timing. This patch changes the protocol to allow
an error packet to be sent instead of any packet.

Without this protocol spec change, when a server cannot process a
request, there's no way to tell that to a client. Since the server
cannot produce a valid response, it would be forced to cut a connection
without telling why. With this protocol spec change, the server can be
more gentle in this situation. An old client may see these error packets
as an unexpected packet, but this is not worse than having an unexpected
EOF.

Following this protocol spec change, the error packet handling code is
moved to pkt-line.c. Implementation wise, this implementation uses
pkt-line to communicate with a subprocess. Since this is not a part of
Git protocol, it's possible that a packet that is not supposed to be an
error packet is mistakenly parsed as an error packet. This error packet
handling is enabled only for the Git pack protocol parsing code
considering this.

Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-02 13:05:30 -08:00
Masaya Suzuki
01f9ec64c8 Use packet_reader instead of packet_read_line
By using and sharing a packet_reader while handling a Git pack protocol
request, the same reader option is used throughout the code. This makes
it easy to set a reader option to the request parsing code.

Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-02 13:05:27 -08:00
Junio C Hamano
4c7f544022 Merge branch 'jc/receive-deny-current-branch-fix'
The receive.denyCurrentBranch=updateInstead codepath kicked in even
when the push should have been rejected due to other reasons, such
as it does not fast-forward or the update-hook rejects it, which
has been corrected.

* jc/receive-deny-current-branch-fix:
  receive: denyCurrentBranch=updateinstead should not blindly update
2018-10-30 15:43:46 +09:00
Junio C Hamano
b072a25fad receive: denyCurrentBranch=updateinstead should not blindly update
The handling of receive.denyCurrentBranch=updateInstead was added to
a switch statement that handles other values of the variable, but
all the other case arms only checked a condition to reject the
attempted push, or let later logic in the same function to still
intervene, so that a push that does not fast-forward (which is
checked after the switch statement in question) is still rejected.

But the handling of updateInstead incorrectly took immediate effect,
without giving other checks a chance to intervene.

Instead of calling update_worktree() that causes the side effect
immediately, just note the fact that we will need to call the
function later, and first give other checks a chance to reject the
request.  After the update-hook gets a chance to reject the push
(which happens as the last step in a series of checks), call
update_worktree() when we earlier detected the need to.

Reported-by: Rajesh Madamanchi
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19 15:13:32 +09:00
Junio C Hamano
465e73fff3 Merge branch 'tb/filter-alternate-refs'
When pushing into a repository that borrows its objects from an
alternate object store, "git receive-pack" that responds to the
push request on the other side lists the tips of refs in the
alternate to reduce the amount of objects transferred.  This
sometimes is detrimental when the number of refs in the alternate
is absurdly large, in which case the bandwidth saved in potentially
fewer objects transferred is wasted in excessively large ref
advertisement.  The alternate refs that are advertised are now
configurable with a pair of configuration variables.

* tb/filter-alternate-refs:
  transport.c: introduce core.alternateRefsPrefixes
  transport.c: introduce core.alternateRefsCommand
  transport.c: extract 'fill_alternate_refs_command'
  transport: drop refnames from for_each_alternate_ref
2018-10-19 13:34:08 +09:00
Junio C Hamano
20e0ef6a03 Merge branch 'jk/check-everything-connected-is-long-gone'
Comment fix.

* jk/check-everything-connected-is-long-gone:
  receive-pack: update comment with check_everything_connected
2018-10-16 16:16:04 +09:00
Jeff King
bdf4276c91 transport: drop refnames from for_each_alternate_ref
None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-09 14:30:02 +09:00
Jeff King
7987d2232d receive-pack: update comment with check_everything_connected
That function is now called "check_connected()", but we forgot to update
this comment in 7043c7071c (check_everything_connected: use a struct
with named options, 2016-07-15).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-25 13:32:08 -07:00
Junio C Hamano
ae109a9789 Merge branch 'en/double-semicolon-fix'
Code clean-up.

* en/double-semicolon-fix:
  Remove superfluous trailing semicolons
2018-09-24 10:30:47 -07:00
Junio C Hamano
769af0fd9e Merge branch 'jk/cocci'
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.

* jk/cocci:
  show_dirstat: simplify same-content check
  read-cache: use oideq() in ce_compare functions
  convert hashmap comparison functions to oideq()
  convert "hashcmp() != 0" to "!hasheq()"
  convert "oidcmp() != 0" to "!oideq()"
  convert "hashcmp() == 0" to hasheq()
  convert "oidcmp() == 0" to oideq()
  introduce hasheq() and oideq()
  coccinelle: use <...> for function exclusion
2018-09-17 13:53:57 -07:00
Junio C Hamano
1b7a91da71 Merge branch 'ds/reachable'
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.

* ds/reachable:
  commit-reach: correct accidental #include of C file
  commit-reach: use can_all_from_reach
  commit-reach: make can_all_from_reach... linear
  commit-reach: replace ref_newer logic
  test-reach: test commit_contains
  test-reach: test can_all_from_reach_with_flags
  test-reach: test reduce_heads
  test-reach: test get_merge_bases_many
  test-reach: test is_descendant_of
  test-reach: test in_merge_bases
  test-reach: create new test tool for ref_newer
  commit-reach: move can_all_from_reach_with_flags
  upload-pack: generalize commit date cutoff
  upload-pack: refactor ok_to_give_up()
  upload-pack: make reachable() more generic
  commit-reach: move commit_contains from ref-filter
  commit-reach: move ref_newer from remote.c
  commit.h: remove method declarations
  commit-reach: move walk methods from commit.c
2018-09-17 13:53:52 -07:00
Elijah Newren
c3b9bc94b9 Remove superfluous trailing semicolons
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-05 10:21:24 -07:00
Jeff King
4a7e27e957 convert "oidcmp() == 0" to oideq()
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.

The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).

This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.

I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29 11:32:49 -07:00
Junio C Hamano
3a2a1dc170 Merge branch 'sb/object-store-lookup'
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.

* sb/object-store-lookup: (32 commits)
  commit.c: allow lookup_commit_reference to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: migrate the commit buffer to the parsed object store
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  object: allow object_as_type to handle arbitrary repositories
  tag: add repository argument to deref_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to lookup_tag
  ...
2018-08-02 15:30:42 -07:00
Junio C Hamano
6fc7de1a1f Merge branch 'hs/push-cert-check-cleanup'
Code clean-up.

* hs/push-cert-check-cleanup:
  gpg-interface: make parse_gpg_output static and remove from interface header
  builtin/receive-pack: use check_signature from gpg-interface
2018-07-24 14:50:48 -07:00
Derrick Stolee
6404355657 commit.h: remove method declarations
These methods are now declared in commit-reach.h. Remove them from
commit.h and add new include statements in all files that require these
declarations.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:54 -07:00
Junio C Hamano
00624d608c Merge branch 'sb/object-store-grafts'
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.

* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
2018-07-18 12:20:28 -07:00
Henning Schild
3b9291e182 builtin/receive-pack: use check_signature from gpg-interface
The combination of verify_signed_buffer followed by parse_gpg_output is
available as check_signature. Use that instead of implementing it again.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-11 10:05:20 -07:00
Stefan Beller
109cd76dd3 object: add repository argument to parse_object
Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29 10:43:38 -07:00
Junio C Hamano
b16b60f71b Merge branch 'sb/object-store-grafts' into sb/object-store-lookup
* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
2018-06-29 10:43:28 -07:00
Junio C Hamano
42c8ce1c49 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (42 commits)
  merge-one-file: compute empty blob object ID
  add--interactive: compute the empty tree value
  Update shell scripts to compute empty tree object ID
  sha1_file: only expose empty object constants through git_hash_algo
  dir: use the_hash_algo for empty blob object ID
  sequencer: use the_hash_algo for empty tree object ID
  cache-tree: use is_empty_tree_oid
  sha1_file: convert cached object code to struct object_id
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  merge: convert empty tree constant to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/receive-pack: avoid hard-coded constants for push certs
  diff: specify abbreviation size in terms of the_hash_algo
  upload-pack: replace use of several hard-coded constants
  ...
2018-05-30 14:04:10 +09:00
Junio C Hamano
50f08db594 Merge branch 'js/use-bug-macro'
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.

* js/use-bug-macro:
  BUG_exit_code: fix sparse "symbol not declared" warning
  Convert remaining die*(BUG) messages
  Replace all die("BUG: ...") calls by BUG() ones
  run-command: use BUG() to report bugs, not die()
  test-tool: help verifying BUG() code paths
2018-05-30 14:04:07 +09:00
Junio C Hamano
2f76ebc93c Merge branch 'ma/lockfile-cleanup'
Code clean-up to adjust to a more recent lockfile API convention that
allows lockfile instances kept on the stack.

* ma/lockfile-cleanup:
  lock_file: move static locks into functions
  lock_file: make function-local locks non-static
  refs.c: do not die if locking fails in `delete_pseudoref()`
  refs.c: do not die if locking fails in `write_pseudoref()`
  t/helper/test-write-cache: clean up lock-handling
2018-05-30 14:04:05 +09:00
Stefan Beller
19143f139d shallow: add repository argument to register_shallow
Add a repository argument to allow callers of register_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
cbd53a2193 object-store: move object access functions to object-store.h
This should make these functions easier to find and cache.h less
overwhelming to read.

In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file

As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise.  It would be better to #include wherever
identifiers from the header are used.  That can happen later
when we have better tooling for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Martin Ågren
b227586831 lock_file: make function-local locks non-static
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)

These `struct lock_file`s are local to their respective functions and we
can drop their staticness.

For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.

As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-10 14:54:45 +09:00
Junio C Hamano
9bfa0f9be3 Merge branch 'bw/protocol-v2'
The beginning of the next-gen transfer protocol.

* bw/protocol-v2: (35 commits)
  remote-curl: don't request v2 when pushing
  remote-curl: implement stateless-connect command
  http: eliminate "# service" line when using protocol v2
  http: don't always add Git-Protocol header
  http: allow providing extra headers for http requests
  remote-curl: store the protocol version the server responded with
  remote-curl: create copy of the service name
  pkt-line: add packet_buf_write_len function
  transport-helper: introduce stateless-connect
  transport-helper: refactor process_connect_service
  transport-helper: remove name parameter
  connect: don't request v2 when pushing
  connect: refactor git_connect to only get the protocol version once
  fetch-pack: support shallow requests
  fetch-pack: perform a fetch using v2
  upload-pack: introduce fetch server command
  push: pass ref prefixes when pushing
  fetch: pass ref prefixes when fetching
  ls-remote: pass ref prefixes when requesting a remote's refs
  transport: convert transport_get_remote_refs to take a list of ref prefixes
  ...
2018-05-08 15:59:16 +09:00
Johannes Schindelin
033abf97fc Replace all die("BUG: ...") calls by BUG() ones
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).

The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.

Let's just convert all remaining ones in one fell swoop.

This trick was performed by this invocation:

	sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 19:06:13 +09:00
brian m. carlson
c00866a2cc builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
Convert one use of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:52 +09:00
brian m. carlson
f6d27d2468 builtin/receive-pack: avoid hard-coded constants for push certs
Use the GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ macros instead of hard-coding
the constants 20 and 40.  Switch one use of 20 with a format specifier
for a hex value to use the hex constant instead, as the original appears
to have been a typo.

At this point, avoid converting the hard-coded use of SHA-1 to use
the_hash_algo.  SHA-1, even if not collision resistant, is secure in the
context in which it is used here, and the hash algorithm of the repo
need not match what is used here.  When we adopt a new hash algorithm,
we can simply adopt the new algorithm wholesale here, as the nonce is
opaque and its length and validity are entirely controlled by the
server.  Consequently, defer updating this code until that point.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:51 +09:00
Stefan Beller
d807c4a01d exec_cmd: rename to use dash in file name
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>
2018-04-11 18:11:00 +09:00
Junio C Hamano
3a1ec60c43 Merge branch 'sb/packfiles-in-repository'
Refactoring of the internal global data structure continues.

* sb/packfiles-in-repository:
  packfile: keep prepare_packed_git() private
  packfile: allow find_pack_entry to handle arbitrary repositories
  packfile: add repository argument to find_pack_entry
  packfile: allow reprepare_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git_one to handle arbitrary repositories
  packfile: add repository argument to reprepare_packed_git
  packfile: add repository argument to prepare_packed_git
  packfile: add repository argument to prepare_packed_git_one
  packfile: allow install_packed_git to handle arbitrary repositories
  packfile: allow rearrange_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git_mru to handle arbitrary repositories
2018-04-11 13:09:55 +09:00
Junio C Hamano
cf0b1793ea Merge branch 'sb/object-store'
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.

Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.

* sb/object-store: (27 commits)
  sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
  sha1_file: allow map_sha1_file to handle arbitrary repositories
  sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
  sha1_file: allow open_sha1_file to handle arbitrary repositories
  sha1_file: allow stat_sha1_file to handle arbitrary repositories
  sha1_file: allow sha1_file_name to handle arbitrary repositories
  sha1_file: add repository argument to sha1_loose_object_info
  sha1_file: add repository argument to map_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to sha1_file_name
  sha1_file: allow prepare_alt_odb to handle arbitrary repositories
  sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
  sha1_file: add repository argument to prepare_alt_odb
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add raw_object_store argument to alt_odb_usable
  pack: move approximate object count to object store
  ...
2018-04-11 13:09:55 +09:00
Stefan Beller
a49d283435 packfile: add repository argument to reprepare_packed_git
See previous patch for explanation.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26 10:07:43 -07:00
Stefan Beller
d0b5986622 object-store: close all packs upon clearing the object store
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26 10:05:55 -07:00
Brandon Williams
8f6982b4e1 protocol: introduce enum protocol_version value protocol_v2
Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 14:15:07 -07:00
brian m. carlson
aab9583f7b Convert find_unique_abbrev* to struct object_id
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:48 -07:00
Patryk Obara
a09c985eae sha1_file: convert write_sha1_file to object_id
Convert the definition and declaration of write_sha1_file to
struct object_id and adjust usage of this function.

This commit also converts static function write_sha1_file_prepare, as it
is closely related.

Rename these functions to write_object_file and
write_object_file_prepare respectively.

Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents
wherever possible.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Brandon Williams
aa9bab29b8 upload-pack, receive-pack: introduce protocol version 1
Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
brian m. carlson
6ccac9eed5 Convert check_connected to use struct object_id
Convert check_connected and the callbacks it takes to use struct
object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:50 +09:00
brian m. carlson
89f3bbdd3b refs: update ref transactions to use struct object_id
Update the ref transaction code to use struct object_id.  Remove one
NULL pointer check which was previously inserted around a dereference;
since we now pass a pointer to struct object_id directly through, the
code we're calling handles this for us.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:50 +09:00
Junio C Hamano
efe9d6ce33 Merge branch 'rs/resolve-ref-optional-result'
Code clean-up.

* rs/resolve-ref-optional-result:
  refs: pass NULL to resolve_refdup() if hash is not needed
  refs: pass NULL to refs_resolve_refdup() if hash is not needed
2017-10-05 13:48:19 +09:00
René Scharfe
efbd4fdfc9 refs: pass NULL to resolve_refdup() if hash is not needed
This allows us to get rid of several write-only variables.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:27:14 +09:00
Junio C Hamano
73ecdc606e Merge branch 'rs/resolve-ref-optional-result'
Code clean-up.

* rs/resolve-ref-optional-result:
  refs: pass NULL to resolve_ref_unsafe() if hash is not needed
  refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
  refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
2017-09-28 14:47:56 +09:00
Junio C Hamano
c50424a6f0 Merge branch 'jk/write-in-full-fix'
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.

* jk/write-in-full-fix:
  read_pack_header: handle signed/unsigned comparison in read result
  config: flip return value of store_write_*()
  notes-merge: use ssize_t for write_in_full() return value
  pkt-line: check write_in_full() errors against "< 0"
  convert less-trivial versions of "write_in_full() != len"
  avoid "write_in_full(fd, buf, len) != len" pattern
  get-tar-commit-id: check write_in_full() return against 0
  config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-25 15:24:06 +09:00
René Scharfe
744c040b19 refs: pass NULL to resolve_ref_unsafe() if hash is not needed
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:18:21 +09:00
Jeff King
06f46f237a avoid "write_in_full(fd, buf, len) != len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "< len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b89 (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "<0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jonathan Tan
3836d88ae5 pack: move pack-closing functions
The function close_pack_fd() needs to be temporarily made global. Its
scope will be restored to static in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:06 -07:00
Junio C Hamano
c7528f4d8a Merge branch 'bw/object-id'
Conversion from uchar[20] to struct object_id continues.

* bw/object-id:
  receive-pack: don't access hash of NULL object_id pointer
  notes: don't access hash of NULL object_id pointer
  tree-diff: don't access hash of NULL object_id pointer
2017-08-11 13:26:56 -07:00
René Scharfe
f730944a49 receive-pack: don't access hash of NULL object_id pointer
We set old_oid to NULL if we found out that it's a corrupt reference.
In that case don't try to access the hash member and pass NULL to
ref_transaction_delete() instead.

Found with Clang's UBSan.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:51:32 -07:00
brian m. carlson
15be4a5d38 Convert remaining callers of get_sha1 to get_oid.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 13:54:51 -07:00
Junio C Hamano
f056cde60e Merge branch 'rs/use-div-round-up'
Code cleanup.

* rs/use-div-round-up:
  use DIV_ROUND_UP
2017-07-12 15:18:23 -07:00
René Scharfe
42c78a216e use DIV_ROUND_UP
Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 14:24:36 -07:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano
058d655f8f Merge branch 'jt/push-options-doc' into maint
The receive-pack program now makes sure that the push certificate
records the same set of push options used for pushing.

* jt/push-options-doc:
  receive-pack: verify push options in cert
  docs: correct receive.advertisePushOptions default
2017-06-04 10:21:05 +09:00
Junio C Hamano
849e671b52 Merge branch 'js/plug-leaks'
Fix memory leaks pointed out by Coverity (and people).

* js/plug-leaks: (26 commits)
  checkout: fix memory leak
  submodule_uses_worktrees(): plug memory leak
  show_worktree(): plug memory leak
  name-rev: avoid leaking memory in the `deref` case
  remote: plug memory leak in match_explicit()
  add_reflog_for_walk: avoid memory leak
  shallow: avoid memory leak
  line-log: avoid memory leak
  receive-pack: plug memory leak in update()
  fast-export: avoid leaking memory in handle_tag()
  mktree: plug memory leaks reported by Coverity
  pack-redundant: plug memory leak
  setup_discovered_git_dir(): plug memory leak
  setup_bare_git_dir(): help static analysis
  split_commit_in_progress(): simplify & fix memory leak
  checkout: fix memory leak
  cat-file: fix memory leak
  mailinfo & mailsplit: check for EOF while parsing
  status: close file descriptor after reading git-rebase-todo
  difftool: address a couple of resource/memory leaks
  ...
2017-05-29 12:34:44 +09:00
Junio C Hamano
6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
Junio C Hamano
3c980083bc Merge branch 'jt/push-options-doc'
The receive-pack program now makes sure that the push certificate
records the same set of push options used for pushing.

* jt/push-options-doc:
  receive-pack: verify push options in cert
  docs: correct receive.advertisePushOptions default
2017-05-23 13:46:07 +09:00
Junio C Hamano
b15667bbdc Merge branch 'js/larger-timestamps'
Some platforms have ulong that is smaller than time_t, and our
historical use of ulong for timestamp would mean they cannot
represent some timestamp that the platform allows.  Invent a
separate and dedicated timestamp_t (so that we can distingiuish
timestamps and a vanilla ulongs, which along is already a good
move), and then declare uintmax_t is the type to be used as the
timestamp_t.

* js/larger-timestamps:
  archive-tar: fix a sparse 'constant too large' warning
  use uintmax_t for timestamps
  date.c: abort if the system time cannot handle one of our timestamps
  timestamp_t: a new data type for timestamps
  PRItime: introduce a new "printf format" for timestamps
  parse_timestamp(): specify explicitly where we parse timestamps
  t0006 & t5000: skip "far in the future" test when time_t is too limited
  t0006 & t5000: prepare for 64-bit timestamps
  ref-filter: avoid using `unsigned long` for catch-all data type
2017-05-16 11:51:59 +09:00
Jonathan Tan
cbaf82cc6b receive-pack: verify push options in cert
In commit f6a4e61 ("push: accept push options", 2016-07-14), send-pack
was taught to include push options both within the signed cert (if the
push is a signed push) and outside the signed cert; however,
receive-pack ignores push options within the cert, only handling push
options outside the cert.

Teach receive-pack, in the case that push options are provided for a
signed push, to verify that the push options both within the cert and
outside the cert are consistent.

This sets in stone the requirement that send-pack redundantly send its
push options in 2 places, but I think that this is better than the
alternatives. Sending push options only within the cert is
backwards-incompatible with existing Git servers (which read push
options only from outside the cert), and sending push options only
outside the cert means that the push options are not signed for.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-10 13:18:28 +09:00
brian m. carlson
c251c83df2 object: convert parse_object* to take struct object_id
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id.  Remove the temporary variables inserted
earlier, since they are no longer necessary.  Transform all of the
callers using the following semantic patch:

@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)

@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)

@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)

@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson
e92b848cb6 shallow: convert shallow registration functions to object_id
Convert register_shallow and unregister_shallow to take struct
object_id.  register_shallow is a caller of lookup_commit, which we will
convert later.  It doesn't make sense for the registration and
unregistration functions to have incompatible interfaces, so convert
them both.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
Johannes Schindelin
bda6e82801 receive-pack: plug memory leak in update()
Reported via Coverity.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 12:18:20 +09:00
Johannes Schindelin
dddbad728c timestamp_t: a new data type for timestamps
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27 13:07:39 +09:00
Junio C Hamano
5938454cbc Merge branch 'dt/xgethostname-nul-termination'
gethostname(2) may not NUL terminate the buffer if hostname does
not fit; unfortunately there is no easy way to see if our buffer
was too small, but at least this will make sure we will not end up
using garbage past the end of the buffer.

* dt/xgethostname-nul-termination:
  xgethostname: handle long hostnames
  use HOST_NAME_MAX to size buffers for gethostname(2)
2017-04-23 22:07:57 -07:00
Junio C Hamano
9f1384f711 Merge branch 'jk/quarantine-received-objects'
Add finishing touches to a recent topic.

* jk/quarantine-received-objects:
  refs: reject ref updates while GIT_QUARANTINE_PATH is set
  receive-pack: document user-visible quarantine effects
  receive-pack: drop tmp_objdir_env from run_update_hook
2017-04-23 22:07:52 -07:00
Johannes Schindelin
cb71f8bdb5 PRItime: introduce a new "printf format" for timestamps
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.

So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23 20:19:15 -07:00
Johannes Schindelin
1aeb7e756c parse_timestamp(): specify explicitly where we parse timestamps
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23 20:19:15 -07:00
Junio C Hamano
b1081e4004 Merge branch 'bc/object-id'
Conversion from unsigned char [40] to struct object_id continues.

* bc/object-id:
  Documentation: update and rename api-sha1-array.txt
  Rename sha1_array to oid_array
  Convert sha1_array_for_each_unique and for_each_abbrev to object_id
  Convert sha1_array_lookup to take struct object_id
  Convert remaining callers of sha1_array_lookup to object_id
  Make sha1_array_append take a struct object_id *
  sha1-array: convert internal storage for struct sha1_array to object_id
  builtin/pull: convert to struct object_id
  submodule: convert check_for_new_submodule_commits to object_id
  sha1_name: convert disambiguate_hint_fn to take object_id
  sha1_name: convert struct disambiguate_state to object_id
  test-sha1-array: convert most code to struct object_id
  parse-options-cb: convert sha1_array_append caller to struct object_id
  fsck: convert init_skiplist to struct object_id
  builtin/receive-pack: convert portions to struct object_id
  builtin/pull: convert portions to struct object_id
  builtin/diff: convert to struct object_id
  Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
  Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
  Define new hash-size constants for allocating memory
2017-04-19 21:37:13 -07:00
David Turner
5781a9a270 xgethostname: handle long hostnames
If the full hostname doesn't fit in the buffer supplied to
gethostname, POSIX does not specify whether the buffer will be
null-terminated, so to be safe, we should do it ourselves.  Introduce
new function, xgethostname, which ensures that there is always a \0
at the end of the buffer.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 19:58:04 -07:00
René Scharfe
da25bdb776 use HOST_NAME_MAX to size buffers for gethostname(2)
POSIX limits the length of host names to HOST_NAME_MAX.  Export the
fallback definition from daemon.c and use this constant to make all
buffers used with gethostname(2) big enough for any possible result
and a terminating NUL.

Inspired-by: David Turner <dturner@twosigma.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18 19:57:41 -07:00
Junio C Hamano
cb054eb264 Merge branch 'jk/snprintf-cleanups'
Code clean-up.

* jk/snprintf-cleanups:
  daemon: use an argv_array to exec children
  gc: replace local buffer with git_path
  transport-helper: replace checked snprintf with xsnprintf
  convert unchecked snprintf into xsnprintf
  combine-diff: replace malloc/snprintf with xstrfmt
  replace unchecked snprintf calls with heap buffers
  receive-pack: print --pack-header directly into argv array
  name-rev: replace static buffer with strbuf
  create_branch: use xstrfmt for reflog message
  create_branch: move msg setup closer to point of use
  avoid using mksnpath for refs
  avoid using fixed PATH_MAX buffers for refs
  fetch: use heap buffer to format reflog
  tag: use strbuf to format tag header
  diff: avoid fixed-size buffer for patch-ids
  odb_mkstemp: use git_path_buf
  odb_mkstemp: write filename into strbuf
  do not check odb_mkstemp return value for errors
2017-04-16 23:29:26 -07:00
Jeff King
360244a51a receive-pack: drop tmp_objdir_env from run_update_hook
Since 722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), we have to feed the
pre-receive hook the tmp_objdir environment, so that git
programs run from the hook know where to find the objects.

That commit modified run_update_hook() to do the same, but
there it is a noop. By the time we get to the update hooks,
we have already migrated the objects from quarantine, and so
tmp_objdir_env() will always return NULL. We can drop this
useless call.

Note that the ordering here and the lack of support for the
update hook is intentional. The update hook calls are
interspersed with actual ref updates, and we must migrate
the objects before any refs are updated (since otherwise
those refs would appear broken to outside processes). So the
only other options are:

  - remain in quarantine for the _first_ ref, but not the
    others. This is sufficiently confusing that it can be
    rejected outright.

  - run all the individual update hooks first, then migrate,
    then update all the refs. But this changes the repository
    state that the update hooks see (i.e., whether or not
    refs from the same push are updated yet or not).

So the functionality is fine and remains unchanged with this
patch; we're just cleaning up a useless and confusing line
of code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-16 18:13:12 -07:00
brian m. carlson
910650d2f8 Rename sha1_array to oid_array
Since this structure handles an array of object IDs, rename it to struct
oid_array.  Also rename the accessor functions and the initialization
constant.

This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:

    perl -pi -E 's/struct sha1_array/struct oid_array/g'
    perl -pi -E 's/\bsha1_array_/oid_array_/g'
    perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31 08:33:56 -07:00
brian m. carlson
1b7ba794d2 Convert sha1_array_for_each_unique and for_each_abbrev to object_id
Make sha1_array_for_each_unique take a callback using struct object_id.
Since one of these callbacks is an argument to for_each_abbrev, convert
those as well.  Rename various functions, replacing "sha1" with "oid".

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31 08:33:55 -07:00
brian m. carlson
98a72ddc12 Make sha1_array_append take a struct object_id *
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:

@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, &E2)

@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31 08:33:55 -07:00
Jeff King
446d5d9112 receive-pack: print --pack-header directly into argv array
After receive-pack reads the pack header from the client, it
feeds the already-read part to index-pack and unpack-objects
via their --pack-header command-line options.  To do so, we
format it into a fixed buffer, then duplicate it into the
child's argv_array.

Our buffer is long enough to handle any possible input, so
this isn't wrong. But it's more complicated than it needs to
be; we can just argv_array_pushf() the final value and avoid
the intermediate copy. This drops the magic number and is
more efficient, too.

Note that we need to push to the argv_array in order, which
means we can't do the push until we are in the "unpack-objects
versus index-pack" conditional.  Rather than duplicate the
slightly complicated format specifier, I pushed it into a
helper function.

Signed-off-by: Jeff King <peff@peff.net>
2017-03-30 14:59:50 -07:00
Junio C Hamano
e711824c5e Merge branch 'bc/push-cert-receive-fix'
"git receive-pack" could have been forced to die by attempting
allocate an unreasonably large amount of memory with a crafted push
certificate; this has been fixed.

* bc/push-cert-receive-fix:
  builtin/receive-pack: fix incorrect pointer arithmetic
2017-03-30 14:07:18 -07:00
Junio C Hamano
fd7c41ec97 Merge branch 'rs/update-hook-optim' into maint
Code clean-up.

* rs/update-hook-optim:
  receive-pack: simplify run_update_post_hook()
2017-03-28 13:52:28 -07:00
brian m. carlson
ee3051bd23 sha1-array: convert internal storage for struct sha1_array to object_id
Make the internal storage for struct sha1_array use an array of struct
object_id internally.  Update the users of this struct which inspect its
internals.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28 09:59:34 -07:00
brian m. carlson
9c44ea4403 builtin/receive-pack: convert portions to struct object_id
Convert some hardcoded constants into uses of parse_oid_hex.
Additionally, convert all uses of struct command, and miscellaneous
other functions necessary for that.  This work is necessary to be able
to convert sha1_array_append later on.

To avoid needing to specify a constant, reject shallow lines with the
wrong length instead of simply ignoring them.

Note that in queue_command we are guaranteed to have a NUL-terminated
buffer or at least one byte of overflow that we can safely read, so the
linelen check can be elided.  We would die in such a case, but not read
invalid memory.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28 09:59:33 -07:00
brian m. carlson
f2214dede9 builtin/receive-pack: fix incorrect pointer arithmetic
If we had already processed the last newline in a push certificate, we
would end up subtracting NULL from the end-of-certificate pointer when
computing the length of the line.  This would have resulted in an
absurdly large length, and possibly a buffer overflow.  Instead,
subtract the beginning-of-certificate pointer from the
end-of-certificate pointer, which is what's expected.

Note that this situation should never occur, since not only do we
require the certificate to be newline terminated, but the signature will
only be read from the beginning of a line.  Nevertheless, it seems
prudent to correct it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28 09:57:14 -07:00
brian m. carlson
cd02599c48 Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 20 bytes, use the constant
GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead
of GIT_SHA1_RAWSZ.  This will ease the transition down the line by
distinguishing between places where we need to allocate memory suitable
for the largest hash from those where we need to handle the current
hash.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-26 22:08:21 -07:00
Junio C Hamano
8b47c5de96 Merge branch 'rs/update-hook-optim'
Code clean-up.

* rs/update-hook-optim:
  receive-pack: simplify run_update_post_hook()
2017-03-24 13:07:37 -07:00
Junio C Hamano
f4d3af1859 Merge branch 'jk/push-deadlock-regression-fix' into maint
"git push" had a handful of codepaths that could lead to a deadlock
when unexpected error happened, which has been fixed.

* jk/push-deadlock-regression-fix:
  send-pack: report signal death of pack-objects
  send-pack: read "unpack" status even on pack-objects failure
  send-pack: improve unpack-status error messages
  send-pack: use skip_prefix for parsing unpack status
  send-pack: extract parsing of "unpack" response
  receive-pack: fix deadlock when we cannot create tmpdir
2017-03-24 12:57:52 -07:00
René Scharfe
dce96c41f9 receive-pack: simplify run_update_post_hook()
Instead of counting the arguments to see if there are any and then
building the full command use a single loop and add the hook command
just before the first argument.  This reduces duplication and overall
code size.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-18 10:13:09 -07:00
Junio C Hamano
e1fae93019 Merge branch 'bc/object-id'
"uchar [40]" to "struct object_id" conversion continues.

* bc/object-id:
  wt-status: convert to struct object_id
  builtin/merge-base: convert to struct object_id
  Convert object iteration callbacks to struct object_id
  sha1_file: introduce an nth_packed_object_oid function
  refs: simplify parsing of reflog entries
  refs: convert each_reflog_ent_fn to struct object_id
  reflog-walk: convert struct reflog_info to struct object_id
  builtin/replace: convert to struct object_id
  Convert remaining callers of resolve_refdup to object_id
  builtin/merge: convert to struct object_id
  builtin/clone: convert to struct object_id
  builtin/branch: convert to struct object_id
  builtin/grep: convert to struct object_id
  builtin/fmt-merge-message: convert to struct object_id
  builtin/fast-export: convert to struct object_id
  builtin/describe: convert to struct object_id
  builtin/diff-tree: convert to struct object_id
  builtin/commit: convert to struct object_id
  hex: introduce parse_oid_hex
2017-03-17 13:50:25 -07:00
Junio C Hamano
d6857a831c Merge branch 'jk/push-deadlock-regression-fix'
"git push" had a handful of codepaths that could lead to a deadlock
when unexpected error happened, which has been fixed.

* jk/push-deadlock-regression-fix:
  send-pack: report signal death of pack-objects
  send-pack: read "unpack" status even on pack-objects failure
  send-pack: improve unpack-status error messages
  send-pack: use skip_prefix for parsing unpack status
  send-pack: extract parsing of "unpack" response
  receive-pack: fix deadlock when we cannot create tmpdir
2017-03-14 15:23:20 -07:00
Jeff King
6cdad1f133 receive-pack: fix deadlock when we cannot create tmpdir
The err_fd descriptor passed to the unpack() function is
intended to be handed off to the child index-pack, and our
async muxer will read until it gets EOF. However, if we
encounter an error before handing off the descriptor, we
must manually close(err_fd). Otherwise we will be waiting
for our muxer to finish, while the muxer is waiting for EOF
on err_fd.

We fixed an identical deadlock already in 49ecfa13f
(receive-pack: close sideband fd on early pack errors,
2013-04-19). But since then, the function grew a new
early-return in 722ff7f87 (receive-pack: quarantine objects
until pre-receive accepts, 2016-10-03), when we fail to
create a temporary directory. This return needs the same
treatment.

Reported-by: Horst Schirmeier <horst@schirmeier.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-07 14:51:03 -08:00
brian m. carlson
2928325fc0 Convert remaining callers of resolve_refdup to object_id
There are a few leaf functions in various files that call
resolve_refdup.  Convert these functions to use struct object_id
internally to prepare for transitioning resolve_refdup itself.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-22 10:12:15 -08:00
Jeff King
63d428e656 receive-pack: avoid duplicates between our refs and alternates
We de-duplicate ".have" refs among themselves, but never
check if they are duplicates of our local refs. It's not
unreasonable that they would be if we are a "--shared" or
"--reference" clone of a similar repository; we'd have all
the same tags.

We can handle this by inserting our local refs into the
oidset, but obviously not suppressing duplicates (since the
refnames are important).

Note that this also switches the order in which we advertise
refs, processing ours first and then any alternates. The
order shouldn't matter (and arguably showing our refs first
makes more sense).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
8b24b9e765 receive-pack: treat namespace .have lines like alternates
Namely, de-duplicate them. We use the same set as the
alternates, since we call them both ".have" (i.e., there is
no value in showing one versus the other).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
fea6c47f2f receive-pack: fix misleading namespace/.have comment
The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since
85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
ab6eea6f7b receive-pack: use oidset to de-duplicate .have lines
If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.

The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:

 - an unknown object is as large as the largest object type
   (a commit), which is bigger than an oidset entry

 - we can free the memory after our ref advertisement, but
   "struct object" entries persist forever (and the
   receive-pack may hang around for a long time, as the
   bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
2429d63a46 for_each_alternate_ref: pass name/oid instead of ref struct
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.

The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Junio C Hamano
140d41ae87 Merge branch 'rs/receive-pack-cleanup'
Code clean-up.

* rs/receive-pack-cleanup:
  receive-pack: call string_list_clear() unconditionally
2017-02-02 13:36:57 -08:00
René Scharfe
4432dd6b5b receive-pack: call string_list_clear() unconditionally
string_list_clear() handles empty lists just fine, so remove the
redundant check.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-30 15:08:58 -08:00
Alex Henrie
2ddaa42783 receive-pack: improve English grammar of denyCurrentBranch message
The article "the" is required here.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-05 14:50:54 -08:00
Junio C Hamano
dbaa6bdce2 Merge branch 'ls/filter-process'
The smudge/clean filter API expect an external process is spawned
to filter the contents for each path that has a filter defined.  A
new type of "process" filter API has been added to allow the first
request to run the filter for a path to spawn a single process, and
all filtering need is served by this single process for multiple
paths, reducing the process creation overhead.

* ls/filter-process:
  contrib/long-running-filter: add long running filter example
  convert: add filter.<driver>.process option
  convert: prepare filter.<driver>.process option
  convert: make apply_filter() adhere to standard Git error handling
  pkt-line: add functions to read/write flush terminated packet streams
  pkt-line: add packet_write_gently()
  pkt-line: add packet_flush_gently()
  pkt-line: add packet_write_fmt_gently()
  pkt-line: extract set_packet_header()
  pkt-line: rename packet_write() to packet_write_fmt()
  run-command: add clean_on_exit_handler
  run-command: move check_pipe() from write_or_die to run_command
  convert: modernize tests
  convert: quote filter names in error messages
2016-10-31 13:15:21 -07:00
Jeff King
ef2ed5013c find_unique_abbrev: use 4-buffer ring
Some code paths want to format multiple abbreviated sha1s in
the same output line. Because we use a single static buffer
for our return value, they have to either break their output
into several calls or allocate their own arrays and use
find_unique_abbrev_r().

Intead, let's mimic sha1_to_hex() and use a ring of several
buffers, so that the return value stays valid through
multiple calls. This shortens some of the callers, and makes
it harder to for them to make a silly mistake.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26 13:30:51 -07:00
Junio C Hamano
25ab004c53 Merge branch 'jk/quarantine-received-objects'
In order for the receiving end of "git push" to inspect the
received history and decide to reject the push, the objects sent
from the sending end need to be made available to the hook and
the mechanism for the connectivity check, and this was done
traditionally by storing the objects in the receiving repository
and letting "git gc" to expire it.  Instead, store the newly
received objects in a temporary area, and make them available by
reusing the alternate object store mechanism to them only while we
decide if we accept the check, and once we decide, either migrate
them to the repository or purge them immediately.

* jk/quarantine-received-objects:
  tmp-objdir: do not migrate files starting with '.'
  tmp-objdir: put quarantine information in the environment
  receive-pack: quarantine objects until pre-receive accepts
  tmp-objdir: introduce API for temporary object directories
  check_connected: accept an env argument
2016-10-17 13:25:20 -07:00
Lars Schneider
81c634e94f pkt-line: rename packet_write() to packet_write_fmt()
packet_write() should be called packet_write_fmt() because it is a
printf-like function that takes a format string as first parameter.

packet_write_fmt() should be used for text strings only. Arbitrary
binary data should use a new packet_write() function that is introduced
in a subsequent patch.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 11:36:50 -07:00
Jeff King
722ff7f876 receive-pack: quarantine objects until pre-receive accepts
When a client pushes objects to us, index-pack checks the
objects themselves and then installs them into place. If we
then reject the push due to a pre-receive hook, we cannot
just delete the packfile; other processes may be depending
on it. We have to do a normal reachability check at this
point via `git gc`.

But such objects may hang around for weeks due to the
gc.pruneExpire grace period. And worse, during that time
they may be exploded from the pack into inefficient loose
objects.

Instead, this patch teaches receive-pack to put the new
objects into a "quarantine" temporary directory. We make
these objects available to the connectivity check and to the
pre-receive hook, and then install them into place only if
it is successful (and otherwise remove them as tempfiles).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:54:02 -07:00
Jeff King
16ddcd403b sha1_array: let callbacks interrupt iteration
The callbacks for iterating a sha1_array must have a void
return.  This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:41 -07:00
Junio C Hamano
1fe6f5fb0a Merge branch 'va/i18n'
More i18n.

* va/i18n:
  i18n: update-index: mark warnings for translation
  i18n: show-branch: mark plural strings for translation
  i18n: show-branch: mark error messages for translation
  i18n: receive-pack: mark messages for translation
  notes: spell first word of error messages in lowercase
  i18n: notes: mark error messages for translation
  i18n: merge-recursive: mark verbose message for translation
  i18n: merge-recursive: mark error messages for translation
  i18n: config: mark error message for translation
  i18n: branch: mark option description for translation
  i18n: blame: mark error messages for translation
2016-09-21 15:15:28 -07:00
Vasco Almeida
8ba35a2dc6 i18n: receive-pack: mark messages for translation
Mark messages refuse_unconfigured_deny_msg and
refuse_unconfigured_deny_delete_current_msg for translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:17:32 -07:00
Jeff King
c08db5a2d0 receive-pack: allow a maximum input size to be specified
Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.

Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24 12:31:05 -07:00
Junio C Hamano
2f664566c5 Merge branch 'jk/tighten-alloc'
Small code and comment clean-up.

* jk/tighten-alloc:
  receive-pack: use FLEX_ALLOC_MEM in queue_command()
  correct FLEXPTR_* example in comment
2016-08-17 14:07:46 -07:00
René Scharfe
ddd0bfac7c receive-pack: use FLEX_ALLOC_MEM in queue_command()
Use the macro FLEX_ALLOC_MEM instead of open-coding it.  This shortens
and simplifies the code a bit.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:49:30 -07:00
Junio C Hamano
a58a8e3f71 Merge branch 'jk/push-progress'
"git push" and "git clone" learned to give better progress meters
to the end user who is waiting on the terminal.

* jk/push-progress:
  receive-pack: send keepalives during quiet periods
  receive-pack: turn on connectivity progress
  receive-pack: relay connectivity errors to sideband
  receive-pack: turn on index-pack resolving progress
  index-pack: add flag for showing delta-resolution progress
  clone: use a real progress meter for connectivity check
  check_connected: add progress flag
  check_connected: relay errors to alternate descriptor
  check_everything_connected: use a struct with named options
  check_everything_connected: convert to argv_array
  rev-list: add optional progress reporting
  check_everything_connected: always pass --quiet to rev-list
2016-08-03 15:10:28 -07:00
Jeff King
83558686ce receive-pack: send keepalives during quiet periods
After a client has sent us the complete pack, we may spend
some time processing the data and running hooks. If the
client asked us to be quiet, receive-pack won't send any
progress data during the index-pack or connectivity-check
steps. And hooks may or may not produce their own progress
output. In these cases, the network connection is totally
silent from both ends.

Git itself doesn't care about this (it will wait forever),
but other parts of the system (e.g., firewalls,
load-balancers, etc) might hang up the connection. So we'd
like to send some sort of keepalive to let the network and
the client side know that we're still alive and processing.

We can use the same trick we did in 05e9515 (upload-pack:
send keepalive packets during pack computation, 2013-09-08).
Namely, we will send an empty sideband data packet every `N`
seconds that we do not relay any stderr data over the
sideband channel. As with 05e9515, this means that we won't
bother sending keepalives when there's actual progress data,
but will kick in when it has been disabled (or if there is a
lull in the progress data).

The concept is simple, but the details are subtle enough
that they need discussing here.

Before the client sends us the pack, we don't want to do any
keepalives. We'll have sent our ref advertisement, and we're
waiting for them to send us the pack (and tell us that they
support sidebands at all).

While we're receiving the pack from the client (or waiting
for it to start), there's no need for keepalives; it's up to
them to keep the connection active by sending data.
Moreover, it would be wrong for us to do so. When we are the
server in the smart-http protocol, we must treat our
connection as half-duplex. So any keepalives we send while
receiving the pack would potentially be buffered by the
webserver. Not only does this make them useless (since they
would not be delivered in a timely manner), but it could
actually cause a deadlock if we fill up the buffer with
keepalives. (It wouldn't be wrong to send keepalives in this
phase for a full-duplex connection like ssh; it's simply
pointless, as it is the client's responsibility to speak).

As soon as we've gotten all of the pack data, then the
client is waiting for us to speak, and we should start
keepalives immediately. From here until the end of the
connection, we send one any time we are not otherwise
sending data.

But there's a catch. Receive-pack doesn't know the moment
we've gotten all the data. It passes the descriptor to
index-pack, who reads all of the data, and then starts
resolving the deltas. We have to communicate that back.

To make this work, we instruct the sideband muxer to enable
keepalives in three phases:

  1. In the beginning, not at all.

  2. While reading from index-pack, wait for a signal
     indicating end-of-input, and then start them.

  3. Afterwards, always.

The signal from index-pack in phase 2 has to come over the
stderr channel which the muxer is reading. We can't use an
extra pipe because the portable run-command interface only
gives us stderr and stdout.

Stdout is already used to pass the .keep filename back to
receive-pack. We could also send a signal there, but then we
would find out about it in the main thread. And the
keepalive needs to be done by the async muxer thread (since
it's the one writing sideband data back to the client). And
we can't reliably signal the async thread from the main
thread, because the async code sometimes uses threads and
sometimes uses forked processes.

Therefore the signal must come over the stderr channel,
where it may be interspersed with other random
human-readable messages from index-pack. This patch makes
the signal a single NUL byte.  This is easy to parse, should
not appear in any normal stderr output, and we don't have to
worry about any timing issues (like seeing half the signal
bytes in one read(), and half in a subsequent one).

This is a bit ugly, but it's simple to code and should work
reliably.

Another option would be to stop using an async thread for
muxing entirely, and just poll() both stderr and stdout of
index-pack from the main thread. This would work for
index-pack (because we aren't doing anything useful in the
main thread while it runs anyway). But it would make the
connectivity check and the hook muxers much more
complicated, as they need to simultaneously feed the
sub-programs while reading their stderr.

The index-pack phase is the only one that needs this
signaling, so it could simply behave differently than the
other two. That would mean having two separate
implementations of copy_to_sideband (and the keepalive
code), though. And it still doesn't get rid of the
signaling; it just means we can write a nicer message like
"END_OF_INPUT" or something on stdout, since we don't have
to worry about separating it from the stderr cruft.

One final note: this signaling trick is only done with
index-pack, not with unpack-objects. There's no point in
doing it for the latter, because by definition it only kicks
in for a small number of objects, where keepalives are not
as useful (and this conveniently lets us avoid duplicating
the implementation).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King
6b4cd2f827 receive-pack: turn on connectivity progress
When we receive a large push, the server side of the
connection may spend a lot of time (30s or more for a full
push of linux.git) walking the object graph without
producing any output. Let's give the user some indication
that we're actually working.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:11 -07:00
Jeff King
d415092ac4 receive-pack: relay connectivity errors to sideband
If the connectivity check encounters a problem when
receiving a push, the error output goes to receive-pack's
stderr, whose destination depends on the protocol used
(ssh tends to send it to the user, though without a "remote"
prefix; http will generally eat it in the server's error
log).

The information should consistently go back to the user, as
there is a reasonable chance their client is buggy and
generating a bad pack.

We can do so by muxing it over the sideband as we do with
other sub-process stderr.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King
d06303bb9a receive-pack: turn on index-pack resolving progress
When we receive a large push, the server side may have to
spend a lot of CPU processing the incoming packfile.

During the "receiving" phase, we are typically network
bound, and the client is writing its own progress to the
user. But during the delta resolution phase, we may spend
minutes (e.g., for a full push of linux.git) without
making any indication to the user that the connection has
not hung.

Let's ask index-pack to produce progress output for this
phase (unless the client asked us to be quiet, of course).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:11:10 -07:00
Jeff King
7043c7071c check_everything_connected: use a struct with named options
The number of variants of check_everything_connected has
grown over the years, so that the "real" function takes
several possibly-zero, possibly-NULL arguments. We hid the
complexity behind some wrapper functions, but this doesn't
scale well when we want to add new options.

If we add more wrapper variants to handle the new options,
then we can get a combinatorial explosion when those options
might be used together (right now nobody wants to use both
"shallow" and "transport" together, so we get by with just a
few wrappers).

If instead we add new parameters to each function, each of
which can have a default value, then callers who want the
defaults end up with confusing invocations like:

  check_everything_connected(fn, 0, data, -1, 0, NULL);

where it is unclear which parameter is which (and every
caller needs updated when we add new options).

Instead, let's add a struct to hold all of the optional
parameters. This is a little more verbose for the callers
(who have to declare the struct and fill it in), but it
makes their code much easier to follow, because every option
is named as it is set (and unused options do not have to be
mentioned at all).

Note that we could also stick the iteration function and its
callback data into the option struct, too. But since those
are required for each call, by avoiding doing so, we can let
very simple callers just pass "NULL" for the options and not
worry about the struct at all.

While we're touching each site, let's also rename the
function to check_connected(). The existing name was quite
long, and not all of the wrappers even used the full name.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20 12:10:53 -07:00
Stefan Beller
c714e45f87 receive-pack: implement advertising and receiving push options
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:40 -07:00
Stefan Beller
77a9745d19 push options: {pre,post}-receive hook learns about push options
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of
push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted
option.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

There was some discussion back and forth how to present these push options
to the user as there are some ways to do it:

Keep all options in one environment variable
============================================
+ easiest way to implement in Git
- This would make things hard to parse correctly in the hook.

Put the options in files instead,
filenames are in GIT_PUSH_OPTION_FILES
======================================
+ After a discussion about environment variables and shells, we may not
  want to put user data into an environment variable (see [1] for example).
+ We could transmit binaries, i.e. we're not bound to C strings as
  we are when using environment variables to the user.
+ Maybe easier to parse than constructing environment variable names
  GIT_PUSH_OPTION_{0,1,..} yourself
- cleanup of the temporary files is hard to do reliably
- we have race conditions with multiple clients pushing, hence we'd need
  to use mkstemp. That's not too bad, but still.

Use environment variables, but restrict to key/value pairs
==========================================================
(When the user pushes a push option `foo=bar`, we'd
GIT_PUSH_OPTION_foo=bar)
+ very easy to parse for a simple model of push options
- it's not sufficient for more elaborate models, e.g.
  it doesn't allow doubles (e.g. cc=reviewer@email)

Present the options in different environment variables
======================================================
(This is implemented)
* harder to parse as a user, but we have a sample hook for that.
- doesn't allow binary files
+ allows the same option twice, i.e. is not restrictive about
  options, except for binary files.
+ doesn't clutter a remote directory with (possibly stale)
  temporary files

As we first want to focus on getting simple strings to work
reliably, we go with the last option for now. If we want to
do transmission of binaries later, we can just attach a
'side-channel', e.g. "any push option that contains a '\0' is
put into a file instead of the environment variable and we'd
have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..}
environment variables".

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14 15:50:17 -07:00
Junio C Hamano
8579c4ebee Merge branch 'lf/receive-pack-auto-gc-to-client'
Allow messages that are generated by auto gc during "git push" on
the receiving end to be explicitly passed back to the sending end
over sideband, so that they are shown with "remote: " prefix to
avoid confusing the users.

* lf/receive-pack-auto-gc-to-client:
  receive-pack: send auto-gc output over sideband 2
2016-06-27 09:56:52 -07:00
Lukas Fleischer
860a2ebecd receive-pack: send auto-gc output over sideband 2
Redirect auto-gc output to the sideband such that it is visible to all
clients. As a side effect, all auto-gc error messages are now prefixed
with "remote: " before being printed to stderr on the client-side which
makes it easier to understand that those error messages originate from
the server.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-06 10:58:55 -07:00
Junio C Hamano
edc2f715bd Merge branch 'dt/pre-refs-backend'
Code restructuring around the "refs" area to prepare for pluggable
refs backends.

* dt/pre-refs-backend: (24 commits)
  refs: on symref reflog expire, lock symref not referrent
  refs: move resolve_ref_unsafe into common code
  show_head_ref(): check the result of resolve_ref_namespace()
  check_aliased_update(): check that dst_name is non-NULL
  checkout_paths(): remove unneeded flag variable
  cmd_merge(): remove unneeded flag variable
  fsck_head_link(): remove unneeded flag variable
  read_raw_ref(): change flags parameter to unsigned int
  files-backend: inline resolve_ref_1() into resolve_ref_unsafe()
  read_raw_ref(): manage own scratch space
  files-backend: break out ref reading
  resolve_ref_1(): eliminate local variable "bad_name"
  resolve_ref_1(): reorder code
  resolve_ref_1(): eliminate local variable
  resolve_ref_unsafe(): ensure flags is always set
  resolve_ref_unsafe(): use for loop to count up to MAXDEPTH
  resolve_missing_loose_ref(): simplify semantics
  t1430: improve test coverage of deletion of badly-named refs
  t1430: test for-each-ref in the presence of badly-named refs
  t1430: don't rely on symbolic-ref for creating broken symrefs
  ...
2016-04-25 15:17:15 -07:00
Michael Haggerty
ded8393610 check_aliased_update(): check that dst_name is non-NULL
If there is an error in resolve_ref_unsafe(), it returns NULL. We check
for this case, but not until after calling strip_namespace(). Instead,
call strip_namespace() *after* the NULL check.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-10 11:35:37 -07:00
Sidhant Sharma [:tk]
1b68387e02 builtin/receive-pack.c: use parse_options API
Make receive-pack use the parse_options API,
bringing it more in line with send-pack and push.

Helped-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Sidhant Sharma [:tk] <tigerkid001@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-01 13:38:45 -08:00
Jeff King
50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
b32fa95fd8 convert trivial cases to ALLOC_ARRAY
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
850d2fec53 convert manual allocations to argv_array
There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
     array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
     then allocate, then fill in the array. Now we can do it
     in one pass, making the code shorter and easier to
     follow.

  3. argv_array handles memory ownership for us, making it
     more obvious when things should be free()d and and when
     not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:50:32 -08:00
Junio C Hamano
f748e69167 Merge branch 'js/close-packs-before-gc' into maint
Many codepaths that run "gc --auto" before exiting kept packfiles
mapped and left the file descriptors to them open, which was not
friendly to systems that cannot remove files that are open.  They
now close the packs before doing so.

* js/close-packs-before-gc:
  receive-pack: release pack files before garbage-collecting
  merge: release pack files before garbage-collecting
  am: release pack files before garbage-collecting
  fetch: release pack files before garbage-collecting
2016-02-05 14:54:13 -08:00
Junio C Hamano
6e29ac2302 Merge branch 'jk/clang-pedantic' into maint
A few unportable C construct have been spotted by clang compiler
and have been fixed.

* jk/clang-pedantic:
  bswap: add NO_UNALIGNED_LOADS define
  avoid shifting signed integers 31 bits
2016-02-05 14:54:09 -08:00
Johannes Schindelin
d5621020c1 receive-pack: release pack files before garbage-collecting
Before auto-gc'ing, we need to make sure that the pack files are
released in case they need to be repacked and garbage-collected.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-13 11:36:28 -08:00
Jeff King
9a93c6686f avoid shifting signed integers 31 bits
We sometimes use 32-bit unsigned integers as bit-fields.
It's fine to access the MSB, because it's unsigned. However,
doing so as "1 << 31" is wrong, because the constant "1" is
a signed int, and we shift into the sign bit, causing
undefined behavior.

We can fix this by using "1U" as the constant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-04 09:51:16 -08:00
brian m. carlson
f4e54d02b8 Convert struct ref to use object_id.
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Lukas Fleischer
78a766ab6e hideRefs: add support for matching full refs
In addition to matching stripped refs, one can now add hideRefs
patterns that the full (unstripped) ref is matched against. To
distinguish between stripped and full matches, those new patterns
must be prefixed with a circumflex (^).

This commit also removes support for the undocumented and unintended
hideRefs settings ".have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-05 11:25:02 -08:00
Jeff King
b26cb7c777 receive-pack: simplify keep_arg computation
To generate "--keep=receive-pack $pid on $host", we write
progressively into a single buffer, which requires keeping
track of how much we've written so far. But since the result
is destined to go into our argv array, we can simply use
argv_array_pushf.

Unfortunately we still have to have a fixed-size buffer for
the gethostname() call, but at least it now doesn't involve
any extra size computation. And as a bonus, we drop an
sprintf and a strcpy call.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
d59f765ac9 use sha1_to_hex_r() instead of strcpy
Before sha1_to_hex_r() existed, a simple way to get hex
sha1 into a buffer was with:

  strcpy(buf, sha1_to_hex(sha1));

This isn't wrong (assuming the buf is 41 characters), but it
makes auditing the code base for bad strcpy() calls harder,
as these become false positives.

Let's convert them to sha1_to_hex_r(), and likewise for
some calls to find_unique_abbrev(). While we're here, we'll
double-check that all of the buffers are correctly sized,
and use the more obvious GIT_SHA1_HEXSZ constant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
b7115a350b receive-pack: convert strncpy to xsnprintf
This strncpy is pointless; we pass the strlen() of the src
string, meaning that it works just like a memcpy. Worse,
though, is that the size has no relation to the destination
buffer, meaning it is a potential overflow.  In practice,
it's not. We pass only short constant strings like
"warning: " and "error: ", which are much smaller than the
destination buffer.

We can make this much simpler by just using xsnprintf, which
will check for overflow and return the size for our next
vsnprintf, without us having to run a separate strlen().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano
0baebca51e Merge branch 'jx/do-not-crash-receive-pack-wo-head'
An attempt to delete a ref by pushing into a repositorywhose HEAD
symbolic reference points at an unborn branch that cannot be
created due to ref D/F conflict (e.g. refs/heads/a/b exists, HEAD
points at refs/heads/a) failed.

* jx/do-not-crash-receive-pack-wo-head:
  receive-pack: crash when checking with non-exist HEAD
2015-08-03 11:01:31 -07:00
Jiang Xin
b112b14d78 receive-pack: crash when checking with non-exist HEAD
If HEAD of a repository points to a conflict reference, such as:

* There exist a reference named 'refs/heads/jx/feature1', but HEAD
  points to 'refs/heads/jx', or

* There exist a reference named 'refs/heads/feature', but HEAD points
  to 'refs/heads/feature/bad'.

When we push to delete a reference for this repo, such as:

        git push /path/to/bad-head-repo.git :some/good/reference

The git-receive-pack process will crash.

This is because if HEAD points to a conflict reference, the function
`resolve_refdup("HEAD", ...)` does not return a valid reference name,
but a null buffer.  Later matching the delete reference against the null
buffer will cause git-receive-pack crash.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-22 14:18:22 -07:00
Johannes Schindelin
cd94c6f91e fsck: git receive-pack: support excluding objects from fsck'ing
The optional new config option `receive.fsck.skipList` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 14:27:37 -07:00
Johannes Schindelin
5d477a334a fsck (receive-pack): allow demoting errors to warnings
For example, missing emails in commit and tag objects can be demoted to
mere warnings with

	git config receive.fsck.missingemail=warn

The value is actually a comma-separated list.

In case that the same key is listed in multiple receive.fsck.<msg-id>
lines in the config, the latter configuration wins (this can happen for
example when both $HOME/.gitconfig and .git/config contain message type
settings).

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 14:27:34 -07:00
Michael Haggerty
ce2a987329 show_ref_cb(): rewrite to take an object_id argument
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:29 -07:00
Michael Haggerty
2b2a5be394 each_ref_fn: change to take an object_id parameter
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
Junio C Hamano
68a2e6a2c8 Merge branch 'nd/multiple-work-trees'
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.

* nd/multiple-work-trees: (41 commits)
  prune --worktrees: fix expire vs worktree existence condition
  t1501: fix test with split index
  t2026: fix broken &&-chain
  t2026 needs procondition SANITY
  git-checkout.txt: a note about multiple checkout support for submodules
  checkout: add --ignore-other-wortrees
  checkout: pass whole struct to parse_branchname_arg instead of individual flags
  git-common-dir: make "modules/" per-working-directory directory
  checkout: do not fail if target is an empty directory
  t2025: add a test to make sure grafts is working from a linked checkout
  checkout: don't require a work tree when checking out into a new one
  git_path(): keep "info/sparse-checkout" per work-tree
  count-objects: report unused files in $GIT_DIR/worktrees/...
  gc: support prune --worktrees
  gc: factor out gc.pruneexpire parsing code
  gc: style change -- no SP before closing parenthesis
  checkout: clean up half-prepared directories in --to mode
  checkout: reject if the branch is already checked out elsewhere
  prune: strategies for linked checkouts
  checkout: support checking out into a new working directory
  ...
2015-05-11 14:23:39 -07:00
Junio C Hamano
fa9aaa8f10 Merge branch 'jc/update-instead-into-void'
A push into an unborn branch, with "receive.denyCurrentBranch" set
to "updateInstead", did not check out the working tree as expected.

* jc/update-instead-into-void:
  push-to-deploy: allow pushing into an unborn branch and updating it
2015-04-14 11:49:10 -07:00
Junio C Hamano
1a51b52422 push-to-deploy: allow pushing into an unborn branch and updating it
Setting receive.denycurrentbranch to updateinstead and pushing into
the current branch, when the working tree and the index is truly
clean, is supposed to reset the working tree and the index to match
the tree of the pushed commit.  This did not work when pushing into
an unborn branch.

The code that drives push-to-checkout hook needs no change, as the
interface is defined so that hook can decide what to do when the
push is coming to an unborn branch and take an appropriate action
since the beginning.

Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-01 22:40:10 -07:00
Michael Haggerty
fb5a6bb61c ref_transaction_delete(): remove "have_old" parameter
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:23:48 -08:00
Michael Haggerty
1d147bdff0 ref_transaction_update(): remove "have_old" parameter
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.

ref_transaction_delete() will get the same treatment in a moment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:22:50 -08:00
Junio C Hamano
cba07bb6ff Merge branch 'jc/push-to-checkout'
Extending the js/push-to-deploy topic, the behaviour of "git push"
when updating the working tree and the index with an update to the
branch that is checked out can be tweaked by push-to-checkout hook.

* jc/push-to-checkout:
  receive-pack: support push-to-checkout hook
  receive-pack: refactor updateInstead codepath
2015-02-11 13:43:56 -08:00
Junio C Hamano
39fa6112ec Merge branch 'sb/atomic-push'
"git push" has been taught a "--atomic" option that makes push to
update more than one ref an "all-or-none" affair.

* sb/atomic-push:
  Document receive.advertiseatomic
  t5543-atomic-push.sh: add basic tests for atomic pushes
  push.c: add an --atomic argument
  send-pack.c: add --atomic command line argument
  send-pack: rename ref_update_to_be_sent to check_to_send_update
  receive-pack.c: negotiate atomic push support
  receive-pack.c: add execute_commands_atomic function
  receive-pack.c: move transaction handling in a central place
  receive-pack.c: move iterating over all commands outside execute_commands
  receive-pack.c: die instead of error in case of possible future bug
  receive-pack.c: shorten the execute_commands loop over all commands
2015-02-11 13:43:51 -08:00
Junio C Hamano
0855331941 receive-pack: support push-to-checkout hook
When receive.denyCurrentBranch is set to updateInstead, a push that
tries to update the branch that is currently checked out is accepted
only when the index and the working tree exactly matches the
currently checked out commit, in which case the index and the
working tree are updated to match the pushed commit.  Otherwise the
push is refused.

This hook can be used to customize this "push-to-deploy" logic.  The
hook receives the commit with which the tip of the current branch is
going to be updated, and can decide what kind of local changes are
acceptable and how to update the index and the working tree to match
the updated tip of the current branch.

For example, the hook can simply run `git read-tree -u -m HEAD "$1"`
in order to emulate 'git fetch' that is run in the reverse direction
with `git push`, as the two-tree form of `read-tree -u -m` is
essentially the same as `git checkout` that switches branches while
keeping the local changes in the working tree that do not interfere
with the difference between the branches.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-08 14:28:43 -08:00
Ronnie Sahlberg
1b70fe5d30 receive-pack.c: negotiate atomic push support
This adds the atomic protocol option to allow
receive-pack to inform the client that it has
atomic push capability.

This commit makes the functionality introduced
in the previous commits go live for the serving
side. The changes in documentation reflect the
protocol capabilities of the server.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:43 -08:00
Stefan Beller
68deed298a receive-pack.c: add execute_commands_atomic function
This introduces the new function execute_commands_atomic which will use
one atomic transaction for all updates. The default behavior is still
the old non atomic way, one ref at a time. This is to cause as little
disruption as possible to existing clients. It is unknown if there are
client scripts that depend on the old non-atomic behavior so we make it
opt-in for now.

A later patch will add the possibility to actually use the functionality
added by this patch. For now use_atomic is always 0.

Inspired-by: Ronnie Sahlberg <sahlberg@google.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:43 -08:00
Stefan Beller
222368c645 receive-pack.c: move transaction handling in a central place
This moves all code related to transactions into the
execute_commands_non_atomic function. This includes
beginning and committing the transaction as well as
dealing with the errors which may occur during the
begin and commit phase of a transaction.

No functional changes intended.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:43 -08:00
Stefan Beller
a1a261457c receive-pack.c: move iterating over all commands outside execute_commands
This commit allows us in a later patch to easily distinguish between
the non atomic way to update the received refs and the atomic way which
is introduced in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:43 -08:00
Stefan Beller
b6a4788586 receive-pack.c: die instead of error in case of possible future bug
Discussion on the previous patch revealed we rather want to err on the
safe side. To do so we need to stop receive-pack in case of the possible
future bug when connectivity is not checked on a shallow push.

Also while touching that code we considered that removing the reported
refs may be harmful in some situations. Sound the message more like a
"This Cannot Happen, Please Investigate!" instead of giving advice to
remove refs.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:42 -08:00
Stefan Beller
a6a8431968 receive-pack.c: shorten the execute_commands loop over all commands
Make the main "execute_commands" loop in receive-pack easier to read
by splitting out some steps into helper functions. The new helper
'should_process_cmd' checks if a ref update is unnecessary, whether
due to an error having occurred or for another reason. The helper
'warn_if_skipped_connectivity_check' warns if we have forgotten to
run a connectivity check on a ref which is shallow for the client
which would be a bug.

This will help us to duplicate less code in a later patch when we make
a second copy of the "execute_commands" loop.

No functional change intended.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:56:42 -08:00
Junio C Hamano
72ecc6ef53 Merge branch 'js/push-to-deploy'
"git push" into a repository with a working tree normally refuses
to modify the branch that is checked out.  The command learned to
optionally do an equivalent of "git reset --hard" only when there
is no change to the working tree and the index instead, which would
be useful to "deploy" by pushing into a repository.

* js/push-to-deploy:
  t5516: more tests for receive.denyCurrentBranch=updateInstead
  receive-pack: add another option for receive.denyCurrentBranch
2014-12-22 12:27:04 -08:00
Junio C Hamano
a7ddaa8eac Merge branch 'mh/simplify-repack-without-refs'
"git remote update --prune" to drop many refs has been optimized.

* mh/simplify-repack-without-refs:
  sort_string_list(): rename to string_list_sort()
  prune_remote(): iterate using for_each_string_list_item()
  prune_remote(): rename local variable
  repack_without_refs(): make the refnames argument a string_list
  prune_remote(): sort delete_refs_list references en masse
  prune_remote(): initialize both delete_refs lists in a single loop
  prune_remote(): exit early if there are no stale references
2014-12-22 12:26:50 -08:00
Junio C Hamano
0e0252b755 Merge branch 'rs/receive-pack-use-labs'
* rs/receive-pack-use-labs:
  use labs() for variables of type long instead of abs()
2014-12-05 11:42:54 -08:00
Junio C Hamano
21b138d0f6 receive-pack: refactor updateInstead codepath
Keep the "there is nothing to update in a bare repository", "when
the check and update process runs, here are the GIT_DIR and
GIT_WORK_TREE" logic, which will be common regardless of how the
decision to update and the actual update are done, in the original
update_worktree() function, and split out the "working tree and
the index must match the original HEAD exactly" and "use two-way
read-tree to update the working tree" into a new push_to_deploy()
helper function.  This will allow customizing the logic more cleanly
and easily.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 13:57:28 -08:00
Nguyễn Thái Ngọc Duy
dcf692625a path.c: make get_pathname() call sites return const char *
Before the previous commit, get_pathname returns an array of PATH_MAX
length. Even if git_path() and similar functions does not use the
whole array, git_path() caller can, in theory.

After the commit, get_pathname() may return a buffer that has just
enough room for the returned string and git_path() caller should never
write beyond that.

Make git_path(), mkpath() and git_path_submodule() return a const
buffer to make sure callers do not write in it at all.

This could have been part of the previous commit, but the "const"
conversion is too much distraction from the core changes in path.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 11:00:10 -08:00
Johannes Schindelin
1404bcbb6b receive-pack: add another option for receive.denyCurrentBranch
When synchronizing between working directories, it can be handy to update
the current branch via 'push' rather than 'pull', e.g. when pushing a fix
from inside a VM, or when pushing a fix made on a user's machine (where
the developer is not at liberty to install an ssh daemon let alone know
the user's password).

The common workaround – pushing into a temporary branch and then merging
on the other machine – is no longer necessary with this patch.

The new option is:

'updateInstead':
	Update the working tree accordingly, but refuse to do so if there
	are any uncommitted changes.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-30 17:15:13 -08:00
Michael Haggerty
3383e19984 sort_string_list(): rename to string_list_sort()
The new name is more consistent with the names of other
string_list-related functions.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-25 10:11:34 -08:00
René Scharfe
31a8aa1ee8 use labs() for variables of type long instead of abs()
Using abs() on long values can cause truncation, so use labs() instead.
Reported by Clang 3.5 (-Wabsolute-value, enabled by -Wall).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-17 08:57:07 -08:00
Junio C Hamano
1d42cf3c6c Merge branch 'jc/push-cert'
* jc/push-cert:
  receive-pack: avoid minor leak in case start_async() fails
2014-10-31 11:49:55 -07:00
René Scharfe
5d222c099e receive-pack: avoid minor leak in case start_async() fails
If the asynchronous start of copy_to_sideband() fails, then any
env_array entries added to struct child_process proc by
prepare_push_cert_sha1() are leaked.  Call the latter function only
after start_async() succeeded so that the allocated entries are
cleaned up automatically by start_command() or finish_command().

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-28 14:55:15 -07:00
Junio C Hamano
217610d7d6 Merge branch 'rs/run-command-env-array'
Add managed "env" array to child_process to clarify the lifetime
rules.

* rs/run-command-env-array:
  use env_array member of struct child_process
  run-command: add env_array, an optional argv_array for env
2014-10-24 14:57:54 -07:00
Junio C Hamano
3c85452bb0 Merge branch 'rs/ref-transaction'
The API to update refs have been restructured to allow introducing
a true transactional updates later.  We would even allow storing
refs in backends other than the traditional filesystem-based one.

* rs/ref-transaction: (25 commits)
  ref_transaction_commit: bail out on failure to remove a ref
  lockfile: remove unable_to_lock_error
  refs.c: do not permit err == NULL
  remote rm/prune: print a message when writing packed-refs fails
  for-each-ref: skip and warn about broken ref names
  refs.c: allow listing and deleting badly named refs
  test: put tests for handling of bad ref names in one place
  packed-ref cache: forbid dot-components in refnames
  branch -d: simplify by using RESOLVE_REF_READING
  branch -d: avoid repeated symref resolution
  reflog test: test interaction with detached HEAD
  refs.c: change resolve_ref_unsafe reading argument to be a flags field
  refs.c: make write_ref_sha1 static
  fetch.c: change s_update_ref to use a ref transaction
  refs.c: ref_transaction_commit: distinguish name conflicts from other errors
  refs.c: pass a list of names to skip to is_refname_available
  refs.c: call lock_ref_sha1_basic directly from commit
  refs.c: refuse to lock badly named refs in lock_ref_sha1_basic
  rename_ref: don't ask read_ref_full where the ref came from
  refs.c: pass the ref log message to _create/delete/update instead of _commit
  ...
2014-10-21 13:28:10 -07:00
Junio C Hamano
b67588d018 Merge branch 'rs/receive-pack-argv-leak-fix'
* rs/receive-pack-argv-leak-fix:
  receive-pack: plug minor memory leak in unpack()
2014-10-20 12:23:45 -07:00
René Scharfe
a915459097 use env_array member of struct child_process
Convert users of struct child_process to using the managed env_array for
specifying environment variables instead of supplying an array on the
stack or bringing their own argv_array.  This shortens and simplifies
the code and ensures automatically that the allocated memory is freed
after use.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-19 15:26:34 -07:00
Ronnie Sahlberg
7695d118e5 refs.c: change resolve_ref_unsafe reading argument to be a flags field
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading).  Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.

While at it, swap two of the arguments in the function to put output
arguments at the end.  As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.

Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:24 -07:00
Ronnie Sahlberg
db7516ab9f refs.c: pass the ref log message to _create/delete/update instead of _commit
Change the ref transaction API so that we pass the reflog message to the
create/delete/update functions instead of to ref_transaction_commit.
This allows different reflog messages for each ref update in a multi-ref
transaction.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:22 -07:00
Junio C Hamano
bd107e1052 Merge branch 'mh/lockfile'
The lockfile API and its users have been cleaned up.

* mh/lockfile: (38 commits)
  lockfile.h: extract new header file for the functions in lockfile.c
  hold_locked_index(): move from lockfile.c to read-cache.c
  hold_lock_file_for_append(): restore errno before returning
  get_locked_file_path(): new function
  lockfile.c: rename static functions
  lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF
  commit_lock_file_to(): refactor a helper out of commit_lock_file()
  trim_last_path_component(): replace last_path_elm()
  resolve_symlink(): take a strbuf parameter
  resolve_symlink(): use a strbuf for internal scratch space
  lockfile: change lock_file::filename into a strbuf
  commit_lock_file(): use a strbuf to manage temporary space
  try_merge_strategy(): use a statically-allocated lock_file object
  try_merge_strategy(): remove redundant lock_file allocation
  struct lock_file: declare some fields volatile
  lockfile: avoid transitory invalid states
  git_config_set_multivar_in_file(): avoid call to rollback_lock_file()
  dump_marks(): remove a redundant call to rollback_lock_file()
  api-lockfile: document edge cases
  commit_lock_file(): rollback lock file on failure to rename
  ...
2014-10-14 10:49:45 -07:00
René Scharfe
64a7e92f28 receive-pack: plug minor memory leak in unpack()
The argv_array used in unpack() is never freed.  Instead of adding
explicit calls to argv_array_clear() use the args member of struct
child_process and let run_command() and friends clean up for us.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-13 11:50:20 -07:00
Junio C Hamano
fb06b5280e Merge branch 'jc/push-cert'
Allow "git push" request to be signed, so that it can be verified and
audited, using the GPG signature of the person who pushed, that the
tips of branches at a public repository really point the commits
the pusher wanted to, without having to "trust" the server.

* jc/push-cert: (24 commits)
  receive-pack::hmac_sha1(): copy the entire SHA-1 hash out
  signed push: allow stale nonce in stateless mode
  signed push: teach smart-HTTP to pass "git push --signed" around
  signed push: fortify against replay attacks
  signed push: add "pushee" header to push certificate
  signed push: remove duplicated protocol info
  send-pack: send feature request on push-cert packet
  receive-pack: GPG-validate push certificates
  push: the beginning of "git push --signed"
  pack-protocol doc: typofix for PKT-LINE
  gpg-interface: move parse_signature() to where it should be
  gpg-interface: move parse_gpg_output() to where it should be
  send-pack: clarify that cmds_sent is a boolean
  send-pack: refactor inspecting and resetting status and sending commands
  send-pack: rename "new_refs" to "need_pack_data"
  receive-pack: factor out capability string generation
  send-pack: factor out capability string generation
  send-pack: always send capabilities
  send-pack: refactor decision to send update per ref
  send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher
  ...
2014-10-08 13:05:25 -07:00
Michael Haggerty
697cc8efd9 lockfile.h: extract new header file for the functions in lockfile.c
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).

Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:56:14 -07:00
Junio C Hamano
bdab1bca53 Merge branch 'jc/ignore-sigpipe-while-running-hooks'
pre- and post-receive hooks are no longer required to read all
their inputs.

* jc/ignore-sigpipe-while-running-hooks:
  receive-pack: allow hooks to ignore its standard input stream
2014-09-26 14:39:44 -07:00
Brian Gernhardt
6f5ef44e0d receive-pack::hmac_sha1(): copy the entire SHA-1 hash out
clang gives the following warning:

builtin/receive-pack.c:327:35: error: sizeof on array function
parameter will return size of 'unsigned char *' instead of 'unsigned
char [20]' [-Werror,-Wsizeof-array-argument]
        git_SHA1_Update(&ctx, out, sizeof(out));
                                         ^
builtin/receive-pack.c:292:37: note: declared here
static void hmac_sha1(unsigned char out[20],
                                   ^
Signed-off-by: Brian Gernhardt <brian@gernhardtsoftware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-25 11:12:57 -07:00
Junio C Hamano
5732373daa signed push: allow stale nonce in stateless mode
When operating with the stateless RPC mode, we will receive a nonce
issued by another instance of us that advertised our capability and
refs some time ago.  Update the logic to check received nonce to
detect this case, compute how much time has passed since the nonce
was issued and report the status with a new environment variable
GIT_PUSH_CERT_NONCE_SLOP to the hooks.

GIT_PUSH_CERT_NONCE_STATUS will report "SLOP" in such a case.  The
hooks are free to decide how large a slop it is willing to accept.

Strictly speaking, the "nonce" is not really a "nonce" anymore in
the stateless RPC mode, as it will happily take any "nonce" issued
by it (which is protected by HMAC and its secret key) as long as it
is fresh enough.  The degree of this security degradation, relative
to the native protocol, is about the same as the "we make sure that
the 'git push' decided to update our refs with new objects based on
the freshest observation of our refs by making sure the values they
claim the original value of the refs they ask us to update exactly
match the current state" security is loosened to accomodate the
stateless RPC mode in the existing code without this series, so
there is no need for those who are already using smart HTTP to push
to their repositories to be alarmed any more than they already are.

In addition, the server operator can set receive.certnonceslop
configuration variable to specify how stale a nonce can be (in
seconds).  When this variable is set, and if the nonce received in
the certificate that passes the HMAC check was less than that many
seconds old, hooks are given "OK" in GIT_PUSH_CERT_NONCE_STATUS
(instead of "SLOP") and the received nonce value is given in
GIT_PUSH_CERT_NONCE, which makes it easier for a simple-minded
hook to check if the certificate we received is recent enough.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-17 15:19:54 -07:00
Junio C Hamano
b89363e4a5 signed push: fortify against replay attacks
In order to prevent a valid push certificate for pushing into an
repository from getting replayed in a different push operation, send
a nonce string from the receive-pack process and have the signer
include it in the push certificate.  The receiving end uses an HMAC
hash of the path to the repository it serves and the current time
stamp, hashed with a secret seed (the secret seed does not have to
be per-repository but can be defined in /etc/gitconfig) to generate
the nonce, in order to ensure that a random third party cannot forge
a nonce that looks like it originated from it.

The original nonce is exported as GIT_PUSH_CERT_NONCE for the hooks
to examine and match against the value on the "nonce" header in the
certificate to notice a replay, but returned "nonce" header in the
push certificate is examined by receive-pack and the result is
exported as GIT_PUSH_CERT_NONCE_STATUS, whose value would be "OK"
if the nonce recorded in the certificate matches what we expect, so
that the hooks can more easily check.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-17 14:27:40 -07:00
Junio C Hamano
ec7dbd145b receive-pack: allow hooks to ignore its standard input stream
The pre-receive and post-receive hooks were designed to be an
improvement over old style update and post-update hooks, which take
the update information on their command line and are limited by the
command line length limit.  The same information is fed from the
standard input to pre/post-receive hooks instead to lift this
limitation.  It has been mandatory for these new style hooks to
consume the update information fully from the standard input stream.
Otherwise, they would risk killing the receive-pack process via
SIGPIPE.

If a hook does not want to look at all the information, it is easy
to send its standard input to /dev/null (perhaps a niche use of hook
might need to know only the fact that a push was made, without
having to know what objects have been pushed to update which refs),
and this has already been done by existing hooks that are written
carefully.

However, because there is no good way to consistently fail hooks
that do not consume the input fully (a small push may result in a
short update record that may fit within the pipe buffer, to which
the receive-pack process may manage to write before the hook has a
chance to exit without reading anything, which will not result in a
death-by-SIGPIPE of receive-pack), it can lead to a hard to diagnose
"once in a blue moon" phantom failure.

Lift this "hooks must consume their input fully" mandate.  A mandate
that is not enforced strictly is not helping us to catch mistakes in
hooks.  If a hook has a good reason to decide the outcome of its
operation without reading the information we feed it, let it do so
as it pleases.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-16 15:11:58 -07:00
Junio C Hamano
4adf569dea signed push: remove duplicated protocol info
With the interim protocol, we used to send the update commands even
though we already send a signed copy of the same information when
push certificate is in use.  Update the send-pack/receive-pack pair
not to do so.

The notable thing on the receive-pack side is that it makes sure
that there is no command sent over the traditional protocol packet
outside the push certificate.  Otherwise a pusher can claim to be
pushing one set of ref updates in the signed certificate while
issuing commands to update unrelated refs, and such an update will
evade later audits.

Finally, start documenting the protocol.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:28 -07:00
Junio C Hamano
d05b9618ce receive-pack: GPG-validate push certificates
Reusing the GPG signature check helpers we already have, verify
the signature in receive-pack and give the results to the hooks
via GIT_PUSH_CERT_{SIGNER,KEY,STATUS} environment variables.

Policy decisions, such as accepting or rejecting a good signature by
a key that is not fully trusted, is left to the hook and kept
outside of the core.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:28 -07:00
Junio C Hamano
a85b377d04 push: the beginning of "git push --signed"
While signed tags and commits assert that the objects thusly signed
came from you, who signed these objects, there is not a good way to
assert that you wanted to have a particular object at the tip of a
particular branch.  My signing v2.0.1 tag only means I want to call
the version v2.0.1, and it does not mean I want to push it out to my
'master' branch---it is likely that I only want it in 'maint', so
the signature on the object alone is insufficient.

The only assurance to you that 'maint' points at what I wanted to
place there comes from your trust on the hosting site and my
authentication with it, which cannot easily audited later.

Introduce a mechanism that allows you to sign a "push certificate"
(for the lack of better name) every time you push, asserting that
what object you are pushing to update which ref that used to point
at what other object.  Think of it as a cryptographic protection for
ref updates, similar to signed tags/commits but working on an
orthogonal axis.

The basic flow based on this mechanism goes like this:

 1. You push out your work with "git push --signed".

 2. The sending side learns where the remote refs are as usual,
    together with what protocol extension the receiving end
    supports.  If the receiving end does not advertise the protocol
    extension "push-cert", an attempt to "git push --signed" fails.

    Otherwise, a text file, that looks like the following, is
    prepared in core:

	certificate version 0.1
	pusher Junio C Hamano <gitster@pobox.com> 1315427886 -0700

	7339ca65... 21580ecb... refs/heads/master
	3793ac56... 12850bec... refs/heads/next

    The file begins with a few header lines, which may grow as we
    gain more experience.  The 'pusher' header records the name of
    the signer (the value of user.signingkey configuration variable,
    falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the
    certificate generation.  After the header, a blank line follows,
    followed by a copy of the protocol message lines.

    Each line shows the old and the new object name at the tip of
    the ref this push tries to update, in the way identical to how
    the underlying "git push" protocol exchange tells the ref
    updates to the receiving end (by recording the "old" object
    name, the push certificate also protects against replaying).  It
    is expected that new command packet types other than the
    old-new-refname kind will be included in push certificate in the
    same way as would appear in the plain vanilla command packets in
    unsigned pushes.

    The user then is asked to sign this push certificate using GPG,
    formatted in a way similar to how signed tag objects are signed,
    and the result is sent to the other side (i.e. receive-pack).

    In the protocol exchange, this step comes immediately before the
    sender tells what the result of the push should be, which in
    turn comes before it sends the pack data.

 3. When the receiving end sees a push certificate, the certificate
    is written out as a blob.  The pre-receive hook can learn about
    the certificate by checking GIT_PUSH_CERT environment variable,
    which, if present, tells the object name of this blob, and make
    the decision to allow or reject this push.  Additionally, the
    post-receive hook can also look at the certificate, which may be
    a good place to log all the received certificates for later
    audits.

Because a push certificate carry the same information as the usual
command packets in the protocol exchange, we can omit the latter
when a push certificate is in use and reduce the protocol overhead.
This however is not included in this patch to make it easier to
review (in other words, the series at this step should never be
released without the remainder of the series, as it implements an
interim protocol that will be incompatible with the final one).
As such, the documentation update for the protocol is left out of
this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:20 -07:00
Junio C Hamano
52d2ae582e receive-pack: factor out capability string generation
Similar to the previous one for send-pack, make it easier and
cleaner to add to capability advertisement.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:19 -07:00
Junio C Hamano
39895c74d8 receive-pack: factor out queueing of command
Make a helper function to accept a line of a protocol message and
queue an update command out of the code from read_head_info().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:18 -07:00
Junio C Hamano
c09b71ccc4 receive-pack: do not reuse old_sha1[] for other things
This piece of code reads object names of shallow boundaries, not
old_sha1[], i.e. the current value the ref points at, which is to be
replaced by what is in new_sha1[].

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:18 -07:00
Junio C Hamano
0e3c339bb6 receive-pack: parse feature request a bit earlier
Ideally, we should have also allowed the first "shallow" to carry
the feature request trailer, but that is water under the bridge
now.  This makes the next step to factor out the queuing of commands
easier to review.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:18 -07:00
Junio C Hamano
3bfcb95fa8 receive-pack: do not overallocate command structure
An "update" command in the protocol exchange consists of 40-hex old
object name, SP, 40-hex new object name, SP, and a refname, but the
first instance is further followed by a NUL with feature requests.

The command structure, which has a flex-array member that stores the
refname at the end, was allocated based on the whole length of the
update command, without excluding the trailing feature requests.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 13:23:18 -07:00
Junio C Hamano
01d678a226 Merge branch 'rs/ref-transaction-1'
The second batch of the transactional ref update series.

* rs/ref-transaction-1: (22 commits)
  update-ref --stdin: pass transaction around explicitly
  update-ref --stdin: narrow scope of err strbuf
  refs.c: make delete_ref use a transaction
  refs.c: make prune_ref use a transaction to delete the ref
  refs.c: remove lock_ref_sha1
  refs.c: remove the update_ref_write function
  refs.c: remove the update_ref_lock function
  refs.c: make lock_ref_sha1 static
  walker.c: use ref transaction for ref updates
  fast-import.c: use a ref transaction when dumping tags
  receive-pack.c: use a reference transaction for updating the refs
  refs.c: change update_ref to use a transaction
  branch.c: use ref transaction for all ref updates
  fast-import.c: change update_branch to use ref transactions
  sequencer.c: use ref transactions for all ref updates
  commit.c: use ref transactions for updates
  replace.c: use the ref transaction functions for updates
  tag.c: use ref transactions when doing updates
  refs.c: add transaction.status and track OPEN/CLOSED
  refs.c: make ref_transaction_begin take an err argument
  ...
2014-09-11 10:33:31 -07:00
Ronnie Sahlberg
6629ea2d4a receive-pack.c: use a reference transaction for updating the refs
Wrap all the ref updates inside a transaction.

In the new API there is no distinction between failure to lock and
failure to write a ref.  Both can be permanent (e.g., a ref
"refs/heads/topic" is blocking creation of the lock file
"refs/heads/topic/1.lock") or transient (e.g., file system full) and
there's no clear difference in how the client should respond, so
replace the two statuses "failed to lock" and "failed to write" with
a single status "failed to update ref".  In both cases a more
detailed message is sent by sideband to diagnose the problem.

Example, before:

 error: there are still refs under 'refs/heads/topic'
 remote: error: failed to lock refs/heads/topic
 To foo
  ! [remote rejected] HEAD -> topic (failed to lock)

After:

 error: there are still refs under 'refs/heads/topic'
 remote: error: Cannot lock the ref 'refs/heads/topic'.
 To foo
  ! [remote rejected] HEAD -> topic (failed to update ref)

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:14 -07:00
René Scharfe
d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Junio C Hamano
ad524f834a Merge branch 'jk/misc-fixes-maint'
* jk/misc-fixes-maint:
  apply: avoid possible bogus pointer
  fix memory leak parsing core.commentchar
  transport: fix leaks in refs_from_alternate_cb
  free ref string returned by dwim_ref
  receive-pack: don't copy "dir" parameter
2014-07-28 11:30:41 -07:00