Commit graph

103 commits

Author SHA1 Message Date
Jeff King de89f0b25a remote-curl: die directly with http error messages
When we encounter an unknown http error (e.g., a 403), we
hand the error code to http_error, which then prints it with
error(). After that we die with the redundant message "HTTP
request failed".

Instead, let's just drop http_error entirely, which does
nothing but pass arguments to error(), and instead die
directly with a useful message.

So before:

  $ git clone https://example.com/repo.git
  Cloning into 'repo'...
  error: unable to access 'https://example.com/repo.git': The requested URL returned error: 403 Forbidden
  fatal: HTTP request failed

and after:

  $ git clone https://example.com/repo.git
  Cloning into 'repo'...
  fatal: unable to access 'https://example.com/repo.git': The requested URL returned error: 403 Forbidden

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:45 -07:00
Jeff King 67d2a7b5c5 http: simplify http_error helper function
This helper function should really be a one-liner that
prints an error message, but it has ended up unnecessarily
complicated:

  1. We call error() directly when we fail to start the curl
     request, so we must later avoid printing a duplicate
     error in http_error().

     It would be much simpler in this case to just stuff the
     error message into our usual curl_errorstr buffer
     rather than printing it ourselves. This means that
     http_error does not even have to care about curl's exit
     value (the interesting part is in the errorstr buffer
     already).

  2. We return the "ret" value passed in to us, but none of
     the callers actually cares about our return value. We
     can just drop this entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:44 -07:00
Jeff King d5ccbe4dfb remote-curl: consistently report repo url for http errors
When we report http errors in fetching the initial ref
advertisement, we show the full URL we attempted to use,
including "info/refs?service=git-upload-pack". While this
may be useful for debugging a broken server, it is
unnecessarily verbose and confusing for most cases, in which
the client user is not even the same person as the owner of
the repository.

Let's just show the repository URL; debugging can happen
with GIT_CURL_VERBOSE, which shows way more useful
information, anyway.

At the same time, let's also make sure to mention the
repository URL when we report failed authentication
(previously we said only "Authentication failed"). Knowing
the URL can help the user realize why authentication failed
(e.g., they meant to push to remote A, not remote B).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:43 -07:00
Jeff King cfa0f4040d remote-curl: always show friendlier 404 message
When we get an http 404 trying to get the initial list of
refs from the server, we try to be helpful and remind the
user that update-server-info may need to be run. This looks
like:

  $ git clone https://github.com/non/existent
  Cloning into 'existent'...
  fatal: https://github.com/non/existent/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

Suggesting update-server-info may be a good suggestion for
users who are in control of the server repo and who are
planning to set up dumb http. But for users of smart http,
and especially users who are not in control of the server
repo, the advice is useless and confusing.

Since most people are expected to use smart http these days,
it does not make sense to keep the update-server-info hint.

We not only drop the mention of update-server-info, but also
show only the main repo URL, not the full "info/refs" and
service parameter. These elements may be useful for
debugging a broken server configuration, but in the majority
of cases, users are not fetching from their own
repositories, but rather from other people's repositories;
they have neither the power nor interest to fix a broken
configuration, and the extra components just make the
message more confusing. Users who do want to debug can and
should use GIT_CURL_VERBOSE to get more complete information
on the actual URLs visited.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:43 -07:00
Jeff King 110bcdc3d0 remote-curl: let servers override http 404 advice
When we get an http 404 trying to get the initial list of
refs from the server, we try to be helpful and remind the
user that update-server-info may need to be run. This looks
like:

  $ git clone https://github.com/non/existent
  Cloning into 'existent'...
  fatal: https://github.com/non/existent/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

Suggesting update-server-info may be a good suggestion for
users who are in control of the server repo and who are
planning to set up dumb http. But for users of smart http,
and especially users who are not in control of the server
repo, the advice is useless and confusing.

The previous patch taught remote-curl to show custom advice
from the server when it is available. When we have shown
messages from the server, we can also drop our custom
advice; what the server has to say is likely to be more
accurate and helpful.

We not only drop the mention of update-server-info, but also
show only the main repo URL, not the full "info/refs" and
service parameter. These elements may be useful for
debugging a broken server configuration, but again, anything
the server has provided is likely to be more useful (and one
can still use GIT_CURL_VERBOSE to get much more complete
debugging information).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:42 -07:00
Jeff King 426e70d4a1 remote-curl: show server content on http errors
If an http request to a remote git server fails, we show
only the http response code, or sometimes a custom message
for particular codes. This gives the server no opportunity
to offer a more detailed explanation of the reason for the
failure, or to give extra advice.

This patch teaches remote-curl to record and display the
body content of a failed http response. We only display such
responses when the content-type is advertised as text/plain,
as it is the most likely to look presentable on the user's
terminal (and it is hoped to be a good indication that the
message is intended for git clients, and not for a web
browser).

Each line of the new output is prepended with "remote:".
Example output may look like this (assuming the server is
configured to display such a helpful message):

  $ GIT_SMART_HTTP=0 git clone https://example.com/some/repo.git
  Cloning into 'repo'...
  remote: Sorry, fetching via dumb http is forbidden.
  remote: Please upgrade your git client to v1.6.6 or greater
  remote: and make sure that smart-http is enabled.
  error: The requested URL returned error: 403 while accessing http://localhost:5001/some/repo.git/info/refs
  fatal: HTTP request failed

For the sake of simplicity, we only record and display these
errors during the initial fetch of the ref list, as that is
the initial contact with the server and where the most
common, interesting errors happen (and there is already
precedent, as that is the only place we currently massage
http error codes into more helpful messages).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:42 -07:00
Jeff King 2a4552021a remote-curl: always parse incoming refs
When remote-curl receives a list of refs from a server, it
keeps the whole buffer intact. When we get a "list" command,
we feed the result to get_remote_heads, and when we get a
"fetch" or "push" command, we feed it to fetch-pack or
send-pack, respectively.

If the HTTP response from the server is truncated for any
reason, we will get an incomplete ref advertisement. If we
then feed this incomplete list to fetch-pack, one of a few
things may happen:

  1. If the truncation is in a packet header, fetch-pack
     will notice the bogus line and complain.

  2. If the truncation is inside a packet, fetch-pack will
     keep waiting for us to send the rest of the packet,
     which we never will.

  3. If the truncation is at a packet boundary, fetch-pack
     will keep waiting for us to send the next packet, which
     we never will.

As a result, fetch-pack hangs, waiting for input.  However,
remote-curl believes it has sent all of the advertisement,
and therefore waits for fetch-pack to speak. The two
processes end up in a deadlock.

We do notice the broken ref list if we feed it to
get_remote_heads. So if git asks the helper to do a "list"
followed by a "fetch", we are safe; we'll abort during the
list operation, which parses the refs.

This patch teaches remote-curl to always parse and save the
incoming ref list when we read the ref advertisement from a
server. That means that we will always verify and abort
before even running fetch-pack (or send-pack) when reading a
corrupted list, even if we do not run the "list" command
explicitly.

Since we save the result, in the common case of running
"list" then "fetch", we do not do any extra parsing at all.
In the case of just a "fetch", we do an extra round of
parsing, but only once.

Note also that the "fetch" case will now also initialize
server_capabilities from the remote (in remote-curl; we
already would do so inside fetch-pack).  Doing "list+fetch"
already does this. It doesn't actually matter now, but the
new behavior is arguably more correct, should remote-curl
ever start caring about the server's capability list.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Jeff King b8054bbee7 remote-curl: move ref-parsing code up in file
The ref-parsing functions are static. Let's move them up in
the file to be available to more functions, which will help
us with later refactoring.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Jeff King 5dbf43602d remote-curl: pass buffer straight to get_remote_heads
Until recently, get_remote_heads only knew how to read refs
from a file descriptor. To hack around this, we spawned a
thread (or forked a process) to write the buffer back to us.

Now that we can just pass it our buffer directly, we don't
have to use this hack anymore.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Jeff King 85edf4f58b teach get_remote_heads to read from a memory buffer
Now that we can read packet data from memory as easily as a
descriptor, get_remote_heads can take either one as a
source. This will allow further refactoring in remote-curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Jeff King 4981fe750b pkt-line: share buffer/descriptor reading implementation
The packet_read function reads from a descriptor. The
packet_get_line function is similar, but reads from an
in-memory buffer, and uses a completely separate
implementation. This patch teaches the generic packet_read
function to accept either source, and we can do away with
packet_get_line's implementation.

There are two other differences to account for between the
old and new functions. The first is that we used to read
into a strbuf, but now read into a fixed size buffer. The
only two callers are fine with that, and in fact it
simplifies their code, since they can use the same
static-buffer interface as the rest of the packet_read_line
callers (and we provide a similar convenience wrapper for
reading from a buffer rather than a descriptor).

This is technically an externally-visible behavior change in
that we used to accept arbitrary sized packets up to 65532
bytes, and now cap out at LARGE_PACKET_MAX, 65520. In
practice this doesn't matter, as we use it only for parsing
smart-http headers (of which there is exactly one defined,
and it is small and fixed-size). And any extension headers
would be breaking the protocol to go over LARGE_PACKET_MAX
anyway.

The other difference is that packet_get_line would return
on error rather than dying. However, both callers of
packet_get_line are actually improved by dying.

The first caller does its own error checking, but we can
drop that; as a result, we'll actually get more specific
reporting about protocol breakage when packet_read dies
internally. The only downside is that packet_read will not
print the smart-http URL that failed, but that's not a big
deal; anybody not debugging can already see the remote's URL
already, and anybody debugging would want to run with
GIT_CURL_VERBOSE anyway to see way more information.

The second caller, which is just trying to skip past any
extra smart-http headers (of which there are none defined,
but which we allow to keep room for future expansion), did
not error check at all. As a result, it would treat an error
just like a flush packet. The resulting mess would generally
cause an error later in get_remote_heads, but now we get
error reporting much closer to the source of the problem.

Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:14:15 -08:00
Jeff King 819b929d33 pkt-line: teach packet_read_line to chomp newlines
The packets sent during ref negotiation are all terminated
by newline; even though the code to chomp these newlines is
short, we end up doing it in a lot of places.

This patch teaches packet_read_line to auto-chomp the
trailing newline; this lets us get rid of a lot of inline
chomping code.

As a result, some call-sites which are not reading
line-oriented data (e.g., when reading chunks of packfiles
alongside sideband) transition away from packet_read_line to
the generic packet_read interface. This patch converts all
of the existing callsites.

Since the function signature of packet_read_line does not
change (but its behavior does), there is a possibility of
new callsites being introduced in later commits, silently
introducing an incompatibility.  However, since a later
patch in this series will change the signature, such a
commit would have to be merged directly into this commit,
not to the tip of the series; we can therefore ignore the
issue.

This is an internal cleanup and should produce no change of
behavior in the normal case. However, there is one corner
case to note. Callers of packet_read_line have never been
able to tell the difference between a flush packet ("0000")
and an empty packet ("0004"), as both cause packet_read_line
to return a length of 0. Readers treat them identically,
even though Documentation/technical/protocol-common.txt says
we must not; it also says that implementations should not
send an empty pkt-line.

By stripping out the newline before the result gets to the
caller, we will now treat the newline-only packet ("0005\n")
the same as an empty packet, which in turn gets treated like
a flush packet. In practice this doesn't matter, as neither
empty nor newline-only packets are part of git's protocols
(at least not for the line-oriented bits, and readers who
are not expecting line-oriented packets will be calling
packet_read directly, anyway). But even if we do decide to
care about the distinction later, it is orthogonal to this
patch.  The right place to tighten would be to stop treating
empty packets as flush packets, and this change does not
make doing so any harder.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Jeff King cdf4fb8e33 pkt-line: drop safe_write function
This is just write_or_die by another name. The one
distinction is that write_or_die will treat EPIPE specially
by suppressing error messages. That's fine, as we die by
SIGPIPE anyway (and in the off chance that it is disabled,
write_or_die will simulate it).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Shawn Pearce 4656bf47fc Verify Content-Type from smart HTTP servers
Before parsing a suspected smart-HTTP response verify the returned
Content-Type matches the standard. This protects a client from
attempting to process a payload that smells like a smart-HTTP
server response.

JGit has been doing this check on all responses since the dawn of
time. I mistakenly failed to include it in git-core when smart HTTP
was introduced. At the time I didn't know how to get the Content-Type
from libcurl. I punted, meant to circle back and fix this, and just
plain forgot about it.

Signed-off-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-04 10:22:36 -08:00
Junio C Hamano fda800f0b1 Merge branch 'jk/maint-http-half-auth-fetch'
Finishing touches to squelch a compiler warning.

* jk/maint-http-half-auth-fetch:
  remote-curl.c: Fix a compiler warning
2012-11-21 11:59:29 -08:00
Ramsay Jones 377115493a remote-curl.c: Fix a compiler warning
In particular, gcc issues an "'gzip_size' might be used uninitialized"
warning (-Wuninitialized). However, this warning is a false positive,
since the 'gzip_size' variable would not, in fact, be used uninitialized.
In order to suppress the warning, we simply initialise the variable to
zero in it's declaration.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-21 11:54:32 -08:00
Junio C Hamano a9bb4e55a3 Merge branch 'jk/maint-http-half-auth-fetch'
Fixes fetch from servers that ask for auth only during the actual
packing phase. This is not really a recommended configuration, but it
cleans up the code at the same time.

* jk/maint-http-half-auth-fetch:
  remote-curl: retry failed requests for auth even with gzip
  remote-curl: hoist gzip buffer size to top of post_rpc
2012-11-20 10:30:17 -08:00
Jeff King 2e736fd5e9 remote-curl: retry failed requests for auth even with gzip
Commit b81401c taught the post_rpc function to retry the
http request after prompting for credentials. However, it
did not handle two cases:

  1. If we have a large request, we do not retry. That's OK,
     since we would have sent a probe (with retry) already.

  2. If we are gzipping the request, we do not retry. That
     was considered OK, because the intended use was for
     push (e.g., listing refs is OK, but actually pushing
     objects is not), and we never gzip on push.

This patch teaches post_rpc to retry even a gzipped request.
This has two advantages:

  1. It is possible to configure a "half-auth" state for
     fetching, where the set of refs and their sha1s are
     advertised, but one cannot actually fetch objects.

     This is not a recommended configuration, as it leaks
     some information about what is in the repository (e.g.,
     an attacker can try brute-forcing possible content in
     your repository and checking whether it matches your
     branch sha1). However, it can be slightly more
     convenient, since a no-op fetch will not require a
     password at all.

  2. It future-proofs us should we decide to ever gzip more
     requests.

Signed-off-by: Jeff King <peff@peff.net>
2012-10-31 07:45:13 -04:00
Jeff King df126e108b remote-curl: hoist gzip buffer size to top of post_rpc
When we gzip the post data for a smart-http rpc request, we
compute the gzip body and its size inside the "use_gzip"
conditional. We keep track of the body after the conditional
ends, but not the size. Let's remember both, which will
enable us to retry failed gzip requests in a future patch.

Signed-off-by: Jeff King <peff@peff.net>
2012-10-31 07:45:08 -04:00
Jeff King 58f3f9893d Merge branch 'jk/maint-http-init-not-in-result-handler'
Further clean-up to the http codepath that picks up results after
cURL library is done with one request slot.

* jk/maint-http-init-not-in-result-handler:
  http: do not set up curl auth after a 401
  remote-curl: do not call run_slot repeatedly
2012-10-29 04:13:09 -04:00
Junio C Hamano 053a08f5bb Merge branch 'jk/maint-http-half-auth-push'
Fixes a regression in maint-1.7.11 (v1.7.11.7), maint (v1.7.12.1)
and master (v1.8.0-rc0).

* jk/maint-http-half-auth-push:
  http: fix segfault in handle_curl_result
2012-10-16 11:44:37 -07:00
Jeff King 1960897ebc http: do not set up curl auth after a 401
When we get an http 401, we prompt for credentials and put
them in our global credential struct. We also feed them to
the curl handle that produced the 401, with the intent that
they will be used on a retry.

When the code was originally introduced in commit 42653c0,
this was a necessary step. However, since dfa1725, we always
feed our global credential into every curl handle when we
initialize the slot with get_active_slot. So every further
request already feeds the credential to curl.

Moreover, accessing the slot here is somewhat dubious. After
the slot has produced a response, we don't actually control
it any more.  If we are using curl_multi, it may even have
been re-initialized to handle a different request.

It just so happens that we will reuse the curl handle within
the slot in such a case, and that because we only keep one
global credential, it will be the one we want.  So the
current code is not buggy, but it is misleading.

By cleaning it up, we can remove the slot argument entirely
from handle_curl_result, making it much more obvious that
slots should not be accessed after they are marked as
finished.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 09:45:15 -07:00
Jeff King abf8df869c remote-curl: do not call run_slot repeatedly
Commit b81401c (http: prompt for credentials on failed POST)
taught post_rpc to call run_slot in a loop in order to retry
a request after asking the user for credentials. However,
after a call to run_slot we will have called
finish_active_slot. This means we have released the slot,
and we should no longer look at it.

As it happens, this does not cause any bugs in the current
code, since we know that we are not using curl_multi in this
code path, and therefore nobody will have taken over our
slot in the meantime. However, it is good form to actually
call get_active_slot again. It also future proofs us against
changes in the http code.

We can do this by jumping back to a retry label at the top
of our function. We just need to reorder a few setup lines
that should not be repeated; everything else within the loop
is either idempotent, needs to be repeated, or in a path we
do not follow (e.g., we do not even try when large_request
is set, because we don't know how much data we might have
streamed from our helper program).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 09:45:13 -07:00
Jeff King 188923f0d1 http: fix segfault in handle_curl_result
When we create an http active_request_slot, we can set its
"results" pointer back to local storage. The http code will
fill in the details of how the request went, and we can
access those details even after the slot has been cleaned
up.

Commit 8809703 (http: factor out http error code handling)
switched us from accessing our local results struct directly
to accessing it via the "results" pointer of the slot. That
means we're accessing the slot after it has been marked as
finished, defeating the whole purpose of keeping the results
storage separate.

Most of the time this doesn't matter, as finishing the slot
does not actually clean up the pointer. However, when using
curl's multi interface with the dumb-http revision walker,
we might actually start a new request before handing control
back to the original caller. In that case, we may reuse the
slot, zeroing its results pointer, and leading the original
caller to segfault while looking for its results inside the
slot.

Instead, we need to pass a pointer to our local results
storage to the handle_curl_result function, rather than
relying on the pointer in the slot struct. This matches what
the original code did before the refactoring (which did not
use a separate function, and therefore just accessed the
results struct directly).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 09:42:31 -07:00
Junio C Hamano fb1e4a85bf Merge branch 'jk/smart-http-switch'
Allows users to turn off smart-http when talking to dumb-only
servers.

* jk/smart-http-switch:
  remote-curl: let users turn off smart http
  remote-curl: rename is_http variable
2012-09-29 22:28:25 -07:00
Junio C Hamano c318040a36 Merge branch 'sp/maint-http-enable-gzip'
Allows a more common 'gzip' Accept-Encoding to be used.

* sp/maint-http-enable-gzip:
  Enable info/refs gzip decompression in HTTP client
2012-09-29 22:28:20 -07:00
Jeff King 02572c2e3a remote-curl: let users turn off smart http
Usually there is no need for users to specify whether an
http remote is smart or dumb; the protocol is designed so
that a single initial request is made, and the client can
determine the server's capability from the response.

However, some misconfigured dumb-only servers may not like
the initial request by a smart client, as it contains a
query string. Until recently, commit 703e6e7 worked around
this by making a second request. However, that commit was
recently reverted due to its side effect of masking the
initial request's error code.

Since git has had that workaround for several years, we
don't know exactly how many such misconfigured servers are
out there. The reversion of 703e6e7 assumes they are rare
enough not to worry about. Still, that reversion leaves
somebody who does run into such a server with no escape
hatch at all. Let's give them an environment variable they
can tweak to perform the "dumb" request.

This is intentionally not a documented interface. It's
overly simple and is really there for debugging in case
somebody does complain about git not working with their
server. A real user-facing interface would entail a
per-remote or per-URL config variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-21 10:33:11 -07:00
Jeff King 243c329c1e remote-curl: rename is_http variable
We don't actually care whether the connection is http or
not; what we care about is whether it might be smart http.
Rename the variable to be more accurate, which will make it
easier to later make smart-http optional.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-20 10:48:45 -07:00
Shawn O. Pearce aa90b9697f Enable info/refs gzip decompression in HTTP client
Some HTTP servers try to use gzip compression on the /info/refs
request to save transfer bandwidth. Repositories with many tags
may find the /info/refs request can be gzipped to be 50% of the
original size due to the few but often repeated bytes used (hex
SHA-1 and commonly digits in tag names).

For most HTTP requests enable "Accept-Encoding: gzip" ensuring
the /info/refs payload can use this encoding format.

Only request gzip encoding from servers. Although deflate is
supported by libcurl, most servers have standardized on gzip
encoding for compression as that is what most browsers support.
Asking for deflate increases request sizes by a few bytes, but is
unlikely to ever be used by a server.

Disable the Accept-Encoding header on probe RPCs as response bodies
are supposed to be exactly 4 bytes long, "0000". The HTTP headers
requesting and indicating compression use more space than the data
transferred in the body.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-20 10:26:50 -07:00
Shawn O. Pearce 6ac964a627 Revert "retry request without query when info/refs?query fails"
This reverts commit 703e6e76a1.

Retrying without the query parameter was added as a workaround
for a single broken HTTP server at git.debian.org[1]. The server
was misconfigured to route every request with a query parameter
into gitweb.cgi. Admins fixed the server's configuration within
16 hours of the bug report to the Git mailing list, but we still
patched Git with this fallback and have been paying for it since.

Most Git hosting services configure the smart HTTP protocol and the
retry logic confuses users when there is a transient HTTP error as
Git dropped the real error from the smart HTTP request. Removing the
retry makes root causes easier to identify.

[1] http://thread.gmane.org/gmane.comp.version-control.git/137609

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-20 10:25:21 -07:00
Jeff King b81401c1de http: prompt for credentials on failed POST
All of the smart-http GET requests go through the http_get_*
functions, which will prompt for credentials and retry if we
see an HTTP 401.

POST requests, however, do not go through any central point.
Moreover, it is difficult to retry in the general case; we
cannot assume the request body fits in memory or is even
seekable, and we don't know how much of it was consumed
during the attempt.

Most of the time, this is not a big deal; for both fetching
and pushing, we make a GET request before doing any POSTs,
so typically we figure out the credentials during the first
request, then reuse them during the POST. However, some
servers may allow a client to get the list of refs from
receive-pack without authentication, and then require
authentication when the client actually tries to POST the
pack.

This is not ideal, as the client may do a non-trivial amount
of work to generate the pack (e.g., delta-compressing
objects). However, for a long time it has been the
recommended example configuration in git-http-backend(1) for
setting up a repository with anonymous fetch and
authenticated push. This setup has always been broken
without putting a username into the URL. Prior to commit
986bbc0, it did work with a username in the URL, because git
would prompt for credentials before making any requests at
all. However, post-986bbc0, it is totally broken. Since it
has been advertised in the manpage for some time, we should
make sure it works.

Unfortunately, it is not as easy as simply calling post_rpc
again when it fails, due to the input issue mentioned above.
However, we can still make this specific case work by
retrying in two specific instances:

  1. If the request is large (bigger than LARGE_PACKET_MAX),
     we will first send a probe request with a single flush
     packet. Since this request is static, we can freely
     retry it.

  2. If the request is small and we are not using gzip, then
     we have the whole thing in-core, and we can freely
     retry.

That means we will not retry in some instances, including:

  1. If we are using gzip. However, we only do so when
     calling git-upload-pack, so it does not apply to
     pushes.

  2. If we have a large request, the probe succeeds, but
     then the real POST wants authentication. This is an
     extremely unlikely configuration and not worth worrying
     about.

While it might be nice to cover those instances, doing so
would be significantly more complex for very little
real-world gain. In the long run, we will be much better off
when curl learns to internally handle authentication as a
callback, and we can cleanly handle all cases that way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27 10:49:09 -07:00
Junio C Hamano 25047b8896 Merge branch 'jk/maint-push-progress' into maint
"git push" over smart-http lost progress output a few releases ago.

By Jeff King
* jk/maint-push-progress:
  t5541: test more combinations of --progress
  teach send-pack about --[no-]progress
  send-pack: show progress when isatty(2)
2012-05-10 10:08:54 -07:00
Junio C Hamano cda03b6ad3 Merge branch 'it/fetch-pack-many-refs' into maint
When "git fetch" encounters repositories with too many references, the
command line of "fetch-pack" that is run by a helper e.g. remote-curl, may
fail to hold all of them. Now such an internal invocation can feed the
references through the standard input of "fetch-pack".

By Ivan Todoroski
* it/fetch-pack-many-refs:
  remote-curl: main test case for the OS command line overflow
  fetch-pack: test cases for the new --stdin option
  remote-curl: send the refs to fetch-pack on stdin
  fetch-pack: new --stdin option to read refs from stdin

Conflicts:
	t/t5500-fetch-pack.sh
2012-05-01 21:12:36 -07:00
Jeff King 391b1f2003 teach send-pack about --[no-]progress
The send_pack function gets a "progress" flag saying "yes,
definitely show progress" or "no, definitely do not show
progress". This gets set properly by transport_push when
send_pack is called directly.

However, when the send-pack command is executed separately
(as it is for the remote-curl helper), there is no way to
tell it "definitely do this". As a result, we do not
properly respect "git push --no-progress" for smart-http
remotes; you will still get progress if stderr is a tty.

This patch teaches send-pack --progress and --no-progress,
and teaches remote-curl to pass the appropriate option to
override send-pack's isatty check. This fixes the
--no-progress case above, and as a bonus, also makes "git
push --progress" work when stderr is not a tty.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-01 09:40:30 -07:00
Ivan Todoroski 8150749da1 remote-curl: send the refs to fetch-pack on stdin
Now that we can throw an arbitrary number of refs at fetch-pack using
its --stdin option, we use it in the remote-curl helper to bypass the
OS command line length limit.

Signed-off-by: Ivan Todoroski <grnch@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 14:49:18 -07:00
Junio C Hamano f2120eb4db Merge branch 'sp/smart-http-failure-to-push' into maint
* sp/smart-http-failure-to-push:
  remote-curl: Fix push status report when all branches fail
2012-02-05 23:58:43 -08:00
Junio C Hamano 2bbf77dde2 Merge branch 'sp/smart-http-failure-to-push'
* sp/smart-http-failure-to-push:
  remote-curl: Fix push status report when all branches fail
2012-01-29 13:18:54 -08:00
Shawn O. Pearce 5238cbf656 remote-curl: Fix push status report when all branches fail
The protocol between transport-helper.c and remote-curl requires
remote-curl to always print a blank line after the push command
has run. If the blank line is ommitted, transport-helper kills its
container process (the git push the user started) with exit(128)
and no message indicating a problem, assuming the helper already
printed reasonable error text to the console.

However if the remote rejects all branches with "ng" commands in the
report-status reply, send-pack terminates with non-zero status, and
in turn remote-curl exited with non-zero status before outputting
the blank line after the helper status printed by send-pack. No
error messages reach the user.

This caused users to see the following from git push over HTTP
when the remote side's update hook rejected the branch:

  $ git push http://... master
  Counting objects: 4, done.
  Delta compression using up to 6 threads.
  Compressing objects: 100% (2/2), done.
  Writing objects: 100% (3/3), 301 bytes, done.
  Total 3 (delta 0), reused 0 (delta 0)
  $

Always print a blank line after the send-pack process terminates,
ensuring the helper status report (if it was output) will be
correctly parsed by the calling transport-helper.c. This ensures
the helper doesn't abort before the status report can be shown to
the user.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-20 10:14:32 -08:00
Clemens Buchacher c207e34f77 fix push --quiet: add 'quiet' capability to receive-pack
Currently, git push --quiet produces some non-error output, e.g.:

 $ git push --quiet
 Unpacking objects: 100% (3/3), done.

This fixes a bug reported for the fedora git package:

 https://bugzilla.redhat.com/show_bug.cgi?id=725593

Reported-by: Jesse Keating <jkeating@redhat.com>
Cc: Todd Zullinger <tmz@pobox.com>

Commit 90a6c7d4 (propagate --quiet to send-pack/receive-pack)
introduced the --quiet option to receive-pack and made send-pack
pass that option. Older versions of receive-pack do not recognize
the option, however, and terminate immediately. The commit was
therefore reverted.

This change instead adds a 'quiet' capability to receive-pack,
which is a backwards compatible.

In addition, this fixes push --quiet via http: A verbosity of 0
means quiet for remote helpers.

Reported-by: Tobias Ulmer <tobiasu@tmux.org>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-08 14:27:28 -08:00
Junio C Hamano 200888ef3b Merge branch 'jk/http-push-to-empty'
* jk/http-push-to-empty:
  remote-curl: don't pass back fake refs

Conflicts:
	remote-curl.c
2011-12-22 11:27:22 -08:00
Junio C Hamano 1d3a035d6d Merge branch 'jk/maint-push-over-dav'
* jk/maint-push-over-dav:
  http-push: enable "proactive auth"
  t5540: test DAV push with authentication

Conflicts:
	http.c
2011-12-19 16:05:59 -08:00
Jeff King 02f7914734 remote-curl: don't pass back fake refs
When receive-pack advertises its list of refs, it generally hides the
capabilities information after a NUL at the end of the first ref.

However, when we have an empty repository, there are no refs, and
therefore receive-pack writes a fake ref "capabilities^{}" with the
capabilities afterwards.

On the client side, git reads the result with get_remote_heads(). We pick
the capabilities from the end of the line, and then call check_ref() to
make sure the ref name is valid. We see that it isn't, and don't bother
adding it to our list of refs.

However, the call to check_ref() is enabled by passing the REF_NORMAL flag
to get_remote_heads. For the regular git transport, we pass REF_NORMAL in
get_refs_via_connect() if we are doing a push (since only receive-pack
uses this fake ref).  But in remote-curl, we never use this flag, and we
accept the fake ref as a real one, passing it back from the helper to the
parent git-push.

Most of the time this bug goes unnoticed, as the fake ref won't match our
refspecs. However, if "--mirror" is used, then we see it as remote cruft
to be pruned, and try to pass along a deletion refspec for it. Of course
this refspec has bogus syntax (because of the ^{}), and the helper
complains, aborting the push.

Let's have remote-curl mirror what the builtin get_refs_via_connect() does
(at least for the case of using git protocol; we can leave the dumb
info/refs reader as it is).

This also fixes pushing with --mirror to a smart-http remote that uses
alternates. The fake ".have" refs the server gives to avoid unnecessary
network transfer has a similar bad interactions with the machinery.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-19 11:21:29 -08:00
Jeff King a4ddbc33d7 http-push: enable "proactive auth"
Before commit 986bbc08, git was proactive about asking for
http passwords. It assumed that if you had a username in
your URL, you would also want a password, and asked for it
before making any http requests.

However, this could interfere with the use of .netrc (see
986bbc08 for details). And it was also unnecessary, since
the http fetching code had learned to recognize an HTTP 401
and prompt the user then. Furthermore, the proactive prompt
could interfere with the usage of .netrc (see 986bbc08 for
details).

Unfortunately, the http push-over-DAV code never learned to
recognize HTTP 401, and so was broken by this change. This
patch does a quick fix of re-enabling the "proactive auth"
strategy only for http-push, leaving the dumb http fetch and
smart-http as-is.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-13 16:34:44 -08:00
Jeff King afe7c5ff1f drop "match" parameter from get_remote_heads
The get_remote_heads function reads the list of remote refs
during git protocol session. It dates all the way back to
def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04).
At that time, the idea was to come up with a list of refs we
were interested in, and then filter the list as we got it
from the remote side.

Later, 1baaae5 (Make maximal use of the remote refs,
2005-10-28) stopped filtering at the get_remote_heads layer,
letting us use the non-matching refs to find common history.

As a result, all callers now simply pass an empty match
list (and any future callers will want to do the same). So
let's drop these now-useless parameters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-13 10:08:24 -08:00
Junio C Hamano 963838402a Merge branch 'jk/http-auth'
* jk/http-auth:
  http_init: accept separate URL parameter
  http: use hostname in credential description
  http: retry authentication failures for all http requests
  remote-curl: don't retry auth failures with dumb protocol
  improve httpd auth tests
  url: decode buffers that are not NUL-terminated
2011-10-17 21:37:15 -07:00
Jeff King deba49377b http_init: accept separate URL parameter
The http_init function takes a "struct remote". Part of its
initialization procedure is to look at the remote's url and
grab some auth-related parameters. However, using the url
included in the remote is:

  - wrong; the remote-curl helper may have a separate,
    unrelated URL (e.g., from remote.*.pushurl). Looking at
    the remote's configured url is incorrect.

  - incomplete; http-fetch doesn't have a remote, so passes
    NULL. So http_init never gets to see the URL we are
    actually going to use.

  - cumbersome; http-push has a similar problem to
    http-fetch, but actually builds a fake remote just to
    pass in the URL.

Instead, let's just add a separate URL parameter to
http_init, and all three callsites can pass in the
appropriate information.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-15 21:18:36 -07:00
Junio C Hamano 7ed72d1813 Merge branch 'sp/smart-http-failure'
* sp/smart-http-failure:
  remote-curl: Fix warning after HTTP failure
2011-10-12 12:34:27 -07:00
Shawn O. Pearce 6cdf0223fe remote-curl: Fix warning after HTTP failure
If the HTTP connection is broken in the middle of a fetch or clone
body, the client presented a useless error message due to part of
the upload-pack->remote-curl pkt-line protocol leaking out of the
helper as the helper's "fetch result":

  error: RPC failed; result=18, HTTP code = 200
  fatal: The remote end hung up unexpectedly
  fatal: early EOF
  fatal: unpack-objects failed
  warning: https unexpectedly said: '0000'

Instead when the HTTP RPC fails discard all remaining data from
upload-pack and report nothing to the transport helper. Errors
were already sent to stderr.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-04 19:11:50 -07:00
Junio C Hamano 48f36dcd73 Sync with 1.7.6.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-06 11:42:12 -07:00
Junio C Hamano 5a277f3ff7 Revert "Merge branch 'cb/maint-quiet-push' into maint"
This reverts commit ffa69e61d3, reversing
changes made to 4a13c4d148.

Adding a new command line option to receive-pack and feed it from
send-pack is not an acceptable way to add features, as there is no
guarantee that your updated send-pack will be talking to updated
receive-pack. New features need to be added via the capability mechanism
negotiated over the protocol.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-06 11:10:41 -07:00