development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-28 19:25:47 +00:00

Author	SHA1	Message	Date
Jonathan Tan	726b25a91b	http: allow custom index-pack args Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Junio C Hamano	34e849b05a	Merge branch 'jt/cdn-offload' The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack	2020-06-25 12:27:47 -07:00
Jonathan Tan	8d5d2a34df	http-fetch: support fetching packfiles by URL Teach http-fetch the ability to download packfiles directly, given a URL, and to verify them. The http_pack_request suite has been augmented with a function that takes a URL directly. With this function, the hash is only used to determine the name of the temporary file. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Jonathan Tan	eb05349247	http: refactor finish_http_pack_request() finish_http_pack_request() does multiple tasks, including some housekeeping on a struct packed_git - (1) closing its index, (2) removing it from a list, and (3) installing it. These concerns are independent of fetching a pack through HTTP: they are there only because (1) the calling code opens the pack's index before deciding to fetch it, (2) the calling code maintains a list of packfiles that can be fetched, and (3) the calling code fetches it in order to make use of its objects in the same process. In preparation for a subsequent commit, which adds a feature that does not need any of this housekeeping, remove (1), (2), and (3) from finish_http_pack_request(). (2) and (3) are now done by a helper function, and (1) is the responsibility of the caller (in this patch, done closer to the point where the pack index is opened). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Jonathan Tan	7167a62b9e	http, imap-send: stop using CURLOPT_VERBOSE Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting CURLOPT_VERBOSE. This is to prevent inadvertent revelation of sensitive data. In particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header nor any cookies specified by GIT_REDACT_COOKIES. Unifying the tracing mechanism also has the future benefit that any improvements to the tracing mechanism will benefit both users of GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to implement any improvement twice. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 11:18:01 -07:00
Carlo Marcelo Arenas Belón	93b980e58f	http: use xmalloc with cURL `f0ed8226c9` (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31) never told cURL about it. Correct that by using the cURL initializer available since version 7.12 to point to xmalloc and friends for consistency which then will pass the allocation requests along when USE_NED_ALLOCATOR=YesPlease is used (most likely in Windows) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-15 12:37:10 -07:00
Junio C Hamano	4aeeef3773	Merge branch 'dl/no-extern-in-func-decl' Mechanically and systematically drop "extern" from function declarlation. * dl/no-extern-in-func-decl: .[ch]: manually align parameter lists .[ch]: remove extern from function declarations using sed *.[ch]: remove extern from function declarations using spatch	2019-05-13 23:50:32 +09:00
Denton Liu	ad6dad0996	*.[ch]: manually align parameter lists In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:10 +09:00
Denton Liu	554544276a	.[ch]: remove extern from function declarations using spatch There has been a push to remove extern from function declarations. Remove some instances of "extern" for function declarations which are caught by Coccinelle. Note that Coccinelle has some difficulty with processing functions with `__attribute__` or varargs so some `extern` declarations are left behind to be dealt with in a future patch. This was the Coccinelle patch used: @@ type T; identifier f; @@ - extern T f(...); and it was run with: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:06 +09:00
Junio C Hamano	d4e568b2a3	Merge branch 'bc/hash-transition-16' Conversion from unsigned char[20] to struct object_id continues. * bc/hash-transition-16: (35 commits) gitweb: make hash size independent Git.pm: make hash size independent read-cache: read data in a hash-independent way dir: make untracked cache extension hash size independent builtin/difftool: use parse_oid_hex refspec: make hash size independent archive: convert struct archiver_args to object_id builtin/get-tar-commit-id: make hash size independent get-tar-commit-id: parse comment record hash: add a function to lookup hash algorithm by length remote-curl: make hash size independent http: replace sha1_to_hex http: compute hash of downloaded objects using the_hash_algo http: replace hard-coded constant with the_hash_algo http-walker: replace sha1_to_hex http-push: remove remaining uses of sha1_to_hex http-backend: allow 64-character hex names http-push: convert to use the_hash_algo builtin/pull: make hash-size independent builtin/am: make hash size independent ...	2019-04-25 16:41:17 +09:00
brian m. carlson	eed0e60f02	http: compute hash of downloaded objects using the_hash_algo Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 11:57:38 +09:00
Jeff King	a3722bcbbd	http: factor out curl result code normalization We make some requests with CURLOPT_FAILONERROR and some without, and then handle_curl_result() normalizes any failures to a uniform CURLcode. There are some other code paths in the dumb-http walker which don't use handle_curl_result(); let's pull the normalization into its own function so it can be reused. Arguably those code paths would benefit from the rest of handle_curl_result(), notably the auth handling. But retro-fitting it now would be a lot of work, and in practice it doesn't matter too much (whatever authentication we needed to make the initial contact with the server is generally sufficient for the rest of the dumb-http requests). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-24 21:22:40 +09:00
Junio C Hamano	cba595ab1a	Merge branch 'jk/loose-object-cache-oid' Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references	2019-02-06 22:05:27 -08:00
Masaya Suzuki	e6cf87b12d	http: enable keep_error for HTTP requests curl stops parsing a response when it sees a bad HTTP status code and it has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP headers on error. keep_error is an option to receive the HTTP response body for those error responses. By enabling this option, curl will process the HTTP response headers, and they're shown if GIT_CURL_VERBOSE is set. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-10 15:00:56 -08:00
Jeff King	f0be0db13d	http: use struct object_id instead of bare sha1 The dumb-http walker code still passes around and stores object ids as "unsigned char *sha1". Let's modernize it. There's probably still more work to be done to handle dumb-http fetches with a new, larger hash. But that can wait; this is enough that we can now convert some of the low-level object routines that we call into from here (and in fact, some of the "oid.hash" references added here will be further improved in the next patch). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Junio C Hamano	7c3d15fe31	Merge branch 'jk/snprintf-truncation' Avoid unchecked snprintf() to make future code auditing easier. * jk/snprintf-truncation: fmt_with_err: add a comment that truncation is OK shorten_unambiguous_ref: use xsnprintf fsmonitor: use internal argv_array of struct child_process log_write_email_headers: use strbufs http: use strbufs instead of fixed buffers	2018-05-30 21:51:28 +09:00
Jeff King	390c6cbc5e	http: use strbufs instead of fixed buffers We keep the names of incoming packs and objects in fixed PATH_MAX-size buffers, and snprintf() into them. This is unlikely to end up with truncated filenames, but it is possible (especially on systems where PATH_MAX is shorter than actual paths can be). Let's switch to using strbufs, which makes the question go away entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-21 09:54:30 +09:00
Brandon Williams	8ff14ed412	http: allow providing extra headers for http requests Add a way for callers to request that extra headers be included when making http requests. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
David Turner	37ee680d9b	http.postbuffer: allow full range of ssize_t values Unfortunately, in order to push some large repos where a server does not support chunked encoding, the http postbuffer must sometimes exceed two gigabytes. On a 64-bit system, this is OK: we just malloc a larger buffer. This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the buffer size. Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-13 18:24:32 -07:00
Junio C Hamano	8a2882f23e	Merge branch 'jk/http-walker-limit-redirect-2.9' Transport with dumb http can be fooled into following foreign URLs that the end user does not intend to, especially with the server side redirects and http-alternates mechanism, which can lead to security issues. Tighten the redirection and make it more obvious to the end user when it happens. * jk/http-walker-limit-redirect-2.9: http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect	2016-12-19 14:45:32 -08:00
Jeff King	50d3413740	http: make redirects more obvious We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since `c93c92f30` (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-06 12:32:48 -08:00
Junio C Hamano	2f84df2ca0	Merge branch 'ep/http-curl-trace' HTTP transport gained an option to produce more detailed debugging trace. * ep/http-curl-trace: imap-send.c: introduce the GIT_TRACE_CURL enviroment variable http.c: implement the GIT_TRACE_CURL environment variable	2016-07-06 13:38:06 -07:00
Elia Pinto	74c682d3c6	http.c: implement the GIT_TRACE_CURL environment variable Implement the GIT_TRACE_CURL environment variable to allow a greater degree of detail of GIT_CURL_VERBOSE, in particular the complete transport header and all the data payload exchanged. It might be useful if a particular situation could require a more thorough debugging analysis. Document the new GIT_TRACE_CURL environment variable. Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-24 15:48:18 -07:00
Johannes Schindelin	8cb01e2fd3	http: support sending custom HTTP headers We introduce a way to send custom HTTP headers with all requests. This allows us, for example, to send an extra token from build agents for temporary access to private repositories. (This is the use case that triggered this patch.) This feature can be used like this: git -c http.extraheader='Secret: sssh!' fetch $URL $REF Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only a single list, overriding any previous call. This means we have to collect _all_ of the headers we want to use into a single list, and feed it to cURL in one shot. Since we already unconditionally set a "pragma" header when initializing the curl handles, we can add our new headers to that list. For callers which override the default header list (like probe_rpc), we provide `http_copy_default_headers()` so they can do the same trick. Big thanks to Jeff King and Junio Hamano for their outstanding help and patient reviews. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-27 14:02:33 -07:00
Junio C Hamano	e84d5e9fa1	Merge branch 'ew/force-ipv4' "git fetch" and friends that make network connections can now be told to only use ipv4 (or ipv6). * ew/force-ipv4: connect & http: support -4 and -6 switches for remote operations	2016-02-24 13:25:54 -08:00
Eric Wong	c915f11eb4	connect & http: support -4 and -6 switches for remote operations Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 11:34:14 -08:00
Knut Franke	372370f167	http: use credential API to handle proxy authentication Currently, the only way to pass proxy credentials to curl is by including them in the proxy URL. Usually, this means they will end up on disk unencrypted, one way or another (by inclusion in ~/.gitconfig, shell profile or history). Since proxy authentication often uses a domain user, credentials can be security sensitive; therefore, a safer way of passing credentials is desirable. If the configured proxy contains a username but not a password, query the credential API for one. Also, make sure we approve/reject proxy credentials properly. For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy environment variables, which would otherwise be evaluated as a fallback by curl. Without this, we would have different semantics for git configuration and environment variables. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Knut Franke <k.franke@science-computing.de> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-26 10:53:25 -08:00
David Turner	835c4d3689	http.c: use CURLOPT_RANGE for range requests A HTTP server is permitted to return a non-range response to a HTTP range request (and Apache httpd in fact does this in some cases). While libcurl knows how to correctly handle this (by skipping bytes before and after the requested range), it only turns on this handling if it is aware that a range request is being made. By manually setting the range header instead of using CURLOPT_RANGE, we were hiding the fact that this was a range request from libcurl. This could cause corruption. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-11-02 14:18:06 -08:00
Junio C Hamano	b90a3d7b32	http.c: make finish_active_slot() and handle_curl_result() static They used to be used directly by remote-curl.c for the smart-http protocol. But they got wrapped by run_one_slot() in `beed336` (http: never use curl_easy_perform, 2014-02-18). Any future users are expected to follow that route. Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-01-15 11:00:52 -08:00
Jeff King	e31316263a	http: optionally extract charset parameter from content-type Since the previous commit, we now give a sanitized, shortened version of the content-type header to any callers who ask for it. This patch adds back a way for them to cleanly access specific parameters to the type. We could easily extract all parameters and make them available via a string_list, but: 1. That complicates the interface and memory management. 2. In practice, no planned callers care about anything except the charset. This patch therefore goes with the simplest thing, and we can expand or change the interface later if it becomes necessary. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-27 09:59:19 -07:00
Junio C Hamano	d59c12d7ad	Merge branch 'jl/nor-or-nand-and' Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"	2014-04-08 12:00:28 -07:00
Justin Lebar	01689909eb	comments: fix misuses of "nor" Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-31 15:29:27 -07:00
Jeff King	beed336c3e	http: never use curl_easy_perform We currently don't reuse http connections when fetching via the smart-http protocol. This is bad because the TCP handshake introduces latency, and especially because SSL connection setup may be non-trivial. We can fix it by consistently using curl's "multi" interface. The reason is rather complicated: Our http code has two ways of being used: queuing many "slots" to be fetched in parallel, or fetching a single request in a blocking manner. The parallel code is built on curl's "multi" interface. Most of the single-request code uses http_request, which is built on top of the parallel code (we just feed it one slot, and wait until it finishes). However, one could also accomplish the single-request scheme by avoiding curl's multi interface entirely and just using curl_easy_perform. This is simpler, and is used by post_rpc in the smart-http protocol. It does work to use the same curl handle in both contexts, as long as it is not at the same time. However, internally curl may not share all of the cached resources between both contexts. In particular, a connection formed using the "multi" code will go into a reuse pool connected to the "multi" object. Further requests using the "easy" interface will not be able to reuse that connection. The smart http protocol does ref discovery via http_request, which uses the "multi" interface, and then follows up with the "easy" interface for its rpc calls. As a result, we make two HTTP connections rather than reusing a single one. We could teach the ref discovery to use the "easy" interface. But it is only once we have done this discovery that we know whether the protocol will be smart or dumb. If it is dumb, then our further requests, which want to fetch objects in parallel, will not be able to reuse the same connection. Instead, this patch switches post_rpc to build on the parallel interface, which means that we use it consistently everywhere. It's a little more complicated to use, but since we have the infrastructure already, it doesn't add any code; we can just factor out the relevant bits from http_request. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-18 15:50:57 -08:00
Junio C Hamano	c5a77e8f92	Merge branch 'bc/http-100-continue' Issue "100 Continue" responses to help use of GSS-Negotiate authentication scheme over HTTP transport when needed. * bc/http-100-continue: remote-curl: fix large pushes with GSSAPI remote-curl: pass curl slot_results back through run_slot http: return curl's AUTHAVAIL via slot_results	2013-12-05 12:58:59 -08:00
Jeff King	0972ccd97c	http: return curl's AUTHAVAIL via slot_results Callers of the http code may want to know which auth types were available for the previous request. But after finishing with the curl slot, they are not supposed to look at the curl handle again. We already handle returning other information via the slot_results struct; let's add a flag to check the available auth. Note that older versions of curl did not support this, so we simply return 0 (something like "-1" would be worse, as the value is a bitflag and we might accidentally set a flag). This is sufficient for the callers planned in this series, who only trigger some optional behavior if particular bits are set, and can live with a fake "no bits" answer. Signed-off-by: Jeff King <peff@peff.net>	2013-10-31 10:05:55 -07:00
Jeff King	c93c92f309	http: update base URLs when we see redirects If a caller asks the http_get_* functions to go to a particular URL and we end up elsewhere due to a redirect, the effective_url field can tell us where we went. It would be nice to remember this redirect and short-cut further requests for two reasons: 1. It's more efficient. Otherwise we spend an extra http round-trip to the server for each subsequent request, just to get redirected. 2. If we end up with an http 401 and are going to ask for credentials, it is to feed them to the redirect target. If the redirect is an http->https upgrade, this means our credentials may be provided on the http leg, just to end up redirected to https. And if the redirect crosses server boundaries, then curl will drop the credentials entirely as it follows the redirect. However, it, it is not enough to simply record the effective URL we saw and use that for subsequent requests. We were originally fed a "base" url like: http://example.com/foo.git and we want to figure out what the new base is, even though the URLs we see may be: original: http://example.com/foo.git/info/refs effective: http://example.com/bar.git/info/refs Subsequent requests will not be for "info/refs", but for other paths relative to the base. We must ask the caller to pass in the original base, and we must pass the redirected base back to the caller (so that it can generate more URLs from it). Furthermore, we need to feed the new base to the credential code, so that requests to credential helpers (or to the user) match the URL we will be requesting. This patch teaches http_request_reauth to do this munging. Since it is the caller who cares about making more URLs, it seems at first glance that callers could simply check effective_url themselves and handle it. However, since we need to update the credential struct before the second re-auth request, we have to do it inside http_request_reauth. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2013-10-14 16:56:47 -07:00
Jeff King	78868962c0	http: provide effective url to callers When we ask curl to access a URL, it may follow one or more redirects to reach the final location. We have no idea this has happened, as curl takes care of the details and simply returns the final content to us. The final URL that we ended up with can be accessed via CURLINFO_EFFECTIVE_URL. Let's make that optionally available to callers of http_get_*, so that they can make further decisions based on the redirection. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2013-10-14 16:55:23 -07:00
Jeff King	2501aff8b7	http: hoist credential request out of handle_curl_result When we are handling a curl response code in http_request or in the remote-curl RPC code, we use the handle_curl_result helper to translate curl's response into an easy-to-use code. When we see an HTTP 401, we do one of two things: 1. If we already had a filled-in credential, we mark it as rejected, and then return HTTP_NOAUTH to indicate to the caller that we failed. 2. If we didn't, then we ask for a new credential and tell the caller HTTP_REAUTH to indicate that they may want to try again. Rejecting in the first case makes sense; it is the natural result of the request we just made. However, prompting for more credentials in the second step does not always make sense. We do not know for sure that the caller is going to make a second request, and nor are we sure that it will be to the same URL. Logically, the prompt belongs not to the request we just finished, but to the request we are (maybe) about to make. In practice, it is very hard to trigger any bad behavior. Currently, if we make a second request, it will always be to the same URL (even in the face of redirects, because curl handles the redirects internally). And we almost always retry on HTTP_REAUTH these days. The one exception is if we are streaming a large RPC request to the server (e.g., a pushed packfile), in which case we cannot restart. It's extremely unlikely to see a 401 response at this stage, though, as we would typically have seen it when we sent a probe request, before streaming the data. This patch drops the automatic prompt out of case 2, and instead requires the caller to do it. This is a few extra lines of code, and the bug it fixes is unlikely to come up in practice. But it is conceptually cleaner, and paves the way for better handling of credentials across redirects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2013-10-14 16:55:13 -07:00
Jeff King	1bbcc224cc	http: refactor options to http_get_* Over time, the http_get_strbuf function has grown several optional parameters. We now have a bitfield with multiple boolean options, as well as an optional strbuf for returning the content-type of the response. And a future patch in this series is going to add another strbuf option. Treating these as separate arguments has a few downsides: 1. Most call sites need to add extra NULLs and 0s for the options they aren't interested in. 2. The http_get_* functions are actually wrappers around 2 layers of low-level implementation functions. We have to pass these options through individually. 3. The http_get_strbuf wrapper learned these options, but nobody bothered to do so for http_get_file, even though it is backed by the same function that does understand the options. Let's consolidate the options into a single struct. For the common case of the default options, we'll allow callers to simply pass a NULL for the options struct. The resulting code is often a few lines longer, but it ends up being easier to read (and to change as we add new options, since we do not need to update each call site). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2013-09-30 17:21:59 -07:00
Junio C Hamano	574d51b575	Merge branch 'mv/ssl-ftp-curl' Does anybody really use commit walkers over (s)ftp? * mv/ssl-ftp-curl: Support FTP-over-SSL/TLS for regular FTP	2013-04-19 13:31:08 -07:00
Modestas Vainius	4bc444eb64	Support FTP-over-SSL/TLS for regular FTP Add a boolean http.sslTry option which allows to enable AUTH SSL/TLS and encrypted data transfers when connecting via regular FTP protocol. Default is false since it might trigger certificate verification errors on misconfigured servers. Signed-off-by: Modestas Vainius <modestas@vainius.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-12 08:52:23 -07:00
Jeff King	4df13f69e9	http: drop http_error function This function is a single-liner and is only called from one place. Just inline it, which makes the code more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-06 18:56:46 -07:00
Jeff King	67d2a7b5c5	http: simplify http_error helper function This helper function should really be a one-liner that prints an error message, but it has ended up unnecessarily complicated: 1. We call error() directly when we fail to start the curl request, so we must later avoid printing a duplicate error in http_error(). It would be much simpler in this case to just stuff the error message into our usual curl_errorstr buffer rather than printing it ourselves. This means that http_error does not even have to care about curl's exit value (the interesting part is in the errorstr buffer already). 2. We return the "ret" value passed in to us, but none of the callers actually cares about our return value. We can just drop this entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-06 18:56:44 -07:00
Jeff King	6d052d78d7	http: add HTTP_KEEP_ERROR option We currently set curl's FAILONERROR option, which means that any http failures are reported as curl errors, and the http body content from the server is thrown away. This patch introduces a new option to http_get_strbuf which specifies that the body content from a failed http response should be placed in the destination strbuf, where it can be accessed by the caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-06 18:56:41 -07:00
Shawn Pearce	4656bf47fc	Verify Content-Type from smart HTTP servers Before parsing a suspected smart-HTTP response verify the returned Content-Type matches the standard. This protects a client from attempting to process a payload that smells like a smart-HTTP server response. JGit has been doing this check on all responses since the dawn of time. I mistakenly failed to include it in git-core when smart HTTP was introduced. At the time I didn't know how to get the Content-Type from libcurl. I punted, meant to circle back and fix this, and just plain forgot about it. Signed-off-by: Shawn Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-04 10:22:36 -08:00
Jeff King	1960897ebc	http: do not set up curl auth after a 401 When we get an http 401, we prompt for credentials and put them in our global credential struct. We also feed them to the curl handle that produced the 401, with the intent that they will be used on a retry. When the code was originally introduced in commit `42653c0`, this was a necessary step. However, since `dfa1725`, we always feed our global credential into every curl handle when we initialize the slot with get_active_slot. So every further request already feeds the credential to curl. Moreover, accessing the slot here is somewhat dubious. After the slot has produced a response, we don't actually control it any more. If we are using curl_multi, it may even have been re-initialized to handle a different request. It just so happens that we will reuse the curl handle within the slot in such a case, and that because we only keep one global credential, it will be the one we want. So the current code is not buggy, but it is misleading. By cleaning it up, we can remove the slot argument entirely from handle_curl_result, making it much more obvious that slots should not be accessed after they are marked as finished. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-12 09:45:15 -07:00
Jeff King	188923f0d1	http: fix segfault in handle_curl_result When we create an http active_request_slot, we can set its "results" pointer back to local storage. The http code will fill in the details of how the request went, and we can access those details even after the slot has been cleaned up. Commit `8809703` (http: factor out http error code handling) switched us from accessing our local results struct directly to accessing it via the "results" pointer of the slot. That means we're accessing the slot after it has been marked as finished, defeating the whole purpose of keeping the results storage separate. Most of the time this doesn't matter, as finishing the slot does not actually clean up the pointer. However, when using curl's multi interface with the dumb-http revision walker, we might actually start a new request before handing control back to the original caller. In that case, we may reuse the slot, zeroing its results pointer, and leading the original caller to segfault while looking for its results inside the slot. Instead, we need to pass a pointer to our local results storage to the handle_curl_result function, rather than relying on the pointer in the slot struct. This matches what the original code did before the refactoring (which did not use a separate function, and therefore just accessed the results struct directly). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-12 09:42:31 -07:00
Jeff King	8809703072	http: factor out http error code handling Most of our http requests go through the http_request() interface, which does some nice post-processing on the results. In particular, it handles prompting for missing credentials as well as approving and rejecting valid or invalid credentials. Unfortunately, it only handles GET requests. Making it handle POSTs would be quite complex, so let's pull result handling code into its own function so that it can be reused from the POST code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-27 10:49:09 -07:00
Jim Meyering	a7793a7491	correct spelling: an URL -> a URL Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-28 08:47:23 -07:00
Junio C Hamano	1d3a035d6d	Merge branch 'jk/maint-push-over-dav' * jk/maint-push-over-dav: http-push: enable "proactive auth" t5540: test DAV push with authentication Conflicts: http.c	2011-12-19 16:05:59 -08:00

1 2

91 commits