development/git - HydraGit

mirror of https://github.com/git/git synced 2024-11-05 18:59:29 +00:00

Author	SHA1	Message	Date
Junio C Hamano	c2796ac1c2	Merge branch 'bc/push-cas-cquoted-refname' into master Pushing a ref whose name contains non-ASCII character with the "--force-with-lease" option did not work over smart HTTP protocol, which has been corrected. * bc/push-cas-cquoted-refname: remote-curl: make --force-with-lease work with non-ASCII ref names	2020-07-30 13:20:34 -07:00
Jeff King	f6d8942b1f	strvec: fix indentation in renamed calls Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	c972bf4cf5	strvec: convert remaining callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the remaining files, as the resulting diff is reasonably sized. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
brian m. carlson	cd85b447bf	remote-curl: make --force-with-lease work with non-ASCII ref names When we invoke a remote transport helper and pass an option with an argument, we quote the argument as a C-style string if necessary. This is the case for the cas option, which implements the --force-with-lease command-line flag, when we're passing a non-ASCII refname. However, the remote curl helper isn't designed to parse such an argument, meaning that if we try to use --force-with-lease with an HTTP push and a non-ASCII refname, we get an error like this: error: cannot parse expected object name '0000000000000000000000000000000000000000"' Note the double quote, which get_oid has reminded us is not valid in an hex object ID. Even if we had been able to parse it, we would send the wrong data to the server: we'd send an escaped ref, which would not behave as the user wanted and might accidentally result in updating or deleting a ref we hadn't intended. Since we need to expect a quoted C-style string here, just check if the first argument is a double quote, and if so, unquote it. Note that if the refname contains a double quote, then we will have double-quoted it already, so there is no ambiguity. We test for this case only in the smart protocol, since the DAV-based protocol is not capable of handling this capability. We use UTF-8 because this is nicer in our tests and friendlier to Windows, but the code should work for all non-ASCII refs. While we're at it, since the name of the option is now well established and isn't going to change, let's inline it instead of using the #define constant. Reported-by: Frej Bjon <frej.bjon@nemit.fi> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-20 21:05:16 -07:00
Junio C Hamano	12210859da	Merge branch 'bc/sha-256-part-2' SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...	2020-07-06 22:09:13 -07:00
brian m. carlson	97997e6ad2	remote-curl: avoid truncating refs with ls-remote Normally, the remote-curl transport helper is aware of the hash algorithm we're using because we're in a repo with the appropriate hash algorithm set. However, when using git ls-remote outside of a repository, we won't have initialized the hash algorithm properly, so use hash_to_hex_algop to print the ref corresponding to the algorithm we've detected. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
brian m. carlson	ac093d0790	remote-curl: detect algorithm for dumb HTTP by size When reading the info/refs file for a repository, we have no explicit way to detect which hash algorithm is in use because the file doesn't provide one. Detect the hash algorithm in use by the size of the first object ID. If we have an empty repository, we don't know what the hash algorithm is on the remote side, so default to whatever the local side has configured. Without doing this, we cannot clone an empty repository since we don't know its hash algorithm. Test this case appropriately, since we currently have no tests for cloning an empty repository with the dumb HTTP protocol. We anonymize the URL like elsewhere in the function in case the user has decided to include a secret in the URL. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
brian m. carlson	7f60501775	remote-curl: implement object-format extensions Implement the object-format extensions that let us determine the hash algorithm in use when pushing, pulling, and fetching. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
Denton Liu	b0df0c16ea	stateless-connect: send response end packet Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated before the transaction is complete, remote-curl will blindly forward the packets before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the transaction and continue reading, expecting more input to continue the transaction. This results in a deadlock between the two processes. This can be seen in the following command which does not terminate: $ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... whereas the v1 version does terminate as expected: $ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... fatal: the remote end hung up unexpectedly Instead of blindly forwarding packets, make remote-curl insert a response end packet after proxying the responses from the remote server when using stateless_connect(). On the RPC client side, ensure that each response ends as described. A separate control packet is chosen because we need to be able to differentiate between what the remote server sends and remote-curl's control packets. By ensuring in the remote-curl code that a server cannot send response end packets, we prevent a malicious server from being able to perform a denial of service attack in which they spoof a response end packet and cause the described deadlock to happen. Reported-by: Force Charlie <charlieio@outlook.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Denton Liu	0181b600a6	pkt-line: define PACKET_READ_RESPONSE_END In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Denton Liu	74b082ad34	remote-curl: error on incomplete packet Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated with a partially written packet, remote-curl will blindly send the partially written packet before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the partial packet and continue reading, expecting more input. This results in a deadlock between the two processes. For a stateless connection, inspect packets before sending them and error out if a packet line packet is incomplete. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Denton Liu	04cc91abcb	remote-curl: remove label indentation In the codebase, labels are aligned to the leftmost column. Remove the space-indentation from `free_specs:` to conform to this. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 11:35:06 -07:00
Denton Liu	51ca7f89f8	remote-curl: fix typo Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 11:34:24 -07:00
Jeff King	fe299ec5ae	oid_array: rename source file from sha1-array We renamed the actual data structure in `910650d2f8` (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. `fc1395f4a4` (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Junio C Hamano	4a5c3e10f2	Merge branch 'rs/show-progress-in-dumb-http-fetch' "git fetch" over HTTP walker protocol did not show any progress output. We inherently do not know how much work remains, but still we can show something not to bore users. * rs/show-progress-in-dumb-http-fetch: remote-curl: show progress for fetches over dumb HTTP	2020-03-09 11:21:21 -07:00
René Scharfe	7655b4119d	remote-curl: show progress for fetches over dumb HTTP Fetching over dumb HTTP transport doesn't show any progress, even with the option --progress. If the connection is slow or there is a lot of data to get then this can take a long time while the user is left to wonder if git got stuck. We don't know the number of objects to fetch at the outset, but we can count the ones we got. Show an open-ended progress indicator based on that number if the user asked for it. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-03 13:15:40 -08:00
Junio C Hamano	145136a95a	C: use skip_prefix() to avoid hardcoded string length We often skip an optional prefix in a string with a hardcoded constant, e.g. if (starts_with(string, "prefix")) string += 6; which is less error prone when written skip_prefix(string, "prefix", &string); Note that this changes a few error messages from "git reflog expire --expire=nonsense.timestamp", which used to complain by saying '--expire=nonsense.timestamp' is not a valid timestamp but with this change, we say 'nonsense.timestamp' is not a valid timestamp which is more technically correct (the string with --expire= as a prefix obviously cannot be a valid timestamp, but the error is about the part of the input without that prefix). Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:03:45 -08:00
Junio C Hamano	d45d771978	Merge branch 'bc/smart-http-atomic-push' The atomic push over smart HTTP transport did not work, which has been corrected. * bc/smart-http-atomic-push: remote-curl: pass on atomic capability to remote side	2019-10-23 14:43:11 +09:00
brian m. carlson	6f1194246a	remote-curl: pass on atomic capability to remote side When pushing more than one reference with the --atomic option, the server is supposed to perform a single atomic transaction to update the references, leaving them either all to succeed or all to fail. This works fine when pushing locally or over SSH, but when pushing over HTTP, we fail to pass the atomic capability to the remote side. In fact, we have not reported this capability to any remote helpers during the life of the feature. Now normally, things happen to work nevertheless, since we actually check for most types of failures, such as non-fast-forward updates, on the client side, and just abort the entire attempt. However, if the server side reports a problem, such as the inability to lock a ref, the transaction isn't atomic, because we haven't passed the appropriate capability over and the remote side has no way of knowing that we wanted atomic behavior. Fix this by passing the option from the transport code through to remote helpers, and from the HTTP remote helper down to send-pack. With this change, we can detect if the server side rejects the push and report back appropriately. Note the difference in the messages: the remote side reports "atomic transaction failed", while our own checking rejects pushes with the message "atomic push failed". Document the atomic option in the remote helper documentation, so other implementers can implement it if they like. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-17 16:08:22 +09:00
René Scharfe	062a309d36	remote-curl: use argv_array in parse_push() Use argv_array to build an array of strings instead of open-coding it. This simplifies the code a bit. We also need to make the specs parameter of push(), push_dav() and push_git() const to match the argv member of the argv_array. That's fine, as all three only actually read from the specs array anyway. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-15 10:55:11 +09:00
Jiang Xin	8a1569d655	i18n: fix typos found during l10n for git 2.22.0 Fix two typos introduced by the following commits: + `31fba9d3b4` (diff-parseopt: convert --[src\|dst]-prefix, 2019-03-24) + `ed8b4132c8` (remote-curl: mark all error messages for translation, 2019-03-05) Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-03 11:10:53 -07:00
Junio C Hamano	d4e568b2a3	Merge branch 'bc/hash-transition-16' Conversion from unsigned char[20] to struct object_id continues. * bc/hash-transition-16: (35 commits) gitweb: make hash size independent Git.pm: make hash size independent read-cache: read data in a hash-independent way dir: make untracked cache extension hash size independent builtin/difftool: use parse_oid_hex refspec: make hash size independent archive: convert struct archiver_args to object_id builtin/get-tar-commit-id: make hash size independent get-tar-commit-id: parse comment record hash: add a function to lookup hash algorithm by length remote-curl: make hash size independent http: replace sha1_to_hex http: compute hash of downloaded objects using the_hash_algo http: replace hard-coded constant with the_hash_algo http-walker: replace sha1_to_hex http-push: remove remaining uses of sha1_to_hex http-backend: allow 64-character hex names http-push: convert to use the_hash_algo builtin/pull: make hash-size independent builtin/am: make hash size independent ...	2019-04-25 16:41:17 +09:00
Junio C Hamano	aa1edf14f9	Merge branch 'js/remote-curl-i18n' Error messages given from the http transport have been updated so that they can be localized. * js/remote-curl-i18n: remote-curl: mark all error messages for translation	2019-04-16 19:28:05 +09:00
Junio C Hamano	764bd200e1	Merge branch 'js/anonymize-remote-curl-diag' remote-http transport did not anonymize URLs reported in its error messages at places. * js/anonymize-remote-curl-diag: curl: anonymize URLs in error messages and warnings	2019-04-16 19:28:04 +09:00
brian m. carlson	9c9492e8aa	remote-curl: make hash size independent Change one hard-coded use of the constant 40 to a reference to the_hash_algo. In addition, switch a use of get_oid_hex to parse_oid_hex to avoid the need to use a constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 11:57:39 +09:00
Junio C Hamano	c42c664a2f	Merge branch 'jt/http-auth-proto-v2-fix' Unify RPC code for smart http in protocol v0/v1 and v2, which fixes a bug in the latter (lack of authentication retry) and generally improves the code base. * jt/http-auth-proto-v2-fix: remote-curl: use post_rpc() for protocol v2 also remote-curl: refactor reading into rpc_state's buf remote-curl: reduce scope of rpc_state.result remote-curl: reduce scope of rpc_state.stdin_preamble remote-curl: reduce scope of rpc_state.argv	2019-03-07 10:00:00 +09:00
Johannes Schindelin	ed8b4132c8	remote-curl: mark all error messages for translation Suggested by Jeff King. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-06 08:48:15 +09:00
Johannes Schindelin	c1284b21f2	curl: anonymize URLs in error messages and warnings Just like `47abd85ba0` (fetch: Strip usernames from url's before storing them, 2009-04-17) and later `882d49ca5c` (push: anonymize URL in status output, 2016-07-13), this change anonymizes URLs (read: strips them of user names and especially passwords) in user-facing error messages and warnings. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-05 22:11:58 +09:00
Jonathan Tan	a97d00799a	remote-curl: use post_rpc() for protocol v2 also When transmitting and receiving POSTs for protocol v0 and v1, remote-curl uses post_rpc() (and associated functions), but when doing the same for protocol v2, it uses a separate set of functions (proxy_rpc() and others). Besides duplication of code, this has caused at least one bug: the auth retry mechanism that was implemented in v0/v1 was not implemented in v2. To fix this issue and avoid it in the future, make remote-curl also use post_rpc() when handling protocol v2. Because line lengths are written to the HTTP request in protocol v2 (unlike in protocol v0/v1), this necessitates changes in post_rpc() and some of the functions it uses; perform these changes too. A test has been included to ensure that the code for both the unchunked and chunked variants of the HTTP request is exercised. Note: stateless_connect() has been updated to use the lower-level packet reading functions instead of struct packet_reader. The low-level control is necessary here because we cannot change the destination buffer of struct packet_reader while it is being used; struct packet_buffer has a peeking mechanism which relies on the destination buffer being present in between a peek and a read. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-03 19:00:42 +09:00
Jeff Hostetler	ee4512ed48	trace2: create new combined trace facility Create a new unified tracing facility for git. The eventual intent is to replace the current trace_printf* and trace_performance* routines with a unified set of git_trace2* routines. In addition to the usual printf-style API, trace2 provides higer-level event verbs with fixed-fields allowing structured data to be written. This makes post-processing and analysis easier for external tools. Trace2 defines 3 output targets. These are set using the environment variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be set to "1" or to an absolute pathname (just like the current GIT_TRACE). * GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command summary data. * GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE. It extends the output with columns for the command process, thread, repo, absolute and relative elapsed times. It reports events for child process start/stop, thread start/stop, and per-thread function nesting. * GIT_TR2_EVENT is a new structured format. It writes event data as a series of JSON records. Calls to trace2 functions log to any of the 3 output targets enabled without the need to call different trace_printf* or trace_performance* routines. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-22 15:27:59 -08:00
Jonathan Tan	78ad91728d	remote-curl: refactor reading into rpc_state's buf Currently, whenever remote-curl reads pkt-lines from its response file descriptor, only the payload is written to its buf, not the 4 characters denoting the length. A future patch will require the ability to also write those 4 characters, so in preparation for that, refactor this read into its own function. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-22 14:27:31 -08:00
Jonathan Tan	b35903092e	remote-curl: reduce scope of rpc_state.result The result field in struct rpc_state is only used in rpc_service(), and not in any functions it directly or indirectly calls. Refactor it to become an argument of rpc_service() instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-14 12:47:55 -08:00
Jonathan Tan	5d91669309	remote-curl: reduce scope of rpc_state.stdin_preamble The stdin_preamble field in struct rpc_state is only used in rpc_service(), and not in any functions it directly or indirectly calls. Refactor it to become an argument of rpc_service() instead. An observation of all callers of rpc_service() shows that the preamble is always set, so we no longer need the "if (preamble)" check. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-14 12:47:55 -08:00
Jonathan Tan	7d50d34fc7	remote-curl: reduce scope of rpc_state.argv The argv field in struct rpc_state is only used in rpc_service(), and not in any functions it directly or indirectly calls. Refactor it to become an argument of rpc_service() instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-14 12:47:55 -08:00
Jeff King	cbdb8d1439	remote-curl: tighten "version 2" check for smart-http In a v2 smart-http conversation, the server should reply to our initial request with a pkt-line saying "version 2". We check that with starts_with(), but really that should be the only thing in that packet. A response of "version 20" should not match. Let's tighten this check to use strcmp(). Note that we don't need to worry about a trailing newline here, because the ptk-line code will have chomped it for us already. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-06 12:20:22 -08:00
Jeff King	8ee3e120cd	remote-curl: refactor smart-http discovery After making initial contact with an http server, we have to decide if the server supports smart-http, and if so, which version. Our rules are a bit inconsistent: 1. For v0, we require that the content-type indicates a smart-http response. We also require the response to look vaguely like a pkt-line starting with "#". If one of those does not match, we fall back to dumb-http. But according to our http protocol spec[1]: Dumb servers MUST NOT return a return type starting with `application/x-git-`. If we see the expected content-type, we should consider it smart-http. At that point we can parse the pkt-line for real, and complain if it is not syntactically valid. 2. For v2, we do not actually check the content-type. Our v2 protocol spec says[2]: When using the http:// or https:// transport a client makes a "smart" info/refs request as described in `http-protocol.txt`[...] and the http spec is clear that for a smart-http response[3]: The Content-Type MUST be `application/x-$servicename-advertisement`. So it is required according to the spec. These inconsistencies were easy to miss because of the way the original code was written as an inline conditional. Let's pull it out into its own function for readability, and improve a few things: - we now predicate the smart/dumb decision entirely on the presence of the correct content-type - we do a real pkt-line parse before deciding how to proceed (and die if it isn't valid) - use skip_prefix() for comparing service strings, instead of constructing expected output in a strbuf; this avoids dealing with memory cleanup Note that this _is_ tightening what the client will allow. It's all according to the spec, but it's possible that other implementations might violate these. However, violating these particular rules seems like an odd choice for a server to make. [1] Documentation/technical/http-protocol.txt, l. 166-167 [2] Documentation/technical/protocol-v2.txt, l. 63-64 [3] Documentation/technical/http-protocol.txt, l. 247 Helped-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-06 12:20:19 -08:00
Junio C Hamano	5f8b86db94	Merge branch 'jt/fetch-v2-sideband' "git fetch" and "git upload-pack" learned to send all exchange over the sideband channel while talking the v2 protocol. * jt/fetch-v2-sideband: tests: define GIT_TEST_SIDEBAND_ALL {fetch,upload}-pack: sideband v2 fetch response sideband: reverse its dependency on pkt-line pkt-line: introduce struct packet_writer pack-protocol.txt: accept error packets in any context Use packet_reader instead of packet_read_line	2019-02-05 14:26:11 -08:00
Junio C Hamano	99c0bdd09d	Merge branch 'ms/http-no-more-failonerror' Debugging help for http transport. * ms/http-no-more-failonerror: test: test GIT_CURL_VERBOSE=1 shows an error remote-curl: unset CURLOPT_FAILONERROR remote-curl: define struct for CURLOPT_WRITEFUNCTION http: enable keep_error for HTTP requests http: support file handles for HTTP_KEEP_ERROR	2019-01-29 12:47:55 -08:00
Masaya Suzuki	b79bdd8c12	remote-curl: unset CURLOPT_FAILONERROR By not setting CURLOPT_FAILONERROR, curl parses the HTTP response headers even if the response is an error. This makes GIT_CURL_VERBOSE to show the HTTP headers, which is useful for debugging. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-10 15:00:56 -08:00
Masaya Suzuki	cf2fb92b00	remote-curl: define struct for CURLOPT_WRITEFUNCTION In order to pass more values for rpc_in, define a struct and pass it as an additional value. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-10 15:00:56 -08:00
Masaya Suzuki	e6cf87b12d	http: enable keep_error for HTTP requests curl stops parsing a response when it sees a bad HTTP status code and it has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP headers on error. keep_error is an option to receive the HTTP response body for those error responses. By enabling this option, curl will process the HTTP response headers, and they're shown if GIT_CURL_VERBOSE is set. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-10 15:00:56 -08:00
Masaya Suzuki	2d103c31c2	pack-protocol.txt: accept error packets in any context In the Git pack protocol definition, an error packet may appear only in a certain context. However, servers can face a runtime error (e.g. I/O error) at an arbitrary timing. This patch changes the protocol to allow an error packet to be sent instead of any packet. Without this protocol spec change, when a server cannot process a request, there's no way to tell that to a client. Since the server cannot produce a valid response, it would be forced to cut a connection without telling why. With this protocol spec change, the server can be more gentle in this situation. An old client may see these error packets as an unexpected packet, but this is not worse than having an unexpected EOF. Following this protocol spec change, the error packet handling code is moved to pkt-line.c. Implementation wise, this implementation uses pkt-line to communicate with a subprocess. Since this is not a part of Git protocol, it's possible that a packet that is not supposed to be an error packet is mistakenly parsed as an error packet. This error packet handling is enabled only for the Git pack protocol parsing code considering this. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 13:05:30 -08:00
Masaya Suzuki	01f9ec64c8	Use packet_reader instead of packet_read_line By using and sharing a packet_reader while handling a Git pack protocol request, the same reader option is used throughout the code. This makes it easy to set a reader option to the request parsing code. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 13:05:27 -08:00
Nguyễn Thái Ngọc Duy	3b3357626e	style: the opening '{' of a function is in a separate line Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-12-10 15:41:09 +09:00
Junio C Hamano	ab6409c66d	Merge branch 'en/double-semicolon-fix' into maint Code clean-up. * en/double-semicolon-fix: Remove superfluous trailing semicolons	2018-11-21 22:57:55 +09:00
Torsten Bögershausen	cb8010bbff	remote-curl.c: xcurl_off_t is not portable (on 32 bit platfoms) When setting DEVELOPER = 1 DEVOPTS = extra-all "gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516" errors out with "comparison is always false due to limited range of data type" "[-Werror=type-limits]" It turns out that the function xcurl_off_t() has 2 flavours: - It gives a warning 32 bit systems, like Linux - It takes the signed ssize_t as a paramter, but the only caller is using a size_t (which is typically unsigned these days) The original motivation of this function is to make sure that sizes > 2GiB are handled correctly. The curl documentation says: "For any given platform/compiler curl_off_t must be typedef'ed to a 64-bit wide signed integral data type" On a 32 bit system "size_t" can be promoted into a 64 bit signed value without loss of data, and therefore we may see the "comparison is always false" warning. On a 64 bit system it may happen, at least in theory, that size_t is > 2^63, and then the promotion from an unsigned "size_t" into a signed "curl_off_t" may be a problem. One solution to suppress a possible compiler warning could be to remove the function xcurl_off_t(). However, to be on the very safe side, we keep it and improve it: - The len parameter is changed from ssize_t to size_t - A temporally variable "size" is used, promoted int uintmax_t and the compared with "maximum_signed_value_of_type(curl_off_t)". Thanks to Junio C Hamano for this hint. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-12 17:03:10 +09:00
Junio C Hamano	ae109a9789	Merge branch 'en/double-semicolon-fix' Code clean-up. * en/double-semicolon-fix: Remove superfluous trailing semicolons	2018-09-24 10:30:47 -07:00
Elijah Newren	c3b9bc94b9	Remove superfluous trailing semicolons Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-05 10:21:24 -07:00
Junio C Hamano	13bf260ac7	Merge branch 'js/typofixes' Comment update. * js/typofixes: remote-curl: remove spurious period git-compat-util.h: fix typo	2018-08-20 11:33:50 -07:00
Johannes Schindelin	a8132410ee	remote-curl: remove spurious period We should not interrupt. sentences in the middle. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-08 09:07:18 -07:00
Brandon Williams	eaf6a1b6e9	remote-curl: accept compressed responses with protocol v2 Configure curl to accept compressed responses when using protocol v2 by setting `CURLOPT_ENCODING` to "", which indicates that curl should send an "Accept-Encoding" header with all supported compression encodings. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-23 10:24:13 +09:00
Brandon Williams	1a53e692af	remote-curl: accept all encodings supported by curl Configure curl to accept all encodings which curl supports instead of only accepting gzip responses. This fixes an issue when using an installation of curl which is built without the "zlib" feature. Since `aa90b9697` (Enable info/refs gzip decompression in HTTP client, 2012-09-19) we end up requesting "gzip" encoding anyway despite libcurl not being able to decode it. Worse, instead of getting a clear error message indicating so, we end up falling back to "dumb" http, producing a confusing and difficult to debug result. Since curl doesn't do any checking to verify that it supports the a requested encoding, instead set the curl option `CURLOPT_ENCODING` with an empty string indicating that curl should send an "Accept-Encoding" header containing only the encodings supported by curl. Reported-by: Anton Golubev <anton.golubev@gmail.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-23 10:24:12 +09:00
Junio C Hamano	96f29521a3	Merge branch 'ma/http-walker-no-partial' "git http-fetch" (deprecated) had an optional and experimental "feature" to fetch only commits and/or trees, which nobody used. This has been removed. * ma/http-walker-no-partial: walker: drop fields of `struct walker` which are always 1 http-fetch: make `-a` standard behaviour	2018-05-08 15:59:35 +09:00
Junio C Hamano	9bfa0f9be3	Merge branch 'bw/protocol-v2' The beginning of the next-gen transfer protocol. * bw/protocol-v2: (35 commits) remote-curl: don't request v2 when pushing remote-curl: implement stateless-connect command http: eliminate "# service" line when using protocol v2 http: don't always add Git-Protocol header http: allow providing extra headers for http requests remote-curl: store the protocol version the server responded with remote-curl: create copy of the service name pkt-line: add packet_buf_write_len function transport-helper: introduce stateless-connect transport-helper: refactor process_connect_service transport-helper: remove name parameter connect: don't request v2 when pushing connect: refactor git_connect to only get the protocol version once fetch-pack: support shallow requests fetch-pack: perform a fetch using v2 upload-pack: introduce fetch server command push: pass ref prefixes when pushing fetch: pass ref prefixes when fetching ls-remote: pass ref prefixes when requesting a remote's refs transport: convert transport_get_remote_refs to take a list of ref prefixes ...	2018-05-08 15:59:16 +09:00
Martin Ågren	0b6b342954	walker: drop fields of `struct walker` which are always 1 After the previous commit, both users of `struct walker` set `get_tree`, `get_history` and `get_all` to 1. Drop those fields and simplify the walker implementation accordingly. Let's hope that any out-of-tree users will not mind this change. They should notice that the compilation fails as they try to set these fields. (If they do not set them, note that `get_http_walker()` leaves them undefined, so the behavior will have been undefined all the time.) Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-24 10:55:04 +09:00
Stefan Beller	d807c4a01d	exec_cmd: rename to use dash in file name This is more consistent with the project style. The majority of Git's source files use dashes in preference to underscores in their file names. Signed-off-by: Stefan Beller <sbeller@google.com>	2018-04-11 18:11:00 +09:00
Brandon Williams	a4d78ce26b	remote-curl: don't request v2 when pushing In order to be able to ship protocol v2 with only supporting fetch, we need clients to not issue a request to use protocol v2 when pushing (since the client currently doesn't know how to push using protocol v2). This allows a client to have protocol v2 configured in `protocol.version` and take advantage of using v2 for fetch and falling back to using v0 when pushing while v2 for push is being designed. We could run into issues if we didn't fall back to protocol v2 when pushing right now. This is because currently a server will ignore a request to use v2 when contacting the 'receive-pack' endpoint and fall back to using v0, but when push v2 is rolled out to servers, the 'receive-pack' endpoint will start responding using v2. So we don't want to get into a state where a client is requesting to push with v2 before they actually know how to push using v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	0f1dc53f45	remote-curl: implement stateless-connect command Teach remote-curl the 'stateless-connect' command which is used to establish a stateless connection with servers which support protocol version 2. This allows remote-curl to act as a proxy, allowing the git client to communicate natively with a remote end, simply using remote-curl as a pass through to convert requests to http. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	237ffedd46	http: eliminate "# service" line when using protocol v2 When an http info/refs request is made, requesting that protocol v2 be used, don't send a "# service" line since this line is not part of the v2 spec. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	884e586f9e	http: don't always add Git-Protocol header Instead of always sending the Git-Protocol header with the configured version with every http request, explicitly send it when discovering refs and then only send it on subsequent http requests if the server understood the version requested. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	49e85e9500	remote-curl: store the protocol version the server responded with Store the protocol version the server responded with when performing discovery. This will be used in a future patch to either change the 'Git-Protocol' header sent in subsequent requests or to determine if a client needs to fallback to using a different protocol version. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	f08a5d42ea	remote-curl: create copy of the service name Make a copy of the service name being requested instead of relying on the buffer pointed to by the passed in 'const char *' to remain unchanged. Currently, all service names are string constants, but a subsequent patch will introduce service names from external sources. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	8f6982b4e1	protocol: introduce enum protocol_version value protocol_v2 Introduce protocol_v2, a new value for 'enum protocol_version'. Subsequent patches will fill in the implementation of protocol_v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 14:15:07 -07:00
Brandon Williams	ad6ac1244f	connect: discover protocol version outside of get_remote_heads In order to prepare for the addition of protocol_v2 push the protocol version discovery outside of 'get_remote_heads()'. This will allow for keeping the logic for processing the reference advertisement for protocol_v1 and protocol_v0 separate from the logic for protocol_v2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 14:15:06 -07:00
Junio C Hamano	69917e6439	Merge branch 'jk/push-options-via-transport-fix' "git push" over http transport did not unquote the push-options correctly. * jk/push-options-via-transport-fix: remote-curl: unquote incoming push-options t5545: factor out http repository setup	2018-02-28 13:37:58 -08:00
Junio C Hamano	2fb346c06a	Merge branch 'js/packet-read-line-check-null' Some low level protocol codepath could crash when they get an unexpected flush packet, which is now fixed. * js/packet-read-line-check-null: always check for NULL return from packet_read_line() correct error messages for NULL packet_read_line()	2018-02-27 10:33:55 -08:00
Jeff King	90dce21eb0	remote-curl: unquote incoming push-options The transport-helper protocol c-style quotes the value of any options passed to the helper via the "option <key> <value>" directive. However, remote-curl doesn't actually unquote the push-option values, meaning that we will send the quoted version to the other side (whereas git-over-ssh would send the raw value). The pack-protocol.txt documentation defines the push-options as a series of VCHARs, which excludes most characters that would need quoting. But: 1. You can still see the bug with a valid push-option that starts with a double-quote (since that triggers quoting). 2. We do currently handle any non-NUL characters correctly in git-over-ssh. So even though the spec does not say that we need to handle most quoted characters, it's nice if our behavior is consistent between protocols. There are two new tests: the "direct" one shows that this already works in the non-http case, and the http one covers this bugfix. Reported-by: Jon Simons <jon@jonsimons.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-20 11:27:06 -08:00
Jon Simons	bb1356dc64	always check for NULL return from packet_read_line() The packet_read_line() function will die if it sees any protocol or socket errors. But it will return NULL for a flush packet; some callers which are not expecting this may dereference NULL if they get an unexpected flush. This would involve the other side breaking protocol, but we should flag the error rather than segfault. Signed-off-by: Jon Simons <jon@jonsimons.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-08 12:37:40 -08:00
Jeff Hostetler	acb0c57260	fetch: support filters Teach fetch to support filters. This is only allowed for the remote configured in extensions.partialcloneremote. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:51 -08:00
Jonathan Tan	88e2f9ed8e	introduce fetch-object: fetch one promisor object Introduce fetch-object, providing the ability to fetch one object from a promisor remote. This uses fetch-pack. To do this, the transport mechanism has been updated with 2 flags, "from-promisor" to indicate that the resulting pack comes from a promisor remote (and thus should be annotated as such by index-pack), and "no-dependents" to indicate that only the objects themselves need to be fetched (but fetching additional objects is nevertheless safe). Whenever "no-dependents" is used, fetch-pack will refrain from using any object flags, because it is most likely invoked as part of a dynamic object fetch by another Git command (which may itself use object flags). An alternative to this is to leave fetch-pack alone, and instead update the allocation of flags so that fetch-pack's flags never overlap with any others, but this will end up shrinking the number of flags available to nearly every other Git command (that is, every Git command that accesses objects), so the approach in this commit was used instead. This will be tested in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	4c01f67d91	Merge branch 'dt/http-postbuffer-can-be-large' Allow the http.postbuffer configuration variable to be set to a size that can be expressed in size_t, which can be larger than ulong on some platforms. * dt/http-postbuffer-can-be-large: http.postbuffer: allow full range of ssize_t values	2017-04-23 22:07:45 -07:00
Junio C Hamano	b1081e4004	Merge branch 'bc/object-id' Conversion from unsigned char [40] to struct object_id continues. * bc/object-id: Documentation: update and rename api-sha1-array.txt Rename sha1_array to oid_array Convert sha1_array_for_each_unique and for_each_abbrev to object_id Convert sha1_array_lookup to take struct object_id Convert remaining callers of sha1_array_lookup to object_id Make sha1_array_append take a struct object_id * sha1-array: convert internal storage for struct sha1_array to object_id builtin/pull: convert to struct object_id submodule: convert check_for_new_submodule_commits to object_id sha1_name: convert disambiguate_hint_fn to take object_id sha1_name: convert struct disambiguate_state to object_id test-sha1-array: convert most code to struct object_id parse-options-cb: convert sha1_array_append caller to struct object_id fsck: convert init_skiplist to struct object_id builtin/receive-pack: convert portions to struct object_id builtin/pull: convert portions to struct object_id builtin/diff: convert to struct object_id Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Define new hash-size constants for allocating memory	2017-04-19 21:37:13 -07:00
David Turner	37ee680d9b	http.postbuffer: allow full range of ssize_t values Unfortunately, in order to push some large repos where a server does not support chunked encoding, the http postbuffer must sometimes exceed two gigabytes. On a 64-bit system, this is OK: we just malloc a larger buffer. This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the buffer size. Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-13 18:24:32 -07:00
brian m. carlson	910650d2f8	Rename sha1_array to oid_array Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:56 -07:00
brian m. carlson	ee3051bd23	sha1-array: convert internal storage for struct sha1_array to object_id Make the internal storage for struct sha1_array use an array of struct object_id internally. Update the users of this struct which inspect its internals. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-28 09:59:34 -07:00
Brandon Williams	511155db51	remote-curl: allow push options Teach remote-curl to understand push options and to be able to convey them across HTTP. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-22 15:41:21 -07:00
Junio C Hamano	d984592043	Merge branch 'dt/smart-http-detect-server-going-away' When the http server gives an incomplete response to a smart-http rpc call, it could lead to client waiting for a full response that will never come. Teach the client side to notice this condition and abort the transfer. An improvement counterproposal has failed. cf. <20161114194049.mktpsvgdhex2f4zv@sigill.intra.peff.net> * dt/smart-http-detect-server-going-away: upload-pack: optionally allow fetching any sha1 remote-curl: don't hang when a server dies before any output	2017-01-10 15:24:25 -08:00
Junio C Hamano	8a2882f23e	Merge branch 'jk/http-walker-limit-redirect-2.9' Transport with dumb http can be fooled into following foreign URLs that the end user does not intend to, especially with the server side redirects and http-alternates mechanism, which can lead to security issues. Tighten the redirection and make it more obvious to the end user when it happens. * jk/http-walker-limit-redirect-2.9: http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect	2016-12-19 14:45:32 -08:00
Jeff King	50d3413740	http: make redirects more obvious We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since `c93c92f30` (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-06 12:32:48 -08:00
Jeff King	fcaa6e64b3	remote-curl: rename shadowed options variable The discover_refs() function has a local "options" variable to hold the http_get_options we pass to http_get_strbuf(). But this shadows the global "struct options" that holds our program-level options, which cannot be accessed from this function. Let's give the local one a more descriptive name so we can tell the two apart. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-06 12:32:48 -08:00
David Turner	296b847c0d	remote-curl: don't hang when a server dies before any output In the event that a HTTP server closes the connection after giving a 200 but before giving any packets, we don't want to hang forever waiting for a response that will never come. Instead, we should die immediately. One case where this happens is when attempting to fetch a dangling object by its object name. In this case, the server dies before sending any data. Prior to this patch, fetch-pack would wait for data from the server, and remote-curl would wait for fetch-pack, causing a deadlock. Despite this patch, there is other possible malformed input that could cause the same deadlock (e.g. a half-finished pktline, or a pktline but no trailing flush). There are a few possible solutions to this: 1. Allowing remote-curl to tell fetch-pack about the EOF (so that fetch-pack could know that no more data is coming until it says something else). This is tricky because an out-of-band signal would be required, or the http response would have to be re-framed inside another layer of pkt-line or something. 2. Make remote-curl understand some of the protocol. It turns out that in addition to understanding pkt-line, it would need to watch for ack/nak. This is somewhat fragile, as information about the protocol would end up in two places. Also, pkt-lines which are already at the length limit would need special handling. Both of these solutions would require a fair amount of work, whereas this hack is easy and solves at least some of the problem. Still to do: it would be good to give a better error message than "fatal: The remote end hung up unexpectedly". Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-18 13:05:46 -08:00
Junio C Hamano	a460ea4a3c	Merge branch 'nd/shallow-deepen' The existing "git fetch --depth=<n>" option was hard to use correctly when making the history of an existing shallow clone deeper. A new option, "--deepen=<n>", has been added to make this easier to use. "git clone" also learned "--shallow-since=<date>" and "--shallow-exclude=<tag>" options to make it easier to specify "I am interested only in the recent N months worth of history" and "Give me only the history since that version". * nd/shallow-deepen: (27 commits) fetch, upload-pack: --deepen=N extends shallow boundary by N commits upload-pack: add get_reachable_list() upload-pack: split check_unreachable() in two, prep for get_reachable_list() t5500, t5539: tests for shallow depth excluding a ref clone: define shallow clone boundary with --shallow-exclude fetch: define shallow boundary with --shallow-exclude upload-pack: support define shallow boundary by excluding revisions refs: add expand_ref() t5500, t5539: tests for shallow depth since a specific date clone: define shallow clone boundary based on time with --shallow-since fetch: define shallow boundary with --shallow-since upload-pack: add deepen-since to cut shallow repos based on time shallow.c: implement a generic shallow boundary finder based on rev-list fetch-pack: use a separate flag for fetch in deepening mode fetch-pack.c: mark strings for translating fetch-pack: use a common function for verbose printing fetch-pack: use skip_prefix() instead of starts_with() upload-pack: move rev-list code out of check_non_tip() upload-pack: make check_non_tip() clean things up on error upload-pack: tighten number parsing at "deepen" lines ...	2016-10-10 14:03:50 -07:00
Junio C Hamano	de61cebde7	Merge branch 'jk/common-main-2.8' into jk/common-main * jk/common-main-2.8: mingw: declare main()'s argv as const common-main: call git_setup_gettext() common-main: call restore_sigpipe_to_default() common-main: call sanitize_stdfds() common-main: call git_extract_argv0_path() add an extra level of indirection to main()	2016-07-06 10:02:57 -07:00
Jeff King	5ce5f5fa5a	common-main: call git_setup_gettext() This should be part of every program, as otherwise users do not get translated error messages. However, some external commands forgot to do so (e.g., git-credential-store). This fixes them, and eliminates the repeated code in programs that did remember to use it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Jeff King	650c449250	common-main: call git_extract_argv0_path() Every program which links against libgit.a must call this function, or risk hitting an assert() in system_path() that checks whether we have configured argv0_path (though only when RUNTIME_PREFIX is defined, so essentially only on Windows). Looking at the diff, you can see that putting it into the common main() saves us having to do it individually in each of the external commands. But what you can't see are the cases where we _should_ have been doing so, but weren't (e.g., git-credential-store, and all of the t/helper test programs). This has been an accident-waiting-to-happen for a long time, but wasn't triggered until recently because it involves one of those programs actually calling system_path(). That happened with git-credential-store in v2.8.0 with `ae5f677` (lazily load core.sharedrepository, 2016-03-11). The program: - takes a lock file, which... - opens a tempfile, which... - calls adjust_shared_perm to fix permissions, which... - lazy-loads the config (as of `ae5f677`), which... - calls system_path() to find the location of /etc/gitconfig On systems with RUNTIME_PREFIX, this means credential-store reliably hits that assert() and cannot be used. We never noticed in the test suite, because we set GIT_CONFIG_NOSYSTEM there, which skips the system_path() lookup entirely. But if we were to tweak git_config() to find /etc/gitconfig even when we aren't going to open it, then the test suite shows multiple failures (for credential-store, and for some other test helpers). I didn't include that tweak here because it's way too specific to this particular call to be worth carrying around what is essentially dead code. The implementation is fairly straightforward, with one exception: there is exactly one caller (git.c) that actually cares about the result of the function, and not the side-effect of setting up argv0_path. We can accommodate that by simply replacing the value of argv[0] in the array we hand down to cmd_main(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Jeff King	3f2e2297b9	add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char " instead of "char " for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Nguyễn Thái Ngọc Duy	cccf74e2da	fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more () parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. () We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	a45a260086	fetch: define shallow boundary with --shallow-exclude Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	508ea88226	fetch: define shallow boundary with --shallow-since Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	b5f62ebea5	remote-curl.c: convert fetch_git() to use argv_array Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Johannes Schindelin	8cb01e2fd3	http: support sending custom HTTP headers We introduce a way to send custom HTTP headers with all requests. This allows us, for example, to send an extra token from build agents for temporary access to private repositories. (This is the use case that triggered this patch.) This feature can be used like this: git -c http.extraheader='Secret: sssh!' fetch $URL $REF Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only a single list, overriding any previous call. This means we have to collect _all_ of the headers we want to use into a single list, and feed it to cURL in one shot. Since we already unconditionally set a "pragma" header when initializing the curl handles, we can add our new headers to that list. For callers which override the default header list (like probe_rpc), we provide `http_copy_default_headers()` so they can do the same trick. Big thanks to Jeff King and Junio Hamano for their outstanding help and patient reviews. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-27 14:02:33 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Junio C Hamano	97c49af6a7	Merge branch 'sp/remote-curl-ssl-strerror' Help those who debug http(s) part of the system. * sp/remote-curl-ssl-strerror: remote-curl: include curl_errorstr on SSL setup failures	2016-02-24 13:25:56 -08:00
Junio C Hamano	e84d5e9fa1	Merge branch 'ew/force-ipv4' "git fetch" and friends that make network connections can now be told to only use ipv4 (or ipv6). * ew/force-ipv4: connect & http: support -4 and -6 switches for remote operations	2016-02-24 13:25:54 -08:00
Jeff King	b32fa95fd8	convert trivial cases to ALLOC_ARRAY Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Jeff King	850d2fec53	convert manual allocations to argv_array There are many manual argv allocations that predate the argv_array API. Switching to that API brings a few advantages: 1. We no longer have to manually compute the correct final array size (so it's one less thing we can screw up). 2. In many cases we had to make a separate pass to count, then allocate, then fill in the array. Now we can do it in one pass, making the code shorter and easier to follow. 3. argv_array handles memory ownership for us, making it more obvious when things should be free()d and and when not. Most of these cases are pretty straightforward. In some, we switch from "run_command_v" to "run_command" which lets us directly use the argv_array embedded in "struct child_process". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:50:32 -08:00
Shawn Pearce	00540458a8	remote-curl: include curl_errorstr on SSL setup failures For curl error 35 (CURLE_SSL_CONNECT_ERROR) users need the additional text stored in CURLOPT_ERRORBUFFER to debug why the connection did not start. This is curl_errorstr inside of http.c, so include that in the message if it is non-empty. Sometimes HTTP response codes aren't yet available, such as when the SSL setup fails. Don't include HTTP 0 in the message. Signed-off-by: Shawn Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-15 13:21:43 -08:00
Eric Wong	c915f11eb4	connect & http: support -4 and -6 switches for remote operations Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 11:34:14 -08:00

1 2 3 4 5 ...

299 commits