development/git - HydraGit

mirror of https://github.com/git/git synced 2024-11-05 18:59:29 +00:00

Author	SHA1	Message	Date
Jeff King	af22a63c39	sideband: diagnose more sideband anomalies In demultiplex_sideband(), there are two oddities when we check an incoming packet: - if it has zero length, then we assume it's a flush packet. This means we fail to notice the difference between a real flush and a true zero-length packet that's missing its sideband designator. It's not a huge problem in practice because we'd never send a zero-length data packet (even our keepalives are otherwise-empty sideband-1 packets). But it would be nice to detect and report the error, since it's likely to cause other confusion (we think the other side flushed, but they do not). - we try to detect packets missing their designator by checking for "if (len < 1)". But this will never trigger for "len == 0"; we've already detected that and left the function before then. It _could_ detect a negative "len" parameter. But in that case, the error message is wrong. The issue is not "no sideband" but rather "eof while reading the packet". However, this can't actually be triggered in practice, because neither of the two callers uses pkt_read's GENTLE_ON_EOF flag. Which means they'd die with "the remote end hung up unexpectedly" before we even get here. So this truly is dead code. We can improve these cases by passing in a pkt-line status to the demultiplexer, and by having recv_sideband() use GENTLE_ON_EOF. This gives us two improvements: - we can now reliably detect flush packets, and will report a normal packet missing its sideband designator as an error - we'll report an eof with a more detailed "protocol error: eof while reading sideband packet", rather than the generic "the remote end hung up unexpectedly" - when we see an eof, we'll flush the sideband scratch buffer, which may provide some hints from the remote about why they hung up (though note we already flush on newlines, so it's likely that most such messages already made it through) In some sense this patch goes against `fbd76cd450` (sideband: reverse its dependency on pkt-line, 2019-01-16), which caused the sideband code not to depend on the pkt-line code. But that commit was really just trying to deal with the circular header dependency. The two modules are conceptually interlinked, and it was just trying to keep things compiling. And indeed, there's a sticking point in this patch: because pkt-line.h includes sideband.h, we can't add the reverse include we need for the sideband code to have an "enum packet_read_status" parameter. Nor can we forward declare it, because you can't forward declare an enum in C. However, C does guarantee that enums fit in an int, so we can just use that type. One alternative would be for the callers to check themselves that they got something sane from the pkt-line code. But besides duplicating logic, this gets quite tricky. Any error condition requires flushing the sideband #2 scratch buffer, which only demultiplex_sideband() knows how to do. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-29 09:23:29 -07:00
Johannes Schindelin	8e86cf6581	sideband: report unhandled incomplete sideband messages as bugs It was pretty tricky to verify that incomplete sideband messages are handled correctly by the `recv_sideband()`/`demultiplex_sideband()` code: they have to be flushed out at the end of the loop in `recv_sideband()`, but the actual flushing is done by the `demultiplex_sideband()` function (which therefore has to know somehow that the loop will be done after it returns). To catch future bugs where incomplete sideband messages might not be shown by mistake, let's catch that condition and report a bug. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 13:31:02 -07:00
Junio C Hamano	12210859da	Merge branch 'bc/sha-256-part-2' SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...	2020-07-06 22:09:13 -07:00
brian m. carlson	9a9f0d3fc0	pkt-line: add a member for hash algorithm Add a member for the hash algorithm currently in use to the packet reader so it can parse references correctly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
Denton Liu	0181b600a6	pkt-line: define PACKET_READ_RESPONSE_END In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Denton Liu	101736a14c	pkt-line: extern packet_length() In a future commit, we will be manually processing packets and we will need to access the length header. In order to simplify this, extern packet_length() so that the logic can be reused. Change the function parameter from `const char *linelen` to `const char lenbuf_hex[4]`. Even though these two types behave identically as function parameters, use the array notation to semantically indicate exactly what this function is expecting as an argument. Also, rename it from linelen to lenbuf_hex as the former sounds like it should be an integral type which is misleading. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-19 15:40:26 -07:00
Junio C Hamano	56bea28036	pkt-line: drop 'const'-ness of a param to set_packet_header() The function's definition has a paramter of type "int" qualified as "const". The fact that the incoming parameter is used as read-only in the fuction is an implementation detail that the callers should not have to be told in the prototype declaring it (and "const" there has no effect, as C passes parameters by value). The prototype defined for the function in pkt-line.h lacked the matching "const" for this reason, but apparently some compilers (e.g. MS Visual C 2017) complain about the parameter type mismatch. Let's squelch it by removing the "const" that is pointless in the definition of a small and trivial function like this, which would not help optimizing compilers nor reading humans that much. Noticed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-16 13:29:06 +09:00
Jeff King	533ddba47e	pkt-line: prepare buffer before handling ERR packets Since `2d103c31c2` (pack-protocol.txt: accept error packets in any context, 2018-12-29), the pktline code will detect an ERR packet and die automatically, saving the caller from dealing with it. But we do so too early in the function, before we have terminated the buffer with a NUL. As a result, passing the ERR message to die() may result in us printing random cruft from a previous packet. This doesn't trigger memory tools like ASan because we reuse the same buffer over and over (so the contents are valid and initialized; they're just stale). We can see demonstrate this by tightening the regex we use to match the error message in t5516; without this patch, git-fetch will accidentally print the capabilities from the (much longer) initial packet we received. By moving the ERR code later in the function we get a few other benefits, too: - we'll now chomp any newline sent by the other side (which is what we want, since die() will add its own newline) - we'll now mention the ERR packet with GIT_TRACE_PACKET Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-15 14:00:51 +09:00
Junio C Hamano	27cdbdd134	Merge branch 'jk/no-sigpipe-during-network-transport' On platforms where "git fetch" is killed with SIGPIPE (e.g. OSX), the upload-pack that runs on the other end that hangs up after detecting an error could cause "git fetch" to die with a signal, which led to a flakey test. "git fetch" now ignores SIGPIPE during the network portion of its operation (this is not a problem as we check the return status from our write(2)s). * jk/no-sigpipe-during-network-transport: fetch: ignore SIGPIPE during network operation fetch: avoid calling write_or_die()	2019-03-20 15:16:06 +09:00
Jeff King	37c80012f1	fetch: avoid calling write_or_die() The write_or_die() function has one quirk that a caller might not expect: when it sees EPIPE from the write() call, it translates that into a death by SIGPIPE. This doesn't change the overall behavior (the program exits either way), but it does potentially confuse test scripts looking for a non-signal exit code. Let's switch away from using write_or_die() in a few code paths, which will give us more consistent exit codes. It also gives us the opportunity to write more descriptive error messages, since we have context that write_or_die() does not. Note that this won't do much by itself, since we'd typically be killed by SIGPIPE before write_or_die() even gets a chance to do its thing. That will be addressed in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-05 15:02:01 +09:00
Jonathan Tan	a97d00799a	remote-curl: use post_rpc() for protocol v2 also When transmitting and receiving POSTs for protocol v0 and v1, remote-curl uses post_rpc() (and associated functions), but when doing the same for protocol v2, it uses a separate set of functions (proxy_rpc() and others). Besides duplication of code, this has caused at least one bug: the auth retry mechanism that was implemented in v0/v1 was not implemented in v2. To fix this issue and avoid it in the future, make remote-curl also use post_rpc() when handling protocol v2. Because line lengths are written to the HTTP request in protocol v2 (unlike in protocol v0/v1), this necessitates changes in post_rpc() and some of the functions it uses; perform these changes too. A test has been included to ensure that the code for both the unchunked and chunked variants of the HTTP request is exercised. Note: stateless_connect() has been updated to use the lower-level packet reading functions instead of struct packet_reader. The low-level control is necessary here because we cannot change the destination buffer of struct packet_reader while it is being used; struct packet_buffer has a peeking mechanism which relies on the destination buffer being present in between a peek and a read. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-03 19:00:42 +09:00
Jonathan Tan	0bbc0bc574	{fetch,upload}-pack: sideband v2 fetch response Currently, a response to a fetch request has sideband support only while the packfile is being sent, meaning that the server cannot send notices until the start of the packfile. Extend sideband support in protocol v2 fetch responses to the whole response. upload-pack will advertise it if the uploadpack.allowsidebandall configuration variable is set, and fetch-pack will automatically request it if advertised. If the sideband is to be used throughout the whole response, upload-pack will use it to send errors instead of prefixing a PKT-LINE payload with "ERR ". This will be tested in a subsequent patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 11:25:07 -08:00
Jonathan Tan	fbd76cd450	sideband: reverse its dependency on pkt-line A subsequent patch will teach struct packet_reader a new field that, if set, instructs it to interpret read data as multiplexed. This will create a dependency from pkt-line to sideband. To avoid a circular dependency, split recv_sideband() into 2 parts: the reading loop (left in recv_sideband()) and the processing of the contents (in demultiplex_sideband()), and move the former into pkt-line. This reverses the direction of dependency: sideband no longer depends on pkt-line, and pkt-line now depends on sideband. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 11:25:07 -08:00
Jonathan Tan	bc2e795cea	pkt-line: introduce struct packet_writer A future patch will allow the client to request multiplexing of the entire fetch response (and not only during packfile transmission), which in turn allows the server to send progress and keepalive messages at any time during the response. It will be convenient for a future patch if writing options (specifically, whether the written data is to be multiplexed) could be controlled from a single place, so create struct packet_writer to serve as that place, and modify upload-pack to use it. Currently, it only stores the output fd, but a subsequent patch will (as described above) introduce an option to determine if the written data is to be multiplexed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-15 11:44:48 -08:00
Masaya Suzuki	2d103c31c2	pack-protocol.txt: accept error packets in any context In the Git pack protocol definition, an error packet may appear only in a certain context. However, servers can face a runtime error (e.g. I/O error) at an arbitrary timing. This patch changes the protocol to allow an error packet to be sent instead of any packet. Without this protocol spec change, when a server cannot process a request, there's no way to tell that to a client. Since the server cannot produce a valid response, it would be forced to cut a connection without telling why. With this protocol spec change, the server can be more gentle in this situation. An old client may see these error packets as an unexpected packet, but this is not worse than having an unexpected EOF. Following this protocol spec change, the error packet handling code is moved to pkt-line.c. Implementation wise, this implementation uses pkt-line to communicate with a subprocess. Since this is not a part of Git protocol, it's possible that a packet that is not supposed to be an error packet is mistakenly parsed as an error packet. This error packet handling is enabled only for the Git pack protocol parsing code considering this. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 13:05:30 -08:00
Nguyễn Thái Ngọc Duy	c60d7697d1	pkt-line.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	1a07e59c3e	Update messages in preparation for i18n Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Junio C Hamano	50f08db594	Merge branch 'js/use-bug-macro' Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths	2018-05-30 14:04:07 +09:00
Johannes Schindelin	033abf97fc	Replace all die("BUG: ...") calls by BUG() ones In `d8193743e0` (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in `588a538ae5` (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-06 19:06:13 +09:00
Brandon Williams	f1f4d8acf4	pkt-line: add packet_buf_write_len function Add the 'packet_buf_write_len()' function which allows for writing an arbitrary length buffer into a 'struct strbuf' and formatting it in packet-line format. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
Brandon Williams	a4cfd41c7b	pkt-line: add delim packet support One of the design goals of protocol-v2 is to improve the semantics of flush packets. Currently in protocol-v1, flush packets are used both to indicate a break in a list of packet lines as well as an indication that one side has finished speaking. This makes it particularly difficult to implement proxies as a proxy would need to completely understand git protocol instead of simply looking for a flush packet. To do this, introduce the special deliminator packet '0001'. A delim packet can then be used as a deliminator between lists of packet lines while flush packets can be reserved to indicate the end of a response. Documentation for how this packet will be used in protocol v2 will included in a future patch. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 14:15:06 -07:00
Brandon Williams	77dabc14c4	pkt-line: allow peeking a packet line without consuming it Sometimes it is advantageous to be able to peek the next packet line without consuming it (e.g. to be able to determine the protocol version a server is speaking). In order to do that introduce 'struct packet_reader' which is an abstraction around the normal packet reading logic. This enables a caller to be able to peek a single line at a time using 'packet_reader_peek()' and having a caller consume a line by calling 'packet_reader_read()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 14:15:06 -07:00
Brandon Williams	2153d478b7	pkt-line: introduce packet_read_with_status The current pkt-line API encodes the status of a pkt-line read in the length of the read content. An error is indicated with '-1', a flush with '0' (which can be confusing since a return value of '0' can also indicate an empty pkt-line), and a positive integer for the length of the read content otherwise. This doesn't leave much room for allowing the addition of additional special packets in the future. To solve this introduce 'packet_read_with_status()' which reads a packet and returns the status of the read encoded as an 'enum packet_status' type. This allows for easily identifying between special and normal packets as well as errors. It also enables easily adding a new special packet in the future. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 14:15:05 -07:00
Junio C Hamano	4c6dad0059	Merge branch 'bw/protocol-v1' A new mechanism to upgrade the wire protocol in place is proposed and demonstrated that it works with the older versions of Git without harming them. * bw/protocol-v1: Documentation: document Extra Parameters ssh: introduce a 'simple' ssh variant i5700: add interop test for protocol transition http: tell server that the client understands v1 connect: tell server that the client understands v1 connect: teach client to recognize v1 server response upload-pack, receive-pack: introduce protocol version 1 daemon: recognize hidden request arguments protocol: introduce protocol extension mechanisms pkt-line: add packet_write function connect: in ref advertisement, shallows are last	2017-12-06 09:23:44 -08:00
Brandon Williams	5d2124b34a	pkt-line: add packet_write function Add a function which can be used to write the contents of an arbitrary buffer. This makes it easy to build up data in a buffer before writing the packet instead of formatting the entire contents of the packet using 'packet_write_fmt()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 10:51:29 +09:00
Jeff King	61d36330b4	prefer "!=" when checking read_in_full() result Comparing the result of read_in_full() using less-than is potentially dangerous, as a negative return value may be converted to an unsigned type and be considered a success. This is discussed further in 561598cfcf (read_pack_header: handle signed/unsigned comparison in read result, 2017-09-13). Each of these instances is actually fine in practice: - in get-tar-commit-id, the HEADERSIZE macro expands to a signed integer. If it were switched to an unsigned type (e.g., a size_t), then it would be a bug. - the other two callers check for a short read only after handling a negative return separately. This is a fine practice, but we'd prefer to model "!=" as a general rule. So all of these cases can be considered cleanups and not actual bugfixes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:24 +09:00
Junio C Hamano	c50424a6f0	Merge branch 'jk/write-in-full-fix' Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern	2017-09-25 15:24:06 +09:00
Junio C Hamano	c78e182d55	Merge branch 'ma/pkt-line-leakfix' A leakfix. * ma/pkt-line-leakfix: pkt-line: re-'static'-ify buffer in packet_write_fmt_1()	2017-09-19 10:47:52 +09:00
Jeff King	4c95e3dd28	pkt-line: check write_in_full() errors against "< 0" As with the previous two commits, we prefer to check write_in_full()'s return value to see if it is negative, rather than comparing it to the input length. These cases actually flip the logic to check for success, making conversion a little different than in other cases. We could of course write: if (write_in_full(...) >= 0) return 0; return error(...); But our usual method of spelling write() error checks is just "< 0". So let's flip the logic for each of these conditionals to our usual style. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:17:59 +09:00
Martin Ågren	150efef1e7	pkt-line: re-'static'-ify buffer in packet_write_fmt_1() The static-ness was silently dropped in commit `70428d1a5` ("pkt-line: add packet_write_fmt_gently()", 2016-10-16). As a result, for each call to packet_write_fmt_1, we allocate and leak a buffer. We could keep the strbuf non-static and instead make sure we always release it before returning (but not before we die, so that we don't touch errno). That would also prepare us for threaded use. But until that needs to happen, let's just restore the static-ness so that we get back to a situation where we (eventually) do not continuosly keep allocating memory. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-06 13:11:14 +09:00
Jonathan Tan	fa64a2fdbe	sub-process: refactor handshake to common function Refactor, into a common function, the version and capability negotiation done when invoking a long-running process as a clean or smudge filter. This will be useful for other Git code that needs to interact similarly with a long-running process. As you can see in the change to t0021, this commit changes the error message reported when the long-running process does not introduce itself with the expected "server"-terminated line. Originally, the error message reports that the filter "does not support filter protocol version 2", differentiating between the old single-file filter protocol and the new multi-file filter protocol - I have updated it to something more generic and useful. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-26 13:00:40 -07:00
Ben Peart	c0c70f7ac0	convert: move packet_write_line() into pkt-line as packet_writel() Add packet_writel() which writes multiple lines in a single call and then calls packet_flush_gently(). Update convert.c to use the new packet_writel() function from pkt-line. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 10:57:26 +09:00
Ben Peart	825b9226bf	pkt-line: add packet_read_line_gently() Add packet_read_line_gently() to enable reading a line without dying on EOF. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 10:57:26 +09:00
Ben Peart	974b50c556	pkt-line: fix packet_read_line() to handle len < 0 errors Update packet_read_line() to test for len > 0 to avoid potential bug if read functions return lengths less than zero to indicate errors. Signed-off-by: Ben Peart <benpeart@microsoft.com> Found/Fixed-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 10:57:26 +09:00
Lars Schneider	bb643d8bf8	pkt-line: add functions to read/write flush terminated packet streams write_packetized_from_fd() and write_packetized_from_buf() write a stream of packets. All content packets use the maximal packet size except for the last one. After the last content packet a `flush` control packet is written. read_packetized_to_strbuf() reads arbitrary sized packets until it detects a `flush` packet. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Lars Schneider	edfb780cd4	pkt-line: add packet_write_gently() packet_write_fmt_gently() uses format_packet() which lets the caller only send string data via "%s". That means it cannot be used for arbitrary data that may contain NULs. Add packet_write_gently() which writes arbitrary data and does not die in case of an error. The function is used by other pkt-line functions in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Lars Schneider	038ce90f2f	pkt-line: add packet_flush_gently() packet_flush() would die in case of a write error even though for some callers an error would be acceptable. Add packet_flush_gently() which writes a pkt-line flush packet like packet_flush() but does not die in case of an error. The function is used in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Lars Schneider	70428d1a52	pkt-line: add packet_write_fmt_gently() packet_write_fmt() would die in case of a write error even though for some callers an error would be acceptable. Add packet_write_fmt_gently() which writes a formatted pkt-line like packet_write_fmt() but does not die in case of an error. The function is used in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Lars Schneider	2f60bdd1a8	pkt-line: extract set_packet_header() Extracted set_packet_header() function converts an integer to a 4 byte hex string. Make this function locally available so that other pkt-line functions could use it. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Lars Schneider	81c634e94f	pkt-line: rename packet_write() to packet_write_fmt() packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
René Scharfe	d23309733a	introduce hex2chr() for converting two hexadecimal digits to a character Add and use a helper function that decodes the char value of two hexadecimal digits. It returns a negative number on error, avoids running over the end of the given string and doesn't shift negative values. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 10:42:46 -07:00
Jeff King	fd89433dd0	pkt-line: show packets in async processes as "sideband" If you run "GIT_TRACE_PACKET=1 git push", you may get confusing output like (line prefixes omitted for clarity): packet: push< \1000eunpack ok0019ok refs/heads/master0000 packet: push< unpack ok packet: push< ok refs/heads/master packet: push< 0000 packet: push< 0000 Why do we see the data twice, once apparently wrapped inside another pkt-line, and once unwrapped? Why do we get two flush packets? The answer is that we start an async process to demux the sideband data. The first entry comes from the sideband process reading the data, and the second from push itself. Likewise, the first flush is inside the demuxed packet, and the second is an actual sideband flush. We can make this a bit more clear by marking the sideband demuxer explicitly as "sideband" rather than "push". The most elegant way to do this would be to simply call packet_trace_identity() inside the sideband demuxer. But we can't do that reliably, because it relies on a global variable, which might be shared if pthreads are in use. What we really need is thread-local storage for packet_trace_identity. But the async code does not provide an interface for that, and it would be messy to add it here (we'd have to care about pthreads, initializing our pthread_key_t ahead of time, etc). So instead, let us just assume that any async process is handling sideband data. That's always true now, and is likely to remain so in the future. The output looks like: packet: sideband< \1000eunpack ok0019ok refs/heads/master0000 packet: push< unpack ok packet: push< ok refs/heads/master packet: push< 0000 packet: sideband< 0000 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-01 15:11:57 -07:00
Jeff King	323598387d	pkt-line: support tracing verbatim pack contents When debugging the pack protocol, it is sometimes useful to store the verbatim pack that we sent or received on the wire. Looking at the on-disk result is often not helpful for a few reasons: 1. If the operation is a clone, we destroy the repo on failure, leaving nothing on disk. 2. If the pack is small, we unpack it immediately, and the full pack never hits the disk. 3. If we feed the pack to "index-pack --fix-thin", the resulting pack has the extra delta bases added to it. We already have a GIT_TRACE_PACKET mechanism for tracing packets. Let's extend it with GIT_TRACE_PACKFILE to dump the verbatim packfile. There are a few other positive fallouts that come from rearranging this code: - We currently disable the packet trace after seeing the PACK header, even though we may get human-readable lines on other sidebands; now we include them in the trace. - We currently try to print "PACK ..." in the trace to indicate that the packfile has started. But because we disable packet tracing, we never printed this line. We will now do so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-16 13:24:22 -07:00
Jeff King	d6d1a75e51	pkt-line: tighten sideband PACK check when tracing To find the start of the pack data, we accept the word PACK at the beginning of any sideband channel, even though what we really want is to find the pack data on channel 1. In practice this doesn't matter, as sideband-2 messages tend to start with "error:" or similar, but it is a good idea to be explicit (especially as we add more code in this area, we will rely on this assumption). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-15 13:25:52 -07:00
Jeff King	f3612acb93	pkt-line: simplify starts_with checks in packet tracing We carefully check that our pkt buffer has enough characters before seeing if it starts with "PACK". The intent is to avoid reading random memory if we get a short buffer like "PAC". However, we know that the traced packets are always NUL-terminated. They come from one of these sources: 1. A string literal. 2. `format_packet`, which uses a strbuf. 3. `packet_read`, which defensively NUL-terminates what we read. We can therefore drop the length checks, as we know we will hit the trailing NUL if we have a short input. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-15 13:25:50 -07:00
Jeff King	8e9faf27c1	pkt-line: allow writing of LARGE_PACKET_MAX buffers When we send out pkt-lines with refnames, we use a static 1000-byte buffer. This means that the maximum size of a ref over the git protocol is around 950 bytes (the exact size depends on the protocol line being written, but figure on a sha1 plus some boilerplate). This is enough for any sane workflow, but occasionally odd things happen (e.g., a bug may create a ref "foo/foo/foo/..." accidentally). With the current code, you cannot even use "push" to delete such a ref from a remote. Let's switch to using a strbuf, with a hard-limit of LARGE_PACKET_MAX (which is specified by the protocol). This matches the size of the readers, as of `74543a0` (pkt-line: provide a LARGE_PACKET_MAX static buffer, 2013-02-20). Versions of git older than that will complain about our large packets, but it's really no worse than the current behavior. Right now the sender barfs with "impossibly long line" trying to send the packet, and afterwards the reader will barf with "protocol error: bad line length %d", which is arguably better anyway. Note that we're not really _solving_ the problem here, but just bumping the limits. In theory, the length of a ref is unbounded, and pkt-line can only represent sizes up to 65531 bytes. So we are just bumping the limit, not removing it. But hopefully 64K should be enough for anyone. As a bonus, by using a strbuf for the formatting we can eliminate an unnecessary copy in format_buf_write. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-12-10 13:09:21 -08:00
Karsten Blees	6aa3085702	trace: improve trace performance The trace API currently rechecks the environment variable and reopens the trace file on every API call. This has the ugly side effect that errors (e.g. file cannot be opened, or the user specified a relative path) are also reported on every call. Performance can be improved by about factor three by remembering the environment state and keeping the file open. Replace the 'const char *key' parameter in the API with a pointer to a 'struct trace_key' that bundles the environment variable name with additional, trace-internal state. Change the call sites of these APIs to use a static 'struct trace_key' instead of a string constant. In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct trace_key'. Add a 'trace_disable()' API, so that packet_trace() can cleanly disable tracing when it encounters packed data (instead of using unsetenv()). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-13 21:24:23 -07:00
Christian Couder	5955654823	replace {pre,suf}fixcmp() with {starts,ends}_with() Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c \| grep -v strbuf\\.c \| xargs perl -pi -e ' s\|!prefixcmp\(\|starts_with\(\|g; s\|prefixcmp\(\|!starts_with\(\|g; s\|!suffixcmp\(\|ends_with\(\|g; s\|suffixcmp\(\|!ends_with\(\|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-05 14:13:21 -08:00
Jeff King	4981fe750b	pkt-line: share buffer/descriptor reading implementation The packet_read function reads from a descriptor. The packet_get_line function is similar, but reads from an in-memory buffer, and uses a completely separate implementation. This patch teaches the generic packet_read function to accept either source, and we can do away with packet_get_line's implementation. There are two other differences to account for between the old and new functions. The first is that we used to read into a strbuf, but now read into a fixed size buffer. The only two callers are fine with that, and in fact it simplifies their code, since they can use the same static-buffer interface as the rest of the packet_read_line callers (and we provide a similar convenience wrapper for reading from a buffer rather than a descriptor). This is technically an externally-visible behavior change in that we used to accept arbitrary sized packets up to 65532 bytes, and now cap out at LARGE_PACKET_MAX, 65520. In practice this doesn't matter, as we use it only for parsing smart-http headers (of which there is exactly one defined, and it is small and fixed-size). And any extension headers would be breaking the protocol to go over LARGE_PACKET_MAX anyway. The other difference is that packet_get_line would return on error rather than dying. However, both callers of packet_get_line are actually improved by dying. The first caller does its own error checking, but we can drop that; as a result, we'll actually get more specific reporting about protocol breakage when packet_read dies internally. The only downside is that packet_read will not print the smart-http URL that failed, but that's not a big deal; anybody not debugging can already see the remote's URL already, and anybody debugging would want to run with GIT_CURL_VERBOSE anyway to see way more information. The second caller, which is just trying to skip past any extra smart-http headers (of which there are none defined, but which we allow to keep room for future expansion), did not error check at all. As a result, it would treat an error just like a flush packet. The resulting mess would generally cause an error later in get_remote_heads, but now we get error reporting much closer to the source of the problem. Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-24 00:14:15 -08:00
Jeff King	74543a0423	pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-20 13:42:22 -08:00

1 2

69 commits