development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-30 04:01:21 +00:00

Author	SHA1	Message	Date
Junio C Hamano	2fb346c06a	Merge branch 'js/packet-read-line-check-null' Some low level protocol codepath could crash when they get an unexpected flush packet, which is now fixed. * js/packet-read-line-check-null: always check for NULL return from packet_read_line() correct error messages for NULL packet_read_line()	2018-02-27 10:33:55 -08:00
Junio C Hamano	6bed209a20	Merge branch 'jh/partial-clone' The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone	2018-02-13 13:39:04 -08:00
Junio C Hamano	f3d618d2bf	Merge branch 'jh/fsck-promisors' In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension	2018-02-13 13:39:03 -08:00
Jeff King	bc9d4dc5b0	correct error messages for NULL packet_read_line() The packet_read_line() function dies if it gets an unexpected EOF. It only returns NULL if we get a flush packet (or technically, a zero-length "0004" packet, but nobody is supposed to send those, and they are indistinguishable from a flush in this interface). Let's correct error messages which claim an unexpected EOF; it's really an unexpected flush packet. While we're here, let's also check "!line" instead of "!len" in the second case. The two events should always coincide, but checking "!line" makes it more obvious that we are not about to dereference NULL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-08 12:37:30 -08:00
Jonathan Tan	a1c6d7c1a7	fetch-pack: restore save_commit_buffer after use In fetch-pack, the global variable save_commit_buffer is set to 0, but not restored to its original value after use. In particular, if show_log() (in log-tree.c) is invoked after fetch_pack() in the same process, show_log() will return before printing out the commit message (because the invocation to get_cached_commit_buffer() returns NULL, because the commit buffer was not saved). I discovered this when attempting to run "git log -S" in a partial clone, triggering the case where revision walking lazily loads missing objects. Therefore, restore save_commit_buffer to its original value after use. An alternative to solve the problem I had is to replace get_cached_commit_buffer() with get_commit_buffer(). That invocation was introduced in commit `a97934d` ("use get_cached_commit_buffer where appropriate", 2014-06-13) to replace "commit->buffer" introduced in commit `3131b71` ("Add "--show-all" revision walker flag for debugging", 2008-02-13). In the latter commit, the commit author seems to be deciding between not showing an unparsed commit at all and showing an unparsed commit without the message (which is what the commit does), and did not mention parsing the unparsed commit, so I prefer to preserve the existing behavior. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:52 -08:00
Jeff Hostetler	640d8b72fe	fetch-pack, index-pack, transport: partial clone Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:51 -08:00
Junio C Hamano	79bafd23a8	Merge branch 'jk/fewer-pack-rescan' Internaly we use 0{40} as a placeholder object name to signal the codepath that there is no such object (e.g. the fast-forward check while "git fetch" stores a new remote-tracking ref says "we know there is no 'old' thing pointed at by the ref, as we are creating it anew" by passing 0{40} for the 'old' side), and expect that a codepath to locate an in-core object to return NULL as a sign that the object does not exist. A look-up for an object that does not exist however is quite costly with a repository with large number of packfiles. This access pattern has been optimized. * jk/fewer-pack-rescan: sha1_file: fast-path null sha1 as a missing object everything_local: use "quick" object existence check p5551: add a script to test fetch pack-dir rescans t/perf/lib-pack: use fast-import checkpoint to create packs p5550: factor out nonsense-pack creation	2017-12-06 09:23:42 -08:00
Jonathan Tan	88e2f9ed8e	introduce fetch-object: fetch one promisor object Introduce fetch-object, providing the ability to fetch one object from a promisor remote. This uses fetch-pack. To do this, the transport mechanism has been updated with 2 flags, "from-promisor" to indicate that the resulting pack comes from a promisor remote (and thus should be annotated as such by index-pack), and "no-dependents" to indicate that only the objects themselves need to be fetched (but fetching additional objects is nevertheless safe). Whenever "no-dependents" is used, fetch-pack will refrain from using any object flags, because it is most likely invoked as part of a dynamic object fetch by another Git command (which may itself use object flags). An alternative to this is to leave fetch-pack alone, and instead update the allocation of flags so that fetch-pack's flags never overlap with any others, but this will end up shrinking the number of flags available to nearly every other Git command (that is, every Git command that accesses objects), so the approach in this commit was used instead. This will be tested in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Jeff King	c291293b2e	everything_local: use "quick" object existence check In `b495697b82` (fetch-pack: avoid repeatedly re-scanning pack directory, 2013-01-26), we noticed that everything_local() could waste time trying to find and parse objects which we _expect_ to be missing. The solution was to put has_sha1_file() in front of parse_object() to skip the more-expensive parse attempt. That optimization was negated later when has_sha1_file() learned to do the same re-scan in `45e8a74873` (has_sha1_file: re-check pack directory before giving up, 2013-08-30). We can restore it by using the "quick" flag to tell has_sha1_file (actually has_object_file these days) that we prefer speed to thoroughness for this call. See also the fixes in `5827a0354` and `0eeb077be7` for prior art and discussion on using the "quick" flag for these cases. The recently-added performance regression test in p5551 demonstrates the problem. You can see the original fix: Test b495697b82^ `b495697b82` -------------------------------------------------------- 5551.4: fetch 1.68(1.33+0.35) 0.87(0.69+0.18) -48.2% and then the regression: Test 45e8a74873^ `45e8a74873` --------------------------------------------------------- 5551.4: fetch 0.96(0.77+0.19) 2.55(2.04+0.50) +165.6% and now our fix: Test HEAD^ HEAD -------------------------------------------------------- 5551.4: fetch 7.21(6.58+0.63) 5.47(5.04+0.43) -24.1% You can also see that other things have gotten a lot slower since 2013. We'll deal with those in separate patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-21 11:30:41 +09:00
Jonathan Tan	9e6fabde82	oidmap: map with OID as key This is similar to using the hashmap in hashmap.c, but with an easier-to-use API. In particular, custom entry comparisons no longer need to be written, and lookups can be done without constructing a temporary entry structure. This is implemented as a thin wrapper over the hashmap API. In particular, this means that there is an additional 4-byte overhead due to the fact that the first 4 bytes of the hash is redundantly stored. For now, I'm taking the simpler approach, but if need be, we can reimplement oidmap without affecting the callers significantly. oidset has been updated to use oidmap. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:18:03 +09:00
Jonathan Tan	0abe14f6a5	pack: move {,re}prepare_packed_git and approximate_object_count Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	02c531eba2	Merge branch 'jt/fetch-allow-tip-sha1-implicitly' There is no good reason why "git fetch $there $sha1" should fail when the $sha1 names an object at the tip of an advertised ref, even when the other side hasn't enabled allowTipSHA1InWant. * jt/fetch-allow-tip-sha1-implicitly: fetch-pack: always allow fetching of literal SHA1s	2017-05-30 11:16:43 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Junio C Hamano	b15667bbdc	Merge branch 'js/larger-timestamps' Some platforms have ulong that is smaller than time_t, and our historical use of ulong for timestamp would mean they cannot represent some timestamp that the platform allows. Invent a separate and dedicated timestamp_t (so that we can distingiuish timestamps and a vanilla ulongs, which along is already a good move), and then declare uintmax_t is the type to be used as the timestamp_t. * js/larger-timestamps: archive-tar: fix a sparse 'constant too large' warning use uintmax_t for timestamps date.c: abort if the system time cannot handle one of our timestamps timestamp_t: a new data type for timestamps PRItime: introduce a new "printf format" for timestamps parse_timestamp(): specify explicitly where we parse timestamps t0006 & t5000: skip "far in the future" test when time_t is too limited t0006 & t5000: prepare for 64-bit timestamps ref-filter: avoid using `unsigned long` for catch-all data type	2017-05-16 11:51:59 +09:00
Jonathan Tan	fdb69d33c4	fetch-pack: always allow fetching of literal SHA1s fetch-pack, when fetching a literal SHA-1 from a server that is not configured with uploadpack.allowtipsha1inwant (or similar), always returns an error message of the form "Server does not allow request for unadvertised object %s". However, it is sometimes the case that such object is advertised. This situation would occur, for example, if a user or a script was provided a SHA-1 instead of a branch or tag name for fetching, and wanted to invoke "git fetch" or "git fetch-pack" using that SHA-1. Teach fetch-pack to also check the SHA-1s of the refs in the received ref advertisement if a literal SHA-1 was given by the user. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-16 10:17:05 +09:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	bc83266abe	Convert lookup_commit* to struct object_id Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id , but lookup_commit would take unsigned char , leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	e92b848cb6	shallow: convert shallow registration functions to object_id Convert register_shallow and unregister_shallow to take struct object_id. register_shallow is a caller of lookup_commit, which we will convert later. It doesn't make sense for the registration and unregistration functions to have incompatible interfaces, so convert them both. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	1b283377b1	fetch-pack: convert to struct object_id Convert all uses of unsigned char [20] to struct object_id. Switch one use of get_sha1_hex to parse_oid_hex to avoid the need for a constant. This change is necessary in order to convert parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-02 10:46:41 +09:00
Johannes Schindelin	dddbad728c	timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-27 13:07:39 +09:00
Junio C Hamano	5938454cbc	Merge branch 'dt/xgethostname-nul-termination' gethostname(2) may not NUL terminate the buffer if hostname does not fit; unfortunately there is no easy way to see if our buffer was too small, but at least this will make sure we will not end up using garbage past the end of the buffer. * dt/xgethostname-nul-termination: xgethostname: handle long hostnames use HOST_NAME_MAX to size buffers for gethostname(2)	2017-04-23 22:07:57 -07:00
Junio C Hamano	d2617eb984	Merge branch 'jt/fetch-pack-error-reporting' "git fetch-pack" was not prepared to accept ERR packet that the upload-pack can send with a human-readable error message. It showed the packet contents with ERR prefix, so there was no data loss, but it was redundant to say "ERR" in an error message. * jt/fetch-pack-error-reporting: fetch-pack: show clearer error message upon ERR	2017-04-23 22:07:53 -07:00
Johannes Schindelin	cb71f8bdb5	PRItime: introduce a new "printf format" for timestamps Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is larger than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-23 20:19:15 -07:00
David Turner	5781a9a270	xgethostname: handle long hostnames If the full hostname doesn't fit in the buffer supplied to gethostname, POSIX does not specify whether the buffer will be null-terminated, so to be safe, we should do it ourselves. Introduce new function, xgethostname, which ensures that there is always a \0 at the end of the buffer. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 19:58:04 -07:00
René Scharfe	da25bdb776	use HOST_NAME_MAX to size buffers for gethostname(2) POSIX limits the length of host names to HOST_NAME_MAX. Export the fallback definition from daemon.c and use this constant to make all buffers used with gethostname(2) big enough for any possible result and a terminating NUL. Inspired-by: David Turner <dturner@twosigma.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 19:57:41 -07:00
Jonathan Tan	8e2c7bef03	fetch-pack: show clearer error message upon ERR Currently, fetch-pack prints a confusing error message ("expected ACK/NAK") when the server it's communicating with sends a pkt-line starting with "ERR". Replace it with a less confusing error message. Also update the documentation describing the fetch-pack/upload-pack protocol (pack-protocol.txt) to indicate that "ERR" can be sent in the place of "ACK" or "NAK". In practice, this has been done for quite some time by other Git implementations (e.g. JGit sends "want $id not valid") and by Git itself (since commit `bdb31ea`: "upload-pack: report "not our ref" to client", 2017-02-23) whenever a "want" line references an object that it does not have. (This is uncommon, but can happen if a repository is garbage-collected during a negotiation.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-17 18:51:28 -07:00
brian m. carlson	910650d2f8	Rename sha1_array to oid_array Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:56 -07:00
brian m. carlson	98a72ddc12	Make sha1_array_append take a struct object_id * Convert the callers to pass struct object_id by changing the function declaration and definition and applying the following semantic patch: @@ expression E1, E2; @@ - sha1_array_append(E1, E2.hash) + sha1_array_append(E1, &E2) @@ expression E1, E2; @@ - sha1_array_append(E1, E2->hash) + sha1_array_append(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:55 -07:00
brian m. carlson	ee3051bd23	sha1-array: convert internal storage for struct sha1_array to object_id Make the internal storage for struct sha1_array use an array of struct object_id internally. Update the users of this struct which inspect its internals. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-28 09:59:34 -07:00
Junio C Hamano	07198afbd1	Merge branch 'mm/fetch-show-error-message-on-unadvertised-object' "git fetch" that requests a commit by object name, when the other side does not allow such an request, failed without much explanation. * mm/fetch-show-error-message-on-unadvertised-object: fetch-pack: add specific error for fetching an unadvertised object fetch_refs_via_pack: call report_unmatched_refs fetch-pack: move code to report unmatched refs to a function	2017-03-14 15:23:18 -07:00
Matt McCutchen	d56583ded6	fetch-pack: add specific error for fetching an unadvertised object Enhance filter_refs (which decides whether a request for an unadvertised object should be sent to the server) to record a new match status on the "struct ref" when a request is not allowed, and have report_unmatched_refs check for this status and print a special error message, "Server does not allow request for unadvertised object". Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:12:53 -08:00
Matt McCutchen	e860d96bf8	fetch-pack: move code to report unmatched refs to a function Prepare to reuse this code in transport.c for "git fetch". While we're here, internationalize the existing error message. Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:12:53 -08:00
Jeff King	41a078c60b	fetch-pack: cache results of for_each_alternate_ref We may run for_each_alternate_ref() twice, once in find_common() and once in everything_local(). This operation can be expensive, because it involves running a sub-process which must freshly load all of the alternate's refs from disk. Let's cache and reuse the results between the two calls. We can make some optimizations based on the particular use pattern in fetch-pack to keep our memory usage down. The first is that we only care about the sha1s, not the refs themselves. So it's OK to store only the sha1s, and to suppress duplicates. The natural fit would therefore be a sha1_array. However, sha1_array's de-duplication happens only after it has read and sorted all entries. It still stores each duplicate. For an alternate with a large number of refs pointing to the same commits, this is a needless expense. Instead, we'd prefer to eliminate duplicates before putting them in the cache, which implies using a hash. We can further note that fetch-pack will call parse_object() on each alternate sha1. We can therefore keep our cache as a set of pointers to "struct object". That gives us a place to put our "already seen" bit with an optimized hash lookup. And as a bonus, the object stores the sha1 for us, so pointer-to-object is all we need. There are two extra optimizations I didn't do here: - we actually store an array of pointer-to-object. Technically we could just walk the obj_hash table looking for entries with the ALTERNATE flag set (because our use case doesn't care about the order here). But that hash table may be mostly composed of non-ALTERNATE entries, so we'd waste time walking over them. So it would be a slight win in memory use, but a loss in CPU. - the items we pull out of the cache are actual "struct object"s, but then we feed "obj->sha1" to our sub-functions, which promptly call parse_object(). This second parse is cheap, because it starts with lookup_object() and will bail immediately when it sees we've already parsed the object. We could save the extra hash lookup, but it would involve refactoring the functions we call. It may or may not be worth the trouble. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -08:00
Jeff King	2429d63a46	for_each_alternate_ref: pass name/oid instead of ref struct Breaking down the fields in the interface makes it easier to change the backend of for_each_alternate_ref to something that doesn't use "struct ref" internally. The only field that callers actually look at is the oid, anyway. The refname is kept in the interface as a plausible thing for future code to want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -08:00
Ralf Thielow	dfbfb9f377	fetch-pack.c: correct command at the beginning of an error message One error message in fetch-pack.c uses 'git fetch_pack' at the beginning which is not a git command. Use 'git fetch-pack' instead. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-11 13:28:39 -08:00
Junio C Hamano	a460ea4a3c	Merge branch 'nd/shallow-deepen' The existing "git fetch --depth=<n>" option was hard to use correctly when making the history of an existing shallow clone deeper. A new option, "--deepen=<n>", has been added to make this easier to use. "git clone" also learned "--shallow-since=<date>" and "--shallow-exclude=<tag>" options to make it easier to specify "I am interested only in the recent N months worth of history" and "Give me only the history since that version". * nd/shallow-deepen: (27 commits) fetch, upload-pack: --deepen=N extends shallow boundary by N commits upload-pack: add get_reachable_list() upload-pack: split check_unreachable() in two, prep for get_reachable_list() t5500, t5539: tests for shallow depth excluding a ref clone: define shallow clone boundary with --shallow-exclude fetch: define shallow boundary with --shallow-exclude upload-pack: support define shallow boundary by excluding revisions refs: add expand_ref() t5500, t5539: tests for shallow depth since a specific date clone: define shallow clone boundary based on time with --shallow-since fetch: define shallow boundary with --shallow-since upload-pack: add deepen-since to cut shallow repos based on time shallow.c: implement a generic shallow boundary finder based on rev-list fetch-pack: use a separate flag for fetch in deepening mode fetch-pack.c: mark strings for translating fetch-pack: use a common function for verbose printing fetch-pack: use skip_prefix() instead of starts_with() upload-pack: move rev-list code out of check_non_tip() upload-pack: make check_non_tip() clean things up on error upload-pack: tighten number parsing at "deepen" lines ...	2016-10-10 14:03:50 -07:00
Junio C Hamano	b8688adb12	Merge branch 'rs/qsort' We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of the time third parameter is redundant. A new QSORT() macro lets us omit it. * rs/qsort: show-branch: use QSORT use QSORT, part 2 coccicheck: use --all-includes by default remove unnecessary check before QSORT use QSORT add QSORT	2016-10-10 14:03:46 -07:00
René Scharfe	9ed0d8d6e6	use QSORT Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-29 15:42:18 -07:00
Jonathan Tan	06b3d386e0	fetch-pack: do not reset in_vain on non-novel acks The MAX_IN_VAIN mechanism was introduced in commit `f061e5f` ("fetch-pack: give up after getting too many "ack continue"", 2006-05-24) to stop ref negotiation if a number of consecutive "have"s have been sent with no corresponding new acks. This is to stop the client from digging too deep in an irrelevant side branch in vain without ever finding a common ancestor. A use case (as described in that commit) is the scenario in which the local repository has more roots than the remote repository. However, during a negotiation in which stateless RPCs are used, MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario above and others) because in each new request, the client has to inform the server of objects it already has and knows the server has (to remind the server of the state), which the server then acks. Make fetch-pack only consider, as new acks for the purpose of MAX_IN_VAIN, acks for objects for which the client has never received an ack before in this session. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-23 12:37:45 -07:00
Jonathan Tan	da470981de	fetch-pack: grow stateless RPC windows exponentially When updating large repositories, the LARGE_FLUSH limit (that is, the limit at which the window growth strategy switches from exponential to linear) is reached quite quickly. Use a conservative exponential growth strategy when that limit is reached instead (and increase LARGE_FLUSH so that there is no regression in window size). This optimization is only applied during stateless RPCs to avoid the issue raised and fixed in commit `44d8dc54` (Fix potential local deadlock during fetch-pack, 2011-03-29). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-19 13:27:22 -07:00
Nguyễn Thái Ngọc Duy	cccf74e2da	fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more () parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. () We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	a45a260086	fetch: define shallow boundary with --shallow-exclude Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	508ea88226	fetch: define shallow boundary with --shallow-since Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	79891cb90a	fetch-pack: use a separate flag for fetch in deepening mode The shallow repo could be deepened or shortened when then user gives --depth. But in future that won't be the only way to deepen/shorten a repo. Stop relying on args->depth in this mode. Future deepening methods can simply set this flag on instead of updating all these if expressions. The new name "deepen" was chosen after the command to define shallow boundary in pack protocol. New commands also follow this tradition. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	1dd73e20d7	fetch-pack.c: mark strings for translating Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	0d789a5bc1	fetch-pack: use a common function for verbose printing This reduces the number of "if (verbose)" which makes it a bit easier to read imo. It also makes it easier to redirect all these printouts, to a file for example. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Jeff King	df85757244	fetch-pack: isolate sigpipe in demuxer thread In commit `9ff18fa` (fetch-pack: ignore SIGPIPE in sideband demuxer, 2016-02-24), we started using sigchain_push() to ignore SIGPIPE in the async demuxer thread. However, this is rather clumsy, as it ignores SIGPIPE for the entire process, including the main thread. At the time we didn't have any per-thread signal support, but we now we do. Let's use it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-20 13:33:56 -07:00
Jeff King	9ff18faf2f	fetch-pack: ignore SIGPIPE in sideband demuxer If the other side feeds us a bogus pack, index-pack (or unpack-objects) may die early, before consuming all of its input. As a result, the sideband demuxer may get SIGPIPE (racily, depending on whether our data made it into the pipe buffer or not). If this happens and we are compiled with pthread support, it will take down the main thread, too. This isn't the end of the world, as the main process will just die() anyway when it sees index-pack failed. But it does mean we don't get a chance to say "fatal: index-pack failed" or similar. And it also means that we racily fail t5504, as we sometimes die() and sometimes are killed by SIGPIPE. So let's ignore SIGPIPE while demuxing the sideband. We are already careful to check the return value of write(), so we won't waste time writing to a broken pipe. The caller will notice the error return from the async thread, though in practice we don't even get that far, as we die() as soon as we see that index-pack failed. The non-sideband case is already fine; we let index-pack read straight from the socket, so there is no SIGPIPE at all. Technically the non-threaded async case is also OK without this (the forked async process gets SIGPIPE), but it's not worth distinguishing from the threaded case here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-25 13:51:47 -08:00

1 2 3 4

199 commits