"git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.
* jz/apply-run-3way-first:
git-apply: try threeway first when "--3way" is used
A command such as `git push -qu origin feature` will print "Branch
'feature' set up to track remote branch 'feature' from 'origin'." even
when --quiet is passed. In this case it's because install_branch_config() is
always called with BRANCH_CONFIG_VERBOSE.
struct transport keeps track of the desired verbosity. Fix the above
issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that.
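A minimal sketch of the shape of such a fix (the branch and remote
name arguments are placeholders, and the check assumes the transport's
'verbose' field reflects the requested verbosity; this is illustrative,
not the literal patch):

    /* only announce the new upstream when the transport is not quiet */
    int flag = transport->verbose >= 0 ? BRANCH_CONFIG_VERBOSE : 0;

    install_branch_config(flag, local_branch, origin_name, remote_ref);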
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same "do not capitalize the first word" rule is applied to both
our patch titles and error messages, but the existing description
was fuzzy in two aspects.
* For error messages, it was not said that this was only about the
first word that begins the sentence.
* For both, it was not clear when a capital letter there was not an
error. We avoid capitalizing the first word when the only reason
you would capitalize it is because it happens to be the first
word in the sentence. If a proper noun, which is usually spelled
in capital letters, happens to come at the beginning of the
sentence, it should be kept in capital letters.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A sparse-index loads the name-hash data for its entries, including the
sparse-directory entries. If a caller asks for a path that is contained
within a sparse-directory entry, we need to expand to a full index and
recalculate the name hash table before returning the result. Insert
calls to expand_to_path() to protect against this case.
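Conceptually the protection is a call placed before the hashtable
lookup; a simplified sketch (path_known_to_index() is a hypothetical
wrapper for illustration, not the literal call sites), assuming
expand_to_path() is a no-op when the index is already full:

    static int path_known_to_index(struct index_state *istate,
                                   const char *name, int namelen, int icase)
    {
            /* expand the sparse index if 'name' is inside a sparse directory */
            expand_to_path(istate, name, namelen, icase);

            /* the name-hash table now covers 'name' if it is tracked */
            return index_file_exists(istate, name, namelen, icase) != NULL;
    }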
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users of the index API have a specific path they are looking for,
but choose to use index_file_exists() to rely on the name-hash hashtable
instead of doing binary search with index_name_pos(). These users only
need to know a yes/no answer, not a position within the cache array.
When the index is sparse, the name-hash hash table does not contain the
full list of paths within sparse directories. It _does_ contain the
directory names for the sparse-directory entries.
Create a helper function, expand_to_path(), for intended use with the
name-hash hashtable functions. The integration with name-hash.c will
follow in a later change.
The solution here is to use ensure_full_index() when we determine that
the requested path is within a sparse directory entry. This will
populate the name-hash hashtable as the index is recomputed from
scratch.
There may be cases where the caller is trying to find an untracked path
that is not in the index but also is not within a sparse directory
entry. We want to minimize the overhead for these requests. If we used
index_name_pos() to find the position at which the path would be
inserted, then we could determine from that position whether a sparse
directory exists. (In fact,
just calling index_name_pos() in that case would lead to expanding the
index to a full index.) However, this takes O(log N) time where N is the
number of cache entries.
To keep the performance of this call based mostly on the input string,
use index_file_exists() to look for the ancestors of the path. Using the
heuristic that a sparse directory is likely to have a small number of
parent directories, we start from the bottom and build up. Use a string
buffer to allow mutating the path name to terminate after each slash for
each hashset test.
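A simplified sketch of that ancestor walk (this version builds the
parent paths up in a strbuf rather than mutating the input in place,
and reduces the error handling of the real code):

    void expand_to_path(struct index_state *istate,
                        const char *path, size_t pathlen, int icase)
    {
            struct strbuf parent = STRBUF_INIT;
            size_t pos = 0;

            if (!istate->sparse_index)
                    return;                 /* already a full index */

            while (pos < pathlen) {
                    const char *slash = memchr(path + pos, '/', pathlen - pos);
                    struct cache_entry *ce;

                    if (!slash)
                            break;          /* no more parent directories */

                    /* "dir/", then "dir/subdir/", including the slash */
                    strbuf_add(&parent, path + pos, slash - (path + pos) + 1);
                    pos = slash - path + 1;

                    ce = index_file_exists(istate, parent.buf, parent.len, icase);
                    if (ce && S_ISSPARSEDIR(ce->ce_mode)) {
                            /* the request falls inside a sparse directory */
                            ensure_full_index(istate);
                            break;
                    }
            }
            strbuf_release(&parent);
    }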
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sparse directory entries represent directories that are outside the
sparse-checkout definition. These are not paths to blobs, so they
should not be added to the name_hash table. Instead, they should be
added to the directory hashtable when 'ignore_case' is true.
Add a condition to avoid placing sparse directories into the name_hash
hashtable. This avoids filling the table with extra entries that will
never be queried.
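A rough sketch of where that condition lands when an entry is hashed
(close in spirit to name-hash.c, but abbreviated and not the exact
patch):

    static void hash_index_entry(struct index_state *istate,
                                 struct cache_entry *ce)
    {
            if (ce->ce_flags & CE_HASHED)
                    return;
            ce->ce_flags |= CE_HASHED;

            /* sparse directories never enter the name_hash table */
            if (!S_ISSPARSEDIR(ce->ce_mode)) {
                    hashmap_entry_init(&ce->ent,
                                       memihash(ce->name, ce_namelen(ce)));
                    hashmap_add(&istate->name_hash, &ce->ent);
            }

            if (ignore_case)
                    add_dir_entry(istate, ce);
    }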
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all index entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior. This case could
be integrated later by ensuring that we walk the tree in the
sparse-directory entry, but the current behavior expects only blobs.
Save this integration for later, when it can be properly tested.
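The guard itself is a single call placed before the loop; the
recurring pattern used throughout this series looks roughly like:

    int i;

    /* expand a sparse index to a full one (no-op when already full) */
    ensure_full_index(&the_index);

    for (i = 0; i < the_index.cache_nr; i++) {
            struct cache_entry *ce = the_index.cache[i];

            /* ... per-entry work that expects blob entries only ... */
    }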
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before we iterate over all cache entries, ensure that the index is not
sparse. It might be safe for this loop in checkout_all() to iterate
over a sparse index, but let's put this protection here until it can
be carefully tested.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Soon we will insert ensure_full_index() calls across the codebase.
Instead of also adding include statements for sparse-index.h, let's just
use the fact that anything that cares about the index already has
cache.h in its includes.
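In other words, the prototype only needs to live somewhere every index
consumer already sees; a sketch of the kind of declaration involved
(placement shown is illustrative):

    /* in cache.h, next to the other index_state operations */
    struct index_state;
    void ensure_full_index(struct index_state *istate);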
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers to index_name_pos() or index_name_stage_pos() have a specific
path in mind. If that happens to be a path with an ancestor being a
sparse-directory entry, it can lead to unexpected results.
In the case that we did not find the requested path, check to see if the
position _before_ the inserted position is a sparse directory entry that
matches the initial segment of the input path (including the directory
separator at the end of the directory name). If so, then expand the
index to be a full index and search again. This expansion will only
happen once per index read.
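A simplified sketch of that fallback (the real code is factored
differently; 'pos' here is the would-be insertion position reported by
the failed binary search):

    if (istate->sparse_index && pos > 0) {
            struct cache_entry *prev = istate->cache[pos - 1];

            if (S_ISSPARSEDIR(prev->ce_mode) &&
                ce_namelen(prev) < namelen &&
                !strncmp(prev->name, name, ce_namelen(prev))) {
                    /*
                     * 'name' lies inside this sparse directory (the
                     * entry's name ends with '/'), so expand to a full
                     * index (at most once per read) and search again.
                     */
                    ensure_full_index(istate);
                    return index_name_stage_pos(istate, name, namelen, stage);
            }
    }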
Future enhancements could be more careful to expand only the necessary
sparse directory entry, but then we would have a special "not fully
sparse, but also not fully expanded" mode that could affect writing the
index to file. Since this only occurs if a specific file is requested
outside of the sparse checkout definition, this is unlikely to be a
common situation.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.
This change removes only the 'const' qualifiers whose removal is
necessary for the following change, which will actually modify the
implementation of index_name_stage_pos().
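For example (illustrative, not the full list of affected declarations):

    /* before: promises not to modify the index */
    int index_name_stage_pos(const struct index_state *istate,
                             const char *name, int namelen, int stage);

    /* after: the lookup may expand a sparse index as a side effect */
    int index_name_stage_pos(struct index_state *istate,
                             const char *name, int namelen, int stage);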
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Edit and expand the sparse-index design document with the plan for
guarding index operations with ensure_full_index().
Notably, the plan has changed to not have an expand_to_path() method in
favor of checking for a sparse-directory hit inside of the
index_name_pos() API.
The changes that follow this one will incrementally add
ensure_full_index() guards to iterations over all cache entries. Some
iterations over the cache entries are not protected due to a few
categories listed in the document. Since these are not being modified,
here is a short list of the files and methods that will not receive
these guards:
Looking for non-zero stage:
* builtin/add.c:chmod_pathspec()
* builtin/merge.c:count_unmerged_entries()
* merge-ort.c:record_conflicted_index_entries()
* read-cache.c:unmerged_index()
* rerere.c:check_one_conflict(), find_conflict(), rerere_remaining()
* revision.c:prepare_show_merge()
* sequencer.c:append_conflicts_hint()
* wt-status.c:wt_status_collect_changes_initial()
Looking for submodules:
* builtin/submodule--helper.c:module_list_compute()
* submodule.c: several methods
* worktree.c:validate_no_submodules()
Part of the index API:
* name-hash.c: lazy init methods
* preload-index.c:preload_thread(), preload_index()
* read-cache.c: file format methods
Checking for correct order of cache entries:
* read-cache.c:check_ce_order()
Ignores SKIP_WORKTREE entries or already aware:
* unpack-trees.c:mark_new_skip_worktree()
* wt-status.c:wt_status_check_sparse_checkout()
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.
* vs/completion-with-set-u:
completion: audit and guard $GIT_* against unset use
"gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.
* gk/gitweb-redacted-email:
gitweb: add "e-mail privacy" feature to redact e-mail addresses
Clean up the codepaths that implement the "git send-email --validate"
option and improve the messages from it.
* ab/send-email-validate-errors:
git-send-email: improve --validate error output
git-send-email: refactor duplicate $? checks into a function
git-send-email: test full --validate output
Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.
* tb/precompose-prefix-simplify:
macOS: precompose startup_info->prefix
precompose_utf8: make precompose_string_if_needed() public
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* tb/pack-preferred-tips-to-give-bitmap:
builtin/pack-objects.c: respect 'pack.preferBitmapTips'
t/helper/test-bitmap.c: initial commit
pack-bitmap: add 'test_bitmap_commits()' helper
A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.
* jk/ref-filter-segfault-fix:
ref-filter: fix NULL check for parse object failure
Correct documentation added in e544221d97 (trace2:
Documentation/technical/api-trace2.txt, 2019-02-22) to state that
calling BUG() also emits an "error" event. See ee4512ed48 (trace2:
create new combined trace facility, 2019-02-22) for the initial
implementation.
At the time, however, the BUG() function did not emit an event; that
only changed later in 0a9dde4a04 (usage: trace2 BUG() invocations,
2021-02-05). That commit changed the code, but didn't update any of
the docs.
Let's also add a cross-reference from api-error-handling.txt.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the BUG() function was added in d8193743e0 (usage.c: add BUG()
function, 2017-05-12) these docs added in 1f23cfe0ef (doc: document
error handling functions and conventions, 2014-12-03) were not
updated. Let's do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ee4512ed48 (trace2: create new combined trace facility,
2019-02-22) we started with two copies of this comment;
0ee10fd129 (usage: add trace2 entry upon warning(), 2020-11-23) added
a third. Let's instead add a single earlier comment that applies to
all of these mostly-the-same functions.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finish the removal I started in 1108cea7f8 (tests: remove most uses
of test_i18ncmp, 2021-02-11). At that time the function wasn't removed
due to disruption with in-flight changes; remove the occurrences that
have landed since then.
As of writing this there are no test_i18ncmp uses between "master" and
"seen", so let's also remove the function to finally put it to rest.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --exclude-promisor-objects is given, before traversing any objects
we iterate over all of the objects in any promisor packs, marking them
as UNINTERESTING and SEEN. We turn the oid we get from iterating the
pack into an object with parse_object(), but this has two problems:
- it's slow; we are zlib inflating (and reconstructing from deltas)
every byte of every object in the packfile
- it leaves the tree buffers attached to their structs, which means
our heap usage will grow to store every uncompressed tree
simultaneously. This can be gigabytes.
We can obviously fix the second by freeing the tree buffers after we've
parsed them. But we can observe that the function doesn't look at the
object contents at all! The only reason we call parse_object() is that
we need a "struct object" on which to set the flags. There are two
options here:
- we can look up just the object type via oid_object_info(), and then
call the appropriate lookup_foo() function
- we can call lookup_unknown_object(), which gives us an OBJ_NONE
struct (which will get auto-converted later by object_as_type() via
calls to lookup_commit(), etc).
The first one is closer to the current code, but we do pay the price to
look up the type for each object. The latter should be more efficient in
CPU, though it wastes a little bit of memory (the "unknown" object
structs are a union of all object types, so some of the structs are
bigger than they need to be). It also runs the risk of triggering a
latent bug in code that calls lookup_object() directly but isn't ready
to handle OBJ_NONE (such code would already be buggy, but we use
lookup_unknown_object() infrequently enough that it might be hiding).
I went with the second option here. I don't think the risk is high (and
we'd want to find and fix any such bugs anyway), and it should be more
efficient overall.
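A sketch of the chosen approach inside the callback that
for_each_packed_object() invokes for every promisor-pack object
(simplified; names are close to, but not necessarily identical to, the
real code, and with the first option the lookup_unknown_object() call
would instead become oid_object_info() plus the matching
lookup_commit()/lookup_tree()/etc.):

    static int mark_uninteresting(const struct object_id *oid,
                                  struct packed_git *pack, uint32_t pos,
                                  void *cb_data)
    {
            struct rev_info *revs = cb_data;
            /*
             * No parsing needed; we only want a struct object to carry
             * the flags. The OBJ_NONE struct returned here is converted
             * lazily by later lookup_commit()/lookup_tree() calls.
             */
            struct object *obj = lookup_unknown_object(revs->repo, oid);

            obj->flags |= UNINTERESTING | SEEN;
            return 0;
    }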
The new tests in p5600 show off the improvement (this is on git.git):
Test                                 HEAD^               HEAD
-------------------------------------------------------------------------------
5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%
The improvement is particularly big in this script because _every_
object in the newly-cloned partial repo is a promisor object. So after
marking them all, there's nothing left to traverse.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.
We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).
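With the conversion, a typical call site simply spells out the
repository it already meant (sketch):

    struct object *lookup_unknown_object(struct repository *r,
                                         const struct object_id *oid);

    /* existing callers pass the_repository explicitly for now */
    obj = lookup_unknown_object(the_repository, &oid);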
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To get the list of all promisor objects, we not only include all objects
in promisor packs, but also parse each of those objects to see which
objects they reference. After parsing a tree object, the tree->buffer
field will remain populated until we explicitly free it. So in a partial
clone of blob:none, for example, we are essentially reading every tree
in the repository (since they're all in the initial promisor pack), and
keeping all of their uncompressed contents in memory at once.
This patch frees the tree buffers after we've finished marking all of
their reachable objects. We shouldn't need to do this for any other
object type. While we are using some extra memory to store the structs,
no other object type stores the whole contents in its parsed form (we do
sometimes hold on to commit buffers, but less so these days due to
commit graphs, plus most commands which care about promisor objects turn
off the save_commit_buffer global).
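A sketch of the idea, right after a tree's entries have been marked
(simplified from the actual traversal code):

    /*
     * We have walked tree->buffer to mark everything it references;
     * the uncompressed contents are no longer needed, so drop them
     * rather than keeping every tree in memory at once.
     */
    if (obj->type == OBJ_TREE)
            free_tree_buffer((struct tree *)obj);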
Even for a moderate-sized repository like git.git, this patch drops the
peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB.
Fsck is a good candidate for measuring here because it doesn't interact
with the promisor code except to call is_promisor_object(), so we can
isolate just this problem.
The added perf test shows only a tiny improvement on my machine for
git.git, since 1.7GB isn't enough to cause any real memory pressure:
Test           HEAD^               HEAD
--------------------------------------------------------------------------------
5600.4: fsck   21.26(20.90+0.35)   20.84(20.79+0.04) -2.0%
With linux.git the absolute change is a bit bigger, though still a small
percentage:
Test           HEAD^                 HEAD
-----------------------------------------------------------------------------
5600.4: fsck   262.26(259.13+3.12)   254.92(254.62+0.29) -2.8%
I didn't have the patience to run it under massif with linux.git, but
it's probably on the order of about 14GB improvement, since that's the
sum of the sizes of all of the uncompressed trees (but still isn't
enough to create memory pressure on this particular machine, which has
64GB of RAM). Smaller machines would probably see a bigger effect on
runtime (and sadly our perf suite does not measure peak heap).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref backend API uses errno as a sideband error channel.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new method uses the update_index counter, which isn't susceptible to clock
inaccuracies.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.
After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).
But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.
This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:
- prior to the pack-reuse fixes from a14aebeac3 (Merge branch
'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
never kicked in anyway
- the output looks _kind of_ correct. The "backwards" moment is hard
to catch, because we overwrite the old progress number with the new
one, and the larger number is displayed only for a second. So unless
you look at that exact second, you just see the much smaller value,
counting up to the number of non-reused objects (though of course if
you catch it in stderr, or look at GIT_TRACE_PACKET from a server
with bitmaps, you can see both values).
This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.
We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.
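A sketch of the fix in the reuse path (variable names follow
builtin/pack-objects.c; the exact placement is simplified):

    if (reuse_packfile_objects) {
            nr_result += reuse_packfile_objects;
            /*
             * Count the reused objects in the same meter that the rest
             * of the enumeration uses, so the display never goes
             * backwards when we start filling to_pack.objects.
             */
            nr_seen += reuse_packfile_objects;
            display_progress(progress_state, nr_seen);
    }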
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>