development/git - HydraGit

mirror of https://github.com/git/git synced 2024-08-27 19:49:21 +00:00

Author	SHA1	Message	Date
Junio C Hamano	d037212d97	Merge branch 'kn/for-all-refs' "git for-each-ref" learned "--include-root-refs" option to show even the stuff outside the 'refs/' hierarchy. * kn/for-all-refs: for-each-ref: add new option to include root refs ref-filter: rename 'FILTER_REFS_ALL' to 'FILTER_REFS_REGULAR' refs: introduce `refs_for_each_include_root_refs()` refs: extract out `loose_fill_ref_dir_regular_file()` refs: introduce `is_pseudoref()` and `is_headref()`	2024-03-05 09:44:44 -08:00
Junio C Hamano	661f379791	Merge branch 'pb/ort-make-submodule-conflict-message-an-advice' When a merge conflicted at a submodule, merge-ort backend used to unconditionally give a lengthy message to suggest how to resolve it. Now the message can be squelched as an advice message. * pb/ort-make-submodule-conflict-message-an-advice: merge-ort: turn submodule conflict suggestions into an advice	2024-03-05 09:44:43 -08:00
Junio C Hamano	53929db7c4	Merge branch 'jc/doc-compat-util' Clarify wording in the CodingGuidelines that requires <git-compat-util.h> to be the first header file. * jc/doc-compat-util: doc: clarify the wording on <git-compat-util.h> requirement	2024-03-05 09:44:43 -08:00
Junio C Hamano	e58a4de3bb	Merge branch 'sg/upload-pack-error-message-fix' An error message from "git upload-pack", which responds to "git fetch" requests, had a trialing NUL in it, which has been corrected. * sg/upload-pack-error-message-fix: upload-pack: don't send null character in abort message to the client	2024-03-05 09:44:43 -08:00
Junio C Hamano	d31a515e9c	Merge branch 'rs/submodule-prefix-simplify' Code simplification. * rs/submodule-prefix-simplify: submodule: use strvec_pushf() for --submodule-prefix	2024-03-05 09:44:43 -08:00
Junio C Hamano	b5111647cb	Merge branch 'rs/name-rev-with-mempool' Many small allocations "git name-rev" makes have been updated to allocate from a mem-pool. * rs/name-rev-with-mempool: name-rev: use mem_pool_strfmt() mem-pool: add mem_pool_strfmt()	2024-03-05 09:44:43 -08:00
Junio C Hamano	6f74483667	Merge branch 'rs/fetch-simplify-with-starts-with' Code simplification. * rs/fetch-simplify-with-starts-with: fetch: convert strncmp() with strlen() to starts_with()	2024-03-05 09:44:42 -08:00
Junio C Hamano	74522bbd98	Merge branch 'jk/reflog-special-cases-fix' The logic to access reflog entries by date and number had ugly corner cases at the boundaries, which have been cleaned up. * jk/reflog-special-cases-fix: read_ref_at(): special-case ref@{0} for an empty reflog get_oid_basic(): special-case ref@{n} for oldest reflog entry Revert "refs: allow @{n} to work with n-sized reflog"	2024-03-05 09:44:42 -08:00
Junio C Hamano	542d093b1d	Merge branch 'jc/no-include-of-compat-util-from-headers' Header file clean-up. * jc/no-include-of-compat-util-from-headers: compat: drop inclusion of <git-compat-util.h>	2024-03-05 09:44:42 -08:00
Junio C Hamano	d619abf7fa	Merge branch 'js/remove-cruft-files' Remove an empty file that shouldn't have been added in the first place. * js/remove-cruft-files: neue: remove a bogus empty file	2024-03-05 09:44:42 -08:00
Junio C Hamano	6249de53a3	Merge branch 'jk/textconv-cache-outside-repo-fix' The code incorrectly attempted to use textconv cache when asked, even when we are not running in a repository, which has been corrected. * jk/textconv-cache-outside-repo-fix: userdiff: skip textconv caching when not in a repository	2024-03-05 09:44:42 -08:00
Patrick Steinhardt	fcacc2b161	refs/reftable: track last log record name via strbuf The reflog iterator enumerates all reflogs known to a ref backend. In the "reftable" backend there is no way to list all existing reflogs directly. Instead, we have to iterate through all reflog entries and discard all those redundant entries for which we have already returned a reflog entry. This logic is implemented by tracking the last reflog name that we have emitted to the iterator's user. If the next log record has the same name we simply skip it until we find another record with a different refname. This last reflog name is stored in a simple C string, which requires us to free and reallocate it whenever we need to update the reflog name. Convert it to use a `struct strbuf` instead, which reduces the number of allocations. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 68,485 allocs, 68,363 frees, 256,234,072 bytes allocated Note that even after this change we still allocate quite a lot of data, even though the number of allocations does not scale with the number of log records anymore. This remainder comes mostly from decompressing the log blocks, where we decompress each block into newly allocated memory. This will be addressed at a later point in time. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:07 -08:00
Patrick Steinhardt	7b8abc4d8c	reftable/record: use scratch buffer when decoding records When decoding log records we need a temporary buffer to decode the reflog entry's name, mail address and message. As this buffer is local to the function we thus have to reallocate it for every single log record which we're about to decode, which is inefficient. Refactor the code such that callers need to pass in a scratch buffer, which allows us to reuse it for multiple decodes. This reduces the number of allocations when iterating through reflogs. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 1,068,485 allocs, 1,068,363 frees, 281,122,886 bytes allocated Note that this commit also drop some redundant calls to `strbuf_reset()` right before calling `decode_string()`. The latter already knows to reset the buffer, so there is no need for these. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Patrick Steinhardt	e0bd13beea	reftable/record: reuse message when decoding log records Same as the preceding commit we can allocate log messages as needed when decoding log records, thus further reducing the number of allocations. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 2,068,487 allocs, 2,068,365 frees, 305,122,946 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Patrick Steinhardt	193fcb3ff8	reftable/record: reuse refnames when decoding log records When decoding a log record we always reallocate their refname arrays. This results in quite a lot of needless allocation churn. Refactor the code to grow the array as required only. Like this, we should usually only end up reallocating the array a small handful of times when iterating over many refs. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 3,068,488 allocs, 3,068,366 frees, 307,122,961 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Patrick Steinhardt	01639ec148	reftable/record: avoid copying author info Each reflog entry contains information regarding the authorship of who has made the change. This authorship information is not the same as that of any of the commits that the reflog entry references, but instead corresponds to the local user that has executed the command. Thus, it is almost always the case that all reflog entries have the same author. We can make use of this fact when decoding reftable records: instead of freeing and then reallocating the authorship information of log records, we can special-case when the next record during an iteration has the exact same authorship as the preceding record. If so, then there is no need to reallocate the respective fields. This change results in two allocations less per log record that we're iterating over in the most common case. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 4,068,487 allocs, 4,068,365 frees, 332,011,793 bytes allocated An alternative would be to store the capacity of both name and email and then use `REFTABLE_ALLOC_GROW()` to conditionally reallocate the array. But reftable records are copied around quite a lot, and thus we need to be a bit mindful of the overall record size. Furthermore, a memory comparison should also be more efficient than having to copy over memory even if we wouldn't have to allocate a new array every time. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Patrick Steinhardt	87ff723018	reftable/record: convert old and new object IDs to arrays In `7af607c58d` (reftable/record: store "val1" hashes as static arrays, 2024-01-03) and `b31e3cc620` (reftable/record: store "val2" hashes as static arrays, 2024-01-03) we have converted ref records to store their object IDs in a static array. Convert log records to do the same so that their old and new object IDs are arrays, too. This change results in two allocations less per log record that we're iterating over. Before: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 8,068,495 allocs, 8,068,373 frees, 401,011,862 bytes allocated After: HEAP SUMMARY: in use at exit: 13,473 bytes in 122 blocks total heap usage: 6,068,489 allocs, 6,068,367 frees, 361,011,822 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Patrick Steinhardt	eea0d11d6d	refs/reftable: reload correct stack when creating reflog iter When creating a new reflog iterator, we first have to reload the stack that the iterator is being created. This is done so that any concurrent writes to the stack are reflected. But `reflog_iterator_for_stack()` always reloads the main stack, which is wrong. Fix this and reload the correct stack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-05 09:10:06 -08:00
Junio C Hamano	2efe7958d6	Merge branch 'ps/reftable-iteration-perf-part2' into ps/reftable-reflog-iteration-perf * ps/reftable-iteration-perf-part2: refs/reftable: precompute prefix length reftable: allow inlining of a few functions reftable/record: decode keys in place reftable/record: reuse refname when copying reftable/record: reuse refname when decoding reftable/merged: avoid duplicate pqueue emptiness check reftable/merged: circumvent pqueue with single subiter reftable/merged: handle subiter cleanup on close only reftable/merged: remove unnecessary null check for subiters reftable/merged: make subiters own their records reftable/merged: advance subiter on subsequent iteration reftable/merged: make `merged_iter` structure private reftable/pq: use `size_t` to track iterator index	2024-03-05 09:09:46 -08:00
Junio C Hamano	105ec9ae8d	clean: further clean-up of implementation around "--force" We clarified how "clean.requireForce" interacts with the "--dry-run" option in the previous commit, both in the implementation and in the documentation. Even when "git clean" (without other options) is required to be used with "--force" (i.e. either clean.requireForce is unset, or explicitly set to true) to protect end-users from casual invocation of the command by mistake, "--dry-run" does not require "--force" to be used, because it is already its own protection mechanism by being a no-op to the working tree files. The previous commit, however, missed another clean-up opportunity around the same area. Just like in the "--dry-run" mode, the command in the "--interactive" mode does not require "--force", either. This is because by going interactive and giving the end user one more chance to confirm, the mode itself is serving as its own protection mechanism. Let's take things one step further, and unify the code that defines interaction between "--force" and these two other options. Just like we added explanation for the reason why "--dry-run" does not honor "clean.requireForce", give an explanation for the reason why "--interactive" makes "clean.requireForce" to be ignored. Finally, add some tests to show the interaction between "--force" and "--interactive". We already have tests that show interaction between "--force" and "--dry-run", but didn't test "--interactive". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 14:05:13 -08:00
Patrick Steinhardt	43f70eaea0	refs/reftable: precompute prefix length We're recomputing the prefix length on every iteration of the ref iterator. Precompute it for another speedup when iterating over 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 100.3 ms ± 3.7 ms [User: 97.3 ms, System: 2.8 ms] Range (min … max): 97.5 ms … 139.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 95.8 ms ± 3.4 ms [User: 92.9 ms, System: 2.8 ms] Range (min … max): 93.0 ms … 121.9 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:58 -08:00
Patrick Steinhardt	f1bf54aee3	reftable: allow inlining of a few functions We have a few functions which are basically just accessors to structures. As those functions are executed inside the hot loop when iterating through many refs, the fact that they cannot be inlined is costing us some performance. Move the function definitions into their respective headers so that they can be inlined. This results in a performance improvement when iterating over 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 105.9 ms ± 3.6 ms [User: 103.0 ms, System: 2.8 ms] Range (min … max): 103.1 ms … 133.4 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 100.7 ms ± 3.4 ms [User: 97.8 ms, System: 2.8 ms] Range (min … max): 97.8 ms … 124.0 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:49 -08:00
Patrick Steinhardt	daf4f43d0d	reftable/record: decode keys in place When reading a record from a block, we need to decode the record's key. As reftable keys are prefix-compressed, meaning they reuse a prefix from the preceding record's key, this is a bit more involved than just having to copy the relevant bytes: we need to figure out the prefix and suffix lengths, copy the prefix from the preceding record and finally copy the suffix from the current record. This is done by passing three buffers to `reftable_decode_key()`: one buffer that holds the result, one buffer that holds the last key, and one buffer that points to the current record. The final key is then assembled by calling `strbuf_add()` twice to copy over the prefix and suffix. Performing two memory copies is inefficient though. And we can indeed do better by decoding keys in place. Instead of providing two buffers, the caller may only call a single buffer that is already pre-populated with the last key. Like this, we only have to call `strbuf_setlen()` to trim the record to its prefix and then `strbuf_add()` to add the suffix. This refactoring leads to a noticeable performance bump when iterating over 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 112.2 ms ± 3.9 ms [User: 109.3 ms, System: 2.8 ms] Range (min … max): 109.2 ms … 149.6 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 106.0 ms ± 3.5 ms [User: 103.2 ms, System: 2.7 ms] Range (min … max): 103.2 ms … 133.7 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.06 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:49 -08:00
Patrick Steinhardt	6620f9134c	reftable/record: reuse refname when copying Do the same optimization as in the preceding commit, but this time for `reftable_record_copy()`. While not as noticeable, it still results in a small speedup when iterating over 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 114.0 ms ± 3.8 ms [User: 111.1 ms, System: 2.7 ms] Range (min … max): 110.9 ms … 144.3 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 112.5 ms ± 3.7 ms [User: 109.5 ms, System: 2.8 ms] Range (min … max): 109.2 ms … 140.7 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.01 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:49 -08:00
Patrick Steinhardt	71d9a2e991	reftable/record: reuse refname when decoding When decoding a reftable record we will first release the user-provided record and then decode the new record into it. This is quite inefficient as we basically need to reallocate at least the refname every time. Refactor the function to start tracking the refname capacity. Like this, we can stow away the refname, release, restore and then grow the refname to the required number of bytes via `REFTABLE_ALLOC_GROW()`. This refactoring is safe to do because all functions that assigning to the refname will first call `reftable_ref_record_release()`, which will zero out the complete record after releasing memory. This change results in a nice speedup when iterating over 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 124.0 ms ± 3.9 ms [User: 121.1 ms, System: 2.7 ms] Range (min … max): 120.4 ms … 152.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 114.4 ms ± 3.7 ms [User: 111.5 ms, System: 2.7 ms] Range (min … max): 111.0 ms … 152.1 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.08 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Furthermore, with this change we now perform a mostly constant number of allocations when iterating. Before this change: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 1,006,620 allocs, 1,006,495 frees, 25,398,363 bytes allocated After this change: HEAP SUMMARY: in use at exit: 13,603 bytes in 125 blocks total heap usage: 6,623 allocs, 6,498 frees, 509,592 bytes allocated Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:40 -08:00
Patrick Steinhardt	080f8c4565	reftable/merged: avoid duplicate pqueue emptiness check When calling `merged_iter_next_void()` we first check whether the iter has been exhausted already. We already perform this check two levels down the stack in `merged_iter_next_entry()` though, which makes this check redundant. Now if this check was there to accelerate the common case it might have made sense to keep it. But the iterator being exhausted is rather the uncommon case because you can expect most reftable stacks to contain more than two refs. Simplify the code by removing the check. As `merged_iter_next_void()` is basically empty except for calling `merged_iter_next()` now, merge these two functions. This also results in a tiny speedup when iterating over many refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 125.6 ms ± 3.8 ms [User: 122.7 ms, System: 2.8 ms] Range (min … max): 122.4 ms … 153.4 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 124.0 ms ± 3.9 ms [User: 121.1 ms, System: 2.8 ms] Range (min … max): 120.1 ms … 156.4 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:40 -08:00
Patrick Steinhardt	f8c1a8e2e1	reftable/merged: circumvent pqueue with single subiter The merged iterator uses a priority queue to order records so that we can yielid them in the expected order. This priority queue of course comes with some overhead as we need to add, compare and remove entries in that priority queue. In the general case, that overhead cannot really be avoided. But when we have a single subiter left then there is no need to use the priority queue anymore because the order is exactly the same as what that subiter would return. While having a single subiter may sound like an edge case, it happens more frequently than one might think. In the most common scenario, you can expect a repository to have a single large table that contains most of the records and then a set of smaller tables which contain later additions to the reftable stack. In this case it is quite likely that we exhaust subiters of those smaller stacks before exhausting the large table. Special-case this and return records directly from the remaining subiter. This results in a sizeable speedup when iterating over 1m refs in a repository with a single table: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 135.4 ms ± 4.4 ms [User: 132.5 ms, System: 2.8 ms] Range (min … max): 131.0 ms … 166.3 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 126.3 ms ± 3.9 ms [User: 123.3 ms, System: 2.8 ms] Range (min … max): 122.7 ms … 157.0 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.07 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:40 -08:00
Patrick Steinhardt	3b6dd6ad1d	reftable/merged: handle subiter cleanup on close only When advancing one of the subiters fails we immediately release resources associated with that subiter. This is not necessary though as we will release these resources when closing the merged iterator anyway. Drop the logic and only release resources when the merged iterator is done. This is a mere cleanup that should help reduce the cognitive load when reading through the code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:39 -08:00
Patrick Steinhardt	2d71a1d4a2	reftable/merged: remove unnecessary null check for subiters Whenever we advance a subiter we first call `iterator_is_null()`. This is not needed though because we only ever advance subiters which have entries in the priority queue, and we do not end entries to the priority queue when the subiter has been exhausted. Drop the check as well as the now-unused function. This results in a surprisingly big speedup: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 138.1 ms ± 4.4 ms [User: 135.1 ms, System: 2.8 ms] Range (min … max): 133.4 ms … 167.3 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 134.4 ms ± 4.2 ms [User: 131.5 ms, System: 2.8 ms] Range (min … max): 130.0 ms … 164.0 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.03 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:39 -08:00
Patrick Steinhardt	bb2d6be4c1	reftable/merged: make subiters own their records For each subiterator, the merged table needs to track their current record. This record is owned by the priority queue though instead of by the merged iterator. This is not optimal performance-wise. For one, we need to move around records whenever we add or remove a record from the priority queue. Thus, the bigger the entries the more bytes we need to copy around. And compared to pointers, a reftable record is rather on the bigger side. The other issue is that this makes it harder to reuse the records. Refactor the code so that the merged iterator tracks ownership of the records per-subiter. Instead of having records in the priority queue, we can now use mere pointers to the per-subiter records. This also allows us to swap records between the caller and the per-subiter record instead of doing an actual copy via `reftable_record_copy_from()`, which removes the need to release the caller-provided record. This results in a noticeable speedup when iterating through many refs. The following benchmark iterates through 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 145.5 ms ± 4.5 ms [User: 142.5 ms, System: 2.8 ms] Range (min … max): 141.3 ms … 177.0 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 139.0 ms ± 4.7 ms [User: 136.1 ms, System: 2.8 ms] Range (min … max): 134.2 ms … 182.2 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.05 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) This refactoring also allows a subsequent refactoring where we start reusing memory allocated by the reftable records because we do not need to release the caller-provided record anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:39 -08:00
Patrick Steinhardt	aad8ad6fe1	reftable/merged: advance subiter on subsequent iteration When advancing the merged iterator, we pop the topmost entry from its priority queue and then advance the sub-iterator that the entry belongs to, adding the result as a new entry. This is quite sensible in the case where the merged iterator is used to actually iterate through records. But the merged iterator is also used when we look up a single record, only, so advancing the sub-iterator is wasted effort because we would never even look at the result. Instead of immediately advancing the sub-iterator, we can also defer this to the next iteration of the merged iterator by storing the intent-to-advance. This results in a small speedup when reading many records. The following benchmark creates 10000 refs, which will also end up with many ref lookups: Benchmark 1: update-ref: create many refs (revision = HEAD~) Time (mean ± σ): 337.2 ms ± 7.3 ms [User: 200.1 ms, System: 136.9 ms] Range (min … max): 329.3 ms … 373.2 ms 100 runs Benchmark 2: update-ref: create many refs (revision = HEAD) Time (mean ± σ): 332.5 ms ± 5.9 ms [User: 197.2 ms, System: 135.1 ms] Range (min … max): 327.6 ms … 359.8 ms 100 runs Summary update-ref: create many refs (revision = HEAD) ran 1.01 ± 0.03 times faster than update-ref: create many refs (revision = HEAD~) While this speedup alone isn't really worth it, this refactoring will also allow two additional optimizations in subsequent patches. First, it will allow us to special-case when there is only a single sub-iter left to circumvent the priority queue altogether. And second, it makes it easier to avoid copying records to the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:30 -08:00
Patrick Steinhardt	48929d2e47	reftable/merged: make `merged_iter` structure private The `merged_iter` structure is not used anywhere outside of "merged.c", but is declared in its header. Move it into the code file so that it is clear that its implementation details are never exposed to anything. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:30 -08:00
Patrick Steinhardt	5c11529c66	reftable/pq: use `size_t` to track iterator index The reftable priority queue is used by the merged iterator to yield records from its sub-iterators in the expected order. Each entry has a record corresponding to such a sub-iterator as well as an index that indicates which sub-iterator the record belongs to. But while the sub-iterators are tracked with a `size_t`, we store the index as an `int` in the entry. Fix this and use `size_t` consistently. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:19:30 -08:00
Ghanshyam Thakkar	8145a8fd02	setup: remove unnecessary variable The TODO comment suggested to heed core.bare from template config file if no command line override given. And the prev_bare_repository variable seems to have been placed for this sole purpose as it is not used anywhere else. However, it was clarified by Junio [1] that such values (including core.bare) are ignored intentionally and does not make sense to propagate them from template config to repository config. Also, the directories for the worktree and repository are already created, and therefore the bare/non-bare decision has already been made, by the point we reach the codepath where the TODO comment is placed. Therefore, prev_bare_repository does not have a usecase with/without supporting core.bare from template. And the removal of prev_bare_repository is safe as proved by the later part of the comment: "Unfortunately, the line above is equivalent to is_bare_repository_cfg = !work_tree; which ignores the config entirely even if no `--[no-]bare` command line option was present. To see why, note that before this function, there was this call: prev_bare_repository = is_bare_repository() expanding the right hand side: = is_bare_repository_cfg && !get_git_work_tree() = is_bare_repository_cfg && !work_tree note that the last simplification above is valid because nothing calls repo_init() or set_git_work_tree() between any of the relevant calls in the code, and thus the !get_git_work_tree() calls will return the same result each time. So, what we are interested in computing is the right hand side of the line of code just above this comment: prev_bare_repository \|\| !work_tree = is_bare_repository_cfg && !work_tree \|\| !work_tree = !work_tree because "A && !B \|\| !B == !B" for all boolean values of A & B." Therefore, remove the TODO comment and remove prev_bare_repository variable. Also, update relevant testcases and remove one redundant testcase. [1]: https://lore.kernel.org/git/xmqqjzonpy9l.fsf@gitster.g/ Helped-by: Elijah Newren <newren@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 10:18:31 -08:00
shejialuo	0332e813d6	t9117: prefer test_path_* helper functions test -(e\|d) does not provide a nice error message when we hit test failures, so use test_path_exists, test_path_is_dir instead. Signed-off-by: shejialuo <shejialuo@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-04 09:50:21 -08:00
Rubén Justo	1284f9cc11	completion: reflog subcommands and options Make generic the completion for reflog subcommands and its options. Note that we still need to special case the options for "show". Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 14:21:39 -08:00
Rubén Justo	476a236e72	completion: factor out __git_resolve_builtins We're going to use the result of "git xxx --git-completion-helper" not only for feeding COMPREPLY. Therefore, factor out the execution and the caching of its results in __gitcomp_builtin, to a new function __git_resolve_builtins. While we're here, move an important comment we have in the function to its header, so it gains visibility. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 14:21:39 -08:00
Rubén Justo	3fec482b5f	completion: introduce __git_find_subcommand Let's have a function to get the current subcommand when completing commands that follow the syntax: git <command> <subcommand> As a convenience, let's allow an optional "default subcommand" to be returned if none is found. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 14:21:38 -08:00
Rubén Justo	c689c38bc2	completion: reflog show <log-options> Let's add completion for <log-options> in "reflog show" so that the user can easily discover uses like: $ git reflog --since=1.day.ago Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 14:21:38 -08:00
Rubén Justo	85452a1d4b	completion: reflog with implicit "show" When no subcommand is specified to "reflog", we assume "show" [1]: $ git reflog -h usage: git reflog [show] [<log-options>] [<ref>] ... This implicit "show" is not being completed correctly: $ git checkout -b default $ git reflog def<TAB><TAB> ... no completion options ... The expected result is: $ git reflog default This happens because we're completing references after seeing a valid subcommand in the command line. This prevents the implicit "show" from working properly, but also introduces a new problem: it keeps offering subcommand options when the subcommand is implicit: $ git checkout -b explore $ git reflog default ex<TAB> ... $ git reflog default expire The expected result is: $ git reflog default explore To fix this, complete references even if no subcommand is present, or in other words when the subcommand is implicit "show". Also, only include completion options for subcommands when completing the right position in the command line. 1. `cf39f54efc` (git reflog show, 2007-02-08) Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 14:21:38 -08:00
Sergey Organov	12a4883feb	clean: improve -n and -f implementation and documentation What -n actually does in addition to its documented behavior is ignoring of configuration variable clean.requireForce, that makes sense provided -n prevents files removal anyway. So, first, document this in the manual, and then modify implementation to make this more explicit in the code. Improved implementation also stops to share single internal variable 'force' between command-line -f option and configuration variable clean.requireForce, resulting in more clear logic. Two error messages with slightly different text depending on if clean.requireForce was explicitly set or not, are merged into a single one. The resulting error message now does not mention -n as well, as it neither matches intended clean.requireForce usage nor reflects clarified implementation. Documentation of clean.requireForce is changed accordingly. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:50:04 -08:00
René Scharfe	28a92478b8	parse-options: rearrange long_name matching code Move the code for handling a full match of long_name first and get rid of negations. Reduce the indent of the code for matching abbreviations and remove unnecessary curly braces. Combine the checks for whether negation is allowed and whether arg is "n", "no" or "no-" because they belong together and avoid a continue statement. The result is shorter, more readable code. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:22 -08:00
René Scharfe	b1ce2b62fa	parse-options: normalize arg and long_name before comparison Strip "no-" from arg and long_name before comparing them. This way we no longer have to repeat the comparison with an offset of 3 for negated arguments. Note that we must not modify the "flags" value, which tracks whether arg is negated, inside the loop. When registering "--n", "--no" or "--no-" as abbreviation for any negative option, we used to OR it with OPT_UNSET and end the loop. We can simply hard-code OPT_UNSET and leave flags unchanged instead. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:22 -08:00
René Scharfe	0d8a3097c7	parse-options: detect ambiguous self-negation Git currently does not detect the ambiguity of an option that starts with "no" like --notes and its negated form if given just --n or --no. All Git commands with such options have other negatable options, and we detect the ambiguity with them, so that's currently only a potential problem for scripts that use git rev-parse --parseopt. Let's fix it nevertheless, as there's no need for that confusion. To detect the ambiguity we have to loosen the check in register_abbrev(), as an option is considered an alias of itself. Add non-matching negation flags as a criterion to recognize an option being ambiguous with its negated form. And we need to keep going after finding a non-negated option as an abbreviated candidate and perform the negation checks in the same loop. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:21 -08:00
René Scharfe	cb46c3faf8	parse-options: factor out register_abbrev() and struct parsed_option Add a function, register_abbrev(), for storing the necessary details for remembering an abbreviated and thus potentially ambiguous option. Call it instead of sharing the code using goto, to make the control flow more explicit. Conveniently collect these details in the new struct parsed_option to reduce the number of necessary function arguments. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:21 -08:00
René Scharfe	597f9d037d	parse-options: set arg of abbreviated option lazily Postpone setting the opt pointer until we're about to call get_value(), which uses it. There's no point in setting it eagerly for every abbreviated candidate option, which may turn out to be ambiguous. Removing this assignment from the loop doesn't noticeably improve the performance, but allows further simplification. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:21 -08:00
René Scharfe	289cb15541	parse-options: recognize abbreviated negated option with arg Giving an argument to an option that doesn't take one causes Git to report that error specifically: $ git rm --dry-run=bogus error: option `dry-run' takes no value The same is true when the option is negated or abbreviated: $ git rm --no-dry-run=bogus error: option `no-dry-run' takes no value $ git rm --dry=bogus error: option `dry-run' takes no value Not so when doing both, though: $ git rm --no-dry=bogus error: unknown option `no-dry=bogus' usage: git rm [-f \| --force] [-n] [-r] [--cached] [--ignore-unmatch] (Rest of the usage message omitted.) Improve consistency and usefulness of the error message by recognizing abbreviated negated options even if they have a (most likely bogus) argument. With this patch we get: $ git rm --no-dry=bogus error: option `no-dry-run' takes no value Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:49:21 -08:00
René Scharfe	6cf06e9c6e	t-ctype: avoid duplicating class names TEST_CTYPE_FUNC defines a function for testing a character classifier, TEST_CHAR_CLASS calls it, causing the class name to be mentioned twice. Avoid the need to define a class-specific function by letting TEST_CHAR_CLASS do all the work. This is done by using the internal functions test__run_begin() and test__run_end(), but they do exist to be used in test macros after all. Alternatively we could unroll the loop to provide a very long expression that tests all 256 characters and EOF and hand that to TEST, but that seems awkward and hard to read. No change of behavior or output intended. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:47:33 -08:00
René Scharfe	7a8d6c0a10	t-ctype: align output of i The unit test reports misclassified characters like this: # check "isdigit(i) == !!memchr("123456789", i, len)" failed at t/unit-tests/t-ctype.c:36 # left: 1 # right: 0 # i: 0x30 Reduce the indent of i to put its colon directly below the ones in the preceding lines for consistency. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:47:33 -08:00
René Scharfe	752cb6ef81	t-ctype: simplify EOF check EOF is not a member of any character class. If a classifier function returns a non-zero result for it, presumably by mistake, then the unit test check reports: # check "!iseof(EOF)" failed at t/unit-tests/t-ctype.c:53 # i: 0xffffffff (EOF) The numeric value of EOF is not particularly interesting in this context. Stop printing the second line. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-03-03 09:47:33 -08:00

1 2 3 4 5 ...

72646 commits