development/git - HydraGit

mirror of https://github.com/git/git synced 2024-09-17 23:41:33 +00:00

Author	SHA1	Message	Date
Victoria Dye	2c66a7c8ce	read-tree: integrate with sparse index Enable use of sparse index in 'git read-tree'. The integration in this patch is limited only to usage of 'read-tree' that does not need additional functional changes for the sparse index to behave as expected (i.e., produce the same user-facing results as a non-sparse index sparse-checkout). To ensure no unexpected behavior occurs, the index is explicitly expanded when: * '--no-sparse-checkout' is specified (because it disables sparse-checkout) * '--prefix' is specified (if the prefix is inside a sparse directory, the prefixed tree cannot be properly traversed) * two or more <tree-ish> arguments are specified ('twoway_merge' and 'threeway_merge' do not yet support merging sparse directories) Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:36:01 -08:00
Victoria Dye	14bf38cfcf	read-tree: expand sparse checkout test coverage Add tests focused on how 'git read-tree' behaves in sparse checkouts. Extra emphasis is placed on interactions with files outside the sparse cone, e.g. merges with out-of-cone conflicts. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:36:01 -08:00
Victoria Dye	cc89331ddc	read-tree: explicitly disallow prefixes with a leading '/' Exit with an error if a prefix provided to `git read-tree --prefix` begins with '/'. In most cases, prefixes like this result in an "invalid path" error; however, the repository root would be interpreted as valid when specified as '--prefix=/'. This is due to leniency around trailing directory separators on prefixes (e.g., allowing both '--prefix=my-dir' and '--prefix=my-dir/') - the '/' in the prefix is actually the trailing slash, although it could be misinterpreted as a leading slash. To remove the confusing repo root-as-'/' case and make it clear that prefixes should not begin with '/', exit with an error if the first character of the provided prefix is '/'. Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:36:00 -08:00
Victoria Dye	2c521b0e49	status: fix nested sparse directory diff in sparse index Enable the 'recursive' diff option for the diff executed as part of 'git status'. Without the 'recursive' enabled, 'git status' reports index changes incorrectly when the following conditions were met: * sparse index is enabled * there is a difference between the index and HEAD in a file inside a subdirectory of a sparse directory * the sparse directory index entry is not expanded in-core Because it is not recursive by default, the diff in 'git status' reports changes only at the level of files and directories that are immediate children of a sparse directory, rather than recursing into directories with changes to identify the modified file(s). As a result, 'git status' reports the immediate subdirectory itself as "modified". Example: $ git init $ mkdir -p sparse/sub $ echo test >sparse/sub/foo $ git add . $ git commit -m "commit 1" $ echo somethingelse >sparse/sub/foo $ git add . $ git commit -a -m "commit 2" $ git sparse-checkout set --cone --sparse-index 'sparse' $ git reset --soft HEAD~1 $ git status On branch master You are in a sparse checkout. Changes to be committed: (use "git restore --staged <file>..." to unstage) modified: sparse/sub Enabling the 'recursive' diff option in 'wt_status_collect_changes_index' corrects this issue by allowing the diff to recurse into subdirectories of sparse directories to find modified files. Given the same repository setup as the example above, the corrected result of `git status` is: $ git status On branch master You are in a sparse checkout. Changes to be committed: (use "git restore --staged <file>..." to unstage) modified: sparse/sub/foo Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:36:00 -08:00
Victoria Dye	287fd17e3a	sparse-index: prevent repo root from becoming sparse Prevent the repository root from being collapsed into a sparse directory by treating an empty path as "inside the sparse-checkout". When collapsing a sparse index (e.g. in 'git sparse-checkout reapply'), the root directory typically could not become a sparse directory due to the presence of in-cone root-level files and directories. However, if no such in-cone files or directories were present, there was no explicit check signaling that the "repository root path" (an empty string, in the case of 'convert_to_sparse(...)') was in-cone, and a sparse directory index entry would be created from the repository root directory. The documentation in Documentation/git-sparse-checkout.txt explicitly states that the files in the root directory are expected to be in-cone for a cone-mode sparse-checkout. Collapsing the root into a sparse directory entry violates that assumption, as sparse directory entries are expected to be outside the sparse cone and have SKIP_WORKTREE enabled. This invalid state in turn causes issues with commands that interact with the index, e.g. 'git status'. Treating an empty (root) path as in-cone prevents the creation of a root sparse directory in 'convert_to_sparse(...)'. Because the repository root is otherwise never compared with sparse patterns (in both cone-mode and non-cone sparse-checkouts), the new check does not cause additional changes to how sparse patterns are applied. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:36:00 -08:00
Derrick Stolee	c8d67b9a68	commit-graph: fix generation number v2 overflow values The Generation Data Chunk was implemented and tested in `e8b63005c` (commit-graph: implement generation data chunk, 2021-01-16), but the test was carefully constructed to work on systems with 32-bit dates. Since the corrected commit date offsets still required more than 31 bits, this triggered writing the generation_data_overflow chunk. However, upon closer look, the write_graph_chunk_generation_data_overflow() method writes the offsets to the chunk (as dictated by the format) but fill_commit_graph_info() treats the value in the chunk as if it is the full corrected commit date (not an offset). For some reason, this does not cause an issue when using the FUTURE_DATE specified in t5318-commit-graph.sh, but it does show up as a failure in 'git commit-graph verify' if we increase that FUTURE_DATE to be above four billion. Fix this error and create a 64-bit timestamp version of the test so we can test these larger values. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:15:06 -08:00
Derrick Stolee	3b0199d4c3	commit-graph: start parsing generation v2 (again) The 'read_generation_data' member of 'struct commit_graph' was introduced by `1fdc383c5` (commit-graph: use generation v2 only if entire chain does, 2021-01-16). The intention was to avoid using corrected commit dates if not all layers of a commit-graph had that data stored. The logic in validate_mixed_generation_chain() at that point incorrectly initialized read_generation_data to 1 if and only if the tip commit-graph contained the Corrected Commit Date chunk. This was "fixed" in `448a39e65` (commit-graph: validate layers for generation data, 2021-02-02) to validate that read_generation_data was either non-zero for all layers, or it would set read_generation_data to zero for all layers. The problem here is that read_generation_data is not initialized to be non-zero anywhere! This change initializes read_generation_data immediately after the chunk is parsed, so each layer will have its value present as soon as possible. The read_generation_data member is used in fill_commit_graph_info() to determine if we should use the corrected commit date or the topological levels stored in the Commit Data chunk. Due to this bug, all previous versions of Git were defaulting to topological levels in all cases! This can be measured with some performance tests. Using the Linux kernel as a testbed, I generated a complete commit-graph containing corrected commit dates and tested the 'new' version against the previous, 'old' version. First, rev-list with --topo-order demonstrates a 26% improvement using corrected commit dates: hyperfine \ -n "old" "$OLD_GIT rev-list --topo-order -1000 v3.6" \ -n "new" "$NEW_GIT rev-list --topo-order -1000 v3.6" \ --warmup=10 Benchmark 1: old Time (mean ± σ): 57.1 ms ± 3.1 ms Range (min … max): 52.9 ms … 62.0 ms 55 runs Benchmark 2: new Time (mean ± σ): 45.5 ms ± 3.3 ms Range (min … max): 39.9 ms … 51.7 ms 59 runs Summary 'new' ran 1.26 ± 0.11 times faster than 'old' These performance improvements are due to the algorithmic improvements given by walking fewer commits due to the higher cutoffs from corrected commit dates. However, this comes at a cost. The additional I/O cost of parsing the corrected commit dates is visible in case of merge-base commands that do not reduce the overall number of walked commits. hyperfine \ -n "old" "$OLD_GIT merge-base v4.8 v4.9" \ -n "new" "$NEW_GIT merge-base v4.8 v4.9" \ --warmup=10 Benchmark 1: old Time (mean ± σ): 110.4 ms ± 6.4 ms Range (min … max): 96.0 ms … 118.3 ms 25 runs Benchmark 2: new Time (mean ± σ): 150.7 ms ± 1.1 ms Range (min … max): 149.3 ms … 153.4 ms 19 runs Summary 'old' ran 1.36 ± 0.08 times faster than 'new' Performance issues like this are what motivated `702110aac` (commit-graph: use config to specify generation type, 2021-02-25). In the future, we could fix this performance problem by inserting the corrected commit date offsets into the Commit Date chunk instead of having that data in an extra chunk. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:15:06 -08:00
Derrick Stolee	75979d9460	commit-graph: fix ordering bug in generation numbers When computing the generation numbers for a commit-graph, we compute the corrected commit dates and then check if their offsets from the actual dates is too large to fit in the 32-bit Generation Data chunk. However, there is a problem with this approach: if we have parsed the generation data from the previous commit-graph, then we continue the loop because the corrected commit date is already computed. This causes an under-count in the number of overflow values. It is incorrect to add an increment to num_generation_data_overflows next to this 'continue' statement, because we might start double-counting commits that are computed because of the depth-first search walk from a commit with an earlier OID. Instead, iterate over the full commit list at the end, checking the offsets to see how many grow beyond the maximum value. Create a new t5328-commit-graph-64-bit-time.sh test script to handle special cases of testing 64-bit timestamps. This helps demonstrate this bug in more cases. It still won't hit all potential cases until the next change, which reenables reading generation numbers. Use the skip_all trick from `0a2bfccb9c` (t0051: use "skip_all" under !MINGW in single-test file, 2022-02-04) to make the output clean when run on a 32-bit system. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:14:57 -08:00
Derrick Stolee	17925e0602	t5318: extract helpers to lib-commit-graph.sh The graph_git_behavior helper is useful for testing that certain Git commands behave the same when using the commit-graph and when not using the commit-graph. Extract it to a new lib-commit-graph.sh file for use in new test scripts that will split out from t5318. While doing this extraction, also extract graph_read_expect and the logic for priming the test_oid_cache. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:09:55 -08:00
Derrick Stolee	c78c7a959c	test-read-graph: include extra post-parse info It can be helpful to verify that the 'struct commit_graph' that results from parsing a commit-graph is correctly structured. The existence of different chunks is not enough to verify that all of the optional features are correctly enabled. Update 'test-tool read-graph' to output an "options:" line that includes information for different parts of the struct commit_graph. In particular, this change demonstrates that the read_generation_data option is never being enabled, which will be fixed in a later change. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 12:09:55 -08:00
Patrick Steinhardt	0a7b38707d	refs/files-backend: optimize reading of symbolic refs When reading references via `files_read_raw_ref()` we always consult both the loose reference, and if that wasn't found, we also consult the packed-refs file. While this makes sense to read a generic reference, it is wasteful in the case where we only care about symbolic references: the packed-refs backend does not support them, and thus it cannot ever return one for us. Special-case reading of symbolic references for the files backend such that we always skip asking the packed-refs backend. We use `refs_read_symbolic_ref()` extensively to determine whether we need to skip updating local symbolic references during a fetch, which is why the change results in a significant speedup when doing fetches in repositories with huge numbers of references. The following benchmark executes a mirror-fetch in a repository with about 2 million references via `git fetch --prune --no-write-fetch-head +refs/:refs/`: Benchmark 1: HEAD~ Time (mean ± σ): 68.372 s ± 2.344 s [User: 65.629 s, System: 8.786 s] Range (min … max): 65.745 s … 70.246 s 3 runs Benchmark 2: HEAD Time (mean ± σ): 60.259 s ± 0.343 s [User: 61.019 s, System: 7.245 s] Range (min … max): 60.003 s … 60.649 s 3 runs Summary 'HEAD' ran 1.13 ± 0.04 times faster than 'HEAD~' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:46 -08:00
Patrick Steinhardt	1553f5e76c	remote: read symbolic refs via `refs_read_symbolic_ref()` We have two cases in the remote code where we check whether a reference is symbolic or not, but don't mind in case it doesn't exist or in case it exists but is a non-symbolic reference. Convert these two callsites to use the new `refs_read_symbolic_ref()` function, whose intent is to implement exactly that usecase. No change in behaviour is expected from this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:46 -08:00
Patrick Steinhardt	cd475b3b03	refs: add ability for backends to special-case reading of symbolic refs Reading of symbolic and non-symbolic references is currently treated the same in reference backends: we always call `refs_read_raw_ref()` and then decide based on the returned flags what type it is. This has one downside though: symbolic references may be treated different from normal references in a backend from normal references. The packed-refs backend for example doesn't even know about symbolic references, and as a result it is pointless to even ask it for one. There are cases where we really only care about whether a reference is symbolic or not, but don't care about whether it exists at all or may be a non-symbolic reference. But it is not possible to optimize for this case right now, and as a consequence we will always first check for a loose reference to exist, and if it doesn't, we'll query the packed-refs backend for a known-to-not-be-symbolic reference. This is inefficient and requires us to search all packed references even though we know to not care for the result at all. Introduce a new function `refs_read_symbolic_ref()` which allows us to fix this case. This function will only ever return symbolic references and can thus optimize for the scenario layed out above. By default, if the backend doesn't provide an implementation for it, we just use the old code path and fall back to `read_raw_ref()`. But in case the backend provides its own, more efficient implementation, we will use that one instead. Note that this function is explicitly designed to not distinguish between missing references and non-symbolic references. If it did, we'd be forced to always search the packed-refs backend to see whether the symbolic reference the user asked for really doesn't exist, or if it exists as a non-symbolic reference. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:46 -08:00
Patrick Steinhardt	8e55634b47	fetch: avoid lookup of commits when not appending to FETCH_HEAD When fetching from a remote repository we will by default write what has been fetched into the special FETCH_HEAD reference. The order in which references are written depends on whether the reference is for merge or not, which, despite some other conditions, is also determined based on whether the old object ID the reference is being updated from actually exists in the repository. To write FETCH_HEAD we thus loop through all references thrice: once for the references that are about to be merged, once for the references that are not for merge, and finally for all references that are ignored. For every iteration, we then look up the old object ID to determine whether the referenced object exists so that we can label it as "not-for-merge" if it doesn't exist. It goes without saying that this can be expensive in case where we are fetching a lot of references. While this is hard to avoid in the case where we're writing FETCH_HEAD, users can in fact ask us to skip this work via `--no-write-fetch-head`. In that case, we do not care for the result of those lookups at all because we don't have to order writes to FETCH_HEAD in the first place. Skip this busywork in case we're not writing to FETCH_HEAD. The following benchmark performs a mirror-fetch in a repository with about two million references via `git fetch --prune --no-write-fetch-head +refs/:refs/`: Benchmark 1: HEAD~ Time (mean ± σ): 75.388 s ± 1.942 s [User: 71.103 s, System: 8.953 s] Range (min … max): 73.184 s … 76.845 s 3 runs Benchmark 2: HEAD Time (mean ± σ): 69.486 s ± 1.016 s [User: 65.941 s, System: 8.806 s] Range (min … max): 68.864 s … 70.659 s 3 runs Summary 'HEAD' ran 1.08 ± 0.03 times faster than 'HEAD~' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:46 -08:00
Patrick Steinhardt	4de656263a	upload-pack: look up "want" lines via commit-graph During packfile negotiation the client will send "want" and "want-ref" lines to the server to tell it which objects it is interested in. The server-side parses each of those and looks them up to see whether it actually has requested objects. This lookup is performed by calling `parse_object()` directly, which thus hits the object database. In the general case though most of the objects the client requests will be commits. We can thus try to look up the object via the commit-graph opportunistically, which is much faster than doing the same via the object database. Refactor parsing of both "want" and "want-ref" lines to do so. The following benchmark is executed in a repository with a huge number of references. It uses cached request from git-fetch(1) as input to git-upload-pack(1) that contains about 876,000 "want" lines: Benchmark 1: HEAD~ Time (mean ± σ): 7.113 s ± 0.028 s [User: 6.900 s, System: 0.662 s] Range (min … max): 7.072 s … 7.168 s 10 runs Benchmark 2: HEAD Time (mean ± σ): 6.622 s ± 0.061 s [User: 6.452 s, System: 0.650 s] Range (min … max): 6.535 s … 6.727 s 10 runs Summary 'HEAD' ran 1.07 ± 0.01 times faster than 'HEAD~' Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-01 10:13:45 -08:00
Junio C Hamano	34363403a2	Merge branch 'ps/fetch-atomic' into ps/fetch-mirror-optim * ps/fetch-atomic: fetch: make `--atomic` flag cover pruning of refs fetch: make `--atomic` flag cover backfilling of tags refs: add interface to iterate over queued transactional updates fetch: report errors when backfilling tags fails fetch: control lifecycle of FETCH_HEAD in a single place fetch: backfill tags before setting upstream fetch: increase test coverage of fetches	2022-03-01 10:11:00 -08:00
Ævar Arnfjörð Bjarmason	71f26798f2	test-lib: add "fast_unwind_on_malloc=0" to LSAN_OPTIONS Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful stack traces from LSAN. This isn't required under ASAN which will emit traces such as this one for a leak in "t/t0006-date.sh": $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd [...] Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x488b94 in strdup (t/helper/test-tool+0x488b94) #1 0x9444a4 in xstrdup wrapper.c:29:14 #2 0x5995fa in parse_date_format date.c:991:24 #3 0x4d2056 in show_dates t/helper/test-date.c:39:2 #4 0x4d174a in cmd__date t/helper/test-date.c:116:3 #5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11 #6 0x4cd1e3 in main common-main.c:52:11 #7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16 #8 0x422b09 in _start (t/helper/test-tool+0x422b09) SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted Whereas LSAN would emit this instead: $ ./t0006-date.sh -vixd [...] Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) #1 0x7f2be1d614aa in strdup string/strdup.c:42:15 SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted Now we'll instead git this sensible stack trace under LSAN. I.e. almost the same one (but starting with "malloc", as is usual for LSAN) as under ASAN: Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8) #1 0x7f012af5c4aa in strdup string/strdup.c:42:15 #2 0x5cb164 in xstrdup wrapper.c:29:14 #3 0x495ee9 in parse_date_format date.c:991:24 #4 0x453aac in show_dates t/helper/test-date.c:39:2 #5 0x453782 in cmd__date t/helper/test-date.c:116:3 #6 0x451d95 in cmd_main t/helper/test-tool.c:127:11 #7 0x451f1e in main common-main.c:52:11 #8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16 #9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9) SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s). Aborted As the option name suggests this does make things slower, e.g. for t0001-init.sh we're around 10% slower: $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3 Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh Time (mean ± σ): 2.135 s ± 0.015 s [User: 1.951 s, System: 0.554 s] Range (min … max): 2.122 s … 2.152 s 3 runs Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh Time (mean ± σ): 1.981 s ± 0.055 s [User: 1.769 s, System: 0.488 s] Range (min … max): 1.941 s … 2.044 s 3 runs Summary 'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran 1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh' I think that's more than worth it to get the more meaningful stack traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for one-off "fast" runs. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 13:35:56 -08:00
Ævar Arnfjörð Bjarmason	b9638d7286	test-lib: make $GIT_BUILD_DIR an absolute path Change the GIT_BUILD_DIR from a path like "/path/to/build/t/.." to "/path/to/build". The "TEST_DIRECTORY" here is already made an absolute path a few lines above this. We could simply do $(cd "$TEST_DIRECTORY"/.." && pwd) here, but as noted in the preceding commit the "$TEST_DIRECTORY" can't be anything except the path containing this test-lib.sh file at this point, so we can more cheaply and equally strip the "/t" off the end. This change will be helpful to LSAN_OPTIONS which will want to strip the build directory path from filenames, which we couldn't do if we had a "/.." in there. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 13:35:56 -08:00
Ævar Arnfjörð Bjarmason	9dbf20e7f6	test-lib: correct and assert TEST_DIRECTORY overriding Correct a misleading comment added by me in `62f539043c` (test-lib: Allow overriding of TEST_DIRECTORY, 2010-08-19), and add an assertion that TEST_DIRECTORY cannot point to any directory except the "t" directory in the top-level of git.git. This assertion is in effect not new, since we'd already die if that wasn't the case[1], but it and the updated commentary help to make that clearer. The existing comments were also on the wrong arms of the "if". I.e. the "allow tests to override this" was on the "test -z" arm. That came about due to a combination of `62f539043c` and `85176d7251` (test-lib.sh: convert $TEST_DIRECTORY to an absolute path, 2013-11-17). Those earlier comments could be read as allowing the "$TEST_DIRECTORY" to be some path outside of t/. As explained in the updated comment that's impossible, rather it was meant for tests that ran outside of t/, i.e. the "t0000-basic.sh" tests that use "lib-subtest.sh". Those tests have a different working directory, but they set the "TEST_DIRECTORY" to the same path for bootstrapping. The comments now reflect that, and further comment on why we have a hard dependency on this. 1. https://lore.kernel.org/git/220222.86o82z8als.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 13:35:56 -08:00
Ævar Arnfjörð Bjarmason	66c1a56870	test-lib: add GIT_SAN_OPTIONS, inherit [AL]SAN_OPTIONS Change our ASAN_OPTIONS and LSAN_OPTIONS to set defaults for those variables, rather than punting out entirely if we already have them in the environment. We want to take any user-provided settings over our own, but we can do that by prepending our defaults to the variable. The libsanitizer options parsing has "last option wins" semantics. It's now possible to do e.g.: LSAN_OPTIONS=report_objects=1 ./t0006-date.sh And not have the "report_objects=1" setting overwrite our sensible default of "abort_on_error=1", but by prepending to the list we ensure that: LSAN_OPTIONS=report_objects=1:abort_on_error=0 ./t0006-date.sh Will take the desired "abort_on_error=0" over our default. See `b0f4c9087e` (t: support clang/gcc AddressSanitizer, 2014-12-08) for the original pattern being altered here, and `85b81b35ff` (test-lib: set LSAN_OPTIONS to abort by default, 2017-09-05) for when LSAN_OPTIONS was added in addition to the then-existing ASAN_OPTIONS. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 13:35:56 -08:00
Tao Klerks	317956d912	untracked-cache: write index when populating empty untracked cache It is expected that an empty/unpopulated untracked cache structure can be written to the index - by update-index, or by a "git status" call that sees the untracked cache should be enabled and is not, but is running with options that make the untracked cache non-applicable in that run (eg a pathspec). Currently, if that happens, then subsequent "git status" calls end up populating the untracked cache, but not writing the index (not saving their work) - so the performance outcome is almost identical to the cache being altogether disabled. This continues until the index gets written with the untracked cache populated, for some other reason, such as a working tree change. Detect the condition where an empty untracked cache exists in the index and we will collect the list of untracked paths, and queue an index write under that condition, so that the collected untracked paths can be written out to the untracked cache extension in the index. This change depends on previous fixes to t7519 for the "ignore .git changes when invalidating UNTR" test case to pass - before this fix, the test never actually did anything as it was not set up correctly. Signed-off-by: Tao Klerks <tao@klerks.biz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 10:02:18 -08:00
Tao Klerks	37482b4080	t7519: populate untracked cache before test In its current state, the t7519 test dealing with untracked cache assumes that "git update-index --untracked-cache" will populate the untracked cache. This is not correct - it will only add an empty untracked cache structure to the index. If we're going to compare two git status runs with something interesting happening in-between, we need to ensure that the index is in a stable/steady state before that first run. Achieve this by adding another prior "git status" run. At this stage this change does nothing, because there is a bug, addressed in the next patch, whereby once the empty untracked cache structure is added by the update-index invocation, the untracked cache gets updated in every subsequent "git status" call, but the index with these updates does not get written down. That bug actually invalidates this entire test case - but we're fixing that next. Signed-off-by: Tao Klerks <tao@klerks.biz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 10:02:18 -08:00
Tao Klerks	a67d178be4	t7519: avoid file to index mtime race for untracked cache In t7519 there is a test that writes files to disk, and immediately writes the index with the untracked cache. Because of mtime-comparison logic that uses a 1-second resolution, this means the cached entries are not trusted/used under some circumstances (see read-cache.c#is_racy_stat()). Untracked cache tests in t7063 use a 1-second delay to avoid this issue, but we don't want to introduce arbitrary slowdowns, so instead use test-tool chmtime to backdate the files slightly. The t7063 delays are a #leftoverbit, to be worked on in a separate series. This change doesn't actually affect the outcome of the test, but does enhance its validity, and becomes relevant after later changes. Signed-off-by: Tao Klerks <tao@klerks.biz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-28 10:02:17 -08:00
Junio C Hamano	2587df669b	rerere-train: two fixes to the use of "git show -s" The script uses "git show -s" to display the title of the merge commit being studied, without explicitly disabling the pager, which is not a safe thing to do in a script. For example, when the pager is set to "less" with "-SF" options (-S tells the pager not to fold lines but allow horizontal scrolling to show the overly long lines, -F tells the pager not to wait if the output in its entirety is shown on a single page), and the title of the merge commit is longer than the width of the terminal, the pager will wait until the end-user tells it to quit after showing the single line. Explicitly disable the pager with this "git show" invocation to fix this. The command uses the "--pretty=format:..." format, which adds LF in between each pair of commits it outputs, which means that the label for the merge being learned from will be followed by the next message on the same line. "--pretty=tformat:..." is what we should instead, which adds LF after each commit, or a more modern way to spell it, i.e. "--format=...". This existing breakage becomes easier to see, now we no longer use the pager. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-27 14:14:03 -08:00
Alex Henrie	808213ba36	switch: mention the --detach option when dying due to lack of a branch Users who are accustomed to doing `git checkout <tag>` assume that `git switch <tag>` will do the same thing. Inform them of the --detach option so they aren't left wondering why `git switch` doesn't work but `git checkout` does. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 22:21:48 -08:00
Ævar Arnfjörð Bjarmason	6aea6baeb3	object-file API: pass an enum to read_object_with_reference() Change the read_object_with_reference() function to take an "enum object_type". It was not prepared to handle an arbitrary "const char *type", as it was itself calling type_from_string(). Let's change the only caller that passes in user data to use type_from_string(), and convert the rest to use e.g. "OBJ_TREE" instead of "tree_type". The "cat-file" caller is not on the codepath that handles"--allow-unknown", so the type_from_string() there is safe. Its use of type_from_string() doesn't functionally differ from that of the pre-image. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	2bbb28a3ee	object-file.c: add a literal version of write_object_file_prepare() Split off a *_literally() variant of the write_object_file_prepare() function. To do this create a new "hash_object_body()" static helper. We now defer the type_name() call until the very last moment in format_object_header() for those callers that aren't "hash-object --literally". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	44439c1c58	object-file API: have hash_object_file() take "enum object_type" Change the hash_object_file() function to take an "enum object_type". Since a preceding commit all of its callers are passing either "{commit,tree,blob,tag}_type", or the result of a call to type_name(), the parse_object() caller that would pass NULL is now using stream_object_signature(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	0ff7b4f976	object API: rename hash_object_file_literally() to write_() Before `0c3db67cc8` (hash-object --literally: fix buffer overrun with extra-long object type, 2015-05-04) the hash-object code being changed here called write_sha1_file() to both hash and write a loose object. Before that we'd use hash_sha1_file() to if "-w" wasn't provided, and otherwise call write_sha1_file(). Now we'll always call the same function for both writing. Let's rename it from hash__literally() to write__literally(). Even though the write_() might not actually write if HASH_WRITE_OBJECT isn't in "flags", having it be more similar to write_object_file_flags() than hash_object_file(), but carrying a name that would suggest that it's a variant of the latter is confusing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	0f156dbb04	object-file API: split up and simplify check_object_signature() Split up the check_object_signature() function into that non-streaming version (it accepts an already filled "buf"), and a new stream_object_signature() which will retrieve the object from storage, and hash it on-the-fly. All of the callers of check_object_signature() were effectively calling two different functions, if we go by cyclomatic complexity. I.e. they'd either take the early "if (map)" branch and return early, or not. This has been the case since the "if (map)" condition was added in `090ea12671` (parse_object: avoid putting whole blob in core, 2012-03-07). We can then further simplify the resulting check_object_signature() function since only one caller wanted to pass a non-NULL "buf" and a non-NULL "real_oidp". That "read_loose_object()" codepath used by "git fsck" can instead use hash_object_file() followed by oideq(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	ee213de22d	object API users + docs: check <0, not !0 with check_object_signature() Change those users of the object API that misused check_object_signature() by assuming it returned any non-zero when the OID didn't match the expected value to check <0 instead. In practice all of this code worked before, but it wasn't consistent with rest of the users of the API. Let's also clarify what the <0 return value means in API docs. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	cdcaaec9a6	object API docs: move check_object_signature() docs to cache.h Move the API documentation for check_object_signature() to cache.h, where its prototype is declared. This is in preparation for adding a companion function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	73182b2d84	object API: correct "buf" v.s. "map" mismatch in .c and .h Change the name of the second argument to check_object_signature() to be "buf" in object-file.c, making it consistent with the prototype in cache.h This fixes an inconsistency that's been with us since `2ade934026` (Add "check_sha1_signature()" helper function, 2005-04-08), and makes a subsequent commit's diff smaller, as we'll move these API docs to cache.h. While we're at it fix a small grammar error in the documentation, dropping an "an" before "in-core object-data". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	c80d226a04	object-file API: have write_object_file() take "enum object_type" Change the write_object_file() function to take an "enum object_type" instead of a "const char type". Its callers either passed {commit,tree,blob,tag}_type and can pass the corresponding OBJ_ type instead, or were hardcoding strings like "blob". This avoids the back & forth fragility where the callers of write_object_file() would have the enum type, and convert it themselves via type_name(). We do have to now do that conversion ourselves before calling write_object_file_prepare(), but those codepaths will be similarly adjusted in subsequent commits. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	b04cdea46c	object-file API: add a format_object_header() function Add a convenience function to wrap the xsnprintf() command that generates loose object headers. This code was copy/pasted in various parts of the codebase, let's define it in one place and re-use it from there. All except one caller of it had a valid "enum object_type" for us, it's only write_object_file_prepare() which might need to deal with "git hash-object --literally" and a potential garbage type. Let's have the primary API use an "enum object_type", and define a _literally() function that can take an arbitrary "const char " for the type. See [1] for the discussion that prompted this patch, i.e. new code in object-file.c that wanted to copy/paste the xsnprintf() invocation. In the case of fast-import.c the callers unfortunately need to cast back & forth between "unsigned char " and "char ", since format_object_header() ad encode_in_pack_object_header() take different signedness. 1. https://lore.kernel.org/git/211213.86bl1l9bfz.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	63e05f9056	object-file API: return "void", not "int" from hash_object_file() The hash_object_file() function added in `abdc3fc842` (Add hash_sha1_file(), 2006-10-14) did not have a meaningful return value, and it never has. One was seemingly added to avoid adding braces to the "ret = " assignments being modified here. Let's instead assign "0" to the "ret" variables at the beginning of the relevant functions, and have them return "void". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Ævar Arnfjörð Bjarmason	bbea0ddeb9	object-file.c: split up declaration of unrelated variables Split up the declaration of the "ret" and "re_allocated" variables. It's not our usual style to group variable declarations simply because they share a type, we'd only prefer to do so when the two are closely related (e.g. "int i, j"). This change makes a subsequent and meaningful change's diff smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Junio C Hamano	715d08a9e5	The eighth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 15:47:38 -08:00
Junio C Hamano	0fd097b9a0	Merge branch 'tb/coc-plc-update' Document Taylor as a new member of Git PLC at SFC. Welcome. * tb/coc-plc-update: CODE_OF_CONDUCT.md: update PLC members list	2022-02-25 15:47:38 -08:00
Junio C Hamano	b3db182886	Merge branch 'en/ort-inner-merge-conflict-report' Messages "ort" merge backend prepares while dealing with conflicted paths were unnecessarily confusing since it did not differentiate inner merges and outer merges. * en/ort-inner-merge-conflict-report: merge-ort: make informational messages from recursive merges clearer	2022-02-25 15:47:38 -08:00
Junio C Hamano	5c4f3804a7	Merge branch 'rs/pcre-invalid-utf8-fix-fix' Workaround we have for versions of PCRE2 before their version 10.36 were in effect only for their versions newer than 10.36 by mistake, which has been corrected. * rs/pcre-invalid-utf8-fix-fix: grep: fix triggering PCRE2_NO_START_OPTIMIZE workaround	2022-02-25 15:47:38 -08:00
Junio C Hamano	80f7f618b6	Merge branch 'ds/core-untracked-cache-config' Setting core.untrackedCache to true failed to add the untracked cache extension to the index. * ds/core-untracked-cache-config: dir: force untracked cache with core.untrackedCache	2022-02-25 15:47:36 -08:00
Junio C Hamano	362f869ff2	Merge branch 'ab/diff-free-more' Leakfixes. * ab/diff-free-more: diff.[ch]: have diff_free() free options->parseopts diff.[ch]: have diff_free() call clear_pathspec(opts.pathspec)	2022-02-25 15:47:36 -08:00
Junio C Hamano	0a01df08c0	Merge branch 'ab/date-mode-release' Plug (some) memory leaks around parse_date_format(). * ab/date-mode-release: date API: add and use a date_mode_release() date API: add basic API docs date API: provide and use a DATE_MODE_INIT date API: create a date.h, split from cache.h cache.h: remove always unused show_date_human() declaration	2022-02-25 15:47:36 -08:00
Junio C Hamano	294f296292	Merge branch 'jc/name-rev-stdin' Finishing touches to an earlier "name-rev --annotate-stdin" series. * jc/name-rev-stdin: name-rev: replace --stdin with --annotate-stdin in synopsis	2022-02-25 15:47:36 -08:00
Junio C Hamano	5b84280c65	Merge branch 'ab/grep-patterntype' Some code clean-up in the "git grep" machinery. * ab/grep-patterntype: grep: simplify config parsing and option parsing grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" grep.h: make "grep_opt.pattern_type_option" use its enum grep API: call grep_config() after grep_init() grep.c: don't pass along NULL callback value built-ins: trust the "prefix" from run_builtin() grep tests: add missing "grep.patternType" config tests grep tests: create a helper function for "BRE" or "ERE" log tests: check if grep_config() is called by "log"-like cmds grep.h: remove unused "regex_t regexp" from grep_opt	2022-02-25 15:47:36 -08:00
Junio C Hamano	2e65591ed6	Merge branch 'js/apply-partial-clone-filters-recursively' "git clone --filter=... --recurse-submodules" only makes the top-level a partial clone, while submodules are fully cloned. This behaviour is changed to pass the same filter down to the submodules. * js/apply-partial-clone-filters-recursively: clone, submodule: pass partial clone filters to submodules	2022-02-25 15:47:35 -08:00
Junio C Hamano	d21d5ddfe6	Merge branch 'ja/i18n-common-messages' Unify more messages to help l10n. * ja/i18n-common-messages: i18n: fix some misformated placeholders in command synopsis i18n: remove from i18n strings that do not hold translatable parts i18n: factorize "invalid value" messages i18n: factorize more 'incompatible options' messages	2022-02-25 15:47:35 -08:00
Junio C Hamano	a47fcfe871	Merge branch 'ab/only-single-progress-at-once' Further tweaks on progress API. * ab/only-single-progress-at-once: pack-bitmap-write.c: don't return without stop_progress() progress API: unify stop_progress{,_msg}(), fix trace2 bug progress.c: refactor stop_progress{,_msg}() to use helpers progress.c: use dereferenced "progress" variable, not "(*p_progress)" progress.h: format and be consistent with progress.c naming progress.c tests: test some invalid usage progress.c tests: make start/stop commands on stdin progress.c test helper: add missing braces leak tests: fix a memory leak in "test-progress" helper	2022-02-25 15:47:35 -08:00
Junio C Hamano	6249ce2d1b	Merge branch 'ds/sparse-checkout-requires-per-worktree-config' "git sparse-checkout" wants to work with per-worktree configuration, but did not work well in a worktree attached to a bare repository. * ds/sparse-checkout-requires-per-worktree-config: config: make git_configset_get_string_tmp() private worktree: copy sparse-checkout patterns and config on add sparse-checkout: set worktree-config correctly config: add repo_config_set_worktree_gently() worktree: create init_worktree_config() Documentation: add extensions.worktreeConfig details	2022-02-25 15:47:33 -08:00

1 2 3 4 5 ...

66232 commits