development/git - HydraGit

mirror of https://github.com/git/git synced 2024-09-19 08:21:36 +00:00

Author	SHA1	Message	Date
Michael Strawbridge	a8022c5f7b	send-email: expose header information to git-send-email's sendemail-validate hook To allow further flexibility in the Git hook, the SMTP header information of the email which git-send-email intends to send, is now passed as the 2nd argument to the sendemail-validate hook. As an example, this can be useful for acting upon keywords in the subject or specific email addresses. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Junio C Hamano <gitster@pobox.com> Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-19 14:19:09 -07:00
Jeff King	7891e46585	gpg-interface: set trust level of missing key to "undefined" In check_signature(), we initialize the trust_level field to "-1", with the idea that if gpg does not return a trust level at all (if there is no signature, or if the signature is made by an unknown key), we'll use that value. But this has two problems: 1. Since the field is an enum, it's up to the compiler to decide what underlying storage to use, and it only has to fit the values we've declared. So we may not be able to store "-1" at all. And indeed, on my system (linux with gcc), the resulting enum is an unsigned 32-bit value, and -1 becomes 4294967295. The difference may seem academic (and you even get "-1" if you pass it to printf("%d")), but it means that code like this: status |= sigc->trust_level < configured_min_trust_level; does not necessarily behave as expected. This turns out not to be a bug in practice, though, because we keep the "-1" only when gpg did not report a signature from a known key, in which case the line above: status |= sigc->result != 'G'; would always set status to non-zero anyway. So only a 'G' signature with no parsed trust level would cause a problem, which doesn't seem likely to trigger (outside of unexpected gpg behavior). 2. When using the "%GT" format placeholder, we pass the value to gpg_trust_level_to_str(), which complains that the value is out of range with a BUG(). This behavior was introduced by `803978da49` (gpg-interface: add function for converting trust level to string, 2022-07-11). Before that, we just did a switch() on the enum, and anything that wasn't matched would end up as the empty string. Curiously, solving this by naively doing: if (level < 0) return ""; in that function isn't sufficient. Because of (1) above, the compiler can (and does in my case) actually remove that conditional as dead code! We can solve both by representing this state as an enum value. We could do this by adding a new "unknown" value. But this really seems to match the existing "undefined" level well. GPG describes this as "Not enough information for calculation". We have tests in t7510 that trigger this case (verifying a signature from a key that we don't have, and then checking various %G placeholders), but they didn't notice the BUG() because we didn't look at %GT for that case! Let's make sure we check all %G placeholders for each case in the formatting tests. The interesting ones here are "show unknown signature with custom format" and "show lack of signature with custom format", both of which would BUG() before, and now turn %GT into "undefined". Prior to `803978da49` they would have turned it into the empty string, but I think saying "undefined" consistently is a reasonable outcome, and probably makes life easier for anyone parsing the output (and any such parser had to be ready to see "undefined" already). The other modified tests produce the same output before and after this patch, but now we're consistently checking both %G? and %GT in all of them. Signed-off-by: Jeff King <peff@peff.net> Reported-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-19 08:30:54 -07:00
Taylor Blau	e3e24de1bf	builtin/gc.c: make `gc.cruftPacks` enabled by default Back in `5b92477f89` (builtin/gc.c: conditionally avoid pruning objects via loose, 2022-05-20), `git gc` learned the `--cruft` option and `gc.cruftPacks` configuration to opt-in to writing cruft packs when collecting or pruning unreachable objects. Cruft packs were introduced with the merge in `a50036da1a` (Merge branch 'tb/cruft-packs', 2022-06-03). They address the problem of "loose object explosions", where Git will write out many individual loose objects when there is a large number of unreachable objects that have not yet aged past `--prune=<date>`. Instead of keeping track of those unreachable yet recent objects via their loose object file's mtime, cruft packs collect all unreachable objects into a single pack with a corresponding `*.mtimes` file that acts as a table to store the mtimes of all unreachable objects. This prevents the need to store unreachable objects as loose as they age out of the repository, and avoids the problem of loose object explosions. Beyond avoiding loose object explosions, cruft packs also act as a more efficient mechanism to store unreachable objects as they age out of a repository. This is because pairs of similar unreachable objects serve as delta bases for one another. In `5b92477f89`, the feature was introduced as experimental. Since then, GitHub has been running these patches in every repository generating hundreds of millions of cruft packs along the way. The feature is battle-tested, and avoids many pathological cases such as above. Users who either run `git gc` manually, or via `git maintenance` can benefit from having cruft packs. As such, enable cruft pack generation to take place by default (by making `gc.cruftPacks` have the default of "true" rather than "false"). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	c58100ab5d	t/t9300-fast-import.sh: prepare for `gc --cruft` by default In a similar fashion as previous commits, adjust the fast-import tests to prepare for "git gc" generating a cruft pack by default. This adjustment is slightly different, however. Instead of relying on us writing out the objects loose, and then calling `git prune` to remove them, t9300 needs to be prepared to drop objects that would be moved into cruft packs. To do this, we can combine the `git gc` invocation with `git prune` into one `git gc --prune`, which handles pruning both loose objects, and objects that would otherwise be written to a cruft pack. Likely this pattern of "git gc && git prune" started all the way back in `03db4525d3` (Support gitlinks in fast-import., 2008-07-19), which happened after deprecating `git gc --prune` in `9e7d501990` (builtin-gc.c: deprecate --prune, it now really has no effect, 2008-05-09). After `--prune` was un-deprecated in `58e9d9d472` (gc: make --prune useful again by accepting an optional parameter, 2009-02-14), this script got a handful of new "git gc && git prune" instances via `4cedb78cb5` (fast-import: add input format tests, 2011-08-11). These could have been `git gc --prune`, but weren't (likely taking after `03db4525d3`). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	b9061bc628	t/t6500-gc.sh: add additional test cases In the last commit, we refactored some of the tests in t6500 to make clearer when cruft packs will and won't be generated by `git gc`. Add the remaining cases not covered by the previous patch into this one, which enumerates all possible combinations of arguments that will produce (or not produce) a cruft pack. This prepares us for a future commit which will change the default value of `gc.cruftPacks` by ensuring that we understand which invocations do and do not change as a result. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	50685e0e0b	t/t6500-gc.sh: refactor cruft pack tests In `12253ab6d0` (gc: add tests for --cruft and friends, 2022-10-26), we added a handful of tests to t6500 to ensure that `git gc` respected the value of `--cruft` and `gc.cruftPacks`. Then, in `c695592850` (config: let feature.experimental imply gc.cruftPacks=true, 2022-10-26), another set of similar tests was added to ensure that `feature.experimental` correctly implied enabling cruft pack generation (or not). These tests are similar and could be consolidated. Do so in this patch to prepare for expanding the set of command-line invocations that enable or disable writing cruft packs. This makes it possible to easily test more combinations of arguments without being overly repetitive. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	b31d45b831	t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default In a similar spirit as previous commits, prepare for `gc --cruft` becoming the default by ensuring that the tests in t6501 explicitly cover the case of freshening loose objects not using cruft packs. We could run this test twice, once with `--cruft` and once with `--no-cruft`, but doing so is unnecessary, since we already test object rescuing, freshening, and dealing with corrupt parts of the unreachable object graph extensively via t5329. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Taylor Blau	b934207a22	t/t5304-prune.sh: prepare for `gc --cruft` by default Many of the tests in t5304 run `git gc`, and rely on its behavior that unreachable-but-recent objects are written out loose. This is sensible, since t5304 deals specifically with this kind of pruning. If left unattended, however, this test would break when the default behavior of a bare "git gc" is adjusted to generate a cruft pack by default. Ensure that these tests continue to work as-is (and continue to provide coverage of loose object pruning) by passing `--no-cruft` explicitly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Taylor Blau	05b9013b71	builtin/gc.c: ignore cruft packs with `--keep-largest-pack` When cruft packs were implemented, we never adjusted the code for `git gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft packs. This option and configuration option share a common implementation, but including cruft packs is wrong in both cases: - Running `git gc --keep-largest-pack` in a repository where the largest pack is the cruft pack itself will make it impossible for `git gc` to prune objects, since the cruft pack itself is kept. - The same is true for `gc.bigPackThreshold`, if the size of the cruft pack exceeds the limit set by the caller. In the future, it is possible that `gc.bigPackThreshold` could be used to write a separate cruft pack containing any new unreachable objects that entered the repository since the last time a cruft pack was written. There are some complexities to doing so, mainly around handling pruning objects that are in an existing cruft pack that is above the threshold (which would either need to be rewritten, or else delay pruning). Rewriting a substantially similar cruft pack isn't ideal, but it is significantly better than the status-quo. If users have large cruft packs that they don't want to rewrite, they can mark them as `*.keep` packs. But in general, if a repository has a cruft pack that is so large it is slowing down GC's, it should probably be pruned anyway. In the meantime, ignore cruft packs in the common implementation for both of these options, and add a pair of tests to prevent any future regressions here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Junio C Hamano	3c957e6d39	Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling' Clean-up of the code path that deals with merge strategy option handling in "git rebase". * pw/rebase-cleanup-merge-strategy-option-handling: rebase: remove a couple of redundant strategy tests rebase -m: fix serialization of strategy options rebase -m: cleanup --strategy-option handling sequencer: use struct strvec to store merge strategy options rebase: stop reading and writing unnecessary strategy state	2023-04-17 18:05:13 -07:00
Junio C Hamano	9d8370d445	Merge branch 'tk/mergetool-gui-default-config' "git mergetool" and "git difftool" learn a new configuration guiDefault to optionally favor configured guitool over non-gui-tool automatically when $DISPLAY is set. * tk/mergetool-gui-default-config: mergetool: new config guiDefault supports auto-toggling gui by DISPLAY	2023-04-17 18:05:11 -07:00
Junio C Hamano	d47ee0a565	Merge branch 'sl/sparse-write-tree' "git write-tree" learns to work better with sparse-index. * sl/sparse-write-tree: write-tree: integrate with sparse index	2023-04-17 18:05:11 -07:00
Derrick Stolee	5a6072f631	fsck: validate .rev file header While parsing a .rev file, we check the header information to be sure it makes sense. This happens before doing any additional validation such as a checksum or value check. In order to differentiate between a bad header and a non-existent file, we need to update the API for loading a reverse index. Make load_pack_revindex_from_disk() non-static and specify that a positive value means "the file does not exist" while other errors during parsing are negative values. Since an invalid header prevents setting up the structures we would use for further validations, we can stop at that point. The place where we can distinguish between a missing file and a corrupt file is inside load_revindex_from_disk(), which is used both by pack rev-indexes and multi-pack-index rev-indexes. Some tests in t5326 demonstrate that it is critical to take some conditions to allow positive error signals. Add tests that check the three header values. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:05 -07:00
Derrick Stolee	5f658d1b57	fsck: check rev-index position values When checking a rev-index file, it may be helpful to identify exactly which positions are incorrect. Compare the rev-index to a freshly-computed in-memory rev-index and report the comparison failures. This additional check (on top of the checksum validation) can help find files that were corrupt by a single bit flip on-disk or perhaps were written incorrectly due to a bug in Git. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Derrick Stolee	d975fe1fa5	fsck: check rev-index checksums The previous change added calls to verify_pack_revindex() in builtin/fsck.c, but the implementation of the method was left empty. Add the first and most-obvious check to this method: checksum verification. While here, create a helper method in the test script that makes it easy to adjust the .rev file and check that 'git fsck' reports the correct error message. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Derrick Stolee	0d30feef3c	fsck: create scaffolding for rev-index checks The 'fsck' builtin checks many of Git's on-disk data structures, but does not currently validate the pack rev-index files (a .rev file to pair with a .pack and .idx file). Before doing a more-involved check process, create the scaffolding within builtin/fsck.c to have a new error type and add that error type when the API method verify_pack_revindex() returns an error. That method does nothing currently, but we will add checks to it in later changes. For now, check that 'git fsck' succeeds without any errors in the normal case. Future checks will be paired with tests that corrupt the .rev file appropriately. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Johannes Schindelin	3d3c11852c	Sync with 2.39.3 * maint-2.39: (34 commits) Git 2.39.3 Git 2.38.5 Git 2.37.7 Git 2.36.6 Git 2.35.8 Makefile: force -O0 when compiling with SANITIZE=leak Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix range-diff: use ssize_t for parsed "len" in read_patches() ...	2023-04-17 21:16:10 +02:00
Johannes Schindelin	15628975cf	Sync with 2.38.5 * maint-2.38: (32 commits) Git 2.38.5 Git 2.37.7 Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion ...	2023-04-17 21:16:08 +02:00
Johannes Schindelin	c96ecfe6a5	Sync with 2.37.7 * maint-2.37: (31 commits) Git 2.37.7 Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 ...	2023-04-17 21:16:06 +02:00
Johannes Schindelin	1df551ce5c	Sync with 2.36.6 * maint-2.36: (30 commits) Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix ...	2023-04-17 21:16:04 +02:00
Johannes Schindelin	62298def14	Sync with 2.35.8 * maint-2.35: (29 commits) Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR ...	2023-04-17 21:16:02 +02:00
Johannes Schindelin	8cd052ea53	Sync with 2.34.8 * maint-2.34: (28 commits) Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION ...	2023-04-17 21:15:59 +02:00
Johannes Schindelin	d6e9f67a8e	Sync with 2.33.8 * maint-2.33: (27 commits) Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ...	2023-04-17 21:15:56 +02:00
Johannes Schindelin	bcd874d50f	Sync with 2.32.7 * maint-2.32: (26 commits) Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ...	2023-04-17 21:15:52 +02:00
Johannes Schindelin	31f7fe5e34	Sync with 2.31.8 * maint-2.31: (25 commits) Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ci: use the same version of p4 on both Linux and macOS ...	2023-04-17 21:15:49 +02:00
Johannes Schindelin	92957d8427	tests: avoid using `test_i18ncmp` Since `test_i18ncmp` was deprecated in v2.31.*, the instances added in v2.30.9 needed to be converted to `test_cmp` calls. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2023-04-17 21:15:45 +02:00
Johannes Schindelin	b524e896b6	Sync with 2.30.9 * maint-2.30: (23 commits) Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ci: use the same version of p4 on both Linux and macOS ci: remove the pipe after "p4 -V" to catch errors github-actions: run gcc-8 on ubuntu-20.04 image ...	2023-04-17 21:15:44 +02:00
Taylor Blau	528290f8c6	Merge branch 'tb/config-copy-or-rename-in-file-injection' Avoids issues with renaming or deleting sections with long lines, where configuration values may be interpreted as sections, leading to configuration injection. Addresses CVE-2023-29007. * tb/config-copy-or-rename-in-file-injection: config.c: disallow overly-long lines in `copy_or_rename_section_in_file()` config.c: avoid integer truncation in `copy_or_rename_section_in_file()` config: avoid fixed-sized buffer when renaming/deleting a section t1300: demonstrate failure when renaming sections with long lines Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:42 +02:00
Taylor Blau	3bb3d6bac5	config.c: disallow overly-long lines in `copy_or_rename_section_in_file()` As a defense-in-depth measure to guard against any potentially-unknown buffer overflows in `copy_or_rename_section_in_file()`, refuse to work with overly-long lines in a gitconfig. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>	2023-04-17 21:15:40 +02:00
Taylor Blau	a5bb10fd5e	config: avoid fixed-sized buffer when renaming/deleting a section When renaming (or deleting) a section of configuration, Git uses the function `git_config_copy_or_rename_section_in_file()` to rewrite the configuration file after applying the rename or deletion to the given section. To do this, Git repeatedly calls `fgets()` to read the existing configuration data into a fixed size buffer. When the configuration value under `old_name` exceeds the size of the buffer, we will call `fgets()` an additional time even if there is no newline in the configuration file, since our read length is capped at `sizeof(buf)`. If the first character of the buffer (after zero or more characters satisfying `isspace()`) is a '[', Git will incorrectly treat it as beginning a new section when the original section is being removed. In other words, a configuration value satisfying these criteria can incorrectly be considered as a new section instead of a variable in the original section. Avoid this issue by using a variable-width buffer in the form of a strbuf rather than a fixed-width region on the stack. A couple of small points worth noting: - Using a strbuf will cause us to allocate arbitrary sizes to match the length of each line. In practice, we don't expect any reasonable configuration files to have lines that long, and a bandaid will be introduced in a later patch to ensure that this is the case. - We are using strbuf_getwholeline() here instead of strbuf_getline() in order to match `fgets()`'s behavior of leaving the trailing LF character on the buffer (as well as a trailing NUL). This could be changed later, but using strbuf_getwholeline() changes the least about this function's implementation, so it is picked as the safest path. - It is tempting to want to replace the loop to skip over characters matching isspace() at the beginning of the buffer with a convenience function like `strbuf_ltrim()`. But this is the wrong approach for a couple of reasons: First, it involves a potentially large and expensive `memmove()` which we would like to avoid. Second, and more importantly, we also do want to preserve those spaces to avoid changing the output of other sections. In all, this patch is a minimal replacement of the fixed-width buffer in `git_config_copy_or_rename_section_in_file()` to instead use a `struct strbuf`. Reported-by: André Baptista <andre@ethiack.com> Reported-by: Vítor Pinho <vitor@ethiack.com> Helped-by: Patrick Steinhardt <ps@pks.im> Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:40 +02:00
Taylor Blau	29198213c9	t1300: demonstrate failure when renaming sections with long lines When renaming a configuration section which has an entry whose length exceeds the size of our buffer in config.c's implementation of `git_config_copy_or_rename_section_in_file()`, Git will incorrectly form a new configuration section with part of the data in the section being removed. In this instance, our first configuration file looks something like: [b] c = d <spaces> [a] e = f [a] g = h Here, we have two configuration values, "b.c", and "a.g". The value "[a] e = f" belongs to the configuration value "b.c", and does not form its own section. However, when renaming the section 'a' to 'xyz', Git will write back "[xyz]\ne = f", but "[xyz]" is still attached to the value of "b.c", which is why "e = f" on its own line becomes a new entry called "b.e". A slightly different example embeds the section being renamed within another section. Demonstrate this failure in a test in t1300, which we will fix in the following commit. Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:39 +02:00
Johannes Schindelin	9db05711c9	apply --reject: overwrite existing `.rej` symlink if it exists The `git apply --reject` is expected to write out `.rej` files in case one or more hunks fail to apply cleanly. Historically, the command overwrites any existing `.rej` files. The idea being that apply/reject/edit cycles are relatively common, and the generated `.rej` files are not considered precious. But the command does not overwrite existing `.rej` symbolic links, and instead follows them. This is unsafe because the same patch could potentially create such a symbolic link and point at arbitrary paths outside the current worktree, and `git apply` would write the contents of the `.rej` file into that location. Therefore, let's make sure that any existing `.rej` file or symbolic link is removed before writing it. Reported-by: RyotaK <ryotak.mail@gmail.com> Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2023-04-17 21:15:38 +02:00
Jeff King	c4716236f2	t5512: test "ls-remote --heads --symref" filtering with v0 and v2 We have two overlapping tests for checking the behavior of "ls-remote --symref" when filtering output. The first test checks that using "--heads" will omit the symref for HEAD (since we don't print anything about HEAD at all), but still prints other symrefs. This has been marked as expecting failure since it was added in `99c08d4eb2` (ls-remote: add support for showing symrefs, 2016-01-19). That's because back then, we only had the v0 protocol, and it only reported on the HEAD symref, not others. But these days we have v2, which does exactly what the test wants. It would even have started unexpectedly passing when we switched to v2 by default, except that `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25) over-zealously marked it to run only in v0 mode. So let's run it with both protocol versions, and adjust the expected output for each. It passes in v2 without modification. In v0 mode, we'll drop the extra symref, but this is still testing something useful: it ensures that we do omit HEAD. The test after this checks "--heads" again, this time using the expected v0 output. That's now redundant. It also checks that limiting with a pattern like "refs/heads/*" works similarly, but that's redundant with a test earlier in the script which limits by HEAD (again, back then the "HEAD" test was less interesting because there were no other symrefs to omit, but in a modern v2 world, there are). So we can just delete that second test entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:13 -07:00
Jeff King	d6747adfa8	t5512: allow any protocol version for filtered symref test We have a test that checks that ls-remote, when asked only about HEAD, will report the HEAD symref, and not others. This was marked to always run with the v0 protocol by `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25). But in v0 this test is doing nothing! For v0, upload-pack only reports the HEAD symref anyway, so we'd never have any other symref to report. For v2, it is useful; we learn about all symrefs (and the test repo has multiple), so this demonstrates that we correctly avoid showing them. We could perhaps mark this to test explicitly with v2, but since that is the default these days, it's sufficient to just run ls-remote without any protocol specification. It still passes if somebody does an explicit GIT_TEST_PROTOCOL_VERSION=0; it's just testing less. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	20272ee8cf	t5512: add v2 support for "ls-remote --symref" test Commit `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25) configured this test to always run with protocol v0, since the output is different for v2. But that means we are not getting any test coverage of the feature with v2 at all. We could obviously switch to using and expecting v2, but then that leaves v0 behind (and while we don't use it by default, it's still important for testing interoperability with older servers). Likewise, we could switch the expected output based on $GIT_TEST_PROTOCOL_VERSION, but hardly anybody runs the tests for v0 these days. Instead, let's explicitly run it for both protocol versions to make sure they're well behaved. This matches other similar tests added later in `6a139cdd74` (ls-remote: pass heads/tags prefixes to transport, 2018-10-31), etc. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	13e67aa39b	v0 protocol: fix sha1/sha256 confusion for capabilities^{} Commit `eb398797cd` (connect: advertized capability is not a ref, 2016-09-09) added support for an upload-pack server responding with: 0000000000000000000000000000000000000000 capabilities^{} followed by a NUL and the actual capabilities. We correctly parse the oid using the packet_reader's hash_algo field, but then we compare it to null_oid(), which will instead use our current repo's default algorithm. If we're defaulting to sha256 locally but the other side is sha1, they won't match and we'll fail to parse the line (and thus die()). This can cause a test failure when the suite is run with GIT_TEST_DEFAULT_HASH=sha256, and we even do so regularly via the linux-sha256 CI job. But since the test requires JGit to run, it's usually just skipped, and nobody noticed the problem. The reason the original patch used JGit is that Git itself does not ever produce such a line via upload-pack; the feature was added to fix a real-world problem when interacting with JGit. That was good for verifying that the incompatibility was fixed, but it's not a good regression test: - hardly anybody runs it, because you have to have jgit installed; hence this bug going unnoticed - we're depending on jgit's behavior for the test to do anything useful. In particular, this behavior is only relevant to the v0 protocol, but these days we ask for the v2 protocol by default. So for modern jgit, this is probably testing nothing. - it's complicated and slow. We had to do some fifo trickery to handle races, and this one test makes up 40% of the runtime of the total script. Instead, let's just hard-code the response that's of interest to us. That will test exactly what we want for every run, and reveals the bug when run in sha256 mode. And of course we'll fix the actual bug by using the correct hash_algo struct. 
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
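The sha1/sha256 confusion in the commit above can be sketched in Python under simplifying assumptions: the line is modeled as `<oid> capabilities^{}` without the trailing NUL and capability list, and the names `HEX_LEN`, `null_oid_hex`, and `is_capabilities_line` are hypothetical. The point is only that the null object ID must be built with the algorithm the reader parsed the line with, not with the local repository default.

```python
# Hex digits per object ID for each hash algorithm.
HEX_LEN = {"sha1": 40, "sha256": 64}


def null_oid_hex(algo):
    """All-zero object ID for the given hash algorithm."""
    return "0" * HEX_LEN[algo]


def is_capabilities_line(line, reader_algo):
    """True for '<null-oid> capabilities^{}' in the reader's algorithm."""
    oid, _, name = line.partition(" ")
    return name == "capabilities^{}" and oid == null_oid_hex(reader_algo)


if __name__ == "__main__":
    line_from_sha1_server = "0" * 40 + " capabilities^{}"
    # Comparing against the connection's algorithm matches ...
    print(is_capabilities_line(line_from_sha1_server, "sha1"))
    # ... while comparing against a sha256 local default does not.
    print(is_capabilities_line(line_from_sha1_server, "sha256"))
```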
Jeff King	e6c4309748	t5512: stop referring to "v1" protocol There really isn't a "v1" Git protocol. It's just v0 with an extra probe which we used to test compatibility in preparation for v2. Any tests that are looking for before/after behavior for v2 really care about "v0". Mentioning "v1" in these tests is just making things more confusing, because we don't care about that probe; we're really testing v0. So let's say so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	aa962fef27	v0 protocol: fix infinite loop when parsing multi-valued capabilities If Git's client-side parsing of an upload-pack response (so git-fetch or ls-remote) sees multiple instances of a single capability, it can enter an infinite loop due to a bug in advancing the "offset" parameter in the parser. This bug can't happen between a client and server of the same Git version. The client bug is in parse_feature_value() when the caller passes in an offset parameter. And that only happens when the v0 protocol is parsing "symref" and "object-format" capabilities, via next_server_feature_value(). But Git has never produced multiple object-format capabilities, and it stopped producing multiple symref values in `d007dbf7d6` (Revert "upload-pack: send non-HEAD symbolic refs", 2013-11-18). However, upload-pack did produce multiple symref entries for a while, and they are valid. Plus other implementations, such as Dulwich will still do so. So we should handle them. And even if we do not expect it, it is obviously a bug for the parser to enter an infinite loop. The bug itself is pretty simple. Commit `2c6a403d96` (connect: add function to parse multiple v1 capability values, 2020-05-25) added the "offset" parameter, which is used as both an in- and out-parameter. When parsing the first "symref" capability, offset will be 0 on input, and after parsing the capability, we set offset to an index just past the value by taking a pointer difference "(value + end) - feature_list". But on the second call, now offset is set to that larger index, which lets us skip past the first "symref" capability. However, we do so by incrementing feature_list. That means our pointer difference is now too small; it is counting from where we resumed parsing, not from the start of the original feature_list pointer. And because we incremented feature_list only inside our function, and not the caller, that increment is lost next time the function is called. 
One solution would be to account for those skipped bytes by incrementing offset, rather than assigning to it. But wait, there's more! We also increment feature_list if we have a near-miss. Say we are looking for "symref" and find "almost-symref". In that case we'll point feature_list to the "y" in "almost-symref" and restart our search. But that again means our offset won't be correct, as it won't account for the bytes between the start of the string and that "y". So instead, let's just record the beginning of the feature_list string in a separate pointer that we never touch. That offset we take in and return is meant to be using that point as a base, and now we'll do so consistently. Since the bug can't be reproduced using the current version of git-upload-pack, we'll instead hard-code an input which triggers the problem. Before this patch it loops forever re-parsing the second symref entry. Now we check both that it finishes, and that it parses both entries correctly (a case we could not test at all before). We don't need to worry about testing v2 here; it communicates the capabilities in a completely different way, and doesn't use this code at all. There are tests earlier in t5512 that are meant to cover this (they don't, but we'll address that in a future patch). Reported-by: Jonas Haag <jonas@lophus.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
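The approach chosen in the commit above (always measure the offset from the untouched start of the capability string) can be sketched in Python. This is an illustrative model, not the C code from `connect.c`: the capability list is assumed to be a space-separated string of `name` or `name=value` entries, and `all_feature_values` is a hypothetical helper.

```python
def parse_feature_value(feature_list, feature, offset=0):
    """Find `feature` in a space-separated capability string.

    Returns (value, next_offset) or None. Both offsets are measured from
    the start of `feature_list`, which is never advanced.
    """
    pos = offset
    while True:
        found = feature_list.find(feature, pos)
        if found < 0:
            return None
        end = found + len(feature)
        at_start = found == 0 or feature_list[found - 1] == " "
        if at_start and (end == len(feature_list) or feature_list[end] in " ="):
            if end < len(feature_list) and feature_list[end] == "=":
                value_end = feature_list.find(" ", end)
                if value_end < 0:
                    value_end = len(feature_list)
                return feature_list[end + 1:value_end], value_end
            return "", end
        # Near miss such as "almost-symref": resume one character later.
        pos = found + 1


def all_feature_values(feature_list, feature):
    """Collect every value of a repeated capability."""
    values, offset = [], 0
    while (hit := parse_feature_value(feature_list, feature, offset)) is not None:
        value, offset = hit  # offset strictly increases, so the loop ends
        values.append(value)
    return values


if __name__ == "__main__":
    caps = "multi_ack almost-symref=x symref=HEAD:refs/heads/main symref=refs/tags/a:refs/tags/b"
    print(all_feature_values(caps, "symref"))
```

Because the returned offset always points past the entry just parsed, relative to the same base string, a second `symref` entry is reached instead of being re-parsed forever.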
Robin Jarry	3c8d3adeae	send-email: export patch counters in validate environment When sending patch series (with a cover-letter or not) sendemail-validate is called with every email/patch file independently from the others. When one of the patches depends on a previous one, it may not be possible to use this hook in a meaningful way. A hook that wants to check some property of the whole series needs to know which patch is the final one. Expose the current and total number of patches to the hook via the GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment variables so that both incremental and global validation is possible. Sharing any other state between successive invocations of the validate hook must be done via external means. For example, by storing it in a git config sendemail.validateWorktree entry. Add a sample script with placeholder validations and update tests to check that the counters are properly exported. Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Robin Jarry <robin@jarry.cc> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:41:15 -07:00
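A hook that only wants to run a whole-series check on the final patch could use the exported counters roughly as follows. This is a sketch under assumptions: the variable names are taken from the commit message above, the `FILE` spelling is tried as a fallback because the githooks documentation may name the variables differently for your Git version, the counter is assumed to be 1-based, and `series_position` and `is_last_patch` are hypothetical helpers.

```python
import os


def series_position(env=os.environ):
    """Return (counter, total) as exported to sendemail-validate, or None.

    Variable names follow the commit message; the FILE_ fallback is an
    assumption to verify against githooks(5) for the installed Git.
    """
    for kind in ("PATCH", "FILE"):
        counter = env.get(f"GIT_SENDEMAIL_{kind}_COUNTER")
        total = env.get(f"GIT_SENDEMAIL_{kind}_TOTAL")
        if counter is not None and total is not None:
            return int(counter), int(total)
    return None


def is_last_patch(env=os.environ):
    """True when the hook is running for the final email of the series."""
    position = series_position(env)
    return position is not None and position[0] == position[1]


if __name__ == "__main__":
    if is_last_patch():
        print("last patch: run whole-series validation here")
    else:
        print("per-patch validation only")
```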
Patrick Steinhardt	d85cd18777	repack: disable writing bitmaps when doing a local repack In order to write a bitmap, we need to have full coverage of all objects that are about to be packed. In the traditional non-multi-pack-index world this meant we need to do a full repack of all objects into a single packfile. But in the new multi-pack-index world we can get away with writing bitmaps when we have multiple packfiles as long as the multi-pack-index covers all objects. This is not always the case though. When asked to perform a repack of local objects, only, then we cannot guarantee to have full coverage of all objects regardless of whether we do a full repack or a repack with a multi-pack-index. The end result is that writing the bitmap will fail in both worlds: $ git multi-pack-index write --stdin-packs --bitmap <packfiles warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing) error: could not write multi-pack bitmap Now there are two different ways to fix this. The first one would be to amend git-multi-pack-index(1) to disable writing bitmaps when we notice that we don't have full object coverage. - We don't have enough information in git-multi-pack-index(1) in order to tell whether the local repository _should_ have full coverage. Because even when connected to an alternate object directory, it may be the case that we still have all objects around in the main object database. - git-multi-pack-index(1) is quite a low-level tool. Automatically disabling functionality that it was asked to provide does not feel like the right thing to do. We can easily fix it at a higher level in git-repack(1) though. When asked to only include local objects via `-l` and when connected to an alternate object directory then we will override the user's ask and disable writing bitmaps with a warning. 
This is similar to what we do in git-pack-objects(1), where we also disable writing bitmaps in case we omit an object from the pack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	932c16c04b	repack: honor `-l` when calculating pack geometry When the user passes `-l` to git-repack(1), then they essentially ask us to only repack objects part of the local object database while ignoring any packfiles part of an alternate object database. And we in fact honor this bit when doing a geometric repack as the resulting packfile will only ever contain local objects. What we're missing though is that we don't take locality of packfiles into account when computing whether the geometric sequence is intact or not. So even though we would only ever roll up local packfiles anyway, we could end up trying to repack because of non-local packfiles. This does not make much sense, and in the worst case it can cause us to try and do the geometric repack over and over again because we're never able to restore the geometric sequence. Fix this bug by honoring whether the user has passed `-l`. If so, we skip adding any non-local packfiles to the pack geometry. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
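Why non-local packs can spuriously break the geometric sequence is easier to see with a toy model. The sketch below is a simplification, assuming only that each successive pack must hold at least `factor` times as many objects as the previous one; Git's real split computation is more involved, and the `Pack` type and both functions are hypothetical.

```python
from typing import NamedTuple


class Pack(NamedTuple):
    name: str
    num_objects: int
    is_local: bool


def geometry_packs(packs, local_only):
    """Packs considered for the geometric sequence, smallest first."""
    chosen = [p for p in packs if p.is_local or not local_only]
    return sorted(chosen, key=lambda p: p.num_objects)


def is_geometric(packs, factor=2):
    """True if each pack has at least `factor` times the previous pack's objects."""
    return all(b.num_objects >= factor * a.num_objects
               for a, b in zip(packs, packs[1:]))


if __name__ == "__main__":
    packs = [
        Pack("local-small", 10, True),
        Pack("local-large", 40, True),
        Pack("alternate", 30, False),  # borrowed pack, never rolled up with -l
    ]
    # Counting the alternate's pack makes the sequence look broken.
    print(is_geometric(geometry_packs(packs, local_only=False)))
    # Skipping it, as the fix does for `-l`, leaves an intact sequence.
    print(is_geometric(geometry_packs(packs, local_only=True)))
```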
Patrick Steinhardt	19a3a7bde9	t/helper: allow chmtime to print verbosely without modifying mtime The `test-tool chmtime` helper allows us to both read and modify the modification time of files. But while it is possible to only read the mtimes of a file via `--get`, it is not possible to read the mtimes and report them together with their respective file paths via the `--verbose` flag without also modifying the mtime at the same time. Fix this so that it is possible to call `test-tool chmtime --verbose <files>...` without modifying any mtimes. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	f3028418c3	pack-objects: extend test coverage of `--stdin-packs` with alternates We don't have any tests that verify that git-pack-objects(1) works with `--stdin-packs` when combined with alternate object directories. Add some to make sure that the basic functionality works as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	752b465c3c	pack-objects: fix error when same packfile is included and excluded When passing the same packfile both as included and excluded via the `--stdin-packs` option, then we will return an error because the excluded packfile cannot be found. This is because we will only set the `util` pointer for the included packfile list if it was found, so that we later die when we notice that it's in fact not set for the excluded packfile list. Fix this bug by always setting the `util` pointer for both the included and excluded list entries. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	732194b5f2	pack-objects: fix error when packing same pack twice When passed the same packfile twice via `--stdin-packs` we return an error that the packfile supposedly was not found. This is because when reading packs into the list of included or excluded packfiles, we will happily re-add packfiles even if they are part of the lists already. And while the list can now contain duplicates, we will only set the `util` pointer of the first list entry to the `packed_git` structure. We notice that at a later point when checking that all list entries have their `util` pointer set and die with an error. While this is kind of a nonsensical request, this scenario can be hit when doing geometric repacks. When a repository is connected to an alternate object directory and both have the exact same packfile then both would get added to the geometric sequence. And when we then decide to perform the repack, we will invoke git-pack-objects(1) with the same packfile twice. Fix this bug by removing any duplicates from both the included and excluded packs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	b7b8f048f5	pack-objects: split out `--stdin-packs` tests into separate file The test suite for git-pack-objects(1) is quite huge, and we're about to add more tests that relate to the `--stdin-packs` option. Split out all tests related to this option into a standalone file so that it becomes easier to test the feature in isolation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	51861340f8	repack: fix generating multi-pack-index with only non-local packs When writing the multi-pack-index with geometric repacking we will add all packfiles to the index that are part of the geometric sequence. This can potentially also include packfiles borrowed from an alternate object directory. But given that a multi-pack-index can only ever include packs that are part of the main object database this does not make much sense whatsoever. In the edge case where all packfiles are contained in the alternate object database and the local repository has none itself this bug can cause us to invoke git-multi-pack-index(1) with only non-local packfiles that it ultimately cannot find. This causes it to return an error and thus causes the geometric repack to fail. Fix the code to skip non-local packfiles. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	3d74a2337c	repack: fix trying to use preferred pack in alternates When doing a geometric repack with multi-pack-indices, then we ask git-multi-pack-index(1) to use the largest packfile as the preferred pack. It can happen though that the largest packfile is not part of the main object database, but instead part of an alternate object database. The result is that git-multi-pack-index(1) will not be able to find the preferred pack and print a warning. It then falls back to use the first packfile that the multi-pack-index shall reference. Fix this bug by only considering packfiles as preferred pack that are local. This is the right thing to do given that a multi-pack-index should never reference packfiles borrowed from an alternate. While at it, rename the function `get_largest_active_packfile()` to `get_preferred_pack()` to better document its intent. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	ceb96a160b	midx: fix segfault with no packs and invalid preferred pack When asked to write a multi-pack-index the user can specify a preferred pack that is used as a tie breaker when multiple packs contain the same objects. When this packfile cannot be found, we just pick the first pack that is getting tracked by the newly written multi-pack-index as a fallback. Picking the fallback can fail in the case where we're asked to write a multi-pack-index with no packfiles at all: picking the fallback value will cause a segfault as we blindly index into the array of packfiles, which would be empty. Fix this bug by resetting the preferred packfile index to `-1` before searching for the preferred pack. This fixes the segfault as we already check for whether the index is `> -1`. If it is not, we simply don't pick a preferred packfile at all. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Øystein Walle	aabfdc9514	branch, for-each-ref, tag: add option to omit empty lines If the given format string expands to the empty string, a newline is still printed. This makes using the output linewise more tedious. For example, git update-ref --stdin does not accept empty lines. Add options to "git branch", "git for-each-ref", and "git tag" to not print these empty lines. The default behavior remains the same. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 08:07:45 -07:00
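The behavior change in the commit above amounts to a small filter on the expanded format strings. The Python sketch below models it with a hypothetical `render_refs` function; it does not call Git and makes no claim about how the option is implemented.

```python
def render_refs(expanded, omit_empty=False):
    """Join per-ref format expansions, one per line.

    With omit_empty=True, refs whose format expanded to the empty string
    produce no output line at all.
    """
    return "".join(line + "\n" for line in expanded if line or not omit_empty)


if __name__ == "__main__":
    # Assume a conditional format that expands to "" for the second ref.
    expansions = ["delete refs/heads/topic-a", "", "delete refs/heads/topic-b"]
    print(repr(render_refs(expansions)))                   # default: blank line kept
    print(repr(render_refs(expansions, omit_empty=True)))  # safe for line-wise consumers
```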
Taylor Blau	9f7f10a282	t: invert `GIT_TEST_WRITE_REV_INDEX` Back in `e8c58f894b` (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we added a test knob to conditionally enable writing a ".rev" file when indexing a pack. At the time, this was used to ensure that the test suite worked even when ".rev" files were written, which served as a stress-test for the on-disk reverse index implementation. Now that reading from on-disk ".rev" files is enabled by default, the test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning. We could get rid of the option entirely, but there would be no convenient way to test Git when ".rev" files aren't in place. Instead of getting rid of the option, invert its meaning to instead disable writing ".rev" files, thereby running the test suite in a mode where the reverse index is generated from scratch. This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some spelling of "true", we are still running and exercising Git's behavior when forced to generate reverse indexes from scratch. Do so by setting it in the linux-TEST-vars CI run to ensure that we are maintaining good coverage of this now-legacy code. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
Taylor Blau	a8dd7e05b1	config: enable `pack.writeReverseIndex` by default Back in `e37d0b8730` (builtin/index-pack.c: write reverse indexes, 2021-01-25), Git learned how to read and write a pack's reverse index from a file instead of in-memory. A pack's reverse index is a mapping from pack position (that is, the order that objects appear together in a ".pack") to their position in lexical order (that is, the order that objects are listed in an ".idx" file). Reverse indexes are consulted often during pack-objects, as well as during auxiliary operations that require mapping between pack offsets, pack order, and index order. They are useful in GitHub's infrastructure, where we have seen a dramatic increase in performance when writing ".rev" files[1]. In particular: - an ~80% reduction in the time it takes to serve fetches on a popular repository, Homebrew/homebrew-core. - a ~60% reduction in the peak memory usage to serve fetches on that same repository. - a collective savings of ~35% in CPU time across all pack-objects invocations serving fetches across all repositories in a single datacenter. Reverse indexes are also beneficial to end-users as well as forges. 
For example, the time it takes to generate a pack containing the objects for the 10 most recent commits in linux.git (representing a typical push) is significantly faster when on-disk reverse indexes are available: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 543.0 ms ± 20.3 ms [User: 616.2 ms, System: 58.8 ms] Range (min … max): 521.0 ms … 577.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 245.0 ms ± 11.4 ms [User: 335.6 ms, System: 31.3 ms] Range (min … max): 226.0 ms … 259.6 ms 13 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' The same is true of writing a pack containing the objects for the 30 most-recent commits: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 866.5 ms ± 16.2 ms [User: 1414.5 ms, System: 97.0 ms] Range (min … max): 839.3 ms … 886.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 581.6 ms ± 10.2 ms [User: 1181.7 ms, System: 62.6 ms] Range (min … max): 567.5 ms … 599.3 ms 10 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects 
--delta-base-offset --revs --stdout <in >/dev/null' ran 1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ...and savings on trivial operations like computing the on-disk size of a single (packed) object are even more dramatic: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 305.8 ms ± 11.4 ms [User: 264.2 ms, System: 41.4 ms] Range (min … max): 290.3 ms … 331.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 4.0 ms ± 0.3 ms [User: 1.7 ms, System: 2.3 ms] Range (min … max): 1.6 ms … 4.6 ms 1155 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' In the more than two years since `e37d0b8730` was merged, Git's implementation of on-disk reverse indexes has been thoroughly tested, both from users enabling `pack.writeReverseIndex`, and from GitHub's deployment of the feature. The latter has been running without incident for more than two years. This patch changes Git's behavior to write on-disk reverse indexes by default when indexing a pack, which should make the above operations faster for everybody's Git installation after a repack. (The previous commit explains some potential drawbacks of using on-disk reverse indexes in certain limited circumstances, that essentially boil down to a trade-off between time to generate, and time to access. For those limited cases, the `pack.readReverseIndex` escape hatch can be used). 
[1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
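The mapping that the commit above calls a reverse index can be shown on a tiny example. This Python sketch assumes the ".idx" data is available as a list of pack offsets in lexical (index) order; it uses Python's built-in sort rather than the radix sort Git uses, and `build_reverse_index` is a hypothetical name.

```python
def build_reverse_index(idx_offsets):
    """Map pack position -> index position.

    `idx_offsets[i]` is the pack offset of the i-th object in ".idx"
    (lexical) order. The result lists index positions sorted by offset,
    i.e. in the order the objects appear in the ".pack".
    """
    return sorted(range(len(idx_offsets)), key=idx_offsets.__getitem__)


if __name__ == "__main__":
    # Three objects in lexical order, stored at these pack offsets.
    idx_offsets = [300, 12, 150]
    revindex = build_reverse_index(idx_offsets)
    # Pack position 0 is index position 1 (offset 12), and so on.
    print(revindex)
```

Writing this array to a ".rev" file trades the one-time cost of sorting for a per-lookup read from disk, which is the trade-off the message refers to.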
Taylor Blau	dbcf611617	pack-revindex: introduce `pack.readReverseIndex` Since `1615c567b8` (Documentation/config/pack.txt: advertise 'pack.writeReverseIndex', 2021-01-25), we have had the `pack.writeReverseIndex` configuration option, which tells Git whether or not it is allowed to write a ".rev" file when indexing a pack. Introduce a complementary configuration knob, `pack.readReverseIndex` to control whether or not Git will read any ".rev" file(s) that may be available on disk. This option is useful for debugging, as well as disabling the effect of ".rev" files in certain instances. This is useful because of the trade-off[^1] between the time it takes to generate a reverse index (slow from scratch, fast when reading an existing ".rev" file), and the time it takes to access a record (the opposite). For example, even though it is faster to use the on-disk reverse index when computing the on-disk size of a packed object, it is slower to enumerate the same value for all objects. Here are a couple of examples from linux.git. 
When computing the above for a single object, using the on-disk reverse index is significantly faster: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 302.5 ms ± 12.5 ms [User: 258.7 ms, System: 43.6 ms] Range (min … max): 291.1 ms … 328.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 3.9 ms ± 0.3 ms [User: 1.6 ms, System: 2.4 ms] Range (min … max): 2.0 ms … 4.4 ms 801 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 77.29 ± 7.14 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' , but when instead trying to compute the on-disk object size for all objects in the repository, using the ".rev" file is a disadvantage over creating the reverse index from scratch: $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 8.258 s ± 0.035 s [User: 7.949 s, System: 0.308 s] Range (min … max): 8.199 s … 8.293 s 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 16.976 s ± 0.107 s [User: 16.706 s, System: 0.268 s] Range (min … max): 16.839 s … 17.105 s 10 runs Summary 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' ran 2.06 ± 0.02 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' Luckily, the results when running `git 
cat-file` with `--unordered` are closer together: $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 5.066 s ± 0.105 s [User: 4.792 s, System: 0.274 s] Range (min … max): 4.943 s … 5.220 s 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 6.193 s ± 0.069 s [User: 5.937 s, System: 0.255 s] Range (min … max): 6.145 s … 6.356 s 10 runs Summary 'git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' ran 1.22 ± 0.03 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' Because the equilibrium point between these two is highly machine- and repository-dependent, allow users to configure whether or not they will read any ".rev" file(s) with this configuration knob. [^1]: Generating a reverse index in memory takes O(N) time (where N is the number of objects in the repository), since we use a radix sort. Reading an entry from an on-disk ".rev" file is slower since each operation is bound by disk I/O instead of memory I/O. In order to compute the on-disk size of a packed object, we need to find the offset of our object, and the adjacent object (the on-disk size difference of these two). Finding the first offset requires a binary search. Finding the latter involves a single .rev lookup. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
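The footnote's description of computing an object's on-disk size (a binary search for the object's own pack position, then one reverse-index lookup for its neighbor) can be sketched in Python. The data layout is an assumption for illustration: `idx_offsets` holds pack offsets in ".idx" order, `revindex` maps pack position to index position, and `pack_end` is a hypothetical parameter for the offset where object data ends.

```python
def pack_position(idx_offsets, revindex, offset):
    """Binary search the reverse index for the object stored at `offset`."""
    lo, hi = 0, len(revindex)
    while lo < hi:
        mid = (lo + hi) // 2
        mid_offset = idx_offsets[revindex[mid]]
        if mid_offset == offset:
            return mid
        if mid_offset < offset:
            lo = mid + 1
        else:
            hi = mid
    raise KeyError(offset)


def object_size_on_disk(idx_offsets, revindex, index_pos, pack_end):
    """Bytes between this object's offset and the next object's offset."""
    offset = idx_offsets[index_pos]
    pos = pack_position(idx_offsets, revindex, offset)
    if pos + 1 < len(revindex):
        # A single reverse-index lookup yields the adjacent object.
        next_offset = idx_offsets[revindex[pos + 1]]
    else:
        next_offset = pack_end  # last object: data runs to the end marker
    return next_offset - offset


if __name__ == "__main__":
    idx_offsets = [300, 12, 150]
    revindex = sorted(range(len(idx_offsets)), key=idx_offsets.__getitem__)
    for index_pos in range(len(idx_offsets)):
        print(index_pos, object_size_on_disk(idx_offsets, revindex, index_pos, 420))
```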
Taylor Blau	b77919ed6e	t5325: mark as leak-free This test is leak-free as of the previous commit, so let's mark it as such to ensure we don't regress and introduce a leak in the future. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:45 -07:00
Junio C Hamano	063cd850f2	Merge branch 'jk/use-perl-path-consistently' Tests had a few places where we ignored PERL_PATH and blindly used /usr/bin/perl, which have been corrected. * jk/use-perl-path-consistently: t/lib-httpd: pass PERL_PATH to CGI scripts	2023-04-11 13:49:13 -07:00
Junio C Hamano	96f4113ac0	Merge branch 'jc/clone-object-format-from-void' "git clone" from an empty repository learned to propagate the choice of the hash algorithm from the source repository to the newly created repository. * jc/clone-object-format-from-void: clone: propagate object-format when cloning from void	2023-04-11 13:49:13 -07:00
Junio C Hamano	30e04bcfa8	Merge branch 'ar/adjust-tests-for-the-index-fallout' Comment updates. * ar/adjust-tests-for-the-index-fallout: t2107: fix mention of the_index.cache_changed t3060: fix mention of function prune_index	2023-04-11 13:49:12 -07:00
Junio C Hamano	647a2bb3ff	Merge branch 'jc/spell-id-in-both-caps-in-message-id' Consistently spell "Message-ID" as such, not "Message-Id". * jc/spell-id-in-both-caps-in-message-id: e-mail workflow: Message-ID is spelled with ID in both capital letters	2023-04-11 13:49:12 -07:00
Junio C Hamano	d02343b599	Merge branch 'ws/sparse-check-rules' "git sparse-checkout" command learns a debugging aid for the sparse rule definitions. * ws/sparse-check-rules: builtin/sparse-checkout: add check-rules command builtin/sparse-checkout: remove NEED_WORK_TREE flag	2023-04-11 13:49:12 -07:00
Elijah Newren	e93fc5d721	treewide: remove cache.h inclusion due to object-name.h changes Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	dabab1d6e6	object-name.h: move declarations for object-name.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	5579f44d2f	treewide: remove unnecessary cache.h inclusion Several files were including cache.h solely to get other headers, such as trace.h and trace2.h. Since the last few commits have modified files to make these dependencies more explicit, the inclusion of cache.h is no longer needed in several cases. Remove it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	5bc07225e5	treewide: be explicit about dependence on mem-pool.h Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	74ea5c9574	treewide: be explicit about dependence on trace.h & trace2.h Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:08 -07:00
Glen Choo	4e33535ea9	clone: error specifically with --local and symlinked objects `6f054f9fb3` (builtin/clone.c: disallow --local clones with symlinks, 2022-07-28) gives a good error message when "git clone --local" fails when the repo to clone has symlinks in "$GIT_DIR/objects". In `bffc762f87` (dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS, 2023-01-24), we later extended this restriction to the case where "$GIT_DIR/objects" is itself a symlink, but we didn't update the error message then - bffc762f87's tests show that we print a generic "failed to start iterator over" message. This is exacerbated by the fact that Documentation/git-clone.txt mentions neither restriction, so users are left wondering if this is intentional behavior or not. Fix this by adding a check to builtin/clone.c: when doing a local clone, perform an extra check to see if "$GIT_DIR/objects" is a symlink, and if so, assume that that was the reason for the failure and report the relevant information. Ideally, dir_iterator_begin() would tell us that the real failure reason is the presence of the symlink, but (as far as I can tell) there isn't an appropriate errno value for that. Also, update Documentation/git-clone.txt to reflect that this restriction exists. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:46:09 -07:00
Andrei Rybak	fd72637423	t2024: fix loose/strict local base branch DWIM test Test 'loosely defined local base branch is reported correctly' in t2024-checkout-dwim.sh, which was introduced in [1], compares the output of two invocations of "git checkout", invoked with two different branches named "strict" and "loose". As per the description in [1], the test validates the output of tracking information for these two branches. This tracking information is printed to standard output: Your branch is behind 'main' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) The test assumes that the names of the two branches (strict and loose) are in that output, and pipes the output through sed to replace names of the branches with "BRANCHNAME". Command "git checkout", however, outputs the branch name to standard error, not standard output -- see message "Switched to branch '%s'\n" in function "update_refs_for_switch" in "builtin/checkout.c". This means that the two invocations of sed do nothing. Redirect both the standard output and the standard error of "git checkout" for these assertions. Ensure that compared files have the string "BRANCHNAME". In a series of piped commands, only the return code of the last command is used. Thus, all other commands will have their return codes masked. Avoid piping of output of git directly into sed to preserve the exit status code of "git checkout", while we're here. [1] `05e73682cd` (checkout: report upstream correctly even with loosely defined branch.*.merge, 2014-10-14) Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 10:11:23 -07:00
Phillip Wood	05106aa198	rebase: remove a couple of redundant strategy tests Remove a test in t3402 that has been redundant ever since `80ff47957b` (rebase: remember strategy and strategy options, 2011-02-06). That commit added a new test, the first part of which (as noted in the old commit message) duplicated an existing test. Also remove a test t3418 that has been redundant since the merge backend was removed in `68aa495b59` (rebase: implement --merge via the interactive machinery, 2018-12-11), since it now tests the same code paths as the preceding test. Helped-by: Elijah Newren <newren@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00
Phillip Wood	4960e5c7bd	rebase -m: fix serialization of strategy options To store the strategy options rebase prepends " --" to each one and writes them to a file. To load them it reads the file and passes the contents to split_cmdline(). This roughly mimics the behavior of the scripted rebase but has a couple of limitations: (1) options containing whitespace are not properly preserved (this is true of the scripted rebase as well) and (2) options containing '"' or '\' are incorrectly parsed and may cause the parser to return an error. Fix these limitations by quoting each option when they are stored so that they can be parsed correctly. Now that "--preserve-merges" no longer exists, this change also stops prepending "--" to the options when they are stored as that was an artifact of the scripted rebase. These changes are backwards compatible so the files written by an older version of git can still be read. They are also forwards compatible: the file can still be parsed by recent versions of git as they treat the "--" prefix as optional. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00
Phillip Wood	4a8bc9860a	rebase -m: cleanup --strategy-option handling When handling "--strategy-option" rebase collects the commands into a struct string_list, then concatenates them into a string, prepending "--" to each one before splitting the string and removing the "--" prefix. This is an artifact of the scripted rebase and the need to support "rebase --preserve-merges". Now that "--preserve-merges" no longer exists we can clean up the way the argument is handled. The tests for a bad strategy option are adjusted now that parse_strategy_opts() is no longer called when starting a rebase. The fact that it only errors out when running "git rebase --continue" is a mixed blessing but the next commit will fix the root cause of the parsing problem so let's not worry about that here. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00
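The quote-on-store, split-on-load idea in 4960e5c7bd can be illustrated with a small round trip. This is only a sketch: Python's `shlex` stands in for Git's own quoting and `split_cmdline()`, whose exact rules differ, and the function names are hypothetical.

```python
import shlex

def store_strategy_opts(opts):
    """Serialize options, quoting each one so whitespace, '"' and '\\' survive."""
    return " ".join(shlex.quote(opt) for opt in opts)

def load_strategy_opts(text):
    """Parse a stored line; a leading "--" (older file format) is optional."""
    words = shlex.split(text)
    return [w[2:] if w.startswith("--") else w for w in words]

OPTS = ["ours", "diff-algorithm=histogram", 'odd "quoted" value', "back\\slash"]
```

Here `load_strategy_opts(store_strategy_opts(OPTS))` gives back `OPTS` unchanged, and a line written in the older style, such as `" --ours --theirs"`, still loads as `["ours", "theirs"]`.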
René Scharfe	8a7f0b666f	date: remove approxidate_relative() When `29f4332e66` (Quit passing 'now' to date code, 2019-09-11) removed its timeval parameter, approxidate_relative() became equivalent to approxidate(). Convert its last two call sites and remove the redundant function. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 08:46:40 -07:00
René Scharfe	be39144954	userdiff: support regexec(3) with multi-byte support Since `1819ad327b` (grep: fix multibyte regex handling under macOS, 2022-08-26) we use the system library for all regular expression matching on macOS, not just for git grep. It supports multi-byte strings and rejects invalid multi-byte characters. This broke all built-in userdiff word regexes in UTF-8 locales because they all include such invalid bytes in expressions that are intended to match multi-byte characters without explicit support for that from the regex engine. "\|[^[:space:]]\|[\xc0-\xff][\x80-\xbf]+" is added to all built-in word regexes to match a single non-space or multi-byte character. The \xNN characters are invalid if interpreted as UTF-8 because they have their high bit set, which indicates they are part of a multi-byte character, but they are surrounded by single-byte characters. Replace that expression with "\|[^[:space:]]" if the regex engine supports multi-byte matching, as there is no need to have an explicit range for multi-byte characters then. Check for that capability at runtime, because it depends on the locale and thus on environment variables. Construct the full replacement expression at build time and just switch it in if necessary to avoid string manipulation and allocations at runtime. Additionally the word regex for tex contains the expression "[a-zA-Z0-9\x80-\xff]+" with a similarly invalid range. The best replacement with only valid characters that I can come up with is "([a-zA-Z0-9]\|[^\x01-\x7f])+". Unlike the original it matches NUL characters, though. Assuming that tex files usually don't contain NUL this should be acceptable. Reported-by: D. Ben Knoble <ben.knoble@gmail.com> Reported-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-07 07:38:09 -07:00
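A short sketch of the two matching modes described in be39144954. Python's `re` is used as a stand-in for regexec(3), so two caveats apply: Python alternation takes the first matching branch rather than the longest match, so the multi-byte branch is listed first here, and `\s` replaces `[[:space:]]`. All names are hypothetical.

```python
import re

# Byte-oriented fallback: a lead byte 0xc0-0xff followed by continuation bytes
# 0x80-0xbf is treated as one "character"; otherwise any single non-space byte.
BYTEWISE = re.compile(rb"[\xc0-\xff][\x80-\xbf]+|[^\s]")

# Multi-byte-aware engine: one non-space character is enough.
MULTIBYTE = re.compile(r"[^\s]")

def is_valid_utf8(data):
    """A lone high-bit byte is not valid UTF-8, which is why a
    multi-byte-aware engine rejects the byte-range expression."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def words_bytewise(data):
    return BYTEWISE.findall(data)

def words_multibyte(text):
    return MULTIBYTE.findall(text)
```

Both modes split `"\u00e9 x"` into the same two tokens; the difference is whether the engine or the pattern is responsible for keeping the two bytes of `\u00e9` together.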
Junio C Hamano	89833fc249	Merge branch 'ds/fetch-bundle-uri-with-all' "git fetch --all" does not have to download and handle the same bundleURI over and over, which has been corrected. * ds/fetch-bundle-uri-with-all: fetch: download bundles once, even with --all	2023-04-06 13:38:32 -07:00
Junio C Hamano	0b94009649	Merge branch 'jk/chainlint-fixes' Test framework fix. * jk/chainlint-fixes: tests: skip test_eval_ in internal chain-lint tests: drop here-doc check from internal chain-linter tests: diagnose unclosed here-doc in chainlint.pl tests: replace chainlint subshell with a function tests: run internal chain-linter under "make test"	2023-04-06 13:38:31 -07:00
Junio C Hamano	6047b28eb7	Merge branch 'en/header-split-cleanup' Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers	2023-04-06 13:38:31 -07:00
Junio C Hamano	72871b198f	Merge branch 'ab/remove-implicit-use-of-the-repository' Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository " argument, not "the_repository" post-cocci: adjust comments for recent repo_ migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"	2023-04-06 13:38:30 -07:00
Junio C Hamano	06e9e726d4	Merge branch 'gc/config-parsing-cleanup' Config API clean-up to reduce its dependence on static variables. * gc/config-parsing-cleanup: config.c: rename "struct config_source cf" config: report cached filenames in die_bad_number() config.c: remove current_parsing_scope config.c: remove current_config_kvi config.c: plumb the_reader through callbacks config.c: create config_reader and the_reader config.c: don't assign to "cf_global" directly config.c: plumb config_source through static fns	2023-04-06 13:38:29 -07:00
Junio C Hamano	87daf40750	Merge branch 'ab/config-multi-and-nonbool' Assorted config API updates. * ab/config-multi-and-nonbool: for-each-repo: with bad config, don't conflate <path> and <cmd> config API: add "string" version of _value_multi(), fix segfaults config API users: test for _get_value_multi() segfaults for-each-repo: error on bad --config config API: have _multi() return an "int" and take a "dest" versioncmp.c: refactor config reading next commit config API: add and use a "git_config_get()" family of functions config tests: add "NULL" tests for _get_value_multi() config tests: cover blind spots in git_die_config() tests	2023-04-06 13:38:29 -07:00
Junio C Hamano	955abf5f72	Merge branch 'jk/unused-post-2.40-part2' Code clean-up for "-Wunused-parameter" build. * jk/unused-post-2.40-part2: parse-options: drop parse_opt_unknown_cb() t/helper: mark unused argv/argc arguments mark "argv" as unused when we check argc builtins: mark unused prefix parameters builtins: annotate always-empty prefix parameters builtins: always pass prefix to parse_options() fast-import: fix file access when run from subdir	2023-04-06 13:38:28 -07:00
Junio C Hamano	119e82a515	Merge branch 'ps/ahead-behind-truncation-fix' Fix unnecessary truncation of generation numbers used in-core. * ps/ahead-behind-truncation-fix: commit-graph: fix truncated generation numbers	2023-04-06 13:38:27 -07:00
Junio C Hamano	7727da99df	Merge branch 'ds/ahead-behind' "git for-each-ref" learns '%(ahead-behind:<base>)' that computes the distances from a single reference point in the history with a bunch of commits in bulk. * ds/ahead-behind: commit-reach: add tips_reachable_from_bases() for-each-ref: add ahead-behind format atom commit-reach: implement ahead_behind() logic commit-graph: introduce `ensure_generations_valid()` commit-graph: return generation from memory commit-graph: simplify compute_generation_numbers() commit-graph: refactor compute_topological_levels() for-each-ref: explicitly test no matches for-each-ref: add --stdin option	2023-04-06 13:38:21 -07:00
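What an ahead/behind pair means can be shown with a deliberately naive sketch: count the commits reachable from the tip but not the base, and vice versa. This is not the algorithm in commit-reach.c, which walks many tips in bulk and relies on commit-graph generation numbers; the `parents` mapping and commit names are made up for illustration.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen

def ahead_behind(parents, tip, base):
    """ahead: commits only in tip's history; behind: commits only in base's."""
    from_tip = reachable(parents, tip)
    from_base = reachable(parents, base)
    return len(from_tip - from_base), len(from_base - from_tip)

# Hypothetical history: A <- B <- C (base branch), B <- D <- E (topic branch).
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}
```

For this history, `ahead_behind(PARENTS, "E", "C")` reports the topic as two commits ahead and one behind.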
Jeff King	c1917156a0	t/lib-httpd: pass PERL_PATH to CGI scripts As discussed in t/README, tests should aim to use PERL_PATH rather than straight "perl". We usually do this automatically with a "perl" function in test-lib.sh, but a few cases need to be handled specially. One such case is the apply-one-time-perl.sh CGI, which invokes plain "perl". It should be using $PERL_PATH, but to make that work, we must also instruct Apache to pass through the variable. Prior to this patch, doing: mv /usr/bin/perl /usr/bin/my-perl make PERL_PATH=/usr/bin/my-perl test would fail t5702, t5703, and t5616. After this it passes. This is a pretty extreme case, as even if you install perl elsewhere, you'd likely still have it in your $PATH. A more realistic case is that you don't want to use the perl in your $PATH (because it's older, broken, etc) and expect PERL_PATH to consistently override that (since that's what it's documented to do). Removing it completely is just a convenient way of completely breaking it for testing purposes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-06 09:29:43 -07:00
Tao Klerks	42943b950e	mergetool: new config guiDefault supports auto-toggling gui by DISPLAY When no merge.tool or diff.tool is configured or manually selected, the selection of a default tool is sensitive to the DISPLAY variable; in a GUI session a gui-specific tool will be proposed if found, and otherwise a terminal-based one. This "GUI-optimizing" behavior is important because a GUI can make a huge difference to a user's ability to understand and correctly complete a non-trivial conflicting merge. Some time ago the merge.guitool and diff.guitool config options were introduced to enable users to configure both a GUI tool, and a non-GUI tool (with fallback if no GUI tool configured), in the same environment. Unfortunately, the --gui argument introduced to support the selection of the guitool is still explicit. When using configured tools, there is no equivalent of the no-tool-configured "propose a GUI tool if we are in a GUI environment" behavior. As proposed in <xmqqmtb8jsej.fsf@gitster.g>, introduce new configuration options, difftool.guiDefault and mergetool.guiDefault, supporting a special value "auto" which causes the corresponding tool or guitool to be selected depending on the presence of a non-empty DISPLAY value. Also support "true" to say "default to the guitool (unless --no-gui is passed on the commandline)", and "false" as the previous default behavior when these new configuration options are not specified. Signed-off-by: Tao Klerks <tao@klerks.biz> Acked-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-05 21:03:29 -07:00
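The selection rule for `guiDefault` can be summarized in a few lines. This is a sketch of the behavior as the message describes it, not the shell code in Git's mergetool library; the function and parameter names are hypothetical, and only the three literal values "auto", "true", and "false" are handled (other boolean spellings are ignored here).

```python
def use_gui_tool(gui_default, display, cli_gui=None):
    """Decide whether the guitool should be selected.

    gui_default: value of mergetool.guiDefault / difftool.guiDefault, or None
    display:     value of the DISPLAY environment variable, or None
    cli_gui:     True for --gui, False for --no-gui, None if neither was given
    """
    if cli_gui is not None:        # an explicit command-line choice wins
        return cli_gui
    if gui_default == "auto":      # GUI only when DISPLAY is non-empty
        return bool(display)
    if gui_default == "true":
        return True
    return False                   # "false" or unset: previous default behavior
```

For example, `use_gui_tool("auto", ":0")` selects the guitool, while `use_gui_tool("true", ":0", cli_gui=False)` does not because `--no-gui` was passed.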
Junio C Hamano	8b214c2e9d	clone: propagate object-format when cloning from void A user could prepare an empty repository and set it to use SHA256 as the object format. The new repository created by "git clone" from such a repository however would not record that it is expecting objects in the same SHA256 format. This works as expected if the source repository is not empty. Just like we started copying the name of the primary branch from the remote repository even if it is unborn in `3d8314f8` (clone: propagate empty remote HEAD even with other branches, 2022-07-07), lift the code that records the object format out of the block executed only when cloning from an instantiated repository, so that it works also when cloning from an empty repository. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-05 14:17:00 -07:00
Junio C Hamano	45602dd029	Merge branch 'ar/test-cleanup-unused-file-creation' Test clean-up. * ar/test-cleanup-unused-file-creation: t1507: assert output of rev-parse t1404: don't create unused file t1400: assert output of update-ref t1302: don't create unused file t1010: don't create unused files t1006: assert error output of cat-file t1005: assert output of ls-files	2023-04-04 14:28:29 -07:00
Junio C Hamano	62df03c277	Merge branch 'jk/blame-contents-with-arbitrary-commit' "git blame --contents=<file> <rev> -- <path>" used to be forbidden, but now it finds the origins of lines starting at <file> contents through the history that leads to <rev>. * jk/blame-contents-with-arbitrary-commit: blame: allow --contents to work with non-HEAD commit	2023-04-04 14:28:28 -07:00
Junio C Hamano	6dd9d96129	Merge branch 'rs/archive-mtime' Test update. * rs/archive-mtime: t5000: use check_mtime()	2023-04-04 14:28:28 -07:00
Junio C Hamano	9142fce9b0	Merge branch 'ah/rebase-merges-config' Streamline --rebase-merges command line option handling and introduce rebase.merges configuration variable. * ah/rebase-merges-config: rebase: add a config option for --rebase-merges rebase: deprecate --rebase-merges="" rebase: add documentation and test for --no-rebase-merges	2023-04-04 14:28:28 -07:00
Junio C Hamano	7e13d654c2	Merge branch 'jk/fast-export-cleanup' Code clean-up. * jk/fast-export-cleanup: fast-export: drop unused parameter from anonymize_commit_message() fast-export: drop data parameter from anonymous generators fast-export: de-obfuscate --anonymize-map handling fast-export: factor out anonymized_entry creation fast-export: simplify initialization of anonymized hashmaps fast-export: drop const when storing anonymized values	2023-04-04 14:28:27 -07:00
Junio C Hamano	f315a8b609	Merge branch 'js/split-index-fixes' The index files can become corrupt under certain conditions when the split-index feature is in use, especially together with fsmonitor, which have been corrected. * js/split-index-fixes: unpack-trees: take care to propagate the split-index flag fsmonitor: avoid overriding `cache_changed` bits split-index; stop abusing the `base_oid` to strip the "link" extension split-index & fsmonitor: demonstrate a bug	2023-04-04 14:28:27 -07:00
Junio C Hamano	f834089925	Merge branch 'pw/wildmatch-fixes' The wildmatch library code unlearns exponential behaviour it acquired some time ago since it was borrowed from rsync. * pw/wildmatch-fixes: t3070: make chain lint tester happy wildmatch: hide internal return values wildmatch: avoid undefined behavior wildmatch: fix exponential behavior	2023-04-04 14:28:27 -07:00
Shuqi Liang	1a65b41b38	write-tree: integrate with sparse index Update 'git write-tree' to allow using the sparse-index in memory without expanding to a full one. The recursive algorithm for update_one() was already updated in `2de37c5` (cache-tree: integrate with sparse directory entries, 2021-03-03) to handle sparse directory entries in the index. Hence we can just set the requires-full-index to false for "write-tree". The `p2000` tests demonstrate a ~96% execution time reduction for 'git write-tree' using a sparse index: Test before after ----------------------------------------------------------------- 2000.78: git write-tree (full-v3) 0.34 0.33 -2.9% 2000.79: git write-tree (full-v4) 0.32 0.30 -6.3% 2000.80: git write-tree (sparse-v3) 0.47 0.02 -95.8% 2000.81: git write-tree (sparse-v4) 0.45 0.02 -95.6% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-04 12:50:54 -07:00
Junio C Hamano	e7dca80692	Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository " argument, not "the_repository" post-cocci: adjust comments for recent repo_ migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"	2023-04-04 08:25:52 -07:00
Raghul Nanth A	748b8d669a	describe: enable sparse index for describe git describe compares the index with the working tree when (and only when) it is run with the "--dirty" flag. This is done by the run_diff_index() function. The function has been made aware of the sparse-index in the series that led to `8d2c3732` (Merge branch 'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the requires-full-index to false for "describe". Performance metrics Test HEAD~1 HEAD ------------------------------------------------------------------------------------------------- 2000.2: git describe --dirty (full-v3) 0.08(0.09+0.01) 0.08(0.06+0.03) +0.0% 2000.3: git describe --dirty (full-v4) 0.09(0.07+0.03) 0.08(0.05+0.04) -11.1% 2000.4: git describe --dirty (sparse-v3) 0.88(0.82+0.06) 0.02(0.01+0.05) -97.7% 2000.5: git describe --dirty (sparse-v4) 0.68(0.60+0.08) 0.02(0.02+0.04) -97.1% 2000.6: echo >>new && git describe --dirty (full-v3) 0.08(0.04+0.05) 0.08(0.05+0.04) +0.0% 2000.7: echo >>new && git describe --dirty (full-v4) 0.08(0.07+0.03) 0.08(0.05+0.04) +0.0% 2000.8: echo >>new && git describe --dirty (sparse-v3) 0.75(0.69+0.07) 0.02(0.03+0.03) -97.3% 2000.9: echo >>new && git describe --dirty (sparse-v4) 0.81(0.73+0.09) 0.02(0.01+0.05) -97.5% Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-03 11:30:23 -07:00
Alex Henrie	f024913164	format-patch: correct documentation of --thread without an argument In Git, almost all command line flags unconditionally override the corresponding config option.[1] Add a test to confirm that this is the case for `git format-patch --thread`. [1] https://lore.kernel.org/git/CAMMLpeS3+NUQa2oqpHKVo3yWQNVMgkEXrs4U5_ggvk31yQbezQ@mail.gmail.com/ Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-03 09:59:20 -07:00
Junio C Hamano	ba4324c4e1	e-mail workflow: Message-ID is spelled with ID in both capital letters We used to write "Message-Id:" and "Message-ID:" pretty much interchangeably, and the header name is defined to be case insensitive by the RFCs, but the canonical form "Message-ID:" is used throughout the RFC documents, so let's imitate it ourselves. Signed-off-by: Junio C Hamano <gitster@pobox.com> Reviewed-by: Elijah Newren <newren@gmail.com>	2023-04-03 08:55:43 -07:00
Junio C Hamano	290a973bb9	Merge branch 'ds/p2000-fix-grep-sparse' Fix perf test. * ds/p2000-fix-grep-sparse: p2000: remove stray '--sparse' flag from test	2023-03-31 17:50:23 -07:00
Junio C Hamano	0d865049f7	Merge branch 'ab/retire-scripted-add-p' Test fix. * ab/retire-scripted-add-p: t3701: we don't need no Perl for `add -i` anymore	2023-03-31 17:50:23 -07:00
Junio C Hamano	dd88a1af1a	Merge branch 'js/t5563-portability-fix' Test portability fix. * js/t5563-portability-fix: t5563: prevent "ambiguous redirect"	2023-03-31 17:50:23 -07:00
Andrei Rybak	1ec40a83a5	t2107: fix mention of the_index.cache_changed Commit [1] added a test to t2107-update-index-basic.sh with a comment that mentions macro "active_cache_changed". Later in [2], the macro was removed and its usage in function cmd_update_index in file builtin/update-index.c was replaced with "the_index.cache_changed". Fix the outdated comment in file t2107-update-index-basic.sh. [1] `fa137f67a4` (lockfile.c: store absolute path, 2014-11-02) [2] `dc594180d9` (cocci & cache.h: apply variable section of "pending" index-compatibility, 2022-11-19) Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-31 16:57:04 -07:00
Andrei Rybak	993d7085be	t3060: fix mention of function prune_index Commit [1] added tests which trigger function prune_cache. The comments in these tests, however, incorrectly call it "prune_path". Since then, function "prune_cache" has been renamed to "prune_index" in commit [2]. Later still in commit [3], the_index singleton, which is also mentioned in a comment, stopped being used directly with function "prune_index". Fix mentions of function "prune_index" and the struct it changes in comments in file "t3060-ls-files-with-tree.sh". [1] `54e1abce90` (Add test case for ls-files --with-tree, 2007-10-03) [2] `6510ae173a` (ls-files: convert prune_cache to take an index, 2017-06-12) [3] `188dce131f` (ls-files: use repository object, 2017-06-22) Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-31 16:57:03 -07:00
Derrick Stolee	25bccb4b79	fetch: download bundles once, even with --all When fetch.bundleURI is set, 'git fetch' downloads bundles from the given bundle URI before fetching from the specified remote. However, when using non-file remotes, 'git fetch --all' will launch 'git fetch' subprocesses which then read fetch.bundleURI and fetch the bundle list again. We do not expect the bundle list to have new information during these multiple runs, so avoid these extra calls by un-setting fetch.bundleURI in the subprocess arguments. Be careful to skip fetching bundles for the empty bundle string. Fetching bundles from the empty list presents some interesting test failures. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-31 10:07:33 -07:00
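One way to picture the fix in 25bccb4b79: the parent passes an empty `fetch.bundleURI` to each child through `git -c`, and an empty value is treated as "no bundles". The helper names below are hypothetical, and adding the override unconditionally is an assumption; the real logic lives in Git's C code and may differ in detail.

```python
def child_fetch_argv(remote):
    """Build the argv for a per-remote 'git fetch' child with bundles disabled."""
    # "-c name=value" overrides configuration for that one invocation;
    # an empty value un-sets the bundle URI for the subprocess.
    return ["git", "-c", "fetch.bundleURI=", "fetch", remote]

def should_fetch_bundles(bundle_uri):
    """Skip bundle fetching when the configured URI is missing or empty."""
    return bool(bundle_uri)
```

The parent process fetches the bundle list once, then every child started with `child_fetch_argv(...)` sees an empty URI and `should_fetch_bundles("")` is false.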
Johannes Schindelin	92c7b3d473	t5563: prevent "ambiguous redirect" When I ran this test using `TEST_SHELL_PATH=/bin/bash` in my Ubuntu setup (where Bash is at version 5.0.17(1)-release), I was greeted with this error message: ./test-lib.sh: line 1072: $CHALLENGE: ambiguous redirect This commit fixes that error by quoting the `CHALLENGE` variable (which has as value a path containing spaces), and by avoiding to cuddle the empty string parameter in the `printf` call with the redirect character (in fact, the `printf ''>$CHALLENGE` is removed because the next line overwrites the file anyway because it _also_ uses a single `>` to redirect the output). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-31 08:50:30 -07:00
Jeff King	cc48ddd937	tests: skip test_eval_ in internal chain-lint To check for broken &&-chains, we run "fail_117 && $1" as a test snippet, and check the exit code. We use test_eval_ to do so, because that's the way we run the actual test. But we don't need any of its niceties, like "set -x" tracing. In fact, they hinder us, because we have to explicitly disable them. So let's skip that and use "eval" more directly, which is simpler. I had hoped it would also be faster, but it doesn't seem to produce a measurable improvement (probably because it's just running internal shell commands, with no subshells or forks). Note that there is one gotcha: even though we don't intend to run any of the commands if the &&-chain is intact, an error like this: test_expect_success 'broken' ' # this next line breaks the &&-chain true # and then this one is executed even by the linter return 1 ' means we'll "return 1" from the eval, and thus from test_run_(). We actually do notice this in test_expect_success, but only by saying "hey, this test didn't say it was OK, so it must have failed", which is not right (it should say "broken &&-chain"). We can handle this by calling test_eval_inner_() instead, which is our trick for wrapping "return" in a test snippet. But to do that, we have to push the trace code out of that inner function and into test_eval_(). This is arguably where it belonged in the first place, but it never mattered because the "inner_" function had only one caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 13:07:29 -07:00
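The exit-code trick behind the internal chain-lint can be reproduced outside the test suite. The sketch below assumes a POSIX `sh` is available and that `fail_117` is a helper that simply returns status 117; it is defined inline here for illustration, and `chain_is_intact` is a hypothetical name.

```python
import subprocess

def chain_is_intact(snippet):
    """Prefix the snippet with a command that fails with status 117.

    If every command is joined by &&, nothing after the prefix runs and the
    shell exits with 117. Any other exit status means a later command ran,
    so the &&-chain is broken somewhere.
    """
    script = "fail_117 () { return 117; }\nfail_117 && " + snippet
    result = subprocess.run(["sh", "-c", script], capture_output=True)
    return result.returncode == 117

INTACT = "echo one &&\necho two"
BROKEN = "echo one\necho two"
```

`chain_is_intact(INTACT)` holds because the 117 propagates through the whole chain, while `BROKEN` runs `echo two` regardless and exits with 0.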
Jeff King	750b260411	tests: drop here-doc check from internal chain-linter Commit `99a64e4b73` (tests: lint for run-away here-doc, 2017-03-22) tweaked the chain-lint test to catch unclosed here-docs. It works by adding an extra "echo" command after the test snippet, and checking that it is run (if it gets swallowed by a here-doc, naturally it is not run). The downside here is that we introduced an extra $() substitution, which happens in a subshell. This has a measurable performance impact when run for many tests. The tradeoff in safety was undoubtedly worth it when `99a64e4b73` was written. But since the external chainlint.pl learned to find these recently, we can just rely on it. By switching back to a simpler chain-lint, hyperfine reports a measurable speedup on t3070 (which has 1800 tests): 'HEAD' ran 1.12 ± 0.01 times faster than 'HEAD~1' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 13:07:29 -07:00
Eric Sunshine	2b61c8dc88	tests: diagnose unclosed here-doc in chainlint.pl An unclosed here-doc in a test is a problem, because it silently gobbles up any remaining commands. Since `99a64e4b73` (tests: lint for run-away here-doc, 2017-03-22) we detect this by piggy-backing on the internal chainlint checker in test-lib.sh. However, it would be nice to detect it in chainlint.pl, for a few reasons: - the output from chainlint.pl is much nicer; it can show the exact spot of the error, rather than a vague "somewhere in this test you broke the &&-chain or had a bad here-doc" message. - the implementation in test-lib.sh runs for each test snippet. And since it requires a subshell, the extra cost is small but not zero. If chainlint.pl can reliably find the problem, we can optimize the test-lib.sh code. The chainlint.pl code never intended to find here-doc problems. But since it has to parse them anyway (to avoid reporting problems inside here-docs), most of what we need is already there. We can detect the problem when we fail to find the missing end-tag in swallow_heredocs(). The extra change in scan_heredoc_tag() stores the location of the start of the here-doc, which lets us mark it as the source of the error in the output (see the new tests for examples). [jk: added commit message and tests] Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 13:07:29 -07:00
Jeff King	1686de55fa	tests: replace chainlint subshell with a function To test that we don't break the &&-chain, test-lib.sh does something like: (exit 117) && $test_commands and checks that the result is exit code 117. We don't care what that initial command is, as long as it exits with a unique code. Using "exit" works and is simple, but is a bit expensive since it requires a subshell (to avoid exiting the whole script!). This isn't usually very noticeable, but it can add up for scripts which have a large number of tests. Using "return" naively won't work here, because we'd return from the function eval-ing the snippet (and it wouldn't find &&-chain breakages). But if we further push that into its own function, it does exactly what we want, without extra subshell overhead. According to hyperfine, this produces a measurable improvement when running t3070 (which has 1800 tests, all of them quite short): 'HEAD' ran 1.09 ± 0.01 times faster than 'HEAD~1' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 13:07:29 -07:00
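The subshell-free check from the commit above can be sketched as follows. The names `fail_117` and `check_chain` are hypothetical, and this is a simplified illustration of the idea rather than the test-lib.sh implementation: a function that returns 117 is prepended to the snippet, and the snippet's &&-chain is intact only if 117 comes back out.

```shell
# Hypothetical names; a simplified stand-in for the check in test-lib.sh.
fail_117 () {
	return 117
}

check_chain () {
	code=0
	# If every command in the snippet is joined by &&, the leading failure
	# short-circuits the whole snippet and 117 comes back out.
	eval "fail_117 && $1" >/dev/null 2>&1 || code=$?
	if test "$code" -eq 117
	then
		echo "intact"
	else
		echo "broken"
	fi
}

check_chain 'true && true'
check_chain 'true; true'
```

Because `fail_117` is a function and not `(exit 117)`, no subshell is forked for the check.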
Jeff King	7b6555ab8d	tests: run internal chain-linter under "make test" Since `69b9924b87` (t/Makefile: teach `make test` and `make prove` to run chainlint.pl, 2022-09-01), we run a single chainlint.pl process for all scripts, and then instruct each individual script to run with the equivalent of --no-chain-lint, which tells them not to redundantly run the chainlint script themselves. However, this also disables the internal linter run within the shell by eval-ing "(exit 117) && $1" and confirming we get code 117. In theory the external linter produces a superset of complaints, and we don't need the internal one anymore. However, we know there is at least one case where they differ. A test like: test_expect_success 'should fail linter' ' false && sleep 2 & pid=$! && kill $pid ' is buggy (it ignores the failure from "false", because it is backgrounded along with the sleep). The internal linter catches this, but the external one doesn't (and teaching it to do so is complicated[1]). So not only does "make test" miss this problem, but it's doubly confusing because running the script standalone does complain. Let's teach the suppression in the Makefile to only turn off the external linter (which we know is redundant, as it was already run) and leave the internal one intact. I've used a new environment variable to do this here, and intentionally did not add a "--no-ext-chain-lint" option. This is an internal optimization used by the Makefile, and not something that ordinary users would need to tweak. [1] For discussion of chainlint.pl and this case, see: https://lore.kernel.org/git/CAPig+cQtLFX4PgXyyK_AAkCvg4Aw2RAC5MmLbib-aHHgTBcDuw@mail.gmail.com/ Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 13:07:29 -07:00
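Why the backgrounded test in the commit above hides the failure of "false" can be shown with a few lines of plain shell. This is an illustrative sketch, not part of the test suite; the variable names are made up.

```shell
# The whole "false && sleep 2" list goes to the background.
false && sleep 2 &
launch_status=$?      # status of starting the job, not of the job itself
pid=$!

# Only waiting for the job reveals that "false" failed inside it.
job_status=0
wait "$pid" || job_status=$?

echo "launch: $launch_status, job: $job_status"
```

The foreground chain only ever sees the launch status, so a `&&` placed after the `&` continues even though the backgrounded list failed.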
Jeff King	126e3b3d2a	t/helper: mark unused argv/argc arguments Many test helper programs do not bother to look at argc or argv, because they don't take any options. In a user-facing program, it's a good idea to check for unexpected arguments and complain. But for a test helper, it's not worth the trouble to enforce this. But we do want to tell the compiler we're OK with ignoring them, to silence -Wunused-parameter (and obviously we can't get rid of them, since we have to conform to the usual cmd__foo() interface). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 14:11:24 -07:00
Jeff King	9dc607f1c2	fast-import: fix file access when run from subdir In cmd_fast_import(), we ignore the "prefix" argument entirely, even though it tells us how we may have changed directory to the root of the repository earlier in the process. Which means that if you run it from a subdir and point to paths in the filesystem, like: cd subdir git fast-import --import-marks=foo <dump then it will look for "foo" in the root of the repository, not the current directory ("subdir/") which the user would have expected. We can fix this by recording the prefix and using it as appropriate whenever we open a file for reading or writing. I found each of these by looking for cases where we call fopen() within fast-import.c, so this should cover all cases. The new test triggers each one, as well as making sure we don't accidentally apply the prefix when --relative-marks is in use (since that option interprets some paths as relative to a specific directory). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 14:11:24 -07:00
Derrick Stolee	d52fcf493b	p2000: remove stray '--sparse' flag from test This argument was added in `7cae7627c4` (builtin/grep.c: integrate with sparse index, 2022-09-22), but it was a carry-over from an earlier version where the --sparse flag was added to the 'git grep' builtin. This argument does not exist, so currently the p2000-sparse-operations.sh performance test script fails when reaching this step. With this fix, the script works with these numbers for my copy of the Git source code repository: Test HEAD ------------------------------------------------------------ 2000.30: git grep --cached ... (full-v3) 0.34(1.20+0.14) 2000.31: git grep --cached ... (full-v4) 0.31(1.15+0.13) 2000.32: git grep --cached ... (sparse-v3) 0.26(1.13+0.12) 2000.33: git grep --cached ... (sparse-v4) 0.27(1.13+0.12) Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 13:25:52 -07:00
Glen Choo	e2016508e7	config: report cached filenames in die_bad_number() If, when parsing numbers from config, die_bad_number() is called, it reports the filename and config source type if we were parsing a config file, but not if we were iterating a config_set (it defaults to a less specific error message). Most call sites don't parse config files because config is typically read once and cached, so we only report filename and config source type in "git config --type" (since "git config" always parses config files). This could have been fixed when we taught the current_config_* functions to respect config_set values (`0d44a2dacc` (config: return configset value for current_config_ functions, 2016-05-26), but it was hard to spot then and we might have just missed it (I didn't find mention of die_bad_number() in the original ML discussion [1].) Fix this by refactoring the current_config_* functions into variants that don't BUG() when we aren't reading config, and using the resulting functions in die_bad_number(). "git config --get[-regexp] --type=int" cannot use the non-refactored version because it parses the int value _after_ parsing the config file, which would run into the BUG(). Since the refactored functions aren't public, they use "struct config_reader". 1. https://lore.kernel.org/git/20160518223712.GA18317@sigill.intra.peff.net/ Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 13:03:27 -07:00
Junio C Hamano	f879501ad0	Merge branch 'jk/fix-proto-downgrade-to-v0' Transports that do not support protocol v2 did not correctly fall back to protocol v0 under certain conditions, which has been corrected. * jk/fix-proto-downgrade-to-v0: git_connect(): fix corner cases in downgrading v2 to v0	2023-03-28 10:51:52 -07:00
Junio C Hamano	8069aa01cd	Merge branch 'fc/oid-quietly-parse-upstream' "git rev-parse --quiet foo@{u}", or anything that asks @{u} to be parsed with GET_OID_QUIETLY option, did not quietly fail, which has been corrected. * fc/oid-quietly-parse-upstream: object-name: fix quiet @{u} parsing	2023-03-28 10:51:52 -07:00
Junio C Hamano	6041a13ec2	Merge branch 'fc/completion-colors-do-not-need-prompt-command' Lift the limitation that colored prompts can only be used with PROMPT_COMMAND mode. * fc/completion-colors-do-not-need-prompt-command: completion: prompt: use generic colors	2023-03-28 10:51:52 -07:00
Ævar Arnfjörð Bjarmason	3611f7467f	for-each-repo: with bad config, don't conflate <path> and <cmd> Fix a logic error in `4950b2a2b5` (for-each-repo: run subcommands on configured repos, 2020-09-11). Due to assuming that elements returned from the repo_config_get_value_multi() call wouldn't be "NULL" we'd conflate the <path> and <command> part of the argument list when running commands. As noted in the preceding commit the fix is to move to a safer "_string_multi()" version of the _multi() API. This change is separated from the rest because those all segfaulted. In this change we ended up with different behavior. When using the "--config=<config>" form we take each element of the list as a path to a repository. E.g. with a configuration like: [repo] list = /some/repo We would, with this command: git for-each-repo --config=repo.list status builtin Run a "git status" in /some/repo, as: git -C /some/repo status builtin I.e. ask "status" to report on the "builtin" directory. But since a configuration such as this would result in a "struct string_list *" with one element, whose "string" member is "NULL": [repo] list We would, when constructing our command-line in "builtin/for-each-repo.c"... strvec_pushl(&child.args, "-C", path, NULL); for (i = 0; i < argc; i++) strvec_push(&child.args, argv[i]); ...have that "path" be "NULL", and as strvec_pushl() stops when it sees NULL we'd end with the first "argv" element as the argument to the "-C" option, e.g.: git -C status builtin I.e. we'd run the command "builtin" in the "status" directory. In another context this might be an interesting security vulnerability, but I think that this amounts to a nothingburger on that front. A hypothetical attacker would need to be able to write config for the victim to run, if they're able to do that there's more interesting attack vectors. See the "safe.directory" facility added in `8d1a744820` (setup.c: create `safe.bareRepository`, 2022-07-14). 
An even more unlikely possibility would be an attacker able to generate the config used for "for-each-repo --config=<key>", but nothing else (e.g. an automated system producing that list). Even in that case the attack vector is limited to the user running commands whose name matches a directory that's interesting to the attacker (e.g. a "log" directory in a repository). The second argument (if any) of the command is likely to make git die without doing anything interesting (e.g. "-p" to "log", there being no "-p" built-in command to run). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:53 -07:00
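The argument shift described in the commit above has a close shell analogy. The sketch below is not the C code from "builtin/for-each-repo.c"; it uses an unquoted empty variable to stand in for the NULL "path", and the hypothetical helper `show_args` to print what a command would receive.

```shell
# Count and print the arguments a command would receive.
show_args () {
	printf '%s args:' "$#"
	printf ' [%s]' "$@"
	printf '\n'
}

path=""    # stands in for the NULL "path" from a value-less "list" key

# Unquoted on purpose: the empty value drops out of the argument list,
# so "status" slides into the slot that "-C" expects to be a directory.
show_args -C $path status builtin

# Quoted, the slot is preserved (and could be rejected as empty).
show_args -C "$path" status builtin
```

In both the C case and this analogy the damage comes from an element silently vanishing, which shifts everything after it by one position.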
Ævar Arnfjörð Bjarmason	9e2d884d0f	config API: add "string" version of _value_multi(), fix segfaults Fix numerous and mostly long-standing segfaults in consumers of the _config_value_multi() API. As discussed in the preceding commit an empty key in the config syntax yields a "NULL" string, which these users would give to strcmp() (or similar), resulting in segfaults. As this change shows, most users of the _config_value_multi() API didn't really want such an unsafe and low-level API, let's give them something with the safety of git_config_get_string() instead. This fix is similar to what the _string() functions and others acquired in [1] and [2]. Namely introducing and using a safer "_get_string_multi()" variant of the low-level "_value_multi()" function. This fixes segfaults in code introduced in: - `d811c8e17c` (versionsort: support reorder prerelease suffixes, 2015-02-26) - `c026557a37` (versioncmp: generalize version sort suffix reordering, 2016-12-08) - `a086f921a7` (submodule: decouple url and submodule interest, 2017-03-17) - `a6be5e6764` (log: add log.excludeDecoration config option, 2020-04-16) - `92156291ca` (log: add default decoration filter, 2022-08-05) - `50a044f1e4` (gc: replace config subprocesses with API calls, 2022-09-27) There are now two users of the low-level API: - One in "builtin/for-each-repo.c", which we'll convert in a subsequent commit. - The "t/helper/test-config.c" code added in [3]. As seen in the preceding commit we need to give the "t/helper/test-config.c" caller these "NULL" entries. We could also alter the underlying git_configset_get_value_multi() function to be "string safe", but doing so would leave no room for other variants of "*_get_value_multi()" that coerce to other types. Such coercion can't be built on the string version, since as we've established "NULL" is a true value in the boolean context, but if we coerced it to "" for use in a list of strings it'll be subsequently coerced to "false" as a boolean. 
The callback pattern being used here will make it easy to introduce e.g. a "multi" variant which coerces its values to "bool", "int", "path" etc. 1. `40ea4ed903` (Add config_error_nonbool() helper function, 2008-02-11) 2. `6c47d0e8f3` (config.c: guard config parser from value=NULL, 2008-02-11). 3. `4c715ebb96` (test-config: add tests for the config_set API, 2014-07-28) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason	1c7e239bd0	config API users: test for _get_value_multi() segfaults As we'll discuss in the subsequent commit these tests all show _get_value_multi() API users unable to handle there being a value-less key in the config, which is represented with a "NULL" for that entry in the "string" member of the returned "struct string_list", causing a segfault. These added tests exhaustively test for that issue, as we'll see in a subsequent commit we'll need to change all of the API users of *_get_value_multi(). These cases were discovered by triggering each one individually, and then adding these tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason	f7b2ff9516	for-each-repo: error on bad --config As noted in `6c62f01552` (for-each-repo: do nothing on empty config, 2021-01-08) this command wants to ignore a non-existing config key, but let's not conflate that with bad config. Before this, all these added tests would pass with an exit code of 0. We could preserve the comment added in `6c62f01552`, but now that we're directly using the documented repo_config_get_value_multi() value it's just narrating something that should be obvious from the API use, so let's drop it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason	a428619309	config API: have _multi() return an "int" and take a "dest" Have the "git_configset_get_value_multi()" function and its siblings return an "int" and populate a "dest" parameter like every other git_configset_get_()" in the API. As we'll take advantage of in subsequent commits, this fixes a blind spot in the API where it wasn't possible to tell whether a list was empty from whether a config key existed. For now we don't make use of those new return values, but faithfully convert existing API users. Most of this is straightforward, commentary on cases that stand out: - To ensure that we'll properly use the return values of this function in the future we're using the "RESULT_MUST_BE_USED" macro introduced in [1]. As git_die_config() now has to handle this return value let's have it BUG() if it can't find the config entry. As tested for in a preceding commit we can rely on getting the config list in git_die_config(). - The loops after getting the "list" value in "builtin/gc.c" could also make use of "unsorted_string_list_has_string()" instead of using that loop, but let's leave that for now. - In "versioncmp.c" we now use the return value of the functions, instead of checking if the lists are still non-NULL. 1. `1e8697b5c4` (submodule--helper: check repo{_submodule,}_init() return values, 2022-09-01), Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:53 -07:00
Ævar Arnfjörð Bjarmason	b83efcecaf	config API: add and use a "git_config_get()" family of functions We already have the basic "git_config_get_value()" function and its "repo_" and "configset" siblings to get a given "key" and assign the last key found to a provided "value". But some callers don't care about that value, but just want to use the return value of the "get_value()" function to check whether the key exist (or another non-zero return value). The immediate motivation for this is that a subsequent commit will need to change all callers of the "_get_value_multi()" family of functions. In two cases here we (ab)used it to check whether we had any values for the given key, but didn't care about the return value. The rest of the callers here used various other config API functions to do the same, all of which resolved to the same underlying functions to provide the answer. Some of these were using either git_config_get_string() or git_config_get_string_tmp(), see `fe4c750fb1` (submodule--helper: fix a configure_added_submodule() leak, 2022-09-01) for a recent example. We can now use a helper function that doesn't require a throwaway variable. We could have changed git_configset_get_value_multi() (and then git_config_get_value() etc.) to accept a "NULL" as a "dest" for all callers, but let's avoid changing the behavior of existing API users. Having an "unused" value that we throw away internal to config.c is cheap. A "NULL as optional dest" pattern is also more fragile, as the intent of the caller might be misinterpreted if he were to accidentally pass "NULL", e.g. when "dest" is passed in from another function. Another name for this function could have been "_config_key_exists()", as suggested in [1]. That would work for all of these callers, and would currently be equivalent to this function, as the git_configset_get_value() API normalizes all non-zero return values to a "1". But adding that API would set us up to lose information, as e.g. 
if git_config_parse_key() in the underlying configset_find_element() fails we'd like to return -1, not 1. Let's change the underlying configset_find_element() function to support this use-case, we'll make further use of it in a subsequent commit where the git_configset_get_value_multi() function itself will expose this new return value. This still leaves various inconsistencies and clobbering or ignoring of the return value in place. E.g. here we're modifying configset_add_value(), but ever since it was added in [2] we've been ignoring its "int" return value, but as we're changing the configset_find_element() it uses, let's have it faithfully ferry that "ret" along. Let's also use the "RESULT_MUST_BE_USED" macro introduced in [3] to assert that we're checking the return value of configset_find_element(). We're leaving the same change to configset_add_value() for some future series. Once we start paying attention to its return value we'd need to ferry it up as deep as do_config_from(), and would need to make at least read_{,very_}early_config() and git_protected_config() return an "int" instead of "void". Let's leave that for now, and focus on the _get_*() functions. 1. `3c8687a73e` (add `config_set` API for caching config-like files, 2014-07-28) 2. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/ 3. `1e8697b5c4` (submodule--helper: check repo{_submodule,}_init() return values, 2022-09-01), Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason	e7587a8f53	config tests: add "NULL" tests for *_get_value_multi() A less well known edge case in the config format is that keys can be value-less, a shorthand syntax for "true" boolean keys. I.e. these two are equivalent as far as "--type=bool" is concerned: [a]key [a]key = true But as far as our parser is concerned the values for these two are NULL, and "true". I.e. for a sequence like: [a]key=x [a]key [a]key=y We get a "struct string_list" with "string" members with ".string" values of: { "x", NULL, "y" } This behavior goes back to the initial implementation of git_config_bool() in `17712991a5` (Add ".git/config" file parser, 2005-10-10). When parts of the config_set API were tested for in [1] they didn't add coverage for 3/4 of the "(NULL)" cases handled in "t/helper/test-config.c". We'd test that case for "get_value", but not "get_value_multi", "configset_get_value" and "configset_get_value_multi". We now cover all of those cases, which in turn expose the details of how this part of the config API works. 1. `4c715ebb96` (test-config: add tests for the config_set API, 2014-07-28) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason	258902ce07	config tests: cover blind spots in git_die_config() tests There were no tests checking for the output of the git_die_config() function in the config API, added in `5a80e97c82` (config: add `git_die_config()` to the config-set API, 2014-08-07). We only tested "test_must_fail", but didn't assert the output. We need tests for this because a subsequent commit will alter the return value of git_config_get_value_multi(), which is used to get the config values in the git_die_config() function. This test coverage helps to build confidence in that subsequent change. These tests cover different interactions with git_die_config(): - The "notes.mergeStrategy" test in "t/t3309-notes-merge-auto-resolve.sh" is a case where a function outside of config.c (git_config_get_notes_strategy()) calls git_die_config(). - The "gc.pruneExpire" test in "t5304-prune.sh" is a case where git_config_get_expiry() calls git_die_config(), covering a different "type" than the "string" test for "notes.mergeStrategy". - The "fetch.negotiationAlgorithm" test in "t/t5552-skipping-fetch-negotiator.sh" is a case where git_config_get_string*() calls git_die_config(). We also cover both the "from command-line config" and "in file..at line" cases here. The clobbering of existing ".git/config" files here is so that we're not implicitly testing the line count of the default config. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:37:52 -07:00
Ævar Arnfjörð Bjarmason	bab821646a	cocci: apply the "pretty.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "pretty.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason	ecb5091fd4	cocci: apply the "commit.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "commit.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason	cb338c23d6	cocci: apply the "commit-reach.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "commit-reach.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:36 -07:00
Ævar Arnfjörð Bjarmason	d850b7a545	cocci: apply the "cache.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "cache.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:36 -07:00
Michael J Gruber	3dc0b7f0dc	t3070: make chain lint tester happy `1f2e05f0b7` ("wildmatch: fix exponential behavior", 2023-03-20) introduced a new test with a background process. Backgrounding necessarily gives a result of 0, so that a seemingly broken && chain is not really broken. Adjust t3070 slightly so that our chain lint test recognizes the construct for what it is and does not raise a false positive. Signed-off-by: Michael J Gruber <git@grubix.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 17:02:38 -07:00
Patrick Steinhardt	d3af1c193d	commit-graph: fix truncated generation numbers In `80c928d947` (commit-graph: simplify compute_generation_numbers(), 2023-03-20), the code to compute generation numbers was simplified to use the same infrastructure as is used to compute topological levels. This refactoring introduced a bug where the generation numbers are truncated when they exceed UINT32_MAX because we explicitly cast the computed generation number to `uint32_t`. This is not required though: both the computed value and the field of `struct commit_graph_data` are of the same type `timestamp_t` already, so casting to `uint32_t` will cause truncation. This cast can cause us to miscompute generation data overflows: 1. Given a commit with no parents and committer date `UINT32_MAX + 1`. 2. We compute its generation number as `UINT32_MAX + 1`, but truncate it to `1`. 3. We calculate the generation offset via `$generation - $date`, which is thus `1 - (UINT32_MAX + 1)`. The computation underflows and we thus end up with an offset that is bigger than the maximum allowed offset. As a result, we'd be writing generation data overflow information into the commit-graph that is bogus and ultimately not even required. Fix this bug by removing the needless cast. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 10:52:06 -07:00
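The effect of the stray cast in the commit above can be worked through with shell arithmetic. This is a sketch under stated assumptions: the committer date is an illustrative value chosen here (not the one from the commit message), the cast to `uint32_t` is modeled by masking with `UINT32_MAX`, and the shell is assumed to have 64-bit arithmetic. For a parentless commit the generation number equals the committer date, as in the commit's own example.

```shell
UINT32_MAX=4294967295

date=$((UINT32_MAX + 6))     # illustrative committer date beyond 32 bits
generation=$date             # parentless commit: generation equals its date

# Model of the cast to uint32_t: only the low 32 bits survive.
truncated=$((generation & UINT32_MAX))

correct_offset=$((generation - date))
bogus_offset=$((truncated - date))

echo "truncated generation: $truncated"
echo "correct offset:       $correct_offset"
echo "bogus offset:         $bogus_offset"
```

The shell shows the bogus offset as a large negative number; in unsigned arithmetic the same subtraction wraps around to a very large value, which is what makes the offset look like an overflow.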
William Sprent	00408adeac	builtin/sparse-checkout: add check-rules command There exists no direct way to interrogate git about which paths are matched by a given set of sparsity rules. It is possible to get this information from git, but it includes checking out the commit that contains the paths, applying the sparse checkout patterns and then using something like 'git ls-files -t' to check if the skip worktree bit is set. This works in some cases, but there are cases where it is awkward or infeasible to generate a checkout for this purpose. Exposing the pattern matching of sparse checkout enables more tooling to be built and avoids a situation where tools that want to reason about sparse checkouts start containing parallel implementations of the rules. To accommodate this, add a 'check-rules' subcommand to the 'sparse-checkout' builtin along the lines of the 'git check-ignore' and 'git check-attr' commands. The new command accepts a list of paths on stdin and outputs just the ones that match the sparse checkout. To allow for use in a bare repository and to allow for interrogating about other patterns than the current ones, include a '--rules-file' option which allows the caller to explicitly pass sparse checkout rules in the format accepted by 'sparse-checkout set --stdin'. To allow for reuse of the handling of input patterns for the '--rules-file' flag, modify 'add_patterns_from_input()' to be able to read from a 'FILE' instead of just stdin. To allow for reuse of the logic which decides whether or not rules should be interpreted as cone-mode patterns, split that part out of 'update_modes()' such that it can be called without modifying the config. An alternative could have been to create a new 'check-sparsity' command. 
However, placing it under 'sparse-checkout' allows for a) more easily re-using the sparse checkout pattern matching and cone/non-cone mode handling, and b) keeps the documentation for the command next to the experimental warning and the cone-mode discussion. Signed-off-by: William Sprent <williams@unity3d.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 10:51:12 -07:00
William Sprent	24fc2cde64	builtin/sparse-checkout: remove NEED_WORK_TREE flag In preparation for adding a sub-command to 'sparse-checkout' that can be run in a bare repository, remove the 'NEED_WORK_TREE' flag from its entry in the 'commands' array of 'git.c'. To avoid that this changes any behaviour, add calls to 'setup_work_tree()' to all of the 'sparse-checkout' sub-commands and add tests that verify that 'sparse-checkout <cmd>' still fail with a clear error message telling the user that the command needs a work tree. Signed-off-by: William Sprent <williams@unity3d.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 10:43:51 -07:00
Johannes Schindelin	3457b50e8c	t3701: we don't need no Perl for `add -i` anymore This should have been removed in `ab/retire-scripted-add-p` but wasn't. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 10:40:12 -07:00
Johannes Schindelin	3b7a4475b0	split-index; stop abusing the `base_oid` to strip the "link" extension When a split-index is in effect, the `$GIT_DIR/index` file needs to contain a "link" extension that contains all the information about the split-index, including the information about the shared index. However, in some cases Git needs to suppress writing that "link" extension (i.e. to fall back to writing a full index) even if the in-memory index structure _has_ a `split_index` configured. This is the case e.g. when "too many not shared" index entries exist. In such instances, the current code sets the `base_oid` field of said `split_index` structure to all-zero to indicate that `do_write_index()` should skip writing the "link" extension. This can lead to problems later on, when the in-memory index is still used to perform other operations and eventually wants to write a split-index, detects the presence of the `split_index` and reuses that, too (under the assumption that it has been initialized correctly and still has a non-null `base_oid`). Let's stop zeroing out the `base_oid` to indicate that the "link" extension should not be written. One might be tempted to simply call `discard_split_index()` instead, under the assumption that Git decided to write a non-split index and therefore the `split_index` structure might no longer be wanted. However, that is not possible because that would release index entries in `split_index->base` that are likely to still be in use. Therefore we cannot do that. The next best thing we _can_ do is to introduce a bit field to indicate specifically which index extensions (not) to write. So that's what we do here. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:40:39 -07:00
Johannes Schindelin	3704fed5ea	split-index & fsmonitor: demonstrate a bug This commit adds a new test case that demonstrates a bug in the split-index code that is triggered under certain circumstances when the FSMonitor is enabled, and its symptom manifests in the form of one of the following error messages: BUG: fsmonitor.c:20: fsmonitor_dirty has more entries than the index (2 > 1) BUG: unpack-trees.c:776: pos <n> doesn't point to the first entry of <dir>/ in index error: invalid path '' error: The following untracked working tree files would be overwritten by reset: initial.t Which of these error messages appears depends on timing-dependent conditions. Technically the root cause lies with a bug in the split-index code that has nothing to do with FSMonitor, but for the sake of this new test case it was the easiest way to trigger the bug. The bug is this: Under specific conditions, Git needs to skip writing the "link" extension (which is the index extension containing the information pertaining to the split-index). To do that, the `base_oid` attribute of the `split_index` structure in the in-memory index is zeroed out, and `do_write_index()` specifically checks for a "null" `base_oid` to understand that the "link" extension should not be written. However, this violates the consistency of the in-memory index structure, but that does not cause problems in most cases because the process exits without using the in-memory index structure anymore, anyway. But: _When_ the in-memory index is still used (which is the case e.g. in `git rebase`), subsequent writes of `the_index` are at risk of writing out a bogus index file, one that _should_ have a "link" extension but does not. In many cases, the `SPLIT_INDEX_ORDERED` flag _happens_ to be set for subsequent writes, forcing the shared index to be written, which re-initializes `base_oid` to a non-bogus state, and all is good. 
When it is _not_ set, however, all kinds of mayhem ensue, resulting in above-mentioned error messages, and often enough putting worktrees in a totally broken state where the only recourse is to manually delete the `index` and the `index.lock` files and then call `git reset` manually. Not something to ask users to do. The reason why it is comparatively easy to trigger the bug with FSMonitor is that there is _another_ bug in the FSMonitor code: `mark_fsmonitor_valid()` sets `cache_changed` to 1, i.e. treating that variable as a Boolean. But it is a bit field, and 1 happens to be the `SOMETHING_CHANGED` bit that forces the "link" extension to be skipped when writing the index, among other things. "Comparatively easy" is a relative term in this context, for sure. The essence of how the new test case triggers the bug is as follows: 1. The `git rebase` invocation will first reset the worktree to a commit that contains only the `one.t` file, and then execute a rebase script that starts with the following commands (commit hashes skipped): label onto reset initial pick two label two reset two pick three [...] 2. Before executing the `label` command, a split index is written, as well as the shared index. 3. The `reset initial` command in the rebase script writes out a new split index but skips writing the shared index, as intended. 4. The `pick two` command updates the worktree and refreshes the index, marking the `two.t` entry as valid via the FSMonitor, which sets the `SOMETHING_CHANGED` bit in `cache_changed`, which in turn causes the `base_oid` attribute to be zeroed out and a full (non-split) index to be written (making sure _not_ to write the "link" extension). 5. Now, the `reset two` command will leave the worktree alone, but still write out a new split index, not writing the shared index (because `base_oid` is still zeroed out, and there is no index entry update requiring it to be written, either). 6. 
When it is time to run `pick three`, the index is read, but it is too short: It only contains a single entry when there should be two, because the "link" extension is missing from the written-out index file. There are three bugs at play, actually, which will be fixed over the course of the next commits: - The `base_oid` attribute should not be zeroed out to indicate when the "link" extension should not be written, as it puts the in-memory index structure into an inconsistent state. - The FSMonitor should not overwrite bits in `cache_changed`. - The `unpack_trees()` function tries to reuse the `split_index` structure from the source index, if any, but does not propagate the `SPLIT_INDEX_ORDERED` flag. While a fix for the second bug would let this test case pass, there are other conditions where the `SOMETHING_CHANGED` bit is set. Therefore, the bug that most crucially needs to be fixed is the first one. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:40:39 -07:00
Rubén Justo	a675ad1708	branch: rename orphan branches in any worktree In `cfaff3aac` (branch -m: allow renaming a yet-unborn branch, 2020-12-13) we added support for renaming an orphan branch when that branch is checked out in the current worktree. Let's also allow renaming an orphan branch checked out in a worktree different than the current one. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:40:15 -07:00
Rubén Justo	7a6ccdfb4e	branch: description for orphan branch errors In `bcfc82bd48` (branch: description for non-existent branch errors, 2022-10-08) we checked the HEAD in the current worktree to detect if the branch to operate with is an orphan branch, so as to avoid the confusing error: "No branch named...". If we are asked to operate with an orphan branch in a different working tree than the current one, we need to check the HEAD in that different working tree. Let's extend the check we did in `bcfc82bd48`, to check the HEADs in all worktrees linked to the current repository, using the helper introduced in `31ad6b61bd` (branch: add branch_checked_out() helper, 2022-06-15). The helper, branch_checked_out(), does its work obtaining internally a list of worktrees linked to the current repository. Obtaining that list is not a lightweight work because it implies disk access. In copy_or_rename_branch() we already have a list of worktrees. Let's use that already obtained list, and avoid using here the helper. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:40:14 -07:00
Rubén Justo	2e8af499ff	branch: test for failures while renaming branches When we introduced replace_each_worktree_head_symref() in `70999e9cec` (branch -m: update all per-worktree HEADs, 2016-03-27), we implemented a best effort approach. If we are asked to rename a branch that is simultaneously checked out in multiple worktrees, we try to update all of those worktrees. If we fail updating any of them, we die() as a signal that something has gone wrong. However, at this point, the branch ref has already been renamed and also updated the HEADs of the successfully updated worktrees. Despite returning an error, we do not try to rollback those changes. Let's add a test to notice if we change this behavior in the future. In next commits we will change replace_each_worktree_head_symref() to work more closely with its only caller, copy_or_rename_branch(). Let's move the former closer to its caller, to facilitate those changes. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:40:14 -07:00
Alex Henrie	6605fb70cb	rebase: add a config option for --rebase-merges The purpose of the new option is to accommodate users who would like --rebase-merges to be on by default and to facilitate turning on --rebase-merges by default without configuration in a future version of Git. Name the new option rebase.rebaseMerges, even though it is a little redundant, for consistency with the name of the command line option and to be clear when scrolling through values in the [rebase] section of .gitconfig. Support setting rebase.rebaseMerges to the nonspecific value "true" for users who don't need to or don't want to learn about the difference between rebase-cousins and no-rebase-cousins. Make --rebase-merges without an argument on the command line override any value of rebase.rebaseMerges in the configuration, for consistency with other command line flags with optional arguments that have an associated config option. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:32:49 -07:00
Alex Henrie	7e5dcec3ca	rebase: add documentation and test for --no-rebase-merges As far as I can tell, --no-rebase-merges has always worked, but has never been documented. It is especially important to document it before a rebase.rebaseMerges option is introduced so that users know how to override the config option on the command line. It's also important to clarify that --rebase-merges without an argument is not the same as --no-rebase-merges and not passing --rebase-merges is not the same as passing --rebase-merges=no-rebase-cousins. A test case is necessary to make sure that --no-rebase-merges keeps working after its code is refactored in the following patches of this series. The test case is a little contrived: It's unlikely that a user would type both --rebase-merges and --no-rebase-merges at the same time. However, if an alias is defined which includes --rebase-merges, the user might decide to add --no-rebase-merges to countermand that part of the alias but leave alone other flags set by the alias. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:32:49 -07:00
René Scharfe	1aaed69d11	t5000: use check_mtime() `fd2da4b1ea` (archive: add --mtime, 2023-02-18) added a helper function for checking the file modification time of an extracted entry. Use it for the older mtime test as well to shorten the code and piggyback on the archive extraction done to validate file contents. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 09:13:30 -07:00
René Scharfe	92b1dd1b9e	archive: improve support for running in subdirectory When git archive is started in a subdirectory, it archives its corresponding tree and its child objects, only. That is intended. It does that by effectively cd'ing into that tree and setting "prefix" to the empty string. This has unfortunate consequences, though: Attributes are anchored at the root of the repository and git archive still applies them to subtrees, causing mismatches. And when checking pathspecs it cannot tell the difference between one that doesn't match anything or one that matches some actual blob outside of the subdirectory, leading to a confusing error message. Fix that by keeping the "prefix" value and passing it to pathspec and attribute functions, and shortening it using relative_path() for paths written to the archive and (if --verbose is given) to stdout. Still reject attempts to archive files outside the current directory, but print a more specific error in that case. Recognizing it requires a full traversal of the subtree for each pathspec, however. Allowing them would be easier, but archive entry paths starting with "../" can be problematic to extract -- e.g. bsdtar skips them by default. Reported-by: Cristian Le <cristian.le@mpsd.mpg.de> Reported-by: Matthias Görgens <matthias.goergens@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-24 15:51:25 -07:00
Jacob Keller	1a3119ed06	blame: allow --contents to work with non-HEAD commit The --contents option can be used with git blame to blame the file as if it had the contents from the specified file. This is akin to copying the contents into the working tree and then running git blame. This option has been supported since `1cfe77333f` ("git-blame: no rev means start from the working tree file.") The --contents option always blames the file as if it was based on the current HEAD commit. If you try to pass a revision while using --contents, you get the following error: fatal: cannot use --contents with final commit object name This is because the blame process generates a fake working tree commit which always uses the HEAD object as its sole parent. Enhance fake_working_tree_commit to take the object ID to use for the parent instead of always using the HEAD object. Then, always generate a fake commit when we have contents provided, even if we have a final object. Remove the check to disallow --contents and a final revision. Note that the behavior of generating a fake working commit is still skipped when a revision is provided but --contents is not provided. Generating such a commit in that case would combine the currently checked out file contents with the provided revision, which breaks normal blame behavior and produces unexpected results. This enables use of --contents with an arbitrary revision, rather than forcing the use of the local HEAD commit. This makes the --contents option significantly more flexible, as it is no longer required to check out the working tree to the desired commit before using --contents. Reword the documentation so that it's clear that --contents can be used with <rev>. Add tests for the --contents option to the annotate-tests.sh test script. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-24 12:05:22 -07:00
Jeff King	aa548459a0	fast-export: de-obfuscate --anonymize-map handling When we handle an --anonymize-map option, we parse the orig/anon pair, and then feed the "orig" string to anonymize_str(), along with a generator function that duplicates the "anon" string to be cached in the map. This works, because anonymize_str() says "ah, there is no mapping yet for orig; I'll add one from the generator". But there are some downsides: 1. It's a bit too clever, as it's not obvious what the code is trying to do or why it works. 2. It requires allowing generator functions to take an extra void pointer, which is not something any of the normal callers of anonymize_str() want. 3. It does the wrong thing if the same token is provided twice. When there are conflicting options, like: git fast-export --anonymize \ --anonymize-map=foo:one \ --anonymize-map=foo:two we usually let the second one override the first. But by using anonymize_str(), which has first-one-wins logic, we do the opposite. So instead of relying on anonymize_str(), let's directly add the entry ourselves. We can tweak the tests to show that we handle overridden options correctly now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:09 -07:00
Junio C Hamano	ba235249c0	Merge branch 'fc/test-aggregation-clean-up' Code clean-up for test framework. * fc/test-aggregation-clean-up: test: don't print aggregate-results command test: simplify counts aggregation	2023-03-21 14:18:56 -07:00
Junio C Hamano	1071deae00	Merge branch 'aj/ls-files-format-fix' Fix for a "ls-files --format="%(path)" that produced nonsense output, which was a bug in 2.38. * aj/ls-files-format-fix: ls-files: fix "--format" output of relative paths	2023-03-21 14:18:55 -07:00
Junio C Hamano	15108de2fa	Merge branch 'jk/format-patch-ignore-noprefix' "git format-patch" honors the src/dst prefixes set to nonstandard values with configuration variables like "diff.noprefix", causing receiving end of the patch that expects the standard -p1 format to break. Teach "format-patch" to ignore end-user configuration and always use the standard prefixes. This is a backward compatibility breaking change. * jk/format-patch-ignore-noprefix: rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch format-patch: add format.noprefix option format-patch: do not respect diff.noprefix diff: add --default-prefix option t4013: add tests for diff prefix options diff: factor out src/dst prefix setup	2023-03-21 14:18:55 -07:00
Elijah Newren	d48be35ca6	write-or-die.h: move declarations for write-or-die.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	61a7b98264	treewide: remove cache.h inclusion due to setup.h changes By moving several declarations to setup.h, the previous patch made it possible to remove the include of cache.h in several source files. Do so. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	e38da487cc	setup.h: move declarations for setup.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	9875058870	treewide: remove cache.h inclusion due to environment.h changes Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	32a8f51061	environment.h: move declarations for environment.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:53 -07:00

20964 commits
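
The `cache_changed` problem described in commit 3704fed5ea above (a bit field assigned as if it were a Boolean) can be illustrated with a minimal sketch. This is not Git's code: it is a Python model under stated assumptions. Only the fact that `SOMETHING_CHANGED` is the bit with value 1 comes from the commit message; the names `CE_ENTRY_CHANGED` and `FSMONITOR_CHANGED`, their values, and all three helper functions are hypothetical and exist only for illustration.

```python
# Illustrative flag values. Only SOMETHING_CHANGED == 1 is stated in the
# commit message; the other two names and values are assumptions.
SOMETHING_CHANGED = 1 << 0
CE_ENTRY_CHANGED = 1 << 1   # assumed: some unrelated bit that is already set
FSMONITOR_CHANGED = 1 << 7  # assumed: a dedicated bit for FSMonitor updates


def mark_valid_buggy(cache_changed: int) -> int:
    """Treat the field as a Boolean: assign 1, discarding every other bit."""
    return 1


def mark_valid_fixed(cache_changed: int) -> int:
    """OR in a dedicated bit, leaving all other bits untouched."""
    return cache_changed | FSMONITOR_CHANGED


def skips_link_extension(cache_changed: int) -> bool:
    """Model of the rule that SOMETHING_CHANGED forces skipping the "link" extension."""
    return bool(cache_changed & SOMETHING_CHANGED)


if __name__ == "__main__":
    before = CE_ENTRY_CHANGED
    for fn in (mark_valid_buggy, mark_valid_fixed):
        after = fn(before)
        print(fn.__name__, bin(after), skips_link_extension(after))
```

In this model the buggy variant both loses the bit that was already set and accidentally turns on `SOMETHING_CHANGED`, while the OR-based variant preserves existing bits and leaves `SOMETHING_CHANGED` alone.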