development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-30 14:03:28 +00:00

Author	SHA1	Message	Date
M Hickford	6c26da8404	credential: erase all matching credentials `credential reject` sends the erase action to each helper, but the exact behaviour of erase isn't specified in documentation or tests. Some helpers (such as credential-store and credential-libsecret) delete all matching credentials, others (such as credential-cache) delete at most one matching credential. Test that helpers erase all matching credentials. This behaviour is easiest to reason about. Users expect that `echo "url=https://example.com" \| git credential reject` or `echo "url=https://example.com\nusername=tim" \| git credential reject` erase all matching credentials. Fix credential-cache. Signed-off-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-15 13:26:41 -07:00
M Hickford	aeb21ce22e	credential: avoid erasing distinct password Test that credential helpers do not erase a password distinct from the input. Such calls can happen when multiple credential helpers are configured. Fixes for credential-cache and credential-store. Signed-off-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-15 13:26:39 -07:00
Patrick Steinhardt	c40f0b7877	revision: handle pseudo-opts in `--stdin` mode While both git-rev-list(1) and git-log(1) support `--stdin`, it only accepts commits and files. Most notably, it is impossible to pass any of the pseudo-opts like `--all`, `--glob=` or others via stdin. This makes it hard to use this function in certain scripted scenarios, like when one wants to support queries against specific revisions, but also against reference patterns. While this is theoretically possible by using arguments, this may run into issues once we hit platform limits with sufficiently large queries. And because `--stdin` cannot handle pseudo-opts, the only alternative would be to use a mixture of arguments and standard input, which is cumbersome. Implement support for handling pseudo-opts in both commands to support this usecase better. One notable restriction here is that `--stdin` only supports "stuck" arguments in the form of `--glob=foo`. This is because "unstuck" arguments would also require us to read the next line, which would add quite some complexity to the code. This restriction should be fine for scripted usage though. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-15 12:09:31 -07:00
Josip Sokcevic	5768478edc	diff-lib: honor override_submodule_config flag bit When `diff.ignoreSubmodules = all` is set and submodule commits are manually staged (e.g. via `git-update-index`), `git-commit` should record the commit with updated submodules. `index_differs_from` is called from `prepare_to_commit` with flags set to `override_submodule_config = 1`. `index_differs_from` then merges the default diff flags and passed flags. When `diff.ignoreSubmodules` is set to "all", `flags` ends up having both `override_submodule_config` and `ignore_submodules` set to 1. This results in `git-commit` ignoring staged commits. This patch restores original `flags.ignore_submodule` if `flags.override_submodule_config` is set. Signed-off-by: Josip Sokcevic <sokcevic@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-14 11:28:12 -07:00
Junio C Hamano	32fe7fff0c	Merge branch 'zh/ls-files-format-atoms' Some atoms that can be used in "--format=<format>" for "git ls-tree" were not supported by "git ls-files", even though they were relevant in the context of the latter. * zh/ls-files-format-atoms: ls-files: align format atoms with ls-tree	2023-06-13 12:29:46 -07:00
Junio C Hamano	ca9c063c18	Merge branch 'sl/diff-tree-sparse' "git diff-tree" has been taught to take advantage of the sparse-index feature. * sl/diff-tree-sparse: diff-tree: integrate with sparse index	2023-06-13 12:29:46 -07:00
Junio C Hamano	e490bea8a6	Merge branch 'jk/format-patch-message-id-unleak' Leakfix. * jk/format-patch-message-id-unleak: format-patch: free elements of rev.ref_message_ids list format-patch: free rev.message_id when exiting	2023-06-13 12:29:46 -07:00
Junio C Hamano	cbc882ea38	Merge branch 'jc/pack-ref-exclude-include' "git pack-refs" learns "--include" and "--exclude" to tweak the ref hierarchy to be packed using pattern matching. * jc/pack-ref-exclude-include: pack-refs: teach pack-refs --include option pack-refs: teach --exclude option to exclude refs from being packed docs: clarify git-pack-refs --all will pack all refs	2023-06-13 12:29:45 -07:00
Junio C Hamano	6901ffe80c	Merge branch 'jc/diff-s-with-other-options' The "-s" (silent, squelch) option of the "diff" family of commands did not interact with other options that specify the output format well. This has been cleaned up so that it will clear all the formatting options given before. * jc/diff-s-with-other-options: diff: fix interaction between the "-s" option and other options	2023-06-13 12:29:45 -07:00
Junio C Hamano	6d2a88c728	Merge branch 'kh/keep-tag-editmsg-upon-failure' "git tag" learned to leave the "$GIT_DIR/TAG_EDITMSG" file when the command failed, so that the user can salvage what they typed. * kh/keep-tag-editmsg-upon-failure: tag: keep the message file in case ref transaction fails t/t7004-tag: add regression test for successful tag creation doc: tag: document `TAG_EDITMSG`	2023-06-13 12:29:44 -07:00
Taylor Blau	06f3867865	pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack When opening a MIDX bitmap, we the pack-bitmap machinery eagerly calls `prepare_midx_pack()` on each of the packs contained in the MIDX. This is done in order to populate the array of `struct packed_git `s held by the MIDX, which we need later on in `load_reverse_index()`, since it calls `load_pack_revindex()` on each of the MIDX'd packs, and requires that the caller provide a pointer to a `struct packed_git`. When opening one of these packs fails, the pack-bitmap code will `die()` indicating that it can't open one of the packs in the MIDX. This indicates that the MIDX is somehow broken with respect to the current state of the repository. When this is the case, we indeed cannot make use of the MIDX bitmap to speed up reachability traversals. However, it does not mean that we can't perform reachability traversals at all. In other failure modes, that same function calls `warning()` and then returns -1, indicating to its caller (`open_bitmap()`) that we should either look for a pack bitmap if one is available, or perform normal object traversal without using bitmaps at all. There is no reason why this case should cause us to die. If we instead continued (by jumping to `cleanup` as this patch does) and avoid using bitmaps altogether, we may again try and query the MIDX, which will also fail. But when trying to call `fill_midx_entry()` fails, it also returns a signal of its failure, and prompts the caller to try and locate the object elsewhere. In other words, the normal object traversal machinery works fine in the presence of a corrupt MIDX, so there is no reason that the MIDX bitmap machinery should abort in that case when we could easily continue. Note that we could* in theory try again to load a MIDX bitmap after calling `reprepare_packed_git()`. Even though the `prepare_packed_git()` code is careful to avoid adding a pack that we already have, `prepare_midx_pack()` is not. So if we got part of the way through calling `prepare_midx_pack()` on a stale MIDX, and then tried again on a fresh MIDX that contains some of the same packs, we would end up with a loop through the `->next` pointer. For now, let's do the simplest thing possible and fallback to the non-bitmap code when we detect a stale MIDX so that the complete fix as above can be implemented carefully. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 14:19:31 -07:00
Taylor Blau	4dc16e2cb0	gc: introduce `gc.recentObjectsHook` This patch introduces a new multi-valued configuration option, `gc.recentObjectsHook` as a means to mark certain objects as recent (and thus exempt from garbage collection), regardless of their age. When performing a garbage collection operation on a repository with unreachable objects, Git makes its decision on what to do with those object(s) based on how recent the objects are or not. Generally speaking, unreachable-but-recent objects stay in the repository, and older objects are discarded. However, we have no convenient way to keep certain precious, unreachable objects around in the repository, even if they have aged out and would be pruned. Our options today consist of: - Point references at the reachability tips of any objects you consider precious, which may be undesirable or infeasible if there are many such objects. - Track them via the reflog, which may be undesirable since the reflog's lifetime is limited to that of the reference it's tracking (and callers may want to keep those unreachable objects around for longer). - Extend the grace period, which may keep around other objects that the caller does want to discard. - Manually modify the mtimes of objects you want to keep. If those objects are already loose, this is easy enough to do (you can just enumerate and `touch -m` each one). But if they are packed, you will either end up modifying the mtimes of all objects in that pack, or be forced to write out a loose copy of that object, both of which may be undesirable. Even worse, if they are in a cruft pack, that requires modifying its `*.mtimes` file by hand, since there is no exposed plumbing for this. - Force the caller to construct the pack of objects they want to keep themselves, and then mark the pack as kept by adding a ".keep" file. This works, but is burdensome for the caller, and having extra packs is awkward as you roll forward your cruft pack. This patch introduces a new option to the above list via the `gc.recentObjectsHook` configuration, which allows the caller to specify a program (or set of programs) whose output is treated as a set of objects to treat as recent, regardless of their true age. The implementation is straightforward. Git enumerates recent objects via `add_unseen_recent_objects_to_traversal()`, which enumerates loose and packed objects, and eventually calls add_recent_object() on any objects for which `want_recent_object()`'s conditions are met. This patch modifies the recency condition from simply "is the mtime of this object more recent than the cutoff?" to "[...] or, is this object mentioned by at least one `gc.recentObjectsHook`?". Depending on whether or not we are generating a cruft pack, this allows the caller to do one of two things: - If generating a cruft pack, the caller is able to retain additional objects via the cruft pack, even if they would have otherwise been pruned due to their age. - If not generating a cruft pack, the caller is likewise able to retain additional objects as loose. A potential alternative here is to introduce a new mode to alter the contents of the reachable pack instead of the cruft one. One could imagine a new option to `pack-objects`, say `--extra-reachable-tips` that does the same thing as above, adding the visited set of objects along the traversal to the pack. But this has the unfortunate side-effect of altering the reachability closure of that pack. If parts of the unreachable object graph mentioned by one or more of the "extra reachable tips" programs is not closed, then the resulting pack won't be either. This makes it impossible in the general case to write out reachability bitmaps for that pack, since closure is a requirement there. Instead, keep these unreachable objects in the cruft pack (or set of unreachable, loose objects) instead, to ensure that we can continue to have a pack containing just reachable objects, which is always safe to write a bitmap over. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 14:12:20 -07:00
Taylor Blau	73320e49ad	builtin/repack.c: only collect fully-formed packs To partition the set of packs based on which ones are "kept" (either they have a .keep file, or were otherwise marked via the `--keep-pack` option) and "non-kept" ones (anything else), `git repack` uses its `collect_pack_filenames()` function. Ordinarily, we would rely on a convenience function such as `get_all_packs()` to enumerate and partition the set of packs. But `collect_pack_filenames()` uses `readdir()` directly to read the contents of the "$GIT_DIR/objects/pack" directory, and adds each entry ending in ".pack" to the appropriate list (either kept, or non-kept as above). This is subtly racy, since `collect_pack_filenames()` may see a pack that is not fully staged (i.e., it is missing its ".idx" file). Ordinarily, this doesn't cause a problem. But it can cause issues when generating a cruft pack. This is because `git repack` feeds (among other things) the list of existing kept packs down to `git pack-objects --cruft` to indicate that any kept packs will not be removed from the repository (so that the cruft pack machinery can avoid packing objects that appear in those packs as cruft). But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`. So if a ".pack" file exists (necessary to get that pack to appear to `collect_pack_filenames()`), but doesn't have a corresponding ".idx" file (necessary to get that pack to appear via `get_all_packs()`), we'll complain with: fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack' Fix the above by teaching `collect_pack_filenames()` to only collect packs with their corresponding `.idx` files in place, indicating that those packs have been fully staged. There are a couple of things worth noting: - Since each entry in the `extra_keep` list (which contains the `--keep-pack` names) has a `.pack` suffix, we'll have to swap the suffix from ".pack" to ".idx", and compare that instead. - Since we use the the `fname_kept_list` to figure out which packs to delete (with `git repack -d`), we would have previously deleted a `.pack` with no index (since the existince of a ".pack" file is necessary and sufficient to include that pack in the list of existing non-kept packs). Now we will leave it alone (since that pack won't appear in the list). This is far more correct behavior, since we don't want to race with a pack being staged. Deleting a partially staged pack is unlikely, however, since the window of time between staging a pack and moving its .idx file into place is miniscule. Note that this window does not* include the time it takes to receive and index the pack, since the incoming data goes into "$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack" and is thus ignored by collect_pack_filenames(). In the future, this function should probably be rewritten as a callback to `for_each_file_in_pack_dir()`, but this is the simplest change we could do in the short-term. Reported-by: Michael Haggerty <mhagger@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:54:38 -07:00
Todd Zullinger	78e56cff69	t/lib-gpg: require GPGSSH for GPGSSH_VERIFYTIME prereq The GPGSSH_VERIFYTIME prequeq makes use of "${GNUPGHOME}" but does not create it. Require GPGSSH which creates the "${GNUPGHOME}" directory. Additionally, it makes sense to require GPGSSH in GPGSSH_VERIFYTIME because the latter builds on the former. If we can't use GPGSSH, there's little point in checking whether GPGSSH_VERIFYTIME is usable. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:50:39 -07:00
Derrick Stolee	9c7d1b057f	repository: create read_replace_refs setting The 'read_replace_refs' global specifies whether or not we should respect the references of the form 'refs/replace/<oid>' to replace which object we look up when asking for '<oid>'. This global has caused issues when it is not initialized properly, such as in `b6551feadf` (merge-tree: load default git config, 2023-05-10). To make this more robust, move its config-based initialization out of git_default_config and into prepare_repo_settings(). This provides a repository-scoped version of the 'read_replace_refs' global. The global still has its purpose: it is disabled process-wide by the GIT_NO_REPLACE_OBJECTS environment variable or by a call to disable_replace_refs() in some specific Git commands. Since we already encapsulated the use of the constant inside replace_refs_enabled(), we can perform the initialization inside that method, if necessary. This solves the problem of forgetting to check the config, as we will check it before returning this value. Due to this encapsulation, the global can move to be static within replace-object.c. There is an interesting behavior change possible here: we now have a repository-scoped understanding of this config value. Thus, if there was a command that recurses into submodules and might follow replace refs, then it would now respect the core.useReplaceRefs config value in each repository. 'git grep --recurse-submodules' is such a command that recurses into submodules in-process. We can demonstrate the granularity of this config value via a test in t7814. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:34:55 -07:00
Patrick Steinhardt	f79e18849b	cat-file: add option '-Z' that delimits input and output with NUL In `db9d67f2e9` (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), we have introduced a new mode to read the input via NUL-delimited records instead of newline-delimited records. This allows the user to query for revisions that have newlines in their path component. While unusual, such queries are perfectly valid and thus it is clear that we should be able to support them properly. Unfortunately, the commit only changed the input to be NUL-delimited, but didn't change the output at the same time. While this is fine for queries that are processed successfully, it is less so for queries that aren't. In the case of missing commits for example the result can become entirely unparsable: ``` $ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" \| git cat-file --batch -z `7ce4f05bae` blob 10 1234567890 commit missing ``` This is of course a crafted query that is intentionally gaming the deficiency, but more benign queries that contain newlines would have similar problems. Ideally, we should have also changed the output to be NUL-delimited when `-z` is specified to avoid this problem. As the input is NUL-delimited, it is clear that the output in this case cannot ever contain NUL characters by itself. Furthermore, Git does not allow NUL characters in revisions anyway, further stressing the point that using NUL-delimited output is safe. The only exception is of course the object data itself, but as git-cat-file(1) prints the size of the object data clients should read until that specified size has been consumed. But even though `-z` has only been introduced a few releases ago in Git v2.38.0, changing the output format retroactively to also NUL-delimit output would be a backwards incompatible change. And while one could make the argument that the output is inherently broken already, we need to assume that there are existing users out there that use it just fine given that revisions containing newlines are quite exotic. Instead, introduce a new option `-Z` that switches to NUL-delimited input and output. While this new option could arguably only switch the output format to be NUL-delimited, the consequence would be that users have to always specify both `-z` and `-Z` when the input may contain newlines. On the other hand, if the user knows that there never will be newlines in the input, they don't have to use either of those options. There is thus no usecase that would warrant treating input and output format separately, which is why we instead opt to "do the right thing" and have `-Z` mean to NUL-terminate both formats. The old `-z` option is marked as deprecated with a hint that its output may become unparsable. It is thus hidden both from the synopsis as well as the command's help output. Co-authored-by: Toon Claes <toon@iotcl.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:46 -07:00
Patrick Steinhardt	b116c77307	t1006: modernize test style to use `test_cmp` The tests for git-cat-file(1) are quite old and haven't ever been updated since they were introduced. They thus tend to use old idioms that have since grown outdated. Most importantly, many of the tests use `test $A = $B` to compare expected and actual output. This has the downside that it is impossible to tell what exactly is different between both versions in case the test fails. Refactor the tests to instead use `test_cmp`. While more verbose, it both tends to be more readable and will result in a nice diff in case states don't match. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:24 -07:00
Patrick Steinhardt	c7309f63c6	t1006: don't strip timestamps from expected results In t1006 we have a bunch of tests that verify the output format of the git-cat-file(1) command. But while part of the output for some tests would include commit timestamps, we don't verify those but instead strip them before comparing expected with actual results. This is done by the function `maybe_remove_timestamp`, which goes all the way back to the ancient commit `b335d3f121` (Add tests for git cat-file, 2008-04-23). Our tests had been in a different shape back then. Most importantly we didn't yet have the infrastructure to create objects with deterministic timestamps. Nowadays we do though, and thus there is no reason anymore to strip the timestamps. Refactor the tests to not strip the timestamp anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:23:24 -07:00
Shuqi Liang	8fac776f44	worktree: integrate with sparse-index The index is read in 'worktree.c' at two points: 1.The 'validate_no_submodules' function, which checks if there are any submodules present in the worktree. 2.The 'check_clean_worktree' function, which verifies if a worktree is 'clean', i.e., there are no untracked or modified but uncommitted files. This is done by running the 'git status' command, and an error message is thrown if the worktree is not clean. Given that 'git status' is already sparse-aware, the function is also sparse-aware. Hence we can just set the requires-full-index to false for "git worktree". Add tests that verify that 'git worktree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~20% execution time reduction for 'git worktree' using a sparse index: (Note:the p2000 test results didn't reflect the huge speedup because of the index reading time is minuscule comparing to the filesystem operations.) Test before after ----------------------------------------------------------------------- 2000.102: git worktree add....(full-v3) 3.15 2.82 -10.5% 2000.103: git worktree add....(full-v4) 3.14 2.84 -9.6% 2000.104: git worktree add....(sparse-v3) 2.59 2.14 -16.4% 2000.105: git worktree add....(sparse-v4) 2.10 1.57 -25.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Acked-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 12:13:58 -07:00
René Scharfe	6d224ac286	run-command: report exec error even on ENOENT If execve(2) fails with ENOENT and we report the error, we use the format "cannot run %s", followed by the actual error message. For other errors we use "cannot exec '%s'". Stop making this subtle distinction and use the second format for all execve(2) errors. This simplifies the code and makes the prefix more precise by indicating the failed operation. It also allows us to slightly simplify t1800.16. On Windows -- which lacks execve(2) -- we already use a single format in all cases: "cannot spawn %s". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 11:00:22 -07:00
René Scharfe	6b6fe8b43e	t1800: loosen matching of error message for bad shebang t1800.16 checks whether an attempt to run a hook script with a missing executable in its #! line fails and reports that error. The expected error message differs between platforms. The test handles two common variants, but on NonStop OS we get a third one: "fatal: cannot exec 'bad-hooks/test-hook': ...", which causes the test to fail there. We don't really care about the specific message text all that much here. Use grep and a single regex with alternations to ascertain that we get an error message (fatal or otherwise) about the failed invocation of the hook, but don't bother checking if we get the right variant for the platform the test is running on or whether quoting is done. This looser check let's the test pass on NonStop OS. Reported-by: Randall S. Becker <randall.becker@nexbridge.ca> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 11:00:21 -07:00
Derrick Stolee	6f74648cea	add: test use of brackets when color is disabled From 02156b81bbb2cafb19d702c55d45714fcf224048 Mon Sep 17 00:00:00 2001 From: Derrick Stolee <derrickstolee@github.com> Date: Wed, 7 Jun 2023 09:39:01 -0400 Subject: [PATCH v2 2/2] add: test use of brackets when color is disabled The interactive add command, 'git add -i', displays a menu of options using full words. When color is enabled, the first letter of each word is changed to a highlight color to signal that the first letter could be used as a command. Without color, brackets ("[]") are used around these first letters. This behavior was not previously tested directly in t3701, so add a test for it now. Since we use 'git add -i >actual <input' without 'force_color', the color system recognizes that colors are not available on stdout and will be disabled by default. This test would reproduce correctly with or without the fix in the previous commit to make sure that color.ui is respected in 'git add'. Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 10:50:18 -07:00
Derrick Stolee	7cf3b49f47	add: check color.ui for interactive add When 'git add -i' and 'git add -p' were converted to a builtin, they introduced a color bug: the 'color.ui' config setting is ignored. The included test demonstrates an example that is similar to the previous test, which focuses on customizing colors. Here, we are demonstrating that colors are not being used at all by comparing the raw output and the color-decoded version of that output. The fix is simple, to use git_color_default_config() as the fallback for git_add_config(). A more robust change would instead encapsulate the git_use_color_default global in methods that would check the config setting if it has not been initialized yet. Some ideas are being discussed on this front [1], but nothing has been finalized. [1] https://lore.kernel.org/git/pull.1539.git.1685716420.gitgitgadget@gmail.com/ This test case naturally bisects down to `0527ccb1b5` (add -i: default to the built-in implementation, 2021-11-30), but the fix makes it clear that this would be broken even if we added the config to use the builtin earlier than this. Reported-by: Greg Alexander <gitgreg@galexander.org> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 10:49:16 -07:00
Kousik Sanagavarapu	26c9c03f0a	ref-filter: add new "signature" atom Duplicate the code for outputting the signature and its other parameters for commits and tags in ref-filter from pretty. In the future, this will help in getting rid of the current duplicate implementations of such logic everywhere, when ref-filter can do everything that pretty is doing. The new atom "signature" and its friends are equivalent to the existing pretty formats as follows: %(signature) = %GG %(signature:grade) = %G? %(siganture:signer) = %GS %(signature:key) = %GK %(signature:fingerprint) = %GF %(signature:primarykeyfingerprint) = %GP %(signature:trustlevel) = %GT Co-authored-by: Hariom Verma <hariom18599@gmail.com> Co-authored-by: Jaydeep Das <jaydeepjd.8914@gmail.com> Co-authored-by: Nsengiyumva Wilberforce <nsengiyumvawilberforce@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-06 09:32:15 +09:00
Kousik Sanagavarapu	2f36339fa8	t/lib-gpg: introduce new prereq GPG2 GnuPG v2.0.0 released in 2006, which according to its release notes https://gnupg.org/download/release_notes.html is the "First stable version of GnuPG integrating OpenPGP and S/MIME". Use this version or its successors for tests that will fail for versions less than v2.0.0 because of the difference in the output on stderr between the versions (v2.* vs v0.* or v2.* vs v1.). Skip if the GPG version detected is less than v2.0.0. Do not, however, remove the existing prereq GPG yet since a lot of tests still work with the prereq GPG (that is even with versions v0. or v1.*) and some systems still use these versions. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-06 09:32:12 +09:00
Jeff King	8260bc5902	diff: detect pathspec magic not supported by --follow The --follow code doesn't handle most forms of pathspec magic. We check that no unexpected ones have made it to try_to_follow_renames() with a runtime GUARD_PATHSPEC() check, which gives behavior like this: $ git log --follow ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted The same is true of ":(glob)", ":(attr)", and so on. It's good that we notice the problem rather than continuing and producing a wrong answer. But there are two non-ideal things: 1. The idea of GUARD_PATHSPEC() is to catch programming errors where low-level code gets unexpected pathspecs. We'd usually try to catch unsupported pathspecs by passing a magic_mask to parse_pathspec(), which would give the user a much better message like: pathspec magic not supported by this command: 'icase' That doesn't happen here because git-log usually _does_ support all types of pathspec magic, and so it passes "0" for the mask (this call actually happens in setup_revisions()). It needs to distinguish the normal case from the "--follow" one but currently doesn't. 2. In addition to --follow, we have the log.follow config option. When that is set, we try to turn on --follow mode only when there is a single pathspec (since --follow doesn't handle anything else). But really, that ought to be expanded to "use --follow when the pathspec supports it". Otherwise, we'd complain any time you use an exotic pathspec: $ git config log.follow true $ git log ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted We should instead just avoid enabling follow mode if it's not supported by this particular invocation. This patch expands our diff_check_follow_pathspec() function to cover pathspec magic, solving both problems. A few final notes: - we could also solve (1) by passing the appropriate mask to parse_pathspec(). But that's not great for two reasons. One is that the error message is less precise. It says "magic not supported by this command", but really it is not the command, but rather the --follow option which is the problem. The second is that it always calls die(). But for our log.follow code, we want to speculatively ask "is this pathspec OK?" and just get a boolean result. - This is obviously the right thing to do for ':(icase)' and most other magic options. But ':(glob)' is a bit odd here. The --follow code doesn't support wildcards, but we allow them anyway. From try_to_follow_renames(): #if 0 /* * We should reject wildcards as well. Unfortunately we * haven't got a reliable way to detect that 'foo\bar' in fact has no wildcards. nowildcard_len is merely a hint for * optimization. Let it slip for now until wildmatch is taught * about dry-run mode and returns wildcard info. / if (opt->pathspec.has_wildcard) BUG("wildcards are not supported"); #endif So something like "git log --follow 'Make'" is already doing the wrong thing, since ":(glob)" behavior is already the default (it is used only to countermand an earlier --noglob-pathspecs). So we _could_ loosen the guard to allow :(glob), since it just behaves the same as pathspecs do by default. But it seems like a backwards step to do so. It already doesn't work (it hits the BUG() case currently), and given that the user took an explicit step to say "this pathspec should glob", it is reasonable for us to say "no, --follow does not support globbing" (or in the case of log.follow, avoid turning on follow mode). Which is what happens after this patch. - The set of allowed pathspec magic is obviously the same as in GUARD_PATHSPEC(). We could perhaps factor these out to avoid repetition. The point of having separate masks and GUARD calls is that we don't necessarily know which parsed pathspecs will be used where. But in this case, the two are heavily correlated. Still, there may be some value in keeping them separate; it would make anyone think twice about adding new magic to the list in diff_check_follow_pathspec(). They'd need to touch try_to_follow_renames() as well, which is the code that would actually need to be updated to handle more exotic pathspecs. - The documentation for log.follow says that it enables --follow "...when a single <path> is given". We could possibly expand that to say "with no unsupported pathspec magic", but that raises the question of documenting which magic is supported. I think the existing wording of "single <path>" sufficiently encompasses the idea (the forbidden magic is stuff that might match multiple entries), and the spirit remains the same. Reported-by: Jim Pryor <dubiousjim@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-03 10:34:25 +09:00
Victoria Dye	3867f6d650	repository: move 'repository_format_worktree_config' to repo scope Move 'repository_format_worktree_config' out of the global scope and into the 'repository' struct. This change is similar to how 'repository_format_partial_clone' was moved in `ebaf3bcf1a` (repository: move global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository' struct and updating 'setup.c' & 'repository.c' functions to assign the value appropriately. The primary goal of this change is to be able to load the worktree config of a submodule depending on whether that submodule - not its superproject - has 'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()' has access to the newly repo-scoped configuration, add a 'struct repository' argument to 'do_git_config_sequence()' and pass it the 'repo' value from 'config_with_options()'. Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to verify 'extensions.worktreeConfig' is read an used independently by superprojects and submodules. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-26 13:53:41 +09:00
Victoria Dye	847d0027d2	config: use gitdir to get worktree config Update 'do_git_config_sequence()' to read the worktree config from 'config.worktree' in 'opts->git_dir' rather than the gitdir of 'the_repository'. The worktree config is loaded from the path returned by 'git_pathdup("config.worktree")', the 'config.worktree' relative to the gitdir of 'the_repository'. If loading the config for a submodule, this path is incorrect, since 'the_repository' is the superproject. 'opts->git_dir' is the gitdir of the submodule being configured, so the config file in that location should be read instead. To ensure the use of 'opts->git_dir' is safe, require that 'opts->git_dir' is set if-and-only-if 'opts->commondir' is set (rather than "only-if" as it is now). In all current usage of 'config_options', these values are set together, so the stricter check does not change any behavior. Finally, add tests to 't3007-ls-files-recurse-submodules.sh' to verify the corrected config is loaded. Use 'ls-files' to test this because, unlike some other '--recurse-submodules' commands, 'ls-files' parses the config of the submodule in the same process as the superproject (via 'show_submodule()' -> 'repo_read_index()' -> 'prepare_repo_settings()'). As a result, 'the_repository' points to the config of the superproject but the commondir/gitdir in the config sequence will be that of the submodule, providing the exact scenario needed to verify this patch. The first test ('--recurse-submodules parses submodule repo config') checks that the submodule's repo config is read when running 'ls-files' on the superproject; this confirms already-working behavior, serving as a reference for how worktree config parsing should behave. The second test ('--recurse-submodules parses submodule worktree config') tests the same scenario as the previous but instead using the worktree config, demonstrating the corrected behavior. The 'test_config' helper is extended for this case so that it properly applies the '--worktree' option to the configure/unconfigure operations it performs. Note that, although the submodule worktree config is now parsed instead of the superproject's, 'extensions.worktreeConfig' in the superproject still controls whether or not the worktree config is enabled at all in the submodule. This will be fixed in a later patch. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-26 13:53:40 +09:00
Todd Zullinger	20025fdfc7	t/lib-gpg: fix ssh-keygen -Y check-novalidate with openssh-9.0 OpenSSH-9.0 requires a namespace option with `-Y check-novalidate`. This was added in openssh-portable commit a0b5816f8 (upstream: ssh-keygen -Y check-novalidate requires namespace or SEGV, 2022-03-18). The -n option was documented as a required option since check-novalidate was added in openssh-portable 8aa2aa3cd (upstream: Allow testing signature syntax and validity without verifying, 2019-09-16). Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-26 13:37:08 +09:00
Todd Zullinger	e48a21df65	trace2 tests: fix PTHREADS prereq The prereq guard added in `14903c8e92` (trace2 tests: guard pthread test with "PTHREAD", 2022-11-24) lacks the S in PTHREADS, causing it to never be satisfied. Fix the spelling of the prereq. Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-26 13:31:15 +09:00
Junio C Hamano	6a6621fe9a	Merge branch 'sl/sparse-write-tree-part-2' Fix-up to a topic already graduated to 'master'. * sl/sparse-write-tree-part-2: t1092: update a write-tree test	2023-05-25 05:53:55 +09:00
Taylor Blau	fbc806acd1	builtin/submodule--helper.c: handle missing submodule URLs In `e0a862fdaf` (submodule helper: convert relative URL to absolute URL if needed, 2018-10-16), `prepare_to_clone_next_submodule()` lost the ability to handle URL-less submodules, due to a change from: if (repo_get_config_string_const(the_repostiory, sb.buf, &url)) url = sub->url; to if (repo_get_config_string_const(the_repostiory, sb.buf, &url)) { if (starts_with_dot_slash(sub->url) \|\| starts_with_dot_dot_slash(sub->url)) { /* ... */ } } , which will segfault when `sub->url` is NULL, since both `starts_with_dot_slash()` does not guard its arguments as non-NULL. Guard the checks to both of the above functions by first checking whether `sub->url` is non-NULL. There is no need to check whether `sub` itself is NULL, since we already perform this check earlier in `prepare_to_clone_next_submodule()`. By adding a NULL-ness check on `sub->url`, we'll fall into the 'else' branch, setting `url` to `sub->url` (which is NULL). Before attempting to invoke `git submodule--helper clone`, check whether `url` is NULL, and die() if it is. Reported-by: Tribo Dar <3bodar@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-25 05:26:59 +09:00
ZheNing Hu	4d28c4f75f	ls-files: align format atoms with ls-tree "git ls-files --format" can be used to format the output of multiple file entries in the index, while "git ls-tree --format" can be used to format the contents of a tree object. However, the current set of %(objecttype), "(objectsize)", and "%(objectsize:padded)" atoms supported by "git ls-files --format" is a subset of what is available in "git ls-tree --format". Users sometimes need to establish a unified view between the index and tree, which can help with comparison or conversion between the two. Therefore, this patch adds the missing atoms to "git ls-files --format". "%(objecttype)" can be used to retrieve the object type corresponding to a file in the index, "(objectsize)" can be used to retrieve the object size corresponding to a file in the index, and "%(objectsize:padded)" is the same as "(objectsize)", except with padded format. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 20:12:57 +09:00
John Cai	447a3b7331	t9400-git-cvsserver-server: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	7dac6347c5	t9200-git-cvsexportcommit: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	7d7097bf59	t9104-git-svn-follow-parent: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	be1fce6dae	t9100-git-svn-basic: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	3a3b98be91	t7700-repack: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	bd48dfad69	t7600-merge: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:29 +09:00
John Cai	a6171e1478	t7508-status: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:28 +09:00
John Cai	10dae78533	t7201-co: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:28 +09:00
John Cai	c970681f50	t7111-reset-table: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:28 +09:00
John Cai	a32a724b03	t7110-reset-merge: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-23 12:54:28 +09:00
Junio C Hamano	cacc15ee3f	Merge branch 'js/rebase-count-fixes' A few bugs in the sequencer machinery that results in miscounting the steps have been corrected. * js/rebase-count-fixes: rebase -r: fix the total number shown in the progress rebase --update-refs: fix loops	2023-05-20 05:35:57 +09:00
Junio C Hamano	dc3fd2486f	Merge branch 'jc/do-not-negate-test-helpers' Small fixes. * jc/do-not-negate-test-helpers: test: do not negate test_path_is_* to assert absense t2021: do not negate test_path_is_dir tests: do not negate test_path_exists	2023-05-20 05:35:56 +09:00
John Cai	3b8724bce6	t7101-reset-empty-subdirs: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:12 -07:00
John Cai	32942346aa	t6050-replace: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:12 -07:00
John Cai	a45bb750db	t5306-pack-nobase: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:12 -07:00
John Cai	aac864059f	t5303-pack-corruption-resilience: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:12 -07:00
John Cai	cc0c1ad9ad	t5301-sliding-window: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:12 -07:00
John Cai	e478a52087	t5300-pack-object: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	1afebc92ef	t4206-log-follow-harder-copies: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	3da9be913a	t4202-log: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	93fc423e9a	t4004-diff-rename-symlink: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	9297229d15	t4003-diff-rename-1: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	9cfcbcc095	t4002-diff-basic: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	a8fcc0ac89	t3903-stash: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	0aa0266c4b	t3700-add: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	0a6cb5c42f	t3500-cherry: modernize test format Some tests still use the old format with four spaces indentation. Standardize the tests to the new format with tab indentation. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	350c484239	t1006-cat-file: modernize test format Some tests in t1006-cat-file.sh used the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	a10bb2ded5	t1002-read-tree-m-u-2way: modernize test format Some tests are still using the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	58db6e450b	t1001-read-tree-m-2way: modernize test format Some tests are still using the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:11 -07:00
John Cai	3c2f5d26c0	t3210-pack-refs: modernize test format Some tests in t3210-pack-refs.sh used the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:10 -07:00
John Cai	27990663f0	t0030-stripspace: modernize test format Some tests in t0030-stripspace.sh used the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:10 -07:00
John Cai	2f68c99a3b	t0000-basic: modernize test format Some tests in t0000-basic.sh used the older four space indent format. Update these to use tabs. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-19 10:08:10 -07:00
Junio C Hamano	646ca89558	Merge branch 'jk/http-test-cgipassauth-unavailable-in-older-apache' We started unconditionally testing with CGIPassAuth directive but it is unavailable in older Apache that ships with CentOS 7 that has about a year of shelf-life still left. The test has conditionally been disabled when running with an ancient Apache. This was a fix for a recent regression caught before the release, so no need to mention it in the release notes. * jk/http-test-cgipassauth-unavailable-in-older-apache: t/lib-httpd: make CGIPassAuth support conditional	2023-05-19 09:27:07 -07:00
Junio C Hamano	633390bd08	Merge branch 'bc/clone-empty-repo-via-protocol-v0' The server side of "git clone" now advertises the necessary hints to clients to help them to clone from an empty repository and learn object hash algorithm and the (unborn) branch pointed at by HEAD, even over the older v0/v1 protocol. * bc/clone-empty-repo-via-protocol-v0: upload-pack: advertise capabilities when cloning empty repos	2023-05-19 09:27:06 -07:00
Junio C Hamano	b04671b638	Merge branch 'jc/send-email-pre-process-fix' When "git send-email" that uses the validate hook is fed a message without and then with Message-ID, it failed to auto-assign a unique Message-ID to the former and instead reused the Message-ID from the latter, which has been corrected. This was a fix for a recent regression caught before the release, so no need to mention it in the release notes. * jc/send-email-pre-process-fix: t9001: mark the script as no longer leak checker clean send-email: clear the $message_id after validation	2023-05-19 09:27:06 -07:00
Jeff King	cfa120947e	format-patch: free rev.message_id when exiting We may allocate a message-id string via gen_message_id(), but we never free it, causing a small leak. This can be demonstrated by running t9001 with a leak-checking build. The offending test is the one touched by `3ece9bf0f9` (send-email: clear the $message_id after validation, 2023-05-17), but the leak is much older than that. The test was simply unlucky enough to trigger the leaking code path for the first time. We can fix this by freeing the string at the end of the function. We can also re-mark the test script as leak-free, effectively reverting `20bd08aefb` (t9001: mark the script as no longer leak checker clean, 2023-05-17). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-18 18:33:04 -07:00
Jeff King	eb1c42da8e	t/lib-httpd: make CGIPassAuth support conditional Commit `988aad99b4` (t5563: add tests for basic and anoymous HTTP access, 2023-02-27) added tests that require Apache to support the CGIPassAuth directive, which was added in Apache 2.4.13. This is fairly old (~8 years), but recent enough that we still encounter it in the wild (e.g., RHEL/CentOS 7, which is not EOL until June 2024). We can live with skipping the new tests on such a platform. But unfortunately, since the directive is used unconditionally in our apache.conf, it means the web server fails to start entirely, and we cannot run other HTTP tests at all (e.g., the basic ones in t5551). We can fix that by making the config conditional, and only triggering it for t5563. That solves the problem for t5551 (which then ignores the directive entirely). For t5563, we'd see apache complain in start_httpd; with the default setting of GIT_TEST_HTTPD, we'd then skip the whole script. But that leaves one small problem: people may set GIT_TEST_HTTPD=1 explicitly, which instructs the tests to fail (rather than skip) when we can't start the webserver (to avoid accidentally missing some tests). This could be worked around by having the user manually set GIT_SKIP_TESTS on a platform with an older Apache. But we can be a bit friendlier by doing the version check ourselves and setting an appropriate prereq. We'll use the (lack of) prereq to then skip the rest of t5563. In theory we could use the prereq to skip individual tests, but in practice this whole script depends on it. Reported-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-18 14:29:32 -07:00
Shuqi Liang	48c5fbfb89	diff-tree: integrate with sparse index The index is read in 'cmd_diff_tree' at two points: 1. The first index read was added in `fd66bcc31f` (diff-tree: read the index so attribute checks work in bare repositories, 2017-12-06) to deal with reading '.gitattributes' content. `77efbb366a` (attr: be careful about sparse directories, 2021-09-08) established that, in a sparse index, we do _not_ try to load a '.gitattributes' file from within a sparse directory. 2. The second index access point is involved in rename detection, specifically when reading from stdin.This was initially added in `f0c6b2a2fd` ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where 'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE \|DIFF_SETUP_USE_CACHE'. That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in `ff7fe37b05` (diff.c: move read_index() code back to the caller, 2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as of `6e0b8ed6d3` (diff.c: do not use a separate "size cache"., 2007-05-07) and nothing about 'detect_rename' otherwise indicates index usage. Hence we can just set the requires-full-index to false for "diff-tree". Add tests that verify that 'git diff-tree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~98% execution time reduction for 'git diff-tree' using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-tree HEAD (full-v3) 0.05 0.04 -20.0% 2000.95: git diff-tree HEAD (full-v4) 0.06 0.05 -16.7% 2000.96: git diff-tree HEAD (sparse-v3) 0.59 0.01 -98.3% 2000.97: git diff-tree HEAD (sparse-v4) 0.61 0.01 -98.4% 2000.98: git diff-tree HEAD -- f2/f4/a (full-v3) 0.05 0.05 +0.0% 2000.99: git diff-tree HEAD -- f2/f4/a (full-v4) 0.05 0.04 -20.0% 2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3) 0.58 0.01 -98.3% 2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4) 0.55 0.01 -98.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-18 10:40:33 -07:00
Junio C Hamano	20bd08aefb	t9001: mark the script as no longer leak checker clean The test uses "format-patch --thread" which is known to leak the generated message ID list. Plugging these leaks involves straightening out the memory ownership rules around rev_info.message_id and rev_info.ref_message_ids, and is beyond the scope of send-email fix, so for now mark the test as leaky to unblock the topic before the release. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 16:47:36 -07:00
Jacob Abel	926c40d04b	worktree add: emit warn when there is a bad HEAD Add a warning to `worktree add` when the command tries to reference HEAD, there exist valid local branches, and the HEAD points to a non-existent reference. Current Behavior: % git -C foo worktree list /path/to/repo/foo `dadc8e6dac` [main] /path/to/repo/foo_wt 0000000000 [badref] % git -C foo worktree add ../wt1 Preparing worktree (new branch 'wt1') HEAD is now at `dadc8e6dac` dummy commit % git -C foo_wt worktree add ../wt2 hint: If you meant to create a worktree containing a new orphan branch [...] hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % New Behavior: % git -C foo worktree list /path/to/repo/foo `dadc8e6dac` [main] /path/to/repo/foo_wt 0000000000 [badref] % git -C foo worktree add ../wt1 Preparing worktree (new branch 'wt1') HEAD is now at `dadc8e6dac` dummy commit % git -C foo_wt worktree add ../wt2 warning: HEAD points to an invalid (or orphaned) reference. HEAD path: '/path/to/repo/foo/.git/worktrees/foo_wt/HEAD' HEAD contents: 'ref: refs/heads/badref' hint: If you meant to create a worktree containing a new orphan branch [...] hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:25 -07:00
Jacob Abel	128e5496b3	worktree add: extend DWIM to infer --orphan Extend DWIM to try to infer `--orphan` when in an empty repository. i.e. a repository with an invalid/unborn HEAD, no local branches, and if `--guess-remote` is used then no remote branches. This behavior is equivalent to `git switch -c` or `git checkout -b` in an empty repository. Also warn the user (overriden with `-f`/`--force`) when they likely intend to checkout a remote branch to the worktree but have not yet fetched from the remote. i.e. when using `--guess-remote` and there is a remote but no local or remote refs. Current Behavior: % git --no-pager branch --list --remotes % git remote origin % git workree add ../main hint: If you meant to create a worktree containing a new orphan branch [...] hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % git workree add --guess-remote ../main hint: If you meant to create a worktree containing a new orphan branch [...] hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % git fetch --quiet % git --no-pager branch --list --remotes origin/HEAD -> origin/main origin/main % git workree add --guess-remote ../main Preparing worktree (new branch 'main') branch 'main' set up to track 'origin/main'. HEAD is now at `dadc8e6dac` commit message % New Behavior: % git --no-pager branch --list --remotes % git remote origin % git workree add ../main No possible source branch, inferring '--orphan' Preparing worktree (new branch 'main') % git worktree remove ../main % git workree add --guess-remote ../main fatal: No local or remote refs exist despite at least one remote present, stopping; use 'add -f' to overide or fetch a remote first % git workree add --guess-remote -f ../main No possible source branch, inferring '--orphan' Preparing worktree (new branch 'main') % git worktree remove ../main % git fetch --quiet % git --no-pager branch --list --remotes origin/HEAD -> origin/main origin/main % git workree add --guess-remote ../main Preparing worktree (new branch 'main') branch 'main' set up to track 'origin/main'. HEAD is now at `dadc8e6dac` commit message % Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:25 -07:00
Jacob Abel	35f0383ca6	worktree add: introduce "try --orphan" hint Add a new advice/hint in `git worktree add` for when the user tries to create a new worktree from a reference that doesn't exist. Current Behavior: % git init foo Initialized empty Git repository in /path/to/foo/ % touch file % git -C foo commit -q -a -m "test commit" % git -C foo switch --orphan norefbranch % git -C foo worktree add newbranch/ Preparing worktree (new branch 'newbranch') fatal: invalid reference: HEAD % New Behavior: % git init --bare foo Initialized empty Git repository in /path/to/foo/ % touch file % git -C foo commit -q -a -m "test commit" % git -C foo switch --orphan norefbranch % git -C foo worktree add newbranch/ Preparing worktree (new branch 'newbranch') hint: If you meant to create a worktree containing a new orphan branch hint: (branch with no commits) for this repository, you can do so hint: using the --orphan option: hint: hint: git worktree add --orphan newbranch/ hint: hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % git -C foo worktree add -b newbranch2 new_wt/ Preparing worktree (new branch 'newbranch') hint: If you meant to create a worktree containing a new orphan branch hint: (branch with no commits) for this repository, you can do so hint: using the --orphan option: hint: hint: git worktree add --orphan -b newbranch2 new_wt/ hint: hint: Disable this message with "git config advice.worktreeAddOrphan false" fatal: invalid reference: HEAD % Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:24 -07:00
Jacob Abel	7ab8918985	worktree add: add --orphan flag Add support for creating an orphan branch when adding a new worktree. The functionality of this flag is equivalent to git switch's --orphan option. Current Behavior: % git -C foo.git --no-pager branch -l + main % git -C foo.git worktree add main/ Preparing worktree (new branch 'main') HEAD is now at 6c93a75 a commit % % git init bar.git Initialized empty Git repository in /path/to/bar.git/ % git -C bar.git --no-pager branch -l % git -C bar.git worktree add main/ Preparing worktree (new branch 'main') fatal: not a valid object name: 'HEAD' % New Behavior: % git -C foo.git --no-pager branch -l + main % git -C foo.git worktree add main/ Preparing worktree (new branch 'main') HEAD is now at 6c93a75 a commit % % git init --bare bar.git Initialized empty Git repository in /path/to/bar.git/ % git -C bar.git --no-pager branch -l % git -C bar.git worktree add main/ Preparing worktree (new branch 'main') fatal: invalid reference: HEAD % git -C bar.git worktree add --orphan -b main/ Preparing worktree (new branch 'main') % git -C bar.git worktree add --orphan -b newbranch worktreedir/ Preparing worktree (new branch 'newbranch') % Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:24 -07:00
Jacob Abel	9ccdace1e8	t2400: add tests to verify --quiet Add tests to verify that the command performs operations the same with `--quiet` as without it. Additionally verifies that all non-fatal output is suppressed. Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:24 -07:00
Jacob Abel	ed6db0e9ff	t2400: refactor "worktree add" opt exclusion tests Pull duplicate test code into a function so that additional opt combinations can be tested succinctly. Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:24 -07:00
Jacob Abel	1b28fbd218	t2400: cleanup created worktree in test Signed-off-by: Jacob Abel <jacobabel@nullpo.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 15:55:24 -07:00
Junio C Hamano	3ece9bf0f9	send-email: clear the $message_id after validation Recently git-send-email started parsing the same message twice, once to validate _all_ the message before sending even the first one, and then after the validation hook is happy and each message gets sent, to read the contents to find out where to send to etc. Unfortunately, the effect of reading the messages for validation lingered even after the validation is done. Namely $message_id gets assigned if exists in the input files but the variable is global, and it is not cleared before pre_process_file runs. This causes reading a message without a message-id followed by reading a message with a message-id to misbehave---the sub reports as if the message had the same id as the previously written one. Clear the variable before starting to read the headers in pre_process_file. Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 14:11:38 -07:00
brian m. carlson	933e3a4ee2	upload-pack: advertise capabilities when cloning empty repos When cloning an empty repository, protocol versions 0 and 1 currently offer nothing but the header and flush packets for the /info/refs endpoint. This means that no capabilities are provided, so the client side doesn't know what capabilities are present. However, this does pose a problem when working with SHA-256 repositories, since we use the capabilities to know the remote side's object format (hash algorithm). As of `8b214c2e9d` ("clone: propagate object-format when cloning from void", 2023-04-05), this has been fixed for protocol v2, since there we always read the hash algorithm from the remote. Fortunately, the push version of the protocol already indicates a clue for how to solve this. When the /info/refs endpoint is accessed for a push and the remote is empty, we include a dummy "capabilities^{}" ref pointing to the all-zeros object ID. The protocol documentation already indicates this should _always_ be sent, even for fetches and clones, so let's just do that, which means we'll properly announce the hash algorithm as part of the capabilities. This just works with the existing code because we share the same ref code for fetches and clones, and libgit2, JGit, and dulwich do as well. There is one minor issue to fix, though. If we called send_ref with namespaces, we would return NULL with the capabilities entry, which would cause a crash. Instead, let's refactor out a function to print just the ref itself without stripping the namespace and use it for our special capabilities entry. Add several sets of tests for HTTP as well as for local clones. The behavior can be slightly different for HTTP versus a local or SSH clone because of the stateless-rpc functionality, so it's worth testing both. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-17 13:22:46 -07:00
Junio C Hamano	67a3b2b39f	Merge branch 'jc/attr-source-tree' "git --attr-source=<tree> cmd $args" is a new way to have any command to read attributes not from the working tree but from the given tree object. * jc/attr-source-tree: attr: teach "--attr-source=<tree>" global option to "git"	2023-05-17 10:11:41 -07:00
Kristoffer Haugsbakk	08c12ec1d0	tag: keep the message file in case ref transaction fails The ref transaction can fail after the user has written their tag message. In particular, if there exists a tag `foo/bar` and `git tag -a foo` is said then the command will only fail once it tries to write `refs/tags/foo`, which is after the file has been unlinked. Hold on to the message file for a little longer so that it is not unlinked before the fatal error occurs. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-16 11:38:14 -07:00
Kristoffer Haugsbakk	669c11de85	t/t7004-tag: add regression test for successful tag creation The standard tag message file is unlinked if the tag is created. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-16 11:38:14 -07:00
Junio C Hamano	b126b65b33	test: do not negate test_path_is_* to assert absense These tests use "! test_path_is_dir" or "! test_path_is_file" to ensure that the path is not recursively checked out or "submodule update" did not touch the working tree. Use "test_path_is_missing" to assert that the path does not exist, instead of negating test_path_is_* helpers; they are designed to be loud in wrong occasions. Besides, negating "test_path_is_dir" would mean we would be happy if a file exists there, which is not the case for these tests. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-16 09:14:23 -07:00
Junio C Hamano	eab648d2b4	t2021: do not negate test_path_is_dir In this test, a path (some_dir) that is originally a directory is to be removed and then to be replaced with a file of the same name. The expectation is that the path becomes a file at the end. However, "! test_path_is_dir some_dir" is used to catch a breakage that leaves the path as a directory. But as with all the "test_path_is..." helpers, this use of the helper makes it loud when the expectation (i.e. it is a directory) is met, and otherwise is silent when it is not---this does not help debugging. Be more explicit and state that we expect the path to become a file. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-16 09:14:21 -07:00
Junio C Hamano	c205923649	tests: do not negate test_path_exists As a way to assert the path 'foo' is missing, "! test_path_exists foo" is a poor way to do so, as the helper is designed to complain when 'foo' is missing, but the intention of the author who used negated form was to make sure it does not exist. This does not help debugging the tests. Use test_path_is_missing instead, which is a more appropriate helper. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-16 09:13:55 -07:00
Junio C Hamano	15ba44f1b4	Merge branch 'ps/fetch-output-format' "git fetch" learned the "--porcelain" option that emits what it did in a machine-parseable format. * ps/fetch-output-format: fetch: introduce machine-parseable "porcelain" output format fetch: move option related variables into main function fetch: lift up parsing of "fetch.output" config variable fetch: introduce `display_format` enum fetch: refactor calculation of the display table width fetch: print left-hand side when fetching HEAD:foo fetch: add a test to exercise invalid output formats fetch: split out tests for output format fetch: fix `--no-recurse-submodules` with multi-remote fetches	2023-05-15 13:59:07 -07:00
Junio C Hamano	5ca11547bb	Merge branch 'sl/diff-files-sparse' Teach "diff-files" not to expand sparse-index unless needed. * sl/diff-files-sparse: diff-files: integrate with sparse index t1092: add tests for `git diff-files`	2023-05-15 13:59:06 -07:00
Junio C Hamano	80754c5cc0	Merge branch 'ds/merge-tree-use-config' Allow git forges to disable replace-refs feature while running "git merge-tree". * ds/merge-tree-use-config: merge-tree: load default git config	2023-05-15 13:59:06 -07:00
Junio C Hamano	85cee30566	Merge branch 'ar/test-cleanup-unused-file-creation' Test fix. * ar/test-cleanup-unused-file-creation: test: rev-parse-upstream: add missing cmp	2023-05-15 13:59:06 -07:00
Junio C Hamano	5334592b1d	Merge branch 'jk/test-verbose-no-more' Retire "verbose" helper function from the test framework. * jk/test-verbose-no-more: t: drop "verbose" helper function t7001: use "ls-files --format" instead of "cut" t7001: avoid git on upstream of pipe	2023-05-15 13:59:05 -07:00
Junio C Hamano	f37da97723	Merge branch 'tl/push-branches-is-an-alias-for-all' "git push --all" gained an alias "git push --branches". * tl/push-branches-is-an-alias-for-all: t5583: fix shebang line push: introduce '--branches' option	2023-05-15 13:59:05 -07:00
Junio C Hamano	3fb8a0f0a2	Merge branch 'jc/t9800-fix-use-of-show-s-raw' A test fix. * jc/t9800-fix-use-of-show-s-raw: t9800: correct misuse of 'show -s --raw' in a test	2023-05-15 13:59:05 -07:00
Junio C Hamano	1e1dcb2a42	Merge branch 'jc/dirstat-plug-leaks' "git diff --dirstat" leaked memory, which has been plugged. * jc/dirstat-plug-leaks: diff: plug leaks in dirstat diff: refactor common tail part of dirstat computation	2023-05-15 13:59:05 -07:00
Junio C Hamano	cd2b740ca9	Merge branch 'ds/fsck-bitmap' "git fsck" learned to detect bit-flip breakages in the reachability bitmap files. * ds/fsck-bitmap: fsck: use local repository fsck: verify checksums of all .bitmap files	2023-05-15 13:59:04 -07:00
Junio C Hamano	2bb14fbf2f	Merge branch 'ar/config-count-tests-updates' Test updates. * ar/config-count-tests-updates: t1300: add tests for missing keys t1300: check stderr for "ignores pairs" tests t1300: drop duplicate test	2023-05-15 13:59:04 -07:00
Junio C Hamano	fa889347e3	Merge branch 'gc/trace-bare-repo-setup' The tracing mechanism learned to notice and report when auto-discovered bare repositories are being used, as allowing so without explicitly stating the user intends to do so (with setting GIT_DIR for example) can be used with social engineering as an attack vector. * gc/trace-bare-repo-setup: setup: trace bare repository setups	2023-05-15 13:59:03 -07:00
Junio C Hamano	64477d20d7	Merge branch 'mc/send-email-header-cmd' "git send-email" learned "--header-cmd=<cmd>" that can inject arbitrary e-mail header lines to the outgoing messages. * mc/send-email-header-cmd: send-email: detect empty blank lines in command output send-email: add --header-cmd, --no-header-cmd options send-email: extract execute_cmd from recipients_cmd	2023-05-15 13:59:03 -07:00
Junio C Hamano	d3f2e4ab13	Merge branch 'rj/branch-unborn-in-other-worktrees' Error messages given when working on an unborn branch that is checked out in another worktree have been improved. * rj/branch-unborn-in-other-worktrees: branch: avoid unnecessary worktrees traversals branch: rename orphan branches in any worktree branch: description for orphan branch errors branch: use get_worktrees() in copy_or_rename_branch() branch: test for failures while renaming branches	2023-05-15 13:59:03 -07:00
Johannes Schindelin	170eea9750	rebase -r: fix the total number shown in the progress For regular, non-`--rebase-merges` runs, there is very little work to do for the parser when determining the total number of commands in a rebase script: it is simply the number of lines after stripping the commented lines and then trimming the trailing empty line, if any. The `--rebase-merges` mode complicates things by introducing empty lines and comments in the middle of the script. These should _not_ be counted as commands, and indeed, when an interactive rebase is interrupted and subsequently resumed, the total number of commands can magically shrink, sometimes dramatically. The reason for this strange behavior is that empty lines _are_ counted in `edit_todo_list()` (but not the comments, as they are stripped via `strbuf_stripspace(..., 1)`, which is a bug. Let's fix this so that the correct total number is shown from the get-go, by carefully adjusting it according to what's in the rebase script. Extra care needs to be taken in case the user edits the script: the number of commands might be different after the user edited than beforehand. Note: Even though commented lines are skipped in `edit_todo_list()`, we still need to handle `TODO_COMMENT` items by decrementing the already-incremented `total_nr` again: empty lines are also marked as `TODO_COMMENT`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-14 21:44:50 -07:00
John Cai	4fe42f326e	pack-refs: teach pack-refs --include option Allow users to be more selective over which refs to pack by adding an --include option to git-pack-refs. The existing options allow some measure of selectivity. By default git-pack-refs packs all tags. --all can be used to include all refs, and the previous commit added the ability to exclude certain refs with --exclude. While these knobs give the user some selection over which refs to pack, it could be useful to give more control. For instance, a repository may have a set of branches that are rarely updated and would benefit from being packed. --include would allow the user to easily include a set of branches to be packed while leaving everything else unpacked. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-12 14:54:14 -07:00
John Cai	826ae79fca	pack-refs: teach --exclude option to exclude refs from being packed At GitLab, we have a system that creates ephemeral internal refs that don't live long before getting deleted. Having an option to exclude certain refs from a packed-refs file allows these internal references to be deleted much more efficiently. Add an --exclude option to the pack-refs builtin, and use the ref exclusions API to exclude certain refs from being packed into the final packed-refs file Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-12 14:54:14 -07:00
Elijah Newren	022fbb655d	t5583: fix shebang line The shebang was missing the leading `/` character, resulting in: $ ./t5583-push-branches.sh bash: ./t5583-push-branches.sh: cannot execute: required file not found Add the missing character so the test can run. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-12 10:02:18 -07:00
Derrick Stolee	b6551feadf	merge-tree: load default git config The 'git merge-tree' command handles creating root trees for merges without using the worktree. This is a critical operation in many Git hosts, as they typically store bare repositories. This builtin does not load the default Git config, which can have several important ramifications. In particular, one config that is loaded by default is core.useReplaceRefs. This is typically disabled in Git hosts due to the ability to spoof commits in strange ways. Since this config is not loaded specifically during merge-tree, users were previously able to use refs/replace/ references to make pull requests that looked valid but introduced malicious content. The resulting merge commit would have the correct commit history, but the malicious content would exist in the root tree of the merge. The fix is simple: load the default Git config in cmd_merge_tree(). This may also fix other behaviors that are effected by reading default config. The only possible downside is a little extra computation time spent reading config. The config parsing is placed after basic argument parsing so it does not slow down usage errors. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 12:20:44 -07:00
Patrick Steinhardt	dd781e3856	fetch: introduce machine-parseable "porcelain" output format The output of git-fetch(1) is obviously designed for consumption by users, only: we neatly columnize data, we abbreviate reference names, we print neat arrows and we don't provide information about actual object IDs that have changed. This makes the output format basically unusable in the context of scripted invocations of git-fetch(1) that want to learn about the exact changes that the command performs. Introduce a new machine-parseable "porcelain" output format that is supposed to fix this shortcoming. This output format is intended to provide information about every reference that is about to be updated, the old object ID that the reference has been pointing to and the new object ID it will be updated to. Furthermore, the output format provides the same flags as the human-readable format to indicate basic conditions for each reference update like whether it was a fast-forward update, a branch deletion, a rejected update or others. The output format is quite simple: ``` <flag> <old-object-id> <new-object-id> <local-reference>\n ``` We assume two conditions which are generally true: - The old and new object IDs have fixed known widths and cannot contain spaces. - References cannot contain newlines. With these assumptions, the output format becomes unambiguously parseable. Furthermore, given that this output is designed to be consumed by scripts, the machine-readable data is printed to stdout instead of stderr like the human-readable output is. This is mostly done so that other data printed to stderr, like error messages or progress meters, don't interfere with the parseable data. A notable ommission here is that the output format does not include the remote from which a reference was fetched, which might be important information especially in the context of multi-remote fetches. But as such a format would require us to print the remote for every single reference update due to parallelizable fetches it feels wasteful for the most likely usecase, which is when fetching from a single remote. In a similar spirit, a second restriction is that this cannot be used with `--recurse-submodules`. This is because any reference updates would be ambiguous without also printing the repository in which the update happens. Considering that both multi-remote and submodule fetches are user-facing features, using them in conjunction with `--porcelain` that is intended for scripting purposes is likely not going to be useful in the majority of cases. With that in mind these restrictions feel acceptable. If usecases for either of these come up in the future though it is easy enough to add a new "porcelain-v2" format that adds this information. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 10:35:25 -07:00
Patrick Steinhardt	1c31764dda	fetch: print left-hand side when fetching HEAD:foo `store_updated_refs()` parses the remote reference for two purposes: - It gets used as a note when writing FETCH_HEAD. - It is passed through to `display_ref_update()` to display updated references in the following format: ``` * branch master -> master ``` In most cases, the parsed remote reference is the prettified reference name and can thus be used for both cases. But if the remote reference is HEAD, the parsed remote reference becomes empty. This is intended when we write the FETCH_HEAD, where we skip writing the note in that case. But when displaying the updated references this leads to inconsistent output where the left-hand side of reference updates is missing in some cases: ``` $ git fetch origin HEAD HEAD:explicit-head :implicit-head main From https://github.com/git/git * branch HEAD -> FETCH_HEAD * [new ref] -> explicit-head * [new ref] -> implicit-head * branch main -> FETCH_HEAD ``` This behaviour has existed ever since the table-based output has been introduced for git-fetch(1) via `165f390250` (git-fetch: more terse fetch output, 2007-11-03) and was never explicitly documented either in the commit message or in any of our tests. So while it may not be a bug per se, it feels like a weird inconsistency and not like it was a concious design decision. The logic of how we compute the remote reference name that we ultimately pass to `display_ref_update()` is not easy to follow. There are three different cases here: - When the remote reference name is "HEAD" we set the remote reference name to the empty string. This is the case that causes the left-hand side to go missing, where we would indeed want to print "HEAD" instead of the empty string. This is what `prettify_refname()` would return. - When the remote reference name has a well-known prefix then we strip this prefix. This matches what `prettify_refname()` does. - Otherwise, we keep the fully qualified reference name. This also matches what `prettify_refname()` does. As the return value of `prettify_refname()` would do the correct thing for us in all three cases, we can thus fix the inconsistency by passing through the full remote reference name to `display_ref_update()`, which learns to call `prettify_refname()`. At the same time, this also simplifies the code a bit. Note that this patch also changes formatting of the block that computes the "kind" (which is the category like "branch" or "tag") and "what" (which is the prettified reference name like "master" or "v1.0") variables. This is done on purpose so that it is part of the diff, hopefully making the change easier to comprehend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 10:35:25 -07:00
Patrick Steinhardt	3daf6558ed	fetch: add a test to exercise invalid output formats Add a testcase that exercises the logic when an invalid output format is passed via the `fetch.output` configuration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 10:35:24 -07:00
Patrick Steinhardt	2c5691d6cf	fetch: split out tests for output format We're about to introduce a new porcelain mode for the output of git-fetch(1). As part of that we'll be introducing a set of new tests that only relate to the output of this command. Split out tests that exercise the output format of git-fetch(1) so that it becomes easier to verify this functionality as a standalone unit. As the tests assume that the default branch is called "main" we set up the corresponding GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME environment variable accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 10:35:24 -07:00
Patrick Steinhardt	5667141e3b	fetch: fix `--no-recurse-submodules` with multi-remote fetches When running `git fetch --no-recurse-submodules`, the exectation is that we don't fetch any submodules. And while this works for fetches of a single remote, it doesn't when fetching multiple remotes at once. The result is that we do recurse into submodules even though the user has explicitly asked us not to. This is because while we pass on `--recurse-submodules={yes,on-demand}` if specified by the user, we don't pass on `--no-recurse-submodules` to the subprocess spawned to perform the submodule fetch. Fix this by also forwarding this flag as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-10 10:35:24 -07:00
Junio C Hamano	2ca91d1ee0	Merge branch 'mh/credential-oauth-refresh-token' The credential subsystem learns to help OAuth framework. * mh/credential-oauth-refresh-token: credential: new attribute oauth_refresh_token	2023-05-10 10:23:29 -07:00
Junio C Hamano	7f3cc51b28	Merge branch 'ar/test-cleanup-unused-file-creation-part2' Test cleanup. * ar/test-cleanup-unused-file-creation-part2: t2019: don't create unused files t1502: don't create unused files t1450: don't create unused files t1300: don't create unused files t1300: fix config file syntax error descriptions t0300: don't create unused file	2023-05-10 10:23:28 -07:00
Junio C Hamano	b6e9521956	Merge branch 'ms/send-email-feed-header-to-validate-hook' "git send-email" learned to give the e-mail headers to the validate hook by passing an extra argument from the command line. * ms/send-email-feed-header-to-validate-hook: send-email: expose header information to git-send-email's sendemail-validate hook send-email: refactor header generation functions	2023-05-10 10:23:28 -07:00
Junio C Hamano	fbbf60a9bc	Merge branch 'tb/credential-long-lines' The implementation of credential helpers used fgets() over fixed size buffers to read protocol messages, causing the remainder of the folded long line to trigger unexpected behaviour, which has been corrected. * tb/credential-long-lines: contrib/credential: embiggen fixed-size buffer in wincred contrib/credential: avoid fixed-size buffer in libsecret contrib/credential: .gitignore libsecret build artifacts contrib/credential: remove 'gnome-keyring' credential helper contrib/credential: avoid fixed-size buffer in osxkeychain t/lib-credential.sh: ensure credential helpers handle long headers credential.c: store "wwwauth[]" values in `credential_read()`	2023-05-10 10:23:27 -07:00
Junio C Hamano	6710b68db1	Merge branch 'rs/test-ctype-eof' ctype tests have been taught to test EOF, too. * rs/test-ctype-eof: test-ctype: check EOF	2023-05-10 10:23:27 -07:00
Junio C Hamano	0004d97099	Merge branch 'ob/t3501-retitle' Retitle a test script with an overly narrow name. * ob/t3501-retitle: t/t3501-revert-cherry-pick.sh: clarify scope of the file	2023-05-09 16:45:46 -07:00
Junio C Hamano	ccd12a3d6c	Merge branch 'en/header-split-cache-h-part-2' More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...	2023-05-09 16:45:46 -07:00
Junio C Hamano	620e92b845	Merge branch 'jk/parse-commit-with-malformed-ident' The commit object parser has been taught to be a bit more lenient to parse timestamps on the author/committer line with a malformed author/committer ident. * jk/parse-commit-with-malformed-ident: parse_commit(): describe more date-parsing failure modes parse_commit(): handle broken whitespace-only timestamp parse_commit(): parse timestamp from end of line t4212: avoid putting git on left-hand side of pipe	2023-05-09 16:45:45 -07:00
Shuqi Liang	8c30be9176	diff-files: integrate with sparse index Remove full index requirement for `git diff-files`. Refactor the ensure_expanded and ensure_not_expanded functions by introducing a common helper function, ensure_index_state. Add test to ensure the index is no expanded in `git diff-files`. The `p2000` tests demonstrate a ~96% execution time reduction for 'git diff-files' and a ~97% execution time reduction for 'git diff-files' for a file using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-files (full-v3) 0.09 0.08 -11.1% 2000.95: git diff-files (full-v4) 0.09 0.09 +0.0% 2000.96: git diff-files (sparse-v3) 0.52 0.02 -96.2% 2000.97: git diff-files (sparse-v4) 0.51 0.02 -96.1% 2000.98: git diff-files -- f2/f4/a (full-v3) 0.06 0.07 +16.7% 2000.99: git diff-files -- f2/f4/a (full-v4) 0.08 0.08 +0.0% 2000.100: git diff-files -- f2/f4/a (sparse-v3) 0.46 0.01 -97.8% 2000.101: git diff-files -- f2/f4/a (sparse-v4) 0.51 0.02 -96.1% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-09 14:26:36 -07:00
Shuqi Liang	0aba1a989c	t1092: add tests for `git diff-files` Before integrating the 'git diff-files' builtin with the sparse index feature, add tests to t1092-sparse-checkout-compatibility.sh to ensure it currently works with sparse-checkout and will still work with sparse index after that integration. When adding tests against a sparse-checkout definition, we test two modes: all changes are within the sparse-checkout cone and some changes are outside the sparse-checkout cone. In order to have staged changes outside of the sparse-checkout cone, make a directory called 'folder1' and copy `a` into 'folder1/a'. 'folder1/a' is identical to `a` in the base commit. These make 'folder1/a' in the index, while leaving it outside of the sparse-checkout definition. Change content inside 'folder1/a' in order to test 'folder1/a' being present on-disk with modifications. Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-09 14:26:34 -07:00
Felipe Contreras	159f4b9c3b	test: rev-parse-upstream: add missing cmp It seems pretty clear `5236fce6b4` (t1507: stop losing return codes of git commands, 2019-12-20) missed a test_cmp. Cc: Denton Liu <liu.denton@gmail.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-09 09:25:53 -07:00
Jeff King	8ddfce7144	t: drop "verbose" helper function We have a small helper function called "verbose", with the idea that you can write: verbose foo to get a message to stderr when the "foo" command fails, even if it does not produce any output itself. This goes back to `8ad1652418` (t5304: use helper to report failure of "test foo = bar", 2014-10-10). It does work, but overall it has not been a big success for two reasons: 1. Test writers have to remember to put it there (and the resulting test code is longer as a result). 2. It doesn't handle the opposite case (we expect "foo" to fail, but it succeeds), leading to inconsistencies in tests (which you can see in many hunks of this patch, e.g. ones involving "has_cr"). Most importantly, we added `a136f6d8ff` (test-lib.sh: support -x option for shell-tracing, 2014-10-10) at the same time, and it does roughly the same thing. The output is not quite as succinct as "verbose", and you have to watch out for stray shell-traces ending up in stderr. But it solves both of the problems above, and has clearly become the preferred tool. Let's consider the "verbose" function a failed experiment and remove the last few callers (which are all many years old, and have been dwindling as we remove them from scripts we touch for other reasons). It will be one less thing for new test writers to see and wonder if they should be using themselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-08 14:50:28 -07:00
Jeff King	a9ea5296b7	t7001: use "ls-files --format" instead of "cut" Since ls-files recently learned a "--format" option, we can use that rather than asking for all of "--stage" and then pulling out the bits we want with "cut". That's simpler and avoids two extra processes (one for cut, and one for the subshell to hold the intermediate result). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-08 14:50:28 -07:00
Jeff King	b1c8ac3996	t7001: avoid git on upstream of pipe We generally avoid git on the left-hand side of a pipe, because it loses the exit code of the command (and thus we'd miss things like segfaults or unexpected failures). In the cases in t7001, we wouldn't expect failures (they are just inspecting the repository state, and are not the main point of the test), but it doesn't hurt to be careful. In all but one case here we're piping "ls-files --stage" to cut off the pathname (since we compare entries before and after moving). Let's pull that into a helper function to avoid repeating the slightly awkward replacement. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-08 14:50:28 -07:00
Shuqi Liang	6e210175c7	t1092: update a write-tree test * 'on all' in the title of the test 'write-tree on all' was unclear; remove it. * Add a baseline 'test_all_match git write-tree' before making any changes to the index, providing a reference point for the 'write-tree' prior to any modifications. * Add a comparison of the output of 'git status --porcelain=v2' to test the working tree after 'write-tree' exits. * Ensure SKIP_WORKTREE files weren't materialized on disk by using "test_path_is_missing". Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-08 14:41:30 -07:00
Taylor Blau	b0afdce5da	pack-bitmap.c: use commit boundary during bitmap traversal When reachability bitmap coverage exists in a repository, Git will use a different (and hopefully faster) traversal to compute revision walks. Consider a set of positive and negative tips (which we'll refer to with their standard bitmap parlance by "wants", and "haves"). In order to figure out what objects exist between the tips, the existing traversal in `prepare_bitmap_walk()` does something like: 1. Consider if we can even compute the set of objects with bitmaps, and fall back to the usual traversal if we cannot. For example, pathspec limiting traversals can't be computed using bitmaps (since they don't know which objects are at which paths). The same is true of certain kinds of non-trivial object filters. 2. If we can compute the traversal with bitmaps, partition the (dereferenced) tips into two object lists, "haves", and "wants", based on whether or not the objects have the UNINTERESTING flag, respectively. 3. Fall back to the ordinary object traversal if either (a) there are more than zero haves, none of which are in the bitmapped pack or MIDX, or (b) there are no wants. 4. Construct a reachability bitmap for the "haves" side by walking from the revision tips down to any existing bitmaps, OR-ing in any bitmaps as they are found. 5. Then do the same for the "wants" side, stopping at any objects that appear in the "haves" bitmap. 6. Filter the results if any object filter (that can be easily computed with bitmaps alone) was given, and then return back to the caller. When there is good bitmap coverage relative to the traversal tips, this walk is often significantly faster than an ordinary object traversal because it can visit far fewer objects. But in certain cases, it can be significantly slower than the usual object traversal. Why? Because we need to compute complete bitmaps on either side of the walk. If either one (or both) of the sides require walking many (or all!) objects before they get to an existing bitmap, the extra bitmap machinery is mostly or all overhead. One of the benefits, however, is that even if the walk is slower, bitmap traversals are guaranteed to provide an exact answer. Unlike the traditional object traversal algorithm, which can over-count the results by not opening trees for older commits, the bitmap walk builds an exact reachability bitmap for either side, meaning the results are never over-counted. But producing non-exact results is OK for our traversal here (both in the bitmap case and not), as long as the results are over-counted, not under. Relaxing the bitmap traversal to allow it to produce over-counted results gives us the opportunity to make some significant improvements. Instead of the above, the new algorithm only has to walk from the boundary down to the nearest bitmap, instead of from each of the UNINTERESTING tips. The boundary-based approach still has degenerate cases, but we'll show in a moment that it is often a significant improvement. The new algorithm works as follows: 1. Build a (partial) bitmap of the haves side by first OR-ing any bitmap(s) that already exist for UNINTERESTING commits between the haves and the boundary. 2. For each commit along the boundary, add it as a fill-in traversal tip (where the traversal terminates once an existing bitmap is found), and perform fill-in traversal. 3. Build up a complete bitmap of the wants side as usual, stopping any time we intersect the (partial) haves side. 4. Return the results. And is more-or-less equivalent to using the old algorithm with this invocation: $ git rev-list --objects --use-bitmap-index $WANTS --not \ $(git rev-list --objects --boundary $WANTS --not $HAVES \| perl -lne 'print $1 if /^-(.)/') The new result performs significantly better in many cases, particularly when the distance from the boundary commit(s) to an existing bitmap is shorter than the distance from (all of) the have tips to the nearest bitmapped commit. Note that when using the old bitmap traversal algorithm, the results can be slower* than without bitmaps! Under the new algorithm, the result is computed faster with bitmaps than without (at the cost of over-counting the true number of objects in a similar fashion as the non-bitmap traversal): # (Computing the number of tagged objects not on any branches # without bitmaps). $ time git rev-list --count --objects --tags --not --branches 20 real 0m1.388s user 0m1.092s sys 0m0.296s # (Computing the same query using the old bitmap traversal). $ time git rev-list --count --objects --tags --not --branches --use-bitmap-index 19 real 0m22.709s user 0m21.628s sys 0m1.076s # (this commit) $ time git.compile rev-list --count --objects --tags --not --branches --use-bitmap-index 19 real 0m1.518s user 0m1.234s sys 0m0.284s The new algorithm is still slower than not using bitmaps at all, but it is nearly a 15-fold improvement over the existing traversal. In a more realistic setting (using my local copy of git.git), I can observe a similar (if more modest) speed-up: $ argv="--count --objects --branches --not --tags" hyperfine \ -n 'no bitmaps' "git.compile rev-list $argv" \ -n 'existing traversal' "git.compile rev-list --use-bitmap-index $argv" \ -n 'boundary traversal' "git.compile -c pack.useBitmapBoundaryTraversal=true rev-list --use-bitmap-index $argv" Benchmark 1: no bitmaps Time (mean ± σ): 124.6 ms ± 2.1 ms [User: 103.7 ms, System: 20.8 ms] Range (min … max): 122.6 ms … 133.1 ms 22 runs Benchmark 2: existing traversal Time (mean ± σ): 368.6 ms ± 3.0 ms [User: 325.3 ms, System: 43.1 ms] Range (min … max): 365.1 ms … 374.8 ms 10 runs Benchmark 3: boundary traversal Time (mean ± σ): 167.6 ms ± 0.9 ms [User: 139.5 ms, System: 27.9 ms] Range (min … max): 166.1 ms … 169.2 ms 17 runs Summary 'no bitmaps' ran 1.34 ± 0.02 times faster than 'boundary traversal' 2.96 ± 0.05 times faster than 'existing traversal' Here, the new algorithm is also still slower than not using bitmaps, but represents a more than 2-fold improvement over the existing traversal in a more modest example. Since this algorithm was originally written (nearly a year and a half ago, at the time of writing), the bitmap lookup table shipped, making the new algorithm's result more competitive. A few other future directions for improving bitmap traversal times beyond not using bitmaps at all: - Decrease the cost to decompress and OR together many bitmaps together (particularly when enumerating the uninteresting side of the walk). Here we could explore more efficient bitmap storage techniques, like Roaring+Run and/or use SIMD instructions to speed up ORing them together. - Store pseudo-merge bitmaps, which could allow us to OR together fewer "summary" bitmaps (which would also help with the above). Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-08 12:05:55 -07:00
Teng Long	425b4d7f47	push: introduce '--branches' option The '--all' option of git-push built-in cmd support to push all branches (refs under refs/heads) to remote. Under the usage, a user can easlily work in some scenarios, for example, branches synchronization and batch upload. The '--all' was introduced for a long time, meanwhile, git supports to customize the storage location under "refs/". when a new git user see the usage like, 'git push origin --all', we might feel like we're pushing _all_ the refs instead of just branches without looking at the documents until we found the related description of it or '--mirror'. To ensure compatibility, we cannot rename '--all' to another name directly, one way is, we can try to add a new option '--heads' which be identical with the functionality of '--all' to let the user understand the meaning of representation more clearly. Actually, We've more or less named options this way already, for example, in 'git-show-ref' and 'git ls-remote'. At the same time, we fix a related issue about the wrong help information of '--all' option in code and add some test cases in t5523, t5543 and t5583. Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-06 14:36:43 -07:00
John Cai	44451a2e5e	attr: teach "--attr-source=<tree>" global option to "git" Earlier, `47cfc9bd` (attr: add flag `--source` to work with tree-ish, 2023-01-14) taught "git check-attr" the "--source=<tree>" option to allow it to read attribute files from a tree-ish, but did so only for the command. Just like "check-attr" users wanted a way to use attributes from a tree-ish and not from the working tree files, users of other commands (like "git diff") would benefit from the same. Undo most of the UI change the commit made, while keeping the internal logic to read attributes from a given tree-ish. Expose the internal logic via a new "--attr-source=<tree>" command line option given to "git", so that it can be used with any git command that runs as part of the main git process. Additionally, add an environment variable GIT_ATTR_SOURCE that is set when --attr-source is passed in, so that subprocesses use the same value for the attributes source tree. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-06 14:34:09 -07:00
Junio C Hamano	b7cf25c8f4	t9800: correct misuse of 'show -s --raw' in a test There is $(git show -s --raw --pretty=format:%at HEAD) in this test that is meant to grab the author time of the commit. We used to have a bug in the command line option parser of the diff family of commands, where "show -s --raw" was identical to "show -s". With the "-s" bug fixed, "show -s --raw" would mean the same thing as "show --raw", i.e. show the output from the diff machinery in the "raw" format. And this test will start failing, so fix it before that happens. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-06 14:30:51 -07:00
Junio C Hamano	9d484b92ed	diff: fix interaction between the "-s" option and other options Sergey Organov noticed and reported "--patch --no-patch --raw" behaves differently from just "--raw". It turns out that there are a few interesting bugs in the implementation and documentation. * First, the documentation for "--no-patch" was unclear that it could be read to mean "--no-patch" countermands an earlier "--patch" but not other things. The intention of "--no-patch" ever since it was introduced at `d09cd15d` (diff: allow --no-patch as synonym for -s, 2013-07-16) was to serve as a synonym for "-s", so "--raw --patch --no-patch" should have produced no output, but it can be (mis)read to allow showing only "--raw" output. * Then the interaction between "-s" and other format options were poorly implemented. Modern versions of Git uses one bit each to represent formatting options like "--patch", "--stat" in a single output_format word, but for historical reasons, "-s" also is represented as another bit in the same word. This allows two interesting bugs to happen, and we have both X-<. (1) After setting a format bit, then setting NO_OUTPUT with "-s", the code to process another "--<format>" option drops the NO_OUTPUT bit to allow output to be shown again. However, the code to handle "-s" only set NO_OUTPUT without unsetting format bits set earlier, so the earlier format bit got revealed upon seeing the second "--<format>" option. This is the problem Sergey observed. (2) After setting NO_OUTPUT with "-s", code to process "--<format>" option can forget to unset NO_OUTPUT, leaving the command still silent. It is tempting to change the meaning of "--no-patch" to mean "disable only the patch format output" and reimplement "-s" as "not showing anything", but it would be an end-user visible change in behavior. Let's fix the interactions of these bits to first make "-s" work as intended. The fix is conceptually very simple. * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo" option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is given), we make sure we drop DIFF_FORMAT_NO_OUTPUT. We forgot to do so in some of the options and caused (2) above. * When processing "-s" option, we should not just set DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits. We didn't do so and retained format bits set by options previously seen, causing (1) above. It is even more tempting to lose NO_OUTPUT bit and instead take output_format word being 0 as its replacement, but that would break the mechanism "git show" uses to default to "--patch" output, where the distinction between telling the command to be silent with "-s" and having no output format specified on the command line matters, and an explicit output format given on the command line should not be "combined" with the default "--patch" format. So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we may want to replace it with OPTION_GIVEN bit, and * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and DIFF_FORMAT_OPTION_GIVEN bit on for each format. "--no-raw", etc. will set off DIFF_FORMAT_$format bit but still record the fact that we saw an option from the command line by setting DIFF_FORMAT_OPTION_GIVEN bit. * make "-s" (and its synonym "--no-patch") clear all other bits and set only the DIFF_FORMAT_OPTION_GIVEN bit on. which I suspect would make the code much cleaner without breaking any end-user expectations. Once that is in place, transitioning "--no-patch" to mean the counterpart of "--patch", just like "--no-raw" only defeats an earlier "--raw", would be quite simple at the code level. The social cost of migrating the end-user expectations might be too great for it to be worth, but at least the "GIVEN" bit clean-up alone may be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-05 14:24:32 -07:00
Junio C Hamano	83973981eb	diff: plug leaks in dirstat The array of dirstat_file contained in the dirstat_dir structure is not freed after the processing ends. Unfortunately, the member that points at the array, .files, is incremented as the gather_dirstat() function recursively walks it, and this needs to be plugged by remembering the beginning of the array before gather_dirstat() mucks with it and freeing it after we are done. We can mark t4047 as leak-free. t4000, which is marked as leak-free, now can exercise dirstat in it, which will happen next. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-05 14:24:09 -07:00
Andrei Rybak	a5855fd8d4	t2019: don't create unused files Tests in t2019-checkout-ambiguous-ref.sh redirect two invocations of "git checkout" to files "stdout" and "stderr". Several assertions are made using file "stderr". File "stdout", however, is unused. Don't redirect standard output of "git checkout" to file "stdout" in t2019-checkout-ambiguous-ref.sh to avoid creating unnecessary files. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:53:10 -07:00
Andrei Rybak	dca675c6ef	t1502: don't create unused files Three tests in file t1502-rev-parse-parseopt.sh use three redirections with invocation of "git rev-parse --parseopt --". All three tests redirect standard output to file "out" and file "spec" to standard input. Two of the tests redirect standard output a second time to file "actual", and the third test redirects standard error to file "err". These tests check contents of files "actual" and "err", but don't use the files named "out" for assertions. The two tests that redirect to standard output twice might also be confusing to the reader. Don't redirect standard output of "git rev-parse" to file "out" in t1502-rev-parse-parseopt.sh to avoid creating unnecessary files. Acked-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:53:06 -07:00
Andrei Rybak	59162ece57	t1450: don't create unused files Test 'fsck error and recovery on invalid object type' in file t1450-fsck.sh redirects output of a failing "git fsck" invocation to files "out" and "err" to assert presence of error messages in the output of the command. Commit `31deb28f5e` (fsck: don't hard die on invalid object types, 2021-10-01) changed the way assertions in this test are performed. The test doesn't compare the whole standard error with prepared file "err.expect" and it doesn't assert that standard output is empty. Don't create unused files "err.expect" and "out" in test 'fsck error and recovery on invalid object type'. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:53:03 -07:00
Andrei Rybak	a7cae2905b	t1300: don't create unused files Three tests in t1300-config.sh check that "git config --get" barfs when syntax errors are present in the config file. The tests redirect standard output and standard error of "git config --get" to files, "actual" and "error" correspondingly. They assert presence of an error message in file "error". However, these tests don't use file "actual" for assertions. Don't redirect standard output of "git config --get" to file "actual" in t1300-config.sh to avoid creating unnecessary files. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:52:48 -07:00
Andrei Rybak	6fc68e7ca3	t1300: fix config file syntax error descriptions Three tests in t1300-config.sh check that "git config --get" barfs when the config file contains various syntax errors: key=value pair without equals sign, broken section line, and broken value string. The sample config files include a comment describing the kind of broken syntax. This description seems to have been copy-pasted from the "broken section line" sample to the other two samples. Fix descriptions of broken config file syntax in samples used in t1300-config.sh. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:52:45 -07:00
Andrei Rybak	ed5288cff2	t0300: don't create unused file Test 'credential config with partial URLs' in t0300-credentials.sh contains three "git credential fill" invocations. For two of the invocations, the test asserts presence or absence of string "yep" in the standard output. For the third test it checks for an error message in standard error. Don't redirect standard output of "git credential" to file "stdout" in t0300-credentials.sh to avoid creating an unnecessary file when only standard error is checked. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-03 08:52:17 -07:00
Junio C Hamano	d699e27bd4	Merge branch 'tb/ban-strtok' Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`	2023-05-02 10:13:35 -07:00
Junio C Hamano	cf85f4b3bd	Merge branch 'jk/blame-fake-commit-label' The output given by "git blame" that attributes a line to contents taken from the file specified by the "--contents" option shows it differently from a line attributed to the working tree file. * jk/blame-fake-commit-label: blame: use different author name for fake commit generated by --contents	2023-05-02 10:13:35 -07:00
René Scharfe	31885f64e9	test-ctype: check EOF The character classifiers are supposed to allow passing EOF to them, a negative value. It isn't part of any character class. Extend the tests to cover that. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-02 09:25:54 -07:00
Derrick Stolee	756f1bcd29	fsck: verify checksums of all .bitmap files If a filesystem-level corruption occurs in a .bitmap file, Git can react poorly. This could take the form of a run-time error due to failing to parse an EWAH bitmap or be more subtle such as returning the wrong set of objects to a fetch or clone. A natural first response to either of these kinds of errors is to run 'git fsck' to see if any files are corrupt. This currently ignores all .bitmap files. Add checks to 'git fsck' for all .bitmap files that are currently associated with a multi-pack-index or pack file. Verify their checksums using the hashfile API. We iterate through all multi-pack-indexes and pack-files to be sure to check all .bitmap files, not just the one that would be read by the process. For example, a multi-pack-index bitmap overrules a pack-bitmap. However, if the multi-pack-index is removed, the pack-bitmap may be selected instead. Be thorough to include every file that could become active in such a way. This includes checking files in alternates. There is potential that we could extend this effort to check the structure of the reachability bitmaps themselves, but it is very expensive to do so. At minimum, it's as expensive as generating the bitmaps in the first place, and that's assuming that we don't use the trivial algorithm of verifying each bitmap individually. The trivial algorithm will result in quadratic behavior (number of objects times number of bitmapped commits) while the bitmap building operation constructs a lattice of commits to build bitmaps incrementally and then generate the final bitmaps from a subset of those commits. If we were to extend 'git fsck' to check .bitmap file contents more closely like this, then we would likely want to hide it behind an option that signals the user is more willing to do expensive operations such as this. For testing, set up a repository with a pack-bitmap _and_ a multi-pack-index bitmap. This requires some file movement to avoid deleting the pack-bitmap during the repack that creates the multi-pack-index bitmap. We can then verify that 'git fsck' is checking all files, not just the "active" bitmap. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-02 08:48:22 -07:00
Glen Choo	e35f202b45	setup: trace bare repository setups safe.bareRepository=explicit is a safer default mode of operation, since it guards against the embedded bare repository attack [1]. Most end users don't use bare repositories directly, so they should be able to set safe.bareRepository=explicit, with the expectation that they can reenable bare repositories by specifying GIT_DIR or --git-dir. However, the user might use a tool that invokes Git on bare repositories without setting GIT_DIR (e.g. "go mod" will clone bare repositories [2]), so even if a user wanted to use safe.bareRepository=explicit, it wouldn't be feasible until their tools learned to set GIT_DIR. To make this transition easier, add a trace message to note when we attempt to set up a bare repository without setting GIT_DIR. This allows users and tool developers to audit which of their tools are problematic and report/fix the issue. When they are sufficiently confident, they would switch over to "safe.bareRepository=explicit". Note that this uses trace2_data_string(), which isn't supported by the "normal" GIT_TRACE2 target, only _EVENT or _PERF. [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/ [2] https://go.dev/ref/mod Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-01 11:20:33 -07:00
Taylor Blau	71201ab0e5	t/lib-credential.sh: ensure credential helpers handle long headers Add a test ensuring that the "wwwauth[]" field cannot be used to inject malicious data into the credential helper stream. Many of the credential helpers in contrib/credential read the newline-delimited protocol stream one line at a time by repeatedly calling fgets() into a fixed-size buffer. This assumes that each line is no more than 1024 characters long, since each iteration of the loop assumes that it is parsing starting at the beginning of a new line in the stream. However, similar to `a5bb10fd5e` (config: avoid fixed-sized buffer when renaming/deleting a section, 2023-04-06), if a line is longer than 1024 characters, a malicious actor can embed another command within an existing line, bypassing the usual checks introduced in `9a6bbee800` (credential: avoid writing values with newlines, 2020-03-11). As with the problem fixed in that commit, specially crafted input can cause the helper to return the credential for the wrong host, letting an attacker trick the victim into sending credentials for one host to another. Luckily, all parts of the credential helper protocol that are available in a tagged release of Git are immune to this attack: - "protocol" is restricted to known values, and is thus immune. - "host" is immune because curl will reject hostnames that have a '=' character in them, which would be required to carry out this attack. - "username" is immune, because the buffer characters to fill out the first `fgets()` call would pollute the `username` field, causing the credential helper to return nothing (because it would match a username if present, and the username of the credential to be stolen is likely not 1024 characters). - "password" is immune because providing a password instructs credential helpers to avoid filling credentials in the first place. - "path" is similar to username; if present, it is not likely to match any credential the victim is storing. It's also not enabled by default; the victim would have to set credential.useHTTPPath explicitly. However, the new "wwwauth[]" field introduced via `5f2117b24f` (credential: add WWW-Authenticate header to cred requests, 2023-02-27) can be used to inject data into the credential helper stream. For example, running: { printf 'HTTP/1.1 401\r\n' printf 'WWW-Authenticate: basic realm=' perl -e 'print "a" x 1024' printf 'host=victim.com\r\n' } \| nc -Nlp 8080 in one terminal, and then: git clone http://localhost:8080 in another would result in a line like: wwwauth[]=basic realm=aaa[...]aaahost=victim.com being sent to the credential helper. If we tweak that "1024" to align our output with the helper's buffer size and the rest of the data on the line, it can cause the helper to see "host=victim.com" on its own line, allowing motivated attackers to exfiltrate credentials belonging to "victim.com". The below test demonstrates these failures and provides us with a test to ensure that our fix is correct. That said, it has a couple of shortcomings: - it's in t0303, since that's the only mechanism we have for testing random helpers. But that means nobody is going to run it under normal circumstances. - to get the attack right, it has to line up the stuffed name with the buffer size, so we depend on the exact buffer size. I parameterized it so it could be used to test other helpers, but in practice it's not likely for anybody to do that. Still, it's the best we can do, and will help us confirm the presence of the problem (and our fixes) in the new few patches. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-01 09:27:01 -07:00
Maxim Cournoyer	3a7a18a045	send-email: detect empty blank lines in command output The email format does not allow blank lines in headers; detect such input and report it as malformed and add a test for it. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-01 08:55:52 -07:00
Maxim Cournoyer	ba92106e93	send-email: add --header-cmd, --no-header-cmd options Sometimes, adding a header different than CC or TO is desirable; for example, when using Debbugs, it is best to use 'X-Debbugs-Cc' headers to keep people in CC; this is an example use case enabled by the new '--header-cmd' option. The header unfolding logic is extracted to a subroutine so that it can be reused; a test is added for coverage. Signed-off-by: Maxim Cournoyer <maxim.cournoyer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-01 08:55:52 -07:00
Oswald Buddenhagen	8bb19c14fb	t/t3501-revert-cherry-pick.sh: clarify scope of the file The file started out as a test for picks and reverts with renames, but has been subsequently populated with all kinds of basic tests, in accordance with its generic name. Adjust the description to reflect that. Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-01 08:24:58 -07:00
Junio C Hamano	fc23c397c7	Merge branch 'tb/enable-cruft-packs-by-default' When "gc" needs to retain unreachable objects, packing them into cruft packs (instead of exploding them into loose object files) has been offered as a more efficient option for some time. Now the use of cruft packs has been made the default and no longer considered an experimental feature. * tb/enable-cruft-packs-by-default: repository.h: drop unused `gc_cruft_packs` builtin/gc.c: make `gc.cruftPacks` enabled by default t/t9300-fast-import.sh: prepare for `gc --cruft` by default t/t6500-gc.sh: add additional test cases t/t6500-gc.sh: refactor cruft pack tests t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default t/t5304-prune.sh: prepare for `gc --cruft` by default builtin/gc.c: ignore cruft packs with `--keep-largest-pack` builtin/repack.c: fix incorrect reference to '-C' pack-write.c: plug a leak in stage_tmp_packfiles()	2023-04-28 16:03:03 -07:00
Junio C Hamano	aabc69885e	Merge branch 'jk/gpg-trust-level-fix' The "%GT" placeholder for the "--format" option of "git log" and friends caused BUG() to trigger on a commit signed with an unknown key, which has been corrected. * jk/gpg-trust-level-fix: gpg-interface: set trust level of missing key to "undefined"	2023-04-28 16:03:03 -07:00
Junio C Hamano	a02675ad90	Merge branch 'ds/fsck-pack-revindex' "git fsck" learned to validate the on-disk pack reverse index files. * ds/fsck-pack-revindex: fsck: validate .rev file header fsck: check rev-index position values fsck: check rev-index checksums fsck: create scaffolding for rev-index checks	2023-04-27 16:00:59 -07:00
Junio C Hamano	849c8b3dbf	Merge branch 'tb/pack-revindex-on-disk' The on-disk reverse index that allows mapping from the pack offset to the object name for the object stored at the offset has been enabled by default. * tb/pack-revindex-on-disk: t: invert `GIT_TEST_WRITE_REV_INDEX` config: enable `pack.writeReverseIndex` by default pack-revindex: introduce `pack.readReverseIndex` pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK pack-revindex: make `load_pack_revindex` take a repository t5325: mark as leak-free pack-write.c: plug a leak in stage_tmp_packfiles()	2023-04-27 16:00:59 -07:00
Jeff King	089d9adff6	parse_commit(): handle broken whitespace-only timestamp The comment in parse_commit_date() claims that parse_timestamp() will not walk past the end of the buffer we've been given, since it will hit the newline at "eol" and stop. This is usually true, when dateptr contains actual numbers to parse. But with a line like: committer name <email> \n with just whitespace, and no numbers, parse_timestamp() will consume that newline as part of the leading whitespace, and we may walk past our "tail" pointer (which itself is set from the "size" parameter passed in to parse_commit_buffer()). In practice this can't cause us to walk off the end of an array, because we always add an extra NUL byte to the end of objects we load from disk (as a defense against exactly this kind of bug). However, you can see the behavior in action when "committer" is the final header (which it usually is, unless there's an encoding) and the subject line can be parsed as an integer. We walk right past the newline on the committer line, as well as the "\n\n" separator, and mistake the subject for the timestamp. We can solve this by trimming the whitespace ourselves, making sure that it has some non-whitespace to parse. Note that we need to be a bit careful about the definition of "whitespace" here, as our isspace() doesn't match exotic characters like vertical tab or formfeed. We can work around that by checking for an actual number (see the in-code comment). This is slightly more restrictive than the current code, but in practice the results are either the same (we reject "foo" as "0", but so would parse_timestamp()) or extremely unlikely even for broken commits (parse_timestamp() would allow "\v123" as "123", but we'll now make it "0"). I did also allow "-" here, which may be controversial, as we don't currently support negative timestamps. My reasoning was two-fold. One, the design of parse_timestamp() is such that we should be able to easily switch it to handling signed values, and this otherwise creates a hard-to-find gotcha that anybody doing that work would get tripped up on. And two, the status quo is that we currently parse them, though the result of course ends up as a very large unsigned value (which is likely to just get clamped to "0" for display anyway, since our date routines can't handle it). The new test checks the commit parser (via "--until") for both vanilla spaces and the vertical-tab case. I also added a test to check these against the pretty-print formatter, which uses split_ident_line(). It's not subject to the same bug, because it already insists that there be one or more digits in the timestamp. Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-27 08:53:53 -07:00
Jeff King	ea1615dfdd	parse_commit(): parse timestamp from end of line To find the committer timestamp, we parse left-to-right looking for the closing ">" of the email, and then expect the timestamp right after that. But we've seen some broken cases in the wild where this fails, but we _could_ find the timestamp with a little extra work. E.g.: Name <Name<email>> 123456789 -0500 This means that features that rely on the committer timestamp, like --since or --until, will treat the commit as happening at time 0 (i.e., 1970). This is doubly confusing because the pretty-print parser learned to handle these in `03818a4a94` (split_ident: parse timestamp from end of line, 2013-10-14). So printing them via "git show", etc, makes everything look normal, but --until, etc are still broken (despite the fact that that commit explicitly mentioned --until!). So let's use the same trick as `03818a4a94`: find the end of the line, and parse back to the final ">". In theory we could use split_ident_line() here, but it's actually a bit more strict. In particular, it requires a valid time-zone token, too. That should be present, of course, but we wouldn't want to break --until for cases that are working currently. We might want to teach split_ident_line() to become more lenient there, but it would require checking its many callers (since right now they can assume that if date_start is non-NULL, so is tz_start). So for now we'll just reimplement the same trick in the commit parser. The test is in t4212, which already covers similar cases, courtesy of `03818a4a94`. We'll just adjust the broken commit to munge both the author and committer timestamps. Note that we could match (author\|committer) here, but alternation can't be used portably in sed. Since we wouldn't expect to see ">" except as part of an ident line, we can just match that character on any line. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-27 08:53:35 -07:00
Jeff King	2063b86b81	t4212: avoid putting git on left-hand side of pipe We wouldn't expect cat-file to fail here, but it's good practice to avoid putting git on the upstream of a pipe, as we otherwise ignore its exit code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-27 08:53:32 -07:00
Junio C Hamano	36628c56ed	Merge branch 'ps/fix-geom-repack-with-alternates' Geometric repacking ("git repack --geometric=<n>") in a repository that borrows from an alternate object database had various corner case bugs, which have been corrected. * ps/fix-geom-repack-with-alternates: repack: disable writing bitmaps when doing a local repack repack: honor `-l` when calculating pack geometry t/helper: allow chmtime to print verbosely without modifying mtime pack-objects: extend test coverage of `--stdin-packs` with alternates pack-objects: fix error when same packfile is included and excluded pack-objects: fix error when packing same pack twice pack-objects: split out `--stdin-packs` tests into separate file repack: fix generating multi-pack-index with only non-local packs repack: fix trying to use preferred pack in alternates midx: fix segfault with no packs and invalid preferred pack	2023-04-25 13:56:20 -07:00
Junio C Hamano	c4c9d5586f	Merge branch 'rj/send-email-validate-hook-count-messages' The sendemail-validate validate hook learned to pass the total number of input files and where in the sequence each invocation is via environment variables. * rj/send-email-validate-hook-count-messages: send-email: export patch counters in validate environment	2023-04-25 13:56:20 -07:00
Junio C Hamano	80d268f309	Merge branch 'jk/protocol-cap-parse-fix' The code to parse capability list for v0 on-wire protocol fell into an infinite loop when a capability appears multiple times, which has been corrected. * jk/protocol-cap-parse-fix: v0 protocol: use size_t for capability length/offset t5512: test "ls-remote --heads --symref" filtering with v0 and v2 t5512: allow any protocol version for filtered symref test t5512: add v2 support for "ls-remote --symref" test v0 protocol: fix sha1/sha256 confusion for capabilities^{} t5512: stop referring to "v1" protocol v0 protocol: fix infinite loop when parsing multi-valued capabilities	2023-04-25 13:56:20 -07:00
Junio C Hamano	0807e57807	Merge branch 'en/header-split-cache-h' Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...	2023-04-25 13:56:20 -07:00
Junio C Hamano	9ce9dea4e1	Sync with Git 2.40.1	2023-04-24 22:31:32 -07:00
Taylor Blau	a2742f8c59	t/helper/test-json-writer.c: avoid using `strtok()` Apply similar treatment as in the previous commit to remove usage of `strtok()` from the "oidmap" test helper. Each of the different commands that the "json-writer" helper accepts pops the next space-delimited token from the current line and interprets it as a string, integer, or double (with the exception of the very first token, which is the command itself). To accommodate this, split the line in place by the space character, and pass the corresponding string_list to each of the specialized `get_s()`, `get_i()`, and `get_d()` functions. `get_i()` and `get_d()` are thin wrappers around `get_s()` that convert their result into the appropriate type by either calling `strtol()` or `strtod()`, respectively. In `get_s()`, we mark the token as "consumed" by incrementing the `consumed_nr` counter, indicating how many tokens we have read up to that point. Because each of these functions needs the string-list parts, the number of tokens consumed, and the line number, these three are wrapped up in to a struct representing the line state. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 16:01:28 -07:00
Taylor Blau	deeabc1ff0	t/helper/test-oidmap.c: avoid using `strtok()` Apply similar treatment as in the previous commit to remove usage of `strtok()` from the "oidmap" test helper. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 16:01:28 -07:00
Taylor Blau	826f0e33ab	t/helper/test-hashmap.c: avoid using `strtok()` Avoid using the non-reentrant `strtok()` to separate the parts of each incoming command. Instead of replacing it with `strtok_r()`, let's instead use the more friendly pair of `string_list_split_in_place()` and `string_list_remove_empty_items()`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 16:01:28 -07:00
Taylor Blau	52acddf36c	string-list: multi-delimiter `string_list_split_in_place()` Enhance `string_list_split_in_place()` to accept multiple characters as delimiters instead of a single character. Instead of using `strchr(2)` to locate the first occurrence of the given delimiter character, `string_list_split_in_place_multi()` uses `strcspn(2)` to move past the initial segment of characters comprised of any characters in the delimiting set. When only a single delimiting character is provided, `strpbrk(2)` (which is implemented with `strcspn(2)`) has equivalent performance to `strchr(2)`. Modern `strcspn(2)` implementations treat an empty delimiter or the singleton delimiter as a special case and fall back to calling strchrnul(). Both glibc[1] and musl[2] implement `strcspn(2)` this way. This change is one step to removing `strtok(2)` from the tree. Note that `string_list_split_in_place()` is not a strict replacement for `strtok()`, since it will happily turn sequential delimiter characters into empty entries in the resulting string_list. For example: string_list_split_in_place(&xs, "foo:;:bar:;:baz", ":;", -1) would yield a string list of: ["foo", "", "", "bar", "", "", "baz"] Callers that wish to emulate the behavior of strtok(2) more directly should call `string_list_remove_empty_items()` after splitting. To avoid regressions for the new multi-character delimter cases, update t0063 in this patch as well. [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strcspn.c;hb=glibc-2.37#l35 [2]: https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn.c?h=v1.2.3#n11 Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 16:01:28 -07:00
Jacob Keller	603d0fdce2	blame: use different author name for fake commit generated by --contents When the --contents option is used with git blame, and the contents of the file have lines which can't be annotated by the history being blamed, the user will see an author of "Not Committed Yet". This is similar to the way blame handles working tree contents when blaming without a revision. This is slightly confusing since this data isn't the working copy and while it is technically "not committed yet", its also coming from an external file. Replace this author name with "External file (--contents)" to better differentiate such lines from actual working copy lines. Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Glen Choo <chooglen@google.com> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 15:16:31 -07:00
Andrei Rybak	3d77fbb664	t1300: add tests for missing keys There are several tests in t1300-config.sh that validate failing invocations of "git config". However, there are no tests that check what happens when "git config" is asked to retrieve a value for a missing key. Add tests that check this for various combinations of "<section>.<key>" and "<section>.<subsection>.<key>". Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 15:10:50 -07:00
Andrei Rybak	93f86046c9	t1300: check stderr for "ignores pairs" tests Tests "git config ignores pairs ..." in t1300-config.sh validate that "git config" ignores various kinds of supplied pairs of environment variables GIT_CONFIG_KEY_* GIT_CONFIG_VALUE_* depending on GIT_CONFIG_COUNT. By "ignores" here we mean that "git config" abides by the value of environment variable GIT_CONFIG_COUNT and doesn't use key-value pairs outside of the supplied GIT_CONFIG_COUNT when trying to produce a value for config key "pair.one". These tests also validate that "git config" doesn't complain about mismatched environment variables to standard error. This is validated by redirecting the standard error to a file called "error" and asserting that it is empty. However, two of these tests incorrectly redirect to standard output while calling the file "error", and test 'git config ignores pairs exceeding count' doesn't validate standard error at all. Fix these tests by redirecting standard error to file "error" and asserting its emptiness. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 15:10:50 -07:00
Andrei Rybak	f7f9a836e2	t1300: drop duplicate test There are two almost identical tests called 'git config ignores pairs with zero count' in file t1300-config.sh. Drop the first of these and keep the one that contains more assertions. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 15:10:50 -07:00
Elijah Newren	e3a3f5edf5	reftable: ensure git-compat-util.h is the first (indirect) include Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:33 -07:00
Elijah Newren	5e3f94dfe3	treewide: remove cache.h inclusion due to previous changes Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:33 -07:00
Elijah Newren	d1cbe1e6d8	hash-ll.h: split out of hash.h to remove dependency on repository.h hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:32 -07:00
Elijah Newren	d4ff2072ab	match-trees.h: move declarations for match-trees.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:32 -07:00
Elijah Newren	0ff73d742b	packfile.h: move pack_window and pack_entry from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:31 -07:00
Elijah Newren	69a63fe663	treewide: be explicit about dependence on strbuf.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:31 -07:00
Junio C Hamano	b64894c206	Merge branch 'ow/ref-filter-omit-empty' "git branch --format=..." and "git format-patch --format=..." learns "--omit-empty" to hide refs that whose formatting result becomes an empty string from the output. * ow/ref-filter-omit-empty: branch, for-each-ref, tag: add option to omit empty lines	2023-04-21 15:35:05 -07:00
Junio C Hamano	9e0d1aa495	Merge branch 'ah/format-patch-thread-doc' Doc update. * ah/format-patch-thread-doc: format-patch: correct documentation of --thread without an argument	2023-04-21 15:35:05 -07:00
Junio C Hamano	7ac228c994	Merge branch 'rn/sparse-describe' "git describe --dirty" learns to work better with sparse-index. * rn/sparse-describe: describe: enable sparse index for describe	2023-04-21 15:35:04 -07:00
Junio C Hamano	de73a20756	Merge branch 'rs/archive-from-subdirectory-fixes' "git archive" run from a subdirectory mishandled attributes and paths outside the current directory. * rs/archive-from-subdirectory-fixes: archive: improve support for running in subdirectory	2023-04-21 15:35:04 -07:00
M Hickford	a5c76569e7	credential: new attribute oauth_refresh_token Git authentication with OAuth access token is supported by every popular Git host including GitHub, GitLab and BitBucket [1][2][3]. Credential helpers Git Credential Manager (GCM) and git-credential-oauth generate OAuth credentials [4][5]. Following RFC 6749, the application prints a link for the user to authorize access in browser. A loopback redirect communicates the response including access token to the application. For security, RFC 6749 recommends that OAuth response also includes expiry date and refresh token [6]. After expiry, applications can use the refresh token to generate a new access token without user reauthorization in browser. GitLab and BitBucket set the expiry at two hours [2][3]. (GitHub doesn't populate expiry or refresh token.) However the Git credential protocol has no attribute to store the OAuth refresh token (unrecognised attributes are silently discarded). This means that the user has to regularly reauthorize the helper in browser. On a browserless system, this is particularly intrusive, requiring a second device. Introduce a new attribute oauth_refresh_token. This is especially useful when a storage helper and a read-only OAuth helper are configured together. Recall that `credential fill` calls each helper until it has a non-expired password. ``` [credential] helper = storage # eg. cache or osxkeychain helper = oauth ``` The OAuth helper can use the stored refresh token forwarded by `credential fill` to generate a fresh access token without opening the browser. See https://github.com/hickford/git-credential-oauth/pull/3/files for an implementation tested with this patch. Add support for the new attribute to credential-cache. Eventually, I hope to see support in other popular storage helpers. Alternatives considered: ask helpers to store all unrecognised attributes. This seems excessively complex for no obvious gain. Helpers would also need extra information to distinguish between confidential and non-confidential attributes. Workarounds: GCM abuses the helper get/store/erase contract to store the refresh token during credential get as the password for a fictitious host [7] (I wrote this hack). This workaround is only feasible for a monolithic helper with its own storage. [1] https://github.blog/2012-09-21-easier-builds-and-deployments-using-git-over-https-and-oauth/ [2] https://docs.gitlab.com/ee/api/oauth2.html#access-git-over-https-with-access-token [3] https://support.atlassian.com/bitbucket-cloud/docs/use-oauth-on-bitbucket-cloud/#Cloning-a-repository-with-an-access-token [4] https://github.com/GitCredentialManager/git-credential-manager [5] https://github.com/hickford/git-credential-oauth [6] https://datatracker.ietf.org/doc/html/rfc6749#section-5.1 [7] `66b94e489a/src/shared/GitLab/GitLabHostProvider.cs (L207)` Signed-off-by: M Hickford <mirth.hickford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-21 09:38:30 -07:00
Junio C Hamano	a4a4db8cf7	Merge branch 'gc/better-error-when-local-clone-fails-with-symlink' "git clone --local" stops copying from an original repository that has symbolic links inside its $GIT_DIR; an error message when that happens has been updated. * gc/better-error-when-local-clone-fails-with-symlink: clone: error specifically with --local and symlinked objects	2023-04-20 14:33:36 -07:00
Junio C Hamano	98c496fcd0	Merge branch 'ar/t2024-checkout-output-fix' Test fix. * ar/t2024-checkout-output-fix: t2024: fix loose/strict local base branch DWIM test	2023-04-20 14:33:36 -07:00
Junio C Hamano	fa9172c70a	Merge branch 'rs/remove-approxidate-relative' The approxidate() API has been simplified by losing an extra function that did the same thing as another one. * rs/remove-approxidate-relative: date: remove approxidate_relative()	2023-04-20 14:33:35 -07:00
Junio C Hamano	cbfe844aa1	Merge branch 'rs/userdiff-multibyte-regex' The userdiff regexp patterns for various filetypes that are built into the system have been updated to avoid triggering regexp errors from UTF-8 aware regex engines. * rs/userdiff-multibyte-regex: userdiff: support regexec(3) with multi-byte support	2023-04-20 14:33:35 -07:00
Michael Strawbridge	a8022c5f7b	send-email: expose header information to git-send-email's sendemail-validate hook To allow further flexibility in the Git hook, the SMTP header information of the email which git-send-email intends to send, is now passed as the 2nd argument to the sendemail-validate hook. As an example, this can be useful for acting upon keywords in the subject or specific email addresses. Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Junio C Hamano <gitster@pobox.com> Cc: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Michael Strawbridge <michael.strawbridge@amd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-19 14:19:09 -07:00
Jeff King	7891e46585	gpg-interface: set trust level of missing key to "undefined" In check_signature(), we initialize the trust_level field to "-1", with the idea that if gpg does not return a trust level at all (if there is no signature, or if the signature is made by an unknown key), we'll use that value. But this has two problems: 1. Since the field is an enum, it's up to the compiler to decide what underlying storage to use, and it only has to fit the values we've declared. So we may not be able to store "-1" at all. And indeed, on my system (linux with gcc), the resulting enum is an unsigned 32-bit value, and -1 becomes 4294967295. The difference may seem academic (and you even get "-1" if you pass it to printf("%d")), but it means that code like this: status \|= sigc->trust_level < configured_min_trust_level; does not necessarily behave as expected. This turns out not to be a bug in practice, though, because we keep the "-1" only when gpg did not report a signature from a known key, in which case the line above: status \|= sigc->result != 'G'; would always set status to non-zero anyway. So only a 'G' signature with no parsed trust level would cause a problem, which doesn't seem likely to trigger (outside of unexpected gpg behavior). 2. When using the "%GT" format placeholder, we pass the value to gpg_trust_level_to_str(), which complains that the value is out of range with a BUG(). This behavior was introduced by `803978da49` (gpg-interface: add function for converting trust level to string, 2022-07-11). Before that, we just did a switch() on the enum, and anything that wasn't matched would end up as the empty string. Curiously, solving this by naively doing: if (level < 0) return ""; in that function isn't sufficient. Because of (1) above, the compiler can (and does in my case) actually remove that conditional as dead code! We can solve both by representing this state as an enum value. We could do this by adding a new "unknown" value. But this really seems to match the existing "undefined" level well. GPG describes this as "Not enough information for calculation". We have tests in t7510 that trigger this case (verifying a signature from a key that we don't have, and then checking various %G placeholders), but they didn't notice the BUG() because we didn't look at %GT for that case! Let's make sure we check all %G placeholders for each case in the formatting tests. The interesting ones here are "show unknown signature with custom format" and "show lack of signature with custom format", both of which would BUG() before, and now turn %GT into "undefined". Prior to `803978da49` they would have turned it into the empty string, but I think saying "undefined" consistently is a reasonable outcome, and probably makes life easier for anyone parsing the output (and any such parser had to be ready to see "undefined" already). The other modified tests produce the same output before and after this patch, but now we're consistently checking both %G? and %GT in all of them. Signed-off-by: Jeff King <peff@peff.net> Reported-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-19 08:30:54 -07:00
Taylor Blau	e3e24de1bf	builtin/gc.c: make `gc.cruftPacks` enabled by default Back in `5b92477f89` (builtin/gc.c: conditionally avoid pruning objects via loose, 2022-05-20), `git gc` learned the `--cruft` option and `gc.cruftPacks` configuration to opt-in to writing cruft packs when collecting or pruning unreachable objects. Cruft packs were introduced with the merge in `a50036da1a` (Merge branch 'tb/cruft-packs', 2022-06-03). They address the problem of "loose object explosions", where Git will write out many individual loose objects when there is a large number of unreachable objects that have not yet aged past `--prune=<date>`. Instead of keeping track of those unreachable yet recent objects via their loose object file's mtime, cruft packs collect all unreachable objects into a single pack with a corresponding `*.mtimes` file that acts as a table to store the mtimes of all unreachable objects. This prevents the need to store unreachable objects as loose as they age out of the repository, and avoids the problem of loose object explosions. Beyond avoiding loose object explosions, cruft packs also act as a more efficient mechanism to store unreachable objects as they age out of a repository. This is because pairs of similar unreachable objects serve as delta bases for one another. In `5b92477f89`, the feature was introduced as experimental. Since then, GitHub has been running these patches in every repository generating hundreds of millions of cruft packs along the way. The feature is battle-tested, and avoids many pathological cases such as above. Users who either run `git gc` manually, or via `git maintenance` can benefit from having cruft packs. As such, enable cruft pack generation to take place by default (by making `gc.cruftPacks` have the default of "true" rather than "false). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	c58100ab5d	t/t9300-fast-import.sh: prepare for `gc --cruft` by default In a similar fashion as previous commits, adjust the fast-import tests to prepare for "git gc" generating a cruft pack by default. This adjustment is slightly different, however. Instead of relying on us writing out the objects loose, and then calling `git prune` to remove them, t9300 needs to be prepared to drop objects that would be moved into cruft packs. To do this, we can combine the `git gc` invocation with `git prune` into one `git gc --prune`, which handles pruning both loose objects, and objects that would otherwise be written to a cruft pack. Likely this pattern of "git gc && git prune" started all the way back in `03db4525d3` (Support gitlinks in fast-import., 2008-07-19), which happened after deprecating `git gc --prune` in `9e7d501990` (builtin-gc.c: deprecate --prune, it now really has no effect, 2008-05-09). After `--prune` was un-deprecated in `58e9d9d472` (gc: make --prune useful again by accepting an optional parameter, 2009-02-14), this script got a handful of new "git gc && git prune" instances via via `4cedb78cb5` (fast-import: add input format tests, 2011-08-11). These could have been `git gc --prune`, but weren't (likely taking after `03db4525d3`). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	b9061bc628	t/t6500-gc.sh: add additional test cases In the last commit, we refactored some of the tests in t6500 to make clearer when cruft packs will and won't be generated by `git gc`. Add the remaining cases not covered by the previous patch into this one, which enumerates all possible combinations of arguments that will produce (or not produce) a cruft pack. This prepares us for a future commit which will change the default value of `gc.cruftPacks` by ensuring that we understand which invocations do and do not change as a result. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	50685e0e0b	t/t6500-gc.sh: refactor cruft pack tests In `12253ab6d0` (gc: add tests for --cruft and friends, 2022-10-26), we added a handful of tests to t6500 to ensure that `git gc` respected the value of `--cruft` and `gc.cruftPacks`. Then, in `c695592850` (config: let feature.experimental imply gc.cruftPacks=true, 2022-10-26), another set of similar tests was added to ensure that `feature.experimental` correctly implied enabling cruft pack generation (or not). These tests are similar and could be consolidated. Do so in this patch to prepare for expanding the set of command-line invocations that enable or disable writing cruft packs. This makes it possible to easily test more combinations of arguments without being overly repetitive. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:48 -07:00
Taylor Blau	b31d45b831	t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default In a similar spirit as previous commits, prepare for `gc --cruft` becoming the default by ensuring that the tests in t6501 explicitly cover the case of freshening loose objects not using cruft packs. We could run this test twice, once with `--cruft` and once with `--no-cruft`, but doing so is unnecessary, since we already test object rescuing, freshening, and dealing with corrupt parts of the unreachable object graph extensively via t5329. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Taylor Blau	b934207a22	t/t5304-prune.sh: prepare for `gc --cruft` by default Many of the tests in t5304 run `git gc`, and rely on its behavior that unreachable-but-recent objects are written out loose. This is sensible, since t5304 deals specifically with this kind of pruning. If left unattended, however, this test would break when the default behavior of a bare "git gc" is adjusted to generate a cruft pack by default. Ensure that these tests continue to work as-is (and continue to provide coverage of loose object pruning) by passing `--no-cruft` explicitly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Taylor Blau	05b9013b71	builtin/gc.c: ignore cruft packs with `--keep-largest-pack` When cruft packs were implemented, we never adjusted the code for `git gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft packs. This option and configuration option share a common implementation, but including cruft packs is wrong in both cases: - Running `git gc --keep-largest-pack` in a repository where the largest pack is the cruft pack itself will make it impossible for `git gc` to prune objects, since the cruft pack itself is kept. - The same is true for `gc.bigPackThreshold`, if the size of the cruft pack exceeds the limit set by the caller. In the future, it is possible that `gc.bigPackThreshold` could be used to write a separate cruft pack containing any new unreachable objects that entered the repository since the last time a cruft pack was written. There are some complexities to doing so, mainly around handling pruning objects that are in an existing cruft pack that is above the threshold (which would either need to be rewritten, or else delay pruning). Rewriting a substantially similar cruft pack isn't ideal, but it is significantly better than the status-quo. If users have large cruft packs that they don't want to rewrite, they can mark them as `*.keep` packs. But in general, if a repository has a cruft pack that is so large it is slowing down GC's, it should probably be pruned anyway. In the meantime, ignore cruft packs in the common implementation for both of these options, and add a pair of tests to prevent any future regressions here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-18 14:56:47 -07:00
Junio C Hamano	3c957e6d39	Merge branch 'pw/rebase-cleanup-merge-strategy-option-handling' Clean-up of the code path that deals with merge strategy option handling in "git rebase". * pw/rebase-cleanup-merge-strategy-option-handling: rebase: remove a couple of redundant strategy tests rebase -m: fix serialization of strategy options rebase -m: cleanup --strategy-option handling sequencer: use struct strvec to store merge strategy options rebase: stop reading and writing unnecessary strategy state	2023-04-17 18:05:13 -07:00
Junio C Hamano	9d8370d445	Merge branch 'tk/mergetool-gui-default-config' "git mergetool" and "git difftool" learns a new configuration guiDefault to optionally favor configured guitool over non-gui-tool automatically when $DISPLAY is set. * tk/mergetool-gui-default-config: mergetool: new config guiDefault supports auto-toggling gui by DISPLAY	2023-04-17 18:05:11 -07:00
Junio C Hamano	d47ee0a565	Merge branch 'sl/sparse-write-tree' "git write-tree" learns to work better with sparse-index. * sl/sparse-write-tree: write-tree: integrate with sparse index	2023-04-17 18:05:11 -07:00
Derrick Stolee	5a6072f631	fsck: validate .rev file header While parsing a .rev file, we check the header information to be sure it makes sense. This happens before doing any additional validation such as a checksum or value check. In order to differentiate between a bad header and a non-existent file, we need to update the API for loading a reverse index. Make load_pack_revindex_from_disk() non-static and specify that a positive value means "the file does not exist" while other errors during parsing are negative values. Since an invalid header prevents setting up the structures we would use for further validations, we can stop at that point. The place where we can distinguish between a missing file and a corrupt file is inside load_revindex_from_disk(), which is used both by pack rev-indexes and multi-pack-index rev-indexes. Some tests in t5326 demonstrate that it is critical to take some conditions to allow positive error signals. Add tests that check the three header values. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:05 -07:00
Derrick Stolee	5f658d1b57	fsck: check rev-index position values When checking a rev-index file, it may be helpful to identify exactly which positions are incorrect. Compare the rev-index to a freshly-computed in-memory rev-index and report the comparison failures. This additional check (on top of the checksum validation) can help find files that were corrupt by a single bit flip on-disk or perhaps were written incorrectly due to a bug in Git. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Derrick Stolee	d975fe1fa5	fsck: check rev-index checksums The previous change added calls to verify_pack_revindex() in builtin/fsck.c, but the implementation of the method was left empty. Add the first and most-obvious check to this method: checksum verification. While here, create a helper method in the test script that makes it easy to adjust the .rev file and check that 'git fsck' reports the correct error message. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Derrick Stolee	0d30feef3c	fsck: create scaffolding for rev-index checks The 'fsck' builtin checks many of Git's on-disk data structures, but does not currently validate the pack rev-index files (a .rev file to pair with a .pack and .idx file). Before doing a more-involved check process, create the scaffolding within builtin/fsck.c to have a new error type and add that error type when the API method verify_pack_revindex() returns an error. That method does nothing currently, but we will add checks to it in later changes. For now, check that 'git fsck' succeeds without any errors in the normal case. Future checks will be paired with tests that corrupt the .rev file appropriately. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-17 14:39:04 -07:00
Johannes Schindelin	3d3c11852c	Sync with 2.39.3 * maint-2.39: (34 commits) Git 2.39.3 Git 2.38.5 Git 2.37.7 Git 2.36.6 Git 2.35.8 Makefile: force -O0 when compiling with SANITIZE=leak Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix range-diff: use ssize_t for parsed "len" in read_patches() ...	2023-04-17 21:16:10 +02:00
Johannes Schindelin	15628975cf	Sync with 2.38.5 * maint-2.38: (32 commits) Git 2.38.5 Git 2.37.7 Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion ...	2023-04-17 21:16:08 +02:00
Johannes Schindelin	c96ecfe6a5	Sync with 2.37.7 * maint-2.37: (31 commits) Git 2.37.7 Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 ...	2023-04-17 21:16:06 +02:00
Johannes Schindelin	1df551ce5c	Sync with 2.36.6 * maint-2.36: (30 commits) Git 2.36.6 Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix ...	2023-04-17 21:16:04 +02:00
Johannes Schindelin	62298def14	Sync with 2.35.8 * maint-2.35: (29 commits) Git 2.35.8 Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR ...	2023-04-17 21:16:02 +02:00
Johannes Schindelin	8cd052ea53	Sync with 2.34.8 * maint-2.34: (28 commits) Git 2.34.8 Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION ...	2023-04-17 21:15:59 +02:00
Johannes Schindelin	d6e9f67a8e	Sync with 2.33.8 * maint-2.33: (27 commits) Git 2.33.8 Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ...	2023-04-17 21:15:56 +02:00
Johannes Schindelin	bcd874d50f	Sync with 2.32.7 * maint-2.32: (26 commits) Git 2.32.7 Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ...	2023-04-17 21:15:52 +02:00
Johannes Schindelin	31f7fe5e34	Sync with 2.31.8 * maint-2.31: (25 commits) Git 2.31.8 tests: avoid using `test_i18ncmp` Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ci: use the same version of p4 on both Linux and macOS ...	2023-04-17 21:15:49 +02:00
Johannes Schindelin	92957d8427	tests: avoid using `test_i18ncmp` Since `test_i18ncmp` was deprecated in v2.31.*, the instances added in v2.30.9 needed to be converted to `test_cmp` calls. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2023-04-17 21:15:45 +02:00
Johannes Schindelin	b524e896b6	Sync with 2.30.9 * maint-2.30: (23 commits) Git 2.30.9 gettext: avoid using gettext if the locale dir is not present apply --reject: overwrite existing `.rej` symlink if it exists http.c: clear the 'finished' member once we are done with it clone.c: avoid "exceeds maximum object size" error with GCC v12.x range-diff: use ssize_t for parsed "len" in read_patches() range-diff: handle unterminated lines in read_patches() range-diff: drop useless "offset" variable from read_patches() t5604: GETTEXT_POISON fix, conclusion t5604: GETTEXT_POISON fix, part 1 t5619: GETTEXT_POISON fix t0003: GETTEXT_POISON fix, conclusion t0003: GETTEXT_POISON fix, part 1 t0033: GETTEXT_POISON fix http: support CURLOPT_PROTOCOLS_STR http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT ci: install python on ubuntu ci: use the same version of p4 on both Linux and macOS ci: remove the pipe after "p4 -V" to catch errors github-actions: run gcc-8 on ubuntu-20.04 image ...	2023-04-17 21:15:44 +02:00
Taylor Blau	528290f8c6	Merge branch 'tb/config-copy-or-rename-in-file-injection' Avoids issues with renaming or deleting sections with long lines, where configuration values may be interpreted as sections, leading to configuration injection. Addresses CVE-2023-29007. * tb/config-copy-or-rename-in-file-injection: config.c: disallow overly-long lines in `copy_or_rename_section_in_file()` config.c: avoid integer truncation in `copy_or_rename_section_in_file()` config: avoid fixed-sized buffer when renaming/deleting a section t1300: demonstrate failure when renaming sections with long lines Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:42 +02:00
Taylor Blau	3bb3d6bac5	config.c: disallow overly-long lines in `copy_or_rename_section_in_file()` As a defense-in-depth measure to guard against any potentially-unknown buffer overflows in `copy_or_rename_section_in_file()`, refuse to work with overly-long lines in a gitconfig. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>	2023-04-17 21:15:40 +02:00
Taylor Blau	a5bb10fd5e	config: avoid fixed-sized buffer when renaming/deleting a section When renaming (or deleting) a section of configuration, Git uses the function `git_config_copy_or_rename_section_in_file()` to rewrite the configuration file after applying the rename or deletion to the given section. To do this, Git repeatedly calls `fgets()` to read the existing configuration data into a fixed size buffer. When the configuration value under `old_name` exceeds the size of the buffer, we will call `fgets()` an additional time even if there is no newline in the configuration file, since our read length is capped at `sizeof(buf)`. If the first character of the buffer (after zero or more characters satisfying `isspace()`) is a '[', Git will incorrectly treat it as beginning a new section when the original section is being removed. In other words, a configuration value satisfying this criteria can incorrectly be considered as a new secftion instead of a variable in the original section. Avoid this issue by using a variable-width buffer in the form of a strbuf rather than a fixed-with region on the stack. A couple of small points worth noting: - Using a strbuf will cause us to allocate arbitrary sizes to match the length of each line. In practice, we don't expect any reasonable configuration files to have lines that long, and a bandaid will be introduced in a later patch to ensure that this is the case. - We are using strbuf_getwholeline() here instead of strbuf_getline() in order to match `fgets()`'s behavior of leaving the trailing LF character on the buffer (as well as a trailing NUL). This could be changed later, but using strbuf_getwholeline() changes the least about this function's implementation, so it is picked as the safest path. - It is temping to want to replace the loop to skip over characters matching isspace() at the beginning of the buffer with a convenience function like `strbuf_ltrim()`. But this is the wrong approach for a couple of reasons: First, it involves a potentially large and expensive `memmove()` which we would like to avoid. Second, and more importantly, we also do want to preserve those spaces to avoid changing the output of other sections. In all, this patch is a minimal replacement of the fixed-width buffer in `git_config_copy_or_rename_section_in_file()` to instead use a `struct strbuf`. Reported-by: André Baptista <andre@ethiack.com> Reported-by: Vítor Pinho <vitor@ethiack.com> Helped-by: Patrick Steinhardt <ps@pks.im> Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:40 +02:00
Taylor Blau	29198213c9	t1300: demonstrate failure when renaming sections with long lines When renaming a configuration section which has an entry whose length exceeds the size of our buffer in config.c's implementation of `git_config_copy_or_rename_section_in_file()`, Git will incorrectly form a new configuration section with part of the data in the section being removed. In this instance, our first configuration file looks something like: [b] c = d <spaces> [a] e = f [a] g = h Here, we have two configuration values, "b.c", and "a.g". The value "[a] e = f" belongs to the configuration value "b.c", and does not form its own section. However, when renaming the section 'a' to 'xyz', Git will write back "[xyz]\ne = f", but "[xyz]" is still attached to the value of "b.c", which is why "e = f" on its own line becomes a new entry called "b.e". A slightly different example embeds the section being renamed within another section. Demonstrate this failure in a test in t1300, which we will fix in the following commit. Co-authored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2023-04-17 21:15:39 +02:00
Johannes Schindelin	9db05711c9	apply --reject: overwrite existing `.rej` symlink if it exists The `git apply --reject` is expected to write out `.rej` files in case one or more hunks fail to apply cleanly. Historically, the command overwrites any existing `.rej` files. The idea being that apply/reject/edit cycles are relatively common, and the generated `.rej` files are not considered precious. But the command does not overwrite existing `.rej` symbolic links, and instead follows them. This is unsafe because the same patch could potentially create such a symbolic link and point at arbitrary paths outside the current worktree, and `git apply` would write the contents of the `.rej` file into that location. Therefore, let's make sure that any existing `.rej` file or symbolic link is removed before writing it. Reported-by: RyotaK <ryotak.mail@gmail.com> Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2023-04-17 21:15:38 +02:00
Jeff King	c4716236f2	t5512: test "ls-remote --heads --symref" filtering with v0 and v2 We have two overlapping tests for checking the behavior of "ls-remote --symref" when filtering output. The first test checks that using "--heads" will omit the symref for HEAD (since we don't print anything about HEAD at all), but still prints other symrefs. This has been marked as expecting failure since it was added in `99c08d4eb2` (ls-remote: add support for showing symrefs, 2016-01-19). That's because back then, we only had the v0 protocol, and it only reported on the HEAD symref, not others. But these days we have v2, which does exactly what the test wants. It would even have started unexpectedly passing when we switched to v2 by default, except that `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25) over-zealously marked it to run only in v0 mode. So let's run it with both protocol versions, and adjust the expected output for each. It passes in v2 without modification. In v0 mode, we'll drop the extra symref, but this is still testing something useful: it ensures that we do omit HEAD. The test after this checks "--heads" again, this time using the expected v0 output. That's now redundant. It also checks that limiting with a pattern like "refs/heads/*" works similarly, but that's redundant with a test earlier in the script which limits by HEAD (again, back then the "HEAD" test was less interesting because there were no other symrefs to omit, but in a modern v2 world, there are). So we can just delete that second test entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:13 -07:00
Jeff King	d6747adfa8	t5512: allow any protocol version for filtered symref test We have a test that checks that ls-remote, when asked only about HEAD, will report the HEAD symref, and not others. This was marked to always run with the v0 protocol by `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25). But in v0 this test is doing nothing! For v0, upload-pack only reports the HEAD symref anyway, so we'd never have any other symref to report. For v2, it is useful; we learn about all symrefs (and the test repo has multiple), so this demonstrates that we correctly avoid showing them. We could perhaps mark this to test explicitly with v2, but since that is the default these days, it's sufficient to just run ls-remote without any protocol specification. It still passes if somebody does an explicit GIT_TEST_PROTOCOL_VERSION=0; it's just testing less. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	20272ee8cf	t5512: add v2 support for "ls-remote --symref" test Commit `b2f73b70b2` (t5512: compensate for v0 only sending HEAD symrefs, 2019-02-25) configured this test to always run with protocol v0, since the output is different for v2. But that means we are not getting any test coverage of the feature with v2 at all. We could obviously switch to using and expecting v2, but then that leaves v0 behind (and while we don't use it by default, it's still important for testing interoperability with older servers). Likewise, we could switch the expected output based on $GIT_TEST_PROTOCOL_VERSION, but hardly anybody runs the tests for v0 these days. Instead, let's explicitly run it for both protocol versions to make sure they're well behaved. This matches other similar tests added later in `6a139cdd74` (ls-remote: pass heads/tags prefixes to transport, 2018-10-31), etc. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	13e67aa39b	v0 protocol: fix sha1/sha256 confusion for capabilities^{} Commit `eb398797cd` (connect: advertized capability is not a ref, 2016-09-09) added support for an upload-pack server responding with: 0000000000000000000000000000000000000000 capabilities^{} followed by a NUL and the actual capabilities. We correctly parse the oid using the packet_reader's hash_algo field, but then we compare it to null_oid(), which will instead use our current repo's default algorithm. If we're defaulting to sha256 locally but the other side is sha1, they won't match and we'll fail to parse the line (and thus die()). This can cause a test failure when the suite is run with GIT_TEST_DEFAULT_HASH=sha256, and we even do so regularly via the linux-sha256 CI job. But since the test requires JGit to run, it's usually just skipped, and nobody noticed the problem. The reason the original patch used JGit is that Git itself does not ever produce such a line via upload-pack; the feature was added to fix a real-world problem when interacting with JGit. That was good for verifying that the incompatibility was fixed, but it's not a good regression test: - hardly anybody runs it, because you have to have jgit installed; hence this bug going unnoticed - we're depending on jgit's behavior for the test to do anything useful. In particular, this behavior is only relevant to the v0 protocol, but these days we ask for the v2 protocol by default. So for modern jgit, this is probably testing nothing. - it's complicated and slow. We had to do some fifo trickery to handle races, and this one test makes up 40% of the runtime of the total script. Instead, let's just hard-code the response that's of interest to us. That will test exactly what we want for every run, and reveals the bug when run in sha256 mode. And of course we'll fix the actual bug by using the correct hash_algo struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	e6c4309748	t5512: stop referring to "v1" protocol There really isn't a "v1" Git protocol. It's just v0 with an extra probe which we used to test compatibility in preparation for v2. Any tests that are looking for before/after behavior for v2 really care about "v0". Mentioning "v1" in these tests is just making things more confusing, because we don't care about that probe; we're really testing v0. So let's say so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Jeff King	aa962fef27	v0 protocol: fix infinite loop when parsing multi-valued capabilities If Git's client-side parsing of an upload-pack response (so git-fetch or ls-remote) sees multiple instances of a single capability, it can enter an infinite loop due to a bug in advancing the "offset" parameter in the parser. This bug can't happen between a client and server of the same Git version. The client bug is in parse_feature_value() when the caller passes in an offset parameter. And that only happens when the v0 protocol is parsing "symref" and "object-format" capabilities, via next_server_feature_value(). But Git has never produced multiple object-format capabilities, and it stopped producing multiple symref values in `d007dbf7d6` (Revert "upload-pack: send non-HEAD symbolic refs", 2013-11-18). However, upload-pack did produce multiple symref entries for a while, and they are valid. Plus other implementations, such as Dulwich will still do so. So we should handle them. And even if we do not expect it, it is obviously a bug for the parser to enter an infinite loop. The bug itself is pretty simple. Commit `2c6a403d96` (connect: add function to parse multiple v1 capability values, 2020-05-25) added the "offset" parameter, which is used as both an in- and out-parameter. When parsing the first "symref" capability, offset will be 0 on input, and after parsing the capability, we set offset to an index just past the value by taking a pointer difference "(value + end) - feature_list". But on the second call, now offset is set to that larger index, which lets us skip past the first "symref" capability. However, we do so by incrementing feature_list. That means our pointer difference is now too small; it is counting from where we resumed parsing, not from the start of the original feature_list pointer. And because we incremented feature_list only inside our function, and not the caller, that increment is lost next time the function is called. One solution would be to account for those skipped bytes by incrementing offset, rather than assigning to it. But wait, there's more! We also increment feature_list if we have a near-miss. Say we are looking for "symref" and find "almost-symref". In that case we'll point feature_list to the "y" in "almost-symref" and restart our search. But that again means our offset won't be correct, as it won't account for the bytes between the start of the string and that "y". So instead, let's just record the beginning of the feature_list string in a separate pointer that we never touch. That offset we take in and return is meant to be using that point as a base, and now we'll do so consistently. Since the bug can't be reproduced using the current version of git-upload-pack, we'll instead hard-code an input which triggers the problem. Before this patch it loops forever re-parsing the second symref entry. Now we check both that it finishes, and that it parses both entries correctly (a case we could not test at all before). We don't need to worry about testing v2 here; it communicates the capabilities in a completely different way, and doesn't use this code at all. There are tests earlier in t5512 that are meant to cover this (they don't, but we'll address that in a future patch). Reported-by: Jonas Haag <jonas@lophus.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 15:08:12 -07:00
Robin Jarry	3c8d3adeae	send-email: export patch counters in validate environment When sending patch series (with a cover-letter or not) sendemail-validate is called with every email/patch file independently from the others. When one of the patches depends on a previous one, it may not be possible to use this hook in a meaningful way. A hook that wants to check some property of the whole series needs to know which patch is the final one. Expose the current and total number of patches to the hook via the GIT_SENDEMAIL_PATCH_COUNTER and GIT_SENDEMAIL_PATCH_TOTAL environment variables so that both incremental and global validation is possible. Sharing any other state between successive invocations of the validate hook must be done via external means. For example, by storing it in a git config sendemail.validateWorktree entry. Add a sample script with placeholder validations and update tests to check that the counters are properly exported. Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Robin Jarry <robin@jarry.cc> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:41:15 -07:00
Patrick Steinhardt	d85cd18777	repack: disable writing bitmaps when doing a local repack In order to write a bitmap, we need to have full coverage of all objects that are about to be packed. In the traditional non-multi-pack-index world this meant we need to do a full repack of all objects into a single packfile. But in the new multi-pack-index world we can get away with writing bitmaps when we have multiple packfiles as long as the multi-pack-index covers all objects. This is not always the case though. When asked to perform a repack of local objects, only, then we cannot guarantee to have full coverage of all objects regardless of whether we do a full repack or a repack with a multi-pack-index. The end result is that writing the bitmap will fail in both worlds: $ git multi-pack-index write --stdin-packs --bitmap <packfiles warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing) error: could not write multi-pack bitmap Now there are two different ways to fix this. The first one would be to amend git-multi-pack-index(1) to disable writing bitmaps when we notice that we don't have full object coverage. - We don't have enough information in git-multi-pack-index(1) in order to tell whether the local repository _should_ have full coverage. Because even when connected to an alternate object directory, it may be the case that we still have all objects around in the main object database. - git-multi-pack-index(1) is quite a low-level tool. Automatically disabling functionality that it was asked to provide does not feel like the right thing to do. We can easily fix it at a higher level in git-repack(1) though. When asked to only include local objects via `-l` and when connected to an alternate object directory then we will override the user's ask and disable writing bitmaps with a warning. This is similar to what we do in git-pack-objects(1), where we also disable writing bitmaps in case we omit an object from the pack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	932c16c04b	repack: honor `-l` when calculating pack geometry When the user passes `-l` to git-repack(1), then they essentially ask us to only repack objects part of the local object database while ignoring any packfiles part of an alternate object database. And we in fact honor this bit when doing a geometric repack as the resulting packfile will only ever contain local objects. What we're missing though is that we don't take locality of packfiles into account when computing whether the geometric sequence is intact or not. So even though we would only ever roll up local packfiles anyway, we could end up trying to repack because of non-local packfiles. This does not make much sense, and in the worst case it can cause us to try and do the geometric repack over and over again because we're never able to restore the geometric sequence. Fix this bug by honoring whether the user has passed `-l`. If so, we skip adding any non-local packfiles to the pack geometry. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	19a3a7bde9	t/helper: allow chmtime to print verbosely without modifying mtime The `test-tool chmtime` helper allows us to both read and modify the modification time of files. But while it is possible to only read the mtimes of a file via `--get`, it is not possible to read the mtimes and report them together with their respective file paths via the `--verbose` flag without also modifying the mtime at the same time. Fix this so that it is possible to call `test-tool chmtime --verbose <files>...` without modifying any mtimes. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	f3028418c3	pack-objects: extend test coverage of `--stdin-packs` with alternates We don't have any tests that verify that git-pack-objects(1) works with `--stdin-packs` when combined with alternate object directories. Add some to make sure that the basic functionality works as expected. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:52 -07:00
Patrick Steinhardt	752b465c3c	pack-objects: fix error when same packfile is included and excluded When passing the same packfile both as included and excluded via the `--stdin-packs` option, then we will return an error because the excluded packfile cannot be found. This is because we will only set the `util` pointer for the included packfile list if it was found, so that we later die when we notice that it's in fact not set for the excluded packfile list. Fix this bug by always setting the `util` pointer for both the included and excluded list entries. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	732194b5f2	pack-objects: fix error when packing same pack twice When passed the same packfile twice via `--stdin-packs` we return an error that the packfile supposedly was not found. This is because when reading packs into the list of included or excluded packfiles, we will happily re-add packfiles even if they are part of the lists already. And while the list can now contain duplicates, we will only set the `util` pointer of the first list entry to the `packed_git` structure. We notice that at a later point when checking that all list entries have their `util` pointer set and die with an error. While this is kind of a nonsensical request, this scenario can be hit when doing geometric repacks. When a repository is connected to an alternate object directory and both have the exact same packfile then both would get added to the geometric sequence. And when we then decide to perform the repack, we will invoke git-pack-objects(1) with the same packfile twice. Fix this bug by removing any duplicates from both the included and excluded packs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	b7b8f048f5	pack-objects: split out `--stdin-packs` tests into separate file The test suite for git-pack-objects(1) is quite huge, and we're about to add more tests that relate to the `--stdin-packs` option. Split out all tests related to this option into a standalone file so that it becomes easier to test the feature in isolation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	51861340f8	repack: fix generating multi-pack-index with only non-local packs When writing the multi-pack-index with geometric repacking we will add all packfiles to the index that are part of the geometric sequence. This can potentially also include packfiles borrowed from an alternate object directory. But given that a multi-pack-index can only ever include packs that are part of the main object database this does not make much sense whatsoever. In the edge case where all packfiles are contained in the alternate object database and the local repository has none itself this bug can cause us to invoke git-multi-pack-index(1) with only non-local packfiles that it ultimately cannot find. This causes it to return an error and thus causes the geometric repack to fail. Fix the code to skip non-local packfiles. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	3d74a2337c	repack: fix trying to use preferred pack in alternates When doing a geometric repack with multi-pack-indices, then we ask git-multi-pack-index(1) to use the largest packfile as the preferred pack. It can happen though that the largest packfile is not part of the main object database, but instead part of an alternate object database. The result is that git-multi-pack-index(1) will not be able to find the preferred pack and print a warning. It then falls back to use the first packfile that the multi-pack-index shall reference. Fix this bug by only considering packfiles as preferred pack that are local. This is the right thing to do given that a multi-pack-index should never reference packfiles borrowed from an alternate. While at it, rename the function `get_largest_active_packfile()` to `get_preferred_pack()` to better document its intent. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Patrick Steinhardt	ceb96a160b	midx: fix segfault with no packs and invalid preferred pack When asked to write a multi-pack-index the user can specify a preferred pack that is used as a tie breaker when multiple packs contain the same objects. When this packfile cannot be found, we just pick the first pack that is getting tracked by the newly written multi-pack-index as a fallback. Picking the fallback can fail in the case where we're asked to write a multi-pack-index with no packfiles at all: picking the fallback value will cause a segfault as we blindly index into the array of packfiles, which would be empty. Fix this bug by resetting the preferred packfile index to `-1` before searching for the preferred pack. This fixes the segfault as we already check for whether the index is `> - 1`. If it is not, we simply don't pick a preferred packfile at all. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-14 10:27:51 -07:00
Øystein Walle	aabfdc9514	branch, for-each-ref, tag: add option to omit empty lines If the given format string expands to the empty string, a newline is still printed. This makes using the output linewise more tedious. For example, git update-ref --stdin does not accept empty lines. Add options to "git branch", "git for-each-ref", and "git tag" to not print these empty lines. The default behavior remains the same. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 08:07:45 -07:00
Taylor Blau	9f7f10a282	t: invert `GIT_TEST_WRITE_REV_INDEX` Back in `e8c58f894b` (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we added a test knob to conditionally enable writing a ".rev" file when indexing a pack. At the time, this was used to ensure that the test suite worked even when ".rev" files were written, which served as a stress-test for the on-disk reverse index implementation. Now that reading from on-disk ".rev" files is enabled by default, the test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning. We could get rid of the option entirely, but there would be no convenient way to test Git when ".rev" files aren't in place. Instead of getting rid of the option, invert its meaning to instead disable writing ".rev" files, thereby running the test suite in a mode where the reverse index is generated from scratch. This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some spelling of "true", we are still running and exercising Git's behavior when forced to generate reverse indexes from scratch. Do so by setting it in the linux-TEST-vars CI run to ensure that we are maintaining good coverage of this now-legacy code. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
Taylor Blau	a8dd7e05b1	config: enable `pack.writeReverseIndex` by default Back in `e37d0b8730` (builtin/index-pack.c: write reverse indexes, 2021-01-25), Git learned how to read and write a pack's reverse index from a file instead of in-memory. A pack's reverse index is a mapping from pack position (that is, the order that objects appear together in a ".pack") to their position in lexical order (that is, the order that objects are listed in an ".idx" file). Reverse indexes are consulted often during pack-objects, as well as during auxiliary operations that require mapping between pack offsets, pack order, and index index. They are useful in GitHub's infrastructure, where we have seen a dramatic increase in performance when writing ".rev" files[1]. In particular: - an ~80% reduction in the time it takes to serve fetches on a popular repository, Homebrew/homebrew-core. - a ~60% reduction in the peak memory usage to serve fetches on that same repository. - a collective savings of ~35% in CPU time across all pack-objects invocations serving fetches across all repositories in a single datacenter. Reverse indexes are also beneficial to end-users as well as forges. For example, the time it takes to generate a pack containing the objects for the 10 most recent commits in linux.git (representing a typical push) is significantly faster when on-disk reverse indexes are available: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 543.0 ms ± 20.3 ms [User: 616.2 ms, System: 58.8 ms] Range (min … max): 521.0 ms … 577.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 245.0 ms ± 11.4 ms [User: 335.6 ms, System: 31.3 ms] Range (min … max): 226.0 ms … 259.6 ms 13 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' The same is true of writing a pack containing the objects for the 30 most-recent commits: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 866.5 ms ± 16.2 ms [User: 1414.5 ms, System: 97.0 ms] Range (min … max): 839.3 ms … 886.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 581.6 ms ± 10.2 ms [User: 1181.7 ms, System: 62.6 ms] Range (min … max): 567.5 ms … 599.3 ms 10 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ...and savings on trivial operations like computing the on-disk size of a single (packed) object are even more dramatic: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 305.8 ms ± 11.4 ms [User: 264.2 ms, System: 41.4 ms] Range (min … max): 290.3 ms … 331.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 4.0 ms ± 0.3 ms [User: 1.7 ms, System: 2.3 ms] Range (min … max): 1.6 ms … 4.6 ms 1155 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' In the more than two years since `e37d0b8730` was merged, Git's implementation of on-disk reverse indexes has been thoroughly tested, both from users enabling `pack.writeReverseIndexes`, and from GitHub's deployment of the feature. The latter has been running without incident for more than two years. This patch changes Git's behavior to write on-disk reverse indexes by default when indexing a pack, which should make the above operations faster for everybody's Git installation after a repack. (The previous commit explains some potential drawbacks of using on-disk reverse indexes in certain limited circumstances, that essentially boil down to a trade-off between time to generate, and time to access. For those limited cases, the `pack.readReverseIndex` escape hatch can be used). [1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
Taylor Blau	dbcf611617	pack-revindex: introduce `pack.readReverseIndex` Since `1615c567b8` (Documentation/config/pack.txt: advertise 'pack.writeReverseIndex', 2021-01-25), we have had the `pack.writeReverseIndex` configuration option, which tells Git whether or not it is allowed to write a ".rev" file when indexing a pack. Introduce a complementary configuration knob, `pack.readReverseIndex` to control whether or not Git will read any ".rev" file(s) that may be available on disk. This option is useful for debugging, as well as disabling the effect of ".rev" files in certain instances. This is useful because of the trade-off[^1] between the time it takes to generate a reverse index (slow from scratch, fast when reading an existing ".rev" file), and the time it takes to access a record (the opposite). For example, even though it is faster to use the on-disk reverse index when computing the on-disk size of a packed object, it is slower to enumerate the same value for all objects. Here are a couple of examples from linux.git. When computing the above for a single object, using the on-disk reverse index is significantly faster: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 302.5 ms ± 12.5 ms [User: 258.7 ms, System: 43.6 ms] Range (min … max): 291.1 ms … 328.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 3.9 ms ± 0.3 ms [User: 1.6 ms, System: 2.4 ms] Range (min … max): 2.0 ms … 4.4 ms 801 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 77.29 ± 7.14 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' , but when instead trying to compute the on-disk object size for all objects in the repository, using the ".rev" file is a disadvantage over creating the reverse index from scratch: $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 8.258 s ± 0.035 s [User: 7.949 s, System: 0.308 s] Range (min … max): 8.199 s … 8.293 s 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 16.976 s ± 0.107 s [User: 16.706 s, System: 0.268 s] Range (min … max): 16.839 s … 17.105 s 10 runs Summary 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' ran 2.06 ± 0.02 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" --batch-all-objects' Luckily, the results when running `git cat-file` with `--unordered` are closer together: $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 5.066 s ± 0.105 s [User: 4.792 s, System: 0.274 s] Range (min … max): 4.943 s … 5.220 s 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects Time (mean ± σ): 6.193 s ± 0.069 s [User: 5.937 s, System: 0.255 s] Range (min … max): 6.145 s … 6.356 s 10 runs Summary 'git.compile -c pack.readReverseIndex=false cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' ran 1.22 ± 0.03 times faster than 'git.compile -c pack.readReverseIndex=true cat-file --unordered --batch-check="%(objectsize:disk)" --batch-all-objects' Because the equilibrium point between these two is highly machine- and repository-dependent, allow users to configure whether or not they will read any ".rev" file(s) with this configuration knob. [^1]: Generating a reverse index in memory takes O(N) time (where N is the number of objects in the repository), since we use a radix sort. Reading an entry from an on-disk ".rev" file is slower since each operation is bound by disk I/O instead of memory I/O. In order to compute the on-disk size of a packed object, we need to find the offset of our object, and the adjacent object (the on-disk size difference of these two). Finding the first offset requires a binary search. Finding the latter involves a single .rev lookup. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
Taylor Blau	b77919ed6e	t5325: mark as leak-free This test is leak-free as of the previous commit, so let's mark it as such to ensure we don't regress and introduce a leak in the future. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:45 -07:00
Junio C Hamano	063cd850f2	Merge branch 'jk/use-perl-path-consistently' Tests had a few places where we ignored PERL_PATH and blindly used /usr/bin/perl, which have been corrected. * jk/use-perl-path-consistently: t/lib-httpd: pass PERL_PATH to CGI scripts	2023-04-11 13:49:13 -07:00
Junio C Hamano	96f4113ac0	Merge branch 'jc/clone-object-format-from-void' "git clone" from an empty repository learned to propagate the choice of the hash algorithm from the source repository to the newly created repository. * jc/clone-object-format-from-void: clone: propagate object-format when cloning from void	2023-04-11 13:49:13 -07:00
Junio C Hamano	30e04bcfa8	Merge branch 'ar/adjust-tests-for-the-index-fallout' Comment updates. * ar/adjust-tests-for-the-index-fallout: t2107: fix mention of the_index.cache_changed t3060: fix mention of function prune_index	2023-04-11 13:49:12 -07:00
Junio C Hamano	647a2bb3ff	Merge branch 'jc/spell-id-in-both-caps-in-message-id' Consistently spell "Message-ID" as such, not "Message-Id". * jc/spell-id-in-both-caps-in-message-id: e-mail workflow: Message-ID is spelled with ID in both capital letters	2023-04-11 13:49:12 -07:00
Junio C Hamano	d02343b599	Merge branch 'ws/sparse-check-rules' "git sparse-checkout" command learns a debugging aid for the sparse rule definitions. * ws/sparse-check-rules: builtin/sparse-checkout: add check-rules command builtin/sparse-checkout: remove NEED_WORK_TREE flag	2023-04-11 13:49:12 -07:00
Elijah Newren	e93fc5d721	treewide: remove cache.h inclusion due to object-name.h changes Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	dabab1d6e6	object-name.h: move declarations for object-name.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	5579f44d2f	treewide: remove unnecessary cache.h inclusion Several files were including cache.h solely to get other headers, such as trace.h and trace2.h. Since the last few commits have modified files to make these dependencies more explicit, the inclusion of cache.h is no longer needed in several cases. Remove it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	5bc07225e5	treewide: be explicit about dependence on mem-pool.h Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:09 -07:00
Elijah Newren	74ea5c9574	treewide: be explicit about dependence on trace.h & trace2.h Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:08 -07:00
Glen Choo	4e33535ea9	clone: error specifically with --local and symlinked objects `6f054f9fb3` (builtin/clone.c: disallow --local clones with symlinks, 2022-07-28) gives a good error message when "git clone --local" fails when the repo to clone has symlinks in "$GIT_DIR/objects". In `bffc762f87` (dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS, 2023-01-24), we later extended this restriction to the case where "$GIT_DIR/objects" is itself a symlink, but we didn't update the error message then - bffc762f87's tests show that we print a generic "failed to start iterator over" message. This is exacerbated by the fact that Documentation/git-clone.txt mentions neither restriction, so users are left wondering if this is intentional behavior or not. Fix this by adding a check to builtin/clone.c: when doing a local clone, perform an extra check to see if "$GIT_DIR/objects" is a symlink, and if so, assume that that was the reason for the failure and report the relevant information. Ideally, dir_iterator_begin() would tell us that the real failure reason is the presence of the symlink, but (as far as I can tell) there isn't an appropriate errno value for that. Also, update Documentation/git-clone.txt to reflect that this restriction exists. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:46:09 -07:00
Andrei Rybak	fd72637423	t2024: fix loose/strict local base branch DWIM test Test 'loosely defined local base branch is reported correctly' in t2024-checkout-dwim.sh, which was introduced in [1] compares output of two invocations of "git checkout", invoked with two different branches named "strict" and "loose". As per description in [1], the test is validating that output of tracking information for these two branches. This tracking information is printed to standard output: Your branch is behind 'main' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) The test assumes that the names of the two branches (strict and loose) are in that output, and pipes the output through sed to replace names of the branches with "BRANCHNAME". Command "git checkout", however, outputs the branch name to standard error, not standard output -- see message "Switched to branch '%s'\n" in function "update_refs_for_switch" in "builtin/checkout.c". This means that the two invocations of sed do nothing. Redirect both the standard output and the standard error of "git checkout" for these assertions. Ensure that compared files have the string "BRANCHNAME". In a series of piped commands, only the return code of the last command is used. Thus, all other commands will have their return codes masked. Avoid piping of output of git directly into sed to preserve the exit status code of "git checkout", while we're here. [1] `05e73682cd` (checkout: report upstream correctly even with loosely defined branch.*.merge, 2014-10-14) Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 10:11:23 -07:00
Phillip Wood	05106aa198	rebase: remove a couple of redundant strategy tests Remove a test in t3402 that has been redundant ever since `80ff47957b` (rebase: remember strategy and strategy options, 2011-02-06). That commit added a new test, the first part of which (as noted in the old commit message) duplicated an existing test. Also remove a test t3418 that has been redundant since the merge backend was removed in `68aa495b59` (rebase: implement --merge via the interactive machinery, 2018-12-11), since it now tests the same code paths as the preceding test. Helped-by: Elijah Newren <newren@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00
Phillip Wood	4960e5c7bd	rebase -m: fix serialization of strategy options To store the strategy options rebase prepends " --" to each one and writes them to a file. To load them it reads the file and passes the contents to split_cmdline(). This roughly mimics the behavior of the scripted rebase but has a couple of limitations, (1) options containing whitespace are not properly preserved (this is true of the scripted rebase as well) and (2) options containing '"' or '\' are incorrectly parsed and may cause the parser to return an error. Fix these limitations by quoting each option when they are stored so that they can be parsed correctly. Now that "--preserve-merges" no longer exist this change also stops prepending "--" to the options when they are stored as that was an artifact of the scripted rebase. These changes are backwards compatible so the files written by an older version of git can still be read. They are also forwards compatible, the file can still be parsed by recent versions of git as they treat the "--" prefix as optional. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00
Phillip Wood	4a8bc9860a	rebase -m: cleanup --strategy-option handling When handling "--strategy-option" rebase collects the commands into a struct string_list, then concatenates them into a string, prepending "--" to each one before splitting the string and removing the "--" prefix. This is an artifact of the scripted rebase and the need to support "rebase --preserve-merges". Now that "--preserve-merges" no-longer exists we can cleanup the way the argument is handled. The tests for a bad strategy option are adjusted now that parse_strategy_opts() is no-longer called when starting a rebase. The fact that it only errors out when running "git rebase --continue" is a mixed blessing but the next commit will fix the root cause of the parsing problem so lets not worry about that here. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-10 09:53:19 -07:00

... 3 4 5 6 7 ...

21245 commits