development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-30 04:01:21 +00:00

Author	SHA1	Message	Date
Emily Shaffer	303b3c1c46	submodule: add newline on invalid submodule error Since 'err' contains output for multiple submodules and is printed all at once by fetch_populated_submodules(), errors for each submodule should be newline separated for readability. The same strbuf is added to with a newline in the other half of the conditional where this error is detected, so make the two consistent. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-07 11:05:45 -08:00
Heba Waly	887a0fd573	add: change advice config variables used by the add API advice.addNothing config variable is used to control the visibility of two advice messages in the add library. This config variable is replaced by two new variables, whose names are more clear and relevant to the two cases. Also add the two new variables to the documentation. Signed-off-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-06 11:08:00 -08:00
Junio C Hamano	de93cc14ab	The third batch for 2.26 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 14:35:00 -08:00
Junio C Hamano	ea46d9097b	Merge branch 'mt/sparse-checkout-doc-update' Doc update. * mt/sparse-checkout-doc-update: completion: add support for sparse-checkout doc: sparse-checkout: mention --cone option	2020-02-05 14:35:00 -08:00
Junio C Hamano	ff5134b2ff	Merge branch 'pb/recurse-submodule-in-worktree-fix' The "--recurse-submodules" option of various subcommands did not work well when run in an alternate worktree, which has been corrected. * pb/recurse-submodule-in-worktree-fix: submodule.c: use get_git_dir() instead of get_git_common_dir() t2405: clarify test descriptions and simplify test t2405: use git -C and test_commit -C instead of subshells t7410: rename to t2405-worktree-submodule.sh	2020-02-05 14:35:00 -08:00
Junio C Hamano	b5c71cc33d	Merge branch 'es/fetch-show-failed-submodules-atend' A fetch that is told to recursively fetch updates in submodules inevitably produces reams of output, and it becomes hard to spot error messages. The command has been taught to enumerate submodules that had errors at the end of the operation. * es/fetch-show-failed-submodules-atend: fetch: emphasize failure during submodule fetch	2020-02-05 14:34:59 -08:00
Junio C Hamano	7ab963e122	Merge branch 'en/fill-directory-fixes-more' Corner case bugs in "git clean" that stem from a (necessarily for performance reasons) awkward calling convention in the directory enumeration API have been corrected. * en/fill-directory-fixes-more: dir: point treat_leading_path() warning to the right place dir: restructure in a way to avoid passing around a struct dirent dir: treat_leading_path() and read_directory_recursive(), round 2 clean: demonstrate a bug with pathspecs	2020-02-05 14:34:59 -08:00
Junio C Hamano	f52ab33616	Merge branch 'bc/hash-independent-tests-part-7' Preparation of test scripts for the day when the object names will use SHA-256 continues. * bc/hash-independent-tests-part-7: t5604: make hash independent t5601: switch into repository to hash object t5562: use $ZERO_OID t5540: make hash size independent t5537: make hash size independent t5530: compute results based on object length t5512: abstract away SHA-1-specific constants t5510: make hash size independent t5504: make hash algorithm independent t5324: make hash size independent t5319: make test work with SHA-256 t5319: change invalid offset for SHA-256 compatibility t5318: update for SHA-256 t4300: abstract away SHA-1-specific constants t4204: make hash size independent t4202: abstract away SHA-1-specific constants t4200: make hash size independent t4134: compute appropriate length constant t4066: compute index line in diffs t4054: make hash-size independent	2020-02-05 14:34:59 -08:00
Junio C Hamano	25794d6ce9	Merge branch 'km/submodule-add-errmsg' Improve error message generation for "git submodule add". * km/submodule-add-errmsg: submodule add: show 'add --dry-run' stderr when aborting	2020-02-05 14:34:59 -08:00
Junio C Hamano	d0e70cd32e	Merge branch 'am/checkout-file-and-ref-ref-ambiguity' "git checkout X" did not correctly fail when X is not a local branch but could name more than one remote-tracking branch (i.e. to be dwimmed as the starting point to create a corresponding local branch), which has been corrected. * am/checkout-file-and-ref-ref-ambiguity: checkout: don't revert file on ambiguous tracking branches parse_branchname_arg(): extract part as new function	2020-02-05 14:34:58 -08:00
Junio C Hamano	76c57fedfa	Merge branch 'js/add-p-leftover-bits' The final leg of rewriting "add -i/-p" in C. * js/add-p-leftover-bits: ci: include the built-in `git add -i` in the `linux-gcc` job built-in add -p: handle Escape sequences more efficiently built-in add -p: handle Escape sequences in interactive.singlekey mode built-in add -p: respect the `interactive.singlekey` config setting terminal: add a new function to read a single keystroke terminal: accommodate Git for Windows' default terminal terminal: make the code of disable_echo() reusable built-in add -p: handle diff.algorithm built-in add -p: support interactive.diffFilter t3701: adjust difffilter test	2020-02-05 14:34:58 -08:00
Junio C Hamano	9a5315edfd	Merge branch 'js/patch-mode-in-others-in-c' The effort to move "git-add--interactive" to C continues. * js/patch-mode-in-others-in-c: commit --interactive: make it work with the built-in `add -i` built-in add -p: implement the "worktree" patch modes built-in add -p: implement the "checkout" patch modes built-in stash: use the built-in `git add -p` if so configured legacy stash -p: respect the add.interactive.usebuiltin setting built-in add -p: implement the "stash" and "reset" patch modes built-in add -p: prepare for patch modes other than "stage"	2020-02-05 14:34:58 -08:00
Junio C Hamano	381e8e9de1	Merge branch 'dl/test-must-fail-fixes' Test clean-up. * dl/test-must-fail-fixes: t1507: inline full_name() t1507: run commands within test_expect_success t1507: stop losing return codes of git commands t1501: remove use of `test_might_fail cp` t1409: use test_path_is_missing() t1409: let sed open its own input file t1307: reorder `nongit test_must_fail` t1306: convert `test_might_fail rm` to `rm -f` t0020: use ! check_packed_refs_marked t0020: don't use `test_must_fail has_cr` t0003: don't use `test_must_fail attr_check` t0003: use test_must_be_empty() t0003: use named parameters in attr_check() t0000: replace test_must_fail with run_sub_test_lib_test_err() t/lib-git-p4: use test_path_is_missing()	2020-02-05 14:34:57 -08:00
Jacques Bodin-Hullin	395518cf7a	parse-options: lose an unnecessary space in an error message Signed-off-by: Jacques Bodin-Hullin <j.bodinhullin@monsieurbiz.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:49:29 -08:00
René Scharfe	079f970971	name-rev: sort tip names before applying name_ref() is called for each ref and checks if it's a better name for the referenced commit. If that's the case it remembers it and checks if a name based on it is better for its ancestors as well. This is done in the order for_each_ref() imposes on us. That might not be optimal. If bad names happen to be encountered first (as defined by is_better_name()), names derived from them may spread to a lot of commits, only to be replaced by better names later. Setting better names first can avoid that. is_better_name() prefers tags, short distances and old references. The distance is a measure that we need to calculate for each candidate commit, but the other two properties are not dependent on the relationships of commits. Sorting the refs by them should yield better performance than the essentially random order we currently use. And applying older references first should also help to reduce rework due to the fact that older commits have fewer ancestors than newer ones. So add all details of names to the tip table first, then sort them to prefer tags and older references and then apply them in this order. Here's the performance as measured by hyperfine for the Linux repo before: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 851.1 ms ± 4.5 ms [User: 806.7 ms, System: 44.4 ms] Range (min … max): 845.9 ms … 859.5 ms 10 runs ... and with this patch: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 736.2 ms ± 8.7 ms [User: 688.4 ms, System: 47.5 ms] Range (min … max): 726.0 ms … 755.2 ms 10 runs Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:36:33 -08:00
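The ordering described in the name-rev commit above (tags first, then older references) can be sketched as a sort key. This is a minimal Python illustration, not Git's C code: the `Tip` class, its field names, and the timestamps are hypothetical stand-ins for the tip table entries the message mentions.

```python
from dataclasses import dataclass


@dataclass
class Tip:
    """Hypothetical stand-in for one entry of the tip table."""
    refname: str
    from_tag: bool
    taggerdate: int  # seconds since the epoch (illustrative values below)


def sort_tips(tips):
    # Tags sort first (False < True, hence the negation),
    # then older taggerdate values come before newer ones.
    return sorted(tips, key=lambda t: (not t.from_tag, t.taggerdate))


tips = [
    Tip("refs/heads/master", False, 1580900000),
    Tip("refs/tags/v2.25.0", True, 1578900000),
    Tip("refs/tags/v2.24.0", True, 1572800000),
]
ordered = [t.refname for t in sort_tips(tips)]
print(ordered)
```

Applying names in this order means a commit reachable from an old tag tends to receive its final name early, so fewer names are assigned and later replaced.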
René Scharfe	2d53975488	name-rev: release unused name strings name_rev() assigns a name to a commit and its parents and grandparents and so on. Commits share their name string with their first parent, which in turn does the same, recursively to the root. That saves a lot of allocations. When a better name is found, the old name is replaced, but its memory is not released. That leakage can become significant. Can we release these old strings exactly once even though they are referenced multiple times? Yes, indeed -- we can make use of the fact that name_rev() visits the ancestors of a commit after it set a new name for it and tries to update their names as well. Members of the first ancestral line have the same taggerdate and from_tag values, but a higher distance value than their child commit at generation 0. These are the only criteria used by is_better_name(). Lower distance values are considered better, so a name that is better for a child will also be better for its parent and grandparent etc. That means we can free(3) an inferior name at generation 0 and rely on name_rev() to replace all references in ancestors as well. If we do that then we need to stop using the string pointer alone to distinguish new empty rev_name slots from initialized ones, though, as it technically becomes invalid after the free(3) call -- even though its value is still different from NULL. We can check the generation value first, as empty slots will have it initialized to 0, and for the actual generation 0 we'll set a new valid name right after the create_or_update_name() call that releases the string. For the Chromium repo, releasing superseded names reduces the memory footprint of name-rev --all significantly. Here's the output of GNU time before: 0.98user 0.48system 0:01.46elapsed 99%CPU (0avgtext+0avgdata 2601812maxresident)k 0inputs+0outputs (0major+571470minor)pagefaults 0swaps ...
and with this patch: 1.01user 0.26system 0:01.28elapsed 100%CPU (0avgtext+0avgdata 1559196maxresident)k 0inputs+0outputs (0major+314370minor)pagefaults 0swaps It also gets faster; hyperfine before: Benchmark #1: ./git -C ../chromium/src name-rev --all Time (mean ± σ): 1.534 s ± 0.006 s [User: 1.039 s, System: 0.494 s] Range (min … max): 1.522 s … 1.542 s 10 runs ... and with this patch: Benchmark #1: ./git -C ../chromium/src name-rev --all Time (mean ± σ): 1.338 s ± 0.006 s [User: 1.047 s, System: 0.291 s] Range (min … max): 1.327 s … 1.346 s 10 runs For the Linux repo it doesn't pay off; memory usage only gets down from: 0.76user 0.03system 0:00.80elapsed 99%CPU (0avgtext+0avgdata 292848maxresident)k 0inputs+0outputs (0major+44579minor)pagefaults 0swaps ... to: 0.78user 0.03system 0:00.81elapsed 100%CPU (0avgtext+0avgdata 284696maxresident)k 0inputs+0outputs (0major+44892minor)pagefaults 0swaps The runtime actually increases slightly from: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 828.8 ms ± 5.0 ms [User: 797.2 ms, System: 31.6 ms] Range (min … max): 824.1 ms … 838.9 ms 10 runs ... to: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 847.6 ms ± 3.4 ms [User: 807.9 ms, System: 39.6 ms] Range (min … max): 843.4 ms … 854.3 ms 10 runs Why is that? In the Chromium repo, ca. 44000 free(3) calls in create_or_update_name() release almost 1GB, while in the Linux repo 240000+ calls release a bit more than 5MB, so the average discarded name is ca. 1000x longer in the latter. Overall I think it's the right tradeoff to make, as it helps curb the memory usage in repositories with big discarded names, and the added overhead is small. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	977dc1912b	name-rev: generate name strings only if they are better Leave setting the tip_name member of struct rev_name to callers of create_or_update_name(). This avoids allocations for names that are rejected by that function. Here's how this affects the runtime when working with a fresh clone of Git's own repository; performance numbers by hyperfine before: Benchmark #1: ./git -C ../git-pristine/ name-rev --all Time (mean ± σ): 437.8 ms ± 4.0 ms [User: 422.5 ms, System: 15.2 ms] Range (min … max): 432.8 ms … 446.3 ms 10 runs ... and with this patch: Benchmark #1: ./git -C ../git-pristine/ name-rev --all Time (mean ± σ): 408.5 ms ± 1.4 ms [User: 387.2 ms, System: 21.2 ms] Range (min … max): 407.1 ms … 411.7 ms 10 runs Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	1c56fc2084	name-rev: pre-size buffer in get_parent_name() We can calculate the size of new name easily and precisely. Open-code the xstrfmt() calls and grow the buffers as needed before filling them. This provides a surprisingly large benefit when working with the Chromium repository; here are the numbers measured using hyperfine before: Benchmark #1: ./git -C ../chromium/src name-rev --all Time (mean ± σ): 5.822 s ± 0.013 s [User: 5.304 s, System: 0.516 s] Range (min … max): 5.803 s … 5.837 s 10 runs ... and with this patch: Benchmark #1: ./git -C ../chromium/src name-rev --all Time (mean ± σ): 1.527 s ± 0.003 s [User: 1.015 s, System: 0.511 s] Range (min … max): 1.524 s … 1.535 s 10 runs Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	ddc42ec786	name-rev: factor out get_parent_name() Reduce nesting by moving code to come up with a name for the parent into its own function. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	f13ca7cef5	name-rev: put struct rev_name into commit slab The commit slab commit_rev_name contains a pointer to a struct rev_name, and the actual struct is allocated separately. Avoid that allocation and pointer indirection by storing the full struct in the commit slab. Use the tip_name member pointer to determine if the returned struct is initialized. Performance in the Linux repository measured with hyperfine before: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 953.5 ms ± 6.3 ms [User: 901.2 ms, System: 52.1 ms] Range (min … max): 945.2 ms … 968.5 ms 10 runs ... and with this patch: Benchmark #1: ./git -C ../linux/ name-rev --all Time (mean ± σ): 851.0 ms ± 3.1 ms [User: 807.4 ms, System: 43.6 ms] Range (min … max): 846.7 ms … 857.0 ms 10 runs Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	d689d6d82f	name-rev: don't _peek() in create_or_update_name() Look up the commit slab slot for the commit once using commit_rev_name_at() and populate it in case it is empty, instead of checking for emptiness in a separate step using commit_rev_name_peek() via get_commit_rev_name(). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	15a4205d96	name-rev: don't leak path copy in name_ref() name_ref() duplicates the path string and passes it to name_rev(), which either puts it into a commit slab or ignores it if there is already a better name, leaking it. Move the duplication to name_rev() and release the copy in the latter case. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	36d2419c9a	name-rev: respect const qualifier Keep the const qualifier of the first parameter of get_rev_name() even when casting the object pointer to a commit pointer, and further for the parameter of get_commit_rev_name(), as all these uses are read-only. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
René Scharfe	71620ca86c	name-rev: remove unused typedef The type alias became unused with `bf43abc6e6` (name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation, 2019-11-12); remove it. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:24:15 -08:00
Martin Ågren	3e2feb0d64	name-rev: rewrite create_or_update_name() This code was moved straight out of name_rev(). As such, we inherited the "goto" to jump from an if into an else-if. We also inherited the fact that "nothing to do -- return NULL" is handled last. Rewrite the function to first handle the "nothing to do" case. Then we can handle the conditional allocation early before going on to populate the struct. No need for goto-ing. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-05 10:23:42 -08:00
Jeff King	a21781011f	index-pack: downgrade twice-resolved REF_DELTA to die() When we're resolving a REF_DELTA, we compare-and-swap its type from REF_DELTA to whatever real type the base object has, as discussed in `ab791dd138` (index-pack: fix race condition with duplicate bases, 2014-08-29). If the old type wasn't a REF_DELTA, we consider that a BUG(). But as discussed in that commit, we might see this case whenever we try to resolve an object twice, which may happen because we have multiple copies of the base object. So this isn't a bug at all, but rather a sign that the input pack is broken. And indeed, this case is triggered already in t5309.5 and t5309.6, which create packs with delta cycles and duplicate bases. But we never noticed because those tests are marked expect_failure. Those tests were added by `b2ef3d9ebb` (test index-pack on packs with recoverable delta cycles, 2013-08-23), which was leaving the door open for cases that we theoretically _could_ handle. And when we see an already-resolved object like this, in theory we could keep going after confirming that the previously resolved child->real_type matches base->obj->real_type. But: - enforcing the "only resolve once" rule here saves us from an infinite loop in other parts of the code. If we keep going, then the delta cycle in t5309.5 causes us to loop infinitely, as find_ref_delta_children() doesn't realize which objects have already been resolved. So there would be more changes needed to make this case work, and in the meantime we'd be worse off. - any pack that triggers this is broken anyway. It either has a duplicate base object, or it has a cycle which causes us to bring in a duplicate via --fix-thin. In either case, we'd end up rejecting the pack in write_idx_file(), which also detects duplicates. So the tests have little value in documenting what we _could_ be doing (and have been neglected for 6+ years). 
Let's switch them to confirming that we handle this case cleanly (and switch out the BUG() for a more informative die() so that we do so). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 13:19:11 -08:00
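The "only resolve once" rule from the index-pack commit above can be sketched as a compare-and-swap on the entry's type. This is a simplified, single-threaded Python model, not Git's implementation: the dictionary layout, `BrokenPackError`, and the error text are hypothetical, and the real code performs the swap atomically across threads.

```python
REF_DELTA = "ref-delta"


class BrokenPackError(Exception):
    """Hypothetical error standing in for the die() described above."""


def compare_and_swap_type(entry, old, new):
    # Replace entry["type"] with `new` only if it currently equals `old`;
    # report whether the swap happened.
    if entry["type"] == old:
        entry["type"] = new
        return True
    return False


def resolve_delta(entry, base_type):
    # An entry that is no longer a REF_DELTA was already resolved, which
    # (per the message) indicates a broken input pack rather than a bug.
    if not compare_and_swap_type(entry, REF_DELTA, base_type):
        raise BrokenPackError("REF_DELTA was resolved twice; input pack is broken")


entry = {"type": REF_DELTA}
resolve_delta(entry, "blob")      # first resolution succeeds
try:
    resolve_delta(entry, "blob")  # second resolution is rejected
    second_attempt = "accepted"
except BrokenPackError:
    second_attempt = "rejected"
print(entry["type"], second_attempt)
```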
Johan Herland	dbc27477ff	notes.c: fix off-by-one error when decreasing notes fanout As noted in the previous commit, the nature of the fanout heuristic in the notes code causes the exact point at which we increase or decrease the notes fanout to vary with the objects being annotated. Since the object ids generated by the test environment are deterministic (by design), the notes generated and tested by t3305 are always the same, and we therefore happen to see the same fanout behavior from one run to the next. Coincidentally, if we were to change the test environment slightly (say by making a test commit on an unrelated branch before we start the t3305 test proper), we not only see the fanout switch happen at different points, we also manage to trigger a _bug_ in the notes code where the fanout 1 -> 0 switch is not applied uniformly across the notes tree, but instead yields a notes tree like this: ... bdeafb301e44b0e4db0f738a2d2a7beefdb70b70 bff2d39b4f7122bd4c5caee3de353a774d1e632a d3/8ec8f851adf470131178085bfbaab4b12ad2a7 e0b173960431a3e692ae929736df3c9b73a11d5b eb3c3aede523d729990ac25c62a93eb47c21e2e3 ... The bug occurs when we are writing out a notes tree with a newly decreased fanout, and the notes tree contains unexpanded subtrees that should be consolidated into the parent tree as a consequence of the decreased fanout: Subtrees that happen to sit at an _even_ level in the internal notes 16-tree structure (in other words: subtrees whose path - "d3" in the example above - is unique in the first nibble - i.e. there are no other note paths that start with "d") are _not_ unpacked as part of the tree writeout. This error will repeat itself in subsequent note trees until the subtree is forced to be unpacked. In t3305 this only happens when the d38ec8f8 note is itself removed from the tree. 
The error is not severe (no information is lost, and the notes code is able to read/decode this tree and manipulate it correctly), but this is nonetheless a bug in the current implementation that should be fixed. That said, fixing the off-by-one error is not without complications: We must take into account that the load_subtree() call from for_each_note_helper() (that is now done to correctly unpack the subtree while we're writing out the notes tree) may end up inserting unpacked non-notes into the linked list of non_note entries held by the struct notes_tree. Since we are in the process of writing out the notes tree, this linked list is currently in the process of being traversed by write_each_non_note_until(). The unpacked non-notes are necessarily inserted between the last non-note we wrote out, and the next non-note to be written. Hence, we cannot simply hold the next_non_note to write in struct write_each_note_data (as we would then silently skip these newly inserted notes), but must instead always follow the ->next pointer from the last non-note we wrote. (This part was caught by an existing test in t3304.) Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de> Cc: Brian M. Carlson <sandals@crustytoothpaste.net> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 12:20:43 -08:00
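The tree listing in the notes commit above shows what a mixed fanout looks like: one entry is stored as "d3/8ec8…" (fanout 1) while its neighbours are stored as full 40-character names (fanout 0). A minimal Python sketch of that path layout follows; `note_path` and `has_uniform_fanout` are hypothetical helper names, and the assumption that each fanout level consumes two hex digits is read off the example, not taken from the notes code.

```python
def note_path(hex_oid, fanout):
    # Each fanout level is assumed to peel two hex digits off the front
    # of the object name and turn them into a directory component.
    parts = [hex_oid[2 * i:2 * i + 2] for i in range(fanout)]
    parts.append(hex_oid[2 * fanout:])
    return "/".join(parts)


def has_uniform_fanout(paths, fanout):
    # A notes tree is uniform when every path has exactly `fanout` slashes.
    return all(p.count("/") == fanout for p in paths)


oid = "d38ec8f851adf470131178085bfbaab4b12ad2a7"
print(note_path(oid, 0))
print(note_path(oid, 1))

# The tree from the commit message: one subtree was not consolidated.
buggy_tree = [
    "bdeafb301e44b0e4db0f738a2d2a7beefdb70b70",
    "bff2d39b4f7122bd4c5caee3de353a774d1e632a",
    "d3/8ec8f851adf470131178085bfbaab4b12ad2a7",
    "e0b173960431a3e692ae929736df3c9b73a11d5b",
    "eb3c3aede523d729990ac25c62a93eb47c21e2e3",
]
print(has_uniform_fanout(buggy_tree, 0))
```

A per-note check of this kind is the sort of assertion the strengthened t3305 test performs after each added or removed note.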
Johan Herland	e1c5253951	t3305: check notes fanout more carefully and robustly In short, before this patch, this test script: - creates many notes - verifies that all notes in the notes tree have a fanout of 1 - removes most notes - verifies that the notes in the notes tree now have a fanout of 0 The fanout verification only happened twice: after creating all the notes, and after removing most of them. This patch strengthens the test by checking the fanout after _each_ added/removed note: We assert that the switch from fanout 0 -> 1 happens exactly once while adding notes (and that the switch pervades the entire notes tree). Likewise, we assert that the switch from fanout 1 -> 0 happens exactly once while removing notes. Additionally, we decrease the number of notes left after removal, from 50 to 15 notes, in order to ensure that fanout 1 -> 0 transition keeps happening regardless of external factors[1]. [1]: Currently (with the SHA1 hash function and the deterministic object ids of the test environment) the fanout heuristic in the notes code happens to switch from 0 -> 1 at 109 notes, and from 1 -> 0 at 59 notes. However, changing the hash function or other external factors will vary these numbers, and the latter may - in theory - go as low as 15. For more details, please see the discussion at https://public-inbox.org/git/20200125230035.136348-4-sandals@crustytoothpaste.net/ Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de> Cc: Brian M. Carlson <sandals@crustytoothpaste.net> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 12:20:27 -08:00
Martin Ågren	e469afe158	git-filter-branch.txt: wrap "maths" notation in backticks In this paragraph, we have a few instances of the '^' character, which we give as "\^". This renders well with AsciiDoc ("^"), but Asciidoctor renders it literally as "\^". Dropping the backslashes renders fine with Asciidoctor, but not AsciiDoc... An earlier version of this patch used "{caret}" instead of "^", which avoided these escaping problems. The rendering was still so-so, though -- these expressions end up set as normal text, similarly to when one provides, e.g., computer code in the middle of running text, without properly marking it with `backticks` to be monospaced. As noted by Jeff King, this suggests actually wrapping these expressions in backticks, setting them in monospace. The lone "5" could be left as is or wrapped as `5`. Spell it out as "five" instead -- this generally looks better anyway for small numbers in the middle of text like this. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 12:17:18 -08:00
Taylor Blau	a7df60cac8	commit-graph.h: use odb in 'load_commit_graph_one_fd_st' Apply a similar treatment as in the previous patch to pass a 'struct object_directory *' through the 'load_commit_graph_one_fd_st' initializer, too. This prevents a potential bug where a pointer comparison is made to a NULL 'g->odb', which would cause the commit-graph machinery to think that a pair of commit-graphs belonged to different alternates when in fact they do not (i.e., in the case of no '--object-dir'). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:51 -08:00
Taylor Blau	ad2dd5bb63	commit-graph.c: remove path normalization, comparison As of the previous patch, all calls to 'commit-graph.c' functions which perform path normalization (e.g., 'get_commit_graph_filename()') are of the form 'ctx->odb->path', which is always in normalized form. Now that there are no callers passing non-normalized paths to these functions, ensure that future callers are bound by the same restrictions by making these functions take a 'struct object_directory *' instead of a 'const char *'. To match, replace all calls with arguments of the form 'ctx->odb->path' with 'ctx->odb'. To recover the path, functions that perform path manipulation simply use 'odb->path'. Further, avoid string comparisons with arguments of the form 'odb->path', and instead prefer raw pointer comparisons, which accomplish the same effect, but are far less brittle. This has a pleasant side-effect of making these functions much more robust to paths that cannot be normalized by 'normalize_path_copy()', i.e., because they are outside of the current working directory. For example, prior to this patch, Valgrind reports an uninitialized memory read for the following command [1]: $ ( cd t && GIT_DIR=../.git valgrind git rev-parse HEAD^ ) because 'normalize_path_copy()' can't normalize '../.git' (since it's relative to but above the current working directory) [2]. By using a 'struct object_directory *' directly, 'get_commit_graph_filename()' does not need to normalize, because all paths are relative to the current working directory since they are always read from the '->path' of an object directory. [1]: https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net. [2]: The bug here is that 'get_commit_graph_filename()' returns the result of 'normalize_path_copy()' without checking the return value. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:51 -08:00
Taylor Blau	13c2499249	commit-graph.h: store object directory in 'struct commit_graph' In a previous patch, the 'char *object_dir' in 'struct commit_graph' was replaced with a 'struct object_directory'. This patch applies the same treatment to 'struct commit_graph', which is another intermediate step towards getting rid of all path normalization in 'commit-graph.c'. Instead of taking a 'char *object_dir', functions that construct a 'struct commit_graph' now take a 'struct object_directory *'. Any code that needs an object directory path uses '->path' instead. This ensures that all calls to functions that perform path normalization are given arguments which do not themselves require normalization. This prepares those functions to drop their normalization entirely, which will occur in the subsequent patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:51 -08:00
Taylor Blau	0bd52e27e3	commit-graph.h: store an odb in 'struct write_commit_graph_context' There are lots of places in 'commit-graph.h' where a function either has (or almost has) a full 'struct object_directory *', accesses '->path', and then throws away the rest of the struct. This can cause headaches when comparing the locations of object directories across alternates (e.g., in the case of deciding if two commit-graph layers can be merged). These paths are normalized with 'normalize_path_copy()' which mitigates some comparison issues, but not all [1]. Replace usage of 'char *object_dir' with 'odb->path' by storing a 'struct object_directory *' in the 'write_commit_graph_context' structure. This is an intermediate step towards getting rid of all path normalization in 'commit-graph.c'. Resolving a user-provided '--object-dir' argument now requires that we compare it to the known alternates for equality. Prior to this patch, an unknown '--object-dir' argument would silently exit with status zero. This can clearly lead to unintended behavior, such as verifying commit-graphs that aren't in a repository's own object store (or one of its alternates), or causing a typo to mask a legitimate commit-graph verification failure. Make this error non-silent by 'die()'-ing when the given '--object-dir' does not match any known alternate object store. [1]: In my testing, for example, I can get one side of the commit-graph code to fill object_dir with "./objects" and the other with just "objects". Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:37 -08:00
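The commit-graph series above makes two points: raw path strings such as "./objects" and "objects" can name the same directory yet compare unequal, and an '--object-dir' value that matches no known object store should be an error instead of a silent success. A minimal Python sketch of both ideas follows. `ObjectDirectory`, `find_odb`, and the error text are hypothetical, and `os.path.normpath` merely stands in for whatever normalization the real code applies.

```python
import os


class ObjectDirectory:
    """Hypothetical stand-in for 'struct object_directory'."""

    def __init__(self, path):
        self.path = os.path.normpath(path)


def find_odb(known, user_path):
    # Resolve a user-supplied path to one of the known object directories,
    # failing loudly when nothing matches.
    wanted = os.path.normpath(user_path)
    for odb in known:
        if odb.path == wanted:
            return odb
    raise ValueError(f"no known object directory matches {user_path!r}")


known = [ObjectDirectory("objects"), ObjectDirectory("alternate/objects")]

a = find_odb(known, "./objects")
b = find_odb(known, "objects")
print("./objects" == "objects")  # string comparison of the raw arguments
print(a is b)                    # identity comparison after resolution

try:
    find_odb(known, "objcts")    # a typo no longer passes silently
    typo_result = "accepted"
except ValueError:
    typo_result = "rejected"
print(typo_result)
```

Once every caller holds the resolved object, later code can compare identities and never needs to normalize or compare path strings again.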
Peter Kaestle	f38c92452d	t7400: testcase for submodule status on unregistered inner git repos We have test coverage for "git submodule status" output in various cases, i.e. 1) not-init, not-cloned: status should initially be "missing" 2) init, not-cloned: status should be "missing" 3) not-init, cloned: status should ignore the inner git-repo 4) init, cloned: status should be "up-to-date" after update 4.1) + modified: status should be "modified" after submodule commit 4.2) + modified, committed: status should be "up-to-date" after update. Case 3) is not covered yet. Test that submodule status reports an inner git repo as unknown, while it is not added to the superproject. This covers case (3). Signed-off-by: Peter Kaestle <peter@piie.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:26:22 -08:00
Jeff King	5290d45134	tree-walk.c: break circular dependency with unpack-trees The unpack-trees API depends on the tree-walk API. But we've recently introduced a dependency in tree-walk.c on MAX_UNPACK_TREES, which doesn't otherwise care about unpack-trees at all. Let's break that dependency by reversing the constants: we'll introduce a new MAX_TRAVERSE_TREES which belongs to the tree-walk API. And then we can define MAX_UNPACK_TREES in terms of that (since unpack-trees cannot possibly work with more trees than it can traverse at once via tree-walk). The value for both will remain at 8. This is somewhat arbitrary and probably more than is necessary, per `ca885a4fe6` (read-tree() and unpack_trees(): use consistent limit, 2008-03-13), but there's not really any pressing need to reduce it. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 10:32:15 -08:00
Derrick Stolee	f998a3f1e5	sparse-checkout: fix cone mode behavior mismatch The intention of the special "cone mode" in the sparse-checkout feature is to always match the same patterns that are matched by the same sparse-checkout file as when cone mode is disabled. When a file path is given to "git sparse-checkout set" in cone mode, then the cone mode improperly matches the file as a recursive path. When setting the skip-worktree bits, files were not expecting the MATCHED_RECURSIVE response, and hence these were left out of the matched cone. Fix this bug by checking for MATCHED_RECURSIVE in addition to MATCHED and add a test that prevents regression. Reported-by: Finn Bryant <finnbryant@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Derrick Stolee	d2e65f4c90	sparse-checkout: improve docs around 'set' in cone mode The existing documentation does not clarify how the 'set' subcommand changes when core.sparseCheckoutCone is enabled. Correct this by changing some language around the "A/B/C" example. Also include a description of the input format matching the output of 'git ls-tree --name-only'. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Derrick Stolee	e53ffe2704	sparse-checkout: escape all glob characters on write The sparse-checkout patterns allow special globs according to fnmatch(3). When writing cone-mode patterns for paths containing these characters, they must be escaped. Use is_glob_special() to check which characters must be escaped this way, and add a path to the tests that contains all glob characters at once. Note that ']' is not special, since the initial bracket '[' is escaped. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
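A minimal sketch of the escaping step described in the commit above, assuming the glob-special set is '*', '?', '[' and '\'. The helpers `is_glob_special_sketch()` and `escape_glob_sketch()` are hypothetical names invented for this example; they are not Git's `is_glob_special()` or its pattern-writing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed glob-special set for this sketch: '*', '?', '[' and '\'. */
static int is_glob_special_sketch(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

/*
 * Copy 'path' into 'out', putting a backslash before every glob-special
 * character. Returns 0 on success, -1 if 'out' is too small.
 */
static int escape_glob_sketch(const char *path, char *out, size_t out_size)
{
	size_t n = 0;

	assert(path && out);
	if (out_size == 0)
		return -1;
	for (; *path; path++) {
		if (is_glob_special_sketch(*path)) {
			if (n + 1 >= out_size)
				return -1;
			out[n++] = '\\';
		}
		if (n + 1 >= out_size)
			return -1;
		out[n++] = *path;
	}
	out[n] = '\0';
	return 0;
}
```

Note that ']' passes through unchanged, matching the commit's remark that it is not special once the opening '[' has been escaped.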
Derrick Stolee	e55682ea26	sparse-checkout: use C-style quotes in 'list' subcommand When in cone mode, the 'git sparse-checkout list' subcommand lists the directories included in the sparse cone. When these directories contain odd characters, such as a backslash, then we need to use C-style quotes similar to 'git ls-tree'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Derrick Stolee	bd64de42de	sparse-checkout: unquote C-style strings over --stdin If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. Even more specifically, the goal is to always allow the following from the root of a repo: git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin The ls-tree command provides directory names with an unescaped asterisk. It also quotes the directories that contain an escaped backslash. We must remove these quotes, then keep the escaped backslashes. Use unquote_c_style() when parsing lines from stdin. Command-line arguments will be parsed as-is, assuming the user can do the correct level of escaping from their environment to match the exact directory names. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Derrick Stolee	d585f0e799	sparse-checkout: write escaped patterns in cone mode If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. However, there is some care needed for the timing of these escapes. The in-memory pattern list is used to update the working directory before writing the patterns to disk. Thus, we need the command to have the unescaped names in the hashsets for the cone comparisons, then escape the patterns later. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Derrick Stolee	4f52c2ce6c	sparse-checkout: properly match escaped characters In cone mode, the sparse-checkout feature uses hashset containment queries to match paths. Make this algorithm respect escaped asterisk (*) and backslash (\) characters. Create dup_and_filter_pattern() method to convert a pattern by removing escape characters and dropping an optional "/*" at the end. This method is available in dir.h as we will use it in builtin/sparse-checkout.c in a later change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
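The pattern filtering described in the commit above might look something like the sketch below. It assumes that a backslash escapes exactly the next character and that the optional suffix to drop is a trailing, unescaped "/*". `filter_pattern_sketch()` is a hypothetical name for illustration and should not be read as Git's `dup_and_filter_pattern()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of 'pattern' with escape backslashes
 * removed and an optional trailing "/*" dropped. The caller frees the
 * result. Returns NULL if allocation fails.
 */
static char *filter_pattern_sketch(const char *pattern)
{
	size_t len, n = 0, i;
	int last_escaped = 0;
	char *out;

	assert(pattern);
	len = strlen(pattern);
	out = malloc(len + 1);
	if (!out)
		return NULL;

	for (i = 0; i < len; i++) {
		last_escaped = 0;
		/* A backslash escapes the next character: keep only that one. */
		if (pattern[i] == '\\' && i + 1 < len) {
			i++;
			last_escaped = 1;
		}
		out[n++] = pattern[i];
	}
	out[n] = '\0';

	/* Drop a trailing "/*" unless that asterisk was an escaped literal. */
	if (n >= 2 && !last_escaped && out[n - 2] == '/' && out[n - 1] == '*')
		out[n - 2] = '\0';
	return out;
}
```

The filtered string is the literal directory name, which is the form a hashset containment query would compare against.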
Derrick Stolee	9abc60f801	sparse-checkout: warn on globs in cone patterns In cone mode, the sparse-checkout command will write patterns that allow faster pattern matching. This matching only works if the patterns in the sparse-checkout file are those written by that command. Users can edit the sparse-checkout file and create patterns that cause the cone mode matching to fail. The cone mode patterns may end in "/*" but otherwise an un-escaped asterisk or other glob character is invalid. Add checks to disable cone mode when seeing these values. A later change will properly handle escaped globs. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:05:29 -08:00
Junio C Hamano	145136a95a	C: use skip_prefix() to avoid hardcoded string length We often skip an optional prefix in a string with a hardcoded constant, e.g. if (starts_with(string, "prefix")) string += 6; which is less error prone when written skip_prefix(string, "prefix", &string); Note that this changes a few error messages from "git reflog expire --expire=nonsense.timestamp", which used to complain by saying '--expire=nonsense.timestamp' is not a valid timestamp but with this change, we say 'nonsense.timestamp' is not a valid timestamp which is more technically correct (the string with --expire= as a prefix obviously cannot be a valid timestamp, but the error is about the part of the input without that prefix). Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 13:03:45 -08:00
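To illustrate why the helper form in the commit above is less error prone than a hardcoded length, here is a standalone sketch. `skip_prefix_sketch()` is a hypothetical stand-in written for this example; it follows the calling convention shown in the commit message (string, prefix, out pointer) but is not Git's own implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * If 'str' begins with 'prefix', store a pointer to the remainder in
 * '*out' and return 1. Otherwise leave '*out' untouched and return 0.
 * The prefix length is computed here, so callers never hardcode it.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len;

	assert(str && prefix && out);
	len = strlen(prefix);
	if (strncmp(str, prefix, len) != 0)
		return 0;
	*out = str + len;
	return 1;
}
```

Called on "--expire=nonsense.timestamp" with the prefix "--expire=", the out pointer ends up at "nonsense.timestamp", which is the part the reworded error message reports.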
Kyle Meyer	04e5b3f0b4	submodule foreach: replace $path with $sm_path in example `f0fd0dc5c5` (submodule foreach: document '$sm_path' instead of '$path', 2018-05-08) updated the documentation to advise callers to favor $sm_path over the deprecated synonym $path. However, the example in that section still uses $path. Update it to use $sm_path. Signed-off-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 12:57:18 -08:00
Taylor Blau	1793280e91	t5318: don't pass non-object directory to '--object-dir' In `f237c8b6fe` (commit-graph: implement git-commit-graph write, 2018-04-02) the test t5318.3 was introduced to ensure that calling 'git commit-graph write' in a repository with no packfiles does not write any commit-graph file(s). To exercise more paths in 'builtin/commit-graph.c', this test passes '--object-dir' to 'git commit-graph write', but the given argument refers to the working copy, not the object directory. Since the commit-graph sub-commands currently swallow these errors, this does not result in a test failure. But, it is only lucky that the test ends with no commit-graphs, since there were none to begin with. In preparation for a future commit where an '--object-dir' argument that does not match a known object directory will print out a failure, let's fix the test to still use '--object-dir', but pass the correct location to the object store instead of '.'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 12:47:40 -08:00
Jeff King	da8063522f	diff: move diff.wsErrorHighlight to "basic" config We parse diff.wsErrorHighlight in git_diff_ui_config(), meaning that it doesn't take effect for plumbing commands, only for porcelains like git-diff itself. This is mildly annoying as it means scripts like add--interactive, which produce a user-visible diff with color, don't respect the option. We could teach that script to parse the config and pass it along as --ws-error-highlight to the diff plumbing. But there's a simpler solution. It should be reasonably safe for plumbing to respect this option, as it only kicks in when color is otherwise enabled. And anybody parsing colorized output must already deal with the fact that color.diff.* may change the exact output they see; those options have been part of git_diff_basic_config() since its inception in `9a1805a872` (add a "basic" diff config callback, 2008-01-04). So we can just move it to the "basic" config, which fixes add--interactive, along with any other script in the same boat, with a very low risk of hurting any plumbing users. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 12:01:19 -08:00
Matheus Tavares	b98d188581	sha1-file: allow check_object_signature() to handle any repo Some callers of check_object_signature() can work on arbitrary repositories, but the repo does not get passed to this function. Instead, the_repository is always used internally. To fix possible inconsistencies, allow the function to receive a struct repository and make those callers pass on the repo being handled. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 10:45:39 -08:00
Matheus Tavares	2dcde20e1c	sha1-file: pass git_hash_algo to hash_object_file() Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo. For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file(). This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 10:45:39 -08:00
Matheus Tavares	7ad5c44d9c	sha1-file: pass git_hash_algo to write_object_file_prepare() Allow write_object_file_prepare() to receive arbitrary 'struct git_hash_algo's instead of always using the_hash_algo. The added parameter will be used in the next commit to make hash_object_file() be able to work with arbitrary git_hash_algo's, as well. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 10:45:39 -08:00

58497 commits