development/git - HydraGit

mirror of https://github.com/git/git synced 2024-08-27 11:39:22 +00:00

Author	SHA1	Message	Date
Elijah Newren	ec2101abf3	treewide: add direct includes currently only pulled in transitively The next commit will remove a bunch of unnecessary includes, but to do so, we need some of the lower level direct includes that files rely on to be explicitly specified. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	0a4d5b9772	trace2/tr2_tls.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	e9bb166491	submodule-config.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	545f7b50e8	pkt-line.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	a28fe2d901	line-log.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	f25e65e0fe	http.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	31d20faa90	fsmonitor--daemon.h: remove unnecessary includes The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	bd6cc1d9ec	blame.h: remove unnecessary includes The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	c2c4138c07	archive.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:31 -08:00
Elijah Newren	eea0e59ffb	treewide: remove unnecessary includes in source files Each of these were checked with gcc -E -I. ${SOURCE_FILE} \| grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:31 -08:00
Elijah Newren	147438e8a0	treewide: remove unnecessary includes from header files There are three kinds of unnecessary includes: * includes which aren't directly needed, but which include some other forgotten include * includes which could be replaced by a simple forward declaration of some structs * includes which aren't needed at all Remove the third kind of include. Subsequent commits (and a subsequent series) will work on removing some of the other kinds of includes. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:31 -08:00
René Scharfe	5b7eec4bc5	fast-import: use mem_pool_calloc() Use mem_pool_calloc() to get a zeroed buffer instead of zeroing it ourselves. This makes the code clearer and less repetitive. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 11:06:23 -08:00
Jeff King	f546151228	t1006: add tests for %(objectsize:disk) Back when we added this placeholder in `a4ac106178` (cat-file: add %(objectsize:disk) format atom, 2013-07-10), there were no tests, claiming "[...]the exact numbers returned are volatile and subject to zlib and packing decisions". But we can use a little shell hackery to get the expected numbers ourselves. To a certain degree this is just re-implementing what Git is doing under the hood, but it is still worth doing. It makes sure we exercise the %(objectsize:disk) code at all, and having the two implementations agree gives us more confidence. Note that our shell code assumes that no object appears twice (either in two packs, or as both loose and packed), as then the results really are undefined. That's OK for our purposes, and the test will notice if that assumption is violated (the shell version would produce duplicate lines that Git's output does not have). Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-21 10:37:46 -08:00
Junio C Hamano	d6b6cd1393	archive: "--list" does not take further options "git archive --list blah" should notice an extra command line parameter that goes unused. Make it so. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-21 10:33:09 -08:00
Michael Lohmann	63956c553d	Documentation/git-merge.txt: use backticks for command wrapping As René found in the guidance from CodingGuidelines: Literal examples (e.g. use of command-line options, command names, branch names, URLs, pathnames (files and directories), configuration and environment variables) must be typeset in monospace (i.e. wrapped with backticks) So all instances of single and double quotes for wraping said examples were replaced with simple backticks. Suggested-by: René Scharfe <l.s.r@web.de> Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-20 13:40:01 -08:00
Michael Lohmann	dc18ead555	Documentation/git-merge.txt: fix reference to synopsis `437591a9d7` combined the synopsis of "The second syntax" (meaning `git merge --abort`) and "The third syntax" (for `git merge --continue`) into this single line: git merge (--continue \| --abort \| --quit) but it was still referred to when describing the preconditions that have to be fulfilled to run the respective actions. In other words: References by number are no longer valid after a merge of some of the synopses. Also the previous version of the documentation did not acknowledge that `--no-commit` would result in the precondition being fulfilled (thanks to Elijah Newren and Junio C Hamano for pointing that out). This change also groups `--abort` and `--continue` together when explaining the prerequisites in order to avoid duplication. Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-20 13:39:56 -08:00
Linus Arver	de7c27a186	trailer: use offsets for trailer_start/trailer_end Previously these fields in the trailer_info struct were of type "const char *" and pointed to positions in the input string directly (to the start and end positions of the trailer block). Use offsets to make the intended usage less ambiguous. We only need to reference the input string in format_trailer_info(), so update that function to take a pointer to the input. While we're at it, rename trailer_start to trailer_block_start to be more explicit about these offsets (that they are for the entire trailer block including other trailers). Ditto for trailer_end. Reported-by: Glen Choo <glencbz@gmail.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-20 11:55:04 -08:00
Linus Arver	97e9d0b78a	trailer: find the end of the log message Previously, trailer_info_get() computed the trailer block end position by (1) checking for the opts->no_divider flag and optionally calling find_patch_start() to find the "patch start" location (patch_start), and (2) calling find_trailer_end() to find the end of the trailer block using patch_start as a guide, saving the return value into "trailer_end". The logic in (1) was awkward because the variable "patch_start" is misleading if there is no patch in the input. The logic in (2) was misleading because it could be the case that no trailers are in the input (yet we are setting a "trailer_end" variable before even searching for trailers, which happens later in find_trailer_start()). The name "find_trailer_end" was misleading because that function did not look for any trailer block itself --- instead it just computed the end position of the log message in the input where the end of the trailer block (if it exists) would be (because trailer blocks must always come after the end of the log message). Combine the logic in (1) and (2) together into find_patch_start() by renaming it to find_end_of_log_message(). The end of the log message is the starting point which find_trailer_start() needs to start searching backward to parse individual trailers (if any). Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Arver <linusa@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-20 11:55:04 -08:00
René Scharfe	45184afb4d	rebase: use strvec_pushf() for format-patch revisions In run_am(), a strbuf is used to create a revision argument that is then added to the argument list for git format-patch using strvec_push(). Use strvec_pushf() to add it directly instead, simplifying the code and plugging a small leak on the error code path. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-20 09:26:58 -08:00
Stan Hu	44dbb3bf29	completion: support pseudoref existence checks for reftables In contrib/completion/git-completion.bash, there are a bunch of instances where we read pseudorefs, such as HEAD, MERGE_HEAD, REVERT_HEAD, and others via the filesystem. However, the upcoming reftable refs backend won't use '.git/HEAD' at all but instead will write an invalid refname as placeholder for backwards compatibility, which will break the git-completion script. Update the '__git_pseudoref_exists' function to: 1. Recognize the placeholder '.git/HEAD' written by the reftable backend (its content is specified in the reftable specs). 2. If reftable is in use, use 'git rev-parse' to determine whether the given ref exists. 3. Otherwise, continue to use 'test -f' to check for the ref's filename. Signed-off-by: Stan Hu <stanhu@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-19 15:11:58 -08:00
Stan Hu	666270a2df	completion: refactor existence checks for pseudorefs In preparation for the reftable backend, this commit introduces a '__git_pseudoref_exists' function that continues to use 'test -f' to determine whether a given pseudoref exists in the local filesystem. Signed-off-by: Stan Hu <stanhu@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-19 15:11:56 -08:00
Junio C Hamano	a762af3dfd	remote.h: retire CAS_OPT_NAME When the "--force-with-lease" option was introduced in `28f5d176` (remote.c: add command line option parser for "--force-with-lease", 2013-07-08), the design discussion revolved around the concept of "compare-and-swap", and it can still be seen in the name used for variables and helper functions. The end-user facing option name ended up to be a bit different, so during the development iteration of the feature, we used this C preprocessor macro to make it easier to rename it later. All of that happened more than 10 years ago, and the flexibility afforded by the CAS_OPT_NAME macro outlived its usefulness. Inline the constant string for the option name, like all other option names in the code. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-19 11:27:04 -08:00
Jiang Xin	7033d5479b	pkt-line: do not chomp newlines for sideband messages When calling "packet_read_with_status()" to parse pkt-line encoded packets, we can turn on the flag "PACKET_READ_CHOMP_NEWLINE" to chomp newline character for each packet for better line matching. But when receiving data and progress information using sideband, we should turn off the flag "PACKET_READ_CHOMP_NEWLINE" to prevent mangling newline characters from data and progress information. When both the server and the client support "sideband-all" capability, we have a dilemma that newline characters in negotiation packets should be removed, but the newline characters in the progress information should be left intact. Add new flag "PACKET_READ_USE_SIDEBAND" for "packet_read_with_status()" to prevent mangling newline characters in sideband messages. Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 13:24:38 -08:00
Jiang Xin	64220dc5f7	pkt-line: memorize sideband fragment in reader When we turn on the "use_sideband" field of the packet_reader, "packet_reader_read()" will call the function "demultiplex_sideband()" to parse and consume sideband messages. Sideband fragment which does not end with "\r" or "\n" will be saved in the sixth parameter "scratch" and it can be reused and be concatenated when parsing another sideband message. In "packet_reader_read()" function, the local variable "scratch" can only be reused by subsequent sideband messages. But if there is a payload message between two sideband fragments, the first fragment which is saved in the local variable "scratch" will be lost. To solve this problem, we can add a new field "scratch" in packet_reader to memorize the sideband fragment across different calls of "packet_reader_read()". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 13:24:37 -08:00
Jiang Xin	eaa82f8e98	test-pkt-line: add option parser for unpack-sideband We can use the test helper program "test-tool pkt-line" to test pkt-line related functions. E.g.: * Use "test-tool pkt-line send-split-sideband" to generate sideband messages. * Pipe these generated sideband messages to command "test-tool pkt-line unpack-sideband" to test packet_reader_read() function. In order to make a complete test of the packet_reader_read() function, add option parser for command "test-tool pkt-line unpack-sideband". * To remove newlines in sideband messages, we can use: $ test-tool pkt-line unpack-sideband --chomp-newline * To preserve newlines in sideband messages, we can use: $ test-tool pkt-line unpack-sideband --no-chomp-newline * To parse sideband messages using "demultiplex_sideband()" inside the function "packet_reader_read()", we can use: $ test-tool pkt-line unpack-sideband --reader-use-sideband We also add new example sideband packets in send_split_sideband() and add several new test cases in t0070. Among these test cases, we pipe output of the "send-split-sideband" subcommand to the "unpack-sideband" subcommand. We found two issues: 1. The two splitted sideband messages "Hello," and " world!\n" should be concatenated together. But when we turn on use_sideband field of reader to parse sideband messages, the first part of the splitted message ("Hello,") is lost. 2. The newline characters in sideband 2 (progress info) and sideband 3 (error message) should be preserved, but they are both trimmed. Will fix the above two issues in subsequent commits. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 13:24:37 -08:00
Junio C Hamano	6d6f1cd7ee	doc: format.notes specify a ref under refs/notes/ hierarchy There is no 'ref/notes/' hierarchy. '[format] notes = foo' uses notes that are found in 'refs/notes/foo'. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 11:30:46 -08:00
Shreyansh Paliwal	37e8d795be	test-lib-functions.sh: fix test_grep fail message wording In the recent commit `2e87fca189` (test framework: further deprecate test_i18ngrep, 2023-10-31), the test_i18ngrep function was deprecated, and all the callers were updated to call the test_grep function instead. But test_grep inherited an error message that still refers to test_i18ngrep by mistake. Correct it so that a broken call to the test_grep will identify itself as such. Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 10:44:41 -08:00
Jiang Xin	18ce48918c	fetch: no redundant error message for atomic fetch If an error occurs during an atomic fetch, a redundant error message will appear at the end of do_fetch(). It was introduced in `b3a804663c` (fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17). Because a failure message is displayed before setting retcode in the function do_fetch(), calling error() on the err message at the end of this function may result in redundant or empty error message to be displayed. We can remove the redundant error() function, because we know that the function ref_transaction_abort() never fails. While we can find a common pattern for calling ref_transaction_abort() by running command "git grep -A1 ref_transaction_abort", e.g.: if (ref_transaction_abort(transaction, &error)) error("abort: %s", error.buf); Following this pattern, we can tolerate the return value of the function ref_transaction_abort() being changed in the future. We also delay the output of the err message to the end of do_fetch() to reduce redundant code. With these changes, the test case "fetch porcelain output (atomic)" in t5574 will also be fixed. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 08:30:33 -08:00
Jiang Xin	97d82b2963	t5574: test porcelain output of atomic fetch The test case "fetch porcelain output" checks output of the fetch command. The error output must be empty with the follow assertion: test_must_be_empty stderr But this assertion fails if using atomic fetch. Refactor this test case to use different fetch options by splitting it into three test cases. 1. "setup for fetch porcelain output". 2. "fetch porcelain output": for non-atomic fetch. 3. "fetch porcelain output (atomic)": for atomic fetch. Add new command "test_commit ..." in the first test case, so that if we run these test cases individually (--run=4-6), "git rev-parse HEAD~" command will work properly. Run the above test cases, we can find that one test case has a known breakage, as shown below: ok 4 - setup for fetch porcelain output ok 5 - fetch porcelain output # TODO known breakage vanished not ok 6 - fetch porcelain output (atomic) # TODO known breakage The failed test case has an error message with only the error prompt but no message body, as follows: 'stderr' is not empty, it contains: error: In a later commit, we will fix this issue. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-18 08:30:32 -08:00
Junio C Hamano	bc62d27d5c	docs: MERGE_AUTOSTASH is not that special A handful of manual pages called MERGE_AUTOSTASH a "special ref", but there is nothing special about it. It merely is yet another pseudoref. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 14:08:28 -08:00
Junio C Hamano	dada38646a	docs: AUTO_MERGE is not that special A handful of manual pages called AUTO_MERGE a "special ref", but there is nothing special about it. It merely is yet another pseudoref. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 14:08:28 -08:00
Junio C Hamano	7122f4f747	refs.h: HEAD is not that special In-code comment explains pseudorefs but used a wrong nomenclature "special ref". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 14:08:28 -08:00
Junio C Hamano	2047b2c28c	git-bisect.txt: BISECT_HEAD is not that special The description of "git bisect --no-checkout" called BISECT_HEAD a "special ref", but there is nothing special about it. It merely is yet another pseudoref. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 14:08:28 -08:00
Junio C Hamano	d9a4bb3385	git.txt: HEAD is not that special The introductory text in "git help git" that describes HEAD called it "a special ref". It is special compared to the more regular refs like refs/heads/master and refs/tags/v1.0.0, but not that special, unlike truly special ones like FETCH_HEAD. Rewrite a few sentences to also introduce the distinction between a regular ref that contain the object name and a symbolic ref that contain the name of another ref. Update the description of HEAD that point at the current branch to use the more correct term, a "symbolic ref". This was found as part of auditing the documentation and in-code comments for uses of "special ref" that refer merely a "pseudo ref". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 14:08:21 -08:00
Eric Sunshine	68fcebfb1a	git-add.txt: add missing short option -A to synopsis With one exception, the synopsis for `git add` consistently lists the short counterpart alongside the long-form of each option (for instance, "[--edit \| -e]"). The exception is that -A is not mentioned alongside --all. Fix this inconsistency Reported-by: Benjamin Lehmann <ben.lehmann@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 13:01:51 -08:00
Patrick Steinhardt	647b5e0998	tests: adjust whitespace in chainlint expectations The "check-chainlint" target runs automatically when running tests and performs self-checks to verify that the chainlinter itself produces the expected output. Originally, the chainlinter was implemented via sed, but the infrastructure has been rewritten in `fb41727b7e` (t: retire unused chainlint.sed, 2022-09-01) to use a Perl script instead. The rewrite caused some slight whitespace changes in the output that are ultimately not of much importance. In order to be able to assert that the actual chainlinter errors match our expectations we thus have to ignore whitespace characters when diffing them. As the `-w` flag is not in POSIX we try to use `git diff -w --no-index` before we fall back to `diff -w -u`. To accomodate for cases where the host system has no Git installation we use the locally-compiled version of Git. This can result in problems though when the Git project's repository is using extensions that the locally-compiled version of Git doesn't understand. It will refuse to run and thus cause the checks to fail. Instead of improving the detection logic, fix our ".expect" files so that we do not need any post-processing at all anymore. This allows us to drop the `-w` flag when diffing so that we can always use diff(1) now. Note that we keep some of the post-processing of `chainlint.pl` output intact to strip leading line numbers generated by the script. Having these would cause a rippling effect whenever we add a new test that sorts into the middle of existing tests and would require us to renumerate all subsequent lines, which seems rather pointless. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-15 08:36:14 -08:00
Jeff King	dee182941f	mailinfo: avoid recursion when unquoting From headers Our unquote_comment() function is recursive; when it sees a comment within a comment, like: (this is an (embedded) comment) it recurses to handle the inner comment. This is fine for practical use, but it does mean that you can easily run out of stack space with a malicious header. For example: perl -e 'print "From: ", "(" x 2**18;' \| git mailinfo /dev/null /dev/null segfaults on my system. And since mailinfo is likely to be fed untrusted input from the Internet (if not by human users, who might recognize a garbage header, but certainly there are automated systems that apply patches from a list) it may be possible for an attacker to trigger the problem. That said, I don't think there's an interesting security vulnerability here. All an attacker can do is make it impossible to parse their email and apply their patch, and there are lots of ways to generate bogus emails. So it's more of an annoyance than anything. But it's pretty easy to fix it. The recursion is not helping us preserve any particular state from each level. The only flag in our parsing is take_next_literally, and we can never recurse when it is set (since the start of a new comment implies it was not backslash-escaped). So it is really only useful for finding the end of the matched pair of parentheses. We can do that easily with a simple depth counter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-14 14:33:52 -08:00
Jeff King	2d9396c2fe	t5100: make rfc822 comment test more careful When processing "From" headers in an email, mailinfo "unquotes" quoted strings and rfc822 parenthesized comments. For quoted strings, we actually remove the double-quotes, so: From: "A U Thor" <someone@example.com> become: Author: A U Thor Email: someone@example.com But for comments, we leave the outer parentheses in place, so: From: A U (this is a comment) Thor <someone@example.com> becomes: Author: A U (this is a comment) Thor Email: someone@example.com So what is the comment "unquoting" actually doing? In our code, being in a comment section has exactly two effects: 1. We'll unquote backslash-escaped characters inside a comment section. 2. We _won't_ unquote double-quoted strings inside a comment section. Our test for comments in t5100 checks this: From: "A U Thor" <somebody@example.com> (this is $really$ a comment (honestly)) So it is covering (1), but not (2). Let's add in a quoted string to cover this. Moreover, because the comment appears at the end of the From header, there's nothing to confirm that we correctly found the end of the comment section (and not just the end-of-string). Let's instead move it to the beginning of the header, which means we can confirm that the existing quoted string is detected (which will only happen if we know we've left the comment block). As expected, the test continues to pass, but this will give us more confidence as we refactor the code in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-14 14:33:50 -08:00
René Scharfe	fbc6526ea6	t6300: avoid hard-coding object sizes `f4ee22b526` (ref-filter: add tests for objectsize:disk, 2018-12-24) hard-coded the expected object sizes. Coincidentally the size of commit and tag is the same with zlib at the default compression level. `1f5f8f3e85` (t6300: abstract away SHA-1-specific constants, 2020-02-22) encoded the sizes as a single value, which coincidentally also works with sha256. Different compression libraries like zlib-ng may arrive at different values. Get them from the file system instead of hard-coding them to make switching the compression library (or changing the compression level) easier. Reported-by: Ondrej Pohorelsky <opohorel@redhat.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-12 15:41:15 -08:00
Jeff King	d1bd3a8c34	mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair() When processing a header like a "From" line, mailinfo uses unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized comments. It takes a NUL-terminated string on input, and loops over the "in" pointer until it sees the NUL. When it finds the start of an interesting block, it delegates to helper functions which also increment "in", and return the updated pointer. But there's a bug here: the helpers find the NUL with a post-increment in the loop condition, like: while ((c = *in++) != 0) So when they do see a NUL (rather than the correct termination of the quote or comment section), they return "in" as one _past_ the NUL terminator. And thus the outer loop in unquote_quoted_pair() does not realize we hit the NUL, and keeps reading past the end of the buffer. We should instead make sure to return "in" positioned at the NUL, so that the caller knows to stop their loop, too. A hacky way to do this is to return "in - 1" after leaving the inner loop. But a slightly cleaner solution is to avoid incrementing "in" until we are sure it contained a non-NUL byte (i.e., doing it inside the loop body). The two tests here show off the problem. Since we check the output, they'll _usually_ report a failure in a normal build, but it depends on what garbage bytes are found after the heap buffer. Building with SANITIZE=address reliably notices the problem. The outcome (both the exit code and the exact bytes) are just what Git happens to produce for these cases today, and shouldn't be taken as an endorsement. It might be reasonable to abort on an unterminated string, for example. The priority for this patch is fixing the out-of-bounds memory access. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-12 15:32:49 -08:00
Patrick Steinhardt	c0cadb0576	reftable/block: reuse buffer to compute record keys When iterating over entries in the block iterator we compute the key of each of the entries and write it into a buffer. We do not reuse the buffer though and thus re-allocate it on every iteration, which is wasteful. Refactor the code to reuse the buffer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:17 -08:00
Patrick Steinhardt	a8305bc6d8	reftable/block: introduce macro to initialize `struct block_iter` There are a bunch of locations where we initialize members of `struct block_iter`, which makes it harder than necessary to expand this struct to have additional members. Unify the logic via a new `BLOCK_ITER_INIT` macro that initializes all members. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:17 -08:00
Patrick Steinhardt	829231dc20	reftable/merged: reuse buffer to compute record keys When iterating over entries in the merged iterator's queue, we compute the key of each of the entries and write it into a buffer. We do not reuse the buffer though and thus re-allocate it on every iteration, which is wasteful given that we never transfer ownership of the allocated bytes outside of the loop. Refactor the code to reuse the buffer. This also fixes a potential memory leak when `merged_iter_advance_subiter()` returns an error. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	9abda98149	reftable/stack: fix use of unseeded randomness When writing a new reftable stack, Git will first create the stack with a random suffix so that concurrent updates will not try to write to the same file. This random suffix is computed via a call to rand(3P). But we never seed the function via srand(3P), which means that the suffix is in fact always the same. Fix this bug by using `git_rand()` instead, which does not need to be initialized. While this function is likely going to be slower depending on the platform, this slowness should not matter in practice as we only use it when writing a new reftable stack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	3054fbd93e	reftable/stack: fix stale lock when dying When starting a transaction via `reftable_stack_init_addition()`, we create a lockfile for the reftable stack itself which we'll write the new list of tables to. But if we terminate abnormally e.g. via a call to `die()`, then we do not remove the lockfile. Subsequent executions of Git which try to modify references will thus fail with an out-of-date error. Fix this bug by registering the lock as a `struct tempfile`, which ensures automatic cleanup for us. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	d779996a10	reftable/stack: reuse buffers when reloading stack In `reftable_stack_reload_once()` we iterate over all the tables added to the stack in order to figure out whether any of the tables needs to be reloaded. We use a set of buffers in this context to compute the paths of these tables, but discard those buffers on every iteration. This is quite wasteful given that we do not need to transfer ownership of the allocated buffer outside of the loop. Refactor the code to instead reuse the buffers to reduce the number of allocations we need to do. Note that we do not have to manually reset the buffer because `stack_filename()` does this for us already. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	5c086453ff	reftable/stack: perform auto-compaction with transactional interface Whenever updating references or reflog entries in the reftable stack, we need to add a new table to the stack, thus growing the stack's length by one. The stack can grow to become quite long rather quickly, leading to performance issues when trying to read records. But besides performance issues, this can also lead to exhaustion of file descriptors very rapidly as every single table requires a separate descriptor when opening the stack. While git-pack-refs(1) fixes this issue for us by merging the tables, it runs too irregularly to keep the length of the stack within reasonable limits. This is why the reftable stack has an auto-compaction mechanism: `reftable_stack_add()` will call `reftable_stack_auto_compact()` after its added the new table, which will auto-compact the stack as required. But while this logic works alright for `reftable_stack_add()`, we do not do the same in `reftable_addition_commit()`, which is the transactional equivalent to the former function that allows us to write multiple updates to the stack atomically. Consequentially, we will easily run into file descriptor exhaustion in code paths that use many separate transactions like e.g. non-atomic fetches. Fix this issue by calling `reftable_stack_auto_compact()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	15f98b602f	reftable/stack: verify that `reftable_stack_add()` uses auto-compaction While we have several tests that check whether we correctly perform auto-compaction when manually calling `reftable_stack_auto_compact()`, we don't have any tests that verify whether `reftable_stack_add()` does call it automatically. Add one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	85a8c899ce	reftable: handle interrupted writes There are calls to write(3P) where we don't properly handle interrupts. Convert them to use `write_in_full()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00
Patrick Steinhardt	917a2b3ce9	reftable: handle interrupted reads There are calls to pread(3P) and read(3P) where we don't properly handle interrupts. Convert them to use `pread_in_full()` and `read_in_full()`, respectively. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-11 07:23:16 -08:00

... 3 4 5 6 7 ...

71874 commits