development/git - HydraGit

mirror of https://github.com/git/git synced 2024-08-24 02:11:08 +00:00

Author	SHA1	Message	Date
Eric Sunshine	d73f5cfa89	chainlint.sed: stop splitting "(..." into separate lines "(" and "..." Because `sed` is line-oriented, for ease of implementation, when chainlint.sed encounters an opening subshell in which the first command is cuddled with the "(", it splits the line into two lines: one containing only "(", and the other containing whatever follows "(". This allows chainlint.sed to get by with a single set of regular expressions for matching shell statements rather than having to duplicate each expression (one set for matching non-cuddled statements, and one set for matching cuddled statements). However, although syntactically and semantically immaterial, this transformation has no value to test authors and might even confuse them into thinking that the linter is misbehaving by inserting (whitespace) line-noise into the shell code it is validating. Moreover, it also allows an implementation detail of chainlint.sed to seep into the chainlint self-test "expect" files, which potentially makes it difficult to reuse the self-tests should a more capable chainlint ever be developed. To address these concerns, stop splitting cuddled "(..." into two lines. Note that, as an implementation artifact, due to sed's line-oriented nature, this change inserts a blank line at output time just before the "(..." line is emitted. It would be possible to suppress this blank line but doing so would add a fair bit of complexity to chainlint.sed. Therefore, rather than suppressing the extra blank line, the Makefile's `check-chainlint` target which runs the chainlint self-tests is instead modified to ignore blank lines when comparing chainlint output against the self-test "expect" output. This is a reasonable compromise for two reasons. First, the purpose of the chainlint self-tests is to verify that the ?!AMP?! annotations are being correctly added; precise whitespace is immaterial. Second, by necessity, chainlint.sed itself already throws away all blank lines within subshells since, when checking for a broken &&-chain, it needs to check the final _statement_ in a subshell, not the final _line_ (which might be blank), thus it has never made any attempt to precisely reproduce blank lines in its output. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	31da22d1fd	chainlint.sed: swallow comments consistently When checking for a broken &&-chain, chainlint.sed knows that the final statement in a subshell should not end with `&&`, so it takes care to make a distinction between the final line which is an actual statement and any lines which may be mere comments preceding the closing ')'. As such, it swallows comment lines so that they do not interfere with the &&-chain check. However, since `sed` does not provide any sort of real recursion, chainlint.sed only checks &&-chains in subshells one level deep; it doesn't do any checking in deeper subshells or in `{...}` blocks within subshells. Furthermore, on account of potential implementation complexity, it doesn't check &&-chains within `case` arms. Due to an oversight, it also doesn't swallow comments inside deep subshells, `{...}` blocks, or `case` statements, which makes its output inconsistent (swallowing comments in some cases but not others). Unfortunately, this inconsistency seeps into the chainlint self-test "expect" files, which potentially makes it difficult to reuse the self-tests should a more capable chainlint ever be developed. Therefore, teach chainlint.sed to consistently swallow comments in all cases. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	34ba05c296	chainlint.sed: stop throwing away here-doc tags The purpose of chainlint is to highlight problems it finds in test code by inserting annotations at the location of each problem. Arbitrarily eliding bits of the code it is checking is not helpful, yet this is exactly what chainlint.sed does by cavalierly and unnecessarily dropping the here-doc operator and tag; i.e. `cat <<TAG` becomes simply `cat` in the output. This behavior can make it more difficult for the test writer to align the annotated output of chainlint.sed with the original test code. Address this by retaining here-doc tags. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	22597af97d	chainlint.sed: don't mistake `<< word` in string as here-doc operator Tighten here-doc recognition to prevent it from being fooled by text which looks like a here-doc operator but happens merely to be the content of a string, such as this real-world case from t7201: echo "<<<<<<< ours" && echo ourside && echo "=======" && echo theirside && echo ">>>>>>> theirs" This problem went unnoticed because chainlint.sed is not a real parser, but rather applies heuristics to pretend to understand shell code. In this case, it saw what it thought was a here-doc operator (`<< ours`), and fell off the end of the test looking for the closing tag "ours" which it never found, thus swallowed the remainder of the test without checking it for &&-chain breakage. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	2d53614210	chainlint.sed: make here-doc "<<-" operator recognition more POSIX-like According to POSIX, "<<" and "<<-" are distinct shell operators. For the latter to be recognized, no whitespace is allowed before the "-", though whitespace is allowed after the operator. However, the chainlint patterns which identify here-docs are both too loose and too tight, incorrectly allowing whitespace between "<<" and "-" but disallowing it between "-" and the here-doc tag. Fix the patterns to better match POSIX. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	5be30d0cd3	chainlint.sed: drop subshell-closing ">" annotation chainlint.sed inserts a ">" annotation at the beginning of a line to signal that its heuristics have identified an end-of-subshell. This was useful as a debugging aid during development of the script, but it has no value to test writers and might even confuse them into thinking that the linter is misbehaving by inserting line-noise into the shell code it is validating. Moreover, its presence also potentially makes it difficult to reuse the chainlint self-test "expect" output should a more capable linter ever be developed. Therefore, drop the ">" annotation. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	0d7131763e	chainlint.sed: drop unnecessary distinction between ?!AMP?! and ?!SEMI?! From inception, when chainlint.sed encountered a line using semicolon to separate commands rather than `&&`, it would insert a ?!SEMI?! annotation at the beginning of the line rather than ?!AMP?! even though the &&-chain is also broken by the semicolon. Given a line such as: ?!SEMI?! cmd1; cmd2 && the ?!SEMI?! annotation makes it easier to see what the problem is than if the output had been: ?!AMP?! cmd1; cmd2 && which might confuse the test author into thinking that the linter is broken (since the line clearly ends with `&&`). However, now that the ?!AMP?! and ?!SEMI?! annotations are inserted at the point of breakage rather than at the beginning of the line, and taking into account that both represent a broken &&-chain, there is little reason to distinguish between the two. Using ?!AMP?! alone is sufficient to point the test author at the problem. For instance, in: cmd1; ?!AMP?! cmd2 && cmd3 it is clear that the &&-chain is broken between `cmd1` and `cmd2`. Likewise, in: cmd1 && cmd2 ?!AMP?! cmd3 it is clear that the &&-chain is broken between `cmd2` and `cmd3`. Finally, in: cmd1; ?!AMP?! cmd2 ?!AMP?! cmd3 it is clear that the &&-chain is broken between each command. Hence, there is no longer a good reason to make a distinction between a broken &&-chain due to a semicolon and a broken chain due to a missing `&&` at end-of-line. Therefore, drop the ?!SEMI?! annotation and use ?!AMP?! exclusively. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	3865a7e36d	chainlint.sed: tolerate harmless ";" at end of last line in block chainlint.sed flags ";" when used as a command terminator since it breaks the &&-chain, thus can allow failures to go undetected. However, when a command terminated by ";" is the last command in the body of a compound statement, such as `command-2` in: if test $# -gt 1 then command-1 && command-2; fi then the ";" is harmless and the exit code from `command-2` is passed through untouched and becomes the exit code of the compound statement, as if the ";" was not present. Therefore, tolerate a trailing ";" in this position rather than complaining about broken &&-chain. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	fbd992b61b	chainlint.sed: improve ?!SEMI?! placement accuracy When chainlint.sed detects commands separated by a semicolon rather than by `&&`, it places a ?!SEMI?! annotation at the beginning of the line. However, this is an unusual location for programmers accustomed to error messages (from compilers, for instance) indicating the exact point of the problem. Therefore, relocate the ?!SEMI?! annotation to the location of the semicolon in order to better direct the programmer's attention to the source of the problem. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Eric Sunshine	db8c7a1cc0	chainlint.sed: improve ?!AMP?! placement accuracy When chainlint.sed detects a broken &&-chain, it places an ?!AMP?! annotation at the beginning of the line. However, this is an unusual location for programmers accustomed to error messages (from compilers, for instance) indicating the exact point of the problem. Therefore, relocate the ?!AMP?! annotation to the end of the line in order to better direct the programmer's attention to the source of the problem. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00
Kyohei Kadota	b3b753b104	Fit to Plan 9's ANSI/POSIX compatibility layer tr(1) in the ANSI/POSIX environment, aka APE, doesn't support the \n literal. It handles only octal (\ooo) or hexadecimal (\xhhhh) numbers. Also, its sed(1) limits labels to a maximum of seven characters. Therefore I shortened some labels. * close -> cl * continue -> cont (cnt is used for count) * line -> ln * hered -> hdoc * shell -> sh * string -> str Signed-off-by: Kyohei Kadota <lufia@lufia.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 22:31:31 -07:00
Junio C Hamano	e9983f8965	Merge branch 'es/chain-lint-more' The test linter code has learned that the end of here-doc mark "EOF" can be quoted in a double-quote pair, not just in a single-quote pair. * es/chain-lint-more: chainlint: match "quoted" here-doc tags	2018-09-04 14:31:40 -07:00
Eric Sunshine	3042b6bb59	chainlint: match "quoted" here-doc tags A here-doc tag can be quoted ('EOF'/"EOF") or escaped (\EOF) to suppress interpolation within the body. chainlint recognizes single-quoted and escaped tags, but does not know about double-quoted tags. For completeness, teach it to recognize double-quoted tags, as well. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 10:57:38 -07:00
Ævar Arnfjörð Bjarmason	a3c4c8841c	tests: use shorter labels in chainlint.sed for AIX sed Improve the portability of chainlint by using shorter labels. On AIX sed will complain about: sed: 0602-417 The label :hereslurp is greater than eight characters This, in combination with the previous fix to this file, makes GIT_TEST_CHAIN_LINT=1 (which is the default) work again on AIX without issues, and the "gmake check-chainlint" test also passes. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-27 14:07:18 -07:00
Ævar Arnfjörð Bjarmason	2d9ded8acc	tests: fix comment syntax in chainlint.sed for AIX sed Change a comment in chainlint.sed to appease AIX sed, which would previously print this error: sed: # stash for later printing is not a recognized function 1. https://public-inbox.org/git/CAPig+cTTbU5HFMKgNyrxTp3+kcK46-Fn=4ZH6zDt1oQChAc3KA@mail.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-27 11:31:43 -07:00
Eric Sunshine	22e3e0241a	chainlint: recognize multi-line quoted strings more robustly chainlint.sed recognizes multi-line quoted strings within subshells: echo "abc def" >out && so it can avoid incorrectly classifying lines internal to the string as breaking the &&-chain. To identify the first line of a multi-line string, it checks if the line contains a single quote. However, this is fragile and can be easily fooled by a line containing multiple strings: echo "xyz" "abc def" >out && Make detection more robust by checking for an odd number of quotes rather than only a single one. (Escaped quotes are not handled, but support may be added later.) The original multi-line string recognizer rather cavalierly threw away all but the final quote, whereas the new one is careful to retain all quotes, so the "expected" output of a couple existing chainlint tests is updated to account for this new behavior. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 12:22:12 -07:00
Eric Sunshine	d93871143f	chainlint: let here-doc and multi-line string commence on same line After swallowing a here-doc, chainlint.sed assumes that no other processing needs to be done on the line aside from checking for &&-chain breakage; likewise, after folding a multi-line quoted string. However, it's conceivable (even if unlikely in practice) that both a here-doc and a multi-line quoted string might commence on the same line: cat <<\EOF && echo "foo bar" data EOF Support this case by sending the line (after swallowing and folding) through the normal processing sequence rather than jumping directly to the check for broken &&-chain. This change also allows other somewhat pathological cases to be handled, such as closing a subshell on the same line starting a here-doc: ( cat <<-\INPUT) data INPUT or, for instance, opening a multi-line $(...) expression on the same line starting a here-doc: x=$(cat <<-\END && data END echo "x") among others. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 12:22:12 -07:00
Eric Sunshine	06fc5c9f90	chainlint: recognize multi-line $(...) when command cuddled with "$(" For multi-line $(...) expressions nested within subshells, chainlint.sed only recognizes: x=$( echo foo && ... but it is not unlikely that test authors may also cuddle the command with the opening "$(", so support that style, as well: x=$(echo foo && ... The closing ")" is already correctly recognized when cuddled or not. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 12:22:11 -07:00
Eric Sunshine	7e32a31b21	chainlint: match 'quoted' here-doc tags A here-doc tag can be quoted ('EOF') or escaped (\EOF) to suppress interpolation within the body. Although chainlint recognizes escaped tags, it does not know about quoted tags. For completeness, teach it to recognize quoted tags, as well. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 12:22:11 -07:00
Eric Sunshine	c2c29cc03e	chainlint: match arbitrary here-docs tags rather than hard-coded names chainlint.sed swallows top-level here-docs to avoid being fooled by content which might look like start-of-subshell. It likewise swallows here-docs in subshells to avoid marking content lines as breaking the &&-chain, and to avoid being fooled by content which might look like end-of-subshell, start-of-nested-subshell, or other specially-recognized constructs. At the time of implementation, it was believed that it was not possible to support arbitrary here-doc tag names since 'sed' provides no way to stash the opening tag name in a variable for later comparison against a line signaling end-of-here-doc. Consequently, tag names are hard-coded, with "EOF" being the only tag recognized at the top-level, and only "EOF", "EOT", and "INPUT_END" being recognized within subshells. Also, special care was taken to avoid being confused by here-docs nested within other here-docs. In practice, this limited number of hard-coded tag names has been "good enough" for the 13000+ existing Git tests, despite many of those tests using tags other than the recognized ones, since the bodies of those here-docs do not contain content which would fool the linter. Nevertheless, the situation is not ideal since someone writing new tests and choosing a name not in the "blessed" set could potentially trigger a false-positive. To address this shortcoming, upgrade chainlint.sed to handle arbitrary here-doc tag names, both at the top-level and within subshells. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 12:22:11 -07:00
Eric Sunshine	ace64e56c1	t/chainlint.sed: drop extra spaces from regex character class This character class, like many others in this script, matches horizontal whitespace consisting of spaces and tabs, however, a few extra, entirely harmless, spaces somehow slipped into the expression. Removing them is purely a cosmetic fix. While at it, re-indent three lines with a single TAB each which were incorrectly indented with six spaces. Also, a purely cosmetic fix. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-31 11:24:14 -07:00
Eric Sunshine	878f988350	t/test-lib: teach --chain-lint to detect broken &&-chains in subshells The --chain-lint option detects broken &&-chains by forcing the test to exit early (as the very first step) with a sentinel value. If that sentinel is the test's overall exit code, then the &&-chain is intact; if not, then the chain is broken. Unfortunately, this detection does not extend to &&-chains within subshells even when the subshell itself is properly linked into the outer &&-chain. Address this shortcoming by feeding the body of the test to a lightweight "linter" which can peer inside subshells and identify broken &&-chains by pure textual inspection. Although the linter does not actually parse shell scripts, it has enough knowledge of shell syntax to reliably deal with formatting style variations (as evolved over the years) and to avoid being fooled by non-shell content (such as inside here-docs and multi-line strings). It recognizes modern subshell formatting: statement1 && ( statement2 && statement3 ) && statement4 as well as old-style: statement1 && (statement2 && statement3) && statement4 Heuristics are employed to properly identify the extent of a subshell formatted in the old-style since a number of legitimate constructs may superficially appear to close the subshell even though they don't. For example, it understands that neither "x=$(command)" nor "case $x in *)" end a subshell, despite the ")" at the end of line. Due to limitations of the tool used ('sed') and its inherent line-by-line processing, only subshells one level deep are handled, as well as one-liner subshells one level below that. Subshells deeper than that or multi-line subshells at level two are passed through as-is, thus &&-chains in their bodies are not checked. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-17 09:15:14 -07:00
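
The sentinel technique described in 878f988350 can be sketched roughly as follows. This is a minimal illustration, not the code from t/test-lib: `chain_intact` is a hypothetical helper, the sentinel value 117 is an arbitrary choice made for this sketch, and the test bodies use only `true` and `false` so that evaluating them has no side effects.

```shell
#!/bin/sh
# Sketch only: chain_intact is a hypothetical helper, and 117 is an
# arbitrary sentinel chosen for this illustration.
chain_intact () {
	# Prefix the body with a command that fails with the sentinel status.
	# If every top-level command is joined by &&, nothing after the prefix
	# runs and the sentinel becomes the overall exit status.
	( eval "(exit 117) && $1" ) >/dev/null 2>&1
	test $? -eq 117
}

intact='true &&
true'

# Broken at top level: the first line has no trailing &&.
broken_top='true
true'

# Broken only inside the subshell: "false" is not followed by &&.
broken_sub='true &&
(
	false
	true
) &&
true'

for name in intact broken_top broken_sub
do
	eval "body=\$$name"
	if chain_intact "$body"
	then
		echo "$name: chain looks intact"
	else
		echo "$name: broken &&-chain detected"
	fi
done
```

In this sketch `broken_top` is caught, while `broken_sub` is reported as intact because the whole subshell is skipped once the sentinel short-circuits the outer chain. That blind spot is the gap the commit message says the textual linter was added to close.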

22 commits
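
The ?!AMP?! annotations discussed throughout the log above can be illustrated with a toy linter. This is a sketch under strong assumptions and is not how chainlint.sed is implemented: `toy_chainlint` is a hypothetical function written with awk rather than sed, it handles only a subshell whose "(" sits on a line of its own, and it ignores comments, blank lines, here-docs, multi-line strings, and nesting. It places the annotation at the end of the offending line, as described in db8c7a1cc0.

```shell
#!/bin/sh
# Toy illustration only; this is not how chainlint.sed is implemented.
toy_chainlint () {
	awk '
	function emit_prev() { if (have) print prev; have = 0 }
	# A line holding only "(" opens a subshell.
	/^[ \t]*\($/ { emit_prev(); print; inside = 1; next }
	# A line starting with ")" closes it; the statement just before it is
	# the final one and is not expected to end with "&&".
	inside && /^[ \t]*\)/ { emit_prev(); print; inside = 0; next }
	inside {
		# A new statement arrived, so the held one was not final.
		if (have) {
			if (prev !~ /&&[ \t]*$/) prev = prev " ?!AMP?!"
			print prev
		}
		prev = $0; have = 1; next
	}
	{ print }
	END { emit_prev() }
	'
}

toy_chainlint <<\EOF
(
	cd repo &&
	git add file
	git commit -m msg
) &&
echo done
EOF
```

With this input, only the `git add file` line gains a trailing ?!AMP?!, since it is followed by another statement without being joined to it by `&&`. The final statement before ")" is left alone, which mirrors the rule stated in 31da22d1fd that the last statement in a subshell should not end with `&&`.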