git/t/chainlint/inline-comment.expect

9 lines
137 B
Plaintext

(
chainlint: annotate original test definition rather than token stream

When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.

Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:

    msg="total: $(cd data; wc -w *.txt) words"

requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:

    cd data ; ?!AMP?! wc -w *.txt

However, the parent parse context will see everything inside the
double-quotes as a single string token:

    "total: $(cd data ; ?!AMP?! wc -w *.txt) words"

losing whatever positional information was attached to the ";" token
where the problem was detected.

One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:

    "total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"

and then extract the positional information when annotating the original
test definition. However, a cleaner and much simpler approach is to
maintain the list of detected problems separately rather than embedding
the problems as annotations directly in the parsed token stream. Not
only does this ensure that positional information within recursive parse
contexts is not lost, but it keeps the token stream free from non-token
pollution, which may simplify implementation of validations added in the
future since they won't have to handle non-token "?!FOO!?" items
specially.

Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:

* indentation (and whitespace, in general)
* comments
* here-doc bodies
* here-doc tag quoting (i.e. "\EOF")
* line-splices (i.e. "\" at the end of a line)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 19:08:30 +00:00
	foobar && # comment 1
	barfoo ?!AMP?! # wrong position for &&
	flibble "not a # comment"
) &&
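
The approach described in the commit message above (keep the detected problems in a separate list and annotate the original test text by position) can be illustrated with a minimal Python sketch. This is not chainlint's actual code: the function name `annotate`, the `(offset, label)` tuple layout, and the unannotated input text are assumptions made for illustration only.

```python
def annotate(source, problems):
    """Insert " ?!LABEL?!" markers into the original text.

    `problems` is a list of (offset, label) pairs that is kept apart from
    any token stream; each offset is a character position in `source`.
    (Hypothetical layout, chosen for this sketch.)
    """
    pieces = []
    prev = 0
    for offset, label in sorted(problems):
        pieces.append(source[prev:offset])  # original text, copied untouched
        pieces.append(f" ?!{label}?!")      # annotation at the problem location
        prev = offset
    pieces.append(source[prev:])            # remainder of the original text
    return "".join(pieces)


# Assumed test body: the expect lines above with the annotation removed.
source = (
    "(\n"
    "\tfoobar && # comment 1\n"
    "\tbarfoo # wrong position for &&\n"
    "\tflibble \"not a # comment\"\n"
    ") &&\n"
)

# Hypothetical problem position: just past the command missing its "&&".
offset = source.index("barfoo") + len("barfoo")
annotated = annotate(source, [(offset, "AMP")])
print(annotated, end="")
```

Because the markers are spliced into the untouched source text, the indentation and the inline comments survive in the output, which is what the expect lines above rely on.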
chainlint.sed: stop splitting "(..." into separate lines "(" and "..."

Because `sed` is line-oriented, for ease of implementation, when
chainlint.sed encounters an opening subshell in which the first command
is cuddled with the "(", it splits the line into two lines: one
containing only "(", and the other containing whatever follows "(".
This allows chainlint.sed to get by with a single set of regular
expressions for matching shell statements rather than having to
duplicate each expression (one set for matching non-cuddled statements,
and one set for matching cuddled statements).

However, although syntactically and semantically immaterial, this
transformation has no value to test authors and might even confuse them
into thinking that the linter is misbehaving by inserting (whitespace)
line-noise into the shell code it is validating. Moreover, it also
allows an implementation detail of chainlint.sed to seep into the
chainlint self-test "expect" files, which potentially makes it difficult
to reuse the self-tests should a more capable chainlint ever be
developed.

To address these concerns, stop splitting cuddled "(..." into two lines.

Note that, as an implementation artifact, due to sed's line-oriented
nature, this change inserts a blank line at output time just before the
"(..." line is emitted. It would be possible to suppress this blank line
but doing so would add a fair bit of complexity to chainlint.sed.
Therefore, rather than suppressing the extra blank line, the Makefile's
`check-chainlint` target which runs the chainlint self-tests is instead
modified to ignore blank lines when comparing chainlint output against
the self-test "expect" output. This is a reasonable compromise for two
reasons. First, the purpose of the chainlint self-tests is to verify
that the ?!AMP?! annotations are being correctly added; precise
whitespace is immaterial. Second, by necessity, chainlint.sed itself
already throws away all blank lines within subshells since, when
checking for a broken &&-chain, it needs to check the final _statement_
in a subshell, not the final _line_ (which might be blank), thus it has
never made any attempt to precisely reproduce blank lines in its output.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13 06:30:59 +00:00
(cd foo &&
	flibble "not a # comment")
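
The second commit message above says that the self-test comparison ignores blank lines, since only the ?!AMP?! annotations matter. A rough Python sketch of that idea follows. It is not the Makefile's actual recipe: the helper names are hypothetical, and treating whitespace-only lines as blank is an assumption of this sketch.

```python
def non_blank_lines(text):
    """Return the lines of `text`, dropping empty and whitespace-only ones."""
    return [line for line in text.splitlines() if line.strip()]


def same_ignoring_blank_lines(actual, expected):
    """Compare linter output with an expect file, ignoring blank lines."""
    return non_blank_lines(actual) == non_blank_lines(expected)


expected = '(cd foo &&\n\tflibble "not a # comment")\n'

# Output with an extra blank line emitted just before the cuddled "(..." line.
actual = '\n(cd foo &&\n\tflibble "not a # comment")\n'

print(same_ignoring_blank_lines(actual, expected))
```

A missing or misplaced annotation still makes the comparison fail, because only blank lines are discarded; every other line is compared verbatim, including its indentation.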