development/git - HydraGit

development/git

Fork 0

mirror of https://github.com/git/git synced 2024-10-31 01:43:41 +00:00

Commit graph

Author	SHA1	Message	Date
Eric Sunshine	73c768dae9	chainlint: annotate original test definition rather than token stream When chainlint detects problems in a test, such as a broken &&-chain, it prints out the test with "?!FOO?!" annotations inserted at each problem location. However, rather than annotating the original test definition, it instead dumps out a parsed token representation of the test. Since it lacks comments, indentations, here-doc bodies, and so forth, this tokenized representation can be difficult for the test author to digest and relate back to the original test definition. However, now that each parsed token carries positional information, the location of a detected problem can be pinpointed precisely in the original test definition. Therefore, take advantage of this information to annotate the test definition itself rather than annotating the parsed token stream, thus making it easier for a test author to relate a problem back to the source. Maintaining the positional meta-information associated with each detected problem requires a slight change in how the problems are managed internally. In particular, shell syntax such as: msg="total: $(cd data; wc -w .txt) words" requires the lexical analyzer to recursively invoke the parser in order to detect problems within the $(...) expression inside the double-quoted string. In this case, the recursive parse context will detect the broken &&-chain between the `cd` and `wc` commands, returning the token stream: cd data ; ?!AMP?! wc -w .txt However, the parent parse context will see everything inside the double-quotes as a single string token: "total: $(cd data ; ?!AMP?! wc -w .txt) words" losing whatever positional information was attached to the ";" token where the problem was detected. One way to preserve the positional information of a detected problem in a recursive parse context within a string would be to attach the positional information to the annotation textually; for instance: "total: $(cd data ; ?!AMP:21:22?! wc -w .txt) words" and then extract the positional information when annotating the original test definition. However, a cleaner and much simpler approach is to maintain the list of detected problems separately rather than embedding the problems as annotations directly in the parsed token stream. Not only does this ensure that positional information within recursive parse contexts is not lost, but it keeps the token stream free from non-token pollution, which may simplify implementation of validations added in the future since they won't have to handle non-token "?!FOO!?" items specially. Finally, the chainlint self-test "expect" files need a few mechanical adjustments now that the original test definitions are emitted rather than the parsed token stream. In particular, the following items missing from the historic parsed-token output are now preserved verbatim: * indentation (and whitespace, in general) * comments * here-doc bodies * here-doc tag quoting (i.e. "\EOF") * line-splices (i.e. "\" at the end of a line) Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-08 15:10:49 -05:00
Eric Sunshine	31da22d1fd	chainlint.sed: swallow comments consistently When checking for broken a &&-chain, chainlint.sed knows that the final statement in a subshell should not end with `&&`, so it takes care to make a distinction between the final line which is an actual statement and any lines which may be mere comments preceding the closing ')'. As such, it swallows comment lines so that they do not interfere with the &&-chain check. However, since `sed` does not provide any sort of real recursion, chainlint.sed only checks &&-chains in subshells one level deep; it doesn't do any checking in deeper subshells or in `{...}` blocks within subshells. Furthermore, on account of potential implementation complexity, it doesn't check &&-chains within `case` arms. Due to an oversight, it also doesn't swallow comments inside deep subshells, `{...}` blocks, or `case` statements, which makes its output inconsistent (swallowing comments in some cases but not others). Unfortunately, this inconsistency seeps into the chainlint self-test "expect" files, which potentially makes it difficult to reuse the self-tests should a more capable chainlint ever be developed. Therefore, teach chainlint.sed to consistently swallow comments in all cases. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 14:15:29 -08:00

Author

SHA1

Message

Date

Eric Sunshine

73c768dae9

chainlint: annotate original test definition rather than token stream

When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.

Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:

    msg="total: $(cd data; wc -w *.txt) words"

requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:

    cd data ; ?!AMP?! wc -w *.txt

However, the parent parse context will see everything inside the
double-quotes as a single string token:

    "total: $(cd data ; ?!AMP?! wc -w *.txt) words"

losing whatever positional information was attached to the ";" token
where the problem was detected.

One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:

    "total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"

and then extract the positional information when annotating the original
test definition.

However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO!?" items specially.

Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:

    * indentation (and whitespace, in general)

    * comments

    * here-doc bodies

    * here-doc tag quoting (i.e. "\EOF")

    * line-splices (i.e. "\" at the end of a line)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>

2022-11-08 15:10:49 -05:00

Eric Sunshine

31da22d1fd

chainlint.sed: swallow comments consistently

When checking for broken a &&-chain, chainlint.sed knows that the final
statement in a subshell should not end with `&&`, so it takes care to
make a distinction between the final line which is an actual statement
and any lines which may be mere comments preceding the closing ')'. As
such, it swallows comment lines so that they do not interfere with the
&&-chain check.

However, since `sed` does not provide any sort of real recursion,
chainlint.sed only checks &&-chains in subshells one level deep; it
doesn't do any checking in deeper subshells or in `{...}` blocks within
subshells. Furthermore, on account of potential implementation
complexity, it doesn't check &&-chains within `case` arms.

Due to an oversight, it also doesn't swallow comments inside deep
subshells, `{...}` blocks, or `case` statements, which makes its output
inconsistent (swallowing comments in some cases but not others).
Unfortunately, this inconsistency seeps into the chainlint self-test
"expect" files, which potentially makes it difficult to reuse the
self-tests should a more capable chainlint ever be developed. Therefore,
teach chainlint.sed to consistently swallow comments in all cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

2021-12-13 14:15:29 -08:00

2 commits