bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)

A line with only a line continuation character should be considered
a blank line at tokenizer level so that only a single NEWLINE token
gets emitted. The old parser was working around the issue, but the
new parser raised a `SyntaxError` for valid input. For example,
an empty line following a line continuation character was rejected
with a `SyntaxError`.
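
A minimal reproduction of the case described above, as a sketch rather than the test added by this commit. It assumes the interpreter running it already includes the fix. The names `source` and `compiled_ok` are illustrative only.

```python
# Source text, line by line: `pass`, a line holding only a backslash
# (the line continuation character), an empty line, then `pass`.
source = "pass\n\\\n\npass\n"

try:
    code = compile(source, "<string>", "exec")
except SyntaxError as exc:
    # Expected only on a build that lacks the fix.
    compiled_ok = False
    print("rejected:", exc.msg)
else:
    compiled_ok = True
    exec(code)  # both `pass` statements run; nothing else happens
    print("compiled and ran without error")
```

Going by the description above, a build with the new parser but without this change should take the `except` branch instead.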

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
Lysandros Nikolaou 2020-06-11 02:56:08 +03:00 committed by GitHub
parent 7f888c7ef9
commit 896f4cf63f
4 changed files with 27 additions and 1 deletions

@ -153,6 +153,13 @@ def f():
('dict_comp', '{x:1 for x in a}'),
('dict_comp_if', '{x:1+2 for x in a if b}'),
('dict_empty', '{}'),
('empty_line_after_linecont',
r'''
pass
\
pass
'''),
('for',
'''
for i in a:

@ -858,6 +858,20 @@ def test_kwargs_last3(self):
"iterable argument unpacking follows "
"keyword argument unpacking")
    def test_empty_line_after_linecont(self):
        # See issue-40847
        s = r"""\
pass
\
pass
"""
        try:
            compile(s, '<string>', 'exec')
        except SyntaxError:
            self.fail("Empty line after a line continuation character is valid.")
def test_main():
support.run_unittest(SyntaxTestCase)
from test import test_syntax

@ -0,0 +1,4 @@
Fix a bug where a line with only a line continuation character was not considered a blank line at tokenizer level.
In such cases, more than a single `NEWLINE` token was emitted. The old parser was working around the issue,
but the new parser raised a :exc:`SyntaxError` for valid input because of it. For example, an empty line following
a line continuation character was rejected with a :exc:`SyntaxError`.
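
The blank-line decision described in this entry can be pictured with a small Python model. This is a simplified sketch, not the tokenizer's code: `is_blank_for_tokenizer` and its `linecont_is_blank` flag are hypothetical names, and the model looks only at the first character after the indentation, ignoring interactive mode and everything else `tok_get` does.

```python
def is_blank_for_tokenizer(line: str, linecont_is_blank: bool = True) -> bool:
    """Hypothetical helper: would this physical line be treated as blank?

    Models only the first-character check after the indentation scan.
    `linecont_is_blank=False` approximates the check before this change.
    """
    # Skip leading indentation characters (space, tab, form feed).
    rest = line.lstrip(" \t\f")
    # An exhausted line is treated like a bare newline in this model.
    first = rest[0] if rest else "\n"

    blank_markers = {"#", "\n"}
    if linecont_is_blank:
        # The change: a leading backslash also marks the line as blank.
        blank_markers.add("\\")
    return first in blank_markers


for sample in ["    \\\n", "# comment\n", "\n", "pass\n"]:
    print(repr(sample),
          is_blank_for_tokenizer(sample, linecont_is_blank=False),
          is_blank_for_tokenizer(sample))
```

Only the first sample changes between the two columns: a line whose first non-whitespace character is a backslash moves from "not blank" to "blank", which is the case this entry describes.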

@ -1203,8 +1203,9 @@ tok_get(struct tok_state *tok, const char **p_start, const char **p_end)
}
}
tok_backup(tok, c);
-if (c == '#' || c == '\n') {
+if (c == '#' || c == '\n' || c == '\\') {
/* Lines with only whitespace and/or comments
+and/or a line continuation character
shouldn't affect the indentation and are
not passed to the parser as NEWLINE tokens,
except *totally* empty lines in interactive