Commit graph

871 commits

Author SHA1 Message Date
Guido van Rossum 9d197c7d48
bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21021) 2020-06-27 17:33:49 -07:00
Lysandros Nikolaou 1f0f4abb11
bpo-41076: Pre-feed the parser with the f-string expression location (GH-21054)
This commit changes the parsing of f-string expressions with the new parser. The parser gets pre-fed with the location of the expression itself (not the f-string, which was what we were doing before). This allows us to completely skip the shifting of the AST nodes after the parsing is completed.
2020-06-28 00:41:48 +01:00
Batuhan Taskaya c8f29ad986
bpo-40769: Allow extra surrounding parentheses for invalid annotated assignment rule (GH-20387) 2020-06-27 19:33:08 +01:00
Lysandros Nikolaou 6dcbc2422d
bpo-41132: Use pymalloc allocator in the f-string parser (GH-21173) 2020-06-27 18:47:00 +01:00
Lysandros Nikolaou 2e0a920e9e
bpo-41084: Adjust message when an f-string expression causes a SyntaxError (GH-21084)
Prefix the error message with `fstring: `, when parsing an f-string expression throws a `SyntaxError`.
2020-06-26 12:24:05 +01:00
Lysandros Nikolaou 4b85e60601
bpo-41119: Output correct error message for list/tuple followed by colon (GH-21160) 2020-06-26 00:22:36 +01:00
Lysandros Nikolaou 564cd18767
bpo-40939: Rename PyPegen* functions to PyParser* (GH-21016)
Rename PyPegen* functions to PyParser*, so that we can remove the
old set of PyParser* functions that were using the old parser.
2020-06-22 00:47:46 +01:00
Lysandros Nikolaou 6c4e0bd974
bpo-41060: Avoid SEGFAULT when calling GET_INVALID_TARGET in the grammar (GH-21020)
`GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not
caught will cause a SEGFAULT. Therefore, this commit introduces a new
inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always
checks for `GET_INVALID_TARGET` returning NULL and can be used in
the grammar, replacing the long C ternary operation used till now.
2020-06-21 03:18:01 +01:00
Lysandros Nikolaou 314858e276
bpo-40939: Remove the old parser (Part 2) (GH-21005)
Remove some remaining files and Makefile targets for the old parser
2020-06-20 19:07:25 +01:00
Lysandros Nikolaou 861efc6e8f
bpo-40958: Avoid 'possible loss of data' warning on Windows (GH-20970) 2020-06-20 05:57:27 -07:00
Lysandros Nikolaou 01ece63d42
bpo-40334: Produce better error messages on invalid targets (GH-20106)
The following error messages get produced:
- `cannot delete ...` for invalid `del` targets
- `... is an illegal 'for' target` for invalid targets in for
  statements
- `... is an illegal 'with' target` for invalid targets in
  with statements

Additionally, a few `cut`s were added in various places before the
invocation of the `invalid_*` rule, in order to speed things
up.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-06-19 00:10:43 +01:00
Pablo Galindo 51c5896b62
bpo-40958: Avoid buffer overflow in the parser when indexing the current line (GH-20875) 2020-06-16 16:49:43 +01:00
Pablo Galindo e0bec69854
Remove old comment in string_parser.c (GH-20906) 2020-06-16 02:13:33 +01:00
Victor Stinner e822e37946
bpo-36020: Remove snprintf macro in pyerrors.h (GH-20889)
On Windows, #include "pyerrors.h" no longer defines "snprintf" and
"vsnprintf" macros.

PyOS_snprintf() and PyOS_vsnprintf() should be used to get portable
behavior.

Replace snprintf() calls with PyOS_snprintf() and replace vsnprintf()
calls with PyOS_vsnprintf().
2020-06-15 21:59:47 +02:00
Pablo Galindo fb61c42361
Improve readability and style in parser files (GH-20884) 2020-06-15 14:23:43 +01:00
Pablo Galindo 1ed83adb0e
bpo-40939: Remove the old parser (GH-20768)
This commit removes the old parser, the deprecated parser module, the old parser compatibility flags and environment variables and all associated support code and documentation.
2020-06-11 17:30:46 +01:00
Lysandros Nikolaou bcd7deed91
bpo-40939: Remove PEG parser easter egg (__new_parser__) (#20802)
It no longer serves a purpose (there's only one parser) and having "new" in any name will eventually look odd. Also, it impinges on a potential sub-namespace, `__new_...__`.
2020-06-11 09:09:21 -07:00
Lysandros Nikolaou 896f4cf63f
bpo-40847: Consider a line with only a LINECONT a blank line (GH-20769)
A line with only a line continuation character should be considered
a blank line at tokenizer level so that only a single NEWLINE token
gets emitted. The old parser was working around the issue, but the
new parser threw a `SyntaxError` for valid input. For example,
an empty line following a line continuation character was interpreted
as a `SyntaxError`.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-06-11 00:56:08 +01:00
Victor Stinner 1bcc32f062
bpo-39465: Use _PyInterpreterState_GET() (GH-20788)
Replace _PyThreadState_GET() with _PyInterpreterState_GET() in:

* get_small_int()
* gcmodule.c: add also get_gc_state() function
* _PyTrash_deposit_object()
* _PyTrash_destroy_chain()
* warnings_get_state()
* Py_GetRecursionLimit()

Cleanup listnode.c: add 'parser' variable.
2020-06-10 20:08:26 +02:00
Pablo Galindo c6483c9896
Raise specialised syntax error for invalid lambda parameters (GH-20776) 2020-06-10 14:07:06 +01:00
Pablo Galindo 9f495908c5
bpo-40903: Handle multiple '=' in invalid assignment rules in the PEG parser (GH-20697)
Automerge-Triggered-By: @pablogsal
2020-06-07 18:57:00 -07:00
Pablo Galindo 972ab03276
bpo-40904: Fix segfault in the new parser with f-string containing yield statements with no value (GH-20701) 2020-06-08 01:47:37 +01:00
Pablo Galindo 2e6593db00
bpo-40880: Fix invalid read in newline_in_string in pegen.c (#20666)
* bpo-40880: Fix invalid read in newline_in_string in pegen.c

* Update Parser/pegen/pegen.c

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>

* Add NEWS entry

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-06-06 00:52:27 +01:00
Pablo Galindo a54096e305
bpo-40883: Fix memory leak in fstring_compile_expr in parse_string.c (GH-20667) 2020-06-06 00:52:15 +01:00
Victor Stinner fa7ab6aa0f
bpo-40826: Add _PyOS_InterruptOccurred(tstate) function (GH-20599)
my_fgets() now calls _PyOS_InterruptOccurred(tstate) to check for
pending signals, rather calling PyOS_InterruptOccurred().

my_fgets() is called with the GIL released, whereas
PyOS_InterruptOccurred() must be called with the GIL held.

test_repl: use text=True and avoid SuppressCrashReport in
test_multiline_string_parsing().

Fix my_fgets() on Windows: fgets(fp) does crash if fileno(fp) is closed.
2020-06-03 14:39:59 +02:00
Victor Stinner c353764fd5
bpo-40826: Fix GIL usage in PyOS_Readline() (GH-20579)
Fix GIL usage in PyOS_Readline(): lock the GIL to set an exception.

Pass tstate to my_fgets() and _PyOS_WindowsConsoleReadline(). Cleanup
these functions.
2020-06-01 20:59:35 +02:00
Shantanu c116c94ff1
bpo-40614: Respect feature version for f-string debug expressions (GH-20196)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-27 21:30:38 +01:00
Lysandros Nikolaou 526e23f153
Refactor error handling code in Parser/pegen/pegen.c (GH-20440)
Set p->error_indicator in various places, where it's needed, but it's
not done.

Automerge-Triggered-By: @gvanrossum
2020-05-27 09:04:11 -07:00
Pablo Galindo 1cf15af9a6 bpo-40217: Ensure Py_VISIT(Py_TYPE(self)) is always called for PyType_FromSpec types (reverts GH-19414) (GH-20264)
Heap types now always visit the type in tp_traverse. See added docs for details.

This reverts commit 0169d3003b.

Automerge-Triggered-By: @encukou
2020-05-27 02:03:38 -07:00
Pablo Galindo 404b23b85b
Fix lookahead of soft keywords in the PEG parser (GH-20436)
Automerge-Triggered-By: @gvanrossum
2020-05-26 16:15:52 -07:00
Guido van Rossum b45af1a569
Add soft keywords (GH-20370)
These are like keywords but they only work in context; they are not reserved except when there is an exact match.

This would enable things like match statements without reserving `match` (which would be bad for the `re.match()` function and probably lots of other places).

Automerge-Triggered-By: @gvanrossum
2020-05-26 10:58:44 -07:00
Ammar Askar a2bbedc8b1
Fix peg_generator compiler warnings under MSVC (GH-20405) 2020-05-26 05:33:35 +01:00
Lysandros Nikolaou f7b1e46156
bpo-38964: Print correct filename on a SyntaxError in an fstring (GH-20399)
When a `SyntaxError` in the expression part of a fstring is found,
the filename attribute of the `SyntaxError` is always `<fstring>`.
With this commit, it gets changed to always have the name of the file
the fstring resides in.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-26 01:32:18 +01:00
Pablo Galindo deb4355a37
bpo-40750: Do not expand the new parser debug flags if Py_BUILD_CORE is not defined (GH-20393) 2020-05-25 20:17:12 +01:00
Pablo Galindo 800a35c623
bpo-40750: Support -d flag in the new parser (GH-20340) 2020-05-25 18:38:45 +01:00
Rémi Lapeyre c73914a562
bpo-36290: Fix keytword collision handling in AST node constructors (GH-12382) 2020-05-24 22:12:57 +01:00
Pablo Galindo b23d7adfdf
Use Py_ssize_t for the column number in the PEG support code (GH-20341) 2020-05-24 06:01:34 +01:00
Lysandros Nikolaou ae14583302
bpo-40334: Produce better error messages for non-parenthesized genexps (GH-20153)
The error message, generated for a non-parenthesized generator expression
in function calls, was still the generic `invalid syntax`, when the generator expression wasn't appearing as the first argument in the call. With this patch, even on input like `f(a, b, c for c in d, e)`, the correct error message gets produced.
2020-05-22 01:56:52 +01:00
Batuhan Taskaya b8a65ec1d3
bpo-40715: Reject dict unpacking on dict comprehensions (GH-20292)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-21 23:39:56 +01:00
Batuhan Taskaya 72e0aa2fd2
bpo-40176: Improve error messages for trailing comma on from import (GH-20294) 2020-05-21 21:41:58 +01:00
Pablo Galindo ced4e5c227
Regenerate the parser (#20195) 2020-05-18 23:47:51 +02:00
Lysandros Nikolaou 75b863aa97
bpo-40334: Reproduce error message for type comments on bare '*' in the new parser (GH-20151) 2020-05-18 20:14:47 +01:00
Batuhan Taskaya 63b8e0cba3
bpo-40528: Improve AST generation script to do builds simultaneously (GH-19968)
- Switch from getopt to argparse.
- Removed the limitation of not being able to produce both C and H simultaneously.

This will make it run faster since it parses the asdl definition once and uses the generated tree to generate both the header and the C source.
2020-05-18 18:42:10 +01:00
Lysandros Nikolaou 7b7a21bc4f
bpo-40661: Fix segfault when parsing invalid input (GH-20165)
Fix segfaults when parsing very complex invalid input, like `import äˆ ð£„¯ð¢·žð±‹á”€ð””ð‘©±å®ä±¬ð©¾\n𗶽`.

Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-18 18:32:03 +01:00
Lysandros Nikolaou 2c8cd06afe
bpo-40334: Improvements to error-handling code in the PEG parser (GH-20003)
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-17 04:19:23 +01:00
Pablo Galindo 16ab07063c
bpo-40334: Correctly identify invalid target in assignment errors (GH-20076)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-05-15 02:04:52 +01:00
Lysandros Nikolaou ce21cfca7b
bpo-40618: Disallow invalid targets in augassign and except clauses (GH-20083)
This commit fixes the new parser to disallow invalid targets in the
following scenarios:
- Augmented assignments must only accept a single target (Name,
  Attribute or Subscript), but no tuples or lists.
- `except` clauses should only accept a single `Name` as a target.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-14 21:13:50 +01:00
Pablo Galindo bcc3036095
bpo-40619: Correctly handle error lines in programs without file mode (GH-20090) 2020-05-14 21:11:48 +01:00
Lysandros Nikolaou a15c9b3a05
bpo-40334: Always show the caret on SyntaxErrors (GH-20050)
This commit fixes SyntaxError locations when the caret is not displayed,
by doing the following:

- `col_number` always gets set to the location of the offending
  node/expr. When no caret is to be displayed, this gets achieved
  by setting the object holding the error line to None.

- Introduce a new function `_PyPegen_raise_error_known_location`,
  which can be called, when an arbitrary `lineno`/`col_offset`
  needs to be passed. This function then gets used in the grammar
  (through some new macros and inline functions) so that SyntaxError
  locations of the new parser match that of the old.
2020-05-13 20:36:27 +01:00
Serhiy Storchaka 74ea6b5a75
bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) 2020-05-12 12:42:04 +03:00