Commit graph

44 commits

Author SHA1 Message Date
Pablo Galindo Salgado da8f87b7ea
gh-107015: Remove async_hacks from the tokenizer (#107018) 2023-07-26 16:34:15 +01:00
Marta Gómez Macías 6715f91edc
gh-102856: Python tokenizer implementation for PEP 701 (#104323)
This commit replaces the Python implementation of the tokenize module with an implementation
that reuses the real C tokenizer via a private extension module. The tokenize module now implements
a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward
compatibility.

As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via
the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2023-05-21 01:03:02 +01:00
Pablo Galindo Salgado 1ef61cf71a
gh-102856: Initial implementation of PEP 701 (#102855)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
Co-authored-by: Marta Gómez Macías <mgmacias@google.com>
Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
2023-04-19 11:18:16 -05:00
Victor Stinner 1863302d61
gh-97669: Create Tools/build/ directory (#97963)
Create Tools/build/ directory. Move the following scripts from
Tools/scripts/ to Tools/build/:

* check_extension_modules.py
* deepfreeze.py
* freeze_modules.py
* generate_global_objects.py
* generate_levenshtein_examples.py
* generate_opcode_h.py
* generate_re_casefix.py
* generate_sre_constants.py
* generate_stdlib_module_names.py
* generate_token.py
* parse_html5_entities.py
* smelly.py
* stable_abi.py
* umarshal.py
* update_file.py
* verify_ensurepip_wheels.py

Update references to these scripts.
2022-10-17 12:01:00 +02:00
Pablo Galindo b280248be8
bpo-43822: Improve syntax errors for missing commas (GH-25377) 2021-04-15 21:38:45 +01:00
Guido van Rossum 495da29225 bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086)
This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allow tweaking the parser to support older versions of the grammar. In particular if `feature_version` is 5 or 6, the hacks for the `async` and `await` keyword from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.)



https://bugs.python.org/issue35975
2019-03-07 12:38:08 -08:00
Guido van Rossum dcfcd146f8 bpo-35766: Merge typed_ast back into CPython (GH-11645) 2019-01-31 12:40:27 +01:00
Emily Morehouse 8f59ee01be
bpo-35224: PEP 572 Implementation (#10497)
* Add tokenization of :=
- Add token to Include/token.h. Add token to documentation in Doc/library/token.rst.
- Run `./python Lib/token.py` to regenerate Lib/token.py.
- Update Parser/tokenizer.c: add case to handle `:=`.

* Add initial usage of := in grammar.

* Update Python.asdl to match the grammar updates. Regenerated Include/Python-ast.h and Python/Python-ast.c

* Update AST and compiler files in Python/ast.c and Python/compile.c. Basic functionality, this isn't scoped properly

* Regenerate Lib/symbol.py using `./python Lib/symbol.py`

* Tests - Fix failing tests in test_parser.py due to changes in token numbers for internal representation

* Tests - Add simple test for := token

* Tests - Add simple tests for named expressions using expr and suite

* Tests - Update number of levels for nested expressions to prevent stack overflow

* Update symbol table to handle NamedExpr

* Update Grammar to allow assignment expressions in if statements.
Regenerate Python/graminit.c accordingly using `make regen-grammar`

* Tests - Add additional tests for named expressions in RoundtripLegalSyntaxTestCase, based on examples and information directly from PEP 572

Note: failing tests are currently commented out (4 out of 24 tests currently fail)

* Tests - Add temporary syntax test failure tests in test_parser.py

Note: There is an outstanding TODO for this -- syntax tests need to be
moved to a different file (presumably test_syntax.py), but this is
covering what needs to be tested at the moment, and it's more convenient
to run a single test for the time being

* Add support for allowing assignment expressions as function argument annotations. Uncomment tests for these cases because they all pass now!

* Tests - Move existing syntax tests out of test_parser.py and into test_named_expressions.py. Refactor syntax tests to use unittest

* Add TargetScopeError exception to extend SyntaxError

Note: This simply creates the TargetScopeError exception, it is not yet
used anywhere

* Tests - Update tests per PEP 572

Continue refactoring test suite:
The named expression test suite now checks for any invalid cases that
throw exceptions (no longer limited to SyntaxErrors), assignment tests
to ensure that variables are properly assigned, and scope tests to
ensure that variable availability and values are correct

Note:
- There are still tests that are marked to skip, as they are not yet
implemented
- There are approximately 300 lines of the PEP that have not yet been
addressed, though these may be deferred

* Documentation - Small updates to XXX/todo comments

- Remove XXX from child description in ast.c
- Add comment with number of previously supported nested expressions for
3.7.X in test_parser.py

* Fix assert in seq_for_testlist()

* Cleanup - Denote "Not implemented -- No keyword args" on failing test case. Fix PEP8 error for blank lines at beginning of test classes in test_parser.py

* Tests - Wrap all file opens in `with...as` to ensure files are closed

* WIP: handle f(a := 1)

* Tests and Cleanup - No longer skips keyword arg test. Keyword arg test now uses a simpler test case and does not rely on an external file. Remove print statements from ast.c

* Tests - Refactor last remaining test case that relied on on external file to use a simpler test case without the dependency

* Tests - Add better description of remaning skipped tests. Add test checking scope when using assignment expression in a function argument

* Tests - Add test for nested comprehension, testing value and scope. Fix variable name in skipped comprehension scope test

* Handle restriction of LHS for named expressions - can only assign to LHS of type NAME. Specifically, restrict assignment to tuples

This adds an alternative set_context specifically for named expressions,
set_namedexpr_context. Thus, context is now set differently for standard
assignment versus assignment for named expressions in order to handle
restrictions.

* Tests - Update negative test case for assigning to lambda to match new error message. Add negative test case for assigning to tuple

* Tests - Reorder test cases to group invalid syntax cases and named assignment target errors

* Tests - Update test case for named expression in function argument - check that result and variable are set correctly

* Todo - Add todo for TargetScopeError based on Guido's comment (2b3acd37bd (r30472562))

* Tests - Add named expression tests for assignment operator in function arguments

Note: One of two tests are skipped, as function arguments are currently treating
an assignment expression inside of parenthesis as one child, which does
not properly catch the named expression, nor does it count arguments
properly

* Add NamedStore to expr_context. Regenerate related code with `make regen-ast`

* Add usage of NamedStore to ast_for_named_expr in ast.c. Update occurances of checking for Store to also handle NamedStore where appropriate

* Add ste_comprehension to _symtable_entry to track if the namespace is a comprehension. Initialize ste_comprehension to 0. Set set_comprehension to 1 in symtable_handle_comprehension

* s/symtable_add_def/symtable_add_def_helper. Add symtable_add_def to handle grabbing st->st_cur and passing it to symtable_add_def_helper. This now allows us to call the original code from symtable_add_def by instead calling symtable_add_def_helper with a different ste.

* Refactor symtable_record_directive to take lineno and col_offset as arguments instead of stmt_ty. This allows symtable_record_directive to be used for stmt_ty and expr_ty

* Handle elevating scope for named expressions in comprehensions.

* Handle error for usage of named expression inside a class block

* Tests - No longer skip scope tests. Add additional scope tests

* Cleanup - Update error message for named expression within a comprehension within a class. Update comments. Add assert for symtable_extend_namedexpr_scope to validate that we always find at least a ModuleScope if we don't find a Class or FunctionScope

* Cleanup - Add missing case for NamedStore in expr_context_name. Remove unused var in set_namedexpr_content

* Refactor - Consolidate set_context and set_namedexpr_context to reduce duplicated code. Special cases for named expressions are handled by checking if ctx is NamedStore

* Cleanup - Add additional use cases for ast_for_namedexpr in usage comment. Fix multiple blank lines in test_named_expressions

* Tests - Remove unnecessary test case. Renumber test case function names

* Remove TargetScopeError for now. Will add back if needed

* Cleanup - Small comment nit for consistency

* Handle positional argument check with named expression

* Add TargetScopeError exception definition. Add documentation for TargetScopeError in c-api docs. Throw TargetScopeError instead of SyntaxError when using a named expression in a comprehension within a class scope

* Increase stack size for parser by 200. This is a minimal change (approx. 5kb) and should not have an impact on any systems. Update parser test to allow 99 nested levels again

* Add TargetScopeError to exception_hierarchy.txt for test_baseexception.py_

* Tests - Major update for named expression tests, both in test_named_expressions and test_parser

- Add test for TargetScopeError
- Add tests for named expressions in comprehension scope and edge cases
- Add tests for named expressions in function arguments (declarations
and call sites)
- Reorganize tests to group them more logically

* Cleanup - Remove unnecessary comment

* Cleanup - Comment nitpicks

* Explicitly disallow assignment expressions to a name inside parentheses, e.g.: ((x) := 0)

- Add check for LHS types to detect a parenthesis then a name (see note)
- Add test for this scenario
- Update tests for changed error message for named assignment to a tuple
(also, see note)

Note: This caused issues with the previous error handling for named assignment
to a LHS that contained an expression, such as a tuple. Thus, the check
for the LHS of a named expression must be changed to be more specific if
we wish to maintain the previous error messages

* Cleanup - Wrap lines more strictly in test file

* Revert "Explicitly disallow assignment expressions to a name inside parentheses, e.g.: ((x) := 0)"

This reverts commit f1531400ca7d7a2d148830c8ac703f041740896d.

* Add NEWS.d entry

* Tests - Fix error in test_pickle.test_exceptions by adding TargetScopeError to list of exceptions

* Tests - Update error message tests to reflect improved messaging convention (s/can't/cannot)

* Remove cases that cannot be reached in compile.c. Small linting update.

* Update Grammar/Tokens to add COLONEQUAL. Regenerate all files

* Update TargetScopeError PRE_INIT and POST_INIT, as this was purposefully left out when fixing rebase conflicts

* Add NamedStore back and regenerate files

* Pass along line number and end col info for named expression

* Simplify News entry

* Fix compiler warning and explicity mark fallthrough
2019-01-24 16:49:56 -07:00
Serhiy Storchaka 8ac658114d
bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370)
"Include/token.h", "Lib/token.py" (containing now some data moved from
"Lib/tokenize.py") and new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on the read-only sources tree.

"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of been executable itself.

Added new make targets "regen-token" and "regen-symbol" which are now
dependencies of "regen-all".

The documentation contains now strings for operators and punctuation tokens.
2018-12-22 11:18:40 +02:00
Serhiy Storchaka d08972fdb9
bpo-33260: Regenerate token.py after removing ASYNC and AWAIT. (GH-6447) 2018-04-11 19:15:51 +03:00
Albert-Jan Nijburg fc354f0785 bpo-25324: copy tok_name before changing it (#1608)
* add test to check if were modifying token

* copy list so import tokenize doesnt have side effects on token

* shorten line

* add tokenize tokens to token.h to get them to show up in token

* move ERRORTOKEN back to its previous location, and fix nitpick

* copy comments from token.h automatically

* fix whitespace and make more pythonic

* change to fix comments from @haypo

* update token.rst and Misc/NEWS

* change wording

* some more wording changes
2017-05-31 16:00:21 +02:00
Yury Selivanov 7544508f02 PEP 0492 -- Coroutines with async and await syntax. Issue #24017. 2015-05-11 22:57:16 -04:00
Serhiy Storchaka 46ba6c8563 Issue #22831: Use "with" to avoid possible fd leaks. 2015-04-04 11:01:02 +03:00
Benjamin Peterson d51374ed78 PEP 465: a dedicated infix operator for matrix multiplication (closes #21176) 2014-04-09 23:55:56 -04:00
Serhiy Storchaka 8f8ec92de8 Issue #19936: Added executable bits or shebang lines to Python scripts which
requires them.  Disable executable bits and shebang lines in test and
benchmark files in order to prevent using a random system python, and in
source files of modules which don't provide command line interface.  Fixed
shebang lines in the unittestgui and checkpip scripts.
2014-01-16 17:33:23 +02:00
Serhiy Storchaka b992a0e102 Issue #19936: Added executable bits or shebang lines to Python scripts which
requires them.  Disable executable bits and shebang lines in test and
benchmark files in order to prevent using a random system python, and in
source files of modules which don't provide command line interface.  Fixed
shebang line to use python3 executable in the unittestgui script.
2014-01-16 17:15:49 +02:00
Andrew Svetlov f7a17b48d7 Replace IOError with OSError (#16715) 2012-12-25 16:47:37 +02:00
Antoine Pitrou ea3eb88bca Issue #9260: A finer-grained import lock.
Most of the import sequence now uses per-module locks rather than the
global import lock, eliminating well-known issues with threads and imports.
2012-05-17 18:55:59 +02:00
Meador Inge 3388060127 Issue #13629: Renumber the tokens in token.h to match the _PyParser_TokenNames indexes. 2012-01-15 19:15:36 -06:00
Éric Araujo e1886bfaf4 Fix instructions on how to rebuild some modules 2011-11-29 16:45:34 +01:00
Alexander Belopolsky b9d10d08c4 Issue #10386: Added __all__ to token module; this simplifies importing
in tokenize module and prevents leaking of private names through
import *.
2010-11-11 14:07:41 +00:00
Benjamin Peterson 90f5ba538b convert shebang lines: python -> python3 2010-03-11 22:53:45 +00:00
Christian Heimes 836baa53d8 Merged revisions 61038,61042-61045,61047,61050,61053,61055-61056,61061-61064,61066-61080 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r61063 | andrew.kuchling | 2008-02-25 17:29:19 +0100 (Mon, 25 Feb 2008) | 1 line

  Move .setupterm() output so that we don't try to call endwin() if it fails
........
  r61064 | andrew.kuchling | 2008-02-25 17:29:58 +0100 (Mon, 25 Feb 2008) | 1 line

  Use file descriptor for real stdout
........
  r61067 | facundo.batista | 2008-02-25 19:06:00 +0100 (Mon, 25 Feb 2008) | 4 lines


  Issue 2117.  Update compiler module to handle class decorators.
  Thanks Thomas Herve
........
  r61069 | georg.brandl | 2008-02-25 21:17:56 +0100 (Mon, 25 Feb 2008) | 2 lines

  Rename sphinx.addons to sphinx.ext.
........
  r61071 | georg.brandl | 2008-02-25 21:20:45 +0100 (Mon, 25 Feb 2008) | 2 lines

  Revert r61029.
........
  r61072 | facundo.batista | 2008-02-25 23:33:55 +0100 (Mon, 25 Feb 2008) | 4 lines


  Issue 2168. gdbm and dbm needs to be iterable; this fixes a
  failure in the shelve module.  Thanks Thomas Herve.
........
  r61073 | raymond.hettinger | 2008-02-25 23:42:32 +0100 (Mon, 25 Feb 2008) | 1 line

  Make sure the itertools filter functions give the same performance for func=bool as func=None.
........
  r61074 | raymond.hettinger | 2008-02-26 00:17:41 +0100 (Tue, 26 Feb 2008) | 1 line

  Revert part of r60927 which made invalid assumptions about the API offered by db modules.
........
  r61075 | facundo.batista | 2008-02-26 00:46:02 +0100 (Tue, 26 Feb 2008) | 3 lines


  Coerced PyBool_Type to be able to compare it.
........
  r61076 | raymond.hettinger | 2008-02-26 03:46:54 +0100 (Tue, 26 Feb 2008) | 1 line

  Docs for itertools.combinations().  Implementation in forthcoming checkin.
........
  r61077 | neal.norwitz | 2008-02-26 05:50:37 +0100 (Tue, 26 Feb 2008) | 3 lines

  Don't use a hard coded port.  This test could hang/fail if the port is in use.
  Speed this test up by avoiding a sleep and using the event.
........
  r61078 | neal.norwitz | 2008-02-26 06:12:50 +0100 (Tue, 26 Feb 2008) | 1 line

  Whitespace normalization
........
  r61079 | neal.norwitz | 2008-02-26 06:23:51 +0100 (Tue, 26 Feb 2008) | 1 line

  Whitespace normalization
........
  r61080 | georg.brandl | 2008-02-26 07:40:10 +0100 (Tue, 26 Feb 2008) | 2 lines

  Banish tab.
........
2008-02-26 08:18:30 +00:00
Georg Brandl dde002899d Make ELLIPSIS a separate token. This makes it a syntax error to write ". . ." for Ellipsis. 2007-03-18 19:01:53 +00:00
Guido van Rossum c145ef3728 Use better idiom to sort keys. 2007-02-26 14:08:27 +00:00
Georg Brandl b79b78111c Fix token.py main code vs. dict views. 2007-02-26 09:41:19 +00:00
Guido van Rossum cc2b016125 - PEP 3106: dict.iterkeys(), .iteritems(), .itervalues() are now gone;
and .keys(), .items(), .values() return dict views.

The dict views aren't fully functional yet; in particular, they can't
be compared to sets yet.  but they are useful as "iterator wells".

There are still 27 failing unit tests; I expect that many of these
have fairly trivial fixes, but there are so many, I could use help.
2007-02-11 06:12:03 +00:00
Guido van Rossum b940e113bf SF patch 1631942 by Collin Winter:
(a) "except E, V" -> "except E as V"
(b) V is now limited to a simple name (local variable)
(c) V is now deleted at the end of the except block
2007-01-10 16:19:56 +00:00
Neal Norwitz c150536b5e PEP 3107 - Function Annotations thanks to Tony Lownds 2006-12-28 06:47:50 +00:00
Neal Norwitz 2eca440c8d Get rid of some more cases of backquotes. parsermodule.c doesn't compile
but looks like that was a problem before this change.
2006-08-29 04:40:24 +00:00
Anthony Baxter c2a5a63654 PEP-0318, @decorator-style. In Guido's words:
"@ seems the syntax that everybody can hate equally"
Implementation by Mark Russell, from SF #979728.
2004-08-02 06:10:11 +00:00
Michael W. Hudson adf1606161 Updates to track Grammar changes. The patch to token.py loosens the regexp to
allow "testlist1" to be snagged.
2002-10-03 09:42:01 +00:00
Guido van Rossum 8f15bd8500 Remove redundant 'import string' (PyChecker). 2001-08-13 15:48:06 +00:00
Tim Peters a38d2608bc Regenerated token.py to account for new DOUBLESLASH and DOUBLESLASHEQUAL. 2001-08-08 06:35:56 +00:00
Eric S. Raymond 6e025bcde8 String method cleanup. 2001-02-10 00:22:33 +00:00
Eric S. Raymond b08b2d3166 String method conversion. 2001-02-09 11:10:16 +00:00
Thomas Wouters 34052622c9 Update for augmented assignment. 2000-08-24 21:08:39 +00:00
Guido van Rossum e7b146fb3b The third and final doc-string sweep by Ka-Ping Yee.
The attached patches update the standard library so that all modules
have docstrings beginning with one-line summaries.

A new docstring was added to formatter.  The docstring for os.py
was updated to mention nt, os2, ce in addition to posix, dos, mac.
2000-02-04 15:28:42 +00:00
Guido van Rossum 45e2fbc2e7 Mass check-in after untabifying all files that need it. 1998-03-26 21:13:24 +00:00
Guido van Rossum 9694fcab53 Convert all remaining *simple* cases of regex usage to re usage. 1997-10-22 21:00:49 +00:00
Fred Drake e3dbc7e422 Reduced number of temporary names used at module scope. Use underscores in
front of temporary names in the module namespace.
1997-10-06 21:28:04 +00:00
Guido van Rossum 4747887880 New batch from Fred 1996-08-21 14:32:37 +00:00
Guido van Rossum 154a539460 Changes for new parser module (Fred Drake) 1996-07-21 02:17:52 +00:00
Guido van Rossum b31c7f732a * test_select.py: (some) tests for built-in select module
* test_grammar.py, testall.out: added test for funny things in string literals
* token.py, symbol.py: definitions used with built-in parser module.
* tokenize.py: added double-quote recognition
1993-11-11 10:31:23 +00:00