cpython/Parser
Eric Snow 81c72044a1
bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
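
The layout described above can be sketched in plain C without Python's internal headers. This is only an illustration of the pattern (string objects stored by value in one statically initialized runtime struct, looked up through a token-pasting macro). Every `demo_`/`DEMO_` name, the simplified `demo_string` type, and the choice of example strings are assumptions made for this sketch, not the actual CPython definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object. The real change embeds
   PyUnicodeObject values, which require CPython's internal headers. */
typedef struct {
    size_t length;
    const char *data;
} demo_string;

/* All global strings are stored by value (not by pointer) in one struct. */
struct demo_global_strings {
    struct {
        demo_string _name;
        demo_string _readline;
    } identifiers;
    struct {
        demo_string _empty;
    } literals;
};

/* Hypothetical analogue of the runtime state that owns the strings. */
struct demo_runtime_state {
    struct demo_global_strings strings;
};

#define DEMO_STRING_INIT(LITERAL) { sizeof(LITERAL) - 1, LITERAL }

/* Statically initialized: no allocation or lazy setup at run time. */
static struct demo_runtime_state demo_runtime = {
    .strings = {
        .identifiers = {
            ._name = DEMO_STRING_INIT("name"),
            ._readline = DEMO_STRING_INIT("readline"),
        },
        .literals = {
            ._empty = DEMO_STRING_INIT(""),
        },
    },
};

/* Lookup macros: paste the bare name onto the field prefix. */
#define DEMO_GET_GLOBAL_IDENTIFIER(NAME) \
    (&demo_runtime.strings.identifiers._ ## NAME)
#define DEMO_GET_GLOBAL_STRING(NAME) \
    (&demo_runtime.strings.literals._ ## NAME)

/* Small helper for comparing a global string with a C string. */
static int demo_string_equals(const demo_string *s, const char *text)
{
    assert(s != NULL && text != NULL);
    return s->length == strlen(text)
        && memcmp(s->data, text, s->length) == 0;
}
```

With this layout, `DEMO_GET_GLOBAL_IDENTIFIER(readline)` expands to the address of a field that already exists when the program starts, so a misspelled name fails at compile time instead of producing a new string at run time.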

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
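
As a rough illustration of that signature change, the sketch below contrasts a function that takes a lazily initialized identifier struct with one that takes the string object directly. It is loosely modeled on the pattern and does not reproduce any CPython function: `demo_object`, `demo_identifier`, `demo_lookup()`, `demo_lookup_id()`, and the lookup table are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PyObject * string keys. */
typedef struct {
    const char *data;
} demo_object;

/* Old pattern (assumed shape): the object is created on first use. */
typedef struct {
    const char *string;
    demo_object *object;   /* NULL until first use */
} demo_identifier;

static demo_object *demo_identifier_get(demo_identifier *id)
{
    assert(id != NULL);
    if (id->object == NULL) {
        /* Allocated once and kept for the life of the process. */
        demo_object *obj = malloc(sizeof(*obj));
        if (obj == NULL) {
            return NULL;
        }
        obj->data = id->string;
        id->object = obj;
    }
    return id->object;
}

/* Hypothetical table that both functions search. */
typedef struct {
    const char *key;
    int value;
} demo_entry;

static const demo_entry demo_table[] = {
    {"readline", 1},
    {"name", 2},
};

/* New-style function: takes the object itself.
   Returns 0 and stores the value on success, -1 if the key is missing. */
static int demo_lookup(const demo_object *key, int *value)
{
    assert(key != NULL && value != NULL);
    for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
        if (strcmp(demo_table[i].key, key->data) == 0) {
            *value = demo_table[i].value;
            return 0;
        }
    }
    return -1;
}

/* Old-style function: resolves the identifier first, then delegates. */
static int demo_lookup_id(demo_identifier *id, int *value)
{
    demo_object *key = demo_identifier_get(id);
    if (key == NULL) {
        return -1;
    }
    return demo_lookup(key, value);
}
```

Callers that already hold a statically initialized object can call the object-taking function directly, which removes the lazy initialization step and its failure path.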

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable since at least one package on PyPI uses this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
action_helpers.c Refactor parser compilation units into specific components (GH-29676) 2021-11-21 01:08:50 +00:00
asdl.py bpo-43651: PEP 597: Fix EncodingWarning in some tests (GH-25142) 2021-04-02 12:53:46 +09:00
asdl_c.py Fixed typo in "decclarations" (GH-28578) 2021-09-28 13:56:41 +03:00
myreadline.c bpo-45434: Move _Py_BEGIN_SUPPRESS_IPH to pycore_fileutils.h (GH-28922) 2021-10-13 15:03:35 +02:00
parser.c bpo-46110: Restore commit e9898bf153 2022-01-03 19:54:06 +00:00
peg_api.c bpo-43244: Remove parser_interface.h header file (GH-25001) 2021-03-24 01:29:09 +01:00
pegen.c bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010) 2022-02-08 11:54:37 +00:00
pegen.h bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010) 2022-02-08 11:54:37 +00:00
pegen_errors.c Fix the caret position in some syntax errors in interactive mode (GH-30718) 2022-01-20 15:34:13 +00:00
Python.asdl bpo-46289: Make conversion of FormattedValue not optional on ASDL (GH-30467) 2022-01-07 13:05:28 -08:00
string_parser.c bpo-46503: Prevent an assert from firing when parsing some invalid \N sequences in f-strings. (GH-30865) 2022-01-24 21:53:27 -05:00
string_parser.h bpo-43244: Remove ast.h, asdl.h, Python-ast.h headers (GH-24933) 2021-03-23 20:47:40 +01:00
token.c bpo-43822: Improve syntax errors for missing commas (GH-25377) 2021-04-15 21:38:45 +01:00
tokenizer.c bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928) 2022-02-08 13:39:07 -07:00
tokenizer.h Ensure the str member of the tokenizer is always initialised (GH-29681) 2021-11-21 02:06:39 +00:00