bpo-44029: Remove Py_UNICODE APIs (GH-25881)

Remove deprecated `Py_UNICODE` APIs: `PyUnicode_Encode`,
`PyUnicode_EncodeUTF7`, `PyUnicode_EncodeUTF8`,
`PyUnicode_EncodeUTF16`, `PyUnicode_EncodeUTF32`,
`PyUnicode_EncodeLatin1`, `PyUnicode_EncodeMBCS`,
`PyUnicode_EncodeDecimal`, `PyUnicode_EncodeRawUnicodeEscape`,
`PyUnicode_EncodeCharmap`, `PyUnicode_EncodeUnicodeEscape`,
`PyUnicode_TransformDecimalToASCII`, `PyUnicode_TranslateCharmap`,
`PyUnicodeEncodeError_Create`, `PyUnicodeTranslateError_Create`.

See :pep:`393` and :pep:`624` for reference.
Inada Naoki 2021-05-07 15:58:29 +09:00 committed by GitHub
parent 4ebf4a6bfa
commit 9ad8f109ac
11 changed files with 15 additions and 840 deletions
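
The removed encoders all took a raw `Py_UNICODE` buffer. As the deleted implementations further down show, they only wrapped that buffer in a `str` object and then called the object-based encoder. The following is a minimal Python-level sketch of that replacement path, where `str.encode()` stands in for `PyUnicode_AsEncodedString()`. The table `REPLACEMENT_CODEC` and the helper `encode_like_removed_api` are hypothetical names used only for illustration and are not part of CPython.

```python
# Hypothetical lookup table: removed buffer-based encoder -> codec name that
# can be passed to PyUnicode_AsEncodedString() (str.encode() at Python level).
REPLACEMENT_CODEC = {
    "PyUnicode_EncodeUTF7": "utf-7",
    "PyUnicode_EncodeUTF8": "utf-8",
    "PyUnicode_EncodeUTF16": "utf-16",
    "PyUnicode_EncodeUTF32": "utf-32",
    "PyUnicode_EncodeLatin1": "latin-1",
    "PyUnicode_EncodeASCII": "ascii",
    "PyUnicode_EncodeUnicodeEscape": "unicode_escape",
    "PyUnicode_EncodeRawUnicodeEscape": "raw_unicode_escape",
}


def encode_like_removed_api(name, text, errors="strict"):
    """Encode an existing str object with the codec the removed API used."""
    return text.encode(REPLACEMENT_CODEC[name], errors)


if __name__ == "__main__":
    for api_name in REPLACEMENT_CODEC:
        print(api_name, encode_like_removed_api(api_name, "caf\u00e9"))
```

In C the same step would start from an existing `PyObject *` string rather than a `Py_UNICODE` buffer, so the temporary object the old wrappers created is no longer needed.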

@ -675,27 +675,6 @@ The following functions are used to create and modify Unicode exceptions from C.
*object*, *length*, *start*, *end* and *reason*. *encoding* and *reason* are
UTF-8 encoded strings.
.. c:function:: PyObject* PyUnicodeEncodeError_Create(const char *encoding, const Py_UNICODE *object, Py_ssize_t length, Py_ssize_t start, Py_ssize_t end, const char *reason)
Create a :class:`UnicodeEncodeError` object with the attributes *encoding*,
*object*, *length*, *start*, *end* and *reason*. *encoding* and *reason* are
UTF-8 encoded strings.
.. deprecated:: 3.3 3.11
``Py_UNICODE`` is deprecated since Python 3.3. Please migrate to
``PyObject_CallFunction(PyExc_UnicodeEncodeError, "sOnns", ...)``.
.. c:function:: PyObject* PyUnicodeTranslateError_Create(const Py_UNICODE *object, Py_ssize_t length, Py_ssize_t start, Py_ssize_t end, const char *reason)
Create a :class:`UnicodeTranslateError` object with the attributes *object*,
*length*, *start*, *end* and *reason*. *reason* is a UTF-8 encoded string.
.. deprecated:: 3.3 3.11
``Py_UNICODE`` is deprecated since Python 3.3. Please migrate to
``PyObject_CallFunction(PyExc_UnicodeTranslateError, "Onns", ...)``.
.. c:function:: PyObject* PyUnicodeDecodeError_GetEncoding(PyObject *exc)
PyObject* PyUnicodeEncodeError_GetEncoding(PyObject *exc)
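
The removed documentation points to `PyObject_CallFunction(PyExc_UnicodeEncodeError, "sOnns", ...)` and `PyObject_CallFunction(PyExc_UnicodeTranslateError, "Onns", ...)` as replacements. A rough Python-level sketch of the same two calls is shown below. The sample string and reason texts are made up for illustration. Note that the separate *length* argument of the old functions has no counterpart, because the string object carries its own length.

```python
# Python-level equivalent of the "sOnns" call:
#   encoding (s), object (O), start (n), end (n), reason (s)
text = "caf\u00e9"
enc_err = UnicodeEncodeError("ascii", text, 3, 4, "ordinal not in range(128)")

# Python-level equivalent of the "Onns" call:
#   object (O), start (n), end (n), reason (s)
tr_err = UnicodeTranslateError(text, 3, 4, "character maps to <undefined>")

if __name__ == "__main__":
    print(repr(enc_err))
    print(repr(tr_err))
```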

@ -719,17 +719,6 @@ Extension modules can continue using them, as they will not be removed in Python
:c:func:`PyUnicode_ReadChar` or similar new APIs.
.. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
Create a Unicode object by replacing all decimal digits in
:c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
according to their decimal value. Return ``NULL`` if an exception occurs.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`Py_UNICODE_TODECIMAL`.
.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:func:`Py_UNICODE`
@ -1038,20 +1027,6 @@ These are the generic codec APIs:
the codec.
.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, \
const char *encoding, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python
bytes object. *encoding* and *errors* have the same meaning as the
parameters of the same name in the Unicode :meth:`~str.encode` method. The codec
to be used is looked up using the Python codec registry. Return ``NULL`` if an
exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsEncodedString`.
UTF-8 Codecs
""""""""""""
@ -1114,18 +1089,6 @@ These are the UTF-8 codec APIs:
The return type is now ``const char *`` rather than ``char *``.
.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
return a Python bytes object. Return ``NULL`` if an exception was raised by
the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsUTF8String`, :c:func:`PyUnicode_AsUTF8AndSize` or
:c:func:`PyUnicode_AsEncodedString`.
UTF-32 Codecs
"""""""""""""
@ -1176,29 +1139,6 @@ These are the UTF-32 codec APIs:
Return ``NULL`` if an exception was raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, \
const char *errors, int byteorder)
Return a Python bytes object holding the UTF-32 encoded value of the Unicode
data in *s*. Output is written according to the following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)
byteorder == 1: big endian
If byteorder is ``0``, the output string will always start with the Unicode BOM
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If ``Py_UNICODE_WIDE`` is not defined, surrogate pairs will be output
as a single code point.
Return ``NULL`` if an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsUTF32String` or :c:func:`PyUnicode_AsEncodedString`.
UTF-16 Codecs
"""""""""""""
@ -1250,30 +1190,6 @@ These are the UTF-16 codec APIs:
Return ``NULL`` if an exception was raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, \
const char *errors, int byteorder)
Return a Python bytes object holding the UTF-16 encoded value of the Unicode
data in *s*. Output is written according to the following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)
byteorder == 1: big endian
If byteorder is ``0``, the output string will always start with the Unicode BOM
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If ``Py_UNICODE_WIDE`` is defined, a single :c:type:`Py_UNICODE` value may get
represented as a surrogate pair. If it is not defined, each :c:type:`Py_UNICODE`
value is interpreted as a UCS-2 character.
Return ``NULL`` if an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsUTF16String` or :c:func:`PyUnicode_AsEncodedString`.
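
The *byteorder* argument of the removed UTF-16 and UTF-32 encoders has no direct counterpart in `PyUnicode_AsUTF16String` and `PyUnicode_AsUTF32String`, which always write a BOM in native byte order. One way to keep an explicit byte order is to pass a codec name such as `utf-16-le` to `PyUnicode_AsEncodedString`. The sketch below shows the three cases at the Python level, with `str.encode()` standing in for the C call. The sample string is chosen only for illustration.

```python
import sys

text = "A\U0001F600"  # one BMP character plus one non-BMP character

utf16_le = text.encode("utf-16-le")   # like byteorder == -1: no BOM
utf16_be = text.encode("utf-16-be")   # like byteorder == 1: no BOM
utf16_native = text.encode("utf-16")  # like byteorder == 0: BOM, native order

utf32_le = text.encode("utf-32-le")   # fixed 4 bytes per code point, no BOM

little = sys.byteorder == "little"
bom16 = b"\xff\xfe" if little else b"\xfe\xff"

if __name__ == "__main__":
    print(utf16_le.hex(), utf16_be.hex(), utf16_native.hex(), utf32_le.hex())
```

In UTF-16 the non-BMP character is written as a surrogate pair (two 16-bit units), while UTF-32 stores it as a single 32-bit unit.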
UTF-7 Codecs
""""""""""""
@ -1295,23 +1211,6 @@ These are the UTF-7 codec APIs:
bytes that have been decoded will be stored in *consumed*.
.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, \
int base64SetO, int base64WhiteSpace, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-7 and
return a Python bytes object. Return ``NULL`` if an exception was raised by
the codec.
If *base64SetO* is nonzero, "Set O" (punctuation that has no otherwise
special meaning) will be encoded in base-64. If *base64WhiteSpace* is
nonzero, whitespace will be encoded in base-64. Both are set to zero for the
Python "utf-7" codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsEncodedString`.
Unicode-Escape Codecs
"""""""""""""""""""""
@ -1332,16 +1231,6 @@ These are the "Unicode Escape" codec APIs:
raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
return a bytes object. Return ``NULL`` if an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsUnicodeEscapeString`.
Raw-Unicode-Escape Codecs
"""""""""""""""""""""""""
@ -1362,18 +1251,6 @@ These are the "Raw Unicode Escape" codec APIs:
was raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, \
Py_ssize_t size)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
and return a bytes object. Return ``NULL`` if an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsRawUnicodeEscapeString` or
:c:func:`PyUnicode_AsEncodedString`.
Latin-1 Codecs
""""""""""""""
@ -1394,18 +1271,6 @@ ordinals and only these are accepted by the codecs during encoding.
raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and
return a Python bytes object. Return ``NULL`` if an exception was raised by
the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsLatin1String` or
:c:func:`PyUnicode_AsEncodedString`.
ASCII Codecs
""""""""""""
@ -1426,18 +1291,6 @@ codes generate errors.
raised by the codec.
.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and
return a Python bytes object. Return ``NULL`` if an exception was raised by
the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsASCIIString` or
:c:func:`PyUnicode_AsEncodedString`.
Character Map Codecs
""""""""""""""""""""
@ -1477,19 +1330,6 @@ These are the mapping codec APIs:
``None`` are treated as "undefined mapping" and cause an error.
.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \
PyObject *mapping, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
*mapping* object and return the result as a bytes object. Return ``NULL`` if
an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsCharmapString` or
:c:func:`PyUnicode_AsEncodedString`.
The following codec API is special in that it maps Unicode to Unicode.
.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
@ -1509,19 +1349,6 @@ The following codec API is special in that maps Unicode to Unicode.
use the default error handling.
.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \
PyObject *mapping, const char *errors)
Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
character *mapping* table to it and return the resulting Unicode object.
Return ``NULL`` when an exception was raised by the codec.
.. deprecated-removed:: 3.3 3.11
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_Translate` or the :ref:`generic codec based API
<codec-registry>`.
MBCS codecs for Windows
"""""""""""""""""""""""
@ -1561,18 +1388,6 @@ the user settings on the machine running the codec.
.. versionadded:: 3.3
.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return
a Python bytes object. Return ``NULL`` if an exception was raised by the
codec.
.. deprecated-removed:: 3.3 4.0
Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
:c:func:`PyUnicode_AsMBCSString`, :c:func:`PyUnicode_EncodeCodePage` or
:c:func:`PyUnicode_AsEncodedString`.
Methods & Slots
"""""""""""""""

@ -2427,10 +2427,6 @@ PyUnicode_FromUnicode:Py_ssize_t:size::
PyUnicode_AsUnicode:Py_UNICODE*:::
PyUnicode_AsUnicode:PyObject*:unicode:0:
PyUnicode_TransformDecimalToASCII:PyObject*::+1:
PyUnicode_TransformDecimalToASCII:Py_UNICODE*:s::
PyUnicode_TransformDecimalToASCII:Py_ssize_t:size::
PyUnicode_AsUnicodeAndSize:Py_UNICODE*:::
PyUnicode_AsUnicodeAndSize:PyObject*:unicode:0:
PyUnicode_AsUnicodeAndSize:Py_ssize_t*:size::
@ -2478,12 +2474,6 @@ PyUnicode_DecodeUTF8Stateful:Py_ssize_t:size::
PyUnicode_DecodeUTF8Stateful:const char*:errors::
PyUnicode_DecodeUTF8Stateful:Py_ssize_t*:consumed::
PyUnicode_Encode:PyObject*::+1:
PyUnicode_Encode:const Py_UNICODE*:s::
PyUnicode_Encode:Py_ssize_t:size::
PyUnicode_Encode:const char*:encoding::
PyUnicode_Encode:const char*:errors::
PyUnicode_AsEncodedString:PyObject*::+1:
PyUnicode_AsEncodedString:PyObject*:unicode:0:
PyUnicode_AsEncodedString:const char*:encoding::
@ -2500,23 +2490,11 @@ PyUnicode_DecodeUTF7Stateful:Py_ssize_t:size::
PyUnicode_DecodeUTF7Stateful:const char*:errors::
PyUnicode_DecodeUTF7Stateful:Py_ssize_t*:consumed::
PyUnicode_EncodeUTF7:PyObject*::+1:
PyUnicode_EncodeUTF7:const Py_UNICODE*:s::
PyUnicode_EncodeUTF7:Py_ssize_t:size::
PyUnicode_EncodeUTF7:int:base64SetO::
PyUnicode_EncodeUTF7:int:base64WhiteSpace::
PyUnicode_EncodeUTF7:const char*:errors::
PyUnicode_DecodeUTF8:PyObject*::+1:
PyUnicode_DecodeUTF8:const char*:s::
PyUnicode_DecodeUTF8:Py_ssize_t:size::
PyUnicode_DecodeUTF8:const char*:errors::
PyUnicode_EncodeUTF8:PyObject*::+1:
PyUnicode_EncodeUTF8:const Py_UNICODE*:s::
PyUnicode_EncodeUTF8:Py_ssize_t:size::
PyUnicode_EncodeUTF8:const char*:errors::
PyUnicode_AsUTF8String:PyObject*::+1:
PyUnicode_AsUTF8String:PyObject*:unicode:0:
@ -2533,12 +2511,6 @@ PyUnicode_DecodeUTF16:Py_ssize_t:size::
PyUnicode_DecodeUTF16:const char*:errors::
PyUnicode_DecodeUTF16:int*:byteorder::
PyUnicode_EncodeUTF16:PyObject*::+1:
PyUnicode_EncodeUTF16:const Py_UNICODE*:s::
PyUnicode_EncodeUTF16:Py_ssize_t:size::
PyUnicode_EncodeUTF16:const char*:errors::
PyUnicode_EncodeUTF16:int:byteorder::
PyUnicode_AsUTF16String:PyObject*::+1:
PyUnicode_AsUTF16String:PyObject*:unicode:0:
@ -2558,21 +2530,11 @@ PyUnicode_DecodeUTF32Stateful:Py_ssize_t*:consumed::
PyUnicode_AsUTF32String:PyObject*::+1:
PyUnicode_AsUTF32String:PyObject*:unicode:0:
PyUnicode_EncodeUTF32:PyObject*::+1:
PyUnicode_EncodeUTF32:const Py_UNICODE*:s::
PyUnicode_EncodeUTF32:Py_ssize_t:size::
PyUnicode_EncodeUTF32:const char*:errors::
PyUnicode_EncodeUTF32:int:byteorder::
PyUnicode_DecodeUnicodeEscape:PyObject*::+1:
PyUnicode_DecodeUnicodeEscape:const char*:s::
PyUnicode_DecodeUnicodeEscape:Py_ssize_t:size::
PyUnicode_DecodeUnicodeEscape:const char*:errors::
PyUnicode_EncodeUnicodeEscape:PyObject*::+1:
PyUnicode_EncodeUnicodeEscape:const Py_UNICODE*:s::
PyUnicode_EncodeUnicodeEscape:Py_ssize_t:size::
PyUnicode_AsUnicodeEscapeString:PyObject*::+1:
PyUnicode_AsUnicodeEscapeString:PyObject*:unicode:0:
@ -2581,10 +2543,6 @@ PyUnicode_DecodeRawUnicodeEscape:const char*:s::
PyUnicode_DecodeRawUnicodeEscape:Py_ssize_t:size::
PyUnicode_DecodeRawUnicodeEscape:const char*:errors::
PyUnicode_EncodeRawUnicodeEscape:PyObject*::+1:
PyUnicode_EncodeRawUnicodeEscape:const Py_UNICODE*:s::
PyUnicode_EncodeRawUnicodeEscape:Py_ssize_t:size::
PyUnicode_AsRawUnicodeEscapeString:PyObject*::+1:
PyUnicode_AsRawUnicodeEscapeString:PyObject*:unicode:0:
@ -2593,11 +2551,6 @@ PyUnicode_DecodeLatin1:const char*:s::
PyUnicode_DecodeLatin1:Py_ssize_t:size::
PyUnicode_DecodeLatin1:const char*:errors::
PyUnicode_EncodeLatin1:PyObject*::+1:
PyUnicode_EncodeLatin1:const Py_UNICODE*:s::
PyUnicode_EncodeLatin1:Py_ssize_t:size::
PyUnicode_EncodeLatin1:const char*:errors::
PyUnicode_AsLatin1String:PyObject*::+1:
PyUnicode_AsLatin1String:PyObject*:unicode:0:
@ -2606,11 +2559,6 @@ PyUnicode_DecodeASCII:const char*:s::
PyUnicode_DecodeASCII:Py_ssize_t:size::
PyUnicode_DecodeASCII:const char*:errors::
PyUnicode_EncodeASCII:PyObject*::+1:
PyUnicode_EncodeASCII:const Py_UNICODE*:s::
PyUnicode_EncodeASCII:Py_ssize_t:size::
PyUnicode_EncodeASCII:const char*:errors::
PyUnicode_AsASCIIString:PyObject*::+1:
PyUnicode_AsASCIIString:PyObject*:unicode:0:
@ -2620,22 +2568,10 @@ PyUnicode_DecodeCharmap:Py_ssize_t:size::
PyUnicode_DecodeCharmap:PyObject*:mapping:0:
PyUnicode_DecodeCharmap:const char*:errors::
PyUnicode_EncodeCharmap:PyObject*::+1:
PyUnicode_EncodeCharmap:const Py_UNICODE*:s::
PyUnicode_EncodeCharmap:Py_ssize_t:size::
PyUnicode_EncodeCharmap:PyObject*:mapping:0:
PyUnicode_EncodeCharmap:const char*:errors::
PyUnicode_AsCharmapString:PyObject*::+1:
PyUnicode_AsCharmapString:PyObject*:unicode:0:
PyUnicode_AsCharmapString:PyObject*:mapping:0:
PyUnicode_TranslateCharmap:PyObject*::+1:
PyUnicode_TranslateCharmap:const Py_UNICODE*:s::
PyUnicode_TranslateCharmap:Py_ssize_t:size::
PyUnicode_TranslateCharmap:PyObject*:mapping:0:
PyUnicode_TranslateCharmap:const char*:errors::
PyUnicode_DecodeMBCS:PyObject*::+1:
PyUnicode_DecodeMBCS:const char*:s::
PyUnicode_DecodeMBCS:Py_ssize_t:size::
@ -2652,11 +2588,6 @@ PyUnicode_EncodeCodePage:int:code_page::
PyUnicode_EncodeCodePage:PyObject*:unicode:0:
PyUnicode_EncodeCodePage:const char*:errors::
PyUnicode_EncodeMBCS:PyObject*::+1:
PyUnicode_EncodeMBCS:const Py_UNICODE*:s::
PyUnicode_EncodeMBCS:Py_ssize_t:size::
PyUnicode_EncodeMBCS:const char*:errors::
PyUnicode_AsMBCSString:PyObject*::+1:
PyUnicode_AsMBCSString:PyObject*:unicode:0:
@ -2891,21 +2822,6 @@ PyUnicodeDecodeError_SetStart:int:::
PyUnicodeDecodeError_SetStart:PyObject*:exc:0:
PyUnicodeDecodeError_SetStart:Py_ssize_t:start::
PyUnicodeEncodeError_Create:PyObject*::+1:
PyUnicodeEncodeError_Create:const char*:encoding::
PyUnicodeEncodeError_Create:const Py_UNICODE*:object::
PyUnicodeEncodeError_Create:Py_ssize_t:length::
PyUnicodeEncodeError_Create:Py_ssize_t:start::
PyUnicodeEncodeError_Create:Py_ssize_t:end::
PyUnicodeEncodeError_Create:const char*:reason::
PyUnicodeTranslateError_Create:PyObject*::+1:
PyUnicodeTranslateError_Create:const Py_UNICODE*:object::
PyUnicodeTranslateError_Create:Py_ssize_t:length::
PyUnicodeTranslateError_Create:Py_ssize_t:start::
PyUnicodeTranslateError_Create:Py_ssize_t:end::
PyUnicodeTranslateError_Create:const char*:reason::
PyWeakref_Check:int:::
PyWeakref_Check:PyObject*:ob::

@ -161,30 +161,6 @@ PyAPI_FUNC(PyObject *) PyErr_ProgramTextObject(
PyObject *filename,
int lineno);
/* Create a UnicodeEncodeError object.
*
* TODO: This API will be removed in Python 3.11.
*/
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicodeEncodeError_Create(
const char *encoding, /* UTF-8 encoded string */
const Py_UNICODE *object,
Py_ssize_t length,
Py_ssize_t start,
Py_ssize_t end,
const char *reason /* UTF-8 encoded string */
);
/* Create a UnicodeTranslateError object.
*
* TODO: This API will be removed in Python 3.11.
*/
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicodeTranslateError_Create(
const Py_UNICODE *object,
Py_ssize_t length,
Py_ssize_t start,
Py_ssize_t end,
const char *reason /* UTF-8 encoded string */
);
PyAPI_FUNC(PyObject *) _PyUnicodeTranslateError_Create(
PyObject *object,
Py_ssize_t start,

@ -743,27 +743,8 @@ PyAPI_FUNC(const char *) PyUnicode_AsUTF8(PyObject *unicode);
#define _PyUnicode_AsString PyUnicode_AsUTF8
/* --- Generic Codecs ----------------------------------------------------- */
/* Encodes a Py_UNICODE buffer of the given size and returns a
Python string object. */
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_Encode(
const Py_UNICODE *s, /* Unicode char buffer */
Py_ssize_t size, /* number of Py_UNICODE chars to encode */
const char *encoding, /* encoding */
const char *errors /* error handling */
);
/* --- UTF-7 Codecs ------------------------------------------------------- */
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF7(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* number of Py_UNICODE chars to encode */
int base64SetO, /* Encode RFC2152 Set O characters in base64 */
int base64WhiteSpace, /* Encode whitespace (sp, ht, nl, cr) in base64 */
const char *errors /* error handling */
);
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF7(
PyObject *unicode, /* Unicode object */
int base64SetO, /* Encode RFC2152 Set O characters in base64 */
@ -777,21 +758,8 @@ PyAPI_FUNC(PyObject*) _PyUnicode_AsUTF8String(
PyObject *unicode,
const char *errors);
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF8(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* number of Py_UNICODE chars to encode */
const char *errors /* error handling */
);
/* --- UTF-32 Codecs ------------------------------------------------------ */
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF32(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* number of Py_UNICODE chars to encode */
const char *errors, /* error handling */
int byteorder /* byteorder to use 0=BOM+native;-1=LE,1=BE */
);
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF32(
PyObject *object, /* Unicode object */
const char *errors, /* error handling */
@ -813,19 +781,7 @@ PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF32(
If byteorder is 0, the output string will always start with the
Unicode BOM mark (U+FEFF). In the other two modes, no BOM mark is
prepended.
Note that Py_UNICODE data is being interpreted as UTF-16 reduced to
UCS-2. This trick makes it possible to add full UTF-16 capabilities
at a later point without compromising the APIs.
*/
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUTF16(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* number of Py_UNICODE chars to encode */
const char *errors, /* error handling */
int byteorder /* byteorder to use 0=BOM+native;-1=LE,1=BE */
);
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeUTF16(
PyObject* unicode, /* Unicode object */
const char *errors, /* error handling */
@ -845,60 +801,22 @@ PyAPI_FUNC(PyObject*) _PyUnicode_DecodeUnicodeEscape(
string. */
);
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeUnicodeEscape(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length /* Number of Py_UNICODE chars to encode */
);
/* --- Raw-Unicode-Escape Codecs ------------------------------------------ */
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeRawUnicodeEscape(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length /* Number of Py_UNICODE chars to encode */
);
/* --- Latin-1 Codecs ----------------------------------------------------- */
PyAPI_FUNC(PyObject*) _PyUnicode_AsLatin1String(
PyObject* unicode,
const char* errors);
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeLatin1(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
const char *errors /* error handling */
);
/* --- ASCII Codecs ------------------------------------------------------- */
PyAPI_FUNC(PyObject*) _PyUnicode_AsASCIIString(
PyObject* unicode,
const char* errors);
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeASCII(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
const char *errors /* error handling */
);
/* --- Character Map Codecs ----------------------------------------------- */
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeCharmap(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
);
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
PyObject *unicode, /* Unicode object */
PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
);
/* Translate a Py_UNICODE buffer of the given length by applying a
character mapping table to it and return the resulting Unicode
object.
/* Translate a Unicode object by applying a character mapping table to
it and return the resulting Unicode object.
The mapping table must map Unicode ordinal integers to Unicode strings,
Unicode ordinal integers or None (causing deletion of the character).
@ -906,68 +824,15 @@ PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
Mapping tables may be dictionaries or sequences. Unmapped character
ordinals (ones which cause a LookupError) are left untouched and
are copied as-is.
*/
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicode_TranslateCharmap(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
PyObject *table, /* Translate table */
PyAPI_FUNC(PyObject*) _PyUnicode_EncodeCharmap(
PyObject *unicode, /* Unicode object */
PyObject *mapping, /* encoding mapping */
const char *errors /* error handling */
);
/* --- MBCS codecs for Windows -------------------------------------------- */
#ifdef MS_WINDOWS
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_EncodeMBCS(
const Py_UNICODE *data, /* Unicode char buffer */
Py_ssize_t length, /* number of Py_UNICODE chars to encode */
const char *errors /* error handling */
);
#endif
/* --- Decimal Encoder ---------------------------------------------------- */
/* Takes a Unicode string holding a decimal value and writes it into
an output buffer using standard ASCII digit codes.
The output buffer has to provide at least length+1 bytes of storage
area. The output string is 0-terminated.
The encoder converts whitespace to ' ', decimal characters to their
corresponding ASCII digit and all other Latin-1 characters except
\0 as-is. Characters outside this range (Unicode ordinals 1-256)
are treated as errors. This includes embedded NULL bytes.
Error handling is defined by the errors argument:
NULL or "strict": raise a ValueError
"ignore": ignore the wrong characters (these are not copied to the
output buffer)
"replace": replaces illegal characters with '?'
Returns 0 on success, -1 on failure.
*/
Py_DEPRECATED(3.3) PyAPI_FUNC(int) PyUnicode_EncodeDecimal(
Py_UNICODE *s, /* Unicode buffer */
Py_ssize_t length, /* Number of Py_UNICODE chars to encode */
char *output, /* Output buffer; must have size >= length */
const char *errors /* error handling */
);
/* Transforms code points that have decimal digit property to the
corresponding ASCII digit code points.
Returns a new Unicode string on success, NULL on failure.
*/
Py_DEPRECATED(3.3)
PyAPI_FUNC(PyObject*) PyUnicode_TransformDecimalToASCII(
Py_UNICODE *s, /* Unicode buffer */
Py_ssize_t length /* Number of Py_UNICODE chars to transform */
);
/* Converts a Unicode object holding a decimal value to an ASCII string
for using in int, float and complex parsers.
Transforms code points that have decimal digit property to the

@ -2935,40 +2935,6 @@ def test_copycharacters(self):
self.assertRaises(SystemError, unicode_copycharacters, s, 0, s, 0, -1)
self.assertRaises(SystemError, unicode_copycharacters, s, 0, b'', 0, 0)
@support.cpython_only
@support.requires_legacy_unicode_capi
def test_encode_decimal(self):
from _testcapi import unicode_encodedecimal
with warnings_helper.check_warnings():
warnings.simplefilter('ignore', DeprecationWarning)
self.assertEqual(unicode_encodedecimal('123'),
b'123')
self.assertEqual(unicode_encodedecimal('\u0663.\u0661\u0664'),
b'3.14')
self.assertEqual(unicode_encodedecimal(
"\N{EM SPACE}3.14\N{EN SPACE}"), b' 3.14 ')
self.assertRaises(UnicodeEncodeError,
unicode_encodedecimal, "123\u20ac", "strict")
self.assertRaisesRegex(
ValueError,
"^'decimal' codec can't encode character",
unicode_encodedecimal, "123\u20ac", "replace")
@support.cpython_only
@support.requires_legacy_unicode_capi
def test_transform_decimal(self):
from _testcapi import unicode_transformdecimaltoascii as transform_decimal
with warnings_helper.check_warnings():
warnings.simplefilter('ignore', DeprecationWarning)
self.assertEqual(transform_decimal('123'),
'123')
self.assertEqual(transform_decimal('\u0663.\u0661\u0664'),
'3.14')
self.assertEqual(transform_decimal("\N{EM SPACE}3.14\N{EN SPACE}"),
"\N{EM SPACE}3.14\N{EN SPACE}")
self.assertEqual(transform_decimal('123\u20ac'),
'123\u20ac')
@support.cpython_only
def test_pep393_utf8_caching_bug(self):
# Issue #25709: Problem with string concatenation and utf-8 cache
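
The removed `test_transform_decimal` documents what `PyUnicode_TransformDecimalToASCII` did: code points with the decimal digit property become ASCII digits and everything else is copied unchanged. A pure-Python sketch of that behaviour, using `unicodedata.decimal()`, is given below. The function name `transform_decimal_to_ascii` is hypothetical and this is not the CPython implementation.

```python
import unicodedata


def transform_decimal_to_ascii(text: str) -> str:
    """Replace every decimal-digit code point with its ASCII digit."""
    out = []
    for ch in text:
        # decimal() returns the digit value, or the default (None) when the
        # character does not have the Unicode decimal digit property.
        digit = unicodedata.decimal(ch, None)
        out.append(ch if digit is None else str(digit))
    return "".join(out)


if __name__ == "__main__":
    print(transform_decimal_to_ascii("\u0663.\u0661\u0664"))
```

The inputs and expected results in the removed test case can be used to check the sketch.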

@ -0,0 +1,9 @@
Remove deprecated ``Py_UNICODE`` APIs: ``PyUnicode_Encode``,
``PyUnicode_EncodeUTF7``, ``PyUnicode_EncodeUTF8``,
``PyUnicode_EncodeUTF16``, ``PyUnicode_EncodeUTF32``,
``PyUnicode_EncodeLatin1``, ``PyUnicode_EncodeMBCS``,
``PyUnicode_EncodeDecimal``, ``PyUnicode_EncodeRawUnicodeEscape``,
``PyUnicode_EncodeCharmap``, ``PyUnicode_EncodeUnicodeEscape``,
``PyUnicode_TransformDecimalToASCII``, ``PyUnicode_TranslateCharmap``,
``PyUnicodeEncodeError_Create``, ``PyUnicodeTranslateError_Create``. See
:pep:`393` and :pep:`624` for reference.

@ -2570,7 +2570,7 @@ save_picklebuffer(PicklerObject *self, PyObject *obj)
return 0;
}
/* A copy of PyUnicode_EncodeRawUnicodeEscape() that also translates
/* A copy of PyUnicode_AsRawUnicodeEscapeString() that also translates
backslash and newline characters to \uXXXX escapes. */
static PyObject *
raw_unicode_escape(PyObject *obj)
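
The comment above says the pickle helper differs from the plain raw-unicode-escape encoder by also escaping backslash and newline characters. The difference can be observed from Python, assuming the text-based protocol 0 is the code path that uses this helper. The sample string is made up for illustration.

```python
import pickle

text = "line1\nback\\slash \u20ac"

# Plain codec: only code points above U+00FF are escaped, so the newline
# and the backslash pass through unchanged.
plain = text.encode("raw_unicode_escape")

# Protocol 0 pickle: newline and backslash are escaped as well, which keeps
# the line-oriented pickle format unambiguous.
pickled = pickle.dumps(text, protocol=0)

if __name__ == "__main__":
    print(plain)
    print(pickled)
```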

@ -2154,51 +2154,6 @@ unicode_copycharacters(PyObject *self, PyObject *args)
_Py_COMP_DIAG_PUSH
_Py_COMP_DIAG_IGNORE_DEPR_DECLS
static PyObject *
unicode_encodedecimal(PyObject *self, PyObject *args)
{
Py_UNICODE *unicode;
Py_ssize_t length;
char *errors = NULL;
PyObject *decimal;
Py_ssize_t decimal_length, new_length;
int res;
if (!PyArg_ParseTuple(args, "u#|s", &unicode, &length, &errors))
return NULL;
decimal_length = length * 7; /* len('&#8364;') */
decimal = PyBytes_FromStringAndSize(NULL, decimal_length);
if (decimal == NULL)
return NULL;
res = PyUnicode_EncodeDecimal(unicode, length,
PyBytes_AS_STRING(decimal),
errors);
if (res < 0) {
Py_DECREF(decimal);
return NULL;
}
new_length = strlen(PyBytes_AS_STRING(decimal));
assert(new_length <= decimal_length);
res = _PyBytes_Resize(&decimal, new_length);
if (res < 0)
return NULL;
return decimal;
}
static PyObject *
unicode_transformdecimaltoascii(PyObject *self, PyObject *args)
{
Py_UNICODE *unicode;
Py_ssize_t length;
if (!PyArg_ParseTuple(args, "u#|s", &unicode, &length))
return NULL;
return PyUnicode_TransformDecimalToASCII(unicode, length);
}
static PyObject *
unicode_legacy_string(PyObject *self, PyObject *args)
{
@ -5737,8 +5692,6 @@ static PyMethodDef TestMethods[] = {
{"unicode_findchar", unicode_findchar, METH_VARARGS},
{"unicode_copycharacters", unicode_copycharacters, METH_VARARGS},
#if USE_UNICODE_WCHAR_CACHE
{"unicode_encodedecimal", unicode_encodedecimal, METH_VARARGS},
{"unicode_transformdecimaltoascii", unicode_transformdecimaltoascii, METH_VARARGS},
{"unicode_legacy_string", unicode_legacy_string, METH_VARARGS},
#endif /* USE_UNICODE_WCHAR_CACHE */
{"_test_thread_state", test_thread_state, METH_VARARGS},

@ -2129,15 +2129,6 @@ static PyTypeObject _PyExc_UnicodeEncodeError = {
};
PyObject *PyExc_UnicodeEncodeError = (PyObject *)&_PyExc_UnicodeEncodeError;
PyObject *
PyUnicodeEncodeError_Create(
const char *encoding, const Py_UNICODE *object, Py_ssize_t length,
Py_ssize_t start, Py_ssize_t end, const char *reason)
{
return PyObject_CallFunction(PyExc_UnicodeEncodeError, "su#nns",
encoding, object, length, start, end, reason);
}
/*
* UnicodeDecodeError extends UnicodeError
@ -2342,16 +2333,6 @@ static PyTypeObject _PyExc_UnicodeTranslateError = {
};
PyObject *PyExc_UnicodeTranslateError = (PyObject *)&_PyExc_UnicodeTranslateError;
/* Deprecated. */
PyObject *
PyUnicodeTranslateError_Create(
const Py_UNICODE *object, Py_ssize_t length,
Py_ssize_t start, Py_ssize_t end, const char *reason)
{
return PyObject_CallFunction(PyExc_UnicodeTranslateError, "u#nns",
object, length, start, end, reason);
}
PyObject *
_PyUnicodeTranslateError_Create(
PyObject *object,

@ -3730,22 +3730,6 @@ PyUnicode_AsDecodedUnicode(PyObject *unicode,
return NULL;
}
PyObject *
PyUnicode_Encode(const Py_UNICODE *s,
Py_ssize_t size,
const char *encoding,
const char *errors)
{
PyObject *v, *unicode;
unicode = PyUnicode_FromWideChar(s, size);
if (unicode == NULL)
return NULL;
v = PyUnicode_AsEncodedString(unicode, encoding, errors);
Py_DECREF(unicode);
return v;
}
PyObject *
PyUnicode_AsEncodedObject(PyObject *unicode,
const char *encoding,
@ -5047,22 +5031,6 @@ _PyUnicode_EncodeUTF7(PyObject *str,
return NULL;
return v;
}
PyObject *
PyUnicode_EncodeUTF7(const Py_UNICODE *s,
Py_ssize_t size,
int base64SetO,
int base64WhiteSpace,
const char *errors)
{
PyObject *result;
PyObject *tmp = PyUnicode_FromWideChar(s, size);
if (tmp == NULL)
return NULL;
result = _PyUnicode_EncodeUTF7(tmp, base64SetO,
base64WhiteSpace, errors);
Py_DECREF(tmp);
return result;
}
#undef IS_BASE64
#undef FROM_BASE64
@ -5705,21 +5673,6 @@ _PyUnicode_AsUTF8String(PyObject *unicode, const char *errors)
}
PyObject *
PyUnicode_EncodeUTF8(const Py_UNICODE *s,
Py_ssize_t size,
const char *errors)
{
PyObject *v, *unicode;
unicode = PyUnicode_FromWideChar(s, size);
if (unicode == NULL)
return NULL;
v = _PyUnicode_AsUTF8String(unicode, errors);
Py_DECREF(unicode);
return v;
}
PyObject *
PyUnicode_AsUTF8String(PyObject *unicode)
{
@ -6029,21 +5982,6 @@ _PyUnicode_EncodeUTF32(PyObject *str,
return NULL;
}
PyObject *
PyUnicode_EncodeUTF32(const Py_UNICODE *s,
Py_ssize_t size,
const char *errors,
int byteorder)
{
PyObject *result;
PyObject *tmp = PyUnicode_FromWideChar(s, size);
if (tmp == NULL)
return NULL;
result = _PyUnicode_EncodeUTF32(tmp, errors, byteorder);
Py_DECREF(tmp);
return result;
}
PyObject *
PyUnicode_AsUTF32String(PyObject *unicode)
{
@ -6382,21 +6320,6 @@ _PyUnicode_EncodeUTF16(PyObject *str,
#undef STORECHAR
}
PyObject *
PyUnicode_EncodeUTF16(const Py_UNICODE *s,
Py_ssize_t size,
const char *errors,
int byteorder)
{
PyObject *result;
PyObject *tmp = PyUnicode_FromWideChar(s, size);
if (tmp == NULL)
return NULL;
result = _PyUnicode_EncodeUTF16(tmp, errors, byteorder);
Py_DECREF(tmp);
return result;
}
PyObject *
PyUnicode_AsUTF16String(PyObject *unicode)
{
@ -6773,21 +6696,6 @@ PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
return repr;
}
PyObject *
PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s,
Py_ssize_t size)
{
PyObject *result;
PyObject *tmp = PyUnicode_FromWideChar(s, size);
if (tmp == NULL) {
return NULL;
}
result = PyUnicode_AsUnicodeEscapeString(tmp);
Py_DECREF(tmp);
return result;
}
/* --- Raw Unicode Escape Codec ------------------------------------------- */
PyObject *
@ -6988,19 +6896,6 @@ PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode)
return repr;
}
PyObject *
PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s,
Py_ssize_t size)
{
PyObject *result;
PyObject *tmp = PyUnicode_FromWideChar(s, size);
if (tmp == NULL)
return NULL;
result = PyUnicode_AsRawUnicodeEscapeString(tmp);
Py_DECREF(tmp);
return result;
}
/* --- Latin-1 Codec ------------------------------------------------------ */
PyObject *
@ -7285,21 +7180,6 @@ unicode_encode_ucs1(PyObject *unicode,
return NULL;
}
/* Deprecated */
PyObject *
PyUnicode_EncodeLatin1(const Py_UNICODE *p,
Py_ssize_t size,
const char *errors)
{
PyObject *result;
PyObject *unicode = PyUnicode_FromWideChar(p, size);
if (unicode == NULL)
return NULL;
result = unicode_encode_ucs1(unicode, errors, 256);
Py_DECREF(unicode);
return result;
}
PyObject *
_PyUnicode_AsLatin1String(PyObject *unicode, const char *errors)
{
@ -7426,21 +7306,6 @@ PyUnicode_DecodeASCII(const char *s,
return NULL;
}
/* Deprecated */
PyObject *
PyUnicode_EncodeASCII(const Py_UNICODE *p,
Py_ssize_t size,
const char *errors)
{
PyObject *result;
PyObject *unicode = PyUnicode_FromWideChar(p, size);
if (unicode == NULL)
return NULL;
result = unicode_encode_ucs1(unicode, errors, 128);
Py_DECREF(unicode);
return result;
}
PyObject *
_PyUnicode_AsASCIIString(PyObject *unicode, const char *errors)
{
@ -8168,20 +8033,6 @@ encode_code_page(int code_page,
return outbytes;
}
PyObject *
PyUnicode_EncodeMBCS(const Py_UNICODE *p,
Py_ssize_t size,
const char *errors)
{
PyObject *unicode, *res;
unicode = PyUnicode_FromWideChar(p, size);
if (unicode == NULL)
return NULL;
res = encode_code_page(CP_ACP, unicode, errors);
Py_DECREF(unicode);
return res;
}
PyObject *
PyUnicode_EncodeCodePage(int code_page,
PyObject *unicode,
@ -9008,22 +8859,6 @@ _PyUnicode_EncodeCharmap(PyObject *unicode,
return NULL;
}
/* Deprecated */
PyObject *
PyUnicode_EncodeCharmap(const Py_UNICODE *p,
Py_ssize_t size,
PyObject *mapping,
const char *errors)
{
PyObject *result;
PyObject *unicode = PyUnicode_FromWideChar(p, size);
if (unicode == NULL)
return NULL;
result = _PyUnicode_EncodeCharmap(unicode, mapping, errors);
Py_DECREF(unicode);
return result;
}
PyObject *
PyUnicode_AsCharmapString(PyObject *unicode,
PyObject *mapping)
@ -9448,22 +9283,6 @@ _PyUnicode_TranslateCharmap(PyObject *input,
return NULL;
}
/* Deprecated. Use PyUnicode_Translate instead. */
PyObject *
PyUnicode_TranslateCharmap(const Py_UNICODE *p,
Py_ssize_t size,
PyObject *mapping,
const char *errors)
{
PyObject *result;
PyObject *unicode = PyUnicode_FromWideChar(p, size);
if (!unicode)
return NULL;
result = _PyUnicode_TranslateCharmap(unicode, mapping, errors);
Py_DECREF(unicode);
return result;
}
PyObject *
PyUnicode_Translate(PyObject *str,
PyObject *mapping,
@ -9523,110 +9342,6 @@ _PyUnicode_TransformDecimalAndSpaceToASCII(PyObject *unicode)
return result;
}
PyObject *
PyUnicode_TransformDecimalToASCII(Py_UNICODE *s,
Py_ssize_t length)
{
PyObject *decimal;
Py_ssize_t i;
Py_UCS4 maxchar;
enum PyUnicode_Kind kind;
const void *data;
maxchar = 127;
for (i = 0; i < length; i++) {
Py_UCS4 ch = s[i];
if (ch > 127) {
int decimal = Py_UNICODE_TODECIMAL(ch);
if (decimal >= 0)
ch = '0' + decimal;
maxchar = Py_MAX(maxchar, ch);
}
}
/* Copy to a new string */
decimal = PyUnicode_New(length, maxchar);
if (decimal == NULL)
return decimal;
kind = PyUnicode_KIND(decimal);
data = PyUnicode_DATA(decimal);
/* Iterate over code points */
for (i = 0; i < length; i++) {
Py_UCS4 ch = s[i];
if (ch > 127) {
int decimal = Py_UNICODE_TODECIMAL(ch);
if (decimal >= 0)
ch = '0' + decimal;
}
PyUnicode_WRITE(kind, data, i, ch);
}
return unicode_result(decimal);
}
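
The removed `PyUnicode_TransformDecimalToASCII` works in two passes: it first finds the largest code point that will survive the digit substitution, so the result can be allocated at the narrowest width, and then writes the transformed characters. The following is a minimal standalone sketch of that idea, not the CPython implementation. It assumes UCS4 input in a plain `uint32_t` array, and every `demo_*` name is hypothetical. In particular, `demo_digit_value` is a stand-in for `Py_UNICODE_TODECIMAL` that only knows three digit blocks, whereas the real macro consults the full Unicode database.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE_TODECIMAL: recognises only ASCII,
   Arabic-Indic and fullwidth digits. Returns 0..9, or -1 otherwise. */
static int demo_digit_value(uint32_t ch)
{
    if (ch >= 0x0030 && ch <= 0x0039) return (int)(ch - 0x0030);
    if (ch >= 0x0660 && ch <= 0x0669) return (int)(ch - 0x0660);
    if (ch >= 0xFF10 && ch <= 0xFF19) return (int)(ch - 0xFF10);
    return -1;
}

/* Pass 1: largest code point left after substitution. The floor of 127
   mirrors the removed function, which starts from an ASCII-only result. */
static uint32_t demo_max_after_transform(const uint32_t *s, size_t n)
{
    uint32_t maxchar = 127;
    for (size_t i = 0; i < n; i++) {
        uint32_t ch = s[i];
        if (ch > 127) {
            int d = demo_digit_value(ch);
            if (d >= 0)
                ch = (uint32_t)('0' + d);
            if (ch > maxchar)
                maxchar = ch;
        }
    }
    return maxchar;
}

/* Bytes per character that a PEP 393 string needs for this maximum. */
static int demo_char_width(uint32_t maxchar)
{
    if (maxchar < 0x100) return 1;
    if (maxchar < 0x10000) return 2;
    return 4;
}

/* Pass 2: write the transformed code points; the caller provides n slots. */
static void demo_transform(const uint32_t *s, size_t n, uint32_t *out)
{
    assert(n == 0 || (s != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        uint32_t ch = s[i];
        if (ch > 127) {
            int d = demo_digit_value(ch);
            if (d >= 0)
                ch = (uint32_t)('0' + d);
        }
        out[i] = ch;
    }
}
```

Scanning first costs an extra pass over the input, but it lets a string consisting only of non-ASCII digits collapse to one byte per character.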
/* --- Decimal Encoder ---------------------------------------------------- */
int
PyUnicode_EncodeDecimal(Py_UNICODE *s,
Py_ssize_t length,
char *output,
const char *errors)
{
PyObject *unicode;
Py_ssize_t i;
enum PyUnicode_Kind kind;
const void *data;
if (output == NULL) {
PyErr_BadArgument();
return -1;
}
unicode = PyUnicode_FromWideChar(s, length);
if (unicode == NULL)
return -1;
kind = PyUnicode_KIND(unicode);
data = PyUnicode_DATA(unicode);
for (i=0; i < length; ) {
PyObject *exc;
Py_UCS4 ch;
int decimal;
Py_ssize_t startpos;
ch = PyUnicode_READ(kind, data, i);
if (Py_UNICODE_ISSPACE(ch)) {
*output++ = ' ';
i++;
continue;
}
decimal = Py_UNICODE_TODECIMAL(ch);
if (decimal >= 0) {
*output++ = '0' + decimal;
i++;
continue;
}
if (0 < ch && ch < 256) {
*output++ = (char)ch;
i++;
continue;
}
startpos = i;
exc = NULL;
raise_encode_exception(&exc, "decimal", unicode,
startpos, startpos+1,
"invalid decimal Unicode string");
Py_XDECREF(exc);
Py_DECREF(unicode);
return -1;
}
/* 0-terminate the output string */
*output++ = '\0';
Py_DECREF(unicode);
return 0;
}
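
The per-character rule in the removed `PyUnicode_EncodeDecimal` is: whitespace becomes an ASCII space, decimal digits become `'0'..'9'`, any other non-zero Latin-1 code point is copied through, and anything else is an error. The following is a minimal standalone sketch of that rule, not the CPython code. It assumes UCS4 input in a `uint32_t` array and reports the failing index through a hypothetical `bad_index` out-parameter instead of raising `UnicodeEncodeError`. The `sketch_*` helpers are hypothetical, simplified stand-ins for `Py_UNICODE_ISSPACE` and `Py_UNICODE_TODECIMAL` and cover only a handful of code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE_TODECIMAL (three digit blocks only). */
static int sketch_to_decimal(uint32_t ch)
{
    if (ch >= 0x0030 && ch <= 0x0039) return (int)(ch - 0x0030);
    if (ch >= 0x0660 && ch <= 0x0669) return (int)(ch - 0x0660);
    if (ch >= 0xFF10 && ch <= 0xFF19) return (int)(ch - 0xFF10);
    return -1;
}

/* Hypothetical stand-in for Py_UNICODE_ISSPACE (a few common spaces only). */
static int sketch_is_space(uint32_t ch)
{
    return ch == 0x20 || (ch >= 0x09 && ch <= 0x0D) ||
           ch == 0xA0 || ch == 0x3000;
}

/* Encode s[0..length) into output, which must hold length + 1 bytes.
   Returns 0 on success. Returns -1 if output is NULL, or at the first
   unencodable character, in which case *bad_index (if non-NULL) is set. */
static int sketch_encode_decimal(const uint32_t *s, size_t length,
                                 char *output, size_t *bad_index)
{
    if (output == NULL)
        return -1;
    assert(length == 0 || s != NULL);
    for (size_t i = 0; i < length; i++) {
        uint32_t ch = s[i];
        int d;
        if (sketch_is_space(ch)) {          /* any space -> ' ' */
            *output++ = ' ';
            continue;
        }
        d = sketch_to_decimal(ch);
        if (d >= 0) {                       /* any digit -> '0'..'9' */
            *output++ = (char)('0' + d);
            continue;
        }
        if (ch > 0 && ch < 256) {           /* other Latin-1 -> copied */
            *output++ = (char)ch;
            continue;
        }
        if (bad_index != NULL)
            *bad_index = i;
        return -1;
    }
    *output = '\0';                         /* NUL-terminate the result */
    return 0;
}
```

As in the removed function, the caller owns the output buffer and must size it for one byte per input character plus the terminator.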
/* --- Helpers ------------------------------------------------------------ */
/* helper macro to fixup start/end slice values */