mirror of
https://github.com/python/cpython
synced 2024-09-19 20:51:48 +00:00
898 lines
33 KiB
ReStructuredText
898 lines
33 KiB
ReStructuredText
.. highlight:: c
|
|
|
|
.. _defining-new-types:
|
|
|
|
**********************************
|
|
Defining Extension Types: Tutorial
|
|
**********************************
|
|
|
|
.. sectionauthor:: Michael Hudson <mwh@python.net>
|
|
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
|
|
.. sectionauthor:: Jim Fulton <jim@zope.com>
|
|
|
|
|
|
Python allows the writer of a C extension module to define new types that
|
|
can be manipulated from Python code, much like the built-in :class:`str`
|
|
and :class:`list` types. The code for all extension types follows a
|
|
pattern, but there are some details that you need to understand before you
|
|
can get started. This document is a gentle introduction to the topic.
|
|
|
|
|
|
.. _dnt-basics:
|
|
|
|
The Basics
|
|
==========
|
|
|
|
The :term:`CPython` runtime sees all Python objects as variables of type
|
|
:c:expr:`PyObject*`, which serves as a "base type" for all Python objects.
|
|
The :c:type:`PyObject` structure itself only contains the object's
|
|
:term:`reference count` and a pointer to the object's "type object".
|
|
This is where the action is; the type object determines which (C) functions
|
|
get called by the interpreter when, for instance, an attribute gets looked up
|
|
on an object, a method called, or it is multiplied by another object. These
|
|
C functions are called "type methods".
|
|
|
|
So, if you want to define a new extension type, you need to create a new type
|
|
object.
|
|
|
|
This sort of thing can only be explained by example, so here's a minimal, but
|
|
complete, module that defines a new type named :class:`!Custom` inside a C
|
|
extension module :mod:`!custom`:
|
|
|
|
.. note::
|
|
What we're showing here is the traditional way of defining *static*
|
|
extension types. It should be adequate for most uses. The C API also
|
|
allows defining heap-allocated extension types using the
|
|
:c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.
|
|
|
|
.. literalinclude:: ../includes/newtypes/custom.c
|
|
|
|
Now that's quite a bit to take in at once, but hopefully bits will seem familiar
|
|
from the previous chapter. This file defines three things:
|
|
|
|
#. What a :class:`!Custom` **object** contains: this is the ``CustomObject``
|
|
struct, which is allocated once for each :class:`!Custom` instance.
|
|
#. How the :class:`!Custom` **type** behaves: this is the ``CustomType`` struct,
|
|
which defines a set of flags and function pointers that the interpreter
|
|
inspects when specific operations are requested.
|
|
#. How to initialize the :mod:`!custom` module: this is the ``PyInit_custom``
|
|
function and the associated ``custommodule`` struct.
|
|
|
|
The first bit is::
|
|
|
|
typedef struct {
|
|
PyObject_HEAD
|
|
} CustomObject;
|
|
|
|
This is what a Custom object will contain. ``PyObject_HEAD`` is mandatory
|
|
at the start of each object struct and defines a field called ``ob_base``
|
|
of type :c:type:`PyObject`, containing a pointer to a type object and a
|
|
reference count (these can be accessed using the macros :c:macro:`Py_TYPE`
|
|
and :c:macro:`Py_REFCNT` respectively). The reason for the macro is to
|
|
abstract away the layout and to enable additional fields in :ref:`debug builds
|
|
<debug-build>`.
|
|
|
|
.. note::
|
|
There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
|
|
Be wary of adding one by accident: some compilers will complain.
|
|
|
|
Of course, objects generally store additional data besides the standard
|
|
``PyObject_HEAD`` boilerplate; for example, here is the definition for
|
|
standard Python floats::
|
|
|
|
typedef struct {
|
|
PyObject_HEAD
|
|
double ob_fval;
|
|
} PyFloatObject;
|
|
|
|
The second bit is the definition of the type object. ::
|
|
|
|
static PyTypeObject CustomType = {
|
|
.ob_base = PyVarObject_HEAD_INIT(NULL, 0)
|
|
.tp_name = "custom.Custom",
|
|
.tp_doc = PyDoc_STR("Custom objects"),
|
|
.tp_basicsize = sizeof(CustomObject),
|
|
.tp_itemsize = 0,
|
|
.tp_flags = Py_TPFLAGS_DEFAULT,
|
|
.tp_new = PyType_GenericNew,
|
|
};
|
|
|
|
.. note::
|
|
We recommend using C99-style designated initializers as above, to
|
|
avoid listing all the :c:type:`PyTypeObject` fields that you don't care
|
|
about and also to avoid caring about the fields' declaration order.
|
|
|
|
The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
|
|
many more :ref:`fields <type-structs>` than the definition above. The
|
|
remaining fields will be filled with zeros by the C compiler, and it's
|
|
common practice to not specify them explicitly unless you need them.
|
|
|
|
We're going to pick it apart, one field at a time::
|
|
|
|
.ob_base = PyVarObject_HEAD_INIT(NULL, 0)
|
|
|
|
This line is mandatory boilerplate to initialize the ``ob_base``
|
|
field mentioned above. ::
|
|
|
|
.tp_name = "custom.Custom",
|
|
|
|
The name of our type. This will appear in the default textual representation of
|
|
our objects and in some error messages, for example:
|
|
|
|
.. code-block:: pycon
|
|
|
|
>>> "" + custom.Custom()
|
|
Traceback (most recent call last):
|
|
File "<stdin>", line 1, in <module>
|
|
TypeError: can only concatenate str (not "custom.Custom") to str
|
|
|
|
Note that the name is a dotted name that includes both the module name and the
|
|
name of the type within the module. The module in this case is :mod:`!custom` and
|
|
the type is :class:`!Custom`, so we set the type name to :class:`!custom.Custom`.
|
|
Using the real dotted import path is important to make your type compatible
|
|
with the :mod:`pydoc` and :mod:`pickle` modules. ::
|
|
|
|
.tp_basicsize = sizeof(CustomObject),
|
|
.tp_itemsize = 0,
|
|
|
|
This is so that Python knows how much memory to allocate when creating
|
|
new :class:`!Custom` instances. :c:member:`~PyTypeObject.tp_itemsize` is
|
|
only used for variable-sized objects and should otherwise be zero.
|
|
|
|
.. note::
|
|
|
|
If you want your type to be subclassable from Python, and your type has the same
|
|
:c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
|
|
inheritance. A Python subclass of your type will have to list your type first
|
|
in its :attr:`~class.__bases__`, or else it will not be able to call your type's
|
|
:meth:`~object.__new__` method without getting an error. You can avoid this problem by
|
|
ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
|
|
base type does. Most of the time, this will be true anyway, because either your
|
|
base type will be :class:`object`, or else you will be adding data members to
|
|
your base type, and therefore increasing its size.
|
|
|
|
We set the class flags to :c:macro:`Py_TPFLAGS_DEFAULT`. ::
|
|
|
|
.tp_flags = Py_TPFLAGS_DEFAULT,
|
|
|
|
All types should include this constant in their flags. It enables all of the
|
|
members defined until at least Python 3.3. If you need further members,
|
|
you will need to OR the corresponding flags.
|
|
|
|
We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::
|
|
|
|
.tp_doc = PyDoc_STR("Custom objects"),
|
|
|
|
To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
|
|
handler. This is the equivalent of the Python method :meth:`~object.__new__`, but
|
|
has to be specified explicitly. In this case, we can just use the default
|
|
implementation provided by the API function :c:func:`PyType_GenericNew`. ::
|
|
|
|
.tp_new = PyType_GenericNew,
|
|
|
|
Everything else in the file should be familiar, except for some code in
|
|
:c:func:`!PyInit_custom`::
|
|
|
|
if (PyType_Ready(&CustomType) < 0)
|
|
return;
|
|
|
|
This initializes the :class:`!Custom` type, filling in a number of members
|
|
to the appropriate default values, including :c:member:`~PyObject.ob_type` that we initially
|
|
set to ``NULL``. ::
|
|
|
|
if (PyModule_AddObjectRef(m, "Custom", (PyObject *) &CustomType) < 0) {
|
|
Py_DECREF(m);
|
|
return NULL;
|
|
}
|
|
|
|
This adds the type to the module dictionary. This allows us to create
|
|
:class:`!Custom` instances by calling the :class:`!Custom` class:
|
|
|
|
.. code-block:: pycon
|
|
|
|
>>> import custom
|
|
>>> mycustom = custom.Custom()
|
|
|
|
That's it! All that remains is to build it; put the above code in a file called
|
|
:file:`custom.c`,
|
|
|
|
.. literalinclude:: ../includes/newtypes/pyproject.toml
|
|
|
|
in a file called :file:`pyproject.toml`, and
|
|
|
|
.. code-block:: python
|
|
|
|
from setuptools import Extension, setup
|
|
setup(ext_modules=[Extension("custom", ["custom.c"])])
|
|
|
|
in a file called :file:`setup.py`; then typing
|
|
|
|
.. code-block:: shell-session
|
|
|
|
$ python -m pip install .
|
|
|
|
in a shell should produce a file :file:`custom.so` in a subdirectory
|
|
and install it; now fire up Python --- you should be able to ``import custom``
|
|
and play around with ``Custom`` objects.
|
|
|
|
That wasn't so hard, was it?
|
|
|
|
Of course, the current Custom type is pretty uninteresting. It has no data and
|
|
doesn't do anything. It can't even be subclassed.
|
|
|
|
|
|
Adding data and methods to the Basic example
|
|
============================================
|
|
|
|
Let's extend the basic example to add some data and methods. Let's also make
|
|
the type usable as a base class. We'll create a new module, :mod:`!custom2` that
|
|
adds these capabilities:
|
|
|
|
.. literalinclude:: ../includes/newtypes/custom2.c
|
|
|
|
|
|
This version of the module has a number of changes.
|
|
|
|
The :class:`!Custom` type now has three data attributes in its C struct,
|
|
*first*, *last*, and *number*. The *first* and *last* variables are Python
|
|
strings containing first and last names. The *number* attribute is a C integer.
|
|
|
|
The object structure is updated accordingly::
|
|
|
|
typedef struct {
|
|
PyObject_HEAD
|
|
PyObject *first; /* first name */
|
|
PyObject *last; /* last name */
|
|
int number;
|
|
} CustomObject;
|
|
|
|
Because we now have data to manage, we have to be more careful about object
|
|
allocation and deallocation. At a minimum, we need a deallocation method::
|
|
|
|
static void
|
|
Custom_dealloc(CustomObject *self)
|
|
{
|
|
Py_XDECREF(self->first);
|
|
Py_XDECREF(self->last);
|
|
Py_TYPE(self)->tp_free((PyObject *) self);
|
|
}
|
|
|
|
which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::
|
|
|
|
.tp_dealloc = (destructor) Custom_dealloc,
|
|
|
|
This method first clears the reference counts of the two Python attributes.
|
|
:c:func:`Py_XDECREF` correctly handles the case where its argument is
|
|
``NULL`` (which might happen here if ``tp_new`` failed midway). It then
|
|
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
|
|
(computed by ``Py_TYPE(self)``) to free the object's memory. Note that
|
|
the object's type might not be :class:`!CustomType`, because the object may
|
|
be an instance of a subclass.
|
|
|
|
.. note::
|
|
The explicit cast to ``destructor`` above is needed because we defined
|
|
``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
|
|
function pointer expects to receive a ``PyObject *`` argument. Otherwise,
|
|
the compiler will emit a warning. This is object-oriented polymorphism,
|
|
in C!
|
|
|
|
We want to make sure that the first and last names are initialized to empty
|
|
strings, so we provide a ``tp_new`` implementation::
|
|
|
|
static PyObject *
|
|
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
|
|
{
|
|
CustomObject *self;
|
|
self = (CustomObject *) type->tp_alloc(type, 0);
|
|
if (self != NULL) {
|
|
self->first = PyUnicode_FromString("");
|
|
if (self->first == NULL) {
|
|
Py_DECREF(self);
|
|
return NULL;
|
|
}
|
|
self->last = PyUnicode_FromString("");
|
|
if (self->last == NULL) {
|
|
Py_DECREF(self);
|
|
return NULL;
|
|
}
|
|
self->number = 0;
|
|
}
|
|
return (PyObject *) self;
|
|
}
|
|
|
|
and install it in the :c:member:`~PyTypeObject.tp_new` member::
|
|
|
|
.tp_new = Custom_new,
|
|
|
|
The ``tp_new`` handler is responsible for creating (as opposed to initializing)
|
|
objects of the type. It is exposed in Python as the :meth:`~object.__new__` method.
|
|
It is not required to define a ``tp_new`` member, and indeed many extension
|
|
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
|
|
version of the :class:`!Custom` type above. In this case, we use the ``tp_new``
|
|
handler to initialize the ``first`` and ``last`` attributes to non-``NULL``
|
|
default values.
|
|
|
|
``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
|
|
if a subclass is instantiated) and any arguments passed when the type was
|
|
called, and is expected to return the instance created. ``tp_new`` handlers
|
|
always accept positional and keyword arguments, but they often ignore the
|
|
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
|
|
in C or ``__init__`` in Python) methods.
|
|
|
|
.. note::
|
|
``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
|
|
will do it itself.
|
|
|
|
The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
|
|
slot to allocate memory::
|
|
|
|
self = (CustomObject *) type->tp_alloc(type, 0);
|
|
|
|
Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
|
|
result against ``NULL`` before proceeding.
|
|
|
|
.. note::
|
|
We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
|
|
:c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
|
|
which is :class:`object` by default. Most types use the default allocation
|
|
strategy.
|
|
|
|
.. note::
|
|
If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
|
|
that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`~object.__new__`),
|
|
you must *not* try to determine what method to call using method resolution
|
|
order at runtime. Always statically determine what type you are going to
|
|
call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
|
|
``type->tp_base->tp_new``. If you do not do this, Python subclasses of your
|
|
type that also inherit from other Python-defined classes may not work correctly.
|
|
(Specifically, you may not be able to create instances of such subclasses
|
|
without getting a :exc:`TypeError`.)
|
|
|
|
We also define an initialization function which accepts arguments to provide
|
|
initial values for our instance::
|
|
|
|
static int
|
|
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
|
|
{
|
|
static char *kwlist[] = {"first", "last", "number", NULL};
|
|
PyObject *first = NULL, *last = NULL, *tmp;
|
|
|
|
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
|
|
&first, &last,
|
|
&self->number))
|
|
return -1;
|
|
|
|
if (first) {
|
|
tmp = self->first;
|
|
Py_INCREF(first);
|
|
self->first = first;
|
|
Py_XDECREF(tmp);
|
|
}
|
|
if (last) {
|
|
tmp = self->last;
|
|
Py_INCREF(last);
|
|
self->last = last;
|
|
Py_XDECREF(tmp);
|
|
}
|
|
return 0;
|
|
}
|
|
|
|
by filling the :c:member:`~PyTypeObject.tp_init` slot. ::
|
|
|
|
.tp_init = (initproc) Custom_init,
|
|
|
|
The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
|
|
:meth:`~object.__init__` method. It is used to initialize an object after it's
|
|
created. Initializers always accept positional and keyword arguments,
|
|
and they should return either ``0`` on success or ``-1`` on error.
|
|
|
|
Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
|
|
is called at all (for example, the :mod:`pickle` module by default
|
|
doesn't call :meth:`~object.__init__` on unpickled instances). It can also be
|
|
called multiple times. Anyone can call the :meth:`!__init__` method on
|
|
our objects. For this reason, we have to be extra careful when assigning
|
|
the new attribute values. We might be tempted, for example to assign the
|
|
``first`` member like this::
|
|
|
|
if (first) {
|
|
Py_XDECREF(self->first);
|
|
Py_INCREF(first);
|
|
self->first = first;
|
|
}
|
|
|
|
But this would be risky. Our type doesn't restrict the type of the
|
|
``first`` member, so it could be any kind of object. It could have a
|
|
destructor that causes code to be executed that tries to access the
|
|
``first`` member; or that destructor could release the
|
|
:term:`Global interpreter Lock <GIL>` and let arbitrary code run in other
|
|
threads that accesses and modifies our object.
|
|
|
|
To be paranoid and protect ourselves against this possibility, we almost
|
|
always reassign members before decrementing their reference counts. When
|
|
don't we have to do this?
|
|
|
|
* when we absolutely know that the reference count is greater than 1;
|
|
|
|
* when we know that deallocation of the object [#]_ will neither release
|
|
the :term:`GIL` nor cause any calls back into our type's code;
|
|
|
|
* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
|
|
handler on a type which doesn't support cyclic garbage collection [#]_.
|
|
|
|
We want to expose our instance variables as attributes. There are a
|
|
number of ways to do that. The simplest way is to define member definitions::
|
|
|
|
static PyMemberDef Custom_members[] = {
|
|
{"first", Py_T_OBJECT_EX, offsetof(CustomObject, first), 0,
|
|
"first name"},
|
|
{"last", Py_T_OBJECT_EX, offsetof(CustomObject, last), 0,
|
|
"last name"},
|
|
{"number", Py_T_INT, offsetof(CustomObject, number), 0,
|
|
"custom number"},
|
|
{NULL} /* Sentinel */
|
|
};
|
|
|
|
and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::
|
|
|
|
.tp_members = Custom_members,
|
|
|
|
Each member definition has a member name, type, offset, access flags and
|
|
documentation string. See the :ref:`Generic-Attribute-Management` section
|
|
below for details.
|
|
|
|
A disadvantage of this approach is that it doesn't provide a way to restrict the
|
|
types of objects that can be assigned to the Python attributes. We expect the
|
|
first and last names to be strings, but any Python objects can be assigned.
|
|
Further, the attributes can be deleted, setting the C pointers to ``NULL``. Even
|
|
though we can make sure the members are initialized to non-``NULL`` values, the
|
|
members can be set to ``NULL`` if the attributes are deleted.
|
|
|
|
We define a single method, :meth:`!Custom.name()`, that outputs the objects name as the
|
|
concatenation of the first and last names. ::
|
|
|
|
static PyObject *
|
|
Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
|
|
{
|
|
if (self->first == NULL) {
|
|
PyErr_SetString(PyExc_AttributeError, "first");
|
|
return NULL;
|
|
}
|
|
if (self->last == NULL) {
|
|
PyErr_SetString(PyExc_AttributeError, "last");
|
|
return NULL;
|
|
}
|
|
return PyUnicode_FromFormat("%S %S", self->first, self->last);
|
|
}
|
|
|
|
The method is implemented as a C function that takes a :class:`!Custom` (or
|
|
:class:`!Custom` subclass) instance as the first argument. Methods always take an
|
|
instance as the first argument. Methods often take positional and keyword
|
|
arguments as well, but in this case we don't take any and don't need to accept
|
|
a positional argument tuple or keyword argument dictionary. This method is
|
|
equivalent to the Python method:
|
|
|
|
.. code-block:: python
|
|
|
|
def name(self):
|
|
return "%s %s" % (self.first, self.last)
|
|
|
|
Note that we have to check for the possibility that our :attr:`!first` and
|
|
:attr:`!last` members are ``NULL``. This is because they can be deleted, in which
|
|
case they are set to ``NULL``. It would be better to prevent deletion of these
|
|
attributes and to restrict the attribute values to be strings. We'll see how to
|
|
do that in the next section.
|
|
|
|
Now that we've defined the method, we need to create an array of method
|
|
definitions::
|
|
|
|
static PyMethodDef Custom_methods[] = {
|
|
{"name", (PyCFunction) Custom_name, METH_NOARGS,
|
|
"Return the name, combining the first and last name"
|
|
},
|
|
{NULL} /* Sentinel */
|
|
};
|
|
|
|
(note that we used the :c:macro:`METH_NOARGS` flag to indicate that the method
|
|
is expecting no arguments other than *self*)
|
|
|
|
and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::
|
|
|
|
.tp_methods = Custom_methods,
|
|
|
|
Finally, we'll make our type usable as a base class for subclassing. We've
|
|
written our methods carefully so far so that they don't make any assumptions
|
|
about the type of the object being created or used, so all we need to do is
|
|
to add the :c:macro:`Py_TPFLAGS_BASETYPE` to our class flag definition::
|
|
|
|
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
|
|
|
|
We rename :c:func:`!PyInit_custom` to :c:func:`!PyInit_custom2`, update the
|
|
module name in the :c:type:`PyModuleDef` struct, and update the full class
|
|
name in the :c:type:`PyTypeObject` struct.
|
|
|
|
Finally, we update our :file:`setup.py` file to include the new module,
|
|
|
|
.. code-block:: python
|
|
|
|
from setuptools import Extension, setup
|
|
setup(ext_modules=[
|
|
Extension("custom", ["custom.c"]),
|
|
Extension("custom2", ["custom2.c"]),
|
|
])
|
|
|
|
and then we re-install so that we can ``import custom2``:
|
|
|
|
.. code-block:: shell-session
|
|
|
|
$ python -m pip install .
|
|
|
|
Providing finer control over data attributes
|
|
============================================
|
|
|
|
In this section, we'll provide finer control over how the :attr:`!first` and
|
|
:attr:`!last` attributes are set in the :class:`!Custom` example. In the previous
|
|
version of our module, the instance variables :attr:`!first` and :attr:`!last`
|
|
could be set to non-string values or even deleted. We want to make sure that
|
|
these attributes always contain strings.
|
|
|
|
.. literalinclude:: ../includes/newtypes/custom3.c
|
|
|
|
|
|
To provide greater control, over the :attr:`!first` and :attr:`!last` attributes,
|
|
we'll use custom getter and setter functions. Here are the functions for
|
|
getting and setting the :attr:`!first` attribute::
|
|
|
|
static PyObject *
|
|
Custom_getfirst(CustomObject *self, void *closure)
|
|
{
|
|
Py_INCREF(self->first);
|
|
return self->first;
|
|
}
|
|
|
|
static int
|
|
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
|
|
{
|
|
PyObject *tmp;
|
|
if (value == NULL) {
|
|
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
|
|
return -1;
|
|
}
|
|
if (!PyUnicode_Check(value)) {
|
|
PyErr_SetString(PyExc_TypeError,
|
|
"The first attribute value must be a string");
|
|
return -1;
|
|
}
|
|
tmp = self->first;
|
|
Py_INCREF(value);
|
|
self->first = value;
|
|
Py_DECREF(tmp);
|
|
return 0;
|
|
}
|
|
|
|
The getter function is passed a :class:`!Custom` object and a "closure", which is
|
|
a void pointer. In this case, the closure is ignored. (The closure supports an
|
|
advanced usage in which definition data is passed to the getter and setter. This
|
|
could, for example, be used to allow a single set of getter and setter functions
|
|
that decide the attribute to get or set based on data in the closure.)
|
|
|
|
The setter function is passed the :class:`!Custom` object, the new value, and the
|
|
closure. The new value may be ``NULL``, in which case the attribute is being
|
|
deleted. In our setter, we raise an error if the attribute is deleted or if its
|
|
new value is not a string.
|
|
|
|
We create an array of :c:type:`PyGetSetDef` structures::
|
|
|
|
static PyGetSetDef Custom_getsetters[] = {
|
|
{"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
|
|
"first name", NULL},
|
|
{"last", (getter) Custom_getlast, (setter) Custom_setlast,
|
|
"last name", NULL},
|
|
{NULL} /* Sentinel */
|
|
};
|
|
|
|
and register it in the :c:member:`~PyTypeObject.tp_getset` slot::
|
|
|
|
.tp_getset = Custom_getsetters,
|
|
|
|
The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
|
|
above. In this case, we aren't using a closure, so we just pass ``NULL``.
|
|
|
|
We also remove the member definitions for these attributes::
|
|
|
|
static PyMemberDef Custom_members[] = {
|
|
{"number", Py_T_INT, offsetof(CustomObject, number), 0,
|
|
"custom number"},
|
|
{NULL} /* Sentinel */
|
|
};
|
|
|
|
We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
|
|
allow strings [#]_ to be passed::
|
|
|
|
static int
|
|
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
|
|
{
|
|
static char *kwlist[] = {"first", "last", "number", NULL};
|
|
PyObject *first = NULL, *last = NULL, *tmp;
|
|
|
|
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
|
|
&first, &last,
|
|
&self->number))
|
|
return -1;
|
|
|
|
if (first) {
|
|
tmp = self->first;
|
|
Py_INCREF(first);
|
|
self->first = first;
|
|
Py_DECREF(tmp);
|
|
}
|
|
if (last) {
|
|
tmp = self->last;
|
|
Py_INCREF(last);
|
|
self->last = last;
|
|
Py_DECREF(tmp);
|
|
}
|
|
return 0;
|
|
}
|
|
|
|
With these changes, we can assure that the ``first`` and ``last`` members are
|
|
never ``NULL`` so we can remove checks for ``NULL`` values in almost all cases.
|
|
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
|
|
:c:func:`Py_DECREF` calls. The only place we can't change these calls is in
|
|
the ``tp_dealloc`` implementation, where there is the possibility that the
|
|
initialization of these members failed in ``tp_new``.
|
|
|
|
We also rename the module initialization function and module name in the
|
|
initialization function, as we did before, and we add an extra definition to the
|
|
:file:`setup.py` file.
|
|
|
|
|
|
Supporting cyclic garbage collection
|
|
====================================
|
|
|
|
Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
|
|
can identify unneeded objects even when their reference counts are not zero.
|
|
This can happen when objects are involved in cycles. For example, consider:
|
|
|
|
.. code-block:: pycon
|
|
|
|
>>> l = []
|
|
>>> l.append(l)
|
|
>>> del l
|
|
|
|
In this example, we create a list that contains itself. When we delete it, it
|
|
still has a reference from itself. Its reference count doesn't drop to zero.
|
|
Fortunately, Python's cyclic garbage collector will eventually figure out that
|
|
the list is garbage and free it.
|
|
|
|
In the second version of the :class:`!Custom` example, we allowed any kind of
|
|
object to be stored in the :attr:`!first` or :attr:`!last` attributes [#]_.
|
|
Besides, in the second and third versions, we allowed subclassing
|
|
:class:`!Custom`, and subclasses may add arbitrary attributes. For any of
|
|
those two reasons, :class:`!Custom` objects can participate in cycles:
|
|
|
|
.. code-block:: pycon
|
|
|
|
>>> import custom3
|
|
>>> class Derived(custom3.Custom): pass
|
|
...
|
|
>>> n = Derived()
|
|
>>> n.some_attribute = n
|
|
|
|
To allow a :class:`!Custom` instance participating in a reference cycle to
|
|
be properly detected and collected by the cyclic GC, our :class:`!Custom` type
|
|
needs to fill two additional slots and to enable a flag that enables these slots:
|
|
|
|
.. literalinclude:: ../includes/newtypes/custom4.c
|
|
|
|
|
|
First, the traversal method lets the cyclic GC know about subobjects that could
|
|
participate in cycles::
|
|
|
|
static int
|
|
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
|
|
{
|
|
int vret;
|
|
if (self->first) {
|
|
vret = visit(self->first, arg);
|
|
if (vret != 0)
|
|
return vret;
|
|
}
|
|
if (self->last) {
|
|
vret = visit(self->last, arg);
|
|
if (vret != 0)
|
|
return vret;
|
|
}
|
|
return 0;
|
|
}
|
|
|
|
For each subobject that can participate in cycles, we need to call the
|
|
:c:func:`!visit` function, which is passed to the traversal method. The
|
|
:c:func:`!visit` function takes as arguments the subobject and the extra argument
|
|
*arg* passed to the traversal method. It returns an integer value that must be
|
|
returned if it is non-zero.
|
|
|
|
Python provides a :c:func:`Py_VISIT` macro that automates calling visit
|
|
functions. With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
|
|
in ``Custom_traverse``::
|
|
|
|
static int
|
|
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
|
|
{
|
|
Py_VISIT(self->first);
|
|
Py_VISIT(self->last);
|
|
return 0;
|
|
}
|
|
|
|
.. note::
|
|
The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
|
|
arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.
|
|
|
|
Second, we need to provide a method for clearing any subobjects that can
|
|
participate in cycles::
|
|
|
|
static int
|
|
Custom_clear(CustomObject *self)
|
|
{
|
|
Py_CLEAR(self->first);
|
|
Py_CLEAR(self->last);
|
|
return 0;
|
|
}
|
|
|
|
Notice the use of the :c:func:`Py_CLEAR` macro. It is the recommended and safe
|
|
way to clear data attributes of arbitrary types while decrementing
|
|
their reference counts. If you were to call :c:func:`Py_XDECREF` instead
|
|
on the attribute before setting it to ``NULL``, there is a possibility
|
|
that the attribute's destructor would call back into code that reads the
|
|
attribute again (*especially* if there is a reference cycle).
|
|
|
|
.. note::
|
|
You could emulate :c:func:`Py_CLEAR` by writing::
|
|
|
|
PyObject *tmp;
|
|
tmp = self->first;
|
|
self->first = NULL;
|
|
Py_XDECREF(tmp);
|
|
|
|
Nevertheless, it is much easier and less error-prone to always
|
|
use :c:func:`Py_CLEAR` when deleting an attribute. Don't
|
|
try to micro-optimize at the expense of robustness!
|
|
|
|
The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
|
|
attributes. It means the circular GC can be triggered inside the function.
|
|
Since the GC assumes reference count is not zero, we need to untrack the object
|
|
from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
|
|
Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
|
|
and ``Custom_clear``::
|
|
|
|
static void
|
|
Custom_dealloc(CustomObject *self)
|
|
{
|
|
PyObject_GC_UnTrack(self);
|
|
Custom_clear(self);
|
|
Py_TYPE(self)->tp_free((PyObject *) self);
|
|
}
|
|
|
|
Finally, we add the :c:macro:`Py_TPFLAGS_HAVE_GC` flag to the class flags::
|
|
|
|
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,
|
|
|
|
That's pretty much it. If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
|
|
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
|
|
garbage collection. Most extensions will use the versions automatically provided.
|
|
|
|
|
|
Subclassing other types
|
|
=======================
|
|
|
|
It is possible to create new extension types that are derived from existing
|
|
types. It is easiest to inherit from the built in types, since an extension can
|
|
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
|
|
these :c:type:`PyTypeObject` structures between extension modules.
|
|
|
|
In this example we will create a :class:`!SubList` type that inherits from the
|
|
built-in :class:`list` type. The new type will be completely compatible with
|
|
regular lists, but will have an additional :meth:`!increment` method that
|
|
increases an internal counter:
|
|
|
|
.. code-block:: pycon
|
|
|
|
>>> import sublist
|
|
>>> s = sublist.SubList(range(3))
|
|
>>> s.extend(s)
|
|
>>> print(len(s))
|
|
6
|
|
>>> print(s.increment())
|
|
1
|
|
>>> print(s.increment())
|
|
2
|
|
|
|
.. literalinclude:: ../includes/newtypes/sublist.c
|
|
|
|
|
|
As you can see, the source code closely resembles the :class:`!Custom` examples in
|
|
previous sections. We will break down the main differences between them. ::
|
|
|
|
typedef struct {
|
|
PyListObject list;
|
|
int state;
|
|
} SubListObject;
|
|
|
|
The primary difference for derived type objects is that the base type's
|
|
object structure must be the first value. The base type will already include
|
|
the :c:func:`PyObject_HEAD` at the beginning of its structure.
|
|
|
|
When a Python object is a :class:`!SubList` instance, its ``PyObject *`` pointer
|
|
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::
|
|
|
|
static int
|
|
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
|
|
{
|
|
if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
|
|
return -1;
|
|
self->state = 0;
|
|
return 0;
|
|
}
|
|
|
|
We see above how to call through to the :meth:`~object.__init__` method of the base
|
|
type.
|
|
|
|
This pattern is important when writing a type with custom
|
|
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
|
|
members. The :c:member:`~PyTypeObject.tp_new` handler should not actually
|
|
create the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
|
|
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.
|
|
|
|
The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
|
|
specifying the type's concrete base class. Due to cross-platform compiler
|
|
issues, you can't fill that field directly with a reference to
|
|
:c:type:`PyList_Type`; it should be done later in the module initialization
|
|
function::
|
|
|
|
PyMODINIT_FUNC
|
|
PyInit_sublist(void)
|
|
{
|
|
PyObject* m;
|
|
SubListType.tp_base = &PyList_Type;
|
|
if (PyType_Ready(&SubListType) < 0)
|
|
return NULL;
|
|
|
|
m = PyModule_Create(&sublistmodule);
|
|
if (m == NULL)
|
|
return NULL;
|
|
|
|
if (PyModule_AddObjectRef(m, "SubList", (PyObject *) &SubListType) < 0) {
|
|
Py_DECREF(m);
|
|
return NULL;
|
|
}
|
|
|
|
return m;
|
|
}
|
|
|
|
Before calling :c:func:`PyType_Ready`, the type structure must have the
|
|
:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving an
|
|
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
|
|
slot with :c:func:`PyType_GenericNew` -- the allocation function from the base
|
|
type will be inherited.
|
|
|
|
After that, calling :c:func:`PyType_Ready` and adding the type object to the
|
|
module is the same as with the basic :class:`!Custom` examples.
|
|
|
|
|
|
.. rubric:: Footnotes
|
|
|
|
.. [#] This is true when we know that the object is a basic type, like a string or a
|
|
float.
|
|
|
|
.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
|
|
in this example, because our type doesn't support garbage collection.
|
|
|
|
.. [#] We now know that the first and last members are strings, so perhaps we
|
|
could be less careful about decrementing their reference counts, however,
|
|
we accept instances of string subclasses. Even though deallocating normal
|
|
strings won't call back into our objects, we can't guarantee that deallocating
|
|
an instance of a string subclass won't call back into our objects.
|
|
|
|
.. [#] Also, even with our attributes restricted to strings instances, the user
|
|
could pass arbitrary :class:`str` subclasses and therefore still create
|
|
reference cycles.
|