Users Guide¶
Implementing the buffer protocol¶
A matrix class¶
The following Python code implements a matrix of floats, where the number of columns is fixed at construction time but rows can be added dynamically.
# matrix.py
import array
class Matrix:
def __init__(self, ncols: int):
self.ncols: int = ncols
self.vector: array = array.array("f")
def add_row(self):
"""Adds a row, initially zero-filled."""
for _ in range(self.ncols):
self.vector.append(0.0)
There are no methods to do anything productive with the matrices’ contents.
We could implement custom __getitem__
, __setitem__
, etc. for this,
but instead we’ll use the buffer protocol to expose the matrix’s data to
Python so we can use array module (or alternatively e.g. NumPy) to do useful
work.
Implementing the buffer protocol requires adding two methods,
__getbuffer__
and __releasebuffer__
, which Python handles specially.
import array
import ctypes as ct
from memorybuffer import Buffer, Py_buffer
class Matrix(Buffer):
def __init__(self, ncols: int):
self.ncols: int = ncols
self.vector: array = array.array("f")
def add_row(self):
"""Adds a row, initially zero-filled."""
for _ in range(self.ncols):
self.vector.append(0.0)
# Buffer protocol
def __getbuffer__(self, buffer: Py_buffer, flags: int):
length = len(self.vector)
itemsize = self.vector.itemsize
buffsize = length * itemsize
shape = (ct.c_ssize_t * 2)()
strides = (ct.c_ssize_t * 2)()
shape[0] = length // self.ncols
shape[1] = self.ncols
# Stride 0 is the distance, in bytes, between the first elements
# of adjacent rows.
# Stride 1 is the distance, in bytes, between two items in a row;
# this is the distance between two adjacent items in the vector.
strides[0] = self.ncols * itemsize
strides[1] = itemsize
buffer.buf = self.__from_buffer__(self.vector, buffsize)
buffer.len = buffsize # product(shape) * itemsize
buffer.itemsize = itemsize
buffer.readonly = False
buffer.ndim = 2
buffer.format = b"f" # float
buffer.shape = shape
buffer.strides = strides
buffer.suboffsets = None # for pointer arrays only
buffer.internal = None # see References
def __releasebuffer__(self, buffer: Py_buffer):
pass
The method Matrix.__getbuffer__
fills a descriptor structure, called
a Py_buffer
, that is defined by the Python C-API and also has its fully
compatible ctypes based implementation as memorybuffer.Py_buffer
class.
It contains a pointer to the actual buffer in memory, as well as metadata
about the shape of the array and the strides (step sizes to get from one
element or row to the next).
Its shape
and strides
members are pointers that must point to arrays
of type and size ctypes.c_ssize_t[ndim]
.
These arrays have to stay alive as long as any buffer views the data.
Fortunately ctypes guarantee this, so both can be defined/created locally
in __getbuffer__
method.
The code is not yet complete, but we can already compile it and test the basic functionality.
>>> from matrix import Matrix
>>> m = Matrix(6)
>>> m.add_row()
>>> m.add_row()
>>> m.vector
array('f', [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
>>> a = memoryview(m)
>>> for col in range(m.ncols):
>>> a[0,col] = 1
>>> m.vector
array('f', [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
Now we can view the Matrix
using memoryview
, and modify its contents
using standard memoryview
operations.
Flags¶
We skipped some input validation in the code.
The flags
argument to __getbuffer__
comes from memoryview
,
(and other clients, e.g. np.asarray
) and is an OR of boolean flags
that describe the kind of array that is requested.
Strictly speaking, if the flags contain PyBUF_ND
, PyBUF_SIMPLE
,
or PyBUF_F_CONTIGUOUS
, __getbuffer__
must raise a BufferError
.
These macros are class attributes of Py_buffer
class which can be
imported from memorybuffer
.
(The matrix-in-vector structure actually conforms to PyBUF_ND
,
but that would prohibit __getbuffer__
from filling in the strides.
A single-row matrix is F-contiguous, but a larger matrix is not.)
References¶
The buffer interface used here is set out in PEP 3118, Revising the buffer protocol.
A tutorial for using this API from C is on Jake Vanderplas’s blog, An Introduction to the Python Buffer Protocol.
Reference documentation is available for Python 3.