Overview

The NumPy refactoring project seeks to build a CPython-independent version of the numpy module without splitting the codebase or breaking compatibility with the existing Python and C APIs to numpy. The approach being taken is to refactor NumPy into two distinct architectural layers: a Python-independent core implementing the multi-dimensional array, ufunc functionality, and array manipulation; and a CPython-specific interface layer. The CPython interface will serve as the reference implementation of the interface layer, and a new .NET-compatible interface supporting IronPython is under development. In general, the core API is intended to support the development of additional interface layers as well as present a clean interface for C/C++ applications.

Specifically, the high-level goals and requirements of this project are:

  • Source-code compatibility with the existing numpy Python API must be maintained. This means that the existing (passing) unit tests in the numpy test suite must pass.
  • Source-code compatibility with the existing numpy C API must be maintained. No attempt to retain binary compatibility will be made.
  • CPython application performance will remain essentially unchanged. This means that performance over large numeric arrays should remain unchanged. There may be a slight increase in per-call overhead due to the extra architectural layer. There might also be a small increase in runtime for arrays of Python objects for similar reasons.
  • Developers will be able to use numpy in non-CPython systems by using the new core API and implementing a minimal set of interface-layer support functions.

Implementation Changes

This section describes the changes made to NumPy in order to refactor it into separate architectural layers. Moving the major functionality into a Python-independent core means that the data structures must be split into core data structures and associated interface-layer wrappers. Also, because core data structures can refer to other core data structures, memory management becomes somewhat more complicated and cannot rely completely on the CPython reference counting methods. These issues and examples of the changes are described in the following sections.

Data Structures

The original structures such as PyArrayObject included the CPython header fields (PyObject_HEAD) and all of the data fields. This organization causes problems with the refactoring because the memory layout of the PyObject_HEAD fields cannot be known to the core library. One option is to build a custom version of the core library for each interface, allowing the interface layer to inject custom header fields. This has the advantage of simplifying memory management and may offer a small performance advantage because all of the data is stored in a single memory block. The downside is that the core library must be recompiled for each interface layer, complicating testing and maintenance.

The method used in the refactoring is to split each object into a Python-independent core data structure and a lightweight interface-specific wrapper. For PyArrayObject this means the Python wrapper is simply the PyObject_HEAD fields and a pointer to the core data structure, NpyArray. The NpyArray structure contains all of the remaining fields. The mapping between the interface-layer objects and the new core representations is shown in the table below:

Interface Struct              | Core Structure                 | Core Header File   | Notes
PyArrayObject                 | NpyArray                       | npy_arrayobject.h  |
PyArrayDescr                  | NpyArray_Descr                 | npy_descriptor.h   |
-                             | NpyArray_ArrFuncs              | npy_descriptor.h   | Was PyArray_ArrFuncs
-                             | NpyArray_ArrayDescr            | npy_descriptor.h   | Was PyArray_ArrayDescr
-                             | NpyDict, NpyDict_Iter          | npy_dict.h         | Used in place of PyDict
PyArrayIterObject             | NpyArrayIterObject             | npy_iterator.h     |
PyArrayMultiIterObject        | NpyArrayMultiIterObject        | npy_iterator.h     |
PyArrayNeighborhoodIterObject | NpyArrayNeighborhoodIterObject | npy_iterator.h     |
PyArrayMapIterObject          | NpyArrayMapIterObject          | npy_iterator.h     |
PyUFuncObject                 | NpyUFuncObject                 | npy_ufunc_object.h |
-                             | NpyUFuncLoopObject             | npy_ufunc_object.h | Was PyUFuncLoopObject, only used internally
-                             | NpyUFuncReduceObject           | npy_ufunc_object.h | Was PyUFuncReduceObject, only used internally
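
The wrapper/core split described above for PyArrayObject can be sketched roughly as follows. This is only an illustration, not the project's actual definitions: DemoPyHead stands in for PyObject_HEAD so the code builds without Python.h, the nd and data fields are placeholder data fields, and the function names are hypothetical. Only the wrapper's array field and the core's nob_interface back-pointer follow names used in this document.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyObject_HEAD so the sketch builds without Python.h. */
typedef struct {
    long  ob_refcnt;
    void *ob_type;
} DemoPyHead;

/* Core side: holds the data fields and knows nothing about CPython. */
typedef struct {
    void *nob_interface;  /* opaque pointer back to the wrapper, or NULL */
    int   nd;             /* placeholder data field */
    char *data;           /* placeholder data field */
} DemoNpyArray;

/* Interface side: only the header fields plus a pointer to the core structure. */
typedef struct {
    DemoPyHead    head;
    DemoNpyArray *array;
} DemoPyArrayObject;

/* Allocate a core array and its wrapper, and link them in both directions. */
DemoPyArrayObject *demo_new_wrapped_array(int nd)
{
    assert(nd >= 0);
    DemoNpyArray *core = calloc(1, sizeof *core);
    DemoPyArrayObject *wrap = calloc(1, sizeof *wrap);
    if (core == NULL || wrap == NULL) {
        free(core);
        free(wrap);
        return NULL;
    }
    core->nd = nd;
    core->nob_interface = wrap;   /* core -> wrapper */
    wrap->head.ob_refcnt = 1;
    wrap->array = core;           /* wrapper -> core */
    return wrap;
}

/* Release both halves; the wrapper owns the core in this sketch. */
void demo_free_wrapped_array(DemoPyArrayObject *wrap)
{
    if (wrap != NULL) {
        free(wrap->array);
        free(wrap);
    }
}
```

Because the core structure never mentions the header type, the same compiled core could in principle be paired with a differently shaped wrapper by another interface layer.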

One issue this split raises is memory management and how to manage the lifetime of the two objects to keep them in sync. Further complicating the memory management is the need to support reference counted memory management (CPython) and garbage collected systems (IronPython, Jython, etc). All of these issues are discussed in the Memory Management section.

Core Objects

Each core data structure defines three fields similar to the PyObject_HEAD fields, plus a fourth field that is present only in debug builds. They are:

npy_intp nob_refcnt;             /* Current reference count */
NpyTypeObject* nob_type;         /* NpyType object, currently only defines the deallocate function */
void *nob_interface;             /* Opaque pointer to the interface wrapper object or NULL */
int nob_magic_number;            /* DEBUG ONLY: flags valid, allocated objects */

The first two fields, nob_refcnt and nob_type, behave similarly to the PyObject equivalents, ob_refcnt and ob_type. The nob_interface field is important because it associates the object with some object provided by the interface; it can also be NULL if not needed by the interface. When this value is non-NULL it plays an important part in the reference counting semantics -- see Memory Management below for details.

The macro Npy_INTERFACE (see npy_defs.h) returns the interface pointer as a void*.

The field nob_magic_number is defined only in debug builds and is used to quickly validate that a given pointer refers to a valid, allocated object. Immediately after allocation this field is set to NPY_VALID_MAGIC (1234567 in base 10). The field does not change until the object is deallocated, at which point it is set to NPY_INVALID_MAGIC. These two settings allow detection of memory allocation errors. Any other value means that the memory has been deallocated and already re-allocated, or that the pointer was cast incorrectly. Casting errors occur easily when moving between the interface and core layers, and asserts that check the nob_magic_number field help to identify them quickly. A magic_number field has been added to a couple of the CPython-layer objects as well. This field takes on the same values as in the core objects but is always at a different offset from the pointer, so that the two kinds of object can be told apart.
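
A minimal sketch of the header fields and the magic-number check might look like the following. The field names and the value 1234567 come from the text above; everything prefixed with Demo or demo_ is hypothetical, the value chosen for the invalid marker is an assumption (the text does not give it), and long and void* stand in for npy_intp and NpyTypeObject*.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_VALID_MAGIC   1234567  /* value stated in the text */
#define DEMO_INVALID_MAGIC 7654321  /* assumed value; the text does not give one */

typedef struct {
    long  nob_refcnt;        /* stand-in for npy_intp */
    void *nob_type;          /* stand-in for NpyTypeObject* */
    void *nob_interface;     /* wrapper object, or NULL */
    int   nob_magic_number;  /* debug-only in the real layout */
} DemoCoreObject;

/* Mirrors the described Npy_INTERFACE macro: yields the wrapper as void*. */
#define DEMO_INTERFACE(obj) ((obj)->nob_interface)

/* Returns nonzero only for a live, correctly typed object. */
int demo_is_valid(const DemoCoreObject *obj)
{
    return obj != NULL && obj->nob_magic_number == DEMO_VALID_MAGIC;
}

/* Allocate a core object with one reference and stamp it as valid. */
DemoCoreObject *demo_alloc(void *wrapper)
{
    DemoCoreObject *obj = calloc(1, sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    obj->nob_refcnt = 1;
    obj->nob_interface = wrapper;
    obj->nob_magic_number = DEMO_VALID_MAGIC;
    return obj;
}

/* Stamp the object as dead. A real deallocator would free the memory right
   after this; the two steps are kept apart here so the stamp can be inspected
   without reading freed memory. */
void demo_invalidate(DemoCoreObject *obj)
{
    assert(demo_is_valid(obj));
    obj->nob_magic_number = DEMO_INVALID_MAGIC;
}
```

An assert on demo_is_valid at the top of each core entry point would then catch both stale pointers and wrapper pointers passed where a core pointer was expected.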

Memory Management

The core library implements a reference counting mechanism derived from CPython's. The reference count of a core object is manipulated with the macros Npy_INCREF, Npy_DECREF, and Npy_XDECREF. These macros are similar to, but slightly different from, the corresponding CPython macros, and the two sets cannot be mixed. In debug builds these macros check the nob_magic_number field to verify they are operating on valid objects.

When the interface layer has not provided an interface wrapper object during allocation (that is, nob_interface is NULL), reference counting behaves exactly as it does in CPython. When a new object is allocated, nob_refcnt is set to 1. Once nob_refcnt reaches 0 the object is deallocated.
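
The wrapper-less case can be sketched as below. This is an illustration under assumptions rather than the library's macros: the functions stand in for Npy_INCREF and Npy_DECREF, the dealloc field name on the type object is hypothetical, and a global counter is used only so the deallocation can be observed.

```c
#include <assert.h>
#include <stdlib.h>

struct DemoObject;

/* Stand-in for NpyTypeObject; the field name `dealloc` is hypothetical. */
typedef struct {
    void (*dealloc)(struct DemoObject *);
} DemoTypeObject;

typedef struct DemoObject {
    long            nob_refcnt;
    DemoTypeObject *nob_type;
    void           *nob_interface;  /* stays NULL throughout this sketch */
} DemoObject;

/* Counts deallocations so callers can observe them without touching freed memory. */
int demo_dealloc_calls = 0;

void demo_dealloc(DemoObject *obj)
{
    demo_dealloc_calls++;
    free(obj);
}

DemoTypeObject demo_type = { demo_dealloc };

/* A new object starts with a single reference owned by the caller. */
DemoObject *demo_new(void)
{
    DemoObject *obj = calloc(1, sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    obj->nob_refcnt = 1;
    obj->nob_type = &demo_type;
    return obj;
}

void demo_incref(DemoObject *obj)
{
    obj->nob_refcnt++;
}

/* Dropping the last reference hands the object to its type's deallocator. */
void demo_decref(DemoObject *obj)
{
    assert(obj->nob_refcnt > 0);
    if (--obj->nob_refcnt == 0) {
        obj->nob_type->dealloc(obj);
    }
}
```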

When the interface layer does provide an interface wrapper (nob_interface != NULL), the lifetime of the core object is tied to the interface object, and it is the interface object that is responsible for deallocating the core object. To implement this, each core object holds a reference to the interface object whenever the core object's reference count is > 0. This means that at creation the interface object is referenced and the reference count of the core object is set to 1. It is possible for external objects to hold a reference to the interface wrapper object when the reference count of the core object goes to 0. In this case, the core object (via the Npy_DECREF macro) releases its reference on the interface wrapper. If the interface wrapper is still referenced, the core object continues to be allocated and valid until the interface wrapper is dereferenced, at which time both objects are deallocated. It is also possible for the core object to be re-referenced. This transition of the core reference count from 0 to 1 causes the core object to add a reference to the interface wrapper (via Npy_INCREF).

The core interacts with the interface via two callbacks, accessed through the macros NpyInterface_INCREF and NpyInterface_DECREF.

The handshaking gets somewhat more complex when used with garbage-collected systems. In these systems, releasing the reference to the interface wrapper object deallocates the handle, so nob_interface no longer refers to valid memory and should be set to NULL. However, this would break the link to the interface wrapper. To solve this issue, the interface returns a new weak reference handle that is stored into nob_interface. Weak references allow the garbage collector to clean up the target object if no other reference exists, but allow the target to be referenced prior to that.

The behavior of the memory management protocol is relatively complex because it must manage the interface wrappers and support multiple memory management methods, so the following sections walk through specific examples for reference-counted and garbage-collected systems.

Reference-counted Interface Example

This section walks through an example of the handshaking between core and interface layer objects when the interface uses reference-counted memory management. Where code is shown it uses the CPython interface layer as an example, but the concepts are generally applicable to other reference-counted systems.

When the core library constructs a new NpyArray instance, A, a callback is made to the array wrapper callback provided by the interface (NpyInterface_ArrayNewWrapper in ctors.c for the CPython interface). This function returns a wrapper object (PyArrayObject for the CPython interface), Awrap. At this point we have:

A->nob_refcnt = 1
A->nob_interface = Awrap

Awrap->ob_refcnt = 1

Both reference counts are '1' because the core construction function is returning a new reference to the core object and, whenever the core object's reference count is > 0, it holds a single reference to the interface wrapper.

Now suppose A is returned from the core to the interface layer. The interface layer can convert A to Awrap by calling Npy_INTERFACE(A). It also needs to move the reference from A to Awrap. The following code shows how it would be done in the CPython interface:

PyArrayObject *Awrap;
NpyArray *A = someFunctionReturningNpyArray();

Awrap = (PyArrayObject *)Npy_INTERFACE(A);
Py_INCREF(Awrap);
Npy_DECREF(A);

Notice that the Py_INCREF statement will cause the reference count on Awrap to go to 2 since there is already one reference. However, Npy_DECREF causes the reference count on A to go to zero, thus releasing its reference to Awrap and resulting in a decref of Awrap, bringing it back to 1. At this point, Awrap has a single reference as expected and A has a count of zero. A is not deallocated because its lifetime is managed by the wrapper object. In normal situations a reference count of zero means that there is no way to ever reference a given object again. However, since Awrap is still alive there is effectively a weak reference to A.

At this point, if Py_DECREF(Awrap) is called, the reference count of Awrap goes to zero and both instances are deallocated.

Instead, consider the case where A is retrieved from Awrap. In the CPython interface layer this is done through the field array on PyArrayObject like this:

PyArrayObject *Awrap;
NpyArray *A;

A = Awrap->array;   /* Alternately: A = PyArray_ARRAY(Awrap); */
Npy_INCREF(A);
Py_DECREF(Awrap);

This example not only retrieves A from Awrap but also moves the reference from Awrap to A. When Npy_INCREF is called, the reference count on A goes from 0 to 1 and thus it also increments the reference of Awrap bringing it to 2. Py_DECREF then decrements the reference count of Awrap back to 1.

If the core library now calls Npy_DECREF again, the reference count of A transitions from 1 to 0 and thus releases the reference to Awrap. Awrap now has a reference count of 0 and the deallocate function cleans up both Awrap and A.
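
The whole walk-through above can be replayed with a small stand-alone model. This is a sketch of the described protocol, not the actual macros: core_incref and core_decref stand in for Npy_INCREF and Npy_DECREF, wrapper_incref and wrapper_decref stand in for Py_INCREF and Py_DECREF (and for the NpyInterface_INCREF and NpyInterface_DECREF callbacks), and all Demo names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoWrapper DemoWrapper;

typedef struct {
    long         nob_refcnt;
    DemoWrapper *nob_interface;
} DemoCore;

struct DemoWrapper {
    long      ob_refcnt;
    DemoCore *array;
};

/* Number of allocated objects, so deallocation can be observed safely. */
int demo_live_objects = 0;

void wrapper_incref(DemoWrapper *w)
{
    w->ob_refcnt++;
}

/* The wrapper owns the core: when the wrapper dies, both are freed. */
void wrapper_decref(DemoWrapper *w)
{
    assert(w->ob_refcnt > 0);
    if (--w->ob_refcnt == 0) {
        free(w->array);
        free(w);
        demo_live_objects -= 2;
    }
}

/* A 0 -> 1 transition makes the core take a reference on its wrapper. */
void core_incref(DemoCore *a)
{
    if (a->nob_refcnt++ == 0) {
        wrapper_incref(a->nob_interface);
    }
}

/* A 1 -> 0 transition releases the core's reference on its wrapper.
   The core must not be touched afterwards; it may already be freed. */
void core_decref(DemoCore *a)
{
    assert(a->nob_refcnt > 0);
    if (--a->nob_refcnt == 0) {
        wrapper_decref(a->nob_interface);
    }
}

/* Construct A and Awrap: each starts with a count of 1. */
DemoCore *demo_new_array(void)
{
    DemoCore *a = calloc(1, sizeof *a);
    DemoWrapper *w = calloc(1, sizeof *w);
    if (a == NULL || w == NULL) {
        free(a);
        free(w);
        return NULL;
    }
    a->nob_refcnt = 1;
    a->nob_interface = w;
    w->ob_refcnt = 1;   /* the reference held by the core */
    w->array = a;
    demo_live_objects += 2;
    return a;
}
```

Moving a reference from A to Awrap is then wrapper_incref(Awrap) followed by core_decref(A), and moving it back is core_incref(A) followed by wrapper_decref(Awrap), in the same order as the CPython snippets above.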

Garbage-collected Interface Example

NOTE: This example is under construction and currently incomplete and probably incorrect in places.

This example is similar to the reference-counted example above but shows how the memory management system works with a garbage-collected environment. The examples used apply to IronPython and the C#/.NET environment but should be generally applicable to other garbage-collected systems.

When the core library constructs a new NpyArray instance, A, a callback is made to the array wrapper callback provided by the interface (NpyArray.ArrayNewWrapper in NpyArray.cs for the IronPython interface). This function returns an IntPtr to a GCHandle instance that references the wrapper object (ndarray for the IronPython interface), Awrap. At this point we have:

A->nob_refcnt = 1
A->nob_interface = GCHandle(Awrap)

A has a reference count of '1' and holds a GC reference to the wrapper instance.

Now suppose A is returned from the core to the interface layer. The interface layer can convert A to Awrap by calling NpyArray.ToInterface<ndarray>(A). This causes the reference to be moved from the core to the interface layer:

ndarray Awrap;
IntPtr A = someFunctionReturningNpyArray();

Awrap = NpyArray.ToInterface<ndarray>(A);
NpyArray.Decref(A);

The call to ToInterface accesses the nob_interface field of A to retrieve the GCHandle referencing Awrap and returns the Awrap instance. Calling NpyArray.Decref results in the reference count of A going from 1 to 0, and thus A releases its reference to Awrap. Releasing the reference in this environment means that the GCHandle instance pointed to by A->nob_interface is deallocated. This would break the link to Awrap, so a new weak reference is allocated:

GCHandle old = A.nob_interface;
A.nob_interface = GCHandle.Alloc(old.Target, GCHandleType.Weak);
old.Free();

At this point, if the interface code stops referencing Awrap, the garbage collector will be free to deallocate it since A holds only a weak reference. In this case, the finalizer for Awrap will deallocate A as well.

Instead, consider the case where A is retrieved from Awrap. In the IronPython interface this is done using the Array property:

ndarray Awrap;
IntPtr A;

A = Awrap.Array;
NpyArray.Incref(A);

The call to NpyArray.Incref causes the reference count on A to go from 0 to 1 and thus it must hold a reference to Awrap. This is done by deallocating the weak reference in nob_interface and replacing it with a newly allocated full reference. For example:

GCHandle oldWeak = A.nob_interface;
A.nob_interface = GCHandle.Alloc(oldWeak.Target);
oldWeak.Free();

At this point, if the interface stops referencing Awrap then there is one reference on A and A holds the only GC reference to Awrap. If the core calls Npy_DECREF on A, the reference count will go to zero and the GCHandle to Awrap will also be deallocated. Since neither instance is now referenced, the garbage collector will at some point collect Awrap and the finalizer for Awrap will deallocate A.

Object Creation

Discusses the object creation process and requirements on the interface

Library Initialization and Interface-provided Callbacks

Describes the set of callbacks that each interface must provide and the callback that the interface may provide.

Thread-safety

As a part of the refactoring, the core library will have a compile-time option to make it thread-safe in terms of object management. This means that all memory management must be protected from multiple access.

No attempt will be made to enforce safety when accessing array data in terms of multiple threads modifying the same array data - that is left to the developer. The exception is when accessing object arrays and arrays where the underlying data is a Python (or other) object. In this case it is the responsibility of the interface layer to provide the appropriate protections.

The build-time options supported in the initial release will be:

  • None - no thread safety
  • pthreads - On Unix/Linux platforms
  • win32 - On Windows systems

The implementation will support the addition of other options in the future, but these will be the only options for the first release.
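
One way such a compile-time option could be structured is sketched below. This is an assumption about a possible design, not the library's actual build configuration: the macro names DEMO_THREADS_PTHREADS and DEMO_THREADS_WIN32, the single global lock, and the helper function are all hypothetical. With neither macro defined, the lock operations compile to nothing, which corresponds to the "None" option.

```c
#include <assert.h>
#include <stddef.h>

#if defined(DEMO_THREADS_PTHREADS)
  /* pthreads option: one statically initialized global mutex. */
  #include <pthread.h>
  static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
  #define DEMO_LOCK()   pthread_mutex_lock(&demo_lock)
  #define DEMO_UNLOCK() pthread_mutex_unlock(&demo_lock)
#elif defined(DEMO_THREADS_WIN32)
  /* win32 option: a slim reader/writer lock used in exclusive mode. */
  #include <windows.h>
  static SRWLOCK demo_lock = SRWLOCK_INIT;
  #define DEMO_LOCK()   AcquireSRWLockExclusive(&demo_lock)
  #define DEMO_UNLOCK() ReleaseSRWLockExclusive(&demo_lock)
#else
  /* No thread safety: the lock operations do nothing. */
  #define DEMO_LOCK()   ((void)0)
  #define DEMO_UNLOCK() ((void)0)
#endif

/* Increment a reference count under the configured lock and return the new value. */
long demo_locked_incref(long *refcnt)
{
    long value;
    assert(refcnt != NULL);
    DEMO_LOCK();
    value = ++*refcnt;
    DEMO_UNLOCK();
    return value;
}

/* Decrement a reference count under the configured lock and return the new value. */
long demo_locked_decref(long *refcnt)
{
    long value;
    assert(refcnt != NULL);
    DEMO_LOCK();
    value = --*refcnt;
    DEMO_UNLOCK();
    return value;
}
```

A single global lock is the simplest choice for a sketch; a real implementation might prefer atomic operations or finer-grained locking to limit contention.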