using special function such as __add__ in cython cdef classes - cython

I am wanting to create a cython object that can has convenient operations such as addition multiplication and comparisons. But when I compile such classes they all seem to have a lot of python overhead.
A simple example:
%%cython -a
cdef class Pair:
cdef public:
int a
int b
def __init__(self, int a, int b):
self.a = a
self.b = b
def __add__(self, Pair other):
return Pair(self.a + other.a, self.b + other.b)
p1 = Pair(1, 2)
p2 = Pair(3, 4)
p3 = p1+p2
print(p3.a, p3.b)
But I end up getting quite large readouts from the annotated compiler
It seems like the __add__ function is converting objects from python floats to cython doubles and doing a bunch of type checking. Am I doing something wrong?

There's likely a couple of issues:
I'm assuming that you're using Cython 0.29.x (and not the newer Cython 3 alpha). See https://cython.readthedocs.io/en/stable/src/userguide/special_methods.html#arithmetic-methods
This means that you can’t rely on the first parameter of these methods being “self” or being the right type, and you should test the types of both operands before deciding what to do
It is likely treating self as untyped and thus accessing a and b as Python attributes.
The Cython 3 alpha treats special methods differently (see https://cython.readthedocs.io/en/latest/src/userguide/special_methods.html#arithmetic-methods) so you could also consider upgrading to that.
Although the call to __init__ has C typed arguements it's still a Python call so you can't avoid boxing and unboxing the arguments to Python ints. You could avoid this call and do something like:
cdef Pair res = Pair.__new__()
res.a = ... # direct assignment to attribute

Related

Can Python object be referenced under cdef function?

In the code below, self is a python object, as it is declared from def instead of cdef. Can a python object be referenced under a cdef function, just like how it is used under the c_function from my example below?
I am confused because cdef makes it sound like cdef is a C function, so I am not sure if it is able to take on a python object.
class A(self, int w):
def __init__():
self.a = w
cdef c_function (self, int b):
return self.a + b
Thank you,
I am the OP.
From "Cython: A Guide for Python Programmers":
Source: https://books.google.com.hk/books?id=H3JmBgAAQBAJ&pg=PA49&lpg=PA49&dq=python+object+cdef+function&source=bl&ots=QI9If_wiyR&sig=ACfU3U1CmEPBaEVmW0UN1_9m9G8B9a6oFQ&hl=en&sa=X&ved=2ahUKEwiBgs3Lw5vqAhXSZt4KHTs0DK0Q6AEwBHoECAoQAQ#v=onepage&q=python%20object%20cdef%20function&f=false
"[...] Nothing prevents us from declaring and using Python objects and dynamic variables in cdef functions, or accepting them as arguments"
So I take this as the book saying "yes" to my original question.

issue using deepcopy function for cython classes

I've been playing with Cython recently for the speed ups, but when I was trying to use copy.deepcopy() some error occurred.Here is the code:
from copy import deepcopy
cdef class cy_child:
cdef public:
int move[2]
int Q
int N
def __init__(self, move):
self.move = move
self.Q = 0
self.N = 0
a = cy_child((1,2))
b = deepcopy(a)
This is the error:
can't pickle _cython_magic_001970156a2636e3189b2b84ebe80443.cy_child objects
How can I solve the problem for this code?
As hpaulj says in the comments, deepcopy looks to use pickle by default to do its work. Cython cdef classes didn't used to be pickleable. In recent versions of Cython they are where possible (see also http://blog.behnel.de/posts/whats-new-in-cython-026.html) but pickling the array looks to be a problem (and even without the array I didn't get it to work).
The solution is to implement the relevant functions yourself. I've done __deepcopy__ since it's simple but alternatively you could implement the pickle protocol
def __deepcopy__(self,memo_dictionary):
res = cy_child(self.move)
res.Q = self.Q
res.N = self.N
return res
I suspect that you won't need to do that in the future as Cython improves their pickle implementation.
A note on memo_dictionary: Suppose you have
a=[None]
b=[A]
a[0]=B
# i.e. A contains a link to B and B contains a link to A
c = deepcopy(a)
memo_dictionary is used by deepcopy to keep a note of what it's already copied so that it doesn't loop forever. You don't need to do much with it yourself. However, if your cdef class contains a Python object (including another cdef class) you should copy it like this:
cdef class C:
cdef object o
def __deepcopy__(self,memo_dictionary):
# ...
res.o = deepcopy(self.o,memo_dictionary)
# ...
(i.e. make sure it gets passed on to further calls of deepcopy)

Cython overhead on extension types in memoryview

I am compiling a Cython module, and checked this piece of code with cython -a command.
cdef INT_t print_info(Charge[:] electrons):
cdef INT_t i, index
for i in range(electrons.shape[0]):
index = electrons[i].particleindex
return index
It turns out that
+ index = electrons[i].particleindex
__pyx_t_4 = __pyx_v_i;
__pyx_t_3 = (PyObject *) *((struct __pyx_obj_14particle_class_Charge * *) ( /* dim=0 */ (__pyx_v_electrons.data + __pyx_t_4 * __pyx_v_electrons.strides[0]) ));
__Pyx_INCREF((PyObject*)__pyx_t_3);
__pyx_t_5 = ((struct __pyx_obj_14particle_class_Charge *)__pyx_t_3)->particleindex;
__Pyx_DECREF(((PyObject *)__pyx_t_3)); __pyx_t_3 = 0;
__pyx_v_index = __pyx_t_5;
Charge is a cdef extension type and I am trying to use a memoryview buffer Charge[:] here. It seems that Cython calls some Python API in this case, in particular __Pyx_INCREF((PyObject*) and __Pyx_DECREF(((PyObject *) have been generated.
I am wondering what causes this, will it cause a lot of slowdown? It is my first post in the forum, any comments or suggestions are greatly appreciated!
PS: Charge object is defined as
charge.pyx
cdef class Charge:
def __cinit__(Charge self):
self.particleindex = 0
self.charge = 0
self.mass = 0
self.energy = 0
self.on_electrode = False
charge.pxd
cdef class Charge:
cdef INT_t particleindex
cdef FLOAT_t charge
cdef FLOAT_t mass
cdef FLOAT_t energy
cdef bint on_electrode
Cython will likely be happier with pythonic code. Rewrite your function:
cdef INT_t print_info(Charge[:] electrons):
cdef INT_t i, index
for electron in electrons:
index = electron.particleindex
return index
and try again.
It's not the memoryview, it's the extension type. Cython extension types are treated with the same reference-counting semantics as Python objects.
You can get and work with pointers to them which do not change recounts, either with <void*> or with <PyObject*> (which you can cimport from cpython.ref), but the pointers obviously don't have any methods or attributes. The minute you try to cast back to the extension type type, the INCREF/DECREF code reappears. Those instructions are pretty fast though.
There was some talk on the mailing list about how non-refcounted references (i.e., with access to object data) to extension types might be a new feature in the future, but there seemed to be little enthusiasm for adding a feature that, realistically, is probably going to lead to a lot of horrifyingly buggy code of the "access violation" variety.

Cython - copy constructors

I've got a C library that I'm trying to wrap in Cython. One of the classes I'm creating contains a pointer to a C structure. I'd like to write a copy constructor that would create a second Python object pointing to the same C structure, but I'm having trouble, as the pointer cannot be converted into a python object.
Here's a sketch of what I'd like to have:
cdef class StructName:
cdef c_libname.StructName* __structname
def __cinit__(self, other = None):
if not other:
self.__structname = c_libname.constructStructName()
elif type(other) is StructName:
self.__structname = other.__structname
The real problem is that last line - it seems Cython can't access cdef fields from within a python method. I've tried writing an accessor method, with the same result. How can I create a copy constructor in this situation?
When playing with cdef classes, attribute access are compiled to C struct member access. As a consequence, to access to a cdef member of an object A you have to be sure of the type of A. In __cinit__ you didn't tell Cython that other is an instance of StructName. Therefore Cython refuses to compile other.__structname. To fix the problem, just write
def __cinit__(self, StructName other = None):
Note: None is equivalent to NULL and therefore is accepted as a StructName.
If you want more polymorphism then you have to rely on type casts:
def __cinit__(self, other = None):
cdef StructName ostr
if not other:
self.__structname = c_libname.constructStructName()
elif type(other) is StructName:
ostr = <StructName> other
self.__structname = ostr.__structname

Why can't cython memory views be pickled?

I have a cython module that uses memoryview arrays, that is...
double[:,:] foo
I want to run this module in parallel using multiprocessing. However I get the error:
PicklingError: Can't pickle <type 'tile_class._memoryviewslice'>: attribute lookup tile_class._memoryviewslice failed
Why can't I pickle a memory view and what can I do about it.
Maybe passing the actual array instead of the memory view can solve your problem.
If you want to execute a function in parallel, all of it parameters have to be picklable if i recall correctly. At least that is the case with python multiprocessing. So you could pass the array to the function and create the memoryview inside your function.
def some_function(matrix_as_array):
cdef double[:,:] matrix = matrix_as_array
...
I don't know if this helps you, but I encountered a similar problem. I use a memoryview as an attribute in a cdef class. I had to write my own __reduce__ and __setstate__ methods to correctly unpickle instances of my class. Pickling the memory view as an array by using numpy.asarray and restoring it in __setstate__ worked for me. A reduced version of my code:
import numpy as np
cdef class Foo:
cdef double[:,:] matrix
def __init__(self, matrix):
'''Assign a passed array to the typed memory view.'''
self.matrix = matrix
def __reduce__(self):
'''Define how instances of Foo are pickled.'''
d=dict()
d['matrix'] = np.asarray(self.matrix)
return (Foo, (d['matrix'],), d)
def __setstate__(self, d):
'''Define how instances of Foo are restored.'''
self.matrix = d['matrix']
Note that __reduce__ returns a tuple consisting of a callable (Foo), a tuple of parameters for that callable (i.e. what is needed to create a 'new' Foo instance, in this case the saved matrix) and the dictionary with all values needed to restore the instance.