@cython.locals(arr=np.ndarray[np.int_t, ndim=1]) not supported?

It seems that one can't declare np.ndarray in cython.locals in .pxd files. It works with memoryviews but not with np.ndarray. However, there are cases where we need np.ndarray.
In notsupported.py:

import numpy as np

def func():
    arr = np.ones(2)
    return arr**2
In notsupported.pxd:

import cython
import numpy as np
cimport numpy as np

@cython.locals(arr=np.ndarray[np.int_t, ndim=1])
cpdef func()
Error log:
Error compiling Cython file:
------------------------------------------------------------
...
import cython
import numpy as np
cimport numpy as np
@cython.locals(arr=np.ndarray[np.int_t, ndim=1])
                                            ^
------------------------------------------------------------
notsupported.pxd:6:44: Expected ']', found '='
Is there something wrong with this code? What is the alternative?

Since it looks like this isn't supported I assume you're really interested in workarounds. For the purpose of this question I'm assuming you want your code to also be valid in pure Python. I'm also assuming that your code is of the form:
def func():
    arr = np.ones(2)
    for n in range(arr.shape[0]):
        arr[n] = ...  # some operation element-by-element
    return arr**2
If your code doesn't have the element-by-element section then there's really no benefit to setting the type at all - I don't believe Cython uses the type for Numpy "whole array" operations like the power operator you show here.
My first choice would be to have two variables: arr and arr_view. arr should be untyped, and arr_view a memoryview. You only use the memoryview in the element-by-element section. Provided you stick to in-place operations the two share the same memory so modifying one modifies the other:
def func():
    arr = np.ones(2)
    arr_view = arr
    for n in range(arr_view.shape[0]):
        arr_view[n] = ...
    return arr**2
The pxd file is then:
@cython.locals(arr_view=np.int_t[:])
cpdef func()
My second choice would be to type arr as a memoryview, and use np.asarray when you want to do "whole array" operations:

def func():
    arr = np.ones(2)
    for n in range(arr.shape[0]):
        arr[n] = ...  # some operation element-by-element
    return np.asarray(arr)**2
with pxd:
@cython.locals(arr=np.int_t[:])
cpdef func()
np.asarray is essentially a no-op if it's passed an array, and can usually avoid a copy if passed a memoryview, so it won't slow things down too much.
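As a quick sanity check of that claim (a pure-Python sketch I added, using the built-in memoryview as a stand-in for a Cython typed memoryview):

import numpy as np

a = np.ones(3)
mv = memoryview(a)   # stand-in for a Cython typed memoryview
b = np.asarray(mv)   # wraps the same buffer, no copy
b[0] = 42.0
print(a[0])          # 42.0 - both views share the same memory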
A third option is to use the arr.base object of a memoryview to get the underlying Numpy array. This loses pure Python compatibility though, since arr.base is often None when arr is a Numpy array. Therefore I don't really recommend it here.
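For illustration, a minimal sketch of that third option (my own, reusing the same pxd typing of arr as a memoryview as above):

def func():
    arr = np.ones(2)      # typed as a memoryview via the pxd
    for n in range(arr.shape[0]):
        arr[n] = ...
    return arr.base**2    # arr.base is the underlying Numpy array in Cython,
                          # but would be None here in pure Python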

Related

Memoryviews slices in Cython ask for a scalar

I'm trying to create a memoryview to store several vectors as rows, but when I try to change the value of any of them I get an error, as if it were expecting a scalar.
%%cython
import numpy as np
cimport numpy as np
DTYPE = np.float
ctypedef np.float_t DTYPE_t
cdef DTYPE_t[:, ::1] results = np.zeros(shape=(10, 10), dtype=DTYPE)
results[:, 0] = np.random.rand(10)
This throws me the following error:
TypeError: only size-1 arrays can be converted to Python scalars
Which I don't understand given that I want to overwrite the first row with that vector... Any idea about what I am doing wrong?
The operation you would like to use is possible between numpy arrays (Python functionality) or between Cython's memoryviews (C functionality, i.e. Cython generates the right for-loops in the C code), but not when you mix a memoryview (on the left-hand side) and a numpy array (on the right-hand side).
So you have either to use Cython's memory-views:
...
cdef DTYPE_t[::1] r = np.random.rand(10)
results[:, 0] = r
#check it worked:
print(results.base)
...
or numpy-arrays (we know .base is a numpy-array):
results.base[:, 0] = np.random.rand(10)
#check it worked:
print(results.base)
Cython's version has less overhead, but for large matrices there won't be much difference.
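For reference, here is a self-contained version of the memoryview route (a sketch of mine; I use np.float64 rather than the deprecated np.float alias):

%%cython
import numpy as np
cimport numpy as np

ctypedef np.float64_t DTYPE_t

cdef DTYPE_t[:, ::1] results = np.zeros(shape=(10, 10), dtype=np.float64)

# memoryview on both sides, so Cython generates the copy loop:
cdef DTYPE_t[::1] r = np.random.rand(10)
results[:, 0] = r

# check it worked:
print(np.asarray(results)[:, 0])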

Polymorphism with cython extension types

I have a cython extension type that I want to make more general. One of the attributes of this extension type is a double and I want it to be a memoryview (double[::1]) when needed.
Here is a simple example:
import numpy as np
cimport numpy as np
cimport cython

cdef class Test:
    cdef bint numeric
    cdef double du

    def __init__(self, bint numeric):
        self.numeric = numeric
        if self.numeric:
            self.du = 1
        else:
            self.du = np.ones(10)

    def disp(self):
        print(self.du)

Test(True).disp()   # returns 1
Test(False).disp()  # gives of course an error
I tried to subclass Test, changing du's type to double[::1] and implementing a new __init__, but it seems that we can't override the attributes of extension types. Even if it worked, it wouldn't be satisfactory, because I don't really want to have one extension type for each case.
The best would be that my extension type directly handle both cases (scalar du and memoryview du).
Is there a way to do this with Cython ?
Unfortunately, you cannot use a fused_type as an attribute type. You have two options here:
You could try to store the memory address of the variable you want to use, and cast it when needed (everything is explained here). Unfortunately, I did not succeed in making it work with typed memoryviews. (A rough sketch of this idea for the scalar case is shown after the example below.)
Or you can use your defined attribute numeric to call the appropriate method:
import numpy as np
cimport numpy as np
cimport cython

cdef class Test:
    cdef bint numeric
    cdef double du_numeric
    cdef double[:] du_mem_view

    def __init__(self, bint numeric):
        self.numeric = numeric
        if self.numeric:
            self.du_numeric = 1
        else:
            self.du_mem_view = np.ones(10)

    def disp(self):
        if self.numeric:
            print(self.du_numeric)
        else:
            print(self.du_mem_view)

Test(True).disp()   # returns 1
Test(False).disp()  # does not give an error anymore!
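And here is the promised rough sketch of the first option, for the scalar case only (my own illustration; as noted above, I did not get this working with typed memoryviews):

cdef class TestPtr:
    cdef double du_val   # storage for the scalar case
    cdef void* du_ptr    # untyped pointer, cast back when needed

    def __init__(self, double value):
        self.du_val = value
        self.du_ptr = &self.du_val

    def disp(self):
        # cast the void* back to double* and dereference
        print((<double*>self.du_ptr)[0])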

How to call a cdef method

I'd like to call my cdef methods and improve the speed of my program as much as possible. I do not want to use cpdef (I explain why below). Ultimately, I'd like to access cdef methods (some of which return void) that are members of my Cython extensions.
I tried following this example, which gives me the impression that I can call a cdef function by making a Python (def) wrapper for it.
I can't reproduce these results, so I tried a different problem for myself (summing all the numbers from 0 to n).
Of course, I'm looking at the documentation, which says
The directive cpdef makes two versions of the method available; one fast for use from Cython and one slower for use from Python.
and later (emphasis mine),
This does slightly more than providing a python wrapper for a cdef method: unlike a cdef method, a cpdef method is fully overridable by methods and instance attributes in Python subclasses. It adds a little calling overhead compared to a cdef method.
So how does one use a cdef function without the extra calling overhead of a cpdef function?
With the code at the end of this question, I get the following results:
def/cdef:
273.04207632583245
def/cpdef:
304.4114626176919
cpdef/cdef:
0.8969507060538783
Somehow, cpdef is faster than cdef. For n < 100, I can occasionally get cpdef/cdef > 1, but it's rare. I think it has to do with wrapping the cdef function in a def function. This is what the example I link to does, but they claim better performance from using cdef than from using cpdef.
I'm pretty sure this is not how you wrap a cdef function while avoiding the additional overhead (the source of which is not clearly documented) of a cpdef.
And now, the code:
setup.py

from setuptools import setup, Extension
from Cython.Build import cythonize

pkg_name = "tmp"

compile_args = ['-std=c++17']

cy_foo = Extension(
    name=pkg_name + '.core.cy_foo',
    sources=[
        pkg_name + '/core/cy_foo.pyx',
    ],
    language='c++',
    extra_compile_args=compile_args,
)

setup(
    name=pkg_name,
    ext_modules=cythonize(cy_foo,
                          annotate=True,
                          build_dir='build'),
    packages=[
        pkg_name,
        pkg_name + '.core',
    ],
)
foo.py

def foo_def(n):
    sum = 0
    for i in range(n):
        sum += i
    return sum
cy_foo.pyx

def foo_cdef(n):
    return foo_cy(n)

cdef int foo_cy(int n):
    cdef int sum = 0
    cdef int i = 0
    for i in range(n):
        sum += i
    return sum

cpdef int foo_cpdef(int n):
    cdef int sum = 0
    cdef int i = 0
    for i in range(n):
        sum += i
    return sum
test.py

import timeit
from tmp.core.foo import foo_def
from tmp.core.cy_foo import foo_cdef
from tmp.core.cy_foo import foo_cpdef

n = 10000

# Python call
start_time = timeit.default_timer()
a = foo_def(n)
pyTime = timeit.default_timer() - start_time

# Call Python wrapper for C function
start_time = timeit.default_timer()
b = foo_cdef(n)
cTime = timeit.default_timer() - start_time

# Call cpdef function, which does more than wrap a cdef function (whatever that means)
start_time = timeit.default_timer()
c = foo_cpdef(n)
cpTime = timeit.default_timer() - start_time

print("def/cdef:")
print(pyTime/cTime)
print("def/cpdef:")
print(pyTime/cpTime)
print("cpdef/cdef:")
print(cpTime/cTime)
The reason for your seemingly anomalous result is that you aren't calling the cdef function foo_cy directly, but instead the def function foo_cdef wrapping it.
When you wrap it inside another def, you are indeed calling a Python function again. However, you should be able to reach results similar to the cpdef version.
Here is what you could do:
Like the Python def, give the types for both the input and the output:
def foo_cdef(int n):
    cdef int val = 0
    val = foo_cy(n)
    return val
This should perform similarly to cpdef, but again you are calling a Python function. If you want to call the C function directly, you should use ctypes and call it from there.
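For completeness, here is a minimal sketch (my own, not part of the original answer) of calling the cdef function from other Cython code, which incurs no Python call overhead at all; it assumes foo_cy is declared in a cy_foo.pxd file alongside the .pyx:

# cy_foo.pxd
cdef int foo_cy(int n)

# caller.pyx
from tmp.core.cy_foo cimport foo_cy

def call_many(int n, int repeats):
    cdef int i
    cdef int total = 0
    for i in range(repeats):
        total += foo_cy(n)   # direct C call into cy_foo, no Python dispatch
    return total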
As for the benchmarking: written this way, it only measures a single run, so the results can fluctuate a lot due to other OS tasks as well as timer resolution.
It is better to use the built-in timeit method to average over a number of iterations:
# Python call
pyTime = timeit.timeit('foo_def(n)',globals=globals(), number=10000)
# Call Python wrapper for C function
cTime = timeit.timeit('foo_cdef(n)',globals=globals(), number=10000)
# Call cpdef function, which does more than wrap a cdef function (whatever that means)
cpTime = timeit.timeit('foo_cpdef(n)',globals=globals(), number=10000)
output:
def/cdef:
154.0166154428522
def/cpdef:
154.22669848136132
cpdef/cdef:
0.9986378296327566
Like this, you get consistent results, and you also see that the ratio is always close to 1, whether Cython itself wraps the cdef function (cpdef) or we explicitly wrap it in a Python function.

Will cupy support cython (e.g. buffered indexing)?

I have implemented my own chainer Link, but it is too slow.
I have implemented a Cython CPU version of my code, but I want to boost the speed further via GPU. So I tested the following code, but it failed:
%%cython
import numpy as np
cimport numpy as np
import cupy as cp
cimport cupy as cp

cdef class A:
    def __init__(self):
        pass

    cdef cp_test(self, cp.ndarray[cp.float_t, ndim=2] arr):
        return cp.sum(arr)

a = A()
arr = cp.arange(100).reshape(20,50)
print(a.cp_test(arr))
reporting:
cdef cp_test(self, cp.ndarray[cp.float_t, ndim=2] arr):
^
------------------------------------------------------------
C:\Users\.ipython\cython\_cython_magic_d4940a274af88f0257c368b8a5d0e3f5.pyx:13:23: 'ndarray' is not a type identifier
Sorry, but CuPy does not currently provide a Cython interface (I am one of the CuPy developers).
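One workaround in the meantime (my suggestion, not part of the developer's answer) is simply to leave CuPy arrays untyped in Cython; the calls then go through Python, but the kernels themselves still run on the GPU:

%%cython
import cupy as cp   # Python-level import only; there is no cimport for CuPy

cdef class A:
    def cp_test(self, arr):   # arr stays an untyped Python object
        return cp.sum(arr)    # dispatches to CuPy's GPU kernel

a = A()
arr = cp.arange(100).reshape(20, 5)
print(a.cp_test(arr))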

Cython: dimensions is not a member of 'tagPyArrayObject'

I implemented pure Python code in an object-oriented style. Some of the methods contain time-intensive loops, which I hope to speed up by cythonizing the code.
I am using a lot of numpy arrays and struggle with converting classes into Cython extension types.
Here I declare two numpy arrays 'verteces' and 'norms' as attributes:
import numpy as np
cimport numpy as np

cdef class Geometry(object):
    cdef:
        np.ndarray verteces
        np.ndarray norms

    def __init__(self, config):
        """ Initialization """
        self.config = config
        self.verteces = np.empty([1, 3, 3], dtype=np.float32)
        self.norms = np.empty(3, dtype=np.float32)
The actual size of the arrays is only defined at runtime, when the Geometry.load() method of the same class is called. That method opens an STL file and loops over the triangle entries.
Finally I want to determine the intersection points of the triangles and a ray. In the respective method I use the following declarations.
cdef void hit(self, object photon):
    """ Ray-triangle intersection according to Moeller and Trumbore algorithm """
    cdef:
        np.ndarray[DTYPE_t, ndim=3] verteces = self.verteces  # nx3x3
        np.ndarray[DTYPE_t, ndim=2] norms = self.norms
        np.ndarray[DTYPE_t, ndim=1] ph_dir = photon.direction
        np.ndarray[DTYPE_t, ndim=1] ph_origin = photon.origin
        np.ndarray[DTYPE_t, ndim=1] v0, v1, v2, vec1, vec2, trsc, norm, v, p_inter
        float a, b, par, q, q0, q1, s0, s1
        int i_tri
When I try to compile this code I get the following error message:
'dimensions' is not a member of 'tagPyArrayObject'
I am not very familiar with Cython programming, but maybe the error is due to the fact that I have to initialize an array of fixed size in a C extension type? The size of the array is, however, unknown until the STL file is read.
Not sure if this is related to your problem, but I got the exact same error message when specifying the "NPY_1_7_API_VERSION" macro in my setup.py file.
extension_module = Extension(
    'yourfilename',
    sources=["yourfilename.pyx"],
    include_dirs=[numpy.get_include()],
    define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
)
With this macro, a simple npmatrix.shape[0] numpy access is compiled as:
/* "yourfilename.pyx":35
*
* cpdef int vcount(self):
* return self.npmatrix.shape[0]
*
*/
__pyx_r = (__pyx_v_self->npmatrix->dimensions[0]);
which causes the error, because with the deprecated API disabled, the dimensions member of PyArrayObject is no longer accessible. Just removing the macro resolved the error for me.
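If you would rather keep the macro, a possible alternative (a sketch on my part, assuming npmatrix holds a 2-D float64 array as in the snippet above) is to go through a typed memoryview, which uses the buffer protocol instead of the PyArrayObject internals:

cpdef int vcount(self):
    cdef double[:, :] m = self.npmatrix   # buffer access, no deprecated struct members
    return m.shape[0]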