Why could Cython compilation fail (at the C stage) while working fine in Jupyter?

Here's the code that works perfectly fine in Jupyter %%cython cell:
ctypedef fused fused_type:
    long
    double
cdef fused_checker(fused_type i):
    if fused_type is long:
        return True
    else:
        return False
def test():
    return fused_checker(0)
But when I put it into a .pyx file, compilation fails at the C stage with error C2065 (undeclared identifier).
Update:
The problem was the Cython.Compiler.Options.cimport_from_pyx = True setting I used in my setup.py. Without it, everything works fine. My fault for not noticing it earlier. So it's time to file an issue...

Related

Cython Extern C++ Function

I'm trying to expose an external C++ function to Cython. Here is my code (all files are in the same directory):
function.cpp
int cfunc(int x){
    return x;
}
wrapper.pyx
cdef extern from "function.cpp":
    cpdef int cfunc(int)
def pyfunc(int x):
    return cfunc(x)
setup.py
from distutils.core import setup, Extension
from Cython.Build import cythonize
source = ['function.cpp', 'wrapper.pyx']
ext = [Extension('lib', source, language='c++')]
setup(ext_modules=cythonize(ext))
When I run python setup.py build_ext --inplace it gives the following error
/home/hyunix/anaconda3/envs/c-playground/bin/../lib/gcc/x86_64-conda_cos6-linux-gnu/7.3.0/../../../../x86_64-conda_cos6-linux-gnu/bin/ld: build/temp.linux-x86_64-3.7/function.o: in function `cfunc(int)':
function.cpp:(.text._Z5cfunci+0x0): multiple definition of `cfunc(int)'; build/temp.linux-x86_64-3.7/wrapper.o:wrapper.cpp:(.text._Z5cfunci+0x0): first defined here
collect2: error: ld returned 1 exit status
error: command '/home/hyunix/anaconda3/envs/c-playground/bin/x86_64-conda_cos6-linux-gnu-c++' failed with exit status 1
However if I remove language='c++' from setup.py it works fine. Why does this happen?
I'm using:
Python 3.7.9
Cython 0.29.21
Ubuntu 20.04
Well, when you use cpdef int cfunc(int), you're explicitly creating a new C function and a new Python function. If you want to refer to cfunc() as it's externally defined in function.cpp, your declaration should be
cdef extern from "function.cpp":
    int cfunc(int)
So, when you compile with the language='c++' flag, Cython is giving you an appropriate error. However, when you remove the language flag, Cython needs to reason based on compiler directives whether you're asking for a .c or a .cpp, and it defaults to .c. You should notice that your wrapper is being compiled to .c instead of .cpp when the language argument is removed. In this C compilation, Cython does not recognize the signature in the .cpp, but it does recognize the cpdef. So, no error, but you're getting an empty cfunc function, as opposed to the one defined in cpp.

Cython set variable to named constant

I'm chasing my tail with what I suspect is a simple problem, but I can't seem to find any explanation for the observed behavior. Assume I have a constant in a C header file defined by:
#define FOOBAR 128
typedef uint32_t mytype_t;
I convert this in Cython by putting the following in the .pxd file:
cdef int _FOOBAR "FOOBAR"
ctypedef uint32_t mytype_t
In my .pyx file, I have a declaration:
FOOBAR = _FOOBAR
followed later in a class definition:
cdef class MyClass:
    cdef mytype_t myvar
    def __init__(self):
        try:
            self.myvar = FOOBAR
            print("GOOD")
        except:
            print("BAD")
I then try to execute this with a simple program:
try:
    foo = MyClass()
except:
    print("FAILED TO CREATE CLASS")
Sadly, this errors out, but I don't get an error message - I just get the exception print output:
BAD
Any suggestions on root cause would be greatly appreciated.
I believe I have finally tracked it down. The root cause is that FOOBAR in my code was actually set to UINT32MAX. Because the .pxd declares the constant as a (signed) int, Cython/Python interprets that value as -1, and the conversion then refuses to assign it to a uint32_t variable. The solution is to define FOOBAR to be 0xffffffff - apparently Python treats that as a non-negative value and accepts it.
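The sign flip described in this answer can be reproduced outside Cython. What follows is a minimal Python sketch, assuming the constant ends up in a signed 32-bit int. The helper to_uint32 is hypothetical: it only imitates the range check performed when a Python integer is converted to an unsigned C type, and it is not part of Cython's API.

```python
import ctypes

UINT32_MAX = 0xFFFFFFFF

# Reading the bit pattern 0xFFFFFFFF through a signed 32-bit type gives -1,
# which is roughly what a declaration such as `cdef int _FOOBAR` amounts to.
as_signed = ctypes.c_int32(UINT32_MAX).value


def to_uint32(value):
    """Hypothetical stand-in for the range check on conversion to uint32_t."""
    if not 0 <= value <= UINT32_MAX:
        raise OverflowError(f"cannot convert {value} to uint32_t")
    return value


def try_assign(value):
    # Catch a named exception and keep its message instead of a bare `except:`.
    try:
        return "GOOD", to_uint32(value)
    except OverflowError as exc:
        return "BAD", str(exc)


print(as_signed)
print(try_assign(as_signed))
print(try_assign(UINT32_MAX))
```

Catching a named exception and printing it, rather than using a bare except, would probably have surfaced the overflow message in the original code right away.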

Cython: Convert Python string list to 2D character array

I am trying to convert a list of python strings to a 2D character array, and then pass it into a C function.
Python version: 3.6.4, Cython version: 0.28.3, OS Ubuntu 16.04
My first try looks like this:
def my_function(name_list):
    cdef char name_array[50][30]
    for i in range(len(name_list)):
        name_array[i] = name_list[i]
The code builds, but during runtime I receive the following response:
Traceback (most recent call last):
  File "test.py", line 532, in test_my_function
    my_function(name_list)
  File "my_module.pyx", line 817, in my_module.my_function
  File "stringsource", line 93, in carray.from_py.__Pyx_carray_from_py_char
IndexError: not enough values found during array assignment, expected 25, got 2
I then tried to make sure that the string on the right-hand side of the assignment is exactly 30 characters by doing the following:
def my_function(name_list):
    cdef char name_array[50][30]
    for i in range(len(name_list)):
        name_array[i] = (name_list[i] + ' '*30)[:30]
This caused another error, as follows:
Traceback (most recent call last):
  File "test.py", line 532, in test_my_function
    my_function(name_list)
  File "my_module.pyx", line 818, in my_module.my_function
  File "stringsource", line 87, in carray.from_py.__Pyx_carray_from_py_char
TypeError: an integer is required
I would appreciate any help. Thanks.
I don't like this functionality of Cython; it seems not to have been thought through very well:
It is convenient to use a char array and thus avoid the hassle of allocating and freeing dynamically allocated memory. However, it is only natural that the allocated buffer is larger than the strings it is used for. Enforcing equal lengths doesn't make sense.
C strings are null-terminated. A trailing \0 is not always needed, but often it is, so some additional steps are required to ensure it.
Thus, I would roll my own solution:
%%cython
from libc.string cimport memcpy
cdef int from_str_to_chararray(source, char *dest, size_t N, bint ensure_nullterm) except -1:
    cdef size_t source_len = len(source)
    cdef bytes as_bytes = source.encode('ascii')  # hold a reference to the underlying bytes object
    cdef const char *as_ptr = <const char *>(as_bytes)
    if ensure_nullterm:
        source_len += 1
    if source_len > N:
        raise IndexError("destination array too small")
    memcpy(dest, as_ptr, source_len)
    return 0
and then use it as follows:
%%cython
def test(name):
    cdef char name_array[30]
    from_str_to_chararray(name, name_array, 30, 1)
    print("In array: ", name_array)
A quick test yields:
>>> test("A")
In array: A
>>> test("A"*29)
In array: AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>> test("A"*30)
IndexError: destination array too small
Some additional remarks on the implementation:
It is necessary to hold a reference to the underlying bytes object to keep it alive; otherwise as_ptr becomes dangling as soon as it is created.
The internal representation of bytes objects has a trailing \0, so memcpy(dest, as_ptr, source_len) is safe even if source_len=len(source)+1.
except -1 in the signature is needed so that the exception is actually propagated to, and checked in, Python code.
Obviously, not everything is perfect: one has to pass the size of the array manually, and this will lead to errors in the long run, something Cython's version gets right automatically. But given the functionality currently lacking in Cython's version, the hand-rolled version is the better option in my opinion.
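The same copy-with-bounds-check idea can be tried from plain Python with ctypes, which may be handy for experimenting before compiling anything. This is only a sketch under assumptions: copy_into_char_array is a hypothetical name, and ctypes.create_string_buffer stands in for the fixed-size char array.

```python
import ctypes


def copy_into_char_array(source, dest, ensure_nullterm=True):
    """Copy an ASCII string into a fixed-size ctypes char buffer."""
    as_bytes = source.encode("ascii")
    # Append the terminator explicitly so the size check covers it too.
    payload = as_bytes + (b"\0" if ensure_nullterm else b"")
    if len(payload) > len(dest):
        raise IndexError("destination array too small")
    ctypes.memmove(dest, payload, len(payload))
    return dest


name_array = ctypes.create_string_buffer(30)  # plays the role of char[30]
copy_into_char_array("A", name_array)
print("In array:", name_array.value)
```

Because the buffer carries its own length, the size does not have to be passed by hand here, which sidesteps the weakness mentioned above; a C array in Cython offers no such luxury.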
Thanks to @ead for responding. It got me to something that works. I am not convinced that it is the best way, but for now it is OK.
I addressed null termination, as @ead suggested, by appending null characters.
I received the error "TypeError: string argument without an encoding" and had to encode the string before converting it to a bytearray. That is what the added .encode('ASCII') bit is for.
Here is the working code:
def my_function(name_list):
    cdef char name_array[50][30]
    for i in range(len(name_list)):
        name_array[i] = bytearray((name_list[i] + '\0'*30)[:30].encode('ASCII'))
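The padding expression in this working code can be checked in plain Python. The sketch below uses two hypothetical helpers, pad_name and is_null_terminated, to show one edge case: a name of 30 or more characters fills the whole row and leaves no room for a terminating null.

```python
def pad_name(name, width=30):
    """Pad with NUL characters, then cut to exactly `width` bytes."""
    return (name + "\0" * width)[:width].encode("ascii")


def is_null_terminated(padded):
    """True if the fixed-width row still ends in a NUL byte."""
    return padded.endswith(b"\0")


for name in ["bob", "x" * 29, "x" * 30]:
    row = pad_name(name)
    print(len(name), len(row), is_null_terminated(row))
```

If the C function expects null-terminated strings, it may be worth rejecting or truncating names to 29 characters before the assignment.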

Cython: dimensions is not a member of 'tagPyArrayObject'

I implemented a pure Python code in object-oriented style. In some of the methods there are time intensive loops, which I hope to speed up by cythonizing the code.
I am using a lot of numpy arrays and struggle with converting classes into Cython extension types.
Here I declare two numpy arrays 'verteces' and 'norms' as attributes:
import numpy as np
cimport numpy as np
cdef class Geometry(object):
    cdef:
        np.ndarray verteces
        np.ndarray norms
    def __init__(self, config):
        """ Initialization"""
        self.config = config
        self.verteces = np.empty([1,3,3],dtype=np.float32)
        self.norms = np.empty(3,dtype=np.float32)
During runtime the actual size of the arrays will be defined. This happens when calling the Geometry.load() method of the same class. The method opens an STL-file and loops over the triangle entries.
Finally I want to determine the intersection points of the triangles and a ray. In the respective method I use the following declarations.
cdef void hit(self, object photon):
    """ Ray-triangle intersection according to Moeller and Trumbore algorithm """
    cdef:
        np.ndarray[DTYPE_t, ndim=3] verteces = self.verteces # nx3x3
        np.ndarray[DTYPE_t, ndim=2] norms = self.norms
        np.ndarray[DTYPE_t, ndim=1] ph_dir = photon.direction
        np.ndarray[DTYPE_t, ndim=1] ph_origin = photon.origin
        np.ndarray[DTYPE_t, ndim=1] v0, v1, v2, vec1, vec2, trsc, norm, v, p_inter
        float a, b, par, q, q0, q1, s0, s1
        int i_tri
When I try to compile this code I get the following error message:
'dimensions' is not a member of 'tagPyArrayObject'
I am not very familiar with Cython programming, but maybe the error is due to the fact that I have to initialize an array of fixed size in a C extension type? The size of the array is, however, unknown until the STL-file is read.
Not sure if this is related to your problem, but I got the identical error message when specifying the "NPY_1_7_API_VERSION" macro in my setup.py file.
extension_module = Extension(
    'yourfilename',
    sources=["yourfilename.pyx"],
    include_dirs=[numpy.get_include()],
    define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
)
With this macro, a simple npmatrix.shape[0] attribute access is compiled as:
/* "yourfilename.pyx":35
*
* cpdef int vcount(self):
* return self.npmatrix.shape[0]
*
*/
__pyx_r = (__pyx_v_self->npmatrix->dimensions[0]);
which causes the error. Just removing the macro resolved this error for me.

Using Cython extension module to wrap std::vector - How do I program __setitem__() method?

This seems like a question that should have an obvious answer, but for some reason I can't find any examples online.
I am wrapping a vector of C++ objects in a Python class using Cython. I also have a Cython wrapper for the C++ class already coded. I can get several methods such as __len__(), __getitem__(), and resize() to work properly, but the __setitem__() method is giving me problems.
For simplicity, I coded a small example using a vector of ints. I figure if I can get this code to work, then I can build on that to get the solution for my C++ class as well.
MyPyModule.pyx
# distutils: language = c++
from libcpp.vector cimport vector
from cython.operator cimport dereference as deref
cdef class MyArray:
    cdef vector[int]* thisptr
    def __cinit__(self):
        self.thisptr = new vector[int]()
    def __dealloc__(self):
        del self.thisptr
    def __len__(self):
        return self.thisptr.size()
    def __getitem__(self, size_t key):
        return self.thisptr.at(key)
    def resize(self, size_t newsize):
        self.thisptr.resize(newsize)
    def __setitem__(self, size_t key, int value):
        # Attempt 1:
        # self.thisptr.at(key) = value
        # Attempt 2:
        # cdef int* itemptr = &(self.thisptr.at(key))
        # itemptr[0] = value
        # Attempt 3:
        # (self.thisptr)[key] = value
        # Attempt 4:
        self[key] = value
When I tried to cythonize using Attempt 1, I got the error Cannot assign to or delete this. When I tried Attempt 2, the .cpp file was created, but the compiler complained that:
error: cannot convert ‘__Pyx_FakeReference<int>*’ to ‘int*’ in assignment
    __pyx_v_itemptr = (&__pyx_t_1);
On Attempt 3, Cython would not build the file because Cannot assign type 'int' to 'vector[int]'. (When I tried this style with the C++ object instead of int, it complained because I had a reference as a left-value.) Attempt 4 compiles, but when I try to use it, I get a segfault.
Cython docs say that returning a reference as a left-value is not supported, which is fine -- but how do I get around it so that I can assign a new value to one of my vector elements?
There are two ways to access the vector through a pointer:
def __setitem__(self, size_t key, int value):
    deref(self.thisptr)[key] = value
    # or
    # self.thisptr[0][key] = value
Cython translates those two cases as follows:
Python: deref(self.thisptr)[key] = value
C++: ((*__pyx_v_self->thisptr)[__pyx_v_key]) = __pyx_v_value;
Python: self.thisptr[0][key] = value
C++: ((__pyx_v_self->thisptr[0])[__pyx_v_key]) = __pyx_v_value;
which are equivalent, i.e. both access the same vector object.
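As for Attempt 4 in the question, a likely explanation of the segfault is that self[key] = value inside __setitem__ invokes __setitem__ again without end. A pure-Python stand-in shows the same pattern (the class names here are hypothetical). In Python the interpreter stops it with RecursionError, whereas a compiled extension type may simply exhaust the C stack.

```python
class RecursiveArray:
    def __init__(self):
        self.items = [0, 0, 0]

    def __setitem__(self, key, value):
        # Same shape as Attempt 4: this re-enters __setitem__ instead of
        # touching the underlying storage.
        self[key] = value


class FixedArray(RecursiveArray):
    def __setitem__(self, key, value):
        # Assign to the wrapped container, not to self.
        self.items[key] = value


def outcome(array):
    """Try one assignment and report what happened."""
    try:
        array[0] = 42
    except RecursionError:
        return "RecursionError"
    return array.items[0]


print(outcome(RecursiveArray()))
print(outcome(FixedArray()))
```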
Instead of trying to handle a pointer from Cython code, you can let Cython itself do it for you:
cdef class MyArray:
    cdef vector[int] thisptr
    def __len__(self):
        return self.thisptr.size()
    def __getitem__(self, size_t key):
        return self.thisptr[key]
    def __setitem__(self, size_t key, int value):
        self.thisptr[key] = value
    def resize(self, size_t newsize):
        self.thisptr.resize(newsize)
Is there any problem with this approach?
I have already accepted J.J. Hakala's answer (many thanks!). I tweaked that method to include an out-of-bounds check, since it uses the [] operator instead of the at() method:
cdef class MyArray:
    (....)
    def __setitem__(self, size_t key, int value):
        if key < self.thisptr.size():
            deref(self.thisptr)[key] = value
        else:
            raise IndexError("Index is out of range.")
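For unit-testing the wrapper, it can help to have a pure-Python reference with the same observable behaviour. The following is a sketch only: MyArraySketch is a hypothetical stand-in built on the standard array module, not the Cython class itself.

```python
from array import array


class MyArraySketch:
    """Pure-Python stand-in mirroring the bounds-checked wrapper."""

    def __init__(self):
        self._data = array("i")  # C ints, like vector[int]

    def __len__(self):
        return len(self._data)

    def resize(self, newsize):
        current = len(self._data)
        if newsize < current:
            del self._data[newsize:]
        else:
            # New elements are zero, matching vector<int>::resize.
            self._data.extend([0] * (newsize - current))

    def _check(self, key):
        # Negative keys are rejected here as well; the Cython version is
        # assumed to reject them earlier, while converting to size_t.
        if not 0 <= key < len(self._data):
            raise IndexError("Index is out of range.")

    def __getitem__(self, key):
        self._check(key)
        return self._data[key]

    def __setitem__(self, key, value):
        self._check(key)
        self._data[key] = value


arr = MyArraySketch()
arr.resize(3)
arr[1] = 7
print(len(arr), arr[1])
```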