Define array of strings in Cython - cython

Stuck on some basic Cython here - what's a canonical and efficient way to define an array of strings in Cython? Specifically, I want to define a fixed-length constant array of C strings (const char *). (Please note that I would prefer not to bring in NumPy at this point.)
In C this would be:
/* cletters.c */
#include <stdio.h>
int main(void)
{
    const char *headers[3] = {"to", "from", "sender"};
    int i;
    for (i = 0; i < 3; i++)
        printf("%s\n", headers[i]);
}
Attempt in Cython:
# cython: language_level=3
# letters.pyx
cpdef main():
    cdef const char *headers[3] = {"to", "from", "sender"}
    print(headers)
However, this gives:
(cy) $ python3 ./setup.py build_ext --inplace --quiet
cpdef main():
cdef const char *headers[3] = {"to", "from", "sender"}
^
------------------------------------------------------------
letters.pyx:5:32: Syntax error in C variable declaration

You need two lines:
%%cython
cpdef main():
    cdef const char *headers[3]
    headers[:] = ['to','from','sender']
    print(headers)
Somewhat counterintuitively, one assigns unicode strings (Python 3!) to char*. That is one of Cython's quirks. On the other hand, when initializing every element with a single value, a bytes object is needed:
%%cython
cpdef main():
    cdef const char *headers[3]
    headers[:] = b'init_value'  ## unicode-string 'init_value' doesn't work.
    print(headers)
Another alternative is the following one-liner:
%%cython
cpdef main():
    cdef const char **headers=['to','from','sender']
    print(headers[0], headers[1], headers[2])
which is not exactly the same as above and leads to the following C-code:
char const **__pyx_v_headers;
...
char const *__pyx_t_1[3];
...
__pyx_t_1[0] = ((char const *)"to");
__pyx_t_1[1] = ((char const *)"from");
__pyx_t_1[2] = ((char const *)"sender");
__pyx_v_headers = __pyx_t_1;
Here __pyx_v_headers is of type char **, and the downside is that print(headers) no longer works out of the box.
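On the Python side, the elements of such a char* array arrive as bytes objects by default, so printing them shows b'to' rather than 'to'. Below is a minimal sketch in plain Python (which is also valid Cython) of decoding them back to str. The names HEADERS and header_names are made up for illustration, and the tuple of bytes only stands in for what Cython would hand back from the C array.

```python
# Hypothetical stand-in for the values Cython returns from a const char*[3].
HEADERS = (b"to", b"from", b"sender")


def header_names(encoding="ascii"):
    """Decode each bytes header into a Python str."""
    return [h.decode(encoding) for h in HEADERS]


if __name__ == "__main__":
    print(header_names())
```

A tuple is used here because it is fixed-length and immutable, which is the closest plain-Python match to a constant array.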

For Python 3 Unicode strings, this is possible:
cdef Py_UNICODE* x[2]
x = ["hello", "worlᏪd"]
or
cdef Py_UNICODE** x
x = ["hello", "worlᏪd"]

Related

UnboundLocalError: local variable 'animal_signals' referenced before assignment

I have some Cython code where, if a variable equals a value from a list, values from another list are copied into a testing array.
double [:] signals
cdef int total_days=signals.shape[0]
cdef size_t epoch=0
cdef int total_animals
cdef int n
cdef double[:] animal_signals
for animal in range(total_animals):
    individual_animal = uniq_instr[animal]
    for element in range(total_days):
        if list(animal_ids[n]) == individual_animal:
            animal_signals.append(signals[n])
I am getting an error:
UnboundLocalError: local variable 'animal_signals' referenced before assignment
I thought that having the line
cdef double[:] animal_signals
would mean the array was assigned.
Update
As suggested I have also tried declaring the array animal_signals (and removing the append):
cdef int total_days=signals.shape[0]
cdef size_t epoch=0
cdef int total_animals
cdef int n
cdef int count=0
for animal in range(total_animals):
    count=0
    individual_animal = uniq_instr[animal]
    for element in range(total_days):
        if list(animal_ids[element]) == individual_animal:
            cdef double[:] animal_signals[count] = signals[n]
            count=count+1
however when I compile the code I get the error:
Error compiling Cython file:
------------------------------------------------------------
...
for element in range(total_days):
if list(animal_ids[element]) == individual_animal:
cdef double[:] animal_signals[count] = signals[n]
^
------------------------------------------------------------
project/temps.pyx:288:21: cdef statement not allowed here
Where am I going wrong?
Indeed, your line cdef double[:] animal_signals
declares animal_signals as a variable, but you never assign anything to it before using it (in Python, assignment is done with the = operator).
In Cython, the slice ([:]) notation in a variable declaration is usually used to get a memory view of another object (see the reference documentation).
For example:
import numpy as np
some_1d_numpy_array = np.zeros((10,10)).reshape(-1)
cdef double[:] animal_signals = some_1d_numpy_array
If you want to create a C array instead, you have to allocate the memory for it yourself (here, room for number entries of type double) and release it with free when you are done:
from libc.stdlib cimport malloc, free
cdef double *my_array = <double *> malloc(number * sizeof(double))
Also, regarding your original code, note that in both cases you won't be able to use the append method on this object, because it is not a Python list; you have to access its elements by index.
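Since a memory view has a fixed length and no append method, one common pattern is to allocate a buffer for the worst case up front and fill it by index while counting. Here is a minimal sketch in plain Python (also valid Cython), using the standard-library array module and the built-in memoryview as an analogy for a typed memoryview. The function name select_signals and its parameters are made up for illustration.

```python
from array import array


def select_signals(signals, animal_ids, wanted_id):
    """Copy the signals whose id equals wanted_id into a preallocated buffer."""
    # Allocate once, sized for the worst case (every entry matches).
    buffer = array("d", [0.0]) * len(signals)
    view = memoryview(buffer)  # fixed length, no append()
    count = 0
    for n in range(len(signals)):
        if animal_ids[n] == wanted_id:
            view[count] = signals[n]  # write by index instead of appending
            count += 1
    # Only the first `count` slots were filled.
    return view[:count].tolist()


if __name__ == "__main__":
    print(select_signals([1.0, 2.0, 3.0, 4.0], ["a", "b", "a", "c"], "a"))
```

The counter plays the role that append would play for a list: it records how many slots of the preallocated buffer are actually in use.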

Cython return tuple within cdef?

Hi, I am trying to convert some Python code into Cython in order to speed up its calculation. I am trying to return multiple arrays from a cdef function to a cpdef function. As in classical C, I could use either a pointer or a tuple. I decided to use a tuple because the size varies. I know the following code doesn't work; any help? Thank you!
import numpy as np
cimport numpy as np
cdef tuple funA(double[:] X, double[:] Y):
    cdef int nX, nY, i
    nX = len(X)
    nY = len(Y)
    for i in range(nX):
        X[i] = X[i]*X[i]
    for i in range(nY):
        Y[i] = Y[i]*Y[i]
    return X,Y
cpdef Run(double[:] X, double[:] Y)
    cdef Tuple1, Tuple2 = funA(X,Y)
    # Do some calculation with Tuple1 and Tuple2
    # Example
    cdef int i, nTuple1, nTuple2
    nTuple1 = len(Tuple1)
    for i in range(nTuple1):
        Tuple1[i] = Tuple1[i]**2
    nTuple2 = len(Tuple2)
    for i in range(nTuple2):
        Tuple2[i] = Tuple2[i]/2
    return Tuple1, Tuple2
You've got a few indentation errors and missing colons. But your real issue is:
cdef Tuple1, Tuple2 = funA(X,Y)
Remove the cdef and it's fine. It looks like cdef and tuple unpacking don't mix, and since you're treating the results as Python objects anyway, leaving them untyped should be OK.
However, note that you don't really need to return anything from funA, since it modifies X and Y in place.
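Both points can be seen in a minimal sketch in plain Python (also valid Cython): the tuple is unpacked with an ordinary assignment, and the returned objects are the very buffers that were passed in. The names square_in_place and run are made up for illustration, and array.array stands in for whatever buffer backs the memoryviews.

```python
from array import array


def square_in_place(x, y):
    """Square every element of x and y in place, then return both."""
    for i in range(len(x)):
        x[i] = x[i] * x[i]
    for i in range(len(y)):
        y[i] = y[i] * y[i]
    return x, y


def run(x, y):
    # Plain tuple unpacking, with no cdef in front of the targets.
    first, second = square_in_place(x, y)
    for i in range(len(first)):
        first[i] = first[i] ** 2
    for i in range(len(second)):
        second[i] = second[i] / 2
    return first, second


if __name__ == "__main__":
    x = array("d", [1.0, 2.0, 3.0])
    y = array("d", [2.0, 4.0])
    a, b = run(x, y)
    print(list(a), list(b), a is x)
```

Because the caller's buffers are modified directly, the return value only hands back references the caller already holds.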

Converting from python list to char** and back makes all elements the same in Cython

I have a Cython file called test.pyx with the following code:
from libc.stdlib cimport malloc, free
def test():
    x = ["a1", "a2", "a3"]
    cdef char** y = <char**> malloc(len(x) * sizeof(char*))
    for i in range(len(x)):
        item_uni = x[i].encode("UTF-8")
        y[i] = item_uni
    z = []
    for i in range(len(x)):
        item = y[i]
        z.append(item)
    print(z)
The function should seemingly print ["a1", "a2", "a3"]. However, it gives me three instances of "a3" instead:
>>> test()
[b'a3', b'a3', b'a3']
Why is this happening?
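This happens because temporary objects are involved. y[i] = item_uni stores a pointer into the buffer of the bytes object item_uni, not a copy of its contents. When item_uni is rebound on the next iteration, the previous bytes object is released and its memory is probably reused, so every y[i] ends up pointing to the same address. Since the last string is "a3", you get "a3" three times.
strdup should fix it, because it copies each string (each copy then has to be released with free, and so does y itself):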
from libc.string cimport strdup
...
y[i] = strdup(item_uni)
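An alternative to strdup is to keep a reference to every encoded bytes object for as long as the pointers are in use, so none of the buffers can be released early. Here is a minimal sketch of that idea in plain Python; ctypes is used only as a stand-in for the malloc'ed char** array, and the function name round_trip is made up for illustration.

```python
import ctypes


def round_trip(strings):
    """Store strings in a C-style char* array and read them back."""
    # Keep every encoded bytes object alive in a list, so each pointer
    # below refers to a distinct buffer that is still valid.
    encoded = [s.encode("UTF-8") for s in strings]
    # Stand-in for the malloc'ed char** array: one char* per string.
    c_array = (ctypes.c_char_p * len(encoded))(*encoded)
    return [c_array[i].decode("UTF-8") for i in range(len(encoded))]


if __name__ == "__main__":
    print(round_trip(["a1", "a2", "a3"]))
```

In the Cython version, the same effect would come from appending each item_uni to a Python list that outlives the use of y.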

Cython - Wrapping pointer to structure from C to python

I have a C function which takes a pointer to a struct, and I want to use it from Python through a Cython C extension. But when I try to pass a pointer to the struct from Python, I get an error: "Cannot convert Python object to 'Foo *'"
In the example below I create an object to call the C function, but what is passed to the C function is a NULL pointer.
My Trial:
hello.h
#include <stdio.h>
typedef struct
{
    int x;
} Foo;
int hello(Foo* what);
hello.c
#include "hello.h"
int hello(Foo* what)
{
    printf("Hello Wrapper\n");
    printf("what: %p\n", what);
    what->x = 5;
    return what->x;
}
phello.pxd
cdef extern from "hello.h":
    ctypedef struct Foo:
        int x
    cdef int hello(Foo* what)
phello.pyx
cimport phello
cdef class Foo_c:
    cdef phello.Foo* s
    def hello_fn(self):
        return phello.hello(self.s)
setup.py
from distutils.core import setup, Extension
from Cython.Distutils import build_ext
setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules=[ Extension("hellomodule",
                            sources=["phello.pyx", "hello.c"],
                            ) ]
)
test.py
import hellomodule
print "Hello test.py"
tobject = hellomodule.Foo_c()
print "Object:", tobject
tobject.hello_fn()
So I want to create a "Foo" struct in "test.py" and pass it to the "hello_fn()" function, which calls the C function "hello()" with this struct, so that I can read or write this structure from both sides, Python and C.
Can anyone help me with this, please?
Your code does not allocate memory for phello.Foo. Allocation can be done in __cinit__ with calloc (or malloc) and deallocation in __dealloc__ with free.
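cimport phello
from libc.stdlib cimport calloc, free
cdef class Foo_c:
    cdef phello.Foo* s
    def __cinit__(self, int n):
        # Allocate zero-initialized storage for one Foo
        self.s = <phello.Foo *>calloc(1, sizeof(phello.Foo))
        if self.s is NULL:
            raise MemoryError()
    def __dealloc__(self):
        # free(NULL) is a no-op, so this is safe even if allocation failed
        free(<void *>self.s)
    def __init__(self, int n):
        self.s.x = n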
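The same ownership idea (the wrapper owns the struct's storage, and the callee only receives a pointer to it) can be sketched in plain Python with ctypes. This is an analogy under stated assumptions, not the Cython solution itself: the Python function hello is a hypothetical stand-in for the C function, and the class name FooWrapper is made up for illustration.

```python
import ctypes


class Foo(ctypes.Structure):
    """Mirror of the C struct: typedef struct { int x; } Foo;"""
    _fields_ = [("x", ctypes.c_int)]


def hello(what):
    """Hypothetical stand-in for the C function int hello(Foo* what)."""
    if not what:  # a NULL ctypes pointer is falsy
        raise ValueError("NULL pointer passed to hello")
    what.contents.x = 5
    return what.contents.x


class FooWrapper:
    """Owns the storage for one Foo, like __cinit__/__dealloc__ would."""

    def __init__(self, n=0):
        self._s = Foo(n)  # allocated here, released with the wrapper

    @property
    def x(self):
        return self._s.x

    def hello_fn(self):
        # Pass a pointer to the storage this wrapper owns.
        return hello(ctypes.pointer(self._s))


if __name__ == "__main__":
    wrapper = FooWrapper(3)
    print(wrapper.x, wrapper.hello_fn(), wrapper.x)
```

The NULL check in hello mirrors what went wrong in the question: a pointer that was declared but never pointed at allocated storage.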

Why can't I pass a C array to a function which expects a memory view in a nogil context?

cdef double testB(double[:] x) nogil:
    return x[0]
def test():
    cdef double xx[2]
    with nogil:
        testB(xx)
        # compiler error: Operation not allowed without gil
With the GIL, it works fine.
Is it because passing in a C array creates a memory view, and that creation actually requires the GIL? So the memory view is not completely a C object?
Update
%%cython --annotate
cimport cython
cdef double testA(double[:] x) nogil:
    return x[0]
cpdef myf():
    cdef double pd[8]
    cdef double[:] x = pd
    testA(x)
cdef double[:] x = pd is compiled to:
__pyx_t_3 = __pyx_format_from_typeinfo(&__Pyx_TypeInfo_double);
__pyx_t_2 = Py_BuildValue((char*) "(" __PYX_BUILD_PY_SSIZE_T ")", ((Py_ssize_t)8));
if (unlikely(!__pyx_t_3 || !__pyx_t_2 || !PyBytes_AsString(__pyx_t_3))) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 8; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_GOTREF(__pyx_t_3);
__Pyx_GOTREF(__pyx_t_2);
__pyx_t_1 = __pyx_array_new(__pyx_t_2, sizeof(double), PyBytes_AS_STRING(__pyx_t_3), (char *) "fortran", (char *) __pyx_v_pd);
if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 8; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_GOTREF(__pyx_t_1);
__Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
__Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
__pyx_t_4 = __Pyx_PyObject_to_MemoryviewSlice_ds_double(((PyObject *)__pyx_t_1));
if (unlikely(!__pyx_t_4.memview)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 8; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_DECREF(((PyObject *)__pyx_t_1)); __pyx_t_1 = 0;
__pyx_v_x = __pyx_t_4;
__pyx_t_4.memview = NULL;
__pyx_t_4.data = NULL;
Note the call to __Pyx_PyObject_to_MemoryviewSlice_ds_double. So it seems that binding a memory view does require the GIL.
You should use a NumPy array, because your cdef double[:] declaration gets wrapped by a Python object, and its use is restricted without the GIL. You can see this by trying to slice a double[:]:
def test():
    cdef double[:] asd
    with nogil:
        asd[:1]
Your output will be:
with nogil:
asd[:1]
^
------------------------------------------------------------
prueba.pyx:16:11: Slicing Python object not allowed without gil
Using a NumPy array would compile; NumPy implements the Python buffer protocol and integrates smoothly with Cython (a Google Summer of Code project funded this work). So no wrapping conflict arises inside the function:
import numpy as np
cdef double testB(double[:] x) nogil:
    return x[0]
cpdef test():
    xx = np.zeros(2, dtype = 'double')
    with nogil:
        a = testB(xx)
    print(a)
This will build your module with test() in it. But it crashes, and in an ugly way (at least on my PC):
Process Python segmentation fault (core dumped)
To return to my (now deleted) previous answer: in my experience, when dealing with Cython memoryviews and C arrays, passing pointers works just as one would expect in C, and most wrapping is avoided (you pass exactly the addresses you want, which makes the wrapping unnecessary). This compiles and works as expected:
cdef double testB(double* x) nogil:
    return x[0]
def test():
    cdef double asd[2]
    asd[0] = 1
    asd[1] = 2
    with nogil:
        a = testB(asd)
    print(a)
And, after compiling:
In [5]: import prueba
In [6]: prueba.test()
1.0
Memoryviews are not, by themselves, Python objects, but they can be wrapped in one. I am not a proficient Cython programmer, so sometimes I get unexpected wrappings, or code that stays at the Python level when I expected it to be at the C level. Trial and error led me to the pointer strategy.