How to use shared_ptr and make_shared with arrays? - cython

I want to use a C++ shared_ptr as a replacement for raw C pointers. As a simple example the following code seems to work as intended:
from libcpp.memory cimport shared_ptr, allocator
cdef shared_ptr[double] spd
cdef allocator[double] allo
spd.reset(allo.allocate(13))
The size is chosen as 13 here, but in general is not know at compile time.
I'm not sure if this is correct, but I haven't had any errors (no memory leaks and segfaults yet). I'm curious if there is a more optimal solution with make_shared.
But I can't use C++11 arrays because Cython doesn't allow literals as templates, e.g. something like
cdef shared_ptr[array[double]] spd = make_shared[array[double,13]]()
and "normal" arrays which are supposed to work with C++20 compiler (e.g. gcc 10) are causing problems in one way or another:
# Cython error "Expected an identifier or literal"
cdef shared_ptr[double[]] spd = make_shared[double[]](3)
# Can't get ptr to work
ctypedef double[] darr
cdef shared_ptr[darr] spd = make_shared[darr](13)
cdef double* ptr = spd.get() # or spd.get()[0] or <double*> spd.get()[0] or ...
Is the allocator solution the correct and best one or is there another way how to do it?

Here is what I'm going with
cdef extern from *:
"""
template <typename T>
struct Ptr_deleter{
size_t nn;
void (*free_ptr)(T*, size_t);
Ptr_deleter(size_t nIn, void (*free_ptrIn)(T*, size_t)){
this->nn = nIn;
this->free_ptr = free_ptrIn;
};
void operator()(T* ptr){
free_ptr(ptr, nn);
};
};
template <typename T>
std::shared_ptr<T> ptr_to_sptr (T* ptr, size_t nn, void (*free_ptr)(T*, size_t)) {
Ptr_deleter dltr(nn, free_ptr);
std::shared_ptr<T> sp (ptr, dltr);
return sp;
};
"""
shared_ptr[double] d_ptr_to_sptr "ptr_to_sptr"(double* ptr, size_t nn, void (*free_ptr)(double*, size_t) nogil) nogil
cdef void free_d_ptr(double* ptr, size_t nn) nogil:
free(ptr)
cdef shared_ptr[double] sp_d_empty(size_t nn) nogil:
return d_ptr_to_sptr(<double*> nullCheckMalloc(nn*sizeof(double)), nn, &free_d_ptr)
My understanding is that the "right" way to handle malloced arrays is to use a custom deleter like I did. I personally prefer sticking with somewhat-raw C pointers (double* instead of double[] or something), since it's more natural in Cython and my projects.
I think it's reasonably easy to see how to change free_ptr for more complicated data types. For simple data types it could be done in less lines and less convoluted, but I wanted to have the same base.
I like my solution in the regard that I can just "wrap" existing Cython/C code raw pointers in a shared_ptr.
When working with C++ (especially newer standards like C++20) I think verbatim code is pretty often necessary. But I've intentionally defined free_d_ptr in Cython, so it's easy to use existing Cython code to handle the actual work done to free/clear/whatever the array.
I didn't get C++11 std::arrays to work, and it's apparently not "properly" possible in Cython in general (see Interfacing C++11 array with Cython).
I didn't get double[] or similar to work either (is possible in C++20), but with verbatim C++ code I think this should be doable in Cython. I prefer more C-like pointers/arrays anyway as I said.

Related

Cython compiling error: Saying that Array is a Struct

This is a minimally reproducible version of my Cython error. The code runs in C++.
The compiler is telling me error C2088 that "+= is illegal for struct". However, it is being passed an array.
The pyx file:
from libc.stdint cimport uint32_t as Card
from cpython cimport array
import array
cdef extern from "ace_eval.h":
void ACE_addcard(h, Card c)
def create_hand():
cdef array.array h = array.array('I',[0, 0, 0, 0, 0])
ACE_addcard(h, 257)
return h
The function imported from the header is:
#define ACE_addcard(h,c) h[c&7]+=c,h[3]|=c
I have also tried declaring my arrays using
cdef Card h[5]
array.array is a Python object that is ultimately compiled into a struct (so this is what C++ sees). Element access to it is controlled at a Python level by __getitem__ and __setitem__, which are compiled by Cython into C API function calls. When Cython sees code for an array being manipulated it'll generate the appropriate C API function calls. You code using C++ #define statements attempts to manipulate it at C++ compile time and prevents Cython from knowing what's going on.
Ideally you should be using "typed memoryviews" which give Cython quicker access to the array (but will still not work with the C++ #define since this is applied after Cython has processed the file):
cdef int[::1] h = array.array('I',[0, 0, 0, 0, 0]) # you may have to change the type long... I haven't tested it
h[257&7]+=257
h[3]|=257
If you absolutely insist on using macros instead then they need to take something with a C++ array interface. A pointer is probably the easiest option and can be got from:
cdef int* h_ptr = &h[0]
#DavidW 's second way of
cdef Card h[5]
h[:] = [0, 0, 0, 0, 0]
cdef Card* h_ptr = &h[0]
also worked once I also adjusted my cdef like so to accept the pointer. Note that the function in the #define macro is not changed and does not have return type specified.
cdef extern from "ace_eval.h":
void ACE_addcard(Card* h, Card c)
This allowed me to pass any of my arrays over flawlessly.
This is actually what it says in the docs, but it was a bit obtuse to me - hopefully my explanation helps someone else.
https://cython.readthedocs.io/en/latest/src/userguide/external_C_code.html
If the header file defines a function using a macro, declare it as though it were an ordinary function, with appropriate argument and result types.

operator= in Cython cppclass

How can I tell Cython that my C++ class has overloaded operator=? I tried:
cdef extern from "my_source.H":
cdef cppclass MyStatus:
void operator=(const char* status)
cdef public void setStatus(MyStatus& status):
status = "FOO"
but Cython either complains "Assignment to reference status" or (if I make status a non-reference) constructs a python object out of the string "FOO" and then tries to assign the python object to status.
The problem in your code is, that for Cython "FOO" is a Python-object. For expressions like
char *s = "FOO"
Cython is clever enough to understand, what you want and automatically interprets "FOO" as char *.
However, Cython doesn't really "understand"/interpret the signatures of wrapped c++-functions (for that it must be a c++-compiler) and thus cannot know, that you want "FOO" be a char *.
Thus you have to help Cython, for example:
status = <const char *>"FOO"
You also have to work around the problem with reference, for example via:
cdef public void setStatus(MyStatus *status):
status[0] = <const char *>"FOO"
or if you want to have keep the signature of the function intact:
cdef public void setStatus(MyStatus& status):
cdef MyStatus * as_ptr = &status
as_ptr[0] = <const char *>"FOO"
I'm not completely sure the problem with the assigment to reference isn't a bug.
Another observation: the assigment operators aren't part of the "official" wrap of the standard containers, see here or here.

How to change the default code generated by SWIG for the allocation of memory for a C structure?

I am using a flexible array in the structure. So I want to change the memory allocated for that structure with some of my own code. Basically I want to change the new_structname() and structname_variable_set() functions.
typedef struct vector{
int x;
char y;
int arr[0];
} vector;
here, SWIG generated new_vector() function to allocate memory by calling calloc(1,sizeof(struct vector)) where swig will not handle these type of structure in a special manner. So we need to modify the swig generated new_vector() in order to allocate memory for the flexible array. So is there any way to handle this?
There are a few ways you can do this. What you're looking for though is %extend. That lets us define new constructors and implement them as we see fit. (It even works with a C compiler, they're only constructors from the perspective of the target language).
Using your vector as a starting point we can illustrate this:
%module test
%include <stdint.i>
%inline %{
typedef struct vector{ int x; char y; int arr[0]; }vector;
%}
%extend vector {
vector(const size_t len) {
vector *v = calloc(1, sizeof *v + len);
v->x = len;
return v;
}
}
With this SWIG synthesises a new_vector function in the generated module code as you'd hoped.
I also assumed that you want to record the length inside the struct as one of its members. If that's not the case you can simply delete the assignment I made.

Cython how to make multidimensional string matrix?

Cython how to make multidimensional string matrix?any one knows?Thanks
I have below code but not work:
def make_matrix(size_t nrows, size_t ncols):
cdef char *mat = <char*>malloc(nrows * ncols * sizeof(char))
cdef char[:, ::1] mv = <char[:nrows, :ncols]>mat
cdef cnp.ndarray arr = np.asarray(mv)
return arr
Given you want an array nrows strings, each ncols long you can just do:
np.zeros((nrows,),dtype=('S',ncols))
This creates an empty numpy array of the format you want, and there's no need to invoke specialised Cython features.
The good reason not to attempt to do it in Cython using malloc is that the memory will never get freed (so you'll have a memory leak unless you free the memory yourself). It's very hard to know when you need to free it.
As an alternative (if you genuinely do need malloc for some reason) you could work with int8 instead, which is the same size as a char and should be interconvertable.

Cython overhead on extension types in memoryview

I am compiling a Cython module, and checked this piece of code with cython -a command.
cdef INT_t print_info(Charge[:] electrons):
cdef INT_t i, index
for i in range(electrons.shape[0]):
index = electrons[i].particleindex
return index
It turns out that
+ index = electrons[i].particleindex
__pyx_t_4 = __pyx_v_i;
__pyx_t_3 = (PyObject *) *((struct __pyx_obj_14particle_class_Charge * *) ( /* dim=0 */ (__pyx_v_electrons.data + __pyx_t_4 * __pyx_v_electrons.strides[0]) ));
__Pyx_INCREF((PyObject*)__pyx_t_3);
__pyx_t_5 = ((struct __pyx_obj_14particle_class_Charge *)__pyx_t_3)->particleindex;
__Pyx_DECREF(((PyObject *)__pyx_t_3)); __pyx_t_3 = 0;
__pyx_v_index = __pyx_t_5;
Charge is a cdef extension type and I am trying to use a memoryview buffer Charge[:] here. It seems that Cython calls some Python API in this case, in particular __Pyx_INCREF((PyObject*) and __Pyx_DECREF(((PyObject *) have been generated.
I am wondering what causes this, will it cause a lot of slowdown? It is my first post in the forum, any comments or suggestions are greatly appreciated!
PS: Charge object is defined as
charge.pyx
cdef class Charge:
def __cinit__(Charge self):
self.particleindex = 0
self.charge = 0
self.mass = 0
self.energy = 0
self.on_electrode = False
charge.pxd
cdef class Charge:
cdef INT_t particleindex
cdef FLOAT_t charge
cdef FLOAT_t mass
cdef FLOAT_t energy
cdef bint on_electrode
Cython will likely be happier with pythonic code. Rewrite your function:
cdef INT_t print_info(Charge[:] electrons):
cdef INT_t i, index
for electron in electrons:
index = electron.particleindex
return index
and try again.
It's not the memoryview, it's the extension type. Cython extension types are treated with the same reference-counting semantics as Python objects.
You can get and work with pointers to them which do not change recounts, either with <void*> or with <PyObject*> (which you can cimport from cpython.ref), but the pointers obviously don't have any methods or attributes. The minute you try to cast back to the extension type type, the INCREF/DECREF code reappears. Those instructions are pretty fast though.
There was some talk on the mailing list about how non-refcounted references (i.e., with access to object data) to extension types might be a new feature in the future, but there seemed to be little enthusiasm for adding a feature that, realistically, is probably going to lead to a lot of horrifyingly buggy code of the "access violation" variety.