strncat use in cython - cython

for learning purposes I am trying to use strncat in cython within a class.
I do:
from libc.stdlib cimport free
cdef extern from "<string.h>":
char *strncat(char *pto, const char **pfrom, size_t_size)
size_t strlen (const char *s)
cdef class test:
cdef:
char *palabra
#size_t len_palabra
def __cinit__(self, char *Palabra):
self.palabra = Palabra
cdef append_palabra(self, char *other_palabra):
strncat(self.palabra, &other_palabra, strlen(other_palabra))
print(other_palabra)
def manipulate_palabra(self, char *other_palabra):
self.append_palabra(other_palabra)
print(self.palabra)
def __dealloc__(self):
free(self.palabra)
a = test(b'first ')
a.manipulate_palabra(b'second')
The print statements throw:
b'second'
b'first H\xc3-\xcd\xa9\x7f'
How should I obtain b'first second' ?
I purposely bring up the whole class to hear critics and comments as I am almost illiterate in Cython.
Thank you!

Your issue was that you'd defined:
char *strncat(char *pto, const char **pfrom, size_t_size)
but the actual signature is
char *strncat(char *pto, const char *pfrom, size_t_size)
I suspect this should have at least generated a warning at the C compilation stage that it's worth taking seriously.
Cython does wrap most of the C standard library so the simplest solution would be
from libc.string cimport strncat, stolen
There's nothing magical about the Cython wrappings - they're provided for convenience but the don't do anything that you can't write yourself (at least for C, this isn't quite true for the C++ wrappings). However the good thing about them is that they are tested, so you can be reasonably confident that they're right.

Related

Why can I not use int parameter in Cython in Jupyterlab?

I am trying to use Cython in Jupyterlab:
%load_ext Cython
%%cython -a
def add (int x, int y):
cdef int result
result = x + y
return result
add(3,67)
error:
File "<ipython-input-1-e496e2665826>", line 9
def add (int x, int y):
^
SyntaxError: invalid syntax
What am I missing?
Update:
I just measured cpdef vs def and the difference between score was quite a low one (45(cpdef) vs 52(def), smaller = better/faster), so for your function it might not matter if called just a few times, but having that chew through a large amount of data might do some real difference.
If that's not applicable for you, just call that %load_ext in a separate cell, keep def and that should be enough.
(Cython 0.29.24, GCC 9.3.0, x86_64)
Use cpdef to make it C-like function, but also to expose it to Python, so you can call it from Jupyter (because Jupyter is using Python, unless specified by the %%cython magic func). Also, check the Early Binding for Speed section.
cpdef add (int x, int y):
cdef int result
result = x + y
return result
Also make sure to check Using the Jupyter notebook which explains that the % needs to be in a separate cell as ead mentioned in the comments.
add has to be defined with cdef, not def.
cdef add (int x, int y):
cdef int result
result = x + y
return result
add(3,67)

ctypes How to get address of NULL c_void_p field?

I need to get the address of a NULL void pointer. If I make a NULL c_void_p in Python I have no problem getting its address:
ptr = c_void_p(None)
print(ptr)
print(ptr.value)
print(addressof(ptr))
gives
c_void_p(None)
None
4676189120
But I have a
class Effect(structure):
_fields_ = [("ptr", c_void_p)]
where ptr gets initialized to NULL in C. When I access it in python
myclib.get_effect.restype = POINTER(Effect)
effect = myclib.get_effect().contents
print(effect.ptr)
gives None, so I can't take addressof(effect.ptr).
If I change my field type to a pointer to any ctype type
class Effect(structure):
_fields_ = [("ptr", POINTER(c_double)]
# get effect instance from C shared library
print(addressof(effect.ptr))
I have checked that I get the right address on the heap on the C side
140530973811664
Unfortunately, changing the field type from c_void_p is not an option. How can I do this?
Clarification
Here's C code following #CristiFati for my specific situation. struct is allocated in C, I get a ptr back to it in Python, and now I need to pass a reference to a ptr in the struct. First if I make the ptr a double, there's no problem!
#include <stdio.h>
#include <stdlib.h>
#define PRINT_MSG_2SX(ARG0, ARG1) printf("From C - [%s] (%d) - [%s]: ARG0: [%s], ARG1: 0x%016llX\n", __FILE__, __LINE__, __FUNCTION__, ARG0, (unsigned long long)ARG1)
typedef struct Effect {
double* ptr;
} Effect;
void print_ptraddress(double** ptraddress){
PRINT_MSG_2SX("Address of Pointer:", ptraddress);
}
Effect* get_effect(){
Effect* pEffect = malloc(sizeof(*pEffect));
pEffect->ptr = NULL;
print_ptraddress(&pEffect->ptr);
return pEffect;
}
And in Python
from ctypes import cdll, Structure, c_int, c_void_p, addressof, pointer, POINTER, c_double, byref
clibptr = cdll.LoadLibrary("libpointers.so")
class Effect(Structure):
_fields_ = [("ptr", POINTER(c_double))]
clibptr.get_effect.restype = POINTER(Effect)
pEffect = clibptr.get_effect()
effect = pEffect.contents
clibptr.print_ptraddress(byref(effect.ptr))
gives matching addresses:
From C - [pointers.c] (11) - [print_ptraddress]: ARG0: [Address of Pointer:], ARG1: 0x00007FC2E1AD3770
From C - [pointers.c] (11) - [print_ptraddress]: ARG0: [Address of Pointer:], ARG1: 0x00007FC2E1AD3770
But if I change the double* to void* and c_void_p, I get an error, because the c_void_p in python is set to None
ctypes ([Python 3]: ctypes - A foreign function library for Python) is meant to be able to "talk to" C from Python, which makes it Python friendly, and that means no pointers, memory addresses, ... whatsoever (well at least as possible, to be more precise).
So, under the hood, it does some "magic", which in this case stands between you and your goal.
#EDIT0: Updated the answer to better fit the (clarified) question.
Example:
>>> import ctypes
>>> s0 = ctypes.c_char_p(b"Some dummy text")
>>> s0, type(s0)
(c_char_p(2180506798080), <class 'ctypes.c_char_p'>)
>>> s0.value, "0x{:016X}".format(ctypes.addressof(s0))
(b'Some dummy text', '0x000001FBB021CF90')
>>>
>>> class Stru0(ctypes.Structure):
... _fields_ = [("s", ctypes.c_char_p)]
...
>>> stru0 = Stru0(s0)
>>> type(stru0)
<class '__main__.Stru0'>
>>> "0x{:016X}".format(ctypes.addressof(stru0))
'0x000001FBB050E310'
>>> stru0.s, type(stru0.s)
(b'Dummy text', <class 'bytes'>)
>>>
>>>
>>> b = b"Other dummy text"
>>> char_p = ctypes.POINTER(ctypes.c_char)
>>> s1 = ctypes.cast((ctypes.c_char * len(b))(*b), char_p)
>>> s1, type(s1)
(<ctypes.LP_c_char object at 0x000001FBB050E348>, <class 'ctypes.LP_c_char'>)
>>> s1.contents, "0x{:016X}".format(ctypes.addressof(s1))
(c_char(b'O'), '0x000001FBB050E390')
>>>
>>> class Stru1(ctypes.Structure):
... _fields_ = [("s", ctypes.POINTER(ctypes.c_char))]
...
>>> stru1 = Stru1(s1)
>>> type(stru1)
<class '__main__.Stru1'>
>>> "0x{:016X}".format(ctypes.addressof(stru1))
'0x000001FBB050E810'
>>> stru1.s, type(stru1.s)
(<ctypes.LP_c_char object at 0x000001FBB050E6C8>, <class 'ctypes.LP_c_char'>)
>>> "0x{:016X}".format(ctypes.addressof(stru1.s))
'0x000001FBB050E810'
This is a parallel between 2 types which in theory are the same thing:
ctypes.c_char_p: as you can see, s0 was automatically converted to bytes. This makes sense, since it's Python, and there's no need to work with pointers here; also it would be very annoying to have to convert each member from ctypes to plain Python (and viceversa), every time when working with it.
Current scenario is not part of the "happy flow", it's rather a corner case and there's no functionality for it (or at least I'm not aware of any)
ctypes.POINTER(ctypes.c_char) (named it char_p): This is closer to C, and offers the functionality you needed, but as seen it's also much harder (from Python perspective) to work with it
The problem is that ctypes.c_void_p is similar to #1., so there's no OOTB functionality for what you want, and also there's no ctypes.c_void to go with #2.. However, it is possible to do it, but additional work is required.
The well known (C) rule is: AddressOf(Structure.Member) = AddressOf(Structure) + OffsetOf(Structure, Member) (beware of memory alignment who can "play dirty tricks on your mind").
For this particular case, things couldn't be simpler. Here's an example:
dll.c:
#include <stdio.h>
#include <stdlib.h>
#if defined(_WIN32)
# define DLL_EXPORT __declspec(dllexport)
#else
# define DLL_EXPORT
#endif
#define PRINT_MSG_2SX(ARG0, ARG1) printf("From C - [%s] (%d) - [%s]: ARG0: [%s], ARG1: 0x%016llX\n", __FILE__, __LINE__, __FUNCTION__, ARG0, (unsigned long long)ARG1)
static float f = 1.618033;
typedef struct Effect {
void *ptr;
} Effect;
DLL_EXPORT void test(Effect *pEffect, int null) {
PRINT_MSG_2SX("pEffect", pEffect);
PRINT_MSG_2SX("pEffect->ptr", pEffect->ptr);
PRINT_MSG_2SX("&pEffect->ptr", &pEffect->ptr);
pEffect->ptr = !null ? NULL : &f;
PRINT_MSG_2SX("new pEffect->ptr", pEffect->ptr);
}
code.py:
#!/usr/bin/env python3
import sys
from ctypes import CDLL, POINTER, \
Structure, \
c_int, c_void_p, \
addressof, pointer
DLL = "./dll.dll"
class Effect(Structure):
_fields_ = [("ptr", c_void_p)]
def hex64_str(item):
return "0x{:016X}".format(item)
def print_addr(ctypes_inst, inst_name, heading=""):
print("{:s}{:s} addr: {:s} (type: {:})".format(heading, "{:s}".format(inst_name) if inst_name else "", hex64_str(addressof(ctypes_inst)), type(ctypes_inst)))
def main():
dll_dll = CDLL(DLL)
test_func = dll_dll.test
test_func.argtypes = [POINTER(Effect), c_int]
effect = Effect()
print_addr(effect, "effect")
test_func(pointer(effect), 1)
print(effect.ptr, type(effect.ptr)) # Not helping, it's Python int for c_void_p
try:
print_addr(effect.ptr, "effect.ptr")
except:
print("effect.ptr: - wrong type")
print_addr(effect, "effect", "\nSecond time...\n ")
print("Python addrs (irrelevant): effect: {:s}, effect.ptr: {:s}".format(hex64_str(id(effect)), hex64_str(id(effect.ptr))))
if __name__ == "__main__":
print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
main()
Output:
(py35x64_test) e:\Work\Dev\StackOverflow\q053531795>call "c:\Install\x86\Microsoft\Visual Studio Community\2015\vc\vcvarsall.bat" x64
(py35x64_test) e:\Work\Dev\StackOverflow\q053531795>dir /b
code.py
dll.c
(py35x64_test) e:\Work\Dev\StackOverflow\q053531795>cl /nologo /DDLL /MD dll.c /link /NOLOGO /DLL /OUT:dll.dll
dll.c
Creating library dll.lib and object dll.exp
(py35x64_test) e:\Work\Dev\StackOverflow\q053531795>dir /b
code.py
dll.c
dll.dll
dll.exp
dll.lib
dll.obj
(py35x64_test) e:\Work\Dev\StackOverflow\q053531795>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" code.py
Python 3.5.4 (v3.5.4:3f56838, Aug 8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)] on win32
effect addr: 0x000001FB25B8CB10 (type: <class '__main__.Effect'>)
From C - [dll.c] (21) - [test]: ARG0: [pEffect], ARG1: 0x000001FB25B8CB10
From C - [dll.c] (22) - [test]: ARG0: [pEffect->ptr], ARG1: 0x0000000000000000
From C - [dll.c] (23) - [test]: ARG0: [&pEffect->ptr], ARG1: 0x000001FB25B8CB10
From C - [dll.c] (25) - [test]: ARG0: [new pEffect->ptr], ARG1: 0x00007FFFAFB13000
140736141012992 <class 'int'>
effect.ptr: - wrong type
Second time...
effect addr: 0x000001FB25B8CB10 (type: <class '__main__.Effect'>)
Python addrs (irrelevant): effect: 0x000001FB25B8CAC8, effect.ptr: 0x000001FB25BCC9F0
As seen, the address of effect is the same as the address of effect's ptr. But again, this is the simplest possible scenario. But, as explained a general solution, is preferred. However that's not possible, but it can be worked around:
Use the above formula and get the field offset using [SO]: Getting elements from ctype structure with introspection? (it's long, I had a hard time coming to the current solution - especially because of the 2 container types (Structure and Array) nesting possibilities; hopefully, it's bug free (or as close as possible) :) )
Modify the C interface to something like: Effect *get_effect(void **ptr), and store the address in the parameter
Modify the (Python) Effect structure, and instead of ctypes.c_void_p field have something that involves POINTER (e.g.: ("ptr", POINTER(c_ubyte))). The definition will differ from C, and semantically things are not OK, but at the end they're both pointers
Note: don't forget to have a function that destroys a pointer returned by get_effect (to avoid memory leaks)
So after raising this in the python bug tracker, Martin Panter and Eryk Sun provided a better solution.
There is indeed an undocumented offset attribute, which allows us to access the right location in memory without having to do any introspection. We can get back our pointer using
offset = type(Effect).ptr.offset
ptr = (c_void_p).from_buffer(effect, offset)
We can more elegantly wrap this into our class by using a private field and adding a property:
class Effect(Structure):
_fields_ = [("j", c_int),
("_ptr", c_void_p)]
#property
def ptr(self):
offset = type(self)._ptr.offset
return (c_void_p).from_buffer(self, offset)
I have added an integer field before our pointer so the offset isn't just zero. For completeness, here is the code above adapted with this solution showing that it works. In C:
#include <stdio.h>
#include <stdlib.h>
#define PRINT_MSG_2SX(ARG0, ARG1) printf("%s : 0x%016llX\n", ARG0, (unsigned long long)ARG1)
typedef struct Effect {
int j;
void* ptr;
} Effect;
void print_ptraddress(double** ptraddress){
PRINT_MSG_2SX("Address of Pointer:", ptraddress);
}
Effect* get_effect(){
Effect* pEffect = malloc(sizeof(*pEffect));
pEffect->ptr = NULL;
print_ptraddress(&pEffect->ptr);
return pEffect;
}
In Python (omitting the above Effect definition):
from ctypes import cdll, Structure, c_int, c_void_p, POINTER, byref
clibptr = cdll.LoadLibrary("libpointers.so")
clibptr.get_effect.restype = POINTER(Effect)
effect = clibptr.get_effect().contents
clibptr.print_ptraddress(byref(effect.ptr))
yields
Address of Pointer: : 0x00007F9EB248FB28
Address of Pointer: : 0x00007F9EB248FB28
Thanks again to everyone for quick suggestions. For more, see here:

Cython set variable to named constant

I'm chasing my tail with what I suspect is a simple problem, but I can't seem to find any explanation for the observed behavior. Assume I have a constant in a C header file defined by:
#define FOOBAR 128
typedef uint32_t mytype_t;
I convert this in Cython by putting the following in the .pxd file:
cdef int _FOOBAR "FOOBAR"
ctypedef uint32_t mytype_t
In my .pyx file, I have a declaration:
FOOBAR = _FOOBAR
followed later in a class definition:
cdef class MyClass:
cdef mytype_t myvar
def __init__(self):
try:
self.myvar = FOOBAR
print("GOOD")
except:
print("BAD")
I then try to execute this with a simple program:
try:
foo = MyClass()
except:
print("FAILED TO CREATE CLASS")
Sadly, this errors out, but I don't get an error message - I just get the exception print output:
BAD
Any suggestions on root cause would be greatly appreciated.
I believe I have finally tracked it down. The root cause issue is that FOOBAR in my code was actually set to UINT32MAX. Apparently, Cython/Python interprets that as a -1 and Python then rejects setting a uint32_t variable equal to it. The solution is to define FOOBAR to be 0xffffffff - apparently Python thinks that is a non-negative value and accepts it.

Inlining a cdef method from a cdef class from another cython package

I have cython a class which looks like this:
cdef class Cls:
cdef func1(self):
pass
If I use this class in another library, will I be able to inline func1 which is a class method? Or should I find a way around it (by creating a func that takes a Cls pointer as an arg, for example?
There are bad and good news: The inlining isn't possible from the other module, but you don't have to pay the full price of a Python-function-call.
What is inlining? It is done by the C-compiler: when the C-compiler knows the definition of a function it can decide to inline it. This has two advantages:
You don't have to pay the overhead of calling a function
It makes further optimizations possible.
See for example:
%%cython -a
ctypedef unsigned long long ull
cdef ull doit(ull a):
return a
def calc_sum_fun():
cdef ull res=0
cdef ull i
for i in range(1000000000):#10**9
res+=doit(i)
return res
>>> %timeit calc_sum_fun()
53.4 ns ± 1.4 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
How was it possible to do 10^9 additions in 53 nanoseconds? Because it was not done: The C-Compiler inlined the cdef doit() and was able to calculate the result of the loop during the compiler time. So during the run time the program simple returns the precomputed result.
It is pretty obvious from there, that C compiler will not be able to inline a function from another module, because the definition is concealed from it in another c-file/translation-unit. As example see:
#simple.pdx:
ctypedef unsigned long long ull
cdef ull doit(ull a)
#simple.pyx:
cdef ull doit(ull a):
return a
def doit_slow(a):
return a
and now accessing it from another cython module:
%%cython
cimport simple
ctypedef unsigned long long ull
def calc_sum_fun():
cdef ull res=0
cdef ull i
for i in range(10000000):#10**7
res+=doit(i)
return res
leads to the following timings:
>>> %timeit calc_sum_fun()
17.8 ms ± 208 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Because the inlining was not possible, the function really has to do the loop... However, it does it faster than a normal python-call, which we can do by replacing cdef doit() through def doit_slow():
%%cython
import simple #import, not cimport
ctypedef unsigned long long ull
def calc_sum_fun_slow():
cdef ull res=0
cdef ull i
for i in range(10000000):#10**7
res+=simple.doit_slow(i) #slow
return res
Python-call is about 50 times slower!
>>> %timeit calc_sum_fun_slow()
1.07 s ± 20.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
But you asked about class-methods and not global functions. For class-methods the inlining is not possible even in the same module:
%%cython
ctypedef unsigned long long ull
cdef class A:
cdef ull doit(self, ull a):
return a
def calc_sum_class():
cdef ull res=0
cdef ull i
cdef A a=A()
for i in range(10000000):#10**7
res+=a.doit(i)
return res
Leads to:
>>> %timeit calc_sum_class()
18.2 ms ± 264 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
which is basically the same as in the case, where the cdef class is defined in another module.
The reason for this behavior is the way a cdef-class is build. It is a lot unlike virtual classes in C++ - the class definition has something similar to a virtual table called __pyx_vtab:
struct __pyx_obj_12simple_class_A {
PyObject_HEAD
struct __pyx_vtabstruct_12simple_class_A *__pyx_vtab;
};
where the pointer to cdef doit() is saved:
struct __pyx_vtabstruct_12simple_class_A {
__pyx_t_12simple_class_ull (*doit)(struct __pyx_obj_12simple_class_A *, __pyx_t_12simple_class_ull);
};
When we call a.doit() we don't call the function directly but via this pointer:
((struct __pyx_vtabstruct_12simple_class_A *)__pyx_v_a->__pyx_vtab)->doit(__pyx_v_a, __pyx_v_i);
which explains why the C-compiler cannot inline the function doit().

"Storing unsafe C derivative of temporary Python reference" when trying to access struct pointer

I want to use a library that gives me a dynamic array. The dynamic array struct has a property void* _heap_ptr which gives the start of the array.
After having built the list, I want to access this pointer in cython (to make a copy of the array). But I cannot seem to get the pointer element from the struct.
Here is my pyx:
cimport src.clist as l
def main():
cdef l.ptr_list basic_list
cdef int i = 42
basic_list = l.create_list_size(sizeof(i), 100)
l.list_add_ptr(basic_list, &i)
cdef int* arr;
arr = basic_list._heap_ptr
for i in range(1):
print(arr[i])
This is the error message:
Error compiling Cython file:
------------------------------------------------------------
...
l.list_add_ptr(basic_list, &i)
cdef int* arr;
arr = basic_list._heap_ptr
^
------------------------------------------------------------
src/test.pyx:14:20: Cannot convert Python object to 'int *'
Error compiling Cython file:
------------------------------------------------------------
...
l.list_add_ptr(basic_list, &i)
cdef int* arr;
arr = basic_list._heap_ptr
^
------------------------------------------------------------
src/test.pyx:14:20: Storing unsafe C derivative of temporary Python reference
And my pxd:
cdef extern from "src/list.h":
ctypedef struct _list:
void* _heap_ptr
ctypedef struct ptr_list:
pass
ptr_list create_list_size(size_t size, int length)
list_destroy(ptr_list this_list)
void* list_at_ptr(ptr_list this_list, int index)
list_add_ptr(ptr_list this_list, void* value)
How can I fix my code? Why is this happening? From my investigations that error message pops up if you have forgotten to declare something as C (ie. use malloc not libc.stdlib.malloc, but I cannot see that anything similar is happening here.)
There are two issues in your code.
First: struct ptr_list has no members and thus no member _heap_ptr. It probably should have been
ctypedef struct ptr_list:
void* _heap_ptr
Cython's error message is not really helpful here, but as you said it pops up usually when a C-declaration is forgotten.
Second: you need to cast from void * to int * explicitly:
arr = <int*>basic_list._heap_ptr