@cython.wraparound(False) cast integer CORE GENERATED Error - cython

In Cython, when my code is compiled with
@cython.wraparound(True)
and I use the following function to cast a float to an integer
cdef DTYPE_t_I float_int(np.float_t val):
    return <DTYPE_t_I>val
it runs fine.
BUT
when I turn wraparound off with
@cython.wraparound(False)
the code compiles normally, but when it runs it gives the following error:
CORE GENERATED
It happens when compiling on Linux with gcc and on Windows with MGS.
What is wrong? Should it behave like this?
Because I am trying to gain speed, I would like to be able to switch this flag off.

Related

How to use C complex numbers in 'language=c++' mode?

Most of my library is written with Cython in the "normal" C mode. Up to now I rarely needed any C++ functionality, but always assumed (and sometimes did!) I could just switch to C++-mode for one module if I wanted to.
So I have like 10+ modules in C-mode and 1 module in C++-mode.
The problem is how Cython seems to handle complex number definitions. In C mode it assumes (I think) C complex numbers, and in C++ mode it assumes C++ complex numbers. I've read they might even be the same by now, but in any case Cython complains that they are not:
openChargeState/utility/cheb.cpp:2895:35: error: cannot convert ‘__pyx_t_double_complex {aka std::complex<double>}’ to ‘__complex__ double’ for argument ‘1’ to ‘double cabs(__complex__ double)’
__pyx_t_5 = ((cabs(__pyx_v_num) == INFINITY) != 0);
In that case I'm trying to use cabs defined in a C-mode module, and calling it from the C++-mode module.
I know there are some obvious workarounds (right now I'm just not using C++-mode; I'd like to use vectors and instead use the slower Python lists for now).
Is there maybe a way to tell my C++-mode module to use C complex numbers, or tell it that they are the same? If there is I couldn't find a working way to ctypedef C complex numbers in my C++-mode module... Or are there any other solutions?
EDIT: Comments of DavidW and ead suggested some reasonable things. First the minimum working example.
setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
from Cython.Build import cythonize
extra_compile_args=['-O3']
compdir = {'language_level' : '3'}
extensions = cythonize([
        Extension("cmod", ["cmod.pyx"]),
        Extension("cppmod", ["cppmod.pyx"], language='c++')
    ],
    compiler_directives = compdir
)
setup(cmdclass = {'build_ext': build_ext},
      ext_modules = extensions
)
import cppmod
cmod.pyx
cdef double complex c_complex_fun(double complex xx):
    return xx**2
cmod.pxd
cdef double complex c_complex_fun(double complex xx)
cdef extern from "complex.h":
    double cabs(double complex zz) nogil
cppmod.pyx
cimport cmod
cdef double complex cpp_complex_fun(double complex xx):
    return cmod.c_complex_fun(xx)*abs(xx)   # cmod.cabs(xx) doesn't work here
print(cpp_complex_fun(5.5))
Then just compile with python3 setup.py build_ext --inplace.
Now the interesting part is that (as written in the code) only "indirectly" imported c functions have a problem, in my case cabs. So the suggestion to just use abs actually does help, but I still don't understand the underlying logic. I hope I don't encounter this in another problem. I'm leaving the question up for now. Maybe somebody knows what's happening.
Your problem has nothing to do with the fact that one module is compiled as a C extension and the other as a C++ extension; one can easily reproduce the issue in a C++ extension alone:
%%cython -+
cdef extern from "complex.h":
    double cabs(double complex zz) nogil
def cpp_complex_fun(double complex xx):
    return cabs(xx)
results in your error-message:
error: cannot convert __pyx_t_double_complex {aka
std::complex<double>} to __complex__ double for argument 1 to
double cabs(__complex__ double)
The problem is that complex numbers are ... well, complex. Cython's strategy for handling complex numbers is to use an available implementation from C/C++ and, if none is found, to fall back to a hand-written one:
#if !defined(CYTHON_CCOMPLEX)
  #if defined(__cplusplus)
    #define CYTHON_CCOMPLEX 1
  #elif defined(_Complex_I)
    #define CYTHON_CCOMPLEX 1
  #else
    #define CYTHON_CCOMPLEX 0
  #endif
#endif
....
#if CYTHON_CCOMPLEX
  #ifdef __cplusplus
    typedef ::std::complex< double > __pyx_t_double_complex;
  #else
    typedef double _Complex __pyx_t_double_complex;
  #endif
#else
    typedef struct { double real, imag; } __pyx_t_double_complex;
#endif
In the case of a C++ extension, Cython's double complex is translated to std::complex<double>, so it cannot be passed to cabs(double complex z), because std::complex<double> isn't double complex.
So actually it is your "fault": by declaring cabs with Cython's double complex in a C++ module, you told Cython that cabs has the signature double cabs(std::complex<double> z), but that was not enough to fool the C++ compiler.
That means that in a C++ module std::abs(std::complex<double>) could be used, or just Cython's/Python's abs, which is automatically translated to the right function (this is, however, not possible for all standard functions).
In the case of the C extension, you have included complex.h as a so-called "early include" with cdef extern from "complex.h", so for the above defines _Complex_I is defined, Cython's complex becomes an alias for double complex, and thus cabs can be used.
Probably the right thing for Cython would be to always use the fallback by default and let the user choose the desired implementation (double complex or std::complex<double>) explicitly.
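To make the relationship concrete: both C's cabs and C++'s std::abs for a complex argument compute the modulus sqrt(re^2 + im^2), which is also what Python's built-in abs returns for a complex value. A minimal pure-Python sketch follows; the helper name modulus is made up for illustration and is not part of Cython or of the code in the question.

```python
import math

def modulus(z: complex) -> float:
    # Hypothetical helper: the modulus that cabs / std::abs compute.
    return math.hypot(z.real, z.imag)

z = complex(3.0, 4.0)
print(modulus(z))  # 5.0
print(abs(z))      # 5.0
```

This is why replacing cabs(xx) with abs(xx) gives the same value in either mode: only the C-level function that Cython picks differs.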

Cython: declare a PyCapsule_Destructor in pyx file

I don't know Python and am trying to wrap an existing C library that provides 200 init functions for some objects and 200 destructors with the help of PyCapsule. So my idea is to return a PyCapsule from the init functions' wrappers and forget about the destructors, which should then be called automatically.
According to documentation PyCapsule_New() accepts:
typedef void (*PyCapsule_Destructor)(PyObject *);
while C-library has destructors in a form of:
int foo(void*);
I'm trying to generate a C function in .pyx file with help of cdef that would generate a C-function that will wrap library destructor, hide its return type and pass a pointer taken with PyCapsule_GetPointer to destructor. (pyx file is programmatically generated for 200 functions).
After a few experiments I end up with following .pyx file:
from cpython.ref cimport PyObject
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_IsValid, PyCapsule_GetPointer
cdef void stateFree( PyObject *capsule ):
    cdef:
        void * _state
    # some code with PyCapsule_GetPointer
def stateInit():
    cdef:
        void * _state
    return PyCapsule_New(_state, "T", stateFree)
And when I'm trying to compile it with cython I'm getting:
Cannot assign type 'void (PyObject *)' to 'PyCapsule_Destructor'
using PyCapsule_New(_state, "T", &stateFree) doesn't help.
Any idea what is wrong?
UPD:
Ok, I think I found a solution. At least it compiles. Will see if it works. I'll bold the places I think I made a mistake:
from cpython.ref cimport PyObject
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_IsValid, PyCapsule_GetPointer, PyCapsule_Destructor
cpdef void stateFree( object capsule ):
    cdef:
        void* _state
    _state = PyCapsule_GetPointer(capsule, "T")
    print('destroyed')
def stateInit():
    cdef:
        int _state = 1
    print ("initialized")
    return PyCapsule_New(_state, "T", < PyCapsule_Destructor >stateFree)
The issue is that Cython distinguishes between
object - a Python object that it knows about and handles the reference-counting for, and
PyObject*, which as far as it's concerned is a mystery type that it basically knows nothing about except that it's a pointer to a struct.
This is despite the fact that the C code generated for Cython's object ends up written in terms of PyObject*.
The signature used by the Cython cimport is ctypedef void (*PyCapsule_Destructor)(object o) (which isn't quite the same as the C definition). Therefore, define the destructor as
cdef void stateFree( object capsule ):
Practically in this case the distinction makes no difference. It matters more in cases where a function steals a reference or returns a borrowed reference. Here capsule has the same reference count on both the input and output of the function whether Cython manages it or not.
In terms of your edited-in solution:
cpdef is wrong for stateFree. Use cdef since it is not a function that should be exposed in a Python interface (and if you use cpdef it isn't obvious whether the Python or C version is passed as a function pointer).
You shouldn't need the cast to PyCapsule_Destructor and should avoid it because casts can easily hide bugs.
Can I just take a moment to express my general dislike for PyCapsule (it's occasionally useful for passing an opaque type through Python code without touching it, but for anything more I think it's usually better to wrap it properly in a cdef class). It's possible you've thought about it and it is the right tool for the job, but I'm putting this warning in to try to discourage people in the future who might be trying to use it on a more "copy-and-paste" basis.

Not allowed in a constant expression for module access

I have two cython files:
intern.pxd
cdef int test = 8
extern.pyx
cimport intern
cpdef enum test_enum:
    test = intern.test
If I try to compile this, it throws the following error:
Error compiling Cython file:
------------------------------------------------------------
...
cimport intern
cpdef enum test_enum:
    test = intern.test
                 ^
------------------------------------------------------------
side_tests\extern.pyx:4:17: Not allowed in a constant expression
I guess this is because the value of intern.test cannot be known at compile time. I would like a solution for this. It is not an option to copy the values of intern.pxd into extern.pyx, because in the real project intern.pxd contains around 2000 externally defined values/functions.
@DavidW pointed me to the working solution 'wrap in enum':
# In intern.pxd
cdef enum test_enum_intern:
    test = 8
This works, but feels 'weird'. If somebody has another solution, he is welcome to post it.

When and how does cython do boundscheck?

C doesn't do bounds checking. So how does Cython do the check if it compiles to C?
%%cython --annotate
cimport cython
@cython.boundscheck(True)
cpdef myf():
    cdef double pd[8]
    for i in range(100):
        pd[i] = 0
        print pd[i]
The above code compiles to the same C code no matter whether I set boundscheck to True or False. And if I run myf() there are no warnings (it happens to not crash...).
Update
So Cython doesn't do bounds checks on C arrays anyway.
http://docs.cython.org/src/reference/compilation.html#compiler-directives
"Cython is free to assume that indexing operations ([]-operator) in the code will not cause any IndexErrors to be raised. Lists, tuples, and strings are affected..."
I think in your code a C double array doesn't store its length anywhere, and so it's impossible for Cython to do any useful checks (except in your very trivial example). However, a built in Python type which can raise IndexErrors should be different (I'd assume numpy arrays, python arrays and cython memoryviews should also be affected since they all have a mechanism for Cython to tell if it's gone off the end).
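The point about needing a stored length can be illustrated with a rough pure-Python sketch of what a checked index access has to do. This is not Cython's generated code: checked_index is a hypothetical helper, and its two keyword arguments only mimic the boundscheck and wraparound directives.

```python
def checked_index(i: int, length: int, wraparound: bool = True,
                  boundscheck: bool = True) -> int:
    """Return the effective index, mimicking the two directives (illustration only)."""
    if wraparound and i < 0:
        i += length          # negative indices count from the end
    if boundscheck and not (0 <= i < length):
        raise IndexError("index out of bounds")
    return i                 # with both checks off, i is used as-is

print(checked_index(-1, 8))  # 7
try:
    checked_index(99, 8)
except IndexError as exc:
    print("IndexError:", exc)
```

Both branches need length at run time, which containers such as lists, tuples, and memoryviews can supply.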

CUDA FORTRAN: function gives different answer if I pass variable instead of number

I'm trying to use the ISHFT() function to bitshift some 32-bit integers in parallel, using CUDA FORTRAN.
The problem is that I get different answers to ISHFT(-4,-1) and ISHFT(var,-1) even though var = -4. This is the test code I've written:
module testshift
  integer :: test
  integer, device :: d_test
contains
  attributes(global) subroutine testshft ()
    integer :: var
    var = -4
    d_test = ISHFT(var,-1)
  end subroutine testshft
end module testshift
program foo
  use testshift
  integer :: i
  call testshft<<<1,1>>>()   ! carry out ishft on gpu
  test = d_test              ! copy device result to host
  i = ISHFT(-4,-1)           ! carry out ishft on cpu
  print *, i, test           ! print the results
end program foo
I then compile and execute:
pgf90 testishft.f90 -Mcuda
./a.out
2147483646 -2
Both should be 2147483646 if working correctly. I get the right answer if I replace var with 4.
How do I fix this problem?
Thanks for the help
When I remove the GPU-specific code from the above program I get 2147483646 2147483646 from the g95 compiler, as you expect. Have you tried running a "scalar" version of the program using the pgf90 compiler? If the scalar version works but the GPU version does not, that helps to isolate the problem. If the problem is pgf90/CUDA specific, perhaps the best place to ask your question is
PGI User Forum Forum Index -> Programming and Compiling
http://www.pgroup.com/userforum/viewforum.php?f=4 .
I've found a workaround, which is posted in this forum:
http://www.pgroup.com/userforum/viewtopic.php?t=2455&postdays=0&postorder=asc&start=15
Instead of using ISHFT I use IBITS, which is described here: http://gcc.gnu.org/onlinedocs/gfortran/IBITS.html
Also the problem has since been fixed in version 11.3 of the PGI compiler
http://www.pgroup.com/support/release_tprs_2011.htm
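For reference, the two results in the question correspond to a logical versus an arithmetic right shift of the 32-bit pattern of -4. The small Python sketch below shows both, plus a bit extraction in the style of IBITS. The helper names are hypothetical, Python integers are unbounded so the 32-bit width is emulated with a mask, and the arguments (position 1, length 31) are an assumption about how IBITS would replace ISHFT(var,-1), not a quote from the forum post.

```python
MASK32 = 0xFFFFFFFF

def logical_shift_right_32(value: int, shift: int) -> int:
    # Treat value as a 32-bit pattern and shift zeros in from the left.
    return (value & MASK32) >> shift

def arithmetic_shift_right(value: int, shift: int) -> int:
    # Python's >> on a negative int keeps the sign (arithmetic shift).
    return value >> shift

def extract_bits_32(value: int, pos: int, length: int) -> int:
    # Extract `length` bits starting at bit `pos`, right-justified.
    return ((value & MASK32) >> pos) & ((1 << length) - 1)

print(logical_shift_right_32(-4, 1))   # 2147483646
print(arithmetic_shift_right(-4, 1))   # -2
print(extract_bits_32(-4, 1, 31))      # 2147483646
```

The value 2147483646 is the zero-filling shift that ISHFT is specified to perform, while -2 is what a sign-preserving shift produces.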