Sympy's autowrap with cython and Matrix generates fatal error: 'numpy/arrayobject.h' file not found

I'm trying to execute the simple example from Sympy's autowrap module that includes a matrix/vector product with the Cython language (since I do not have gfortran installed):
import sympy.utilities.autowrap as aw
from sympy.utilities.autowrap import autowrap
from sympy import symbols, IndexedBase, Idx, Eq
A, x, y = map(IndexedBase, ['A', 'x', 'y'])
m, n = symbols('m n', integer=True)
i = Idx('i', m)
j = Idx('j', n)
instruction = Eq(y[i], A[i, j]*x[j])
matvec = autowrap(instruction, language='C',backend='cython')
I'm on OSX 10.9.4, with the anaconda distribution for python 2.7, sympy 0.7.6.1 and cython 0.23.2.
I get the following (known) error: fatal error: 'numpy/arrayobject.h' file not found
It seems to be a systematic error: the NumPy header directory has to be added to the include path in the setup file that drives the Cython compilation, as suggested here.
How do I get rid of this issue in an autowrap context?
It seems this is a bug that was fixed here, but it does not work for me... Is this bug fix included in sympy's release 0.7.6.1?
Any idea?

This was a bug and is now fixed. See this pull request:
https://github.com/sympy/sympy/pull/8848
If you use the development version of SymPy, it should work. Else you could have autowrap spit the files out to a temporary directory, add the correct include statement to the generated files, and manually compile the code.
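The workaround in the last sentence could be sketched roughly as follows. This is only a sketch under assumptions: `tempdir` is an `autowrap` keyword argument and `numpy.get_include()` is NumPy's way to locate its headers, but the helper names `wrap_keeping_sources` and `numpy_include_dir` and the directory prefix are made up for illustration, and the layout of the generated files depends on the SymPy version.

```python
import tempfile


def wrap_keeping_sources(expr, workdir=None):
    """Run autowrap with an explicit tempdir so the generated files are kept.

    Hypothetical helper, not part of SymPy. Returns (wrapped, workdir);
    wrapped is None when the build failed, in which case the generated
    sources are still in workdir and can be patched and compiled by hand.
    """
    # Imported lazily so that defining the helper does not require SymPy.
    from sympy.utilities.autowrap import autowrap, CodeWrapError

    workdir = workdir or tempfile.mkdtemp(prefix="autowrap_")
    try:
        wrapped = autowrap(expr, language="C", backend="cython", tempdir=workdir)
    except CodeWrapError:
        wrapped = None
    return wrapped, workdir


def numpy_include_dir():
    """Directory that contains numpy/arrayobject.h (for the include path)."""
    import numpy  # lazy import, same reason as above

    return numpy.get_include()
```

After a failed build, the directory returned by `numpy_include_dir()` is what needs to be added to the include path of the generated build script in `workdir` before compiling manually.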

Is cython compatible with typing.NamedTuple?

I have the following code in file temp.py
from typing import NamedTuple
class C(NamedTuple):
    a: int
    b: int
c = C(1, 2)
I compile it using the command:
cythonize -3 -i temp.py
and run it using the command
python3 -c 'import temp'
I get the following exception:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "temp.py", line 7, in init temp
    c = C(1, 2)
TypeError: __new__() takes 1 positional argument but 3 were given
Version of python: 3.6.15
Version of cython: 0.29.14
Is there anything wrong in the above code/build steps?
It'll work in the current Cython 3 alpha version (and later). It won't work in Cython 0.29.x (you're using a pretty outdated version of this, but that won't affect this feature).
It requires classes to have an __annotations__ dictionary, which is a feature that was added in the Cython 3 alpha releases.
You won't get much/any speed advantage from compiling this in Cython though - it'll still generate a normal Python class. But it will work.
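The dependence on `__annotations__` can be seen in plain, uncompiled Python. The sketch below is illustrative only: the class `Empty` is a made-up stand-in for what `typing.NamedTuple` ends up with when it sees no annotations, which is presumably the situation under Cython 0.29.x.

```python
from typing import NamedTuple


class C(NamedTuple):
    a: int
    b: int


class Empty(NamedTuple):
    # No annotated attributes, so typing.NamedTuple finds no fields.
    pass


# The fields are derived from the class annotations, in declaration order.
print(C._fields)      # ('a', 'b')
print(Empty._fields)  # ()

try:
    Empty(1, 2)
except TypeError as exc:
    # Same kind of failure as in the question: too many positional arguments.
    print("TypeError:", exc)
```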
In short, NO, it is not compatible. Edit: not currently compatible.
Named tuples are just Python magic (creating classes at runtime) that Cython doesn't know about, so you have to execute that code by calling the interpreter at runtime, using exec.
# temp.pyx
temp_global = {}
exec("""
from typing import NamedTuple
class C(NamedTuple):
    a: int
    b: int
""",temp_global)
C = temp_global['C']
c = C(1,2)
print(c)
To test it:
import pyximport
pyximport.install()
import temp
This ends up being plain Python code that is executed whenever you import your binary: the entire string is passed to exec at import time, so it's not really "Cython code". You can just write it as a Python .py file and avoid Cython, or implement your "Cython class" without relying on Python magic (no named tuples or other code that is created dynamically at runtime).

How to solve or suppress wall of warnings using any pystan code

When I run any pystan code, the output is what I expect, but I get a wall of warnings.
I've tried updating pystan and cython, as these are mentioned in the wall of warnings. My pystan is now version 2.17.1 and cython 0.29.2. I'm running python3.7.
import pystan
model_code = 'parameters {real y;} model {y ~ normal(0,1);}'
model = pystan.StanModel(model_code=model_code) # this will take a minute
y = model.sampling(n_jobs=1).extract()['y']
y.mean() # should be close to 0
The error message that I get starts with:
/home/femke/anaconda3/lib/python3.7/site-packages/Cython/Compiler/Main.py:367: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/tmp8_plkepg/stanfit4anon_model_5944b02c79788fa0db5b3a93728ca2bf_5335140894361802645.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/femke/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1823:0,
from /home/femke/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
from /home/femke/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from /tmp/tmp8_plkepg/stanfit4anon_model_5944b02c79788fa0db5b3a93728ca2bf_5335140894361802645.cpp:688:
/home/femke/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by " \
^~~~~~~
In file included from /home/femke/anaconda3/lib/python3.7/site-packages/pystan/stan/lib/stan_math/lib/boost_1.66.0/boost/numeric/ublas/matrix.hpp:19:0,
from /home/femke/anaconda3/lib/python3.7/site-packages/pystan/stan/lib/stan_math/lib/boost_1.66.0/boost/numeric/odeint/util/ublas_wrapper.hpp:24,
from /home/femke/anaconda3/lib/python3.7/site-packages
Is this something to worry about? If not, how do I specifically disable these warnings, but not those from other parts of my code? If so, what should I change?
Edit: after having read the question Cython Numpy warning about NPY_NO_DEPRECATED_API when using MemoryView, I still don't know how to safely disable this warning.

Reading binary files in Cython

I am attempting to read a binary file in Cython. Previously this was working in Python, but I am looking to speed up the process. This code below was written as a familiarisation and logic check before writing the complete module. Once this section is complete the code will be expanded to read in multiple 400 Mb files and process.
A function was created that opens the file, reads in a number of data points and returns them as an array.
from libc.stdlib cimport malloc, free
from libc.stdio cimport fopen, fclose, FILE, fscanf, fread
def readin_binary(filename, int number_of_points):
    """
    Test reading in a file and returning data
    """
    header_bytes = <unsigned char*>malloc(number_of_points)
    filename_byte_string = filename.encode("UTF-8")
    cdef FILE *in_binary_file
    in_binary_file = fopen(filename_byte_string, 'rb')
    if in_binary_file is NULL:
        print("file not found")
    else:
        print("Read file {}".format(filename))
    fread(&header_bytes, 1, number_of_points, in_binary_file)
    fclose(in_binary_file)
    return header_bytes
print(hDVS.readin_binary(filename, 10))
The code compiles.
When the code is run the following error occurs:
Python has stopped working error
I've been playing with this for a few days now. I think there is a simple error but I can not see it. Any ideas?

Cython undefined symbol with c wrapper

I am trying to expose C code to Cython and am running into "undefined symbol" errors when trying to use functions defined in my C file from another Cython module.
Functions defined in my header files and functions using a manual wrapper work without a problem.
Basically the same case as this question, but the solution (linking against the library) isn't satisfactory for me.
I assume I am missing something in the setup.py script?
Minimized example of my case:
foo.h
int source_func(void);
inline int header_func(void){
    return 1;
}
foo.c
#include "foo.h"
int source_func(void){
    return 2;
}
foo_wrapper.pxd
cdef extern from "foo.h":
    int source_func()
    int header_func()
cdef source_func_wrapper()
foo_wrapper.pyx
cdef source_func_wrapper():
    return source_func()
The Cython module I want to use the functions in:
test_lib.pyx
cimport foo_wrapper
def do_it():
    print "header func"
    print foo_wrapper.header_func() # ok
    print "source func wrapped"
    print foo_wrapper.source_func_wrapper() # ok
    print "source func"
    print foo_wrapper.source_func() # undefined symbol: source_func
setup.py builds both foo_wrapper and test_lib:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
# setup wrapper
setup(
    ext_modules = cythonize([
        Extension("foo_wrapper", ["foo_wrapper.pyx", "foo.c"])
    ])
)
# setup test module
setup(
    ext_modules = cythonize([
        Extension("test_lib", ["test_lib.pyx"])
    ])
)
There are 3 different types of function in foo_wrapper:
source_func_wrapper is a python-function and python run-time handles the calling of this function.
header_func is an inline-function which is used at compile time, so its definition/machine code is not needed later on.
source_func on the other hand must be handled by static (this is the case in foo_wrapper) or dynamic (I assume this is your wish for test_lib) linker.
Further down I'll try to explain why the setup does not work out of the box, but first I would like to introduce the two (at least in my opinion) best alternatives:
A: Avoid this problem altogether. Your foo_wrapper wraps C functions from foo.h, which means every other module should use these wrapper functions. If everyone can just access the functionality directly, the whole wrapper becomes kind of obsolete. Hide the foo.h interface in your pyx-file:
#foo_wrapper.pxd
cdef source_func_wrapper()
cdef header_func_wrapper()
#foo_wrapper.pyx
cdef extern from "foo.h":
    int source_func()
    int header_func()
cdef source_func_wrapper():
    return source_func()
cdef header_func_wrapper():
    return header_func()
B: It might be valid to want to use the foo functionality directly via the C functions. In this case we should use the same strategy as Cython does with the stdc++ library: foo.c should become a shared library and there should be only a foo.pxd file (no pyx!) which can be imported via cimport wherever needed. Additionally, libfoo.so should then be added as a dependency to both foo_wrapper and test_lib.
However, approach B means more hassle - you need to put libfoo.so somewhere the dynamic loader can find it...
Other alternatives:
As we will see, there are a lot of ways to get foo_wrapper+test_lib to work. First, let's see in more detail, how loading of dynamic libraries works in python.
We start out by taking a look at the test_lib.so at hand:
>>> nm test_lib.so --undefined
....
U PyXXXXX
U source_func
There are a lot of undefined symbols, most of which start with Py and will be provided by the python executable at runtime. But there is also our evildoer - source_func.
Now, we start python via
LD_DEBUG=libs,files,symbols python
and load our extension via import test_lib. In the triggered debug trace we can see the following:
>>>>: file=./test_lib.so [0]; dynamically loaded by python [0]
python loads test_lib.so via dlopen and starts to look-up/resolve the undefined symbols from test_lib.so:
>>>>: symbol=PyExc_RuntimeError; lookup in file=python [0]
>>>>: symbol=PyExc_TypeError; lookup in file=python [0]
These python symbols are found pretty quickly - they are all defined in the python executable, the first place the dynamic linker looks (if this executable was linked with -Wl,-export-dynamic). But it is different with source_func:
>>>>: symbol=source_func; lookup in file=python [0]
>>>>: symbol=source_func; lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
...
>>>>: symbol=source_func; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
>>>>: ./test_lib.so: error: symbol lookup error: undefined symbol: source_func (fatal)
So after looking through all loaded shared libraries the symbol is not found and the import is aborted. The fun fact is that foo_wrapper is not yet loaded, so source_func cannot be looked up there (it would be loaded in the next step as a dependency of test_lib by python).
What happens if we start python with preloaded foo_wrapper.so?
LD_DEBUG=libs,files,symbols LD_PRELOAD=$(pwd)/foo_wrapper.so python
This time, calling import test_lib succeeds, because the preloaded foo_wrapper is the first place the dynamic loader looks up the symbols (after the python executable):
>>>>: symbol=source_func; lookup in file=python [0]
>>>>: symbol=source_func; lookup in file=/home/ed/python_stuff/cython/two/foo_wrapper.so [0]
But how does it work when foo_wrapper.so is not preloaded? First let's add foo_wrapper.so as a library to our setup of test_lib:
ext_modules = cythonize([
    Extension("test_lib", ["test_lib.pyx"],
              libraries=[':foo_wrapper.so'],
              library_dirs=['.'],
    )])
This would lead to the following linker command:
gcc ... test_lib.o -L. -l:foo_wrapper.so -o test_lib.so
If we now look up the symbols, we see no difference:
>>> nm test_lib.so --undefined
....
U PyXXXXX
U source_func
source_func is still undefined! So what is the advantage of linking against the shared library? The difference is that foo_wrapper.so is now listed as needed for test_lib.so:
>>>> readelf -d test_lib.so| grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [foo_wrapper.so]
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
ld does not perform the actual linking here, that is the job of the dynamic linker, but it does a dry run and helps the dynamic linker by noting that foo_wrapper.so is needed in order to resolve the symbols, so it must be loaded before the search for the symbols starts. However, it does not explicitly say that the symbol source_func must be looked up in foo_wrapper.so - it could actually be found anywhere and used from there.
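To check the NEEDED entries programmatically, the `readelf -d` output can be parsed. A minimal sketch, assuming the output has the shape shown above; the function name `needed_libraries` is made up for illustration:

```python
import re

# Matches lines such as: 0x...01 (NEEDED) Shared library: [libc.so.6]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(?P<name>[^\]]+)\]")


def needed_libraries(readelf_output):
    """Return the library names from the NEEDED entries of `readelf -d` output."""
    return [match.group("name") for match in _NEEDED.finditer(readelf_output)]


# Sample text copied from the readelf listing above.
sample = """\
 0x0000000000000001 (NEEDED) Shared library: [foo_wrapper.so]
 0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
"""
print(needed_libraries(sample))
```

On a real build the input text would come from running `readelf -d test_lib.so`, for example through `subprocess.run(..., capture_output=True, text=True).stdout`.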
Let's start python again, this time without preloading:
>>>> LD_DEBUG=libs,files,symbols python
>>>> import test_lib
....
>>>> file=./test_lib.so [0]; dynamically loaded by python [0]....
>>>> file=foo_wrapper.so [0]; needed by ./test_lib.so [0]
>>>> find library=foo_wrapper.so [0]; searching
>>>> search cache=/etc/ld.so.cache
.....
>>>> foo_wrapper.so: cannot open shared object file: No such file or directory.
OK, now the dynamic linker knows it has to find foo_wrapper.so, but it is nowhere on the search path, so we get an error message.
We have to tell the dynamic linker where to look for the shared libraries. There are many ways; one of them is to set LD_LIBRARY_PATH:
LD_DEBUG=libs,symbols,files LD_LIBRARY_PATH=. python
>>>> import test_lib
....
>>>> find library=foo_wrapper.so [0]; searching
>>>> search path=./tls/x86_64:./tls:./x86_64:. (LD_LIBRARY_PATH)
>>>> ...
>>>> trying file=./foo_wrapper.so
>>>> file=foo_wrapper.so [0]; generating link map
This time foo_wrapper.so is found (dynamic loader looked at places hinted at by LD_LIBRARY_PATH), loaded and then used for resolving the undefined symbols in test_lib.so.
But what is the difference if the runtime_library_dirs setup argument is used?
ext_modules = cythonize([
    Extension("test_lib", ["test_lib.pyx"],
              libraries=[':foo_wrapper.so'],
              library_dirs=['.'],
              runtime_library_dirs=['.']
    )
])
and now calling
LD_DEBUG=libs,symbols,files python
>>>> import test_lib
....
>>>> file=foo_wrapper.so [0]; needed by ./test_lib.so [0]
>>>> find library=foo_wrapper.so [0]; searching
>>>> search path=./tls/x86_64:./tls:./x86_64:. (RPATH from file ./test_lib.so)
>>>> trying file=./foo_wrapper.so
>>>> file=foo_wrapper.so [0]; generating link map
foo_wrapper.so is found via a so-called RPATH even though LD_LIBRARY_PATH is not set. We can see this RPATH being inserted by the static linker:
>>>> readelf -d test_lib.so | grep RPATH
0x000000000000000f (RPATH) Library rpath: [.]
However, this path is relative to the current working directory, which is most of the time not what is wanted. One should pass an absolute path or use
ext_modules = cythonize([
    Extension("test_lib", ["test_lib.pyx"],
              libraries=[':foo_wrapper.so'],
              library_dirs=['.'],
              extra_link_args=["-Wl,-rpath=$ORIGIN/."] #rather than runtime_library_dirs
    )
])
to make the path relative to the current location (which can change, for example through copying/moving) of the resulting shared library. readelf now shows:
>>>> readelf -d test_lib.so | grep RPATH
0x000000000000000f (RPATH) Library rpath: [$ORIGIN/.]
which means the needed shared library will be searched for relative to the path of the loaded shared library, i.e. test_lib.so.
That is also how your setup should look if you would like to reuse the symbols from foo_wrapper.so, which I do not advocate.
There are, however, some possibilities to use the libraries you have already built.
Let's go back to the original setup. What happens if we first import foo_wrapper (as a kind of preload) and only then test_lib? I.e.:
>>>> import foo_wrapper
>>>> import test_lib
This doesn't work out of the box. But why? Obviously, the loaded symbols from foo_wrapper are not visible to other libraries. Python uses dlopen for dynamic loading of shared libraries, and as explained in this good article, there are several different strategies possible. We can use
>>>> import sys
>>>> sys.getdlopenflags()
2
to see which flags are set. 2 means RTLD_NOW, which means that the symbols are resolved directly upon loading of the shared library. We need to OR the flags with RTLD_GLOBAL=256 to make the symbols visible globally/outside of the dynamically loaded library.
>>> import sys; import ctypes;
>>> sys.setdlopenflags(sys.getdlopenflags()| ctypes.RTLD_GLOBAL)
>>> import foo_wrapper
>>> import test_lib
and it works, our debug trace shows:
>>> symbol=source_func; lookup in file=./foo_wrapper.so [0]
>>> file=./foo_wrapper.so [0]; needed by ./test_lib.so [0] (relocation dependency)
Another interesting detail: foo_wrapper.so is loaded only once, because python does not load a module twice via import foo_wrapper. But even if it were opened twice, it would be in memory only once (the second open only increases the reference count of the shared library).
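Since changing the dlopen flags affects every later import in the process, it may be worth limiting the change to the imports that need it. A minimal sketch, assuming a Unix platform where `sys.getdlopenflags` and `os.RTLD_GLOBAL` are available; the name `global_dlopen_flags` is made up for illustration:

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_dlopen_flags():
    """Temporarily add RTLD_GLOBAL to the flags Python passes to dlopen."""
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so later imports are unaffected.
        sys.setdlopenflags(old_flags)


with global_dlopen_flags():
    # Extension modules imported inside this block export their symbols
    # globally; in the setup discussed here that would be foo_wrapper.
    print(bool(sys.getdlopenflags() & os.RTLD_GLOBAL))
```

With this helper, `import foo_wrapper` would go inside the `with` block and `import test_lib` could follow afterwards with the default flags.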
But with the insight we have gained we could go even further:
>>>> import sys;
>>>> sys.setdlopenflags(1|256)#RTLD_LAZY+RTLD_GLOBAL
>>>> import test_lib
>>>> test_lib.do_it()
>>>> ... it works! ....
Why is that? RTLD_LAZY means that the symbols are not resolved directly upon loading but when they are used for the first time. But before the first usage (test_lib.do_it()), foo_wrapper is loaded (import inside of the test_lib module) and due to RTLD_GLOBAL its symbols can be used for resolving later on.
If we don't use RTLD_GLOBAL, the failure comes only when we call test_lib.do_it(), because the needed symbols from foo_wrapper are not seen globally in this case.
As to the question of why it is not such a great idea to just link both modules foo_wrapper and test_lib against foo.c: singletons, see this.

Why do I get these warnings under Cython?

I am trying to reproduce some examples from the Cython tutorial to learn Cython:
http://docs.cython.org/en/latest/src/tutorial/external.html
I think the two following warnings are not related. Therefore two questions:
(1)
Using this as input to
python setup.py build_ext --inplace -c mingw32
from libc.math cimport sin
cdef extern from "math.h":
    cdef double sin(double x)
cpdef double f(double x):
    return sin(x*x)
cpdef test(double x):
    return f(x)
I get:
D:\python\cython>python setup.py build_ext --inplace -c mingw32
Compiling primes.pyx because it changed.
[1/1] Cythonizing primes.pyx
warning: primes.pyx:4:19: Function signature does not match previous declaration
running build_ext
building 'primes' extension
C:\MinGW\bin\gcc.exe -mdll -O -Wall -IC:\Python34\include -IC:\Python34\include -c primes.c -o build\temp.win32-3.4\Release\primes.o
writing build\temp.win32-3.4\Release\primes.def
C:\MinGW\bin\gcc.exe -shared -s build\temp.win32-3.4\Release\primes.o build\temp.win32-3.4\Release\primes.def -LC:\Python34\libs -LC:\Python34\PCbuild -lpython34 -lmsvcr100 -o D:\python\cython\primes.pyd
D:\python\cython>
Why do I get the warning "Function signature does not match previous declaration"?
(2)
When I declare
cdef extern from "math.h":
    cpdef double sin(double x)
I get the additional warning
warning: primes.pyx:4:20: Function 'sin' previously declared as 'cpdef'
However, it is given exactly in the same way as example in the chapter "External declarations" of the linked page. In a python module where the module is imported, sin is not known under the package. Where is the problem?
The description in the tutorial is:
Note that you can easily export an external C function from your Cython module by declaring it as cpdef. This generates a Python wrapper for it and adds it to the module dict.
The different parts of the tutorial show different ways to call C functions.
For some functions for which a Cython .pxd header is provided, you can use from libc.math cimport sin. For all libraries, you can use the lengthier method of including the .h header and re-declaring the function.
You cannot mix the two, however, as that creates two declarations of the same function, even though they are identical.