I have a number of objects that I am trying to pickle which all share the same (large) cython memoryview as an attribute. Since the memoryviews are passed by reference, they all share the same memory and the implementation is memory efficient.
Now I need to pickle these objects and reload them while keeping the shared data shared (if the shared data becomes not shared then the file size blows up and it is impossible to read into memory). Normally I think pickle recognizes shared data and just pickles/unpickles it once, but because memory views can't be pickled directly, they need to be converted to a numpy array in the reduce method for each object and pickle no longer recognizes that the data is shared.
Is there some way that I can maintain the shared data through the pickle/unpickle process?
A MWE follows:
import numpy as np
import pickle

cdef class SharedMemory:
    cdef public double[:, :] data

    def __init__(self, data):
        self.data = data

    def duplicate(self):
        return SharedMemory(self.data)

    def __reduce__(self):
        return self.__class__, (np.asarray(self.data),)

def main():
    x = SharedMemory(np.random.randn(100, 100))
    duplicates = [x.duplicate() for _ in range(5)]
    cdef double* pointerx = &x.data[0, 0]
    cdef double* pointerd
    cdef double[:, :] ddata
    for d in duplicates:
        ddata = d.data
        pointerd = &ddata[0, 0]
        if pointerd != pointerx:
            print('Memory is not shared')
        else:
            print('Memory is shared')
    print('pickling')
    with open('./temp.pickle', 'wb') as pfile:
        pickle.dump(x, pfile, protocol=pickle.HIGHEST_PROTOCOL)
        for d in duplicates:
            pickle.dump(d, pfile, protocol=pickle.HIGHEST_PROTOCOL)
    with open('./temp.pickle', 'rb') as pfile:
        nx = pickle.load(pfile)
        nd = []
        for d in duplicates:
            nd.append(pickle.load(pfile))
    ddata = nx.data
    cdef double* pointernx = &ddata[0, 0]
    for d in nd:
        ddata = d.data
        pointerd = &ddata[0, 0]
        if pointerd != pointernx:
            print('Memory is not shared')
        else:
            print('Memory is shared')
Put the above in a file test.pyx and cythonize it with "cythonize -a -i test.pyx". Then "export PYTHONPATH="$PYTHONPATH":." and run
from test import main
main()
from python.
There are actually two problems:
First: Shared objects are still shared after dump/load only if they were pickled in one go (see also this answer).
That means you need to do the following (or similar):
...
with open('./temp.pickle', 'wb') as pfile:
    pickle.dump((x, duplicates), pfile, protocol=pickle.HIGHEST_PROTOCOL)
...
with open('./temp.pickle', 'rb') as pfile:
    nx, nd = pickle.load(pfile)
...
When you dump single objects, pickle isn't able to track identical objects - doing so would be an issue: objects with the same id between two dump-calls could be completely different objects or the same objects with different content!
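The effect of the per-call memo can be seen without Cython or NumPy. The following is a minimal sketch using only the standard library; the bytearray `buf` is just a stand-in for the large shared array, and the names `buf` and `stream` are illustrative, not taken from the question:

```python
import io
import pickle

buf = bytearray(b"\x00" * 1000)   # stand-in for the large shared array

# Two separate dump calls: each call starts with an empty memo,
# so the second dump cannot know that buf was already written.
stream = io.BytesIO()
pickle.dump(buf, stream)
pickle.dump(buf, stream)
stream.seek(0)
first = pickle.load(stream)
second = pickle.load(stream)
separate_shared = first is second

# One dump call for a container holding the same object twice:
# the second occurrence is stored as a reference to the first.
together = pickle.loads(pickle.dumps([buf, buf]))
together_shared = together[0] is together[1]

print("separate dumps share the object:", separate_shared)
print("single dump shares the object:  ", together_shared)
```

Only the single dump call restores one object that is referenced twice.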
Second: You should not create new objects in __reduce__, but pass the shared numpy object itself (pickle doesn't look inside a numpy array to see whether its buffer is shared; it only looks at the id of the array):
    def __reduce__(self):
        return self.__class__, (self.data.base,)
which will give you the desired result. data.base is a reference to the underlying original numpy array (or whatever type it is, which obviously must support pickling/unpickling).
Warning: As @DavidW has rightly pointed out, additional considerations must be taken into account when working with sliced memoryviews, because in this case base might not be "the same" as the actual memoryview.
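As a rough analogy in pure Python (an assumption made for illustration, not the Cython code itself): the built-in memoryview exposes its underlying object as `.obj`, much like `.base` on a Cython memoryview. Handing pickle that one object keeps the data shared, while building a new object per view does not. Note that `bytearray(v)` copies the data whereas `np.asarray` only creates a new wrapper; in both cases pickle sees distinct objects and writes the payload once per object:

```python
import pickle

buf = bytearray(b"\x01" * 1000)              # stand-in for the shared array
views = [memoryview(buf) for _ in range(3)]   # three views on the same buffer

# Analogue of passing data.base: every view hands pickle the same object.
by_base = pickle.dumps([v.obj for v in views])

# Analogue of building a new object per view: three distinct objects,
# so pickle writes the payload three times.
by_copy = pickle.dumps([bytearray(v) for v in views])

restored_base = pickle.loads(by_base)
restored_copy = pickle.loads(by_copy)

base_shared = all(b is restored_base[0] for b in restored_base)
copy_shared = all(b is restored_copy[0] for b in restored_copy)

print(len(by_base), base_shared)
print(len(by_copy), copy_shared)
```

The size difference between the two pickles grows with the number of objects that share the buffer, which matches the file-size blow-up described in the question.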
Actually I have a working version of what I need, but it is awfully slow:
%%cython -a
cimport numpy as cnp
import numpy as np # for np.empty
from time import time

cdef packed struct Bar:
    cnp.int64_t dt
    double open
    double high
    double low
    double close
    cnp.int64_t volume

bar_dtype = np.dtype([('dt', 'i8'), ('open', 'f8'), ('high', 'f8'), ('low', 'f8'), ('close', 'f8'), ('volume', 'i8')])

cpdef f(Bar bar):
    cdef cnp.ndarray bars_sar
    cdef Bar[:] buffer
    bars_sar = np.empty((1000000,), dtype=bar_dtype)
    start = time()
    for n in range(1000000):
        buffer = bars_sar
        buffer[0] = bar
    print (f'Elapsed: {time() - start}')
    return buffer

def test():
    sar = f(Bar(1,2,3,4,5,6))
    print(sar[0])
    return sar
This loop of 1M iterations takes about 3 seconds because of the line that acquires a memoryview from the numpy structured array:
buffer = bars_sar
The idea is that I need bars_sar untyped. Only when I want to store or read something from it, I want to reinterpret it as a particular type of memory view, and I don't see any problem why it cannot be done fast, but don't know how to do it. I hope, there is something similar to C reinterpret_cast in Cython.
I tried to declare bars_sar as void *, but I'm unable to store memory view address there like:
cdef Bar[:] view = np.empty((5,), dtype=bar_dtype)
bars_sar = <void*>view
or even
cdef Bar[:] view = np.empty((5,), dtype=bar_dtype)
bars_sar = <void*>&view
The first one results in error C2440: it cannot convert "__Pyx_memviewslice" to "void *".
The second one results in the error "Cannot take address of memoryview slice".
Please suggest
I'm trying to loop over a 3D array with a window. At each iteration, the window is moved 1 pixel and the variance for the (3D)window is calculated.
I'm trying to do this in Cython for performance reasons, in Jupyter notebook.
My (working, but slow) code in python looks approximately like this:
## PYTHON
#code adapted from https://stackoverflow.com/questions/36353262/i-need-a-fast-way-to-loop-through-pixels-of-an-image-stack-in-python
def Variance_Filter_3D_python(image, kernel = 30):
    min_var = 10000
    min_var_coord = [0,0,0]
    window = np.zeros(shape=(kernel,kernel,kernel), dtype = np.uint8)
    z,y,x = image.shape
    for i in np.arange(0,(z-kernel),1):
        for j in np.arange(0,(y-kernel),1):
            for k in np.arange(0,(x-kernel),1):
                window[:,:,:] = image[i:i+kernel,j:j+kernel,k:k+kernel]
                var = np.var(window)
                if var < min_var:
                    min_var = var
                    min_var_coord = [i,j,k]
    print(min_var_coord)
    return min_var,min_var_coord
When I try to declare the variables in the cython code:
%%cython
@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)
def Variance_Filter_3D(image, kernel = 30):
    cdef double min_var = 10000
    cdef list min_var_coord = [0,0,0]
    cdef unsigned int z,y,x = image.shape
    cdef np.ndarray[float, ndim=3] window = np.zeros(shape=(kernel,kernel,kernel),
                                                     dtype=FTYPE)
    ....etc
I get a error saying that "'np' is not declared" in the following line:
    cdef np.ndarray[float, ndim=3] window = np.zeros(shape=(kernel,kernel,kernel),
                                                     dtype=FTYPE)
and that cython isn't declared in these lines:
@cython.boundscheck(False)  # Deactivate bounds checking
@cython.wraparound(False)
However, I have used cimport previously:
%%cython
cimport numpy as np
cimport cython
What's going wrong?
You probably need to put the Numpy and Cython cimports in the exact notebook cell you need them in. Cython doesn't have much of a "global scope" in Jupyter: each %%cython cell is compiled separately.
However,
window[:,:,:] = image[i:i+kernel,j:j+kernel,k:k+kernel]
will work a lot better if:
- you set the type of image to be a memoryview. Slicing a memoryview is fairly quick, while viewing an arbitrary Python object as a memoryview is slower.
- you make the left-hand side window instead of window[:,:,:] (a view rather than a copy).
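The difference between the two forms of assignment can be sketched in plain Python with the built-in memoryview (a one-dimensional stand-in chosen for illustration; the Cython code in the question is three-dimensional and typed):

```python
image = memoryview(bytearray(range(10)))   # stand-in for one row of the image

# Copy: the slice contents are written into a preallocated window.
window_copy = memoryview(bytearray(3))
window_copy[:] = image[2:5]

# View: the name is simply rebound to a slice of the original buffer.
window_view = image[2:5]

image[2] = 99   # change the underlying data afterwards

print(list(window_copy))  # still holds the values captured at copy time
print(list(window_view))  # reflects the change, since no data was copied
```

Rebinding the name avoids moving kernel**3 elements on every iteration, which is why the view form is cheaper inside the triple loop.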
I'm trying to create a memoryview to store several vectors as rows, but when I try to change the values of any of them I get an error, as if it were expecting a scalar.
%%cython
import numpy as np
cimport numpy as np
DTYPE = np.float
ctypedef np.float_t DTYPE_t
cdef DTYPE_t[:, ::1] results = np.zeros(shape=(10, 10), dtype=DTYPE)
results[:, 0] = np.random.rand(10)
This throws the following error:
TypeError: only size-1 arrays can be converted to Python scalars
Which I don't understand given that I want to overwrite the first row with that vector... Any idea about what I am doing wrong?
The operation you would like to use is possible between numpy arrays (Python functionality) or between Cython's memoryviews (C functionality, i.e. Cython generates the right for-loops in the C code), but not when you mix a memoryview (on the left-hand side) and a numpy array (on the right-hand side).
So you have either to use Cython's memory-views:
...
cdef DTYPE_t[::1] r = np.random.rand(10)
results[:, 0] = r
#check it worked:
print(results.base)
...
or numpy-arrays (we know .base is a numpy-array):
results.base[:, 0] = np.random.rand(10)
#check it worked:
print(results.base)
Cython's version has less overhead, but for large matrices there won't be much difference.
It seems that one can't declare np.ndarray in cython.locals in .pxd files. It works with memoryviews but not with np.ndarray. However, there are cases where we need np.ndarray.
In notsupported.py
import numpy as np
def func():
arr = np.ones(2)
return arr**2
In notsupported.pxd
import cython
import numpy as np
cimport numpy as np
@cython.locals(arr=np.ndarray[np.int_t, ndim=1])
cpdef func()
Error log:
Error compiling Cython file:
------------------------------------------------------------
...
import cython
import numpy as np
cimport numpy as np
@cython.locals(arr=np.ndarray[np.int_t, ndim=1])
^
------------------------------------------------------------
notsupported.pxd:6:44: Expected ']', found '='
Is there something wrong with this code? What is the alternative?
Since it looks like this isn't supported I assume you're really interested in workarounds. For the purpose of this question I'm assuming you want your code to also be valid in pure Python. I'm also assuming that your code is of the form:
def func():
    arr = np.ones(2)
    for n in range(arr.shape[0]):
        arr[n] = # some operation element-by-element
    return arr**2
If your code doesn't have the element-by-element section then there's really no benefit to setting the type at all - I don't believe Cython uses the type for Numpy "whole array" operations like the power operator you show here.
My first choice would be to have two variables: arr and arr_view. arr should be untyped, and arr_view a memoryview. You only use the memoryview in the element-by-element section. Provided you stick to in-place operations the two share the same memory so modifying one modifies the other:
def func():
    arr = np.ones(2)
    arr_view = arr
    for n in range(arr_view.shape[0]):
        arr_view[n] = ...
    return arr**2
The pxd file is then:
@cython.locals(arr_view=np.int_t[:])
cpdef func()
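The "two names, one buffer" behaviour that this first option relies on can be sketched with the standard library alone. Here `array.array` stands in for the Numpy array and the built-in memoryview for the typed Cython memoryview; both substitutions are assumptions made so that the snippet runs without Numpy or Cython:

```python
from array import array

arr = array("d", [1.0, 1.0])     # stand-in for the untyped Numpy array
arr_view = memoryview(arr)       # stand-in for the typed memoryview

# In-place, element-by-element writes through the view ...
for n in range(arr_view.shape[0]):
    arr_view[n] = 2.0 * (n + 1)

# ... are visible through the original name, because no copy was made.
in_place_result = list(arr)

# Rebinding is not an in-place operation: arr now names a new object
# and arr_view still refers to the old buffer.
arr = array("d", [x ** 2 for x in arr])
arr_view[0] = -1.0
after_rebind = list(arr)

print(in_place_result)
print(after_rebind)
```

The second half shows why the answer insists on in-place operations: once arr is rebound to a new object, writes through arr_view no longer reach it.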
My second choice would be to type arr as a memoryview, and use np.asarray when you want to do "whole array" operations
def func():
    arr = np.ones(2)
    for n in range(arr.shape[0]):
        arr[n] = # some operation element-by-element
    return np.asarray(arr)**2
with pxd:
@cython.locals(arr=np.int_t[:])
cpdef func()
np.asarray is essentially a no-op if it's passed an array, and can usually avoid a copy if passed a memoryview, so it won't slow things down too much.
A third option is to use the arr.base object of a memoryview to get the underlying Numpy array. This loses pure Python compatibility though since arr.base is often None when arr is a Numpy array. Therefore I don't really recommend it here.
I implemented a pure Python code in object-oriented style. In some of the methods there are time intensive loops, which I hope to speed up by cythonizing the code.
I am using a lot of numpy arrays and struggle with converting classes into Cython extension types.
Here I declare two numpy arrays 'verteces' and 'norms' as attributes:
import numpy as np
cimport numpy as np

cdef class Geometry(object):
    cdef:
        np.ndarray verteces
        np.ndarray norms

    def __init__(self, config):
        """ Initialization"""
        self.config = config
        self.verteces = np.empty([1,3,3],dtype=np.float32)
        self.norms = np.empty(3,dtype=np.float32)
During runtime the actual size of the arrays will be defined. This happens when calling the Geometry.load() method of the same class. The method opens an STL-file and loops over the triangle entries.
Finally I want to determine the intersection points of the triangles and a ray. In the respective method I use the following declarations.
cdef void hit(self, object photon):
    """ Ray-triangle intersection according to Moeller and Trumbore algorithm """
    cdef:
        np.ndarray[DTYPE_t, ndim=3] verteces = self.verteces # nx3x3
        np.ndarray[DTYPE_t, ndim=2] norms = self.norms
        np.ndarray[DTYPE_t, ndim=1] ph_dir = photon.direction
        np.ndarray[DTYPE_t, ndim=1] ph_origin = photon.origin
        np.ndarray[DTYPE_t, ndim=1] v0, v1, v2, vec1, vec2, trsc, norm, v, p_inter
        float a, b, par, q, q0, q1, s0, s1
        int i_tri
When I try to compile this code I get the following error message:
'dimensions' is not a member of 'tagPyArrayObject'
I am not very familiar with Cython programming, but maybe the error is due to the fact that I have to initialize an array of fixed size in a C extension type? The size of the array is, however, unknown until the STL file is read.
Not sure if this is related to your problem, but I've got the same identical error message when specifying the "NPY_1_7_API_VERSION" macro in my setup.py file.
extension_module = Extension(
    'yourfilename',
    sources=["yourfilename.pyx"],
    include_dirs=[numpy.get_include()],
    define_macros=[("NPY_NO_DEPRECATED_API", "NPY_1_7_API_VERSION")],
)
With this macro, a simple npmatrix.shape[0] numpy function is compiled as:
/* "yourfilename.pyx":35
*
* cpdef int vcount(self):
* return self.npmatrix.shape[0]
*
*/
__pyx_r = (__pyx_v_self->npmatrix->dimensions[0]);
which causes the error. Just removing the macro resolved this error for me.