Why don't you get a type error when you pass a float instead of an int in Cython?

I have a cython function:
def test(int a, int b):
    return a+b
If I call it with:
test(0.5, 1)
I get the value 1.
Why doesn't it give a type error?

This is because float defines the special method __int__, which Cython calls along the way (more precisely via PyNumber_Long; at least this is my guess, because it is not easy to track the call through all the defines and ifdefs).
That is the deal: if your object defines __int__, Cython can use it as an integer. Relying on Cython for implicit type checking is not very robust.
If you want, you can check whether the input is an int object, as in the following example (for Python 3; for Python 2 it is a little more complex, because there are different int classes):
%%cython
from cpython cimport PyLong_Check
def print_me(i):
    if not PyLong_Check(i):
        print("Not an integer!")
    else:
        print(i)
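As a plain-Python analogy (not the C code Cython actually generates), the sketch below shows how the __int__ hook lets a float be silently truncated, and how an explicit check rejects it instead. The names lenient_add and strict_add are hypothetical and exist only for this illustration.

```python
def lenient_add(a, b):
    # int() goes through the __int__ hook, so 0.5 is silently truncated to 0.
    return int(a) + int(b)


def strict_add(a, b):
    # Reject anything that is not a real int (bool is excluded on purpose).
    for value in (a, b):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("expected int, got %s" % type(value).__name__)
    return a + b


print(lenient_add(0.5, 1))  # 1
print(strict_add(0, 1))     # 1
```

Calling strict_add(0.5, 1) raises a TypeError, which is the behaviour the question expected.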

Related

Is there a way to call a cdef class method that has non-pythonic arguments?

Effectively, I am cleaning up a Cython module that has many globally scoped functions and variables. I thought cdef classes would be a great way to package some of these functions, but I ran into an issue when trying to call some of the methods of these classes. I boiled it down to these two examples that show my issue. The functionality of the code is unimportant; I am just trying to demonstrate the problem I am facing.
cdef class Foo:
    def __init__(self):
        pass
    cdef int deref(self, int *bar):
        return bar[0]
cdef int bar = 5
foo = Foo()
foo.deref(&bar)
If I run this code I get an error
Cannot convert 'int *' to Python object
But if I define everything in global scope it works just fine:
cdef int deref(int *bar):
    return bar[0]
cdef int bar = 5
deref(&bar)
My question is: Is it possible to call a cdef method in this manner?
My thought was that it would work since it is all being done within cython, but for some reason cython wants to convert the pointer into a python object and then back into a cython object? Is this always the case? I thought cdef classes were an effective tool to use when using cython.
Anyways, I have exhausted my attempts at solving this issue myself and wanted to ask here before abandoning cdef classes and going back to functional programming.
Normally Cython would detect the cdef type of a variable, e.g. for
def doit():
    cdef int bar = 5
    foo = Foo()
    return foo.deref(&bar)
Cython would see that foo is of the cdef type Foo and treat it as such (the above code builds).
Not so for the global variables:
foo = Foo()
This is implicitly a def variable, which can be accessed (or set) via modulename.foo. That means Cython cannot be sure that foo is of type Foo; it could be set to an int, a float, or anything else.
Thus Cython must assume that foo is a pure Python-object with a (def-) method called deref which has Python arguments, and a pointer cannot be cast to a Python object automatically - which leads to the error you are observing.
Declaring foo as
cdef Foo foo=Foo()
hides the variable from pure Python, so it is no longer possible to access it via modulename.foo and to assign some arbitrary object to it. Thus it is possible for Cython to assume that foo is really of type Foo, which grants access to the cdef functions of the class Foo.
foo must be declared as a Foo object:
cdef Foo foo = Foo()
Then all is well.

Cython: declare a PyCapsule_Destructor in pyx file

I don't know Python and am trying to wrap an existing C library that provides 200 init functions for some objects and 200 destructors with the help of PyCapsule. So my idea is to return a PyCapsule from the init functions' wrappers and forget about the destructors, which should be called automatically.
According to documentation PyCapsule_New() accepts:
typedef void (*PyCapsule_Destructor)(PyObject *);
while C-library has destructors in a form of:
int foo(void*);
I'm trying to write a cdef function in the .pyx file that wraps the library destructor, hides its return type, and passes the pointer obtained with PyCapsule_GetPointer to the destructor. (The .pyx file is generated programmatically for the 200 functions.)
After a few experiments I ended up with the following .pyx file:
from cpython.ref cimport PyObject
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_IsValid, PyCapsule_GetPointer
cdef void stateFree( PyObject *capsule ):
    cdef:
        void * _state
    # some code with PyCapsule_GetPointer
def stateInit():
    cdef:
        void * _state
    return PyCapsule_New(_state, "T", stateFree)
When I try to compile it with Cython I get:
Cannot assign type 'void (PyObject *)' to 'PyCapsule_Destructor'
Using PyCapsule_New(_state, "T", &stateFree) doesn't help.
Any idea what is wrong?
UPD:
Ok, I think I found a solution. At least it compiles. Will see if it works. I'll bold the places I think I made a mistake:
from cpython.ref cimport PyObject
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_IsValid, PyCapsule_GetPointer, PyCapsule_Destructor
cpdef void stateFree( object capsule ):
    cdef:
        void* _state
    _state = PyCapsule_GetPointer(capsule, "T")
    print('destroyed')
def stateInit():
    cdef:
        int _state = 1
    print ("initialized")
    return PyCapsule_New(_state, "T", < PyCapsule_Destructor >stateFree)
The issue is that Cython distinguishes between
object - a Python object that it knows about and handles the reference counting for, and
PyObject*, which as far as it's concerned is a mystery type that it knows basically nothing about except that it's a pointer to a struct.
This is despite the fact that the C code generated for Cython's object ends up written in terms of PyObject*.
The signature used by the Cython cimport is ctypedef void (*PyCapsule_Destructor)(object o) (which isn't quite the same as the C definition). Therefore, define the destructor as
cdef void stateFree( object capsule ):
Practically in this case the distinction makes no difference. It matters more in cases where a function steals a reference or returns a borrowed reference. Here capsule has the same reference count on both the input and output of the function whether Cython manages it or not.
In terms of your edited-in solution:
cpdef is wrong for stateFree. Use cdef since it is not a function that should be exposed in a Python interface (and if you use cpdef it isn't obvious whether the Python or C version is passed as a function pointer).
You shouldn't need the cast to PyCapsule_Destructor and should avoid it because casts can easily hide bugs.
Can I just take a moment to express my general dislike for PyCapsule (it's occasionally useful for passing an opaque type through Python code without touching it, but for anything more I think it's usually better to wrap it properly in a cdef class). It's possible you've thought about it and it is the right tool for the job, but I'm putting this warning in to try to discourage people in the future who might be trying to use it on a more "copy-and-paste" basis.

Cython cannot declare None type

I have a .pyx file as follows:
cdef class Foo:
    def __cinit__(self, bar=None):
        self.bar = bar
and a .pxd as follows:
cdef class Foo:
    cdef public ??? bar
Instead of ??? I would like to declare the type to be either int or None.
I tried using a fused type with void or int, which does not work either.
Does anyone know how to make it compile? Using None or NULL?
The main purpose of Cython types is to define an efficient C representation (in space used or speed). int is understood by Cython to mean a C integer, and there is no C datatype which could be either a C integer or a Python None object.
You then have two options:
Make bar a Python object (cdef public object bar) and add some runtime type checks in the constructor to ensure that it is either an integer or None. You might want to use @property to also check it when it is written to. Since this uses Python objects you should not expect any Cython speedup.
Make bar an int (i.e. a C integer) and pick some special value which will represent an invalid value (so you need to pick a value you'll never need to represent). This will be efficient in terms of space and possibly speed.
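The two options can be sketched in plain Python as a rough illustration; this is not Cython code and carries none of the C-level storage benefits. The class names FooChecked and FooSentinel and the sentinel value NO_BAR = -1 are assumptions made up for this example.

```python
class FooChecked:
    """Option 1: bar is a Python object, validated at runtime."""

    def __init__(self, bar=None):
        self.bar = bar  # goes through the setter below

    @property
    def bar(self):
        return self._bar

    @bar.setter
    def bar(self, value):
        # Accept only None or an int; everything else is rejected.
        if value is not None and not isinstance(value, int):
            raise TypeError("bar must be an int or None")
        self._bar = value


NO_BAR = -1  # hypothetical sentinel: a value bar never legitimately takes


class FooSentinel:
    """Option 2: bar is always an integer; NO_BAR stands for 'not set'."""

    def __init__(self, bar=NO_BAR):
        self.bar = bar

    def has_bar(self):
        return self.bar != NO_BAR


print(FooChecked().bar)         # None
print(FooChecked(3).bar)        # 3
print(FooSentinel().has_bar())  # False
```

In a real cdef class the second variant would declare bar as a C int in the .pxd, which is where the space and speed savings come from.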

Should type information be provided for the 'self' argument in cython extension types?

I've been experimenting with wrapping C++ with cython. I'm trying to understand the implications of typing self in extension type methods.
In the docs self is not explicitly typed but it seems like there could potentially be speedups associated with typing self.
However, in my limited experimentation, explicitly typing self does not seem to yield performance increases. Is there special magic going on under the covers to handle self, or is this purely a style thing?
EDIT for clarity:
By typing self, I mean providing type information for the self argument of a method. i.e.:
cdef class foo:
    cpdef bar(self):
        # do stuff with self
vs
cdef class foo:
    cpdef bar(foo self):
        # do stuff with self
Short answer:
There is no need to verbosely type self in a class method. It's not much faster than a plain self.
Long answer:
There are indeed some differences in the generated C code (one can easily check this in a Jupyter notebook with the cell magic %%cython -a). For example:
%%cython -a
# Case 1
cdef class foo1:
    def bar(self, foo1 other):
        pass
    def __eq__(self, foo1 other):
        pass
# Case 2
cdef class foo2:
    def bar(self, foo2 other):
        pass
    def __eq__(foo2 self, foo2 other):
        pass
In the Python wrapper, self is always converted to PyObject *.
For a normal method (bar), the wrapped C function signatures are identical: self is converted to struct xxx_foo * in both cases.
For a special method (__eq__), in the wrapped C function a plain self is kept as PyObject *, but the typed foo2 self is converted to struct xxx_foo2 *. In the latter case, the Python wrapper casts PyObject * to struct xxx_foo2 * and calls the wrapped C function. Case 2 may have fewer pointer indirections, but there should not be much difference in performance between the two cases. Besides, case 2 does more checks in the Python wrapper. In practice, profiling will tell.
As you already worked out, normally self is "translated" to the right type in the resulting c-code.
The only exceptions I'm aware of are the rich comparison operators, i.e. __eq__, __lt__, __le__ and so on.
The other special methods/operators like += or + work exactly in the same way as all other "normal" methods: self is automatically of the right type.
However, the behavior of the rich comparison operators will be changed soon, as it seems to be only a glitch in the newly introduced feature: corresponding issue.
Now that we have established what Cython does, the interesting question is why Cython does it this way.
For somebody coming from statically typed languages it is pretty obvious that self can only be of the class type (exactly this class or one derived from it) for which the function is defined, so I would expect self to be of this class type. It would be a surprise if Cython behaved differently.
Yet it is probably not so clear in the age of duck typing and monkey-patching, in which classes can be changed dynamically. Let's take a look at the following example (Python 2):
[]class A:
    def __init__(self, val):
        self.val=val
    def __str__(self):
        return "value=%s"%self.val
[]class B:
    def __init__(self, val):
        self.val="<"+val+">"
[] a,b=A(1.0),B("div")
[] print a
value=1.0
[] print b
<__main__.B instance at 0x0000000003D24E08>
If we don't like how print handles the class B, it is possible to monkey-patch the class B via:
[]B.__str__=lambda self: "value=%s"%self.val
[]print b
value=<div>
So if we like the way the class A handles the __str__ method, we could try to "reuse" it:
[]B.__str__=lambda self: A.__str__(self)
[]print b
TypeError: unbound method __str__() must be called
with A instance as first argument
(got B instance instead)
So it is not possible: for calls to A.__str__(self), Python checks that self is really of type A.
Thus, Cython is right in using the right type for self directly and not a Python object.
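The session above is Python 2. As a hedged Python 3 sketch: plain Python classes no longer perform this check, because A.__str__ is just a function there, but methods of built-in and extension types still reject a self of the wrong type. Here list is used as a stand-in for an extension type; using it in place of a compiled cdef class is an assumption made so that the example runs without Cython.

```python
class A:
    def __init__(self, val):
        self.val = val

    def __str__(self):
        return "value=%s" % self.val


class B:
    def __init__(self, val):
        self.val = "<" + val + ">"


b = B("div")

# Python 3: A.__str__ is a plain function, so any object with .val works.
print(A.__str__(b))  # value=<div>

# Built-in / extension type methods still check the type of self.
try:
    list.append((), 1)  # a tuple is not a list
except TypeError as exc:
    print("TypeError:", exc)
```

The second call fails before any list code runs, which is the same guarantee that lets a compiled method treat self as its own struct type.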

Is it possible to write "pure" c++ class in Cython?

In Cython, a class, or an extension type, is a Python class, which means it can be initialized from Python. On the other hand, the parameters of its __init__ or __cinit__ have to be Python objects.
Is it possible to write a class in Cython which can only be initialized by cdef functions, and thus can be initialized with C types and C++ objects?
I want to do this because it would be easier to translate my existing Python code to Cython than to C/C++.
You can quite easily create a class that can't (easily) be initialised from Python, but can only be created from a cdef factory function:
cdef class ExampleNoPyInit:
    cdef int value
    def __init__(self):
        raise RuntimeError("Cannot be initialised from Python")
cdef ExampleNoPyInit_factory(int v):
    cdef ExampleNoPyInit a
    # bypass __init__
    a = ExampleNoPyInit.__new__(ExampleNoPyInit)
    a.value = v
    return a
inst = ExampleNoPyInit_factory(5)
(I suspect the really committed could use the same method of initialising it in Python if they wanted. There are other ways to prevent initialisation if you want to be more thorough; for example, you could use a cdef global variable in your Cython module as a flag, which could not be accessed from Python.)
This class still has the Python reference-counting mechanism built in, so it is still a "Python class". If you want to avoid that, you could use a cdef struct (although that can't have member functions).
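The __new__ trick itself can be tried in plain Python, as a rough sketch of the same idea. The function name example_factory is hypothetical, and unlike a cdef function it remains callable from any Python code, so this only demonstrates how __init__ is bypassed, not how access is restricted.

```python
class ExampleNoPyInit:
    def __init__(self):
        raise RuntimeError("Cannot be initialised from Python")


def example_factory(v):
    # __new__ allocates the instance without running __init__.
    a = ExampleNoPyInit.__new__(ExampleNoPyInit)
    a.value = v
    return a


inst = example_factory(5)
print(inst.value)  # 5

try:
    ExampleNoPyInit()  # the normal constructor path still fails
except RuntimeError as exc:
    print("RuntimeError:", exc)
```

This also shows why the guard is only a deterrent: anyone who calls __new__ directly gets an instance without going through __init__.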