Using Python cffi for introspection - json

I'm contemplating using Python's cffi module as part of a Remote Procedure Call package: you create a single <mymodule>.h file and use cffi to generate the JSON encoders and decoders between the two endpoints.
To do this, I'd use cffi to parse the typedefs, the structs and the function signatures, which I'd then use to create the encoders and decoders for the JSON code.
I can see how to get cffi's internal model of a typedef, e.g.:
>>> from cffi import FFI
>>> import pprint
>>> ffi = FFI()
>>> ffi.cdef("""
typedef enum { LED_HEAT, LED_FAN} led_id_t;
void set_led(led_id_t led_id, bool on);
""")
>>> led_id_type = ffi.typeof('led_id_t')
>>> led_id_type.elements
{1: 'LED_FAN', 0: 'LED_HEAT'}
Question: I know that cffi has a model for the signature of the set_led() function. What cffi functions give me access to that model?
Update
By groveling over the cffi sources, I see that I can do the following:
>>> x = ffi._parser._declarations['function set_led']
>>> x
(<void(*)(led_id_t, _Bool)>, 0)
>>> type(x[0])
<class 'cffi.model.FunctionPtrType'>
>>> x[0].args
(<led_id_t>, <_Bool>)
>>> type(x[0].args[0])
<class 'cffi.model.EnumType'>
... which gets me pretty much what I want. Is there a published API for doing that (or at least something that doesn't dig so deeply into the internals)?
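One partial approach that stays within the documented surface, offered as a sketch rather than a confirmed answer: ffi.typeof() accepts a function-pointer type string and returns a ctype object whose kind, args and result attributes describe the signature, and ffi.list_types() lists the declared typedefs, structs and unions. The assumption here is that the signature of set_led is repeated as a type string, since this sketch does not look the declaration up by function name; describe() is a hypothetical helper.

```python
from cffi import FFI

ffi = FFI()
ffi.cdef("""
typedef enum { LED_HEAT, LED_FAN } led_id_t;
void set_led(led_id_t led_id, bool on);
""")

# Assumption: the signature is restated as a function-pointer type string,
# because this sketch does not look up 'set_led' by name.
sig = ffi.typeof("void(*)(led_id_t, _Bool)")


def describe(ctype):
    """Hypothetical helper: turn a cffi ctype into a plain dict."""
    info = {"kind": ctype.kind, "cname": ctype.cname}
    if ctype.kind == "function":
        info["result"] = describe(ctype.result)
        info["args"] = [describe(arg) for arg in ctype.args]
    elif ctype.kind == "enum":
        info["elements"] = dict(ctype.elements)
    return info


model = describe(sig)

# Names declared so far: (typedefs, structs, unions).
typedefs, structs, unions = ffi.list_types()
```

If the compiled library is actually loaded, passing the function object itself to ffi.typeof() should give the same kind of ctype, which avoids restating the signature.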

Converting JSON into a DataFrame within FastAPI app

I am trying to create an API for customer churn at a bank. I have completed the model and now want to create the API using FastAPI. My problem is converting the JSON passed data to a dataframe to be able to run it through the model. Here is the code.
from fastapi import FastAPI
from starlette.middleware.cors import CORSMiddleware
from pycaret.classification import *
import pandas as pd
import uvicorn # ASGI
import pickle
import pydantic
from pydantic import BaseModel
class customer_input(BaseModel):
    CLIENTNUM: int
    Customer_Age: int
    Gender: str
    Dependent_count: int
    Education_Level: str
    Marital_Status: str
    Income_Category: str
    Card_Category: str
    Months_on_book: int
    Total_Relationship_Count: int
    Months_Inactive_12_mon: int
    Contacts_Count_12_mon: int
    Credit_Limit: float
    Total_Revolving_Bal: int
    Avg_Open_To_Buy: float
    Total_Amt_Chng_Q4_Q1: float
    Total_Trans_Amt: int
    Total_Trans_Ct: int
    Total_Ct_Chng_Q4_Q1: float
    Avg_Utilization_Ratio: float
app = FastAPI()
#Loading the saved model from pycaret
model = load_model('BankChurnersCatboostModel25thDec2020')
origins = [
    '*'
]
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=['GET','POST'],
    allow_headers=['Content-Type','application/xml','application/json'],
)
@app.get("/")
def index():
    return {"Nothing to see here"}
@app.post("/predict")
def predict(data: customer_input):
    # Convert input data into a dictionary
    data = data.dict()
    # Convert the dictionary into a dataframe
    my_data = pd.DataFrame([data])
    # Predicting using pycaret
    prediction = predict_model(model, my_data)
    return prediction
# Only use below 2 lines when testing on localhost -- remove when deploying
if __name__ == '__main__':
    uvicorn.run(app, host='127.0.0.1', port=8000)
When I test this from the OpenAPI interface I get an Internal Server Error, and the console shows:
ValueError: [TypeError("'numpy.int64' object is not iterable"), TypeError('vars() argument must have __dict__ attribute')]
How can I get the data that is passed into the predict function to convert into a dataframe successfully? Thank you.
OK, so I fixed this by changing the customer_input class: I changed every int field to float, and that fixed it. I don't understand why, though. Can anyone explain?
Fundamentally, those values are only meant to be integers because they are all discrete (e.g. the number of dependents a customer has), but I guess I could put a constraint on the front end.
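One plausible reading of the int-to-float fix, not confirmed anywhere in this thread: the error message names numpy.int64, and NumPy integer scalars are not subclasses of Python's int, whereas numpy.float64 is a subclass of float. The sketch below uses the standard json module as a stand-in for the response encoding (an assumption; FastAPI has its own encoder), and to_native() is a hypothetical helper.

```python
import json

import numpy as np


def to_native(value):
    """Hypothetical helper: unwrap NumPy scalars into plain Python objects."""
    return value.item() if isinstance(value, np.generic) else value


row = {"Dependent_count": np.int64(3), "Credit_Limit": np.float64(1200.5)}

# np.float64 subclasses Python's float, so json accepts it directly.
float_ok = json.dumps({"Credit_Limit": row["Credit_Limit"]})

# np.int64 does not subclass int, so json raises TypeError.
try:
    json.dumps({"Dependent_count": row["Dependent_count"]})
    int_ok = True
except TypeError:
    int_ok = False

# Converting every value to a native type makes the whole row serializable.
clean = {key: to_native(value) for key, value in row.items()}
encoded = json.dumps(clean)
```

If this is indeed the cause, converting the prediction to native Python types before returning it would let the integer fields stay integers.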

How to modify and fetch from map in cython? [duplicate]

I was wondering whether it is possible to iterate through a map directly in Cython code, i.e. in the .pyx file.
Here is my example:
import cython
cimport cython
from libcpp.map cimport map as mapcpp
def it_through_map(dict mymap_of_int_int):
    # python dict to map
    cdef mapcpp[int,int] mymap_in = mymap_of_int_int
    cdef mapcpp[int,int].iterator it = mymap_in.begin()
    while(it != mymap_in.end()):
        # let's pretend here I just want to print the key and the value
        print(it.first) # Not working
        print(it.second) # Not working
        it ++ # Not working
This does not compile: Object of type 'iterator' has no attribute 'first'
I have used the map container in C++ before, but for this code I am trying to stick to Cython/Python. Is that possible here?
Resolved by DavidW
Here is a working version of the code, following DavidW's answer:
import cython
cimport cython
from libcpp.map cimport map as mapcpp
from cython.operator cimport dereference, postincrement
def it_through_map(dict mymap_of_int_int):
    # python dict to map
    cdef mapcpp[int,int] mymap_in = mymap_of_int_int
    cdef mapcpp[int,int].iterator it = mymap_in.begin()
    while(it != mymap_in.end()):
        # let's pretend here I just want to print the key and the value
        print(dereference(it).first) # print the key
        print(dereference(it).second) # print the associated value
        postincrement(it) # Increment the iterator to the next element
The map iterator doesn't have elements first and second. Instead it has an operator* which returns a pair reference. In C++ you can use it->first to do this in one go, but that syntax doesn't work in Cython (and it isn't intelligent enough to decide to use -> instead of . itself in this case).
Instead you use cython.operator.dereference:
from cython.operator cimport dereference
# ...
print(dereference(it).first)
Similarly, it++ can be done with cython.operator.postincrement.

Encoding/Decode shapeless records with circe

Upgrading circe from 0.4.1 to 0.7.0 broke the following code:
import shapeless._
import syntax.singleton._
import io.circe.generic.auto._
.run[Record.`'transaction_id -> Int`.T](transport)
def run[A](transport: Json => Future[Json])(implicit decoder: Decoder[A], exec: ExecutionContext): Future[A]
With the following error:
could not find implicit value for parameter decoder: io.circe.Decoder[shapeless.::[Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("transaction_id")],Int],shapeless.HNil]]
[error] .run[Record.`'transaction_id -> Int`.T](transport)
[error] ^
Am I missing some import here or are these encoders/decoders not available in circe anymore?
Instances for Shapeless's hlists, records, etc. were moved to a separate circe-shapes module in the circe 0.6.0 release. If you add this module to your build, the following should just work:
import io.circe.jawn.decode, io.circe.shapes._
import shapeless._, record.Record, syntax.singleton._
val doc = """{ "transaction_id": 1 }"""
val res = decode[Record.`'transaction_id -> Int`.T](doc)
The motivation for moving these instances was that the improved generic derivation introduced in 0.6 meant that they were no longer necessary, and keeping them out of implicit scope when they're not needed is both cleaner and potentially supports faster compile times. The new circe-shapes module also includes features that were not available in circe-generic, such as instances for coproducts.

Dill/Pickling Error: odd numbers of set items

I have a very strange error that cannot be reproduced anywhere except my production environment. What does this error mean? I get it when I try to run the following piece of code:
serialized_object = dill.dumps(object)
dill.loads(serialized_object)
pickle.UnpicklingError: odd number of items for SET ITEMS
I'd never seen this before, so I looked at the source code (see https://github.com/python/cpython/blob/f24143b25e4f83368ff6182bebe14f885073015c/Modules/_pickle.c#L5914). The implication seems to be that you have a corrupted or hostile pickle.
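To make the "corrupted pickle" reading concrete, here is a small sketch using only the standard pickle module. The byte string is hand-written for illustration (it is not the data from the question): it opens a dict, sets a mark, pushes a single integer and then issues SETITEMS, which expects key/value pairs. The try_load() helper is hypothetical, and it also catches IndexError because the exact exception type may depend on which unpickler implementation is in use.

```python
import pickle

# Hand-written stream: EMPTY_DICT, MARK, BININT1 1, SETITEMS, STOP.
# SETITEMS expects key/value pairs, but only one item follows the mark.
corrupt = b"}(K\x01u."


def try_load(data):
    """Hypothetical helper: return (value, error) instead of raising."""
    try:
        return pickle.loads(data), None
    except (pickle.UnpicklingError, IndexError) as exc:
        return None, exc


_, error = try_load(corrupt)

# A stream written by pickle itself round-trips without error.
value, no_error = try_load(pickle.dumps({"a": 1}))
```

A stream produced by a correct pickler never has this shape, which is why the error points at the data being written or transported incorrectly rather than at the object itself.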
Based on the OP's comments, I think I see the workaround. I'll have to determine the impact of the workaround, and it will have to be integrated into dill, but for now here it is:
>>> import StringIO as io
>>> f = io.StringIO()
>>> import dill
>>> import numpy as np
>>> x = np.array([1])
>>> y = (x,)
>>> p = dill.Pickler(f)
>>> p.dump(x)
>>> f.getvalue()
"cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I1\ntp6\ncnumpy\ndtype\np7\n(S'i8'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\np13\ntp14\nb."
>>> p.dump(y)
>>> f.getvalue()
"cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I1\ntp6\ncnumpy\ndtype\np7\n(S'i8'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\np13\ntp14\nb.(g5\ntp15\n."
>>> dill.loads(_)
array([1])
>>>
import dill
import numpy as np
x = np.array([1])
y = (x,)
dill.dumps(x)
dill.loads(dill.dumps(y))
This will throw an index out of range exception. The reason is that a special function is registered to serialize numpy array objects, and that function uses the global Pickler to store the serialized data instead of the Pickler that is passed as an argument. To fix it, I used the Pickler that is passed as the argument instead. I'm not sure whether that breaks anything else in dill, though.

piecewise numpy function with integer arguments

I define the piecewise function
def Li(x):
    return piecewise(x, [x < 0, x >= 0], [lambda t: sin(t), lambda t: cos(t)])
When I evaluate Li(1.0), the answer is correct: array(0.5403023058681398).
But if I write Li(1), the answer is array(0).
I don't understand this behaviour.
This version runs correctly:
def Li(x):
    return piecewise(float(x),
                     [x < 0, x >= 0],
                     [lambda t: sin(t), lambda t: cos(t)])
It seems that piecewise() converts the return values to the same type as the input, so when an integer is passed in, the result is converted to an integer before it is returned. Because sine and cosine always return values between -1 and 1, every integer conversion results in 0, 1 or -1, with the vast majority being 0.
>>> x=np.array([0.9])
>>> np.piecewise(x, [True], [float(x)])
array([ 0.9])
>>> x=np.array([1.0])
>>> np.piecewise(x, [True], [float(x)])
array([ 1.])
>>> x=np.array([1])
>>> np.piecewise(x, [True], [float(x)])
array([1])
>>> x=np.array([-1])
>>> np.piecewise(x, [True], [float(x)])
array([-1])
In the above I have explicitly cast the result to float, however, an integer input results in an integer output regardless of the explicit cast. I'd say that this is unexpected and I don't know why piecewise() should do this.
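The same effect can be seen on a whole array, together with one way around it. This is a minimal sketch assuming a recent NumPy; li() is a hypothetical rewrite of the function from the question that converts its input to a float array before calling np.piecewise.

```python
import numpy as np


def li(x):
    """sin(x) for x < 0, cos(x) otherwise, computed in floating point."""
    # Converting first makes the output array float as well.
    x = np.asarray(x, dtype=float)
    return np.piecewise(x, [x < 0, x >= 0], [np.sin, np.cos])


ints = np.array([-1, 0, 1])

# Integer input: the float results are stored into an integer output array,
# so the fractional parts are lost.
truncated = np.piecewise(ints, [ints < 0, ints >= 0], [np.sin, np.cos])

# Float input: the values survive.
fixed = li(ints)
```

Converting the input once at the top of the function keeps the conditions and the values on the same float array, which avoids mixing float(x) with conditions built from the original integer.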
I don't know if you have something more elaborate in mind, however, you don't need piecewise() for this simple case; an if/else will suffice instead:
from math import sin, cos
def Li(t):
    return sin(t) if t < 0 else cos(t)
>>> Li(0)
1.0
>>> Li(1)
0.5403023058681398
>>> Li(1.0)
0.5403023058681398
>>> Li(-1.0)
-0.8414709848078965
>>> Li(-1)
-0.8414709848078965
You can wrap the return value in a numpy.array if required.
I am sorry, but this example is taken and modified from
http://docs.scipy.org/doc/numpy/reference/generated/numpy.piecewise.html
But, in fact, using ipython with numpy 1.9
"""
Python 2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Aug 21 2014, 18:22:21)
Type "copyright", "credits" or "license" for more information.
IPython 2.2.0 -- An enhanced Interactive Python.
"""
I have no errors, but "ValueError: too many boolean indices" error appears if I use python 2.7.3 with numpy 1.6
"""
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
"""
I tested this function under Linux and Windows, and the same error occurs.
Obviously, it is very easy to work around this, but I think that this behaviour is a bug in the numpy library.