How to save xgboost model as a json of jsons?

After training an xgboost model, I would like to save it, together with some other custom fields, as a single JSON object as shown below. The goal is to load the JSON object later, use the model object to make predictions, and inspect the other custom fields.
model = xgb.train(params=tree_params, dtrain=data)
my_model_dict = {
    "model": ...<json serializable model object>...,  # need help here
    "features": model.feature_names,
    "tree_params": tree_params,
    ...
}
# json.dump writes to an open file object, not to a path
with open(file_path, "w") as f:
    json.dump(my_model_dict, f)
with open(file_path) as f:
    my_model_dict = json.load(f)
model = my_model_dict["model"]
predictions = model.predict(new_data)
Is it possible to convert an xgboost model object into an object that is json serializable and that can then be loaded to make standard xgboost predictions?
I appreciate I can save the raw model separately as JSON using
model.save_model("my_model.json")
model = xgb.Booster(model_file="my_model.json")
model.predict(new_data)
but really what I would like to do is create a dictionary containing the model along with other custom fields that can be saved as a json, then loaded to make predictions.
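One possible way to get the single-file layout described above, offered as a sketch rather than a confirmed recipe: as far as I know, `Booster.save_raw(raw_format="json")` returns the model as JSON bytes, which can be parsed and nested inside the dictionary, and `Booster.load_model` accepts an in-memory `bytearray` as well as a file name. The helper names `booster_to_dict` and `booster_from_dict` are made up for illustration, and the `raw_format` argument assumes a recent xgboost release that supports it.

```python
import json


def booster_to_dict(booster, **extra_fields):
    """Return a JSON-serializable dict holding the model plus custom fields."""
    # save_raw(raw_format="json") yields the model as UTF-8 encoded JSON bytes.
    raw = booster.save_raw(raw_format="json")
    payload = {"model": json.loads(bytes(raw).decode("utf-8"))}
    payload.update(extra_fields)
    return payload


def booster_from_dict(payload):
    """Rebuild a Booster from a dict produced by booster_to_dict."""
    import xgboost as xgb  # imported lazily; only needed when loading

    booster = xgb.Booster()
    # load_model accepts an in-memory buffer (bytearray) as well as a file name.
    booster.load_model(bytearray(json.dumps(payload["model"]), "utf-8"))
    return booster
```

Saving would then be `json.dump(booster_to_dict(model, features=model.feature_names, tree_params=tree_params), f)`, and loading `booster_from_dict(json.load(f)).predict(new_data)`, provided every custom field is itself JSON-serializable.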

Related

DRF: can you deserialize one JSON key:value pair into multiple fields

In my API I have a module, which collects JSON objects obtained via POST request. JSON objects I'm receiving look more or less like this:
{
"id": "bec7426d-c5c3-4341-9df2-a0b2c54ab8da",
"data": {
"temperature": -2.4,
// some other stuff is here as well ...
}
}
The problem is the requirement that I save both the individual records from the data dictionary and the whole data dictionary as a JSONField. My ORM model looks like this:
class Report(BaseModel):
    id = models.BigAutoField(primary_key=True)
    data = JSONField(verbose_name=_("Data"), encoder=DjangoJSONEncoder, blank=True, null=True)
    temperature = models.DecimalField(
        max_digits=3,
        decimal_places=1,
    )
    # other properties
Is there a neat way to get both properties in a single deserialization pass? Currently I use nested serializers like this:
class DataSerializer(serializers.ModelSerializer):
    # no source= here: DRF rejects a source that equals the field name
    temperature = serializers.DecimalField(
        write_only=True, max_digits=3, decimal_places=1
    )
    class Meta:
        model = Report
        fields = ("temperature",)  # trailing comma makes this a tuple
class ReportSerializer(serializers.ModelSerializer):
    id = serializers.UUIDField(source="uuid", read_only=True)
    data = DataSerializer(source="*")
    class Meta:
        model = Report
        fields = ("id", "data")
This obviously does not pass the whole data dictionary to validated_data, which means I have to update the JSON field elsewhere (not good, since I would like both data and temperature to be null=False in the ORM model). Any good ideas on how to fix this using DRF serializers would be appreciated.
I believe you can override the validate method of your serializer: store the initial data JSON there, and run the default validation by calling the super() method.
More info: https://www.django-rest-framework.org/api-guide/serializers/#validation
There are also object-level validation functions available; you can look there as well for access to the initially posted data:
https://www.django-rest-framework.org/api-guide/serializers/#object-level-validation
You can also override the run_validation method to access the data object as it was initially passed in.
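As a rough sketch of the override idea above (not tested against the models in the question): `run_validation` hands the raw input to `to_internal_value`, so overriding that hook gives access to the original `data` dictionary next to the validated fields. The mixin name `KeepRawDataMixin` and its `raw_key` attribute are hypothetical, and the code assumes the mixin is listed before `serializers.ModelSerializer` in the bases of `ReportSerializer`.

```python
class KeepRawDataMixin:
    """Copy the raw nested dict from the request into the validated values."""

    raw_key = "data"  # key of the nested dict in the incoming payload (assumption)

    def to_internal_value(self, data):
        # Let the serializer validate and flatten the nested fields as usual.
        validated = super().to_internal_value(data)
        # Then keep the whole nested dict too, so it can be saved in the JSONField.
        validated[self.raw_key] = dict(data[self.raw_key])
        return validated


# Intended use (assumption, requires Django REST framework):
# class ReportSerializer(KeepRawDataMixin, serializers.ModelSerializer): ...
```

With this in place, `validated_data` would carry both `temperature` and the full `data` dictionary, so both model fields could be filled in a single `save()`.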

How to Convert Detected Object to COCO dataset Json

I am following the Object Detection Demo at https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb:
# Actual detection.
output_dict = run_inference_for_single_image(image_np, detection_graph)
But I want to convert output_dict (the output of run_inference_for_single_image(image_np, detection_graph)) to the COCO annotation JSON format, so that I can use it to benchmark different object detection models.
Here is the code to benchmark a model: https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
#initialize COCO detections api
resFile='%s/results/%s_%s_fake%s100_results.json'
resFile = resFile%(dataDir, prefix, dataType, annType)
cocoDt=cocoGt.loadRes(resFile)
But it requires input in the COCO JSON format.
Can anyone tell me how to convert output_dict to COCO JSON?
This is what you are looking for. The last code snippet:
https://lijiancheng0614.github.io/2017/08/22/2017_08_22_TensorFlow-Object-Detection-API/
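Independently of the linked post, here is a minimal sketch of such a conversion. It assumes `output_dict` has the keys produced by the tutorial's `run_inference_for_single_image` (`num_detections`, `detection_boxes` as normalized `[ymin, xmin, ymax, xmax]`, `detection_classes`, `detection_scores`) and that the model's class ids already match COCO category ids. The function name `to_coco_results` and the `image_id`, `width`, `height`, and `min_score` arguments are illustrative, not part of either API.

```python
def to_coco_results(output_dict, image_id, width, height, min_score=0.0):
    """Convert one image's detections into COCO results-format entries."""
    results = []
    for i in range(int(output_dict["num_detections"])):
        score = float(output_dict["detection_scores"][i])
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = (float(v) for v in output_dict["detection_boxes"][i])
        results.append({
            "image_id": image_id,
            "category_id": int(output_dict["detection_classes"][i]),
            # COCO expects [x, y, width, height] in pixels.
            "bbox": [xmin * width, ymin * height,
                     (xmax - xmin) * width, (ymax - ymin) * height],
            "score": score,
        })
    return results
```

The entries for all images would be collected into one list and written with `json.dump` to the file that is passed as `resFile` to `cocoGt.loadRes`.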

Why json parsing serializer is not working for nested json in Django Rest Framework?

I am implementing a JSON API using Django Rest Framework.
Since SQLite is used as the database, JSON data is stored as a string; when the data is requested, the serializer parses the string, converts it to JSON, and sends it to the client. This implementation works for a simple JSON file, as shown in the left picture. However, it does not work for nested JSON, as shown in the right picture.
Can anyone tell me how I should revise the serializer so that it also works for nested JSON?
serializer.py
import json
class strToJson(serializers.CharField):
    def to_representation(self, value):
        # the standard-library module is lowercase json, not JSON
        x = json.loads(value)
        return x
class summarySerializer(serializers.ModelSerializer):
    project = serializers.CharField(read_only=True, source="html.project")
    version = serializers.CharField(read_only=True, source="html.version")
    id = serializers.IntegerField(read_only=True, source="html.pk")
    json = strToJson()
    class Meta:
        model = summary
        fields = ('id', 'project', 'version', 'json')
model.py
class summary(models.Model):
    html = models.ForeignKey(html, on_delete=models.CASCADE, related_name='summaries')
    keyword = models.CharField(max_length=50, default='test')
    json = models.TextField(default='test')

GCP Proto Datastore encode JsonProperty in base64

I store a blob of Json in the datastore using JsonProperty.
I don't know the structure of the json data.
I am using endpoints proto datastore in order to retrieve my data.
The problem is that the JSON property is base64-encoded, and I want a plain JSON object.
For example, the JSON data would be:
{
"first": 1,
"second": 2
}
My code looks something like:
import endpoints
from google.appengine.ext import ndb
from protorpc import remote
from endpoints_proto_datastore.ndb import EndpointsModel
class Model(EndpointsModel):
    data = ndb.JsonProperty()
@endpoints.api(name='myapi', version='v1', description='My Sample API')
class DataEndpoint(remote.Service):
    @Model.method(path='mymodel2', http_method='POST',
                  name='mymodel.insert')
    def MyModelInsert(self, my_model):
        my_model.data = {"first": 1, "second": 2}
        my_model.put()
        return my_model
    @Model.method(path='mymodel/{entityKey}',
                  http_method='GET',
                  name='mymodel.get')
    def getMyModel(self, model):
        print(model.data)
        return model
API = endpoints.api_server([DataEndpoint])
When I call the api for getting a model, I get:
POST /_ah/api/myapi/v1/mymodel2
{
"data": "eyJzZWNvbmQiOiAyLCAiZmlyc3QiOiAxfQ=="
}
where eyJzZWNvbmQiOiAyLCAiZmlyc3QiOiAxfQ== is the base64 encoding of {"second": 2, "first": 1}.
The print statement gives me: {u'second': 2, u'first': 1}
So, inside the method, I can explore the JSON blob as a Python dict.
But in the API response, the data is base64-encoded.
I expected the API call to give me:
{
'data': {
'second': 2,
'first': 1
}
}
How can I get this result?
Following the discussion in the comments on your question, here is some sample code that you can use to store a JSON object in Datastore (it will be stored as a string) and later retrieve it in such a way that:
It shows as plain JSON after the API call.
You can parse it back into a Python dict using eval.
I hope I understood your issue correctly and that this helps.
import endpoints
from google.appengine.ext import ndb
from protorpc import remote
from endpoints_proto_datastore.ndb import EndpointsModel
class Sample(EndpointsModel):
    column1 = ndb.StringProperty()
    column2 = ndb.IntegerProperty()
    column3 = ndb.StringProperty()
@endpoints.api(name='myapi', version='v1', description='My Sample API')
class MyApi(remote.Service):
    # URL: .../_ah/api/myapi/v1/mymodel - POSTS A NEW ENTITY
    @Sample.method(path='mymodel', http_method='GET', name='Sample.insert')
    def MyModelInsert(self, my_model):
        dict = {'first': 1, 'second': 2}
        dict_str = str(dict)
        my_model.column1 = "Year"
        my_model.column2 = 2018
        my_model.column3 = dict_str
        my_model.put()
        return my_model
    # URL: .../_ah/api/myapi/v1/mymodel/{ID} - RETRIEVES AN ENTITY BY ITS ID
    @Sample.method(request_fields=('id',), path='mymodel/{id}', http_method='GET', name='Sample.get')
    def MyModelGet(self, my_model):
        if not my_model.from_datastore:
            raise endpoints.NotFoundException('MyModel not found.')
        dict = eval(my_model.column3)
        print("This is the Python dict recovered from a string: {}".format(dict))
        return my_model
application = endpoints.api_server([MyApi], restricted=False)
I have tested this code using the development server, but it should work the same in production using App Engine with Endpoints and Datastore.
After querying the first endpoint, it will create a new Entity which you will be able to find in Datastore, and which contains a property column3 with your JSON data in string format.
Then, if you use the ID of that entity to retrieve it, your browser will show the string without any strange encoding, just plain JSON.
And in the console, you will be able to see that this string can be converted to a Python dict (or to JSON, using the json module if you prefer).
I hope I have not missed any point of what you want to achieve, but I think all the most important points are covered with this code: a property being a JSON object, store it in Datastore, retrieve it in a readable format, and being able to use it again as JSON/dict.
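As a side note on the `str`/`eval` round trip above: the `json` module mentioned earlier can do the same job without evaluating stored text as Python code, and the stored string is then real JSON with double-quoted keys. A small sketch using only the standard library; the variable names are illustrative:

```python
import json

payload = {"first": 1, "second": 2}

# Serialize to a JSON string before assigning it to a StringProperty.
stored = json.dumps(payload)

# Parse it back into a Python dict after reading the entity.
restored = json.loads(stored)

print(stored)
print(restored == payload)
```

In the sample above this would mean replacing `str(dict)` with `json.dumps(dict)` and `eval(my_model.column3)` with `json.loads(my_model.column3)`.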
Update:
I think you should have a look at the list of available Property Types yourself, in order to find which one fits your requirements better. However, as an additional note, I have done a quick test working with a StructuredProperty (a property inside another property), by adding these modifications to the code:
# Define the nested model (your JSON object)
class Structured(EndpointsModel):
    first = ndb.IntegerProperty()
    second = ndb.IntegerProperty()
# Here I added a new property for simplicity; remember, StackOverflow does not write code for you :)
class Sample(EndpointsModel):
    column1 = ndb.StringProperty()
    column2 = ndb.IntegerProperty()
    column3 = ndb.StringProperty()
    column4 = ndb.StructuredProperty(Structured)
# Modify this endpoint definition to add a new property
@Sample.method(request_fields=('id',), path='mymodel/{id}', http_method='GET', name='Sample.get')
def MyModelGet(self, my_model):
    if not my_model.from_datastore:
        raise endpoints.NotFoundException('MyModel not found.')
    # Add the new nested property here
    dict = eval(my_model.column3)
    my_model.column4 = dict
    print(json.dumps(my_model.column3))
    print("This is the Python dict recovered from a string: {}".format(dict))
    return my_model
With these changes, the response of the call to the endpoint looks like:
Now column4 is a JSON object itself (although it is not printed in-line, I do not think that should be a problem).
I hope this helps too. If this is not the exact behavior you want, maybe you should play around with the available Property Types, but I do not think there is one type to which you can assign a Python dict (or JSON object) without first converting it to a string.

Reading JSON thru web service into POJOs annotated for Hibernate

I am reading the following json through a web service. Is there a way to read the json into three appropriate POJOs? The POJOs are generated by hibernate and are used to communicate to the database.
Basically I need to read the person json into a Person POJO, the pets json into a set of Pet POJOs, and the toy json into a set of Toy POJOs.
The JSON
{
"person":{"first_name":"John", "last_name":"Smith"},
"pets":[{"species":"dog", "name":"Adama"}, {"species":"cat", "name":"Benton"} ],
"toys":[{"car":"corvet", "color":"black"}, {"action_figure":"hancock", "height":"1ft"} ]
}
The Web Service
@Post
public Representation readForm(Representation representation) {
    try {
        Person aPerson = …
        Set<Pet> petSet = …
        Set<Toy> toySet = ...
        ….
You can use XStream. Create a VO (value object) that has all three of your object types as properties, and give each of them its respective alias. All three types of objects will then be populated in that VO, and you can read them simply by calling its getters.