Dump weights of a CNN in JSON using Keras

I want to use the dumped weights and model architecture in another framework for testing.
I know that:
model.get_config() gives the configuration of the model;
model.to_json() returns a representation of the model as a JSON string, but the representation does not include the weights, only the architecture;
model.save_weights(filepath) saves the weights of the model as an HDF5 file.
I want to save both the architecture and the weights in a JSON file.

Keras does not have any built-in way to export the weights to JSON.
Solution 1:
For now you can do it by iterating over the weights and saving them to a JSON file.
weights_list = model.get_weights()
returns a list of all weight tensors in the model as NumPy arrays. Then all you have to do is iterate over this list and write each array to the file:
import json
for i, weights in enumerate(weights_list):
    with open('weights_{}.json'.format(i), 'w') as f:
        json.dump(weights.tolist(), f)
Solution 2:
import json
weights_list = model.get_weights()
# get_weights() returns a plain list, so convert each array individually
print(json.dumps([w.tolist() for w in weights_list]))
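Putting both pieces together, here is a minimal sketch of dumping architecture and weights into a single JSON file. Since no model is defined in this snippet, the architecture string and the arrays below are stand-ins for what model.to_json() and model.get_weights() would return; with a real Keras model you would call those two methods instead.

```python
import json
import numpy as np

# stand-ins for model.to_json() and model.get_weights();
# replace these two lines with the real method calls on your model
architecture_json = '{"class_name": "Sequential", "config": []}'
weights_list = [np.zeros((2, 3)), np.ones((3,))]

bundle = {
    "architecture": json.loads(architecture_json),
    "weights": [w.tolist() for w in weights_list],  # numpy arrays -> nested lists
}

with open("model.json", "w") as f:
    json.dump(bundle, f)

# reload and verify the round-trip
with open("model.json") as f:
    restored = json.load(f)
print(len(restored["weights"]))  # 2
```

On the consuming side, each entry of restored["weights"] can be turned back into an array with np.array(...).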

Related

Relation extraction using doccano

I want to do relation extraction using doccano. I have already annotated entity relations using doccano, and the exported data is in JSONL format. I want to convert it into spaCy-format data to train BERT with spaCy on the JSONL-annotated data.
Drop this annotation and reannotate it with the spaCy NER Annotator instead.

The result of predicting with a CatBoostClassifier model's exported Python file differs from the result of predicting with the model directly

I want to verify that the results predicted from the exported file are consistent with those predicted directly.
I use the exported Python file with the CatBoostClassifier model description to predict the result:
But the result predicted directly is 2.175615211102761. I verified that this holds for multiple data points. I want to know why, and how to fix it.
float_sample and cat_sample look like
Supplementary question: the results predicted using the Python-language model file provided by the CatBoost tutorial differ from those predicted directly by the model.

Create LMDB for new test data

I have an LMDB training data file for the VPGNet CNN model, pre-trained on the Caltech Lanes dataset.
I would like to test it on a new dataset different from the training dataset. How do I create an LMDB for the new test data?
Do I need to modify the prototxt files for testing with the pre-trained net? For testing, do I need a prototxt file, or is there a specific command?
Thanks
Lightning Memory-Mapped Database (LMDB) files can be processed efficiently as input data.
We create the native format (LMDB) for training and validating the model.
Once the trained model has converged and the loss has been calculated on the training and validation data,
we use separate data (unknown data, i.e. data not used for training) to run inference with the model.
If you are running classification inference on a single image or a set of images,
you need not convert them to LMDB. Instead, you can run a forward pass on the trained network with the image(s) converted into the desired format (NumPy arrays).
For more info:
https://software.intel.com/en-us/articles/training-and-deploying-deep-learning-networks-with-caffe-optimized-for-intel-architecture
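For the "converted into the desired format (NumPy arrays)" step, here is a minimal sketch of typical Caffe-style image preprocessing. The mean values, the BGR channel order, and the input shape are common Caffe conventions, not taken from the question, so match them to whatever your own deploy prototxt and training setup expect.

```python
import numpy as np

def prepare_image(img, mean=np.array([104.0, 117.0, 123.0])):
    """Convert an HWC uint8 RGB image into a Caffe-style NCHW float32 batch.

    The per-channel mean and the RGB->BGR swap are assumptions; adapt
    them to your own network's preprocessing.
    """
    x = img.astype(np.float32)  # HWC, float
    x = x[:, :, ::-1]           # RGB -> BGR
    x -= mean                   # subtract per-channel mean
    x = x.transpose(2, 0, 1)    # HWC -> CHW
    return x[np.newaxis, ...]   # add batch dimension -> NCHW

# example: a dummy 8x8 RGB image
img = np.zeros((8, 8, 3), dtype=np.uint8)
batch = prepare_image(img)
print(batch.shape)  # (1, 3, 8, 8)
```

The resulting batch can then be fed to the network's input blob for a forward pass.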

How can we do feature selection on json data?

I have a large dataset in JSON format from which I want to extract the important attributes that capture the most variance. I want to extract these attributes to build a search engine on the dataset, with these attributes being the hash key.
The main question here is how to do feature selection on JSON data.
You could read the data into a pandas DataFrame with the pandas.read_json() function. You can use this DataFrame to gain insight into your data. For example:
data = pandas.read_json(json_file)
data.head()  # displays the first five rows
data.info()  # displays a description of the data
Or you can use matplotlib on this DataFrame to plot a histogram for each numerical attribute:
import matplotlib.pyplot as plt
data.hist(bins=50, figsize=(20, 15))
plt.show()
If you are interested in the correlation of attributes, you can use the pandas.plotting.scatter_matrix() function.
You still have to manually pick the attributes that best fit your task; these tools help you understand the data and gain insight into it.
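As a minimal sketch of variance-based selection on JSON data with pandas: the records below are hypothetical, so replace the StringIO with the path to your own JSON file.

```python
import io
import pandas as pd

# hypothetical JSON records; replace with your own file path
raw = io.StringIO(
    '[{"price": 10, "qty": 1, "flag": 0},'
    ' {"price": 50, "qty": 2, "flag": 0},'
    ' {"price": 90, "qty": 3, "flag": 0}]'
)
data = pd.read_json(raw)

# rank numerical attributes by variance and keep the top k
k = 2
variances = data.var(numeric_only=True).sort_values(ascending=False)
selected = list(variances.index[:k])
print(selected)  # ['price', 'qty'] -- 'flag' has zero variance and is dropped
```

Note that raw variance is scale-dependent; if your attributes have very different units, consider standardizing them first.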

"Not JSON compliant number" building a geojson file

I'm trying to build a GeoJSON file with the Python geojson module, consisting of a regular 2-D grid of points whose 'properties' are associated with geophysical variables (velocity, temperature, etc.). The information comes from a netCDF file.
So the code is something like this:
from netCDF4 import Dataset
import numpy as np
import geojson
ncfile = Dataset('20140925-0332-n19.nc', 'r')
u = ncfile.variables['Ug'][:,:] # [T,Z,Y,X]
v = ncfile.variables['Vg'][:,:]
lat = ncfile.variables['lat'][:]
lon = ncfile.variables['lon'][:]
features=[]
for i in range(len(lat)):
    for j in range(len(lon)):
        coords = (lon[j], lat[i])
        features.append(geojson.Feature(geometry=geojson.Point(coords),
                                        properties={"u": u[i, j], "v": v[i, j]}))
In this case each point has velocity components in its 'properties' object. The error I receive is on the features.append() line, with the following message:
ValueError: -5.4989638 is not JSON compliant number
which corresponds to a longitude value. Can someone explain to me what can be wrong?
Simply converting to float eliminated the error, without needing NumPy:
coords = (float(lon[j]), float(lat[i]))
I found the solution. The geojson module only supports the standard Python data types, while NumPy defines many more scalar types (around 24). Unfortunately, the netCDF4 module needs NumPy to load arrays from netCDF files. I solved it using the numpy.asscalar() method, as explained here. So in the code above, for example:
coords = (lon[j],lat[i])
is replaced by
coords = (np.asscalar(lon[j]),np.asscalar(lat[i]))
and it also works for the rest of the variables coming from the netCDF file.
Anyway, thanks Bret for your comment that gave me the clue to solve it.
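A minimal self-contained sketch of the failure and the fix. One caveat for current readers: numpy.asscalar() was deprecated and later removed (in NumPy 1.23); the .item() method on a NumPy scalar is the direct replacement. The 48.5 latitude below is an arbitrary example value.

```python
import json
import numpy as np

lon = np.array([-5.4989638], dtype=np.float32)
val = lon[0]  # a numpy float32 scalar, not a Python float

# the json module rejects numpy scalar types such as float32
try:
    json.dumps({"lon": val})
except TypeError:
    pass  # "Object of type float32 is not JSON serializable"

# .item() converts any numpy scalar to the closest Python type
coords = (val.item(), 48.5)
print(json.dumps({"coords": coords}))
```

The geojson module raises its own "not JSON compliant number" error earlier, but the root cause is the same: NumPy scalars are not plain Python numbers.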