I trained a model in TensorFlow, and saved it on disk.
Now I want to load it from checkpoint and print the trained parameters.
Something like:
classifier = tf.estimator.DNNClassifier(
feature_columns=feature_columns,
hidden_units=hidden_units,
warm_start_from=checkpoint_path)
print(parameters(classifier))
How do I do that?
I'm using tf version 1.14.
I think you can use these two methods get_variable_names() and get_variable_value() to retrieve the parameters in your classifier.
params = classifier.get_variable_names()
for p in params:
print(p, classifier.get_variable_value(p))
Related
The following code from the tutorial to pytorch data paraleelism reads strange to me:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
print("Let's use", torch.cuda.device_count(), "GPUs!")
# dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
model = nn.DataParallel(model)
model.to(device)
According to my best knowledge, mode.to(device) copy the data to GPU.
DataParallel splits your data automatically and sends job orders to multiple models on several GPUs. After each model finishes their job, DataParallel collects and merges the results before returning it to you.
If the DataParallel does the job of copying, what does the to(device) do here?
They add few lines in the tutorial to explain nn.DataParallel.
DataParallel splits your data automatically, and send job orders to multiple models on different GPUs using the data. After each model finishes their job, DataParallel collects and merges the results for you.
The above quote can be understood that nn.DataParallel is just a wrapper class to inform model.cuda() should make a multiple copies to GPUs.
In my case, I don't have any GPU on my laptop. I still call nn.DataParallel() without any problem.
import torch
import torchvision
model = torchvision.models.alexnet()
model = torch.nn.DataParallel(model)
# No error appears if I don't move the model to `cuda`
These days, I have tried a model which implemented by cntk. But I can't find a way to predict new pic with trained model.
The trained model saved as a checkpoint:
trainer.save_checkpoint(os.path.join(output_model_folder, "model_{}".format(best_epoch)))
Then I have gotten some files like:
So, I tried to load this model checkpoint like:
model = ct.load_model('../data/models/VGG13_majority/model_94')
the code above can run successfully. Then I tried
model.eval(image_data)
but I got an error:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ update ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
this time I have tried the method below:
model = ct.load_model('../data/models/VGG13_majority/model_94')
model.eval({model.arguments[0]: [final_image]})
then a new error raised:
For any C.Function.eval() you need to pass a dictionary as the argument.
So it will go something like this, assuming that you only have one input_variable into the model:
model = C.load_model()
model.eval({model.arguments[0]: image_data})
Anyhow, i noticed that you saved the model from the checkpoint. By doing so, you actually saved the "ground_truth" input_variable to the loss function too.
I would recommend next time that you saved the model directly. Usually the files from save_checkpoint is meant to be used in restore_from_checkpoint()
import cntk as C
from cntk.layers import Dense
model = Dense(10)(C.input_variable(1))
loss = C.binary_cross_entropy(model, C.input_variable(10))
trainer = C.Trainer(model, (loss,), [C.adam(model.parameters, 0.9, 0.9)])
trainer.save_checkpoint("hello")
model.save() # used this to save the model directly
# to recover model from checkpoint use below
trainer.restore_from_checkpoint("hello")
original_model = trainer.model
print(trainer)
for i in trainer.model.arguments:
print(i)
In the given example of MNIST in the Caffe installation.
For any given test image, how to get the softmax scores for each category and do some processing on them? Say compute the mean and variance of them.
I am newbie so a detail would help me a lot. I am able to train the model and use the testing feature to get the prediction but I am not sure which files are to be edited in order to get the above results.
You can use python interface
import caffe
net = caffe.Net('/path/to/deploy.prototxt', '/path/to/weights.caffemodel', caffe.TEST)
in_ = read_data(...) # this is up to you to read a sample and convert it to numpy array
out_ = net.forward(data=in_) # assuming your net expects "data" in blob
Now you have the output of your net in a dictionary out (keys are names of output blobs). You can run it in a loop on several examples etc.
I can try to answer your question. Assuming in your deploying net, the softmax layer is like below:
layer {
name: "prob"
type : "Softmax"
bottom: "fc6"
top: "prob"
}
In your python code that processes data, combining with the code #Shai provided, you can get the probability of each category by adding code based on #Shai's code:
predicted_prob = net.blobs['prob'].data
predicted_prob will be returned an array that contains the probabilities with all categories.
For example, if you only have two categories, predicted_prob[0][0] will be the probability that this testing data belongs to one category and predicted_prob[0][1] will be the probability of the other one.
PS:
If you don't want to write any additional python script, according to https://github.com/BVLC/caffe/tree/master/examples/mnist
it says this example will automatically do the testing every 500 iterations. "500" is defined in solver, such as https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt
So you need to trace back the caffe source code that processes the solver file. I guess it should be https://github.com/BVLC/caffe/blob/master/src/caffe/solver.cpp
I am not sure solver.cpp is the correct file you need to look at. But in this file, you can see it has functions of testing and calculation of some values. I hope it can give you some ideas if no one else can answer your question.
I am having trouble to access the coefficients of a support vector regression model (SVR) in scikit learn when the model is embedded in a pipeline and a grid search.
Consider the following example:
from sklearn.datasets import load_iris
import numpy as np
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVR
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import Pipeline
iris = load_iris()
X_train = iris.data
y_train = iris.target
clf = SVR(kernel='linear')
select = SelectKBest(k=2)
steps = [('feature_selection', select), ('svr', clf)]
pipeline = Pipeline(steps)
grid = GridSearchCV(pipeline, param_grid={"svr__C":[10,10,100],"svr__gamma": np.logspace(-2, 2)})
grid.fit(X_train, y_train)
This seems to work fine but when I try to access the coefficient of the best fitting model
grid.best_estimator_.coef_
I get an error message: AttributeError: 'Pipeline' object has no attribute 'coef_'.
I also tried to access the individual steps of the pipeline:
pipeline.named_steps['svr']
but could not find the coefficients there.
Just happened to come across the same problem and this post
had the answer:
grid.best_estimator_ contains an instance of the pipeline, which consists of steps. The last step should always be the estimator, so you should always find the coefficients at:
grid.best_estimator_.steps[-1][1].coef_
Is it possible to translate the info in a .caffemodel file such that it could be read by (for example) Matlab. That is, is there a way to write your model using something else that prototxt and import the weights trained using Caffe?
If the answer is "Nope, it's a binary file and will always remain that way", is there some documentation regarding the structure of the file so that one could extract the important information somehow?
As you know, .caffemodel consists of weights and biases.
A simple way to read weights and biases for a caffemodel given the prototxt would be to just load the network in Python and read the weights.
You can use:
import caffe
net = caffe.Net(<prototxt-file>,<model-file>,<phase>);
and access the params from net.params
source
I'll take VGG as an example
from caffe.proto import caffe_pb2
net = caffe_pb2.NetParameter()
caffemodel = sys.argv[1]
with open(caffemodel, 'rb') as f:
net.ParseFromString(f.read())
for i in net.layer:
print i.ListFields()[0][-1]
#conv1
#relu1
#norm1
#pool1
#conv2
#relu2
#norm2
#pool2
#conv3
#relu3
#conv4
#relu4
#conv5
#relu5
#pool5
#fc6
#relu6
#drop6
#fc7
#relu7
#drop7
#fc8
#prob