I am using caffe to train a deep network, and have set the display at 200 iterations in my solver prototext file. However, instead of getting the loss and accuracy, I am getting a large number of lines of the form
solver.cpp:245] Train net output #{no.}: prob = {no.}.
The prob values are the probabilities of output of softmax layer(the last layer in my network).
This display is of no real use to me. I am interested in seeing only the rate at which the accuracy is evolving with the number of iterations. Can someone suggest a way in which I may print only the relevant parameters to stdout? Is there a way to control in general what is printed by caffe to stdout?
Thanks.
(Note: I am using caffe executable for training on ubuntu)
Related
What is going wrong with this code? I have generated adversarial images using cleverhans API - generate_np method. And using the default cleverhans CNN classifier to classify the images. The test accuracy is very low as expected when I use the model after generating the images. But if I save and reload the model, the accuracy is too high. Please check the code here.
https://github.com/csesivakumar/Adversarial_Defense/blob/master/Cleverhans_generatenp.ipynb
Python: 3.6
Pasting my answer from the GitHub issue tracker in case others are facing the same issue:
From your code it looks like you are initializing the model's weights, defining the tf session, etc... after having trained the model using Keras. My guess is that the adv_x array does not contain images that are adversarial. This would explain why the accuracy output by [22] is close to random---because the model weights are random. When you restore the model, its weights are set again to the values learned during training so the accuracy is restored (because the images are not adversarial).
I have a pretrained CNN (Resnet-18) trained on Imagenet, now i want to extend it on my own dataset of video frames , now the point is all tutorials i found on Finetuning required dataset to be organised in classes like
class1/train/
class1/test/
class2/train/
class2/test/
but i have only frames on many videos , how will i train my CNN on it.
So can anyone point me in right direction , any tutorial or paper etc ?
PS: My final task is to get deep features of all frames that i provide at the time of testing
for training network, you should have some 'label'(sometimes called y) of your input data. from there, network calculate loss between logit(answer of network) and the given label.
And the network will self-revise using that loss value by backpropagating. that process is what we call 'training'.
Because you only have input data, not label, so you can get the logit only. that means a loss cannot be calculated.
Fine tuning is almost same word with 'additional training', so that you cannot fine tuning your pre-trained network without labeled data.
About train set & test set, that is not the problem right now.
If you have enough labeled input data, you can divide it with some ratio.
(e.g. 80% of data for training, 20% of data for testing)
the reason why divide data into these two sets, we want to check the performance of our trained network more general, unseen situation.
However, if you just input your data into pre-trained network(encoder part), it will give a deep feature. It doesn't exactly fit to your task, still it is deep feature.
Added)
Unsupervised pre-training for convolutional neural network in theano
here is the method you need, deep feature encoder in unsupervised situation. I hope it will help.
My apologies since my question may sound stupid question. But I am quite new in deep learning and caffe.
How can we detect how many iterations are required to fine-tune a pre-trained on our own dataset? For example, I am running fcn32 for my own data with 5 classes. When can I stop the fine-tuning process by looking at the loss and accuracy of training phase?
Many thanks
You shouldn't do it by looking at the loss or accuracy of training phase. Theoretically, the training accuracy should always be increasing (also means the training loss should always be decreasing) because you train the network to decrease the training loss. But a high training accuracy doesn't necessary mean a high test accuracy, that's what we referred as over-fitting problem. So what you need to find is a point where the accuracy of test set (or validation set if you have it) stops increasing. And you can simply do it by specifying a relatively larger number of iteration at first, then monitor the test accuracy or test loss, if the test accuracy stops increasing (or the loss stops decreasing) in consistently N iterations (or epochs), where N could be 10 or other number specified by you, then stop the training process.
The best thing to do is to track training and validation accuracy and store snapshots of the weights every k iterations. To compute validation accuracy you need to have a sparate set of held out data which you do not use for training.
Then, you can stop once the validation accuracy stops increasing or starts decreasing. This is called early stopping in the literature. Keras, for example, provides functionality for this: https://keras.io/callbacks/#earlystopping
Also, it's good practice to plot the above quantities, because it gives you important insights into the training process. See http://cs231n.github.io/neural-networks-3/#accuracy for a great illustration (not specific to early stopping).
Hope this helps
Normally you converge to a specific validation accuracy for your model. In practice you normally stop training, if the validation loss did not increase in x epochs. Depending on your epoch duration x may vary most commonly between 5 and 20.
Edit:
An epoch is one iteration over your dataset for trainig in ML terms. You do not seem to have a validation set. Normally the data is split into training and validation data so you can see how well your model performs on unseen data and made decisions about which model to take by looking at this data. You might want to take a look at http://caffe.berkeleyvision.org/gathered/examples/mnist.html to see the usage of a validation set, even though they call it test set.
I have a pre-trained network with which I would like to test my data. I defined the network architecture using a .prototxt and my data layer is a custom Python Layer that receives a .txt file with the path of my data and its label, preprocess it and then feed to the network.
At the end of the network, I have a custom Python layer that get the class prediction made by the net and the label (from the first layer) and print, for example, the accuracy regarding all batches.
I would like to run the network until all examples have passed through the net.
However, while searching for the command to test a network, I've found:
caffe test -model architecture.prototxt -weights model.caffemodel -gpu 0 -iterations 100
If I don't set the -iterations, it uses the default value (50).
Does any of you know a way to run caffe test without setting the number of iterations?
Thank you very much for your help!
No, Caffe does not have a facility to detect that it has run exactly one epoch (use each input vector exactly once). You could write a validation input routine to do that, but Caffe expects you to supply the quantity. This way, you can generate easily comparable results for a variety of validation data sets. However, I agree that it would be a convenient feature.
The lack of this feature might be related to its lack for training and the interstitial testing.
In training, we tune the hyper-parameters to get the most accurate model for a given application. As it turns out, this is more closely dependent on TOTAL_NUM than on the number of epochs (given a sufficiently large training set).
With a fixed training set, we often graph accuracy (y-axis) against epochs (x-axis), because that gives tractable results as we adjust batch size. However, if we cut the size of the training set in half, the most comparable graph would scale on TOTAL_NUM rather than the epoch number.
Also, by restricting the size of the test set, we avoid long waits for that feedback during training. For instance, in training against the ImageNet data set (1.2M images), I generally test with around 1000 images, typically no more than 5 times per epoch.
I'm working with Caffe and am interested in comparing my training and test errors in order to determine if my network is overfitting or underfitting. However, I can't seem to figure out how to have Caffe report training error. It will show training loss (the value of the loss function computed over the batch), but this is not useful in determining if the network is overfitting/underfitting. Is there a straightforward way to do this?
I'm using the Python interface to Caffe (pycaffe). If I could get access to the raw training set somehow, I could just put batches through with forward passes and evaluate the results. But, I can't seem to figure out how to access more than the currently-processing batch of training data. Is this possible? My data is in a LMDB format.
In the train_val.prototxt file change the source in the TEST phase to point to the training LMDB database (by default it points to the validation LMDB database) and then run this command:
$ ./build/tools/caffe test -solver models/bvlc_reference_caffenet/solver.prototxt -weights models/bvlc_reference_caffenet/<caffenet_train_iter>.caffemodel -gpu 0