TestAccuracy above 1 in Caffe training - deep-learning

I was training a bvlc_googlenet using Caffe on a set of images with 5 output values. The test results I received are very unclear to me, as the test accuracy in many cases was above 1. How should I interpret those results? Is it an error?
Here are the test logs of the training, obtained using the command
/home/ubuntu/caffe/tools/extra/parse_log.sh train.log
https://pastebin.com/8KN6g7Rx

Please see this Caffe pull request, which fixes this bug.
As a workaround (if you do not want to merge this PR), you can use the Accuracy layer only during testing and disable it for training.
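For reference, here is a minimal sketch of that workaround as a prototxt generated with pycaffe's NetSpec; the layer names, shapes and file names are my own placeholders, not taken from the question. The point is the include rule on the Accuracy layer, which emits include { phase: TEST } so the layer only runs during testing.

```python
import caffe
from caffe import layers as L

n = caffe.NetSpec()
# Placeholder data/label and a single classifier layer; in practice these
# would be the layers of your own network.
n.data, n.label = L.DummyData(shape=[dict(dim=[32, 3, 224, 224]),
                                     dict(dim=[32])], ntop=2)
n.fc = L.InnerProduct(n.data, num_output=5, weight_filler=dict(type='xavier'))
n.loss = L.SoftmaxWithLoss(n.fc, n.label)
# Restrict the Accuracy layer to the TEST phase so it is never run while training.
n.accuracy = L.Accuracy(n.fc, n.label, include=dict(phase=caffe.TEST))

with open('train_val.prototxt', 'w') as f:
    f.write(str(n.to_proto()))
```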

Related

Tensorflow 2 Object Detection API - Can/Should I use K-Fold Cross Validation?

I have a small dataset of about 1000 images and am training my model to detect 8 classes. I divided my dataset in an 80:20 ratio (training:validation) and want to apply k-fold cross-validation so as to make the most of my dataset.
#1: Is this line of thinking proper, or am I misunderstanding something? In another post about k-fold cross-validation in object detection, someone mentioned that since we have confidence scores, we don't require k-fold cross-validation. However, I don't see a connection between training my model on the k folds and confidence scores.
#2: Is this something that has to be done manually, or does TensorFlow 2.x have the means to add k-fold cross-validation?
Any clarification would be greatly appreciated! Thanks!
Regarding your questions 1 and 2:
IMO, it would be proper to do k-fold. FYI, splitting the dataset into an 8:2 ratio is called the holdout method; AFAIK, it's not k-fold. When you do k-fold there are things you probably need to consider, such as class distribution, bounding box distribution, etc. However, as you don't provide any sample data or code, here is a similar discussion that might help you.
It has to be done manually. K-fold cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample; it's not something integrated into the framework itself.
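To make the "manually" part concrete, here is a rough sketch (not from the original answer; the file layout and the TFRecord helper are assumptions) of building k folds over an object-detection dataset with scikit-learn and training once per fold:

```python
import glob
from sklearn.model_selection import KFold

# ~1000 images, each with a matching annotation file (paths are assumptions).
image_paths = sorted(glob.glob('images/*.jpg'))

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(image_paths)):
    train_files = [image_paths[i] for i in train_idx]
    val_files = [image_paths[i] for i in val_idx]
    # For the TF2 Object Detection API you would now write one pair of
    # TFRecords (plus a pipeline.config) per fold, launch a separate training
    # run per fold, and average the evaluation metrics at the end.
    # write_tfrecords() is a hypothetical helper, not a library function.
    write_tfrecords(train_files, f'fold_{fold}_train.record')
    write_tfrecords(val_files, f'fold_{fold}_val.record')
```

If class distribution matters (as mentioned above), a stratified or group-aware splitter would be the thing to swap in for plain KFold.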

Custom translator - Model adjustment after training

I've used three parallel sentence files to train my custom translator model, with no dictionary files and no tuning files either. After training finished and I checked the test results, I wanted to make some adjustments to the model. Here are several questions:
Is it possible to tune the model after training? Am I right that the model can't be changed and the only way is to train a new model?
The best approach to adjusting the model is to use tune files. Is it correct?
There is no way to see an autogenerated tuning file, so I have to provide my own tuning file for a more manageable tuning process. Is that so?
Could you please describe how the tuning file is generated when I have 3 sentence files with different numbers of sentences (55k, 24k and 58k lines)? Are all tuning sentences taken from the first file, or from all three files proportionally to their size? Which logic is used?
I wish there were more authoritative answers on this; I'll share what I know as a fellow user.
What Microsoft Custom Translator calls "tuning data" is what is normally known as a validation set. It's just a way to avoid overfitting.
Is it possible to tune the model after training? Am I right that the model can't be changed and the only way is to train a new model?
Yes, with Microsoft Custom Translator you can only train a model based on the generic category you have selected for the project.
(With Google AutoML technically you can choose to train a new model based on one of your previous custom models. However, it's also not usable without some trial and error.)
The best approach to adjusting the model is to use tune files. Is it correct?
It's hard to make a definitive statement on this. The training set also has an effect. A good validation set on top of a bad training set won't get us good results.
There is no way to see an autogenerated tune file, so I have to provide my own tuning file for a more manageable tuning process. Is it so?
Yes, it seems to me that if you let it decide how to split your uploaded data into the training, tuning and test sets, you can only download the training set and the test set.
Maybe neither of those includes the tuning set, so theoretically you could diff them against your original upload to recover it. But that doesn't solve the problem of the split being different between different models.
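If you did want to try that diff, a small sketch (file names are assumptions, and this is my own idea rather than a documented workflow) could be as simple as a set difference between the uploaded sentences and the downloaded training/test sentences:

```python
def load_sentences(path):
    # One sentence per line, as in the parallel training files.
    with open(path, encoding='utf-8') as f:
        return set(line.strip() for line in f if line.strip())

uploaded = load_sentences('uploaded_training.en.txt')
downloaded = (load_sentences('downloaded_training.en.txt')
              | load_sentences('downloaded_test.en.txt'))

# Whatever was uploaded but appears in neither downloaded split is
# presumably what went into the autogenerated tuning set.
tuning_candidates = uploaded - downloaded
print(len(tuning_candidates), 'candidate tuning sentences')
```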
... Which logic is used?
Good question.

Graph interpretation deep learning

I'm trying to build a model that classifies sentences. I'm using a recurrent neural network (RNN) model with GRU cells ("GRUCell") and I have the following graph. The loss function I'm using is cross-entropy.
Can you please explain why the loss, after getting close to 0, spikes back up to 1 after each iteration?
I can't find any interpretation of this. Thank you.
[Image: training loss plot]
According to the information you have provided, it looks like the loss goes down at the end of a batch and goes back up again at the start of the next batch. This can be due to a high learning rate with not enough decay over time.
Try to tweak those parameters and see if that helps.
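For instance, a decaying learning-rate schedule could look like the sketch below. This assumes a TF2/Keras setup, which the question doesn't specify, and the numbers are placeholders to tune, not recommendations.

```python
import tensorflow as tf

# Exponentially decay the learning rate so later training steps take smaller steps.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,  # lower this too if the loss keeps spiking
    decay_steps=1000,            # apply the decay every 1000 training steps
    decay_rate=0.9)

optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
# model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy')
```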
Cheers

Multiple pretrained networks in Caffe

Is there a simple way (e.g. without modifying the Caffe code) to load weights from multiple pretrained networks into one network? The network contains some layers with the same dimensions and names as both pretrained networks.
I am trying to achieve this using NVidia DIGITS and Caffe.
EDIT: I thought it wouldn't be possible to do this directly from DIGITS, as confirmed by the answers. Can anyone suggest a simple way to modify the DIGITS code so that multiple pretrained networks can be selected? I checked the code a bit and thought the training script would be a good place to start, but I don't have in-depth knowledge of Caffe, so I'm not sure what the best/quickest way to achieve this would be.
As Shai suggested, there was no way of doing this, so I decided to clone the official repository and make the appropriate changes. I changed the code so that multiple pretrained networks can be loaded by using a colon as a separator.
I created a pull request on the official repository and my changes were then merged with the main branch of DIGITS, meaning it is now possible to use this functionality in DIGITS.
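In pycaffe terms, the idea behind a colon-separated list looks roughly like the sketch below (placeholder paths, not the actual DIGITS patch): Net.copy_from() copies weights by layer name, so applying several .caffemodel files in sequence fills in whichever layers each of them shares with the target net.

```python
import caffe

# Colon-separated list of pretrained weight files (placeholder paths).
pretrained = 'net_a.caffemodel:net_b.caffemodel'

net = caffe.Net('train_val.prototxt', caffe.TRAIN)
for weights in pretrained.split(':'):
    # copy_from() matches layers by name; layers missing from a given
    # .caffemodel are simply left untouched.
    net.copy_from(weights)
```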
AFAIK there is no straightforward way of doing so.
However, you can use net surgery to load the pretrained models and manually assign their weights to the target net. Once you have a single net with all the weights initialized according to the various pretrained models, you can save it and use it as a single pretrained model for the rest of your work.
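A minimal net-surgery sketch of that idea (the prototxt paths, weight files, and the assumption that layer names match are all placeholders): load each pretrained net, copy its parameters by layer name into the target net, then save the merged weights.

```python
import caffe

# Target net that contains layers from both pretrained networks.
target = caffe.Net('target.prototxt', caffe.TEST)

sources = [('net_a_deploy.prototxt', 'net_a.caffemodel'),
           ('net_b_deploy.prototxt', 'net_b.caffemodel')]

for proto, weights in sources:
    src = caffe.Net(proto, weights, caffe.TEST)
    for name in src.params:
        if name in target.params:
            # Copy every blob of the layer (typically weights, then bias).
            for i in range(len(src.params[name])):
                target.params[name][i].data[...] = src.params[name][i].data

# Use merged.caffemodel as the single pretrained model from here on.
target.save('merged.caffemodel')
```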

How do I see the history of my Jenkins build test results?

I've got a collection of Jenkins jobs which are all essentially test packs - running lots of JUnit tests.
I keep the results for 7 days and, with the aid of the global build stats plugin and build metrics plugin, I can get a percentage of the number of builds (test packs) that had at least one failure in the last week.
What I'm now interested in doing is getting the percentage of all test failures over one week, to get a better idea of how badly the set of builds failed - was it just one test that caused each build to fail, or all the tests? Is this possible with an existing plugin?
I know the data is there because the home page of any of my jobs has a graph on the right where the green area represents test passes and red fails, for all of the previous builds. This gives me some idea, but I'd like a figure to report with.
You may want to take a look at the Unit Test History Generator or Test Results Analyzer plugins.
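If neither plugin gives you the exact figure, the per-build JUnit counts are also exposed over Jenkins' JSON REST API, so you can aggregate them yourself. A rough sketch (the server URL, job name and 7-day filter are assumptions; authentication is omitted):

```python
import time
import requests

JENKINS = 'https://jenkins.example.com'   # placeholder server
JOB = 'my-test-pack'                      # placeholder job name
week_ago_ms = (time.time() - 7 * 24 * 3600) * 1000

builds = requests.get(
    f'{JENKINS}/job/{JOB}/api/json?tree=builds[number,timestamp]').json()['builds']

passed = failed = 0
for b in builds:
    if b['timestamp'] < week_ago_ms:
        continue
    r = requests.get(f'{JENKINS}/job/{JOB}/{b["number"]}/testReport/api/json')
    if r.status_code != 200:              # build may have no test report
        continue
    report = r.json()
    failed += report.get('failCount', 0)
    passed += report.get('passCount', 0)

total = passed + failed
if total:
    print(f'{failed}/{total} test results failed ({100.0 * failed / total:.1f}%)')
```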