difference between accuracy from confusion matrix and validation accuracy - deep-learning

What is the difference between the accuracy obtained from a confusion matrix and the validation accuracy obtained after several epochs?
I am new to deep learning.

The validation accuracy (from each epoch) is the accuracy obtained from each trial (combination of parameters/weights). It is not meant for reporting; it is just a log of the status of the training/learning process.
If you want to measure the accuracy of a trained (fixed/frozen) model (be it deep learning or any other learning algorithm), use a separate validation set and compute the confusion matrix yourself (there is a metrics package in sklearn).
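As a minimal sketch of that last step, here is how the confusion matrix and the accuracy derived from it can be computed with sklearn. The labels below are made up for illustration; in practice `y_pred` would come from running the frozen model on held-out data.

```python
# Evaluating a frozen model on a held-out set with sklearn's metrics package.
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]  # ground-truth labels (illustrative)
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]  # model predictions (illustrative)

cm = confusion_matrix(y_true, y_pred)
print(cm)

# Accuracy is the trace of the confusion matrix divided by its sum,
# which matches sklearn's accuracy_score.
acc = cm.trace() / cm.sum()
print(acc, accuracy_score(y_true, y_pred))
```

This makes the relationship explicit: the reported accuracy is just the diagonal of the confusion matrix over all entries.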

Related

Deep Learning Training Loss does not decrease, and Validation Accuracy fluctuates

My CNN-based deep learning model fluctuates in validation accuracy at certain epochs. I trained the model on frequency-domain images. Do I need to normalize them or not?
Also, the training loss does not decrease and is stable. I tried 80 epochs; the training loss and validation accuracy have been stuck since epoch 20, with no improvement.
I have already tried tuning hyperparameters such as the learning rate. I also added regularization parameters.

Deep Learning for Acoustic Emission in concrete fracture specimens: regression of onset time and classification of type of failure

How can I use deep learning for both regression and classification tasks?
I am facing a problem with acoustic emission from fractures in concrete specimens. The objective is to automatically find the onset time instant (the time at the beginning of the acoustic emission) and the slope to the peak value, in order to determine the kind of fracture (mode I or mode II, based on the rise angle RA).
I have tried a region-based CNN working with images of the signals (fine-tuning Faster R-CNN using PyTorch), but unfortunately the results are not outstanding so far.
I would like to work with sequences (time series) of amplitude data at a certain sampling frequency, but each recording has a different length. How can I deal with this problem?
Can I build a 1D-CNN that performs a sort of anomaly detection based on the supervised points that I can mark manually on training examples?
I have a number of recordings, sampled at 100 Hz, that I would like to use to train the model. In examples of anomaly detection, such as Timeseries anomaly detection using an Autoencoder, they use a single time series and slide a window one time step at a time to obtain about 3700 samples to train their neural network. Instead, I have several recordings (time series), each with its own onset time instant and a different overall length in seconds. How can I manage this?
I actually need the time instant of the beginning of the signal and the maximum point to define the rise angle and classify the type of fracture. Can I perform classification with a CNN simultaneously with a regression task for the onset time instant?
Thank you in advance!
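One common way to handle the differing recording lengths described above is to cut every recording into fixed-size windows and label each window by whether it contains the manually marked onset. The sketch below illustrates this under stated assumptions: `recordings`, `onsets`, `window`, and `hop` are hypothetical names, and the signals are random stand-ins for real 100 Hz acoustic-emission data.

```python
# Fixed-size windowing of variable-length recordings, with per-window
# labels derived from a manually marked onset sample index.
import numpy as np

def make_windows(signal, onset_idx, window=1024, hop=256):
    """Return (window, label) pairs; label is 1 if the onset falls inside."""
    pairs = []
    for start in range(0, len(signal) - window + 1, hop):
        label = 1 if start <= onset_idx < start + window else 0
        pairs.append((signal[start:start + window], label))
    return pairs

rng = np.random.default_rng(0)
recordings = [rng.standard_normal(n) for n in (5000, 8000)]  # unequal lengths
onsets = [1200, 4000]  # marked onset sample index for each recording

X, y = [], []
for sig, onset in zip(recordings, onsets):
    for win, lab in make_windows(sig, onset):
        X.append(win)
        y.append(lab)

X, y = np.stack(X), np.array(y)  # every row now has the same length
print(X.shape, int(y.sum()))
```

Because every window has the same length, the stacked array can be fed to a 1D-CNN regardless of how long each original recording was.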
I finally solved it, thanks to the fundamental suggestion by @JonNordby, using a Sound Event Detection method. We adopted and re-adapted the code from YashNita's GitHub repository.
I labelled the data according to the following image:
Then, I adopted the method of extracting features by computing the spectrogram of the input signals:
And finally we were able to get a more precise recognition of the seismic event detection output, which is directly connected to acoustic emission event detection, obtaining the following result:
For the moment, only the event-recognition phase is done, but it would be simple to re-adapt it to also classify mode I versus mode II cracking.
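A hedged sketch of the spectrogram feature-extraction step mentioned above, using scipy rather than the original repository's code. The signal here is synthetic; a real acoustic-emission recording sampled at 100 Hz would be loaded in its place, and `nperseg`/`noverlap` are illustrative choices.

```python
# Spectrogram features from a 1D signal, as input to an event-detection model.
import numpy as np
from scipy.signal import spectrogram

fs = 100.0                    # sampling frequency in Hz (as in the post)
t = np.arange(0, 10, 1 / fs)  # 10 s of signal -> 1000 samples
signal = np.sin(2 * np.pi * 5 * t) + 0.1 * np.random.randn(t.size)

# f: frequency bins, tt: segment times, Sxx: power spectrogram
f, tt, Sxx = spectrogram(signal, fs=fs, nperseg=64, noverlap=32)
log_spec = np.log1p(Sxx)      # log compression, a common feature transform
print(log_spec.shape)         # (frequency bins, time frames)
```

The resulting 2D array can then be treated like an image by a CNN, which is the usual bridge between raw waveforms and sound-event-detection models.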

Validation accuracy when training and testing within same loop

I am training and testing within the same loop: for each epoch on the training set, the network is applied to the entire validation set.
Does it make sense to take the highest validation accuracy I get at some instant (the nth epoch) as my network's highest accuracy, or should I only use the validation accuracy once the graph has settled and the weights no longer change?
I think you are confusing testing with validation. If possible, you should keep a separate test portion of your dataset for testing only AFTER training and validation are done.
Although you can use that test set for validation, it is best practice to do inference on a separate set of data that the model has never seen before.
So, answering your question: a spike in your validation accuracy may or may not be the highest you will see, since more epochs generally mean more chances of reaching a higher accuracy (assuming the model is not overfitting the dataset).
In this situation, it is best to save the model after every validation-accuracy spike, so that you keep the model with the highest validation accuracy once training and validation have completed.
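One way to do that saving automatically is a Keras `ModelCheckpoint` callback with `save_best_only=True`. The model and data below are throwaway placeholders; only the callback configuration is the point, and the filename `best_model.keras` is an assumption.

```python
# Save the model whenever validation accuracy improves (Keras callback).
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.keras",
    monitor="val_accuracy",  # watch validation accuracy each epoch
    save_best_only=True,     # overwrite only when it improves
    mode="max",
)

X = np.random.rand(100, 4)
y = (X.sum(axis=1) > 2).astype(int)
model.fit(X, y, validation_split=0.2, epochs=5,
          callbacks=[checkpoint], verbose=0)
```

After training, `best_model.keras` holds the weights from the epoch with the highest validation accuracy, not merely the last epoch.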

Is this a valid way to speed up k-fold cross-validation for deep neural network training?

In the context of convolutional neural network training, I need to do a 10-fold cross-validation of my training set. Training just 1 of the 10 folds takes at least one hour on my GPU, which means the total time for training all 10 folds independently would be at least 10 hours! To speed up training, will my k-fold result be valid if, for each of the remaining folds (fold 2, fold 3, ..., fold 10), I load and tune the trained weights from the fully trained model of the first fold (fold 1)? Are there any side effects?
That won't be doing any cross-validation.
The point of retraining the network from scratch is to ensure that each run is trained on a different subset of your data, and that every full training run holds out a validation set it has never seen. If you reload your weights from a previous training instance, you will be validating against data your network has already seen, and your cross-validation score will be inflated.
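The correct pattern can be sketched as follows: re-initialize the model inside the fold loop so each fold's validation data is truly unseen. `build_model` and `train_and_score` are hypothetical stand-ins for your actual network constructor and fit/evaluate code.

```python
# Proper k-fold: a fresh, randomly initialized model for every fold.
import numpy as np
from sklearn.model_selection import KFold

def build_model():
    # stand-in for constructing a fresh network with random weights
    return {"weights": np.random.randn(4)}

def train_and_score(model, X_tr, y_tr, X_va, y_va):
    # stand-in for fit + evaluate; returns a dummy score
    return float(np.mean(y_va == y_va))

X = np.random.rand(20, 4)
y = np.random.randint(0, 2, size=20)

scores = []
kf = KFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, val_idx in kf.split(X):
    model = build_model()  # fresh weights each fold: never reuse fold 1's
    scores.append(train_and_score(model, X[train_idx], y[train_idx],
                                  X[val_idx], y[val_idx]))
print(len(scores), np.mean(scores))
```

The speed cost is exactly why k-fold is expensive; warm-starting from another fold's weights removes the independence that makes the score meaningful.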

How do we know when to stop fine-tuning a pre-trained model?

My apologies, since my question may sound stupid. But I am quite new to deep learning and Caffe.
How can we tell how many iterations are required to fine-tune a pre-trained model on our own dataset? For example, I am running FCN-32 on my own data with 5 classes. When can I stop the fine-tuning process by looking at the loss and accuracy of the training phase?
Many thanks
You shouldn't do it by looking at the loss or accuracy of the training phase. In theory, the training accuracy should always increase (which also means the training loss should always decrease), because you train the network to minimize the training loss. But a high training accuracy doesn't necessarily mean a high test accuracy; that is what we refer to as the overfitting problem. So what you need to find is the point where the accuracy on the test set (or validation set, if you have one) stops increasing. You can do this simply by specifying a relatively large number of iterations at first and then monitoring the test accuracy or test loss: if the test accuracy stops increasing (or the loss stops decreasing) for N consecutive iterations (or epochs), where N could be 10 or another number you choose, stop the training process.
The best thing to do is to track training and validation accuracy and store snapshots of the weights every k iterations. To compute validation accuracy, you need a separate set of held-out data that you do not use for training.
Then, you can stop once the validation accuracy stops increasing or starts decreasing. This is called early stopping in the literature. Keras, for example, provides functionality for this: https://keras.io/callbacks/#earlystopping
Also, it's good practice to plot the above quantities, because it gives you important insights into the training process. See http://cs231n.github.io/neural-networks-3/#accuracy for a great illustration (not specific to early stopping).
Hope this helps
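The stopping rule described above (no improvement for N checks) can be sketched without any framework at all. The validation-accuracy sequence below is invented purely for illustration; in practice it would come from your training loop or Keras history.

```python
# Minimal early-stopping rule: stop once the validation metric has not
# improved for `patience` consecutive checks.
def early_stop_index(val_accuracies, patience=3):
    """Return the index at which training would stop (or the last index)."""
    best, best_idx = float("-inf"), 0
    for i, acc in enumerate(val_accuracies):
        if acc > best:
            best, best_idx = acc, i          # new best: reset the counter
        elif i - best_idx >= patience:
            return i                         # patience exhausted: stop here
    return len(val_accuracies) - 1

history = [0.60, 0.65, 0.70, 0.69, 0.71, 0.70, 0.70, 0.69, 0.68]
print(early_stop_index(history))
```

This is the same logic that Keras's `EarlyStopping` callback implements, with its `patience` parameter playing the role of N.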
Normally you converge to a specific validation accuracy for your model. In practice, you normally stop training if the validation loss has not decreased in x epochs. Depending on your epoch duration, x may vary, most commonly between 5 and 20.
Edit:
An epoch is one iteration over your dataset for training, in ML terms. You do not seem to have a validation set. Normally the data is split into training and validation data, so you can see how well your model performs on unseen data and make decisions about which model to take by looking at it. You might want to take a look at http://caffe.berkeleyvision.org/gathered/examples/mnist.html to see the usage of a validation set, even though they call it a test set.
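Carving out that validation set is a one-liner with sklearn. The arrays below are placeholders for your real dataset; `test_size=0.2` (an 80/20 split) and the `stratify` option are illustrative choices.

```python
# Splitting data into training and validation portions with sklearn.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)  # 50 samples, 2 features (dummy data)
y = np.arange(50) % 2              # dummy binary labels

X_train, X_val, y_train, y_val = train_test_split(
    X, y,
    test_size=0.2,     # hold out 20% for validation
    random_state=42,   # reproducible split
    stratify=y,        # preserve the class balance in both portions
)
print(X_train.shape, X_val.shape)
```

The model is then fit only on `X_train`/`y_train`, while `X_val`/`y_val` provide the unseen data used for the stopping decisions discussed above.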