Deep Learning Training Loss does not decrease, and Validation Accuracy fluctuates - deep-learning

The validation accuracy of my CNN-based deep learning model fluctuates at certain epochs. I trained the model on frequency-domain images. Do I need to normalize them or not?
In addition, the training loss does not decrease; it stays flat. I trained for 80 epochs, but the training loss and validation accuracy have been stuck since epoch 20, with no improvement.
I have already tried tuning hyperparameters such as learning rate. I also added regularization parameters.

Related

CNN: Normal that the validation loss decreases much slower than training loss?

I'm training a CNN U-Net model for semantic segmentation of images. However, the training loss seems to decrease at a much faster rate than the validation loss. Is this normal?
I'm using a loss of 0.002
The training and validation loss can be seen in the image below:
Yes, this is perfectly normal.
As the NN learns, it fits the training samples, which it knows better at each iteration. The validation set is never used during training, which is why it is so important.
Basically:
as long as the validation loss decreases (even slightly), the NN is still learning to generalise better,
as soon as the validation loss stagnates, you should stop training,
if you keep training, the validation loss will likely increase again; this is called overfitting. Put simply, it means the NN learns the training data "by heart" instead of really generalising to unknown samples (such as those in the validation set).
We usually use early stopping to avoid the latter: if your validation loss doesn't improve in X iterations, stop training (X being a value such as 5 or 10).
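The early-stopping rule described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function name `early_stopping_epoch`, the `min_delta` tolerance, and the loss values are all made up for the example, and `patience` plays the role of X.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    val_losses: validation loss recorded after each epoch (made-up data here).
    patience:   the "X" from the answer; how long to wait without improvement.
    min_delta:  assumed tolerance; smaller improvements are ignored.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember it and reset the counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never ran out of patience: training ran to the last epoch.
    return len(val_losses) - 1


# Hypothetical curve: improves, stagnates, then rises (overfitting).
losses = [0.90, 0.70, 0.55, 0.50, 0.49, 0.50, 0.51, 0.52, 0.53, 0.55, 0.60]
stop_epoch = early_stopping_epoch(losses, patience=3)
print("stop at epoch", stop_epoch)
```

In a real training loop you would typically also keep the weights from the best epoch rather than the last one.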

Densenet with Hinge loss on CIFAR dataset

I am trying to use hinge loss with DenseNet on the CIFAR-100 dataset. Training converges to some point, and after that there is no further learning. The accuracy is much lower than with DenseNet and the cross-entropy loss function. I tried different learning rates and weight decays.
Any ideas on why I am unable to properly train DenseNet with hinge loss? I am able to use hinge loss with ResNet without any problem.

difference between accuracy from confusion matrix and validation accuracy

What is the difference between accuracy obtained from confusion matrix and validation accuracy obtained after several epochs?
I am new to deep learning
The validation accuracy (from each epoch) is the accuracy obtained at that point in training, i.e. for that particular combination of parameters/weights. It is not meant to be used for reporting. It is just a log of the status of the training/learning process.
If you want to measure the accuracy of a trained (fixed/frozen) model, be it deep learning or any other learning algorithm, use a separate validation set and compute it from a confusion matrix yourself (scikit-learn provides this in its sklearn.metrics module).
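As a minimal sketch of how accuracy follows from a confusion matrix: the correct predictions sit on the diagonal, so accuracy is the diagonal sum divided by the total count. The helper names and the labels below are invented for illustration; in practice `sklearn.metrics.confusion_matrix` and `sklearn.metrics.accuracy_score` do the same job.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Diagonal entries are correct predictions; divide by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels from a held-out set and the frozen model's predictions.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy_from_confusion(cm)
print(cm, acc)
```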

Is this a valid way to speed up kFold cross validations for deep neural network training?

In the context of convolutional neural network training, I need to do a 10-fold cross-validation of my training set. Training just 1 of the 10 folds takes at least one hour on my GPU, which means training all 10 folds independently would take at least 10 hours! To speed up training, will my k-fold result be valid if, for each of the remaining k-fold models (fold2, fold3... fold10), I load the weights of the fully trained model from the first fold (fold1) and fine-tune them? Is there any side effect?
That won't be doing any cross-validation.
The point of retraining the net is to ensure that it's trained on a different subset of your data, and for every full training, it keeps aside a set of validation data that it has never seen. If you reload your weights from a previous training instance, you're going to be validating against data your network has already seen, and your cross-validation score will be inflated.
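The leak can be made concrete with a toy sketch. Nothing here trains a real network: `init_weights` and `train` are hypothetical placeholders that only record which sample indices a model has seen, so we can count how many validation samples were already used for training.

```python
import random


def k_fold_splits(num_samples, k, seed=0):
    """Return a list of (train_indices, val_indices), one pair per fold."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        splits.append((train_idx, folds[i]))
    return splits


def init_weights():
    # Hypothetical stand-in for a freshly initialised network.
    return {"seen": set()}


def train(weights, train_idx):
    # Hypothetical stand-in for training: just records the samples seen.
    return {"seen": weights["seen"] | set(train_idx)}


splits = k_fold_splits(num_samples=20, k=5)

# Proper cross-validation: fresh weights for every fold.
fresh_leaks = [len(train(init_weights(), tr)["seen"] & set(va)) for tr, va in splits]

# Shortcut from the question: start folds 2..k from the fold-1 model.
fold1_weights = train(init_weights(), splits[0][0])
warm_leaks = [len(train(fold1_weights, tr)["seen"] & set(va)) for tr, va in splits[1:]]

print(fresh_leaks)  # validation samples already seen, per fold
print(warm_leaks)
```

With fresh weights no validation sample has been seen before; with the warm start, every validation sample of folds 2 to 5 was part of the fold-1 training set.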

How do we know when to stop training a model on a pre-trained model?

My apologies if my question sounds stupid, but I am quite new to deep learning and Caffe.
How can we determine how many iterations are required to fine-tune a pre-trained model on our own dataset? For example, I am running fcn32 on my own data with 5 classes. When can I stop the fine-tuning process by looking at the loss and accuracy of the training phase?
Many thanks
You shouldn't do it by looking at the loss or accuracy of the training phase. Theoretically, the training accuracy should always be increasing (which also means the training loss should always be decreasing), because you train the network to decrease the training loss. But a high training accuracy doesn't necessarily mean a high test accuracy; that is what we refer to as the over-fitting problem. So what you need to find is the point where the accuracy on the test set (or validation set, if you have one) stops increasing. You can do this by specifying a relatively large number of iterations at first and then monitoring the test accuracy or test loss. If the test accuracy stops increasing (or the loss stops decreasing) for N consecutive iterations (or epochs), where N could be 10 or another number of your choosing, stop the training process.
The best thing to do is to track training and validation accuracy and store snapshots of the weights every k iterations. To compute validation accuracy you need a separate set of held-out data which you do not use for training.
Then, you can stop once the validation accuracy stops increasing or starts decreasing. This is called early stopping in the literature. Keras, for example, provides functionality for this: https://keras.io/callbacks/#earlystopping
Also, it's good practice to plot the above quantities, because it gives you important insights into the training process. See http://cs231n.github.io/neural-networks-3/#accuracy for a great illustration (not specific to early stopping).
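A small sketch of the snapshot idea, assuming you log validation accuracy alongside the iteration count: keep a snapshot every k iterations and afterwards pick the one with the best validation accuracy. The function name `best_snapshot` and the logged numbers are hypothetical.

```python
def best_snapshot(history, snapshot_every):
    """Pick the stored snapshot with the highest validation accuracy.

    history:        list of (iteration, val_accuracy) pairs from a training log.
    snapshot_every: the "k" above; weights are only stored at these iterations.
    """
    snapshots = [(it, acc) for it, acc in history if it % snapshot_every == 0]
    return max(snapshots, key=lambda pair: pair[1])


# Hypothetical log: accuracy rises, peaks, then drifts down.
history = [(100, 0.52), (200, 0.61), (300, 0.68),
           (400, 0.71), (500, 0.70), (600, 0.69)]

best_iter, best_acc = best_snapshot(history, snapshot_every=200)
print("restore the snapshot from iteration", best_iter)
```

A smaller k gives a finer choice of stopping point at the cost of more stored snapshots.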
Hope this helps
Normally you converge to a specific validation accuracy for your model. In practice you normally stop training if the validation loss did not decrease in x epochs. Depending on your epoch duration, x most commonly varies between 5 and 20.
Edit:
In ML terms, an epoch is one pass over your training dataset. You do not seem to have a validation set. Normally the data is split into training and validation data, so you can see how well your model performs on unseen data and decide which model to take by looking at that performance. You might want to take a look at http://caffe.berkeleyvision.org/gathered/examples/mnist.html to see the usage of a validation set, even though they call it a test set.
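For illustration, a train/validation split can be as simple as shuffling once and cutting the list. This is a generic Python sketch, not Caffe-specific; the function name and the 20% validation fraction are assumptions.

```python
import random


def train_val_split(samples, val_fraction=0.2, seed=0):
    """Shuffle once, then hold out a fraction of the samples for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    # The first n_val samples are held out; the rest are used for training.
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical dataset of 100 sample ids.
train_set, val_set = train_val_split(range(100), val_fraction=0.2)
print(len(train_set), len(val_set))
```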