I'm working on recognizing the digits 3 and 7 using the MNIST data set. I'm using the cnn_learner() function from the fastai library. When I plotted the learning rate, the curve started going backwards after a certain value on the X-axis. Can someone please explain what this signifies?
The curve's behavior is normal, as weird as it may look. It plots the loss against a range of learning rate values, and it helps you determine at which learning rate your loss will be at its minimum (and where it starts to blow up).
This is a method of finding the best learning rate for your model. You should choose a value that reduces your loss. However, if you pick a learning rate that is too small just because the loss looks minimal there, your model will train painfully slowly.
You can use the following link to get a better understanding of finding the optimal learning rate for your model.
Learning rate finder
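For reference, on the 3-vs-7 MNIST sample the whole recipe is only a few lines. Here is a minimal sketch (the exact API depends on your fastai version; newer releases rename cnn_learner to vision_learner):

    from fastai.vision.all import *

    # 3-vs-7 subset of MNIST that ships with fastai
    path = untar_data(URLs.MNIST_SAMPLE)
    dls = ImageDataLoaders.from_folder(path, train='train', valid='valid')

    learn = cnn_learner(dls, resnet18, metrics=error_rate)
    suggestion = learn.lr_find()   # plots loss vs. learning rate and returns a suggestion
    print(suggestion)              # pick a value a bit before the loss starts shooting up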
Related
When using the reinforcement learning model DDPG, the input data are sequences, with a high-dimensional (21-dimensional) state and a low-dimensional (1-dimensional) action. Does this have any negative impact on the training of the model? How can I solve it?
In general, in any machine learning scenario, dimensionality per se is not a problem; it is mostly a matter of how much variability there is in the input data. Of course, higher-dimensional data can have much higher variability than lower-dimensional data.
Even considering this, the problem can "easily" be solved by feeding more data to the ML algorithm and increasing the complexity that it is allowed to represent (i.e. more nodes and/or layers in a neural network).
In RL, this is even less of a problem because you don't really have a restriction on how much data you actually have. You can always run your agent some more on the environment to get more sample trajectories to train on. The only issue you might find here is that your computing time grows a lot (depending on how much more you need to train on the environment for this problem).
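To make the "more nodes and/or layers" point concrete, here is a rough tf.keras sketch of a DDPG-style actor for a 21-dimensional state and a 1-dimensional action; the hidden sizes are arbitrary and are exactly the capacity knob you would turn:

    import tensorflow as tf

    STATE_DIM, ACTION_DIM = 21, 1

    def make_actor(hidden=(256, 256)):
        # Add units or layers here if the 21-D state turns out to be hard to fit.
        inputs = tf.keras.Input(shape=(STATE_DIM,))
        x = inputs
        for units in hidden:
            x = tf.keras.layers.Dense(units, activation="relu")(x)
        # tanh keeps the 1-D action in [-1, 1]; rescale it to your action bounds.
        action = tf.keras.layers.Dense(ACTION_DIM, activation="tanh")(x)
        return tf.keras.Model(inputs, action)

    actor = make_actor()
    actor.summary()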
In deep reinforcement learning, is there any way to decay the learning rate with respect to the cumulative reward? I mean, decay the learning rate once the agent is able to learn and maximize the reward?
It is common to modify the learning rate as a function of the number of steps, so it would certainly be possible to modify it as a function of cumulative reward.
One risk would be that you do not know what reward you are seeking at the beginning of training, so reducing the learning rate too early is a common problem. If you target a reward of 80, with the learning rate declining sharply as you attain that value, you will never know if your algorithm could have attained 90, as learning will stop at 80.
Another problem is setting the target too high. If you set the target for 100, meaning that the learning rate does not reduce as you reach 85, the instability may mean that the algorithm cannot converge well enough to reach 90.
So in general, I think people try a variety of learning-rate schedules and, if possible, let the algorithms run for plenty of time to see whether they converge.
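If you do want to try it, the mechanics are simple. Here is a minimal tf.keras sketch that maps a running cumulative reward onto the optimizer's learning rate; the target of 80 and the bounds are arbitrary placeholders:

    import tensorflow as tf

    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

    def reward_based_lr(cumulative_reward, target=80.0, lr_max=1e-3, lr_min=1e-5):
        """Linearly anneal the learning rate as the reward approaches the target."""
        progress = min(max(cumulative_reward / target, 0.0), 1.0)
        return lr_max - progress * (lr_max - lr_min)

    # inside the training loop, after each episode:
    cumulative_reward = 42.0   # whatever the last episode (or a running average) returned
    optimizer.learning_rate.assign(reward_based_lr(cumulative_reward))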
Call for experts in deep learning.
Hey, I have recently been working on training images with TensorFlow in Python for tone mapping. To get better results, I focused on the perceptual loss introduced in this paper by Justin Johnson.
In my implementation, I make use of all three parts of the loss: a feature loss extracted from VGG16, an L2 pixel-level loss between the transferred image and the ground-truth image, and the total variation loss. I sum them up as the loss for backpropagation.
From the function
$$\hat{y} = \arg\min_{y} \; \lambda_c \, \text{loss}_{\text{content}}(y, y_c) + \lambda_s \, \text{loss}_{\text{style}}(y, y_s) + \lambda_{TV} \, \text{loss}_{TV}(y)$$
in the paper, we can see that there are three weights for the losses, the λ's, to balance them. The values of the three λ's are presumably kept fixed throughout training.
My question is: does it make sense to dynamically change the λ's every epoch (or every few epochs) to adjust the relative importance of these losses?
For instance, the perceptual loss converges drastically in the first several epochs, yet the pixel-level L2 loss converges fairly slowly. So maybe the weight should be higher for the content loss at first, say 0.9, and lower for the others. As time passes, the pixel-level loss becomes increasingly important to smooth the image and minimize artifacts, so it might be better to adjust its weight upwards a bit. It would be just like changing the learning rate across epochs.
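Concretely, the kind of schedule I have in mind looks roughly like this (a simplified sketch: vgg_features stands for a VGG16 feature extractor, the numbers are made up, and I dropped the style term for brevity):

    import tensorflow as tf

    # Loss weights kept as non-trainable variables so a callback can change them per epoch.
    lambda_c  = tf.Variable(0.9,  trainable=False)   # content / perceptual loss weight
    lambda_p  = tf.Variable(0.05, trainable=False)   # pixel-level L2 loss weight
    lambda_tv = tf.Variable(0.05, trainable=False)   # total variation loss weight

    def combined_loss(vgg_features, y_true, y_pred):
        feat  = tf.reduce_mean(tf.square(vgg_features(y_pred) - vgg_features(y_true)))
        pixel = tf.reduce_mean(tf.square(y_pred - y_true))
        tv    = tf.reduce_mean(tf.image.total_variation(y_pred))
        return lambda_c * feat + lambda_p * pixel + lambda_tv * tv

    class WeightSchedule(tf.keras.callbacks.Callback):
        """Shift emphasis from the perceptual loss to the pixel loss over 50 epochs."""
        def on_epoch_begin(self, epoch, logs=None):
            t = min(epoch / 50.0, 1.0)
            lambda_c.assign(0.9 - 0.5 * t)    # 0.9 -> 0.4
            lambda_p.assign(0.05 + 0.5 * t)   # 0.05 -> 0.55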
The postdoc who supervises me strongly opposes my idea. He thinks it amounts to dynamically changing the training objective and could make the training inconsistent.
So, pros and cons; I need some ideas...
Thanks!
It's hard to answer this without knowing more about the data you're using, but in short, dynamic loss weights should not really have much effect, and may even have the opposite effect altogether.
If you are using Keras, you could simply run a hyperparameter tuner similar to the following in order to see if there is any effect (change the loss accordingly):
https://towardsdatascience.com/hyperparameter-optimization-with-keras-b82e6364ca53
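For example, something along these lines, which searches over fixed λ's instead of scheduling them (a toy sketch: it assumes the keras-tuner package, the tiny conv model plus random data only stand in for your real tone-mapping setup, and the VGG term is dropped to keep it self-contained):

    import numpy as np
    import tensorflow as tf
    import keras_tuner as kt

    def combined_loss(lam_pixel, lam_tv):
        def loss(y_true, y_pred):
            pixel = tf.reduce_mean(tf.square(y_true - y_pred))
            tv = tf.reduce_mean(tf.image.total_variation(y_pred))
            return lam_pixel * pixel + lam_tv * tv
        return loss

    def build_model(hp):
        # The loss weights are the hyperparameters being searched over.
        lam_pixel = hp.Float("lambda_pixel", 0.1, 10.0, sampling="log")
        lam_tv = hp.Float("lambda_tv", 1e-6, 1e-3, sampling="log")
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu",
                                   input_shape=(32, 32, 3)),
            tf.keras.layers.Conv2D(3, 3, padding="same"),
        ])
        model.compile(optimizer="adam", loss=combined_loss(lam_pixel, lam_tv))
        return model

    x = np.random.rand(64, 32, 32, 3).astype("float32")
    y = np.random.rand(64, 32, 32, 3).astype("float32")

    tuner = kt.RandomSearch(build_model, objective="val_loss",
                            max_trials=5, overwrite=True, directory="lambda_search")
    tuner.search(x, y, validation_split=0.25, epochs=2)
    print(tuner.get_best_hyperparameters(1)[0].values)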
I've only done this on smaller models (it's way too time-consuming otherwise), but in essence it's best to keep the weights constant, and it also avoids angering your supervisor :D
If you are using a different ML or DL library, there are tuners for each; just Google them. It may be best to run these on a cluster overnight, but they usually give you a good enough optimized version of your model.
Hope that helps and good luck!
I have implemented a custom OpenAI Gym environment for a game similar to http://curvefever.io/, but with discrete actions instead of continuous ones. So in each step my agent can go in one of four directions: left/up/right/down. However, one of these actions will always lead to the agent crashing into itself, since it can't "reverse".
Currently I just let the agent take any move and let it die if it makes an invalid move, hoping that it will eventually learn not to take that action in that state. I have, however, read that one can set the probability of making an illegal move to zero and then sample an action. Is there any other way to tackle this problem?
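In case it helps, this is roughly what I mean by setting the probability of the illegal move to zero before sampling (a sketch assuming the policy outputs one raw score/logit per direction):

    import numpy as np

    def sample_masked_action(logits, invalid_action):
        """Sample an action after forcing the invalid (reverse) move's probability to zero."""
        masked = logits.copy()
        masked[invalid_action] = -np.inf          # softmax then gives it exactly zero probability
        probs = np.exp(masked - masked.max())     # numerically stable softmax
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)

    # 0=left, 1=up, 2=right, 3=down; if the agent is moving right, "left" is the reverse move
    action = sample_masked_action(np.array([0.2, 1.1, -0.3, 0.5]), invalid_action=0)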
You can try to solve this with two changes:
1: Give the current direction as an input, and give a reward of maybe +0.1 when the agent takes a move that does not make it crash, and -0.7 when it takes the backward move that directly makes it crash.
2: If you are using a neural network with a softmax activation in the last layer, multiply all outputs of the network by a positive integer (a "confidence" factor) before giving them to the softmax (see the sketch below). In my experience it can be in the range 0 to 100; values beyond 100 do not change much. The larger the integer, the more confidently the agent will pick its preferred action for a given state.
If you are not using a neural network, or deep learning in general, I suggest you learn the concepts of deep learning, as your game environment seems complex and a neural network will give the best results.
Note: it will take a huge amount of time, so you have to wait long enough to train the algorithm. I suggest you do not hurry and let it train. Also, I played the game, it's really interesting :) my best wishes for building an AI for it :)
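A quick sketch of the confidence multiplier from point 2 (the factor of 10 is just an example; it acts like an inverse temperature):

    import numpy as np

    def softmax(z):
        z = z - z.max()            # numerical stability
        e = np.exp(z)
        return e / e.sum()

    scores = np.array([0.2, 1.1, -0.3, 0.5])   # raw network outputs for the four moves

    print(softmax(scores))        # relatively flat distribution
    print(softmax(10 * scores))   # sharper: the preferred move dominates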
My apologies, since my question may sound stupid. But I am quite new to deep learning and Caffe.
How can we tell how many iterations are required to fine-tune a pre-trained network on our own dataset? For example, I am running FCN-32 on my own data with 5 classes. When can I stop the fine-tuning process by looking at the loss and accuracy of the training phase?
Many thanks
You shouldn't do it by looking at the loss or accuracy of the training phase. Theoretically, the training accuracy should always be increasing (which also means the training loss should always be decreasing), because you train the network to decrease the training loss. But a high training accuracy doesn't necessarily mean a high test accuracy; that's what we refer to as the over-fitting problem.
So what you need to find is the point where the accuracy on the test set (or validation set, if you have one) stops increasing. You can do this simply by specifying a relatively large number of iterations at first and then monitoring the test accuracy or test loss: if the test accuracy stops increasing (or the loss stops decreasing) for N consecutive iterations (or epochs), where N could be 10 or some other number you choose, stop the training process.
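The monitoring logic is just a patience counter. A framework-agnostic sketch (the evaluate function is a placeholder for whatever trains one epoch and returns the validation accuracy):

    def train_with_patience(evaluate, max_epochs=1000, patience=10):
        """Stop when validation accuracy hasn't improved for `patience` epochs."""
        best_acc, best_epoch = float("-inf"), 0
        for epoch in range(max_epochs):
            val_acc = evaluate(epoch)                  # train one epoch, return validation accuracy
            if val_acc > best_acc:
                best_acc, best_epoch = val_acc, epoch  # also snapshot the weights here
            elif epoch - best_epoch >= patience:
                print(f"Stopping at epoch {epoch}; best {best_acc:.4f} at epoch {best_epoch}")
                break
        return best_epoch, best_acc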
The best thing to do is to track training and validation accuracy and store snapshots of the weights every k iterations. To compute validation accuracy, you need a separate set of held-out data that you do not use for training.
Then, you can stop once the validation accuracy stops increasing or starts decreasing. This is called early stopping in the literature. Keras, for example, provides functionality for this: https://keras.io/callbacks/#earlystopping
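For reference, a minimal usage sketch (the small dense model and random data are placeholders; restore_best_weights keeps the snapshot from the best epoch):

    import numpy as np
    import tensorflow as tf

    x = np.random.rand(500, 20).astype("float32")
    y = np.random.randint(0, 5, size=(500,))          # e.g. 5 classes, as in the question

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(5, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_accuracy", patience=10, restore_best_weights=True)

    model.fit(x, y, validation_split=0.2, epochs=1000, callbacks=[early_stop])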
Also, it's good practice to plot the above quantities, because it gives you important insights into the training process. See http://cs231n.github.io/neural-networks-3/#accuracy for a great illustration (not specific to early stopping).
Hope this helps
Normally you converge to a specific validation accuracy for your model. In practice, you usually stop training if the validation loss has not decreased for x epochs. Depending on your epoch duration, x most commonly varies between 5 and 20.
Edit:
An epoch is one iteration over your training dataset, in ML terms. You do not seem to have a validation set. Normally the data is split into training and validation data so you can see how well your model performs on unseen data and make decisions about which model to take by looking at this data. You might want to take a look at http://caffe.berkeleyvision.org/gathered/examples/mnist.html to see the usage of a validation set, even though they call it a test set.