I built a YOLO model to detect cars and trained it on Google Colaboratory. I then added more data, built a new dataset, and trained a new version to increase accuracy. The new dataset had about seven times more data than the previous version (70k images).
But why does the new version take the same time to train as the previous one?
Also, the weights file is the same size, and I don't know why.
Why is the training time with 70k images the same as with 10k images?
Why is the weights file the same size for 10k images and 70k images?
Training time will be the same for all datasets, because you didn't change the cfg file. In darknet, the number of training iterations is fixed by max_batches (and the learning-rate steps) in the cfg, not by the size of the dataset, so a larger dataset does not train longer by itself. If you change those values, you will see the training time change.
The weights file will also be the same size for any dataset, because its size is determined by the network architecture defined in the cfg (the layers and their filters), not by the amount of training data. For example, if you train yolov3-tiny, the weights file will be much smaller than that of the full yolov3 model.
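For illustration, these are the cfg lines that control how long darknet trains (the values here are placeholders; yours depend on your class count):

[net]
batch=64
subdivisions=16
# max_batches fixes the total number of training iterations --
# this, not the dataset size, determines the training time
max_batches=6000
# steps gives the iterations at which the learning rate is decayed
steps=4800,5400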
I have trained a YOLOv4 model using my original dataset and a custom yolov4 configuration file; I will refer to this as my 'base' YOLOv4 model.
Now I want to use this base model to train again on images that I have manually augmented, to try to increase the mAP and AP. So I want to use the weights from my base model to train a new YOLOv4 model on the manually augmented images.
I have seen on the YOLOv4 wiki page that setting stopbackward = 1 freezes the layers so the weights in those layers are not updated, although this reduces accuracy. I also read that ./darknet partial cfg/yolov4.cfg yolov4.weights yolov4.conv.137 137 takes out the first 137 layers. Does this mean that the first 137 layers are frozen in the network, or that you are only training on those 137 layers?
My questions are:
1. Which code actually freezes layers so I can do transfer learning on the base YOLOv4 model I have created?
2. Which layers would you recommend freezing, the first 137 before the first YOLO layer in the network?
Thank you in advance!
To answer your questions:
If you want to use transfer learning, you don't have to freeze any layers. Simply start training from the weights you stored during your first run: instead of darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137, run darknet.exe detector train data/obj.data yolo-obj.cfg backup/your_weights_file. The weights are stored in the backup folder build\darknet\x64\backup\, so the command could look like this: darknet.exe detector train data/obj.data yolo-obj.cfg backup/yolov4_2000.weights
Freezing layers can save time during training. A good approach is to first train the model with the early layers frozen, then unfreeze them later to fine-tune the learning. I am not sure what a good number of layers to freeze in the first run is; you may have to find it by trial and error.
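If you do want to freeze layers, stopbackward is set inside a layer section of the cfg. A hedged sketch (the layer parameters here are illustrative, not copied from a real yolov4.cfg):

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
# stops gradients from propagating past this layer, so every layer
# before this point keeps its pre-trained weights (i.e. stays frozen)
stopbackward=1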
The command "./darknet partial cfg/yolov4.cfg yolov4.weights yolov4.conv.137 137" dumps the weights from the first 137 layers in "yolov4.weights" into the file "yolov4.conv.137", and has nothing to do with training.
What is the smallest sample size that a CNN can learn from for research purposes? I have 60 samples of large images (20, 20, 20) across three classes.
If you're a bit creative, you can use almost any number of images (assuming you have at least around 30). With only a very small number of images, you just need to use transfer learning instead of training solely on your own pictures. Essentially, you import a pre-trained model (ResNet-50, for example) and then continue training on just your images while freezing the weights in most of the early portions of the network.
By doing this, your network will already know how to detect many key building-block features, and you'll be able to train it despite your small dataset.
Here's a link to learn more about transfer learning
https://www.kaggle.com/suniliitb96/tutorial-keras-transfer-learning-with-resnet50
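As a minimal sketch of the idea (assuming TensorFlow/Keras; the input shape, class count, and training data below are placeholders):

import tensorflow as tf
from tensorflow.keras import layers, models

# Load ResNet-50 with ImageNet weights, without the classification head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained feature extractor

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(3, activation="softmax"),  # e.g. three classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)  # trains only the new head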
I am currently working on vehicle detection using SSD MobileNet with the TensorFlow Object Detection API. I have made a custom dataset from the COCO dataset which comprises all the vehicle categories in COCO, i.e. car, bicycle, motorcycle, bus, and truck, and I also have a dataset of 730 rickshaw images.
Ultimately my goal is to detect rickshaws along with the other vehicles, but so far I have failed.
There are a total of 16000 instances in train_labels.csv; on average each class has about 2300 instances. I set the batch size to 12 and trained the COCO pre-trained model on my custom dataset for 12000 steps.
Unfortunately I have not been able to get good results: after training, the model fails to classify the other vehicles.
Any advice regarding the ratio of each class in the dataset, whether I need more rickshaw images, how many layers I should freeze, or perhaps a different perspective would be highly appreciated.
Since you have a custom dataset of 730 rickshaw images, I think there is no need to extract a separate dataset of the other vehicles from COCO for fine-tuning.
What I mean is that the TensorFlow pre-trained model is already really good at detecting all the vehicles other than the rickshaw; your task is just to teach the model how to detect rickshaws.
Another option: since you already have a vehicle dataset, you can try training a model using checkpoints from COCO.
https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
Go through the above article; it will give you a fair idea of the end-to-end flow. The author tuned an SSD MobileNet model trained on the COCO dataset to detect raccoon images; raccoon was the only new class the author wanted to detect. In your case, you just have to replace the raccoon images with rickshaw images and follow exactly the same steps. The author used Google Cloud, but you can change the config file to tune the model on a local machine. Considering you have only 730 new images, tuning shouldn't take long.
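For reference, the relevant parts of the pipeline config you would edit look roughly like this (the paths and values are placeholders; your actual file will differ):

model {
  ssd {
    num_classes: 6  # car, bicycle, motorcycle, bus, truck, rickshaw
  }
}
train_config {
  batch_size: 12
  fine_tune_checkpoint: "pre-trained-model/model.ckpt"  # COCO checkpoint
  num_steps: 12000
}
train_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "annotations/train.record"
  }
}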
This is another good example in case things are not clear: https://towardsdatascience.com/building-a-toy-detector-with-tensorflow-object-detection-api-63c0fdf2ac95
Coming to your question about whether you need more data: more data is always better. What I would suggest is to tune the model using the steps above and check the mAP. If the mAP is low and the performance is not enough for your intended application, collect more data and tune again.
Please let me know if you have any questions.
I am working with neural networks for object classification right now, and I am creating datasets for training and validation. I want to know whether it is possible to create two training datasets comprising two completely different object types and labels (e.g. dataset 1 has cars and dataset 2 has cats). Does that still work, or should I create datasets where each file mixes both object types and labels across all the training files? Does such mixture/separation matter if I am training the network in one cycle with the different datasets?
It depends on what you are using to train. Many APIs (such as the TensorFlow Object Detection API) read the TFRecord files (datasets) in order, so the examples need to be shuffled when the files are created beforehand. Shuffling is quite important for training: without it, the model starts by training on one class alone and then trains for a while on just another class. It may eventually reach the same standard, but the model trains much better with an equal distribution of classes across the training steps.
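A hedged sketch of what that mixing looks like with tf.data in recent TensorFlow versions (file names are placeholders; older versions expose this under tf.data.experimental):

import tensorflow as tf

# Two single-class record files, e.g. one with cars and one with cats.
cars = tf.data.TFRecordDataset("cars.tfrecord")
cats = tf.data.TFRecordDataset("cats.tfrecord")

# Sample evenly from both sources, then shuffle with a buffer so each
# batch ends up with a roughly equal mix of the two classes.
mixed = tf.data.Dataset.sample_from_datasets([cars, cats], weights=[0.5, 0.5])
mixed = mixed.shuffle(buffer_size=2048).batch(32)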
I'm quite new to Caffe, so this could be a nonsense question.
I have trained my network from scratch; it trains well and reaches reasonable accuracy in tests. My question is about retraining or fine-tuning this network. Suppose you obtain new sample images of the same original categories and you want to teach the net with these new images (because, for example, the net fails to predict on these particular images).
As far as I know, it is possible either to resume training from a snapshot and solverstate, or to fine-tune using only the weights of the trained model. Which is the best option in this case? Or is it better to retrain the net with the original images and the new ones together?
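For reference, as I understand it the two options map to these Caffe commands (a sketch; the file names are placeholders):

# (a) Resume: restores the full solver state (iteration, learning rate, momentum)
caffe train -solver solver.prototxt -snapshot model_iter_10000.solverstate

# (b) Fine-tune: loads only the learned weights and starts a fresh solver run
caffe train -solver solver.prototxt -weights model_iter_10000.caffemodel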
Think of a possible "incremental training" scheme, because not all the cases for a particular category are available in the initial training. Is it possible to retrain the net with only the new samples? Should I change the learning rate or keep certain parameters fixed in order to maintain the original prediction accuracy? The net should behave the same on the original image set after fine-tuning.
Thanks in advance.