Computational issues when training an EfficientNetB0 model for a classification task

I am facing difficulties training a custom image classification model (EfficientNetB0) on a dataset of 36,000 images of size (100, 100, 3). The images contain alphanumeric characters (0-9 and A-Z), and my goal is to classify them into those 36 classes. My attempts at training the model have been unsuccessful due to either high memory usage on Google Colab or overheating on my MacBook Air M1. I am looking for suggestions for free alternative training methods and models that would be suitable for this classification task.
Below I have also attached an image of the Google Colab issue and a sample image from my dataset.
[colab_crash_image]
[sample_dataset_image]
Here is what I have tried so far:
Using Google Colab for training, which didn't work due to high memory usage (a memory-friendlier pipeline is sketched below).
Using the built-in processor of the MacBook Air M1, which didn't work due to overheating.
Implementing transfer learning on EfficientNetB0 with ImageNet weights, which reached a training accuracy of 87% but only 46% test accuracy.
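As a starting point, here is a minimal sketch of a streaming input pipeline plus a frozen-backbone transfer-learning setup in Keras. The data/train path and the on-disk layout (one subfolder per class) are assumptions, not part of the original setup; streaming batches from disk avoids loading all 36,000 images into Colab RAM at once.

import tensorflow as tf

# Assumed (hypothetical) layout: data/train/<class_name>/*.png
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train",
    image_size=(100, 100),
    batch_size=32,
    label_mode="int",
)
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)  # overlap disk I/O with training

# Frozen ImageNet backbone: only the small head is trained at first,
# which limits memory use and helps against the 87%/46% overfitting gap.
base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(100, 100, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(36, activation="softmax"),  # 10 digits + 26 letters
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)

Once the head converges, the backbone can be unfrozen with a much lower learning rate for fine-tuning.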

Related

Simplest caffe network for image classification

I have a set of 60,000 training and 10,000 test images (227x227). The images are either completely black (label 1) or black with a white patch in the middle (label 255). What would be the simplest Caffe network I could train on this data to get an accuracy of 95% or higher? I need to deploy this on an embedded device, so the simplest network is what I want.
I tried training with the BVLC reference CaffeNet and got an accuracy of 99.6%. I converted this model to CMSIS-NN to deploy it on an ARM device, but it generated a 150 MB weights file, which is not feasible for an embedded device.
You can try SqueezeNet (4.7 MB, >= 80.3% top-5 on ImageNet) or MobileNet (v2: 13.5 MB, 90.49% top-5 on ImageNet) as baselines. Since your problem is much simpler, you might also try a very simple network with your own topology, as sketched below.
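For the "super simple network" option, here is one hypothetical minimal topology, sketched in Keras rather than Caffe purely for brevity; a single small convolution plus a tiny dense head keeps the weights in the kilobyte range, well within embedded limits. The grayscale input shape is an assumption based on the black/white images described.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(227, 227, 1)),      # grayscale assumed
    tf.keras.layers.Conv2D(4, 5, strides=4, activation="relu"),
    tf.keras.layers.MaxPooling2D(4),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),  # black vs. white patch
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # only a few thousand parameters, vs. 150 MB for CaffeNet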

Retrain or fine-tuning in Caffe a network with images of the existing categories

I'm quite new to Caffe, and this may be a nonsensical question.
I have trained my network from scratch. It trains well and gets a reasonable accuracy in tests. The question is about retraining or fine-tuning this network. Suppose you have new sample images of the same original categories and you want to teach the net with these new images (because, for example, the net fails to predict on these particular images).
As far as I know, it is possible either to resume training from a snapshot and solverstate, or to fine-tune using only the weights of the trained model. What is the best option in this case? Or is it better to retrain the net with the original images and the new ones together?
Think of a possible "incremental training" scheme, because not all the cases for a particular category are available in the initial training. Is it possible to retrain the net only with the new samples? Should I change the learning rate or keep any parameters fixed in order to maintain the original prediction accuracy when training with the new samples? The net should behave the same on the original image set after fine-tuning.
Thanks in advance.
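For reference, the two mechanisms the question contrasts look roughly like this in pycaffe (the file names are hypothetical placeholders):

import caffe

# Option 1: resume training where it stopped; restores the weights *and*
# the solver state (iteration count, momentum history, LR schedule position).
solver = caffe.get_solver("solver.prototxt")
solver.restore("snapshot_iter_10000.solverstate")
solver.solve()

# Option 2: fine-tune from the learned weights only; the solver starts fresh,
# typically with a lower base learning rate so the new samples do not
# overwrite what the net already knows.
solver = caffe.get_solver("solver_finetune.prototxt")
solver.net.copy_from("snapshot_iter_10000.caffemodel")
solver.solve()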

How to train a pre-trained CNN on a new dataset that is not organised into classes (unsupervised)

I have a pretrained CNN (ResNet-18) trained on ImageNet, and now I want to extend it to my own dataset of video frames. The problem is that all the fine-tuning tutorials I have found require the dataset to be organised into classes, like
class1/train/
class1/test/
class2/train/
class2/test/
but I only have frames from many videos, so how can I train my CNN on them?
Can anyone point me in the right direction: a tutorial, paper, etc.?
PS: My final task is to get deep features for all the frames I provide at test time.
To train a network, you need some 'label' (sometimes called y) for your input data. From there, the network calculates a loss between the logit (the network's answer) and the given label, and then revises itself by backpropagating that loss value. That process is what we call 'training'.
Because you only have input data and no labels, you can obtain only the logits, which means a loss cannot be calculated.
Fine-tuning is almost the same thing as 'additional training', so you cannot fine-tune your pre-trained network without labeled data.
As for the train set and test set, that is not the problem right now. If you have enough labeled input data, you can divide it at some ratio (e.g. 80% of the data for training, 20% for testing). The reason for dividing the data into these two sets is to check the performance of the trained network in a more general, unseen situation.
However, if you simply feed your data into the pre-trained network (the encoder part), it will give you a deep feature. It may not fit your task exactly, but it is still a deep feature; see the sketch below.
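As an illustration of that last point, here is a minimal sketch of extracting deep features from unlabeled frames with a pretrained ResNet-18 via torchvision; the frame path is a hypothetical placeholder.

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = torch.nn.Identity()  # drop the classifier, keep the 512-d encoder
model.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

frame = Image.open("frame_0001.jpg").convert("RGB")  # hypothetical frame
with torch.no_grad():
    feature = model(preprocess(frame).unsqueeze(0))   # tensor of shape (1, 512)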
Added:
Unsupervised pre-training for convolutional neural network in theano
Here is the method you need: a deep-feature encoder for the unsupervised setting. I hope it helps.

Building an Image search engine using Convolutional Neural Networks

I am trying to implement an image search engine using AlexNet: https://github.com/akrizhevsky/cuda-convnet2
The idea is to implement an image search engine by training a neural net to classify images and then using the code from the net's last hidden layer as a similarity measure.
I am trying to figure out how to train the CNN on a new set of images to classify them. Does anyone know how to get started with this?
Thanks
You basically have two approaches to your problem:
Either you have plenty of good training data (>1M images) and dozens of GPUs, and you retrain the network from scratch using SGD with the classes you have for your queries.
Or you don't, and then you simply truncate a pretrained AlexNet (exactly where you truncate it is for you to choose) and feed it your images (possibly resized to fit the network: 227x227x3, if I am not mistaken).
From each image you then get a feature vector (sometimes called a descriptor), and you use those feature vectors to train a linear SVM for your images and your specific task, as sketched below.
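A hedged sketch of that second approach, assuming the descriptors have already been extracted (features.npy and labels.npy are hypothetical precomputed NumPy arrays):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

features = np.load("features.npy")  # hypothetical (n_images, d) descriptors
labels = np.load("labels.npy")      # hypothetical (n_images,) class ids

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0)

clf = LinearSVC(C=1.0)              # linear SVM on the deep descriptors
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))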

Can I train a deep convolutional network without GPUs?

I am thinking of building a convolutional neural network for a tracking-system application. I get the feeling that all deep-network applications require the use of GPUs. Is it necessary to use GPUs for a task like mine? What are the minimum PC requirements my laptop should meet?
It all depends on the size and depth of your CNN. If your CNN has one convolutional layer and one fully connected layer, and the input images are 64x64, you will be able to train your network on your laptop in a reasonable time. If you use GoogLeNet with hundreds of layers and train on the entire ImageNet set, then even with a video card it will take you a week, so on a CPU it would never finish training.
For most practical applications, however, it is desirable to have a GPU for training a convolutional network. Note that on AWS you can get GPU-enabled instances at a rather reasonable price, especially spot instances, so you don't necessarily need a GPU locally.
Last note: most frameworks (Theano, Torch, Caffe, MXNet, TensorFlow) allow you to execute the same model on CPU and on GPU with minor or no modifications to the code, so you can prototype locally on the CPU with a small set of images and then, once your model works, train it on a GPU instance on AWS. An illustration follows below.
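As a small illustration of that portability (using PyTorch, a modern descendant of the Torch framework mentioned above), the same model runs on CPU or GPU by switching one variable:

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(),
                      nn.Flatten(), nn.LazyLinear(10)).to(device)
x = torch.randn(4, 3, 64, 64, device=device)  # dummy batch of 64x64 images
out = model(x)  # identical code path whether device is "cpu" or "cuda"
print(out.shape, "on", device)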