Object Detection from Image Classification - deep-learning

Can I use a model trained for Image Classification to do Object Detection? I have already spent a lot of time collecting the images and sorting each class into its own folder.

You can use your classification model as a pre-trained backbone for a detection model (e.g. Faster R-CNN), although it may not help much compared to training your detector from scratch.
You will need to add detection layers (e.g. ROI pooling) on top of the backbone to perform detection.
While you can try unsupervised object detection, you will usually need extra labels, such as object bounding boxes, to train your object detector.
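The ROI-pooling step mentioned above can be sketched in NumPy: given a backbone feature map and one region proposal, max-pool the region into a fixed-size grid so a detection head can consume proposals of any size. This is a minimal single-channel sketch; real implementations work on batched, multi-channel tensors with finer bin interpolation.

```python
import numpy as np

def roi_pool(feature_map, box, output_size=2):
    """Max-pool the region of `feature_map` inside `box` into a fixed
    output_size x output_size grid (the core idea of ROI pooling)."""
    x1, y1, x2, y2 = box  # box given in feature-map coordinates
    region = feature_map[y1:y2, x1:x2]
    h, w = region.shape
    pooled = np.zeros((output_size, output_size))
    for i in range(output_size):
        for j in range(output_size):
            # Each output cell covers roughly h/output_size rows and
            # w/output_size columns of the region (at least one each).
            ys = slice(i * h // output_size,
                       max((i + 1) * h // output_size, i * h // output_size + 1))
            xs = slice(j * w // output_size,
                       max((j + 1) * w // output_size, j * w // output_size + 1))
            pooled[i, j] = region[ys, xs].max()
    return pooled
```

Whatever the proposal's size, the head always receives an `output_size x output_size` tensor, which is what lets a fixed-size detection head sit on top of variable-size proposals.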

Related

How to run object detection again inside the bounding boxes produced by a first object detection pass?

For example, I want to use YOLOv5 to detect intestinal cells first and get green bounding boxes, and then detect intestinal bulges within those green bounding boxes.
I want to import the bounding boxes from the first YOLOv5 model into the second YOLOv5 model, but how do I pass them in?
Or do you have other good ideas?
I would train a new YOLOv5 model on both classes. That way you only need a single inference pass to get the results.
If that is not possible, I would feed the clean image into the different models, gather the bounding boxes from each, and finally plot them all on the original image. This ensures the previously drawn boxes do not negatively affect inference, since the models have almost certainly never seen drawn bounding boxes during training.
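If the second stage really must run only inside the first stage's boxes, the cascade can be sketched like this: crop each first-stage box from the clean image, run the second detector on the crop, and shift its boxes back to original-image coordinates. Here `detect_cells` and `detect_bulges` are hypothetical stand-ins for the two YOLOv5 models, each returning `(x1, y1, x2, y2)` tuples.

```python
import numpy as np

def cascade_detect(image, detect_cells, detect_bulges):
    """Two-stage cascade: run the first detector on the full (clean)
    image, crop each predicted box, run the second detector on the
    crop, then translate its boxes back to original coordinates."""
    results = []
    for (x1, y1, x2, y2) in detect_cells(image):
        crop = image[y1:y2, x1:x2]  # crop from the clean image, no drawn boxes
        for (bx1, by1, bx2, by2) in detect_bulges(crop):
            # Second-stage boxes are relative to the crop, so offset
            # them by the crop's top-left corner.
            results.append((x1 + bx1, y1 + by1, x1 + bx2, y1 + by2))
    return results
```

Cropping from the original array (rather than a rendered image with boxes drawn on it) is exactly the "clean image" point made above.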

One box object detection

I am using a Faster R-CNN model to predict one object in an image. There can only be one object in each image.
Is it possible to force Faster R-CNN to train and predict as if it should only find one object per image?
Yes, it all depends on the data you train on.
But I don't think Faster R-CNN is the best solution for this case: it is a "brute force" solution if there is only one object. If the data is really complex and such a big object detection model is worth it, try modern convolution-based object detection architectures like YOLO or SSD instead.
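Whichever detector you pick, you can also enforce the one-object constraint at inference time with a trivial post-processing step: keep only the highest-scoring prediction. This is a generic sketch, not part of any particular framework's API.

```python
def top_detection(boxes, scores):
    """Since each image contains exactly one object, reduce the
    detector's output to the single highest-scoring box."""
    if not boxes:
        return None  # detector found nothing at all
    best = max(range(len(scores)), key=lambda i: scores[i])
    return boxes[best], scores[best]
```

This does not change what the network learns, but it guarantees the final output respects the one-object-per-image assumption.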

Is it possible to do transfer learning on different observation and action space for Actor-Critic?

I have been experimenting with actor-critic networks such as SAC and TD3 on continuous control tasks, and I am trying to transfer the trained network to another task with a smaller observation and action space.
Would it be possible to do so if I saved the weights in a dictionary and then loaded them in the new environment? The actor-critic network expects a state with different dimensions as input, and outputs an action with different dimensions.
I have some experience fine-tuning transformer models by adding another classifier head, but how would I do this with actor-critic networks, given that the initial and final layers do not match those of the learned agent?
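One common workaround is to copy only the tensors whose names and shapes match between the old and new networks, leaving the mismatched input and output layers freshly initialized. A sketch, assuming the weights are stored as a name-to-array dictionary (e.g. a state dict exported to NumPy); the layer names are illustrative:

```python
import numpy as np

def transfer_matching_weights(source_state, target_state):
    """Copy weights from a trained actor-critic network into a new one,
    keeping only tensors whose name and shape match. Layers whose
    shapes differ (input layer for a smaller observation space, output
    layer for a smaller action space) keep their fresh initialization."""
    transferred = []
    for name, weights in source_state.items():
        if name in target_state and target_state[name].shape == weights.shape:
            target_state[name] = weights.copy()
            transferred.append(name)
    return transferred
```

The shared hidden layers then start from the learned representation, while the new environment's input and output layers are trained from scratch (optionally with a lower learning rate on the transferred layers).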

Backbone network in Object detection

I am trying to understand the training process of an object detection deep learning algorithm, and I am having trouble understanding how the backbone network (the network that performs feature extraction) is trained.
I understand that it is common to use CNNs like AlexNet, VGGNet, and ResNet, but I don't understand whether these networks are pre-trained or not. If they are not pre-trained, what does the training consist of?
We typically use a pre-trained VGGNet or ResNet backbone directly. Although the backbone is pre-trained on a classification task, its hidden layers learn features that are also useful for object detection. The initial layers learn low-level features such as lines, dots, and curves; later layers learn high-level features, built on top of the low-level ones, that capture objects and larger shapes in the image.
The last layers are then modified to output object detection coordinates rather than a class label.
There are object detection specific backbones too. Check these papers:
DetNet: A Backbone network for Object Detection
CBNet: A Novel Composite Backbone Network Architecture for Object Detection
DetNAS: Backbone Search for Object Detection
High-Resolution Network: A universal neural architecture for visual recognition
Lastly, the pre-trained weights are useful only if you apply them to similar images. For example, weights trained on ImageNet will be of little use on ultrasound medical image data; in that case we would rather train from scratch.
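The head-replacement step described above can be sketched with plain weight dictionaries. The layer names and shapes here are purely illustrative, not taken from any specific library:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-trained classifier: backbone weights + a 1000-class head.
pretrained = {
    "backbone.conv1": rng.normal(size=(64, 3, 3, 3)),
    "backbone.conv2": rng.normal(size=(128, 64, 3, 3)),
    "head.classifier": rng.normal(size=(1000, 128)),
}

# Build a detector: reuse the backbone weights unchanged, drop the
# classification head, and add freshly initialized detection outputs
# (per-class scores plus 4 bounding-box coordinates).
num_classes = 21
detector = {k: v for k, v in pretrained.items() if k.startswith("backbone.")}
detector["head.cls"] = rng.normal(size=(num_classes, 128))
detector["head.box"] = rng.normal(size=(4, 128))
```

Only the new heads start from random initialization; the backbone carries over the features learned during classification pre-training.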

How to reject false alarm in object detection using SSD?

I used SSD for my object detection, but there are some false detections triggered by other objects in the image.
This happens consistently with the same objects. So is there a way to reject those objects during training?
For YOLO, I can do the following:
Just add images containing these objects, without labels, to the training dataset and train. The network will learn not to detect such objects.
It is also desirable to add negative samples to your training dataset: https://github.com/AlexeyAB/darknet
"desirable that our training dataset include images with non-labeled objects that we do not want to detect - negative samples without bounded box (empty .txt files)." (Credit to alexbe)
In general, what we can do is: hard negative mining, inspecting the confusion matrix, and data augmentation.
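Registering negative samples in the darknet-style label format quoted above can be sketched as follows: each negative image gets a label file that exists but is empty. The file names here are hypothetical.

```python
import os
import tempfile

def add_negative_samples(image_names, label_dir):
    """Following the darknet convention quoted above: a negative sample
    is an image whose .txt label file exists but is empty, teaching the
    network that nothing in that image should be detected."""
    for name in image_names:
        stem = os.path.splitext(name)[0]
        # Create an empty label file alongside the usual labels.
        open(os.path.join(label_dir, stem + ".txt"), "w").close()

label_dir = tempfile.mkdtemp()
add_negative_samples(["false_alarm_01.jpg", "false_alarm_02.jpg"], label_dir)
```

Images of the objects that consistently trigger false detections are good candidates for these negative samples.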