Let's say I have two models:
The first one is trained to classify colors (red, green, and blue).
The second one is trained to classify geometric shapes (rectangle, triangle, and circle).
I want to combine them to obtain a new model capable of understanding a “red rectangle” and so on…
What would be the proper way to do this?
Thanks!!
I want to retrain the YOLOv4 object detector to recognize figures from the board game Ticket to Ride.
While gathering pictures, I was looking for a way to reduce the number of pictures needed.
I was wondering if more instances of an object/class in a picture means more "training per picture", which would mean I need fewer pictures.
Is this correct? If not, could you explain it in simple terms?
On the Roboflow page, they say that YOLOv4 breaks object detection into two pieces:
regression to identify object positioning via bounding boxes;
classification to classify the objects into classes.
Regression, in short, predicts continuous values; here it predicts where an object sits in the image by estimating the coordinates of its bounding box. Classification, on the other hand, assigns each detected region a class (such as ’train piece’, ’track’, ’station’, or whatever else is worth separating from the rest).
Now, to answer your question: no, more instances per picture do not replace more pictures. With more pictures, YOLOv4 has more samples with which to build and test an accurate classifier. Still, be careful about what you want to classify: you want the algorithm to extract a ’train’ class from an image, but not an ‘ocean’ class, for example. To prevent this, take more (and more varied) pictures of the classes you want to detect!
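To make that split concrete, here is a toy Keras sketch (not YOLOv4 itself; the input size, class count, and class names are placeholder assumptions): a shared backbone feeds a regression head that outputs four box coordinates and a classification head that outputs class probabilities.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 3  # hypothetical labels, e.g. train piece, track, station

# Tiny shared backbone standing in for the detector's feature extractor.
inputs = layers.Input(shape=(128, 128, 3))
x = layers.Conv2D(16, 3, strides=2, activation="relu")(inputs)
x = layers.Conv2D(32, 3, strides=2, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Regression head: four continuous values (x, y, width, height of one box).
box = layers.Dense(4, name="bbox_regression")(x)
# Classification head: one probability per class for the detected object.
cls = layers.Dense(NUM_CLASSES, activation="softmax", name="classification")(x)

model = Model(inputs, [box, cls])
model.summary()
```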
I am trying to solve baby detection with a U-Net segmentation model. I have already collected baby images and baby segmentation masks, and I also passed adult images as negatives (I created black masks for these).
If I do it this way, will the U-Net model be able to differentiate adults from babies? If not, what should I do next?
It really depends on your dataset.
During training, U-Net will try to learn specific features in the images, such as a baby's shape, body size, color, etc. If your dataset is good enough (e.g. it contains many examples of babies and many adults marked with a separate mask color, and the image dimensions are not too high), then you probably won't have any problems at all.
There is a possibility, however, that your model misses some babies or adults in an image. To tackle this issue, there are a couple of things you can do:
Add data augmentation during training (e.g. random crop, padding, brightness, contrast, etc.); a minimal sketch is shown after the links below.
Make your model stronger by replacing U-Net with a newer approach, such as UNet++ or UNet 3+. According to the UNet 3+ paper, it appears to outperform both U-Net and UNet++ on medical image segmentation tasks:
https://arxiv.org/ftp/arxiv/papers/2004/2004.08790.pdf
I have also found this repository, which contains a clean implementation of UNet 3+ and might help you get started:
https://github.com/kochlisGit/Unet3-Plus
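As a concrete illustration of the augmentation point above, here is a minimal TensorFlow/Keras sketch of on-the-fly augmentation for image/mask pairs; the shapes, crop size, and jitter strengths are placeholder assumptions you would tune to your data.

```python
import tensorflow as tf

# Assumes 256x256 RGB images and single-channel segmentation masks.
def augment(image, mask):
    # Pad to a slightly larger canvas, then take a random crop at the original size.
    image = tf.image.resize_with_crop_or_pad(image, 288, 288)
    mask = tf.image.resize_with_crop_or_pad(mask, 288, 288)
    stacked = tf.concat([image, tf.cast(mask, image.dtype)], axis=-1)
    stacked = tf.image.random_crop(stacked, size=[256, 256, 4])  # 3 image + 1 mask channel
    image, mask = stacked[..., :3], stacked[..., 3:]

    # Photometric jitter is applied to the image only, never to the mask.
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    return image, mask

# Typical usage with a tf.data pipeline of (image, mask) pairs:
# dataset = dataset.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
```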
I am quite new to QGIS and am using version 3.4.5 Madeira.
I have one raster layer with 22 land-use classes, one class per pixel (25 m resolution), and another raster layer with elevation data for each pixel (10 m resolution); see the screenshots below.
My question is: how do I combine the data from those two layers?
The result should be a third raster layer with the land-use class AND the elevation data for each pixel (25 m resolution). For example, I want to know how many pixels with a certain land-use class are below or above 1000 m elevation.
Thank you!
[Screenshots: land-use classes; elevation data]
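For the counting step described in the question, a minimal numpy sketch could look like the following, assuming both rasters have already been aligned to the same 25 m grid in QGIS and exported as arrays; the file names, class code, and threshold are placeholders.

```python
import numpy as np

landuse = np.load("landuse_25m.npy")      # integer land-use code per pixel (placeholder file)
elevation = np.load("elevation_25m.npy")  # elevation in metres per pixel (placeholder file)

TARGET_CLASS = 7       # hypothetical land-use code of interest
THRESHOLD_M = 1000.0

below = np.sum((landuse == TARGET_CLASS) & (elevation < THRESHOLD_M))
above = np.sum((landuse == TARGET_CLASS) & (elevation >= THRESHOLD_M))
print(f"class {TARGET_CLASS}: {below} pixels below {THRESHOLD_M} m, {above} at or above")
```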
Is it possible for the generator to learn a distribution when the input is a specific set of n images instead of random noise? For example, say there are two categories of images with labels 0 and 1, 0 for cats and 1 for dogs. Is it possible to train the generator so that, when we feed it a dog image, it generates a cat image corresponding to that dog image?
This is somewhat like image deblurring, except that no clear image is given for each blurred image; we are only given random clear images.
Sure, it is possible. This is called style transfer, and there has been a lot of work on it. In a way, you learn a mapping function from the manifold of dogs to the manifold of cats. A famous work in that direction is the CycleGAN paper (https://arxiv.org/pdf/1703.10593.pdf), which uses a cycle-consistency loss to map from one domain to the other and back. This makes the training more stable and keeps the resulting images closer to the initial images.
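To make the cycle-consistency idea concrete, here is a minimal TensorFlow sketch of the loss term; `G` and `F` stand for any two Keras generator models (dogs-to-cats and cats-to-dogs), and the weighting factor follows the value used in the paper.

```python
import tensorflow as tf

def cycle_consistency_loss(real_x, real_y, G, F, lam=10.0):
    # x -> G(x) -> F(G(x)) should come back to x, and y -> F(y) -> G(F(y)) back to y.
    cycled_x = F(G(real_x))
    cycled_y = G(F(real_y))
    loss_x = tf.reduce_mean(tf.abs(real_x - cycled_x))  # L1 distance for the x cycle
    loss_y = tf.reduce_mean(tf.abs(real_y - cycled_y))  # L1 distance for the y cycle
    return lam * (loss_x + loss_y)
```

This term is added to the usual adversarial losses of the two generators during training.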
What are common techniques for finding which parts of images contribute most to image classification via convolutional neural nets?
In general, suppose we have 2D matrices with float values between 0 and 1 as entries. Each matrix is associated with a label (single-label, multi-class), and the goal is to perform classification via (Keras) 2D CNNs.
I'm trying to find methods to extract relevant subsequences of rows/columns that contribute most to classification.
Two examples:
https://github.com/jacobgil/keras-cam
https://github.com/tdeboissiere/VGG16CAM-keras
Other examples/resources with an eye toward Keras would be much appreciated.
Note that my datasets are not actual images, so methods relying on ImageDataGenerator might not directly apply in this case.
There are many visualization methods. Each of these methods has its strengths and weaknesses.
However, you have to keep in mind that the methods partly visualize different things. Here is a short overview based on this paper.
You can distinguish between three main visualization groups:
Functions (gradients, saliency maps): these methods visualize how a change in input space affects the prediction (a minimal gradient sketch is shown below).
Signal (deconvolution, Guided BackProp, PatternNet): the signal, i.e. the reason for a neuron's activation, is visualized; these methods show what pattern in the input caused a particular neuron to activate.
Attribution (LRP, Deep Taylor Decomposition, PatternAttribution): these methods visualize how much a single pixel contributed to the prediction. As a result you get a heatmap highlighting which pixels of the input image most strongly contributed to the classification.
Since you are asking how much a pixel contributed to the classification, you should use attribution methods. Nevertheless, the other methods are useful in their own right.
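As a minimal sketch of the first group, a plain gradient/saliency map in Keras/TensorFlow could look like this; `model` stands for any trained Keras classifier and `x` for a batch of your 2D inputs with a channel axis, e.g. shape (1, H, W, 1).

```python
import tensorflow as tf

def saliency_map(model, x):
    # Gradient of the top predicted class score with respect to the input entries.
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x)                       # shape (batch, num_classes)
        score = tf.reduce_max(preds, axis=-1)  # score of the predicted class
    grads = tape.gradient(score, x)
    return tf.abs(grads)                       # sensitivity magnitude per input entry
```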
One nice toolbox for visualizing heatmaps is iNNvestigate.
This toolbox contains the following methods (a minimal usage sketch follows the list):
SmoothGrad
DeConvNet
Guided BackProp
PatternNet
PatternAttribution
Occlusion
Input times Gradient
Integrated Gradients
Deep Taylor
LRP
DeepLift
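The usage sketch below is based on the interface shown in the iNNvestigate repository; the analyzer name, model, and inputs are placeholders, and exact function names may differ between versions, so check the README of the version you install.

```python
import innvestigate
import innvestigate.utils

def relevance_heatmap(model, x, method="lrp.epsilon"):
    # `model` is a trained Keras classifier, `x` a batch of inputs; `method` can be
    # swapped for other analyzers listed above (e.g. "deep_taylor", "guided_backprop").
    model_wo_sm = innvestigate.utils.model_wo_softmax(model)  # analyzers expect pre-softmax outputs
    analyzer = innvestigate.create_analyzer(method, model_wo_sm)
    return analyzer.analyze(x)  # same shape as x: per-entry relevance scores
```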