A basic query about Generative Adversarial Models - deep-learning

Is it possible for the generator to learn a distribution when the input is a specific set of n images instead of random noise? For example, suppose there are two categories of images with labels 0 and 1, say 0 for cats and 1 for dogs. Is it possible to train the generator so that, when we feed it a dog image, it generates a cat image corresponding to that dog image?
This is somewhat similar to deblurring images, except that no clear image is paired with each blurred image; we are only given a set of unpaired clear images.

Sure, it is possible. This is called style transfer, and there has been a lot of work on it. In a way, you learn a mapping function between the manifold of dogs and the manifold of cats. A famous work in that direction is the CycleGAN paper (https://arxiv.org/pdf/1703.10593.pdf), which uses a cycle-consistency loss to map from one domain to the other and back. This makes the training more stable and keeps the resulting images closer to the initial images.
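To make the idea concrete, here is a minimal sketch of the cycle-consistency term, assuming PyTorch and two hypothetical generator networks G (dog -> cat) and F_inv (cat -> dog); the adversarial losses from the paper are omitted:

import torch

def cycle_consistency_loss(G, F_inv, real_dog, real_cat, lam=10.0):
    # G translates dogs to cats, F_inv translates cats to dogs (names are illustrative).
    fake_cat = G(real_dog)      # dog mapped into the cat domain
    rec_dog = F_inv(fake_cat)   # mapped back; should resemble the original dog
    fake_dog = F_inv(real_cat)
    rec_cat = G(fake_dog)
    # L1 distance between each input and its reconstruction, weighted by lam as in the paper.
    return lam * (torch.mean(torch.abs(rec_dog - real_dog)) +
                  torch.mean(torch.abs(rec_cat - real_cat)))

In training, this term is added to the usual GAN losses of both generators, which is what keeps each translated image tied to its source image.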

Related

Keypoint detection when target appears multiple times

I am implementing a keypoint detection algorithm to recognize biomedical landmarks on images. I only have one type of landmark to detect. But in a single image, 1-10 of these landmarks can be present. I'm wondering what's the best way to organize the ground truth to maximize learning.
I considered creating 10 landmark coordinates per image and associating each with a flag that is either 0 (not present) or 1 (present). But this doesn't seem ideal: since the multiple landmarks in a single picture are all the same type of biomedical element, the neural network shouldn't have to learn them as separate entities.
Any suggestions?
One landmark that can appear anywhere sounds like a typical CNN problem. Your CNN filters should learn which features make up the landmark, but they don't care where it appears; that is the responsibility of the subsequent layers. Hence, for training the CNN layers you can use a single-channel image as the target: 1 means "landmark at this pixel", 0 otherwise.
The subsequent layers basically process the CNN-detected features. To train those, your ground truth should simply be the desired outcome. Do you just need a binary output (count > 0)? A reasonably accurate estimate of the count? Coordinates? Orientation? NNs don't care much what they learn, so just give them in training what they should produce at inference.
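As a concrete example of the single-channel target, here is a sketch (assuming NumPy, with a made-up Gaussian radius) that renders any number of same-type landmarks into one heatmap:

import numpy as np

def make_heatmap(landmarks, height, width, sigma=2.0):
    # landmarks: list of (row, col) pixel coordinates; an image may contain 1-10 of them.
    # Returns an array in [0, 1] with a Gaussian blob at each landmark location.
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for r, c in landmarks:
        blob = np.exp(-((ys - r) ** 2 + (xs - c) ** 2) / (2 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)  # same class, so overlapping blobs simply merge
    return heatmap

Because every landmark is drawn into the same channel, the network never treats them as separate entities; at inference you recover the individual locations by finding local maxima in the predicted map.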

Does the number of instances of an object in a picture affect the training of a deep-learning object detector?

I want to retrain the object detector YOLOv4 to recognize figures from the board game Ticket to Ride.
While gathering pictures, I was looking for a way to reduce the number of pictures needed.
I was wondering whether more instances of an object/class in a picture means more "training per picture", which would mean I need fewer pictures.
Is this correct? If not, could you explain in simple terms?
On the Roboflow page, they say that YOLOv4 breaks object detection into two pieces:
regression to identify object positioning via bounding boxes;
classification to classify the objects into classes.
Regression, in this context, predicts where the relevant objects are in the image, i.e. it estimates the bounding-box coordinates. Classification, on the other hand, assigns each detected box to a class ('train piece', 'track', 'station', or whatever else is worth separating from the rest).
Now, to answer your question: no, you still need more pictures. With more pictures, YOLOv4 has more samples to fit and test a more accurate classifier. However, you have to be careful about what you want to classify. You do want the algorithm to extract a 'train' class from an image, but not an 'ocean' class, for example. To achieve this, take more (and more varied) pictures of the classes you want to detect!
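To illustrate the split (this is only a schematic sketch, not YOLOv4's actual loss, which also includes objectness, anchors and an IoU-based box term), a detector loss typically adds a regression term for the box coordinates to a classification term for the class labels:

import torch
import torch.nn.functional as nnf

def detector_loss(pred_boxes, true_boxes, pred_logits, true_labels):
    # Regression: how far the predicted box coordinates are from the ground-truth boxes.
    box_loss = nnf.smooth_l1_loss(pred_boxes, true_boxes)
    # Classification: which class each matched box belongs to.
    cls_loss = nnf.cross_entropy(pred_logits, true_labels)
    return box_loss + cls_loss

Both terms benefit from more distinct images: extra instances in one picture do help, but they do not add the background and lighting variety that extra pictures do.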

Locate/Extract Patches from an Image

I have an image (e.g. 60x60) with multiple items inside it. The items are square boxes, say 4x4 in size, placed at random positions within the image. The boxes (items) themselves are created with random patterns, some pixels switched on and others switched off. So it could be the same box repeated twice (or more, if there are more than two items) in the image, or the boxes could be entirely different.
I'm looking to create a deep learning model that could take in the original image (60x60) and output all the patches in the image.
This is all I have for now, but I can definitely share more details as the discussion starts. I'd be interested to weigh in different options that can help me achieve this objective. Thanks.
I would solve this using object detection. First, I would build a training set by cutting out patches of those box-like objects, and then I would train a detector such as Faster R-CNN on it.
You might want to take a look at the Stanford lecture on detection (slides here: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf).
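As a starting point, here is a sketch of how such a detector could be set up with torchvision's Faster R-CNN (assuming a recent torchvision and that you annotate each 4x4 patch with a bounding box; the class count and tensor values are illustrative):

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 2  # background (index 0) and "patch" (index 1)
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# One training sample: an image tensor plus its ground-truth boxes and labels.
image = torch.rand(3, 60, 60)  # a 60x60 RGB image (replicate channels if grayscale)
target = {
    "boxes": torch.tensor([[10.0, 10.0, 14.0, 14.0]]),  # [x1, y1, x2, y2] of one 4x4 patch
    "labels": torch.tensor([1]),
}
model.train()
losses = model([image], [target])  # returns a dict of loss terms to sum and backpropagate

Because the items are tiny relative to the image, you may also need smaller anchor sizes or an upsampled input; that tuning is beyond this sketch.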

Caffe Image Classifier of unclassified images

I fine-tuned an image classifier from GoogLeNet which outputs classes of animals (dogs, cats, birds), and it's working perfectly. Accuracy is very high when I pass an image related to the topic, and I'm very happy about it!
Now the question is: if I pass the classifier an image of something unrelated to the training dataset (for example, an image of a house), I would love to receive lower scores as output, so I can recognize that the analyzed image does not belong to any of the dataset categories.
My current output is
dogs = 97%
cats = 2%
birds = 1%
Instead my need is to see something like
dogs = (anything low %)
cats = (anything low %)
birds = (anything low %)
How can I accomplish this?
Thanks for any help
The last layer of your network is a softmax, so the results will sum to 100% even if your input is a white image. If you look at the layer just before it, you have a raw score for each class; that score is probably much lower than it would be if there were a dog in the picture.
Anyway, if your goal is to know whether there is a dog, a cat, a bird, or none of them in the picture, you should probably add an 'other' class and train on images that contain none of the three other classes.
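As a rough sketch of the first idea (reading the raw scores from the layer just before the softmax in your net and rejecting low-confidence inputs; the threshold value here is made up and has to be tuned on held-out data):

import numpy as np

def predict_with_reject(logits, class_names, threshold=5.0):
    # logits: raw per-class scores taken from the layer just before the softmax.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    best = int(np.argmax(logits))
    if logits[best] < threshold:
        return "unknown", probs  # no class scored high enough: treat as out-of-dataset
    return class_names[best], probs

print(predict_with_reject(np.array([9.2, 1.1, 0.4]), ["dogs", "cats", "birds"]))

This only mitigates the problem; the added 'other' class is usually the more reliable fix.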
You need to read the docs.
But often recognisers take advantage of the fact that the inputs must come from a restricted set to help tune their algorithms. For example, postcodes must consist of English letters and numerals. If someone handwrites a postcode which doesn't, it doesn't matter if the recogniser produces garbage, because the input is also garbage.
It's quite likely that your classifier can't handle input outside the training set, since it was never trained for it. But it all depends on exactly how it works underneath.

Store a "routine" which, given some input, generates a 3d model

Well, it's the time of the year when I get busy on my next-generation, cutting-edge R&D project (just for the fun of it... and maybe some profit eventually).
This time, I've had a great idea for a service, which unfortunately I can't detail much.
However, a major part of this project is the ability to generate a 3D model from certain input criteria. The generated model must be different on each generation.
As such, this is very different from the static models used in games - I think I will have to store actual code rather than just model coordinates.
To give an example of some output:
var apple = new AppleGenerator();
apple.set_size_between(30, 50); // these two numbers are just samples...
apple.set_seeds_between(3, 8);  // apple must have at least 3 seeds*
var apple_model = apple.generate();
// * I realize seeds may not be exactly part of the model, but I can't think of anything else
So I need to tackle some points here:
How do I store these models as data?
Do you know of any tools that may help?
I need to incorporate a randomness factor (for example, the apples would have slightly different shapes each time)
I suppose math will play a good part here, but since these are complex shapes, it's going to be infeasible to cook up the necessary formulae for each model, right?
Also, textures must be relevant to each part of the model, and they should look random as well (e.g. the generated apple could be 40 to 60 percent red and the rest green).
This is in fact not a simple task. The solution varies a LOT depending on the complexity and variety of the objects you are trying to create.
Let's consider a few cases though:
Object is more or less known:
The simplest case is to have a 3D model in the conventional way and then randomize it a bit. Take the apple for example: the randomization can vary from the size of the apple to its texture colors to fruit damage.
All your objects can be described using NURBS surfaces:
In this case, you need to store enough data to generate the surface, and of course this data can be randomized a bit.
Your objects have rotational symmetry:
In this case, generating a single curve and rotating it around an axis gives you a shape; an apple is an example. You only need to store the curve data, and randomizing the shape can be done either on the curve (keeping the symmetry) or on the final mesh.
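As a minimal sketch of this surface-of-revolution idea (NumPy only; the profile and jitter values are made up):

import numpy as np

def lathe_mesh(profile_radius, profile_height, segments=32, jitter=0.02, rng=None):
    # Rotate a 2D profile curve (radius, height) around the vertical axis.
    # jitter adds a small random perturbation so every generated mesh differs slightly.
    rng = rng or np.random.default_rng()
    angles = np.linspace(0.0, 2.0 * np.pi, segments, endpoint=False)
    verts = []
    for r, h in zip(profile_radius, profile_height):
        r = r * (1.0 + rng.uniform(-jitter, jitter))  # break the perfect symmetry a bit
        verts.extend((r * np.cos(a), h, r * np.sin(a)) for a in angles)
    return np.asarray(verts)  # faces are built by connecting neighbouring rings

# A rough apple-like profile: bulging in the middle, pinched at the top and bottom.
t = np.linspace(0.0, np.pi, 16)
apple_vertices = lathe_mesh(np.sin(t) * 25 + 5, np.cos(t) * 30)

Storing only the profile curve and the random seed is enough to reproduce (or vary) the model, which speaks to the "store code rather than coordinates" point in the question.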
On textures
This is way more complicated than the mesh generation, mainly because textures carry much more information than meshes (they are more detailed). You can have many texture-generation strategies. In the case of your apple, you could select a few vertices, give them colors (one red, one green, another red, etc.) and interpolate the colors of the other vertices. This creates a smooth transition of colors which may look nice on an apple; if you are generating a knife, however, it just looks terrible.
In most cases, you need to be aware of which part of your mesh represents what, and generate the texture part by part. In the knife example above, you could generate the mesh in two steps, blade and handle, with each part's texture generated separately.
Conclusion
You can of course mix these approaches. A meshGenerator class can take the data and, depending on its type, generate a mesh accordingly. Perhaps the first approach to object creation is the most suitable, as any complicated object is more easily described by its triangles than by NURBS.
Take a look at some of the basic architectural principles used to code Spore, the video game about evolving living creatures: http://chrishecker.com/My_liner_notes_for_spore
Here's an example of how to XML-serialize a mesh, along with some random morph behavior: http://www.ogre3d.org/tikiwiki/Morph+animation#The_XML_format_of_meshes_with_morph_animation
To make your apples all a bit different, you can apply a random transformation (or deformation). See for example: http://wiki.blender.org/index.php/Doc:2.4/Manual/Modifiers/Deform/MeshDeform
You want to use an established file format to avoid strange problems. It's more geometry than pure math: your generate function would build the polygons, and your save method would write them out in one of those formats.
https://stackoverflow.com/questions/441388/most-common-3d-model-format
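For the save step, here is a minimal sketch that writes a polygon mesh to the plain-text Wavefront OBJ format (a simple, widely supported format; vertex indices in OBJ are 1-based):

def save_obj(path, vertices, faces):
    # vertices: iterable of (x, y, z) tuples.
    # faces: iterable of vertex-index tuples, 0-based here and converted to OBJ's 1-based indices.
    with open(path, "w") as f:
        for x, y, z in vertices:
            f.write(f"v {x} {y} {z}\n")
        for face in faces:
            f.write("f " + " ".join(str(i + 1) for i in face) + "\n")

# e.g. save_obj("apple.obj", apple_vertices, apple_faces), where apple_faces is whatever
# face list your generator produced (the name is just an illustration).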