I'm new to deep neural networks (DNNs).
Is assuming anchor boxes' positions when training an object detection model similar to initializing kernels with default weights in a CNN?
Yes, initializing anchor boxes and initializing kernels with default weights in a DNN model are similar ideas. In both cases, the aim is to give the model a starting point to learn from, rather than starting with a completely random set of values.
If you are using a custom dataset, you can create anchor boxes using one of several websites, listed below:
https://www.makesense.ai/
https://roboflow.com/
If you used a dataset from Kaggle, the boxes are declared in the training file, and you have to calculate the anchor boxes from there.
Just a reminder: each object has 5 values:
class, the x and y coordinates of the center, width, and height
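One common way to calculate anchor boxes from declared boxes is to cluster their widths and heights with k-means, using 1 - IoU as the distance (the approach popularized by YOLOv2). The following is only a minimal sketch under assumptions: it assumes each label line has the layout "class cx cy w h" with values normalized to 0..1, and the names parse_label, iou_wh and kmeans_anchors are made up for illustration.

```python
import random

def parse_label(line):
    """Parse one 'class cx cy w h' line (assumed layout, values in 0..1)."""
    cls, cx, cy, w, h = line.split()
    return int(cls), float(cx), float(cy), float(w), float(h)

def iou_wh(box, anchor):
    """IoU of two boxes given only (w, h), as if they shared one center."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(sizes, k, iterations=50, seed=0):
    """Cluster (w, h) pairs into k anchors; higher IoU means closer."""
    rng = random.Random(seed)
    anchors = rng.sample(sizes, k)  # start from k of the real boxes
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for wh in sizes:
            best = max(range(k), key=lambda i: iou_wh(wh, anchors[i]))
            clusters[best].append(wh)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break  # assignments no longer change
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first

# Made-up labels: two small boxes and two large ones.
labels = [
    "0 0.50 0.50 0.10 0.20",
    "0 0.30 0.40 0.12 0.22",
    "1 0.60 0.60 0.50 0.40",
    "1 0.55 0.45 0.48 0.42",
]
sizes = [parse_label(line)[3:] for line in labels]
anchors = kmeans_anchors(sizes, k=2)
print(anchors)
```

On this toy data the two anchors end up as the mean size of the small boxes and the mean size of the large boxes. For a real dataset you would usually multiply the normalized anchors by the network input size.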
I'm new to deep learning and trying cell segmentation with Detectron2 Mask R-CNN.
I use the images and mask images from http://celltrackingchallenge.net/2d-datasets/ - Simulated nuclei of HL60 cells - the training dataset. The folder I am using is here
I tried to create and register a new dataset following the balloon dataset format in the Detectron2 Colab tutorial.
I have 1 class, "cell".
My problem is, after I train the model, there are no masks visible when visualizing predictions. There are also no bounding boxes or prediction scores.
A visualized annotated image is like this but the predicted mask image is just a black background like this.
What could I be doing wrong? The colab I made is here
I have a problem similar to yours: the network predicts the box and the class but not the mask. The first thing to note is that the algorithm (DefaultTrainer) automatically resizes your images, so you need to create a custom mapper to avoid this. The second is that you need to add data augmentation, which significantly improves convergence and generalization.
First, avoid the resize:
cfg.INPUT.MIN_SIZE_TRAIN = (608,)
cfg.INPUT.MAX_SIZE_TRAIN = 608
cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
cfg.INPUT.MIN_SIZE_TEST = 608
cfg.INPUT.MAX_SIZE_TEST = 608
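To see what these size settings do, here is a plain-Python sketch of the shortest-edge resize rule as I understand it: the shorter side is scaled to the MIN_SIZE value, and the whole image is then scaled down again if the longer side would exceed the MAX_SIZE value. This is not Detectron2's own code, and the function name resized_shape is made up for illustration.

```python
def resized_shape(height, width, min_size, max_size):
    """Return (new_height, new_width) under a shortest-edge resize rule."""
    scale = min_size / min(height, width)
    if height < width:
        new_h, new_w = float(min_size), scale * width
    else:
        new_h, new_w = scale * height, float(min_size)
    # Shrink again if the longer side would exceed the allowed maximum.
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)

# A 608x608 image is left unchanged by the settings above.
print(resized_shape(608, 608, 608, 608))
# A 480x640 image is still rescaled, so the settings only "avoid the
# resize" when the images already match the configured size.
print(resized_shape(480, 640, 608, 608))
```

If your images are not 608 pixels on the shorter side, pick the values to match your data.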
See also:
https://gilberttanner.com/blog/detectron-2-object-detection-with-pytorch/
How to use detectron2's augmentation with datasets loaded using register_coco_instances
https://eidos-ai.medium.com/training-on-detectron2-with-a-validation-set-and-plot-loss-on-it-to-avoid-overfitting-6449418fbf4e
I found a current workaround by using Matterport Mask R-CNN and the sample nuclei dataset instead: https://github.com/matterport/Mask_RCNN/tree/master/samples/nucleus
I have two folders, training and testing. Each folder has 14k images, each showing a single random object such as a chair, box, fan, can, etc.
In addition, for each image in the training set I have 4 columns [x1,x2,y1,y2] describing the bounding box that encloses the object.
With this information I want to predict the bounding boxes for the test set.
I am very new to computer vision, so it would be very helpful if anyone could tell me how to get started with training such models.
I found YOLOv3, but it includes classification as well.
I recommend you look at the GitHub code here.
In detect.py, there is a do_detect() function.
This function returns both the class and the bounding box that you want to get from the image.
boxes = do_detect(model, image, confidence_threshold, nms_threshold)
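Since YOLO-style training usually expects a class plus a normalized center, width and height for each box, the [x1,x2,y1,y2] columns from the question could be converted with a single dummy class. This is a minimal sketch, assuming the columns are pixel coordinates and the image size is known; the function name to_yolo_label is made up for illustration.

```python
def to_yolo_label(x1, x2, y1, y2, img_w, img_h, class_id=0):
    """Convert pixel corners [x1, x2, y1, y2] to a 'class cx cy w h' line."""
    cx = (x1 + x2) / 2 / img_w   # box center, normalized to 0..1
    cy = (y1 + y2) / 2 / img_h
    w = abs(x2 - x1) / img_w     # box size, normalized to 0..1
    h = abs(y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A box from (100, 50) to (300, 150) in a 400x200 image.
print(to_yolo_label(100, 300, 50, 150, img_w=400, img_h=200))
```

With only one class, the classification output can simply be ignored at prediction time.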
I have a .caffemodel file that I converted to a .coreml file, which I'm using in an app to recognize special types of bottles. It works, but it only shows whether a bottle is in the picture or not.
Now I would like to know WHERE the bottle is and stumbled upon https://github.com/hollance/YOLO-CoreML-MPSNNGraph and https://github.com/r4ghu/iOS-CoreML-Yolo but I don't know how I can convert my .caffemodel to such files; is such a conversion even possible or does the training have to be completely different?
If your current model is a classifier then you cannot use it to detect where the objects are in the picture, since it was not trained to do this.
You will have to train a model that does not just do classification but also object detection. The detection part will tell you where the objects are in the image (it gives you zero or more bounding boxes), while the classification part will tell you what the objects are in those bounding boxes.
A very simple way to do this is to add a "regression" layer to the model that outputs 4 numbers in addition to the classification (so the model now has two outputs instead of just one). Then you train it to make these 4 numbers the coordinates of the bounding box for the thing in the image. (This model can only detect a single object in the image since it only returns the coordinates for a single bounding box.)
To train this model you need not just images but also the coordinates of the bounding box for the thing inside the image. In other words, you'll need to annotate your training images with bounding box info.
YOLO, SSD, R-CNN, and similar models build on this idea and allow for multiple detections per image.
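The training objective for the simple single-box model described above could be sketched as a classification loss plus a regression loss on the 4 box numbers. This is a framework-free illustration only; the function names and the box_weight parameter are assumptions, and a real model would use the loss functions of its training framework.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def two_output_loss(class_logits, box_pred, true_class, true_box, box_weight=1.0):
    """Cross-entropy on the class output plus mean squared error on the box."""
    probs = softmax(class_logits)
    class_loss = -math.log(probs[true_class])
    box_loss = sum((p - t) ** 2 for p, t in zip(box_pred, true_box)) / 4
    return class_loss + box_weight * box_loss

# Two classes (bottle / no bottle) and one box as (x, y, width, height).
loss = two_output_loss(
    class_logits=[2.0, 0.5],
    box_pred=[0.4, 0.4, 0.2, 0.5],
    true_class=0,
    true_box=[0.5, 0.5, 0.3, 0.6],
)
print(loss)
```

The box_weight factor balances the two terms; how to set it depends on the data and the coordinate scale.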
I really want to know, is it possible to create a fixture for a body that could be broken by some other body?
Here is an example:
a body with its fixture divided into small figures:
and what happens after it is hit by another body:
P.S. Are there any programs that could help with creating such a fixture?
Yes, you can do this using Breakable, found at:
net.dermetfan.gdx.physics.box2d.Breakable
Breakable makes it easy to make whole bodies or single fixtures breakable, meaning they are destroyed if a certain force or friction is applied to them.
How to use
A Breakable is meant to be put in a body's, fixture's or joint's user
data. A single Breakable instance can be put in the user data of
multiple bodies, fixtures and joints. Since this may collide with the
Box2DSprite or other classes using the user data, the
Breakable$Manager uses a Function to access the Breakable in the
user data of a body, fixture or joint. Do not forget to set a Manager
instance as ContactListener to the world and to call destroy() after
every timestep. If the field is already in use, check out the
ContactMultiplexer. The Manager does the actual work, the Breakables
are just passive data holders.
A Breakable consists of a normal resistance, tangent resistance, an
option to destroy its body in case its last fixture was destroyed and
an option setting if the body should be destroyed no matter the amount
of remaining fixtures.
The normalResistance is the force that can be applied to the
Breakable before it breaks (inclusive). The tangentResistance is the
friction the Breakable can bear (also inclusive). The
reactionForceResistance specifies the reaction force a joint can bear
on each axis. The reactionForceLength2Resistance is the max squared
length of the joint's reaction force the Breakable can bear.
Source: libgdx-utils
Some other good references with good examples are here and here.
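The threshold rule from the quoted documentation can be modeled in a few lines. The sketch below is a plain-Python illustration, not the libgdx-utils code: the class and method names mirror the description only loosely, and the impulse values are assumed to come from a contact callback. It shows the two points the documentation stresses: the resistances are inclusive, and destruction is deferred until destroy() is called after the timestep.

```python
from dataclasses import dataclass

@dataclass
class Breakable:
    """Passive data holder: how much force and friction can be borne."""
    normal_resistance: float
    tangent_resistance: float

class Manager:
    """Collects broken fixtures during contacts; destroys them afterwards."""

    def __init__(self):
        self.broken = []

    def post_solve(self, fixture_id, breakable, normal_impulse, tangent_impulse):
        # The resistances are inclusive: only a strictly larger value breaks.
        too_hard = normal_impulse > breakable.normal_resistance
        too_rough = abs(tangent_impulse) > breakable.tangent_resistance
        if (too_hard or too_rough) and fixture_id not in self.broken:
            self.broken.append(fixture_id)

    def destroy(self):
        """Call after every timestep; returns the fixtures to remove."""
        destroyed, self.broken = self.broken, []
        return destroyed

manager = Manager()
glass = Breakable(normal_resistance=10.0, tangent_resistance=5.0)
manager.post_solve("pane-1", glass, normal_impulse=10.0, tangent_impulse=0.0)
manager.post_solve("pane-2", glass, normal_impulse=12.5, tangent_impulse=0.0)
print(manager.destroy())
```

Deferring the removal matters because a physics world generally cannot be modified while it is stepping.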
As for the question of whether there are any programs that could help with creating such a fixture:
yes, you can use box2d-editor, which lets you create complex polygons and also create your bodies and shapes from your images or sprites. Check the official documentation; the same page has several videos that explain how box2d-editor works.
Features:
Automatically decomposes concave shapes into convex polygons,
Automatically traces your images if needed,
Supports multiple outlines for a single body,
Supports polygon and circle shapes,
Reference point location can be changed,
Visual configurable grid with snap-to-grid option,
Built-in collision tester! Throw balls at your body to test it,
Loader provided for LibGDX game framework (written in Java),
Simple export format (JSON), to let you easily create your own loader for any framework in any language.
I am using Blender to create my models and loading them into libGDX. If I create them with the origin in the center of the model, like below, and then use this code to create the rigid body, everything works fine:
Vector3 hescoWallHalfExtents = new Vector3(hescoWall.calculateBoundingBox(bounds).getDimensions()).scl(0.5f);
However, if I place the bottom of the model level with the ground, like this,
then the btRigidBody is offset like this.
Is there an obvious way that I can offset the height of the rigid body?
many thanks.
Spriggsy
The center (origin) of the rigid body is the same as the center of mass and is therefore an important property for the physics simulation. You could "move" this center using a btCompoundShape if you like, but this will also influence the physics simulation and therefore probably won't give you satisfying results.
Alternatively, you could compensate for the difference between the physics origin and the visual origin in your btMotionState, for example by setting ModelInstance#transform to the provided worldTransform multiplied by a Matrix4 instance that contains the offset (use Matrix4#translate, for example).
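The compensation amounts to one matrix product: visual transform = worldTransform × translation(offset). Here is a small plain-Python sketch of that arithmetic, using nested lists as row-major 4x4 matrices; the helper names are made up, and in libGDX you would do the same with Matrix4 rather than with this code.

```python
def translation(x, y, z):
    """4x4 matrix that translates by (x, y, z)."""
    return [
        [1.0, 0.0, 0.0, x],
        [0.0, 1.0, 0.0, y],
        [0.0, 0.0, 1.0, z],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def visual_transform(world_transform, offset):
    """Apply the offset in the body's local space, after the physics transform."""
    return matmul(world_transform, translation(*offset))

# The body's center sits 1 unit above the ground (half height = 1), while
# the model's origin is at its bottom, so the visual offset is -1 on y.
world = translation(0.0, 1.0, 0.0)
visual = visual_transform(world, (0.0, -1.0, 0.0))
print([row[3] for row in visual[:3]])
```

Because the offset is multiplied on the right, it is applied in the body's local frame, so it stays correct when the body rotates.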
However, this is probably making it more complex than it needs to be. The real question is why you want to offset the center of the model relative to the body. For example, in your second image the center of the model appears to be the same as in the first image. You only moved the Node, basically indicating that you want to provide an initial value for the ModelInstance#transform member. You can achieve this by instantiating the ModelInstance as follows:
modelInstance = new ModelInstance(model, "coneNode", true);
Replace "coneNode" with the name of the node as you created it in your modeling application. The last (true) argument tells the ModelInstance to set its transform member to the transformation you've given it in the modeling application. If needed, you can call modelInstance.transform.translate(x, y, z); or modelInstance.transform.trn(x, y, z); to move the modelInstance relative to this transformation.
A more in-depth explanation of this can be found here: http://blog.xoppa.com/loading-a-scene-with-libgdx/
Note that this only works if you're using .g3db or .g3dj files (e.g. using fbx-conv) created from a file format that supports node transformations (e.g. .fbx, but not .obj)