Surface mesh to volume mesh - stl

I have a closed surface mesh generated using Meshlab from point clouds. I need to get a volume mesh for that so that it is not a hollow object. I can't figure it out. I need to get an *.stl file for printing. Can anyone help me to get a volume mesh? (I would prefer an easy solution rather than a complex algorithm).

Given an oriented watertight surface mesh, an oracle function can be derived that determines whether a query line segment intersects the surface (and where): shoot a ray from one end-point and use the even-odd rule (after having spatially indexed the faces of the mesh).
Volumetric meshing algorithms can then be applied using this oracle function to tessellate the interior, typically variants of Marching Cubes or Delaunay-based approaches (see 3D Surface Mesh Generation in the CGAL documentation). The initial surface will however not be exactly preserved.
To my knowledge, MeshLab supports only surface meshes, so it is unlikely to provide a ready-to-use filter for this. Dedicated volume meshing packages, however, should offer this functionality (e.g. TetGen).
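As an illustration of such an inside/outside oracle, here is a minimal sketch assuming the trimesh Python library (my choice, not something the answer prescribes); it performs ray tests against a spatial index of the faces and applies a crossing count, as described above:

# Minimal inside/outside oracle for a watertight surface mesh (assumes `trimesh`).
import numpy as np
import trimesh

mesh = trimesh.load("surface.stl")      # hypothetical input file
assert mesh.is_watertight, "the oracle is only reliable for a closed mesh"

# Query points whose containment we want to test.
points = np.array([
    [0.0, 0.0, 0.0],
    [100.0, 100.0, 100.0],
])

# Ray casting with crossing counts, accelerated by a spatial index over the faces.
inside = mesh.contains(points)
print(inside)                           # e.g. [ True False]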

The question is not perfectly clear, so I will try a different interpretation. According to your last sentence:
I need to get an *.stl file for printing
This means you need a 3D model that is suitable for fabrication on a 3D printer, i.e. you need a watertight mesh. A watertight mesh defines the interior of a volume unambiguously: it is closed (no boundary), 2-manifold (essentially, every edge is shared by exactly two faces), and free of self-intersections.
MeshLab provides tools for visualizing boundaries, non-manifold elements, and self-intersections. Correcting them is possible in many different ways (removing the non-manifold parts and filling the holes, or a more drastic remeshing).
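If you prefer to script the checks instead of using the MeshLab GUI, here is a minimal sketch, again assuming the trimesh Python library (a hypothetical choice, not something this answer prescribes), that reports watertightness, fills small holes and exports an STL:

# Watertightness check and light repair sketch (assumes `trimesh`).
import trimesh

mesh = trimesh.load("surface.stl")      # hypothetical input file

print("watertight:", mesh.is_watertight)
print("winding consistent:", mesh.is_winding_consistent)

if not mesh.is_watertight:
    # Fill simple holes; large or complex defects still need manual repair
    # (e.g. in MeshLab) or a full remeshing.
    mesh.fill_holes()
    trimesh.repair.fix_normals(mesh)

mesh.export("printable.stl")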

Multiple Convex Hull for a single point cloud

I am working on a configuration-space (C-Space) problem for a 6-DOF robot arm.
From a simulation I can get a point cloud that defines my C-Space.
From this C-Space, I would like to be able to know whether a robot configuration (a set of joint angles) is inside the C-Space or not.
So I would like to build a 6-dimensional model of my C-Space, for example a combination of many convex hulls with a given radius.
Then I would like to create or use a function that tells me whether my configuration is inside one of the convex hulls (i.e. inside the C-Space, which means that the configuration is in collision).
Do you have any ideas?
Thanks a lot.
The question is not completely clear yet. I am guessing that you have a point cloud from a laser scanner and would like to approximate it with a set of convex objects in order to perform collision queries later.
If the point cloud is already clustered into sets, the convex hull of each set can be computed fairly quickly using the quickhull algorithm (see the sketch below).
If you also want to find the clusters, then a convex decomposition algorithm, like Volumetric Hierarchical Approximate Convex Decomposition (V-HACD), may be what you are looking for. However, an intermediate step may be needed to transform the point cloud into a mesh object to pass as input to V-HACD.
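As a minimal sketch of the per-cluster convex hull idea, assuming SciPy is available and the clusters are already given (both assumptions on my part): scipy.spatial.ConvexHull uses quickhull internally, and a Delaunay triangulation of each hull's vertices gives a cheap point-in-hull test.

# Per-cluster convex hulls plus a point-in-hull membership test (assumes SciPy).
# `clusters` is a hypothetical list of (N_i, d) point arrays, one per cluster.
import numpy as np
from scipy.spatial import ConvexHull, Delaunay

def build_hulls(clusters):
    """Return one Delaunay triangulation per cluster, used for membership tests."""
    hulls = []
    for pts in clusters:
        hull = ConvexHull(pts)                    # quickhull under the hood
        hulls.append(Delaunay(pts[hull.vertices]))
    return hulls

def in_any_hull(hulls, q):
    """True if the query configuration q lies inside at least one convex hull."""
    q = np.atleast_2d(q)
    return any((h.find_simplex(q) >= 0).any() for h in hulls)

# Hypothetical usage with two random 3-D clusters:
rng = np.random.default_rng(0)
clusters = [rng.random((50, 3)), rng.random((50, 3)) + 5.0]
hulls = build_hulls(clusters)
print(in_any_hull(hulls, [0.5, 0.5, 0.5]))        # very likely True
print(in_any_hull(hulls, [100.0, 100.0, 100.0]))  # False

Note that Delaunay-based tests scale poorly with dimension; in 6-D it may be cheaper to test the query against the half-space inequalities in ConvexHull.equations instead.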

3D annotation for instance segmentation

I'm trying to annotate some data for 3D instance segmentation. While it's fairly straightforward to draw masks for each 2D plane, it's not obvious how to connect the same "instances" together post-annotation (i.e. connect the "red" masks together, connect the "blue" masks together) without laboriously making sure the instances are instance-matched (i.e. colour-coded so that "red" masks always connect with "red" masks).
A naive approach I have thought of is to make many 2D segmentation masks, and calculate the center of mass for each object detected. I can later re-assign the instances based on the closest matching center of mass, but I worry this would inadvertently generate "crossed-over" segmentation instances (illustrated below). What are some high-throughput strategies to generate 3D annotations?
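As a rough illustration of that centre-of-mass matching idea (a sketch assuming the per-slice masks are available as labelled NumPy arrays; not an endorsement of the approach, which can indeed cross over when objects pass close to each other):

# Greedy slice-to-slice instance matching by centre of mass (assumes NumPy/SciPy).
# Each slice is a 2-D integer label image with 0 as background.
import numpy as np
from scipy.ndimage import center_of_mass

def match_slices(prev_labels, next_labels, max_dist=10.0):
    """Map instance ids in `next_labels` to the nearest instance in `prev_labels`."""
    prev_ids = [i for i in np.unique(prev_labels) if i != 0]
    next_ids = [i for i in np.unique(next_labels) if i != 0]
    prev_com = {i: np.array(center_of_mass(prev_labels == i)) for i in prev_ids}
    mapping = {}
    for j in next_ids:
        com = np.array(center_of_mass(next_labels == j))
        dists = {i: np.linalg.norm(com - prev_com[i]) for i in prev_ids}
        best = min(dists, key=dists.get) if dists else None
        # Only link instances whose centres are close; otherwise treat it as a new object.
        mapping[j] = best if best is not None and dists[best] <= max_dist else None
    return mapping

A purely nearest-centre assignment like this is exactly where the "crossed-over" failure mode comes from, which is why the answers below lean towards marker-based or interpolation-based 3-D tools instead.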
The boundary of your 2-D slices could be used as constraints to obtain the optimal 3-D surface, as proposed in [1].
However, I think it is easier to generate 3-D labels from markers, as in [2]. Its implementation is available here (feel free to open an issue if you encounter any problems :P).
Also, the napari package could be useful to develop the GUI without much effort.
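For instance, a minimal napari session for painting and inspecting 3-D labels directly on the volume could look like the following sketch (the array names are hypothetical):

# Minimal napari sketch for 3-D label annotation.
import numpy as np
import napari

volume = np.random.random((64, 256, 256))         # hypothetical 3-D image
labels = np.zeros(volume.shape, dtype=np.uint16)  # empty label volume to paint into

viewer = napari.Viewer()
viewer.add_image(volume, name="raw")
viewer.add_labels(labels, name="instances")       # paintable 3-D label layer
napari.run()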
[1] Grady, Leo. "Minimal surfaces extend shortest path segmentation methods to 3D." IEEE Transactions on Pattern Analysis and Machine Intelligence 32.2 (2008): 321-334.
[2] Falcão, Alexandre X., and Felipe PG Bergo. "Interactive volume segmentation with differential image foresting transforms." IEEE Transactions on Medical Imaging 23.9 (2004): 1100-1108.
You can use 3D Slicer's Segment Editor. It is free, open-source, has many built-in tools, and is customizable and extensible in Python or C++ (you can plug in your own segmentation method with minimal effort). To solve a segmentation task, you typically first figure out a good segmentation workflow (what tools to use, in what combination, and with what parameters) using the interactive GUI; then, if necessary, you can make it semi-automatic or fully automatic using Python scripting.
You can create a segmentation by contouring every image slice, but that would be too tedious. Instead, you can use 3D region growing (the Grow from seeds effect) or segment on just a few slices and interpolate between them (the Fill between slices effect).

Why does computer vision detection give bad results on small objects?

Currently, for detection (localisation + recognition) in computer vision we mainly use deep learning algorithms. Two types of detector exist:
one stage: SSD, YOLO, RetinaNet, ...
two stage: R-CNN, Fast R-CNN and Faster R-CNN, for example
Using these detectors on very small objects (10 pixels, for example) is a very challenging task, and it seems the one-stage algorithms are worse than the two-stage ones. But I do not really understand why Faster R-CNN, for example, works better. Both one- and two-stage detectors use the anchor concept, and most of them use the same backbones, such as VGG16 or ResNet-50/ResNet-101, which means the receptive fields are the same. For example, I tried to detect very small objects with RetinaNet and with Faster R-CNN (same backbone: ResNet-50). With RetinaNet, small objects are not detected, contrary to Faster R-CNN. I do not understand why. What is the theoretical explanation?
I think networks like RetinaNet are, in general, trying to bridge the gap you mention. In one-stage networks we usually place anchor boxes of varying scales on the feature maps produced by the backbone. These feature maps are produced by heavily down-sampling the input image, and a lot of information about small objects can be lost in that operation. In two-stage detectors, because of the flexibility of the RPN, the network may still propose regions that are small, and this may help it perform slightly better than its one-stage counterparts.
I don't think you should be very surprised that both of these might use the same backbone: after the conv features are extracted, the two networks use different methods to perform detection.
Hope this helps. Let me know if I wasn't clear enough, or if you have questions.
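To make the down-sampling point concrete, here is a tiny back-of-the-envelope sketch (the strides are typical values for ResNet/FPN feature levels, not measurements from the models in the question): at the coarse output stride used by a single-scale head, a 10-pixel object covers less than one feature-map cell, while finer levels keep several cells on it.

# How many feature-map cells a 10-pixel object covers at typical backbone strides.
object_size_px = 10

for stride in (4, 8, 16, 32):        # typical strides of ResNet stages / FPN levels
    cells = object_size_px / stride
    print(f"stride {stride:2d}: object spans {cells:.2f} cells per side")

# stride  4: object spans 2.50 cells per side
# stride  8: object spans 1.25 cells per side
# stride 16: object spans 0.62 cells per side
# stride 32: object spans 0.31 cells per side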

Which is best for object localization among R-CNN, Fast R-CNN, Faster R-CNN and YOLO?

What is the difference between R-CNN, Fast R-CNN, Faster R-CNN and YOLO in terms of the following:
(1) Precision on same image set
(2) Given SAME IMAGE SIZE, the run time
(3) Support for android porting
Considering these three criteria, which is the best object localization technique?
R-CNN is the daddy of all the algorithms mentioned: it really paved the way for researchers to build more complex and better algorithms on top of it.
R-CNN, or Region-based Convolutional Neural Network
R-CNN consists of 3 simple steps:
Scan the input image for possible objects using an algorithm called Selective Search, generating ~2000 region proposals
Run a convolutional neural net (CNN) on top of each of these region proposals
Take the output of each CNN and feed it into a) an SVM to classify the region and b) a linear regressor to tighten the bounding box of the object, if such an object exists.
Fast R-CNN:
Fast R-CNN immediately followed R-CNN. Fast R-CNN is faster and better by virtue of the following points:
Performing feature extraction over the whole image before proposing regions, thus running only one CNN over the entire image instead of 2000 CNNs over 2000 overlapping regions
Replacing the SVM with a softmax layer, thus extending the neural network for predictions instead of creating a new model
Intuitively it makes a lot of sense to avoid running 2000 separate CNN passes and instead run the convolution once over the image and propose boxes on top of that feature map.
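Here is a minimal sketch of that "one shared feature map, per-region pooling" idea, assuming PyTorch and torchvision (my choice of library, not the answer's): roi_pool crops and pools each proposed region out of a single feature map, so the backbone runs only once per image.

# Fast R-CNN style: run the backbone once, then pool features per region proposal.
# Assumes torch and torchvision are installed.
import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 256, 50, 50)      # one shared feature map (stride 16 assumed)
# Proposals as (batch_index, x1, y1, x2, y2) in input-image coordinates.
proposals = torch.tensor([
    [0.0,  10.0,  10.0, 200.0, 200.0],
    [0.0, 300.0, 120.0, 500.0, 400.0],
])

# spatial_scale maps image coordinates onto the 1/16-resolution feature map.
pooled = roi_pool(features, proposals, output_size=(7, 7), spatial_scale=1.0 / 16)
print(pooled.shape)                         # torch.Size([2, 256, 7, 7])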
Faster R-CNN:
One of the drawbacks of Fast R-CNN was the slow Selective Search algorithm, so Faster R-CNN introduced the Region Proposal Network (RPN).
Here is how the RPN works:
At the last layer of an initial CNN, a 3x3 sliding window moves across the feature map and maps it to a lower dimension (e.g. 256-d)
For each sliding-window location, it generates multiple possible regions based on k fixed-ratio anchor boxes (default bounding boxes)
Each region proposal consists of:
an “objectness” score for that region and
4 coordinates representing the bounding box of the region
In other words, we look at each location in our last feature map and consider k different boxes centered around it: a tall box, a wide box, a large box, and so on. For each of those boxes, we output whether or not we think it contains an object, and what the coordinates for that box are.
The 2k scores represent the softmax probability of each of the k bounding boxes being an “object”. Notice that although the RPN outputs bounding box coordinates, it does not try to classify any potential objects: its sole job is still proposing object regions. If an anchor box has an “objectness” score above a certain threshold, that box's coordinates get passed forward as a region proposal.
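For concreteness, here is a small NumPy sketch (the stride, scales and ratios are illustrative assumptions, not values from the question) of how k anchor boxes are laid out at every sliding-window position before the RPN scores them:

# Generate k anchor boxes (x1, y1, x2, y2) at every feature-map location.
# Illustrative values: stride 16, three scales x three aspect ratios (k = 9).
import numpy as np

stride = 16
scales = (64, 128, 256)          # anchor side lengths in image pixels
ratios = (0.5, 1.0, 2.0)         # height/width aspect ratios
feat_h, feat_w = 38, 50          # feature-map size for a ~600x800 input

anchors = []
for y in range(feat_h):
    for x in range(feat_w):
        cx, cy = x * stride, y * stride           # anchor centre in image coordinates
        for s in scales:
            for r in ratios:
                w, h = s / np.sqrt(r), s * np.sqrt(r)
                anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])

anchors = np.array(anchors)
print(anchors.shape)             # (38 * 50 * 9, 4) = (17100, 4)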
Once we have our region proposals, we feed them straight into what is essentially a Fast R-CNN. We add a pooling layer, some fully-connected layers, and finally a softmax classification layer and bounding box regressor. In a sense, Faster R-CNN = RPN + Fast R-CNN.
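If you just want to try the combined pipeline rather than build it, torchvision ships a pre-built model; a hedged sketch (the weights argument may differ between torchvision versions):

# Run a pretrained Faster R-CNN (RPN + Fast R-CNN head) on a random image.
# Assumes a recent torchvision; older versions use pretrained=True instead of weights=.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 600, 800)              # hypothetical RGB image with values in [0, 1]
with torch.no_grad():
    output = model([image])[0]               # one dict per input image

print(output["boxes"].shape, output["labels"].shape, output["scores"].shape)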
YOLO:
YOLO uses a single CNN for both classifying and localising objects with bounding boxes.
At the end of the network you will have an output tensor of length 1470, i.e. 7*7*30, and the structure of that output is as follows.
The 1470 vector output is divided into three parts, giving the probability, confidence and box coordinates. Each of these three parts is also further divided into 49 small regions, corresponding to the predictions at the 49 cells that form the original image.
In post-processing, we take this 1470-dimensional output from the network and keep the boxes whose probability is higher than a certain threshold.
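A small sketch of how that 1470-dimensional vector decomposes, assuming the original YOLO layout of a 7x7 grid, 2 boxes per cell and 20 classes (an assumption about the exact variant being described):

# Split the 1470-dim YOLO output into class probabilities, confidences and boxes.
# 7*7 grid; per cell: 20 class probabilities and 2 boxes (confidence + 4 coordinates).
import numpy as np

S, B, C = 7, 2, 20
output = np.random.rand(S * S * (C + B * 5))      # stand-in for the 1470-value network output

class_probs = output[: S * S * C].reshape(S, S, C)                    # (7, 7, 20)
confidences = output[S * S * C : S * S * (C + B)].reshape(S, S, B)    # (7, 7, 2)
boxes = output[S * S * (C + B):].reshape(S, S, B, 4)                  # (7, 7, 2, 4) -> x, y, w, h

# Keep boxes whose class-conditional score clears a threshold.
scores = confidences[..., :, None] * class_probs[..., None, :]        # (7, 7, 2, 20)
keep = scores.max(axis=-1) > 0.2
print(class_probs.shape, confidences.shape, boxes.shape, int(keep.sum()))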
I hope this gives you an understanding of these networks. To answer your question on how their performance differs:
On the same dataset: you can be fairly sure the performance of these networks follows the order in which they were mentioned, with YOLO being the best and R-CNN being the worst.
Given the SAME IMAGE SIZE, the run time: Faster R-CNN achieved much better speeds and state-of-the-art accuracy. It is worth noting that although later models did a lot to increase detection speed, few managed to outperform Faster R-CNN by a significant margin; it may not be the simplest or fastest method for object detection, but it is still one of the best performing. However, researchers have used YOLO for video segmentation, and it is by far the best and fastest when it comes to video.
Support for Android porting: as far as my knowledge goes, TensorFlow has some Android APIs for porting models, but I am not sure how these networks will perform, or even whether you will be able to port them at all. That again depends on the hardware and the data size. If you can provide the hardware and the size, I will be able to answer more clearly.
The YouTube video tagged by #A_Piro gives a nice explanation too.
P.S. I borrowed a lot of material from Joyce Xu's Medium blog.
If you are interested in these algorithms, you should take a look at this lecture, which goes through the algorithms you named: https://www.youtube.com/watch?v=GxZrEKZfW2o.
PS: There is also a Fast YOLO, if I remember correctly, haha!
I have been working with YOLO and Faster R-CNN a lot. To me, YOLO has the best accuracy and speed, but if you want to do research on image processing, I suggest Faster R-CNN, as much previous work has been done with it, and for research you really want to be consistent.
For object detection, I am trying SSD + MobileNet. It has a good balance of accuracy and speed, so it can also be ported to Android devices easily with a good frame rate.
It has lower accuracy compared to Faster R-CNN, but more speed than the other algorithms.
It also has good support for Android porting.

OpenGL Newbie - Best way to move objects about in a scene

I'm new to OpenGL and graphics programming in general, though I've always been interested in the topic so have a grounding in the theory.
What I'd like to do is create a scene in which a set of objects move about. Specifically, they're robotic soccer players on a field. The objects are:
The lighting, field and goals, which don't change
The ball, which is a single mesh which will undergo translation and rotation but not scaling
The players, which are each composed of body parts, each of which is translated and rotated to give the illusion of a connected body
So to my GL-novice mind, I'd like to load these objects into the scene and then just move them about. No properties of the vertices will change, neither their positions nor their textures/normals/etc. Only the transformation of their 'parent' object as a whole.
Furthermore, the players all have identical bodies. Can I optimise somehow by loading the model into memory once, then painting it multiple times with a different transformation matrix each time?
I'm currently playing with OpenTK which is a lightweight wrapper on top of OpenGL libraries.
So a helpful answer to this question would either be:
What parts of OpenGL give me what I need? Do I have to redraw all the faces every frame? Just those that move? Can I just update some transformation matrices? How simple can I make this using OpenTK? What would pseudocode look like? Or,
Is there a better framework that's free (ideally open source) and provides this level of abstraction?
Note that I require any solution to run in .NET across multiple platforms.
Using so-called vertex arrays is probably the surest way to optimize such a scene. Here's a good tutorial:
http://www.songho.ca/opengl/gl_vertexarray.html
A vertex array, or more generally a GL data array, holds data such as vertex positions, normals, and colors. You can also have an array that holds indices into these buffers to indicate in which order to draw them.
Then there are a few closely related functions that manage these arrays: allocate them, fill them with data, and draw them. You can render a complex mesh with a single OpenGL call such as glDrawElements().
These arrays generally reside in host memory. A further optimization is to use vertex buffer objects (VBOs), which follow the same concept as regular arrays but reside in GPU memory and can be somewhat faster. Here's a bit about that:
http://www.songho.ca/opengl/gl_vbo.html
Working with buffers, as opposed to good old glBegin()..glEnd(), has the advantage of being compatible with OpenGL ES; in OpenGL ES, arrays and buffers are the only way to draw anything.
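As a minimal sketch of the vertex-array path (shown in Python/PyOpenGL purely for illustration; the same GL calls are exposed through OpenTK, and a GL context is assumed to already exist, e.g. from your windowing toolkit):

# Minimal legacy-OpenGL vertex-array drawing sketch (assumes PyOpenGL and an existing GL context).
import numpy as np
from OpenGL.GL import (
    GL_FLOAT, GL_TRIANGLES, GL_UNSIGNED_INT, GL_VERTEX_ARRAY,
    glDrawElements, glEnableClientState, glVertexPointer,
)

# One triangle: three xyz positions and the indices giving the draw order.
vertices = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]], dtype=np.float32)
indices = np.array([0, 1, 2], dtype=np.uint32)

def draw_mesh():
    # Point GL at the vertex data and draw the whole mesh with one call.
    glEnableClientState(GL_VERTEX_ARRAY)
    glVertexPointer(3, GL_FLOAT, 0, vertices)
    glDrawElements(GL_TRIANGLES, len(indices), GL_UNSIGNED_INT, indices)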
--- EDIT
Moving, rotating, and otherwise transforming objects in the scene is done using the modelview matrix and does not require any changes to the mesh data. To illustrate:
You have your initialization:
void initGL() {
    // create a set of vertex arrays/buffers to draw a player
    // set the player vertex data in them (shared by every player)
    // create a set of arrays/buffers for the ball
    // set the ball vertex data in them
}

void drawScene() {
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    // set up the view transformation
    gluLookAt(...);

    drawPlayingField();

    glPushMatrix();
    glTranslatef(/* player 1 position */);
    drawPlayer();
    glPopMatrix();

    glPushMatrix();
    glTranslatef(/* player 2 position */);
    drawPlayer();
    glPopMatrix();

    glPushMatrix();
    glTranslatef(/* ball position */);
    glRotatef(/* ball rotation */);
    drawBall();
    glPopMatrix();
}
Since you are beginning, I suggest sticking to immediate-mode rendering and getting that to work first. If you get more comfortable, you can move up to vertex arrays. If you get even more comfortable, VBOs. And finally, if you get super comfortable, instancing, which is the fastest possible solution for your case (no deformations, only whole-object transformations).
Unless you're trying to implement something like Fifa 2009, it's best to stick to the simple methods until you have a demonstrable efficiency problem. No need to give yourself headaches prematurely.
For whole-object transformations, you typically transform the modelview matrix:
glPushMatrix();
// do gl transforms here and render your object
glPopMatrix();
For loading objects, you'll either need to come up with your own format or implement something that can load existing mesh formats (OBJ is one of the easiest to support). There are higher-level libraries to simplify this, but I recommend going with plain OpenGL for the experience and control you'll gain.
I'd hoped the OpenGL API might be easy to navigate via IDE support (IntelliSense and such). After a few hours it became apparent that some ground rules needed to be established, so I stopped typing and RTFM'd:
http://www.glprogramming.com/red/
That's the best advice I can give to anyone else who finds this question while getting their OpenGL footing. It's a long read, but empowering.