OpenGL Newbie - Best way to move objects about in a scene - language-agnostic

I'm new to OpenGL and graphics programming in general, though I've always been interested in the topic so have a grounding in the theory.
What I'd like to do is create a scene in which a set of objects move about. Specifically, they're robotic soccer players on a field. The objects are:
The lighting, field and goals, which don't change
The ball, a single mesh that will undergo translation and rotation but not scaling
The players, each composed of body parts, each of which is translated and rotated to give the illusion of a connected body
So, to my GL-novice mind, I'd like to load these objects into the scene and then just move them about. No properties of the vertices will change: neither their positions nor their textures/normals/etc., just the transformation of their 'parent' object as a whole.
Furthermore, the players all have identical bodies. Can I optimise somehow by loading the model into memory once, then painting it multiple times with a different transformation matrix each time?
I'm currently playing with OpenTK which is a lightweight wrapper on top of OpenGL libraries.
So a helpful answer to this question would either be:
What parts of OpenGL give me what I need? Do I have to redraw all the faces every frame? Just those that move? Can I just update some transformation matrices? How simple can I make this using OpenTK? What would pseudocode look like? Or,
Is there a better framework that's free (ideally open source) and provides this level of abstraction?
Note that I require any solution to run in .NET across multiple platforms.

Using so-called vertex arrays is probably the surest way to optimize such a scene. Here's a good tutorial:
http://www.songho.ca/opengl/gl_vertexarray.html
A vertex array, or more generally a GL data array, holds data such as vertex positions, normals, and colors. You can also have an array that holds indices into these arrays, indicating the order in which to draw them.
Then there are a few closely related functions which manage these arrays: allocate them, fill them with data, and draw them. You can render a complex mesh with a single OpenGL call such as glDrawElements().
These arrays generally reside in host memory. A further optimization is to use vertex buffer objects, which are the same concept as regular arrays but reside in GPU memory and can be somewhat faster. Here's a bit about that:
http://www.songho.ca/opengl/gl_vbo.html
Working with buffers, as opposed to good old glBegin() .. glEnd(), has the advantage of being compatible with OpenGL ES. In OpenGL ES, arrays and buffers are the only way to draw anything.
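To make that concrete, here is a minimal C++-style sketch of the idea (the function names, the g_vbo variables and the use of GLEW are my own assumptions, not anything from the question's OpenTK setup; OpenTK exposes the same entry points under GL.*): upload the mesh into buffer objects once at load time, then draw it every frame with a single glDrawElements() call.

#include <GL/glew.h>  // any loader that exposes the GL 1.5 buffer functions; call glewInit() after context creation

GLuint  g_vboIds[2];       // [0] = vertex positions, [1] = triangle indices
GLsizei g_indexCount = 0;

void uploadMesh(const GLfloat* positions, GLsizei vertexCount,
                const GLushort* indices, GLsizei indexCount)
{
    g_indexCount = indexCount;
    glGenBuffers(2, g_vboIds);

    // Copy the vertex positions into GPU memory once, at load time.
    glBindBuffer(GL_ARRAY_BUFFER, g_vboIds[0]);
    glBufferData(GL_ARRAY_BUFFER, vertexCount * 3 * sizeof(GLfloat),
                 positions, GL_STATIC_DRAW);

    // Copy the index list (which vertices form each triangle).
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, g_vboIds[1]);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, indexCount * sizeof(GLushort),
                 indices, GL_STATIC_DRAW);
}

void drawMesh()
{
    glBindBuffer(GL_ARRAY_BUFFER, g_vboIds[0]);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, g_vboIds[1]);

    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, 0);   // tightly packed x,y,z, offset 0 in the bound VBO

    // One call renders the whole mesh, however many triangles it has.
    glDrawElements(GL_TRIANGLES, g_indexCount, GL_UNSIGNED_SHORT, 0);

    glDisableClientState(GL_VERTEX_ARRAY);
}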
--- EDIT
Moving, rotating and transforming things in the scene is done using the modelview matrix and does not require any changes to the mesh data. To illustrate:
You have your initialization:
void initGL() {
    // create a set of arrays to draw a player
    // set data in them
    // create a set of arrays for the ball
    // set data in them
}

void drawScene() {
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    // set up the view transformation
    gluLookAt(...);

    drawPlayingField();

    glPushMatrix();
    glTranslatef(player1X, player1Y, player1Z);   // player 1 position
    drawPlayer();
    glPopMatrix();

    glPushMatrix();
    glTranslatef(player2X, player2Y, player2Z);   // player 2 position
    drawPlayer();
    glPopMatrix();

    glPushMatrix();
    glTranslatef(ballX, ballY, ballZ);                       // ball position
    glRotatef(ballAngle, ballAxisX, ballAxisY, ballAxisZ);   // ball rotation
    drawBall();
    glPopMatrix();
}

Since you are beginning, I suggest sticking to immediate-mode rendering and getting that to work first. Once you are more comfortable, you can move up to vertex arrays; more comfortable still, VBOs. And finally, if you get super comfortable, instancing, which is likely the fastest option for your case (no deformations, only whole-object transformations).
Unless you're trying to implement something like Fifa 2009, it's best to stick to the simple methods until you have a demonstrable efficiency problem. No need to give yourself headaches prematurely.
For whole-object transformations, you typically transform the modelview matrix.
glPushMatrix();
// do gl transforms here and render your object
glPopMatrix();
For loading objects, you'll eventually need to come up with your own format or implement a loader for an existing mesh format (OBJ is one of the easiest to support). There are high-level libraries that simplify this, but I recommend going with plain OpenGL for the experience and control you'll have.
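If you do go the OBJ route, here is a rough, hedged C++ sketch of how simple a loader can be (loadObj and the Mesh struct are invented names; it handles only plain "v x y z" and "f i j k" records, so no "v/vt/vn" face syntax, normals, materials or error handling):

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Mesh {
    std::vector<float>    positions; // x,y,z per vertex
    std::vector<unsigned> indices;   // 3 indices per triangle, 0-based
};

bool loadObj(const std::string& path, Mesh& out)
{
    std::ifstream file(path);
    if (!file) return false;

    std::string line;
    while (std::getline(file, line)) {
        std::istringstream s(line);
        std::string tag;
        s >> tag;
        if (tag == "v") {                       // vertex position record
            float x, y, z;
            s >> x >> y >> z;
            out.positions.push_back(x);
            out.positions.push_back(y);
            out.positions.push_back(z);
        } else if (tag == "f") {                // face record, e.g. "f 1 2 3"
            unsigned a, b, c;
            s >> a >> b >> c;                   // OBJ indices are 1-based
            out.indices.push_back(a - 1);
            out.indices.push_back(b - 1);
            out.indices.push_back(c - 1);
        }
    }
    return true;
}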

I'd hoped the OpenGL API might be easy to navigate via the IDE support (intellisense and such). After a few hours it became apparent that some ground rules need to be established. So I stopped typing and RTFM.
http://www.glprogramming.com/red/
That's the best advice I can give anyone else who finds this question while getting their OpenGL footing. A long read, but empowering.

Related

Computer vision detection on small object bad results why?

Currently, for detection (localisation + recognition) in computer vision we mainly use deep learning algorithms. Two types of detector exist:
one-stage: SSD, YOLO, RetinaNet, ...
two-stage: R-CNN, Fast R-CNN and Faster R-CNN, for example
Using these detectors on very small objects (around 10 pixels, for example) is very challenging, and the one-stage algorithms seem to do worse than the two-stage ones. But I do not really understand why, for example, Faster R-CNN works better. Both one-stage and two-stage detectors use the anchor concept, and most of them use the same backbones, such as VGG16 or ResNet50/ResNet101, so the receptive fields are the same. For example, I tried to detect very small objects with RetinaNet and with Faster R-CNN: with RetinaNet the small objects are not detected, unlike with Faster R-CNN. I do not understand why. What is the theoretical explanation? (Same backbone: ResNet50.)
I think networks like RetinaNet are, in general, trying to bridge the gap you mention. In one-stage networks we usually have anchor boxes of varying scales in the feature maps produced by the backbone net. These feature maps are produced by heavily downsampling the input image, and a lot of information about small objects can be lost in that operation. In two-stage detectors, because of the flexibility of the RPN, the network may still propose regions that are small, and this may help it perform slightly better than its one-stage counterparts.
I don't think you should be very surprised that both of these use the same backbone; after the conv features are extracted, the two networks use different methods to perform detection.
Hope this helps. Let me know if I wasn't clear enough, or if you have questions.

Surface mesh to volume mesh

I have a closed surface mesh generated using Meshlab from point clouds. I need to get a volume mesh for that so that it is not a hollow object. I can't figure it out. I need to get an *.stl file for printing. Can anyone help me to get a volume mesh? (I would prefer an easy solution rather than a complex algorithm).
Given an oriented watertight surface mesh, an oracle function can be derived that determines whether a query line segment intersects the surface (and where): shoot a ray from one end-point and use the even-odd rule (after having spatially indexed the faces of the mesh).
Volumetric meshing algorithms can then be applied using this oracle function to tessellate the interior, typically variants of Marching Cubes or Delaunay-based approaches (see 3D Surface Mesh Generation in the CGAL documentation). The initial surface will however not be exactly preserved.
To my knowledge, MeshLab supports only surface meshes, so it is unlikely to provide a ready-to-use filter for this. Volume mesher packages should however offer this functionality (e.g. TetGen).
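For illustration, here is a hedged C++ sketch of such an oracle for the point-in-volume case (all names and types are invented; a real implementation would spatially index the faces instead of testing every triangle, and would handle rays that graze edges or vertices):

#include <array>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3   sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3   cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
static double dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Moller-Trumbore ray/triangle intersection: true if the ray
// origin + t*dir (t > 0) crosses triangle (v0, v1, v2).
static bool rayHitsTriangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2)
{
    const double eps = 1e-12;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p  = cross(dir, e2);
    double det = dot(e1, p);
    if (std::fabs(det) < eps) return false;      // ray parallel to triangle plane
    double inv = 1.0 / det;
    Vec3 t0 = sub(orig, v0);
    double u = dot(t0, p) * inv;
    if (u < 0.0 || u > 1.0) return false;
    Vec3 q = cross(t0, e1);
    double v = dot(dir, q) * inv;
    if (v < 0.0 || u + v > 1.0) return false;
    return dot(e2, q) * inv > eps;               // hit is in front of the origin
}

// Even-odd rule: an odd number of crossings means the point is inside.
bool pointInsideMesh(Vec3 p, const std::vector<std::array<Vec3, 3>>& triangles)
{
    Vec3 dir = {0.577, 0.577, 0.577};            // arbitrary ray direction
    int crossings = 0;
    for (const auto& tri : triangles)
        if (rayHitsTriangle(p, dir, tri[0], tri[1], tri[2]))
            ++crossings;
    return (crossings % 2) == 1;
}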
The question is not perfectly clear, so I'll try a different interpretation. According to your last sentence:
I need to get an *.stl file for printing
It means that you need a 3D model that is suitable for fabrication on a 3D printer, i.e. a watertight mesh. A watertight mesh defines the interior of a volume unambiguously: it is closed (no boundary), 2-manifold (essentially, each edge is shared by exactly two faces), and free of self-intersections.
MeshLab provides tools for visualizing boundaries, non-manifold edges, and self-intersections. Correcting them is possible in many different ways (deleting the non-manifold parts and filling holes, or drastic remeshing).
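As a small, hedged illustration of the 2-manifold condition (invented names; this only counts edge usage, so it is a necessary check rather than a full watertightness test):

#include <array>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Returns true if every edge of the triangle list is used by exactly two
// triangles, a necessary condition for a closed 2-manifold mesh.
bool edgesAreManifold(const std::vector<std::array<uint32_t, 3>>& triangles)
{
    std::map<std::pair<uint32_t, uint32_t>, int> edgeUse;
    for (const auto& t : triangles) {
        for (int i = 0; i < 3; ++i) {
            uint32_t a = t[i], b = t[(i + 1) % 3];
            if (a > b) std::swap(a, b);          // store edges undirected
            ++edgeUse[{a, b}];
        }
    }
    for (const auto& kv : edgeUse)
        if (kv.second != 2) return false;        // boundary or non-manifold edge
    return true;
}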

how to display many polygons with shared vertices in cesium

I would like to display 100,000 or more polygons in Cesium. The polygons have a lot of shared boundaries --- they are essentially like US zip code polygons but smaller, so there are more of them --- so I'd like to use a representation that takes advantage of this and is "aware" of the topology of shared boundaries and only stores each vertex once.
I'm fairly new to programming with Cesium (but familiar with 3D graphics in general); I've scanned the tutorials and docs and don't immediately see a way to create a polygon collection with shared vertices. I have my polygons in a topojson file and tried loading it using code like what is in the topojson example:
var promise = Cesium.GeoJsonDataSource.load('./polygons.topojson');
promise.then(function(dataSource) {
    viewer.dataSources.add(dataSource);
    ...
});
But
this doesn't take advantage of the shared vertices because the GeoJsonDataSource converts each individual polygon to a GeoJson object, and
it crashes my browser, presumably because 100,000 separate GeoJson objects is more than it can handle.
I feel fairly sure (and hopeful) that there is a way to do this in Cesium, but I haven't found it yet. Can someone tell me what the most effective approach would be, in particular what primitives / loader utilities should I be looking at?
Ultimately, by the way, the application I want to write will never actually render all 100,000 polygons at the same time --- it will choose which ones to render based on the mouse position, and at any one time it will only render a few thousand of them. But I want to load them all into memory ahead of time, so that I can change which ones are being rendered in real time as the cursor moves around.

Manage resources to minimize garbage collection activity and improve performance

I am working on a graphic design, vector drawing application that needs to re-render the data in every frame when there is a change. The issue is that if the user is moving nodes, there will be changes during every single frame. This is not a problem with a tiny amount of data, but it becomes a major slowdown with anything more than that.
The reason is that in order to render I perform calculations and store data inside arrays. When the function responsible for the computation is done, the GC simply discards the data, and the next time the function is called we create new arrays and new data.
In C++ I would probably allocate space in memory and write to that space over and over, and probably improve performance that way. In languages that use GC I cannot allocate space that way. I have to do an ugly hack where I define an array as a class member and then write to that array from the function over and over, even though that array is only used by that one function and not by other methods of the class.
My question is: what is the best way to reuse memory space in a language that uses GC?
Object pooling would be the major one, see here:
Gotoandplay Tutorial
Also
10 Top Tips around GC
I would also suggest you read through Grant's explanation of the garbage collection system in the Flash Player; it's quite unique, and understanding how Flash handles data is quite important for data-intensive scripts.
This presentation
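To illustrate the pooling pattern itself in a language-agnostic way, here is a hedged C++-style sketch with invented names; the same shape applies in a GC language, where the point is simply to stop allocating fresh arrays every frame:

#include <cstddef>
#include <vector>

class ScratchBufferPool {
public:
    ScratchBufferPool(std::size_t bufferSize, std::size_t count)
        : bufferSize_(bufferSize), buffers_(count, std::vector<float>(bufferSize)) {}

    // Borrow a buffer for this frame; the caller overwrites its (stale) contents.
    std::vector<float>& acquire() {
        if (next_ == buffers_.size())
            buffers_.emplace_back(bufferSize_);   // grow only if the pool runs dry
        return buffers_[next_++];
    }

    // Call once per frame: every buffer becomes reusable again.
    void releaseAll() { next_ = 0; }

private:
    std::size_t bufferSize_;
    std::vector<std::vector<float>> buffers_;
    std::size_t next_ = 0;
};

// Usage per frame:
//   auto& tmp = pool.acquire();   // instead of allocating a new array
//   ... fill tmp and render from it ...
//   pool.releaseAll();            // at the end of the frame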

AS3: How to access pixel data efficiently?

I'm working on a game.
The game requires entities to analyse an image and head towards pixels with specific properties (high red channel, etc.)
I've looked into Pixel Bender, but it only seems useful for writing new colors to the image. At the moment, even at a low resolution (200x200), just one entity scanning the image drops the frame rate to 1-2 frames per second.
I'm embedding the image and instancing it as a Bitmap as a child of the stage. The 1-2 FPS situation comes from using BitmapData.getPixel() (on each pixel) with a distance calculation beforehand.
I'm wondering if there's any way I can do this more efficiently... My first thought was some sort of spatial partitioning coupled with splitting the image up into many smaller pieces.
I also feel like Pixel Bender should be able to help somehow, however I've had little experience with it.
Cheers for any help.
Jonathan
Let us call the pixels which entities head towards "attractors" because they attract the entities.
You describe a low frame rate due to scanning for attractors, which indicates that you may be scanning the image on every frame. You don't specify whether the scanned image is static or changes as frequently as, e.g., a video input. If the image changes with every frame, so that you must somehow recalculate the attractors, then what you are attempting is real-time computer vision on the ABC Virtual Machine; please see below.
If you have an unchanging image, then the most important optimization you can make is to scan the image one time only, then save a summary (or "memoization") of the locations of the attractors. At each rendering frame, rather than scan the entire image, you can search the list or array of known attractors. When the user causes the image to change, you can recalculate from scratch, or update your calculations incrementally -- as you see fit.
If you are attempting to do real-time computer vision with ActionScript 3, I suggest you look at the new vector types of Flash 10.1 and also that you look into using either abcsx to write ABC assembly code, or use Adobe's Alchemy to compile C onto the Flash runtime. ABC is the byte code of Flash. In other words, reconsider the use of AS3 for real-time computer vision.
BitmapData has a getPixels method (note the plural). It returns a byte array of all the pixels, which can be iterated much faster than a for loop with a call to getPixel inside, nested inside another for loop. Unfortunately, byte arrays are, as their name implies, one-dimensional arrays of bytes, so iterating each pixel (4 bytes) requires a plain for loop rather than a for-each loop. Since each color channel is stored as its own byte, you can read the red channel directly, which sounds like what you want (finding pixels with a high red channel), so you won't have to bitwise-AND each pixel value to isolate a particular channel.
I read somewhere that getPixel is very slow, so that's where I figured you'd save the most. I could be wrong, so it'd be worth timing it.
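As a language-neutral sketch of the scan-once-and-memoize idea from the earlier answer (C++ with invented names, purely to show the pattern; in AS3 the raw bytes would come from BitmapData.getPixels() rather than a plain array):

#include <cstdint>
#include <vector>

struct Point { int x, y; };

// Scan a packed 32-bit ARGB buffer once, recording the coordinates of
// "attractor" pixels whose red channel exceeds a threshold. The returned
// list is what gets searched every frame, not the image itself.
std::vector<Point> findAttractors(const std::vector<uint32_t>& argbPixels,
                                  int width, int height, uint8_t redThreshold)
{
    std::vector<Point> attractors;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            uint32_t pixel = argbPixels[y * width + x];
            uint8_t red = (pixel >> 16) & 0xFF;   // ARGB byte order: A,R,G,B
            if (red >= redThreshold)
                attractors.push_back({x, y});     // memoize the location once
        }
    }
    return attractors;
}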
I would say Heath Hunnicutt's answer is a good one. If the image doesn't change, just store all the color values in a Vector or ByteArray or whatever and use it as a lookup table, so you don't need to call getPixel() every frame.