How do I write data from Pixmap to Framebuffer's depth buffer? - libgdx

I'm implementing texture masking using framebuffer's depth buffer following this example:
https://github.com/mattdesl/lwjgl-basics/wiki/LibGDX-Masking#masking-with-depth-buffer
I got it working with ShapeRenderer altering the depth buffer, but now I want to alter the depth buffer with a Pixmap. I need all non-opaque pixels from the Pixmap to be written to the depth buffer with a depth value of 1.0, and all opaque ones with a depth value of 0.0.
I see a solution in writing individual pixels from Pixmap with ShapeRenderer. But that seems rather inefficient. Is there a more appropriate and efficient way?

Texture myTex = new Texture(pixmap);
Should be pretty trivial to draw a texture to a FrameBuffer, no?
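To make the requirement from the question concrete, here is a minimal CPU-side sketch of the alpha-to-depth mapping, written in plain C++ rather than libgdx code. It assumes RGBA8888 pixels with alpha in the lowest byte; `alphaToDepth` and `buildDepthMask` are hypothetical names. When drawing the texture into the FrameBuffer, the same decision would more likely be made per fragment on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: depth value for one RGBA8888 pixel (alpha in the low byte).
// Fully opaque pixels map to 0.0, everything else maps to 1.0.
float alphaToDepth(std::uint32_t rgba) {
    const std::uint32_t alpha = rgba & 0xFFu;
    return alpha == 0xFFu ? 0.0f : 1.0f;
}

// Builds a width*height depth mask from row-major pixel data.
std::vector<float> buildDepthMask(const std::vector<std::uint32_t>& pixels,
                                  int width, int height) {
    assert(width >= 0 && height >= 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    std::vector<float> depth(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        depth[i] = alphaToDepth(pixels[i]);
    }
    return depth;
}
```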

Related

Does a Spritebatch need to be flushed every time a uniform is set on the shader?

If uniforms are set on a shader used by a spritebatch does the spritebatch need to be flushed before resetting the uniform for the next draw call?
Eg. Is this correct?
batch begin
set uniform for texture one
draw texture 1
set uniform for texture two
draw texture 2
...
set uniform for texture N
draw texture N
batch end
or does the batch need to be flushed after each draw call?
Yes, you have to flush. The batch only queues the draws; the texture is actually drawn, and the shader actually applied, only when the batch is flushed, so the shader uses whatever uniform value is set at that moment.
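The effect can be illustrated with a toy model of a deferred batch. This is a sketch in plain C++, not libgdx's SpriteBatch API; `ToyBatch` and `drawTwo` are hypothetical names, and the single float stands in for a shader uniform.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy model of a deferred sprite batch (NOT libgdx's SpriteBatch API).
// draw() only queues; the uniform value is read when flush() submits the queue.
class ToyBatch {
public:
    void setUniform(float value) { uniform_ = value; }
    void draw(const std::string& sprite) { pending_.push_back(sprite); }

    // Submits everything queued so far using the uniform value that is current now.
    void flush() {
        for (const std::string& sprite : pending_) {
            submitted_.emplace_back(sprite, uniform_);
        }
        pending_.clear();
    }

    const std::vector<std::pair<std::string, float>>& submitted() const {
        return submitted_;
    }

private:
    float uniform_ = 0.0f;
    std::vector<std::string> pending_;
    std::vector<std::pair<std::string, float>> submitted_;
};

// Draws two sprites with uniforms 1 and 2 and returns the uniform each was rendered with.
std::vector<std::pair<std::string, float>> drawTwo(bool flushBetween) {
    ToyBatch batch;
    batch.setUniform(1.0f);
    batch.draw("texture1");
    if (flushBetween) batch.flush();
    batch.setUniform(2.0f);
    batch.draw("texture2");
    batch.flush();  // stands in for "batch end"
    assert(batch.submitted().size() == 2);
    return batch.submitted();
}
```

Without the intermediate flush, both sprites end up rendered with the second uniform value; with it, each sprite gets its own.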

Do we need to add a texture to the texture cache to get the benefits of autobatching?

Normally, when one loads the sprite frame cache from a file by calling:
SpriteFrameCache::getInstance()->addSpriteFramesWithFile(filename);
internally, the texture corresponding to that file is added to the texture cache by calling:
Texture2D *texture = Director::getInstance()->getTextureCache()->addImage(texturePath.c_str());
which is basically creating a Texture2D object from that image and storing it in an unordered_map.
This does not happen internally when I generate my texture on the fly and add sprite frames to the SpriteFrameCache by calling the code below within a loop:
SpriteFrame* frame;
if (!isRotated) {
frame = SpriteFrame::createWithTexture(texture, rect, isRotated, offset, originalSize);
}else{
frame = SpriteFrame::createWithTexture(texture, rect, isRotated, offset, originalSize);
}
SpriteFrameCache::getInstance()->addSpriteFrame(frame, frameName);
It seems that no calls are made internally to addImage() in the texture cache, when I add frames this way (by calling addSpriteFrame()), even though all the sprite frames are using the same texture.
The counter on the bottom left that displays the number of openGL calls says there are only 2 calls, regardless of how many frames I add to the screen.
When calling
p Director::getInstance()->getTextureCache()->getCachedTextureInfo()
I get the output:
(std::__1::string) $0 = "\"/cc_fps_images\" rc=4 id=254 999 x 54 # 16 bpp => 105 KB\nTextureCache dumpDebugInfo: 1 textures, for 105 KB (0.10 MB)\n"
That is the texture that shows the FPS counter, so there is no sign of my texture, yet at the same time there is no problem adding frames that use that texture.
So my question is: Will there be a performance problem later on because of this? Should I add the texture to the texture cache manually? Are there any other problems that I may encounter by adding my sprite frames this way?
Also, my texture is created by using Texture2D* tex = new Texture2D(), and then initWithData(). So should I keep a reference to this pointer and call delete later? Or is it enough to just call removeUnusedTextures?
So my question is: Will there be a performance problem later on because of this?
It depends on how many times you'd be using this texture.
Should I add the texture to the texture cache manually?
Again, it depends on how many times you'd use it. If it is created dynamically more than once, caching will improve performance because you don't have to recreate it again and again.
Are there any other problems that I may encounter by adding my sprite frames this way?
I don't think so.
Also, my texture is created by using Texture2D* tex = new Texture2D(), and then initWithData(). So should I keep a reference to this pointer and call delete later?
If you just want to abandon tex (for example by making it a local variable) because you have already created a sprite from it, you can. But the sprite simply holds a pointer to this texture. If you release the texture itself, it will disappear (the sprite will probably become a black rectangle).
Or is it enough to just call removeUnusedTextures?
This just clears the TextureCache map. If your texture isn't in there, it won't be released.
It depends on the use case for this texture. Let's imagine you have a texture containing a bullet (created with initWithData) that is used frequently. You can keep one texture object stored in your scene and create all bullet sprites from that one texture; using the TextureCache won't be any faster. However, you have to remember to release the texture's memory when you no longer need it (for example when you leave the scene), because you create the Texture2D with the new keyword rather than with a create-style factory (like Sprite::create; Texture2D doesn't have one), which would manage memory automatically.
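A toy model may help show why clearing the cache cannot free a texture that was never put into it. This sketch uses std::shared_ptr in place of cocos2d-x's retain/release reference counting, and `ToyTexture` and `ToyTextureCache` are hypothetical names, not the cocos2d-x API.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <unordered_map>

struct ToyTexture { int width; int height; };

// Toy stand-in for a texture cache (NOT the cocos2d-x TextureCache API).
class ToyTextureCache {
public:
    // Returns the cached texture for `key`, creating it on first use.
    std::shared_ptr<ToyTexture> add(const std::string& key, int w, int h) {
        assert(w > 0 && h > 0);
        auto it = textures_.find(key);
        if (it != textures_.end()) return it->second;
        auto tex = std::make_shared<ToyTexture>(ToyTexture{w, h});
        textures_.emplace(key, tex);
        return tex;
    }

    // Drops only those entries that nobody outside the cache still references.
    void removeUnused() {
        for (auto it = textures_.begin(); it != textures_.end();) {
            it = (it->second.use_count() == 1) ? textures_.erase(it) : std::next(it);
        }
    }

    std::size_t size() const { return textures_.size(); }

private:
    std::unordered_map<std::string, std::shared_ptr<ToyTexture>> textures_;
};
```

A texture created outside the cache is unaffected by removeUnused() here; its owner has to let go of it explicitly, which mirrors the advice above.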

constructor for SpriteBatch explained

This is a definition of SpriteBatch constructor from docs:
SpriteBatch()
Constructs a new SpriteBatch with a size of 1000, one buffer, and the default shader.
A buffer is like temporary storage for data that is about to be drawn on screen. So does one buffer mean one piece of memory in RAM? Is the size parameter the number of bytes in this piece of memory?
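As far as I recall from the libgdx source (this is an assumption, not something stated in the docs quoted above), the size counts sprites rather than bytes, and each sprite occupies 4 vertices of 5 floats each (x, y, packed color, u, v). Under that assumption the arithmetic is sketched below; all names are hypothetical.

```cpp
#include <cstddef>

// Assumed layout (from memory of libgdx's SpriteBatch, not confirmed by the quoted docs):
// 4 vertices per sprite, 5 floats per vertex (x, y, packed color, u, v).
constexpr std::size_t kVerticesPerSprite = 4;
constexpr std::size_t kFloatsPerVertex = 5;
constexpr std::size_t kBytesPerFloat = 4;

// Number of floats needed to hold `maxSprites` queued sprites.
constexpr std::size_t vertexFloats(std::size_t maxSprites) {
    return maxSprites * kVerticesPerSprite * kFloatsPerVertex;
}

// The same capacity expressed in bytes.
constexpr std::size_t vertexBytes(std::size_t maxSprites) {
    return vertexFloats(maxSprites) * kBytesPerFloat;
}
```

With the default size of 1000, that would be 20000 floats, or 80000 bytes of vertex data, before the batch has to flush.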

Should the model view projection matrix be built in Actionscript 3 or on the GPU in the vertex shader?

All of the Stage3D examples I have seen build the model view projection matrix in AS3 on each render event. eg:
modelMatrix.identity();
// Create model matrix here
modelMatrix.translate/rotate/scale
...
modelViewProjectionMatrix.identity();
modelViewProjectionMatrix.append( modelMatrix );
modelViewProjectionMatrix.append( viewMatrix );
modelViewProjectionMatrix.append( projectionMatrix );
// Model view projection matrix to vertex constant register 0
context3D.setProgramConstantsFromMatrix( Context3DProgramType.VERTEX, 0, modelViewProjectionMatrix, true );
...
And a single line in the vertex shader transforms the vertex into screen space :
m44 op, va0, vc0
Is there a reason for doing it this way?
Aren't these kinds of calculation what the GPU was made for?
Why not instead only update the view and projection matrix when they change and upload each to separate registers :
// Projection matrix to vertex constant register 0
// This could be done once on initialization or when the projection matrix changes
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 0, projectionMatrix, true);
// View matrix to vertex constant register 4
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 4, viewMatrix, true);
Then on each frame and for each object :
modelMatrix.identity();
// Create model matrix here
modelMatrix.translate/rotate/scale
...
// Model matrix to vertex constant register 8
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 8, modelMatrix, true);
...
And the shader would instead look like this :
// Perform model view projection transformation and store the results in temporary register 0 (vt0)
// - Multiply vertex position by model matrix (vc8)
m44 vt0, va0, vc8
// - Multiply vertex position by view matrix (vc4)
m44 vt0, vt0, vc4
// - Multiply vertex position by projection matrix (vc0) and write the result to the output register
m44 op, vt0, vc0
UPDATE
I have now found another question here which might have already answered this question :
DirectX world view matrix multiplications - GPU or CPU the place
This is a tough optimization problem. The first thing you should ask: Is that really a bottleneck? If yes, you have to consider this:
Doing the matrix multiply in AS3 is slower than it should be.
Extra matrix transforms in the vertex program are practically free.
Setting one matrix is faster than setting multiple matrices as constants!
Do you need the concatenated matrix somewhere else anyway? Picking maybe?
There is no simple answer. For speed I would let the GPU do the work. But in many cases you might want a compromise: send the model->world and the world->clip matrices, as in classic OpenGL. For Molehill specifically, do more work on the GPU in the vertex program. But always make sure that this issue really is a bottleneck before worrying about it too much.
tl;dr: Do it in the vertex program if you can!
Don't forget that the vertex shader runs per vertex, so you end up doing the multiplication hundreds of thousands of times per frame, while the AS3 version only does the multiplication once per object per frame.
As with every performance problem:
Optimize stuff that runs often and ignore the things that run only now and then.
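The two approaches compared in this thread give the same vertex position; they differ only in where the work is done. A minimal sketch in plain C++ (not AS3 or AGAL), assuming row-major matrices with column vectors, and with all function names hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major, column-vector convention
using Vec4 = std::array<float, 4>;

// Matrix product a * b: done once per object when concatenating on the CPU.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Matrix-vector product m * v: roughly what one m44 instruction does per vertex.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

Mat4 identity() {
    Mat4 m{};
    for (std::size_t i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[0][3] = x; m[1][3] = y; m[2][3] = z;
    return m;
}

Mat4 scaling(float s) {
    assert(s != 0.0f);  // a zero scale would collapse the geometry
    Mat4 m = identity();
    m[0][0] = m[1][1] = m[2][2] = s;
    return m;
}

// Variant 1: concatenate once, then one transform per vertex.
Vec4 transformConcatenated(const Mat4& proj, const Mat4& view, const Mat4& model, const Vec4& v) {
    return transform(multiply(proj, multiply(view, model)), v);
}

// Variant 2: three transforms per vertex, as in the three-m44 shader.
Vec4 transformStepwise(const Mat4& proj, const Mat4& view, const Mat4& model, const Vec4& v) {
    return transform(proj, transform(view, transform(model, v)));
}
```

The first variant pays for two matrix products per object and one transform per vertex; the second pays for three transforms per vertex, so which is cheaper depends on the vertex count and on where the bottleneck is.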

Can one index CUDA texture with integers

Just like the topic says: can one access a CUDA texture using integer coordinates?
e.g.
tex2D(myTex, 1, 1);
I'd like to store float values in a texture and use it as my framebuffer.
I will then pass it to OpenGL to render on screen.
Is this kind of addressing possible? I don't want to interpolate between pixels. I want the value from exactly the specified point.
Note: there isn't really any interpolation going on when you use the 0.5 offset notation for multi-dimensional textures (the actual pixel values start at (0.5, 0.5)). If you're really worried, set the filter mode to point (nearest-texel) sampling rather than bilinear filtering.
If you use 1D textures instead (when the underlying data is 2D), you may lose performance due to lack of data locality in the other dimension.
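The 0.5 offset mentioned above can be checked with a small host-side model of 1D texture fetching with unnormalized coordinates. This is a sketch in plain C++ of the filtering rules as I understand them from the CUDA programming guide; it ignores the reduced precision of the hardware's interpolation weights, and `fetchPoint` and `fetchLinear` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Clamps a texel index into [0, size - 1].
long clampIndex(long i, long size) {
    return std::min(std::max(i, 0L), size - 1);
}

// Point sampling: the texel whose cell contains x, i.e. T[floor(x)].
float fetchPoint(const std::vector<float>& texels, float x) {
    assert(!texels.empty());
    const long size = static_cast<long>(texels.size());
    return texels[clampIndex(static_cast<long>(std::floor(x)), size)];
}

// Linear filtering: blend the two texels around x - 0.5.
float fetchLinear(const std::vector<float>& texels, float x) {
    assert(!texels.empty());
    const long size = static_cast<long>(texels.size());
    const float xb = x - 0.5f;
    const float base = std::floor(xb);
    const float alpha = xb - base;  // fractional part, 0 at a texel center
    const long i0 = clampIndex(static_cast<long>(base), size);
    const long i1 = clampIndex(static_cast<long>(base) + 1, size);
    return (1.0f - alpha) * texels[i0] + alpha * texels[i1];
}
```

Sampling at i + 0.5 gives a blend weight of zero, so even the linear filter returns texel i unchanged; only coordinates between texel centers produce a mix.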
If you want to use the texture cache without using any of the texture-specific operations such as interpolation, you can use tex1Dfetch(). This lets you index with integers.
The size limit is 2^27 elements, so you will be able to access 512 MB with floats, or 1GB with int2 [which can also be used to retrieve doubles via __hiloint2double()]. Larger data can be accessed by mapping multiple textures on top of it that cover the data.
You will have to map any multi-dimensional array accesses to the one-dimensional array supported by tex1Dfetch(). I have always used simple C macros for that.
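The index mapping and the size arithmetic above can be sketched as follows. This is host-side C++ with hypothetical names (`IDX2D`, `index2D`, `addressableBytes`); in a kernel, the resulting index would be the integer argument passed to tex1Dfetch().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: row-major mapping of (x, y) to a linear element index.
#define IDX2D(x, y, width) ((y) * (width) + (x))

// The same mapping as a bounds-checked function.
inline std::size_t index2D(std::size_t x, std::size_t y,
                           std::size_t width, std::size_t height) {
    assert(x < width && y < height);
    return IDX2D(x, y, width);
}

// Element limit quoted above for textures read with tex1Dfetch(): 2^27.
constexpr std::size_t kMaxElements = std::size_t{1} << 27;

// Bytes addressable through one such texture for a given element size.
constexpr std::size_t addressableBytes(std::size_t bytesPerElement) {
    return kMaxElements * bytesPerElement;
}
```

With 4-byte floats this gives 512 MB and with 8-byte int2 elements 1 GB, matching the figures quoted above.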