Optimizing H.264 MediaFoundation encoding

I'm writing a screen capture application that uses the UWP Screen Capture API.
As a result, I get a callback each frame with an ID3D11Texture2D containing the target screen's or application's image, which I want to use as input to Media Foundation's SinkWriter to create an H.264 video in an MP4 container file.
There are two issues I am facing:
The texture is not CPU-readable, and attempts to map it fail.
The texture has padding (the image stride is larger than pixel width × bytes per pixel), I assume.
To solve them, my approach has been to:
Use ID3D11DeviceContext::CopyResource to copy the original texture into a new one created with D3D11_USAGE_STAGING and D3D11_CPU_ACCESS_READ flags
Since that texture too has padding, create an IMFMediaBuffer wrapping it with MFCreateDXGISurfaceBuffer, cast it to IMF2DBuffer, and use IMF2DBuffer::ContiguousCopyTo to copy into another IMFMediaBuffer that I create with MFCreateMemoryBuffer.
So I'm basically copying every frame twice, once on the GPU and once on the CPU, which works, but seems very inefficient.
What is a better way of doing this? Is MediaFoundation smart enough to deal with input frames that have padding?

The inefficiency comes from your attempt to Map the texture rather than use it directly as video encoder input. The MF H.264 encoder, which in most cases is a hardware encoder, can take video-memory-backed textures as direct input, and this is what you want to do (set the encoder up accordingly - see the D3D/DXGI device managers).
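A minimal sketch of that setup (assumptions: d3dDevice is the ID3D11Device the capture runs on, texture is the per-frame ID3D11Texture2D, frameTime/frameDuration are in 100 ns units; COM error handling omitted):
// Once, at startup: hand your D3D11 device to the sink writer.
UINT resetToken = 0;
IMFDXGIDeviceManager *deviceManager = nullptr;
MFCreateDXGIDeviceManager(&resetToken, &deviceManager);
deviceManager->ResetDevice(d3dDevice, resetToken); // same device the capture uses

IMFAttributes *attributes = nullptr;
MFCreateAttributes(&attributes, 2);
attributes->SetUnknown(MF_SINK_WRITER_D3D_MANAGER, deviceManager);
attributes->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);

IMFSinkWriter *writer = nullptr;
MFCreateSinkWriterFromURL(L"capture.mp4", nullptr, attributes, &writer);

// Per frame: wrap the capture texture in a sample - no staging copy, no Map.
IMFMediaBuffer *buffer = nullptr;
MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture, 0, FALSE, &buffer);

DWORD length = 0;
IMF2DBuffer *buffer2d = nullptr;
buffer->QueryInterface(IID_PPV_ARGS(&buffer2d));
buffer2d->GetContiguousLength(&length);
buffer->SetCurrentLength(length); // the sink writer wants a length set on the buffer

IMFSample *sample = nullptr;
MFCreateSample(&sample);
sample->AddBuffer(buffer);
sample->SetSampleTime(frameTime);
sample->SetSampleDuration(frameDuration);
writer->WriteSample(videoStreamIndex, sample);
You still configure the stream and input media types as before; with the device manager in place, the (hardware) encoder reads the texture from video memory directly.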
Padding does not apply to texture frames. Where padding does occur, in frames carried as traditional system-memory data, Media Foundation primitives are normally capable of processing it: see Image Stride and MF_MT_DEFAULT_STRIDE, and the other Video Format Attributes.
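And if you do end up submitting system-memory frames with padding, declare the stride on the input media type instead of repacking rows yourself; a one-line sketch, where mediaTypeIn is the IMFMediaType you pass to SetInputMediaType and rowPitch is the RowPitch of the mapped texture:
mediaTypeIn->SetUINT32(MF_MT_DEFAULT_STRIDE, rowPitch); // bytes per row, padding included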

Related

HTML5 video and canvas CPU optimization

I am making an app with an HTML5 video and a canvas drawing on top of it (640x480 px). The objective is to record the canvas drawing along with the encoded video and produce it as a single video, which I was able to do using FFMPEG. The problem I am facing is that when the HTML5 video is running, it takes around 50% of my CPU. Since drawing on the canvas is also CPU-demanding, the browser freezes after a certain time, and Chrome shows the CPU usage for that tab continuously above 100%. I have tried to optimize the HTML5 canvas rendering, but nothing helped. Is there a way to run this video with much less CPU usage, or are there any other possible optimizations?
There is not very much you can do if the hardware needs the CPU to decode and display the video. The keyword is compromise.
A few things can be done, though, that remove additional barriers. These must be considered general tips:
Efficient looping
Make sure to use requestAnimationFrame() to invoke your loop, if you aren't already.
setTimeout()/setInterval() are relatively performance-heavy and cannot sync properly to the monitor's refresh rate.
Reduce update load
Also, if you're not already doing this: the canvas is usually updated at 60 FPS while a video is rarely above 30/29.97 FPS (25 FPS in Europe). This means you can skip every second frame update and still show the video at its optimal rate. Use a toggle to achieve this, as in the snippet below.
Video at 25 FPS will be re-synced to 30 FPS (monitors typically run at 60 Hz even for European models, which are electronically re-synced internally; this also means the browser needs to deal with dropped/doubled frames internally - nothing we can do here).
// draw video at 30 FPS
var toggle = false;
(function loop() {
    toggle = !toggle;
    if (toggle) { /* draw frame here */ }
    requestAnimationFrame(loop);
})();
Avoid scaling of the canvas element
Make sure the canvas size and the CSS size are exactly the same. Or, put simply: don't use CSS to set the size of the canvas at all, as shown below.
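In other words (assuming a 640x480 video):
// Size the drawing buffer directly; don't let CSS rescale it.
canvas.width = 640;
canvas.height = 480;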
Disable alpha channel composition
You can disable alpha composition of the canvas in some browsers to get a little more speed. Consumer video never comes with an alpha channel.
// disable alpha-composition of the canvas element where supported
var context = canvas.getContext("2d", {alpha: false});
Tweak settings at encoding stage
Make sure to encode the video using a balance between size and decompression load. The more a video is compressed, the more work is needed to reconstruct it, at a cost. Encode with different encoder settings to find a balance that works in your scenario.
Also consider aspects such as color depth, i.e. 16 versus 24 bit.
The H.264 codec is preferable as it has wide hardware decoding support.
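Since the question mentions FFMPEG, a hypothetical starting point for such experiments (raise the CRF to compress more at a quality cost; -tune fastdecode trades some compression for cheaper decoding):
ffmpeg -i source.mp4 -c:v libx264 -crf 23 -tune fastdecode output.mp4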
Reduce video FPS
If the content of the video allows it, e.g. there is little movement or change, encode at 15 FPS instead of 30 FPS. If so, also use a modulo instead of a toggle (as shown above), so that you skip three frames and update the canvas only on every fourth:
// draw video at 15 FPS
var modToggle = 0;
(function loop() {
    if (modToggle++ % 4 === 0) { /* draw frame here */ }
    requestAnimationFrame(loop);
})();
Encode video at smaller source size
Encode the video at a slightly smaller size divisible by 8 (in this case I would even suggest half size, 320x240 - experiment!). Then draw it scaled up using the scale parameters of the drawImage() method:
context.drawImage(video, 0, 0, video.videoWidth, video.videoHeight, 0, 0, 640, 480);
This helps reduce the amount of data that needs to be loaded and decoded, but will of course reduce the quality. How it turns out depends, again, on the content.
You can also turn off interpolation by setting imageSmoothingEnabled to false on the context (note: the property needs a vendor prefix in some browsers). If you do this, you may not want to reduce the size by as much as 50%, but only slightly (something like 600x420 in this case).
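The property and its prefixed variants (older browsers only recognize the prefixed forms):
// Disable interpolation when scaling the smaller source up.
context.imageSmoothingEnabled = false;
context.webkitImageSmoothingEnabled = false; // older Chrome/Safari
context.mozImageSmoothingEnabled = false;    // older Firefox
context.msImageSmoothingEnabled = false;     // IE/Edge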
Note: even if you "reduce" the frame rate this way, the loop is still invoked at 60 FPS; but since it doesn't do any actual work on the intermediate frames, it still off-loads the CPU/GPU, giving you a less tight performance budget overall.
Hope this gives some input.

How to use cache on Movieclip with createJS and Flash CC

Hi all, I want to ask something. I've made a game with Flash CC and createJS. It's a drag-and-drop game (3 objects to drag and 3 objects to drop) with a lot of vector MovieClip objects. But when I run it on mobile, the game seems to have performance issues. I've read some articles that talk about caching objects, but I really don't know anything about caching and don't know how to use it on an object like a MovieClip. Do you have any explanation or solution, or maybe a tutorial on how to use the cache function? Thank you very much.
From the docs:
Draws the display object into a new canvas, which is then used for subsequent draws. For complex content that does not change frequently (ex. a Container with many children that do not move, or a complex vector Shape), this can provide for much faster rendering because the content does not need to be re-rendered each tick. The cached display object can be moved, rotated, faded, etc freely, however if its content changes, you must manually update the cache by calling updateCache() or cache() again. You must specify the cache area via the x, y, w, and h parameters. This defines the rectangle that will be rendered and cached using this display object's coordinates.
http://createjs.com/Docs/EaselJS/classes/DisplayObject.html#method_cache
So, you don't want to cache a playing MovieClip (you would have to update the cache every frame, which is slow). However, you could cache elements in the MC that are just being transformed.
For example, an animation of a walking character, with complex vector shapes for the arms, legs, head, and body that are being transformed (scaled, rotated, translated) to create the walk animation. You wouldn't cache the character MC, but you could cache the body parts themselves.
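For example (a sketch; legShape is a hypothetical complex vector Shape inside the clip):
// Rasterize the vector art once; the walk cycle then only transforms a bitmap.
legShape.cache(0, 0, 100, 100); // x, y, width, height in the shape's own coordinates
// Only needed if the vector content itself changes later:
legShape.updateCache();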

AS3: Disposing of cacheAsBitmap bitmaps?

I've got a large, detailed, interactive vector object with dynamic text that will frequently translate horizontally from off the screen onto the screen when the user needs to look at it, then back off the screen when the user is done. If I set myVector.cacheAsBitmap = true before translating it and myVector.cacheAsBitmap = false after translating it, what happens to all of those bitmaps that are generated each time? Do I have to dispose of them myself?
From the Adobe help on bitmap caching:
Turning on bitmap caching for an animated object that contains complex vector graphics (such as text or gradients) improves performance. However, if bitmap caching is enabled on a display object such as a movie clip that has its timeline playing, you get the opposite result. On each frame, the runtime must update the cached bitmap and then redraw it onscreen, which requires many CPU cycles. The bitmap caching feature is an advantage only when the cached bitmap can be generated once and then used without the need to update it.
If you turn on bitmap caching for a Sprite object, the object can be moved without causing the runtime to regenerate the cached bitmap. Changing the x and y properties of the object does not cause regeneration. However, any attempt to rotate it, scale it, or change its alpha value causes the runtime to regenerate the cached bitmap, and as a result, hurts performance.
Conclusion
If you make a simple translational movement along the x or y axis, your bitmap is created only once.
Do I have to dispose of them myself?
It seems that you can't touch the cached bitmap: it is only used internally by the Flash player, so there is nothing for you to dispose of.
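So the pattern from the question is all you need (a sketch; panel is a hypothetical display object):
panel.cacheAsBitmap = true;  // rasterized once; pure x/y movement reuses the bitmap
// ...tween panel.x on-screen, and later back off...
panel.cacheAsBitmap = false; // the player releases the internal bitmap itself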

How to animate textures in a 3d model?

I wish to have an animated texture on a 3D model in my LibGDX code, but I am struggling to find out how to do it.
I assume how this "should" be done is either:
a) Directly accessing and modifying the texture on the model (via a Pixmap? ByteBuffer?)
or
b) Pre-rendering a big image containing all the frames (say, 20) and then moving the UV coordinates to create the illusion of animation (akin to image strips in 2D/web design).
I did work out how I could completely replace the material each time, but that seems a much worse way of doing it. So if anyone could show the commands I need for either a) or b) (or a similar optimal method), I would be grateful.
Maths I am fine with. The intricacies of OpenGL ES or GDX I am not :)
(The solution should at least work for HTML/Android compiles, ideally everything.)
Since the latest release it is very easy to play a 2D animation on a 3D surface. First make sure to get familiar with the 2D animation concept, as explained over here: https://github.com/libgdx/libgdx/wiki/2D-Animation. Then, instead of using a SpriteBatch, you can use the TextureRegion (which Animation#getKeyFrame returns) to set the material of the surface, as shown here: https://github.com/libgdx/libgdx/blob/master/tests/gdx-tests/src/com/badlogic/gdx/tests/g3d/TextureRegion3DTest.java. So basically, in your render method you would have:
attribute.set(animation.getKeyFrame(stateTime, true));
Or if you want a more generic approach:
instance.getMaterial("<name of material>").get(TextureAttribute.class, TextureAttribute.Diffuse).set(animation.getKeyFrame(stateTime, true));
Or, if there's only one material in the ModelInstance:
instance.materials.get(0).get(TextureAttribute.class, TextureAttribute.Diffuse).set(animation.getKeyFrame(stateTime, true));
If you have the memory for it I would definitely choose b); it is easier on the processor, since each frame you only change a uniform's value. However, due to the preprocessing it might take some time to open the application.
Get your uniform's location where you compile your shaders; animationPos should be global.
GLint animationPos = glGetUniformLocation(shaderProgram, "nameoftheuniform");
Your main loop should pass the animationPos value to the shader:
glUniform1i(animationPos, currentAnimationIndex);
Add this to your fragment shader's variables:
uniform int animationPos;
Fragment shader main:
float texCoordY = texCoord.y; // texture coordinates passed in from the vertex shader
float texCoordX = texCoord.x / 20.0; // divide by 20: the strip holds 20 frames, so each frame spans 1/20 of the texture
float textureIndex = float(animationPos) / 20.0; // offset of the current frame's left edge within the strip
gl_FragColor = texture2D(yourTexture, vec2(textureIndex + texCoordX, texCoordY));
The code above assumes you expanded your frames along the x direction; you can also lay them out as a matrix, in which case you need to change the texCoord calculation. It also assumes 20 frames.
Option a) is heavier on the processor, and since you would be uploading the texture every time it also uses more bus bandwidth, but it is easier on memory. The question is more of a design decision, but I guess 20 images can be handled, so go with option b).
Edit: Added code.

Draw bitmap data on html5 canvas

I am receiving base64-encoded bitmap data through sockets. I decode this data using atob and now have the bitmap data ready to be drawn on an HTML5 canvas.
I was reading this post
Data URI leak in Safari (was: Memory Leak with HTML5 canvas)
and although I am able to draw my bitmap data on the canvas using context.drawImage(..), I am looking for a different approach, due to memory leaks when calling drawImage too many times.
This post
Data URI leak in Safari (was: Memory Leak with HTML5 canvas)
refers to "creating a pixel data array and writing it to the canvas with putImageData" however there is no code to support this.
How can I create a pixel data array from my bitmap data?
Basically I want to draw my bitmap data on the canvas, by using putImageData.
Thank you.
Maybe you will find this useful: Link.
If this is not what you were looking for, elaborate a little more on your query and I will surely try to help.
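For reference, a minimal sketch of the putImageData approach the question describes, assuming the decoded string holds raw, tightly packed RGBA bytes matching the canvas size (if it is a complete BMP file, you must skip or parse the header first; names like base64Payload, width and height are placeholders):
var raw = atob(base64Payload); // binary string from the socket
var imageData = context.createImageData(width, height);
var pixels = imageData.data;   // Uint8ClampedArray, 4 bytes per pixel (RGBA)
for (var i = 0; i < pixels.length; i++) {
    pixels[i] = raw.charCodeAt(i) & 0xff; // copy one byte per channel
}
context.putImageData(imageData, 0, 0); // blit without going through an <img>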