I have a D3D11 Texture2d with the format DXGI_FORMAT_R10G10B10A2_UNORM and want to convert this into a D3D11 Texture2d with a DXGI_FORMAT_R32G32B32A32_FLOAT or DXGI_FORMAT_R8G8B8A8_UINT format, as those textures can only be imported into CUDA.
For performance reasons I want this to fully operate on the GPU. I read some threads suggesting, I should set the second texture as a render target and render the first texture onto it or to convert the texture via a pixel shader.
But as I don't know a lot about D3D I wasn't able to do it like that.
In an ideal world I would be able to do this stuff without setting up a whole rendering pipeline including IA, VS, etc...
Does anyone maybe has an example of this or any hints?
Thanks in advance!
On the GPU, the way you do this conversion is a render-to-texture which requires at least a minimal 'offscreen' rendering setup.
Create a render target view (DXGI_FORMAT_R32G32B32A32_FLOAT, DXGI_FORMAT_R8G8B8A8_UINT, etc.). The restriction here is it needs to be a format supported as a render target view on your Direct3D Hardware Feature level. See Microsoft Docs.
Create a SRV for your source texture. Again, needs to be supported as a texture by your Direct3D Hardware device feature level.
Render the source texture to the RTV as a 'full-screen quad'. with Direct3D Hardware Feature Level 10.0 or greater, you can have the quad self-generated in the Vertex Shader so you don't really need a Vertex Buffer for this. See this code.
Given your are starting with DXGI_FORMAT_R10G10B10A2_UNORM, then you pretty much require Direct3D Hardware Feature Level 10.0 or better. That actually makes it pretty easy. You still need to actually get a full rendering pipeline going, although you don't need a 'swapchain'.
You may find this tutorial helpful.
Related
I have been working on a project with Direct3D on Windows Phone. It is just a simple game with 2d graphics, and I make use of DirectXTK for helping me out with sprites.
Recently , I have come across to an out of memory error while I was debugging on 512mb emulator. This error was not common and was the result of a sequence of open, suspend , open , suspend ...
Tracking it down, I found out that the textures are loaded on every activation of the app, and finally filling up the allowed memory. To solve it , I will probably go and edit it so as to load textures only on opening but activation from suspends; but after this problem I am curious about the correct life cycle management of texture resources. While searching I have came across to Automatic (or "managed" by microsoft) Texture Management http://msdn.microsoft.com/en-us/library/windows/desktop/bb172341(v=vs.85).aspx . which can probably help out with some management of the textures in video memory.
However, I also would like to know other methods since I couldnt figure out a good way to incorporate managed textures into my code.
My best is to call the Release method of ID3D11ShaderResourceView pointers I store in destructors to prevent filling up the memory , but how do I ensure textures are resting in memory while other apps would want to use it(the memory)?
Windows phone uses Direct3D 11 which 'virtualizes' the GPU memory. Essentially every texture is 'managed'. If you want a detailed description of this, see "Why Your Windows Game Won't Run In 2,147,352,576 Bytes?". The link you provided is Direct3D 9 era for Windows XP XPDM, not any Direct3D 11 capable platform.
It sounds like the key problem is that your application is leaking resources or has too large a working set. You should enable the debug device and first make sure you have cleaned up everything as you expected. You may also want to check that you are following the recommendations for launching/resuming on MSDN. Also keep in mind that Direct3D 11 uses 'deferred destruction' of resources so just because you've called Release everywhere doesn't mean that all the resources are actually gone... To force a full destruction, you need to use Flush.
With Windows phone 8.1 and Direct3D 11.2, there is a specific Trim functionality you can use to reduce an app's memory footprint, but I don't think that's actually your issue.
So, I think I understand the basic functionality of cuda, and also how the graphics pipeline works. But what I don't understand is how CUDA raytracing engines combine those two. Since the vertices of a scene are stored in the graphics pipeline via directx or opengl, I can't see how this information can be accessed via the cuda pipeline.
I can't see how this information can be accessed via the cuda pipeline.
First of all, you can write a ray-tracer that doesn't use the graphics pipeline at all, except for final display of the pixels (e.g. glDrawPixels). An example is here.
But in general, the sharing of data between OGL/DX and CUDA is facilitated by the interop APIs. These APIs allow for various kinds of data to be shared between CUDA and OpenGL, including final rendered pixels, geometry data, and textures. There are plenty of CUDA sample codes which demonstrate all of these types of data sharing, in both directions, including display of results.
I am making racing game in Libgdx.My game apk size is 9.92 mb and I am using four texture packer of total size is 9.92 Mb. My game is running on desktop but its run on android device very slow. What is reason behind it?
There are few loopholes which we neglect while programming.
Desktop processors are way more powerful so the game may run smoothly on Desktop but may slow on mobile Device.
Here are some key notes which you should follow for optimum game flow:
No I/O operations in render method.
Avoid creating Objects in Render Method.
Objects must be reused (for instance if your game have 1000 platforms but on current screen you can display only 3, than instead of making 1000 objects make 5 or 6 and reuse them). You can use Pool class provided by LibGdx for object pooling.
Try to load only those assets which are necessary to show on current screen.
Try to check your logcat if the Garbage collector is called. If so than try to use finalize method of object class to find which class object are collected as garbage and try to improve on it.
Good luck.
I've got some additional tips for improving performance:
Try to minimize texture bindings (or generally bindings when you're making a 3D game for example) in you render loop. Use texture atlases and try to use one texture after binding as often as possible, before binding another texture unit.
Don't display things that are not in the frustum/viewport. Calculate first if the drawn object can even be seen by the active camera or not. If it's not seen, just don't load it onto your GPU when rendering!
Don't use spritebatch.begin() or spritebatch.end() too often in the render loop, because every time you begin/end it, it's flushed and loaded onto the GPU for rendering its stuff.
Do NOT load assets while rendering, except you're doing it once in another thread.
The latest versions of libgdx also provide a GLProfiler where you can measure how many draw calls, texture bindings, vertices, etc. you have per frame. I'd strongly recommend this since there always can be situations where you would not expect an overhead of memory/computational usage.
Use libgdx Poolable (interface) objects and Pool for pooling objects and minimizing the time for object creation, since the creation of objects might cause tiny but noticable stutterings in your game-render loop
By the way, without any additional information, no one's going to give you a good or precise answer. If you think it's not worth it to write enough text or information for your question, why should it be worth it to answer it?
To really understand why your game is running slow you need to profile your application.
There are free tools avaiable for this.
On Desktop you can use VisualVM.
On Android you can use Android Monitor.
With profiling you will find excatly which methods are taking up the most time.
A likely cause of slowdowns is texture binding. Do you switch between different pages of packed textures often? Try to draw everything from one page before switching to another page.
The answer is likely a little more that just "Computer fast; phone slow". Rather, it's important to note that your computer Java VM is likely Oracles very nicely optimized JVM while your phone's Java VM is likely Dalvik, which, to say nothing else of its performance, does not have the same optimizations for object creation and management.
As others have said, libGDX provides a Pool class for just this reason. Take a look here: https://github.com/libgdx/libgdx/wiki/Memory-management
One very important thing in LibGDX is that you should make sure that sometimes loading assets from the memory cannot go in the render() method. Make sure that you are loading the assets in the right times and they are not coming in the render method.
Another very important thing is that try to calculate your math and make it independent of the render in the sense that your next frame should not wait for calculations to happen...!
These are the major 2 things i encountered when I was making the Snake game Tutorial.
Thanks,
Abhijeet.
One thing I have found, is that drawing is laggy. This means that if you are drawing offscreen items, then it uses a lot of useless resources. If you just check if they are onscreen before drawing, then your performance improves by a lot surprisingly.
Points to ponder (From personal experience)
DO NOT keep calling a function,in render method, that updates something like time,score on HUD (Make these updates only when required eg when score increases ONLY then update score etc)
Make calls IF specific (Make updations on certain condition, not all the time)
eg. Calling/updating in render method at 60FPS - means you update time 60 times a sec when it just needs to be updated once per sec )
These points will effect hugely on performance (thumbs up)
You need to check the your Image size of the game.If your image size are more than decrease the size of images by using the following link "http://tinypng.org/".
It will be help you.
As we know many HTML 5 renderers use the GPU to draw canvas elements. I'm wondering about using this ability to trigger the GPU to use it for GPGPU. There probably are no native GPGPU functions in the canvas API or HTML 5, but what about a hack to do that?
I was thinking about using something like a texture (2D or 3D array) with the values to be processed and then ask a canvas element to perform some operation on this matrix. This operation has to be a function that I can somehow send to the canvas element. Then we have browser-based GPGPU.
Is such a thing possible? What do you think? Do you have any other ideas of how to implement this?
There is WebCL standard which is created exactly to give Javascripts running in browser access to GPGPU's computational power (provided client has any). However the list of existing implementations is pretty short.
Successful attempts to harness GPU power for genral-purpose calculations were long before (and lead to) the emergence of CUDA, OpenCL and similar GPGPUs framework. Here is what looks like a good tutorial, and I guess it is portable to WebGL (which has much broader support then WebCL). See #MikkoOhtamaa's answer for good introductory article about WebGL itself
You probably want to use webGL shaders for your nefarious purposes.
http://www.html5rocks.com/en/tutorials/webgl/shaders/
Shaders provide limited opportunities for parallel computations.
I am dealing with a set of (largish 2k x 2k) images
I need to do per-pixel operations down a stack of a few sequential images.
Are there any opinions on using a single 2D large texture + calculating offsets vs using 3D arrays?
It seems that 3D arrays are a bit 'out of the mainstream' in the CUDA api, the allocation transfer functions are very different from the same 2D functions.
There doesn't seem to be any good documentation on the higher level "how and why" of CUDA rather than the specific calls
There is the best practices guide but it doesn't address this
I would recommend you to read the book "Cuda by Example". It goes through all these things that aren't documented as well and it'll explain the "how and why".
I think what you should use if you're rendering the result of the CUDA kernel is to use OpenGL interop. This way, your code processes the image on the GPU and leaves the processed data there, making it much faster to render. There's a good example of doing this in the book.
If each CUDA thread needs to read only one pixel from the first frame and one pixel from the next frame, you don't need to use textures. Textures only benefit you if each thread is reading in a bunch of consecutive pixels. So you're best off using a 3D array.
Here is an example of using CUDA and 3D cuda arrays:
https://github.com/nvpro-samples/gl_cuda_interop_pingpong_st