How to emulate CUDA on Windows

Is there any way I can test the CUDA samples and code on a computer with no NVIDIA graphics card?
I am using Windows and the latest version of CUDA.

There are several possibilities:
Use an older version of CUDA, which has a built-in emulator (2.3 has it for sure; it is enabled with nvcc's -deviceemu flag). The emulator is far from faithful, and you won't have features from the latest CUDA releases.
Use OpenCL; it can run on CPUs, though not with the NVIDIA SDK: you will have to install either the AMD or the Intel OpenCL implementation (AMD's works fine on Intel CPUs, by the way). In my experience, OpenCL is usually slightly slower than CUDA.
There is a Windows branch of the Ocelot emulator: http://code.google.com/p/gpuocelot/. I haven't tried it, though.
However, I would recommend buying a CUDA-capable card. An 8xxx- or 9xxx-series card is fine and really cheap. Emulation lets you pick up the basics of GPGPU programming, but it is of little use once you write a real-world application, since it does not let you debug on the device or tune performance.
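Once you do have a card (or to confirm that a machine has none), a minimal device query against the standard CUDA runtime API tells you what the toolkit can see. This is a sketch, essentially a stripped-down version of what the SDK's deviceQuery sample does:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void) {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        // On a machine with no NVIDIA GPU (or no driver) this is the
        // branch you will hit; emulation is then the only option.
        std::printf("No CUDA-capable device found: %s\n",
                    cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // The compute capability tells you which CUDA features the card supports.
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

Compile with nvcc; on a GPU-less machine it fails gracefully at runtime rather than at build time.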

Related

Why can't QEMU get even close to Rosetta 2's performance when translating x86 to M1?

Apparently, QEMU is the only piece of open source code that can emulate an x86 operating system on the new Apple silicon (M1, M2, etc.).
Apple built Rosetta 2, which, in theory, does the exact same thing that QEMU would be doing in these scenarios. It translates x86 (Intel) instructions into the instruction set supported by the new Apple silicon processors.
Rosetta 2 does it with remarkable performance, and some x86 applications even run with better performance than on native x86 hardware. QEMU, on the other hand, doesn't get even close when running x86 Linux on Apple silicon.
How can Rosetta have such superior performance? Are there any "secrets" that only Apple knows about their architecture that were never shared with the QEMU project? Any forbidden APIs that QEMU is not allowed to access?
Rosetta and QEMU are both emulators. However, they tackle the problem in vastly different ways.
QEMU
In order to emulate a Linux system, QEMU must also emulate storage devices, console output devices, Ethernet devices, keyboards, and the entire CPU. Within this framework it translates every instruction with just-in-time (JIT) translation, from the Linux kernel down to your /bin/ls command.
There are generally few limitations to QEMU's Intel emulation. You can run almost any Intel operating system and its associated applications.
Rosetta 2
Apple's emulation, on the other hand, happens before the application launches. The entire binary is translated from x86 to Apple Silicon ahead of time and then launched. Once translated, the application is in effect a native arm64 binary making native macOS system calls.
Apple's documentation explains it thus:
If an executable contains only Intel instructions, macOS automatically launches Rosetta and begins the translation process. When translation finishes, the system launches the translated executable in place of the original. However, the translation process takes time, so users might perceive that translated apps launch or run more slowly at times.
Rosetta 2 has a number of significant limitations. For example, you cannot use Intel kernel extensions, virtual machine apps that virtualize x86_64 computer platforms (Parallels, for example), or AVX/AVX2/AVX-512 vector instructions.

Running 32-bit application on CUDA

I read on the CUDA toolkit documentation (11.3.0) that "Deployment and execution of CUDA applications on x86_32 is still supported, but is limited to use with GeForce GPUs."
This looks in conflict with the fact that I was able to run a 32-bit app on my Tesla T4. (I verified that the code was actually running on the GPU and the app was 32-bit).
Have I misinterpreted the documentation? Why am I able to run 32-bit apps on a Tesla GPU?
(I'm running Visual studio 2017 on Windows 10)
It's a question of what is supported: NVIDIA only commits to 32-bit application deployment on GeForce GPUs.
Other configurations, such as your Tesla T4, may happen to work, or they may not; working today is no guarantee that the combination stays working or that problems with it get fixed.

Browsers don't seem to use GPU for WebGL in Arch Linux with Nvidia Driver

I am running Arch Linux (EndeavourOS) with an NVIDIA GeForce GTX 1080. Hardware acceleration seems to work when I run programs such as games from Steam; I get good frame rates for them, but WebGL performance in Firefox, Chrome, and Brave is very slow.
Additionally, when I run nvidia-smi, I see the non-browser processes such as games in the process list, but no browsers, even when they are running WebGL.
So I guess my questions are:
1. Should the browsers appear in the nvidia-smi process list if they are using the NVIDIA GPU?
2. If they aren't in that list, does that mean they are not using the NVIDIA GPU?
3. If so, how do I get the browsers to use the NVIDIA GPU?
For more info, here is a gist of the output of Chrome's chrome://gpu.
I am running nvidia driver version 460.67
I am not using bumble or any GPU switching tools, just straight Nvidia.
I have tried playing around with settings in Chrome and Firefox but to no effect.
AFAIK, hardware WebGL support in Chrome is limited; you will need to investigate further, but I think there is a blocklist or some similar limitation you can override. As for Firefox, they have recently enabled hardware WebGL support, but you need to be running a version above 75, and exclusively on Wayland.
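As a starting point for experimenting, the commands below use real Chrome flags and a real Firefox environment variable, but their exact behavior changes between releases, so treat this as a sketch rather than a guaranteed fix:

```shell
# Ask Chrome to ignore its GPU blocklist, then check the
# "Graphics Feature Status" section of chrome://gpu afterwards.
google-chrome --ignore-gpu-blocklist --enable-gpu-rasterization

# Run Firefox under Wayland so its hardware WebGL path can be used
# (Firefox 75+; on an X11 session this variable has no effect).
MOZ_ENABLE_WAYLAND=1 firefox
```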

Can a CUDA application built and running on a Jetson TX2 run on a Jetson Xavier?

I have a CUDA application that was built with CUDA Toolkit 9.0 and runs fine on a Jetson TX2 board.
I now have a Jetson Xavier board, flashed with JetPack 4, which installs CUDA Toolkit 10.0 (only 10.0 is available).
What do I need to do if I want to run the same application on the Xavier?
NVIDIA's documentation suggests that as long as I specify the correct target hardware when running nvcc, I should be able to run on future hardware thanks to JIT compilation. But does this hold across different versions of the CUDA toolkit (9 vs. 10)?
In theory (and note I don't have access to a Xavier board to test anything), you should be able to run a cross compiled CUDA 9 application (and that might mean both ARM and GPU architecture settings) on a CUDA 10 host.
What you will need to make sure is that you either statically link, or ship alongside your application, all the CUDA runtime API library components you require on the Xavier board. Note that there is still an outside chance those libraries might lack the necessary GPU and ARM features to run correctly on a Xavier system, or that there are more subtle issues such as libc incompatibility. That you will have to test for yourself.
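The JIT path mentioned above hinges on embedding PTX alongside the machine code for your current target. A build line along these lines (a sketch: sm_62 is the TX2's compute capability, sm_72 the Xavier's; adjust flags for your toolkit) keeps a compute_62 PTX copy in the binary that the newer driver on the Xavier can JIT-compile for its GPU:

```shell
# Build SASS for the TX2 (sm_62) plus PTX (compute_62) that a newer
# driver can JIT for later GPUs such as the Xavier's Volta (sm_72).
nvcc -gencode arch=compute_62,code=sm_62 \
     -gencode arch=compute_62,code=compute_62 \
     -o app app.cu
```

Without the second -gencode entry there is no PTX in the fat binary, and the driver on the Xavier has nothing to JIT from.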

CUDA with Optimus just to access GPGPU

I have a Dell XPS L502 with the Nvidia 525M graphics card. I am only interested in using the gpgpu capabilities of the card for now.
I installed Ubuntu 12.04 as a dual boot alongside the Windows 7 that came with the machine and followed several installation procedures for the CUDA driver and developer kit from NVIDIA (with many re-installs of Ubuntu). In every case the display drops to 640x480 resolution. As best I can determine, this has something to do with Optimus technology on Linux. I tried Bumblebee to no avail.
I really don't care about using the NVidia card to drive the display. Is there any way that I can just install the NVidia drivers so that a program can use the CUDA capabilities of the graphics card and I still get the full resolution on the display?
I had a similar issue with my Alienware M11xR2, and posted the solution on the NVIDIA Forums. Unfortunately the forums are down at the moment but essentially the process is as follows:
Install the Nvidia Drivers, but when prompted to modify your X11 Config, select 'No'. This is because the Nvidia card cannot be used as a display device.
Install the CUDA SDK and run one of the samples as root. I found this to be a necessary step. After this you should be able to execute further CUDA programs as a normal user.
Hope that helps.
With the new release of CUDA 5 comes an installation guide; there you have just one file that installs the driver, the toolkit, and the SDK (even NVIDIA Nsight). One thing that caught my attention is that the installer also has Optimus options.
I also have an Alienware M14x, and I understand your problem, but I also wanted the display drivers to work for me, so I didn't try too hard on that.
Maybe you could give that a try and comment back with the rest of us.
Here you can look for the CUDA 5 release candidate: CUDA 5
and here is the installation guide (maybe give this a read first): CUDA 5 Starting Guide for Linux.