CUDA driver initialization failed

I have a two-GPU system with a GeForce 8400 GS and a GeForce GT 520. I can run my CUDA programs on both GPUs. But when I debug them with cuda-gdb, I get an error saying that the CUDA driver initialization failed. Also, when I run the program under cuda-gdb, cudaGetDeviceCount reports only 1 GPU. Without cuda-gdb, I can run the programs on either GPU. Can somebody help me with this?
I am running Ubuntu 11.04.

It looks like you have a display driver version older than the one required by the CUDA Toolkit. Make sure you install the display driver from the same download page as your toolkit.
cuda-gdb hides the GPUs that drive your desktop environment from the application being debugged. Otherwise the desktop environment could hang while the application is suspended at a breakpoint. To see both GPUs in cuda-gdb, you need to run without a desktop environment.
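To see what the driver reports outside the debugger, a small script can call the CUDA driver API directly. The following is a minimal sketch, assuming a Linux system where the driver library is available as `libcuda.so.1`; the function name `list_cuda_devices` is hypothetical. It also reads the kernel run time limit attribute, which is usually set on a GPU that drives a display, so treat that flag as a heuristic rather than proof.

```python
import ctypes

# Value of CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT in the driver API header.
KERNEL_EXEC_TIMEOUT = 17


def list_cuda_devices(library="libcuda.so.1"):
    """Return [(name, has_run_time_limit), ...], or None if the driver is unusable.

    The library name is an assumption for Linux; adjust it for other systems.
    """
    try:
        cuda = ctypes.CDLL(library)
    except OSError:
        return None  # driver library not installed or not on the loader path

    if cuda.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None  # this is the "driver initialization failed" case

    count = ctypes.c_int(0)
    if cuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None

    devices = []
    for ordinal in range(count.value):
        dev = ctypes.c_int(0)
        if cuda.cuDeviceGet(ctypes.byref(dev), ordinal) != 0:
            continue
        name = ctypes.create_string_buffer(256)
        limit = ctypes.c_int(0)
        cuda.cuDeviceGetName(name, len(name), dev)
        cuda.cuDeviceGetAttribute(ctypes.byref(limit), KERNEL_EXEC_TIMEOUT, dev)
        devices.append((name.value.decode(errors="replace"), bool(limit.value)))
    return devices


if __name__ == "__main__":
    print(list_cuda_devices())
```

A result of `None` points at the driver installation itself; a list with a run time limit on one device suggests which GPU the desktop is using.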

Related

Loading a PTX programmatically returns error 209 when run against device with CUDA capability 5.0

I am trying to use the ptxjit sample from the CUDA SDK as the basis for instrumenting the interaction with the GPU device.
I've managed to compile the instrumentation code and to load and execute a PTX module on a GeForce GT440, which has CUDA capability 2.0.
On a laptop (using Bumblebee to control the discrete GPU) with a GeForce 830M, which has CUDA capability 5.0, the same instrumentation code compiles, but loading the PTX returns error 209 (CUDA_ERROR_NO_BINARY_FOR_GPU).
I've tried compiling the kernel for CUDA capability 5.0, but I still get the same error.
Any ideas?
In the end the problem was with the driver. It seems to affect only the functions used for loading PTX code on GPUs that have CUDA capability 5.0.
I removed all the NVIDIA driver packages that had been updated recently and installed the driver and OpenGL libraries that come with the CUDA SDK. The driver version for SDK 7.5 is 352.39; with this driver, both the original ptxjit sample and the modified one ran perfectly, as on the other systems.
I don't have a GPU with CUDA capability 3.0 to test whether the same problem would appear there. I also haven't updated my desktop to the 367.44 driver to see whether it would break the ptxjit sample.
For now, the solution is to keep the driver that comes with the CUDA SDK and turn off updates from the nvidia repository.
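Before blaming the driver, it can help to rule out a genuine target mismatch. PTX built for a given `.target` can normally be JIT-compiled for devices of the same or a higher compute capability, so a module targeting `sm_20` should load on a 5.0 device. The sketch below reads the `.target` directive from PTX text and compares it with a device capability; the helper names `ptx_target` and `can_jit_for` and the sample header are illustrative assumptions, not part of the ptxjit sample.

```python
import re


def ptx_target(ptx_text):
    """Return (major, minor) from the PTX '.target sm_XY' directive, or None."""
    match = re.search(r"^\s*\.target\s+sm_(\d+)", ptx_text, re.MULTILINE)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version; the rest is the major version.
    return int(digits[:-1]), int(digits[-1])


def can_jit_for(ptx_text, device_capability):
    """True if the PTX target is not newer than the device's (major, minor)."""
    target = ptx_target(ptx_text)
    return target is not None and device_capability >= target


# Illustrative PTX header, not taken from the original post.
SAMPLE_PTX = """\
.version 4.3
.target sm_20
.address_size 64
"""

if __name__ == "__main__":
    print(ptx_target(SAMPLE_PTX), can_jit_for(SAMPLE_PTX, (5, 0)))
```

If this check passes and the load still fails with error 209, as it did here, the driver installation is the more likely culprit.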

Running CUDA programs on Quadro K620m

I have a laptop with a Quadro K620m GPU. I am trying to learn CUDA programming and downloaded the network installer from the NVIDIA site.
During CUDA SDK installation, just when it's checking the hardware of the machine, it displays
Do you want to Continue?
This graphics driver could not find compatible graphics hardware. You may continue installation, but you will not be able to run CUDA applications.
Any thoughts why this could be happening? In my computer's device manager, I can see NVIDIA Quadro K620m in the display adapter listing.
Thank you.
This is normal when the driver packaged in the CUDA installer predates your GPU and so does not recognize it.
You should retain your current GPU driver, and go ahead with the CUDA toolkit installation, but de-select the option to install the GPU driver.
Your existing driver should work fine.

CUDA samples cause machine to crash

I was planning on starting to use CUDA on a machine with Kubuntu 12.04 LTS and a Quadro card. I installed CUDA 5.5 using the .deb from here, and the installation seems to have gone fine. Then I built the CUDA samples, again everything went fine.
When I run the samples in sequence, however, some of them botch my display, and others simply crash my computer.
What causes the crash? How can I fix it?
I'll mention that my NVIDIA card is the only display adapter the machine has, but that shouldn't make CUDA crash and burn.
The problem was due to the X server using the FOSS nouveau drivers. These are known to conflict with NVIDIA's way of accessing the card. After I restarted X (actually, I restarted the machine), the samples ran and worked properly.
Not all the samples are runnable if you have just installed CUDA on a clean Ubuntu system. Some of them require additional libraries, and some of them require particular compute capability (CC) versions.
For more information, see the CUDA samples documentation for the samples that crashed.
http://docs.nvidia.com/cuda/cuda-samples/index.html
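For the nouveau conflict described above, one quick check is whether the `nouveau` kernel module is currently loaded. On Linux the loaded modules are listed in `/proc/modules`, one per line with the module name first. This is a minimal sketch; the function names and the sample text are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def nouveau_is_loaded(proc_modules_text):
    """True if the open-source nouveau driver appears among the loaded modules."""
    return "nouveau" in loaded_modules(proc_modules_text)


# Made-up sample in the /proc/modules layout, for illustration only.
SAMPLE_MODULES = (
    "nvidia 10000000 30 - Live 0xffffffffa0000000 (P)\n"
    "nouveau 900000 1 - Live 0xffffffffa1000000\n"
)
```

On a real system, pass in the contents of `/proc/modules`, for example `nouveau_is_loaded(open("/proc/modules").read())`.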

CUDA driver 4.2 version mismatch? 295.40 vs 295.41

I'm trying to install CUDA 4.2 on my Alienware Aurora desktop system. It's running Ubuntu 12.04, and Linux kernel 3.2.0-32 with an Nvidia GTX 690. I am able to install the CUDA SDK and display driver without issue. However, when Xorg starts, it dies with this error:
Error: API mismatch: the NVIDIA kernel module has version 295.40, but the NVIDIA driver component has version 295.41. Please make sure that the kernel module and all NVIDIA driver components have the same version.
The same thing happens when trying to run a CUDA application. Any thoughts? I have a lab of over a dozen other CUDA workstations which don't have this problem, but those are running Ubuntu 10.10.
In short: Ubuntu 12 is not yet a supported distro.
If you still want to run CUDA on an unsupported platform and expose yourself to other such problems, see the answer at https://stackoverflow.com/a/13062766/56875
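When collecting logs from several workstations, it can be handy to pull the two version numbers out of the API mismatch message automatically. The sketch below parses the exact wording quoted in the question; the pattern and the function name `parse_api_mismatch` are assumptions, and the wording may differ in other driver releases.

```python
import re

# Pattern based on the error text quoted in the question above.
MISMATCH = re.compile(
    r"kernel module has version (\d+(?:\.\d+)+), "
    r"but the NVIDIA driver component has version (\d+(?:\.\d+)+)"
)


def parse_api_mismatch(message):
    """Return (kernel_module_version, driver_component_version), or None."""
    match = MISMATCH.search(message)
    return match.groups() if match else None


ERROR_TEXT = (
    "Error: API mismatch: the NVIDIA kernel module has version 295.40, "
    "but the NVIDIA driver component has version 295.41. Please make sure "
    "that the kernel module and all NVIDIA driver components have the same version."
)

if __name__ == "__main__":
    print(parse_api_mismatch(ERROR_TEXT))
```

Two different values mean the loaded kernel module and the user-space driver libraries come from different driver versions.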

CUDA 5.0 cuda-gdb on Linux needs dedicated GPU?

With a fresh CUDA 5.0 Linux install on CentOS 5.5, I am not able to use cuda-gdb. So I am wondering whether you still need a dedicated GPU for cuda-gdb on Linux. I tried it with the VESA device driver for X11, but got the same result. Profiling works and running the app works, but trying to run cuda-gdb gives:
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
Any suggestions?
cuda-gdb still needs a GPU that is not used by the graphical environment (e.g. if you are running Gnome/KDE/etc., you need a system with several GPUs; they do not all have to be NVIDIA GPUs).
That particular warning is unrelated to this problem, so you can ignore it. cuda-gdb will tell you if it fails because no GPU can be used for debugging.