Debugger in CUDA 5 - cuda

Nvidia has released an extended Eclipse for CUDA 5. They also have an Nsight plugin for VS2010. In VS2010 we can stop program execution at a breakpoint in a kernel, but how do I get the same functionality in Eclipse on Linux? I don't see any Nsight-specific keys to stop execution. I tried changing the perspective, but it debugs as a normal C/C++ application. I'm using a Tesla C2070 in an 8-core Intel Xeon machine running Linux.

I'm from Nsight Eclipse Edition team.
Our goal is specifically for the application to be debugged as a normal C/C++ application. This means that you can set breakpoints, use "run to line", etc., regardless of whether you are debugging host or device code.
Basically, the process is quite standard for Eclipse:
Create a project (you can also import an existing executable).
Click the Debug button.
The debugger will start and, by default, break in the main function. Note that no device code has been launched on the device at this point, so you will only see the host thread.
Set a breakpoint in the device code and hit Resume (note that the Breakpoints view toolbar also lets you set a breakpoint on any CUDA kernel launch).
The debugger will break when device code reaches the breakpoint. You can then inspect your application state using the visual debugger UI.
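
To try the steps above, a small test program with one obvious line for a device breakpoint can help. The following is only a sketch: the kernel name vecAdd, the file name and the sizes are made up for illustration and are not from the original post. It assumes the program is built with device debug information, e.g. `nvcc -g -G vecadd.cu -o vecadd` (-g for host symbols, -G for device symbols).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds two vectors element-wise.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];  // candidate line for a device breakpoint
    }
}

int main()
{
    const int n = 256;
    const size_t bytes = n * sizeof(float);
    float ha[n], hb[n], hc[n];
    for (int i = 0; i < n; ++i) {
        ha[i] = (float)i;
        hb[i] = 2.0f * i;
    }

    float *da = NULL, *db = NULL, *dc = NULL;
    cudaMalloc((void **)&da, bytes);
    cudaMalloc((void **)&db, bytes);
    cudaMalloc((void **)&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // The debugger only shows device threads once this launch is running.
    vecAdd<<<(n + 127) / 128, 128>>>(da, db, dc, n);

    // Surface launch or execution errors before reading results.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[10] = %f\n", hc[10]);

    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return 0;
}
```

If the breakpoint inside vecAdd is never hit, it is worth checking first that the build really used -G, since without device debug information there is nothing for the debugger to stop on.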

I changed a couple of things, and I'm not sure which one solved the issue. I updated the drivers to the latest ones that come with the 5.0 RC, and I chose to run a VNC server instead of a native X server. The CUDA card(s) are now dedicated to my apps and debugging, it works like a charm, and the machine is accessible from everywhere.

Eugene,
I just installed CUDA 5, and I wasn't able to break in any kernel code. It was a clean install of CentOS 5.5 with a fresh download of CUDA 5, and I am running on an Asus G71x laptop which has a GTX 260M installed.
I thought maybe you still can't run the display and debug on one device, so I switched to a non-NVIDIA X display, but I still had the same issue: I can't stop in the kernel code.

Have you tried CUDA 5.0 RC1? It is available now, so you can download it and try. I have tried the Nsight that ships with it, and it works well for debugging.
Best regards!

The 304.43 NVIDIA Driver does not let users other than root debug their CUDA application.
That problem is not present in any past or future public releases. The CUDA documentation recommends using only drivers listed in the CUDA DevZone. The 304.43 driver is not one of them.
That may or may not be the issue you are hitting. But I thought it was worth mentioning.

Related

Nsight Debugging using single GPU

I have a single GPU in my Windows 7 system. I would like to debug my GPU code locally on this machine.
I'm confused about this. Do I need to do headless debugging (maybe by making my on-board graphics the display adapter), as explained in Setup Local Headless GPU Debugging?
Or do I not need to do anything like that?
You cannot do local headless debugging with only a single GPU. Headless means there is no monitor or active display attached to the GPU that is running the code under debug. If you are debugging locally, you need this display to see the Nsight GUI and your Windows desktop.
Single GPU local (non-headless) debugging is covered in the Nsight manual.
If you can enable another GPU (it need not be an NVIDIA GPU), then you can use that GPU for your Windows display and do headless debugging on the NVIDIA GPU.

Can't debug CUDA: CUDA dynamic parallelism debugging is not supported in preemption mode

I have CUDA 5.5, latest drivers, Nsight studio 3.1 for VC2010 on Windows7 64bit.
The target machine has a headless Titan card, and another simple NVidia card, to which the monitor is connected.
I'm trying to debug my CUDA code which includes some dynamic parallelism. Whenever I click "Start CUDA Debugging" in VC, I get this error from Nsight Monitor: CUDA dynamic parallelism debugging is not supported in preemption mode. From what little I found regarding this issue, this is because I'm trying to debug CUDA on the same device that drives my screen. This however is not true, as I mentioned, I have a separate card to drive the screen.
I went even further with this, disconnected the monitor from the second card as well, rebooted, and set up remote debugging from a different machine. Same result.
Does anyone have an idea how to tackle this?
Right-click the Nsight Monitor tray icon and look at "Options\CUDA\Debugger". Except for TCC GPUs, all GPUs are forced to use "Software Preemption" by default.
You can set "Desktop GPUS must use Software Preemption" and "Headless GPUs must use software preemption" to false. Also make sure that in Visual Studio the setting "Nsight\Options\CUDA\Preemption Preference" is "Prefer no Software Preemption".
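
To check whether the settings above took effect, a tiny dynamic parallelism test case is handy. What follows is a rough sketch, not code from the question: the kernel names parent and child are made up. It assumes a device with compute capability 3.5 or higher (which dynamic parallelism requires) and a build with relocatable device code, e.g. `nvcc -g -G -arch=sm_35 -rdc=true dp.cu -o dp -lcudadevrt` on a CUDA 5.5-era toolkit (newer toolkits need a newer -arch value). It also prints whether the current device uses the TCC driver, since that is the case the answer singles out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical child kernel, launched from the device.
__global__ void child(int parentBlock)
{
    printf("child thread %d launched by parent block %d\n",
           threadIdx.x, parentBlock);
}

// Hypothetical parent kernel: one thread per block launches a child grid.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
    }
}

int main()
{
    int dev = 0, major = 0, minor = 0, tcc = 0;
    if (cudaGetDevice(&dev) != cudaSuccess ||
        cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, dev) != cudaSuccess ||
        cudaDeviceGetAttribute(&minor, cudaDevAttrComputeCapabilityMinor, dev) != cudaSuccess ||
        cudaDeviceGetAttribute(&tcc, cudaDevAttrTccDriver, dev) != cudaSuccess) {
        fprintf(stderr, "could not query the current CUDA device\n");
        return 1;
    }

    printf("device %d: compute capability %d.%d, TCC driver: %s\n",
           dev, major, minor, tcc ? "yes" : "no");

    // Dynamic parallelism needs compute capability 3.5 or higher.
    if (major < 3 || (major == 3 && minor < 5)) {
        fprintf(stderr, "this device does not support dynamic parallelism\n");
        return 1;
    }

    parent<<<2, 32>>>();

    // The parent grid only completes after its child grids have completed.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If this small case still produces the preemption-mode error, the problem is more likely in the Nsight Monitor or Visual Studio settings than in your own code.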

CUDA on Windows and Linux

I'm trying to set up a CUDA development environment under Windows, and I have read many CUDA-tagged posts, but a few things are still unclear:
Can I debug CUDA applications under Windows without the need for a second video card, using Nsight and VS2010 Express?
Can I debug CUDA applications under Linux without the need for a second video card, AND without shutting down the graphical interface?
Answered thousands of times, but perhaps something has changed, so I ask again just to be sure: Can I develop under Windows without installing a CUDA-enabled video card? Is there some kind of emulator? (Ocelot for Windows is practically nonexistent.)
Thanks.
Can I debug CUDA applications under Windows without the need for a second video card, using Nsight and VS2010 Express?
You can apparently debug with a single video card, but Nsight requires VS2010 Professional (not the Express edition):
https://developer.nvidia.com/nsight-visual-studio-edition-requirements
Can I debug CUDA applications under Linux without the need for a second video card, AND without shutting down the graphical interface?
I don't think so. From the Nsight Eclipse Edition docs (http://docs.nvidia.com/cuda/nsight-eclipse-edition-getting-started-guide/index.html#linux-requirements):
"A GPU that is running X11 (on Linux) or Aqua (on Mac) cannot be used to debug a CUDA application and will be hidden from the application ran in the debugger. Such GPU can still be used for profiling GPU applications."
Answered thousands of times, but perhaps something has changed, so I ask again just to be sure: Can I develop under Windows without installing a CUDA-enabled video card? Is there some kind of emulator? (Ocelot for Windows is practically nonexistent.)
No. If you want to use CUDA, you'd be best off just getting a cheap CUDA-enabled card (e.g. a GTX 650 is ~$100 and uses the most recent (Kepler) architecture).
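
Regarding the X11 restriction quoted above, one rough way to guess which GPU is driving a display is to query the kernel run-time limit (watchdog) attribute for each device. This is a heuristic and an assumption on my part, not something the Nsight docs describe: on Linux the limit is typically enabled on a GPU that X is running on, but on Windows it can be enabled for other reasons. A minimal sketch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        int timeout = 0;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess ||
            cudaDeviceGetAttribute(&timeout, cudaDevAttrKernelExecTimeout, dev) != cudaSuccess) {
            fprintf(stderr, "device %d: query failed\n", dev);
            continue;
        }
        // Heuristic only: a run-time limit usually means a display is attached.
        printf("device %d: %s, kernel run-time limit %s\n",
               dev, prop.name,
               timeout ? "enabled (probably drives a display)"
                       : "disabled (probably usable for debugging)");
    }
    return 0;
}
```

Keep in mind that a GPU hidden from the debugged application will not show up in this list when the program itself is run under the debugger, so run it normally from a shell.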

CUDA Parallel NSight Debugging host and device simultaneously

Does anyone know if it's possible to debug CUDA using Parallel Nsight on a remote machine? I am able to step into CUDA code but not into my host code. The documentation says CUDA can generate host debug information, so debugging both remotely and locally should be possible.
My card is a 580 GTX.
//device code <-- able to debug device code
//host code <---- when device code returns, should be able to debug host code
Thanks!
Simultaneous GPU/CPU debugging from a single IDE instance is unfortunately not possible with the current releases of Nsight and Visual Studio.
As a workaround, you can start GPU debugging from one copy of Visual Studio, then open a second IDE instance and attach its CPU debugger. They won't have unified stepping, but you can at least set breakpoints independently.
It should now be possible to attach both the Visual Studio default debugger and NSight in the same VS instance. Then this should work.

GPGPU CUDA debug server

I have access to a server machine with 3 CUDA-enabled GPUs in it, and I would like to use NVIDIA Parallel Nsight to debug remotely on that machine.
This works just fine.
Now, is it possible to start another debug session (possibly by another developer) on the same machine, but on another GPU?
Is it possible to do this if I use gdb on Linux?
Thanks,
krisy
Krisy, yes this is possible.
However, the scenario you mentioned has not been actively tested internally by the Nsight team yet. I tried this out quickly on a system with a setup similar to yours, and I was able to debug 2 different instances of a CUDA app simultaneously (provided each app runs on its own device that is not connected to any output display).
Stability is not guaranteed. From what I've tried so far, this worked for me, and it should work in theory as well, but there were times when my system became sluggish.
For other developers who are interested to know more about this, please take a look at: http://forums.nvidia.com/index.php?showtopic=201211
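
For the "each app runs on a different device" condition above, one simple approach is to let each debug session choose its GPU explicitly. The sketch below is an illustration only: taking the device index from the first command-line argument is my own convention, and the kernel name touch is made up. Setting the CUDA_VISIBLE_DEVICES environment variable before launching each process is another way to get the same separation without changing code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: writes a marker value so there is something to inspect.
__global__ void touch(int *out)
{
    *out = 42;
}

int main(int argc, char **argv)
{
    // Device index from the command line (assumed convention), default 0.
    int dev = (argc > 1) ? atoi(argv[1]) : 0;

    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || dev < 0 || dev >= count) {
        fprintf(stderr, "invalid device index %d (found %d devices)\n", dev, count);
        return 1;
    }

    // Bind this process to the chosen GPU before any other CUDA work.
    cudaError_t err = cudaSetDevice(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int *d = NULL;
    int h = 0;
    err = cudaMalloc((void **)&d, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    touch<<<1, 1>>>(d);

    // This copy waits for the kernel and reports any execution error.
    err = cudaMemcpy(&h, d, sizeof(int), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("device %d wrote %d\n", dev, h);
    cudaFree(d);
    return 0;
}
```

With this, one developer could debug `./app 1` while another debugs `./app 2`, assuming neither of those GPUs is connected to a display.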