Can I run multiple cuda-gdb instances on the same GPU?

I want to compare the runtime of two of my codes using cuda-gdb, but I get this warning and can't debug both of them simultaneously.
As you can see, I'm debugging the same code (from this source), and unfortunately I can't debug both instances at the same time.
Is there any way to do this properly?

Related

Running CUDA GUI samples from a passive (inactive) GPU

I managed to successfully run CUDA programs on a GeForce GTX 750 Ti while using an AMD Radeon HD 7900 as the rendering device (the one actually connected to the display), following this guide; for instance, the Vector Addition sample runs nicely. However, I can only run applications that do not produce visual output. For example, the Mandelbrot CUDA sample does not run and fails with this error:
Error: failed to get minimal extensions for demo:
Missing support for: GL_ARB_pixel_buffer_object
This sample requires:
OpenGL version 1.5
GL_ARB_vertex_buffer_object
GL_ARB_pixel_buffer_object
The error originates from asking glewIsSupported() for these extensions. Is there any way to run an application like these CUDA samples so that the CUDA operations run on the GTX as usual, but the window is drawn on the Radeon card? I tried to convince Nsight Eclipse to run a remote debugging session with my own PC as the remote host, but something else failed right away. Is this supposed to actually work? Could it be possible to use VirtualGL?
Some of the NVIDIA CUDA samples that involve graphics, such as the Mandelbrot sample, implement an efficient rendering strategy: they bind OpenGL data structures (Pixel Buffer Objects, in the case of Mandelbrot) to the CUDA arrays containing the simulation data and render them directly from the GPU. This avoids copying the data from the device to the host at the end of each iteration of the simulation and results in a lightning-fast rendering phase.
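As an illustration of that pattern, here is a hedged sketch (not the samples' actual code; fillPixels and renderFrame are invented names, and it assumes a working GL context plus a PBO already created with glGenBuffers/glBufferData):

    #include <GL/gl.h>
    #include <cuda_runtime.h>
    #include <cuda_gl_interop.h>

    __global__ void fillPixels(uchar4 *pixels, int width, int height) {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x < width && y < height)
            pixels[y * width + x] = make_uchar4(x % 256, y % 256, 128, 255);
    }

    void renderFrame(GLuint pbo, int width, int height) {
        // Register the GL pixel buffer object with CUDA (in real code, once at startup).
        cudaGraphicsResource_t res;
        cudaGraphicsGLRegisterBuffer(&res, pbo, cudaGraphicsMapFlagsWriteDiscard);

        // Map it and let the kernel write pixels directly into GL-owned memory.
        uchar4 *devPixels = nullptr;
        size_t bytes = 0;
        cudaGraphicsMapResources(1, &res);
        cudaGraphicsResourceGetMappedPointer((void **)&devPixels, &bytes, res);

        dim3 block(16, 16);
        dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
        fillPixels<<<grid, block>>>(devPixels, width, height);

        // Unmap so OpenGL can draw from the PBO; no device-to-host copy happened.
        cudaGraphicsUnmapResources(1, &res);
        cudaGraphicsUnregisterResource(res);
    }

Because the kernel writes straight into GL-owned memory, the subsequent draw call can consume the buffer without the data ever leaving the device.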
To answer your question: the NVIDIA samples, as they are, need to run the rendering phase on the same GPU that executes the simulation phase; otherwise, the GPU that handles the graphics would not have the data to be rendered in its memory.
This does not mean the samples cannot be modified to work with multiple GPUs. It should be possible to copy the simulation data back to the host at the end of each iteration and then render it with a custom method, or even send it over the network. This would require (1) modifying the code to separate the simulation and rendering phases and make them independent, and (2) accepting the large drop in frames per second that would result.
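A rough sketch of that readback approach (all names are hypothetical; the commented call stands in for whatever path drives the display GPU):

    #include <cuda_runtime.h>
    #include <vector>

    // Hypothetical simulation step running on the compute GPU.
    __global__ void stepSimulation(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1.0f;   // placeholder update
    }

    void runLoop(float *devData, int n, int iterations) {
        std::vector<float> hostData(n);
        for (int iter = 0; iter < iterations; ++iter) {
            stepSimulation<<<(n + 255) / 256, 256>>>(devData, n);
            // This extra device-to-host copy each iteration is what costs the frame rate.
            cudaMemcpy(hostData.data(), devData, n * sizeof(float), cudaMemcpyDeviceToHost);
            // presentOnDisplayGpu(hostData);  // placeholder: render/send via the other GPU
        }
    }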

Computation between two different kernels in Cuda [duplicate]

I'm writing a CUDA program but I'm getting the obnoxious warning:
Warning: Cannot tell what pointer points to, assuming global memory space
This is coming from nvcc, and I can't disable it.
Is there any way to filter out warnings from third-party tools (like nvcc)?
I'm asking for a way to filter errors/warnings coming from custom build tools out of the Output window log.
I had the same annoying warnings; I found help in this thread: link.
You can either remove the -G flag from the nvcc command line,
or
change compute_10,sm_10 to compute_20,sm_20 in the CUDA C/C++ options of your project if you're using Visual Studio.
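The command-line equivalents would look roughly like this (file names are illustrative, and exact flag spellings vary by toolkit version):

    # -G (device debug info) together with an sm_1x target produces the pointer-space warning:
    nvcc -G -gencode arch=compute_10,code=sm_10 -o app kernel.cu
    # either drop -G, or build for compute capability 2.0 instead:
    nvcc -gencode arch=compute_20,code=sm_20 -o app kernel.cu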

Runtime error: Cannot set while device is active in this process

I'm trying to implement a separable convolution filter using CUDA as part of a bigger application I'm working on. My code has multiple CUDA kernels which are called one after the other (each performing one stage of the process). The problem is that I keep getting this weird error, and I'm not sure what exactly it means or what is causing it. I also can't find anything about it on the Internet, except for a couple of Stack Overflow questions related to OpenGL and CUDA interoperability (which I'm not doing; i.e., I'm not using OpenGL at all).
Can someone please explain why such an error may occur?
Thanks.

Compute Visual Profiler doesn't Fill the .csv files

I'm trying to benchmark my CUDA application with the Compute Visual Profiler. However, the profiler is unable to fill in any data in the .csv files. All the paths to CUDA are set properly in the profiler application.
After a few runs of the exe file, it returns this error:
Error in Profiler data file
'C:/..../temp_compute_profiler_0_0.csv'
at line number 1. No column found.
There are many possible reasons; here are some to check:
- The execution timeout: make sure the profiler is not set to time out too soon.
- The program not finishing (even if the kernel does): make sure there isn't a getchar() at the end of your code.
- Missing cleanup: try adding an explicit call to cudaThreadExit() at the end of your code, and check it for errors (see the sketch below).
One of the most common reasons for this kind of error is that your program never manages to launch a CUDA kernel, or that the kernel fails during execution.
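A minimal harness combining those checks (the kernel name and launch configuration are placeholders): verify that the launch actually happened, then tear the context down explicitly so the profiler can write its counters:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void myKernel() { }

    int main() {
        myKernel<<<1, 1>>>();
        cudaError_t launchErr = cudaGetLastError();      // did the launch itself fail?
        cudaError_t execErr = cudaDeviceSynchronize();   // did the kernel fail while running?
        if (launchErr != cudaSuccess || execErr != cudaSuccess) {
            fprintf(stderr, "launch: %s, exec: %s\n",
                    cudaGetErrorString(launchErr), cudaGetErrorString(execErr));
            return 1;
        }
        // Explicit teardown flushes the profiler output before the process exits.
        cudaError_t exitErr = cudaThreadExit();          // cudaDeviceReset() in newer toolkits
        if (exitErr != cudaSuccess) {
            fprintf(stderr, "cudaThreadExit: %s\n", cudaGetErrorString(exitErr));
            return 1;
        }
        return 0;
    }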

Why do I not get line numbers in the exception stack trace if I have PDB files and symbols loaded?

I have a solution with several projects. The solution is built in Release mode with optimization disabled and with PDB files generated. While running unit tests I get exceptions, but the stack trace does not contain the line numbers. In the Modules window I can see that the current assembly is not optimized and has symbols loaded.
Just to note: I'm checking the stack trace inside the debugger, at a breakpoint where the exception is caught.
Thanks in advance, mates.