MXNet is supposed to build and run, on CPU as well as on GPU, on multiple operating systems, including Windows.
I'm trying to build MXNet from source on a Windows Server 2016 machine that has an NVIDIA K80 GPU.
I followed all the instructions in https://mxnet.incubator.apache.org/get_started/windows_setup.html but I am not able to get past the point of building MXNet in Visual Studio 2013.
The error I'm seeing is
'mshadow::cuda::AddTakeGrad' : ambiguous call to overloaded function indexing_op.h
If I fix this generic call to AddTakeGrad to make it a specific call to mshadow::cuda::, then some other polymorphic function ends up with the same error and so on ...
I searched a lot to find out whether anyone has successfully built MXNet for Windows (in both CPU mode and GPU mode) but couldn't find anyone.
Question: Has anyone been able to successfully build mxnet on Windows? If so, could you help with this error as well as any specific instructions to get it to build for both cpu mode as well as gpu mode?
These days it should be possible to skip the source build and just install a prebuilt package with pip (pip install mxnet).
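A rough sketch of how a pip-based install could be checked afterwards. It assumes the CPU package name `mxnet` (GPU builds have been published under CUDA-specific names such as `mxnet-cu101`, so the right name depends on your CUDA version) and an MXNet release recent enough to provide `mx.context.num_gpus()`. The helper name `describe_mxnet_install` is made up for illustration.

```python
import importlib.util


def describe_mxnet_install():
    """Return a short status string about the local MXNet install."""
    # find_spec returns None when the package is missing, so nothing
    # is imported unless MXNet is actually present.
    if importlib.util.find_spec("mxnet") is None:
        return "mxnet not installed (try: pip install mxnet)"

    import mxnet as mx

    # Assumption: mx.context.num_gpus() exists in this release.
    # Any failure (older release, no CUDA support) is treated as 0 GPUs.
    try:
        gpus = mx.context.num_gpus()
    except Exception:
        gpus = 0
    return "mxnet {} installed, {} GPU(s) visible".format(mx.__version__, gpus)


if __name__ == "__main__":
    print(describe_mxnet_install())
```

If this reports zero GPUs on a machine with a K80, the CPU-only package was probably installed rather than a CUDA build.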
I am a Java developer. To speed up some of our algorithms, we have decided to try CUDA.
The issue is that we currently have only one server with a GPU installed, and three developers have to work on it (transferring the files over ssh each time, then compiling and running them there). This is obviously a tedious process.
What I would like to know is: on my machine, which does not have a GPU, can I use Nsight to work on CUDA code, compiling and generating the files locally? These could then be transferred automatically to the server to get the result.
If we could at least work on the algorithm locally in Nsight (or any other IDE) rather than plain vim, and compile it there to weed out compile-time errors, this would save quite some time.
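For what it's worth, nvcc does not need a GPU to compile, so compile errors can be caught on a machine that only has the CUDA toolkit installed. A minimal sketch of such a compile-locally, run-remotely cycle follows. It assumes `nvcc`, `scp`, and `ssh` are on the PATH, that the remote working directory already exists, and that the server is reachable as `dev@gpu-server`. The host name, directory, and file names are all hypothetical.

```python
import shlex
import subprocess


def build_commands(source, remote, remote_dir="cuda_work"):
    """Build the command lists for a compile-locally, run-remotely cycle."""
    binary = source.rsplit(".", 1)[0]
    remote_script = "cd {d} && nvcc -o {b} {s} && ./{b}".format(
        d=shlex.quote(remote_dir), b=shlex.quote(binary), s=shlex.quote(source)
    )
    return [
        # 1. Compile only (-c) on the local machine to catch compile errors.
        ["nvcc", "-c", source, "-o", binary + ".o"],
        # 2. Copy the source file to the GPU server.
        ["scp", source, "{}:{}/".format(remote, remote_dir)],
        # 3. Build and run on the server, where a GPU is available.
        ["ssh", remote, remote_script],
    ]


def run_cycle(source, remote):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(source, remote):
        subprocess.run(cmd, check=True)
```

Calling `run_cycle("dot_product.cu", "dev@gpu-server")` would stop at step 1 if the file does not compile, so nothing is sent to the server until the local build is clean.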
On Linux you can do remote debugging using Nsight Eclipse Edition as documented here. This requires CUDA 5.5 or later. On Windows you need to start the Nsight Monitor on the server and then configure Nsight Visual Studio Edition to use the remote machine.
Does anyone know if it's possible to debug CUDA code using Parallel Nsight on a remote machine? I am able to step into CUDA code but not into my host code. It says CUDA has the capability to generate host debug information, so debugging remotely and locally should be possible.
My card is a 580 GTX.
//device code <-- able to debug device code
//host code <---- when device code returns, should be able to debug host code
Thanks!
Simultaneous GPU/CPU debugging from a single IDE instance is unfortunately not possible with the current releases of Nsight and Visual Studio.
As a workaround, you can start GPU debugging from one copy of Visual Studio, then open a second IDE instance and attach its CPU debugger. They won't have unified stepping, but you can at least set breakpoints independently.
It should now be possible to attach both the Visual Studio default debugger and NSight in the same VS instance. Then this should work.
So I got Visual Studio 2010 installed with Nsight for VS. Projects are compiling and working just fine (although IntelliSense still isn't recognized, even though I spent many, many hours on it and went through all the tutorials).
Now I would like to use Compute Visual Profiler (Nsight), for which I need an executable. But when I look in the Debug or Release folder after running the project, I can't find the exe file there. How do I create it?
Anyway, I am able to run Compute Visual Profiler on this project by pointing it at the Debug directory, but then I get this error:
Program run #18 completed.
Error :
Application : "C:(...)dot_product.exe.intermediate.manifest".
Profiler data file 'C:/(...)temp_compute_profiler_0_0.csv' for application run 0 not found.
So my question is: how can I profile my project?
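The `Application` path in the log ends in `.exe.intermediate.manifest`, which suggests the profiler was pointed at a build intermediate rather than at the executable itself. As far as I know, Visual C++ projects by default write the final `.exe` to a configuration folder next to the solution file and keep intermediates in the project's own `Debug`/`Release` folder, so it may be worth searching the whole solution folder. A small sketch, with a hypothetical helper name and the solution path left for you to fill in:

```python
import os


def find_executables(root):
    """Return the paths of all .exe files under root, sorted for stable output."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Matches dot_product.exe but not
            # dot_product.exe.intermediate.manifest.
            if name.lower().endswith(".exe"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Whatever path this turns up is the one to give the profiler as the application, with its containing folder as the working directory.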
NVIDIA has released an extended Eclipse for CUDA 5. They also have an Nsight plugin for VS2010. In VS2010 we can stop program execution at a breakpoint in a kernel, but how do I achieve the same in Eclipse on Linux? I don't see any Nsight-specific keys to stop execution. I tried changing the perspective, but it debugs as a normal C/C++ application. I'm using a Tesla C2070 in an 8-core Intel Xeon machine running Linux.
I'm from the Nsight Eclipse Edition team.
Our goal is specifically for the application to be debugged as a normal C/C++ application. This means that you can set breakpoints, use "run to line", etc. regardless of whether you debug host or device code.
Basically, the process is quite standard for Eclipse:
Create a project (you can also import an existing executable).
Click the debug button.
The debugger will run and by default will break in the main function. Note that no device code has been launched on the device at this point, so you will only see the host thread.
Set a breakpoint in the device code and hit resume (note that the Breakpoints view toolbar also lets you set a breakpoint on any CUDA kernel launch).
The debugger will break when the device code reaches the breakpoint. You can inspect your application state using the visual debugger UI.
A couple of things changed, and I'm not sure which one solved the issue. I updated the drivers to the latest ones with RC5.0, and I chose to run a VNC server instead of the native X server. The CUDA card(s) are now dedicated to my apps and debugging, it works like a charm, and the machine is accessible from everywhere.
Eugene,
I just installed CUDA 5, and I wasn't able to break in any kernel code. It was a clean install of CentOS 5.5 with a fresh download of CUDA 5, and I am running on an Asus g71x laptop which has a GTX 260M installed.
I thought maybe you still can't run the display and debug on one device, so I switched to a non-NVIDIA X display, but I still had the same issue: I can't stop in the kernel code.
Have you tried CUDA 5.0 RC1? It is available now, so you can download and try it. I have tried the Nsight included in it, and it works well for debugging.
Best regards!
The 304.43 NVIDIA Driver does not let users other than root debug their CUDA application.
That problem is not present in any past or future public releases. The CUDA documentation recommends using only drivers listed in the CUDA DevZone. The 304.43 driver is not one of them.
That may or may not be the issue you are hitting. But I thought it was worth mentioning.
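To check which driver a Linux machine is actually running, one option is to read `/proc/driver/nvidia/version`, which the NVIDIA kernel module exposes. The sketch below assumes the usual layout of that file, with the version number appearing on the line that starts with `NVRM version`. The helper names are illustrative.

```python
import re

KNOWN_BAD = "304.43"  # the driver release named above


def parse_driver_version(text):
    """Extract the driver version from /proc/driver/nvidia/version contents."""
    for line in text.splitlines():
        if line.startswith("NVRM version"):
            # First dotted number on the line, e.g. 304.43 or 535.54.03.
            match = re.search(r"\b(\d+\.\d+(?:\.\d+)?)\b", line)
            if match:
                return match.group(1)
    return None


def read_driver_version(path="/proc/driver/nvidia/version"):
    """Return the installed driver version, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return parse_driver_version(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    version = read_driver_version()
    if version == KNOWN_BAD:
        print("Driver 304.43 detected: only root can debug CUDA applications.")
    else:
        print("Driver version:", version)
```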