check failed: status == cublas_status_success (1 vs. 0) cublas_status_not_initialized - caffe

When i use CAFFE to run the program something happened.
Can anyone help me solve this problem.
It seems there is something wrong with the CUDA version.
Ubuntu 16.04
CUDA 8.0.61
Can anyone tell me how to set the proper version?
Thanks a lot and really appreciate it.

Looks like there's some problem with CUDA.
First Check whether CUDA is installed properly or it's installation is broken. Run some CUDA samples.
cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
Your result should be
Result = PASS
If it's installed properly then check whether /usr/local/cuda symlinked to /usr/local/cuda-8.0
Check whether there's meaningful result for ls /usr/local/cuda/lib64 command, it should list down CUDA libraries.
Check whether nvidia-smi output is fine.

Related

CUDA 8 examples on Ubuntu 16 not finding libGLU, libGL, or libX11

I am using CUDA 8 and I am able to run some of the examples but I can not get any of the visualizations to run. I have gotten them to work in the past, but now I am not able to reproduce the results on the same computer with a fresh install. Mint or Ubuntu.
after a successful install of CUDA I try to make the particles or nbody samples but I get this error:
>>> WARNING - libGL.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - libGLU.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
>>> WARNING - libX11.so not found, refer to CUDA Getting Started Guide for how to find and install them. <<<
I looked through the Getting Started guide but have not found a solution.
I am systematically working through the symbolic links. perhaps someone here can offer a suggestion...
The result of a find request...
$ sudo find / -name 'libGLU*'
/usr/lib/i386-linux-gnu/libGLU.so.1.3.1
/usr/lib/i386-linux-gnu/libGLU.so.1
/usr/lib/x86_64-linux-gnu/libGLU.a
/usr/lib/x86_64-linux-gnu/libGLU.so.1.3.1
/usr/lib/x86_64-linux-gnu/libGLU.so.1
/usr/lib/x86_64-linux-gnu/libGLU.so
I have been trying to create symbolic links to the i386* and x86* libraries but havnt gotten it to work yet.
I am, for example, trying
sudo ln -s /usr/lib/i386-linux-gnu/libGLU.so /usr/lib/libGLU.so
My question now is, which libGLU.so do I need to point "/usr/lib/libGLU.so" to?
.a ?
.1?
.1.3.1?
x86 or i386? I know my system is 64bit but is CUDA expecting a 32bit library?
Doesn't seem like it should or would but... ?
I have tried the solutions on every SO and other board I can find... the two most relevant are
Cuda 6.5 cannot find - libGLU. (On ubuntu 14.04 64 bit)
and
http://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
which is where this question has existed previously.
It appears that the answer is in the link provided by Robert Crovella.
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libglfw3-dev libgles2-mesa-dev
then
GLPATH=/usr/lib make
instead of just make
source of solution
Thank you Robert.

Can't make cuda sample: Makefile:36: findcudalib.mk: No such file or directory

My problem is that I cannot compile a CUDA example. I believe I've got CUDA 4.0 installed correctly ( I need the old version b/c I'm trying to run GPGPU-Sim). I downloaded an NVIDIA cuda sample, namely conjugateGradient. If I cd to it and run
make
it doesn't work:
macair93278:7_CUDALibraries r8t$ cd conjugateGradient/
macair93278:conjugateGradient r8t$ ls
Makefile main.cpp
macair93278:conjugateGradient r8t$ make
Makefile:36: findcudalib.mk: No such file or directory
make: *** No rule to make target `findcudalib.mk'. Stop.
I've changed my path so that running
nvcc -V
doesn't produce an error, but gives me the version. So I think that's right.
Thanks for any help.
-bb
findcudalib.mk is missing because the individual sample you downloaded is not designed to be a complete, standalone sample. It requires a framework of other files and probably other libraries that need to be built around it.
To fix this, download the CUDA 4.0 SDK (GPU Computing SDK) from here.
Install that package. Once you have installed it, and assuming your CUDA install is otherwise intact, you should be able to change into the toplevel directory and issue make. This will build all the samples. For convenience, you may wish to issue make -k.

Nsight debugger doesn't go to device functions [duplicate]

I'we been writing some simple cuda program (I'm student so I need to practice), and the thing is I can compile it with nvcc from terminal (using Kubuntu 12.04LTS) and then execute it with optirun ./a.out (hardver is geforce gt 525m on dell inspiron) and everything works fine. The major problem is that I can't do anything from Nsight. When I try to start debug version of code the message is "Launch failed! Binaries not found!". I think it's about running command with optirun but I'm not sure. Any similar experiences? Thanks, for helping in advance folks. :)
As this was the first post I found when searching for "nsight optirun" I just wanted to wanted to write down the steps I took to make it working for me.
Go to Run -> Debug Configurations -> Debugger
Find the textbox for CUDA GDB executable (in my case it was set to "${cuda_bin}/cuda-gdb")
Prepend "optirun --no-xorg", in my case it was then "optirun --no-xorg ${cuda_bin}/cuda-gdb"
The "--no-xorg" option might not be required or even counterproductive if you have an OpenGL window as it prevents any of that to appear. For my scientific code however it is required as it prevents me from running into kernel timeouts.
Happy bug hunting.
We tested Nsight on Optimus systems without optirun - see "Install the cuda toolkit" in CUDA Toolkit Getting Started on using CUDA toolkit on the Optimus system. We have not tried optirun with Nsight EE.
If you still need to use optirun for debugging, you can try making a shell script that uses optirun to start cuda-gdb and set that shell script as cuda-gdb executable in the debug configuration properties.
The simplest thing to do is to run eclipse with optirun, that will also run your app properly.

How to start debug version of project in nsight with optirun command?

I'we been writing some simple cuda program (I'm student so I need to practice), and the thing is I can compile it with nvcc from terminal (using Kubuntu 12.04LTS) and then execute it with optirun ./a.out (hardver is geforce gt 525m on dell inspiron) and everything works fine. The major problem is that I can't do anything from Nsight. When I try to start debug version of code the message is "Launch failed! Binaries not found!". I think it's about running command with optirun but I'm not sure. Any similar experiences? Thanks, for helping in advance folks. :)
As this was the first post I found when searching for "nsight optirun" I just wanted to wanted to write down the steps I took to make it working for me.
Go to Run -> Debug Configurations -> Debugger
Find the textbox for CUDA GDB executable (in my case it was set to "${cuda_bin}/cuda-gdb")
Prepend "optirun --no-xorg", in my case it was then "optirun --no-xorg ${cuda_bin}/cuda-gdb"
The "--no-xorg" option might not be required or even counterproductive if you have an OpenGL window as it prevents any of that to appear. For my scientific code however it is required as it prevents me from running into kernel timeouts.
Happy bug hunting.
We tested Nsight on Optimus systems without optirun - see "Install the cuda toolkit" in CUDA Toolkit Getting Started on using CUDA toolkit on the Optimus system. We have not tried optirun with Nsight EE.
If you still need to use optirun for debugging, you can try making a shell script that uses optirun to start cuda-gdb and set that shell script as cuda-gdb executable in the debug configuration properties.
The simplest thing to do is to run eclipse with optirun, that will also run your app properly.

cuda-gdb exits with "[1] stopped" when it hits a kernel call

I'm pretty new to CUDA and flying a bit by the seat of my pants here...
I'm trying to debug my CUDA program on a remote machine I don't have admin rights on. I compile my program with nvcc -g -G and then try to debug it with cuda-gdb. However, as soon as gdb hits a call to a kernel (doesn't even have to enter it, and it doesn't happen in host code), I get:
(cuda-gdb) run
Starting program: /path/to/my/binary/cuda_clustered_tree
[Thread debugging using libthread_db enabled]
[1]+ Stopped cuda-gdb cuda_clustered_tree
cuda-gdb then dumps me back to my terminal. If I try to run cuda-gdb again, I get
An instance of cuda-gdb (pid 4065) is already using device 0. If you believe
you are seeing this message in error, try deleting /tmp/cuda-dbg/cuda-gdb.lock.
The only way to recover is to kill -9 cuda-gdb and cuda_clustered_ (I assume the latter is part of my binary).
This machine has two GPUs, is running CUDA 4.1 (I believe -- there were a lot installed, but that's the one I set the PATH and LD_LIBRARY_PATH to) and compile + runs deviceQuery and bandwidthTest fine.
I can provide more info if need be. I've searched everywhere I could find online and found no help with this.
Figured it out! Turns out, cuda-gdb hates csh.
If you are running csh, it will cause cuda-gdb to exhibit the above anomalous behavior. Even running bash from within csh, then running cuda-gdb, I still saw the behavior. You need to start your shell as bash, and only bash.
On the machine, the default shell was csh, but I use bash. I wasn't allowed to change it directly, so I added 'exec /bin/bash --login' to my .login script.
So even though I was running bash, because it was started by csh, cuda-gdb would exhibit the above anomalous behavior. Getting rid of 'exec' command, so I was running csh directly with nothing on top, still showed the behavior.
In the end, I had to get IT to change my shell to bash directly (after much patient troubleshooting by them.) Now it works as intended.