CUDA on Ubuntu 16.04 - cuda

I am installing CUDA from this link.
Although it is the CUDA 9.2 SDK, when I check the installed version using nvcc --version, I get the following output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
I am new to CUDA and wanted to check if this is expected. Should I expect 9.2 as the CUDA version post installation?
FYI - GPU is GeForce GTX 1080 Ti

You need to follow the official guide step by step; click here to check whether the toolkit is installed correctly. The post-installation actions must also be carried out; click here for more information.
I ran into the same problem you describe. In my case, I fixed it by adding this directory to PATH:
$ export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
The best solution is to modify the corresponding profile file so that the change persists, like this:
vim /etc/profile
Add export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}} to the end of the file
reboot
Good Luck.
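To see why the old version keeps showing up, it can help to list every nvcc on PATH in lookup order, since the first match is the one the shell runs. A minimal sketch in POSIX shell; list_on_path is a hypothetical helper name, not part of CUDA, and the optional second argument exists only so a different search list can be inspected:

```shell
# Hypothetical helper: print every executable called NAME found on a
# colon-separated search list (defaults to $PATH), in lookup order.
# The body runs in a subshell so IFS and globbing changes do not leak out.
list_on_path() (
    name=$1
    path_list=${2-$PATH}
    IFS=:
    set -f                      # do not glob-expand directory names
    for dir in $path_list; do
        [ -n "$dir" ] || dir=.  # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

# Show all nvcc candidates; prints nothing if none is installed.
list_on_path nvcc
```

If /usr/bin/nvcc is printed before /usr/local/cuda-9.2/bin/nvcc, the export has not taken effect in the current shell, or another entry is listed ahead of it.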

Related

Cannot find libspirv-nvptx64--nvidiacl.bc when used intel clang++ to build binary for nvidia cuda GPU

I used the command below to build a binary for an NVIDIA GPU:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple-sycl-app.cpp -o simple-sycl-app-cuda
But I got the error message below:
clang++: error: cannot find 'libspirv-nvptx64--nvidiacl.bc'; provide path to libspirv library via '-fsycl-libspirv-path', or pass '-fno-sycl-libspirv' to build without linking with libspirv
I searched both the Intel oneAPI installation path and the CUDA toolkit path, but could not find libspirv-nvptx64--nvidiacl.bc.
Does anyone know where to find libspirv-nvptx64--nvidiacl.bc?
It looks like you are trying to compile using the DPC++ compiler for Nvidia GPUs.
This option is not included in the oneAPI release installations from the Intel website. At the moment you will need to compile the DPC++ LLVM project with this enabled to be able to use the appropriate flag to target Nvidia devices.
You can follow the instructions on this page to compile the project and then it explains how to use the ptx target. In the future Codeplay, the company I work for, intends to publish release binaries that include the ptx compiler option.

Confusing cuda versions

I just installed the latest CUDA 9.1 on Ubuntu 16.04 according to the official instructions. But when I run the command nvcc -V, it still shows my CUDA version as 7.5, as below.
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
Also, which nvcc gave me /usr/bin/nvcc which is not under /usr/local folder. Is this normal? Is this a compatibility issue? I have a GTX 1080 Ti and a GTX 980. I added commands below to .bashrc file, but it still didn't work.
export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
The best thing to do here is to remove all traces of CUDA binaries from the /usr/bin directory, and in the future always install the CUDA toolkit in the "default" locations at /usr/local/cuda-XX.
To remove CUDA items from /usr/bin, just use the Linux rm command as the root user. Not sure what to remove? Take a look in an "ordinary" CUDA install bin directory, such as /usr/local/cuda-8.0/bin.
By having your CUDA install at the default locations e.g. /usr/local/cuda-8.0 and /usr/local/cuda-9.0 (for example), you can have "side-by-side" installs, and switch between them by modifying the PATH and LD_LIBRARY_PATH variables accordingly.
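Switching between side-by-side installs could be wrapped in a small shell function. This is only a sketch: use_cuda and the CUDA_BASE variable are hypothetical names introduced here, and it assumes the toolkits live under /usr/local/cuda-<version> with 64-bit libraries in lib64:

```shell
# Hypothetical helper: put one toolkit version first on PATH and
# LD_LIBRARY_PATH. CUDA_BASE is an assumed override for the install prefix.
use_cuda() {
    cuda_root="${CUDA_BASE:-/usr/local}/cuda-$1"
    if [ ! -d "$cuda_root/bin" ]; then
        echo "use_cuda: $cuda_root/bin not found" >&2
        return 1
    fi
    # Prepend, so this version wins over anything already listed.
    PATH="$cuda_root/bin${PATH:+:${PATH}}"
    LD_LIBRARY_PATH="$cuda_root/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export PATH LD_LIBRARY_PATH
}

# Example (run in an interactive shell or from .bashrc):
#   use_cuda 9.1 && nvcc -V
```

The function only prepends, so repeated calls leave older entries further back in the list; the most recently selected version is still found first. It cannot override a stale nvcc that the shell has already hashed, so running hash -r or opening a new shell afterwards may be needed.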

nvidia-smi Failed to initialize NVML: GPU access blocked by the operating system

When I run
nvidia-smi
it gives this error:
Failed to initialize NVML: GPU access blocked by the operating system
Other information:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Mon_Feb_16_22:59:02_CST_2015
Cuda compilation tools, release 7.0, V7.0.27
and also:
$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GF108M [GeForce GT 425M] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GF108 High Definition Audio Controller (rev a1)
I have searched the internet extensively but could not find a way to solve this problem.
When I use an IPython notebook and try to run the Caffe framework, it gives this error:
Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected
I noticed that restarting Ubuntu after installing CUDA fixed it, and nvidia-smi now shows the GPU details.
If you believe that both CUDA and the graphics driver are installed correctly but your GPU still is not detected, the problem might be that you are using mobile NVIDIA graphics on an Optimus-enabled laptop under Linux.
You could either:
change your application to properly detect GPUs behind Optimus (see the documentation here),
or run your application via Bumblebee (and primus).
WSL user here. Running nvidia-smi failed on both Windows and WSL. Reinstalling the NVIDIA driver for WSL on the Windows side fixed the problem. The problem arose when installing the CUDA Toolkit and cuDNN broke the NVIDIA driver for WSL.
I had the same problem. I think it happened because I installed an NVIDIA toolkit, although I am not sure. According to this website (which has useful ideas),
I found that the driver version bundled with the CUDA installer and the one on the host were incompatible (host: 367.57, installer: 375.26). At first I could not check the installer's version because every version reported was 367.57, but I found it when I reinstalled CUDA from the run file.
So I uninstalled CUDA and the NVIDIA driver completely and installed CUDA again with the help of this guide. At first I got some errors during installation, which turned out to be because the NVIDIA driver had not been removed completely. After uninstalling it completely, I installed CUDA, and now I can run "sudo nvidia-smi" without problems.
I got the error "Failed to initialize NVML: Driver/Library version mismatch", and nvidia-smi failed to print any info. I checked whether other versions of the NVIDIA driver were installed on my Ubuntu system, but I only found nvidia-driver-390. In the end, a reboot solved the problem.

What is the difference between the CUDA toolkit and the CUDA SDK

I am installing CUDA on Ubuntu 14.04 and have a Maxwell card (GTX 9** series). I think I have installed everything properly with the toolkit, as I can compile my samples. However, I have read in places that I should install the SDK (this appears to be discussed in connection with SDK 4). Are the toolkit and the SDK different? And since I have a later 9-series card, does that mean I have CUDA 6 running? Here is my nvcc version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Wed_Aug_27_10:36:36_CDT_2014
Cuda compilation tools, release 6.5, V6.5.16
I am following a book and need to include <cutil.h>, but I can't find that file anywhere in the includes of my installation.
I followed this guide provided by NVIDIA, and since I have done what it says, I am confused: http://developer.download.nvidia.com/compute/cuda/6_5/rel/docs/CUDA_Getting_Started_Linux.pdf
Thanks for any help.
CUDA Toolkit is a software package that has different components. The main pieces are:
CUDA SDK (The compiler, NVCC, libraries for developing CUDA software, and CUDA samples)
GUI Tools (such as Eclipse Nsight for Linux/OS X or Visual Studio Nsight for Windows)
Nvidia Driver (system driver for driving the card)
It has also many other components such as CUDA-debugger, profiler, memory checker, etc.
The fact that you are able to compile and run samples means that you probably installed the Toolkit fully and have the SDK, the driver, and the Samples at least.
As for cutil.h, a search of my CUDA 6.5 installation with find -L . -iname "cutil.h" yielded no results. Looking at other related questions on SO, it seems this header file no longer exists in CUDA installations (since CUDA 5.0). However, in the samples you can find newer utility headers in use, such as helper_cuda.h. Helpers like these should be located somewhere like /usr/local/cuda/samples/common/inc on your system. helper_cuda.h is a header I almost always include in my CUDA programs, since I find utility functions such as checkCudaErrors() very useful.
If you are following a book, my recommendation is to try to compile the code, and whenever you get an error saying a utility function is missing, do a grep search in the header files under samples/common/inc. You will most probably find the missing utility functions there, and you can then include the necessary headers accordingly.
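The grep search suggested here might look like the sketch below. find_helper is a hypothetical name, and the default directory is simply the samples path mentioned in this answer, so adjust it to wherever the samples were installed:

```shell
# Hypothetical helper: list the .h files under a directory that mention
# a given symbol. -F treats the symbol as a plain string, -l prints
# only the names of matching files.
find_helper() {
    symbol=$1
    inc_dir=${2:-/usr/local/cuda/samples/common/inc}
    find "$inc_dir" -type f -name '*.h' -exec grep -lF -- "$symbol" {} +
}

# Example:
#   find_helper checkCudaErrors
```

Each file it prints is a candidate header to #include in place of cutil.h for that particular function.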

CUDA 5.0 wants the libcudart from CUDA 4.0?

I just upgraded from CUDA 4.2 to CUDA 5.0. Not surprisingly, the library that used to be named libcudart.so.4 is now called libcudart.so.5.0. After recompiling my code with nvcc 5.0 and attempting to run it, I got this message:
./main: error while loading shared libraries: libcudart.so.4: cannot open shared object file: No such file or directory
Yeah, you stupid system, I know there's no libcudart.so.4. That's because it's now called libcudart.so.5.0. Why is it looking for libcudart.so.4 instead of libcudart.so.5.0, and how can I fix it?
What I've tried so far:
I've checked that all my paths are in order. These environment variables are set:
export PATH=$PATH:/usr/local/cuda/bin:/usr/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib:/usr/local/cuda/lib64:/lib
#note: /usr/local/cuda is symlinked to /usr/local/cuda-5.0
I've verified that libcudart.so.5.0 can be found in one of the LD_LIBRARY_PATH directories.
I recompiled my CUDA application with the CUDA 5.0 version of nvcc. I successfully compiled and ran my application on another machine with CUDA 4.2, and on another machine with CUDA 4.0.
I confirmed that nvcc is really on version 5.0:
user@host$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2012 NVIDIA Corporation
Built on Fri_Sep_21_17:28:58_PDT_2012
Cuda compilation tools, release 5.0, V0.2.122
I'd like to get this question off the unanswered list, and I don't think @Jared Hoberock will mind, so I'm going to post his comment as an answer. If there's a concern and Jared or solvingPuzzles posts an answer, I'll delete mine (assuming it's not accepted -- I can't delete an accepted answer AFAIK).
nvcc seems to be statically linking against libcudart.a version 4.
Somewhere in your lib path, it seems that nvcc is finding an old libcudart.a, which needs to be removed.
For other readers, it's probably sufficient to find all instances of libcudart.* on the system and delete any that don't match your desired CUDA version (assuming you're not trying to run a machine with multiple CUDA versions available -- in that case, the library paths for both compiling and running have to be managed appropriately).
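Finding the stray copies could be done along these lines. This is a sketch rather than a definitive procedure: find_cudart is a hypothetical name, and /usr is only an assumed default search root:

```shell
# Hypothetical helper: list every libcudart file or symlink under a
# directory tree (defaults to /usr), sorted for easy comparison.
find_cudart() {
    find "${1:-/usr}" -name 'libcudart*' \( -type f -o -type l \) 2>/dev/null | sort
}

# Examples:
#   find_cudart               # scan /usr, including /usr/local/cuda-*
#   find_cudart /opt          # scan another tree
#   ldd ./main | grep cudart  # which libcudart the binary asks for at run time
```

Review the list before deleting anything; on a machine that deliberately keeps several CUDA versions, the older files should stay and the library paths should be adjusted instead.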