How to set CUDA parameters with GTX1080 for Tensorflow?

After I installed the driver for the GTX1080, TensorFlow shows that it can find the cuDNN library.
However, modprobe fails to load the GPU driver module.
Detailed information follows:
$ python
[14:22:14]
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
>>> sess = tf.InteractiveSession()
modprobe: ERROR: could not insert 'nvidia_352_uvm': Invalid argument
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:347] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.27 Thu Jun 9 18:53:27 PDT 2016 GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) """
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 367.27.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
The GTX1080 driver version is 367.27, as provided by NVIDIA.
I don't know why 'nvidia_352_uvm' shows up.
(The nvidia-smi output was attached as a screenshot.)
Maybe I need to reinstall CUDA, but I have already reinstalled it several times.
Should I remove all the CUDA libraries and the NVIDIA driver, then reinstall them all? Is there a required installation order for the two?

Too long for a comment, but here are some tips I've learned from trying to get NVIDIA drivers to play nice with Ubuntu.
Installing a new driver on top of an existing one leaves a partially upgraded installation. Remove the previous driver first.
sudo apt-get remove --purge nvidia-*
sudo rm /etc/X11/xorg.conf # if you ran nvidia-xconfig
Reload the NVIDIA driver as follows (from a virtual terminal, e.g. CTRL+ALT+F1):
sudo service lightdm stop # stop your window manager
killall python # kill all running TensorFlow instances to free GPU
sudo modprobe -r nvidia
sudo modprobe nvidia
dmesg | tail -100 # check for error messages
Check logs for any error messages from NVidia
dmesg | grep -i nvidia
lspci | grep -i nvidia
nvidia-smi # make sure this reports version 367.27
Also, there are two ways to install drivers: through Ubuntu's package manager with sudo apt-get install nvidia-current, or by downloading the installer from the NVIDIA website. I was not able to get the apt-get route to work for TensorFlow, so I would recommend downloading the driver from the NVIDIA website.
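The log in the question shows modprobe trying to load 'nvidia_352_uvm' while the kernel module reports 367.27, which is consistent with leftovers from an older driver package. Here is a minimal Python sketch for spotting that kind of mismatch. The function names are made up for illustration, and the sample strings are copied from the question's log.

```python
import re

def kernel_module_version(version_file_text):
    """Extract the driver version from /proc/driver/nvidia/version contents."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", version_file_text)
    return match.group(1) if match else None

def modprobe_branch(module_name):
    """Extract the driver branch from a versioned module name such as nvidia_352_uvm."""
    match = re.search(r"nvidia_(\d+)", module_name)
    return match.group(1) if match else None

def branches_match(version_file_text, module_name):
    """True if the loaded kernel module and the module name share a major version."""
    version = kernel_module_version(version_file_text)
    branch = modprobe_branch(module_name)
    if version is None or branch is None:
        return False
    return version.split(".")[0] == branch

sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.27 Thu Jun 9 18:53:27 PDT 2016"
print(kernel_module_version(sample))             # 367.27
print(branches_match(sample, "nvidia_352_uvm"))  # False
```

A False result here would suggest that packages from an older driver branch are still installed, which is the situation the purge step above is meant to clean up.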

Related

Tensorflow cannot open libcuda.so.1

I have a laptop with a GeForce 940 MX. I want to get Tensorflow up and running on the gpu. I installed everything from their tutorial page, now when I import Tensorflow, I get
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
>>>
after which I think it just falls back to running on the CPU.
EDIT: After I nuked everything and started from scratch, I now get this:
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally

libcuda.so.1 is a symlink to a file that is specific to the version of your NVIDIA drivers. It may be pointing to the wrong version or it may not exist.
# See where the link is pointing.
ls /usr/lib/x86_64-linux-gnu/libcuda.so.1 -la
# My result:
# lrwxrwxrwx 1 root root 19 Feb 22 20:40 \
# /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> ./libcuda.so.375.39
# Make sure it is pointing to the right version.
# Compare it with the installed NVIDIA driver.
nvidia-smi
# Replace libcuda.so.1 with a link to the correct version
cd /usr/lib/x86_64-linux-gnu
sudo ln -f -s libcuda.so.<yournvidia.version> libcuda.so.1
In the same way, create a symlink named libcuda.so.1 in the directory listed in your LD_LIBRARY_PATH, pointing at the same versioned library.
You may also need to create a link named libcuda.so in /usr/lib/x86_64-linux-gnu that points to libcuda.so.1.
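The symlink check above can also be scripted. This is a rough Python sketch, assuming the versioned library follows the libcuda.so.<version> naming shown in the ls output; libcuda_link_version is a hypothetical helper name, and the demo runs in a temporary directory rather than against the real library directory.

```python
import os
import re
import tempfile

def libcuda_link_version(lib_dir, link_name="libcuda.so.1"):
    """Return the driver version encoded in the file link_name resolves to, or None."""
    link_path = os.path.join(lib_dir, link_name)
    if not os.path.exists(link_path):  # False for a missing or dangling link
        return None
    target = os.path.basename(os.path.realpath(link_path))
    match = re.fullmatch(r"libcuda\.so\.(\d+(?:\.\d+)+)", target)
    return match.group(1) if match else None

# Demo in a throwaway directory so nothing under /usr/lib is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "libcuda.so.375.39"), "w").close()
    os.symlink("libcuda.so.375.39", os.path.join(demo_dir, "libcuda.so.1"))
    demo_version = libcuda_link_version(demo_dir)
    missing_version = libcuda_link_version(demo_dir, "libcuda.so")

print(demo_version)     # 375.39
print(missing_version)  # None
```

On a real system you would call libcuda_link_version("/usr/lib/x86_64-linux-gnu") and compare the result with the driver version that nvidia-smi prints.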

In case anyone still encounters this. First make sure to add the --runtime=nvidia parameter in order to run your container.
docker run --runtime=nvidia -t tensorflow/serving:latest-gpu
where tensorflow/serving:latest-gpu is the name of the docker image.

In the case I just solved, it was updating the GPU driver to the latest and installing the cuda toolkit. First, the ppa was added and GPU driver installed:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390
After adding the ppa, it showed options for driver versions, and 390 was the latest 'stable' version that was shown.
Then install the cuda toolkit:
sudo apt install nvidia-cuda-toolkit
Then reboot:
sudo reboot
It updated the drivers to a newer version than the 390 originally installed in the first step (it was 410; this was a p2.xlarge instance on AWS).

Installing cuda via brew and dmg

After attempting to install the NVIDIA toolkit on a Mac by following the guide at http://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html#axzz4FPTBCf7X I received the error "Package manifest parsing error", which led me to this question: NVidia CUDA toolkit 7.5.27 failing to install on OS X. I unmounted the dmg, and the upshot was that instead of showing "Package manifest parsing error" the installer would not launch at all (it seemed to launch briefly, then quit).
Installing via the command brew install Caskroom/cask/cuda (see: CUDA 7.5 install on Mac missing nvrtc) seems to have installed CUDA successfully.
The command nvcc --version returns:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Mon_Apr_11_13:23:40_CDT_2016
Cuda compilation tools, release 7.5, V7.5.26
I've built the example in /Developer/NVIDIA/CUDA-7.5/samples/1_Utilities with :
make -C bandwidthTest/
This executed without error.
It appears that brew install Caskroom/cask/cuda is a safe way to install; is that right? What is the difference between this method and installing via the DMG file from NVIDIA?
Caskroom appears to be an extension for brew for installing GUI applications: https://github.com/caskroom/homebrew-cask
Should an IDE also be installed as part of the CUDA install?

Nowadays you have to do the following to install cuda via brew:
brew tap homebrew/cask-drivers
brew cask install nvidia-cuda
See https://github.com/caskroom/homebrew-cask/issues/38325 .
Then you also need to add the following to your file ~/.bash_profile:
export PATH=/Developer/NVIDIA/CUDA-9.0/bin${PATH:+:${PATH}}
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-9.0/lib${DYLD_LIBRARY_PATH:+:${DYLD_LIBRARY_PATH}}
See http://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html.
UPDATE: Newer versions of Mac OS X with activated SIP (System integrity protection) will prevent modifying the DYLD_LIBRARY_PATH (see https://groups.google.com/forum/#!topic/caffe-users/waugt62RQMU). You can check that via
source ~/.bash_profile
env | grep DYLD_LIBRARY_PATH
If the output of this command is empty SIP is active and you might want to deactivate it as described at https://www.macworld.com/article/2986118/security/how-to-modify-system-integrity-protection-in-el-capitan.html . After doing this you should see
env | grep DYLD_LIBRARY_PATH
DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-9.0/lib

Both methods download and install from the same .dmg file from NVidia.
The homebrew-cask framework is the preferred method for installing software distributed as binaries in the homebrew paradigm.
This is my understanding.

To install from the DMG file, do the following:
wget 'https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_mac.dmg' && \
hdiutil attach cuda_10.2.89_mac.dmg \
-nobrowse \
-mountpoint \
/Volumes/CUDAMacOSXInstaller
Open installer:
open /Volumes/CUDAMacOSXInstaller/CUDAMacOSXInstaller.app
Uncheck "CUDA Samples" before continuing.
Unmount and remove file:
hdiutil detach /Volumes/CUDAMacOSXInstaller && rm ./cuda_10.2.89_mac.dmg

Unable to get cuda to work in tensorflow

I'm trying to use cuda to accelerate tensorflow. I'm running tensorflow using the docker image.
Firstly, when I launch the GPU image, the LD_LIBRARY_PATH environment variable does not match what is on disk:
~# echo $LD_LIBRARY_PATH
/usr/local/nvidia/lib:/usr/local/nvidia/lib64:
root@d578acbbc2cd:~# ls /usr/local/
bin cuda cuda-7.0 etc games include lib man sbin share src
There's no nvidia directory there. When I try to run the convolutional.py demo, it can't initialise the cuda support:
# python models/image/mnist/convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 8
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.2.0-23-generic/modules.dep.bin'
E tensorflow/stream_executor/cuda/cuda_driver.cc:466] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:98] retrieving CUDA diagnostic information for host: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:106] hostname: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:131] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:242] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.68 Tue Dec 1 17:24:11 PST 2015
GCC version: gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:135] kernel reported version is: 352.68
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA:
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 8
It then goes on to train using cpu only.
# find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
So in the docker image, there's only the gnu cpu cuda implementation. No NVIDIA stuff. In the host ubuntu 15.10 session, I have libcuda.so installed:
$ find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/i386-linux-gnu/libcuda.so
/usr/local/cuda-7.5/targets/x86_64-linux/lib/stubs/libcuda.so
So these seem to be stubs ... not sure why.
Is there some trick to getting this to work?
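The LD_LIBRARY_PATH mismatch described above (entries pointing at /usr/local/nvidia, which does not exist in the container) is easy to check for in general. A small Python sketch follows; missing_library_dirs is a hypothetical helper, not something TensorFlow or Docker provides.

```python
import os

def missing_library_dirs(ld_library_path):
    """Return the LD_LIBRARY_PATH entries that are not existing directories."""
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]

# Check the current environment; an empty list means every entry exists.
print(missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")))
```

Run inside the container, this would be expected to list both /usr/local/nvidia/lib and /usr/local/nvidia/lib64, given the ls output shown in the question.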

Try rebuilding the Docker image directly from the Tensorflow repository (i.e. don't rely on the image on the container registry) and use https://github.com/NVIDIA/nvidia-docker to run the container (the Docker command described in the Tensorflow documentation is not portable).

I had a similar problem, though not in Docker. The libcuda.so in /usr/local/cuda/lib64/stubs was a broken symlink. When I searched for libcuda.so, it only turned up a file in a lib32 folder.
It seems that the problem was how I originally installed the NVIDIA device driver. At some point in the driver install process you're given the option to install the lib32 drivers. I had thought this meant in addition to the lib64 drivers, so I selected it. It turns out it only installs the lib32 drivers and not the lib64 ones.
I reinstalled the NVIDIA device driver, this time not selecting the lib32 'option'. Now TensorFlow finds libcuda.so.

I had the same problem running TensorFlow on an Ubuntu machine after I upgraded my driver to 352.63 and 352.93. (I remember it worked with 346.*, but when I try to install 346.*, it installs 352.* automatically for some reason.)
I finally figured out that it is caused by a permission issue (I can run it as root). So I changed the permissions of the libcuda.so.352.63 file to make it executable by anyone, and it works well now.
Hope this will be helpful to those still struggling with this issue.
I didn't try the Docker setup, but I guess that is also caused by permission settings.
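To check for the permission problem described above without trial and error, something like the following Python sketch may help. It assumes the bit that matters is read access for non-root users; the answer above changed the execute bit, so adjust the mask if that is what your system needs. readable_by_everyone is a made-up name, and the demo uses a temporary file as a stand-in for the real library.

```python
import os
import stat
import tempfile

def readable_by_everyone(path):
    """True if owner, group and others all have read permission on path."""
    mode = os.stat(path).st_mode
    needed = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
    return (mode & needed) == needed

# Stand-in file; on a real system, pass the path of the versioned libcuda file.
with tempfile.NamedTemporaryFile() as stand_in:
    os.chmod(stand_in.name, 0o600)
    owner_only = readable_by_everyone(stand_in.name)
    os.chmod(stand_in.name, 0o644)
    world_readable = readable_by_everyone(stand_in.name)

print(owner_only, world_readable)  # False True
```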

Try this command
sudo apt-get install nvidia-modprobe
As mentioned here:
https://github.com/tensorflow/tensorflow/issues/394
and
http://kkjkok.blogspot.in/2016_08_01_archive.html

After I updated the NVIDIA driver to 378.09 on Ubuntu 14.10 I had the same error,
although all the permissions on the lib files were set correctly.
Thanks to @PhoenixQ, I tried running with sudo and it worked.
After that I tried running without sudo one more time and the error disappeared. I'm not sure what exactly happened, but maybe something was configured during the call with sudo that was not possible without sudo.
So the solution:
Try to run the same thing with sudo.
After this, try running without sudo. Worked for me.

Caffe Installation Issue on Ubuntu 14.04

I successfully installed caffe on my dual-boot laptop (GTX 860M, Windows 7 + Ubuntu 14.04.2). All the tests were successfully passed. When I restarted, however, the ubuntu got stuck on the opening screen (the one with ubuntu logo and five red dots). Don't know what to do with it.
Has anyone run into the same issue before? I reckon something is wrong with graphic card driver booting. I installed newest CUDA 7 Toolkit with nvidia drivers built inside. Since all tests were passed before I restarted, it seems that the driver would work once successfully booted.
the stuck screen is like this: http://i.stack.imgur.com/pRtEF.jpg

I had a similar issue when trying to install Caffe on my system. The steps below worked for me, but it has at least one known issue (documented below).
I'm not sure what precisely caused this problem, but it surely has something to do with the Nvidia Driver and Cuda Toolkit installation and is not caused by Caffe.
After completing the steps below, I've been able to successfully install Caffe on my system with the following tutorials and guides:
Official Install Guide
Github Install Guide
Update
Recently, I had the exact same problem trying to make Cuda 7.5 work on Ubuntu 14.04; this approach also solved that problem. Specs:
CPU: Intel Core i7-4700MQ (4x 2.40 GHz with Hyperthreading)
GPU: NVidia GT 940M
RAM: 8 GB
HDD: 52.7 GB (of which 6.7 GB used after installation)
INSTALL NVIDIA DRIVER AND CUDA ON UBUNTU 14.04
Source: ubuntuforums.org/showthread.php?t=2246526
!! Known Issues !!
After the system has been suspended (or hibernated, not confirmed), all applications using the Nvidia Driver and Cuda 6.5 Toolkit will freeze. When this happens, the command sudo shutdown -r now will print the reboot message but nothing will happen.
Executed and tested on a fresh 64-bit Ubuntu 14.04 install with the following hardware specifications:
CPU: Intel Core i5-2410m (2x 2.30 GHz with Hyperthreading)
GPU: NVidia GT 540M
RAM: 6 GB
HDD: 52.7 GB (of which 8.6 GB used after installation)
The following command was executed before installation:
sudo apt-get install -y build-essential vim git llvm clang
The following steps resulted in a stable system with the latest Nvidia Driver and Cuda 6.5 Toolkit installed:
Remove all traces of previous/legacy Nvidia Drivers and Cuda Toolkits or perform a fresh Ubuntu 14.04 install.
Download the latest Nvidia Driver .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: NVIDIA-Linux-x86_64-346.35.run
Download the latest Cuda 6.5 Toolkit .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: cuda_6.5.14_linux_64.run
Blacklist the 'nouveau' Driver by appending the following lines to /etc/modprobe.d/blacklist.conf (nouveau is a free open-source driver for Nvidia cards, it is the default for Ubuntu 14.04):
blacklist nouveau
options nouveau modeset=0
Reboot the system, do NOT log in but drop to the terminal with CTRL+ALT+F1
Kill lightdm (replace 'lightdm' with your own Display Manager if you have changed it, lightdm is the default for Ubuntu 14.04):
sudo service lightdm stop
The next step is critical, make sure to check twice before continuing!
Run the Nvidia Driver installer with the --no-opengl-files option (the option prevents OpenGL files from being overwritten; without this option, Unity would not function properly and the screen would freeze after login):
sudo chmod +x ~/Downloads/NVIDIA-Linux-x86_64-346.35.run
sudo ~/Downloads/NVIDIA-Linux-x86_64-346.35.run --no-opengl-files
Accept the EULA and acknowledge all further warnings, but decline to install anything extra.
Reboot and login to the desktop, verify with the 'Additional Drivers' (System Settings > Software & Updates > Additional Drivers) utility that the manually installed driver is in use.
Open a terminal and install the Cuda 6.5 Toolkit:
sudo chmod +x ~/Downloads/cuda_6.5.14_linux_64.run
sudo ~/Downloads/cuda_6.5.14_linux_64.run
Accept the EULA, do NOT install the driver, install the Toolkit and the Examples (if you want to), leave all default directories in place.
Add the Cuda 6.5 Toolkit environment variables by appending the following lines to ~/.bashrc:
# For 32-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib
# For 64-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64
The Nvidia Driver and Cuda 6.5 Toolkit should now be correctly installed.
Optional: confirm your Nvidia Driver and Cuda 6.5 Toolkit installation.
Confirm the Nvidia Driver installation by running the following command:
nvidia-smi
Confirm the Cuda Compiler installation by running the following command:
nvcc -V
Confirm everything works by building and running the optionally installed Cuda Examples: (build-essential is required to use 'make')
sudo apt-get install -y build-essential
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/deviceQuery
make
./deviceQuery
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/bandwidthTest
make
./bandwidthTest
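Before rebooting in the blacklist step above, it can be worth confirming that the edit to the blacklist file actually took effect. Here is a minimal Python sketch; nouveau_blacklisted is a hypothetical helper that only inspects text you pass in, and it ignores commented-out lines.

```python
def nouveau_blacklisted(conf_text):
    """True if the text contains an uncommented 'blacklist nouveau' line."""
    for raw_line in conf_text.splitlines():
        words = raw_line.split("#", 1)[0].split()
        if words == ["blacklist", "nouveau"]:
            return True
    return False

sample_conf = "blacklist nouveau\noptions nouveau modeset=0\n"
print(nouveau_blacklisted(sample_conf))            # True
print(nouveau_blacklisted("# blacklist nouveau"))  # False
```

To check the real file, pass in the contents of /etc/modprobe.d/blacklist.conf, for example via open("/etc/modprobe.d/blacklist.conf").read().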

This problem is not related to caffe.
The problem is that the nVidia driver that is installed from the ubuntu software center does not support your card.
Uninstall any nvidia package (sudo apt-get purge nvidia-*) and install the latest driver version from the nvidia website.

I recommend switching to the Ubuntu 15.04 build of CUDA 7.5. I tried it on Ubuntu 14.04 and it solved this problem. When I installed the Ubuntu 14.04 build of CUDA 7.5 on Ubuntu 14.04, I ran into exactly the same problem.

How to get the CUDA version?

Is there any quick command or script to check for the version of CUDA installed?
I found the manual of 4.0 under the installation directory but I'm not sure whether it is of the actual installed version or not.

As Jared mentions in a comment, from the command line:
nvcc --version
(or /usr/local/cuda/bin/nvcc --version) gives the CUDA compiler version (which matches the toolkit version).
From application code, you can query the runtime API version with
cudaRuntimeGetVersion()
or the driver API version with
cudaDriverGetVersion()
As Daniel points out, deviceQuery is an SDK sample app that queries the above, along with device capabilities.
As others note, you can also check the contents of the version.txt using (e.g., on Mac or Linux)
cat /usr/local/cuda/version.txt
However, if there is another version of the CUDA toolkit installed other than the one symlinked from /usr/local/cuda, this may report an inaccurate version if another version is earlier in your PATH than the above, so use with caution.
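If you need the version inside a script rather than on screen, the nvcc output can be parsed. This is a minimal Python sketch, assuming the output contains a "release MAJOR.MINOR" fragment as in the samples quoted on this page; parse_nvcc_release is a made-up helper name.

```python
import re

def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from 'nvcc --version' output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

sample_output = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85"""
print(parse_nvcc_release(sample_output))  # (9, 1)
```

To use it on a live system, feed it the text captured from the command, for example subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout.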

On Ubuntu Cuda V8:
$ cat /usr/local/cuda/version.txt
You can also get some insights into which CUDA versions are installed with:
$ ls -l /usr/local | grep cuda
which will give you something like this:
lrwxrwxrwx 1 root root 9 Mar 5 2020 cuda -> cuda-10.2
drwxr-xr-x 16 root root 4096 Mar 5 2020 cuda-10.2
drwxr-xr-x 16 root root 4096 Mar 5 2020 cuda-8.0.61
Given a sane PATH, the version cuda points to should be the active one (10.2 in this case).
NOTE: This only works if you are willing to assume CUDA is installed under /usr/local/cuda (which is true for the independent installer with the default location, but not true e.g. for distributions with CUDA integrated as a package). Ref: comment from @einpoklum.

[Edited answer. Thanks for everyone who corrected it]
If you run
nvidia-smi
You should find the highest CUDA version the installed driver supports in the top right corner of the command's output. At least, that is where I found it with CUDA version 10.0.
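The banner line can also be parsed if you want the two numbers in a script. A rough Python sketch, assuming the banner has the "Driver Version: ... CUDA Version: ..." layout quoted elsewhere on this page; parse_nvidia_smi_header is a hypothetical name, and a banner without a CUDA field simply yields None for that part.

```python
import re

def parse_nvidia_smi_header(header_line):
    """Return (driver_version, cuda_version) from the nvidia-smi banner line."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", header_line)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", header_line)
    return (driver.group(1) if driver else None,
            cuda.group(1) if cuda else None)

banner = "NVIDIA-SMI 450.36.06 Driver Version: 450.36.06 CUDA Version: 11.0"
print(parse_nvidia_smi_header(banner))  # ('450.36.06', '11.0')
```

Keep in mind that the CUDA number here describes what the driver supports, not which toolkit is installed.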

For CUDA version:
nvcc --version
Or use,
nvidia-smi
For cuDNN version:
For Linux:
Use following to find path for cuDNN:
$ whereis cuda
cuda: /usr/local/cuda
Then use this to get version from header file,
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
For Windows,
Use following to find path for cuDNN:
C:\>where cudnn*
C:\Program Files\cuDNN7\cuda\bin\cudnn64_7.dll
Then use this to dump version from header file,
type "%PROGRAMFILES%\cuDNN7\cuda\include\cudnn.h" | findstr CUDNN_MAJOR
If you're getting two different versions for CUDA on Windows, see the question "Different CUDA versions shown by nvcc and NVIDIA-smi".

Use the following command to check CUDA installation by Conda:
conda list cudatoolkit
And the following command to check CUDNN version installed by conda:
conda list cudnn
If you want to install/update CUDA and CUDNN through CONDA, please use the following commands:
conda install -c anaconda cudatoolkit
conda install -c anaconda cudnn
Alternatively you can use following commands to check CUDA installation:
nvidia-smi
OR
nvcc --version
If you are using tensorflow-gpu through Anaconda package (You can verify this by simply opening Python in console and check if the default python shows Anaconda, Inc. when it starts, or you can run which python and check the location), then manually installing CUDA and CUDNN will most probably not work. You will have to update through conda instead.
If you want to install CUDA, CUDNN, or tensorflow-gpu manually, you can check out the instructions here https://www.tensorflow.org/install/gpu

Other respondents have already described which commands can be used to check the CUDA version. Here, I'll describe how to turn the output of those commands into an environment variable of the form "10.2", "11.0", etc.
To recap, you can use
nvcc --version
to find out the CUDA version.
I think this should be your first port of call.
If you have multiple versions of CUDA installed, this command should print out the version for the copy which is highest on your PATH.
The output looks like this:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
We can pass this output through sed to pick out just the MAJOR.MINOR release version number.
CUDA_VERSION=$(nvcc --version | sed -n 's/^.*release \([0-9]\+\.[0-9]\+\).*$/\1/p')
If nvcc isn't on your path, you should be able to run it by specifying the full path to the default location of nvcc instead.
/usr/local/cuda/bin/nvcc --version
The output of which is the same as above, and it can be parsed in the same way.
Alternatively, you can find the CUDA version from the version.txt file.
cat /usr/local/cuda/version.txt
The output of which
CUDA Version 10.1.243
can be parsed using sed to pick out just the MAJOR.MINOR release version number.
CUDA_VERSION=$(cat /usr/local/cuda/version.txt | sed 's/.* \([0-9]\+\.[0-9]\+\).*/\1/')
Note that sometimes the version.txt file refers to a different CUDA installation than the nvcc --version. In this scenario, the nvcc version should be the version you're actually using.
We can combine these three methods together in order to robustly get the CUDA version as follows:
if nvcc --version > /dev/null 2>&1; then
    # Determine CUDA version using default nvcc binary
    CUDA_VERSION=$(nvcc --version | sed -n 's/^.*release \([0-9]\+\.[0-9]\+\).*$/\1/p');
elif /usr/local/cuda/bin/nvcc --version > /dev/null 2>&1; then
    # Determine CUDA version using /usr/local/cuda/bin/nvcc binary
    CUDA_VERSION=$(/usr/local/cuda/bin/nvcc --version | sed -n 's/^.*release \([0-9]\+\.[0-9]\+\).*$/\1/p');
elif [ -f "/usr/local/cuda/version.txt" ]; then
    # Determine CUDA version using /usr/local/cuda/version.txt file
    CUDA_VERSION=$(cat /usr/local/cuda/version.txt | sed 's/.* \([0-9]\+\.[0-9]\+\).*/\1/')
else
    CUDA_VERSION=""
fi
This environment variable is useful for downstream installations, such as when pip installing a copy of pytorch that was compiled for the correct CUDA version.
python -m pip install \
"torch==1.9.0+cu${CUDA_VERSION/./}" \
"torchvision==0.10.0+cu${CUDA_VERSION/./}" \
-f https://download.pytorch.org/whl/torch_stable.html
Similarly, you could install the CPU version of pytorch when CUDA is not installed.
if [ "$CUDA_VERSION" = "" ]; then
MOD="+cpu";
echo "Warning: Installing CPU-only version of pytorch"
else
MOD="+cu${CUDA_VERSION/./}";
echo "Installing pytorch with $MOD"
fi
python -m pip install \
"torch==1.9.0${MOD}" \
"torchvision==0.10.0${MOD}" \
-f https://download.pytorch.org/whl/torch_stable.html
But be careful with this because you can accidentally install a CPU-only version when you meant to have GPU support.
For example, this can happen if you run the install script on a server's login node that has no GPUs while your jobs are deployed onto nodes that do have GPUs. In this case, the login node will typically not have CUDA installed.
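If the install logic lives in Python rather than bash, the same mapping from version string to wheel tag could look roughly like this. It is a sketch that mirrors the ${CUDA_VERSION/./} substitution above; wheel_suffix is a made-up name, and it does not check whether a wheel with that tag actually exists for a given PyTorch release.

```python
def wheel_suffix(cuda_version):
    """Map 'MAJOR.MINOR' to a local version tag such as '+cu110'.

    An empty or missing version maps to '+cpu', matching the bash fallback.
    """
    if not cuda_version:
        return "+cpu"
    major, minor = cuda_version.split(".")[:2]
    return f"+cu{major}{minor}"

print(wheel_suffix("11.0"))  # +cu110
print(wheel_suffix("10.2"))  # +cu102
print(wheel_suffix(""))      # +cpu
```

The same caveat applies: an empty version silently selects the CPU build, so it may be worth failing loudly instead when you know GPUs are expected.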

On Ubuntu :
Try
$ cat /usr/local/cuda/version.txt
or
$ cat /usr/local/cuda-8.0/version.txt
Sometimes the folder is named "Cuda-version".
If none of the above works, go to
$ cd /usr/local/
and find the correct name of your CUDA folder.
Output should be similar to:
CUDA Version 8.0.61

If you have installed CUDA SDK, you can run "deviceQuery" to see the version of CUDA

If you have PyTorch installed, you can simply run the following code in your IDE:
import torch
print(torch.version.cuda)

On Windows 10, I found nvidia-smi.exe in 'C:\Program Files\NVIDIA Corporation\NVSMI'; after cd into that folder (it was not in the PATH in my case) and running '.\nvidia-smi.exe', it showed the driver and CUDA versions.

You might find CUDA-Z useful, here is a quote from their Site:
"This program was born as a parody of another Z-utilities such as CPU-Z and GPU-Z. CUDA-Z shows some basic information about CUDA-enabled GPUs and GPGPUs. It works with nVIDIA Geforce, Quadro and Tesla cards, ION chipsets."
http://cuda-z.sourceforge.net/
On the Support Tab there is the URL for the Source Code: http://sourceforge.net/p/cuda-z/code/ and the download is not actually an Installer but the Executable itself (no installation, so this is "quick").
This Utility provides lots of information and if you need to know how it was derived there is the Source to look at. There are other Utilities similar to this that you might search for.

One can get the cuda version by typing the following in the terminal:
$ nvcc -V
# below is the result
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Alternatively, one can manually check for the version by first finding out the installation directory using:
$ whereis -b cuda
cuda: /usr/local/cuda
And then cd into that directory and check for the CUDA version.

There are four ways to check the version.
In my case, the output is shown below:
Way 1:-
cat /usr/local/cuda/version.txt
Output:-
CUDA Version 10.1.243
Way2:-
nvcc --version
Output:-
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Way3:-
/usr/local/cuda/bin/nvcc --version
Output:-
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Way4:-
nvidia-smi
NVIDIA-SMI 450.36.06 Driver Version: 450.36.06 CUDA Version: 11.0
The outputs are not the same. I don't know why that happens.

First you should find where CUDA is installed.
For a default installation, the location should be:
for ubuntu:
/usr/local/cuda
in this folder you should have a file
version.txt
open this file with any text editor or run:
cat version.txt
from the folder
OR
cat /usr/local/cuda/version.txt

On Windows 11 with CUDA 11.6.1, this worked for me:
cat "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\version.json"

If nvcc --version is not working for you, use cat /usr/local/cuda/version.txt instead.

After installing CUDA one can check the versions by: nvcc -V
I have installed both 5.0 and 5.5 so it gives
Cuda compilation tools, release 5.5, V5.5.0
This command works for both Windows and Ubuntu.

Apart from the ones mentioned above, your CUDA installation path (if not changed during setup) typically contains the version number.
Running which nvcc should give the path, and that will give you the version.
PS: This is a quick and dirty way; the above answers are more elegant and will give you the right version, though with considerably more effort.

If you are running on linux:
dpkg -l | grep cuda

If you have multiple CUDA versions installed, the one active on your system is the one associated with "nvcc". Therefore, "nvcc --version" shows what you want.

Open a terminal and run these commands:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
This reports the CUDA driver version, the CUDA runtime version, and detailed information about the GPU(s).

I get "/usr/local: no such file or directory", though nvcc -V gives
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

Found mine after:
whereis cuda
at
cuda: /usr/lib/cuda /usr/include/cuda.h
with
nvcc --version
CUDA Version 9.1.85

Using tensorflow:
import tensorflow as tf
from tensorflow.python.platform import build_info as build
print(f"tensorflow version: {tf.__version__}")
print(f"Cuda Version: {build.build_info['cuda_version']}")
print(f"Cudnn version: {build.build_info['cudnn_version']}")
tensorflow version: 2.4.0
Cuda Version: 11.0
Cudnn version: 8

Programmatically with the CUDA Runtime API C++ wrappers (caveat: I'm the author):
auto v1 = cuda::version::maximum_supported_by_driver();
auto v2 = cuda::version::runtime();
This gives you a cuda::version_t structure, which you can compare and also print/stream e.g.:
if (v2 < cuda::version_t{ 8, 0 } ) {
    std::cerr << "CUDA version " << v2 << " is insufficient." << std::endl;
}

You can check the version of CUDA using
nvcc -V
or you can use
nvcc --version
or you can find where CUDA is installed using
whereis cuda
and then do
cat location/of/cuda/you/got/from/above/command/version.txt

On my cuda-11.6.0 installation, the information can be found in /usr/local/cuda/version.json. It contains the full version number (11.6.0 instead of 11.6 as shown by nvidia-smi).
The information can be retrieved as follows:
python -c 'import json; print(json.load(open("/usr/local/cuda/version.json"))["cuda"]["version"])'
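Older toolkits ship version.txt instead of version.json, so a script that has to work across installations may want to try both. Here is a hedged Python sketch; read_cuda_version is a hypothetical helper, the JSON layout is the one used by the one-liner above, and the text layout is the "CUDA Version X.Y.Z" line quoted in other answers. The demo writes stand-in files into a temporary directory.

```python
import json
import os
import re
import tempfile

def read_cuda_version(cuda_root="/usr/local/cuda"):
    """Return the toolkit version recorded under cuda_root, or None."""
    json_path = os.path.join(cuda_root, "version.json")
    txt_path = os.path.join(cuda_root, "version.txt")
    if os.path.isfile(json_path):
        with open(json_path) as handle:
            return json.load(handle)["cuda"]["version"]
    if os.path.isfile(txt_path):
        with open(txt_path) as handle:
            match = re.search(r"CUDA Version (\d+(?:\.\d+)*)", handle.read())
        return match.group(1) if match else None
    return None

# Demo against a fake install root so no real CUDA installation is needed.
with tempfile.TemporaryDirectory() as fake_root:
    empty_result = read_cuda_version(fake_root)
    with open(os.path.join(fake_root, "version.txt"), "w") as handle:
        handle.write("CUDA Version 10.1.243\n")
    txt_result = read_cuda_version(fake_root)
    with open(os.path.join(fake_root, "version.json"), "w") as handle:
        json.dump({"cuda": {"version": "11.6.0"}}, handle)
    json_result = read_cuda_version(fake_root)

print(empty_result, txt_result, json_result)  # None 10.1.243 11.6.0
```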

If there is a version mismatch between nvcc and nvidia-smi, then different versions of CUDA are being used as the driver and the runtime environment.
To make sure the intended CUDA version is used, put CUDA on the system path.
First run whereis cuda and find the location of the CUDA installation.
Then edit .bashrc: update the PATH variable and set the library search order using the variable 'LD_LIBRARY_PATH'.
For instance:
$ whereis cuda
cuda: /usr/lib/cuda /usr/include/cuda.h /usr/local/cuda
CUDA is installed at /usr/local/cuda, so now we need to edit .bashrc and add the path variable as:
vim ~/.bashrc
export PATH="/usr/local/cuda/bin:${PATH}"
and after this line set the directory search path as:
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"
Then save the .bashrc file. And refresh it as:
$ source ~/.bashrc
This ensures that nvcc -V reports the toolkit in /usr/local/cuda. Note that nvidia-smi reports the highest CUDA version the driver supports, so the two numbers can still differ.
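To see why the order of directories in PATH matters here, this small Python sketch resolves a command name against an explicit search path using shutil.which. The two "nvcc" files are empty stand-ins created in temporary directories, not real compilers, and the helper names are made up for illustration.

```python
import os
import shutil
import tempfile

def resolve_on_path(command, search_path):
    """Return the first executable named command found along search_path, or None."""
    return shutil.which(command, path=search_path)

def make_fake_tool(directory, name):
    """Create a tiny executable stand-in file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(path, 0o755)
    return path

with tempfile.TemporaryDirectory() as old_dir, tempfile.TemporaryDirectory() as new_dir:
    old_tool = make_fake_tool(old_dir, "nvcc")
    new_tool = make_fake_tool(new_dir, "nvcc")
    # new_dir is listed first, so its copy should be the one that is found.
    resolved = resolve_on_path("nvcc", new_dir + os.pathsep + old_dir)
    new_dir_wins = resolved == new_tool

print(new_dir_wins)  # True
```

On a real machine, shutil.which("nvcc") with no path argument shows which copy your current PATH selects.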
