Installing cuda via brew and dmg - cuda

After attempting to install the NVIDIA toolkit on a Mac by following the guide at http://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html#axzz4FPTBCf7X, I received the error "Package manifest parsing error", which led me to this question: NVidia CUDA toolkit 7.5.27 failing to install on OS X. I unmounted the dmg, and the upshot was that instead of showing "Package manifest parsing error" the installer would not launch (it seemed to launch briefly, then quit).
Installing via the command brew install Caskroom/cask/cuda (see: CUDA 7.5 install on Mac missing nvrtc) seems to have successfully installed cuda.
The command nvcc --version returns:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Mon_Apr_11_13:23:40_CDT_2016
Cuda compilation tools, release 7.5, V7.5.26
I've built the example in /Developer/NVIDIA/CUDA-7.5/samples/1_Utilities with :
make -C bandwidthTest/
This executed without error.
It appears that installing with brew install Caskroom/cask/cuda is a safe method of installing? What is the difference between this install method and installing via the DMG file from NVIDIA?
Caskroom appears to be an extension for brew for installing GUI applications: https://github.com/caskroom/homebrew-cask
Should an IDE also be installed as part of the cuda install?

Nowadays you have to do the following to install cuda via brew:
brew tap homebrew/cask-drivers
brew cask install nvidia-cuda
See https://github.com/caskroom/homebrew-cask/issues/38325 .
Then you also need to add the following to your file ~/.bash_profile:
export PATH=/Developer/NVIDIA/CUDA-9.0/bin${PATH:+:${PATH}}
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-9.0/lib${DYLD_LIBRARY_PATH:+:${DYLD_LIBRARY_PATH}}
See http://docs.nvidia.com/cuda/cuda-installation-guide-mac-os-x/index.html.
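The ${VAR:+:${VAR}} form in the two export lines adds a colon and the old value only when the variable is already set and non-empty, so an unset variable does not leave a trailing colon behind. A minimal sketch of that expansion follows; prepend_dir is a hypothetical helper name used only for illustration and is not part of any CUDA or Homebrew tooling.

```shell
# prepend_dir DIR CURRENT
# Prints DIR followed by ":CURRENT", but only when CURRENT is non-empty.
# (prepend_dir is a hypothetical helper, not part of CUDA or Homebrew.)
prepend_dir() {
    dir=$1
    current=$2
    printf '%s%s\n' "$dir" "${current:+:${current}}"
}

# Empty old value: no trailing colon is produced.
prepend_dir /Developer/NVIDIA/CUDA-9.0/lib ""
# Non-empty old value: the new directory goes in front.
prepend_dir /Developer/NVIDIA/CUDA-9.0/bin "/usr/bin:/bin"
```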
UPDATE: Newer versions of Mac OS X with SIP (System Integrity Protection) activated will prevent modifying DYLD_LIBRARY_PATH (see https://groups.google.com/forum/#!topic/caffe-users/waugt62RQMU). You can check that via
source ~/.bash_profile
env | grep DYLD_LIBRARY_PATH
If the output of this command is empty, SIP is active and you might want to deactivate it as described at https://www.macworld.com/article/2986118/security/how-to-modify-system-integrity-protection-in-el-capitan.html. After doing this you should see
env | grep DYLD_LIBRARY_PATH
DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-9.0/lib
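The grep check only shows whether the variable is set at all. A minimal sketch that tests for one specific directory in a colon-separated list, assuming the CUDA-9.0 path from the export lines; path_list_contains is a hypothetical helper name.

```shell
# path_list_contains LIST DIR
# Succeeds when DIR is one of the colon-separated entries in LIST.
# (Hypothetical helper; the name is an assumption for this sketch.)
path_list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

cuda_lib=/Developer/NVIDIA/CUDA-9.0/lib   # path taken from the export lines
if path_list_contains "${DYLD_LIBRARY_PATH:-}" "$cuda_lib"; then
    echo "DYLD_LIBRARY_PATH contains $cuda_lib"
else
    echo "DYLD_LIBRARY_PATH does not contain $cuda_lib"
fi
```

Wrapping both the list and the pattern in colons keeps /a/lib from matching /a/lib64.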

Both methods download and install from the same .dmg file from NVidia.
The homebrew-cask framework is the preferred method for installing software distributed as binaries in the homebrew paradigm.
This is my understanding.

To install from the DMG file, follow the steps below:
wget 'https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_mac.dmg' && \
hdiutil attach cuda_10.2.89_mac.dmg \
-nobrowse \
-mountpoint \
/Volumes/CUDAMacOSXInstaller
Open installer:
open /Volumes/CUDAMacOSXInstaller/CUDAMacOSXInstaller.app
Uncheck "CUDA Samples" before continuing.
Unmount the image and remove the file:
hdiutil detach /Volumes/CUDAMacOSXInstaller && rm ./cuda_10.2.89_mac.dmg

Related

Tensorflow cannot open libcuda.so.1

I have a laptop with a GeForce 940 MX. I want to get Tensorflow up and running on the gpu. I installed everything from their tutorial page, now when I import Tensorflow, I get
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
>>>
after which I think it just switches to running on the CPU.
EDIT: I nuked everything and started from scratch. Now I get this:
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
libcuda.so.1 is a symlink to a file that is specific to the version of your NVIDIA drivers. It may be pointing to the wrong version or it may not exist.
# See where the link is pointing.
ls /usr/lib/x86_64-linux-gnu/libcuda.so.1 -la
# My result:
# lrwxrwxrwx 1 root root 19 Feb 22 20:40 \
# /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> ./libcuda.so.375.39
# Make sure it is pointing to the right version.
# Compare it with the installed NVIDIA driver.
nvidia-smi
# Replace libcuda.so.1 with a link to the correct version
cd /usr/lib/x86_64-linux-gnu
sudo ln -f -s libcuda.so.<yournvidia.version> libcuda.so.1
In the same way, create a symlink named libcuda.so.1 in a directory on your LD_LIBRARY_PATH that points to this libcuda.so.1.
You may also find that you need to create a link named libcuda.so in /usr/lib/x86_64-linux-gnu that points to libcuda.so.1.
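The link check can also be scripted. The sketch below prints the driver version that a libcuda.so.1 symlink points to, so it can be compared with the version nvidia-smi reports. It assumes the target is named libcuda.so.<version>, as in the listing in this answer; link_target_version is a hypothetical helper name.

```shell
# link_target_version LINK
# Prints the version suffix of the file a libcuda.so.1-style symlink
# points to, e.g. "375.39" for a target named libcuda.so.375.39.
# (Hypothetical helper; assumes the libcuda.so.<version> naming.)
link_target_version() {
    target=$(readlink "$1") || return 1
    target=${target##*/}                 # drop any directory part
    printf '%s\n' "${target#libcuda.so.}"
}

link=/usr/lib/x86_64-linux-gnu/libcuda.so.1
if [ -L "$link" ]; then
    echo "libcuda.so.1 points to driver version $(link_target_version "$link")"
else
    echo "$link is not a symlink on this machine"
fi
```

If the printed version differs from the driver version shown by nvidia-smi, the link is likely stale and the ln -f -s step applies.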
In case anyone still encounters this. First make sure to add the --runtime=nvidia parameter in order to run your container.
docker run --runtime=nvidia -t tensorflow/serving:latest-gpu
where tensorflow/serving:latest-gpu is the name of the docker image.
In the case I just solved, it was updating the GPU driver to the latest and installing the cuda toolkit. First, the ppa was added and GPU driver installed:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390
After adding the ppa, it showed options for driver versions, and 390 was the latest 'stable' version that was shown.
Then install the cuda toolkit:
sudo apt install nvidia-cuda-toolkit
Then reboot:
sudo reboot
It updated the drivers to a newer version than the 390 originally installed in the first step (it was 410; this was a p2.xlarge instance on AWS).

How to find cuda version in ubuntu?

I installed CUDA 8.0 on my Ubuntu 16.04 machine and checked the CUDA version using the command "nvcc --version". It shows the version as 7.5! How can I be sure that it is accurate? Are there other commands that I can use to verify my result?
[edited 2022]
For CUDA 11:
$ cat /usr/local/cuda/version.json
For cuda-8.0 on Ubuntu16.04, you should be able to read
$ cat /usr/local/cuda/version.txt
CUDA Version 8.0.44
I agree with Robert Crovella, you might need to check your PATH
Starting from CUDA 8.0, it's possible to have multiple CUDA versions installed. You can then set different values for the $PATH environment variable to select which CUDA version you get.
Command to immediately obtain the CUDA version:
$ nvcc --version | grep "release" | awk '{print $6}' | cut -c2-
You can confirm the result by checking the install status of CUDA libraries:
$ dpkg -l | grep cuda
For installing multiple versions of CUDA, you can refer to this article.
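When nvcc and /usr/local/cuda disagree, it helps to print both alongside the nvcc binary that PATH actually resolves to. A minimal sketch, assuming the output formats shown in this thread ("release 7.5," in the nvcc output and "CUDA Version 8.0.44" in version.txt); both function names are hypothetical.

```shell
# cuda_release_from_nvcc
# Reads `nvcc --version` output on stdin and prints the release (e.g. 7.5).
cuda_release_from_nvcc() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# cuda_version_from_txt
# Reads a version.txt-style line ("CUDA Version 8.0.44") on stdin.
cuda_version_from_txt() {
    sed -n 's/^CUDA Version \([0-9][0-9.]*\).*/\1/p'
}

# Report what is found; each check is skipped when its source is missing.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc on PATH:    $(command -v nvcc)"
    echo "nvcc reports:    $(nvcc --version | cuda_release_from_nvcc)"
fi
if [ -r /usr/local/cuda/version.txt ]; then
    echo "/usr/local/cuda: $(cuda_version_from_txt < /usr/local/cuda/version.txt)"
fi
```

An nvcc path under /usr/bin rather than /usr/local/cuda/bin would suggest that the distribution's nvidia-cuda-toolkit package is the one being picked up.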
Thank you all...
Previously I tried to install cuda8.0 using the run file from https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run. After that I ran "nvcc --version", but it showed the following error: "The program 'nvcc' is currently not installed. You can install it by typing: sudo apt-get install nvidia-cuda-toolkit". So I ran the suggested command, and it gave me CUDA 7.5.
Later I installed CUDA using the Debian package, which contains nvcc by default. Now I am getting the correct version.
It may be because you have both v7.5 and v8.0 installed. So instead of changing the path, try uninstalling v7.5 first.

How to set CUDA parameters with GTX1080 for Tensorflow?

After I installed the driver for the GTX1080, tensorflow shows that it can find the cudnn library.
However, modprobe does not recognize the GPU driver.
Detailed information is as follows:
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
>>> sess = tf.InteractiveSession()
modprobe: ERROR: could not insert 'nvidia_352_uvm': Invalid argument
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:347] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.27 Thu Jun 9 18:53:27 PDT 2016 GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) """
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 367.27.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
The version of the GTX1080 driver is 367.27, which is provided by NVIDIA.
I don't know why there is an 'nvidia_352_uvm'.
The result of nvidia-smi is here.
Maybe I need to reinstall CUDA, but I have already reinstalled it several times.
Should I remove all the CUDA libraries and the NVIDIA driver, then reinstall them all? Is there a required install order for the two?
Too long for a comment, but here are some tips I've learned after trying to get NVidia drivers to play nice with Ubuntu.
Upgrading new driver on top of existing driver gives a partially upgraded installation. You need to remove the previous stuff first.
sudo apt-get remove --purge nvidia-*
sudo rm /etc/X11/xorg.conf # if you ran nvidia-xconfig
Reload NVidia driver as follows (from virtual terminal, CTRL+ALT+F7)
sudo service lightdm stop # stop your display manager
killall python # kill all running TensorFlow instances to free GPU
sudo modprobe -r nvidia
sudo modprobe nvidia
dmesg | tail -100 # check for error messages
Check logs for any error messages from NVidia
dmesg | grep -i nvidia
lspci | grep -i nvidia
nvidia-smi # make sure this reports version 367.27
Also, there are two ways to install drivers: using Ubuntu's built-in upgrade with sudo apt-get install nvidia-current, or by getting the tarball from the NVidia website. I was not able to get the sudo apt-get route to work for TensorFlow, so I would recommend downloading the drivers from the NVidia website.
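The log in the question shows modprobe trying to insert nvidia_352_uvm while the kernel reports driver 367.27, which is consistent with leftovers from an older 352 package. A minimal sketch for spotting that kind of mismatch, assuming the "Kernel Module <version>" format shown in the log and module names of the form nvidia_<series>_uvm; all three function names are hypothetical.

```shell
# nvrm_driver_version
# Reads /proc/driver/nvidia/version-style text on stdin and prints the
# kernel module version (e.g. 367.27).
nvrm_driver_version() {
    sed -n 's/.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# module_series NAME
# Prints the driver series embedded in a module name such as nvidia_352_uvm.
module_series() {
    printf '%s\n' "$1" | sed -n 's/^nvidia_\([0-9][0-9]*\).*/\1/p'
}

# same_series VERSION NAME
# Succeeds when the major part of VERSION equals the series in NAME.
same_series() {
    [ "${1%%.*}" = "$(module_series "$2")" ]
}

# Values taken from the log in the question.
if same_series 367.27 nvidia_352_uvm; then
    echo "module series matches the loaded driver"
else
    echo "mismatch: driver 367.27 vs module nvidia_352_uvm"
fi

# On a machine with the driver loaded, print the live version as well.
if [ -r /proc/driver/nvidia/version ]; then
    echo "loaded: $(nvrm_driver_version < /proc/driver/nvidia/version)"
fi
```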

CUDA 7.5 install on Mac missing nvrtc

According to the documentation, when I install the CUDA 7.5 Toolkit on my Mac (OSX 10.11) I should get the nvrtc files with it. I do not. Where do I pick up the nvrtc header files and libraries? Were they supposed to be in the bundle and left out? Were they deprecated or replaced with something else?
So the trick is:
1) Install XCode (from the App Store) FIRST. After the App Store is done installing it, you have to go into your Application menu and actually run it and accept the license.
2) Use the Homebrew version:
$ brew install Caskroom/cask/cuda
3) Lastly, you can update your PATH and LD_LIBRARY_PATH to find the new code:
$ export PATH=/usr/local/cuda/bin:${PATH}
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib:${LD_LIBRARY_PATH}
For some reason, simply downloading the package from NVidia and installing it does not get you a complete installation.

Caffe Installation Issue on Ubuntu 14.04

I successfully installed caffe on my dual-boot laptop (GTX 860M, Windows 7 + Ubuntu 14.04.2). All the tests passed. When I restarted, however, Ubuntu got stuck on the boot screen (the one with the Ubuntu logo and five red dots). I don't know what to do about it.
Has anyone run into the same issue before? I reckon something goes wrong when the graphics card driver loads at boot. I installed the newest CUDA 7 Toolkit with the bundled NVIDIA drivers. Since all tests passed before I restarted, it seems that the driver would work once the system booted successfully.
The stuck screen looks like this: http://i.stack.imgur.com/pRtEF.jpg
I had a similar issue when trying to install Caffe on my system. The steps below worked for me, but it has at least one known issue (documented below).
I'm not sure what precisely caused this problem, but it surely has something to do with the Nvidia Driver and Cuda Toolkit installation and is not caused by Caffe.
After completing the steps below, I've been able to successfully install Caffe on my system with the following tutorials and guides:
Official Install Guide
Github Install Guide
Update
Recently, I had the exact same problem trying to make Cuda 7.5 work on Ubuntu 14.04; this approach also solved that problem. Specs:
CPU: Intel Core i7-4700MQ (4x 2.40 GHz with Hyperthreading)
GPU: NVidia GT 940M
RAM: 8 GB
HDD: 52.7 GB (of which 6.7 GB used after installation)
INSTALL NVIDIA DRIVER AND CUDA ON UBUNTU 14.04
Source: ubuntuforums.org/showthread.php?t=2246526
!! Known Issues !!
After the system has been suspended (or hibernated, not confirmed), all applications using the Nvidia Driver and Cuda 6.5 Toolkit will freeze. When this happens, the command sudo shutdown -r now will print the reboot message but nothing will happen.
Executed and tested on a fresh 64-bit Ubuntu 14.04 install with the following hardware specifications:
CPU: Intel Core i5-2410m (2x 2.30 GHz with Hyperthreading)
GPU: NVidia GT 540M
RAM: 6 GB
HDD: 52.7 GB (of which 8.6 GB used after installation)
The following command was executed before installation:
sudo apt-get install -y build-essential vim git llvm clang
The following steps resulted in a stable system with the latest Nvidia Driver and Cuda 6.5 Toolkit installed:
Remove all traces of previous/legacy Nvidia Drivers and Cuda Toolkits or perform a fresh Ubuntu 14.04 install.
Download the latest Nvidia Driver .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: NVIDIA-Linux-x86_64-346.35.run
Download the latest Cuda 6.5 Toolkit .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: cuda_6.5.14_linux_64.run
Blacklist the 'nouveau' Driver by appending the following lines to /etc/modprobe.d/blacklist.conf (nouveau is a free open-source driver for Nvidia cards, it is the default for Ubuntu 14.04):
blacklist nouveau
options nouveau modeset=0
Reboot the system, do NOT log in but drop to the terminal with CTRL+ALT+F1
Kill lightdm (replace 'lightdm' with your own Display Manager if you have changed it, lightdm is the default for Ubuntu 14.04):
sudo service lightdm stop
The next step is critical, make sure to check twice before continuing!
Run the Nvidia Driver installer with the --no-opengl-files option (the option prevents OpenGL files from being overwritten; without this option, Unity would not function properly and the screen would freeze after login):
sudo chmod +x ~/Downloads/NVIDIA-Linux-x86_64-346.35.run
sudo ~/Downloads/NVIDIA-Linux-x86_64-346.35.run --no-opengl-files
Accept the EULA and acknowledge all further warnings, but decline to install anything extra.
Reboot and login to the desktop, verify with the 'Additional Drivers' (System Settings > Software & Updates > Additional Drivers) utility that the manually installed driver is in use.
Open a terminal and install the Cuda 6.5 Toolkit:
sudo chmod +x ~/Downloads/cuda_6.5.14_linux_64.run
sudo ~/Downloads/cuda_6.5.14_linux_64.run
Accept the EULA, do NOT install the driver, install the Toolkit and the Examples (if you want to), leave all default directories in place.
Add the Cuda 6.5 Toolkit environment variables by appending the following lines to ~/.bashrc:
# For 32-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib
# For 64-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64
The Nvidia Driver and Cuda 6.5 Toolkit should now be correctly installed.
Optional: confirm your Nvidia Driver and Cuda 6.5 Toolkit installation.
Confirm the Nvidia Driver installation by running the following command:
nvidia-smi
Confirm the Cuda Compiler installation by running the following command:
nvcc -V
Confirm everything works by building and running the optionally installed Cuda Examples: (build-essential is required to use 'make')
sudo apt-get install -y build-essential
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/deviceQuery
make
./deviceQuery
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/bandwidthTest
make
./bandwidthTest
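Two of the steps above append lines to a config file (/etc/modprobe.d/blacklist.conf and ~/.bashrc). Repeating the procedure would append them a second time, so a small guard can help. A minimal sketch, where append_once is a hypothetical helper name and the demonstration writes to a scratch file instead of the real config files:

```shell
# append_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# (Hypothetical helper; not part of the NVIDIA or Ubuntu tooling.)
append_once() {
    file=$1
    line=$2
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file rather than on ~/.bashrc itself.
demo=$(mktemp)
append_once "$demo" 'export PATH=$PATH:/usr/local/cuda-6.5/bin'
append_once "$demo" 'export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64'
append_once "$demo" 'export PATH=$PATH:/usr/local/cuda-6.5/bin'   # no-op, already present
cat "$demo"
rm -f "$demo"
```

grep -x matches whole lines and -F disables regex interpretation, so the $ and / characters in the export lines are compared literally.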
This problem is not related to caffe.
The problem is that the nVidia driver that is installed from the ubuntu software center does not support your card.
Uninstall any nvidia package (sudo apt-get purge nvidia-*) and install the latest driver version from the nvidia website.
I recommend switching to the CUDA 7.5 build for Ubuntu 15.04. I tried it on Ubuntu 14.04 and it solved this problem. When I installed the CUDA 7.5 build for Ubuntu 14.04 on Ubuntu 14.04, I encountered exactly this problem.