CUDA "No compatible Device" error on Ubuntu 11.10/12.04

I have been trying for some time to set up an Ubuntu environment on my laptop for CUDA programming. I am currently dual-booting Windows 8 and Ubuntu 12.04 and want to install CUDA 5 on Ubuntu.
The laptop has a GeForce GT 640M graphics card (see below for full specs). It is an Optimus card.
Originally I was dual-booting Ubuntu 11.10, and I have tried tutorials on both 11.10 and 12.04.
I have tried many tutorials of all shapes and sizes, including this tutorial. During installation the device driver and the Toolkit install but the Samples fail, and when I test a simple vector-add CUDA program in Nsight, a "No compatible CUDA Device" error is thrown.
Ubuntu's Details panel also still shows "Unknown" for Graphics.
Suggestions?
Laptop Specs:
Acer V3-771G
Intel Core i7 2670QM
nVidia GeForce GT 640M 2GB - Optimus
16GB DDR3-1600 RAM
120GB SSD + 500GB HDD + 32GB Cache SSD

Since it is an Optimus device, some extra steps are needed before you can use the NVIDIA GPU. It is not strictly necessary, but I suggest using the Bumblebee wrapper program because it is the easiest solution.
After you have installed Bumblebee, you can run a program on the NVIDIA card with optirun programname, or start a shell with the card activated: optirun bash --login
An added bonus is that the Bumblebee daemon disables the GPU when nothing is using it, which saves some battery.
If you don't care about battery life and just want CUDA to be always enabled without wrapping commands, you can load the nvidia kernel module and then create the necessary device nodes manually (as root):
mknod /dev/nvidia0 c 195 0
mknod /dev/nvidiactl c 195 255
(This advanced method lets you run cuda programs from the console without starting Xorg, for example when SSH-ing to a machine without a running X server.)
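As a rough illustration of the manual route, the following Python sketch generates the mknod command lines for a machine with more than one GPU. The major number 195 and the control-node minor 255 come from the commands above; using minor number i for /dev/nvidia<i> is an assumption that extends the single-GPU example, and the function name mknod_commands is made up for this sketch.

```python
# Sketch: build the mknod command lines for the NVIDIA device nodes.
# 195 and 255 are taken from the commands above; "minor == GPU index"
# is an assumption extending the /dev/nvidia0 example.
NVIDIA_MAJOR = 195
CONTROL_MINOR = 255


def mknod_commands(gpu_count):
    """Return mknod command lines for gpu_count GPUs plus /dev/nvidiactl."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    commands = [
        f"mknod /dev/nvidia{i} c {NVIDIA_MAJOR} {i}" for i in range(gpu_count)
    ]
    commands.append(f"mknod /dev/nvidiactl c {NVIDIA_MAJOR} {CONTROL_MINOR}")
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; they need root to execute.
    for line in mknod_commands(1):
        print(line)
```

The sketch only prints the commands, so it can be reviewed before anything is run as root.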
See also https://askubuntu.com/questions/131506/how-can-i-get-nvidia-cuda-or-opencl-working-on-a-laptop-with-nvidia-discrete-car for a more detailed discussion.

Try the command sudo apt-get install mesa-utils.
See whether the graphics card is recognized, and then try to install CUDA.
If it is still not recognized after the first command, try:
sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo apt-get update
sudo apt-get install nvidia-current

First install the following libraries & Tools:
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
Next we will blacklist some modules (drivers). In a terminal, enter:
sudo gedit /etc/modprobe.d/blacklist.conf
Add the following to the end of the file (one per line, like so):
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
Save the file and close the editor.
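If you would rather not edit the file blindly, a small Python sketch like the one below can report which of the blacklist lines listed above are still missing from a given file's contents. The function name missing_blacklist_lines is hypothetical; the module list is copied from the lines above.

```python
# Sketch: report which "blacklist <module>" lines are not yet present.
# The module names are the ones listed in the answer above.
WANTED_MODULES = ["amd76x_edac", "vga16fb", "nouveau", "rivafb", "nvidiafb", "rivatv"]


def missing_blacklist_lines(conf_text, modules=WANTED_MODULES):
    """Return the blacklist lines that conf_text does not already contain."""
    present = set()
    for raw_line in conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated words.
        words = raw_line.split("#", 1)[0].split()
        if len(words) == 2 and words[0] == "blacklist":
            present.add(words[1])
    return [f"blacklist {name}" for name in modules if name not in present]


if __name__ == "__main__":
    with open("/etc/modprobe.d/blacklist.conf") as conf:
        for line in missing_blacklist_lines(conf.read()):
            print(line)
```

Commented-out entries are treated as absent, so a line such as "# blacklist rivafb" would still be reported as missing.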
Now we want to get rid of any NVIDIA residuals. In a terminal:
sudo apt-get remove --purge nvidia*
Next you need to restart your machine (sudo reboot).
0) Press Ctrl+Alt+F1 at the login screen (you don't have to log in to the desktop, since we'll have to restart later anyway), then log in at the text console.
1) sudo service lightdm stop
2) cd Downloads
3) chmod +x devdriver*.run (your driver filename)
4) sudo ./devdriver*.run
You might have to run the driver installer once, reboot (it will remove the nouveau driver), and repeat the steps. Follow the installer instructions and it will be fine. When it asks:
yes, you do want the 32-bit libraries, and you DO want it to change the xorg.conf file.
Once the installer completes, restart (sudo reboot). You're done :]
To install the SDK and Toolkit, repeat steps 3 and 4 with the downloaded .run files.

In theory, the drivers included with CUDA 5.5 should natively support Optimus (as well as single-GPU debugging for non-Optimus laptops). I haven't tried it yet because I'm waiting for a compute 3.5 Optimus laptop, so that it will support kernel recursion and HyperQ. In theory the HP Envy 15t-j000 has the GK208 version of the GT 740M, but I'd really rather have an ultrabook form factor like the upcoming Acer S3-392 with the GT 735M. The NVIDIA guys at GTC assured me that Optimus should work with the CUDA 5.5 RC. I found this 'CUDA Getting Started Guide for Linux', released this month, which provides some flags for getting the Optimus drivers installed correctly:
http://developer.download.nvidia.com/compute/cuda/5_5/rc/docs/CUDA_Getting_Started_Linux.pdf
Also, more information about GK208 Chips and Compute 3.5 in laptops:
https://devtalk.nvidia.com/default/topic/546357/sounds-like-gk208-laptops-cards-will-support-most-sm_35-features/
Has anyone had luck with CUDA 5.5 and Optimus laptops under Linux?

Related

How to run u-boot on QEMU(raspi2)?

I am trying to run U-Boot on QEMU, but when I start QEMU it prints nothing. Why doesn't this work, and how can I debug it to find the reason?
This is what I tried:
Install Ubuntu 18.04 under WSL2 on Windows.
Compile u-boot for the Raspi2
sudo apt install make gcc bison flex
sudo apt-get install gcc-arm-none-eabi binutils-arm-none-eabi
export CROSS_COMPILE=arm-none-eabi-
export ARCH=arm
make rpi_2_defconfig all
Start QEMU
qemu-system-arm -M raspi2 -nographic -kernel ./u-boot/u-boot.bin
I also tried QEMU on the Windows side, and the result is the same.
PS C:\WINDOWS\system32> qemu-system-arm.exe -M raspi2 --nographic -kernel E:\u-boot\u-boot.bin
QEMU gave no output, and even pressing Ctrl+C could not stop the process.
Unfortunately this is an incompatibility between the way that u-boot expects to be started on the raspberry pi and the ways of starting binaries that QEMU supports for this board.
QEMU supports two ways of starting guest code on Arm in general:
Linux kernels; these boot with whatever the expected boot protocol for the kernel on this board is. For raspi that will be "start the primary CPU, but put the secondaries in the pen waiting on the mbox". Effectively, QEMU emulates a very minimal bit of the firmware, just enough to boot Linux.
Not Linux kernels; these are booted as if they were the first thing to execute on the raw hardware, which is to say that all CPUs start executing at once, and it is the job of the guest code to provide whatever penning of secondary CPUs it wants to do. That is, your guest code has to do the work of the firmware here, because it effectively is the firmware.
We assume that you're a Linux kernel if you're a raw image, or a suitable uImage. If you're an ELF image we assume you're not a Linux kernel. (This is not exactly ideal but we're to some extent lumbered with it for backwards-compatibility reasons.)
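The rule just described can be pictured with a small Python sketch that looks at the first bytes of an image. This is only an illustration of the stated heuristic, not QEMU's actual loader code: the function name classify_image is made up, and the sketch does not model what makes a uImage "suitable".

```python
import struct

ELF_MAGIC = b"\x7fELF"
UIMAGE_MAGIC = 0x27051956  # magic number at the start of a legacy uImage header


def classify_image(image_bytes):
    """Return (format, treated_as_linux) following the rule described above."""
    head = image_bytes[:4]
    if head == ELF_MAGIC:
        # ELF images are assumed not to be Linux kernels.
        return ("elf", False)
    if len(head) == 4 and struct.unpack(">I", head)[0] == UIMAGE_MAGIC:
        # The answer says "a suitable uImage"; this sketch simply treats
        # every uImage as a Linux kernel.
        return ("uimage", True)
    # Anything else is a raw image and is assumed to be a Linux kernel.
    return ("raw", True)
```

To try it on a real file, read the first few bytes in binary mode and pass them in, for example classify_image(open(path, "rb").read(64)).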
On the raspberry pi boards, the way the u-boot binary expects to be started is likely to be "as if the firmware launched it", which is not exactly the same as either of the two options QEMU supports. This mismatch tends to result in u-boot crashing (usually because it is not expecting the "all CPUs run at once" behaviour).
A fix would require either changes to u-boot so it can handle being launched the way QEMU launches it, or changes to QEMU to support more emulation of the firmware of this board (which QEMU upstream would be reluctant to accept).
An alternative approach if it's not necessary to use the raspi board in particular would be to use some other board like the 'virt' board which u-boot does handle in a way that allows it to boot on QEMU. (The 'virt' board also has better device support; for instance it can do networking and USB devices, which 'raspi' and 'raspi2' cannot at the moment.)

How to set CUDA parameters with GTX1080 for Tensorflow?

After I installed the driver for the GTX 1080, TensorFlow shows that it can find the cuDNN library.
However, modprobe fails to load the GPU driver module.
Detailed information is as follows:
$ python
[14:22:14]
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
>>> sess = tf.InteractiveSession()
modprobe: ERROR: could not insert 'nvidia_352_uvm': Invalid argument
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: work-data
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:347] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.27 Thu Jun 9 18:53:27 PDT 2016 GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) """
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 367.27.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
The GTX 1080 driver version is 367.27, as provided by NVIDIA.
I don't know why 'nvidia_352_uvm' shows up.
The result of nvidia-smi is here.
Maybe I need to reinstall CUDA, but I have already reinstalled it several times.
Should I remove all the CUDA libraries and the NVIDIA driver, then reinstall them all? Is there a required installation order for the two?
Too long for a comment, but here are some tips I've learned from trying to get NVIDIA drivers to play nicely with Ubuntu.
Installing a new driver on top of an existing driver gives a partially upgraded installation. You need to remove the previous stuff first.
sudo apt-get remove --purge nvidia-*
sudo rm /etc/X11/xorg.conf # if you ran nvidia-xconfig
Reload the NVIDIA driver as follows (from a text virtual terminal, e.g. CTRL+ALT+F1)
sudo service lightdm stop # stop your display manager
killall python # kill all running TensorFlow instances to free GPU
sudo modprobe -r nvidia
sudo modprobe nvidia
dmesg | tail -100 # check for error messages
Check the logs for any error messages from the NVIDIA driver:
dmesg | grep -i nvidia
lspci | grep -i nvidia
nvidia-smi # make sure this reports version 367.27
Also, there are two ways to install drivers: using Ubuntu's packages with sudo apt-get install nvidia-current, or downloading the installer from the NVIDIA website. I was not able to get the apt-get route to work for TensorFlow, so I would recommend downloading the driver from the NVIDIA website.
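The symptom in the question (modprobe trying to load 'nvidia_352_uvm' while the kernel module reports 367.27) is what a partially upgraded installation can look like. A minimal Python sketch for spotting such a mismatch is shown below. It assumes the version text comes from /proc/driver/nvidia/version (the "driver version file" quoted in the log) and that versioned module names look like nvidia_<branch>_uvm; all function names are hypothetical.

```python
import re


def loaded_driver_version(version_file_text):
    """Extract e.g. '367.27' from the NVRM line of the driver version file."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", version_file_text)
    return match.group(1) if match else None


def module_branch(module_name):
    """Extract the branch number from a name such as 'nvidia_352_uvm'."""
    match = re.fullmatch(r"nvidia_(\d+)(?:_\w+)?", module_name)
    return match.group(1) if match else None


def branches_match(version_file_text, module_name):
    """True or False when both values can be parsed, None otherwise."""
    version = loaded_driver_version(version_file_text)
    branch = module_branch(module_name)
    if version is None or branch is None:
        return None
    return version.split(".")[0] == branch
```

With the values from the question, branches_match returns False, which is consistent with leftovers of an older 352 package still being installed next to the 367.27 driver.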

Unable to get cuda to work in tensorflow

I'm trying to use CUDA to accelerate TensorFlow. I'm running TensorFlow using the Docker image.
Firstly, when I launch the GPU image, the LD_LIBRARY_PATH environment variable does not match the filesystem:
~# echo $LD_LIBRARY_PATH
/usr/local/nvidia/lib:/usr/local/nvidia/lib64:
root@d578acbbc2cd:~# ls /usr/local/
bin cuda cuda-7.0 etc games include lib man sbin share src
There's no nvidia directory there. When I try to run the convolutional.py demo, it can't initialise the cuda support:
# python models/image/mnist/convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 8
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open moddep file '/lib/modules/4.2.0-23-generic/modules.dep.bin'
E tensorflow/stream_executor/cuda/cuda_driver.cc:466] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:98] retrieving CUDA diagnostic information for host: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:106] hostname: d578acbbc2cd
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:131] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:242] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.68 Tue Dec 1 17:24:11 PST 2015
GCC version: gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:135] kernel reported version is: 352.68
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA:
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 8
It then goes on to train using cpu only.
# find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
So in the docker image, there's only the gnu cpu cuda implementation. No NVIDIA stuff. In the host ubuntu 15.10 session, I have libcuda.so installed:
$ find /usr -name libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so
/usr/lib/i386-linux-gnu/libcuda.so
/usr/local/cuda-7.5/targets/x86_64-linux/lib/stubs/libcuda.so
So these seem to be stubs ... not sure why.
Is there some trick to getting this to work?
Try rebuilding the Docker image directly from the Tensorflow repository (i.e. don't rely on the image on the container registry) and use https://github.com/NVIDIA/nvidia-docker to run the container (the Docker command described in the Tensorflow documentation is not portable).
I had a similar problem, though not in Docker. The libcuda.so in /usr/local/cuda/lib64/stubs was a broken symlink. When I searched for libcuda.so, it only turned up a file in a lib32 folder.
It seems that the problem was how I originally installed the NVIDIA device driver. At some point in the driver install process you're given the option to install the lib32 drivers. I had thought this meant in addition to the lib64 drivers, so I selected it. It turns out it only installs the lib32 drivers and not the lib64 ones.
I reinstalled the NVIDIA device driver, this time not selecting the lib32 'option'. Now TensorFlow finds libcuda.so.
I had the same problem running TensorFlow on an Ubuntu machine after I upgraded my driver to 352.63 and 352.93. (I remember it worked with 346.*, but when I try to install 346.*, it installs 352.* automatically for some reason.)
I finally figured out that it was caused by a permission issue (I could run it as root). So I changed the permissions of the libcuda.so.352-63 file to make it executable by anyone, and it works well now.
I hope this will be helpful to those still struggling with this issue.
I didn't try the Docker one, but I guess it's also caused by permission settings.
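To check whether permissions are a plausible cause before changing anything, a short Python sketch can show what non-owner users are allowed to do with the library file. The function name others_access is hypothetical, and the path to inspect is whatever libcuda.so resolves to on your machine.

```python
import os
import stat


def others_access(path):
    """Return (readable, executable) for the 'others' permission bits of path.

    os.stat follows symlinks, so passing a libcuda.so symlink inspects the
    file it finally points to.
    """
    mode = os.stat(path).st_mode
    return (bool(mode & stat.S_IROTH), bool(mode & stat.S_IXOTH))
```

For example, others_access("/usr/lib/x86_64-linux-gnu/libcuda.so") uses the library location that appears earlier in this thread; a result with False in it would support the permission theory.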
Try this command
sudo apt-get install nvidia-modprobe
As mentioned here:
https://github.com/tensorflow/tensorflow/issues/394
and
http://kkjkok.blogspot.in/2016_08_01_archive.html
After I updated the NVIDIA driver to 378.09 on Ubuntu 14.10 I had the same error,
although all the permissions on the lib files were set correctly.
Thanks to @PhoenixQ, I tried running with sudo and it worked.
After that I tried running without sudo one more time and the error had disappeared. I'm not sure what exactly happened, but maybe something was configured during the call with sudo that was not possible without sudo.
So the solution:
Try to run the same thing with sudo.
After this, try running without sudo. Worked for me.

Seeing "No CUDA capable GPU detected" after I upgraded to CUDA 6.5 from 5.5

Hey, I am receiving the error: No CUDA capable GPU detected
after I upgraded from CUDA 5.5 to CUDA 6.5.
The NVIDIA driver version I have is 331.49.
Is this compatible with CUDA 6.5, or what is the best stable driver version for CUDA 6.5?
CUDA 6.5 requires an r340 driver or newer. On Linux that would be 340.29 or higher.
331.49 won't work. Whatever method you used to "upgrade" from 5.5 to 6.5 was incomplete.
There are getting started guides for each supported OS that may help.
If you just want to load a new driver, you can select a driver appropriate for your GPU and OS at http://www.nvidia.com/Download/index.aspx?lang=en-us
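The version requirement stated in this answer can be turned into a quick check. The Python sketch below compares a driver version string against the minimum for a CUDA release. Only the CUDA 6.5 entry (340.29 on Linux) comes from this answer; the table and function names are hypothetical, and other releases are deliberately left out rather than guessed.

```python
# Minimum Linux driver per CUDA release. Only the 6.5 entry is taken from
# the answer above; add other releases from NVIDIA's release notes.
MIN_LINUX_DRIVER = {"6.5": (340, 29)}


def parse_version(text):
    """Turn '331.49' into (331, 49) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(cuda_release, driver_version):
    """True if driver_version meets the minimum known for cuda_release."""
    if cuda_release not in MIN_LINUX_DRIVER:
        raise KeyError(f"no minimum driver recorded for CUDA {cuda_release}")
    return parse_version(driver_version) >= MIN_LINUX_DRIVER[cuda_release]
```

With the numbers from the question, driver_supports("6.5", "331.49") is False, which matches the "No CUDA capable GPU detected" symptom.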
The best stable driver version for 6.5 is 346.
You can completely uninstall the driver with
sudo nvidia-uninstall
and then add the xorg-edgers repository, update the package lists, and install the desired driver version:
sudo add-apt-repository ppa:xorg-edgers/ppa
sudo apt-get update
sudo apt-get install nvidia-346
and then run
sudo nvidia-xconfig
and reboot after that.
After startup, verify the driver installation with:
nvidia-smi
It should print some information about the GPU.
After that you can verify the CUDA installation by running deviceQuery from the samples.

Caffe Installation Issue on Ubuntu 14.04

I successfully installed Caffe on my dual-boot laptop (GTX 860M, Windows 7 + Ubuntu 14.04.2). All the tests passed. When I restarted, however, Ubuntu got stuck on the boot splash screen (the one with the Ubuntu logo and five red dots). I don't know what to do about it.
Has anyone run into the same issue before? I reckon something goes wrong with the graphics card driver at boot. I installed the newest CUDA 7 Toolkit with the bundled NVIDIA drivers. Since all tests passed before I restarted, it seems that the driver would work once the system booted successfully.
The stuck screen looks like this: http://i.stack.imgur.com/pRtEF.jpg
I had a similar issue when trying to install Caffe on my system. The steps below worked for me, but they have at least one known issue (documented below).
I'm not sure what precisely caused this problem, but it surely has something to do with the Nvidia Driver and Cuda Toolkit installation and is not caused by Caffe.
After completing the steps below, I've been able to successfully install Caffe on my system with the following tutorials and guides:
Official Install Guide
Github Install Guide
Update
Recently, I had the exact same problem trying to make Cuda 7.5 work on Ubuntu 14.04; this approach also solved that problem. Specs:
CPU: Intel Core i7-4700MQ (4x 2.40 GHz with Hyperthreading)
GPU: NVidia GT 940M
RAM: 8 GB
HDD: 52.7 GB (of which 6.7 GB used after installation)
INSTALL NVIDIA DRIVER AND CUDA ON UBUNTU 14.04
Source: ubuntuforums.org/showthread.php?t=2246526
!! Known Issues !!
After the system has been suspended (or hibernated, not confirmed), all applications using the Nvidia Driver and Cuda 6.5 Toolkit will freeze. When this happens, the command sudo shutdown -r now will print the reboot message but nothing will happen.
Executed and tested on a fresh 64-bit Ubuntu 14.04 install with the following hardware specifications:
CPU: Intel Core i5-2410m (2x 2.30 GHz with Hyperthreading)
GPU: NVidia GT 540M
RAM: 6 GB
HDD: 52.7 GB (of which 8.6 GB used after installation)
The following command was executed before installation:
sudo apt-get install -y build-essential vim git llvm clang
The following steps resulted in a stable system with the latest Nvidia Driver and Cuda 6.5 Toolkit installed:
Remove all traces of previous/legacy Nvidia Drivers and Cuda Toolkits or perform a fresh Ubuntu 14.04 install.
Download the latest Nvidia Driver .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: NVIDIA-Linux-x86_64-346.35.run
Download the latest Cuda 6.5 Toolkit .run file for Ubuntu 14.04 and your system specifications to the ~/Downloads directory.
e.g.: cuda_6.5.14_linux_64.run
Blacklist the 'nouveau' Driver by appending the following lines to /etc/modprobe.d/blacklist.conf (nouveau is a free open-source driver for Nvidia cards, it is the default for Ubuntu 14.04):
blacklist nouveau
options nouveau modeset=0
Reboot the system, do NOT log in but drop to the terminal with CTRL+ALT+F1
Kill lightdm (replace 'lightdm' with your own Display Manager if you have changed it, lightdm is the default for Ubuntu 14.04):
sudo service lightdm stop
The next step is critical, make sure to check twice before continuing!
Run the Nvidia Driver installer with the --no-opengl-files option (the option prevents OpenGL files from being overwritten; without this option, Unity would not function properly and the screen would freeze after login):
sudo chmod +x ~/Downloads/NVIDIA-Linux-x86_64-346.35.run
sudo ~/Downloads/NVIDIA-Linux-x86_64-346.35.run --no-opengl-files
Accept the EULA and acknowledge all further warnings, but decline to install anything extra.
Reboot and login to the desktop, verify with the 'Additional Drivers' (System Settings > Software & Updates > Additional Drivers) utility that the manually installed driver is in use.
Open a terminal and install the Cuda 6.5 Toolkit:
sudo chmod +x ~/Downloads/cuda_6.5.14_linux_64.run
sudo ~/Downloads/cuda_6.5.14_linux_64.run
Accept the EULA, do NOT install the driver, install the Toolkit and the Examples (if you want to), leave all default directories in place.
Add the Cuda 6.5 Toolkit environment variables by appending the following lines to ~/.bashrc:
# For 32-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib
# For 64-bit systems, append these:
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64
The Nvidia Driver and Cuda 6.5 Toolkit should now be correctly installed.
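One pitfall when editing PATH-like variables such as LD_LIBRARY_PATH by hand is dropping entries that were already set, or leaving an empty entry behind. The Python sketch below shows one way to add a directory safely; the function name append_path_entry is hypothetical, and the directories are the ones that appear in this thread.

```python
def append_path_entry(current, new_entry):
    """Append new_entry to a colon-separated path list.

    Empty entries (from a leading, trailing or doubled colon) are dropped,
    and new_entry is not added a second time if it is already present.
    """
    entries = [entry for entry in current.split(":") if entry]
    if new_entry not in entries:
        entries.append(new_entry)
    return ":".join(entries)


if __name__ == "__main__":
    import os

    current = os.environ.get("LD_LIBRARY_PATH", "")
    print(append_path_entry(current, "/usr/local/cuda-6.5/lib64"))
```

Printing the result rather than exporting it keeps the sketch side-effect free; the printed value can be pasted into ~/.bashrc.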
Optional: confirm your Nvidia Driver and Cuda 6.5 Toolkit installation.
Confirm the Nvidia Driver installation by running the following command:
nvidia-smi
Confirm the Cuda Compiler installation by running the following command:
nvcc -V
Confirm everything works by building and running the optionally installed Cuda Examples: (build-essential is required to use 'make')
sudo apt-get install -y build-essential
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/deviceQuery
make
./deviceQuery
cd ~/NVIDIA_CUDA-6.5_SAMPLES/1_Utilities/bandwidthTest
make
./bandwidthTest
This problem is not related to caffe.
The problem is that the nVidia driver that is installed from the ubuntu software center does not support your card.
Uninstall any nvidia package (sudo apt-get purge nvidia-*) and install the latest driver version from the nvidia website.
I recommend switching to the Ubuntu 15.04 version of CUDA 7.5. I tried it on Ubuntu 14.04 and it solved this problem. When I installed the Ubuntu 14.04 version of CUDA 7.5 on Ubuntu 14.04, I encountered exactly this problem.