How does anaconda pick cudatoolkit

How does anaconda pick cudatoolkit - cuda

I have multiple enviroments of anaconda with different cuda toolkits installed on them.
env1 has cudatoolkit 10.0.130
env2 has cudatoolkit 10.1.168
env3 has cudatoolkit 10.2.89
I found these by running conda list on each environment.
When i do nvidia-smi i get the following output no matter which environment i am in
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... Off | 00000000:01:00.0 On | N/A |
| 0% 42C P8 7W / 260W | 640MiB / 11016MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
Is the cuda version shown above is same as cuda toolkit version?
If so why is it same in all the enviroments?
In env3 which has cudatoolkit version 10.2.89, i tried installing cupy library using the command pip install cupy-cuda102 .
I get the following error when i try to do it.
ERROR: Could not find a version that satisfies the requirement cupy-cuda102 (from versions: none)
ERROR: No matching distribution found for cupy-cuda102
I was able to install using pip install cupy-cuda101 which is for cuda 10.1.
Why is it not able to find cudatoolkit 10.2?
The reason i am asking this question is because, i am getting an error cupy.cuda.cublas.CUBLASError: CUBLAS_STATUS_NOT_INITIALIZED when i am running a deep learning model. I am just wondering if cudatoolkit version has something to do with this error.Even if this error is not related to cudatoolkit version i want to know how anaconda uses cudatoolkit.

This isn't really answering the original question, but the follow up ones:
tensorflow and pytorch can be installed directly through anaconda without explicitly downloading the cudatoolkit from nvidia. It only needs gpu driver installed. In this case nvcc is not installed still it works fine. how does it work in this case?
In general, GPU packages on Anaconda/Conda-Forge are built using Anaconda's new CUDA compiler toolchain. It is made in such a way that nvcc and friends are split from the rest of runtime libraries (cuFFT, cuSPARSE, etc) in CUDA Toolkit. The latter is packed in the cudatoolkit package, is made as a run-dependency (in conda's terminology), and is installed when you install GPU packages like PyTorch.
Then, GPU packages are compiled, linked to cudatoolkit, and packaged, which is the reason you only need the CUDA driver to be installed and nothing else. The system's CUDA Toolkit, if there's any, is by default ignored due to this linkage, unless the package (such as Numba) has its own way to look up CUDA libraries in runtime.
It's worth mentioning that the installed cudatoolkit does not always match your driver. In that event, you can explicitly constrain its version (say 10.0):
conda install some_gpu_package cudatoolkit=10.0
what happens when the environment in which tensorflow is installed is activated? Does conda create environment variables for accessing cuda libraries just when the environment is activated?
Conda always sets up some env vars when an env is activated. I am not fully sure about tensorflow, but most likely when it's built, it's linked to CUDA runtime libraries (cudatoolkit in other words). So, when launching tensorflow or other GPU apps, they will use the cudatoolkit installed in the same conda env.

Is the cuda version shown above is same as cuda toolkit version?
It has nothing to do with CUDA toolkit versions.
If so why is it same in all the enviroments [sic]?
Because it is a property of the driver. It is the maximum CUDA version that the active driver in your system supports. And when you try and use CUDA 10.2, it is why nothing works. Your driver needs to be updated to support CUDA 10.2.

Related

Installing cuda 10.1 but getting 10.2 [duplicate]

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So when I run
$ which nvcc
/usr/local/cuda-9.2/bin/nvcc
However, when I run
$ nvidia-smi
Wed Nov 21 19:41:32 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72 Driver Version: 410.72 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:01:00.0 Off | N/A |
| N/A 53C P0 26W / N/A | 379MiB / 6078MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1324 G /usr/lib/xorg/Xorg 225MiB |
| 0 2844 G compiz 146MiB |
| 0 15550 G /usr/lib/firefox/firefox 1MiB |
| 0 19992 G /usr/lib/firefox/firefox 1MiB |
| 0 23605 G /usr/lib/firefox/firefox 1MiB |
So am I using cuda9.2 as which nvcc suggests, or am I using cuda10 as nvidia-smi suggests? I saw this answer but it does not provide direct answer to the confusion, it just asks us to reinstall the CUDA Toolkit, which I already did.

CUDA has 2 primary APIs, the runtime and the driver API. Both have a corresponding version (e.g. 8.0, 9.0, etc.)
The necessary support for the driver API (e.g. libcuda.so on linux) is installed by the GPU driver installer.
The necessary support for the runtime API (e.g. libcudart.so on linux, and also nvcc) is installed by the CUDA toolkit installer (which may also have a GPU driver installer bundled in it).
In any event, the (installed) driver API version may not always match the (installed) runtime API version, especially if you install a GPU driver independently from installing CUDA (i.e. the CUDA toolkit).
The nvidia-smi tool gets installed by the GPU driver installer, and generally has the GPU driver in view, not anything installed by the CUDA toolkit installer.
Recently (somewhere between 410.48 and 410.73 driver version on linux) the powers-that-be at NVIDIA decided to add reporting of the CUDA Driver API version installed by the driver, in the output from nvidia-smi.
This has no connection to the installed CUDA runtime version.
nvcc, the CUDA compiler-driver tool that is installed with the CUDA toolkit, will always report the CUDA runtime version that it was built to recognize. It doesn't know anything about what driver version is installed, or even if a GPU driver is installed.
Therefore, by design, these two numbers don't necessarily match, as they are reflective of two different things.
If you are wondering why nvcc -V displays a version of CUDA you weren't expecting (e.g. it displays a version other than the one you think you installed) or doesn't display anything at all, version wise, it may be because you haven't followed the mandatory instructions in step 7 (prior to CUDA 11) (or step 6 in the CUDA 11 linux install guide) of the cuda linux install guide
Note that although this question mostly has linux in view, the same concepts apply to windows CUDA installs. The driver has a CUDA driver version associated with it (which can be queried with nvidia-smi, for example). The CUDA runtime also has a CUDA runtime version associated with it. The two will not necessarily match in all cases.
In most cases, if nvidia-smi reports a CUDA version that is numerically equal to or higher than the one reported by nvcc -V, this is not a cause for concern. That is a defined compatibility path in CUDA (newer drivers/driver API support "older" CUDA toolkits/runtime API). For example if nvidia-smi reports CUDA 10.2, and nvcc -V reports CUDA 10.1, that is generally not cause for concern. It should just work, and it does not necessarily mean that you "actually installed CUDA 10.2 when you meant to install CUDA 10.1"
If nvcc command doesn't report anything at all (e.g. Command 'nvcc' not found...) or if it reports an unexpected CUDA version, this may also be due to an incorrect CUDA install, i.e the mandatory steps mentioned above were not performed correctly. You can start to figure this out by using a linux utility like find or locate (use man pages to learn how, please) to find your nvcc executable. Assuming there is only one, the path to it can then be used to fix your PATH environment variable. The CUDA linux install guide also explains how to set this. You may need to adjust the CUDA version in the PATH variable to match your actual CUDA version desired/installed.
Similarly, when using docker, the nvidia-smi command will generally report the driver version installed on the base machine, whereas other version methods like nvcc --version will report the CUDA version installed inside the docker container.
Similarly, if you have used another installation method for the CUDA "toolkit" such as Anaconda, you may discover that the version indicated by Anaconda does not "match" the version indicated by nvidia-smi. However, the above comments still apply. Older CUDA toolkits installed by Anaconda can be used with newer versions reported by nvidia-smi, and the fact that nvidia-smi reports a newer/higher CUDA version than the one installed by Anaconda does not mean you have an installation problem.
Here is another question that covers similar ground. The above treatment does not in any way indicate that this answer is only applicable if you have installed multiple CUDA versions intentionally or unintentionally. The situation presents itself any time you install CUDA. The version reported by nvcc and nvidia-smi may not match, and that is expected behavior and in most cases quite normal.

nvcc is in the CUDA bin folder - as such check if the CUDA bin folder has been added to your $PATH.
Specifically, ensure that you have carried out the CUDA Post-Installation actions (e.g. from here):
Add the CUDA Bin to $PATH (i.e. add the following line to your ~/.bashrc)
export PATH=/usr/local/cuda-10.1/bin:/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}
PS. Ensure the following two paths above, exist first: /usr/local/cuda-10.1/bin and /usr/local/cuda-10.1/NsightCompute-2019.1 (the NsightCompute path could have a slightly different ending depending on the version of Nsight compute installed...
Update $LD_LIBRARY_PATH (i.e. add the following line to your ~/bashrc).
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
After this, both nvcc and nvidia-smi (or nvtop) report the same version of CUDA...

If you are using cuda 10.2 :
export PATH=/usr/local/cuda-10.2/bin:/opt/nvidia/nsight-compute/2019.5.0${PATH:+:${PATH}}
might help because when I checked, there was no directory for nsight-compute in cuda-10.2.
I am not sure if this was just the problem with me or else why wouldn't they mention it in the official documentation.

Adding onto Robert Crovella's answer...
The difference between the device driver and the runtime driver is that, with device driver you will be able to run compiled CUDA C code. That is, you can download CUDA powered applications and they will be able to successfully execute their code on your GPU.
Whereas, with the runtime driver you will be able to able to compile the CUDA C code, which then will be executed with the help of the device driver on your GPU.
Section 2.2.3 - Cuda Development Toolkit

nvidia-smi can show a “different CUDA version” from the one that is reported by nvcc. Because they are reporting two different things:
nvidia-smi shows that maximum available CUDA version support for a given GPU driver.
And the 2nd thing which nvcc -V reports is the CUDA version that is currently being used by the system.
In short
nvidia-smi shows the highest version of CUDA supported by your driver. nvcc -V shows the version of the current CUDA installation. As long as your driver-supported version is higher than your installed version, it's fine. You can even have several versions of CUDA installed at the same time.

difference between nvidia-smi command and conda list cudatoolkit on anaconda prompt [duplicate]

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So when I run
$ which nvcc
/usr/local/cuda-9.2/bin/nvcc
However, when I run
$ nvidia-smi
Wed Nov 21 19:41:32 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72 Driver Version: 410.72 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:01:00.0 Off | N/A |
| N/A 53C P0 26W / N/A | 379MiB / 6078MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1324 G /usr/lib/xorg/Xorg 225MiB |
| 0 2844 G compiz 146MiB |
| 0 15550 G /usr/lib/firefox/firefox 1MiB |
| 0 19992 G /usr/lib/firefox/firefox 1MiB |
| 0 23605 G /usr/lib/firefox/firefox 1MiB |
So am I using cuda9.2 as which nvcc suggests, or am I using cuda10 as nvidia-smi suggests? I saw this answer but it does not provide direct answer to the confusion, it just asks us to reinstall the CUDA Toolkit, which I already did.

nvcc is in the CUDA bin folder - as such check if the CUDA bin folder has been added to your $PATH.
Specifically, ensure that you have carried out the CUDA Post-Installation actions (e.g. from here):
Add the CUDA Bin to $PATH (i.e. add the following line to your ~/.bashrc)
export PATH=/usr/local/cuda-10.1/bin:/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}
PS. Ensure the following two paths above, exist first: /usr/local/cuda-10.1/bin and /usr/local/cuda-10.1/NsightCompute-2019.1 (the NsightCompute path could have a slightly different ending depending on the version of Nsight compute installed...
Update $LD_LIBRARY_PATH (i.e. add the following line to your ~/bashrc).
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
After this, both nvcc and nvidia-smi (or nvtop) report the same version of CUDA...

If you are using cuda 10.2 :
export PATH=/usr/local/cuda-10.2/bin:/opt/nvidia/nsight-compute/2019.5.0${PATH:+:${PATH}}
might help because when I checked, there was no directory for nsight-compute in cuda-10.2.
I am not sure if this was just the problem with me or else why wouldn't they mention it in the official documentation.

Adding onto Robert Crovella's answer...
The difference between the device driver and the runtime driver is that, with device driver you will be able to run compiled CUDA C code. That is, you can download CUDA powered applications and they will be able to successfully execute their code on your GPU.
Whereas, with the runtime driver you will be able to able to compile the CUDA C code, which then will be executed with the help of the device driver on your GPU.
Section 2.2.3 - Cuda Development Toolkit

nvidia-smi can show a “different CUDA version” from the one that is reported by nvcc. Because they are reporting two different things:
nvidia-smi shows that maximum available CUDA version support for a given GPU driver.
And the 2nd thing which nvcc -V reports is the CUDA version that is currently being used by the system.
In short
nvidia-smi shows the highest version of CUDA supported by your driver. nvcc -V shows the version of the current CUDA installation. As long as your driver-supported version is higher than your installed version, it's fine. You can even have several versions of CUDA installed at the same time.

CUDA version mismatch [duplicate]

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So when I run
$ which nvcc
/usr/local/cuda-9.2/bin/nvcc
However, when I run
$ nvidia-smi
Wed Nov 21 19:41:32 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72 Driver Version: 410.72 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... Off | 00000000:01:00.0 Off | N/A |
| N/A 53C P0 26W / N/A | 379MiB / 6078MiB | 2% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1324 G /usr/lib/xorg/Xorg 225MiB |
| 0 2844 G compiz 146MiB |
| 0 15550 G /usr/lib/firefox/firefox 1MiB |
| 0 19992 G /usr/lib/firefox/firefox 1MiB |
| 0 23605 G /usr/lib/firefox/firefox 1MiB |
So am I using cuda9.2 as which nvcc suggests, or am I using cuda10 as nvidia-smi suggests? I saw this answer but it does not provide direct answer to the confusion, it just asks us to reinstall the CUDA Toolkit, which I already did.

nvcc is in the CUDA bin folder - as such check if the CUDA bin folder has been added to your $PATH.
Specifically, ensure that you have carried out the CUDA Post-Installation actions (e.g. from here):
Add the CUDA Bin to $PATH (i.e. add the following line to your ~/.bashrc)
export PATH=/usr/local/cuda-10.1/bin:/usr/local/cuda-10.1/NsightCompute-2019.1${PATH:+:${PATH}}
PS. Ensure the following two paths above, exist first: /usr/local/cuda-10.1/bin and /usr/local/cuda-10.1/NsightCompute-2019.1 (the NsightCompute path could have a slightly different ending depending on the version of Nsight compute installed...
Update $LD_LIBRARY_PATH (i.e. add the following line to your ~/bashrc).
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
After this, both nvcc and nvidia-smi (or nvtop) report the same version of CUDA...

If you are using cuda 10.2 :
export PATH=/usr/local/cuda-10.2/bin:/opt/nvidia/nsight-compute/2019.5.0${PATH:+:${PATH}}
might help because when I checked, there was no directory for nsight-compute in cuda-10.2.
I am not sure if this was just the problem with me or else why wouldn't they mention it in the official documentation.

Adding onto Robert Crovella's answer...
The difference between the device driver and the runtime driver is that, with device driver you will be able to run compiled CUDA C code. That is, you can download CUDA powered applications and they will be able to successfully execute their code on your GPU.
Whereas, with the runtime driver you will be able to able to compile the CUDA C code, which then will be executed with the help of the device driver on your GPU.
Section 2.2.3 - Cuda Development Toolkit

nvidia-smi can show a “different CUDA version” from the one that is reported by nvcc. Because they are reporting two different things:
nvidia-smi shows that maximum available CUDA version support for a given GPU driver.
And the 2nd thing which nvcc -V reports is the CUDA version that is currently being used by the system.
In short
nvidia-smi shows the highest version of CUDA supported by your driver. nvcc -V shows the version of the current CUDA installation. As long as your driver-supported version is higher than your installed version, it's fine. You can even have several versions of CUDA installed at the same time.

nvprof application not found

I am trying to use Nvidia nvprof to profile my CUDA and OpenCL programs. However, whatever benchmark I choose, the only output is ======== Error: application not found. I have tried both CUDA and OpenCL benchmarks, and recompiled them several times, but it seems helpless.
My CUDA version: 4.2
NVIDIA Driver version: 334.21

Different from AMD sprofile, ./ is needed before the application name on Linux.
So you can call the profiler with this command:
nvprof ./ApplicationName

Compiling CUDA SDK examples in hardware emulation mode

I'm trying to do some CUDA development on a PC without CUDA-capable GPU via emulation mode. The OS is Linux Mint Debian (can be considered Debian testing for all practical purposes) 32bit (2.6.32-5-686 kernel). Here's what I did so far:
Grabbed the CUDA Toolkit 32 bit and SDK for Ubuntu from http://developer.nvidia.com/cuda-toolkit-40
Installed the CUDA Toolkit in /usr/local/cuda/lib
Added the paths to bashrc
echo "# CUDA stuff
PATH=\$PATH:/usr/local/cuda/bin
LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/usr/local/cuda/lib
export PATH
export LD_LIBRARY_PATH" >> ~/.bashrc
Added the path to /etc/ld.so.conf.d/cuda.conf:
/usr/local/cuda/lib
Executed "sudo ldconfig"
Restarted the session
Then installed the SDK in /home/user/NVIDIA_GPU_Computing_SDK folder
When I got to NVIDIA_GPU_Computing_SDK/C and type "make emu=1" to compile the examples I get:
nvcc warning : option 'device-emulation' has been deprecated and is ignored
/usr/bin/ld: cannot find -lcudartemu
/usr/bin/ld: cannot find -lcudartemu
collect2: ld returned 1 exit status
Seems like a library missing (rt = runtime ?). There is libcudart3 in the package manager, but wants a whole bunch of nvidia stuff as a dependency, including drivers and I don't even have an NVIDIA card on this machine. Also apparently the GPU emulation is now deprecated... Does anybody have some experience with CUDA emulation?

There is no emulation in CUDA any more. It was deprecated and removed during the 3.x release cycle. There is no emulation support beyond CUDA 3.1 IIRC. Certainly there is nothing you can do in CUDA 4.0.
On Linux, your best bet is to try gpuocelot, which provides a PTX level emulation on x86 processors and a reimplementation of the CUDA APIs.

Although I agree with the suggestion to try Ocelot, when I was in the same boat I found it easiest to go on eBay and get a cheap CUDA capable card to use for testing (I think I paid < $40). If you have the ability to open the hardware (I realize this isn't an option for some people) and to install drivers, that's what I'd suggest.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008