Best solution to have multiple CUDA/cuDNN versions installed on Ubuntu

I am using Conda on Ubuntu 16.04. My objective is to associate each Conda environment with a specific version of CUDA / cuDNN. I had a look around and found an interesting article, which basically suggests putting different CUDA versions into different folders and then using an environment-specific bash script (run when the environment is activated) to set the PATH/LD_LIBRARY_PATH variables appropriately (which creates the association with the CUDA version).
This is fine, but when I try to install frameworks such as PyTorch using Conda, it also forces me to install the "cudatoolkit" package.
So, a couple of questions:
1) Does downloading cudatoolkit mess up my previous CUDA configurations? Which version will be used?
2) If Conda can install "cudatoolkit" and also "cudnn", why not just use Conda for everything? Why even apply the instructions of the above-mentioned article?
Thank you.

As an answer to the first question: no, downloading and installing another CUDA toolkit won't mess up other configurations. In the CUDA toolkit installer you specify an installation directory, so just pick whatever works for you that is unique to that CUDA version. This won't affect any currently installed CUDA versions. A PyTorch install will look for a CUDA_HOME environment variable as well as in /usr/local/cuda (the default CUDA toolkit install directory), so it's just this environment variable that needs to be changed.
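One way to wire this up per environment is a Conda activation hook, which Conda sources every time the environment is activated. A minimal sketch, assuming CUDA 10.0 is installed under /usr/local/cuda-10.0 (the path and the hook file name cuda.sh are assumptions; adjust per environment):
# Run once, with the target environment activated
mkdir -p "$CONDA_PREFIX/etc/conda/activate.d"
cat > "$CONDA_PREFIX/etc/conda/activate.d/cuda.sh" <<'EOF'
# Point this environment at a specific CUDA install (example path, adjust to yours)
export CUDA_HOME=/usr/local/cuda-10.0
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
EOF
A matching script in etc/conda/deactivate.d can restore the previous values when you leave the environment.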
I can't speak to the second part. Perhaps the Conda installation uses the default installation directory for the CUDA toolkit (which seems silly, but this is just speculation).

Related

No "nvcc" in "cuda-10.2/bin" toolkit patch

I have a working cuda-10.0 toolkit and a 470 driver. I need to use the new virtual memory management features that appeared in CUDA 10.2. And I can't install anything newer than 10.x because my old video card has compute capability 3.0.
So after installing the new toolkit with:
sudo sh ./cuda_10.2.1_linux.run --toolkit --silent --override
it seems to have installed successfully. But now the "cuda-10.2" folder contains almost nothing: the "bin" folder has only the uninstaller, with no "nvcc" or anything else, and the newly created symlink points to that "nothing". How do I deal with this?
I checked the official docs and tried Googling, but found nothing.
The patch updates for CUDA 10.2 do not contain complete toolkits. The idea behind a "patch" is that it contains only the files necessary to address the items that the patch is focused on.
To get a full CUDA 10.2 toolkit install, you must first install using a full CUDA 10.2 toolkit installer; a typical filename for that would be cuda_10.2.89_440.33.01_linux.run (a runfile installer, to match your indicated runfile usage). After that, if you decide you need/want the items addressed by the patch, you must also install the desired patch.
Note the statement on the download page:
These patches require the base installer to be installed first.
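In runfile terms, the order would look like this (the base installer name is taken from the answer above; the flags mirror the question's invocation):
# 1) Full toolkit first
sudo sh ./cuda_10.2.89_440.33.01_linux.run --toolkit --silent --override
# 2) Then the patch on top of it
sudo sh ./cuda_10.2.1_linux.run --toolkit --silent --override
# nvcc should now be present
ls /usr/local/cuda-10.2/bin/nvcc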

nvcc not found, but CUDA runs fine?

I was trying to run nvcc -V to check the CUDA version, but I got the following error message:
Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit
But GPU acceleration works fine for training models on CUDA. Is there another way to find out the CUDA compiler tools version? I know nvidia-smi doesn't give the right version.
Is there a way to install or configure nvcc so I don't have to install a whole new toolkit?
Most of the time, nvcc and other CUDA SDK binaries are not in the environment variable PATH. Check the installation path of CUDA; if it is installed under /usr/local/cuda, add its bin folder to the PATH variable in your ~/.bashrc:
export CUDA_HOME=/usr/local/cuda
export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH
You can apply the changes with source ~/.bashrc; from your next login onward, everything is set automatically.
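A quick sanity check after reloading the shell:
source ~/.bashrc
which nvcc   # should print /usr/local/cuda/bin/nvcc
nvcc -V      # prints the CUDA compiler version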
As @pQB and @talonmies mentioned above, you only need to install the GPU drivers (versions 430-470 these days) to use PyTorch. If you are using your GPU's display port, you should be fine.
For the CUDA compilation tools you need to install the whole toolkit, which includes the driver as well. If you install the downloaded runfile manually from the CLI, the installer gives you the option to choose which components to install or skip.
Generally, it is recommended to install the compilation tools (which are system-wide) and the GPU drivers together, because this avoids compatibility issues.
Append:
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
to
~/.bashrc
Note: your path to CUDA may include a version, so navigate to /usr/local/ and check for a versioned directory such as cuda-10.2, then modify the commands in ~/.bashrc to point to it.
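As a sketch of the whole flow (the runfile name and version here are illustrative, not from this thread):
# Install the toolkit from the runfile; the interactive menu lets you skip the driver
sudo sh ./cuda_11.0.2_450.51.05_linux.run
# Find the versioned install directory
ls -d /usr/local/cuda-*
# Append the matching exports to ~/.bashrc, e.g. for cuda-11.0:
echo 'export PATH="/usr/local/cuda-11.0/bin:$PATH"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH="/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH"' >> ~/.bashrc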

Is it possible to run multiple CUDA versions on Windows?

I am working on a chest X-ray project, and I want multiple versions of the CUDA toolkit, but the problem is that my system only picks up the latest version I installed.
Is it possible to run any CUDA version, like 9.0, 10.2, or 11.0, as required by the GitHub code?
I have done all the initial steps, like adding the paths to the environment variables and copying the cuDNN files into place.
Now the problem is that I want to use CUDA 9.0, as my code requires, but the default setting picks up CUDA 11.0. What is the solution, or a script, to switch easily between these versions?
You may set CUDA_PATH_V9_0, CUDA_PATH_V10_0, etc. properly, then set CUDA_PATH to any one of them (e.g. CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0).
Then in your VS project, set your CUDA library path using CUDA_PATH (e.g. $(CUDA_PATH)\lib).
To switch, just set CUDA_PATH to another version, then clean and rebuild your VS project(s).

AllenNLP Server: pip is looking at multiple versions of each package

Within my Conda environment with Python 3.6.9, I've installed AllenNLP 9.2.0. I tried to install AllenNLP Server following the instructions from https://github.com/allenai/allennlp-server by running pip install --editable .
However, the installation procedure never finishes the compatibility checks for several modules, e.g.: pip is looking at multiple versions of tqdm to determine which version is compatible with other requirements. This could take a while. Collecting tqdm>=4.19
Does anybody know what is happening here? Should I add more restrictions to setup.py in AllenNLP Server? However, there isn't any code included in that file.
Thanks a lot for your help.
I just tried it with AllenNLP 2.0.1 (the latest), and while it takes a long time, it does eventually resolve the packages.
That said, I would recommend two things:
Use Python 3.8 instead.
If it still doesn't work, pin the version of tqdm tightly in the requirements. My run automatically picked tqdm==4.56.2, just for reference.
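For example, pre-installing a pinned tqdm before the editable install narrows the resolver's search (the version number is the one the answer above happened to get; treat it as illustrative):
# Pin tqdm first so pip doesn't backtrack through old releases
pip install tqdm==4.56.2
# Then install AllenNLP Server in editable mode
pip install --editable .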

Install multiple versions of CUDA and cuDNN

I am currently using CUDA version 7.5 with cuDNN version 5 for MatConvNet. I'd like to install CUDA 8.0 with cuDNN version 5.1, and I want to know if there will be any conflicts if I have the environment paths pointing to both versions of CUDA and cuDNN.
The only environment variables that matter are PATH and LD_LIBRARY_PATH. There shouldn't be any conflicts due to LD_LIBRARY_PATH, since all the libs' sonames seem to be bumped properly in each version. As for PATH, the shell will execute the version from the path that appears first in the variable. So there is no point in PATH containing both versions at the same time; you'll need to decide which version to use at any given time.
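For example, to select CUDA 8.0 for the current shell (paths assume the default /usr/local/cuda-X.Y layout):
export PATH=/usr/local/cuda-8.0/bin:$PATH          # 8.0's nvcc now shadows 7.5's
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
which nvcc   # -> /usr/local/cuda-8.0/bin/nvcc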
There's a good article that describes all the steps. The important ones for me were:
Run the CUDA install script with the --silent --toolkit --override options.
Set LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64.
Change the /usr/local/cuda symbolic link to point back to the default version.
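Put together, a minimal sketch of those steps (the runfile name is illustrative; adjust versions to what you actually install, and the "default" 8.0 symlink target is an assumption):
# Install CUDA 9.0 into its own versioned directory, without touching the default
sudo sh ./cuda_9.0.176_linux.run --silent --toolkit --override --toolkitpath=/usr/local/cuda-9.0
# Point the current shell at 9.0's libraries
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
# Keep the default symlink on the version you normally use
sudo ln -sfn /usr/local/cuda-8.0 /usr/local/cuda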