Does CUDA compilation rely on presence of graphics card? [duplicate] - cuda

This question already has an answer here:
Is CUDA hardware needed at compile time?
(1 answer)
Closed 8 years ago.
Suppose, hypothetically, that I want to test compile, but not run, CUDA code on a machine that has no CUDA capable GPU present. Should I be able to do that with only the CUDA Toolkit installed? Or does NVCC rely on the presence of graphics card hardware in any way?

Certainly on Linux, you can install the CUDA toolkit and compile code without a GPU installed. There are nuances to this. For example, if your code depends on a library that is only installed by the driver (such as the libraries required by CUDA code using the Driver API), then there are additional bridges to cross. But ordinary CUDA runtime API code can be compiled this way just fine. nvcc does not depend on a GPU.
I haven't actually tried this in Windows, but I think it should be possible to install the CUDA toolkit without a CUDA GPU.

Related

Does CUDA 11.2 support backward compatibility with an application that is compiled on CUDA 10.2?

I have the base image for my application built with nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04. I have to run that application on a cluster that reports the following CUDA version:
NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2.
My application does not give correct prediction results for the GPU-trained model (it returns the base score as the prediction output). However, it returns accurate prediction results for the CPU-trained model. So I suspect a CUDA version incompatibility between the two. I want to know whether CUDA 11.2 works with an application that is compiled with CUDA 10.2.
Yes, it is possible for an application compiled with CUDA 10.2 to run in an environment that has CUDA 11.2 installed. This is part of the CUDA compatibility model/system.
Otherwise, there isn't enough information in this question to diagnose why your application is behaving the way you describe. For that, SO expects a minimal reproducible example.
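As a first diagnostic for a mismatch like this, it can help to print the CUDA version the installed driver supports next to the runtime version the application was built against. The sketch below uses cudaDriverGetVersion and cudaRuntimeGetVersion from the runtime API. The helper names version_major and version_minor are made up for this illustration, and the "driver at least as new as the runtime" check is only the basic rule: NVIDIA's documented compatibility options may relax it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000 * major + 10 * minor, so 11020 means 11.2.
// These two helpers are hypothetical names used only in this sketch.
int version_major(int v) { return v / 1000; }
int version_minor(int v) { return (v % 1000) / 10; }

int main() {
    int driver = 0, runtime = 0;

    // Highest CUDA version supported by the installed driver (0 if no driver).
    if (cudaDriverGetVersion(&driver) != cudaSuccess) driver = 0;
    // Version of the CUDA runtime this binary was built against.
    if (cudaRuntimeGetVersion(&runtime) != cudaSuccess) runtime = 0;

    std::printf("Driver supports CUDA up to: %d.%d\n",
                version_major(driver), version_minor(driver));
    std::printf("Runtime used by this binary: %d.%d\n",
                version_major(runtime), version_minor(runtime));

    if (driver == 0) {
        std::puts("No CUDA driver found; the program can be built but not run here.");
    } else if (driver < runtime) {
        std::puts("Driver is older than the runtime; expect failures at run time.");
    } else {
        std::puts("Driver is at least as new as the runtime.");
    }
    return 0;
}
```

If this reports a driver at least as new as the runtime (as a 10.2 build on an 11.2 driver should), the wrong predictions most likely have a different cause than the CUDA version.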

Can a Cuda application built and running on Jetson TX2 run on Jetson Xavier?

I have a Cuda application that was built with Cuda Toolkit 9.0 and running fine on Jetson TX2 board.
I now have a Jetson Xavier board, flashed with Jetpack 4 that installs Cuda Toolkit 10.0 (only 10.0 is available).
What do I need to do if I want to run the same application on Xavier?
Nvidia documentation suggests that as long as I specify the correct target hardware when running nvcc, I should be able to run on future hardware thanks to JIT compilation. But does this hold for different versions of the Cuda toolkit (9 vs 10)?
In theory (and note I don't have access to a Xavier board to test anything), you should be able to run a cross compiled CUDA 9 application (and that might mean both ARM and GPU architecture settings) on a CUDA 10 host.
What you will need to make sure is that you either statically link or copy all the CUDA runtime API library components you require with your application on the Xavier board. Note that there is still an outside chance that those libraries might lack the necessary GPU and ARM features to run correctly on a Xavier system, or more subtle issues like libC incompatibility. That you will have to test for yourself.

Is there a way to compile CUDA programs in a machine that does not have NVIDIA graphics card? [duplicate]

I tried to install the CUDA toolkit without the display driver on CentOS 6. It installed properly. I was able to compile, but the compiled program does not perform any operation and I get garbage values in the array addition. For cudaGetDeviceCount(&count) I get a value of 0, which means I don't have any card in my machine.
You can install the CUDA toolkit without installing the driver.
You can then compile CUDA codes that use the runtime API.
You will not be able to run those codes unless you have a proper CUDA driver and GPU installed in the machine, however.
Codes that depend on the driver API will also not be compilable in this configuration, on older CUDA toolkits, without additional work. Newer CUDA toolkits provide stub libraries for driver libraries, which can be linked against.
This answer covers the method to install the CUDA toolkit without the driver.
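The garbage values described in the question are what an unchecked program typically shows on a machine without a GPU: every CUDA call fails, and the output buffer is never written. A minimal sketch of an array addition that checks each step might look like the following. The CUDA_CHECK macro and the all_equal helper are names invented for this example, not part of the toolkit.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a readable message when a CUDA runtime call fails.
// CUDA_CHECK is a hypothetical helper defined here, not a toolkit macro.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Element-wise c = a + b.
__global__ void add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Host-side check that every element equals the expected value.
bool all_equal(const std::vector<float>& v, float expected) {
    for (float x : v) {
        if (x != expected) return false;
    }
    return true;
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        // Expected on a machine with no driver or no CUDA-capable GPU.
        std::fprintf(stderr, "No usable CUDA device (count = %d, status: %s)\n",
                     count, cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    const int n = 1024;
    const size_t bytes = n * sizeof(float);
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n, 0.0f);

    float *da = nullptr, *db = nullptr, *dc = nullptr;
    CUDA_CHECK(cudaMalloc(&da, bytes));
    CUDA_CHECK(cudaMalloc(&db, bytes));
    CUDA_CHECK(cudaMalloc(&dc, bytes));
    CUDA_CHECK(cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice));

    add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports execution errors

    CUDA_CHECK(cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost));
    std::printf("result %s\n", all_equal(hc, 3.0f) ? "ok" : "wrong");

    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return 0;
}
```

This should compile with nvcc on a toolkit-only machine. When run there, it is expected to stop at the device count check with an error message instead of printing uninitialized data.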
If you just want to run the code and profile its performance and other parameters, it would be helpful to install the GPGPU-Sim simulator. It doesn't need any graphics card on your machine.

cudart_static - when is it necessary?

Since newer drivers ship with the CUDA runtime (I can choose 9.1 or 9.2 in the drivers download page) my question is: should my library (which uses a CUDA kernel internally) be shipped with -lcudart_static?
I had issues launching kernels compiled with 9.2 on systems which used 9.1 CUDA drivers. What's the most 'compatible' way of ensuring my library will run everywhere a recent CUDA driver is installed? (I'm already compiling for a virtual architecture)
"Since newer drivers ship with the CUDA runtime (I can choose 9.1 or 9.2 in the drivers download page)"
No, that's incorrect. That choice in the drivers download page is related to the fact that each CUDA version has a minimum required driver version associated with it. It does not mean that the driver ships with the CUDA runtime (stated another way, the driver does not install libcudart.so on linux and never has - with some careful experimentation on a clean install, you can prove this to yourself.)
Some additional comments:
-lcudart_static is actually the default for current/recent versions of nvcc. You can discover this by reading the nvcc manual. Therefore, by default, your executable, when compiled/built with nvcc should already be statically linked to the CUDA runtime library corresponding to the version of nvcc that you are using for compilation. The reason you might need to specify this or something like this is if you are building an application with e.g. the gnu toolchain (on linux) rather than nvcc.
The purpose of static linking to the CUDA runtime library is, as you surmise, so that an application can be built in such a way that it does not need an installation of the CUDA toolkit to run properly. It only needs a machine with a proper GPU driver install.
The most compatible way to ensure that an application will run on a range of machines with a range of GPU driver installs is to compile your application using the oldest CUDA toolkit required to meet the needs of the earliest GPU driver in the range you intend to cover. Again, you can refer to the table here.

Can I compile a cuda program without having a cuda device

Is it possible to compile a CUDA program without having a CUDA capable device on the same node, using only NVIDIA CUDA Toolkit...?
The answer to your question is YES.
The nvcc compiler driver is not related to the physical presence of a device, so you can compile CUDA codes even without a CUDA capable GPU. Be warned however that, as remarked by Robert Crovella, the CUDA driver library libcuda.so (cuda.lib for Windows) comes with the NVIDIA driver and not with the CUDA toolkit installer. This means that codes requiring driver APIs (whose entry points are prefixed with cu, see Appendix H of the CUDA C Programming Guide) will need a forced installation of a "recent" driver without the presence of an NVIDIA GPU. This can be done by running the driver installer separately; its --help command line switch lists the available options.
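To make the driver API case concrete, here is a small sketch of a program that uses only cu-prefixed entry points and therefore has to link against libcuda. The build commands in the comments are assumptions: the file name driver_probe.cu is made up, and the stub directory shown is the usual location in a default Linux toolkit install, which may differ on your system.

```cuda
#include <cstdio>
#include <cuda.h>  // driver API header, shipped with the toolkit

// Build on a machine with the NVIDIA driver installed (assumed file name):
//   nvcc driver_probe.cu -o driver_probe -lcuda
// Build on a toolkit-only machine by linking against the stub library
// (assumed default install path, adjust as needed):
//   nvcc driver_probe.cu -o driver_probe -L/usr/local/cuda/lib64/stubs -lcuda
// The stub only satisfies the linker; running the binary still requires
// the real libcuda.so that comes with the driver.

int main() {
    CUresult rc = cuInit(0);  // must precede any other driver API call
    if (rc != CUDA_SUCCESS) {
        const char* name = nullptr;
        cuGetErrorName(rc, &name);
        std::fprintf(stderr, "cuInit failed: %s\n", name ? name : "unknown error");
        return 1;
    }

    int count = 0;
    rc = cuDeviceGetCount(&count);
    if (rc != CUDA_SUCCESS) {
        std::fprintf(stderr, "cuDeviceGetCount failed\n");
        return 1;
    }

    std::printf("CUDA devices visible to the driver: %d\n", count);
    return 0;
}
```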
Following the same rationale, you can compile CUDA codes for an architecture when your node hosts a GPU of a different architecture. For example, you can compile a code for a GeForce GT 540M (compute capability 2.1) on a machine hosting a GT 210 (compute capability 1.2).
Of course, in both the cases (no GPU or GPU with different architecture), you will not be able to successfully run the code.
For early versions of CUDA, it was possible to compile the code in an emulation mode and run the compiled code on a CPU, but device emulation has been deprecated for some time. If you don't have a CUDA capable device but want to run CUDA codes, you can try using gpuocelot (but I don't have any experience with that).