Does the client need to install the CUDA Toolkit to run the application?

I have finished building my CUDA application and would like to install it on my clients' PCs.
The CUDA Toolkit installer is about 2 GB and comes with the compiler, samples, tools, etc.
Is the whole toolkit definitely required to be installed on the client machine?
Is there no lighter version for just the CUDA Runtime API?

Is the whole toolkit definitely required to be installed on the client machine?
No, it isn't.
Is there no lighter version for just the CUDA Runtime API?
No, but the EULA allows you to redistribute the library components your application requires along with your application. The simplest solution would be to have your installation/deployment system copy the required toolkit components you built against to a private path which is known to your application or deployment environment. You also need to deploy a supported driver version.
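As an illustration only (a minimal sketch, not from the original answer; the file name and messages are made up), a small startup check along these lines lets the deployed application confirm that the redistributed runtime loads and that a usable driver and GPU are present:

    // deploy_check.cu - sketch of a startup sanity check for a deployed application.
    // Assumes the app is linked against the redistributed CUDA runtime (libcudart.so / cudart*.dll).
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int runtimeVersion = 0;
        // Reports which CUDA runtime the application actually loaded.
        cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
        if (err != cudaSuccess) {
            std::printf("CUDA runtime not usable: %s\n", cudaGetErrorString(err));
            return 1;
        }

        int deviceCount = 0;
        // Fails, or reports 0 devices, if no supported driver/GPU is installed.
        err = cudaGetDeviceCount(&deviceCount);
        if (err != cudaSuccess || deviceCount == 0) {
            std::printf("No usable CUDA driver/GPU found (%s)\n",
                        err != cudaSuccess ? cudaGetErrorString(err) : "0 devices");
            return 1;
        }

        std::printf("CUDA runtime %d, %d device(s) available\n", runtimeVersion, deviceCount);
        return 0;
    }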

Related

Is there a way to compile CUDA programs on a machine that does not have an NVIDIA graphics card? [duplicate]

I tried to install the CUDA toolkit without the display driver on CentOS 6. It gets installed properly. I was able to compile, but the program runs without performing any operation and I get garbage values in the array addition. For cudaGetDeviceCount(&count) I get a value of "0", which means I don't have any card on my machine.
You can install the CUDA toolkit without installing the driver.
You can then compile CUDA codes that use the runtime API.
You will not be able to run those codes unless you have a proper CUDA driver and GPU installed in the machine, however.
Codes that depend on the driver API will also not be compilable in this configuration on older CUDA toolkits without additional work; newer CUDA toolkits provide stub libraries for the driver libraries, which can be linked against.
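To illustrate the point about running (a minimal sketch, not part of the original answers), checking the return status of every runtime call makes the absence of a GPU/driver show up as an explicit error rather than as garbage output from an array addition:

    // vec_add_check.cu - sketch: this compiles on a machine with only the toolkit,
    // but without a GPU and driver every runtime call reports an error instead of
    // silently leaving garbage in the output array.
    #include <cstdio>
    #include <cuda_runtime.h>

    #define CHECK(call)                                                      \
        do {                                                                 \
            cudaError_t e = (call);                                          \
            if (e != cudaSuccess) {                                          \
                std::printf("CUDA error %s at %s:%d\n",                      \
                            cudaGetErrorString(e), __FILE__, __LINE__);      \
                return 1;                                                    \
            }                                                                \
        } while (0)

    __global__ void add(const int *a, const int *b, int *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        int count = 0;
        CHECK(cudaGetDeviceCount(&count));   // errors out here if no driver is installed
        if (count == 0) { std::printf("no CUDA device found\n"); return 1; }

        const int n = 256;
        int ha[n], hb[n], hc[n];
        for (int i = 0; i < n; ++i) { ha[i] = i; hb[i] = 2 * i; }

        int *da, *db, *dc;
        CHECK(cudaMalloc((void **)&da, n * sizeof(int)));
        CHECK(cudaMalloc((void **)&db, n * sizeof(int)));
        CHECK(cudaMalloc((void **)&dc, n * sizeof(int)));
        CHECK(cudaMemcpy(da, ha, n * sizeof(int), cudaMemcpyHostToDevice));
        CHECK(cudaMemcpy(db, hb, n * sizeof(int), cudaMemcpyHostToDevice));

        add<<<(n + 127) / 128, 128>>>(da, db, dc, n);
        CHECK(cudaGetLastError());           // catches kernel launch failures
        CHECK(cudaMemcpy(hc, dc, n * sizeof(int), cudaMemcpyDeviceToHost));

        std::printf("hc[10] = %d (expected 30)\n", hc[10]);
        CHECK(cudaFree(da)); CHECK(cudaFree(db)); CHECK(cudaFree(dc));
        return 0;
    }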
This answer covers the method to install the CUDA toolkit without the driver.
If you just want to run the code and profile its performance and other parameters, it may be helpful to install the GPGPU-Sim simulator. It doesn't need any graphics card on your machine.

cudart_static - when is it necessary?

Since newer drivers ship with the CUDA runtime (I can choose 9.1 or 9.2 in the drivers download page) my question is: should my library (which uses a CUDA kernel internally) be shipped with -lcudart_static?
I had issues launching kernels compiled with 9.2 on systems which used 9.1 CUDA drivers. What's the most 'compatible' way of ensuring my library will run everywhere a recent CUDA driver is installed? (I'm already compiling for a virtual architecture)
Since newer drivers ship with the CUDA runtime (I can choose 9.1 or 9.2 in the drivers download page)
No, that's incorrect. The choice on the drivers download page reflects the fact that each CUDA version has a minimum required driver version associated with it. It does not mean that the driver ships with the CUDA runtime (stated another way, the driver does not install libcudart.so on Linux, and never has; with some careful experimentation on a clean install, you can prove this to yourself).
Some additional comments:
-lcudart_static is actually the default for current/recent versions of nvcc, as you can discover by reading the nvcc manual. Therefore, by default, your executable, when compiled/built with nvcc, should already be statically linked against the CUDA runtime library corresponding to the version of nvcc you are using for compilation. The reason you might need to specify this (or something like it) is if you are building an application with, e.g., the GNU toolchain on Linux rather than with nvcc.
The purpose of static linking to the CUDA runtime library is, as you surmise, so that an application can be built in such a way that it does not need an installation of the CUDA toolkit to run properly. It only needs a machine with a proper GPU driver install.
The most compatible way to ensure that an application will run on a range of machines with a range of GPU driver installs is to compile your application using the oldest CUDA toolkit required to meet the needs of the earliest GPU driver in the range you intend to cover. Again, you can refer to the table here.
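For illustration (a minimal sketch, not from the original answer; it applies the simple rule that the driver's supported CUDA version must be at least the runtime version, and ignores forward-compatibility packages), an application can check at startup whether the installed driver is new enough for the runtime it was built against:

    // version_check.cu - sketch: compare the CUDA version supported by the installed
    // driver with the runtime version the application was built/linked against.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int driverVersion = 0, runtimeVersion = 0;
        cudaDriverGetVersion(&driverVersion);    // 0 if no CUDA driver is installed
        cudaRuntimeGetVersion(&runtimeVersion);  // version of the (statically) linked runtime

        std::printf("driver supports CUDA %d.%d, application runtime is %d.%d\n",
                    driverVersion / 1000, (driverVersion % 100) / 10,
                    runtimeVersion / 1000, (runtimeVersion % 100) / 10);

        if (driverVersion == 0 || driverVersion < runtimeVersion) {
            std::printf("the installed driver is too old for this application\n");
            return 1;
        }
        return 0;
    }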

Developing using CUDA on several computers, when only one has a GPU installed

I am a Java developer. To speed some of our algorithms, we have decided to try CUDA.
But the issue is that currently we have only one server with a GPU installed, and 3 developers have to work on it (by transferring the files each time over ssh, then compiling and running them there). This is obviously a tedious process.
What I would like to know is: on my machine, which does not have a GPU, can I use Nsight to work on CUDA by compiling and generating files locally, which can then be automatically transferred to the server to get the result?
If we could at least work on the algorithm locally using Nsight (or any other IDE) rather than plain vim, and then compile it to catch compile-time errors, this would save quite some time.
On Linux you can do remote debugging using Nsight Eclipse Edition, as documented here. This requires CUDA 5.5 or later. On Windows you need to start the Nsight Monitor on the server and then just configure Nsight Visual Studio Edition to use the remote machine.

Portability of a DLL with CUDA code

I have a DLL which contains a CUDA function (image processing). This DLL is compiled with Visual Studio 2008 Express Edition. I call this DLL from LabVIEW.
The DLL and the LabVIEW VI are developed on one computer (in the office), and I need to run the same program on a different computer (in the lab).
Q1: Do I have to install the CUDA toolkit or the CUDA SDK on the lab computer?
Q2: Do I have to recompile the DLL on the lab computer, or is the DLL completely portable?
Thanks
Yes, you have to install the CUDA toolkit, and the SDK as well if you use any functions/wrappers (like cudaSafeCall) from the SDK; in general, the SDK is not necessary. You also need a compatible NVIDIA GPU driver installed on the lab computer.
You need not recompile if the lab computer and your own have the same Microsoft Visual Studio runtime and CUDA runtime version, and the lab computer has a GPU with a compute capability your code was compiled for. For more information about CUDA code compatibility, see sections 3.1.2 - 3.1.4 in the CUDA C Programming Guide.
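As an illustration (a minimal sketch, not from the original answer; the required compute capability shown is just a placeholder), you can check at runtime that the lab computer's GPU meets the compute capability the DLL was built for:

    // cc_check.cu - sketch: verify the GPU on the target (lab) machine has at least
    // the compute capability the DLL was compiled for. The 1.3 minimum is a placeholder.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
            std::printf("no usable CUDA device/driver found\n");
            return 1;
        }

        const int requiredMajor = 1, requiredMinor = 3;  // placeholder target
        std::printf("device 0: %s, compute capability %d.%d\n",
                    prop.name, prop.major, prop.minor);

        if (prop.major < requiredMajor ||
            (prop.major == requiredMajor && prop.minor < requiredMinor)) {
            std::printf("GPU does not meet the compute capability the DLL was built for\n");
            return 1;
        }
        return 0;
    }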

How to create 64-bit CUDA applications? (Win7 x64, CUDA 4, VS 2010 Express)

I'm mostly set up for CUDA development. I've installed the developer drivers, CUDA 4.0 toolkit, and the 4.0 SDK, as well as the bugfix. I'm running Windows 7 x64, and am using Visual C++ 2010 Express. For 32-bit applications, I perform the following steps and my CUDA applications work properly.
Create new empty project
make sure Platform Toolset is set to v100 (normally the default)
check the CUDA 4.0 Build Customization for the project
set the item type of my .cu file to CUDA C/C++
add 'cudart.lib' to Properties->Linker->Input->Additional Dependencies
I can also run non-CUDA 64-bit applications. Visual C++ 2010 Express does not come with 64-bit dependencies automatically, so I had to install the Windows 7.1 SDK w/ .NET Framework 4.0. Then I simply set the Platform Toolset for the VC++ project to Windows7.1SDK, change the Active solution platform to x64, and I'm good to go.
However, I can't seem to do both at the same time: I can't create a 64-bit CUDA application. If I change the Platform Toolset of a CUDA application to Windows7.1SDK, whether the Active solution platform is x64 or Win32, I get the compile error that nvcc.exe exited with code -1. And if I leave the Platform Toolset set to v100 and change the Active solution platform to x64, I get the compile error "fatal error LNK1104: cannot open file 'kernel32.lib'". The only combination that works is v100 and Win32, and obviously that prevents me from running a 64-bit application.
Is there a procedure for enabling this functionality that I just haven't been able to find online? Any ideas or suggestions? Thanks for your time.
This is not possible in the Express edition (it does not support plugins) unless you want to set up nvcc manually and use Notepad to write .cu files; I very much prefer the VS integration.
You could check that the host compiler properties for the .cu files are set to 64-bit.
Right-click the "Code.cu" file and click 'Properties'.
Expand the "CUDA C/C++" item and select "Common".
Change "Target Machine Platform" to 64-bit.