I want to start CUDA in C++, and I'm familiar with C++, Qt, and C#.
But I want to know: is it better to use the CUDA libraries, at the high level, or the CUDA APIs, at the lower level?
Is it better to start from the API and not use the CUDA driver?
(I'm starting with "CUDA by Example" for its parallel programming concepts.)
Since you are familiar with C/C++, you'd be better off using the higher-level API, CUDA C (also called "C for CUDA"), which is more convenient and easier to write because it consists of a minimal set of extensions to the C language plus a runtime library.
The lower-level API, the CUDA driver API, provides an additional level of control by exposing lower-level concepts. It requires more code and is harder to program and debug, but it offers finer control and is language-independent since it handles binary or assembly code.
See Chapter 3 of the CUDA Programming Guide for more details.
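For a taste of what the higher-level API looks like, here is a minimal runtime-API sketch (a toy vector addition; the names and sizes are arbitrary):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Kernel: each thread adds one pair of elements.
    __global__ void add(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1024;
        float *a, *b, *c;
        // Unified memory is visible to both host and device.
        cudaMallocManaged(&a, n * sizeof(float));
        cudaMallocManaged(&b, n * sizeof(float));
        cudaMallocManaged(&c, n * sizeof(float));
        for (int i = 0; i < n; ++i) { a[i] = i; b[i] = 2 * i; }

        // The <<<grid, block>>> launch is the main language extension.
        add<<<(n + 255) / 256, 256>>>(a, b, c, n);
        cudaDeviceSynchronize();

        printf("c[10] = %f\n", c[10]);   // expect 30
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }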
Before I dive too deep into CUDA programming I need to orient myself. The NVIDIA CUDA programming guides made a distinct change from referring to "CUDA C" to "CUDA C++" between versions 10.1 and 10.2. Since this was a minor version change, I suspect it is just semantics. I compared sample code from pre-10.1 and post-10.2 and found no difference...though that doesn't mean there is no difference. Was there a more subtle programming paradigm shift between these versions?
Here's my suspicion: CUDA has always been an extension of C++, not C, but everyone has referred to it as CUDA C because we don't take advantage of the OOP offered by C++ when writing CUDA code. Is that a fair assessment?
I think your assessment is reasonable conjecture. People are sometimes imprecise in their references to C and C++, and so CUDA probably hasn't been very rigorous here either. There is some history though that suggests to me this is not purely hand-waving.
CUDA started out as largely a C-style realization, but over time added C++ style features. Certainly by CUDA 4.0 (circa 2010) if not before, there were plenty of C++ style features.
Lately, CUDA has dropped the reference to C and instead claims compliance with a particular C++ ISO standard, subject to various enumerated restrictions and limitations.
The CUDA compiler, nvcc, behaves by default like a C++ style compiler (so, for example, using C++ style mangling), and will by default invoke the host-code C++ compiler (e.g. g++) not the host code C compiler (e.g. gcc) when passing off host code to be compiled.
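For example (a small sketch; the kernel names here are made up): because nvcc treats .cu files as C++, kernel symbols are mangled unless you opt out with extern "C":

    // Compiled by nvcc as C++, this kernel's symbol name is mangled.
    __global__ void scale(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    // extern "C" suppresses the C++ name mangling, so the kernel can be
    // looked up by its plain name (e.g. from the driver API).
    extern "C" __global__ void scale_c(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }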
As you point out, a programmer can use the C++ language syntactically in a very similar way to C usage (e.g. without the use of classes, to pick one example). This is also true for CUDA C++.
Rome wasn't built in a day, and CUDA development has likewise proceeded in various areas at various rates. For example, one stated limitation of CUDA is that elements of the standard library (std::) are not necessarily supported in device code. However, various CUDA developers are working to gradually fill in this gap with the libcu++ evolution.
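To make the C++ side concrete, here is a small sketch (arbitrary kernel and sizes) using templates, a C++ feature that has long worked in device code:

    #include <cuda_runtime.h>

    // A templated kernel: the same source instantiates for several types.
    template <typename T>
    __global__ void fill(T *out, T value, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = value;
    }

    int main() {
        const int n = 256;
        float *f; int *d;
        cudaMalloc(&f, n * sizeof(float));
        cudaMalloc(&d, n * sizeof(int));
        fill<<<1, n>>>(f, 3.14f, n);   // instantiated for float
        fill<<<1, n>>>(d, 42, n);      // instantiated for int
        cudaDeviceSynchronize();
        cudaFree(f); cudaFree(d);
        return 0;
    }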
I'm writing a single-header library that executes a CUDA kernel. I was wondering: is there a way to get around the <<<>>> syntax, or to get C source output from nvcc?
You can avoid the host language extensions by using the CUDA driver API instead. It is a little more verbose and you will need a little more boilerplate code to manage the context, but it is not too difficult.
Conventionally, you would compile to PTX or a binary payload to load at runtime; however, NVIDIA now also ships an experimental JIT CUDA C compiler library, libNVVM, which you could try if you want to JIT from source.
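For illustration, a minimal driver-API sketch might look like the following (error checking omitted; "kernel.ptx" and "my_kernel" are placeholders for your own PTX file, e.g. produced with nvcc -ptx, and your kernel's name):

    #include <cuda.h>   // driver API; link with -lcuda

    int main() {
        // Boilerplate the runtime API normally hides: init and context.
        cuInit(0);
        CUdevice dev;   cuDeviceGet(&dev, 0);
        CUcontext ctx;  cuCtxCreate(&ctx, 0, dev);

        // Load the PTX module and look the kernel up by name.
        CUmodule mod;   cuModuleLoad(&mod, "kernel.ptx");
        CUfunction fn;  cuModuleGetFunction(&fn, mod, "my_kernel");

        // Kernel arguments are passed as an array of pointers to values.
        int n = 1024;
        CUdeviceptr buf; cuMemAlloc(&buf, n * sizeof(float));
        void *args[] = { &buf, &n };

        // Equivalent of my_kernel<<<4, 256>>>(buf, n), without <<<>>>.
        cuLaunchKernel(fn, 4, 1, 1, 256, 1, 1, 0, NULL, args, NULL);
        cuCtxSynchronize();

        cuMemFree(buf);
        cuModuleUnload(mod);
        cuCtxDestroy(ctx);
        return 0;
    }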
I'm planning to use the GPU for an application with intensive matrix manipulation, and I want to use NVIDIA's CUDA support. My only doubt is: is there any fallback support? I mean: if I use these libraries, can I still run the application in a non-CUDA environment (without GPU support, of course)? I'd like to be able to debug the application without being constrained to that environment. I couldn't find this information; any tips?
There is no fallback support built into the libraries (e.g. CUBLAS, CUSPARSE, CUFFT). You would need to build a check for an existing CUDA environment into your code and, if it finds none, provide your own code path, perhaps using alternate libraries. For example, CUBLAS functions can be mostly duplicated by other BLAS libraries (e.g. MKL), and CUFFT functions can be largely replaced by other FFT libraries (e.g. FFTW).
How to detect a CUDA environment is covered in other SO questions. In a nutshell, if your application bundles (e.g. statically links) the CUDART library, then you can run a procedure similar to that in the deviceQuery sample code to determine what GPUs (if any) are available.
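A sketch of such a check (the function name and fallback branches are illustrative only):

    #include <cstdio>
    #include <cuda_runtime.h>

    // True if at least one CUDA device is usable. With cudart statically
    // linked, this fails gracefully on machines with no GPU or no driver.
    bool cudaAvailable() {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        return err == cudaSuccess && count > 0;
    }

    int main() {
        if (cudaAvailable()) {
            printf("GPU path (e.g. CUBLAS/CUFFT)\n");
        } else {
            printf("CPU fallback (e.g. MKL/FFTW)\n");
        }
        return 0;
    }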
This is a bit of a silly question, but I'm wondering: does CUDA use an interpreter or a compiler?
I'm wondering because I'm not quite sure how CUDA manages to get source code to run on two cards with different compute capabilities.
From Wikipedia:
Programmers use 'C for CUDA' (C with Nvidia extensions and certain restrictions), compiled through a PathScale Open64 C compiler.
So, your answer is: it uses a compiler.
And to touch on the reason it can run on multiple cards (source):
CUDA C/C++ provides an abstraction, it's a means for you to express how you want your program to execute. The compiler generates PTX code which is also not hardware specific. At runtime the PTX is compiled for a specific target GPU - this is the responsibility of the driver which is updated every time a new GPU is released.
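You can see the runtime half of this from your own code; for instance, this small sketch reports which architecture the driver will target on the current machine:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        // The embedded PTX is JIT-compiled by the driver for whatever
        // GPU is present; here we just report that target.
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("Running on %s, compute capability %d.%d\n",
               prop.name, prop.major, prop.minor);
        return 0;
    }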
The official documents CUDA C Programming Guide and The CUDA Compiler Driver (NVCC) explain all the details of the compilation process.
From the second document:
nvcc mimics the behavior of the GNU compiler gcc: it accepts a range of conventional compiler options, such as for defining macros and include/library paths, and for steering the compilation process.
This is not limited to CUDA: shaders in DirectX or OpenGL are also compiled to some kind of bytecode and converted to native code by the underlying driver.
So, I heard that some people have figured out ways to run programs on the GPU using the High Level Shader Language, and I would like to start writing my own programs that run on the GPU rather than on my CPU, but I have been unable to find anything on the subject.
Does anyone have any experience with writing programs for the GPU or know of any documentation on the subject?
Thanks.
For computation, CUDA and OpenCL are more suitable than shader languages. For CUDA, I highly recommend the book CUDA by Example. The book is aimed at absolute beginners to this area of programming.
The best way, I think, to start is to:
1. Get a CUDA-capable card from NVIDIA.
2. Download the driver + toolkit + SDK.
3. Build the examples.
4. Read the CUDA Programming Guide.
5. Start by recreating the cudaDeviceInfo example.
6. Try to allocate memory on the GPU.
7. Try to create a little kernel (see the sketch below).
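As a sketch of the last two steps (the kernel and the sizes here are arbitrary toy choices):

    #include <cstdio>
    #include <cuda_runtime.h>

    // A first "little kernel": every thread writes its own index.
    __global__ void writeIndex(int *out) {
        out[threadIdx.x] = threadIdx.x;
    }

    int main() {
        int host[32];
        int *dev;
        cudaMalloc(&dev, sizeof(host));       // allocate memory on the GPU
        writeIndex<<<1, 32>>>(dev);           // launch 32 threads
        cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
        printf("host[5] = %d\n", host[5]);    // expect 5
        cudaFree(dev);
        return 0;
    }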
From there you should be able to gain enough momentum to learn the rest.
Once you learn CUDA, OpenCL and the others are a breeze.
I suggest CUDA because it is the most widely supported and tested.