What are the differences between CUDA compute capabilities? - cuda

What does compute capability 2.0 add over 1.3, 2.1 over 2.0, and 3.0 over 2.1?

The Compute Capabilities designate different architectures. In general, newer architectures run both CUDA programs and graphics faster than previous architectures. Note, though, that a high end card in a previous generation may be faster than a lower end card in the generation after.
From the CUDA C Programming Guide (v6.0):

Benefit of higher version of CUDA for devices with lower Compute Capability

I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to updating to a higher version of CUDA, say 8.0? Is there any compatibility or stability risk in using a higher version of CUDA with devices of (much) lower C.C.?
(The various versions of CUDA available on the NVIDIA website make me unsure which one is really good.)
Regarding benefits, newer CUDA toolkit versions usually provide feature benefits (new features and/or enhanced performance) over previous CUDA toolkit versions. However, there are also occasional performance regressions. Specifics can't be given; they may vary based on your exact code. There are, however, generally summary blog articles for each new CUDA toolkit version, for example here is the one for CUDA 8 and here is the one for CUDA 9, describing the new features available.
Regarding compatibility, there should be no risk to moving to a higher CUDA version, regardless of the compute capability of your device, as long as your device is supported. All current CUDA versions in the range of 7-9 support your cc3.5 GPU.
Regarding stability, it is possible that a newer CUDA version may have a bug, but it is also possible that a bug in your existing CUDA version may be fixed in a newer version. Guarantees can't be made here; software almost always has bugs in it. However, it is generally recommended to use the latest CUDA version compatible with your GPU (in the absence of other considerations), as this gives you access to the latest features and at least gives you the best chance that a historically known issue has been addressed.
I doubt these sorts of platitudes are any different regardless of the software stack (e.g. compiler, tools framework, etc.) that you are using. I don't think these considerations are specific or unique to CUDA.
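To make the "as long as your device is supported" condition concrete, here is a minimal host-side C++ sketch of a lookup from toolkit major version to the lowest major compute capability it targets. The table values (2.x for CUDA 7 and 8, 3.x for CUDA 9) are recalled from NVIDIA's release notes and are an assumption here, so check the release notes for your own toolkit. `ToolkitSupport`, `kSupport` and `toolkit_supports` are hypothetical names, not part of any CUDA API.

```cpp
// Hypothetical record: one CUDA toolkit major version and the lowest
// major compute capability it can still target.
struct ToolkitSupport {
    int toolkit_major;
    int min_cc_major;
};

// Illustrative table (assumption, recalled from release notes):
// CUDA 7 and 8 still target Fermi (cc 2.x); CUDA 9 starts at Kepler (cc 3.x).
static const ToolkitSupport kSupport[] = {
    {7, 2},
    {8, 2},
    {9, 3},
};

// Returns true only when the toolkit is in the table and the device's
// major compute capability is at or above that toolkit's minimum.
inline bool toolkit_supports(int toolkit_major, int cc_major) {
    for (const ToolkitSupport& t : kSupport) {
        if (t.toolkit_major == toolkit_major) {
            return cc_major >= t.min_cc_major;
        }
    }
    return false;  // unknown toolkit: make no claim of support
}
```

With this table, `toolkit_supports(8, 3)` is true for the cc 3.5 K20X from the question, while `toolkit_supports(9, 2)` is false.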
I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to update to a higher version of CUDA, say 8.0 ?
Are you kidding me? There are enormous benefits. It's a world of difference! Just have a look at the CUDA 8 feature descriptions (Parallel4All blog entry). Specifically,
CUDA 8.0 lets you compile with GCC 5.x instead of 4.x
That saves you the pain of having to build your own GCC 4.x, since modern distros often don't package it at all and it's not the system's default compiler. GCC 5.x also has lots of improvements, not the least of which is full C++14 support for host-side code.
CUDA 8 lets you use C++11 lambdas in device code
(actually, CUDA 7.5 lets you do that and this is rounded off in CUDA 8)
NVCC internal improvements
Not that I can list these, but hopefully NVIDIA continues working on its compiler, equipping it with better optimization logic.
Much faster compilation
NVCC is markedly faster with CUDA 8. It might be up to 2x, but even if it's just 1.5x - that really improves your quality of life as a developer...
Shall I go on? ... all of the above applies regardless of your compute capability. And CC 3.5 or 3.7 is nothing to sneeze at anyway.

CUDA driver version is insufficient for runtime version [duplicate]

I have a very simple Toshiba laptop with an i3 processor, and no expensive graphics card. In the display settings, I see Intel(HD) Graphics as the display adapter. I am planning to learn some CUDA programming, but I am not sure whether I can do that on my laptop, as it does not have an NVIDIA CUDA-enabled GPU.
In fact, I doubt that I even have a GPU o_o
So I would appreciate it if someone could tell me whether I can do CUDA programming with my current configuration and, if possible, also tell me what Intel(HD) Graphics means.
At the present time, Intel graphics chips do not support CUDA. It is possible that, in the near future, these chips will support OpenCL (which is a standard that is very similar to CUDA), but this is not guaranteed and their current drivers do not support OpenCL either. (There is an Intel OpenCL SDK available, but, at the present time, it does not give you access to the GPU.)
Newest Intel processors (Sandy Bridge) have a GPU integrated into the CPU core. Your processor may be a previous-generation version, in which case "Intel(HD) graphics" is an independent chip.
The Portland Group has a commercial product called CUDA x86. It is a hybrid compiler that builds CUDA C/C++ code to run either on the GPU or with SIMD on the CPU; this is fully automated, without any intervention from the developer. Hope this helps.
Link: http://www.pgroup.com/products/pgiworkstation.htm
If you're interested in learning a language that supports massive parallelism, you had better go for OpenCL, since you don't have an NVIDIA GPU. You can run OpenCL on Intel CPUs, but at best you will learn to program SIMD units.
Optimization on CPUs and GPUs is different. I really don't think you can use an Intel card for GPGPU.
Intel HD Graphics is usually the on-CPU graphics chip in newer Core i3/i5/i7 processors.
As far as I know it doesn't support CUDA (which is a proprietary NVidia technology), but OpenCL is supported by NVidia, ATi and Intel.
In 2020, ZLUDA was created; it provides the CUDA API for Intel GPUs. It is not production-ready yet, though.

In CUDA, does UVA depend on any hardware features?

I know CUDA only got UVA (Unified Virtual Addressing) with version 4.0. But - is that only a software feature? Or does it require some kind of hardware support (on the GPU side I mean)?
Notes:
In this GTC 2011 presentation it says a Fermi-class GPU is necessary for P2P copies, but it doesn't say that's necessary for UVA itself.
Note: I know UVA is not a good idea on a 32-bit-CPU system, I don't mean that kind of hardware support.
UVA, which was introduced back in May 2011 with CUDA 4.0, requires hardware support in the form of a Fermi-class GPU. This implies compute capability 2.0 onwards.
But apparently that's not enough: according to slide #17 of this presentation of the new features of CUDA 4.0, it seems to be supported only on 64-bit systems (which makes sense, since otherwise you would run out of address space pretty quickly), and only with TCC (Tesla Compute Cluster) when on Windows. I'm not sure whether this latter limitation still exists, since I have never developed on Windows.
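The conditions described in this answer can be collected into a small predicate. This is only a sketch of the answer's reasoning in host-side C++: `Platform` and `uva_expected` are hypothetical names, and the Windows/TCC condition reflects the CUDA 4.0 slides, so it may no longer hold. On a real system, the `unifiedAddressing` field of `cudaDeviceProp` reports whether the device actually shares a unified address space with the host.

```cpp
// Hypothetical summary of the platform facts the answer mentions.
struct Platform {
    int  cc_major;    // major compute capability of the GPU (2 = Fermi)
    bool is_64bit;    // true for a 64-bit host process
    bool is_windows;  // true when running on Windows
    bool tcc_driver;  // true when the Windows driver runs in TCC mode
};

// Returns true when every condition listed for CUDA 4.0 UVA is met:
// Fermi or newer, a 64-bit process, and TCC if (and only if) on Windows.
inline bool uva_expected(const Platform& p) {
    const bool driver_ok = !p.is_windows || p.tcc_driver;
    return p.cc_major >= 2 && p.is_64bit && driver_ok;
}
```

For example, a 64-bit Linux host with a Fermi card satisfies the predicate, while the same card in a 32-bit process, or on Windows without TCC, does not.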

Nvidia Jetson TK1 Development Board - Cuda Compute Capability

I am quite impressed with this development kit. Instead of buying a new CUDA card, which might require a new mainboard and so on, this board seems to provide everything in one.
Its specs say it has CUDA compute capability 3.2. AFAIK dynamic parallelism and more come with sm_35, i.e. CUDA compute capability 3.5. Does this card support the Dynamic Parallelism and HyperQ features of the Kepler architecture?
Does this card support Dynamic Parallelism and HyperQ features of Kepler architecture?
No.
Jetson has compute capability 3.2. Dynamic parallelism requires compute capability 3.5 or higher. From the documentation:
Dynamic Parallelism is only supported by devices of compute capability 3.5 and higher.
Hyper-Q also requires cc 3.5 or greater. We can deduce this from careful study of the simpleHyperQ sample code, excerpted:
// HyperQ is available in devices of Compute Capability 3.5 and higher
if (deviceProp.major < 3 || (deviceProp.major == 3 && deviceProp.minor < 5))
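The same comparison can be written as a small reusable check. The sketch below is host-side C++ with hypothetical helper names (`cc_at_least`, `supports_dynamic_parallelism`, `supports_hyperq`); the 3.5 threshold is the one quoted above, and in a real program the major and minor values would come from `deviceProp.major` and `deviceProp.minor`.

```cpp
// Hypothetical helper (not part of the CUDA API): true when compute
// capability major.minor is at least req_major.req_minor.
constexpr bool cc_at_least(int major, int minor, int req_major, int req_minor) {
    return major > req_major || (major == req_major && minor >= req_minor);
}

// Both features are documented as requiring compute capability 3.5 or higher.
constexpr bool supports_dynamic_parallelism(int major, int minor) {
    return cc_at_least(major, minor, 3, 5);
}

constexpr bool supports_hyperq(int major, int minor) {
    return cc_at_least(major, minor, 3, 5);
}

// Jetson TK1 (cc 3.2) is below the threshold; a cc 3.5 device meets it.
static_assert(!supports_dynamic_parallelism(3, 2), "cc 3.2 is below 3.5");
static_assert(!supports_hyperq(3, 2), "cc 3.2 is below 3.5");
static_assert(supports_hyperq(3, 5), "cc 3.5 meets the requirement");
```

Comparing major first and minor second avoids the common mistake of testing `minor >= 5` on its own, which would wrongly reject a cc 5.0 device.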

CUDA, or something similar, that is available for an Intel graphics card?

I want to learn GPGPU and CUDA programming, but I know that only NVIDIA cards support CUDA. My laptop has an Intel HD Graphics card, so I need to find out whether it is possible to do GPGPU or something like it with an Intel graphics card. Thanks for any information.
To develop in CUDA your options are:
Use an NVIDIA GPU: all NVIDIA server, desktop and laptop GPUs have supported CUDA since around 2006. Since your laptop does not have one, you could try using one remotely.
Use PGI CUDA x86: not free, but it does what you want.
Use gpuocelot to execute the PTX on the CPU: that's an open-source project in development, so YMMV.
You cannot do GPGPU on Intel HD Graphics cards today, unless you do shader-based programming (which was common practice in the days before CUDA and OpenCL).
In my experience, the PGI x86 stuff seems to have fallen flat, and I'm not aware of anyone using it. Ocelot is another attempt at the same thing, but it is very researchy and not fully robust at this point.
The only OpenCL compliant devices from Intel are the latest CPUs (Sandy Bridge and Ivy Bridge).
What CPU do you have in your system?
CUDA is NVIDIA-specific, for a start. The GPU emulators are always there in CUDA, so you can easily use them without a graphics card, though it will be slow. A faster solution is the x86 implementation. Either of these will allow you to learn the basics of CUDA without using the GPU at all.
If you want to learn GPGPU in general, you still have the option of learning OpenCL, which is more widely supported, including by AMD, Intel, NVIDIA, etc. For example, Intel has an OpenCL SDK (the target is then the CPU, but I guess that is irrelevant for you).
After learning the basics of either CUDA or OpenCL, the other will be easy to learn. Neither the syntax nor the semantics are the same, but it is an easy step forward, as the concepts are the same.