In CUDA, does UVA depend on any hardware features?

I know CUDA only got UVA (Unified Virtual Addressing) with version 4.0. But - is that only a software feature? Or does it require some kind of hardware support (on the GPU side I mean)?
Notes:
In this GTC 2011 presentation it says a Fermi-class GPU is necessary for P2P copies, but it doesn't say that's necessary for UVA itself.
Note: I know UVA is not a good idea on a 32-bit-CPU system, I don't mean that kind of hardware support.

UVA, which was introduced back in May 2011 with CUDA 4.0, does require hardware support: a Fermi-class GPU or newer. So, this implies compute capability 2.0 onwards.
But apparently that's not enough: according to slide #17 of this presentation of the new features of CUDA 4.0, it seems to be supported only in 64-bit (which makes sense, since otherwise you would run out of address space pretty quickly), and with TCC (Tesla Compute Cluster) when on Windows. I'm not sure whether this latter limitation still exists, since I have never developed on Windows.
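The conditions listed in this answer can be written down as a small predicate. The following is only a sketch in host-side C++, assuming the three prerequisites above are the whole story; `UvaEnvironment`, `uvaExpected` and `hostIs64Bit` are made-up names, not CUDA API. On a real machine the `unifiedAddressing` field of `cudaDeviceProp` (filled in by `cudaGetDeviceProperties`) reports whether the device actually shares a unified address space with the host, so that is the value to trust.

```cpp
// Prerequisites for UVA as named in the answer above.
// The struct and its field names are hypothetical, not part of the CUDA API.
struct UvaEnvironment {
    int  computeCapabilityMajor;  // 2 corresponds to Fermi
    bool is64BitProcess;          // UVA needs a 64-bit address space
    bool isWindows;
    bool usesTccDriver;           // only consulted when isWindows is true
};

// Returns true when every prerequisite listed in the answer holds.
constexpr bool uvaExpected(const UvaEnvironment& env) {
    return env.computeCapabilityMajor >= 2
        && env.is64BitProcess
        && (!env.isWindows || env.usesTccDriver);
}

// Whether the current host build is 64-bit, judged by pointer width.
constexpr bool hostIs64Bit() { return sizeof(void*) == 8; }
```

For example, `uvaExpected(UvaEnvironment{3, hostIs64Bit(), false, false})` describes a cc 3.x card on a non-Windows host and is true only in a 64-bit build.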

Related

Benefit of higher version of CUDA for devices with lower Compute Capability

I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to updating to a higher version of CUDA, say 8.0? Is there any compatibility or stability risk in using a higher version of CUDA with devices of (much) lower C.C.?
(The various versions of CUDA available on the Nvidia website leave me unsure which one is really good.)
Regarding benefits, newer CUDA toolkit versions usually provide feature benefits (new features and/or enhanced performance) over previous CUDA toolkit versions. However, there are also occasional performance regressions. Specifics can't be given; it may vary based on your exact code. However, there are generally summary blog articles for each new CUDA toolkit version, for example here is the one for CUDA 8 and here is the one for CUDA 9, describing the new features available.
Regarding compatibility, there should be no risk to moving to a higher CUDA version, regardless of the compute capability of your device, as long as your device is supported. All current CUDA versions in the range of 7-9 support your cc3.5 GPU.
Regarding stability, it is possible that a newer CUDA version may have a bug, but it is also possible that a bug in your existing CUDA version may be fixed in a newer version. Guarantees can't be made here; software almost always has bugs in it. However, it is generally recommended to use the latest CUDA version compatible with your GPU (in the absence of other considerations), as this gives you access to the latest features and at least gives you the best possibility that a historically known issue has been addressed.
I doubt these sorts of platitudes are any different regardless of the software stack (e.g. compiler, tools framework, etc.) that you are using. I don't think these considerations are specific or unique to CUDA.
I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to update to a higher version of CUDA, say 8.0 ?
Are you kidding me? There are enormous benefits. It's a world of difference! Just have a look at the CUDA 8 feature descriptions (Parallel4All blog entry). Specifically,
CUDA 8.0 lets you compile with GCC 5.x instead of 4.x
That saves you a life full of pain from having to build your own GCC 4.x, since modern distros often don't package it at all and it's not the system's default compiler. GCC 5.x also has lots of improvements, not the least of which is full C++14 support for host-side code.
CUDA 8 lets you use C++11 lambdas in device code
(actually, CUDA 7.5 lets you do that and this is rounded off in CUDA 8)
NVCC internal improvements
Not that I can list these, but hopefully NVIDIA continues working on its compiler, equipping it with better optimization logic.
Much faster compilation
NVCC is markedly faster with CUDA 8. It might be up to 2x, but even if it's just 1.5x - that really improves your quality of life as a developer...
Shall I go on? ... all of the above applies regardless of your compute capability. And CC 3.5 or 3.7 is nothing to sneeze at anyway.

cuda 7.0: maximum nvidia driver version

I have access to a computation server which uses an old version of the nvidia driver (346) and cuda (7.0) with applications depending on that specific version of cuda.
Is it possible to upgrade the driver and keep the old cuda?
I could find minimal driver versions but not a maximal one.
CUDA generally doesn't enforce any maximum driver version.
Older CUDA toolkits are usable with newer drivers.
The only somewhat relevant thing here is that, from time to time, NVIDIA GPU architectures become "deprecated", and this usually happens first at the driver level. That is, a particular GPU may only be supported up to a certain driver level, at which point support ceases. These GPUs are then in a "legacy" status.
So if your GPU is old enough, it will not be supported by the latest drivers. But if you currently have CUDA 7 running correctly, you must have at least a Fermi GPU, which is still supported by the latest drivers. However, Fermi is likely the next GPU family to go into legacy status at some point in the future.
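To make the "minimum but no maximum" rule concrete: CUDA reports versions as integers of the form 1000 * major + 10 * minor (so 7.0 is 7000), which is what `cudaDriverGetVersion` and `cudaRuntimeGetVersion` return. A rough host-side C++ sketch of the comparison, with hypothetical helper names (`CudaVersion`, `decodeCudaVersion`, `driverSufficient`) that are not part of the CUDA API, might look like this:

```cpp
// Decoded form of a CUDA version integer. The struct name and its
// field names are hypothetical, chosen for this illustration only.
struct CudaVersion {
    int majorPart;
    int minorPart;
};

// Splits an encoded version (1000 * major + 10 * minor) into its parts.
constexpr CudaVersion decodeCudaVersion(int encoded) {
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// The driver only has to be at least as new as the runtime the
// application was built against; there is no upper bound.
constexpr bool driverSufficient(int driverVersion, int runtimeVersion) {
    return driverVersion >= runtimeVersion;
}
```

With a driver that reports 9000 and an application built against the 7.0 runtime (7000), `driverSufficient(9000, 7000)` holds; only the reverse case, an older driver under a newer runtime, fails.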

CUDA driver version is insufficient for runtime version [duplicate]

I have a very simple Toshiba laptop with an i3 processor, and I do not have any expensive graphics card. In the display settings, I see Intel(HD) Graphics as the display adapter. I am planning to learn some CUDA programming, but I am not sure if I can do that on my laptop, as it does not have an NVIDIA CUDA-enabled GPU.
In fact, I doubt I even have a GPU o_o
So, I would appreciate it if someone could tell me whether I can do CUDA programming with the current configuration and, if possible, also let me know what Intel(HD) Graphics means.
At the present time, Intel graphics chips do not support CUDA. It is possible that, in the near future, these chips will support OpenCL (which is a standard that is very similar to CUDA), but this is not guaranteed and their current drivers do not support OpenCL either. (There is an Intel OpenCL SDK available, but, at the present time, it does not give you access to the GPU.)
Newest Intel processors (Sandy Bridge) have a GPU integrated into the CPU core. Your processor may be a previous-generation version, in which case "Intel(HD) graphics" is an independent chip.
The Portland Group has a commercial product called CUDA x86. It is a hybrid compiler that builds CUDA C/C++ code so that it can either run on a GPU or use SIMD on the CPU. This is fully automated, without any intervention from the developer. Hope this helps.
Link: http://www.pgroup.com/products/pgiworkstation.htm
If you're interested in learning a language which supports massive parallelism, you had better go for OpenCL, since you don't have an NVIDIA GPU. You can run OpenCL on Intel CPUs, but at best you can learn to program SIMD units.
Optimization on CPU and GPU are different. I really don't think you can use an Intel card for GPGPU.
Intel HD Graphics is usually the on-CPU graphics chip in newer Core i3/i5/i7 processors.
As far as I know it doesn't support CUDA (which is a proprietary NVidia technology), but OpenCL is supported by NVidia, ATi and Intel.
In 2020, ZLUDA was created, which provides the CUDA API for Intel GPUs. It is not production-ready yet, though.

CUDA or something similar that is available for Intel graphics cards?

I want to learn GPGPU and CUDA programming, but I know that only Nvidia cards support it. My laptop has an Intel HD Graphics card, so I need to find out whether it is possible to do GPGPU or something like that with an Intel graphics card. Thanks for any information.
To develop in CUDA your options are:
Use an NVIDIA GPU. All NVIDIA server, desktop and laptop GPUs have supported CUDA since around 2006. Since your laptop does not have one, you could try using one remotely.
Use PGI CUDA x86. It is not free, but it does what you want.
Use gpuocelot to execute the PTX on the CPU. That's an open-source project in development, so YMMV.
You cannot do GPGPU on Intel HD Graphics cards today, unless you do shader-based programming (which was common practice in the days before CUDA and OpenCL).
In my experience, the PGI x86 stuff seems to have fallen flat, and I'm not aware of anyone using it. Ocelot is another attempt at the same, but it is very researchy and not fully robust at this point.
The only OpenCL compliant devices from Intel are the latest CPUs (Sandy Bridge and Ivy Bridge).
What CPU do you have in your system?
CUDA is Nvidia-specific, for a start. The GPU emulators are always there in CUDA, so you can easily use them without a graphics card, though it will be slow. A faster solution is the x86 implementation. Any of these will allow you to learn the basics of CUDA without using the GPU at all.
If you want to learn GPGPU in general, you still have the option to learn OpenCL, which is more widely supported, including by AMD, Intel, Nvidia etc. E.g. Intel has an OpenCL SDK (the target is the CPU then, but I guess that is irrelevant for you).
After learning the basics of either CUDA or OpenCL, the other will be easy to learn. Neither the syntax nor the semantics are the same, but it is an easy step forward as the concepts are the same.

CUDA-enabled graphics processor as VMware?

I'm taking a course that teaches CUDA. I would like to use it on my personal laptop, but I don't have an Nvidia graphics processor; mine is ATI. So I was wondering: is there any virtual hardware simulator that I can use? Or is there no other way than using a PC with a CUDA graphics processor?
Thank you very much
The CUDA toolkit used to ship with a host CPU emulation mode, but that was deprecated early in the 3.0 release cycle and has been fully removed from toolkits for the best part of two years.
Your only real option today is to use Ocelot. It has a PTX assembly translator and a pretty reliable reimplementation of the CUDA runtime for x86 CPUs, and there is also a rather experimental PTX to AMD IL translator (I have no experience with the latter). On a modern Linux system with an up-to-date GNU toolchain, Ocelot is reasonably easy to get running. I am not sure if there is a functioning Windows port or not.
CUDA has its own emulation mode, which runs everything on the CPU. The problem is that in that case you don't have real concurrency, so programs that run successfully in emulation mode can fail (and usually do) in normal mode. You can develop your code in emulation mode, but then you have to debug it on a computer with a CUDA card.