CUDA 7.0: maximum NVIDIA driver version

I have access to a computation server that uses an old version of the NVIDIA driver (346) and CUDA (7.0), with applications depending on that specific version of CUDA.
Is it possible to upgrade the driver and keep the old CUDA?
I could find minimum driver versions, but not a maximum one.

CUDA generally doesn't enforce any maximum driver version.
Older CUDA toolkits are usable with newer drivers.
The only somewhat relevant consideration is that, from time to time, NVIDIA GPU architectures become "deprecated", and this usually happens first at the driver level. That is, a particular GPU may only be supported up to a certain driver version, at which point support ceases. These GPUs are then in a "legacy" status.
So if your GPU is old enough, it will not be supported by newer/latest drivers. But if you currently have CUDA 7 running correctly, you must have at least a Fermi GPU, which is still supported by the newest/latest drivers. However, Fermi is likely the next GPU family to move into legacy status at some point in the future.
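To see how the installed driver and toolkit relate on a given machine, you can compare the CUDA version the driver supports with the CUDA runtime version an application was built against. Below is a minimal sketch (my own, not from the original answer) using the documented runtime calls cudaDriverGetVersion and cudaRuntimeGetVersion; forward compatibility simply means the first number must be greater than or equal to the second.

```cpp
// Minimal sketch: compare the driver's supported CUDA version with the
// CUDA runtime version this binary was built against.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    // Latest CUDA version the installed driver supports (e.g. 7000 = CUDA 7.0).
    cudaDriverGetVersion(&driverVersion);
    // CUDA runtime version this application was compiled against.
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("Driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    // Forward compatibility: the driver's CUDA version must be >= the runtime's.
    if (driverVersion >= runtimeVersion)
        printf("This driver can run binaries built with this toolkit.\n");
    return 0;
}
```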

Related

CUDA Driver API - minimum driver version?

I know that each CUDA toolkit has a minimum required driver. What I'm wondering is the following: suppose I'm loading each function pointer for each Driver API function (e.g. cuInit) via dlsym from libcuda.so. I use no runtime API calls, nor do I link against cudart. My kernels are compiled to a virtual architecture so they can be JIT-compiled at runtime (and the architecture is quite low, e.g. compute_30, so I'm content with any Kepler-and-above device).
Does the minimum driver required restriction still apply in my case?
Yes, there is still a minimum driver version requirement.
The GPU driver has a CUDA version that it is designed to be compatible with. This can be discovered in a variety of ways, one of which is to run the deviceQuery (or deviceQueryDrv) sample code.
Therefore a particular GPU driver will have a "compatibility" associated with a particular CUDA version.
In order to run correctly, Driver API codes will require an installed GPU Driver that is compatible with (i.e. has a CUDA compatibility version equal to or greater than) the CUDA version that the Driver API code was compiled against.
The CUDA/GPU Driver compatibility relationships, and the concept of forward compatibility, are similar to what is described in this question/answer.
To extend/generalize the ("forward") compatibility relationship statement from the previous answer, newer GPU Driver versions are generally compatible with older CUDA codes, whether those codes were compiled against the CUDA Runtime or CUDA Driver APIs.
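As a concrete illustration of the scenario in the question, here is a minimal sketch (my own, not part of the answer) that loads libcuda.so via dlopen/dlsym and asks the driver which CUDA version it supports; the same minimum-driver rule applies to kernels JIT-compiled from PTX at runtime.

```cpp
// Minimal sketch of the dlsym approach from the question: load libcuda.so at
// runtime and query the CUDA version the driver is compatible with.
// cuInit and cuDriverGetVersion are real Driver API entry points; error
// handling is reduced for brevity.
#include <cstdio>
#include <dlfcn.h>

typedef int (*cuInit_t)(unsigned int);
typedef int (*cuDriverGetVersion_t)(int *);

int main() {
    void *lib = dlopen("libcuda.so.1", RTLD_NOW);
    if (!lib) { fprintf(stderr, "libcuda.so.1 not found\n"); return 1; }

    cuInit_t cuInit = (cuInit_t)dlsym(lib, "cuInit");
    cuDriverGetVersion_t cuDriverGetVersion =
        (cuDriverGetVersion_t)dlsym(lib, "cuDriverGetVersion");

    if (cuInit(0) != 0) { fprintf(stderr, "cuInit failed\n"); return 1; }

    int version = 0;
    cuDriverGetVersion(&version);
    // A JIT-compiled compute_30 kernel still needs a driver whose CUDA
    // compatibility version is at least that of the toolkit that produced the PTX.
    printf("Driver supports CUDA %d.%d\n", version / 1000, (version % 100) / 10);
    dlclose(lib);
    return 0;
}
```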

Does the NVIDIA Titan V support GPUDirect?

I was wondering if someone might be able to help me figure out whether the new Titan V from NVIDIA supports GPUDirect. As far as I can tell, it seems limited to Tesla and Quadro cards.
Thank you for taking the time to read this.
GPUDirect Peer-to-Peer (P2P) is supported between any 2 "like" CUDA GPUs (of compute capability 2.0 or higher), if the system topology supports it, and subject to other requirements and restrictions. If in doubt, "like" means identical; other combinations (e.g. 2 different GPUs of the same compute capability) may work, but this is not specified or advertised as supported, so if in doubt, try it out. In a nutshell, the system topology requirement is that both participating GPUs must be enumerated under the same PCIe root complex. Finally, these things must be "discoverable" by the GPU driver: if the driver cannot ascertain these facts, and/or the system is not part of a whitelist maintained in the driver, then P2P support will not be possible.
Note that in general, P2P support may vary by GPU or GPU family. The ability to run P2P on one GPU type or family does not necessarily indicate it will work on another GPU type or family, even in the same system/setup. The final determinant of GPU P2P support is the runtime query cudaDeviceCanAccessPeer (see the sketch at the end of this answer), so the statement here "is supported" should not be construed to refer to a particular GPU type. P2P support can vary by system and other factors as well. No statements made here are a guarantee of P2P support for any particular GPU in any particular setup.
GPUDirect RDMA is only supported on Tesla and possibly some Quadro GPUs.
So, if you had a system that had 2 Titan V GPUs plugged into PCIE slots that were connected to the same root complex (usually, except in Skylake CPUs, it should be sufficient to say "connected to the same CPU socket"), and the system (i.e. core logic) was recognized by the GPU driver, I would expect P2P to work between those 2 GPUs.
I would not expect GPUDirect RDMA to work to a Titan V, under any circumstances.
YMMV. If in doubt, try it out, before making any large purchasing decisions.
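For completeness, here is a minimal sketch (my own, assuming a multi-GPU system; not part of the original answer) of the runtime query mentioned above:

```cpp
// Minimal sketch: cudaDeviceCanAccessPeer is the final word on whether P2P
// is possible between a given pair of GPUs on a given system.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    for (int a = 0; a < n; ++a) {
        for (int b = 0; b < n; ++b) {
            if (a == b) continue;
            int canAccess = 0;
            // Returns 1 if device a can directly access device b's memory.
            cudaDeviceCanAccessPeer(&canAccess, a, b);
            printf("GPU %d -> GPU %d : P2P %s\n", a, b,
                   canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}
```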

Benefit of higher version of CUDA for devices with lower Compute Capability

I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to updating to a higher version of CUDA, say 8.0? Is there any compatibility or stability risk in using a higher version of CUDA with devices of (much) lower C.C.?
(The various available versions of CUDA on the NVIDIA website make me unsure which one is really appropriate.)
Regarding benefits, newer CUDA toolkit versions usually provide feature benefits (new features and/or enhanced performance) over previous CUDA toolkit versions. However, there are also occasional performance regressions. Specifics can't be given - it may vary based on your exact code. However, there are generally summary blog articles for each new CUDA toolkit version, for example here is the one for CUDA 8 and here is the one for CUDA 9, describing the new features available.
Regarding compatibility, there should be no risk to moving to a higher CUDA version, regardless of the compute capability of your device, as long as your device is supported. All current CUDA versions in the range of 7-9 support your cc3.5 GPU.
Regarding stability, it is possible that a newer CUDA version may have a bug, but it is also possible that a bug in your existing CUDA version may be fixed in a newer version. Guarantees can't be made here; software almost always has bugs in it. However, it is generally recommended to use the latest CUDA version compatible with your GPU (in the absence of other considerations), as this gives you access to the latest features and at least the best chance that a historically known issue has been addressed.
I doubt these sorts of platitudes are any different regardless of the software stack (e.g. compiler, tools framework, etc.) that you are using. I don't think these considerations are specific or unique to CUDA.
I'm using CUDA 7.0 on a Tesla K20X (C.C. 3.5). Is there any benefit to update to a higher version of CUDA, say 8.0 ?
Are you kidding me? There are enormous benefits. It's a world of difference! Just have a look at the CUDA 8 feature descriptions (Parallel4All blog entry). Specifically,
CUDA 8.0 lets you compile with GCC 5.x instead of 4.x
Not only does that save you a world of pain building your own GCC 4.x (since modern distros often don't package it at all, and it's certainly not the system's default compiler); GCC 5.x also has lots of improvements, not the least of which is full C++14 support for host-side code.
CUDA 8 lets you use C++11 lambdas in device code
(actually, CUDA 7.5 lets you do that and this is rounded off in CUDA 8)
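To make the lambda point concrete, here is a minimal sketch (my own example, not from the answer) of an extended __device__ lambda used with Thrust; it needs the nvcc flag --expt-extended-lambda:

```cpp
// Minimal sketch of an extended (device) lambda, introduced experimentally in
// CUDA 7.5 and rounded off in CUDA 8. Compile with:
//   nvcc --expt-extended-lambda lambda_demo.cu
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/transform.h>

int main() {
    thrust::device_vector<float> x(8, 2.0f), y(8);
    // The __device__ lambda runs on the GPU inside thrust::transform.
    thrust::transform(x.begin(), x.end(), y.begin(),
                      [] __device__ (float v) { return v * v + 1.0f; });
    printf("y[0] = %f\n", (float)y[0]);  // expect 5.0
    return 0;
}
```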
NVCC internal improvements
Not that I can list these, but hopefully NVIDIA continues working on its compiler, equipping it with better optimization logic.
Much faster compilation
NVCC is markedly faster with CUDA 8. It might be up to 2x, but even if it's just 1.5x - that really improves your quality of life as a developer...
Shall I go on? ... all of the above applies regardless of your compute capability. And CC 3.5 or 3.7 is nothing to sneeze at anyway.

In CUDA, does UVA depend on any hardware features?

I know CUDA only got UVA (Unified Virtual Addressing) with version 4.0. But is that only a software feature, or does it require some kind of hardware support (on the GPU side, I mean)?
Notes:
In this GTC 2011 presentation it says a Fermi-class GPU is necessary for P2P copies, but it doesn't say that's necessary for UVA itself.
I know UVA is not a good idea on a 32-bit-CPU system; I don't mean that kind of hardware support.
UVA, which was introduced back in May 2011 with CUDA 4.0, requires a Fermi-class GPU for hardware support, i.e. compute capability 2.0 onwards.
But apparently that's not enough: according to slide #17 of this presentation of the new features of CUDA 4.0, it is only supported on 64-bit platforms (which makes sense, since otherwise you would run out of address space pretty quickly), and with the TCC (Tesla Compute Cluster) driver when on Windows. I'm not sure if this latter limitation still exists, since I have never developed on Windows.
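Whether UVA is actually active on a given device can be checked at runtime. Here is a minimal sketch (my own, not from the original answer) using the unifiedAddressing field of cudaDeviceProp:

```cpp
// Minimal sketch: the runtime exposes whether UVA is active for a device via
// the unifiedAddressing field of cudaDeviceProp (1 on 64-bit Linux, and on
// Windows under the TCC driver, with Fermi-or-newer GPUs).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    for (int d = 0; d < n; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("GPU %d (%s, cc %d.%d): UVA %s\n", d, prop.name,
               prop.major, prop.minor,
               prop.unifiedAddressing ? "enabled" : "disabled");
    }
    return 0;
}
```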

When will OpenCL 1.2 for NVIDIA hardware be available?

I would have asked this question on the NVIDIA developer forum, but since it's still down, maybe someone here can tell me something.
Does anybody know if there is already OpenCL 1.2 support in NVIDIA's driver? If not, is it coming soon?
I don't have a GeForce 600 series card to check myself. According to Wikipedia there are already some cards that could support it, though.
It somewhat seems like NVIDIA does not mention OpenCL a whole lot anymore and just focuses on CUDA C/C++ (see StreamComputing.eu). I guess it makes sense to them, but I would like to see some more OpenCL love.
Thanks
NVIDIA's latest SDK (v4.2.9) does not support OpenCL 1.2 with regard to the header files or library it provides. I considered that this might just be the SDK itself: as you point out, the GeForce 600 series appears to support it in hardware. Unfortunately I don't own any 600 series card, but the OpenCL64.dll supplied with the latest drivers (v306.23) does not export OpenCL 1.2 symbols. Further, I can find no trace of the new symbols (such as "clLinkProgram") as strings in the driver package. Although this does not rule out the possibility of bootstrapping 1.2 functionality in the driver via an ICD loader, there is no evidence that there is a 1.2 implementation there, and this would be undocumented and unsupported.
As to when OpenCL 1.2 will be officially supported by NVidia, unfortunately I don't know the answer to this, and would be equally keen to find out.
In the meantime you might consider an alternative OpenCL 1.2 implementation for development; for example the Intel SDK 2013 Beta (Intel CPU) or the AMD APP SDK v2.7 (AMD CPU or AMD/ATI GPU).
As an aside, personally I am considering switching from NVIDIA GPUs to ATI for production purposes, partly based on AMD's investment in OpenCL and also on arguments comparing "bang for buck" between NVIDIA and the latest AMD cards: NVIDIA vs AMD: GPGPU performance
The NVIDIA hotfix driver version 350.05 (April 2015) adds support for OpenCL 1.2.
With the 350.12 release (also April 2015), NVIDIA has clarified the situation:
With this driver release NVIDIA has also posted a bit more information on their OpenCL 1.2 driver. The driver has not yet passed OpenCL conformance testing over at Khronos, but it is expected to do so. OpenCL 1.2 functionality will only be available on Kepler and Maxwell GPUs, with Fermi getting left behind.
It looks like the 700 series supports OpenCL 1.2.
I'm still looking for which driver I'll need to get that working.
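One way to check what a given driver actually exposes is to query the device's reported OpenCL version at runtime. Here is a minimal sketch (my own, not from the original posts) using the standard clGetDeviceInfo call with CL_DEVICE_VERSION:

```cpp
// Minimal sketch: query the OpenCL version string the installed driver reports
// for each GPU device. Link against -lOpenCL.
#include <cstdio>
#include <CL/cl.h>

int main() {
    cl_platform_id platform;
    cl_uint numPlatforms = 0;
    clGetPlatformIDs(1, &platform, &numPlatforms);
    if (numPlatforms == 0) { fprintf(stderr, "no OpenCL platform found\n"); return 1; }

    cl_device_id devices[8];
    cl_uint numDevices = 0;
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 8, devices, &numDevices);

    for (cl_uint i = 0; i < numDevices; ++i) {
        char version[128];
        clGetDeviceInfo(devices[i], CL_DEVICE_VERSION, sizeof(version), version, NULL);
        printf("device %u: %s\n", i, version);  // e.g. "OpenCL 1.2 CUDA"
    }
    return 0;
}
```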