I'm running an application using the NVML function nvmlDeviceGetPowerUsage().
The problem is that I always get the same number for different applications I'm running on a Tesla M2050.
Any suggestions?
If you read the documentation, you'll discover that there are some qualifiers on whether this function is available:
For "GF11x" Tesla ™and Quadro ®products from the Fermi family.
• Requires NVML_INFOROM_POWER version 3.0 or higher.
For Tesla™ and Quadro® products from the Kepler family.
• Does not require NVML_INFOROM_POWER object.
And:
It is only available if power management mode is supported. See nvmlDeviceGetPowerManagementMode.
I think you'll find that power management mode is not supported on the M2050, and if you run that nvmlDeviceGetPowerManagementMode API call on your M2050 device, you'll get confirmation of that.
The M2050 is neither a Kepler GPU nor a GF11x Fermi GPU. It uses the GF100 Fermi GPU, so it is not covered by this API capability (and the GetPowerManagementMode API call would confirm that).
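By way of illustration, here's a minimal sketch (my own, untested on an M2050) of how you could check both the power management mode and the power reading via NVML; link with -lnvidia-ml:

```c
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    nvmlReturn_t ret = nvmlInit();
    if (ret != NVML_SUCCESS) {
        fprintf(stderr, "nvmlInit failed: %s\n", nvmlErrorString(ret));
        return 1;
    }

    nvmlDevice_t dev;
    ret = nvmlDeviceGetHandleByIndex(0, &dev);  /* device 0; adjust as needed */
    if (ret == NVML_SUCCESS) {
        /* First check whether power management is supported at all */
        nvmlEnableState_t mode;
        ret = nvmlDeviceGetPowerManagementMode(dev, &mode);
        if (ret == NVML_ERROR_NOT_SUPPORTED)
            printf("Power management is not supported on this device\n");
        else if (ret == NVML_SUCCESS)
            printf("Power management mode: %s\n",
                   mode == NVML_FEATURE_ENABLED ? "enabled" : "disabled");

        /* Then try the actual power reading (reported in milliwatts) */
        unsigned int mw;
        ret = nvmlDeviceGetPowerUsage(dev, &mw);
        if (ret == NVML_SUCCESS)
            printf("Power usage: %u mW\n", mw);
        else
            printf("nvmlDeviceGetPowerUsage: %s\n", nvmlErrorString(ret));
    }

    nvmlShutdown();
    return 0;
}
```

On an unsupported device I'd expect the power-management query to return NVML_ERROR_NOT_SUPPORTED, which would explain the constant reading.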
I was wondering if someone might be able to help me figure out whether the new Titan V from NVIDIA supports GPUDirect. As far as I can tell, it seems limited to Tesla and Quadro cards.
Thank you for taking the time to read this.
GPUDirect Peer-to-Peer (P2P) is supported between any 2 "like" CUDA GPUs (of compute capability 2.0 or higher), if the system topology supports it, and subject to other requirements and restrictions. In a nutshell, the system topology requirement is that both GPUs participating must be enumerated under the same PCIE root complex. If in doubt, "like" means identical. Other combinations may be supported (e.g. 2 GPUs of the same compute capability) but this is not specified, or advertised as supported. If in doubt, try it out. Finally, these things must be "discoverable" by the GPU driver. If the GPU driver cannot ascertain these facts, and/or the system is not part of a whitelist maintained in the driver, then P2P support will not be possible.
Note that in general, P2P support may vary by GPU or GPU family. The ability to run P2P on one GPU type or GPU family does not necessarily indicate it will work on another GPU type or family, even in the same system/setup. The final determinant of GPU P2P support is a query of the runtime via cudaDeviceCanAccessPeer. So the statement here "is supported" should not be construed to refer to a particular GPU type. P2P support can vary by system and other factors as well. No statements made here are a guarantee of P2P support for any particular GPU in any particular setup.
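As a concrete illustration of that runtime query, here's a minimal sketch (mine, assuming a system with at least two GPUs enumerated as devices 0 and 1):

```c
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    /* Ask the runtime whether each device can access the other's memory */
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    printf("0 -> 1: %d, 1 -> 0: %d\n", canAccess01, canAccess10);

    if (canAccess01 && canAccess10) {
        /* Enable P2P in both directions; the flags argument must be 0 */
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        /* cudaMemcpyPeer and direct peer loads/stores are now possible */
    }
    return 0;
}
```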
GPUDirect RDMA is only supported on Tesla and possibly some Quadro GPUs.
So, if you had a system that had 2 Titan V GPUs plugged into PCIE slots that were connected to the same root complex (usually, except in Skylake CPUs, it should be sufficient to say "connected to the same CPU socket"), and the system (i.e. core logic) was recognized by the GPU driver, I would expect P2P to work between those 2 GPUs.
I would not expect GPUDirect RDMA to work to a Titan V, under any circumstances.
YMMV. If in doubt, try it out, before making any large purchasing decisions.
I have a very simple Toshiba laptop with an i3 processor. Also, I do not have any expensive graphics card. In the display settings, I see Intel(HD) Graphics as the display adapter. I am planning to learn some CUDA programming. But I am not sure if I can do that on my laptop, as it does not have any NVIDIA CUDA-enabled GPU.
In fact, I doubt I even have a GPU o_o
So, I would appreciate it if someone could tell me whether I can do CUDA programming with the current configuration and, if possible, also let me know what Intel(HD) Graphics means.
At the present time, Intel graphics chips do not support CUDA. It is possible that, in the near future, these chips will support OpenCL (which is a standard very similar to CUDA), but this is not guaranteed, and their current drivers do not support OpenCL either. (There is an Intel OpenCL SDK available, but, at the present time, it does not give you access to the GPU.)
The newest Intel processors (Sandy Bridge) have a GPU integrated into the CPU core. Your processor may be a previous-generation version, in which case "Intel(HD) graphics" is an independent chip.
The Portland Group has a commercial product called CUDA-x86. It is a hybrid compiler that takes CUDA C/C++ code and can either run it on a GPU or use SIMD on the CPU; this is done fully automatically, without any intervention from the developer. Hope this helps.
Link: http://www.pgroup.com/products/pgiworkstation.htm
If you're interested in learning a language that supports massive parallelism, you'd be better off with OpenCL, since you don't have an NVIDIA GPU. You can run OpenCL on Intel CPUs, but at best you can learn to program SIMD units.
Optimization on a CPU and on a GPU are different things. I really don't think you can use an Intel card for GPGPU.
Intel HD Graphics is usually the on-CPU graphics chip in newer Core i3/i5/i7 processors.
As far as I know it doesn't support CUDA (which is a proprietary NVIDIA technology), but OpenCL is supported by NVIDIA, ATI, and Intel.
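If you want to see what your particular machine exposes, a small sketch along these lines (mine, not specific to your laptop) enumerates the OpenCL platforms and their device counts; link with -lOpenCL:

```c
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    /* Query up to 8 OpenCL platforms (Intel, NVIDIA, AMD, ...) */
    cl_platform_id platforms[8];
    cl_uint nplat = 0;
    clGetPlatformIDs(8, platforms, &nplat);

    for (cl_uint i = 0; i < nplat; ++i) {
        char name[256];
        clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME,
                          sizeof(name), name, NULL);

        /* Count all devices (CPU and GPU) on this platform */
        cl_uint ndev = 0;
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &ndev);
        printf("Platform %u: %s (%u device(s))\n", i, name, ndev);
    }
    return 0;
}
```

On a machine like the one described, you'd likely see only a CPU device (if any), confirming that GPU-side OpenCL isn't available.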
In 2020, ZLUDA was created, which provides a CUDA API for Intel GPUs. It is not production-ready yet, though.
I'm working on a Fujitsu machine. It has 2 GPUs installed: a Quadro 2000 and a Tesla C2075. The Quadro GPU has 1 GB of RAM and the Tesla GPU has 5 GB (I checked using the output of nvidia-smi -q). When I run nvidia-smi, the output shows 2 GPUs, but the Tesla one's display is shown as off.
I'm running a memory-intensive program and would like to use the 5 GB of RAM available, but whenever I run a program, it seems to be using the Quadro GPU.
Is there some way to use a particular GPU out of the two in a program? Does the Tesla GPU being "disabled" mean its drivers are not installed?
You can control access to CUDA GPUs either using the environment or programmatically.
You can use the environment variable CUDA_VISIBLE_DEVICES to specify a list of one or more GPUs that will be visible to any application, as well as their order of visibility. For example, if nvidia-smi reports your Tesla GPU as GPU 1 (and your Quadro as GPU 0), then you can set CUDA_VISIBLE_DEVICES=1 to enable only the Tesla to be used by CUDA code.
See my blog post on the subject.
To control which GPU your application uses programmatically, you should use the device management API of CUDA. Query the number of devices using cudaGetDeviceCount, then query each device's properties using cudaGetDeviceProperties and call cudaSetDevice to select the device that fits your application's criteria. You can also use cudaChooseDevice to select the device that most closely matches the device properties you specify.
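Here's a minimal sketch of that programmatic approach (my own illustration, not authoritative): it picks the device with the most global memory, which on your machine should select the 5 GB Tesla over the 1 GB Quadro:

```c
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    /* Walk all devices and remember the one with the most global memory
       (equivalent in spirit to running with CUDA_VISIBLE_DEVICES=1) */
    int best = 0;
    size_t bestMem = 0;
    for (int d = 0; d < count; ++d) {
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("Device %d: %s, %zu MB\n", d, prop.name,
               prop.totalGlobalMem / (1024 * 1024));
        if (prop.totalGlobalMem > bestMem) {
            bestMem = prop.totalGlobalMem;
            best = d;
        }
    }

    cudaSetDevice(best);  /* all subsequent CUDA calls use this device */
    return 0;
}
```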
I learnt that nvidia-smi -ac can be used to change the clock rate of GPU cores and memory. Is nvidia-smi built upon the NVML library? What is its equivalent in NVML? I checked the document http://cyber.sibsutis.ru:82/GPGPU/sdk/CUDA_TOOLKIT/nvml.pdf but could only see APIs that get the values of clock rates rather than set them.
Thanks
Yes, nvidia-smi is built on the NVML library.
According to the latest NVML API documentation available here (which is linked from the site I previously suggested to you here), the "Set Application Clocks" command is supported on Tesla K10 and K20 GPUs (page 6). I believe it is also supported on "Kepler" members of the Quadro family, such as the Quadro K5000.
If you have a Tesla K10, K20, or K20X GPU, the Set Application Clocks command is described on p68, which I am also reproducing here for convenience:
7.12.2.2 nvmlReturn_t DECLDIR nvmlDeviceSetApplicationsClocks (nvmlDevice_t device, unsigned int memClockMHz, unsigned int graphicsClockMHz)

Set clocks that applications will lock to.

Sets the clocks that compute and graphics applications will be running at. E.g. the CUDA driver requests these clocks during context creation, which means this property defines the clocks at which CUDA applications will be running unless some overspec event occurs (e.g. over power, over thermal or external HW brake).

Can be used as a setting to request constant performance.

For Tesla™ products, and Quadro® products from the Kepler family. Requires root/admin permissions.

See nvmlDeviceGetSupportedMemoryClocks and nvmlDeviceGetSupportedGraphicsClocks for details on how to list available clock combinations.

After system reboot or driver reload, applications clocks go back to their default value.

Parameters:
• device: The identifier of the target device
• memClockMHz: Requested memory clock in MHz
• graphicsClockMHz: Requested graphics clock in MHz

Returns:
• NVML_SUCCESS if new settings were successfully set
• NVML_ERROR_UNINITIALIZED if the library has not been successfully initialized
• NVML_ERROR_INVALID_ARGUMENT if device is invalid or memClockMHz and graphicsClockMHz is not a valid clock combination
• NVML_ERROR_NO_PERMISSION if the user doesn't have permission to perform this operation
• NVML_ERROR_NOT_SUPPORTED if the device doesn't support this feature
• NVML_ERROR_UNKNOWN on any unexpected error
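Putting that together, a minimal sketch (mine; requires root/admin and an NVML version with these entry points) might look like this. The clock values are placeholders you'd replace with a combination your GPU actually reports:

```c
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    nvmlDevice_t dev;
    if (nvmlInit() != NVML_SUCCESS) return 1;

    if (nvmlDeviceGetHandleByIndex(0, &dev) == NVML_SUCCESS) {
        /* List the supported memory clocks first */
        unsigned int memClocks[32], n = 32;
        if (nvmlDeviceGetSupportedMemoryClocks(dev, &n, memClocks) == NVML_SUCCESS)
            for (unsigned int i = 0; i < n; ++i)
                printf("Supported memory clock: %u MHz\n", memClocks[i]);

        /* Hypothetical values: replace 2600/758 with a combination reported
           by nvmlDeviceGetSupportedMemoryClocks/GraphicsClocks */
        nvmlReturn_t ret = nvmlDeviceSetApplicationsClocks(dev, 2600, 758);
        printf("SetApplicationsClocks: %s\n", nvmlErrorString(ret));
    }

    nvmlShutdown();
    return 0;
}
```

From the command line, nvidia-smi -ac 2600,758 would request the same thing, which is consistent with nvidia-smi being built on NVML.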
I have a problem running the samples provided by NVIDIA in their GPU Computing SDK (there's a library of compiled sample codes).
For CUDA I get the message "No CUDA-capable device is detected"; for OpenCL there's an error from the function that should find OpenCL-capable units.
I have installed all three parts from NVIDIA needed to develop with OpenCL: the devdriver for Win7 64-bit v.301.27, CUDA Toolkit 4.2.9, and GPU Computing SDK 4.2.9.
I think this might have to do with the Optimus technology that reroutes output from the NVIDIA GPU to the Intel one to render things (this notebook also has an Intel HD 3000 accelerator), but in the NVIDIA Control Panel I set it to use the high-performance NVIDIA GPU, set the power profile to prefer maximum performance, and for PhysX I changed from automatic selection to the NVIDIA processor. Nothing has changed though; those samples won't run (not even those targeted at GF8000 cards).
I would like to play around with OpenCL and see what it is capable of, but without the ability to test things it's useless. I have found some info about this on forums, but it was mostly about Linux users, who need Bumblebee to access the NVIDIA GPU. There's no such problem on Windows, however; drivers are better, so you can access it without dark magic (or so I thought until I found this problem).
My laptop has a GeForce 540M as well, in an Optimus configuration since my Sandy Bridge CPU also has Intel's integrated graphics. To run CUDA codes, I have to:
1. Install the NVIDIA driver.
2. Go to the NVIDIA Control Panel.
3. Click 3D Settings -> Manage 3D Settings -> Global Settings.
4. In the Preferred graphics processor drop-down, select "High-performance NVIDIA processor".
5. Apply the settings.
Note that the instructions above apply the settings for all applications, so you don't have to worry about CUDA errors any more. But it will drain more battery.
Here is a video recap as well. Good luck!
OK, this has proven to be a totally crazy solution. I was wondering whether something was hooking in between the hardware and the application, and the only thing that came to mind was AV software. I'm using Comodo with the sandbox and Defense+ on, and after turning them off I could run all those samples. What's more, only Defense+ needs to be turned off.
Now I just wonder how many apps could have been blocked from accessing that GPU..
That's most likely because of the architecture of Optimus, so I'd suggest you read the NVIDIA CUDA Developer Guide for NVIDIA Optimus Platforms, especially the section "Querying for a CUDA Device", which addresses this issue, I believe.
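For illustration, a check along the lines the guide suggests might look like this minimal sketch (mine, not taken from the guide):

```c
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        /* On Optimus systems this is where "No CUDA-capable device is
           detected" shows up if the NVIDIA GPU isn't being exposed */
        fprintf(stderr, "No CUDA device: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", count);
    return 0;
}
```

Running a check like this before anything else makes it much easier to tell whether the GPU is hidden by Optimus (or, as it turned out here, by security software) rather than by a driver problem.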