How can I get RaspberryPI CPU temperature on IoT Core? - windows-10-iot-core

Using the Sense HAT, the temperature readings come back noticeably higher than the ambient temperature.
Is there a way to get the CPU temperature on the Raspberry Pi under Windows 10 IoT Core so I can compensate for it?
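A commonly used heuristic for Sense HAT boards (not specific to Windows 10 IoT Core) is to pull the sensor reading back toward ambient by subtracting a scaled difference between the CPU temperature and the sensor temperature. A minimal Python sketch of that correction, assuming both readings are already available; the factor is purely empirical and needs calibrating against a reference thermometer:

    def compensate(sensor_temp_c, cpu_temp_c, factor=2.0):
        # factor is an empirical fudge value; calibrate it for your enclosure
        # by comparing against a trusted thermometer.
        return sensor_temp_c - (cpu_temp_c - sensor_temp_c) / factor

    # Example: the sensor reads 34 C while the CPU reports 55 C
    print(compensate(34.0, 55.0))   # ~23.5 C with the default factor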

Related

Too much time to train one epoch

I use a workstation with an RTX 3060 12GB GPU, 16GB of DDR4 RAM, and an Intel Core i5 10400F CPU. I also mounted an external HDD and ran the script p2ch11.prepcache from the repository referred to below in order to cache the data. I tried from zero to 8 workers and various batch sizes ranging from 32 to 1024, yet it still takes approximately 13.5 hours to train for one epoch (with batch size 1024 and 4 workers). I still haven't figured out what's wrong; it looks like I cannot utilize the GPU for some reason.
Code pulled from the repository: https://github.com/deep-learning-with-pytorch/dlwpt-code
-> p2ch11.training.py (https://github.com/deep-learning-with-pytorch/dlwpt-code/blob/master/p2ch11/training.py)
The images are large, so you need to do some preprocessing first; I think that will help.
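If the GPU looks unused, one quick sanity check (not part of the original answer) is to confirm that both the model and every batch are actually moved to the CUDA device and that the DataLoader uses pinned memory. A minimal Python sketch with a stand-in dataset and model, not the code from the linked repository:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def main():
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        print("Training on:", device)   # if this prints "cpu", the GPU is never touched

        # Stand-in data and model, just to illustrate the device handling.
        ds = TensorDataset(torch.randn(4096, 16), torch.randint(0, 2, (4096,)))
        loader = DataLoader(ds, batch_size=1024, num_workers=4,
                            pin_memory=(device.type == "cuda"))  # pinned pages speed up host-to-device copies

        model = torch.nn.Linear(16, 2).to(device)
        for x, y in loader:
            # If batches are never moved to `device`, everything silently runs on the CPU.
            x = x.to(device, non_blocking=True)
            y = y.to(device, non_blocking=True)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            print("batch loss:", loss.item())

    if __name__ == "__main__":   # required when num_workers > 0 on Windows/macOS
        main()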

Using nvidia-smi what is the best strategy to capture power

I am using a Tesla K20c and measuring power with nvidia-smi as my application runs. My problem is that power consumption does not reach a steady state but keeps rising. For example, if my application runs for 100 iterations, power reaches 106 W (in 4 seconds); for 1000 iterations, 117 W (in 41 seconds); for 10000 iterations, 122 W (in 415 seconds); and so on, increasing slightly every time. I am looking for a recommendation on which power value I should record. In my experimental setup I have over 400 experiments, and running each one for 10000 iterations is not feasible, at least for now. The application is matrix multiplication, which is doable in just one iteration taking only a few milliseconds. Increasing the number of iterations does not add any value to the results, but it increases the run time, allowing power monitoring.
The reason you are seeing power consumption increase over time is that the GPU is heating up under a sustained load. Electronic components draw more power at increased temperature mostly due to an increase in Ohmic resistance. In addition, the Tesla K20c is an actively cooled GPU: as the GPU heats up, the fan on the card spins faster and therefore requires more power.
I have run experiments on a K20c that were very similar to yours, out to about 10 minutes. I found that the power draw plateaus after 5 to 6 minutes, and that there are only noise-level oscillations of +/-2 W after that. These may be due to hysteresis in the fan's temperature-controlled feedback loop, or to short-term fluctuations from incomplete utilization of the GPU at the end of every kernel. Differences in power draw due to fan speed were about 5 W. The reason it takes fairly long for the GPU to reach steady state is the heat capacity of the entire assembly, which has quite a bit of mass, including a solid metal back plate.
Your measurements seem to be directed at determining the relative power consumption when running with 400 different variants of the code. It does not seem critical that steady-state power consumption is achieved, just that the conditions under which each variant is tested are as equal as is practically achievable. Keep in mind that the GPU's power sensors are not designed to provide high-precision measurements, so for comparison purposes you would want to assume a noise level on the order of 5%. For an accurate comparison you may even want to average measurements from more than one GPU of the same type, as manufacturing tolerances could cause variations in power draw between multiple "identical" GPUs.
I would therefore suggest the following protocol: Run each variant for 30 seconds, measuring power consumption close to the end of that interval. Then let the GPU idle for 30 seconds to let it cool down before running the next kernel. This should give roughly equal starting conditions for each variant. You may need to lengthen the proposed idle time a bit if you find that the temperature stays elevated for a longer time. The temperature data reported by nvidia-smi can guide you here. With this process you should be able to complete the testing of 400 variants in an overnight run.
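A sketch of how this protocol could be scripted around nvidia-smi's query interface; the launch_variant callable and the GPU index are placeholders, not part of the original answer, and the timings simply follow the 30-second run / 30-second idle suggestion above:

    import subprocess
    import time

    def read_power_and_temp(gpu_index=0):
        """Poll nvidia-smi once; returns (watts, degrees C)."""
        out = subprocess.check_output([
            "nvidia-smi", f"--id={gpu_index}",
            "--query-gpu=power.draw,temperature.gpu",
            "--format=csv,noheader,nounits",
        ]).decode()
        watts, celsius = (float(v) for v in out.strip().split(","))
        return watts, celsius

    RUN_SECONDS = 30    # measurement window per variant
    IDLE_SECONDS = 30   # cool-down so each variant starts from similar conditions

    def measure_variant(launch_variant):
        """launch_variant is a hypothetical callable that starts one kernel variant
        asynchronously and keeps the GPU busy for at least RUN_SECONDS."""
        launch_variant()
        time.sleep(RUN_SECONDS - 5)
        # Average a few samples near the end of the window to reduce sensor noise.
        samples = [read_power_and_temp()[0] for _ in range(5)]
        time.sleep(IDLE_SECONDS)
        return sum(samples) / len(samples)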

Latency for Geocoding API

I am planning to use the Google Geocoding API and was wondering what latency I should expect in getting the response back. I cannot find these details on the website.
Is anyone aware of the actual latency when using the Google Geocoding API?
That is, how much time it will take to get the response back from the Geocoding API.
We have a live app in the Play Store and we get roughly 120-150 hits per hour. Our median latency is around 210 ms and our 98th-percentile latency is 510 ms.
We have an application 24x7 with ~2 requests per second.
Median: 197.08 ms
98th percentile (slowest 2%): 490.54 ms
It could be a significant bottleneck for your application, so use some strategies to mitigate it (see the sketch after this list):
Memory cache
Secondary cache
Batch persistence
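As an illustration of the in-memory cache idea, a minimal Python sketch that wraps a Geocoding API call in an LRU cache so repeated addresses skip the network round trip; the API key and cache size are placeholders, and the requests library is assumed to be installed:

    import functools
    import requests

    GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
    API_KEY = "YOUR_API_KEY"   # placeholder

    @functools.lru_cache(maxsize=10_000)   # in-process memory cache
    def geocode(address):
        """Repeated lookups of the same address are served from memory."""
        resp = requests.get(GEOCODE_URL,
                            params={"address": address, "key": API_KEY},
                            timeout=2)
        resp.raise_for_status()
        return resp.json()

    # First call pays the ~200-500 ms round trip; identical calls afterwards are local.
    geocode("1600 Amphitheatre Parkway, Mountain View, CA")
    geocode("1600 Amphitheatre Parkway, Mountain View, CA")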

CUDA: Host to Device bandwidth greater than peak b/w of PCIe?

I had used the same plot, attached, for another question. One can see that the peak bandwidth is more than 5.5 GB/s. I am using NVIDIA's bandwidthTest program from the code samples to find the bandwidth between host and device and vice versa.
The system consists of a total of 12 Intel Westmere CPUs on two sockets and 4 Tesla C2050 GPUs in 4 PCIe Gen2 x16 slots. Now the question is: since the peak bandwidth of PCIe x16 Gen2 is 4 GB/s in one direction, how come I am getting much more bandwidth when doing host-to-device transfers?
I have in mind that each PCIe is connected to the CPU via an I/O Controller Hub, which is connected through QPI (much more b/w) to the CPU.
The peak bandwidth of PCIe x16 Gen2 is 8GB/s in each direction. You are not exceeding the peak.
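The 8 GB/s per-direction figure follows from the link parameters; a quick back-of-the-envelope check (plain arithmetic, not output from bandwidthTest):

    # PCIe Gen2, x16 link, one direction
    transfer_rate_gt_s = 5.0      # 5 GT/s per lane
    encoding_efficiency = 8 / 10  # 8b/10b line coding on Gen1/Gen2
    lanes = 16
    bits_per_byte = 8

    gb_per_s = transfer_rate_gt_s * encoding_efficiency * lanes / bits_per_byte
    print(gb_per_s)   # 8.0 GB/s per direction, before protocol overhead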

How is GPU and memory utilization defined in nvidia-smi results?

I am currently using 'nvidia-smi', a tool shipped with NVIDIA's driver, for performance monitoring on the GPU. When we run 'nvidia-smi -a', it gives the current GPU information, including GPU core and memory usage, temperature and so on, like this:
    ==============NVSMI LOG==============
    Timestamp            : Tue Feb 22 22:39:09 2011
    Driver Version       : 260.19.26
    GPU 0:
        Product Name         : GeForce 8800 GTX
        PCI Device/Vendor ID : 19110de
        PCI Location ID      : 0:4:0
        Board Serial         : 211561763875
        Display              : Connected
        Temperature          : 55 C
        Fan Speed            : 47%
        Utilization
            GPU              : 1%
            Memory           : 0%
I am curious how the GPU and memory utilization are defined. For example, if the GPU core utilization is 47%, does it mean that 47% of the SMs are actively working, or that all the GPU cores are busy 47% of the time and idle the other 53%? For memory, does the utilization stand for the ratio between current bandwidth and maximum bandwidth, or for the fraction of time the memory was busy during the last time unit?
A post by a moderator on the NVIDIA forums says the GPU utilization and memory utilization figures are based on activity over the last second:
GPU busy is actually the percentage of time over the last second the SMs were busy, and the memory utilization is actually the percentage of bandwidth used during the last second. Full memory consumption statistics come with the next release.
You can refer to this official API document: http://docs.nvidia.com/deploy/nvml-api/structnvmlUtilization__t.html#structnvmlUtilization__t
It says: "Percent of time over the past sample period during which one or more kernels was executing on the GPU."
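To read the same counters programmatically, a small sketch using the NVML Python bindings (assuming the nvidia-ml-py package is installed; the fields mirror the nvmlUtilization_t struct from the linked documentation):

    # pip install nvidia-ml-py
    from pynvml import (nvmlInit, nvmlShutdown,
                        nvmlDeviceGetHandleByIndex, nvmlDeviceGetUtilizationRates)

    nvmlInit()
    handle = nvmlDeviceGetHandleByIndex(0)
    util = nvmlDeviceGetUtilizationRates(handle)   # corresponds to nvmlUtilization_t
    print(f"GPU busy over the last sample period:    {util.gpu}%")
    print(f"Memory read/write busy over that period: {util.memory}%")
    nvmlShutdown()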