What exactly is 'Google Chrome Helper (GPU)'?

What exactly is 'Google Chrome Helper (GPU)', and (aside from having two chrome instances open with ~15 tabs in each), is there anything I can do to reduce its memory use? Is its (high) memory use affected by me not having a dedicated GPU?

What exactly is 'Google Chrome Helper (GPU)'
Chrome Helper processes are Chrome's sub-processes: each one can back a tab, an extension, a subframe, or a utility process. The '(GPU)' helper specifically is Chrome's GPU process, which handles graphics work (compositing, rasterization) on behalf of every open tab.
and (aside from having two chrome instances open with ~15 tabs in each), is there anything I can do to reduce its memory use?
Close tabs and use a single instance of Chrome.
Is its (high) memory use affected by me not having a dedicated GPU?
nope
Answer to a follow-up question about determining the cost of specific helpers:
Open Chrome Task Manager via Shift + Esc and look for tasks that are consuming high memory.

Related

Chrome memory measurement now almost flat for longer test runs

In order to check our web application for memory leaks, I run a machine which does the following:
it runs automated End-to-End tests over (almost) the entire application in Chrome
after each block of tests, it goes to a state of the web application where almost nothing happens
it triggers gc(); for garbage collection
it saves totalJSHeapSize, and usedJSHeapSize to a file
it plots out the results for each test run to a graph
That way, we can see how much the memory increases and which parts of the application are problematic: at some points the memory increases, at others it decreases.
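For illustration, a measurement loop of this kind could be driven roughly like the following sketch (Selenium with Python; the flag, URL, timings, and test steps are assumptions, not the setup actually used here). It relies on Chrome being launched with --js-flags=--expose-gc so that window.gc() exists.

```python
# Sketch of a heap-measurement loop: run a test block, let the app go quiet,
# force GC, then record totalJSHeapSize and usedJSHeapSize to a file.
import json
import time
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--js-flags=--expose-gc")     # makes window.gc() available
driver = webdriver.Chrome(options=options)
driver.get("https://example.com/app")              # placeholder for the application under test

samples = []
for block in range(10):                            # placeholder for the real end-to-end test blocks
    # ... run one block of tests here ...
    time.sleep(1)                                  # go to a quiet state of the application
    heap = driver.execute_script(
        "window.gc();"
        " return { total: performance.memory.totalJSHeapSize,"
        "          used:  performance.memory.usedJSHeapSize };"
    )
    samples.append(heap)

with open("heap-samples.json", "w") as f:          # saved to a file for later plotting
    json.dump(samples, f)
driver.quit()
```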
Till yesterday, it looked like this:
Bright red (upper line): totalJSHeapSize, light red (lower line): usedJSHeapSize
Yesterday, I updated Chrome to version 69. And now the chart looks quite different:
The amount of memory used (usedJSHeapSize) at the start and at the end is almost the same. But as you can clearly see, the way it changes over the course of the test run (approx. 1.5 h) is quite different.
My questions are now:
Is this a change in reality or in measurement? I.e., did Chrome change its memory handling, or just the way it reports memory values via totalJSHeapSize and usedJSHeapSize?
Concerning memory leaks, is this good news or bad news for me? Before, I had dozens of spots where memory increased; now I have just three. Is that real, or are the memory leaks in the now-flat areas still there, just hidden?
I'm also thankful for any background information on how Chrome changed its memory measurement.
Some additional info:
The VM runs under KUbuntu 18.04
It's a single web page application done with AngularJS 1.6
The outcome of the memory measurement is quite stable - both before and after the update of Chrome
EDIT:
It seems this was a bug in Chrome version 69. At least, after an update to Chrome 70, this strange behavior is gone and everything looks almost as it did before.
I don't think you should be worried about it. This can happen due to the memory manager used inside Chrome. You didn't mention the Chrome version behind your first memory graph, so it is possible that the memory manager differs between the two versions. Chrome was using TCMalloc, which takes a large chunk of memory from the OS and manages it; once TCMalloc runs short of memory, it asks the OS for another big chunk and starts managing that as well. So the later graph you are seeing has fewer ups and downs (but bigger ones than before) because of that. Hope this answers your query.
As you mentioned that
The outcome of the memory measurement is quite stable - both before and after the update of Chrome
you don't really need to worry about it; the way Chrome allocated memory previously and the way it does in the new version are simply different (possibly a different memory manager), that's all.

Memory is bloated but freed after closing chrome developer tools

I'm fighting a leak on a page, which makes large AJAX requests and replaces page content once in several seconds.
If I open the page in Chrome Dev Tools, I see that memory usage grows over time (from one memory snapshot to another). There are no explicit leaks on the page (the three-snapshot technique shows there are none).
According to Chrome Dev Tools, the memory accumulates under GC Roots -> "Global handles", in some map. As I said, closing Chrome Dev Tools results in this memory disappearing (i.e. memory usage for the page drops from 600+ MB to 40 MB). At the same time, pressing "GC" does not help; the memory remains in place.
But if the page remains open for several hours, it may eat gigabytes and become unresponsive.
Google Chrome version is Version 60.0.3112.90 (Official Build) (64-bit) on MacOS 10.12.6.
Any hint on how to fix/avoid such memory bloat is appreciated.

Recording memory leak in Google Chrome using Ionic in browser

How can I record memory leaks in Google Chrome similar to what is being done in the link below?
https://github.com/driftyco/ionic/issues/1096
I have an Ionic app that runs embedded video; after intense clicking back and forth across more than 10 pages, it crashes. The pages viewed are embedded MP4s, and I suspect there is some memory leakage like that described in the link above. I just need to find a way to test it.
Following the post below from Ant, here is the memory log from Google Canary
http://i.imgur.com/QrwTNwe.jpg. Do the nodes and listeners look unusual?
Get Chrome Canary, then open Developer Tools and click on Profiles.
Using the tools there, you can take heap snapshots and compare memory allocations between snapshots to see what is staying in memory, or you can record heap allocations, which captures memory allocation in real time on a timeline, so you can dig in and find where memory is not being released.
https://developer.chrome.com/devtools/docs/javascript-memory-profiling
There are some very good guides on the technicalities of doing the above if you google how to find memory leaks.

chrome dev-tools network throttling seems slower than setting

When using chrome dev tools, the network throttling functionality seems to simulate a slower connection than the kb/s down setting defines.
For example, when simulating with the 50 kb/s GPRS preset and downloading a 256 kB file, Chrome shows the content download taking a total of 42.89 sec. Yet 256 / 50 would come to 5.12 seconds. Am I missing something here?
Internet connection speeds are measured in kilobits per second, not kilobytes per second. That explains the 8x difference between the value you got and what you expected.
Here's another example, downloading the 181-kilobyte Stack Overflow sprites file.
50 kbit/s is 6.25 KB/s. We'd expect the download to take 181 KB / (6.25 KB/s) = 28.96 s, which closely matches the actual value of 28.83 s.
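As a quick sanity check, the same unit conversion in a few lines of Python (file sizes taken from the two examples above):

```python
def expected_download_seconds(file_kilobytes: float, speed_kilobits_per_sec: float) -> float:
    """Convert the throttle setting from kilobits/s to kilobytes/s, then divide."""
    speed_kilobytes_per_sec = speed_kilobits_per_sec / 8.0
    return file_kilobytes / speed_kilobytes_per_sec

print(expected_download_seconds(256, 50))  # ~40.96 s, close to the observed 42.89 s
print(expected_download_seconds(181, 50))  # ~28.96 s, close to the observed 28.83 s
```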

CUDA apps time out & fail after several seconds - how to work around this?

I've noticed that CUDA applications tend to have a rough maximum run-time of 5-15 seconds before they will fail and exit out. I realize it's ideal to not have CUDA application run that long but assuming that it is the correct choice to use CUDA and due to the amount of sequential work per thread it must run that long, is there any way to extend this amount of time or to get around it?
I'm not a CUDA expert; I've been developing with the AMD Stream SDK, which AFAIK is roughly comparable.
You can disable the Windows watchdog timer, but that is strongly discouraged, for reasons that should be obvious.
To disable it, use regedit: under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Watchdog\Display, create a REG_DWORD named DisableBugCheck and set it to 1.
You may also need to do something in the NVidia control panel. Look for some reference to "VPU Recovery" in the CUDA docs.
Ideally, you should be able to break your kernel operations up into multiple passes over your data, so that each pass runs within the time limit.
Alternatively, you can divide the problem domain so that each command computes fewer output pixels. I.e., instead of computing 1,000,000 output pixels in one fell swoop, issue 10 commands to the GPU to compute 100,000 each.
The basic unit that has to fit within the time slice is not your entire application, but the execution of a single command buffer. In the AMD Stream SDK, a long sequence of operations can be broken up into multiple time slices by explicitly flushing the command queue with a CtxFlush() call. Perhaps CUDA has something similar?
You should not have to read all of your data back and forth across the PCIe bus on every time slice; you can leave your textures, etc., in GPU local memory; you just have to let some command buffers complete occasionally, to prove to the OS that you're not stuck in an infinite loop.
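For CUDA, the chunking idea might look roughly like the following sketch (written with Numba's CUDA bindings for Python purely for brevity; the kernel body, problem size, and chunk length are illustrative):

```python
# Split the output domain across many short kernel launches so that no single
# launch runs long enough to trip the display driver's watchdog.
import numpy as np
from numba import cuda

@cuda.jit
def compute_pixels(out, offset, count):
    i = cuda.grid(1)                 # index within this launch
    j = offset + i                   # index within the full problem
    if i < count:
        out[j] = j * 0.5             # stand-in for the real per-pixel work

n_total = 1_000_000                  # total output elements ("pixels")
chunk   = 100_000                    # outputs per launch; sized to finish well under the watchdog limit
threads = 256
d_out   = cuda.device_array(n_total, dtype=np.float32)

for offset in range(0, n_total, chunk):
    count  = min(chunk, n_total - offset)
    blocks = (count + threads - 1) // threads
    compute_pixels[blocks, threads](d_out, offset, count)
    cuda.synchronize()               # each launch completes within the time slice

result = d_out.copy_to_host()        # the data stays on the GPU between launches
```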
Finally, GPUs are fast, so if your application is not able to do useful work in that 5 or 10 seconds, I'd take that as a sign that something is wrong.
[EDIT Mar 2010 to update:] (outdated again, see the updates below for the most recent information) The registry key above is out-of-date. I think that was the key for Windows XP 64-bit. There are new registry keys for Vista and Windows 7. You can find them here: http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx
or here: http://msdn.microsoft.com/en-us/library/ee817001.aspx
[EDIT Apr 2015 to update:] This is getting really out of date. The easiest way to disable TDR for Cuda programming, assuming you have the NVIDIA Nsight tools installed, is to open the Nsight Monitor, click on "Nsight Monitor options", and under "General" set "WDDM TDR enabled" to false. This will change the registry setting for you. Close and reboot. Any change to the TDR registry setting won't take effect until you reboot.
[EDIT August 2018 to update:]
Although the NVIDIA tools allow disabling the TDR now, the same question is relevant for AMD/OpenCL developers. For those: The current link that documents the TDR settings is at https://learn.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys
On Windows, the graphics driver has a watchdog timer that kills any shader programs that run for more than 5 seconds. Note that the Xorg/XFree86 drivers don't do this, so one possible workaround is to run the CUDA apps on Linux.
AFAIK it is not possible to disable the watchdog timer on Windows. The only way to get around this on Windows is to use a second card that has no displayed screens on it. It doesn't have to be a Tesla but it must have no active screens.
Resolve Timeout Detection and Recovery - WINDOWS 7 (32/64 bit)
Create a registry key in Windows to change the TDR settings to a higher amount, so that Windows will allow a longer delay before the TDR process starts.
Open regedit from Run or a command prompt.
In Windows 7, navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers to create the new key. There will probably already be one key in there called DxgKrnlVersion, as a DWORD.
Right-click and create a new REG_DWORD named TdrDelay. The value assigned to it is the number of seconds before TDR kicks in; Windows currently defaults to 2 (even though the registry value doesn't exist until you create it). Assign it a new value (I tried 4 seconds), which doubles the time before TDR. Then restart the PC; the value won't take effect until you do.
Source: Win7 TDR (Driver Timeout Detection & Recovery)
I have also verified this and it works fine.
The most basic solution is to pick a point in the calculation, some percentage of the way through, that I am sure the GPU I am working with is able to complete in time, save all the state information, stop, and then start again.
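A hedged sketch of that checkpoint-and-resume idea, again using Numba's CUDA bindings (the kernel, checkpoint file name, and step counts are made up for illustration):

```python
# Run a bounded amount of work per launch, persist the state to disk, and
# resume from the last checkpoint on the next run of the program.
import os
import numpy as np
from numba import cuda

@cuda.jit
def advance(state, steps):
    i = cuda.grid(1)
    if i < state.size:
        for _ in range(steps):
            state[i] += 1.0            # stand-in for the real per-element update

CHECKPOINT = "checkpoint.npz"          # hypothetical checkpoint file
N, TOTAL_STEPS, STEPS_PER_LAUNCH = 1 << 20, 10_000, 100

if os.path.exists(CHECKPOINT):         # resume from the last saved point
    saved = np.load(CHECKPOINT)
    state, done = saved["state"], int(saved["done"])
else:
    state, done = np.zeros(N, dtype=np.float32), 0

threads = 256
blocks  = (N + threads - 1) // threads
while done < TOTAL_STEPS:
    d_state = cuda.to_device(state)
    advance[blocks, threads](d_state, STEPS_PER_LAUNCH)   # short enough for the watchdog
    state = d_state.copy_to_host()
    done += STEPS_PER_LAUNCH
    np.savez(CHECKPOINT, state=state, done=done)          # safe to stop and start again here
```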
Update:
For Linux: Exiting X will allow you to run CUDA applications as long as you want. No Tesla required (A 9600 was used in testing this)
One thing to note, however, is that if X is never entered, the drivers probably won't be loaded, and it won't work.
It also seems that, on Linux, simply not having any X displays up at the time will work as well, so X does not need to be exited as long as you switch to a non-X full-screen terminal (virtual console).
This isn't possible. The time-out is there to prevent bugs in calculations from taking up the GPU for long periods of time.
If you use a dedicated card for CUDA work, the time limit is lifted. I'm not sure if this requires a Tesla card, or if a GeForce with no monitor connected can be used.
The solution I use is:
1. Pass all information to device.
2. Run iterative versions of algorithms, where each iteration invokes the kernel on the memory already stored within the device.
3. Finally transfer memory to host only after all iterations have ended.
This enables control over the iterations from the CPU (including the option to abort), without the costly device<->host memory transfers between iterations.
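A minimal sketch of that three-step pattern, once more with Numba's CUDA bindings (the kernel, iteration count, and abort check are illustrative):

```python
# Keep the working set resident on the device and relaunch a short kernel per
# iteration; copy the result back only once, after the last iteration.
import numpy as np
from numba import cuda

@cuda.jit
def step(data):
    i = cuda.grid(1)
    if i < data.size:
        data[i] = data[i] * 0.99 + 0.01    # stand-in for one iteration of the real algorithm

def should_abort():
    return False                           # hypothetical host-side abort condition

data = np.random.rand(1 << 20).astype(np.float32)

# 1. Pass all information to the device once.
d_data = cuda.to_device(data)

threads = 256
blocks  = (data.size + threads - 1) // threads

# 2. Run iterative launches of the kernel against memory already on the device.
for it in range(1000):
    step[blocks, threads](d_data)
    if it % 100 == 0 and should_abort():   # CPU-side control, checked between launches
        break

# 3. Transfer memory back to the host only after all iterations have ended.
result = d_data.copy_to_host()
```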
The watchdog timer only applies on GPUs with a display attached.
On Windows the timer is part of the WDDM; it is possible to modify the settings (timeout, behaviour on reaching the timeout, etc.) with some registry keys; see this Microsoft article for more information.
It is possible to disable this behavior in Linux. Although the "watchdog" has an obvious purpose, it may cause some very unexpected results when doing extensive computations using shaders / CUDA.
The option can be toggled in your X-configuration (likely /etc/X11/xorg.conf)
Adding Option "Interactive" "0" to the Device section for your GPU does the job.
See CUDA Visual Profiler 'Interactive' X config option? for details on the configuration, and ftp://download.nvidia.com/XFree86/Linux-x86/270.41.06/README/xconfigoptions.html#Interactive for a description of the parameter.