How do I disable Chrome's memory cache without devtools? - google-chrome

Is there a way to disable Chrome's in-memory cache without changing caching headers in responses?
I found a few answers about disabling cache, caching headers, CLI arguments (like --disk-cache-size=1 and a few others), none of them seem to be able to disable the memory cache. Using the --disk-cache-size=1 flag, I was able to disable disk cache, but I couldn't find a way to do the same thing for memory cache.
In my scenario, I run automated tests that load a page with a request that responds with a caching header. However, I want this resource to be loaded each time a test runs, even if the browser is not restarted (performance reasons), in order to properly simulate delayed page loads. I have no means to change the request or response in any way, so I have to change how the browser loads it.
For the first test run, this works fine thanks to --disk-cache-size=1, but subsequent runs don't work because Chrome loads the resource from memory.
Neither I can change the request, nor the response, so changing caching headers or appending something to the URL of the resource is not an option. I want to run those tests in an automated fashion, so disabling the cache in devtools also doesn't work. Thus browser extensions also aren't feasible.
Is there any other way to disable the in-memory cache in Chrome (or other chromium-based browsers)?

Related

Chrome - Images disappearing from cache storage after few hours

I have a website that is used as a kiosk app. When online, I preload data and images from a wordpress API, and store the images in the cache storage.
A service worker intercepts http gets to these images and serves the cache data instead. Like this, the app can run offline (API calls included).
But after few hours running offline (generally around 6h) some images disappear from the cache storage. And it's always the same ones.
But not all.
Any idea where should I check to see what's going wrong ?
There can be two possible reasons for such clearing of cache, either your storage gets full(chrome localStorage), so it gets clear OR check for the expiring length of the data send from the server, check for headers that might give you insights of time it takes to expire.
And for checking if the data gets evicted from the storage, try testing your website in Safari or Edge browser where such eviction does not occur.
How did you configured Cache-Control? The ones which are constantly deleted might configured as max-age.
It's still recommended that you configure the Cache-Control headers on
your web server, even when you know that you're using the Cache
Storage API. You can usually get away with setting Cache-Control:
no-cache for unversioned URLs, and/or Cache-Control: max-age=31536000
for URLs that contain versioning information, like hashes.
as stated at https://web.dev/service-workers-cache-storage
Also, you should have a check storage amount of cache storage. (https://developers.google.com/web/tools/chrome-devtools/storage/cache) Even they claim that either all caches are deleted or none of them, since currently it is experimental technology, it is normal to be suspicious about this point.
You are also responsible for periodically purging cache entries. Each
browser has a hard limit on the amount of cache storage that a given
origin can use. Cache quota usage estimates are available via the
StorageEstimate API. The browser does its best to manage disk space,
but it may delete the Cache storage for an origin. The browser will
generally delete all of the data for an origin or none of the data for
an origin. Make sure to version caches by name and use the caches only
from the version of the script that they can safely operate on. See
Deleting old caches for more information.
as stated at https://developer.mozilla.org/en-US/docs/Web/API/Cache
Service Workers use the Storage API to cache assets.
There are fixed quotas available and different from browser to browser.
You can get some more hints from your app using the following call and analysing the results:
navigator.storageQuota.queryInfo("temporary")
.then(function(info) {
console.log(info.quota);
// It gives us the quota in bytes
console.log(info.usage);
// It gives us the used data in bytes
});
UPDATE 1
Do you have anywhere a cleanup method or logic that removes old cache entries and eventually is triggered somehow unexpected? Something like the code below.
caches.keys().then(function(keyList) {
return Promise.all(keyList.map(function(key) {
if (key !== cacheName) {
return caches.delete(key);
}
}));
})
A couple of more questions:
Do you have any business logic in the SW dealing with caching the images?
Could you get any info by using the storageQuota methods to see the available quota and usage? This might be very useful to understand if the images are evicted because of reaching the limit. (even by setting a box as persistence, you are not 100% sure the assets are retained)
From "Building PWAs" (o'Reilly - Tal Alter):
Each browser behaves differently when it comes to managing CacheStorage, allocating space in the cache for each site, and clearing out old cache entries.
In addition to a storage limit per site (also known as an origin), most browsers will also set a size limit for their entire cache. When the cache passes this limit, the browser will delete the caches of the site accessed the longest time ago (also known as the least recently used).
Browsers will never delete only part of your site’s cache. Either your site’s entire cache will be deleted or none of it will be. This ensures your site never has an unpredictable partial cache.
Considering the last words "Browsers will never delete only part of your site’s cache." might me think that maybe it is not a problem with the cache limit reached, as otherwise all images wiould be deleted in the cache.
If interested, I wrote an article about this topic here.
You could mark your data storage unit (aka "box") as "persistent", making the user agent retain it as long as possible (more details in the Storage API link above).

What does Blink in-memory cache store?

Besides the browser cache, there are a few other ways browser cache data. For Chrome, there is another cache in the rendering engine Blink that stores images, styles, scripts and fonts (maybe more) in memory.
This cache is used for consecutive navigations on a site. Resources delivered from the Blink cache are tagged with (from memory cache) in the network tab. Resources served from the browser cache are tagged with (from disk cache).
My question is now, which resources are stored in and delivered from this very fast cache? From my tests, it varies a lot:
It works extremely well for images and script tag which are directly in the HTML.
It works sometimes for style (link) tags which are directly in the HTML. Sometimes it does not work (in the same browser with the same session).
It works almost never for script tags that are inserted into the HTML programmatically. Sometimes it works though.
One huge difference between disk cache hits and memory cache hits becomes visible in combination with Service Workers. Requests that are served by the in-memory cache cannot be observed in the Service Worker (because the cache comes before the Service Worker). Requests that are served by the disk cache pass through the Service Worker (since the Browser Cache lies behind Service Worker).
To show the explained behavior, I built a test page with all resource types: https://dm-clone-optimized.app.baqend.com/
You can navigate through the site with the links at the top and observe how the requests behave in the network tab and console. Every page loads the same resources.
After a bit of navigating (Chrome 70.0.3538.67), I get this behavior most of the time:
HTML is fetched from network
Script tags scripts.js and scripts2.js are from in-memory cache
Image tag logo.png is from in-memory as well
Style link tag styles.css is from disk cache :(
Programatically added script tag scripts2.js?id=1 is from disk cache as well :(
Sometimes though, I get really lucky and everything is served from in-memory cache:
I would love to understand how the Blink in-memory cache works and how I can tune my site to use it for all resources with appropriate cache control header.
---- edit ----
What concerns me the most is: Why are dynamically added scripts not cached at all? This has a noticeable impact on frameworks like require.js since they insert all dependencies as dynamically added script tags.
Blink in-memory cache works
Blink has four memory allocators
PartitionAlloc, Oilpan, tcmalloc, and system allocator
So the team working on Chrome Blink has removed tcmalloc and system allocators from Blink
Blink (PartitionAlloc+Oilpan) is the second largest consumer of renderer’s memory which consumes 10 - 20% in typical cases and retains some of the memory in Discardable, CC and V8
Inside Blink, the primary memory consumers are:
Large StringImpls (used by JavaScript source code)
shared buffers (used by Resources)
Vectors and HashTables
The recommendation is to: "identify caches that have an impact on Blink’s total memory and implement purgeMemory() only on them."
Reducing the size of (DOM object) won’t have an impact
Discarding caches won’t affect in most cases
They are working on getting rid of "DiscardableMemory" items which will help to do things like forcibly detaching all layout objects which in turn will release memory retained by the layout tree.
I believe it is a result of optimization in chrome, and they make it verbose to you.
The files are always go into disk cache. And they also goes into memory, and flushed very soon.
Chrome is smart enough to ask running process that do they still have a loaded copy of them in memory before seek on disk.
The step has a high hit rate, as those images/js are actively using for something.
You will not have any control how chrome manage TTL of them/capacity of memory could be used to keep blob hot. Chrome dev team doing quite a lots on dynamic tuning based on actual hardware capacity and system loading.
P.S. If you are asking for keep YOUR APP in memory. You are failing into Sun/Adobe way of evil: making their app DLL hot in memory(by tray icon/service) and slow everyone else down.
P.P.S. If it is the app end-user might want to use, use electron and follow Whatsapp/Slack/etc to build an app always running.

How to disable HTTP/2 in chrome or chromium?

I'm trying to debug difference between HTTP/1.1 and HTTP/2.
Is there any possibility for disabling HTTP/2 in chrome or chromium?
I couldn't find this option flag in chrome 56. I have tried chromium 58 with flag --disable-http2:
./Chromium.app/Contents/MacOS/Chromium --disable-http2
But content is still delivered with HTTP/2 protocol after using this flag:
For what it is worth, the flag works.
The issue is that you need to quit EVERYTHING Chrome for it to take effect. Including plugin shims and other chrome tabs and so on.
It is not enough just to add the command line switch.
An easier way to achieve something broadly equivalent is to use an HTTP Proxy, like https://www.telerik.com/fiddler. This adds negligible additional time to your requests, and (as far as I know) doesn't support http/2 at all (yet); even if it did, I'm pretty sure it would be much easier/practical to switch behavior in than restarting all your Chrome windows.
The advantage of this approach is that it takes effect immediately - disabling and reenabling HTTP/2 becomes as easy as starting and stopping the proxy, without messing with the (if you're anything like me) dozens of Chrome tabs you have open, to StackOverflow and elsewhere :)
What happens when you try doing the same thing in WebPageTest (select Chrome as the test agent and add the command line switch in the Chrome tab under advanced settings)
Here's a test I did for my personal site just now and the flag appears to work OK (if you look at the response headers you'll see HTTP/1.1)
https://www.webpagetest.org/result/170322_1B_ab8656afcfb8bcc4103e9872ff56c28b/1/details/#waterfall_view_step1
I have seen the same problem created by a firewall running in proxy mode vs flow mode.
The firewall would buffer the entire file so it could scan it then pass it along vs scanning the individual packets.
https://docs.fortinet.com/document/fortigate/6.4.4/administration-guide/721410/inspection-modes
The problem would only happen when using http2 and might have something to do with http request priority not being handled properly or it forced it single threaded.
We would have a video request start with a low priority that would stall then start causing other file downloads to be delayed. Then there was a api poler in the background coming in with high priority requests. After a few high priority requests were blocked chrome would cancel the low priority video.
It would happen in other cases but the video made it very reproduceable for us.
https://medium.com/dev-channel/javascript-loading-priorities-in-chrome-57c54cfa6672
https://blog.cloudflare.com/better-http-2-prioritization-for-a-faster-web/
https://blog.cloudflare.com/http-2-prioritization-with-nginx/
https://calendar.perfplanet.com/2018/http2-prioritization/
We set it back to flow mode on the firewall and the problem went away.
Afterwards the downloads all happened in parallel with no blocking or stalling in the chrome network waterfall.

Determine when cache will timeout for particular URL

I would like to look at previously loaded resources in the Chrome or Firefox browsers and see when the browser will consider that resource expired such that it will reload it from the server.
I am interested in doing this for two reasons:
Debug problems with appcache.
Confirm that I have the server configured properly by looking at resources to see if its expiration time is as I expect.

Why the static resources listed in HTML5 cache manifest are downloaded twice?

I listed static resources (CSS and javascript files) referenced by the HTML in the explicit cache section of the cache manifest, as a modern mobile web app usually does. But it is surprising to find in chrome that these static resources are downloaded twice for the first time accessing the web app when there is not any local cache yet, since the cache is used to minimize bandwidth. I think the reasons that they are downloaded twice is that they are referenced in the cache manifest and in the HTML.
Why doesn't the browser share the resource if it is referenced in both places? Is it conforming to the HTML5 standard or a bug in chrome?
I have seen this behavior consistently across nearly every browser (Chrome, IE, and Firefox) when using a cache manifest. As far as I can tell, the resources are indeed downloaded twice because they are referred to in both the cache manifest and the html/css. The initial load is for loading the page itself, and the second load is for loading the appcache.
Now, even if a second request needs to be performed to load the appcache, the resource should be served from the regular browser cache since the original request probably loaded it into the browser cache. However, all browsers seem to treat appcache loads differently than regular browsing activity and bypass the cache.
As far as HTML5 specs go, the spec suggests that browsers can implement a mechanism to use the existing download, but it is optional:
Fetch the resource, from the origin of the URL manifest URL, with the synchronous flag set and the manual redirect flag set. If this is an upgrade attempt, then use the newest application cache in cache group as an HTTP cache, and honor HTTP caching semantics (such as expiration, ETags, and so forth) with respect to that cache. User agents may also have other caches in place that are also honored.
If the resource in question is already being downloaded for other reasons then the existing download process can sometimes be used for the purposes of this step, as defined by the fetching algorithm.
An example of a resource that might already be being downloaded is a large image on a Web page that is being seen for the first time. The image would get downloaded to satisfy the img element on the page, as well as being listed in the cache manifest. According to the rules for fetching that image only need be downloaded once, and it can be used both for the cache and for the rendered Web page.
source