What is the difference between a primed cache and empty cache? - html

What is the difference between a primed cache and empty cache?
For example the statistics result of YSlow provides a graphical data of an empty cached vs. primed cache. What are the difference between them?

Simply, a primed cache means the browser has it cached. It has been there before, or (though I don't think YSlow means it this way) it has been somewhere that uses some of the same resources (images, CSS, JavaScript)

This was asked 3 years before, but I bumped into this question my self since I had it to. So I did a small research on the internet and I found that :
Statistics is the third tab and provides a graphical representation of the number of HTTP requests made to the server and the total weight of the page in kilobytes for both Empty Cache and Primed Cache scenarios.
The Empty Cache scenario is when the browser makes the first request to the page and the Primed Cache scenario is when the browser has a cached version of the page. In a Primed Cache scenario, the components are already in the cache, so this will reduce the number of HTTP requests and thereby the weight of the page.
Keyword here is "scenarios". This does not mean that the graphs will change if you already have cached the page. I run the test two times even though I cached it and it always displays both graphs since it shows the "scenarios". So if I cached the page I am looking at the Primed Cache scenario but for my new visitors the Empty Cache.
So in the above example, when I request the page and it is cached, my browser will still make 3 requests with total weight of 86.6K
This page explains what actually yslow displays. http://www.devcurry.com/2010/07/understanding-yslow-firebug-extension.html

Related

How to optimize website for Chrome: waiting and download order of magnitude slower than other browsers

NB: since the answer to this could involve JavaScript or PHP programming, or general networking, or IT systems, I put it here, but if some mod thinks it's better suited for SuperUser or ServerFault, I won't object to it being moved.
I have a landing page to which I'm driving traffic through PPC. I've set up AWS CloudWatch to get RUM data, and the page is performing terribly — an average load time of 9.9s and max of 21.5!
I've done all of the "standard" optimization I can think of or research. The site is built with WordPress, running on Apache on an EC2 server. I've
Upgraded the EC2 instance to ensure I have enough memory
Written a custom plugin to filter out any other plugins that aren't explicitly used on the landing page in question
Customised my theme code so that it sets proper srcset and creates the correct image sizes on upload
Minified all the JS and CSS that I include through plugins or themes I've written
Put the site itself as a distribution to CloudFront
Installed the WP Super Cache plugin, and created a separate CDN distribution on CloudFront for it
Set appropriate cache control headers on CloudFront and told it to gzip everything
Put a facade in place of any videos
The site is blazing fast for me — "load" is less than 1 second. But my RUM says that's not the case for my users. So, I dug a little deeper. 70.2% of my visitors use Chrome, and 27.7% use other, of which almost 1/3rd are Android Browser — which as I understand it is just some sort of "Chrome Lite" — so nearly 80% of my visitors are using some Chrome variant.
Sure enough — if I load the page on Safari (to ensure nothing has expired on CloudFront), clear my browser cache, and reload the page, the first request shows a waiting time of 21.2ms, TTFB of 22.6ms, and download time of 4.8ms. The whole page shows that it's finished loading in 973ms.
Firefox is slightly slower, with the first request taking 100ms total, and the whole load about 1.75s — not blazing fast, but still within the understood "2 second" limit for good user experience.
On the other hand, for Chrome that same first request takes almost 570ms waiting and 208ms download. So, just the first request (which is 36k in size) takes almost as long to load in Chrome as the whole site takes in Safari. And that repeats for every single request, where both the waiting time and the download time are an order of magnitude slower on Chrome than on Safari (on the same device on the same network):
Whereas on Safari:
I would think "waiting" and "download" times would be primarily network driven, but I can repeat this all day long and the results are the same.
I might just assume that Chrome is not optimized for the Mac on which I'm running it, but, as I said, this all started with RUM data, so it's clearly not that. As much as I might like to, I obviously can't force all of my visitors to swap their Android devices for iPhones. Equally obviously, I can't have an average load time of 10 seconds.
So, why is my site so slow on Chrome? What else can I do to optimize this?
The landing page in question is here: https://www.chrisrichardson.info/lp/prague-b/
Note, a lot of the optimization I've done is for that page in particular, so other pages on the site might perform even worse, but I don't care about that, at least at the moment.
Hahahaha.
OK, just leaving this here for posterity's sake. The 10x latency was because Chrome had preserved in devtools from a previous session throttling to 3G. So, if you stumble upon this problem, check your throttling.
That still doesn't address my RUM issue, however, but I'll open that up as a different question.

Chrome - Images disappearing from cache storage after few hours

I have a website that is used as a kiosk app. When online, I preload data and images from a wordpress API, and store the images in the cache storage.
A service worker intercepts http gets to these images and serves the cache data instead. Like this, the app can run offline (API calls included).
But after few hours running offline (generally around 6h) some images disappear from the cache storage. And it's always the same ones.
But not all.
Any idea where should I check to see what's going wrong ?
There can be two possible reasons for such clearing of cache, either your storage gets full(chrome localStorage), so it gets clear OR check for the expiring length of the data send from the server, check for headers that might give you insights of time it takes to expire.
And for checking if the data gets evicted from the storage, try testing your website in Safari or Edge browser where such eviction does not occur.
How did you configured Cache-Control? The ones which are constantly deleted might configured as max-age.
It's still recommended that you configure the Cache-Control headers on
your web server, even when you know that you're using the Cache
Storage API. You can usually get away with setting Cache-Control:
no-cache for unversioned URLs, and/or Cache-Control: max-age=31536000
for URLs that contain versioning information, like hashes.
as stated at https://web.dev/service-workers-cache-storage
Also, you should have a check storage amount of cache storage. (https://developers.google.com/web/tools/chrome-devtools/storage/cache) Even they claim that either all caches are deleted or none of them, since currently it is experimental technology, it is normal to be suspicious about this point.
You are also responsible for periodically purging cache entries. Each
browser has a hard limit on the amount of cache storage that a given
origin can use. Cache quota usage estimates are available via the
StorageEstimate API. The browser does its best to manage disk space,
but it may delete the Cache storage for an origin. The browser will
generally delete all of the data for an origin or none of the data for
an origin. Make sure to version caches by name and use the caches only
from the version of the script that they can safely operate on. See
Deleting old caches for more information.
as stated at https://developer.mozilla.org/en-US/docs/Web/API/Cache
Service Workers use the Storage API to cache assets.
There are fixed quotas available and different from browser to browser.
You can get some more hints from your app using the following call and analysing the results:
navigator.storageQuota.queryInfo("temporary")
.then(function(info) {
console.log(info.quota);
// It gives us the quota in bytes
console.log(info.usage);
// It gives us the used data in bytes
});
UPDATE 1
Do you have anywhere a cleanup method or logic that removes old cache entries and eventually is triggered somehow unexpected? Something like the code below.
caches.keys().then(function(keyList) {
return Promise.all(keyList.map(function(key) {
if (key !== cacheName) {
return caches.delete(key);
}
}));
})
A couple of more questions:
Do you have any business logic in the SW dealing with caching the images?
Could you get any info by using the storageQuota methods to see the available quota and usage? This might be very useful to understand if the images are evicted because of reaching the limit. (even by setting a box as persistence, you are not 100% sure the assets are retained)
From "Building PWAs" (o'Reilly - Tal Alter):
Each browser behaves differently when it comes to managing CacheStorage, allocating space in the cache for each site, and clearing out old cache entries.
In addition to a storage limit per site (also known as an origin), most browsers will also set a size limit for their entire cache. When the cache passes this limit, the browser will delete the caches of the site accessed the longest time ago (also known as the least recently used).
Browsers will never delete only part of your site’s cache. Either your site’s entire cache will be deleted or none of it will be. This ensures your site never has an unpredictable partial cache.
Considering the last words "Browsers will never delete only part of your site’s cache." might me think that maybe it is not a problem with the cache limit reached, as otherwise all images wiould be deleted in the cache.
If interested, I wrote an article about this topic here.
You could mark your data storage unit (aka "box") as "persistent", making the user agent retain it as long as possible (more details in the Storage API link above).

how do I fix Blocked state in pingdom test results?

I am getting (Blocked Web browser is not ready to send) on pingdom on all my scrips and images in the page. What is coursing this? and is it something I can do about it?
As Nathan said, browsers have a hard limit for concurrent connections per hostname; Chrome I think is about six. You have three approaches: to work around that limit, to increase it, or to make it irrelevant.
You can combine all your assets so the need for request matches the browser limit (minify css, combine JS etc.)
You can "increase" the limit by using domain sharding, that means to load your assets from different subdomains for example.
images.yourdomain.com
css.yourdomain.com
The problem is that this increases DNS resolution times, so it's convenient only in certain situations.
You can enable protocols like http2 which opens one connection per domain, but internally uses a multiplexing feature allowing it to make a much larger number of requests, depending on how that limit was negotiated between the client and the server.
Here's the answer I got from Pingdom support:
"It means that the web browser is waiting for other requests to complete before issuing a new one. A web browser is configured to have a maximum number of simultaneous connections to a single domain, so if there are many such connections/requests at the same time it can cause a block and slowdown of the page."
And one way to fix it would be, "Decreasing the number of components on a page reduces the number of HTTP requests required to render the page, resulting in faster page loads. Some ways to reduce the number of components include: combine files, combine multiple scripts into one script, combine multiple CSS files into one style sheet, and use CSS Sprites and image maps." (from the Pingdom tip "Make fewer HTTP requests").

What does Blink in-memory cache store?

Besides the browser cache, there are a few other ways browser cache data. For Chrome, there is another cache in the rendering engine Blink that stores images, styles, scripts and fonts (maybe more) in memory.
This cache is used for consecutive navigations on a site. Resources delivered from the Blink cache are tagged with (from memory cache) in the network tab. Resources served from the browser cache are tagged with (from disk cache).
My question is now, which resources are stored in and delivered from this very fast cache? From my tests, it varies a lot:
It works extremely well for images and script tag which are directly in the HTML.
It works sometimes for style (link) tags which are directly in the HTML. Sometimes it does not work (in the same browser with the same session).
It works almost never for script tags that are inserted into the HTML programmatically. Sometimes it works though.
One huge difference between disk cache hits and memory cache hits becomes visible in combination with Service Workers. Requests that are served by the in-memory cache cannot be observed in the Service Worker (because the cache comes before the Service Worker). Requests that are served by the disk cache pass through the Service Worker (since the Browser Cache lies behind Service Worker).
To show the explained behavior, I built a test page with all resource types: https://dm-clone-optimized.app.baqend.com/
You can navigate through the site with the links at the top and observe how the requests behave in the network tab and console. Every page loads the same resources.
After a bit of navigating (Chrome 70.0.3538.67), I get this behavior most of the time:
HTML is fetched from network
Script tags scripts.js and scripts2.js are from in-memory cache
Image tag logo.png is from in-memory as well
Style link tag styles.css is from disk cache :(
Programatically added script tag scripts2.js?id=1 is from disk cache as well :(
Sometimes though, I get really lucky and everything is served from in-memory cache:
I would love to understand how the Blink in-memory cache works and how I can tune my site to use it for all resources with appropriate cache control header.
---- edit ----
What concerns me the most is: Why are dynamically added scripts not cached at all? This has a noticeable impact on frameworks like require.js since they insert all dependencies as dynamically added script tags.
Blink in-memory cache works
Blink has four memory allocators
PartitionAlloc, Oilpan, tcmalloc, and system allocator
So the team working on Chrome Blink has removed tcmalloc and system allocators from Blink
Blink (PartitionAlloc+Oilpan) is the second largest consumer of renderer’s memory which consumes 10 - 20% in typical cases and retains some of the memory in Discardable, CC and V8
Inside Blink, the primary memory consumers are:
Large StringImpls (used by JavaScript source code)
shared buffers (used by Resources)
Vectors and HashTables
The recommendation is to: "identify caches that have an impact on Blink’s total memory and implement purgeMemory() only on them."
Reducing the size of (DOM object) won’t have an impact
Discarding caches won’t affect in most cases
They are working on getting rid of "DiscardableMemory" items which will help to do things like forcibly detaching all layout objects which in turn will release memory retained by the layout tree.
I believe it is a result of optimization in chrome, and they make it verbose to you.
The files are always go into disk cache. And they also goes into memory, and flushed very soon.
Chrome is smart enough to ask running process that do they still have a loaded copy of them in memory before seek on disk.
The step has a high hit rate, as those images/js are actively using for something.
You will not have any control how chrome manage TTL of them/capacity of memory could be used to keep blob hot. Chrome dev team doing quite a lots on dynamic tuning based on actual hardware capacity and system loading.
P.S. If you are asking for keep YOUR APP in memory. You are failing into Sun/Adobe way of evil: making their app DLL hot in memory(by tray icon/service) and slow everyone else down.
P.P.S. If it is the app end-user might want to use, use electron and follow Whatsapp/Slack/etc to build an app always running.

What is the difference between HTML5 AppCache and the normal browser cache?

I don't understand the point of the HTML5 AppCache. We already have a normal cache. If you visit a website the first time it'll already cache all the assets. What extra value does the AppCache provide? Is it just a list of files so that the browser knows what assets to download, even if they're not referenced by the HTML right now? Does the browser make sure that the caching is "all-or-nothing", i.e. does it ensure that everything referenced by the manifest is cached, or nothing at all?
I think the point you're missing is that AppCache is specifically designed to allow web apps (and web sites) to be made available offline, though the same speed benefits which the normal browser cache provides, when the user is online, are also provided by AppCache.
The key difference with the browser cache is that you can specify all the assets the browser should cache in a manifest file (conceivably your entire site) whereas the browser cache will only store the pages (and associated assets) you have actually visited.
I'm no expert on the AppCache, but I do know it is not without its problems. There's a really good article here from a chap who used AppCache to allow parts of his mobile site to be available offline. It includes some rationale on their decision to use it and a number of gotchas they encountered in doing so.
This HTML5 Rocks article on the subject also has some good information.
AppCache actually uses the browser cache in support of its operation. It is the browser equivalent of downloading an app to run locally.
The first time a user visits the page, the resources of that page will be loaded from the server and stored in the normal cache. If the page specifies an appcache manifest, the browser will download the manifest and fetch all the resources in there (even if they do not appear on the page that embedded the manifest). These are then stored in appcache.
The second time a user visits the page, the browser will check its appcache. If an entry exists for that URL, it will load the page from appcache instead of from the server, based on the rules specified in the manifest (the manifest can mark some resources explicitly as fetched from the network).
After the browser loads the page from appcache, it will contact the server to see if there is an updated manifest. If the manifest is updated, it will fetch the resources from the manifest. These fetches are done using normal browser cache rules, so some of these resources may actually end up being fetched from the regular browser cache instead of from the server (this allows you to do differential updates when using appcache to develop offline apps). The new version of the appcache is kept separate from the old version. After the new version is fetched the user keeps interacting with the resources from the old version until they refresh the main page, after which the new version is loaded and the old is discarded.
Another important point is that appcache has different rules for when resources are discarded. Appcache basically never discards the latest set of resources, and caches them as a whole. To prevent abuse it enforces storage limits (sometimes as little as 5 MB) of how big a site's cache can be. By contrast, the browser cache has no per-site limits, but will discard individual resources from a site if the global cache limits are reached.
The important feature of HTML 5 application cache is it makes available the web application offline. Which is not given by normal browser cache.
In addition to this application cache will provide
Speed - since the entire contents of the specified page will be cached to browser so it will provide a better speed than browser cache
Reduce Server Load - There is no need of a post back all the time, since all the contents are there in cache, till there is any changes in the Manifest file
Cache manifest :- The cache manifest file is the heart of HTML5 application cache. We can specify what are the pages need not be cached, what should not, and even we can reuse this one as a error handling technique, for that we can specify a custom error page in the FALLBACK section to show if the user request a content that requires network connectivity
For a basic understanding on Application cache you can See this tutorial