How to trigger a browser refresh for cached HTML files?

YSlow suggests: For static components, implement a "Never expire" policy by setting a far-future Expires header... if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename, for example, yahoo_2.0.6.js.
http://developer.yahoo.com/performance/rules.html
I'd like to take advantage of caching for my mostly static pages and reload the JS files when the version number changes. I've set a version number for my .js files, but my main.html page has Expires set to the future, so it doesn't reload and therefore doesn't reload the JS files. Ideally I'd like to tell the browser (using a psychic technique) to reload main.html when a new version of the site is released. I could make my main.html page always reload, but then I lose the caching benefit. I'm not looking for the Ctrl-F5 answer, as this needs to happen automatically for our users.
I think the answer is: main.html can't be cached, but I'd like to hear what others are doing to solve this problem. How are you balancing caching against reload behavior?
Thanks.

Your analysis is correct. Web performance best practices suggest a far future expiration date for static components (i.e., those which don't change often), and using a version number in the URL manages those changes nicely.
For the main page (main.html), you would not set a far-future expiration date. Instead, either set no expiration at all or set a short one, for example +24 hours.
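A minimal PHP sketch of that split policy (the durations and the asset check are illustrative assumptions, not from the original answer):

<?php
// $is_versioned_asset is a hypothetical flag you would set yourself,
// e.g. based on the requested filename (yahoo_2.0.6.js and the like).
if ($is_versioned_asset) {
    // Versioned assets: cache for ~1 year; a release changes the filename anyway.
    header('Cache-Control: max-age=31536000');
} else {
    // main.html: short lifetime, so a new release is picked up within a day.
    header('Cache-Control: max-age=86400');
}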

I guess it depends on why you want to cache the HTML page: to improve user load times or to reduce server load.
Even with a long expiry time you might find that the page isn't actually cached at the client for very long (Yahoo studies show that files don't live in the cache for very long), so a shorter expiry time, e.g. one day, might not be an issue.
If it's to reduce backend load, it might be worth looking at whether a caching proxy like Varnish would help, i.e., it caches pages from the origin server and serves them when requested. This way you can decide how long pages are cached with a finer level of control.
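If you do put a shared cache in front of the site, one hedged option is the s-maxage directive, which applies only to shared caches and lets the proxy hold the page longer than browsers do (values here are illustrative):

<?php
// Browsers revalidate after 60 seconds; a shared cache that honors
// Cache-Control (Varnish does by default) may keep the page for a day.
header('Cache-Control: max-age=60, s-maxage=86400');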

How to minimise the time for new static content to appear on the GitHub Pages CDN?

Assume we are only pushing lightweight static content like small HTML or JS files with no Liquid tags. There are no plugins, and there is no _posts/ directory, and files are never changed once committed.
Because nothing really needs to be built, in theory if we configure incremental_build: true and keep_files: ['.html', '.js'], then the build should be very fast.
However, right now, the GitHub Pages build only happens every 5 minutes or so, so effectively there is a lag of 0 to 10 minutes.
Is there a way to reduce the time it takes for the file to appear at [repo].github.io/[path]? Is there some logic to it, for example do more commits or more files or more reads have an effect one way or another?
GitHub Pages does not respect those options. You could try prebuilding your site, but that will possibly increase the total time to deploy. It's also possible that the build happens instantly but it takes time for the CDN to receive updates and invalidate caches.
You can try using another host (like running your own Jekyll server on EC2) or having your build upload the static content to S3 instead.
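If you go the S3 route, a sync along these lines could be a starting point (the bucket name and lifetime are placeholders; _site is Jekyll's default build output directory):

aws s3 sync ./_site s3://your-bucket --cache-control "max-age=300"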
However, I recommend taking a step back and asking why you need less-than-10-minute deploy latency. If there are highly volatile resources you need to serve, then perhaps you should identify those and serve them in a different way. Static site generators are good at, well, static content, and not so good for highly volatile content.
If the volatile resources are page content, then it sounds like you have a use case better served by a mainstream CMS like WordPress. If it's code, then deploy it separately to S3 and reference it from your site.

Load page / Cache analysis

Hello all, please help with the analysis of my page.
Question 1
Since everything is loaded from cache, why is the load time 690 ms?
Question 2
What would be a reason to use private, max-age=60000?
That is: public, max-age=60000 vs. private, max-age=60000
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=en
First, load time isn't defined only by the time it takes to fetch assets from the network. Painting and parsing the document can take a lot of time, as can parsing and executing JavaScript. In your case, DOMContentLoaded fires only after 491 milliseconds, so that's already part of the answer.
As to your second question, the answer really is in the link you provided:
If the response is marked as “public” then it can be cached, even if it has HTTP authentication associated with it, and even when the response status code isn’t normally cacheable. Most of the time, “public” isn’t necessary, because explicit caching information (like “max-age”) indicates that the response is cacheable anyway.
By contrast, “private” responses can be cached by the browser but are typically intended for a single user and hence are not allowed to be cached by any intermediate cache - e.g. an HTML page with private user information can be cached by that user’s browser, but not by a CDN.
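As a hedged PHP illustration of that distinction (the max-age value mirrors the one from the question; a real script would emit only one of these):

<?php
// A page with private user information: the browser may cache it,
// but intermediaries (CDNs, proxies) must not.
header('Cache-Control: private, max-age=60000');

// A shared, non-personalized resource: any cache may store it.
header('Cache-Control: public, max-age=60000');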

New content not visible because of browser cache

I have updated my website with some new content and asked some people to view it on their computers, but it seems they can't see it unless they delete their browser cache. Is there a way to handle this on my side, so that new content shows up automatically in every browser?
There is no way to accomplish this from your side once the old copy is cached.
However, users can bypass the cache easily by pressing Ctrl+Shift+R (at least in Firefox), which forces a reload that ignores cached copies.
You cannot remote-wipe someone's cache. This time your only options are to wait, or tell your users to clear their cache, or instruct them to vigorously press refresh a few times, which will cause most browsers to refresh the page.
For future reference, there are two types of caching: expiration-based caching and ETag-based caching.
If you set an explicit expiration date on your HTTP response, the client will not check back at all until that expiration date has passed. This greatly reduces network traffic, at the tradeoff of possibly having outdated content out there. Choose your expiration dates wisely for the best tradeoff.
The alternative is ETags, in which case the server sends an ETag token, and the client will inquire with "send me new content unless this token is still valid". This only reduces network traffic somewhat, but you're guaranteed to always have the latest content out there.
You need to balance your caching strategy in practice. First decide if you need caching at all, then decide how much you need and what tradeoff you're willing to make. For a high-traffic site even a cache of a few minutes can be worthwhile, while the issue of outdated content will be minuscule in this scenario.
You cannot erase a remote user's browser cache from server- or client-side code.
Going forward, the best you can do is tell browsers not to cache at all, or to cache for a specified time (shorter than the interval until the next expected update):
Cache-Control: no-cache
Cache-Control: max-age=315600
An ETag sounds like it could serve the purpose (though I've never used one):
The server generates and returns an arbitrary token which is typically a hash or some other fingerprint of the contents of the file. The client does not need to know how the fingerprint is generated; it only needs to send it to the server on the next request: if the fingerprint is still the same then the resource has not changed and we can skip the download.
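A minimal PHP sketch of that flow (the file path is a placeholder; most frameworks handle this for you):

<?php
$path = 'main.html';                     // placeholder resource
$etag = '"' . md5_file($path) . '"';     // fingerprint of the contents

header('ETag: ' . $etag);

// If the client already holds this exact fingerprint, skip the download.
if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
    http_response_code(304);             // Not Modified
    exit;
}

readfile($path);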

HTML - Cache control max age

I'd like to always present the latest website content to the user but also have it load fast. While researching, I came across posts suggesting the use of caching to speed up loading.
So what do I need to add to my website to "overwrite" the cache after 3 days and display the latest content?
The Cache-Control header is used in HTTP 1.1 to control the behavior of caches. The max-age directive specifies (in seconds) how long the content may be considered fresh before it becomes stale, i.e., how long you can promise that the content will not change. So if you know that your content will not change for 3 days, you want your server to add the following HTTP header:
Cache-Control: max-age=259200
(259200 = 60 × 60 × 24 × 3, i.e., 3 days in seconds)
To do that in PHP, add this line to your output:
header('Cache-Control: max-age=259200');
Read here for more info on the header function:
http://php.net/manual/en/function.header.php
There is more than one way to do this, but you need to consider exactly what you need to cache and what you don't. The biggest speed increases will likely come from making sure your assets (CSS, images, JavaScript) are cached, rather than the HTML itself. You then need to look at various factors (how often do these assets change, and how will you force a user to download a new version of a file if you do change it?).
Often, as part of a site's release process, new (updated) files are given a new filename to force the user's browser to redownload them, but this is only one approach.
You should take a look at Apache's mod_expires, and the ability to set expiry times for assets using the .htaccess file.
http://www.google.com/?q=apache+cache+control+htaccess#q=apache+cache+control+htaccess
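A hedged .htaccess sketch using mod_expires (the module must be enabled on your server; the lifetimes are illustrative and should match how often each type actually changes):

# Requires mod_expires to be enabled.
ExpiresActive On
ExpiresByType text/html "access plus 10 minutes"
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"
ExpiresByType image/png "access plus 1 year"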
As mentioned, Expires and Cache-Control headers are usually the best way to convey information about content lifetime.
Because clients are not always reliable in interpreting this information, caching proxies like Squid or Varnish are preferred by many people. You also need to consider whether you want to cache only static content (images, stylesheets, ...) or dynamically generated content as well.
As per the YSlow recommendations, you could configure your web server to add an Expires or a Cache-Control HTTP header to the response, which will result in user agents caching the response for the specified duration.

Website files caching?

I want to know how long it is desirable for files like CSS, HTML, and JS to be cached via .htaccess settings, and why a different time is set for each file type.
In a few examples I saw someone cache HTML for 10 minutes, JS for a month, and imagery for a year.
I think it depends on how often a resource is updated. Your HTML content is probably dynamic, so you can't cache it for long; otherwise a visitor sees changes only after a long delay.
On the other hand, pictures are rarely updated, so you can set a longer cache time.
JavaScript files are often updated with new features or bugfixes. You could use a version number for these files (core.js?v=12323) and change the number in your HTML content whenever they change, so visitors fetch the new version; this way you can cache them for a longer time as well. One way to automate this is sketched below.
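One common way to automate that version number, sketched in PHP (this assumes core.js sits next to the page; using the file's modification time as the fingerprint is just one option):

<script src="core.js?v=<?php echo filemtime('core.js'); ?>"></script>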