I am reasonably new to browser caching. I am attempting to get Chrome to permanently cache any static file with a query parameter (for cache busting purposes). I have set Cache-Control and Expires headers way into the future, which should be adequate to say "cache this forever". The resulting response headers:
HTTP/1.1 200 OK
Cache-Control: public, max-age=315360000
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/x-javascript
Date: Wed, 16 Jul 2014 09:29:54 GMT
Last-Modified: Wed, 16 Jul 2014 03:44:14 GMT
Server: nginx/1.6.0
Transfer-Encoding: chunked
Vary: Accept-Encoding
Firefox and Safari seem to respect this for all cache-busted (?v= query parameter) files. Chrome mostly follows the directives, except for JavaScript. Most of the time it sends a request with an If-Modified-Since header rather than loading from cache. Sometimes one of the files will load from cache while another triggers a request that results in a 304. Usually when loading the page from a new tab it will load from cache, but not if you hit Enter in the address bar.
I've observed other websites using what I think are the exact same headers, and the files are always loaded from cache. Some of them load from cache even if you do a refresh.
I understand cache behaviour is somewhat unpredictable, but am I overlooking something that's making Chrome do this?
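For reference, an nginx config along these lines produces the headers above (a simplified sketch, not my exact config; the location pattern is illustrative):

# 315360000 seconds is 10 years.
location ~* \.(js|css)$ {
    add_header Cache-Control "public, max-age=315360000";
}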
I had the same issue with Chrome, and after some hours of trial and error I figured out that Chrome seems to have a problem with the Vary header.
I've got this snippet in my Apache .htaccess config, and as soon as I comment out the line "Header append Vary Accept-Encoding", Chrome starts caching .js and .css files:
<FilesMatch "(\.js\.gz|\.css\.gz)$">
# Serve correct encoding type.
Header set Content-Encoding gzip
# Force proxies to cache gzipped & non-gzipped css/js files separately.
#Header append Vary Accept-Encoding
</FilesMatch>
It still does not work when the request goes through our nginx server, because nginx also adds the Vary: Accept-Encoding header when delivering gzip-compressed content.
So far I can only guess this is a problem specific to Chrome (I haven't checked Safari), so as a workaround I would change the configuration to append the header only when Chrome is not the client, until there is a better fix:
<FilesMatch "(\.js\.gz|\.css\.gz)$">
# Serve correct encoding type.
Header set Content-Encoding gzip
# Force proxies to cache gzipped & non-gzipped css/js files separately.
BrowserMatch "Chrome" ChromeFound
Header append Vary Accept-Encoding env=!ChromeFound
</FilesMatch>
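In nginx the Vary: Accept-Encoding header on gzipped responses is controlled by the gzip_vary directive, so a rough nginx equivalent of the workaround would be something like this (an untested sketch; note that dropping Vary means intermediate caches can no longer keep compressed and uncompressed variants apart):

location ~* \.(js|css)$ {
    gzip       on;
    gzip_vary  off;   # do not add "Vary: Accept-Encoding" to gzipped responses
}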
I have a React app and store the bundle.js on a CDN (or S3 for this example).
On save I run gzip -9 on it.
On upload to the CDN / S3 I add the header Content-Encoding: gzip.
Now each time a browser / HTTP client downloads the bundle it will get:
curl -I https://cdn.example.com/bundle.min.js
HTTP/2 200
content-type: application/javascript
content-length: 3304735
date: Wed, 27 Feb 2019 22:27:19 GMT
last-modified: Wed, 27 Feb 2019 22:26:53 GMT
content-encoding: gzip
accept-ranges: bytes
This works fine when I test it in a browser. My only concern is that we now only store a gzipped version of the JS bundle, and users will get it regardless of whether they send Accept-Encoding: gzip in the request.
I can't think of any issues this will cause for browsers, but I might be missing something.
Is it bad practice to "enforce" gzip in the response for the bundle.js file?
It's several months late, but the answer may still be relevant to someone trying the same thing.
Pre-compression with high compression settings can help save a few more percentage points of bandwidth on static resources. However, forcing a single encoding may create problems for some users. As per a study conducted in 2010, ~15% of users with gzip-capable browsers were not sending an appropriate Accept-Encoding request header, the reason being anti-virus software or intermediaries/proxies stripping the Accept-Encoding header to force the server to send the content in plain text.
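If you serve the files through your own nginx rather than straight from S3, another way to keep the pre-compressed bundle while still covering clients that do not send Accept-Encoding: gzip is the gzip_static module: it serves bundle.min.js.gz only when the request advertises gzip support and falls back to the plain file otherwise. A sketch (the path is an assumption):

# Requires nginx built with --with-http_gzip_static_module.
location /static/ {
    gzip_static on;   # serve bundle.min.js.gz when the client sends Accept-Encoding: gzip
    # keep the uncompressed bundle.min.js next to it as the fallback for everyone else
}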
If you need a CDN that covers the gap, PageCDN serves exactly the same purpose, but offers superior brotli-11 compression and falls back to gzip for browsers that do not support brotli. You do not need to connect to S3 or any external storage. Just connect it to your website or GitHub, or manually upload files to the CDN, and configure the compression level for the files.
I have an S3 bucket fronted by a CloudFront CDN. In that bucket, I have some woff2 fonts that were automatically tagged with the content type binary/octet-stream. When trying to load one of those fonts from a CSS file on a live production website, I get the following error:
Access to Font at 'https://cdn.example.com/fonts/my-font.woff2' from origin
'https://www.live-website.com' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'https://www.live-website.com' is therefore not allowed access.
The thing is, a curl request reveals that the Access-Control-Allow-Origin header is present:
HTTP/1.1 200 OK
Content-Type: binary/octet-stream
Content-Length: 98488
Connection: keep-alive
Date: Wed, 08 Aug 2018 19:43:01 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: 3000
Last-Modified: Mon, 14 Aug 2017 14:57:06 GMT
ETag: "<redacted>"
Accept-Ranges: bytes
Server: AmazonS3
Age: 84847
X-Cache: Hit from cloudfront
Via: 1.1 <redacted>
X-Amz-Cf-Id: <redacted>
Everything is working fine in Firefox, so I guess that Chrome is doing an extra validation that blocks my font.
Turns out that the problem was actually with the Content-Type. As soon as I changed the content type to application/font-woff2 and invalidated the cache of these woff2 files, everything went through smoothly.
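For anyone serving the fonts from nginx rather than S3/CloudFront, a rough equivalent of that fix (the path is an assumption) is to declare the woff2 MIME type and send the CORS header explicitly:

# font/woff2 is the currently registered MIME type for WOFF2 fonts.
location /fonts/ {
    types { font/woff2 woff2; }
    add_header Access-Control-Allow-Origin "*";
}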
My problem with CORS and multiple domains was that CloudFront was caching the response to the first request, so I had to select the Origin option under Whitelist Headers so that it caches per origin. After that it works.
Here is a very simple example to illustrate my question, using jQuery from a CDN to modify the page:
<html>
<body>
<p>Hello Dean!</p>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
<script>$("p").html("Hello, Gabe!")</script>
</body>
</html>
When you load this page with an internet connection, the page displays "Hello, Gabe!". When I then turn off the internet connection, the page displays "Hello Dean!" with an error -- jQuery is not available.
My understanding is that CDNs set a long Cache-Control and Expires in the response headers, which I understand to mean that the browser caches the file locally.
$ curl -s -D - https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.3/jquery.min.js | head
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Fri, 17 Apr 2015 16:30:33 GMT
Content-Type: application/javascript
Transfer-Encoding: chunked
Connection: keep-alive
Last-Modified: Thu, 18 Dec 2014 17:00:38 GMT
Expires: Wed, 06 Apr 2016 16:30:33 GMT
Cache-Control: public, max-age=30672000
But this does not appear to be happening. Can someone please explain what is going on? Also -- how can I get the browser to use the copy of jQuery it has in its cache somewhere?
This question came up because we want to be using CDN's to serve external libraries, but also want to be able to develop the page offline -- like on an airplane.
I get similar behavior using Chrome and Firefox.
This has nothing to do with the CDN. When the browser encounters a script tag, it requests the file from the server, whether it's hosted on a CDN or on your own server. If the browser has previously loaded it from the same address, it may revalidate it, and the server tells it whether it needs to be re-downloaded or not (by sending a 304 HTTP status code).
What you are probably looking for is caching your application for offline use. That is possible with an HTML5 cache manifest file: you create a file listing all the resources that need to be cached for explicit offline use.
Since the previous answer recommended using a cache manifest file, I just wanted to make people aware that this feature is being dropped from the web standards.
Info available from Mozilla:
https://developer.mozilla.org/en-US/docs/Web/HTML/Using_the_application_cache
It's recommended to use service workers instead of the cache manifest.
I have a caching issue. Chrome does not always load newer versions of site assets, most often JavaScript files loaded by Require.js. Right now I've been having this problem for over 24 hours with a particular file.
If I load the page with devtools (network tab) open, the offending files typically show a HTTP 200 response, but in the "Size" column it shows "(from cache)". In the Headers details it shows "Provisional headers are shown". Wireshark shows that the file is indeed not requested from the server.
Chrome shows the Last-Modified date of the file as Sat, 06 Dec 2014 01:27:55 GMT, but my below raw request to the server clearly indicates the file has changed much more recently.
If I do a raw request myself I don't see anything in the headers returned by the server that should cause this problem:
GET /js/path/to/file.js HTTP/1.1
Host: static.mydomain.com
User-Agent: Matt
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Vary: Accept-Encoding
Content-Type: application/javascript
Accept-Ranges: bytes
ETag: "4203477418"
Last-Modified: Fri, 16 Jan 2015 18:28:30 GMT
Content-Length: 5704
Date: Fri, 16 Jan 2015 21:05:06 GMT
Server: lighttpd/1.4.33
.... data here ...
The issue has been reported by Chrome users on multiple OSes with different versions of Chrome, but I do not typically receive reports of caching issues with other browsers. (Right now I'm on "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36")
Edit:
The problem seems to be worst with files loaded by Require.js, though I have encountered it with JavaScript directly referenced in the page as well.
What am I missing here? Why won't Chrome check for a new version of the file?
As it turns out, browsers apply poorly documented heuristic caching when the server response specifies no caching behavior at all (no Cache-Control or Expires header). In that case the browser decides on its own how long to treat the item as fresh, based on the response's Last-Modified date (if present) and the current date; a common heuristic, permitted by the HTTP caching spec (RFC 7234), is to consider the response fresh for roughly 10% of the time elapsed between those two dates.
See: https://webmasters.stackexchange.com/questions/53942/why-is-this-response-being-cached
Sadly, Google's official page on HTTP caching does not mention what happens if the header is not set: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
Edit:
I ran across some more specific information about the heuristics used, here: What heuristics do browsers use to cache resources not explicitly set to be cacheable?
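The practical takeaway is to always send an explicit freshness lifetime so the browser never has to fall back to heuristics. For example, as an nginx sketch (the server above happens to be lighttpd, but the idea is the same; the pattern and lifetime are assumptions):

# An explicit Cache-Control disables heuristic caching: the browser treats the
# file as fresh for at most an hour, then revalidates (If-Modified-Since / If-None-Match).
location ~* \.js$ {
    add_header Cache-Control "public, max-age=3600, must-revalidate";
}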
I'm using nginx and Dojo to build an embedded UI driven by a set of JSON files. Our primary target browser is Chrome, but it should work with all modern browsers.
Changing the JSON files can change the UI drastically, and I use this to give different presentations to different users. See my previous question for the details (Configure nginx to return different files to different authenticated users with the same URI), but basically my nginx configuration is such that the same URI with different users can yield different content.
This all works very well, except when someone switches to a different user. Some browsers will grab those JSON files from their own internal cache without even checking with the server, which leaves the UI displaying the previous user's presentation. Reloading the page fixes it, but boy! would I rather the right thing happened automatically.
The obvious solution is to use the various cache headers, but they don't appear to help. I'm using the following nginx directives:
expires epoch;
etag off;
if_modified_since off;
add_header Last-Modified "";
... which yields the following response headers:
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Wed, 24 Sep 2014 16:58:32 GMT
Content-Type: application/octet-stream
Content-Length: 1116
Connection: keep-alive
Expires: Thu, 01 Jan 1970 00:00:01 GMT
Cache-Control: no-cache
Accept-Ranges: bytes
This looks pretty conclusive to me, but the problem still occurs with Chrome 36 for OS X and Opera 24 for OS X (although Firefox 29 and 32 do the right thing). Chrome is content to grab files from its cache without even referring to the server.
Here's a detailed example, with headers pulled from Chrome's Network debug panel. The first time Chrome fetches /app/resources/states.json, Chrome reports
Remote Address:75.144.159.89:8765
Request URL:http://XXXXXXXXXXXXXXX/app/resources/screens.json
Request Method:GET
Status Code:200 OK
with request headers:
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Authorization:Basic dm9sdGFpcndlYjp2b2x0YWly
Cache-Control:max-age=0
Connection:keep-alive
Content-Type:application/x-www-form-urlencoded
DNT:1
Host:suitable.dyndns.org:8765
Referer:http://XXXXXXXXXXXXXXXXXXXXXX/
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36
X-Requested-With:XMLHttpRequest
and response headers:
Accept-Ranges:bytes
Cache-Control:no-cache
Connection:keep-alive
Content-Length:2369
Content-Type:application/octet-stream
Date:Wed, 24 Sep 2014 17:19:46 GMT
Expires:Thu, 01 Jan 1970 00:00:01 GMT
Server:nginx/1.4.1
Again, all well and good. But when I change the user (by restarting Chrome and then reloading the parent page), I get the following report from Chrome:
Remote Address:75.144.159.89:8765
Request URL:http://suitable.dyndns.org:8765/app/resources/states.json
Request Method:GET
Status Code:200 OK (from cache)
with no apparent contact to the server.
This doesn't seem to happen with all files. A few .js files are cached, most are not; none of the .css files seem to be cached; all the .html files are cached, and all of the .json files are cached.
How can I tell the browser (I'm looking at you, Chrome!) that these files are good at the moment it requests them, but will never again be good? Is this a Chrome bug? (If so, it's strange that Opera also shows the problem.)
I believe I've found the problem. Apparently "Cache-Control: no-cache" is insufficient to tell the browser to, um, not cache the data. I added "no-store":
Cache-Control:no-store, no-cache
and that did the trick. No more caching by Chrome or Opera.
I had the same problem, with JSON being cached...
If you control the client application code, a possible workaround is to just add a random-value query parameter to the end of the URL.
So instead of calling:
http://XXXXXXXXXXXXXXX/app/resources/screens.json
you call, for example:
http://XXXXXXXXXXXXXXX/app/resources/screens.json?rand=rrrrrrrrrr
where rrrrrrrrrr is some random value that is different in each call.
Then, the browser will not be able to reuse any cached values.