Chrome, Firefox caching 302 redirects

According to the HTTP spec, upon loading a resource that results in a 302 redirect:
...the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
However, within a single page load, I'm seeing current Chrome and Firefox both resolving subsequent requests to the initial Request-URI to the resolved value from the first request, even when the redirect specifies no caching.
I've set up a minimal repro case here:
http://chrome-302-broke.herokuapp.com/test.html
It's on a free heroku dyno (in case you reach it while it's offline).
Am I missing something? Caching the redirect from the initial response, even within the same page load, seems to take liberties with the description in the spec. Under a strict interpretation, this response shouldn't be cached at all.
Especially with a growing number of web applications that don't navigate between pages for a considerable amount of time, this seems like it would cause problems for an increasing number of use cases.
Is this something I should submit as a bug to Chrome/Firefox?
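For anyone who wants to reproduce this locally, a server along these lines could be sketched with Python's standard library. This is not the code behind the Heroku demo; the handler name, the /redirect and /target/N paths, and the port are all hypothetical. Each request to /redirect answers with a 302 that forbids caching and points at a different target, so a client that re-requests /redirect should see a new Location every time.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import itertools

# Hypothetical counter so every redirect points somewhere new.
_counter = itertools.count(1)


class RedirectHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        if self.path == "/redirect":
            # 302 with headers that explicitly forbid caching.
            self.send_response(302)
            self.send_header("Location", f"/target/{next(_counter)}")
            self.send_header("Cache-Control", "no-store, no-cache, must-revalidate")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # Echo the path so the page can show which target it received.
            body = self.path.encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def make_server(port=0):
    """Create the test server on 127.0.0.1; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Calling make_server(8000).serve_forever() and loading a page that requests /redirect several times should show whether the browser reuses the first Location.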

Related

How to force re-validation of cached resource?

I have a page /data.txt, which is cached in the client's browser. Based on data which might be known only to the server, I now know that this page is out of date and should be refreshed. However, since it is cached, they will not re-request it for a long time (until the cache expires).
The client is now requesting a different page /foo.html. How can I make the client's browser re-request /data.txt and update its cache?
This should be done using HTTP or HTML (not all clients have JS).
(I specifically want to avoid the "cache-busting" pattern of appending version numbers to the /data.txt URL, like /data.txt?v=2. This fills the cache with useless entries rather than replacing expired ones.)
Edit for clarity: I specifically want to cache /data.txt for a long time, so telling the client not to cache it is unfortunately not what I'm looking for (for this question). I want /data.txt to be cached forever until the server chooses to invalidate it. But since the user never re-requests /data.txt, I need to invalidate it as a side effect of another request (for /foo.html).

To expand my comment:
You can use If-Modified-Since and ETag. To invalidate a resource that has already been downloaded, take a look at the approaches suggested in "Clear the cache in JavaScript" and "fetch(), how do you make a non-cached request?". Most of the suggestions there involve fetching the resource from JavaScript with caching disabled, e.g. fetch(url, {cache: "no-store"}).
Or you can try sending a Clear-Site-Data header, if your clients' browsers support it.
Or maybe, this one time only, give in to the cache-busting solution. If possible, rename the file rather than adding a query string, as suggested in "Revving Filenames: don't use querystring".
Update after clarification:
If you are not maintaining a legacy implementation with users who already have /data.txt cached, using the ETag and If-Modified-Since headers should help.
For users who already have a cached version, you may redirect from /foo.html to /newFile.txt or /data.txt?v=1. The new requests will then carry the newly added headers.

The first step is to fix your cache headers on the data.txt resource so it uses your desired cache policy (perhaps Cache-Control: no-cache in conjunction with an ETag for conditional validation). Otherwise you're just going to have this problem over and over again.
The next step is to get clients who have it in their cache already to re-request it. In general there's no automatic way to achieve this, but if you know they're accessing foo.html then it should be possible. On that page you can make an AJAX request to data.txt with the Cache-Control: no-cache request header. That should force the browser to bypass the cache and get a fresh version, and the cache should then be repopulated with the new version.
(At least, that's how it's supposed to work. I've never tried this, and I've seen reports here that browsers don't handle Cache-Control request headers properly.)
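To make the first step concrete, here is a minimal sketch of ETag-based conditional validation on the server side, in Python. The function names and the choice of a truncated SHA-256 hash as the ETag are assumptions for illustration, not something any particular framework prescribes.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of this resource."""
    etag = make_etag(body)
    # no-cache: the browser may store the response but must revalidate it.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # Weak comparison: ignore a leading W/ on the client's validators.
        stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in stripped or etag in stripped:
            return 304, headers, b""  # client copy is still current
    return 200, headers, body
```

With this policy the browser asks on every use, but an unchanged /data.txt costs only a small 304 response instead of the full body.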

Caching in HTTP requests: ETag vs max-age

I have a SPA which consumes some static assets from the backend server. For reasons, I picked ETag validation as the caching mechanism. In short, I want the browser to keep the assets in its cache forever, as long as the related ETags remain unchanged.
To signal the browser about caching, the Cache-Control header must be present in the responses. That part makes sense to me, but what confuses me is that I have to provide max-age in the header as well. In other words, Cache-Control=public doesn't work, whereas Cache-Control=public, max-age=100 is the correct header.
To me this sounds contradictory. The browser queries the server to see whether an asset has changed, using If-None-Match={ETag}, any time it asks for it. What's the role of max-age here, then?

A resource cached in the browser with only an ETag will still be requested each time. If it is, say, a *.js file that has changed on the server, the server sends the new version with a new ETag and the browser refreshes its cached copy.
Even when nothing has changed, though, a full network round trip of request and response is performed, and that is quite expensive.
If you expect that a file really may change at any time, then you have to use ETag.
Cache-Control with max-age is a directive telling the browser not to even try to retrieve an updated version for the specified time. This is much more performant.
This is useful for static assets that probably won't change, e.g. jquery-3.1.js will always be the same file.
Or for resources where it's not a big deal if they change, e.g. style.css.
During development, when assets change often, Cache-Control caching is usually disabled.
But please be careful with the public modifier: it means the resource may be cached on a proxy server (like CloudFlare) and shared between different users. If the resource contains private info, e.g. messages, users may see each other's data.
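The difference between the two mechanisms can be sketched as a small decision function in Python. The function and constant names are made up for illustration, and real browsers apply additional rules (heuristic freshness, no-cache, must-revalidate and so on) that this deliberately ignores.

```python
from typing import Optional

SERVE_FROM_CACHE = "serve from cache, no request sent"
REVALIDATE = "send conditional request with If-None-Match"
REFETCH = "send plain request, download again"


def cache_decision(age_seconds: float, max_age: Optional[int], has_etag: bool) -> str:
    """Simplified view of what a cache does with a stored response."""
    # Within max-age the stored copy is fresh: no network traffic at all.
    if max_age is not None and age_seconds < max_age:
        return SERVE_FROM_CACHE
    # Stale (or no max-age given) but an ETag exists: ask the server,
    # which can answer 304 Not Modified without resending the body.
    if has_etag:
        return REVALIDATE
    # Nothing to validate with: fetch the whole resource again.
    return REFETCH
```

So with max-age=100 and an ETag, the first 100 seconds cost nothing, and after that each use costs one round trip until the server confirms or replaces the copy.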

Code for temporary removal?

I know a 301 redirect is for a permanent change, and 302 is for a temporary change.
What code should I use when the page is offline for a number of weeks and in the meantime I am redirecting to the homepage? The page should be back up in a few weeks.

If you want to redirect, it would be 302. If you don't want to redirect, you could send 503 Service unavailable and set a Retry-After header (which should hopefully prevent search engines from coming back before that time).
If you still want the end-user experience to be a redirect to the homepage, you might, with heavy heart, consider adding that to the content of your 503 error page with a meta refresh or something JavaScript based, and hope for the best in terms of what a search engine crawler makes of that.
Previous answers suggest that browsers might honour cache and expires headers set on a 301 response, but since that fails-unsafe, I wouldn't rely on it. (The standard says the response is "cacheable unless indicated otherwise"; its definition of 302 Found suggests a 302 that is explicitly cacheable might be cached, but it wouldn't be the first time browsers don't implement what could be read out of the letter of the RFCs.)
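As an illustration of the 503 option, here is a small Python sketch that builds the status and the Retry-After header. The function name and parameters are hypothetical; Retry-After itself accepts either a number of seconds or an HTTP-date, and both forms are shown.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional, Tuple


def maintenance_response(
    retry_after_seconds: int = 3600,
    back_at: Optional[datetime] = None,
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a temporarily unavailable page.

    back_at, if given, must be a timezone-aware datetime; it is sent
    as an HTTP-date. Otherwise the delay in seconds is sent.
    """
    if back_at is not None:
        # HTTP-dates are always expressed in GMT.
        retry_after = format_datetime(back_at.astimezone(timezone.utc), usegmt=True)
    else:
        retry_after = str(retry_after_seconds)
    return 503, {"Retry-After": retry_after}
```

Whether a given crawler honours the header is up to the crawler; the sketch only shows what the response would carry.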

Understanding Firebug's Net panel

I am trying to get the hang of analysing the performance of a web page using Firebug's Net panel.
The following screenshot shows an example of a google query. For the sake of this discussion I clicked twice, so some requests are cached.
So here are my questions:
1) What is happening between the end of the first request and the beginning of the next request (which is the third one)? In the same context: why does the third request start earlier than the second request?
2) The next 6 requests are coming from the cache. The purple bar indicates waiting time, and I assumed this is the time the browser spends "waiting for the server to do something". Since these responses come from the cache, what exactly is the browser waiting for? Also: what could be the reason that the waiting time for a 4.4 KB response is longer (63 ms) than for a 126.3 KB response (50 ms)?
3) In the next request there is a fairly long green bar indicating the time spent receiving the response. How come this doesn't seem to be at least roughly proportional to the size of the response?
4) The red vertical line indicates the load event. According to https://developer.mozilla.org/en-US/docs/Web/Events/load this means: "The load event is fired when a resource and its dependent resources have finished loading." In the timeline you can see that there are still a couple of requests performed after the load event. How come? Are they considered not to be dependent, and if so, why?

1) The response of the first request needs to be parsed to find out what else needs to be loaded. This parsing time causes the gap before the second request. See also my answer to a related question.
2) Responses coming from the cache still have an associated network request, which returns the 304 HTTP status code. When you expand the request, you can see the request and response headers of that network request as well as the response headers of the cached response.
In contrast to that, there are also responses served directly from a special cache called the Back-Forward Cache (or BFCache). These occur directly after the browser starts, when you have the option enabled to restore your tabs from the last session, and also when you navigate back and forth in the tab history.
3) This depends primarily on the network connection speed and the response's size, but also on how long the server takes to send the full response. Why that one request takes so long in comparison to the others can't be explained without knowing what happens on the server side.
4) The load event is fired when the page has been loaded, including all its dependent resources like CSS, images, JavaScript sources, etc. Requests initiated after the load event are loaded asynchronously, e.g. through an XMLHttpRequest or the defer attribute of the <script> element.

Chrome is not sending if-none-match

I'm making requests to my REST API. I have no problems with Firefox, but in Chrome I can't get caching to work: the response is always 200 OK, because no If-None-Match (or similar) header is sent to the server.
With Firefox I get 304 perfectly.
I think I'm missing something. I tried Cache-Control: max-age=10 as a test, but it made no difference.

One reason Chrome may not send If-None-Match is when the response includes an "HTTP/1.0" instead of an "HTTP/1.1" status line. Some servers, such as Django's development server, send an older header (probably because they do not support keep-alive) and when they do so, ETags don't work in Chrome.
In the "Response Headers" section, click "view source" instead of the parsed version. The first line will probably read something like HTTP/1.1 200 OK; if it says HTTP/1.0 200 OK, Chrome seems to ignore any ETag header and won't use it on the next load of this resource.
There may be other reasons too (e.g. make sure your ETag header value is sent inside quotes), but in my case I eliminated all other variables and this is the one that mattered.
UPDATE: looking at your screenshots, it seems this is exactly the case (HTTP/1.0 server from Python) for you too!
Assuming you are using Django, put the following hack in your local settings file, otherwise you'll have to add an actual HTTP/1.1 proxy in between you and the ./manage.py runserver daemon. This workaround monkey patches the key WSGI class used internally by Django to make it send a more useful status line:
# HACK: without HTTP/1.1, Chrome ignores certain cache headers during development!
# see https://stackoverflow.com/a/28033770/179583 for a bit more discussion.
from wsgiref import simple_server
simple_server.ServerHandler.http_version = "1.1"
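
To check the status line without opening DevTools, a small standard-library sketch like the following could be used. The function name is made up, and the host, port and path are whatever your development server uses.

```python
import http.client


def response_http_version(host: str, port: int, path: str = "/") -> str:
    """Return the protocol version the server puts in its status line."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        # http.client reports 10 for HTTP/1.0 and 11 for HTTP/1.1.
        return "HTTP/1.1" if resp.version == 11 else "HTTP/1.0"
    finally:
        conn.close()
```

For example, response_http_version("127.0.0.1", 8000) against a running development server should report HTTP/1.1 once the patch above is in effect.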

Also check that caching is not disabled in the browser, as is often done when developing a web site so you always see the latest content.

I had a similar problem in Chrome, I was using http://localhost:9000 for development (which didn't use If-None-Match).
By switching to http://127.0.0.1:9000, Chrome [1] automatically started sending the If-None-Match header in requests again.
Additionally, ensure that DevTools > Network > Disable cache is unchecked.
[1] I can't find anywhere this is documented; I'm assuming Chrome was responsible for this logic.

Chrome is not sending the appropriate headers (If-Modified-Since and If-None-Match) because no cache control is set, so it falls back to the default behaviour (which is what you're experiencing). Read more about the cache options here: https://developer.mozilla.org/en-US/docs/Web/API/Request/cache.
You can get the desired behaviour on the server by setting the Cache-Control: no-cache header, or on the browser/client side through the Request.cache = 'no-cache' option.

Chrome was not sending 'If-None-Match' header for me either. I didn't have any cache-control headers. I closed the browser, opened it again and it started sending 'If-None-Match' header as expected. So restarting your browser is one more option to check if you have this kind of problem.
