Why does checking "Disable cache" in Developer Tools cause Cache-Control: no-cache to be sent in the request?
no-cache means check with the server to see if the cached copy is the same as what's on the server, if it is, use that, if not, then get the new version.
But, if the cache is disabled, then the client has to request a new version from the server regardless.
So why does it bother sending no-cache?
Also, isn't no-cache pointless without also sending If-None-Match? Otherwise the server has no way of knowing what version of the file is in your cache.
I guess there is no cached version so it can't send the ETag in the If-None-Match, which brings me back to the original question of why no-cache is used at all.
After clearing the cache, unchecking the "Disable cache" checkbox, and refreshing the page, you can see no Cache-Control header is sent:
no-cache means check with the server to see if the cached copy is the same as what's on the server, if it is, use that, if not, then get the new version.
no-cache means: "a cache MUST NOT use a stored response to satisfy the request without successful validation on the origin server." It doesn't require a conditional request, a regular GET would also satisfy that definition.
So why does it bother sending no-cache?
To prevent intermediate caches from returning a cached version without validation.
Also, isn't no-cache pointless without also sending If-None-Match? Otherwise the server has no way of knowing what version of the file is in your cache.
The browser is not making a conditional request (though it could), so it doesn't use If-None-Match. An intermediate cache, though, could have a cached version with an ETag, which it could revalidate. Or not.
I think the point you're missing is that this directive is directed at intermediate caches, to control their behavior.
Related
I have a page /data.txt, which is cached in the client's browser. Based on data which might be known only to the server, I now know that this page is out of date and should be refreshed. However, since it is cached, they will not re-request it for a long time (until the cache expires).
The client is now requesting a different page /foo.html. How can I make the client's browser re-request /data.txt and update its cache?
This should be done using HTTP or HTML (not all clients have JS).
(I specifically want to avoid the "cache-busting" pattern of appending version numbers to the /data.txt URL, like /data.txt?v=2. This fills the cache with useless entries rather than replacing expired ones.)
Edit for clarity: I specifically want to cache /data.txt for a long time, so telling the client not to cache it is unfortunately not what I'm looking for (for this question). I want /data.txt to be cached forever until the server chooses to invalidate it. But since the user never re-requests /data.txt, I need to invalidate it as a side effect of another request (for /foo.html).
To expand my comment:
You can use IF-Modified-Since and Etag, and to invalidate the resource that has been already downloaded you may take a look at the different approaches suggested in Clear the cache in JavaScript and fetch(), how do you make a non-cached request?, most of the suggestions there mentioned fetching the resource from JavaScript with no-cache header fetch(url, {cache: "no-store"}).
Or, if you can try sending a Clear-Site-Data header if your clients' browsers are supported.
Or maybe, give up this time only for the cache-busting solution. And if it's possible for you, rename the file to something else rather than adding a querystring as suggested in Revving Filenames: don’t use querystring.
Update after clarification:
If you are not maintaining a legacy implementation with users that already have /data.txt cached, the use of Etag And IF-Modified-Since headers should help.
And for the users with the cached versions, you may redirect to: /newFile.txt or /data.txt?v=1 from /foo.html. The new requests will have the newly added headers.
The first step is to fix your cache headers on the data.txt resource so it uses your desired cache policy (perhaps Cache-Control: no-cache in conjunction with an ETag for conditional validation). Otherwise you're just going to have this problem over and over again.
The next step is to get clients who have it in their cache already to re-request it. In general there's no automatic way to achieve this, but if you know they're accessing foo.html then it should be possible. On that page you can make an AJAX request to data.txt with the Cache-Control: no-cache request header. That should force the browser to bypass the cache and get a fresh version, and the cache should then be repopulated with the new version.
(At least, that's how it's supposed to work. I've never tried this, and I've seen reports here that browsers don't handle Cache-Control request headers properly.)
It appears that in a recent Chrome release, (or at least recently when making calls to my API --- haven't see it until today), Google is throwing warnings about CORB requests being blocked.
Cross-Origin Read Blocking (CORB) blocked cross-origin response [domain] with MIME type text/plain. See https://www.chromestatus.com/feature/5629709824032768 for more details.
I have determined that the requests to my API are succeeding, and that it's the pre-flight OPTIONS request that is triggering the warning in console.
The application which is calling the API, is not explicitly making the OPTIONS request, rather I have come to understand this is enforced by the browser when making a cross-origin request and is done automatically by the browser.
I can confirm that the OPTIONS request response does not have a mime-type defined. However, I am a little confused as it is my understanding that an OPTIONS response, is only headers, and does not contain a body. I do not understand why such a request would require a mime-type to be defined.
Moreover, the console warning says the request was blocked; yet the various POST and GET requests, are succeeding. So it looks as though the OPTIONS request isn't actually being blocked?
This is a three-part question:
Why does an OPTIONS request require a mime-type to be defined, when there is no body response?
What should the mime-type be for an OPTIONS request, if plain/text is not appropriate? Would I assume application/json to be correct?
How do I configure my Apache2 server to include a mime-type for all pre-flight OPTIONS requests?
I have gotten to the bottom of these CORB warnings.
The issue is related, in part, to my use of the content-type-options: nosniff header. I set this header in order to stop the browser from trying to sniff the content-type itself, thereby removing mime-type trickery, namely with user-uploaded files, as an attack vector.
The other part of this, is related to the content-type being returned application/json;charset=utf-8. Per Google's documentation, it notes:
A response served with a "X-Content-Type-Options: nosniff" response header and an incorrect "Content-Type" response header, may be blocked.
Based on this, I set out to double check IANA's site on acceptable media types. To my surprise, I discovered that no charset parameter was ever actually defined in any RFC for the application/json type, and further notes:
No "charset" parameter is defined for this registration. Adding one really has no effect on compliant recipients.
Based on this, I removed the charset from the content-type: application/json and can confirm the CORB warnings stopped in Chrome.
In conclusion, it would appear that per a recent Chrome release, Google has opted to start treating the mime-type more strictly than it has in the past.
Lastly, as a side note, the reason all of our application requests still succeeds, is because it appears Cross-Origin Read Blocking isnt actually enforced in Chrome:
In most cases, the blocked response should not affect the web page's behavior and the CORB error message can be safely ignored.
Was having the same issue.
The problem I had was due to the fact the API was answering to the preflight with 200 OK but was an empty response without the Content-Length header set.
So, either changing the preflight response status to 204 No Content or by simply setting the Content-Length: 0 header solved the issue.
I'm using Chrome 40 (so something nice and modern).
Cache-Control: max-age=0, no-cache is set on all pages - so I expect the browser to only use something from its cache if it has first checked with the server and gotten a 304 Not Modified response.
However on pressing the back button the browser merrily hits its own cache without checking with the server.
If I open the same page, as I reached with the back button, in a new tab then it does check with the server (and gets a 303 See Other response as things have changed).
See the screen captures below showing the output for the two different cases from the Network tab of the Chrome Developer Tools.
I thought I could use max-age=0, no-cache as a lighter weight alternative to no-store where I don't want users seeing stale data via the back button (but where the data is non-valuable and so can be cached).
My understanding of no-cache (see here and here on SO) is that the browser must always revalidate all responses. So why doesn't Chrome do this when using the back button?
Is no-store the only option?
200 response (from cache) on pressing back button:
303 response on requesting the same page in a new tab:
From http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
no-cache
If the no-cache directive does not specify a field-name, then a cache MUST NOT use the response to satisfy a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent caching even by caches that have been configured to return stale responses to client requests.
If the no-cache directive does specify one or more field-names, then a cache MAY use the response to satisfy a subsequent request, subject to any other restrictions on caching. However, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent the re-use of certain header fields in a response, while still allowing caching of the rest of the response.
Other than the name implies, no-cache does not require that the response must not be stored in cache. It only specifies that the cached response must not be reused to serve a subsequent request without re-validating, so it's a shorthand for must-revalidate, max-age=0.
It is up to the browser what to qualify as a subsequent request, and to my understanding using the back-button does not. This behavior varies between different browser engines.
no-store forbids the use of the cached response for all requests, not only for subsequent ones.
Note that even with no-store, the RFC actually permits the client to store the response for use in history buffers. That means client may still use a cached response even when no-store has been specified.
Latter behavior covers cases where the page has been recorded with its original page title in the browser history. Another use case is the behavior of various mobile browsers which will not discard the previous page until the following page has fully loaded as the user might want to abort.
For clarification on the the behavior of the back button: It is not subject to any cache header, according to http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.13
User agents often have history mechanisms, such as "Back" buttons and history lists, which can be used to redisplay an entity retrieved earlier in a session.
History mechanisms and caches are different. In particular history mechanisms SHOULD NOT try to show a semantically transparent view of the current state of a resource. Rather, a history mechanism is meant to show exactly what the user saw at the time when the resource was retrieved.
By default, an expiration time does not apply to history mechanisms. If the entity is still in storage, a history mechanism SHOULD display it even if the entity has expired, unless the user has specifically configured the agent to refresh expired history documents.
That means that disrespecting any cache control headers when using the back button is the recommended behavior. If your browser happens to honor a backdated expiration date or applies the no-store directive not only to the browser cache but also to the history, it's actually already departing from that recommendation.
For how to solve it:
You can't, and you are not supposed to. If the user is returning to a previously visited page, most browsers will even try to restore the viewport. You may use deferred mechanism like AJAX to refresh content if this was the original behavior before the user left the page, but otherwise you should not even modify the content.
Have you tried using the full old set of no-cache headers?
<meta http-equiv="cache-control" content="max-age=0" />
<meta http-equiv="cache-control" content="no-cache" />
<meta http-equiv="expires" content="0" />
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
<meta http-equiv="pragma" content="no-cache" />
This seems to work all the time, unless you are running a "pushState" web.
Looks like this is a known 'quirk' in Chrome with using the back button. There is a good discussion of the issue in the bug report for it here:
https://code.google.com/p/chromium/issues/detail?id=28035
Sadly it looks like most people reverted to using no-store instead.
I expect though that most users are used to an experience of not getting a full page refresh using the back button. If you think about most Angular or Backbone apps that manage the back action themselves so that you just refresh content and not the page. With this in mind I suspect that having the customer refresh or get updates when they come back might not be that unexpected.
I have encountered a situation where I need to check the state of an order. But the remote server wouldn't return the response soon because it would cast a relatively long time.
So somebody recommended me to use long-link or half-long-link of HttpClient. but I never run into such a unknown situation for me. So I wanted to know how to implement it. Does somebody have some good ideas?
If your so-called "long-link" or "half-long-link" means "keep-alive", then I thought I got a solution. There is a header called "Connection" which has two arguments to be selected: "Keep-Alive" or "close". In http 1.0, it's "close" by default. But in Http 1.1, "Keep-Alive" is the default value. So, if you want to implement a persistent link, you need to call the method of setRequestHeader("Connection" , "Keep-Alive") of HttpGet/HttpPost/HttpPut before these requests are executed.
Meanwhile, maybe you already know this below, I hope it will help you:
Http Keep-Alive seems to be massively misunderstood. Here's a short
description of how it works, under both 1.0 and 1.1.
HTTP/1.0
Under HTTP 1.0, there is no official specification for how keepalive operates. It was, in essence, tacked on to an existing protocol. If the browser supports keep-alive, it adds an additional header to the request: Connection: Keep-Alive Then, when the server receives this request and generates a response, it also adds a header to the response:
Connection: Keep-Alive Following this, the connection is NOT dropped, but is instead kept open. When the client sends another request, it uses the same connection. This will continue until either the client or the server decides that the conversation is over, and one of them drops the connection.
HTTP/1.1
Under HTTP 1.1, the official keepalive method is different. All connections are kept alive, unless stated otherwise with the following header: Connection: close The Connection: Keep-Alive header no longer has any meaning because of this. Additionally, an optional Keep-Alive: header is described, but is so underspecified as to be meaningless. Avoid it. Not reliable HTTP is a stateless protocol - this means that every request is independent of every other. Keep alive doesn’t change that.
Additionally, there is no guarantee that the client or the server will keep the connection open. Even in 1.1, all that is promised is that you will probably get a notice that the connection is being closed. So keepalive is something you should not write your application to rely upon. KeepAlive and POST The HTTP 1.1 spec states that following the body of a POST, there are to be no additional characters.
It also states that "certain" browsers may not follow this spec, putting a CRLF after the body of the POST. Mmm-hmm. As near as I can tell, most browsers follow a POSTed body with a CRLF. There are two ways of dealing with this: Disallow keepalive in the context of a POST request, or ignore CRLF on a line by itself. Most servers deal with this in the latter way, but there's no way to know how a server will handle it without testing.
I'm trying to do requests to my REST API, I have no problems with Firefox, but in Chrome I can't get the browser to work, always throws 200 OK, because no if-none-match (or similar) header is sent to the server.
With Firefox I get 304 perfectly.
I think I miss something, I tried with Cache-Control: max-age=10 to test but nothing.
One reason Chrome may not send If-None-Match is when the response includes an "HTTP/1.0" instead of an "HTTP/1.1" status line. Some servers, such as Django's development server, send an older header (probably because they do not support keep-alive) and when they do so, ETags don't work in Chrome.
In the "Response Headers" section, click "view source" instead of the parsed version. The first line will probably read something like HTTP/1.1 200 OK — if it says HTTP/1.0 200 OK Chrome seems to ignore any ETag header and won't use it the next load of this resource.
There may be other reasons too (e.g. make sure your ETag header value is sent inside quotes), but in my case I eliminated all other variables and this is the one that mattered.
UPDATE: looking at your screenshots, it seems this is exactly the case (HTTP/1.0 server from Python) for you too!
Assuming you are using Django, put the following hack in your local settings file, otherwise you'll have to add an actual HTTP/1.1 proxy in between you and the ./manage.py runserver daemon. This workaround monkey patches the key WSGI class used internally by Django to make it send a more useful status line:
# HACK: without HTTP/1.1, Chrome ignores certain cache headers during development!
# see https://stackoverflow.com/a/28033770/179583 for a bit more discussion.
from wsgiref import simple_server
simple_server.ServerHandler.http_version = "1.1"
Also check that caching is not disabled in the browser, as is often done when developing a web site so you always see the latest content.
I had a similar problem in Chrome, I was using http://localhost:9000 for development (which didn't use If-None-Match).
By switching to http://127.0.0.1:9000 Chrome1 automatically started sending the If-None-Match header in requests again.
Additionally - ensure Devtools > Network > Disable Cache [ ] is unchecked.
1 I can't find anywhere this is documented - I'm assuming Chrome was responsible for this logic.
Chrome is not sending the appropriate headers (If-Modified-Since and If-None-Match) because the cache control is not set, forcing the default (which is what you're experiencing). Read more about the cache options here: https://developer.mozilla.org/en-US/docs/Web/API/Request/cache.
You can get the wished behaviour on the server by setting the Cache-Control: no-cache header; or on the browser/client through the Request.cache = 'no-cache' option.
Chrome was not sending 'If-None-Match' header for me either. I didn't have any cache-control headers. I closed the browser, opened it again and it started sending 'If-None-Match' header as expected. So restarting your browser is one more option to check if you have this kind of problem.