Web servers can stream media (audio, in this example) to browsers, which play it with HTML5 controls. What I'm discovering, however, is that Firefox caches the media even though I (believe I) explicitly tell it not to. I have a hunch it has something to do with the 206 Partial Content response, since a regular non-range GET with a full 200 OK response does not get cached. Chrome (27) handles this correctly, but Firefox (21) does not:
HTTP/1.1 206 Partial Content
Date: Tue, 21 May 2013 17:24:29 GMT
Expires: 0
Pragma: no-cache
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Disposition: attachment; filename="audio.wav"
Content-Type: audio/x-wav
Connection: close
Accept-Ranges: bytes
Content-Range: bytes 0-218923/218924
Anyone got any ideas as to how to make Firefox not cache this? When I click to play other audio files that are named the same, Firefox simply plays the first one that was clicked (cached) in a session as opposed to re-fetching the new one from the server.
Note that this question seems to directly ask/answer this, but it does not work: I already use the headers mentioned there.
Thanks for any help.
EDIT: I also tried adding an ETag: header, but still Firefox caches the original response.
EDIT: Including a Content-Length: header to match (218924 in this example) does not seem to impact the issue.
EDIT: I have filed a bug at bugzilla.mozilla.org but no activity on it at this point.
Your Firefox is implementing Section 13.8 of RFC 2616, so this behavior is correct:
13.8 Errors or Incomplete Response Cache Behavior
A cache that receives an incomplete response (for example, with fewer
bytes of data than specified in a Content-Length header) MAY store the
response. However, the cache MUST treat this as a partial response.
Partial responses MAY be combined as described in section 13.5.4; the
result might be a full response or might still be partial. A cache
MUST NOT return a partial response to a client without explicitly
marking it as such, using the 206 (Partial Content) status code. A
cache MUST NOT return a partial response using a status code of 200
(OK).
Partial responses may (or may not) be stored, so Chrome and Firefox both follow the rules.
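If the spec-compliant caching is still a problem in practice, one hedged workaround (not from the answer above, just a common technique) is to make each request URL unique so the cached partial response can never be reused:

```javascript
// Workaround sketch: append a throwaway query parameter so Firefox
// treats every click as a distinct resource instead of replaying the
// cached partial (206) response. The "nocache" name is arbitrary and
// the server simply ignores it.
function noCacheUrl(url) {
  const sep = url.includes("?") ? "&" : "?";
  return url + sep + "nocache=" + Date.now();
}
// Usage: audioElement.src = noCacheUrl("audio.wav");
```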
I am able to validate that we are returning the expected HTTP 103 response:
curl -D - https://local.contra.dev:8080/log-in
HTTP/1.1 103 Early Hints
Link: <https://builds.contra.com>; rel="preconnect"; crossorigin
Link: <https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700;900&display=swap>; rel="preload"; as="font"
Link: </static/assets/entry-client-routing.de82cadc.js>; rel="modulepreload"; as="script"; crossorigin
HTTP/1.1 200 OK
cache-control: no-store
referrer-policy: strict-origin-when-cross-origin
x-frame-options: sameorigin
content-type: text/html
content-length: 5430
Date: Tue, 26 Jul 2022 19:19:28 GMT
Connection: keep-alive
Keep-Alive: timeout=72
However, how do I confirm that Google Chrome (currently the only browser that supports 103 Early Hints) is actually taking advantage of these hints?
I don't see anything in Chrome network tab that would indicate that resources are loaded early.
One way to check would be to use the Performance API.
performance.getEntriesByName("https://path/to/your/resource")
You should see a PerformanceResourceTiming object with initiatorType: "early-hints" corresponding to your early hinted resource assuming your headers are working. See here: https://chromium.googlesource.com/chromium/src/+/master/docs/early-hints.md#checking-early-hints-preload-is-working.
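That check can be wrapped in a small helper, run in the browser console on the page that received the 103 response (the URL must match the hinted resource exactly):

```javascript
// Returns true if the given resource was fetched via an Early Hints
// preload: Chrome reports such entries with initiatorType "early-hints".
function wasEarlyHinted(url) {
  return performance
    .getEntriesByName(url)
    .some((entry) => entry.initiatorType === "early-hints");
}
```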
I've also used network waterfalls in Chrome's devtools which should show the resource being loaded from disk cache with low time to load. Hopefully better support will arrive soon for tracing early hints requests in devtools.
Note that early hints in Chrome don't work over HTTP/1.1. https://chromium.googlesource.com/chromium/src/+/master/docs/early-hints.md#what_s-not-supported
I have one entry pack on my homepage that, after DOMContentLoaded, uses import() to dynamically import a heavy JS chunk.
Watching DevTools in the network tab, I see that the behavior is exactly as expected: after DOMContentLoaded, a new request is made to load that separate chunk.
However, I noticed that while all other chunks (the initial sync ones, loaded immediately) are correctly cached (their status is a grayed-out 200, with a size of memory cache), the dynamically imported chunks ALWAYS get requested from the server and, worse, are re-downloaded with a 200 status, even though their content hasn't changed.
This doesn't happen in Firefox at all; the dynamic imports are cached as well.
This happens with all dynamic imports, regardless of page or entry pack.
Inspecting the response headers in Chrome for those dynamic import assets, you can clearly see it was supposed to be cached:
cache-control: max-age=315360000
cache-control: public
content-encoding: gzip
content-length: 210215
content-type: application/javascript
date: Sat, 15 May 2021 19:33:45 GMT
expires: Thu, 31 Dec 2037 23:55:55 GMT
last-modified: Sat, 15 May 2021 19:32:27 GMT
pragma: public
server: nginx
vary: Accept-Encoding
This is a Rails 6 with Webpack 5 gem setup, so all packs are served from /packs/. This is the NGINX config:
location ~ ^/(assets|packs|images|javascripts|stylesheets|swfs|system|uploads|blog-media)/ {
    gzip_static on;
    expires max;
    add_header Cache-Control public;
    add_header Pragma public;
    add_header ETag "";
    break;
}
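As a hedged alternative sketch (the immutable directive and the trimmed path pattern are my assumptions, not part of the original config): since Webpack pack filenames are content-hashed, you can let nginx keep its default validators instead of blanking the ETag, and mark the packs immutable so browsers skip revalidation entirely:

```nginx
location ~ ^/(assets|packs)/ {
    gzip_static on;
    # Fingerprinted filenames never change content, so cache "forever"
    # and skip revalidation; nginx keeps its default ETag/Last-Modified.
    add_header Cache-Control "public, max-age=31536000, immutable";
}
```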
I have 'disable cache' NOT selected in my Chrome Dev Tools;
The last-modified response header is stable (it's not changing);
The filename is stable (it's not changing);
Content-length is not changing;
I have used diffchecker.com to compare, line by line, both the REQUEST and RESPONSE headers of chunks that are being cached and the dynamic chunks that are not... there's literally no difference but for the obvious path and content-length fields;
It's also relevant to note that, in Firefox, if I reload the page with Cmd+R, it automatically adds Cache-Control: max-age=0 to the request headers, which makes nginx return 304 for all chunks (so the requests hit the server and appear in the nginx log, but nothing is downloaded because of the 304 responses, and "cached" appears in the Transferred column). But if I click on my logo (which acts as navigation to the homepage), then I see the 200 status, "cached" in the Transferred column, and no requests hit nginx at all. Everything works as expected.
Chrome, however, acts totally differently. A Cmd+R doesn't add that Cache-Control: max-age=0 header, so all chunks (with the exception of the dynamic imports) return 200 with (memory cache) in the Size column; these don't hit nginx. The dynamic import requests, however, not only hit nginx but also get a full 200 response, forcing the download. I can't understand why nginx sends the asset again (200) instead of a 304, as happens with Firefox.
I went as far as customizing the nginx log to add cache=$http_cache_control to the output, but it's empty for the dynamic imports (I can see Firefox's max-age=0 when I reload the page as described above).
UPDATE
This is so weird that it happens at random. About once every ~5 reloads with Cmd+R, the dynamic chunks also appear as cached, as expected.
Maybe I've hit a Chrome bug (version 90.0.4430.212), since Firefox works as expected?
I'm trying to implement long polling for the first time, and I'm using XMLHttpRequest objects to do it. So far, I've been successful at getting events in Firefox and Internet Explorer 11, but Chrome strangely is the odd one out this time.
I can load one page and it runs just fine. It makes the request right away and starts processing and displaying events. If I open the page in a second tab, one of the pages starts seeing delays in receiving events. In the dev tools window, I see multiple requests with this kind of timing:
"Stalled" will range up to 20 seconds. It won't happen on every request, but will usually happen on several requests in a row, and in one tab.
At first I thought this was an issue with my server, but then I opened two IE tabs and two Firefox tabs, and they all connect and receive the same events without stalling. Only Chrome is having this kind of trouble.
I figure this is likely an issue with the way in which I'm making or serving up the request. For reference, the request headers look like this:
Connection: keep-alive
Last-Event-Id: 530
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36
Accept: */*
DNT: 1
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
The response looks like this:
HTTP/1.1 200 OK
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: text/event-stream
Expires: Tue, 16 Dec 2014 21:00:40 GMT
Server: Microsoft-HTTPAPI/2.0
Date: Tue, 16 Dec 2014 21:00:40 GMT
Connection: close
In spite of the headers involved, I'm not using the browser's native EventSource, but rather a polyfill that lets me set additional headers. The polyfill is using XMLHttpRequest under the covers, but it seems to me that no matter how the request is being made, it shouldn't stall for 20 seconds.
What might be causing Chrome to stall like this?
Edit: Chrome's chrome://net-internals/#events page shows that there's a timeout error involved:
t=33627 [st= 5] HTTP_CACHE_ADD_TO_ENTRY [dt=20001]
--> net_error = -409 (ERR_CACHE_LOCK_TIMEOUT)
The error message refers to a patch added to Chrome six months ago (https://codereview.chromium.org/345643003), which implements a 20-second timeout when the same resource is requested multiple times. In fact, one of the bugs the patch tries to fix (bug number 46104) refers to a similar situation, and the patch is meant to reduce the time spent waiting.
It's possible the answer (or workaround) here is just to make the requests look different, although perhaps Chrome could respect the "no-cache" header I'm setting.
Yes, this behavior is due to Chrome locking the cache and waiting to see the result of one request before requesting the same resource again. The answer is to find a way to make the requests unique. I added a random number to the query string, and everything is working now.
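That fix can be sketched as a small helper (the "/events" endpoint in the usage line is a placeholder):

```javascript
// Sketch of the workaround: make every long-poll URL unique so Chrome's
// cache lock never serializes two in-flight requests for the same cache
// entry. A counter is added alongside the timestamp so that two
// requests issued in the same millisecond still differ.
let pollCounter = 0;
function cacheBustedUrl(base) {
  const sep = base.includes("?") ? "&" : "?";
  return base + sep + "_=" + Date.now() + "." + pollCounter++;
}
// Usage: xhr.open("GET", cacheBustedUrl("/events"));
```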
For future reference, this was Chrome 39.0.2171.95.
Edit: Since this answer, I've come to understand that "Cache-Control: no-cache" doesn't do what I thought it does. Despite its name, responses with this header can be cached. I haven't tried, but I wonder if using "Cache-Control: no-store", which does prevent caching, would fix the issue.
Adding Cache-Control: no-cache, no-transform worked for me.
I decided to keep it simple: I checked the response headers of a website that did not have this issue and changed my response headers to match theirs:
Cache-Control: max-age=3, must-revalidate
I have a web page that returns the following header when I access material:
HTTP/1.1 200 OK
Date: Sat, 29 Jun 2013 15:57:25 GMT
Server: Apache
Content-Length: 2247515
Cache-Control: no-cache, no-store, must-revalidate, max-age=-1
Pragma: no-cache, no-store
Expires: -1
Connection: close
Using a chrome extension, I want to modify this response header so that the material is actually cached instead of wasting bandwidth.
I have the following sample code:
// Helper implementations (assumed; not shown in the original snippet)
function removeHeader(headers, name) {
    const i = headers.findIndex(h => h.name.toLowerCase() === name);
    if (i !== -1) headers.splice(i, 1);
}
function updateHeader(headers, name, value) {
    removeHeader(headers, name);
    headers.push({ name: name, value: value });
}

chrome.webRequest.onHeadersReceived.addListener(function (details) {
    // Delete the headers that disable caching
    removeHeader(details.responseHeaders, 'pragma');
    removeHeader(details.responseHeaders, 'expires');
    // Modify cache-control (note: no trailing semicolon in the value)
    updateHeader(details.responseHeaders, 'cache-control', 'max-age=3600');
    console.log(details.url);
    console.log(details.responseHeaders);
    return { responseHeaders: details.responseHeaders };
},
{ urls: ["<all_urls>"] }, ['blocking', 'responseHeaders']
);
Which correctly modifies the header to something like this (based on the console.log() output):
HTTP/1.1 200 OK
Date: Sat, 29 Jun 2013 15:57:25 GMT
Server: Apache
Content-Length: 2247515
Cache-Control: max-age=3600
Connection: close
But based on everything I have tried to check this, I cannot see any evidence whatsoever that this has actually happened:
The cache does not contain an entry for this file
The Network tab in the Developer Console shows no change at all to the HTTP response (I have tried even trivial modifications just to make sure it's not an error, but still no change).
The only real hints I can find are this question which suggests that my approach still works and this paragraph on the webRequest API documentation which suggests that this won't work (but doesn't explain why I can't get any changes whatsoever):
Note that the web request API presents an abstraction of the network
stack to the extension. Internally, one URL request can be split into
several HTTP requests (for example to fetch individual byte ranges
from a large file) or can be handled by the network stack without
communicating with the network. For this reason, the API does not
provide the final HTTP headers that are sent to the network. For
example, all headers that are related to caching are invisible to the
extension.
Nothing is working whatsoever (I can't modify the HTTP response header at all), so I think that's my first concern.
Any suggestions as to where I could be going wrong, or how to go about finding what is going wrong here?
If it's not possible, are there other ways to achieve what I am trying to achieve?
I recently spent some hours trying to get a file cached, and discovered that the chrome.webRequest and chrome.declarativeWebRequest APIs cannot force resources to be cached, in any way.
The Cache-Control (and other) response headers can be changed, but the change is only visible to the getResponseHeader method, not to the browser's caching behaviour.
I've never seen this before; I've always known there was either GET or POST, and I can't find any good documentation.
GET sends variables via the URL.
POST sends them via the request body?
What does HEAD do?
It doesn't get used often, am I correct?
W3Schools.com doesn't even mention it.
HTML’s method attribute only allows GET and POST.
The HEAD method is used to send the request and retrieve just the HTTP header as response. For example, a client application can issue a HEAD request to check the size of a file (from HTTP headers) without downloading it. As Arjan points out, it's not even valid in HTML forms.
The HTTP HEAD method returns the response's headers but no body; it's often useful, as the URL I've given explains, though hardly ever in an HTML form tag.
The only thing I can imagine is that the server may actually have been set up to validate the request method, to discover submissions by robots that for HEAD might actually use a different method than a browser does. (And thus reject those submissions.)
A response to a HEAD request does not imply nothing is shown to the user: even a response to HEAD can very well redirect to another page. However, like Gumbo noted: it's not valid for the method in a HTML form, so this would require a lot of testing in each possible browser...
For a moment I wondered if HEAD in a form is somehow used to avoid accidental multiple submissions. But I assume the only useful response would be a 301 Redirect, but that could also be used with GET or POST, so I don't see how HEAD would solve any issues.
A quick test in the current versions of both Safari and Firefox on a Mac shows that actually a GET is invoked. Of course, assuming this is undocumented behavior, one should not rely on that. Maybe for some time, spam robots were in fact fooled into using HEAD (which would then be rejected on the server), or might be fooled into skipping this form if they would only support GET and POST. But even the dumbest robot programmer (aren't they all dumb for not understanding their work is evil?) would soon have learned that a browser converts this into GET.
(Do you have an example of a website that uses this? Are you sure there's no JavaScript that changes this, or does something else? Can anyone test what Internet Explorer sends?)
HEAD Method
The HEAD method is functionally like GET, except that the server replies with only a response line and headers, no entity body. The following simple example uses the HEAD method to fetch header information about hello.htm:
HEAD /hello.htm HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: www.tutorialspoint.com
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
The server response to the above HEAD request will be:
HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Vary: Authorization,Accept
Accept-Ranges: bytes
Content-Length: 88
Content-Type: text/html
Connection: Closed
Notice that the server does not send any data after the headers.
(Obtained from tutorialspoint.com)