I have a web page that returns the following headers when I access some material:
HTTP/1.1 200 OK
Date: Sat, 29 Jun 2013 15:57:25 GMT
Server: Apache
Content-Length: 2247515
Cache-Control: no-cache, no-store, must-revalidate, max-age=-1
Pragma: no-cache, no-store
Expires: -1
Connection: close
Using a Chrome extension, I want to modify these response headers so that the material is actually cached instead of wasting bandwidth.
I have the following sample code:
chrome.webRequest.onHeadersReceived.addListener(function(details)
{
    // Delete the headers that prevent caching
    removeHeader(details.responseHeaders, 'pragma');
    removeHeader(details.responseHeaders, 'expires');
    // Modify cache-control to allow caching for an hour
    updateHeader(details.responseHeaders, 'cache-control', 'max-age=3600');
    console.log(details.url);
    console.log(details.responseHeaders);
    return {responseHeaders: details.responseHeaders};
},
{urls: ["<all_urls>"]}, ['blocking', 'responseHeaders']
);
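(removeHeader and updateHeader are not part of the webRequest API; they are small helpers of my own. Roughly, as a sketch of what they do:)

function removeHeader(headers, name)
{
    // Remove every header whose name matches (case-insensitive).
    for (var i = headers.length - 1; i >= 0; i--)
    {
        if (headers[i].name.toLowerCase() === name)
            headers.splice(i, 1);
    }
}

function updateHeader(headers, name, value)
{
    // Replace the value of an existing header, or add it if it is missing.
    for (var i = 0; i < headers.length; i++)
    {
        if (headers[i].name.toLowerCase() === name)
        {
            headers[i].value = value;
            return;
        }
    }
    headers.push({name: name, value: value});
}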
This listener correctly modifies the headers to something like this (based on the console.log() output):
HTTP/1.1 200 OK
Date: Sat, 29 Jun 2013 15:57:25 GMT
Server: Apache
Content-Length: 2247515
Cache-Control: max-age=3600
Connection: close
But based on everything I have tried to verify this, I cannot see any evidence whatsoever that it has actually taken effect:
The cache does not contain an entry for this file
The Network tab in the Developer Console shows no change at all to the HTTP response (I have even tried trivial modifications just to make sure it's not an error elsewhere, but still no change).
The only real hints I can find are this question, which suggests that my approach should still work, and this paragraph in the webRequest API documentation, which suggests that it won't work (but doesn't explain why I can't get any changes whatsoever):
Note that the web request API presents an abstraction of the network
stack to the extension. Internally, one URL request can be split into
several HTTP requests (for example to fetch individual byte ranges
from a large file) or can be handled by the network stack without
communicating with the network. For this reason, the API does not
provide the final HTTP headers that are sent to the network. For
example, all headers that are related to caching are invisible to the
extension.
Nothing is working whatsoever (I can't modify the HTTP response headers at all), so I think that's my first concern.
Any suggestions at where I could be going wrong or how to go about finding what is going wrong here?
If it's not possible, are there any other ways to achieve what I am trying to do?
I have recently spent some hours trying to get a file cached, and discovered that the chrome.webRequest and chrome.declarativeWebRequest APIs cannot force resources to be cached. There is no way to do it.
The Cache-Control (and other) response headers can be changed, but the change is only visible to the getResponseHeader() method, not to the browser's caching behaviour.
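For example, the rewritten header is visible to page scripts, even though the cache ignores it (a quick sketch; the URL is a placeholder):

var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://example.com/material');  // placeholder URL
xhr.onload = function()
{
    // Logs the rewritten value (e.g. "max-age=3600"), yet the resource is still not cached.
    console.log(xhr.getResponseHeader('Cache-Control'));
};
xhr.send();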
I have one entry pack on my homepage that, after DOMContentLoaded, uses import() to dynamically import a heavy JS chunk.
Watching DevTools in the network tab, I see that the behavior is exactly as expected: after DOMContentLoaded, a new request is made to load that separate chunk.
However, I noticed that while all other chunks (the initial sync ones, which are loaded immediately) are correctly cached (their status is a grayed-out 200 with (memory cache) in the Size column), the dynamically imported chunks ALWAYS get requested from the server and, worse, are re-downloaded with a 200 status, even though their content hasn't changed.
This doesn't happen in Firefox at all; the dynamic imports are cached as well.
This happens with all dynamic imports, regardless of page or entry pack.
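The loading code is essentially this (a simplified sketch; the module path and init call are illustrative):

document.addEventListener('DOMContentLoaded', function ()
{
    // Webpack splits this dynamic import into a separate chunk, fetched on demand.
    import('./heavyModule')
        .then(function (mod) { mod.init(); })
        .catch(function (err) { console.error('Failed to load chunk', err); });
});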
Inspecting the response headers in Chrome for those dynamic import assets, you can clearly see they were supposed to be cached:
cache-control: max-age=315360000
cache-control: public
content-encoding: gzip
content-length: 210215
content-type: application/javascript
date: Sat, 15 May 2021 19:33:45 GMT
expires: Thu, 31 Dec 2037 23:55:55 GMT
last-modified: Sat, 15 May 2021 19:32:27 GMT
pragma: public
server: nginx
vary: Accept-Encoding
This is a Rails 6 setup with the Webpack 5 gem, so all packs are served from /packs/. This is the NGINX config:
location ~ ^/(assets|packs|images|javascripts|stylesheets|swfs|system|uploads|blog-media)/ {
    gzip_static on;
    expires max;
    add_header Cache-Control public;
    add_header Pragma public;
    add_header ETag "";
    break;
}
I have 'disable cache' NOT selected in my Chrome Dev Tools;
The last-modified response header is stable (it's not changing);
The filename is stable (it's not changing);
Content-length is not changing;
I have used diffchecker.com to compare, line by line, both the REQUEST and RESPONSE headers of chunks that are being cached and the dynamic chunks that are not... there's literally no difference but for the obvious path and content-length fields;
It's also relevant to note that, in Firefox, if I reload the page with CMD+R, it automatically adds Cache-Control: max-age=0 to the request headers, which makes nginx return 304 for all chunks (so the requests hit the server and appear in the NGINX log, but nothing is downloaded because of the 304 responses, and "cached" appears in the Transferred column). But if I click on my logo (which acts as a navigation to the homepage), then I see the 200 status, "cached" in the Transferred column, and no requests hit nginx at all. Everything as expected.
Chrome, however, acts totally differently. A CMD+R doesn't add that Cache-Control: max-age=0 header, so all chunks (with the exception of the dynamic imports) return 200 with (memory cache) in the Size column. These don't hit nginx. However, the dynamic import requests not only hit nginx, but NGINX also returns a 200 status, forcing the download, so I can't understand why NGINX is sending the asset again (200) instead of a 304 response as happens with Firefox.
I went as far as customizing the nginx log to add cache=$http_cache_control to the output, but it's empty for the dynamic imports (I can see Firefox's max-age=0 when I reload the page as described above).
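Another way to check this from the console is the Resource Timing API, which reports a transferSize of 0 for responses served from the local cache (a sketch; the '/packs/' filter just matches my asset path):

performance.getEntriesByType('resource')
    .filter(function (e) { return e.name.indexOf('/packs/') !== -1; })
    .forEach(function (e) {
        // transferSize === 0 generally means the response came from the cache.
        console.log(e.name, 'transferSize:', e.transferSize);
    });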
UPDATE
This is so weird that it happens at random: once every ~5 reloads with CMD+R, the dynamic chunks also appear as cached, as expected.
Maybe I've hit a Chrome bug (Chrome version 90.0.4430.212), since Firefox works as expected?
I'm receiving invalid characters when trying to receive JSON over HTTP in C.
When I send
GET /<query> http/1.1\r\nHost:<host>\r\n\r\n
then the result displays as follows:
HTTP/1.0 200 OK
Cache-Control: private
Content-Type: application/json; charset=utf-8
Content-Encoding: gzip
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST
Access-Control-Allow-Credentials: false
X-Content-Type-Options: nosniff
Date: Sun, 07 Dec 2014 13:14:29 GMT
Content-Length: 11410
�
I don't know why I'm not receiving the JSON. However, when I type the same query in the browser's address bar, I receive the JSON just fine.
[UPDATE # 1]
I found it useful to use a ready-made library, as the error is related to the compression of the network bytes, so I used Qt's QNetworkAccessManager specifically to do this job.
Please use an HTTP library, or make yourself comfortable with the HTTP protocol and implement it properly.
Content-Encoding: gzip
At least this line means trouble for your simple parsing. But interestingly, based on your request the server should not send this line if a non-compressed version is available. This means not only that your code is buggy, but that the server might be buggy too (it assumes that you understand any compression because you did not say otherwise).
GET / http/1.1\r\nHost:\r\n\r\n
And your request is wrong too: it must be HTTP/1.1 instead of http/1.1, and it must contain a proper Host header.
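A corrected request would look more like this (a sketch; <host> and <query> are placeholders, and Accept-Encoding: identity tells the server you do not handle compressed bodies):

GET /<query> HTTP/1.1\r\nHost: <host>\r\nAccept-Encoding: identity\r\nConnection: close\r\n\r\n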
Qt's QNetworkAccessManager is well suited for this purpose, as it does all the intricate work itself and provides an elegant interface to the user.
I have developed a Box app using "Web App Integrations", the option to manage a file from the Box web UI by right-clicking on it.
It is a popup integration that gets the file, modifies it, and saves it again.
Some time ago we detected it was broken but have not had time to check it until now, and the problem lies in our last request to Box, when we want to save the modified file.
In our callback we request #overwrite_url# and #new_copy_url#, and we POST the modified files to those URLs to "save as" or "save" based on the user's selection.
The new documentation does not describe these two parameters, but the app management console still allows them to be requested, so I assume they are not deprecated. Other than that, I have not been able to see a difference in the documentation related to this issue.
The request we are using is:
POST /api/1.0/new_copy/dmq5esykpq30sp2kepy3b1d7mvese5ap/9721827325?new_file_name=Koala.proton.jpg HTTP/1.1
Accept: application/json
Content-Type: multipart/form-data;boundary=2iqAzMZWpgN473oDBmRGnysbfTtsD2
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.7.0_45
Host: upload.box.com
Connection: keep-alive
Content-Length: 17831
--2iqAzMZWpgN473oDBmRGnysbfTtsD2
Content-Disposition: form-data; name="file"; filename="empty.dat"
Content-Type: application/octet-stream
Content-Length: 17627
And the only response I get is a 200 with the body "restricted" and no further information.
I suspect this has something to do with the deprecation of API v1, but the integration does not use the API, and I asked Box support by email a couple of times whether the deprecation was going to have any effect on integrations; the responses were always negative.
There are definitely changes required in order to update your integration to continue to work. Yes, V1 APIs have been deprecated, and so your old integration has stopped working.
The new documentation is here. A subtle difference is that you now get much more power with these web-app integrations. Tokens don't expire after 24 hours, but follow Box's usual OAuth2 rules. The scope of your token will be the file or folder that your web-app integration is invoked on.
Fundamentally, the first step after you get the inbound request on your server is to trade in the auth_code for an Auth-Token via the OAuth2 endpoints.
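That exchange is a standard OAuth2 authorization-code grant, roughly like this (a sketch; check the linked docs for the exact token endpoint, and the placeholders stand for your own app credentials):

POST /oauth2/token HTTP/1.1
Host: api.box.com
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code&code=<auth_code>&client_id=<client_id>&client_secret=<client_secret>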
See the section on auth_code. Then you will have an Auth-Token that will let you call the regular V2 APIs. To do a copy, you would then:
POST https://api.box.com/2.0/files/{id}/copy (with the Bearer-token header)
See https://developers.box.com/docs/#files-copy-a-file for the documentation on how to do a copy operation. The nice thing is that you can also make any number of other API calls with that token, as long as they are within the scope of that file.
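For example (a sketch; the destination folder ID and the token are placeholders, while the file ID and new name are taken from the old request above):

POST /2.0/files/9721827325/copy HTTP/1.1
Host: api.box.com
Authorization: Bearer <access_token>
Content-Type: application/json

{"parent": {"id": "<destination_folder_id>"}, "name": "Koala.proton.jpg"}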
Web servers have the ability to stream media (audio in this example) to browsers. Browsers use HTML5 controls to play the media. What I'm discovering, however, is that Firefox is caching the media, even though I (believe I) explicitly tell it not to. I have a hunch that it has something to do with the 206 Partial Content response as a regular "non-range" GET with a full 200 OK response does not get cached. Chrome (27) handles this OK, but Firefox (21) does not:
HTTP/1.1 206 Partial Content
Date: Tue, 21 May 2013 17:24:29 GMT
Expires: 0
Pragma: no-cache
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Content-Disposition: attachment; filename="audio.wav"
Content-Type: audio/x-wav
Connection: close
Accept-Ranges: bytes
Content-Range: bytes 0-218923/218924
Does anyone have any ideas as to how to make Firefox not cache this? When I click to play other audio files with the same name, Firefox simply plays the first one that was clicked (from cache) in a session instead of re-fetching the new one from the server.
Note that this question seems to directly ask/answer this, but it does not work... I already use the headers mentioned there.
Thanks for any help.
EDIT: I also tried adding an ETag: header, but still Firefox caches the original response.
EDIT: Including a Content-Length: header to match (218924 in this example) does not seem to impact the issue.
EDIT: I have filed a bug at bugzilla.mozilla.org but no activity on it at this point.
Your Firefox is implementing Section 13.8 of RFC 2616, so this behavior is alright.
13.8 Errors or Incomplete Response Cache Behavior
A cache that receives an incomplete response (for example, with fewer
bytes of data than specified in a Content-Length header) MAY store the
response. However, the cache MUST treat this as a partial response.
Partial responses MAY be combined as described in section 13.5.4; the
result might be a full response or might still be partial. A cache
MUST NOT return a partial response to a client without explicitly
marking it as such, using the 206 (Partial Content) status code. A
cache MUST NOT return a partial response using a status code of 200
(OK).
Partial responses may (or may not) be stored, so Chrome and Firefox both follow the rules.
I've never seen this before; I've always known there to be either GET or POST. And I can't find any good documentation.
GET sends variables via the URL.
POST sends them via the request body?
What does HEAD do?
It doesn't get used often, am I correct?
W3schools.com doesn't even mention it.
HTML’s method attribute only allows GET and POST.
The HEAD method is used to send the request and retrieve just the HTTP header as response. For example, a client application can issue a HEAD request to check the size of a file (from HTTP headers) without downloading it. As Arjan points out, it's not even valid in HTML forms.
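For instance, a page script could read a file's size from the headers without downloading the body (a sketch; the URL is a placeholder):

var xhr = new XMLHttpRequest();
xhr.open('HEAD', 'https://example.com/bigfile.zip');  // placeholder URL
xhr.onload = function()
{
    // For HEAD the body is empty; only the headers are transferred.
    console.log('Size:', xhr.getResponseHeader('Content-Length'), 'bytes');
};
xhr.send();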
The HTTP HEAD method returns the response headers but no body; it's often useful, as the URL I've given explains, though hardly ever in an HTML "form" tag.
The only thing I can imagine is that the server may actually have been set up to validate the request method, to discover submissions by robots, which for HEAD might actually use a different method than a browser does (and thus reject those submissions).
A response to a HEAD request does not imply that nothing is shown to the user: even a response to HEAD can very well redirect to another page. However, as Gumbo noted, it's not valid as the method in an HTML form, so this would require a lot of testing in each possible browser...
For a moment I wondered if HEAD in a form is somehow used to avoid accidental multiple submissions, but I assume the only useful response would be a 301 redirect, and that could also be used with GET or POST, so I don't see how HEAD would solve any issues.
A quick test in the current versions of both Safari and Firefox on a Mac shows that actually a GET is invoked. Of course, assuming this is undocumented behavior, one should not rely on that. Maybe for some time, spam robots were in fact fooled into using HEAD (which would then be rejected on the server), or might be fooled into skipping this form if they would only support GET and POST. But even the dumbest robot programmer (aren't they all dumb for not understanding their work is evil?) would soon have learned that a browser converts this into GET.
(Do you have an example of a website that uses this? Are you sure there's no JavaScript that changes this, or does something else? Can anyone test what Internet Explorer sends?)
HEAD Method
The HEAD method is functionally like GET, except that the server replies with a response line and headers, but no entity-body. Following is a simple example which makes use of the HEAD method to fetch header information about hello.htm:
HEAD /hello.htm HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: www.tutorialspoint.com
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Following will be the server response to the above HEAD request:
HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Vary: Authorization,Accept
Accept-Ranges: bytes
Content-Length: 88
Content-Type: text/html
Connection: Closed
You can notice that here the server does not send any data after the headers.
-Obtained from tutorialspoint.com