Avoid caching 205 responses - html

I have a dynamic POST form that should be cached using an ETag. The navigation flow is:
A user browses to the form form.html and receives status 200 with the new ETag "DNEI297" in the response from the server. The browser now stores this document in its cache.
The user enters some values and finally posts the form data to form.html (browser to server) and receives from the server the HTTP status code 205 (Reset Content) and the unchanged ETag "DNEI297".
Since the 205 response is empty in this case, the browser reloads the page form.html using the ETag "DNEI297". The server compares that ETag with its own, decides that neither the form nor the ETag has changed and that the browser already has the correct version of form.html cached, and sends a 304 (Not Modified).
Now the problem: since the server sent a 304, the browser takes the last response and decides to use the cached version. But the cached version is the response to the POST request, with status code 205 and the ETag "DNEI297".
So after the form is submitted, the document's HTTP status is 205. How can I avoid this wrong status code? It causes trouble and triggers alerts from antivirus plugins.

The server in this case erred by sending the same ETag, or any ETag at all, in its 205 response to the form submission.
RFC 7232 describes when it's appropriate to use an ETag:
A "strong validator" is representation metadata that changes value
whenever a change occurs to the representation data that would be
observable in the payload body of a 200 (OK) response to GET.
So you should not send an ETag along with the empty 205 response, since that's not what you'd get by doing a successful GET to that URL.
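As a rough illustration of that advice, here is a minimal Python sketch of the header logic for form.html. The function name `respond` and the constant `FORM_ETAG` are hypothetical, and the `Cache-Control: no-store` header on the 205 is an extra precaution assumed here, not something the RFC passage above requires.

```python
from typing import Dict, Optional, Tuple

# Validator for the GET representation of form.html (value taken from the question).
FORM_ETAG = '"DNEI297"'


def respond(method: str, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to form.html (illustrative only)."""
    if method == "GET":
        if if_none_match == FORM_ETAG:
            # The client already holds the current form: revalidate with 304.
            return 304, {"ETag": FORM_ETAG}
        return 200, {"ETag": FORM_ETAG, "Content-Type": "text/html"}
    if method == "POST":
        # The empty 205 is not what a successful GET would return,
        # so it carries no ETag. no-store is an assumed extra safeguard.
        return 205, {"Cache-Control": "no-store"}
    return 405, {"Allow": "GET, POST"}
```

With this shape, only responses that describe the GET representation of the form ever carry the validator, so a later 304 cannot be matched against the stored 205.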

Related

Chrome does not invalidate cache when PUT request contains If-Match header

I'm creating a HTTP Web API where some of my resources will be cacheable. A cacheable resource will have two operations, GET & PUT. The GET will return response headers of Cache-Control: public,max-age=3600 & Etag: "2kuSN7rMzfGcB2DKt67EqDWQELA". The PUT will require the If-Match header which will contain the Etag value from a GET of the same resource. My goal is to have the browser cache invalidate a resource when I PUT to that resource. This works fine until I add the If-Match header to the PUT request. When the PUT request contains the If-Match header, subsequent GET requests will fetch from the cache which would be stale data. This is the behavior I've been experiencing with Chrome. Firefox doesn't behave like this, and works as I assume it should. Is this a bug in Chrome or am I misunderstanding some part of the HTTP spec?
Here are some example requests to show behavior:
//correctly fetches from origin server (returns 200)
GET http://localhost/api/my-number/1
Response Headers
cache-control: public,max-age=3600
etag: "2kuSN7rMzfGcB2DKt67EqDWQELA"
Response Body
7
//correctly fetches from disk cache (returns 200)
GET http://localhost/api/my-number/1
Response Headers
cache-control: public,max-age=3600
etag: "2kuSN7rMzfGcB2DKt67EqDWQELA"
Response Body
7
//correctly updates origin server (returns 200)
PUT http://localhost/api/my-number/1
Request Headers
if-match: "2kuSN7rMzfGcB2DKt67EqDWQELA"
Request Body
8
//incorrectly still fetches from disk cache (returns 200)
GET http://localhost/api/my-number/1
Response Headers
cache-control: public,max-age=3600
etag: "2kuSN7rMzfGcB2DKt67EqDWQELA"
Response Body
7
This is indeed incorrect behavior. RFC 7234 says:
A cache MUST invalidate the effective Request URI... as well as the URI(s) in the Location and Content-Location
response header fields (if present) when a non-error status code is
received in response to an unsafe request method.
Given that, the bug report you filed looks appropriate to me.
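To make the quoted rule concrete, here is a small Python sketch of a cache that applies it. `TinyCache` and its methods are hypothetical names for illustration, not any browser's implementation, and the sketch omits the RFC's further restrictions on invalidating Location and Content-Location URIs on other hosts. Note that the request headers, including If-Match, play no part in the decision.

```python
from typing import Dict, Optional

# Methods defined as safe by HTTP; everything else is treated as unsafe.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


class TinyCache:
    """Toy URI -> body cache that invalidates entries on unsafe requests."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def put(self, uri: str, body: str) -> None:
        self._store[uri] = body

    def get(self, uri: str) -> Optional[str]:
        return self._store.get(uri)

    def on_response(self, method: str, uri: str, status: int,
                    response_headers: Optional[Dict[str, str]] = None) -> None:
        """Apply the invalidation rule after a response has been received."""
        if method.upper() in SAFE_METHODS:
            return
        if not 200 <= status < 400:
            # Only non-error (2xx or 3xx) responses trigger invalidation.
            return
        self._store.pop(uri, None)
        for name in ("Location", "Content-Location"):
            target = (response_headers or {}).get(name)
            if target is not None:
                self._store.pop(target, None)
```

In the scenario from the question, a successful PUT to http://localhost/api/my-number/1 would remove the stored "7", so the next GET has to go back to the origin server.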

Response is not appearing in chrome Developer Tool but Dynamic Form data is posting for the next request

I recorded a script with JMeter for 4 transactions: launch, logon, continue, logoff. I am seeing a redirect error for the continue transaction, and in the browser I do not see any response for its requests. But in JMeter I do see response data for every request in the continue transaction. I have an ID token value that I want to substitute into the next request as POST data.
Continue transaction
request..response (I am seeing response data with ID_token in JMeter but not in the browser)
request (ID_Token is posted here) - I need to get the final response for the continue transaction.
As per Redirections in HTTP guide:
In HTTP, redirection is triggered by a server sending a special redirect response to a request. Redirect responses have status codes that start with 3, and a Location header holding the URL to redirect to.
As per RFC 2616, a response body is not required for 3xx redirect responses; moreover, for the 304 Not Modified status it is even forbidden. So it is absolutely fine to have no response body for 3xx status codes, as long as you have a Location header that points you to the next page.
Just make sure that JMeter sends the same requests, and that the server treats them the same way as requests from the real browser, by comparing the request flow in your browser developer tools with the one sent by JMeter. In case of a mismatch, experiment with the Redirect automatically and Follow redirects checkboxes in the HTTP Request sampler or in the HTTP Request Defaults configuration element.
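For orientation, the core of what following a redirect means can be sketched in a few lines of Python. The helper `next_hop` and the example URLs are hypothetical; this is not how JMeter is implemented, only an illustration of why the Location header matters more than the body of a 3xx response.

```python
from typing import Dict, Optional
from urllib.parse import urljoin

# Status codes that ask the client to issue a follow-up request.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def next_hop(status: int, headers: Dict[str, str], current_url: str) -> Optional[str]:
    """Return the URL to request next, or None if there is no redirect to follow."""
    if status not in REDIRECT_STATUSES:
        # 304 Not Modified, for example, starts with 3 but is not a redirect.
        return None
    location = headers.get("Location")
    if location is None:
        return None
    # A relative Location is resolved against the URL that was just requested.
    return urljoin(current_url, location)
```

For example, `next_hop(302, {"Location": "/continue?step=2"}, "http://example.test/login")` resolves to `http://example.test/continue?step=2`, which is the request a client following redirects would send next.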

apache httpclient and etag cache

I'm using Apache HttpClient 4.3.1 and I'm trying to integrate etag validation cache.
I've tried to "drop in" httpclient-cache's CachingHttpClientBuilder in place of my usual HttpClientBuilder, following the instructions here, but that didn't seem to do any good. While tracing the execution, it seems that a response with an "etag" header (weak etag) isn't considered cacheable, and so isn't retained for the next cycle.
Has anyone managed to use an etag-validation-based cache with Apache HttpClient? I'm also open to alternative implementations.
Notes:
The server returns the first request with a weak etag header (W/"1234"). If the second request to the same URL has "If-None-Match=1234", the server returns 304. This is checked and working.
The server does not send any other cache header (expires, etc).
The whole setup works wonderfully when using a modern browser.
Whether a response is considered cacheable is decided in ResponseCachingPolicy#isResponseCacheable(org.apache.http.HttpRequest, org.apache.http.HttpResponse), which checks for some headers using ResponseCachingPolicy#isExplicitlyCacheable.
When the 'Expires' header is set, or the 'Cache-Control' header has one of the values "max-age", "s-maxage", "must-revalidate", "proxy-revalidate" or "public", the response is considered cacheable.
For us, it worked to add "Cache-Control: must-revalidate" to the response on the server, along with the "ETag" header.
With these settings, the Apache HTTP client:
stores the response to the first request in the cache
on the second request, sends a request to the server and, if the server responds with HTTP status 304 (Not Modified), returns HTTP status 200 (OK) and the original content to the caller
That is how it should be.
We are using release 4.5.2 of apache http client cache.
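The header check described above can be paraphrased in Python to show why an ETag on its own was not enough. This is only a sketch of the rule as stated in this answer, not a port of the actual ResponseCachingPolicy source; the function name `is_explicitly_cacheable` is borrowed from the Java method for readability.

```python
from typing import Dict

# Cache-Control directives that, per the description above, mark a response as cacheable.
EXPLICIT_DIRECTIVES = {"max-age", "s-maxage", "must-revalidate", "proxy-revalidate", "public"}


def is_explicitly_cacheable(headers: Dict[str, str]) -> bool:
    """Return True if the response headers explicitly permit caching."""
    for name, value in headers.items():
        lowered = name.lower()
        if lowered == "expires":
            return True
        if lowered == "cache-control":
            for directive in value.split(","):
                # Strip any "=value" part, e.g. "max-age=3600" -> "max-age".
                token = directive.strip().split("=", 1)[0].lower()
                if token in EXPLICIT_DIRECTIVES:
                    return True
    return False
```

Under this rule, a response carrying only `ETag: W/"1234"` is rejected, while the same response with `Cache-Control: must-revalidate` added is accepted, which matches the behavior reported in the question and the fix described here.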

Chrome HEAD request?

Why does Chrome send a HEAD request? Example in logs:
2013-03-04 07:43:51 W3SVC7 NS1 GET /page.html 80 - *.*.*.* HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/537.22+(KHTML,+like+Gecko)+Chrome/25.0.1364.97+Safari/537.22
2013-03-04 07:43:51 W3SVC7 NS1 HEAD / - 80 - *.*.*.* HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/537.22+(KHTML,+like+Gecko)+Chrome/25.0.1364.97+Safari/537.22
I have a ban system, and this HEAD request is really annoying; it happens in exactly the same second as the GET request.
What is the nature of it? Any help appreciated.
P.S.: I noticed that the HEAD requests all go only to my homepage.
RFC 2616 states:
9.4 HEAD
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.
The response to a HEAD request MAY be cacheable in the sense that the
information contained in the response MAY be used to update a
previously cached entity from that resource. If the new field values
indicate that the cached entity differs from the current entity (as
would be indicated by a change in Content-Length, Content-MD5, ETag
or Last-Modified), then the cache MUST treat the cache entry as
stale.
Most likely it is trying to verify that the client's cookie/session is still valid with the server.

How does HTTP and HTML Work Together?

The answer to this little question will clear everything up for me.
Say I have a form tag that has a GET method and an action of some random script.
When I hit the submit button on the page, is the GET method sent to HTTP, and is HTTP what appends the query string to the URL? Does HTTP then return a 20X status if the response is good and a 40X if it is bad? And does our action go to our web server to run the script?
HTTP is transport and HTML is content. Submitting the form issues a GET or POST request to the server, depending on the method defined for the HTML form. The browser's form logic adds the form's arguments to the HTTP request: depending on whether GET or POST is used, they are appended to the request URL or put into the request body.
Then the request is handled on the server and the result is returned by the server logic (which can be a CGI, some Perl script, a J2EE application etc.).
The server responds with an HTTP status code (2xx codes indicate success, 3xx redirection, and 4xx and 5xx errors; see the list of HTTP status codes).
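A short Python sketch of the GET versus POST difference described above. The function `build_form_request`, the action `/search.cgi` and the field names are made up for illustration, and the sketch assumes the default application/x-www-form-urlencoded encoding and an action URL without an existing query string.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def build_form_request(method: str, action: str,
                       fields: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (request_url, request_body) the way a browser would submit the form."""
    encoded = urlencode(fields)  # e.g. {"q": "a b"} -> "q=a+b"
    if method.upper() == "GET":
        # GET: the form data becomes the query string; there is no body.
        return f"{action}?{encoded}", None
    # POST: the URL stays as the action; the form data travels in the body.
    return action, encoded
```

For example, `build_form_request("GET", "/search.cgi", {"q": "http cache", "page": "1"})` yields the URL `/search.cgi?q=http+cache&page=1` with no body, while the same call with `"POST"` leaves the URL as `/search.cgi` and puts `q=http+cache&page=1` into the body.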
You are sending your form's data via HTTP using a "get" request. HTTP is a protocol and not a server. Your request is handled by a server that knows how to speak the HTTP protocol, e.g. Apache.
The server processes the data and sends back a response. As you mention, there are different kinds of responses. 404 is the best known (document not found).
The script named in the form's action is run on the server, not in the browser; only scripts embedded in the page itself (such as JavaScript) run on the client.
HTML is the markup code that describes the structure of the page. Browsers interpret the HTML code they receive and construct your page from it. Check here for more details: Wikipedia: HTML
The HTTP is the protocol used by the browser to talk to the server. Check this for more details: Wikipedia again: HTTP