Page load / cache analysis - HTML

Hello all, please help with the analysis of my page.
Question 1
Since everything is loaded from cache, why is the load time 690 ms?
Question 2
What would be the reason to use private, max-age=60000? That is, public, max-age=60000 vs. private, max-age=60000.
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching?hl=en

First, load time isn't defined only by the time it takes to get assets from the network. Painting and parsing the page can take a lot of time, as can parsing and executing JavaScript. In your case, DOMContentLoaded only fires after 491 milliseconds, so that's already part of the answer.
As to your second question, the answer really is in the link you provided:
If the response is marked as “public” then it can be cached, even if it has HTTP authentication associated with it, and even when the response status code isn’t normally cacheable. Most of the time, “public” isn’t necessary, because explicit caching information (like “max-age”) indicates that the response is cacheable anyway.
By contrast, “private” responses can be cached by the browser but are typically intended for a single user and hence are not allowed to be cached by any intermediate cache - e.g. an HTML page with private user information can be cached by that user’s browser, but not by a CDN.
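To make the distinction concrete, here is a minimal sketch of both headers as they might be set in a Django view (Django and the view names are chosen purely for illustration):

from django.http import HttpResponse

def account_page(request):
    # Per-user content: the user's browser may cache it, but shared
    # caches (CDNs, proxies) must not store it.
    response = HttpResponse("<html>... user-specific content ...</html>")
    response["Cache-Control"] = "private, max-age=60"
    return response

def landing_page(request):
    # Identical for everyone: browsers and intermediaries may cache it.
    response = HttpResponse("<html>... shared content ...</html>")
    response["Cache-Control"] = "public, max-age=60"
    return response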

Related

What is the RESTful way to return a JSON + binary file in an API

I have to implement a REST endpoint that receives start and end dates (among other arguments). It does some computation to generate a result that is a kind of forecast, based on the server state at invocation time and the input data (imagine a weather forecast for the next few days).
Since the endpoint does not alter the system state, I plan to use the GET method and return JSON.
The issue is that the output also includes an image file (a plot). So my idea is to create a unique id for the file and include a URI in the JSON response to be consumed later (I think this is the way suggested by the HATEOAS principle).
My question: since this image file is a resource that is valid only as part of the response to a single invocation of the original endpoint, I need a way to delete it once it has been consumed.
Would it be RESTful to delete it after serving it via a GET?
Or expose it only via a DELETE?
Or not delete it on consumption and keep it for some time? (A purge should be performed anyway, since I can't ensure the client consumes the file.)
I would appreciate your ideas.
Would it be RESTful to delete it after serving it via a GET?
Yes.
Or expose it only via a DELETE?
Yes.
Or not delete it on consumption and keep it for some time?
Yes.
The last of these options (caching) is a decent fit for REST in HTTP, since we have metadata that we can use to tell general-purpose components that a given representation has a finite lifetime.
So the reference to the report (which includes the link to the plot) could be accompanied by an Expires header that informs the client that the representation of the report has an expected shelf life.
You might, therefore, plan to garbage collect the image resource after 10 minutes, and if the client hasn't fetched it before then - poof, gone.
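For example, a rough sketch of serving the plot with such a header (Django is used purely for illustration, and load_plot is a hypothetical storage lookup):

from datetime import datetime, timedelta, timezone
from django.http import HttpResponse

def plot_view(request, plot_id):
    png_bytes = load_plot(plot_id)  # hypothetical lookup of the stored plot
    response = HttpResponse(png_bytes, content_type="image/png")
    # Advertise a ten-minute shelf life for this representation.
    expires_at = datetime.now(timezone.utc) + timedelta(minutes=10)
    response["Expires"] = expires_at.strftime("%a, %d %b %Y %H:%M:%S GMT")
    return response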
The reason you might want to keep the image around after you send the response to the GET: the network is unreliable, and the response may never reach its destination, so the client may need to repeat the request. Having the image cached saves you the compute of recalculating it.
If you want confirmation that the client did receive the data, then you must introduce another message to the protocol, for the client to inform you that the image has been downloaded successfully.
It's reasonable to combine these strategies: schedule yourself to evict the image from the cache in some fixed amount of time, but also evict the image immediately if the consumer acknowledges receipt.
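A minimal sketch of that combination (the in-memory store and the names here are assumptions, not part of the question):

import threading

PLOT_TTL_SECONDS = 600  # assumed ten-minute shelf life
plots = {}              # plot_id -> PNG bytes; in-memory for illustration

def store_plot(plot_id, png_bytes):
    plots[plot_id] = png_bytes
    # Scheduled eviction, in case the client never acknowledges receipt.
    timer = threading.Timer(PLOT_TTL_SECONDS, plots.pop, args=(plot_id, None))
    timer.daemon = True
    timer.start()

def handle_delete(plot_id):
    # The client acknowledged receipt (e.g. via DELETE): evict right away.
    plots.pop(plot_id, None)
    return 204  # No Content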
But REST doesn't make any promises about liveness - you could send a response with a link to the image, but 404 Not Found every attempt to GET it, and that's fine (not useful, of course, but fine). REST doesn't promise that resources have stable representations, or that the resource is somehow eternal.
REST gives us standards for how we request things, and how responses should be interpreted, but we get a lot of freedom in choosing which response is appropriate for any given request.
You could offer a download link in the JSON response to that binary resource that also contains the parameters required to generate the resource. Then you can decide yourself when to clean that file up (managing disk space) or cache it, and you can always regenerate it because you still have the parameters. I assume here that the generation doesn't take significant time.
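For instance (a sketch; the URL scheme and helper name are made up):

from urllib.parse import urlencode

def plot_link(start, end):
    # Embed the generation parameters in the link itself, so the image
    # can be regenerated on demand if the cached file was cleaned up.
    return "/plots?" + urlencode({"start": start, "end": end})

# plot_link("2020-01-01", "2020-01-07")
# -> "/plots?start=2020-01-01&end=2020-01-07"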
It's a tricky one. Typically GET requests should be repeatable, which is an important HTTP feature, in case the original request failed. Some clients might rely on it.
It could also be construed as a 'non-safe' operation, GET resulting in what is effectively a DELETE.
I would be inclined to expire the image after X seconds/minutes instead, perhaps also supporting DELETE at that endpoint if the client got the result and wants to clean up early.

The fantastic but confusing idea of Resource Hints: (a)synchronous?

I've been reading through Google's slides on so-called pre-optimisation. (For those interested, or for those who do not know what I'm talking about, this slide kinda summarises it.)
In HTML5 we can prefetch and prerender pages in the link element. Here's an overview. We can use the rel values dns-prefetch, subresource, prefetch and prerender.
The first confusing thing is that apparently only prefetch is in the spec for HTML5 (and 5.1) but none of the others are. (Yet!) The second, that browser support is OK for (dns-)prefetch but quite bad for the others. Especially Firefox's lack of support for prerender is annoying.
Thirdly, the question that I ask myself is this: does the prefetching (or any other method) happen as soon as the browser reads the line (and does it then block the current page load), or does it wait to load the resources in the background until the current page has loaded completely?
If it's loaded synchronously in a blocking manner, is there a way to do this asynchronously or after page load? I suppose I could use a JS solution like this, but I'm not sure it will run asynchronously then.
// Inject the hint at runtime (e.g. after the load event) instead of in the markup:
var pre = document.createElement("link");
pre.setAttribute("rel", "prerender prefetch");
pre.setAttribute("href", "next-page.php");
document.head.appendChild(pre);
Please answer both questions if applicable!
EDIT 17 September
After reading through the Editor's draft of Resource Hints I found the following (emphasis mine):
Resource fetches that may be required for the next navigation can negatively impact the performance of the current navigation context due to additional contention for the CPU, GPU, memory, and network resources. To address this, the user agent should implement logic to reduce and eliminate such contention:
Resource fetches required for the next navigation should have lower relative priority and should not block or interfere with resource fetches required by the current navigation context.
The optimal time to initiate a resource fetch required for the next navigation is dependent on the negotiated transport protocol, users current connectivity profile, available device resources, and other context specific variables. The user agent is left to determine the optimal time at which to initiate the fetch - e.g. the user agent may decide to wait until all other downloads are finished, or may choose to pipeline requests with low priority if the negotiated protocol supports the necessary primitives. Alternatively, the user agent may opt-out from initiating the fetch due to resource constraints, user preferences, or other factors.
Notice how much is left up to the user agent ("the user agent may..."). I really fear that this will lead to different implementations in different browsers, which will lead to divergence again.
The question remains, though. It isn't clear to me whether loading an external resource via prefetch (or another hint) happens synchronously (and thus, when placed in the head, before the content is loaded) or asynchronously (with a lower priority). I am guessing the latter, though I don't understand how that is possible, because there's nothing in the link spec that would allow asynchronous loading of link elements' content.
Disclaimer: I haven't spent a lot of dedicated time with these specs, so it's quite likely I've missed some important point.
That said, my read agrees with yours: if a resource is fetched as a result of a pre-optimisation, it's likely to be fetched asynchronously, and there's little guarantee about where in the pipeline you should expect it to be fetched.
The intent seems advisory rather than prescriptive, in the same way that the CSS will-change property advises rendering engines that an element should receive special consideration, but doesn't prescribe the behavior, or indeed that there should be any particular behavior.
there's nothing in the link spec that would allow asynchronous loading of link elements' content
Not all links would load content in any case (an author type wouldn't cause the UA to download the contents of a mailto: URL), and I can't find any mention of fetching resources in the spec apart from that in the discussion around crossorigin:
The exact behavior for links to external resources depends on the exact relationship, as defined for the relevant link type. Some of the attributes control whether or not the external resource is to be applied (as defined below)... User agents may opt to only try to obtain such resources when they are needed, instead of pro-actively fetching all the external resources that are not applied.
(emphasis mine)
That seems to open the door for resources specified by a link to be fetched asynchronously (or not at all).

New content not visible because of browser cache

I have updated my website with some new content.
I asked some people to view the content on their computers, but it seems they can't see it unless they clear their browser cache. Is there a way to handle this on my side, so that all the new content shows up automatically in every browser?
There is no way to accomplish this from your side once the old content is cached.
However, users can easily force a reload that bypasses the cache by pressing Ctrl+Shift+R (at least in Firefox).
You cannot remote-wipe someone's cache. This time your only options are to wait, to tell your users to clear their cache, or to instruct them to vigorously press refresh a few times, which will cause most browsers to re-request the page.
For future reference, there are two types of caching: expiration-based caching and ETag-based caching.
If you set an explicit expiration date on your HTTP response, the client will not check back at all until that expiration date has passed. This greatly reduces network traffic, at the tradeoff of possibly having outdated content out there. Choose your expiration dates wisely for the best tradeoff.
The alternative is ETags, in which case the server sends an ETag token, and the client will inquire with "send me new content unless this token is still valid". This only reduces network traffic somewhat, but you're guaranteed to always have the latest content out there.
You need to balance your caching strategy in practice. First decide if you need caching at all, then decide how much you need and what tradeoff you're willing to make. For a high-traffic site even a cache of a few minutes can be worthwhile, while the issue of outdated content will be minuscule in this scenario.
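A minimal sketch of both strategies in Django (the view and helper names are invented for illustration):

import hashlib
from django.http import HttpResponse
from django.views.decorators.cache import cache_control
from django.views.decorators.http import condition

def render_body():
    # Hypothetical helper that builds the page HTML.
    return "<html><body>latest content</body></html>"

# Strategy 1: expiration. The client will not check back for 5 minutes.
@cache_control(max_age=300)
def expiring_page(request):
    return HttpResponse(render_body())

# Strategy 2: ETag. The client checks back on every request, but gets a
# cheap 304 Not Modified while the content fingerprint is unchanged.
def body_etag(request):
    return hashlib.md5(render_body().encode()).hexdigest()

@condition(etag_func=body_etag)
def etag_page(request):
    return HttpResponse(render_body())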
You cannot erase a remote user's browser cache from server- or client-side code.
Going forward, the best you can do is tell browsers not to cache at all, or to cache for a specified time (shorter than the interval until the next expected update):
Cache-Control: no-cache
Cache-Control: max-age=315600
An ETag sounds like it could serve the purpose (though I've never used one):
The server generates and returns an arbitrary token which is typically a hash or some other fingerprint of the contents of the file. The client does not need to know how the fingerprint is generated, it only needs to send it to the server on the next request: if the fingerprint is still the same then the resource has not changed and we can skip the download.
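A sketch of that flow from the client's side, using Python's standard library (the URL and the token value are made up):

import urllib.error
import urllib.request

req = urllib.request.Request("https://example.com/app.js")
req.add_header("If-None-Match", '"abc123"')  # token from an earlier response
try:
    with urllib.request.urlopen(req) as resp:
        body = resp.read()                   # 200 OK: content has changed
        new_etag = resp.headers.get("ETag")  # remember for the next request
except urllib.error.HTTPError as err:
    if err.code == 304:
        pass  # Not Modified: reuse the cached copy and skip the download
    else:
        raise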

Reducing response size

I am working on a web application and I am using a polling approach to check if an update is needed. These polling requests occur every 1 or 2 seconds. The size of the response is 240 bytes if no update is needed (an empty response is returned in that case), and around 10 KB otherwise, which is the size of the content itself. My problem is: since it returns at least 240 B roughly every second, is there a way to optimize this response by pushing the boundaries a bit more?
When I checked the contents of the response, I saw that only 50 bytes are essential for me (session id and status code). However, there is some information in the headers such as connection type, timeout, and content type. These settings will be the same for every request of this type (e.g. the content type is always "text/html; charset=utf-8"). So, can I just assume these settings on the client side and prevent the server from sending this header info?
I am using Django on the server side and jQuery for sending AJAX requests, by the way. Also, any kind of push technology is out of the question for now.
It does add up, but not as much as you think. If you polled every second for a full hour, you'd have used only 864 KB, less than a typical webpage requires with an unprimed cache. Even if you did it for a full day, you're talking about ~20 MB. Maybe if you're someone like Twitter you might need to be concerned about this, but I doubt you'll be getting anywhere near the traffic it would take for this to actually be problematic.
Nevertheless, you can of course customize the headers of the response, but what impact, if any, this will have on the client is a matter for testing. Some headers can probably be dropped, but others may surprise you, and it could technically vary from browser to browser as well.
One solution to this kind of problem is "long polling". The polling client will send a request, and the webserver checks to see if there is an update. If there is not, the webserver sleeps for a second or two and then checks again in a loop, without sending a response. As soon as this loop sees an update, it sends a response. To the client web browser, it will look like the server is congested and taking a long time to respond, but actually the relevant data is being transmitted promptly and the "no data" responses are simply being skipped.
I'd recommend adding a timeout to the loop -- say 30 or 60 seconds -- after which the webserver would reply with "no data" as usual. Even just a 30 second cycle would cut your empty response load by a factor of 15-30.
Caveat: I've read about this kind of implementation but I haven't tried it myself. You will need to test compatibility with various web browsers to ensure that this fairly nonstandard method doesn't cause issues on the client side.
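For what it's worth, a rough server-side sketch (Django; check_for_update is a hypothetical application hook, and note that a loop like this ties up a synchronous worker per waiting client):

import time
from django.http import JsonResponse

POLL_INTERVAL = 1  # seconds between server-side checks
POLL_TIMEOUT = 30  # reply with "no data" after this long, as suggested above

def long_poll(request):
    deadline = time.monotonic() + POLL_TIMEOUT
    while time.monotonic() < deadline:
        update = check_for_update(request)  # hypothetical application hook
        if update is not None:
            return JsonResponse({"status": "update", "data": update})
        time.sleep(POLL_INTERVAL)  # nothing yet; keep the request open
    return JsonResponse({"status": "no-data"})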

How to trigger browser html refresh for cached html files?

YSLOW suggests: For static components: implement "Never expire" policy by setting far future Expires header.... if you use a far future Expires header you have to change the component's filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component's filename, for example, yahoo_2.0.6.js.
http://developer.yahoo.com/performance/rules.html
I'd like to take advantage of caching for my mostly static pages and reload the js files when the version # changes. I've set a version # for my .js files, but my main.html page has Expires set to the future, so it doesn't reload and therefore doesn't reload the js files either. Ideally I'd like to tell the browser (using a psychic technique) to reload main.html when a new version of the site is released. I could make my main.html page always reload, but then I lose the caching benefit. I'm not looking for the Ctrl-F5 answer, as this needs to happen automatically for our users.
I think the answer is that main.html can't be cached, but I'd like to hear what others are doing to solve this problem. How are you getting the best of both caching and timely reloads?
Thanks.
Your analysis is correct. Web performance best practices suggest a far future expiration date for static components (i.e., those which don't change often), and using a version number in the URL manages those changes nicely.
For the main page (main.html), you would not set a far-future expiration date. Instead, you could omit the expiration entirely, or set it to a minimal amount of time, for example +24 hours.
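The versioning step the Yahoo! rule describes might look like this (a sketch; the static path and version scheme are made up, and the version would normally be stamped in by the build process):

ASSET_VERSION = "2.0.6"  # bumped by the build process on each release

def asset_url(name):
    # yahoo.js -> /static/yahoo_2.0.6.js: a new release yields a new URL,
    # so the far-future Expires header on the old file no longer matters.
    base, ext = name.rsplit(".", 1)
    return "/static/%s_%s.%s" % (base, ASSET_VERSION, ext)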
I guess it depends on why you want to cache the HTML page: to improve user load times or to reduce server load.
Even with a long expiry time you might find that it's not actually cached at the client for very long (Yahoo studies show that files don't live in the cache for very long), so a shorter expiry time, e.g. one day, might not be an issue.
If it's to reduce backend load, it might be worth looking at whether a proxy like Varnish would help, i.e. it caches pages from the origin server and serves them when requested. This way you could control how long pages are cached with a finer level of control.