How do we cache HTML "fragments"? - html

I have a page which looks like this:
<!doctype html>
<head></head>
<body>
<div>Content 1000 chars</div>
<div>Content 1000 chars</div>
<div>Content 1000 chars</div>
</body>
</html>
When a client downloads the page, basically he's downloading 3100 characters. If he visits the page again and the contents of the first div changes, he will have to redownload the entire page again (3100 characters).
Now basically I was wondering are we able to cache HTML fragments like the way we do with images?
So I was thinking is there somewhere to get this effect:
<!doctype html>
<head></head>
<body>
<div src="page1.html"></div>
<div src="page2.html"></div>
<div src="page3.html"></div>
</body>
</html>
So if I were to change the contents of page1.html, the browser would be able to know that only page1.html was changed since the last visit, and downloads 1000 characters instead of the entire page (3100) characters. Essentially this behavior is identical to what is happening now with images:
<!doctype html>
<head></head>
<body>
<img src="img1.gif">
<img src="img2.gif">
<img src="img3.gif">
</body>
</html>
whereby changing img1.gif will invoke the browser to redownload only img1.gif (assuming all the other files have not been edited)
To be clear, I'm not looking for an AJAX solution. I need a solution that works without javascript (as with all the above examples). I'm also not particularly in favor of the frames solution, However I would accept that as answer if there are simply no other alternatives / quirks / hacks

Have you thought of IFrames?
However, I think that this is such a micro-optimization, that it wouldn't have any advantage (except caching inside server applicvation, which is a completely other can o'worms).
(Or you're talking about way more than 3000 chars here.)
Edit: There is another solution, but it is not supported in any browser on HTML documents without using AJAX, and only in some server scenarios: HTTP Range requests. You can tell the server with an additional header to return only a certain range of a document:
GET /large-document.html HTTP/1.1
Accept-Range: bytes
Range: bytes=0-500
Response will contain only the first 500 bytes. This technique is used to resume aborted downloads, for example.
But as I said, this doesn't help you in your scenario. For one, no browser supports this without AJAX (or outside the download manager). And for the second, the client has no idea, which range to request, and where to put it in the already fetched document to replace the old part.
If you really need to support legacy browsers down to IE3 and Netscape 2 and even old text-browsers like legacy Lynx versions, use the classic <frameset>, not an <iframe>. It is supported in basically everything since the olden days of Mosaic and was back then specifically designed for this task. (So it was the tool of choice back then when the browsers came out that you seek to support.)

The only way I can think of for achieving what you want without any JavaScript is using frames. There are, however, a number of disadvantages to frames, which you should be aware of before using them in your website.

Modern versions of Firefox and Chrome do this natively - they cache images and code whenever they can. In fact, the only way to get reloads is to clear cache at the browser level.
You might also want to look into reverse-proxy caching, which essentially does what you are doing on a site-wide basis to avoid DB traffic. Varnish is a good option that will cache pages and is highly customizable.

I don't know if doing such optimizations is reasonable. Modern browsers accept data compression (and moderns servers do it), and the text compresses really well. You have to use output buffering (e.g. see ob_start in PHP) so the page won't be sent chunk-by-chunk in tiny pieces by the server, but it will wait some time for output to be ready, then compress it and send to client, and client uncompresses it.
Using frames as a layout technique is highly discouraged nowadays (maybe iframes are sometimes a good solution, but it depends).

Related

Link: Response Header VS HTML

I am currently working on a function to assist in preparing Link: HTTP header or a set of <link> tags and while reading different materials on this, I still am not able to find an answer to simple question: when to use Link: header and when to use <link>.
So far I can only say, that if you want to use HTTP20 server push, it is recommended to utilize the header. On the other hand, even if I push a stylesheet, it will not be applied unless there is a respective tag in HTML output.
Since I am preparing the library in order to help with some standardization and sanitization, I would like to catch, at least, some "weird" cases like this, if it's possible, but for that I need some set of recommendations or best practices in that regard. Sadly I am unable to find any thus far, so am turning to more knowledgeable people: what best practices or weird cases should I consider catching or should I just allow whatever to be sent regardless of whether it's a header or a tag?
If anyone is interested, the code is present in https://github.com/Simbiat/HTTP20/blob/main/src/Headers.php (links function).
They are supposed to be equivalent as #Evert states so in theory you can use either. However there are some considerations:
Headers are usually set in web server config (at least for static pages) which may not be as easy to update for developers.
However it has the added advantage that you can set these for multiple pages all at once (e.g. preload your core fonts on every .html file, rather than having to remember to set this on all pages, or all page templates if using a CMS).
On the other side with the HTML version it’s often easier to configure it per page (or page template), if you have different needs (e.g. different fonts are used in different pages).
There’s also some which say there are slight performance considerations to doing it in the header but honestly, as long as it’s high enough in the <HEAD> element I really think you’d struggle to notice this.
Of perhaps of more importance is whether it’s passed on hop to hop if your web server is hidden behind other infrastructure (e.g. a CDN or other proxy). In theory it should be, for simple headers, but for things like HTTP/2 push that’s not so easy. If it’s in the HTML you don’t need to worry about this (assuming intermediaries are not changing the markup of course!).
You mentioned the HTTP/2 push use case and that definitely needs the header (though this is not a defined standard method of setting push and some servers or CDNs use other methods, but many use this). However given HTTP/2 push’s complexities and concerns it can cause more problems than it solves, this is maybe a reason to recommend the HTML method to ensure it’s never pushed.
All in all I recommend setting this in the HTML. It’s just easier.
This is not the case however with other, similar things, which can be set in HTML and HTTP headers. CSP for example is limited in the HTML version, lacking some features of the HTTP Header version, and is also not recommended as it could be altered with JavaScript whereas the HTTP header cannot. But for simple Link headers these are less of a concern.

Does commented html causes page loading time

Does commented html elements and White spaces will causes page loading time.I know commented markup will not run by browser.compressing markup is the good idea while deploying in server
<!--Html Elements -->
It will still increase your page size, but shouldn't be a problem. Having 10000 lines of commented-out HTML is going to be a problem though, but keeping your comments small, should not increase the page size by too much.
It won't be run by the browser, but it will be in every case streamed by the server, and downloaded by the client. It shouldn't make any difference, as long as you don't have enourmous amounts of characters in there.
If you're using dynamic pages generated server side, you might be interested in server-side comments, that don't get streamed in the response, so the client never downloads/sees them.For instance, in JSP, <%-- this is a server-side comment --%>.
Also, remember that javascript code is not affected by these comments, actually, <!-- --> is used to to avoid javascript code showing up on old old browsers that didn't support javascript. See this link: Hiding JS code from old browsers.
Your page size will be increased but the execution time will stay the same as the comments are not parsed.
If you are using a server-side language, you can use their comments instead, like so:
<ul>
<li>Something</li>
<?php /* <li>Else</li> */ ?>
</ul>
This will hide the HTML from anyone poking around in your source code and will reduce the page size as the commented out HTML will not be sent to the user.
You could also use PHP's output control functions to automatically strip out any HTML comments. More info here: http://www.php.net/manual/en/ref.outcontrol.php
You can also use those functions to compress your pages, remove whitespace, etc, which will also speed up page load by decreasing page size.

Is it possible to cache JS inside <script/> tags on HTML page?

When JavaScript is inside external .js file it's cached on the browser.
Is it possible to do the same with JS inside tags on HTML page?
No.
Alright a little insight of what you're asking for.
Imagine the script tag's content is cached. What would be the name of it? How the browser would identify? Alright somehow it manages to do that. But then comes the real question:
What would you benefit from that? You have no access to the browser's cache so you need to send the same inline script tag with every request as it can be in the cache or not.
To sum up, it is:
impossible
useless
Why not? It may be cached along with the entire html file, but then of course any dynamic content must be done with ajax ;-)
The tag itself, no. This is one of the nice benefits of external JS files, in addition to separation of concerns. It is not normal to cache individual tags in a document, though, and I'm not sure what the benefit could conceivably be. Either the whole HTML document is cached or none is — never just the <script>s or the <p>s or whatever.

HTML Line Spacing & Compact Code

I have the following example code:
<body>
<div id="a"></div>
<div id="b"></div>
</body>
If I add empty lines between each of my original lines, like this:
<body>
<div id="a"></div>
<div id="b"></div>
</body>
does that do anything to my site's performance? Will the page load slower?
Yes, compact code speeds up page loading due to decreased payload...but not by a measurable amount, at least in most cases, unless your page is massive you won't see a difference.
Pages should be delivered via gzip, making the size difference between spaced and un-spaced negligible, just do what's readable to you, you'll thank yourself later. As with everything, you have to weight the costs, in this case a very minor difference in payload size, with what's easiest to maintain for you.
In theory, yes.
For the server, if it has to send out a 1MB file to each client, it has to spend n amount of time and resources sending out that one file. Now, if you were able to cut the file size in half, the time and resources it would take per user on the server would be .5n.
For the client, it has to download a file. Assuming a download rate of 25KB/S, a 1MB file would take 41 seconds to download. A .5MB file would take 20.5s. Thats a savings of 20 seconds by reducing the file size.
However, in practice. No, I would not worry about it, unless you're dealing with audio/video/picture data. That's because a character in a HTML document is only a couple bytes. Sure, you might have lets say 100 extra characters that you could trim and remove - whitespace for instance. At most you'd save up an additional 1KB per page.
I wouldn't be too concerned about it, unless you're developing an application or solution where it needs to be compact. But any modern or sub-modern computer won't break with 1KB extra data in their HTML file.
in your example, it saves up 3 bytes of code... so i don't think it has any noticeable effect on page loading time in modern times and it's internet speed. a better improvement would be to send your page gziped.
Page loading and compact code ? yes it really make things better as additional newlines and spaces are nothing but characters which need to be downloaded on the client end.
However i will suggest you to see it as part of the big strategy for optimization.
I will suggest you to take a look at YSlow/Yahoo Guidelines which will help you understand the different parts of "strategy" which is added to server and client components also. And collective results are just amazing for big sites.

Fastest way to load CSS -- inline vs HEAD

I have a 50x50px div I want to display on my homepage as fast as possible.
Is it faster for me to do
<div style="height:50px;width:50px">
Or to assign it a class to avoid the inline style:
<div class="myDiv">
And put the myDiv class in a CSS file in the HEAD section of the HTML page?
My thought was that the first one should be faster since it doesn't need to request and recieve a CSS? I guess ultimatley I'm asking if BODY and HEAD get rendered sequentially or in parallel.
Without HEAD loading first there can be no BODY.
Before your BODY gets rendered, it has has to be loaded first. And if it is loaded, then the HEAD has already been loaded.
You're probably interested in whether a browser can load simultaneously both CSS files and the HTML document itself. It will depend on the browser implementation, but I believe most can download at least two documents simultaneously.
One other important thing is that the more files a document consists of, the more chances the request for one of them gets lost. So by using inline CSS you make sure the CSS never gets lost.
But I must point out that inline CSS is considered a bad style. Once you have a sufficient amount of markup, you will find it increasingly difficult to update your pages all at once. You will inevitably be losing one or the other instance. It is a much better idea to declare all styles in a separate document and reference them from pages. This way, when you need to change some color, you do it in one place and not in 37 places to be found in your pages.
As others already pointed out, the right thing to do would be to put the styles in an external file and refer to it in the <head> part of your document.
But if you're going for fast (and this is what you were asking for) then you should use the inline-declaration like
<div style="height:50px;width:50px">
There are several reasons for that:
You don't have to load an external file. This is very slow (compared to the next reason) since there is an additional HTTP request involved which (on top of the request and download itself) might be held back by other external files like JavaScript, favicons etc.
So it will already load faster if you put your declaration in some <style> tags on the same document. But then there is the next reason.
The browser does not have to look through the DOM tree and search for nodes with the class myDiv to apply the styles to. It finds the <div> and immediately (or at the next render turn) applies the style information.
This second delay will hardly be noticeable but if you are going for high performance, this is the way to go.
I agree that these may somewhat be theoretical reasons but here you go. :-)
There are cases when this would be a "good" practice. For example, you have a high value landing page, that requires about 500 bytes of CSS to support, verses the 200K Style sheet.
While true, that they customer will have to download that file on the NEXT page, time to render is often most important on the landing page.
Also, AFAIK, browsers will not begin rending until the entire CSS file is downloaded, which is not the case for inline styles. But yes, Best Practices, and 98% of the time you want to put CSS in a single linked file.
Use an embeded css file. After the first request the file will be cached by the browser and won't have to be downloaded again. Making the page load faster and reducing the strain on your server.
Placing styles inline is not only ugly it also undermines the whole cascading thing.
The differences in performance will be imperceptible and should be irrelevent. Instead of worrying about premature optimisations like this be more concerned with doing the "right thing" - and in this case the right thing is to use external style-sheet files for your CSS as it is more maintainable and separates concerns.