Does commented-out HTML increase page loading time? - html

Do commented-out HTML elements and white space increase page loading time? I know commented markup will not be run by the browser. Compressing the markup is a good idea when deploying to a server.
<!--Html Elements -->

It will still increase your page size, but that shouldn't be a problem. Having 10,000 lines of commented-out HTML would be a problem, but as long as you keep your comments small, they should not increase the page size by much.

It won't be run by the browser, but it will in every case be streamed by the server and downloaded by the client. It shouldn't make any difference, as long as you don't have enormous amounts of characters in there.
If you're using dynamic pages generated server-side, you might be interested in server-side comments, which don't get streamed in the response, so the client never downloads or sees them. For instance, in JSP: <%-- this is a server-side comment --%>.
Also, remember that JavaScript code is not affected by these comments. In fact, <!-- --> was used to keep JavaScript code from showing up in very old browsers that didn't support JavaScript. See this link: Hiding JS code from old browsers.

Your page size will be increased, but the execution time will stay essentially the same, as the parser skips over comments and never renders or executes them.
If you are using a server-side language, you can use its comments instead, like so:
<ul>
<li>Something</li>
<?php /* <li>Else</li> */ ?>
</ul>
This will hide the HTML from anyone poking around in your source code and will reduce the page size as the commented out HTML will not be sent to the user.
You could also use PHP's output control functions to automatically strip out any HTML comments. More info here: http://www.php.net/manual/en/ref.outcontrol.php
You can also use those functions to compress your pages, remove whitespace, etc, which will also speed up page load by decreasing page size.
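The idea of stripping HTML comments before the page is sent can be sketched in a few lines. This is a minimal illustration in Python rather than PHP; the helper name strip_html_comments and the sample markup are assumptions made for the example, not part of any library. It leaves IE conditional comments alone, and being regex-based it would also remove comment-like text inside script, style or pre blocks, so treat it as a sketch rather than a production minifier.

```python
import re

# Matches an ordinary HTML comment, including multi-line ones.
# The negative lookahead skips IE conditional comments (<!--[if ...]>).
COMMENT_RE = re.compile(r"<!--(?!\[if\b).*?-->", re.DOTALL)


def strip_html_comments(markup: str) -> str:
    """Remove ordinary HTML comments from markup (hypothetical helper)."""
    return COMMENT_RE.sub("", markup)


# Sample page with one commented-out list item (made up for this example).
page = """<ul>
<li>Something</li>
<!-- <li>Else</li> -->
</ul>"""

stripped = strip_html_comments(page)
saved = len(page.encode("utf-8")) - len(stripped.encode("utf-8"))
print(f"{saved} bytes saved by removing comments")
```

The saving is exactly the size of the removed comments, which is why a handful of short comments rarely matters while thousands of commented-out lines do.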

Related

Optimize CSS Delivery - a suggestion by Google

Google suggests putting the most important CSS inline in the head and the other CSS inside <noscript><link rel="stylesheet" href="small.css"></noscript>.
This raises a few questions in my mind:
How do I prioritize CSS across two files? Everything for that page looks important: display, font, etc. If I move it to the bottom, how does that help the page render? Won't it cause a repaint, etc.?
Is that CSS required only after the document ready event? Got it from here.
How can CSS go inside <noscript></noscript>, which is meant for scripts? Will it work when JavaScript is enabled? Is it compatible across browsers?
Reference
Based on my reading of the link given in the question:
Choose which CSS declarations are inlined based on eliminating the Flash-of-Unstyled-Content effect. So, ensure that all page elements are the correct size and colour. (Of course, this will be impossible if you use web-fonts.)
Since the CSS which is not inlined is deferrable, you can load it whenever makes sense. Loading it on DOMContentLoaded, in my opinion, goes against the point of this optimisation: launching new HTTP requests before the document is completely loaded will potentially slow the rest of the page load. Also, see my next point:
The example shows the CSS in a noscript tag as a fallback. Below the example, the page states
The original small.css is loaded after onload of the page.
i.e. using javascript.
If I could add my own personal opinion to this piece:
this optimisation seems especially harmful to code readability: style sheets don't belong in noscript tags and, as pointed out in the comments, it doesn't pass validation.
It will break any potential future enhancements to HTTP (or other protocol) requests, since the network transaction is hard-coded through javascript.
Finally, under what circumstances would you get a performance gain? Perhaps if your page loads a lot of initially-hidden content; however I would hope that the browser itself is able to optimise the page load better than this hack can.
Take this with a grain of salt, however. I would hesitate to say that Google doesn't know what they're doing.
Edit: note on flash-of-unstyled-content (abbreviated FOUC)
Say you have a block of text that spans multiple lines and includes some text with custom styling, say <span class="my-class">. Now, say that your CSS will set .my-class { font-weight:bold }. If that CSS is not part of the inline style sheet, .my-class will suddenly become bold after the deferred loading has finished. The text block may reflow, and might also change size if it requires an extra line.
So, rather than a flash of totally-unstyled content, you have a flash of partly-styled content.
For this reason you should be careful when considering what CSS is deferred. A safe approach would be to only defer CSS which is used to display content which is itself deferred, for example hidden elements which are displayed after user interaction.

Overhead of HTML whitespace indentation

I started wondering what is the overall impact of using whitespaces to indent html documents.
Why not simply use tabs to indent? Wouldn't this be more cost-effective: 1 char (\t) vs. example 4 chars (spaces)?
I did a little experimenting by converting an ASP.NET page to use tabs and comparing the sizes of the rendered markup.
Replacing the white space in only one partial view reduced a 22 KB page to 19.4 KB, a 12% reduction. After changing all indentation, the page came to 16.7 KB, a 24% reduction! (I used Chrome dev tools and Fiddler to verify.)
Is my reasoning sound? Should tabs be used primarily for indentation of HTML? Is there any reason to use spaces (such as compatibility with exotic browsers)?
P.S. Stack Overflow seems to use spaces too. Converting the SO main page to use tabs gave a 9% reduction. Is this a valid observation? If so, why haven’t they used tabs?
Stack Overflow uses HTTP compression. When this is turned on, the difference between using spaces and using tabs goes down a lot.
You need to run your tests against the compressed versions for reliable results.
You do have a point though for the cases when a browser does not support the compression schemes the server supports.
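To see why the comparison has to be made on compressed output, here is a small sketch that builds the same markup twice, once indented with four spaces and once with tabs, and compares raw and gzipped sizes. The build_markup helper and the synthetic list markup are assumptions for illustration; real pages will give different numbers, so measure your own output.

```python
import gzip


def build_markup(indent: str, rows: int = 200) -> bytes:
    """Build a synthetic nested list using the given indent unit (hypothetical helper)."""
    lines = ["<ul>"]
    for i in range(rows):
        lines.append(f"{indent}<li>")
        lines.append(f"{indent * 2}<span>Item {i}</span>")
        lines.append(f"{indent}</li>")
    lines.append("</ul>")
    return "\n".join(lines).encode("utf-8")


spaces = build_markup("    ")  # four spaces per indent level
tabs = build_markup("\t")      # one tab per indent level

# Sizes as they would travel with HTTP compression enabled.
gz_spaces = len(gzip.compress(spaces))
gz_tabs = len(gzip.compress(tabs))

print(f"raw:     spaces={len(spaces)} tabs={len(tabs)} diff={len(spaces) - len(tabs)}")
print(f"gzipped: spaces={gz_spaces} tabs={gz_tabs} diff={gz_spaces - gz_tabs}")
```

Runs of repeated indentation are exactly what a compressor handles well, so the gap that looks large in the raw sizes mostly disappears once the response is gzipped.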
First thing: HTML doesn't have a rule about indentation. It's done by programmers for code readability and to show the program's structure. Moreover, we can reduce the size taken up by indents and white space through compression.
Minifying/compacting/compressing HTML: compacting HTML code can save many bytes of data and speed up downloading, parsing, and execution time.
StackOverflow uses HTTP Compression
Minifying HTML has the same benefits as those for minifying CSS and JS: reducing network latency, enhancing compression, and faster browser loading and execution. Moreover, HTML frequently contains inline JS code (in <script> tags) and inline CSS (in <style> tags), so it is useful to minify these as well.
Note: This rule is experimental and is currently focused on size reduction rather than strict HTML well-formedness. Future versions of the rule will also take into account correctness. For details on the current behavior, see the Page Speed wiki.
Tip: When you run Page Speed against a page referencing HTML files, it automatically runs the Page Speed HTML compactor (which will in turn apply JSMin and cssmin.js to any inline JavaScript and CSS) on the files and saves the minified output to a configurable directory.
Refer : http://code.google.com/speed/page-speed/docs/payload.html#MinifyHTML
Why not simply use tabs to indent? Wouldn't this be more cost-effective: 1 char (\t) vs. example 4 chars (spaces)?
If you're worried about downloaded HTML size, you won't fuss over tabs-vs-spaces — you'll compress your HTML as it goes over the wire and minify your markup, CSS, and Javascript, which provide real savings and don't interfere with your own coding guidelines.

How do we cache HTML "fragments"?

I have a page which looks like this:
<!doctype html>
<head></head>
<body>
<div>Content 1000 chars</div>
<div>Content 1000 chars</div>
<div>Content 1000 chars</div>
</body>
</html>
When a client downloads the page, basically he's downloading 3100 characters. If he visits the page again and the contents of the first div changes, he will have to redownload the entire page again (3100 characters).
Now I was wondering: are we able to cache HTML fragments the way we do with images?
So I was thinking, is there some way to get this effect:
<!doctype html>
<head></head>
<body>
<div src="page1.html"></div>
<div src="page2.html"></div>
<div src="page3.html"></div>
</body>
</html>
So if I were to change the contents of page1.html, the browser would be able to know that only page1.html was changed since the last visit, and would download 1000 characters instead of the entire page (3100 characters). Essentially this behavior is identical to what is happening now with images:
<!doctype html>
<head></head>
<body>
<img src="img1.gif">
<img src="img2.gif">
<img src="img3.gif">
</body>
</html>
whereby changing img1.gif will cause the browser to redownload only img1.gif (assuming all the other files have not been edited).
To be clear, I'm not looking for an AJAX solution. I need a solution that works without JavaScript (as with all the above examples). I'm also not particularly in favor of the frames solution; however, I would accept that as an answer if there are simply no other alternatives / quirks / hacks.
Have you thought of IFrames?
However, I think that this is such a micro-optimization that it wouldn't have any advantage (except caching inside the server application, which is a completely different can of worms).
(Or you're talking about way more than 3000 chars here.)
Edit: There is another solution, but it is not supported in any browser on HTML documents without using AJAX, and only in some server scenarios: HTTP Range requests. You can tell the server with an additional request header to return only a certain range of a document (a server that supports this advertises it with the Accept-Ranges: bytes response header):
GET /large-document.html HTTP/1.1
Range: bytes=0-499
The response will contain only the first 500 bytes, since byte ranges are inclusive at both ends. This technique is used to resume aborted downloads, for example.
But as I said, this doesn't help you in your scenario. First, no browser supports this without AJAX (or outside the download manager). Second, the client has no idea which range to request, or where to put it in the already fetched document to replace the old part.
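To make the range semantics concrete, here is a minimal sketch of how a server might select the bytes for a single-range request. The apply_range helper is hypothetical and handles only the simple "bytes=first-last" and "bytes=first-" forms; real servers also handle suffix ranges, multiple ranges and the 206/416 status codes. Byte positions are inclusive at both ends, so bytes=0-499 selects exactly 500 bytes.

```python
import re

# Only the simple single-range forms "bytes=first-last" and "bytes=first-".
RANGE_RE = re.compile(r"^bytes=(\d+)-(\d*)$")


def apply_range(body: bytes, range_header: str) -> bytes:
    """Return the part of body selected by a single byte range (hypothetical helper)."""
    match = RANGE_RE.match(range_header.strip())
    if match is None:
        raise ValueError("unsupported Range header")
    first = int(match.group(1))
    # An open-ended range ("bytes=100-") runs to the end of the body.
    last = int(match.group(2)) if match.group(2) else len(body) - 1
    if first >= len(body) or last < first:
        raise ValueError("range not satisfiable")
    last = min(last, len(body) - 1)
    # Both ends are inclusive, hence the + 1 in the slice.
    return body[first:last + 1]


document = bytes(range(256)) * 4  # 1024 bytes of sample data
part = apply_range(document, "bytes=0-499")
print(len(part))
```

The sketch also shows the limitation described above: the caller has to know the byte offsets in advance, which a browser rendering an HTML page does not.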
If you really need to support legacy browsers down to IE3 and Netscape 2 and even old text-browsers like legacy Lynx versions, use the classic <frameset>, not an <iframe>. It is supported in basically everything since the olden days of Mosaic and was back then specifically designed for this task. (So it was the tool of choice back then when the browsers came out that you seek to support.)
The only way I can think of for achieving what you want without any JavaScript is using frames. There are, however, a number of disadvantages to frames, which you should be aware of before using them in your website.
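Modern versions of Firefox and Chrome do this natively: they cache images and other resources whenever the response headers allow it. In practice, a fresh download usually happens only when the cached copy has expired or fails revalidation, or when the user forces a reload or clears the cache at the browser level.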
You might also want to look into reverse-proxy caching, which essentially does what you are doing on a site-wide basis to avoid DB traffic. Varnish is a good option that will cache pages and is highly customizable.
I don't know if such optimizations are reasonable. Modern browsers accept compressed data (and modern servers provide it), and text compresses really well. You have to use output buffering (e.g. see ob_start in PHP) so that the page is not sent chunk by chunk in tiny pieces; instead the server waits for the output to be ready, compresses it, and sends it to the client, which decompresses it.
Using frames as a layout technique is highly discouraged nowadays (maybe iframes are sometimes a good solution, but it depends).

What is the best way to handle user generated html content that will be viewed by the public?

In my web application I allow user generated content to be posted for public consumption similar to Stackoverflow.
What is the best practice for handling this?
My current steps for handling user generated content are:
I use MarkItUp to give users an easy way to format their HTML.
After a user has submitted their changes, I run the content through an HTML Sanitizer (scroll to the bottom) that uses a white list approach.
If the sanitization process has removed any user-created content, I do not save the content. I then return their modified content with a warning message: "Some illegal content tags were detected and removed. Double-check your work and try again."
If the content passes through the sanitization process cleanly, I save the raw HTML content to the database.
When rendering to the client, I just pass the raw HTML out of the DB to the page.
That's an entirely reasonable approach. For typical applications it will be entirely sufficient.
The trickiest part of white-listing raw HTML is the style attribute and embed/object. There are legitimate reasons why someone might want to put CSS styles into an otherwise untrusted block of formatted text, or say, an embedded YouTube video. This issue comes up most commonly with feeds. You can't trust the arbitrary block of text contained within a feed entry, but you don't want to strip out, e.g., syntax highlighting CSS or flash video, because that would fundamentally change the content and potentially confuse anyone reading it. Because CSS can contain dangerous things like behaviors in IE, you may have to parse the CSS if you decide to allow the style attribute to stay in. And with embed/object you may need to white-list hostnames.
Addenda:
In worst case scenarios, HTML escaping everything in sight can lead to a very poor user experience. It's much better to use something like one of the HTML5 parsers to go through the DOM with your whitelist. This is much more flexible in terms of how you present the sanitized output to your users. You can even do things like:
<div class="sanitized">
<div class="notice">
This was sanitized for security reasons.
</div>
<div class="raw"><pre>
&lt;script&gt;alert("XSS!");&lt;/script&gt;
</pre></div>
</div>
Then hide the .raw stuff with CSS, and use jQuery to bind a click handler to the .sanitized div that toggles between .raw and .notice:
CSS:
.raw {
display: none;
}
jQuery:
$('.sanitized').click(function() {
$(this).find('.notice').toggle();
$(this).find('.raw').toggle();
});
The white list is a good move. Any black list solution is prone to letting through more than it should, because you just can't think of everything. I've seen some attempts at using black lists (for example The Code Project), and even if they manage to catch everything, they generally still cause additional problems, like replacing characters in code so that it can't be used without manually restoring it first.
The safest method would be:
HTML encode all the text.
Match a set of allowed tags and attributes and decode those.
Using a regular expression you can even require that each opening tag has a closing tag, so that an unclosed tag can't mess up the page.
You should be able to do this in something like ten lines of code, so the code that you linked to seems overly complicated.
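The encode-everything-then-restore-a-whitelist approach described above can be sketched as follows. This is an illustration in Python, not the code from the linked sanitizer: the ALLOWED_TAGS set and the sanitize function name are assumptions, and only bare tags without attributes are restored. It does not enforce that opening and closing tags are balanced, which would need an extra pass.

```python
import html
import re

# Assumed whitelist for this sketch; adjust to the tags you actually allow.
ALLOWED_TAGS = ("b", "i", "em", "strong", "p", "pre", "code")

# After escaping, an allowed tag looks like &lt;b&gt; or &lt;/b&gt;.
# Attributes are not matched, so any tag carrying attributes stays escaped.
TAG_RE = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE)


def sanitize(user_html: str) -> str:
    """Escape all markup, then restore only whitelisted bare tags (hypothetical helper)."""
    # Step 1: HTML-encode all of the text, including quotes.
    escaped = html.escape(user_html, quote=True)
    # Step 2: decode only the tags that appear in the whitelist.
    return TAG_RE.sub(lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped)


print(sanitize('<b>bold</b> and <script>alert("XSS!");</script>'))
```

Because everything starts out escaped, anything the pattern does not recognise, including script tags and event-handler attributes, stays inert text rather than markup.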

Fastest way to load CSS -- inline vs HEAD

I have a 50x50px div I want to display on my homepage as fast as possible.
Is it faster for me to do
<div style="height:50px;width:50px">
Or to assign it a class to avoid the inline style:
<div class="myDiv">
And put the myDiv class in a CSS file in the HEAD section of the HTML page?
My thought was that the first one should be faster, since it doesn't need to request and receive a CSS file. I guess ultimately I'm asking whether BODY and HEAD get rendered sequentially or in parallel.
Without HEAD loading first there can be no BODY.
Before your BODY gets rendered, it has to be loaded first. And if it is loaded, then the HEAD has already been loaded.
You're probably interested in whether a browser can load simultaneously both CSS files and the HTML document itself. It will depend on the browser implementation, but I believe most can download at least two documents simultaneously.
One other important thing is that the more files a document consists of, the more chances the request for one of them gets lost. So by using inline CSS you make sure the CSS never gets lost.
But I must point out that inline CSS is considered a bad style. Once you have a sufficient amount of markup, you will find it increasingly difficult to update your pages all at once. You will inevitably be losing one or the other instance. It is a much better idea to declare all styles in a separate document and reference them from pages. This way, when you need to change some color, you do it in one place and not in 37 places to be found in your pages.
As others already pointed out, the right thing to do would be to put the styles in an external file and refer to it in the <head> part of your document.
But if you're going for fast (and this is what you were asking for) then you should use the inline-declaration like
<div style="height:50px;width:50px">
There are several reasons for that:
You don't have to load an external file. This is very slow (compared to the next reason) since there is an additional HTTP request involved which (on top of the request and download itself) might be held back by other external files like JavaScript, favicons etc.
So it will already load faster if you put your declaration in some <style> tags on the same document. But then there is the next reason.
The browser does not have to look through the DOM tree and search for nodes with the class myDiv to apply the styles to. It finds the <div> and immediately (or at the next render turn) applies the style information.
This second delay will hardly be noticeable but if you are going for high performance, this is the way to go.
I agree that these may somewhat be theoretical reasons but here you go. :-)
There are cases when this would be a "good" practice. For example, you have a high-value landing page that requires about 500 bytes of CSS to support, versus the 200K style sheet.
While it is true that the customer will have to download that file on the NEXT page, time to render is often most important on the landing page.
Also, AFAIK, browsers will not begin rendering until the entire CSS file is downloaded, which is not the case for inline styles. But yes, follow best practices: 98% of the time you want to put CSS in a single linked file.
Use an external CSS file. After the first request the file will be cached by the browser and won't have to be downloaded again, making the page load faster and reducing the strain on your server.
Placing styles inline is not only ugly it also undermines the whole cascading thing.
The differences in performance will be imperceptible and should be irrelevant. Instead of worrying about premature optimisations like this, be more concerned with doing the "right thing", and in this case the right thing is to use external style sheet files for your CSS, as this is more maintainable and separates concerns.