Is it legit for elements to appear after </html>? - html

Google PageSpeed Insights recommends that we defer loading of secondary stylesheets by placing them after the closing </html> tag. The code example from that page:
<html>
<head>
<style>
.blue{color:blue;}
</style>
</head>
<body>
<div class="blue">
Hello, world!
</div>
</body>
</html>
<link rel="stylesheet" href="deferred_stylesheet.css">
This strikes me as odd, because I would expect Google to recommend standard practices instead of non-standard practices.
Is the code above legit? Do the HTML5 standards provide consistent rules for browsers to parse this sort of code?
What problems will we (or our users in fact) encounter if we use the tags as such?

It isn't correct to put anything outside the HTML tags; however, all browsers try to sort out the mess they get served.
Almost all browsers (e.g. Firefox, Google Chrome, Internet Explorer) will sort this out for you. Some self-written browsers may not.
However, it is better to put the link inside the head tag for performance (because then the browser doesn't need to sort out the dirty code).
This means users will encounter almost no problems.

First, you're missing your <!DOCTYPE html> declaration at the top, which is required for HTML5 documents.
Second, you're also missing a <title> tag in your head.
Finally, no, tags after the closing </html> tag are not allowed. If you run your doc through the W3C validator service you'll see:
Line 14, Column 54: Stray start tag link.
<link rel="stylesheet" href="deferred_stylesheet.css">
This error is unrecoverable.
I would expect browsers to either ignore it completely or load it, depending on the browser, the version and the platform. Since you can't be sure it'll work (and since it's non-conforming), you shouldn't do it.
UPDATE
This is invalid HTML; however, user agents are allowed to parse it and include it in the body. But they are not required to, since they can define their own rules for handling parse errors.

Is this correct or even allowed structure?
No. It is incorrect to place elements outside the <html> tag.
Oh dear! Will it land me in problems?
No. Browsers will fix your bad code as best they can. This is no problem for the browser.
Should I do it?
Well ... it's complicated. Short answer is "no, probably not". As developers, it is our responsibility to ensure that our code is correct (machines can't be trusted, that will be our downfall!). The rules are there for a reason. On the other hand, as hackers, we can sometimes break and bend rules, if it makes sense in the situation and we know what the consequences will be.
HTML hacking? What the hell are you talking about?
Browsers will always try to render the document as best they can. Interestingly, you don't even have to give them an <html> tag – the browser will figure it out for you.
Here's an example page – inspect the rendered DOM and view the source. I've given it almost nothing to work with, and it has kindly created all the missing tags. Neat! You do need a DOCTYPE declaration and, in most cases, a <title> tag.
But, I digress ...
So, the browser will definitely render it correctly. It is, however, incorrect as per the specs, and the document will not validate (if you care about that). For the most part, it's a good idea to follow the specs, unless you have a good reason.
Now, I can't see why Google wants you to place it outside the <html> tag. It sounds strange; however, I guess they know what they're doing. My guess is that they want to make absolutely sure that the resource is loaded last.
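If the goal is just to load the stylesheet last, one standards-friendly alternative is to inject the <link> from a script once the page has loaded. This is only a minimal sketch: appendStylesheet is a hypothetical helper name, not something from the PageSpeed docs, and the file name is the one from the question.

```javascript
// Hypothetical helper: create a stylesheet <link> and append it to <head>.
// `doc` is passed in so the function can also be exercised outside a browser.
function appendStylesheet(doc, href) {
  const link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.href = href;
  doc.head.appendChild(link);
  return link;
}

// In a browser, wait for the load event so the stylesheet does not
// compete with the resources needed for the first render.
if (typeof window !== 'undefined') {
  window.addEventListener('load', () => {
    appendStylesheet(document, 'deferred_stylesheet.css');
  });
}
```

The markup stays valid this way, at the cost of needing JavaScript for the secondary styles.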

Related

Why does HTML with poor and improper tag usage still work?

The tl;dr version is why does html that doesn't close tags properly still work error free?
I'm learning more and more HTML each day but I'm still quite a beginner. So I don't understand why badly written HTML without properly closed tags still works. I was using an email template for a task at work and was curious about the HTML behind it, so I loaded the code into an editor.
I came across 9 separate tags that don't close: <td> <center> <div> <p> <div> <td> <tr> <tbody> <table>. The code starts with an <html> tag as it should, but in the <body>, after a <table> and <tr>, it starts another set of <html> <head>... etc. tags. The two <style> tags both say the same thing and have an extra closing brace: li { margin-bottom: 10px; } }.
When I load just this code into my browser the page still visually appears how it is supposed to. In Firebug though, after the first body tag, it skips table, tr, html, head, and body, and goes straight to just showing the first <div>.
Why is a webpage (because I'm sure this must be a somewhat common thing out there) that has missing closing tags, extra <html> and <body> tags, etc. still able to function properly?
I think this is an application of the Robustness Principle.
Be conservative in what you do, be liberal in what you accept from others
I'd argue this is an inevitable outcome in a landscape of competing browsers. If an HTML error prevents a site from working in browser A, but browser B is able to guess a correction, users will tend to use browser B, as A appears to be broken. This has been going on since Netscape 3 or earlier.
Malformed HTML does not work correctly: browsers try to guess the intent of the HTML structure and display whatever the result of the guess is. This is the result of an unfortunate decision to allow poorly formed HTML to display, rather than rejecting it and forcing the author to fix such problems.
When you see malformed HTML looking correct on the screen, it's not the result of correct behavior: it's the result of a lucky guess on the part of the browser (obviously, a tiny problem is easier to fix by guessing than a massive structural problem that spans the entire HTML structure).
It comes down to treating HTML as content (which it is not), rather than a formal language (which it is): content authors were (and are) considered non-technical people, and forcing them to fix problems with "content" was seen as too hard a requirement.

what happens when we include wrong attributes in html tags?

I have noticed that the browser does not complain in any manner when I include nonexistent attribute names in HTML tags. For example:
<!DOCTYPE html>
<html>
<head test1="abc">
<title test2="cba">My Sample Project</title>
</head>
<body>
<h1>My Sample Project</h1>
</body>
</html>
What is actually happening here? Does the HTML parser in the browser ignore attributes when it doesn't know what to do with them? I also found that the same behaviour is seen when we include nonexistent tags as well. Does it mean that the browser interprets the HTML it can understand and does not complain about anything else in the file?
Any slight upgrade to browsers would cause a lot of issues if the previous browsers just gave errors for all the things they didn't recognize. That said, the page will fail W3C validation; and making up attributes generally isn't recommended (What if, two years from now, test1 is a valid HTML5 attribute that triggers a browser's unit-testing functions? Okay, a stretch.)
If you do want to make up attributes for random purposes, I would recommend you start them with data-.
You can technically have any sort of attribute you want, and it will be accessible throughout html, css, and javascript.
Example:
<h1 id="heading" my-attribute="foo" another="bar">My Heading</h1>
CSS
h1[my-attribute=foo]{
background:red;
}
Javascript
console.log(document.getElementById('heading').getAttribute('my-attribute')); // logs "foo"
This technically works too:
<mytag>Something awesome.</mytag>
However, there is a reason we have a standards model (W3C). Do we want developers relearning all the tags they are going to be using in a project? Or figure out how to access certain attributes of those tags? It can get out of control quite quickly.
How to do custom attributes
With HTML5, now it's considered standard practice to use data- to create custom defined attributes:
<h1 data-alt="my alternate data">This is totally standard compliant</h1>
These are 100% valid, standard compliant custom attributes, you can read about them in depth at MDN: https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/Using_data_attributes
This allows for us to tag custom data and be sure that new browsers won't come along and stomp all over us with handling our attributes in odd ways (as Katana314 said in their more concise answer).
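For what it's worth, data- attributes are also exposed to scripts through element.dataset, with the attribute name converted to camelCase. Here is a rough sketch of that naming rule; datasetKey is a made-up helper for illustration only, the browser does this conversion for you.

```javascript
// Hypothetical helper mirroring how browsers name dataset keys:
// strip the "data-" prefix, then turn each "-x" into "X".
function datasetKey(attributeName) {
  if (!attributeName.startsWith('data-')) {
    throw new Error('not a data-* attribute: ' + attributeName);
  }
  return attributeName
    .slice('data-'.length)
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// Browser-only usage with the <h1 data-alt="..."> example above.
if (typeof document !== 'undefined') {
  const heading = document.querySelector('h1[data-alt]');
  if (heading) {
    // Both lines read the same attribute value.
    console.log(heading.dataset[datasetKey('data-alt')]);
    console.log(heading.getAttribute('data-alt'));
  }
}
```

So data-alt is read as dataset.alt, and something like data-alt-text would be dataset.altText.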

What would happen to my SEO and Scripts if my <HEAD> tag was below my <BODY> tag?

Just thinking about it, XHTML1.1 spec and by extension, HTML5 (assumed)... well Markup is designed so that unless otherwise specified, "order" isn't supposed to matter.
Everything in the Body tag obviously is ordered a specific way for the browsers rendering engine to interpret, but the HEAD and BODY tags themselves conceptually have nothing to do with render order (despite their name, and except includes in HEAD; if an include depends on another include obviously that must be loaded in first), and thus follow the same rules as any Markup language.
Throwing the HEAD tag block below the BODY tag block works (at least in WebKit-based browsers anyway), but all I've been able to do so far is test that the Title tag works as it should. Not a totally conclusive test, but as I write this on my phone, I didn't have time to go any further with my thought process.
I'm wondering how doing this would affect SEO, and worse yet: loading of Script and CSS files typically handled in the HEAD. I understand a practice lately has external loading of script files happening at or near the bottom of the markup to ahem delay their loading for when the page is ready, would this react any differently?
Basically I'm asking: what are the repercussions of having a website where the HEAD block is located below the BODY block?
<html>
<body>
Test
</body>
<head>
<title>Test</title>
<script src="test.js" type="text/javascript"></script>
<link rel="stylesheet" type="text/css" href="test.css" />
</head>
</html>
It would only have a negative effect on SEO, if any at all.
First off, your proposal results in incorrect HTML. The HTML 4.01 DTD, which strictly defines the structure of HTML documents, mandates that <head> comes before <body>:
<!ENTITY % html.content "HEAD, BODY">
(If the order didn't matter, then it would be <!ENTITY % html.content "(HEAD|BODY)+">.)
Secondly, I'd wager most spiders look for a <head> element as quickly as possible, if it can't find one before the <body> element then it will probably discount your document at best, if not completely ignore it. I suspect most spiders would ignore any <head> elements encountered after <body>.
Third, it ruins the user experience. Sometimes pages can take a while to load, but a browser parses the HTML as it downloads. As soon as it sees <title> it displays it to the user, so the user knows the page has at least partially loaded (even if it hasn't been rendered yet). Without this, your users might close the browser tab/window out of frustration if the page loads too slowly, as they'd think the site was completely unresponsive.
Interesting question, but I strongly believe that HTML structure is very similar to human anatomy (head, body, foot). What happens, and how does it look, if it's not in the proper structure? It looks ugly and makes it difficult to identify the person. Browsers likewise act according to the universal structure (head, body, footer), so this is a predefined structure that we need to follow for the best results.
Regarding SEO, off-site SEO depends on how well we follow that structure, and of course it will affect the Google spider and many other things.

Is it ok to put html comments outside the <html> tags?

The W3C validator didn't ding me on this, but I was curious if anyone else had an opinion on placing HTML comments outside of the html tags?
...
</body>
</html>
<!-- byee -->
I have an application and am outputting some data and want it to be the absolute last thing that is done, which unfortunately means I've already attached my last </html>.
I can't see this being a problem - allowable comments are not specified in a DTD (as they're effectively for humans, not computers). Also, the DOM API (http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html) explicitly allows many comments directly under the document node (i.e. not the root HTML element, the logical document root), so any conforming browser should allow it.
This is not to say you won't find browsers or tools, especially older ones, that choke. But I'd be surprised if there were many.
I don't think a comment after the </html> will cause any problems, but I believe that a comment that precedes the DOCTYPE declaration (and therefore before the <html> tag) will kick IE6 into quirks mode.
Any client should completely ignore comments, so they should not cause any problems. Anyway if the validator didn't complain it's probably ok.
FYI, if you're using AngularJS and create a .directive where replace is true, a comment outside the root element in the HTML fragment will cause Angular to see two root elements and throw this error:
Template for directive 'yourDirective' must have exactly one root element.
I had an SEO company that was working on a client's site decide to add an HTML comment into one of my PHP includes that was outside the HTML tag and it caused issues in Internet Explorer. It caused a bunch of formatting issues with my drop down menus. It made no sense why it broke, but it was absolutely 100% caused by the comment. As soon as the comment was deleted all went back to normal.
Yes, by all means. Any rendering engine (IE, Firefox, Opera, Safari, etc.) will ignore any HTML comment completely, regardless of its position.

Order of tags in <head></head>

Does it matter at all what order the <link>, <script>, or <meta> tags are in within the <head></head>?
(Daft question, but one of those things I've never given any thought to until now.)
Optimization
According to the folks over at Yahoo! you should put CSS at the top and scripts at the bottom because scripts block parallel downloads. But that is mostly a matter of optimization and is not critical for the page actually working. Joeri Sebrechts pointed out that Cuzillion is a great way to test this and see the speed improvement for yourself.
Multiple Stylesheets
If you are linking multiple stylesheets, the order they are linked may affect how your pages are styled depending on the specificity of your selectors. In other words, if you have two stylesheets that define the same selector in two different ways, the latter will take precedence. For example:
Stylesheet 1:
h1 { color: #f00; }
Stylesheet 2:
h1 { color: #00f; }
In this example, h1 elements will have the color #00f because it was defined last with the same specificity.
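To make the "same specificity, last one wins" rule concrete, here is a deliberately simplified model of it. This is not how a browser is implemented: it ignores !important, origins and layers, and the function names and the [ids, classes, elements] specificity triples are my own for illustration.

```javascript
// Compare two specificity triples [ids, classes, elements].
function compareSpecificity(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Declarations are given in source order; on a tie the later one wins.
function winningDeclaration(declarations) {
  return declarations.reduce((winner, candidate) =>
    compareSpecificity(candidate.specificity, winner.specificity) >= 0
      ? candidate
      : winner
  );
}

const rules = [
  { sheet: 'Stylesheet 1', selector: 'h1', specificity: [0, 0, 1], value: '#f00' },
  { sheet: 'Stylesheet 2', selector: 'h1', specificity: [0, 0, 1], value: '#00f' },
];
console.log(winningDeclaration(rules).value); // logs "#00f"
```

If Stylesheet 1 had used a more specific selector (say #title), it would win regardless of link order.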
Multiple Scripts
If you are using multiple scripts, the order they are used may be important if one of the scripts depends on something in another script. In this case, if the scripts are linked in the wrong order, some of your scripts may throw errors or not work as expected. This, however, is highly dependent on what scripts you are using.
The accepted answer is kind of wrong, depending on the encoding of the document. If no encoding is sent in the HTTP header, the browser has to determine the encoding from the document itself.
If the document uses a <meta http-equiv="Content-Type" … declaration to declare its encoding, then any ASCII-valued byte (value < 128) occurring before this declaration must stand for the corresponding ASCII character, as per the HTML 4 spec. Therefore, it's important that this meta declaration occurs before any other element that may contain non-ASCII characters.
It's recommended to put the meta tag with the character encoding as high as possible. If the encoding is not included in (or differs from) the response header of the requested page, the browser will have to guess what the encoding is. Only when it finds this meta tag does it know what it is dealing with, and it may then have to re-read everything it has already parsed.
See for instance Methods for indicating the character set.
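As a rough way to check the "as high as possible" advice: the current HTML standard asks for the encoding declaration to sit within the first 1024 bytes of the document. The sketch below checks that with a simple regex. charsetDeclaredEarly is a hypothetical helper, the regex is not a real HTML parser, and the byte count assumes the file is saved as UTF-8.

```javascript
// Limit taken from the HTML standard's rule for encoding declarations.
const PRESCAN_LIMIT_BYTES = 1024;

// Returns true if a <meta ... charset=...> tag ends within the first
// `limit` bytes of the document (assumption: the text is UTF-8 encoded).
function charsetDeclaredEarly(html, limit = PRESCAN_LIMIT_BYTES) {
  const match = /<meta\b[^>]*\bcharset\s*=[^>]*>/i.exec(html);
  if (!match) return false;
  const endIndex = match.index + match[0].length;
  const bytesUpToEnd = new TextEncoder().encode(html.slice(0, endIndex)).length;
  return bytesUpToEnd <= limit;
}

const early = '<!DOCTYPE html><html><head><meta charset="utf-8"><title>x</title>';
const late = '<!DOCTYPE html><html><head><!--' + 'x'.repeat(2000) + '--><meta charset="utf-8">';
console.log(charsetDeclaredEarly(early)); // true
console.log(charsetDeclaredEarly(late));  // false
```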
One important thing to note: if you're using the Internet Explorer meta X-UA-Compatible tag to switch rendering modes for IE, it needs to be the first thing in the HEAD:
<head>
<meta http-equiv="X-UA-Compatible" content="IE=7" />
<title>Page title</title>
...etc
</head>
meta does not matter, but link (for CSS) and script do.
A script will block most browsers from rendering the page until the script is loaded.
Therefore, if possible, put scripts in the body rather than the head.
A CSS link will not block parsing of the page, although browsers generally wait for stylesheets in the head before rendering.
It is usually recommended to have the <script> tag as low down the page as possible (not in the head but in the body).
Other than that, I don't think it makes much of a difference because the body cannot be parsed unless you have the <head> section completely loaded. And, you want your <link> tag to be in the head as you want your styling to occur as the browser renders your page and not after that!
If you declare the charset in a meta element, you should do it before any other element.
Not a daft question at all.
CSS above Script tags for reasons already mentioned.
CSS is applied in the order you place the tags - the more specific the stylesheet, the lower down the order it should be.
Same goes for scripts - scripts that use functions declared in other files should be loaded after the dependency is loaded.
Put the meta tag that declares the charset as the first element in head. The browser only searches so far for the tag. If you have too much stuff before the meta element, the charset might not get applied.
If you use the BASE element, put it before any elements that load URIs (if desired).
It would only matter if one of the linked files (CSS/Javascript) depended on another. In that case, all dependencies must be loaded first.
Say, for example, you are loading a jQuery plugin, you'd then need to first load jQuery itself. Same when you have a CSS file with some rules extending other rules.
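If you have more than a couple of files, you can work out a safe order from the dependencies themselves. A minimal sketch (a plain depth-first topological sort); loadOrder and the file names are made up for illustration:

```javascript
// `dependencies` maps each file to the files it needs loaded first.
// Returns the files in an order where every dependency comes earlier.
function loadOrder(dependencies) {
  const ordered = [];
  const state = new Map(); // file -> 'visiting' | 'done'

  function visit(name) {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error('circular dependency at ' + name);
    }
    state.set(name, 'visiting');
    for (const dep of dependencies[name] || []) visit(dep);
    state.set(name, 'done');
    ordered.push(name);
  }

  Object.keys(dependencies).forEach((name) => visit(name));
  return ordered;
}

// jquery.js is listed before the plugin that depends on it.
console.log(loadOrder({ 'jquery.plugin.js': ['jquery.js'], 'jquery.js': [] }));
```

The resulting array is the order in which the <script> tags would go in the page.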
As already pointed out, the meta element describing the content charset should come first; otherwise it could actually be a security hole in certain situations. (Sorry, I don't remember the situation well enough to describe it here, but it was demonstrated to me at a web security training course.)
I recently was having a problem with a draggable jQuery UI element. It was behaving properly in Firefox, but not Safari. After a ton of trial and error, the fix was to move my CSS links above the JavaScript links in the head. Very odd, but it will now become my standard practice.
For the purposes of validation as XHTML, yes. Otherwise you're probably going to care about the optimization answers.
Nope, it doesn't matter, except for CSS linking or inclusion, because of the CSS cascade and the fact that later rules overwrite what was already styled (sorry for my English, I think my sentence is not really clear :-/).