What should a basic HTML5 file include?

I was wondering what an HTML file should ALWAYS include besides <html>, <head> and <body>. I've seen many things, so I'm not sure what to ALWAYS include.

The html, head and body tags are actually optional in HTML5. The only required element, aside from the doctype declaration, is title. So the following would be a completely valid HTML5 document:
<!DOCTYPE html>
<title>example</title>
You can validate that using the W3C Validator.
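For comparison, here is a sketch of the more conventional skeleton with every tag written out. The lang value "en", the utf-8 charset declaration and the paragraph text are illustrative assumptions, not requirements stated in this thread; only the doctype and the title are required.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>example</title>
</head>
<body>
<p>Hello!</p>
</body>
</html>
```

Both this and the two-line version above describe a document with html, head and body elements; the difference is only in which tags are written explicitly.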

More precisely, the start and end tags of the html and body elements are optional only under certain conditions:
The <html> start tag can be omitted unless the first thing inside the html element is a comment.
The </html> end tag can be omitted unless the html element is immediately followed by a comment.
See "html tag, optional vs. required".
The body start and end tags may likewise be omitted only under certain conditions.
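A small sketch of the two comment cases, assuming you really want the comments in exactly those positions (the comment and paragraph text are made up for illustration):

```html
<!DOCTYPE html>
<html><!-- first thing inside html, so the <html> start tag must be written -->
<title>example</title>
<p>Content.
</html><!-- immediately follows html, so the </html> end tag must be written -->
```

If the tags were left out here, the parser would attach each comment to a different node than intended, which is why the spec forbids omitting them in these two cases.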

Related

What are the drawbacks of ignoring <html> and <body>

Is there any drawback to never using <html> and <body> on your web pages that are written in HTML and PHP?
I mean, everything works perfectly fine with it or without it, so why use it?
They are explicitly optional in the spec (so the document will still be valid).
This has been true since the original spec (which says <!ELEMENT HTML O O (( HEAD | BODY | %oldstyle)*, PLAINTEXT?)>, O O meaning Start Tag Optional, End Tag Optional) through to the current spec (which says "An html element's start tag can be omitted if the first thing inside the html element is not a comment. An html element's end tag can be omitted if the html element is not immediately followed by a comment.").
They are only mandatory in XHTML since XML has no concept of optional tags.
I've never seen any browser or user-agent fail to handle them correctly in an HTML document. (Note that while the tags are optional, the elements are not, so browsers will insert HTML, HEAD and BODY elements even if the tags are missing, and any script which tries to find them in the DOM will still work.)
The only technical drawback is that you can't put attributes on tags which aren't there, and a lang attribute for the HTML element is useful.
Leaving them out can confuse other people who have to maintain your code who don't know that the tags are optional though.
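As a sketch of a middle ground, you can write only the <html> start tag so that it can carry the lang attribute, and still omit the rest. The value "en" is an assumption for illustration:

```html
<!DOCTYPE html>
<html lang="en">
<title>example</title>
<p>Only the html start tag is written, so it can carry the lang attribute.
```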
Both <head> and <body> tags are optional in HTML5. In fact it is recommended by Google's HTML style guide to not use them:
<!-- Not recommended -->
<!DOCTYPE html>
<html>
<head>
<title>Spending money, spending bytes</title>
</head>
<body>
<p>Sic.</p>
</body>
</html>
<!-- Recommended -->
<!DOCTYPE html>
<title>Saving money, saving bytes</title>
<p>Qed.
By not using those tags, some drawbacks include:
it is drastically different from what developers are typically taught, so it may cause some confusion.
a comment cannot be the first thing inside the html element, because in that case the <html> start tag may not be omitted.
Reiterating the optional nature of the tags from the spec:
An html element's start tag may be omitted if the first thing inside the html element is not a comment.
A body element's start tag may be omitted if the element is empty, or if the first thing inside the body element is not a space character or a comment, except if the first thing inside the body element is a meta, link, script, style, or template element.
See:
https://google.github.io/styleguide/htmlcssguide.xml?showone=Optional_Tags#Optional_Tags
https://html.spec.whatwg.org/multipage/syntax.html#syntax-tag-omission
Omitting <html> and <body> does not make the document invalid, as the spec excerpts above show, but some libraries or plugins may assume the tags are written out and may not work without them.

Can <HEAD> tag occur after <BODY> in valid HTML [duplicate]

I tried loading an HTML file in a browser with the <HEAD> tag after <BODY>, and it seems to load correctly.
<HTML>
<BODY>
This is body
</BODY>
<HEAD>
</HEAD>
</HTML>
Does this conform to the HTML specs? The HTML specs document does not seem to specify the position of the <HEAD> element.
In practice, I believe that HEAD will always precede BODY, but I don't know if HTML parsers also implement this positional relationship.
Browsers process markup as it is read - putting a <head> element below the <body> means that your content will be displayed unstyled until the CSS documents linked to in the <head> have loaded. There are likely to be other issues with this approach.
Also - if you're learning about HTML - skip HTML4.01 and dive on into HTML5 - everyone supports it, and it's pretty normative these days. Learning HTML4.01 is like learning Olde English in the 20th Century.
Anywho - the HTML4.01 specification does say that HEAD must go before BODY - just not in plain English. Here is a snippet of the HTML4.01 Strict Document Type Definition:
<!ENTITY % html.content "HEAD, BODY">
<!ELEMENT HTML O O (%html.content;) -- document root element -->
This is the validation rule that says <html> must contain a <head> and <body> in that specific order.
HTML5 doesn't use a DTD - but the standard states the positions of head and body more explicitly:
4.2.1 The head element
Categories:
None.
Contexts in which this element can be used:
As the first element in an html element.
...
4.3.1 The body element
Categories:
Sectioning root.
Contexts in which this element can be used:
As the second element in an html element.
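Applied to the markup from the question, a sketch of the conforming order would look like this. The doctype and the title text are additions that the original snippet did not have; they are included here only because a valid HTML5 document needs them:

```html
<!DOCTYPE html>
<html>
<head>
<title>Example</title>
</head>
<body>
This is body
</body>
</html>
```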
No, you cannot. The browser will interpret it, but that does not mean your code is correct.
You should run your code through the W3C validator at https://validator.w3.org/. It will tell you that the elements need to stay in the order html, head, body.

What form of meta tag is valid in HTML5 <meta> or <meta/>?

Validators accept both forms of the tag, <meta> and <meta/>. The HTML5 specification says that no end tag should be present, hence the form <meta></meta> is prohibited. But I could not find any information about the form <meta/>.
As per this HTML5 standard: http://www.w3.org/TR/html-markup/syntax.html#void-element
Start tags consist of the following parts, in exactly the following order:
1. A "<" character.
2. The element’s tag name.
3. Optionally, one or more attributes, each of which must be preceded by one or more space characters.
4. Optionally, one or more space characters.
5. Optionally, a "/" character, which may be present only if the element is a void element.
6. A ">" character.
The meta is a void element and hence the part #5 would apply with a caveat that "optionally, a / character, which may be present..."
<meta ... />
And so, you may omit the part #5 i.e. the closing "/", and hence this is also valid:
<meta ... >
Further down the spec says:
Void elements only have a start tag; end tags must not be specified
for void elements.
To summarize: an end tag must not be used, and the self-closing slash is optional. It will not hurt if the slash is present.
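The three forms side by side, as a sketch. These are fragments for comparison rather than a complete document, and the charset attribute is just an example attribute:

```html
<!-- Valid HTML syntax: void element with no slash -->
<meta charset="utf-8">

<!-- Also valid HTML syntax: the optional trailing slash -->
<meta charset="utf-8"/>

<!-- Not valid HTML syntax: void elements must not have an end tag -->
<meta charset="utf-8"></meta>
```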
It depends on whether you use HTML5 or XHTML5 syntax. In XHTML5 it is required, and the parser will freak out if you don't use / when closing the tag. Generally, all XML elements must be closed.
Try this snippet of code in Validator.nu
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Document</title>
<meta charset="UTF-8"/>
</head>
<body>
</body>
</html>
Try removing the / from the meta charset tag and observe the result. Don't forget to select the correct preset for XHTML5.
Looking at Mozilla's documentation for it, as well as from what I've observed in general, you just don't close the tag.
<meta charset="utf-8">
The above is valid HTML5. There's no reason to include any sort of close tag or anything resembling a closing tag on it.

What's a valid HTML5 document?

I've just been reading the HTML5 author spec.
It states that the <html>, <head> and <body> tags are optional.
Does that mean that you can leave them out completely and still have a valid HTML5 document?
If I'm interpreting this correctly, it means this should be completely valid:
<!DOCTYPE html>
<p>Hello!</p>
Is this correct?
You can check out the spec here:
http://dev.w3.org/html5/spec-author-view/syntax.html#syntax
"8.1.2.4 Optional tags" is the bit about it being OK to omit <html>, <head> and <body>
The title element is indeed required, but as Jukka Korpela notes, it also must be non-empty. Furthermore, the content model of the title element is:
Text that is not inter-element whitespace.
Therefore, having just a space character in the title element is not considered valid HTML. You can check this in W3C validator.
So, an example of a minimal and valid HTML5 document is the following:
<!doctype html><title>a</title>
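This was once accepted as the minimal HTML5-valid document, although a whitespace-only title does not satisfy the content model of the title element: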
<!doctype html><title> </title>
W3C HTML validator maintainer here. FYI with regard to the validator behavior, as of today, the validator now enforces the requirement in the HTML spec that the title element must contain at least one non-whitespace character -
http://validator.w3.org/nu/?doc=data%3Atext%2Fhtml%3Bcharset%3Dutf-8%2C%3C%2521doctype%2520html%3E%3Ctitle%3E%2520%2520%2520%3C%252Ftitle%3E
While the <html>, <head> and <body> start and end tags are optional, the <title> tags are required, except in special circumstances, so no, your sample is not (ordinarily) valid.
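A sketch of how the sample from the question could be made valid in the ordinary case, by adding a non-empty title. The title text "Hello" is an arbitrary placeholder:

```html
<!DOCTYPE html>
<title>Hello</title>
<p>Hello!</p>
```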

Is it necessary to write HEAD, BODY and HTML tags?

Is it necessary to write <html>, <head> and <body> tags?
For example, I can make such a page:
<!DOCTYPE html>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<title>Page Title</title>
<link rel="stylesheet" type="text/css" href="css/reset.css">
<script src="js/head_script.js"></script><!-- this script will be in head //-->
<div>Some html</div> <!-- here body starts //-->
<script src="js/body_script.js"></script>
And Firebug correctly separates head and body.
The W3C validator says it's valid.
But I rarely see this practice on the web.
Is there a reason to write these tags?
Omitting the html, head, and body tags is certainly allowed by the HTML specifications. The underlying reason is that browsers have always sought to be consistent with existing web pages, and the very early versions of HTML didn't define those elements. When HTML first did, it was done in a way that the tags would be inferred when missing.
I often find it convenient to omit the tags when prototyping and especially when writing test cases as it helps keep the markup focused on the test in question. The inference process should create the elements in exactly the manner that you see in Firebug, and browsers are pretty consistent in doing that.
But...
Internet Explorer has at least one known bug in this area. Even Internet Explorer 9 exhibits this. Suppose the markup is this:
<!DOCTYPE html>
<title>Test case</title>
<form action='#'>
<input name="var1">
</form>
You should (and do in other browsers) get a DOM that looks like this:
HTML
    HEAD
        TITLE
    BODY
        FORM action="#"
            INPUT name="var1"
But in Internet Explorer you get this:
HTML
    HEAD
        TITLE
        FORM action="#"
            BODY
                INPUT name="var1"
    BODY
See it for yourself.
This bug seems limited to the form start tag preceding any text content and any body start tag.
The Google Style Guide for HTML recommends omitting all optional tags.
That includes <html>, <head>, <body>, <p> and <li>.
From 3.1.7 Optional Tags:
For file size optimization and scannability purposes, consider
omitting optional tags. The HTML5 specification defines what tags can
be omitted.
(This approach may require a grace period to be established as a wider
guideline as it’s significantly different from what web developers are
typically taught. For consistency and simplicity reasons it’s best
served omitting all optional tags, not just a selection.)
<!-- Not recommended -->
<!DOCTYPE html>
<html>
<head>
<title>Spending money, spending bytes</title>
</head>
<body>
<p>Sic.</p>
</body>
</html>
<!-- Recommended -->
<!DOCTYPE html>
<title>Saving money, saving bytes</title>
<p>Qed.
Contrary to Liza Daly's note about HTML5, that specification is actually quite specific about which tags can be omitted, and when (the rules are a bit different from HTML 4.01, mostly to clarify where ambiguous items like comments and whitespace belong).
The relevant reference is 8.1.2.4 Optional tags, and it says:
An html element's start tag may be omitted if the first thing inside the html element is not a comment.
An html element's end tag may be omitted if the html element is not immediately followed by a comment.
A head element's start tag may be omitted if the element is empty, or if the first thing inside the head element is an element.
A head element's end tag may be omitted if the head element is not immediately followed by a space character or a comment.
A body element's start tag may be omitted if the element is empty, or if the first thing inside the body element is not a space character or a comment, except if the first thing inside the body element is a script or style element.
A body element's end tag may be omitted if the body element is not immediately followed by a comment.
So your example is valid HTML5, and would be parsed like this, with the html, head and body tags in their implied positions:
<!DOCTYPE html><HTML><HEAD>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<title>Page Title</title>
<link rel="stylesheet" type="text/css" href="css/reset.css">
<script src="js/head_script.js"></script></HEAD><BODY><!-- this script will be in head //-->
<div>Some HTML content</div> <!-- here body starts //-->
<script src="js/body_script.js"></script></BODY></HTML>
Note that the comment "this script will be in head" is actually parsed as part of the body, although the script itself is part of the head. According to the specification, if you want that to be different at all, then the </HEAD> and <BODY> tags may not be omitted. (Although the corresponding <HEAD> and </BODY> tags still can be.)
It's true that the HTML specifications permit certain tags to be omitted in certain cases, but generally doing so is unwise.
It has two effects - it makes the specification more complex, which in turn makes it harder for browser authors to write correct implementations (as demonstrated by Internet Explorer getting it wrong).
This makes the likelihood of browser errors in these parts of the specification high. As a website author, you can avoid the issue by including these tags - so while the specification doesn't say you have to, doing so reduces the chance of things going wrong, which is good engineering practice.
What's more, the latest HTML 5.1 WG specification currently says (bear in mind it’s a work in progress and may yet change):
A body element's start tag may be omitted if the element is empty, or
if the first thing inside the body element is not a space character or
a comment, except if the first thing inside the body element is a
meta, link, script, style, or template element.
From 4.3.1 The body element.
This is a little subtle. You can omit body and head, and the browser will then infer where those elements should be inserted. This carries the risk of not being explicit, which could cause confusion.
So this
<html>
<h1>hello</h1>
<script ... >
...
results in the script element being a child of the body element, but this
<html>
<script ... >
<h1>hello</h1>
would result in the script tag being a child of the head element.
You could be explicit by doing this:
<html>
<body>
<script ... >
<h1>hello</h1>
and then whichever you have first, the script or the h1, they will both, predictably appear in the body element. These are things which are easy to overlook while refactoring and debugging code (say for example, you have JavaScript which is looking for the 1st script element in the body - in the second snippet it would stop working).
As a general rule, being explicit about things is always better than leaving things open to interpretation. In this regard, XHTML is better, because it forces you to be completely explicit about your element structure in your code, which makes it simpler, and therefore less prone to misinterpretation.
So yes, you can omit them and be technically valid, but it is generally unwise to do so.
It's valid to omit them in HTML 4:
7.3 The HTML element
start tag: optional, End tag: optional
7.4.1 The HEAD element
start tag: optional, End tag: optional
From 7 The global structure of an HTML document.
In HTML5, there are no "required" or "optional" elements exactly, as HTML5 syntax is more loosely defined. For example, title:
The title element is a required child in most situations, but when a higher-level protocol provides title information, e.g. in the Subject line of an e-mail when HTML is used as an e-mail authoring format, the title element can be omitted.
From 4.2.2 The title element.
It's not valid to omit them in true XHTML5, though that is almost never used (versus XHTML-acting-like-HTML5).
However, from a practical standpoint you often want browsers to run in "standards mode," for predictability in rendering HTML and CSS. Providing a DOCTYPE and a more structured HTML tree tends to give more predictable cross-browser results.
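Firebug shows this correctly because your browser's parser inserts the missing elements for you. In HTML5 this behaviour is defined by the specification, although older browsers have not always followed it consistently (see the Internet Explorer bug described above). The tags are not required by the <!DOCTYPE html> doctype you're using, but writing them out avoids relying on that inference.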
The HTML element is the root element of every HTML page. If you look at the descriptions of all other elements, each says where the element can be used (and almost all elements require either head or body).