How Semantic Can We Get With HTML 5? - html

This is a community wiki that asks the question, "Just how semantic can our HTML markup get thanks to HTML 5?" Below you can find the source code of a sample HTML 5 page. The goal is to make a very usable, accessible, styleable webpage using as few classes and IDs as possible.
Also, when do you plan to start implementing HTML 5? Are you going to wait 10+ years until the draft is finalized, or are you going to be an "early adopter" now that browser support is rapidly growing?
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<title>Site Name • Page Title</title>
</head>
<body>
<nav>
<h1>Site Name</h1>
<ul>
<li>Nav Link</li>
<li>Nav Link</li>
<li>Nav Link</li>
</ul>
</nav>
<header>
<p>Welcome to the site!</p>
Call to action!
</header>
<section>
<aside>
<!-- Sidebar -->
</aside>
<article>
<header>
<h2>Article Name</h2>
<p>Posted by <cite>Kerrick Long</cite> on <time datetime="2009-06-21">June 21</time>.</p>
</header>
<p>Lorem ipsum dolor sit amet...Aliquam erat volutpat.</p>
<figure>
<img src="/images/eclipse.jpg" width="640" height="480" alt="Solar Eclipse" />
<label>Here we can see the solar eclipse that happened <time datetime="2009-05-28">recently</time>.</label>
</figure>
<p>Lorem ipsum dolor...</p>
</article>
</section>
<footer>
<p>© <time datetime="2009-01-01">2009</time>, <cite>Site Owner</cite></p>
</footer>
</body>
</html>

It won't be 10+ years. That time period is for "final completion", meaning all browsers support all parts of the spec. It's due to become a candidate late this year or early next, and hopefully approved by 2011 or 2012.
I'm phasing it in where I can, right now. How much I use depends on the audience, but since IE's share has been falling steadily, what it doesn't support is no longer a killer, especially as John Resig's "HTML5 shiv" lets the semantic tags work even in IE6 with JS turned on.
More importantly, I'm starting to shift my thinking into HTML5 lines, using classes today for what will become HTML5 tags tomorrow (div class="nav"). That way I'll be more used to thinking in HTML5 terms when the opportunity arises.
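As a minimal sketch of that habit, the fragment below uses class names that mirror the HTML5 element each div could later become. Apart from class="nav", which the answer mentions, the class names and placeholder content are assumptions chosen for illustration.

```html
<!-- HTML 4-compatible markup whose class names mirror HTML5 elements -->
<div class="nav"> <!-- could later become <nav> -->
<ul>
<li><a href="/">Nav Link</a></li>
</ul>
</div>
<div class="article"> <!-- could later become <article> -->
<div class="header"> <!-- could later become <header> -->
<h2>Article Name</h2>
</div>
<p>Lorem ipsum dolor sit amet...</p>
<div class="footer"> <!-- could later become <footer> -->
<p>Posted in example category.</p>
</div>
</div>
```

Moving to HTML5 then mostly means swapping each div for the element named by its class and changing CSS selectors such as .nav to nav.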

Although I take great happiness in seeing new capabilities, the truth still remains that my clients use IE6 (and similar browsers). As much as I would like to see everybody using a modern browser, the fact that they aren't means I have to work with technologies that don't require them to upgrade.

I'm going to use it as soon as browsers support it. The sites I make are mainly hobby projects mostly visited by Firefox users. (80% of my traffic uses the latest version of FF).

Keep in mind that the cite element is not appropriate for a person's name: HTML5 states "A person's name is not the title of a work — even if people call that person a piece of work — and the element must therefore not be used to mark up people's names." Also, the trailing slash in <meta charset="UTF-8" /> isn't necessary.
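Applied to the sample page from the question, the two points above might look like the sketch below: the author's name is plain text rather than a cite element, and the meta tag has no trailing slash. Only the lines relevant to these two points are shown.

```html
<!DOCTYPE html>
<html>
<head>
<!-- No trailing slash needed in the HTML syntax -->
<meta charset="UTF-8">
<title>Site Name • Page Title</title>
</head>
<body>
<article>
<header>
<h2>Article Name</h2>
<!-- The author's name is not the title of a work, so no <cite> -->
<p>Posted by Kerrick Long on <time datetime="2009-06-21">June 21</time>.</p>
</header>
<p>Lorem ipsum dolor sit amet...</p>
</article>
</body>
</html>
```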

The main driver for people adopting HTML 5 would be better search engine placement. Without that, I'm not seeing a huge reason to adopt.
(Maybe if people could somehow convince me that the web might be more data-like and therefore interoperability would improve, then I might be somewhat convinced, but that sounds a bit overly optimistic)

I'll echo jonothan sampson. As long as a reasonable number of people are still using older browsers, it's hard to make that jump.
On the other hand, it's probably sensible to detect browsers and send a version that makes good sense. Since the differences between the two languages will be moderate, it will probably be feasible to transform an HTML5 page to HTML4 with additional classes and styles depending on the user agent, perhaps with a little server-side XSLT. That said, I doubt I'd be the one to invent that technology, although I'd use it if or when it becomes available.

People keep mentioning a JavaScript solution for older browsers such as IE6, but what if they do not support JavaScript?
Sorry, this is not an answer but more of a question, as this is the point I just don't get about semantic HTML5 and IE support.
You could always go belt and braces for older browsers:
<nav><div id="nav"> some nav stuff</div></nav>
But that feels dirty to me?

Related

Should I restart heading tag numbering when nesting?

Let's consider a html book (to step away from the usual blog post example).
The code might look something like this:
<!DOCTYPE html>
<html>
<head>
<title>The title</title>
</head>
<body>
<h1>The title</h1>
<article class="the-book">
<h2>Chapter I</h2>
<section>
<!-- the contents -->
</section>
<h2>Chapter II</h2>
<section>
<!-- the contents -->
</section>
</article>
</body>
</html>
What would be the correct heading numbering within chapters? Is it correct to restart the numbering in the new scope and use <h1>, <h2> again? Or do we use the next unused level (<h3>)?
I am asking this in context of semantics and correct HTML5 code not of presentation.
The best way is to keep your nesting valid regardless of the expected scoping of <article> and <section>. In other words, for choosing your <h#> level, pretend <article> and <section> do not exist.
You may be thinking of the HTML5 Document Outline Algorithm, but the Document Outline Algorithm was never a recommendation in a final W3C spec. There was a warning explicitly against authors relying on it, though the outline language was retained for browsers to understand how to implement support (eventually).
It has been removed from the HTML5 specification (June 9), the HTML validator has been updated to recognize that no browser ever implemented it (June 16), and there is no action on any of the open bugs with the browsers to do anything about it (Chromium, Firefox, WebKit, IE / Edge).
You can get the latest take on heading structure in the HTML 5.2 draft spec, recent as of January 14, 2018 (it updates regularly).
If you want more context or history, I have a blog post that covers it with links to the spec sources. Just go to the bullet list at the bottom: http://adrianroselli.com/2016/08/there-is-no-document-outline-algorithm.html
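As a sketch of the advice above (choose heading levels as if article and section were not there), the book example could be numbered like this. The sub-section titles are placeholders added for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>The title</title>
</head>
<body>
<h1>The title</h1>
<article class="the-book">
<h2>Chapter I</h2>
<section>
<!-- One level below the chapter heading, regardless of nesting -->
<h3>First part of Chapter I</h3>
<p>...</p>
</section>
<h2>Chapter II</h2>
<section>
<h3>First part of Chapter II</h3>
<p>...</p>
</section>
</article>
</body>
</html>
```

Read without the article and section tags, the headings still form a valid h1, h2, h3 sequence, which is what browsers and assistive technology actually expose.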

Syntactic sugar in html/xhtml

I'm currently writing html/xhtml by hand, and that's fine to me, but I would like to ease things a little bit, especially for writing footnotes.
Today, here is how I write footnotes:
<p>Here is a footnote<a id="ref1b" href="#ref1">[1]</a>.</p>
<!-- And at the end of the document -->
<div class="footnotes">
<h2>Notes</h2>
<p id="ref1">[1] But this one isn't very helpful.
<!-- Let's add a go-back-to-the-text arrow -->
↩
</p>
</div>
The idea would be to make things automatic, and potentially done on the client side (by the browser), so that I could write something like that:
<p>Here is a footnote<ref id="1"/>.</p>
<!-- And at the end of the document -->
<div class="footnotes">
<h2>Notes</h2>
<ref-def id="1">But this one isn't very helpful.</ref-def>
</div>
So ref and ref-def would simply be evaluated on the fly by the browser.
Is this possible only using html/xhtml and css?
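For completeness: the HTML 3.0 draft proposed a footnote element, <fn>.
https://www.w3.org/MarkUp/html3/footnotes.html
That draft was never adopted, and current browsers do not give <fn> any special treatment, so it is mainly of historical interest. How it is presented to clients was left to implementors. You can use more HTML or CSS for better formatting.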
<DL>
<DT>Hamlet: <DD>You should not have believed me, for virtue cannot so inoculate our old stock but we shall relish of it. I loved you not.
<DT>Ophelia: <DD> I was the more deceived.
<DT>Hamlet: <DD>Get thee to a nunnery. Why wouldst thou be a breeder of sinners? I am myself indifferent honest ...
</DL>
<fn id=fn1><i>inoculate</i> - graft</fn>
<fn id=fn2><i>relish of it</i> - smack of it (our old sinful nature)</fn>
<fn id=fn3><i>indifferent honest</i> - moderately virtuous</fn>
The way you're doing this now has the advantage of being accessible and standards compliant: it will work with any browser, even with JavaScript disabled. Search engines will also be able to make sense of it.
So there are some benefits in doing it this way.
If you decide to go for a shorter alternative, there are plenty of jQuery plugins that will make your task more comfortable, e.g. look at https://github.com/nicholascloud/footnote.js
If you go for that approach, please also note that your site speed may suffer, as users will have to download extra JavaScript to get your footnotes working.
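A partial answer to the "only HTML and CSS" part of the question is CSS counters, a technique not mentioned in the answers above and shown here only as a sketch. It automates the numbering but not the linking: the id and href pairs still have to be written by hand. The class name note-ref and the counter names are assumptions, and generated content is not read consistently by assistive technology, so the hand-written numbers remain the more robust option.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Footnote numbering with CSS counters</title>
<style>
/* One counter for references in the text, one for the notes themselves */
body { counter-reset: note-ref note-def; }
a.note-ref { counter-increment: note-ref; }
a.note-ref::after { content: "[" counter(note-ref) "]"; }
.footnotes p { counter-increment: note-def; }
.footnotes p::before { content: "[" counter(note-def) "] "; }
</style>
</head>
<body>
<p>Here is a footnote<a class="note-ref" id="ref1b" href="#ref1"></a>.</p>
<!-- And at the end of the document -->
<div class="footnotes">
<h2>Notes</h2>
<p id="ref1">But this one isn't very helpful. <a href="#ref1b">↩</a></p>
</div>
</body>
</html>
```

The two counters stay in step only if the notes appear in the same order as their references.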

When can I safely use the new <main> element in HTML5?

On the 16th of December, an HTML5 extension specification for the <main> element was submitted to the W3C as an Editor's Draft. The abstract is as follows:
This specification is an extension to the HTML5 specification [HTML5].
It defines an element to be used for the identification of the main
content area of a document. All normative content in the HTML5
specification, unless specifically overridden by this specification,
is intended to be the basis for this specification.
The main element formalises the common practice of identification of
the main content section of a document using the id values such as
'content' and 'main'. It also defines an HTML element that embodies
the semantics and function of the WAI-ARIA [ARIA] landmark role=main.
Example:
<!-- other content -->
<main>
<h1>Apples</h1>
<p>The apple is the pomaceous fruit of the apple tree.</p>
<article>
<h2>Red Delicious</h2>
<p>These bright red apples are the most common found in many
supermarkets.</p>
<p>... </p>
<p>... </p>
</article>
<article>
<h2>Granny Smith</h2>
<p>These juicy, green apples make a great filling for
apple pies.</p>
<p>... </p>
<p>... </p>
</article>
</main>
<!-- other content -->
It's got all the info in there and I feel I should start incorporating it into web pages. As far as I know, the HTML5 spec is progressive, with new features being "bolted" on to the spec rather than released as a new version. I guess that means browsers will start implementing it when they can. The question is, how long does this take and how do I know all browsers support it? Should I just build it like so for now and resort to a polyfill?
Support for <main> will be much like support for any other new container element introduced in HTML 5.
New enough browsers will support it.
Older browsers will let you style it with display: block, which gives you the visual effect of it.
Older versions of IE won't support it at all without a JavaScript shim (which will work in exactly the same way as the ones for all the other new container elements).
The "when" depends on what level of browser support you need and how willing you are to depend on a JS shim.
For now, I would be careful about using it.
For the future of the proposal, what really matters is implementation in browsers. In particular, because <main> is a proposed block level element, it will require a change to the HTML5 parser implementation as well as giving it the default ARIA role of main.
Without the default ARIA role, there is no point to the element, although using it now in preparation for that is a reasonable approach.
The parser change does require a modicum of care though. Remember that the </p> tag is optional. Now suppose you decide that before your "main" content you want a paragraph of preamble. You could write:
<!DOCTYPE html>
<body>
<p> This is my page preamble ...
<main>
My main content ...
<div>
A story ...
</div>
</main>
</body>
If and when browsers implement the <main> element, the <main> tag will automatically close the <p> element and in the DOM, the <p> element and the <main> element will be siblings of one another. The <div> element and its content will be a child of the <main> element. i.e. The DOM will be:
HTML
 +--HEAD
 +--BODY
     +--P
     |   +--This is my page preamble ...
     +--MAIN
         +--My main content ...
         +--DIV
             +--A story
However, right now in browsers, the <main> becomes a child element of the <p> element, and while "My main content ..." is a child of the <main> element, the <div> element is not. i.e. the DOM has this structure:
HTML
 +--HEAD
 +--BODY
     +--P
     |   +--This is my page preamble ...
     |   +--MAIN
     |       +--My main content ...
     +--DIV
         +--A story
Now, of course, this is easily avoided by explicitly using a </p> tag, on the preamble paragraph, but it is a trap for the unwary.
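For illustration, here is the same page with the preamble paragraph closed explicitly. With the </p> in place, the p and main elements end up as siblings whether or not the parser knows about main.

```html
<!DOCTYPE html>
<body>
<!-- Closing the paragraph explicitly avoids relying on the parser to do it -->
<p> This is my page preamble ...</p>
<main>
My main content ...
<div>
A story ...
</div>
</main>
</body>
```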
The HTML 5.1 main element is now implemented in WebKit. Validation support is to follow shortly. Expect a Firefox implementation soonish.
You can go ahead and use it, Chrome 26 and Firefox 21 already implemented it.
Just as with the introduction of many other new HTML5 elements, not all browsers recognise <main> or have preset styles for it. You’ll need to ensure it displays as a block level element in your CSS:
main {display:block;}
For the time being, you'll also need to use JavaScript to create the element for older versions of IE:
<script>document.createElement('main');</script>
Of course, if you use the html5shiv, <main> is now baked in directly.
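Putting the pieces from these answers together, a page that uses main defensively might look like the sketch below. The conditional comment that limits the script to IE 8 and earlier, and the explicit role="main" as a fallback for browsers that do not yet map the element to the ARIA landmark, are assumptions added here rather than something the answers prescribe.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Main element fallback sketch</title>
<style>
/* Browsers with no built-in style for main would otherwise render it inline */
main { display: block; }
</style>
<!--[if lt IE 9]>
<script>document.createElement('main');</script>
<![endif]-->
</head>
<body>
<!-- other content -->
<main role="main">
<h1>Apples</h1>
<p>The apple is the pomaceous fruit of the apple tree.</p>
</main>
<!-- other content -->
</body>
</html>
```

The script sits in the head so that the element is registered before old IE parses the first main tag.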

Is <header> a semantic or structural tag

Take these two pieces of markup:
<div id="header">
<img src="/img/logo.png" alt="MyLogo" />
<ul id="misc-nav">
<li>..</li>
</ul>
<header>
<h1>Welcome to Bob's Website of Fantastical Conbobulations</h1>
<p>The smell of invention awaits you...</p>
</header>
</div>
and
<header>
<img src="/img/logo.png" alt="MyLogo" />
<ul id="misc-nav">
<li>..</li>
</ul>
<h1>Welcome to Bob's Website of Fantastical Conbobulations</h1>
<p>The smell of invention awaits you...</p>
</header>
My example may not be perfect, but I'd like to know if the purpose of the tag is for semantic definition of the content, or is it block level structural definition with CSS?
It is my understanding from the spec itself, that the first example is the correct interpretation, yet I see the second being touted as the right way.
Can you offer any clarity on it?
Both are fine. But what exactly do you mean by "structural" vs "semantic"?
It's your first method (semantically).
The <header> tag defines an introduction to the document.
<header>
<h1>Welcome to my homepage</h1>
<p>My name is Donald Duck</p>
</header>
<p>The rest of my home page...</p>
http://www.w3schools.com/html5/tag_header.asp
Spec: http://www.whatwg.org/specs/web-apps/current-work/multipage/sections.html#the-header-element
The header tag is purely semantic.
In fact, all HTML tags exist to provide context for the content (that is, semantics).
Use CSS to style your content appropriately.
I would advocate the following combination of markup and CSS:
In your CSS:
header {
background: #fff url(/img/logo.png) top left no-repeat;
padding-left: 64px; /* or whatever required to display margin correctly */
}
/* if you REALLY want your navigation to appear as a bulleted list */
nav a {
display: list-item;
}
In your page markup:
<nav>
<a>...</a>
<a>...</a>
</nav>
<header>
<h1>Welcome to Bob's Website of Fantastical Conbobulations</h1>
<p>The smell of invention awaits you...</p>
</header>
This way you're using the semantic <header /> and <nav /> tags to mark up text content, and then using CSS to enhance the presentation with display formatting, logo images, etc.
I recall - although alas I can't find the sources now - that the proposed new elements in HTML5 (header, nav, footer, aside, article, etc.) were chosen based on analysis of Google's database of websites to identify the most commonly-used ID attributes assigned to DIV elements, figuring that those represented the most common scenarios where developers were using DIVs to wrap meaningful elements of their page structure.
HTML5 actually does away with block/inline distinction in favour of a more nuanced content model. The header element is flow content, which is like the default state for HTML5 elements. Semantically it should be considered as introductory information for its nearest section content or sectioning root ancestor.
I think both your examples are valid uses of the element, though I personally would probably mark up your first one this way:
<header>
<img src="/img/logo.png" alt="MyLogo" />
<nav>
<ul>
<li>..</li>
</ul>
</nav>
<hgroup>
<h1>Welcome to Bob's Website of Fantastical Conbobulations</h1>
<h2>The smell of invention awaits you...</h2>
</hgroup>
</header>
I used to think that the first method was the proper way to use the element, since it is intended to provide relevant information for the section it is included in rather than being a section itself, and we already have elements for structuring content. But from what I've seen on some pages, the reason many people also include a header element at root level is to provide that same information for the whole page, treated as one big section. So I've changed my mind: both examples can be considered correct.
If you read the W3C HTML5 specification you will find that every HTML page should have only one h1 tag, so if you use h1, then h2, then h3, you might see some weird styling. That is because the browser expects one h1 on every HTML page when it parses it.
So you can instead use h1 h2 h3 tags and style them any way you want.
The point of using semantic HTML elements is that your website will be "read" not only by web browsers but also by web crawlers, screen readers, braille devices, and many other applications and physical tools.
When those applications read your website they don't read the CSS, only the HTML and perhaps some JavaScript. So when they see lang="en" they know to read the contents of the element in English. When they see "section" they know it's a section element, when they see "aside" they know it's an aside element, and so on.
We can easily see the page and know what is what, but visually impaired people and others can't do that, so for them this is a great help. Think about that when you make your websites: think about all the people who will access them and how easy it will be for them to do so.
That is the whole point of the new awesome html5 elements. You can make the same webpage just with one element - "div" for example, and with a whole range of new html5 semantic elements - article, section, header, footer, aside etc. The webpage will look the same in web browsers, but smart applications like search engine robots will crawl the page better and some applications that read web pages will parse the page more easily.
The point of web is to be open to all people and free, and I agree to that.
The web will no doubt keep evolving and new tools will be made to parse web pages. Using the new HTML5 semantic elements helps future-proof your pages, so that these new tools can read them in a smart way.
Hope this helped someone :)

Do HTML5 elements mean anything to search engines?

Let me just say first that I'm not looking to start a flame war :-)
I'm aware of the semantic meaning that tags such as <article> give a document, but what benefits does one get from using them?
Do search engines look at them differently? If not, what other benefits are there?
A Q+A on Google's Webmaster Central seems to suggest that the new HTML5 semantic elements have no impact currently, but will at some point in the future:
http://www.google.com/support/forum/p/Webmasters/thread?tid=2d4592cbb613e42c&hl=en
There is the obvious benefit that one day search engines might use them, or microdata, to associate more meaning with your site. I'm not totally sure whether that is the case yet, but some of the other answers so far provide good links to answer that question.
Another benefit is cleaner markup. It helps to clean up the div soup that solved so many of our organization issues with HTML 4.01. Take this example:
<div class="post">
<h1>Example Blog Post</h1>
<div class="entry">
<p>Blog text goes here...</p>
</div>
<div class="entryFooter">
<p> Posted in example category.</p>
</div>
</div>
is not as readable or as clean as:
<article>
<header>
<h1>Example Blog Post</h1>
</header>
<p>Blog text goes here...</p>
<footer>
<p>Posted in example category.</p>
</footer>
</article>
Cleaner markup will make maintaining the markup a lot easier and that is always a benefit in my opinion.
Search engines will certainly look at Microdata differently.
Here's a tool that parses microdata:
http://www.google.com/webmasters/tools/richsnippets
To learn more about HTML5 Microdata:
http://diveintohtml5.ep.io/extensibility.html
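For a sense of what microdata looks like in practice, here is the blog post example from the earlier answer with microdata attributes added. The schema.org BlogPosting vocabulary and the specific properties are assumptions chosen for illustration, and whether a given search engine acts on them is up to that engine.

```html
<!-- itemscope starts a new item; itemtype names the vocabulary it uses -->
<article itemscope itemtype="https://schema.org/BlogPosting">
<header>
<h1 itemprop="headline">Example Blog Post</h1>
</header>
<p itemprop="articleBody">Blog text goes here...</p>
<footer>
<p>Posted in <span itemprop="articleSection">example</span> category.</p>
</footer>
</article>
```

The semantic elements and the microdata attributes are independent layers: the markup stays valid and readable if the attributes are removed.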