Proper way to use h1? (Regarding document outline and SEO)

I'm still trying to familiarize myself with HTML5, and one thing still confuses me.
I read in Jeremy Keith's book and on HTML5 Doctor (via this question) that HTML5 makes it possible to use multiple h1s. In HTML5, each section can have its own heading element, so it is okay to have more than one h1. I've seen a WordPress theme framework, "underscores", which seems to apply this to the fullest.
However, this may pose a problem for older browsers (which do not yet support HTML5) when they determine the site structure/document outline. It may also pose a problem for SEO.
I stumbled upon a video by Matt Cutts (from Google) and re-read Keith's book, both of which recommend limiting the use of h1 and using the conventional document outline (only one or two h1s per page, followed by multiple h2, h3, etc.). Matt Cutts also implies that multiple h1s are not too good for SEO.
However,
I never paid serious attention to site structure/document outline before, so I don't know how old (pre-HTML5) browsers read a site structure/document outline. There is an HTML5 outliner, but I can't find an outliner for HTML4.
Matt Cutts's video (regarding HTML5 and SEO) was published in 2009. I don't know if Google already supports the new HTML5 way of outlining a document.
So my question is, if I want to:
Support older browsers (e.g. Firefox 3.0 and IE 6) so that they display the correct site structure/document outline
Get good SEO results
which one should I use: multiple h1s (the way it is done in HTML5) or the conventional way?
This HTML5 one (example taken from HTML5 Doctor):
<h1>My fantastic site</h1>
<section>
  <h1>About me</h1>
  <p>I am a man who lives a fascinating life. Oh the stories I could tell you...</p>
  <section>
    <h1>What I do for a living</h1>
    <p>I sell enterprise-managed ant farms.</p>
  </section>
</section>
<section>
  <h1>Contact</h1>
  <p>Shout my name and I will come to you.</p>
</section>
or the conventional way?
<h1>My fantastic site</h1>
<h2>About me</h2>
<p>I am a man who lives a fascinating life. Oh the stories I could tell you...</p>
<h3>What I do for a living</h3>
<p>I sell enterprise-managed ant farms.</p>
<h2>Contact</h2>
<p>Shout my name and I will come to you.</p>

Use the new format.
Plenty of people will use h3s or h2s, and that's perfectly fine as well.
In fact, they'll use the section, article, header or footer elements offered by HTML5, and then use h3 or h4 as headings for that document segment (for fear of SEO penalties or legacy styling/layout quirks).
And that's fine, too.
If you watch Cutts's video again, he says to keep h1 use to a minimum, using multiples only when they're really warranted.
That hasn't really changed at this point.
Google isn't going to murder you for having multiples.
Google IS going to expect each one to mean that there was a fundamental change in content.
That's true whether or not you have the sectioning elements (section, article, etc.) in there.
Google has also gotten to the point where it properly spiders AJAX-only or JavaScript-dependent websites, and it has its own rich-content metadata system, so it is sophisticated enough to parse section or article.
Worry more about the quality of the content and, if you're ready to take it on, the Google-specific metadata used for search results,
and let Google worry about navigating the semantics (as long as you're using them well, and not doing anything shady).
Lesser crawlers, who knows. That's on a per-crawler basis, and most people only need to be concerned with Google, Bing and Yahoo; other crawlers either feed off Google or are very domain-specific (for example, if you want to rank highly on an opt-in car-rental crawler for some reason, at which point you should be supplying an XML/JSON feed of some sort anyway).

deathlock, your second example doesn't contain any sectioning elements. However, you could use sectioning elements with headings other than h1. I think that's the point of your question:
h1 in every sectioning element
<section>
  <h1>…</h1>
  <section>
    <h1>…</h1>
  </section>
</section>
or "calculated" heading level
<section>
  <h2>…</h2>
  <section>
    <h3>…</h3>
  </section>
</section>
Semantically/technically, they are the same.
SEO shouldn't be a problem, because "h1 everywhere" will be (and already is) used all over the web, and the major search engines know this. If they want to support HTML5, they have to understand the outlining algorithm. I bet that their crawler/APIs already correctly calculate the real heading level, like the HTML5 outliner does, for example.
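To make "calculating the real heading level" concrete, here is a minimal Python sketch of the idea, not the full outlining algorithm. It assumes the "h1 everywhere" style, treats only section, article, aside and nav as sectioning elements, and ignores implicit sections created by h2-h6. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser

# Elements assumed to open a nested section in this simplified model.
SECTIONING = {"section", "article", "aside", "nav"}


class OutlineSketch(HTMLParser):
    """Collects [effective_level, text] for each h1 found in the markup."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of sectioning elements currently open
        self.in_h1 = False
        self.outline = []   # one [level, text] entry per h1

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag == "h1":
            self.in_h1 = True
            # An h1 nested in N sectioning elements acts like level N + 1.
            self.outline.append([self.depth + 1, ""])

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth -= 1
        elif tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.outline[-1][1] += data


def outline_levels(html):
    """Return a list of (effective_level, heading_text) tuples."""
    parser = OutlineSketch()
    parser.feed(html)
    parser.close()
    return [(level, text.strip()) for level, text in parser.outline]


if __name__ == "__main__":
    sample = (
        "<h1>My fantastic site</h1>"
        "<section><h1>About me</h1>"
        "<section><h1>What I do for a living</h1></section></section>"
        "<section><h1>Contact</h1></section>"
    )
    for level, text in outline_levels(sample):
        print("  " * (level - 1) + f"h{level}: {text}")
```

Run on the HTML5 Doctor example from the question, the nested h1s come out at levels 1, 2, 3 and 2, which is the same structure as the conventional h1/h2/h3/h2 version.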
The only reason you'd want to use h2-h6 as sectioning element headings would be old accessibility software, e.g. screen readers. They usually offer an outline menu so the user can jump directly to a certain heading. So if you always use h1, older screen readers that don't know HTML5 would announce all headings as h1, because they don't calculate the correct outline levels. However, JAWS 13, for example (the current version of a screen reader), only gets "h1 everywhere" for HTML5 correct in IE, AFAIR, and it gets confused if you use other heading levels in an HTML5 page. This is, of course, a bug, but it's a nice example showing that sticking to the "old way" will not always work for newer software.
So you might get problems either way.
In my opinion you should stick with what the HTML5 spec recommends, which is: use h1 for all sectioning element headings. This specification is what future user agents, accessibility tools, search engines and other services/software will use to build their products.
However, it depends on your use case, of course. If you know your visitor statistics, you should use them to make the right decision for your special case. E.g. if your site will not live for many years in the future, use what is now best supported.

The best way is to use HTML5 and include the script below to make the new elements work in old browsers, since Google reads your website much better, and considers that you use new technology (so your site is better), if you use the new tags.
<!--[if IE]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
Put it in the head section of your site and it'll work fine for old IE versions.

Related

Use header tag in html

Why would one use <header> tags or <footer> or <address> tags? Is it just for SEO, or another reason?
I ask this question because IE8 and older don't support many of these elements.

These are all tags introduced with HTML5. They are part of an evolution of HTML. They're not supported in IE8 because IE8 was released in 2009, before the HTML5 specification was finalized. They were introduced to provide dedicated elements for structures that were already common in web page designs.
If you need to support IE8, you can do so by not using these tags and sticking with <div> tags with classes, such as:
<div class="header"></div>
<div class="footer"></div>
<div class="address"></div>
accompanied, of course, by CSS styles for each.
They have nothing to do with SEO.

The tags you describe, such as a header tag, are what is known as 'syntactic sugar'. That is, it makes it easier to read and know what the intent of the tag is. This is good for human readers, but it is especially useful for automated systems.
Compare these examples that all could mean exactly the same thing:
<header>...</header>
<div class="header">...</div>
<div class="hdr">...</div>
Note that header is easy to differentiate from the two div tags. Only if you know what the class attribute means will you understand what the div tags are defining. Because the class attribute value is free-form, it means there is no standard definition.
A search engine is one example of an automated system that might need to read the tag directly and understand that it has a semantic meaning. A particularly observant search engine might understand that the above three tags all refer to the same semantic definition, but you will agree that writing an engine that would presciently know all three mean 'header' would be difficult. In fact, there is nothing to disambiguate the latter two from <div class="zebra">.
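As a rough illustration of that difference, here is a small Python sketch of what an automated tool has to do in each case. The guess list of class names is hypothetical, which is exactly the problem: for div-based markup every tool has to invent its own, while the header element needs no guessing.

```python
from html.parser import HTMLParser


class HeaderFinder(HTMLParser):
    """Counts <header> elements and, separately, divs whose class looks header-like."""

    # Hypothetical guess list; nothing standardizes these class names.
    GUESSED_CLASSES = {"header", "hdr"}

    def __init__(self):
        super().__init__()
        self.semantic = 0   # elements that are headers by definition
        self.guessed = 0    # divs that only match our guess list

    def handle_starttag(self, tag, attrs):
        if tag == "header":
            self.semantic += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.GUESSED_CLASSES.intersection(classes):
                self.guessed += 1


def count_headers(html):
    """Return (semantic_headers, guessed_headers) for the given markup."""
    finder = HeaderFinder()
    finder.feed(html)
    finder.close()
    return finder.semantic, finder.guessed
```

A div with class "zebra", or with any header-like class name missing from the guess list, is silently skipped, whereas every header element is found regardless of how the author names things.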
Having automated systems be able to read your code and understand the semantic meaning goes well beyond SEO: it can make automated handling of mobile versions easier, for instance. You no longer have to hand-roll code for your specific implementation. You can use JavaScript libraries that might do handy things with your headers, such as allow them to be sortable or auto-link them. Anyone writing those libraries has an easier time doing it.
You also ask why you would use something not supported by older browsers that are still widely used. That is a question of demographic: what is your app targeting? If you need the 6% of the world that is using an old browser to utilize your application, then you should absolutely use backwards compatible techniques. If, on the other hand, you want to make the UX the best possible for a set of users likely to be using a modern browser, then you should use the new tags. (Note that having bad UX or long development time as you roll your own solutions to things may cost you more than 6% of your application's userbase...)

Are we not supposed to be using the <main> element anymore?

Like most of my SO questions, this one stems from my inability to find up-to-date Google results.
It's been almost 3 years since <main> was accepted into the HTML5.1 spec. It seems to make perfect semantic sense to use:
<header></header>
<main></main>
<footer></footer>
But I see a lot of semantics-powered sites (like CanIUse and CSS-Tricks) that simply ignore the element, instead using something like:
<header></header>
<div class="main-wrapper">
<!--no ARIA role, nothing to semantically indicate "main" content-->
</div>
<footer></footer>
I feel like I've missed some conversation about how everyone needs to stop using <main> and Google's not helping me find that conversation. Was the element deemed unnecessary (i.e. clients don't really ever parse for it)?
Now it seems IE never ended up supporting it (sans polyfill), but is that why folks aren't using it? The same sites I've seen use div.main-wrapper do lots of things that still require polyfills for IE. Why not still use the semantic benefits of <main>, which only requires a one-line JS shiv and a display:block?

(i.e. clients don't really ever parse for it)?
All major browsers except IE have implemented the parsing/styling and semantics mapping (role=main) for the main element; Edge has implemented it. Three years is not a long time in terms of uptake for a new element (although its usage is already much higher than some other new elements added years before it). Its use is steadily growing over time (you can grep the data from http://webdevdata.org if you are so inclined).
All major screen readers support main element semantics as part of landmark navigation.
63% of screenreader users sometimes/ often/ always use landmarks/ regions (so add them, or I’ll spank you). - Bruce Lawson

Answer: You should not avoid using the main element just because some other prominent sites/developers you run across are not using it.
There has been no conclusion among anybody anywhere to stop using main or to suggest to others that they should not be using it.
It’s not a requirement that you use it. But if you use it in a way that doesn’t cause the W3C validator to emit an error or warning, and in a way you judge conveys the meaning/structure of your document as you the author intend, then go for it. That’s what it’s there for.

My guess is it's a chicken/egg thing. There's not much point in using it in sites if clients aren't doing anything special with it. And there's not much point in some clients doing anything with it if adoption is low. And I would guess the problem isn't causing enough pain for the majority of users & developers.

Changing from HTML4 to HTML5 because of accessibility?

I'm working for the public sector and I had to take over a web project from my superior. The page is already made accessible to people with disabilities.
While making some changes I thought about changing from HTML4 to HTML5, because I heard it has some new, better features for accessibility.
Does the change really pay off? Or is it just wasted time to convert the code?

HTML5 brings with it a large array of semantic elements that give the user agent further insight into how the page is laid out. It is definitely worth it and shouldn't be too time consuming especially if your styles are decent and don't rely on tag types. You will basically be replacing a bunch of <div>s with their semantic counterparts.
For example, here are some of the new tags in HTML5:
<section>
<nav>
<article>
<aside>
<hgroup>
<header>
<footer>
<time>
<mark>
The other technologies usually lumped in with HTML5, like CSS3 and local storage, don't really have accessibility benefits.
I suggest reading up more about what all these tags actually mean to make sure you're using them correctly. There were also other changes like clear meanings for the <b>, <em> and <strong> tags.
Support
Some browsers like IE6 (not sure about IE7) don't like these new tags and will mangle the page when they are used. You can include a polyfill library like Modernizr to fix this up: simply include the script and everything works!
Further reading
Dive into HTML - Semantics
HTML5 Doctor

While HTML5 is a great standard to aim for, many browsers still don't support the newer tags/markup. Enter ARIA, or WAI-ARIA (the Accessible Rich Internet Applications Suite).
Clarissa Peterson's website gives a nice real-world example of html5 v ARIA. See the section titled HTML5 & ARIA.

Is the role attribute supported by search engines?

HTML5 introduced 22 new markup tags. W3C still recommends we stick to the old tags, because IE exists. I think adding JavaScript for this purpose is over the top. HTML5 also features the lesser-known role attribute, comparable with the ARIA role of XHTML 2. The great advantage of the markup tags is that search engines like Google know which is which. Do search engines also support roles?

Hard to tell, but I would say Google understands them in the same way as it understands HTML5 tags. And even support for HTML5 will take some time, since new sites could otherwise gain too much of an advantage over old HTML4 sites.
For example a <div role="main">...</div> would be like the main <article>...</article> of your page.
Please keep in mind that it is important to make your document at least ARIA valid. For example, using more than one role="main" on a page is invalid and could be considered blackhat SEO.
It is worth using role where appropriate. Remember that it improves the accessibility of your site.
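If you want to check the "only one role="main"" rule automatically, a minimal Python sketch could look like the following. It is an illustration, not a validator: the names are made up, it counts a main element without an explicit role as a main landmark too, and it ignores details such as hidden elements.

```python
from html.parser import HTMLParser


class MainLandmarkCounter(HTMLParser):
    """Counts elements that act as the main landmark of a page."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        role = (dict(attrs).get("role") or "").strip().lower()
        # role="main" on any element, or a <main> element with no explicit role.
        if role == "main" or (tag == "main" and not role):
            self.count += 1


def count_main_landmarks(html):
    """Return how many main landmarks the markup declares."""
    counter = MainLandmarkCounter()
    counter.feed(html)
    counter.close()
    return counter.count
```

A page for which count_main_landmarks returns more than 1 is the case this answer warns about.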

Where could some html tags be necessary?

I was going through the new HTML5 features and tried a lot of the new tags to see what kind of effect they have in the browser, but honestly I didn't see much difference.
So lets talk about <time> tag as an example :
If you write down <time>10:00</time>, it obviously shows 10:00 on the page, but I was expecting something more advanced, like formatting. For example, if I write <time>10</time>, it could format it to 10:00 instead of just showing 10 on the page.
Another example is <time datetime="2008-02-14">Valentines day</time>: it also just shows Valentines day on the page and nothing more. No tooltip, no fancy animation, nothing like that.
There are other tags like this, and if I need one I will most probably use a span or something else, add some JS code to make it more appealing, and that's all.
So I am not just talking about the <time> tag here, but about any other tags like it.
In the end, my question is why and where we need to use them.
My best guess is that they make source code more readable for programs and crawlers, or that they can be used for the semantic web. But even these answers didn't satisfy me.

They are definitely used for adding semantic meaning to the markup. For instance:
<div class="post">
  <h1>Example Blog Post</h1>
  <div class="entry">
    <p>Blog text goes here...</p>
  </div>
  <div class="entryFooter">
    <p> Posted in example category.</p>
  </div>
</div>
is not as readable as using the HTML5 tags of:
<article>
  <header>
    <h1>Example Blog Post</h1>
  </header>
  <p>Blog text goes here...</p>
  <footer>
    <p>Posted in example category.</p>
  </footer>
</article>
It also helps to make the markup more readable to developers and I think it makes styling a lot easier. The <time> tag falls in this boat as well as it provides more semantic meaning in your markup than a <span> tag would. This way programs in the future or browsers might be able to use that semantic data to change the time per the user's local time zone as an example.

Semantic markup gives power back to users
It is possible that you can write something like:
<time datetime="2010-10-20">2010-10-20</time>
and the browser will render it as:
3 days ago
or in some other languages of the user's preference.
Semantic markup is all about this kind of thing. We are tagging text for its 'meaning', instead of for visual styling. The styling is just an extra. Once you get the meaning right, the user can style the markup to their preference, not to the author's preference. This wasn't possible with style markup: a bold tag may be used for adding emphasis, or it may be used for headings, while in other places it may be used for the first paragraph of an article. A time tag is always used for time, an emphasis tag is always used for adding emphasis, and nothing else.
Many browsers support user stylesheets, which allow a user to override the stylesheet used by the website, but they are very rarely used since they are currently close to useless, because they apply styling indiscriminately. (There is also a lack of tool support, and not every user fancies writing CSS code; but we can argue that the lack of tool support exists because few people would use these tools, given the limited usefulness they have with formatting markup.)
Then there are the non-visual browsers; some people with visual disabilities use aural browsers. These browsers don't work well with formatting markup.
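To illustrate the "3 days ago" idea from this answer, here is a minimal Python sketch of what a user agent or script could do with the machine-readable datetime attribute. It is an assumption-laden illustration rather than anything a browser actually ships: the function names are made up, it only handles date-only values in YYYY-MM-DD form, and the output is English only.

```python
from datetime import date
from html.parser import HTMLParser


class TimeCollector(HTMLParser):
    """Collects the datetime attribute of every <time> element."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "time":
            value = dict(attrs).get("datetime")
            if value:
                self.values.append(value)


def relative_day(value, today):
    """Render a YYYY-MM-DD date relative to `today`."""
    delta = (today - date.fromisoformat(value)).days
    if delta == 0:
        return "today"
    unit = "day" if abs(delta) == 1 else "days"
    return f"{delta} {unit} ago" if delta > 0 else f"in {-delta} {unit}"


def render_times(html, today):
    """Return a relative rendering for each <time datetime="..."> in the markup."""
    collector = TimeCollector()
    collector.feed(html)
    collector.close()
    return [relative_day(value, today) for value in collector.values]
```

The point is that the tool never has to parse the human-readable text: whatever the author writes between the tags, the datetime attribute carries the meaning, so the presentation can follow the user's preference.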

The semantic tags are, as you guess, used for adding semantics to web pages. Browsers generally won't render them differently by default, but you certainly can style them how you want with CSS.

I don't think HTML tags should be used to provide any style information; that's what CSS is for. However, you should use the new tags when appropriate. If you're making an HTML5 website, why use the <span> tag for a time when you could use the <time> tag? A lot of these new tags are no different than their predecessors in their functionality, but they do allow you to add more meaning to your markup.
