What is the use of HTML5 semantic markup for intranet applications? - html

As far as I understand, the only real advantage of HTML5 semantic markup is for search engines and web crawlers to interpret the document better.
Since intranet applications have nothing to do with search engines or web crawlers, what are the advantages of using semantic markup in HTML5?

There is no straightforward example to point out, but the website (even intranet) can be consumed by different user agents (on different devices).
You are probably familiar with Skype (and the iOS Safari) making phone-number-like words clickable. In the future I can easily imagine mobile browsers being smarter to assist the user in completing tasks on the page, like importing a clearly indicated contact to the address book.

Screenreaders for blind people?

While there is not a whole lot of immediate benefit for non-disabled people, it is still good practice. Does your company not have any externally facing sites? If it does, do those people not look at internal page code? Good practices spread just like bad ones.
see also: http://en.wikipedia.org/wiki/Semantic_HTML

Simply said there the only rule you have to follow if you create your html documents is that it is valid html, otherwise you will have the problem that the browser would try to correct your broken syntax which may result in defects of the visual representation of your content.
In modern browsers you can use display to given any element - with some limitations e.g. with input element - the same visual look and behavior then any other element.
So if you ask what are the advantages of using semantic markup in HTML5 you should ask, why to use any of the semantic markup if it is possible to have the same result using css.
The short answer is, no one will stop you if it is your own project where you are responsible to - except the client that probably gives you requirements.
It is similar to asking: Language xyz provides comments and there is a syntax for doc-comments, but why should I use them?.
Using the semantics wisely increases the readability and thus maintainability. You are not required to use every possibility of semantics at all costs.
Using them will help you to get into the code again if you haven't looked at it for a longer time, e.g. to distinguish between the elements that encapsulate logical parts and elements that are used for styling. Especially if you use a template engine to create your code or to search for certain elements in multiple files.
Even if you are now the only one who works on the code it may happen that if the project grows that you need other people to work on the code. Or for the situation, you for some reason are not available, someone else needs to maintain your code, a good markup is essential.
Using the correct markup and additions like WAI-ARIA is not only essential for handicapped people, but also allows the browser to recognize the meaning of elements, allowing to e.g. improve the keyboard navigation. Especially in a productive environment where you need to type much, it is often faster to navigate with keyboard then using a trackpad or a mouse.

Related

What are the concrete risks of using custom HTML elements and attributes in HTML5?

My question is similar to what this poster is asking:
What are the concrete risks of using custom HTML attributes?
but I want to know what can happen if I use custom elements and custom attributes with the current html specs (html 5).
Example
<x a="5"> abc </x>
Visually I see no issues in any browser. js works:
x = document.getElementsByTagName('x');
alert('x has attribute a=' + x[0].getAttribute('a'));
css works too:
x{
color: red;
}
x[a]{
text-decoration:underline;
}
Possible Risks include
Backward compatibility. In particular, IE8 and below won't understand your tag, and you'll have to remember to write document.createElement('x') for all your new elements.
Semantics - having your html machine-readable may not be your goal, but there may come a time when it needs to be parsed in a moderately useful fashion.
Portability & maintenance - there are plenty of current html tags that almost certainly do what you want them to do. At some point, someone else may have to look after your code. Is there anything to be gained from having them spend time learning what all your new tags are for?
SEO - don't take the risk of a penalty just because it's something you can do..
For completeness, there are justified reasons to do it though. If you can demonstrate your new tag improves the semantics of your page (your example of 'x' obviously doesn't) and you can think of some use-case where your page will be machine-parsed by your own process, then go for it.
The only issue I can think of is that other applications, including search engines, won't recognize your custom elements and properties, so they won't know what to look for or how to use them which is a decided disadvantage for SEO. Other applications trying to access your content, including RESTful apps, will not know either without you telling the app developer.
This was always listed as one of the disadvantages of XML/XHTML but here we are again, back full circle to where we should have been in the first place, the use of XML on the web ... but I digress.
The main reason custom elements were frowned upon in the past is because browsers don't know what to do with them and there was no standardised way of telling them what they are.
What are the risks of using custom HTML elements in HTML5 without following standardisation?
Browsers will handle them differently:
Some browsers may ignore the elements and pretend they're not there; <x>, I don't know what <x> is, lets get rid of that.
Some browsers may attempt to convert the element into something else; define a <tab> element and a browser may think you've mis-spelled <table>, for instance.
You'd have to handle what the element is supposed to do across a large range of devices; just because it works on your PC doesn't mean it works on your phone, or your TV, or your e-reader... or your WiFi-powered fridge...
The good news is that there is some new documentation being written up to allow developers to define their own custom elements in a standardised way. Custom Elements, as it's titled, gives both developers and browser vendors the know-how to allow developers to implement and script custom elements in a way which will work across all supporting browsers... or that's the idea, anyway.

Disadvantages of using consistent-behaving yet deprecated HTML tags?

When users visit my website, they don't care about how perfect or how much standard the page is coded. They only care about whether it works or not.
There are tags that are deprecated but have consistent behavior throughout all major, minor, and very minor browsers. They work now and will work in the future. (I'm not talking about optional tags like <marquee> and <blink> which will probably be removed in the future since their non-existence doesn't break pages.) The tags I'm talking about are for example:
<center> (used by google.com homepage, yes and it's May 2014)
<body bgcolor=, alink=, vlink=, link= (all used by google.com)
<font size= (also used by google.com)
If my HTML generator produces tags like <body bgcolor=black>, it is guaranteed to work for near 100% of users.
If it instead produce CSS like background:black;, it will be supported by lesser users compared to <body bgcolor=black>. (Start with https://superuser.com/q/732669/78897 and https://superuser.com/q/447269/78897, though I'm sure they are not the only ones in the whole world.)
Bear with me, this is a real question based on a true problem. Exactly what are the real disadvantages of having these tags as output?
Potential disadvantages include the following:
1) Your customer might actually care about how standard the code is. Maybe not now, but in the future. Maybe for questionable reasons, but still.
2) Deprecated constructs do not always work consistently. For example, align=center attribute set on a table may have different effects depending on browser mode. This is a relatively weak argument, though, since the browser practices have been described rather well in HTML5 CR and you can manage the potential problems. (Besides, even CSS settings may work inconsistently.)
3) There is no guarantee that deprecated features will be supported by all future browsers. On the other hand, the same applies to standard features. In practice, very few features that have been defined in HTML specifications have actually been removed from browsers. (Regarding tags, I think basefont is the only case.) All the examples mentioned, and also marquee, have been described in HTML5 CR as “obsolete” but still well-defined, and according to HTML5 CR, browsers are expected, and partly required, to support them all.
4) Your colleagues (designers/developers/...) may regard your code (and you) as old-fashioned, non-semantic, and whatever.
5) Code maintenance and development may be more difficult. If you have 1,000 pages with <body bgcolor=black> and the customer says they want a somewhat different background color, you would need to edit each page. This argument is, however, weaker than it seems to be. First, how often do such things actually happen? Second, if the pages have actually been generated using suitable tools, perhaps you just need to change the value of one parameter and regenerate them (or just let servers do that, if the pages are dynamically generated). Third, if you have a link element on all pages, referring to basic style sheet for the pages, as you normally should, you just need to add one rule to that style sheet. It is easy to override presentational HTML attributes with CSS.
To summarize, the practical arguments against your approach are rather weak. The most important arguments relate to coding style and principles.
I've added some more disadvantages:
Another disadvantage of using those tags is site bandwidth. When you put in html center, bgcolor and similar tags every time browser needs to load the whole content even if on every page those tags are the same or even if user visited this site many times. But when you place design in css file browsers may cache those files (especially when you set headers properly) so they only load html and images (if no cache is set).
One another thing is that if you decide to redesign the site/style new elements, it's much easier to put changes only in CSS files. It's possible in future you won't be doing those changes on your own or other companies/freelancers will be doing them and it will be much easier for them to make changes in the site. So the site will be cheaper to maintain.
In addition if html / php code is poor (or site is very complex) and many "visual conditions" appear in many files (for example on one page you decide to use one colour and you put it in HTML, on the other another colour) and something goes wrong it will be much easier to find the problem because you may simple cut some css and check where's the problem.
The disadvantage is when one of the major browsers chooses to get rid of the deprecated tag in a future release.
The advantage of using CSS over tags is that you can change the whole web site look and feel in a simple move.
Consider people that require larger font sizes. Colour blindness and also enable the most use of screen readers.
Even those consistent behaviour tags may be removed from browser. What if you would like to create HTML5 website? Then you will need to learn everything from scratch and change literally everything for your website to make it work because you never know if those tags will be supported in HTML 5 in future or only in older HTML documents
CSS provides easier maintenance, for one; client decides they want some elements aligned left instead of center? Change your css rule and poof, you're done. But if you're using old-school valign and such? Get ready to go change every single instance of that in the file(s).

Why HTML with disabled CSS matters?

When creating a website why should you care for HTML with no style?
Is there any device which will render HTML only (no CSS or JavaScript)?
Do you usually care how your website will display without CSS?
Why is it important?
There are several cases in which websites may be used without styling. As mentioned in the comments, screen readers (such as those used by visually-impaired people) read only content, not styling.
Perhaps more importantly, many search engine spiders (think: Google) read your site without styling. When you view your site without CSS, you will gain a better understanding of how search engines view your content.
And if you are lucky, or your content is particularly geeky, you may get the occasional guru who browses your site via Lynx.
There seem to be a few misconceptions here. First of all and most importantly, screenreaders do take into account CSS and JavaScript. Why? Simply because unlike in the past they are not running on their own, but rather work as addons for existing browsers or include the render engines inline in their own systems.
Does that mean you don't need to concern yourself with screenreaders at all? Sadly that's not the case either. For example, if you add display:table to an element just because you want to vertically align something some screenreaders will actually treat it like a real table (which makes no practical sense). The good part is though that pages are read top-to-bottom, header and menu first (if found) and that adding display:none through javascript to an element will hide the element from the screen reader as well. Now, the following is going to sound really harsh, but except if you're making a real high profile website I wouldn't advice you to concern yourself with this too much. On one hand screenreaders are becoming better and better (try for example the one that's included on your android device if you have a recent version of android) and on the other hand blind people are used to websites being a 'bit' messy. Now, that doesn't mean you should start using flash or otherwise crazy stuff, but it does mean that if you just write a proper website, make your menu a list, make your divisions divs and not tables etc. you should in general be fine. And if you are making a high profile website then you should check out WAI-ARIA.
Now, getting to the search engine part, that's not true either for the big search engines at least. Google does take styling into account. Not all the styling that's unimportant for Google, but it actually will realize which stuff is hidden and analyzes the javascript whether hidden content will be shown (as part of it's anti-SEO work), it will search for links in your javascript and probably lots more I am not aware of. Bing does this to a large extend as well, though for example duckduckgo does not do this too much/at all. Either way, once again, the notion that Google sees your site like lynx does was true in the far past, but by now is invalid.
And if you check your serverlogs you will see that nobody accessed your site through lynx. That's just the reality of life nowadays. In the past (again) people would occasionally use lynx if they only had access to a console, but nowadays it's far easier to pull your phone from your pocket which runs a full web browser.
First part of the answer : 'text based browsers'
Text-based browser list
Alynx
ELinks (active version of Links)
Emacs/W3
Line Mode Browser
Links
Lynx
Net-Tamer
w3m
WebbIE
Second part : 'search engines'
List of semantic search engines
List of search engines
Third part : 'web accessibility' where software helps people with disabilities get access to the web.
It's important to note that for the third part, accessibility, it is
sometimes a legal requirement. For example, in the UK it is illegal to
have a website that is not accessible to blind people. There are
similar requirements for US government services. – slebetman
It's also an applicable law in canada
See this list of tools from w3 for a Complete List of Web Accessibility Evaluation Tools
CSS isn’t an on/off thing. Although CSS may be completely disabled, it is much more common that some of your CSS settings get ignored or overridden. Here is an incomplete list of cases (see my CSS Caveats for some additional details):
Speech-based browsers generally ignore most of CSS, largely because most of CSS is directed towards visual rendering.
So do “text-only browsers” (more accurately, character cell browsers, which render in plain text only using a monospace font but may be able to use colors and bolding).
Search engines generally don’t care about CSS at all.
CSS support varies. The more advanced CSS features you use, the more probable it is that many browsers don’t implement them.
CSS support may be disabled by the user, completely or partially.
User style sheets may interfere with your CSS code or even override them.
Browser settings e.g. on minimum font size may make some of you CSS settings ineffective.
Browsers have bugs. The more complicated CSS techniques you use, the more probable it is that you trigger a bug in some browsers.
An external CSS file (the recommended way of using CSS) may get lost, e.g. a browser may need to wait (perhaps in vain) from a server, or an archiving system may archive an HTML file but fail to archive the CSS file.
Styling may get lost in transfer, e.g. when copying content from a web page to MS Word or Excel or Notepad or email.

We hear so much about "semantic html". Where/what are the algorithms reading our semantic html?

I keep making attempts at properly using HTML5 but I feel like it's still not even close to anything semantically valuable.
My attempts:
HTML5 Article node Architecture
HTML5 Blog Page Architecture
But there's such subtleties in every single tag!
My question is, what specific software out there on the web is actually doing things like processing our HTML DOM, calculating and comparing elements to say "oh, this is a <header>, and it's just after <section>, and it has <time> in it, so the <time> tag must be "metadata" in relation to the <header>...", and saying "The content within the <time> tag not only is the "published time", but also relates to the author's birthday, so it must be a special post (say because there was also a <cite> or <address class='vcard'> tag in there too)".
I mean, what benefit am I ever going to get in using HTML5 if I don't know the algorithms that are interpreting it? If I just stuck with the basic div, ol, ul, li, p, a, h[1-6] tags, I could do everything with half the number of DOM elements.
Looking forward to some specific algorithms that I can use to shape how I structure the DOM from here on out.
I'm at the point where I don't even think we should be using HTML5 tags at all. For example, on the iPhone especially, the goal should be to minimize dom elements to decrease load time. Plus, if the iPhone site is a mirror of the traditional browser version, the search engines won't even see the iPhone site (ideally). So there's no real point in making the DOM semantic. So if I can use 1/2 the amount of <div> tags to achieve the same layout as if I used a somewhat "semantic HTML5" rendition, and that's a good thing for the iPhone, why don't I do that for the regular browser too? That's where I'm coming from.
Articles like this are basically saying it's pointless to worry about semantic HTML.
What algorithms are reading your semantic HTML? Google, that's who. Their algorithm tries to extract every bit of meaning from pages that it can, because that helps Google construct smart, relevant search results. For one example, Google tries to determine the dates of things by reading the HTML and gives headers extra consideration in determining the overall topic of a page.
Also, your assertion that we shouldn't use HTML5 tags on the iPhone "to minimize dom elements" isn't founded in any technical basis. HTML5 doesn't dictate that we use more DOM elements, and in fact it can let us leave out tags that would be required by XHTML. You should use HTML5 on the iPhone more than anywhere else. For example, the new input types like number and email don't do much on the desktop, but that extra information can really make things nicer on the iPhone by allowing it to present an appropriate interface.
Whenever a "machine" tries to make sense of your content.
In addition to search engines (→ SEO), screen readers (→ Accessibility) interpret the markup. They get better from version to version.
Also, think of all the tools that might come one day. The great thing about the Web is, that all the web pages could still exist in 5, 10, 100 … years from now. Imagine the user-agents and algorithms and search tools that might exist then, and how they could extract the meaning of your old documents.
Search engines can/will better interpret your pages which combined with other factors will result in better rankings for your pages.
Moreover if you use the tags consistently and semantically, you could build your own reusable widgets and libraries that derive knowledge from the HTML structure independent of how the data is stored in the backend.
Consider this sample Google search where you can filter results by date. By using semantic HTML, for let's say, <article> and <time>, you can write a simple crawler that recreates this functionality or allows users to specify a timespan within which to search articles in your own site(s).
Off the top of my head, I don’t know of any algorithms making use of the new semantic tags in HTML5. (Obviously, that doesn’t mean there aren’t any.)
But the idea that you should tailor your HTML to specific algorithms is, I think, a bit contrary to how the web works. The web is worldwide, and will hopefully be around for a long time. We can’t know what uses our HTML will be put to, and useful algorithms can’t be written until there’s a good amount of actual content out there.
The <a> tag wasn’t designed with Google’s PageRank algorithm in mind. Some people thought links would be useless if they weren’t inherently two-way, because you’d get too many broken links when one end went away.
Of course, if the vague possibility of undefined future benefits makes it not worth using some or all HTML5 tags for whatever project you’re working on, don’t use them.
For me, the benefit of using them is that there’s a well-known, public, non-proprietary specification that tells you, and anyone else working on the code, what we’ve agreed the tags mean. Future developers don’t just get a <div> with a class name that I made up in a coffee-fuelled 7 p.m. code print, they get a tag designed and documented by people smarter and more experienced than me. There’s also the chance that the code will become more useful in future if people use the meaning contained in HTML5 tags in algorithms, whereas there’s less chance of that if it’s all just a bunch of <div>s.
I don’t think the size increase of our pages from HTML5 tags is particularly worth worrying about though. After gzipping, the size increases aren’t enough to worry about, especially as mobile performance is as much hampered by the latency (which you can’t do much about) as the bandwidth. Plus mobile bandwidth is likely to trend up, rather than down.

Is semantic markup too open-ended? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I am taking a peek at Dive Into HTML5. It seems nice and interesting, but I am puzzled.
In the 1990s, at the time when Netscape was the browser and HTML was HTML2 or HTML3, there were a lot of tags: address, cite, code... Most of them are unused as of today, probably even obsolete.
HTML5 introduces tags to express "semantic meaning" to the tag itself. This is all fun and games, but I see something very strange in this approach. Technically, the semantics can be very open ended. HTML5 has tags for article, time, navigation bars, footer. Why shouldn't it contain tags for post icon, author's place, name and surname, or whatever else you want to assign specific semantics to (I'm confident <rant> and <nsfw> would be very important tags): ? I thought XML was the strategy to assign semantics to stuff. Nothing forbids you to put an XML chunk under a XHTML div element, and assign a stylesheet to it so to style it properly, or to delegate to the proper viewer the handling of that namespace (for example, when handling RSS or SVG).
In conclusion, I don't understand the reason behind this extensions focused towards semantics, when it's clear that semantic is a very broad topic, which is guaranteed to require a potentially infinite amount of semantic tags. Since I am pretty sure there are clever people at W3C, I think I'm wrong, but I'd like to know why.
Why are tags for article, time, navigation bars, footer useful?
Because they facilitate parsing for text processing tools like Google.
It's nothing about semantics (at least in 'broad' meaning). Instead they just say: here is the body of page (most important text part) and there is the navigation bar full of links. With such an approach you can easily extract just what you need.
I too hate the way that W3C is going with their specs. There are many things that I don't like, and this "semantics" fad is one of them. (Others include taking forever to complete their specs and leaving too many important details for the browsers to implement as they choose)
Most of all I don't like it because it makes my work as a web developer more difficult. I often have to make a choice whether to make the webpage "semantically correct" or "visually/aesthetically pleasing". The latter wins of course, because that is what the users want, but as a result validations start failing and the whole thing gets quite non-semantic (tables for layout and other things).
Another issue at which I frown is that they have officialy declared that the "class" attribute is for semantics, but then they used it for visual presentation selectors in CSS.
Bottom line - DON'T MIX SEMANTICS AND VISUAL REPRESENTATION. If you use some mechanism for describing semantics (like tag names, attribute values, or what not else), then don't use it for funcional/visual purposes and vice versa.
If I would design HTML, I would simply add an attribute "semantic" which could (like the "class" attribute) be added to any tag. Then there would be a number of predefined values like all those headers/footers/articles/quotes/etc.
Tags would define functionality. Basically you could reduce HTML tags to just a handful, like "div", "table/tr/td", "a", "img", "form", "input" and "select". I probably missed a few but this is the bulk. Visual styling would be accomplished through CSS.
This way the three areas - semantics, visual representation, and functionality - would be completely independent and wouldn't clash in real life solutions.
Of course, I don't think W3C is interested in practical solutions...
There is already a lot of semantics in HTML markup in the forms of classes and IDs, of which there is a (near) infinite amount of possibilities of, And everyone has their own way of handling these semantics. One of the goals of HTML5 is to try to bring some structure to this. you will still be able to extend the semantics of tags with classes and ids. It will also most likely make things easier for search engines.
Look at it from the angle of trying to make statements either about the page, or about objects referenced from the page. If you see a <footer> tag, all you can say is "stuff in here is a footer" and pass it by. As such, adding custom tags is not as generic a solution as adding attributes and allowing people to use their own choice of URIs to specify predicates and optionally values - RDFa wins hands-down because you can express any triple-statement you like from RDF in a page, one way or another.
I just want to address one part of your question. You say:
In the nineties, at the time when
Netscape was the browser and html was
HTML2 or HTML3, there were a lot of
tags: address, cite, code... Most of
them are unused as of today, probably
even obsolete.
There are a great deal of tags to choose from in html, but the lack of usage does not imply that they are obsolete. In particular the header tags <h1>, etc, and <ul>, <ol> are used to join items into lists in a way I consider semantic. Many people may not use tags semantically, but the effort to create microformats is an ongoing continuation of the idea you consider an artifact of the 1990s. Efforts to make the semantic web be a winner keeps going, despite full-text search and link analysis (in the form of Google) being the winner as far as how to find and understand the web.
It would be great to see an updated version of Google's Web Stats which show "html as she is spoke." But you are right that many tags are underused.
Whether html5 will be successful is an open and interesting question, but the tags you describe as obsolete didn't go anywhere, they were there in HTML 4.01 and xhtml. HTML5 seems to be an effort to solidify what is useful in tags. In the end if html5 gets support in browsers and makes the job of web developers easier, it will succeed. xhtml2 failed because it roundly failed to gain adoption in browsers and did nothing to make the job of web page makers easier. The forces working on html5 seem keenly aware of the failure of xhtml2, and I think are avoiding having html5 suffer a similar fate.
"Why shouldn't it contain tags for post icon, author's place, name and surname, or whatever else you want to assign specific semantics to (I'm confident and would be very important tags): ?"
You use <dialog> to describe conversations or comments. Rant and NSFW are subjective terms therefore it makes sense not to use them.
From what I understand a bunch of experienced web developers did research and looked for what most websites have in common in html. They noticed that most websitse have id="header", id="footer", id="section" and id="nav" tags so they decided that we need HTML tags to replace those id's. So in other words, don't expect them to give you a HUGE amount of HTML vocabulary. Just keep it simple as possible as you can while addressing the MOST common needed HTML tags.
NAV tag is VERY important for providing accessibility as well. You want them to know where the navigation is rather than to force them to find whether links are for navigation or not.
I disagree with adding extra tags. If detailed vocabulary were actually import then there could be a different tag name for every word in the dictionary. Additional tags names are not helpful as they may communicate additional meaning to humans, but do nothing to facilitate machine parsing of the language. This is why I don't like the "semantic" tags for HTML5 as I believe this to be slippery slope to providing a vocabulary too complex while only providing a weak solution to a problem not fully addressed.
In my opinion markup language structure data as much as describe it in a tree diagram form. Through parsing of the structure and proper use of semantic conventions, such as RDFa, context can be leveraged to provide specific meaning to otherwise generic tag names. In such as case excessive vocabulary need not exist and structurally redundant tag names, such as footer and aside, could be eliminated. The final objective is to make content faster and more accurate to interpret by both humans and machines simultaneously while using as little code as possible to achieve that result. How that solution is lesser important, except to HTML5.
I thought XML was the strategy to assign semantics to stuff.
As far as I know, no it wasn’t. XML allows new languages to be defined which are all parsed in the same way, because they all use the XML syntax.
It doesn’t, of itself, provide any way to add meaning (“semantic” just means “meaningful”) to those languages. And until computers get artificial intelligence, they don’t actually understand meaning, so meaning is just what is agreed between human beings. HTML is the most commonly-used language with agreed meaning of its tags.
As HTML is so common, it’s helpful to add a few meaningful tags to it that are quite general in their application. The new HTML5 tags are aimed at that. The HTML5 spec’s authors could indeed carry on down this route, creating tags for every specific bit of meaning possible, but as they’re not robots, they probably won’t.
<section> is useful, and general enough to be meaningfully applicable in lots of documents. <author-last-name> isn’t. Distinguishing between the two is a judgment call, which is why humans, and not computers, write the spec.
For custom semantics that are too specific to be added to HTML as tags, HTML5 defines microdata.
I've been reading Andy Clark's book Transcending CSS (page 33).
...,it is now widely accepted that presentational names such as header, left, or red that describe an element's look or position are poor choices.
After reading these lines I asked myself: hey, aren't there elements in HTML5 spec such as header, footer?? Why is footer more semantic ? Andy in his book advocates to use site-info for the ID of the footer div and this makes more sense IMHO. Footer is a presentational name (describes the element's position).
In a word, AJAX. The new tags are meant to support what real-world developers are doing by replacing some of the <div class="sidebar-wrap"><div class="styling-hook"><div><ul class="nav"> type of divitis many websites suffer from. The only <div> left in the HTML5 is the styling hook.
The semantics that get promoted to tags from classes are those that developers have freely adopted en-masse as best practices, given an extended xhtml/css adoption period. Check out the WHATWG developer's edition of the spec's sections pagehere. The document itself is a pleasure, but I won't spoil it if you haven't seen it yet.
One of the less obvious reasons for some decisions made by the W3C is the importance of Webkit. If you look, you can see that they were better than some at taking the current work of the HTML5 Working Group and implementing ideas. They have historically been way out ahead in compliance (see here). The W3C placed a high priority on their (i.e. Android, iPhone, the Googlebot, Chrome, Safari, Dreamweaver, etc.,). Google, framework users, Wordpress/Moveable Type/Joomla! type users and others wanted self contained building blocks, so this is the style we get.
Facebook is modular. Responsive design's grids are modular. Wordpress is modular. Ajax works best with modular page structures. Widgets are modules. Plug-ins are modules. It would seem that we should be trying to figure out stuff like how to apply these tags to make it easier to hook the appropriate elements and activate them in our document/application/info-network hybrid Web 2.0.
In closing, HTML5 is meant to be written as xml (again, see the spec) in order to ensure that tools and machines making ajax requests for a portion of a document will get a well-formed useful response. How awesome in combination with things like media queries for devices like feed readers, braille printers, annotators, etc.,. I see a (near)future where anything with good semantic content is it's own newsfeed automagically! This only happens if developers adopt and write compliant documents.