Is 3.2.5.9 of the HTML5 W3 spec deprecated?

Is 3.2.5.9 of the HTML5 W3 spec deprecated? - html

I recently received the following comment:
Are these custom attributes (dw-filter et al) going to be valid across all browsers? Goes against the data-* convention.
Arguably, now that angular is becoming more prevalent on the web it's fairly common to find something such as:
<a _ngcontent-c2 class="col-1-4" ng-reflect-router-link="/detail/13" href="/preview/cj9t4xj0v00113c5sw2c8s6w8/detail/13">
<div _ngcontent-c2 class="module">
<h4 _ngcontent-c2>Bombasto</h4>
</div>
</a>
The W3 HTML5 spec explicitly states that:
https://www.w3.org/TR/html5/dom.html#custom-data-attribute: A custom data attribute is an attribute in no namespace whose name starts with the string "data-"
Angular applications always fail w3 validation due to this, even though very simple arbitrary custom attributes (some-attr="1") still work in all HTML5 browsers without complaint / error.
Do websites (angular or not) which use such custom attributes really need to honor this part of the specification anymore?

ng- attributes are not part of the HTML specification. Therefore, they are invalid markup. Proper custom attributes should begin with data- as specified in the current HTML5 specification. Angular attributes do not do that and fail most, if not all validators.
Failing to follow web standards is a risk that has caused issues in the past and now. Should an attribute become popular with some other group, or even become a standard in the future, will cause problems for anyone trying to establish their own. For whatever reason, WHATWG or the W3C decide to make ng- an attribute for a security property, every Angular application ever made will face an uncertain future.
Not to mention adding confusion to the web as witnessed by this question.
Always, always write valid markup.
Commentary:
Why Angular intentionally violates HTML standards, I do not know. (Note: see the reply from #angular in the comments.) Unfortunately, I see articles online where developers say, "So what?", and that, to me, smacks of the days of Microsoft browsers before competition from Firefox and, then, Chrome from Google, the developers of Angular, which boasted of standards compliance.

Related

are custom html attributes without data-* prefix a valid attribute?

Okey, so, recently, i found this: https://angularjs.org/
I noticed that they use custom attribute prefix "ng-"
From articles, like:
http://html5doctor.com/html5-custom-data-attributes/
or even stackoverflow:
https://stackoverflow.com/a/17091848/2803917
And there are many more, the only VALID (im not talking about the fact, that they work anyways) prefix to use is "data-".
So, could someone explain to me, how can it be, that these, million projects and companies, uses an invalid prefix for custom html element attributes and no one seems to care?
Or am i missing something?
I would really appreciate some thoughts, or even sources of info, not just texts like "everyone does it" and "don't bother and leave it".

This is an old question, but the following may be helpful.
In principle, you can create any attributes you want, though you can’t expect the browser to know what to do with them. This is true both in HTML:
<p thing="whatever" … </p>
and in JavaScript
// p = some element
p.setAttribute('thing','whatever');
You can expect CSS to take your custom attribute seriously, as long as you use the attribute selector:
…[thing] {
…
}
Of course, if you start making up your own attributes, you run into three problems:
An HTML validator won’t know whether your new attribute is deliberate or an error, and will assume that it’s incorrect
You are competing with other code which is also making up its own attributes
At some point in the future it’s possible that your made-up attribute name becomes a real-life attribute, which would cause further problems
The data- attribute prefix has three benefits:
HTML validators will ignore the attribute for validation purposes
JavaScript will gather these attributes into a special data object for easy access
You don’t run the risk of competing with real attribute names
Effectively the data- prefix allows you to work with an otherwise invalid attribute by telling the validator to overlook it.
This will not solve the problem of competing attribute names, so you’re pretty much on your own. However it is common practice to at least include a prefix specific to a library.
Finally to the question of being valid.
If, by valid, you mean will it pass a standard (modern) HTML validator, the answer, is only the data- attributes will work this way. If, on the other hand, you mean will it work, then both CSS and JavaScript will happily work with other made up attributes, as long as you don’t expect the browser to guess what you mean.

Custom attributes must start with data- or x- or they are invalid.
This might cause problems in future browsers, and HTML validators will say they're invalid.
See: What is the difference between ng-app and data-ng-app?
And: http://www.w3.org/TR/2011/WD-html5-20110525/elements.html#embedding-custom-non-visible-data-with-the-data-attributes

Official W3C validator does not consider ng-customattr or x-customattr attributes, or customattr as valid.
Statement, that even data- prefixed custom attributes are invalid are false, check this W3C specification
However it is worth to notice projects like Laravel's Dusk encourage developers to use custom, non data- prefixed attributes.
It is worth noting, that official W3C specifications are not exclusive way how to build HTML page, but rather reccommended ones. I would dare to say, there are unspoken standards, which are widely used on web and yet tolerated by all the major browsers, even though they are not mentioned in w3c specifications. According to this article, custom named html attributes are disregarded, but yet still accessible, therefore a viable option.
I afraid there is no firm ground below our feet - By naming your parameter prefixed by data-, you are doing things by reccomended way - avoiding possible deprecation warnings, or another issues, if browsers will be in future more strict. However standards can change, and since custom-way named attributes are widespread over the web, they can became standard themselves.
So whats the solution?
If you really want to use custom, non data- prefixed attributes in HTML, it would be good to make a research about general browser support (Of find someone who already did this and published his research results in some article), and make your decision based on that - just like with any other HTML/CSS/JS feature.

Why does CSS work with fake elements?

In my class, I was playing around and found out that CSS works with made-up elements.
Example:
imsocool {
color:blue;
}
<imsocool>HELLO</imsocool>
When my professor first saw me using this, he was a bit surprised that made-up elements worked and recommended I simply change all of my made up elements to paragraphs with ID's.
Why doesn't my professor want me to use made-up elements? They work effectively.
Also, why didn't he know that made-up elements exist and work with CSS. Are they uncommon?

Why does CSS work with fake elements?
(Most) browsers are designed to be (to some degree) forward compatible with future additions to HTML. Unrecognised elements are parsed into the DOM, but have no semantics or specialised default rendering associated with them.
When a new element is added to the specification, sometimes CSS, JavaScript and ARIA can be used to provide the same functionality in older browsers (and the elements have to appear in the DOM for those languages to be able to manipulate them to add that functionality).
(There is a specification for custom elements, but they have specific naming requirements and require registering using JavaScript.)
Why doesn't my professor want me to use made-up elements?
They are not allowed by the HTML specification
They might conflict with future standard elements with the same name
There is probably an existing HTML element that is better suited to the task
Also; why didn't he know that made-up elements existed and worked with CSS. Are they uncommon?
Yes. People don't use them because they have the above problems.

TL;DR
Custom tags are invalid in HTML. This may lead to rendering issues.
Makes future development more difficult since code is not portable.
Valid HTML offers a lot of benefits such as SEO, speed, and professionalism.
Long Answer
There are some arguments that code with custom tags is more usable.
However, it leads to invalid HTML. Which is not good for your site.
The Point of Valid CSS/HTML | StackOverflow
Google prefers it so it is good for SEO.
It makes your web page more likely to work in browsers you haven't tested.
It makes you look more professional (to some developers at least)
Compliant browsers can render [valid HTML faster]
It points out a bunch of obscure bugs you've probably missed that affect things you probably haven't tested e.g. the codepage or language set of the page.
Why Validate | W3C
Validation as a debugging tool
Validation as a future-proof quality check
Validation eases maintenance
Validation helps teach good practices
Validation is a sign of professionalism

YADA (yet another (different) answer)
Edit: Please see the comment from BoltClock below regarding type vs tag vs element. I usually don't worry about semantics but his comment is very appropriate and informative.
Although there are already a bunch of good replies, you indicated that your professor prompted you to post this question so it appears you are (formally) in school. I thought I would expound a little bit more in depth about not only CSS but also the mechanics of web browsers. According to Wikipedia, "CSS is a style sheet language used for describing ... a document written in a markup language." (I added the emphasis on "a") Notice that it doesn't say "written in HTML" much less a specific version of HTML. CSS can be used on HTML, XHTML, XML, SGML, XAML, etc. Of course, you need something that will render each of these document types that will also apply styling. By definition, CSS does not know / understand / care about specific markup language tags. So, the tags may be "invalid" as far as HTML is concerned, but there is no concept of a "valid" tag/element/type in CSS.
Modern visual browsers are not monolithic programs. They are an amalgam of different "engines" that have specific jobs to do. At a bare minimum I can think of 3 engines, the rendering engine, the CSS engine, and the javascript engine/VM. Not sure if the parser is part of the rendering engine (or vice versa) or if it is a separate engine, but you get the idea.
Whether or not a visual browser (others have already addressed the fact that screen readers might have other challenges dealing with invalid tags) applies the formatting depends on whether the parser leaves the "invalid" tag in the document and then whether the rendering engine applies styles to that tag. Since it would make it more difficult to develop/maintain, CSS engines are not written to understand that "This is an HTML document so here are the list of valid tags / elements / types." CSS engines simply find tags / elements / types and then tell the rendering engine, "Here are the styles you should apply." Whether or not the rendering engine decides to actually apply the styles is up it.
Here is an easy way to think of the basic flow from engine to engine: parser -> CSS -> rendering. In reality it is much more convoluted but this is good enough for starters.
This answer is already too long so I will end there.

Unknown elements are treated as divs by modern browsers. That's why they work. This is part of the oncoming HTML5 standard that introduces a modular structure to which new elements can be added.
In older browsers (I think IE7-) you can apply a Javascript-trick after which they will work as well.
Here is a related question I found when looking for an example.
Here is a question about the Javascript fix. Turns out it is indeed IE7 that doesn't support these elements out of the box.
Also; why didn't he know that made-up tags existed and worked with CSS. Are they uncommon?
Yes, quite. But especially: they don't serve additional purpose. And they are new to html5. In earlier versions of HTML an unknown tag was invalid.
Also, teachers seem to have gaps in their knowledge, sometimes. This might be due to the fact that they need to teach students the basics about a given subject, and it doesn't really pay off to know all ins and outs and be really up to date.
I once got detention because a teacher thought I programmed a virus, just because I could make a computer play music using the play command in GWBasic. (True story, and yes, long ago). But whatever the reason, I think the advice not to use custome elements is a sound one.

Actually you can use custom elements. Here is the W3C spec on this subject:
http://w3c.github.io/webcomponents/spec/custom/
And here is a tutorial explaining how to use them:
http://www.html5rocks.com/en/tutorials/webcomponents/customelements/
As pointed out by #Quentin: this is a draft specification in the early days of development, and that it imposes restrictions on what the element names can be.

There are a few things about the other answers that are either just poorly phrased or perhaps a little incorrect.
FALSE(ish): Non-standard HTML elements are "not allowed", "illegal", or "invalid".
Not necessarily. They're "non-conforming". What's the difference? Something can "not conform" and still be "allowed". The W3C aren't going to send the HTML police to your home and haul you away.
The W3C left things this way for a reason. Conformance and specifications are defined by a community. If you happen to have a smaller community consuming HTML for more specific purposes and they all agree on some new Elements they need to make things easier, they can have what the W3C refers to as "other applicable specifications". (this is a gross over simplification, obviously, but you get the idea)
That said, strict validators will declare your non-standard elements to be "invalid". but that's because the validator's job is to ensure conformance to whatever spec it's validating for, not to ensure "legality" for the browser or for use.
FALSE(ish): Non-standard HTML elements will result in rendering issues
Possibly, but unlikely. (replace "will" with "might") The only way this should result in a rendering issue is if your custom element conflicts with another specification, such as a change to the HTML spec or another specification being honored within the same system (such as SVG, Math, or something custom).
In fact, the reason CSS can style non-standard tags is because the HTML specification clearly states that:
User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them
Note: if you want to use a custom tag, just remember a change to the HTML spec at a later time could blow your styling up, so be prepared. It's really unlikely that the W3C will implement the <imsocool> tag, however.
Non-standard tags and JavaScript (via the DOM)
The reason you can access and alter custom elements using JavaScript is because the specification even talks about how they should be handled in the DOM, which is the (really horrible) API that allows you to manipulate the elements on your page.
The HTMLUnknownElement interface must be used for HTML elements that are not defined by this specification (or other applicable specifications).
TL;DR: Conforming to the spec is done for purposes of communication and safety. Non-conformance is still allowed by everything but a validator, whose sole purpose is to enforce conformity, but whose use is optional.
For example:
var wee = document.createElement('wee');
console.log(wee.toString()); //[object HTMLUnknownElement]
(I'm sure this will draw flames, but there's my 2 cents)

According to the specs:
CSS
A type selector is the name of a document language element type written using the syntax of CSS qualified names
I thought this was called the element selector, but apparently it is actually the type selector. The spec goes on to talk about CSS qualified names which put no restriction on what the names actually are. That is to say that as long as the type selector matches CSS qualified name syntax it is technically correct CSS and will match the element in the document. There is no CSS-specific restriction on elements that do not exist in a particular spec -- HTML or otherwise.
HTML
There is no official restriction on including any tags in the document that you want. However, the documentation does say
Authors must not use elements, attributes, or attribute values for purposes other than their appropriate intended semantic purpose, as doing so prevents software from correctly processing the page.
And it later says
Authors must not use elements, attributes, or attribute values that are not permitted by this specification or other applicable specifications, as doing so makes it significantly harder for the language to be extended in the future.
I'm not sure specifically where or if the spec says that unkown elements are allowed, but it does talk about the HTMLUnknownElement interface for unrecognized elements. Some browsers may not even recognize elements that are in the current spec (IE8 comes to mind).
There is a draft for custom elements, though, but I doubt it is implemented anywhere yet.

This is possible with html5 but you need to take into consideration of older browsers.
If you do decide to use them then, make sure to COMMENT your html!! Some people may have some trouble figuring out what it is so a comment could save them a ton of time.
Something like this,

When you make your own custom tag/elements the older browsers will have no clue what that is just like html5 elements like nav/section.
If you are interested in this concept then I recommend to do it the right way.
Getting started
Custom Elements allow web developers to define new types of HTML
elements. The spec is one of several new API primitives landing under
the Web Components umbrella, but it's quite possibly the most
important. Web Components don't exist without the features unlocked by
custom elements:
Define new HTML/DOM elements Create elements that extend from other
elements Logically bundle together custom functionality into a single
tag Extend the API of existing DOM elements
There is a lot you can do with it and it does make your script beautiful as this article likes to put it. Custom Elements defining new elements in HTML.
So lets recap,
Pros
Very elegant and easy to read.
It is nice to not see so many divs. :p
Allows a unique feel to the code
Cons
Older browser support is a strong thing to consider.
Other developers may have no clue what to do if they don't know about custom tags. (Explain to them or add comments to inform them)
Lastly one thing to take into consideration, but I am unsure, is block and inline elements. By using custom tags you are going to end up writing more css because of the custom tag won't have a default side to it.
The choice is entirely up to you and you should base it on what the project is asking for.
Update 1/2/2014
Here is a very helpful article I found and figured I would share, Custom Elements.
Learn the tech Why Custom Elements? Custom Elements let authors define
their own elements. Authors associate JavaScript code with custom tag
names, and then use those custom tag names as they would any standard
tag.
For example, after registering a special kind of button called
super-button, use the super button just like this:
Custom elements are still elements. We
can create, use, manipulate, and compose them just as easily as any
standard or today.
This seems like a very good library to use but I did notice it didn't pass Window's Build status. This is also in a pre-alpha I believe so I would keep an eye on this while it develops.

Why doesn't he want you to use them? They are not common nor part of the HTML5 standard.
Technically, they are not allowed. They are a hack.
I like them myself, though. You may be interested in XHTML5. It allows you to define your own tags and use them as part of the standard.
Also, as others have pointed out, they are invalid and thus not portable.
Why didn't he know that they exist? I don't know, except that they are not common. Possibly he was just not aware that you could.

Made-up tags are hardly ever used, because it's unlikely that they will work reliably in every current browser, and every future browser.
A browser has to parse the HTML code into elements that it knows, to made-up tags will be converted into something else to fit in the document object model (DOM). As the web standards doesn't cover how to handle everyting that is outside of the standards, web browsers tend to handle non-standars code in different ways.
Web development is tricky enough with a bunch of different browsers that have their own quirks, without adding another element of uncertainty. The best bet it to stick with things that are actually in the standards, that is what the browser vendors try to follow, so that has the best chance to actually work.

I think made-up tags are just potentially more confusing or unclear than p's with IDs (some block of text generally). We all know a p with an ID is a paragraph, but who knows what made-up tags are intended for? At least that's my thought. :) Therefore this is more of a style / clarity issue than one of functionality.

Others have made excellent points but its worth noting that if you look at a framework such as AngularJS, there is a very valid case for custom elements and attributes. These convey not only better semantic meaning to the xml, but they also can provide behavior, look and feel for the web page.

CSS is a style sheet language that can be used to present XML documents, not only (X)HTML documents. Your snippet with the made-up tags could be part of a legal XML document; it would be one if you enclose it in a single root element. Probably you already have a <html> ...</html> around it? Any current browser can display XML documents.
Of course it is not a very good XML document, it lacks a grammar and an XML declaration. If you use an HTML declaration header instead (and probably a server configuration that sends the correct mime type) it would instead be illegal HTML.
(X)HTML has advantages over plain XML as elements have a semantic meaning that is useful in the context of a web page presentation. Tools can work with this semantics, other developers know the meaning, it is less error prone and better to read.
But in other contexts it is better to use CSS with XML and/or XSLT to do the presentation. This is what you did. As this wasn't your task, you didn't know what you were doing, and HTML/CSS is the better way to go most of the time you should stick to it in your scenario.
You should add an (X)HTML header to your document so tools can give you meaningful error messages.

...I simply change all of my made up tags to paragraphs with ID's.
I actually take issue with his suggestion of how to do it properly.
A <p> tag is for paragraphs. I see people using it all the time instead of a div -- simply for spacing purposes or because it seems gentler. If it's not a paragraph, don't use it.
You don't need or want to stick ID's on everything unless you need to target it specifically (e.g. with Javascript). Use classes or just a straight-up div.

From its early days CSS was designed to be markup agnostic so it can be used with any markup language producing tree alike DOM structures (SVG for example). Any tag that comply to name token production is perfectly valid in CSS. So your question is rather about HTML than CSS itself.
Elements with custom tags are supported by HTML5 specification. HTML5 standardize the way how unknown elements must be parsed in the DOM. So HTML5 is the first HTML specification that enables custom elements strictly speaking. You just need to use HTML5 doctype <!DOCTYPE html> in your document.
As of custom tag names themselves...
This document http://www.w3.org/TR/custom-elements/ recommends custom tags you choose to contain at least one '-' (dash) symbol. This way they will not conflict with future HTML elements. Therefore you'd better change your doc to something like this:
<style>
so-cool {
color:blue;
}
</style>
<body>
<so-cool>HELLO</so-cool>
</body>

Surprisingly, nobody (including my past self) mentioned accessibility. Another reason that using valid tags instead of custom ones is for compatibility with the greatest amount of software, including screen-readers and other tools that people need for accessibility purposes. Moreover, accessibility laws like WAI require making accessible websites, which generally means requiring them to use valid markup.
Apparently nobody mentioned it, so I will.
This is a by-product of browser wars.
Back in the 1990’s when the Internet was first starting to go mainstream, competition incrased in the browser market. To stay competitive and draw users, some browsers (most notably Internet Explorer) tried to be helpful and “user-friendly” by attempting to figure out what page designers meant and thus allowed markup that are incorrect (e.g., <b><i>foobar</b></i> would correctly render as bold-italics).
This made sense to some degree because if one browser kept complaining about syntax errors while another ate anything you threw at it and spit out a (more-or-less) correct result, then people would naturally flock to the latter.
While many thought the browser wars were over, a new war between browser vendors has reignited in the past few years since Chrome was released, Apple started growing again and pushing Safari, and IE lost its dominance. (You could call it a “cold war” due to the perceived cooperation and support of standards by browser vendors.) Therefore, it is not a surprise that even contemporary browsers which supposedly conform strictly to web standards actually try to be “clever” and allow standard-breaking behavior such as this in order to try to gain an advantage as before.
Unfortunately, this permissive behavior led to a massive (some might even say cancerous) growth of poorly marked up webpages. Because IE was the most lenient and popular browser, and due to Microsoft’s continued flouting of standards, IE became infamous for encouraging and promoting bad design and propagating and perpetuating broken pages.
You may be able to get away with using quirks and exploits like that on some browsers for now, but other than the occasional puzzle or game or something, you should always stick to web standards when creating web pages and sites to ensure they display correctly and avoid them becoming broken (possibly completely ignored) with a browser update.

While browsers will generally relate CSS to HTML tags regardless of whether or not they are valid, you should ABSOLUTELY NOT do this.
There is technically nothing wrong with this from a CSS perspective. However, using made up tags is something you should NEVER do in HTML.
HTML is a markup language, which means that each tag corresponds to a specific type of information.
Your made up tags don't correspond to any type of information. This will create problems from web crawlers, such as Google.
Read more information on the importance of correct markup.
Edit
Divs refer to groups of multiple related elements, meant to be displayed in block form and can be manipulated as such.
Spans refer to elements that are to be styled differenly than the context they are currently in and are meant to be displayed inline, not as a block. An example is if a few words in a sentence needs to be all caps.
Custom tags do not correlate to any standards and thus span/div should be used with class/ID properties instead.
There are very specific exemptions to this, such as Angular JS

Although CSS has a thing called a "tag selector," it doesn't actually know what a tag is. That's left for the document's language to define. CSS was designed to be used not just with HTML, but also with XML, where (assuming you're not using a DTD or other validation scheme) the tags can be just about anything. You could use it with other languages too, though you would need to come up with your own semantics for exactly what things like "tags" and "attributes" correspond to.
Browsers generally apply CSS to unknown tags in HTML, because this is considered better than breaking completely: at least they can display something. But it is very bad practice to use "fake" tags deliberately. One reason for this is that new tags do get defined from time to time, and if one is defined that looks sort of like your fake tag but doesn't quite work the same way, that can cause problems with your site on new browsers.

Why does CSS work with fake elements? Because it doesn't hurt anyone because you're not supposed to use them anyways.
Why doesn't my professor want me to use made-up elements? Because if that element is defined by a specification in the future your element will have an unpredictable behavior.
Also, why didn't he know that made-up elements exist and work with CSS. Are they uncommon? Because he, like most other web developers, understand that we shouldn't use things that might break randomly in the future.

Yet another question regarding the html5 dtd/schema

If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation? And by document validation, I mean "how are we supposed to ensure our html5 documents are both syntactically accurate and structurally sound?" Please help! This is going to become a huge problem for our industry if we have no way to accurately validate HTML5 documents!
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax? I mean, it seems rather rudimentary to just look at the document and say "yep. that's a valid xml document." What about custom tags? What about tag attributes? It seems like the W3C is leaving us out in the cold a little bit here.
Maybe the best answer will be found in the HTML editor. But then you get DTD/schema fragmentation. Each editor vendor coming up with their own rendition of what a valid structure is.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
So, any thoughts?

If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation?
With a specialized HTML5 validator rather then a generic SGML or XML validator.
Obviously, as the specification is still in draft form, the tools that do exist are immature and likely to be out of date or become out of date.
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax?
Either use a different tool or download the W3C validator and run a local copy. It has a SOAP API so writing a batch validation tool isn't difficult.
What about custom tags?
HTML5 doesn't allow custom elements.
What about tag attributes?
The only custom attributes in HTML5 are data-* attributes, so an HTML 5 validator can recognize them.
It seems like the W3C is leaving us out in the cold a little bit here.
It seems like you expect the state of QA tools for HTML 5 (unfinished) to be up to the same standard as those for HTML 4 (over a decade old). This isn't a realistic expectation.
Maybe the best answer will be found in the HTML editor. But then you get DTD/schema fragmentation. Each editor vendor coming up with their own rendition of what a valid structure is.
The specification is clear (although in flux) even if it isn't expressed in the form of a DTD or schema. If each editor has a different idea of what is valid, then most or all of them are going to be either out of date or just buggy.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
If you need to live in the bleeding edge, then you have to accept the limitations and risks of doing so.

You might find this question/answer interesting: Will HTML 5 validation be worth the candle? . The answer is written by the developer of http://about.validator.nu/ .

You should start by taking a look at http://about.validator.nu/ .
Some, though not all, of your concerns are addressed there. You can host your own validator, there's a python based submission script, you can use a RESTFUL web service API and there are ways to get validation output in a variety of different forms.
I can't however see a simple way to integrate XHTML5 with other applications of XML such that one can easily create a validator of such compound documents. Not that there's really been a way to do that with earlier versions of XHTML either though.

This is working well for me: https://github.com/hober/html5-el
To get this to work, I renamed the default '/etc/schema/schemas.xml' file in order to move it out of the way and let the 'html5-el' one be used by nxml-mode.

If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation? And by document validation, I mean "how are we supposed to ensure our html5 documents are both syntactically accurate and structurally sound?" Please help! This is going to become a huge problem for our industry if we have no way to accurately validate HTML5 documents!
If testing pages with either Firefox or Opera, both of those will report errors such as code that is not "well-formed" and mismatched tags. Beyond that, one of the validators such as validator.w3.org or validator.nu will definitely help.
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax? I mean, it seems rather rudimentary to just look at the document and say "yep. that's a valid xml document."
There are ways to run the W3C validator in batch mode.
What about custom tags? What about tag attributes? It seems like the W3C is leaving us out in the cold a little bit here.
The easy answer to that one is that "custom tags" are simply not considered valid. The Working Group has thoroughly addressed the issue of "distributed extensibility", particularly with respect to allowing "decentralized
parties to create their own languages" and "extension attributes" (http:// lists.w3.org/Archives/Public/public-html/2011Feb/0085.html). There are numerous ways to extend HTML (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#extensibility) but adding custom tags is not one of them. Custom data and microdata attributes should validate fine.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
Since HTML 5 was stabilized at the end of last year (Dec. 2010), IMO we don't need to wait for it to become an official "recommendation" by the W3C. The stabilized spec provides a solid base that all browser vendors can implement consistently and for the ongoing evolution beyond HTML 5 of the spec, which is now being called the "HTML Living Standard" (Jan. 2011 and later). There is a good diagram of this at http://www.HTML-5.com/html-versions-and-history.html#html-versions (scroll down to see the diagram).

What are the concrete risks of using custom HTML attributes in HTML4 Strict?

This subject turned into a heated discussion at the office, so I'm interested to learn what you think.
We are working on a web app that only targets some specific browsers. These browsers presently include different flavors Opera 9 and Mozilla 1.7.12. In the future we likely also have to support Opera 10 and different flavors of WebKit. But it's very unlikely we are ever going to have to deal with any version of IE.
Our web app declares HTML 4.0 strict in it's doctype.
Recently, I proposed as a solution to a specific problem to use custom attributes in the HTML. I proposed something that would look like this:
<span translationkey="someKey">...</span>
Since this is not valid HTML 4, it did not go down well with our HTML guys, and we got into an argument.
My question is this: What - if any - are the risks of using custom attributes? I know the page won't validate, but don't all browsers just ignore attributes they do not know? Or is it conceivable that some browsers will change to "quirks mode" and render the page as if it was something other than strict HTML 4.0?
Update:
Hilited the actual question posed.

There are no browser limitations/risks. Only the w3 validator will bark, but barking dogs doesn't bite.
The w3 spec says the following:
If a user agent encounters an attribute it does not recognize, it
should ignore the entire attribute
specification (i.e., the attribute and
its value).
IE will also not render in quirks mode or so as some may think. It will only do that on invalid/forced doctypes, not on invalid attributes.
However, keep in mind that some Javascript libraries/frameworks will "invisibly" add/use custom HTML attributes in the DOM tree, such as several jQuery plugins do. This way you may risk collisions in attributes because it "by a coincidence" uses an attribute with the same name as you do for your own purposes. Sadly this is often poorly or even not documented at all.

HTML 5 allows custom attributes using a 'data-' prefix, see http://ejohn.org/blog/html-5-data-attributes/

If its a goal to maintain valid html4.0 strict, then it doesn't matter why you want to put in custom attributes, you are breaking the goal.
I think the question you need to be asking, is why do you need to break 4.0 strict to get the functionality you want: Anything that you could use a custom attribute for you, you could use a in an existing attribute:
<span translationkey="someKey">...</span>
could be:
<span class="Translationkey#someKey">...</span>
it will be some extra cycles to parse all the class information, but so long as you don't put any css info on that class, it doesn't change display, doesn't put you in quirks mode, and doesn't get you in fights at work.

Duplicate try this thread though: Is it alright to add custom Html attributes?
Also look at this: Non-Standard Attributes on HTML Tags. Good Thing? Bad Thing? Your Thoughts?

Or is it conceivable that some browsers will change to "quirks mode" and render the page as if it was something other than strict HTML 4.0?
No, bad attributes will not force a rendering mode change.
If you don't care about validation do what you like, but validation is a useful tool for detecting simple mistakes that can otherwise have you chasing around debugging. Given that there are many other perfectly good alternatives for passing data to JavaScript I prefer to use one of those, rather than forgo validation.
Plus, when you add an arbitrary attribute you are effectively playing in a global namespace. There's no guarantee that some future browser or standard won't decide to use the name ‘translationkey’ for some new feature that'll trip your script up. So if you must add attributes, give them a name that's obscure and likely to be unique, or just use the HTML5 data- prefix already.

If the page is declared to be HTML 4 strict, then it should not add attributes that are not used in that HTML specifies. Differently, it is not clear what the browsers would behave.
As already reported, a way to add additional attributes is to add them as classes, even if that has some limitations.

(Copying my answer from a duplicate question)
Answers which say custom attributes won't validate are incorrect.
Custom attributes will validate.
Custom tags will validate too, as long as the custom tags are lowercase and hyphenated.
Try this in any validator. It will validate.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Custom Test</title>
</head>
<body>
<dog-cat PIANO="yellow">test</dog-cat>
</body>
</html>
Some validators:
https://appdevtools.com/html-validator
https://www.freeformatter.com/html-validator.html
https://validator.w3.org/nu/
But is it safe? Will it break later?
Custom Tags
No hyphenated tags exist. For that reason, I believe that W3C will never add a hyphenated tag. It's very likely a custom hyphenated tag will never see a conflict. If you use a weird prefix in your custom tags, even less likely. Eg.<johny-Calendar>.
Custom Attributes
There are hyphenated HTML attributes. But the HTML spec promises never to use an attribute starting with data-. So data-myattrib is guaranteed to be safe.
I believe that W3C will never introduce any attribute that starts with johny- or piano-. As long as your prefix is weird, you'll never see a conflict.
It's worth considering that the first version of HTML was written in 1993. Now, thirty years later, all browsers still support custom tags and attributes, and validators validate them.
<jw-post jw-author="Johny Why" jw-date="12/2/2022">

What's the key difference between HTML 4 and HTML 5?

What are the key differences between HTML4 and HTML5 draft?
Please keep the answers related to changed syntax and added/removed html elements.

HTML5 has several goals which differentiate it from HTML4.
Consistency in Handling Malformed Documents
The primary one is consistent, defined error handling. As you know, HTML purposely supports 'tag soup', or the ability to write malformed code and have it corrected into a valid document. The problem is that the rules for doing this aren't written down anywhere. When a new browser vendor wants to enter the market, they just have to test malformed documents in various browsers (especially IE) and reverse-engineer their error handling. If they don't, then many pages won't display correctly (estimates place roughly 90% of pages on the net as being at least somewhat malformed).
So, HTML5 is attempting to discover and codify this error handling, so that browser developers can all standardize and greatly reduce the time and money required to display things consistently. As well, long in the future after HTML has died as a document format, historians may still want to read our documents, and having a completely defined parsing algorithm will greatly aid this.
Better Web Application Features
The secondary goal of HTML5 is to develop the ability of the browser to be an application platform, via HTML, CSS, and Javascript. Many elements have been added directly to the language that are currently (in HTML4) Flash or JS-based hacks, such as <canvas>, <video>, and <audio>. Useful things such as Local Storage (a js-accessible browser-built-in key-value database, for storing information beyond what cookies can hold), new input types such as date for which the browser can expose easy user interface (so that we don't have to use our js-based calendar date-pickers), and browser-supported form validation will make developing web applications much simpler for the developers, and make them much faster for the users (since many things will be supported natively, rather than hacked in via javascript).
Improved Element Semantics
There are many other smaller efforts taking place in HTML5, such as better-defined semantic roles for existing elements (<strong> and <em> now actually mean something different, and even <b> and <i> have vague semantics that should work well when parsing legacy documents) and adding new elements with useful semantics - <article>, <section>, <header>, <aside>, and <nav> should replace the majority of <div>s used on a web page, making your pages a bit more semantic, but more importantly, easier to read. No more painful scanning to see just what that random </div> is closing - instead you'll have an obvious </header>, or </article>, making the structure of your document much more intuitive.

From Wikipedia:
New parsing rules oriented towards flexible parsing and compatibility
New elements – section, video, progress, nav, meter, time, aside, canvas
New input attributes – dates and times, email, url
New attributes – ping, charset, async
Global attributes (that can be applied for every element) – id, tabindex, repeat
Deprecated elements dropped – center, font, strike

HTML5 introduces a number of APIs that help in creating Web applications. These can be used together with the new elements introduced for applications:
An API for playing of video and audio which can be used with the new video and audio elements.
An API that enables offline Web applications.
An API that allows a Web application to register itself for certain protocols or media types.
An editing API in combination with a new global contenteditable attribute.
A drag & drop API in combination with a draggable attribute.
An API that exposes the history and allows pages to add to it to prevent breaking the back button.

You'll want to check HTML5 Differences from HTML4: W3C Working Group Note 9 December 2014 for the complete differences. There are many new elements and element attributes. Some elements were removed and others have different semantic value than before.
There are also APIs defined, such as the use of canvas, to help build the next generation of web apps and make sure implementations are standardized.

You might be interested in this list of HTML5 elements and attributes.
Also, please note that it's "HTML 4", not "HTML4". Indeed, for HTML 5, both variants are used, but there is an important difference in meaning. HTML 5 refers to the name of the W3C specification, whereas "HTML5" is the document type of those HTML files with a text/html MIME type that follow this spec.
The same goes for XHTML 5 vs. XHTML5.

Now W3c provides an official difference on their site:
http://www.w3.org/TR/html5-diff/

HTML 5 invites you give add a lot of semantic value to your code. What's more, there are natives solution to embed multimedia content.
The rest is important, but it's more technical sugar that will save you from doing the same stuff with a client programming language.

In short it is much simple compared to html, the long doctype is removed and also center and font tag is removed.
I also answered this difference in my blog :
http://ravisinghblog.in/key-difference-between-html-and-html-5/

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008