HTML5: W3C vs WHATWG. Which gives the most authoritative spec?

I'm halfway through writing an HTML parser and found that HTML5 explicitly defines the rules for parsing ill-formed HTML. (I used to infer them from DTDs, sigh.)
I love that fact, but I know well that HTML5 isn't finalized yet (and I wonder if it ever will be) and that it isn't developed by the W3C, but by the WHATWG.
Searching for the spec I need I'm presented with:
8.2 section of the W3C TR
http://www.w3.org/TR/html5/syntax.html#parsing
or
11.2 section of the WHATWG web-apps/current-work
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
If it weren't for the section numbers I would assume they are simply the same. But the different numbering makes me wonder. Which version is, supposedly, the most authoritative?
WHATWG seems to have more sections, and to have been added to since W3C uploaded its candidate recommendation.
Will W3C update to the WHATWG version?
Or will they stick to their current candidate until it gets to the official recommendation status?
Which html5 spec are we poor devils supposed to follow, when in doubt?

Always choose WHATWG over W3C, no exceptions.
Anne van Kesteren (a WHATWG member who was a major contributor to the HTML specification prior to the WHATWG and W3C versions diverging, and who remains a major contributor to the WHATWG specification) describes the current situation between WHATWG and W3C as follows on his blog:
The W3C has forked the [WHATWG] HTML Standard for the nth time. As always, it is pretty disastrous:
Erased all Git history of the document.
Did not document how they transformed the document. Issues of mismatches have already been reported and it will likely be a long time, if ever, before all bugs due to this process are uncovered, since it was not open.
Did not discuss plans with the wider community.
Did not discuss plans with the folks they were forking from.
Did not even discuss plans with the members of the W3C Web Platform Working Group.
Erased the acknowledgments section.
Erased the copyright and licensing information and replaced it with their own.
2019: The war is finally over
On May 28th, 2019, W3C and the WHATWG signed an agreement to collaborate on a single, authoritative version of the HTML and DOM specifications.
According to W3C's statement, the two parties have come to the following terms:
W3C and WHATWG work together on HTML and DOM, in the WHATWG repositories, to produce a Living Standard and Recommendation/Review Draft-snapshots
WHATWG maintains the HTML and DOM Living Standards
W3C facilitates community work directly in the WHATWG repositories (bridging communities, developing use cases, filing issues, writing tests, mediating issue resolution)
W3C stops independent publishing of a designated list of specifications related to HTML and DOM and instead will work to take WHATWG Review Drafts to W3C Recommendations

Biased answer from an editor of WHATWG HTML here. Hopefully the facts can speak for themselves though.
The WHATWG Living Standard should be considered authoritative. It is constantly worked on by a large community of contributors, including all browser vendors. No browser vendors implement according to W3C HTML; for some such as Firefox and Chrome this is a matter of publicly stated policy.
The WHATWG Living Standard is constantly receiving bug fixes and new features. For more information on this model of spec development, which more closely matches modern software development practices, see What does "Living Standard" mean?.
Unfortunately, the W3C sometimes copies and pastes our work onto their own website, and puts their own logo on it, and changes the names of the editors, and such. They do this for a variety of reasons, one of the largest of which is face-saving for the sake of their paying member companies (example of them stating this). What's worse, they like to release "versions" (like HTML "5.0", "5.1", etc.) which are just outdated versions missing modern bug fixes and features that clog up search result pages, causing confusion like this very question. We are currently tracking the confusion caused by these forks, of which HTML is only one.
You can track their progress on the copy-and-paste job in their issue tracker or in commits such as this one. It's a fun game to spot the bugs they introduce while doing this copy-and-paste job, as they generally do not read or understand the content they are copying, leading to widespread errors and inconsistencies.

It depends on who you ask. Really. The politics of this are ugly. And to make matters worse, the specifications aren't fully stable yet. I would have thought that the two specifications would be largely the same in their parsing sections, since section 1.1.1, which lists the differences, does not mention parsing. But then I did a web diff and saw that there are subtle differences in the text. If you are actually implementing the specification, I would suggest raising any differences you see between the specs with the players involved, using the public mailing lists. Anyway, I am sorry I can't give you a clear-cut answer.

OK, I eventually came to my own conclusion and I'm going to share it.
I will follow the W3C version: blindly.
Politically speaking it's not a simple decision. Let me explain.
I was extremely sceptical about the W3C, and I possibly even hated their guts during the whole XHTML debate/debacle. I saw the rise of WHATWG as the arrival of our pragmatic saviours: people who openly admitted that HTML can't be made into a stiff, rigorous XML-derived language while the whole internet cares next to nothing about it.
So given this point of view I should go with the WHATWG spec, shouldn't I?
No. Why?
WHATWG doesn't establish official versions. I kind of wish they did, but they don't.
They feel versions are too rigid for their...let's say hip attitude.
They instead have only a live standard.
(and track implementation status of any single feature by major browsers)
But I'm not a major browser, I'm a small implementer, I cannot refer to a live standard.
Well, not unless I go crazy over it and release constantly, like there's no tomorrow.
(that's sort of what is happening with firefox and chrome)
So over never-ending frenetic madness, I have to choose sanity. And W3C offers polished and numbered versions of the spec. And I can claim to conform to one of those versions.

When in doubt, try to match the behavior of actual browsers. That's all that actually matters.
In general, WHATWG is probably more current than W3C, though it may include more things that browsers don't support (yet).
You can think of W3C as taking snapshots of WHATWG at given points in time, stabilizing them, and then hardening them, never to be changed.
W3C HTML5 was finalized 28 October 2014.
W3C HTML5.1 was finalized 1 November 2016.
W3C HTML5.2 is currently in its "Working Draft" and probably won't be finalized until 2019.

https://www.w3.org/html/ gives a clear answer to this old but still relevant question:
https://html.spec.whatwg.org/multipage/ is the current HTML standard.
It obsoletes all other previously-published HTML specifications.
As announced at
https://www.w3.org/blog/2019/05/w3c-and-whatwg-to-work-together-to-advance-the-open-web-platform/,
the W3C and the WHATWG signed an agreement to collaborate on the
development of a single version of the HTML and DOM specifications:
https://html.spec.whatwg.org/multipage/ is the single version of HTML being actively developed
https://dom.spec.whatwg.org/ is the single version of the DOM specification being actively developed
For further details about the W3C-WHATWG agreement, see the Memorandum of Understanding Between W3C and WHATWG.
The part "obsoletes all other previously-published HTML specifications" means that https://www.w3.org/TR/html52/ is considered obsolete.
P.S. The URL from the question, http://www.w3.org/TR/html5/syntax.html#parsing, redirects to https://html.spec.whatwg.org/multipage/parsing.html#parsing.

[Feb, 2023]
This issue seems to be definitively closed: the WHATWG's split from the W3C eventually forced the W3C to concede, as per this Wikipedia entry:
In 2009, the W3C conceded and abandoned XHTML[24] and in 2019, ceded
control of the HTML specification to the WHATWG.[25]

Related

Confused about using the right HTML specification

As a novice, I have to say it's really complicated to recognize the official and appropriate HTML specification.
After going through the freecodecamp certification, I felt a strong need to read the complete HTML specification, so that I would have accurate information about this topic and not be misled by information from unofficial sites like w3schools (even though it's quite elaborate stuff). Wikipedia led me to the WHATWG site, which has a large HTML document called the Living Standard. On the other hand, there is the W3C Recommendation, where you can read a specification, too.
I would like to ask only one thing:
What standard is used by browsers? What is the stuff I should learn from, so I will not have problems when coding later?
Today, http://w3.org/TR/html5/ redirects to https://html.spec.whatwg.org/multipage/
Browser support is variable. https://caniuse.com is a useful reference.

Does the HTML "Living Standard" later become the "Working Draft"?

I have been studying about WHATWG and W3C from the articles https://wiki.whatwg.org/wiki/FAQ and https://en.wikipedia.org/wiki/World_Wide_Web_Consortium.
From what I understood, I think the WHATWG is a part of the W3C. The WHATWG suggests new standards and features for HTML to browser vendors via the W3C. That is, the WHATWG first documents all the new specifications and calls this document the Living Standard, and then hands this document over to the W3C, which renames it a Working Draft. The WD is then made public and feedback is collected on it; after studying it seriously, W3C members decide whether all the suggested specs are appropriate, and then publish their own modified version as the Candidate Recommendation. The CR is sent to all browser vendors to check whether they can implement the specs mentioned. After receiving all the browser vendors' suggestions, they revise the whole document accordingly and publish it as the Proposed Recommendation. This PR is then given the final call, that is, everything is looked over once again for final verification, and finally, after any needed changes are made, the document is published as the W3C Recommendation.
My questions are:
Is the "Living Standard" of WHATWG and "Working Draft" of W3C one and the same thing?
Do browser vendors adhere to the "Living Standard" or "W3C Recommendation"?
Is the "Living Standard" of WHATWG and "Working Draft" of W3C one and the same thing?
No. A WHATWG Living Standard is continuously maintained, much like software is continuously maintained.
A W3C Working Draft is a snapshot publication on the road to W3C Recommendation, which is supposedly the final output (even though you can never really finish).
Do browser vendors adhere to the "Living Standard" or "W3C Recommendation"?
Whenever there is a choice between WHATWG and W3C for a document, browsers follow the WHATWG document, as it is actively maintained and more accurate.
W3C:
Working Draft.
Working Draft. Last Call.
Candidate Recommendation.
Proposed Recommendation.
Recommendation. (WHATWG Living Standard analogue)
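To make the ordering of those stages concrete, here is a minimal sketch that models them as an ordered enum. The names Maturity and remaining_stages are made up for this illustration and are not part of any W3C tooling; the stage list is simply the one given above.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """W3C maturity stages in the order listed above (illustrative only)."""
    WORKING_DRAFT = 1
    LAST_CALL_WORKING_DRAFT = 2
    CANDIDATE_RECOMMENDATION = 3
    PROPOSED_RECOMMENDATION = 4
    RECOMMENDATION = 5


def remaining_stages(current):
    """Return the stages a document still has to pass after `current`."""
    return [stage for stage in Maturity if stage > current]


# A spec sitting at "Last Call Working Draft" still has three stages to go.
for stage in remaining_stages(Maturity.LAST_CALL_WORKING_DRAFT):
    print(stage.name)
```

A WHATWG Living Standard has no such ladder: it is edited in place, so there is no stage to advance to.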

Why is HTML5 (or any HTML) a recommendation, not a specification?

The title of the question says it all. Is it a recommendation because browsers are free to do whatever they want anyway, i.e., they don't have to implement the recommendation and still allowed to call themselves HTML5 compliant or something like that?
I would assume that if this were a specification, that would imply that W3C or the WHATWG has authority to not allow some vendor to claim they are compliant.
Is that understanding correct? Can you add/clarify/correct me on this?
Thanks!
HTML5 is a specification - it says so in the first two words of its abstract. It's also a Recommendation.
HTML5.1 is also a specification. It says so at the start of its abstract too. But it is not a Recommendation, it's a Working Draft, or a "work in progress".
A Recommendation means a recommendation by the W3C - that is, the W3C organisation believes that its contents are sufficiently stable and reliable to recommend that you use it to guide you in your work, rather than using its earlier specifications, or other sources when guidance conflicts.
The W3C has no magical powers. There's no W3C police force, and no-one will break down your front door in the dead of night because you've used the <i> tag inappropriately. The specification is there to help you, and browser makers, by providing a common set of meanings and behaviours so that web users can all get the most from your content.

Is mediaStreamTrack in original w3c specification

I need to know whether "mediaStreamTrack" is defined in the original W3C standards/spec.
Or is it just some other API (de facto, maybe)? I tried to find it among the W3C standards on their site but couldn't. Of course I did find it in some drafts, but I guess a draft doesn't represent the official standards.
If it is in the standards could someone please provide me some link.
The "Media Capture and Streams" specification - https://www.w3.org/TR/mediacapture-streams/ - is intended to become a W3C Recommendation (it says so in the "Status of This Document" section).
It's just that the W3C process is quite comprehensive and thus relatively slow, and the status of the spec is "Last Call Working Draft" (as of April 2016) and it will need to go through several further stages before being called a W3C Recommendation.
By the way, I'm not sure why you care about this, since it seems an accepted fact that:
The power is with browser makers. As Ian Hickson of WHATWG said, it doesn’t matter what the specs say if browsers don’t implement them.
(Bruce Lawson - On HTML5 vs Living Standard, W3C vs WHATWG)

Where is the official HTML 5 API?

For JavaScript it seems easy. If you want to know the API for the language itself just consult ES5. For a library such as jquery just check out www.api.jquery.com.
But for HTML 5, where is the go-to place to look for the API for a specific tag?
Suppose I want to know the interface for <video>
My guess is
https://developer.mozilla.org/en-US/
but this is from the perspective of a company - Mozilla. Is there a published API by those that release the specs?
Can we use <video> as an example?
Here is one useful site I found that states that it parses the different specifications:
http://html5index.org/
but it looks like it is just for the JS portion.
I found it using this google search:
https://www.google.com/?gws_rd=ssl#q=html5+api&spell=1
I have been using w3schools because it has the best layout, but I've heard many on SO say not to use it.
If not, what is the go-to resource?
There is no official HTML5 API, or official HTML5, so far. What people regard as “de facto standard” is one of the following:
W3C HTML5 CR, a Candidate Recommendation, which means that it is not expected to change substantially before it becomes W3C Recommendation (which is as official as things like this ever get), except that some features marked as being “at risk” may be removed due to lack of implementation.
W3C HTML 5.1 Nightly, an Editor’s Draft, a further development of W3C HTML5. As the name says, it may and will change daily.
WHATWG HTML Living Standard. Largely compatible with the W3C documents but with some minor and some major differences. Apparently never expected to become any more official than it is now: a mutable document maintained by Ian "Hixie" Hickson and his orchestra (the WHATWG group).
Note that even the most official of these, HTML5 CR, says: “This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.” In reality, it’s more stable and closer to a “standard” than this may suggest.
All the documents mentioned above are incomplete in the sense that they cite many documents, e.g. DOM specifications and drafts, leaving essential parts to be defined in them. And the cited documents may be very mutable and even sketchy. For example, WHATWG URL Living Standard is cited, instead of the Internet-standard on URLs (URIs), and instead of the various old DOM specs and drafts, new emerging documents are cited. Currently, HTML5 CR cites W3C DOM4 CR.
Here's the HTML standard. It sounds like that's what you're looking for.
http://www.whatwg.org/specs/web-apps/current-work/multipage/
For the <video> example, here's the interface:
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#htmlvideoelement
Of course, a lot of the interesting things are in the HTMLMediaElement interface.
If you keep going into the super-interfaces, you'll find that it extends the Element interface, which is part of DOM.
http://dom.spec.whatwg.org/#interface-element
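As a rough illustration of that chain of super-interfaces, here is a small Python sketch. The classes are empty stand-ins named after the interfaces in the HTML and DOM standards; only the inheritance relationships are meant to mirror the specs, and the helper interface_chain is made up for this example.

```python
# Empty placeholder classes: only the inheritance order mirrors the specs.
class EventTarget: ...
class Node(EventTarget): ...
class Element(Node): ...
class HTMLElement(Element): ...
class HTMLMediaElement(HTMLElement): ...
class HTMLVideoElement(HTMLMediaElement): ...


def interface_chain(cls):
    """List interface names from the most specific to the most general."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(" -> ".join(interface_chain(HTMLVideoElement)))
# HTMLVideoElement -> HTMLMediaElement -> HTMLElement -> Element -> Node -> EventTarget
```

So to find everything a <video> element offers, you read the members of each interface in that chain, not just HTMLVideoElement itself.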
Another popular standard comes from the W3C:
http://www.w3.org/TR/html5/
Here is a list of differences provided by W3C:
http://www.w3.org/wiki/HTML/W3C-WHATWG-Differences
The W3C published its official Recommendation of HTML5 on 28 October 2014.
There you will find a complete reference for all HTML5 elements including the video element.