Confused about using the right HTML specification - html

As a novice, I have to say it's really complicated to recognize the official and appropriate HTML specification.
After going through the freeCodeCamp certification, I felt a strong need to read the complete HTML specification, so that I would have accurate information on the topic and not be misled by unofficial sites like w3schools (even though its material is quite elaborate). Wikipedia led me to the WHATWG site, which has a large HTML document called the Living Standard. On the other hand, there is the W3C Recommendation, where you can read a specification, too.
I would like to ask only one thing:
What standard is used by browsers? What is the stuff I should learn from, so I will not have problems when coding later?

Today, http://w3.org/TR/html5/ redirects to https://html.spec.whatwg.org/multipage/
Browser support is variable. https://caniuse.com is a useful reference.

Related

Why is HTML5 (or any HTML) a recommendation, not a specification?

The title of the question says it all. Is it a recommendation because browsers are free to do whatever they want anyway, i.e., they don't have to implement the recommendation and are still allowed to call themselves HTML5 compliant or something like that?
I would assume that if this were a specification, that would imply that W3C or the WHATWG has authority to not allow some vendor to claim they are compliant.
Is that understanding correct? Can you add/clarify/correct me on this?
Thanks!
HTML5 is a specification - it says so in the first two words of its abstract. It's also a Recommendation.
HTML5.1 is also a specification. It says so at the start of its abstract too. But it is not a Recommendation, it's a Working Draft, or a "work in progress".
A Recommendation means a recommendation by the W3C - that is, the W3C organisation believes that its contents are sufficiently stable and reliable to recommend that you use it to guide you in your work, rather than using its earlier specifications, or other sources when guidance conflicts.
The W3C has no magical powers. There's no W3C police force, and no-one will break down your front door in the dead of night because you've used the <i> tag inappropriately. The specification is there to help you, and browser makers, by providing a common set of meanings and behaviours so that web users can all get the most from your content.

Where is the official HTML 5 API?

For JavaScript it seems easy. If you want to know the API for the language itself just consult ES5. For a library such as jquery just check out www.api.jquery.com.
But for HTML 5, where is the go-to place to look for the API for a specific tag?
Suppose I want to know the interface for <video>
My guess is
https://developer.mozilla.org/en-US/
but this is from the perspective of a company - Mozilla. Is there a published API by those that release the specs?
Can we use <video> as an example?
Here is one useful site I found that states that it parses the different specifications:
http://html5index.org/
but it looks like it is just for the JS portion.
I found it using this google search:
https://www.google.com/?gws_rd=ssl#q=html5+api&spell=1
I have been using w3schools because it has the best layout, but I've heard many on SO say not to use it.
If not, what is the go to resource?
There is no official HTML5 API, or official HTML5, so far. What people regard as “de facto standard” is one of the following:
W3C HTML5 CR, a Candidate Recommendation, which means that it is not expected to change substantially before it becomes W3C Recommendation (which is as official as things like this ever get), except that some features marked as being “at risk” may be removed due to lack of implementation.
W3C HTML 5.1 Nightly, an Editor’s Draft, a further development of W3C HTML5. As the name says, it may and will change daily.
WHATWG HTML Living Standard. Largely compatible with the W3C documents but with some minor and some major differences. Apparently never expected to become any more official than it is now: a mutable document maintained by Ian “Hixie” Hickson and his orchestra (the WHATWG group).
Note that even the most official of these, HTML5 CR, says: “This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.” In reality, it’s more stable and closer to a “standard” than this may suggest.
All the documents mentioned above are incomplete in the sense that they cite many documents, e.g. DOM specifications and drafts, leaving essential parts to be defined in them. And the cited documents may be very mutable and even sketchy. For example, WHATWG URL Living Standard is cited, instead of the Internet-standard on URLs (URIs), and instead of the various old DOM specs and drafts, new emerging documents are cited. Currently, HTML5 CR cites W3C DOM4 CR.
Here's the HTML standard. It sounds like that's what you're looking for.
http://www.whatwg.org/specs/web-apps/current-work/multipage/
For the <video> example, here's the interface:
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#htmlvideoelement
Of course, a lot of the interesting things are in the HTMLMediaElement interface.
If you keep going into the super-interfaces, you'll find that it extends the Element interface, which is part of DOM.
http://dom.spec.whatwg.org/#interface-element
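To make the "keep going into the super-interfaces" step concrete, here is a small sketch that walks the chain for <video>. The parent table is typed in by hand from the `interface X : Y` declarations in the HTML and DOM standards; the helper name `inheritance_chain` is made up for this illustration and is not part of any API.

```python
# Parent of each interface, copied by hand from the IDL declarations
# ("interface X : Y") in the HTML and DOM standards.
PARENT = {
    "HTMLVideoElement": "HTMLMediaElement",
    "HTMLMediaElement": "HTMLElement",
    "HTMLElement": "Element",
    "Element": "Node",
    "Node": "EventTarget",
    "EventTarget": None,  # top of the chain
}


def inheritance_chain(interface):
    """Return the interface followed by all of its super-interfaces."""
    chain = []
    while interface is not None:
        chain.append(interface)
        interface = PARENT[interface]
    return chain


print(" -> ".join(inheritance_chain("HTMLVideoElement")))
```

Members such as play() and currentTime are declared on HTMLMediaElement, which is why the HTMLVideoElement section itself looks so short.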
Another popular standard comes from the W3C:
http://www.w3.org/TR/html5/
Here is a list of differences provided by the W3C:
http://www.w3.org/wiki/HTML/W3C-WHATWG-Differences
The W3C published its official Recommendation of HTML5 on 28 October 2014.
There you will find a complete reference for all HTML5 elements including the video element.

HTML Validator for websites

I have website templates and want to validate their HTML. I've heard that there is a "w5" or something like that; I don't remember the name of the validator that checks for HTML errors. Can anyone post a link to it?
http://validator.w3.org/ This is the standard.
This checks the markup validity of Web documents in HTML, XHTML, SMIL, MathML, etc.
For things like CSS validity you should take a look at: http://jigsaw.w3.org/css-validator/
The website you are looking for is W3 Markup Validation Service (http://validator.w3.org/). The W3 stands for World Wide Web.
This depends on what you expect validation to be. The W3C Markup Validation Service http://validator.w3.org/ is the usual answer, but it answers a question different from what people usually mean when they ask for validation.
It is originally based on the SGML concept of validation, which is a purely formal process and does not check for correctness in general, only syntax rules specified in a formalized way. In addition to detecting actual markup errors, it issues messages about violations of formal syntax that do no harm in any browser (e.g. title=Hello! without quotation marks) and accepts constructs that do not work in any browser (say <em/Hello). More info: “HTML validation” is a good tool, but just a tool.
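As a rough illustration of the "does no harm" case, here is a sketch that feeds the unquoted title=Hello! example to a tolerant parser. It uses Python's standard html.parser, which is not a browser engine and not a validator, so treat it only as an indication of how forgiving parsers recover the value. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def parse_start_tags(markup):
    """Parse markup and return a list of (tag, attributes) pairs."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# The attribute value is unquoted, as in the example above.
print(parse_start_tags("<p title=Hello!>text</p>"))
```

The parser still reports the title attribute with the value Hello!, even though an SGML-style validator would flag the missing quotation marks.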
The W3C Markup Validation Service isn’t really a pure SGML validator (or a pure XML validator) but has some checks and warnings beyond that. There’s a more pure SGML validator: WDG HTML Validator.
When processing a document declared as HTML5, using <!DOCTYPE html>, the W3C Markup Validation Service changes its nature: it becomes an ad hoc checker for specifically HTML5, behaving rather differently from an SGML validator. It is described as experimental, and it does not necessarily correspond to the most recent version of HTML5 (or of “HTML Living Standard”). Technically it is based on Validator.nu Living Validator.
With very small exceptions, these services do not check for the correctness of embedded or linked CSS and JavaScript at all, or links, semantic correctness, usability, accessibility, etc. So a “valid” page may look like a complete mess and fail to work at all.
There is also software marketed as “HTML validator” (as you can find out by googling), typically performing a large set of heuristic checks, partly based on HTML specifications, partly on other specs, and partly by software vendor’s own ideas of what is correct.

HTML5: W3C vs WHATWG. Which gives the most authoritative spec?

I'm halfway through writing an html parser and found that html5 explicitly defines the rules for parsing ill-formed html. (And I used to infer them from DTDs, sigh)
I love that fact, but I know well that html5 isn't finalized yet (also I wonder if it ever will) and that it isn't developed by the W3C, but by the WHATWG.
Searching for the spec I need I'm presented with:
8.2 section of the W3C TR
http://www.w3.org/TR/html5/syntax.html#parsing
or
11.2 section of the WHATWG web-apps/current-work
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
If it weren't for the section numbers I would assume those are simply the same. But the different numbering makes me wonder. Which version is, supposedly, the most authoritative?
WHATWG seems to have more sections, and to have been added to since W3C uploaded its candidate recommendation.
Will W3C update to the WHATWG version?
Or will they stick to their current candidate until it gets to the official recommendation status?
Which html5 spec are we poor devils supposed to follow, when in doubt?
Always choose WHATWG over W3C, no exceptions.
Anne van Kesteren (a WHATWG member who was a major contributor to the HTML specification prior to the WHATWG and W3C versions diverging, and who remains a major contributor to the WHATWG specification) describes the current situation between WHATWG and W3C as follows on his blog:
The W3C has forked the [WHATWG] HTML Standard for the nth time. As always, it is pretty disastrous:
Erased all Git history of the document.
Did not document how they transformed the document. Issues of mismatches have already been reported and it will likely be a long time, if ever, before all bugs due to this process are uncovered, since it was not open.
Did not discuss plans with the wider community.
Did not discuss plans with the folks they were forking from.
Did not even discuss plans with the members of the W3C Web Platform Working Group.
Erased the acknowledgments section.
Erased the copyright and licensing information and replaced it with their own.
2019: The war is finally over
On May 28th, 2019, the W3C and the WHATWG signed an agreement to collaborate on a single, authoritative version of the HTML and DOM specifications.
According to W3C's statement, the two parties have come to the following terms:
W3C and WHATWG work together on HTML and DOM, in the WHATWG repositories, to produce a Living Standard and Recommendation/Review Draft-snapshots
WHATWG maintains the HTML and DOM Living Standards
W3C facilitates community work directly in the WHATWG repositories (bridging communities, developing use cases, filing issues, writing tests, mediating issue resolution)
W3C stops independent publishing of a designated list of specifications related to HTML and DOM and instead will work to take WHATWG Review Drafts to W3C Recommendations
Biased answer from an editor of WHATWG HTML here. Hopefully the facts can speak for themselves though.
The WHATWG Living Standard should be considered authoritative. It is constantly worked on by a large community of contributors, including all browser vendors. No browser vendors implement according to W3C HTML; for some such as Firefox and Chrome this is a matter of publicly stated policy.
The WHATWG Living Standard is constantly receiving bug fixes and new features. For more information on this model of spec development, which more closely matches modern software development practices, see What does "Living Standard" mean?.
Unfortunately, the W3C sometimes copies and pastes our work onto their own website, and puts their own logo on it, and changes the names of the editors, and such. They do this for a variety of reasons, one of the largest of which is face-saving for the sake of their paying member companies (example of them stating this). What's worse, they like to release "versions" (like HTML "5.0", "5.1", etc.) which are just outdated versions missing modern bug fixes and features that clog up search result pages, causing confusion like this very question. We are currently tracking the confusion caused by these forks, of which HTML is only one.
You can track their progress on the copy-and-paste job in their issue tracker or in commits such as this one. It's a fun game to spot the bugs they introduce while doing this copy-and-paste job, as they generally do not read or understand the content they are copying, leading to widespread errors and inconsistencies.
It depends on who you ask. Really. The politics of this are ugly. And to make matters worse, the specifications aren't fully stable yet. I would have thought that the two specifications would be largely the same in their parsing sections, since section 1.1.1, which lists the differences, does not mention parsing. But then I did a web diff and saw that there are subtle differences in the text. I would say that if you are actually implementing the specification, you should talk to the players involved about any differences you see between the specs, using the public mailing lists. Anyway, I am sorry I can't give you a clear-cut answer.
OK , I eventually came to my own conclusion and I'm gonna share it.
I will follow the W3C version: blindly.
Politically speaking it's not a simple decision. Let me explain.
I was extremely sceptical about the W3C, and I possibly even hated their guts during the whole XHTML debate/debacle. I saw the rise of WHATWG as the arrival of our pragmatic saviours: people who openly admitted that HTML can't be made into a stiff, rigorous XML-derived language when hardly anyone on the internet bothers about it.
So given this point of view I should go with the WHATWG spec, shouldn't I?
No. Why?
WHATWG doesn't establish official versions. I kind of wish they did, but they don't.
They feel versions are too rigid for their...let's say hip attitude.
They instead have only a live standard.
(and track implementation status of any single feature by major browsers)
But I'm not a major browser, I'm a small implementer, I cannot refer to a live standard.
Well, not unless I go crazy over it and release constantly, like there's no tomorrow.
(that's sort of what is happening with firefox and chrome)
So over never-ending frenetic madness, I have to choose sanity. And W3C offers polished and numbered versions of the spec. And I can claim to conform to one of those versions.
When in doubt, try to match the behavior of actual browsers. That's all that actually matters.
In general, WHATWG is probably more current than W3C, though it may include more things that browsers don't support (yet).
You can think of W3C as taking snapshots of WHATWG at given points in time, stabilizing them, and then hardening them, never to be changed.
W3C HTML5 was finalized 28 October 2014.
W3C HTML5.1 was finalized 1 November 2016.
W3C HTML5.2 is currently in its "Working Draft" and probably won't be finalized until 2019.
https://www.w3.org/html/ gives a clear answer to this old but still relevant question:
https://html.spec.whatwg.org/multipage/ is the current HTML standard.
It obsoletes all other previously-published HTML specifications.
As announced at https://www.w3.org/blog/2019/05/w3c-and-whatwg-to-work-together-to-advance-the-open-web-platform/, the W3C and the WHATWG signed an agreement to collaborate on the development of a single version of the HTML and DOM specifications:
https://html.spec.whatwg.org/multipage/ is the single version of HTML being actively developed
https://dom.spec.whatwg.org/ is the single version of the DOM specification being actively developed
For further details about the W3C-WHATWG agreement, see the Memorandum of Understanding Between W3C and WHATWG.
The part "obsoletes all other previously-published HTML specifications" means that https://www.w3.org/TR/html52/ is considered obsolete.
P.S. The URL from the question, http://www.w3.org/TR/html5/syntax.html#parsing, redirects to https://html.spec.whatwg.org/multipage/parsing.html#parsing.
[Feb, 2023]
This issue seems to be closed definitively: the WHATWG's abandonment of the W3C forced the W3C to concede, as per this Wikipedia entry:
In 2009, the W3C conceded and abandoned XHTML and in 2019, ceded control of the HTML specification to the WHATWG.

Yet another question regarding the html5 dtd/schema

If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation? And by document validation, I mean "how are we supposed to ensure our html5 documents are both syntactically accurate and structurally sound?" Please help! This is going to become a huge problem for our industry if we have no way to accurately validate HTML5 documents!
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax? I mean, it seems rather rudimentary to just look at the document and say "yep. that's a valid xml document." What about custom tags? What about tag attributes? It seems like the W3C is leaving us out in the cold a little bit here.
Maybe the best answer will be found in the HTML editor. But then you get DTD/schema fragmentation. Each editor vendor coming up with their own rendition of what a valid structure is.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
So, any thoughts?
If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation?
With a specialized HTML5 validator rather than a generic SGML or XML validator.
Obviously, as the specification is still in draft form, the tools that do exist are immature and likely to be out of date or become out of date.
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax?
Either use a different tool or download the W3C validator and run a local copy. It has a SOAP API so writing a batch validation tool isn't difficult.
What about custom tags?
HTML5 doesn't allow custom elements.
What about tag attributes?
The only custom attributes in HTML5 are data-* attributes, so an HTML 5 validator can recognize them.
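For what it's worth, the naming rule for data-* attributes is simple enough to check yourself. Below is a minimal sketch, assuming the rule as I understand it from the HTML specification: the name starts with "data-", has at least one character after the hyphen, and contains no ASCII uppercase letters. The "XML-compatible" part of the rule is simplified here to an ASCII-only subset, so names using non-ASCII characters are wrongly rejected; the function name is made up for the example.

```python
import re

# Simplifying assumption: only ASCII lowercase letters, digits, ".", "_"
# and "-" are accepted after the "data-" prefix. The real rule also
# allows non-ASCII name characters.
_DATA_ATTR = re.compile(r"data-[a-z0-9._-]+")


def is_custom_data_attribute(name):
    """Return True if name looks like a valid data-* attribute name."""
    return _DATA_ATTR.fullmatch(name) is not None


for candidate in ("data-user-id", "data-", "data-UserId", "user-id"):
    print(candidate, is_custom_data_attribute(candidate))
```

Anything that fails this check and is not a standard attribute is what a validator would report as an unknown attribute.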
It seems like the W3C is leaving us out in the cold a little bit here.
It seems like you expect the state of QA tools for HTML 5 (unfinished) to be up to the same standard as those for HTML 4 (over a decade old). This isn't a realistic expectation.
Maybe the best answer will be found in the HTML editor. But then you get DTD/schema fragmentation. Each editor vendor coming up with their own rendition of what a valid structure is.
The specification is clear (although in flux) even if it isn't expressed in the form of a DTD or schema. If each editor has a different idea of what is valid, then most or all of them are going to be either out of date or just buggy.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
If you need to live in the bleeding edge, then you have to accept the limitations and risks of doing so.
You might find this question/answer interesting: Will HTML 5 validation be worth the candle? . The answer is written by the developer of http://about.validator.nu/ .
You should start by taking a look at http://about.validator.nu/ .
Some, though not all, of your concerns are addressed there. You can host your own validator, there's a Python-based submission script, you can use a RESTful web service API, and there are ways to get validation output in a variety of different forms.
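A batch run against the web service API could look roughly like the sketch below. It assumes the checker accepts a POST of the raw document with a text/html content type and returns JSON when out=json is passed, with a "messages" list whose entries carry "type" and "message" fields. The endpoint URL, the User-Agent string, and all function names are assumptions for illustration; for hundreds of pages you would point CHECKER_URL at your own instance rather than a public one.

```python
import json
import urllib.request
from pathlib import Path

# Assumption: a publicly hosted instance of the checker. Replace with the
# URL of a self-hosted instance for bulk work.
CHECKER_URL = "https://validator.w3.org/nu/?out=json"


def build_request(html_bytes, checker_url=CHECKER_URL):
    """Build the POST request that submits one document to the checker."""
    return urllib.request.Request(
        checker_url,
        data=html_bytes,
        headers={
            "Content-Type": "text/html; charset=utf-8",
            "User-Agent": "batch-html-check-sketch/0.1",  # hypothetical name
        },
        method="POST",
    )


def summarize(response_text):
    """Return the error messages found in a JSON response from the checker."""
    messages = json.loads(response_text).get("messages", [])
    return [m.get("message", "") for m in messages if m.get("type") == "error"]


def check_file(path):
    """Submit one file over the network and return its error messages."""
    request = build_request(Path(path).read_bytes())
    with urllib.request.urlopen(request) as response:
        return summarize(response.read().decode("utf-8"))


def check_tree(root):
    """Check every *.html file under root; return {path: [errors]}."""
    results = {}
    for path in sorted(Path(root).rglob("*.html")):
        results[str(path)] = check_file(path)
    return results
```

Calling check_tree("site/") (a hypothetical directory) would then give you a dictionary you can print or turn into a non-zero exit code in a build script. No throttling or retry logic is included, so add some before pointing this at a shared service.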
I can't however see a simple way to integrate XHTML5 with other applications of XML such that one can easily create a validator of such compound documents. Not that there's really been a way to do that with earlier versions of XHTML either though.
This is working well for me: https://github.com/hober/html5-el
To get this to work, I renamed the default '/etc/schema/schemas.xml' file in order to move it out of the way and let the 'html5-el' one be used by nxml-mode.
If there is no DTD or schema to validate the H5 document against, how are we supposed to do document validation? And by document validation, I mean "how are we supposed to ensure our html5 documents are both syntactically accurate and structurally sound?" Please help! This is going to become a huge problem for our industry if we have no way to accurately validate HTML5 documents!
If testing pages with either Firefox or Opera, both of those will report errors such as code that is not "well-formed" and mismatched tags. Beyond that, one of the validators such as validator.w3.org or validator.nu will definitely help.
Sure, the W3C has an online tool that validates individual pages. But, if I'm creating A LOT of pages (hundreds, say) and I want to validate them in a sort of batch mode, what is the accepted method of ensuring valid structure and syntax? I mean, it seems rather rudimentary to just look at the document and say "yep. that's a valid xml document."
There are ways to run the W3C validator in batch mode.
What about custom tags? What about tag attributes? It seems like the W3C is leaving us out in the cold a little bit here.
The easy answer to that one is that "custom tags" are simply not considered valid. The Working Group has thoroughly addressed the issue of "distributed extensibility", particularly with respect to allowing "decentralized parties to create their own languages" and "extension attributes" (http://lists.w3.org/Archives/Public/public-html/2011Feb/0085.html). There are numerous ways to extend HTML (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#extensibility) but adding custom tags is not one of them. Custom data and microdata attributes should validate fine.
Maybe the answer is "wait for HTML5 to become official". But I really can't wait for that. I need to start creating and validating content now. I have applications I want to publish that can only be accomplished with html5.
Since HTML 5 was stabilized at the end of last year (Dec. 2010), IMO we don't need to wait for it to become an official "recommendation" by the W3C. The stabilized spec provides a solid base that all browser vendors can implement consistently, and a base for the spec's ongoing evolution beyond HTML 5, which is now being called the "HTML Living Standard" (Jan. 2011 and later). There is a good diagram of this at http://www.HTML-5.com/html-versions-and-history.html#html-versions (scroll down to see the diagram).