XHTML 5 standard check (HTML + CSS) - html

My web page must be strictly developed using XHTML 5 standard. How can I check it?

W3C validator:
http://validator.w3.org/

You can use the W3C validator here: http://validator.w3.org/
On a side note: there is no such thing as an XHTML5 standard. The last XHTML version in development was XHTML 2, and work on it has ended. The current standard is simply called HTML5.

The W3C has an experimental HTML 5 validation engine that you might want to check out. But since the standard is still in development, I don't think you'll find any definitive validation engines just yet.
Please also keep in mind that there will be no official XHTML5 standard. HTML 5 will support two formats, one which uses strict XML syntax, and another which uses regular HTML syntax, which is somewhat looser in that it doesn't mandate closing tags or capitalization rules.
To get what you're looking for, you might want to run two different validations: one to check that your document is a well-formed XML document, and another that runs it through the HTML 5 validation engine to check for non-conforming tags, etc.
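The first of those two checks (XML well-formedness) can be approximated locally. The following is only a minimal sketch using Python's standard library; the function name and the two sample documents are made up for illustration, and it checks well-formedness only, not HTML5 conformance:

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if the markup parses as XML (well-formedness only)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample: XML syntax, every element closed.
xhtml_style = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>Line1<br/>Line2</p></body></html>"
)

# Hypothetical sample: plain HTML syntax, <br> left unclosed.
html_style = (
    "<html><head><title>t</title></head>"
    "<body><p>Line1<br>Line2</p></body></html>"
)

print(is_well_formed_xml(xhtml_style))
print(is_well_formed_xml(html_style))
```

A document that passes this check still needs to go through an HTML5 validator such as http://validator.nu/ for the second half of the job.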

By comparing your code against the rules for XHTML serialization in the HTML5 specification, as soon as it becomes official (allow ten years for delivery). Meanwhile, the so-called HTML5 validators, such as http://validator.nu and the W3C service based on it may be useful, but a) they are known to be incomplete and can probably never check all aspects of the rules, b) they do not necessarily reflect the most recent HTML5 draft, and c) the drafts themselves are work in progress and may be changed at any moment without prior notice.

I recommend using http://validator.nu/.

Note what Jukka says, but with regard to using either validator.nu or the W3C HTML validator:
If you want to validate a page at a URL as XHTML with the W3C HTML validator, and the page has <!DOCTYPE html> as its doctype, then you must serve the page with an XML mime type such as application/xhtml+xml to the validator.
This is a good thing. Only if you use such a MIME type will browsers treat your XHTML as XHTML; otherwise they will treat it as HTML, and all your careful XHTML will be so much tag soup. With HTML5, validators now behave the same way as browsers.
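As a rough illustration of that rule, here is a simplified sketch of how the Content-Type header, not the doctype, decides which parser applies. The function name and the exact set of MIME types are assumptions made for the example; real browsers and validators apply more detailed rules:

```python
# Assumption for this sketch: these types (and anything ending in "+xml")
# are handled by an XML parser; everything else is treated as HTML.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def processing_mode(content_type: str) -> str:
    """Return 'xml' or 'html' for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MIME_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "html"


print(processing_mode("application/xhtml+xml; charset=utf-8"))
print(processing_mode("text/html"))
```

The same <!DOCTYPE html> document therefore lands in different modes depending only on how it is served.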

Related

Declaring Doctype HTML5 [duplicate]

The "old" HTML/XHTML standards have a DTD (Document Type Definition) defined for them:
HTML 4.01 http://www.w3.org/TR/html401/sgml/dtd.html
XHTML 1.0 http://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Strict
These DTDs specify the rules for nesting elements - "which types of elements may appear in which types of elements". I made a diagram for XHTML 1.0 here (sorry, I no longer have that resource)
I would like to update that diagram with a new version which also includes the new HTML5 elements. However, there doesn't seem to be an HTML5 DTD. It seems that the nesting rules are defined by the various content models that are defined in HTML5.
So there is no DTD, correct?
Follow-up question: Is there a reason why there is no DTD in HTML5? The DTD is such a nice method of defining the nesting rules for all the different types of elements. Why wouldn't they include such a thing?
Update: I found this: http://www.w3.org/TR/html5/dom.html#kinds-of-content I guess, this is the closest to having a DTD.
Update: The Visual Studio Team made an XML Schema for XHTML5. I guess that answers my question: Link
There is no HTML5 DTD. The HTML5 RC explicitly says this when discussing XHTML serialization, and this clearly applies to HTML serialization as well.
DTDs have been regarded by the designers of HTML5 as too limited in expressive power, and HTML5 validators (basically the HTML5 mode of http://validator.nu and its copy at http://validator.w3.org/nu/) use schemas and ad hoc checks, not DTD-based validation.
Moreover, HTML5 has been designed so that writing a DTD for it is impossible. For example, there is no SGML way to capture the HTML5 rule that any attribute name that starts with “data-” and complies with certain general rules is valid. In SGML, attributes need to be listed individually, so a DTD would need to be infinite.
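To make the point concrete: the "data-" rule describes an open-ended family of names by a pattern, which is easy to express as a predicate but cannot be written as a finite attribute list. The sketch below is a deliberately simplified approximation of the rule (the regular expression and function name are assumptions for illustration, not the specification's exact definition):

```python
import re

# Simplified approximation: "data-" followed by at least one character,
# with no ASCII uppercase letters, whitespace, quotes, ">", "/" or "=".
DATA_ATTR = re.compile(r"data-[^\sA-Z\"'>/=]+")


def is_data_attribute(name: str) -> bool:
    """Return True if the name looks like a custom data attribute."""
    return DATA_ATTR.fullmatch(name) is not None


for candidate in ("data-user-id", "data-x", "data-", "data-UserId", "title"):
    print(candidate, is_data_attribute(candidate))
```

A DTD has no equivalent of such a pattern; it can only declare each attribute name one by one.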
It is possible to design DTDs that correspond to HTML5 with some omissions and perhaps with some extra rules imposed, but they won’t really be HTML5 DTDs. My experiment with the idea is not very encouraging: too many limitations, too tricky, and the DTD would need to be so permissive that many syntax errors would go uncaught.
Correct. There is no DTD. However, HTML5 documents should start with <!DOCTYPE html>
So there's a DOCTYPE, but no DTD.
See:
http://dev.w3.org/html5/spec/syntax.html#the-doctype
http://en.wikipedia.org/wiki/Document_Type_Declaration#HTML5_DTD-less_DOCTYPE
I have created an HTML5 DTD for use in my PHP XML projects. It ain't beautiful, but it works with well-formed XHTML5 (that is, HTML5 expressed as XML).
You can grab it from my bitbucket account here:
https://bitbucket.org/kashbridge/dtd/overview
Enjoy!
A certain Marcus from sgmljs.net created and analyzed an SGML DTD for HTML 5.1 and started a thread on the XML-DEV mailing list for review and discussion. So far, the discussion has revolved around entity definitions.
I've just completed my analysis of W3C's HTML 5.1 recommendation at
http://sgmljs.net/docs/html5.html (from a markup language rather than
web development PoV), and I'm publishing it here for review in the
form of an initial SGML DTD for parsing HTML 5.1, along with a lengthy
analysis text.
[…]
I'm aware that WHATWG and W3C have since long moved away from SGML
(and XML in most web-related specification work), treating it as a
legacy technique and with a somewhat presumptuous attitude in the
specification text and elsewhere. But as the analysis of HTML5's
grammar shows, they've essentially abandoned use of any formal methods
alltogether (and it shows in at least two flaws discussed in the
analysis).
Nothing official yet, but maybe this initiative will get traction, or at least find its users as an unofficial resource.
I think they did away with the old DTDs, now we just start HTML pages with: <!DOCTYPE HTML>
Maybe the W3C will come out with one eventually.

How exactly declaring page doctype as HTML5 reduces the error in W3C markup validation

I have a website with around 1000 pages. I declared the doctype of every page as XHTML 1.0 Strict.
I checked the pages with the W3C markup validation tool and got 320 errors. Then I changed the doctype to HTML 4.0, and the errors dropped to 300.
Then I used the HTML5 doctype, and the errors dropped to 75. How can the number of errors fall just by changing the doctype?
EDIT
My Question is:
1) Validating my pages against the XHTML 1.0 standard gives me more than 300 errors, which is a lot and would be difficult to resolve.
2) Validating my pages against the HTML5 standard gives me around 70 errors, which is not an issue; I can resolve them easily.
So which HTML version should I use so that it does not affect the SEO of the pages? I ask because W3C validation also affects SEO.
If I just use the HTML5 doctype but not the HTML5 page structure (nav, header, section, footer, article, ...), does that really matter? I have around 1000 pages, and it would be very difficult to make them all follow the HTML5 page structure.
My plan for reducing the W3C errors is to change the doctype to HTML5 and then resolve the remaining errors. Is this a good idea? If you have a better one, please suggest it.
As @Quentin says, there are many differences between XHTML 1.0 Strict and HTML5. Apart from the new tags, there are other significant differences. Some examples:
1 - All XHTML tags and attributes should be written in lower case.
Are there any uppercase tags or attributes in your code?
2 - In XHTML, when you use a singleton tag like <br/> you are
required to include a trailing slash in the element for valid XHTML.
In HTML 5, the trailing slash is optional.
Have you self-closed the singleton tags?
3 - All XHTML attribute values must be quoted. In HTML5, you don’t
need to place quotation marks around attribute values if there are no
spaces.
Are your attribute values properly quoted?
4 - All the XHTML tags must be nested properly.
Is this your case?
5 - The HTML5 <meta> tag with the charset attribute is simpler than in
XHTML: <meta charset=utf-8>
If you're using this tag your document fails in XHTML
6 - There’s also no need to include the Type Attribute for Style Sheet
Links and Scripts.
If you didn't declare this attribute, your document fails in XHTML
These are a few examples of how different the validation results can be when you simply change the doctype. You could check these points to see whether any of them applies to your case.
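Point 1 in the list above, for instance, is easy to screen for before running the validator. This is a rough regex-based sketch with a made-up function name; it does not parse HTML properly, so comments or script content containing "<" may produce false positives:

```python
import re

# Matches the name in an opening or closing tag, e.g. "<DIV" or "</P".
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def uppercase_tags(source: str) -> list[str]:
    """Return the distinct tag names that are not written in lower case."""
    names = TAG_NAME.findall(source)
    return sorted({name for name in names if name != name.lower()})


# Hypothetical snippet mixing upper-case and lower-case tags.
sample = "<DIV><p>hi</P><Br/></DIV>"
print(uppercase_tags(sample))
```

Any name it reports would be an error under an XHTML doctype but is accepted by the HTML5 doctype when the page is served as text/html.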
You can retrieve all the info here: Baby steps from XHTML to HTML5
I will just change the doctype to HTML5 and resolve the w3c errors. Is this a good idea?
Well, HTML5 is "easier" to write because it is more flexible, but this is a decision you should make before you start building the website. I suggest you read the W3C specifications for XHTML 1.0 and HTML5, and then decide which language fits your requirements better and how to code it so that your markup is valid.
Because, quite simply, different versions of HTML are different and allow different things.
<video> for example is new in HTML 5 so will error in HTML 4.
Poor code is poor code, regardless of doctype. You will see fewer errors when validating with an html5 doctype because html5 as a spec is much less rigid in how it defines html to be structured.
Google doesn't validate pages. That said, better markup can help a search engine to better understand your website. Although if you're just changing the doctype and not cleaning up the poor code, it's not going to have an effect.
This happens because XHTML is checked with an XML parser, which demands stricter syntax. I've found that <!DOCTYPE html> is much more tolerant, perhaps because it selects a standard that is still in development (that last part is more my guess than established fact).

Is there any deprecated elements and properties checker (according to w3c) like w3c validator?

Is there any deprecated elements and properties checker (according to w3c) like w3c validator?
I don't know of any checker that lists them for you, but the Web Developer Toolbar for Firefox does that under the "Outline » Outline Deprecated Elements" command. What it does is, well, visually outline the faulty elements in your page.
You can also check which elements/attributes are deprecated in HTML. W3C is a great place to start: http://www.w3.org/TR/html4/index/elements.html
The same goes for XHTML; Google will gladly provide URLs for checking the list of deprecated elements and attributes.
Of course, you can always validate with the W3C validators and check the error messages for "deprecated" when using a strict doctype.
I'd have provided the links, but as I'm a new user, I'm only allowed one link per post. ;p
Set a Strict doctype and run it through the W3C validator, it should trip up hard on deprecated elements (if it's in Transitional/Frameset and not in Strict, it's deprecated.)
Edit: If you are using XHTML, you can use a standard issue XML validator with the XHTML 1.1 or 1.0 Strict DTD.
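If you only want a quick local report rather than a full validation, a small script can flag the elements that HTML 4.01 marks as deprecated. This is a minimal sketch using Python's standard html.parser; the class and function names are made up for illustration, and it covers deprecated elements only, not deprecated attributes:

```python
from html.parser import HTMLParser

# Elements marked as deprecated in the HTML 4.01 element index.
DEPRECATED_HTML4 = {
    "applet", "basefont", "center", "dir", "font",
    "isindex", "menu", "s", "strike", "u",
}


class DeprecatedElementFinder(HTMLParser):
    """Collects (tag, line number) pairs for deprecated start tags."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[tuple[str, int]] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already converted to lower case.
        if tag in DEPRECATED_HTML4:
            self.found.append((tag, self.getpos()[0]))


def find_deprecated(source: str) -> list[tuple[str, int]]:
    finder = DeprecatedElementFinder()
    finder.feed(source)
    finder.close()
    return finder.found


# Hypothetical page fragment.
page = '<p><font color="red">x</font></p>\n<center>y</center>'
for tag, line in find_deprecated(page):
    print(f"line {line}: <{tag}> is deprecated in HTML 4.01")
```

Validating against a Strict doctype remains the authoritative check; this only gives a fast first pass.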

Which one I have to follow for W3 Validation?

The W3C validators offer many options:
for HTML - XHTML 1.0, XHTML Transitional, XHTML Strict, and
for CSS - CSS 2.0, CSS 3.0 ...
Which one should I follow? Any suggestions, please.
Validate as whichever you've targeted your document to. If you're producing HTML, validate as that. Likewise, if you're writing XHTML, validate as that. Go read the w3 docs on each spec to decide which you want to follow.
HTML5, HTML4 Strict or, if you have to, XHTML1 Strict. Don't fall into the trap of Transitional versions for new sites, you'll regret it later.
I'd use CSS3, but CSS2.1 is alright as well.
I'd personally say:
HTML 4.01 Strict
CSS 2.1
This is because I (and many others) feel XHTML is flawed, which is why the W3C is stopping development of the XHTML2 standard in favor of HTML5 (which you won't be using for another couple of years because of the lack of browser support).
Another problem with XHTML is that it should be sent with a MIME type of application/xhtml+xml; however, Internet Explorer 6 (maybe 7 and 8 too, not sure) does not render the page when it is sent as application/xhtml+xml. So you'll either need to send text/html to IE and application/xhtml+xml to other browsers, or use the wrong MIME type completely.
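The "different MIME type per browser" approach is usually done by looking at the request's Accept header. The following is only a simplified sketch of that idea; the function name and the sample header values are assumptions for illustration, and it ignores q-values, which a real implementation should honour:

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type for an XHTML page based on the Accept header."""
    # Keep only the media types, dropping parameters such as ";q=0.9".
    accepted = {
        part.split(";", 1)[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml"
    # Fall back for clients that do not advertise XHTML support.
    return "text/html"


# Hypothetical header from a browser that advertises XHTML support.
print(choose_content_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
))
# Hypothetical header from a browser that does not.
print(choose_content_type("image/gif, image/jpeg, */*"))
```

Note that the text/html fallback means the page is parsed as HTML in those browsers, which is exactly the compromise described above.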
http://www.w3.org/News/2009#item119 - W3C stops development of XHTML2
http://hixie.ch/advocacy/xhtml - Lengthy article about XHTML by Ian Hickson (creator of Acid2 and Acid3 test and member of WHATWG)
css 2.x is used in 99% cases
Do you know what DOCTYPE is? HTML and XHTML are different languages. If m then it's XHTML.
I make things tooooo simple. Better browse the web

HTML version choice [closed]

When developing a new web based application which version of html should you aim for?
EDIT:
Cool, I was just attempting to get a feel from others. I tend to use XHTML 1.0 Strict in my own work and Transitional when others are involved in the content creation.
I marked the first XHTML 1.0 Transitional post as the 'correct answer' but believe strongly that all the answers given at that point were equally valid.
HTML 4.01. There is absolutely no reason to use XHTML for anything but experimental or academic problems that you only want to run on the 'obscure' web browsers.
XHTML Transitional is completely pointless even to those browsers, so I'm not sure why anyone would aim for that. It's actually pretty alarming that a number of people would recommend that.
I'd say aiming for HTML 4.01 is the most predictable, but Teifion is right really, "anything that renders your page will do".
in response to Michael Stum:
XHTML is XML based, so it allows easier parsing and you can also use the XML Components of most IDEs to programmatically query and insert stuff.
This is certainly not true. A lot of XHTML on the web (if not most) does not conform to XML validity (and it needn't - it's not being sent as XML). Trying to treat this like XML when dealing with it is just going to earn you a lot of headaches. This page on Stack Overflow, for instance, will generate errors with many unforgiving XML tools for having invalid mark-up.
I'd shoot for XHTML Transitional 1.0. There are still a few nuances out there that don't like XHTML strict, and most editors I've seen now will give you the proper nudges to make sure that things are done right.
Transitional flavors of XHTML and HTML are deprecated. They were intended only for old user-agents that don't support CSS. See explanation in the DTD.
W3C advises that you should use Strict whenever possible, and these days it's certainly possible.
Transitional version has already been removed in XHTML/1.1 and HTML5.
XHTML/1.0 has exactly the same elements and attributes (semantics) as HTML4. The XHTML/1.0 specification doesn't even specify any elements! For anything else than syntax, it refers to HTML4.
Additionally, you'll be unable to use any feature of XHTML that is not available in HTML (namespaces, XML DOM) if you send documents as text/html, and unfortunately that is required for compatibility with IE and other HTML-only browsers.
In 2008 the correct choice would be HTML4 Strict:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
but as of 2016, there's only one version of HTML that matters.
<!DOCTYPE html>
Dillie-O is right on with his answer of XHTML 1.0 Transitional but I would suggest shooting for XHTML 1.0 Strict and only falling back on Transitional if there's some piece of functionality you absolutely need that Strict is not allowing.
@Mike:
While I agree that validity is not needed to make a page render (after all, we have to keep IE6 compatibility in...), creating XHTML that is both compatible AND valid is not a problem. The problems start when people are used to HTML 4 and keep using the deprecated tags and attributes.
Just because the Web is a pile of crap does not mean that every new page needs to be a pile of crap as well. Most Validation errors on SO are so trivial, it shouldn't take too long to fix, like missing quotes on attributes.
But it may still be kind of pointless, given the fact that the W3C does not have any idea where they want to be going anyway (see HTML 5) and a certain big browser company that also makes operating systems does not care either, so a site could just as well send out its doctype as HTML 1337 Sucks and browsers would still try to render it.
There are some compelling warnings about the usage of XHTML, primarily centering around the fact that the mime-type for such a document should be sent as:
Content-type: application/xhtml+xml
Yet IE 6 and 7 don't support this, and then websites must send it as:
Content-type: text/html
Unfortunately that method is considered harmful.
Some also bemoan the fact that although the intent of XHTML is to make web pages parsable by an XML parser, it has in practice failed due to incorrect usage on existing websites.
I still prefer to write documents in XHTML 1.0 Strict, mostly because of the challenge, and the cleanliness and error-checking that a validator gives. I enjoy the syntax a bit better, because it forces me to be very explicit in when tags end, etc. It's more for me a personal choice than purely technical.
Anything that renders your page will do so regardless of which popular standard you use. XHTML is stricter and probably "better", but I can't see what advantages you will get with one standard over another.
Personally, I prefer XHTML 1.0 Transitional.
XHTML is XML based, so it allows easier parsing and you can also use the XML Components of most IDEs to programmatically query and insert stuff.
Transitional is not as strict as Strict, which makes it relatively easy to work with, compared to Strict, which can often be a PITA. Comparison between Transitional and Strict
1.0 is "more compatible" than 1.1 and 1.1 seems to be still under some sort of development.
I aim for XHTML 1.0 Trans. It's better to conform so when bugs are fixed in the browsers you won't suddenly be working against the clock trying to figure out what actually needs changing.
In my opinion 1.1 is borked and 2.0 has been smashed to smithereens: Do I really need/want a header/footer tag?
I'm all for XHTML Strict every time. I strongly believe that HTML should be more like XML. It's not hard to validate it if you know XML, and the W3's validator points you in the right direction anyway.
XHTML 2.0 is heading toward what the W3 have been aiming for for a long time - the semantic web. The best benefit of XHTML 2.0 for me is that every conformant page on the web will be understandable as content, or an article (for that's what pages are - documents), because they all conform to the same standard. You would then be able to construct interpreters (i.e. browsers) that present the content in a completely different manner - there are literally thousands of ideas waiting here.
If you want to use XHTML 1.0 in an HTML-compatible way, that's fine. However, do note that the W3C validator and the XHTML DTDs know nothing about mime types and how browsers behave differently (like <map> name/id matching) between them. The DTDs know nothing about how well browsers support certain elements (like <embed> for example) either.
What this means is that the XHTML DTDs and the validator don't reflect reality and trying to conform to them is pointless.
If you want to use XHTML just so you can close certain elements with /> (where html-compatible), just use HTML5 markup (so the browser is in full standards mode). HTML5 allows the use of /> in an HTML-compatible way (the same HTML-compatible way you have to do it when using XHTML 1.0 markup with text/html). Then, just stick to what works (you know better than some DTD) in browsers.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title></title>
</head>
<body>
<p>Line1<br/>Line2</p>
<p><img src="" alt="blank"/></p>
<p><input type="text"/></p>
<p><embed type="application/x-something" src=""/></p>
</body>
</html>
Then, use http://validator.nu/ to make sure it's well formed at least.
If you have tools to generate your XHTML like any other XML document, then go with XHTML. But when you just use plain text templates, text concatenation, etc. you are OK with good old HTML 4.01.
Browsers now start to support this 10 year old standard.
Important: Avoid being called a bozo when producing XML
I don't think it actually matters whether you use XHTML or plain HTML. The end goal here is to have low maintenance and quick development through a predictable rendering. You can get this from using xhtml or html, as long as you have validating code. I've even heard arguments that it's best to target quirks mode, because new versions of browsers don't change quirks mode, so maintenance is easy.
In the end, it all becomes tag soup, for good reason, because getting web app developers to write error-free html means asking them to write bug-free code. Validators are no help, because they only validate the initial page view. This is also why I've never seen the point in xhtml served as xml for anything beyond static sites. The level of arrogance a web apps developer would need to have to serve up their web app as xml is staggering.
HTML 4.0 Strict, or ISO HTML.