w3c validator HTML5 xmlns - html

I couldn't get my website to validate in W3C markup validation.
Here is an example of one of the validation errors.
Error: Attribute xmlns:content not allowed here.
I have done some research, and some articles recommend changing the xmlns:name="http://url" syntax to prefix="name:http://url".
However, I have multiple xmlns attributes.
I am not sure how to write the prefix attribute in this case.
Original:
`<html lang="en" dir="ltr"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:og="http://ogp.me/ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:sioc="http://rdfs.org/sioc/ns#"
xmlns:sioct="http://rdfs.org/sioc/types#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#">`
Are the entries separated by a space?
<html lang="en" dir="ltr" prefix="content:http://purl.org/rss/1.0/modules/content/ dc:http://purl.org/dc/terms/ foaf:http://xmlns.com/foaf/0.1/">
Or should they be separated by a newline (\n)?
I checked the w3.org website, and it looks like they put each entry on a new line.
Am I right?
http://www.w3.org/TR/2011/WD-rdfa-core-20111215/

You can't use elements from arbitrary namespaces in HTML 5. For that you need to be using XML, and for validation you need a suitable DTD or Schema that includes all the namespaces you want to use.

You need a doctype to go before your html tag. You can use:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The Web page contains RDFa markup, which is allowed in HTML 5. The HTML+RDFa 1.1 specification has a section specifically dedicated to the xmlns: attributes.
It says that the use of xmlns: is deprecated in RDFa 1.1 but still allowed for backward compatibility. So, if possible, it should not be used. Then, the following section on Conformance Criteria for xmlns:-Prefixed Attributes indicates that:
For documents conforming to this specification, attributes with names that have a case insensitive prefix matching "xmlns:" MUST be considered conforming. Conformance checkers SHOULD accept attribute names that have a case insensitive prefix matching "xmlns:" as conforming. Conformance checkers SHOULD generate warnings noting that the use of xmlns: is deprecated. Conformance checkers MAY report the use of xmlns: as an error.
I understand that the W3C HTML validator is adopting the stricter conformance check where xmlns: is reported as an error, although I find it strange because it SHOULD rather generate a warning.
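To answer the separator question directly: in RDFa Core 1.1 the prefix attribute holds a whitespace-separated list of pairs, each written as the prefix name, a colon, at least one whitespace character, and then the IRI. Spaces and newlines both count as whitespace, so either layout works. The following is a minimal Python sketch of converting the xmlns: declarations into one prefix value; the helper name xmlns_to_prefix and the regular expression are illustrative assumptions, not part of any specification or tool.

```python
import re

def xmlns_to_prefix(start_tag: str) -> str:
    """Collect xmlns:NAME="IRI" attributes and join them as 'NAME: IRI' pairs."""
    # Hypothetical helper: a simple regex scan, not a full HTML attribute parser.
    pairs = re.findall(r'xmlns:([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"', start_tag)
    return " ".join(f"{name}: {iri}" for name, iri in pairs)

tag = '''<html lang="en" dir="ltr"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:og="http://ogp.me/ns#">'''

prefix_value = xmlns_to_prefix(tag)
print(f'<html lang="en" dir="ltr" prefix="{prefix_value}">')
```

Note the space after each colon in the generated value; a form such as dc:http://purl.org/dc/terms/ without that space does not match the pair syntax described above.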


HTML5 Doctype for Domparser

Task: I want to parse an XML document using DOMParser (https://developer.mozilla.org/en-US/docs/Web/API/DOMParser). I have no formal DTD and do not need one, and parsing the document as "text/xml" worked pretty well. Now I want to use certain symbolic entities, such as &nbsp;, in my XML, and the parser, of course, complains that they are not known. Since I want to be able to access, in principle, all existing HTML entities, I tried to use a doctype specification
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/html4/strict.dtd">
and this worked as expected, since DOMParser seems to have this doctype and the associated entity list preloaded. However, this doctype is outdated. So I tried the new <!DOCTYPE html>, but this did not work. This is also expected, as the new HTML5 doctype works differently from the older XML/SGML-based ones.
Question: Is there a standardized !DOCTYPE for HTML (5) that the browser recognizes and that contains the preloaded HTML entities? (I do not want to copy in a list of all entities as separate entity definitions. The browser has them somewhere; I just do not know how to activate them with an XML/SGML-style DTD for HTML5.)
If you want to continue using XML, but don't want to use the XHTML doctype, then you have to declare the character entities of XHTML via ENTITY declarations directly in your document (in the internal subset or an external declaration set) since only HTML has nbsp and many others as predefined entities (XML has only quot, amp, apos, lt, and gt). You can use the HTML5 entity set from https://www.w3.org/2003/entities/2007/htmlmathml-f.ent (which includes the large set of MathML entities), or the much smaller set of classic HTML4 entities.
But I would first check whether DOMParser actually processes markup declarations and/or external declaration sets with markup declarations. Try to parse the following
<?xml version="1.0"?>
<!DOCTYPE test [
<!ENTITY nbsp "&#160;">
]>
<test>&nbsp;</test>
and check the console for error messages.
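The same experiment can be tried outside the browser. Below is a minimal sketch using Python's xml.etree.ElementTree as a stand-in XML parser; it says nothing about what DOMParser itself does, and the helper name try_parse and the sample documents are assumptions made for illustration. It shows that an undeclared &nbsp; is rejected, while declaring the entity in the internal subset makes it resolve.

```python
import xml.etree.ElementTree as ET

# No declaration for nbsp: a plain XML parser only knows lt, gt, amp, quot, apos.
UNDECLARED = "<test>a&nbsp;b</test>"

# Same content, with nbsp declared in the internal subset of the DOCTYPE.
DECLARED = """<?xml version="1.0"?>
<!DOCTYPE test [
<!ENTITY nbsp "&#160;">
]>
<test>a&nbsp;b</test>"""

def try_parse(xml_text):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None

print(repr(try_parse(UNDECLARED)))
print(repr(try_parse(DECLARED)))
```

If the browser's parser behaves the same way for the internal subset, pasting in the needed ENTITY declarations is enough and no external DTD has to be fetched.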
There is no "official" DTD for HTML (in fact, no formal grammar at all), but there's my SGML DTD for W3C HTML 5.1 with much more information about parsing HTML5 than you probably are interested in, including info about HTML5's predefined entities.

HTML5 html-tag and DOCTYPE

From what I've read, the correct way to start an HTML5 page is:
<!DOCTYPE html>
<html>
With nothing more in those lines. Is this true? (I'm asking because Visual Studio has more than that.)
(Also, I'm wondering if HTML5 is really the current standard or should I be using XHTML5 or some other version.)
According to the HTML living standard and the W3C spec, the doctype is a required preamble, but it is required only for legacy reasons. I quote:
A string that is an ASCII case-insensitive match for the string "<!DOCTYPE".
One or more space characters.
A string that is an ASCII case-insensitive match for the string "html".
Optionally, a DOCTYPE legacy string or an obsolete permitted DOCTYPE string (defined below).
Zero or more space characters.
A U+003E GREATER-THAN SIGN character (>).
In other words, <!DOCTYPE html>, case-insensitively.
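The component list above translates fairly directly into a pattern. Here is a rough Python sketch of such a check; the names DOCTYPE_RE and is_short_html5_doctype are made up for this example, and the optional legacy or obsolete permitted string is deliberately left out, so this covers only the short form.

```python
import re

# "Space characters" are taken here as space, tab, LF, FF and CR.
SPACE = r"[ \t\n\f\r]"

# "<!DOCTYPE", one or more spaces, "html", zero or more spaces, ">".
DOCTYPE_RE = re.compile(rf"<!DOCTYPE{SPACE}+html{SPACE}*>", re.IGNORECASE)

def is_short_html5_doctype(text: str) -> bool:
    """True if text is exactly the short HTML5 doctype, in any letter case."""
    return DOCTYPE_RE.fullmatch(text) is not None

for candidate in ["<!DOCTYPE html>", "<!doctype html>", "<!DoCtYpE hTmL>", "<!DOCTYPEhtml>"]:
    print(candidate, is_short_html5_doctype(candidate))
```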
And <html></html> for a valid document
(Also, I'm wondering if HTML5 is really the current standard or should
I be using XHTML5 or some other version.)
It is not the current standard IMHO because it is not finished yet. But this article explains very well 10 reasons for using it now.
Mostly yes. But the HTML5 spec for the <html> element says
Authors are encouraged to specify a lang attribute on the root html
element, giving the document's language. This aids speech synthesis
tools to determine what pronunciations to use, translation tools to
determine what rules to use, and so forth.
so better, for a page whose content is in American English, would be
<!DOCTYPE html>
<html lang="en-us">
Also if you are using XHTML5 served as application/xhtml+xml you will need to add the namespace, and also the XML equivalent of the lang attribute making it:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-us" xml:lang="en-us">
Yes it's true. No more complicated doctypes in HTML5. The new standard is simplified and there's only the one you said.
According to the HTML5 drafts, “A DOCTYPE is a required preamble”. The preamble <!DOCTYPE html> is recommended, but legacy doctypes are allowed as an alternative, though they “should not be used unless the document is generated from a system that cannot output the shorter string”. The only part that is required in addition to it is the title element, and even it may be omitted under certain conditions. The <html> tag is not required.
HTML5 is not a standard. It is not even a W3C recommendation (yet). What you should use depends on what you are doing. It does not really matter which version of HTML you think you are using. What matters is the markup you have and how browsers (and search engines etc.) process it.
Yes, that's correct as far as I know.

Uppercase or lowercase doctype?

When writing the HTML5 doctype what is the correct method?
<!DOCTYPE html>
or
<!doctype html>
In HTML, the DOCTYPE is case insensitive. The following DOCTYPEs are all valid:
<!doctype html>
<!DOCTYPE html>
<!DOCTYPE HTML>
<!DoCtYpE hTmL>
In XML serializations (i.e. XHTML) the DOCTYPE is not required, but if you use it, DOCTYPE should be uppercase:
<!DOCTYPE html>
See The XML serialization of HTML5, aka ‘XHTML5’:
Note that if you don’t uppercase DOCTYPE in an XHTML document, the XML parser will return a syntax error.
The second part can be written in lowercase (html), uppercase (HTML) or even mixed case (hTmL) — it will still work. However, to conform to the Polyglot Markup Guidelines for HTML-Compatible XHTML Documents, it should be written in lowercase.
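The difference between the two serializations can be observed with any XML parser. As a small sketch, assuming Python's xml.etree.ElementTree as a representative XML parser (browsers use their own), an uppercase DOCTYPE is accepted and a lowercase one is rejected as a syntax error. The helper name is_well_formed is an assumption for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def is_well_formed(xml_text: str) -> bool:
    """True if the XML parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

upper = f'<!DOCTYPE html><html xmlns="{XHTML_NS}"/>'
lower = f'<!doctype html><html xmlns="{XHTML_NS}"/>'

print("uppercase DOCTYPE:", is_well_formed(upper))
print("lowercase doctype:", is_well_formed(lower))
```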
If anyone is still wondering in 2014, please consult this:
HTML5
W3 HTML5 Spec - Doctype
A DOCTYPE must consist of the following components, in this order:
A string that is an ASCII case-insensitive match for the string "<!DOCTYPE".
...
Note: Despite being displayed in all caps, the spec states it is insensitive.
XHTML5
W3 HTML5 - XHTML
This specification does not define any syntax-level requirements beyond those defined for XML proper.
XML documents may contain a DOCTYPE if desired, but this is not required to conform to this specification. This specification does not define a public or system identifier, nor provide a formal DTD.
Looking at the XML spec, it lists DOCTYPE in caps, but I can't find anything that states that 'all caps' is required (for comparison, in the HTML5 spec listed above, it is displayed in the example in all caps, but the spec explicitly states that it is case-insensitive).
Polyglot Markup
W3 Polyglot Markup - Intro
It is sometimes valuable to be able to serve HTML5 documents that are also well formed XML documents.
W3 Polyglot Markup - Doctype
Polyglot markup uses a document type declaration (DOCTYPE) specified by section 8.1.1 of [HTML5]. In addition, the DOCTYPE conforms to the following rules:
The string DOCTYPE is in uppercase letters.
...
So, note that Polyglot Markup uses a regular HTML5 doctype, but with additions/changes. For our discussion, most notably that DOCTYPE is declared in all caps.
Summary
View the W3's HTML vs. XHTML section.
[Opinion] I wouldn't worry too much about satisfying XML compliance unless you are specifically trying to make considerations for it. For most client and JS-based server development, JSON has replaced XML.
Therefore, I can only see this really applying if you are trying to update an existing, XHTML/XML-based legacy system to co-exist with new, HTML5 functionality. If this is the case then look into the polyglot markup spec.
According to the latest spec, you should use something that is a case-insensitive match for <!DOCTYPE html>. So while browsers are required to support whatever case you prefer, it's reasonable to infer from this that <!DOCTYPE html> is the canonical case.
Either upper or lower case is "correct". However if you use web fonts and care about IE7, I'd recommend using <!DOCTYPE html> because of a bug in IE7 where web fonts sometimes fail if using <!doctype html> (e.g. in this answer).
This is why I always upper-case the doctype.
The standard for HTML5 is that tags are case insensitive.
http://www.w3schools.com/html5/tag_doctype.asp
More Technically: (http://www.w3.org/TR/html5/syntax.html)
A DOCTYPE must consist of the following components, in this order:
A string that is an ASCII case-insensitive match for the string <!DOCTYPE.
The question sort of implies there's only one correct answer, supplies a multiple choice of two, and asks us to pick one. I would suggest that for HTML5 both <!DOCTYPE html> and <!doctype html> are valid.
So a HTML5-capable browser would accept the lowercase one and process the html properly.
Browsers previous and oblivious to HTML5, I've heard, even without a doctype, will attempt to process the html as best they can. And if they don't recognize the lowercase doctype will do the same. So there's no point in making it uppercase since those browsers won't be able to fully implement any HTML5 declarations anyway.
The doctype declaration is case insensitive, and any string of ASCII that matches
Html5 standard

What is the DOCTYPE... for [duplicate]

When I create a new file in the NetBeans IDE, I get <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> at the beginning of the file.
If I delete it, my HTML still works. I wonder what it is and whether it is necessary.
Thank you.
The doctype declaration is not an HTML tag; it is an instruction to the web browser about what version of the markup language the page is written in.
The doctype declaration refers to a Document Type Definition (DTD). The DTD specifies the rules for the markup language, so that the browsers render the content correctly.
Take a look here: http://www.w3schools.com/tags/tag_doctype.asp
A Document Type Declaration, or DOCTYPE, is an instruction that associates a particular SGML or XML document (for example, a webpage) with a Document Type Definition (DTD) (for example, the formal definition of a particular version of HTML).
http://en.wikipedia.org/wiki/Document_Type_Declaration
Also, from W3C:
There is not just one type of HTML, there are actually many: HTML 4.01 Strict, HTML 4.01 Transitional, XHTML 1.0 Strict, and many more. All these types of HTML are defined in their respective W3C specifications, but they are also defined in a machine-readable language specifying the legal structure, elements and attributes of a type of HTML.
http://www.w3.org/QA/Tips/Doctype
I believe that if you don't specify a doctype, the browser will add a default one, that's why it works. Adding that line overrides the default to specify that you want that particular markup language.
There are many variations of HTML with various names; XHTML, DHTML etc... Your browser will do its best to work out which variation your document is written in but may not always get it right. Particularly in IE it will default to "quirks mode" if you do not declare a doctype which frequently causes most of your layout to break.
Declaring the doctype means the browser doesn't have to make this best guess and instead, it renders your page according to the specification related to the doctype you have declared.
Here are some interesting articles on the differences between some of the DTDs:
Strict vs. Transitional
HTML vs. XHTML
To make it clear: unless we care about validation, the only reason to use a doctype is to trigger standards mode (see other comments). Browsers do not differentiate between versions of HTML. This is why it is recommended to choose the simplest doctype possible:
<!doctype html>
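For anyone who wants to check whether a file declares a doctype at all, here is a rough Python sketch built on the standard library's html.parser. It only reports the declaration that was found; it does not model the rules browsers use to choose between quirks and standards mode. The class name DoctypeSniffer and the function find_doctype are invented for this example.

```python
from html.parser import HTMLParser

class DoctypeSniffer(HTMLParser):
    """Remember the first doctype declaration seen while parsing."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. "doctype html".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl

def find_doctype(markup: str):
    """Return the doctype declaration text, or None if the markup has none."""
    sniffer = DoctypeSniffer()
    sniffer.feed(markup)
    sniffer.close()
    return sniffer.doctype

print(find_doctype("<!doctype html><p>Hello</p>"))
print(find_doctype("<p>Hello</p>"))
```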
The doctype tells the browser what version of HTML or XHTML you are writing, so it can treat it as it is supposed to.
With no doctype it will work, but the browser won't know exactly what version it is.

DOCTYPE's role in general XML

I know the purpose of DOCTYPE (and what each url/identifier on the line is) as far as web standards and page validation goes, but I am unsure about what it actually "is" in the context of an XML document.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>My Page</title>
</head>
<body>
<p>Hello</p>
</body>
</html>
Is it part of the actual XML document structure, or is it some kind of comment-like "hint" that is noted then stripped?
What is the significance of the "!" before the name? Does this denote a special type of "element"? What are they called?
The example I posted is XHTML for the web, but is DOCTYPE also used in general purpose XML documents?
DOCTYPE has been "inherited" from SGML (it was supposed to point to a DTD file that explains how to parse the file); however, self-explanatory XML syntax and namespaces made it largely irrelevant. The only real use for DOCTYPE/DTD in XML is to define allowed named entities (e.g. &nbsp;).
XML spec even allows "non-validating" parsers that ignore DTD file completely (web browsers use such parsers, unless you've fallen into the text/html trap in which case XML parser is not used at all).
DTD is quite poor for the purpose of validation (it is hard to specify rules for more than one level of nesting, and there is no way to specify types of attributes beyond a few predefined types). Schema and RelaxNG can be far more precise.
DTD doesn't fully support namespaces either, which leads to ridiculous workarounds like the XHTMLplusMathMLplusSVG DOCTYPE.
In web browsers, certain DOCTYPEs have the desirable side effect of triggering standards-compliant rendering mode. This is more of a hack than an intended use of DOCTYPEs.
If you're using real XHTML (application/xhtml+xml, the one that doesn't open in IE at all), then don't use a DOCTYPE at all (that's the recommendation from XHTML 5). XML mode will trigger standards-compliant rendering regardless of DOCTYPE.
If you're using text/html mode, then use <!DOCTYPE html>. That's the HTML 5 DOCTYPE, and it's the shortest one that triggers the best possible rendering in all browsers. Browsers don't use the DOCTYPE for any other purpose, so you're not missing out on anything.
If you're processing XHTML files with XML parsers (outside browsers), then please don't forget to set up a DTD catalog properly, otherwise your parser may be DoS-ing w3.org trying to fetch the DTD every time. If you can't use a DTD catalog, then disable "externals" in the parser, or omit the DOCTYPE and don't use named entities (i.e. use &#160; rather than &nbsp;).
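The last option, numeric character references instead of named entities, needs no DOCTYPE and therefore no DTD fetch. A brief sketch, again assuming Python's xml.etree.ElementTree as the XML parser outside the browser:

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to U+00A0 (no-break space), no DOCTYPE needed.
decimal = ET.fromstring("<p>a&#160;b</p>")
hexadecimal = ET.fromstring("<p>a&#xA0;b</p>")

print(repr(decimal.text))
print(decimal.text == hexadecimal.text)
```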
DOCTYPE is part of the XML specification (see the relevant subsection here) and can include either a link to a DTD, "internal" DTD declarations, or both. Many "modern" uses of XML don't use a DOCTYPE at all, though - as porneL mentions, both XML Schema and RelaxNG are more powerful ways to specify a document's syntax. See this Tim Bray blog post for a bit more background.