HTML5 microdata: span content? - html

I have read the HTML5 specification, the microdata specification, and the WHATWG HTML5 (with microdata) specification. These are well written and easy to understand.
But then I read the schema.org Book specification and came across snippets like the following:
<span itemprop="price" content="6.99">$6.99</span>
<span itemprop="inLanguage" content="en">English-language</span>
<span itemprop="name" content="Tolkien, J. R. R. (John Ronald Reuel)">
J. R. R. Tolkien</span>
Apparently (compare with the JSON version), the values of these microdata properties are the values of the content attributes of the span elements. (Of course, if there is no content attribute, the value is instead the textContent of the span element.)
But I cannot find any support for this practice in the HTML and microdata specifications. In fact, I cannot even find any evidence that there is a content attribute on span elements at all!
The microdata specification doesn't say anything about a span content attribute when it gives the rules for values. [Unless 'the element's textContent' is overridden by the content attribute, but I cannot find any support for this either.]
Not even the full WHATWG HTML5+microdata specification supports the claim that there is a content attribute on span (see The span element and Global attributes).
So, I suppose the schema.org example is non-conforming. But is it also plain wrong? If not, where does this practice come from, and how accepted is it?

Yes, this is wrong. Neither Microdata nor HTML5 defines a content attribute for the span element.
Several people have wanted to use it; see, for example, the code in these questions:
Hide Microdata property value in 'content' attribute?
Categories for Product in schema.org?
Is the "content" attribute valid for the <span> tag > if so is it a good practice?
schema.org product availability tags markup
I’m not sure where exactly this confusion is coming from.
(It doesn’t help that Google’s Structured Data Testing Tool incorrectly uses the content attribute instead of the element content; but at least all other Microdata parsers seem to do it correctly.)
Maybe some people got confused because RDFa (but not Microdata) defines and allows the content attribute for span. See HTML+RDFa’s Extensions to the HTML5 Syntax:
For the avoidance of doubt, the following RDFa attributes are allowed on all elements in the HTML5 content model: vocab, typeof, property, resource, prefix, content, about, rel, rev, datatype, and inlist.
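For comparison, here is a minimal sketch of how values like the ones in the question could be expressed in conforming Microdata. It relies on the Microdata property-value rules, under which a meta element contributes its content attribute, a data element contributes its value attribute, and most other elements (including span) contribute their text content. The item types and nesting below are illustrative assumptions, not a copy of the schema.org example.

```html
<!-- Sketch only: item types and nesting are illustrative assumptions -->
<div itemscope itemtype="http://schema.org/Book">
  <!-- data element: the property value is "en", the visible text is unchanged -->
  <data itemprop="inLanguage" value="en">English-language</data>

  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <!-- meta element: the property value is "6.99"; nothing is rendered for it -->
    <meta itemprop="price" content="6.99">
    <!-- plain span: the property value is the element's text content -->
    <span itemprop="priceCurrency">USD</span> 6.99
  </div>
</div>
```

Both meta (with itemprop) and data are allowed in the body, so neither needs a content attribute on span.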

(Sorry, I didn't have enough reputation to post a comment.)
We're at the end of 2017 now. Somehow, the MDN web docs (https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/itemprop) and the schema.org docs (http://schema.org/telephone) still suggest using a content attribute on span with Microdata. No HTML5 validator will accept this, of course.

Related

Using an individual tag without breaking the standards?

I would like to create some kind of API where people can include hidden information inside a website, so that a bot can read the information.
I know it is possible with meta tags, but I am considering using some kind of individual (custom) tag instead, because then I can use the DOM, which is a bit more comfortable to work with, and the markup is easier for humans to read.
Example:
<html>
...
<body>
...
<mytag id="123" foo="bar" bar="foo"></mytag>
...
<mytag id="345" foo="bar" bar="foo"></mytag>
...
</body>
</html>
My question is whether it is possible to make this individual tag somehow conform to the standards, maybe by creating some kind of DTD.
I would like to support HTML 4.01, XHTML and HTML 5, if possible.
Having to support HTML 4.01 and HTML5 makes this hard. You can’t use meta-name elements (would work for HTML 4.01, but they have to be registered for HTML5), you can’t use custom data-* attributes (not allowed in HTML 4.01), you can’t use Microdata (only defined for HTML5+), you can’t use custom elements (only defined for HTML5+).
I can think of two ways.
script element as data block
In HTML5, the script element can also be used for data blocks. Examples: text/html, text/plain.
The HTML 4.01 spec doesn’t define it like that, but it should still be possible/valid (it’ll understand it as "script", but user agents are not expected to try to run it if they don’t recognize the content type as possible for scripts).
Drawback: The content is not part of the document’s DOM.
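A data block along these lines might look as follows. This is only a sketch: the application/json type, the id, and the JSON keys are assumptions chosen to mirror the mytag example from the question, and any type that user agents do not treat as a script language should behave similarly.

```html
<!-- Sketch only: the id, the MIME type, and the JSON keys are illustrative -->
<script type="application/json" id="item-123">
  {"foo": "bar", "bar": "foo"}
</script>
```

A bot would read the text inside the script element and parse it on its own, because the structured data is not exposed as separate DOM nodes.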
RDFa
It’s allowed in HTML 4.01 and HTML5 (you might have to adapt the DOCTYPE for the older HTML versions, e.g., for XHTML).
You can’t use custom elements, but you can add property and content attributes (for name-value pairs), and you could use typeof for "items" (e.g., what you would use the element name for), and you can make use of meta and link elements (visually hidden by default) in the body.
<div vocab="https://api.example.com/voc#" class="the-hidden-information">
  <div typeof="Item-123">
    <meta property="foo1" content="bar1" />
    <meta property="foo2" content="bar2" />
  </div>
  <div typeof="Item-345">
    <meta property="foo1" content="bar1" />
    <link property="foo5" href="/some-url" />
  </div>
</div>
(when using RDFa 1.0 instead of 1.1, you’d have to use xmlns instead of vocab)

Can empty HTML elements have attributes in HTML5?

Can empty HTML elements (i.e. elements having no content and no closing tag, like br/hr or any other HTML elements which I'm not aware of) have attributes in the latest HTML5 standard?
Could somebody please explain this in simple, easy-to-understand language?
Yes. Example: the <hr> tag accepts attributes that move the line around or change its length. (In HTML5 these particular presentational attributes are obsolete, and CSS is used instead.)
<hr width="50%" align="right">
They can. For example, the <br> tag supports the global HTML attributes. You can check the attributes of HTML tags on the W3Schools site. Here is the one for br:
http://www.w3schools.com/tags/tag_br.asp
(Check out the Global Attributes and/or Event Attributes)
You can easily check yourself which attributes an HTML5 element can have. In short:
Visit the HTML5 specification.
Search for the element under the "Table of Contents" (section 4).
For each element, see the attributes listed under "Content attributes".
In case of br and hr, they can have the global attributes (class, id, lang etc.).
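As a small illustration, the markup below puts global attributes on br and hr, plus element-specific attributes on img, another element without content or end tag. The class names, id, title text, and image file name are made up for the example.

```html
<!-- Void elements have no content and no end tag, but they take attributes as usual -->
<p>First line<br class="soft-break" id="break-1">Second line</p>
<hr class="section-divider" title="End of section" lang="en">
<img src="photo.jpg" alt="A sample photo" width="200" height="100">
```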

Is adding a css class to a <b> tag valid html/css

Is adding a CSS class to a <b> tag valid HTML/CSS?
For example, can I do this:
<b class="myclass"> Foo Bar </b>
Is this valid HTML/CSS?
I need to add a class to a b tag as an identifier so I can use it in jQuery/JS. It won't have any CSS styles.
Yes, the b tag can have all global attributes, including class.
The full list of attributes you can add to the b element:
accesskey
class
contenteditable
contextmenu
dir
draggable
dropzone
hidden
id
inert
itemid
itemprop
itemref
itemscope
itemtype
lang
spellcheck
style
tabindex
title
translate
You can also use any custom data attributes.
Finally, you can also add the ARIA role attribute.
Of course. There's nothing wrong with that.
However, it's generally a bad idea to use class purely for identifying an element. Consider using a custom data attribute such as data-reference instead, as this is arguably more correct (and may spare the browser from keeping track of a class that's never used as a class).
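A minimal sketch of the data attribute approach, next to the class-based markup from the question. The attribute name data-reference comes from the suggestion above; its value is a made-up example.

```html
<!-- data-reference value is a made-up name for illustration -->
<b data-reference="order-total">Foo Bar</b>

<!-- the question's class-based version, which is equally valid HTML -->
<b class="myclass">Foo Bar</b>
```

In script, the first element can be matched with the attribute selector b[data-reference="order-total"] and the second with b.myclass; both selectors work with document.querySelector and with jQuery.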
Yes, this is perfectly valid. Absolutely nothing wrong with it.
There's nothing strictly wrong with it. Note, though, that HTML5 defines <b> as text that is stylistically offset without extra importance; when the text is actually important, the <strong> tag is the better fit.
I never give the strong (or b) tag a class because I only use it when I want the text strong to call attention to it, similar to <em> emphasizing words in text.
If I am bolding or emphasizing the text for some other reason I use a div or span with a class — for example, it is common to italicize the title of a book or article, and in that case I do not use <em> around the title, I use <span class="title">This Is That Title</span> to semantically mark what this thing is, then use a stylesheet to say "titles are italic".
There are no “CSS classes”. CSS has class selectors, but that’s a different issue and postulates the existence of a class attribute in a markup language. Thus, the question is meaningless as far as CSS is concerned.
In HTML, the class attribute is valid (formally correct) on b elements, in any HTML version from HTML 4.0 (which introduced the attribute) onwards. Whether it makes sense or not is a different issue, but there are no formal restrictions on its use. Although class is most often used for styling, it can be used for other purposes, too, especially in scripting.

The HTML dfn and abbr tags and correct usage of the title attribute

I am trying to figure out the correct way to use the dfn tag along with the title attribute and the abbr tag. I am not sure I'm doing it correctly. Can someone tell me which of my examples below are correct or wrong, and why, so I can better understand what I am doing and correct my errors? Thanks.
Example 1
<p><dfn>CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Example 2
<p><dfn title="Cascading Style Sheets">CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Example 3
<p><dfn title="A style sheet language used for describing the look and formatting of a document written in a markup language.">CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Example 4
<p><dfn><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is a simple mechanism for adding style to Web documents.</p>
Example 5
<p><dfn title="CSS"><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is a simple mechanism for adding style to Web documents.</p>
I think it should be:
<p>
<dfn>
<abbr title="Cascading Style Sheets">CSS</abbr>
</dfn>
is a simple mechanism for adding style to Web documents.
</p>
Else, if it contains only an abbr element with a title attribute, then the term is the value of that attribute.
MDN
TL;DR: Although two of these examples are semantically correct, the most semantic way is example 4.
<p><dfn>CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Correct. According to Mozilla, because you are not using the title attribute, "CSS" is considered to be the term defined by the sentence within p.
<p><dfn title="Cascading Style Sheets">CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Incorrect. According to Mozilla, the title attribute is meant for "another form of the term", but only when the term defined isn't an abbreviation. When the term is an abbreviation, abbr should be used within the dfn element.
<p><dfn title="A style sheet language used for describing the look and formatting of a document written in a markup language.">CSS</dfn> is a simple mechanism for adding style to Web documents.</p>
Incorrect. The title attribute is meant for "another form of the term", not its definition.
<p><dfn><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is a simple mechanism for adding style to Web documents.</p>
Correct. According to Mozilla, that's the recommended usage.
<p><dfn title="CSS"><abbr title="Cascading Style Sheets">CSS</abbr></dfn> is a simple mechanism for adding style to Web documents.</p>
Incorrect. According to Mozilla, if the dfn element contains a single child element and does not have any text content of its own, and the child element is an abbr element with a title attribute itself, then the exact value of the abbr element's title attribute is the term being defined. Therefore, the title attribute of the dfn element shouldn't be taken into account by semantic systems (A.I., lurkers, accessibility, etc.).
N.B. I produced this answer mostly by drawing on the "Using abbreviations and definitions together" and "Specifying the term being defined" sections of the dfn article on Mozilla.
They are all “correct” in the sense of matching the loose definitions of abbr and dfn in HTML specifications and drafts. The statement here is not really a definition at all (it says something about CSS instead of specifying the essential features needed to distinguish it from other entities), and “CSS” is not really an abbreviation but a name, though nominally formed as an abbreviation of some words. But the specs are so vague that even the markup in the question may well be interpreted as matching the “semantics” of these elements.
The question is rather academic, since abbr and dfn have almost no impact on anything but some features of default rendering, and you could and should use CSS rules that either confirm or override such styling, and then you might almost as well use span.

When using HTML5 Microdata, should the 'itemscope' and 'itemtype' always be used on the same element?

I'm trying to understand the reason behind the existence of two attributes instead of just making the element holding the 'itemtype' the one that wraps the scope for the item.
Is it valid to have the 'itemtype' attribute on one element and the 'itemscope' attribute on another, like this?
<section itemtype="http://data-vocabulary.org/Person">
  <div itemscope>
    <span itemprop="name">Alonso Torres</span>
  </div>
</section>
If this case is not valid, then why does the 'itemscope' attribute exist at all? Why didn't the spec make the element holding the 'itemtype' attribute the one that sets the scope? That would have made sense to me.
You're right, the itemscope attribute seems redundant. Someone else pointed this out on the W3C's HTML mailing list: http://lists.w3.org/Archives/Public/public-html-bugzilla/2011Jan/0517.html
The answer ( http://lists.w3.org/Archives/Public/public-html-bugzilla/2011Jan/0523.html ) was that:
The HTML spec editor did user-testing of the feature earlier, and if I recall correctly, several of the test subjects found it much easier if there was an explicit indicator of the container, rather than it being implicit due to the type.
In other words, it's better for attributes to have a single clear definition than multiple implied definitions. Not sure I agree but that's the official view.
itemscope is mandatory on any element that uses itemtype
The example you show is invalid. The spec has been updated to include this:
The itemtype attribute must not be specified on elements that do not have an itemscope attribute specified.
Here, "must not" is to be interpreted as in RFC 2119: "the definition is an absolute prohibition of the specification".
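Under that rule, one way to repair the markup from the question is to move itemscope onto the element that carries itemtype. This is a sketch of one possible fix; moving itemtype down onto the div would work just as well.

```html
<!-- itemscope and itemtype now sit on the same element -->
<section itemscope itemtype="http://data-vocabulary.org/Person">
  <div>
    <span itemprop="name">Alonso Torres</span>
  </div>
</section>
```

The name property still belongs to the Person item, because the span is a descendant of the element that has itemscope.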
I don't believe that it is useful to place an itemtype attribute anywhere but on the same element as the itemscope attribute. The spec says:
The type for an item is given as the value of an itemtype attribute on the same element as the itemscope attribute.
The reason why two attributes are needed isn't clear to me either. Semantically they serve different purposes, so for clarity of usage it may have seemed more sensible. For simple use, it's possible to create an item using itemscope without giving it a type. That means that itemscope is a boolean attribute, whereas itemtype takes a string value. It's not possible in HTML for an attribute to behave as boolean when used without a value, and a string when used with one, so separate attributes make sense.
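The difference between the boolean attribute and the string-valued one can be seen in a short sketch. It reuses the name and vocabulary URL from the question purely for illustration.

```html
<!-- An item without a type: the boolean itemscope alone is enough to create it -->
<div itemscope>
  <span itemprop="name">Alonso Torres</span>
</div>

<!-- A typed item: itemtype adds a URL value alongside itemscope on the same element -->
<div itemscope itemtype="http://data-vocabulary.org/Person">
  <span itemprop="name">Alonso Torres</span>
</div>
```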
I know that Google did a usability study on the Microdata markup before it was announced, so it is likely that such questions were addressed there and that separate attributes were the preferred outcome. (Although that study also resulted in a preference for itemref being an element, not an attribute, something that was subsequently changed.)