Using proper names - html

My HTML tag specifies lang="en", but there are a lot of proper names in the document. These are things such as surnames, which the validator flags as spelling mistakes. I'd like to put them in a <span> with lang="none", for example. Is there a correct way of doing this (i.e. one which validates as correct HTML)?

The correct way to do it is to set the attribute to an empty string:
<span lang="">...</span>
To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang attribute in the XML namespace set or is an HTML element and has a lang in no namespace attribute set. That attribute specifies the language of the node (regardless of its value).
If the resulting value is the empty string, then it must be interpreted as meaning that the language of the node is explicitly unknown.
HTML5 Spec
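As a rough illustration of the lookup described in the quoted text, here is a minimal JavaScript sketch. The node shape ({ lang, parent }) and the function name resolveLanguage are made up for this example. It also ignores the xml:lang attribute and whatever fallback the spec applies when no ancestor has a lang attribute at all.

```javascript
// Hypothetical minimal node shape: { lang?: string, parent?: node }.
// Walks from the node up through its ancestors and stops at the
// nearest one that has a lang value, whatever that value is.
function resolveLanguage(node) {
  for (let current = node; current; current = current.parent) {
    if (typeof current.lang === "string") {
      // An empty string means the language is explicitly unknown.
      return {
        language: current.lang === "" ? null : current.lang,
        explicit: true,
      };
    }
  }
  // No lang attribute anywhere in the chain (fallback rules not modelled).
  return { language: null, explicit: false };
}

const htmlNode = { lang: "en" };
const paragraphNode = { parent: htmlNode };
const surnameSpan = { lang: "", parent: paragraphNode };

console.log(resolveLanguage(paragraphNode)); // inherits "en" from the root
console.log(resolveLanguage(surnameSpan));   // explicitly unknown
```

The point of the sketch is that lang="" on the span stops the lookup before it reaches the lang="en" on the root element.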

Related

Random Letter html Tag

I was wondering if you can use a random letter as an html tag. Like, f isn't a tag, but I tried it in some code and it worked just like a span tag. Sorry if this is a bad question, I've just been curious about it for a while, and I couldn't find anything online.
I was wondering if you can use a random letter as an html tag.
Yes and no.
"Yes" - in that it works, but it isn't correct: when you have something like <z> it only works because the web (HTML+CSS+JS) has a degree of forwards compatibility built-in: browsers will render HTML elements that they don't recognize basically the same as a <span> (i.e. an inline element that doesn't do anything other than reify a range of the document's text).
However, to use HTML5 Custom Elements correctly you need to conform to the Custom Elements specification which states:
The name of a custom element must contain a dash (-). So <x-tags>, <my-element>, and <my-awesome-app> are all valid names, while <tabs> and <foo_bar> are not. This requirement is so the HTML parser can distinguish custom elements from regular elements. It also ensures forward compatibility when new tags are added to HTML.
So if you use <my-z> then you'll be fine.
The HTML Living Standard document, as of 2021-12-04, indeed makes an explicit reference to forward-compatibility in its list of requirements for custom element names:
https://html.spec.whatwg.org/#valid-custom-element-name
They start with an ASCII lower alpha, ensuring that the HTML parser will treat them as tags instead of as text.
They do not contain any ASCII upper alphas, ensuring that the user agent can always treat HTML elements ASCII-case-insensitively.
They contain a hyphen, used for namespacing and to ensure forward compatibility (since no elements will be added to HTML, SVG, or MathML with hyphen-containing local names in the future).
They can always be created with createElement() and createElementNS(), which have restrictions that go beyond the parser's.
Apart from these restrictions, a large variety of names is allowed, to give maximum flexibility for use cases like <math-α> or <emotion-😍>.
So, by example:
<a>, <q>, <b>, <i>, <u>, <p>, <s>
No: these single-letter elements are already used by HTML.
<z>
No: element names that don't contain a hyphen cannot be custom elements. Present-day browsers interpret them as invalid/unrecognized markup, which they nevertheless (largely) treat the same as a <span> element.
<a:z>
No: using a colon to use an XML element namespace is not a thing in HTML5 unless you're using XHTML5.
<-z>
No - the element name must start with a lowercase ASCII character from a to z, so - is not allowed.
<a-z>
Yes - this is fine.
<a-> and <a-->
Unsure - these two names are curious:
The HTML spec says the name must match the grammar rule [a-z] (PCENChar)* '-' (PCENChar)*.
The * denotes "zero-or-more" which is odd, because that implies the hyphen doesn't need to be followed by another character.
PCENChar represents a huge range of visible characters permitted in element names. Curiously, this includes -, so by that rule <a--> should be valid.
But note that -- is a reserved character sequence in the greater SGML-family (including HTML and XML) which may cause weirdness. YMMV!
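To check the examples above mechanically, here is a sketch of the grammar rule as a JavaScript regular expression. The PCENChar ranges and the list of reserved names are taken from the HTML Standard text of that period rather than from the answer itself, and the standard may have changed since. The function name isValidCustomElementName is hypothetical, so treat this as an approximation, not a reference implementation.

```javascript
// PCENChar character ranges from the custom element name grammar
// (as published around 2021; the current standard may differ).
const PCEN_CHAR =
  "[-.0-9_a-z\\u00B7\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u037D" +
  "\\u037F-\\u1FFF\\u200C-\\u200D\\u203F-\\u2040\\u2070-\\u218F" +
  "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD" +
  "\\u{10000}-\\u{EFFFF}]";

// [a-z] (PCENChar)* '-' (PCENChar)*
const CUSTOM_NAME = new RegExp(
  "^[a-z]" + PCEN_CHAR + "*-" + PCEN_CHAR + "*$",
  "u"
);

// Hyphenated names already used by SVG and MathML are excluded.
const RESERVED_NAMES = new Set([
  "annotation-xml", "color-profile", "font-face", "font-face-src",
  "font-face-uri", "font-face-format", "font-face-name", "missing-glyph",
]);

function isValidCustomElementName(name) {
  return CUSTOM_NAME.test(name) && !RESERVED_NAMES.has(name);
}

for (const name of ["z", "a:z", "-z", "a-z", "my-z", "a-", "a--"]) {
  console.log(name, isValidCustomElementName(name));
}
```

Under this grammar both a- and a-- pass, which matches the reading of the rule given above. Whether they behave well in practice is a separate question.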

Can I have multiple values in one HTML "data-" element?

Can I have multiple values in one HTML "data-" element? Similar to how a class can have multiple class names.
If possible, I would like to create a CSS/JS library that makes use of one "data-" element to house all of the library styles. For example:
<div data-library-name="xs-hidden col-md-10 col-xl-8 big-hero"></div>
That way, any of the programmer's custom style rules can go into the element's class. My reasoning for this is to make readability easier, so together it would look like:
<div class="custom-style another-style" data-library-name="xs-hidden col-md-10 col-xl-8 big-hero"></div>
Can I have multiple values in one HTML "data-" element?
You can have a string. The spec doesn't define any particular format for the data in the attribute, which is designed to be processed by site specific JavaScript.
Similar to how a class can have multiple class names.
The class attribute takes a space separated list of classes.
Your JavaScript can call your_data_attribute_value.split(" ") if you like.
Handling this with CSS would use the ~= attribute selector.
[att~=val]
Represents an element with the att attribute whose value is a whitespace-separated list of words, one of which is exactly "val". If "val" contains whitespace, it will never represent anything (since the words are separated by spaces). Also if "val" is the empty string, it will never represent anything.
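The matching rule quoted above can be mimicked in plain JavaScript, which may help when reasoning about what a selector such as [data-library-name~="col-md-10"] will and won't match. This is only a sketch of the quoted semantics; the function name matchesWordSelector is made up, and the whitespace set (space, tab, line feed, carriage return, form feed) is assumed from the CSS definition of white space.

```javascript
// White space as CSS defines it: space, tab, LF, CR, FF.
const CSS_WHITESPACE = /[ \t\n\r\f]+/;

// Returns true when `val` is exactly one of the whitespace-separated
// words in `attributeValue`, mirroring [att~=val].
function matchesWordSelector(attributeValue, val) {
  // An empty val, or one containing white space, never matches.
  if (val === "" || /[ \t\n\r\f]/.test(val)) return false;
  return attributeValue.split(CSS_WHITESPACE).includes(val);
}

const libraryValue = "xs-hidden col-md-10 col-xl-8 big-hero";
console.log(matchesWordSelector(libraryValue, "col-md-10")); // true
console.log(matchesWordSelector(libraryValue, "col-md"));    // false
console.log(matchesWordSelector(libraryValue, ""));          // false
```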
AFAIK, data- attributes can't convert that to an array. The value will be interpreted as a single string, but it is allowed.
If you want to do that, you'll probably have to split() it later in JavaScript into an array of usable values.
See this example on JSFiddle.net.
CSS has the shortcut .class selector, but it actually parses the attribute named "class" as a list of space-separated values. This is supported in the non-shortcut form by the following attribute selector:
[att~=val]
Represents an element with the att attribute whose value is a white space-separated list of words, one of which is exactly "val". If "val" contains white space, it will never represent anything (since the words are separated by spaces). If "val" is the empty string, it will never represent anything either.
Ref: http://www.w3.org/TR/CSS2/selector.html#class-html
As your question is tagged CSS, that is perhaps what you're looking for. The rules for parsing attribute values are given in that document as well, so if the JavaScript library you're using (if any) doesn't cover this, it should be easy to add:
var list = $("div").data("library-name").split(/\s+/);
                                         ^^^^^^^^^^^^
This split with the white-space regular expression parses the string attribute value into an array, using JavaScript and the jQuery library (for accessing the DOM and the data attribute).
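One detail worth noting when choosing between split(" ") and a white-space regular expression: leading, trailing, or repeated spaces produce empty strings in the result. A small sketch without jQuery, operating on a plain string in place of the attribute value (the helper name splitTokens and the sample string are hypothetical):

```javascript
// Splits on runs of white space and drops the empty strings that
// appear when the value starts or ends with white space.
function splitTokens(value) {
  return value.split(/[ \t\n\r\f]+/).filter((token) => token.length > 0);
}

const rawValue = "  xs-hidden   col-md-10\tbig-hero ";

console.log(rawValue.split(" "));   // contains empty strings
console.log(splitTokens(rawValue)); // three clean tokens
console.log(splitTokens(""));       // an empty array
```

Filtering out empty tokens keeps the result usable as a list of names regardless of how the attribute was formatted by hand.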

Is an empty class attribute valid HTML?

Is an empty class attribute valid HTML in the following formats:
<p class="">something</p>
<p class>something</p>
I found this question which is similar, but asks specifically about custom data attributes.
After looking at the specifications referred to in the other answers, I have found the sections that actually do answer the raised question.
<p class> is not allowed
The specification on attributes section 3.2.3.1 on Empty Attribute Syntax states the following:
An empty attribute is one where the value has been omitted. This is a syntactic shorthand for specifying the attribute with an empty value, and is commonly used for boolean attributes. This syntax may be used in the HTML syntax, but not in the XHTML syntax.
(...)
This syntax is permitted only for boolean attributes.
Seeing that the description of the class attribute (obviously) does not mention it being a boolean attribute, omitting the value is not permitted.
<p class=""> is allowed
From the section on class we learn that:
Every HTML element may have a class attribute specified.
The attribute, if specified, must have a value that is a set of space-separated tokens representing the various classes that the element belongs to.
and from the definition of space-separated tokens:
A set of space-separated tokens is a string containing zero or more words (known as tokens) separated by one or more space characters, where words consist of any string of one or more characters, none of which are space characters.
we can conclude that the attribute value can in fact be empty (i.e. containing zero tokens).
From the HTML5 Reference page, section 3.2.3 Attributes:
Elements may have attributes that are used to specify additional information about them. Some attributes are defined globally and can be used on any HTML element, while others are defined for specific elements only. Every attribute must have an attribute name that is used to identify it. Every attribute also has an associated attribute value, which, depending on the attribute's definition, may represent one of several different types. The permitted syntax for each attribute depends on the given value.
So to answer your question,
Invalid:
<p class>
Valid (empty value)
<p class="">
See http://dev.w3.org/html5/html-author/ for all the HTML5 reference material you need.
Not having any values won't make it invalid. I have tested it in http://validator.w3.org/#validate_by_input
Put this code there and test:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
</head>
<body>
<div class>Validiity
<input type="text" disabled>
</div>
</body>
</html>
Bare attribute names, without a value or quotes, are intended for boolean attributes such as disabled and required:
A number of attributes are boolean attributes. The presence of a boolean attribute on an element represents the true value, and the absence of the attribute represents the false value.
More here: https://html.spec.whatwg.org/#boolean-attributes
Read this Q/A on boolean attribute discussion - What does it mean in HTML 5 when an attribute is a boolean attribute?
The class attribute should contain a value; without one it is not valid, although this has no impact on rendering.
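Separately from the validity question, the quoted spec text describes the value-less form as shorthand for an empty value, so both spellings end up with the same attribute value. The following is a deliberately simplified sketch of that idea. The regex-based parseAttributes helper is hypothetical and is not the real HTML tokenizer; it only handles simple start tags.

```javascript
// Very simplified start-tag attribute parser (assumption: well-formed,
// single start tag, no character references). Not the HTML tokenizer.
function parseAttributes(startTag) {
  const attributes = {};
  const inner = startTag
    .replace(/^<\s*[^\s/>]+/, "") // drop "<tagname"
    .replace(/\/?>\s*$/, "");     // drop ">" or "/>"
  const pattern =
    /([^\s"'<>\/=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'=<>`]+)))?/g;
  let match;
  while ((match = pattern.exec(inner)) !== null) {
    const name = match[1].toLowerCase();
    // A missing value is recorded as the empty string.
    if (!(name in attributes)) {
      attributes[name] = match[2] ?? match[3] ?? match[4] ?? "";
    }
  }
  return attributes;
}

console.log(parseAttributes("<p class>"));
console.log(parseAttributes('<p class="">'));
console.log(parseAttributes('<input type="text" disabled>'));
```

In this sketch <p class> and <p class=""> produce the same result, which is consistent with the observation that the difference has no effect on rendering.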

are attributes without value allowed in HTML4?

I wonder if HTML 4 allows attributes without value, as being equivalent to attributes with an empty value. For example:
<h2 section>foobar</h2>
instead of:
<h2 section="">foobar</h2>
Are the two snippets equally valid? If not, are they valid in HTML version 5?
thanks!
Boolean attributes: yes, they are completely valid.
From W3C: (On SGML & HTML)
Some attributes play the role of boolean variables (e.g., the selected
attribute for the OPTION element). Their appearance in the start tag
of an element implies that the value of the attribute is "true". Their
absence implies a value of "false".
Boolean attributes may legally take a single value: the name of the
attribute itself (e.g., selected="selected").
This means that boolean attributes are valid in HTML4 as well, but only on the elements that define them. Something like the following would be invalid, because the selected attribute belongs to the option tag (thanks to @Ronni Skansing for clarifying this):
<p selected>Hello</p>
HTML5 docs:
From W3C:
Empty Attribute Syntax
Certain attributes may be specified by providing just the attribute
name, with no value.
From W3C (HTML 5.1 Nightly):
A number of attributes are boolean attributes. The presence of a
boolean attribute on an element represents the true value, and the
absence of the attribute represents the false value.
BUT
section is an invalid attribute. If you want to define your own attributes, HTML5 provides a way to do that: use the data- prefix. For example, your section attribute should be written as data-section; that way it will be counted as valid.
If you hesitate to do so, we always have a validator to check - W3C Markup Validation Service
^ Validated As HTML5
NOTE: The data- prefix applies to HTML5 only. Custom attributes are
invalid in HTML4, even with a data- prefix before the attribute name.
Boolean attributes, however, are valid in HTML4 as well.
As formally defined, HTML 4 does not allow attributes without a value. What is commonly regarded as attribute without value, as in <input checked>, is formally an attribute value without an attribute name (and an equals sign). Though misleadingly characterized as “boolean attributes” with special minimization rules in HTML 4 specs, those specs normatively cite the SGML standard.
By the SGML standard, whenever an attribute is declared by enumerating keywords that are the only allowed values, an attribute specification may, under certain conditions, be minimized to the value. This means that in HTML 4, the tag <input checkbox> is valid; the attribute is a minimized form of type=checkbox. No browser supports that (they parse checkbox as attribute name), but in validators, the construct passes.
In practice, the part of the attribute minimization rules that browsers support consists of just the special cases where an attribute is declared as allowing a single keyword value only, such as the checked attribute, which is formally declared with
<!ATTLIST INPUT checked (checked) #IMPLIED>
So it depends on how the attribute is declared in the HTML 4 spec.
But this means that the minimized attribute checked means checked=checked. The value is not empty but the keyword checked. On the other hand, browsers treat such attributes as “presence attributes”: what matters is whether an element has that attribute or not, not its value.
In HTML5 serialized as XHTML (i.e., as XML), things are simple: every attribute specification must be of the form name="value" or name='value', so the equals sign is required, and so are the quotation marks; logically, the value is always there, though it can be the empty string, as in alt="".
In HTML5 serialized as HTML, some attributes are defined so that an attribute value (and an equals sign) is not required. Rather confusingly, they are the attributes declared as being “boolean attributes” (it’s confusing e.g. because the values true and false are not allowed, but the name partly reflects the principle that the corresponding DOM property, or “IDL attribute” as they call it, has the truth values true and false as the only permitted values). For such attributes, by definition, the value is even immaterial; only the presence of the attribute matters. For example, for the checked attribute, no value is used, but if a value is given, it must be either the empty string (checked="") or identical with the attribute name, case insensitively (e.g., checked=Checked). Any other value is nonconforming but is required to work, with the same meaning (e.g., checked=false means the same as checked).
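The two rules in the paragraph above (which values are conforming, and what the attribute means regardless of its value) can be sketched as two small functions. Both function names are hypothetical, the attribute map is a plain object standing in for an element's attributes, and null is used here to mean "no value was written".

```javascript
// Conformance: no value, the empty string, or the attribute's own
// name (compared case-insensitively) are the only allowed forms.
function isConformingBooleanValue(attributeName, value) {
  if (value === null || value === "") return true;
  return value.toLowerCase() === attributeName.toLowerCase();
}

// Meaning: only presence matters, never the value.
function booleanAttributeIsSet(attributes, attributeName) {
  return Object.prototype.hasOwnProperty.call(attributes, attributeName);
}

console.log(isConformingBooleanValue("checked", "Checked")); // true
console.log(isConformingBooleanValue("checked", "false"));   // false
console.log(booleanAttributeIsSet({ checked: "false" }, "checked")); // true
```

The last line shows the trap described above: checked=false is nonconforming, yet the attribute still counts as set.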
Regarding the specific example, it is not valid in any version of HTML, since there is no attribute section declared.
Both snippets are syntactically valid in html4 and html5. The first is not valid xhtml, because in xhtml an attribute value is required.
On the other hand, section is not a defined attribute, but it is a valid tag in html5. Therefore your code is not valid.

Can the input tag's name attribute simply be an integer?

Are the names of input tags allowed to simply be integers?
<input type="text" name="34" />
Just asking in case I were a lazy programmer. Or if I have a ton of arbitrary fields and it's not important what they are named.
Yes, the name attribute is declared as taking a CDATA value, which means any string of characters, without imposing constraints. HTML5 does not change this, except by disallowing the empty string; its definition explicitly says: “Any non-empty value for name is allowed”.
People sometimes confuse the name attribute with the id attribute, upon which there are various constraints depending on HTML version (e.g., some versions forbid a value that starts with a digit).
Yes, they can. I wouldn't recommend it, but there's nothing wrong with using a number as a name attribute.
No. Using an integer is valid, yet it will be converted into a string.
The content attribute is the attribute as you set it from the content (the HTML code) and you can set it or get it via element.setAttribute() or element.getAttribute(). The content attribute is always a string even when the expected value should be an integer. For example, to set an element's maxlength to 42 using the content attribute, you have to call setAttribute("maxlength", "42") on that element.
ref: https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes
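The same string-only behaviour applies to field names when form data is encoded. As a rough stand-in for submitting a field like the one in the question, this sketch uses the standard URLSearchParams class; the variable names and the sample value "hello" are made up for illustration.

```javascript
// Stand-in for the encoded data of a text input with name="34"
// and the (hypothetical) value "hello".
const formFields = new URLSearchParams();
formFields.append(34, "hello"); // the numeric name is converted to "34"

const fieldNames = [...formFields.keys()];
console.log(typeof fieldNames[0]);  // "string"
console.log(formFields.get("34"));  // "hello"
console.log(formFields.toString()); // "34=hello"
```

So a numeric name works, but whatever reads the submitted data sees the key "34" as a string, not as a number.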
It's valid in HTML5, but I'd avoid it. Note that the trailing slash on a void element such as input is permitted in HTML5 but has no effect, so the tag can also be written as:
<input type="text" name="34">