I'm looking at an ancient website and I see !style="width:380px; height:30px;" inside an input element. What does the exclamation mark mean?
It does not mean anything. It is invalid markup, and no meaning is assigned to it, any more than an attribute specification like vjkfhjidfhgsi="fbhhjgf" has a meaning.
It has an effect, though, but only in the sense that the specification is parsed and an entry is created from it into the attributes object of the element node, with !style as the attribute name. This has no effect in itself, but such an attribute could be used in scripting and in CSS.
If you meant to ask why the exclamation mark was used, then your guess in your own answer may well be correct. Though invalid, the method is probably safer now than it was in the past. In the old days, syntax error handling was not defined for HTML, and one might argue that HTML parsers could well just skip a character like ! in this context and process the tag as if it were not there. In HTML5 parsing rules, which largely just make common browser practice the rule, specify the ! is to be taken as the first character of an attribute name, see 8.2.4.34 Before attribute name state and the next clause.
Since there are no way to comment out an attribute in HTML, this is probably a hack to make the attribute invalid and thus the styles are not applied. The exclamation mark is probably a convention used by the developer to mark all his/her unneeded attributes.
Related
I'm considering writing code which produces an HTML tag that could have duplicate attributes, like this:
<div data-foo="bar" class="some-class" data-foo="baz">
Is this legal HTML? Does one of the data-foo-values take precendence over the other? Can I count on semi-modern browsers (IE >= 9) to parse it without choking?
Or am I about to do something really stupid here?
It is not valid to have the same attribute name twice in an element. The authoritative references for this are somewhat complicated, as old HTML versions were nominally based on SGML and the restriction is implied by a normative reference to the SGML standard. In HTML5 PR, section 8.1.2.3 Attributes explicitly says: “There must never be two or more attributes on the same start tag whose names are an ASCII case-insensitive match for each other.”
What happens in practice is that the latter attribute is ignored. Well, future browsers might do otherwise. In the DOM, attributes appear as properties of the element node as well as in the attributes object, so there would be no natural way to store two values.
It's not technically valid, but every browser will ignore duplicate attributes in HTML documents and use the first value (data-foo="bar" in your case).
Using the same attribute name twice in a tag is considered an internal parse error. It would cause your document to fail validation, if that's something you're worried about. However, it's important to understand that HTML 5 defines an expected result even for cases where you have a "parse error". The parser is allowed to stop when it encounters an error, but if it chooses not to stop it must produce a specific result described in the specification. In practice, no browsers choose to stop when encountering errors in HTML documents (XML/XHTML is a different matter), so all modern browsers will handle this case successfully and consistently.
The WHATWG HTML specification describes this case in section 12.2.4.33 "Attribute name state":
When the user agent leaves the attribute name state (and before emitting the tag token, if appropriate), the complete attribute's name must be compared to the other attributes on the same token; if there is already an attribute on the token with the exact same name, then this is a parse error and the new attribute must be dropped, along with the value that gets associated with it (if any).
See also its description of "parse error" from the opening of section 12.2 "Parsing HTML documents":
Certain points in the parsing algorithm are said to be parse errors. The error handling for parse errors is well-defined (that's the processing rules described throughout this specification), but user agents, while parsing an HTML document, may abort the parser at the first parse error that they encounter for which they do not wish to apply the rules described in this specification.
I wanted to add a comment to the excellent accepted answer, but my reputation is not high enough.
I wanted to add it is important to consider how your code gets compiled.
For example, Angular removes prior duplicate (non-angular) class attributes and only keeps the last one.
Note: Angular also modifies the value of the class attribute with ngClass and any [class.class-name] attributes.
This is also something you can use linter for.
See htmlhint (attr-no-duplication) or htmllint (attr-no-dup).
According to many recent HTML specs, when we are using custom attributes (meaning any attributes not defined in the spec), we should prefix them with data-. However, I see no reason to have to do this (unless you require perfectly valid HTML, obviously). Pretty much all current browsers correctly ignore custom attributes, meaning no conflicts except with identically-named attributes from others' code, and we can ignore even this with custom prefixes or something similar (as suggested on the AngularJS directive page). What, if any, other benefits are there? This question has been asked before, at least twice, but both are pretty old.
I forget where I read it, but some guide said custom HTML tags need dashes, and single-word tags aren't valid. First of all, why? Second, should we do this, and why (besides being valid)? Would there be any problem with underscores or camelCase, etc.? Also, conflicts with existing elements shouldn't be a problem, if, like with data attributes, you prefix or suffix them, etc. See the Angular directive page again.
I'm sure all these questions have been asked before, but I'm combining them into one. Is that a good idea (quick, someone ask on Meta)?
The data-* attributes have two advantages:
It is a convention meaning other programmers will understand quickly that it is a custom attribute.
You get a DOM Javascript API for free: HTMLElement.dataset. If you use jQuery, it leverages this to populates the keys and values you find with .data().
The reason for the - in custom element names is for two basic reasons:
It is a quick way for the HTML parser to know it is a custom element instead of a standard element.
You don't run into the issue of a new standard element being added with the same name which would cause conflict if you register a custom Javascript prototype for the DOM element.
Should you use your own custom element name? Right now it is so new that don't expect it to be fully supported. Let's say it does work. You have to balance the issue of the extra complexity with the benefit. If you can get away with a classname, then use a classname. But if you need a whole new element with a custom Javascript DOM prototype for the element, then you may have a valid usage of it.
I know that technically HTML5 is a 'living spec' but I'm wondering if it's compliant to have trailing spaces inside of a class name. I didn't see any reference to this scenario in the spec, but one of my teammates said it was invalid. Maybe I missed something?
It'd be a pain to trim those spaces (I'm working inside of a large Java ecom application) and I want to know if it's worth doing for SEO, validation or any other purpose. I get the feeling it's not... haha
Yes, it is compliant.
From
http://www.w3.org/html/wg/drafts/html/master/dom.html#classes:
The attribute, if specified, must have a value that is a set of space-separated tokens representing the various classes that the element belongs to.
From
http://www.w3.org/html/wg/drafts/html/master/infrastructure.html#set-of-space-separated-tokens:
A string containing a set of space-separated tokens may have leading or trailing space characters.
As we can see in the link below, there is no restriction on what the developer can use in the class attribute.
http://www.w3.org/html/wg/drafts/html/master/dom.html#classes
In fact, after saying what the classes are and that they're used spliting in the spaces, the author(s) says:
There are no additional restrictions on the tokens authors can use in
the class attribute, but authors are encouraged to use values that
describe the nature of the content, rather than values that describe
the desired presentation of the content.
Our colleagues here have tested and it successfully passes W3C validation, so I can't guess why your friend thought it was invalid.
According to http://validator.w3.org/ under a <!DOCTYPE html> the following validates successfully.
<div class=" name1 name2 "></div>
Leaving a trailing and leading space may be acceptable, but it's not pretty and some people do not consider it to be best a practice.
Anything that knows how to handle class attributes should be fine with this. It is possible to give a tag multiple classes by separating them with spaces so anything which reads them had better know how to deal with spaces.
In HTML 5 specification the parser and the specification state that the element name can be everything starting with a letter and followed by alpha-numeric characters.
Now the question is what happens if I introduce additional elements not part of the specification but valid in terms of compliance to the specified syntax.
What do all those browsers do when they encounter elements with custom yet unknown name? Does those elements got treaten like any element or are they left out, stripped out or replaced?
How do for instance do HTML5 editor behave?
Is there anything in the specifications I have overlooked regarding valid element tag names?
[Update]
The specification was missleading here since it states the alpha-numerical character of the HTML element names. While reading a HTML 5 Specification, I missconcluded that this is true for all element names.
That is appearently wrong. In the parser section it states that a element name must only start with an ASCII letter and after that letter everthing except:
"tab" (U+0009)
"LF" (U+000A)
"FF" (U+000C)
U+0020 SPACE
"/" (U+002F)
">" (U+003E)
U+0000 NULL
EOF
Beside those mentioned characters which require special treatment involving errors or ending the tag name all other possible characters seam to be allowed.
Anything else
--> Append the current input character to the current tag token's tag name.
From my field test also additional uni-code letters are allowed for the first letter by several parsers (at least they are graceful with those).
[/Update]
In HTML 5 specification the parser and the specification state that
the element name can be everything starting with a letter and followed
by alpha-numeric characters.
Incorrect. The specification states that the element name must be one of the names explicitly listed in that document, or in another applicable specification. These include but are not limited to SVG and MathML.
The specification also includes a processing specification for consumers of HTML, such as browsers. This doesn't describe what's "allowed", it describes what those consumers should do with each character of the document regardless of whether it contains things that are allowed or not allowed.
Now the question is what happens if I introduce additional elements
not part of the specification but valid in terms of compliance to the
specified syntax.
The above rules are followed. The "specified syntax" is irrelevant. The specification describes what the consumer should do for any input stream of characters.
What do all those browsers do when they encounter elements with custom
yet unknown name? Does those elements got treated like any element or
are they left out, stripped out or replaced?
They are treated as elements in the http://www.w3.org/1999/xhtml namespace which implement the HTMLUnknownElement interface.
How for instance do HTML5 editor behave?
If they are HTML5 compliant they will behave the same way when reading in the HTML.
Is there anything in the specifications I have overlooked regarding
valid element tag names?
See the first paragraph above. Also the Custom Elements spec which makes any element name starting with an ASCII letter and containing a hyphen to be considered valid. It is unclear whether that specification is currently an "HTML5 applicable specification" but if not, it will very probably be one soon.
I know that data- Attributes is part of HTML 5. It seems to be a good choice to use it to serialize some data in markup. So there are people using data-bind="xxx" . But can I just use bind="xxx". It seems violate schema, of specification, but practically it works in all browser. So my question is, what is the practical reason (not in theory) like performance that I should not use customs attribute just like bind="xxx". I know bind attribute is not reserved attribute.
Thanks
Practically, some browser can implement bind with a totally different meaning.
You're using it for Knockout, but hypothetically the new meaning is different. When you change the inline CSS on one element, it should change it on another element based on a selector in the bind attribute.
There's a reason to respect standards and use private (e.g. data-) or vendor-specific prefixes.
I know this question is old but I had the same question as Fred. I found the answer and I wanted to share it. There is no practical reason not to use an arbitrary attribute name other than what was mentioned in another answer. That is a vendor or browser maker may decide to use the name for something else.
It is however not allowed in the spec, kind-of, while the spec does read that arbitrary attribute names should not be used, there is nothing preventing their use. This is like the US Code for Displaying a flag, it's a set of rules but no way to enforce it.
Contrast this to a rule that is enforced such as a single body tag within an html document, etc. and you can realize that there is nothing enforcing using arbitrary names for attributes.
So what remains is why should you not use arbitrary attribute names? Since it will not pass the W3 html validator, it can be argued that societal pressure may increase to the point that your code will not be acceptable. For example, SEO may suffer, a future employer may not like sloppy code, etc.
I certainly became embarrassed as I make liberal use of arbitrary attribute names to pass data. Something that I will stop immediately. I am now convinced that using the data-* attribute will make my code easier to read, easier to maintain and clean enough for peer review.