After an ATTLIST declaration in the DTD, the browser renders stray characters

I declared a rel="value" attribute for the <li> element in the DTD like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd" [<!ATTLIST li rel CDATA #IMPLIED>]>
After that, my code with <li rel="value"></li> was valid, but I got another issue: the browser renders the characters "]>" in the document.
How to fix this?

You should not use an internal subset in a doctype declaration, because browsers do not understand internal subsets, or DTDs at all.
If you only add a simple attribute, it is often best to just use it carefully and “check it manually” instead of relying on a validator. To perform DTD-based validation, however, you would need to construct an external DTD: take the DTD you wish to use as a basis and add the extra markup to it. In this case, you would copy the HTML 4.01 Transitional DTD and replace
<!ATTLIST LI
%attrs;>
by
<!ATTLIST LI
rel CDATA #IMPLIED
%attrs;>
(That is, you need to provide the full list of allowed attributes, with your custom attribute added, instead of declaring an attribute list that only allows your attribute [unless that’s what you really want].)
You would then use a doctype declaration that refers to your modified copy by its URL, with
<!DOCTYPE HTML SYSTEM "dtdurl">
where dtdurl is an absolute URL for the DTD. More info: Creating your own DTD for HTML validation.
It is generally not advisable to add attributes of your own, as they may clash with attributes that might be added to HTML in some future version. According to HTML5 drafts, attributes with names starting with data- are meant for site-specific use and will never have any publicly defined meaning, so data-rel would be safer than rel.
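The DTD edit described above could be scripted. The following is a minimal sketch, assuming you have a local copy of the DTD as text; the excerpt, the function name add_li_attribute, and the regular expression are illustrative assumptions, not part of any official tooling.

```python
import re

# Hypothetical excerpt standing in for the LI attribute list of a local DTD copy.
DTD_EXCERPT = """<!ATTLIST LI
  %attrs;                              -- %coreattrs, %i18n, %events --
  >"""


def add_li_attribute(dtd_text: str, attr_decl: str) -> str:
    """Insert attr_decl as the first entry of the ATTLIST LI declaration."""
    # Match the opening line of the declaration, up to and including its newline.
    pattern = re.compile(r"(<!ATTLIST[ \t]+LI[ \t]*\n)")
    patched, count = pattern.subn(
        lambda m: m.group(1) + "  " + attr_decl + "\n", dtd_text, count=1
    )
    if count == 0:
        raise ValueError("no ATTLIST LI declaration found")
    return patched


patched = add_li_attribute(DTD_EXCERPT, "rel CDATA #IMPLIED")
print(patched)
```

The existing entries such as %attrs; stay in place, so the patched declaration still allows every attribute the original did.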

Browsers don't understand embedded SGML. They simply stop reading the doctype at the first > character. So they see the following ]> as text to be rendered.
Just don't use embedded SGML.
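The effect can be reproduced outside a browser. Here is a small sketch using Python's standard html.parser, which is not a browser engine but also ends a doctype at the first > character; the class name DoctypeProbe and the sample list markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class DoctypeProbe(HTMLParser):
    """Record what the parser treats as the doctype and what it treats as text."""

    def __init__(self):
        super().__init__()
        self.declarations = []
        self.text = []

    def handle_decl(self, decl):
        self.declarations.append(decl)

    def handle_data(self, data):
        self.text.append(data)


DOC = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd" '
    '[<!ATTLIST li rel CDATA #IMPLIED>]>'
    '<ul><li rel="value">item</li></ul>'
)

probe = DoctypeProbe()
probe.feed(DOC)
probe.close()

leaked_text = "".join(probe.text)
print(probe.declarations)  # the doctype as seen by the parser, cut at the first ">"
print(repr(leaked_text))   # character data, which should begin with the leftover "]>"
```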

Use a pseudo-attribute &gt; delimiter instead of a literal > delimiter to escape the nested > within ]>:
<!DOCTYPE HTML PUBLIC
"-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd"
[<!ATTLIST li rel CDATA #IMPLIED&gt;]>
References
Pseudo-Attributes

Related

Using an individual tag without breaking the standards?

I would like to create some kind of API where people can include hidden information inside a website, so that a bot can read the information.
I know it is possible with meta tags, but I am considering using some kind of custom tag, because then I can use the DOM, which is a bit more comfortable to work with, and the markup is easier for humans to read.
Example:
<html>
...
<body>
...
<mytag id="123" foo="bar" bar="foo"></mytag>
...
<mytag id="345" foo="bar" bar="foo"></mytag>
...
</body>
</html>
My question is whether it is possible to make this custom tag somehow conform to the standards, maybe by creating some kind of DTD.
I would like to support HTML 4.01, XHTML and HTML 5, if possible.
Having to support HTML 4.01 and HTML5 makes this hard. You can’t use meta-name elements (would work for HTML 4.01, but they have to be registered for HTML5), you can’t use custom data-* attributes (not allowed in HTML 4.01), you can’t use Microdata (only defined for HTML5+), you can’t use custom elements (only defined for HTML5+).
I can think of two ways.
script element as data block
In HTML5, the script element can also be used for data blocks. Examples: text/html, text/plain.
The HTML 4.01 spec doesn’t define it like that, but it should still be possible/valid (it’ll understand it as "script", but user agents are not expected to try to run it if they don’t recognize the content type as possible for scripts).
Drawback: The content is not part of the document’s DOM.
RDFa
It’s allowed in HTML 4.01 and HTML5 (you might have to adapt the DOCTYPE for the older HTML versions, e.g., for XHTML).
You can’t use custom elements, but you can add property and content attributes (for name-value pairs), and you could use typeof for "items" (e.g., what you would use the element name for), and you can make use of meta and link elements (visually hidden by default) in the body.
<div vocab="https://api.example.com/voc#" class="the-hidden-information">
<div typeof="Item-123">
<meta property="foo1" content="bar1" />
<meta property="foo2" content="bar2" />
</div>
<div typeof="Item-345">
<meta property="foo1" content="bar1" />
<link property="foo5" href="/some-url" />
</div>
</div>
(When using RDFa 1.0 instead of 1.1, you’d have to use xmlns instead of vocab.)
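On the bot side, reading such markup could look roughly like the sketch below. It uses Python's standard html.parser and only groups property values by the nearest enclosing typeof; it is not a conforming RDFa processor (it ignores vocab resolution, for instance), and the class name ItemReader is an assumption.

```python
from html.parser import HTMLParser

MARKUP = """
<div vocab="https://api.example.com/voc#" class="the-hidden-information">
<div typeof="Item-123">
<meta property="foo1" content="bar1" />
<meta property="foo2" content="bar2" />
</div>
<div typeof="Item-345">
<meta property="foo1" content="bar1" />
<link property="foo5" href="/some-url" />
</div>
</div>
"""


class ItemReader(HTMLParser):
    """Collect property values grouped by the enclosing div's typeof value."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._stack = []  # one entry per open div: its typeof value or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            typeof = attrs.get("typeof")
            self._stack.append(typeof)
            if typeof is not None:
                self.items.setdefault(typeof, {})
        elif tag in ("meta", "link") and "property" in attrs:
            # Attach the value to the innermost open div that declares typeof.
            current = next((t for t in reversed(self._stack) if t), None)
            if current is not None:
                key = "content" if tag == "meta" else "href"
                self.items[current][attrs["property"]] = attrs.get(key)

    def handle_endtag(self, tag):
        if tag == "div" and self._stack:
            self._stack.pop()


reader = ItemReader()
reader.feed(MARKUP)
reader.close()
items = reader.items
print(items)
```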

What is the purpose of the <html> element?

Doesn't the file type already let the browser know that the document is an HTML document? MDN mentions that it is the root element, so is using it just a formality?
It is a family trait of HTML, XML, and SGML that they all need to be nested inside a root element. It's just part of the data standard and lets the interpreter know where to start and stop, and verifies that the document is complete and well-formed.
<!DOCTYPE html> specifies the type of document. That form currently means HTML5, as opposed to, for example, XML or XHTML 1.0 Transitional. Keep in mind that if you are downloading these as byte streams you may not always know the file type.
Yes. The <html> tag is the root and can even be omitted in some cases (from MDN):
The start tag may be omitted if the first thing inside the <html> element is not a comment. The end tag may be omitted if the <html> element is not immediately followed by a comment, and it contains a <body> element either that is not empty or whose start tag is present.
But it is more than a formality:
It can be styled with CSS (though styling the <body> will usually be enough).
It can have global attributes, especially lang, which is the W3C way of defining an HTML document language.
There is probably more to say but that’s what I see as arguments for the <html> element, apart from its main role of being the root element for an HTML document.

Validity of href attribute without a value

Is <a href>some text</a> a valid html code?
This one <a href="">some text</a> is valid.
What about the first one? Is that a typo, or is it allowed to skip =""?
The markup <a href>some text</a> is valid (and equivalent to <a href="">some text</a>) according to HTML5 CR in HTML serialization, but not otherwise.
General HTML5 rules on HTML serialization (HTML syntax) allow empty attribute syntax: “Just the attribute name. The value is implicitly the empty string.” And the empty string is a valid URL, referring to the current document.
In XHTML, <a href>some text</a> is invalid and not even well-formed, since well-formedness rules (i.e., general XML syntax rules) require that an attribute specification is of the form name="value" or name='value', with no shortcuts.
In earlier HTML specs, up to and including HTML 4.01, <a href>some text</a> is invalid, but on other grounds. By the formal rules, an attribute value may never be omitted from an attribute specification, but the name and the equals sign may be omitted if the attribute is declared with an enumerated set of values. So <a href>some text</a> would be valid if the a element had an attribute declared with enumerated values, one of which is href (and there were only one such attribute). But there is no such attribute.
It depends on your doctype. More importantly, the rendering depends on the client's browser implementation. Chrome, Firefox, IE 8 and later, and similar browsers know what you meant and can pick up the pieces just fine.
HTML5
<!DOCTYPE html>
The validator says:
Valid, but WARNING: Attribute href without an explicit value seen. The attribute may be dropped by IE7.
XHTML1.0 Strict and XHTML1.0 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
and
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
The validator says:
Invalid: "href" is not a member of a group specified for any attribute
HTML 4.01 Strict and HTML 4.01 Loose
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"><html>
and
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html>
The validator says:
Invalid: "HREF" is not a member of a group specified for any attribute
No it's not valid. Use:
<span class="pseudo-link">Some text</span>
and define styles in CSS for class pseudo-link to look like a normal link.
:hover selector will be important to change font color and underline the text when mouse is over it.
You can then define an action for this pseudo-link with JavaScript.

Html "Already defined" Error

I am debugging a layout right now and have come across some strange errors. I am serving the page as DTD XHTML 1.0 Strict.
The errors look like this:
ID "OFFICENAME" already defined:
div class="office" id="officename"
ID "OFFICENAME" first defined here
span id="officename">
and
NET-enabling start-tag requires SHORTTAG YES
This second error is reported for the line break markup:
<br />
Can anyone help me with this and tell me the correct way to write it?
id must be unique. You can't have two elements with the same ID. You should remove one of the ids or use class instead. You can have multiple classes on any given element, e.g.:
class="office officename"
In HTML/SGML the meaning of / differs from XHTML: <foo/bar/ means <foo>bar</foo>, and <foo/> means <foo></foo>> (an archaic quirk supported only by the W3C validator).
You're probably sending XHTML markup as HTML. Use the text/html MIME type with the HTML5 DOCTYPE instead (you'll get better compatibility, better validation, and the /> talismans will be allowed).
<!DOCTYPE html>
You can't have multiple elements with the same id. Change the id on the span or the div to something else.
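Duplicate ids can also be caught before running the validator. A minimal sketch with Python's standard html.parser follows; the helper name duplicate_ids and the sample markup (modeled on the elements named in the error message) are assumptions.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count how often each id value appears on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(markup):
    """Return the sorted list of id values used more than once."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return sorted(i for i, n in collector.ids.items() if n > 1)


BROKEN = (
    '<div class="office" id="officename">'
    '<span id="officename">Head office</span>'
    '</div>'
)
FIXED = (
    '<div class="office" id="officename">'
    '<span class="officename">Head office</span>'
    '</div>'
)

broken_result = duplicate_ids(BROKEN)
fixed_result = duplicate_ids(FIXED)
print(broken_result, fixed_result)
```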

Is this valid DTD? (official html 4.01 dtd)

The following declaration appears in the HTML 4.01 DTDs:
<!ELEMENT STYLE - - %StyleSheet -- style info -->
(see http://www.w3.org/TR/REC-html40/sgml/dtd.html it's in both strict.dtd and loose.dtd)
Apparently, the ; is missing after %StyleSheet. The reference should have been %StyleSheet;
But this is the official DTD of the holy HTML, by far the most important DTD of all DTDs, so what's going on there? Is it a valid entity reference like that?
It is valid without the semicolon in HTML 4.01 DTDs. Here's an extract from the W3C's HTML 4.01 Specification - On SGML and HTML:
... Instances of parameter entities in a DTD begin with "%", then the parameter entity name, and terminated by an optional ";".
In an XHTML DTD it wouldn't be valid; they follow this recommendation (because XHTML is XML): Extensible Markup Language (XML) 1.0 (Fifth Edition) - Character and Entity References:
... Definition: Parameter-entity references use percent-sign (%) and semicolon (;) as delimiters.
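The difference between the two rules can be illustrated with a pair of regular expressions. This is only a sketch: the name patterns are simplified assumptions and do not reproduce the full name syntax of SGML or XML.

```python
import re

# SGML-style reference: "%", a name, then an optional ";".
SGML_PE_REF = re.compile(r"%([A-Za-z][A-Za-z0-9.\-]*);?")
# XML-style reference: "%", a name, then a mandatory ";".
XML_PE_REF = re.compile(r"%([A-Za-z_:][A-Za-z0-9._:\-]*);")

DECLARATION = "<!ELEMENT STYLE - - %StyleSheet -- style info -->"
WITH_SEMICOLON = DECLARATION.replace("%StyleSheet", "%StyleSheet;")

# Without the semicolon, only the SGML-style pattern finds the reference.
sgml_refs = SGML_PE_REF.findall(DECLARATION)
xml_refs = XML_PE_REF.findall(DECLARATION)

# With the semicolon, both patterns find it.
sgml_refs_semicolon = SGML_PE_REF.findall(WITH_SEMICOLON)
xml_refs_semicolon = XML_PE_REF.findall(WITH_SEMICOLON)

print(sgml_refs, xml_refs, sgml_refs_semicolon, xml_refs_semicolon)
```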