Empty HTML tags - html

I have a lot of empty span tags and so on in my code, due to CSS image replacement techniques.
They trigger an HTML validation warning.
Should I care?

I agree with @Gumbo, but having lots of warnings that make no difference can hide the ones that you do need to pay attention to. If you can put something in the span that gets rid of the warning but doesn't break your UI, you should at least think about doing it.

I've made validation part of my workflow because it helps me catch mistakes early. And while I don't consider empty elements to be a problem, it negates some of the value of using a validator if I have to mentally parse a list of warnings each time and decide whether a warning is important or not. So I try to keep my pages both error- and warning-free so that a quick glance at the HTML Validator icon in the Firefox status bar only changes when there is a real problem. To that end I keep empty elements "unempty" by inserting an empty comment.
<span><!-- --></span>
(At least that works with the Tidy validator.)
Now, that being said, I don't think this is at all necessary for many situations. It is perfectly reasonable to think that adding eight extra characters to your code just to avoid a validator warning is ridiculous. But it works for me.
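To see why the empty comment is enough to satisfy a check of this kind, here is a minimal sketch in Python using the standard library's html.parser. It is not Tidy's actual rule set; the class name EmptySpanFinder and the helper count_empty_spans are made up for illustration, and treating a comment as content is an assumption that simply mirrors the behaviour described above.

```python
from html.parser import HTMLParser


class EmptySpanFinder(HTMLParser):
    """Counts <span> elements that contain nothing at all (hypothetical checker)."""

    def __init__(self):
        super().__init__()
        self.open_spans = []   # one "has content" flag per currently open <span>
        self.empty_count = 0

    def _mark_content(self):
        # Any content makes every enclosing <span> non-empty.
        self.open_spans = [True] * len(self.open_spans)

    def handle_starttag(self, tag, attrs):
        self._mark_content()
        if tag == "span":
            self.open_spans.append(False)

    def handle_endtag(self, tag):
        if tag == "span" and self.open_spans:
            if not self.open_spans.pop():
                self.empty_count += 1

    def handle_data(self, data):
        if data.strip():
            self._mark_content()

    def handle_comment(self, data):
        # Assumption: a comment counts as content, as it does for the warning above.
        self._mark_content()


def count_empty_spans(markup):
    finder = EmptySpanFinder()
    finder.feed(markup)
    finder.close()
    return finder.empty_count


print(count_empty_spans("<span></span>"))           # the plain empty span
print(count_empty_spans("<span><!-- --></span>"))   # the "unempty" variant
```

A real validator applies many more rules, so treat this only as a picture of the idea.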

You should consider the behavior of the page for things like screen readers. It is common to actually put a few words describing the image in the tag that are then hidden by the image replacement.
See the CSS Zen Garden where you can see examples like H1 spans with text being replaced in CSS by images.
This will improve not only the accessibility of your site, but also its searchability.

An "empty" tag has a very specific definition in HTML:
<span/>
versus
<span></span>
The former is not permitted by the HTML 4.0 Strict DTD, so should be flagged by a validator. The only tags that can use the former syntax are those specifically identified as "EMPTY" in the DTD (eg, <br>).
The second form is valid HTML, and does not get flagged by the W3C validator.
So I have to assume that either (1) your validator is broken, or (2) you are using the tag incorrectly.

A warning is not an error. It’s just a reminder that you should improve something.

I suppose if bandwidth were an issue, those empty tags could be revisited to see if you could keep them from appearing altogether.

Eric Meyer also says that empty tags are bad, semantically.
A warning doesn't mean it's wrong, but says it could, or sometimes should, be better.
In the same way
if("value"==variable)
is better than
if(variable=="value")

Related

Do we really need to use </body> and </html> closing tags?

More often than not I see HTML without the closing tags, especially body and html.
According to:
http://www.w3.org/TR/html5/sections.html#the-body-element
http://www.w3.org/TR/html5/semantics.html#the-html-element
These can be omitted, but what about cross-device issues? For example, could such HTML fail on Android, Windows Phone, or some other platform because the closing tags are missing?
Do we need it? Well, that depends on your DTD. If you're using XHTML, then yes, you will need it to conform. For accessibility's sake I would include the closing tags: you never know if there's a screen reader (or other piece of software) out there that only parses valid XHTML, and you could be hindering partially sighted people, for example.
Google will also, apparently, rank your valid documents higher than invalid documents in their listings.
Here's a document by a friend of a friend that answers your question a bit better; granted that it was written in 2008, I think some of the points still apply.
If you ever need to use the same html in an XHTML application you won't need to mess around with it, you can just copy it across and not have to worry about conforming (because you already are).
On a separate note, you are essentially future proofing your markup. Who's to say that the spec won't eventually change to "You must include the closing head and body tags"? You won't need to worry if you already have them. It is, however, highly unlikely that the spec will change to, "You must not include the closing head and body tags".
As a great man once said:
Should I close the lid of the toilet when I'm finished? Yes,
especially if the wife is going to use it after me.
- Darren Gourley (Nov 2015)
Use https://validator.w3.org/
Select your standards target. If it says the page passes, then I think it's good enough.
Please bear in mind the HTML5 spec is still being defined/evolved.
Technically you could omit the html, head and body tags altogether, as long as the markup meets the following conditions:
http://www.w3.org/TR/2011/WD-html5-20110525/syntax.html#optional-tags
In regards to your comment about half your team using them and half your team not I would suggest that as long as either option is technically standards compliant you just choose one and move on as the entire subject is open for discussion and interpretation. My personal opinion would be that it's probably more important for your team to get on the same page and produce work of a similar standard especially if you have more than one person working on a project simultaneously.
You can leave out the end tags. Indeed you can leave out the opening tags, too (obviously not if you are using any attributes on them).
Not only is that the case with the more recent standard, but it's been the case since the very beginning. (The obvious exception being if you are using the XML syntax, since XML itself requires all elements have an explicit closing tag.)
Browsers have been dealing with the existence of HTML documents lacking the trailing closing tags since the 1990s. If the standards hadn't allowed it they'd probably still have dealt with them, much as they try their best to deal with all manner of messy code. (This causes its own problems, which was one of the motivations behind XML not allowing optional tags, but that's another matter.)
Many people consider it poor style. I would be one of them. But it's certainly widely supported.

HTML-Input tag syntax

I feel like this might have already been answered on SO, but I couldn't find a similar question.
I've been told by some more experienced programmers that the following is incorrect syntax :
<input type=...> Name </input>
and that the following is correct syntax :
<input type=..../>
I've never known about this, I've always used the first snippet and have never encountered any problems that I know of. Could someone explain why the first is incorrect, and why it still works? Is it rescued by the browser, or is it just a style issue? Any explanation concerning the syntax above is acceptable, as I don't want to risk looking like a noob the next time I'm asked to write html code.
Could someone explain why the first is incorrect
An input element is not allowed to have child nodes, everything about the input is described by its attributes. To associate a label with an input, use a <label> element.
and why it still works?
Browsers are designed to cope with bad input. They ignore end tags for elements that are not open.
Input tags are known as void elements, i.e. they aren't designed to contain text or other elements, and as such do not need a closing tag.
Realistically, the validity would depend on which doctype you declare.
In this context, /> is valid with HTML 5 and XHTML doctypes and invalid with HTML 4.01 doctypes.
Browsers are quite forgiving, but your friend is right and the syntax is wrong. Input elements should not have content. There are plenty of resources about this, one of them is w3schools which says:
Note: The <input> element is empty, it contains attributes only.
Tip: Use the <label> element to define labels for elements.
You can also find a validator online which will validate your page and report any issues to you: W3 validator. It's well worth the effort to validate a page before publishing it. Even though browsers are forgiving and will try to display a page as well as possible, errors like yours make a page invalid and increase the risk that a page is rendered incorrectly or, theoretically, not at all.
In this particular case, a browser might choose to display the text before or after the input, inside the input (not likely), or not at all. By making an HTML document valid, you decrease the risk of a bad surprise.
The HTML specification about the input element says this:
Content model:
- Empty.
and:
Tag omission in text/html:
- No end tag
That means that the element doesn't have any content and doesn't have any ending tag.
In HTML you write the tag like this:
<input type= ... >
In XHTML the tag follows XML standard, so a tag without content is self closed:
<input type= ... />
In HTML5 either way of writing it is valid.
The reason that the code still works is that browsers try to make the best of invalid code. The extra </input> tag doesn't stop anything from working, the browsers will just ignore it. Either the browser doesn't understand what it is supposed to be, or the browser vendor anticipated that specific error and the browser knows to ignore it.
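The kind of check a validator performs here can be sketched in a few lines of Python with the standard library's html.parser. This is only an illustration, not how any particular validator works: VoidEndTagChecker and find_stray_end_tags are hypothetical names, and the set of void elements is written out by hand from the HTML specification.

```python
from html.parser import HTMLParser

# Void elements: they have no content and no end tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class VoidEndTagChecker(HTMLParser):
    """Records end tags written for void elements, e.g. </input>."""

    def __init__(self):
        super().__init__()
        self.stray_end_tags = []

    def handle_startendtag(self, tag, attrs):
        # <input ... /> is a single tag, so it must not count as an end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            self.stray_end_tags.append(tag)


def find_stray_end_tags(markup):
    checker = VoidEndTagChecker()
    checker.feed(markup)
    checker.close()
    return checker.stray_end_tags


print(find_stray_end_tags('<input type="text"> Name </input>'))
print(find_stray_end_tags('<label>Name <input type="text"></label>'))
```

The first call reports the stray end tag from the question's snippet; the second, which uses a label as suggested above, reports nothing.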

Is leaving out end tags valid?

I remember reading a while ago that in some cases leaving out end tags (</li>, for example) speeds up the rendering (and loading/parsing, since there are fewer bytes) of a webpage.
Unfortunately, I forgot where I read this, but I remember it saying this feature was specific to HTML 4.0.
Since I no longer have access to this source I was wondering if someone can confirm this or link to the documentation on w3c (since I wasn't able to find it myself)?
Thanks!
EDIT: Forgot to mention that I meant to ask if this behaviour is also available in HTML5.
EDIT 2: I managed to find the article again, and it does mention that it only improves the download speed of the page, not the actual rendering:
One good reason for leaving out the end tags for these elements is because they add extra characters to the page download and thus slow down the pages. If you are looking for things to do to speed up your web page downloads, getting rid of optional closing tags is a good place to start. For documents that have lots of paragraphs or table cells this can be a significant savings.
Sorry for asking a pointless question! :(
Here is the list of HTML 4.01 elements.
http://www.w3.org/TR/html401/index/elements.html
The End Tag column says where end tags are optional.
However, take note that this is valid only in HTML 4.01. In XHTML, all end tags are required. Not 100% sure about HTML5.
I wrote a HTML parser once, and believe me, if you're a parser and you're inside a <p> and you encounter a </table> end tag, it's slower to check in your document tree if that is correct, and if so, to close the current <p> first, than if you simply encounter a </p>.
Edit:
Ah, found it: http://dev.w3.org/html5/html-author/#index-of-elements
Same requirements as HTML 4.01.
New edit:
Oh, that was a page from 2009. This one is more up to date:
http://dev.w3.org/html5/spec/syntax.html#optional-tags
Some tags in some version of the HTML spec have optional end tags. However, I believe it is generally considered bad form to exclude the end tag.
As mentioned, the end tag of li is optional in html4:
http://www.w3.org/TR/html401/struct/lists.html#h-10.2
so technically this is valid:
<ul>
<li>
text
<li>
<span>stuff</span>
</ul>
But you are only saving 5 characters per li, not really worth what you lose in readability/maintainability.
EDIT: The HTML5 spec is sort of interesting:
An li element's end tag may be omitted if the li element is
immediately followed by another li element or if there is no more
content in the parent element.
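The rule quoted above can be pictured with a small Python sketch built on the standard library's html.parser. It is a deliberately simplified stand-in for what a browser's tree builder does: ListItemCollector and list_items are made-up names, and nested lists are not handled.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collects the text of each <li>, closing an open <li> implicitly.

    Simplifying assumption: lists are not nested.
    """

    def __init__(self):
        super().__init__()
        self.items = []       # text of each finished list item
        self.current = None   # text pieces of the <li> being read, or None

    def _close_item(self):
        if self.current is not None:
            self.items.append("".join(self.current).strip())
            self.current = None

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._close_item()   # a new <li> ends the previous one
            self.current = []

    def handle_endtag(self, tag):
        if tag in ("li", "ul", "ol"):
            self._close_item()   # explicit </li>, or the end of the parent list

    def handle_data(self, data):
        if self.current is not None:
            self.current.append(data)


def list_items(markup):
    parser = ListItemCollector()
    parser.feed(markup)
    parser.close()
    return parser.items


without_end_tags = "<ul>\n<li>\ntext\n<li>\n<span>stuff</span>\n</ul>"
with_end_tags = "<ul><li>text</li><li><span>stuff</span></li></ul>"
print(list_items(without_end_tags))
print(list_items(with_end_tags))
```

Both inputs yield the same two items, which is the point of the rule: the next li, or the end of the parent, is enough to tell where an item stops.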
Leaving out ending tags is usually forgiven by browsers (they're generally smart enough to know what you're doing). However, any CSS or JS properties attached to the unclosed tag can affect descendant and/or sibling tags, leaving you scratching your head as to what happened.
While XHTML does expect you to add a closing forward slash to self-contained tags, HTML 5 does not.
XHTML: <img src="" />
HTML5: <img src="">
If you're writing using an xhtml DOCTYPE, then the answer is 'yes', they are required. An xhtml document needs to be valid XML, which means that all tags need to be properly closed.
An HTML document is a bit less fussy. Some tags are specified as being 'self closing', which means you don't need to close them specifically. These include <br>, <img>, etc.
The browsers are generally pretty lenient, because they need to be able to cope with badly written code. But beware that sometimes skipping closing tags can result in different browsers interpreting your code differently, and producing hard-to-debug layout glitches.
In terms of page load speed, you might be right that there would be a marginal gain to be had in download speed and bandwidth costs, but it would be marginal. In terms of rendering, I suspect you'd actually lose speed if you provided invalid HTML, as the browser would have to work harder to parse it.
So even if there is a speed gain to be had it will be marginal, and I don't think skipping closing tags deliberately is a worthwhile exercise. It might possibly be helpful to reduce bandwidth if you're running a site that has massive traffic, but very few of us are writing for Facebook or Google; for virtually everyone else, it's better to write valid code than to try to shave those few bytes.
If you're that worried about bandwidth and page loading speeds, there are likely to be other, better ways to reduce your page load sizes than this. For example, compressing your files with gzip will drastically reduce your bandwidth, with zero impact on your code or the browser. gzip compression can be configured in your web server, so you just switch it on and forget about it. You can also 'minify' your CSS and JS code by stripping out unnecessary white space. (HTML can also be minified to a certain extent, but beware that white space is syntactically relevant in HTML, so minifying may not be the right thing to do in all cases.)
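A rough way to compare the two approaches yourself is to measure them. The Python sketch below builds a synthetic 1000-item list (invented test data, not a real page) with and without </li> end tags and compares the raw sizes against the gzip-compressed sizes. Exact compressed sizes depend on the content and the zlib version, so none are quoted here.

```python
import gzip


def build_list(n, close_tags):
    """Build a synthetic <ul> with n items, optionally writing </li> end tags."""
    end = "</li>" if close_tags else ""
    rows = "".join(f"<li>item {i}{end}\n" for i in range(n))
    return f"<ul>\n{rows}</ul>\n".encode("utf-8")


with_tags = build_list(1000, close_tags=True)
without_tags = build_list(1000, close_tags=False)

# Dropping "</li>" saves exactly 5 bytes per item before compression.
raw_saving = len(with_tags) - len(without_tags)

gzip_with = len(gzip.compress(with_tags))
gzip_without = len(gzip.compress(without_tags))

print("raw, with end tags:     ", len(with_tags))
print("raw, without end tags:  ", len(without_tags))
print("raw saving:             ", raw_saving)
print("gzip, with end tags:    ", gzip_with)
print("gzip, without end tags: ", gzip_without)
```

On repetitive markup like this, the compressed file that keeps its end tags comes out smaller than the uncompressed file that drops them, which is the trade-off described above. Real pages are less repetitive, so measure your own before deciding.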
AFAIK, in XHTML you must always at least self-close a tag <img ... />
In HTML (non xml-html) some tags do not need to be closed. <img> for instance. However, I'd suggest making sure you know exactly which version you're targeting and use W3C's validation service to double-check.
http://validator.w3.org/
I don't see how this would speed things up except that you'd have to send fewer bytes of data per page (no /'s for some tags, no closing tags for others). As for building the DOM, I don't know the details of a given implementation (WebKit, Mozilla, etc.) to know which way is faster to parse. I would imagine XML is, simply because it is more regular.
EDIT: Yes this behavior is available in HTML5. Note that the help pages are confusing, such as:
http://www.w3schools.com/html5/tag_meta.asp
Meta elements in non-XML HTML do not require the /, but they can have it. Because of the (in my opinion) leaning towards XML-flavored HTML, the ending slash is more prevalent in written HTML, but you can see they use both styles in the document. The Validator will let you know for sure what you can get away with. :)
In HTML 4.01, which became a W3C Recommendation way back in 1999, you're right:
9.3.1 Paragraphs: the P element
Start tag: required, End tag: optional
http://www.w3.org/TR/1999/REC-html401-19991224/struct/text.html#h-9.3.1
And as for <li>,
Start tag: required, End tag: optional
http://www.w3.org/TR/1999/REC-html401-19991224/struct/lists.html#h-10.2

Space Before Closing Slash?

I've frequently seen a space preceding the closing slash in XML and HTML tags. The XHTML line break is probably the canonical example:
<br />
instead of:
<br/>
The space seems superfluous. In fact, I think that it is superfluous.
What is the reason for writing this space?
I've read that the space solves some "backwards compatibility issues." Which backwards compatibility issues? Are those issues still relevant, or are we still adding extra spaces for the sake of, say, IE3 compatibility? Does there exist some spec with the definitive answer on this?
If not backwards compatibility, then is it a readability issue? Similar to the Great Open Curly Brace debate?
void it_goes_up_here() {
int no_you_fool_it_goes_down_there()
{
I can certainly respect differing stylistic opinions, so I'll be happy to learn that writing the space is simply a matter of taste.
The answer is that people wish to adhere to Appendix C of the XHTML 1.0 specification, which you only need to do if you are serving XHTML as text/html. Most people do, because XHTML's real MIME type (application/xhtml+xml) does not work in Internet Explorer.
No current browser cares for the space. Browsers are very tolerant of these things.
The space used to be required to ensure HTML parsers treated the trailing slash as an unrecognised attribute.
Supporting bobince's answer with screenshot of Netscape 4.80 showing documents
data:text/html,<title>space</title>foo<br />bar
(top left, linebreak rendered) and
data:text/html,<title>no space</title>foo<br/>bar
(bottom left, linebreak ignored).
Posting as answer to show the picture
Tangentially related: I had a lengthy answer tracing this misbehaviour of ancient browsers (and the resulting recommendation to include the space) to misunderstood SGML specs, namely the SGML Null End Tag (NET), where 1<tag/2/3 equals 1<tag>2</tag>3, so 1<tag/>2 would actually mean 1<tag>>2. However, I was unable to find good proof or a concrete version of the standard, and I could not even work out what the proper standards-compliant behaviour is. So here are a few raw links for reference:
w3c validator notice mentioning problematic closing slash and pointing to
Empty elements in SGML, HTML, XML, and XHTML # www.cs.tut.fi/~jkorpela/
Beware of XHTML: Null End Tags (NET) stating, that
However, there are still some smaller user agents that properly support Null End Tags. One of the more well-known user agents that support it is the W3C validator.
(Unable to reproduce there now, but supports Lee Kowalkowski's statement about multiple browsers affected by this.)
XML W3C Working Draft 07-Aug-97 - latest specs draft that includes reference of Null End Tag in DTD snippet: NET "/>"
Are those issues still relevant or are we still adding extra spaces for the sake of, say, IE3 compatibility?
You were close - it is for Netscape 4.
It is interesting to see other rationalisations, but that's all it was meant for.
No, the space is not required but it is necessary for some older browsers to render those tags correctly. The proper way to do it is without the extra space as this is something XHTML inherited from XML.
In XHTML, br tags must be closed, but the space is not necessary. It's a stylistic thing. In HTML, br tags cannot be closed, so both are wrong.
The space just makes the tags more readable. I am a big proponent of formatting for more readable code. Little things like that go a long way. Without the space the closing tag blends in with the opening tag. It takes just an instant longer for me to process it as I am quickly reading the code.
I think that the white space is a way to reinforce the idea that this tag is empty and it closes itself.
Today I don't use the white space anymore, because I have never had a problem without it.
What if there was a very lazy html writer out there or maybe he had a fear of quotation marks.
Consider the following if you were his robot page crawler...
<img src=http://myunquotedurl.com/image.jpg />
versus
<img src=http://myunquotedurl.com/image.jpg/>
This might seem small but look what it can do if the space isn't there. The robot won't know if the slash is part of the url or part of the closing tag.
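You can watch one parser make that choice. The sketch below uses Python's standard html.parser as a stand-in for the hypothetical crawler (SrcCollector and img_sources are made-up names). Other parsers may behave differently, although the HTML5 tokenizer rules likewise keep a slash that directly follows an unquoted value as part of that value.

```python
from html.parser import HTMLParser


class SrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for <img ... />, via the default handle_startendtag.
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def img_sources(markup):
    collector = SrcCollector()
    collector.feed(markup)
    collector.close()
    return collector.sources


print(img_sources("<img src=http://myunquotedurl.com/image.jpg />"))
print(img_sources("<img src=http://myunquotedurl.com/image.jpg/>"))
print(img_sources('<img src="http://myunquotedurl.com/image.jpg"/>'))
```

With the space, or with quotes, the URL comes through as written. Without either, the trailing slash ends up inside the URL.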

Ideal user feedback for HTML input

Let's face it: writing proper, standards compliant HTML is quite difficult to do. Writing semantic HTML is even more so, but I don't think it's possible for a computer to figure that out.
So my question to you is: what would the "ideal" feedback for a user who entered HTML be? Would it be a W3C validator style list of errors and corresponding line numbers and columns? Would it be an annotated code display of highlighted lines, explanations of the errors, and possible fixes? A spell-check style mode where you handle each error separately? Would it be not giving them any error information at all? Also, what types of errors are a good idea to tell users about? (Some broad classes of errors include parsing errors, nesting errors (e.g. putting a div in a b tag) and well-formedness errors.)
Scottm: Good point; I've never liked the W3C way of listing all the errors either. However, there is still the question of then letting the user edit the offending HTML appropriately.
onebyone: Ok, so looking at some screenshots it looks like HTML Validator has a W3C error list, but combined with the ability to go straight to the relevant source segment and expanded error information, as well as the fact that you don't have to go scrolly to jump from one section to another. Looks pretty good, but is it usable by the average Joe?
Edit 1: As a clarification, this is with regards to the interface, not necessarily the underlying implementation. However, interface needs to be feasible with plain HTML and JavaScript (double usability points if it just needs HTML, but I think you're going to get stuck with W3C in that case).
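Whichever interface is chosen, it needs error records that carry a position. As a rough illustration of the data behind a W3C-style list or an inline highlight, here is a Python sketch using the standard library's html.parser that reports the nesting error mentioned above with a line and column. NestingChecker, nesting_errors and the two small tag sets are invented for this example and are far from a complete content model.

```python
from html.parser import HTMLParser

# Tiny, incomplete tag sets, chosen only for this illustration.
INLINE = {"a", "b", "em", "i", "span", "strong"}
BLOCK = {"div", "h1", "h2", "h3", "ol", "p", "table", "ul"}


class NestingChecker(HTMLParser):
    """Reports block-level elements opened inside inline elements."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of open INLINE/BLOCK tags
        self.errors = []      # (line, column, message), both 1-based

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            for parent in self.open_tags:
                if parent in INLINE:
                    line, offset = self.getpos()   # offset is 0-based
                    self.errors.append((line, offset + 1, f"<{tag}> inside <{parent}>"))
                    break
        if tag in INLINE or tag in BLOCK:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def nesting_errors(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


for line, column, message in nesting_errors("<p>intro</p>\n<b><div>x</div></b>\n"):
    print(f"line {line}, column {column}: {message}")
```

The same (line, column, message) tuples could feed a plain list, a highlighted source view, or a step-through mode, so the choice of interface can stay separate from the checking.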
The output from the Firefox "HTML validator" add-on is pretty good. It shows you the source in a big window, and a list of errors in a small window (smallness doesn't matter, since you generally only care about the first one, since you're aiming for a total of none). Click an error to highlight, and an expanded explanation is shown in a second small window, while the offending part of the code is highlighted in the big window.
The add-on doesn't include a text editor, though, so it's not a full solution to your problem. It uses both an SGML-based validator and HTML Tidy, though, and I think for local files you can get it to make the corrections suggested by Tidy.
I always think syntax highlighting is great. In HTML this would be very useful too, as tags can be easily distinguished by the developer when he/she can see them appropriately coloured.
Personally I don't like the W3C way of giving you a big boring list of problems. Visual aids in the code itself are much better.