Proper DOCTYPE... custom attributes and non-standard markup - html

Ok, don't get me wrong, I absolutely love the idea of web standards... I wrote and validated a number of pages with strict XHTML 1.0 - however, the web is evolving... and the more I use XML, realize the capacity of the DOM, and realize most browsers don't care one way or the other, the more I realize I just want to break conventions and start using custom attributes.
Example of this is on my current site which uses a "message" attribute on a number of elements, and jQuery to then match those element types and update a footer message (something like a static tooltip). Problem of course is... this isn't actually supported.
My question then is simply: is there a broader doctype that would allow me to use 99% of the XHTML and/or HTML5 standard but throw in some custom attributes?
Or do I just continue to break validation and say to hell with it cause the browser and javascript will "get it" anyway?

The nature of DTDs and XML validation requires that a custom DTD be used if you're adding extra namespaces to a document. See the A List Apart articles Validating a Custom DTD and More About Custom DTDs for details on how to create a custom DTD. I don't know if this is possible within the confines of DTD syntax, but you could consider creating your own namespace and simply declaring "this namespace may contain anything" -- that should provide a nice dumping-ground for custom data without interfering with XHTML parsing.

If you're interested in HTML5, then make sure the names of your custom attributes start with "data-" and they'll validate in an HTML5 validator.
Otherwise, I'd just break validation. XHTML 1.x validation (which is doctype based) and browser interpretation of the markup (which is content-type based) are far enough apart to make XHTML validation of dubious value, once you know what you're doing.
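For the data- route, here is a rough sketch of how an attribute name maps to the key a script sees on element.dataset. The function name is made up for illustration; the hyphen-to-camelCase rule is the one the HTML spec describes for dataset, and the "message" attribute is borrowed from the question.

```javascript
// Sketch only: dataAttrToDatasetKey is a hypothetical helper, not a browser API.
// It mirrors the rule browsers apply when exposing data-* attributes on element.dataset.
function dataAttrToDatasetKey(attrName) {
  if (!attrName.startsWith("data-")) {
    throw new Error("not a data-* attribute: " + attrName);
  }
  // Drop the "data-" prefix, then turn each hyphen followed by a
  // lowercase ASCII letter into that letter in uppercase.
  return attrName.slice(5).replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// In a browser, <p data-message="Hello"> would then be read roughly as:
//   document.querySelector("[data-message]").dataset.message
```

So the "message" attribute from the question would become data-message in the markup and dataset.message in script.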

Related

html validator versus SU:BADGE

How to fix that
element "SU:BADGE" undefined <su:badge layout="5"></su:badge>
I use HTML 4.01 Transitional on my website ...
The BADGE is from stumbleupon.com website
Thank you . Regards.
Since such markup is not valid in HTML 4.01, the validator is just doing what you asked it to do: reporting any reportable markup errors.
The only practical reason why such error messages might be disturbing is that if there are many of them, you might accidentally overlook other error messages relating to constructs where your markup unintentionally deviates from HTML 4.01. If this is an issue, consider writing a custom DTD. It requires some understanding of SGML, but so does the use of validators in general, does it not?
On the other hand, you might decide to refrain from using tags suggested by people who cannot do such a simple thing using valid HTML. There are many ways to put information on a web page in a manner that does not affect the visible appearance but can be retrieved by programs that search for specific constructs (e.g., meta tags). They decided on a way that causes problems for authors who wish to use validators.
You don’t.
Validation is nice (especially as a sanity check while developing) but if you don’t use valid markup then there’s nothing you can do, and if you need invalid markup (because you use a third-party API for instance) then validation simply ceases to be relevant.
Alternatively, you could serve your page as XML instead of HTML 4 and define an appropriate su XML namespace. But since this would mean that some browsers no longer display your page correctly this is more a theoretical possibility.

Enctype for XHTML Strict 1.0 Page

I have, for my own edification, constructed an XHTML Strict 1.0 page containing a form. I'd like for it to accept text/xml MIME-types only and so I've specified the accept attribute accordingly. However, it can't be validated when also including the enctype="multipart/form-data" attribute-value pair. Is there an alternative to specifying the enctype when working with XHTML Strict 1.0? Do I need to specify the enctype or something similar at all?
I have not set up an actual "action" (cgi or some other back-end function). I'm only concerned with client-side for the moment and would like for the user to be prompted when uploading anything that's NOT xml. Do I need JavaScript here?
Also, it seems that not too many people are fond of XHTML in any form. If you have the liberty of choosing XHTML Strict/Frameset/Transitional or HTML 4.01 for a static page, which standard would be best?
The form encoding (enctype attribute) has to be multipart/form-data for the file uploads to work.
According to what I can find, the accept attribute is not implemented in any browser at all.
So, using Javascript seems like the only option if you want to offer any feedback on the selected file before the actual upload.
There are some people that have very strong opinions about XHTML, but that doesn't mean that it's not a widely used standard.
To address the last point first: Strict XHTML 1.1 suffers from the fact that the W3C recommendation really requires you to deliver the document as MIME type application/xhtml+xml, and that's virtually impossible to set up on a web server in a way that satisfies most, if not all, current clients. So if you cannot do it right anyway you might as well just use HTML 4.01, which is grammatically nearly equivalent and arguably more powerful (e.g. HTML 4.01 can prohibit nested anchors in the DTD itself, so a validator catches them, while XHTML has to add that as a textual extra clause). You'll get the same job done, and it'll actually be understood by nearly all existing clients. (Since I trust you'll only be using DOM methods to manipulate the document client-side, there won't be a problem with AJAX backends sending other forms of XML, either.)
For the first question: There is nothing that forces any client to do anything specific. The accept attribute is a hint for the client what your server will probably accept or reject, but it doesn't have to act on this in any defined manner. If you like, you can add some optional additional verification on the client with scripting, but of course you always must validate input data on the server, too.
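As a sketch of the optional client-side check mentioned above: a small predicate that looks at a selected file's reported type and name. The function name and the list of accepted types are assumptions for illustration; the name and type properties are the ones browser File objects expose. The browser-reported type can be empty or wrong, so this is only a hint for the user and the server still has to validate.

```javascript
// Sketch only: looksLikeXml is a hypothetical helper.
// `file` is expected to have `name` and `type` properties, as browser File objects do.
function looksLikeXml(file) {
  const xmlTypes = ["text/xml", "application/xml"]; // assumed set of accepted MIME types
  const name = (file.name || "").toLowerCase();
  // Accept if either the reported MIME type or the file extension suggests XML.
  return xmlTypes.includes(file.type) || name.endsWith(".xml");
}

// In a browser this could run in a change handler on the file input, e.g.:
//   input.addEventListener("change", () => {
//     if (!looksLikeXml(input.files[0])) alert("Please choose an XML file.");
//   });
```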

What are the concrete risks of using custom HTML attributes in HTML4 Strict?

This subject turned into a heated discussion at the office, so I'm interested to learn what you think.
We are working on a web app that only targets some specific browsers. These browsers presently include different flavors of Opera 9 and Mozilla 1.7.12. In the future we will likely also have to support Opera 10 and different flavors of WebKit. But it's very unlikely we are ever going to have to deal with any version of IE.
Our web app declares HTML 4.0 strict in its doctype.
Recently, I proposed as a solution to a specific problem to use custom attributes in the HTML. I proposed something that would look like this:
<span translationkey="someKey">...</span>
Since this is not valid HTML 4, it did not go down well with our HTML guys, and we got into an argument.
My question is this: What - if any - are the risks of using custom attributes? I know the page won't validate, but don't all browsers just ignore attributes they do not know? Or is it conceivable that some browsers will change to "quirks mode" and render the page as if it was something other than strict HTML 4.0?
Update:
Highlighted the actual question posed.
There are no browser limitations/risks. Only the W3C validator will bark, but barking dogs don't bite.
The W3C spec says the following:
If a user agent encounters an attribute it does not recognize, it should ignore the entire attribute specification (i.e., the attribute and its value).
IE will also not switch to quirks mode, as some may think. It only does that for invalid or forced doctypes, not for invalid attributes.
However, keep in mind that some JavaScript libraries/frameworks "invisibly" add or use custom HTML attributes in the DOM tree, as several jQuery plugins do. You risk collisions if one of them happens to use an attribute with the same name as the one you chose for your own purposes. Sadly, this is often poorly documented or not documented at all.
HTML 5 allows custom attributes using a 'data-' prefix, see http://ejohn.org/blog/html-5-data-attributes/
If it's a goal to maintain valid HTML 4.0 strict, then it doesn't matter why you want to put in custom attributes: you are breaking the goal.
I think the question you need to be asking is why you need to break 4.0 strict to get the functionality you want. Anything you could use a custom attribute for, you could put in an existing attribute:
<span translationkey="someKey">...</span>
could be:
<span class="Translationkey#someKey">...</span>
It will take some extra cycles to parse all the class information, but so long as you don't attach any CSS to that class, it doesn't change display, doesn't put you in quirks mode, and doesn't get you in fights at work.
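A minimal sketch of the class-parsing step, assuming the "Translationkey#" prefix from the example above; the function name is hypothetical.

```javascript
// Sketch only: getTranslationKey is a hypothetical helper.
// It pulls the key out of a class attribute value such as "note Translationkey#someKey".
function getTranslationKey(className) {
  const prefix = "Translationkey#";
  // A class attribute is a whitespace-separated list of tokens.
  for (const token of className.split(/\s+/)) {
    if (token.startsWith(prefix)) {
      return token.slice(prefix.length);
    }
  }
  return null; // no translation key encoded in this class list
}

// In a browser: getTranslationKey(spanElement.className)
```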
Duplicate try this thread though: Is it alright to add custom Html attributes?
Also look at this: Non-Standard Attributes on HTML Tags. Good Thing? Bad Thing? Your Thoughts?
Or is it conceivable that some browsers will change to "quirks mode" and render the page as if it was something other than strict HTML 4.0?
No, bad attributes will not force a rendering mode change.
If you don't care about validation do what you like, but validation is a useful tool for detecting simple mistakes that can otherwise have you chasing around debugging. Given that there are many other perfectly good alternatives for passing data to JavaScript I prefer to use one of those, rather than forgo validation.
Plus, when you add an arbitrary attribute you are effectively playing in a global namespace. There's no guarantee that some future browser or standard won't decide to use the name ‘translationkey’ for some new feature that'll trip your script up. So if you must add attributes, give them a name that's obscure and likely to be unique, or just use the HTML5 data- prefix already.
If the page is declared to be HTML 4 strict, then it should not add attributes that the HTML specification does not define. Otherwise, it is not clear how browsers would behave.
As already reported, a way to add additional attributes is to add them as classes, even if that has some limitations.
(Copying my answer from a duplicate question)
Answers which say custom attributes won't validate are incorrect.
Custom attributes will validate.
Custom tags will validate too, as long as the custom tags are lowercase and hyphenated.
Try this in any validator. It will validate.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Custom Test</title>
</head>
<body>
<dog-cat PIANO="yellow">test</dog-cat>
</body>
</html>
Some validators:
https://appdevtools.com/html-validator
https://www.freeformatter.com/html-validator.html
https://validator.w3.org/nu/
But is it safe? Will it break later?
Custom Tags
No hyphenated tags exist. For that reason, I believe that W3C will never add a hyphenated tag. It's very likely a custom hyphenated tag will never see a conflict. If you use a weird prefix in your custom tags, even less likely. Eg.<johny-Calendar>.
Custom Attributes
There are hyphenated HTML attributes. But the HTML spec promises never to use an attribute starting with data-. So data-myattrib is guaranteed to be safe.
I believe that W3C will never introduce any attribute that starts with johny- or piano-. As long as your prefix is weird, you'll never see a conflict.
It's worth considering that the first version of HTML was written in 1993. Now, thirty years later, all browsers still support custom tags and attributes, and validators validate them.
<jw-post jw-author="Johny Why" jw-date="12/2/2022">

Why is XHTML syntax so widely used in web pages?

First of all let's emphasize that syntax rules don't work alone, but they need the correct Content-type header to be fully interpreted by the clients. Currently web pages cannot be served with the correct XHTML header because Internet Explorer doesn't understand that.
The first advantage usually mentioned is that XHTML requires pages to be well-formed: true, but when browsers treat them as (malformed) HTML nothing enforces this rule, so it's up to you being a disciplined developer -- but you can be as disciplined writing good well-formed HTML too.
Another point often mentioned is that XHTML promotes the separation between content and presentation, but even in this case it doesn't really offer anything that can't be done with HTML -- it still depends on the developer since nothing is enforced, and no exclusive tools are offered.
So why do so many developers (including those of famous CMS/blogging software) still use XHTML syntax instead of directly writing what those pages will become anyway (i.e. plain HTML)?
Related fact: Stackoverflow uses HTML strict.
http://en.wikipedia.org/wiki/XHTML
From the wiki:
"The only essential difference between XHTML and HTML is that XHTML must be well-formed XML, while HTML need not be."
It's up to you which one you choose. There is no real difference in terms of what the user sees. Whichever you choose, please try to make it well-formed and make sure that your HTML/XHTML validates and follows the standards.
This probably isn't the actual reason, but it makes them parsable using a regular XML parser.
Sadly, XHTML syntax isn't as widely used as the XHTML doctype. You'd think people would be conscious about it, but a lot of the time (at least a few years ago), an XHTML doctype was used mostly because HTML 4 was being "dissed". That hasn't stopped people from continuing to use HTML syntax though. Open ended <li> and <p> tags, non-terminated <br> and <img> tags, tag attributes not enclosed in quotes, and more hypocritical nonsense.
Currently web pages cannot be served with the correct XHTML header because Internet Explorer doesn't understand that.
Sure they can, provided you're prepared to use content negotiation to serve an application/xhtml+xml content type to those user-agents that say they accept it.
There are a number of reasons, both good and bad, why XHTML is so widely used. Jay Askren has a point about people who use XML in other contexts (I'm one of them), but I doubt that accounts for much use. If there is a good reason why XHTML is popular, it's most likely that the orthogonality of XML is a very seductive idea. It's simply easier remembering "Always close every tag, always quote the attribute values" than trying to remember all the rules about when you can safely omit tags and leave attributes unquoted etc., even though it results in a more verbose document.
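A rough sketch of that negotiation, assuming the server can see the request's Accept header. The function name is made up, and the parsing is deliberately simplified: it only checks whether application/xhtml+xml is listed with a non-zero q value and does not rank it against text/html.

```javascript
// Sketch only: pickContentType is a hypothetical helper, not part of any framework.
// Returns the Content-Type to send for an XHTML page given the request's Accept header.
function pickContentType(acceptHeader) {
  const entries = (acceptHeader || "").split(",");
  for (const entry of entries) {
    // Each entry looks like "type/subtype" optionally followed by ";q=0.9" style parameters.
    const [type, ...params] = entry.trim().split(";").map((s) => s.trim());
    if (type.toLowerCase() !== "application/xhtml+xml") continue;
    const q = params.find((p) => p.toLowerCase().startsWith("q="));
    const quality = q ? parseFloat(q.slice(2)) : 1; // a missing q means full preference
    if (quality > 0) return "application/xhtml+xml";
  }
  // Fall back to text/html for user agents that do not claim XHTML support.
  return "text/html";
}
```

A user agent that never lists application/xhtml+xml in its Accept header, which is the Internet Explorer case described above, gets text/html.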
There are other reasons like the fact that it's easier to indent your code if every opening tag has a matching closing one, and if you do, you've got a pretty accurate picture of the DOM laid out in the source code, which can aid with scripting. But I doubt that this is a primary reason.
Using XHTML states an intent, don’t underestimate that (but don’t overestimate this either). Web standards are politics: if nobody cares, nothing is gonna change. Using XHTML (or HTML5) signals “yes, we are in fact interested in the continued development of the standards.”
Furthermore, while clients certainly don’t enforce XHTML rules with a text/html content type, design tools still can do this. XHTML is much easier to support for editors than real HTML (with “real” I mean the whole ugly SGML package). There are good XHTML validators that do much more than HTML validators can (e.g. Schneegans’ XML schema validator).
All in all, many arguments against XHTML are in fact straw-men that aim at some of the poorly-formulated arguments for XHTML. For instance, Microsoft is responsible for publishing long lists of purported XHTML advantages (such as semantic web design). Attacking those arguments is like reductio ad absurdum. But there are good arguments for XHTML.
I suspect a major reason XHTML is so popular is cultural and historical more than anything. XML became quite popular some time ago and it is still used quite heavily. It is good for defining a data model that can be sent over the wire using web services. There are lots of tools/technologies that work with it, such as XSLT and many others. It is natural for a developer to use HTML that is structured like XML, even if there is no real advantage, just because they use XML in other contexts.

At the end of the day, why choose XHTML over HTML? [closed]

I wonder why I should use XHTML instead of HTML.
XHTML is supposed to be "modularized", but I haven't seen any server side language take advantage of any of that.
XHTML is also more strict, and I don't see the advantage. What does XHTML offer that I need so bad? How does it make my code "better"?
EDIT: another question I found in the comments: Does XHTML parse faster than HTML?
EDIT2: after reading all your comments and the links, I indeed agree that another post deserves to be the correct answer, so I chose the one that directly links to the best source.
Also, goes to show that people upvote the green comment without even reading it.
You should read Beware of XHTML, which is an informative article that warns about some of the pitfalls of XHTML over HTML.
I was pretty gung-ho about XHTML until I read it, but it does make several valid points. Including the following bit;
XHTML 1.x is not “future-compatible”. XHTML 2, currently in the drafting stages, is not backwards-compatible with XHTML 1.x. XHTML 2 will have lots of major changes to the way documents are written and structured, and even if you already have your site written in XHTML 1.1, a complete site rewrite will usually be necessary in order to convert it to proper XHTML 2. A simple XSL transformation will not be sufficient in most cases, because some semantics won't translate properly.
HTML 4.01 is actually more future-compatible. A valid HTML 4.01 document written to modern support levels will be valid HTML 5, and HTML 5 is where the majority of attention is from browser developers and the W3C.
Future compatibility can be huge when working on some projects. The article goes on to make several other good points, but I think that may have stood out the most for me.
Don't mistake the article for a rant against XHTML, the author does talk about the good points of XHTML, but it is good to be aware of the shortcomings before you dive in.
I was going to add this as a comment to one of the other posts, but it grew a little too large.
The fundamental point that most people seem to be missing is the purpose behind XHTML. One of the major reasons for developing the XHTML specification was to de-emphasise presentation-related tags in the markup, and to defer presentation to CSS. Whilst this separation can be achieved with plain HTML, this behaviour isn't promoted by the specification.
Separating meta-markup and presentation is a vital part of developing for the 'programmable web', and will not only improve SEO, and access for screen readers/text browsers, but will also lead towards your website being more easily analysable by those wishing to access it programmatically (in many simple cases, this can negate the need for developing a specific API, or even just allow for client-side scripts to do things like, identify phone numbers readily). If your web-page conforms to the XHTML specification, it can easily be traversed using XML-related tools, and things such as XPath... which is fantastic news for those who want to extract particular information from your website.
XHTML was not developed for use by itself, but by use with a variety of other technologies. It relies heavily on the use of CSS for presentation, and places a foundation for things like Microformats (whether you love them, or hate them) to offer a standardised markup for common data presentation.
Don't be fooled by the crowd who think that XHTML is insignificant, and is just overly restrictive and pointless... it was created with a purpose that 95% of the world seems to ignore/not know about.
By all means use HTML, but use it for what it's good for, and take the same approach when looking at XHTML.
With regard to parsing speed, I imagine there would be very little difference in the parsing of the actual documents between XHTML and HTML. The trade-off will come purely in how you describe the document using the available markup. XHTML tags tend to be longer, due to required attributes, proper closing, etc. but will forego the need for any presentational markup in the document itself. With that being the case, I think you're talking about comparing one type of apple, with a very slightly different type of apple... they're different, but it's unlikely to be of any consequence (in terms of parsing and rendering) when all you want is a healthy, tasty apple.
For the visitor of a website it probably doesn't make any visible difference. Furthermore, XHTML is usually more of a pain to use as at least one widespread browser still doesn't know how to handle it and you need to serve it as text/html in that case (which yields invalid HTML).
If your HTML is going to be regularly processed by automated tools instead of being read by humans, then you might want to use XHTML because of its stricter structure; being XML, it is easier to parse (from an application standpoint. Not that XML is inherently easy to parse, though).
Apart from that I don't see any compelling reasons to use it, though. XHTML was created in an approach of making use of XML features for HTML and basically it boils down to "HTML 4 with several annoying side-effects" (IMHO, at least).
Use HTML (HTML4 Strict or HTML5).
HTML can fully utilize CSS, can be validated and parsed unambiguously. Separation of structure and presentation has been done in HTML4 and XHTML merely continued that.
All browsers support HTML. Only some browsers support XHTML and those that do, often have more mature and better tested and optimized support for HTML (it's caused by the fact that tiny fraction of pages uses XML mode).
If you care about IE and Google, you have to use HTML or the subset of XHTML and HTML defined in Appendix C of the XHTML spec. The latter is almost the worst of both worlds, because such XHTML cannot be generated with standard XML tools, cannot use extension mechanisms new to XHTML and has additional limitations over those in HTML alone.
XHTML1.0 is now over 10 years old, it was designed in "Web1.0" times, and as head of W3C said, in retrospect it didn't work out and better approach is needed. W3C HTML5 is written as we speak and addresses needs of web applications used today, and has very good backwards compatibility.
HTML5 closes many gaps that were between HTML4 and XHTML1 (e.g. adds inline SVG, MathML and RDF), and cleans up the language beyond what was done in XHTML1.0 and XHTML1.1.
XHTML2 is not going to be supported by web browsers in the foreseeable future. It's likely that it will never be supported (all browser vendors heavily support [X]HTML5, some have already declared that they won't implement XHTML2).
XHTML1.0 has exactly the same semantics and separation of presentation from structure as HTML4.01. Anybody who says otherwise hasn't read the specification. I encourage everybody to read the spec; it's surprisingly short and uninteresting.
Stylesheets were introduced in HTML4.01 and were not changed in XHTML1.0.
Presentational elements were deprecated in HTML4.01 and were not removed in XHTML1.0.
XHTML myths.
There are no intractable differences between HTML and XHTML that would make parsing of one much slower than the other. It depends on how the parser is implemented.
Both SGML and XML parsers need to load and parse entire DTD in order to understand entities. This alone is usually more work than parsing of the document itself. HTML parsers almost always "cheat" and use hardcoded entities and element information. XHTML parsers in browsers cheat too.
Parsing of HTML requires handling of implied start and end tags, and real-world HTML requires additional work to handle misplaced tags.
Proper parsing of XHTML requires tracking of XML namespaces.
Draconian XML rules require checking that every character is properly encoded. HTML parsers may get away without this, but OTOH they need to look for <meta>.
The overall difference in cost of parsing is tiny compared to time it takes to download document, build DOM, run scripts, apply CSS and all other things browsers have to do.
I'm surprised that all the answers here recommend XHTML over HTML. I am firmly of the opposite opinion - you should not use XHTML, for the foreseeable future. Here's why:
No browser interprets XHTML as XHTML unless you serve it as mimetype application/xhtml+xml. If you just serve it with the default mimetype, all browsers will interpret it as HTML - eg, accepting unclosed or improperly nested elements.
However, you should never actually do this, as Internet Explorer does not recognise application/xhtml+xml, and would fail to render the page completely.
There are significant differences in the DOM between XHTML and HTML. Since all so-called XHTML pages are being served as HTML at the moment, all javascript code is written using the HTML DOM. If support for the XHTML mimetype becomes significant enough to convince people to start using it, most of their javascript code will break - even if they think their pages validate as XHTML.
Instead of continuing to debate HTML 4.01 Strict vs XHTML Strict, I would suggest starting to use HTML 5 today. John Resig, the author of jquery, made a similar suggestion last year on his blog.
The HTML 5 doctype, in its beautiful simplicity, will trigger standards mode in all browsers (including IE6).
<!DOCTYPE html>
That's it.
HTML 5 provides some exciting new features such as the <canvas> tag which potentially can push javascript application development to the next level. HTML 5 also has proper support for media (and media is a fairly important aspect of the web these days!) in the form of <video> and <audio> tags.
If you like the syntax of XHTML, i.e. closing "empty" tags such as <br />, that is fully supported in HTML 5. From Karl Dubost of the W3C's post Learn How To Write HTML 5:
auto-closing tag is allowed and conformant in HTML 5.
XHTML2 has received relatively little attention compared to HTML 5. It's becoming increasingly clear that HTML 5 is the future of markup on the web. Microsoft's latest browser, IE8 still renders XHTML served as text/xml as text/html.
Microsoft have a co-chair on the W3C HTML working group and there's an implied support from them for HTML 5. All of the browser vendors have publicly announced their support for HTML 5.
At the end of the day, even if XHTML2 regains support from the industry, it won't be a significant issue having two competing standards as it has been in the past. Both languages support XML namespaces (in the case of HTML 5, serialization of HTML i.e. DOCTYPE switching).
As a programmer, you should be VERY concerned about your code. HTML is ugly and follows few rules.
XHTML on the other hand, turns HTML into a proper language, following strict structural and syntactic rules.
XHTML is better for everyone, as it will help move the web to a point where everyone (all browsers) can agree on how to display a web page.
XHTML is an XML descendant, and as such is much easier on parsers built for the job of analysing syntactically sound XML documents.
If you can't see the benefit of XHTML, you might as well be using MS Word to create your HTML documents.
Take a look at http://www.w3.org/MarkUp/2004/xhtml-faq#need. There are some good reasons apart from modularisation.
I favor XHTML because it's stricter and more clearly laid out. HTML is quirky and browsers have to accept things like <b><i>sadasd</b></i>.
While this is a really simple example, it could also get more confusing and different browsers could lay out things differently.
Also I think that XHTML has to be "faster" since the browser doesn't have to do that kind of "repair".
Some differences are:
XHTML tags must be properly nested
The documents must have one root element
XHTML tags are always in lowercase
Tags must always be closed (e.g. the <br> tag must be written as <br /> or <br></br> in XHTML)
Here are some links on it
wiki XHTML
wiki HTML vs XHTML
XHTML lets you use all those tools designed for XML. Among them are XSLT, embedded SVG, etc...
Interesting development: XHTML 2 Working Group Expected to Stop Work End of 2009, W3C to Increase Resources on HTML 5
2009-07-02: Today the Director announces that when the XHTML 2 Working Group charter expires as scheduled at the end of 2009, the charter will not be renewed. By doing so, and by increasing resources in the Working Group, W3C hopes to accelerate the progress of HTML 5 and clarify W3C's position regarding the future of HTML. A FAQ answers questions about the future of deliverables of the XHTML 2 Working Group, and the status of various discussions related to HTML. Learn more about the HTML Activity.
Well, I guess that makes the future of HTML pretty clear.
XHTML forces you to be neat.
For example, in HTML, you can write:
<img src="image.jpg">
This isn't very logical, because the img tag never gets closed. In XHTML, however, you're forced to close the tag neatly, like this:
<img src="image.jpg" />
I like using something that forces me to be neat.
Steve
The subtitle to the XHTML 1.0 recommendation:
A Reformulation of HTML 4 in XML 1.0
Many tools exist today to process XML. By using XHTML, you are allowing a huge set of tools to operate on your pages and to extract information programmatically.
If you were to use HTML, this would be possible too. There are tools in existence to parse HTML DOM trees. However, these tools can often be more specialized than those for XML. You may not find your favorite XML data processing tools compatible with HTML. Furthermore, there are so many uses for XML nowadays that you may be using XML for some other part of an application; why not also use that same XML parser to parse your web pages? This is the motivation behind XHTML.
If you're already comfortable and familiar with HTML 4.01, you have an established project using HTML 4, and you don't have tons of spare time, just go with HTML 4.01. If you have spare time, learn XHTML 1.1 anyway, and start your new projects in XHTML 1.1 – there's no harm in doing so. If you're using something other than HTML 4.01 or are pretty unfamiliar with HTML 4 anyway, just learn XHTML 1.1.
Using XHTML with the correct DocType will force the browser to render the content in a more standards compliant (strict) mode. This makes the different browsers behave better and, most importantly, more like each other. This makes your job as a webdeveloper a lot easier since it reduces the amount of browser specific tweaks needed to make the content look the same in all browsers.
Quirksmode.org has a lot of good info on this subject.
In my opinion, the strictness is, at least in theory, a good thing, because in HTML, you don't need to be strict, and because of that and the HTML5 junk, Browsers have advanced error correction algorithms that will make the best out of broken HTML. The problem is, the algorithms are not exactly the same and will lead to really strange behaviour you can't predict. With XHTML, on the other hand, you typically have fine, valid XHTML and so the error correction algorithms are not needed, i.e. the entire Browser behaviour is predictable. In addition, strict code makes it easier for your tools to work with the code. So you have actually nothing to lose by using XHTML, but there is some potential to gain. Things will get worse with plain HTML when HTML5 is finally out and the "be open in what you accept" will lead to the described strange behaviour. But at least then it's a standardized strange behaviour. Sigh.
On the other hand, if you use a good IDE like Visual Studio, it's almost impossible to produce broken HTML code anyway, so the result is the same.
Use XHTML
Fails fast. If there are any inconsistencies they will be found during validation.
It encourages better design by separating semantic markup from presentation etc.
It's structured which means that you can treat it as a data object and run all sorts of queries against it. For example you could find all addresses or citations within your website.
You can do build-time optimizations. Since it's well-formed XML you can easily do find/replace operations during build time. Or any document management and manipulation.
You can write XSLT or other transformation scripts to programmatically transform your XHTML for other platforms. For example you could have an XSLT for the iPhone that would transform all XHTML to make it compatible or more user-friendly for the iPhone.
You are future proofing yourself. Transforming XHTML to newer semantics is again, very easy using transformation.
Search engines will continue to evolve to gather more semantic information as part of the programmable web.
DOM operations are more reliable since it's structured.
From an algorithmic perspective, it yields easier and faster parsing.
XHTML is a good starting point because, if you want valid code, you have to provide some help to the disabled community: screen readers need the alt and title parts of the image and link tags.
It must be faster to parse to an extent because unlike HTML the parser wouldn't need to check to see if the tag wasn't closed properly, if it was nested correctly etc.
Also it is better to use it because yes it is strict but it helps you to think more logically (in my opinion) when it comes to learning programming languages.
I believe XHTML is (or should be) faster to parse. A valid XHTML document must be written to a stricter spec in that errors are fatal when parsing, whereas HTML is more lenient and allows for oddities mentioned before my comment like out of order closing tags and such. I found this helpful in uncovering the differences between HTML and XHTML parsing:
http://wiki.whatwg.org/wiki/HTML_vs._XHTML#Parsing
A reason you might use XHTML over HTML might be if you intend to have mobile users as part of your audience. If I recall, many phones use something more of an XML parser, rather than an HTML one to display the web. If you are writing for desktop browsers, HTML would probably be acceptable.
That said, if you are going to serve the data as text/html anyway, you should use HTML:
http://www.hixie.ch/advocacy/xhtml