HTML 5 - Early Adoption Where Possible - Good or Bad?

HTML 5 - Early Adoption Where Possible - Good or Bad? - html

This question was inspired a bit by this question, in which the most upvoted answer recommended using a feature from HTML 5. It certainly seemed to be a good method to me, but it made me curious about using features from a future spec in general.
HTML 5 offers a lot of nice improvements, many of which can be used without causing problems in current browsers.
Some examples:
// new, simple HTML5 doctype (puts browsers in standards mode)
<!doctype HTML>
// new input types, for easy, generic client side validation
<input type="email" name="emailAddress"/>
<input type="number" name="userid"/>
<input type="date" name="dateOfBirth"/>
// new "required" attribute indicates that a field is required
<input type="text" name="userName" required="true"/>
// new 'data-' prefixed attributes
// for easy insertion of js-accessible metadata in dynamic pages
<div data-price="33.23">
<!-- -->
</div>
<button data-item-id="93024">Add Item</button>
Many of these new features are designed to make it possible for browsers to automatically validate forms, as well as give them better inputs (for example a date picker). Some are just convenient and seem like a good way to get ready for the future.
They currently don't break anything (as far as I can tell) in current browsers and they allow for clean, generic clientside code.
However, even though they are all valid in HTML 5, they are NOT valid for HTML 4, and HTML 5 is still a draft at this point.
Is it a good idea to go ahead and use these features early?
Are there browser implementation issues with them that I haven't realized?
Should we be developing web pages now that make use of HTML 5 draft features?

There are several things to consider:
First, validation doesn't mean that much, because an HTML page can very well be valid but badly authored, inaccessible, etc. See Say no to "Valid HTML" icons and Sending XHTML as text/html Considered Harmful (in reference to the hobo-web tests mentioned in another response)
Given this, I'd highly recommend using the new DOCTYPE: the only reason for having it in HTML5 is that it's the smallest thing that triggers standards mode in browsers, so if you want standards mode, go with it; you have little to no reason to use another, verbose, error-prone DOCTYPE
As for the forms enhancements, you can use Weston Ruter's webforms2 JS library to bring it to non-aware browsers
and finally, about the data-* attributes, it a) works in all browsers (as long as you use getAttribute()), b) is still better than abusing the title or class attributes and c) won't bother you with validation as we said earlier that validation isn't that important (of course it is, but it doesn't matter that your page is invalid if the validity errors are willful; and you can already use HTML5 validation in the W3C validator, so...); so there's no real reason not to use them either.

Good question!
In short: it depends on your context, and risk tolerance :)
Slightly longer:
I think it's always good to push the envelope on early adoption of technology. It gives you an advantage over late-comers in the commercial world, and also gives you much more leverage in influencing the technology as it emerges.
If you don't want to have to re-write code, or update your source, then early adoption may not be for you. It's perfectly respectable to want to write solid, stable code that never has to change, but it's entirely up to you (and your business context)

If your page relies heavily on search engine placement, it may be worth considering that some engines give priority to validating HTML (Source: http://www.hobo-web.co.uk/seo-blog/index.php/official-google-prefers-valid-html-css/).
Also, it is worth considering that relying on the new date input elements (such as those in Opera, possibly others) allows for more convenience on the part of the developer, it typically precludes including more complex Javascript controls which would better server older browsers (typically falling back to a simple text input field).
Of course and as always, don't rely on browser side checks and validate all input server side.

Please don’t use the new features before you can test them in at least one browser. For example, if you use the now form features, be sure to test in Opera. Otherwise, you’ll likely do more harm than good by contributing to a poisoned legacy out there.
When a feature is already implemented in browsers and you are testing with those browsers, sure, please use the new features.
See also an older answer.

See Robustness principle:
In RFC 761 (Transmission Control
Protocol, 1980) American computer
scientist Jon Postel summarized
earlier communications of desired
interoperability criteria for the
Internet Protocol (cf. IEN 1111, RFC
760) as follows:
TCP implementations should follow a
general principle of robustness: be
conservative in what you do, be
liberal in what you accept from
others.
So, imho, no.

I will not implement new features from HTML until at least they have support from all major browsers.
Clients don't care if your page is valid, they care much more if it works cross browser. Even if we fight to implement the latest standards there will be still clients and companies that will never shed their IE6, and IE6 will be on their browser requirements list for still a while.
The new form types are welcomed, nevertheless the forms have to be checked in the server side.
Passing to HTML5 existing documents will require a lot of effort and adaptation and in my estimate will not happen overnight. Expect at least a 3 years until it will hit the mainstream.

I would use HTML 5 just for fun and learning, but definitively I wouldn't touch any of my production code (existing code) with this new standard, at least by now and until I have a valid reason to support this move.

Related

Is there a better approach to html standards validation than w3c validator?

Granted, this is a very generic question, but I am wondering if w3c validation is considered a best practice for html validation, or if there are better approaches to ensure contemporary standards-compliant markup.
This question arose when I noticed duplicate IDs on an MDN page (a site I would have assumed would be very strict about its coding practices). It appeared to be an artifact of how they generated the sections of the page.
Curious, I validated the page's code on the w3c validator, and there were various "errors" that suggested that MDN was just ignoring that a certain attribute or value was not valid. Generally, these related to seemingly appropriate uses of rel attributes.
I was left wondering if standards for valid, semantic markup matter less, or if there's a new ideal approach to code validation and standardization than relying on w3c validation.

Maintainer of the current W3C HTML Checker (validator) here. I think it's important to understand the intended purpose of the current HTML checker, which is different from the purpose of the legacy W3C Markup Validator.
The purpose of the checker is documented at https://validator.w3.org/nu/about.html#why-validate:
The core reason to run your HTML documents through a conformance checker is simple: To catch unintended mistakes—mistakes you might have otherwise missed—so that you can fix them.
Beyond that, some document-conformance requirements (validity rules) in the HTML spec are there to help you and the users of your documents avoid certain kinds of potential problems.
There are some markup cases defined as errors because they are potential problems for accessibility, usability, interoperability, security, or maintainability—or because they can result in poor performance, or that might cause your scripts to fail in ways that are hard to troubleshoot.
Along with those, some markup cases are defined as errors because they can cause you to run into potential problems in HTML parsing and error-handling behavior—so that, say, you’d end up with some unintuitive, unexpected result in the DOM
Validating your documents alerts you to those potential problems.
So as far as your question about "are better approaches to ensure contemporary standards-compliant markup", the answer is that it's not an either-or thing; there are a variety of approaches and the W3C HTML Checker is just one of them, and its goal isn't to be the single way to determine anything but instead to just help you catch mistakes you might otherwise miss and that might cause unexpected problems for your users.
As far as ways to get alerted to specific device issues or browser-implementation issues, we don’t have good automated checking tools for that, but a couple of things which are huge help there are:
https://caniuse.com/ — detailed information about the level of support for particular web-runtime features in different browsers, and in different versions of those browsers, and in release of the browsers for mobile devices vs desktop
https://wptdashboard.appspot.com/ — current test results across all major browser engines for dozens of web-runtime features/specs; if https://caniuse.com/ doesn’t have information about a particular feature, you can look through this dashboard and browse to the directory that has tests for that feature, and find whether a browser passes the tests for the feature
But as far as good automated tools we do actually have for checking other things, here are two:
https://validator.w3.org/i18n-checker/ — W3C Internationalization Checker
https://observatory.mozilla.org/ — for doing a security assessment of the content of your site

I recently faced a problem with mentioned above W3C HTML Checker. I respect a huge amount of work that was done by author of this validator, but it did not allow me in any way a tag <script type="text/vbscript" src="file.vbs">. It was said to change type value to empty string, a JavaScript MIME type, or module, which makes my page useless.
I know than VBScript language is rarely used now, it was just a test page, but let me share with you less tricky alternative, as good as the first one for HTML error checking.
Maintainer of the current JsonFormatter (validator) is here

To what point is making an HTML page valid worth it?

Since a long time ago, when I found out about the W3C Validator, I made sure every HTML document I made was valid HTML.
However, I think sometimes it just isn't necessary to waste time making it valid. Of course, for actual Internet pages may be important, but is making pages on an Intranet, or even little front-ends that are used with other programs, when the HTML page renders correctly in the most used browsers (not necessarily counting IE 6 and 7).
I think I'm mostly talking about little improvements over code, such as wrapping every shown element of the page on <p> or <div> tags.

Making a page validate for its own sake is not really a business proposition. What happens for end-users (with their cranky browsers) is the real test.
That said, validating periodically will help you debug. It'll catch the more salient errors like unclosed tags. Which, in turn, does affect end-users. So treat validation like compiler warnings -- good for discipline.

It's the best practice, but it really comes down to an organizational requirement/desire. Is it important enough that standards add value for your organization? Or is it simply enough that it displays correctly? Often with intranets its the latter.

Making an HTML page "valid" is worth it if you intend to be future friendly. That is, when browsers begin to strip out deprecated or vendor specific tags, you will find your page displaying incorrectly.
Web standards are there for a reason - to ensure consistent display/output among web browsers and interpreters. Choosing to write your pages in non-compliant HTML is your decision. It is also, to take an old adage, your "funeral".

What happens when the browser of choice for the intranet changes? There really isn't a way to guarantee that the code you have will render correctly in EVERY browser. But in a lot of cases the browsers will be reasonably close to the standard. I think it also depends on how complex the page is because the chances it renders differently in different browsers increases as the complexity of the CSS and tag depth does. The best way is to write valid cross-browser code and test for target browsers. Its silly to think write-once and render the same everywhere is possible for all browsers. But adhering to the standards is the best way you can get close.

HTML 5 Browser Compatibility Chart - HTML 5 in Old Browsers?

I have just started considering using the HTML 5 api for a Rails/JQuery project, so I can use that great data- attribute to store values.
I am worried though about browser compatibility issues. I have two questions (basic questions):
In order to use HTML 5, do people need to upgrade their browsers? How does that work?
Is there an up-to-the-day chart of what features each browser layout engine supports, more up-to-date than this Wikipedia article on comparing HTML layout engines and this When can I use... HTML 5 page?
Is it going to be an issue with people using IE6 for example? Lots of non-computer saavy people I've talked to who want to get an internet presence themselves use, and the people they talk to use, still, IE6!
If it's not an issue, and you can use HTML 5 on old browsers, how do you? Or what docs should I look at :)? Thanks.
Update: I will post some interesting links as I find them below.
FindMeByIP: "A simple app which reveals your browsers' support for CSS3 and HTML5 features in an easy to read format using Modernizr." - Browser Support for CSS3 and HTML5

It is not useful to consider HTML5 as a single entity, that browsers either ‘support’ or ‘don't support’. HTML5 is:
an attempt to codify widespread existing practice beyond the limits of what the previous W3 HTML and DOM standards had covered, such as IE and Firefox extensions that the other browsers have copied, and long-standing ‘DOM Level 0’ behaviours that everyone took for granted but weren't written into any spec before.
a random selection of new extensions not yet in widespread use, which it is hoped browser manufacturers will support. Some have already succeeded, heading into all new browsers already; some have been spun off into their own specifications (which is much more manageable for everyone), some are controversial, and some no-one cares about at all.
It has been, IMO, an enormous mistake to try to cover these two bases at once. I would have preferred an HTML 3.2-style ‘catch-up’ standard plus many separate extension specifications. But there's nothing can be done about it now.
HTML5 is also:
Not finished. The specification is massive, complicated, incomplete, and likely to change in details (or maybe more than that) before it becomes a proper standard. No-one can say they ‘support HTML5’ yet, because no-one knows yet what ‘HTML5’ is actually going to be.
In practical terms: there are some parts of HTML5 that have long been in use. There are some parts that you can use safely on modern browsers. There are some parts that you can use on new browsers except for IE. There are many parts you can use with fallback workarounds or ‘graceful degradation’. There are some parts you may never be able to use. For now you will have to learn each separately, because there won't be a browser that supports absolutely everything in HTML5 for many, many years. If ever. Add the extra features you like gradually as you go along and they're supported by a greater share of browsers; there will be no ‘big bang’ where everyone updates their browser at once.
As for data- attributes, well, yeah, you can kind of get away with using them, in that most browsers have always allowed any old attributes to go through anyway. This is typical of several HTML5 extensions, the browser doesn't need to explicitly ‘support’ it for it to work.
But since there are other ways of passing data (classes, comments, script blocks, etc.), I'm not wholly convinced it's worth dropping (universally supported, validatable against a fixed standard) HTML4/XHTML1 pages just for that one feature yet.

You might want to check out diveintohtml5.ep.io and modernizr.com.
Modernizr is a small and simple JavaScript library that helps you take advantage of emerging web technologies (CSS3, HTML 5) while still maintaining a fine level of control over older browsers that may not yet support these new technologies.
Here's an interactive chart of html feature support:
http://a.deveria.com/caniuse/
As you can see, there are a lot of browsers that support quite a few of the H5 features.

If you're using jQuery, concerned about interoperability, and the only reason you're investigating HTML5 is to use the data-* attribute set, then I would consider switching back to a better-supported doctype and using jQuery's $().data() method, which allows you to bind arbitrary pieces of data to DOM nodes, similar to how the data- attribute set does.
Example:
<button id="set">Click me!</button>
$('button#set').click(function(){
if($(this).data('name')){
alert('Clickin\' again so soon, ' + $(this).data('name') + '?');
}else{
$(this).data('name', prompt('Hey good lookin\', what\'s your name?', ''));
}
});
Try it out!

My answer might not be the one you would like but I would say - don't. Don't use HTML 5 just yet.

Use Protovis. It uses javascript and HMTL5. No Flash here. More important, Protovis has BSD license. So you can use it in commercial projects. Although D3 is the a newer project that authors of Protovis are working on.

Although this is an old(ish) question, the topic of browser support will always be relevant. There's no right way or wrong way to approach it, but take a look through one of the many browser feature support tables that show you what percentage of users will see a certain feature and then be brutal.
Don't try to please everyone. Don't kill yourself to catch a few percent of the Luddite's who are still using IE7. Next year, substitute that for IE8. Personally, I would be happy to lose 8% in order to spend that time on forward thinking practices rather than catering for those who don't know what an upgrade is.
Maybe your site will cause people to upgrade. These people will come round eventually.

So what if custom HTML attributes aren't valid XHTML?

I know that is the reason some people don't approve of them, but does it really matter? I think that the power that they provide, in interacting with JavaScript and storing and sending information from and to the server, outweighs the validation concern. Am I missing something? What are the ramifications of "invalid" HTML? And wouldn't a custom DTD resolve them anyway?

The ramification is that w3c comes along in 2, 5, 10 years and creates an attribute with the same name. Now your page is broken.
HTML5 is going to provide a data attribute type for legal custom attributes (like data-myattr="foo") so maybe you could start using that now and be reasonably safe from future name collisions.
Finally, you may be overlooking that custom logic is the rational behind the class attribute. Although it is generally thought of as a style attribute it is in reality a legal way to set custom meta-properties on an element. Unfortunately you are basically limited to boolean properties which is why HTML5 is adding the data prefix.
BTW, by "basically boolean" I mean in principle. In reality there is nothing to stop you using a seperator in your class name to define custom values as well as attributes.
class="document docId.56 permissions.RW"

Yes you can legally add custom attributes by using "data".
For example:
<div id="testDiv" data-myData="just testing"></div>
After that, just use the latest version of jquery to do something like:
alert($('#testDiv').data('myData'))
or to set a data attribute:
$('#testDiv').data('myData', 'new custom data')
And since jQuery works in almost all browsers, you shouldn't have any problems ;)
update
data-myData may be converted to data-mydata in some browsers, as far as the javascript engine is concerned. Best to keep it lowercase all the way.

Validation is not an end in itself, but a tool to be used to help catch mistakes early, and reduce the number of mysterious rendering and behavioural issues that your web pages may face when used on multiple browser types.
Adding custom attributes will not affect either of these issues now, and unlikely to do so in the future, but because they don't validate, it means that when you come to assess the output of a validation of your page, you will need to carefully pick between the validation issues that matter, and the ones that don't. Each time you change your page and revalidate, you have to repeat this operation. If your page validates entirely then you get a nice green PASS message, and you can move on the next stage of testing, or to the next change that needs to be made.

I've seen people obsessed with validation doing far worse/weird things than using a simple custom attribute:
<base href="http://example.com/" /><!--[if IE]></base><![endif]-->
In my opinion, custom attributes really don't matter. As other say, it may be good to watch out for future additions of attributes in the standards. But now we have data-* attributes in HTML5, so we're saved.
What really matters is that you have properly nested tags, and properly quoted attribute values.
I even use custom tag names (those introduced by HTML5, like header, footer, etc), but these ones have problems in IE.
By the way, I often find ironically how all those validation zealots bow in front of Google's clever tricks, like iframe uploads.

Instead of using custom attributes, you can associate your HTML elements with the attributes using JSON:
var customAttributes = { 'Id1': { 'custAttrib1': '', ... }, ... };
And as for the ramifications, see SpliFF's answer.

Storing multiple values in the class attribute is not correct code encapsulation and just a convoluted hack way of doing things. Take a custom ad rotator for instance that uses jquery. It is much cleaner on the page to do
<div class="left blue imagerotator" AdsImagesDir="images/ads/" startWithImage="0" endWithImage="10" rotatorTimerSeconds="3" />
and let some simple jquery code do the work from here.
Any developer or web designer now can work on the ad rotator and change values to this when asked without much ado.
Coming back to project a year later or coming into a new one where the previous developer split and went to an island somewhere in the pacific can be hell trying to figure out intentions when code is written in an unclear encrypted manner like this:
<div class="left blue imagerotator dir:images-ads endwith:10 t:3 tf:yes" />
When we write code in c# and other languages we don't write code putting all custom properties in one property as a space delimited string and end up having to parse that string every time we need to access or write to it. Think about the next person that will work on your code.

The thing with validation is that TODAY it may not matter, but you cannot know if it's going to matter tomorrow (and, by Murphy's law, it WILL matter tomorrow).
It's just better to choose a future-proof alternative. If they don't exist (they do in this particular case), the way to go is to invent a future proof alternative.
Using custom attributes is probably harmless, but still, why choose a potentially harmful solution just because you think (you can never be sure) it will cause no harm?. It might be worth to discuss this further if the future proof alternative was too costly or unwieldy, but this is certainly not the case.

Old discussion but nevertheless; in my opinion since html is a mark-up and not a progamming language, it should always be interpreted with leniency for mark-up 'errors'. A browser is perfectly able to do so. I don't think this will and should change ever. Therefore, the only important practical criteria is that your html will be displayed correctly by most browsers and will continue to do so in, say a few years. After that time, your html will probalbly be redesigned anyway.

Just to add my ingredient to the mix, validation is also important when you need to create content that can/could be post-processed using automated tools. If your content is valid you can much more easily convert markup from one format to another. For example, doing valid XHTML to XML with a specific schema is Much easier when parsing data that you know and can verify to follow a predictable format.
I, for example NEED my content to be valid XHTML because very often it is converted into XML for various jobs and then converted back without data loss or unexpected rendering results.

Well it depends on your client/boss/etc .. do they require it be validating XHTML?
Some people say there are a lot of workarounds - and depending on the sceneraio, they can work great. This includes adding classes, leveraging the rel attribute, and someone that has even written their own parser to extract JSON from HTML comments.
HTML5 provides a standard way to do this, prefix your custom attributes with "data-". I would recommend doing this now anyway, as there is a chance you may use an attribute that will be used down the track in standard XHTML.

Using non-standard HTML could make the browser render the page in "quirks mode", in which case some other parts of the page may render differently, and other things like positioning may be slightly different. Using a custom DTD may get around this, though.

Because they're not standard you have no idea what might happen, neither now, nor in the future. As others have said W3C might start using those same names in the future. But what's even more dangerous is that you don't know what the developers of "browser xxx" have done when they encounter they.
Maybe the page is rendered in quirks mode, maybe the page doesn't render at all on some obscure mobile browser, maybe the browser will leak memory, maybe a virus killer will choke on your page, etc, etc, etc.
I know that following the standards religiously might seem like snobbery. However once you have experienced problems due to not following them, you tend to stop thinking like that. However, then it's mostly too late, and you need to start your application from scratch with a different framework...

I think developers validate just to validate, but there is something to be said for the fact that it keeps markup clean. However, because every (exaggeration warning!) browser displays everything differently there really is no standard. We try to follow standards because it makes us feel like we at least have some direction. Some people argue that keeping code standard will prevent issues and conflicts in the future. My opinion: Screw that nobody implements standards correctly and fully today anyway, might as well assume all your code will fail eventually. If it works it works, use it, unless its messy or your just trying to ignore standards to stick it to W3C or something. I think its important to remember that standards are implemented very slowly, has the web changed all that much in 5 years. I'm sure anyone will have years of notice when they need to fix a potential conflict. No reason to plan for compatibility of standards in the future when you can't even rely on today's standards.
Oh I almost forgot, if your code doesn't validate 10 baby kittens will die. Are you a kitten killer?

Jquery .html(markup) doesn't work if markup is invalid.

Validation
You shouldn't need custom attributes to provide validation. A better approach would be to add validation based on fields actual task.
Assign meaning by using classes. I have classnames like:
date (Dates)
zip (Zip code)
area (Areas)
ssn (Social security number)
Example markup:
<input class="date" name="date" value="2011-08-09" />
Example javascript (with jQuery):
$('.date').validate(); // use your custom function/framework etc here.
If you need special validators for a certain or scenario you just invent new classes (or use selectors) for your
special case:
Example for checking if two passwords match:
<input id="password" />
<input id="password-confirm" />
if($('#password').val() != $('#password-confirm').val())
{
// do something if the passwords don't match
}
(This approach works quite seamless with both jQuery validation and the mvc .net framework and probably others too)
Bonus: You can assign multiple classes separated with a space class="ssn custom-one custom-two"
Sending information "from and to the server"
If you need to pass data back, use <input type="hidden" />. They work out of the box.
(Make sure you don't pass any sensitive data with hidden inputs since they can be modified by the user with almost no effort at all)

Is XHTML compliance pointless?

I'm building a site right now, so far I've painfully forced everything to be compliant and it looks pretty much the same across browsers. However, I'm starting to implement some third party/free javascripts which do things like add attributes (eg. order=2). I could work around this but it's a pain, and I'm starting to lose my principals of making sure everything is valid. Really, is there any point to working around something like this? I got the HTMLValidator plugin for firefox, and looking at most major sites (including this one, google, etc.), they aren't valid XHTML or HTML.

The validation is useful to determine when things are failing to meet standards you presumably agree with. If you are purposefully using a tool that specifically adds something not in the validation standards, obviously that does not break your personal standards agreement.
This discussion gets much more difficult if you have a boss or a client who believes everything should return the green light, as you'll have to explain the above to them and convince them it's not simply you being lazy.
That said, be sure it's not simply be a case of you being lazy. While the validators may annoyingly constantly bring up every instance of the third party attribute, that doesn't invalidate (ha) the other validation errors they're mentioning. It's often worth scanning through as a means of double-checking your work.

Standards compliance is about increasing the chance that your page will work in the browsers you don't test against. This includes screen readers, and the next update of the browsers you do test against, and browsers which you do test against but which have been configured in unexpected ways by the user.
Validating doesn't guarantee you anything, since it's possible for your page to validate but still be sufficiently ambiguous that it won't behave the way you want it to on some browser some day.
However, if your page does validate, you at least have the force of the XHTML spec saying how it should behave. If it doesn't validate, all you have is a bunch of informal conventions between browser writers.
It's probably better to write valid HTML 3 than invalid XHTML, if there's something you want to do which is allowed in one but not the other.

If you're planning on taking advantage of XHTML as XML, then it's worth it to make your pages valid and well formed. Otherwise, plain old semantic HTML is probably want you want. Either way, the needs of your audience outweigh the needs of a validator.

I have yet to experience an instance where the addition of a non-standard attribute has caused a rendering issue in any browser.
Don't try to work around those non-standard attributes. Validators are handy as tools to double check your code for unintentional mistakes, but as we all know, even fully valid xhtml will not always render consistently across browsers. There are many times when design decisions require us to use browser specific (and non-standard) hacks to achieve an effect. This is the life of a web developer as evidenced by the number of technology driving sites (google, yahoo, etc.) that do not validate.

Just keep in mind that the XHTML tag renders differently in most browsers than not having it. The DOCTYPE attribute determines what mode the browser renders in and dictates what is and isn't allowed. If you stray from the XHTML compliance just be sure to retest in all browsers.
Personally I stick with the latest standards whenever possible, but you have to weigh time/money against compliance for sure and it comes down to personal preference for most.

As far as browsers are concerned, XHTML compliance is pointless in that:
Browsers don't have XHTML parsers. They have non-version-specific, web-compatible HTML parsers that build a DOM around the http://www.w3.org/1999/xhtml namespace.
Some browsers that have XML parsers can treat XHTML markup served as application/xhtml+xml as XML. This will take the XML and give default HTML style and behavior to elements in the http://www.w3.org/1999/xhtml namespace. But, as far as parsing goes, it has nothing to do with XHTML. XML parsing rules are followed, not some XHTML DTD's rules.
So, when you use XHTML markup, you're giving something alien to browsers and seeing if it comes out as you intend. The thing is, you can do this with any markup. If it renders as intended and produces the correct DOM, you're doing pretty good. You just have to make sure to keep DOCTYPE switching in mind and make sure you're not relying on a browser bug (so things don't fall apart in browsers that don't have the bug).
What XHTML compliance is good for is syntax checking (by validating) to see if the markup is well formed. This helps avoid parsing bugs. Of course, this can be done with HTML also, so there's nothing special about XHTML in this case. Either way, you still have to test in browsers and hope browser vendors make awesome HTML parsers that can accept all kinds of crap.
What's not pointless is trying to conform to what browsers expect. HTML5 helps with this big time. And, speaking of HTML5, you can define custom attributes all you want. Just prefix them with data-, as in <p data-order="This is a valid, custom attribute.">test</p>.

Being HTML Valid is usually a help for both of you and the browser rendering engine. The less quirks the browsers have to deal with, the more they can focus on adding new features. The more strict you are, the less time you'll spend time wondering why this f##cking proprietary tag does not work in the other browsers.
On the other hand, XHTML is, IMHO, more pointless, except if you plan to integrate it within some XML document. As IE still does not recognize it, it's pretty useless to stay stick with.

I think writing "valid code" is important, simply because you're setting an example by following the rules. If every developer had written code for Fx, Safari and Opera, I think IE had to "start following the rules" sooner than with version 8.

I try write compliant code most of the time weighing the time/cost vs the needs of the audience in all cases but one. Where you code needs to be 503 compliant, it is in your best interest and the interest of your audience to write compliant code. I've come across a bunch of screen readers that blow up when the code is even slightly off.
Like the majority of posters said, it's really all about what your audience needs.

It's not pointless by any means, but there is plenty of justification for breaking it. During the initial stages of CSS development it's very useful for diagnosing browser issues if your markup is valid. Beyond that, if you want to do something and you feel the most appropriate method is to break the validation, that's usually ok.
An alternative to using custom attributes is to make use of the 'rel' attribute, for an example see Litebox (and its kin).

Sure, you could always just go ahead and write it in the way you want, making sure that at minimum it works. Of course, we've already suffered this mentality and have witnessed its output, Internet Explorer 6.
I am a big fan of the Mike Davidson approach to standards-oriented development.
Just because you can validate your code doesn’t mean you are better than anybody else. Heck, it doesn’t even necessarily mean you write better code than anybody else. Someone who can write a banking application entirely in Flash is a better coder than you. Someone who can integrate third-party code into a complicated publishing environment is a better coder than you. Think of validation as using picture perfect grammar; it helps you get your ideas across and is a sign of a good education, but it isn’t nearly as important as the ideas and concepts you think of and subsequently communicate. The most charismatic and possibly smartest person I’ve ever worked for was from the South and used the word “ain’t” quite regularly. It didn’t make him any less smart, and, in fact, it made him more memorable. So all I’m saying is there are plenty of things to judge someone on… validation is one of them, but certainly not the most important.
A lot of people misunderstand this post to mean that we shouldn't code to standards. We should, obviously, but it's not something that should even really be thought about. The validation army will always decry those that do not validate, but validation means so much more than valid code.
So, don't lose your principles, but remember that if you follow the standards you're a lot less likely to end up in the deep-end of issues in the future. The content you're trying to provide is far more important than how it is displayed.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008