HTML differences between browsers - html

Do you know of any differences in handling HTML tags/properties in different browsers? For example, I once saw a page with a input tag with a maxlength field set to "2o". Firefox and Opera ignore the "o", and set the max length to 2, while Internet Explorer ignores the field altogether. Do you know of any more?
(Note: seeing as this will probably be a list, it would be great if the general name of the difference was in bold text, like: Different erratic value handling in tag properties)

Bug Lists
Web developers have already compiled some pretty comprehensive lists; I think it's better to compile a list of resources than to duplicate those lists.
http://www.positioniseverything.net/
http://www.richinstyle.com/bugs/table.html
http://www.quirksmode.org/ (as mentioned by Kristopher Johnson)
Javascript
I agree with Craig - it's best to program Javascript using a library that handles differences between browsers (as well as simplify things like namespacing, AJAX event handling, and context). Here's the jump to Craig's answer (on this page).
CSS Resets
CSS Resets can really simplify web development. They override settings which vary slightly between browsers to give you a more common starting point. I like Yahoo's YUI Reset CSS.

Check out http://www.quirksmode.org/

If you are programming in javascript the best advice I can give is to use a javascript library instead of trying to roll your own. The libraries are well tested, and the corner cases are more likely to have been encountered.
Scriptalicious - http://script.aculo.us/
jQuery - http://jquery.com/
Microsoft AJAX - http://www.asp.net/ajax/
Dojo - http://dojotoolkit.org/
Prototype - http://www.prototypejs.org/
YUI - http://developer.yahoo.com/yui/

Do you know of any differences in handling HTML tags/properties in different browsers
Is this question asking for information on all differences, including DOM and CSS? Bit of a big topic. I thought the OP was asking about HTML behaviour specifically, not all this other stuff...

The one that really annoys me is IE's broken document.getElementById javascript function - in most browsers this will give you something that has the id you specify, IE is happy to give you something that has the value in the name attribute, even if there is something later in the document with the id you asked for.

I once saw a page with a input tag
with a maxlength field set to "2o".
In this specific case, you're talking about invalid code. The maxlength attribute can't contain letters, only numbers.
What browsers do with invalid code varies a great deal, as you can see for yourself.
If you're really asking "what do all the different browsers do when faced with HTML code that, for any one of an infinite number of reasons, is broken?", that way lies madness.
We can reduce the problem space a great deal by using valid code.
So, use valid HTML. Then you are left with two main problem areas:
browser bugs -- how the browser follows the HTML standard and what it does wrong
differences in browser defaults, like the amount of padding/margin it gives to the body

Inconsistent parsing of XHTML in HTML mode
HTML parsers are not designed to handle XML.
If an XHTML document is served as "text/html“ and the compatibilities guidelines are not followed you can get unexpected results.
Empty tags is one possible source of problems. <tag/> and <tag></tag> are equivalent in XML. However the HTML parser can interpret them in two ways.
For instance Opera and IE treat <br></br> as two <br> but Firefox and WebKit treat <br></br> as one <br>.

Related

Switch browser to a strict mode in order to write proper html code

Is it possible to switch a browser to a "strict mode" in order to write proper code at least during the development phase?
I see always invalid, dirty html code (besides bad javascript and css) and I feel that one reason is also the high tolerance level of all browsers. So at least I would be ready to have a stricter mode while I use the browser for the development for the pages in order to force myself to proper code.
Is there anything like that with any of the known browser?
I know about w3c-validator but honestly who is really using this frequently?
Is there maybe some sort of regular interface between browser and validator? Are there any development environments where the validation is tested automatically?
Is there anything like that with any of the known browser? Is there maybe some sort of regular interface between browser and validator? Are there any development environments where the validation is tested automatically?
The answer to all those questions is “No“. No browsers have any built-in integration like what you describe. There are (or were) some browser extensions that would take every single document you load and send it to the W3C validator for checking, but using one of those extensions (or anything else that automatically sends things to the W3C validator in the background) is a great way to get the W3C to block your IP address (or the IP-address range for your entire company network) for abuse of W3C services.
I know about w3c-validator but honestly who is really using this frequently?
The W3C validator currently processes around 17 requests every second—around 1.5 million documents every day—so I guess there are quite a lot of people using it frequently.
I see always invalid, dirty html code… I would be ready to have a stricter mode while I use the browser for the development for the pages in order to force myself to proper code.
I'm not sure what specifically you mean by “dirty html code” or “proper code“ but I can say that there are a lot of markup cases that are not bad or invalid but which some people mistakenly consider bad.
For example, some people think every <p> start tag should always have a matching </p> end tag but the fact is that from the time when HTML was created, it has never required documents to always have matching </p> end tags in all cases (in fact, when HTML was created, the <p> element was basically an empty element—not a container—and so the <p> tag simply was a marker.
Another example of a case that some people mistakenly think of as bad is the case of unquoted attribute values; e.g., <link rel=stylesheet …>. But that fact is that unless an attribute value contains spaces, it generally doesn't need to be quoted. So in fact there's actually nothing wrong at all with a case like <link rel=stylesheet …>.
So there's basically no point in trying to find a tool or mechanism to check for cases like that, because those cases are not actually real problems.
All that said, the HTML spec does define some markup cases as being errors, and those cases are what the W3C validator checks.
So if you want to catch real problems and be able to fix them, the answer is pretty simple: Use the W3C validator.
Disclosure: I'm the maintainer of the W3C validator. 😀
As #sideshowbarker notes, there isn't anything built in to all browsers at the moment.
However I do like the idea and wish there was such a tool also (that's how I got to this question)
There is a "partial" solution, in that if you use Firefox, and view the source (not the developer tools, but the CTRL+U or right click "View Page Source") Firefox will highlight invalid tag nesting, and attribute issues in red in the raw HTML source. I find this invaluable as a first pass looking at a page that doesn't seem to be working.
It is quite nice because it isn't super picky about the asdf id not being quoted, or if an attribute is deprecated, but it highlights glitchy stuff like the spacing on the td attributes is messed up (this would cause issues if the attributes were not quoted), and it caught that the span tag was not properly closed, and that the script tag is outside of the html tag, and if I had missed the doctype or had content before it, it flags that too.
Unfortunately "seeing" these issues is a manual process... I'd love to see these in the dev console, and in all browsers.
Most plugins/extensions only get access to the DOM after it has been parsed and these errors are gone or negated... however if there is a way to get the raw HTML source in one of these extension models that we can code an extension for to test for these types of errors, I'd be more than willing to help write one (DM #scunliffe on Twitter). Alternatively this may require writing something at a lower level, like a script to run in Fiddler.

Why does CSS work with fake elements?

In my class, I was playing around and found out that CSS works with made-up elements.
Example:
imsocool {
color:blue;
}
<imsocool>HELLO</imsocool>
When my professor first saw me using this, he was a bit surprised that made-up elements worked and recommended I simply change all of my made up elements to paragraphs with ID's.
Why doesn't my professor want me to use made-up elements? They work effectively.
Also, why didn't he know that made-up elements exist and work with CSS. Are they uncommon?
Why does CSS work with fake elements?
(Most) browsers are designed to be (to some degree) forward compatible with future additions to HTML. Unrecognised elements are parsed into the DOM, but have no semantics or specialised default rendering associated with them.
When a new element is added to the specification, sometimes CSS, JavaScript and ARIA can be used to provide the same functionality in older browsers (and the elements have to appear in the DOM for those languages to be able to manipulate them to add that functionality).
(There is a specification for custom elements, but they have specific naming requirements and require registering using JavaScript.)
Why doesn't my professor want me to use made-up elements?
They are not allowed by the HTML specification
They might conflict with future standard elements with the same name
There is probably an existing HTML element that is better suited to the task
Also; why didn't he know that made-up elements existed and worked with CSS. Are they uncommon?
Yes. People don't use them because they have the above problems.
TL;DR
Custom tags are invalid in HTML. This may lead to rendering issues.
Makes future development more difficult since code is not portable.
Valid HTML offers a lot of benefits such as SEO, speed, and professionalism.
Long Answer
There are some arguments that code with custom tags is more usable.
However, it leads to invalid HTML. Which is not good for your site.
The Point of Valid CSS/HTML | StackOverflow
Google prefers it so it is good for SEO.
It makes your web page more likely to work in browsers you haven't tested.
It makes you look more professional (to some developers at least)
Compliant browsers can render [valid HTML faster]
It points out a bunch of obscure bugs you've probably missed that affect things you probably haven't tested e.g. the codepage or language set of the page.
Why Validate | W3C
Validation as a debugging tool
Validation as a future-proof quality check
Validation eases maintenance
Validation helps teach good practices
Validation is a sign of professionalism
YADA (yet another (different) answer)
Edit: Please see the comment from BoltClock below regarding type vs tag vs element. I usually don't worry about semantics but his comment is very appropriate and informative.
Although there are already a bunch of good replies, you indicated that your professor prompted you to post this question so it appears you are (formally) in school. I thought I would expound a little bit more in depth about not only CSS but also the mechanics of web browsers. According to Wikipedia, "CSS is a style sheet language used for describing ... a document written in a markup language." (I added the emphasis on "a") Notice that it doesn't say "written in HTML" much less a specific version of HTML. CSS can be used on HTML, XHTML, XML, SGML, XAML, etc. Of course, you need something that will render each of these document types that will also apply styling. By definition, CSS does not know / understand / care about specific markup language tags. So, the tags may be "invalid" as far as HTML is concerned, but there is no concept of a "valid" tag/element/type in CSS.
Modern visual browsers are not monolithic programs. They are an amalgam of different "engines" that have specific jobs to do. At a bare minimum I can think of 3 engines, the rendering engine, the CSS engine, and the javascript engine/VM. Not sure if the parser is part of the rendering engine (or vice versa) or if it is a separate engine, but you get the idea.
Whether or not a visual browser (others have already addressed the fact that screen readers might have other challenges dealing with invalid tags) applies the formatting depends on whether the parser leaves the "invalid" tag in the document and then whether the rendering engine applies styles to that tag. Since it would make it more difficult to develop/maintain, CSS engines are not written to understand that "This is an HTML document so here are the list of valid tags / elements / types." CSS engines simply find tags / elements / types and then tell the rendering engine, "Here are the styles you should apply." Whether or not the rendering engine decides to actually apply the styles is up it.
Here is an easy way to think of the basic flow from engine to engine: parser -> CSS -> rendering. In reality it is much more convoluted but this is good enough for starters.
This answer is already too long so I will end there.
Unknown elements are treated as divs by modern browsers. That's why they work. This is part of the oncoming HTML5 standard that introduces a modular structure to which new elements can be added.
In older browsers (I think IE7-) you can apply a Javascript-trick after which they will work as well.
Here is a related question I found when looking for an example.
Here is a question about the Javascript fix. Turns out it is indeed IE7 that doesn't support these elements out of the box.
Also; why didn't he know that made-up tags existed and worked with CSS. Are they uncommon?
Yes, quite. But especially: they don't serve additional purpose. And they are new to html5. In earlier versions of HTML an unknown tag was invalid.
Also, teachers seem to have gaps in their knowledge, sometimes. This might be due to the fact that they need to teach students the basics about a given subject, and it doesn't really pay off to know all ins and outs and be really up to date.
I once got detention because a teacher thought I programmed a virus, just because I could make a computer play music using the play command in GWBasic. (True story, and yes, long ago). But whatever the reason, I think the advice not to use custome elements is a sound one.
Actually you can use custom elements. Here is the W3C spec on this subject:
http://w3c.github.io/webcomponents/spec/custom/
And here is a tutorial explaining how to use them:
http://www.html5rocks.com/en/tutorials/webcomponents/customelements/
As pointed out by #Quentin: this is a draft specification in the early days of development, and that it imposes restrictions on what the element names can be.
There are a few things about the other answers that are either just poorly phrased or perhaps a little incorrect.
FALSE(ish): Non-standard HTML elements are "not allowed", "illegal", or "invalid".
Not necessarily. They're "non-conforming". What's the difference? Something can "not conform" and still be "allowed". The W3C aren't going to send the HTML police to your home and haul you away.
The W3C left things this way for a reason. Conformance and specifications are defined by a community. If you happen to have a smaller community consuming HTML for more specific purposes and they all agree on some new Elements they need to make things easier, they can have what the W3C refers to as "other applicable specifications". (this is a gross over simplification, obviously, but you get the idea)
That said, strict validators will declare your non-standard elements to be "invalid". but that's because the validator's job is to ensure conformance to whatever spec it's validating for, not to ensure "legality" for the browser or for use.
FALSE(ish): Non-standard HTML elements will result in rendering issues
Possibly, but unlikely. (replace "will" with "might") The only way this should result in a rendering issue is if your custom element conflicts with another specification, such as a change to the HTML spec or another specification being honored within the same system (such as SVG, Math, or something custom).
In fact, the reason CSS can style non-standard tags is because the HTML specification clearly states that:
User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them
Note: if you want to use a custom tag, just remember a change to the HTML spec at a later time could blow your styling up, so be prepared. It's really unlikely that the W3C will implement the <imsocool> tag, however.
Non-standard tags and JavaScript (via the DOM)
The reason you can access and alter custom elements using JavaScript is because the specification even talks about how they should be handled in the DOM, which is the (really horrible) API that allows you to manipulate the elements on your page.
The HTMLUnknownElement interface must be used for HTML elements that are not defined by this specification (or other applicable specifications).
TL;DR: Conforming to the spec is done for purposes of communication and safety. Non-conformance is still allowed by everything but a validator, whose sole purpose is to enforce conformity, but whose use is optional.
For example:
var wee = document.createElement('wee');
console.log(wee.toString()); //[object HTMLUnknownElement]
(I'm sure this will draw flames, but there's my 2 cents)
According to the specs:
CSS
A type selector is the name of a document language element type written using the syntax of CSS qualified names
I thought this was called the element selector, but apparently it is actually the type selector. The spec goes on to talk about CSS qualified names which put no restriction on what the names actually are. That is to say that as long as the type selector matches CSS qualified name syntax it is technically correct CSS and will match the element in the document. There is no CSS-specific restriction on elements that do not exist in a particular spec -- HTML or otherwise.
HTML
There is no official restriction on including any tags in the document that you want. However, the documentation does say
Authors must not use elements, attributes, or attribute values for purposes other than their appropriate intended semantic purpose, as doing so prevents software from correctly processing the page.
And it later says
Authors must not use elements, attributes, or attribute values that are not permitted by this specification or other applicable specifications, as doing so makes it significantly harder for the language to be extended in the future.
I'm not sure specifically where or if the spec says that unkown elements are allowed, but it does talk about the HTMLUnknownElement interface for unrecognized elements. Some browsers may not even recognize elements that are in the current spec (IE8 comes to mind).
There is a draft for custom elements, though, but I doubt it is implemented anywhere yet.
This is possible with html5 but you need to take into consideration of older browsers.
If you do decide to use them then, make sure to COMMENT your html!! Some people may have some trouble figuring out what it is so a comment could save them a ton of time.
Something like this,
<!-- Custom tags in use, refer to their CSS for aid -->
When you make your own custom tag/elements the older browsers will have no clue what that is just like html5 elements like nav/section.
If you are interested in this concept then I recommend to do it the right way.
Getting started
Custom Elements allow web developers to define new types of HTML
elements. The spec is one of several new API primitives landing under
the Web Components umbrella, but it's quite possibly the most
important. Web Components don't exist without the features unlocked by
custom elements:
Define new HTML/DOM elements Create elements that extend from other
elements Logically bundle together custom functionality into a single
tag Extend the API of existing DOM elements
There is a lot you can do with it and it does make your script beautiful as this article likes to put it. Custom Elements defining new elements in HTML.
So lets recap,
Pros
Very elegant and easy to read.
It is nice to not see so many divs. :p
Allows a unique feel to the code
Cons
Older browser support is a strong thing to consider.
Other developers may have no clue what to do if they don't know about custom tags. (Explain to them or add comments to inform them)
Lastly one thing to take into consideration, but I am unsure, is block and inline elements. By using custom tags you are going to end up writing more css because of the custom tag won't have a default side to it.
The choice is entirely up to you and you should base it on what the project is asking for.
Update 1/2/2014
Here is a very helpful article I found and figured I would share, Custom Elements.
Learn the tech Why Custom Elements? Custom Elements let authors define
their own elements. Authors associate JavaScript code with custom tag
names, and then use those custom tag names as they would any standard
tag.
For example, after registering a special kind of button called
super-button, use the super button just like this:
Custom elements are still elements. We
can create, use, manipulate, and compose them just as easily as any
standard or today.
This seems like a very good library to use but I did notice it didn't pass Window's Build status. This is also in a pre-alpha I believe so I would keep an eye on this while it develops.
Why doesn't he want you to use them? They are not common nor part of the HTML5 standard.
Technically, they are not allowed. They are a hack.
I like them myself, though. You may be interested in XHTML5. It allows you to define your own tags and use them as part of the standard.
Also, as others have pointed out, they are invalid and thus not portable.
Why didn't he know that they exist? I don't know, except that they are not common. Possibly he was just not aware that you could.
Made-up tags are hardly ever used, because it's unlikely that they will work reliably in every current browser, and every future browser.
A browser has to parse the HTML code into elements that it knows, to made-up tags will be converted into something else to fit in the document object model (DOM). As the web standards doesn't cover how to handle everyting that is outside of the standards, web browsers tend to handle non-standars code in different ways.
Web development is tricky enough with a bunch of different browsers that have their own quirks, without adding another element of uncertainty. The best bet it to stick with things that are actually in the standards, that is what the browser vendors try to follow, so that has the best chance to actually work.
I think made-up tags are just potentially more confusing or unclear than p's with IDs (some block of text generally). We all know a p with an ID is a paragraph, but who knows what made-up tags are intended for? At least that's my thought. :) Therefore this is more of a style / clarity issue than one of functionality.
Others have made excellent points but its worth noting that if you look at a framework such as AngularJS, there is a very valid case for custom elements and attributes. These convey not only better semantic meaning to the xml, but they also can provide behavior, look and feel for the web page.
CSS is a style sheet language that can be used to present XML documents, not only (X)HTML documents. Your snippet with the made-up tags could be part of a legal XML document; it would be one if you enclose it in a single root element. Probably you already have a <html> ...</html> around it? Any current browser can display XML documents.
Of course it is not a very good XML document, it lacks a grammar and an XML declaration. If you use an HTML declaration header instead (and probably a server configuration that sends the correct mime type) it would instead be illegal HTML.
(X)HTML has advantages over plain XML as elements have a semantic meaning that is useful in the context of a web page presentation. Tools can work with this semantics, other developers know the meaning, it is less error prone and better to read.
But in other contexts it is better to use CSS with XML and/or XSLT to do the presentation. This is what you did. As this wasn't your task, you didn't know what you were doing, and HTML/CSS is the better way to go most of the time you should stick to it in your scenario.
You should add an (X)HTML header to your document so tools can give you meaningful error messages.
...I simply change all of my made up tags to paragraphs with ID's.
I actually take issue with his suggestion of how to do it properly.
A <p> tag is for paragraphs. I see people using it all the time instead of a div -- simply for spacing purposes or because it seems gentler. If it's not a paragraph, don't use it.
You don't need or want to stick ID's on everything unless you need to target it specifically (e.g. with Javascript). Use classes or just a straight-up div.
From its early days CSS was designed to be markup agnostic so it can be used with any markup language producing tree alike DOM structures (SVG for example). Any tag that comply to name token production is perfectly valid in CSS. So your question is rather about HTML than CSS itself.
Elements with custom tags are supported by HTML5 specification. HTML5 standardize the way how unknown elements must be parsed in the DOM. So HTML5 is the first HTML specification that enables custom elements strictly speaking. You just need to use HTML5 doctype <!DOCTYPE html> in your document.
As of custom tag names themselves...
This document http://www.w3.org/TR/custom-elements/ recommends custom tags you choose to contain at least one '-' (dash) symbol. This way they will not conflict with future HTML elements. Therefore you'd better change your doc to something like this:
<style>
so-cool {
color:blue;
}
</style>
<body>
<so-cool>HELLO</so-cool>
</body>
Surprisingly, nobody (including my past self) mentioned accessibility. Another reason that using valid tags instead of custom ones is for compatibility with the greatest amount of software, including screen-readers and other tools that people need for accessibility purposes. Moreover, accessibility laws like WAI require making accessible websites, which generally means requiring them to use valid markup.
Apparently nobody mentioned it, so I will.
This is a by-product of browser wars.
Back in the 1990’s when the Internet was first starting to go mainstream, competition incrased in the browser market. To stay competitive and draw users, some browsers (most notably Internet Explorer) tried to be helpful and “user-friendly” by attempting to figure out what page designers meant and thus allowed markup that are incorrect (e.g., <b><i>foobar</b></i> would correctly render as bold-italics).
This made sense to some degree because if one browser kept complaining about syntax errors while another ate anything you threw at it and spit out a (more-or-less) correct result, then people would naturally flock to the latter.
While many thought the browser wars were over, a new war between browser vendors has reignited in the past few years since Chrome was released, Apple started growing again and pushing Safari, and IE lost its dominance. (You could call it a “cold war” due to the perceived cooperation and support of standards by browser vendors.) Therefore, it is not a surprise that even contemporary browsers which supposedly conform strictly to web standards actually try to be “clever” and allow standard-breaking behavior such as this in order to try to gain an advantage as before.
Unfortunately, this permissive behavior led to a massive (some might even say cancerous) growth of poorly marked up webpages. Because IE was the most lenient and popular browser, and due to Microsoft’s continued flouting of standards, IE became infamous for encouraging and promoting bad design and propagating and perpetuating broken pages.
You may be able to get away with using quirks and exploits like that on some browsers for now, but other than the occasional puzzle or game or something, you should always stick to web standards when creating web pages and sites to ensure they display correctly and avoid them becoming broken (possibly completely ignored) with a browser update.
While browsers will generally relate CSS to HTML tags regardless of whether or not they are valid, you should ABSOLUTELY NOT do this.
There is technically nothing wrong with this from a CSS perspective. However, using made up tags is something you should NEVER do in HTML.
HTML is a markup language, which means that each tag corresponds to a specific type of information.
Your made up tags don't correspond to any type of information. This will create problems from web crawlers, such as Google.
Read more information on the importance of correct markup.
Edit
Divs refer to groups of multiple related elements, meant to be displayed in block form and can be manipulated as such.
Spans refer to elements that are to be styled differenly than the context they are currently in and are meant to be displayed inline, not as a block. An example is if a few words in a sentence needs to be all caps.
Custom tags do not correlate to any standards and thus span/div should be used with class/ID properties instead.
There are very specific exemptions to this, such as Angular JS
Although CSS has a thing called a "tag selector," it doesn't actually know what a tag is. That's left for the document's language to define. CSS was designed to be used not just with HTML, but also with XML, where (assuming you're not using a DTD or other validation scheme) the tags can be just about anything. You could use it with other languages too, though you would need to come up with your own semantics for exactly what things like "tags" and "attributes" correspond to.
Browsers generally apply CSS to unknown tags in HTML, because this is considered better than breaking completely: at least they can display something. But it is very bad practice to use "fake" tags deliberately. One reason for this is that new tags do get defined from time to time, and if one is defined that looks sort of like your fake tag but doesn't quite work the same way, that can cause problems with your site on new browsers.
Why does CSS work with fake elements? Because it doesn't hurt anyone because you're not supposed to use them anyways.
Why doesn't my professor want me to use made-up elements? Because if that element is defined by a specification in the future your element will have an unpredictable behavior.
Also, why didn't he know that made-up elements exist and work with CSS. Are they uncommon? Because he, like most other web developers, understand that we shouldn't use things that might break randomly in the future.

How should user agents handle unrecognized HTML elements?

I have tried to find an answer to this in the W3C HTML specifications, but haven't had any luck so far.
For example, if I have the following HTML code:
<body>
<p>
<foo>bar</foo>
</p>
</body>
Does W3C specify how a user agent should handle this? E.g should the "foo" element be completely ignored? Should the "foo" element be ignored but the content "bar" parsed?
Also, is it even "legal" to do this?
Edit: Some excellent answers from all of you! I totally agree that it would be bad practice to embed generic XML unless, possibly, if you have complete control over which browser your users will use. I was mostly curious about what actually would or should happen if such markup were to be produced :-)
The HTML spec doesn't say much about it, other than:
The HTMLUnknownElement interface must be used for HTML elements that are not defined by this specification (or other applicable specifications).
This can be verified in conforming browsers using the following JavaScript code in the console:
Object.prototype.toString.call(document.createElement("foo"));
//-> "[object HTMLUnknownElement]"
However, some browsers either don't follow the specification here yet. For instance, Chrome 13 gives [object HTMLElement], IE 8 gives [object HTMLGenericElement] (IE 9 is correct).
As far as I'm aware, all browsers will parse <foo> as an element, but default styling and behaviour is not guaranteed to be the same. Where HTMLUnknownElement is implemented and the spec is followed, it should inherit directly from HTMLElement and, therefore, have many of the default properties found on other elements.
Please note that your HTML will not validate when you have non-standard elements in your markup. It's also worth mentioning that search engine crawlers, screen readers and other software will not be able to extract semantic meaning from these elements.
Further reading:
Why generic XML on the web is a bad idea and 386: Generic Elements; Still a Bad Idea - Anne van Kesteren's blog (2005, 2010)
Some excellent advice from #Andy E. This is just some add-ons to that.
The HTML5 draft does define how to parse unknown elements, however, it is distinctly non-trivial. To see the rules, see http://dev.w3.org/html5/spec/tree-construction.html
Note that the first version of Firefox to use these rules is FireFox 4, and the first version of IE to use the rules is IE 10. Older versions have a number of different and often very strange behaviours.
HTML has no notion of "legality", only validity or conformance to a standard. You are free to decide whether you want your pages to conform to any particular standard or not. There is no W3C standard of HTML where use of arbitrarily named elements is conforming.
It is generally advisable to make your HTML conforming to avoid unpredictable errors in browsers and other HTML consumers that you haven't tested against.
"bar" should definitely be rendered. For example, in the HTML5 video element, the contents of the element contain fallback content to be displayed in older browsers for exactly this reason. It's also why people traditionally put comments around style declarations:
<style><!--
(styling goes here)
--></style>
to hide the styling information from pre-HTML 4 browsers. (I think the comments aren't considered good practice any more.)

Has anyone come accross a browser where a custom attribute didn't work?

The web application im working on has a few custom attributes on HTML elements to store data that is output.
Only happens here and there and so far I haven't noticed anything wrong in how the page is rendered on IE7, IE8, FF 3.5 and Chrome 3.
I'd like to assume everything will be ok but just wanted to check if anyone else has had problems with custom attributes in other browsers.
I understand its not part of standards to add custom attributes but all that matters to me is how the page is output to customers.
html5 supports custom attributes with names starting with "data-". Using those yields the smallest chance that anything breaks in the future.
Browsers silently ignore tags or tag attributes they do not understand, so you're good. That said, your HTML won't validate (I know you said you don't care, but still) plus there are other possible ramifications.
See this question for more details.
Just be sure to use the same case when referencing the attribute in code. I've had issues in the past with Internet Explorer returning null with getAttribute because my case didn't match what was defined in markup or previously in code.

So what if custom HTML attributes aren't valid XHTML?

I know that is the reason some people don't approve of them, but does it really matter? I think that the power that they provide, in interacting with JavaScript and storing and sending information from and to the server, outweighs the validation concern. Am I missing something? What are the ramifications of "invalid" HTML? And wouldn't a custom DTD resolve them anyway?
The ramification is that w3c comes along in 2, 5, 10 years and creates an attribute with the same name. Now your page is broken.
HTML5 is going to provide a data attribute type for legal custom attributes (like data-myattr="foo") so maybe you could start using that now and be reasonably safe from future name collisions.
Finally, you may be overlooking that custom logic is the rational behind the class attribute. Although it is generally thought of as a style attribute it is in reality a legal way to set custom meta-properties on an element. Unfortunately you are basically limited to boolean properties which is why HTML5 is adding the data prefix.
BTW, by "basically boolean" I mean in principle. In reality there is nothing to stop you using a seperator in your class name to define custom values as well as attributes.
class="document docId.56 permissions.RW"
Yes you can legally add custom attributes by using "data".
For example:
<div id="testDiv" data-myData="just testing"></div>
After that, just use the latest version of jquery to do something like:
alert($('#testDiv').data('myData'))
or to set a data attribute:
$('#testDiv').data('myData', 'new custom data')
And since jQuery works in almost all browsers, you shouldn't have any problems ;)
update
data-myData may be converted to data-mydata in some browsers, as far as the javascript engine is concerned. Best to keep it lowercase all the way.
Validation is not an end in itself, but a tool to be used to help catch mistakes early, and reduce the number of mysterious rendering and behavioural issues that your web pages may face when used on multiple browser types.
Adding custom attributes will not affect either of these issues now, and unlikely to do so in the future, but because they don't validate, it means that when you come to assess the output of a validation of your page, you will need to carefully pick between the validation issues that matter, and the ones that don't. Each time you change your page and revalidate, you have to repeat this operation. If your page validates entirely then you get a nice green PASS message, and you can move on the next stage of testing, or to the next change that needs to be made.
I've seen people obsessed with validation doing far worse/weird things than using a simple custom attribute:
<base href="http://example.com/" /><!--[if IE]></base><![endif]-->
In my opinion, custom attributes really don't matter. As other say, it may be good to watch out for future additions of attributes in the standards. But now we have data-* attributes in HTML5, so we're saved.
What really matters is that you have properly nested tags, and properly quoted attribute values.
I even use custom tag names (those introduced by HTML5, like header, footer, etc), but these ones have problems in IE.
By the way, I often find ironically how all those validation zealots bow in front of Google's clever tricks, like iframe uploads.
Instead of using custom attributes, you can associate your HTML elements with the attributes using JSON:
var customAttributes = { 'Id1': { 'custAttrib1': '', ... }, ... };
And as for the ramifications, see SpliFF's answer.
Storing multiple values in the class attribute is not correct code encapsulation and just a convoluted hack way of doing things. Take a custom ad rotator for instance that uses jquery. It is much cleaner on the page to do
<div class="left blue imagerotator" AdsImagesDir="images/ads/" startWithImage="0" endWithImage="10" rotatorTimerSeconds="3" />
and let some simple jquery code do the work from here.
Any developer or web designer now can work on the ad rotator and change values to this when asked without much ado.
Coming back to project a year later or coming into a new one where the previous developer split and went to an island somewhere in the pacific can be hell trying to figure out intentions when code is written in an unclear encrypted manner like this:
<div class="left blue imagerotator dir:images-ads endwith:10 t:3 tf:yes" />
When we write code in c# and other languages we don't write code putting all custom properties in one property as a space delimited string and end up having to parse that string every time we need to access or write to it. Think about the next person that will work on your code.
The thing with validation is that TODAY it may not matter, but you cannot know if it's going to matter tomorrow (and, by Murphy's law, it WILL matter tomorrow).
It's just better to choose a future-proof alternative. If they don't exist (they do in this particular case), the way to go is to invent a future proof alternative.
Using custom attributes is probably harmless, but still, why choose a potentially harmful solution just because you think (you can never be sure) it will cause no harm?. It might be worth to discuss this further if the future proof alternative was too costly or unwieldy, but this is certainly not the case.
Old discussion but nevertheless; in my opinion since html is a mark-up and not a progamming language, it should always be interpreted with leniency for mark-up 'errors'. A browser is perfectly able to do so. I don't think this will and should change ever. Therefore, the only important practical criteria is that your html will be displayed correctly by most browsers and will continue to do so in, say a few years. After that time, your html will probalbly be redesigned anyway.
Just to add my ingredient to the mix, validation is also important when you need to create content that can/could be post-processed using automated tools. If your content is valid you can much more easily convert markup from one format to another. For example, doing valid XHTML to XML with a specific schema is Much easier when parsing data that you know and can verify to follow a predictable format.
I, for example NEED my content to be valid XHTML because very often it is converted into XML for various jobs and then converted back without data loss or unexpected rendering results.
Well it depends on your client/boss/etc .. do they require it be validating XHTML?
Some people say there are a lot of workarounds - and depending on the sceneraio, they can work great. This includes adding classes, leveraging the rel attribute, and someone that has even written their own parser to extract JSON from HTML comments.
HTML5 provides a standard way to do this, prefix your custom attributes with "data-". I would recommend doing this now anyway, as there is a chance you may use an attribute that will be used down the track in standard XHTML.
Using non-standard HTML could make the browser render the page in "quirks mode", in which case some other parts of the page may render differently, and other things like positioning may be slightly different. Using a custom DTD may get around this, though.
Because they're not standard you have no idea what might happen, neither now, nor in the future. As others have said W3C might start using those same names in the future. But what's even more dangerous is that you don't know what the developers of "browser xxx" have done when they encounter they.
Maybe the page is rendered in quirks mode, maybe the page doesn't render at all on some obscure mobile browser, maybe the browser will leak memory, maybe a virus killer will choke on your page, etc, etc, etc.
I know that following the standards religiously might seem like snobbery. However once you have experienced problems due to not following them, you tend to stop thinking like that. However, then it's mostly too late, and you need to start your application from scratch with a different framework...
I think developers validate just to validate, but there is something to be said for the fact that it keeps markup clean. However, because every (exaggeration warning!) browser displays everything differently there really is no standard. We try to follow standards because it makes us feel like we at least have some direction. Some people argue that keeping code standard will prevent issues and conflicts in the future. My opinion: Screw that nobody implements standards correctly and fully today anyway, might as well assume all your code will fail eventually. If it works it works, use it, unless its messy or your just trying to ignore standards to stick it to W3C or something. I think its important to remember that standards are implemented very slowly, has the web changed all that much in 5 years. I'm sure anyone will have years of notice when they need to fix a potential conflict. No reason to plan for compatibility of standards in the future when you can't even rely on today's standards.
Oh I almost forgot, if your code doesn't validate 10 baby kittens will die. Are you a kitten killer?
Jquery .html(markup) doesn't work if markup is invalid.
Validation
You shouldn't need custom attributes to provide validation. A better approach would be to add validation based on fields actual task.
Assign meaning by using classes. I have classnames like:
date (Dates)
zip (Zip code)
area (Areas)
ssn (Social security number)
Example markup:
<input class="date" name="date" value="2011-08-09" />
Example javascript (with jQuery):
$('.date').validate(); // use your custom function/framework etc here.
If you need special validators for a certain or scenario you just invent new classes (or use selectors) for your
special case:
Example for checking if two passwords match:
<input id="password" />
<input id="password-confirm" />
if($('#password').val() != $('#password-confirm').val())
{
// do something if the passwords don't match
}
(This approach works quite seamless with both jQuery validation and the mvc .net framework and probably others too)
Bonus: You can assign multiple classes separated with a space class="ssn custom-one custom-two"
Sending information "from and to the server"
If you need to pass data back, use <input type="hidden" />. They work out of the box.
(Make sure you don't pass any sensitive data with hidden inputs since they can be modified by the user with almost no effort at all)