When can XHTML unexpectedly cause problems in IE?

Since IE won't render XHTML as XHTML but treats it as HTML instead, when can this actually cause problems in IE?

I know of one case:
<div style="clear:both" />
In browsers that support XHTML, the div is closed. IE, however, treats the div as still open, so the layout can produce unexpected results later in the page.

Internet Explorer will have trouble distinguishing XHTML documents from XML documents if the MIME type is not specified as text/html. However, because it fully supports HTML 4.01, the majority of problems arise from inconsistent and non-standard implementations of positioning, layout, and CSS properties. To avoid problems, it is best to write valid XHTML and specify a DOCTYPE.
A list of all known Internet Explorer Bugs

Self-closing syntax won't work (it will appear to work only on elements that are always empty in HTML). XML serializers might generate <textarea/>, <script/> and similar, which break pages in various ways (triggering complicated error recovery, sometimes involving re-parsing of the remainder of the page).
Explicitly closed HTML "empty" elements might behave oddly (</br> inserts a break in IE).
<![CDATA[ outside HTML's hardcoded CDATA elements will be recognized as a tag. It won't affect escaping and might make some content disappear.
In HTML's CDATA elements (namely <script>) entities won't be recognized. XHTML requires <script> if (1 &lt; 2) … which is going to be a syntax error in IE.
Background of <body> will be applied differently in IE.
There will be no cross-browser syntax for namespace-aware selectors in CSS.
You'll get all implied HTML elements (e.g. <tbody> in all tables) and implicitly closed elements (it's usually not a problem when the document is valid, but other browsers won't warn you as long as the markup is well-formed).
Elements and attributes with prefixes won't be namespaced and will get different tagName in IE (which is also illegal in XML). They won't get appropriate default styling and behaviour either (<xhtml:a> can't be a link).
You won't be able to use namespace-aware methods like createElementNS (they don't exist in IE). Also, .tagName will be uppercase in IE, but not in all cases.
These are only the problems concerning switching from a working XML document to HTML. There are as many surprises when you're going from HTML (i.e. what everyone expects and takes as normal behavior) to real XML, e.g. document.write doesn't work, rendering most of Google's scripts useless.

These all apply to any browser treating XHTML as text/html rather than specifically IE, but you should read Appendix C of the XHTML 1.0 spec here: http://www.w3.org/TR/xhtml1/#guidelines

Related

HTML - An XML with Predefined CSS

Since HTML and XML both describe the structure of data, is the only difference between them, from a technical point of view, that an HTML tag has a predefined CSS style? Am I correct?
Not really.
HTML, for historical reasons, does not conform to XML syntax requirements (unless you're talking about XHTML). The most obvious example is the <br> tag, which does not have a closing tag. Also, browsers are extremely forgiving in what they accept and will try to do something halfway meaningful even if the HTML is not valid. This is in stark contrast to XML parsers, which will reject any XML that is not well-formed.
You are correct that browsers implement a default CSS stylesheet, but there are subtle differences between browsers and between different versions of browsers, so many frameworks clear all defaults and re-specify the CSS for every element.

Why do many developers mix HTML5 and XHTML?

I see many posts related to this but have never seen a good explanation of why people do it, or of what the best practice is professionally.
We all normally use HTML5 rules, and that's fine, but is there any reason to also follow XHTML rules in modern design, "just in case"?
I see many WordPress, Joomla, Drupal and some lesser-known templates these days that combine HTML5 and XHTML conventions: properly nested HTML tags, no attribute minimization, closed empty elements like <br />, <hr />, <img />, <input />, etc.
Why this mix? Is it for old-browser support, old-school development mixed with new technology, or just a lack of knowledge of the HTML5 rules?
XHTML had really strict rules, and the browsers wouldn't show things correctly if they weren't coded using the correct syntax.
HTML5 is not that strict. Even if you write a page with doctype set to HTML5, XHTML code will still work.
In some cases it is still a good idea to use XHTML, e.g. for e-books. Even though the EPUB format now supports HTML5, older screen readers still don't. Because of this, a lot of e-books are still written using XHTML.
The context is understanding the difference between what version of HTML (HTML5) and which parser are being used in combination.
The HTML parser is loose and will accept almost anything.
The XML parser is strict and will not tolerate poor code.
Also:
XHTML (application/xhtml+xml) is an application of XML.
HTML (up to 4.01) was defined as an application of SGML.
So you can use HTML5 with the XML parser; my web platform does this (see my profile).
Why serve HTML5 as XML? I had already been using XHTML 1.1 years ago when I witnessed a thread on a PHP programming forum. Someone could not figure out why Safari would not style an element the way all of the other browsers did. After three days he figured out that he was missing a quote on an attribute. If he had been serving the page as application/xhtml+xml, the page would have broken (in Gecko-based browsers such as Firefox and Waterfox the whole page breaks; other browsers render up to the error), and he would have noticed the issue, fixed it, and recovered in seconds.
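The missing-quote anecdote above can be reproduced offline with any strict XML parser. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree as a stand-in for a browser's XML parser; the helper name is_well_formed and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


good = '<p class="note">Hello</p>'
bad = '<p class="note>Hello</p>'  # closing quote on the attribute is missing

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A lenient HTML parser would try to recover from the second string instead of rejecting it, which is why the mistake in the anecdote went unnoticed for so long.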
Those websites are not XHTML, they are simply using an XHTML doctype. The page must be served as application/xhtml+xml (see the network requests panel in any browser developer tools) to be considered XHTML (e.g. XHTML5) otherwise it's actually HTML code with invalid bits of code that are ignored by the browser.
Whether your comment about the trailing slash is correct depends on what you meant, because the comment is vague. If you meant that people generally switched from XHTML 1 to HTML5, then yes. If you meant that XHTML now allows omitting the trailing forward slash, then no: XML and XHTML require the trailing slash without exception.
The correct syntax for HTML's elements base, link, meta, hr, br, wbr, source, img, embed, param, track, area, col, and input (called void elements in HTML 5) is not to use an XML-style empty-element tag.
From the current WHATWG/W3C HTML specification at W3C:
Void elements only have a start tag; end tags must not be specified for void elements
This isn't a case of XML/XHTML being stricter than HTML or something; it's just due to HTML's SGML legacy: in HTML 4's SGML grammar (DTD) from 1999 these elements were declared to have content EMPTY. If anything, using XML-style empty-element syntax is less formal, since it is merely tolerated and ignored by HTML 5 parsers; a start tag followed by an end tag for a void element, however, is not tolerated.
See also How to find empty elements in html5 for a more elaborate discussion of empty elements.
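To make the void-element rule concrete, here is a rough sketch that scans markup for end tags written on void elements. It assumes Python's standard html.parser module as the tokenizer, which is not a full HTML5 parser. The names VoidEndTagFinder and find_void_end_tags are hypothetical, and the element set is copied from the list in the answer above.

```python
from html.parser import HTMLParser

# Void elements as listed in the answer above.
VOID_ELEMENTS = {
    "base", "link", "meta", "hr", "br", "wbr", "source",
    "img", "embed", "param", "track", "area", "col", "input",
}


class VoidEndTagFinder(HTMLParser):
    """Collects end tags such as </br> that are written for void elements."""

    def __init__(self):
        super().__init__()
        self.offenders = []

    def handle_startendtag(self, tag, attrs):
        # <br /> is a single tag, not an end tag. The base class would
        # otherwise report it as a start tag followed by an end tag.
        pass

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            self.offenders.append(tag)


def find_void_end_tags(markup):
    """Return the names of void elements that were given an end tag."""
    finder = VoidEndTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.offenders


print(find_void_end_tags("<p>a<br></br>b</p>"))            # ['br']
print(find_void_end_tags("<p>a<br />b<img src='x'></p>"))  # []
```

The tolerated <br /> form passes, while the explicit </br> end tag is flagged, mirroring the distinction the answer draws.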

<div/><div/> auto-closing creates a child, not a sibling

I really like to use "short closing" for tags, using the ordinary <tag/> format, but unfortunately doing so in a browser (e.g. Chrome) causes quite unexpected behavior.
When my document contains:
<div/><div/>
it's interpreted as
<div>
<div></div>
</div>
No matter which DOCTYPE I use (XHTML or HTML5), I get this wrong result.
I'm also using this notation for custom tags in a namespace, <widget:aSampleWidgetA/> <widget:aSampleWidgetB/>, which also triggers this problem.
I don't want to use the full closing notation, as it makes a lot of visual mess in the code.
Is there some way to force the browser to parse those tags as proper XML?
Apologies, I can't find great documentation on this, but I suspect it is because div is not a valid self-closing tag. In the XHTML DTD, empty elements are specifically declared EMPTY and div is not, so Chrome instead behaves as if the document were HTML5, where closing tags can be left off, and takes a best guess as to where to close them.
Alternatively, if you don't like the look of html, perhaps you might prefer something like haml or jade templates.
There is a way to make browsers (except IE8 and below) parse the markup as XML: serve it with the proper XHTML content type, application/xhtml+xml.
Doctype is irrelevant for parsing; it affects only the rendering mode (Standards or Quirks). When served as text/html, all pages are parsed by HTML rules (HTML5 rules in modern browsers), which effectively means that the trailing slash in the 'self-closing' syntax is simply ignored, and whether an element can be 'self-closed' is hard-coded in the parser. Divs and custom tags don't have this ability.
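The difference between the two parsing modes can be imitated outside a browser. The sketch below is only an approximation. It uses Python's html.parser as a tokenizer and applies the rule described above by hand (ignore the trailing slash, consult a hard-coded void list), then compares the result with a real XML parse. The names HtmlishTreeBuilder, html_outline and xml_outline are made up for illustration, and real HTML5 tree construction has many more rules than this.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hard-coded list of elements that never have content in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class Node:
    def __init__(self, tag):
        self.tag = tag
        self.children = []


class HtmlishTreeBuilder(HTMLParser):
    """Builds a tree the way a text/html parser roughly would."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)  # element stays open until an end tag

    def handle_startendtag(self, tag, attrs):
        # The trailing slash is ignored: <div/> is treated like <div>.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def outline(tag, children):
    if not children:
        return tag
    return tag + "(" + ",".join(children) + ")"


def html_outline(markup):
    builder = HtmlishTreeBuilder()
    builder.feed(markup)
    builder.close()

    def walk(node):
        return outline(node.tag, [walk(c) for c in node.children])

    return walk(builder.root)


def xml_outline(markup):
    # XML needs a single root element, so wrap the fragment.
    root = ET.fromstring("<root>" + markup + "</root>")

    def walk(el):
        return outline(el.tag, [walk(c) for c in el])

    return walk(root)


print(html_outline("<div/><div/>"))  # #root(div(div))
print(xml_outline("<div/><div/>"))   # root(div,div)
```

In the HTML-style pass the second div ends up as a child of the first, while the XML parse yields two siblings, which matches the behavior reported in the question.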

HTML Validation Error : Element "ui" undefined. (Accessibility)

In trying to comply with new web accessibility guidelines that some establishments in Ontario, Canada must follow, we have run some tools to assess the changes required. One of the HTML issues being flagged appears as follows:
Element "ui" undefined.
This happens when the version
of HTML used on this page doesn't support an element with this name.
This can happen if the element is misspelled, is uppercase or mixed
case in XHTML, or isn't supported by some HTML versions. For example,
HTML 4 DOCTYPEs don't allow HTML 5 element and Strict DOCTYPEs don't
allow stylistic elements and frames which were present in earlier
versions of HTML.
The lines it points to look like the following:
<ui class="global-menu">
There are four of these lines, each of which it gives this problem to. What exactly is wrong with these? I've never known any HTML version to not support ui elements. Is this possibly a bug with the validation checker, and should I just ignore it, or is there actually an issue here that needs fixing to be accessible?
There is no <ui> element in HTML. The class "global-menu" makes me think you're looking for a <ul>, not a <ui>.

What is the effect of using HTML tags that are invalid according to the doctype?

We're currently working with a system (for better or worse) that declares a doctype as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
The trouble is, many of our users, who will be writing content, are used to using XHTML-style tags like <br />, or <img ... />, instead of those that strictly should be used (i.e. <br> and <img>).
My question is, what is the real-world effect of this on a browser's rendering capability, and on semantics?
My first inclination is that it's a) not fair on the browser to throw this at it and expect it to bend over backwards and to know what to do, and b) removes the "guarantee" that any browser today or in the future will know how to display our pages correctly.
The page appears outwardly fine (although looking at the source code makes me shudder), but is this having some more sinister effect that isn't immediately apparent?
Browsers simply don't care about things like that. They usually even support attributes that simply do not exist in the given doctype (<a target="..."> in XHTML strict).
However, if you use XHTML with an XML content-type they may use an XML parser which will be strict and throw an error if you do invalid things - IE is known to behave like this.
The question appears to be about “self-closing” tags in HTML 4.01, rather than the much more general question in the heading. The answer is that they have no effect on browsers and it is highly unlikely that this would change, given the vast amount of such code around.
Technically, <br /> and <img ... /> are not invalid in HTML 4.01. HTML has formally been defined so that due to certain syntactic specialties, these constructs mean the same as <br>> and <img ...>> (where the final > is a data character). Browsers do not implement HTML this way; instead they just treat the / as an unrecognized and therefore discarded part of the tag.