IE9 Period Wrap Bug - html

I just uncovered this little gem of an IE9 bug. It seems that IE9 does not recognize a space preceding a period as a breaking point, as in a list of domain or file extensions. Open the following fiddle in IE9.
http://jsfiddle.net/cssguru/nNnzM/1/
I tried using escape characters, but it didn't help. Any suggestions on a workaround?

It is an annoying feature, but it is probably intentional and not regarded as a bug by the vendor. Instead, it is regarded as implementing Unicode line breaking rules (which are partly rather odd). According to those rules, the period (or FULL STOP as they call it) has line breaking class IS, infix numeric separator, and “When not used in a numeric context, infix separators are sentence-ending punctuation. Therefore they always prevent breaks before.”
To deal with such issues, it is probably best nowadays to insert U+200B ZERO WIDTH SPACE between a normal space and a period, e.g.
.web &#x200B;.shop &#x200B;.blog &#x200B;.nyc ...
U+200B is a control character that allows line breaks in a place where they would not otherwise be allowed.
Old IE versions (IE 6) may have difficulties with this, displaying a symbol of unrepresentable character in place of U+200B. An alternative method, the <wbr> tag, would not have this problem, but it seems that IE 8 and newer often fail to honor this age-old tag (perhaps because it never made its way to any standard, despite its usefulness).
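If the list is generated by script rather than typed by hand, the same insertion can be automated. This is only a minimal sketch: the function name allowBreakBeforePeriod is made up for illustration, and it assumes the text uses plain spaces before each period.

```javascript
// Hypothetical helper (not from the original answer): insert U+200B
// ZERO WIDTH SPACE after every space that is directly followed by a period,
// so the browser may break the line between the space and the period.
function allowBreakBeforePeriod(text) {
  // The lookahead (?=\.) matches only spaces that precede a period.
  return text.replace(/ (?=\.)/g, ' \u200B');
}

const extensionList = '.web .shop .blog .nyc';
const patchedList = allowBreakBeforePeriod(extensionList);

// Three spaces precede a period, so three characters are added.
console.log(patchedList.length - extensionList.length);
```

Spaces that are not followed by a period are left alone, so ordinary sentences pass through unchanged.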

I added a declaration for word-wrap in this update to your Fiddle, and that seems to have resolved the issue.

Related

HTML Minification: Whitespace between element attributes

I'd like to remove more unnecessary bytes from my output, and it seems it's acceptable (in practice) to strip what can add up to quite a lot of whitespace from HTML markup by omitting/collapsing the gaps between DOM element attributes.
Although I've tested and researched (a little in both cases), I'm wondering how safe it would be?
I tested in Chrome (43.0.2357.65 m), IE (11.0.9600.17801), FF (38.0.1) and Safari (5.1.7 (blah-di-blah)) and they didn't seem to mind, and I couldn't find anything specific in The Specs about whitespace between attributes.
w3.org's Validator complains, which is a strong indication that this is not safe and shouldn't be expected to work, but (there's always a "but") it's possible the requirement for a space is only strict when no quotes are present (for obvious reasons).
Also (snippy but poignant): their SSL is "out of date" which doesn't inspire confidence in their opinion.
I noted also that someone's HTML compressor could (when enabled) strip quotes around attribute values where those values had no whitespace within them (e.g. id), which implies that at least most if not all HTML parsing is focussed on the text either side of the equals signs (except with booleans of course), and where quotes are in use, they'd be considered the prioritized delimiter.
So, would:
<!DOCTYPE html><html><body>
Yabba Dabba Doo!
</body></html>
▲ that ever go wrong, and if so, under which conditions?
What other reasons could there be to maintain this whitespace in production output (code "readability" is a non issue in this case)?
Update (since finding an answer):
Although I basically answered my own question insofar as there is a specification governing whether there should be a space between attributes, I still wonder if omitting them when using quoted values can be considered practically safe, and would appreciate feedback on this point.
Considering how often spaces may be omitted by accident in production HTML, and that the browsers I tested don't seem to mind when they are, I assume it would be very rare if ever that a browser failed to handle documents with these spaces omitted.
Although it's sensible to follow the specs in pretty much all situations, might this be one time cheating a bit could be acceptable?
After all - if we can magically save several hundred bytes without affecting the quality of the output, why not?
There is a specification (after all)
It turns out I should have looked harder. My bad.
According to these specs:
If an attribute using the empty attribute syntax is to be followed by another attribute, then there must be a space character separating the two.
and
If an attribute using the unquoted attribute syntax is to be followed by another attribute or by the optional U+002F SOLIDUS character (/) allowed in step 6 of the start tag syntax above, then there must be a space character separating the two.
and
If an attribute using the single-quoted attribute syntax is to be followed by another attribute, then there must be a space character separating the two.
and
If an attribute using the double-quoted attribute syntax is to be followed by another attribute, then there must be a space character separating the two.
Which unless I am mistaken (again), means there must always be spaces between attributes.
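Given those four rules, a minifier that wants to stay within the spec can still save bytes by collapsing each gap to exactly one space rather than removing it. The following is a rough sketch only: collapseAttributeWhitespace is a hypothetical helper, it handles a single start tag passed in as a string, and it is not a full HTML parser.

```javascript
// Hypothetical helper, not a full HTML parser: rewrites a single start tag
// so that exactly one space separates its attributes.
function collapseAttributeWhitespace(startTag) {
  const tag = /^<([A-Za-z][^\s\/>]*)([\s\S]*?)(\/?)>$/.exec(startTag);
  if (!tag) return startTag; // not a start tag: leave it untouched
  const [, name, attrText, slash] = tag;

  // name, optionally followed by = and a double-quoted, single-quoted
  // or unquoted value
  const attrRe = /([^\s"'<>\/=]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'=<>`]+))?/g;
  const attrs = [];
  let match;
  while ((match = attrRe.exec(attrText)) !== null) {
    attrs.push(match[2] === undefined ? match[1] : match[1] + '=' + match[2]);
  }

  const parts = [name, ...attrs];
  if (slash) parts.push('/'); // keep a space before a trailing slash
  return '<' + parts.join(' ') + '>';
}

console.log(collapseAttributeWhitespace('<a   href="x.html"   title="A  B">'));
// prints <a href="x.html" title="A  B">
```

Whitespace inside quoted values is preserved, since it is part of the attribute value rather than a separator.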
You could try online HTML minifiers like http://www.whak.ca/minify/HTML.htm or http://www.scriptcompress.com/minify-HTML.htm (search Google for more) and look at the little things they change for hints about what can be taken out while the HTML still renders.
On the first link your code:
<!DOCTYPE html><html><body>
Yabba Dabba Doo!
</body></html>
Turns into:
<!DOCTYPE html><html><body>Yabba Dabba Doo!
saving you 18 bytes already...

Is there a HTML character that is blank (including no whitespace) on all browsers?

Is there a HTML character that, on all (major) browsers (plus IE8 sadly) displays nothing and doesn't add any extra space?
So, an alternative to &nbsp; but one which doesn't add whitespace to the page, and which won't ever show up as an ugly "unrecognised character" marker or ?.
Why: in my case, I'm trying to work around a problem on an old, proprietary CMS that is removing empty but necessary HTML elements that are required because other parts of the system will fill them dynamically.
Imagine something like (simplified trivial example) <span class="placeholder" data-type="username"></span> which is populated with a user's username if a user is logged in - but this old-school CMS sees it as being empty and removes it.
There seem to be two options that mostly fit the bill. They seem to reliably not show anything when in a <span>, but they (particularly the second option) might have a minor effect on copy/paste and word breaking in some cases.
Zero-width space
&#8203; aka &#x200B; which behaves the same as the (now in HTML5) <wbr> - used to make words break at certain points without changing the display of the words.
<h1>This text is full<span>​</span> of spans with char<span>​</span>acte<span></span>rs that affe<span>​</span>ct word brea<span></span>king but don't show up</h1>
<h1>Especially in das super<span>​</span>douper​crazy<span>​</span>long<span></span>worden.</h1>
Seems to work fine on modern browsers and IE7+ (not tested on IE6).
Soft hyphen
&shy; - like a zero-width space but (in theory) adds a hyphen when it breaks a word across a line.
<h1>This text is full<span>­</span> of spans with char<span>­</span>acte<span></span>rs that affe<span>­</span>ct word brea<span></span>king but don't show up</h1>
<h1>Especially in das super<span>­</span>douper­crazy<span>­</span>long<span></span>worden.</h1>
<h1>Example where das super­douper­crazy­longword contains no spans.</h1>
Fine on modern browsers and IE7+ (not tested on IE6), though as some comments note, there are issues with these turning into regular hyphens when copied and pasted, for example when pasting from Chrome into Notepad on Windows 8.1.
Within a span, it seems to never add a hyphen (but still better to use zero-width spaces if possible).
Edit: I found an older SO answer discussing these as a solution to a different problem which suggests these are robust except for possible copy/paste quirks.
The only other issue with these I could find in research is that apparently some search engines may treat words containing these as being split (e.g. awe&shy;some might match searches for awe and some instead of awesome).
There are two characters that are graphic characters but defined to be zero width: U+200B ZERO WIDTH SPACE and U+FEFF ZERO WIDTH NO-BREAK SPACE. The former acts like a space character, so that it is a separator between words and allows line breaking in formatting, whereas the latter explicitly forbids line breaks. It depends on the purpose and context which one you should use. They can be represented in HTML as &#x200B; and &#xFEFF;.
These characters work well in most browsing situations. However, in IE 6, they tend to be rendered as small rectangles, since IE 6 does not know these characters and tries to render them as if they were graphic characters (which lack glyphs).
There are also control characters that are allowed in HTML, such as U+200E LEFT-TO-RIGHT MARK and U+200D ZERO WIDTH JOINER. They have no rendering as such, though they may affect rendering of graphic characters, e.g. by setting writing direction, affecting ligature behavior, etc. Due to the possibility of such effects, it might be risky to use them as “dummy” characters.
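One practical consequence of using such a dummy character: whatever script later fills the placeholder may need to treat a span holding only the dummy as empty. A minimal sketch of that check follows. It assumes the dummy is U+200B, U+FEFF or a soft hyphen (U+00AD), and isEffectivelyEmpty is a hypothetical name, not part of any CMS or library.

```javascript
// Characters discussed in this thread that render with zero width.
const ZERO_WIDTH_CHARS = /[\u200B\uFEFF\u00AD]/g;

// Hypothetical helper: true when the text holds nothing but zero-width
// characters and ordinary whitespace.
function isEffectivelyEmpty(text) {
  return text.replace(ZERO_WIDTH_CHARS, '').trim() === '';
}

console.log(isEffectivelyEmpty('\u200B'));   // placeholder with only the dummy
console.log(isEffectivelyEmpty('jsmith'));   // placeholder already filled
```

In a page this would typically be applied to the placeholder element's text content before deciding whether to populate it.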

HTML Ascii not showing on IE 8-9

Trying to get the 1/3 and 1/8 symbols to show on IE 8-9. It shows fine on earlier versions of IE and all other browsers.
Code I'm using:
&frac13;
&frac14;
IE 9 and older do not recognize the references &frac13; and &frac18; (the latter is obviously what you meant, instead of &frac14;, which is well supported) but instead render them literally. These references were not defined in HTML 4.01, and they were added to browsers relatively late (around 2011). Use the numeric references instead: &#x2153; and &#x215B; (or type the characters themselves, using a suitable editor and UTF-8 encoding).
You would still have a font problem, because the font you use for normal text may not contain these characters. In some situations, browsers are unable to use a fallback font unless you give them a helping hand with a font-family declaration. Besides, there is a risk of getting these characters in a style different from other characters, including more common vulgar fractions like ½. Thus, a careful choice of a font list is recommended; see my Guide to using special characters in HTML.
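If the markup is produced by script, the switch to numeric references suggested in this answer can be applied mechanically. A small sketch, assuming only the two references from the question need converting; the names FRACTION_REFERENCES and useNumericFractions are made up for illustration.

```javascript
// Named references that IE 9 and older render literally, mapped to the
// equivalent numeric references.
const FRACTION_REFERENCES = {
  '&frac13;': '&#x2153;', // U+2153 VULGAR FRACTION ONE THIRD
  '&frac18;': '&#x215B;', // U+215B VULGAR FRACTION ONE EIGHTH
};

// Hypothetical helper: replace the two named references, leave the rest.
function useNumericFractions(html) {
  return html.replace(/&frac1[38];/g, (ref) => FRACTION_REFERENCES[ref]);
}

console.log(useNumericFractions('Add &frac13; cup and &frac18; tsp.'));
// prints Add &#x2153; cup and &#x215B; tsp.
```

References that HTML 4.01 already defines, such as &frac14;, are deliberately left untouched.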
They aren't in the HTML4 spec, which is what IE9 and lower use. Only 1/4, 1/2 and 3/4 are supported.
Here is a list of what works and what doesn't.
http://stanford.library.usyd.edu.au/symbols/entities.html#Math

IE9 doesn't like naked numbers in CSS?

This works in Chrome, Safari and Firefox, but not in IE9
style='height:100'
IE9 just ignores it. But this DOES work:
style='height:100px'
Unfortunately, I really truly do not want to have to go around adding a "px" to all my values. It makes doing any sort of manipulation (for example, multiplying the width by two, adding 100 to it, etc.) incredibly troublesome. Is this expected behaviour, and is there any special flag somewhere I can set to get IE9 to accept the first example?
(I know inline styles are generally considered incredibly bad, but I have reason to use them anyway in some special cases)
style='height:100' may work in some tolerant browsers, but is actually invalid CSS, so IE has every right to refuse to parse those.
That (width:100) doesn't actually work in any of those browsers due to being invalid CSS:
http://jsbin.com/ovewus
<div style="width:100;background:red">test</div>
..unless your page is in Quirks Mode due to not having a valid doctype, like this:
http://jsbin.com/ovewus/2 - width:100
works!
Intentionally writing pages for Quirks Mode is doubly incredibly bad, just don't do it.
I don't see "naked" numbers allowed here :) http://www.w3.org/TR/CSS21/visudet.html#the-width-property, since a bare 100 could mean px, em, and so on.
" Specifies the width of the content area using a length unit."
As Gruikya said, it is invalid CSS to have non-unit numbers (unless the number is 0, or the style is line-height), so I would definitely recommend adding the correct units. If you need to do mathematical operations on those values (I presume in JavaScript), then you should be using parseInt or parseFloat anyway, and these will happily ignore the trailing units.
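To illustrate the parseFloat point, here is a minimal sketch of doing arithmetic on a value that keeps its unit. It assumes every value is in px; the function names scalePixels and addPixels are hypothetical.

```javascript
// parseFloat reads the leading number and ignores the trailing "px".
function scalePixels(value, factor) {
  return parseFloat(value) * factor + 'px';
}

function addPixels(value, amount) {
  return parseFloat(value) + amount + 'px';
}

console.log(scalePixels('100px', 2));  // prints 200px
console.log(addPixels('100px', 100));  // prints 200px
```

The result is again a valid CSS length, so it can be assigned straight back to a style property.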
You shouldn't define any size without actually defining the unit. You're making browsers guess what unit you want to use, because you're not telling them.
Depending on what CSS standard is followed by the browser, different browsers may handle it differently.
The CSS2 specification states that declarations with unitless lengths should be ignored. It also mentions that this behavior may (theoretically) change in a future specification.

&nbsp without semicolon

I have been using prototype.js in a web application. I am populating some Divs dynamically, on selecting some radio buttons. Before populating contents in Div, I am clearing previous contents by using prototype's update method as -
$('item').update('');
In IE, this puts a line break automatically but not in Firefox 3.5. So, to make it work in the same manner as IE, I have changed the code to -
$('item').update('&nbsp');
Now, this worked for me as expected. But "&nbsp" is generally used with a semicolon (&nbsp;). I want to know if this code can fail.
For example, is there any chance that &nbsp will be displayed literally instead of a blank space? Then again, some browsers might be smart enough to detect the programmer's coding error and autocorrect it.
The ending semicolon can be omitted, but it's absolutely not recommended. See this note in the HTML 4 specification regarding character references:
In SGML, it is possible to eliminate the final ";" after a character reference in some cases (e.g., at a line break or immediately before a tag). In other circumstances it may not be eliminated (e.g., in the middle of a word). We strongly suggest using the ";" in all cases to avoid problems with user agents that require this character to be present.
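Following that advice, markup that already contains the unterminated form can be repaired before it is handed to the browser. A minimal sketch, assuming only &nbsp is affected; terminateNbsp is a hypothetical helper name.

```javascript
// Hypothetical helper: add the missing ";" to every "&nbsp" that lacks one.
function terminateNbsp(html) {
  // The negative lookahead skips references that are already terminated.
  return html.replace(/&nbsp(?!;)/g, '&nbsp;');
}

console.log(terminateNbsp('&nbsp'));           // prints &nbsp;
console.log(terminateNbsp('a&nbsp;b&nbspc'));  // prints a&nbsp;b&nbsp;c
```

The simpler fix is of course to write the semicolon in the first place; this only helps with strings that come from elsewhere.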
NEVER rely on browser autocorrection or "recommended" behavior.
If your code is wrong, then (a) it's wrong and (b) browsers are free to do inconsistent things with it.
I concur. Dodgy markup and JavaScript errors are a disaster for debugging any problems you have in future, as you can't rule out that the dodgy markup is causing the browser to interpret the code incorrectly.
Not sure if this will help (or change anything), but the correct syntax is actually without the single quotes:
$('element').update();