Detect UTF-16 support, and replace with images otherwise - html

As part of an HUD I'm designing, I'd like to use this symbol: 🏥
Since my site serves files in Windows-1252, any characters outside the 0-255 range are represented as &#x____;, with appropriate hex digits filling in the blank. In this case: &#x1F3E5;
However, when testing this out I noticed that IE can render the symbol, but Chrome cannot. I guess the question is this: What does IE have that Chrome doesn't, is there any way to give Chrome the ability to render these symbols, and if not, can I detect that and replace the symbol with an image?

It is not enough to declare the charset correctly inside your HTML document; the browser must also use the right encoding (check its settings), and it must be able to display the character. In the following example, the browser is properly configured, it is just that the operating system is not able to display the character:

As far as I (mis-)understand it, this is also a font issue: smart browsers may substitute the missing character with a glyph from another font. Newer browsers can also be served fonts from the server. Maybe that helps.
I hope you get a better answer.

Related

Why the HTML entity doesn't work on the Windows Chrome?

HTML entities are not working on chrome and IE (on windows).
I have entered the following code in my page and it works fine on mac chrome or firefox or safari, but not on windows.
<span class="font-family:Arial;">〈 〉 〉 〈 </span>
This is primarily a font issue, though there is a nasty silent change in HTML specs involved.
Modern browsers interpret &lang; and &rang; as referring to U+27E8 MATHEMATICAL LEFT ANGLE BRACKET “⟨” and U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET “⟩”, informally known as “bra” and “ket”. This interpretation is being made official in the named character references section of HTML5.
These characters are adequate for use in many mathematical notations, and the ISO 80000-2 standard explicitly specifies them, e.g. for certain scalar product notations. But support for them in fonts is rather limited. In old Windows systems, no font contains them. In newer Windows systems, from Windows Vista onwards, Cambria Math should be available. It is possible that you have been testing on an old Windows version, but it is also possible that Chrome is unable to find the right font. To give it a helping hand, use a CSS rule that suggests that font, e.g. with the attribute
style="font-family: Cambria Math"
You might consider adding some other fonts to the list, using fonts that are known to contain the characters. See my Guide to using special characters in HTML.
The nasty change is that in HTML 4.01, in the entities section, &lang; and &rang; are defined as referring to U+2329 LEFT-POINTING ANGLE BRACKET “〈” and U+232A RIGHT-POINTING ANGLE BRACKET “〉”. They are logically less satisfactory (and deprecated by the Unicode Standard), but they have somewhat wider font support.
So in addition to declaring fonts that contain the characters you use, you need to decide which pair of these characters you use or whether you use something else; it's a complicated question. If you use them, it is best to use them as such (in a UTF-8 encoded HTML document) or using numeric character references such as &#x27E8;. The reason is that &lang; and &rang; should not be expected to work consistently; they probably work the HTML5 way in all modern browsers, but there is hardly any reason to take the risk, when you can unambiguously indicate the characters you want.
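As a rough illustration of the numeric-reference approach, the sketch below rewrites every non-ASCII character as a hexadecimal character reference. The function name toNumericReferences is made up for this example, and it does not escape &, < or >, so it is not a general HTML escaper.

```javascript
// Replace every code point above 0x7F with a hexadecimal numeric character
// reference, so the markup does not depend on the page encoding or on how
// a browser resolves a named entity.
function toNumericReferences(text) {
  let out = "";
  // for...of walks the string by code point, so characters outside the
  // Basic Multilingual Plane are not split into surrogate halves.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 0x7f
      ? "&#x" + codePoint.toString(16).toUpperCase() + ";"
      : ch;
  }
  return out;
}

console.log(toNumericReferences("\u27E8x\u27E9")); // prints "&#x27E8;x&#x27E9;"
```

The output names the code point itself, so it is the same whichever definition of the named entity a browser follows.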
That particular character is simply a Unicode codepoint, which is an arbitrary number. There are a lot of Unicode codepoints that do not have an 'official' symbol. Even if they do have a symbol, it is not necessarily the case that your font has a glyph for that codepoint. If you choose a different font, you may end up with a different symbol.
I looked at the CSS for the page and it shows this character displaying in Arial (plus a bunch of other fonts that do not matter). Windows comes with Arial, so it should always pick up that font first. It looks like Arial does not have a glyph for that Unicode codepoint. Any time the font has no glyph for a codepoint, the browser puts in some form of a box indicating that there is no glyph.
It depends on the entity, and the fonts on the system your reader is using. The issue is that these characters are not in the MathJax web fonts, so MathJax has to fall back on system fonts to find them. Some browsers are better at that than others. Your configuration controls what fonts MathJax lists for the browser to look in, so you may want to modify that to include fonts where you know your entities can be found (and you may want to think about the fact that you may have people reading your site on Windows, Mac, and Linux, and also mobile devices, so such decisions are not always easy).
Notice that when you install STIX fonts, it works for you. This is because STIXGeneral is in the default list of fonts that MathJax uses for unknown characters. You want to add others to that list (it is stored in the undefinedFamily property of the HTML-CSS and SVG sections of your configuration). Note however, that IE will stop checking fonts once it encounters a font that is installed on the system, even if it doesn't include the needed character and later fonts in the list do, so you have to be careful about the order that you use.
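A configuration along these lines might look like the sketch below, assuming MathJax v2 (where settings are passed to MathJax.Hub.Config). The font list is only an illustration: which fonts actually contain your characters is an assumption you would need to check, and the order matters because of the IE behaviour described above.

```javascript
// Hypothetical fallback list: STIXGeneral first, followed by fonts that are
// assumed (not verified here) to contain the characters in question.
const fallbackFonts = "STIXGeneral, 'Cambria Math', 'Arial Unicode MS', serif";

// undefinedFamily is the property named above; it is set in both the
// HTML-CSS and SVG sections so that both output modes use the same list.
const mathJaxConfig = {
  "HTML-CSS": { undefinedFamily: fallbackFonts },
  SVG: { undefinedFamily: fallbackFonts },
};

// Only apply the settings when MathJax v2 has actually been loaded.
if (typeof MathJax !== "undefined" && MathJax.Hub) {
  MathJax.Hub.Config(mathJaxConfig);
}
```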

How to display "—" sign in Chrome and Opera?

In my site I need to display "—", but it looks like :
( http://i.imgur.com/CiLIo.png )
in Google Chrome and Opera, but looks ok in Firefox and IE9.
Can someone supply working code for this sign?
I can't be 100% sure what you want to display, since I am on Chrome :), but that looks like an m-dash, and the correct way to display it would be to write &mdash; or &#8212; in your code. The character may be missing from the font that Chrome and/or Opera is using on your system.
Check out this link: http://tlt.its.psu.edu/suggestions/international/web/codehtml.html
To quote it:
“Smart (curly) quotes” (vs. "plain (straight) quotes") and long dashes such as em dash (—) and en dash (–) are actually considered "special characters" in HTML.
So use an m- or n-dash: &mdash; or &ndash;
The odds are that the page uses some odd font that is broken, containing “ó” in place of the em dash, and that some browsers are able to analyze the font better than others. If this were just an encoding problem (which was my first thought, too), then surely &mdash; would work.
It would be an odd font, but it’s impossible to analyze the issue further without more information. A URL would probably suffice.
Try adding this line in your document <head>:
<meta charset="utf-8">
Charset sets a character encoding that specifies to the browser / document reader what kind of characters to expect. That way, it can handle your text appropriately and in an expected manner. In this case, we're using "utf-8" as it supports a wider range of characters.
Also / otherwise, click the Wrench icon in Chrome, hover to Tools, hover to Encoding, and see which encoding is checked.

How to give tool tips (title) in locale language (UTF-8) in IE and Chrome?

I want to show tool tips (title) in a locale-specific language, using UTF-8 encoded values.
I tried the following. It works in Firefox but not in IE and Chrome. What do I have to do to fix this?
<div title='(some UTF-8 formatted value)'
The above code works perfectly in Firefox.
Thanx in advance.
The font(s) used in tooltips depend(s) on the browser, which may or may not use settings made at the operating system level. Thus it may be controllable by the user, though few users know about this. In any case, it is outside the control of the author.
This implies that the repertoire of characters you can use there may vary. A plain square or rectangle in text typically indicates that there is a recognized character but it cannot be displayed because it is not present in the font(s) being used.
Partly for reasons like this, authors are more and more moving towards using other techniques than the title attribute, namely “CSS tooltips” (or maybe “JavaScript tooltips”). This lets you use the same fonts as in the textual content or, if preferred, to set some suitable other fonts.

Degrading Unicode characters for web browsers with missing fonts

I am using the Unicode 'CHECK MARK' (U+2713) in an HTML document. I find that it renders OK in most browsers, but occasionally I encounter someone with a missing font on their PC. Are there any HTML / JS tricks to specify an alternative display character (or an image) if the font is missing?
There's not a direct way to tell if any particular character has rendered in a useful way. About all you can do from JavaScript is to create a <span> containing one (or several) of the target character in the target font, and compare its width to another <span> containing the same number of characters you know won't render usefully(*). If they're the same width, chances are you've got a load of boxes or question marks in each, so you can take backup measures like adding an image.
Since this is quite a lot of annoyance you may prefer to just go for the image. Or you could try using @font-face embedding on modern browsers to use a known-good font in general. Since it is typically IE that has poor Unicode font fallback support, be sure to include an EOT font.
(*: you could try a character that's currently unassigned and will hopefully stay that way, eg. U+08FF, or a guaranteed-invalid character like U+FFFF, though it's questionable whether you should be allowed to put that in an HTML document.)
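The width comparison described above could be sketched roughly as follows. The function names are invented for this example, the noncharacter U+FFFF is used as the "known not to render" reference, and the result is only a heuristic: a real glyph that happens to be exactly as wide as the fallback box would be misreported.

```javascript
// Measure the rendered width of `text` in a hidden span (browser only).
function measureInSpan(text, fontFamily) {
  const span = document.createElement("span");
  span.style.cssText = "position:absolute;visibility:hidden;white-space:nowrap";
  span.style.fontFamily = fontFamily;
  span.textContent = text;
  document.body.appendChild(span);
  const width = span.offsetWidth;
  document.body.removeChild(span);
  return width;
}

// Returns true when `char` renders exactly as wide as a character that
// cannot have a glyph, which suggests both were drawn as the same box or
// question mark. `measure` is injectable so the logic can be exercised
// without a DOM.
function probablyMissingGlyph(char, fontFamily, measure = measureInSpan) {
  const placeholder = "\uFFFF"; // noncharacter, never assigned a glyph
  const repeat = 5;             // several copies make a width difference easier to see
  return measure(char.repeat(repeat), fontFamily) ===
         measure(placeholder.repeat(repeat), fontFamily);
}
```

In a page, a call such as probablyMissingGlyph("\u2713", "Arial") could decide whether to swap the check mark for an image.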
This is not quite what you're asking for, but it might solve your problem (assuming your goal is to output HTML without it needing to rely on outside images, etc)
Have you considered image data URLs (defined in RFC 2397)?
http://www.ietf.org/rfc/rfc2397.txt
Instead of using:
&#x2713;
to represent a check mark, you would use:
<img src="data:image/gif;base64,R0lGODlhCgAKAJEAAAAAAP///////wAAACH5BAEAAAIALAAAAAAKAAoAAAISlG8AeMq5nnsiSlsjzmpzmj0FADs="/>
This won't require any particular Unicode fonts with the CHECK MARK character to be installed on the client side, BUT it won't work in Internet Explorer 7 or lower. (Internet Explorer 8, Firefox, Safari, etc. should work just fine)
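For reference, a data URL like the one above can be assembled from raw image bytes with a few lines of JavaScript. This is only a sketch: the helper name toDataUrl is made up, and it assumes the bytes are given as numbers in the 0-255 range.

```javascript
// Build an RFC 2397 data URL from a MIME type and an array of byte values.
function toDataUrl(mimeType, bytes) {
  // Turn the bytes into a "binary string", one character per byte.
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  // btoa exists in browsers and recent Node.js; fall back to Buffer otherwise.
  const base64 = typeof btoa === "function"
    ? btoa(binary)
    : Buffer.from(binary, "binary").toString("base64");
  return "data:" + mimeType + ";base64," + base64;
}

console.log(toDataUrl("text/plain", [72, 105])); // prints "data:text/plain;base64,SGk="
```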
If you can devise a way to remotely check whether MS Office 2000 or newer is installed, you should be able to assume that Arial Unicode MS is installed, and hence that a font containing this code point is available (as long as you have the CSS font family set to something like "Arial Unicode MS, Arial, sans-serif").
I believe this will only work in Microsoft Internet Explorer, but you should be able to detect a Word installation by trying to create its ActiveX object in JavaScript:
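try
{
// Creating the object throws if Word (or ActiveX itself) is unavailable,
// so the fallback has to live in the catch block rather than in an else.
new ActiveXObject("Word.Application");
window.alert("Word is installed, go ahead and use the Unicode check mark in HTML");
}
catch (e)
{
window.alert("Word is not installed, use your image of a check mark.");
}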
But given that this really only works in IE, will probably throw a security warning in IE8, and you still need a fallback mechanism for other browsers or IE browsers without MS Office, using an image all the time is probably the best way to go.
Unicode is pretty standard; I always use unicodemap.org. Here is the character you're using [link]; this will give you all the codes associated with the check mark. If you want full backwards compatibility you will need to use an actual image. One image file for a check mark is more lightweight than a JavaScript hack/plugin. Probably your best alternative.

HTML: How do i debug why a language does not display correctly

I was recently asked why a Tumblr theme of mine does not display Vietnamese correctly on this site. How do I debug what the problem is?
I wonder if it's because of the use of a custom font or Cufon?
Maybe it's a character set issue? But UTF-8 should support most languages?
Debugging is difficult, especially if you don't read the language in question. There are some things you should check though:
1.) Fonts. This is the main cause of trouble. If you want to display a character, the selected font must contain that character. Standard fonts may work on internationalised Windows, but there are also "Unicode" fonts (e.g. Arial Unicode MS) that you may want to specify explicitly.
2.) Encoding. Make sure the page is served in an appropriate character set. Check the HTTP and HTML headers "charset". UTF-8 is appropriate for most languages.
3.) Browser and OS Support. It's pretty much a given these days that browsers support non-latin character sets, however it's possible the client has a very old or unusual browser. Can't hurt to find out which browser/os combination they are using and what their "Regional Settings" are.
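For the encoding check in step 2, a small helper can pull the charset out of a Content-Type header value so it can be compared with what the page declares. This is a sketch only; the function name charsetFromContentType is made up for illustration.

```javascript
// Extract the charset parameter from a Content-Type header value.
// Returns the charset in lower case, or null when none is declared.
function charsetFromContentType(headerValue) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue || "");
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetFromContentType("text/html; charset=UTF-8")); // prints "utf-8"
console.log(charsetFromContentType("text/html"));                // prints null
```

In a browser, the result can be compared with document.characterSet, which reports the encoding the browser actually used for the page.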