I'm building a quiz that supports 20 languages.
One of them is Maldivian.
How do I support it? Right now I'm getting a bunch of squares.
I want to know:
- What font should I use?
- Is there an online translator for English-Maldivian? (Google Translate does not support it.)
Maldivian uses the Thaana script, which is not widely supported in fonts. There are two basic strategies: specify a font-family rule that lists fonts known to contain Thaana letters, hoping that the user has at least one of them installed, or use a downloadable font with @font-face. The latter sounds more realistic in this case. For it, you would need a font that you can legally use that way.
Free fonts that support Thaana include MPH 2B Damase and TITUS Cyberbit Basic.
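The two strategies can be combined in one stylesheet. The following is only a minimal sketch: the family name "Quiz Thaana" and the font file paths are hypothetical placeholders, and it assumes the Maldivian text is marked up with lang="dv" (the ISO 639-1 code for Divehi/Maldivian). MV Boli and Noto Sans Thaana are listed only as fonts a user might already have installed.

```css
/* Downloadable font: the paths below are placeholders for a Thaana-capable
   font that you are licensed to embed. */
@font-face {
  font-family: "Quiz Thaana";   /* arbitrary name chosen for this sketch */
  src: url("/fonts/thaana-font.woff2") format("woff2"),
       url("/fonts/thaana-font.woff") format("woff");
  font-display: swap;           /* show fallback text while the font loads */
}

/* Applies to elements such as <p lang="dv" dir="rtl">...</p>.
   Thaana is written right to left, so set dir="rtl" in the markup. */
:lang(dv) {
  font-family: "Quiz Thaana", "MV Boli", "Noto Sans Thaana",
               "MPH 2B Damase", sans-serif;
}
```

Locally installed fonts after the first entry are used only if the downloadable font fails to load.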
For generalities, see my Guide to using special characters in HTML.
I would be very surprised at seeing an automatic translator for a small language like Maldivian, and I would also be surprised at seeing an automatic translator that produces decent results when translating a web site.
The Unicode standard now contains control characters for the formatting of Egyptian hieroglyphs (nine characters, U+13430 through U+13438). In principle, as I understand it, they are intended to work like so:
[first sign][U+13430][second sign]
in which code point U+13430 (EGYPTIAN HIEROGLYPH VERTICAL JOINER) indicates that one sign should be placed above the other.
Unfortunately, there seems to be no support for these characters at this point.
In particular, I want to implement something for the web, i.e., I would like to display correctly formatted hieroglyphs without having to generate an image on the backend. Ideally, such a sequence wrapped in something like <span class="hieroglyphic-unicode">...</span> would display correctly.
Any help/advice would be greatly appreciated.
Peter Constable's comments are correct: rendering Egyptian hieroglyphs in fonts is very complex and takes time. There is a working group that combines the talents of Egyptologists and Unicode experts to ensure that the desired rendering capabilities are developed. We expect fonts to start becoming available during 2023, after platforms have had time to update to Unicode 15. As Peter pointed out, github.com/microsoft/font-tools is a good place to watch for progress.
The approach taken in your HTML looks correct, but I can't find any support for those Egyptian hieroglyphics control characters, even though they were added in Unicode 12.0 back in March, 2019.
One reason for that may have been the ongoing requests to add further control characters to provide richer support for combining hieroglyphics, so there was a reluctance to commit to doing anything while those proposed enhancements were still being discussed. However, those changes have now been approved (see Item 4 "Egyptian Hieroglyphs", pages 6 through 10).
Hopefully those changes will be included in Unicode 15 which is due to be released this September, but I couldn't find any documentation confirming that. If not, you may face a long wait! In the meantime I think you are stuck with using images, although that approach may not be practicable, depending on your requirements.
See this thread, Support Egyptian Hieroglyph Format Controls #1469, which confirms that there is no current support for the existing (Unicode 12) control characters with Google's Noto Sans Egyptian Hieroglyphs font. A comment near the end of the thread discussing support for Egyptian hieroglyph Format controls once Unicode 15 is released states "we will consider it along with all of the other Unicode changes, bug fixes, etc...".
If it is just for the web, there has been support (i.e. a JavaScript implementation) for the first 9 control characters (Unicode 12) since the beginning (2018):
https://mjn.host.cs.st-andrews.ac.uk/egyptian/res/js/
This uses HTML canvas, which is admittedly not ideal. I'm working on a better solution, which will also cover the 29 new control characters from Unicode 15.
It is the implementation in OpenType that is really difficult (it is being worked on).
I am making an application, and I want to add a "HOME" button.
After much struggling with various icon libraries, I stumbled upon this site,
http://graphemica.com/%F0%9F%8F%A0, with this:
🏠
A Unicode symbol, which is more akin to a letter than an image.
I pasted it into my HTML, and it just worked™.
All this seems a little too easy, though. Are Unicode symbols widely supported? Is there some kind of problem with them that leads people to use icon libraries instead?
It depends on what you mean by "safe".
Users need a font that contains the glyph, so you should ship a suitable font yourself, and in several formats: there is not yet a single format recognized by all of the most-used web browsers.
Additionally, fonts with multiple colours are not fully handled by various systems, so you should think about what you expect users to do with the symbol (click, select, copy, etc.).
Additionally, every font has its own design, so the symbol can look different across fonts (and thus across browsers and operating systems). We do not yet have a "Helvetica 'Home'" or a "Times New Roman 'Home'".
All these points could be solved by using a web font with monochrome glyphs (but it could be huge if it includes all Unicode code points plus the usual combinations).
It seems that various recent browsers crash if there are many different glyphs, but usually this should not be a problem.
I also recommend ARIA attributes so that your page can also be used with, e.g., screen readers and braille displays.
Note: on the plus side, the few people who use a text browser can see the HOME symbol better (which is not the case with an image), if somebody still cares about this use case.
Some things you want to make sure you're doing:
Save your HTML file as UTF-8. In fact, save all text files as UTF-8 unless there's some reason you can't.
Put the line <meta charset="utf-8" /> near the top of your HTML file.
Make sure your server isn't misconfigured to tell all browsers that webpages are in the wrong encoding.
If, somehow, it is and you can't fix it, fall back on &entities;.
Specify a font stack for your emoji in CSS with a set of fonts that cover nearly every system, perhaps including Apple Color Emoji, Noto Color Emoji, Segoe UI Emoji and Twemoji.
If a free font such as Noto or Symbola contains the emoji you use, you can package it as a WOFF to be sure it will always display the way you want. (As of 2018, Tor browser does not show most emoji correctly by default, but mainstream browsers do.)
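As a rough sketch of the last two points, the emoji font stack and a packaged fallback could look like the following. The class name .emoji, the family name "Packaged Symbols", and the file path are hypothetical, and the system family names are taken from the list above; they only take effect if they match the names the fonts are installed under.

```css
/* Packaged fallback: placeholder path for a free font (e.g. Symbola or a
   Noto font) converted to WOFF, assuming its licence allows embedding. */
@font-face {
  font-family: "Packaged Symbols";   /* arbitrary name for this sketch */
  src: url("/fonts/symbols.woff") format("woff");
}

/* Hypothetical class applied to elements that contain emoji. */
.emoji {
  font-family: "Apple Color Emoji", "Segoe UI Emoji", "Noto Color Emoji",
               "Twemoji", "Packaged Symbols", sans-serif;
}
```

With this order the platform's own emoji font wins where one exists. Moving "Packaged Symbols" to the front would instead give every visitor the same glyphs, at the cost of always downloading the font.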
I think using Unicode is good practice for development, because Unicode characters are essentially part of your operating system, so you don't need any special library or plugin and you treat them like regular text.
The only problem is that the code can be difficult to read or understand. I think it is not easy to see that &#127968; (🏠) prints a home icon.
Even 8-bit PNGs are faster than font icons.
Image icons can be lightweight but still slow down your site with another HTTP request and time for the image to load. With images you don't have flexibility over color and scaling. SVG vector image alternatives are still not faster than plain text (Unicode characters). Unicode doesn't require additional HTTP requests and can be made to scale nicely.
If you are developing a website using only simple shapes, you can use Unicode (UTF-8) symbols as a replacement for font icons.
I think:
Almost every developer uses libraries for icons because of code readability, ease of use, and the additional options they offer.
Safe or Not
I cannot say whether it is safe or not.
Because Unicode contains such a large number of characters and incorporates the varied writing systems of the world, incorrect usage can expose programs or systems to possible security attacks. This is especially important as more and more products are internationalized. This document describes some of the security considerations that programmers, system analysts, standards developers, and users should take into account, and provides specific recommendations to reduce the risk of problems.
Read about Unicode Security Considerations.
Here are a few precautions to take when doing that. I did some research and found this to be the most helpful for your question. I don't know exactly how you would do it, but credit goes to Mr. GOY:
Displaying unicode symbols in HTML
I would like to use the UTF-8 character ā on my site but I am not sure if this will be supported cross browser.
I am worried that:
a) Users will not have access to a font containing that character
b) IE will not find the character even if the user has a font that could display it. I am worried about this because of this info:
By the specifications, browsers should display a character if there is any font in the system that contains it. If the fonts specified by the author (in CSS font-family settings or, rarely these days, using font markup in HTML) do not contain the character, browsers are supposed to use fallback fonts. The same applies if no fonts are specified by the author; browsers should use primarily their default fonts, using alternate fonts for any character not covered by the primary font.
In practice, things don't always work that way. Especially IE is notorious for its failures in this respect. It often fails to display a character, even though it could do that if it used all the fonts in the system. If a browser cannot render a character, it may show a small rectangle, possibly containing a question mark, ?, or some similar indicator. Here's a quick test (character U+0840, which is probably not supported by any font on your computer): ࡀ.
Source.
c) Other issues that I have not thought of.
There is a resource called Unify that shows which devices a character is supported on, but it currently (Sept 14, 2015) only supports 107 characters.
So to summarize, the question is: How can I determine if it is safe to use a utf-8 special character on my site? Is it safe to use ā specifically on my site?
It's always safe - your user's computers won't suddenly burst into flame.
From a technical perspective, your best bet is to use a web font that has support for every Unicode character you want to use. That is not a catch-all (the user might have web fonts disabled or is using a command line browser, etc...), but it should support the vast majority of computers.
From there I would apply common sense. If the displaying of a character is absolutely crucial and lives depend on it, try to not use Unicode. Otherwise I'd say 'go ahead'.
This is as much a UX question as it is a technical one, so I will mention both.
As a comparison, the character looks noticeably different in my IE11 browser than in my Firefox 31.8. A good user experience is generally associated with consistency, and this approach is not very portable. So from a UX perspective, this is not a great solution.
I would say using a tiny *.gif or *.bmp, or even *.png if you need transparency, is a better solution. Even better yet, go with *.svg so scaling will not be an issue. From a technical aspect, the overhead of something that small is generally insignificant.
The only problem you can face is that exotic symbols are not implemented in many fonts, so the user can see a dummy character (e.g. square) instead of this. I personally like to use svg symbols for this purpose.
An alternative solution would be to use a web font with those icons in it (although probably a subset version, so that it's less than 1 KB and doesn't weigh down your pages).
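A subset icon font of that kind might be declared as below. This is only a sketch: the family name "Site Icons", the class .icon, and the file path are hypothetical, and the two code points are examples standing in for whichever symbols the site actually uses.

```css
/* Subset font holding only the symbols the site needs. */
@font-face {
  font-family: "Site Icons";               /* arbitrary name for this sketch */
  src: url("/fonts/site-icons-subset.woff2") format("woff2");
  /* Limit the font to these code points (example values). Browsers that
     support unicode-range can skip the download on pages that use neither. */
  unicode-range: U+2714, U+2716;
}

/* Hypothetical class for elements that display one of the symbols. */
.icon {
  font-family: "Site Icons", sans-serif;
}
```

Because the user's own fonts are never consulted for those two characters when the web font loads, the symbols look the same everywhere.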
I want to use the characters ✖ (U+2716) and ✔ (U+2714) in my CSS for form validation purposes. Basically, if a field is valid/invalid, I use the :after pseudo-element to insert the corresponding symbol after the field.
For example:
.field:after {
  /* U+2716 HEAVY MULTIPLICATION X, written as a CSS escape */
  content: "\2716";
}
This is working great on my Mac, but when I switch to my Windows XP VMware instance, I don't get the characters, no matter what font I choose (even Arial).
My suspicion is that perhaps my Windows VM isn't configured properly, but that makes me wary of using these characters at all.
Does anyone know if there are "safe" characters or ranges in Unicode that you can reliably assume will be viewable by most people?
UPDATE:
Here is a list of Unicode characters I was hoping to be able to use as icons, specifically the Dingbats section.
http://en.wikipedia.org/wiki/List_of_Unicode_characters#Dingbats
If you don't see these characters on your machine, definitely let me know in the comments.
In addition to the problems of using CSS for presenting essential information (see CSS Caveats), there's the problem that the characters mentioned are often not available on people's computers. The fonts supporting them do not include any font that is shipped with a Windows system, for example. Support exists in Arial Unicode MS, which is shipped with Microsoft Office, but not everyone is using Office.
Besides, the symbols are not universal. A symbol like “✔” meant wrong when I was at school.
Using “OK” and “error” might be best, unless you need to use some other language.
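A text-based variant of the validation rule could be sketched like this. The class names .valid and .invalid are assumptions, standing in for whatever your validation code sets on the field.

```css
/* Plain ASCII labels need no special font support. */
.field.valid:after {
  content: " OK";
}

.field.invalid:after {
  content: " error";
}
```

The caveat about conveying essential information through CSS alone still applies, since generated content is not part of the document text.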
What browser are you using in your XP VM? IE6 and IE7 don't support the :after pseudo-element, so that might be the issue.
I need to decide whether to render geometric symbols in a web GUI (e.g. arrows and triangles for buttons, menus, etc.) as Unicode symbols (MUCH easier and color-independent) or GIF/PNG files (lots of hassle I would like to avoid).
However, I have seen clients that have trouble displaying even advanced punctuation symbols declared as Unicode characters (Example).
Does anybody know as of which versions operating systems, service packs, and applications ship with Unicode versions of the standard fonts? There is, for example, Microsoft's Arial Unicode, which has shipped with Office since 1999; however, I do not have Office installed and my Arial still covers at least some of the Unicode range.
Also, what is the situation with Mac OS and Linux?
Could somebody point me towards some comprehensive resources on this - reports, lists, overviews?
There's not really such a thing as a “Unicode version” of a font(*). “Arial Unicode” is a misleading name: it's not materially different from normal “Arial”, it just has some more characters in it. It does not contain usable glyphs for every single one of the tens of thousands of characters defined so far, and indeed there is no one OS standard font that does.
The significant question is merely whether the characters you want to use have glyphs in the default fonts of commonly deployed operating systems. You need to look at font support for the particular characters you wish to use on an individual basis.
The character U+0360 Combining Double Tilde you mentioned is not really “advanced punctuation”, it's a curious and rarely used diacritical mark for phonetics work. So it's not really surprising that font support for it is poor. On the other hand, Stack Overflow can get away with using U+25CF Black Circle (●) because lots of fonts have it. Some of the other characters from the Geometric Shapes block such as U+25B2 Black Up-pointing Triangle (▲) are also pretty common.
fileformat.info has a list of common fonts that support each character, so you can check there to get a feel for how widely supported a symbol is, and whether the default OS fonts you recognise are present, before using it as a replacement for an image. For example U+25CF is in many fonts, but U+0360 isn't that well supported: none of the default Windows install fonts are there, and the “Libertine” font renders it badly wrong.
(*: OK, there is sort of such a thing as a Unicode font, in that a font's internal character lookup tables may be denominated in Unicode or some other character set. However this makes no practical difference as the application will always be addressing it as Unicode; the OS will do the conversion on lookup transparently.)
This question may be a duplicate of "Unicode and fonts", where I posted a list of Unicode font links:
http://en.wikipedia.org/wiki/Category%3AFree%5Fsoftware%5FUnicode%5Ftypefaces
http://en.wikipedia.org/wiki/Unicode%5Ftypefaces
http://unifoundry.com/unifont.html
http://www.fileformat.info/info/unicode/font/index.htm
http://www.alanwood.net/unicode/fontsbyrange.html
http://www.alanwood.net/unicode/fonts.html
http://www.unifont.org/fontguide/
http://www.wazu.jp/index.html
However, I am not sure how the Unicode standard says a stand-alone COMBINING DOUBLE character should be rendered, as it is supposed to combine with other characters.
Some random observations:
On OS X the Unicode support is perfect, at least for your needs.
On Windows the situation seems to depend on the browser. I don't use many arcane characters, but the few I do (mostly punctuation) seem to display just fine in Firefox. The only problem is in Internet Explorer, as usual.
If you have some control over your clients you could distribute some good free fonts?
Even web fonts could work.
One drawback of Unicode characters is that they are often quite ugly: too big, too small, wrongly positioned, etc.