Is it possible to use English-Hindi converter software in Flex application? - actionscript-3

I am using Flex 4.6 with the Anmol Hindi font, but some characters are missing from its keymap, which makes certain words difficult to type.
So I want to type Hindi in an easy way, as we do in Gmail. For that I found a tool called English to Hindi converter on Indiatyping.com, but I don't know how to use it in my application. I have tried almost all the fonts, but a few words remain difficult to type and there are no proper guidelines for typing all the letters, from a, aa, k, kha to ka, kaa, jna, jnaa.
Please help. I have searched Google at length but have not found a proper solution.

Related

HTML input using exonym suggestions

I am creating a website in German. No big deal, I am a native speaker.
If I misspell "Mittelsmann" by leaving off the last character, Chrome offers a spelling suggestion.
This is useful and I would like to keep this behavior. But this input field is an exception because it requires a scientific value and should be in English, so all suggestions for German words are basically wrong (call to Seiri of Kaizen).
Question: How can I force the suggestions to be in English for this input field ONLY?
That is a very good question. I'm German myself and ran into the same problem last year.
There are suggestions to let the HTML lang attribute decide which spell checker is used (see the Chromium developer discussion ending with WontFix/Closed). But sadly a lot of current browsers (especially the native browsers of the OS, and of course all mobile keyboards) use the external spell checking provided by the OS.
So the feature you want does not exist (or rather, there is no standard for it), because a language picked via JavaScript often cannot be passed on to the spell checker.
I solved my problem by adding and activating both languages German and English to my spell checker in Google Chrome, which resulted in getting both suggestions. That's not satisfying, but it worked for me.
Edit:
I also see a lot of German sites solving that problem by using attributes like autocorrect="off", autocapitalize="off" and spellcheck="false" on an English-only input field, to avoid confusing the user with German spelling suggestions and corrections. Might that be an option for you?

HTML input in Hindi or other non-english languages (Mostly Indian)

I want to take user input in Hindi in an HTML form.
How do I go about it?
I tried setting the font-family for the <input /> to a Hindi font, but that doesn't work.
Is there any other way of doing it?
Even embedding Google Transliterate or something similar will do. But I need to store the entered data in a MySQL database.
PS: I am using PHP to do the server side stuff.
It's not clear from your question what you are trying to do, but for the sake of this answer I'm going to assume that you want users to see Hindi characters as they type on their keyboard. In that case changing the font is not going to help. The font has nothing to do with which characters the keyboard produces; it only affects how those characters are drawn on screen.
If you want to let users type Hindi characters with their QWERTY keyboard, then you need to embed something like Google Transliterate.
Here is the developer documentation on how to do it:
https://developers.google.com/transliterate/v1/getting_started
Hope it helps.
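To illustrate why the font is irrelevant: Hindi text is a sequence of Unicode code points from the Devanagari block, and those code points are what the form submits and the database stores. The following is only a minimal Python sketch; the sample word and the helper name `describe` are illustrative and not from the question.

```python
import unicodedata

# "namaste" in Devanagari, written with escapes so the source stays ASCII.
word = "\u0928\u092e\u0938\u094d\u0924\u0947"


def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(word):
    print(code_point, name)

# The same characters as the bytes a UTF-8 database column would store.
encoded = word.encode("utf-8")
print(len(word), "code points,", len(encoded), "bytes")

# Decoding the bytes gives back the original text, whatever font is used.
assert encoded.decode("utf-8") == word
```

On the storage side, the MySQL column and connection would need a UTF-8 character set (for example utf8mb4) for these bytes to survive a round trip.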
Just download a Hindi font and change the input field font family to a Hindi font name.

Recognizing superscript characters using OCR

I've started a simple project that takes an image containing text with superscripts and uses OCR (currently Tesseract) to recognize the superscript characters along with the normal ones.
For example, we have a chemical formula such as Cl², but when I use Tesseract to recognize it, it gives me Cl2 (all on one line).
So, what is the solution for this problem? Is there any other OCR API that has the ability to read superscripts?
Very good question that touches more advanced features of any OCR system.
First of all, make sure you are not overlooking functionality that your OCR system may already have. Look at the result text not in plain TXT format but in a viewer capable of rich text. TXT viewers, such as Notepad on Windows, often do not support superscript/subscript characters, so even if the OCR gave you the correct characters, your viewer could have converted them for display. If you are accessing the text result programmatically, that is less of an issue, because you should get the proper superscript/subscript character value when accessing it directly. Just note that a viewer must support these characters for you to actually see them. If you have ruled out this post-processing conversion and confirmed that no superscript/subscript is returned from the OCR, then the OCR probably does not support it.
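One way to take the viewer out of the picture is to inspect the returned string directly. The sketch below assumes the OCR result is already available as a Python string and that superscripts, if present, arrive as dedicated Unicode characters (an OCR that encodes them only as rich-text formatting would need a different check). The function names are illustrative.

```python
import unicodedata


def script_kind(ch):
    """Return 'super', 'sub', or 'normal' based on the Unicode decomposition."""
    decomposition = unicodedata.decomposition(ch)
    if decomposition.startswith("<super>"):
        return "super"
    if decomposition.startswith("<sub>"):
        return "sub"
    return "normal"


def find_scripts(text):
    """List the superscript/subscript characters found in text."""
    return [(ch, script_kind(ch)) for ch in text if script_kind(ch) != "normal"]


print(find_scripts("Cl\u00b2"))  # the OCR returned a real superscript two
print(find_scripts("Cl2"))       # the OCR flattened it to a plain digit
```

An empty list for text that visibly contains superscripts in the image suggests the OCR engine itself dropped them, not the viewer.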
Just like in this text box, in your original question you tried to give us a superscript character example, but this text box did not accept it even though you could copy/paste it from elsewhere.
Many OCR systems will treat a superscript or subscript as any other normal character, if they can see it at all. The OCR you use needs the technical capability to actually produce superscripts/subscripts. Many have it, but not surprisingly they tend to be commercial OCR systems.
I made a small test case before answering this question. I generated an image with a few superscript/subscript examples for my testing (of course EMC2 was the first example that came to mind :) ).
You can find my test image here:
www.ocr-it.com/documents/superscript_subscript_test_page.tif
I then processed this image through the OCR-IT OCR Cloud 2.0 API using all default settings, but exporting to a rich text format, such as MS Word .DOC.
You can find the result here:
www.ocr-it.com/documents/superscript_subscript_test_page_result.doc
Also note: when you want to extract superscript/subscript characters, pay closer attention to image quality than you would with typical text. Those characters are tiny, and you need sufficient detail and resolution to achieve decent OCR quality. Even images scanned at 300 dpi sometimes have issues with tiny characters because they contain too few pixels. If you are considering mobile and digital cameras, that becomes even more important.
DISCLOSURE: My specialty is implementing internal OCR solutions for companies of different sizes. My company is WiseTREND. Contact me directly if I can assist with anything further.

What is a good resource for HTML character codes -> glyph and

I've already found a good site to convert HTML character codes to their respective glyphs:
http://www.public.asu.edu/~rjansen/glyph_encoding.html
However, I need a bit more information. Does anyone know of a site like the one above that also provides information on what type of character code it is? Meaning, is it a special character? Is the glyph visible? Etc...
So far I have found some tables with this information, but they aren't as complete as the resource above. I would really like to get my hands on a complete table.
Thanks,
-Ben
HTML Entity Character Lookup
I like FileFormat.Info--e.g.: http://www.fileformat.info/info/unicode/char/20ac/index.htm
The character map on Ubuntu (and I assume most other Linux distros) is fantastic. You can search for any character by its name or description (e.g. "arrow") really easily.
Windows' character map is a poor imitation but kinda works too. It seems to decide that certain fonts (Arial, Verdana etc) can't display some characters, even though they work absolutely fine. (Hint: try MS's more recent font creations like Calibri for better results.)
Once you've found a character you can either:
Copy it and use it directly (requires pages to be UTF-8) like this: ↗
Insert it as a hexadecimal entity. The above character is "U+2197 North East Arrow" so the entity would be &#x2197;
Convert the hex code to decimal (the calculators on Windows and Linux can do this). The above example is &#8599;
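The three forms above can also be produced programmatically. This is a small Python sketch using only the standard library; the helper name `entity_forms` is made up for illustration.

```python
import html


def entity_forms(ch):
    """Return the literal, hexadecimal and decimal HTML forms of one character."""
    code = ord(ch)
    return {
        "literal": ch,
        "hex": f"&#x{code:X};",
        "decimal": f"&#{code};",
    }


forms = entity_forms("\u2197")  # U+2197 North East Arrow
print(forms["hex"], forms["decimal"])

# Both numeric references decode back to the same character.
assert html.unescape(forms["hex"]) == "\u2197"
assert html.unescape(forms["decimal"]) == "\u2197"
```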
Here's a quick, low-footprint way to look them up: &what;

Best practices for internationalizing web applications?

Internationalizing web apps always seems to be a chore. No matter how much you plan ahead for pluggable languages, there are always issues with encoding, funky phrasing that doesn't fit your templates, and other problems.
I think it would be useful to get the SO community's input for a set of things that programmers should look out for when deciding to internationalize their web apps.
Internationalization is hard. Here are a few things I've learned from working on two websites that were in over 20 different languages:
Use UTF-8 everywhere. No exceptions. HTML, server-side language (watch out for PHP especially), database, etc.
No text in images unless you want a ton of work. Use CSS to put text over images if necessary.
Separate configuration from localization. That way localizers can translate the text and you can deal with different configurations per locale (features, layout, etc). You don't want localizers to have the ability to mess with your app.
Make sure your layouts can deal with text that is 2-3 times longer than English, and also with text that is 50% shorter (Japanese and Chinese are often shorter).
Some languages need larger font sizes (Japanese, Chinese)
Colors are locale-specific also. Red and green don't mean the same thing everywhere!
Add a classname that is the locale name to the body tag of your documents. That way you can specify a specific locale's layout in your CSS file easily.
Watch out for variable substitution. Don't split your strings. Leave them whole like this: "You have X new messages" and replace the 'X' with the number.
Different languages have different pluralization. 0, 1, 2-4, 5-7, 7-infinity. Hard to deal with.
Context is difficult. Sometimes localizers need to know where/how a string is used to make sure it's translated correctly.
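The points about whole strings and pluralization can be sketched together. The Python below is a simplified illustration, not a replacement for a real i18n library: `plural_category`, `MESSAGES` and `new_messages` are hypothetical names, the plural rules cover whole numbers only, and the Polish strings are illustrative translations.

```python
def plural_category(locale, n):
    """Pick a plural category for a whole number (simplified rules)."""
    if locale == "en":
        return "one" if n == 1 else "other"
    if locale == "pl":
        if n == 1:
            return "one"
        if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
            return "few"
        return "many"
    return "other"


# Each entry is a complete sentence, so translators can reorder words freely.
MESSAGES = {
    "en": {
        "one": "You have {count} new message",
        "other": "You have {count} new messages",
    },
    "pl": {
        "one": "Masz {count} nową wiadomość",
        "few": "Masz {count} nowe wiadomości",
        "many": "Masz {count} nowych wiadomości",
    },
}


def new_messages(locale, count):
    """Format the whole sentence for the given locale and count."""
    category = plural_category(locale, count)
    return MESSAGES[locale][category].format(count=count)


for n in (1, 2, 5, 22):
    print(new_messages("en", n), "|", new_messages("pl", n))
```

Keeping the number as a named placeholder inside one whole string is what lets each language place it wherever its grammar requires.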
Resources:
http://interglacial.com/~sburke/tpj/as_html/tpj13.html
http://www.ryandoherty.net/2008/05/26/quick-tips-for-localizing-web-apps/
http://ed.agadak.net/2007/12/one-potato-two-potato-three-potato-four
In my company all our strings are stored in *.properties files. Our build tools build a "test language" copy of the properties files, which replaces a string like this:
Click here
with something like this:
[~~ Çļïčк н∑ѓё ~~ タウ ~~]
Now, when we set the language to "test" in our config files, these properties files are used. (And of course we don't ship the test language files).
This allows us to:
Make sure that Unicode characters are displayed correctly, including Japanese/Chinese/Korean.
Make sure that the layout scales appropriately for languages with longer words (German in particular has longer words on average than English).
Spot any hard-coded strings (as they will be in plain-English).
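A "test language" generator of this kind can be approximated in a few lines. The Python sketch below is an assumption about how such a build step might work, not the answerer's actual tool; the character map and the `pseudo_localize` name are made up for illustration.

```python
# Map a few ASCII letters to accented look-alikes; extend as needed.
ACCENTED = str.maketrans({
    "a": "å", "c": "ç", "e": "é", "i": "ï", "o": "ø", "u": "ü",
    "C": "Ç", "E": "É",
})


def pseudo_localize(text, padding=" ~~ タウ ~~"):
    """Wrap text in brackets, accent its letters and pad it with CJK filler."""
    return "[~~ " + text.translate(ACCENTED) + padding + "]"


print(pseudo_localize("Click here"))
```

The brackets make truncated strings easy to spot, the padding simulates longer translations, and any string that still appears in plain English was never passed through the properties files.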
As for the actual translation, this is done by professional translators, not developers.
As an English person living abroad I have become frustrated by many web applications' approach to internationalization and have blogged about my frustrations.
My tips would be:
think about how you show an international version of a page
using geolocation might work for many users, but as my examples show for many it will not
why not use the Accept-Language header for determining which language to serve
if a user accesses a page via a search engine then don't redirect them somewhere else e.g. to a homepage in a different language
it's extremely annoying to change language and have a different page reload - either serve the same page or warn the user that the current content is not available in a different language before redirecting them
English is a very common language, so perhaps default to that
But make sure the change language option is clear on the GUI (I like what Google Maps are doing, as shown in the post)
All I see on the Web is companies getting internationalization wrong. Getting it right from a user's perspective is tricky indeed.
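The Accept-Language suggestion above could look roughly like this on the server. This is a simplified Python sketch: it handles the common "tag;q=weight" form only, and `parse_accept_language` and `choose_language` are illustrative names, not part of any framework.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((tag.strip().lower(), quality))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(weighted, key=lambda item: -item[1])
    return [tag for tag, quality in ranked if quality > 0]


def choose_language(header, supported, default="en"):
    """Pick the first supported language, falling back to the default."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "de-at" falls back to "de"
        if primary in supported:
            return primary
    return default


print(choose_language("de-DE,de;q=0.9,en;q=0.8", {"en", "de"}))
print(choose_language("ja", {"en", "de"}))
```

The result should only select the initial language; a visible language switcher still lets the user override it.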
I have a couple of apps that are "bilingual".
I used resource files in ASP.NET 1.1.
There is also something called the String Resource Tool.
Basically you put all your strings in a .RES file for each language and then decide which file to read from based on the Culture, or on whether someone clicked a link for the language.
The biggest gotcha is making sure the translations are done correctly.