How to get a text translation framework for any programming language? - language-agnostic

I am writing an application that translates text, for example from Chinese/Japanese/Arabic/etc. to English, or from English to French/Dutch/Russian. There is no internet access where the project will be used, so I cannot use existing online translation services. So I am trying to find a framework for my language, such as Java, C, Python, or D.
How can I get a Google-like text translation framework that is open source or free to use for quality text translation, or which other framework can be used for this project?
Challenge is:
"how are you?" = hoe gaat het
= 你是如何
= お元気ですか?
= كيف حالك
= आप कैसे हैं

I don't understand the question.
If you want your application to display various user messages in several human languages, you need to worry about localization or internationalization. With C or C++ on POSIX, functions like gettext are relevant.
Several libraries or frameworks also provide related features. For example, Qt has its own internationalization framework.
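For the localization case, the usual pattern is to wrap every user-visible string in a lookup function and ship one message catalog per language. Below is a minimal Python sketch of that pattern using the standard gettext module. Real projects normally load compiled .mo catalogs with gettext.translation(); to stay self-contained, this sketch instead uses a hypothetical DictTranslations subclass and hand-written in-memory catalogs, both of which are assumptions made for illustration only.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog; real code would load a .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._messages = dict(catalog)

    def gettext(self, message):
        # Fall back to the original (source-language) string when no entry exists.
        return self._messages.get(message, message)


# Hand-written catalogs, assumed for illustration only.
CATALOGS = {
    "nl": DictTranslations({"How are you?": "Hoe gaat het?"}),
    "ja": DictTranslations({"How are you?": "お元気ですか?"}),
}


def translator_for(language):
    """Return a lookup function for the language, or an identity lookup."""
    return CATALOGS.get(language, gettext.NullTranslations()).gettext


_ = translator_for("nl")
print(_("How are you?"))
print(_("Goodbye"))  # no catalog entry, so the source string is returned
```

Note that this only looks up strings a human has already translated; it does not translate arbitrary text.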
Or do you want to automatically translate text from one human language to another? This is an entire research domain (machine translation), so prepare to get a PhD on that; you'll need several years of hard work (and the results are still not very satisfactory).

Related

Custom translator - How can I train the machine to recognize the right translation solution (synonyms)?

I'm pretty new with Custom Translator and I'm working on a fashion-related EN_KO project.
There are many cases where a single English term has two possible translations into Korean. An example: if "fastening" relates to bags, backpacks, etc., it is 잠금, but if it relates to clothes, shoes, etc., it is 여밈.
I'd like to train the machine to recognize these differences. Could it be useful to upload a phrase dictionary? Any ideas? Thanks!
The purpose of training a custom translation system is to teach it how to translate terms in context.
The best way to teach the system how to translate is training with parallel documents of full sentence prose: the same document in two languages. A translation memory extract in a TMX or XLIFF file is the best material, but many other document formats are suitable as well, as long as you have both languages. Have at least 10000 sentences in both languages, upload to http://customtranslator.ai, and build a custom system with it.
If you have documents in Korean that are representative of the terminology and style you want to achieve, without an English match, you can automatically translate those to English, and add to the training material as parallel documents. Be sure to not use the automatically translated documents in the other direction.
A phrase dictionary is of limited help, because it is unaware of context. It is useful only in bootstrapping your custom system or for very rare terms where you cannot find or create a sentence.

API call - the SMT category

I have recently tried to review the Chinese -> English system. According to https://blogs.msdn.microsoft.com/translation/2017/11/15/microsoft-translator-accelerates-use-of-neural-networks-across-its-offerings/ , those systems were already switched to NMT models. There is also a statement that users can still use the statistical system by setting the category to "SMT".
However, https://blogs.msdn.microsoft.com/translation/2016/01/27/new-microsoft-translator-customization-features-help-unleash-the-power-of-artificial-intelligence-for-everyone/ mentions that there were actually three standard categories available for SMT engines: General (default), TECH, SPEECH.
Could you please explain which domain is offered by the SMT category now? And for how long will it be supported on your side?
Thanks
We are working on customization using a neural network decoder. Currently, the Microsoft Translator Hub has 3 Category IDs for SMT: general, tech and speech.
With content that is not narrowly confined to your domain, you may find that category=generalnn works better than your current customization.
Chinese is using the NMT system so using Category=generalnn would result in the same translation when calling the service using the Microsoft Translator Text API.
The second article is addressing Customization where you can create your own custom translation system or dictionary tuned to your domain, style and terminology. If you're interested in customization (SMT at this time), there are categories associated with using the Translator Text API and the Microsoft Translator Hub. The category identifies the domain for the project you create using the Hub. Two of the categories are Tech and Speech.
See the Microsoft Translator Hub User Guide to learn more about the Hub.
The tech category will produce different results only when translating FROM English to other languages. In the case of English>Chinese, with my sample sentence "My computer doesn't boot up.", it does. For Chinese>English, specifying "tech" will fall back to the default, which is neural in the case of Chinese<>English. "speech" generates the same results as "generalnn" in all cases.
It is generally true, including for Hub categories, that a category that is valid in one language pair is valid in all language pairs. The API will fail with an "invalid category" error only if that category doesn't exist at all. The reason for this design is so that you can build your custom systems out language by language, over time, while still allowing the user to choose between all available languages, at the cost of, maybe, occasionally suboptimal domain vocabulary in an as of yet uncustomized language pair.
The API does not return to you whether a customized system was used or not. A trick to get that feature anyway is to watermark your custom system using a dictionary entry. Make a dictionary entry "_mywatermark" that translates to "CustomSystem180309_1700_en_ru" for instance, and then you can test anytime, in any application, whether you are getting your custom system or not.
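The watermark check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `translate` is a hypothetical callable (text in, translated text out) that wraps whatever API call your application makes, and the two `fake_*` functions are stand-ins used to exercise the check, not real Translator API calls.

```python
WATERMARK_SOURCE = "_mywatermark"
WATERMARK_TARGET = "CustomSystem180309_1700_en_ru"


def is_custom_system(translate, source=WATERMARK_SOURCE, expected=WATERMARK_TARGET):
    """Return True if translating the watermark term yields the expected marker.

    `translate` is a hypothetical callable (text -> translated text); it is not
    part of any real SDK.
    """
    return expected in translate(source)


# Stand-ins for real API calls, used here only to exercise the check.
def fake_custom_translate(text):
    dictionary = {WATERMARK_SOURCE: WATERMARK_TARGET}
    return dictionary.get(text, text)


def fake_general_translate(text):
    return text  # a general system has no dictionary entry for the watermark


print(is_custom_system(fake_custom_translate))
print(is_custom_system(fake_general_translate))
```

Encoding the build date and language pair in the watermark value, as in the example entry, also tells you which version of the custom system answered.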

Best language and framework for a text based game like mafia wars

Which do you think is the best language/framework to develop a text based adventure game like Mafia Wars? I am proficient in Java/JavaScript and have dabbled in Python, Perl, Erlang, Scheme. Also, any pointers to articles relating to this are very welcome. I am starting from scratch and hence have no constraints. This is a hobby project that I am planning to do to satisfy my coding urge.
The 'best' language doesn't exist.
Try using the one that you feel most comfortable with, after thinking about data structures and functional requirements, and possibly the one where you can get the most support in your immediate (person to person) or close (e.g. stackoverflow) environment.
I'm going to try something original here - give Natural language a try.
Inform is a tool for creating interactive fiction (a.k.a. text-based adventures) that features its own language. It takes care of creating the initial "infrastructure" (taking user input, recognizing verbs, that sort of thing) and lets you concentrate on creating "things", "places" and "actions".
Here's a sample, extracted from its tutorial:
The wood-slatted crate is in the Gazebo. The crate is a container.
Mr Jones wears a top hat. The crate contains a croquet mallet.
It looks deceptively easy, I know. But try it :)
Inform also allows you to publish it on The Interactive Fiction Database, as well as export it to a standard Z-machine format (I believe the file extension for this is .z8). There's even a JavaScript Z-machine interpreter, in case you prefer to host your adventure on a web page yourself.
Edit: I've found two additional "frameworks": Adrift and TADS 3. I don't know whether they use a programming language or are completely graphical, since I don't use Windows.
I'm a little confused by your requirements; Mafia Wars is a web game, correct? Text adventures, while they can be played on the web (see this article: http://kooneiform.wordpress.com/tag/if-interpreters/) are usually single-player games, a far cry from Mafia Wars.
I think you mean you want to create a PBBG or web game; based on your experience then I recommend a Python back-end with JavaScript on the client-side. One framework you could look into is the Google App Engine, which has Python support, and would be an excellently scalable solution.
Alternatively you can choose one of the many Python web frameworks available. If you'd like a simple place to start, I recommend web.py, which I've been trying out recently and quite like. I've found that combining Python and JavaScript/AJAX with web.py and something like jQuery is a very enjoyable and friction-free way to develop.
Clojure could be a fun option - Lisps are a classic way of writing natural language processing programs and text adventure games are a good example.
Here's a nice little tutorial for writing a text adventure in Clojure.
Just use what you have learned; there is no specific programming language for that kind of application. It is just more or less easy depending on the language.
Since you seem to have some experience in Python, just go ahead with Python! If you haven't already done a web project, you should take a look at tutorials and resources on the web.
Good luck!

Diversify programming knowledge

I've taken courses, studied, and even developed a little by myself, but so far I've only worked with Microsoft technologies, and until now I have had no problems with that.
I recently got a job in a Microsoft gold partner company for development in C#, VB.net and asp.net.
I'd like tips on how to diversify by learning technologies other than those from Microsoft. Not necessarily to find another job; I think my job fits my current interests. I think that by teaching myself other languages, frameworks, and databases, I may become a better programmer as a whole and (maybe) end up with more job opportunities, able to choose what I'm going to be working with.
What should I start with? How should I do it?
If you're comfortable with C# and VB, learn a language that uses different paradigms. The usual suspects would be Ruby, Erlang, Haskell, Lisp. All of these are available for Windows and other platforms. You might have to get used to different tools to interact with them but that's not necessarily a bad thing.
At the risk of sounding trite, why not install some variant of Linux on a cheap desktop? The mere act of setting up a Linux box is educational.
Once you find your way around it, do some shell scripting and install things like a web server. That should keep you busy for a while. Once you are past that, play with some dynamic languages like Perl, Ruby, Python, PHP, etc.
If you're interested in other languages, just pick one and away you go. You sound like you have enough experience to be apt in another language.
If you're looking into a new desktop-development-language then I'd recommend Java or Python, both of which you'd ease into with your C# and VB.NET experience.
If you're looking into web programming, go for PHP?
Browse some source examples and see what catches your eye as the most interesting.
Pick up a book on that language.
Ideally, one should know at least one example from each of the major "paradigms":
Assembly (nowadays a dying art, and not that useful)
plain C
one of the OO-variants of C (C++, Objective-C)
Java or C# (they are very similar, probably no need to learn both)
a scripting language like Ruby or Perl
JavaScript (preferably via Crockford's book)
a non-pure functional language, e.g. Scheme (PLT Scheme is a nice learning environment)
a pure-functional language like Haskell or OCaml
Erlang (somewhat of a class of its own)
a mathematical/statistical language like R, or J (an APL-successor)
Microsoft technologies aren't bad to start with. My advice would be:
Make sure you acquire sound knowledge about the foundations of programming and the technologies you use. The more basics you know, the more independent you'll be from the latest fads:
Read "Windows Internals" to understand the operating system you're working with. In the process, you will understand other operating systems a lot better.
Toy around with other languages. Learn the differences between statically typed languages and duck-typed languages, functional programming languages, imperative programming languages, and so on.
Learn the language you use the best you can. Become Jon Skeet!
In other words, don't move sideways first. Dig deeper and become better at understanding what you do.
It would be a nice idea to get involved with one of the open source programs on http://sf.net. That way you can learn a new platform while also producing some legitimate code. You also get to look at some good coding practices. Last but not least, you give something back to the software community.
Maybe think of a project that would be of use to you in your daily life and see if you could develop that in a suitable language. That way you have a goal and at the end of the project you have something useful.
Alternatively, why not try learning something not directly programming related? Project management might be of use for future roles, or you could do some reading about the history of technology.
These won't add any new languages to your CV but they might add some different aspects to your thinking that might make you a more well rounded potential employee.
I see two main directions to go:
Specific technologies. Select these depending upon how you want to extend yourself, new language (perhaps scripting if you haven't done that, perhaps functional programming), or new techniques (for example, UI programming, or low-level network programming depending upon what you haven't already done), or new OS (Linux if you're a Windows person).
Or, look at higher-level problems, for example design methods and team organisation. Read books such as Brooks' Mythical Man-Month and Beck's Extreme Programming. Consider how to deal with problems bigger than can be solved by one person. Read up on the (Rational) Unified Process and UML. Explore revision control systems and testing techniques, not just unit tests but other flavours. Think about how you would organise a team if you were the leader. How would the tasks be subdivided, how would communication be managed?

What does it take to make a language successful? [closed]

I have an interesting idea for a new programming language. It's based on a new programming paradigm that I've been working out in my head for some time. I finally got around to start working on a basic parser and interpreter for it a few weeks ago.
I want my new language to be successful and I want to eventually create a community around it when it's ready to release. The idea behind it is fairly innovative, so I don't expect it to gain a lot of ground in the business world, but it would thrill me more than anything else to see a handful of startups or open source projects use it.
So taking those aims into account, what can I do to help make my language successful? What do language projects do to become successful? What should I avoid at all costs? I'd love to hear opinions or stories about other languages -- successful or not -- so I can think about them as I continue to develop.
So far, the biggest concerns on my mind are finding a market, access to existing libraries, and having amazing tool support. What else might I add to this list?
The true answer is by having a beard.
http://blogs.microsoft.co.il/blogs/tamir/archive/2008/04/28/computer-languages-and-facial-hair-take-two.aspx
Although not specific to new programming languages, the book Producing Open Source Software by Karl Fogel (available to read online) may contain some hints on the issue of building a community around your new programming language.
In terms of adoption of programming languages in general, it seems like the trend lately has been to have a rich library to make development times shorter.
As there isn't much detail on what your language is like, it's hard to determine whether adoption of the language is going to depend on the availability of a rich library. Perhaps your language will be able to fill a niche that has been overlooked by other languages and be able to gain users. Or perhaps it has a slick name that will draw people in -- there are many factors which can affect the adoption of a language.
Here are some factors that come to mind when thinking about recent successful languages:
* Ability to leverage existing libraries in the new language.
  * Having an adapter to external libraries written in other languages.
    * Python allows access to code written in C through the Python/C API.
  * Targeting a platform which already has plenty of libraries available for use.
    * Groovy and Scala target the Java platform, therefore allowing the use of and interoperation with existing Java code.
* Language design and syntax to allow increased productivity.
  * Many dynamically-typed languages have gained popularity, such as Ruby and Python to name a couple.
  * More concise and clear code can be written in languages such as Groovy, as opposed to verbose languages such as Java.
  * Offering features such as functions as first-class objects and closures which aren't offered in more "traditional" languages such as C and Java.
* A community of dedicated users who are also willing to teach newcomers about the benefits of a language.
  * The human factor is going to be big in wide-spread support for a language -- if people never start using your language, it won't gain more users.
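To make the point above about first-class functions and closures concrete, here is a minimal Python sketch. The names make_counter and apply_twice are made up for this example; it only illustrates the two features and is not tied to any particular language design.

```python
def make_counter(start=0):
    """Return a function that remembers and increments its own count."""
    count = start

    def increment(step=1):
        nonlocal count  # the inner function closes over `count`
        count += step
        return count

    return increment


def apply_twice(func, value):
    """Functions are ordinary values, so they can be passed as arguments."""
    return func(func(value))


counter = make_counter()
counter()
counter()
print(counter())  # third call on the same counter
print(apply_twice(lambda x: x * 2, 5))
```

Each call to make_counter produces an independent counter, because every returned function carries its own captured state.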
Also, another suggestion that I could add is to make the development of your language open -- keep your users posted on developments in your language, and allow people to give you feedback. Better yet, let your users take part in the decision-making process, if you feel that is appropriate.
I believe that the more ways you offer to participate in the development of the language, the more people will feel they have a stake in its success, and the more likely it is to gain support.
Good luck!
Most languages that end up taking off rapidly do so by means of a killer app. For C it was Unix. Ruby had Rails. JavaScript is the only available programming system common to most browsers without third-party add-ons.
Another means of success is by fiat. This only works if you have significant clout. For example C#, as nice a language as it might be, wouldn't be anywhere near as popular as it is now if Microsoft had not pushed it as hard as it has. Objective-C is the language of Mac OS X simply because Apple says so.
The vast majority of languages, though, which lack a single killer app or a major corporate backer have gained success through long term investment of their respective creators. Perl and Python are prime examples. C++ has no single entity behind it, but it has evolved as the needs of developers have changed.
Don't worry about trying to make the language be successful; worry about using it to solve real problems and make real money.
You'll either make lots of money from using this language, or not. Once you have lots of money, others may care how you did it. Or not, either way you have lots of money.
If you don't make lots of money, nobody will want to know how you did it.
Edit based on comment: I define successful as people using it, and people use languages to solve problems, most for profit, thus successful == profitable.
In addition to making the language easy to use (which has several meanings), you should develop a comprehensive library that covers, and provides a good level of abstraction over, the following most important areas:
* Data structures and manipulation
* File I/O support
* XML processing
* Networking (plus web based technologies like HTTP/HTTPS)
* Database support
* Synchronous and asynchronous I/O
* Processes and threads
* Math
A well thought out framework that makes rapid development faster (and easier to maintain) would be a great addition. For this, you should know the currently popular frameworks well.
Keep in mind that it takes a lot of time. I think it took Python about 10 years (someone please correct me if I'm wrong).
So even if your community still seems small after say, 5 years, that's not the end of the story.
"It's based on a new programming paradigm that I've been working out in my head for some time."
While laudable, odds are really good that someone has already done something with your "new" paradigm.
To make a language usable, it must build on prior art. Totally new is not a good path to success. My favorite example is Algol 68.
Algol 60 was wildly popular (back in the day, which is a while ago, admittedly).
The experts wanted to build on this success. They proposed some new paradigms, and the effort split into factions. The purists put the new paradigms into Algol 68; it disappeared into obscurity. Some folks created a different version of Algol, called PL/I. It did not have any really new paradigms. It actually went somewhere and was used heavily. Another group created Pascal -- it didn't have much that was new -- it discarded things from Algol 60. It actually went somewhere and was used heavily.
Your new paradigm must have a clear and concise summary so people can fit it into a context of where the language is usable, how it can be used, what the costs and benefits of using it are.
A "new programming paradigm" causes some people to say "why learn a completely new paradigm when the ones I have work so nicely?" You have to be very clear on how it helps to have a new paradigm.
The language and libraries must work, and work very, very well. A language that isn't rock-solid is worthless. In order to be rock-solid it must be very simple.
It has to have a tutorial that will help anyone get started with your language.
Good Framework for Common Tasks
Easy Installation/Deployment
Good Documentation
Debugger/IDE and other Tools
A popular flagship product that uses your language!
Good documentation, including a detailed reference manual as well as simple examples to get people started quickly.
Good library support so that people can actually write useful programs.
Most popular languages seem to be very strong in either or both of those.
Use the Trojan Horse approach
C++ - The Forgotten Trojan Horse
An interesting article on why C++ can grab the heart of programmers successfully.