Microsoft Translator: Does GetLanguagesForTranslate() translate language names into the browser language?

This may be a stupid question, but I have no way of testing this with multiple languages.
I am wondering whether Microsoft Translator translates the list of supported languages into the language detected by the browser? I assume it does, but would appreciate validation.
TIA

GetLanguagesForTranslate() returns the set of supported language IDs.
GetLanguageNames() takes an array of language IDs and returns the friendly name of each language in the language you specify in the "locale" parameter, where locale is an ISO 639 language code.
You can read the browser's Accept-Language header and pass its first element as the locale parameter. That's what the Bing Translator home page does (www.bing.com/translator).
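As an illustration, here is a rough sketch of that flow in Python; the V2 HTTP endpoint, the ArrayOfstring body format, and the bearer token are assumptions to check against the API reference:

import requests

def first_locale(accept_language):
    # "ko-KR,ko;q=0.9,en;q=0.8" -> "ko-KR"
    return accept_language.split(",")[0].split(";")[0].strip()

locale = first_locale("ko-KR,ko;q=0.9,en;q=0.8")
resp = requests.post(
    "http://api.microsofttranslator.com/V2/Http.svc/GetLanguageNames",
    params={"locale": locale},
    headers={"Authorization": "Bearer <token>",  # placeholder auth
             "Content-Type": "text/xml"},
    # Body: the language IDs from GetLanguagesForTranslate as an ArrayOfstring.
    data='<ArrayOfstring xmlns="http://schemas.microsoft.com/2003/10/'
         'Serialization/Arrays"><string>en</string><string>ko</string>'
         '</ArrayOfstring>',
)
print(resp.text)  # friendly names localized into `locale`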

Related

How to do dictionary-only training?

I want to train a basic translation system with only a glossary.
The language pair is EN to KO. I added 1,700 sentences in the Dictionary tab in the manner described in the article.
I did not select anything in the "Training" tab.
https://cognitive.uservoice.com/knowledgebase/articles/1166938-hub-building-a-custom-system-using-a-dictionary-o
However, contrary to expectations, the system did not translate the terms, and unlike what the document (Microsoft Translator Hub User Guide.pdf) describes, the training took a long time to complete.
Dictionary only training: You can now train a custom translation system with just a dictionary and no other parallel documents. There is no minimum size for the dictionary; one entry is enough. Just upload the dictionary, which is an Excel file with the language identifiers as column headers, include it in your training set, and hit train. The training completes very quickly; then you can deploy and use your system with that dictionary. The dictionary applies the translation you provided with 100% probability, regardless of context. This type of training does not produce a BLEU score, and this option is only available if MS models are available for the given language pair.
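For illustration, a dictionary file in that format could be produced with a short Python script (a sketch: pandas/openpyxl and the sample entries are assumptions, not part of the guide):

import pandas as pd  # writing .xlsx also requires openpyxl

# Language identifiers as column headers, one dictionary entry per row.
glossary = pd.DataFrame({
    "en": ["invoice", "ledger"],
    "ko": ["송장", "원장"],
})
glossary.to_excel("dictionary.xlsx", index=False)  # upload in the Dictionary tab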
I would like to know why this training silently drops the dictionary. If this behavior reflects a feature update, is there an intended schedule for it?
In addition, I am wondering if there is a plan to bring the dictionary feature to the NMT API as well.
Customizing NMT is available now by using Custom Translator (Preview), and we expect the Dictionary feature to be available when Custom Translator is Generally Available.
You do need to be using the Microsoft Translator Text API v3, and Custom Translator supports language pairs where NMT languages are available today (Korean is an NMT language).
Thank you.
Yes.
You can customize our en-ko general domain baseline with your dictionary. Please follow our quick start documentation.

In the ABBYY Cloud OCR SDK, does the order of Language parameter values matter?

When using the ABBYY Cloud OCR SDK, one of the optional parameters is "language", which accepts a comma-separated list of languages the service should look for when recognizing the text. Does anyone know whether the order of these matters? For example, when calling the processBusinessCard method as documented here:
Would a language option of English,Japanese behave any differently from Japanese,English, for example by preferring the detection of one language over the other?
I'm trying to decide if I can use a simple checkbox group to set the values for this setting, or if I need to provide an interface that allows for ordering.
UPDATE:
I've tested reordering the languages, and it doesn't seem to change the behavior.
As far as I know, there is no difference. But to be sure, you can upload a sample image to the demo service and try it both ways.
A little tip: you can use the ABBYY OCR SDK forum to contact the support team.
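For a quick check against the real service, a sketch like the following submits the same image with both orders; the credentials and file name are placeholders, and the API is asynchronous, so each response is a task descriptor to poll with getTaskStatus before comparing results:

import requests

AUTH = ("<application-id>", "<password>")  # placeholder credentials

for language in ("English,Japanese", "Japanese,English"):
    with open("card.jpg", "rb") as image:
        resp = requests.post(
            "https://cloud.ocrsdk.com/processBusinessCard",
            params={"language": language, "exportFormat": "xml"},
            auth=AUTH,
            data=image,
        )
    # Each call returns a task; poll getTaskStatus, then diff the two outputs.
    print(language, resp.status_code, resp.text[:200])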

Converting JSON string to an array in ColdFusion MX7

I have a cookie value like:
"[{"index":"1","name":"TimePeriod","hidden":false},{"index":"2","name":"Enquiries","hidden":false},{"index":"3","name":"Online","hidden":false}]"
I would like to use this cookie value as an array in ColdFusion. What would be the best possible way to do this?
The normal answer would be to use the built-in deserializeJson function, but since that function isn't available in CFMX7 (it arrived in CF8), you will need to use a UDF to achieve the same thing.
There are two sites which collect resources of this type, cflib.org and riaforge.org, each of which has a different potential solution for MX7.
Searching CFLib provides JsonDecode. (CFLib has a specific filter for "Maximum Required CF Version", so you can ensure any results that appear will work for your version.)
Searching riaforge provides JSONUtil, which runs on MX7 (but also claims better type mapping than the newer built-in functions).
Since MX7 runs on Java, you can likely also make use of any of the numerous Java libraries listed on json.org, using createObject/java.
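Whichever route you pick, usage ends up looking roughly like this sketch, where jsonDecode() stands in for the UDF you include (e.g. CFLib's JsonDecode) and cookie.gridState is a hypothetical cookie name:

<!--- Deserialize the cookie's JSON into a CF array of structs. --->
<cfset cols = jsonDecode(cookie.gridState)>
<cfloop from="1" to="#arrayLen(cols)#" index="i">
    <cfoutput>#cols[i].name# (hidden: #cols[i].hidden#)<br /></cfoutput>
</cfloop>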
JSON serialization was added natively in CF8.
If you are on MX7, look on riaforge.org for a library that will deserialize JSON for you.

IDL for JSON REST/RPC interface

We are designing a fairly complex REST API, in which most of the input and output consists of JSON-encoded objects with a specific structure. One challenge we have found is documenting the API in a way that makes it easy for clients to post correct input and process output. Because both the input and output require fairly complex JSON objects, client developers often introduce bugs related to the structure of the I/O objects.
With all of the JSON web APIs these days, I would have hoped for a general solution, but I am having a hard time finding one. I looked into JSON Schema, a JSON validation vocabulary, but both the IETF draft and the implementations seem fairly immature (even though they have been around for a while, which is not a good sign).
A slightly different approach is offered by Protocol Buffers and Apache Avro, where the schema is not used for validation but is actually required for encoding and decoding the message. Of these two, Avro seems to have rather limited documentation and implementations. ProtoBuf seems better, but I am not sure whether it is really suitable for calling a JSON API from the browser.
Now I am starting to doubt whether I am looking at this from the right angle. Are there other methods available to make my API a bit more strongly typed? Or is a formal description of a JSON REST/RPC API something that defeats the purpose of using JSON?
Edit: 6 months after this topic we found mongoose, which is very close to what we were looking for.
Below is a reply I received by email from Douglas Crockford.
I am not a believer in schemas as an alternative to input validation. There are properties that cannot be verified from the syntax. I think that was one of the ways that XML went wrong.
If your formats are too complex, then I would look at simplifying them.
Such systems exist and I'm the author of one of them. It is called Piqi-RPC and it does IDL-based validation of the input and output parameters for RPC-style APIs over HTTP.
It supports JSON, XML and Google Protocol Buffers as data representation formats for input and output of HTTP POST requests. Clients can choose to use any of the three formats and specify their choice using the standard Accept and Content-Type HTTP headers.
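For example, a client could send JSON and ask for XML back purely through those headers; in this sketch the endpoint URL and payload are hypothetical:

import requests

resp = requests.post(
    "http://example.com/rpc/get-user",      # hypothetical Piqi-RPC-style endpoint
    json={"id": 42},                        # sets Content-Type: application/json
    headers={"Accept": "application/xml"},  # request XML for the response
)
print(resp.headers.get("Content-Type"))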
So, yes, in theory you are looking in the right direction. However, at the moment Piqi-RPC supports writing servers only in Erlang, so it wouldn't be very useful for you on a different stack. I heard that Apache Thrift also supports JSON over HTTP transport, but I haven't checked. Another similar system I know of (also for Erlang) is called UBF. I have heard of libraries for Java that can parse and validate JSON based on a Protocol Buffers specification (e.g. http://code.google.com/p/protostuff/).
The idea itself is far from being new, but there aren't many systems that approach it in practice. It is a challenging problem.
Historically, IDLs were used for interface definition and binary data serialization, and not so much for validating dynamic data interchange formats (e.g. XML and JSON), which emerged later. Sun RPC IDL and CORBA IDL fall into the first category. WSDL would be one of the few examples covering both areas, but it is a terrible piece of technology and would be a bad choice for most modern systems. In addition, there are many schema languages (also known as DDLs -- data definition languages), most of which are highly specialized and work with only one representation format, e.g. XML or JSON schemas. Few of those have stable implementations.
The Piqi project, and Piqi-RPC which is based on it, are built around several fairly simple realizations:
A DDL doesn't have to be explicitly tied to any particular data representation format or built around it. Instead, such a language can be fairly universal and cover a wide range of practical use cases (e.g. cross-language data serialization and data validation) and data formats (e.g. JSON, XML, Protocol Buffers).
IDL for RPC-style communication can be implemented as a thin, mostly syntactic layer on top of the universal DDL.
Such IDL and interface specifications can be transport agnostic.
Speaking of REST-style APIs over HTTP compared to RPC-style APIs over HTTP:
With RPC-style APIs, a service developer or an automated system has to validate three things: the function name (according to some service naming scheme), the input and, if you choose so, the output.
In the case of REST-style APIs, people get themselves into trouble for no good reason, because they have a lot more to validate: arbitrarily complex URL syntax, including dynamic parameters encoded in URL segments (for all HTTP methods) and in the URL query string (only for the HTTP GET method); HTTP method correspondence (whether it should be GET, POST, PUT, DELETE, etc.); the HTTP body when some parameters go there (sometimes twice, manually, for parameters represented in both JSON and XML); custom HTTP headers; and, separately, service documentation. Imagine an IDL supporting all of that!
XML is better for RESTful services in many ways. It has native linking (<link href=, for all those HATEOAS fans), native language support (lang="en") and a great ecosystem.
It is also better for future-proofing and future API refactorings. Converting this:
<profile>
<username>alganet</username>
</profile>
To support more usernames:
<profile>
<username>alganet</username>
<username>alexandre</username>
</profile>
Is much simpler to do without breaking existing clients in XML. JSON makes that hard.
If you really need JSON, JSON Schema is the way to go. It's immature, but I don't know anything better for that case. Maybe your consumers could choose between XML and JSON, so they can pick between a small payload (JSON) or RESTful candies (XML) using content negotiation.
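As a sketch of what that buys you, here is the profile example validated with the Python jsonschema package; the schema itself is an assumption about the contract:

from jsonschema import validate  # pip install jsonschema

schema = {
    "type": "object",
    "properties": {
        "profile": {
            "type": "object",
            "properties": {
                # An array from day one survives the "more usernames"
                # refactoring that the XML version handles natively.
                "username": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["username"],
        },
    },
    "required": ["profile"],
}

validate({"profile": {"username": ["alganet", "alexandre"]}}, schema)  # passes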
I'd say the answer to your last question is yes. If you need a way to constrain and document the JSON "schema", why didn't you go with XML in the first place? It is not that much harder to parse, and being able to enforce a schema for it is a great advantage.

What's a DSL in plain words?

I heard from someone that DSLs are really powerful in some specific fields, so I want to find out whether I can add them to my skill set.
The first problem that came up is: what exactly is a DSL? After doing some searching, it seems Groovy supports DSLs very well, so I went and read Groovy's documentation and tried it out myself.
And I got the impression that a DSL is just some kind of configuration file, consisting of text or XML, that you parse with a tool like Groovy, and it magically becomes methods or functions you can invoke. What happened?
I have read a few things but cannot get it straight. Any help?
Did you read this? Martin Fowler is an authority on the subject and a great writer. I doubt that anyone will improve on the first paragraph. If you still don't get it, give it some time and re-read the article a few times.
I'd recommend looking into JetBrains' MPS.
A book might be overwhelming, but there's a relatively new one available.
And I got the impression that a DSL is just some kind of configuration file, consisting of text or XML, that you parse with a tool like Groovy, and it magically becomes methods or functions you can invoke. What happened?
I don't think your impression is entirely accurate. I'd forget about Groovy and parsing and all the implementation details for now. Focus on the problem that DSL is trying to solve.
A DSL designer tries to come up with a pseudo programming language that an expert, who is unfamiliar with programming languages like Groovy or Java or C#, would recognize as a simple language describing the way they solve problems.
The DSL uses terms and concepts familiar to any one knowledgable about that domain.
The DSL shields users from the underlying implementation details so they can focus on how to attack their problems.
A DSL is written for the convenience of business users, not developers.
Keep that in mind and the rest is implementation. Eye on the prize....
A domain-specific language (DSL) is a programming language that is not fully featured. The point is that programming in a DSL can be easier, and less prone to bugs, than programming in a general-purpose language. The "domain" in "domain-specific language" refers to the specific purpose the language will be used for.
For example, the language that a calculator uses, with just + - * / and numbers, could be called a domain-specific language. It has the advantage over a regular programming language that programs will never segfault, crash, loop forever, etc. Other examples of domains might be web development -- Ur/Web, for instance, is a DSL for building web applications -- and databases, where SQL is the domain-specific language.
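To make the calculator example concrete, here is a minimal sketch in Python that accepts only numbers and + - * /; everything else is rejected, which is exactly what makes it domain-specific:

import ast
import operator

# The whole language: four binary operators over numeric literals.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calc(expr):
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not part of the calculator language")
    return walk(ast.parse(expr, mode="eval").body)

print(calc("2 + 3 * 4"))  # 14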
I don't know much about Groovy, but it seems there are particular tools for using it to create DSLs. Fundamentally, to create a DSL you need to specify a syntax along with some sort of semantics. How exactly Groovy does this I do not know.
A DSL is a language dedicated to a specific domain. For instance, the well-known CSS is a domain-specific language for the look and formatting of a document.
By using Groovy you can create your own DSL focused on any selected domain, e.g. accounting, telecommunications, banking, etc. This means that the language will use the common terminology of that area, meeting the needs of the domain. Such a language will be easily understood by people in that domain who are not necessarily technical (e.g. accountants). Sometimes the focus is on use by non-programmers: Groovy in particular is a dynamic language with which you can let end users add code scripts dynamically, through configuration files, similarly to what Excel does with VB.
In any case, you should delve into Martin Fowler's publications if you are interested in this subject.