How does Microsoft Translator Service handle double quotation marks? - microsoft-translator

I get all kinds of different results for using double quotation marks.
For example, they are translated to {" and "* for "Test error":
They are translated to [ and ] for "Test paper":
Could anyone shed some light on this?

Related

Snowflake how to escape all special characters in a string of an array of objects before we parse it as JSON?

We are loading data into Snowflake using a JavaScript procedure.
The script loops over an array of objects to load some data. These objects contain strings that may have special characters.
For example:
"Description": "This file contain "sensitive" information."
The double quotes around the word "sensitive" become:
"Description": "This file contain \"sensitive\" information."
This broke the loading script.
The same issue happened when we used HTML tags within description key:
"Description": "Please use <b>specific fonts</b> to update the file".
This is another example on the Snowflake community site.
Also, this post recommended setting FIELD_OPTIONALLY_ENCLOSED_BY equal to the special characters, but I am handling a large data set which might contain all of the special characters.
How can we escape special characters automatically, without updating the script to loop over the whole array in JavaScript and replace each special character with something else?
EDIT
I tried using JSON_EXTRACT_PATH_TEXT:
select JSON_EXTRACT_PATH_TEXT(parse_json('{
"description": "Please use \"Custom\" fonts"
}'), 'description');
and got the following error:
Error parsing JSON: missing comma, line 2, pos 33.
I think the escape characters generated by the JS procedure are themselves interpreted as escapes when the string is passed to SQL functions, so the backslashes are lost.
'{"description": "Please use \"Custom\" fonts"}'
becomes
'{"description": "Please use "Custom" fonts"}'
Therefore parsing them as JSON, or fetching a field from the JSON, fails. To avoid the error, the JavaScript procedure should generate a double backslash instead of a single backslash:
'{"description": "Please use \\"Custom\\" fonts"}'
I do not think there is a way to prevent this error without modifying the JavaScript procedure.
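The doubling described above can be sketched outside Snowflake. This is a minimal illustration in Python rather than in the JavaScript procedure itself; the helper name to_sql_single_quoted is hypothetical, and it assumes that the target SQL dialect treats a backslash as an escape character inside single-quoted literals, as described for Snowflake above.

```python
import json


def to_sql_single_quoted(obj) -> str:
    """Serialize obj to JSON and wrap it as a single-quoted SQL string literal.

    Hypothetical helper. Assumes backslash is an escape character inside
    single-quoted SQL literals, so every backslash produced by the JSON
    encoder has to be doubled to survive the SQL parser.
    """
    text = json.dumps(obj)               # JSON-level escaping: " becomes \"
    text = text.replace("\\", "\\\\")    # SQL-level escaping: \ becomes \\
    text = text.replace("'", "\\'")      # protect the literal's own delimiter
    return "'" + text + "'"


literal = to_sql_single_quoted({"description": 'Please use "Custom" fonts'})
print(literal)
```

The printed literal has the same shape as the double-backslash example above. Where the driver supports bind parameters, passing the JSON text as a parameter avoids building the literal by hand at all.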
I came across this today. Gokhan is right: you need the double backslashes to properly escape the quote.
Here are a couple of links that explain it a little more:
https://community.snowflake.com/s/article/Escaping-new-line-character-in-JSON-to-avoid-data-loading-errors
https://community.snowflake.com/s/article/Unable-to-Insert-Data-Containing-Back-Slash-from-Stored-Procedure
In my case I found that I could address this challenge by disabling the escaping (using a $$-delimited string constant) and then doing the replacement manually with the replace function.
For your example the replace is not necessary.
select parse_json($${"description": "Please use \"Custom\" fonts"}$$);
select parse_json($${"description": "Please use \"Custom\" fonts"}$$):description;

Json deserialization issue in xamarin forms

I have a Xamarin.Forms application in which I deserialize JSON data. Deserialization worked fine until an extra double quote appeared in the JSON. The JSON deserializer threw an error.
My Json data
{
"Model_id": 403,
"Model": "iPad Pro 9.7""
}
The extra " after 9.7 causes the problem, but that double quote denotes the size of the device in inches.
My deserialization
resultObject = JsonConvert.DeserializeObject<T>(resultJSON);
How can I solve this? Any help is appreciated.
Simply speaking, the presented JSON is not valid JSON. See the RFC:
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks, except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).
(Emphasis mine)
This means that you have to escape the quotation mark within the string.
{
"Model_id": 403,
"Model": "iPad Pro 9.7\""
}
Strictly speaking, any character can also be escaped by its Unicode escape sequence, i.e. a backslash followed by a u and then the four hexadecimal digits of its code point, which would be \u0022 for a quotation mark. This would render your JSON
{
"Model_id": 403,
"Model": "iPad Pro 9.7\u0022"
}
Anyway, the RFC also states
Alternatively, there are two-character sequence escape
representations of some popular characters.
and \" is one of them.
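Both escape forms decode to the same string, and a JSON encoder produces the short form on its own. The sketch below uses Python's standard json module purely for illustration; the question itself concerns JsonConvert in C#, and the variable names here are made up for the example.

```python
import json

# The value the question wants to transport: a trailing inch mark.
device = {"Model_id": 403, "Model": 'iPad Pro 9.7"'}

# An encoder escapes the quotation mark automatically.
encoded = json.dumps(device)
print(encoded)

# Both escape forms from the answer decode to the same value.
short_form = json.loads('{"Model_id": 403, "Model": "iPad Pro 9.7\\""}')
unicode_form = json.loads('{"Model_id": 403, "Model": "iPad Pro 9.7\\u0022"}')
print(short_form == unicode_form == device)
```

The practical consequence is that the fix belongs on the side that produces the JSON: if that side serializes with an encoder instead of concatenating strings, the inch mark arrives already escaped.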

double quotes are invalid in JSON validator even after manually replacing these text quotes in a code editor

I was given a payload and was preparing it to be JSON. When validating, it says the double quotes are invalid. I tried opening the file in VS Code and deleted and retyped over the invalid text quotes with double quotes, saved the file and then validated again, but the JSON validator still says these double quote marks are invalid. Anyone else run into this problem? I really don't want to have to type the entire file manually from scratch.
This is an example of some of the code that's causing this error in the JSON validator:
{
"conjunctive": "&&",
"conditionKey": "nested condition",
"operator": "nested condition",
"conditionValue": "nested condition",
"description": "nested condition"
}
I know this is a long shot. I am surprised this is happening, but I am hoping someone knows what to do here. I tried the JSON validator in Sublime and the two at these URLs:
https://jsonformatter.curiousconcept.com/
https://www.freeformatter.com/json-validator.html
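One possible cause, which the question does not confirm, is that the payload still contains typographic quotes or invisible characters that look fine in the editor. The following Python sketch only reports where such characters occur; the function name find_suspect_chars and the sample string are assumptions made for illustration.

```python
import unicodedata


def find_suspect_chars(text):
    """Return (line, column, code point, name) for each character that is
    non-ASCII or a control character other than a tab."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 126 or (ord(ch) < 32 and ch != "\t"):
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical sample: the key is wrapped in curly quotes, not ASCII quotes.
sample = '{\n\u201cconjunctive\u201d: "&&"\n}'
hits = find_suspect_chars(sample)
for hit in hits:
    print(hit)
```

If the report is empty, the quotes themselves are probably not the problem and the validator error points at something else in the file.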

BigQuery failing to import CSV from tshark

Currently, I have tshark logging all packets matching certain messaging criteria and outputting them into a CSV. The CSVs are then stored on Google Cloud Storage, ready for importing into BigQuery.
This is one example line from the CSV that tshark outputs.
"1380106851.793056000",
"1.1.1.1",
"2.2.2.2",
"99999",
"1111",
"raw:ip",
"324",
"af:00:21:9a",
"880",
"102",
"74:00",
"ORIG",
"It's text or !\x0a\" 's not D",
"0x00",
"0",
BigQuery will not import this line, claiming that "Data between close double quote (") and field separator: field starts with: ". I assume it is the 13th column ("It's text or !\x0a\" 's not D") that is causing this issue, but I'm unsure how to work around it. This column contains the message text, and it is reasonable to assume that it may never contain balanced syntax.
The only remedy that I can think of is running awk over the file and replacing any non-syntax double-quotes with single quotes.
Is there anything I've missed?
I'm not sure why tshark escapes double quotes with a backslash, but according to RFC 4180, they should be escaped with another double quote:
"A (double) quote character in a field must be represented by two
(double) quote characters."
BigQuery will happily ingest a quote escaped in this way:
Doesn't work: "It's text or !\x0a\" 's not D"
Works: "It's text or !\x0a"" 's not D"
Is there a way to tell tshark how to escape CSV appropriately? Otherwise I bet it would be a welcome patch, if it cites the RFC standard. Also, if necessary, this alternate escape mechanism could be implemented as a BigQuery feature (I guess votes on this question can act as a measure of how much it's needed).
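Until tshark emits RFC 4180 quoting itself, the conversion from the "Doesn't work" form to the "Works" form can be done as a preprocessing step. This is a minimal Python sketch, not a tshark or BigQuery feature; the function name fix_tshark_quotes is hypothetical, and it assumes that every backslash directly followed by a double quote is an escaped quote rather than a literal backslash at the end of a field.

```python
import csv
import io


def fix_tshark_quotes(line: str) -> str:
    """Rewrite backslash-escaped quotes (\\") as doubled quotes ("").

    Assumption: a backslash directly before a double quote is always an
    escape. Other backslash sequences such as \\x0a are left untouched.
    """
    return line.replace(r'\"', '""')


# Three columns taken from the example line in the question.
raw = r'''"ORIG","It's text or !\x0a\" 's not D","0x00"'''
fixed = fix_tshark_quotes(raw)
print(fixed)

# A standard CSV reader now sees three fields with the quote preserved.
row = next(csv.reader(io.StringIO(fixed)))
print(row)
```

A plain string replacement is used instead of a CSV reader configured with a backslash escape character, because such a reader would also consume the backslash in \x0a and alter the message text.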

Why is this JSON invalid?

I have tried to send some emails to my PHP server in JSON format, but when I validate that JSON it shows an error.
{"function":"contacts", "parameters": {"emails": "(
"John#mac.com",
"anna#gmail.com",
"hank#mac.com"
)","user_id": "90"},"token":""}
The error is shown as:
Parse error on line 4:
...{ "emails": "( "John#ma
----------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['
You need to use square brackets for all arrays. Right now you are using parentheses, and you also start with a quotation mark, indicating a string instead of an array.
Either you want the whole value to be a string, in which case you should escape the quotes within the string with a backslash (\"), or you should remove the quotes and replace the parentheses with square brackets.
Note: The syntax highlighting above should hint at where you are going wrong.
You need to change your JSON to the following, or you need to escape the double quotes surrounding the email addresses.
{"function":"contacts", "parameters": {"emails": [
"John#mac.com",
"anna#gmail.com",
"hank#mac.com"
],"user_id": "90"},"token":""}
Your emails element is being defined as a string starting with "( and ending with )".
Inside that string, you then have separate strings each starting and ending with quote marks.
This is obviously incorrect.
What you probably intended is "emails" : [ ...... ]
(... unless you actually intended for the emails element to be a string, in which case the quotes within it need to be escaped as \", as do the line feeds as \n. But I don't think that's what you intended, is it?)
That's why you're getting a syntax error.
However, I guess the key point here is that issues like this demonstrate why it is a bad idea to hand-write JSON code (or indeed other text-based syntaxes like XML). You should always use an encoder to create your JSON strings from within the language you're working with. This will avoid you ever having to deal with issues like this: if you use an encoder, your JSON will always be valid, and you don't need to worry about it.
Hope that helps.
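As an illustration of the encoder advice, here is a minimal Python sketch that builds the payload from the question as native data and lets the standard json module produce the text. The question's own environment is different (it posts to a PHP server), so this only demonstrates the principle; the variable names are made up.

```python
import json

# Build the payload as native data structures, not as hand-written text.
payload = {
    "function": "contacts",
    "parameters": {
        "emails": ["John#mac.com", "anna#gmail.com", "hank#mac.com"],
        "user_id": "90",
    },
    "token": "",
}

# The encoder takes care of brackets, quoting, and escaping.
text = json.dumps(payload, indent=2)
print(text)

# Round-tripping shows that the generated text is valid JSON.
round_tripped = json.loads(text)
print(round_tripped == payload)
```

Because the list of addresses is a real list in the program, the encoder emits square brackets for it and the parenthesis problem cannot occur.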
Your array appears to be delimited with "(...)" instead of [..]. Among other things, this makes the strings inside it "inside out". If you meant for it to be a quoted string that just resembles an array, you'll need to escape the quotes inside it, like foo \"and\" bar. The flow-chart at http://www.json.org/ is really quite useful, as it is a very small spec.
I am guessing you want:
{
"function":"contacts",
"parameters": {
"emails": [
"John#mac.com",
"anna#gmail.com",
"hank#mac.com"
],
"user_id": "90"
},
"token":""
}