Why does JSON.parse choke on encoded characters in nodejs?

I'm attempting to look up the word "flower" in Google's dictionary semi-api. Source:
https://gist.github.com/DelvarWorld/0a83a42abbc1297a6687
Long story short, I'm calling JSONP with a callback parameter, then regexing the JSON out.
But it hits this snag:
undefined:1
ple","terms":[{"type":"text","text":"I stopped to buy Bridget some \x3cem\x3ef
^
SyntaxError: Unexpected token x
at Object.parse (native)
Google is serving me escaped HTML characters, which is fine, but JSON.parse cannot handle them?? What's weirding me out is this works just fine:
$ node
> JSON.parse( '{"a":"\x3cem"}' )
{ a: '<em' }
I don't get why my thingle is crashing
Edit: These are all nice informational responses, but none of them help me get rid of the stack trace.

\xHH is not part of JSON, but is part of JavaScript. It is equivalent to \u00HH. Since the built-in JSON doesn't seem to support it and I doubt you'd want to go through the trouble of modifying a non-built-in JSON implementation, you might just want to run the code in a sandbox and collect the resulting object.

According to http://json.org, a character in a JSON string may be:
any-Unicode-character-
except-"-or-\-or-
control-character
\"
\\
\/
\b
\f
\n
\r
\t
\u four-hex-digits
So according to that list, the "JSON" you are getting is malformed at \x3c.

The reason it works in the REPL is that these two calls are equivalent:
JSON.parse( '{"a":"\x3cem"}' )
and
JSON.parse( '{"a":"<em"}' )
Your string is passed to JSON.parse already decoded: since it is a JavaScript string literal, \x3cem is actually <em by the time JSON.parse sees it.
Now, \xHH is valid in JavaScript but not in JSON. According to http://json.org/ the only characters you can have after a \ are "\/bfnrtu.
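The difference between the two situations can be reproduced in Node. This is a small sketch: String.raw stands in for text that arrives with its backslashes intact (as it would from an HTTP response), and tryParse is a helper name invented here.

```javascript
// Report whether JSON.parse accepts the text, without letting it throw.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// A normal literal: JavaScript decodes \x3c to "<" before JSON.parse runs.
const fromLiteral = '{"a":"\x3cem"}';
// String.raw keeps the backslash, like text read from a network response.
const fromNetwork = String.raw`{"a":"\x3cem"}`;

console.log(fromLiteral.includes('\\')); // false
console.log(tryParse(fromLiteral).ok); // true
console.log(tryParse(fromNetwork).ok); // false
```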

The answer is correct, but needs a couple of modifications. You might want to try this one: https://gist.github.com/Selmanh/6973863

Related

github api "message":"Problems parsing JSON" for large Base64 string

I'm trying to learn to use the GitHub API from Java.
I've created a simple program that can read and commit new versions of a file.
I have tested this on many short text files, and I think I'm using the MIME Base64 encoder correctly.
I'm now trying to upload a larger file, on the order of 5 MB.
This means sending a JSON body that looks like this:
{
"owner": "example42gdrive",
"repo": "Example1",
"message": "FileSystem 42 module on github",
"content": "rO0ABXNyABdpcy5MNDIuZ2V ...5MB of JS string here... ABGluZm90AB5MfqIDcQB+AAU=",
"sha":"a7ef93d3eb50383028578cb916b70060067d9c8a"
}
And I get back as a response
400
{"message":"Problems parsing JSON","documentation_url":"https://docs.github.com/rest/reference/repos#create-or-update-file-contents"}
Notes:
The exact same code works for smaller content.
Java's Base64.getMimeEncoder() inserts some \n in the result to split it into lines. I'm removing those newlines in order to get a valid JS string.
Does anyone know what I'm doing wrong or what I should do instead?
EDIT: after some experimentation, the problem seems to be in the \n:
if I produce a Base64 string short enough that Java's Base64.getMimeEncoder() does not insert any \n, all is fine. Of course, a string with a \n cannot be 'stringified' by simply adding (") before and after, so I tried:
removing the \n, no effect -> Problems parsing JSON
replacing each newline with the two-character escape \n (so that the parser will see it as a \n inside a string) -> Problems parsing JSON
replacing each newline with \\n (so that the parser will see it as \n; this may help if there are somehow two levels of escaping server side) -> Problems parsing JSON
replacing each newline with a space -> Problems parsing JSON
Wikipedia (https://en.wikipedia.org/wiki/Base64) clearly states that
"newlines and white spaces may be present anywhere but are to be ignored
on decoding".
I'm starting to think that there is something I do not understand, and that it is so obvious that the GitHub API documentation does not mention it.
OK, I did it.
I found the answer indirectly by reading
"Java 8 Base64 Encode (Basic) doesn't add new line anymore. How can I reimplement this?"
Basically, just looking at the screen I believed the 'newline' was just "\n"; instead it was "\r\n". Thus, I was replacing only the "\n" and leaving the "\r" in place.
Replacing "\r\n" with "" works.
However, replacing "\r\n" with "\r\n" does not. This suggests a bug in the GitHub decoder (if Wikipedia is right and newlines must be allowed).
I hate this! I hate that we have more than one way to express 'new line' and that in most contexts they are rendered the same graphically!!!
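The "\r\n" pitfall described above can be reproduced outside Java. The sketch below uses Node.js rather than the asker's Java code, purely as an illustration: toMimeLines is a made-up helper that imitates MIME-style line wrapping (76 characters per line, joined by "\r\n"), and the payload is dummy data.

```javascript
// Hypothetical helper: wrap Base64 text in 76-character lines joined by CRLF,
// imitating what a MIME encoder emits.
function toMimeLines(base64, width = 76) {
  return base64.match(new RegExp(`.{1,${width}}`, 'g')).join('\r\n');
}

const plainBase64 = Buffer.alloc(200, 7).toString('base64');
const mimeBase64 = toMimeLines(plainBase64);

// Removing only "\n" leaves every "\r" behind.
const lfOnly = mimeBase64.replace(/\n/g, '');
console.log(lfOnly.includes('\r')); // true

// Removing all whitespace restores the single-line form.
const cleaned = mimeBase64.replace(/\s+/g, '');
console.log(cleaned === plainBase64); // true

// JSON.stringify escapes the value rather than relying on manual quoting.
const body = JSON.stringify({ message: 'example commit', content: cleaned });
console.log(JSON.parse(body).content === plainBase64); // true
```

Stripping with a whitespace class instead of a specific character avoids having to know which line terminator the encoder used, and building the body with a JSON serializer avoids hand-quoting a string that may contain control characters.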

Escaping symbols in Gatling jsonpath

We're using Gatling jsonpath in Scala to parse our JSON, and we are using it as described in the docs:
val jsonSample = (new ObjectMapper).readValue("""{"#a":"A","#b":"B"}""", classOf[Object])
JsonPath.query("$.#a", jsonSample).right.map(_.toVector)
However, this code fails, and we get an error message about "string matching regex '[$_\d... etc etc }]* expected, but # found".
I've tried using backslashes, but these do not work and give the same error message. Does anyone know how to escape the # symbol?
It's worth noting I also tried the solution with hex on this page, but it doesn't work for the above. How do you escape the # symbol in jsonpath?
Thanks!
Turns out using a different syntax fixes this:
JsonPath.query("$['#a']", jsonSample).right.map(_.toVector)

JSON get parsed in browser but not by node.js

I'm about to write some tests for my client UI.
The weird thing is that my JSON string:
{"match":"\s?5\.7\s?\<=\>\s?7","success":"null-coalesce-operator"}
is parsed by JSON.parse in the browser (Chrome), and the result looks like this:
{
match: "\s?5\.7\s?\<=\>\s?7",
success:"null-coalesce-operator"
}
Everything is fine there,
but when I run that part with mocha in a node.js environment, I get:
{"match":"\s?5\.7\s?\<=\>\s?7","success":"null-coalesce-operator"}
^
SyntaxError: Unexpected token s
at Object.parse (native)
...
Has anyone experienced something like this? Thanks for any tips.
node version is v5.7.1
mocha version is 2.4.5
UPDATE: the HTML string that I test is:
<!doctype html><html><body><div data-meta="{"match":"\\s?5\\.7\\s?\\<=\\>\\s?7","success":"null-coalesce-operator"}"></div></body></html>
It is just a single-line string without any \n newlines or the like.
I think it is because Node also parses special characters (e.g. \n => line feed, \r => carriage return, etc.), which Chrome did not. So, because you want a backslash in your regex, before parsing in Node you need to replace each \ by \\:
json_string = json_string.replace(/\\/g, '\\\\') // use a global regex: replace with a plain string pattern only replaces the first occurrence
Otherwise, when parsing, it will say at \s: "This is a special character, identified by s. But I don't have any token s, so I throw an error."
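Applied to the string from the question, the replacement can be sketched as follows. String.raw is used here only to reproduce the text with its single backslashes. Note that blanket doubling also doubles the backslashes of legitimate JSON escapes such as \n or \", so this only suits input in which every backslash is meant literally.

```javascript
// String.raw keeps each backslash, reproducing the attribute text as received.
const rawMeta = String.raw`{"match":"\s?5\.7\s?\<=\>\s?7","success":"null-coalesce-operator"}`;

// Double every backslash so JSON.parse reads each one as a literal backslash.
const escapedMeta = rawMeta.replace(/\\/g, '\\\\');
const meta = JSON.parse(escapedMeta);

console.log(meta.match); // \s?5\.7\s?\<=\>\s?7
console.log(new RegExp(meta.match).test('5.7 <=> 7')); // true
```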

How does parsing special characters in JSON work?

Alright, I know I'm probably going to get yelled at for asking such a 'simple' question (seems to be the trend around here) but check it out…
I am building a JSON parser and got everything working correctly except the parser's ability to deal with special characters. I am trying to implement the same special characters that are listed on http://www.json.org/, namely [", \, /, b, f, n, r, t, u].
When I run these special characters through the built-in JSON.parse method, though, most of them either return an error or don't do anything at all:
JSON.parse('["\\"]')
SyntaxError: Unexpected end of input
JSON.parse("['\"']")
SyntaxError: Unexpected token '
JSON.parse('["\b"]')
SyntaxError: Unexpected token
JSON.parse('["\f"]')
SyntaxError: Unexpected token
Yes, I see the other post "Parsing JSON with special characters", it has nothing to do with my question. Don't refer me to another question, I have seen them all. How does parsing special characters in JSON work?
JSON.parse expects JavaScript strings.
JavaScript string literals use backslashes for escaping.
JSON uses backslashes for escaping.
So...
JSON.parse('["\\\\"]')
// ["\"]
JSON.parse("['\"']")
// SyntaxError: Unexpected token '
// JSON doesn't have single quotes as string delimiters!
JSON.parse('["\\b"]')
// [""]
JSON.parse('["\\f"]')
// [""]
JSON.parse('["\\\\b"]')
// ["\b"]
JSON.parse('["\\\\f"]')
// ["\f"]
The reason you've got an issue here is that \ is also the marker for special characters within JavaScript string literals.
Take your first example: '["\\"]'. As JavaScript parses this literal, \\ is unescaped to a single \ in your string, so the value passed to the JSON.parse method is actually ["\"], hence the "unexpected end of input" error.
Essentially, you need to cater for the JavaScript parser by doubling up the backslash escape sequences. In this case, to pass your intended ["\\"] value to JSON.parse, you need to write JSON.parse('["\\\\"]') in JavaScript, as that will pass the string ["\\"] into the JSON.parse method.
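The two layers of escaping can be checked directly. In this small sketch, String.raw is used as a way to write the JSON text without the JavaScript layer, and JSON.stringify shows the JSON text that corresponds to a given value; the variable names are illustrative only.

```javascript
// The literal below holds the six characters ["\\"], which is valid JSON.
const jsonBackslash = '["\\\\"]';
const oneBackslash = JSON.parse(jsonBackslash)[0];
console.log(oneBackslash.length); // 1

// String.raw avoids the first (JavaScript) layer of escaping entirely.
const sameJson = String.raw`["\\"]`;
console.log(sameJson === jsonBackslash); // true

// JSON.stringify shows the JSON text for a given value.
console.log(JSON.stringify(['\\'])); // ["\\"]

// \\b in the literal becomes \b in the JSON text, which decodes to backspace.
console.log(JSON.parse('["\\b"]')[0] === '\b'); // true
```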

Parse error creating Erlang JSON string

I'm having trouble properly escaping a string I'm trying to use to represent JSON in Erlang. I'm not sure why this particular sequence is giving the parser trouble. I have this string in a Basho Bench configuration file.
'{
"stats":"completed",
"times":[
{
"time":"2014-10-29T23:40:46.558Z"
}
]
}'
I am getting this error:
23:37:18.521 [error] Failed to parse config file server/http.config.erl: {29,erl_scan,{illegal,atom}}
It seems like maybe the issue is the numbers in the string but I don't get how I would escape them. Any thoughts?
You provided insufficient info, but anyway, server/http.config.erl is not JSON. It is an Erlang term, so this error comes from the Erlang parser. The whole text you provided is parsed as an atom because ' is the delimiter for atoms.
The string is not a string. Single quotes denote an atom. It must be wrapped in double quotes (with the inner double quotes escaped) to be interpreted as a string.