I have the following string:
"{\"headers\":[\"CNPJ\",\"PDF\",\"error\"],\"rows\":[[\"17192451000170\",\"FILE:application/pdf;170286;\",null],[\"234566767544\",\"FILE:application/pdf;456378;\",null],[\"233456767544\",\"FILE:application/pdf;456378;\",null]]}"
How do I parse it into a normal JSON format?
meaning:
{"rows" :[
{"CNPJ":"17192451000170","PDF":"FILE:application/pdf;170286;","error":null},
{"CNPJ":"17192451000170","PDF":"FILE:application/pdf;170286;","error":null},
{"CNPJ":"17192451000170", "PDF":"FILE:application/pdf;170286;,"error":null"}
]}
or any other json format
This is already valid JSON.
If you just want to strip the \ escapes (they only appear in the printed representation), you can simply do:
(hbd#crayon2.yoonka.com)31> JsonOrg = <<"{\"headers\":[\"CNPJ\",\"PDF\",\"error\"],\"rows\":[[\"17192451000170\",\"FILE:application/pdf;170286;\",null],[\"234566767544\",\"FILE:application/pdf;456378;\",null],[\"233456767544\",\"FILE:application/pdf;456378;\",null]]}">>.
<<"{\"headers\":[\"CNPJ\",\"PDF\",\"error\"],\"rows\":[[\"17192451000170\",\"FILE:application/pdf;170286;\",null],[\"234566767544\",\"FI"...>>
(hbd#crayon2.yoonka.com)32> io:format("~s~n", [binary_to_list(JsonOrg)]).
{"headers":["CNPJ","PDF","error"],"rows":[["17192451000170","FILE:application/pdf;170286;",null],["234566767544","FILE:application/pdf;456378;",null],["233456767544","FILE:application/pdf;456378;",null]]}
ok
You can also parse back and forth between Json and Erlang. I tested that with the yajler decoder:
(hbd#crayon2.yoonka.com)43> {ok, Parsed} = yajler:decode(<<"{\"headers\":[\"CNPJ\",\"PDF\",\"error\"],\"rows\":[[\"17192451000170\",\"FILE:application/pdf;170286;\",null],[\"234566767544\",\"FILE:application/pdf;456378;\",null],[\"233456767544\",\"FILE:application/pdf;456378;\",null]]}">>).
{ok,[{<<"headers">>,[<<"CNPJ">>,<<"PDF">>,<<"error">>]},
{<<"rows">>,
[[<<"17192451000170">>,<<"FILE:application/pdf;170286;">>,
undefined],
[<<"234566767544">>,<<"FILE:application/pdf;456378;">>,
undefined],
[<<"233456767544">>,<<"FILE:application/pdf;456378;">>,
undefined]]}]}
(hbd#crayon2.yoonka.com)44> Json = binary:list_to_bin(yajler:encode(Parsed)).
<<"{\"headers\":[\"CNPJ\",\"PDF\",\"error\"],\"rows\":[[\"17192451000170\",\"FILE:application/pdf;170286;\",\"undefined\"],[\"2345667675"...>>
Yajler is an Erlang NIF, so it uses a C library (in this case called yajl) to do the actual parsing, but I imagine you would get a similar result from other Erlang applications that can parse JSON.
I want to get a nested field in a json string using JSONPath.
Take for example the following json:
{
"ID": "2ac464eb-352f-4e36-8b9f-950a24bb9586",
"PAYLOAD": "{\"#type\":\"Event\",\"id\":\"baf223c4-4264-415a-8de5-61c9c709c0d2\"}"
}
If I want to extract the #type field, I expect to do it like this:
$.PAYLOAD.#type
But that doesn't seem to work.
I also tried this:
$.PAYLOAD['#type']
Do I need to use escape chars or something?
Posting my comment as an answer:
"{\"#type\":\"Event\",\"id\":\"baf223c4-4264-415a-8de5-61c9c709c0d2\"}"
isn't JSON; it's a string containing encoded JSON.
Since JSONPath can't decode such a string, you'll have to decode it first with the language of your choice.
Eg: Decoding JSON String in Java
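For example, in Scala the two-step decode could look like this with spray-json (a minimal sketch; spray-json is just one possible library, and the values are copied from the example above):
import spray.json._
import DefaultJsonProtocol._

val raw =
  """{"ID":"2ac464eb-352f-4e36-8b9f-950a24bb9586","PAYLOAD":"{\"#type\":\"Event\",\"id\":\"baf223c4-4264-415a-8de5-61c9c709c0d2\"}"}"""
// Step 1: parse the outer document; PAYLOAD comes back as a plain String
val payload = raw.parseJson.asJsObject.fields("PAYLOAD").convertTo[String]
// Step 2: parse that String again, because it is itself encoded JSON
val eventType = payload.parseJson.asJsObject.fields("#type").convertTo[String] // "Event"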
I'm having trouble with JSON conversion in PySpark when working with complex nested-struct columns. The schema argument for from_json doesn't seem to behave. Example:
import pyspark.sql.functions as f
df = spark.createDataFrame([[1,'a'],[2,'b'],[3,'c']], ['rownum','rowchar'])\
.withColumn('struct', f.expr("transform(array(1,2,3), i -> named_struct('a1',rownum*i,'a2',rownum*i*2))"))
df.display()
df.withColumn('struct',f.to_json('struct')).withColumn('struct',f.from_json('struct',df.schema['struct'])).display()
df.withColumn('struct',f.to_json('struct')).withColumn('struct',f.from_json('struct',df.select('struct').schema)).display()
fails with
Cannot parse the schema in JSON format: Failed to convert the JSON string (big JSON string) to a data type
Not sure if this is a syntax error on my end, an edge case that's failing, the wrong way to do things, or something else.
You're not passing the correct schema to from_json: df.schema["struct"] is a StructField, while from_json expects the column's DataType, so take its .dataType. Try this instead:
df.withColumn('struct', f.to_json('struct')) \
.withColumn('struct', f.from_json('struct', df.schema["struct"].dataType)) \
.display()
I am using akka.http.scaladsl.model.HttpResponse and HttpEntity.
After getting the response, its entity is a ResponseEntity of the form (Content-Type: 'application/json', {MyJSONHERE}). Is there a way I can extract my JSON from the entity?
I tried entity.getDataBytes, which gives the content of the entity as ByteStrings. I want to properly read the JSON and parse it. Can someone guide me on this?
The code below works for me (JsonMethods here is presumably json4s; you also need an implicit Materializer and ExecutionContext in scope):
import akka.stream.scaladsl.Sink
import akka.util.ByteString
import org.json4s.jackson.JsonMethods // or org.json4s.native.JsonMethods

entity.dataBytes.runWith(Sink.fold(ByteString.empty)(_ ++ _)).map(_.utf8String) map { result =>
  JsonMethods.parse(result)
}
dataBytes returns a Source[ByteString, Any]; Sink.fold combines all chunks of the stream into one ByteString, and utf8String converts that ByteString into an ordinary String.
Here are some useful docs about HttpEntity.
Can you try the code below?
entity.toStrict(3.seconds).map(_.data.utf8String)
That returns a Future[String] with the String representation of the JSON (toStrict needs an implicit Materializer, and 3.seconds needs import scala.concurrent.duration._).
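If you are on akka-http anyway, another option (a sketch, assuming an implicit Materializer and ExecutionContext are in scope, and json4s for the parsing step) is to unmarshal the entity to a String first and then parse that:
import akka.http.scaladsl.unmarshalling.Unmarshal
import org.json4s.jackson.JsonMethods

// Unmarshal drains the entity's data bytes and yields a Future[String],
// which is then parsed as JSON
Unmarshal(entity).to[String].map(JsonMethods.parse(_))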
So I have some json that looks like this, which I got after taking it out of some other json by doing response.body.to_json:
{\n \"access_token\": \"<some_access_token>\",\n \"token_type\": \"Bearer\",\n \"expires_in\": 3600,\n \"id_token\": \<some_token>\"\n}\n"
I want to pull out the access_token, so I do
to_return = {token: responseJson[:access_token]}
but this gives me a
TypeError: no implicit conversion of Symbol into Integer
Why? How do I get my access token out? Why are there random backslashes everywhere?
to_json doesn't parse JSON - it does the complete opposite: it turns a Ruby object into a string containing the JSON representation of that object.
It's not clear from your question what response.body is. It could be a string, or, depending on your HTTP library, it might have already been parsed for you.
If the latter, then
response.body["access_token"]
will be your token; if the former, then try
JSON.parse(response.body)["access_token"]
Use double quotes (a string key) when looking up access_token, like below:
to_return = {token: responseJson["access_token"]}
The backslashes are just escaped quote delimiters; also make sure you parse the JSON first.
What is the fastest way to convert this
{"a":"ab","b":"cd","c":"cd","d":"de","e":"ef","f":"fg"}
into a mutable Map in Scala? I read this input string from a ~500 MB file, which is why I'm concerned about speed.
If your JSON is as simple as in your example, i.e. a sequence of key/value pairs where each value is a string, you can do it in plain Scala:
myString.substring(1, myString.length - 1)
.split(",")
.map(_.split(":"))
.map { case Array(k, v) => (k.substring(1, k.length-1), v.substring(1, v.length-1))}
.toMap
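Note that .toMap builds an immutable Map; since the question asks for a mutable one, you can collect the pairs into scala.collection.mutable.Map instead (a small sketch; pairs here stands for the Array[(String, String)] produced by the last map step above, i.e. the snippet without the final .toMap):
import scala.collection.mutable

// copy the parsed key/value pairs into a mutable Map instead of calling .toMap
val mutableMap: mutable.Map[String, String] = mutable.Map(pairs: _*)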
That looks like a JSON file, as Andrey says. You should consider this answer. It gives some example Scala code. Also, this answer gives some different JSON libraries and their relative merits.
The fastest way to read tree data structures in XML or JSON is to use a streaming API, e.g. the Jackson Streaming API To Read And Write JSON.
Streaming splits your input into tokens like 'beginning of an object' or 'beginning of an array', and you then need to build a parser for these tokens, which in some cases is not a trivial task.
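For the flat key/value object in the question, a streaming parse could look roughly like this (a sketch in Scala, assuming jackson-core is on the classpath; streamToMap is just an illustrative name):
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}
import scala.collection.mutable

def streamToMap(json: String): mutable.Map[String, String] = {
  val result = mutable.Map.empty[String, String]
  val parser = new JsonFactory().createParser(json)
  try {
    parser.nextToken() // consume START_OBJECT
    while (parser.nextToken() != JsonToken.END_OBJECT) {
      val key = parser.getCurrentName // currently on a FIELD_NAME token
      parser.nextToken()              // advance to the value token
      result += key -> parser.getText
    }
  } finally parser.close()
  result
}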
Keeping it simple: if you are reading a JSON string from a file and converting it to a Scala Map:
import scala.io.Source
import spray.json._
import DefaultJsonProtocol._
val jsonStr = Source.fromFile(jsonFilePath).mkString
val jsonDoc=jsonStr.parseJson
val map_doc=jsonDoc.convertTo[Map[String, JsValue]]
// Get a Map key value
val key_value=map_doc.get("key").get.convertTo[String]
// If nested json, re-map it.
val key_map=map_doc.get("nested_key").get.convertTo[Map[String, JsValue]]
println("Nested Value " + key_map.get("key").get)