Scala Parse Json without knowing schema - json

I was wondering if there is a parser or an easy way to iterate through a json object without knowing the keys/schema of the json ahead of time in scala. I took a look at a few libraries like json4s, but it seems to still require knowing the schema ahead of time before extracting the fields. I just want to iterate over each field, extract the fields and print out their values something like:
json.foreachkey(key -> println(key +":" + json.get(key))

In Play Json you'll initially parse your json into a JsValue; you can then pattern-match this to determine if it is a JsObject (note that you can find the fields of this using fields or value), a JsArray (again, note the value), or a primitive such as JsString or JsNull
def parse(jsVal: JsValue) {
jsVal match {
case json: JsObject =>
case json: JsArray =>
case json: JsString =>
...
}
}

If by json you mean any JValue, then json4s seems to have this functionality out of the box:
scala> import org.json4s.JsonDSL._
import org.json4s.JsonDSL._
scala> import org.json4s.native.JsonMethods._
import org.json4s.native.JsonMethods._
scala> val json = parse(""" { "numbers" : [1, 2, 3, 4] } """)
json: org.json4s.JValue = JObject(List((numbers,JArray(List(JInt(1), JInt(2), JInt(3), JInt(4))))))
scala> compact(render(json))
res1: String = {"numbers":[1,2,3,4]}

Use liftweb, it allows you to parse the json first into JValue - then extract native scala objects from it, no matter the schema:
val jsonString = """{"menu": {
| "id": "file",
| "value": "File",
| "popup": {
| "menuitem": [
| {"value": "New", "onclick": "CreateNewDoc()"},
| {"value": "Open", "onclick": "OpenDoc()"},
| {"value": "Close", "onclick": "CloseDoc()"}
| ]
| }
|}}""".stripMargin
val jVal: JValue = parse(jsonString)
jVal.values
>>> Map(menu -> Map(id -> file, value -> File, popup -> Map(menuitem -> List(Map(value -> New, onclick -> CreateNewDoc()), Map(value -> Open, onclick -> OpenDoc()), Map(value -> Close, onclick -> CloseDoc())))))

Related

Convert Json to a Map[String, String]

I have input json like
{"a": "x", "b": "y", "c": "z", .... }
I want to convert this json to a Map like Map[String, String]
so basically a map of key value pairs.
How can I do this using circe?
Note that I don't know what keys "a", "b", "c" will be present in Json. All I know is that they will always be strings and never any other data type.
I looked at Custom Decoders here https://circe.github.io/circe/codecs/custom-codecs.html but they work only when you know the tag names.
I found an example to do this in Jackson. but not in circe
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.databind.ObjectMapper
val data = """
{"a": "x", "b", "y", "c": "z"}
"""
val mapper = new ObjectMapper
mapper.registerModule(DefaultScalaModule)
mapper.readValue(data, classOf[Map[String, String]])
While the solutions in the other answer work, they're much more verbose than necessary. Off-the-shelf circe provides an implicit Decoder[Map[String, String]] instance, so you can just write the following:
scala> val doc = """{"a": "x", "b": "y", "c": "z"}"""
doc: String = {"a": "x", "b": "y", "c": "z"}
scala> io.circe.parser.decode[Map[String, String]](doc)
res0: Either[io.circe.Error,Map[String,String]] = Right(Map(a -> x, b -> y, c -> z))
The Decoder[Map[String, String]] instance is defined in the Decoder companion object, so it's always available—you don't need any imports, other modules, etc. Circe provides instances like this for most standard library types with reasonable instances. If you want to decode a JSON array into a List[String], for example, you don't need to build your own Decoder[List[String]]—you can just use the one in implicit scope that comes from the Decoder companion object.
This isn't just a less verbose way to solve this problem, it's the recommended way to solve it. Manually constructing an explicit decoder instance and converting from Either to Try to compose parsing and decoding operations is both unnecessary and error-prone (if you do need to end up with Try or Option or whatever, it's almost certainly best to do that at the end).
Assuming:
val rawJson: String = """{"a": "x", "b": "y", "c": "z"}"""
This works:
import io.circe.parser._
val result: Try[Map[String, String]] = parse(rawJson).toTry
.flatMap(json => Try(json.asObject.getOrElse(sys.error("Not a JSON Object"))))
.flatMap(jsonObject => Try(jsonObject.toMap.map{case (name, value) => name -> value.asString.getOrElse(sys.error(s"Field '$name' is not a JSON string"))}))
val map: Map[String, String] = result.get
println(map)
Or with using Decoder:
import io.circe.Decoder
val decoder = Decoder.decodeMap(KeyDecoder.decodeKeyString, Decoder.decodeString)
val result = for {
json <- parse(rawJson).toTry
map <- decoder.decodeJson(json).toTry
} yield map
val map = result.get
println(map)
You can test following invalid inputs and see, what exception will be thrown:
val rawJson: String = """xxx{"a": "x", "b": "y", "c": "z"}""" // invalid JSON
val rawJson: String = """[1,2,3]""" // not a JSON object
val rawJson: String = """{"a": 1, "b": "y", "c": "z"}""" // not all values are string

Accessing attributes in objects Scala

An example of the sort of objects I need to grab from my json can be found in the following example(src):
{
"test": {
"attra": "2017-10-12T11:17:52.971Z",
"attrb": "2017-10-12T11:20:58.374Z"
},
"dummyCheck": false,
"type": "object",
"ruleOne": {
"default": 2557
},
"ruleTwo": {
"default": 2557
}
}
From the example above I want to access the default value under "ruleOne".
I've tried messing about with several different things below but I seem to be struggling. I can grab values like "dummyCheck" ok. What's the best way to key into where I need to go?
Example of how I am trying to get the value below:
import org.json4s._
import org.json4s.native.JsonMethods._
import org.json4s.DefaultFormats
implicit val formats = DefaultFormats
val test = parse(src)
println((test \ "ruleOne.default").extract[Integer])
Edit:
To further extend what is above:
def extractData(data: java.io.File) = {
val json = parse(data)
val result = (json \ "ruleOne" \ "default").extract[Int]
result
}
If I was to extend the above into a function that is called by passing in:
extractData(src)
That would only ever give me RuleOne.default.. is there a way I could extend it so that I could dynamically pass it multiple string arguments to parse (like a splat)
def extractData(data: java.io.File, path: String*) = {
val json = parse(data)
val result = (json \ path: _*).extract[Int]
result
}
so consuming it would be like
extractData(src, "ruleOne", "default")
This here works with "json4s-jackson" % "3.6.0-M2", but it should work in exactly the same way with native backend.
val src = """
|{
| "test": {
| "attra": "2017-10-12T11:17:52.971Z",
| "attrb": "2017-10-12T11:20:58.374Z"
| },
| "dummyCheck": false,
| "type": "object",
| "ruleOne": {
| "default": 2557
| },
| "ruleTwo": {
| "default": 2557
| }
|}""".stripMargin
import org.json4s._
import org.json4s.jackson.JsonMethods._
import org.json4s.DefaultFormats
implicit val formats = DefaultFormats
val test = parse(src)
println((test \ "ruleOne" \ "default").extract[Int])
Output:
2557
To make it work with native, simply replace
import org.json4s.jackson.JsonMethods._
by
import org.json4s.native.JsonMethods._
and make sure that you have the right dependencies.
EDIT
Here is a vararg method that transforms string parameters into a path:
def extract(json: JValue, path: String*): Int = {
path.foldLeft(json)(_ \ _).extract[Int]
}
With this, you can now do:
println(extract(test, "ruleOne", "default"))
println(extract(test, "ruleTwo", "default"))
Note that it accepts a JValue, not a File, because the version with File would be unnecessarily painful to test, whereas JValue-version can be tested with parsed string constants.

Turn string into simple JSON in scala

I have a string in scala which in terms of formatting, it is a json, for example
{"name":"John", "surname":"Doe"}
But when I generate this value it is initally a string. I need to convert this string into a json but I cannot change the output of the source. So how can I do this conversion in Scala? (I cannot use the Play Json library.)
If you have strings as
{"name":"John", "surname":"Doe"}
and if you want to save to elastic as mentioned here then you should use parseRaw instead of parseFull.
parseRaw will return you JSONType and parseFull will return you map
You can do as following
import scala.util.parsing.json._
val jsonString = "{\"name\":\"John\", \"surname\":\"Doe\"}"
val parsed = JSON.parseRaw(jsonString).get.toString()
And then use the jsonToEs api as
sc.makeRDD(Seq(parsed)).saveJsonToEs("spark/json-trips")
Edited
As #Aivean pointed out, when you already have json string from source, you won't be needing to convert to json, you can just do
if jsonString is {"name":"John", "surname":"Doe"}
sc.makeRDD(Seq(jsonString)).saveJsonToEs("spark/json-trips")
You can use scala.util.parsing.json to convert JSON in string format to JSON (which is basically HashMap datastructure),
eg.
scala> import scala.util.parsing.json._
import scala.util.parsing.json._
scala> val json = JSON.parseFull("""{"name":"John", "surname":"Doe"}""")
json: Option[Any] = Some(Map(name -> John, surname -> Doe))
To navigate the json format,
scala> json match { case Some(jsonMap : Map[String, Any]) => println(jsonMap("name")) case _ => println("json is empty") }
John
nested json example,
scala> val userJsonString = """{"name":"John", "address": { "perm" : "abc", "temp" : "zyx" }}"""
userJsonString: String = {"name":"John", "address": { "perm" : "abc", "temp" : "zyx" }}
scala> val json = JSON.parseFull(userJsonString)
json: Option[Any] = Some(Map(name -> John, address -> Map(perm -> abc, temp -> zyx)))
scala> json.map(_.asInstanceOf[Map[String, Any]]("address")).map(_.asInstanceOf[Map[String, String]]("perm"))
res7: Option[String] = Some(abc)

Serializing a JSON string as JSON in Scala/Play

I have a String with some arbitrary JSON in it. I want to construct a JsObject with my JSON string as a JSON object value, not a string value. For example, assuming my arbitrary string is a boring {} I want {"key": {}} and not {"key": "{}"}.
Here's how I'm trying to do it.
val myString = "{}"
Json.obj(
"key" -> Json.parse(myString)
)
The error I get is
type mismatch; found :
scala.collection.mutable.Buffer[scala.collection.immutable.Map[String,java.io.Serializable]]
required: play.api.libs.json.Json.JsValueWrapper
I'm not sure what to do about that.
"{}" is an empty object.
So, to get {"key": {}} :
Json.obj("key" -> Json.obj())
Update:
Perhaps you have an old version of Play. This works under Play 2.3.x:
scala> import play.api.libs.json._
scala> Json.obj("foo" -> Json.parse("{}"))
res2: play.api.libs.json.JsObject = {"foo":{}}

Scala play json api: can't implicitly convert JsArray to JsValueWrapper

I have json rest api application based on play framework and want to receive information about validation errors when I parse incoming requests. Everything is working fine except conversion from json array to json value.
Json structure I want to achieve:
{
"errors": {
"name": ["invalid", "tooshort"],
"email": ["invalid"]
}
}
When I tried to implement a sample it worked perfectly:
def error = Action {
BadRequest(obj(
"errors" -> obj(
"name" -> arr("invalid", "tooshort"),
"email" -> arr("invalid")
)
))
}
When I tried to extract the changing part like this:
def error = Action {
val e = Seq("name" -> arr("invalid", "tooshort"), "email" -> arr("invalid"))
// in real app it will be Seq[(JsPath, Seq[ValidationError])]
BadRequest(obj(
"errors" -> obj(e: _*)
))
}
I got a compiler error:
type mismatch; found : Seq[(String, play.api.libs.json.JsArray)] required: Seq[(String, play.api.libs.json.Json.JsValueWrapper)]
Maybe there is some implicit conversion I'm missing from JsArray to JsValueWrapper? But then, why does the sample works fine in the same file with the same imports?
Play 2.1.1, Scala 2.10.0
UPDATE:
Problem resolved thanks to Julien Lafont, the final code:
implicit val errorsWrites = new Writes[Seq[(JsPath, Seq[ValidationError])]] {
/**
* Maps validation result of Ember.js json request to json validation object, which Ember can understand and parse as DS.Model 'errors' field.
*
* #param errors errors collected after json validation
* #return json in following format:
*
* {
* "errors": {
* "error1": ["message1", "message2", ...],
* "error2": ["message3", "message4", ...],
* ...
* }
* }
*/
def writes(errors: Seq[(JsPath, Seq[ValidationError])]) = {
val mappedErrors = errors.map {
e =>
val fieldName = e._1.toString().split("/").last // take only last part of the path, which contains real field name
val messages = e._2.map(_.message)
fieldName -> messages
}
obj("errors" -> toJson(mappedErrors.toMap)) // Ember requires root "errors" object
}
}
Usage:
def create = Action(parse.json) { // method in play controller
request =>
fromJson(request.body) match {
case JsSuccess(pet, path) => Ok(obj("pet" -> Pets.create(pet)))
case JsError(errors) => UnprocessableEntity(toJson(errors)) // Ember.js expects 422 error code in case of errors
}
}
You can simply define your errors in a Map[String, Seq[String]], and transform it in Json with Json.toJson (there are built-in writers for Map[String,Y] and Seq[X])
scala> val e = Map("name" -> Seq("invalid", "tooshort"), "email" -> Seq("invalid"))
e: scala.collection.immutable.Map[String,Seq[String]] = Map(name -> List(invalid, tooshort), email -> List(invalid))
scala> Json.toJson(e)
res0: play.api.libs.json.JsValue = {"name":["invalid","tooshort"],"email":["invalid"]}
scala> Json.obj("errors" -> Json.toJson(e))
res1: play.api.libs.json.JsObject = {"errors":{"name":["invalid","tooshort"],"email":["invalid"]}}
The reason the long handed version works is because scala's automatic type inference triggers an implicit conversion toJsFieldJsValueWrapper.
For example
scala> import play.api.libs.json._
import play.api.libs.functional.syntax._
import play.api.libs.json._
scala> import play.api.libs.functional.syntax._
import play.api.libs.functional.syntax._
scala> import Json._
import Json._
scala> arr("invalid", "tooshort")
res0: play.api.libs.json.JsArray = ["invalid","tooshort"]
scala> obj("name" -> arr("invalid", "tooshort"))
res1: play.api.libs.json.JsObject = {"name":["invalid","tooshort"]}
scala> obj _
res3: Seq[(String, play.api.libs.json.Json.JsValueWrapper)] => play.api.libs.json.JsObject = <function1>
Notice that arr returns a JsArray, however obj requires a JsValueWrapper. Scala is able to convert the JsArray to JsValueWrapper when it constructs the arguments for obj. However it is not able to convert a Seq[(String, JsArray)] to a `Seq[(String, JsValueWrapper)].
If you provide the expected type when of the sequence in advance, the Scala compiler's type inferencer will perform the conversion as it creates the sequence.
scala> Seq[(String, JsValueWrapper)]("name" -> arr("invalid", "tooshort"), "email" -> arr("invalid"))
res4: Seq[(String, play.api.libs.json.Json.JsValueWrapper)] = List((name,JsValueWrapperImpl(["invalid","tooshort"])), (email,JsValueWrapperImpl(["invalid"])))
However once the sequence is created it cannot be converted unless there is an implicit conversion in scope that can convert sequences.
The final code snippet looks like:
def error = Action {
val e = Seq[(String, JsValueWrapper)]("name" -> arr("invalid", "tooshort"), "email" -> arr("invalid"))
// in real app it will be Seq[(JsPath, Seq[ValidationError])]
BadRequest(obj(
"errors" -> obj(e: _*)
))
}