Parsing document fragments with lift-json

I'm trying to parse a JSON document with lift-json when I may not know the exact structure and order of the document that I'm parsing. The document contains lists of "objects", organized into one section per object type, with each section named for that type. I've played around with various ways to loop over the types, pattern-matching on the type name and then trying to get that list of objects out, but it never seems to work properly. I either get an empty list or an error about not being able to find the proper JSON chunk to map to my case classes.
Here's some (almost pseudo) code that is as close as I've come:
case class TypesQueries(queries: Map[String, JValue]);
case class AddressQueries(addresses: List[AddressQuery]);
case class AddressQuery(street: String, city: String, state: String, zip: Int)
case class NameQueries(names: List[NameQuery]);
case class NameQuery(firstName: String, lastName: String);
case class EmailQueries(emails: List[EmailQuery]);
case class EmailQuery(emailAddress: String);
val jsonData = parse("""{
"queries" : {
"addresses" : [
{
"street" : "1234 Main St.",
"city" : "New York",
"state" : "New York",
"zip" : 12345
},
{
"street" : "9876 Broadway Blvd.",
"city" : "Chicago",
"state" : "IL",
"zip" : 23456
}
],
"names": [
{
"firstName" : "John",
"lastName" : "Doe"
}
],
"emails" : [
{
"emailAddress" : "john.doe@gmail.com"
},
{
"emailAddress" : "david.smith@gmail.com"
}
]
}
}""");
// jsonData is already a parsed JValue, so it can be extracted directly.
val typesQuery = jsonData.extract[TypesQueries];
typesQuery.queries.foreach { case(queryType, queryDefinition) =>
queryType match {
case "addresses" =>
// These extract methods do not work.
val addressQueries = queryDefinition.extract[AddressQueries];
case "names" =>
// These extract methods do not work.
val nameQueries = queryDefinition.extract[NameQueries];
case "emails" =>
// These extract methods do not work.
val emailQueries = queryDefinition.extract[EmailQueries];
}
}
"addresses", "names" and "emails" may come in any order inside "queries", and there may be a variable number of them.
In the end, I want to be able to extract lists of objects for the respective types and then, once the parsing is complete, pass the various lists of objects to the appropriate method.
So, the question is: how can I parse into case classes in lift-json if I do not know what the complete document structure will be ahead of time?

You were very close; this works in the REPL:
(Updated)
typesQuery.queries.foreach {
case(queryType, queryDefinition) => queryType match {
case "addresses" => val addressQueries = typesQuery.queries.extract[AddressQueries]; println(addressQueries)
case "names" => val nameQueries = typesQuery.queries.extract[NameQueries]; println(nameQueries)
case "emails" => val emailQueries = typesQuery.queries.extract[EmailQueries]; println(emailQueries)
}
}
The idea is that the foreach "removes" the list that encloses each "object", so we call typesQuery.queries.extract to help the case classes match our parsed JSON.
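As an alternative sketch of the same idea, each section under "queries" can be matched by name and its array extracted straight into a list of the element case class. This assumes lift-json (net.liftweb.json) is on the classpath and that the case classes are defined at the top level rather than inside a method. QueryExtraction and extractSections are hypothetical names used only for illustration.

```scala
import net.liftweb.json._

case class AddressQuery(street: String, city: String, state: String, zip: Int)
case class NameQuery(firstName: String, lastName: String)
case class EmailQuery(emailAddress: String)

// Hypothetical helper object, not part of lift-json.
object QueryExtraction {
  implicit val formats: Formats = DefaultFormats

  // Returns (section name, extracted objects) for every recognised section
  // found under "queries", in whatever order the document lists them.
  // Unrecognised sections are skipped.
  def extractSections(json: JValue): List[(String, List[Any])] =
    (json \ "queries") match {
      case JObject(fields) =>
        fields.collect {
          case JField("addresses", value) => "addresses" -> value.extract[List[AddressQuery]]
          case JField("names", value)     => "names" -> value.extract[List[NameQuery]]
          case JField("emails", value)    => "emails" -> value.extract[List[EmailQuery]]
        }
      case _ => Nil
    }
}
```

Each section value is a JSON array, so the extraction target is List[AddressQuery] rather than the wrapper class AddressQueries, which would expect an object with an "addresses" field.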

Related

How to traverse list of nested maps in scala

I have been given a json string that looks like the following one:
{
"dataflows": [
{
"name": "test",
"sources": [
{
"name": "person_inputs",
"path": "/data/input/events/person/*",
"format": "JSON"
}
],
"transformations": [
{
"name": "validation",
"type": "validate_fields",
"params": {
"input": "person_inputs",
"validations": [
{
"field": "office",
"validations": [
"notEmpty"
]
},
{
"field": "age",
"validations": [
"notNull"
]
}
]
}
},
{
"name": "ok_with_date",
"type": "add_fields",
"params": {
"input": "validation_ok",
"addFields": [
{
"name": "dt",
"function": "current_timestamp"
}
]
}
}
],
"sinks": [
{
"input": "ok_with_date",
"name": "raw-ok",
"paths": [
"/data/output/events/person"
],
"format": "JSON",
"saveMode": "OVERWRITE"
},
{
"input": "validation_ko",
"name": "raw-ko",
"paths": [
"/data/output/discards/person"
],
"format": "JSON",
"saveMode": "OVERWRITE"
}
And I have been asked to use it as some kind of recipe for an ETL pipeline, i.e., the data must be extracted from the "path" specified in the "sources" key, the transformations to be carried out are specified within the "transformations" key and, finally, the transformed data must be saved to one of the two specified "sinks".
I have decided to convert the JSON string into a Scala map, as follows:
val jsonStr = Source.fromFile("path/to/json").mkString
// parse
implicit val formats = org.json4s.DefaultFormats
val parsedJson = parse(jsonStr).extract[Map[String, Any]]
so, with that, I get a structure like this one:
which is a map whose first value is a list of maps. I can evaluate parsedJson("dataflows") to get:
which is a list, as expected, but then I cannot traverse this list, even though I need to in order to get to the sources, transformations and sinks. I have tried using an index on the list to, for example, get its first element, like this: parsedJson("dataflows")(0), but to no avail.
Can anyone please help me traverse this structure? Any help would be much appreciated.
Cheers,
When you evaluate parsedJson("dataflows"), a Tuple2 is returned, i.e. a tuple with two elements that are accessed with ._1 and ._2.
So for dataflows(1)._1 the value returned is "sources", and dataflows(1)._2 is a list of maps (List[Map[K,V]]), which can be traversed like you would normally traverse the elements of a List where each element is a Map.
Let's deconstruct this for example:
val dataFlowsZero = ("sources", List(Map(42 -> "foo"), Map(42 -> "bar")))
The first element in the Tuple
scala> dataFlowsZero._1
String = sources
The second element in the Tuple
scala> dataFlowsZero._2
List[Map[Int, String]] = List(Map(42 -> foo), Map(42 -> bar))
Map the keys in each Map in List to a new List
scala> dataFlowsZero._2.map(m => m.keys)
List[Iterable[Int]] = List(Set(42), Set(42))
Map the values in each Map in the List to a new List
scala> dataFlowsZero._2.map(m => m.values)
List[Iterable[String]] = List(Iterable(foo), Iterable(bar))
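For the untyped Map[String, Any] from the question, every value has the static type Any, so an expression such as parsedJson("dataflows")(0) does not compile. One way around this is to pattern match on the runtime type before indexing or iterating. The sketch below uses a hand-built stand-in for the extracted map so that it has no JSON library dependency. TraverseAny and sourcePaths are hypothetical names.

```scala
// Hypothetical wrapper object used only for this illustration.
object TraverseAny {
  // Stand-in for parse(jsonStr).extract[Map[String, Any]]; only a small
  // part of the document from the question is reproduced here.
  val parsedJson: Map[String, Any] = Map(
    "dataflows" -> List(
      Map(
        "name" -> "test",
        "sources" -> List(
          Map("name" -> "person_inputs", "path" -> "/data/input/events/person/*", "format" -> "JSON")
        )
      )
    )
  )

  // Collect the "path" of every source in every dataflow, checking the
  // runtime type at each level because the static type is only Any.
  val sourcePaths: List[String] = parsedJson.get("dataflows") match {
    case Some(flows: List[_]) =>
      flows.flatMap {
        case flow: Map[_, _] =>
          flow.asInstanceOf[Map[String, Any]].get("sources") match {
            case Some(sources: List[_]) =>
              sources
                .collect { case s: Map[_, _] => s.asInstanceOf[Map[String, Any]] }
                .flatMap(_.get("path"))
                .map(_.toString)
            case _ => Nil
          }
        case _ => Nil
      }
    case _ => Nil
  }
}
```

The casts are unchecked, which is the main drawback of working with Map[String, Any] and the reason a typed extraction is usually easier to live with.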
The best solution is to convert the JSON to the full data structure that you have been provided rather than just Map[String, Any]. This makes it trivial to pick out the data that you want. For example,
val dataFlows = parse(jsonStr).extract[DataFlows]
case class DataFlows(dataflows: List[DataFlow])
case class DataFlow(name: String, sources: List[Source], transformations: List[Transformation], sinks: List[Sink])
case class Source(name: String, path: String, format: String)
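// "params" is a single JSON object, and each transformation carries either "validations" or "addFields".
case class Transformation(name: String, `type`: String, params: Param)
case class Param(input: String, validations: Option[List[Validation]], addFields: Option[List[AddField]])
case class AddField(name: String, function: String)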
case class Validation(field: String, validations: List[String])
case class Sink(input: String, name: String, paths: List[String], format: String, saveMode: String)
The idea is to make the JSON handler do most of the work to create a type-safe version of the original data.

Scala play JSON, lookup and match defined field holding null value

I have the following Json block that I have returned as a JsObject
{
"first_block": [
{
"name": "demo",
"description": "first demo description"
}
],
"second_block": [
{
"name": "second_demo",
"description": "second demo description",
"nested_second": [
{
"name": "bob",
"value": null
},
{
"name": "john",
"value": null
}
]
}
]
}
From this, I want to return a list of all the name and value pairs in the nested array of the second block. So, with the example above, I would get
List([bob,null],[john,null]) or something along those lines.
The issue I am having is getting the value section to understand null values. I've tried to match against it and return the string "null", but I can't get it to match on null values.
What would be the best way for me to return the names and values in the nested_second array?
I've tried using case classes and readAsNullable with no luck, and my latest attempt has gone along these lines:
val secondBlock = (jsObj \ "second_block").as[List[JsValue]]
secondBlock.foreach(nested_block => {
val nestedBlock = (nested_block \ "nested_second").as[List[JsValue]]
nestedBlock.foreach(value => {
val name = (value \ "name").as[String] //always a string
var convertedValue = ""
val replacement_value = value \ "value"
replacement_value match {
case JsDefined(null) => convertedValue = "null"
case _ => convertedValue = replacement_value.as[String]
}
println(name)
println(convertedValue)
})
}
)
It seems convertedValue returns as 'JsDefined(null)' regardless and I'm sure the way I'm doing it is horrifically bad.
Replace JsDefined(null) with JsDefined(JsNull).
You probably got confused because println(JsDefined(JsNull)) prints as JsDefined(null). But that is not how the null value of a JSON field is represented: null is represented by the case object JsNull. This is just good API design, where the possible cases are represented by a hierarchy of classes.
With play-json I always use case classes!
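As a small illustration of matching on that hierarchy, the sketch below distinguishes an explicit JSON null, a string value, any other defined value, and a missing field. It assumes play-json is on the classpath. NullLookup and valueAsString are hypothetical names, and the "<missing>" marker is an arbitrary choice for the example.

```scala
import play.api.libs.json._

// Hypothetical helper object, not part of play-json.
object NullLookup {
  // Render the "value" field of a JSON object as a String.
  def valueAsString(element: JsValue): String =
    (element \ "value") match {
      case JsDefined(JsNull)      => "null"         // field present, explicitly null
      case JsDefined(JsString(s)) => s              // field present, string value
      case JsDefined(other)       => other.toString // any other JSON value
      case _: JsUndefined         => "<missing>"    // field absent
    }
}
```

Calling NullLookup.valueAsString on each element of nested_second would then replace the mutable convertedValue variable from the question.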
I simplified your problem to the essence:
import play.api.libs.json._
val jsonStr = """[
{
"name": "bob",
"value": null
},
{
"name": "john",
"value": "aValue"
},
{
"name": "john",
"value": null
}
]"""
Define a case class
case class Element(name: String, value: Option[String])
Add a formatter in the companion object:
object Element {
implicit val jsonFormat: Format[Element] = Json.format[Element]
}
And use validate:
Json.parse(jsonStr).validate[Seq[Element]] match {
case JsSuccess(elems, _) => println(elems)
case other => println(s"Handle exception $other")
}
This returns: List(Element(bob,None), Element(john,Some(aValue)), Element(john,None))
Now you can do whatever you want with the values.

Filter json properties by name using JSONPath

I'd like to select all elements with a certain match in the name of the property.
For example, all the properties whose name starts with 'pass' from this json:
{
"firstName": "John",
"lastName" : "doe",
"age" : 50,
"password" : "1234",
"phoneNumbers": [
{
"type" : "iPhone",
"number": "0123-4567-8888",
"password": "abcd"
},
{
"type" : "home",
"number": "0123-4567-8910",
"password": "fghi"
}
]
}
The result would be something like this:
[
"1234",
"abcd",
"fghi"
]
I don't want to filter by values, only by property names. Is it possible using JSONPath?
I'm using the SelectTokens(string path) method of Newtonsoft.Json.Linq.
No, JSONPath defines expressions to traverse a JSON document and reach a subset of the JSON. It cannot be used when you don't know the exact property names.
In your case you need the values of properties whose names start with a specific keyword. For that, you need to traverse the whole JSON text and look for property names that start with pass and have a value of string type:
var passwordList = new List<string>();
using (var reader = new JsonTextReader(new StringReader(jsonText)))
{
    while (reader.Read())
    {
        // Look for property names that start with "pass".
        if (reader.TokenType == JsonToken.PropertyName
            && reader.Value.ToString().StartsWith("pass"))
        {
            // Advance to the property's value and keep it only if it is a string.
            if (reader.Read() && reader.TokenType == JsonToken.String)
            {
                passwordList.Add(reader.Value.ToString());
            }
        }
    }
}
passwordList.ForEach(i => Console.WriteLine(i));

Parse JSON with unknown attributes names into a Case Class

I have the following JSON file to be parsed into a case class:
{
"root": {
"nodes": [{
"id": "1",
"attributes": {
"name": "Node 1",
"size": "3"
}
},
{
"id": "2",
"attributes": {
"value": "4",
"name": "Node 2"
}
}
]
}
}
The problem is that the attributes could have any value inside it: name, size, value, anything ...
At this moment I have defined my case classes:
case class Attributes(
name: String,
size: String,
value: String
)
case class Nodes(
id: String,
attributes: Attributes
)
case class Root(
nodes: List[Nodes]
)
case class R00tJsonObject(
root: Root
)
What is the best way to deal with this scenario when I can receive any attribute?
Currently I am using Json4s to handle JSON files.
Thanks!
Your attributes are arbitrarily many and differently named, but it seems you can store them in a Map[String, String] (at least, if those examples are anything to go by). In this case, using circe-parser (https://circe.github.io/circe/parsing.html), you could simply use code along these lines in order to convert your JSON directly into a simple case-class:
import io.circe._, io.circe.parser._
import io.circe.generic.semiauto._
case class Node(id: String, attributes: Map[String,String])
case class Root(nodes: List[Node])
implicit val nodeDecoder: Decoder[Node] = deriveDecoder[Node]
implicit val nodeEncoder: Encoder[Node] = deriveEncoder[Node]
implicit val rootDecoder: Decoder[Root] = deriveDecoder[Root]
implicit val rootEncoder: Encoder[Root] = deriveEncoder[Root]
def myParse(jsonString: String) = {
val res = parse(jsonString) match {
case Right(json) => {
val cursor = json.hcursor
cursor.get[Root]("root")
}
case _ => Left("Wrong JSON!")
}
println(res)
}
This snippet will print
Right(Root(List(Node(1,Map(name -> Node 1, size -> 3)), Node(2,Map(value -> 4, name -> Node 2)))))
on the console for the JSON you've given (assuming the solution doesn't have to be in Json4s).
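Since the question mentions Json4s, the same Map[String, String] idea can be sketched there as well. This assumes the json4s-jackson module is on the classpath (the native module should work the same way with its own JsonMethods import) and that the case classes are defined at the top level. Json4sAttributes and parseRoot are hypothetical names.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

// Arbitrary, differently named attributes are kept as a plain map.
case class Node(id: String, attributes: Map[String, String])
case class Root(nodes: List[Node])

// Hypothetical helper object, not part of json4s.
object Json4sAttributes {
  implicit val formats: Formats = DefaultFormats

  // Select the "root" field and extract it into the case classes.
  // extract throws a MappingException if the document does not fit.
  def parseRoot(jsonString: String): Root =
    (parse(jsonString) \ "root").extract[Root]
}
```

If some attribute values may be numbers or nested objects rather than strings, Map[String, JValue] would be a more forgiving field type.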

Scala/Play - Building an aggregated JSON by getting partial data from the list

I have the following case class:
case class Vehicle(`type`: String, brand: String, transmission: String)
I am using Slick to query a database that contains all of the columns in the case class and I get a query result like this (a list of objects):
List(Vehicle(car, audi, automatic), Vehicle(truck, toyota, automatic), Vehicle(motorcycle, bmw, manual))
I want the result JSON to look like this:
{
"vehicles" : [
{
"type" : "car",
"brand" : "audi",
"transmission" : "automatic"
},
{
"type" : "truck",
"brand" : "toyota",
"transmission" : "automatic"
},
{
"type" : "motorcycle",
"brand" : "bmw",
"transmission" : "manual"
}
]
}
Now to achieve that, I can easily use a mutable list and map my DB result one by one and build out a JSON like that. Since I am already using the Play Framework, writing out JSON is a piece of cake. But, I would like to do things in a more functional way and without using any mutable variables.
How do I read the List of objects from the DB result and put it together like in the result JSON?
Define a JSON Format for Vehicle; you can then convert a List[Vehicle] to JSON automatically with Json.toJson(vehicles):
import play.api.libs.json._
// type is a reserved word in Scala, so the field name needs backticks.
case class Vehicle(`type`: String, brand: String, transmission: String)
object Vehicle {
implicit val vehicleFormat = Json.format[Vehicle]
}
//this will be slick function which gets data from database
def getVehiclesFromDB: Future[List[Vehicle]] = Future(List(Vehicle("a", "b", "c"), Vehicle("d", "e", "f")))
Inside the Controller
@Singleton
class Application @Inject() (vehiclesRepo: VehiclesRepo) extends Controller {
def getVehicles = Action.async {
vehiclesRepo.getVehiclesFromDB.map { vehicles =>
Ok(Json.toJson(vehicles))
}
}
}
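Note that Json.toJson on a list produces a bare JSON array. To get the "vehicles" wrapper shown in the desired output, the array can be placed inside an object, still without any mutable state. The sketch below assumes play-json is on the classpath. VehicleJson and wrap are hypothetical names.

```scala
import play.api.libs.json._

// type is a reserved word in Scala, hence the backticks.
case class Vehicle(`type`: String, brand: String, transmission: String)

object Vehicle {
  implicit val vehicleFormat: OFormat[Vehicle] = Json.format[Vehicle]
}

// Hypothetical helper object, not part of play-json.
object VehicleJson {
  // Wrap the serialized list under a "vehicles" key.
  def wrap(vehicles: List[Vehicle]): JsObject =
    Json.obj("vehicles" -> Json.toJson(vehicles))
}
```

In the controller, Ok(VehicleJson.wrap(vehicles)) would then take the place of Ok(Json.toJson(vehicles)).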