I have been given a json string that looks like the following one:
{
"dataflows": [
{
"name": "test",
"sources": [
{
"name": "person_inputs",
"path": "/data/input/events/person/*",
"format": "JSON"
}
],
"transformations": [
{
"name": "validation",
"type": "validate_fields",
"params": {
"input": "person_inputs",
"validations": [
{
"field": "office",
"validations": [
"notEmpty"
]
},
{
"field": "age",
"validations": [
"notNull"
]
}
]
}
},
{
"name": "ok_with_date",
"type": "add_fields",
"params": {
"input": "validation_ok",
"addFields": [
{
"name": "dt",
"function": "current_timestamp"
}
]
}
}
],
"sinks": [
{
"input": "ok_with_date",
"name": "raw-ok",
"paths": [
"/data/output/events/person"
],
"format": "JSON",
"saveMode": "OVERWRITE"
},
{
"input": "validation_ko",
"name": "raw-ko",
"paths": [
"/data/output/discards/person"
],
"format": "JSON",
"saveMode": "OVERWRITE"
}
]
}
]
}
And I have been asked to use it as a kind of recipe for an ETL pipeline, i.e., the data must be extracted from the "path" specified in the "sources" key, the transformations to be carried out are specified within the "transformations" key and, finally, the transformed data must be saved to one of the two sinks specified under the "sinks" key.
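I have decided to convert the JSON string into a Scala map, as follows:
import scala.io.Source
import org.json4s._
// Assumption: the Jackson backend; org.json4s.native.JsonMethods.parse works the same way
import org.json4s.jackson.JsonMethods.parse
implicit val formats: Formats = DefaultFormats
// Parse a JSON string into a nested Map[String, Any]
def jsonStrToMap(jsonStr: String): Map[String, Any] =
  parse(jsonStr).extract[Map[String, Any]]
val json = Source.fromFile("path/to/json")
val parsedJson = jsonStrToMap(json.mkString)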
so, with that, I get a structure like this one:
which is a map whose first value is a list of maps. I can evaluate parsedJson("dataflows") to get:
which is a list, as expected, but then I cannot traverse that list, even though I need to in order to get to the sources, transformations and sinks. I have tried using the index of the list to, for example, get its first element, like this: parsedJson("dataflows")(0), but to no avail.
Can anyone please help me traverse this structure? Any help would be much appreciated.
Cheers,
When you evaluate parsedJson("dataflows"), a Tuple2 is returned, i.e. a tuple with two elements that are accessed with ._1 and ._2.
So for dataflows(1)._1 the value returned is "sources", and dataflows(1)._2 is a list of maps (List[Map[K, V]]), which can be traversed like any other List whose elements are Maps.
Let's deconstruct this for example:
val dataFlowsZero = ("sources", List(Map(42 -> "foo"), Map(42 -> "bar")))
The first element in the Tuple
scala> dataFlowsZero._1
String = sources
The second element in the Tuple
scala> dataFlowsZero._2
List[Map[Int, String]] = List(Map(42 -> foo), Map(42 -> bar))
Map the keys in each Map in List to a new List
scala> dataFlowsZero._2.map(m => m.keys)
List[Iterable[Int]] = List(Set(42), Set(42))
Map the values in each Map in the List to a new List
scala> dataFlowsZero._2.map(m => m.values)
List[Iterable[String]] = List(Iterable(foo), Iterable(bar))
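One likely reason parsedJson("dataflows")(0) does not compile is that looking up a key in a Map[String, Any] yields a value whose static type is Any, and Any has no apply method, so the value has to be narrowed first. The following is a minimal sketch of that narrowing with pattern matching. The asMaps helper and the hand-built parsedJson value are hypothetical stand-ins for the real parsed document, not part of json4s.

```scala
// Hypothetical stand-in for the parsed document (only "sources" is filled in).
val parsedJson: Map[String, Any] = Map(
  "dataflows" -> List(
    Map(
      "name" -> "test",
      "sources" -> List(
        Map("name" -> "person_inputs", "path" -> "/data/input/events/person/*", "format" -> "JSON")
      )
    )
  )
)

// Hypothetical helper: narrow a value of type Any to a list of string-keyed maps.
// Anything that is not a list yields Nil; list elements that are not maps are skipped.
def asMaps(value: Any): List[Map[String, Any]] = value match {
  case xs: List[_] => xs.collect { case m: Map[_, _] => m.asInstanceOf[Map[String, Any]] }
  case _           => Nil
}

val dataflows = asMaps(parsedJson("dataflows"))

// Collect every source path across all dataflows.
val sourcePaths: List[String] = for {
  flow   <- dataflows
  source <- asMaps(flow.getOrElse("sources", Nil))
  path   <- source.get("path").collect { case s: String => s }.toList
} yield path
```

The same pattern should apply to the "transformations" and "sinks" keys, although every level needs its own narrowing step, which is the main cost of working with Map[String, Any].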
The best solution is to convert the JSON to the full data structure that you have been provided rather than just Map[String, Any]. This makes it trivial to pick out the data that you want. For example,
case class DataFlows(dataflows: List[DataFlow])
case class DataFlow(name: String, sources: List[Source], transformations: List[Transformation], sinks: List[Sink])
case class Source(name: String, path: String, format: String)
// "params" is a JSON object, not an array, and its contents depend on the transformation type
case class Transformation(name: String, `type`: String, params: Params)
case class Params(input: String, validations: Option[List[Validation]], addFields: Option[List[AddField]])
case class Validation(field: String, validations: List[String])
case class AddField(name: String, function: String)
case class Sink(input: String, name: String, paths: List[String], format: String, saveMode: String)
val dataFlows = parse(jsonStr).extract[DataFlows]
The idea is to make the JSON handler do most of the work to create a type-safe version of the original data.
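To illustrate why the typed version is easier to work with, here is a small sketch that picks data out of such a structure. It assumes the extraction has already succeeded, so the value is built by hand instead of being parsed, and the case classes are trimmed to the parts the example needs. The names flows, inputPaths and sinkPathsByInput are made up for illustration.

```scala
// Trimmed-down case classes (transformations omitted to keep the sketch short).
case class Source(name: String, path: String, format: String)
case class Sink(input: String, name: String, paths: List[String], format: String, saveMode: String)
case class DataFlow(name: String, sources: List[Source], sinks: List[Sink])
case class DataFlows(dataflows: List[DataFlow])

// Built by hand for illustration; normally this would be the result of extracting from the JSON.
val flows = DataFlows(List(
  DataFlow(
    name = "test",
    sources = List(Source("person_inputs", "/data/input/events/person/*", "JSON")),
    sinks = List(
      Sink("ok_with_date", "raw-ok", List("/data/output/events/person"), "JSON", "OVERWRITE"),
      Sink("validation_ko", "raw-ko", List("/data/output/discards/person"), "JSON", "OVERWRITE")
    )
  )
))

// All input paths, with no casts and no pattern matching on Any.
val inputPaths: List[String] = flows.dataflows.flatMap(_.sources.map(_.path))

// Output paths keyed by the name of the dataset each sink consumes.
val sinkPathsByInput: Map[String, List[String]] =
  flows.dataflows.flatMap(_.sinks).map(s => s.input -> s.paths).toMap
```

Because every field has a concrete type, mistakes such as a misspelled key show up at compile time instead of as a runtime failure.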
Related
Suppose I have some JSON data like this:
{
"data": {
"title": "example input",
"someBoolean": false,
"innerData": {
"innerString": "input inner string",
"innerBoolean": true,
"innerCollection": [1,2,3,4,5]
},
"collection": [6,7,8,9,0]
}
}
And I want to flatten it a bit and transform or remove some fields, to get the following result:
{
"data": {
"ttl": "example input",
"bool": false,
"collection": [6,7,8,9,0],
"innerCollection": [1,2,3,4,5]
}
}
How can I do this with Circe?
(Note that I'm asking this as a FAQ since similar questions often come up in the Circe Gitter channel. This specific example is from a question asked there yesterday.)
I've sometimes said that Circe is primarily a library for encoding and decoding JSON, not for transforming JSON values, and in general I'd recommend mapping to Scala types and then defining relationships between those (as Andriy Plokhotnyuk suggests here), but for many cases writing transformations with cursors works just fine, and in my view this kind of thing is one of them.
Here's how I'd implement this transformation:
import io.circe.{DecodingFailure, Json, JsonObject}
import io.circe.syntax._
def transform(in: Json): Either[DecodingFailure, Json] = {
val someBoolean = in.hcursor.downField("data").downField("someBoolean")
val innerData = someBoolean.delete.downField("innerData")
for {
boolean <- someBoolean.as[Json]
collection <- innerData.get[Json]("innerCollection")
obj <- innerData.delete.up.as[JsonObject]
} yield Json.fromJsonObject(
obj.add("boolean", boolean).add("collection", collection)
)
}
And then:
val Right(json) = io.circe.jawn.parse(
"""{
"data": {
"title": "example input",
"someBoolean": false,
"innerData": {
"innerString": "input inner string",
"innerBoolean": true,
"innerCollection": [1,2,3]
},
"collection": [6,7,8]
}
}"""
)
And:
scala> transform(json)
res1: Either[io.circe.DecodingFailure,io.circe.Json] =
Right({
"data" : {
"title" : "example input",
"collection" : [
6,
7,
8
]
},
"boolean" : false,
"collection" : [
1,
2,
3
]
})
If you look at it the right way, our transform method kind of resembles a decoder, and we can actually write it as one (although I'd definitely recommend not making it implicit):
import cats.syntax.apply._ // provides mapN on tuples
import io.circe.{Decoder, Json, JsonObject}
import io.circe.syntax._
val transformData: Decoder[Json] = { c =>
val someBoolean = c.downField("data").downField("someBoolean")
val innerData = someBoolean.delete.downField("innerData")
(
innerData.delete.up.as[JsonObject],
someBoolean.as[Json],
innerData.get[Json]("innerCollection")
).mapN(_.add("boolean", _).add("collection", _)).map(Json.fromJsonObject)
}
This can be convenient in some situations where you want to perform the transformation as part of a pipeline that expects a decoder:
scala> io.circe.jawn.decode(myJsonString)(transformData)
res2: Either[io.circe.Error,io.circe.Json] =
Right({
"data" : {
"title" : "example input",
"collection" : [ ...
This is also potentially confusing, though, and I've thought about adding some kind of Transformation type to Circe that would encapsulate transformations like this without questionably repurposing the Decoder type class.
One nice thing about both the transform method and this decoder is that if the input data doesn't have the expected shape, the resulting error will include a history that points to the problem.
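For comparison, the alternative mentioned earlier (mapping to Scala types and defining the relationship between them) could look roughly like the sketch below. It leaves out the Circe encoders and decoders entirely and only shows the type-to-type step. The case class names and the flatten function are assumptions made for this example, with the target field names taken from the desired output in the question.

```scala
// Shape of the input document's "data" object.
case class InnerData(innerString: String, innerBoolean: Boolean, innerCollection: List[Int])
case class Data(title: String, someBoolean: Boolean, innerData: InnerData, collection: List[Int])

// Shape of the flattened result, using the renamed fields from the question.
case class FlatData(ttl: String, bool: Boolean, collection: List[Int], innerCollection: List[Int])

// The whole transformation is an ordinary function between the two types.
def flatten(d: Data): FlatData =
  FlatData(
    ttl = d.title,
    bool = d.someBoolean,
    collection = d.collection,
    innerCollection = d.innerData.innerCollection
  )

val input = Data(
  title = "example input",
  someBoolean = false,
  innerData = InnerData("input inner string", innerBoolean = true, innerCollection = List(1, 2, 3, 4, 5)),
  collection = List(6, 7, 8, 9, 0)
)

val flat = flatten(input)
```

The trade-off is that this needs a decoder for Data and an encoder for FlatData, whereas the cursor-based version works directly on the JSON value.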
I'm getting a JSON object over the network, as a String. I'm then using Circe to parse it. I want to add a handful of fields to it, and then pass it on downstream.
Almost all of that works.
The problem is that my "adding" is really "overwriting". That's actually ok, as long as I add an empty object first. How can I add such an empty object?
So looking at the code below, I am overwriting "sometimes_empty:{}" and it works. But because sometimes_empty is not always empty, it results in some data loss. I'd like to add a field like "custom:{}" and then overwrite the value of custom with my existing code.
Two StackOverflow posts were helpful. One worked, but wasn't quite what I was looking for. The other I couldn't get to work.
1: Modifying a JSON array in Scala with circe
2: Adding field to a json using Circe
val js: String = """
{
"id": "19",
"type": "Party",
"field": {
"id": 1482,
"name": "Anne Party",
"url": "https"
},
"sometimes_empty": {
},
"bool": true,
"timestamp": "2018-12-18T11:39:18Z"
}
"""
val newJson = parse(js).toOption
.flatMap { doc =>
doc.hcursor
.downField("sometimes_empty")
.withFocus(_ =>
Json.fromFields(
Seq(
("myUrl", Json.fromString(myUrl)),
("valueZ", Json.fromString(valueZ)),
("valueQ", Json.fromString(valueQ)),
("balloons", Json.fromString(balloons))
)
)
)
.top
}
newJson match {
case Some(v) => return v.toString
case None => println("Failure!")
}
We need to do a couple of things. First, we need to zoom in on the specific property we want to update; if it doesn't exist, we'll create a new empty one. Then we turn the zoomed-in property from a Json into a JsonObject so that we can modify it with the +: method. Once we've done that, we need to take the updated property and put it back into the original parsed JSON to get the complete result:
import io.circe.{Json, JsonObject, parser}
import io.circe.syntax._
object JsonTest {
def main(args: Array[String]): Unit = {
val js: String =
"""
|{
| "id": "19",
| "type": "Party",
| "field": {
| "id": 1482,
| "name": "Anne Party",
| "url": "https"
| },
| "sometimes_empty": {
|   "someProperty": true
| },
| "bool": true,
| "timestamp": "2018-12-18T11:39:18Z"
|}
""".stripMargin
val maybeAppendedJson =
for {
json <- parser.parse(js).toOption
sometimesEmpty <- json.hcursor
.downField("sometimes_empty")
.focus
.orElse(Option(Json.fromJsonObject(JsonObject.empty)))
jsonObject <- json.asObject
emptyFieldJson <- sometimesEmpty.asObject
appendedField = emptyFieldJson.+:("added", Json.fromBoolean(true))
res = jsonObject.+:("sometimes_empty", appendedField.asJson)
} yield res
maybeAppendedJson.foreach(obj => println(obj.asJson.spaces2))
}
}
Yields:
{
"id" : "19",
"type" : "Party",
"field" : {
"id" : 1482,
"name" : "Anne Party",
"url" : "https"
},
"sometimes_empty" : {
"added" : true,
"someProperty" : true
},
"bool" : true,
"timestamp" : "2018-12-18T11:39:18Z"
}
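The underlying idea (take the nested object if it exists, otherwise start from an empty one, add the new fields, then put the result back) does not depend on Circe. As a rough analogy only, here is the same merge written against plain immutable Maps. The Obj alias and the addToNested helper are hypothetical names for this sketch and are not Circe APIs.

```scala
// Plain-Map stand-in for a JSON object.
type Obj = Map[String, Any]

// Hypothetical helper: merge `extra` into the object stored under `key`,
// starting from an empty object when the key is missing or is not an object.
def addToNested(doc: Obj, key: String, extra: Obj): Obj = {
  val existing: Obj = doc.get(key) match {
    case Some(m: Map[_, _]) => m.asInstanceOf[Obj]
    case _                  => Map.empty
  }
  doc.updated(key, existing ++ extra)
}

val doc: Obj = Map(
  "id" -> "19",
  "sometimes_empty" -> Map("someProperty" -> true)
)

// Existing fields are kept and the new one is added alongside them.
val merged = addToNested(doc, "sometimes_empty", Map("added" -> true))

// A missing key is created from an empty object.
val created = addToNested(doc, "custom", Map("added" -> true))
```

Merging instead of replacing is what avoids the data loss described in the question when the nested object is not empty.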
I'm trying to get the first key of a JsValue in Scala. I searched a lot to find out how to do this.
Eventually I found that result.keys.head is supposed to return the first key, but it fails with the error "value keys is not a member of play.api.libs.json.JsValue".
And my result variable has the below form of data:
{
"intents": [{
"intent": "feeling",
"confidence": 0.1018563217175903
}],
"entities": [],
"input": {
"text": "{reset-encounter}"
},
"output": "Good"
}
Code:
import play.api.libs.json._
val jsonStr = """
{
"intents": [{
"intent": "feeling",
"confidence": 0.1018563217175903
}],
"entities": [],
"input": {
"text": "{reset-encounter}"
},
"output": "Good"
}
"""
val result = Json.parse(jsonStr)
println("key: ", result.keys.head)
The error is thrown at the result.keys.head line.
I'm not sure, but I think I may be doing something wrong here.
Json.parse produces a JsValue, which could represent any type of JSON value (boolean, number, array, object, etc.). If you know you're working with an object, you can use .as[JsObject]:
import play.api.libs.json._
val result = Json.parse(jsonStr).as[JsObject]
println("key: " + result.keys.head)
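To show why keys is only available after narrowing to the object type, here is a sketch using a tiny, made-up JSON ADT. MiniValue, MiniString, MiniObject and headKey are hypothetical and only imitate the general design; they are not Play's classes.

```scala
// A deliberately tiny JSON-like ADT: only the object variant knows about keys.
sealed trait MiniValue
case class MiniString(value: String) extends MiniValue
case class MiniObject(fields: Map[String, MiniValue]) extends MiniValue {
  def keys: Set[String] = fields.keySet
}

// Narrow to the object variant before asking for keys; other variants have none.
def headKey(v: MiniValue): Option[String] = v match {
  case o: MiniObject => o.keys.headOption
  case _             => None
}

val parsed: MiniValue = MiniObject(Map("output" -> MiniString("Good")))
val first = headKey(parsed)
```

Returning an Option also covers the empty-object case, where calling head directly would throw.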
What are you trying to get? That's not the way to deal with Play JSON values.
.keys is defined on JsObject, where it returns the set of field names; it is not available on a plain JsValue.
Check the documentation: https://www.playframework.com/documentation/2.5.x/ScalaJson
If you want to access a specific key (https://www.playframework.com/documentation/2.5.x/ScalaJson#Traversing-a-JsValue-structure) you should try:
result \ "keyName"
or for a recursive search:
result \\ "keyName"
Please help! I'm trying to generate object from JSON with jackson kotlin module. Here is json source:
{
"name": "row",
"type": "layout",
"subviews": [{
"type": "horizontal",
"subviews": [{
"type": "image",
"icon": "ic_no_photo",
"styles": {
"view": {
"gravity": "center"
}
}
}, {
"type": "vertical",
"subviews": [{
"type": "text",
"fields": {
"text": "Some text 1"
}
}, {
"type": "text",
"fields": {
"text": "Some text 2"
}
}]
}, {
"type": "vertical",
"subviews": [{
"type": "text",
"fields": {
"text": "Some text 3"
}
}, {
"type": "text",
"fields": {
"text": "Some text 4"
}
}]
}, {
"type": "vertical",
"subviews": [{
"type": "image",
"icon": "ic_no_photo"
}, {
"type": "text",
"fields": {
"text": "Some text 5"
}
}]
}]
}]
}
I'm trying to generate instance of Skeleton class.
data class Skeleton (val type : String,
val name: String,
val icon: String,
val fields: List<Field>,
val styles: Map<String, Map<String, Any>>,
val subviews : List<Skeleton>)
data class Field (val type: String, val value: Any)
As you can see, Skeleton object can have other Skeleton objects inside (and these objects can have other Skeleton objects inside too), also Skeleton can have List of Field objects
val mapper = jacksonObjectMapper()
val skeleton: Skeleton = mapper.readValue(File(file))
This code ends with exception:
com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class com.uibuilder.controllers.parser.Skeleton] value failed (java.lang.IllegalArgumentException): Parameter specified as non-null is null: method com.uibuilder.controllers.parser.Skeleton.<init>, parameter name
at [Source: docs\layout.txt; line: 14, column: 3] (through reference chain: com.uibuilder.controllers.parser.Skeleton["subviews"]->java.util.ArrayList[0]->com.uibuilder.controllers.parser.Skeleton["subviews"]->java.util.ArrayList[0])
There are several issues I found about your mapping that prevent Jackson from reading the value from JSON:
Skeleton class has not-null constructor parameters (e.g. val type: String, not String?), and Jackson passes null to them if the value for those parameters is missing in JSON. This is what causes the exception you mentioned:
Parameter specified as non-null is null: method com.uibuilder.controllers.parser.Skeleton.<init>, parameter name
To avoid it, you should mark the parameters whose values might be missing as nullable (all of the parameters in your case):
data class Skeleton(val type: String?,
val name: String?,
val icon: String?,
val fields: List<Field>?,
val styles: Map<String, Map<String, Any>>?,
val subviews : List<Skeleton>?)
fields in Skeleton has type List<Field>, but in JSON it's represented by a single object, not by an array. The fix would be to change the fields parameter type to Field?:
data class Skeleton(...
val fields: Field?,
...)
Also, Field class in your code doesn't match the objects in JSON:
"fields": {
"text": "Some text 1"
}
You should change Field class as well, so that it has text property:
data class Field(val text: String)
After I made the changes I listed, Jackson could successfully read the JSON in question.
See also: "Null Safety" in Kotlin reference.
I'm trying to extract my data from JSON into a case class, without success.
the Json file:
[
{
"name": "bb",
"loc": "sss",
"elements": [
{
"name": "name1",
"loc": "firstHere",
"elements": []
}
]
},
{
"name": "ca",
"loc": "sss",
"elements": []
}
]
my code :
case class ElementContainer(name : String, location : String,elements : Seq[ElementContainer])
object elementsFormatter {
implicit val elementFormatter = Json.format[ElementContainer]
}
object Applicationss extends App {
val el = new ElementContainer("name1", "firstHere", Seq.empty)
val el1Cont = new ElementContainer("bb","sss", Seq(el))
val source:String=Source.fromFile("src/bin/elementsTree.json").getLines.mkString
val jsonFormat = Json.parse(source)
val r1= Json.fromJson[ElementContainer](jsonFormat)
}
After running this, r1 contains:
JsError(List((/elements,List(ValidationError(List(error.path.missing),WrappedArray()))), (/name,List(ValidationError(List(error.path.missing),WrappedArray()))), (/location,List(ValidationError(List(error.path.missing),WrappedArray())))))
I've been trying to extract this data forever; please advise.
Your case class has a field named location where the JSON has loc, and you'll need to parse the file into a Seq[ElementContainer], since the top-level value is an array, not a single ElementContainer:
Json.fromJson[Seq[ElementContainer]](jsonFormat)
Also, there is the validate method, which returns either the errors or the parsed object.