Flattening nested JSON objects with Circe - json

Suppose I have a JSON object like this:
{
"foo": true,
"bar": {
"baz": 1,
"qux": {
"msg": "hello world",
"wow": [null]
}
}
}
And I want to flatten it recursively to a single layer, with the keys merged with an underscore:
{
"foo": true,
"bar_baz": 1,
"bar_qux_msg": "hello world",
"bar_qux_wow": [null]
}
How can I do this with Circe?
(Note: this is another FAQ from the Circe Gitter channel.)

You can do this without too much pain in Circe with a recursive method:
import io.circe.Json
def flatten(combineKeys: (String, String) => String)(value: Json): Json = {
  def flattenToFields(value: Json): Option[Iterable[(String, Json)]] =
    value.asObject.map(
      _.toIterable.flatMap {
        case (k, v) => flattenToFields(v) match {
          case None => List(k -> v)
          case Some(fields) => fields.map {
            case (innerK, innerV) => combineKeys(k, innerK) -> innerV
          }
        }
      }
    )
  flattenToFields(value).fold(value)(Json.fromFields)
}
Here the internal flattenToFields method takes a JSON value and returns either None if the value is not a JSON object (a signal that the field doesn't need flattening), or a Some containing the sequence of flattened fields if it is a JSON object.
If we have a JSON value like this:
val Right(doc) = io.circe.jawn.parse("""{
"foo": true,
"bar": {
"baz": 1,
"qux": {
"msg": "hello world",
"wow": [null]
}
}
}""")
We can verify that flatten does what we want like this:
scala> flatten(_ + "_" + _)(doc)
res1: io.circe.Json =
{
"foo" : true,
"bar_baz" : 1,
"bar_qux_msg" : "hello world",
"bar_qux_wow" : [
null
]
}
Note that flattenToFields is not tail recursive, and will overflow the stack for deeply-nested JSON objects, but probably not until you're several thousand levels deep, so it's unlikely to be an issue in practice. You could make it tail recursive without too much trouble, but at the expense of additional overhead for the common cases where you only have a few layers of nesting.
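The trade-off described above can be made concrete with a minimal sketch of a stack-safe variant. It replaces the recursion with an explicit work list, so strictly speaking it is iterative rather than tail recursive. The name flattenIter is hypothetical, and the sketch assumes combineKeys is associative (as _ + "_" + _ is): keys are combined outermost first here, whereas the recursive version combines them innermost first.

```scala
import io.circe.Json

// Hypothetical stack-safe variant of flatten: pending fields live on an
// explicit list instead of the call stack.
def flattenIter(combineKeys: (String, String) => String)(value: Json): Json =
  value.asObject match {
    case None => value // not an object: nothing to flatten
    case Some(root) =>
      val out = Vector.newBuilder[(String, Json)]
      // Fields still to be processed, in document order.
      var pending: List[(String, Json)] = root.toIterable.toList
      while (pending.nonEmpty) {
        val (k, v) = pending.head
        pending = pending.tail
        v.asObject match {
          case Some(obj) =>
            // Prefix the nested fields with the parent key and process them next.
            val nested = obj.toIterable.toList.map { case (ik, iv) => combineKeys(k, ik) -> iv }
            pending = nested ::: pending
          case None =>
            out += (k -> v)
        }
      }
      Json.fromFields(out.result())
  }

// The document from the question, built directly rather than parsed.
val doc = Json.obj(
  "foo" -> Json.True,
  "bar" -> Json.obj(
    "baz" -> Json.fromInt(1),
    "qux" -> Json.obj(
      "msg" -> Json.fromString("hello world"),
      "wow" -> Json.arr(Json.Null)
    )
  )
)
val flat = flattenIter(_ + "_" + _)(doc)
```

For the document in the question this should produce the same four fields as the recursive version; the extra list bookkeeping is the overhead mentioned above.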

I propose a variation of the solution by Travis Brown. The variation concerns objects in the JSON lists, i.e. how to handle
{
"foo": true,
"bar": {
"baz": 1,
"qux": {
"msg": "hello world",
"wow": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
}
}
}
One possible solution for recursively handling objects in lists is the following implementation, where the position in the list is used as an additional key part:
def flattenDeep(combineKeys: (String, String) => String)(value: Json): Json = {
  def flattenToFields(value: Json): Option[Iterable[(String, Json)]] = {
    value.fold(
      jsonNull = None,
      jsonNumber = _ => None,
      jsonString = _ => None,
      jsonBoolean = _ => None,
      jsonObject = { obj =>
        val fields = obj.toIterable.flatMap {
          case (field, json) =>
            flattenToFields(json).fold(Iterable(field -> json)) {
              _.map {
                case (innerField, innerJson) =>
                  combineKeys(field, innerField) -> innerJson
              }
            }
        }
        Some(fields)
      },
      jsonArray = { array =>
        val fields = array.zipWithIndex.flatMap {
          case (json, index) =>
            flattenToFields(json).fold(Iterable(index.toString -> json)) {
              _.map {
                case (innerField, innerJson) =>
                  combineKeys(index.toString, innerField) -> innerJson
              }
            }
        }
        Some(fields)
      }
    )
  }
  flattenToFields(value).fold(value)(Json.fromFields)
}
With this implementation the above example is flattened to:
{
"foo" : true,
"bar_baz" : 1,
"bar_qux_msg" : "hello world",
"bar_qux_wow_0_x" : 1,
"bar_qux_wow_0_y" : 2,
"bar_qux_wow_1_x" : 3,
"bar_qux_wow_1_y" : 4
}
For even deeper nested structures one still gets a flat representation, e.g.
{
"foo": true,
"bar": {
"baz": 1,
"qux": {
"msg": "hello world",
"wow": [
{
"x": 1,
"y": 2
},
{
"x": 3,
"y": 4
}
],
"deeper": [
{
"alpha": {
"h": 12,
"m": 1
},
"beta": [ "a", "b", "c" ]
},
{
"alpha": {
"h": 21,
"m": 0
},
"beta": [ "z" ]
}
]
}
}
}
will be flattened into
{
"foo" : true,
"bar_baz" : 1,
"bar_qux_msg" : "hello world",
"bar_qux_wow_0_x" : 1,
"bar_qux_wow_0_y" : 2,
"bar_qux_wow_1_x" : 3,
"bar_qux_wow_1_y" : 4,
"bar_qux_deeper_0_alpha_h" : 12,
"bar_qux_deeper_0_alpha_m" : 1,
"bar_qux_deeper_0_beta_0" : "a",
"bar_qux_deeper_0_beta_1" : "b",
"bar_qux_deeper_0_beta_2" : "c",
"bar_qux_deeper_1_alpha_h" : 21,
"bar_qux_deeper_1_alpha_m" : 0,
"bar_qux_deeper_1_beta_0" : "z"
}

Related

How to read from a JSON with two keys

I have a JSON file that I need to load and then look up a certain value in. The JSON has two levels of keys, like
{
"NUM_High_Objects": {
"abseta_pt": {
"field1:[0.0,0.9]": {
"field2:[15,20]": {
"tagIso": 0.00012,
"value": 0.99
},
"field2:[20,25]": {
"tagIso": 0.00035,
"value": 0.98
}
},
"field1:[0.91,1.2]": {
"field2:[15,20]": {
"tagIso": 0.00013,
"value": 0.991
},
"field2:[20,25]": {
"tagIso": 0.00036,
"value": 0.975
}
},
"binning": [
{
"binning": [
0.0,
0.9,
1.2,
2.1,
2.4
],
"variable": "abseta"
},
{
"binning": [
15,
20,
25,
30,
40,
50,
60,
120
],
"variable": "pt"
}
]
}
},
What I need is to check whether a pair of values falls within the ranges given by "field1" and "field2" and return the corresponding "value".
I tried following "Search nested json / dict for multiple key values matching specified keys" but could not make it work.
I've tried something like
class checkJSON() :
    def __init__(self,filein) :
        self.good, self.bad = 0, 0
        print 'inside json function : will use the JSON', filein
        input_file = open (filein)
        self.json_array = json.load(input_file)
    def checkJSON(self,LS,run) :
        try :
            LSlist = self.json_array[str(run)]
            for LSrange in LSlist :print LSrange, run
        except KeyError :
            pass
        self.bad += 1
        return False
CJ=''
CJ=checkJSON(filein='test.json')
isInJSON = CJ.checkJSON("0.5", "20")
print isInJSON
but this does not work, as I am not sure how to loop over the nested keys.
If I am understanding your question correctly then the relevant portion of your JSON is:
{
"field1:[0.0,0.9]": {
"field2:[15,20]": {
"tagIso": 0.00012,
"value": 0.99
},
"field2:[20,25]": {
"tagIso": 0.00035,
"value": 0.98
}
},
"field1:[0.91,1.2]": {
"field2:[15,20]": {
"tagIso": 0.00013,
"value": 0.991
},
"field2:[20,25]": {
"tagIso": 0.00036,
"value": 0.975
}
},
"binning": [
{
"binning": [
0.0,
0.9,
1.2,
2.1,
2.4
],
"variable": "abseta"
},
{
"binning": [
15,
20,
25,
30,
40,
50,
60,
120
],
"variable": "pt"
}
]
}
Then the following code should do what you are trying to achieve. It doesn't look like you need to search for nested keys; you simply need to parse the field1:[...] and field2:[...] keys. The code below is a quick implementation of that idea. It returns the value if the first parameter is within the range of a field1:[...] key and the second parameter is within the range of a field2:[...] key nested under it. Otherwise, it returns None.
import json
def check_json(jsondict, l1val, l2val):
    def parse_key(keystr):
        # "field1:[0.0,0.9]" -> ("field1", [0.0, 0.9])
        level, lrange = keystr.split(':')
        # the range is a valid JSON array, so json.loads avoids calling eval on file contents
        return level, json.loads(lrange)
    for l1key, l2dict in jsondict.items():
        if 'field' in l1key:  # skip keys without a range, such as "binning"
            l1, l1range = parse_key(l1key)
            if l1range[0] <= l1val <= l1range[1]:
                for l2key, vals in l2dict.items():
                    l2, l2range = parse_key(l2key)
                    if l2range[0] <= l2val <= l2range[1]:
                        return vals['value']
    return None
Here is some driver code to test the implementation above. It assumes data.json contains only the portion of the JSON shown above.
if __name__ == '__main__':
    with open('data.json', 'r') as f:
        myjson = json.load(f)
    print(check_json(myjson, 0.5, 20))

JSON Objects null filtering in scala

I am using play.api.libs.json in Scala (2.12.8) to process some json objects. I have for example a JSON string that looks like:
{
"field1": null,
"field2": 23,
"field3": {
"subfield1": "a",
"subfield2": null
},
"field4": {
"subfield1": true,
"subfield2": {
"subsubfield1": null,
"subsubfield2": "45"
},
"field5": 3
}
}
And I want to filter out every null field or subfield.
As explained in "Play: How to remove the fields without value from JSON and create a new JSON with them", doing:
import play.api.libs.json.{ JsNull, JsObject, JsValue, Json }
val j = Json.parse(myJsonString).as[JsObject]
JsObject(j.fields.filterNot(t => withoutValue(t._2)))
def withoutValue(v: JsValue) = v match {
  case JsNull => true
  case _ => false
}
helps me remove the top-level fields: in my case, field1.
But field3.subfield2 and field4.subfield2.subsubfield1 are still present, and I want to remove them too. I should also mention that the subfields of an object are not expected to all be null at once. Should that happen, I think we could just remove the enclosing field: if field3.subfield1 and field3.subfield2 are both null, we can remove field3.
Any idea on how to do this neatly in Scala?
PS: the desired output is:
{
"field2": 23,
"field3": {
"subfield1": "a"
},
"field4": {
"subfield1": true,
"subfield2": {
"subsubfield2": "45"
},
"field5": 3
}
}
You need a recursive solution. For example:
def removeNulls(jsObject: JsObject): JsValue = {
  JsObject(jsObject.fields.collect {
    // recurse into nested objects
    case (s, j: JsObject) =>
      (s, removeNulls(j))
    // keep every other field unless its value is null
    case other if (other._2 != JsNull) =>
      other
  })
}
Code run at Scastie. Output is as expected.
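The question also mentions dropping a field whose subfields are all null, which the version above does not do (it would leave an empty object behind). A minimal sketch of that extension, assuming it is acceptable to drop objects that are empty to begin with as well; the name pruneNulls is hypothetical, and nulls inside arrays are deliberately left alone:

```scala
import play.api.libs.json._

// Hypothetical helper: returns None when nothing is left after pruning.
def pruneNulls(value: JsValue): Option[JsValue] = value match {
  case JsNull => None
  case obj: JsObject =>
    // Prune each field recursively and keep only the survivors.
    val kept = obj.fields.flatMap { case (k, v) => pruneNulls(v).map(k -> _) }
    if (kept.isEmpty) None else Some(JsObject(kept))
  case other => Some(other) // numbers, strings, booleans; arrays are kept untouched
}

val input = Json.parse("""{"field2": 23, "field3": {"subfield1": null, "subfield2": null}}""")
// Fall back to an empty object if the whole document was pruned away.
val pruned = pruneNulls(input).getOrElse(Json.obj())
```

Returning an Option lets the caller decide what an entirely null document should turn into.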

Scala Circe JSON Library - Understanding Implicit Encoder in an Example

I am working with the Scala Circe library. It seems very useful and I want to get better at using it.
One example I have is the following.
Consider the following code:
import io.circe.syntax._
import io.circe.{Json, Encoder}
import io.circe.generic.auto._
sealed trait JsonComponent
case class DataSequences(values: Map[String, List[Int]]) extends JsonComponent
case class DataField(data: List[DataSequences], `type`: String) extends JsonComponent
object Example extends App {
  implicit val encodeDataSequences: Encoder[DataSequences] = new Encoder[DataSequences] {
    final def apply(sequence: DataSequences): Json = sequence.values.asJson
  }
  val x = new DataSequences(Map("x" -> List(1, 2, 3), "y" -> List(1, 2, 4)))
  val l = List(DataField(List(x, x), "abc"), DataField(List(x, x), "cde")).asJson
  println(l)
}
This gives the following output:
[
{
"data" : [
{
"x" : [
1,
2,
3
],
"y" : [
1,
2,
4
]
},
{
"x" : [
1,
2,
3
],
"y" : [
1,
2,
4
]
}
],
"type" : "abc"
}
]
However, if I comment out the encodeDataSequences encoder definition, I get the following instead:
[
{
"data" : [
{
"values" : {
"x" : [
1,
2,
3
],
"y" : [
1,
2,
4
]
}
},
{
"values" : {
"x" : [
1,
2,
3
],
"y" : [
1,
2,
4
]
}
}
],
"type" : "abc"
}
]
So now this "values" appears. I do not want the "values" field to show up. I am not sure how the implicit is shaping the Json under the hood, and if someone could highlight what's going on that would be appreciated.
In addition, as a general matter, am I writing idiomatic Circe code by using that implicit encoder, and if not, is there a better way to do what I want?
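For context on the behaviour described above: with io.circe.generic.auto._ in scope, the encoder for a case class is derived from its fields, so DataSequences is encoded as an object with a single "values" key; the hand-written instance replaces that derived encoder and emits the map directly. A minimal sketch of a more compact way to write the same instance, assuming it is placed in the companion object (that placement is a choice made here, not something the question prescribes):

```scala
import io.circe.Encoder
import io.circe.syntax._

case class DataSequences(values: Map[String, List[Int]])

object DataSequences {
  // Reuse the existing Map encoder and unwrap the case class before encoding.
  implicit val encodeDataSequences: Encoder[DataSequences] =
    Encoder[Map[String, List[Int]]].contramap(_.values)
}

val json = DataSequences(Map("x" -> List(1, 2, 3))).asJson
```

Whether contramap reads better than an explicit new Encoder is largely a matter of taste; both produce the map without the "values" wrapper.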

Find the path of an JSON element with dynamic key with Play JSON

I am using Play Framework with Scala. I have the following JSON structure:
{
"a": 1540554574847,
"b": 2,
"c": {
"pep3lpnp1n1ugmex5uevekg5k20wkfq3": {
"a": 1,
"b": 1,
"c": 1,
"d": 1
},
"p3zgudnf7tzqvt50g7lpr2ryno7yugmy": {
"b": [
"d10e5600d11e5517"
],
"c": 1,
"d": 1,
"e": 1,
"g": 1,
"h": [
"d10e5600d11e5517",
"d10e5615d11e5527",
"d10e5605d11e5520",
"d10e5610d11e5523",
"d10e5620d11e5530"
],
"q": "a_z6smu56gstysjpqbzp21ruxii6g2ph00"
},
"33qfthhugr36f5ts4251glpqx0o373pe": {
"b": [
"d10e5633d11e5536"
],
"c": 1,
"d": 1,
"e": 1,
"g": 1,
"h": [
"d10e5638d11e5539",
"d10e5633d11e5536",
"d10e5643d11e5542",
"d10e5653d11e5549",
"d10e5648d11e5546"
],
"q": "a_cydo6wu1ds340j3q6qxeig97thocttsp"
}
}
}
I need to fetch values from paths
"c" -> "pep3lpnp1n1ugmex5uevekg5k20wkfq3" -> "b",
"c" -> "p3zgudnf7tzqvt50g7lpr2ryno7yugmy" -> "b",
"c" -> "33qfthhugr36f5ts4251glpqx0o373pe" -> "b", and so on, where "pep3lpnp1n1ugmex5uevekg5k20wkfq3" is dynamic and changes for every JSON input.
Output should be like Seq(object(q,b,c)).
If you don't need to know which generated key belongs to which value, you can use the recursive path operator \\:
import play.api.libs.json._
val jsonText = """{
"a":1540554574847,
"b":2,
"c":{
"onegeneratedkey":{
"a":1,
"b":1,
"c":1,
"d":1
},
"secondsonegeneratedkey":{
"a":1,
"b": [1, 2, 3],
"c":1,
"d":1
}
}
}"""
val result: Seq[JsValue] = Json.parse(jsonText) \ "c" \\ "b"
// res: List(1, [1,2,3])
UPD.
To get all values stored inside object with generated-keys, one can use JsObject#values:
val valuesSeq: Seq[JsValue] = (Json.parse(jsonText) \ "c").toOption // get the 'c' field
  .collect { case o: JsObject => o.values.toSeq } // get all objects stored under the generated keys
  .getOrElse(Seq.empty)
// res: Seq({"a":1,"b":1,"c":1,"d":1}, {"a":1,"b":[1,2,3],"c":1,"d":1})
val valuesABC = valuesSeq.map(it => (it \ "a", it \ "b", it \ "c"))
// res: Seq((JsDefined(1),JsDefined(1),JsDefined(1)), (JsDefined(1),JsDefined([1,2,3]),JsDefined(1)))
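If the generated key does need to stay attached to its value, a minimal sketch along the same lines is to walk the fields of the object under "c" rather than only its values. The sample document and the name bByKey are made up for illustration:

```scala
import play.api.libs.json._

// Hypothetical sample: "k1", "k2" and "k3" stand in for the generated keys.
val json = Json.parse("""{"c": {"k1": {"b": 1}, "k2": {"b": [1, 2, 3]}, "k3": {"a": 0}}}""")

// Pair each generated key under "c" with its "b" value, skipping entries without "b".
val bByKey: List[(String, JsValue)] =
  (json \ "c").asOpt[JsObject].toList.flatMap { c =>
    c.fields.flatMap { case (key, inner) => (inner \ "b").toOption.map(key -> _) }
  }
```

Here "k3" is dropped because it has no "b" field, and a missing or non-object "c" yields an empty list.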
I misread the question; this is the modified version.
Here I use json.pick to read the JsObject and iterate over its keys.
PS: You don't have to create the Reads or the case classes, but doing so should make the calling code more readable.
import play.api.libs.json._
val jsonText =
"""{
"top": {
"level2a": {
"a": 1,
"b": 1,
"c": 1,
"d": 1
},
"level2b": {
"a": 2,
"b": 2,
"nested": {
"b": "not interested"
}
}
}
}"""
case class Data(k: String, v: Int)
case class Datas(list: Seq[Data])
object Datas {
  implicit val reads: Reads[Datas] = (__ \ "top").json.pick.map {
    case obj: JsObject =>
      new Datas(obj.keys.flatMap(k => (obj \ k \ "b").validate[Int] match {
        case JsSuccess(v, _) => Some(Data(k, v))
        case _ => None
      }).toSeq)
  }
}
Json.parse(jsonText).validate[Datas].asOpt match {
  case Some(d) => println(s"found: $d")
  case _ => println("not found")
}
To deserialize the internal structure within each level-2 object, you can define a case class for it and use Json.reads to derive the default Reads, as long as the data structure is known and predictable.
For example:
case class Internal(a: Int, b: Int, c: Option[Int], d: Option[Int])
object Internal {
  implicit val reads: Reads[Internal] = Json.reads[Internal]
}
case class Data(k: String, v: Internal)
case class Datas(list: Seq[Data])
object Datas {
  implicit val reads: Reads[Datas] = (__ \ "top").json.pick.map {
    case obj: JsObject =>
      new Datas(obj.keys.flatMap(k => (obj \ k).validate[Internal].asOpt
        .map(v => Data(k, v))).toSeq)
  }
}

Put data in multiple branches of an array: JSON transformer, Scala Play

I want to add values to all the elements of an array in a JSON object.
For example:
value array [4,2.5,2.5,1.5]
json =
{
"items": [
{
"id": 1,
"name": "one",
"price": {}
},
{
"id": 2,
"name": "two"
},
{
"id": 3,
"name": "three",
"price": {}
},
{
"id": 4,
"name": "four",
"price": {
"value": 1.5
}
}
]
}
I want to transform the above JSON into
{
"items": [
{
"id": 1,
"name": "one",
"price": {
"value": 4
}
},
{
"id": 2,
"name": "two",
"price": {
"value": 2.5
}
},
{
"id": 3,
"name": "three",
"price": {
"value": 2.5
}
},
{
"id": 4,
"name": "four",
"price": {
"value": 1.5
}
}
]
}
Any suggestions on how to achieve this? My goal is to put values inside specific fields of the JSON array. I am using the Play JSON library throughout my application. What other options do I have besides JSON transformers?
You can use a simple transformation like
val prices = List[Double](4, 2.5, 2.5, 1.5).map(price => Json.obj("price" -> Json.obj("value" -> price)))
val t = (__ \ "items").json.update(
  of[List[JsObject]]
    // merge each item with the price object at the same position
    .map(_.zip(prices).map { case (item, price) => item ++ price })
    .map(JsArray)
)
res5: play.api.libs.json.JsResult[play.api.libs.json.JsObject] = JsSuccess({"items":[{"id":1,"name":"one","price":{"value":4}},{"id":2,"name":"two","price":{"value":2.5}},{"id":3,"name":"three","price":{"value":2.5}},{"id":4,"name":"four","price":{"value":1.5}}]},/items)
I suggest using classes, though I am not sure this fits your project, because it's hard to guess what the rest of your code looks like.
I create the Item instances manually for simplicity. You can create the items using the JSON library instead :)
class Price(val value: Double) {
  override def toString = s"{value:${value}}"
}
class Item(val id: Int, val name: String, val price: Price) {
  // auxiliary constructor for items without a price
  def this(id: Int, name: String) = this(id, name, null)
  override def toString = s"{id:${id}, name:${name}, price:${price}}"
}
val price = Array(4, 2.5, 2.5, 1.5)
/** You might convert Json data to List[Item] using Json library instead. */
val items: List[Item] = List(
  new Item(1, "one"),
  new Item(2, "two"),
  new Item(3, "three"),
  new Item(4, "four", new Price(1.5))
)
// fill in the price at the same index for every item that has none
val valueMappedItems = items.zipWithIndex.map { case (item, index) =>
  if (item.price == null) {
    new Item(item.id, item.name, new Price(price(index)))
  } else {
    item
  }
}
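The same idea can be sketched with Option in place of null, which avoids the explicit null check. This is a minimal sketch, not the answer's own code: the case classes and the names defaults and filled are hypothetical, and an empty "price": {} object from the question is assumed to map to None.

```scala
final case class Price(value: Double)
final case class Item(id: Int, name: String, price: Option[Price] = None)

// Default values, matched to the items by position.
val defaults = List(4.0, 2.5, 2.5, 1.5)
val items = List(
  Item(1, "one"),
  Item(2, "two"),
  Item(3, "three"),
  Item(4, "four", Some(Price(1.5)))
)

// Keep an existing price; otherwise take the default at the same position.
val filled: List[Item] = items.zip(defaults).map { case (item, default) =>
  item.copy(price = item.price.orElse(Some(Price(default))))
}
```

Note that zip truncates to the shorter of the two lists, so items without a matching default would be dropped rather than left unchanged.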