Parsing bad JSON in Scala

I'm trying to parse some problematic JSON in Scala using Play JSON and an implicit reader, but I'm not sure how to proceed...
The Json looks like this:
"rules": {
"Some_random_text": {
"item_1": "Some_random_text",
"item_2": "text",
"item_n": "MoreText",
"disabled": false,
"Other_Item": "thing",
"score": 1
},
"Some_other_text": {
"item_1": "Some_random_text",
"item_2": "text",
"item_n": "MoreText",
"disabled": false,
"Other_Item": "thing",
"score": 1
},
"Some_more_text": {
"item_1": "Some_random_text",
"item_2": "text",
"item_n": "MoreText",
"disabled": false,
"Other_Item": "thing",
"score": 1
}
}
I'm using an implicit reader but because each top level item in rules is effectively a different thing I don't know how to address that...
I'm trying to build a case class and I don't actually need the random text heading for each item but I do need each item.
To make my life even harder, these items are followed by lots of things in other formats which I really don't need. They are unnamed items which just start:
{
random legal Json...
},
{
more Json...
}
I need to end up with the Json I'm parsing in a seq of case classes.
Thanks for your thoughts.

I'm using an implicit reader but because each top level item in rules is effectively a different thing I don't know how to address that...
Play JSON readers depend on knowing names of fields in advance. That goes for manually constructed readers and also for macro generated readers. You cannot use an implicit reader in this case. You need to do some traversing first and extract pieces of Json that do have regular structure with known names and types of fields. E.g. like this:
import play.api.libs.json._
case class Item(item_1: String, item_2: String, item_n: String, disabled: Boolean, Other_Item: String, score: Int)
implicit val itemReader: Reads[Item] = Json.reads[Item]
def main(args: Array[String]): Unit = {
// parse JSON text and assume, that there is a JSON object under the "rules" field
val rules: JsObject = Json.parse(jsonText).asInstanceOf[JsObject]("rules").asInstanceOf[JsObject]
// traverse all fields, filter according to field name, collect values
val itemResults = rules.fields.collect {
case (heading, jsValue) if heading.startsWith("Some_") => Json.fromJson[Item](jsValue) // use implicit reader here
}
// silently ignore read errors and just collect successfully read items
val items = itemResults.flatMap(_.asOpt)
items.foreach(println)
}
Prints:
Item(Some_random_text,text,MoreText,false,thing,1)
Item(Some_random_text,text,MoreText,false,thing,1)
Item(Some_random_text,text,MoreText,false,thing,1)
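If silently dropping unreadable entries is too lenient, a variation on the same traversal can keep the headings that failed to parse. This is only a minimal sketch: the names RulesDemo and readRules are made up for illustration, and it assumes the "rules" fragment is wrapped in an enclosing JSON object.

```scala
import play.api.libs.json._

final case class Item(item_1: String, item_2: String, item_n: String,
                      disabled: Boolean, Other_Item: String, score: Int)

// RulesDemo and readRules are hypothetical names used only for this sketch.
object RulesDemo {
  implicit val itemReads: Reads[Item] = Json.reads[Item]

  // Returns (headings that could not be read as an Item, successfully read items).
  def readRules(jsonText: String): (Seq[String], Seq[Item]) = {
    // Assumes the document is an object with a "rules" object inside it.
    val rules = (Json.parse(jsonText) \ "rules").as[JsObject]
    val results: Seq[Either[String, Item]] =
      rules.fields.toSeq.map { case (heading, value) =>
        value.validate[Item] match {
          case JsSuccess(item, _) => Right(item)
          case JsError(_)         => Left(heading) // remember which entry failed
        }
      }
    (results.collect { case Left(h) => h }, results.collect { case Right(i) => i })
  }
}
```

The random heading is only used to report failures, so the resulting Seq[Item] still ignores it as requested.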

Related

Is it possible to use something like JsonPath with kotlin JSON parsing

I have a JSON structure that I need to (sort of) flatten when deserializing it into an object. Some of the elements are at the top level and some are in a sub field. In addition, one of the fields is an array of space-delimited strings that I need to parse and represent as myString.split(" ")[0]
So, short of a when expression to do the job, can I use something like a jsonpath query to bind to certain fields? I have thought of even doing some kind of 2-pass binding and then merging both instances.
{
"key": "FA-207542",
"fields": {
"customfield_10443": {
"value": "TBD"
},
"customfield_13600": 45,
"customfield_10900": {
"value": "Monitoring/Alerting"
},
"customfield_10471": [
"3-30536161871 (SM-2046076)"
],
"issuetype": {
"name": "Problem Mgmt - Corrective Action"
},
"created": "2022-08-11T04:46:44.000+0000",
"updated": "2022-11-08T22:11:23.000+0000",
"summary": "FA | EJWL-DEV3| ORA-00020: maximum number of processes (10000) exceeded",
"assignee": null
}
}
And, here's the data object I'd like to bind to. I have represented what they should be as jq expressions.
@Serializable
data class MajorIncident constructor(
@SerialName("key")
val id: String, // .key
val created: Instant, // .fields.created
val pillar: String, // .fields.customfield_10443.value
val impactBreadth: String?,
val duration: Duration, // .fields.customfield_13600 as minutes
val detectionSource: String, //.fields.customfield_10900.value
val updated: Instant, // .fields.updated
val assignee: String, // .fields.assignee
// "customfield_10471": [
// "3-30536161871 (SM-2046076)"
// ],
val serviceRequests: List<String>?, // .fields.customfield_10471 | map(split(" ")[0]) -
@SerialName("summary")
val title: String, //.summary
val type: String, // .fields.issuetype.name // what are options?
)
If you're using Kotlinx Serialization, I'm not sure there is any built-in support for jsonpath.
One simple option is to declare your Kotlin model in a way that matches the JSON. If you really want a flattened object, you could convert from the structured model into the flat model from Kotlin.
Another option is to write a custom serializer for your type.

How to use HCursor or Optics, as part of Circe-Json, to return a List of matching Objects?

I have code which looks roughly like this:
val json: Json = parse("""
[
{
"id": 1,
"type": "Contacts",
"admin": false,
"cookies": 3
},
{
"id": 2,
"type": "Apples",
"admin": false,
"cookies": 6
},
{
"id": 3,
"type": "Contacts",
"admin": true,
"cookies": 19
}
]
""").getOrElse(Json.Null)
I'm using Circe, Cats, Scala, Circe-json, and so on, and the Parse call succeeds.
I want to return a List in which each top-level Object where type="Contacts" is shown in its entirety.
Something like:
List[String] = ["{"id": 1,"type": "Contacts","admin": false,"cookies": 3}","{"id": 3,"type": "Contacts","admin": true,"cookies": 19}"]
The background is that I have large JSON files on disk. I need to filter out the subset of objects that match a certain type= value, in this case, type=Contacts, and then split these out from the rest of the json file. I'm not looking to modify the file, I'm more looking to grep for matching objects and process them accordingly.
Thank you.
The most straightforward way to accomplish this kind of thing is to decode the document into either a List[Json] or List[JsonObject] value. For example, given your definition of json:
import io.circe.JsonObject
val Right(docs) = json.as[List[JsonObject]]
And then you can query based on the type:
scala> val contacts = docs.filter(_("type").contains(Json.fromString("Contacts")))
contacts: List[io.circe.JsonObject] = List(object[id -> 1,type -> "Contacts",admin -> false,cookies -> 3], object[id -> 3,type -> "Contacts",admin -> true,cookies -> 19])
scala> contacts.map(Json.fromJsonObject).map(_.noSpaces).foreach(println)
{"id":1,"type":"Contacts","admin":false,"cookies":3}
{"id":3,"type":"Contacts","admin":true,"cookies":19}
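Note that the pattern `val Right(docs) = ...` throws a MatchError if decoding fails. A variant that keeps the failure as a value might look like the sketch below. The names ContactsFilter and objectsOfType are invented for this illustration; the circe calls are the same ones used above.

```scala
import io.circe.{Error, Json, JsonObject}
import io.circe.parser.parse

// ContactsFilter and objectsOfType are hypothetical names for this sketch.
object ContactsFilter {
  // Parses a JSON array and returns the compact text of every object whose
  // "type" field equals `wanted`; parse and decode errors end up in the Left.
  def objectsOfType(text: String, wanted: String): Either[Error, List[String]] =
    for {
      json <- parse(text)
      docs <- json.as[List[JsonObject]]
    } yield docs
      .filter(_("type").contains(Json.fromString(wanted)))
      .map(obj => Json.fromJsonObject(obj).noSpaces)
}
```

Calling ContactsFilter.objectsOfType(text, "Contacts") on the array from the question should yield a Right holding the two Contacts objects as strings.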
Given your use case, circe-optics seems unlikely to be a good fit (see my answer here for some discussion of why filtering with arbitrary predicates is awkward with Monocle's Traversal).
It may be worth looking into circe-fs2 or circe-iteratee, though, if you're interested in parsing and filtering large JSON files without loading the entire contents of the file into memory. In both cases the principle would be the same as in the List[JsonObject] code just above—you decode your big JSON array into a stream of JsonObject values, which you can query however you want.

How to avoid definition of implicit reads/writes for the sub-classes for JSON Macro Inception (to convert nested JSON structure to Scala)

I have a requirement where the incoming JSON object is complex and mostly nested ex:
"users": {
"utype": "PERSON",
"language":"en_FR",
"credentials": [
{
"handle": "xyz@abc.com",
"password": "123456",
"handle_type": "EMAIL"
}
],
"person_details": {
"primary": "true",
"names": [
{
"name_type": "OFFICIAL",
"title": "MR",
"given": "abc",
"family": "zat",
"middle": "pqs",
"suffix":"anathan"
}
],
"addresses": [
{
"ad_type": "HOME",
"line1": "Residential 2211 North 1st Street",
"line2": "Bldg 17",
"city": "test",
"county": "Shefield",
"state" : "NY",
"country_code": "xx",
"postal_code": "95131"
}
]
}
}
For parsing this structure I use the below Case Classes
case class PersonUser (
user_type:String,
language_code:String,
credentials:List[Credential],
person_details:PersonDetails
)
case class Credential(handle:String, password:String,handle_type:String)
case class PersonDetails(
primary_user:Boolean,
names: List[Name],
addresses:List[Address]
)
case class Name(
name_type: String,
title: String,
given: String,
family: String,
middle: String,
suffix:String
)
case class Address(
address_type: String,
line1: String,
line2: String,
city: String,
county: String,
state : String,
country_code: String,
postal_code: String
)
To convert the JSON structure to Scala I used JSON Inception:
implicit val testReads = Json.reads[PersonUser]
I also had to specify similar implicit reads for the sub-classes: Credential, PersonDetails, Name and Address. One such instance is given below:
case class Credential(handle:String, password:String,handle_type:String)
object Credential{
implicit val reads = Json.reads[Credential]
}
Now comes the question, if my JSON structure is really big with lots of sub-structures, there will be a number of Scala case classes I need to define. It will be really cumbersome to define companion objects and implicit read for each of the case classes (Ex: if I have 8 case classes to represent the JSON structure fully, I will have to define 8 more companion objects). Is there any way to avoid this extra work?
This question is already answered but I thought I'd explain why it is the way it is.
If the macro generated the necessary reads for nested case classes, it would be impossible to define your own custom reads for those nested classes. As a consequence, the macro would only be useful for the most trivial cases: defining custom behaviour for one deeply nested case class would force you to switch to a manual implementation of reads for the entire hierarchy. The way the macro is implemented now, you can easily change any part of it.
Yes, there is a small boilerplate overhead. On the plus side, the compiler tells you whether your structure can be deserialised, so you catch errors early and refactoring is safer. It also means less implied magic that you can only understand by reading the docs. By strongly typing everything, there is no magic, no surprises, no implied knowledge, and this leads to improved maintainability.
No, there is no way to avoid defining a Format instance for each class you want to (de)serialize.
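The instances do not have to live in one companion object per class, though. As a rough sketch (Account is a cut-down, hypothetical stand-in for the question's PersonUser, and JsonFormats is an invented name), all of them can sit in a single object at one line each, declared so that nested types come first:

```scala
import play.api.libs.json._

final case class Credential(handle: String, password: String, handle_type: String)
// Account is a hypothetical, reduced version of the question's top-level class.
final case class Account(language: String, credentials: List[Credential])

// JsonFormats is an invented name; one object holds every Reads instance.
object JsonFormats {
  // Nested types first, so the macro for Account finds Reads[Credential].
  implicit val credentialReads: Reads[Credential] = Json.reads[Credential]
  implicit val accountReads: Reads[Account] = Json.reads[Account]

  def parseAccount(text: String): Option[Account] =
    Json.parse(text).validate[Account].asOpt
}
```

Code elsewhere would then bring the instances into scope with import JsonFormats._ before calling validate or as.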
You can also have a look at other libs, at least Genson doesn't require you to define tons of implicits. The code you showed above should work with Genson by default.
import com.owlike.genson.defaultGenson._
val personUser: PersonUser = fromJson[PersonUser](json)
val jsonOut = toJson(personUser)
Genson has many other features; I'll let you judge for yourself.

Parse JSON array using Scala Argonaut

I'm using Scala & Argonaut, trying to parse the following JSON:
[
{
"name": "apple",
"type": "fruit",
"size": 3
},
{
"name": "jam",
"type": "condiment",
"size": 5
},
{
"name": "beef",
"type": "meat",
"size": 1
}
]
I'm struggling to work out how to iterate and extract the values into a List[MyType], where MyType will have name, type and size properties.
I will post more specific code soon (I have tried many things), but basically I'm looking to understand how the cursor works, and how to iterate through arrays etc. I have tried using \\ (downArray) to move to the head of the array, then :->- to iterate through the array, but then --\ (downField) is not available (at least IntelliJ doesn't think so).
So the question is how do I:
navigate to the array
iterate through the array (and know when I'm done)
extract string, integer etc. values for each field - jdecode[String]? as[String]?
The easiest way to do this is to define a codec for MyType. The compiler will then happily construct a decoder for List[MyType], etc. I'll use a plain class here (not a case class) to make it clear what's happening:
class MyType(val name: String, val tpe: String, val size: Int)
import argonaut._, Argonaut._
implicit def MyTypeCodec: CodecJson[MyType] = codec3(
(name: String, tpe: String, size: Int) => new MyType(name, tpe, size),
(myType: MyType) => (myType.name, myType.tpe, myType.size)
)("name", "type", "size")
codec3 takes two parameter lists. The first has two parameters, which allow you to tell how to create an instance of MyType from a Tuple3 and vice versa. The second parameter list lets you specify the names of the fields.
Now you can just write something like the following (if json is your string):
Parse.decodeValidation[List[MyType]](json)
And you're done.
Since you don't need to encode and are only looking at decoding, you can do as suggested by Travis, but by implementing another implicit: MyTypeDecodeJson
implicit def MyTypeDecodeJson: DecodeJson[MyType] = DecodeJson(
raw => for {
name <- raw.get[String]("name")
tpe <- raw.get[String]("type") // `type` is a reserved word in Scala, so bind to tpe
size <- raw.get[Int]("size")
} yield new MyType(name, tpe, size))
Then to parse your list:
Parse.decodeValidation[List[MyType]](jsonString)
Assuming MyType is a case class, the following works too:
case class MyType(name: String, tpe: String, size: Int) // `type` is a reserved word, so the field is named tpe
object MyType {
implicit val createCodecJson: CodecJson[MyType] = CodecJson.casecodec3(apply, unapply)(
"name",
"type",
"size"
)
}
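For the cursor part of the question, a decoder can also be written with the --\ (downField) operator directly. The following is a minimal sketch rather than the one right way to do it; it assumes MyType is a case class with the field renamed to tpe, and it uses Parse.decodeOption so the result is a plain Option.

```scala
import argonaut._, Argonaut._

final case class MyType(name: String, tpe: String, size: Int)

object MyType {
  // For each array element the decoder receives an HCursor positioned on the object;
  // --\ moves down into a named field and as[A] decodes the value found there.
  implicit val decodeMyType: DecodeJson[MyType] = DecodeJson(c =>
    for {
      name <- (c --\ "name").as[String]
      tpe  <- (c --\ "type").as[String]
      size <- (c --\ "size").as[Int]
    } yield MyType(name, tpe, size))
}

object ArgonautDemo {
  // Decoding List[MyType] walks the array for you, so no manual iteration is needed.
  def decodeAll(text: String): Option[List[MyType]] =
    Parse.decodeOption[List[MyType]](text)
}
```

With a decoder for the element type in scope, the list decoder handles navigating to the array and knowing when it is done.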

Scala/Play: JSON serialization issue

I have a simple custom data structure which I use to map the results from the database:
case class Filter(id: Int, table: String, name: String, Type: String, structure: String)
The resulting object type is List[Filter] and if converted to JSON, it should look something like this:
[
{
"id": 1,
"table": "table1",
"name": "name1",
"Type": "type1",
"structure": "structure1"
},
{
"id": 2,
"table": "table2",
"name": "name2",
"Type": "type2",
"structure": "structure2"
}
]
Now when I try to serialize my object into JSON
val result: String = Json.toJson(filters)
I am getting something like
No Json deserializer found for type List[Filter]. Try to implement an implicit Writes or Format for this type.
How do I solve this seemingly simple problem without writing some ridiculous amount of boilerplate?
My stack is Play 2.2.1, Scala 2.10.3, Java 8 64bit
Short answer:
Just add:
implicit val filterWrites = Json.writes[Filter]
Longer answer:
If you look at the definition of Json.toJson, you will see that its complete signature is:
def toJson[T](o: T)(implicit tjs: Writes[T]): JsValue = tjs.writes(o)
Writes[T] knows how to take a T and transform it to a JsValue. You will need to have an implicit Writes[Filter] around that knows how to serialize your Filter instance. The good news is that Play's JSON library comes with a macro that can instantiate those Writes[_] for you, so you don't have to write boring code that transforms your case class's fields into JSON values. To invoke this macro and have its value picked up by implicit search add the line above to your scope.
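Put together, a small self-contained sketch could look like this. FilterJson and render are names invented for the example, and Json.stringify is used because toJson returns a JsValue rather than a String:

```scala
import play.api.libs.json._

final case class Filter(id: Int, table: String, name: String, Type: String, structure: String)

// FilterJson and render are hypothetical names for this sketch.
object FilterJson {
  // Macro-generated Writes; the Writes for List[Filter] is derived from it.
  implicit val filterWrites: Writes[Filter] = Json.writes[Filter]

  // toJson produces a JsValue, stringify turns it into compact JSON text.
  def render(filters: List[Filter]): String =
    Json.stringify(Json.toJson(filters))
}
```

Json.prettyPrint can be used in place of Json.stringify if indented output like the sample in the question is wanted.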