Alternative to parsing json with option [ either [A,B ] ] in scala - json

For example, here payload is optional and it has 3 variants:
How can I parse the json with types like option[either[A,B,C]] but to use abstract data type using things sealed trait or sum type?
Below is a minimal example with some boiler plate:
https://scalafiddle.io/sf/K6RUWqk/1
// Start writing your ScalaFiddle code here
val json =
"""[
{
"id": 1,
"payload" : "data"
},
{
"id": 2.1,
"payload" : {
"field1" : "field1",
"field2" : 5,
"field3" : true
}
},
{
"id": 2.2,
"payload" : {
"field1" : "field1",
}
},
{
"id": 3,
payload" : 4
},
{
"id":4,
"
}
]"""
final case class Data(field1: String, field2: Option[Int])
type Payload = Either[String, Data]
final case class Record(id: Int, payload: Option[Payload])
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
implicit final val dataDecoder: Decoder[Data] = deriveDecoder
implicit final val payloadDecoder: Decoder[Payload] = Decoder[String] either Decoder[Data]
implicit final val recordDecoder: Decoder[Record] = deriveDecoder
val result = io.circe.parser.decode[List[Record]](json)
println(result)

Your code is almost fine, you have just syntax issues in your json and Record.id should be Double instead of Int - because it is how this field present in your json ("id": 2.1). Please, find fixed version below:
val json =
s"""[
{
"id": 1,
"payload" : "data"
},
{
"id": 2.1,
"payload" : {
"field1" : "field1",
"field2" : 5,
"field3" : true
}
},
{
"id": 2.2,
"payload" : {
"field1" : "field1"
}
},
{
"id": 3,
"payload" : 4
},
{
"id": 4
}
]"""
type Payload = Either[String, Data]
final case class Data(field1: String, field2: Option[Int])
final case class Record(id: Double, payload: Option[Payload]) // id is a Double in your json in some cases
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
implicit val dataDecoder: Decoder[Data] = deriveDecoder
implicit val payloadDecoder: Decoder[Payload] = Decoder[String] either Decoder[Data]
implicit val recordDecoder: Decoder[Record] = deriveDecoder
val result = io.circe.parser.decode[List[Record]](json)
println(result)
Which produced in my case:
Right(List(Record(1.0,Some(Left(data))), Record(2.1,Some(Right(Data(field1,Some(5))))), Record(2.2,Some(Right(Data(field1,None)))), Record(3.0,Some(Left(4))), Record(4.0,None)))
UPDATE:
The more general approach would be to use so-called Sum Types or in simple words - general sealed trait with several different implementations. Please, see for more details next Circe doc page : https://circe.github.io/circe/codecs/adt.html
In your case it can be achieved something like this:
import cats.syntax.functor._
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
sealed trait Payload
object Payload {
implicit val decoder: Decoder[Payload] = {
List[Decoder[Payload]](
Decoder[StringPayload].widen,
Decoder[IntPayload].widen,
Decoder[ObjectPayload].widen
).reduce(_ or _)
}
}
case class StringPayload(value: String) extends Payload
object StringPayload {
implicit val decoder: Decoder[StringPayload] = Decoder[String].map(StringPayload.apply)
}
case class IntPayload(value: Int) extends Payload
object IntPayload {
implicit val decoder: Decoder[IntPayload] = Decoder[Int].map(IntPayload.apply)
}
case class ObjectPayload(field1: String, field2: Option[Int]) extends Payload
object ObjectPayload {
implicit val decoder: Decoder[ObjectPayload] = deriveDecoder
}
final case class Record(id: Double, payload: Option[Payload])
object Record {
implicit val decoder: Decoder[Record] = deriveDecoder
}
def main(args: Array[String]): Unit = {
val json =
s"""[
{
"id": 1,
"payload" : "data"
},
{
"id": 2.1,
"payload" : {
"field1" : "field1",
"field2" : 5,
"field3" : true
}
},
{
"id": 2.2,
"payload" : {
"field1" : "field1"
}
},
{
"id": 3,
"payload" : "4"
},
{
"id": 4
}
]"""
val result = io.circe.parser.decode[List[Record]](json)
println(result)
}
which produced in my case next output:
Right(List(Record(1.0,Some(StringPayload(data))), Record(2.1,Some(ObjectPayload(field1,Some(5)))), Record(2.2,Some(ObjectPayload(field1,None))), Record(3.0,Some(StringPayload(4))), Record(4.0,None)))
Hope this helps!

Related

kotlinx deserialization: different types && scalar && arrays

I'm trying to deserialize a JSON like this (much more complex, but this is the essential part):
[
{
"field": "field1",
"value": [1000, 2000]
},
{
"field": "field2",
"value": 1
},
{
"field": "field2",
"value":["strval2","strval3"]
},
{
"field": "field4",
"value": "strval1"
}
]
I've tried to figure out how to use JsonContentPolymorphicSerializer in different variants but it all ends up the same:
class java.util.ArrayList cannot be cast to class myorg.ConditionValue (java.util.ArrayList is in module java.base of loader 'bootstrap'; myorg.ConditionValue is in unnamed module of loader 'app')
#Serializable
sealed class ConditionValue
#Serializable(with = StringValueSerializer::class)
data class StringValue(val value: String) : ConditionValue()
#Serializable(with = StringListValueSerializer::class)
data class StringListValue(val value: List<StringValue>) : ConditionValue()
object ConditionSerializer : JsonContentPolymorphicSerializer<Any>(Any::class) {
override fun selectDeserializer(element: JsonElement) = when (element) {
is JsonPrimitive -> StringValueSerializer
is JsonArray -> ListSerializer(StringValueSerializer)
else -> StringValueSerializer
}
}
object StringValueSerializer : KSerializer<StringValue> {
override val descriptor: SerialDescriptor = buildClassSerialDescriptor("StringValue")
override fun deserialize(decoder: Decoder): StringValue {
require(decoder is JsonDecoder)
val element = decoder.decodeJsonElement()
return StringValue(element.jsonPrimitive.content)
}
override fun serialize(encoder: Encoder, value: StringValue) {
encoder.encodeString(value.value)
}
}
What am I missing? And how to approach it?
This is indeed a difficult problem.
Probably the quickest and clearest way is to avoid getting bogged down with the 'correct' Kotlinx Serializer way and just decode the polymorphic type to a JsonElement
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
#Serializable
data class MyData(
val field: String,
val value: JsonElement, // polymorphism is hard, JsonElement is easy
)
The following code produces the correct output
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.json.Json
fun main() {
val json = /*language=json*/ """
[
{
"field": "field1",
"value": [1000, 2000]
},
{
"field": "field2",
"value": 1
},
{
"field": "field2",
"value":["strval2","strval3"]
},
{
"field": "field4",
"value": "strval1"
}
]
""".trimIndent()
val result = Json.decodeFromString<List<MyData>>(json)
println(result)
}
[
MyData(field=field1, value=[1000,2000]),
MyData(field=field2, value=1),
MyData(field=field2, value=["strval2","strval3"]),
MyData(field=field4, value="strval1")
]
Now you can manually convert MyData to a more correct instance.
val converted = results.map { result ->
val convertedValue: ConditionValue = when (val value = result.value) {
is JsonPrimitive -> convertPrimitive(value)
is JsonArray -> convertJsonArray(value)
else -> error("cannot convert $value")
}
MyDataConverted(
field = result.field,
value = convertedValue
)
}
...
fun convertJsonArray(array: JsonArray): ConditionValueList<*> =
TODO()
fun convertPrimitive(primitive: JsonPrimitive): ConditionValuePrimitive =
TODO()
As a final note, I can recommend using inline classes to represent your values. If you do want to work with Kotlinx Serialization, then they work better than creating custom serializers for primitive types.
Here's how I'd model the data in your example:
sealed interface ConditionValue
sealed interface ConditionValuePrimitive : ConditionValue
sealed interface ConditionValueCollection<T : ConditionValuePrimitive> : ConditionValue
#JvmInline
value class StringValue(val value: String) : ConditionValuePrimitive
#JvmInline
value class IntegerValue(val value: Int) : ConditionValuePrimitive
#JvmInline
value class ConditionValueList<T : ConditionValuePrimitive>(
val value: List<T>
) : ConditionValueCollection<T>
data class MyDataConverted(
val field: String,
val value: ConditionValue,
)
Versions:
Kotlin 1.7.10
Kotlinx Serialization 1.3.3

Convert List to Json

How do I create a Json (Circe) looking like this:
{
"items": [{
"field1": "somevalue",
"field2": "somevalue2"
},
{
"field1": "abc",
"field2": "123abc"
}]
}
val result = Json.fromFields(List("items" -> ???))
You can do so using Circe's built in list typeclasses for encoding JSON. This code will work:
import io.circe.{Encoder, Json}
import io.circe.syntax._
case class Field(field1: String, field2: String)
object Field {
implicit val encodeFoo: Encoder[Field] = new Encoder[Field] {
final def apply(a: Field): Json = Json.obj(
("field1", Json.fromString(a.field1)),
("field2", Json.fromString(a.field2))
)
}
}
class Encoding(items: List[Field]) {
def getJson: Json = {
Json.obj(
(
"items",
items.asJson
)
)
}
}
If we instantiate an "Encoding" class and call getJson it will give you back the desired JSON. It works because with circe all you need to do to encode a list is provide an encoder for whatever is inside the list. Thus, if we provide an encoder for Field it will encode it inside a list when we call asJson on it.
If we run this:
val items = new Encoding(List(Field("jf", "fj"), Field("jfl", "fjl")))
println(items.getJson)
we get:
{
"items" : [
{
"field1" : "jf",
"field2" : "fj"
},
{
"field1" : "jfl",
"field2" : "fjl"
}
]
}

Parsing a file having content as Json format in Scala

I want to parse a file having content as json format.
From the file I want to extract few properties (name, DataType, Nullable) to create some column names dynamically.
I have gone through some examples but most of them are using case class but my problem is every time I will receive a file may have different content.
I tried to use the ujson library to parse the file but I am unable to understand how to use it properly.
object JsonTest {
def main(args: Array[String]): Unit = {
val source = scala.io.Source.fromFile("C:\\Users\\ktngme\\Desktop\\ass\\file.txt")
println(source)
val input = try source.mkString finally source.close()
println(input)
val data = ujson.read(input)
data("name") = data("name").str.reverse
val updated = data.render()
}
}
Content of the file example:
{
"Organization": {
"project": {
"name": "POC 4PL",
"description": "Implementation of orderbook"
},
"Entities": [
{
"name": "Shipments",
"Type": "Fact",
"Attributes": [
{
"name": "Shipment_Details",
"DataType": "StringType",
"Nullable": "true"
},
{
"name": "Shipment_ID",
"DataType": "StringType",
"Nullable": "true"
},
{
"name": "View_Cost",
"DataType": "StringType",
"Nullable": "true"
}
],
"ADLS_Location": "/mnt/mns/adls/raw/poc/orderbook/"
}
]
}
}
Expected output:
StructType(
Array(StructField("Shipment_Details",StringType,true),
StructField("Shipment_ID",DateType,true),
StructField("View_Cost",DateType,true)))
StructType needs to be added to the expected output programatically.
Try Using Playframework's Json utils - https://www.playframework.com/documentation/2.7.x/ScalaJson
Here's the solution to your issue-
\ Placed your json in text file
val fil_path = "C:\\TestData\\Config\\Conf.txt"
val conf_source = scala.io.Source.fromFile(fil_path)
lazy val json_str = try conf_source.mkString finally conf_source.close()
val conf_json: JsValue = Json.parse(json_str)
val all_entities: JsArray = (conf_json \ "Organization" \ "Entities").get.asInstanceOf[JsArray]
val shipments: JsValue = all_entities.value.filter(e => e.\("name").as[String] == "Shipments").head
val shipments_attributes: IndexedSeq[JsValue] = shipments.\("Attributes").get.asInstanceOf[JsArray].value
val shipments_schema: StructType = StructType(shipments_attributes.map(a => Tuple3(a.\("name").as[String], a.\("DataType").as[String], a.\("Nullable").as[String]))
.map(x => StructField(x._1, StrtoDatatype(x._2), x._3.toBoolean)))
shipments_schema.fields.foreach(println)
Output is -
StructField(Shipment_Details,StringType,true)
StructField(Shipment_ID,StringType,true)
StructField(View_Cost,StringType,true)
It depends if you want it to be completely dynamic or not, here are some options:
If you just want to read one field you can do:
import upickle.default._
val source = scala.io.Source.fromFile("C:\\Users\\ktngme\\Desktop\\ass\\file.txt")
val input = try source.mkString finally source.close()
val json = ujson.read(input)
println(json("Organization")("project")("name"))
the output will be: "POC 4PL"
If you just want just the Attributes to be with types, you can do:
import upickle.default.{macroRW, ReadWriter => RW}
import upickle.default._
val source = scala.io.Source.fromFile("C:\\Users\\ktngme\\Desktop\\ass\\file.txt")
val input = try source.mkString finally source.close()
val json = ujson.read(input)
val entitiesArray = json("Organization")("Entities")(0)("Attributes")
println(read[Seq[StructField]](entitiesArray))
case class StructField(name: String, DataType: String, Nullable: String)
object StructField{
implicit val rw: RW[StructField] = macroRW
}
the output will be: List(StructField(Shipment_Details,StringType,true), StructField(Shipment_ID,StringType,true), StructField(View_Cost,StringType,true))
another option, is to use a different library to do the class mapping. If you use Google Protobuf Struct and JsonFormat it can be 2-liner:
import com.google.protobuf.Struct
import com.google.protobuf.util.JsonFormat
val source = scala.io.Source.fromFile("C:\\Users\\ktngme\\Desktop\\ass\\file.txt")
val input = try source.mkString finally source.close()
JsonFormat.parser().merge(input, builder)
println(builder.build())
the output will be: fields { key: "Organization" value { struct_value { fields { key: "project" value { struct_value { fields { key: "name" value { string_value: "POC 4PL" } } fields { key: "description" value { string_value: "Implementation of orderbook" } } } } } fields { key: "Entities" value { list_value { values { struct_value { fields { key: "name" value { string_value: "Shipments" } }...

Argonaut: decoding a polymorphic array

The JSON object for which I'm trying to write a DecodeJson[T] contains an array of different "types" (meaning the JSON structure of its elements is varying). The only common feature is the type field which can be used to distinguish between the types. All other fields are different. Example:
{
...,
array: [
{ type: "a", a1: ..., a2: ...},
{ type: "b", b1: ...},
{ type: "c", c1: ..., c2: ..., c3: ...},
{ type: "a", a1: ..., a2: ...},
...
],
...
}
Using argonaut, is it possible to map the JSON array to a Scala Seq[Element] where Element is a supertype of suitable case classes of type ElementA, ElementB and so on?
I did the same thing with play-json and it was quite easy (basically a Reads[Element] that evaluates the type field and accordingly forwards to more specific Reads). However, I couldn't find a way to do this with argonaut.
edit: example
Scala types (I wish to use):
case class Container(id: Int, events: List[Event])
sealed trait Event
case class Birthday(name: String, age: Int) extends Event
case class Appointment(start: Long, participants: List[String]) extends Event
case class ... extends Event
JSON instance (not under my control):
{
"id":1674,
"events": {
"data": [
{
"type": "birthday",
"name": "Jones Clinton",
"age": 34
},
{
"type": "appointment",
"start": 1675156665555,
"participants": [
"John Doe",
"Jane Doe",
"Foo Bar"
]
}
]
}
}
You can create a small function to help you build a decoder that handles this format.
See below for an example.
import argonaut._, Argonaut._
def decodeByType[T](encoders: (String, DecodeJson[_ <: T])*) = {
val encMap = encoders.toMap
def decoder(h: CursorHistory, s: String) =
encMap.get(s).fold(DecodeResult.fail[DecodeJson[_ <: T]](s"Unknown type: $s", h))(d => DecodeResult.ok(d))
DecodeJson[T] { c: HCursor =>
val tf = c.downField("type")
for {
tv <- tf.as[String]
dec <- decoder(tf.history, tv)
data <- dec(c).map[T](identity)
} yield data
}
}
case class Container(id: Int, events: ContainerData)
case class ContainerData(data: List[Event])
sealed trait Event
case class Birthday(name: String, age: Int) extends Event
case class Appointment(start: Long, participants: List[String]) extends Event
implicit val eventDecoder: DecodeJson[Event] = decodeByType[Event](
"birthday" -> DecodeJson.derive[Birthday],
"appointment" -> DecodeJson.derive[Appointment]
)
implicit val containerDataDecoder: DecodeJson[ContainerData] = DecodeJson.derive[ContainerData]
implicit val containerDecoder: DecodeJson[Container] = DecodeJson.derive[Container]
val goodJsonStr =
"""
{
"id":1674,
"events": {
"data": [
{
"type": "birthday",
"name": "Jones Clinton",
"age": 34
},
{
"type": "appointment",
"start": 1675156665555,
"participants": [
"John Doe",
"Jane Doe",
"Foo Bar"
]
}
]
}
}
"""
def main(args: Array[String]) = {
println(goodJsonStr.decode[Container])
// \/-(Container(1674,ContainerData(List(Birthday(Jones Clinton,34), Appointment(1675156665555,List(John Doe, Jane Doe, Foo Bar))))))
}

How to deserialize json with json-lenses

What is the best practice to deserialize JSON to a Scala case class using json-lenses?
some.json :
[
{
"id": 1,
"name": "Alice"
},
{
"id": 2,
"name": "Bob"
},
{
"id": 3,
"name": "Chris"
}
]
some case class :
case class Foo(id: Long, name: String)
What's best way to convert the json in some.json to List[Foo] ?
json-lenses supports spray-json and with spray-json you could do:
import spray.json._
case class Foo(id: Long, name: String)
object JsonProtocol extends DefaultJsonProtocol {
implicit val FooFormat = jsonFormat2(Foo)
}
import JsonProtocol._
val source = scala.io.Source.fromFile("some.json")
val json = try source.mkString.parseJson finally source.close()
json.convertTo[List[Foo]]
// List[Foo] = List(Foo(1,Alice), Foo(2,Bob), Foo(3,Chris))