Argonaut: Generic method to encode/decode array of objects - json

I am trying to implement a generic pattern with which to generate marshallers and unmarshallers for an Akka HTTP REST service using Argonaut, handling both entity and collection level requests and responses. I have no issues in implementing the entity level as such:
case class Foo(foo: String)
object Foo {
  implicit val FooJsonCodec = CodecJson.derive[Foo]
  implicit val EntityEncodeJson = FooJsonCodec.Encoder
  implicit val EntityDecodeJson = FooJsonCodec.Decoder
}
I am running into issues attempting to provide encoders and decoders for the following:
[
  { "foo": "1" },
  { "foo": "2" }
]
I have attempted adding the following to my companion:
object Foo {
  implicit val FooCollectionJsonCodec = CodecJson.derive[HashSet[Foo]]
}
However, I am receiving the following error:
Error:(33, 90) value jencode0L is not a member of object argonaut.EncodeJson
I see this method truly does not exist, but is there any other generic method to generate my expected result? I am trying hard to avoid an additional case class to describe the collection, since I rely heavily on reflection in my use case.
At this point, I'd even be fine with a manually constructed Encoder and Decoder; however, I've found no documentation on how to construct them for the expected structure.

Argonaut has predefined encoders and decoders for Scala's immutable lists, sets, streams and vectors. If your type is not supported explicitly, as is the case for java.util.HashSet, you can easily add an EncodeJson and a DecodeJson for the type:
import argonaut._, Argonaut._
import scala.collection.JavaConverters._
implicit def hashSetEncode[A](
  implicit element: EncodeJson[A]
): EncodeJson[java.util.HashSet[A]] =
  EncodeJson(set => EncodeJson.SetEncodeJson[A].apply(set.asScala.toSet))
implicit def hashSetDecode[A](
  implicit element: DecodeJson[A]
): DecodeJson[java.util.HashSet[A]] =
  DecodeJson(cursor => DecodeJson.SetDecodeJson[A]
    .apply(cursor)
    .map(set => new java.util.HashSet(set.asJava)))
// Usage:
val set = new java.util.HashSet[Int]
set.add(1)
set.add(3)
val jsonSet = set.asJson // [1, 3]
jsonSet.jdecode[java.util.HashSet[Int]] // DecodeResult(Right([1, 3]))
case class A(set: java.util.HashSet[Int])
implicit val codec = CodecJson.derive[A]
val a = A(set)
val jsonA = a.asJson // { "set": [1, 3] }
jsonA.jdecode[A] // DecodeResult(Right(A([1, 3])))
This sample was checked on Scala 2.12.1 and Argonaut 6.2-RC2, but as far as I know it does not depend on any recent changes.
An approach like this works with any linear or unordered homogeneous data structure that you want to represent as a JSON array. It is also preferable to creating a CodecJson: the latter can be inferred automatically from EncodeJson and DecodeJson, but not vice versa. This way, your set will serialize and deserialize both when used on its own and within another data type, as shown in the example.
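For the immutable Scala collections mentioned at the start of this answer, no extra instances should be needed once the element codec exists. The following is a minimal sketch of the array-of-objects case from the question, assuming Scala 2 and Argonaut 6.2 on the classpath. It uses casecodec1 instead of CodecJson.derive, and the CollectionDemo object is a hypothetical name used only for illustration.

```scala
import argonaut._, Argonaut._

final case class Foo(foo: String)

object Foo {
  // Element-level codec; casecodec1 maps the single field to the "foo" key.
  implicit val FooCodecJson: CodecJson[Foo] =
    casecodec1(Foo.apply, Foo.unapply)("foo")
}

// Hypothetical holder object, used only to keep the example self-contained.
object CollectionDemo {
  val foos: List[Foo] = List(Foo("1"), Foo("2"))

  // The List encoder is resolved from the implicit Foo codec.
  val json: String = foos.asJson.nospaces

  // Decoding the JSON array back into a List[Foo].
  val decoded: Option[List[Foo]] = json.decodeOption[List[Foo]]
}
```

A List is used here rather than a Set so that the element order in the output is predictable.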

I don't use Argonaut but I do use spray-json, and I suspect the solution can be similar.
Have you tried something like this?
implicit def HashSetJsonCodec[T : CodecJson] = CodecJson.derive[Set[T]]
If it doesn't work, I'd probably try writing a more verbose implicit function, like:
implicit def SetJsonCodec[T: CodecJson](implicit codec: CodecJson[T]): CodecJson[Set[T]] = {
  CodecJson(
    {
      case value => JArray(value.map(codec.encode).toList)
    },
    c => c.as[JsonArray].flatMap {
      case arr: Json.JsonArray =>
        val items = arr.map(codec.Decoder.decodeJson)
        items.find(_.isError) match {
          case Some(error) => DecodeResult.fail[Set[T]](error.toString(), c.history)
          case None => DecodeResult.ok[Set[T]](items.flatMap(_.value).toSet[T])
        }
    }
  )
}
PS. I didn't test this but hopefully it leads you to the right direction :)

Related

How to create a JSON of a List of List of Any in Scala

The JSON output that I am looking for is
{[[1, 1.5, "String1"], [-2, 2.3, "String2"]]}
So I want to have an Array of Arrays and the inner array is storing different types.
How should I store my variables so I can create such JSON in Scala?
I thought of List of Tuples. However, all the available JSON libraries try to convert a Tuple to a map instead of an Array. I am using json4s library.
Here is a custom serializer for those inner arrays using json4s:
import org.json4s._
class MyTupleSerializer extends CustomSerializer[(Int, Double, String)](format => ({
  case obj: JArray =>
    implicit val formats: Formats = format
    (obj(0).extract[Int], obj(1).extract[Double], obj(2).extract[String])
}, {
  case (i: Int, d: Double, s: String) =>
    JArray(List(JInt(i), JDouble(d), JString(s)))
}))
The custom serialiser converts JArray into a tuple and back again. This will be used wherever the Scala object being read or written has a value of the appropriate tuple type.
To test this against the sample input I have modified it to make it valid JSON by adding a field name:
{"data": [[1, 1.5, "String1"], [-2, 2.3, "String2"]]}
I have defined a container class to match this:
case class MyTupleData(data: Vector[(Int, Double, String)])
The name of the class is not relevant but the field name data must match the JSON field name. This uses Vector rather than Array because Array is really a Java type rather than a Scala type. You can use List if preferred.
import org.json4s.jackson.Serialization.{read, write}
case class MyTupleData(data: Vector[(Int, Double, String)])
object JsonTest extends App {
  val data = """{"data": [[1, 1.5, "String1"], [-2, 2.3, "String2"]]}"""
  implicit val formats: Formats = DefaultFormats + new MyTupleSerializer
  val td: MyTupleData = read[MyTupleData](data)
  println(td) // MyTupleData(Vector((1,1.5,String1), (-2,2.3,String2)))
  println(write(td)) // {"data":[[1,1.5,"String1"],[-2,2.3,"String2"]]}
}
If you prefer to use a custom class for the data rather than a tuple, the code looks like this:
case class MyClass(i: Int, d: Double, s: String)
class MyClassSerializer extends CustomSerializer[MyClass](format => ({
  case obj: JArray =>
    implicit val formats: Formats = format
    MyClass(obj(0).extract[Int], obj(1).extract[Double], obj(2).extract[String])
}, {
  case MyClass(i, d, s) =>
    JArray(List(JInt(i), JDouble(d), JString(s)))
}))
Use a List of Lists rather than a List of Tuples.
An easy way to convert a list of tuples to a list of lists is:
val listOfList: List[List[Any]] = listOfTuples.map(_.productIterator.toList)
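As a quick check of the conversion above, here is a small library-free sketch. The TuplesToLists object and its sample values are hypothetical; the only thing it relies on is that every tuple is a Product, so productIterator is available regardless of arity.

```scala
// Hypothetical example object; the tuple values are made up for illustration.
object TuplesToLists {
  // Tuples of different arities share the Product supertype.
  val listOfTuples: List[Product] = List(
    (1, 1.5, "String1"),
    (-2, 2.3, "String2", true)
  )

  // productIterator walks the fields of each tuple in declaration order.
  val listOfList: List[List[Any]] = listOfTuples.map(_.productIterator.toList)
}
```

Because the element type becomes Any, static type information is lost at this point, which is why the result is then handed to a serializer that inspects values at runtime.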
I would use Jackson, which is a Java library and can deal with arbitrary data types inside collections of type Any/AnyRef, rather than trying to come up with a custom serializer in one of the Scala JSON libraries.
To convert a Scala List to a Java List, use
import collection.JavaConverters._
So, in summary, the final list would be:
val javaListOfList: java.util.List[java.util.List[Any]] = listOfTuples.map(_.productIterator.toList.asJava).asJava
Using this solution, you can have tuples of arbitrary length in your list and it still works.
import com.fasterxml.jackson.databind.ObjectMapper
import collection.JavaConverters._
object TuplesCollectionToJson extends App {
  val tuplesList = List(
    (10, false, 43.6, "Text1"),
    (84, true, 92.1, "Text2", 'X')
  )
  val javaList = tuplesList.map(_.productIterator.toList.asJava).asJava
  val mapper = new ObjectMapper()
  val json = mapper.writeValueAsString(javaList)
  println(json)
}
This would produce:
[[10,false,43.6,"Text1"],[84,true,92.1,"Text2","X"]]
PS: Use this solution only when you absolutely have to work with variable types. If your tuple data type is fixed, it's better to create a json4s-specific serializer/deserializer.

How to handle different JSON schemas and dispatch them to be handled by the right parser?

I am currently building a very simple JSON parser in Scala that has to deal with two (slightly) different schemas. My objective is to parse one value in the json, and based on that value, I would like to dispatch it to the relevant decoder. I have used circe for my implementation, but other implementations and/or suggestions are also welcome.
I have formulated a simplified version of my example to help clarify the question.
There are two types of JSON documents that I can receive, either a stock:
{
  "data": {
    "name": "XYZ"
  },
  "type": "STOCK"
}
Or a quote (which is similar to the stock but includes a price):
{
  "data": {
    "name": "ABC",
    "price": 1151.6214
  },
  "type": "QUOTE"
}
On my side, I have developed a simple decoder that looks like this (for the stock):
implicit private val dataDecoder: Decoder[Stock] = (hCursor: HCursor) => {
  for {
    name <- hCursor.downField("data").downField("name").as[String]
    typ <- hCursor.downField("type").as[StockType]
  } yield Instrument(name, typ, LocalDateTime.now())
}
I could also develop a parser that only parses the "type" part of the JSON, and then send the data to be handled by the relevant parser (quote or stock). However, I am wondering what is the:
- Efficient way to do this
- Idiomatic/clean way to do this
To rephrase my question if needed: what is the right and efficient way to handle slightly different JSON schemas and forward them to the right parser?
I usually encounter this situation when I need to serialize ADTs. As somebody mentioned in the comments to your question, circe has support for ADT codec auto-generation; however, I usually prefer to write the codecs manually.
In any case, in a situation like yours I would do something along these lines:
sealed trait Data
case class StockData(name: String) extends Data
case class QuoteData(name: String, quote: Double) extends Data
implicit val stockDataEncoder: Encoder[StockData] = ???
implicit val stockDataDecoder: Decoder[StockData] = ???
implicit val quoteDataEncoder: Encoder[QuoteData] = ???
implicit val quoteDataDecoder: Decoder[QuoteData] = ???
// JsonObject.add expects a Json value, so the tag is wrapped with Json.fromString.
implicit val dataEncoder: Encoder[Data] = Encoder.instance {
  case s: StockData => stockDataEncoder(s).mapObject(_.add("type", Json.fromString("stock")))
  case q: QuoteData => quoteDataEncoder(q).mapObject(_.add("type", Json.fromString("quote")))
}
// Read the "type" tag first, then delegate to the matching decoder.
implicit val dataDecoder: Decoder[Data] = Decoder.instance { c =>
  for {
    stype <- c.get[String]("type")
    res <- stype match {
      case "stock" => stockDataDecoder(c)
      case "quote" => quoteDataDecoder(c)
      case unk => Left(DecodingFailure(s"Unsupported data type: ${unk}", c.history))
    }
  } yield res
}
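For the exact shape given in the question (payload nested under "data", upper-case "type" values), the same idea could look roughly like the sketch below. It assumes circe and circe-parser on the classpath; the Instrument, Stock and Quote names are hypothetical stand-ins and not the asker's real model.

```scala
import io.circe.{Decoder, DecodingFailure}
import io.circe.parser.decode

// Hypothetical model, named only for this illustration.
sealed trait Instrument
final case class Stock(name: String) extends Instrument
final case class Quote(name: String, price: Double) extends Instrument

object Instrument {
  // Each payload decoder only looks inside the "data" object.
  private val stockDecoder: Decoder[Instrument] =
    Decoder.instance(c => c.downField("data").get[String]("name").map(Stock(_)))

  private val quoteDecoder: Decoder[Instrument] =
    Decoder.instance { c =>
      val data = c.downField("data")
      for {
        name  <- data.get[String]("name")
        price <- data.get[Double]("price")
      } yield Quote(name, price)
    }

  // Read the discriminator first, then hand the same cursor to the matching decoder.
  implicit val decoder: Decoder[Instrument] = Decoder.instance { c =>
    c.get[String]("type").flatMap {
      case "STOCK" => stockDecoder(c)
      case "QUOTE" => quoteDecoder(c)
      case other   => Left(DecodingFailure(s"Unsupported type: $other", c.history))
    }
  }
}
```

With this in scope, decode[Instrument](jsonString) parses the document once and only the "type" field is inspected before dispatching, so no second parsing pass is needed.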

Play Json API: Convert a JsArray to a JsResult[Seq[Element]]

I have a JsArray which contains JsValue objects representing two different types of entities - some of them represent nodes, the other part represents edges.
On the Scala side, there are already case classes named Node and Edge whose supertype is Element. The goal is to transform the JsArray (or Seq[JsValue]) to a collection that contains the Scala types, e.g. Seq[Element] (=> contains objects of type Node and Edge).
I have defined Read for the case classes:
implicit val nodeReads: Reads[Node] = // ...
implicit val edgeReads: Reads[Edge] = // ...
Apart from that, there is the first step of a Read for the JsArray itself:
implicit val elementSeqReads = Reads[Seq[Element]](json => json match {
  case JsArray(elements) => ???
  case _ => JsError("Invalid JSON data (not a json array)")
})
The part with the question marks is responsible for creating a JsSuccess(Seq(node1, edge1, ...)) if all elements of the JsArray are valid nodes and edges, or a JsError if this is not the case.
However, I'm not sure how to do this in an elegant way.
The logic to distinguish between nodes and edges could look like this:
def hasType(item: JsValue, elemType: String) =
  (item \ "elemType").asOpt[String] == Some(elemType)
val result = elements.map {
  case n if hasType(n, "node") => // use nodeReads
  case e if hasType(e, "edge") => // use edgeReads
  case _ => JsError("Invalid element type")
}
The thing is that I don't know how to deal with nodeReads / edgeReads at this point. Of course I could call their validate method directly, but then result would have the type Seq[JsResult[Element]]. So eventually I would have to check if there are any JsError objects and delegate them somehow to the top (remember: one invalid array element should lead to a JsError overall). If there are no errors, I still have to produce a JsSuccess[Seq[Element]] based on result.
Maybe it would be a better idea to avoid the calls to validate and work temporarily with Read instances instead. But I'm not sure how to "merge" all of the Read instances at the end (e.g. in simple case class mappings, you have a bunch of calls to JsPath.read (which returns Read) and in the end, validate produces one single result based on all those Read instances that were concatenated using the and keyword).
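The "collect all element results into one overall result" step described above can be pictured without Play at all. The sketch below uses Either[String, A] as a stand-in for JsResult[A] and a Map[String, String] as a stand-in for a JsValue; readElement, sequence and the simplified Node and Edge classes are hypothetical names for illustration only.

```scala
object SequenceResults {
  // Stand-in for JsResult: Left carries an error message, Right a decoded value.
  type Result[A] = Either[String, A]

  sealed trait Element
  final case class Node(name: String) extends Element
  final case class Edge(name: String) extends Element

  // Hypothetical per-element reader: picks Node or Edge based on the "type" entry.
  def readElement(raw: Map[String, String]): Result[Element] =
    for {
      tpe  <- raw.get("type").toRight("Missing field: type")
      name <- raw.get("name").toRight("Missing field: name")
      elem <- tpe match {
        case "node" => Right(Node(name))
        case "edge" => Right(Edge(name))
        case other  => Left(s"Invalid element type: $other")
      }
    } yield elem

  // Fold Seq[Result[A]] into Result[Seq[A]]; once an error is hit, it is kept
  // and later elements no longer change the outcome.
  def sequence[A](results: Seq[Result[A]]): Result[Seq[A]] =
    results.foldLeft[Result[Vector[A]]](Right(Vector.empty)) { (acc, r) =>
      for { xs <- acc; x <- r } yield xs :+ x
    }
}
```

This version reports only the first error. Accumulating every error would need a different result type than a plain Either.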
edit: A little bit more information.
First of all, I should have mentioned that the case classes Node and Edge basically have the same structure, at least for now. At the moment, the only reason for separate classes is to gain more type safety.
A JsValue of an element has the following JSON-representation:
{
  "id" : "aet864t884srtv87ae",
  "type" : "node", // <-- type can be 'node' or 'edge'
  "name" : "rectangle",
  "attributes": [],
  ...
}
The corresponding case class looks like this (note that the type attribute we've seen above is not an attribute of the class - instead it's represented by the type of the class -> Node).
case class Node(
  id: String,
  name: String,
  attributes: Seq[Attribute],
  ...) extends Element
The Read is as follows:
implicit val nodeReads: Reads[Node] = (
  (__ \ "id").read[String] and
  (__ \ "name").read[String] and
  (__ \ "attributes").read[Seq[Attribute]] and
  ....
) (Node.apply _)
Everything looks the same for Edge, at least for now.
Try defining elementReads as
implicit val elementReads = new Reads[Element]{
  override def reads(json: JsValue): JsResult[Element] =
    json.validate(
      Node.nodeReads.map(_.asInstanceOf[Element]) orElse
      Edge.edgeReads.map(_.asInstanceOf[Element])
    )
}
and import that into scope. Then you should be able to write
json.validate[Seq[Element]]
If the structure of your json is not enough to differentiate between Node and Edge, you could enforce it in the reads for each type.
Based on a simplified Node and Edge case class (only to avoid any unrelated code confusing the answer)
case class Edge(name: String) extends Element
case class Node(name: String) extends Element
The default reads for these case classes would be derived by
Json.reads[Edge]
Json.reads[Node]
respectively. Unfortunately since both case classes have the same structure these reads would ignore the type attribute in the json and happily translate a node json into an Edge instance or the opposite.
Let's have a look at how we could express the constraint on type all by itself:
def typeRead(`type`: String): Reads[String] = {
  val isNotOfType = ValidationError(s"is not of expected type ${`type`}")
  (__ \ "type").read[String].filter(isNotOfType)(_ == `type`)
}
This method builds a Reads[String] instance which will attempt to find a type string attribute in the provided json. It will then filter the JsResult using the custom validation error isNotOfType if the string parsed out of the json doesn't match the expected type passed as the argument of the method. Of course, if the type attribute is not a string in the json, the Reads[String] will return an error saying that it expected a String.
Now that we have a read which can enforce the value of the type attribute in the json, all we have to do is build a reads for each value of type which we expect and compose it with the associated case class reads. We can use Reads#flatMap for that, ignoring the input since the parsed string is not useful for our case classes.
object Edge {
  val edgeReads: Reads[Edge] =
    Element.typeRead("edge").flatMap(_ => Json.reads[Edge])
}
object Node {
  val nodeReads: Reads[Node] =
    Element.typeRead("node").flatMap(_ => Json.reads[Node])
}
Note that if the constraint on type fails the flatMap call will be bypassed.
The question remains of where to put the method typeRead. In this answer I initially put it in the Element companion object along with the elementReads instance, as in the code below.
import play.api.libs.json._
import play.api.data.validation.ValidationError
trait Element
object Element {
  implicit val elementReads = new Reads[Element] {
    override def reads(json: JsValue): JsResult[Element] =
      json.validate(
        Node.nodeReads.map(_.asInstanceOf[Element]) orElse
        Edge.edgeReads.map(_.asInstanceOf[Element])
      )
  }
  def typeRead(`type`: String): Reads[String] = {
    val isNotOfType = ValidationError(s"is not of expected type ${`type`}")
    (__ \ "type").read[String].filter(isNotOfType)(_ == `type`)
  }
}
This is actually a pretty bad place to define typeRead:
- it has nothing specific to Element
- it introduces a circular dependency between the Element companion object and both the Node and Edge companion objects
I'll let you think of the correct location though :)
The specification proving it all works together:
import org.specs2.mutable.Specification
import play.api.libs.json._
import play.api.data.validation.ValidationError
class ElementSpec extends Specification {
  "Element reads" should {
    "read an edge json as an edge" in {
      val result: JsResult[Element] = edgeJson.validate[Element]
      result.isSuccess should beTrue
      result.get should beEqualTo(Edge("myEdge"))
    }
    "read a node json as a node" in {
      val result: JsResult[Element] = nodeJson.validate[Element]
      result.isSuccess should beTrue
      result.get should beEqualTo(Node("myNode"))
    }
  }
  "Node reads" should {
    "read a node json as a node" in {
      val result: JsResult[Node] = nodeJson.validate[Node](Node.nodeReads)
      result.isSuccess should beTrue
      result.get should beEqualTo(Node("myNode"))
    }
    "fail to read an edge json as a node" in {
      val result: JsResult[Node] = edgeJson.validate[Node](Node.nodeReads)
      result.isError should beTrue
      val JsError(errors) = result
      val invalidNode = JsError.toJson(Seq(
        (__ \ "type") -> Seq(ValidationError("is not of expected type node"))
      ))
      JsError.toJson(errors) should beEqualTo(invalidNode)
    }
  }
  "Edge reads" should {
    "read an edge json as an edge" in {
      val result: JsResult[Edge] = edgeJson.validate[Edge](Edge.edgeReads)
      result.isSuccess should beTrue
      result.get should beEqualTo(Edge("myEdge"))
    }
    "fail to read a node json as an edge" in {
      val result: JsResult[Edge] = nodeJson.validate[Edge](Edge.edgeReads)
      result.isError should beTrue
      val JsError(errors) = result
      val invalidEdge = JsError.toJson(Seq(
        (__ \ "type") -> Seq(ValidationError("is not of expected type edge"))
      ))
      JsError.toJson(errors) should beEqualTo(invalidEdge)
    }
  }
  val edgeJson = Json.parse(
    """
      |{
      | "type":"edge",
      | "name":"myEdge"
      |}
    """.stripMargin)
  val nodeJson = Json.parse(
    """
      |{
      | "type":"node",
      | "name":"myNode"
      |}
    """.stripMargin)
}
If you don't want to use asInstanceOf as a cast, you can write the elementReads instance like so:
implicit val elementReads = new Reads[Element] {
  override def reads(json: JsValue): JsResult[Element] =
    json.validate(
      Node.nodeReads.map(e => e: Element) orElse
      Edge.edgeReads.map(e => e: Element)
    )
}
Unfortunately, you can't use _ in this case.

Abstraction to extract data from JSON in Scala

I am looking for a good abstraction to extract data from JSON (I am using json4s now).
Suppose I have a case class A and data in JSON format.
case class A(a1: String, a2: String, a3: String)
{"a1":"xxx", "a2": "yyy", "a3": "zzz"}
I need a function to extract the JSON data and return A with these data as follows:
val a: JValue => A = ...
I do not want to write the function a from scratch. I would rather compose it from primitive functions.
For example, I can write a primitive function to extract string by field name:
val str: (String, JValue) => String = {(fieldName, jval) => ... }
Now I would like to compose the function a: JValue => A from str. Does it make sense?
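One library-free way to picture this kind of composition is sketched below. It assumes a Map[String, Any] as a stand-in for JValue and models an extractor as a function returning Option, so that a missing or mistyped field makes the whole extraction fail. The Json and Extract aliases and the ComposeExtractors object are hypothetical names, not json4s API.

```scala
object ComposeExtractors {
  // Stand-in for a parsed JSON object; json4s would use JValue here.
  type Json = Map[String, Any]
  // An extractor either produces a value or fails with None.
  type Extract[T] = Json => Option[T]

  // Primitive extractor: read a string field by name.
  def str(field: String): Extract[String] =
    json => json.get(field).collect { case s: String => s }

  final case class A(a1: String, a2: String, a3: String)

  // Composite extractor built only from the primitive one.
  val a: Extract[A] = json =>
    for {
      a1 <- str("a1")(json)
      a2 <- str("a2")(json)
      a3 <- str("a3")(json)
    } yield A(a1, a2, a3)
}
```

The for-comprehension is what does the composing: each step runs only if the previous one succeeded.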
Consider use of Play-JSON, which has a composable "Reads" object. If you've ever used ReactiveMongo, it can be used in much the same way. Contrary to some older posts here, it can be used stand-alone, without most of the rest of Play.
It uses the common "implicit translator" (my term) idiom. I found that my favorite deserializing pattern for using it is not highlighted in the docs, though - the pattern they espouse is a lot harder to get right, IMHO. I make heavy use of .as and .asOpt, which are documented on the first linked page above, in the small section "Using JsValue.as/asOpt". When deserializing a JSON object, you can say something like
val person:Person = (someParsedJsonObject \ "aPerson").as[Person]
and as long as you have an implicit Reads[Person] in scope, all just works. There are built-in Reads for all primitive types and many collection types. In many cases, it makes sense to put the Reads and Writes implicit objects in the companion object for, e.g., Person.
I thought json4s had a similar feature, but I could be wrong.
Argonaut is a purely functional Scala JSON library.
It lets you encode and decode case classes via JSON codecs.
import argonaut._, Argonaut._
case class Person(name: String, age: Int)
implicit def PersonDecodeJson: DecodeJson[Person] =
  jdecode2L(Person.apply)("name", "age")
// Codec for Person case class from JSON of form
// { "name": "string", "age": 1 }
It also provides a JSON cursor (lenses/monocle) for custom parsing.
implicit def PersonDecodeJson: DecodeJson[Person] =
  DecodeJson(c => for {
    name <- (c --\ "_name").as[String]
    age <- (c --\ "_age").as[String].map(_.toInt)
  } yield Person(name, age))
// Decode Person from a JSON with property names different
// from those of the case class, and age passed as string:
// { "_name": "string", "_age": "10" }
The parsing result is represented by the DecodeResult type, which can be composed (.map, .flatMap) and handles error cases.

Suggestions for Writing Map as JSON file in Scala

I have a simple single key-value Map(K,V), myDictionary, that is populated by my program. At the end I want to write it as a JSON-format string to a text file, since I will need to parse it later.
I was using this code earlier,
Some(new PrintWriter(outputDir+"/myDictionary.json")).foreach{p => p.write(compact(render(decompose(myDictionary)))); p.close}
I found it to be slower as the input size increased. Later, I used this:
var out = new PrintWriter(outputDir+"/myDictionary.json");
out.println(scala.util.parsing.json.JSONObject(myDictionary.toMap).toString())
This is proving to be a bit faster.
I have run this on sample input and found it faster than my earlier approach. I am assuming my input map will reach at least a million (K,V) entries (a text file larger than 1 GB), so I want to make sure I follow a fast and memory-efficient approach for serializing the Map. What other approaches would you recommend that I can look into to optimize this?
The JSON support in the standard Scala library is probably not the best choice. Unfortunately the situation with JSON libraries for Scala is a bit confusing: there are many alternatives (Lift JSON, Play JSON, Spray JSON, Twitter JSON, Argonaut, ...), basically one library for each day of the week... I suggest you have a look at these, at least to see if any of them is easier to use and more performant.
Here is an example using Play JSON which I have chosen for particular reasons (being able to generate formats with macros):
object JsonTest extends App {
  import play.api.libs.json._
  type MyDict = Map[String, Int]
  implicit object MyDictFormat extends Format[MyDict] {
    def reads(json: JsValue): JsResult[MyDict] = json match {
      case JsObject(fields) =>
        val b = Map.newBuilder[String, Int]
        fields.foreach {
          case (k, JsNumber(v)) => b += k -> v.toInt
          case other => return JsError(s"Not a (string, number) pair: $other")
        }
        JsSuccess(b.result())
      case _ => JsError(s"Not an object: $json")
    }
    def writes(m: MyDict): JsValue = {
      val fields: Seq[(String, JsValue)] = m.map {
        case (k, v) => k -> JsNumber(v)
      } (collection.breakOut)
      JsObject(fields)
    }
  }
  val m = Map("hallo" -> 12, "gallo" -> 34)
  val serial = Json.toJson(m)
  val text = Json.stringify(serial)
  println(text)
  val back = Json.fromJson[MyDict](serial)
  assert(back == JsSuccess(m), s"Failed: $back")
}
While you can construct and deconstruct JsValues directly, the main idea is to use a Format[A] where A is the type of your data structure. This puts more emphasis on type safety than the standard Scala library JSON. It looks more verbose, but in the end I think it's the better approach.
There are utility methods Json.toJson and Json.fromJson which look for an implicit format of the type you want.
On the other hand, it does construct everything in-memory and it does duplicate your data structure (because for each entry in your map you will have another tuple (String, JsValue)), so this isn't necessarily the most memory efficient solution, given that you are operating in the GB magnitude...
Jerkson is a Scala wrapper for the Java JSON library Jackson. The latter apparently has the feature to stream data. I found this project which says it adds streaming support. Play JSON in turn is based on Jerkson, so perhaps you can even figure out how to stream your object with that. See also this question.
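To illustrate what "streaming" buys you here, the sketch below writes a Map[String, Int] entry by entry to a java.io.Writer, so neither a JSON tree nor the full output string is held in memory. It is a hand-rolled, library-free illustration limited to string keys and integer values; StreamingMapWriter, escape and write are hypothetical names, and a real streaming API such as Jackson's JsonGenerator would be the more robust choice.

```scala
import java.io.Writer

object StreamingMapWriter {
  // Minimal JSON string escaping: quotes, backslashes and control characters.
  def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case '\n' => sb.append("\\n")
      case '\r' => sb.append("\\r")
      case '\t' => sb.append("\\t")
      case c if c < ' ' => sb.append("\\u").append("%04x".format(c.toInt))
      case c => sb.append(c)
    }
    sb.toString
  }

  // Write one entry at a time; only the current key/value pair is materialized.
  def write(entries: Iterator[(String, Int)], out: Writer): Unit = {
    out.write("{")
    var first = true
    entries.foreach { case (k, v) =>
      if (!first) out.write(",")
      first = false
      out.write("\"" + escape(k) + "\":" + v)
    }
    out.write("}")
  }
}
```

For a file target, the Writer would typically be a BufferedWriter around a FileWriter, and the iterator would come from myDictionary.iterator.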