Capturing unused fields while decoding a JSON object with circe

Suppose I have a case class like the following, and I want to decode a JSON object into it, with all of the fields that haven't been used ending up in a special member for the leftovers:
import io.circe.Json
case class Foo(a: Int, b: String, leftovers: Json)
What's the best way to do this in Scala with circe?
(Note: I've seen questions like this a few times, so I'm Q-and-A-ing it for posterity.)

There are a couple of ways you could go about this. One fairly straightforward way would be to filter out the keys you've used after decoding:
import io.circe.{ Decoder, Json, JsonObject }
implicit val decodeFoo: Decoder[Foo] =
  Decoder.forProduct2[Int, String, (Int, String)]("a", "b")((_, _)).product(
    Decoder[JsonObject]
  ).map {
    case ((a, b), all) =>
      Foo(a, b, Json.fromJsonObject(all.remove("a").remove("b")))
  }
Which works as you'd expect:
scala> val doc = """{ "something": false, "a": 1, "b": "abc", "0": 0 }"""
doc: String = { "something": false, "a": 1, "b": "abc", "0": 0 }
scala> io.circe.jawn.decode[Foo](doc)
res0: Either[io.circe.Error,Foo] =
Right(Foo(1,abc,{
"something" : false,
"0" : 0
}))
The disadvantage of this approach is that you have to maintain code to remove the keys you've used separately from their use, which can be error-prone. Another approach is to use circe's state-monad-powered decoding tools:
import cats.data.StateT
import cats.instances.either._
import io.circe.{ ACursor, Decoder, Json }
implicit val decodeFoo: Decoder[Foo] = Decoder.fromState(
  for {
    a <- Decoder.state.decodeField[Int]("a")
    b <- Decoder.state.decodeField[String]("b")
    rest <- StateT.inspectF((_: ACursor).as[Json])
  } yield Foo(a, b, rest)
)
Which works the same way as the previous decoder (apart from some small differences in the errors you'll get if decoding fails):
scala> io.circe.jawn.decode[Foo](doc)
res1: Either[io.circe.Error,Foo] =
Right(Foo(1,abc,{
"something" : false,
"0" : 0
}))
This latter approach doesn't require you to change the used fields in multiple places, and it also has the advantage of looking a little more like any other decoder you'd write manually in circe.

Related

How to handle different JSON schemas and dispatch them to be handled by the right parser?

I am currently building a very simple JSON parser in Scala that has to deal with two (slightly) different schemas. My objective is to parse one value in the json, and based on that value, I would like to dispatch it to the relevant decoder. I have used circe for my implementation, but other implementations and/or suggestions are also welcome.
I have formulated a simplified version of my example to help clarify the question.
There are two types of JSON documents that I can receive, either a stock:
{
  "data": {
    "name": "XYZ"
  },
  "type": "STOCK"
}
Or a quote (which is similar to the stock but includes a price):
{
  "data": {
    "name": "ABC",
    "price": 1151.6214
  },
  "type": "QUOTE"
}
On my side, I have developed a simple decoder that looks like this (for the stock):
implicit private val dataDecoder: Decoder[Stock] = (hCursor: HCursor) => {
  for {
    name <- hCursor.downField("data").downField("name").as[String]
    typ <- hCursor.downField("type").as[StockType]
  } yield Instrument(name, typ, LocalDateTime.now())
}
I could also develop a parser that only parses the "type" part of the JSON, and then send the data to be handled by the relevant parser (quote or stock). However, I am wondering what is the:
Efficient way to do this
Idiomatic/clean way to do this
To help rephrase my question if needed, what is the right & efficient way to handle slightly different JSON schemas, and forward them to be handled by the right parser.
I usually encounter this situation when I need to serialize ADTs. As somebody mentioned in the comments to your question, circe has support for automatic ADT codec generation; however, I usually prefer to write the codecs manually.
In any case, in a situation like yours I would do something along these lines:
import io.circe.{ Decoder, DecodingFailure, Encoder, Json }
sealed trait Data
case class StockData(name: String) extends Data
case class QuoteData(name: String, quote: Double) extends Data
implicit val stockDataEncoder: Encoder[StockData] = ???
implicit val stockDataDecoder: Decoder[StockData] = ???
implicit val quoteDataEncoder: Encoder[QuoteData] = ???
implicit val quoteDataDecoder: Decoder[QuoteData] = ???
// Encode with the specific encoder, then tag the resulting object with its type.
implicit val dataEncoder: Encoder[Data] = Encoder.instance {
  case s: StockData => stockDataEncoder(s).mapObject(_.add("type", Json.fromString("stock")))
  case q: QuoteData => quoteDataEncoder(q).mapObject(_.add("type", Json.fromString("quote")))
}
// Read the "type" tag first, then dispatch to the matching decoder.
implicit val dataDecoder: Decoder[Data] = Decoder.instance { c =>
  for {
    stype <- c.get[String]("type")
    res <- stype match {
      case "stock" => stockDataDecoder(c)
      case "quote" => quoteDataDecoder(c)
      case unk => Left(DecodingFailure(s"Unsupported data type: ${unk}", c.history))
    }
  } yield res
}
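To make the dispatch concrete for the JSON in the question, here is a minimal sketch with the `???` placeholders filled in on the decoding side. It assumes the payload sits under "data" and the tag is the upper-case "STOCK" / "QUOTE" shown in the question; the `price` field name and the use of `Decoder.forProduct1` / `forProduct2` are my choices for illustration, not part of the answer above.

```scala
import io.circe.{ Decoder, DecodingFailure }
import io.circe.jawn.decode

sealed trait Data
final case class StockData(name: String) extends Data
final case class QuoteData(name: String, price: Double) extends Data

// Decoders for the payload found under "data" (field names taken from the question's JSON).
val stockDataDecoder: Decoder[StockData] = Decoder.forProduct1("name")(StockData.apply)
val quoteDataDecoder: Decoder[QuoteData] = Decoder.forProduct2("name", "price")(QuoteData.apply)

// Read the top-level "type" tag, then decode "data" with the matching decoder.
implicit val dataDecoder: Decoder[Data] = Decoder.instance { c =>
  c.get[String]("type").flatMap { tag =>
    val result: Decoder.Result[Data] = tag match {
      case "STOCK" => c.get[StockData]("data")(stockDataDecoder)
      case "QUOTE" => c.get[QuoteData]("data")(quoteDataDecoder)
      case other   => Left(DecodingFailure(s"Unsupported data type: $other", c.history))
    }
    result
  }
}

val stock = decode[Data]("""{ "data": { "name": "XYZ" }, "type": "STOCK" }""")
// Right(StockData(XYZ))
val quote = decode[Data]("""{ "data": { "name": "ABC", "price": 1151.6214 }, "type": "QUOTE" }""")
// Right(QuoteData(ABC,1151.6214))
```

Each document is parsed once and only the "type" field is inspected before the payload is decoded, so this should be about as cheap as a hand-written dispatcher can get.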

Extracting string from JSON using Json4s using Scala

I have a JSON body in the following form:
val body =
{
"a": "hello",
"b": "goodbye"
}
I want to extract the VALUE of "a" (so I want "hello") and store that in a val.
I know I should use "parse" and "Extract" (eg. val parsedjson = parse(body).extract[String]) but I don't know how to use them to specifically extract the value of "a"
To use extract you need to create a class that matches the shape of the JSON that you are parsing. Here is an example using your input data:
val body ="""
{
"a": "hello",
"b": "goodbye"
}
"""
case class Body(a: String, b: String)
import org.json4s._
import org.json4s.jackson.JsonMethods._
implicit val formats = DefaultFormats
val b = Extraction.extract[Body](parse(body))
println(b.a) // hello
You'd have to use pattern matching/extractors:
val aOpt: List[String] = for {
  JObject(map) <- parse(body)
  JField("a", JString(value)) <- map
} yield value
Alternatively, use the querying DSL:
parse(body) \ "a" match {
  case JString(value) => Some(value)
  case _ => None
}
These are options as you have no guarantee that arbitrary JSON would contain field "a".
See documentation
extract would make sense if you were extracting whole JObject into a case class.
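The two approaches can also be combined: select the field with `\` and then call `extract` (or `extractOpt`) on just that node. This is a minimal sketch assuming the jackson backend used in the answer above; the value names are illustrative only.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

implicit val formats: Formats = DefaultFormats

val body = """{ "a": "hello", "b": "goodbye" }"""

// Select the field, then extract only that node as a String.
// extract throws a MappingException if the field is missing or not a string.
val a: String = (parse(body) \ "a").extract[String]

// extractOpt returns None instead of throwing when the field is absent.
val aOpt: Option[String] = (parse(body) \ "a").extractOpt[String]
val missing: Option[String] = (parse(body) \ "c").extractOpt[String]
```

Here `a` is "hello", `aOpt` is Some("hello"), and `missing` is None.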

Encoding Scala None to JSON value using circe

Suppose I have the following case classes that need to be serialized as JSON objects using circe:
@JsonCodec
case class A(a1: String, a2: Option[String])
@JsonCodec
case class B(b1: Option[A], b2: Option[A], b3: Int)
Now I need to encode val b = B(None, Some(A("a", Some("aa"))), 5) as JSON but I want to be able to control whether it is output as
{
  "b1": null,
  "b2": {
    "a1": "a",
    "a2": "aa"
  },
  "b3": 5
}
or
{
  "b2": {
    "a1": "a",
    "a2": "aa"
  },
  "b3": 5
}
Using Printer's dropNullKeys config, e.g. Printer.noSpaces.copy(dropNullKeys = true).pretty(b.asJson), would omit all Nones from the output, whereas setting it to false would encode Nones as null (see also this question). But how can one control this setting on a per-field basis?
The best way to do this is probably just to add a post-processing step to a semi-automatically derived encoder for B:
import io.circe.{ Decoder, JsonObject, ObjectEncoder }
import io.circe.generic.JsonCodec
import io.circe.generic.semiauto.{ deriveDecoder, deriveEncoder }
@JsonCodec
case class A(a1: String, a2: Option[String])
case class B(b1: Option[A], b2: Option[A], b3: Int)
object B {
  implicit val decodeB: Decoder[B] = deriveDecoder[B]
  implicit val encodeB: ObjectEncoder[B] = deriveEncoder[B].mapJsonObject(
    _.filter {
      case ("b1", value) => !value.isNull
      case _ => true
    }
  )
}
And then:
scala> import io.circe.syntax._
import io.circe.syntax._
scala> B(None, None, 1).asJson.noSpaces
res0: String = {"b2":null,"b3":1}
You can adjust the argument to the filter to remove whichever null-valued fields you want from the JSON object (here I'm just removing b1 in B).
It's worth noting that currently you can't combine the @JsonCodec annotation and an explicitly defined instance in the companion object. This isn't an inherent limitation of the annotation; we could check the companion object for "overriding" instances during the macro expansion, but doing so would make the implementation substantially more complicated (right now it's quite simple). The workaround is pretty simple (just use deriveDecoder explicitly), but of course we'd be happy to consider an issue requesting support for mixing and matching @JsonCodec and explicit instances.
circe has since added a dropNullValues method on Json that applies the approach Travis Brown describes above. Note that it removes every null-valued field of the object rather than selected ones:
def dropNulls[A](encoder: Encoder[A]): Encoder[A] =
  encoder.mapJson(_.dropNullValues)
implicit val entityEncoder: Encoder[Entity] = dropNulls(deriveEncoder)
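To show what this does to the example from the question, here is a small sketch that builds the JSON by hand. It assumes a circe version recent enough to provide both `dropNullValues` and its recursive counterpart `deepDropNullValues`; the `json` value is constructed only for illustration.

```scala
import io.circe.Json

// The question's B(None, Some(A("a", None)), 5), written out as raw JSON.
val json: Json = Json.obj(
  "b1" -> Json.Null,
  "b2" -> Json.obj("a1" -> Json.fromString("a"), "a2" -> Json.Null),
  "b3" -> Json.fromInt(5)
)

// Drops null-valued fields of the top-level object only.
val shallow: String = json.dropNullValues.noSpaces
// {"b2":{"a1":"a","a2":null},"b3":5}

// Drops null-valued fields at every nesting level.
val deep: String = json.deepDropNullValues.noSpaces
// {"b2":{"a1":"a"},"b3":5}
```

Neither method distinguishes between fields, so the per-field filter shown in the first answer is still needed when only some nulls should be dropped.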

Decode single field of object in a json array with argonaut/circe

Suppose I have such json
{
  "sha": "some sha",
  "parents": [{
    "url": "some url",
    "sha": "some parent sha"
  }]
}
and such case class
case class Commit(sha: String, parentShas: List[String])
In play-json I could write the reads like this:
val commitReads: Reads[Commit] = (
  (JsPath \ "sha").read[String] and
  (JsPath \ "parents" \\ "sha").read[List[String]]
)(Commit.apply _)
I'm looking for an equivalent way of decoding only the "sha" of each entry in "parents" in argonaut/circe, but I haven't found any. "HCursor/ACursor" has downArray, but from there on I don't know what to do. Thank you very much in advance!
Neither circe nor Argonaut keeps track of which fields have been read in JSON objects, so you can just ignore the extra "url" field (just as in Play). The trickier part is finding the equivalent of Play's \\, which circe doesn't have at the moment, although you've convinced me we need to add it.
First of all, this is relatively easy if you have a separate SHA type:
import io.circe.Decoder
val doc = """
{
"sha": "some sha",
"parents": [{
"url": "some url",
"sha": "some parent sha"
}]
}
"""
case class Sha(value: String)
object Sha {
  implicit val decodeSha: Decoder[Sha] = Decoder.instance(_.get[String]("sha")).map(Sha(_))
}
case class Commit(sha: Sha, parentShas: List[Sha])
object Commit {
  implicit val decodeCommit: Decoder[Commit] = for {
    sha <- Decoder[Sha]
    parents <- Decoder.instance(_.get[List[Sha]]("parents"))
  } yield Commit(sha, parents)
}
Or, using Cats's applicative syntax:
import cats.syntax.cartesian._
implicit val decodeCommit: Decoder[Commit] =
  (Decoder[Sha] |@| Decoder.instance(_.get[List[Sha]]("parents"))).map(Commit(_, _))
And then:
scala> import io.circe.jawn._
import io.circe.jawn._
scala> decode[Commit](doc)
res0: cats.data.Xor[io.circe.Error,Commit] = Right(Commit(Sha(some sha),List(Sha(some parent sha))))
But that's not really an answer, since I'm not going to ask you to change your model. :) The actual answer is a bit less fun:
case class Commit(sha: String, parentShas: List[String])
object Commit {
  val extractSha: Decoder[String] = Decoder.instance(_.get[String]("sha"))
  implicit val decodeCommit: Decoder[Commit] = for {
    sha <- extractSha
    parents <- Decoder.instance(c =>
      c.get("parents")(Decoder.decodeCanBuildFrom[String, List](extractSha, implicitly))
    )
  } yield Commit(sha, parents)
}
This is bad, and I'm ashamed it's necessary, but it works. I've just filed an issue to make sure this gets better in a future circe release.
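For readers on a newer circe, the same idea can be written a little more directly. This is only a sketch, assuming a circe version that provides `Decoder.decodeList` and returns `Either` from `decode`; it is not the fix tracked by the issue mentioned above.

```scala
import io.circe.Decoder
import io.circe.jawn.decode

case class Commit(sha: String, parentShas: List[String])

// Pulls the "sha" field out of whatever object the cursor points at.
val extractSha: Decoder[String] = Decoder.instance(_.get[String]("sha"))

implicit val decodeCommit: Decoder[Commit] = Decoder.instance { c =>
  for {
    sha     <- extractSha(c)
    // Decode "parents" as a list, applying extractSha to each element.
    parents <- c.get[List[String]]("parents")(Decoder.decodeList(extractSha))
  } yield Commit(sha, parents)
}

val doc = """
{
  "sha": "some sha",
  "parents": [{ "url": "some url", "sha": "some parent sha" }]
}
"""

val commit = decode[Commit](doc)
// Right(Commit(some sha,List(some parent sha)))
```

The explicit element decoder is what keeps the model unchanged: the list decoder is built from `extractSha` instead of the default `Decoder[String]`.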

immutable Map (de)serialization to/from Play JSON

I have the following (simplified) structure:
case class MyKey(key: String)
case class MyValue(value: String)
Let's assume that I have Play JSON formatters for both case classes.
As an example I have:
val myNewMessage = collection.immutable.Map(MyKey("key1") -> MyValue("value1"), MyKey("key2") -> MyValue("value2"))
As a result of the following transformation
play.api.libs.json.Json.toJson(myNewMessage)
I'm expecting something like:
{ "key1": "value1", "key2": "value2" }
I have tried writing the formatter, but I have not succeeded:
implicit lazy val mapMyKeyMyValueFormat: Format[collection.immutable.Map[MyKey, MyValue]] = new Format[collection.immutable.Map[MyKey, MyValue]] {
  override def writes(obj: collection.immutable.Map[MyKey, MyValue]): JsValue = Json.toJson(obj.map {
    case (key, value) ⇒ Json.toJson(key) -> Json.toJson(value)
  })
  override def reads(json: JsValue): JsResult[collection.immutable.Map[MyKey, MyValue]] = ???
}
I have no idea how to write a proper reads function. Is there a simpler way of doing it? I'm also not satisfied with my writes function.
Thx!
The reason the writes method is not working is because you're transforming the Map[MyKey, MyValue] into a Map[JsValue, JsValue], but you can't serialize that to JSON. The JSON keys need to be strings, so you need some way of transforming MyKey to some unique String value. Otherwise you'd be trying to serialize something like this:
{"key": "keyName"} : {"value": "myValue"}
Which is not valid JSON.
If MyKey is as simple as stated in your question, this can work:
def writes(obj: Map[MyKey, MyValue]): JsValue = Json.toJson(obj.map {
  case (key, value) => key.key -> Json.toJson(value)
}) //                  ^ must be a String
Play will then know how to serialize a Map[String, MyValue], given the appropriate Writes[MyValue].
But I'm not certain that's what you want. Because it produces this:
scala> Json.toJson(myNewMessage)
res0: play.api.libs.json.JsValue = {"key1":{"value":"value1"},"key2":{"value":"value2"}}
If this is the output you want:
{ "key1": "value1", "key2": "value2" }
Then your Writes should look more like this:
def writes(obj: Map[MyKey, MyValue]): JsValue = {
  obj.foldLeft(JsObject(Nil)) { case (js, (key, value)) =>
    js ++ Json.obj(key.key -> value.value)
  }
}
Which produces this:
scala> writes(myNewMessage)
res5: play.api.libs.json.JsValue = {"key1":"value1","key2":"value2"}
Reads are easy so long as the structure of MyKey and MyValue are the same, otherwise I have no idea what you'd want it to do. It's very dependent on the actual structure you want. As is, I would suggest leveraging existing Reads[Map[String, String]] and transforming it to the type you want.
def reads(js: JsValue): JsResult[Map[MyKey, MyValue]] = {
  js.validate[Map[String, String]].map { case kvMap =>
    kvMap.map { case (key, value) => MyKey(key) -> MyValue(value) }
  }
}
It's hard to see much else without knowing the actual structure of the data. In general I stay away from having to serialize and deserialize Maps.
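Putting the two halves together, a complete Format for the flat { "key1": "value1" } shape might look like the sketch below. It assumes MyKey and MyValue are the single-field wrappers from the question and leans on Play's built-in Reads and Writes for Map[String, String]; the names `mapFormat`, `message` and `roundTrip` are illustrative only.

```scala
import play.api.libs.json._

case class MyKey(key: String)
case class MyValue(value: String)

implicit val mapFormat: Format[Map[MyKey, MyValue]] = Format(
  // Read a plain string-to-string object, then wrap keys and values.
  Reads[Map[MyKey, MyValue]] { js =>
    js.validate[Map[String, String]].map(_.map { case (k, v) => MyKey(k) -> MyValue(v) })
  },
  // Unwrap keys and values, then write a plain string-to-string object.
  Writes[Map[MyKey, MyValue]] { m =>
    Json.toJson(m.map { case (k, v) => k.key -> v.value })
  }
)

val message = Map(MyKey("key1") -> MyValue("value1"), MyKey("key2") -> MyValue("value2"))

val json: JsValue = Json.toJson(message)
// {"key1":"value1","key2":"value2"}

val roundTrip: JsResult[Map[MyKey, MyValue]] = json.validate[Map[MyKey, MyValue]]
```

Validating the written JSON gives back the original map, which is a quick way to check that reads and writes agree on the shape.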