I am using Json4s classes inside of a Spark 2.2.0 closure. The "workaround" for a failure to serialize DefaultFormats is to include their definition inside every closure executed by Spark that needs them. I believe I have done more than I needed to below but still get the serialization failure.
Using Spark 2.2.0, Scala 2.11, Json4s 3.2.x (whatever is in Spark) and also tried using Json4s 3.5.3 by pulling it into my job using sbt. In all cases I used the workaround shown below.
Does anyone know what I'm doing wrong?
logger.info(s"Creating an RDD for $actionName")

implicit val formats = DefaultFormats

val itemProps = df.rdd.map[(ItemID, ItemProps)](row => { // <--- error points to this line
  implicit val formats = DefaultFormats
  val itemId = row.getString(0)
  val correlators = row.getSeq[String](1).toList
  (itemId, Map(actionName -> JArray(correlators.map { t =>
    implicit val formats = DefaultFormats
    JsonAST.JString(t)
  })))
})
I have also tried another suggestion, which is to set the DefaultFormats implicit in the class constructor area and not in the closure, no luck anywhere.
The JVM error trace is from Spark complaining that the task is not serializable and pointing to the line above (the last line in my code, anyway); then the root cause is explained with:
Serialization stack:
- object not serializable (class: org.json4s.DefaultFormats$, value: org.json4s.DefaultFormats$@7fdd29f3)
- field (class: com.actionml.URAlgorithm, name: formats, type: class org.json4s.DefaultFormats$)
- object (class com.actionml.URAlgorithm, com.actionml.URAlgorithm@2dbfa972)
- field (class: com.actionml.URAlgorithm$$anonfun$udfLLR$1, name: $outer, type: class com.actionml.URAlgorithm)
- object (class com.actionml.URAlgorithm$$anonfun$udfLLR$1, <function3>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$4, name: func$4, type: interface scala.Function3)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$4, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF, name: f, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF, UDF(input[2, bigint, false], input[3, bigint, false], input[5, bigint, false]))
- element of array (index: 1)
- array (class [Ljava.lang.Object;, size 3)
- field (class: org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10, name: references$1, type: class [Ljava.lang.Object;)
- object (class org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10, <function2>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 128 more
I have another example. You can try it in spark-shell. I hope it helps.
import org.json4s._
import org.json4s.jackson.JsonMethods._
def getValue(x: String): (Int, String) = {
  implicit val formats: DefaultFormats.type = DefaultFormats
  val obj = parse(x).asInstanceOf[JObject]
  val id = (obj \ "id").extract[Int]
  val name = (obj \ "name").extract[String]
  (id, name)
}
val rdd = sc.parallelize(Array("{\"id\":0, \"name\":\"g\"}", "{\"id\":1, \"name\":\"u\"}", "{\"id\":2, \"name\":\"c\"}", "{\"id\":3, \"name\":\"h\"}", "{\"id\":4, \"name\":\"a\"}", "{\"id\":5, \"name\":\"0\"}"))
rdd.map(x => getValue(x)).collect
Interesting. One typical problem is running into serialization issues with the implicit val formats, but since you define it inside your closure this should be OK.
I know that this is a bit hacky, but you could try the following:
using @transient implicit val
running a minimal test of whether JsonAST.JString(t) is serializable
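For what it's worth, the serialization stack above suggests the capture happens through a field on the enclosing class (the formats field of URAlgorithm, dragged in via $outer by the udfLLR closure), not through the closures shown. Below is a minimal, Spark-free sketch of that mechanism using plain Java serialization; Formats and Holder are hypothetical stand-ins, and the behavior described assumes Scala 2.12+, where lambdas are serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for DefaultFormats: a singleton that does not extend Serializable.
object Formats

// Stand-in for the enclosing algorithm class; it is not Serializable either.
class Holder {
  val formats = Formats // non-serializable field on the enclosing class

  // Referencing the field inside the lambda captures `this`, i.e. the whole
  // Holder instance, so serializing the lambda tries to serialize Holder too.
  def capturesOuter: String => String = s => { val f = formats; s }

  // A lambda that only captures a local value stays serializable.
  def capturesLocal: String => String = { val tag = "ok"; s => tag + s }
}

// Attempts Java serialization and reports whether it succeeded.
def serializes(obj: AnyRef): Boolean =
  try { new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj); true }
  catch { case _: NotSerializableException => false }
```

Under these assumptions, serializes(new Holder().capturesOuter) fails while serializes(new Holder().capturesLocal) succeeds, which is why defining formats inside the closure (or marking the class field @transient) is the usual way out.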
I have something like:
sealed trait Foo
case class Bar(field: ...) extends Foo
case class Baz(otherField: ...) extends Foo
trait JsonFormat {
  implicit val barWrites = Json.writes[Bar]
  implicit val barReads = Json.reads[Bar]

  implicit val bazWrites = Json.writes[Baz]
  implicit val bazReads = Json.reads[Baz]

  implicit val fooWrites = Json.writes[Foo]
  implicit val fooReads = Json.reads[Foo]

  // other vals that depend on Foo
}
When I compile, I get an error like:
[error] /file/path/JsonFormat.scala:68:41: unreachable code
[error] implicit val fooWrites = Json.writes[Foo]
[error] ^
[error] one error found
I'm pretty new to Scala, and while I understand an "unreachable code" error in the context of pattern matching, I can't figure this one out.
I'm using Play 2.8.
This may not be the exact solution to your question, but here is some advice on changing your approach.
First, this error can happen if you add your own apply() method to your class's companion object. Check for that.
Second, the best practice seems to be to put the implicit converter in the companion object of each class, and to use a Format so that both conversion directions are supported.
case class SomeClass(someString: String)

object SomeClass {
  implicit val jsonFormatter: OFormat[SomeClass] = Json.format[SomeClass]
}
If you approach your JSON implicit converters this way, any code that uses your DTOs automatically picks up the converter, in both the in and out directions.
Another place to assemble "shared" implicits is a package object. I keep, for example, an ISO DateTime string <-> Date converter there.
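Applied to the sealed trait family from the question, the companion-object pattern might look like the sketch below. The concrete field types are made up (the original elides them), and it assumes play-json 2.7+, where the Json.format macro can derive a format for a sealed trait once the subtype formats are in implicit scope.

```scala
import play.api.libs.json._

sealed trait Foo
case class Bar(field: String) extends Foo
case class Baz(otherField: Int) extends Foo

// Each format lives in its class's companion object, so it is found
// automatically wherever the class is used.
object Bar { implicit val format: OFormat[Bar] = Json.format[Bar] }
object Baz { implicit val format: OFormat[Baz] = Json.format[Baz] }

object Foo {
  // play-json 2.7+ derives a sealed-trait format from the subtype formats,
  // adding a "_type" discriminator field to tell Bar and Baz apart.
  implicit val format: OFormat[Foo] = Json.format[Foo]
}
```

This keeps each format next to its class and sidesteps the ordering problems of collecting them all in one trait.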
So I have two classes in my project
case class Item(id: Int, name: String)
and
case class Order(id: Int, items: List[Item])
I'm trying to make reads and writes properties for Order but I get a compiler error saying:
"No unapply or unapplySeq function found"
In my controller I have the following:
implicit val itemReads = Json.reads[Item]
implicit val itemWrites = Json.writes[Item]
implicit val listItemReads = Json.reads[List[Item]]
implicit val listItemWrites = Json.writes[List[Item]]
The code works for itemReads and itemWrites but not for the bottom two. Can anyone tell me where I'm going wrong? I'm new to the Play framework.
Thank you for your time.
The "No unapply or unapplySeq function found" error is caused by these two:
implicit val listItemReads = Json.reads[List[Item]]
implicit val listItemWrites = Json.writes[List[Item]]
Just throw them away. As Ende said, Play knows how to deal with lists.
But you need Reads and Writes for Order too! And since you do both reading and writing, it's simplest to define a Format, a mix of the Reads and Writes traits. This should work:
case class Item(id: Int, name: String)

object Item {
  implicit val format = Json.format[Item]
}

case class Order(id: Int, items: List[Item])

object Order {
  implicit val format = Json.format[Order]
}
Above, the ordering is significant; Item and its companion object must come before Order.
So, once you have all the implicit converters needed, the key is to make them properly visible in the controllers. The above is one solution, but there are other ways, as I learned after trying to do something similar.
You don't actually need to define those two implicits; Play already knows how to deal with a list:
scala> import play.api.libs.json._
import play.api.libs.json._
scala> case class Item(id: Int, name: String)
defined class Item
scala> case class Order(id: Int, items: List[Item])
defined class Order
scala> implicit val itemReads = Json.reads[Item]
itemReads: play.api.libs.json.Reads[Item] = play.api.libs.json.Reads$$anon$8#478fdbc9
scala> implicit val itemWrites = Json.writes[Item]
itemWrites: play.api.libs.json.OWrites[Item] = play.api.libs.json.OWrites$$anon$2#26de09b8
scala> Json.toJson(List(Item(1, ""), Item(2, "")))
res0: play.api.libs.json.JsValue = [{"id":1,"name":""},{"id":2,"name":""}]
scala> Json.toJson(Order(10, List(Item(1, ""), Item(2, ""))))
res1: play.api.libs.json.JsValue = {"id":10,"items":[{"id":1,"name":""},{"id":2,"name":""}]}
The error you see probably happens because Play uses the unapply method to construct the macro expansion for your Reads/Writes, and List is an abstract class; play-json needs a concrete type to make the macro work.
This works:
import play.api.libs.json._
import play.api.libs.functional.syntax._

case class Item(id: Int, name: String)
case class Order(id: Int, items: List[Item])

implicit val itemFormat = Json.format[Item]

implicit val orderFormat: Format[Order] = (
  (JsPath \ "id").format[Int] and
  (JsPath \ "items").format[JsArray].inmap(
    (v: JsArray) => v.value.map(v => v.as[Item]).toList,
    (l: List[Item]) => JsArray(l.map(item => Json.toJson(item)))
  )
)(Order.apply, unlift(Order.unapply))
This also allows you to customize the naming for your JSON object. Below is an example of the serialization in action.
Json.toJson(Order(1, List(Item(2, "Item 2"))))
res0: play.api.libs.json.JsValue = {"id":1,"items":[{"id":2,"name":"Item 2"}]}
Json.parse(
  """
    |{"id":1,"items":[{"id":2,"name":"Item 2"}]}
  """.stripMargin).as[Order]
res1: Order = Order(1,List(Item(2,Item 2)))
I'd also recommend using format instead of reads and writes if you are doing symmetrical serialization/deserialization.
I have a bunch of case classes that I use to build a complex object, Publisher.
sealed case class Status(status: String)
trait Running extends Status
trait Stopped extends Status
case class History(keywords: List[String], updatedAt: Option[DateTime])
case class Creds(user: String, secret: String)
case class Publisher(id: Option[BSONObjectID], name: String, creds: Creds, status: Status, prefs: List[String], updatedAt: Option[DateTime])
I want to convert the Publisher into a JSON string using the play JSON API.
I used Json.toJson(publisher) and it complained about not having an implicit for Publisher. The error went away after I provided the following
implicit val pubWrites = Json.writes[Publisher]
As expected, it is now complaining about not being able to find implicits for Status, BSONObjectID and Creds. However, when I provide implicits for Status and Creds it still complains.
implicit val statusWrites = Json.writes[Status]
implicit val credsWrites = Json.writes[Creds]
Any idea how to resolve this? This is the first time I'm using Play JSON. I've used Json4s before and would like to try this with Play JSON if possible before I go back to Json4s, unless there are clear benefits to using one over the other.
The order of implicits is also important: declare them from the least dependent to the most dependent. If the Writes for Publisher requires the Writes for Status, the implicit Writes for Status should come before the Writes for Publisher.
Here is the code I tested, which works:
import play.modules.reactivemongo.json.BSONFormats._
import play.api.libs.json._
import reactivemongo.bson._
import org.joda.time.DateTime
sealed case class Status(status: String)
trait Running extends Status
trait Stopped extends Status
case class History(keywords: List[String], updatedAt: Option[DateTime])
case class Creds(user: String, secret: String)
case class Publisher(id: Option[BSONObjectID], name: String, creds: Creds, status: Status, prefs: List[String], updatedAt: Option[DateTime])
implicit val statusWrites = Json.writes[Status]
implicit val credsWrites = Json.writes[Creds]
implicit val pubWrites = Json.writes[Publisher]
val p = Publisher(
  Some(new BSONObjectID("123")),
  "foo",
  Creds("bar", "foo"),
  new Status("foo"),
  List("1", "2"),
  Some(new DateTime))

Json.toJson(p)
//res0: play.api.libs.json.JsValue = {"id":{"$oid":"12"},"name":"foo","creds":{"user":"bar","secret":"foo"},"status":{"status":"foo"},"prefs":["1","2"],"updatedAt":1401787836305}
I need to parse the following json string:
{"type": 1}
The case class I am using looks like:
case class MyJsonObj(
  val type: Int
)
However, this confuses Scala since 'type' is a keyword. So, I tried using the @JsonProperty annotation from Jackson/Jerkson as follows:

case class MyJsonObj(
  @JsonProperty("type") val myType: Int
)
However, the Json parser still refuses to look for 'type' string in json instead of 'myType'. Following sample code illustrates the problem:
import com.codahale.jerkson.Json._
import org.codehaus.jackson.annotate._
case class MyJsonObj(
  @JsonProperty("type") val myType: Int
)

object SimpleExample {
  def main(args: Array[String]) {
    val jsonLine = """{"type":1}"""
    val jsonObj = parse[MyJsonObj](jsonLine)
  }
}
I get the following error:
[error] (run-main-a) com.codahale.jerkson.ParsingException: Invalid JSON. Needed [myType], but found [type].
P.S: As seen above, I am using jerkson/jackson, but wouldn't mind switching to some other json parsing library if that makes life easier.
Use backquotes to prevent the Scala compiler from interpreting type as the keyword:
case class MyJsonObj(
  val `type`: Int
)
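Backquotes work anywhere the identifier appears, including construction and field access; the underlying field is still named type in bytecode, which is why JSON libraries can match it against the "type" key without any annotation. A quick, library-free sketch of the mechanics:

```scala
// Backquotes turn the reserved word `type` into an ordinary identifier.
case class MyJsonObj(`type`: Int)

val obj = MyJsonObj(`type` = 1)

// Field access also needs the backquotes:
println(obj.`type`) // prints 1
```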
I suspect you aren't enabling Scala support in Jackson properly.
I've tried this:
import com.fasterxml.jackson.annotation.JsonProperty
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

object Test extends App {
  val mapper = new ObjectMapper
  mapper.registerModule(DefaultScalaModule)

  println(mapper.writeValueAsString(MyJsonObj(1)))

  val obj = mapper.readValue("""{"type":1}""", classOf[MyJsonObj])
  println(obj.myType)
}

case class MyJsonObj(@JsonProperty("type") myType: Int)
And I get:
{"type":1}
1
Note that I've added Scala support to the object mapper by calling registerModule.
As @wingedsubmariner implied, the answer lies with Scala meta annotations.
This worked for me:
import scala.annotation.meta.field

case class MyJsonObj(
  @(JsonProperty @field)("type") val myType: Int
)
This is in addition to mapper.registerModule(DefaultScalaModule), which you'll probably need if you're deserializing into a Scala class.
With Jerkson, I was able to parse a String containing a JSON array, like this:
com.codahale.jerkson.Json.parse[Array[Credentials]](contents)
where contents was a String containing the following:
[{"awsAccountName":"mslinn","accessKey":"blahblah","secretKey":"blahblah"}]
... and I would get the array of Credentials.
(Brief diversion) I tried to do something similar using the new JSON parser for Play 2.1 and Scala using different data. For a simple parse, the following works fine. A case class (S3File) defines the unapply method necessary for this to work:
case class S3File(accountName: String,
                  bucketName: String,
                  endpoint: String = ".s3.amazonaws.com")

implicit val s3FileFormat = Json.format[S3File]
val jsValue = Json.parse(stringContainingJson)
Json.fromJson(jsValue).get
Let's reconsider the original string called contents containing JSON. As with all collections, an array of objects has no unapply method. That means the technique I showed in the diversion above won't work. I tried to create a throwaway case class for this purpose:
case class ArrayCreds(payload: Array[Credentials])
implicit val credsFormat = Json.format[ArrayCreds]
val jsValue = Json.parse(contents)
val credArray = Json.fromJson(jsValue).get.payload
... unfortunately, this fails:
No unapply function found
[error] implicit val credsFormat = Json.format[ArrayCreds]
[error] ^
[error]
/blah.scala:177: diverging implicit expansion for type play.api.libs.json.Reads[T]
[error] starting with method ArrayReads in trait DefaultReads
[error] val credArray = Json.fromJson(jsValue).get
[error] ^
Is there a simple way of parsing an array of JSON using Play 2.1's new JSON parser? I expect the throwaway case class is the wrong approach, and the implicit will need to be instead:
implicit val credsFormat = Json.format[Credentials]
But I don't understand how to write the rest of the deserialization in a simple manner. All of the code examples I have seen are rather verbose, which seems contrary to the spirit of Scala. The ideal incantation would be as simple as Jerkson's incantation.
Thanks,
Mike
I think this is what you're looking for:
scala> import play.api.libs.json._
import play.api.libs.json._
scala> case class Credentials(awsAccountName: String, accessKey: String, secretKey: String)
defined class Credentials
scala> implicit val credentialsFmt = Json.format[Credentials]
credentialsFmt: play.api.libs.json.OFormat[Credentials] = play.api.libs.json.OFormat$$anon$1#1da9be95
scala> val js = """[{"awsAccountName":"mslinn","accessKey":"blahblah","secretKey":"blahblah"}]"""
js: String = [{"awsAccountName":"mslinn","accessKey":"blahblah","secretKey":"blahblah"}]
scala> Json.fromJson[Seq[Credentials]](Json.parse(js))
res3: play.api.libs.json.JsResult[Seq[Credentials]] = JsSuccess(List(Credentials(mslinn,blahblah,blahblah)),)
HTH,
Julien