Scala/Spark: NoClassDefFoundError: net/liftweb/json/Formats - json

I am trying to create a JSON String from a Scala Object as described here.
I have the following code:
import scala.collection.mutable._
import net.liftweb.json._
import net.liftweb.json.Serialization.write

case class Person(name: String, address: Address)
case class Address(city: String, state: String)

object LiftJsonTest extends App {
  val p = Person("Alvin Alexander", Address("Talkeetna", "AK"))

  // create a JSON string from the Person, then print it
  implicit val formats = DefaultFormats
  val jsonString = write(p)
  println(jsonString)
}
My build.sbt file contains the following:
libraryDependencies += "net.liftweb" %% "lift-json" % "2.5+"
When I build with sbt package, the build succeeds.
However, when I try to run it as a Spark job, like this:
spark-submit \
--packages com.amazonaws:aws-java-sdk-pom:1.10.34,org.apache.hadoop:hadoop-aws:2.6.0,net.liftweb:lift-json:2.5+ \
--class "com.foo.MyClass" \
--master local[4] \
target/scala-2.10/my-app_2.10-0.0.1.jar
I get this error:
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: net.liftweb#lift-json;2.5+: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1068)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:287)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
What am I doing wrong here? Is net.liftweb:lift-json:2.5+ in my packages argument incorrect? Do I need to add a resolver in build.sbt?

Users may also include any other dependencies by supplying a comma-delimited list of maven coordinates with --packages.
2.5+ in your build.sbt is Ivy version-matcher syntax, not an actual artifact version, which is what Maven coordinates need. spark-submit apparently doesn't support such dynamic version matchers (and I think it would be surprising if it did; your application could suddenly stop working because a new dependency version was published). So you need to find out what version 2.5+ resolves to in your case, e.g. using https://github.com/jrudolph/sbt-dependency-graph (or by looking for it in the output of show dependencyClasspath).
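Once you know the concrete version, one way to fix it (a minimal sketch; the 2.6.3 below is an assumption, substitute whatever 2.5+ actually resolved to in your build) is to pin that version in build.sbt and pass the same fully qualified coordinate, Scala suffix included, to --packages:

// build.sbt: pin a concrete version instead of the Ivy matcher "2.5+"
libraryDependencies += "net.liftweb" %% "lift-json" % "2.6.3"

// spark-submit then takes the full Maven coordinate, including the Scala suffix, e.g.
// --packages net.liftweb:lift-json_2.10:2.6.3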

Related

Play2 on Scala : JSON serialization/deserialization

I am new to Play/Scala and started porting a Spring Boot REST API to Play2 as a learning exercise.
In Java/Spring REST, it's simply a matter of annotating POJOs, and the JSON library handles serialization/deserialization automatically.
According to every Play2/Scala tutorial I have read, I have to write a Writes/Reads for each model/case class, as follows:
implicit val writesItem = Writes[ClusterStatus] {
  case ClusterStatus(gpuFreeMemory, gpuTotalMemory, labelsLoaded, status) =>
    Json.obj("gpuFreeMemory" -> gpuFreeMemory,
             "gpuTotalMemory" -> gpuTotalMemory,
             "labelsLoaded" -> labelsLoaded,
             "status" -> status)
}

// HTTP method
def status() = Action { request =>
  val status: ClusterStatus = clusterService.status()
  Ok(Json.toJson(status))
}
This means that if I have a large domain/response model, I have to write a lot of Writes/Reads for serialization/deserialization?
Is there a simpler way to handle this?
You can give "com.typesafe.play" %% "play-json" % "2.7.2" a try. To use it you just need to follow the steps below:
1) Add the dependencies below (use the versions appropriate for your project):
"com.typesafe.play" %% "play-json" % "2.7.2",
"net.liftweb" % "lift-json_2.11" % "2.6.2"
2) Define formats:
implicit val formats = DefaultFormats
implicit val yourCaseClassFormat= Json.format[YourCaseClass]
This format defines both Reads and Writes for your case class.
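Put together, a minimal sketch for the question's ClusterStatus (the field types here are assumptions inferred from the example) shows the macro-generated format replacing the hand-written Writes:

import play.api.libs.json._

// Field types are assumed from the question's example
case class ClusterStatus(gpuFreeMemory: Long, gpuTotalMemory: Long,
                         labelsLoaded: Boolean, status: String)

object ClusterStatus {
  // One line generates both Reads and Writes from the case class definition
  implicit val format: OFormat[ClusterStatus] = Json.format[ClusterStatus]
}

val json = Json.toJson(ClusterStatus(512L, 2048L, labelsLoaded = true, "OK"))
val back = json.as[ClusterStatus]   // round-trips without any hand-written Writes/Reads

With the implicit format in the companion object, Ok(Json.toJson(status)) in the action works unchanged.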

Encoding/Decode shapeless records with circe

Upgrading circe from 0.4.1 to 0.7.0 broke the following code:
import shapeless._
import syntax.singleton._
import io.circe.generic.auto._
.run[Record.`'transaction_id -> Int`.T](transport)
def run[A](transport: Json => Future[Json])(implicit decoder: Decoder[A], exec: ExecutionContext): Future[A]
With the following error:
could not find implicit value for parameter decoder: io.circe.Decoder[shapeless.::[Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("transaction_id")],Int],shapeless.HNil]]
[error] .run[Record.`'transaction_id -> Int`.T](transport)
[error] ^
Am I missing some import here or are these encoders/decoders not available in circe anymore?
Instances for Shapeless's hlists, records, etc. were moved to a separate circe-shapes module in the circe 0.6.0 release. If you add this module to your build, the following should just work:
import io.circe.jawn.decode, io.circe.shapes._
import shapeless._, record.Record, syntax.singleton._
val doc = """{ "transaction_id": 1 }"""
val res = decode[Record.`'transaction_id -> Int`.T](doc)
The motivation for moving these instances was that the improved generic derivation introduced in 0.6 meant that they were no longer necessary, and keeping them out of implicit scope when they're not needed is both cleaner and potentially supports faster compile times. The new circe-shapes module also includes features that were not available in circe-generic, such as instances for coproducts.
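For completeness, adding the module in sbt would look roughly like this (the version below is an assumption; pick the circe-shapes version matching your circe version):

libraryDependencies += "io.circe" %% "circe-shapes" % "0.7.0"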

How to parse Json formatted Kafka message in spark streaming

I have JSON messages on Kafka like this:
{"id_post":"p1", "message":"blablabla"}
and I want to parse the message and print (or use for further computation) the message element.
With the following code I print the whole JSON:
val kafkaStream = KafkaUtils.createStream(ssc, zkQuorum, inputGroup, topicMap)
val postStream = kafkaStream.map(_._2)
postStream.foreachRDD((rdd, time) => {
  val count = rdd.count()
  if (count > 0) {
    rdd.foreach(record => {
      println(record)
    })
  }
})
but I can't manage to get at the individual elements.
I tried a few JSON parsers, but no luck.
Any ideas?
Update:
a few errors with different JSON parsers.
This is the code and output with the circe parser:
val parsed_record = parse(record)
and the output:
14:45:00,676 ERROR Executor:95 - Exception in task 0.0 in stage 4.0 (TID 4)
java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at io.circe.jawn.CirceSupportParser$$anon$1$$anon$4.add(CirceSupportParser.scala:36)
at jawn.CharBasedParser$class.parseString(CharBasedParser.scala:90)
at jawn.StringParser.parseString(StringParser.scala:15)
at jawn.Parser.rparse(Parser.scala:397)
at jawn.Parser.parse(Parser.scala:338)
at jawn.SyncParser.parse(SyncParser.scala:24)
at jawn.SupportParser$$anonfun$parseFromString$1.apply(SupportParser.scala:15)
and so on, at the line where I use parse(record).
It looks like it can't access and/or parse the string record.
Same if I use lift-json;
at parse(record) the error output is more or less the same:
16:58:20,425 ERROR Executor:95 - Exception in task 0.0 in stage 4.0 (TID 4)
java.lang.NoSuchMethodError: scala.runtime.ObjectRef.create(Ljava/lang/Object;)Lscala/runtime/ObjectRef;
at net.liftweb.json.JsonParser$$anonfun$2.apply(JsonParser.scala:144)
at net.liftweb.json.JsonParser$$anonfun$2.apply(JsonParser.scala:141)
at net.liftweb.json.JsonParser$.parse(JsonParser.scala:80)
at net.liftweb.json.JsonParser$.parse(JsonParser.scala:45)
at net.liftweb.json.package$.parse(package.scala:40)
at SparkConsumer$$anonfun$main$1$$anonfun$apply$1.apply(SparkConsumer.scala:98)
at SparkConsumer$$anonfun$main$1$$anonfun$apply$1.apply(SparkConsumer.scala:95)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
I solved the issue, so I am writing it here for future reference:
dependencies, dependencies, dependencies!
I chose to use lift-json, but this applies to any JSON parser and/or framework.
The Spark version I am using (v1.4.1) is the one built against Scala 2.10; here are the dependencies from pom.xml:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.10</artifactId>
  <version>1.4.1</version>
  <scope>provided</scope>
</dependency>
and some other libraries. I was using the lift-json version for Scala 2.11 ... and that is WRONG.
So, for future me and for anyone reading this topic: keep the Scala version consistent across all dependencies.
In lift-json's case:
<dependency>
  <groupId>net.liftweb</groupId>
  <artifactId>lift-json_2.10</artifactId>
  <version>3.0-M1</version>
</dependency>
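Once the Scala versions line up, extracting a single field with lift-json is straightforward. A minimal sketch using the question's sample message (the same parse call then works unchanged on each record inside the foreachRDD loop):

import net.liftweb.json._

implicit val formats = DefaultFormats

val record = """{"id_post":"p1", "message":"blablabla"}"""  // one Kafka message body
val json = parse(record)
val message = (json \ "message").extract[String]
println(message)   // prints: blablabla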
I had the same problem as you.
However, I solved it by using fastjson.
SBT dependency :
// http://mvnrepository.com/artifact/com.alibaba/fastjson
libraryDependencies += "com.alibaba" % "fastjson" % "1.2.12"
or
Maven dependency :
<!-- http://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
  <groupId>com.alibaba</groupId>
  <artifactId>fastjson</artifactId>
  <version>1.2.12</version>
</dependency>
You can give it a try. Hope this is helpful.
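As an illustration, a minimal sketch of parsing the question's message with fastjson might look like this (the field names come from the question; everything else is just a standalone example):

import com.alibaba.fastjson.JSON

val record = """{"id_post":"p1", "message":"blablabla"}"""
val obj = JSON.parseObject(record)          // parse the raw JSON string
println(obj.getString("id_post"))           // p1
println(obj.getString("message"))           // blablabla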
Extracting the Data from JSON String in Scala/Apache Spark
import org.apache.spark.rdd.RDD

object JsonData extends Serializable {

  def main(args: Array[String]): Unit = {
    val msg = "{ \"id_post\":\"21\",\"message\":\"blablabla\"}"
    val m1 = msgParse(msg)
    println(m1.id_post)
  }

  case class SomeClass(id_post: String, message: String) extends Serializable

  def msgParse(msg: String): SomeClass = {
    import org.json4s._
    import org.json4s.native.JsonMethods._
    implicit val formats = DefaultFormats
    parse(msg).extract[SomeClass]
  }
}
Below is the Maven dependency:
<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-native_2.10</artifactId>
  <version>3.3.0</version>
</dependency>

Play Framework 2 JSON Reads, deserialization of one variable

I'm using Play Framework 2.4 and I'm trying to do a basic JSON deserialization with Reads but I get an error. Here is the code:
case class Config(action: String)
and somewhere,
implicit val configReads: Reads[Config] = (
  (__ \ "action").read[String]
)(Config.apply _)
I think the configReads is correctly formed, but I get an IDE error on the "read" method call (symbol not defined), and when I compile the code I get the following error:
Error:(30, 27) overloaded method value read with alternatives:
(t: String)play.api.libs.json.Reads[String] <and>
(implicit r: play.api.libs.json.Reads[String])play.api.libs.json.Reads[String]
cannot be applied to (String => wings.common.json.Config)
(__ \ "action").read[String]
^
However, if instead of trying to deserialize ONE argument I declare a class with TWO constructor arguments and write the code to deserialize it, it works.
Does anybody know how to solve this?
Edit:
Digging into the depths of Google, I found this for Play 2.1.x, but I'm using the JSON library for Play 2.4.1, so this problem should not be happening.
You can do it like this:
implicit val configReads: Reads[Config] = (
  (__ \ "action").read[String]
) map Config.apply
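A quick usage sketch of the resulting Reads, with the case class and configReads above in scope (the JSON payload here is made up for illustration):

import play.api.libs.json._

val json = Json.parse("""{ "action": "restart" }""")
json.validate[Config] match {
  case JsSuccess(config, _) => println(config.action)   // prints: restart
  case JsError(errors)      => println(errors)
}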

How to prevent sbt from running integration tests?

The Maven surefire plugin doesn't run integration tests (they are named with an "IT" suffix by convention), but sbt runs both unit and integration tests. So, how can I prevent this behaviour? Is there a common way to distinguish integration and unit tests for ScalaTest (e.g. don't run FeatureSpec tests by default)?
How to do that is documented in the sbt manual at http://www.scala-sbt.org/release/docs/Detailed-Topics/Testing#additional-test-configurations-with-shared-sources :
// Build.scala
import sbt._
import Keys._

object B extends Build {
  lazy val root =
    Project("root", file("."))
      .configs(FunTest)
      .settings(inConfig(FunTest)(Defaults.testTasks): _*)
      .settings(
        libraryDependencies += specs,
        testOptions in Test := Seq(Tests.Filter(unitFilter)),
        testOptions in FunTest := Seq(Tests.Filter(itFilter))
      )

  def itFilter(name: String): Boolean = name endsWith "ITest"
  def unitFilter(name: String): Boolean = (name endsWith "Test") && !itFilter(name)

  lazy val FunTest = config("fun") extend (Test)
  lazy val specs = "org.scala-tools.testing" %% "specs" % "1.6.8" % "test"
}
Call sbt test for unit tests, sbt fun:test for integration tests, and sbt test fun:test for both.
The simplest way with more recent sbt versions is just to apply the built-in IntegrationTest configuration and the corresponding settings as described here, and put your integration tests in the src/it/scala directory of your project.
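For that built-in approach, a minimal build.sbt sketch might look like this (the ScalaTest version is an assumption; adjust it to your project). Integration tests then live in src/it/scala and run with sbt it:test, while sbt test keeps running only the unit tests in src/test/scala:

lazy val root = (project in file("."))
  .configs(IntegrationTest)
  .settings(
    Defaults.itSettings,   // wires up the it:test, it:compile, ... tasks
    libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.8" % "it,test"
  )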