Consuming Kafka DStream in Spark Streaming Process - json

I'm consuming a Kafka topic inside a spark streaming program like this:
import ...
object KafkaStreaming {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("KafkaStreaming").setMaster("local[*]")
val sc = new SparkContext(conf)
val ssc = new StreamingContext(sc, Seconds(10))
val kafkaConf = Map(
...
)
val messages = KafkaUtils.createDirectStream[String, String](
ssc,
LocationStrategies.PreferConsistent,
ConsumerStrategies.Subscribe[String, String](Seq("topic"), kafkaConf)
)
val lines: DStream[String] = messages.map(_.value)
val line: DStream[String] = lines.flatMap(_.split("\n"))
process(line)
ssc.start()
ssc.awaitTermination()
}
def process(line: DStream[String]): Unit =
{
// here is where I want to convert the DStream to JSON
var json: Option[Any] = JSON.parseFull(line) // <--
println(json.getOrElse("json is NULL"))
if(json.isEmpty == false) {
println("NOT FALSE")
var map = json.get.asInstanceOf[Map[String, Any]]
// use every member of JSON document to access the value
map.get("any json element").toString
// do some other manipulation
}
}
}
Inside the process function I want to manipulate each line of string to extract a JSON object out of it and perform further processing and persisting. How can I do it?

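Instead of passing a DStream[String] to process, make it take a single String, apply it with DStream.map, and then consume the results with foreachRDD:
def process(line: String): String = ??? // parse the JSON and return the transformed record
And then: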
messages
.map(_.value)
.flatMap(_.split("\n"))
.map(process)
.foreachRDD { rdd =>
rdd.foreachPartition { itr =>
// Do stuff with `Iterator[String]` after JSON transformation
}
}
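As a minimal sketch of what a per-line JSON step could look like, the helper below takes the parser as a parameter so that it stays free of Spark and of any particular JSON library. The names LineToJson and toJsonMap are hypothetical, and parse is assumed to behave like JSON.parseFull from the question (returning None for invalid input).

```scala
// Hypothetical helper, not part of Spark or of any JSON library.
object LineToJson {
  // `parse` stands in for a parser such as JSON.parseFull from the question.
  // Returns the parsed JSON object, or None when the line is not valid JSON
  // or is valid JSON of another kind (array, number, string).
  def toJsonMap(parse: String => Option[Any])(line: String): Option[Map[String, Any]] =
    parse(line).collect {
      // Keep only JSON objects; the cast is safe for parsers that
      // represent objects as Map[String, Any].
      case m: Map[_, _] => m.asInstanceOf[Map[String, Any]]
    }
}
```

Under those assumptions it could be plugged into the pipeline above as .map(LineToJson.toJsonMap(JSON.parseFull)), which yields an Option per line instead of throwing on malformed input.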

Related

How do I de-serialize this json?

I am working on a project where I need to access currency rates once a day so I am trying to use this json.
To read this, I am simply getting the text of the URL and then trying to use the JSONReader to de-serialize.
val url = URL("https://www.floatrates.com/daily/usd.json")
val stream = url.openStream()
url.readText()
val jsonReader = JsonReader(InputStreamReader(stream))
jsonReader.isLenient = true
jsonReader.beginObject()
while (jsonReader.hasNext()) {
val codeName:String = jsonReader.nextName()
jsonReader.beginObject();
var code:String? = null
var rate = 0.0
while (jsonReader.hasNext()) {
val name:String = jsonReader.nextName()
when(name){
"code" -> {
code = jsonReader.nextString()
break
}
"rate" -> {
rate = jsonReader.nextDouble()
break
}
else -> {
jsonReader.skipValue()
}
}
code?.let {
rates?.set(it, rate)
}
}
}
jsonReader.endObject();
When I run the code, I get:
Expected BEGIN_OBJECT but was STRING
at
jsonReader.beginObject();
When I try using Gson, with the code below:
var url = URL("https://www.floatrates.com/daily/usd.json").readText()
//url = "[${url.substring(1, url.length - 1)}]"
val gson = Gson()
val currencies:Array<SpesaCurrency> = gson.fromJson(url, Array<SpesaCurrency>::class.java)
I get this error:
Expected BEGIN_OBJECT but was STRING at line 1 column 3 path $[0]
at:
gson.fromJson(url, Array<SpesaCurrency>::class.java)
SpesaCurrency.kt looks like this:
class SpesaCurrency(
val code:String,
val alphaCode:String,
val numericCode:String,
val name:String,
val rate:Float,
val date:String,
val inverseRate:Float
)
Any help is greatly appreciated.
I think a call to jsonReader.endObject() is missing in your code. That imbalance causes the program to fail after the first object has been read. Add it right after the inner loop:
while (jsonReader.hasNext()) {
...
}
jsonReader.endObject();
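The balanced begin/end calls described above can be sketched as follows. This is only an illustration, written in Scala rather than Kotlin, and it assumes Gson's com.google.gson.stream.JsonReader and a feed shaped like {"eur": {"code": "EUR", "rate": 0.9, ...}, ...}. The names RatesReader and readRates are hypothetical.

```scala
import java.io.StringReader
import com.google.gson.stream.JsonReader
import scala.collection.mutable

object RatesReader {
  // Reads an object of currency entries into a map of code -> rate.
  def readRates(json: String): Map[String, Double] = {
    val rates = mutable.Map.empty[String, Double]
    val reader = new JsonReader(new StringReader(json))
    try {
      reader.beginObject()                  // opens the outer object
      while (reader.hasNext()) {
        reader.nextName()                   // entry key such as "eur", unused here
        var code: Option[String] = None
        var rate = 0.0
        reader.beginObject()                // opens one currency entry
        while (reader.hasNext()) {
          reader.nextName() match {
            case "code" => code = Some(reader.nextString())
            case "rate" => rate = reader.nextDouble()
            case _      => reader.skipValue()
          }
        }
        reader.endObject()                  // closes the currency entry
        code.foreach(c => rates(c) = rate)
      }
      reader.endObject()                    // closes the outer object
    } finally reader.close()
    rates.toMap
  }
}
```

Every beginObject() here has a matching endObject(), and the inner loop always runs to the end of an entry before that entry is closed.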

How to convert Scala Document to JSON in Scala

I want to convert variable message which is of type scala.Seq[Scala.Document] to JSON format in following code:
path("getMessages"){
get {
parameters('roomname.as[String]) {
(roomname) =>
try {
val messagesByGroupName = MongoDatabase.collectionForChat.find(equal("groupChatName",roomname)).toFuture()
val messages = Await.result(messagesByGroupName,60.seconds)
println("Messages:"+messages)
complete(messages)
}
catch {
case e:TimeoutException =>
complete("Reading file timeout.")
}
}
}
But I get an error on the complete(messages) line: it does not accept a value of that type.
I tried to convert it to JSON with the following:
import play.api.libs.json._
object MyJsonProtocol{
implicit object ChatFormat extends Format[Chat] {
def writes(c: Chat) : JsValue = {
val chatSeq = Seq (
"sender" -> JsString(c.sender),
"receiver" -> JsString(c.receiver),
"message" -> JsString(c.message),
"groupChatName" -> JsString(c.groupChatName),
)
JsObject(chatSeq)
}
def reads(value: JsValue) = {
JsSuccess(Chat("","","",""))
}
}
}
But it is not working.
My Chat.scala class is as follows:
import play.api.libs.json.{Json, Reads, Writes}
class Chat(var sender:String,var receiver:String,var message:String, var groupChatName:String){
def setSenderName(senderName:String) = {
sender = senderName
}
def setReceiverName(receiverName:String) = {
receiver = receiverName
}
def setMessage(getMessage:String) = {
message = getMessage
}
def setGroupChatName(chatName:String) = {
groupChatName = chatName
}
}
object Chat {
def apply(sender: String, receiver: String, message: String, groupname: String): Chat
= new Chat(sender, receiver, message,groupname)
def unapply(arg: Chat): Option[(String, String, String,String)] = ???
implicit val requestReads: Reads[Chat] = Json.reads[Chat]
implicit val requestWrites: Writes[Chat] = Json.writes[Chat]
}
I also cannot figure out what to write in the unapply method.
I am new to Scala and Akka.
EDIT:
My MongoDatabase.scala which has collection is as follows:
object MongoDatabase {
val chatCodecProvider = Macros.createCodecProvider[Chat]()
val codecRegistry = CodecRegistries.fromRegistries(
CodecRegistries.fromProviders(chatCodecProvider),
DEFAULT_CODEC_REGISTRY
)
implicit val system = ActorSystem("Scala_jwt-App")
implicit val executor: ExecutionContext = system.dispatcher
val mongoClient: MongoClient = MongoClient()
val databaseName = sys.env("database_name")
// Getting mongodb database
val database: MongoDatabase = mongoClient.getDatabase(databaseName).withCodecRegistry(codecRegistry)
val registrationCollection = sys.env("register_collection_name")
val chatCollection = sys.env("chat_collection")
// Getting mongodb collection
val collectionForUserRegistration: MongoCollection[Document] = database.getCollection(registrationCollection)
collectionForUserRegistration.drop()
val collectionForChat: MongoCollection[Document] = database.getCollection(chatCollection)
collectionForChat.drop()
}
And if I try to change val collectionForChat: MongoCollection[Document] = database.getCollection(chatCollection)
to
val collectionForChat: MongoCollection[Chat] = database.getCollection[Chat](chatCollection)
then I get an error in the saveChatMessage() method below:
def saveChatMessage(sendMessageRequest: Chat) : String = {
val senderToReceiverMessage : Document = Document(
"sender" -> sendMessageRequest.sender,
"receiver" -> sendMessageRequest.receiver,
"message" -> sendMessageRequest.message,
"groupChatName" -> sendMessageRequest.groupChatName)
val chatAddedFuture = MongoDatabase.collectionForChat.insertOne(senderToReceiverMessage).toFuture()
Await.result(chatAddedFuture,60.seconds)
"Message sent"
}
The error is on the line val chatAddedFuture = MongoDatabase.collectionForChat.insertOne(senderToReceiverMessage).toFuture(), since it accepts data of type Seq[Document] and I am trying to add data of type Seq[Chat].
I am going to assume that MongoDatabase.collectionForChat.find(equal("groupChatName",roomname)) returns either Seq[Chat] or Chat. Both are handled the same way by play-json.
You have two options.
The first is to add the default format to the companion object:
object Chat {
implicit val format: Format[Chat] = Json.format[Chat]
}
In this case you can delete the object MyJsonProtocol, which is then unused.
The second, if you want to keep your own serializers (i.e. MyJsonProtocol), is to rename MyJsonProtocol to Chat. That way the complete route will be able to find the implicit Format.
Create a case class for the message object you want to send, for example:
case class MyMessage(sender: String, receiver: String, message: String, groupChatName: String)
Then create a Format for that case class:
implicit val MessageTypeFormat = Json.format[MyMessage]
If complete expects a JSON value, convert the instance first, where myMessage is an instance of MyMessage:
complete(Json.toJson(myMessage))

Invoke jsonStringify() outside Gatling EL

Is it possible to invoke jsonStringify() to get a properly formatted JSON string outside a Gatling EL?
I need to convert a Map into its JSON String to calculate a signature.
val scn = scenario("My Scenario")
.exec(buildPayload)
.exec(http("Post")
.post("/api/postSomething")
.asJson
.body(StringBody("${payload.jsonStringify()}"))
def buildPayload: Expression[Session] = session => {
val header = Map(...)
val data = Map(...)
val signature = calculateSignature(JsonStringify(data)) // << is it possible??
val payload = Map(
"header" -> header,
"data" -> data,
"signature" -> signature
)
session.set("payload", payload)
}
def calculateSignature(payload: String): String = {
...
}
Do you see any other approach?

SCALA How to parse json back to the controller?

I am new to Scala. I want to parse JSON data in Scala and store it in a database table.
My GET method looks like this (Please ignore the permissions):
def Classes = withAuth { username =>
implicit request =>
User.access(username, User.ReadXData).map { user =>
implicit val writer = new Writes[Class] {
def writes(entry: Class): JsValue = Json.obj(
"id" -> entry.id,
"name" -> entry.name
)
}
val classes = (Class.allAccessible(user))
Ok(Json.obj("success" -> true, "classes" -> classes))
}.getOrElse(Forbidden(Application.apiMessage("Not authorised"))) }
This GET method returns the json below:
"success":true,"schools":[{"id":93,"name":"Happy unniversity",}]}
I'm currently rendering the JSON in a DataTables JS (Editor) grid, and that works.
However, I'm unable to parse the JSON on POST and store it in the database (MySQL) table.
Thank you for your guidance!
It looks like you are using play-json.
For a class User:
import play.api.libs.json.Json
final case class User(id: String, name: String)
object User {
implicit val userFormat = Json.format[User]
}
object UserJson {
def main(args: Array[String]): Unit = {
val user = User("11", "Peter")
val json = Json.toJson(user).toString()
println("json ===> " + json)
val user2 = Json.parse(json).as[User]
println("name ===> " + user2.name)
}
}
I definitely recommend this lib: "de.heikoseeberger" %% "akka-http-jackson" % "1.27.0" for akka-http.

Find the maximum value from JSON data in Scala

I am very new to programming in Scala. I am writing a test program to get maximum value from JSON data. I have following code:
import scala.io.Source
import scala.util.parsing.json._
object jsonParsing{
//Id int `json:"id"`
//Price int `json:"price"`
def main(args: Array[String]): Unit = {
val file_name = "jsonData.txt"
val json_string = scala.io.Source.fromFile("jsonData.txt").getLines.mkString
val json_arr = json_string.split(",")
json_arr.foreach {println}
}
}
The json_arr.foreach {println} call prints the following data:
[{ "id":1
"price":4629}
{ "id":2
"price":7126}
{ "id":3
"price":8862}
{ "id":4
"price":8999}
{ "id":5
"price":1095}]
I am stuck at the part of figuring out how to find the maximum price from such JSON data? That is, in this case the output should be '8999'.
You can try something like this:
package com.x.x.integration.commons
import collection.immutable.IndexedSeq
import com.google.gson.Gson
import com.google.gson.JsonObject
import com.google.gson.JsonParser
case class wrapperObject(val json_string: Array[MyJsonObject])
case class MyJsonObject(val id:Int ,val price:Int)
object Demo {
val gson = new Gson()
def main(args: Array[String])={
val json_string = scala.io.Source.fromFile("jsonData.txt").getLines.mkString
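// getAsJsonObject below expects the array to be wrapped in an object with a json_string key, e.g.:
//val json_string= """{"json_string":[{"id":1,"price":4629},{"id":2,"price":7126},{"id":3,"price":8862},{"id":4,"price":8999},{"id":5,"price":1095}]}"""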
val jsonStringAsObject= new JsonParser().parse(json_string).getAsJsonObject
val objectThatYouCanPlayWith:wrapperObject = gson.fromJson(jsonStringAsObject, classOf[wrapperObject])
var maxPrice:Int = 0
for(i <- objectThatYouCanPlayWith.json_string if i.price>maxPrice)
{
maxPrice= i.price
}
println(maxPrice)
}
}
Check whether this helps.
I also recommend using Json4s or Play JSON.
But you could also do it without any library:
val json = """[{"id":1,"price":100},{"id":2, "price": 200}]"""
val priceRegex = """"price"\s*:\s*(\d+)""".r
val maxPrice = priceRegex.findAllIn(json).map({
case priceRegex(price) => price.toInt
}).max
println(maxPrice) // print 200
Although Play JSON is handy, you could use Regex as well.
import scala.io.Source
import scala.util.matching.Regex._
val jsonString = Source
.fromFile("jsonData.txt")
.getLines.mkString.split(",")
var maxPrice = 0
jsonString.foreach(each => {
val price: Option[Match] = ("\"price\":(\\d+)").r.findFirstMatchIn(each)
if (price.isDefined) {
maxPrice = Math.max(maxPrice, price.get.group(1).toInt)
}
})
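The regex approach from the two answers above can be folded into one small function that also covers input with no price field. This is a sketch using only the standard library. MaxPrice and maxPrice are hypothetical names, and like the answers it assumes prices are non-negative integers.

```scala
import scala.util.matching.Regex

object MaxPrice {
  // Matches "price": <digits>, tolerating whitespace around the colon.
  private val PriceField: Regex = """"price"\s*:\s*(\d+)""".r

  // Returns the largest price found, or None when the text has no price field.
  def maxPrice(json: String): Option[Int] = {
    val prices = PriceField.findAllMatchIn(json).map(_.group(1).toInt).toList
    if (prices.isEmpty) None else Some(prices.max)
  }

  def main(args: Array[String]): Unit = {
    val sample = """[{"id":1,"price":4629},{"id":4,"price":8999},{"id":5,"price":1095}]"""
    println(maxPrice(sample)) // prints Some(8999)
  }
}
```

Returning an Option avoids both the exception that .max throws on an empty collection and the ambiguity of starting from a maxPrice of 0.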