Scala get JSON value in a specific data type on map object - json

Using jackson library I read json data from a file (each row of file is a JSON object) an parse it to a map object of String and Any. My goal is to save specified keys (id and text) to a collection.
val input = scala.io.Source.fromFile("data.json").getLines()
val mapper = new ObjectMapper() with DefaultScalaModule
val data_collection = mutable.HashMap.empty[Int, String]
for (i <- input){
val parsedJson = mapper.readValue[Map[String, Any]](i)
data_collection.put(
parsedJson.get("id"),
parsedJson.get("text")
)
But as the values in the parsedJson map have the Any type, getting some keys like id and text, it returns Some(value) not just the value with the appropriate type. I expect the values for the id key to be Integer and values for the text to be String.
Running the code I got the error:
Error:(31, 23) type mismatch;
found : Option[Any]
required: Int
parsedJson.get("id"),
Here is a sample of JSON data in the file:
{"text": "Hello How are you", "id": 1}
Is it possible in Scala to parse id values to Int and text values to String, or at least convert Some(value) to value with type Int or String?

If you want to get a plain value from a Map instead of a Option you can use the () (apply) method - However it will throw an exception if the key is not found.
Second, Scala type system is static not dynamic, if you have an Any that's it, it won't change to Int or String at runtime, and the compiler will fail - Nevertheless, you can cast them using the asInstanceOf[T] method, but again if type can't be casted to the target type it will throw an exception.
Please note that even if you can make your code work with the above tricks, that code wouldn't be what you would expect in Scala. There are ways to make the code more typesafe (like pattern matching), but parsing a Json to a typesafe object is an old problem, I'm sure jackson provides a way to parse a json into case class that represent your data. If not take a look to circe it does.

Try the below code :
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import
com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
val input = scala.io.Source.fromFile("data.json").getLines()
val mapper = new ObjectMapper() with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)
val obj = mapper.readValue[Map[String, Any]](input)
val data_collection = mutable.HashMap.empty[Int, String]
for (i <- c) {
data_collection.put(
obj.get("id").fold(0)(_.toString.toInt),
obj.get("text").fold("")(_.toString)
)
}
println(data_collection) // Map(1 -> Hello How are you)

Related

How to convert response body in Json using play framework

override def accessToken(): ServiceCall[RequestTokenLogIn, Done] = {
request=>
val a=request.oauth_token.get
val b=request.oauth_verifier.get
val url=s"https://api.twitter.com/oauth/access_token?oauth_token=$a&oauth_verifier=$b"
ws.url(url).withMethod("POST").get().map{
res=>
println(res.body)
}
The output which I am getting on terminal is
oauth_token=xxxxxxxxx&oauth_token_secret=xxxxxxx&user_id=xxxxxxxxx&screen_name=xxxxx
I want to convert this response in json format.like
{
oauth_token:"",
token_secret:"",
}
When Calling res.json.toString its not converting into jsValue.
Is there any other way or am I missing something?
According to the documentation twitter publishes, it seems that the response is not a valid json. Therefore you cannot convert it automagically.
As I see it you have 2 options, which you are not going to like. In both options you have to do string manipulations.
The first option, which I like less, is actually building the json:
print(s"""{ \n\t"${res.body.replace("=", "\": \"").replace("&", "\"\n\t\"")}" \n}""")
The second option, is to extract the variables into a case class, and let play-json build the json string for you:
case class TwitterAuthToken(oauth_token: String, oauth_token_secret: String, user_id: Long, screen_name: String)
object TwitterAuthToken {
implicit val format: OFormat[TwitterAuthToken] = Json.format[TwitterAuthToken]
}
val splitResponse = res.body.split('&').map(_.split('=')).map(pair => (pair(0), pair(1))).toMap
val twitterAuthToken = TwitterAuthToken(
oauth_token = splitResponse("oauth_token"),
oauth_token_secret = splitResponse("oauth_token_secret"),
user_id = splitResponse("user_id").toLong,
screen_name = splitResponse("screen_name")
)
print(Json.toJsObject(twitterAuthToken))
I'll note that Json.toJsObject(twitterAuthToken) returns JsObject, which you can serialize, and deserialize.
I am not familiar with any option to modify the delimiters of the json being parsed by play-json. Given an existing json you can manipulate the paths from the json into the case class. But that is not what you are seeking for.
I am not sure if it is requires, but in the second option you can define user_id as long, which is harder in the first option.

spark streaming JSON value in dataframe column scala

I have a text file with json value. and this gets read into a DF
{"name":"Michael"}
{"name":"Andy", "age":30}
I want to infer the schema dynamically for each line while Streaming and store it in separate locations(tables) depending on its schema.
unfortunately while I try to read the value.schema it still shows as String. Please help on how to do it on Streaming as RDD is not allowed in streaming.
I wanted to use the following code which doesnt work as the value is still read as String format.
val jsonSchema = newdf1.select("value").as[String].schema
val df1 = newdf1.select(from_json($"value", jsonSchema).alias("value_new"))
val df2 = df1.select("value_new.*")
I even tried to use,
schema_of_json("json_schema"))
val jsonSchema: String = newdf.select(schema_of_json(col("value".toString))).as[String].first()
still no hope.. Please help..
You can load the data as textFile, create case class for person and parse every json string to Person instance using json4s or gson, then creating the Dataframe as follows:
case class Person(name: String, age: Int)
val jsons = spark.read.textFile("/my/input")
val persons = jsons.map{json => toPerson(json) //instead of 'toPerson' actually parse with json4s or gson to return Person instance}
val df = sqlContext.createDataFrame(persons)
Deserialize json to case class using json4s:
https://commitlogs.com/2017/01/14/serialize-deserialize-json-with-json4s-in-scala/
Deserialize json to case class using gson:
https://alvinalexander.com/source-code/scala/scala-case-class-gson-json-object-deserialization-and-scalatra

Klaxon Parsing of Nested Arrays

Im trying to parse this file with Klaxon, generally its going well, except I am totally not succeeding in parsing that subarray of features/[Number]/properties/
So my thought is to get the raw string of properties and to parse it seperately with Klaxon, though I dont succeed in that either. Apart from that I took many other approaches as well.
My code so far:
class Haltestelle(val type: String?, val totalFeatures: Int?, val features: Array<Any>?)
fun main(args: Array<String>) { // Main-Routine
val haltejsonurl = URL("http://online-service.kvb-koeln.de/geoserver/OPENDATA/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=ODENDATA%3Ahaltestellenbereiche&outputFormat=application/json")
val haltestringurl = haltejsonurl.readText()
val halteklx = Klaxon().parse<Haltestelle>(haltestringurl)
println(halteklx?.type)
println(halteklx?.totalFeatures)
println(halteklx?.features)
halteklx?.features!!.forEach {
println(it)
}
I am aware that I am invoking features as an Array of Any, so the Output is just printing me java.lang.Object#blabla everytime. Though, using Array failes either.
Really spend hours in this, how would you go on this?
Regards of newbie
Here's how I did something similar in Kotlin. You can parse the response as a Klaxon JsonObject, then access the "features" element to parse all the array objects into a JsonArray of JsonObjects. This can be iterated over and cast with parseFromJsonObject<Haltestelle> in your example:
import com.beust.klaxon.JsonArray
import com.beust.klaxon.JsonObject
import com.beust.klaxon.Parser
import com.github.aivancioglo.resttest.*
val response : Response = RestTest.get("http://anyurlwithJSONresponse")
val myParser = Parser()
val data : JsonObject = myParser.parse(response.getBody()) as JsonObject
val allFeatures : JsonArray<JsonObject>? = response["features"] as JsonArray<JsonObject>?
for((index,obj) in allFeatures.withIndex()) {
println("Loop Iteration $index on each object")
val yourObj = Klaxon().parseFromJsonObject<Haltestelle>(obj)
}

Inserting JsNumber into Mongo

When trying to insert a MongoDBObject that contains a JsNumber
val obj: DBObject = getDbObj // contains a "JsNumber()"
collection.insert(obj)
the following error occurs:
[error] play - Cannot invoke the action, eventually got an error: java.lang.IllegalArgumentException: can't serialize class scala.math.BigDecimal
I tried to replace the JsNumber with an Int, but I got the same error.
EDIT
Error can be reproduced via this test code. Full code in scalatest (https://gist.github.com/kman007us/6617735)
val collection = MongoConnection()("test")("test")
val obj: JsValue = Json.obj("age" -> JsNumber(100))
val q = MongoDBObject("name" -> obj)
collection.insert(q)
There are no registered handlers for Plays JSON implementation - you could add handlers to automatically translate plays Js Types to BSON types. However, that wont handle mongodb extended json which has a special structure dealing with non native json types eg: date and objectid translations.
An example of using this is:
import com.mongodb.util.JSON
val obj: JsValue = Json.obj("age" -> JsNumber(100))
val doc: DBObject = JSON.parse(obj.toString).asInstanceOf[DBObject]
For an example of a bson transformer see the joda time transformer.
It seems that casbah driver isn't compatible with Plays's JSON implementation. If I look through the cashbah code than it seems that you must use a set of MongoDBObject objects to build your query. The following snippet should work.
val collection = MongoConnection()("test")("test")
val obj = MongoDBObject("age" -> 100)
val q = MongoDBObject("name" -> obj)
collection.insert(q)
If you need the compatibility with Play's JSON implementation then use ReactiveMongo and Play-ReactiveMongo.
Edit
Maybe this Gist can help to convert JsValue objects into MongoDBObject objects.

How do I deserialize a JSON array using the Play API

I have a string that is a Json array of two objects.
> val ss = """[ {"key1" :"value1"}, {"key2":"value2"}]"""
I want to use the Play Json libraries to deserialize it and create a map from the key values to the objects.
def deserializeJsonArray(ss:String):Map[String, JsValue] = ???
// Returns Map("value1" -> {"key1" :"value1"}, "value2" -> {"key2" :"value2"})
How do I write the deserializeJsonArray function? This seems like it should be easy, but I can't figure it out from either the Play documentation or the REPL.
I'm a bit rusty, so please forgive the mess. Perhaps another overflower can come in here and clean it up for me.
This solution assumes that the JSON is an array of objects, and each of the objects contains exactly one key-value pair. I would highly recommend spicing it up with some error handling and/or pattern matching to validate the parsed JSON string.
def deserializeJsonArray(ss: String): Map[String, JsValue] = {
val jsObjectSeq: Seq[JsObject] = Json.parse(ss).as[Seq[JsObject]]
val jsValueSeq: Seq[JsValue] = Json.parse(ss).as[Seq[JsValue]]
val keys: Seq[String] = jsObjectSeq.map(json => json.keys.head)
(keys zip jsValueSeq).toMap
}