There are many JSON parsers in Kotlin like Forge, Gson, JSON, Jackson... But they deserialize the JSON to a data class, meaning it's needed to define a data class with the properties corresponding to the JSON, and this for every JSON which has a different structure.
But what if you don't want to define a data class for every JSON you could have to parse?
I'd like to have a parser which wouldn't use data classes, for example it could be something like:
val jsonstring = '{"a": "b", "c": {"d: "e"}}'
parse(jsonstring).get("c").get("d") // -> "e"
Just something that doesn't require me to write a data class like
data class DataClass (
val a: String,
val b: AnotherDataClass
)
data class AnotherDataClass (
val d: String
)
which is very heavy and not useful for my use case.
Does such a library exist? Thanks!
With kotlinx.serialization you can parse JSON String into a JsonElement:
val json: Map<String, JsonElement> = Json.parseToJsonElement(jsonstring).jsonObject
You can use JsonPath
val json = """{"a": "b", "c": {"d": "e"}}"""
val context = JsonPath.parse(json)
val str = context.read<String>("c.d")
println(str)
Output:
Result: e
Related
I have this API that returns a list as a response:
https://animension.to/public-api/search.php?search_text=&season=&genres=&dub=&airing=&sort=popular-week&page=2
This returns a response string in list format like this
[["item", 1, "other item"], ["item", 2, "other item"]]
How do I convert this so that I can use it as a list, not a string data type?
One way is to use Kotlinx Serialization. It doesn't have direct support for tuples, but it's easy to parse the input as a JSON array, then manually map to a specific data type (if we make some assumptions).
First add a dependency on Kotlinx Serialization.
// build.gradle.kts
plugins {
kotlin("jvm") version "1.7.10"
kotlin("plugin.serialization") version "1.7.10"
// the KxS plugin isn't needed, but it might come in handy later!
}
dependencies {
implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.4.0")
}
(See the Kotlinx Serialization README for Maven instructions.)
Then we can use the JSON mapper to parse the result of the API request.
import kotlinx.serialization.json.*
fun main() {
val inputJson = """
[["item", 1, "other item"], ["item", 2, "other item"]]
""".trimIndent()
// Parse, and use .jsonArray to force the results to be a JSON array
val parsedJson: JsonArray = Json.parseToJsonElement(inputJson).jsonArray
println(parsedJson)
// [["item",1,"other item"],["item",2,"other item"]]
}
It's much easier to manually map this JsonArray object to a data class than it is to try and set up a custom Kotlinx Serializer.
import kotlinx.serialization.json.*
fun main() {
val inputJson = """
[["item", 1, "other item"], ["item", 2, "other item"]]
""".trimIndent()
val parsedJson: JsonArray = Json.parseToJsonElement(inputJson).jsonArray
val results = parsedJson.map { element ->
val data = element.jsonArray
// Manually map to the data class.
// This will throw an exception if the content isn't the correct type...
SearchResult(
data[0].jsonPrimitive.content,
data[1].jsonPrimitive.int,
data[2].jsonPrimitive.content,
)
}
println(results)
// [SearchResult(title=item, id=1, description=other item), SearchResult(title=item, id=2, description=other item)]
}
data class SearchResult(
val title: String,
val id: Int,
val description: String,
)
The parsing I've defined is strict, and assumes each element in the list will also be a list of 3 elements.
The JSON output that I am looking for is
{[[1, 1.5, "String1"], [-2, 2.3, "String2"]]}
So I want to have an Array of Arrays and the inner array is storing different types.
How should I store my variables so I can create such JSON in Scala?
I thought of List of Tuples. However, all the available JSON libraries try to convert a Tuple to a map instead of an Array. I am using json4s library.
Here is a custom serializer for those inner arrays using json4s:
import org.json4s._
class MyTupleSerializer extends CustomSerializer[(Int, Double, String)](format => ({
case obj: JArray =>
implicit val formats: Formats = format
(obj(0).extract[Int], obj(1).extract[Double], obj(2).extract[String])
}, {
case (i: Int, d: Double, s: String) =>
JArray(List(JInt(i), JDouble(d), JString(s)))
}))
The custom serialiser converts JArray into a tuple and back again. This will be used wherever the Scala object being read or written has a value of the appropriate tuple type.
To test this against the sample input I have modified it to make it valid JSON by adding a field name:
{"data": [[1, 1.5, "String1"], [-2, 2.3, "String2"]]}
I have defined a container class to match this:
case class MyTupleData(data: Vector[(Int, Double, String)])
The name of the class is not relevant but the field name data must match the JSON field name. This uses Vector rather than Array because Array is really a Java type rather than a Scala type. You can use List if preferred.
import org.json4s.jackson.Serialization.{read, write}
case class MyTupleData(data: Vector[(Int, Double, String)])
object JsonTest extends App {
val data = """{"data": [[1, 1.5, "String1"], [-2, 2.3, "String2"]]}"""
implicit val formats: Formats = DefaultFormats + new MyTupleSerializer
val td: MyTupleData = read[MyTupleData](data)
println(td) // MyTupleData(Vector((1,1.5,String1), (-2,2.3,String2)))
println(write(td)) // {"data":[[1,1.5,"String1"],[-2,2.3,"String2"]]}
}
If you prefer to use a custom class for the data rather than a tuple, the code looks like this:
case class MyClass(i: Int, d: Double, s: String)
class MyClassSerializer extends CustomSerializer[MyClass](format => ({
case obj: JArray =>
implicit val formats: Formats = format
MyClass(obj(0).extract[Int], obj(1).extract[Double], obj(2).extract[String])
}, {
case MyClass(i, d, s) =>
JArray(List(JInt(i), JDouble(d), JString(s)))
}))
Use a List of List rather than List of Tuples.
an easy way to convert list of tuples to list of list is:
val listOfList: List[List[Any]] = listOfTuples.map(_.productIterator.toList)
I would use jackson, which is a java library and can deal with arbitrary datatypes inside collections of type Any/AnyRef, rather than trying to come up with a custom serializer in one of scala json libraries.
To convert scala List to java List use
import collection.JavaConverters._
So, in summary the end list would be:
val javaListOfList: java.util.List[java.util.List[Any]] = listOfTuples.map(_.productIterator.toList.asJava).asJava
Using this solution, you could have arbitrary length tuples in your list and it would work.
import com.fasterxml.jackson.databind.ObjectMapper
import collection.JavaConverters._
object TuplesCollectionToJson extends App {
val tuplesList = List(
(10, false, 43.6, "Text1"),
(84, true, 92.1, "Text2", 'X')
)
val javaList = tuplesList.map(_.productIterator.toList.asJava).asJava
val mapper = new ObjectMapper()
val json = mapper.writeValueAsString(javaList)
println(json)
}
Would produce:
[[10,false,43.6,"Text1"],[84,true,92.1,"Text2","X"]]
PS: Use this solution only when you absolutely have to work with variable types. If your tuple datatype is fixed, its better to create a json4s specific serializer/deserializer.
I am trying to familiarize myself with the PlayJSON library. I have a JSON formatted file like this:
{
"Person": [
{
"name": "Jonathon",
"age": 24,
"job": "Accountant"
}
]
}
However, I'm having difficulty with parsing it properly due to the file having different types (name is a String but age is an Int). I could technically make it so the age is a String and call .toInt on it later but for my purposes, it is by default an integer.
I know how to parse some of it:
import play.api.libs.json.{JsValue, Json}
val parsed: JsValue = Json.parse(jsonFile) //assuming jsonFile is that entire JSON String as shown above
val person: List[Map[String, String]] = (parsed \ "Person").as[List[Map[String, String]]]
Creating that person value throws an error. I know Scala is a strongly-typed language but I'm sure there is something I am missing here. I feel like this is an obvious fix too but I'm not quite sure.
The error produced is:
JsResultException(errors:List(((0)/age,List(JsonValidationError(List(error.expected.jsstring),WrappedArray())))
The error you are having, as explained in the error you are getting, is in casting to the map of string to string. The data you provided does not align with it, because the age is a string. If you want to keep in with this approach, you need to parse it into a type that will handle both strings and ints. For example:
(parsed \ "Person").validate[List[Map[String, Any]]]
Having said that, as #Luis wrote in a comment, you can just use case class to parse it. Lets declare 2 case classes:
case class JsonParsingExample(Person: Seq[Person])
case class Person(name: String, age: Int, job: String)
Now we will create a formatter for each of them on their corresponding companion object:
object Person {
implicit val format: OFormat[Person] = Json.format[Person]
}
object JsonParsingExample {
implicit val format: OFormat[JsonParsingExample] = Json.format[JsonParsingExample]
}
Now we can just do:
Json.parse(jsonFile).validate[JsonParsingExample]
Code run at Scastie.
I have a text file with json value. and this gets read into a DF
{"name":"Michael"}
{"name":"Andy", "age":30}
I want to infer the schema dynamically for each line while Streaming and store it in separate locations(tables) depending on its schema.
unfortunately while I try to read the value.schema it still shows as String. Please help on how to do it on Streaming as RDD is not allowed in streaming.
I wanted to use the following code which doesnt work as the value is still read as String format.
val jsonSchema = newdf1.select("value").as[String].schema
val df1 = newdf1.select(from_json($"value", jsonSchema).alias("value_new"))
val df2 = df1.select("value_new.*")
I even tried to use,
schema_of_json("json_schema"))
val jsonSchema: String = newdf.select(schema_of_json(col("value".toString))).as[String].first()
still no hope.. Please help..
You can load the data as textFile, create case class for person and parse every json string to Person instance using json4s or gson, then creating the Dataframe as follows:
case class Person(name: String, age: Int)
val jsons = spark.read.textFile("/my/input")
val persons = jsons.map{json => toPerson(json) //instead of 'toPerson' actually parse with json4s or gson to return Person instance}
val df = sqlContext.createDataFrame(persons)
Deserialize json to case class using json4s:
https://commitlogs.com/2017/01/14/serialize-deserialize-json-with-json4s-in-scala/
Deserialize json to case class using gson:
https://alvinalexander.com/source-code/scala/scala-case-class-gson-json-object-deserialization-and-scalatra
Using jackson library I read json data from a file (each row of file is a JSON object) an parse it to a map object of String and Any. My goal is to save specified keys (id and text) to a collection.
val input = scala.io.Source.fromFile("data.json").getLines()
val mapper = new ObjectMapper() with DefaultScalaModule
val data_collection = mutable.HashMap.empty[Int, String]
for (i <- input){
val parsedJson = mapper.readValue[Map[String, Any]](i)
data_collection.put(
parsedJson.get("id"),
parsedJson.get("text")
)
But as the values in the parsedJson map have the Any type, getting some keys like id and text, it returns Some(value) not just the value with the appropriate type. I expect the values for the id key to be Integer and values for the text to be String.
Running the code I got the error:
Error:(31, 23) type mismatch;
found : Option[Any]
required: Int
parsedJson.get("id"),
Here is a sample of JSON data in the file:
{"text": "Hello How are you", "id": 1}
Is it possible in Scala to parse id values to Int and text values to String, or at least convert Some(value) to value with type Int or String?
If you want to get a plain value from a Map instead of a Option you can use the () (apply) method - However it will throw an exception if the key is not found.
Second, Scala type system is static not dynamic, if you have an Any that's it, it won't change to Int or String at runtime, and the compiler will fail - Nevertheless, you can cast them using the asInstanceOf[T] method, but again if type can't be casted to the target type it will throw an exception.
Please note that even if you can make your code work with the above tricks, that code wouldn't be what you would expect in Scala. There are ways to make the code more typesafe (like pattern matching), but parsing a Json to a typesafe object is an old problem, I'm sure jackson provides a way to parse a json into case class that represent your data. If not take a look to circe it does.
Try the below code :
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import
com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
val input = scala.io.Source.fromFile("data.json").getLines()
val mapper = new ObjectMapper() with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)
val obj = mapper.readValue[Map[String, Any]](input)
val data_collection = mutable.HashMap.empty[Int, String]
for (i <- c) {
data_collection.put(
obj.get("id").fold(0)(_.toString.toInt),
obj.get("text").fold("")(_.toString)
)
}
println(data_collection) // Map(1 -> Hello How are you)