Scala Json parsing - json

This is the input json which I am getting which is nested json structure and I don't want to map directly to class, need custom parsing of some the objects as I have made the case classes
{
"uuid": "b547e13e-b32d-11ec-b909-0242ac120002",
"log": {
"Response": {
"info": {
"receivedTime": "2022-02-09T00:30:00Z",
"isSecure": "Yes",
"Data": [{
"id": "75641",
"type": "vendor",
"sourceId": "3",
"size": 53
}],
"Group": [{
"isActive": "yes",
"metadata": {
"owner": "owner1",
"compressionType": "gz",
"comments": "someComment",
"createdDate": "2022-01-11T11:00:00Z",
"updatedDate": "2022-01-12T14:17:55Z"
},
"setId": "1"
},
{
"isActive": "yes",
"metadata": {
"owner": "owner12",
"compressionType": "snappy",
"comments": "someComment",
"createdDate": "2022-01-11T11:00:00Z",
"updatedDate": "2022-01-12T14:17:55Z"
},
"setId": "2"
},
{
"isActive": "yes",
"metadata": {
"owner": "owner123",
"compressionType": "snappy",
"comments": "someComment",
"createdDate": "2022-01-11T11:00:00Z",
"updatedDate": "2022-01-12T14:17:55Z"
},
"setId": "4"
},
{
"isActive": "yes",
"metadata": {
"owner": "owner124",
"compressionType": "snappy",
"comments": "someComments",
"createdDate": "2022-01-11T11:00:00Z",
"updatedDate": "2022-01-12T14:17:55Z"
},
"setId": "4"
}
]
}
}
}
}
Code that I am trying play json also tried circe . Please help ..New to scala world
below is object and case class
case class DataCatalog(uuid: String, data: Seq[Data], metadata: Seq[Metadata])
object DataCatalog {
case class Data(id: String,
type: String,
sourceId: Option[Int],
size: Int)
case class Metadata(isActive: String,
owner: String,
compressionType: String,
comments: String,
createdDate: String,
updatedDate: String
)
def convertJson(inputjsonLine: String): Option[DataCatalog] = {
val result = Try {
//val doc: Json = parse(line).getOrElse(Json.Null)
//val cursor: HCursor = doc.hcursor
//val uuid: Decoder.Result[String] = cursor.downField("uuid").as[String]
val lat = (inputjsonLine \ "uuid").get
DataCatalog(uuid, data, group)
}
//do pattern matching
result match {
case Success(dataCatalog) => Some(dataCatalog)
case Failure(exception) =>
}
}
}
Any parsing api is fine.

If you use Scala Play, for each case class you should have an companion object which will help you a lot with read/write object in/from json:
object Data {
import play.api.libs.json._
implicit val read = Json.reads[Data ]
implicit val write = Json.writes[Data ]
def tupled = (Data.apply _).tupled
}
object Metadata {
import play.api.libs.json._
implicit val read = Json.reads[Metadata ]
implicit val write = Json.writes[Metadata ]
def tupled = (Metadata.apply _).tupled
}
Is required as each companion object to be in same file as the case class. For your json example, you need more case classes because you have a lot of nested objects there (log, Response, info, each of it)
or, you can read the field which you're interested in as:
(jsonData \ "fieldName").as[CaseClassName]
You can try to access the Data value as:
(jsonData \ "log" \ "Response" \ "info" \ "Data").as[Data]
same for Metadata

Related

Deep remove specific empty json array in circe scala

I want to remove deep empty json array from my json before/during processing by circe.
Incoming JSON
{
"config": {
"newFiles": [{
"type": "audio",
"value": "welcome1.mp3"
}],
"oldFiles": [],
"channel": "BC"
}
}
or
{
"config": {
"newFiles": [],
"oldFiles": [{
"type": "audio",
"value": "welcome1.mp3"
}],
"channel": "BC"
}
}
Resulted Json should look like
{
"config": {
"newFiles": [{
"type": "audio",
"value": "welcome1.mp3"
}],
"channel": "BC"
}
}
or
{
"config": {
"oldFiles": [{
"type": "audio",
"value": "welcome1.mp3"
}],
"channel": "BC"
}
}
What i understand that this can be done before decoding config as well as during decoding config.
The idea here is i want to handle only one of files (either new or old) at my case class level.
Method 1: Tried at config decoding level which works well.
case class File(`type`: String, value: String)
case class Config(files: List[File],
channel: String = "BC")
object Config{
implicit final val FileDecoder: Decoder[File] = deriveDecoder[File]
implicit val ConfigDecoder: Decoder[Config] = (h:HCursor) =>
for {
oldFiles <- h.get[List[File]]("oldFiles")
files <- if (oldFiles.isEmpty) h.get[List[File]]("newFiles") else h.get[List[File]]("oldFiles")
channel <- h.downField("channel").as[String]
}yield{
Config(files, channel)
}
}
case class Inventory(config: Config)
object Inventory {
implicit val InventoryDecoder: Decoder[Inventory] = deriveDecoder[Inventory]
}
Method 2: Tried before feeding into decoding which didn't worked out
Let me know what could be the elegant approach to handle it.
PS: I have multiple similar config decoders and if i handle this at config decoding level then there will be a lot of boiler plate code.
I have simplified the problem a little bit again, but combining this with the previous answer should be simple.
I also took advantage of cats.data.NonEmptyList
final case class Config(files: NonEmptyList[String], channel: String = "BC")
object Config {
implicit final val ConfigDecoder: Decoder[Config] =
(
(Decoder[NonEmptyList[String]].at(field = "newFiles") or Decoder[NonEmptyList[String]].at(field = "oldFiles")),
Decoder[String].at(field = "channel")
).mapN(Config.apply).at(field = "config")
}
This can be used like this:
val data =
"""[
{
"config": {
"newFiles": ["Foo", "Bar", "Baz"],
"oldFiles": [],
"channel": "BC"
}
},
{
"config": {
"newFiles": [],
"oldFiles": ["Quax"],
"channel": "BC"
}
}
]"""
parser.decode[List[Config]](data)
// res: Either[io.circe.Error, List[Config]] =
// Right(List(
// Config(NonEmptyList("Foo", "Bar", "Baz"), "BC"),
// Config(NonEmptyList("Quax"), "BC")
// ))
Note: I am assuming that at least one of the two lists must be non-empty and give priority to the new one.
You can see the code running here.

How to update nested JSON array in scala

I'm a newbie to scala/play and I've been trying to update (add a new id to) the query array in the sections[1] of the JSON but I've had no success traversing the JSON as I have little knowledge of transformers and how to use it.
"definitions": [
{
"sections": [
{
"priority": 1,
"content": {
"title": "Driver",
"links": [
{
"url": "https://blabla.com",
"text": "See all"
}
]
},
"SearchQuery": {
"options": {
"aggregate": true,
"size": 20,
},
"query": "{\"id\":{\"include\":[\"0wxZ4Nr2\", \"0wxZbNr2\", \"6WZOPMw1\"}}"
}
},
{
"priority": 2,
"content": {
"title": "Deliver",
"links": [
{
"url": "https://blabla.com",
"text": "See all"
}
]
},
"SearchQuery": {
"options": {
"aggregate": true,
"size": 20,
},
"query": "{\"id\":{\"include\":[\"2W12Q2wq\", \"Nwq09lW3\", \"QweNN2d9\"]}}"
}
}
]
}
Any suggestions on how I can achieve this. My goal is to put values inside the specific fields of JSON array. I am using play JSON library throughout my application?
As you noticed if you use PlayJSON you can use Json Transformers
Updating a field would work like this:
val queryUpdater = (__ \ "definitions" \ 1 \ "SearchQuery" \ "query").json.update(
of[JsString].map {
case JsString(value) =>
val newValue: String = ... // calculate new value
JsString(newValue)
}
)
json.transform(queryUpdater)
If you needed to update all queries it would be more like:
val updateQuery = (__ \ "SearchQuery" \ "query").json.update(
of[JsString].map {
case JsString(value) =>
val newValue: String = ... // calculate new value
JsString(newValue)
}
)
val updateQueries = (__ \ "definitions").json.update(
of[JsArray].map {
case JsArray(arr) =>
JsArray(arr.map(_.transform(updateQuery)))
}
)
json.transform(updateQueries)

How to stream insert JSON array to BigQuery table in Apache Beam

My apache beam application receives a message in JSON array but insert each row to a BigQuery table. How can I support this usecase in ApacheBeam? Can I split each row and insert it to table one by one?
JSON message example:
[
{"id": 1, "name": "post1", "price": 10},
{"id": 2, "name": "post2", "price": 20},
{"id": 3, "name": "post3", "price": 30}
]
BigQuery table schema:
[
{
"mode": "REQUIRED",
"name": "id",
"type": "INT64"
},
{
"mode": "REQUIRED",
"name": "name",
"type": "STRING"
},
{
"mode": "REQUIRED",
"name": "price",
"type": "INT64"
}
]
Here is my solution. I converted JSON string to List once then c.output one by one. My code in in Scala but you can do the same thing in Java.
case class MyTranscationRecord(id: String, name: String, price: Int)
case class MyTranscation(recordList: List[MyTranscationRecord])
class ConvertJSONTextToMyRecord extends DoFn[KafkaRecord[java.lang.Long, String], MyTranscation]() {
private val logger: Logger = LoggerFactory.getLogger(classOf[ConvertJSONTextToMyRecord])
#ProcessElement
def processElement(c: ProcessContext): Unit = {
try {
val mapper: ObjectMapper = new ObjectMapper()
.registerModule(DefaultScalaModule)
val messageText = c.element.getKV.getValue
val transaction: MyRecord = mapper.readValue(messageText, classOf[MyTranscation])
logger.info(s"successfully converted to an EPC transaction = $transaction")
for (record <- transaction.recordList) {
c.output(record)
}
} catch {
case e: Exception =>
val message = e.getLocalizedMessage + e.getStackTrace
logger.error(message)
}
}
}

How to parse just part of JSON with Klaxon?

I'm trying to parse some JSON to kotlin objects. The JSON looks like:
{
data: [
{ "name": "aaa", "age": 11 },
{ "name": "bbb", "age": 22 },
],
otherdata : "don't need"
}
I just need to data part of the entire JSON, and parse each item to a User object:
data class User(name:String, age:Int)
But I can't find an easy way to do it.
Here's one way you can achieve this
import com.beust.klaxon.Klaxon
import java.io.StringReader
val json = """
{
"data": [
{ "name": "aaa", "age": 11 },
{ "name": "bbb", "age": 22 },
],
"otherdata" : "not needed"
}
""".trimIndent()
data class User(val name: String, val age: Int)
fun main(args: Array<String>) {
val klaxon = Klaxon()
val parsed = klaxon.parseJsonObject(StringReader(json))
val dataArray = parsed.array<Any>("data")
val users = dataArray?.let { klaxon.parseFromJsonArray<User>(it) }
println(users)
}
This will work as long as you can fit the whole json string in memory. Otherwise you may want to look into the streaming API: https://github.com/cbeust/klaxon#streaming-api

how to parse a complicated json in Scala

I am a real newbie in Scala (Using Play framework) and trying to figure out what would be the best way to parse a complicated JSON.
This is an example I have:
{
"id":"test1",
"tmax":270,
"at":2,
"bcat":[
"IAB26",
"IAB25",
"IAB24"
],
"imp":[
{
"id":"1",
"banner":{
"w":320,
"h":480,
"battr":[
10
],
"api":[
]
},
"bidfloor":0.69
}
],
"app":{
"id":"1234",
"name":"Video Games",
"bundle":"www.testapp.com",
"cat":[
"IAB1"
],
"publisher":{
"id":"1111"
}
},
"device":{
"dnt":0,
"connectiontype":2,
"carrier":"Ellijay Telephone Company",
"dpidsha1":"e5f61ae0597d8abee94860d66f7d512aa68d0985",
"dpidmd5":"c1827fe90bae819017dfdb30db7e84fa",
"didmd5":"1f6cb9dc519db8e48cf9592b31cce04e",
"didsha1":"2e6a5d7f5fd1b2b5dea56a80f2b9dc24902a0ca7",
"ifa":"422aeb3a-6507-40b5-9e6f-42a0e14b51be",
"osv":"4.4",
"os":"Android",
"ua":"Mozilla/5.0 (Linux; Android 4.4.2; DL1010Q Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Safari/537.36",
"ip":"24.75.160.74",
"devicetype":1,
"geo":{
"type":1,
"country":"USA",
"city":"Jasper"
}
},
"user":{
"id":"9c2be7c2a3dfe19070f193910e92b2e0"
},
"cur":[
"USD"
]
}
Thank you very much!
I would go with case classes as described in http://json4s.org/#extracting-values e.g.
scala> import org.json4s._
scala> import org.json4s.jackson.JsonMethods._
scala> implicit val formats = DefaultFormats // Brings in default date formats etc.
scala> case class Child(name: String, age: Int, birthdate: Option[java.util.Date])
scala> case class Address(street: String, city: String)
scala> case class Person(name: String, address: Address, children: List[Child])
scala> val json = parse("""
{ "name": "joe",
"address": {
"street": "Bulevard",
"city": "Helsinki"
},
"children": [
{
"name": "Mary",
"age": 5,
"birthdate": "2004-09-04T18:06:22Z"
},
{
"name": "Mazy",
"age": 3
}
]
}
""")
scala> json.extract[Person]
res0: Person = Person(joe,Address(Bulevard,Helsinki),List(Child(Mary,5,Some(Sat Sep 04 18:06:22 EEST 2004)), Child(Mazy,3,None)))
Use implicit to automatically fetch json value with the same name.
if you have a string like
val jsonString =
{ "company": {
"name": "google",
"department": [
{
"name": "IT",
"members": [
{
"firstName": "Max",
"lastName": "Joe",
"age" : 25
},
{
"firstName": "Jean",
"lastName": "Nick",
"age" : 55,
"salary": 100,000
}
]
},
{
"name": "Marketing"
"members": [
{
"firstName": "Mike",
"lastName": "Lucas",
"age" : 43
}
]
},
]
}
}
Then we should do as following to parse it into scala object
val json: JsValue = Json.parse(jsonString) // string to JsValue
case class Person(firstName: String, lastName: String, age: Int, salary: Option[Int]) // keep the lower level object above since you will use them below
implicit personR = Json.reads[Person]
// implicit will read the json attribute if you use the same name as it is in json String. For example "name" : "google" will be read if you have Company(name: String). See the "name" and name:String .
case class Department(name: String, members: Seq[Person])
implicit departmentR = Json.reads[Department]
case class Company(name: String, department: Seq[Department])
implicit val companyR = Json.reads[Company]
val result = json.validate[Company] match {
case s: JsSuccess[Company] => s.get // s.get will return a Company object
case e: JsError => println("error")
}
Don't know if it can be used standalone, but Play Framework has great tools to work with JSON. Take a look here