How to find the difference/mismatch between two JSON files?

I have two JSON files: one is the expected JSON and the other is the result of a GET API call. I need to compare them and find the mismatches.
Expected JSON:
{
"array": [
1,
2,
3
],
"boolean": true,
"null": null,
"number": 123,
"object": {
"a": "b",
"c": "d",
"e": "f"
},
"string": "Hello World"
}
Actual Json response:
{
"array": [
1,
2,
3
],
"boolean": true,
"null": null,
"number": 456,
"object": {
"a": "b",
"c": "d",
"e": "f"
},
"string": "India"
}
There are two mismatches: the number received is 456 and the string is India.
Is there a way to compare the two documents and get these two mismatches as the result?
This needs to be implemented in Gatling/Scala.

You can use, for example, the play-json library and recursively traverse both JSON documents. For the following input (slightly more involved than yours):
LEFT:
{
"array" : [ 1, 2, 4 ],
"boolean" : true,
"null" : null,
"number" : 123,
"object" : {
"a" : "b",
"c" : "d",
"e" : "f"
},
"string" : "Hello World",
"absent-in-right" : true,
"different-types" : 123
}
RIGHT:
{
"array" : [ 1, 2, 3 ],
"boolean" : true,
"null" : null,
"number" : 456,
"object" : {
"a" : "b",
"c" : "d",
"e" : "ff"
},
"string" : "India",
"absent-in-left" : true,
"different-types" : "YES"
}
It produces this output:
Next fields are absent in LEFT:
*\absent-in-left
Next fields are absent in RIGHT:
*\absent-in-right
'*\array\(2)' => 4 != 3
'*\number' => 123 != 456
'*\object\e' => f != ff
'*\string' => Hello World != India
Cannot compare JsNumber and JsString in '*\different-types'
Code:
import play.api.libs.json._
val left = Json.parse("""{"array":[1,2,4],"boolean":true,"null":null,"number":123,"object":{"a":"b","c":"d","e":"f"},"string":"Hello World","absent-in-right":true,"different-types":123}""").asInstanceOf[JsObject]
val right = Json.parse("""{"array":[1,2,3],"boolean":true,"null":null,"number":456,"object":{"a":"b","c":"d","e":"ff"},"string":"India","absent-in-left":true,"different-types":"YES"}""").asInstanceOf[JsObject]
// '*' - for the root node
showJsDiff(left, right, "*", Seq.empty[String])
def showJsDiff(left: JsValue, right: JsValue, parent: String, path: Seq[String]): Unit = {
val newPath = path :+ parent
if (left.getClass != right.getClass) {
println(s"Cannot compare ${left.getClass.getSimpleName} and ${right.getClass.getSimpleName} " +
s"in '${getPath(newPath)}'")
}
else {
left match {
// Primitive types are pretty easy to handle
case JsNull => logIfNotEqual(JsNull, right.asInstanceOf[JsNull.type], newPath)
case JsBoolean(value) => logIfNotEqual(value, right.asInstanceOf[JsBoolean].value, newPath)
case JsNumber(value) => logIfNotEqual(value, right.asInstanceOf[JsNumber].value, newPath)
case JsString(value) => logIfNotEqual(value, right.asInstanceOf[JsString].value, newPath)
case JsArray(value) =>
// For array we have to call showJsDiff on each element of array
val arr1 = value
val arr2 = right.asInstanceOf[JsArray].value
if (arr1.length != arr2.length) {
println(s"Arrays in '${getPath(newPath)}' have different length. ${arr1.length} != ${arr2.length}")
}
else {
arr1.indices.foreach { idx =>
showJsDiff(arr1(idx), arr2(idx), s"($idx)", newPath)
}
}
case JsObject(value) =>
val leftFields = value.keys.toSeq
val rightJsObject = right.asInstanceOf[JsObject]
val rightFields = rightJsObject.fields.map { case (name, value) => name }
val absentInLeft = rightFields.diff(leftFields)
if (absentInLeft.nonEmpty) {
println("Next fields are absent in LEFT: ")
absentInLeft.foreach { fieldName =>
println(s"\t ${getPath(newPath :+ fieldName)}")
}
}
val absentInRight = leftFields.diff(rightFields)
if (absentInRight.nonEmpty) {
println("Next fields are absent in RIGHT: ")
absentInRight.foreach { fieldName =>
println(s"\t ${getPath(newPath :+ fieldName)}")
}
}
// For common fields we have to call showJsDiff on them
val commonFields = leftFields.intersect(rightFields)
commonFields.foreach { field =>
showJsDiff(value(field), rightJsObject(field), field, newPath)
}
}
}
}
def logIfNotEqual[T](left: T, right: T, path: Seq[String]): Unit = {
if (left != right) {
println(s"'${getPath(path)}' => $left != $right")
}
}
def getPath(path: Seq[String]): String = path.mkString("\\")

Use diffson, a Scala implementation of RFC 6901 (JSON Pointer) and RFC 6902 (JSON Patch): https://github.com/gnieh/diffson

json4s has a handy diff function described here: https://github.com/json4s/json4s (search for Merging & Diffing) and API doc here: https://static.javadoc.io/org.json4s/json4s-core_2.9.1/3.0.0/org/json4s/Diff.html
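As a rough sketch of the json4s route (assuming json4s-jackson is on the classpath; the object name `Json4sDiffExample` and the trimmed-down inputs are only for illustration), `diff` returns a `Diff` with three parts: changed, added and deleted. For data shaped like the question's, both mismatches should end up in `changed`, which as far as I can tell carries the values from the right-hand document.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

// Illustrative wrapper object; the name is not part of json4s.
object Json4sDiffExample {
  // Trimmed-down versions of the expected and actual documents from the question.
  val expected: JValue = parse("""{"boolean": true, "number": 123, "string": "Hello World"}""")
  val actual: JValue   = parse("""{"boolean": true, "number": 456, "string": "India"}""")

  // changed: fields present on both sides with different values
  // added:   fields present only in `actual`
  // deleted: fields present only in `expected`
  val Diff(changed, added, deleted) = expected diff actual

  def main(args: Array[String]): Unit = {
    println(compact(render(changed)))
    println(s"added = $added, deleted = $deleted")
  }
}
```

Unlike the hand-written traversal above, this gives you the differing fields as JSON rather than as paths, so it fits best when you only need to know which fields differ.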

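This is a slightly modified version of Artavazd's answer (which is amazing, by the way; thank you so much!). This version collects the differences into a convenient object instead of only logging them. Note that type mismatches and arrays of different length are still only logged and are not part of the returned object.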
import play.api.Logger
import play.api.libs.json.{JsArray, JsBoolean, JsError, JsNull, JsNumber, JsObject, JsString, JsSuccess, JsValue, Json, OFormat, Reads}
case class JsDifferences(
differences: List[JsDifference] = List()
)
object JsDifferences {
implicit val format: OFormat[JsDifferences] = Json.format[JsDifferences]
}
case class JsDifference(
key: String,
path: Seq[String],
oldValue: Option[String],
newValue: Option[String]
)
object JsDifference {
implicit val format: OFormat[JsDifference] = Json.format[JsDifference]
}
object JsonUtils {
val logger: Logger = Logger(this.getClass)
def findDiff(left: JsValue, right: JsValue, parent: String = "*", path: List[String] = List()): JsDifferences = {
val newPath = path :+ parent
if (left.getClass != right.getClass) {
logger.debug(s"Cannot compare ${left.getClass.getSimpleName} and ${right.getClass.getSimpleName} in '${getPath(newPath)}'")
JsDifferences()
} else left match {
case JsNull => logIfNotEqual(JsNull, right.asInstanceOf[JsNull.type], newPath)
case JsBoolean(value) => logIfNotEqual(value, right.asInstanceOf[JsBoolean].value, newPath)
case JsNumber(value) => logIfNotEqual(value, right.asInstanceOf[JsNumber].value, newPath)
case JsString(value) => logIfNotEqual(value, right.asInstanceOf[JsString].value, newPath)
case JsArray(value) =>
val arr1 = value
val arr2 = right.asInstanceOf[JsArray].value
if (arr1.length != arr2.length) {
logger.debug(s"Arrays in '${getPath(newPath)}' have different length. ${arr1.length} != ${arr2.length}")
JsDifferences()
} else JsDifferences(arr1.indices.flatMap(idx => findDiff(arr1(idx), arr2(idx), s"($idx)", newPath).differences).toList)
case leftJsObject: JsObject => {
val leftFields = leftJsObject.keys.toSeq
val rightJsObject = right.asInstanceOf[JsObject]
val rightFields = rightJsObject.fields.map { case (name, value) => name }
val keysAbsentInLeft = rightFields.diff(leftFields)
val leftDifferences = keysAbsentInLeft.map(fieldName => JsDifference(
key = fieldName, path = newPath :+ fieldName, oldValue = None, newValue = Some(rightJsObject(fieldName).toString)
))
val keysAbsentInRight = leftFields.diff(rightFields)
val rightDifferences = keysAbsentInRight.map(fieldName => JsDifference(
key = fieldName, path = newPath :+ fieldName, oldValue = Some(leftJsObject(fieldName).toString), newValue = None
))
val commonKeys = leftFields.intersect(rightFields)
val commonDifferences = commonKeys.flatMap(field => findDiff(leftJsObject(field), rightJsObject(field), field, newPath).differences).toList
JsDifferences((leftDifferences ++ rightDifferences ++ commonDifferences).toList)
}
}
}
def logIfNotEqual[T](left: T, right: T, path: Seq[String]): JsDifferences = {
if (left != right) {
JsDifferences(List(JsDifference(
key = path.last, path = path, oldValue = Some(left.toString), newValue = Some(right.toString)
)))
} else JsDifferences()
}
def getPath(path: Seq[String]): String = path.mkString("\\")
}

Related

How to get specific values from 2 CSV in groovy

Please help with parsing CSV to JSON from two files in Groovy.
I have a first CSV like this (the number of lines may differ each time):
testKey,status
Name001,PASS
Name002,PASS
Name003,FAIL
CSV2 (a list of all test keys, but with different column names):
Kt,Pd
PT-01,Name007
PT-02,Name001
PT-03,Name003
PT-05,Name002
PT-06,Name004
PT-07,Name006
In the result, each testKey from CSV1 should be replaced by the Kt value from CSV2 whose Pd matches it exactly.
Something like this:
{
"testExecutionKey": "DEMO-303",
"info": {
"user": "admin"
},
"tests": [
{
"testKey": "PT-02",
"status": "PASS"
},
{
"testKey": "PT-05",
"status": "PASS"
},
{
"testKey": "PT-03",
"status": "FAIL"
}
]
}
This code only pairs the rows by position; it does not match on the exact testKey:
File csv1 = new File( 'one.csv')
File csv2 = new File( 'two.csv')
def lines1 = csv1.readLines()
def lines2 = csv2.readLines()
assert lines1.size() <= lines2.size()
fieldSep = /,[ ]*/
def fieldNames1 = lines1[0].split( fieldSep )
def fieldNames2 = lines1[0].split( fieldSep )
def testList = []
lines1[1..-1].eachWithIndex { csv1Line, lineNo ->
def mappedLine = [:]
def fieldsCsv1 = csv1Line.split( fieldSep )
fieldsCsv1[1..-1].eachWithIndex { value, fldNo ->
String name = fieldNames1[ fldNo + 1 ]
mappedLine[ name ] = value
}
def fieldsCsv2 = lines2[lineNo + 1].split( fieldSep )
fieldsCsv2[0..-2].eachWithIndex { value, fldNo ->
String name = fieldNames2[ fldNo ]
mappedLine[ name ] = value
}
testList << mappedLine
}
def builder = new JsonBuilder()
def root = builder {
testExecutionKey 'DEMO-303'
info user: 'admin'
tests testList
}
println builder.toPrettyString()

You need to bind CSV2 to a Map, and then use it to replace values from CSV1, like so:
import groovy.json.*
def csv1 = '''
testKey,status
Name001,PASS
Name002,PASS
Name003,FAIL
Name999,FAIL
'''.trim()
def csv2 = '''
Kt,Pd
PT-01,Name007
PT-02,Name001
PT-03,Name003
PT-05,Name002
PT-06,Name004
PT-07,Name006
'''.trim()
boolean skip1st = false
def testMap2 = [:]
//parse and bind 2nd CSV to Map (Pd -> Kt)
csv2.splitEachLine( /\s*,\s*/ ){
skip1st ? ( testMap2[ it[ 1 ] ] = it[ 0 ] ) : ( skip1st = true )
}
def keys
def testList = []
csv1.splitEachLine( /\s*,\s*/ ){ parts ->
if( !keys )
keys = parts*.trim()
else{
def test = [:]
parts.eachWithIndex{ val, ix -> test[ keys[ ix ] ] = val }
//check if testKey present in csv2
if( testMap2[ test.testKey ] ){
test.testKey = testMap2[ test.testKey ] // replace values from CSV2
testList << test
}
}
}
def builder = new JsonBuilder()
def root = builder {
testExecutionKey 'DEMO-303'
info user: 'admin'
tests testList
}
builder.toPrettyString()
gives:
{
"testExecutionKey": "DEMO-303",
"info": {
"user": "admin"
},
"tests": [
{
"testKey": "PT-02",
"status": "PASS"
},
{
"testKey": "PT-05",
"status": "PASS"
},
{
"testKey": "PT-03",
"status": "FAIL"
}
]
}
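For readers coming from the Scala threads on this page, the same "bind CSV2 to a Map, then look up each testKey" idea can be sketched in plain Scala. This is only an illustration of the lookup step under the same sample data; `CsvKeyJoin`, `rows`, `ktByPd` and `tests` are made-up names, and the JSON rendering is left out.

```scala
// Illustrative sketch; none of these names come from the Groovy answer.
object CsvKeyJoin {
  val csv1: String =
    """testKey,status
      |Name001,PASS
      |Name002,PASS
      |Name003,FAIL
      |Name999,FAIL""".stripMargin

  val csv2: String =
    """Kt,Pd
      |PT-01,Name007
      |PT-02,Name001
      |PT-03,Name003
      |PT-05,Name002
      |PT-06,Name004
      |PT-07,Name006""".stripMargin

  // Drop the header line and split every remaining line on commas.
  def rows(csv: String): List[Array[String]] =
    csv.linesIterator.drop(1).map(_.split("\\s*,\\s*")).toList

  // CSV2 as a lookup table: Pd (the name used in CSV1) -> Kt.
  val ktByPd: Map[String, String] = rows(csv2).map(r => r(1) -> r(0)).toMap

  // Keep only CSV1 rows whose testKey exists in CSV2, swapping in the Kt value.
  val tests: List[(String, String)] =
    rows(csv1).flatMap(r => ktByPd.get(r(0)).map(kt => kt -> r(1)))
}
```

Because the lookup goes through a Map, the order and length of CSV2 no longer matter, and rows such as Name999 that have no counterpart are simply dropped.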

Explode Deeply Nested JSON returning duplicates in Spark Scala

I have a utility that works fine for parsing simple JSON, but it cross joins when the JSON contains multiple array[struct] fields.
I have also tried distinct() and dropDuplicates() to remove the duplicates caused by the cross join in the code, but that returns an empty DataFrame.
def flattenDataFrame(df: DataFrame): DataFrame = {
var flattenedDf: DataFrame = df
if (isNested(df)) {
val flattenedSchema: Array[(Column, Boolean)] = flattenSchema(df.schema)
var simpleColumns: List[Column] = List.empty[Column]
var complexColumns: List[Column] = List.empty[Column]
flattenedSchema.foreach {
case (col, isComplex) => {
if (isComplex) {
complexColumns = complexColumns :+ col
} else {
simpleColumns = simpleColumns :+ col
}
}
}
var crossJoinedDataFrame = df.select(simpleColumns: _*)
complexColumns.foreach(col => {
crossJoinedDataFrame = crossJoinedDataFrame.crossJoin(df.select(col))
crossJoinedDataFrame = flattenDataFrame(crossJoinedDataFrame)
})
crossJoinedDataFrame
} else {
flattenedDf
}
}
private def flattenSchema(schema: StructType, prefix: String = null): Array[(Column, Boolean)] = {
schema.fields.flatMap(field => {
val columnName = if (prefix == null) field.name else prefix + "." + field.name
field.dataType match {
case arrayType: ArrayType => {
val cols: Array[(Column, Boolean)] = Array[(Column, Boolean)](((explode_outer(col(columnName)).as(columnName.replace(".", "_"))), true))
cols
}
case structType: StructType => {
flattenSchema(structType, columnName)
}
case _ => {
val columnNameWithUnderscores = columnName.replace(".", "_")
val metadata = new MetadataBuilder().putString("encoding", "ZSTD").build()
Array(((col(columnName).as(columnNameWithUnderscores, metadata)), false))
}
}
}).filter(field => field != None)
}
def isNested(df: DataFrame): Boolean = {
df.schema.fields.flatMap(field => {
field.dataType match {
case arrayType: ArrayType => {
Array(true)
}
case mapType: MapType => {
Array(true)
}
case structType: StructType => {
Array(true)
}
case _ => {
Array(false)
}
}
}).exists(b => b)
}
A sample JSON in which I am facing the issue:
[
{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5007", "type": "Powdered Sugar" },
{ "id": "5006", "type": "Chocolate with Sprinkles" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
},
{
"id": "0002",
"type": "donut",
"name": "Raised",
"ppu": 0.55,
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" }
]
},
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
}
]

A solution without any join, and in particular without the cross join that causes your problem:
Sorry for the formatting; I can't get it to format well for Stack Overflow.
def flattenDataFrame(df: DataFrame): DataFrame = {
val flattenedDf: DataFrame = df
if (isNested(df)) {
val flattenedSchema: Array[(Column, Boolean)] = flattenSchema(flattenedDf.schema)
var simpleColumns: List[Column] = List.empty[Column]
var complexColumns: List[Column] = List.empty[Column]
flattenedSchema.foreach {
case (col, isComplex) =>
if (isComplex) {
complexColumns = complexColumns :+ col
} else {
simpleColumns = simpleColumns :+ col
}
}
val complexUnderlyingCols = complexColumns.map { column =>
val name = column.expr.asInstanceOf[UnresolvedAttribute].name
val unquotedColName = s"${name.replaceAll("`","")}"
val explodeSelectColName = s"`${name.replaceAll("`","")}`"
(unquotedColName, col(name).as(unquotedColName), explode_outer(col(explodeSelectColName)).as(unquotedColName))
}
var joinDataFrame = flattenedDf.select(simpleColumns ++ complexUnderlyingCols.map(_._2): _*)
complexUnderlyingCols.foreach { case (name, tempCol, column) =>
val nonTransformedColumns = joinDataFrame.schema.fieldNames.diff(List(name)).map(fieldName => s"`${fieldName.replaceAll("`", "")}`").map(col)
joinDataFrame = joinDataFrame.select(nonTransformedColumns :+ column :_*)
}
flattenDataFrame(joinDataFrame)
} else {
flattenedDf
}
}
private def flattenSchema(schema: StructType, prefix: String = null, level: Int = 0): Array[(Column, Boolean)] = {
val unquotedPrefix = if (prefix != null) prefix.replace("`", "") else null
println(level)
schema.fields.flatMap(field => {
val fieldName = field.name
val columnName = if (level == 0) {
s"$fieldName"
} else {
val fullName = s"$unquotedPrefix.$fieldName"
val x = fullName.split('.').reverse.zipWithIndex.reverse.foldLeft(new StringBuilder("")){ case (builder, (fieldPart, index)) =>
if(index > level) {
builder.append(s".$fieldPart")
} else if (index == level) {
builder.append(s".$fieldPart")
} else {
builder.append(s".$fieldPart")
}
}
x.replace(1,2,"").toString()
}
val unquotedColumnName = columnName.replace("`", "")
field.dataType match {
case _: ArrayType =>
val cols: Array[(Column, Boolean)] = Array[(Column, Boolean)]((col(columnName), true)) // We pass only the column as we'll generate explode function while expanding the DF
cols
case structType: StructType =>
flattenSchema(structType, columnName, level + 1)
case _ =>
val metadata = new MetadataBuilder().putString("encoding", "ZSTD").build()
Array((col(columnName).as(unquotedColumnName, metadata), false))
}
})
}
def isNested(df: DataFrame): Boolean = {
df.schema.fields.flatMap(field => {
field.dataType match {
case _: ArrayType =>
Array(x = true)
case _: MapType =>
Array(x = true)
case _: StructType =>
Array(x = true)
case _ =>
Array(x = false)
}
}).exists(b => b)
}

Update all values from any Json

I've got some json and I have to encrypt all the values. Below is a json, all the values of which should be updated:
val json = Json.parse("""{
"key1" : 1.5,
"key2" : [
{"key211": 1, "key212": "value212"},
{"key221": 2, "key222": "value222"}
],
"key3" : {
"key31" : true,
"key32" : "value32"
},
"key4" : 17
}""")
After encrypting and updating all the values, it should look like this:
val json = Json.parse("""{
"key1" : "uhKhbofQtL",
"key2" : [
{"key211": "FxnbGGZFMW", "key212": "VsdfdGfg"},
{"key221": "sdffFdd", "key222": "Fsdfsfds"}
],
"key3" : {
"key31" : "Fsdfasdf",
"key32" : "Vsdfsdfsdfs"
},
"key4" : "sfsdfFSdfs"
}""")
How can I do it?

Parse it as a Map, then traverse the map and encrypt:
def encrypt(data: Any, enc: Any => String): Any = data match {
case v: Map[String, Any] => v.map { case (k,v) => k -> encrypt(v, enc) }
case v: List[Any] => v.map(encrypt(_, enc))
case v => enc(v)
}
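To see what that traversal does, here is a small self-contained run on plain Scala collections. It is only a sketch: `EncryptValuesExample` and `fakeEnc` are made-up names, and `fakeEnc` is a placeholder that reverses the printed value, not real encryption. The patterns use wildcards because the type parameters of `Map` and `List` are erased at runtime.

```scala
// Illustrative sketch; swap fakeEnc for a real cipher in practice.
object EncryptValuesExample {
  // Recurse into maps and lists, apply `enc` to every leaf value.
  def encrypt(data: Any, enc: Any => String): Any = data match {
    case m: Map[_, _] => m.iterator.map { case (k, v) => k -> encrypt(v, enc) }.toMap
    case l: List[_]   => l.map(encrypt(_, enc))
    case v            => enc(v)
  }

  // Hypothetical stand-in for an encryption function.
  val fakeEnc: Any => String = v => v.toString.reverse

  val input: Map[String, Any] = Map(
    "key1" -> 1.5,
    "key2" -> List(Map("key211" -> 1, "key212" -> "value212")),
    "key3" -> Map("key31" -> true, "key32" -> "value32")
  )

  // Same shape as `input`, but every leaf is now a String.
  val result: Any = encrypt(input, fakeEnc)
}
```

Keys are left untouched and only leaves are passed to `enc`, which matches the requirement that all values (numbers and booleans included) end up as encrypted strings.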

I've solved this problem; my solution is below:
implicit val readsMap: Reads[Map[String, Any]] = Reads[Map[String, Any]](m => Reads.mapReads[Any](metaValueReader).reads(m))
implicit val writesMap: Writes[Map[String, Any]] = Writes[Map[String, Any]](m => Writes.mapWrites[Any](metaValueWriter).writes(m))
def metaValueReader(jsValue: JsValue): JsResult[Any] = jsValue match {
case JsObject(m) => JsSuccess(m.map { case (k, v) => k -> metaValueReader(v) })
case JsArray(arr) => JsSuccess(arr.map(metaValueReader))
case JsBoolean(b) => JsSuccess(b).map(encryptValue)
case JsNumber(n) => JsSuccess(n).map(encryptValue)
case JsString(s) => JsSuccess(s).map(encryptValue)
case JsNull => JsSuccess("").map(encryptValue)
case badValue => JsError(s"$badValue is not a valid value")
}
def metaValueWriter(value: Any): JsValue = value match {
case jsRes: JsSuccess[Any] => metaValueWriter(jsRes.get)
case m: Map[String, Any] => JsObject(m.map { case (k, v) => k -> metaValueWriter(v) })
case arr: Seq[Any] => JsArray(arr.map(metaValueWriter))
case s: String => JsString(s)
}
How can I improve this code?

Scala: Parsing JSON using the com.fasterxml.jackson library

I have written the following program to parse a JSON structure in a streaming fashion.
However, it looks very imperative. This is my latest attempt to write more idiomatic Scala code, but I am not there yet.
I am parsing the following JSON, using the Scala code that follows the JSON snippet. My goal is to shorten the code through the use of more idiomatic scala structures.
Thanks in advance.
{
"type": "ImportantIncidentInfo",
"incidentTimestamp": "2014-05-15T10:09:27.989-05:00",
"numOfMatches": 4,
"myReport": {
"docReports": {
"part1/.": {
"path": [
"unknown"
],
"myAnalysis": {
"matches": [
{
"id": {
"major": 1,
"minor": 0
},
"name": "US SSN",
"position": 13,
"string": " 636-12-4567 "
},
{
"id": {
"major": 3,
"minor": 0
},
"name": "MasterCard Credit Card Number",
"position": 35,
"string": " 5424-1813-6924-3685 "
}
]
},
"cleanedUpData": [
{
"startPosition": 0,
"endPosition": 65,
"frameContent": ""
}
],
"minedMetadata": {
"Content-Encoding": "ISO-8859-1",
"Content-Type": "text/html; charset=iso-8859-1"
},
"deducedMetadata": {
"Content-Type": "text/html; iso-8859-1"
}
},
"part2/.": {
"path": [
"unknown"
],
"myAnalysis": {
"matches": [
{
"id": {
"major": 1,
"minor": 0
},
"name": "SSN",
"position": 3,
"string": " 636-12-4567\r"
},
{
"id": {
"major": 3,
"minor": 0
},
"name": "MasterCard Credit Card Number",
"position": 18,
"string": "\n5424-1813-6924-3685\r"
}
]
},
"cleanedUpData": [
{
"startPosition": 0,
"endPosition": 44,
"frameContent": ""
}
],
"minedMetadata": {
"Content-Encoding": "windows-1252",
"Content-Type": "text/plain; charset=windows-1252"
},
"deducedMetadata": {
"Content-Type": "text/plain; iso-8859-1"
}
}
}
},
"whatSetItOffEntry": {
"action": "Log",
"component": {
"type": "aComponent",
"components": [
{
"type": "PatternComponent",
"patterns": [
1
],
"not": false
}
],
"not": false
},
"ticketInfo": {
"createIncident": true,
"tags": [],
"seeRestrictedIds": [
{
"type": "userGroup",
"name": "SiteMasters",
"description": "Group for SiteMasters",
"masters": [
"04fb02a2bc0fba"
],
"members": [],
"id": "04fade"
}
]
},
"letmeknowInfo": {
"createNotification": true,
"contactNames": [
"someguy#gmail.com"
]
}
},
"seeRestrictedIds": [
"04fade66c0"
],
"status": "New",
"timeStamps": [
"2015-05-15T10:09:27.989-05:00"
],
"count": 1
}
package mypackage
import java.io.BufferedReader
import java.io.FileReader
import java.io.IOException
import java.io.InputStream
import java.util._
import com.fasterxml.jackson.core._
import com.fasterxml.jackson.databind._
import java.util.Properties
import JacksonStreaming._
object JacksonStreaming {
def main(args: Array[String]) {
println("Entered Main")
try {
new JacksonStreaming().getNames
} catch {
case e: Exception => e.printStackTrace()
}
}
}
class JacksonStreaming {
var jsonMapper: ObjectMapper = new ObjectMapper()
var jsonFactory: JsonFactory = new JsonFactory()
var prop: Properties = new Properties()
var filePath: String = ""
val path = Array("myReport", "docReports", "part1/.", "myAnalysis", "matches", "name")
def getNames() {
println("Entered getNames")
var rootNode: JsonNode = null
try {
val fileReader = new BufferedReader(new FileReader("C:/jsonFormattedModified.json"))
println("fileReader is: " + fileReader)
rootNode = jsonMapper.readTree(fileReader)
println("Return value of jsonMapper.readTree is: " + rootNode)
findByPath(rootNode)
val jsonParser = jsonFactory.createParser(new FileReader("C:/jsonFormattedModified.json"))
println("JsonParser is: " + jsonParser)
var pathIndex = 0
val names = new ArrayList[String]()
var breakOnClose = false
while (jsonParser.nextToken() != null) {
val fieldName = jsonParser.getCurrentName
if (fieldName == null) {
//continue
}
if (breakOnClose && fieldName == path(path.length - 2)) {
println("Stopping search at end of node " + fieldName)
//break
}
if (jsonParser.getCurrentToken != JsonToken.FIELD_NAME) {
//continue
}
if (pathIndex >= path.length - 1) {
if (fieldName == path(path.length - 1)) {
try {
jsonParser.nextToken()
} catch {
case e: IOException => e.printStackTrace()
}
var name: String = null
name = jsonParser.getValueAsString
if (name == null) {
throw new RuntimeException("No value exists for field " + fieldName)
}
names.add(name)
println("Found " + fieldName + " value: " + name)
}
} else if (fieldName == path(pathIndex)) {
println("Found node " + path(pathIndex))
pathIndex += 1
if (pathIndex >= path.length - 1) {
println("Looking for names ...")
breakOnClose = true
try {
jsonParser.nextFieldName()
} catch {
case e: IOException => e.printStackTrace()
}
}
}
}
} catch {
case e: IOException => e.printStackTrace()
}
}
def findByPath(jn: JsonNode) {
println("Entered findByPath")
var matchesNamesNode = jn
for (i <- 0 until path.length - 1) {
matchesNamesNode = matchesNamesNode.path(path(i))
}
if (matchesNamesNode.isMissingNode) {
throw new RuntimeException("No node with names found.")
}
println("Tree names: " + matchesNamesNode.findValuesAsText("name"))
}
}

Scala is an expression-oriented, object-oriented and functional language. You can of course write imperative code in it, but for working with JSON I recommend the object-oriented route: bind the JSON to classes. You can find examples in the jackson-module-scala GitHub repository:
https://github.com/FasterXML/jackson-module-scala/
For example, I recommend writing one Scala class for the whole JSON document and further classes for sub-objects such as MyReport or whatSetItOffEntry. The repository contains a test that shows this style of solution:
package com.fasterxml.jackson.module.scala
import com.fasterxml.jackson.annotation.{JsonUnwrapped, JsonProperty, JsonIgnore}
import org.junit.runner.RunWith
import org.scalatest.junit.JUnitRunner
import org.scalatest.matchers.ShouldMatchers
import org.scalatest.FlatSpec
import com.fasterxml.jackson.databind.ObjectMapper
case class Address(address1: Option[String], city: Option[String], state: Option[String])
class NonCreatorPerson
{
var name: String = _
@JsonUnwrapped var location: Address = _
var alias: Option[String] = _
}
case class Person(name: String, @JsonIgnore location: Address, alias: Option[String])
{
private def this() = this("", Address(None, None, None), None)
def address1 = location.address1
private def address1_=(value: Option[String]) {
setAddressField("address1", value)
}
def city = location.city
private def city_=(value: Option[String]) {
setAddressField("city", value)
}
def state = location.state
private def state_= (value: Option[String]) {
setAddressField("state", value)
}
private def setAddressField(name: String, value: Option[String])
{
val f = location.getClass.getDeclaredField(name)
f.setAccessible(true)
f.set(location, value)
}
}
@RunWith(classOf[JUnitRunner])
class UnwrappedTest extends BaseSpec {
"mapper" should "handle ignored fields correctly" in {
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
val p = Person("Snoopy", Address(Some("123 Main St"), Some("Anytown"), Some("WA")), Some("Joe Cool"))
val json = mapper.writeValueAsString(p)
// There's some instability in the ordering of keys. Not sure what that's about, but rather than
// have buggy tests, I'm accepting it for now.
// json should (
// be === """{"name":"Snoopy","alias":"Joe Cool","city":"Anytown","address1":"123 Main St","state":"WA"}""" or
// be === """{"name":"Snoopy","alias":"Joe Cool","state":"WA","address1":"123 Main St","city":"Anytown"}"""
// )
val p2 = mapper.readValue(json, classOf[Person])
p2 shouldEqual p
}
it should "handle JsonUnwrapped for non-creators" in {
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
val p = new NonCreatorPerson
p.name = "Snoopy"
p.location = Address(Some("123 Main St"), Some("Anytown"), Some("WA"))
p.alias = Some("Joe Cool")
val json = mapper.writeValueAsString(p)
val p2 = mapper.readValue(json, classOf[NonCreatorPerson])
p2.name shouldBe p.name
p2.location shouldBe p.location
p2.alias shouldBe p.alias
}
}
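If the goal is only to collect the `name` values under one `matches` array, the tree model can also be used in a more functional style than the loop in the question. A minimal sketch, assuming Scala 2.13+ (for `scala.jdk.CollectionConverters`) and jackson-databind; `MatchNames`, `namesAt` and the shortened sample document are hypothetical. Note that it reads the whole document into memory, so it is not a streaming solution.

```scala
import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
import scala.jdk.CollectionConverters._

// Illustrative sketch; not taken from the jackson-module-scala repository.
object MatchNames {
  private val mapper = new ObjectMapper()

  // Follow a fixed chain of object keys. JsonNode.path returns a "missing"
  // node instead of null, so an absent key yields an empty result.
  def namesAt(root: JsonNode, path: Seq[String]): List[String] =
    path.foldLeft(root)((node, key) => node.path(key))
      .elements().asScala            // children of the array at the end of the path
      .map(_.path("name").asText())
      .toList

  // Shortened version of the document from the question.
  val sample: String =
    """{"myReport":{"docReports":{"part1/.":{"myAnalysis":{"matches":[
      |{"name":"US SSN","position":13},
      |{"name":"MasterCard Credit Card Number","position":35}]}}}}}""".stripMargin

  val names: List[String] =
    namesAt(mapper.readTree(sample),
      Seq("myReport", "docReports", "part1/.", "myAnalysis", "matches"))
}
```

The fold replaces the mutable `pathIndex` bookkeeping, and converting the Java iterator with `asScala` lets the rest be an ordinary `map`.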

Play [Scala]: How to flatten a JSON object

Given the following JSON...
{
"metadata": {
"id": "1234",
"type": "file",
"length": 395
}
}
... how do I convert it to
{
"metadata.id": "1234",
"metadata.type": "file",
"metadata.length": 395
}
Tx.

You can do this pretty concisely with Play's JSON transformers. The following is off the top of my head, and I'm sure it could be greatly improved on:
import play.api.libs.json._
val flattenMeta = (__ \ 'metadata).read[JsObject].flatMap(
_.fields.foldLeft((__ \ 'metadata).json.prune) {
case (acc, (k, v)) => acc andThen __.json.update(
Reads.of[JsObject].map(_ + (s"metadata.$k" -> v))
)
}
)
And then:
val json = Json.parse("""
{
"metadata": {
"id": "1234",
"type": "file",
"length": 395
}
}
""")
And:
scala> json.transform(flattenMeta).foreach(Json.prettyPrint _ andThen println)
{
"metadata.id" : "1234",
"metadata.type" : "file",
"metadata.length" : 395
}
Just change the path if you want to handle metadata fields somewhere else in the tree.
Note that using a transformer may be overkill here—see e.g. Pascal Voitot's input in this thread, where he proposes the following:
(json \ "metadata").as[JsObject].fields.foldLeft(Json.obj()) {
case (acc, (k, v)) => acc + (s"metadata.$k" -> v)
}
It's not as composable, and you'd probably not want to use as in real code, but it may be all you need.

This is definitely not trivial, but it is possible by flattening recursively. I haven't tested this thoroughly, but it works with your example and some other basic ones I've come up with using arrays:
object JsFlattener {
def apply(js: JsValue): JsValue = flatten(js).foldLeft(JsObject(Nil))(_++_.as[JsObject])
def flatten(js: JsValue, prefix: String = ""): Seq[JsValue] = {
js.as[JsObject].fieldSet.toSeq.flatMap{ case (key, values) =>
values match {
case JsBoolean(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsNumber(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsString(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsArray(seq) => seq.zipWithIndex.flatMap{ case (x, i) => flatten(x, concat(prefix, key + s"[$i]")) }
case x: JsObject => flatten(x, concat(prefix, key))
case _ => Seq(Json.obj(concat(prefix, key) -> JsNull))
}
}
}
def concat(prefix: String, key: String): String = if(prefix.nonEmpty) s"$prefix.$key" else key
}
JsObject has the fieldSet method that returns a Set[(String, JsValue)], which I mapped, matched against the JsValue subclass, and continued consuming recursively from there.
You can use this example by passing a JsValue to apply:
val json = Json.parse("""
{
"metadata": {
"id": "1234",
"type": "file",
"length": 395
}
}
""")
JsFlattener(json)
We'll leave it as an exercise to the reader to make the code more beautiful looking.

Here's my take on this problem, based on @Travis Brown's 2nd solution.
It recursively traverses the json and prefixes each key with its parent's key.
def flatten(js: JsValue, prefix: String = ""): JsObject = js.as[JsObject].fields.foldLeft(Json.obj()) {
case (acc, (k, v: JsObject)) => {
if(prefix.isEmpty) acc.deepMerge(flatten(v, k))
else acc.deepMerge(flatten(v, s"$prefix.$k"))
}
case (acc, (k, v)) => {
if(prefix.isEmpty) acc + (k -> v)
else acc + (s"$prefix.$k" -> v)
}
}
which turns this:
{
"metadata": {
"id": "1234",
"type": "file",
"length": 395
},
"foo": "bar",
"person": {
"first": "peter",
"last": "smith",
"address": {
"city": "Ottawa",
"country": "Canada"
}
}
}
into this:
{
"metadata.id": "1234",
"metadata.type": "file",
"metadata.length": 395,
"foo": "bar",
"person.first": "peter",
"person.last": "smith",
"person.address.city": "Ottawa",
"person.address.country": "Canada"
}

@Trev has the best solution here, completely generic and recursive, but it's missing a case for array support. I'd like something that works in this scenario:
turn this:
{
"metadata": {
"id": "1234",
"type": "file",
"length": 395
},
"foo": "bar",
"person": {
"first": "peter",
"last": "smith",
"address": {
"city": "Ottawa",
"country": "Canada"
},
"kids": ["Bob", "Sam"]
}
}
into this:
{
"metadata.id": "1234",
"metadata.type": "file",
"metadata.length": 395,
"foo": "bar",
"person.first": "peter",
"person.last": "smith",
"person.address.city": "Ottawa",
"person.address.country": "Canada",
"person.kids[0]": "Bob",
"person.kids[1]": "Sam"
}
I've arrived at this, which appears to work, but seems overly verbose. Any help in making this pretty would be appreciated.
def flatten(js: JsValue, prefix: String = ""): JsObject = js.as[JsObject].fields.foldLeft(Json.obj()) {
case (acc, (k, v: JsObject)) => {
val nk = if(prefix.isEmpty) k else s"$prefix.$k"
acc.deepMerge(flatten(v, nk))
}
case (acc, (k, v: JsArray)) => {
val nk = if(prefix.isEmpty) k else s"$prefix.$k"
val arr = flattenArray(v, nk).foldLeft(Json.obj())(_++_)
acc.deepMerge(arr)
}
case (acc, (k, v)) => {
val nk = if(prefix.isEmpty) k else s"$prefix.$k"
acc + (nk -> v)
}
}
def flattenArray(a: JsArray, k: String = ""): Seq[JsObject] = {
flattenSeq(a.value.zipWithIndex.map {
case (o: JsObject, i: Int) =>
flatten(o, s"$k[$i]")
case (o: JsArray, i: Int) =>
flattenArray(o, s"$k[$i]")
case a =>
Json.obj(s"$k[${a._2}]" -> a._1)
})
}
def flattenSeq(s: Seq[Any], b: Seq[JsObject] = Seq()): Seq[JsObject] = {
s.foldLeft[Seq[JsObject]](b){
case (acc, v: JsObject) =>
acc:+v
case (acc, v: Seq[Any]) =>
flattenSeq(v, acc)
}
}

Thanks m-z, it is very helpful. (I'm not so familiar with Scala.)
I'd like to add a line so that "flatten" also works with a primitive JSON array like "{metadata: ["aaa", "bob"]}".
def flatten(js: JsValue, prefix: String = ""): Seq[JsValue] = {
// JSON primitive array can't convert to JsObject
if(!js.isInstanceOf[JsObject]) return Seq(Json.obj(prefix -> js))
js.as[JsObject].fieldSet.toSeq.flatMap{ case (key, values) =>
values match {
case JsBoolean(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsNumber(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsString(x) => Seq(Json.obj(concat(prefix, key) -> x))
case JsArray(seq) => seq.zipWithIndex.flatMap{ case (x, i) => flatten(x, concat(prefix, key + s"[$i]")) }
case x: JsObject => flatten(x, concat(prefix, key))
case _ => Seq(Json.obj(concat(prefix, key) -> JsNull))
}
}
}

Based on the previous solutions, I have tried to simplify the code a bit:
def getNewKey(oldKey: String, newKey: String): String = {
if (oldKey.nonEmpty) oldKey + "." + newKey else newKey
}
def flatten(js: JsValue, prefix: String = ""): JsObject = {
if (!js.isInstanceOf[JsObject]) return Json.obj(prefix -> js)
js.as[JsObject].fields.foldLeft(Json.obj()) {
case (o, (k, value)) => {
o.deepMerge(value match {
case x: JsArray => x.as[Seq[JsValue]].zipWithIndex.foldLeft(o) {
case (o, (n, i: Int)) => o.deepMerge(
flatten(n.as[JsValue], getNewKey(prefix, k) + s"[$i]")
)
}
case x: JsObject => flatten(x, getNewKey(prefix, k))
case x => Json.obj(getNewKey(prefix, k) -> x.as[JsValue])
})
}
}
}
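The recursion shared by the answers above (prefix each key with its parent's key, index array elements) can also be sketched on plain Scala collections, which makes it easy to try without play-json. This is an illustrative variant, not a replacement for the play-json versions; `FlattenSketch` and its sample data are made up.

```scala
// Illustrative sketch on Map/List instead of JsObject/JsArray.
object FlattenSketch {
  // Maps contribute dotted keys, Lists contribute indexed keys,
  // anything else is a leaf stored under the accumulated prefix.
  def flatten(value: Any, prefix: String = ""): Map[String, Any] = value match {
    case m: Map[_, _] =>
      m.iterator.flatMap { case (k, v) =>
        flatten(v, if (prefix.isEmpty) k.toString else s"$prefix.$k")
      }.toMap
    case l: List[_] =>
      l.iterator.zipWithIndex.flatMap { case (v, i) => flatten(v, s"$prefix[$i]") }.toMap
    case leaf =>
      Map(prefix -> leaf)
  }

  val nested: Map[String, Any] = Map(
    "foo" -> "bar",
    "person" -> Map(
      "first" -> "peter",
      "address" -> Map("city" -> "Ottawa"),
      "kids" -> List("Bob", "Sam")
    )
  )

  val flat: Map[String, Any] = flatten(nested)
}
```

Treating objects, arrays and leaves as three cases of one function is the same simplification the last answer makes; the only extra decision is how array indices are written into the key.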
