JSON object append in redis-cli

I am using redis-cli for one of my projects and I need to append data to an existing JSON document in Redis. I have tried JSON.ARRAPPEND but it is not working.
I need to append to the sDetail array and to the jsonDetails array. Any suggestions on how to append to JSON in redis-cli?
'{"xyz": [{"subType": 1,"sDetail": [{"eCs": "3","jsonDetails": "{\"ce\" :[{\"cRId\":272, \"cV\":10000, \"type\":1, \"tId\":0, \"uTid\":\"T00005\", \"sNumber\":\"53320\", \"sDetailId\":1101}]}"}]}]}'

jsonDetails is a string, not an array:
127.0.0.1:6379> JSON.SET test $ '{"xyz": [{"subType": 1,"sDetail": [{"eCs": "3","jsonDetails": "{\"ce\" :[{\"cRId\":272, \"cV\":10000, \"type\":1, \"tId\":0, \"uTid\":\"T00005\", \"sNumber\":\"53320\", \"sDetailId\":1101}]}"}]}]}'
OK
127.0.0.1:6379> JSON.TYPE test $.xyz[0].sDetail[0].jsonDetails
1) "string"
If instead you set it like this, with jsonDetails as a real JSON object:
JSON.SET test $ '{"xyz": [{"subType": 1,"sDetail": [{"eCs": "3","jsonDetails": {"ce" :[{"cRId":272, "cV":10000, "type":1, "tId":0, "uTid":"T00005", "sNumber":"53320", "sDetailId":1101}]}}]}]}'
then jsonDetails.ce would be an array:
127.0.0.1:6379> JSON.TYPE test $.xyz[0].sDetail[0].jsonDetails.ce
1) "array"
And now you can append to it using JSON.ARRAPPEND, for example:
127.0.0.1:6379> JSON.ARRAPPEND test $.xyz[0].sDetail[0].jsonDetails.ce '{"new": "data"}'
1) (integer) 2
To append to sDetail, you can try something like:
JSON.ARRAPPEND test $.xyz[0].sDetail '{"eCs": "42", "jsonDetails": {"ce": [{"cRId":999, "cV": 888}]}}'
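To double-check the result, you can read the appended array back with JSON.GET (shown here without its output), for example:
127.0.0.1:6379> JSON.GET test $.xyz[0].sDetail[0].jsonDetails.ce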

Related

Play JSON Parse and Extract Elements Without a Key Path

I have a JSON document that looks like this (yes, the JSON is valid):
[2,
  "19223201",
  "BootNotification",
  {
    "reason": "PowerUp",
    "chargingStation": {
      "model": "SingleSocketCharger",
      "vendorName": "VendorX"
    }
  }
]
I'm using the Play framework's JSON library and I would like to understand how I could parse the third element and extract the BootNotification value as a String.
If it had a key, I could use that key to traverse the JSON and get the corresponding value, but that is not the case here. I also do not have the possibility to load this line by line and infer from line number 3 as with the example above.
Any suggestions on how I could do this?
I think I have found a way after trying this out in Ammonite. Here is what I did:
# val input: JsValue = Json.parse("""[2,"12345678","BNR",{"reason":"PowerUp"}]""")
input: JsValue = JsArray(ArrayBuffer(JsNumber(2), JsString("12345678"), JsString("BNR"), JsObject(Map("reason" -> JsString("PowerUp")))))
Parsing the JSON, I get a nice array, and since I always expect exactly 4 elements in the array, explicitly looking up an element by its index is what I need. So to get the text at position 3 (index 2), I can do the following:
# (input \ 2)
res2: JsLookupResult = JsDefined(JsString("BNR"))
# (input \ 2).toOption
res3: Option[JsValue] = Some(JsString("BNR"))
# (input \ 2).toOption.isDefined
res4: Boolean = true
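Outside the REPL, the same idea in a compact form (a minimal sketch; the input value is the cut-down example from above and the val names are just illustrative):

import play.api.libs.json._

val input: JsValue = Json.parse("""[2,"12345678","BNR",{"reason":"PowerUp"}]""")

// Look up the element at array index 2 and read it as a String;
// this yields None if the element is missing or is not a JSON string.
val action: Option[String] = (input \ 2).asOpt[String]
// action == Some("BNR")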

Add a new line in front of each line before writing to JSON format using Spark in Scala

I'd like to add one new line in front of each of my JSON documents before Spark writes them into my S3 bucket:
df.createOrReplaceTempView("ParquetTable")
val parkSQL = spark.sql("select LAST_MODIFIED_BY, LAST_MODIFIED_DATE, NVL(CLASS_NAME, className) as CLASS_NAME, DECISION, TASK_TYPE_ID from ParquetTable")
parkSQL.show(false)
parkSQL.count()
parkSQL.write.json("s3://test-bucket/json-output-7/")
With only this command, it produces files with the contents below:
{"LAST_MODIFIED_BY":"david","LAST_MODIFIED_DATE":"2018-06-26 12:02:03.0","CLASS_NAME":"/SC/Trade/HTS_CA/1234abcd","DECISION":"AGREE","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}
{"LAST_MODIFIED_BY":"sarah","LAST_MODIFIED_DATE":"2018-08-26 12:02:03.0","CLASS_NAME":"/SC/Import/HTS_US/9876abcd","DECISION":"DISAGREE","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}
But what I'd like to achieve is something like this:
{"index":{}}
{"LAST_MODIFIED_BY":"david","LAST_MODIFIED_DATE":"2018-06-26 12:02:03.0","CLASS_NAME":"/SC/Trade/HTS_CA/1234abcd","DECISION":"AGREE","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}
{"index":{}}
{"LAST_MODIFIED_BY":"sarah","LAST_MODIFIED_DATE":"2018-08-26 12:02:03.0","CLASS_NAME":"/SC/Import/HTS_US/9876abcd","DECISION":"DISAGREE","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}
Any insight on how to achieve this result would be greatly appreciated!
The code below will concat {"index":{}} with the existing row data in the DataFrame: it converts each row into JSON and then saves the result using the text format.
import org.apache.spark.sql.functions.{concat_ws, lit, struct, to_json}
import spark.implicits._

df
  .select(
    lit("""{"index":{}}""").as("index"),
    to_json(struct($"*")).as("json_data")
  )
  .select(
    concat_ws(
      "\n", // This splits the index column & the row data into two lines.
      $"index",
      $"json_data"
    ).as("data")
  )
  .write
  .format("text") // This is required: the column already holds JSON text.
  .save("s3://test-bucket/json-output-7/")
Final Output
cat part-00000-24619b28-6501-4763-b3de-1a2f72a5a4ec-c000.txt
{"index":{}}
{"CLASS_NAME":"/SC/Trade/HTS_CA/1234abcd","DECISION":"AGREE","LAST_MODIFIED_BY":"david","LAST_MODIFIED_DATE":"2018-06-26 12:02:03.0","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}
{"index":{}}
{"CLASS_NAME":"/SC/Import/HTS_US/9876abcd","DECISION":"DISAGREE","LAST_MODIFIED_BY":"sarah","LAST_MODIFIED_DATE":"2018-08-26 12:02:03.0","TASK_TYPE_ID":"abcd1234-832b-43b6-afa6-361253ffe1d5"}

How to add field within nested JSON when reading from/writing to Kafka via a Spark dataframe

I have a Spark (v3.0.1) job written in Java that reads JSON from Kafka, does some transformation and then writes it back to Kafka. For now, the incoming message structure in Kafka is something like:
{"catKey": 1}. The output from the Spark job that's written back to Kafka is something like: {"catKey":1,"catVal":"category-1"}. The code for processing the input data from Kafka is roughly as follows:
DataFrameReader dfr = putSrcProps(spark.read().format("kafka"));
for (String key : srcProps.stringPropertyNames()) {
    dfr = dfr.option(key, srcProps.getProperty(key));
}
Dataset<Row> df = dfr.option("group.id", getConsumerGroupId())
        .load()
        .selectExpr("CAST(value AS STRING) as value")
        .withColumn("jsonData", from_json(col("value"), schemaHandler.getSchema()))
        .select("jsonData.*");
// transform df
df.toJSON().write().format("kafka").option("key", "val").save();
I want to change the message structure in Kafka. It should now be of the format {"metadata": <whatever>, "payload": {"catKey": 1}}. While reading, we need to read only the contents of the payload, so the dataframe remains similar. While writing back to Kafka, I first need to wrap the message in payload and add the metadata, so the output will have to be of the format {"metadata": <whatever>, "payload": {"catKey":1,"catVal":"category-1"}}. I've tried manipulating the contents of the selectExpr and the from_json method, but no luck so far. Any pointers on how to achieve this would be very much appreciated.
To extract the content of payload in your JSON you can use get_json_object. And to create the new output you can use the built-in functions struct and to_json.
Given a DataFrame:
val df = Seq(("""{"metadata": "whatever", "payload": {"catKey": 1}}""")).toDF("value").as[String]
df.show(false)
+--------------------------------------------------+
|value |
+--------------------------------------------------+
|{"metadata": "whatever", "payload": {"catKey": 1}}|
+--------------------------------------------------+
Then build the new payload and metadata columns:
val df2 = df
  .withColumn("catVal", lit("category-1")) // whatever your logic is to fill this column
  .withColumn("payload",
    struct(
      get_json_object(col("value"), "$.payload.catKey").as("catKey"),
      col("catVal").as("catVal")
    )
  )
  .withColumn("metadata", get_json_object(col("value"), "$.metadata"))
  .select("metadata", "payload")
df2.show(false)
+--------+---------------+
|metadata|payload |
+--------+---------------+
|whatever|[1, category-1]|
+--------+---------------+
val df3 = df2.select(to_json(struct(col("metadata"), col("payload"))).as("value"))
df3.show(false)
+----------------------------------------------------------------------+
|value |
+----------------------------------------------------------------------+
|{"metadata":"whatever","payload":{"catKey":"1","catVal":"category-1"}}|
+----------------------------------------------------------------------+
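To actually publish this back to Kafka, df3 already has the single value column the Kafka sink expects; a minimal sketch (the broker address and topic name below are placeholders, not taken from the question):

df3
  .write
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092") // placeholder broker address
  .option("topic", "output-topic")                 // placeholder topic name
  .save()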

Select (ignore if it does not exist) for JSON logs in Spark SQL

I am new to Apache Spark and trying out a few POCs around it. I am trying to read JSON logs which are structured, but a few fields are not always guaranteed; for example:
{
  "item": "A",
  "customerId": 123,
  "hasCustomerId": true,
  .
  .
  .
},
{
  "item": "B",
  "hasCustomerId": false,
  .
  .
  .
}
Assume I want to transform these JSON logs into CSV. I was trying out Spark SQL to get hold of all the fields with simple SELECT statements, but as the second JSON record is missing a field (although it does have an identifier), I am not sure how I can handle this.
I want to transform the above JSON logs into:
item, customerId, ....
A , 123 , ....
B , null/0 , ....
You should use SQLContext to read the JSON file: sqlContext.read.json("file/path"). If you then want to convert it into CSV and read it back with the missing values, your CSV file should look like
item,customerId,hasCustomerId, ....
A,123,, .... // hasCustomerId is null
B,,888, .... // customerId is null
i.e. an empty field for the missing value. Then you have to read it like this:
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true") // Use first line of all files as header
  .option("inferSchema", "true") // Automatically infer data types
  .load("file/path")

I'm trying to use 'ffprobe' with Java or Groovy

As per my understanding, "ffprobe" will provide file-related data in JSON format. I have installed ffprobe on my Ubuntu machine, but I don't know how to access the ffprobe JSON response using Java/Grails.
Expected response format:
{
  "format": {
    "filename": "/Users/karthick/Documents/videos/TestVideos/sample.ts",
    "nb_streams": 2,
    "nb_programs": 1,
    "format_name": "mpegts",
    "format_long_name": "MPEG-TS (MPEG-2 Transport Stream)",
    "start_time": "1.430800",
    "duration": "170.097489",
    "size": "80425836",
    "bit_rate": "3782576",
    "probe_score": 100
  }
}
This is my Groovy code:
def process = "ffprobe -v quiet -print_format json -show_format -show_streams HelloWorld.mpeg ".execute()
println "Found ${process.text}"
render process as JSON
I am able to get the process object, but I am not able to get the JSON response.
Should I convert the process object to a JSON object?
OUTPUT:
Found java.lang.UNIXProcess#75566697
org.codehaus.groovy.grails.web.converters.exceptions.ConverterException: Error converting Bean with class java.lang.UNIXProcess
Grails has nothing to do with this. Groovy can execute arbitrary shell commands in a very simple way:
"mkdir foo".execute()
Or for more advanced features, you might look into using ProcessBuilder. At the end of the day, you need to execute ffprobe and then capture its JSON output stream to use in your app.
Groovy provides a simple way to execute command line processes. Simply
write the command line as a string and call the execute() method.
The execute() method returns a java.lang.Process instance.
println "ffprobe <options>".execute().text
[Source]