Map nested JSON in Azure Data Factory to raw object

Since ADF (Azure Data Factory) isn't able to handle complex/nested JSON objects, I'm using OPENJSON in SQL to parse the objects. But, I can't get the 'raw' JSON from the following object:
{
  "rows": [
    {
      "name": "Name1",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    },
    {
      "name": "Name2",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    },
    {
      "name": "Name3",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    }
  ]
}
Config 1
With this config (screenshot omitted), I get all the names listed in the result:
Name1
Name2
Name3
Config 2
With this config (screenshot omitted), I get the whole JSON in one record:
[ {{full JSON}} ]
Needed config
But what I want is this result:
{ "name":"Name1", "attribute1":"attribute1", "attribute2":"attribute2" }
{ "name":"Name2", "attribute1":"attribute1", "attribute2":"attribute2" }
{ "name":"Name3", "attribute1":"attribute1", "attribute2":"attribute2" }
So, I need the iteration of Config 1, but with the raw JSON per row. Every time I use $['rows'] or $['rows'][0], it seems to 'forget' to iterate.
Anyone?
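
For the SQL side of this: once the whole document has been landed in a single NVARCHAR(MAX) value (as in Config 2), OPENJSON can do the per-row split itself. Called with a path and no WITH clause, it returns one row per array element, with each element's raw JSON in the value column. A minimal sketch against the sample document:

DECLARE @json nvarchar(max) = N'{"rows":[
  {"name":"Name1","attribute1":"attribute1","attribute2":"attribute2"},
  {"name":"Name2","attribute1":"attribute1","attribute2":"attribute2"},
  {"name":"Name3","attribute1":"attribute1","attribute2":"attribute2"}
]}';

-- Schemaless OPENJSON returns key, value, and type columns;
-- for an array, [value] holds each element's raw JSON text.
SELECT [value] AS rawRow
FROM OPENJSON(@json, '$.rows');

This yields exactly the three per-row JSON strings listed under "Needed config".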

Have you tried Data Flows to handle JSON structures? We have that feature built in, with data flow transformations like derived column, flatten, and sink mapping.

The Copy activity can help us achieve it.
For example, I copy B.json from the container "backup" to another blob container, "testcontainer".
This is my B.json source dataset (screenshot omitted).
The source, sink, and mapping settings, the successful pipeline run, and the resulting data in testcontainer were shown in screenshots (omitted here).
Hope this helps.
Update:
Copying the nested JSON to SQL.
The source is the same B.json in blob storage.
The sink dataset, sink and mapping settings, the pipeline run, and the resulting rows in the SQL database were shown in screenshots (omitted here).
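
For reference, the mapping screenshots above correspond to the copy activity's translator property. A sketch of what the iterating mapping can look like for the sample document, based on the documented TabularTranslator format (the exact path syntax and sink column names here are assumptions):

{
    "type": "TabularTranslator",
    "collectionReference": "$['rows']",
    "mappings": [
        { "source": { "path": "['name']" },       "sink": { "name": "name" } },
        { "source": { "path": "['attribute1']" }, "sink": { "name": "attribute1" } },
        { "source": { "path": "['attribute2']" }, "sink": { "name": "attribute2" } }
    ]
}

collectionReference is what makes the copy activity cross-apply each element of rows as its own output row, which is the iteration behaviour Config 1 showed.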

Related

How to pass multiple JSON objects of an array to a POST request in JMeter using a Groovy script

The POST API triggers fine but takes only the first JSON object from the array; the rest of the JSON objects are not sent. I need to trigger the API with multiple JSON payloads sequentially using JMeter and Groovy.
JSON payload: below is the sample JSON payload
[
  { "person": "abc",  "Id": "123" },
  { "person": "adfg", "Id": "12883" },
  { "person": "adf",  "Id": "125" }
]
Groovy code: reading data from a JSON file that includes multiple JSON objects and sending it to the POST request in JMeter.
import groovy.json.JsonSlurper
import com.google.gson.JsonArray
import com.google.gson.JsonElement
import com.google.gson.JsonParser

try {
    JsonSlurper jsonSlurper = new JsonSlurper()
    def jsonPayload = jsonSlurper.parse(new File('PAYLOAD.json'))
    String inputData = new groovy.json.JsonBuilder(jsonPayload)
    JsonElement root = new JsonParser().parse(inputData)
    JsonArray jsonArray = root.getAsJsonArray()
    log.info("jsonArray:" + jsonArray)
    if (jsonArray != null && !jsonArray.isEmpty()) {
        jsonArray.each { payloadData ->
            println payloadData
            log.info("post data:" + payloadData)
            // NOTE: this overwrites the same variable on every iteration
            vars.putObject("payloads", payloadData.toString())
            log.info('Generated body: ' + vars.getObject('payloads'))
        }
    }
} catch (FileNotFoundException fe) {
    log.info("Error: please check the file path")
}
JMeter test: triggering the same API with multiple payloads, using the variable below in the POST request body:
${payloads}
NOTE: the API triggers fine but takes only the first JSON object; the rest of the JSON objects are not sent. I need to trigger the API with multiple JSON payloads sequentially.
What are you trying to achieve?
You have a foreach loop which iterates the JSON array and writes the inner object value into a payloads JMeter variable.
The point is that each iteration of the foreach loop overwrites the previous value in the variable, so you will always get the last value.
You either need to replace jsonArray.each with jsonArray.eachWithIndex and store each array member in a separate variable (see the sketch after this list), like:
payload_0 = first array member
payload_1 = second array member
etc.
or transform the data into another format, but there I cannot suggest anything because I don't know what the expected one is.
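A minimal sketch of the eachWithIndex variant (the variable names are assumptions, and the index is shifted by one because JMeter's ForEach Controller expects numbered variables starting at 1):

jsonArray.eachWithIndex { payloadData, i ->
    // Store payload_1, payload_2, ... : one JMeter variable per array member
    vars.put('payload_' + (i + 1), payloadData.toString())
}

A ForEach Controller with input variable prefix payload and output variable name payload can then drive one HTTP request per array member, with ${payload} as the request body.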
More information:
Apache Groovy - Parsing and producing JSON
Apache Groovy: What Is Groovy Used For?

Compare two JSON files using Apache Spark

I am new to Apache Spark and I am trying to compare two JSON files.
My requirement is to find out which key/value was added, removed, or modified, and what its path is.
To explain my problem, I am sharing the code I have tried with a small JSON sample here.
Sample Json 1 is:
{
  "employee": {
    "name": "sonoo",
    "salary": 57000,
    "married": true
  }
}
Sample Json 2 is:
{
  "employee": {
    "name": "sonoo",
    "salary": 58000,
    "married": true
  }
}
My code is:
//Compare two multiline json files
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
//Load first json file
val jsonData_1 = sqlContext.read.json(sc.wholeTextFiles("D:\\File_1.json").values)
//Load second json file
val jsonData_2 = sqlContext.read.json(sc.wholeTextFiles("D:\\File_2.json").values)
//Compare both json files
jsonData_2.except(jsonData_1).show(false)
The output which I get on executing this code is:
+--------------------+
|employee |
+--------------------+
|{true, sonoo, 58000}|
+--------------------+
But only one field, i.e. salary, was modified, so the output should contain only the updated field with its path.
Below is the expected output details:
[
{
"op" : "replace",
"path" : "/employee/salary",
"value" : 58000
}
]
Can anyone point me in the right direction?
Assuming each JSON has an identifier, and that you have two JSON groups (e.g. folders), you need to compare between the JSONs in the two groups:
Load the JSONs from each group into a dataframe, providing a schema matching the structure of the JSON. After this, you have two dataframes.
Compare the JSONs (by now rows in a dataframe) by joining on the identifiers, looking for mismatched values, as in the sketch below.
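A minimal sketch of that approach, assuming name serves as the identifier and only salary is compared (the paths and column names are illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("json-diff").getOrCreate()
import spark.implicits._

// multiLine is needed because the sample files are pretty-printed
val df1 = spark.read.option("multiLine", true).json("D:\\group1")
val df2 = spark.read.option("multiLine", true).json("D:\\group2")

// Flatten the nested employee struct so individual fields can be compared
val flat1 = df1.select($"employee.name".as("name"), $"employee.salary".as("old_salary"))
val flat2 = df2.select($"employee.name".as("name"), $"employee.salary".as("new_salary"))

// Keep only rows where the same identifier carries a different value,
// shaped like a JSON-patch "replace" entry
flat1.join(flat2, Seq("name"))
  .where($"old_salary" =!= $"new_salary")
  .selectExpr("name", "'/employee/salary' as path", "new_salary as value")
  .show(false)

A fully generic diff (covering keys added or removed at arbitrary paths) would need to walk the schema rather than hard-code columns; a dedicated JSON-patch library outside Spark may be a better fit for that part.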

Flatten nested JSON with jq

I'm trying to flatten some nested JSON with jq. My first attempt was by looping over the JSON in bash with base64 as per this article. It turned out to perform very slowly, so I'm trying to figure out an alternative with just jq.
I have some JSON like this:
[
  {
    "id": 117739,
    "officers": "[{\"name\":\"Alice\"},{\"name\":\"Bob\"}]"
  },
  {
    "id": 117740,
    "officers": "[{\"name\":\"Charlie\"}]"
  }
]
The officers field holds a string which is JSON too. I'd like to reduce this to:
[
{ "id":117739, "name":"Alice" },
{ "id":117739, "name":"Bob" },
{ "id":117740, "name":"Charlie" }
]
Well, the data you're attempting to flatten is itself JSON, so you have to parse it using fromjson. Once parsed, you can then generate the new objects.
map({id} + (.officers | fromjson[]))
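Since map(f) is defined as [.[] | f], the multiple objects that fromjson[] generates per input element are collected flat into a single output array. Assuming the input is saved as input.json (the filename is illustrative):

jq 'map({id} + (.officers | fromjson[]))' input.json

which produces exactly the desired output shown above.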

Dynamically build JSON using Groovy

I am trying to dynamically build some JSON based on data I retrieve from a database. Everything up until the opening '[' is the "root", I guess you could say. The parts that follow, with name and value, are dynamic and will be based on the number of results I get from the DB. I query the DB, and the idea was to iterate through the result, adding to the JSON. Can I use JsonBuilder for the root section and then loop with JsonSlurper to add each additional section? Most of the examples I have seen deal with a root and then a one-time "slurp" and then join the two, so I wasn't sure if I should try a different method for looping and appending multiple sections.
Any tips would be greatly appreciated. Thanks.
{
  "hostname": "$hostname",
  "path": "$path",
  "extPath": "$extPath",
  "appName": "$appName",
  "update": {
    "parameter": [
      {
        "name": "$name",
        "value": "$value"
      },
      {
        "name": "$name",
        "value": "$value"
      }
    ]
  }
}
EDIT: So what I ended up doing was just using StringBuilder to create the initial block and then append the subsequent sections. Maybe not the most graceful way to do it, but it works!
//Create the json string
StringBuilder json = new StringBuilder("""{
    "hostname": "$hostname",
    "path": "$path",
    "extPath": "$extPath",
    "appName": "$appName",
    "update": {"parameter": ["""
)
//Append one parameter object per result row (note the GString interpolation)
sql.eachRow("""<query>""") { params ->
    json.append("""{ "name": "${params.name}", "value": "${params.value}" },""")
}
//Drop the trailing comma left by the loop, then add the closing json tags
if (json.length() > 0 && json.charAt(json.length() - 1) == ',') {
    json.setLength(json.length() - 1)
}
json.append("""]}}""")
If I got your explanation correctly, and if the data is not very big (i.e. it can live in memory), I'd build a Map object (which is very easy to work with in Groovy) and convert it to JSON afterwards. Something like this:
import groovy.json.JsonOutput

def data = [
    hostname: hostname,
    path: path,
    extPath: extPath,
    appName: appName,
    update: [parameter: []]
]
sql.eachRow(sqlStr) { row ->
    data.update.parameter << [name: row.name, value: row.value]
}
println JsonOutput.toJson(data)
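With placeholder values for the bindings and two result rows, JsonOutput.toJson produces a compact document of the desired shape, e.g.:

{"hostname":"myhost","path":"/my/path","extPath":"/ext","appName":"myApp","update":{"parameter":[{"name":"param1","value":"value1"},{"name":"param2","value":"value2"}]}}

(groovy.json.JsonOutput.prettyPrint can reformat it if the indented layout matters.)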
If you're using Grails and Groovy you can utilize grails.converters.JSON.
First, define a named JSON configuration:
JSON.createNamedConfig('person') {
    it.registerObjectMarshaller(Person) { Person person ->
        def output = [:]
        output['name'] = person.name
        output['address'] = person.address
        output['age'] = person.age
        output
    }
}
This will result in a statically defined named configuration for the Person object type. Now, you can simply call:
JSON.use('person') {
Person.findAll() as JSON
}
This will return every person in the database with their name, address, and age, all in one JSON response. I don't know whether you're also using Grails in this situation, though; for pure Groovy, go with another answer here.

Get JSON data from a var

I get the following data into a variable fields:
{ data: [ '{"myObj":"asdfg"}' ] }
How do I get the value of myObj into another variable? I tried fields.myObj.
I am trying to upload a file to the server using MEANjs and node multiparty.
Look at your data.
fields only has one property: data. So fields.myObj isn't going to work.
So, let's start with fields.data.
The value of that is an array. You can see the []. It has only one member, so:
fields.data[0]
This is a string. You seem to want to treat it as an object. It happens to conform to the JSON syntax, so you can parse it:
JSON.parse(fields.data[0])
This parses into an object, so now you can access the myObj property.
JSON.parse(fields.data[0]).myObj
var fields = { data: [ '{"myObj":"asdfg"}' ] };
alert(JSON.parse(fields.data[0]).myObj);
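
For context on why the value is wrapped in an array: node multiparty reports every form field's values as an array, since a field can legally appear more than once in a multipart request. A minimal sketch of a handler (the route shape and the data field name are assumptions matching the question):

var multiparty = require('multiparty');

function handleUpload(req, res) {
    var form = new multiparty.Form();
    // multiparty's callback delivers fields as { fieldName: [value, ...] }
    form.parse(req, function (err, fields, files) {
        if (err) { return res.status(500).end(); }
        var myObj = JSON.parse(fields.data[0]).myObj; // "asdfg"
        res.json({ myObj: myObj });
    });
}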