Parse complex json file in Azure Data Factory - json

I would like to parse a complex JSON file in Azure Data Factory. The file contains nested objects and arrays. From my understanding ADF can parse arrays, but what should we do to parse more complex files?
The structure of the file is shown below:
{
"productA": {
"subcategory 1" : [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": "xyzzy"
"history data": [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": {"John":"Very nice","Nick":"Not nice"}
}
]
}
}
}

There seems to be an error in the JSON structure you posted. It is not valid JSON: it is missing commas (,) and a closing bracket.
Once you have a valid JSON structure, you can use the Flatten transformation in a Data flow to flatten the JSON.
Source:
{
  "productA": {
    "subcategory 1": [
      {
        "name": "x",
        "latest buy": "22-12-21",
        "total buys": 4,
        "other comments": "xyzzy",
        "history data": [
          {
            "name": "x",
            "latest buy": "22-12-21",
            "total buys": 4,
            "other comments": {"John": "Very nice", "Nick": "Not nice"}
          }
        ]
      }
    ]
  }
}
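Before building the data flow, the validity of a file can be checked outside ADF. The following is a minimal sketch using Python's standard json module; the helper name json_error and the two sample fragments are hypothetical, modelled on the question's data.

```python
import json

def json_error(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None

# Hypothetical fragments modelled on the question: the first one lacks a comma.
broken = '{"name": "x", "latest buy": "22-12-21" "total buys": 4}'
fixed = '{"name": "x", "latest buy": "22-12-21", "total buys": 4}'

print(json_error(broken))  # describes where parsing failed
print(json_error(fixed))   # None, because the text parses
```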
ADF Data flow:
Source transformation:
Connect the JSON dataset to the source transformation and, in Source options under JSON settings, select Single document.
Flatten transformation:
Here, select the array you want to unroll in Unroll by and Unroll root, then add the column mappings.
Refer to the Parse and Flatten transformation documentation for more details on parsing JSON documents in ADF.
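To illustrate what unrolling the nested array produces, here is a rough Python sketch. It is not ADF code, only an approximation of the idea of one output row per element of "history data". The function name unroll_history and the output column names are assumptions made for the example.

```python
import json

# The corrected document from the answer above.
document = json.loads("""
{
  "productA": {
    "subcategory 1": [
      {
        "name": "x",
        "latest buy": "22-12-21",
        "total buys": 4,
        "other comments": "xyzzy",
        "history data": [
          {
            "name": "x",
            "latest buy": "22-12-21",
            "total buys": 4,
            "other comments": {"John": "Very nice", "Nick": "Not nice"}
          }
        ]
      }
    ]
  }
}
""")

def unroll_history(doc):
    """Emit one flat row per "history data" element (column names are hypothetical)."""
    rows = []
    for item in doc["productA"]["subcategory 1"]:
        for entry in item["history data"]:
            rows.append({
                "name": item["name"],
                "total_buys": item["total buys"],
                "history_latest_buy": entry["latest buy"],
                "history_total_buys": entry["total buys"],
                "history_comment_John": entry["other comments"]["John"],
                "history_comment_Nick": entry["other comments"]["Nick"],
            })
    return rows

for row in unroll_history(document):
    print(row)
```

Each nested level becomes ordinary columns, which is roughly what the mappings in the Flatten transformation express.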

Related

Azure Data Factory - convert Json Array to Json Object

I retrieve data with Azure Data Factory from an on-premises database, and the output I get is as follows:
{
"value":[
{
"JSON_F52E2B61-18A1-11d1-B105-XXXXXXX":"{\"productUsages\":[{\"customerId\":3552,\"productId\":120,\"productionDate\":\"2015-02-10\",\"quantity\":1,\"userName\":\"XXXXXXXX\",\"productUsageId\":XXXXXX},{\"customerId\":5098,\"productId\":120,\"productionDate\":\"2015-04-07\",\"quantity\":1,\"userName\":\"ZZZZZZZ\",\"productUsageId\":ZZZZZZ}]}"
}
]
}
The entire value array is being serialized into a JSON and I end up with:
[{
"productUsages":
[
{
"customerId": 3552,
"productId": 120,
"productionDate": "2015-02-10",
"quantity": 1,
"userName": "XXXXXXXX",
"productUsageId": XXXXXX
},
{
"customerId": 5098,
"productId": 120,
"productionDate": "2015-04-07",
"quantity": 1,
"userName": "ZZZZZZZ",
"productUsageId": ZZZZZZZ
}
]
}]
I need a JSON object at the root level, not a JSON array (that is, [] replaced with {}). What is the easiest way to achieve that in Azure Data Factory?
Thanks
In ADF, when you read a JSON file, it is read as an array of objects by default.
However, when you write the data to a sink in JSON format, there is an option called Set of objects; select that option.
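The difference between the two output shapes can be sketched in plain Python. This is not ADF code. It assumes, as I understand the sink behaviour, that "array of objects" wraps all rows in one JSON array while "set of objects" writes each object on its own line. The sample records are hypothetical values modelled on the question.

```python
import json

# Hypothetical rows modelled on the question's data.
records = [
    {"customerId": 3552, "productId": 120, "quantity": 1},
    {"customerId": 5098, "productId": 120, "quantity": 1},
]

def as_array_of_objects(rows):
    """All rows wrapped in a single JSON array: [ {...}, {...} ]."""
    return json.dumps(rows)

def as_set_of_objects(rows):
    """One JSON object per line, with no enclosing array."""
    return "\n".join(json.dumps(row) for row in rows)

print(as_array_of_objects(records))
print(as_set_of_objects(records))
```

With a single row, the second form yields a document whose root is an object rather than an array.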

Access properties of an object via Dust.js after running JSON Parse filter

Is there any way to access properties of an object that was transformed into JSON through the jp (JSON parse) filter of Dust.js?
{
"response": {
"services": [
"{
\"prop1\":\"value1\",
\"prop2\":\"value2\",
\"prop3\":10
}"
]
}
}
For example, with the input above, I intend to receive the following output:
[
{
"prop1": "value1"
}
]
Note that the values inside the service array are strings, and because of that, before accessing the object's properties, I need to run JSON parse filter.
[
{#response.services}
{
"prop1": "{.|jp}"
}{#sep}, {/sep}
{/response.services}
]
What I've developed so far is the code above, and this code is returning the following output:
[
{
"prop1": "[object Object]"
}
]
In short, what I need is to extend {.|jp} into something that lets me access the properties of the returned object, without adding new filters.
Thanks in advance to everyone who is willing to help!
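For reference, the intended transformation can be written outside Dust.js as a short Python sketch. It only illustrates the parse-then-access step described in the question and says nothing about how the jp filter behaves. The function name pick_prop1 is hypothetical, and the sample string is a single-line version of the input.

```python
import json

# Each element of "services" is a JSON string, not an object.
response = {
    "response": {
        "services": ['{"prop1": "value1", "prop2": "value2", "prop3": 10}']
    }
}

def pick_prop1(data):
    """Parse every service string, then keep only its prop1 property."""
    result = []
    for raw in data["response"]["services"]:
        service = json.loads(raw)  # string -> dict, so properties become accessible
        result.append({"prop1": service["prop1"]})
    return result

print(json.dumps(pick_prop1(response), indent=2))
```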

How to loop through and assert JSON array objects in JMeter if they have the same name?

I have the JSON response below to validate. I need to validate every "createdDate" in all of the arrays. Is there an easy way to capture them or loop through them (they share the same name but sit in different arrays) and store them in variables, so that I can assert them against the corresponding values from a JDBC response?
Right now I use a separate JSON Assertion for each "createdDate", using its JSON path to validate against the database value.
{
"someobject1": 123,
"Array1":
[
{
"someobject2": 2,
"createdDate": "2019-03-26T20:29:44.631+0000",
"someobject3": "SCRIPT1"
},
{
"someobject4": 3,
"createdDate": "2019-03-27T20:29:44.631+0000",
"someobject5": "SCRIPT2"
}
],
"Array2":
[
{
"someobject6": 4,
"createdDate": "2019-03-28T20:29:44.631+0000",
"someobject7": "SCRIPT3"
},
{
"someobject8": 5,
"createdDate": "2019-03-29T20:29:44.631+0000",
"someobject9": "SCRIPT4"
}
]
}
You can use a JSON Assertion configured like this:
Assert JSON Path Exists: $..createdDate
Expected Value: ["2019-03-26T20:29:44.631+0000","2019-03-27T20:29:44.631+0000","2019-03-28T20:29:44.631+0000","2019-03-29T20:29:44.631+0000"]
More information:
JSON Path: Deep Scan Operator
JSON Path Examples
JMeter's JSON Path Extractor Plugin - Advanced Usage Scenarios
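What the deep scan expression $..createdDate collects can be approximated with a short recursive walk. The Python sketch below is not JMeter code. The function name deep_scan is hypothetical, and the sample response is a trimmed version of the one in the question.

```python
import json

def deep_scan(node, key):
    """Collect, in document order, every value stored under `key` at any depth."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            found.extend(deep_scan(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(deep_scan(item, key))
    return found

# Trimmed version of the response from the question.
response = json.loads("""
{
  "someobject1": 123,
  "Array1": [
    {"someobject2": 2, "createdDate": "2019-03-26T20:29:44.631+0000"},
    {"someobject4": 3, "createdDate": "2019-03-27T20:29:44.631+0000"}
  ],
  "Array2": [
    {"someobject6": 4, "createdDate": "2019-03-28T20:29:44.631+0000"},
    {"someobject8": 5, "createdDate": "2019-03-29T20:29:44.631+0000"}
  ]
}
""")

created_dates = deep_scan(response, "createdDate")
print(created_dates)
```

The resulting list has the same shape as the Expected Value above, so it can be compared as a whole against the values read from the database.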

how to nest an array inside an object in yaml?

Suppose you have a Map<String, Object> called "something" in YAML:
something:
and the corresponding JSON should look like this:
"something": {
"else": "then",
"array": [
"element in array"
]
}
So the YAML for this might be:
something:
  else: then
  array:
    - element in array
But since something is a Map, it does not let me do
array:
- element in array
or this
array: ['element in array']
So the question is: what should the YAML be to get the JSON above, given that something is a Map<String, Object>? Is it possible?
This concerns defining the ServiceCatalogDefinition for an implementation of the Open Service Broker API.
OSB Catalog using Yaml
OSB Catalog json looks like this
I am trying to mark the "properties" mentioned under schemas in the above link as required.
For that I need it to return JSON like this:
"properties" : {
"someProperty" : {
"description": "description",
"type": "string"
},
"required": [
"someProperty"
]
}
The YAML in my application.yml fails validation with the error mentioned in the comment.
There are two things you need to do:
Make the JSON valid, e.g. by inserting a comma (as @flyx suggests) and adding curly braces around the root-level object:
{
"something": {
"else": "then",
"array": [
"element in array"
]
}
}
Change the plain scalar (i.e. unquoted) mapping key something to a double-quoted scalar:
{
"something": {
"else": "then",
"array": [
"element in array"
]
}
}
Since YAML has, for all practical purposes, effectively been a superset of JSON (since YAML 1.2 from 2009), you don't need to do anything else. And of course you can read the above with both a YAML loader, as well as with a JSON parser.
Using the site json2yaml, you get this YAML:
---
something:
  else: then
  array:
  - element in array
from this JSON:
{
"something": {
"else": "then",
"array": [
"element in array"
]
}
}
Compared with your version, I think your "-" must be at the same level as "array".
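The indentation described above can be reproduced with a toy renderer. The Python sketch below is not a YAML library and makes strong assumptions: it handles only nested dicts, lists of plain strings, and scalars that need no quoting. The function name to_block_yaml is hypothetical.

```python
import json

def to_block_yaml(value, indent=0):
    """Render dicts and lists of plain scalars as block-style YAML lines (toy version)."""
    pad = " " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, dict):
                lines.append(f"{pad}{key}:")
                lines.extend(to_block_yaml(item, indent + 2))
            elif isinstance(item, list):
                lines.append(f"{pad}{key}:")
                # Sequence entries are written at the same level as their key.
                lines.extend(to_block_yaml(item, indent))
            else:
                lines.append(f"{pad}{key}: {item}")
    elif isinstance(value, list):
        for item in value:
            lines.append(f"{pad}- {item}")
    return lines

data = json.loads('{"something": {"else": "then", "array": ["element in array"]}}')
print("\n".join(to_block_yaml(data)))
```

A real application should rely on a proper YAML library; the sketch only makes the relative position of "-" and "array" explicit.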

indexing json file using solr

I am able to index a simple JSON file using Solr, but for complex JSON with nested structures like the one below I get an error. I use the following curl command to index the JSON file:
curl 'https://localhost:8983/solr/json_collection/update?commit=true' --data-binary @/home/mic.json -H 'Content-type:application/json'
Error:
Error - {"responseHeader":{"status":400,"QTime":12},"error":{"metadata":["error-class","org.apache.solr.common.SolrException"],"msg":"Error parsing JSON field value. Unexpected OBJECT_START","code":400}}
JSON:
[
{
"PART I, ITEM 1. BUSINESS": {
"GENERAL": {
"Our vision": {
"text": [
"Microsoft world."
]
},
"The ambitions that drive us": {
"text": [
"To carry ambitions:",
"* Create more personal computing."
],
"Create more personal computing": {
"text": [
"We strive available. website."
]
}
}
},
"ITEM 1A. RISK FACTORS": "Our opk."
}
}
]
Your JSON seems to be erroneous. Whether it is a single object or an array, your JSON should follow the basic conventions.
For a single object, the syntax should be:
{ "key":"value"}
For arrays of values, the syntax can be:
{
"key1":["value1", "value2",...],
"key2":["value12", "value22",...]
}
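The shapes above use scalar or array values only, while the error message quoted in the question mentions OBJECT_START, which suggests the parser met an object where it expected a field value. The Python sketch below, which is not Solr code, lists the fields of a document whose values are nested objects. The function name object_valued_paths and the trimmed sample are assumptions made for the example.

```python
import json

def object_valued_paths(node, path=""):
    """Return the paths of all keys whose value is itself a JSON object."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}" if path else key
            if isinstance(value, dict):
                paths.append(child)
            paths.extend(object_valued_paths(value, child))
    elif isinstance(node, list):
        for item in node:
            paths.extend(object_valued_paths(item, path))
    return paths

# Trimmed version of the document from the question.
docs = json.loads("""
[
  {
    "PART I, ITEM 1. BUSINESS": {
      "GENERAL": {
        "Our vision": {"text": ["Microsoft world."]}
      },
      "ITEM 1A. RISK FACTORS": "Our opk."
    }
  }
]
""")

for found in object_valued_paths(docs):
    print(found)
```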