I have rows of the following JSON form:
[
{
"id": 1,
"costs": [
{
"blue": 100,
"location":"courts",
"sport": "football"
}
]
}
]
I want to upload this into a Redshift table as follows:
id | blue | location | sport
---+------+----------+---------
1  | 100  | courts   | football
The following JSONPaths file is not successful:
{
"jsonpaths": [
"$.id",
"$.costs[0].blue",
"$.costs[0].location",
"$.costs[0].sport"
]
}
Redshift returns the following error code:
err_code: 1216 Invalid JSONPath format: Member is not an object.
How can I change the jsonpaths file to be able to upload the json as desired?
The answer to this is provided by John Rotenstein in the comments. I am just formalizing the answer here.
As shown in the documentation, the input JSON records have to be a newline-delimited sequence of JSON objects. The documentation's examples show the JSON objects pretty-printed, but typically the input stream of records has one JSON object per line.
{ "id": 1, "costs": [ { "blue": 100, "location":"courts", "sport": "football" } ] }
{ "id": 2, "costs": [ { "blue": 200, "location":"fields", "sport": "cricket" } ] }
So, technically, the input record stream as a whole is not required to be valid JSON; it is a stream of newline-delimited valid JSON objects.
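To produce that newline-delimited form from a file that holds a single JSON array, a small preprocessing step before COPY is enough. Here is a minimal Python sketch, where the input and output file names (input.json, records.jsonl) are hypothetical:

import json

# Read the original file, which contains one JSON array of records.
with open("input.json") as f:
    records = json.load(f)

# Write one compact JSON object per line, the form Redshift's COPY expects.
with open("records.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")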
I retrieve data from an on-premises database using Azure Data Factory, and the output I get is as follows:
{
"value":[
{
"JSON_F52E2B61-18A1-11d1-B105-XXXXXXX":"{\"productUsages\":[{\"customerId\":3552,\"productId\":120,\"productionDate\":\"2015-02-10\",\"quantity\":1,\"userName\":\"XXXXXXXX\",\"productUsageId\":XXXXXX},{\"customerId\":5098,\"productId\":120,\"productionDate\":\"2015-04-07\",\"quantity\":1,\"userName\":\"ZZZZZZZ\",\"productUsageId\":ZZZZZZ}]}"
}
]
}
The entire value array is being serialized into JSON, and I end up with:
[{
"productUsages":
[
{
"customerId": 3552,
"productId": 120,
"productionDate": "2015-02-10",
"quantity": 1,
"userName": "XXXXXXXX",
"productUsageId": XXXXXX
},
{
"customerId": 5098,
"productId": 120,
"productionDate": "2015-04-07",
"quantity": 1,
"userName": "ZZZZZZZ",
"productUsageId": ZZZZZZZ
}
]
}]
I need to have a JSON object at the root level, not a JSON array ([] replaced with {}). What's the easiest way to achieve that in Azure Data Factory?
Thanks
In ADF, when you read any JSON file, it is read as an array of objects by default; this is visible in the source data preview. But when you write the data to a sink in JSON format, there is a file pattern option called Set of objects, and you need to select that. With it, each object is written on its own line instead of the output being wrapped in an array, so a single record ends up as a root-level JSON object.
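If you would rather fix the shape after the copy (for example in an Azure Function activity), the escaped inner payload can also be unwrapped in a few lines. A minimal Python sketch, assuming hypothetical file names; the column name is the auto-generated FOR JSON identifier from the sample above:

import json

# Load the ADF output; "value" holds rows whose single column is an escaped JSON string.
with open("adf_output.json") as f:
    outer = json.load(f)

# The inner payload is itself JSON, stored as a string, so parse it a second time.
inner_text = outer["value"][0]["JSON_F52E2B61-18A1-11d1-B105-XXXXXXX"]
inner = json.loads(inner_text)

# Write the unwrapped object, which now sits at the root level.
with open("result.json", "w") as f:
    json.dump(inner, f, indent=2)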
I have a JSON message like the one below. I am using dbt with the BigQuery plugin. I need to create a table dynamically in BigQuery.
{
"data": {
"schema":"dev",
"payload": {
"lastmodifieddate": "2022-11-122 00:01:28",
"changeeventheader": {
"changetype": "UPDATE",
"changefields": [
"lastmodifieddate",
"product_value"
],
"committimestamp": 18478596845860,
"recordIds":[
"568069"
]
},
"product_value" : 20000
}
}
}
I need to create the table dynamically with recordIds and the changed fields. This field list changes dynamically whenever the source sends an update.
Expected output:
recordIds | product_value | lastmodifieddate     | changetype
568069    | 20000         | 2022-11-122 00:01:28 | UPDATE
Thanks for your suggestions and help!
JSON objects can be stored natively in a BigQuery table; there is no need to use dbt here.
with tbl as (select 5 row, JSON '''{
"data": {
"schema":"dev",
"payload": {
"lastmodifieddate": "2022-11-122 00:01:28",
"changeeventheader": {
"changetype": "UPDATE",
"changefields": [
"lastmodifieddate",
"product_value"
],
"committimestamp": 18478596845860,
"recordIds":[
"568069"
]
},
"product_value" : 20000
}
}
}''' as JS)
select *,
  JSON_EXTRACT_STRING_ARRAY(JS.data.payload.changeeventheader.recordIds) as recordIds,
  JSON_EXTRACT_SCALAR(JS.data.payload.product_value) as product_value,
  JSON_VALUE(JS.data.payload.lastmodifieddate) as lastmodifieddate,
  JSON_VALUE(JS.data.payload.changeeventheader.changetype) as changetype
from tbl
If the JSON is saved as string in a BigQuery table, please use PARSE_JSON(column_name) to convert the string to JSON first.
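As an illustration of that case, here is a minimal Python sketch that applies PARSE_JSON through the google-cloud-bigquery client; the table and column names (my-project.my_dataset.events, raw_payload) are hypothetical:

from google.cloud import bigquery

client = bigquery.Client()

# PARSE_JSON converts the STRING column to a JSON value so the
# extraction functions above can be applied to it.
query = """
SELECT
  JSON_VALUE(PARSE_JSON(raw_payload), '$.data.payload.changeeventheader.changetype') AS changetype,
  JSON_VALUE(PARSE_JSON(raw_payload), '$.data.payload.product_value') AS product_value
FROM `my-project.my_dataset.events`
"""

for row in client.query(query).result():
    print(row.changetype, row.product_value)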
I have a JSON file, and there is a trailing comma at the end of a JSON object. How can I remove the last comma after Item2?
Opening this file in Notepad++ with the JSON viewer plugin and running Format JSON removes the commas after Item1, Item2, and the last JSON object.
Does PowerShell support reading this JSON and formatting it properly the way Notepad++ does?
I found this documentation: https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/convertto-json?view=powershell-7.2
I did not find any option in ConvertTo-Json to format the JSON given below and write the same JSON again in the correct format.
{
"Name": "SampleName",
"Cart": [
{
"Item1": "ItemOne",
},
{
"Item2": "ItemTwo",
},
]
}
Expected json output
{
"Name": "SampleName",
"Cart": [
{
"Item1": "ItemOne"
},
{
"Item2": "ItemTwo"
}
]
}
You can use the third-party module Newtonsoft.Json.
The cmdlet ConvertFrom-JsonNewtonsoft will then accept this malformed JSON file.
Once converted to an object, you can convert it back to a valid JSON string:
$a = #"
{
"Name": "SampleName",
"Cart": [
{
"Item3": "ItemOne",
},
{
"Item2": "ItemTwo",
},
]
}
"#
$a | ConvertFrom-JsonNewtonsoft | ConvertTo-JsonNewtonsoft
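If installing a third-party module is not an option, another route is to strip the trailing commas before handing the text to a strict parser. Here is a minimal sketch in Python rather than PowerShell, with a hypothetical file name; note the naive regex would also rewrite a comma inside a string value that happens to precede a closing brace or bracket, so only use it where that cannot occur:

import json
import re

with open("cart.json") as f:
    text = f.read()

# Drop any comma that directly precedes a closing brace or bracket.
cleaned = re.sub(r",\s*([}\]])", r"\1", text)

# The cleaned text now parses with a strict JSON parser.
data = json.loads(cleaned)
print(json.dumps(data, indent=2))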
For example, API A has this response, and a JSON Extractor is used to get the value of "location":
{
"customer":[
{
"name": "John",
"age": 21,
"location": "USA"
},
{
"name": "Jane",
"age": 32,
"location": "Canada"
}
]
}
The CSV Data Set Config has been set up, and the CSV file has the ${location} variable:
Name | Information
John | "Info": [{"location": "${location}","phone": "99999"}]
Jane | "Info": [{"location": "${location}","phone": "22231"}]
API B needs to receive the following request:
{
"Contact":[
{
${information}
}
]
}
But instead, I receive the following, with no value substituted for ${location}:
{
"Contact":[
{
"Info": [{"location": "${location}","phone": "99999"}]
}
]
}
Expected result for the first row:
{
"Contact":[
{
"Info": [{"location": "USA","phone": "99999"}]
}
]
}
Then the next iteration would do the same thing for Jane, etc.
Is there a way to pass the location value from API A into the CSV data and then pass the information, including the location, to API B?
You need to wrap the variable reference from the CSV Data Set Config in the __eval() function. JMeter does not recursively resolve variables read from a CSV file, so the ${location} inside ${information} stays literal unless it is explicitly evaluated:
{
"Contact":[
{
${__eval(${information})}
}
]
}
More information: Here’s What to Do to Combine Multiple JMeter Variables
I have used REST to get data from an API, and the JSON output contains arrays. When I try to copy the JSON as-is to Blob storage using a Copy activity, I only get the first object's data and the rest is ignored.
The documentation says we can copy JSON as-is by skipping the schema section on both the dataset and the copy activity. I followed that approach and I am getting the output below.
https://learn.microsoft.com/en-us/azure/data-factory/connector-rest#export-json-response-as-is
I tried the Copy activity without a schema, using the header as the first row, and writing the output files to Blob as .json and .txt.
Sample REST output:
{
"totalPages": 500,
"firstPage": true,
"lastPage": false,
"numberOfElements": 50,
"number": 0,
"totalElements": 636,
"columns": {
"dimension": {
"id": "variables/page",
"type": "string"
},
"columnIds": [
"0"
]
},
"rows": [
{
"itemId": "1234",
"value": "home",
"data": [
65
]
},
{
"itemId": "1235",
"value": "category",
"data": [
92
]
}
],
"summaryData": {
"totals": [
157
],
"col-max": [
123
],
"col-min": [
1
]
}
}
The Blob output as text is below; it contains only the first object's data:
totalPages,firstPage,lastPage,numberOfElements,number,totalElements
500,True,False,50,0,636
If you want to write the JSON response as-is, you can use the HTTP connector. However, please note that the HTTP connector doesn't support pagination.
If you want to keep using the REST connector and write a CSV file as output, can you please specify how you want the nested objects and arrays to be written?
CSV files cannot represent arrays directly. You could always use a custom activity or an Azure Function activity to call the REST API, parse the response the way you want, and write the CSV file, as sketched below.
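As a sketch of that custom-activity route, the following Python script calls the API, flattens the rows array, and writes a CSV; the endpoint URL and output path are placeholders, and the field names come from the sample response above:

import csv
import requests

# Call the REST API (placeholder endpoint).
payload = requests.get("https://api.example.com/report").json()

# Flatten each entry of the "rows" array into one CSV line.
with open("report.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["itemId", "value", "data"])
    for row in payload["rows"]:
        # "data" is an array; join its elements so it fits a single CSV cell.
        writer.writerow([row["itemId"], row["value"], ";".join(map(str, row["data"]))])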
Hope this helps.