Flatten nested JSON with jq - json

I'm trying to flatten some nested JSON with jq. My first attempt was to loop over the JSON in bash with base64, as per this article. That turned out to be very slow, so I'm trying to figure out an alternative using just jq.
I have some JSON like this:
[
  {
    "id": 117739,
    "officers": "[{\"name\":\"Alice\"},{\"name\":\"Bob\"}]"
  },
  {
    "id": 117740,
    "officers": "[{\"name\":\"Charlie\"}]"
  }
]
The officers field holds a string which is JSON too. I'd like to reduce this to:
[
{ "id":117739, "name":"Alice" },
{ "id":117739, "name":"Bob" },
{ "id":117740, "name":"Charlie" }
]

Well, the data you're attempting to flatten is itself JSON, so you have to parse it using fromjson. Once parsed, you can then generate the new objects.
map({id} + (.officers | fromjson[]))
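For example, assuming the sample above is saved as input.json, running the filter with jq's -c (compact output) flag yields the desired objects:
jq -c 'map({id} + (.officers | fromjson[]))' input.json
[{"id":117739,"name":"Alice"},{"id":117739,"name":"Bob"},{"id":117740,"name":"Charlie"}]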

Related

Using JSON parsed by JsonSlurper, how do I return a value based on another key-value pair of the same node?

I have the JSON below and want to fetch the values of "person1" and "person2", either into a map as key-value pairs or individually.
Expected Output: [attributes:["person1": "ROBERT", "person2": "STEVEN"]]
I started with JSON parsing and don't really have an idea of what to do next.
def parsedJSON= new groovy.json.JsonSlurper().parseText(body)
JSON:
{
  "permutationsRequest": {
    "attributes": [
      {
        "name": "person1",
        "value": "ROBERT"
      },
      {
        "name": "person2",
        "value": "STEVEN"
      }
    ]
  }
}
def map = parsedJSON.permutationsRequest.attributes.collectEntries{ [it.name,it.value] }
println map.person2

Merge patch json array items?

I have the following json array...
[
{ "class":"Identity", "id":5, "type":6 },
{ "class":"Combat", "damage":10.0 },
.
.
.
]
I want to update the JSON object where 'class'='Identity' and add a few more values to it, like { 'x':5 } or something similar. So, a JSON_MERGE_PATCH.
But as far as I know, merge patch only accepts a JSON document as input... not a path to a JSON object.
So how do we add to / merge-patch a JSON object inside an array?
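If preprocessing the document outside the database is an option, a minimal jq sketch (the file name objects.json is only illustrative) would rewrite just the matching element:
jq 'map(if .class == "Identity" then . + {"x": 5} else . end)' objects.json
This merges {"x":5} into the object whose class is "Identity" and leaves the other array items untouched.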

Map nested JSON in Azure Data Factory to raw object

Since ADF (Azure Data Factory) isn't able to handle complex/nested JSON objects, I'm using OPENJSON in SQL to parse the objects. But I can't get the 'raw' JSON from the following object:
{
  "rows": [
    {
      "name": "Name1",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    },
    {
      "name": "Name2",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    },
    {
      "name": "Name3",
      "attribute1": "attribute1",
      "attribute2": "attribute2"
    }
  ]
}
Config 1
When I use this config:
I get all the names listed
Name1
Name2
Name3
Config 2
When I use this config:
I get the whole JSON in one record:
[ {{full JSON}} ]
Needed config
But what I want is this result:
{ "name":"Name1", "attribute1":"attribute1", "attribute2":"attribute2" }
{ "name":"Name2", "attribute1":"attribute1", "attribute2":"attribute2" }
{ "name":"Name3", "attribute1":"attribute1", "attribute2":"attribute2" }
So I need the iteration of Config 1, but with the raw JSON per row. Every time I use $['rows'] or $['rows'][0], it seems to 'forget' to iterate.
Anyone?
Have you tried Data Flows to handle JSON structures? We have that feature built-in with data flow transformations like derived column, flatten, and sink mapping.
The Copy activity can help us achieve it.
For example, I copy B.json from the container "backup" to another Blob container, "testcontainer".
My B.json source dataset, the Source, Sink, and Mapping settings, the successful pipeline run, and the check of the data in testcontainer are shown in screenshots (not reproduced here).
Hope this helps.
Update:
Copy the nested JSON to SQL. The source is the same B.json in Blob; the sink dataset, Sink, Mapping, pipeline run, and the check of the data in the SQL database follow the same pattern (screenshots not reproduced here).

How can I fix the JSON structure to help Spark read it properly? Different types for the same key

I'm receiving JSON, and I don't know which keys the problem will appear on. When Spark sees different types for the same key, it puts the value into a string, but I need the data as an array type. I'm using Spark 2.4 with the JSON library, so I read the JSON as
spark.read.json("jsonfile")
I'm flattening my JSON schema into a format where the column names look like:
B__C
B__somedifferentColname
A sample JSON looks like this:
{
  "A": [
    {
      "B": {
        "C": "Hello There"
      }
    },
    {
      "B": [
        {
          "C": "Hello"
        },
        {
          "C": "Hi"
        }
      ]
    }
  ]
}
and I would like to have this JSON in a format like this:
{
  "A": [
    {
      "B": [
        {
          "C": "Hello There"
        }
      ]
    },
    {
      "B": [
        {
          "C": "Hello"
        },
        {
          "C": "Hi"
        }
      ]
    }
  ]
}
So, as you can see, what I have changed is adding square brackets around the B value of the first object.
But when one value is a struct type and another is a list, Spark puts it into a string, so the column value looks like:
"[{"C":"Hello"},{"C":"Hi"}]"
whereas it should look like this:
B__C
Hello
Hi
Hello There
Can anyone help me with a trick I can use to resolve this issue?
The team that delivers the JSON to us said it isn't possible to change this on their side, so we have to resolve it on ours.
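One way to work around this, if the files can be rewritten before Spark reads them, is a small jq preprocessing sketch (the file name input.json is only illustrative) that wraps any non-array B value in an array:
jq '.A |= map(.B |= if type == "array" then . else [.] end)' input.json
With every B coerced to an array of objects, spark.read.json can infer one consistent array type for the column instead of falling back to string.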

How do I simplify a JSON object using JQ?

I've got a huge JSON object and I want to filter it down to a small percentage of the available fields. I've searched some similar questions, such as this one, but that is for an array of objects. I have a JSON object that looks something like:
{
  "timestamp": 1455408955250999808,
  "client": {
    "ip": "76.72.172.208",
    "srcPort": 0,
    "country": "us",
    "deviceType": "desktop"
  },
  "clientRequest": {
    "bytes": 410,
    "bodyBytes": 0
  }
}
What I'm trying to do is create a new JSON object that looks like:
{
  "timestamp": 1455408955250999808,
  "client": {
    "ip": "76.72.172.208"
  },
  "clientRequest": {
    "bytes": 410
  }
}
So, effectively, I want to filter down the data. I've tried:
| jq 'map({client.ip: .client.ip, timestamp: .timestamp})' and I continue to get:
jq: error (at <stdin>:0): Cannot index number with string "client"
Even the simplest filter, | jq 'map({timestamp: .timestamp})', shows the same error.
I thought I could access the key/value pairs and use the map function as the person did for their array in the question linked above. Any help much appreciated.
Huzzah. Simple enough really :)
cat LogSample.txt | jq '. | {Id: .Id, client: {ip: .client.ip}}'
Basically define the object yourself :)
It looks like it will be simplest if you construct the object you want. Based on your example, you could do so using the following filter:
{ timestamp,
client: { ip: .client.ip },
clientRequest: {bytes: .clientRequest.bytes }
}
By contrast, map expects its input to be an array, whereas your input is a JSON object.
Please also note that jq provides direct ways to remove keys as well, e.g. using del/1.
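For instance, a minimal sketch of the del approach, assuming the object shown above is the content of LogSample.txt:
jq 'del(.client.srcPort, .client.country, .client.deviceType, .clientRequest.bodyBytes)' LogSample.txt
Constructing the object tends to be simpler when you keep only a few fields, while del is handier when you only drop a few.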