jq transform JSON child array objects to delimited string

I want to transform the following input with jq:
{
"root":[
{
"field1":"field1value1",
"field2":"field2value2",
"field3Array":[
{
"prop1":"prop1_value1"
}
]
},
{
"field1":"field1value3",
"field2":"field2value4",
"field3Array":[
{
"prop1":"prop1_value3"
},
{
"prop1":"prop1_value4"
}
]
}
]
}
Output should be:
[
{
"field1": "field1value1",
"field2": "field2value2",
"field3Array": "prop1_value1"
},
{
"field1": "field1value3",
"field2": "field2value4",
"field3Array": "prop1_value3,prop1_value4"
}
]
I use this jq filter so far:
[.root[] | {field1, field2, field3Array: .field3Array[].prop1}]
but I don't know how to join the array property "prop1" to a comma-delimited string "prop1_value3,prop1_value4".
https://jqplay.org/s/CR8mGBX8Dz

You need to map the objects contained in field3Array to their string values, then join the resulting array:
.root | map({field1, field2, field3Array: .field3Array | map(.prop1) | join(",")})
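To see the two steps separately, here is a minimal shell sketch run against a condensed copy of the question's input. The variable names `input` and `result` are only for illustration, and it assumes `jq` is available on the PATH.

```shell
# Condensed copy of the input from the question.
input='{"root":[
  {"field1":"field1value1","field2":"field2value2","field3Array":[{"prop1":"prop1_value1"}]},
  {"field1":"field1value3","field2":"field2value4","field3Array":[{"prop1":"prop1_value3"},{"prop1":"prop1_value4"}]}
]}'

# Step 1: map(.prop1) turns each array of objects into an array of strings.
printf '%s' "$input" | jq -c '.root[].field3Array | map(.prop1)'

# Step 2: join(",") collapses that array into one comma-delimited string.
result=$(printf '%s' "$input" |
  jq -c '.root | map({field1, field2, field3Array: (.field3Array | map(.prop1) | join(","))})')
echo "$result"
```

The parentheses around the field3Array value are optional here; they only make the extent of the pipeline easier to read.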
This can be simplified somewhat by updating .field3Array in place with |= instead of rebuilding the whole object:
.root | map(.field3Array |= (map(.prop1) | join(",")))
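One difference worth knowing: the update form keeps every other key of the object, whereas object construction keeps only the keys you list. A small sketch of that, using a made-up `extra` field that is not part of the question's data:

```shell
# "extra" is a hypothetical field added only to show the difference.
input='{"root":[{"field1":"a","extra":true,"field3Array":[{"prop1":"x"},{"prop1":"y"}]}]}'

# |= rewrites .field3Array and leaves field1 and extra untouched.
updated=$(printf '%s' "$input" |
  jq -c '.root | map(.field3Array |= (map(.prop1) | join(",")))')

# Object construction keeps only the keys that are listed.
rebuilt=$(printf '%s' "$input" |
  jq -c '.root | map({field1, field3Array: (.field3Array | map(.prop1) | join(","))})')

echo "$updated"
echo "$rebuilt"
```

So the update form is the better fit when the objects should pass through otherwise unchanged, and the construction form when you want to pick a fixed set of fields.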
If you're unfamiliar with the map function, the following works as well:
[.root[] | {field1, field2, field3Array: [ .field3Array[] | .prop1 ] | join(",")}]

Related

Transforming json file using jq so data ends up under a common object

I have some JSON data that I get from an API and need to transform with jq into another JSON format, for later use with Ansible as part of an inventory script.
What I have is something like:
{
"results": [
{
"name": "hostname1",
"key1": "somevalue1",
"key2": "somevalue2",
"key3": "somevalue3"
},
{
"name": "hostname2",
"key1": "somevalue12",
"key2": "somevalue22",
"key3": "somevalue32"
},
{
"name": "hostname3",
"key1": "somevalue13",
"key2": "somevalue23",
"key3": "somevalue33"
}
]
}
and I need to transform it to look like this:
{
"_meta": {
"hostvars": {
"hostname1": {
"name": "hostname1",
"key1": "somevalue1",
"key2": "somevalue2",
"key3": "somevalue3"
},
"hostname2": {
"name": "hostname2",
"key1": "somevalue12",
"key2": "somevalue22",
"key3": "somevalue32"
},
"hostname3": {
"name": "hostname3",
"key1": "somevalue13",
"key2": "somevalue23",
"key3": "somevalue33"
}
}
}
}
Here is what I have not been able to figure out.
If I use something like:
cat input.json | jq '{_meta: { hostvars: { (.results[].name): (.results[]) } }}'
then _meta and hostvars are repeated for each object in the input, which is not at all what I want. I need a common "header" with the data underneath it.
Ideally I would also like to exclude "name" from the output, since it is already used as the key and would be duplicated, but that is just a bonus.
Any advice on how to do this? Or is a jq filter always run against one object at a time?
I experimented a bit with --slurp but didn't get anywhere.
The crucial part is: Take an array of objects and transform it into one (outer) object where the attribute names of that outer object are given by some attribute value of the inner objects. That's the job description for INDEX(filter).
So:
{ _meta : { hostvars: ( .results | INDEX(.name) ) } }
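As a minimal sketch of that filter on a shortened copy of the input, here it is with the bonus request (dropping the duplicated name) added via map_values(del(.name)). The variable names are illustrative, and as far as I know INDEX requires jq 1.6 or later.

```shell
# Shortened copy of the question's input (two hosts, one key each).
input='{"results":[
  {"name":"hostname1","key1":"somevalue1"},
  {"name":"hostname2","key1":"somevalue12"}
]}'

# INDEX(.name) builds one object keyed by each element's .name.
# map_values(del(.name)) then drops the now-redundant name from each value.
inventory=$(printf '%s' "$input" |
  jq -c '{_meta: {hostvars: (.results | INDEX(.name) | map_values(del(.name)))}}')
echo "$inventory"
```

Leaving out the map_values step gives exactly the structure asked for in the question, with name still present inside each host object.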
Using .results[] twice iterates over the same list twice, giving you the Cartesian product of every host name with every object (3 x 3 = 9). You need to reference it only once!
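The effect is easy to check by counting outputs. A small sketch, with the three hosts reduced to their names only:

```shell
input='{"results":[{"name":"hostname1"},{"name":"hostname2"},{"name":"hostname3"}]}'

# Two independent .results[] generators: every name is paired with every object.
twice=$(printf '%s' "$input" | jq '[{(.results[].name): (.results[])}] | length')

# One traversal: each object is paired only with its own name.
once=$(printf '%s' "$input" | jq '[.results[] | {(.name): .}] | length')

echo "twice=$twice once=$once"   # prints: twice=9 once=3
```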
You can do something like the following. The key is to first build an array of single-key objects, each using .name as the key and the whole sub-object as the value.
Once you have that array, add merges its objects into a single object.
{ _meta : { hostvars: ( ( .results | map( { ( .name ) : . } ) | add ) ) } }
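It can help to look at the intermediate array before add is applied. A minimal sketch on a shortened input; the variable names `pairs` and `merged` are just for illustration:

```shell
input='{"results":[
  {"name":"hostname1","key1":"somevalue1"},
  {"name":"hostname2","key1":"somevalue12"}
]}'

# Intermediate step: an array of single-key objects.
pairs=$(printf '%s' "$input" | jq -c '.results | map({(.name): .})')
echo "$pairs"

# add folds the array with +, and + on objects merges their keys.
merged=$(printf '%s' "$pairs" | jq -c 'add')
echo "$merged"
```

Because the merge uses +, a later object wins if two hosts happen to share the same name.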
Yet another approach using from_entries could be:
jq '.results | map({key: .name, value: .}) | {_meta: {hostvars: from_entries}}'
{
"_meta": {
"hostvars": {
"hostname1": {
"name": "hostname1",
"key1": "somevalue1",
"key2": "somevalue2",
"key3": "somevalue3"
},
"hostname2": {
"name": "hostname2",
"key1": "somevalue12",
"key2": "somevalue22",
"key3": "somevalue32"
},
"hostname3": {
"name": "hostname3",
"key1": "somevalue13",
"key2": "somevalue23",
"key3": "somevalue33"
}
}
}
}
You'll probably want to take advantage of jq pipes.
cat input.json | jq '{ _meta : { hostvars : (.results | map({key : .name, value : del(.name) }) | from_entries) }}'
Map the results to an array of entries where the key is .name and the value is the object with .name removed:
.results | map({key : .name, value : del(.name) })
Then, form an object from the entries:
... | from_entries
NOTE: See Inian's answer for why the array syntax used by the OP does not work.

jq select items where a property contains the value of another property

I'm trying to filter for items from a list that contain the values of other properties in the same object.
Example of the data.json:
{ result:
[
{ name: 'foo', text: 'my name is foo' },
{ name: 'bar', text: 'my name is baz' },
]
}
What I've tried:
cat data.json | jq '.result[] | select(.text | ascii_downcase | contains(.name))'
However it throws the following error:
Cannot index string with string
Is there a way in jq to select based on a dynamic property rather than a string literal?
Assuming your JSON looks more like this (strings in double quotes, no comma after the last array item):
{ "result":
[
{ "name": "foo", "text": "my name is foo" },
{ "name": "bar", "text": "my name is baz" }
]
}
When going into .text, you have lost the context from which you can access .name. You could save it (or directly the desired value) in a variable and reference it when needed:
jq '.result[] | select(.name as $name | .text | ascii_downcase | contains($name))'
{
"name": "foo",
"text": "my name is foo"
}
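A short sketch of the variable-binding approach, run from the shell. The third entry ("Qux") is a made-up addition to show one caveat: only .text is lower-cased above, so a name containing capitals never matches. The second filter lower-cases the name as well, which is an assumption about what the asker intended.

```shell
# The "Qux" entry is hypothetical; the first two come from the question.
input='{"result":[
  {"name":"foo","text":"my name is foo"},
  {"name":"bar","text":"my name is baz"},
  {"name":"Qux","text":"My name is QUX"}
]}'

# Bind the name first, then descend into .text.
strict=$(printf '%s' "$input" |
  jq -c '[.result[] | select(.name as $name | .text | ascii_downcase | contains($name)) | .name]')

# Variant: lower-case the name too, so both sides are compared case-insensitively.
relaxed=$(printf '%s' "$input" |
  jq -c '[.result[] | select((.name | ascii_downcase) as $name | .text | ascii_downcase | contains($name)) | .name]')

echo "$strict"    # prints: ["foo"]
echo "$relaxed"   # prints: ["foo","Qux"]
```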

jq: map arrays to csv field headers

Is there a way to export JSON like this to CSV:
{
"id":"2261026",
"meta":{
"versionId":"1",
"lastUpdated":"2021-11-08T15:13:39.318+01:00"
},
"address": [
"string-value1",
"string-value2"
],
"identifier":[
{
"system":"urn:oid:2.16.724.4.9.20.93",
"value":"6209"
},
{
"system":"urn:oid:2.16.724.4.9.20.2",
"value":"00042"
},
{
"system":"urn:oid:2.16.724.4.9.20.90",
"value":"UAB2"
}
]
}
{
"id":"2261027",
"meta":{
"versionId":"1",
"lastUpdated":"2021-11-08T15:13:39.318+01:00"
},
"address": [
"string-value1",
"string-value2",
"string-value3",
"string-value4"
],
"identifier":[
{
"system":"urn:oid:2.16.724.4.9.20.93",
"value":"6205"
},
{
"system":"urn:oid:2.16.724.4.9.20.2",
"value":"05041"
}
]
}
I'd like to get something like this:
"id","meta_versionId","meta_lastUpdated","address","identifier0_system","identifier0_value","identifier1_system","identifier1_value","identifier2_system","identifier2_value"
"2261026","1","2021-11-08T15:13:39.318+01:00","string-value1|string-value2","urn:oid:2.16.724.4.9.20.93","6209","urn:oid:2.16.724.4.9.20.2","00042","urn:oid:2.16.724.4.9.20.90","UAB2"
"2261027","1","2021-11-08T15:13:39.318+01:00","string-value1|string-value2|string-value3|string-value4","urn:oid:2.16.724.4.9.20.93","6205","urn:oid:2.16.724.4.9.20.2","05041",,
In short:
The string values of the address array have to be joined into one field using the "|" character. Example: "string-value1|string-value2"
The objects of the identifier array have to be mapped to numbered field headers. Example: "identifier0_system","identifier0_value","identifier1_system","identifier1_value","identifier2_system","identifier2_value",...
Any ideas?
Try this
jq -r '[
.id,
(.meta | .versionId, .lastUpdated),
(.address | join("|")),
(.identifier[] | .system, .value)
] | @csv'
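Each element inside the array construction can emit more than one value (the comma operator produces several outputs), and every value becomes one CSV cell. A minimal sketch on a single record, shortened to one identifier; the variable names are illustrative:

```shell
input='{"id":"2261026","meta":{"versionId":"1","lastUpdated":"2021-11-08T15:13:39.318+01:00"},"address":["string-value1","string-value2"],"identifier":[{"system":"urn:oid:2.16.724.4.9.20.93","value":"6209"}]}'

# -r prints the CSV line raw instead of as a JSON-encoded string.
row=$(printf '%s' "$input" | jq -r '[
  .id,
  (.meta | .versionId, .lastUpdated),
  (.address | join("|")),
  (.identifier[] | .system, .value)
] | @csv')
echo "$row"
```

Without -r, jq would print the same line wrapped in quotes with every inner double quote escaped.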
To prepend a header row whose number of identifierX_system and identifierX_value pairs matches the longest identifier array in the input, try this:
jq -rs '[
"id",
"meta_versionId", "meta_lastUpdated",
"address",
(
range([.[].identifier | length] | max)
| "identifier\(.)_system", "identifier\(.)_value"
)
], (.[] | [
.id,
(.meta | .versionId, .lastUpdated),
(.address | join("|")),
(.identifier[] | .system, .value)
]) | @csv'
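Note that the data rows in this filter are as long as each record's own identifier array, so shorter records do not get the trailing empty cells shown in the desired output. One possible way to pad them, sketched here on a reduced input (only id and identifier, with made-up short values), is to bind the maximum length to a variable and index into the array by position, since an out-of-range index yields null and @csv renders null as an empty cell:

```shell
# Reduced, hypothetical records: two identifiers in the first, one in the second.
input='{"id":"1","identifier":[{"system":"s1","value":"v1"},{"system":"s2","value":"v2"}]}
{"id":"2","identifier":[{"system":"s3","value":"v3"}]}'

csv=$(printf '%s\n' "$input" | jq -rs '
  ([.[].identifier | length] | max) as $n
  | ["id", (range($n) | "identifier\(.)_system", "identifier\(.)_value")],
    (.[] | [.id, (.identifier as $ids | range($n) | $ids[.] | .system, .value)])
  | @csv')
echo "$csv"
```

The same range($n) trick can be dropped into the full filter in place of the (.identifier[] | .system, .value) element.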

jq: Lifting fields from array of objects into the parent object

I have a JSON object with an array at the top level, and each array entry is an object with a nested array of objects as one of its fields.
I'd like to "lift" some fields from the sub-arrays into the objects within the first array. It's confusing to write, so, here's my input:
{ "emails": [
{ "email":"foo1@bar.com",
"events":[
{ "type":"open", "time":"t1", "ignore":"this" },
{ "type":"click", "time":"t2", "ignore":"this" } ] },
{ "email":"foo2@bar.com",
"events":[
{ "type":"open", "time":"t3", "ignore":"this" },
{ "type":"click", "time":"t4", "ignore":"this" },
{ "type":"open", "time":"t5", "ignore":"this" } ] }
] }
What I'd like to receive as output is:
[
{ "email":"foo1@bar.com", "type":"open", "time":"t1" },
{ "email":"foo1@bar.com", "type":"click", "time":"t2" },
{ "email":"foo2@bar.com", "type":"open", "time":"t3" },
{ "email":"foo2@bar.com", "type":"click", "time":"t4" },
{ "email":"foo2@bar.com", "type":"open", "time":"t5" }
]
I can jq this with multiple pipes already, like so:
[ .emails[] | { email:.email, event:(.events[] | { type:.type, time:.time }) } | { email:.email, type:.event.type, time:.event.time } ]
... but this seems way too verbose.
Is there an easier way to 'lift' those type and time fields from the deepest objects to the object one-level up?
When I try to use the .events[] iterator twice, I wind up with the Cartesian product of events, which is wrong :-/
I must be missing some simpler way (than my 'intermediate-object' approach above) to achieve this ... anyone know of a better way?
Let's proceed in two steps: first, the (heavy) lifting, and secondly, the (light) trimming.
Lifting
.emails | map( {email} + .events[])
or equivalently:
[.emails[] | {email} + .events[]]
Notice that {"email": .email} has been abbreviated to {email}.
Trimming
We can delete the "ignore" key using del(.ignore). With an eye to efficiency, we arrive at the following solution:
.emails | map( {email} + (.events[] | del(.ignore) ) )
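For comparison, here is a minimal sketch that runs the del(.ignore) form next to a whitelist form, which names the keys to keep instead of the key to drop. The input is a shortened copy of the question's data and the variable names are illustrative.

```shell
input='{"emails":[
  {"email":"foo1@bar.com","events":[
    {"type":"open","time":"t1","ignore":"this"},
    {"type":"click","time":"t2","ignore":"this"}]}
]}'

# Drop the unwanted key from each event before merging it with {email}.
dropped=$(printf '%s' "$input" |
  jq -c '.emails | map({email} + (.events[] | del(.ignore)))')

# Keep only the wanted keys of each event.
kept=$(printf '%s' "$input" |
  jq -c '.emails | map({email} + (.events[] | {type, time}))')

echo "$dropped"
echo "$kept"
```

Both produce the same objects for this input. The whitelist form is probably the safer choice if the events may gain further fields later.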

Get parent element id while parsing json data with jq

I want to print the id of each parent element whose child element has the value client_release.
If
data.properties.value == "client_release"
then the output should be
abcd1g2f,hirk5d7b3l
I tried the following, but no luck:
jq '.data[].properties[]|select(.value=="client_release")|.id'
JSON data is below:
{
"data":[
{
"id":"abcd1g2f",
"resourceURI":"https://somerepo.com/service/local/privileges/abcd1g2f",
"name":"release1",
"description":"release1",
"type":"target",
"userManaged":true,
"properties":[
{
"key":"repositoryGroupId",
"value":""
},
{
"key":"method",
"value":"create,read"
},
{
"key":"repositoryId",
"value":"client_release"
},
{
"key":"repositoryTargetId",
"value":"1"
}
]
},
{
"id":"asdf1k4g",
"resourceURI":"https://somerepo.com/service/local/privileges/asdf1k4g",
"name":"release2",
"description":"release2",
"type":"target",
"userManaged":true,
"properties":[
{
"key":"repositoryGroupId",
"value":""
},
{
"key":"method",
"value":"read"
},
{
"key":"repositoryId",
"value":"formal_release"
},
{
"key":"repositoryTargetId",
"value":"1"
}
]
},
{
"id":"hirk5d7b3l",
"resourceURI":"https://somerepo.com/service/local/privileges/hirk5d7b3l",
"name":"release3",
"description":"release3",
"type":"target",
"userManaged":true,
"properties":[
{
"key":"repositoryGroupId",
"value":""
},
{
"key":"method",
"value":"create,read"
},
{
"key":"repositoryId",
"value":"client_release"
},
{
"key":"repositoryTargetId",
"value":"1"
}
]
}
]
}
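The idea is right, but the select has to be applied to the elements of .data rather than to the individual properties; otherwise .id is looked up on a property object, which has no such field: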
jq '.data[] | select(.properties[].value == "client_release") | .id'
To put it in the CSV format indicated in the question, collect the results into an array and use @csv:
jq --raw-output '[.data[] | select(.properties[].value == "client_release") | .id] | @csv'
The following filter avoids duplications and might be more efficient than using select(.properties[].value ...):
.data
| map(select(.properties | any(.[]; .value == "client_release")) | .id)
| join(",")
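The duplication only shows up when more than one property of the same object matches. A small sketch with a made-up `alias` property that repeats the value, so the difference becomes visible:

```shell
# "alias" is a hypothetical second property that carries the same value.
input='{"data":[
  {"id":"a1","properties":[{"key":"repositoryId","value":"client_release"},{"key":"alias","value":"client_release"}]},
  {"id":"b2","properties":[{"key":"repositoryId","value":"formal_release"}]}
]}'

# select() emits its input once per matching property, so a1 appears twice.
dup=$(printf '%s' "$input" |
  jq -r '[.data[] | select(.properties[].value == "client_release") | .id] | join(",")')

# any() yields a single true/false per object, so a1 appears once.
once=$(printf '%s' "$input" |
  jq -r '.data | map(select(.properties | any(.[]; .value == "client_release")) | .id) | join(",")')

echo "dup=$dup once=$once"   # prints: dup=a1,a1 once=a1
```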
(Alternatively, use @csv in place of join(",") if you want each id wrapped in double quotes.)
"repositoryId"
If attention should only be paid to the value corresponding to "repositoryId", then you could use from_entries, e.g.:
.data
| map(select(.properties | from_entries.repositoryId == "client_release") | .id)
| join(",")
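To illustrate what from_entries does here and how it differs from matching on any value, the following sketch uses a made-up `alias` property so that one object matches by value but not by key:

```shell
# "alias" is a hypothetical property used to show a value match on the wrong key.
input='{"data":[
  {"id":"a1","properties":[{"key":"repositoryId","value":"client_release"}]},
  {"id":"c3","properties":[{"key":"repositoryId","value":"formal_release"},{"key":"alias","value":"client_release"}]}
]}'

# from_entries turns [{key, value}, ...] into {key: value, ...}.
printf '%s' "$input" | jq -c '.data[1].properties | from_entries'

# Match only on the repositoryId key.
by_key=$(printf '%s' "$input" |
  jq -r '.data | map(select(.properties | from_entries.repositoryId == "client_release") | .id) | join(",")')

# Match on any property value, regardless of key.
any_key=$(printf '%s' "$input" |
  jq -r '.data | map(select(.properties | any(.[]; .value == "client_release")) | .id) | join(",")')

echo "by_key=$by_key any_key=$any_key"   # prints: by_key=a1 any_key=a1,c3
```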