parsing jq returns null - json

I have a json output
{
"7": [
{
"devices": [
"/dev/sde"
],
"name": "osd-block-dcc9b386-529c-451e-9d84-8ccc4091102b",
"tags": {
"ceph.crush_device_class": "None",
"ceph.db_device": "/dev/nvme0n1p5",
"ceph.wal_device": "/dev/nvme0n1p6",
},
"type": "block",
"vg_name": "ceph-c4de9e90-853e-4569-b04f-8677ef9a8c7a"
},
{
"path": "/dev/nvme0n1p5",
"tags": {
"PARTUUID": "69712eb4-be52-4618-ba46-e317d6d3d76e"
},
"type": "db"
}
],
"41": [
{
"devices": [
"/dev/nvme1n1p13"
],
"name": "osd-block-97bce07f-ae98-4fdb-83a9-9fa2f35cee60",
"tags": {
"ceph.crush_device_class": "None",
},
"type": "block",
"vg_name": "ceph-c1d48671-2a33-4615-95e3-cc1b18783f0c"
}
],
"9": [
{
"devices": [
"/dev/sdf"
],
"name": "osd-block-35323eb8-17c1-460d-8cc5-565f549e6991",
"tags": {
"ceph.crush_device_class": "None",
"ceph.db_device": "/dev/nvme0n1p7",
"ceph.wal_device": "/dev/nvme0n1p8",
},
"type": "block",
"vg_name": "ceph-9488e8b8-ec18-4860-93d3-6a1ad91c698c"
},
{
"path": "/dev/nvme0n1p7",
"tags": {
"PARTUUID": "ef0e9588-2a20-4c2c-8b62-d73945e01322"
},
"type": "db"
}
]
}
Required output:
osd.7 /dev/sde /dev/nvme0n1p5 /dev/nvme0n1p6
osd.41 /dev/nvme1n1p13 n/a n/a
osd.9 /dev/sdf /dev/nvme0n1p7 /dev/nvme0n1p7
Problems:
When I try parsing using jq .[][].devices, I get null values:
$ cat json | jq .[][].devices
[
"/dev/sde"
]
null
[
"/dev/nvme1n1p13"
]
null
[
"/dev/sdf"
]
null
I can solve it via jq .[][].devices[]?.
However, this trick doesn't help me when I do want to see where there's no value (to print n/a instead):
$ cat json | jq '.[][].tags | ."ceph.db_device"'
"/dev/nvme0n1p5"
null
"/dev/nvme0n1p3"
null
null
"/dev/nvme0n1p7"
null
And finally, I try to create a table:
$ cat json | jq -r '["osd."+keys[]], [.[][].devices[]?], [.[][].tags."ceph.db_device" // ""] | #csv' | column -t -s,
"osd.7" "osd.41" "osd.9"
"/dev/sde" "/dev/nvme0n1p13" "/dev/sdf"
"/dev/nvme0n1p5" "/dev/nvme0n1p7"
So the obvious problem is that the 3rd row doesn't match the correct values.
And the final problem is how do I transpose it from columns to rows, as detailed in the required output?

Would this do what you want?
jq --raw-output '
to_entries[] | [
"osd." + .key,
( .value[0]
| .devices[],
( .tags
| ."ceph.db_device" // "n/a",
."ceph.wal_device" // "n/a"
)
)
]
| #tsv
'
osd.7 /dev/sde /dev/nvme0n1p5 /dev/nvme0n1p6
osd.41 /dev/nvme1n1p13 n/a n/a
osd.9 /dev/sdf /dev/nvme0n1p7 /dev/nvme0n1p8
Demo

Related

jq - how to filter values in an inner array without knowing the key

I have the following JSON:
{
"ids": {
"sda": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
],
"sdb": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
],
"sdb1": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
],
"sdc": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
}
What I want to achieve is to search for .*scsi0$ in the values of the inner array and get sdb as the result.
Using JSON jq endswith to filter results:
.ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key
Results in:
"sdb"
Try it here: https://jqplay.org/s/DAhKosXXgiA
First get .ids:
{
"sda": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
],
"sdb": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
],
"sdb1": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
],
"sdc": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
...then pipe the results to the to_entries function to convert that to an array of {key, value} objects, .ids | to_entries:
[
{
"key": "sda",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
]
},
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
},
{
"key": "sdb1",
"value": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
]
},
{
"key": "sdc",
"value": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
]
...next stream the list of objects, .ids | to_entries[]:
{
"key": "sda",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi2"
]
}
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
}
{
"key": "sdb1",
"value": [
"lvm-pv-uuid-lvld3A-oA4k-hC19-DXzv-D0Fq-xyME-BwgJid",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part1"
]
}
{
"key": "sdc",
"value": [
"lvm-pv-uuid-pWes2W-dgYF-l8hG-La48-9ozH-hPdU-MOkOtf",
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi1"
]
}
...and select from a stream of values, .ids | to_entries[] | select(.value[]) where value endswith "scsi0", select(.value[] | endswith("scsi0")) :
{
"key": "sdb",
"value": [
"scsi-0QEMU_QEMU_HARDDISK_drive-scsi0"
]
}
...finally, get the key value, .ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key:
"sdb"
Command line:
jq '.ids | to_entries[] | select(.value[] | endswith("scsi0")) | .key'
Try it here: https://jqplay.org/s/DAhKosXXgiA
.ids | to_entries[] | select(.value[] | test(".*scsi0$")) | .key
Will print the key if any value in the array matches your regex.
If none match, there will be no output.
to_entries is uses to easy capture the key of the objects.
select to filter the values
.value[] | test(".*scsi0$") checks each value to the regex
.key and we show the key of the result
Try it online

JQ: Merging array entities with same attribute name

I have a data structure like this:
[
{
"some_id": "123",
"items_1": [
{
"label": "my_name"
}
],
"items_2": []
},
{
"some_id": "123",
"items_1": [],
"items_2": [
"value_1",
"value_3"
]
},
{
"some_id": "123",
"items_1": [],
"items_2": [
"value_1",
"value_2"
]
}
]
And I want to modify the data into something like
[
{
"some_id": "123",
"items_1": [
{
"label": "my_name"
}
],
"items_2": [
"value_1",
"value_2",
"value_3"
]
}
]
Basically taking any fields that are the same and concatenating the arrays together. Similarly, items_1 can have some value for the same id down the line and I want to concatenate that array with another if needed.
I have tried using JQ with something like
jq -Mr '[ group_by(.media_url)[] | add | tojson ] | join(",\n")' test.json
However this doesnt seem to be working.
Would the following work for you?
group_by(.some_id) | map({
some_id: map(.some_id) | first,
items_1: map(.items_1) | add | unique,
items_2: map(.items_2) | add | unique })
demo

Convert/Export JSON to CSV

JSON file:
"UserDetailList": [
{
"UserName": "citrix-xendesktop-ec2-provisioning",
"GroupList": [],
"CreateDate": "2017-11-07T14:20:14Z",
"UserId": "1234556",
"Path": "/",
"AttachedManagedPolicies": [
{
"PolicyName": "AmazonEC2FullAccess",
"PolicyArn": "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
},
{
"PolicyName": "AmazonS3FullAccess",
"PolicyArn": "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}
],
"Arn": "arn:aws:iam::1234567890:user/citrix-xendesktop-ec2-provisioning"
},
{
"UserName": "rundeck-read-only-iam-permissions",
"GroupList": [],
"CreateDate": "2018-03-09T11:13:38Z",
"UserId": "AIDAJQOQGKISLCWDXG6EQ",
"Path": "/",
"AttachedManagedPolicies": [
{
"PolicyName": "IAMReadOnlyAccess",
"PolicyArn": "arn:aws:iam::aws:policy/IAMReadOnlyAccess"
}
],
"Arn": "arn:aws:iam::279052847476:user/rundeck-read-only-iam-permissions"
}
]
with
jq -r '.UserDetailList[] | [.UserName] | #csv' output.json > fileout2.csv
I can get
citrix-xendesktop-ec2-provisioning"
"rundeck-read-only-iam-permissions"
How to get IAM policies for these 2 users, i need to extract AmazonEC2FullAccess and AmazonS3FullAccess under AttachedManagedPolicies ?
so output can be
citrix-xendesktop-ec2-provisioning",AmazonEC2FullAccess
citrix-xendesktop-ec2-provisioning",AmazonS3FullAccess
rundeck-read-only-iam-permissions,IAMReadOnlyAccess
The trick is to extract .UserName as a variable before iterating over the inner array:
.UserDetailList[]
| .UserName as $u
| .AttachedManagedPolicies[]
| [$u, .PolicyName]
| #csv
Of course this assumes valid JSON input.

jq get the value of x based on y in a complex json file

jq strikes again. Trying to get the value of DATABASES_DEFAULT based on the name in a json file that has a whole lot of names and I'm completely lost.
My file looks like the following (output of an aws ecs describe-task-definition) only much more complex; I've stripped this to the most basic example I can where the structure is still intact.
{
"taskDefinition": {
"status": "bar",
"family": "bar2",
"volumes": [],
"taskDefinitionArn": "bar3",
"containerDefinitions": [
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
},
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo2"
}
],
"name": "boo",
"links": []
}
],
"revision": 1
}
}
I need the value of DATABASES_DEFAULT where the name is baz. Note that there are a lot of keypairs with name, I'm specifically talking about the one outside of environment.
I've been tinkering with this but only got this far before realizing that I don't understand how to access nested values.
jq '.[] | select(.name==DATABASES_DEFAULT) | .value'
which is returning
jq: error: DATABASES_DEFAULT/0 is not defined at <top-level>, line 1:
.[] | select(.name==DATABASES_DEFAULT) | .value
jq: 1 compile error
Obviously this a) doesn't work, and b) even if it did, it's independant of the name value. My thought was to return all the db defaults and then identify the one with baz, but I don't know if that's the right approach.
I like to think of it as digging down into the structure, so first you open the outer layers:
.taskDefinition.containerDefinitions[]
Now select the one you want:
select(.name =="baz")
Open the inner structure:
.environment[]
Select the desired object:
select(.name == "DATABASES_DEFAULT")
Choose the key you want:
.value
Taken together:
parse.jq
.taskDefinition.containerDefinitions[] |
select(.name =="baz") |
.environment[] |
select(.name == "DATABASES_DEFAULT") |
.value
Run it like this:
<infile jq -f parse.jq
Output:
"foo"
The following seems to work:
.taskDefinition.containerDefinitions[] |
select(
select(
.environment[] | .name == "DATABASES_DEFAULT"
).name == "baz"
)
The output is the object with the name key mapped to "baz".
$ jq '.taskDefinition.containerDefinitions[] | select(select(.environment[]|.name == "DATABASES_DEFAULT").name=="baz")' tmp.json
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
}

How to get a flat output based on conditional jq query?

I have the following JSON:
[
{
"name": "InstanceA",
"tags": [
{
"key": "environment",
"value": "production"
},
{
"key": "group",
"value": "group1"
}
]
},
{
"name": "InstanceB",
"tags": [
{
"key": "group",
"value": "group2"
},
{
"key": "environment",
"value": "staging"
}
]
}
]
I'm trying to get a flat output of value based on the condition key == 'environment'. I already tried select(boolean_expression), but I cannot get the desired output, like:
"InstanceA, production"
"InstanceB, staging"
Does jq support this kind of output? If so, how to do it?
Yes.
For example:
$ jq '.[] | "\(.name), \(.tags | from_entries | .environment)"' input.json
Output:
"InstanceA, production"
"InstanceB, staging"
jq '.[] | .name + ", " + (.tags[] | select(.key == "environment").value)' f.json
Here is a solution using join
.[]
| [.name, (.tags[] | if .key == "environment" then .value else empty end)]
| join(", ")