Convert/Export JSON to CSV - json

JSON file:
"UserDetailList": [
{
"UserName": "citrix-xendesktop-ec2-provisioning",
"GroupList": [],
"CreateDate": "2017-11-07T14:20:14Z",
"UserId": "1234556",
"Path": "/",
"AttachedManagedPolicies": [
{
"PolicyName": "AmazonEC2FullAccess",
"PolicyArn": "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
},
{
"PolicyName": "AmazonS3FullAccess",
"PolicyArn": "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}
],
"Arn": "arn:aws:iam::1234567890:user/citrix-xendesktop-ec2-provisioning"
},
{
"UserName": "rundeck-read-only-iam-permissions",
"GroupList": [],
"CreateDate": "2018-03-09T11:13:38Z",
"UserId": "AIDAJQOQGKISLCWDXG6EQ",
"Path": "/",
"AttachedManagedPolicies": [
{
"PolicyName": "IAMReadOnlyAccess",
"PolicyArn": "arn:aws:iam::aws:policy/IAMReadOnlyAccess"
}
],
"Arn": "arn:aws:iam::279052847476:user/rundeck-read-only-iam-permissions"
}
]
}
with
jq -r '.UserDetailList[] | [.UserName] | @csv' output.json > fileout2.csv
I can get
"citrix-xendesktop-ec2-provisioning"
"rundeck-read-only-iam-permissions"
How do I get the IAM policies for these two users? I need to extract AmazonEC2FullAccess and AmazonS3FullAccess from under AttachedManagedPolicies, so the output would be:
citrix-xendesktop-ec2-provisioning",AmazonEC2FullAccess
citrix-xendesktop-ec2-provisioning",AmazonS3FullAccess
rundeck-read-only-iam-permissions,IAMReadOnlyAccess

The trick is to extract .UserName as a variable before iterating over the inner array:
.UserDetailList[]
| .UserName as $u
| .AttachedManagedPolicies[]
| [$u, .PolicyName]
| @csv
Of course this assumes valid JSON input.
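As a complete command, reusing the file names from the question:
jq -r '.UserDetailList[] | .UserName as $u | .AttachedManagedPolicies[] | [$u, .PolicyName] | @csv' output.json > fileout2.csv
which should write:
"citrix-xendesktop-ec2-provisioning","AmazonEC2FullAccess"
"citrix-xendesktop-ec2-provisioning","AmazonS3FullAccess"
"rundeck-read-only-iam-permissions","IAMReadOnlyAccess"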


parsing jq returns null

I have a JSON output:
{
"7": [
{
"devices": [
"/dev/sde"
],
"name": "osd-block-dcc9b386-529c-451e-9d84-8ccc4091102b",
"tags": {
"ceph.crush_device_class": "None",
"ceph.db_device": "/dev/nvme0n1p5",
"ceph.wal_device": "/dev/nvme0n1p6",
},
"type": "block",
"vg_name": "ceph-c4de9e90-853e-4569-b04f-8677ef9a8c7a"
},
{
"path": "/dev/nvme0n1p5",
"tags": {
"PARTUUID": "69712eb4-be52-4618-ba46-e317d6d3d76e"
},
"type": "db"
}
],
"41": [
{
"devices": [
"/dev/nvme1n1p13"
],
"name": "osd-block-97bce07f-ae98-4fdb-83a9-9fa2f35cee60",
"tags": {
"ceph.crush_device_class": "None",
},
"type": "block",
"vg_name": "ceph-c1d48671-2a33-4615-95e3-cc1b18783f0c"
}
],
"9": [
{
"devices": [
"/dev/sdf"
],
"name": "osd-block-35323eb8-17c1-460d-8cc5-565f549e6991",
"tags": {
"ceph.crush_device_class": "None",
"ceph.db_device": "/dev/nvme0n1p7",
"ceph.wal_device": "/dev/nvme0n1p8",
},
"type": "block",
"vg_name": "ceph-9488e8b8-ec18-4860-93d3-6a1ad91c698c"
},
{
"path": "/dev/nvme0n1p7",
"tags": {
"PARTUUID": "ef0e9588-2a20-4c2c-8b62-d73945e01322"
},
"type": "db"
}
]
}
Required output:
osd.7 /dev/sde /dev/nvme0n1p5 /dev/nvme0n1p6
osd.41 /dev/nvme1n1p13 n/a n/a
osd.9 /dev/sdf /dev/nvme0n1p7 /dev/nvme0n1p8
Problems:
When I try parsing using jq .[][].devices, I get null values:
$ cat json | jq .[][].devices
[
"/dev/sde"
]
null
[
"/dev/nvme1n1p13"
]
null
[
"/dev/sdf"
]
null
I can solve it via jq .[][].devices[]?.
However, this trick doesn't help me when I do want to see where there's no value (to print n/a instead):
$ cat json | jq '.[][].tags | ."ceph.db_device"'
"/dev/nvme0n1p5"
null
"/dev/nvme0n1p3"
null
null
"/dev/nvme0n1p7"
null
And finally, I try to create a table:
$ cat json | jq -r '["osd."+keys[]], [.[][].devices[]?], [.[][].tags."ceph.db_device" // ""] | @csv' | column -t -s,
"osd.7" "osd.41" "osd.9"
"/dev/sde" "/dev/nvme0n1p13" "/dev/sdf"
"/dev/nvme0n1p5" "/dev/nvme0n1p7"
So the obvious problem is that the 3rd row doesn't match the correct values.
And the final problem is how do I transpose it from columns to rows, as detailed in the required output?
Would this do what you want?
jq --raw-output '
  to_entries[] | [
    "osd." + .key,
    ( .value[0]
      | .devices[],
        ( .tags
          | ."ceph.db_device" // "n/a",
            ."ceph.wal_device" // "n/a"
        )
    )
  ]
  | @tsv
'
osd.7 /dev/sde /dev/nvme0n1p5 /dev/nvme0n1p6
osd.41 /dev/nvme1n1p13 n/a n/a
osd.9 /dev/sdf /dev/nvme0n1p7 /dev/nvme0n1p8
Demo
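One caveat: this relies on the block entry being the first element of each array (.value[0]). If that ordering is not guaranteed, a variant that selects the entry by type should produce the same result:
jq --raw-output '
  to_entries[] | [
    "osd." + .key,
    ( .value[] | select(.type == "block")
      | .devices[],
        ( .tags
          | ."ceph.db_device" // "n/a",
            ."ceph.wal_device" // "n/a"
        )
    )
  ]
  | @tsv
'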

Join array and dictionary into TSV in jq

I have the input JSON data at the bottom.
I'd like to generate a TSV output like the following. <TAB> is the TAB character. How can I do it in jq?
timestamp<TAB>open<TAB>high<TAB>dividends
1623072600<TAB>4229.33984375<TAB>4232.33984375<TAB>
1623159000<TAB>4233.81005859375<TAB>4236.740234375<TAB>
1623245400<TAB>4232.990234375<TAB>4237.08984375<TAB>0.42
1623331800<TAB>4228.56005859375<TAB>4249.740234375<TAB>
1623418200<TAB>4242.89990234375<TAB>4248.3798828125<TAB>
{
"timestamp": [
1623072600,
1623159000,
1623245400,
1623331800,
1623418200
],
"indicators": {
"quote": [
{
"open": [
4229.33984375,
4233.81005859375,
4232.990234375,
4228.56005859375,
4242.89990234375
],
"high": [
4232.33984375,
4236.740234375,
4237.08984375,
4249.740234375,
4248.3798828125
]
}
]
},
"events": {
"dividends": {
"1623245400": {
"amount": 0.42,
"date": 1623245400
}
}
}
}
Using jq with the -r command-line option:
(.events.dividends|map_values(.amount)) as $dividends
| ["timestamp", "open", "high", "dividends"],
( [.timestamp, (.indicators.quote[0] | .open, .high),
[$dividends[.timestamp[]|tostring]]]
| transpose[])
| @tsv
Notice how the dividends column is computed via a lookup into the $dividends dictionary built in the first line:
$dividends[.timestamp[]|tostring]
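For illustration, on the sample input the $dividends lookup reduces to a single entry, so only the row for timestamp 1623245400 gets a dividend; the other timestamps look up null, which @tsv renders as an empty field:
.events.dividends | map_values(.amount)
# => {"1623245400": 0.42}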

Fill arrays in the first input with elements from the second based on common field

I have two files, and I need to merge the elements of the second file into the object array of the first file by matching on the reference field.
The first file:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : []
},
{
"reference": 25423,
"order_number": "10_2",
"details" : []
}
]
The second file:
[
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
},
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
I would like to get:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : [
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details" :[
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
}
]
Below is my code in the es_func.jq file, launched by this command:
jq -n --argfile f1 es_file1.json --argfile f2 es_file2.json -f es_func.jq
INDEX($f2[] ; .reference) as $details
| $f1
| map( ($details[.reference|tostring]| .row_description) as $vn
| if $vn then .details = [{"row_description" : $vn}] else . end)
I only get the last matching record for reference 25422 ("row_description": "descr_1_1"), and "row_description": "descr_1_0" is missing:
[
{
"reference": 25422,
"order_number": "10_1",
"details": [
{
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details": [
{
"row_description": "descr_2_0"
}
]
}
]
I think I'm close to the solution but something is still missing. Thank you
This would be way easier if you used reduce instead.
jq 'reduce inputs[] as $rec (INDEX(.reference);
      .[$rec.reference | tostring].details += [$rec]
    ) | map(.)' es_file1.json es_file2.json
Online demo
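For reference, INDEX(.reference) turns the first file's array into a lookup object keyed by the stringified reference; jq's builtin is roughly:
def INDEX(stream; idx_expr): reduce stream as $row ({}; .[$row | idx_expr | tostring] = $row);
def INDEX(idx_expr): INDEX(.[]; idx_expr);
so the reduce body can address each order as .["25422"] or .["25423"] and append every detail record from the second file to its details array.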
Here's a straightforward, reduce-free solution:
jq '
  group_by(.reference)
  | INDEX(.[]; .[0]|.reference|tostring) as $dict
  | input
  | map_values(. + {details: $dict[.reference|tostring]})
' 2.json 1.json
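With the file names from the question, the invocation would be as follows (merge.jq is a hypothetical file holding the filter above); the details file must come first because it is read as the primary input, and input then reads the orders file:
jq -f merge.jq es_file2.json es_file1.json   # merge.jq: hypothetical file containing the filter above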

Using jq find key/value pair based on another key/value pair

I'm pasting a JSON example below; it requires some manipulation to get the desired output, which is described after the JSON. I want to use jq to parse out the data I need.
{
"MetricAlarms": [
{
"EvaluationPeriods": 3,
"ComparisonOperator": "GreaterThanOrEqualToThreshold",
"AlarmActions": [
"Unimportant:Random:alarm:ELK2[10.1.1.2]-Root-Disk-Alert"
],
"AlarmName": "Unimportant:Random:alarm:ELK1[10.1.1.0]-Root-Alert",
"Dimensions": [
{
"Name": "path",
"Value": "/"
},
{
"Name": "InstanceType",
"Value": "m5.2xlarge"
},
{
"Name": "fstype",
"Value": "ext4"
}
],
"DatapointsToAlarm": 3,
"MetricName": "disk_used_percent"
},
{
"EvaluationPeriods": 3,
"ComparisonOperator": "GreaterThanOrEqualToThreshold",
"AlarmActions": [
"Unimportant:Random:alarm:ELK2[10.1.1.2]"
],
"AlarmName": "Unimportant:Random:alarm:ELK2[10.1.1.2]",
"Dimensions": [
{
"Name": "path",
"Value": "/"
},
{
"Name": "InstanceType",
"Value": "r5.2xlarge"
},
{
"Name": "fstype",
"Value": "ext4"
}
],
"DatapointsToAlarm": 3,
"MetricName": "disk_used_percent"
}
]
}
So when I pass a key/value pair such as "Name": "InstanceType" as a parameter to jq (probably via cat file | jq), the expected output should be as below:
m5.2xlarge
r5.2xlarge
A generic approach: search the input recursively for a key-value pair ($sk/$sv) and extract the value of another key ($pv) from the objects found:
jq -r --arg sk Name \
--arg sv InstanceType \
--arg pv Value \
'.. | objects | select(contains({($sk): $sv})) | .[$pv]' file
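Note that contains does substring matching on string values (a search value of "m5" would also match "m5.2xlarge" here). If exact matching is preferred, a variant with the same variables could be:
jq -r --arg sk Name \
      --arg sv InstanceType \
      --arg pv Value \
      '.. | objects | select(.[$sk]? == $sv) | .[$pv]' file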

Reconstructing JSON with jq

I have a JSON like this (sample.json):
{
"sheet1": [
{
"hostname": "sv001",
"role": "web",
"ip1": "172.17.0.3"
},
{
"hostname": "sv002",
"role": "web",
"ip1": "172.17.0.4"
},
{
"hostname": "sv003",
"role": "db",
"ip1": "172.17.0.5",
"ip2": "172.18.0.5"
}
],
"sheet2": [
{
"hostname": "sv004",
"role": "web",
"ip1": "172.17.0.6"
},
{
"hostname": "sv005",
"role": "db",
"ip1": "172.17.0.7"
},
{
"hostname": "vsv006",
"role": "db",
"ip1": "172.17.0.8"
}
],
"sheet3": []
}
I want to extract data like this (e.g. for sheet1):
jq '(something command)' sample.json
{
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
},
"db": {
"hosts": [
"172.17.0.5"
]
}
}
Is it possible to perform the reconstruction with jq map?
(I will reuse the result for an Ansible inventory.)
Here's a short, straightforward and efficient solution -- efficient in part because it avoids group_by, thanks to the following generic helper function:
def add_by(f;g): reduce .[] as $x ({}; .[$x|f] += [$x|g]);
.sheet1
| add_by(.role; .ip1)
| map_values( {hosts: .} )
This produces the required output:
{
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
},
"db": {
"hosts": [
"172.17.0.5"
]
}
}
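To apply the same filter to another sheet, the sheet name can be passed in as a variable (a sketch, assuming the data file is sample.json as in the question):
jq --arg sheet sheet2 '
  def add_by(f;g): reduce .[] as $x ({}; .[$x|f] += [$x|g]);
  .[$sheet]
  | add_by(.role; .ip1)
  | map_values( {hosts: .} )
' sample.json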
If the goal is to regroup the IPs by their roles within each sheet, you could do this:
map_values(
  reduce group_by(.role)[] as $g ({};
    .[$g[0].role].hosts = [$g[] | del(.hostname, .role)[]]
  )
)
Which produces something like this:
{
"sheet1": {
"db": {
"hosts": [
"172.17.0.5",
"172.18.0.5"
]
},
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4"
]
}
},
"sheet2": {
"db": {
"hosts": [
"172.17.0.7",
"172.17.0.8"
]
},
"web": {
"hosts": [
"172.17.0.6"
]
}
},
"sheet3": {}
}
https://jqplay.org/s/3VpRc5l4_m
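Note that del(.hostname, .role)[] emits the values of all remaining keys, which is why sheet1's db entry ends up with both ip1 and ip2:
{"hostname": "sv003", "role": "db", "ip1": "172.17.0.5", "ip2": "172.18.0.5"}
| del(.hostname, .role)[]
# => "172.17.0.5", "172.18.0.5"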
If you want to flatten everything into a single object, keeping only unique IPs, the filter stays mostly the same; you just need to flatten the inputs prior to grouping and remove the map_values/1 call.
$ jq -n '
  reduce ([inputs[][]] | group_by(.role)[]) as $g ({};
    .[$g[0].role].hosts = ([$g[] | del(.hostname, .role)[]] | unique)
  )
'
{
"db": {
"hosts": [
"172.17.0.5",
"172.17.0.7",
"172.17.0.8",
"172.18.0.5"
]
},
"web": {
"hosts": [
"172.17.0.3",
"172.17.0.4",
"172.17.0.6"
]
}
}
https://jqplay.org/s/ZGj1wC8hU3