Pivot transform in Vega-Lite

I have the following dataset:
"data": {
"values": [{
"service": "service1",
"build": 5555,
"branch": "develop",
"env": "prod"
}, {
"service": "service2",
"build": 5555,
"branch": "develop",
"env": "staging"
}, {
"service": "service3",
"build": 5555,
"branch": "develop",
"env": "dev"
}, {
"service": "service4",
"build": 5555,
"branch": "develop",
"env": "test"
}
]
},
I want to show the data in the following way:
service   dev   test  staging  production
service1  5555  5555  5555     5555
service2  5555  5555  5555     5555
service3  5555  5555  5555     5555
How can I achieve that using Vega-Lite?

I'm not sure how you get your desired output from your input but you can do this with a pivot transform:
"transform": [{"pivot": "env", "value": "build", "groupby": ["service"]}],

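To see what the pivot transform above computes, here is a minimal Python sketch of the same reshaping, applied to the dataset from the question. The `pivot` helper is hypothetical (it is not part of Vega-Lite), and summing values that land in the same cell is an assumption about the aggregate being used.

```python
rows = [
    {"service": "service1", "build": 5555, "branch": "develop", "env": "prod"},
    {"service": "service2", "build": 5555, "branch": "develop", "env": "staging"},
    {"service": "service3", "build": 5555, "branch": "develop", "env": "dev"},
    {"service": "service4", "build": 5555, "branch": "develop", "env": "test"},
]


def pivot(rows, pivot_field, value_field, groupby):
    """Return one wide row per group, with one column per pivot_field value."""
    wide = {}
    for row in rows:
        key = tuple(row[g] for g in groupby)
        out = wide.setdefault(key, {g: row[g] for g in groupby})
        column = row[pivot_field]
        # Assumption: values that share a cell are summed.
        out[column] = out.get(column, 0) + row[value_field]
    return list(wide.values())


pivoted = pivot(rows, "env", "build", ["service"])
for record in pivoted:
    print(record)
```

With this input every service has a build for only one environment, so each wide row fills a single env column. That is why the desired table cannot be derived from the sample data alone.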
JQ Error: Cannot iterate over string while trying to map IP and Ports

I have the JSON output below. I would like to remove the duplicate entries and map the data into a table format using jq. I am using the query below, but I keep getting errors such as:
Cannot iterate over string ("78.45.196...)
JSON data
[
{
"ip": "78.45.196.23",
"timestamp": "1616566245",
"ports": [
{
"port": 5060,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
},
{
"ip": "67.89.378.82",
"timestamp": "1616566255",
"ports": [
{
"port": 2000,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
},
{
"ip": "67.89.378.82",
"timestamp": "1616566255",
"ports": [
{
"port": 2080,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
},
{
"ip": "78.45.196.23",
"timestamp": "1616566245",
"ports": [
{
"port": 5060,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
},
{
"ip": "67.89.378.82",
"timestamp": "1616566255",
"ports": [
{
"port": 2000,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
},
{
"ip": "78.45.196.23",
"timestamp": "1616566245",
"ports": [
{
"port": 5080,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 50
}
]
}
]
My query
jq -r '.[][] | group_by(.ip) | map({ip: .ip, ports: map(.ports[].port) | add | unique})' jsonfile.json
Expected output
I want to remove the duplicates and get each IP with its ports. Alternatively, can someone explain how to get unique values for both IP and ports?
[
{"ip": "67.89.378.82", "ports": [2000, 2080]},
{"ip": "78.45.196.23", "ports": [5060, 5080]}
]
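The error comes from the leading .[][] in your query: it iterates over the values of each object, so group_by(.ip) ends up being applied to a string such as the IP address. Drop it, apply group_by() to the whole input array, and construct your desired JSON immediately after it: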
group_by(.ip) |
map
(
{
ip: .[0].ip,
ports: [ .[].ports[].port ] | unique
}
)
Follow-up question: to discard IPs whose only port is 0, filter the ports and then drop the entries that end up empty:
group_by(.ip) |
map
(
{
ip: .[0].ip,
ports: [ .[].ports[] | select(.port != 0 ).port ] | unique
} |
select(.ports | length > 0)
)
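For comparison, the grouping done by the two filters above can be sketched in Python. The function name `ports_by_ip`, the `drop_zero` flag, and the extra record with port 0 are assumptions added for illustration; the remaining records are a trimmed copy of the data in the question.

```python
def ports_by_ip(records, drop_zero=False):
    """Group records by IP and collect the sorted, unique ports of each IP."""
    grouped = {}
    for rec in records:
        ports = grouped.setdefault(rec["ip"], set())
        for entry in rec["ports"]:
            if drop_zero and entry["port"] == 0:
                continue  # mirrors select(.port != 0)
            ports.add(entry["port"])
    return [
        {"ip": ip, "ports": sorted(ports)}
        for ip, ports in sorted(grouped.items())
        # mirrors select(.ports | length > 0) in the follow-up filter
        if ports or not drop_zero
    ]


records = [
    {"ip": "78.45.196.23", "ports": [{"port": 5060}]},
    {"ip": "67.89.378.82", "ports": [{"port": 2000}]},
    {"ip": "67.89.378.82", "ports": [{"port": 2080}]},
    {"ip": "78.45.196.23", "ports": [{"port": 5060}]},
    {"ip": "78.45.196.23", "ports": [{"port": 5080}]},
    {"ip": "10.0.0.1", "ports": [{"port": 0}]},  # hypothetical record
]

result = ports_by_ip(records, drop_zero=True)
print(result)
```

Using a set per IP removes duplicate ports, and sorting the final lists keeps the output order stable, which is what group_by and unique give you in jq.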

Storing JSON output in bash from CloudFormation

I am using an aws ecs query to get the list of properties used by the currently running task.
Command:
cft="aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b"
I am storing the result in an output variable:
output=$(eval "$cft")
Output:
{
"tasks": [
{
"attachments": [
{
"id": "da8a1312-8278-46d5-8e3b-6b6a1d96f820",
"type": "ElasticNetworkInterface",
"status": "ATTACHED",
"details": [
{
"name": "subnetId",
"value": "subnet-0a151f2eb959ad4"
},
{
"name": "networkInterfaceId",
"value": "eni-081948e3666253f"
},
{
"name": "macAddress",
"value": "02:2a:9i:5c:4a:77"
},
{
"name": "privateDnsName",
"value": "ip-172-56-17-177.us-west-2.compute.internal"
},
{
"name": "privateIPv4Address",
"value": "172.56.17.177"
}
]
}
],
"availabilityZone": "us-west-2a",
"clusterArn": "arn:aws:ecs:us-west-2:4984314772:cluster/secrets",
"containers": [
{
"taskArn": "arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b",
"name": "nginx",
"image": "nginx",
"lastStatus": "PENDING",
"networkInterfaces": [
{
"attachmentId": "da8a1312-8278-46d5-6b6a1d96f820",
"privateIpv4Address": "172.31.17.176"
}
],
"healthStatus": "UNKNOWN",
"cpu": "0"
}
],
"cpu": "256",
"createdAt": "2020-12-10T18:00:16.320000+05:30",
"desiredStatus": "RUNNING",
"group": "family:nginx",
"healthStatus": "UNKNOWN",
"lastStatus": "PENDING",
"launchType": "FARGATE",
"memory": "512",
"overrides": {
"containerOverrides": [
{
"name": "nginx"
}
],
"inferenceAcceleratorOverrides": []
},
"platformVersion": "1.4.0",
"tags": [],
"taskArn": "arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b",
"taskDefinitionArn": "arn:aws:ecs:us-west-2:4984314772:task-definition/nginx:17",
"version": 2
}
],
"failures": []
}
Now, if I echo $output.tasks[0].containers[0], it just prints the entire thing again. I want to store the result in the output variable and refer to individual parameters, as one would with a JSON object.
The shell treats the variable as plain text, so you will need a JSON parser such as jq:
eval $cft | jq '.tasks[].containers[]'
To avoid using eval, you could simply pipe the aws command into jq:
aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b | jq '.tasks[].containers[]'
or store the raw JSON in a variable and query it afterwards:
cft=$(aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b)
echo "$cft" | jq '.tasks[].containers[]'
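The underlying point, that the command output is only text until something parses it, can be sketched in Python as well. The `output` string below is a heavily trimmed stand-in for the describe-tasks response shown in the question, not real CLI output.

```python
import json

# Trimmed stand-in for the text returned by the aws command.
output = """
{
  "tasks": [
    {
      "containers": [
        {"name": "nginx", "image": "nginx", "lastStatus": "PENDING"}
      ],
      "launchType": "FARGATE"
    }
  ],
  "failures": []
}
"""

doc = json.loads(output)                     # text -> nested dicts and lists
container = doc["tasks"][0]["containers"][0]  # same path as .tasks[0].containers[0]
print(container["name"], container["lastStatus"])
```

Once the text is parsed, individual fields can be read by key and index, which is what the question was trying to do with `$output.tasks[0].containers[0]`.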

How to format a JSON string as a table using jq?

Just started out with Bash scripting and stumbled upon jq to work with JSON.
I need to transform a JSON string like below to a table for output in the terminal.
[{
"name": "George",
"id": 12,
"email": "george@domain.example"
}, {
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
}, {
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}]
What I want to display in the terminal:
ID Name
=================
12 George
18 Jack
19 Joe
Notice how I don't want to display the email property for each row, so the jq command should involve some filtering. The following gives me a plain list of names and id's:
list=$(echo "$data" | jq -r '.[] | .name, .id')
printf "$list"
The problem with that is, I cannot display it like a table. I know jq has some formatting options, but not nearly as good as the options I have when using printf. I think I want to get these values in an array which I can then loop through myself to do the formatting...? The things I tried give me varying results, but never what I really want.
Can someone point me in the right direction?
Using the @tsv filter has much to recommend it, mainly because it handles numerous "edge cases" in a standard way:
.[] | [.id, .name] | @tsv
Adding the headers can be done like so:
jq -r '["ID","NAME"], ["--","------"], (.[] | [.id, .name]) | @tsv'
The result:
ID NAME
-- ------
12 George
18 Jack
19 Joe
As pointed out by @Tobia, you might want to format the table for viewing by using column to post-process the result produced by jq. If you are using a bash-like shell then column -ts $'\t' should be quite portable.
length*"-"
To automate the production of the line of dashes:
jq -r '(["ID","NAME"] | (., map(length*"-"))), (.[] | [.id, .name]) | @tsv'
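The header-plus-dashes idea can also be sketched in Python, for readers who want to check the expected rows without jq. The helper name `to_tsv` is hypothetical, and unlike @tsv this sketch does not escape tabs or newlines inside values.

```python
def to_tsv(records, columns):
    """Build a tab-separated table: header, dashes, then one row per record."""
    header = [c.upper() for c in columns]
    dashes = ["-" * len(h) for h in header]  # same idea as map(length*"-")
    body = [[str(rec[c]) for c in columns] for rec in records]
    return "\n".join("\t".join(row) for row in [header, dashes, *body])


people = [
    {"name": "George", "id": 12, "email": "george@domain.example"},
    {"name": "Jack", "id": 18, "email": "jack@domain.example"},
    {"name": "Joe", "id": 19, "email": "joe@domain.example"},
]

table = to_tsv(people, ["id", "name"])
print(table)
```

Listing the wanted columns explicitly is what filters out the email field, just as `[.id, .name]` does in the jq filter.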
Why not something like:
echo '[{
"name": "George",
"id": 12,
"email": "george@domain.example"
}, {
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
}, {
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}]' | jq -r '.[] | "\(.id)\t\(.name)"'
Output
12 George
18 Jack
19 Joe
Edit 1: For fine-grained formatting, use tools like awk:
echo '[{
"name": "George",
"id": 12,
"email": "george@domain.example"
}, {
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
}, {
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}]' | jq -r '.[] | [.id, .name] | @csv' | awk -v FS="," 'BEGIN{print "ID\tName";print "============"}{printf "%s\t%s%s",$1,$2,ORS}'
ID Name
============
12 "George"
18 "Jack"
19 "Joe"
Edit 2 : In reply to
There's no way I can get a variable containing an array straight
from jq?
Why not?
A slightly more involved example (modified from yours), in which email is changed to an array, demonstrates this:
echo '[{
"name": "George",
"id": 20,
"email": [ "george@domain1.example" , "george@domain2.example" ]
}, {
"name": "Jack",
"id": 18,
"email": [ "jack@domain3.example" , "jack@domain5.example" ]
}, {
"name": "Joe",
"id": 19,
"email": [ "joe@domain.example" ]
}]' | jq -r '.[] | .email'
Output
[
"george@domain1.example",
"george@domain2.example"
]
[
"jack@domain3.example",
"jack@domain5.example"
]
[
"joe@domain.example"
]
Defining headers by hand is suboptimal! Omitting headers is also suboptimal.
TL;DR
data
[{ "name": "George", "id": 12, "email": "george@domain.example" },
{ "name": "Jack", "id": 18, "email": "jack@domain.example" },
{ "name": "Joe", "id": 19, "email": "joe@domain.example" }]
script
[.[]| with_entries( .key |= ascii_downcase ) ]
| (.[0] |keys_unsorted | @tsv)
, (.[] |map(.) |@tsv)
how to run
$ < data jq -rf script | column -t
name id email
George 12 george@domain.example
Jack 18 jack@domain.example
Joe 19 joe@domain.example
I found this question while summarizing some data from Amazon Web Services. Here is the problem I was working on, in case you want another example:
$ aws ec2 describe-spot-instance-requests | tee /tmp/ins |
jq --raw-output '
# extract instances as a flat list.
[.SpotInstanceRequests | .[]
# remove unwanted data
| {
State,
statusCode: .Status.Code,
type: .LaunchSpecification.InstanceType,
blockPrice: .ActualBlockHourlyPrice,
created: .CreateTime,
SpotInstanceRequestId}
]
# lowercase keys
# (for predictable sorting, optional)
| [.[]| with_entries( .key |= ascii_downcase ) ]
| (.[0] |keys_unsorted | @tsv) # print headers
, (.[]|.|map(.) |@tsv) # print table
' | column -t
Output:
state statuscode type blockprice created spotinstancerequestid
closed instance-terminated-by-user t3.nano 0.002000 2019-02-24T15:21:36.000Z sir-r5bh7skq
cancelled bad-parameters t3.nano 0.002000 2019-02-24T14:51:47.000Z sir-1k9s5h3m
closed instance-terminated-by-user t3.nano 0.002000 2019-02-24T14:55:26.000Z sir-43x16b6n
cancelled bad-parameters t3.nano 0.002000 2019-02-24T14:29:23.000Z sir-2jsh5brn
active fulfilled t3.nano 0.002000 2019-02-24T15:37:26.000Z sir-z1e9591m
cancelled bad-parameters t3.nano 0.002000 2019-02-24T14:33:42.000Z sir-n7c15y5p
Input:
$ cat /tmp/ins
{
"SpotInstanceRequests": [
{
"Status": {
"Message": "2019-02-24T15:29:38+0000 : 2019-02-24T15:29:38+0000 : Spot Instance terminated due to user-initiated termination.",
"Code": "instance-terminated-by-user",
"UpdateTime": "2019-02-24T15:31:03.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T15:21:36.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"InstanceId": "i-0414083bef5e91d94",
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-r5bh7skq",
"State": "closed",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda1",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T15:21:36.000Z",
"SpotPrice": "0.008000"
},
{
"Status": {
"Message": "Your Spot request failed due to bad parameters.",
"Code": "bad-parameters",
"UpdateTime": "2019-02-24T14:51:48.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T14:51:47.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"Fault": {
"Message": "Invalid device name /dev/sda",
"Code": "InvalidBlockDeviceMapping"
},
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-1k9s5h3m",
"State": "cancelled",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T14:51:47.000Z",
"SpotPrice": "0.011600"
},
{
"Status": {
"Message": "2019-02-24T15:02:17+0000 : 2019-02-24T15:02:17+0000 : Spot Instance terminated due to user-initiated termination.",
"Code": "instance-terminated-by-user",
"UpdateTime": "2019-02-24T15:03:34.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T14:55:26.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"InstanceId": "i-010442ac3cc85ec08",
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-43x16b6n",
"State": "closed",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda1",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T14:55:26.000Z",
"SpotPrice": "0.011600"
},
{
"Status": {
"Message": "Your Spot request failed due to bad parameters.",
"Code": "bad-parameters",
"UpdateTime": "2019-02-24T14:29:24.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T14:29:23.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"Fault": {
"Message": "Addressing type must be 'public'",
"Code": "InvalidParameterCombination"
},
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-2jsh5brn",
"State": "cancelled",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T14:29:23.000Z",
"SpotPrice": "0.011600"
},
{
"Status": {
"Message": "Your spot request is fulfilled.",
"Code": "fulfilled",
"UpdateTime": "2019-02-24T15:37:28.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T15:37:26.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"InstanceId": "i-0a29e9de6d59d433f",
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-z1e9591m",
"State": "active",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda1",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T15:37:26.000Z",
"SpotPrice": "0.008000"
},
{
"Status": {
"Message": "Your Spot request failed due to bad parameters.",
"Code": "bad-parameters",
"UpdateTime": "2019-02-24T14:33:43.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T14:33:42.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"Fault": {
"Message": "Invalid device name /dev/sda",
"Code": "InvalidBlockDeviceMapping"
},
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-n7c15y5p",
"State": "cancelled",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T14:33:42.000Z",
"SpotPrice": "0.011600"
}
]
}
The problem with the answers above is they only work if the fields are all about the same width.
To avoid this issue, the Linux column command could be used:
// input.json
[
{
"name": "George",
"id": "a very very long field",
"email": "george@domain.example"
},
{
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
},
{
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}
]
Then:
▶ jq -r '.[] | [.id, .name] | @tsv' input.json | column -ts $'\t'
a very very long field George
18 Jack
19 Joe
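What column does here, padding every cell to the width of the widest cell in its column, can be sketched in a few lines of Python. The `align` helper and its `gap` parameter are hypothetical names, and the sketch only approximates the tool's behaviour.

```python
def align(tsv_text, gap=2):
    """Pad tab-separated cells so that every column starts at the same offset."""
    rows = [line.split("\t") for line in tsv_text.splitlines()]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Width of a column is the length of its longest cell.
    widths = [
        max(len(row[i]) for row in rows if i < len(row)) for i in range(ncols)
    ]
    return "\n".join(
        (" " * gap).join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    )


aligned = align("a very very long field\tGeorge\n18\tJack\n19\tJoe")
print(aligned)
```

Because the widths are computed from the data, one unusually long field no longer breaks the layout, which is the problem with fixed tab stops.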
I combined the answers above to get all of the following behaviours:
create a header row
handle long fields
wrap everything in a reusable function
Bash function:
function jsonArrayToTable(){
jq -r '(.[0] | ([keys[] | .] |(., map(length*"-")))), (.[] | ([keys[] as $k | .[$k]])) | @tsv' | column -t -s $'\t'
}
Sample use
echo '[{"key1":"V1.1", "key2":"V2.1"}, {"keyA":"V1.2", "key2":"V2.2"}]' | jsonArrayToTable
output
key1 key2
---- ----
V1.1 V2.1
V2.2 V1.2
If you want to generate an HTML table instead of a table for terminal output:
echo '[{
"name": "George",
"id": 12,
"email": "george@domain.example"
}, {
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
}, {
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}]' | jq -r 'map("<tr><td>" + .name + "</td><td>" + (.id | tostring) + "</td></tr>") | ["<table>"] + . + ["</table>"] | .[]'
Output:
<table>
<tr><td>George</td><td>12</td></tr>
<tr><td>Jack</td><td>18</td></tr>
<tr><td>Joe</td><td>19</td></tr>
</table>
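A comparable Python sketch is shown below. The function name `to_html_table` is hypothetical. Escaping the cell values is an addition of this sketch: the jq command above concatenates the raw values, which is fine for this data but would produce broken markup if a value contained characters such as < or &.

```python
from html import escape


def to_html_table(records, columns):
    """Render the chosen columns of each record as one HTML table row."""
    rows = [
        "<tr>"
        + "".join(f"<td>{escape(str(rec[c]))}</td>" for c in columns)
        + "</tr>"
        for rec in records
    ]
    return "\n".join(["<table>", *rows, "</table>"])


people = [
    {"name": "George", "id": 12},
    {"name": "Jack", "id": 18},
    {"name": "Joe", "id": 19},
]

html_table = to_html_table(people, ["name", "id"])
print(html_table)
```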
If the values don't contain spaces, this might be helpful:
read -r -a data <<<'name1 value1 name2 value2'
echo "name value"
echo "=========="
for ((i=0; i<${#data[@]}; i+=2)); do
echo "${data[$i]} ${data[$((i+1))]}"
done
Output
name value
==========
name1 value1
name2 value2
A simpler implementation:
jq -r '(.[0]|keys_unsorted|(.,map(length*"-"))),.[]|map(.)|@tsv'|column -ts $'\t'
You can add the following jq function to ~/.jq:
def pretty_table:
(.[0]|keys_unsorted|(.,map(length*"-"))),.[]|map(.)|@tsv
;
and then run:
cat apps.json | jq -r pretty_table | column -ts $'\t'

Query parent data (multi-level) based on a child value, on a json file, using jq

I have a ksh script that retrieves (using curl) a JSON file similar to the one below:
{
"Type1": {
"dev": {
"server": [
{ "group": "APP1", "name": "DAPP1002", "ip": "10.1.1.1" },
{ "group": "APP2", "name": "DAPP2001", "ip": "10.1.1.2" }
]
},
"qa": {
"server": [
{ "group": "APP1", "name": "QAPP1002", "ip": "10.1.2.1" },
{ "group": "APP2", "name": "QAPP2001", "ip": "10.1.2.2" }
]
},
"prod": {
"proxy": "type1.prod.proxy.mydomain.com",
"server": [
{ "group": "APP1", "name": "PAPP1001", "ip": "10.1.3.1" },
{ "group": "APP1", "name": "PAPP1002", "ip": "10.1.3.2" },
{ "group": "APP2", "name": "PAPP2001", "ip": "10.1.3.3" }
]
}
},
"Type2": {
"dev": {
"server": [
{ "group": "APP8", "name": "DAPP8002", "ip": "10.2.1.1" },
{ "group": "APP9", "name": "DAPP9001", "ip": "10.2.1.2" }
]
},
"qa": {
"server": [
{ "group": "APP8", "name": "QAPP8002", "ip": "10.2.2.1" },
{ "group": "APP9", "name": "QAPP9001", "ip": "10.2.2.2" }
]
},
"prod": {
"proxy": "type2.prod.proxy.mydomain.com",
"server": [
{ "group": "APP8", "name": "PAPP8001", "ip": "10.2.3.1" },
{ "group": "APP9", "name": "PAPP9001", "ip": "10.2.3.2" },
{ "group": "APP9", "name": "PAPP9002", "ip": "10.2.3.3" }
]
}
}
}
... based on a server name (field "name") I would have to collect the following info, to pass to a function:
"Type", "name", "ip", "proxy"
(Note that the "proxy" info is optional)
I am new to JSON and I am trying to filter this with jq, but so far I am out of luck.
What I have accomplished so far is the following jq query, when searching for "PAPP9001":
jq '.[] | .[] | select(.server[].name=="PAPP9001") | .proxy as $proxy | .server[] | {proxy: $proxy, name: .name, ip: .ip} | select(.name=="PAPP9001")' curlreturn.json
which returns me:
{
"proxy": "type2.prod.proxy.mydomain.com",
"name": "PAPP9001",
"ip": "10.2.3.2"
}
but:
I could not get the "Type" info, at the top level
Considering the number of pipes and the 2 selects, I doubt that this is the most efficient way.
One way to retrieve the key names programmatically is using to_entries. For example, given your input, this jq filter:
to_entries[]
| .key as $type
| .value[]
| .proxy as $proxy
| .server[]
| select(.name == "PAPP9001")
| { Type: $type, name, ip, proxy: $proxy }
yields:
{
"Type": "Type2",
"name": "PAPP9001",
"ip": "10.2.3.2",
"proxy": "type2.prod.proxy.mydomain.com"
}
Variations
If, for example, you wanted these four fields as a CSV row, then you could replace the last line of the filter above with:
| [$type, .name, .ip, $proxy] | @csv
See the jq manual for how to use string interpolation.
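The same walk, remembering the top-level key and the optional proxy while descending to the matching server, can be sketched in Python. The function name `find_server` is hypothetical and the `inventory` below is a trimmed copy of the data in the question.

```python
def find_server(inventory, server_name):
    """Return Type, name, ip and proxy for the first server with the given name."""
    for type_name, envs in inventory.items():       # like to_entries[] with .key as $type
        for env in envs.values():                   # like .value[]
            for server in env.get("server", []):
                if server["name"] == server_name:
                    return {
                        "Type": type_name,
                        "name": server["name"],
                        "ip": server["ip"],
                        "proxy": env.get("proxy"),  # None when no proxy is defined
                    }
    return None


inventory = {
    "Type1": {
        "dev": {"server": [{"group": "APP1", "name": "DAPP1002", "ip": "10.1.1.1"}]},
    },
    "Type2": {
        "prod": {
            "proxy": "type2.prod.proxy.mydomain.com",
            "server": [
                {"group": "APP8", "name": "PAPP8001", "ip": "10.2.3.1"},
                {"group": "APP9", "name": "PAPP9001", "ip": "10.2.3.2"},
            ],
        },
    },
}

match = find_server(inventory, "PAPP9001")
print(match)
```

As in the jq filter, the parent information is captured before descending, so a single comparison on the server name is enough.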

"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T15:37:26.000Z",
"SpotPrice": "0.008000"
},
{
"Status": {
"Message": "Your Spot request failed due to bad parameters.",
"Code": "bad-parameters",
"UpdateTime": "2019-02-24T14:33:43.000Z"
},
"ActualBlockHourlyPrice": "0.002000",
"ValidUntil": "2019-03-03T14:33:42.000Z",
"InstanceInterruptionBehavior": "terminate",
"Tags": [],
"Fault": {
"Message": "Invalid device name /dev/sda",
"Code": "InvalidBlockDeviceMapping"
},
"BlockDurationMinutes": 60,
"SpotInstanceRequestId": "sir-n7c15y5p",
"State": "cancelled",
"ProductDescription": "Linux/UNIX",
"LaunchedAvailabilityZone": "eu-north-1a",
"LaunchSpecification": {
"Placement": {
"Tenancy": "default",
"AvailabilityZone": "eu-north-1a"
},
"ImageId": "ami-6d27a913",
"BlockDeviceMappings": [
{
"DeviceName": "/dev/sda",
"VirtualName": "root",
"NoDevice": "",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 8
}
}
],
"EbsOptimized": false,
"SecurityGroups": [
{
"GroupName": "default"
}
],
"Monitoring": {
"Enabled": false
},
"InstanceType": "t3.nano",
"AddressingType": "public",
"NetworkInterfaces": [
{
"DeviceIndex": 0,
"Description": "eth-zero",
"NetworkInterfaceId": "",
"DeleteOnTermination": true,
"SubnetId": "subnet-420ffc2b",
"AssociatePublicIpAddress": true
}
]
},
"Type": "one-time",
"CreateTime": "2019-02-24T14:33:42.000Z",
"SpotPrice": "0.011600"
}
]
}
The problem with the answers above is they only work if the fields are all about the same width.
To avoid this issue, the Linux column command could be used:
// input.json
[
{
"name": "George",
"id": "a very very long field",
"email": "george@domain.example"
},
{
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
},
{
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}
]
Then:
▶ jq -r '.[] | [.id, .name] | @tsv' input.json | column -ts $'\t'
a very very long field George
18 Jack
19 Joe
I combined the answers above to get all of these behaviours:
create a header row
handle long fields
wrap everything in a reusable function
Bash function:
function jsonArrayToTable(){
jq -r '(.[0] | ([keys[] | .] |(., map(length*"-")))), (.[] | ([keys[] as $k | .[$k]])) | @tsv' | column -t -s $'\t'
}
Sample use
echo '[{"key1":"V1.1", "key2":"V2.1"}, {"keyA":"V1.2", "key2":"V2.2"}]' | jsonArrayToTable
output
key1 key2
---- ----
V1.1 V2.1
V2.2 V1.2
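Note that in the sample output the last row reads V2.2 V1.2: each row is built from that object's own sorted keys, so an object with different keys ends up under the wrong headers. A minimal sketch of one way around this, assuming every row should be projected onto the keys of the first object; the function name jsonArrayToTableFixed and the "-" placeholder for missing values are illustrative choices, not part of the original answer:

```shell
# Hypothetical variant: use the keys of the first object as the column list
# for every row, and print "-" where a row has no value for a column.
jsonArrayToTableFixed() {
  jq -r '(.[0] | keys_unsorted) as $cols
         | $cols,
           ($cols | map("-" * length)),
           (.[] | [.[$cols[]] | if . == null then "-" else . end])
         | @tsv'
}

# Emits tab-separated rows; append  | column -t -s $'\t'  to align them.
echo '[{"key1":"V1.1", "key2":"V2.1"}, {"keyA":"V1.2", "key2":"V2.2"}]' | jsonArrayToTableFixed
```

Keys that appear only in later objects (keyA here) are dropped, and nested objects or arrays as values would still make @tsv fail, so this only suits flat records.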
If you want to generate an HTML table instead of a table for terminal output:
echo '[{
"name": "George",
"id": 12,
"email": "george@domain.example"
}, {
"name": "Jack",
"id": 18,
"email": "jack@domain.example"
}, {
"name": "Joe",
"id": 19,
"email": "joe@domain.example"
}]' | jq -r 'map("<tr><td>" + .name + "</td><td>" + (.id | tostring) + "</td></tr>") | ["<table>"] + . + ["</table>"] | .[]'
Output:
<table>
<tr><td>George</td><td>12</td></tr>
<tr><td>Jack</td><td>18</td></tr>
<tr><td>Joe</td><td>19</td></tr>
</table>
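The string concatenation above inserts values verbatim, so a name containing < or & would produce broken markup. A hedged sketch of the same idea using jq's @html format string, which escapes each interpolated value; the function name jsonToHtmlTable is made up for this example:

```shell
# Hypothetical helper: same table as above, but every interpolated value is
# HTML-escaped by the @html format string (numbers are stringified first).
jsonToHtmlTable() {
  jq -r '"<table>",
         (.[] | @html "<tr><td>\(.name)</td><td>\(.id)</td></tr>"),
         "</table>"'
}

echo '[{"name": "Tom & Jerry", "id": 12}]' | jsonToHtmlTable
```

Only the interpolated parts are escaped; the literal tags in the format string pass through unchanged.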
If the values don't contain spaces, this might be helpful:
read -r -a data <<<'name1 value1 name2 value2'
echo "name value"
echo "=========="
for ((i=0; i<${#data[@]}; i+=2)); do
echo "${data[$i]}" "${data[$((i+1))]}"
done
done
Output
name value
==========
name1 value1
name2 value2
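With echo the columns only line up when all names have the same length. A small sketch of the same loop using printf with a fixed field width; the function name printPairs and the width of 10 are assumptions made for illustration:

```shell
# Hypothetical helper: print name/value pairs in two columns, with the
# first column left-aligned and padded to 10 characters.
printPairs() {
  local -a data=("$@")
  local i
  printf '%-10s %s\n' "name" "value"
  printf '%-10s %s\n' "----------" "-----"
  for ((i = 0; i < ${#data[@]}; i += 2)); do
    printf '%-10s %s\n' "${data[i]}" "${data[i+1]}"
  done
}

printPairs name1 value1 n2 v2
```

Names longer than the chosen width still push the second column to the right, so the width has to be picked for the data at hand.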
A simpler implementation:
jq -r '(.[0]|keys_unsorted|(.,map(length*"-"))),.[]|map(.)|@tsv'|column -ts $'\t'
You can also add the following jq function to ~/.jq:
def pretty_table:
(.[0]|keys_unsorted|(.,map(length*"-"))),.[]|map(.)|@tsv
;
and then run:
cat apps.json | jq -r pretty_table | column -ts $'\t'