Format json output with jq

I'm writing a script which will obtain certain info from my Kubernetes cluster.
The following command
kubectl get --context <my-context> svc --selector='<my-selectors>' -o json |
jq -r ' .items[]| {Name:.metadata.name, Port:.spec.ports[0].port} + {Group: "test", Group: "group name", SSLMode: "prefer", MaintenanceDB: "postgres"} '>file.json
will output something like this
{
"Name": "db-name",
"Port": 3000,
"Group": "group name",
"SSLMode": "prefer",
"MaintenanceDB": "postgres"
}
{
"Name": "db-name",
"Port": 5432,
"Group": "group name",
"SSLMode": "prefer",
"MaintenanceDB": "postgres"
}
I've been trying to get the above into the following format
{
"Servers":{
"1": {
"Name": "db-name",
"Port": 3000,
"Group": "Server Group 1",
"SSLMode": "prefer",
"MaintenanceDB": "postgres"
},
"2": {
"Name": "db-name",
"Port": 5432,
"Group": "Server Group 1",
"SSLMode": "prefer",
"MaintenanceDB": "postgres"
}
}
}
Only just discovered jq so the few things I've tried have been unsuccessful. Would be grateful for any pointers.

Given the stream of JSON objects shown in the question, the following jq filter will produce the desired output assuming the stream is somehow "slurped":
. as $in
| reduce range(0;length) as $i ({};.[$i+1|tostring] = $in[$i])
| {Servers: .}
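As a quick illustration of the "slurped" case, here is a minimal sketch that feeds a two-object stream to jq with -s (--slurp). The names and ports in the sample stream are invented; only the filter comes from the answer above.

```shell
# Two objects in a stream, standing in for the first jq call's output (values invented).
stream='{"Name":"db-a","Port":3000}
{"Name":"db-b","Port":5432}'

# -s (--slurp) gathers the whole stream into one array before the filter runs,
# so range(0; length) can walk it and key each object by its 1-based position.
numbered=$(printf '%s\n' "$stream" | jq -c -s '
  . as $in
  | reduce range(0; length) as $i ({}; .[$i + 1 | tostring] = $in[$i])
  | {Servers: .}')

printf '%s\n' "$numbered"
```

Without -s the filter would run once per object, and each run would number the object's own keys rather than the objects in the stream.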
To avoid having to call jq twice, you could wrap your filter in square brackets, and pipe that into the above; better yet, you can streamline everything, e.g. along the following lines:
.items
| [to_entries[]
| {(.key+1|tostring): .value}
| map_values(
{Name:.metadata.name,
Port:.spec.ports[0].port,
Group: "group name",
SSLMode: "prefer",
MaintenanceDB: "postgres"}) ]
| {Servers: add}
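To try the streamlined filter without a cluster, the following sketch runs it against a hand-made two-item stand-in for the kubectl output. The service names and ports are assumptions made up for the example.

```shell
# Hypothetical stand-in for `kubectl get svc -o json`; names and ports are invented.
sample='{"items":[
  {"metadata":{"name":"db-a"},"spec":{"ports":[{"port":3000}]}},
  {"metadata":{"name":"db-b"},"spec":{"ports":[{"port":5432}]}}
]}'

# to_entries on an array yields {key: index, value: item}; the index is
# shifted to start at 1 and turned into a string so it can be an object key.
servers=$(printf '%s' "$sample" | jq -c '
  .items
  | [ to_entries[]
      | {(.key + 1 | tostring): .value}
      | map_values({Name: .metadata.name,
                    Port: .spec.ports[0].port,
                    Group: "group name",
                    SSLMode: "prefer",
                    MaintenanceDB: "postgres"}) ]
  | {Servers: add}')

printf '%s\n' "$servers"
```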

Related

How to select multiple items with duplicate items using jq?

I have a JSON file which contains a list like this;
[{
"host": "cat",
"ip": "192.168.1.1",
"id": "cherry"
}, {
"host": "dog",
"ip": "192.168.1.1",
"id": "apple"
}, {
"host": "cat",
"ip": "192.168.1.2",
"id": "banana"
}]
I want to collect the IPs and print the id and host next to each one, but if an IP is used multiple times, print all of its host and id values on the same line instead of on a new line. IP and host can be the same for multiple items, but id is unique.
So the final output should look like this;
$ echo <something>
192.168.1.1 cat cherry dog apple
192.168.1.2 cat banana
How can I do this using bash and jq?
Make sure you have a valid JSON file: Remove the last comma , in each object to get this as your input.json:
[{
"host": "cat",
"ip": "192.168.1.1",
"id": "cherry"
}, {
"host": "dog",
"ip": "192.168.1.1",
"id": "apple"
}, {
"host": "cat",
"ip": "192.168.1.2",
"id": "banana"
}]
Then, you only need one jq call:
jq --raw-output 'group_by(.ip)[] | [first.ip, (.[] | .host, .id)] | join(" ")' input.json
Once you fix your example so it's valid JSON, the group_by function is the key:
$ jq -r 'group_by(.ip)[] | [.[0].ip, map(.host, .id)[]] | @tsv' input.json
192.168.1.1 cat cherry dog apple
192.168.1.2 cat banana
That will combine all objects with the same ip field into an array of objects. The rest is just turning those arrays of objects into arrays of just the values you want, and finally outputting each new array as a line of tab-separated values.
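To make the two stages visible, here is a small sketch that first shows what group_by(.ip) produces and then flattens each group into one line. It inlines the same three records so it runs on its own; the variable names are just for this example.

```shell
# Same three records as input.json above, inlined so the snippet is self-contained.
records='[{"host":"cat","ip":"192.168.1.1","id":"cherry"},
          {"host":"dog","ip":"192.168.1.1","id":"apple"},
          {"host":"cat","ip":"192.168.1.2","id":"banana"}]'

# Stage 1: group_by(.ip) returns an array of arrays, one inner array per distinct ip.
# Only the ids are kept here to keep the output short.
groups=$(printf '%s' "$records" | jq -c 'group_by(.ip) | map(map(.id))')
printf '%s\n' "$groups"    # [["cherry","apple"],["banana"]]

# Stage 2: for each group, emit the ip once, then host and id of every member.
lines=$(printf '%s' "$records" |
  jq -r 'group_by(.ip)[] | [.[0].ip, (.[] | .host, .id)] | join(" ")')
printf '%s\n' "$lines"
```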

Shell variables as additional fields in json

I'm able to retrieve each user's data into m4-$u.json file with the below shell script
#!/bin/bash
USERID=ricardo.sanchez
PASSWORD=password
PORT=2728
for u in `cat user-list.txt`;
do echo $u;
curl --user $USERID:$PASSWORD http://198.98.99.12:46567/$PORT/protects/$u | jq '.' > m4-$u.json
done
The first few lines of one user's output file, m4-daniel.json, are as follows.
[
{
"depotFile": "//ABND/JJEB/...",
"host": "*",
"isgroup": "",
"line": "16",
"perm": "open",
"user": "5G_USER_GROUP"
},
{
"depotFile": "//LIB/...",
"host": "*",
"isgroup": "",
"line": "19",
"perm": "write",
"user": "6G_USER_GROUP"
},
{
"depotFile": "//AND/RIO/...",
"host": "*",
"isgroup": "",
"line": "20",
"perm": "write",
"user": "AND_USER_GROUP"
},
While the script runs, the values of the $PORT and $u variables also need to be added to each object in every output json file.
Expected json output:-
{
"depotFile": "//ABND/JJEB/...",
"host": "*",
"isgroup": "",
"line": "16",
"perm": "open",
"user": "5G_USER_GROUP",
"port": "2728",
"userid": "daniel"
},
To achieve this any help will be appreciated.
By using --arg with jq you can pass a parameter and use it as a variable
(more info: https://stedolan.github.io/jq/manual/v1.5/#Invokingjq).
By using map and + you can iterate over the array and add new properties to each object.
In your case:
curl --user "$USERID:$PASSWORD" "http://198.98.99.12:46567/$PORT/protects/$u" \
| jq --arg a_port "$PORT" --arg a_userid "$u" 'map(. + {"userid": $a_userid} + {"port": $a_port})' > "m4-$u.json"
Values passed with --arg are always strings, which matches the expected output; use ($a_port|tonumber) instead if the port should be a number.
See addition and map in
https://stedolan.github.io/jq/manual/v1.5/#Builtinoperatorsandfunctions
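To see --arg and map working together without the curl call, here is a minimal sketch. The one-element response and the variable values are stand-ins chosen for the example; a_port and a_userid are simply the names picked for the jq variables.

```shell
# Hypothetical values standing in for the loop variables in the script above.
PORT=2728
u=daniel

# One-element stand-in for the curl response.
response='[{"depotFile":"//LIB/...","perm":"write","user":"6G_USER_GROUP"}]'

# --arg binds its value as a string, so "port" comes out as "2728", not 2728.
# map(...) applies the addition to every object in the array.
result=$(printf '%s' "$response" |
  jq -c --arg a_port "$PORT" --arg a_userid "$u" \
     'map(. + {port: $a_port, userid: $a_userid})')

printf '%s\n' "$result"
```

Quoting "$PORT" and "$u" in the shell keeps the command intact even if a value contains spaces.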

Is it possible to use select as an element in a jq list?

I'm trying to parse the instance id, state, launch time and name using jq from the following partial output:
{
"AmiLaunchIndex": 0,
"ImageId": "ami-07c1483ef8c3dfece",
"InstanceId": "i-0309XXXXXf6d500c",
"InstanceType": "m5.2xlarge",
"KeyName": "k8s-prod-ap-southeast-1",
"LaunchTime": "2019-01-27T12:23:55+00:00",
"Monitoring": {
"State": "enabled"
},
"Placement": {
"AvailabilityZone": "ap-southeast-1c",
"GroupName": "",
"Tenancy": "default"
},
"PrivateDnsName": "ip-X-X-X-X.ap-southeast-1.compute.internal",
"PrivateIpAddress": "X.X.X.X",
"ProductCodes": [],
"PublicDnsName": "",
"State": {
"Code": 16,
"Name": "running"
},
"SourceDestCheck": true,
"Tags": [
{
"Key": "Environment",
"Value": "dev"
},
{
"Key": "Project",
"Value": "someproject"
},
{
"Key": "Name",
"Value": "k8s-prod-ap-southeast-1-mongodb"
},
{
"Key": "aws:autoscaling:groupName",
"Value": "k8s-prod-ap-southeast-1-mongodb-moved-marmoset-20190127122348196400000006"
},
{
"Key": "extra_tag1",
"Value": "extra_value1"
},
{
"Key": "extra_tag2",
"Value": "extra_value2"
}
],
.
.
.
}
Without the instance name which is represented as a tag (.Tags[].Name), running this:
aws ec2 describe-instances --filters "Name=instance.group-id,Values=${group_id}" --profile ${profile} --region ${region} --output json | jq -r '.Reservations[].Instances[] | [ .InstanceId, .State.Name, .LaunchTime ] | @tsv'
yields the following output:
i-01eb8b857e00e61d6 running 2019-01-27T12:23:55+00:00
i-0b013248c2a238598 running 2019-01-27T12:23:55+00:00
i-03094d164ff6d500c running 2019-01-27T12:23:55+00:00
But when I try to display the instance name as well the command fails:
✗ aws ec2 describe-instances --filters "Name=instance.group-id,Values=${group_id}" --profile ${profile} --region ${region} --output json | jq -r '.Reservations[].Instances[] | [ .InstanceId, .State.Name, .LaunchTime, select(.Tags[].Key=="Name" | .Value) ] | @tsv'
jq: error (at <stdin>:448): Cannot index boolean with string "Value"
What am I doing wrong?
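Your use of select is not quite right: .Tags[].Key=="Name" evaluates to a boolean, and the following | .Value then tries to index that boolean, which is what the error message reports. Instead, it seems you want: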
[ .InstanceId,
.State.Name,
.LaunchTime,
(.Tags[] | select(.Key=="Name").Value) ]
| @tsv
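A small sketch of the difference, using a trimmed-down instance record invented for the example: the comparison on its own yields booleans, while iterating the tags first and then selecting yields the tag object whose Value can be read.

```shell
# Trimmed-down, hypothetical instance record with only the fields used here.
instance='{"InstanceId":"i-0123","State":{"Name":"running"},
  "Tags":[{"Key":"Environment","Value":"dev"},{"Key":"Name","Value":"mongodb"}]}'

# .Tags[].Key == "Name" emits one boolean per tag; indexing a boolean with
# "Value" is what triggered the error in the question.
booleans=$(printf '%s' "$instance" | jq -c '[.Tags[].Key == "Name"]')
printf '%s\n' "$booleans"   # [false,true]

# Iterate the tags first, keep the matching tag object, then take its Value.
row=$(printf '%s' "$instance" |
  jq -r '[.InstanceId, .State.Name, (.Tags[] | select(.Key == "Name").Value)] | @tsv')
printf '%s\n' "$row"
```

Note that an instance without a Name tag would produce no row at all with this form, because the inner expression then emits nothing.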
There is no need to use jq to extract values from AWS CLI output.
You can instead use --query, which follows JMESPath syntax.
Here's an example:
aws ec2 describe-instances --query "Reservations[].Instances[].[InstanceId,State.Name,LaunchTime,Tags[?Key=='Name'].Value|[0]]"
The output format can be specified by --output, such as json, text, table or (for AWS CLI v2) yaml.

How to select an element with jq in a nested JSON

I have input like this:
"data": [{
"id": 111585,
"name": "Inverter",
"batList": [{
"name": "Battery1",
"dataDict": [{
"key": "b1_1",
"name": "Battery V.",
"value": 57.63,
"unit": "V"
}, {
"key": "b1_2",
"name": "Battery I.",
"value": -0.10,
"unit": "A"
}, {
"key": "b1_3",
"name": "Battery P.",
"value": -6,
"unit": "W"
}, {
"key": "b1_4",
"name": "Inner T.",
"value": 25,
"unit": "℃"
}, {
"key": "b1_5",
"name": "Remaining Capacity % ",
"value": 99,
"unit": "%"
}]
}]
}],
from which I want to extract the 'value' property (i.e. 99) for "Remaining Capacity % ".
My best amateurish but well searched attempt is
jq --arg instance "Remaining Capacity % " '.data | .[] | select(.name == $instance) | .value')
but I get an empty result. Any help with this nested intransigence would be much appreciated.
Your idea seems about right, but you left out the intermediate paths after .data[]. It should have been
jq --arg instance "Remaining Capacity % " \
'.data[].batList[].dataDict[] | select(.name == $instance ).value' json
An alternative to Inian's answer is to use the select-contains recipe, e.g.:
jq -r '.data[].batList[].dataDict[] | select(.name|contains("Remaining")).value' file
While no better in this example, it is handy to remember especially if you need to find a bunch of values, e.g. contains("Battery") would return three results.
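Both approaches can be compared on a shortened copy of the question's data. This is only a sketch: the fragment is wrapped in braces to make it valid JSON, and fields not needed for the lookup are dropped.

```shell
# Shortened copy of the question's data, wrapped in braces to make it valid JSON.
doc='{"data":[{"batList":[{"dataDict":[
  {"name":"Battery V.","value":57.63},
  {"name":"Battery I.","value":-0.10},
  {"name":"Battery P.","value":-6},
  {"name":"Inner T.","value":25},
  {"name":"Remaining Capacity % ","value":99}]}]}]}'

# Exact match: the name must equal the argument, including the trailing space.
exact=$(printf '%s' "$doc" | jq --arg instance "Remaining Capacity % " \
  '.data[].batList[].dataDict[] | select(.name == $instance).value')
printf '%s\n' "$exact"   # 99

# Substring match: count how many entries have "Battery" in their name.
count=$(printf '%s' "$doc" |
  jq '[.data[].batList[].dataDict[] | select(.name | contains("Battery"))] | length')
printf '%s\n' "$count"   # 3
```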

How can I filter by a numeric field using jq?

I am writing a script to query the Bitbucket API and delete SNAPSHOT artifacts that have never been downloaded. This script is failing because it gets ALL snapshot artifacts; the select for the number of downloads does not appear to be working.
What is wrong with my select statement to filter objects by the number of downloads?
Of course the more direct solution here would be if I could just query the Bitbucket API with a filter. To the best of my knowledge the API does not support filtering by downloads.
My script is:
#!/usr/bin/env bash
curl -X GET --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=100" > downloads.json
# get all values | reduce the set to just be name and downloads | select entries where downloads is zero | select entries where name contains SNAPSHOT | just get the name
#TODO i screwed up the selection somewhere its returning files that contain SNAPSHOT regardless of number of downloads
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#unique sort, not sure why jq gives me multiple values
sort -u snapshots_without_any_downloads.js | tr -d '"' > unique_snapshots_without_downloads.js
cat unique_snapshots_without_downloads.js | xargs -t -I % curl -Ss -X DELETE --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/%" > deleted_files.txt
A deidentified sample of the raw input from the API is:
{
"pagelen": 10,
"size": 40,
"values": [
{
"name": "myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
}
},
"downloads": 2,
"created_on": "2018-03-15T17:50:00.157310+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430894
},
{
"name": "myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
}
},
"downloads": 0,
"created_on": "2018-03-15T17:50:00.157310+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430894
},
{
"name": "myproject_1.0_mc_3.5.1.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.1.zip"
}
},
"downloads": 5,
"created_on": "2018-03-15T17:49:14.885544+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430934
}
],
"page": 1,
"next": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=10&page=2"
}
The output I want from this snippet is myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip - that artifact is a SNAPSHOT and has zero downloads.
I have used this intermediate step to do some debugging:
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads>0) | select(.name | contains("SNAPSHOT")) | unique' downloads.json > snapshots_with_downloads.js
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#this returns the same values for each list!
diff unique_snapshots_with_downloads.js unique_snapshots_without_downloads.js
This adjustment gives a cleaner and unique structure, which suggests that there's some sort of splitting or streaming aspect of jq that I do not fully understand:
#this returns a "unique" array like I expect, adding select to this still does not produce the desired outcome
jq '.values | [{name: .[].name, downloads: .[].downloads}] | unique' downloads.json
The data after this step looks like this. It just removed the cruft I didn't need from the raw API response:
[
{
"name": "myproject_1.0_2400a51_mc_3.4.0.zip",
"downloads": 0
},
{
"name": "myproject_1.0_2400a51_mc_3.4.1.zip",
"downloads": 2
},
{
"name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.0.zip",
"downloads": 0
},
{
"name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.1.zip",
"downloads": 2
}
]
As I understand it:
You want globally unique outputs
You want only items with downloads==0
You want only items whose name contains "SNAPSHOT"
The following will accomplish that:
jq -r '
[.values[] | {(.name): .downloads}]
| add
| to_entries[]
| select(.value == 0)
| .key | select(contains("SNAPSHOT"))'
Rather than making unique an explicit step, this version generates a map from names to download counters by merging the per-item objects with add (which means that in case of conflicts, the last one wins), and thereby ensures that the outputs are unique.
Given your test JSON, output is:
myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip
Applied to the overall problem context, this strategy can be used to simplify the overall process:
jq -r '[.values[] | {(.links.self.href): .downloads}] | add | to_entries[] | select(.value == 0) | .key | select(contains("SNAPSHOT"))'
It simplifies the overall process by acting on the URL to the file rather than the name only. This simplifies the subsequent DELETE call. The sort and tr calls can also be removed.
Here's a solution which sums up the .downloads values per .name before making the selection based on the total number of downloads:
reduce (.values[] | select(.name | contains("SNAPSHOT"))) as $v
({}; .[$v.name] += $v.downloads)
| with_entries(select(.value == 0))
| keys_unsorted[]
Example:
$ jq -r -f program.jq input.json
myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip
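The two answers above treat a name that appears more than once differently: merging with add keeps only the last count, while the reduce version accumulates the counts. The sketch below shows this on made-up data in which a.zip is listed twice; -S only sorts the keys so the output is easy to compare.

```shell
# Made-up entries in which "a.zip" appears twice with different counts.
dupes='{"values":[{"name":"a.zip","downloads":2},
                  {"name":"b.zip","downloads":0},
                  {"name":"a.zip","downloads":0}]}'

# Merging with add: the later "a.zip" entry overwrites the earlier one.
last_wins=$(printf '%s' "$dupes" |
  jq -c -S '[.values[] | {(.name): .downloads}] | add')
printf '%s\n' "$last_wins"   # {"a.zip":0,"b.zip":0}

# Summing with reduce: counts for the same name accumulate.
summed=$(printf '%s' "$dupes" |
  jq -c -S 'reduce .values[] as $v ({}; .[$v.name] += $v.downloads)')
printf '%s\n' "$summed"      # {"a.zip":2,"b.zip":0}
```

If duplicate names can occur in the real data, the summing variant is the safer basis for deciding what to delete, since the merge variant would report a.zip here as never downloaded.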
p.s.
What is wrong with my select statement ...?
The problem that jumps out is the bit of the pipeline just before the "select" filter:
.values | {name: .[].name, downloads: .[].downloads}
The use of .[] in this manner results in the Cartesian product being formed: each .[] iterates independently, so the above expression will emit n*n JSON objects, where n is the length of .values. You evidently intended to write:
.values[] | {name: .name, downloads: .downloads}
which can be abbreviated to:
.values[] | {name, downloads}
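The effect is easy to reproduce with two made-up entries. In this sketch the first filter pairs every name with every download count, and the second iterates once so each name stays with its own count.

```shell
# Two made-up entries are enough to show the effect.
two='{"values":[{"name":"x","downloads":0},{"name":"y","downloads":5}]}'

# Two independent .[] iterations: 2 names x 2 counts = 4 objects, including
# the bogus pairing of "y" with 0 downloads.
product=$(printf '%s' "$two" |
  jq -c '.values | {name: .[].name, downloads: .[].downloads}')
printf '%s\n' "$product"

# Iterating once keeps each name with its own count: 2 objects.
paired=$(printf '%s' "$two" | jq -c '.values[] | {name, downloads}')
printf '%s\n' "$paired"
```

This is also why the question's select(.downloads==0) appeared to have no effect: every name shows up at least once next to a zero count.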