jq transform to array and sort - json

Suppose I have the following input.log file:
{"foo": "1", "foo2": "2"}
{"foo": "3", "foo2": "4"}
{"foo": "5", "foo2": "6"}
{"foo": "7", "foo2": "8"}
I want to parse this using jq and sort the result based on the value of some common key, let's say the "foo" key.
How could I accomplish that?

To sort, you need an array, which you can obtain using --slurp/-s.
jq -sc 'sort_by( .foo )[]' input.log
Demo on jqplay
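For readers who want to check what --slurp followed by sort_by does, here is a minimal Python sketch of the same steps. It assumes the four records of input.log from the question, embedded as a string and shuffled so the sort has a visible effect; the name sort_records is made up for illustration. Note that the foo values are strings, so both jq and this sketch compare them as strings. With multi-digit values you would probably want sort_by(.foo | tonumber) in jq, or an int(...) conversion in the key function here.

```python
import json

# Records of input.log from the question, shuffled so the sort has a visible effect.
INPUT_LOG = """\
{"foo": "5", "foo2": "6"}
{"foo": "1", "foo2": "2"}
{"foo": "7", "foo2": "8"}
{"foo": "3", "foo2": "4"}
"""

def sort_records(text, key):
    # Equivalent of --slurp: collect every non-empty line into one list.
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    # Equivalent of sort_by(.foo): order the list by the value stored under `key`.
    # Assumes every record has that key.
    return sorted(records, key=lambda rec: rec[key])

# Equivalent of the trailing [] with -c: print one compact object per line.
for rec in sort_records(INPUT_LOG, "foo"):
    print(json.dumps(rec, separators=(",", ":")))
```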

Related

using jq to merge two json values based on key value

I have two JSON files that I would like to merge based on the value of a key. The key name is different in the two files, but the value is the same. I am trying to do this with jq. Most of the examples I have found merge based on key name, not value.
sample1.json
[
{
"unique_id": "pp1234",
"unique_id_type": "netid",
"rfid": "12245556890478",
},
{
"unique_id": "aqe123",
"unique_id_type": "netid",
"rfid": "12234556890478",
}
]
sample2.json
[
{
"mailing_state": "New York",
"mobile_phone_number": "(982) 2541212",
"netid": "pp1234",
"netid_reachable": "Y",
},
{
"mailing_state": "New York",
"mobile_phone_number": "(982) 5551212",
"netid": "aqe123",
"netid_reachable": "Y",
}
]
I would want the output to look something like:
results.json
[
{
"unique_id": "pp1234",
"unique_id_type": "netid",
"rfid": "12245556890478",
"mailing_state": "New York",
"mobile_phone_number": "(982) 2541212",
"netid_reachable": "Y",
},
{
"unique_id": "aqe123",
"unique_id_type": "netid",
"rfid": "12234556890478",
"mailing_state": "New York",
"mobile_phone_number": "(982) 5551212",
"netid_reachable": "Y",
}
]
The order of results does not matter as long as the records are merged based on the netid/unique_id keys. I am open to using something other than jq if necessary. Thanks in advance.
Once the sample input files have been corrected, the following invocation should do the trick:
jq --argfile uid sample1.json '
($uid | INDEX(.unique_id)) as $dict
| map( $dict[.netid] + del(.netid) )
' sample2.json
If you prefer not to use --argfile because it has been deprecated, you could (for example) use --slurpfile and change $uid in the jq program to $uid[0].
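Since the question is open to tools other than jq, the same index-then-merge idea can be sketched in Python. This is only an illustration under stated assumptions: the sample files are embedded as strings with the trailing commas already removed, and merge_by_id is a made-up helper name.

```python
import json

# Corrected versions of the sample files (trailing commas removed),
# embedded as strings so the sketch is self-contained.
SAMPLE1 = '''[
  {"unique_id": "pp1234", "unique_id_type": "netid", "rfid": "12245556890478"},
  {"unique_id": "aqe123", "unique_id_type": "netid", "rfid": "12234556890478"}
]'''
SAMPLE2 = '''[
  {"mailing_state": "New York", "mobile_phone_number": "(982) 2541212",
   "netid": "pp1234", "netid_reachable": "Y"},
  {"mailing_state": "New York", "mobile_phone_number": "(982) 5551212",
   "netid": "aqe123", "netid_reachable": "Y"}
]'''

def merge_by_id(uid_records, netid_records):
    # Like INDEX(.unique_id): build a dictionary keyed by unique_id.
    by_id = {rec["unique_id"]: rec for rec in uid_records}
    merged = []
    for rec in netid_records:
        # Like del(.netid): copy the record without its netid key.
        rest = {k: v for k, v in rec.items() if k != "netid"}
        # Like $dict[.netid] + ...: an unknown id contributes nothing.
        merged.append({**by_id.get(rec["netid"], {}), **rest})
    return merged

results = merge_by_id(json.loads(SAMPLE1), json.loads(SAMPLE2))
print(json.dumps(results, indent=2))
```

Building the dictionary first means each record of the second file is matched with a single lookup rather than a scan of the first file.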

Get parent value from json using jq

My JSON file looks like this:
{
"RQBTYFE86MFC3oL": {
"name": "Nightmode",
"lights": [
"1",
"2",
"3",
"4",
"5",
"7",
"8",
"9",
"10",
"11"
],
"owner": "kvovodUUfn2vlby9h9okdDhv8SrTzkBFjk6kPz2v",
"recycle": false,
"locked": false,
"appdata": {
"version": 1,
"data": "QSXCj_r01_d99"
},
"picture": "",
"lastupdated": "2018-08-08T03:21:39",
"version": 2
}
}
I want to get the 'RQBTYFE86MFC3oL' value by doing a query for 'Nightmode'. So far I came up with this:
jq '.[] | select(.name == "Nightmode")'
This returns the correct part of the JSON, but the 'RQBTYFE86MFC3oL' part is stripped. How do I get this part as well?
A simple way to determine the key name(s) corresponding to values satisfying a certain condition is to use to_entries, as explained in the jq manual.
Using this approach, the appropriate jq filter would be:
to_entries[] | select(.value.name == "Nightmode") | .key
with the result:
"RQBTYFE86MFC3oL"
If you want to get the key-value pair, you'd use with_entries as follows:
with_entries( select(.value.name == "Nightmode") )
If the input JSON is too large to fit comfortably in memory, then it would make sense to use jq's streaming parser (invoked with the --stream command-line option):
jq --stream '
select(.[1] == "Nightmode" and (first|length) == 2 and first[1] == "name")
| first | first'
This would produce the key name.
The key idea is that the streaming parser produces arrays including pairs of the form: [ARRAYPATH, VALUE] where VALUE is the value at ARRAYPATH.
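To make the [ARRAYPATH, VALUE] idea concrete, here is a small Python sketch that produces similar pairs and applies the same test as the streaming filter. It is a simplification: it works on an already loaded document, so it shows the shape of the events but not the memory savings, and it skips the closing events and empty containers that jq --stream also reports. The names stream_events and keys_named are made up, and the document is a shortened version of the JSON in the question.

```python
import json

def stream_events(value, path=()):
    # Yield (path, leaf) pairs, similar to the [ARRAYPATH, VALUE] events.
    if isinstance(value, dict):
        for key, child in value.items():
            yield from stream_events(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from stream_events(child, path + (index,))
    else:
        yield path, value

def keys_named(doc, name):
    # Same test as the jq filter: the path is [KEY, "name"] and the value matches.
    return [path[0] for path, leaf in stream_events(doc)
            if leaf == name and len(path) == 2 and path[1] == "name"]

# Shortened version of the JSON from the question.
doc = json.loads(
    '{"RQBTYFE86MFC3oL": {"name": "Nightmode", "lights": ["1", "2"],'
    ' "appdata": {"version": 1}}}'
)

for path, leaf in stream_events(doc):
    print(list(path), leaf)
print(keys_named(doc, "Nightmode"))
```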
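You want the key rather than the value.
The keys builtin returns the top-level key names as an array, here containing 'RQBTYFE86MFC3oL'; everything else is the value of that key. Note that keys does not filter by name, so it answers the question only when the object has a single entry.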
jq 'keys'
Here is a snippet: https://jqplay.org/s/YvpCb2PH42
Reference: https://stedolan.github.io/jq/manual/

Parsing JSON format with jq

I need to parse the output from lsblk. Since I am doing this from within a script I need the output in a standardized format. Therefore I chose the JSON format as output. Here is the command with some sample output:
# lsblk -o NAME,MOUNTPOINT -J
{
"blockdevices": [
{"name": "sda", "mountpoint": null,
"children": [
{"name": "sda1", "mountpoint": "/sda1/mountpoint"},
{"name": "sda2", "mountpoint": null,
"children": [
{"name": "sda2_mapper", "mountpoint": "/sda2/mountpoint"}
]
},
{"name": "sda3", "mountpoint": null},
{"name": "sda4", "mountpoint": null}
]
},
{"name": "sdb", "mountpoint": null,
"children": [
{"name": "sdb1", "mountpoint": "/sdb1/mountpoint"},
{"name": "sdb2", "mountpoint": null}
]
},
{"name": "sdc", "mountpoint": null}
]
}
I want to extract the names of all innermost nodes, i.e., the name of all nodes that do not have children. The desired output for the above sample would be:
sda1
sda2_mapper
sda3
sda4
sdb1
sdb2
sdc
My tool of choice is jq which I have only recently discovered. I have tried
# jq '.blockdevices[].children[]?.name?'
But this only filters the first level of names. I also tried with
# jq 'recurse(.name?)'
but this returns the entire file.
Is there a way to return only nodes that do not have children, no matter how deep they are nested?
PS: I am capable of implementing the requirement in bash or awk. I would, however, prefer a solution with a tool like jq, whose specific purpose is to parse JSON files.
I don't think this is the simplest way to do it, but it seems to work:
$ jq -r '.blockdevices[] | .. | objects | select(has("children")|not)| .name' tmp.json
sda1
sda2_mapper
sda3
sda4
sdb1
sdb2
sdc
It recursively outputs each value found in the JSON, filtering out first anything that is not an object, then any object that has a children key. Finally, you can select the name value from each remaining object.
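The walk described above can be sketched in Python as a check on the logic. This is an illustration only: the lsblk output from the question is embedded as a string, and leaf_names is a made-up helper name.

```python
import json

# The lsblk sample from the question, embedded so the sketch is self-contained.
LSBLK = '''{"blockdevices": [
  {"name": "sda", "mountpoint": null, "children": [
    {"name": "sda1", "mountpoint": "/sda1/mountpoint"},
    {"name": "sda2", "mountpoint": null, "children": [
      {"name": "sda2_mapper", "mountpoint": "/sda2/mountpoint"}]},
    {"name": "sda3", "mountpoint": null},
    {"name": "sda4", "mountpoint": null}]},
  {"name": "sdb", "mountpoint": null, "children": [
    {"name": "sdb1", "mountpoint": "/sdb1/mountpoint"},
    {"name": "sdb2", "mountpoint": null}]},
  {"name": "sdc", "mountpoint": null}
]}'''

def leaf_names(node):
    # Visit every value like `..`, keep only objects, and yield the name
    # of each object that has no "children" key.
    if isinstance(node, dict):
        if "children" not in node:
            yield node["name"]
        for child in node.values():
            yield from leaf_names(child)
    elif isinstance(node, list):
        for child in node:
            yield from leaf_names(child)

# Start below .blockdevices[], as the jq filter does.
names = [name
         for device in json.loads(LSBLK)["blockdevices"]
         for name in leaf_names(device)]
print("\n".join(names))
```

Starting below blockdevices matters in both versions: the top-level object has no children key either, so including it would add an unwanted entry to the result.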
With your JSON input, the following command:
jq '.. | scalars'
emits the "leaves" (every scalar value, including the null mountpoints), beginning:
"sda"
null
"sda1"
"/sda1/mountpoint"
Use the -r (raw output) to strip the quotation marks from strings.

Select or exclude multiple objects with an array of IDs

I have the following JSON :
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
},
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]
And I have the following array stored in a variable: ["1", "2", "56", "1337"] (note the IDs are strings, and may contain any regular character).
So, thanks to this SO, I found a way to filter my original data. jq '[.[] | select(.id == ("1", "2", "56", "1337"))]' ./data.json (note the array is surrounded by parentheses and not brackets) produces:
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
But I would also like to do the opposite (basically excluding IDs instead of selecting them). Using select(.id != ("1", "2", "56", "1337")) doesn't work, and using jq '[. - [.[] | select(.id == ("1", "2", "56", "1337"))]]' ./data.json seems very ugly and doesn't work with my actual data (an output of aws ec2 describe-instances).
So do you have any idea how to do that? Thank you!
To include them, you need to verify that the id is any of the values in the keep set.
$ jq --argjson include '["1", "2", "56", "1337"]' 'map(select(.id == $include[]))' ...
To exclude them, you need to verify that all values are not in your excluded set. But it might just be easier to take the original set and remove the items that are in the excluded set.
$ jq --argjson exclude '["1", "2", "56", "1337"]' '. - map(select(.id == $exclude[]))' ...
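The reason select(.id != ("1", "2", "56", "1337")) does not work is that the comparison runs once per listed ID, so an object is emitted every time its id differs from one of them. Exclusion needs "differs from all of them", while inclusion only needs "equals any of them". A minimal Python sketch of both tests, using the data from the question (select_ids and exclude_ids are made-up names):

```python
import json

# data.json from the question, embedded so the sketch is self-contained.
DATA = json.loads('''[
  {"id": "1", "foo": "bar-a", "hello": "world-a"},
  {"id": "2", "foo": "bar-b", "hello": "world-b"},
  {"id": "10", "foo": "bar-c", "hello": "world-c"},
  {"id": "42", "foo": "bar-d", "hello": "world-d"}
]''')
IDS = ["1", "2", "56", "1337"]

def select_ids(records, ids):
    # Keep a record when its id equals ANY of the given ids.
    wanted = set(ids)
    return [rec for rec in records if rec["id"] in wanted]

def exclude_ids(records, ids):
    # Keep a record only when its id differs from ALL of the given ids.
    unwanted = set(ids)
    return [rec for rec in records if rec["id"] not in unwanted]

print(json.dumps(select_ids(DATA, IDS), indent=2))
print(json.dumps(exclude_ids(DATA, IDS), indent=2))
```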
Here is a solution that uses inside. Assuming you run jq as
jq -M --argjson IDS '["1","2","56","1337"]' -f filter.jq data.json
This filter.jq
map( select([.id] | inside($IDS)) )
produces the objects from data.json whose id is in the $IDS array:
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
and this filter.jq
map( select([.id] | inside($IDS) | not) )
produces the objects from data.json whose id is not in the $IDS array:
[
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]

How to use `jq` to obtain the keys

My json looks like this :
{
"20160522201409-jobsv1-1": {
"vmStateDisplayName": "Ready",
"servers": {
"20160522201409 jobs_v1 1": {
"serverStateDisplayName": "Ready",
"creationDate": "2016-05-22T20:14:22.000+0000",
"state": "READY",
"provisionStatus": "PENDING",
"serverRole": "ROLE",
"serverType": "SERVER",
"serverName": "20160522201409 jobs_v1 1",
"serverId": 2902
}
},
"isAdminNode": true,
"creationDate": "2016-05-22T20:14:23.000+0000",
"totalStorage": 15360,
"shapeId": "ot1",
"state": "READY",
"vmId": 4353,
"hostName": "20160522201409-jobsv1-1",
"label": "20160522201409 jobs_v1 ADMIN_SERVER 1",
"ipAddress": "10.252.159.39",
"publicIpAddress": "10.252.159.39",
"usageType": "ADMIN_SERVER",
"role": "ADMIN_SERVER",
"componentType": "jobs_v1"
}
}
My key keeps changing from time to time. So for example 20160522201409-jobsv1-1 may be something else tomorrow. Also, I may have more than one such entry in the JSON payload.
I want to echo $KEYS and I am trying to do it using jq.
Things I have tried :
| jq .KEYS is the command I use frequently.
Is there a jq command to display all the primary keys in the json?
I only care about the hostname field. And I would like to extract that out. I know how to do it using grep but it is NOT a clean approach.
You can simply use keys:
% jq 'keys' my.json
[
"20160522201409-jobsv1-1"
]
And to get the first:
% jq -r 'keys[0]' my.json
20160522201409-jobsv1-1
-r is for raw output:
--raw-output / -r: With this option, if the filter’s result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.
Source
If you want a known value below an unknown property, e.g. xxx.hostName:
% jq -r '.[].hostName' my.json
20160522201409-jobsv1-1
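Because the question mentions that the payload may hold more than one entry, here is a small Python sketch of what keys and .[].hostName return in that case. The payload is trimmed to two fields per entry, and the second entry is hypothetical, added only to show the multi-entry behaviour.

```python
import json

# Trimmed payload; the second entry is hypothetical and not from the question.
PAYLOAD = json.loads('''{
  "20160522201409-jobsv1-1": {"hostName": "20160522201409-jobsv1-1", "state": "READY"},
  "20160523101500-jobsv1-2": {"hostName": "20160523101500-jobsv1-2", "state": "READY"}
}''')

# Like jq 'keys': the top-level key names, which jq returns in sorted order.
top_level_keys = sorted(PAYLOAD)

# Like jq -r '.[].hostName': the hostName of every entry, whatever its key is.
host_names = [entry["hostName"] for entry in PAYLOAD.values()]

print(top_level_keys)
print("\n".join(host_names))
```

With several entries, keys[0] would pick only the first key in sorted order, whereas .[].hostName yields one line per entry.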