How to get the max value from JSON?

There is a JSON file like this:
[
  {
    "createdAt": 1548729542000,
    "platform": "foo"
  },
  {
    "createdAt": 1548759398000,
    "platform": "foo"
  },
  {
    "createdAt": 1548912360000,
    "platform": "foo"
  },
  {
    "createdAt": 1548904550000,
    "platform": "bar"
  }
]
Now I want to get the max createdAt for the foo platform. How can I implement that using jq? I tried the following, but both attempts fail:
jq '.[] | select(.platform=="foo") | .createdAt | max' foo.json
jq: error (at <stdin>:17): number (1548729542000) and number (1548729542000) cannot be iterated over
jq '.[] | select(.platform=="foo") | max_by(.createdAt)' foo.json
jq: error (at <stdin>:17): Cannot index number with string "createdAt"
exit status 5

max expects an array as input, but the filters above feed max (or max_by) one value at a time.
$ jq 'map(select(.platform == "foo") .createdAt) | max' file
1548912360000

One approach would be to make the selection and then use one of the array-oriented builtins max or max_by to find the maximum, e.g.
map(select(.platform=="foo"))
| max_by(.createdAt)
| .createdAt
However, this approach is not very satisfactory as it requires more space than is strictly necessary. For large arrays, a stream-oriented version of max_by would be better.
max_by
def max_by(s; f):
  reduce s as $s (null;
    if . == null then {s: $s, m: ($s|f)}
    else ($s|f) as $m
    | if $m > .m then {s: $s, m: $m} else . end
    end)
  | .s ;

max_by(.[] | select(.platform=="foo"); .createdAt)
| .createdAt
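For reference, assuming the above program is saved as max.jq, running it against the sample file yields the expected maximum:
$ jq -f max.jq foo.json
1548912360000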

Related

Append multiple dummy objects to json array with jq

Let's say this is my array:
[
  {
    "name": "Matias",
    "age": "33"
  }
]
I can do this:
echo "$response" | jq '[ .[] | select(.name | test("M.*"))] | . += [.[]]'
And it will output:
[
  {
    "name": "Matias",
    "age": "33"
  },
  {
    "name": "Matias",
    "age": "33"
  }
]
But I can't do this:
echo "$response" | jq '[ .[] | select(.name | test("M.*"))] | . += [.[] * 3]'
jq: error (at <stdin>:7): object ({"name":"Ma...) and number (3) cannot be multiplied
I need to extend the array to create a dummy array with 100 values, but I can't do it. Also, I would like the objects to have random ages, so that later on I can filter the file and measure the performance of an app.
Currently jq does not have a built-in randomization function, but it's easy enough to generate random numbers that jq can use. The following solution uses awk, but in a way that allows some other PRNG to be substituted easily.
#!/bin/bash

# Emit the template document.
function template {
  cat <<EOF
[
  {
    "name": "Matias",
    "age": "33"
  }
]
EOF
}

# Print n pseudo-random integers in [0,100), one per line.
# srand() seeds from the clock so that each run differs.
function randoms {
  awk -v n=$1 'BEGIN { srand(); for(i=0;i<n;i++) {print int(100*rand())} }'
}

# Pick the first matching object from the template, then emit an array
# containing one copy of it per random age read from stdin.
randoms 100 | jq -n --argfile template <(template) '
  first($template[] | select(.name | test("M.*"))) as $t
  | [ $t | .age = inputs ]
'
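For illustration only: the ages are drawn afresh on each run (thanks to srand()), so the output varies, and .age comes back as a number rather than a string because the values are read directly from awk via inputs. A run might begin like:
[
  {
    "name": "Matias",
    "age": 57
  },
  {
    "name": "Matias",
    "age": 3
  },
  ...
]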
Note on performance
Even though the above uses awk and jq together, this combination is about 10 times faster than the jtc solution posted in another answer using -eu:
jq+awk: u+s = 0.012s
jtc with -eu: u+s = 0.192s
Using jtc in conjunction with awk as above, however, gives u+s == 0.008s on the same machine.

jq conditional processing on multiple files

I have multiple JSON files:
a.json
{
  "status": "passed",
  "id": "id1"
}
{
  "status": "passed",
  "id": "id2"
}
b.json
{
  "status": "passed",
  "id": "id1"
}
{
  "status": "failed",
  "id": "id2"
}
I want to know which ids had status "passed" in a.json but are now "failed" in b.json.
expected.json
{
  "status": "failed",
  "id": "id2"
}
I tried something like:
jq --slurpfile a a.json --slurpfile b b.json -n '$a[] | reduce select(.status == "passed") as $passed (.; $b | select($a.id == .id and .status == "failed"))'
$passed is supposed to contain the passed entries from a.json, and the reduce is meant to merge all the objects whose id matches and whose status is failed.
However, it does not produce the expected result, and the documentation is of limited help.
How can I produce expected.json from a.json and b.json?
For me your filter produces the error
jq: error (at <unknown>): Cannot index array with string "id"
I suspect this is because you wrote $b instead of $b[] and $a.id instead of $passed.id. Here is my guess at what you intended to write:
$a[]
| reduce select(.status == "passed") as $passed (.;
$b[] | select( $passed.id == .id and .status == "failed")
)
which produces the output
null
{
  "status": "failed",
  "id": "id2"
}
You can filter away the null by adding | values, e.g.
$a[]
| reduce select(.status == "passed") as $passed (.;
$b[] | select( $passed.id == .id and .status == "failed")
)
| values
However, you don't really need reduce here. A simpler way is just:
$a[]
| select(.status == "passed") as $passed
| $b[]
| select( $passed.id == .id and .status == "failed")
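For reference, this variant can be run directly against the sample files:
jq -n --slurpfile a a.json --slurpfile b b.json '
  $a[]
  | select(.status == "passed") as $passed
  | $b[]
  | select( $passed.id == .id and .status == "failed")'
which emits only the expected object.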
If you intend to go further with this, I would recommend a different approach: first construct an object combining $a and $b, and then project what you want from it, e.g.
reduce (($a[] | {(.id): {a: .status}}), ($b[] | {(.id): {b: .status}})) as $v ({}; . * $v)
will give you
{
  "id1": {
    "a": "passed",
    "b": "passed"
  },
  "id2": {
    "a": "passed",
    "b": "failed"
  }
}
To convert that back to the output you requested, add
| keys[] as $id
| .[$id]
| select(.a == "passed" and .b == "failed")
| {$id, status:.b}
to obtain
{
  "id": "id2",
  "status": "failed"
}
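Putting the two steps together, a complete invocation would look like this:
jq -n --slurpfile a a.json --slurpfile b b.json '
  reduce (($a[] | {(.id): {a: .status}}), ($b[] | {(.id): {b: .status}})) as $v ({}; . * $v)
  | keys[] as $id
  | .[$id]
  | select(.a == "passed" and .b == "failed")
  | {$id, status: .b}'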
The following solutions to the problem are oriented primarily towards efficiency, but it turns out that they are quite straightforward and concise.
For efficiency, we will construct a "dictionary" of ids of those who have passed in a.json to make the required lookup very fast.
Also, if you have a version of jq with inputs, it is easy to avoid "slurping" the contents of b.json.
Solution slurping both files
Here is a generic solution which, however, slurps both files:
Invocation (note the use of the -s option):
jq -s --slurpfile a a.json -f passed-and-failed.jq b.json
Program:
([$a[] | select(.status=="passed") | {(.id): true}] | add) as $passed
| .[] | select(.status == "failed" and $passed[.id])
That is, first construct the dictionary, and then emit the objects in b.json that satisfy the condition.
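With the sample a.json, the $passed dictionary is simply:
{
  "id1": true,
  "id2": true
}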
Solution for jq 1.5 or higher
Invocation (note the use of the -n option):
jq -n --slurpfile a a.json -f passed-and-failed.jq b.json
INDEX/2 is currently available from the master branch, but is provided here in case your jq does not have it, in which case you might want to add its definition to ~/.jq:
def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row | idx_expr
           | if type != "string" then tojson else . end] |= $row);
The solution now becomes a simple two-liner:
INDEX($a[] | select(.status == "passed") | .id; .) as $passed
| inputs | select(.status == "failed" and $passed[.id])
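Either way, running against the sample files emits the expected object:
{
  "status": "failed",
  "id": "id2"
}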

Parsing multiple key/values in json tree with jq

Using jq, I'd like to cherry-pick key/value pairs from the following json:
{
  "project": "Project X",
  "description": "This is a description of Project X",
  "nodes": [
    {
      "name": "server001",
      "detail001": "foo",
      "detail002": "bar",
      "networks": [
        {
          "net_tier": "network_tier_001",
          "ip_address": "10.1.1.10",
          "gateway": "10.1.1.1",
          "subnet_mask": "255.255.255.0",
          "mac_address": "00:11:22:aa:bb:cc"
        }
      ],
      "hardware": {
        "vcpu": 1,
        "mem": 1024,
        "disks": [
          {
            "disk001": 40,
            "detail001": "foo"
          },
          {
            "disk002": 20,
            "detail001": "bar"
          }
        ]
      },
      "os": "debian8",
      "geo": {
        "region": "001",
        "country": "Sweden",
        "datacentre": "Malmo"
      },
      "detail003": "baz"
    }
  ],
  "detail001": "foo"
}
For the sake of an example, I'd like to pull out the following keys and their values: "project", "name", "net_tier", "vcpu", "mem", "disk001", "disk002".
I'm able to parse individual elements without much issue, but due to the hierarchical nature of the document I've not had much luck parsing down different branches (i.e. both networks and hardware > disks) at the same time.
Any help appreciated.
Edit:
For clarity, the output I'm going for is comma-separated CSV. As for handling all combinations, covering the sample data in the example will do for now; I will hopefully be able to expand on any suggestions.
Here is a different filter which computes the unique set of network tier and disk names and then generates a result with columns appropriate to the data.
{ tiers: [ .nodes[].networks[].net_tier ] | unique
, disks: [ .nodes[].hardware.disks[] | keys[] | select(startswith("disk")) ] | unique
} as $n
| def column_names($n): [ "project", "name" ] + $n.tiers + ["vcpu", "mem"] + $n.disks ;
  def tiers($n): [ $n.tiers[] as $t | .networks[] | if .net_tier==$t then $t else null end ] ;
  def disks($n): [ $n.disks[] as $d | map(select(.[$d]!=null)|.[$d])[0] ] ;
  def rows($n):
      .project as $project
    | .nodes[]
    | .name as $name
    | tiers($n) as $tier_values
    | .hardware
    | .vcpu as $vcpu
    | .mem as $mem
    | .disks
    | disks($n) as $disk_values
    | [$project, $name] + $tier_values + [$vcpu, $mem] + $disk_values
  ;
  column_names($n), rows($n)
| @csv
The benefit of this approach becomes apparent if we add another node to the sample data:
{
  "name": "server002",
  "networks": [
    {
      "net_tier": "network_tier_002"
    }
  ],
  "hardware": {
    "vcpu": 1,
    "mem": 1024,
    "disks": [
      {
        "disk002": 40,
        "detail001": "foo"
      }
    ]
  }
}
Sample Run (assuming filter in filter.jq and amended data in data.json)
$ jq -Mr -f filter.jq data.json
"project","name","network_tier_001","network_tier_002","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001","",1,1024,40,20
"Project X","server002",,"network_tier_002",1,1024,,40
Here's one way you could achieve the desired output.
program.jq:
["project","name","net_tier","vcpu","mem","disk001","disk002"],
[.project]
+ (.nodes[] | .networks[] as $n |
[
.name,
$n.net_tier,
(.hardware |
.vcpu,
.mem,
(.disks | add["disk001","disk002"])
)
]
)
| #csv
$ jq -r -f program.jq input.json
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20
Basically, you'll want to project the fields you want into arrays so you can convert those arrays to CSV rows. Your input makes it seem like there could potentially be multiple networks for a given node, so if you wanted to output all combinations, that would have to be flattened out.
Here's another approach, that is short enough to speak for itself:
def s(f): first(.. | f? // empty) // null;
[s(.project), s(.name), s(.net_tier), s(.vcpu), s(.mem), s(.disk001), s(.disk002)]
| @csv
Invocation:
$ jq -r -f value-pairs.jq input.json
Result:
"Project X","server001","network_tier_001",1,1024,40,20
With headers
Using the same s/1 as above:
. as $d
| ["project", "name", "net_tier", "vcpu", "mem", "disk001","disk002"]
| (., map( . as $v | $d | s(.[$v])))
| @csv
With multiple nodes
Again with s/1 as above:
.project as $p
| ["project", "name", "net_tier", "vcpu", "mem", "disk001","disk002"] as $h
| ($h,
(.nodes[] as $d
| $h
| map( . as $v | $d | s(.[$v]) )
| .[0] = $p)
) | @csv
Output with the illustrative multi-node data:
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20
"Project X","server002","network_tier_002",1,1024,,40

Deep JSON merge

I have multiple JSON files that I'd like to merge into one.
Some have the same root element but different children. I don't want to overwrite the children, but to extend them if they have the same parent element.
I've tried this answer, but it doesn't work:
jq: error (at file2.json:0): array ([{"title":"...) and array ([{"title":"...) cannot be multiplied
Sample files and wanted result (Gist)
Thank you in advance.
Here is a recursive solution which uses group_by(.key) to decide which objects to combine. This could be a little simpler if .children were more uniform; sometimes it's absent in the sample data, and sometimes it has the unusual value [{}].
def merge:
  def kids:
    map(
      .children
      | if length < 1 then empty else .[] end
    )
    | if length < 1 then {} else {children: merge} end
  ;
  def mergegroup:
    {
      title: .[0].title
    , key: .[0].key
    } + kids
  ;
  if . == [{}] then .
  else group_by(.key) | map(mergegroup)
  end
;
[ .[] | .[] ] | merge
When run with the -s option, as follows:
jq -M -s -f filter.jq file1.json file2.json
it produces the following output:
[
  {
    "title": "Title1",
    "key": "12345678",
    "children": [
      {
        "title": "SubTitle2",
        "key": "123456713",
        "children": [
          {}
        ]
      },
      {
        "title": "SubTitle1",
        "key": "12345679",
        "children": [
          {
            "title": "SubSubTitle1",
            "key": "12345610"
          },
          {
            "title": "SubSubTitle2",
            "key": "12345611"
          },
          {
            "title": "DifferentSubSubTitle1",
            "key": "12345612"
          }
        ]
      }
    ]
  }
]
If the ordering of the objects within .children matters,
then a sort_by can be added to the {children:merge} expression,
e.g. {children:merge|sort_by(.key)}, as shown below.
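For concreteness, that change amounts to rewriting the kids helper as:
def kids:
  map(
    .children
    | if length < 1 then empty else .[] end
  )
  | if length < 1 then {} else {children: merge | sort_by(.key)} end
;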
Here is something that will reproduce your desired result. It's by no means automatic; it's really a proof of concept at this stage.
One liner:
jq -s '. as $in | ($in[0][].children[].children + $in[1][].children[0].children | unique) as $a1 | $in[1][].children[1] as $s1 | $in[0] | .[0].children[0].children = ($a1) | .[0].children += [$s1]' file1.json file2.json
Multi-line breakdown (copy/paste):
jq -s '. as $in
| ($in[0][].children[].children + $in[1][].children[0].children
| unique) as $a1
| $in[1][].children[1] as $s1
| $in[0]
| .[0].children[0].children = ($a1)
| .[0].children += [$s1]' file1.json file2.json
Where:
$in: the combined input of file1.json and file2.json
$a1: the merged "SubSubTitle" array
$s1: the second subtitle object
I suspect the reason this didn't work for you is that your schema is different and has nested arrays.
I find it quite hypnotic looking at this; it would be good if you could elaborate a bit on how fixed the structure is and what the requirements are.

access fields in json object with unique identifiers

I'm having trouble understanding jq. I'm even having trouble articulating what I want to learn.
I think I want a wildcard? My desire is, given a JSON snippet like this:
{
  "logs": {
    "-MnpQaRONGXz9tff-W": {
      "points": 10,
      "type": "signup"
    },
    "-N5qlX1mQ3SYA9RXdE": {
      "points": 15,
      "type": "welcome"
    },
    "-N5rx8PAcNgWu25zRf": {
      "points": 5,
      "type": "vote"
    },
    "-N5s29TyZ33snUqC5X": {
      "points": 5,
      "type": "vote"
    }
  },
  "total": 35
}
to count how many times a certain type appears, or even just output all the types to a file.
This totally doesn't work (and in the simplified example doesn't make sense):
cat test.json | jq '.logs | * | .type'
but the idea is that it would get me a simple object or listing of the types.
To obtain a stream of all the "type" values of all objects, no matter how deeply nested:
.. | select(.type?) | .type
In your particular case, the following would suffice:
.[] | select(type == "object") | .[].type
To produce a tabulation:
def tabulate: reduce .[] as $i ({}; .[$i] += 1 );
[.[] | select(type == "object") | .[].type] | tabulate
Given your input, the output would be:
{
  "signup": 1,
  "welcome": 1,
  "vote": 2
}
Not a jq-only answer, but you can get a list of all the type fields with jq (note the -r flag for raw output), then pipe that through uniq -c to get a count.
$ jq -r '.logs[] | .type' tmp.json | uniq -c
1 signup
1 welcome
2 vote
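Note that uniq -c only counts adjacent duplicates; for input where equal types might not be adjacent, sort first:
$ jq -r '.logs[] | .type' tmp.json | sort | uniq -c
1 signup
2 vote
1 welcome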
Here is a solution which uses tostream and reduce:
reduce (tostream | select(length == 2)) as [$p, $v] (
  {};
  if $p[-1] == "type" then .[$v] += 1 else . end
)
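Assuming the snippet from the question is saved as test.json, this can be run as:
jq 'reduce (tostream | select(length == 2)) as [$p, $v] (
      {};
      if $p[-1] == "type" then .[$v] += 1 else . end
    )' test.json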
Output with the sample data:
{
  "signup": 1,
  "welcome": 1,
  "vote": 2
}