jq conditional processing on multiple files - json

I have multiple json files:
a.json
{
"status": "passed",
"id": "id1"
}
{
"status": "passed",
"id": "id2"
}
b.json
{
"status": "passed",
"id": "id1"
}
{
"status": "failed",
"id": "id2"
}
I want to know which ids passed in a.json but have now failed in b.json.
expected.json
{
"status": "failed",
"id": "id2"
}
I tried something like:
jq --slurpfile a a.json --slurpfile b b.json -n '$a[] | reduce select(.status == "passed") as $passed (.; $b | select($a.id == .id and .status == "failed"))'
$passed is supposed to contain the list of passed entries in a.json, and reduce is supposed to merge all the objects whose id matches and which have failed.
However, it does not produce the expected result, and the documentation is rather limited.
How can I produce expected.json from a.json and b.json?

For me your filter produces the error
jq: error (at <unknown>): Cannot index array with string "id"
I suspect this is because you wrote $b instead of $b[] and $a.id instead of $passed.id. Here is my guess at what you intended to write:
$a[]
| reduce select(.status == "passed") as $passed (.;
$b[] | select( $passed.id == .id and .status == "failed")
)
which produces the output
null
{
"status": "failed",
"id": "id2"
}
You can filter away the null by adding | values e.g.
$a[]
| reduce select(.status == "passed") as $passed (.;
$b[] | select( $passed.id == .id and .status == "failed")
)
| values
However you don't really need reduce here. A simpler way is just:
$a[]
| select(.status == "passed") as $passed
| $b[]
| select( $passed.id == .id and .status == "failed")
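As a quick check, the simpler filter can be run end to end roughly as follows. This is only a sketch: the scratch directory and the result variable are scaffolding of my own, and the two sample files from the question are recreated inline.

```shell
# Recreate the question's sample files in a scratch directory (hypothetical location).
tmp=$(mktemp -d)
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"passed","id":"id2"}' > "$tmp/a.json"
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"failed","id":"id2"}' > "$tmp/b.json"

# -n: no main input (both files arrive via --slurpfile); -c: compact output.
result=$(jq -n -c --slurpfile a "$tmp/a.json" --slurpfile b "$tmp/b.json" '
  $a[]
  | select(.status == "passed") as $passed          # each passed entry of a.json
  | $b[]
  | select($passed.id == .id and .status == "failed")
')
echo "$result"
rm -rf "$tmp"
```

With the sample data this should print the single object for id2.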
If you intend to go further with this I would recommend a different approach: first construct an object combining $a and $b and then project what you want from it. e.g.
reduce (($a[]|{(.id):{a:.status}}),($b[]|{(.id):{b:.status}})) as $v ({};.*$v)
will give you
{
"id1": {
"a": "passed",
"b": "passed"
},
"id2": {
"a": "passed",
"b": "failed"
}
}
To convert that back to the output you requested add
| keys[] as $id
| .[$id]
| select(.a == "passed" and .b == "failed")
| {$id, status:.b}
to obtain
{
"id": "id2",
"status": "failed"
}
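Putting the two steps of the combined-object approach together, a full invocation might look like the sketch below. The file locations and the result variable are assumptions made for the example; the filter itself is the one shown above.

```shell
# Recreate the question's sample files in a scratch directory (hypothetical location).
tmp=$(mktemp -d)
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"passed","id":"id2"}' > "$tmp/a.json"
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"failed","id":"id2"}' > "$tmp/b.json"

result=$(jq -n -c --slurpfile a "$tmp/a.json" --slurpfile b "$tmp/b.json" '
  # Step 1: merge per-id status from both files into one object.
  reduce (($a[] | {(.id): {a: .status}}), ($b[] | {(.id): {b: .status}})) as $v ({}; . * $v)
  # Step 2: project the ids that passed in a.json and failed in b.json.
  | keys[] as $id
  | .[$id]
  | select(.a == "passed" and .b == "failed")
  | {$id, status: .b}
')
echo "$result"
rm -rf "$tmp"
```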

The following solutions to the problem are oriented primarily towards efficiency, but it turns out that they are quite straightforward and concise.
For efficiency, we will construct a "dictionary" of ids of those who have passed in a.json to make the required lookup very fast.
Also, if you have a version of jq with inputs, it is easy to avoid "slurping" the contents of b.json.
Solution for jq 1.4 or higher
Here is a generic solution which, however, slurps both files:
Invocation (note the use of the -s option):
jq -s --slurpfile a a.json -f passed-and-failed.jq b.json
Program:
([$a[] | select(.status=="passed") | {(.id): true}] | add) as $passed
| .[] | select(.status == "failed" and $passed[.id])
That is, first construct the dictionary, and then emit the objects in b.json that satisfy the condition.
Solution for jq 1.5 or higher
Invocation (note the use of the -n option):
jq -n --slurpfile a a.json -f passed-and-failed.jq b.json
INDEX/2 is currently available from the master branch, but is provided here in case your jq does not have it, in which case you might want to add its definition to ~/.jq:
def INDEX(stream; idx_expr):
reduce stream as $row ({};
.[$row|idx_expr|
if type != "string" then tojson
else .
end] |= $row);
The solution now becomes a simple two-liner:
INDEX($a[] | select(.status == "passed") | .id; .) as $passed
| inputs | select(.status == "failed" and $passed[.id])
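For readers whose jq lacks INDEX/2 and who would rather not define it, the same dictionary idea can be sketched with a plain reduce. This is a variant of my own, not the program above: the dictionary maps each passed id to true, and b.json is read one object at a time with inputs. The scratch directory and variable names are assumptions.

```shell
# Recreate the question's sample files in a scratch directory (hypothetical location).
tmp=$(mktemp -d)
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"passed","id":"id2"}' > "$tmp/a.json"
printf '%s\n' '{"status":"passed","id":"id1"}' '{"status":"failed","id":"id2"}' > "$tmp/b.json"

# -n stops jq from consuming the first object of b.json before `inputs` runs.
result=$(jq -n -c --slurpfile a "$tmp/a.json" '
  (reduce ($a[] | select(.status == "passed")) as $row ({}; .[$row.id] = true)) as $passed
  | inputs
  | select(.status == "failed" and $passed[.id])   # missing ids look up as null, i.e. false
' "$tmp/b.json")
echo "$result"
rm -rf "$tmp"
```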

Related

How to remove unnecessary items from a JSON document with jq?

I need to go from this (example):
{
"jobs": [
{},
{}
],
"services": {
"service-1": {
"version": "master"
},
"service-2": {
"foo": true,
"version": "master"
},
"service-3": {
"bar": "baz"
}
}
}
to this:
{
"services": {
"service-1": {
"version": "master"
},
"service-2": {
"version": "master"
}
}
}
That is, delete everything except .services.*.version. Please help; I can't manage this with my knowledge of jq.
Translating your expression .services.*.version quite generically, you could use tostream as follows:
reduce (tostream
| select(length==2 and
(.[0] | (.[0] == "services" and
.[-1] == "version")))) as $pv (null;
setpath($pv[0]; $pv[1]) )
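To see why the filter tests length==2, it may help to look at what tostream emits. The sketch below uses two small made-up documents (the keys s1 and s3 are hypothetical stand-ins for the service names): leaf events are [path, value] pairs, while closing events carry only a path.

```shell
# Events for a tiny document: one leaf event followed by three closing events.
events=$(printf '%s' '{"services":{"service-1":{"version":"master"}}}' | jq -c 'tostream')
echo "$events"
# [["services","service-1","version"],"master"]
# [["services","service-1","version"]]
# [["services","service-1"]]
# [["services"]]

# The reduce from above keeps only leaf events under "services" whose path ends in "version".
doc='{"jobs":[{}],"services":{"s1":{"foo":true,"version":"master"},"s3":{"bar":"baz"}}}'
kept=$(printf '%s' "$doc" | jq -c '
  reduce (tostream
          | select(length == 2 and
                   (.[0] | (.[0] == "services" and .[-1] == "version")))) as $pv
    (null; setpath($pv[0]; $pv[1]))
')
echo "$kept"
# {"services":{"s1":{"version":"master"}}}
```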
Using jq's streaming parser
To reduce memory requirements, you could modify the above solution to use jq's streaming parser, either with reduce as above, or with fromstream:
jq -n --stream -f program.jq input.json
where program.jq contains:
fromstream(inputs
| select( if length == 2
then .[0] | (.[0] == "services" and
.[-1] == "version")
else . end ) )
.services.*.version.*
Interpreting .services.*.version more broadly so as not to require the terminal component of the path to be .version, simply replace
.[-1] == "version"
with:
index("version")
in the above.
If you wish to differentiate between "version" being absent or null:
{services}
| .services |= map_values(select(has("version")) | {version} )

Append multiple dummy objects to json array with jq

Let's say this is my array:
[
{
"name": "Matias",
"age": "33"
}
]
I can do this:
echo "$response" | jq '[ .[] | select(.name | test("M.*"))] | . += [.[]]'
and it will output:
[
{
"name": "Matias",
"age": "33"
},
{
"name": "Matias",
"age": "33"
}
]
But I can't do this:
echo "$response" | jq '[ .[] | select(.name | test("M.*"))] | . += [.[] * 3]'
jq: error (at <stdin>:7): object ({"name":"Ma...) and number (3) cannot be multiplied
I need to extend the array to create a dummy array with 100 values, and I can't work out how. I would also like each object to have a random age, so that later on I can filter the file to measure the performance of an app.
Currently jq does not have a built-in randomization function, but it's easy enough to generate random numbers that jq can use. The following solution uses awk but in a way that some other PRNG can easily be used.
#!/bin/bash
function template {
cat<<EOF
[
{
"name": "Matias",
"age": "33"
}
]
EOF
}
function randoms {
awk -v n=$1 'BEGIN { for(i=0;i<n;i++) {print int(100*rand())} }'
}
randoms 100 | jq -n --argfile template <(template) '
first($template[] | select(.name | test("M.*"))) as $t
| [ $t | .age = inputs]
'
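The key point of the script above is that the length of the output array is driven by how many numbers arrive on stdin: .age = inputs emits one copy of the template per input. A deterministic sketch of just that mechanism follows, with three fixed numbers in place of the random ones. Two substitutions of my own: the template is passed with --argjson, and the name is matched with startswith so the example does not depend on regex support.

```shell
# Three fixed "ages" stand in for the random numbers produced by awk.
result=$(printf '%s\n' 41 7 88 | jq -n -c --argjson template '[{"name":"Matias","age":"33"}]' '
  first($template[] | select(.name | startswith("M"))) as $t
  | [ $t | .age = inputs ]      # one copy of $t per number read from stdin
')
echo "$result"
# [{"name":"Matias","age":41},{"name":"Matias","age":7},{"name":"Matias","age":88}]
```

Feeding 100 numbers instead of three would give an array of 100 objects.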
Note on performance
Even though the above uses awk and jq together, this combination is about 10 times faster than the posted jtc solution using -eu:
jq+awk: u+s = 0.012s
jtc with -eu: u+s = 0.192s
Using jtc in conjunction with awk as above, however, gives u+s == 0.008s on the same machine.

Use jq to merge keys with common id

Consider a file 'b.json':
[
{
"id": 3,
"foo": "cannot be replaced, id isn't in a.json, stay untouched",
"baz": "do not touch3"
},
{
"id": 2,
"foo": "should be replaced with 'foo new2'",
"baz": "do not touch2"
}
]
and 'a.json':
[
{
"id": 2,
"foo": "foo new2",
"baz": "don't care"
}
]
I want to update the key "foo" in b.json using jq with the matching value from a.json. It should also work with more than one entry in a.json.
Thus the desired output is:
[
{
"id": 3,
"foo": "cannot be replaced, id isn't in a.json, stay untouched",
"baz": "do not touch3"
},
{
"id": 2,
"foo": "foo new2",
"baz": "do not touch2"
}
]
Here's one of several possibilities that use INDEX/2. If your jq does not have this as a built-in, see below.
jq --argfile a a.json '
INDEX($a[]; .id) as $dict
| map( (.id|tostring) as $id
| if ($dict|has($id)) then .foo = $dict[$id].foo
else . end)' b.json
There are other ways to pass in the contents of a.json and b.json.
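One such alternative, sketched here under my own assumptions, is to name both files on the command line and read them with input under -n. The sample values are shortened stand-ins for the question's data, and INDEX is defined inline in case the installed jq does not provide it.

```shell
# Shortened, hypothetical versions of the question's files.
tmp=$(mktemp -d)
printf '%s\n' '[{"id":2,"foo":"foo new2","baz":"ignored"}]' > "$tmp/a.json"
printf '%s\n' '[{"id":3,"foo":"old3","baz":"keep3"},{"id":2,"foo":"old2","baz":"keep2"}]' > "$tmp/b.json"

result=$(jq -n -c '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);
  input as $a                      # first file: a.json
  | INDEX($a[]; .id) as $dict
  | input                          # second file: b.json
  | map((.id|tostring) as $id
        | if ($dict|has($id)) then .foo = $dict[$id].foo else . end)
' "$tmp/a.json" "$tmp/b.json")
echo "$result"
rm -rf "$tmp"
```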
Caveat
The above use of INDEX assumes there are no "collisions", which would happen if, for example, one of the objects has .id equal to 1 and another has .id equal to "1". If there is a possibility of such a collision, then a more complex definition of INDEX could be used.
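The collision is easy to reproduce. In this small sketch (the two objects are invented for the demonstration), the number 1 and the string "1" both become the key "1", so the dictionary ends up with a single entry and the later object wins.

```shell
result=$(jq -n -c '
  def INDEX(stream; idx_expr):
    reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);
  INDEX(({id: 1, foo: "number"}, {id: "1", foo: "string"}); .id)
  | [length, .["1"].foo]          # number of keys, and which object survived
')
echo "$result"
# [1,"string"]
```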
INDEX/2
Straight from builtin.jq:
def INDEX(stream; idx_expr):
reduce stream as $row ({}; .[$row|idx_expr|tostring] = $row);
Here's a generic answer that makes no assumptions about the values of the .id keys except that they are distinct JSON values.
Generalization of INDEX/2
def type2: [type, if type == "string" then . else tojson end];
def dictionary(stream; f):
reduce stream as $s ({}; setpath($s|f|type2; $s));
def lookup(value):
getpath(value|type2);
def indictionary(value):
(value|type2) as $t
| has($t[0]) and (.[$t[0]] | has($t[1]));
Invocation
jq --argfile a a.json -f program.jq b.json
main
# b.json is the main input, so the array to update is simply `.`
dictionary($a[]; .id) as $dict
| map( .id as $id
| if ($dict|indictionary($id))
then .foo = ($dict|lookup($id).foo)
else . end)

How to get max value from JSON?

There is a json file like this:
[
{
"createdAt": 1548729542000,
"platform": "foo"
},
{
"createdAt": 1548759398000,
"platform": "foo"
},
{
"createdAt": 1548912360000,
"platform": "foo"
},
{
"createdAt": 1548904550000,
"platform": "bar"
}
]
Now I want to get the maximum createdAt for the foo platform. How can I do this with jq? My attempts so far:
jq '.[] | select(.platform=="foo") | .createdAt | max' foo.json
jq: error (at <stdin>:17): number (1548729542000) and number (1548729542000) cannot be iterated over
jq '.[] | select(.platform=="foo") | max_by(.createdAt)' foo.json
jq: error (at <stdin>:17): Cannot index number with string "createdAt"
exit status 5
max expects an array as input.
$ jq 'map(select(.platform == "foo") .createdAt) | max' file
1548912360000
One approach would be to make the selection and then use one of the array-oriented builtins max or max_by to find the maximum, e.g.
map(select(.platform=="foo"))
| max_by(.createdAt)
| .createdAt
However, this approach is not very satisfactory as it requires more space than is strictly necessary. For large arrays, a stream-oriented version of max_by would be better.
A stream-oriented max_by/2
def max_by(s; f):
reduce s as $s (null;
if . == null then {s: $s, m: ($s|f)}
else ($s|f) as $m
| if $m > .m then {s: $s, m: $m} else . end
end)
| .s ;
max_by(.[] | select(.platform=="foo"); .createdAt)
| .createdAt
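Run against the sample array from the question, the stream-oriented definition can be tried out as in the sketch below. The shell scaffolding is mine; note that defining max_by with two arguments does not clash with the built-in one-argument max_by.

```shell
data='[{"createdAt":1548729542000,"platform":"foo"},
       {"createdAt":1548759398000,"platform":"foo"},
       {"createdAt":1548912360000,"platform":"foo"},
       {"createdAt":1548904550000,"platform":"bar"}]'

result=$(printf '%s' "$data" | jq '
  # Keep only the best item seen so far instead of building an intermediate array.
  def max_by(s; f):
    reduce s as $s (null;
      if . == null then {s: $s, m: ($s|f)}
      else ($s|f) as $m
      | if $m > .m then {s: $s, m: $m} else . end
      end)
    | .s ;
  max_by(.[] | select(.platform == "foo"); .createdAt)
  | .createdAt
')
echo "$result"
# 1548912360000
```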

Change entry in a JSON list that matches a condition without discarding rest of document

I am trying to open a file, look through it, change an entry based on its current value, and pass the result either to a file or to a variable.
Below is an example of the JSON
{
"Par": [
{
"Key": "12345L",
"Value": "https://100.100.100.100:100",
"UseLastValue": true
},
{
"Key": "12345S",
"Value": "VAL2CHANGE",
"UseLastValue": true
},
{
"Key": "12345T",
"Value": "HAPPY-HELLO",
"UseLastValue": true
}
],
"CANCOPY": false,
"LOGFILE": ["HELPLOG"]
}
I have been using jq, and I have succeeded in isolating the object and changing the value:
cat jsonfile.json | jq '.Par | map(select(.Value=="VAL2CHANGE")) | .[] | .Value="VALHASBEENCHANGED"'
This gives
{
"Key": "12345S",
"Value": "VALHASBEENCHANGED",
"UseLastValue": true
}
What I'd like to achieve is to retain the full JSON output with the changed value:
{
"Par": [
{
"Key": "12345L",
"Value": "https://100.100.100.100:100",
"UseLastValue": true
},
{
"Key": "12345S",
"Value": "VALHASBEENCHANGED",
"UseLastValue": true
},
{
"Key": "12345T",
"Value": "HAPPY-HELLO",
"UseLastValue": true
}
],
"CANCOPY": false,
"LOGFILE": ["HELPLOG"]
}
That is:
jq '.Par | map(select(.Value=="VAL2CHANGE")) | .[] | .Value="VALHASBEENCHANGED"' (and now put it back in the file)
or:
open the file, find the value to be changed, change it, and write the result to a file or to the screen.
To add: the JSON file will contain the value I'm looking for only once, since I'm creating it. If any other values need changing, I will name them differently.
jq --arg match "VAL2CHANGE" \
--arg replace "VALHASBEENCHANGED" \
'.Par |= map(if .Value == $match then (.Value=$replace) else . end)' \
<in.json
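Since the question also asks about putting the result back in the file: jq has no in-place editing option, so a common pattern is to write to a temporary file and rename it only if jq succeeded. The sketch below assumes a shortened copy of the sample document and made-up file names.

```shell
# A shortened, hypothetical copy of the question's document.
tmp=$(mktemp -d)
cat > "$tmp/in.json" <<'EOF'
{"Par":[{"Key":"12345S","Value":"VAL2CHANGE","UseLastValue":true},
        {"Key":"12345T","Value":"HAPPY-HELLO","UseLastValue":true}],
 "CANCOPY":false}
EOF

# Write to a new file first; replace the original only if jq exits successfully.
jq --arg match "VAL2CHANGE" --arg replace "VALHASBEENCHANGED" \
   '.Par |= map(if .Value == $match then (.Value = $replace) else . end)' \
   "$tmp/in.json" > "$tmp/out.json" && mv "$tmp/out.json" "$tmp/in.json"

changed=$(jq -r '.Par[0].Value' "$tmp/in.json")
untouched=$(jq -c '[.Par[1].Value, .CANCOPY]' "$tmp/in.json")
echo "$changed $untouched"
rm -rf "$tmp"
```

Redirecting jq's output straight onto its own input file would truncate the file before jq reads it, which is why the intermediate file is used.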
To replace a string more comprehensively, anywhere it may appear in a nested data structure, you can use the walk function, which will be in the standard library in jq 1.6 but can be defined manually in 1.5:
jq --arg match "VAL2CHANGE" \
--arg replace "VALHASBEENCHANGED" '
# taken from jq 1.6; will not be needed here after that version is released.
# Apply f to composite entities recursively, and to atoms
def walk(f):
. as $in
| if type == "object" then
reduce keys_unsorted[] as $key
( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
elif type == "array" then map( walk(f) ) | f
else f
end;
walk(if . == $match then $replace else . end)' <in.json
If you're just replacing based on the values, you could stream the file and replace the values as you rebuild the result.
$ jq --arg change 'VAL2CHANGE' --arg value 'VALHASBEENCHANGED' -n --stream '
fromstream(inputs | if length == 2 and .[1] == $change then .[1] = $value else . end)
' input.json