Finding highest value in a specific JSON field in bash - json

I am writing a bash script that curls POST an API. The response from the post has values returned in the following format:
{
"other": "irrelevant-fields",
"results": [
{
"datapoints": [
{"timestamp": 1555977600, "value": 0},
{"timestamp": 1555984800, "value": 15},
{"timestamp": 1555992000, "value": 5}
]
}
]
}
I want to extract the highest figure from the "value" columns but I am having problems writing this code in bash. I am a beginner at JSON and there are no real references I can use to filter out the strings and values I don't need as each array is the same except for the timestamp, but I don't care about the timestamp, just the highest value returned.
My current code is just a generic way to extract the largest number from a file in bash:
grep -Eo '[[:digit:]]+' | sort -n | tail -n 1
...but instead of 15, that returns 1555992000.

echo '
{
"other": "irrelevant-fields",
"results": [
{
"datapoints": [
{"timestamp": 1555977600, "value": 0},
{"timestamp": 1555984800, "value": 15},
{"timestamp": 1555992000, "value": 5}
]
}
]
}
' | jq '.results[].datapoints | max_by(.value)'
The output will be like this:
{
"timestamp": 1555984800,
"value": 15
}
For more information, see this Medium post on jq, or the program's home page at https://stedolan.github.io/jq/

Please process JSON with a proper JSON interpreter/parser, like Xidel.
$ cat <<EOF | xidel -s - -e '$json/max((.//datapoints)()/value)'
{
"other": "irrelevant-fields",
"results": [
{
"datapoints": [
{"timestamp": 1555977600, "value": 0},
{"timestamp": 1555984800, "value": 15},
{"timestamp": 1555992000, "value": 5}
]
}
]
}
EOF
This returns 15.
(or in full: -e '$json/max((results)()/(datapoints)()/value)')

Related

iterating through JSON files adding properties to each with jq

I am attempting to iterate through all my JSON files and add properties but I am relatively new jq.
here is what I am attempting:
find hashlips_art_engine/build -type f -name '*.json' | jq '. + {
"creators": [
{
"address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
"share": 10
},
{
"address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
"share": 45
},
{
"address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
"share": 45
}
]
}'
However this is returning an error:
parse error: Invalid numeric literal at line 2, column 0
I have around 10,000 JSON files that I need to iterate over and add
{
"creators": [
{
"address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
"share": 10
},
{
"address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
"share": 45
},
{
"address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
"share": 45
}
]
}
to, is this possible or am I barking up the wrong tree on this?
thanks for your assistance with this, I have been searching the web for several hours now but either my terminology is incorrect or there isn't much out there regarding this issue.
The problem is that you are piping the filenames to jq rather than making the contents available to jq.
Most likely you could use the following approach, e.g. if you want the augmented contents of each file to be handled separately:
find ... | while read f ; do jq ... "$f" ; done
An alternative that might be relevant would be:
jq ... $(find ...)
If you have 2 files:
file01.json :
{"a":"1","b":"2"}
file02.json :
{"x":"10","y":"12","z":"15"}
you can:
for f in file*.json ;do cat $f | jq '. + { creators:[{address: "xxx",share:1}] } ' ; done
result:
{
"a": "1",
"b": "2",
"creators": [
{
"address": "xxx",
"share": 1
}
]
}
{
"x": "10",
"y": "12",
"z": "15",
"creators": [
{
"address": "xxx",
"share": 1
}
]
}

jq- merge two json files on a value

i have two json files structured like that:
file 1
[
{
"id": 25422,
"location": "Hotel X",
"suppliers": [
12
]
},
{
"id": 25423,
"location": "Hotel Y",
"suppliers": [
13
]
}]
file 2
[
{
"id": 12,
"vatNumber": "0000000000"
},
{
"id": 14,
"vatNumber": "0000000001"
}]
and i'd like a result like this
[
{
"id": 25422,
"location": "Hotel X",
"suppliers": [
12
],
"vatNumber": "0000000000"
},
{
"id": 25423,
"location": "Hotel Y",
"suppliers": [
13
],
}]
The important thing to me is that the matching vatNumbers, are set in the first file. Supplier arrays are not required anymore after the melding, if it simplifies the job.
Also jq is not essential, but i need something i can use via terminal to set up a script.
Thank you in advance.
Here's one of many possible solutions. If your jq does not have INDEX/2, then either upgrade your jq or include its def (available e.g. from https://github.com/stedolan/jq/blob/master/src/builtin.jq):
Invocation:
jq -n --argfile f1 file1.json --argfile f2 file2.json -f merge.jq
merge.jq:
INDEX($f2[] ; .id) as $dict
| $f1
| map( ($dict[.suppliers[0]|tostring]|.vatNumber) as $vn
| if $vn then .vatNumber = $vn else . end)

Creating a CSV from json using jq, based on elements in array

I have the following json format that I need to convert to CSV
[{
"name": "joe",
"age": 21,
"skills": [{
"lang": "spanish",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
},
{
"name": "sarah",
"age": 34,
"skills": [{
"lang": "french",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
}, {
"name": "jim",
"age": 26,
"skills": [{
"lang": "spanish",
"grade": "60"
}, {
"lang": "english",
"grade": "66",
"school": {
"name": "eg school",
"url": "eg-school.com"
}
}]
}
]
to convert to csv
name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3
So add the top level fields and the object from the skills array if lang=spanish and the school hash from the skills object for spanish if it exists
I'd also like to add the file and line number it came from.
I would like to use jq for the job, but can't figure out the syntax , anyone help me out ?
With your data in input.json, and the following jq program in tocsv.jq:
.[]
| [.name, .age] +
(.skills[]
| select(.lang == "spanish")
| [.grade, .school.name, .school.url, input_filename, input_line_number] )
| #csv
the invocation:
jq -r -f tocsv.jq input.json
yields:
"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51
If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""
Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:
jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq
where tocsv2.jq would be like tscsv.jq above but without the initial .[] |, and with $file instead of input_filename.
Finally, please also consider using the TSV format (#tsv) rather than the rather messy CSV format (#csv).

Bash JQ getting multiple values Issue in JSON file

I'm trying to parse a JSON file for getting multiple values. I know how to parse the specific values ( "A"/"B"/"C") in the array (.info.file.hashes[]).
For Example : When issuing the following command over the file b.json
jq -r '.info.file.hashes[] | select(.name == ("A","B","C")).value' b.json
Result :
f34d5f2d4577ed6d9ceec516c1f5a744
66031dad95dfe6ad10b35f06c4342faa
9df25fa4e379837e42aaf6d05d92012018d4b659
Where b.json:
{
"Finish": 1475668827,
"Start": 1475668826,
"info": {
"file": {
"Score": 4,
"file_subtype": "None",
"file_type": "Image",
"hashes": [
{
"name": "A",
"value": "f34d5f2d4577ed6d9ceec516c1f5a744"
},
{
"name": "B",
"value": "66031dad95dfe6ad10b35f06c4342faa"
},
{
"name": "C",
"value": "9df25fa4e379837e42aaf6d05d92012018d4b659"
},
{
"name": "D",
"value": "4a51cc531082d216a3cf292f4c39869b462bf6aa"
},
{
"name": "E",
"value": "e445f412f92b25f3343d5f7adc3c94bdc950601521d5b91e7ce77c21a18259c9"
}
],
"size": 500
}
}
}
Now, how can i get multiple values with "Finish", "Start" along with the hash values? I have tried issuing the command.
jq -r '.info.file.hashes[] | select(.name == ("A","B","C")).value','.Finish','.Start' b.json
and Im getting the result as:
f34d5f2d4577ed6d9ceec516c1f5a744
null
66031dad95dfe6ad10b35f06c4342faa
null
9df25fa4e379837e42aaf6d05d92012018d4b659
null
null
null
Expected Result :
f34d5f2d4577ed6d9ceec516c1f5a744
66031dad95dfe6ad10b35f06c4342faa
9df25fa4e379837e42aaf6d05d92012018d4b659
1475668827
1475668826
Literally just downloaded and read the manual
Try
jq '(.info.file.hashes[] |select(.name == ("A","B","C")).value), .Finish, .Start' b.json
"f34d5f2d4577ed6d9ceec516c1f5a744"
"66031dad95dfe6ad10b35f06c4342faa"
"9df25fa4e379837e42aaf6d05d92012018d4b659"
1475668827
1475668826
Note the brackets used for grouping the pipe separately from the Finish and Start values.

Select or exclude multiples object with an array of IDs

I have the following JSON :
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
},
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]
And I have the following array store in a variable: ["1", "2", "56", "1337"] (note the IDs are string, and may contain any regular character).
So, thanks to this SO, I found a way to filter my original data. jq 'jq '[.[] | select(.id == ("1", "2", "56", "1337"))]' ./data.json (note the array is surrounded by parentheses and not brackets) produces :
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
But I would also liked to do the opposite (basically excluding IDs instead of selecting them). Using select(.id != ("1", "2", "56", "1337")) doesn't work and using jq '[. - [.[] | select(.id == ("1", "2", "56", "1337"))]]' ./data.json seems very ugly and it doesn't work with my actual data (an output of aws ec2 describe-instances).
So have you any idea to do that? Thank you!
To include them, you need to verify that the id is any of the values in the keep set.
$ jq --argjson include '["1", "2", "56", "1337"]' 'map(select(.id == $include[]))' ...
To exclude them, you need to verify that all values are not in your excluded set. But it might just be easier to take the original set and remove the items that are in the excluded set.
$ jq --argjson exclude '["1", "2", "56", "1337"]' '. - map(select(.id == $exclude[]))' ...
Here is a solution that uses inside. Assuming you run jq as
jq -M --argjson IDS '["1","2","56","1337"]' -f filter.jq data.json
This filter.jq
map( select([.id] | inside($IDS)) )
produces the ids from data.json that are in the $IDS array:
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
and this filter.jq
map( select([.id] | inside($IDS) | not) )
produces the ids from data.json that are not in the $IDS array:
[
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]