Build nested JSON data from variables with a loop using jq and bash

I'm trying to create dynamic JSON data using jq, but I don't know how to write the filter when a loop is involved. I can do this in plain bash by concatenating strings, but I'd like to do it with jq.
Here is the example code:
#!/bin/bash
# Here I declare 2 arrays just to demonstrate. For each index, disk_ids and volume_ids cannot both hold a value: if one is set, the other must be null.
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
json_query_data=""
# Now, for example, I want to loop through the 2 arrays above and create the JSON. In the real application I already have a loop over these arrays, where I can access the values as $disk_id and $volume_id.
for disk_id in "${disk_ids[@]}"; do
  for volume_id in "${volume_ids[@]}"; do
    json_query_data=$(jq -n --argjson disk_id "$disk_id" --argjson volume_id "$volume_id" '{
      devices: {
        sda: {"disk_id": $disk_id, "volume_id": $volume_id},
        sdb: {"disk_id": $disk_id, "volume_id": $volume_id},
        sdc: {"disk_id": $disk_id, "volume_id": $volume_id},
        sdd: {"disk_id": $disk_id, "volume_id": $volume_id}
      }}')
  done
done
As you can see, that is definitely NOT the output that I want, and my logic is not dynamic. Echoing "${json_query_data}" should produce the following JSON:
{
  "devices": {
    "sda": {"disk_id": 111, "volume_id": null},
    "sdb": {"disk_id": null, "volume_id": 444},
    "sdc": {"disk_id": 222, "volume_id": null},
    "sdd": {"disk_id": 333, "volume_id": null}
  }
}
I have not seen any examples online of looping over variables when creating JSON data with jq. I'd appreciate any help. Thanks.
UPDATE:
I must use a for loop inside bash to create the JSON data, because the sample arrays disk_ids and volume_ids I provided in the code are just examples. In the real application I can already access the variables $disk_id and $volume_id on each loop iteration. But how do I use these variables inside that for loop to build the JSON output with all of the data above?
The JSON example is taken from the Linode API here.

The looping/mapping can also be accomplished in jq:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
jq -n --arg disk_ids "${disk_ids[*]}" --arg volume_ids "${volume_ids[*]}" '
  [$disk_ids, $volume_ids | . / " " | map(fromjson)]
  | transpose
  | {devices: with_entries(
      .key |= "sd\([. + 97] | implode)"
      | .value |= {disk_id: first, volume_id: last}
    )}
'
Demo
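In case the key construction looks cryptic: [. + 97] | implode turns the entry's numeric index into a letter by treating it as a codepoint offset from "a" (97). A quick standalone check of just that piece (my own illustrative snippet, not part of the answer above):
jq -cn '[0, 1, 2, 3] | map("sd\([. + 97] | implode)")'
["sda","sdb","sdc","sdd"]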
Or, if you can already provide the letters in the same way:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
letters=(a b c d)
jq -n --arg disk_ids "${disk_ids[*]}" --arg volume_ids "${volume_ids[*]}" --arg letters "${letters[*]}" '
  [$disk_ids, $letters, $volume_ids | . / " "] | .[0,2] |= map(fromjson)
  | transpose
  | {devices: with_entries(
      .key = "sd\(.value[1])"
      | .value |= {disk_id: first, volume_id: last}
    )}
'
Demo
Output:
{
  "devices": {
    "sda": {
      "disk_id": 111,
      "volume_id": null
    },
    "sdb": {
      "disk_id": null,
      "volume_id": 444
    },
    "sdc": {
      "disk_id": 222,
      "volume_id": null
    },
    "sdd": {
      "disk_id": 333,
      "volume_id": null
    }
  }
}
UPDATE:
I must use a for loop inside bash to create the JSON data.
If you insist on doing this in a bash loop, how about:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
json='{"devices": {}}'
for i in "${!disk_ids[@]}"
do
  json="$(
    jq --argjson disk_id "${disk_ids[$i]}" --argjson volume_id "${volume_ids[$i]}" '
      .devices |= . + {"sd\([length + 97] | implode)": {$disk_id, $volume_id}}
    ' <<< "$json"
  )"
done
echo "$json"
Or, with letters included:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
letters=(a b c d)
json='{"devices": {}}'
for i in "${!disk_ids[@]}"
do
  json="$(
    jq --argjson disk_id "${disk_ids[$i]}" --argjson volume_id "${volume_ids[$i]}" --arg letter "${letters[$i]}" '
      .devices["sd\($letter)"] += {$disk_id, $volume_id}
    ' <<< "$json"
  )"
done
echo "$json"


Using jq to get json values

Input json:
{
  "food_group": "fruit",
  "glycemic_index": "low",
  "fruits": {
    "fruit_name": "apple",
    "size": "large",
    "color": "red"
  }
}
The two jq commands below work:
# jq -r 'keys_unsorted[] as $key | "\($key), \(.[$key])"' food.json
food_group, fruit
glycemic_index, low
fruits, {"fruit_name":"apple","size":"large","color":"red"}
# jq -r 'keys_unsorted[0:2] as $key | "\($key)"' food.json
["food_group","glycemic_index"]
How can I get the values for the first two keys using jq in the same manner? Here is what I tried:
# jq -r 'keys_unsorted[0:2] as $key | "\($key), \(.[$key])"' food.json
jq: error (at food.json:9): Cannot index object with array
Expected output:
food_group, fruit
glycemic_index, low
To iterate over an object (a hash), you can use to_entries, which transforms it into an array of {key, value} entries.
Then you can use select to filter the entries you want to keep; for example, keeping only the entries whose values are strings:
jq -r 'to_entries[]| select( ( .value | type ) == "string" ) | "\(.key), \(.value)" '
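Run with jq's -r option against the food.json shown above, that filter should print the two string-valued entries:
# jq -r 'to_entries[]| select( ( .value | type ) == "string" ) | "\(.key), \(.value)"' food.json
food_group, fruit
glycemic_index, low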
You can also use to_entries and select the entries you want by key name:
to_entries[] | select(.key=="food_group" or .key=="glycemic_index") | "\(.key), \(.value)"
Demo
https://jqplay.org/s/Aqvos4w7bo
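If you prefer to stay close to the original keys_unsorted attempt, one small fix is to iterate over the slice instead of indexing the object with the whole array; a sketch against the same food.json:
# jq -r 'keys_unsorted[0:2][] as $key | "\($key), \(.[$key])"' food.json
food_group, fruit
glycemic_index, low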

Is there a way to differentiate between a null value and the absence of a key?

If I execute
echo '{"foo": "bar", "baz": null}' | jq '.baz'
I receive null as result.
But if I execute
echo '{"foo": "bar", "baz": null}' | jq '.hello'
I also receive null as result.
In the first case, the value is null, in the second it does not exist (can't be resolved). Is there any way to tell the two cases apart?
Yes, there is. The has built-in returns a boolean value representing whether its argument exists in its input as a key (or index, if the input is an array).
$ echo '{"foo": null}' | jq 'has("foo")'
true
$ echo '{"foo": null}' | jq 'has("bar")'
false
$ echo '[null]' | jq 'has(0)'
true
$ echo '[null]' | jq 'has(1)'
false
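Combining has with a conditional makes the distinction explicit; a minimal sketch using the object from the question:
$ echo '{"foo": "bar", "baz": null}' | jq 'if has("baz") then "baz is present (value is null)" else "baz is absent" end'
"baz is present (value is null)"
$ echo '{"foo": "bar", "baz": null}' | jq 'if has("hello") then "hello is present" else "hello is absent" end'
"hello is absent"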

use jq to format output and convert timestamp

I have the following code, which lists all the current aws lambda functions on my account:
aws lambda list-functions --region eu-west-1 | jq -r '.Functions | .[] | .FunctionName' | xargs -L1 -I {} aws logs describe-log-streams --log-group-name /aws/lambda/{} | jq 'select(.logStreams[-1] != null)' | jq -r '.logStreams | .[] | [.arn, .lastEventTimestamp] | @csv'
that returns
aws:logs:eu-west-1:****:log-group:/aws/lambda/admin-devices-block-master:log-stream:2018/01/23/[$LATEST]64965367852942f490305cb8707d81b4",1516717768514
I am only interested in admin-devices-block-master, and I want to convert the timestamp 1516717768514 with strflocaltime("%Y-%m-%d %I:%M%p")
so it should just return:
"admin-devices-block-master",1516717768514
I tried:
aws lambda list-functions --region eu-west-1 | jq -r '.Functions | .[] | .FunctionName' | xargs -L1 -I {} aws logs describe-log-streams --log-group-name /aws/lambda/{} | jq 'select(.logStreams[-1] != null)' | jq -r '.logStreams | .[] | [.arn,[.lastEventTimestamp|./1000|strflocaltime("%Y-%m-%d %I:%M%p")]]'
jq: error: strflocaltime/1 is not defined at <top-level>, line 1:
.logStreams | .[] | [.arn,[.lastEventTimestamp|./1000|strflocaltime("%Y-%m-%d %I:%M%p")]]
jq: 1 compile error
^CException ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
Any advice is much appreciated.
strflocaltime requires jq version 1.6 (thanks to @oliv for pointing that out).
This is a very simple example that replaces an epoch timestamp in milliseconds with a local time:
date -d @1572892409
Mon Nov 4 13:33:29 EST 2019
echo '{ "ts" : 1572892409356 , "id": 2 , "v": "foobar" } ' | \
jq '.ts|=( ./1000|strflocaltime("%Y-%m-%d %I:%M%p")) '
{
  "ts": "2019-11-04 01:33PM",
  "id": 2,
  "v": "foobar"
}
A second version that tests whether ts exists:
(
echo '{ "ts" : 1572892409356 , "id": 2 , "v": "foobar" } ' ;
echo '{ "id":3 }' ;
echo '{ "id": 4 , "v": "barfoo" }'
) | jq 'if .ts != null
then ( .ts|=( ./1000|strflocaltime("%Y-%m-%d %I:%M%p")) )
else .
end '
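Once jq 1.6 is available, the same idea slots into the @csv pipeline from the question. A minimal sketch with a made-up logStreams payload (the arn value is just a placeholder); the timestamp column comes out as a local-time string, so the exact output depends on your timezone:
echo '{"logStreams": [{"arn": "example-arn", "lastEventTimestamp": 1516717768514}]}' | \
jq -r '.logStreams[] | [.arn, (.lastEventTimestamp / 1000 | strflocaltime("%Y-%m-%d %I:%M%p"))] | @csv'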

jq: error (at <stdin>:0): Cannot iterate over string, cannot execute unique problem

We are trying to convert a JSON file to a TSV file. We are having problems trying to eliminate duplicate Ids with unique.
JSON file
[
  {"Id": "101", "Name": "Yugi"},
  {"Id": "101", "Name": "Yugi"},
  {"Id": "102", "Name": "David"}
]
cat getEvent_all.json | jq -cr '.[] | [.Id] | unique_by(.[].Id)'
jq: error (at :0): Cannot iterate over string ("101")
A reasonable approach would be to use unique_by, e.g.:
unique_by(.Id)[]
| [.Id, .Name]
| @tsv
Alternatively, you could form the pairs first:
map([.Id, .Name])
| unique_by(.[0])[]
| @tsv
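For a complete invocation against getEvent_all.json (valid JSON as shown above), the first variant can be run with jq's -r option so @tsv emits real tab-separated lines:
jq -r 'unique_by(.Id)[] | [.Id, .Name] | @tsv' getEvent_all.json
101	Yugi
102	David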
uniques_by/2
For very large arrays, though, or if you want to respect the original ordering, a sort-free alternative to unique_by should be considered. Here is a suitable, generic, stream-oriented alternative:
def uniques_by(stream; f):
  foreach stream as $x ({};
    ($x | f) as $s
    | ($s | type) as $t
    | (if $t == "string" then $s else ($s | tostring) end) as $y
    | if .[$t][$y] then .emit = false
      else .emit = true | (.item = $x) | (.[$t][$y] = true)
      end;
    if .emit then .item else empty end);
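With that definition pasted ahead of the main filter in the same jq program, the stream-oriented version is used much like the earlier variants (same jq -r invocation and @tsv output):
uniques_by(.[]; .Id)
| [.Id, .Name]
| @tsv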

Filter empty and/or null values with jq

I have a file with jsonlines and would like to find empty values.
{"name": "Color TV", "price": "1200", "available": ""}
{"name": "DVD player", "price": "200", "color": null}
And I would like to output the empty and/or null values together with their keys:
available: ""
color: null
I think it should be something like cat myexample | jq '. | select(. == "")', but it is not working.
The tricky part here is emitting the keys without quotation marks in a way that the empty string is shown with quotation marks. Here is one solution that works with jq's -r command-line option:
to_entries[]
| select(.value | . == null or . == "")
| if .value == "" then .value |= "\"\(.)\"" else . end
| "\(.key): \(.value)"
Once the given input has been modified in the obvious way to make it valid JSON, the output is exactly as specified.
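Putting it together as a command against the file named in the question, this should produce exactly the requested lines:
jq -r 'to_entries[]
  | select(.value | . == null or . == "")
  | if .value == "" then .value |= "\"\(.)\"" else . end
  | "\(.key): \(.value)"' myexample
available: ""
color: null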
Some people may find the following jq program more useful for identifying keys with null or empty string values:
with_entries(select(.value |.==null or . == ""))
With the sample input, this program would produce:
{"available":""}
{"color":null}
Adding further information, such as the input line or object number, would also make sense, e.g. perhaps:
with_entries(select(.value |.==null or . == ""))
| select(length>0)
| {n: input_line_number} + .
With a single with_entries(if .value == null or .value == "" then empty else . end) filter expression it is possible to filter out null and empty ("") values.
Without filtering:
echo '{"foo": null, "bar": ""}' | jq '.'
{
  "foo": null,
  "bar": ""
}
With filtering:
echo '{"foo": null, "bar": ""}' | jq 'with_entries(if .value == null or .value == "" then empty else . end)'
{}
Take a look at this snippet https://blog.nem.ec/code-snippets/jq-ignore-nulls/
jq -r '.firstName | select( . != null )' file.json
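As in the linked snippet, select(. != null) simply drops the value when it is null (and, per the earlier question, a missing key also resolves to null), for example:
$ echo '{"firstName": "Alice"}' | jq -r '.firstName | select( . != null )'
Alice
$ echo '{"firstName": null}' | jq -r '.firstName | select( . != null )'
(no output)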