use jq to format output and convert timestamp - json

I have the following code, which lists all the current aws lambda functions on my account:
aws lambda list-functions --region eu-west-1 | jq -r '.Functions | .[] | .FunctionName' | xargs -L1 -I {} aws logs describe-log-streams --log-group-name /aws/lambda/{} | jq 'select(.logStreams[-1] != null)' | jq -r '.logStreams | .[] | [.arn, .lastEventTimestamp] | @csv'
that returns
"arn:aws:logs:eu-west-1:****:log-group:/aws/lambda/admin-devices-block-master:log-stream:2018/01/23/[$LATEST]64965367852942f490305cb8707d81b4",1516717768514
I am only interested in admin-devices-block-master, and I want to convert the timestamp 1516717768514 with strflocaltime("%Y-%m-%d %I:%M%p").
So it should just return:
"admin-devices-block-master",1516717768514
I tried:
aws lambda list-functions --region eu-west-1 | jq -r '.Functions | .[] | .FunctionName' | xargs -L1 -I {} aws logs describe-log-streams --log-group-name /aws/lambda/{} | jq 'select(.logStreams[-1] != null)' | jq -r '.logStreams | .[] | [.arn,[.lastEventTimestamp|./1000|strflocaltime("%Y-%m-%d %I:%M%p")]]'
jq: error: strflocaltime/1 is not defined at <top-level>, line 1:
.logStreams | .[] | [.arn,[.lastEventTimestamp|./1000|strflocaltime("%Y-%m-%d %I:%M%p")]]
jq: 1 compile error
^CException ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
Any advice is much appreciated.

strflocaltime requires jq version 1.6; thanks to @oliv for pointing that out.
Here is a very simple example that replaces an epoch timestamp in milliseconds with the local time.
date -d @1572892409
Mon Nov 4 13:33:29 EST 2019
echo '{ "ts" : 1572892409356 , "id": 2 , "v": "foobar" } ' | \
jq '.ts|=( ./1000|strflocaltime("%Y-%m-%d %I:%M%p")) '
{
"ts": "2019-11-04 01:33PM",
"id": 2,
"v": "foobar"
}
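If you are on jq 1.5, where strflocaltime is not available, one fallback (my addition, not part of the original answer) is todate, which prints the epoch as UTC ISO 8601 instead of local time:
echo '{ "ts" : 1572892409356 , "id": 2 , "v": "foobar" } ' | \
jq '.ts |= (./1000 | floor | todate)'
{
"ts": "2019-11-04T18:33:29Z",
"id": 2,
"v": "foobar"
}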
A second version that tests whether ts exists:
(
echo '{ "ts" : 1572892409356 , "id": 2 , "v": "foobar" } ' ;
echo '{ "id":3 }' ;
echo '{ "id": 4 , "v": "barfoo" }'
) | jq 'if .ts != null
then ( .ts|=( ./1000|strflocaltime("%Y-%m-%d %I:%M%p")) )
else .
end '
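Applied back to the pipeline in the question, the conversion would look roughly like this (a sketch, assuming jq 1.6; the two trailing jq invocations are merged into one):
aws lambda list-functions --region eu-west-1 | jq -r '.Functions[].FunctionName' | xargs -L1 -I {} aws logs describe-log-streams --log-group-name /aws/lambda/{} | jq -r 'select(.logStreams[-1] != null) | .logStreams[] | [.arn, (.lastEventTimestamp / 1000 | strflocaltime("%Y-%m-%d %I:%M%p"))] | @csv'
If you only care about admin-devices-block-master, you can of course call aws logs describe-log-streams for just that log group instead of looping over every function.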

Related

Build nested json data from variable with looping with jq using bash

I'm trying to create dynamic JSON data using jq, but I don't know how to write it when a loop is involved. I can do this in normal bash by writing a sequence of strings.
Here is the example code:
#!/bin/bash
# Here I declare 2 arrays just to demonstrate. The paired values in `disk_ids` and `volume_ids` cannot both be set: if one has a value, the other must be null.
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
json_query_data=""
# Now I want to loop through the above 2 arrays and create the JSON. In the real application I already have a loop over these arrays where I can access $disk_id and $volume_id.
for disk_id in "${disk_ids[@]}"; do
for volume_id in "${volume_ids[@]}"; do
json_query_data=$(jq -n --argjson disk_id "$disk_id" --argjson volume_id "$volume_id" '{
devices: {
sda: {"disk_id": $disk_id, "volume_id": $volume_id },
sdb: {"disk_id": $disk_id, "volume_id": $volume_id },
sdc: {"disk_id": $disk_id, "volume_id": $volume_id },
sdd: {"disk_id": $disk_id, "volume_id": $volume_id },
}}')
done
done
As you can see, that is definitely NOT the output that I want, and my code's logic is not dynamic. The final output should produce the following JSON when I echo "${json_query_data}":
{
devices: {
sda: {"disk_id": 111, "volume_id": null },
sdb: {"disk_id": null, "volume_id": 444 },
sdc: {"disk_id": 222, "volume_id": null },
sdd: {"disk_id": 333, "volume_id": null },
}}
I have not seen any example online of looping with variables when creating JSON data with jq. I'd appreciate it if someone could help. Thanks.
UPDATE:
I must use a for loop inside bash to create the JSON data, because the sample arrays disk_ids and volume_ids that I provided in the code were just an example. In the real application I am already able to access the variables $disk_id and $volume_id for each loop iteration. But how do I use these variables to create the JSON output that fills in all the data above inside that for loop?
The JSON example is taken from the Linode API.
The looping/mapping can also be accomplished in jq:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
jq -n --arg disk_ids "${disk_ids[*]}" --arg volume_ids "${volume_ids[*]}" '
[$disk_ids, $volume_ids | . / " " | map(fromjson)]
| transpose | {devices: with_entries(
.key |= "sd\([. + 97] | implode)"
| .value |= {disk_id: first, volume_id: last}
)}
'
Demo
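The "sd\([. + 97] | implode)" part builds the key from the entry's array index: adding 97 gives the letter's code point, and implode turns it back into a string (97 is "a"). A standalone sketch:
jq -n '0, 1, 2, 3 | [. + 97] | implode'
"a"
"b"
"c"
"d"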
Or, if you can already provide the letters in the same way:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
letters=(a b c d)
jq -n --arg disk_ids "${disk_ids[*]}" --arg volume_ids "${volume_ids[*]}" --arg letters "${letters[*]}" '
[$disk_ids, $letters, $volume_ids | . / " " ] | .[0,2] |= map(fromjson)
| transpose | {devices: with_entries(
.key = "sd\(.value[1])"
| .value |= {disk_id: first, volume_id: last}
)}
'
Demo
Output:
{
"devices": {
"sda": {
"disk_id": 111,
"volume_id": null
},
"sdb": {
"disk_id": null,
"volume_id": 444
},
"sdc": {
"disk_id": 222,
"volume_id": null
},
"sdd": {
"disk_id": 333,
"volume_id": null
}
}
}
UPDATE:
I must use a for loop inside bash to create the JSON data.
If you insist on doing this in a bash loop, how about:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
json='{"devices": {}}'
for i in ${!disk_ids[@]}
do
json="$(
jq --argjson disk_id "${disk_ids[$i]}" --argjson volume_id "${volume_ids[$i]}" '
.devices |= . + {"sd\([length + 97] | implode)": {$disk_id, $volume_id}}
' <<< "$json"
)"
done
echo "$json"
Or, with letters included:
#!/bin/bash
disk_ids=(111 null 222 333)
volume_ids=(null 444 null null)
letters=(a b c d)
json='{"devices": {}}'
for i in ${!disk_ids[@]}
do
json="$(
jq --argjson disk_id "${disk_ids[$i]}" --argjson volume_id "${volume_ids[$i]}" --arg letter "${letters[$i]}" '
.devices["sd\($letter)"] += {$disk_id, $volume_id}
' <<< "$json"
)"
done
echo "$json"

Windows version fails where jqplay.org works

I've been using jq to parse the output from the AWS CLI.
The output looks something like this:
{
"Vpcs": [
{
"CidrBlock": "10.29.19.64/26",
"State": "available",
"VpcId": "vpc-0ba51bd29c41d41",
"IsDefault": false,
"Tags": [
{
"Key": "Name",
"Value": "CloudEndure-Europe-Development"
}
]
}
]}
and the script I am using looks like this:
.Vpcs[] | [.VpcId, .CidrBlock, (.Tags[]|select(.Key=="Name")|.Value)]
If I run it under Windows it fails like this.
jq: error: Name/0 is not defined at <top-level>, line 1:
.Vpcs[] | [.VpcId, .CidrBlock, (.Tags[]|select(.Key==Name)|.Value)]
jq: 1 compile error
But it works fine in jqplay.org.
Any ideas? On Windows I'm using jq 1.6.
Thanks
Bruce.
The correct jq program is
.Vpcs[] | [.VpcId, .CidrBlock, ( .Tags[] | select( .Key == "Name" ) | .Value ) ]
You didn't show the command you used, but you provided the following to jq:
.Vpcs[] | [.VpcId, .CidrBlock, ( .Tags[] | select( .Key == Name ) | .Value ) ]
That's incorrect. (Notice the missing quotes.)
Not only did you not provide the command you used, you didn't specify whether it was being provided to the Windows API (CreateProcess), the Windows shell (cmd), or PowerShell.
I'm guessing cmd. In order to provide the above program to jq, you can use the following cmd command:
jq ".Vpcs[] | [.VpcId, .CidrBlock, ( .Tags[] | select( .Key == \"Name\" ) | .Value ) ]" file.json
I don't agree with ikegami about the CMD command they provided, because the character used for CMD escaping is ^, not \ as in Assembly/C/C++. I hope this will work (I don't want to test it on my potato of a machine):
jq .Vpcs[] | [.VpcId, .CidrBlock, ( .Tags[] | select( .Key == "Name" ) | .Value ) ] file.json
or this:
jq .Vpcs[] | [.VpcId, .CidrBlock, ( .Tags[] | select( .Key == ^"Name^" ) | .Value ) ] file.json
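A way to sidestep the shell-quoting problem entirely (not from either answer, just a standard jq option) is to save the program to a file, say vpc.jq (name chosen here for illustration), and pass it with -f:
jq -f vpc.jq file.json
Since the filter never passes through the shell, no escaping is needed in cmd, PowerShell, or any other shell.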

replace string with jq

I have the following file file.txt:
{"a": "a", "b": "a", "time": "20210210T10:10:00"}
{"a": "b", "b": "b", "time": "20210210T11:10:00"}
I extract the values with the jq command in bash (I use this command on massive 100 GB files):
jq -r '[.a, .b, .time] | @tsv'
This returns good result of:
a a 20210210T10:10:00
b b 20210210T11:10:00
The output I would like is:
a a 2021-02-10 10:10:00
b b 2021-02-10 11:10:00
The problem is that I want to change the format of the date in the most efficient way possible.
How do I do that?
You can do it in sed, but you can also call sub directly in jq:
jq -r '[.a, .b,
( .time
| sub("(?<y>\\d{4})(?<m>\\d{2})(?<d>\\d{2})T";
.y+"-"+.m+"-"+.d+" ")
)
] | @tsv'
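Run against the file.txt from the question, this should print the desired form directly (same filter as above, just as a one-liner):
jq -r '[.a, .b, (.time | sub("(?<y>\\d{4})(?<m>\\d{2})(?<d>\\d{2})T"; .y+"-"+.m+"-"+.d+" "))] | @tsv' file.txt
a a 2021-02-10 10:10:00
b b 2021-02-10 11:10:00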
Use strptime for date interpretation and strftime for formatting:
parse.jq
[
.a,
.b,
( .time
| strptime("%Y%m%dT%H:%M:%S")
| strftime("%Y-%m-%d %H:%M:%S")
)
] | @tsv
Run it like this:
<input.json jq -rf parse.jq
Or as a one-liner:
<input.json jq -r '[.a,.b,(.time|strptime("%Y%m%dT%H:%M:%S")|strftime("%Y-%m-%d %H:%M:%S"))]|@tsv'
Output:
a a 2021-02-10 10:10:00
b b 2021-02-10 11:10:00
Since speed is an issue, and since there does not appear to be a need for anything more than string splitting, you could compare string splitting done in jq using
[.a, .b,
(.time | "\(.[:4])-\(.[4:6])-\(.[6:8]) \(.[9:])")]
vs similar splitting using awk -F\\t 'BEGIN{OFS=FS} ....' (awk for ease of handling the TSV).
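For reference, a runnable sketch of the jq string-splitting idea (my wording; the slice positions assume the fixed-width YYYYMMDDTHH:MM:SS layout shown in the question), which produces the same output as above:
jq -r '[.a, .b, (.time | "\(.[:4])-\(.[4:6])-\(.[6:8]) \(.[9:])")] | @tsv' file.txt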
With sed:
$ echo "20210427T19:23:00" | sed -r 's|([[:digit:]]{4})([[:digit:]]{2})([[:digit:]]{2})T|\1-\2-\3 |'
2021-04-27 19:23:00

Using jq to get json values

Input json:
{
"food_group": "fruit",
"glycemic_index": "low",
"fruits": {
"fruit_name": "apple",
"size": "large",
"color": "red"
}
}
The two jq commands below work:
# jq -r 'keys_unsorted[] as $key | "\($key), \(.[$key])"' food.json
food_group, fruit
glycemic_index, low
fruits, {"fruit_name":"apple","size":"large","color":"red"}
# jq -r 'keys_unsorted[0:2] as $key | "\($key)"' food.json
["food_group","glycemic_index"]
How do I get the values for the first two keys using jq in the same manner? I tried the below:
# jq -r 'keys_unsorted[0:2] as $key | "\($key), \(.[$key])"' food.json
jq: error (at food.json:9): Cannot index object with array
Expected output:
food_group, fruit
glycemic_index, low
To iterate over an object, you can use to_entries, which transforms it into an array of key/value pairs.
Then you can use select to filter the rows you want to keep.
jq -r 'to_entries[]| select( ( .value | type ) == "string" ) | "\(.key), \(.value)" '
You can use to_entries
to_entries[] | select(.key=="food_group" or .key=="glycemic_index") | "\(.key), \(.value)"
Demo
https://jqplay.org/s/Aqvos4w7bo
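If you literally want the first two keys in their original order, whatever their names, you could also slice the entries instead of selecting by key (a sketch against the same food.json):
jq -r 'to_entries[:2][] | "\(.key), \(.value)"' food.json
food_group, fruit
glycemic_index, low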

How to convert nested JSON to CSV using only jq

I've following json,
{
"A": {
"C": {
"D": "T1",
"E": 1
},
"F": {
"D": "T2",
"E": 2
}
},
"B": {
"C": {
"D": "T3",
"E": 3
}
}
}
I want to convert it into CSV, as follows:
A,C,T1,1
A,F,T2,2
B,C,T3,3
Description of output: the parent keys are printed until a leaf child is reached; once a leaf child is reached, its value is printed.
I've tried the following and couldn't get it to work,
cat my.json | jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $rows[] | @csv'
and it throws an error.
I can't hardcode the parent keys, as the actual JSON has too many records, but the structure is similar. What am I missing?
Some of the requirements are unclear, but the following solves one interpretation of the problem:
paths as $path
| {path: $path, value: getpath($path)}
| select(.value|type == "object" )
| select( [.value[]][0] | type != "object")
| .path + ([.value[]])
| @csv
(This program could be optimized but the presentation here is intended to make the separate steps clear.)
Invocation:
jq -r -f leaves-to-csv.jq input.json
Output:
"A","C","T1",1
"A","F","T2",2
"B","C","T3",3
Unquoted strings
To avoid the quotation marks around strings, you could replace the last component of the pipeline above with:
join(",")
Here is a solution using tostream and group_by
[
tostream
| select(length == 2) # e.g. [["A","C","D"],"T1"]
| .[0][:-1] + [.[1]] # ["A","C","T1"]
]
| group_by(.[:-1]) # [[["A","C","T1"],["A","C",1]],...
| .[] # [["A","C","T1"],["A","C",1]]
| .[0][0:2] + map(.[-1]|tostring) # ["A","C","T1","1"]
| join(",") # "A,C,T1,1"