This is a variation on a question that's been asked before.
I'm using an external data source in Terraform to ask it for a list of volume snapshots in AWS Dublin, and JQ in a templatefile to extract the snapshot ids.
data "external" "volsnapshot_ids" {
program = [
"bash",
"-c",
templatefile("cli.tftpl", {input_string = "aws ec2 describe-snapshots --region=eu-west-1", top = "Snapshots", next = "| .SnapshotId"})]
}
And it uses this templatefile:
#!/bin/bash
set -e
OUTPUT=$(${input_string} | jq -r -c '.${top}[] ${next}' | jq -R -s -c 'split("\n")' | jq '.[:-1]')
jq -n -c --arg output "$OUTPUT" '{"output":$output}'
The basic CLI command with JQ works and looks like this:
aws ec2 describe-snapshots --region=eu-west-1 | jq -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l
It returns a lot of snapshot ids.
When I run it through Terraform though, it errors:
Error: External Program Execution Failed
│
│ with data.external.volsnapshot_ids,
│ on data.tf line 304, in data "external" "volsnapshot_ids":
│ 304: program = [
│ 305: "bash",
│ 306: "-c",
│ 307: templatefile("cli.tftpl", {input_string = "aws ec2 describe-snapshots --region=eu-west-1", top = "Snapshots", next = "| .SnapshotId"})]
│
│ The data source received an unexpected error while attempting to execute
│ the program.
│
│ Program: /bin/bash
│ Error Message: bash: line 6: /usr/local/bin/jq: Argument list too long
│
│ State: exit status 1
I think it's the size of the dataset being returned, because it works in regions with fewer snapshot ids: London works.
Sizewise, here's London:
aws ec2 describe-snapshots --region=eu-west-2 | jq -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l
20000
And here's Dublin:
aws ec2 describe-snapshots --region=eu-west-1 | jq -r -c '.Snapshots[] | .SnapshotId' | jq -R -s -c 'split("\n")' | jq '.[:-1]' | wc -l
42500
Is there a way to fix up the JQ in my templatefile so it can handle big JSON files?
I wouldn't recommend running a command inside a Terraform data source; it can be hard to debug. The AWS provider has a data source for EBS snapshots (aws_ebs_snapshot_ids).
As for the command inside your template: to debug it, you need to simulate the same environment. For example, instead of running the pipeline as is, repeat what the template does, including the bash -c wrapper. You can also add an output that shows the rendered template, to check it for issues.
Scroll to bottom of answer.
Don't pass the value as an argument; feed it directly via standard input:
aws ... \
| jq -rc '.${top}[] ${next}' \
| jq -Rsc './"\n"' \
| jq -c '.[:-1]' \
| jq -Rc '{output:.}'
Note that you can probably combine most of the separate jq invocations into a single jq program.
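The "Argument list too long" error in the question is raised when a process cannot be started because its command line is too large: --arg output "$OUTPUT" puts every snapshot id into a single argument to jq. Data read from standard input is not subject to that limit. The sketch below only inspects the limit and shows a large payload travelling through pipes; the generated snap-N ids and the count of 50000 are made up for illustration.

```shell
# Combined size limit (in bytes) for the arguments and environment of a new process.
arg_max=$(getconf ARG_MAX)
printf 'ARG_MAX: %s bytes\n' "$arg_max"

# Hypothetical payload: 50000 fake snapshot ids, one per line.
# It flows through pipes (standard input), so it never becomes an argument.
payload_bytes=$(seq 1 50000 | sed 's/^/snap-/' | wc -c | tr -d ' ')
printf 'Payload streamed over stdin: %s bytes\n' "$payload_bytes"
```

If the payload is close to or above the reported limit, passing it with --arg is likely to fail, while the piped form keeps working.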
This pipeline of jq invocations is a massively, massively overcomplicated non-solution. Why convert back and forth between strings and JSON objects, parsing those strings again, when jq can already process the data directly?
aws ... | jq -c '{ output: .Snapshots | map(.SnapshotId) | tostring }'
Example output:
{"output":"[\"snap-cafebabe\",\"snap-deadbeef\",\"snap-0123abcd\"]"}
If you have to use variables:
top=Snapshots
next=SnapshotId
aws ... | jq --arg top "$top" --arg next "$next" -c '{ output: .[$top] | map(.[$next]) | tostring }'
or .[$top] | map(.[$next]) | tostring | { output: . } or .[$top] | map(.[$next]) | { output: tostring }.
Even if you want or need to string together multiple jq invocations, there's little sense in consuming raw input (-R) and trying to parse it when you already have perfectly structured JSON items in stream form.
Here is what it would look like if you wanted to do it with multiple steps, but always stay in JSON land (and not play ping pong between structured JSON and unstructured text):
top=Snapshots
next=SnapshotId
aws ... \
| jq --arg top "$top" --arg next "$next" '.[$top][][$next]' \
| jq -sc '{ output: tostring }'
or the equivalent:
top=Snapshots
next=SnapshotId
aws ... \
| jq --arg top "$top" --arg next "$next" '.[$top] | map(.[$next])' \
| jq -c '{ output: tostring }'
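Terraform's external data source expects the program to print a JSON object whose values are all strings, which is why the array is passed through tostring. Here is a minimal sketch for checking the wrapping locally, using a made-up two-snapshot document in place of the real aws output; the final fromjson step only demonstrates that the string still holds the original array.

```shell
# Made-up stand-in for the output of `aws ec2 describe-snapshots`.
sample='{"Snapshots":[{"SnapshotId":"snap-cafebabe"},{"SnapshotId":"snap-deadbeef"}]}'

# Wrap the list of ids as one JSON-encoded string under the "output" key.
wrapped=$(printf '%s' "$sample" | jq -c '{ output: .Snapshots | map(.SnapshotId) | tostring }')
printf '%s\n' "$wrapped"

# Decode the string again to confirm the array survived the round trip.
decoded=$(printf '%s' "$wrapped" | jq -c '.output | fromjson')
printf '%s\n' "$decoded"
```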
I want to use a bash script to output the contents of top command and then write it to a json file. But I'm having difficulty writing the slashes/encodings/line breaks into a file with a valid json object
Here's what I tried:
#!/bin/bash
message1=$(top -n 1 -o %CPU)
message2=$(top -n 1 -o %CPU | jq -aRs .)
message3=$(top -n 1 -o %CPU | jq -Rs .)
message4=${message1//\\/\\\\/}
echo "{\"message\":\"${message2}\"}" > file.json
But when I look at file.json, it looks something like this:
{"message":""\u001b[?1h\u001b=\u001b[?25l\u001b[H\u001b[2J\u001b(B\u001b[mtop - 21:34:53 up 55 days, 5:14, 2 users, load average: 0.17, 0.09, 0.03\u001b(B\u001b[m\u001b[39;49m\u001b(B\u001b[m\u001b[39;49m\u001b[K\nTasks:\u001b(B\u001b[m\u001b[39;49m\u001b[1m 129 \u001b(B\u001b[m\u001b[39;49mtotal,\u001b(B\u001b[m\u001b[39;49m\u001b[1m 1 \u001b(B\u001b[m\u001b[39;49mrunning,\u001b(B\u001b[m\u001b[39;49m\u001b[1m 128 \u001b(B\u001b[m\u001b[39;49msleeping,\u001b(B\u001b[m
Each of the other attempts with message1 to message4 all result in various json syntax issues.
Can anyone suggest what I should try next?
You don't need all the bells and whistles of echo and multiple jq invocations. Run top in batch mode (-b) and let jq build the whole object:
top -b -n 1 -o %CPU | jq -aRs '{"message": .}' >file.json
Or pass the output of the top command as an argument variable.
Using --arg to pass arguments to jq:
jq -an --arg msg "$(top -b -n 1 -o %CPU)" '{"message": $msg}' >file.json
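The doubled quote in the file.json shown in the question most likely comes from wrapping the output of jq -Rs ., which is already a quoted JSON string, in another pair of quotes. Letting jq build the whole object avoids that. Below is a small sketch with a made-up string standing in for the top output (so it runs without top), checking that quotes, tabs and backslashes survive encoding and decoding.

```shell
# Made-up text containing characters that need escaping in JSON.
raw=$(printf 'load "average"\n\tpath C:\\temp\n')

# Encode the raw text as the value of "message".
json=$(printf '%s' "$raw" | jq -Rs -c '{message: .}')
printf '%s\n' "$json"

# Decode it again; the result should equal the original text.
roundtrip=$(printf '%s' "$json" | jq -r '.message')
```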
I have a .properties file that I'm trying to convert to a JSON file using bash commands, and I want to exclude particular keys from the JSON output. Below are the properties in the file; I want to exclude properties 4 and 5 from the conversion.
app.database.address=127.0.0.70
app.database.host=database.myapp.com
app.database.port=5432
app.database.user=dev-user-name
app.database.pass=dev-password
app.database.main=dev-database
Here's my bash command used for converting to json but it converts all the properties to json
cat fle.properties | jq -R -s 'split("\n") | map(split("=")) | map({(.[0]): .[1]}) | add' > zppprop.json
Is there any way to exclude these properties from the conversion to JSON?
With xidel:
XPath + JSONiq solution
$ xidel -s fle.properties -e '
{|
x:lines($raw)[not(position() = (4,5))] ! {
substring-before(.,"="):substring-after(.,"=")
}
|}
'
{
"app.database.address": "127.0.0.70",
"app.database.host": "database.myapp.com",
"app.database.port": "5432",
"app.database.main": "dev-database"
}
x:lines($raw) is a shorthand for tokenize($raw,'\r\n?|\n') and turns $raw, the raw input, into a sequence where every new line is another item.
[not(position() = (4,5))] if it's always the 4th and 5th line you want to exclude. Otherwise, use [not(contains(.,"user") or contains(.,"pass"))] as seen below.
XQuery solution
$ xidel -s --xquery '
map:merge(
for $x in file:read-text-lines("fle.properties")[not(contains(.,"user") or contains(.,"pass"))]
let $kv:=tokenize($x,"=")
return
{$kv[1]:$kv[2]}
)
'
{
"app.database.address": "127.0.0.70",
"app.database.host": "database.myapp.com",
"app.database.port": "5432",
"app.database.main": "dev-database"
}
You can use file:read-text-lines() to do everything "in-query".
Playground.
You may filter out unneeded lines with grep:
cat fle.properties | grep -v -E "user|pass" | jq -R -s 'split("\n") | map(select(length > 0)) | map(split("=")) | map({(.[0]): .[1]}) | add'
You also need to remove the empty string at the end of the array returned by split; this is what map(select(length > 0)) does.
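The same filtering can be done inside a single jq call. The sketch below is one possible variant, not the approach from the answers here: it reads the lines with -R and inputs, splits each line at the first = only (so a value containing = stays intact), and drops keys ending in .user or .pass. The sample properties, including the app.database.opts line, are made up for illustration.

```shell
# Made-up sample; app.database.opts has a value that itself contains "=".
props='app.database.host=database.myapp.com
app.database.user=dev-user-name
app.database.pass=dev-password
app.database.opts=a=b'

result=$(printf '%s\n' "$props" | jq -Rn -c '
  [ inputs
    | select(length > 0)                                  # skip blank lines
    | capture("^(?<key>[^=]+)=(?<value>.*)$")             # split at the first "="
    | select(.key | test("\\.(user|pass)$") | not)        # drop excluded keys
  ]
  | from_entries')
printf '%s\n' "$result"
```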
You can do the exclusion within the jq script:
properties2json
#!/usr/bin/env -S jq -sRf
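split("\n") |
# Drop the empty string left by the trailing newline; test() would fail on it
map(select(length > 0)) |
map(split("=")) |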
map(
if .[0] | test(".*\\.(user|pass)";"i")
then
{}
else
{(.[0]): .[1]}
end
) |
add
# Make it executable
chmod +x properties2json
# Run it
./properties2json file.properties >file.json
I have a response trace file containing the response below:
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
I need to fetch the value of the "id" key into a variable that I can use later in my code.
The expected result is:
echo $id should print 70EA96FB313349279EB089BA9DE2EC3B
With valid JSON (delete the first two lines with sed and parse the rest with jq):
id=$(sed '1,2d' file | jq -r '.member[]|.id')
Output to variable id:
70EA96FB313349279EB089BA9DE2EC3B
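Deleting lines by number assumes the header is always exactly two lines. A slightly more tolerant sketch drops every line starting with # instead. The trace below is a shortened copy of the one in the question, with the trailing comma removed so that it is valid JSON.

```shell
# Shortened sample trace; the trailing comma from the question is removed.
trace='#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product"}]}'

# Strip the comment-style header lines, then read the id of the first member.
id=$(printf '%s\n' "$trace" | grep -v '^#' | jq -r '.member[0].id')
printf '%s\n' "$id"
```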
I would strongly suggest using jq to parse json.
But given that json is mostly compatible with python dictionaries and arrays, this HACK would work too:
$ cat resp
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
$ awk 'NR==3{print "a="$0;print "print a[\"member\"][0][\"id\"]"}' resp | python
70EA96FB313349279EB089BA9DE2EC3B
$ sed -n '3s|.*|a=\0\nprint a["member"][0]["id"]|p' resp | python
70EA96FB313349279EB089BA9DE2EC3B
Note that this code is
1. dirty hack, because your system does not have the right tool - jq
2. susceptible to shell injection attacks. Hence use it ONLY IF you trust the response received from your service.
Quick and dirty (don't use eval):
eval $(cat response_file | tail -1 | awk -F , '{ print $5 }' | sed -e 's/"//g' -e 's/:/=/')
It is based on the exact structure you gave, and hoping there is no , in any value before "id".
Or assign it yourself:
id=$(cat response_file | tail -1 | awk -F , '{ print $5 }' | cut -d: -f2 | sed -e 's/"//g')
Note that you can't access the name field with that trick, as it is the first item of the member array and will be "swallowed" by the { print $2 }. You can use an even-uglier hack to retrieve it though:
id=$(cat response_file | tail -1 | sed -e 's/:\[/,/g' -e 's/}\]//g' | awk -F , '{ print $5 }' | cut -d: -f2 | sed -e 's/"//g')
But, if you can, jq is the right tool for that work instead of ugly hacks like that (but if it works...).
When you can't use jq, you can consider
id=$(grep -Eo "[0-9A-F]{32}" file)
This only works when the file looks like what I expect, so you might need to add extra checks, like
id=$(grep "My des_" file | grep -Eo "[0-9A-F]{32}" | head -1)
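Another possible check, sketched here under the assumption that the key and value appear exactly as "id":"<32 uppercase hex characters>" with no spaces, is to anchor the match on the key name so that other 32-character hex strings are not picked up. The sample line is shortened from the question and keeps its trailing comma, since this approach does not parse JSON.

```shell
# Shortened sample line from the trace file (not valid JSON, and it need not be).
line='{"totalItems":1,"member":[{"name":"name","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product",}]}'

# Print only the captured value that directly follows the "id" key.
id=$(printf '%s\n' "$line" | sed -n 's/.*"id":"\([0-9A-F]\{32\}\)".*/\1/p')
printf '%s\n' "$id"
```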
I receive some json that I process until it becomes just text lines. In the first line there's a value that I would like to keep in a variable and all the rest after the first line should be displayed with less or other utils.
Can I do this without using a temporary file?
The context is this:
aws logs get-log-events --log-group-name "$logGroup" --log-stream-name "$logStreamName" --limit "$logSize" |
jq '{message:.nextForwardToken}, .events[] | .message' |
sed 's/^"//g' | sed 's/"$//g'
In the first line there's the nextForwardToken that I want to put in the variable and all the rest is log messages.
The json looks like this:
{
"events": [
{
"timestamp": 1518081460955,
"ingestionTime": 1518081462998,
"message": "08.02.2018 09:17:40.955 [SimpleAsyncTaskExecutor-138] INFO o.s.b.c.l.support.SimpleJobLauncher - Job: [SimpleJob: [name=price-update]] launched with the following parameters: [{time=1518081460875, sku=N-W7ZLH9U737B|N-XIBH22XQE87|N-3EXIRFNYNW0|N-U19C031D640|N-6TQ1847FQE6|N-NF0XCNG0029|N-UJ3H0OZROCQ|N-W2JKJD4S6YP|N-VEMA4QVV3X1|N-F40J6P2VM01|N-VIT7YEAVYL2|N-PKLKX1PAUXC|N-VPAK74C75DP|N-C5BLYC5HQRI|N-GEIGFIBG6X2|N-R0V88ZYS10W|N-GQAF3DK7Y5Z|N-9EZ4FDDSQLC|N-U15C031D668|N-B8ELYSSFAVH}]"
},
{
"timestamp": 1518081461095,
"ingestionTime": 1518081462998,
"message": "08.02.2018 09:17:41.095 [SimpleAsyncTaskExecutor-138] INFO o.s.batch.core.job.SimpleStepHandler - Executing step: [index salesprices]"
},
{
"timestamp": 1518082421586,
"ingestionTime": 1518082423001,
"message": "08.02.2018 09:33:41.586 [upriceUpdateTaskExecutor-3] DEBUG e.u.d.a.j.d.b.StoredMasterDataReader - Reading page 1621"
}
],
"nextBackwardToken": "b/33854347851370569899844322814554152895248902123886870536",
"nextForwardToken": "f/33854369274157730709515363051725446974398055862891970561"
}
I need to put in a variable this:
f/33854369274157730709515363051725446974398055862891970561
and display (or put in an other variable) the messages:
08.02.2018 09:17:40.955 [SimpleAsyncTaskExecutor-138] INFO o.s.b.c.l.support.SimpleJobLauncher - Job: [SimpleJob: [name=price-update]] launched with the following parameters: [{time=1518081460875, sku=N-W7ZLH9U737B|N-XIBH22XQE87|N-3EXIRFNYNW0|N-U19C031D640|N-6TQ1847FQE6|N-NF0XCNG0029|N-UJ3H0OZROCQ|N-W2JKJD4S6YP|N-VEMA4QVV3X1|N-F40J6P2VM01|N-VIT7YEAVYL2|N-PKLKX1PAUXC|N-VPAK74C75DP|N-C5BLYC5HQRI|N-GEIGFIBG6X2|N-R0V88ZYS10W|N-GQAF3DK7Y5Z|N-9EZ4FDDSQLC|N-U15C031D668|N-B8ELYSSFAVH}]
08.02.2018 09:17:41.095 [SimpleAsyncTaskExecutor-138] INFO o.s.batch.core.job.SimpleStepHandler - Executing step: [index salesprices]
08.02.2018 09:33:41.586 [upriceUpdateTaskExecutor-3] DEBUG e.u.d.a.j.d.b.StoredMasterDataReader - Reading page 1621
Thanks in advance for your help.
You might consider it a bit of a trick, but you can use tee to pipe all the output to stderr and fetch the one line you want for your variable with head:
var="$(command | tee /dev/stderr | head -n 1)"
Or you can solve this with a bit of scripting:
first=true
while read -r line; do
if $first; then
first=false
var="$line"
fi
echo "$line"
done < <(command)
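A variant without a loop: read the first line into one variable and let cat collect everything else, both inside the same command group so that the variables remain set afterwards. This is only a sketch; produce_lines is a hypothetical stand-in for the aws ... | jq pipeline, and the here-document is used so that the group runs in the current shell rather than in a pipeline subshell.

```shell
# Hypothetical stand-in for the real pipeline: token first, then the messages.
produce_lines() {
  printf '%s\n' 'f/338543' 'first log message' 'second log message'
}

{
  IFS= read -r token   # first line only
  messages=$(cat)      # everything that is left on standard input
} <<EOF
$(produce_lines)
EOF

printf 'token: %s\n' "$token"
printf '%s\n' "$messages"
```

The messages end up in a single variable, so they can be piped to less with printf '%s\n' "$messages" | less.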
If you are interested in storing the contents to variables, use mapfile or read on older bash versions.
Just using read to get the first line will do. I've added the -r flag so that jq prints its output without quotes:
read -r token < <(aws logs get-log-events --log-group-name "$logGroup" --log-stream-name "$logStreamName" --limit "$logSize" | jq -r '{message:.nextForwardToken}, .events[] | .message')
printf '%s\n' "$token"
Or using mapfile
mapfile -t output < <(aws logs get-log-events --log-group-name "$logGroup" --log-stream-name "$logStreamName" --limit "$logSize" | jq -r '{message:.nextForwardToken}, .events[] | .message')
and loop through the array. The first element will always contain the token-id you want.
printf '%s\n' "${output[0]}"
The rest of the elements can be iterated over:
for ((i=1; i<${#output[@]}; i++)); do
printf '%s\n' "${output[i]}"
done
Straightforwardly:
aws logs get-log-events --log-group-name "$logGroup" \
--log-stream-name "$logStreamName" --limit "$logSize" > /tmp/log_data
-- set nextForwardToken variable:
nextForwardToken=$(jq -r '.nextForwardToken' /tmp/log_data)
echo $nextForwardToken
f/33854369274157730709515363051725446974398055862891970561
-- print all message items:
jq -r '.events[].message' /tmp/log_data
08.02.2018 09:17:40.955 [SimpleAsyncTaskExecutor-138] INFO o.s.b.c.l.support.SimpleJobLauncher - Job: [SimpleJob: [name=price-update]] launched with the following parameters: [{time=1518081460875, sku=N-W7ZLH9U737B|N-XIBH22XQE87|N-3EXIRFNYNW0|N-U19C031D640|N-6TQ1847FQE6|N-NF0XCNG0029|N-UJ3H0OZROCQ|N-W2JKJD4S6YP|N-VEMA4QVV3X1|N-F40J6P2VM01|N-VIT7YEAVYL2|N-PKLKX1PAUXC|N-VPAK74C75DP|N-C5BLYC5HQRI|N-GEIGFIBG6X2|N-R0V88ZYS10W|N-GQAF3DK7Y5Z|N-9EZ4FDDSQLC|N-U15C031D668|N-B8ELYSSFAVH}]
08.02.2018 09:17:41.095 [SimpleAsyncTaskExecutor-138] INFO o.s.batch.core.job.SimpleStepHandler - Executing step: [index salesprices]
08.02.2018 09:33:41.586 [upriceUpdateTaskExecutor-3] DEBUG e.u.d.a.j.d.b.StoredMasterDataReader - Reading page 1621
I believe the following meets the stated requirements, assuming a bash-like environment:
x=$(aws ... |
tee >(jq -r '.events[] | .message' >&2) |
jq .nextForwardToken) 2>&1
This makes the item of interest available as the shell variable $x.
Notice that the string manipulation using sed can be avoided by using the -r command-line option of jq.
Calling jq just once
x=$(aws ... |
jq -r '.nextForwardToken, (.events[] | .message)' |
tee >(tail -n +2 >&2) |
head -n 1) 2>&1
echo "x=$x"
As a newbie to bash and jq, I was trying to download several URLs from a JSON file using the jq command in bash scripts.
My items.json file looks like this :
[
{"title" : [bob], "link" :[a.b.c]},
{"title" : [alice], "link" :[d.e.f]},
{"title" : [carol], "link" :[]}
]
What I was initially doing was filtering the non-empty links into an array and then downloading each entry of the array:
#!/bin/bash
lnk=( $(jq -r '.[].link[0] | select (.!=null)' items.json) )
for element in ${lnk[@]}
do
wget $element
done
But the problem with this approach is that all the downloaded files use the link as the file name.
I want to filter the JSON file but still keep the title alongside the link, so that I can rename the file in the wget command. But I don't have any idea what structure I should use here. So how can I keep the title in the filter and use it afterwards?
You can use this:
IFS=$'\n' read -d '' -a titles < <(jq -r '.[] | select (.link[0]!=null) | .title[0]' items.json);
IFS=$'\n' read -d '' -a links < <(jq -r '.[] | select (.link[0]!=null) | .link[0]' items.json);
Then you can iterate over the arrays "${titles[@]}" and "${links[@]}" by index:
for i in "${!titles[@]}"; do
wget -O "${titles[i]}" "${links[i]}"
done
EDIT: Easier & safer approach:
jq -r '.[] | select (.link[0]!=null) | @sh "wget -O \(.title[0]) \(.link[0])"' items.json | bash
Here is a bash script demonstrating reading the result of a jq filter into bash variables.
#!/bin/bash
jq -M -r '
.[]
| select(.link[0]!=null)
| .title[0], .link[0]
' items.json | \
while read -r title; read -r url; do
echo "$title: $url" # replace with wget command
done
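A related sketch emits title and link as one tab-separated record per item with @tsv, which keeps each pair on a single line. It assumes the values in items.json are quoted strings (the listing in the question shows them unquoted) and that titles contain no tabs or newlines. The inline items data is made up, and the wget command is only printed rather than executed.

```shell
# Made-up stand-in for items.json, with the values written as quoted strings.
items='[{"title":["bob"],"link":["a.b.c"]},{"title":["alice"],"link":["d.e.f"]},{"title":["carol"],"link":[]}]'

tab=$(printf '\t')

plan=$(
  printf '%s' "$items" |
    jq -r '.[] | select(.link[0] != null) | [.title[0], .link[0]] | @tsv' |
    while IFS="$tab" read -r title url; do
      # Replace printf with the real call: wget -O "$title" "$url"
      printf 'wget -O %s %s\n' "$title" "$url"
    done
)
printf '%s\n' "$plan"
```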