I'm using these two commands to process a huge single sequence of JSON objects:
$ jq -c '.[]' csvjson.json | split -l 25 - splitted
The command above creates several splitted-* files, each containing 25 lines.
$ jq --slurp 'map({PutRequest: {Item: map_values({S: .})}})' splitted-n > output-n.json
Is there any way to pipeline above two commands?
We can make use of the split --filter option:
jq -c '.[]' csvjson.json |
split -l25 --filter='jq --slurp "map({PutRequest: {Item: map_values({S: .})}})" >$FILE.json' - output
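For illustration, here is a minimal end-to-end run. The two-object input file is made up, and --filter assumes GNU split:

$ printf '[{"a":"1","b":"2"},{"a":"3","b":"4"}]' > csvjson.json
$ jq -c '.[]' csvjson.json |
  split -l1 --filter='jq --slurp "map({PutRequest: {Item: map_values({S: .})}})" > "$FILE".json' - output
$ ls output*
outputaa.json  outputab.json

split sets $FILE to the generated chunk name for each invocation of the filter, so every batch (25 lines in the original, 1 here for brevity) flows through the inner jq and lands in its own .json file.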
I need to query an API endpoint for specific parameters, but there's a parameter limit of 20.
The params are gathered into an array and stored in a JSON file, then referenced in a variable tacked onto the end of my curl command, which generates the full API request.
curl -s -g -X GET '/api/endpoint?parameters='$myparams
e.g.
curl -s -g -X GET '/api/endpoint?parameters=["1","2","3","etc"]'
This works fine when the params JSON is small and below the per-request parameter limit. The only problem is that the params list fluctuates and is often many times larger than the request limit.
My normal thinking would be to iterate through the param lines, but that would create many requests and probably get me blocked too.
What would a good approach be to parse the parameter-array JSON and generate curl API requests that respect the parameter limit, with the minimum number of requests? Say it's 115 params now; that'd create 5 API requests with 20 params tacked on and 1 with 15.
You can chunk the array with the undocumented _nwise function and then use that, e.g.:
<<JSON jq -r '_nwise(3) | "/api/endpoint?parameters=\(.)"'
["1","2","3","4","5","6","7","8"]
JSON
Output:
/api/endpoint?parameters=["1","2","3"]
/api/endpoint?parameters=["4","5","6"]
/api/endpoint?parameters=["7","8"]
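Since _nwise is an internal builtin rather than part of jq's documented API, it could change between releases; a more defensive sketch defines an equivalent filter inline (mirroring jq's own definition; params.json is a hypothetical file holding the parameter array):

jq -r 'def nwise($n): if length <= $n then . else .[0:$n], (.[$n:] | nwise($n)) end;
       nwise(3) | "/api/endpoint?parameters=\(.)"' params.json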
This will generate the URLs for your curl calls, which you can then save in a file or consume directly:
<input.json jq -r ... | while read -r url; do curl -s -g -XGET "$url"; done
Or generate the query string only and use it in your curl call (pay attention to proper escaping/quoting):
<input.json jq -c '_nwise(3)' | while read -r qs; do curl -s -g -XGET "/api/endpoint?parameters=$qs"; done
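On the escaping point: brackets and quotes are not valid raw query-string characters, so one hedged variant is to let jq's @uri format string percent-encode the interpolated chunk; with the URL encoded, curl's -g (no-glob) flag is no longer needed:

<input.json jq -r '_nwise(3) | @uri "/api/endpoint?parameters=\(.)"' | while read -r url; do curl -s -XGET "$url"; done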
Depending on your input format and requirements regarding robustness, you might not need jq at all; sed and paste can do the trick:
<<IN sed 's/\\/&&/g;s/"/\\"/g' | sed 's/^/"/;s/$/"/' | paste -sd ',,\n' | while read -r items; do curl -s -g -XGET "/api/endpoint?parameters=[$items]"; done
1
2
3
4
5
6
7
8
IN
This results in the following curl invocations:
curl -s -g -XGET /api/endpoint?parameters=["1","2","3"]
curl -s -g -XGET /api/endpoint?parameters=["4","5","6"]
curl -s -g -XGET /api/endpoint?parameters=["7","8"]
Explanation:
sed 's/\\/&&/g;s/"/\\"/g': replace \ with \\ and " with \".
sed 's/^/"/;s/$/"/': wrap each line/item in quotes.
paste -sd ',,\n': take 3 lines and join them with commas (repeat the comma character one time fewer than the number of items you want per chunk).
while read -r items; do curl -s -g -XGET "/api/endpoint?parameters=[$items]"; done: read the generated items, wrap them in brackets, and run curl.
I have two arrays with numbers that are already stored in variables:
$SLOT_IDS = [1,2,3,4,5]
$PR_IDS = [3,4]
I would like to find which numbers are in array 1 but not array 2. So in this case it would be
$OBSOLETE_SLOT_IDS = [1,2,5]
and then I would like to run a command for each of those numbers, inserting it into a placeholder:
So this command:
az webapp deployment slot delete -g group --name webapp --slot pr-<PLACEHOLDER>
Should be run three times:
az webapp deployment slot delete -g group --name webapp --slot pr-1
az webapp deployment slot delete -g group --name webapp --slot pr-2
az webapp deployment slot delete -g group --name webapp --slot pr-5
I know that should look something like this (it is required that it is inline):
for i in $OBSOLETE_SLOT_IDS; do az webapp deployment slot delete -g group --name webapp --slot pr-$i; done
So my questions:
How can I calculate $OBSOLETE_SLOT_IDS from the other two variables with an inline command?
What is the correct version of the for loop?
Comment from the asker: it seems the variables do not contain actual arrays; they are basically the return values of some curl calls that I stored in variables.
A shorter approach that uses jq to get the difference of the two arrays:
#!/usr/bin/env bash
slot_ids="[1,2,3,4,5]"
pr_ids="[3,4]"
while read -r id; do
az webapp deployment slot delete -g group --name webapp --slot "pr-$id"
done < <(jq -n --argjson a "$slot_ids" --argjson b "$pr_ids" '$a - $b | .[]')
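jq's - operator on arrays removes every element of the right-hand array from the left-hand one, which is exactly the set difference needed here. A standalone check:

$ jq -cn --argjson a '[1,2,3,4,5]' --argjson b '[3,4]' '$a - $b'
[1,2,5]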
jq -r '.[]' will transform your array to a stream with one number per line -- which is the format that standard UNIX tools expect to work with.
Once we have the numbers in a sorted form, we can use comm to compare the two streams. -3 tells comm to ignore content present in both streams, and -2 tells it to ignore content present only in the second stream, so comm -23 prints only lines unique to the first stream.
Using readarray (added in bash 4.0) then lets us read that content into an array, which we can iterate over in our for loop.
#!/usr/bin/env bash
slot_ids='[1,2,3,4,5]'
pr_ids='[3,4]'
readarray -t filtered_slot_ids < <(
comm -23 \
<(jq -r '.[]' <<<"$slot_ids" | sort) \
<(jq -r '.[]' <<<"$pr_ids" | sort))
for i in "${filtered_slot_ids[@]}"; do
az webapp deployment slot delete -g group --name webapp --slot "pr-$i"
done
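To see what the comm step contributes on its own, here is an isolated run with the same numbers:

$ comm -23 <(printf '%s\n' 1 2 3 4 5 | sort) <(printf '%s\n' 3 4 | sort)
1
2
5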
I have a bunch of secret (key/value) pairs stored in AWS Secrets Manager. I tried to parse the secrets using jq:
aws secretsmanager get-secret-value --secret-id <secret_bucket_name> | jq --raw-output '.SecretString' | jq -r .PASSWORD
It retrieves the value stored in .PASSWORD, but I don't just want the value for a single key; I want to retrieve every key/value pair in the following manner:
KEY_1="1234"
KEY_2="0000"
.
.
.
so on...
The command above doesn't produce this format, and I have to run it once for every key/value, which is tedious. Am I doing something wrong, or is there a better way of doing this?
This isn't related to Python so much as to the behaviour of the AWS CLI and jq. I came up with something like this:
aws secretsmanager get-secret-value --secret-id <secret_name> --output text --query SecretString | jq ".[]"
There are literally a hundred different ways to format something like this.
The AWS CLI itself has a lot of options to filter output using the --query option: https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-output.html
The exact conversion you are looking for would require something like this:
aws secretsmanager get-secret-value --secret-id <secret_name> --output text --query SecretString \
| jq -r 'to_entries[] | [.key, "=", "\"", .value, "\"" ] | @tsv' \
| tr -d "\t"
There has to be some better way of doing this!!
Try the snippet below. I tend to put these little helper filters into their own shell function <3
tokv() {
jq -r 'to_entries|map("\(.key|ascii_upcase)=\"\(.value|tostring)\"")|.[]'
}
$ echo '{"foo":"bar","baz":"fee"}' | tokv
FOO="bar"
BAZ="fee"
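Combined with the AWS CLI call from the question (the secret name stays a placeholder), usage would look something like:

$ aws secretsmanager get-secret-value --secret-id <secret_name> --query SecretString --output text | tokv
KEY_1="1234"
KEY_2="0000"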
I am running a Puppet Bolt command to query certain information from a set of servers in JSON format. I am piping it to jq. Below is what I get:
$ bolt command run "cat /blah/blah" -n @hname.txt -u uid --no-host-key-check --format json | jq -jr '.items[]|[.node],[.result.stdout]'
[
"node-name"
][
"stdout data\n"
]
What do I need to do to make it appear like this?
["nodename":"stdout data"]
If you really want output that is not valid JSON, you will have to construct the output string, which can easily be done using string interpolation, e.g.:
jq -r '.items[] | "[\"\(.node)\",\"\(.result.stdout)\"]"'
@peak thank you, that helped. Below is how it looks:
$ bolt command run "cat /blah/blah" -n @hname.txt -u UID --no-host-key-check --format json | jq -r '.items[] | "[\"\(.node)\",\"\(.result.stdout)\"]"'
["node name","stdout data
"]
I used a workaround to get the data I needed by passing the @csv filter to the command itself. Sharing what worked below.
$ bolt command run "cat /blah/blah" -n @hname.txt -u uid --no-host-key-check --format json | jq -jr '.items[]|[.node],[.result.stdout]|@csv'
""node-name""stdout.data
"
I can compact JSON using jq -c like so:
cat file.json | jq -c .
which will output all the JSON on a single line. Is there a command that can de-compact it so it's more human-readable again? Basically adding newlines in the right places?
. is the identity filter (jq pretty-prints all output by default):
cat file.json | jq -c . | jq .
jq . will pretty-print it again.
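For example:

$ echo '{"a":1,"b":[2,3]}' | jq .
{
  "a": 1,
  "b": [
    2,
    3
  ]
}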