Create Bash loop from JSON string using jq - json

I am writing a bash script that reads a JSON string and then loops over the JSON values to execute a CLI command.
#!/usr/bin/env bash
jq --version > /dev/null 2>&1 || { echo >&2 "jq is required but it's not installed. Aborting."; exit 1; }
read -r -d '' USER_ACTIONS << EOM
{
  "user1": [
    "action1"
  ],
  "user2": [
    "action2",
    "action3"
  ]
}
EOM
USERS= #TODO
for user in $USERS; do
  ACTIONS= #TODO
  for action in $ACTIONS; do
    echo "Executing ${command} ${user}-${action}"
  done
done
If jq is present on the server, how do I populate the USERS and ACTIONS variables?

Depending on what command you want to execute, if it can be performed from within jq, it is easier to move the loop inside jq as well. There are several ways to accomplish that; here are some examples, all yielding the same output:
jq -r 'to_entries[] | "Executing command \(.key)-\(.value[])"' <<< "$USER_ACTIONS"
jq -r 'keys[] as $user | "Executing command \($user)-\(.[$user][])"' <<< "$USER_ACTIONS"
jq -r --stream '"Executing command \(.[0][0])-\(.[1]? // empty)"' <<< "$USER_ACTIONS"
Output:
Executing command user1-action1
Executing command user2-action2
Executing command user2-action3
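If you do want to populate USERS and ACTIONS and keep the loops in bash, here is a minimal sketch (assuming, as in your sample, that user names and actions contain no whitespace):
USERS=$(jq -r 'keys[]' <<< "$USER_ACTIONS")
for user in $USERS; do
  ACTIONS=$(jq -r --arg u "$user" '.[$u][]' <<< "$USER_ACTIONS")
  for action in $ACTIONS; do
    echo "Executing ${command} ${user}-${action}"
  done
done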

Another option is to use vanilla JavaScript (Node.js):
const myvar = {
  "user1": [
    "action1"
  ],
  "user2": [
    "action2",
    "action3"
  ]
};

for (let user in myvar) {
  myvar[user].forEach((action) => {
    console.log("Executing command " + user + "-" + action);
  });
}
Output:
Executing command user1-action1
Executing command user2-action2
Executing command user2-action3
Usage
node script.js
Usage with bash
If you remove the Executing string from the output, you can then pipe it straight to bash:
node script.js | bash

Would you please try the following:
#!/bin/bash

declare -a users    # array of users
declare -A actions  # array of actions indexed by user

read -r -d '' user_actions << EOM
{
  "user1": [
    "action1"
  ],
  "user2": [
    "action2",
    "action3"
  ]
}
EOM

while IFS=$'\t' read -r key val; do
  users+=( "$key" )     # add the user to the array "users"
  actions[$key]="$val"  # associate the actions with the user
done < <(jq -r 'to_entries[] | [.key, .value[]] | @tsv' <<< "$user_actions")

for user in "${users[@]}"; do  # loop over "users"
  IFS=$'\t' read -r -a vals <<< "${actions[$user]}"
  for action in "${vals[@]}"; do
    echo yourcommand "$user" "$action"
  done
done
Output:
yourcommand user1 action1
yourcommand user2 action2
yourcommand user2 action3
[Explanations]
The jq command outputs TSV which looks like:
user1\taction1
user2\taction2\taction3
where \t represents a tab character used as a field delimiter.
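To see the actual tabs, you can pipe the jq output through cat -A (GNU cat renders each tab as ^I and each line end as $):
jq -r 'to_entries[] | [.key, .value[]] | @tsv' <<< "$user_actions" | cat -A
which prints:
user1^Iaction1$
user2^Iaction2^Iaction3$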
The first read builtin command assigns the first field to key and the
remaining field(s) to val. If val contains two or more fields,
it will be split by the next read builtin in the inner loop.
If the output looks good, drop echo and replace the string yourcommand with the actual command name.

Related

bash & jq: add attribute with object value

I'm looking for a solution to add a new attribute with a JSON object value into an existing JSON file.
My current script:
if [ ! -f "$src_file" ]; then
  echo "Source file $src_file does not exist"
  exit 1
fi

if [ ! -f "$dst_file" ]; then
  echo "Destination file $dst_file does not exist"
  exit 1
fi

if ! jq '.devDependencies' "$src_file" >/dev/null 2>&1; then
  echo "The key 'devDependencies' does not exist in source file $src_file"
  exit 1
fi
dev_dependencies=$(jq '.devDependencies' "$src_file" | xargs)

# Extract data from source file
data=$(cat "$src_file")

# Add new key-value
data=$(echo "$data" | jq --arg key "devDependencies" --arg value "$dev_dependencies" '. + {($key): ($value)}')

# Write data into destination file
echo "$data" > "$dst_file"
It's working, but the devDependencies value from $dev_dependencies is written as a string:
"devDependencies": "{ @nrwl/esbuild: 15.6.3, @nrwl/eslint-pl[...]".
How can I write it as raw JSON?
I think you want the --argjson option instead of --arg. Compare
$ jq --arg k '{"foo": "bar"}' -n '{x: $k}'
{
  "x": "{\"foo\": \"bar\"}"
}
with
$ jq --argjson k '{"foo": "bar"}' -n '{x: $k}'
{
  "x": {
    "foo": "bar"
  }
}
--arg will create a string variable. Use --argjson to parse the value as JSON (it can be an object, an array or a number).
From the docs:
--arg name value:
This option passes a value to the jq program as a predefined variable.
If you run jq with --arg foo bar, then $foo is available in the
program and has the value "bar". Note that value will be treated as a
string, so --arg foo 123 will bind $foo to "123".
Named arguments are also available to the jq program as $ARGS.named.
--argjson name JSON-text:
This option passes a JSON-encoded value to the jq program as a
predefined variable. If you run jq with --argjson foo 123, then $foo
is available in the program and has the value 123.
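As a quick illustration of $ARGS.named, which collects all named arguments into one object (a sketch):
$ jq -n --arg foo bar --arg baz qux '$ARGS.named'
{
  "foo": "bar",
  "baz": "qux"
}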
Note that you don't need multiple invocations of jq, xargs, command substitution or variables (and don't forget to quote all your variables when expanding them).
To "merge" the contents of two files, read both files with jq and let jq do the work. This avoids all the complications that arise from jumping between jq and shell context. A single line is all that's needed:
jq --slurpfile deps "$dep_file" '. + { devDependencies: $deps[0].devDependencies }' "$source_file" > "$dest_file"
or
jq --slurpfile deps "$dep_file" '. + ($deps[0]|{devDependencies})' "$source_file" > "$dest_file"
alternatively (still a one-liner):
jq --slurpfile deps "$dep_file" '.devDependencies = $deps[0].devDependencies' "$source_file" > "$dest_file"
peak's answer here reminded me of the very useful input filter, which can make the program even shorter as it avoids the variable:
jq '. + (input|{devDependencies})' "$source_file" "$dep_file" > "$dest_file"

How to iterate over an array of objects using jq

I have a JavaScript file which prints a JSON array of objects:
// myfile.js output
[
  { "id": 1, "name": "blah blah", ... },
  { "id": 2, "name": "xxx", ... },
  ...
]
In my bash script, I want to iterate through each object.
I've tried the following, but it doesn't work.
#!/bin/bash
output=$(myfile.js)
for row in $(echo ${output} | jq -c '.[]'); do
  echo $row
done
You are trying to invoke myfile.js as a command. You need this:
output=$(cat myfile.js)
instead of this:
output=$(myfile.js)
But even then, your current approach isn't going to work well if the data has whitespace in it (which it does, based on the sample you posted). I suggest the following alternative:
jq -c '.[]' < myfile.js |
while read -r row
do
  echo "$row"
done
Output:
{"id":1,"name":"blah blah"}
{"id":2,"name":"xxx"}
Edit:
If your data is arising from a previous process invocation, such as mongo in your case, you can pipe it directly to jq (to remain portable), like this:
mongo myfile.js |
jq -c '.[]' |
while read -r row
do
  echo "$row"
done
How can I make jq -c '.[]' < (mongo myfile.js) work?
In a bash shell, you would write an expression along the following lines:
while read -r line ; do .... done < <(mongo myfile.js | jq -c '.[]')
Note that there are two occurrences of "<" in the above expression.
Also, the above assumes mongo is emitting valid JSON. If it emits //-style comments, those would somehow have to be removed.
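For instance, a rough sketch that drops full-line // comments before handing the stream to jq (assuming comments only ever occupy whole lines):
mongo myfile.js | grep -v '^ *//' | jq -c '.[]'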
Comparison with piping into while
If you use the idiom:
... | while read -r line ; do .... done
then the bindings of any variables set in .... will be lost, because the while loop runs in a subshell.
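A quick way to see the difference:
count=0
jq -c '.[]' < myfile.js |
while read -r row; do
  count=$((count + 1))  # runs in a subshell; the parent shell never sees this
done
echo "$count"           # prints 0

count=0
while read -r row; do
  count=$((count + 1))  # same shell, so the update survives the loop
done < <(jq -c '.[]' < myfile.js)
echo "$count"           # prints the number of rows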

JQ Group Multiple Files

I have a set of JSON files that all contain JSON in the following format:
File 1:
{ "host" : "127.0.0.1", "port" : "80", "data": {}}
File 2:
{ "host" : "127.0.0.2", "port" : "502", "data": {}}
File 3:
{ "host" : "127.0.0.1", "port" : "443", "data": {}}
These files can be rather large, up to several gigabytes.
I want to use jq or some other bash JSON processing tool that can merge these JSON files into one file with a grouped format like so:
[{ "host" : "127.0.0.1", "data": {"80": {}, "443" : {}}},
{ "host" : "127.0.0.2", "data": {"502": {}}}]
Is this possible with jq, and if yes, how could I do it? I have looked at the group_by function in jq, but it seems like I would need to combine all the files first and then run group_by on that one big file. However, since the files can be very large, it might make more sense to stream the data and group it on the fly.
With really big files, I'd look into a primarily disk-based approach instead of trying to load everything into memory. The following script leverages SQLite's JSON1 extension to load the JSON files into a database and generate the grouped results:
#!/usr/bin/env bash
DB=json.db
# Delete existing database if any.
rm -f "$DB"
# Create table. Assuming each host,port pair is unique.
sqlite3 -batch "$DB" <<'EOF'
CREATE TABLE data(host TEXT, port INTEGER, data TEXT,
                  PRIMARY KEY (host, port)) WITHOUT ROWID;
EOF

# Insert the objects from the files into the database.
for file in file*.json; do
  sqlite3 -batch "$DB" <<EOF
INSERT INTO data(host, port, data)
SELECT json_extract(j, '\$.host'), json_extract(j, '\$.port'), json_extract(j, '\$.data')
FROM (SELECT json(readfile('$file')) AS j) AS json;
EOF
done

# And display the results of joining the objects. Could use
# json_group_array() instead of this sed hackery, but we're trying to
# avoid building a giant string with the entire results. It might still
# run into sqlite maximum string length limits...
sqlite3 -batch -noheader -list "$DB" <<'EOF' | sed '1s/^/[/; $s/,$/]/'
SELECT json_object('host', host,
                   'data', json_group_object(port, json(data))) || ','
FROM data
GROUP BY host
ORDER BY host;
EOF
Running this on your sample data prints out:
[{"host":"127.0.0.1","data":{"80":{},"443":{}}},
{"host":"127.0.0.2","data":{"502":{}}}]
If the goal is really to produce a single ginormous JSON entity, then presumably that entity is still small enough to have a chance of fitting into the memory of some computer, say C. So there is a good chance of jq being up to the job on C. At any rate, to utilize memory efficiently, you would:
use inputs while performing the grouping operation;
avoid the built-in group_by (since it requires an in-memory sort).
Here then is a two-step candidate using jq, which assumes grouping.jq contains the following program:
# emit a stream of arrays, assuming that f is always string-valued
def GROUPS_BY(stream; f):
  reduce stream as $x ({}; ($x|f) as $s | .[$s] += [$x]) | .[];

GROUPS_BY(inputs | .data = .port | del(.port); .host)
| {host: .[0].host, data: map({(.data): {}}) | add}
If the JSON files can be captured by *.json, you could then consider:
jq -n -f grouping.jq *.json | jq -s .
One advantage of this approach is that if it fails, you could try using a temporary file to hold the output of the first step, and then processing it later, either by "slurping" it, or perhaps more sensibly distributing it amongst several files, one per .host.
Removing extraneous data
Obviously, if the input files contain extraneous data, you might want to remove it first, e.g. by running
for f in *.json ; do
  jq '{host,port}' "$f" | sponge "$f"
done
or by performing the projection in grouping.jq, e.g. using:
GROUPS_BY(inputs | {host, data: .port}; .host)
| {host: .[0].host, data: map( {(.data): {}} ) | add}
Here's a script which uses jq to solve the problem without requiring more memory than is needed for the largest group. For simplicity:
- it reads *.json and directs output to $OUT as defined at the top of the script;
- it uses sponge.
#!/usr/bin/env bash
# Requires: sponge
OUT=big.json
/bin/rm -i "$OUT"
if [ -s "$OUT" ] ; then
  echo "$OUT already exists"
  exit 1
fi

### Step 0: setup
TDIR=$(mktemp -d /tmp/grouping.XXXX)
function cleanup {
  if [ -d "$TDIR" ] ; then
    /bin/rm -r "$TDIR"
  fi
}
trap cleanup EXIT

### Step 1: find the groups
for f in *.json ; do
  host=$(jq -r '.host' "$f")
  echo "$f" >> "$TDIR/$host"
done

for f in "$TDIR"/* ; do
  echo "$f" ...
  jq -n 'reduce (inputs | {host, data: {(.port): {} }}) as $in (null;
           .host = $in.host | .data += $in.data)' $(cat "$f") | sponge "$f"
done

### Step 2: assembly
i=0
echo "[" > "$OUT"
find "$TDIR" -type f | while read -r f ; do
  i=$((i + 1))
  if [ $i -gt 1 ] ; then echo , >> "$OUT" ; fi
  cat "$f" >> "$OUT"
done
echo "]" >> "$OUT"
Discussion
Besides requiring enough memory to handle the largest group, the main deficiencies of the above implementation are:
- it assumes that the .host string is suitable as a file name;
- the resultant file is not, strictly speaking, pretty-printed.
These two issues could, however, be addressed quite easily with minor modifications to the script without requiring additional memory, as shown below.
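For instance, to lift the file-name assumption, one minor modification (a sketch, assuming sha1sum is available) is to hash the host before using it as a file name in Step 1:
host=$(jq -r '.host' "$f")
key=$(printf '%s' "$host" | sha1sum | cut -d' ' -f1)  # filename-safe key
echo "$f" >> "$TDIR/$key"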

How to parse JSON in shell script using jq and store it into an array [duplicate]

I'm parsing a JSON response with a tool called jq.
The output from jq will give me a list of full names in my command line.
I have the variable getNames which contains JSON, for example:
{
  "count": 49,
  "user": [{
    "username": "jamesbrown",
    "name": "James Brown",
    "id": 1
  }, {
    "username": "matthewthompson",
    "name": "Matthew Thompson",
    "id": 2
  }]
}
I pass this through jq to filter the JSON using the following command:
echo $getNames | jq -r .user[].name
Which gives me a list like this:
James Brown
Matthew Thompson
I want to put each one of these entries into a bash array, so I enter the following commands:
declare -a myArray
myArray=( `echo $getNames | jq -r .user[].name` )
However, when I try to print the array using:
printf '%s\n' "${myArray[@]}"
I get the following:
James
Brown
Matthew
Thompson
How do I ensure that a new index is created after a new line and not a space? Why are the names being separated?
Thanks.
A simple bash script that feeds each line of the output into the array myArray:
#!/bin/bash
myArray=()
while IFS= read -r line; do
  [[ $line ]] || break  # break if line is empty
  myArray+=("$line")
done < <(jq -r .user[].name <<< "$getNames")

# To print the array
printf '%s\n' "${myArray[@]}"
Just use the mapfile command to read multiple lines into an array, like this:
mapfile -t myArray < <(jq -r .user[].name <<< "$getNames")
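Note that mapfile requires bash 4 or later. If the names could ever contain embedded newlines, a NUL-delimited variant (bash 4.4+; a sketch) is more robust:
mapfile -d '' -t myArray < <(jq -j '.user[].name + "\u0000"' <<< "$getNames")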

jq truncates ENV variable after whitespace

While trying to write a bash script that replaces values in a JSON file, we are running into issues with environment variables that contain whitespace.
Given an original JSON file:
{
  "version": "base",
  "myValue": "to be changed",
  "channelId": 0
}
We want to run a command to update some variables in it, so that after we run:
CHANNEL_ID=1701 MY_VALUE="new value" ./test.sh
The JSON should look like this:
{
  "version": "base",
  "myValue": "new value",
  "channelId": 1701
}
Our script currently looks something like this:
#!/bin/sh
echo $MY_VALUE
echo $CHANNEL_ID
function replaceValue {
if [ -z $2 ]; then echo "Skipping $1"; else jq --argjson newValue \"${2}\" '. | ."'${1}'" = $newValue' build/config.json > tmp.json && mv tmp.json build/config.json; fi
}
replaceValue channelId ${CHANNEL_ID}
replaceValue myValue ${MY_VALUE}
In the above, all values are replaced as strings, and strings are getting truncated at whitespace. We keep alternating between this issue and a version of the code where substitutions just stop working entirely.
This is surely an issue with expansions, but we would love to figure out how we can:
- Replace values in the JSON with both strings and numeric values.
- Use whitespace in the strings we pass to our script.
You don't have to mess with --arg or --argjson to import the environment variables into jq's context. It can very well read the environment on its own. You don't need a separate script; just set the values along with the invocation of jq:
CHANNEL_ID=1701 MY_VALUE="new value" \
jq '{"version": "base", myValue: env.MY_VALUE, channelId: env.CHANNEL_ID}' build/config.json
Note that in the case above, the variables need not be exported globally, just locally to the jq command. This allows you to avoid exporting multiple variables into the shell and polluting the environment, setting only the ones needed for jq to construct the desired JSON.
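One caveat: environment variables are always strings in jq, so env.CHANNEL_ID yields "1701" rather than the number 1701. To keep the field numeric as in your desired output, convert it explicitly:
CHANNEL_ID=1701 MY_VALUE="new value" \
jq '.myValue = env.MY_VALUE | .channelId = (env.CHANNEL_ID | tonumber)' build/config.json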
To write the changes back to the original file, do > tmp.json && mv tmp.json build/config.json, or more cleanly, use the sponge(1) utility from the moreutils package. If present, you can pipe the output of jq as
| sponge build/config.json
Pass variables with --arg. Do:
jq --arg key "$1" --arg value "$2" '.[$key] = $value'
Notes:
#!/bin/sh indicates that this is a POSIX shell script, not bash. Use #!/bin/bash in bash scripts.
function replaceValue { comes from the ksh shell. Prefer replaceValue() { to declare functions; the function keyword is obsolete, deprecated syntax in bash.
Use newlines in your script to make it readable.
--argjson passes a JSON-formatted argument, not a string. Use --arg for that.
\"${2}\" doesn't quote the $2 expansion - it only prefixes and suffixes the string with ". Because the expansion is not quoted, word splitting is performed, which causes your input to be split on whitespace when creating arguments for jq.
Remember to quote variable expansions.
Use http://shellcheck.net to check your scripts.
. | means nothing in jq - it's like echo $(echo $(echo)). You could write jq '. | . | . | . | . | .' any number of times - it just passes the input along unchanged. Just write the thing you want to do.
Do:
#!/bin/bash
echo "$MY_VALUE"
echo "$CHANNEL_ID"

replaceValue() {
  if [ -z "$2" ]; then
    echo "Skipping $1"
  else
    jq --arg key "$1" --arg value "$2" '.[$key] = $value' build/config.json > tmp.json &&
      mv tmp.json build/config.json
  fi
}

replaceValue channelId "${CHANNEL_ID}"
replaceValue myValue "${MY_VALUE}"
Edit: replaced ."\($key)" with the simpler .[$key].
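Note that --arg always binds a string, so channelId will be written as "1701", not 1701. If some values should keep their JSON type, one option (a sketch; note it would also turn a string like "80" into a number) is to try --argjson and fall back to --arg:
replaceValue() {
  if [ -z "$2" ]; then
    echo "Skipping $1"
  elif jq -e . >/dev/null 2>&1 <<< "$2"; then
    # the value parses as JSON (number, object, ...), so keep its type
    # (caveat: jq -e exits nonzero for the values false and null)
    jq --arg key "$1" --argjson value "$2" '.[$key] = $value' build/config.json > tmp.json &&
      mv tmp.json build/config.json
  else
    # a plain string
    jq --arg key "$1" --arg value "$2" '.[$key] = $value' build/config.json > tmp.json &&
      mv tmp.json build/config.json
  fi
}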
jq allows you to build new objects:
MY_VALUE=foo
CHANNEL_ID=4
echo '{
  "version": "base",
  "myValue": "to be changed",
  "channelId": 0
}' | jq ". | {\"version\": .version, \"myValue\": \"$MY_VALUE\", \"channelId\": $CHANNEL_ID}"
The . selects the whole input and pipes it (|) into the construction of a new object (marked by {}). For version it selects .version from the input, but you can set your own values for the other two. We use double quotes to allow Bash variable expansion, which means escaping the double quotes in the JSON.
You'll need to adapt my snippet above to scriptify it.