How to loop over nested JSON arrays, get values, and then modify them based on index in shell using jq - json

This is the JSON I am working with:
{
  "content": {
    "macOS": {
      "releases": [
        {
          "version": "3.21",
          "updateItems": [
            {
              "id": 1,
              "title": "Automatic detection for inactivity is here",
              "message": "We've finally added a feature long requested - Orby now detects when you've been inactive on your computer. You can modify the maximum allowable inactive time through settings, or just turn it off if you don't need it",
              "image": "https://static.image.png"
            },
            {
              "id": 2,
              "title": "In case you missed it... We have an iOS app 📱 🙌",
              "message": "It's far from perfect, but it's come a long way since we first pushed out version 1.0. We don't have that many users on it so far, but I'm hoping that it's useful. Please send any feedback and feature requests my way",
              "image": "https://static.image.png"
            }
          ]
        }
      ]
    },
    "iOS": {
      "releases": [
        {
          "version": "1.31",
          "updateItems": [
            {
              "image": "https://static.image.png",
              "id": 1,
              "link": "orbit://com.orbit:/settings",
              "message": "Strict mode offers a fantastic new way to keep your focus and get more done. To enable it, go to settings and toggle it on. Now when you want to run a timer, put your device face down on a surface. The timer will stop if you pick it up.",
              "title": "Strict mode is here 🙌 ⚠️ ⚠️ ⚠️"
            }
          ]
        }
      ]
    }
  }
}
I want to translate all the title and message values (I use translate-shell).
In my attempt, I have looped through the releases, gotten their indices, then looped through the updateItems and gotten their indices, translated the title and message based on both of these indices, and then attempted to assign the new values in place of the existing title and message.
I've noticed that this results in all the title values being the same, and all the message values being the same.
I'm clearly using jq the wrong way, but am unsure how to correct it.
Please help.
curl ${URL} | jq >englishContent.json
LANGUAGES=(
  # "en"
  # "de"
  # "fr"
  # "es"
  "it"
  # "ja"
  # "ko"
  # "nl"
  # "pt-BR"
  # "ru"
  # "zh-Hans"
)
for language in $LANGUAGES; do
  # Create new json payload from the downloaded english content json
  result=$(jq '{ "language": "'$language'", "response": . }' englishContent.json)
  # Get the total number of release items
  macOS_releases_length=$(jq -r ".response.content.macOS.releases | length" <(echo "$result"))
  # Iterate over releases items and then use index to substitute values into nested arrays
  macOS_releases_length=$(expr "$macOS_releases_length" - 1)
  for macOS_release_index in $(seq 0 $macOS_releases_length); do
    update_items_length=$(jq ".response.content.macOS.releases[$macOS_release_index].updateItems | length" <(echo "$result"))
    update_items_length=$(expr "$update_items_length" - 1)
    for update_item_index in $(seq 0 $update_items_length); do
      title=$(jq ".response.content.macOS.releases[$macOS_release_index].updateItems[$update_item_index].title" <(echo "$result"))
      translated_title=$(trans -brief -no-warn :$language $title | xargs)
      message=$(jq ".response.content.macOS.releases[$macOS_release_index].updateItems[$update_item_index].message" <(echo "$result"))
      translated_message=$(trans -brief -no-warn :$language $message | xargs)
      result=$(jq --arg release_index "$(echo "$macOS_release_index" | jq 'tonumber')" --arg item_index "$("$update_item_index" | jq 'tonumber')" --arg new_title $translated_title '.response.content.macOS.releases['$release_index'].updateItems['$item_index'].title |= $new_title' <(echo "$result"))
      result=$(jq --arg release_index "$(echo "$macOS_release_index" | jq 'tonumber')" --arg item_index "$("$update_item_index" | jq 'tonumber')" --arg new_message $translated_message '.response.content.macOS.releases['$release_index'].updateItems['$item_index'].message |= $new_message' <(echo "$result"))
    done
  done
  echo $result
done
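For reference, the immediate cause of every title coming out the same is that `'$release_index'` and `'$item_index'` in the update filters are expanded by the shell (where those variables are unset, since the loop variables are named differently), so the filter collapses to `.releases[].updateItems[].title |= $new_title`, which writes the same value into every item; the unquoted `$translated_title` after `--arg` also word-splits. A minimal sketch of a corrected assignment step, using a hypothetical stand-in payload, `--argjson` for the numeric indices, and quoted `--arg` values:

```shell
# Hypothetical minimal payload standing in for $result in the loop above
result='{"response":{"content":{"macOS":{"releases":[{"updateItems":[{"title":"old","message":"old"}]}]}}}}'
translated_title="Titolo tradotto"

# --argjson passes the indices through as numbers, so jq (not the shell)
# resolves $release_index/$item_index; quoting "$translated_title" keeps
# multi-word translations intact.
result=$(jq --argjson release_index 0 --argjson item_index 0 \
            --arg new_title "$translated_title" \
            '.response.content.macOS.releases[$release_index].updateItems[$item_index].title = $new_title' \
            <<<"$result")
echo "$result"
```

The same pattern applies to the message update; with the indices handled by jq, each iteration only touches its own item.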

Related

JQ pull value from key searched by related key

I'm bashing my head against a wall - I am trying to pull a value from a key, searched for by a related key.
JSON SNIPPET:
{
  "block": {
    "000545342981": [
      {
        "CODE": "xdxd3456532fok234sss0"
      },
      {
        "STATE": "COMPLETE"
      }
    ],
    "000545341211": [
      {
        "CODE": "xdff3678223frt244swe4"
      },
      {
        "STATE": "RUNNING"
      }
    ]
  }
}
I wish to retrieve the CODE value for each account,
i.e. for
000545342981 I wish to have xdxd3456532fok234sss0 returned, and
xdff3678223frt244swe4 for 000545341211,
then place them in bash as:
accounts+=(
["000545342981"]=xdxd3456532fok234sss0
["000545341211"]=xdff3678223frt244swe4
)
SO FAR
I have returned the accounts with
jq --slurp '.?|.[][]|keys' | jq -r '.[]'
Then I tried jq --arg code "ACCOUNT" 'select ( ."block".key==$code) | ."block".$stack."CODE"' ----> no errors :/ so I suspect it may not have found anything.
This will kind of work: ."block"."000545341211"[]."CODE"
but I know I can't input a var into jq from a for-each, so I need --arg input (unless I'm very wrong).
It's the select ( ."block".key==$code) I think?? What do I need :(
If each account contains only one code, you can do:
#!/usr/bin/env bash
declare -A accounts
while read account code; do
  accounts[$account]=$code
done < <(jq -r '.block |
  to_entries[] |
  "\(.key) \(.value[0].CODE)"' input.json)
declare -p accounts
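If you instead want jq to emit the `[key]=value` entries from the question's `accounts+=( … )` block directly, `@sh` can shell-quote each value safely - a sketch, recreating the same input.json layout:

```shell
# Recreate the question's input for illustration
cat > input.json <<'EOF'
{"block":{"000545342981":[{"CODE":"xdxd3456532fok234sss0"},{"STATE":"COMPLETE"}],"000545341211":[{"CODE":"xdff3678223frt244swe4"},{"STATE":"RUNNING"}]}}
EOF

# @sh single-quotes each interpolated value, so the output is safe to paste
# (or eval) inside an associative-array initializer
jq -r '.block | to_entries[] | "[\(.key|@sh)]=\(.value[0].CODE|@sh)"' input.json
```

This prints lines like `['000545342981']='xdxd3456532fok234sss0'`, one per account.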

Bash JSON compare two list and delete id

I have a JSON endpoint whose value I can fetch with curl, and a local YAML file. I want to compute the difference and then delete, by id, the entry whose name is present only on the JSON endpoint.
JSON's endpoint
[
  {
    "hosts": [
      "server1"
    ],
    "id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_toto_a"
  },
  {
    "hosts": [
      "server2"
    ],
    "id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_tata_b"
  },
  {
    "hosts": [
      "server3"
    ],
    "id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_titi_c"
  }
]
files.yml
---
instance:
  toto:
    name: "toto"
  tata:
    name: "tata"
Between the JSON endpoint and the local file, I want to delete the entry with the id of titi, because it is the difference between the sources.
declare -a arr=(_a _b _c)
ar=$(cat files.yml | grep name | cut -d '"' -f2 | tr "\n" " ")
fileItemArray=($ar)
ARR_PRE=("${fileItemArray[@]/#/V1_}")
for i in "${arr[@]}"; do local_var+=("${ARR_PRE[@]/%/$i}"); done
remote_var=$(curl -sX GET "XXXX" | jq -r '.[].name | @sh' | tr -d \'\")
diff_=$(echo ${local_var[@]} ${remote_var[@]} | tr ' ' '\n' | sort | uniq -u)
The output is titi.
The code works, but I want to delete titi by its id dynamically:
curl -X DELETE "XXXX" $id_titi
I am trying to do the deletion with a bash script, but I have no idea how to continue...
Your endpoint is not proper JSON, as it has
commas after the .name field but no following field
no commas between the elements of the top-level array
If this is not just a typo from pasting your example into this question, then you'd need to address this first before proceeding. This is how it should look:
[
  {
    "hosts": [
      "server1"
    ],
    "id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "toto"
  },
  {
    "hosts": [
      "server2"
    ],
    "id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "tata"
  },
  {
    "hosts": [
      "server3"
    ],
    "id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "titi"
  }
]
If your endpoint is proper JSON, try the following. It extracts the names from your .yml file (just as you do - there are plenty of more efficient and less error-prone ways, but I'm trying to adapt your approach as much as possible), but instead of a Bash array it generates a JSON array using jq, which for Bash is a simple string. For your curl output it's basically the same thing: extracting a (JSON) array of names into a Bash string. Note that in both cases I use quotes, <var>="$(…)", to capture strings that may include spaces (although I also use the -c option for jq to compact its output to a single line). For the difference between the two, everything is taken over by jq, as it can easily be fed the JSON arrays as variables, perform the subtraction, and output in your preferred format:
fromyml="$(cat files.yml | grep name | cut -d '"' -f2 | jq -Rnc '[inputs]')"
fromcurl="$(curl -sX GET "XXXX" | jq -c 'map(.name)')"
diff="$(jq -nr --argjson fromyml "$fromyml" --argjson fromcurl "$fromcurl" '
$fromcurl - $fromyml | .[]
')"
The Bash variable diff now contains a list of names only present in the curl output ($fromcurl - $fromyml), one per line (if, unlike in your example, there happens to be more than one). If the curl output had duplicates, they will still be included (use $fromcurl - $fromyml | unique | .[] to get rid of them):
titi
As you can see, this solution has three calls to jq. I'll leave it to you to further reduce that number as it fits your general workflow (basically, it can be put together into one).
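From there, turning each leftover name back into an id for the DELETE call is one more jq lookup. A sketch, using hypothetical stand-in data for the curl output (swap the echo for the real curl -X DELETE once the URL is known):

```shell
# Hypothetical stand-in for the endpoint's JSON, as saved from curl
cat > endpoint.json <<'EOF'
[{"hosts":["server3"],"id":"a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519","name":"titi"}]
EOF
diff="titi"

# For each leftover name, look up its id, then issue the DELETE
while IFS= read -r name; do
  id=$(jq -r --arg n "$name" '.[] | select(.name == $n) | .id' endpoint.json)
  echo "would run: curl -X DELETE \"XXXX/$id\""
done <<<"$diff"
```

Passing the name with --arg keeps jq (rather than the shell) responsible for matching it, so names with spaces or quotes are handled correctly.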
Getting the output of a program into a variable can be done using read.
perl -M5.010 -MYAML -MJSON::PP -e'
  sub get_next_file { local $/; "".<> }
  my %filter = map { $_->{name} => 1 } values %{ Load(get_next_file)->{instance} };
  say for grep !$filter{$_}, map $_->{name}, @{ decode_json(get_next_file) };
' b.yaml a.json |
while IFS= read -r id; do
  curl -X DELETE ..."$id"...
done
I used Perl here because otherwise you had no way to parse a YAML file. The snippet requires the YAML Perl module to be installed.

How to substitute '\n' for newline with jq [duplicate]

I have some logs that output information in JSON. This is for collection to elasticsearch.
Some testers and operations people want to be able to read logs on the servers.
Here is some example JSON:
{
  "#timestamp": "2015-09-22T10:54:35.449+02:00",
  "#version": 1,
  "HOSTNAME": "server1.example",
  "level": "WARN",
  "level_value": 30000,
  "logger_name": "server1.example.adapter",
  "message": "message"
  "stack_trace": "ERROR LALALLA\nERROR INFO NANANAN\nSOME MORE ERROR INFO\nBABABABABABBA BABABABA ABABBABAA BABABABAB\n"
}
And so on.
Is it possible to make Jq print newline instead of the \n character sequence as seen in the value of .stack_trace?
Sure! Using the -r option, jq will print string contents directly to the terminal instead of as JSON escaped strings.
jq -r '.stack_trace'
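To see the difference side by side on a stand-in object (the \n inside the single-quoted shell string below is a JSON escape, not a shell one):

```shell
json='{"stack_trace":"ERROR LALALLA\nERROR INFO NANANAN\n"}'
jq '.stack_trace' <<<"$json"     # default: one JSON-escaped string with literal \n
jq -r '.stack_trace' <<<"$json"  # -r: raw string contents, real newlines
```

With -r, the stack trace prints as separate lines, which is what the testers and operations people want.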
Unless you're constrained to use jq only, you can "fix" (or actually "un-json-ify") jq output with sed:
cat the-input | jq . | sed 's/\\n/\n/g'
If you happen to have tabs in the input as well (\t in JSON), then:
cat the-input | jq . | sed 's/\\n/\n/g; s/\\t/\t/g'
This would be especially handy if your stack_trace was generated by Java (you didn't tell what is the source of the logs), as the Java stacktrace lines begin with <tab>at<space>.
Warning: naturally, this is not correct, in the sense that a JSON string containing a literal backslash followed by n (encoded as \\n) will come out as a backslash plus a spurious real newline, when it should come out as \n. While not correct, it's certainly sufficient for peeking at the data by humans. The sed patterns can be further improved to take care of this (at the cost of readability).
The input as originally given isn't quite valid JSON, and it's not clear precisely what the desired output is, but the following might be of interest. It is written for the current version of jq (version 1.5) but could easily be adapted for jq 1.4:
def json2qjson:
  def pp: if type == "string" then "\"\(.)\"" else . end;
  . as $in
  | foreach keys[] as $k (null; null; "\"\($k)\": \($in[$k] | pp)" ) ;

def data: {
  "#timestamp": "2015-09-22T10:54:35.449+02:00",
  "#version": 1,
  "HOSTNAME": "server1.example",
  "level": "WARN",
  "level_value": 30000,
  "logger_name": "server1.example.adapter",
  "message": "message",
  "stack_trace": "ERROR LALALLA\nERROR INFO NANANAN\nSOME MORE ERROR INFO\nBABABABABABBA BABABABA ABABBABAA BABABABAB\n"
};

data | json2qjson
Output:
$ jq -rnf json2qjson.jq
"#timestamp": "2015-09-22T10:54:35.449+02:00"
"#version": 1
"HOSTNAME": "server1.example"
"level": "WARN"
"level_value": 30000
"logger_name": "server1.example.adapter"
"message": "message"
"stack_trace": "ERROR LALALLA
ERROR INFO NANANAN
SOME MORE ERROR INFO
BABABABABABBA BABABABA ABABBABAA BABABABAB
"

Create JSON using jq from pipe-separated keys and values in bash

I am trying to create a JSON object from a string in bash. The string is as follows.
CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0
The output is from docker stats command and my end goal is to publish custom metrics to aws cloudwatch. I would like to format this string as json.
{
  "CONTAINER":"nginx_container",
  "CPU%":"0.02%",
  ....
}
I have used the jq command before and it seems like it should work well in this case, but I have not been able to come up with a good solution yet, other than hardcoding variable names, indexing using sed or awk, and then creating the JSON from scratch. Any suggestions would be appreciated. Thanks.
Prerequisite
For all of the below, it's assumed that your content is in a shell variable named s:
s='CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0'
What (modern jq)
# thanks to @JeffMercado and @chepner for refinements, see comments
jq -Rn '
( input | split("|") ) as $keys |
( inputs | split("|") ) as $vals |
[[$keys, $vals] | transpose[] | {key:.[0],value:.[1]}] | from_entries
' <<<"$s"
How (modern jq)
This requires very new (probably 1.5?) jq to work, and is a dense chunk of code. To break it down:
Using -n prevents jq from reading stdin on its own, leaving the entirety of the input stream available to be read by input and inputs -- the former to read a single line, and the latter to read all remaining lines. (-R, for raw input, causes textual lines rather than JSON objects to be read).
With [$keys, $vals] | transpose[], we're generating [key, value] pairs (in Python terms, zipping the two lists).
With {key:.[0],value:.[1]}, we're making each [key, value] pair into an object of the form {"key": key, "value": value}
With from_entries, we're combining those pairs into objects containing those keys and values.
What (shell-assisted)
This will work with a significantly older jq than the above, and is an easily adopted approach for scenarios where a native-jq solution can be harder to wrangle:
{
  IFS='|' read -r -a keys # read first line into an array of strings
  ## read each subsequent line into an array named "values"
  while IFS='|' read -r -a values; do
    # setup: positional arguments to pass in literal variables, query with code
    jq_args=( )
    jq_query='.'
    # copy values into the arguments, reference them from the generated code
    for idx in "${!values[@]}"; do
      [[ ${keys[$idx]} ]] || continue # skip values with no corresponding key
      jq_args+=( --arg "key$idx" "${keys[$idx]}" )
      jq_args+=( --arg "value$idx" "${values[$idx]}" )
      jq_query+=" | .[\$key${idx}]=\$value${idx}"
    done
    # run the generated command
    jq "${jq_args[@]}" "$jq_query" <<<'{}'
  done
} <<<"$s"
How (shell-assisted)
The invoked jq command from the above is similar to:
jq --arg key0 'CONTAINER' \
   --arg value0 'nginx_container' \
   --arg key1 'CPU%' \
   --arg value1 '0.02%' \
   --arg key2 'MEMUSAGE/LIMIT' \
   --arg value2 '25.09MiB/15.26GiB' \
   '. | .[$key0]=$value0 | .[$key1]=$value1 | .[$key2]=$value2' \
   <<<'{}'
...passing each key and value out-of-band (such that it's treated as a literal string rather than parsed as JSON), then referring to them individually.
Result
Either of the above will emit:
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Why
In short: Because it's guaranteed to generate valid JSON as output.
Consider the following as an example that would break more naive approaches:
s='key ending in a backslash\
value "with quotes"'
Sure, these are unexpected scenarios, but jq knows how to deal with them:
{
  "key ending in a backslash\\": "value \"with quotes\""
}
...whereas an implementation that didn't understand JSON strings could easily end up emitting:
{
  "key ending in a backslash\": "value "with quotes""
}
I know this is an old post, but the tool you seek is called jo: https://github.com/jpmens/jo
A quick and easy example:
$ jo my_variable="simple"
{"my_variable":"simple"}
A little more complex
$ jo -p name=jo n=17 parser=false
{
  "name": "jo",
  "n": 17,
  "parser": false
}
Add an array
$ jo -p name=jo n=17 parser=false my_array=$(jo -a {1..5})
{
  "name": "jo",
  "n": 17,
  "parser": false,
  "my_array": [
    1,
    2,
    3,
    4,
    5
  ]
}
I've made some pretty complex stuff with jo, and the nice thing is that you don't have to worry about rolling your own solution and risking the possibility of producing invalid JSON.
You can ask docker to give you JSON data in the first place:
docker stats --format "{{json .}}"
For more on this, see: https://docs.docker.com/config/formatting/
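A sketch of what that enables (the field names .Container, .CPUPerc, and .MemUsage come from docker's stats template and may vary by version; the stand-in line below replaces live docker output so the example runs without a daemon):

```shell
# One line of `docker stats --no-stream --format '{{json .}}'`, captured
# here as a stand-in so the example runs without a docker daemon
line='{"Container":"nginx_container","CPUPerc":"0.02%","MemUsage":"25.09MiB / 15.26GiB","PIDs":"0"}'

# -s slurps the per-line objects into one array; reshape however you need,
# e.g. keyed by container name for a CloudWatch payload
jq -s 'map({(.Container): {cpu: .CPUPerc, mem: .MemUsage}}) | add' <<<"$line"
```

Since docker already emits valid JSON here, no hand-rolled quoting is needed before jq takes over.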
JSONSTR=""
declare -a JSONNAMES=()
declare -A JSONARRAY=()
LOOPNUM=0
cat ~/newfile | while IFS='|' read CONTAINER CPU MEMUSE MEMPC NETIO BLKIO PIDS; do
  if [[ "$LOOPNUM" = 0 ]]; then
    JSONNAMES=("$CONTAINER" "$CPU" "$MEMUSE" "$MEMPC" "$NETIO" "$BLKIO" "$PIDS")
    LOOPNUM=$(( LOOPNUM+1 ))
  else
    echo "{ \"${JSONNAMES[0]}\": \"${CONTAINER}\", \"${JSONNAMES[1]}\": \"${CPU}\", \"${JSONNAMES[2]}\": \"${MEMUSE}\", \"${JSONNAMES[3]}\": \"${MEMPC}\", \"${JSONNAMES[4]}\": \"${NETIO}\", \"${JSONNAMES[5]}\": \"${BLKIO}\", \"${JSONNAMES[6]}\": \"${PIDS}\" }"
  fi
done
Returns:
{ "CONTAINER": "nginx_container", "CPU%": "0.02%", "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB", "MEM%": "0.16%", "NETI/O": "0B/0B", "BLOCKI/O": "22.09MB/4.096kB", "PIDS": "0" }
Here is a solution which uses the -R and -s options along with transpose:
split("\n") # [ "CONTAINER...", "nginx_container|0.02%...", ...]
| (.[0] | split("|")) as $keys # [ "CONTAINER", "CPU%", "MEMUSAGE/LIMIT", ... ]
| (.[1:][] | split("|")) # [ "nginx_container", "0.02%", ... ] [ ... ] ...
| select(length > 0) # (remove empty [] caused by trailing newline)
| [$keys, .] # [ ["CONTAINER", ...], ["nginx_container", ...] ] ...
| [ transpose[] | {(.[0]):.[1]} ] # [ {"CONTAINER": "nginx_container"}, ... ] ...
| add # {"CONTAINER": "nginx_container", "CPU%": "0.02%" ...
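For completeness, the filter can be run end to end with -R -s like so (a sketch; the empty-line select is applied to the raw line here, before the split, which handles the trailing newline the same way):

```shell
s='CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0'

# -R reads raw text, -s slurps it into one string for split("\n")
result=$(jq -Rs '
  split("\n")
  | (.[0] | split("|")) as $keys
  | (.[1:][] | select(length > 0) | split("|"))   # drop the empty trailing line
  | [$keys, .]
  | [ transpose[] | {(.[0]): .[1]} ]
  | add
' <<<"$s")
echo "$result"
```

Each data line is zipped against the header via transpose and folded into one object with add.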
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "nginx_container" "0.02%" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
This doesn't use jq, but it is possible to use arguments and environment variables in the values. Note that a literal % in a printf format string must be doubled as %%:
CONTAINER=nginx_container
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "$CONTAINER" "$1" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
If you're starting with tabular data, I think it makes more sense to use something that works with tabular data natively, like sqawk, to turn it into JSON, and then use jq to work with it further.
echo 'CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0' \
| sqawk -FS '[|]' -RS '\n' -output json 'select * from a' header=1 \
| jq '.[] | with_entries(select(.key|test("^a.*")|not))'
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Without jq, sqawk gives a bit too much:
[
  {
    "anr": "1",
    "anf": "7",
    "a0": "nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0",
    "CONTAINER": "nginx_container",
    "CPU%": "0.02%",
    "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
    "MEM%": "0.16%",
    "NETI/O": "0B/0B",
    "BLOCKI/O": "22.09MB/4.096kB",
    "PIDS": "0",
    "a8": "",
    "a9": "",
    "a10": ""
  }
]
