I am trying to output the value of .metadata.name followed by the matching student's name from the .spec.template.spec.containers[].students[] array, using the regex test() function in jq.
I am having trouble retrieving the individual array value, since there is no key specified for the students[] array.
For example, if I check whether the students[] array contains the word "Jeff", I would like the output to display as below:
student-deployment: Jefferson
What I have tried:
I've tried the command below, which somewhat works, but I am not sure how to get only the "Jefferson" value: it prints out all of the students[] array values, which is not what I want. I am using PowerShell to run the command below.
kubectl get deployments -o json | jq -r '.items[] | select(.spec.template.spec.containers[].students[]?|test("\"^Jeff.\"")) | .metadata.name, "\":\t\"", .spec.template.spec.containers[].students'
Is there a way to print a specific value of an array given a condition in jq if there is no key specified? Also, would the solution work if there are multiple deployments?
The deployment template below is in JSON; I have shortened it to only the relevant parts.
{
  "apiVersion": "v1",
  "items": [
    {
      "apiVersion": "apps/v1",
      "kind": "Deployment",
      "metadata": {
        "name": "student-deployment",
        "namespace": "default"
      },
      "spec": {
        "template": {
          "spec": {
            "containers": [
              {
                "students": [
                  "Alice",
                  "Bob",
                  "Peter",
                  "Sally",
                  "Jefferson"
                ]
              }
            ]
          }
        }
      }
    }
  ]
}
For this approach, we introduce a variable $pattern. You may set it with --arg pattern to your regex, e.g. "Jeff" or "^Al" or "e$", to have the student list filtered by test, or leave it empty to see all students.
Now, we iterate over all .items[] elements (i.e. over "all deployments"). For each deployment found, we output the content of .metadata.name followed by a literal colon and a space. Then we iterate over all .spec.template.spec.containers[].students[], perform the pattern test, and concatenate the outcome.
To print out raw strings instead of JSON, we use the -r option when calling jq.
kubectl get deployments -o json \
  | jq --arg pattern "Jeff" -r '
      .items[]
      | .metadata.name + ": " + (
          .spec.template.spec.containers[].students[]
          | select(test($pattern))
        )
    '
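Run against the sample document above, this should print one line per matching student in each deployment, i.e.:

student-deployment: Jefferson

Because we iterate over .items[], the same command also covers multiple deployments.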
To retrieve the "students" array(s) in the input, you could use this filter:
.items[]
| paths(objects) as $p
| getpath($p)
| select( objects | has("students") )
| .students
You can then add additional filters to select the particular student(s) of interest, e.g.
| .[]
| select(test("Jeff"))
And then add any postprocessing filters, e.g.
| "student-deployment: \(.)"
Of course you can obtain the students array in numerous other ways.
I need to grab variables from JSON properties.
The JSON array looks like this (GitHub API for repository tags), which I obtain from a curl request.
[
  {
    "name": "my-tag-name",
    "zipball_url": "https://api.github.com/repos/path-to-my-tag-name",
    "tarball_url": "https://api.github.com/repos/path-to-my-tag-name-tarball",
    "commit": {
      "sha": "commit-sha",
      "url": "https://api.github.com/repos/path-to-my-commit-sha"
    },
    "node_id": "node-id"
  },
  {
    "name": "another-tag-name",
    "zipball_url": "https://api.github.com/repos/path-to-my-tag-name",
    "tarball_url": "https://api.github.com/repos/path-to-my-tag-name-tarball",
    "commit": {
      "sha": "commit-sha",
      "url": "https://api.github.com/repos/path-to-my-commit-sha"
    },
    "node_id": "node-id"
  }
]
In my actual JSON there are hundreds of objects like these.
While looping over each one of these, I need to grab the name and the commit URL, then perform more operations with these two variables before I get to the next object and repeat.
I tried (with and without -r)
tags=$(curl -s -u "${GITHUB_USERNAME}:${GITHUB_TOKEN}" -H "Accept: application/vnd.github.v3+json" "https://api.github.com/repos/path-to-my-repository/tags?per_page=100&page=${page}")
for row in $(jq -r '.[]' <<< "$tags"); do
tag=$(jq -r '.name' <<< "$row")
# I have also tried with the syntax:
url=$(echo "${row}" | jq -r '.commit.url')
# do stuff with $tag and $url...
done
But I get errors like:
parse error: Unfinished JSON term at EOF at line 2, column 0
jq: error (at <stdin>:1): Cannot index string with string "name"
}
parse error: Unmatched '}' at line 1, column 1
And from the terminal output it appears that it is trying to parse $row in a strange way, trying to grab .name from every substring? Not sure.
I am assuming the output from $(jq '.[]' <<< "$tags") could be valid JSON, from which I could again use jq to grab the object properties I need, but maybe that is not the case? If I output ${row} it does look like valid JSON to me, and I tried pasting the results in a JSON validator, everything seems to check out...
How do I grab the ".name" and ".commit.url" for each of these objects before I move on to the next one?
Thanks
It would be better to avoid calling jq more than once. Consider, for example:
while read -r name ; do
  read -r url
  echo "$name" "$url"
done < <( curl .... | jq -r '.[] | .name, .commit.url' )
where curl .... signifies the relevant invocation of curl.
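With the sample array from the question, that jq filter emits the name and the commit URL on alternating lines, which is why the loop body does two reads per tag:

my-tag-name
https://api.github.com/repos/path-to-my-commit-sha
another-tag-name
https://api.github.com/repos/path-to-my-commit-sha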
I am trying to create JSON from scratch using bash.
The final structure needs to be like:
{
  "hosts": {
    "a_hostname": {
      "ips": [
        1,
        2,
        3
      ]
    },
    "another_hostname": {...}
  }
}
First I'm creating an input file with the format:
hostname ["1.1.1.1","2.2.2.2"]
host-name2 ["3.3.3.3","4.4.4.4"]
This is being created by:
for host in $( ansible -i hosts all --list-hosts ) ; \
do echo -n "${host} " ; \
ansible -i hosts $host -m setup | sed '1c {' | jq -r -c '.ansible_facts.ansible_all_ipv4_addresses' ; \
done > hosts.txt
The key point here is that the IP list/array is coming from a JSON file and being extracted by jq. This extraction outputs an already valid, quoted JSON array, but as a string in a text file.
Next I'm using jq to parse the whole text file into the desired JSON:
jq -Rn '
  { "hosts": [inputs |
      split("\\s+"; "g") |
      select(length > 0 and .[0] != "") |
      {(.[0]):
        {ips: .[1]}
      }
    ] | add }
' < ~/hosts.txt
This is almost correct; everything works except for the IPs value, which is treated as a string and quoted again, leading to:
{
  "hosts": {
    "hostname1": {
      "ips": "[\"1.1.1.1\",\"2.2.2.2\"]"
    },
    "host-name2": {
      "ips": "[\"3.3.3.3\",\"4.4.4.4\"]"
    }
  }
}
I'm now stuck at this final hurdle - how to insert the IPs without causing them to be quoted again.
Edit - the quoting is solved by using {ips: .[1] | fromjson} instead of {ips: .[1]}.
However, this was made unnecessary by @CharlesDuffy's suggestion to convert the input to TSV instead.
Original Q body:
So far I've got to
jq -n {hosts:{}} | \
for host in $( ansible -i hosts all --list-hosts ) ; \
do jq ".hosts += {$host:{}}" | \
jq ".hosts.$host += {ips:[1,2,3]}" ; \
done ;
([1,2,3] is actually coming from a subshell but including it seemed unnecessary as that part works, and made it harder to read)
This sort of works, but there seem to be two problems.
1) The final output only has a single host in it, containing data from the first host in the list (this persists even if the second problem is bypassed):
{
  "hosts": {
    "host_1": {
      "ips": [
        1,
        2,
        3
      ]
    }
  }
}
2) One of the hostnames has a - in it, which causes syntax and compile errors from jq. I'm stuck in quoting hell trying to get the hostname interpolated but also quoted. Help!
Thanks for any input.
Let's say your input format is:
host_1 1 2 3
host_2 2 3 4
host-with-dashes 3 4 5
host-with-no-addresses
...re: the edit specifying a different format: add @tsv to the jq command producing the existing format to generate this one instead.
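Concretely, that would mean the fact-gathering loop from the question emits whitespace-separated addresses instead of a JSON-array string; a sketch, reusing the asker's ansible invocation:

for host in $( ansible -i hosts all --list-hosts ) ; do
  echo -n "${host} "
  ansible -i hosts "$host" -m setup \
    | sed '1c {' \
    | jq -r '.ansible_facts.ansible_all_ipv4_addresses | @tsv'
done > hosts.txt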
If you want to transform that to the format in question, it might look like:
jq -Rn '
  { "hosts": [inputs |
      split("\\s+"; "g") |
      select(length > 0 and .[0] != "") |
      {(.[0]): .[1:]}
    ] | add
  }' <input.txt
Which yields as output:
{
  "hosts": {
    "host_1": [
      "1",
      "2",
      "3"
    ],
    "host_2": [
      "2",
      "3",
      "4"
    ],
    "host-with-dashes": [
      "3",
      "4",
      "5"
    ],
    "host-with-no-addresses": []
  }
}
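If you need the nested {"ips": [...]} shape from the original question rather than bare arrays, only the object constructor changes, e.g.:

jq -Rn '
  { "hosts": [inputs |
      split("\\s+"; "g") |
      select(length > 0 and .[0] != "") |
      {(.[0]): {ips: .[1:]}}
    ] | add
  }' <input.txt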
I need to pull a substring from JSON. In the JSON document below, I need the end of the value returned by jq '.[].networkProfile.networkInterfaces[].id'. In other words, I need just A10NICvw4konls2vfbw-data to pass to another command. I can't seem to figure out how to pull a substring using grep. I've seen regex examples out there but haven't been successful with them.
[
  {
    "id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Compute/virtualMachines/A10VNAvw4konls2vfbw",
    "instanceView": null,
    "licenseType": null,
    "location": "centralus",
    "name": "A10VNAvw4konls2vfbw",
    "networkProfile": {
      "networkInterfaces": [
        {
          "id": "/subscriptions/blah/resourceGroups/IPv6v2/providers/Microsoft.Network/networkInterfaces/A10NICvw4konls2vfbw-data",
          "resourceGroup": "IPv6v2"
        }
      ]
    }
  }
]
In your case, sub(".*/";"") will do the trick as * is greedy:
.[].networkProfile.networkInterfaces[].id | sub(".*/";"")
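As a complete command against the sample document (assuming it is saved as vm.json, a name used here only for illustration):

jq -r '.[].networkProfile.networkInterfaces[].id | sub(".*/";"")' vm.json

which prints:

A10NICvw4konls2vfbw-data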
Try this:
jq -r '.[]|.networkProfile.networkInterfaces[].id | split("/") | last'
The -r tells jq to print the output in "raw" form - in this case, that means no double quotes around the string value.
As for the jq expression: after you access the id you want, piping it (still inside jq) through split("/") turns it into an array of the parts between the slashes. Piping that through the last function (thanks, @Thor) returns just the last element of the array.
If you want to do it with grep here is one way:
jq -r '.[].networkProfile.networkInterfaces[].id' | grep -o '[^/]*$'
Output:
A10NICvw4konls2vfbw-data
I am trying to create a JSON object from a string in bash. The string is as follows.
CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0
The output is from the docker stats command, and my end goal is to publish custom metrics to AWS CloudWatch. I would like to format this string as JSON:
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  ....
}
I have used the jq command before and it seems like it should work well in this case, but I have not been able to come up with a good solution yet, other than hardcoding variable names, indexing with sed or awk, and then creating the JSON from scratch. Any suggestions would be appreciated. Thanks.
Prerequisite
For all of the below, it's assumed that your content is in a shell variable named s:
s='CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0'
What (modern jq)
# thanks to @JeffMercado and @chepner for refinements, see comments
jq -Rn '
( input | split("|") ) as $keys |
( inputs | split("|") ) as $vals |
[[$keys, $vals] | transpose[] | {key:.[0],value:.[1]}] | from_entries
' <<<"$s"
How (modern jq)
This requires very new (probably 1.5?) jq to work, and is a dense chunk of code. To break it down:
Using -n prevents jq from reading stdin on its own, leaving the entirety of the input stream available to be read by input and inputs -- the former to read a single line, and the latter to read all remaining lines. (-R, for raw input, causes textual lines rather than JSON objects to be read).
With [$keys, $vals] | transpose[], we're generating [key, value] pairs (in Python terms, zipping the two lists).
With {key:.[0],value:.[1]}, we're making each [key, value] pair into an object of the form {"key": key, "value": value}
With from_entries, we're combining those pairs into objects containing those keys and values.
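To see those last two steps in isolation, here is a small standalone illustration with literal values in place of $keys and $vals:

jq -n '[[["CONTAINER","CPU%"],["nginx_container","0.02%"]] | transpose[] | {key:.[0],value:.[1]}] | from_entries'
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%"
}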
What (shell-assisted)
This will work with a significantly older jq than the above, and is an easily adopted approach for scenarios where a native-jq solution can be harder to wrangle:
{
  IFS='|' read -r -a keys # read first line into an array of strings
  ## read each subsequent line into an array named "values"
  while IFS='|' read -r -a values; do
    # setup: positional arguments to pass in literal variables, query with code
    jq_args=( )
    jq_query='.'
    # copy values into the arguments, reference them from the generated code
    for idx in "${!values[@]}"; do
      [[ ${keys[$idx]} ]] || continue # skip values with no corresponding key
      jq_args+=( --arg "key$idx" "${keys[$idx]}" )
      jq_args+=( --arg "value$idx" "${values[$idx]}" )
      jq_query+=" | .[\$key${idx}]=\$value${idx}"
    done
    # run the generated command
    jq "${jq_args[@]}" "$jq_query" <<<'{}'
  done
} <<<"$s"
How (shell-assisted)
The invoked jq command from the above is similar to:
jq --arg key0 'CONTAINER' \
--arg value0 'nginx_container' \
--arg key1 'CPU%' \
--arg value1 '0.02%' \
--arg key2 'MEMUSAGE/LIMIT' \
--arg value2 '25.09MiB/15.26GiB' \
'. | .[$key0]=$value0 | .[$key1]=$value1 | .[$key2]=$value2' \
<<<'{}'
...passing each key and value out-of-band (such that it's treated as a literal string rather than parsed as JSON), then referring to them individually.
Result
Either of the above will emit:
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Why
In short: Because it's guaranteed to generate valid JSON as output.
Consider the following as an example that would break more naive approaches:
s='key ending in a backslash\
value "with quotes"'
Sure, these are unexpected scenarios, but jq knows how to deal with them:
{
  "key ending in a backslash\\": "value \"with quotes\""
}
...whereas an implementation that didn't understand JSON strings could easily end up emitting:
{
  "key ending in a backslash\": "value "with quotes""
}
I know this is an old post, but the tool you seek is called jo: https://github.com/jpmens/jo
A quick and easy example:
$ jo my_variable="simple"
{"my_variable":"simple"}
A little more complex
$ jo -p name=jo n=17 parser=false
{
  "name": "jo",
  "n": 17,
  "parser": false
}
Add an array
$ jo -p name=jo n=17 parser=false my_array=$(jo -a {1..5})
{
  "name": "jo",
  "n": 17,
  "parser": false,
  "my_array": [
    1,
    2,
    3,
    4,
    5
  ]
}
I've made some pretty complex stuff with jo, and the nice thing is that you don't have to worry about rolling your own solution and possibly producing invalid JSON.
You can ask docker to give you JSON data in the first place
docker stats --format "{{json .}}"
For more on this, see: https://docs.docker.com/config/formatting/
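That output is one JSON object per line (one object per container), so it can be piped straight into jq; a minimal sketch (--no-stream just takes a single sample instead of streaming updates):

docker stats --no-stream --format "{{json .}}" | jq .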
JSONSTR=""
declare -a JSONNAMES=()
declare -A JSONARRAY=()
LOOPNUM=0
cat ~/newfile | while IFS=: read CONTAINER CPU MEMUSE MEMPC NETIO BLKIO PIDS; do
if [[ "$LOOPNUM" = 0 ]]; then
JSONNAMES=("$CONTAINER" "$CPU" "$MEMUSE" "$MEMPC" "$NETIO" "$BLKIO" "$PIDS")
LOOPNUM=$(( LOOPNUM+1 ))
else
echo "{ \"${JSONNAMES[0]}\": \"${CONTAINER}\", \"${JSONNAMES[1]}\": \"${CPU}\", \"${JSONNAMES[2]}\": \"${MEMUSE}\", \"${JSONNAMES[3]}\": \"${MEMPC}\", \"${JSONNAMES[4]}\": \"${NETIO}\", \"${JSONNAMES[5]}\": \"${BLKIO}\", \"${JSONNAMES[6]}\": \"${PIDS}\" }"
fi
done
Returns:
{ "CONTAINER": "nginx_container", "CPU%": "0.02%", "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB", "MEM%": "0.16%", "NETI/O": "0B/0B", "BLOCKI/O": "22.09MB/4.096kB", "PIDS": "0" }
Here is a solution which uses the -R and -s options along with transpose:
split("\n") # [ "CONTAINER...", "nginx_container|0.02%...", ...]
| (.[0] | split("|")) as $keys # [ "CONTAINER", "CPU%", "MEMUSAGE/LIMIT", ... ]
| (.[1:][] | split("|")) # [ "nginx_container", "0.02%", ... ] [ ... ] ...
| select(length > 0) # (remove empty [] caused by trailing newline)
| [$keys, .] # [ ["CONTAINER", ...], ["nginx_container", ...] ] ...
| [ transpose[] | {(.[0]):.[1]} ] # [ {"CONTAINER": "nginx_container"}, ... ] ...
| add # {"CONTAINER": "nginx_container", "CPU%": "0.02%" ...
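Assuming that filter is saved to a file (to_object.jq here, a name chosen only for illustration) and the input is still in $s, the full invocation would be along the lines of:

jq -R -s -f to_object.jq <<<"$s"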
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "nginx_container" "0.02%" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
This doesn't use jq, but it is possible to use arguments and environment variables in the values.
CONTAINER=nginx_container
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "$CONTAINER" "$1" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
If you're starting with tabular data, I think it makes more sense to use something that works with tabular data natively, like sqawk, to turn it into JSON, and then use jq to work with it further.
echo 'CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0' \
| sqawk -FS '[|]' -RS '\n' -output json 'select * from a' header=1 \
| jq '.[] | with_entries(select(.key|test("^a.*")|not))'
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Without jq, sqawk gives a bit too much:
[
  {
    "anr": "1",
    "anf": "7",
    "a0": "nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0",
    "CONTAINER": "nginx_container",
    "CPU%": "0.02%",
    "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
    "MEM%": "0.16%",
    "NETI/O": "0B/0B",
    "BLOCKI/O": "22.09MB/4.096kB",
    "PIDS": "0",
    "a8": "",
    "a9": "",
    "a10": ""
  }
]