Add text file as an array to json field using jq - json

I have a JSON file and a text file as shown below :
{
symbols: null,
symbols_count : null
}
values.txt
VALUE1
VALUE2
VALUE3
Using jq, I want to add values of values.txt as an array to the symbols field of the json.
symbols_count=$(wc -l < values.txt)
jq ".symbol_count = $symbols_count" < input.json | \
jq --slurpfile symbols values.txt '.symbols_count=$symbols'
The last jq command fails because the values in values.txt are not enclosed by "" .
Is there any way to add double quotes without changing values.txt?

jq expects its input to be either valid JSON or raw text, so it would probably be simplest if you could ensure your "JSON file" is valid JSON. See the jq FAQ for further details if that is an issue.
One way to handle values.txt is to use the --rawfile command-line option:
< input.json jq --rawfile text values.txt '
.symbols = [$text|splits("\n")|select(length>0)]
| .symbols_count = (.symbols|length)'

jq -nR '([inputs | select(length > 0)]) |
{"symbols":., "symbols_count":(. | length)}' values.txt
The jq program with comments:
values2json (chmod +x values2json)
#!/usr/bin/env -S jq -fnR
(
# Select non-empty input lines as an array
[ inputs | select( length > 0 ) ]
) |
# Build the object
{
# with the array created above
"symbols": .,
# and the length of the array
"symbols_count": (. | length)
}
Usage:
./values2json values.txt
Output from sample data:
{
"symbols": [
"VALUE1",
"VALUE2",
"VALUE3"
],
"symbols_count": 3
}

Related

get files from directory in bash and build JSON object using jq

I am trying to build list of JSON objects with the files in a particular directory. I am looping thru the files and creating the expected output object as string. I am sure there is a better way of doing this using jq.
Can someone please help me out here?
# input
files=($( ls * ))
prefix="myawesomeprefix"
# expected output
{
"listoffiles": [
{"file":"myawesomeprefix/file1.txt"},
{"file":"myawesomeprefix/file2.txt"},
{"file":"myawesomeprefix/file3.txt"},
]
}
If you don't have any "problematic" file names, e.g. ones that have new lines as part of their name, the following should work:
ls -1 | jq -Rn '{ listoffiles: [inputs | { file: "prefix/\(.)" }] }'
It reads each line as string, and reads them through the inputs filter (must be combined with -n null-input). It then builds your object.
$ cat <<LS | jq -Rn '{ listoffiles: [inputs | {file:"prefix/\(.)"}] }'
file1
file2
file with spaces
LS
{
"listoffiles": [
{
"file": "prefix/file1"
},
{
"file": "prefix/file2"
},
{
"file": "prefix/file with spaces"
}
]
}
You could use for with a glob which should handle new lines in file names as well. But it requires you to chain 2 jq commands:
for f in *; do
printf '%s' "$f" | jq -Rs '{file:"prefix/\(.)"}';
done | jq -s '{listoffiles:.}'
To specify the prefix as variable from the outside, use --arg, e.g.
jq --arg prefix "yourprefixvalue" '$prefix + .'
You can try the nice little command line tool jc:
ls | jc --ls
It converts the output of many shell commands to JSON. For reference have a look there in Github https://github.com/kellyjonbrazil/jc .
Then you can transform the result using jq:
ls | jc --ls | jq "{ listoffiles: [.[] | { file: (\"$prefix/\" + .filename) }] }"
You shouldn't parse the output of ls. If installed, you could use tree with the -J option to produce a JSON listing, which you can transform to your needs using jq:
tree -aJL 1 | jq '
{listoffiles: first.contents | map({file: ("myawesomeprefix/" + .name)})}
'
Or more comfortably using --arg:
tree -aJL 1 | jq --arg prefix myawesomeprefix '
{listoffiles: first.contents | map({file: "\($prefix)/\(.name)"})}
'
This is another alternative :
jq -n --arg prefix "myawesomeprefix"\
'.listoffiles = ($ARGS.positional |
map({file:($prefix+"/"+.)}))'\
--args *

jq how to pass json keys from a shell variable

I have a json file I am parsing with jq. This is a sample of the file
[{
"key1":{...},
"key2":{...}
}]
[{
"key1":{...},
"key2":{...}
}]
...
each line is a list containing a json (which I know is not technically a json format but jq still works on such a file)
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
$KEYS=.key1,.key2
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
why is this happening? what is the correct command to make this happen?
This answer does not help me. I am not getting any errors as the OP in that question.
Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of a writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
( ( $keys / "," ) | map( . / "." ) ) as $keys |
.[] | [ getpath( $keys[] ) ]
' file.json
Suppose you have:
cat file
[{
"key1":1,
"key2":2
}]
[{
"key1":1,
"key2":2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file and nothing keeps you from creating the file separately from the shell variables.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note double " causing string interpolation
jq ".[] | [${keys}]" file
You can use --argjson option and destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]' jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]
Elaborating on ikegami's answer.
To start with here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct array
[["key1","a"],["key2","b"]]
then use it on getpath(PATHS) builtin to extract values out of our input.
To start with we are given in shell variable with string value key1.a,key2.b. Let's call this $keys.
Then $keys/"," gives
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input equivalent to
[.[] | .key1.a, .key2.b]
which is
[1,2]
[3,4]

What is returned through a jq filter on a json object in bash? [duplicate]

I parsed a json file with jq like this :
# cat test.json | jq '.logs' | jq '.[]' | jq '._id' | jq -s
It returns an array like this : [34,235,436,546,.....]
Using bash script i described an array :
# declare -a msgIds = ...
This array uses () instead of [] so when I pass the array given above to this array it won't work.
([324,32,45..]) this causes problem. If i remove the jq -s, an array forms with only 1 member in it.
Is there a way to solve this issue?
We can solve this problem by two ways. They are:
Input string:
// test.json
{
"keys": ["key1","key2","key3"]
}
Approach 1:
1) Use jq -r (output raw strings, not JSON texts) .
KEYS=$(jq -r '.keys' test.json)
echo $KEYS
# Output: [ "key1", "key2", "key3" ]
2) Use #sh (Converts input string to a series of space-separated strings). It removes square brackets[], comma(,) from the string.
KEYS=$(<test.json jq -r '.keys | #sh')
echo $KEYS
# Output: 'key1' 'key2' 'key3'
3) Using tr to remove single quotes from the string output. To delete specific characters use the -d option in tr.
KEYS=$((<test.json jq -r '.keys | #sh')| tr -d \')
echo $KEYS
# Output: key1 key2 key3
4) We can convert the comma-separated string to the array by placing our string output in a round bracket().
It also called compound Assignment, where we declare the array with a bunch of values.
ARRAYNAME=(value1 value2 .... valueN)
#!/bin/bash
KEYS=($((<test.json jq -r '.keys | #sh') | tr -d \'\"))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
Approach 2:
1) Use jq -r to get the string output, then use tr to delete characters like square brackets, double quotes and comma.
#!/bin/bash
KEYS=$(jq -r '.keys' test.json | tr -d '[],"')
echo $KEYS
# Output: key1 key2 key3
2) Then we can convert the comma-separated string to the array by placing our string output in a round bracket().
#!/bin/bash
KEYS=($(jq -r '.keys' test.json | tr -d '[]," '))
echo "Array size: " ${#KEYS[#]}
echo "Array elements: "${KEYS[#]}
# Output:
# Array size: 3
# Array elements: key1 key2 key3
To correctly parse values that have spaces, newlines (or any other arbitrary characters) just use jq's #sh filter and bash's declare -a. (No need for a while read loop or any other pre-processing)
// foo.json
{"data": ["A B", "C'D", ""]}
str=$(jq -r '.data | #sh' foo.json)
declare -a arr="($str)" # must be quoted like this
$ declare -p arr
declare -a arr=([0]="A B" [1]="C'D" [2]="")
The reason that this works correctly is that #sh will produce a space-separated list of shell-quoted words:
$ echo "$str"
'A B' 'C'\''D' ''
and this is exactly the format that declare expects for an array definition.
Use jq -r to output a string "raw", without JSON formatting, and use the #sh formatter to format your results as a string for shell consumption. Per the jq docs:
#sh:
The input is escaped suitable for use in a command-line for a POSIX shell. If the input is an array, the output will be a series of space-separated strings.
So can do e.g.
msgids=($(<test.json jq -r '.logs[]._id | #sh'))
and get the result you want.
From the jq FAQ (https://github.com/stedolan/jq/wiki/FAQ):
š¯‘ø: How can a stream of JSON texts produced by jq be converted into a bash array of corresponding values?
A: One option would be to use mapfile (aka readarray), for example:
mapfile -t array <<< $(jq -c '.[]' input.json)
An alternative that might be indicative of what to do in other shells is to use read -r within a while loop. The following bash script populates an array, x, with JSON texts. The key points are the use of the -c option, and the use of the bash idiom while read -r value; do ... done < <(jq .......):
#!/bin/bash
x=()
while read -r value
do
x+=("$value")
done < <(jq -c '.[]' input.json)
++ To resolve this, we can use a very simple approach:
++ Since I am not aware of you input file, I am creating a file input.json with the following contents:
input.json:
{
"keys": ["key1","key2","key3"]
}
++ Use jq to get the value from the above file input.json:
Command: cat input.json | jq -r '.keys | #sh'
Output: 'key1' 'key2' 'key3'
Explanation: | #sh removes [ and "
++ To remove ' ' as well we use tr
command: cat input.json | jq -r '.keys | #sh' | tr -d \'
Explanation: use tr delete -d to remove '
++ To store this in a bash array we use () with `` and print it:
command:
KEYS=(`cat input.json | jq -r '.keys | #sh' | tr -d \'`)
To print all the entries of the array: echo "${KEYS[*]}"

Create JSON from string with format "key1=value1,key2=value2" using jq

I'm trying to create a json file from a string with the following format:
string="key1=value1,key2=value2"
Is there a way to create a json using jq by specifying the = and , symbols as separators for the keys and values?
The output I'm looking for would be:
{"key1": "value1", "key2ā€¯ :ā€¯value2"}
I've tried to use this post as a reference:
Create JSON using jq from pipe-separated keys and values in bash -- however, it expects input that contains a line with only keys, before later lines with only values; here, the keys and values are all interspersed.
Here's a reduce-free solution that assumes string is the shell variable (not part of the string to be parsed), and that parsing of the string can be accomplished by first splitting on ",":
jq -R 'split(",")
| map( index("=") as $i | {(.[0:$i]) : .[$i+1:]})
| add' <<< "$string"
Notice that this allows "=" to appear within the values.
The only trickiness here is that when a key name is specified programmatically, it must be enclosed within parentheses.
Supplemental question
string="key1=value1|key2=value2,value3|key3=value4"
In this case, you would first split on "|", and then find the first occurrence of "=":
split("|")
| map( index("=") as $i | {(.[0:$i]) : .[$i+1:]})
| add
| map_values(if index(",") then split(",") else . end)
Output:
{
"key1": "value1",
"key2": [
"value2",
"value3"
],
"key3": "value4"
}
string="key1=value1,key2=value2"
jq -Rc '
split(",")
| [.[] | match( "([^=]*)=(.*)" )]
| reduce .[].captures as $item ({}; .[$item[0].string]=$item[1].string)
' <<<"$string"
echo -n "key1=value1,key2=value2" | \
jq -csR '[split(",")[]|split("=") | {(.[0]): .[1]}]|add'
this gives
{"key1":"value1","key2":"value2"}

Create JSON using jq from pipe-separated keys and values in bash

I am trying to create a json object from a string in bash. The string is as follows.
CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0
The output is from docker stats command and my end goal is to publish custom metrics to aws cloudwatch. I would like to format this string as json.
{
"CONTAINER":"nginx_container",
"CPU%":"0.02%",
....
}
I have used jq command before and it seems like it should work well in this case but I have not been able to come up with a good solution yet. Other than hardcoding variable names and indexing using sed or awk. Then creating a json from scratch. Any suggestions would be appreciated. Thanks.
Prerequisite
For all of the below, it's assumed that your content is in a shell variable named s:
s='CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0'
What (modern jq)
# thanks to #JeffMercado and #chepner for refinements, see comments
jq -Rn '
( input | split("|") ) as $keys |
( inputs | split("|") ) as $vals |
[[$keys, $vals] | transpose[] | {key:.[0],value:.[1]}] | from_entries
' <<<"$s"
How (modern jq)
This requires very new (probably 1.5?) jq to work, and is a dense chunk of code. To break it down:
Using -n prevents jq from reading stdin on its own, leaving the entirety of the input stream available to be read by input and inputs -- the former to read a single line, and the latter to read all remaining lines. (-R, for raw input, causes textual lines rather than JSON objects to be read).
With [$keys, $vals] | transpose[], we're generating [key, value] pairs (in Python terms, zipping the two lists).
With {key:.[0],value:.[1]}, we're making each [key, value] pair into an object of the form {"key": key, "value": value}
With from_entries, we're combining those pairs into objects containing those keys and values.
What (shell-assisted)
This will work with a significantly older jq than the above, and is an easily adopted approach for scenarios where a native-jq solution can be harder to wrangle:
{
IFS='|' read -r -a keys # read first line into an array of strings
## read each subsequent line into an array named "values"
while IFS='|' read -r -a values; do
# setup: positional arguments to pass in literal variables, query with code
jq_args=( )
jq_query='.'
# copy values into the arguments, reference them from the generated code
for idx in "${!values[#]}"; do
[[ ${keys[$idx]} ]] || continue # skip values with no corresponding key
jq_args+=( --arg "key$idx" "${keys[$idx]}" )
jq_args+=( --arg "value$idx" "${values[$idx]}" )
jq_query+=" | .[\$key${idx}]=\$value${idx}"
done
# run the generated command
jq "${jq_args[#]}" "$jq_query" <<<'{}'
done
} <<<"$s"
How (shell-assisted)
The invoked jq command from the above is similar to:
jq --arg key0 'CONTAINER' \
--arg value0 'nginx_container' \
--arg key1 'CPU%' \
--arg value1 '0.0.2%' \
--arg key2 'MEMUSAGE/LIMIT' \
--arg value2 '25.09MiB/15.26GiB' \
'. | .[$key0]=$value0 | .[$key1]=$value1 | .[$key2]=$value2' \
<<<'{}'
...passing each key and value out-of-band (such that it's treated as a literal string rather than parsed as JSON), then referring to them individually.
Result
Either of the above will emit:
{
"CONTAINER": "nginx_container",
"CPU%": "0.02%",
"MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
"MEM%": "0.16%",
"NETI/O": "0B/0B",
"BLOCKI/O": "22.09MB/4.096kB",
"PIDS": "0"
}
Why
In short: Because it's guaranteed to generate valid JSON as output.
Consider the following as an example that would break more naive approaches:
s='key ending in a backslash\
value "with quotes"'
Sure, these are unexpected scenarios, but jq knows how to deal with them:
{
"key ending in a backslash\\": "value \"with quotes\""
}
...whereas an implementation that didn't understand JSON strings could easily end up emitting:
{
"key ending in a backslash\": "value "with quotes""
}
I know this is an old post, but the tool you seek is called jo: https://github.com/jpmens/jo
A quick and easy example:
$ jo my_variable="simple"
{"my_variable":"simple"}
A little more complex
$ jo -p name=jo n=17 parser=false
{
"name": "jo",
"n": 17,
"parser": false
}
Add an array
$ jo -p name=jo n=17 parser=false my_array=$(jo -a {1..5})
{
"name": "jo",
"n": 17,
"parser": false,
"my_array": [
1,
2,
3,
4,
5
]
}
I've made some pretty complex stuff with jo and the nice thing is that you don't have to worry about rolling your own solution worrying about the possiblity of making invalid json.
You can ask docker to give you JSON data in the first place
docker stats --format "{{json .}}"
For more on this, see: https://docs.docker.com/config/formatting/
JSONSTR=""
declare -a JSONNAMES=()
declare -A JSONARRAY=()
LOOPNUM=0
cat ~/newfile | while IFS=: read CONTAINER CPU MEMUSE MEMPC NETIO BLKIO PIDS; do
if [[ "$LOOPNUM" = 0 ]]; then
JSONNAMES=("$CONTAINER" "$CPU" "$MEMUSE" "$MEMPC" "$NETIO" "$BLKIO" "$PIDS")
LOOPNUM=$(( LOOPNUM+1 ))
else
echo "{ \"${JSONNAMES[0]}\": \"${CONTAINER}\", \"${JSONNAMES[1]}\": \"${CPU}\", \"${JSONNAMES[2]}\": \"${MEMUSE}\", \"${JSONNAMES[3]}\": \"${MEMPC}\", \"${JSONNAMES[4]}\": \"${NETIO}\", \"${JSONNAMES[5]}\": \"${BLKIO}\", \"${JSONNAMES[6]}\": \"${PIDS}\" }"
fi
done
Returns:
{ "CONTAINER": "nginx_container", "CPU%": "0.02%", "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB", "MEM%": "0.16%", "NETI/O": "0B/0B", "BLOCKI/O": "22.09MB/4.096kB", "PIDS": "0" }
Here is a solution which uses the -R and -s options along with transpose:
split("\n") # [ "CONTAINER...", "nginx_container|0.02%...", ...]
| (.[0] | split("|")) as $keys # [ "CONTAINER", "CPU%", "MEMUSAGE/LIMIT", ... ]
| (.[1:][] | split("|")) # [ "nginx_container", "0.02%", ... ] [ ... ] ...
| select(length > 0) # (remove empty [] caused by trailing newline)
| [$keys, .] # [ ["CONTAINER", ...], ["nginx_container", ...] ] ...
| [ transpose[] | {(.[0]):.[1]} ] # [ {"CONTAINER": "nginx_container"}, ... ] ...
| add # {"CONTAINER": "nginx_container", "CPU%": "0.02%" ...
json_template='{"CONTAINER":"%s","CPU%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "nginx_container" "0.02%" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
Not using jq but possible to use args and environment in values.
CONTAINER=nginx_container
json_template='{"CONTAINER":"%s","CPU%":"%s","MEMUSAGE/LIMIT":"%s", "MEM%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "$CONTAINER" "$1" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
If you're starting with tabular data, I think it makes more sense to use something that works with tabular data natively, like sqawk to make it into json, and then use jq work with it further.
echo 'CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0' \
| sqawk -FS '[|]' -RS '\n' -output json 'select * from a' header=1 \
| jq '.[] | with_entries(select(.key|test("^a.*")|not))'
{
"CONTAINER": "nginx_container",
"CPU%": "0.02%",
"MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
"MEM%": "0.16%",
"NETI/O": "0B/0B",
"BLOCKI/O": "22.09MB/4.096kB",
"PIDS": "0"
}
Without jq, sqawk gives a bit too much:
[
{
"anr": "1",
"anf": "7",
"a0": "nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0",
"CONTAINER": "nginx_container",
"CPU%": "0.02%",
"MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
"MEM%": "0.16%",
"NETI/O": "0B/0B",
"BLOCKI/O": "22.09MB/4.096kB",
"PIDS": "0",
"a8": "",
"a9": "",
"a10": ""
}
]