How to print path and key values of JSON file using JQ

I would like to print each path and value of a JSON file, including key values, line by line. I would like the output to be comma delimited, or at least very easy to cut and sort using Linux command-line tools. I have been given jq code which seems to do this for the test JSON below, but I am not sure it works in all cases or is the proper approach.
Is there a function in jq which does this automatically? If not, is there a "most concise best way" to do it?
My wish would be something like:
$ cat short.json | jq -doit '.'
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Test JSON:
$ cat short.json | jq '.'
{
  "Reservations": [
    {
      "Groups": [],
      "Instances": [
        {
          "ImageId": "ami-a",
          "InstanceId": "i-a",
          "InstanceType": "t2.micro",
          "KeyName": "ubuntu"
        }
      ]
    }
  ]
}
Code Recommended:
https://unix.stackexchange.com/questions/561460/how-to-print-path-and-key-values-of-json-file
Supporting:
https://unix.stackexchange.com/questions/515573/convert-json-file-to-a-key-path-with-the-resulting-value-at-the-end-of-each-k
The recommended jq code (too long and complicated!):
jq -r '
paths(scalars) as $p
| [ ( [ $p[] | tostring ] | join(".") )
, ( getpath($p) | tojson )
]
| join(": ")
' short.json
Result:
Reservations.0.Instances.0.ImageId: "ami-a"
Reservations.0.Instances.0.InstanceId: "i-a"
Reservations.0.Instances.0.InstanceType: "t2.micro"
Reservations.0.Instances.0.KeyName: "ubuntu"

A simple jq query to achieve the requested format:
paths(scalars) as $p
| $p + [getpath($p)]
| join(",")
If your jq is ancient and you cannot upgrade, insert | map(tostring) before the last line above.
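In full, with the test file from the question, the invocation should be:
jq -r 'paths(scalars) as $p | $p + [getpath($p)] | join(",")' short.json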
Output with the -r option
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Caveat
If a key or atomic value contains "," then of course using a comma may be inadvisable. For this reason, it might be preferable to use a character such as TAB that cannot appear in a JSON key or atomic value. Consider therefore using @tsv:
paths(scalars) as $p
| $p + [getpath($p)]
| @tsv
(The comment above about ancient versions of jq applies here too.)
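Putting it together, a complete invocation might look like this (the output fields below are separated by literal tabs):
jq -r 'paths(scalars) as $p | $p + [getpath($p)] | @tsv' short.json
Reservations	0	Instances	0	ImageId	ami-a
Reservations	0	Instances	0	InstanceId	i-a
Reservations	0	Instances	0	InstanceType	t2.micro
Reservations	0	Instances	0	KeyName	ubuntu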

Read it as a stream.
$ jq --stream -r 'select(.[1]|scalars!=null) | "\(.[0]|join(".")): \(.[1]|tojson)"' short.json
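Run against the test JSON above, this should produce the same dotted-path output as the longer program:
Reservations.0.Instances.0.ImageId: "ami-a"
Reservations.0.Instances.0.InstanceId: "i-a"
Reservations.0.Instances.0.InstanceType: "t2.micro"
Reservations.0.Instances.0.KeyName: "ubuntu"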

Use -c paths as follows:
cat short.json | jq -c paths | tr -d '[]'
I am using jq-1.5-1-a5b5cbe
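For the test JSON above, this should print one line per path, with the quotation marks around keys left in place (there is no -r). Note that it prints paths only, not the leaf values:
"Reservations"
"Reservations",0
"Reservations",0,"Groups"
"Reservations",0,"Instances"
"Reservations",0,"Instances",0
"Reservations",0,"Instances",0,"ImageId"
"Reservations",0,"Instances",0,"InstanceId"
"Reservations",0,"Instances",0,"InstanceType"
"Reservations",0,"Instances",0,"KeyName"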

Related

get files from directory in bash and build JSON object using jq

I am trying to build a list of JSON objects from the files in a particular directory. I am looping through the files and creating the expected output object as a string. I am sure there is a better way of doing this using jq.
Can someone please help me out here?
# input
files=($( ls * ))
prefix="myawesomeprefix"
# expected output
{
  "listoffiles": [
    {"file": "myawesomeprefix/file1.txt"},
    {"file": "myawesomeprefix/file2.txt"},
    {"file": "myawesomeprefix/file3.txt"}
  ]
}
If you don't have any "problematic" file names, e.g. ones that have new lines as part of their name, the following should work:
ls -1 | jq -Rn '{ listoffiles: [inputs | { file: "prefix/\(.)" }] }'
It reads each line as a string and feeds them through the inputs filter (which must be combined with -n, null input). It then builds your object.
$ cat <<LS | jq -Rn '{ listoffiles: [inputs | {file:"prefix/\(.)"}] }'
file1
file2
file with spaces
LS
{
  "listoffiles": [
    {
      "file": "prefix/file1"
    },
    {
      "file": "prefix/file2"
    },
    {
      "file": "prefix/file with spaces"
    }
  ]
}
You could use a for loop with a glob, which should handle newlines in file names as well. But it requires you to chain two jq commands:
for f in *; do
  printf '%s' "$f" | jq -Rs '{file:"prefix/\(.)"}'
done | jq -s '{listoffiles:.}'
To specify the prefix as a variable from the outside, use --arg, e.g.
jq --arg prefix "yourprefixvalue" '$prefix + .'
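Combined with the inputs approach shown earlier, a full command might look like this sketch:
ls -1 | jq -Rn --arg prefix "myawesomeprefix" '{ listoffiles: [inputs | { file: "\($prefix)/\(.)" }] }'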
You can try the nice little command line tool jc:
ls | jc --ls
It converts the output of many shell commands to JSON. For reference, have a look at it on GitHub: https://github.com/kellyjonbrazil/jc
Then you can transform the result using jq:
ls | jc --ls | jq "{ listoffiles: [.[] | { file: (\"$prefix/\" + .filename) }] }"
You shouldn't parse the output of ls. If installed, you could use tree with the -J option to produce a JSON listing, which you can transform to your needs using jq:
tree -aJL 1 | jq '
{listoffiles: first.contents | map({file: ("myawesomeprefix/" + .name)})}
'
Or more comfortably using --arg:
tree -aJL 1 | jq --arg prefix myawesomeprefix '
{listoffiles: first.contents | map({file: "\($prefix)/\(.name)"})}
'
This is another alternative:
jq -n --arg prefix "myawesomeprefix" \
  '.listoffiles = ($ARGS.positional | map({file: ($prefix + "/" + .)}))' \
  --args *
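Note that $ARGS requires jq 1.6 or later. Assuming the directory contains the three files from the question, and adding -c for compact output, this should print:
{"listoffiles":[{"file":"myawesomeprefix/file1.txt"},{"file":"myawesomeprefix/file2.txt"},{"file":"myawesomeprefix/file3.txt"}]}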

jq how to pass json keys from a shell variable

I have a json file I am parsing with jq. This is a sample of the file
[{
  "key1": {...},
  "key2": {...}
}]
[{
  "key1": {...},
  "key2": {...}
}]
...
Each line is an array containing an object (which I know is not technically valid JSON as a whole, but jq still works on such a stream of JSON texts).
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
$KEYS=.key1,.key2
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
Why is this happening? What is the correct command to make this happen?
This answer does not help me. I am not getting any errors like the OP in that question.
Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of a writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
( ( $keys / "," ) | map( . / "." ) ) as $keys |
.[] | [ getpath( $keys[] ) ]
' file.json
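As a quick sanity check, here is a sketch of the second variant run against the concrete data used in the elaboration further down:
$ echo '[{"key1":{"a":1},"key2":{"b":2}}]' | jq -c --arg keys 'key1.a,key2.b' '(($keys/",")|map(./".")) as $keys | .[] | [getpath($keys[])]'
[1,2]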
Suppose you have:
cat file
[{
  "key1": 1,
  "key2": 2
}]
[{
  "key1": 1,
  "key2": 2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file, and nothing keeps you from creating that file from your shell variables beforehand.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note: the double quotes let the shell expand ${keys}
jq ".[] | [${keys}]" file
You can use the --argjson option and destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]'; jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]
Elaborating on ikegami's answer.
To start with here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct array
[["key1","a"],["key2","b"]]
then use it with the getpath(PATHS) builtin to extract the values out of our input.
To start with, we are given a shell variable in with the string value key1.a,key2.b, which is passed to jq as $keys.
Then $keys/"," gives (once for each input document):
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input equivalent to
[.[] | .key1.a, .key2.b]
which is
[1,2]
[3,4]

Grouping and sorting JSON records in Bash

I'm using curl to get a JSON file. My problem is that I would like to get each group of four fields on one line, then break the line, and sort by the first column.
I'm trying:
curl -L 'http://mylink/' | jq '.[] | .location, .host_name, .serial_number, .model'
I'm getting
"Office-1"
"work-1"
"11xxx111"
"hp"
"Office-2"
"work-2"
"33xxx333"
"lenovo"
"Office-1"
"work-3"
"22xxx222"
"dell"
I would like to have:
"Office-1", "work-1", "11xxx111", "hp"
"Office-1" "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
I tried jq -S '.[] | .location | group_by(.location)' and a few other combinations like sort_by(.location), but it doesn't work. I'm getting the error: jq: error (at <stdin>:1): Cannot iterate over string ("Office-1")
Sample of my JSON file:
[
  {
    "location": "Office-1",
    "host_name": "work-1",
    "serial_number": "11xxx111",
    "model": "hp"
  },
  {
    "location": "Office-2",
    "host_name": "work-2",
    "serial_number": "33xxx333",
    "model": "lenovo"
  },
  {
    "location": "Office-1",
    "host_name": "work-3",
    "serial_number": "22xxx222",
    "model": "dell"
  }
]
To sort by .location only, without an external sort:
map( [ .location, .host_name, .serial_number, .model] )
| sort_by(.[0])[]
| map("\"\(.)\"") | join(", ")
The ", " is per the stated requirements.
If you want the output as CSV, simply replace the last line in the jq program above by #csv.
If minimizing keystrokes is a goal, then if you are certain that the keys are always in the desired order, you could get away with replacing the first line by map( [ .[] ] )
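For the record, the @csv variant mentioned above might be invoked like this, assuming the sample input is saved as file.json (-r is still required):
jq -r 'map([.location, .host_name, .serial_number, .model]) | sort_by(.[0])[] | @csv' file.json
"Office-1","work-1","11xxx111","hp"
"Office-1","work-3","22xxx222","dell"
"Office-2","work-2","33xxx333","lenovo"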
You can ask jq to produce arbitrarily formatted strings.
curl -L 'http://mylink/' |
jq -r '.[]| "\"\(.location)\", \"\(.host_name)\", \"\(.serial_number)\", \"\(.model)\""' |
sort
Inside the double quotes, \" produces a literal double quote, and \(.field) interpolates the value of a field. The -r option is required to produce output which isn't JSON.
This will get you the output you wanted:
jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""'
like so:
$ jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""' /tmp/so7713.json
"Office-1", "work-1", "11xxx111", "hp"
"Office-1", "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
If you want it all as one string, it's a bit simpler:
$ jq 'group_by(.location) | .[] | .[] | map(values) | join (", ")' /tmp/so7713.json
"Office-1, work-1, 11xxx111, hp"
"Office-1, work-3, 22xxx222, dell"
"Office-2, work-2, 33xxx333, lenovo"
Note the lack of -r in the second example.
I feel there has to be a better way of doing .[] | .[], but I don't know what it is (yet).
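(For what it's worth, one marginally shorter spelling: the double iteration .[] | .[] can be written as .[][], e.g. group_by(.location) | .[][] | map(values) | join(", ").)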

Map arrays to objects with no common fields

How might one use jq-1.5-1-a5b5cbe to join a filtered set of arrays from STDIN to a set of objects which contains no common fields, assuming that all elements will be in predictable order?
Standard Input (pre-slurpfile; generated by multiple GETs):
{"ref":"objA","arr":["alpha"]}
{"ref":"objB","arr":["bravo"]}
Existing File:
[{"name":"foo"},{"name":"bar"}]
Desired Output:
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
Current Bash:
$ multiGET | jq --slurpfile stdin /dev/stdin '.[].arr = $stdin[].arr' file
[
  {
    "name": "foo",
    "arr": [
      "alpha"
    ]
  },
  {
    "name": "bar",
    "arr": [
      "alpha"
    ]
  }
]
[
  {
    "name": "foo",
    "arr": [
      "bravo"
    ]
  },
  {
    "name": "bar",
    "arr": [
      "bravo"
    ]
  }
]
Sidenote: I wasn't sure when to use pretty/compact JSON in this question; please comment with your opinion on best practice.
Get jq to read file before stdin, so that the first entity in file will be . and you can get everything else using inputs.
$ multiGET | jq -c '. as $objects
| [ foreach (inputs | {arr}) as $x (-1; .+1;
. as $i | $objects[$i] + $x
) ]' file -
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]
"Slurping" (whether using -s or --slurpfile) is sometimes necessary but rarely desirable, because of the memory requirements. So here's a solution that takes advantage of the fact that your multiGET produces a stream:
multiGET | jq -n --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
$objects
| [foreach inputs as $in (-1; .+1;
. as $ix
| $objects[$ix] + ($in | del(.ref)))]
'
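Run with -c added, this should likewise emit the desired output on one line:
[{"name":"foo","arr":["alpha"]},{"name":"bar","arr":["bravo"]}]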
Here's a functional approach that might be appropriate if your stream was in fact already packaged as an array:
multiGET | jq -s --argjson objects '[{"name":"foo"},{"name":"bar"}]' '
[$objects, map(del(.ref))]
| transpose
| map(add)
'
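To see why this works, consider the intermediate values (a sketch, given the same two-object stream). [$objects, map(del(.ref))] evaluates to
[[{"name":"foo"},{"name":"bar"}],[{"arr":["alpha"]},{"arr":["bravo"]}]]
transpose then pairs up corresponding elements:
[[{"name":"foo"},{"arr":["alpha"]}],[{"name":"bar"},{"arr":["bravo"]}]]
and map(add) merges each pair into a single object, giving the desired result.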
If the $objects array is in a file or too big for the command line, I'd suggest using --argfile, even though it is technically deprecated.
If the $objects array is in a file, and if you want to avoid --argfile, you could still avoid slurping, e.g. by using the fact that unless -n is used, jq will automatically read one JSON entity from stdin:
(echo '[{"name":"foo"},{"name":"bar"}]';
multiGET) | jq '
. as $objects
| [foreach inputs as $in (-1; .+1;
. as $ix | $objects[$ix] + $in | del(.ref))]
'

Filter only specific keys from an external file in jq

I have a JSON file with the following format:
[
  {
    "id": "00001",
    "attr": {
      "a": "foo",
      "b": "bar",
      ...
    }
  },
  {
    "id": "00002",
    "attr": {
      ...
    },
    ...
  },
  ...
]
and a text file with a list of ids, one per line. I'd like to use jq to filter only the records whose ids are mentioned in the text file. I.e. if the list contains "00001", only the first one should be printed.
Note that I can't simply grep, since each record may have an arbitrary number of attributes and sub-attributes.
There are basically two ways to proceed:
(1) read the file of ids from STDIN
(2) read the JSON from STDIN
Both are feasible, but here we illustrate (2) as it leads to a simple but efficient solution.
Suppose the JSON file is named in.json and the list of ids is in a file named ids.txt like so:
00001
00010
Notice that this file has no quotation marks. If it does, then the following can be significantly simplified as shown in the postscript.
The trick is to convert ids.txt into a JSON array. With the above assumption about quotation marks, this can be done by:
jq -R . ids.txt | jq -s .
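Given the two-line ids.txt above, this produces:
[
  "00001",
  "00010"
]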
Assuming a reasonable shell, a simple solution is now at hand:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" '
map( select( .id as $id | $ids | index($id) ))' in.json
Faster
Assuming your jq has any/2, then a simpler and more efficient solution can be obtained by defining:
def isin($a): . as $in | any($a[]; $in == .);
The required jq filter is then just:
map( select( .id | isin($ids) ) )
If these two lines of jq are put into a file named select.jq, the required incantation is simply:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq in.json
Postscript
If the index file consists of a stream of valid JSON texts (e.g., strings with quotation marks) and if your jq supports the --slurpfile option, the invocation can be further simplified to:
jq --slurpfile ids ids.txt -f select.jq in.json
Or if you want everything as a one-liner:
jq --slurpfile ids ids.txt 'map(select(.id as $id|any($ids[];$id==.)))' in.json