jq how to pass json keys from a shell variable - json

I have a json file I am parsing with jq. This is a sample of the file
[{
"key1":{...},
"key2":{...}
}]
[{
"key1":{...},
"key2":{...}
}]
...
Each line is a list containing a JSON object (which I know does not make the file as a whole valid JSON, but jq still works on such a file).
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
KEYS='.key1,.key2'
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
Why is this happening? What is the correct command to make this work?
This answer does not help me. I am not getting any errors, unlike the OP in that question.

Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
( ( $keys / "," ) | map( . / "." ) ) as $keys |
.[] | [ getpath( $keys[] ) ]
' file.json

Suppose you have:
cat file
[{
"key1":1,
"key2":2
}]
[{
"key1":1,
"key2":2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file and nothing keeps you from creating the file separately from the shell variables.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note double " causing string interpolation
jq ".[] | [${keys}]" file

You can use --argjson option and destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]'; jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]

Elaborating on ikegami's answer.
To start with here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct array
[["key1","a"],["key2","b"]]
then pass it to the getpath(PATHS) builtin to extract values from our input.
To start with, we are given the shell variable in with the string value key1.a,key2.b, which is passed to jq as $keys.
Then $keys/"," gives
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input equivalent to
.[] | [.key1.a, .key2.b]
which is
[1,2]
[3,4]
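For comparison, the same split-then-look-up steps can be sketched outside jq. This is only an illustration in Python, not part of the jq answer: get_path and extract are hypothetical helper names, the sketch handles object keys only (no array indices), and it returns None for any missing key, which only roughly matches getpath.

```python
import json

def get_path(value, path):
    """Follow a list of object keys, roughly like jq's getpath; None if a key is missing."""
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def extract(line, keys):
    # "key1.a,key2.b" -> [["key1", "a"], ["key2", "b"]]
    paths = [spec.split(".") for spec in keys.split(",")]
    # Each input line is an array of objects; build one list of values per object.
    return [[get_path(obj, p) for p in paths] for obj in json.loads(line)]

lines = [
    '[{"key1":{"a":1},"key2":{"b":2}}]',
    '[{"key1":{"a":3},"key2":{"b":4}}]',
]
for line in lines:
    for row in extract(line, "key1.a,key2.b"):
        print(json.dumps(row, separators=(",", ":")))
```

Run on the two sample lines, this prints [1,2] and [3,4], matching the jq output above.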

Related

Filter in jq based on an array of string

This is close to this question: jq filter JSON array based on value in list. The only difference is the condition: I don't want equality, but a "contains", and I am struggling to do it correctly.
The problem:
I want to filter elements of an input file based on the presence of a specific value in an entry.
The input file:
[{
"id":"type 1 is great"
},{
"id":"type 2"
},
{
"id":"this is another type 2"
},
{
"id":"type 4"
}]
The filter file:
ids.txt:
type 2
type 1
The select.jq file:
def isin($a): . as $in | any($a[]; contains($in));
map( select( .id | contains($ids) ) )
The jq command:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq test.json
The expected result should be something like:
[{
"id":"type 1 is great"
},{
"id":"type 2"
},
{
"id":"this is another type 2"
}]
The problem is obviously in the "isin" function of the select.jq file, so how do I write it correctly to check whether an entry contains one of the strings in another array?
Here is a solution that illustrates that there is no need to invoke jq more than once. In accordance with the jq program shown in the question, it also assumes that the selection is to be based on the values of the "id" key.
< ids.txt jq -nR --argfile json input.json '
# Is the input an acceptable value?
def acceptable($array): any(contains($array[]); .);
[inputs] as $substrings
| $json
| map( select(.id | acceptable($substrings)) )
'
Note on --argfile
--argfile has been deprecated, so you may wish to use --slurpfile instead, in which case you'd have to write $json[0].
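If it helps to see the selection rule spelled out, here is a small Python sketch of the same "id contains any of the substrings" test. It is an illustration only: select_by_substrings is a hypothetical name, and the records and substrings are copied inline from the question rather than read from test.json and ids.txt.

```python
import json

def select_by_substrings(records, substrings):
    """Keep the records whose "id" contains at least one of the substrings."""
    return [r for r in records if any(s in r["id"] for s in substrings)]

records = json.loads('''[
  {"id": "type 1 is great"},
  {"id": "type 2"},
  {"id": "this is another type 2"},
  {"id": "type 4"}
]''')
substrings = ["type 2", "type 1"]  # the lines of ids.txt from the question

print(json.dumps(select_by_substrings(records, substrings), indent=2))
```

The "type 4" entry is dropped because neither substring occurs in it.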

How to print path and key values of JSON file using JQ

I would like to print each path and value of a json file with included key values line by line. I would like the output to be comma delimited or at least very easy to cut and sort using Linux command line tools. Given the following json and jq, I have been given jq code which seems to do this for the test JSON, but I am not sure it works in all cases or is the proper approach.
Is there a function in jq which does this automatically? If not, is there a "most concise best way" to do it?
My wish would be something like:
$ cat short.json | jq -doit '.'
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Test JSON:
$ cat short.json | jq '.'
{
"Reservations": [
{
"Groups": [],
"Instances": [
{
"ImageId": "ami-a",
"InstanceId": "i-a",
"InstanceType": "t2.micro",
"KeyName": "ubuntu"
}
]
}
]
}
Code Recommended:
https://unix.stackexchange.com/questions/561460/how-to-print-path-and-key-values-of-json-file
Supporting:
https://unix.stackexchange.com/questions/515573/convert-json-file-to-a-key-path-with-the-resulting-value-at-the-end-of-each-k
JQ Code Too long and complicated!
jq -r '
paths(scalars) as $p
| [ ( [ $p[] | tostring ] | join(".") )
, ( getpath($p) | tojson )
]
| join(": ")
' short.json
Result:
Reservations.0.Instances.0.ImageId: "ami-a"
Reservations.0.Instances.0.InstanceId: "i-a"
Reservations.0.Instances.0.InstanceType: "t2.micro"
Reservations.0.Instances.0.KeyName: "ubuntu"
A simple jq query to achieve the requested format:
paths(scalars) as $p
| $p + [getpath($p)]
| join(",")
If your jq is ancient and you cannot upgrade, insert | map(tostring) before the last line above.
Output with the -r option
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Caveat
If a key or atomic value contains "," then of course using a comma may be inadvisable. For this reason, it might be preferable to use a delimiter such as TAB, which @tsv escapes as \t if it does occur in a key or value. Consider therefore using @tsv:
paths(scalars) as $p
| $p + [getpath($p)]
| @tsv
(The comment above about ancient versions of jq applies here too.)
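For readers who want to check the result against something other than jq, the same flattening can be sketched in Python. This is an illustration under assumptions, not a jq feature: scalar_paths and flatten are hypothetical names, and unlike the jq filter this sketch also emits null and false leaves.

```python
import json

def scalar_paths(value, prefix=()):
    """Yield (path, leaf) pairs for every non-container value in the document."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from scalar_paths(child, prefix + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from scalar_paths(child, prefix + (index,))
    else:
        yield prefix, value

def flatten(doc, sep=","):
    lines = []
    for path, leaf in scalar_paths(doc):
        # Strings are printed raw; other leaves use their JSON spelling (true, null, 1.5).
        text = leaf if isinstance(leaf, str) else json.dumps(leaf)
        lines.append(sep.join([str(p) for p in path] + [text]))
    return lines

doc = json.loads('''{"Reservations": [{"Groups": [], "Instances": [
  {"ImageId": "ami-a", "InstanceId": "i-a",
   "InstanceType": "t2.micro", "KeyName": "ubuntu"}]}]}''')

print("\n".join(flatten(doc)))
```

Passing sep="\t" gives a tab-delimited variant, although this sketch does not escape tabs inside values the way @tsv does.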
Read it as a stream.
$ jq --stream -r 'select(.[1]|scalars!=null) | "\(.[0]|join(".")): \(.[1]|tojson)"' short.json
Use -c paths as follows:
cat short.json | jq -c paths | tr -d '[' | tr -d ']'
I am using jq-1.5-1-a5b5cbe

Parsing JSON using jq or Python

I have this nested JSON
[
"[[Input=[Name=ABC, createDateTime=2019-30-11, RollNumber=9]]]",
"[[SubjectList=[Summer=, Winter=, Autumn=, Spring=, rList=, sList=, additionalList=, emailList=, FoodList=, sAssignmentList=, summerworkList=, outdoorList=, movielist=]]]",
"[ProcessingDate=2018-10-06]",
"[Hobbies=Football]",
"[Phone=Android,,]"
]
How can I process this JSON and get the value football or rollnumber using Python?
This is what I tried:
Code
import json
row = '''[
"[[Input=[Name=ABC, createDateTime=2019-30-11, RollNumber=9]]]",
"[[SubjectList=[Summer=, Winter=, Autumn=, Spring=, rList=, sList=, additionalList=, emailList=, FoodList=, sAssignmentList=, summerworkList=, outdoorList=, movielist=]]]",
"[ProcessingDate=2018-10-06]",
"[Hobbies=Football]",
"[Phone=Android,,]"
]'''
row_dict = json.loads(row)
print(row_dict[3])
Using this - I get following output:
[Hobbies=Football]
But I am missing next level parsing to get just football as output
Here is an approach that uses capture on the non-json strings in the array.
It assumes the [:alnum:] posix regex character class suffices to match the values after the =
Sample execution assuming data in test.json
$ jq -M '.[] | capture("Hobbies=(?<Hobbies>[[:alnum:]]+)")' test.json
{
"Hobbies": "Football"
}
Here is a variation which produces exactly Football:
$ jq -Mr '.[] | capture("Hobbies=(?<Hobbies>[[:alnum:]]+)") | .Hobbies' test.json
Football
Here's an example script which uses multiple captures and combines them with add
[ .[]
| capture("Hobbies=(?<Hobbies>[[:alnum:]]+)")
, capture("RollNumber=(?<RollNumber>[[:alnum:]]+)")
] | add
Sample execution assuming script in test.jq
$ jq -M -f test.jq test.json
{
"RollNumber": "9",
"Hobbies": "Football"
}
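Since the question asks for Python, the same capture idea can be sketched with the standard re module. This is a minimal sketch under the same assumption as the jq version, namely that the value after the = is purely alphanumeric; extract_fields is a hypothetical helper name.

```python
import json
import re

row = '''[
"[[Input=[Name=ABC, createDateTime=2019-30-11, RollNumber=9]]]",
"[[SubjectList=[Summer=, Winter=, Autumn=, Spring=, rList=, sList=, additionalList=, emailList=, FoodList=, sAssignmentList=, summerworkList=, outdoorList=, movielist=]]]",
"[ProcessingDate=2018-10-06]",
"[Hobbies=Football]",
"[Phone=Android,,]"
]'''

def extract_fields(strings, names):
    """Return {name: value} for the first alphanumeric value found after 'name='."""
    found = {}
    for name in names:
        pattern = re.compile(re.escape(name) + r"=([A-Za-z0-9]+)")
        for text in strings:
            match = pattern.search(text)
            if match:
                found[name] = match.group(1)
                break
    return found

fields = extract_fields(json.loads(row), ["Hobbies", "RollNumber"])
print(fields["Hobbies"])     # Football
print(fields["RollNumber"])  # 9
```

Values are returned as strings, as in the jq output, so convert RollNumber with int() if a number is needed.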

jq - filter using list of object identifier-index stored in a variable

I'm writing a reusable Bash script that allows me to query an API to get data about parliamentary members and store it in csv format.
The json response of the API has a lot of keys (name, gender, birthdate, commission membership, vote...), and, depending on what I'd like to do, I do not always want to capture the same keys. So I would like to abstract this part of the code in order to be able to write:
mp_keys=a,b,c,d
curl https://mp.com | jq '. | [$mp_keys] | @csv'
so that it is interpreted by jq as
jq '. | [.a, .b, .c, .d] | @csv'
I don't have a set structure for the variable format, so it could be:
mp_keys="a,b,c,d" or
mp_keys=".a, .b, .c, .d" or
mp_key1=a, mp_key2=b, mp_key3=c, mp_key4=d
I know that in jq I can use the following structure:
jq --arg mp_key1 "${mp_key1}" --arg mp_key2 "${mp_key2}" --arg mp_key3 "${mp_key3}" --arg mp_key4 "${mp_key4}" '.
| [.[$mp_key1], .[$mp_key2], .[$mp_key3], .[$mp_key4]]
| @csv'
But it obviously becomes tedious very quickly.
Lastly I could also build the jq command as a string then apply eval to it but I would prefer using a proper jq solution.
Edit
I will break down @peak's answer for future reference, as it's very useful to understand what's happening. His answer is reproduced below:
mp_keys="a,b,c,d"
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r --arg mpk "$mp_keys" '
($mpk|split(",")) as $mpkeys
| [ .[ $mpkeys[] ] ]
| @csv '
First, it's important to understand that jq will:
evaluate the value of $mpkeys first, which implies doing the split(",") on $mpk
then pass the json sent through echo.
So we can do the same to understand what's happening. The ( ) tells jq to evaluate this section first, so we can start by replacing the parenthesized expression with its result.
The string "a,b,c,d", stored in $mpk, is split into the array ["a","b","c","d"], as described in the section on split(str) of jq's manual.
The array is subsequently stored into $mpkeys through as.
Which means that an equivalent of the initial code can be written as:
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r --arg mpk "$mp_keys" '
["a","b","c","d"] as $mpkeys
| [ .[ $mpkeys[] ] ]
| @csv '
Of course now the --arg is useless, as $mpk was there to hold our initial string. So we can simplify further with:
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r '["a","b","c","d"] as $mpkeys
| [ .[ $mpkeys[] ] ]
| @csv '
Let's now break down [ .[ $mpkeys[] ] ]:
because $mpkeys is an array, $mpkeys[] stands for the individual elements of the array, equivalent to "a","b","c","d". Instead of a single element (an array), we now have 4 elements (strings), which will be transformed individually by the filters around them (the brackets)
$mpkeys[] is then wrapped into .[] which applies to each of the 4 elements, and is consequently equivalent to .["a"], .["b"], .["c"], .["d"]. Each of those elements is a generic object index, which are equivalent to the object-identifier index form (.a, .b, .c, .d) as described in jq's manual.
the final outer [ ] simply wraps everything inside an array, which is necessary to pass the result to @csv.
So the equivalent to the code above is:
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r '["a","b","c","d"] as $mpkeys
| [ .["a"], .["b"], .["c"], .["d"] ]
| @csv '
Here $mpkeys serves no purpose, so we actually have:
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r ' [ .["a"], .["b"], .["c"], .["d"] ]
| @csv '
There we are.
Using your approach:
mp_keys="a,b,c,d"
echo '{"a":1, "b":2, "c":3, "d": 4}' |
jq -r --arg mpk "$mp_keys" '
($mpk|split(",")) as $mpkeys
| [ .[ $mpkeys[] ] ]
| @csv '
would yield:
1,2,3,4
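As a cross-check outside jq, the same "split the key list, look up each key, emit one CSV row" logic can be sketched in Python. This is only an illustration: select_csv is a hypothetical name, and Python's csv module quotes fields only when needed, whereas @csv always quotes strings.

```python
import csv
import io
import json

def select_csv(record, keys):
    """Render the values of the comma-separated keys as one CSV row."""
    wanted = keys.split(",")  # "a,b,c,d" -> ["a", "b", "c", "d"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Missing keys become empty fields, much as jq renders null in @csv.
    writer.writerow([record.get(k) for k in wanted])
    return buffer.getvalue().rstrip("\n")

record = json.loads('{"a":1, "b":2, "c":3, "d": 4}')
print(select_csv(record, "a,b,c,d"))
```

Changing the key string, for example to "d,a", changes the selected columns without touching the code, which is the behaviour the question asks for.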

Filter only specific keys from an external file in jq

I have a JSON file with the following format:
[
{
"id": "00001",
"attr": {
"a": "foo",
"b": "bar",
...
}
},
{
"id": "00002",
"attr": {
...
},
...
},
...
]
and a text file with a list of ids, one per line. I'd like to use jq to filter only the records whose ids are mentioned in the text file. I.e. if the list contains "00001", only the first one should be printed.
Note, that I can't simply grep since each record may have an arbitrary number of attributes and sub-attributes.
There are basically two ways to proceed:
1. read the file of ids from STDIN
2. read the JSON from STDIN
Both are feasible, but here we illustrate (2) as it leads to a simple but efficient solution.
Suppose the JSON file is named in.json and the list of ids is in a file named ids.txt like so:
00001
00010
Notice that this file has no quotation marks. If it does, then the following can be significantly simplified as shown in the postscript.
The trick is to convert ids.txt into a JSON array. With the above assumption about quotation marks, this can be done by:
jq -R . ids.txt | jq -s .
Assuming a reasonable shell, a simple solution is now at hand:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" '
map( select( .id as $id | $ids | index($id) ))' in.json
Faster
Assuming your jq has any/2, a simpler and more efficient solution can be obtained by defining:
def isin($a): . as $in | any($a[]; $in == .);
The required jq filter is then just:
map( select( .id | isin($ids) ) )
If these two lines of jq are put into a file named select.jq, the required incantation is simply:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq in.json
Postscript
If the index file consists of a stream of valid JSON texts (e.g., strings with quotation marks) and if your jq supports the --slurpfile option, the invocation can be further simplified to:
jq --slurpfile ids ids.txt -f select.jq in.json
Or if you want everything as a one-liner:
jq --slurpfile ids ids.txt 'map(select(.id as $id|any($ids[];$id==.)))' in.json