Parsing JSON using jq or Python - json

I have this nested JSON
[
"[[Input=[Name=ABC, createDateTime=2019-30-11, RollNumber=9]]]",
"[[SubjectList=[Summer=, Winter=, Autumn=, Spring=, rList=, sList=, additionalList=, emailList=, FoodList=, sAssignmentList=, summerworkList=, outdoorList=, movielist=]]]",
"[ProcessingDate=2018-10-06]",
"[Hobbies=Football]",
"[Phone=Android,,]"
]
How can I process this JSON and get the value football or rollnumber using Python?
This is what I tried:
Code
import json
row = '''[
"[[Input=[Name=ABC, createDateTime=2019-30-11, RollNumber=9]]]",
"[[SubjectList=[Summer=, Winter=, Autumn=, Spring=, rList=, sList=, additionalList=, emailList=, FoodList=, sAssignmentList=, summerworkList=, outdoorList=, movielist=]]]",
"[ProcessingDate=2018-10-06]",
"[Hobbies=Football]",
"[Phone=Android,,]"
]'''
row_dict = json.loads(row)
print(row_dict[3])
Using this - I get following output:
[Hobbies=Football]
But I am missing next level parsing to get just football as output

Here is an approach that uses capture on the non-json strings in the array.
It assumes the [:alnum:] posix regex character class suffices to match the values after the =
Sample execution assuming data in test.json
$ jq -M '.[] | capture("Hobbies=(?<Hobbies>[[:alnum:]]+)")' test.json
{
"Hobbies": "Football"
}
Here is a variation which produces exactly Football:
$ jq -Mr '.[] | capture("Hobbies=(?<Hobbies>[[:alnum:]]+)") | .Hobbies' test.json
Football
Here's an example script which uses multiple captures and combines them with add
[ .[]
| capture("Hobbies=(?<Hobbies>[[:alnum:]]+)")
, capture("RollNumber=(?<RollNumber>[[:alnum:]]+)")
] | add
Sample execution assuming script in test.jq
$ jq -M -f test.jq test.json
{
"RollNumber": "9",
"Hobbies": "Football"
}

Related

get files from directory in bash and build JSON object using jq

I am trying to build list of JSON objects with the files in a particular directory. I am looping thru the files and creating the expected output object as string. I am sure there is a better way of doing this using jq.
Can someone please help me out here?
# input
files=($( ls * ))
prefix="myawesomeprefix"
# expected output
{
"listoffiles": [
{"file":"myawesomeprefix/file1.txt"},
{"file":"myawesomeprefix/file2.txt"},
{"file":"myawesomeprefix/file3.txt"},
]
}
If you don't have any "problematic" file names, e.g. ones that have new lines as part of their name, the following should work:
ls -1 | jq -Rn '{ listoffiles: [inputs | { file: "prefix/\(.)" }] }'
It reads each line as string, and reads them through the inputs filter (must be combined with -n null-input). It then builds your object.
$ cat <<LS | jq -Rn '{ listoffiles: [inputs | {file:"prefix/\(.)"}] }'
file1
file2
file with spaces
LS
{
"listoffiles": [
{
"file": "prefix/file1"
},
{
"file": "prefix/file2"
},
{
"file": "prefix/file with spaces"
}
]
}
You could use for with a glob which should handle new lines in file names as well. But it requires you to chain 2 jq commands:
for f in *; do
printf '%s' "$f" | jq -Rs '{file:"prefix/\(.)"}';
done | jq -s '{listoffiles:.}'
To specify the prefix as variable from the outside, use --arg, e.g.
jq --arg prefix "yourprefixvalue" '$prefix + .'
You can try the nice little command line tool jc:
ls | jc --ls
It converts the output of many shell commands to JSON. For reference have a look there in Github https://github.com/kellyjonbrazil/jc .
Then you can transform the result using jq:
ls | jc --ls | jq "{ listoffiles: [.[] | { file: (\"$prefix/\" + .filename) }] }"
You shouldn't parse the output of ls. If installed, you could use tree with the -J option to produce a JSON listing, which you can transform to your needs using jq:
tree -aJL 1 | jq '
{listoffiles: first.contents | map({file: ("myawesomeprefix/" + .name)})}
'
Or more comfortably using --arg:
tree -aJL 1 | jq --arg prefix myawesomeprefix '
{listoffiles: first.contents | map({file: "\($prefix)/\(.name)"})}
'
This is another alternative :
jq -n --arg prefix "myawesomeprefix"\
'.listoffiles = ($ARGS.positional |
map({file:($prefix+"/"+.)}))'\
--args *

jq - Looping through json and concatenate the output to single string

I was currently learning the usage of jq. I have a json file and I am able to loop through and filter out the values I need from the json. However, I am running into issue when I try to combine the output into single string instead of having the output in multiple lines.
File svcs.json:
[
{
"name": "svc-A",
"run" : "True"
},
{
"name": "svc-B",
"run" : "False"
},
{
"name": "svc-C",
"run" : "True"
}
]
I was using the jq to filter to output the service names with run value as True
jq -r '.[] | select(.run=="True") | .name ' svcs.json
I was getting the output as follows:
svc-A
svc-C
I was looking to get the output as single string separated by commas.
Expected Output:
"svc-A,svc-C"
I tried to using join, but was unable to get it to work so far.
The .[] expression explodes the array into a stream of its elements. You'll need to collect the transformed stream (the names) back into an array. Then you can use the #csv filter for the final output
$ jq -r '[ .[] | select(.run=="True") | .name ] | #csv' svcs.json
"svc-A","svc-C"
But here's where map comes in handy to operate on an array's elements:
$ jq -r 'map(select(.run=="True") | .name) | #csv' svcs.json
"svc-A","svc-C"
Keep the array using map instead of decomposing it with .[], then join with a glue string:
jq -r 'map(select(.run=="True") | .name) | join(",")' svcs.json
svc-A,svc-C
Demo
If your goal is to create a CSV output, there is a special #csv command taking care of quoting, escaping etc.
jq -r 'map(select(.run=="True") | .name) | #csv' svcs.json
"svc-A","svc-C"
Demo

jq how to pass json keys from a shell variable

I have a json file I am parsing with jq. This is a sample of the file
[{
"key1":{...},
"key2":{...}
}]
[{
"key1":{...},
"key2":{...}
}]
...
each line is a list containing a json (which I know is not technically a json format but jq still works on such a file)
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
$KEYS=.key1,.key2
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
why is this happening? what is the correct command to make this happen?
This answer does not help me. I am not getting any errors as the OP in that question.
Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of a writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
( ( $keys / "," ) | map( . / "." ) ) as $keys |
.[] | [ getpath( $keys[] ) ]
' file.json
Suppose you have:
cat file
[{
"key1":1,
"key2":2
}]
[{
"key1":1,
"key2":2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file and nothing keeps you from creating the file separately from the shell variables.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note double " causing string interpolation
jq ".[] | [${keys}]" file
You can use --argjson option and destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]' jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]
Elaborating on ikegami's answer.
To start with here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct array
[["key1","a"],["key2","b"]]
then use it on getpath(PATHS) builtin to extract values out of our input.
To start with we are given in shell variable with string value key1.a,key2.b. Let's call this $keys.
Then $keys/"," gives
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input equivalent to
[.[] | .key1.a, .key2.b]
which is
[1,2]
[3,4]

jq: How to look up a JSON object by name read from another object?

I have the following JSON:
{
"contexts": {
"context1": {
"mydata": "value1"
},
"context2": {
"mydata": "value2"
}
},
"current_context": "context2"
}
I'd like to use jq to output the value of mydata for the context indicated by current_context. The above would output value2. If I change the JSON to have "current_context": "context1", I'd get value1.
Given two invocations of jq, and the above JSON content in a file called jq.json, this works:
jq -r --arg context "$(jq -r '.current_context' jq.json)" '.contexts | to_entries[] | select(.key == $context) | .value.mydata' jq.json
Is there a way to do this with a single invocation of jq?
Thanks!
Like this:
jq -r '.contexts[.current_context].mydata' file.json
You could also use a variable:
jq -r '.current_context as $cc|.contexts[$cc].mydata' file.json
and here's an alternative to jq solution, based on a walk-path unix utility jtc:
bash $ jtc -w'[current_context]<cc>v[^0]<cc>t[mydata]' file.json
"value2"
bash $
walk path (-w) breakdown:
[current_context]<cc>v - get to the current_context value and memorize it in the namespace cc (directive <cc>v does it)
[^0]<cc>t[mydata] - reset walk path back to root ([^0]) and recursively search for the label (tag) stored in the namespace (<cc>t), then address found JSON object by label mydata.
PS> Disclosure: I'm the creator of the jtc - shell cli tool for JSON operations

Filter only specific keys from an external file in jq

I have a JSON file with the following format:
[
{
"id": "00001",
"attr": {
"a": "foo",
"b": "bar",
...
}
},
{
"id": "00002",
"attr": {
...
},
...
},
...
]
and a text file with a list of ids, one per line. I'd like to use jq to filter only the records whose ids are mentioned in the text file. I.e. if the list contains "00001", only the first one should be printed.
Note, that I can't simply grep since each record may have an arbitrary number of attributes and sub-attributes.
There are basically two ways to proceed:
read the file of ids from STDIN
read the JSON from STDIN
Both are feasible, but here we illustrate (2) as it leads to a simple but efficient solution.
Suppose the JSON file is named in.json and the list of ids is in a file named ids.txt like so:
00001
00010
Notice that this file has no quotation marks. If it does, then the following can be significantly simplified as shown in the postscript.
The trick is to convert ids.txt into a JSON array. With the above assumption about quotation marks, this can be done by:
jq -R . ids.txt | jq -s .
Assuming a reasonable shell, a simple solution is now at hand:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" '
map( select( .id as $id | $ids | index($id) ))' in.json
Faster
Assuming your jq has any/2, then a simpler and more efficient solution can be obtaining by defining:
def isin($a): . as $in | any($a[]; $in == .);
The required jq filter is then just:
map( select( .id | isin($ids) ) )
If these two lines of jq are put into a file named select.jq, the required incantation is simply:
jq --argjson ids "$(jq -R . ids.txt | jq -s)" -f select.jq in.json
Postscript
If the index file consists of a stream of valid JSON texts (e.g., strings with quotation marks) and if your jq supports the --slurpfile option, the invocation can be further simplified to:
jq --slurpfile ids ids.txt -f select.jq in.json
Or if you want everything as a one-liner:
jq --slurpfile ids ids.txt 'map(select(.id as $id|any($ids[];$id==.)))' in.json