Can this jq map be simplified?

Given this JSON:
{
"key": "/books/OL1000072M",
"source_records": [
"ia:daywithtroubadou00pern",
"bwb:9780822519157",
"marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267"
]
}
Can the following jq code be simplified?
jq -r '.key as $olid | .source_records | map([$olid, .])[] | @tsv'
The use of variable assignment feels like cheating and I'm wondering if it can be eliminated. The goal is to map the key value onto each of the source_records values and output a two column TSV.

Instead of mapping into an array, and then iterating over it (map(…)[]) just create an array and collect its items ([…]). Also, you can get rid of the variable binding (as) by moving the second part into its own context using parens.
jq -r '[.key] + (.source_records[] | [.]) | @tsv'
Alternatively, instead of using @tsv you could build the tab-separated output string yourself, either by concatenation (… + …) or by string interpolation ("\(…)"):
jq -r '.key + "\t" + .source_records[]'
jq -r '"\(.key)\t\(.source_records[])"'
Output:
/books/OL1000072M ia:daywithtroubadou00pern
/books/OL1000072M bwb:9780822519157
/books/OL1000072M marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267
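The string-building forms work because .source_records[] emits one value per array element, and jq evaluates the surrounding expression once for each of them. For comparison only, here is a rough Python sketch of that per-element expansion; key_record_rows is a hypothetical helper name, and this is not how jq is implemented:

```python
import json

# Sample document copied from the question.
doc = json.loads("""
{
  "key": "/books/OL1000072M",
  "source_records": [
    "ia:daywithtroubadou00pern",
    "bwb:9780822519157",
    "marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267"
  ]
}
""")


def key_record_rows(record):
    """Pair the record's key with each source record, one tab-separated row each."""
    return [f"{record['key']}\t{source}" for source in record["source_records"]]


rows = key_record_rows(doc)
print("\n".join(rows))
```

The list comprehension plays the role of the implicit loop that .source_records[] triggers in the jq filter.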

It's not much shorter, but I think it's clearer than the original and clearer than the other shorter answers.
jq -r '.key as $olid | .source_records[] | [ $olid, . ] | @tsv'

Related

jq - Looping through json and concatenate the output to single string

I am currently learning how to use jq. I have a JSON file and I am able to loop through it and filter out the values I need. However, I am running into an issue when I try to combine the output into a single string instead of having it on multiple lines.
File svcs.json:
[
{
"name": "svc-A",
"run" : "True"
},
{
"name": "svc-B",
"run" : "False"
},
{
"name": "svc-C",
"run" : "True"
}
]
I was using jq to output the names of the services whose run value is True:
jq -r '.[] | select(.run=="True") | .name ' svcs.json
I was getting the output as follows:
svc-A
svc-C
I was looking to get the output as a single string separated by commas.
Expected Output:
"svc-A,svc-C"
I tried using join, but have been unable to get it to work so far.
The .[] expression explodes the array into a stream of its elements. You'll need to collect the transformed stream (the names) back into an array. Then you can use the @csv filter for the final output:
$ jq -r '[ .[] | select(.run=="True") | .name ] | @csv' svcs.json
"svc-A","svc-C"
But here's where map comes in handy to operate on an array's elements:
$ jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"
Keep the array using map instead of decomposing it with .[], then join with a glue string:
jq -r 'map(select(.run=="True") | .name) | join(",")' svcs.json
svc-A,svc-C
If your goal is to create CSV output, there is a dedicated @csv formatter that takes care of quoting, escaping, etc.
jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"
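The practical difference between join(",") and @csv is the quoting: the first glues the raw strings together, the second quotes each string field. A small Python sketch of the same distinction, assuming the svcs.json content shown in the question (the variable names are illustrative only):

```python
import csv
import io
import json

# Same content as svcs.json in the question.
services = json.loads("""
[
  {"name": "svc-A", "run": "True"},
  {"name": "svc-B", "run": "False"},
  {"name": "svc-C", "run": "True"}
]
""")

# Equivalent of: map(select(.run=="True") | .name)
names = [svc["name"] for svc in services if svc["run"] == "True"]

# Equivalent of join(","): plain glue, no quoting or escaping.
plain = ",".join(names)

# Roughly what @csv does for string fields: each one is quoted.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n").writerow(names)
quoted = buf.getvalue().rstrip("\n")

print(plain)
print(quoted)
```

If a name could ever contain a comma or a double quote, the quoted form is the safer choice.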

Pretty-print valid JSONs mixed with string keys

I have a Redis hash with keys and values like string key -- serialized JSON value.
Corresponding rediscli query (hgetall some_redis_hash) being dumped in a file:
redis_key1
{"value1__key1": "value1__value1", "value1__key2": "value1__value2" ...}
redis_key2
{"value2__key1": "value2__value1", "value2__key2": "value2__value2" ...}
...
and so on.
So the question is, how do I pretty-print these values enclosed in curly brackets? (Note that the key strings in between make the document invalid if you try to parse it as a whole.)
The first thought is to get particular pairs from Redis, strip parasite keys, and use jq on the remaining valid JSON, as shown below:
rediscli hget some_redis_hash redis_key1 > file && tail -n +2 file
- file now contains valid JSON as value, the first string representing Redis key is stripped by tail -
cat file | jq
- produces pretty-printed value -
So the question is, how to pretty-print without such preprocessing?
Or (would be better in this particular case) how to merge keys and values in one big JSON, where Redis keys, accessible on the upper level, will be followed by dicts of their values?
Like that:
rediscli hgetall some_redis_hash > file
cat file | cool_parser
- prints { "redis_key1": {"value1__key1": "value1__value1", ...}, "redis_key2": ... }
A simple way for just pretty-printing would be the following:
cat file | jq --raw-input --raw-output '. as $raw | try fromjson catch $raw'
It tries to parse each line as json with fromjson, and just outputs the original line (with $raw) if it can't.
(The --raw-input is there so that we can invoke fromjson enclosed in a try instead of running it on every line directly, and --raw-output is there so that any non-json lines are not enclosed in quotes in the output.)
A solution for the second part of your question using only jq:
cat file \
| jq --raw-input --null-input '[inputs] | _nwise(2) | {(.[0]): .[1] | fromjson}' \
| jq --null-input '[inputs] | add'
--null-input combined with [inputs] produces the whole input as an array
which _nwise(2) then chunks into groups of two (more info on _nwise)
which {(.[0]): .[1] | fromjson} then transforms into a list of jsons
which | jq --null-input '[inputs] | add' then combines into a single json
Or in a single jq invocation:
cat file | jq --raw-input --null-input \
'[ [inputs] | _nwise(2) | {(.[0]): .[1] | fromjson} ] | add'
...but by that point you might be better off writing an easier-to-understand Python script.
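A minimal sketch of what such a Python script could look like, assuming the dump strictly alternates between a key line and a JSON value line. The function name pair_keys_with_values and the sample dump are illustrative, not taken from a real Redis export:

```python
import json


def pair_keys_with_values(lines):
    """Turn alternating key / JSON-value lines into a single dict."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % 2:
        raise ValueError("expected an even number of non-empty lines")
    # Even positions hold the Redis keys, odd positions the serialized values.
    return {
        key: json.loads(value)
        for key, value in zip(cleaned[0::2], cleaned[1::2])
    }


# Illustrative sample in the same shape as the dump in the question.
sample_dump = """\
redis_key1
{"value1__key1": "value1__value1", "value1__key2": "value1__value2"}
redis_key2
{"value2__key1": "value2__value1", "value2__key2": "value2__value2"}
"""

merged = pair_keys_with_values(sample_dump.splitlines())
print(json.dumps(merged, indent=2))
```

To run it against a real dump, the lines could be read from the file instead of sample_dump, for example with open("file") as f: pair_keys_with_values(f).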

Nested array in JSON to different rows in CSV

I have the following JSON:
{
"transmitterId": "30451155eda2",
"rssiSignature": [
{
"receiverId": "001bc509408201d5",
"receiverIdType": 1,
"rssi": -52,
"numberOfDecodings": 5,
"rssiSum": -52
},
{
"receiverId": "001bc50940820228",
"receiverIdType": 1,
"rssi": -85,
"numberOfDecodings": 5,
"rssiSum": -85
}
],
"timestamp": 1574228579837
}
I want to convert it to CSV format, where each row corresponds to an entry in rssiSignature (I have added the header row for visualization purposes):
timestamp,transmitterId,receiverId,rssi
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
My current attempt is the following, but I get a single CSV row:
$ jq -r '[.timestamp, .transmitterId, .rssiSignature[].receiverId, .rssiSignature[].rssi] | @csv' test.jsonl
1574228579837,"30451155eda2","001bc509408201d5","001bc50940820228",-52,-85
How can I use jq to generate different rows for each entry of the rssiSignature array?
In order to reuse a value of the upper level, like the timestamp, for every item of the rssiSignature array, you can define it as a variable. You can get your csv like this:
jq -r '.timestamp as $t | .transmitterId as $tid |
.rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
' file.json
Output:
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
Also, here is a way to print a header line to an output file in bash, independent of which commands we call, using command grouping.
(
printf "timestamp,transmitterId,receiverId,rssi\n"
jq -r '.timestamp as $t | .transmitterId as $tid |
.rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
' file.json
) > output.csv
Actually, the task can be accomplished without the use of any variables; one can also coax jq to include a header:
jq -r '
["timestamp","transmitterId","receiverId","rssi"],
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi])
| @csv'
A single header with multiple files
One way to produce a single header with multiple input files would be to use inputs in conjunction with the -n command-line option. This happens also to be efficient:
jq -nr '
["timestamp","transmitterId","receiverId","rssi"],
(inputs |
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi]))
| @csv'
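For comparison, the same "one header, one row per rssiSignature entry" shape can be sketched in Python with the standard csv module. This is only an illustration under the assumption that every input record has the structure shown in the question; records_to_csv and HEADER are hypothetical names:

```python
import csv
import io
import json

# Sample record copied from the question.
record = json.loads("""
{
  "transmitterId": "30451155eda2",
  "rssiSignature": [
    {"receiverId": "001bc509408201d5", "receiverIdType": 1, "rssi": -52,
     "numberOfDecodings": 5, "rssiSum": -52},
    {"receiverId": "001bc50940820228", "receiverIdType": 1, "rssi": -85,
     "numberOfDecodings": 5, "rssiSum": -85}
  ],
  "timestamp": 1574228579837
}
""")

HEADER = ["timestamp", "transmitterId", "receiverId", "rssi"]


def records_to_csv(records):
    """Write one header, then one row per rssiSignature entry of every record."""
    buf = io.StringIO()
    # QUOTE_NONNUMERIC quotes strings and leaves numbers bare,
    # which matches what @csv produces for strings and numbers.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(HEADER)
    for rec in records:
        for entry in rec["rssiSignature"]:
            writer.writerow([rec["timestamp"], rec["transmitterId"],
                             entry["receiverId"], entry["rssi"]])
    return buf.getvalue()


# Two inputs, still a single header line.
csv_text = records_to_csv([record, record])
print(csv_text, end="")
```

The outer loop over records corresponds to inputs in the jq version, and the inner loop to .rssiSignature[].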

jq string manipulation on domain names and dns records

I am attempting to learn some jq but am running into trouble.
I am working with a dataset of dns records like {"timestamp":"1592145252","name":"0.127.9.109.rev.sfr.net","type":"a","value":"109.9.127.0"}
I cannot figure out how to:
strip the subdomain details out of the name field (in this example I just want sfr.net)
print the name backwards, e.g. 0.127.9.109.rev.sfr.net would become ten.rfs.ver.901.9.721.0
My end goal is to print lines like this:
0.127.9.109.rev.sfr.net,ten.rfs.ver.901.9.721.0,a,sfr.net
Thanks SO!
To extract the "domain" part, you could use simple string manipulation methods to select it. Assuming anything after the .rev. part is the domain, you could do this:
split(".rev.")[1]
To reverse a string, jq doesn't have the operations to do it directly for strings. However it does have a function to reverse arrays. So you could convert to an array, reverse, then convert back.
split("") | reverse | join("")
To put it all together for your input:
[
.name,
(.name | split("") | reverse | join("")),
.type,
(.name | split(".rev.")[1])
] | join(",")
Here's one approach using reverse and capture:
jq -r '
.type as $type
| .name
| "\(.),\(explode|reverse|implode),\($type),"
+ capture("(?<subdomain>[^.]+[.][^.]+)$").subdomain'
Alternatively, outside jq, with grep and rev:
$ jq -r '.name' file.json | grep -oE '\w+\.\w+$'
sfr.net
$ jq -r '.name' file.json | rev
ten.rfs.ver.901.9.721.0
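The same two string operations, sketched in Python for comparison. Like the regex-based answer, this assumes the domain is simply the last two dot-separated labels, which does not hold for suffixes such as co.uk; dns_line is a hypothetical helper name:

```python
import json


def dns_line(record):
    """Build 'name,reversed name,type,domain' for one DNS record."""
    name = record["name"]
    reversed_name = name[::-1]  # reverse character by character
    # Assumption: the domain is the last two labels of the name.
    domain = ".".join(name.split(".")[-2:])
    return ",".join([name, reversed_name, record["type"], domain])


# Sample record copied from the question.
sample = json.loads(
    '{"timestamp":"1592145252","name":"0.127.9.109.rev.sfr.net",'
    '"type":"a","value":"109.9.127.0"}'
)

line = dns_line(sample)
print(line)
```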

Parse JSON field with jq and output repeatedly for every child

I have the following JSON:
{
"field1":"foo",
"array":[
{
"child_field1":"c1_1",
"child_field2":"c1_2"
},
{
"child_field1":"c2_1",
"child_field2":"c2_2"
}
]...
}
and using jq, I would like to return the following output, where the value of field1 is repeated for every child element.:
foo,c1_1,c1_2
foo,c2_1,c2_2
...
I can access each field separately, am having trouble returning the desired result above.
Can this be done with jq?
jq -r '.array[] as $a | [.field1, $a.child_field1, $a.child_field2] | @csv'
Does the right thing for the sample data you provided, but I freely admit there are lots of ways to do that kind of thing in jq, and that was only the first one which sprang to mind.
I fed it through @csv because it seemed like that was what you wanted, but if you prefer the exact output as you have written it, then:
jq -r '.array[] as $a | "\(.field1),\($a.child_field1),\($a.child_field2)"'
will produce it.
In cases like this, there's no need for reduce or to_entries, or to list the fields explicitly; one can simply exploit jq's backtracking behavior:
.field1 as $f
| .array[]
| [$f, .[]]
| @csv
As pointed out by @MatthewLDaniel, there are many alternatives to using @csv here.
jq solution:
jq -r '.field1 as $f1 | .array[]
| [$f1, .[]]
| join(",")' input.json
The output:
foo,c1_1,c1_2
foo,c2_1,c2_2
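The [$f1, .[]] idiom takes every value of each child object without naming the fields. A rough Python counterpart, assuming the child values are strings and relying on dict insertion order (guaranteed since Python 3.7); the sample data fills in only the two children visible in the question, and child_rows is a hypothetical name:

```python
# The two children shown in the question; the elided part is left out.
doc = {
    "field1": "foo",
    "array": [
        {"child_field1": "c1_1", "child_field2": "c1_2"},
        {"child_field1": "c2_1", "child_field2": "c2_2"},
    ],
}


def child_rows(document):
    """Prefix every child's values with field1, without listing field names."""
    parent = document["field1"]
    return [
        ",".join([parent, *map(str, child.values())])
        for child in document["array"]
    ]


rows = child_rows(doc)
print("\n".join(rows))
```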