Nested array in JSON to different rows in CSV

I have the following JSON:
{
"transmitterId": "30451155eda2",
"rssiSignature": [
{
"receiverId": "001bc509408201d5",
"receiverIdType": 1,
"rssi": -52,
"numberOfDecodings": 5,
"rssiSum": -52
},
{
"receiverId": "001bc50940820228",
"receiverIdType": 1,
"rssi": -85,
"numberOfDecodings": 5,
"rssiSum": -85
}
],
"timestamp": 1574228579837
}
I want to convert it to CSV format, where each row corresponds to an entry in rssiSignature (I have added the header row for visualization purposes):
timestamp,transmitterId,receiverId,rssi
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
My current attempt is the following, but I get a single CSV row:
$ jq -r '[.timestamp, .transmitterId, .rssiSignature[].receiverId, .rssiSignature[].rssi] | @csv' test.jsonl
1574228579837,"30451155eda2","001bc509408201d5","001bc50940820228",-52,-85
How can I use jq to generate different rows for each entry of the rssiSignature array?

In order to reuse a value of the upper level, like the timestamp, for every item of the rssiSignature array, you can define it as a variable. You can get your csv like this:
jq -r '.timestamp as $t | .transmitterId as $tid |
.rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
' file.json
Output:
1574228579837,"30451155eda2","001bc509408201d5",-52
1574228579837,"30451155eda2","001bc50940820228",-85
Here is also a way to print a header line into the output file in bash, independent of which commands follow, using command grouping:
(
printf "timestamp,transmitterId,receiverId,rssi\n"
jq -r '.timestamp as $t | .transmitterId as $tid |
.rssiSignature[] | [ $t, $tid, .receiverId, .rssi] | @csv
' file.json
) > output.csv

Actually, the task can be accomplished without the use of any variables; one can also coax jq to include a header:
jq -r '
["timestamp","transmitterId","receiverId","rssi"],
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi])
| @csv'
A single header with multiple files
One way to produce a single header with multiple input files would be to use inputs in conjunction with the -n command-line option. This happens also to be efficient:
jq -nr '
["timestamp","transmitterId","receiverId","rssi"],
(inputs |
[.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId,.rssi]))
| @csv'
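To see the single-header behaviour end to end, here is a minimal sketch. The temporary file names and the second record are made up for illustration; only the jq program comes from the answer above.

```shell
# Hypothetical sample data: two input files with the question's structure.
tmpdir=$(mktemp -d)
cat > "$tmpdir/a.json" <<'EOF'
{"transmitterId":"30451155eda2","timestamp":1574228579837,
 "rssiSignature":[{"receiverId":"001bc509408201d5","rssi":-52},
                  {"receiverId":"001bc50940820228","rssi":-85}]}
EOF
cat > "$tmpdir/b.json" <<'EOF'
{"transmitterId":"aabbccddeeff","timestamp":1574228579900,
 "rssiSignature":[{"receiverId":"001bc509408201d5","rssi":-60}]}
EOF

# -n stops jq from reading the first input on its own, so `inputs` sees every
# record from every file and the header array is emitted exactly once.
csv_out=$(jq -nr '
  ["timestamp","transmitterId","receiverId","rssi"],
  (inputs
   | [.timestamp, .transmitterId] + (.rssiSignature[] | [.receiverId, .rssi]))
  | @csv' "$tmpdir/a.json" "$tmpdir/b.json")
printf '%s\n' "$csv_out"
rm -rf "$tmpdir"
```

Note that @csv quotes the header strings as well, so the first line comes out as "timestamp","transmitterId","receiverId","rssi" rather than the unquoted header produced by the printf approach.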

Related

Can this jq map be simplified?

Given this JSON:
{
"key": "/books/OL1000072M",
"source_records": [
"ia:daywithtroubadou00pern",
"bwb:9780822519157",
"marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267"
]
}
Can the following jq code be simplified?
jq -r '.key as $olid | .source_records | map([$olid, .])[] | @tsv'
The use of variable assignment feels like cheating and I'm wondering if it can be eliminated. The goal is to map the key value onto each of the source_records values and output a two column TSV.
Instead of mapping into an array, and then iterating over it (map(…)[]) just create an array and collect its items ([…]). Also, you can get rid of the variable binding (as) by moving the second part into its own context using parens.
jq -r '[.key] + (.source_records[] | [.]) | @tsv'
Alternatively, instead of using @tsv you could build your tab-separated output string yourself, either by concatenation (… + …) or by string interpolation ("\(…)"):
jq -r '.key + "\t" + .source_records[]'
jq -r '"\(.key)\t\(.source_records[])"'
Output:
/books/OL1000072M ia:daywithtroubadou00pern
/books/OL1000072M bwb:9780822519157
/books/OL1000072M marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267
It's not much shorter, but I think it's clearer than the original and clearer than the other shorter answers.
jq -r '.key as $olid | .source_records[] | [ $olid, . ] | @tsv'

Grouping and sorting JSON records in Bash

I'm using curl to get JSON file. My problem is that I would like to get group of 4 words in one line, then break the line, and sort it by first column.
I'm trying:
curl -L 'http://mylink/' | jq '.[]| .location, .host_name, .serial_number, .model'
I'm getting
"Office-1"
"work-1"
"11xxx111"
"hp"
"Office-2"
"work-2"
"33xxx333"
"lenovo"
"Office-1"
"work-3"
"22xxx222"
"dell"
I would like to have:
"Office-1", "work-1", "11xxx111", "hp"
"Office-1", "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
I tried jq -S ".[]| .location| group_by(.location)" and a few other combinations, such as sort_by(.location), but they don't work. I get the error: jq: error (at <stdin>:1): Cannot iterate over string ("Office-1")
Sample of my JSON file:
[
{
"location": "Office-1",
"host_name": "work-1",
"serial_number": "11xxx111",
"model": "hp"
},
{
"location": "Office-2",
"host_name": "work-2",
"serial_number": "33xxx333",
"model": "lenovo"
},
{
"location": "Office-1",
"host_name": "work-3",
"serial_number": "22xxx222",
"model": "dell"
}
]
To sort by .location only, without an external sort:
map( [ .location, .host_name, .serial_number, .model] )
| sort_by(.[0])[]
| map("\"\(.)\"") | join(", ")
The ", " is per the stated requirements.
If you want the output as CSV, simply replace the last line in the jq program above by @csv.
If minimizing keystrokes is a goal, then if you are certain that the keys are always in the desired order, you could get away with replacing the first line by map( [ .[] ] )
You can ask jq to produce arbitrary formatted strings.
curl -L 'http://mylink/' |
jq -r '.[]| "\"\(.location)\", \"\(.host_name)\", \"\(.serial_number)\", \"\(.model)\""' |
sort
Inside the double quotes, \" produces a literal double quote, and \(.field) interpolates the value of that field. The -r option is required to print the raw string instead of a JSON-encoded one.
This will get you the output you wanted:
jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""'
like so:
$ jq -r 'group_by(.location) | .[] | .[] | map(values) | "\"" + join ("\", \"") + "\""' /tmp/so7713.json
"Office-1", "work-1", "11xxx111", "hp"
"Office-1", "work-3", "22xxx222", "dell"
"Office-2", "work-2", "33xxx333", "lenovo"
If you want it all as one string, it's a bit simpler:
$ jq 'group_by(.location) | .[] | .[] | map(values) | join (", ")' /tmp/so7713.json
"Office-1, work-1, 11xxx111, hp"
"Office-1, work-3, 22xxx222, dell"
"Office-2, work-2, 33xxx333, lenovo"
Note the lack of -r in the second example.
I feel there has to be a better way of doing .[] | .[], but I don't know what it is (yet).
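On that last point: `.[] | .[]` can be abbreviated to `.[][]`, and because the groups are flattened again immediately, `sort_by` on its own should give the same ordering. A small sketch with the sample data inlined (the variable names are only for illustration, and the columns are listed explicitly rather than relying on key order):

```shell
hosts='[
  {"location":"Office-1","host_name":"work-1","serial_number":"11xxx111","model":"hp"},
  {"location":"Office-2","host_name":"work-2","serial_number":"33xxx333","model":"lenovo"},
  {"location":"Office-1","host_name":"work-3","serial_number":"22xxx222","model":"dell"}
]'

# Shared tail of both programs: pick the columns, re-quote each value with
# tojson, then join with ", " as the question asks.
row='[.location, .host_name, .serial_number, .model] | map(tojson) | join(", ")'

# Variant 1: group, then flatten both levels with .[][]
grouped_rows=$(printf '%s' "$hosts" | jq -r "group_by(.location)[][] | $row")

# Variant 2: skip the grouping and just sort
sorted_rows=$(printf '%s' "$hosts" | jq -r "sort_by(.location)[] | $row")

printf '%s\n' "$sorted_rows"
```

Both variants print the three rows in the order the question asks for; the `sort_by` form simply avoids building groups that are thrown away one step later.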

How to print path and key values of JSON file using JQ

I would like to print each path and value of a json file with included key values line by line. I would like the output to be comma delimited or at least very easy to cut and sort using Linux command line tools. Given the following json and jq, I have been given jq code which seems to do this for the test JSON, but I am not sure it works in all cases or is the proper approach.
Is there a function in jq which does this automatically? If not, is there a "most concise best way" to do it?
My wish would be something like:
$ cat short.json | jq -doit '.'
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Test JSON:
$ cat short.json | jq '.'
{
"Reservations": [
{
"Groups": [],
"Instances": [
{
"ImageId": "ami-a",
"InstanceId": "i-a",
"InstanceType": "t2.micro",
"KeyName": "ubuntu"
}
]
}
]
}
Code Recommended:
https://unix.stackexchange.com/questions/561460/how-to-print-path-and-key-values-of-json-file
Supporting:
https://unix.stackexchange.com/questions/515573/convert-json-file-to-a-key-path-with-the-resulting-value-at-the-end-of-each-k
JQ Code Too long and complicated!
jq -r '
paths(scalars) as $p
| [ ( [ $p[] | tostring ] | join(".") )
, ( getpath($p) | tojson )
]
| join(": ")
' short.json
Result:
Reservations.0.Instances.0.ImageId: "ami-a"
Reservations.0.Instances.0.InstanceId: "i-a"
Reservations.0.Instances.0.InstanceType: "t2.micro"
Reservations.0.Instances.0.KeyName: "ubuntu"
A simple jq query to achieve the requested format:
paths(scalars) as $p
| $p + [getpath($p)]
| join(",")
If your jq is ancient and you cannot upgrade, insert | map(tostring) before the last line above.
Output with the -r option
Reservations,0,Instances,0,ImageId,ami-a
Reservations,0,Instances,0,InstanceId,i-a
Reservations,0,Instances,0,InstanceType,t2.micro
Reservations,0,Instances,0,KeyName,ubuntu
Caveat
If a key or atomic value contains "," then of course using a comma may be inadvisable. For this reason, it might be preferable to use a separator such as TAB, which @tsv escapes as \t whenever it occurs inside a key or atomic value. Consider therefore using @tsv:
paths(scalars) as $p
| $p + [getpath($p)]
| @tsv
(The comment above about ancient versions of jq applies here too.)
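To make the caveat concrete, here is a minimal sketch using a made-up record (not part of the question's data) whose value contains a comma. With TAB as the separator, the value still arrives as a single field:

```shell
# Hypothetical record whose value contains a comma.
doc='{"Tags":[{"Key":"Name","Value":"web, primary"}]}'

tsv_out=$(printf '%s' "$doc" |
  jq -r 'paths(scalars) as $p | $p + [getpath($p)] | @tsv')
printf '%s\n' "$tsv_out"

# Count the fields on each line, splitting on TAB only.
field_counts=$(printf '%s\n' "$tsv_out" | awk -F '\t' '{ print NF }')
printf '%s\n' "$field_counts"
```

Each line has four tab-separated fields (three path components plus the value), whereas splitting the comma-joined form of the second line on "," would yield five.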
Read it as a stream.
$ jq --stream -r 'select(.[1]|scalars!=null) | "\(.[0]|join(".")): \(.[1]|tojson)"' short.json
Use -c paths as follows:
cat short.json | jq -c paths | tr -d '[' | tr -d ']'
I am using jq-1.5-1-a5b5cbe

Quoted string in CSV input become double-escaped

I'm trying to use JQ to process CSV like this which has no column headings:
cat "input.csv"
"12345678901234567890","2019-03-19",12
Is there more elegant and readable way to remove escaped quotes for the first and second fields--and overall, to build a stream of objects given such input?
Ideally I would like to have a reusable script which builds JSON from an arbitrary CSV, given a file and a list of its fields passed as a command-line argument.
Current JQ script and output:
cat "input.csv" |
jq \
--raw-input '
. |
split("\n") |
map( split(",")) |
.[0] |
{
ID: (.[0] | fromjson),
date: (.[1] | fromjson),
count: (.[2] | tonumber)
}'
{
"ID": "12345678901234567890",
"date": "2019-03-19",
"count": 12
}
Output of the same script without | fromjson, which leaves the quotes escaped inside the strings (this is what I would like to avoid):
{
"ID": "\"12345678901234567890\"",
"date": "\"2019-03-19\"",
"count": 12
}
Your invocation of jq can be simplified to:
jq -R '
split(",")
| map(fromjson)
| {ID: .[0], date: .[1], count: .[2] }'
Generic solution
jq -R --argjson header '["ID", "date", "count"]' '
split(",")
| map(fromjson)
| [ $header, . ]
| transpose
| reduce .[] as $kv ({}; .[$kv[0]] =$kv[1]) '
If you want to specify the headers in a file, use the --argfile command-line option instead.
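As a variation on the generic solution, the field list can also be passed as one plain comma-separated string, which is closer to the "list of fields as a command-line argument" the question asks for. This is only a sketch: the `fields` variable name is an arbitrary choice, and it assumes no CSV field contains an embedded comma, since split(",") is not a real CSV parser.

```shell
# The sample line from the question is fed in directly.
json_out=$(printf '%s\n' '"12345678901234567890","2019-03-19",12' |
  jq -R -c --arg fields 'ID,date,count' '
    ($fields | split(",")) as $header   # ["ID","date","count"]
    | split(",") | map(fromjson)        # parse each raw field as JSON
    | [$header, .] | transpose          # pair each name with its value
    | map({(.[0]): .[1]}) | add         # merge the pairs into one object
  ')
printf '%s\n' "$json_out"
```

Because each field goes through fromjson, quoted fields become JSON strings and the bare 12 becomes a number, so no tonumber call is needed.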

How to map an object to arrays so it can be converted to csv?

I'm trying to convert an object that looks like this:
{
"123" : "abc",
"231" : "dbh",
"452" : "xyz"
}
To csv that looks like this:
"123","abc"
"231","dbh"
"452","xyz"
I would prefer to use the command line tool jq but can't seem to figure out how to do the assignment. I managed to get the keys with jq '. | keys' test.json but couldn't figure out what to do next.
The problem is that you can't convert a key/value object like this straight into CSV with @csv. It needs to be an array, so we have to convert it to an array first. If the keys were fixed names it would be simple, but they're dynamic, so it's not so easy.
Try this filter:
to_entries[] | [.key, .value]
to_entries converts an object to an array of key/value objects. [] breaks that array up into its individual items.
Then each item is converted to an array containing its key and value.
This produces the following output:
[
"123",
"abc"
]
[
"231",
"dbh"
]
[
"452",
"xyz"
]
Then you can use the @csv filter to convert the rows to CSV rows.
$ echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries[] | [.key, .value] | @csv'
"123","abc"
"231","dbh"
"452","xyz"
Jeff's answer is a good starting point. Here is something closer to what you expect:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))'
[
"123,abc",
"231,dbh",
"452,xyz"
]
But I did not find a way to join using newlines; without -r, the \n stays escaped inside the JSON string:
cat input.json | jq 'to_entries | map([.key, .value]|join(","))|join("\n")'
"123,abc\n231,dbh\n452,xyz"
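The escaped \n above is a consequence of jq printing a JSON-encoded string. With the -r (raw output) option the joined string is written as-is, so the newlines become real line breaks. A minimal sketch, with the input inlined and @csv used per row so the quoting matches the requested output:

```shell
csv_rows=$(printf '%s' '{"123":"abc","231":"dbh","452":"xyz"}' |
  jq -r 'to_entries | map([.key, .value] | @csv) | join("\n")')
printf '%s\n' "$csv_rows"
```

This prints one CSV row per key/value pair, the same result as streaming the entries with to_entries[].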
Here's an example I ended up using this morning (processing PagerDuty alerts):
cat /tmp/summary.json | jq -r '
.incidents
| map({desc: .trigger_summary_data.description, id:.id})
| group_by(.desc)
| map(length as $len
| {desc:.[0].desc, length: $len})
| sort_by(.length)
| map([.desc, .length] | @csv)
| join("\n") '
This dumps a CSV document that looks something like:
"[Triggered] Something annoyingly frequent",31
"[Triggered] Even more frequent alert!",35
"[No data] Stats Server is probably acting up",55
Try this; it gives the output you want:
echo '{"123":"abc","231":"dbh","452":"xyz"}' | jq -r 'to_entries | .[] | "\"" + .key + "\",\"" + (.value | tostring)+ "\""'
onecol2txt () {
awk 'BEGIN { RS="_end_"; FS="\n"}
{ for (i=2; i <= NF; i++){
printf "%s ",$i
}
printf "\n"
}'
}
cat jsonfile | jq -r -c '....,"_end_"' | onecol2txt