Using jq to parse JSON keys and values to CSV

I am a newbie to jq and am very excited to use it. Whatever I am trying to achieve is possible with Python, but the intention is to learn jq. I am trying to process JSON output from a curl command.
Below is the response of my curl command:
{
  "results": [
    {
      "name": "smith Jones",
      "DOB": "1992-03-26",
      "Enrollmentdate": "2013-08-24"
    },
    {
      "name": "Jacob Mathew",
      "DOB": "1993-03-26",
      "Enrollmentdate": "2014-10-02"
    },
    {
      "name": "Anita Rodrigues",
      "DOB": "1994-03-26",
      "Enrollmentdate": "2015-02-19"
    }
  ]
}
I was able to get the desired output to some extent, but I am unable to print the key itself in the output. I need this information to use later as the column headers when I export this CSV file (file.csv) into Excel. I am planning to write a bash script to handle the CSV-to-Excel conversion.
<curl-command> | jq '.results | map(.name), map(.DOB), map(.Enrollmentdate) | @csv' > file.csv
I was able to get the output below:
smith Jones, Jacob Mathew, Anita Rodrigues
1992-03-26, 1993-03-26, 1994-03-26
2013-08-24, 2014-10-02, 2015-02-19
What I am trying to achieve is as below:
name:smith Jones, name:Jacob Mathew, name:Anita Rodrigues
DOB:1992-03-26, DOB:1993-03-26, DOB:1994-03-26
Enrollmentdate:2013-08-24, Enrollmentdate:2014-10-02, Enrollmentdate:2015-02-19

Since you want the key names as well as their values, then adapting your approach, you could use the following, in conjunction with the -r command-line option, to produce CSV:
.results
| map(to_entries[] | select(.key=="name")),
map(to_entries[] | select(.key=="DOB")),
map(to_entries[] | select(.key=="Enrollmentdate"))
| map("\(.key):\(.value)" )
| @csv
If you want CSV, then stick with the above; if you are confident that quoting the strings is never necessary, change @csv to join(", "); if you want to remove the quotation marks only when they are not necessary, you could add a def for a simple filter to do just that.
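For illustration, here is a minimal sketch of such a def (the name csvcell is just an example), which quotes a cell only when it contains a comma, a double quote, or a newline:

def csvcell:
  tostring
  | if test("[\",\n]")                      # quote only when the cell needs it
    then "\"" + gsub("\""; "\"\"") + "\""   # CSV-style: double any embedded quotes
    else .
    end;

With that in place, you would replace @csv in the filter above with map(csvcell) | join(",").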
The repetition of to_entries in the above is a bit of an eyesore. You might want to think about how to avoid it.
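For example, one possibility (a sketch) is to bind the three key names as a stream, so a single pipeline handles each key in turn:

.results
| ("name", "DOB", "Enrollmentdate") as $k
| map(to_entries[] | select(.key == $k))
| map("\(.key):\(.value)")
| @csv

Because $k ranges over the three names, this emits one CSV row per key, just as the original does.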

Related

Generate CSV files from a JSON document

Unfortunately I am having considerable difficulty generating three CSV files from one JSON document. Maybe someone has a good hint on how I could do this. Thanks.
Here is the output. Within dropped1 and dropped2 there can be several different addresses.
{
  "result": {
    "found": 0,
    "dropped1": {
      "address10": 1140
    },
    "rates": {
      "total": {
        "1min": 3579,
        "5min": 1593,
        "15min": 5312,
        "60min": 1328
      },
      "dropped2": {
        "address20": {
          "1min": 9139,
          "5min": 8355,
          "15min": 2785,
          "60min": 8196
        }
      }
    },
    "connections": 1
  },
  "id": "whatever",
  "jsonrpc": "2.0"
}
The three CSV files should look like this:
address10,1140
total,3579,1593,5312,1328
address20,9139,8355,2785,8196
If you decide to use jq, then unless there is some specific reason not to, I'd suggest invoking jq once for each of the three output files. The three invocations would then look like these:
jq -r '.result.dropped1 | [to_entries[][]] | @csv' > 1.csv
jq -r '.result.rates.total | ["total", .["1min"], .["5min"], .["15min"], .["60min"]] | @csv' > 2.csv
jq -r '.result.rates.dropped2
  | to_entries[]
  | [.key] + ( .value | [ .["1min"], .["5min"], .["15min"], .["60min"]] )
  | @csv
' > 3.csv
If you can be sure the ordering of keys within the total and address20 objects is fixed and in the correct order, then the last two invocations can be simplified.
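For instance, the simplified forms might look like this (a sketch, assuming the key order shown in the sample input is fixed):

jq -r '.result.rates.total | ["total"] + [.[]] | @csv' > 2.csv
jq -r '.result.rates.dropped2 | to_entries[] | [.key] + [.value[]] | @csv' > 3.csv

Here [.[]] collects an object's values in key order, so the explicit list of "1min" through "60min" is no longer needed.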
Did you try using this library?
https://www.npmjs.com/package/json-to-csv-stream
npm i json-to-csv-stream

Search and extract value using JQ command line processor

I have a JSON file very similar to the following:
[
  {
    "uuid": "832390ed-58ed-4338-bf97-eb42f123d9f3",
    "name": "Nacho"
  },
  {
    "uuid": "5b55ea5e-96f4-48d3-a258-75e152d8236a",
    "name": "Taco"
  },
  {
    "uuid": "a68f5249-828c-4265-9317-fc902b0d65b9",
    "name": "Burrito"
  }
]
I am trying to figure out how to use the jq command-line processor to first find the UUID that I input and, based on that, output the name of the associated item. So for example, if I input UUID a68f5249-828c-4265-9317-fc902b0d65b9, it should search the JSON file, find the matching UUID, and then return the name Burrito. I am doing this in Bash. I realize it may require some outside logic in addition to jq; I will keep thinking about it and put an update here in a bit. I know I could do it in an overly complicated way, but there is probably a really simple jq method of doing this in one or two lines. Please help me.
https://shapeshed.com/jq-json/#how-to-find-a-key-and-value
You can use select, passing the UUID in with --arg:
jq -r --arg uuid a68f5249-828c-4265-9317-fc902b0d65b9 '.[] | select(.uuid == $uuid) | .name' tst.json
With the sample input, this prints Burrito.
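Wrapped in a small bash script (a sketch; the script and file names are just examples):

#!/usr/bin/env bash
# lookup-name.sh -- print the name associated with a given UUID
# usage: ./lookup-name.sh a68f5249-828c-4265-9317-fc902b0d65b9
uuid="$1"
jq -r --arg uuid "$uuid" '.[] | select(.uuid == $uuid) | .name' tst.json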

How to convert json into csv file using jq?

This is my json file:
{
  "ClientCountry": "ca",
  "ClientASN": 812,
  "CacheResponseStatus": 404,
  "CacheResponseBytes": 130756,
  "CacheCacheStatus": "hit"
}
{
  "ClientCountry": "ua",
  "ClientASN": 206996,
  "CacheResponseStatus": 301,
  "CacheResponseBytes": 142,
  "CacheCacheStatus": "unknown"
}
{
  "ClientCountry": "ua",
  "ClientASN": 206996,
  "CacheResponseStatus": 0,
  "CacheResponseBytes": 0,
  "CacheCacheStatus": "unknown"
}
I want to convert this JSON into CSV like below:
"ClientCountry", "ClientASN","CacheResponseStatus", "CacheResponseBytes", "CacheCacheStatus"
"ca", 812, 404, 130756, "hit";
"ua", 206996, 301, 142,"unknown";
"ua", 206996, 0,0,"unknown";
Please let me know how to achieve this using jq.
I just tried the below, but it's not working:
jq 'to_entries[] | [.key, .value] | @csv'
Regards,
Palani
Since you want all the key-values, then assuming that the keys are presented in a consistent order in the input file, you can simply write:
jq -r '[.[]] | #csv' palanikumar.json
With the given input, this produces the following CSV:
"ca",812,404,130756,"hit"
"ua",206996,301,142,"unknown"
"ua",206996,0,0,"unknown"
Adding the headers and the trailing semicolons (if you really want them) is left as a (very easy) exercise.
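For completeness, here is one sketch that adds both (the file name palanikumar.json is carried over from above):

jq -nr '
  input
  | (keys_unsorted | @csv),
    (([.[]] | @csv) + ";"),
    (inputs | ([.[]] | @csv) + ";")
' palanikumar.json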
Inconsistent ordering
If the ordering of the keys varies or might vary, then the following could be used to produce suitable CSV, assuming that the ordering of the keys in the first object in the input stream should be used:
input
| . as $first
| keys_unsorted as $keys
| $keys, [$first[]], (inputs | [.[$keys[]]]) | @csv
The appropriate invocation of jq would include both the -n and -r command-line options.
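Putting it together, the invocation would look like this (the input file name is illustrative):

jq -nr '
  input
  | . as $first
  | keys_unsorted as $keys
  | $keys, [$first[]], (inputs | [.[$keys[]]])
  | @csv
' palanikumar.json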
Look at these links:
How to convert arbitrary simple JSON to CSV using jq?
http://bigdatums.net/2017/09/30/convert-json-to-csv-with-jq/
( jq -r '.myarray | @csv' )

Fix "is not valid in a csv row" for jq, by transforming array to string

I am trying to export a CSV from Neo4j with jq:
curl --header "Authorization: Basic myBase64hash=" -H accept:application/json -H content-type:application/json \
-d '{"statements":[{"statement":"MATCH path=(()<--(p:Person)-->(h:House)<--(s:Street)-->(n:Neighbourhood)) RETURN path"}]}' \
http://localhost:7474/db/data/transaction/commit \
| jq -r '(.results[0]) | .columns,.data[].row | @csv' > '/tmp/export-subset.csv'
But I'm getting this error message:
jq: error (at <stdin>:0): array ([{"email":"...) is not valid in a csv row
I think it's because I have multiple e-mail addresses.
Is it possible to place all of them in one CSV cell, separated by commas?
How can I achieve that with jq?
Edit:
This is an example of my JSON file:
{"results":[{"columns":["path"],"data":[{"row":[[{"email":"gdggdd#gmail.com"},{},{"date_found":"2011-11-29 12:51:14","last_name":"Doe","provider_id":2649,"first_name":"John"},{},{"number":"133","lon":3.21114,"lat":22.8844},{},{"street_name":"Govstreet"},{},{"hood":"Rotterdam"}]],"meta":[[{"id":71390,"type":"node","deleted":false},{"id":226866,"type":"relationship","deleted":false},{"id":63457,"type":"node","deleted":false},{"id":227100,"type":"relationship","deleted":false},{"id":65076,"type":"node","deleted":false},{"id":214799,"type":"relationship","deleted":false},{"id":63915,"type":"node","deleted":false},{"id":226552,"type":"relationship","deleted":false},{"id":71120,"type":"node","deleted":false}]]}]}],"errors":[]}
Forgive me, but I'm not familiar with Cypher syntax or how your data is actually structured; you don't provide much detail about that. But from what I can gather based on your sample output, each "row" item seems to correspond to what you return in your Cypher query.
Apparently you're returning path, which is an entire set of nodes and relationships, and not necessarily just the data you're actually interested in.
MATCH path=(()<--(p:Person)-->(h:House)<--(s:Street)-->(n:Neighbourhood))
RETURN path
You just want the email addresses, so you should probably just return the email. If I understand the syntax correctly, you could change that to this:
MATCH (i)<--(p:Person)-->(h:House)<--(s:Street)-->(n:Neighbourhood)
RETURN i.email
I believe that should result in something like this:
{
  "results": [
    {
      "columns": [ "email" ],
      "data": [
        {
          "row": [
            "gdggdd@gmail.com"
          ],
          "meta": [
            {
              "id": 71390,
              "type": "string",
              "deleted": false
            }
          ]
        }
      ]
    }
  ],
  "errors": []
}
Then it should be trivial to export that data to CSV using jq, since the rows can be converted directly:
.results[0] | .columns, .data[].row | @csv
On the other hand, I could be completely wrong on what that output would actually look like. So just working with your example, if you just want emails, you need to map the rows to just the email.
.results[0] | .columns, (.data[].row | map(.[0].email)) | @csv
In case I misinterpreted, if you were intending to output all values and not just the email, you should select just the values in your Cypher query.
MATCH (i)<--(p:Person)-->(h:House)<--(s:Street)-->(n:Neighbourhood)
RETURN i.email, p.date_found, p.last_name, p.provider_id, p.first_name,
h.number, h.lon, h.lat, s.street_name, n.hood
Then if my assumptions on the output are correct, the trivial jq query above should give you your CSV.
Since you want the keys in their original order, use keys_unsorted. This should get you on your way:
$ jq -r -c '.results[0] | .data[] | .row[]
| add
| keys_unsorted as $keys
| ($keys, [.[$keys[]]])
| @csv' input.json
(The newlines here are mainly for legibility.)
With your illustrative input, the output would be:
"email","date_found","last_name","provider_id","first_name","number","lon","lat","street_name","hood"
"gdggdd#gmail.com","2011-11-29 12:51:14","Doe",2649,"John","133",3.21114,22.8844,"Govstreet","Rotterdam"
Of course, in practice, you will probably have multiple lines of data, so in that case, you will probably want to make adjustments to ensure the headers are only printed once.
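One way to do that is to compute the headers once from the first row (a sketch, assuming every row has the same keys as the first):

jq -r '.results[0].data
  | map(.row[] | add)
  | (.[0] | keys_unsorted) as $keys
  | $keys, (.[] | [.[$keys[]]])
  | @csv' input.json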

Exclude column from jq json output

I would like to get rid of the timestamp field here using jq JSON processor.
[
  {
    "timestamp": 1448369447295,
    "group": "employees",
    "uid": "elgalu"
  },
  {
    "timestamp": 1448369447296,
    "group": "employees",
    "uid": "mike"
  },
  {
    "timestamp": 1448369786667,
    "group": "services",
    "uid": "pacts"
  }
]
Whitelisting would also work for me, i.e. selecting uid and group.
Ultimately what I would really like is a list with unique values like this:
employees,elgalu
employees,mike
services,pacts
If you just want to delete the timestamps, you can use the del() function:
jq 'del(.[].timestamp)' input.json
However, to achieve the desired output, I would not use the del() function. Since you know which fields should appear in the output, you can simply populate an array with group and uid and then use the join() function:
jq -r '.[]|[.group,.uid]|join(",")' input.json
-r stands for raw output. jq will not print quotes around the values.
Output:
employees,elgalu
employees,mike
services,pacts
For the record, an alternative would be:
$ jq -r '.[] | "\(.uid),\(.group)"' input.json
(The white-listing approach makes it easy to rearrange the order, and this variant makes it easy to modify the spacing, etc.)
The following example may be of interest to anyone who wants safe CSV (i.e. even if the values have embedded commas or newline characters):
$ jq -r '.[] | [.uid, .group] | @csv' input.json
"elgalu","employees"
"mike","employees"
"pacts","services"
Sed is your best friend - I can't think of anything simpler. I got here having the same problem as the question's author, but maybe this is a simpler answer to the same problem:
< file sed -e '/timestamp/d'
(Note that this assumes the JSON is pretty-printed with one field per line, as in the example above.)