I'm trying to extract the sids, ll, state, name and smry values from my JSON file using jq and export them to a CSV file.
JSON File (out.json):
{
"data": [
{
"meta": {
"uid": 74529,
"ll": [
-66.9333,
47.0667
],
"sids": [
"CA008102500 6"
],
"state": "NB",
"elev": 1250,
"name": "LONG LAKE"
},
"smry": [
[
"42",
"1955-02-23"
]
]
},
{
"meta": {
"uid": 74534,
"ll": [
-67.2333,
45.9667
],
"sids": [
"CA008103425 6"
],
"state": "NB",
"elev": 150.9,
"name": "NACKAWIC"
},
"smry": [
[
"40",
"1969-02-23"
]
]
},
{
"meta": {
"uid": 74549,
"ll": [
-67.4667,
47.4667
],
"sids": [
"CA008104933 6"
],
"state": "NB",
"elev": 794,
"name": "ST QUENTIN"
},
"smry": [
[
"M",
"M"
]
]
},
{
"meta": {
"uid": 74550,
"ll": [
-67.2667,
45.1833
],
"sids": [
"CA008104936 6"
],
"state": "NB",
"elev": 36.1,
"name": "ST STEPHEN"
},
"smry": [
[
"48",
"1900-02-23"
]
]
},
{
"meta": {
"uid": 74554,
"ll": [
-67.25,
47.2667
],
"sids": [
"CA008105000 6"
],
"state": "NB",
"elev": 915.4,
"name": "SISSON DAM"
},
"smry": [
[
"35",
"1955-02-23"
]
]
}
]
}
Terminal Code:
jq '.data | [ {sids, ll, state, name, smry} ]' out.json
I am getting the following errors:
assertion "cb == jq_util_input_next_input_cb" failed: file "/usr/src/ports/jq/jq-1.5-3.x86_64/src/jq-1.5/util.c", line 371, function: jq_util_input_get_position
Aborted (core dumped)
Example Expected Output:
sids, ll, state, name, smry
CA008102500, -66.9333, 47.0667, NB, LONG LAKE, 42,1955-02-23
CA008103425, -67.2333, 45.9667, NB, NACKAWIC, 35,1955-02-23
What am I doing wrong?
It's a bit more complex because you need to flatten sids, ll and smry before you can flatten the whole record. I recommend putting the filter in a jq file:
foo.jq:
.data[]|{
"sids":(.meta.sids[0]|split(" ")[0]),
"ll":(.meta.ll|map(tostring)|join(",")),
"state":.meta.state,
"name":.meta.name,
"smry":(.smry[]|join(","))
}|join(",")
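# or, for robust csv output
# (@csv needs an array, hence [.[]]; the pre-joined ll and smry strings each become one quoted field)
# } | [.[]] | @csv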
And then call:
jq -rf foo.jq out.json
Output:
CA008102500,-66.9333,47.0667,NB,LONG LAKE,42,1955-02-23
CA008103425,-67.2333,45.9667,NB,NACKAWIC,40,1969-02-23
CA008104933,-67.4667,47.4667,NB,ST QUENTIN,M,M
CA008104936,-67.2667,45.1833,NB,ST STEPHEN,48,1900-02-23
CA008105000,-67.25,47.2667,NB,SISSON DAM,35,1955-02-23
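As a cross-check of the jq filter above, here is a minimal Python sketch of the same flattening. It is not part of the original answer; the function names flatten_record and to_csv are hypothetical, and it assumes every record has exactly one sids entry and one smry pair, as in the sample.

```python
import csv
import io


def flatten_record(rec):
    """Flatten one element of the "data" array into a list of CSV fields."""
    meta = rec["meta"]
    station_id = meta["sids"][0].split(" ")[0]  # "CA008102500 6" -> "CA008102500"
    lon, lat = meta["ll"]                       # assumes exactly two coordinates
    value, date = rec["smry"][0]                # assumes exactly one summary pair
    return [station_id, lon, lat, meta["state"], meta["name"], value, date]


def to_csv(doc):
    """Render every record of the document as one CSV line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for rec in doc["data"]:
        writer.writerow(flatten_record(rec))
    return buf.getvalue()


sample = {
    "data": [
        {
            "meta": {
                "uid": 74529,
                "ll": [-66.9333, 47.0667],
                "sids": ["CA008102500 6"],
                "state": "NB",
                "elev": 1250,
                "name": "LONG LAKE",
            },
            "smry": [["42", "1955-02-23"]],
        }
    ]
}

print(to_csv(sample), end="")
```

Unlike a plain join(","), the csv module quotes any field that itself contains a comma, which is the same concern the "robust csv output" comment in foo.jq points at.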
I have a JSON file that I am trying to convert to CSV using jq, but I've been having a lot of problems. This is the JSON:
{
"nhits": 2,
"parameters": {
"dataset": "real-time-bezettingen-fietsenstallingen-gent",
"rows": 10,
"start": 0,
"facet": [
"facilityname"
],
"format": "json",
"timezone": "UTC"
},
"records": [
{
"datasetid": "real-time-bezettingen-fietsenstallingen-gent",
"recordid": "d471594688a931ba8d81f8d883874a08cee84775",
"fields": {
"id": "48-2",
"freeplaces": 71,
"facilityname": "Braunplein",
"geo_point_2d": [
51.05406845807926,
3.723722319130363
],
"time": "2022-11-10T12:18:01+00:00",
"totalplaces": 116,
"occupiedplaces": 45,
"bezetting": 38
},
"geometry": {
"type": "Point",
"coordinates": [
3.723722319130363,
51.05406845807926
]
},
"record_timestamp": "2022-11-10T12:18:04.838Z"
},
{
"datasetid": "real-time-bezettingen-fietsenstallingen-gent",
"recordid": "d0121748cf31c7e1c02d99712bdf07cb33156689",
"fields": {
"id": "48-1",
"freeplaces": 65,
"facilityname": "Korenmarkt",
"geo_point_2d": [
51.05388288288933,
3.7214177570400473
],
"time": "2022-11-10T12:18:01+00:00",
"totalplaces": 235,
"occupiedplaces": 170,
"bezetting": 72
},
"geometry": {
"type": "Point",
"coordinates": [
3.7214177570400473,
51.05388288288933
]
},
"record_timestamp": "2022-11-10T12:18:04.838Z"
}
],
"facet_groups": [
{
"name": "facilityname",
"facets": [
{
"name": "Braunplein",
"count": 1,
"state": "displayed",
"path": "Braunplein"
},
{
"name": "Korenmarkt",
"count": 1,
"state": "displayed",
"path": "Korenmarkt"
}
]
}
]
}
I only want the columns facilityname, time, totalplaces, occupiedplaces and bezetting. I tried converting using the following command:
jq -r '["naam", "tijd", "totaalAantalPlaatsen", "bezettePlaatsen", "bezetting"] , .records[] | (.fields[] | [.facilityname, .time, .totalplaces, .occupiedplaces, .bezetting]) | @csv' data.json
But I get the error:
jq: error (at data.json:0): Cannot index array with string "fields"
Does anyone know what I'm doing wrong?
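You just need some parentheses around the .records[] ... part. The pipe binds more loosely than the comma, so without parentheses the header array is piped into .fields as well, which causes the "Cannot index array" error. Also use .records[].fields rather than .fields[], so that each row is built from the fields object itself: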
jq -r '
["name", "tijd", "totaalAantalPlaatsen", "bezettePlaatsen", "bezetting"],
(.records[].fields | [.facilityname, .time, .totalplaces, .occupiedplaces, .bezetting])
| @csv
' data.json
I have a json like this
[
{
"name": "hosts",
"ipaddress": "1.2.3.4",
"status": "UP",
"randomkey": "randomvalue"
},
{
"name": "hosts",
"ipaddress": "5.6.7.8",
"status": "DOWN",
"newkey": "newvalue"
},
{
"name": "hosts",
"ipaddress": "9.10.11.12",
"status": "RESTART",
"anotherkey": "anothervalue"
}
]
I want to merge the objects and am looking for output like this
[
{
"name": "hosts", //doesn't matter if it is ["hosts"]
"ipaddress": ["1.2.3.4", "5.6.7.8", "9.10.11.12"],
"status": ["UP", "DOWN", "RESTART"],
"randomkey": ["randomvalue"],
"newkey": ["newvalue"],
"anotherkey": ["anothervalue"]
}
]
I can hardcode each and every key and do something like this - { ipaddress: (map(.ipaddress) | unique ) } + { status: (map(.status) | unique ) } + { randomkey: (map(.randomkey) | unique ) }
The important point here is that the keys are arbitrary and cannot be hardcoded.
Is there a way I can merge all the keys without hardcoding them?
Using reduce, then unique would be one way:
jq '[
reduce (.[] | to_entries[]) as {$key, $value} ({}; .[$key] += [$value])
| map_values(unique)
]'
[
{
"name": [
"hosts"
],
"ipaddress": [
"1.2.3.4",
"5.6.7.8",
"9.10.11.12"
],
"status": [
"DOWN",
"RESTART",
"UP"
],
"randomkey": [
"randomvalue"
],
"newkey": [
"newvalue"
],
"anotherkey": [
"anothervalue"
]
}
]
Using group_by and map, then unique again would be another:
jq '[
map(to_entries[]) | group_by(.key)
| map({key: first.key, value: map(.value) | unique})
| from_entries
]'
[
{
"anotherkey": [
"anothervalue"
],
"ipaddress": [
"1.2.3.4",
"5.6.7.8",
"9.10.11.12"
],
"name": [
"hosts"
],
"newkey": [
"newvalue"
],
"randomkey": [
"randomvalue"
],
"status": [
"DOWN",
"RESTART",
"UP"
]
}
]
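For comparison, the reduce-based approach can be sketched in Python as follows. This is only an illustration, not part of the original answer: merge_objects is a hypothetical name, and the sketch assumes all values are strings (hashable and mutually comparable), which holds for the sample.

```python
def merge_objects(objects):
    """Collect the values of every key across all objects, without hardcoding keys."""
    merged = {}
    for obj in objects:
        for key, value in obj.items():
            # Same idea as `.[$key] += [$value]` in the reduce version
            merged.setdefault(key, []).append(value)
    # Deduplicate and sort each list, similar to jq's `unique`
    return [{key: sorted(set(values)) for key, values in merged.items()}]


hosts = [
    {"name": "hosts", "ipaddress": "1.2.3.4", "status": "UP", "randomkey": "randomvalue"},
    {"name": "hosts", "ipaddress": "5.6.7.8", "status": "DOWN", "newkey": "newvalue"},
    {"name": "hosts", "ipaddress": "9.10.11.12", "status": "RESTART", "anotherkey": "anothervalue"},
]

print(merge_objects(hosts))
```

As with unique in jq, the values come out sorted, so status is listed as DOWN, RESTART, UP rather than in input order.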
I need to solve this with jq. I have a large list of arrays in my JSON file and need to do some sort | uniq -c type of processing on them. Specifically, I have a relatively nasty-looking fruit array whose contents need to be broken down. I'm aware of unique and the like, and imagine there is a simple way to do this. I've tried assigning things to variables, appending and so on, but I can't get the most basic part working: counting the unique values in that fruit array, and certainly not without breaking the rest of the content (hence the variable ideas). Please tell me I'm overthinking this.
I'd like to turn this:
[
{
"uid": "123abc",
"tID": [
"T19"
],
"fruit": [
"Kiwi",
"Apple",
"",
"",
"",
"Kiwi",
"",
"Kiwi",
"",
"",
"Mango",
"Kiwi"
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"fruit": [
"",
"Orange"
]
}
]
into this:
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Kiwi - 3"
},
{
"name": "fruit",
"value": "Mango - 1"
},
{
"name": "fruit",
"value": "Apple - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]
Using group_by and length would be one way:
jq '
map(with_entries(select(.key == "fruit") |= (
.value |= (group_by(.) | map(
{name: "fruit", value: "\(.[0] | select(. != "")) - \(length)"}
))
| .key = "metadata"
)))
'
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Apple - 1"
},
{
"name": "fruit",
"value": "Kiwi - 4"
},
{
"name": "fruit",
"value": "Mango - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]
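The counting step is the sort | uniq -c idea from the question. A rough Python equivalent, offered only as a sketch (summarize_fruit is a hypothetical name, and it assumes every record has a fruit array), uses collections.Counter:

```python
from collections import Counter


def summarize_fruit(records):
    """Replace each record's "fruit" array with a "metadata" list of counts."""
    out = []
    for rec in records:
        # Count non-empty names only, like `select(. != "")` in the jq filter
        counts = Counter(f for f in rec["fruit"] if f != "")
        new_rec = {k: v for k, v in rec.items() if k != "fruit"}
        new_rec["metadata"] = [
            {"name": "fruit", "value": f"{fruit} - {n}"}
            for fruit, n in sorted(counts.items())  # sorted by name, as group_by does
        ]
        out.append(new_rec)
    return out


records = [
    {
        "uid": "123abc",
        "tID": ["T19"],
        "fruit": ["Kiwi", "Apple", "", "", "", "Kiwi", "", "Kiwi", "", "", "Mango", "Kiwi"],
    },
    {"uid": "456xyz", "tID": ["T15"], "fruit": ["", "Orange"]},
]

for rec in summarize_fruit(records):
    print(rec)
```

Note that the sample contains four Kiwi entries, so both this sketch and the jq filter report "Kiwi - 4", not the "Kiwi - 3" shown in the question's desired output.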
I have a file with 30,000 newline-delimited JSON records. I am using jq to process it.
Below is the schema of each line (new.json).
{
"indexed": {
"date-parts": [
[
2020,
8,
13
]
],
"date-time": "2020-08-13T06:27:26Z",
"timestamp": 1597300046660
},
"reference-count": 42,
"publisher": "American Chemical Society (ACS)",
"issue": "3",
"content-domain": {
"domain": [],
"crossmark-restriction": false
},
"short-container-title": [
"Org. Lett."
],
"published-print": {
"date-parts": [
[
2005,
2
]
]
},
"DOI": "10.1021/ol047829t",
"type": "journal-article",
"created": {
"date-parts": [
[
2005,
1,
27
]
],
"date-time": "2005-01-27T05:53:29Z",
"timestamp": 1106805209000
},
"page": "383-386",
"source": "Crossref",
"is-referenced-by-count": 38,
"title": [
"Liquid-Crystalline [60]Fullerene-TTF Dyads"
],
"prefix": "10.1021",
"volume": "7",
"author": [
{
"given": "Emmanuel",
"family": "Allard",
"affiliation": []
},
{
"given": "Frédéric",
"family": "Oswald",
"affiliation": []
},
{
"given": "Bertrand",
"family": "Donnio",
"affiliation": []
},
{
"given": "Daniel",
"family": "Guillon",
"affiliation": []
}
],
"member": "316",
"container-title": [
"Organic Letters"
],
"original-title": [],
"link": [
{
"URL": "https://pubs.acs.org/doi/pdf/10.1021/ol047829t",
"content-type": "unspecified",
"content-version": "vor",
"intended-application": "similarity-checking"
}
],
"deposited": {
"date-parts": [
[
2020,
4,
7
]
],
"date-time": "2020-04-07T13:39:55Z",
"timestamp": 1586266795000
},
"score": null,
"subtitle": [],
"short-title": [],
"issued": {
"date-parts": [
[
2005,
2
]
]
},
"references-count": 42,
"alternative-id": [
"10.1021/ol047829t"
],
"URL": "http://dx.doi.org/10.1021/ol047829t",
"relation": {},
"ISSN": [
"1523-7060",
"1523-7052"
],
"issn-type": [
{
"value": "1523-7060",
"type": "print"
},
{
"value": "1523-7052",
"type": "electronic"
}
],
"subject": [
"Physical and Theoretical Chemistry",
"Organic Chemistry",
"Biochemistry"
]
}
For every DOI, I need the values of the given and family keys, each joined into a single cell in that DOI's row, in CSV/TSV format.
The expected output for the above json is (in CSV/TSV format):
|DOI| givenName|familyName|
|10.1021/ol047829t|Emmanuel; Frédéric; Bertrand; Daniel;|Allard; Oswald; Donnio; Guillon|
I am using the command below, but it throws an error, and when I alter it I cannot get any CSV/TSV output at all.
cat new.json | jq -r "[.DOI, .publisher, .author[] | .given] | @tsv" > manage.tsv
The same logic applies to the subject key. I am using the command below to output the values of the subject key to CSV, but it returns only the first element (in this case "Physical and Theoretical Chemistry"):
cat new.json | jq -c -r "[.DOI, .publisher, .subject[0]] | @csv" > manage.csv
Any pointers to the right jq command line would be a great help.
Join given and family names by semicolons separately, then pass resulting strings as fields to the TSV filter.
["DOI", "givenName", "familyName"],
(inputs | [.DOI, (.author | map(.given), map(.family) | join("; "))])
| @tsv
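Note that you need to invoke jq with the -n and -r flags for this to work and produce valid TSV output: -n stops jq from consuming the first record itself, so that inputs sees every line, and -r emits the rows as raw text rather than as JSON strings.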
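The joining step can also be sketched in Python for one line of the newline-delimited file. This is an illustration under assumptions rather than the answer's method: authors_row is a hypothetical name, a missing author, given or family key is treated as empty, and unlike a real TSV writer the sketch does not escape tabs inside values.

```python
import json


def authors_row(line):
    """Build one TSV row (DOI, given names, family names) from one JSON line."""
    rec = json.loads(line)
    authors = rec.get("author", [])  # assumption: a missing author list means no names
    given = "; ".join(a.get("given", "") for a in authors)
    family = "; ".join(a.get("family", "") for a in authors)
    return "\t".join([rec["DOI"], given, family])


line = json.dumps(
    {
        "DOI": "10.1021/ol047829t",
        "author": [
            {"given": "Emmanuel", "family": "Allard", "affiliation": []},
            {"given": "Frédéric", "family": "Oswald", "affiliation": []},
        ],
    }
)

print("\t".join(["DOI", "givenName", "familyName"]))
print(authors_row(line))
```

The subject key from the question could be handled the same way, by joining the whole list instead of taking only its first element.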
Can somebody help me extract the following with jq:
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [
{
"metric": {
"pod": "dev-cds-5c97cf7f78-sw6b9"
},
"values": [
[
1588204800,
"0.3561394483796914"
],
[
1588215600,
"0.3607968456046861"
],
[
1588226400,
"0.3813882532417868"
],
[
1588237200,
"0.6264355815408573"
]
]
},
{
"metric": {
"pod": "uat-cds-66ccc9685-b5tvh"
},
"values": [
[
1588204800,
"0.9969746974696218"
],
[
1588215600,
"0.7400881057270005"
],
[
1588226400,
"1.2298959318837195"
],
[
1588237200,
"0.9482296838254507"
]
]
}
]
}
}
I need to obtain all the values for the pod matching the partial name dev-cds, without specifying the full name dev-cds-5c97cf7f78-sw6b9.
Result desired:
{
"metric": {
"pod": "dev-cds-5c97cf7f78-sw6b9"
},
"values": [
[
1588204800,
"0.3561394483796914"
],
[
1588215600,
"0.3607968456046861"
],
[
1588226400,
"0.3813882532417868"
],
[
1588237200,
"0.6264355815408573"
]
]
}
You should first iterate over the result array, then check whether the pod value inside the metric object contains "dev-cds".
.data.result[] | if .metric.pod | contains("dev-cds") then . else empty end
https://jqplay.org/s/54OH83qHKP
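The same substring filter, sketched in Python for comparison. The function name select_pods is hypothetical, and the sample response is trimmed to one value pair per pod:

```python
def select_pods(response, needle):
    """Return the result entries whose pod name contains the given substring."""
    return [
        entry
        for entry in response["data"]["result"]
        if needle in entry["metric"]["pod"]
    ]


response = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"pod": "dev-cds-5c97cf7f78-sw6b9"},
                "values": [[1588204800, "0.3561394483796914"]],
            },
            {
                "metric": {"pod": "uat-cds-66ccc9685-b5tvh"},
                "values": [[1588204800, "0.9969746974696218"]],
            },
        ],
    },
}

for entry in select_pods(response, "dev-cds"):
    print(entry["metric"]["pod"], entry["values"])
```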