Convert Array of JSON Objects to CSV With Headers - json

I have a JSON payload that I'm looking to convert to CSV that looks like the below:
[
{
"endpoint": "APPLE",
"date": "2022-11-02 12:00",
"upUsage": 0,
"downUsage": 18000,
"upAvgRate": 0,
"downAvgRate": 600,
"upMaxRate": 0,
"downMaxRate": 800
},
{
"endpoint": "BANANA",
"date": "2022-11-02 12:00",
"upUsage": 0,
"downUsage": 17600,
"upAvgRate": 0,
"downAvgRate": 587,
"upMaxRate": 0,
"downMaxRate": 693
},
{
"endpoint": "CARROT",
"date": "2022-11-02 12:00",
"upUsage": 0,
"downUsage": 8000,
"upAvgRate": 0,
"downAvgRate": 533,
"upMaxRate": 0,
"downMaxRate": 533
}
]
I am trying to convert this to a standard CSV file with the appropriate headers via jq, but having difficulties in doing so. Below is my desired output:
"endpoint","date","upUsage","downUsage","upAvgRate","downAvgRate","upMaxRate","downMaxRate"
"APPLE","2022-11-02 12:00",0,18000,0,600,0,800
"BANANA","2022-11-02 12:00",0,17600,0,587,0,693
"CARROT","2022-11-02 12:00",0,8000,0,533,0,533
I've been able to use the below jq to get close to this output, but my headers are not being included:
cat testJson.json | jq -r '.[] | join(",")'
Note: some of my JSON objects may not contain every key, so the output needs to account for this by leaving an empty (null) value between the commas, keeping the number of columns consistent.

Assuming the keys are consistently ordered within the objects:
jq -r '((first | keys_unsorted), (.[] | to_entries | map(.value))) | @csv' file.json
"endpoint","date","upUsage","downUsage","upAvgRate","downAvgRate","upMaxRate","downMaxRate"
"APPLE","2022-11-02 12:00",0,18000,0,600,0,800
"BANANA","2022-11-02 12:00",0,17600,0,587,0,693
"CARROT","2022-11-02 12:00",0,8000,0,533,0,533

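The filter above relies on every object having the same keys in the same order, which the note in the question says is not always the case. A minimal sketch of one way around that, assuming the first object carries the full set of columns: take the column names from it once, then look each column up in every object. A missing key evaluates to null, which @csv prints as an empty field. The function name to_csv is only an illustrative helper.

```shell
# Sketch: column list comes from the first object (assumption: it is complete).
# Missing keys become null, and @csv renders null as an empty field.
to_csv() {
  jq -r '(first | keys_unsorted) as $cols
         | $cols, (.[] | [.[$cols[]]])
         | @csv'
}

to_csv <<'EOF'
[
  {"endpoint": "APPLE", "date": "2022-11-02 12:00", "upUsage": 0},
  {"endpoint": "BANANA", "upUsage": 5}
]
EOF
# "endpoint","date","upUsage"
# "APPLE","2022-11-02 12:00",0
# "BANANA",,5
```

If the first object can itself be incomplete, the column list would have to be collected from all objects instead.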
Related

Kubernetes Popeye report JSON to CSV with jq

I need to reformat the Popeye Kubernetes report in a spreadsheet.
I used jq but it's a bit tricky.
{
"popeye": {
"score": 90,
"grade": "A",
"sanitizers": [
{
"sanitizer": "cluster",
"tally": {
"ok": 1,
"info": 0,
"warning": 0,
"error": 0,
"score": 100
},
"issues": {
"Version": [
{
"group": "__root__",
"level": 0,
"message": "[POP-406] K8s version OK"
}
]
}
}
]
}
}
The best format to export to csv would be something like :
{
"sanitizer" : "cluster",
"kube-object" : "Version",
"group": "__root__",
"level": 0,
"message": "[POP-406] K8s version OK"
}
I tried a lot of jq commands without success.
Any ideas?
Thanks.
You are asking for a CSV export but you are showing an object as desired format. So, I interpreted the object's fields as CSV columns:
["sanitizer", "kube-object", "group", "level", "message"],
(.popeye.sanitizers[] | [.sanitizer] + (
.issues | to_entries[] | [.key, (.value[] | .group, .level, .message)])
)
| @csv
"sanitizer","kube-object","group","level","message"
"cluster","Version","__root__",0,"[POP-406] K8s version OK"
Use jq's --raw-output or -r parameter to get proper CSV formatting. Also, remove the first line if you don't need headers.
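The sample report has a single issue per kube-object. If a key under .issues can hold several entries, the filter above would put all of them into one row. A hedged variant that emits one CSV row per issue is sketched below; popeye_csv is a hypothetical helper name, and the `// {}` fallback for sanitizers without an issues object is an assumption about the report, not something the sample shows.

```shell
# Sketch: one CSV row per issue, with the sanitizer and kube-object repeated.
popeye_csv() {
  jq -r '["sanitizer", "kube-object", "group", "level", "message"],
         (.popeye.sanitizers[]
          | .sanitizer as $s
          | (.issues // {}) | to_entries[]
          | .key as $k
          | .value[]
          | [$s, $k, .group, .level, .message])
         | @csv'
}

popeye_csv <<'EOF'
{"popeye": {"score": 90, "grade": "A", "sanitizers": [
  {"sanitizer": "cluster",
   "tally": {"ok": 1, "info": 0, "warning": 0, "error": 0, "score": 100},
   "issues": {"Version": [
     {"group": "__root__", "level": 0, "message": "[POP-406] K8s version OK"}
   ]}}
]}}
EOF
# "sanitizer","kube-object","group","level","message"
# "cluster","Version","__root__",0,"[POP-406] K8s version OK"
```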
One option is to use map() together with the + operator to produce JSON in the format presented in the question:
jq -r '.[].sanitizers | map({sanitizer}+{"kube-object" : "Version"}+.issues.Version[])[]'
where
{"kube-object" : "Version"}
is added as a literal key-value pair, because no such key exists in the source JSON
If your aim is to generate comma-separated key-value pairs line by line, then consider using
jq -r '.[].sanitizers | map({sanitizer}+{"kube-object" : "Version"}+.issues.Version[])[] | to_entries[] | "\(.key), \(.value)"'

JQ: key selection from numeric objects

I use jq 1.6 in a Windows 10 PowerShell environment and am trying to select keys from JSON objects whose names happen to be numeric.
JSON example:
{
"alliances_info":{
"744085325458334213":{
"emblem":3,
"name":"wellwell",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"MELL",
"slogan":"",
"id":744085325458334213
},
"744128593839677958":{
"emblem":0,
"name":"Brave",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"GABA",
"slogan":"",
"id":744128593839677958
},
"746034084459209223":{
"emblem":0,
"name":"Queen",
"member_count":1,
"level":1,
"military_might":1035,
"public":false,
"tag":"QUE",
"slogan":"",
"id":746034084459209223
},
"750446471312466445":{
"emblem":0,
"name":"Phoenix Inc",
"member_count":35,
"level":6,
"military_might":453369,
"public":true,
"tag":"PHOI",
"slogan":"",
"id":750446471312466445
},
"750446518934594062":{
"emblem":11,
"name":"Australia",
"member_count":44,
"level":8,
"military_might":957211,
"public":true,
"tag":"AUST",
"slogan":"Go Australia",
"id":750446518934594062
}
},
"server_version":"v7.190.4-master.000000006"
}
I tried several jq commands:
.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
or
.alliances_info | .. | objects | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]
But I always get a jq error: parse error: Invalid numeric literal at line 1, column 3
If I give up the object construction in the first command (and build only an array), it works. But I need those objects. Any tips?
BR
Timo
Your first query works perfectly well with the given JSON sample. Perhaps you're invoking jq incorrectly. If you have the jq program in a file, say select.jq, you'd invoke jq like so:
jq -f select.jq sample.json
If that doesn't help, then try:
jq empty sample.json
If that fails, there might be something wrong with the encoding of the JSON.
I'm not sure I understand what you want.
Your first attempt works for me, but generates one output for each JSON value in the input. That is, I created a file named so.json and put in it your JSON from above:
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
⋮
}
When I run your program, I get:
$ jq '.alliances_info | .[] | [{alliance_name: .name, alliance_count: .member_count, alliance_level: .level, alliance_power: .military_might, alliance_tag: .tag, alliance_slogan: .slogan, alliance_id: .id}]' so.json
[
{
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"alliance_tag": "MELL",
"alliance_slogan": "",
"alliance_id": 744085325458334200
}
]
[
{
"alliance_name": "Brave",
⋮
]
If you want an array at all, you probably want one array containing all the alliances like this:
$ jq '.alliances_info | [ .[] | { alliance_name: .name, alliance_id: .id } ]' so.json
[
{
"alliance_name": "wellwell",
"alliance_id": 744085325458334200
},
{
"alliance_name": "Brave",
"alliance_id": 744128593839678000
},
{
"alliance_name": "Queen",
"alliance_id": 746034084459209200
},
{
"alliance_name": "Phoenix Inc",
"alliance_id": 750446471312466400
},
{
"alliance_name": "Australia",
"alliance_id": 750446518934594000
}
]
Starting from the left,
- .alliances_info looks in its input object for the field named "alliances_info" and outputs its value
- the | next says to take the outputs of the left-hand side and pass them as inputs to the right-hand side.
- right after that first |, I have a [ «jq expressions» ] which tells jq to create one JSON array output for each input; the elements of that array are the outputs of that inner «jq expressions»
- that inner expression starts with .[] which means to produce one output for each JSON value (ignoring the keys) in the input object. For us, that will be the objects named "744085325458334213", "744128593839677958", …
- The next | uses those objects as input and for each, generates a JSON object { alliance_name: .name, alliance_id: .id }
That's why I end up with one JSON array containing 5 JSON objects.
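Note that the ids in the output above are rounded (744085325458334213 becomes 744085325458334200), because jq 1.6 holds numbers as double-precision floats. If the exact id matters, one option is to take it from the object key, which is a string and is therefore not rounded. A small sketch under that assumption (it relies on each key matching the id field, as it does in the sample; alliances is a hypothetical helper name):

```shell
# Sketch: use the object key (a string) as the id, so no digits are lost.
alliances() {
  jq -c '.alliances_info
         | [ to_entries[]
             | { alliance_name: .value.name, alliance_id: .key } ]'
}

alliances <<'EOF'
{"alliances_info": {
  "744085325458334213": {"name": "wellwell", "id": 744085325458334213},
  "744128593839677958": {"name": "Brave", "id": 744128593839677958}
}}
EOF
# [{"alliance_name":"wellwell","alliance_id":"744085325458334213"},{"alliance_name":"Brave","alliance_id":"744128593839677958"}]
```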
As far as I can tell, you are mostly just renaming a bunch of the fields. For that, you could just do something like this:
$ jq --argjson renameMap '{ "name": "alliance_name", "member_count": "alliance_count", "level": "alliance_level", "military_might": "alliance_power", "tag": "alliance_tag", "slogan": "alliance_slogan"}' '.alliances_info |= ( . | [ to_entries[] | ( .value |= ( . | [ to_entries[] | ( .key |= ( if $renameMap[.] then $renameMap[.] else . end ) ) ] | from_entries ) ) ] | from_entries )' so.json
{
"alliances_info": {
"744085325458334213": {
"emblem": 3,
"alliance_name": "wellwell",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "MELL",
"alliance_slogan": "",
"id": 744085325458334200
},
"744128593839677958": {
"emblem": 0,
"alliance_name": "Brave",
"alliance_count": 1,
"alliance_level": 1,
"alliance_power": 1035,
"public": false,
"alliance_tag": "GABA",
"alliance_slogan": "",
"id": 744128593839678000
},
⋮
},
"server_version": "v7.190.4-master.000000006"
}
Well, I am an idiot (to be totally clear here). I found the reason (and this is normally a no-brainer...). I read the input from a file, and the funny thing is that the file is Unicode but not UTF-8. After re-encoding it, the command works fine. Thanks for the help.
BR
Timo
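For anyone hitting the same parse error, here is a rough sketch of the re-encoding step. It assumes the file is UTF-16 (which is what Windows tools often mean by "Unicode"; the post does not say which encoding it was) and that a Unix-like shell with iconv is available. jq_utf16 is a hypothetical helper name.

```shell
# Sketch: convert a UTF-16 file to UTF-8 on the fly and pass it to jq.
# First argument is the file; the remaining arguments go to jq unchanged.
jq_utf16() {
  file=$1
  shift
  iconv -f UTF-16 -t UTF-8 "$file" | jq "$@"
}

# Example (sample.json is assumed to be UTF-16 encoded):
#   jq_utf16 sample.json '.alliances_info | keys'
```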

Consuming json values from a kafka topic and writing them and formatting them in a csv file using JQ

I am trying to write keys and values from a Kafka topic into a CSV file. I have been able to select the keys and values that I want, but I am not able to get them separated into rows (three values per row, separated by commas).
This is an example of two json records that I consumed from my kafka topic without doing any filtering. The command that I used is:
./kafka-run-class.sh kafka.tools.ConsoleConsumer --bootstrap-server kafka1.example.net:9092 --topic prod.example.v1 --max-messages 2 | jq -r '. '
{
"count": "0",
"source": 3,
"lastModified": "2018-03-09T21:03:54.039Z",
"isBusiness": false,
"countryCode": " MX",
"phone": "52/4446789864"
}
{
"count": "0",
"source": 3,
"lastModified": "2018-03-09T21:03:54.039Z",
"isBusiness": false,
"countryCode": " GB",
"phone": "44/0187567846"
}
I tried using this command, but each value is being put into its own row:
./kafka-run-class.sh kafka.tools.ConsoleConsumer --bootstrap-server kafka1.example.net:9092 --topic prod.example.v1 --max-messages 3 | jq -r ' .isBusiness, .countryCode, .phone ' > file.csv
Ideal output would be:
false, MX, 52/4446789864
false, GB, 44/0187567846
true, BE, 32/8745687645
jq -r '[.isBusiness, .countryCode, .phone] | @csv'
produces CSV:
false," MX","52/4446789864"
false," GB","44/0187567846"
The filter:
"\(.isBusiness), \(.countryCode), \(.phone)"
produces
false, MX, 52/4446789864
false, GB, 44/0187567846
You might want to "trim" the string values, e.g. using:
def trim: sub("^ +";"") | sub(" +$";"");
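As a sketch of how the trim definition could be combined with the CSV filter (to_rows is a hypothetical helper name; the input is a stream of objects like the ones consumed from the topic):

```shell
# Sketch: strip leading/trailing spaces from countryCode before writing CSV.
to_rows() {
  jq -r 'def trim: sub("^ +"; "") | sub(" +$"; "");
         [.isBusiness, (.countryCode | trim), .phone] | @csv'
}

to_rows <<'EOF'
{"count": "0", "source": 3, "isBusiness": false, "countryCode": " MX", "phone": "52/4446789864"}
{"count": "0", "source": 3, "isBusiness": false, "countryCode": " GB", "phone": "44/0187567846"}
EOF
# false,"MX","52/4446789864"
# false,"GB","44/0187567846"
```

In the pipeline from the question, to_rows would simply take the place of the jq call after the console consumer.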

Convert complex JSON (with arrays and different data types) to CSV using JQ?

I have the following JSON data:
{
"status": "ok",
"ok": true,
"data": "MFR-L",
"stores": [{
"name": "KOLL",
"lat": 52.93128,
"lng": 6.962956,
"dist": 1,
"x10": 1.129,
"isOpen": true
},
{
"name": "Takst",
"lat": 52.9523773,
"lng": 6.981644,
"dist": 1.3,
"x10": 1.809,
"isOpen": false
}]
}
I'm trying to convert it to a flat file using jq, but I keep running into all sorts of problems, especially because of the data types ("cannot index boolean with string", etc).
This post has helped me flatten the contents of the array so far, like this:
jq -r -s 'map(.stores | map({nm: .name, lt: .lat} | [.nm, .lt])) | add[] | @csv'
How can I get the contents higher up in the hierarchy to map to the array contents?
You could always collect the values you want from the parent objects separately from the child objects and combine them later.
e.g.,
$ jq -r '[.data] + (.stores[] | [.name, .lat, .lng, .dist]) | @csv' input.json
yields
"MFR-L","KOLL",52.93128,6.962956,1
"MFR-L","Takst",52.9523773,6.981644,1.3
There are several ways in which the illustrative JSON might be "flattened" (e.g. to CSV), but the following two approaches may be of interest. (I've omitted the invocation of @csv for ease of reading.)
$ jq '[.data, .stores[][]]' in.json
[
"MFR-L",
"KOLL",
52.93128,
6.962956,
1,
1.129,
true,
"Takst",
52.9523773,
6.981644,
1.3,
1.809,
false
]
$ jq '.data as $data | .stores[] | [$data, .[]]' in.json
[
"MFR-L",
"KOLL",
52.93128,
6.962956,
1,
1.129,
true
]
[
"MFR-L",
"Takst",
52.9523773,
6.981644,
1.3,
1.809,
false
]
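With @csv put back in, the second form could look like the sketch below. It lists the columns explicitly rather than using .[] so that the header row and the values are guaranteed to line up; the header names and the helper name stores_csv are my own choice, not part of the question.

```shell
# Sketch: header row plus one row per store, with the parent .data repeated.
stores_csv() {
  jq -r '["data", "name", "lat", "lng", "dist", "x10", "isOpen"],
         (.data as $data
          | .stores[]
          | [$data, .name, .lat, .lng, .dist, .x10, .isOpen])
         | @csv'
}

stores_csv <<'EOF'
{"status": "ok", "ok": true, "data": "MFR-L", "stores": [
  {"name": "KOLL", "lat": 52.93128, "lng": 6.962956, "dist": 1, "x10": 1.129, "isOpen": true},
  {"name": "Takst", "lat": 52.9523773, "lng": 6.981644, "dist": 1.3, "x10": 1.809, "isOpen": false}
]}
EOF
# "data","name","lat","lng","dist","x10","isOpen"
# "MFR-L","KOLL",52.93128,6.962956,1,1.129,true
# "MFR-L","Takst",52.9523773,6.981644,1.3,1.809,false
```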
Here is another approach which uses jq variables and string interpolation:
.data as $data
| .stores[]
| "\($data),\(.name),\(.lat),\(.lng),\(.dist),\(.x10),\(.isOpen)"
output with sample data:
"MFR-L,KOLL,52.93128,6.962956,1,1.129,true"
"MFR-L,Takst,52.9523773,6.981644,1.3,1.809,false"

How do I sum the values in an array of maps in jq?

Given a JSON stream of the following form:
{ "a": 10, "b": 11 } { "a": 20, "b": 21 } { "a": 30, "b": 31 }
I would like to sum the values in each of the objects and output a single object, namely:
{ "a": 60, "b": 63 }
I'm guessing this will probably require flattening the above list of objects into an array of [name, value] pairs and then summing the values using reduce, but the documentation of the syntax for using reduce is woeful.
Unless your jq has inputs, you will have to slurp the objects up using the -s flag. Then you'll have to do a fair amount of manipulation:
- Each of the objects needs to be mapped out to key/value pairs
- Flatten the pairs to a single array
- Group up the pairs by key
- Map out each group, accumulating the values to a single key/value pair
- Map the pairs back to an object
map(to_entries)
| add
| group_by(.key)
| map({
key: .[0].key,
value: map(.value) | add
})
| from_entries
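Put together as a complete invocation with the -s flag, the filter above might be run like this (a sketch; sum_objects is a hypothetical helper name and -c is only there to keep the output on one line):

```shell
# Sketch: slurp the stream into an array, then sum the values per key.
sum_objects() {
  jq -s -c 'map(to_entries)
            | add
            | group_by(.key)
            | map({key: .[0].key, value: (map(.value) | add)})
            | from_entries'
}

sum_objects <<'EOF'
{ "a": 10, "b": 11 } { "a": 20, "b": 21 } { "a": 30, "b": 31 }
EOF
# {"a":60,"b":63}
```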
With jq 1.5, this could be greatly improved: You can do away with slurping and just read the inputs directly.
$ jq -n '
reduce (inputs | to_entries[]) as {$key,$value} ({}; .[$key] += $value)
' input.json
Since we're simply accumulating all the values in each of the objects, it'll be easier to just run through the key/value pairs of all the inputs, and add them all up.
I faced the same problem when listing all artifacts of a GitHub repository and wanting to sum their sizes:
curl https://api.github.com/repos/:owner/:repo/actions/artifacts \
-H "Accept: application/vnd.github.v3+json" \
-H "Authorization: token <your_pat_here>" \
| jq '.artifacts | map(.size_in_bytes) | add'
Input:
{
"total_count": 3,
"artifacts": [
{
"id": 0000001,
"node_id": "MDg6QXJ0aWZhY3QyNzUxNjI1",
"name": "artifact-1",
"size_in_bytes": 1,
"url": "https://api.github.com/repos/:owner/:repo/actions/artifacts/2751625",
"archive_download_url": "https://api.github.com/repos/:owner/:repo/actions/artifacts/2751625/zip",
"expired": false,
"created_at": "2020-03-10T18:21:23Z",
"updated_at": "2020-03-10T18:21:24Z"
},
{
"id": 0000002,
"node_id": "MDg6QXJ0aWZhY3QyNzUxNjI0",
"name": "artifact-2",
"size_in_bytes": 2,
"url": "https://api.github.com/repos/:owner/:repo/actions/artifacts/2751624",
"archive_download_url": "https://api.github.com/repos/:owner/:repo/actions/artifacts/2751624/zip",
"expired": false,
"created_at": "2020-03-10T18:21:23Z",
"updated_at": "2020-03-10T18:21:24Z"
},
{
"id": 0000003,
"node_id": "MDg6QXJ0aWZhY3QyNzI3NTk1",
"name": "artifact-3",
"size_in_bytes": 3,
"url": "https://api.github.com/repos/docker/mercury-ui/actions/artifacts/2727595",
"archive_download_url": "https://api.github.com/repos/:owner/:repo/actions/artifacts/2727595/zip",
"expired": false,
"created_at": "2020-03-10T08:46:08Z",
"updated_at": "2020-03-10T08:46:09Z"
}
]
}
Output:
6
Another approach, which illustrates the power of jq quite nicely, is to use a filter named "sum" defined as follows:
def sum(f): reduce .[] as $row (0; . + ($row|f) );
To solve the particular problem at hand, one could then use the -s (--slurp) option as mentioned above, together with the expression:
{"a": sum(.a), "b": sum(.b) } # (2)
The expression labeled (2) only computes the two specified sums, but it is easy to generalize, e.g. as follows:
# Produce an object with the same keys as the first object in the
# input array, but with values equal to the sum of the corresponding
# values in all the objects.
def sumByKey:
. as $in
| reduce (.[0] | keys)[] as $key
( {}; . + {($key): ($in | sum(.[$key]))})
;
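As a usage sketch, the two definitions can be placed in front of a call to sumByKey and run with --slurp on the stream from the question (sum_by_key is a hypothetical wrapper name):

```shell
# Sketch: define sum and sumByKey inline, then apply sumByKey to the slurped array.
sum_by_key() {
  jq -s -c 'def sum(f): reduce .[] as $row (0; . + ($row | f));
            def sumByKey:
              . as $in
              | reduce (.[0] | keys)[] as $key
                  ({}; . + {($key): ($in | sum(.[$key]))});
            sumByKey'
}

sum_by_key <<'EOF'
{ "a": 10, "b": 11 } { "a": 20, "b": 21 } { "a": 30, "b": 31 }
EOF
# {"a":60,"b":63}
```

Only the keys of the first object are summed; keys that appear solely in later objects are ignored by this definition.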