JQ to convert JSON to CSV for specific keys

I am trying to convert JSON to CSV for selected keys using jq.
file.json
{
"_ref": "ipv4address/Li5pcHY0X2FkZHJlc3yMDIuMS8w:10.202.202.1",
"discovered_data": {
"bgp_as": 64638,
"device_model": "catalyst37xxStack",
"device_port_name": "Vl2002",
"device_port_type": "propVirtual",
"device_type": "Switch-Router",
"device_vendor": "Cisco",
"discovered_name": "Test_Device.network.local",
"discoverer": "Network Insight",
"first_discovered": 1580161888,
"last_discovered": 1630773758,
"mac_address": "aa:bb:cc:dd:ee:ff",
"mgmt_ip_address": "10.202.202.1",
"os": "15.2(4)E10",
"port_speed": "Unknown",
"port_vlan_name": "TEST-DATA",
"port_vlan_number": 2002
},
"ip_address": "10.202.202.1",
"is_conflict": false,
"mac_address": "",
"names": ["Test_Device"],
"network": "10.202.202.0/23",
"network_view": "TEST VIEW",
"objects": [],
"status": "USED",
"types": [
"UNMANAGED"
],
"usage": []
}
my desired output is:
names,ip_address,discovered_data.mac_address,discovered_data.discovered_name
Test_Device,10.202.202.1,aa:bb:cc:dd:ee:ff,Test_Device.network.local
So far, I have tried the following command but am getting a syntax error:
jq -r 'map({names,ip_address,discovered_data.mac_address,discovered_data.discovered_name}) | (first | keys_unsorted) as $keys | map([to_entries[] | .value]) as $rows | $keys,$rows[] | @csv' < file.json

Assuming the JSON has been fixed, consider the output of:
(null
| {names,
ip_address,
"discovered_data.mac_address",
"discovered_data.discovered_name"} | keys_unsorted) as $keys
| $keys,
({names: .names[],
ip_address,
"discovered_data.mac_address": .discovered_data.mac_address,
"discovered_data.discovered_name": .discovered_data.discovered_name }
| [.[]])
| @csv
Assuming jq is invoked with the -r command-line option, this has the advantage of producing valid CSV. If you prefer to have all the key names and values unquoted, you might wish to consider using join(",") instead of @csv, or some more sophisticated variation if you want to have your cake and eat it.
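For instance, a minimal sketch of the join(",") variant, with the column list written out explicitly rather than derived via keys_unsorted, might look like this:
jq -r '["names","ip_address","discovered_data.mac_address","discovered_data.discovered_name"],
       [.names[], .ip_address, .discovered_data.mac_address, .discovered_data.discovered_name]
       | join(",")' file.json
which should print the desired output verbatim:
names,ip_address,discovered_data.mac_address,discovered_data.discovered_name
Test_Device,10.202.202.1,aa:bb:cc:dd:ee:ff,Test_Device.network.local
Note that .names[] assumes the names array holds exactly one entry, as it does in the sample.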

why not print csv headers?

CentOS, jq
https://stedolan.github.io/jq/manual/
I want to export JSON to CSV. I am using the jq tool for this.
Here is an example of the JSON:
{
"page": {
"id": "kctbh9vrtdwd",
"name": "GitHub",
"url": "https://www.githubstatus.com",
"time_zone": "Etc/UTC",
"updated_at": "2021-05-27T16:56:02.461Z"
},
"status": {
"indicator": "none",
"description": "All Systems Operational"
}
}
I get it by running:
curl -s https://www.githubstatus.com/api/v2/status.json
Here I convert the JSON to CSV:
curl -s https://www.githubstatus.com/api/v2/status.json | jq -r '.page | [.id, .name] | @csv'
And here is the result:
"kctbh9vrtdwd","GitHub"
But why doesn't it print the CSV headers?
There is quite a lot of noise on the SO page that is provided as a link in one of the comments,
so here are two safe jq-only solutions ("safe" in the sense that it does not matter how the keys are ordered in the input JSON):
Manually add the headers
["id", "name"],
(.page | [.id, .name])
| @csv
Include the headers based on the specification of the relevant columns
["id", "name"] as $headers
| $headers, (.page | [.[$headers[]]])
| @csv
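For instance, feeding the live endpoint through the second filter (with this sample, the first filter yields the same result):
curl -s https://www.githubstatus.com/api/v2/status.json |
  jq -r '["id", "name"] as $headers | $headers, (.page | [.[$headers[]]]) | @csv'
With the sample document shown above, this prints:
"id","name"
"kctbh9vrtdwd","GitHub"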

use jq to format json data into csv data

{
"Users": [
{
"Attributes": [
{
"Name": "sub",
"Value": "1"
},
{
"Name": "phone_number",
"Value": "1234"
},
{
"Name": "referral_code",
"Value": "abc"
}
]
},
{
"Attributes": [
{
"Name": "sub",
"Value": "2"
},
{
"Name": "phone_number",
"Value": "5678"
},
{
"Name": "referral_code",
"Value": "def"
}
]
}
]
}
How can I produce output like below ?
1,1234,abc
2,5678,def
jq -r '.Users[].Attributes[].Value' test.json
produces
1
1234
abc
2
5678
def
Not sure this is the cleanest way to handle this, but the following will get the desired output:
.Users[].Attributes | map(.Value) | @csv
Loop through all the nested Attributes with .Users[].Attributes
map() to collect all the .Value fields
Convert to CSV with @csv
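Putting it together as a full invocation (a sketch, assuming the input is in test.json as in the question):
jq -r '.Users[].Attributes | map(.Value) | @csv' test.json
"1","1234","abc"
"2","5678","def"
Since the values are JSON strings, @csv quotes them; see below for ways to drop the quotation marks.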
If you don't need the output to be guaranteed to be CSV, and if you're sure the "Name" values are presented in the same order, you could go with:
.Users[].Attributes
| from_entries
| [.[]]
| join(",")
To be safe though it would be better to ensure consistency of ordering:
(.Users[0] | [.Attributes[] | .Name]) as $keys
| .Users[]
| .Attributes
| from_entries
| [.[ $keys[] ]]
| join(",")
Using join(",") will produce the comma-separated values as shown in the Q (without the quotation marks), but is not guaranteed to produce the expected CSV for all valid values of the input. If you don't mind the pesky quotation marks, you could use #csv, or if you want to skip the quotation marks around all numeric values:
map(tonumber? // .) | #csv
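For example, appending that to the simple filter from above (a sketch, using the same test.json):
jq -r '.Users[].Attributes | map(.Value) | map(tonumber? // .) | @csv' test.json
1,1234,"abc"
2,5678,"def"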

Show a raw list of items after deleting another item with jq

I've got a json file with the following content:
{
"stuff": {
"usfull": [
"aa",
"bb",
"cc",
"dd"
],
"usefulAsWell": [
"ab",
"cd",
"ef",
"gh"
],
"waste": [
"12",
"34",
"56"
],
"moreWaste": [
"78"
]
}
}
I would need a plain list of the items "useful" and "usefulAsWell". I don't know if more useful items will be added later on, but I know which items are waste and want them included instead of just listing the useful ones.
With the following command I already got a list, but it still contains format characters like [],"
cat example.json | jq -r '.stuff | del(.waste,.moreWaste) | .[]'
[
"aa",
"bb",
"cc",
"dd"
]
[
"ab",
"cd",
"ef",
"gh"
]
With the following command I get a nice list, but it unfortunately contains the waste:
cat example.json | jq -r '.stuff[] | .[]'
aa
bb
cc
dd
ab
cd
ef
gh
12
34
56
78
When trying to add the deletion part, I get an error message:
cat example.json | jq -r '.stuff[] | del(.waste,.moreWaste) | .[]'
jq: error (at <stdin>:24): Cannot index array with string "waste"
Any idea on this topic?
Thanks in advance!
The description of the problem contains an inconsistency (a typo?), but it looks like you're after:
.stuff | del(.waste) | del(.moreWaste) | .[][]
which, as you implicitly note, can be abbreviated to:
.stuff | del(.waste, .moreWaste) | .[][]
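With the sample file this yields the plain list, minus the waste:
jq -r '.stuff | del(.waste, .moreWaste) | .[][]' example.json
aa
bb
cc
dd
ab
cd
ef
gh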

Parsing multiple key/values in json tree with jq

Using jq, I'd like to cherry-pick key/value pairs from the following json:
{
"project": "Project X",
"description": "This is a description of Project X",
"nodes": [
{
"name": "server001",
"detail001": "foo",
"detail002": "bar",
"networks": [
{
"net_tier": "network_tier_001",
"ip_address": "10.1.1.10",
"gateway": "10.1.1.1",
"subnet_mask": "255.255.255.0",
"mac_address": "00:11:22:aa:bb:cc"
}
],
"hardware": {
"vcpu": 1,
"mem": 1024,
"disks": [
{
"disk001": 40,
"detail001": "foo"
},
{
"disk002": 20,
"detail001": "bar"
}
]
},
"os": "debian8",
"geo": {
"region": "001",
"country": "Sweden",
"datacentre": "Malmo"
},
"detail003": "baz"
}
],
"detail001": "foo"
}
For the sake of an example, I'd like to parse the following keys and their values: "Project", "name", "net_tier", "vcpu", "mem", "disk001", "disk002".
I'm able to parse individual elements without much issue, but due to the hierarchical nature of the full parse, I've not had much luck parsing down different branches (i.e. both networks and hardware > disks).
Any help appreciated.
Edit:
For clarity, the output I'm going for is a comma-separated CSV. In terms of parsing all combinations, covering the sample data in the example will do for now. I will hopefully be able to expand on any suggestions.
Here is a different filter which computes the unique set of network tier and disk names and then generates a result with columns appropriate to the data.
{
tiers: [ .nodes[].networks[].net_tier ] | unique
, disks: [ .nodes[].hardware.disks[] | keys[] | select(startswith("disk")) ] | unique
} as $n
| def column_names($n): [ "project", "name" ] + $n.tiers + ["vcpu", "mem"] + $n.disks ;
def tiers($n): [ $n.tiers[] as $t | .networks[] | if .net_tier==$t then $t else null end ] ;
def disks($n): [ $n.disks[] as $d | map(select(.[$d]!=null)|.[$d])[0] ] ;
def rows($n):
.project as $project
| .nodes[]
| .name as $name
| tiers($n) as $tier_values
| .hardware
| .vcpu as $vcpu
| .mem as $mem
| .disks
| disks($n) as $disk_values
| [$project, $name] + $tier_values + [$vcpu, $mem] + $disk_values
;
column_names($n), rows($n)
| @csv
The benefit of this approach becomes apparent if we add another node to the sample data:
{
"name": "server002",
"networks": [
{
"net_tier": "network_tier_002"
}
],
"hardware": {
"vcpu": 1,
"mem": 1024,
"disks": [
{
"disk002": 40,
"detail001": "foo"
}
]
}
}
Sample Run (assuming filter in filter.jq and amended data in data.json)
$ jq -Mr -f filter.jq data.json
"project","name","network_tier_001","network_tier_002","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001","",1,1024,40,20
"Project X","server002",,"network_tier_002",1,1024,,40
Here's one way you could achieve the desired output.
program.jq:
["project","name","net_tier","vcpu","mem","disk001","disk002"],
[.project]
+ (.nodes[] | .networks[] as $n |
[
.name,
$n.net_tier,
(.hardware |
.vcpu,
.mem,
(.disks | add["disk001","disk002"])
)
]
)
| @csv
$ jq -r -f program.jq input.json
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20
Basically, you'll want to project the fields that you want into arrays so you may convert those arrays to csv rows. Your input makes it seem like there could potentially be multiple networks for a given node. So if you wanted to output all combinations, that would have to be flattened out.
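To illustrate: if server001 hypothetically had a second entry in its networks array, say with "net_tier": "network_tier_002", the filter above would emit one CSV row per network, repeating the node-level values:
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20
"Project X","server001","network_tier_002",1,1024,40,20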
Here's another approach, that is short enough to speak for itself:
def s(f): first(.. | f? // empty) // null;
[s(.project), s(.name), s(.net_tier), s(.vcpu), s(.mem), s(.disk001), s(.disk002)]
| @csv
Invocation:
$ jq -r -f value-pairs.jq input.json
Result:
"Project X","server001","network_tier_001",1,1024,40,20
With headers
Using the same s/1 as above:
. as $d
| ["project", "name", "net_tier", "vcpu", "mem", "disk001","disk002"]
| (., map( . as $v | $d | s(.[$v])))
| @csv
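Against the sample input this should print the header row followed by the same data row as before:
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20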
With multiple nodes
Again with s/1 as above:
.project as $p
| ["project", "name", "net_tier", "vcpu", "mem", "disk001","disk002"] as $h
| ($h,
(.nodes[] as $d
| $h
| map( . as $v | $d | s(.[$v]) )
| .[0] = $p)
) | @csv
Output with the illustrative multi-node data:
"project","name","net_tier","vcpu","mem","disk001","disk002"
"Project X","server001","network_tier_001",1,1024,40,20
"Project X","server002","network_tier_002",1,1024,,40

parsing Json data with jq

I need help parsing JSON data using jq. I used to parse this data with the JSONPath expression [?(@.type=='router')].externalIP, but I am not sure how to do the same using jq.
The query should return the .externalIP of the entries whose type is router.
198.22.66.99
A JSON data snippet is below:
[
{
"externalHostName": "localhost",
"externalIP": "198.22.66.99",
"internalHostName": "localhost",
"isUp": true,
"pod": "gateway",
"reachable": true,
"region": "dc-1",
"type": [
"router"
],
"uUID": "b5f986fe-982e-47ae-8260-8a3662f25fc2"
},
]
cat your-data.json | jq '.[]|.externalIP|select(type=="string")'
"198.22.66.99"
"192.22.66.29"
"192.22.66.89"
"192.66.22.79"
explanation:
.[] | .externalIP | select(type=="string")
for every array entry | get field 'externalIP' | drop nulls
EDIT/ADDENDUM: filter on type (expects "router" to be at index 0 of the type array)
cat x | jq '.[]|select(.type[0] == "router")|.externalIP'
"198.22.66.99"
"192.22.66.89"
The description says:
I would like to extract externalIP only for the array "type": [ "router" ]
The corresponding jq query is:
.[] | select(.type==["router"]) | .externalIP
To base the query on whether "router" is amongst the specified types:
.[] | select(.type|index("router")) | .externalIP
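For example, once the stray trailing comma in the snippet is removed (or against the full, valid document), the invocation would be:
jq -r '.[] | select(.type | index("router")) | .externalIP' data.json
198.22.66.99
(data.json is a placeholder filename here.)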