I'm trying to come up with the correct jq syntax to convert json to csv.
Desired results:
<email>,<id>,<name>
e.g.
user1#whatever.nevermind.no,0,general
user2#whatever.nevermind.no,0,general
user1#whatever.nevermind.no,1,local
...
note that also need to ignore objects with empty "agent_priorities"
Input
[
{
"id": 0,
"name": "General",
"agent_priorities": {
"user1#whatever.nevermind.no": "normal",
"user2#whatever.nevermind.no": "normal"
}
},
{
"id": 1,
"name": "local",
"agent_priorities": {
"user1#whatever.nevermind.no": "normal"
}
},
{
"id": 2,
"name": "Engineering",
}
]
The following variant of the accepted answer checks for the existence of the "agent_priorities" key as per the requirements, and uses keys_unsorted to preserve the order of the keys:
jq -r '
.[]
| select(has("agent_priorities"))
| .id as $id
| .name as $name
| .agent_priorities
| keys_unsorted[]
| [., $id, $name ]
| #csv
' file.json
Store the id and name in variables, then iterate over the keys of agent_priorities:
jq -r '.[]
| .id as $id
| .name as $name
| .agent_priorities
| keys
| .[]
| [., $id, $name ]
| #csv
' file.json
I'm trying to generate a CSV of sort from json file, the files are as below
cat role1.json
{
"Tags": [
{
"Key": "Name",
"Value": "Role1Name"
},
{
"Key": "ID",
"Value": "Role1ID"
},
{
"Key": "Manager",
"Value": "Role1Manager"
},
{
"Key": "User",
"Value": "Role1User"
},
{
"Key": "Country",
"Value": "USA"
}
]
}
cat role2.json
{
"Tags": [
{
"Key": "Name",
"Value": "Role2Name"
},
{
"Key": "ID",
"Value": "Role2ID"
},
{
"Key": "City",
"Value": "NewYork"
},
{
"Key": "Creator",
"Value": "Role2Creator"
},
{
"Key": "User",
"Value": "Role2User"
}
]
}
cat role3.json
{
"Tags": [
{
"Key": "Name",
"Value": "Role3Name"
},
{
"Key": "ID",
"Value": "Role3ID"
},
{
"Key": "Creator",
"Value": "Role3Creator"
},
{
"Key": "ZIP",
"Value": 82378
},
{
"Key": "Manager",
"Value": "Role3Manager"
},
{
"Key": "User",
"Value": "Role3User"
}
]
}
I want to generate lines from each of these to be later used as CSV, something like:
Role1Name, Role1ID, null, Role1Manager, Role1User
Role2Name, Role2ID, Role2Creator, null, Role2User
Role3Name, Role3ID, Role3Creator, Role3Manager, Role3User
For the header line
Name, ID, Creator, Manager, User
I'm able to get all the "Value" but not able to print null for missing "Key"
$cat role1.json | jq -rc '[.Tags[] | select(.Key == ("Name","ID","Creator","Manager","User")) | .Value]'
["Role1Name","Role1ID","Role1Manager","Role1User"]
$cat role2.json | jq -rc '[.Tags[] | select(.Key == ("Name","ID","Creator","Manager","User")) | .Value]'
["Role2Name","Role2ID","Role2Creator","Role2User"]
$cat role3.json | jq -rc '[.Tags[] | select(.Key == ("Name","ID","Creator","Manager","User")) | .Value]'
["Role3Name","Role3ID","Role3Creator","Role3Manager","Role3User"]
Can someone share with me how this can be done using jq.
Also, how can we enforce the order.
Thanks!
The key (ha!) is
[ .[ $keys[] ] ]
Had you looked at other answers to questions relating to CSV, you might have noticed the first step taken is to get the list of keys. This is often done by collecting the keys of the input objects. (Example) In your case, you have a hard-coded list, so it's even simpler.
If you wanted actual CSV, you could use
jq -sr '
[ "Name", "ID", "Creator", "Manager", "User" ] as $keys |
(
$keys,
( .[].Tags | from_entries | [ .[ $keys[] ] ] )
) |
#csv
' role*.json
This produces
"Name","ID","Creator","Manager","User"
"Role1Name","Role1ID",,"Role1Manager","Role1User"
"Role2Name","Role2ID","Role2Creator",,"Role2User"
"Role3Name","Role3ID","Role3Creator","Role3Manager","Role3User"
jqplay
Without a header:
jq -r '.Tags | from_entries | [ .["Name","ID","Creator","Manager","User"] ] | #csv' role*.json
jqplay
To get the specific output you posted (which isn't CSV), you could use
jq -sr '
[ "Name", "ID", "Creator", "Manager", "User" ] as $keys |
(
$keys,
( .[].Tags | from_entries | [ .[ $keys[] ] | . // "null" ] )
) |
join(", ")
' role*.json
This produces
Name, ID, Creator, Manager, User
Role1Name, Role1ID, null, Role1Manager, Role1User
Role2Name, Role2ID, Role2Creator, null, Role2User
Role3Name, Role3ID, Role3Creator, Role3Manager, Role3User
jqplay
Without a header:
jq -r '.Tags | from_entries | [ .["Name","ID","Creator","Manager","User"] | . // "null" ] | join(", ")' role*.json
jqplay
Got an answer from another forum, might be useful for others
$jq -rc '.Tags | from_entries | [.Name, .ID, .Creator, .Manager, .User]' role*.json
["Role1Name","Role1ID",null,"Role1Manager","Role1User"]
["Role2Name","Role2ID","Role2Creator",null,"Role2User"]
["Role3Name","Role3ID","Role3Creator","Role3Manager","Role3User"]
Given a JSON like the following:
{
"data": [{
"id": "1a2b3c",
"info": {
"a": {
"number": 0
},
"b": {
"number": 1
},
"c": {
"number": 2
}
}
}]
}
I want to select on a number that is greater than or equal to 2 and for that selection I want to return the values of id and number. I did this like so:
$ jq -r '.data[] | .id as $ID | .info[] | select(.number >= 2) | [$ID, .number]' in.json
[
"1a2b3c",
2
]
Now I would also like to return a higher level key for my selection, in my case I need to return c. How can I accomplish this?
Assuming you want the string "c" instead of 2 in the output, this will work:
$ jq '.data[] | .id as $ID | .info | to_entries[] | select(.value.number >= 2) | [$ID, .key]' input.json
[
"1a2b3c",
"c"
]
I previously got some help on here for some jq to csv issues. I ran into an issue where a few json files had some extra values that breaks the jq command
Here is the json data. The repairs section is what breaks the jq command
[
{
"Name": "John Doe",
"Car": [
"Car1",
"Car2"
],
"Location": "Texas",
"Repairs: {
"RepairLocations": {
"RepairsCompleted":[
"Fix1",
"Fix2"
]
}
}
},
{
"Name": "Jane Roe",
"Car": "Car1",
"Location": [
"Illinois",
"Kansas"
]
}
]
Here is the command
def expand($keys):
. as $in
| reduce $keys[] as $k ( [{}];
map(. + {
($k): ($in[$k] | if type == "array" then .[] else . end)
})
) | .[];
(.[0] | keys_unsorted) as $h
| $h, (.[] | expand($h) | [.[$h[]]]) | #csv
This is the end result i am trying to get. This data isnt actual data.
Name,Car,Location,Repairs:RepairLocation
John Doe,Car1,Texas,RepairsCompleted:Fix1
John Doe,Car1,Texas,RepairsCompleted:Fix2
John Doe,Car2,Texas,RepairsCompleted:Fix1
John Doe,Car2,Texas,RepairsCompleted:Fix2
Jane Roe,Car1,Illinois,
Jane Roe,Car1,Kansas,
Any advice on this would be great. I am struggling to figure jq out
A simple solution can be obtained using the same technique shown in one of the answers to the similar question you already asked. The only difference is fulfilling your requirements in the case where the "Repairs" key does not exist:
["Name", "Car", "Location", "Repairs:RepairLocation"],
(.[]
| [.Name]
+ (.Car|..|scalars|[.])
+ (.Location|..|scalars|[.])
+ (.Repairs|..|scalars
| [if . == null then . else "RepairsCompleted:\(.)" end]) )
| #csv
Avoiding the repetition with a helper function
def s: .. | scalars | [.];
["Name", "Car", "Location", "Repairs:RepairLocation"],
(.[]
| [.Name]
+ (.Car|s)
+ (.Location|s)
+ (.Repairs|s|map(if . == null then . else "RepairsCompleted:\(.)" end)))
| #csv
I am parsing a curl output from gitlab api, and I need to add a sort_by to my query, then select only certain values.
sample input:
[
{
"id": 10,
"name": "another-test",
"path": "another-test",
"description": "",
"visibility": "private",
"lfs_enabled": true,
"avatar_url": null,
"web_url": "https://mygitlab/groups/another-test",
"request_access_enabled": false,
"full_name": "another-test",
"full_path": "another-test",
"parent_id": 9
},
{
"id": 11,
"name": "asdfg",
"path": "asdfg",
"description": "",
"visibility": "private",
"lfs_enabled": true,
"avatar_url": null,
"web_url": "https://mygitlab/groups/asdfg",
"request_access_enabled": false,
"full_name": "asdfg",
"full_path": "asdfg",
"parent_id": 7
}
I parse the JSON with jq as follows:
curl http://..... | jq -r '.[] | select(.parent_id!=null) | .name, .parent_id'
This works exactly as expected, but when I try to sort the results by parent_id, I get an error:
curl http://..... | jq -r '.[] | select(.parent_id!=null) | .name, .parent_id | sort_by(.parent_id)'
jq: error (at <stdin>:0): Cannot index number with string "parent_id"
I can use sort_by(), by putting a single dot instead than .[]:
curl http://..... | jq '. | sort_by(.parent_id) '
But I cannot combine the 2 functions.
Clarification: I need to extract name and parent_id, sorted by parent_id, when it is not null.
Thanks in advance
jq's sort_by() function accepts an array as input.
curl 'http://...' |
jq -r '
map(select(.parent_id != null))
| sort_by(.parent_id)[]
| [.name, .parent_id]
| #tsv
'
Sample output:
asdfg 7
another-test 9