I have two JSON files.
file1.json:
{
"Fruits": [
{
"name": "Apple",
"something_else": 123,
"id": 652090
},
{
"name": "Orange",
"something_else": 456,
"id": 28748
}
]}
file2.json:
{
"Fruits": [
{
"weight": 5,
"id": 652090
},
{
"weight": 7,
"id": 28748
}
]}
I want to combine objects from both files if they share a common 'id' key, but take only the 'name' property from file1. How do I do that using jq?
This is what I want to get:
{
"Fruits": [
{
"name": "Apple",
"weight": 5,
"id": 652090
},
{
"name": "Orange",
"weight": 7,
"id": 28748
}
]}
Combine the Fruits arrays, group the elements by id, and select the groups with two elements, because we only want fruits present in both files. For each selected group, add the name field from the first element to the second, and collect the results in an array.
jq -n '[inputs.Fruits[]]
| reduce (group_by(.id)[] | select(length==2)) as $f
([]; . + [$f[1] + ($f[0] | {name})])' file1.json file2.json
Note that the order in which the files are given on the command line matters: the file with the names must come before the other.
Combining objects with the same id and extracting a subset of fields is much easier, though:
jq -n '[inputs.Fruits[]]
| group_by(.id)
| map(select(length==2) | add | {name, id, weight})
' file1.json file2.json
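For readers who want to check the grouping logic outside jq, here is a minimal Python sketch of the same steps: concatenate the Fruits entries, group them by id, keep only the groups with two members, and merge each group. The helper name merge_by_id and the inline sample data are assumptions made for illustration; they are not part of the answer above.

```python
from collections import defaultdict

def merge_by_id(first, second, fields=("name", "id", "weight")):
    """Merge objects that share an id across both inputs (hypothetical helper)."""
    groups = defaultdict(list)
    for fruit in first["Fruits"] + second["Fruits"]:
        groups[fruit["id"]].append(fruit)

    merged = []
    for fruit_id in sorted(groups):      # jq's group_by also orders groups by key
        group = groups[fruit_id]
        if len(group) != 2:              # mirrors select(length==2)
            continue
        combined = {}
        for obj in group:                # mirrors add: later objects win on clashes
            combined.update(obj)
        # mirrors {name, id, weight}; a missing field becomes None (null in jq)
        merged.append({key: combined.get(key) for key in fields})
    return merged

# Sample data copied from the question.
file1 = {"Fruits": [{"name": "Apple", "something_else": 123, "id": 652090},
                    {"name": "Orange", "something_else": 456, "id": 28748}]}
file2 = {"Fruits": [{"weight": 5, "id": 652090},
                    {"weight": 7, "id": 28748}]}

print(merge_by_id(file1, file2))
```

As with group_by in jq, the result is ordered by id, so Orange comes before Apple.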
There are plenty of ways this could be constructed. Here's another:
$ jq '.Fruits |= (. + input.Fruits | [group_by(.id)[] | add | {name,weight,id}])' \
file1.json file2.json
{
"Fruits": [
{
"name": "Orange",
"weight": 7,
"id": 28748
},
{
"name": "Apple",
"weight": 5,
"id": 652090
}
]
}
I have the following json files:
File1:
[
{
"id": 1,
"name": "serviceName",
"owner": {
"id": 1,
"name": "Nicole"
}
}
]
and File2:
[
{
"id": 1,
"name": "Nicole",
"email": "nicole@email.com"
}
]
I would like to have them merged like this:
[
{
"id": 1,
"name": "serviceName",
"owner": {
"id": 1,
"name": "Nicole",
"email": "nicole@email.com"
}
}
]
I'm trying the approach from here, using the following:
jq --argfile new file2.json '
($new | INDEX(.ID)) as $dict
| .owner
|= (if $dict[.ID] then . + $dict[.ID] else . end)
' file1.json
But that just results in an error.
Can anyone maybe provide me with some tips?
Your approach fails for the following reasons:
Field names are case-sensitive. Given {"id": 1}, use .id, not .ID.
Object keys are strings. Given {"id": 1} and INDEX(.id) as $dict, use $dict[.id | tostring], $dict[.id | @text], or $dict["\(.id)"] to convert the number 1 into the string "1" that serves as the key in the index.
Your files contain arrays. While INDEX iterates over .[] of file2.json by itself, you need to do the same for file1.json yourself.
jq --argfile new file2.json '
($new | INDEX(.id)) as $dict | .[].owner |= (
if $dict[.id | @text] then . + $dict[.id | @text] else . end
)
' file1.json
[
{
"id": 1,
"name": "serviceName",
"owner": {
"id": 1,
"name": "Nicole",
"email": "nicole@email.com"
}
}
]
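The string-key point is easy to reproduce outside jq. The following Python sketch builds a lookup keyed by the stringified id, the way INDEX does, and then merges each owner with its match. The helper names index_by and merge_owners are assumptions for illustration, and the sample data is taken from the question.

```python
import json

def index_by(items, key):
    """Build a lookup keyed by the stringified field, similar to jq's INDEX (hypothetical helper)."""
    return {str(item[key]): item for item in items}

def merge_owners(services, people):
    """Add the matching person's fields to each service's owner (hypothetical helper)."""
    lookup = index_by(people, "id")
    merged = []
    for service in services:
        owner = service["owner"]
        # str() plays the role of tostring / @text: the lookup keys are strings.
        extra = lookup.get(str(owner["id"]), {})
        merged.append({**service, "owner": {**owner, **extra}})
    return merged

services = json.loads(
    '[{"id": 1, "name": "serviceName", "owner": {"id": 1, "name": "Nicole"}}]'
)
people = json.loads('[{"id": 1, "name": "Nicole", "email": "nicole@email.com"}]')

print(json.dumps(merge_owners(services, people), indent=2))
```

Looking up lookup.get(1) with the raw number would find nothing here, which is the Python analogue of indexing $dict with a number instead of a string.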
As a suggestion, you could also employ JOIN to merge on a given key:
jq '
JOIN(
INDEX(input[]; .id); .[]; .id | @text; .[0].owner += .[1] | .[0]
)
' file1.json file2.json
[
{
"id": 1,
"name": "serviceName",
"owner": {
"id": 1,
"name": "Nicole",
"email": "nicole@email.com"
}
}
]
I have the following input:
{
"Columns": [
{
"email": 123,
"name": 456,
"firstName": 789,
"lastName": 450,
"admin": 900,
"licensedSheetCreator": 617,
"groupAdmin": 354,
"resourceViewer": 804,
"id": 730,
"status": 523,
"sheetCount": 298
}
]
}
{
"Users": [
{
"email": "abc@def.com",
"name": "Abc Def",
"firstName": "Abc",
"lastName": "Def",
"admin": false,
"licensedSheetCreator": true,
"groupAdmin": false,
"resourceViewer": true,
"id": 521,
"status": "ACTIVE",
"sheetCount": 0
},
{
"email": "aaa@bbb.com",
"name": "Aaa Bob",
"firstName": "Aaa",
"lastName": "Bob",
"admin": false,
"licensedSheetCreator": true,
"groupAdmin": false,
"resourceViewer": false,
"id": 352,
"status": "ACTIVE",
"sheetCount": 0
}
]
}
I need to change the key of every key-value pair in Users to the matching value in Columns, like so:
{
"Columns": [
{
"email": 123,
"name": 456,
"firstName": 789,
"lastName": 450,
"admin": 900,
"licensedSheetCreator": 617,
"groupAdmin": 354,
"resourceViewer": 804,
"id": 730,
"status": 523,
"sheetCount": 298
}
]
}
{
"Users": [
{
123: "abc@def.com",
456: "Abc Def",
789: "Abc",
450: "Def",
900: false,
617: true,
354: false,
804: true,
730: 521,
523: "ACTIVE",
298: 0
},
{
123: "aaa@bbb.com",
456: "Aaa Bob",
789: "Aaa",
450: "Bob",
900: false,
617: true,
354: false,
804: false,
730: 352,
523: "ACTIVE",
298: 0
}
]
}
I don't mind if I update the Users array or create a new array of objects.
I have tried several combinations of with_entries, to_entries, and from_entries, and tried to search for keys using variables, but the more I dive into it, the more confused I get.
Elements of a stream are processed independently. So we have to change the input.
We could group the stream elements into an array. For an input stream, this can be achieved using --slurp/-s.[1]
jq -s '
( .[0].Columns[0] | map_values( tostring ) ) as $map |
(
.[0],
(
.[1:][] |
.Users[] |= with_entries(
.key = $map[ .key ]
)
)
)
'
Alternatively, we could use --null-input/-n in conjunction with input and/or inputs to read the input.
jq -n '
input |
( .Columns[0] | map_values( tostring ) ) as $map |
(
.,
(
inputs |
.Users[] |= with_entries(
.key = $map[ .key ]
)
)
)
'
Note that your desired output isn't valid JSON. Object keys must be strings. So the above produces a slightly different document than requested.
Note that I assumed that .Columns is always an array of exactly one element. This is a nonsense assumption, but it's the only way the question makes sense.
[1] For a stream the code generates, you could place the stream generator in an array constructor ([]). reduce can also be used to collect from a stream. For example, map( ... ) can be written as [ .[] | ... ] and as reduce .[] as $_ ( []; . + [ $_ | ... ] ).
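The renaming step itself can be sketched in a few lines of Python, under the same assumption that Columns holds exactly one object. The helper name rename_keys and the shortened sample data are illustrative assumptions; the numeric column values are converted to strings because JSON object keys must be strings.

```python
def rename_keys(columns_doc, users_doc):
    """Rename every key in each user via the single Columns mapping (hypothetical helper)."""
    # Assumption: Columns contains exactly one object, as in the answer above.
    mapping = {key: str(value) for key, value in columns_doc["Columns"][0].items()}
    renamed = [
        {mapping[key]: value for key, value in user.items()}  # KeyError if a key is unmapped
        for user in users_doc["Users"]
    ]
    return {"Users": renamed}

# Shortened sample data based on the question.
columns = {"Columns": [{"email": 123, "name": 456, "id": 730}]}
users = {"Users": [
    {"email": "abc@def.com", "name": "Abc Def", "id": 521},
    {"email": "aaa@bbb.com", "name": "Aaa Bob", "id": 352},
]}

print(rename_keys(columns, users))
```

A user key with no entry in Columns raises an error in this sketch; that choice is deliberate here, so unmapped keys are not silently dropped.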
The following has the merit of simplicity, though it does not sort the keys.
It assumes jq is invoked with the -n option and of course produces a stream of valid JSON objects:
input
| . as $Columns
| .Columns[0] as $dict
| input # Users
| .Users[] |= with_entries(.key |= ($dict[.]|tostring))
| $Columns, .
If having the keys sorted is important, then you could easily add suitable code to do that; alternatively, if you don't mind having the keys of all objects sorted, you could use the -S command-line option.
I have a JSON object that looks something like this:
{
"a": [{
"name": "x",
"group": [{
"name": "tom",
"publish": true
},{
"name": "joe",
"publish": true
}]
}, {
"name": "y",
"group": [{
"name": "tom",
"publish": false
},{
"name": "joe",
"publish": true
}]
}]
}
I want to select all the entries where publish=true and create a simplified JSON array of objects like this:
[
{
"name": "x",
"groupName": "tom"
},
{
"name": "x",
"groupName": "joe"
},
{
"name": "y",
"groupName": "joe"
}
]
I've tried many combinations, but the fact that group is an array seems to prevent each of them from working. Both in this specific case and in general, how do you do a deep select without losing the full hierarchy?
Using <expression> as $varname lets you store a value in a variable before going deeper into the hierarchy.
jq -n '
[inputs[][]
| .name as $name
| .group[]
| select(.publish == true)
| {name: $name, groupName: .name}
]' <input.json
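The idea of saving an outer value before descending can be sketched in Python as a nested loop, where the local variable plays the role of the jq variable. The function name published_pairs is an assumption for illustration; the sample data comes from the question.

```python
def published_pairs(doc):
    """Pair each outer name with the names of its published group members (hypothetical helper)."""
    result = []
    for entry in doc["a"]:
        outer_name = entry["name"]          # remembered before going deeper
        for member in entry["group"]:
            if member["publish"] is True:   # mirrors select(.publish == true)
                result.append({"name": outer_name, "groupName": member["name"]})
    return result

doc = {"a": [
    {"name": "x", "group": [{"name": "tom", "publish": True},
                            {"name": "joe", "publish": True}]},
    {"name": "y", "group": [{"name": "tom", "publish": False},
                            {"name": "joe", "publish": True}]},
]}

print(published_pairs(doc))
```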
You can use this:
jq '.[]|map(
.name as $n
| .group[]
| select(.publish==true)
| {name:$n,groupName:.name}
)' file.json
A shorter, effective alternative:
.a | map({name, groupName: (.group[] | select(.publish).name)})
Below is some example JSON data that needs some manipulation to produce the desired output, which is described after the JSON.
I want to use jq to extract the data I need.
{
"MetricAlarms": [
{
"EvaluationPeriods": 3,
"ComparisonOperator": "GreaterThanOrEqualToThreshold",
"AlarmActions": [
"Unimportant:Random:alarm:ELK2[10.1.1.2]-Root-Disk-Alert"
],
"AlarmName": "Unimportant:Random:alarm:ELK1[10.1.1.0]-Root-Alert",
"Dimensions": [
{
"Name": "path",
"Value": "/"
},
{
"Name": "InstanceType",
"Value": "m5.2xlarge"
},
{
"Name": "fstype",
"Value": "ext4"
}
],
"DatapointsToAlarm": 3,
"MetricName": "disk_used_percent"
},
{
"EvaluationPeriods": 3,
"ComparisonOperator": "GreaterThanOrEqualToThreshold",
"AlarmActions": [
"Unimportant:Random:alarm:ELK2[10.1.1.2]"
],
"AlarmName": "Unimportant:Random:alarm:ELK2[10.1.1.2]",
"Dimensions": [
{
"Name": "path",
"Value": "/"
},
{
"Name": "InstanceType",
"Value": "r5.2xlarge"
},
{
"Name": "fstype",
"Value": "ext4"
}
],
"DatapointsToAlarm": 3,
"MetricName": "disk_used_percent"
}
]
}
When I pass a key and a value as parameters, for example "Name": "InstanceType", to jq (probably via cat | jq), the expected output is:
m5.2xlarge
r5.2xlarge
A generic approach that recursively searches the input for a key-value pair (sk, sv) and extracts another key's value (pv) from the objects found:
jq -r --arg sk Name \
--arg sv InstanceType \
--arg pv Value \
'.. | objects | select(contains({($sk): $sv})) | .[$pv]' file
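To make the recursion explicit, here is a rough Python sketch of the same search. The function name find_values and the trimmed sample data are assumptions for illustration. One deliberate difference: the sketch compares with exact equality, whereas jq's contains treats strings as substring matches, and it skips matching objects that lack the requested key instead of yielding null.

```python
def find_values(node, search_key, search_value, pick_key):
    """Yield pick_key from every dict where search_key equals search_value (hypothetical helper)."""
    if isinstance(node, dict):
        if node.get(search_key) == search_value and pick_key in node:
            yield node[pick_key]
        for child in node.values():          # keep descending, like `..`
            yield from find_values(child, search_key, search_value, pick_key)
    elif isinstance(node, list):
        for child in node:
            yield from find_values(child, search_key, search_value, pick_key)

# Trimmed sample data based on the question.
alarms = {"MetricAlarms": [
    {"AlarmName": "alarm-1",
     "Dimensions": [{"Name": "path", "Value": "/"},
                    {"Name": "InstanceType", "Value": "m5.2xlarge"}]},
    {"AlarmName": "alarm-2",
     "Dimensions": [{"Name": "InstanceType", "Value": "r5.2xlarge"},
                    {"Name": "fstype", "Value": "ext4"}]},
]}

for value in find_values(alarms, "Name", "InstanceType", "Value"):
    print(value)
```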
Here's my input json:
{
"channels": [
{ "id": 1, "name": "Pop"},
{ "id": 2, "name": "Rock"}
],
"links": [
{ "id": 2, "streams": [ {"url": "http://example.com/rock"} ] },
{ "id": 1, "streams": [ {"url": "http://example.com/pop"} ] }
]
}
This is what I want as an output:
"http://example.com/pop"
"Pop"
"http://example.com/rock"
"Rock"
So I need jq to replace .channels[].id with .links[].streams[0].url based on .links[].id
I don't know if it's right, but this is how I managed to output the urls:
(.channels[].id | tostring) as $ids | [.links[]] | map({(.id | tostring): .streams[0].url}) | add as $urls | $urls[$ids]
"http://example.com/pop"
"http://example.com/rock"
The question is, how do I add .channels[].name to it?
You sometimes have to be careful what you ask for, but this will produce the result you said you want:
.channels[] as $channel
| (.links[] | select(.id == $channel.id) | .streams[0].url),
$channel.name
Output for the given input:
"http://example.com/pop"
"Pop"
"http://example.com/rock"
"Rock"
Here is a solution which uses reduce and setpath to make a $urls lookup table from .links and then scans .channels generating corresponding urls and names.
(
reduce .links[] as $l (
{};
setpath([ $l.id|tostring ]; [$l.streams[].url])
)
) as $urls
| .channels[]
| $urls[ .id|tostring ][], .name
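The lookup-table approach translates directly to a dictionary in Python. This is a minimal sketch under the assumption that every stream has a url field; the function name urls_then_names is hypothetical. Unlike the jq filter, which raises an error when a channel id has no entry in the table, the sketch simply emits no URLs for such a channel.

```python
def urls_then_names(doc):
    """Emit each channel's stream URLs followed by its name (hypothetical helper)."""
    # Build the lookup table keyed by stringified id, like the reduce/setpath step.
    urls = {}
    for link in doc["links"]:
        urls[str(link["id"])] = [stream["url"] for stream in link["streams"]]

    output = []
    for channel in doc["channels"]:
        output.extend(urls.get(str(channel["id"]), []))  # all URLs first
        output.append(channel["name"])                   # then the name
    return output

doc = {
    "channels": [{"id": 1, "name": "Pop"}, {"id": 2, "name": "Rock"}],
    "links": [
        {"id": 2, "streams": [{"url": "http://example.com/rock"},
                              {"url": "http://example.com/hardrock"}]},
        {"id": 1, "streams": [{"url": "http://example.com/pop"}]},
    ],
}

for item in urls_then_names(doc):
    print(item)
```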
If multiple URLs are present in the "streams" attribute, the jq filter above prints them all before printing the name. For example, if the input is
{
"channels": [
{ "id": 1, "name": "Pop"},
{ "id": 2, "name": "Rock"}
],
"links": [
{ "id": 2, "streams": [ {"url": "http://example.com/rock"},
{"url": "http://example.com/hardrock"} ] },
{ "id": 1, "streams": [ {"url": "http://example.com/pop"} ] }
]
}
the output will be
"http://example.com/pop"
"Pop"
"http://example.com/rock"
"http://example.com/hardrock"
"Rock"