Group by and remove duplicates across arrays objects using JQ - json

Given the json, I need to group by key userName the object userClientDetailDTOList across all sites->buildings->floors and remove any duplicate mac addresses.
I have been able to do it using jq expression -
[.billingDetailPerSiteDTOList[].billingDetailPerBuildingDTOList[].billingDetailsPerFloorDTOList[].userClientDetailDTOList[] ] | group_by(.userName) | map((.[0]|del(.associatedMacs)) + { associatedMacs: (map(.associatedMacs[]) | unique) })
This groups by userName and also removes duplicate macs belonging to particular user. This results in a list as
[
{
"userName": "1",
"associatedMacs": [
"3:3:3:3:3:3",
"5:5:5:5:5:5"
]
},
{
"userName": "10",
"associatedMacs": [
"4:4:4:4:4:4",
"6:6:6:6:6:6"
]
},
{
"userName": "2",
"associatedMacs": [
"1:1:1:1:1:1",
"2:2:2:2:2:2"
]
},
{
"userName": "3",
"associatedMacs": [
"2:2:2:2:2:2"
]
}
]
Questions:
Can the expression be simplified?
How do I remove duplicate mac addresses across all users? The mac address 2:2:2:2:2:2 is repeated for users 2 and 3

The filter is practically as good as it can get. If you really wanted to, you could still change
del(.associatedMacs) to {userName} for a positive definition, and
(…) + {…} to {userName: …, associatedMacs: …} to avoid the addition,
resulting in
… | map({userName: (.[0].userName), associatedMacs: (map(.associatedMacs[]) | unique)})
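For comparison, the same group-and-deduplicate step can be sketched in Python. This is only an illustration under stated assumptions: the records list is made-up sample data standing in for the flattened userClientDetailDTOList entries, and merge_by_user is a hypothetical helper name, not part of jq or of the asker's code.

```python
from collections import defaultdict

def merge_by_user(records):
    """Group records by userName and keep each user's MACs once, sorted."""
    macs_by_user = defaultdict(set)
    for rec in records:
        macs_by_user[rec["userName"]].update(rec["associatedMacs"])
    # jq's group_by sorts groups by key and unique sorts values; mirror both.
    return [
        {"userName": user, "associatedMacs": sorted(macs)}
        for user, macs in sorted(macs_by_user.items())
    ]

# Hypothetical sample data (not the asker's real input).
records = [
    {"userName": "2", "associatedMacs": ["2:2:2:2:2:2", "1:1:1:1:1:1"]},
    {"userName": "1", "associatedMacs": ["3:3:3:3:3:3"]},
    {"userName": "2", "associatedMacs": ["2:2:2:2:2:2"]},
]
merged = merge_by_user(records)
print(merged)
```

Like the jq filter, this only removes duplicates within one user; a MAC shared by two users still appears under both.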
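As for the second question, if you treated the input as an INDEX on the MAC addresses, you could mostly reuse the code from earlier (of course, the unique part wouldn't be necessary anymore). INDEX keeps only the last entry seen for each key, so every MAC address ends up assigned to a single user: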
[INDEX(…; .associatedMacs[])[]] | group_by(.userName) | map(…)
[
{
"userName": "1",
"associatedMacs": [
"3:3:3:3:3:3",
"5:5:5:5:5:5"
]
},
{
"userName": "10",
"associatedMacs": [
"4:4:4:4:4:4",
"6:6:6:6:6:6"
]
},
{
"userName": "2",
"associatedMacs": [
"1:1:1:1:1:1"
]
},
{
"userName": "3",
"associatedMacs": [
"2:2:2:2:2:2"
]
}
]

Related

How to get output with unique values but only with last instance of a duplicate key - unique_by

I am currently working with jq to parse through some json. I would like to retrieve unique values based on a certain key. I came across unique_by. It does get unique values for the key name, but I am still not getting my desired output. From my understanding, unique_by looks at the value of the key name, uses the first instance, and then removes the duplicates that follow in the final output. However, I would like to grab the last duplicate for each name value and display that in the final output.
Below is an example of my desired output. Is it possible to do this with unique_by or what would be the best approach?
cat file.json
Original json:
[
{
"name": "app-fastly",
"tag": "20210825-95-448f024",
"image": "docker.io/repoxy/app-fastly:20210825-95-448f024"
},
{
"name": "app-lovely",
"tag": "20211004-2101-b6a256c",
"image": "ghcr.io/repox/app-lovely:20211004-2101-b6a256c"
},
{
"name": "app-lovely",
"tag": "20211007-6622-b3fooba",
"image": "ghcr.io/repoxy/app-lovely:20211007-6622-b3fooba"
},
{
"name": "app-dogwood",
"tag": "20210325-36-2a349e9",
"image": "docker.io/repoxy/app-dogwood:20210325-36-2a349e9"
}
]
Jq Command:
cat file.json | jq 'unique_by( {name} )'
Current Output:
[
{
"name": "app-dogwood",
"tag": "20210325-36-2a349e9",
"image": "docker.io/repoxy/app-dogwood:20210325-36-2a349e9"
},
{
"name": "app-fastly",
"tag": "20210825-95-448f024",
"image": "docker.io/repoxy/app-fastly:20210825-95-448f024"
},
{
"name": "app-lovely",
"tag": "20211004-2101-b6a256c",
"image": "ghcr.io/repox/app-lovely:20211004-2101-b6a256c"
}
]
Desired Output:
[
{
"name": "app-dogwood",
"tag": "20210325-36-2a349e9",
"image": "docker.io/repoxy/app-dogwood:20210325-36-2a349e9"
},
{
"name": "app-fastly",
"tag": "20210825-95-448f024",
"image": "docker.io/repoxy/app-fastly:20210825-95-448f024"
},
{
"name": "app-lovely",
"tag": "20211007-6622-b3fooba",
"image": "ghcr.io/repoxy/app-lovely:20211007-6622-b3fooba"
}
]
If you want the last unique item, simply reverse the array first
jq 'reverse | unique_by( {name} )'
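Note that unique_by returns its result sorted by the key, so the output is ordered by name rather than by input position. Appending another reverse afterwards only flips that to descending key order; it does not restore the original input order: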
jq 'reverse | unique_by( {name} ) | reverse'
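To make the "last one wins" behaviour concrete, here is a rough Python sketch of what reverse | unique_by( {name} ) computes. The helper name unique_by_last is hypothetical, and the image field is left out of the sample data for brevity.

```python
def unique_by_last(items, key):
    """Keep the last item seen for each key, returned sorted by key."""
    last = {}
    for item in items:
        last[key(item)] = item  # later items overwrite earlier ones
    return [last[k] for k in sorted(last)]

apps = [
    {"name": "app-fastly", "tag": "20210825-95-448f024"},
    {"name": "app-lovely", "tag": "20211004-2101-b6a256c"},
    {"name": "app-lovely", "tag": "20211007-6622-b3fooba"},
    {"name": "app-dogwood", "tag": "20210325-36-2a349e9"},
]
result = unique_by_last(apps, key=lambda a: a["name"])
print([(a["name"], a["tag"]) for a in result])
```

The result is ordered by name, as in the desired output from the question, and app-lovely carries the later tag.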

Compare 2 JSON-files and create a new key if values match

I have 2 sets of JSON-files looking like below, data-A.json and data-B.json.
I need to somehow compare the key URL in data-A.json with the same key in data-B.json. Where there is a match take data from the key Position in data-A.json and write to new key PreviousPosition in data-B.json. If there is no matching URL, write a null value for this new key in data-B.json
Please see examples:
data-A.json
[
{
"Position": "1",
"TrackName": "One hit wonder",
"URL": "https://domain.local/xyz123"
},
{
"Position": "2",
"TrackName": "Random song",
"URL": "https://domain.local/123qwe"
},
{
"Position": "3",
"TrackName": "Dueling banjos",
"URL": "https://domain.local/asd456"
}
]
data-B.json
[
{
"Position": "1",
"TrackName": "Rocket",
"URL": "https://domain.local/nbs678"
},
{
"Position": "2",
"TrackName": "Dueling banjos",
"URL": "https://domain.local/asd456"
},
{
"Position": "3",
"TrackName": "One hit wonder",
"URL": "https://domain.local/xyz123"
}
]
(desired) data-B.json
[
{
"Position": "1",
"TrackName": "Rocket",
"URL": "https://domain.local/nbs678",
"PreviousPosition": null
},
{
"Position": "2",
"TrackName": "Dueling banjos",
"URL": "https://domain.local/asd456",
"PreviousPosition": "3"
},
{
"Position": "3",
"TrackName": "One hit wonder",
"URL": "https://domain.local/xyz123",
"PreviousPosition": "1"
}
]
I have made some mediocre attempts to solve this using jq, with no luck. I also tried some PowerShell and Python, but I just can't figure it out.
Any suggestions?
If a straightforward, two-line solution is what you're looking for, then jq is a good choice:
(INDEX($A[]; .URL) | map_values(.Position)) as $dict
| map( .PreviousPosition = $dict[ .URL ] )
This is perhaps more straightforward than it looks, as the expression in the first line is a commonly found idiom (namely INDEX(...) | map_values(...)) for creating a dictionary. In the first line, it is assumed that $A holds the JSON in data-A.json.
The second line just applies the lookup rule specified in the question.
The only tricky bit here is getting the command-line invocation right. The following will suffice:
jq --argfile A data-A.json -f program.jq data-B.json
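where program.jq contains the above two-line program. Note that the jq manual discourages --argfile in favour of --slurpfile; with --slurpfile, $A is an array wrapping the file's contents, so the first line would start with INDEX($A[0][]; .URL) instead.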
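Since the question mentions Python as well, the same dictionary-lookup idea can be sketched there. This is a minimal illustration, not a replacement for the jq program: add_previous_position is a hypothetical helper name, and the two lists are the example data from the question written inline rather than read from data-A.json and data-B.json.

```python
import json

def add_previous_position(previous, current):
    """Return a copy of `current` with PreviousPosition looked up by URL."""
    # Same idea as INDEX(...) | map_values(.Position): a URL -> Position dict.
    position_by_url = {row["URL"]: row["Position"] for row in previous}
    return [
        # .get() returns None (JSON null) when the URL has no match.
        {**row, "PreviousPosition": position_by_url.get(row["URL"])}
        for row in current
    ]

data_a = [
    {"Position": "1", "TrackName": "One hit wonder", "URL": "https://domain.local/xyz123"},
    {"Position": "2", "TrackName": "Random song", "URL": "https://domain.local/123qwe"},
    {"Position": "3", "TrackName": "Dueling banjos", "URL": "https://domain.local/asd456"},
]
data_b = [
    {"Position": "1", "TrackName": "Rocket", "URL": "https://domain.local/nbs678"},
    {"Position": "2", "TrackName": "Dueling banjos", "URL": "https://domain.local/asd456"},
    {"Position": "3", "TrackName": "One hit wonder", "URL": "https://domain.local/xyz123"},
]
updated = add_previous_position(data_a, data_b)
print(json.dumps(updated, indent=2))
```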

jq - Find a JSON object based on one of its values and get another value from it

I've started using jq just very recently and I would like to know if something like this is even possible.
Example:
{
"name": "device",
"version": "1.0.0",
"address": [
{
"address": "10.1.2.3",
"interface": "wlan1_wifi"
},
{
"address": "10.1.2.5",
"interface": "wlan2_link"
},
{
"address": "10.1.2.4",
"interface": "ether1"
}
],
"wireless": [
{
"name": "wlan1_wifi",
"type": "5Ghz",
"ssid": "wifi"
},
{
"name": "wlan2_link",
"type": "2Ghz",
"ssid": "link"
}
]
}
Firstly let's transform the example to this json object:
cat json | jq '. | {"name": ."name", "version": ."version", "wireless": [."wireless"[] | {"name": ."name", "type": ."type", "ssid": ."ssid"}]}'
{
"name": "device",
"version": "1.0.0",
"wireless": [
{
"name": "wlan1_wifi",
"type": "5Ghz",
"ssid": "wifi"
},
{
"name": "wlan2_link",
"type": "2Ghz",
"ssid": "link"
}
]
}
Now there's a problem. I need to assign an address to the "wireless" array. The address is stored in "address" array.
So the question: is there a way of finding the right json object in "address" based on "name" (in wireless array) and "interface" (in address array) for every json object in "wireless" array and then assigning "address" to it?
The final result should look like this:
{
"name": "device",
"version": "1.0.0",
"wireless": [
{
"name": "wlan1_wifi",
"type": "5Ghz",
"ssid": "wifi",
"address": "10.1.2.3"
},
{
"name": "wlan2_link",
"type": "2Ghz",
"ssid": "link",
"address": "10.1.2.5"
}
]
}
Answer:
Here's my answer, based on the answer from @peak. Instead of copying the content of .wireless and then using map, I'm cherry-picking only the keys that I want to include. This also allows me to position "address" however I want.
(INDEX(.address[]; .interface)) as $dict
| {name: .name, version: .version,
wireless: [.wireless[] | {name, address: ($dict[.name]|.address), type, ssid}]}
The following produces the output as originally requested:
(.wireless[].name) as $name
| .address[]
| select(.interface == $name)
| { wireless: {name: $name, address}}
However the above filter could potentially produce more than one result, so you might want to make modifications accordingly.
Revised revised requirements
If your jq has INDEX/2 (which was only made available AFTER jq 1.5 was released), you can simply use it to create a lookup table:
(INDEX(.address[]; .interface)) as $dict
| {name,
version,
wireless: (.wireless
| map(. + {address: ($dict[.name]|.address) }) ) }
Or (depending perhaps on the exact requirements):
(INDEX(.address[]; .interface)) as $dict
| del(.address)
| .wireless |= map(. + {address: ($dict[.name]|.address) })
If your jq does not have INDEX/2, then you could easily adapt the above (using reduce), or even more easily snarf the def of INDEX/2 from https://github.com/stedolan/jq/blob/master/src/builtin.jq
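The lookup-table approach is not specific to jq. As a rough sketch of the same idea in Python (attach_addresses is a hypothetical helper name, and the device dict is the example from the question written inline):

```python
def attach_addresses(device):
    """Build an interface -> address table and add it to each wireless entry."""
    # Plays the role of INDEX(.address[]; .interface) plus the .address lookup.
    address_by_interface = {a["interface"]: a["address"] for a in device["address"]}
    return {
        "name": device["name"],
        "version": device["version"],
        "wireless": [
            # .get() gives None if no address entry matches the wireless name.
            {**w, "address": address_by_interface.get(w["name"])}
            for w in device["wireless"]
        ],
    }

device = {
    "name": "device",
    "version": "1.0.0",
    "address": [
        {"address": "10.1.2.3", "interface": "wlan1_wifi"},
        {"address": "10.1.2.5", "interface": "wlan2_link"},
        {"address": "10.1.2.4", "interface": "ether1"},
    ],
    "wireless": [
        {"name": "wlan1_wifi", "type": "5Ghz", "ssid": "wifi"},
        {"name": "wlan2_link", "type": "2Ghz", "ssid": "link"},
    ],
}
result = attach_addresses(device)
print(result)
```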

How to output keys on different levels if value found in array

Using jq, I would like to output multiple values on different levels of a JSON file based on whether they exist in an array.
My data looks like the following. It displays a number of hosts I examine regarding the people who have access to it:
[
{
"server": "example_1",
"version": "Debian8",
"keys": [
{
"fingerprint": "SHA256:fingerprint1",
"for_user": "root",
"name": "user1"
},
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "git",
"name": "user2"
}
]
},
{
"server": "example_2",
"version": "Debian9",
"keys": [
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "root",
"name": "user2"
},
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "www",
"name": "user2"
}
]
},
{
"server": "example_3",
"version": "CentOS",
"keys": [
null
]
}
]
I want to extract the value of server and the value of for_user for every occurrence where user2 is found as a name in .keys[]. Basically, the output could look like this:
example_1, git
example_2, root
example_2, www
What I can already do is displaying the first column, so the .server value:
cat test.json | jq -r '.[] | select(.keys[].name | index("user2")) | .server'
How could I also print a value in the selected array element?
You can use the following jq command:
jq -r '.[]|"\(.server), \(.keys[]|select(.name=="user2").for_user)"'
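To spell out what that filter does, here is a rough Python sketch of the same traversal. It is an illustration only: servers_for_user is a hypothetical helper name, and the hosts list is a trimmed copy of the question's data (the version field is omitted).

```python
def servers_for_user(hosts, name):
    """Yield (server, for_user) for every key entry whose name matches."""
    for host in hosts:
        for key in host.get("keys") or []:
            # Entries may be null (None), as for the example_3 host.
            if key and key.get("name") == name:
                yield host["server"], key["for_user"]

hosts = [
    {"server": "example_1", "keys": [
        {"fingerprint": "SHA256:fingerprint1", "for_user": "root", "name": "user1"},
        {"fingerprint": "SHA256:fingerprint2", "for_user": "git", "name": "user2"},
    ]},
    {"server": "example_2", "keys": [
        {"fingerprint": "SHA256:fingerprint2", "for_user": "root", "name": "user2"},
        {"fingerprint": "SHA256:fingerprint2", "for_user": "www", "name": "user2"},
    ]},
    {"server": "example_3", "keys": [None]},
]
lines = [f"{server}, {for_user}" for server, for_user in servers_for_user(hosts, "user2")]
print("\n".join(lines))
```

One line is produced per matching key entry, so a server appears as many times as user2 has keys on it.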

parsing JSON with jq to return value of element where another element has a certain value

I have some JSON output I am trying to parse with jq. I read some examples on filtering, but I don't really understand them, and my output is more complicated than the examples. I have no idea where to even begin beyond jq '.[]', as I don't understand the syntax of jq beyond that, and the hierarchy and terminology are challenging as well. My JSON output is below. I want to return the value of Valid where the ItemName equals Item_2. How can I do this?
"1"
[
{
"GroupId": "1569",
"Title": "My_title",
"Logo": "logo.jpg",
"Tags": [
"tag1",
"tag2",
"tag3"
],
"Owner": [
{
"Name": "John Doe",
"Id": "53335"
}
],
"ItemId": "209766",
"Item": [
{
"Id": 47744,
"ItemName": "Item_1",
"Valid": false
},
{
"Id": 47872,
"ItemName": "Item_2",
"Valid": true
},
{
"Id": 47872,
"ItemName": "Item_3",
"Valid": false
}
]
}
]
"Browse"
"8fj9438jgge9hdfv0jj0en34ijnd9nnf"
"v9er84n9ogjuwheofn9gerinneorheoj"
Except for the initial and trailing JSON scalars, you'd simply write:
.[] | .Item[] | select( .ItemName == "Item_2" ) | .Valid
In your particular case, to ensure the top-level JSON scalars are ignored, you could prefix the above with:
arrays |
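For readers more comfortable with Python, the same selection can be sketched as follows. This is an illustration under stated assumptions: valid_flags is a hypothetical helper name, and documents is a trimmed stand-in for the stream of top-level JSON values in the question (only the fields the filter touches are kept).

```python
def valid_flags(documents, item_name):
    """Yield Valid for items named `item_name`, skipping non-array documents."""
    for doc in documents:
        if not isinstance(doc, list):  # plays the role of jq's `arrays`
            continue
        for group in doc:
            for item in group.get("Item", []):
                if item.get("ItemName") == item_name:
                    yield item["Valid"]

# Trimmed stand-in for the question's output: scalars around one array.
documents = [
    "1",
    [
        {
            "GroupId": "1569",
            "Item": [
                {"Id": 47744, "ItemName": "Item_1", "Valid": False},
                {"Id": 47872, "ItemName": "Item_2", "Valid": True},
                {"Id": 47872, "ItemName": "Item_3", "Valid": False},
            ],
        }
    ],
    "Browse",
]
flags = list(valid_flags(documents, "Item_2"))
print(flags)
```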