JQ - Remove Duplicate Array Values


Edited for better clarity
I am using the following jq query to extract the AWS ARN and its associated protocols. However, I only need the ARN to be listed once, followed by the ports and protocols.
My code is jq -r '.Listeners[] | (.LoadBalancerArn), (.Protocol)' and the results are:
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTPS"
I have tried everything, including unique, first, unique_by, select, contains, etc., and the result is always an error such as "Cannot iterate over string" (or number).
Desired results
"arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde"
"HTTP"
"HTTP"
"HTTPS"
Sample JSON
{
"Listeners": [
{
"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
"Port": 9090,
"Protocol": "HTTP"
},
{
"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
"Port": 80,
"Protocol": "HTTP"
},
{
"LoadBalancerArn": "arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde",
"Port": 443,
"Protocol": "HTTPS"
}
]
}

Group by the common field and iterate over the groups, then output the common field of the first (which is the same for the whole group), and iterate again to output other fields from the same group:
jq -r '.Listeners | group_by(.LoadBalancerArn)[]
| .[0].LoadBalancerArn, .[].Protocol'
arn:aws:elasticloadbalancing:us-xxxx-1:123456789:loadbalancer/app/msword-123456789/20b73abcde
HTTP
HTTP
HTTPS

unique works on an array, so you'll need to create one with all the LoadBalancerArn first, then call unique and get the first of the remaining array:
.Listeners | map(.LoadBalancerArn) | unique | first
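As a minimal sketch of what unique returns here, assuming made-up ARNs (the data and the variable names listeners and arns are illustrative only): unique yields the distinct values in sorted order, so a trailing first would keep just one of them when several load balancers are present.

```shell
# Hypothetical sample with two different ARNs, one of them repeated.
listeners='{"Listeners":[
  {"LoadBalancerArn":"arn:example:B","Protocol":"HTTP"},
  {"LoadBalancerArn":"arn:example:A","Protocol":"HTTP"},
  {"LoadBalancerArn":"arn:example:B","Protocol":"HTTPS"}
]}'

# Build an array of ARNs first, then de-duplicate it; unique also sorts.
arns=$(printf '%s' "$listeners" | jq -c '.Listeners | map(.LoadBalancerArn) | unique')
printf '%s\n' "$arns"
```

Appending | first to the filter would reduce this to the single value arn:example:A.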

Does this produce what you expected?
jq -r '.Listeners |
group_by(.LoadBalancerArn)[] |
first |
"\(.LoadBalancerArn) \(.Protocol)"
' input.json
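Note that first keeps only one listener per group, so the filter above prints a single protocol per ARN. If every port and protocol should follow the ARN on the same line, a sketch along these lines might work; the sample data, the "port/protocol" layout, and the variable names listeners and summary are assumptions made for illustration.

```shell
# Hypothetical sample: one ARN with a single listener, one with two.
listeners='{"Listeners":[
  {"LoadBalancerArn":"arn:example:BLUE","Port":9090,"Protocol":"HTTP"},
  {"LoadBalancerArn":"arn:example:GOLD","Port":80,"Protocol":"HTTP"},
  {"LoadBalancerArn":"arn:example:GOLD","Port":443,"Protocol":"HTTPS"}
]}'

summary=$(printf '%s' "$listeners" | jq -r '
  .Listeners
  | group_by(.LoadBalancerArn)[]     # one array of listeners per ARN
  | "\(.[0].LoadBalancerArn) \(map("\(.Port)/\(.Protocol)") | join(", "))"
')
printf '%s\n' "$summary"
```

With this input the result is one line per ARN: "arn:example:BLUE 9090/HTTP" and "arn:example:GOLD 80/HTTP, 443/HTTPS".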

You say you want the ARN listed once, followed by the ports and protocols. You don't give such an example, so it's unclear whether there might be multiple different ARNs between the items in the array of listeners. Let's assume that there might well be multiple different ARNs, and for that reason I'll use slightly different test data:
{
"Listeners": [
{
"LoadBalancerArn": "arn:example:BLUE",
"Port": 9090,
"Protocol": "HTTP"
},
{
"LoadBalancerArn": "arn:example:GOLD",
"Port": 80,
"Protocol": "HTTP"
},
{
"LoadBalancerArn": "arn:example:GOLD",
"Port": 443,
"Protocol": "HTTPS"
}
]
}
group_by is the jq function for collecting together all the items in an array that share a particular value for some expression. So you could use
.Listeners|group_by(.LoadBalancerArn)
to get an array of arrays of objects, where all of the objects in each inner array have the same value of LoadBalancerArn:
[
[
{
"LoadBalancerArn": "arn:example:BLUE",
"Port": 9090,
"Protocol": "HTTP"
}
],
[
{
"LoadBalancerArn": "arn:example:GOLD",
"Port": 80,
"Protocol": "HTTP"
},
{
"LoadBalancerArn": "arn:example:GOLD",
"Port": 443,
"Protocol": "HTTPS"
}
]
]
From there you can safely pick out the ARN from the first object in each list, knowing that the rest must have the same value:
.Listeners | group_by(.LoadBalancerArn)[] | (first | {LoadBalancerArn}) + {Listeners: map(del(.LoadBalancerArn))}
{
"LoadBalancerArn": "arn:example:BLUE",
"Listeners": [
{
"Port": 9090,
"Protocol": "HTTP"
}
]
}
{
"LoadBalancerArn": "arn:example:GOLD",
"Listeners": [
{
"Port": 80,
"Protocol": "HTTP"
},
{
"Port": 443,
"Protocol": "HTTPS"
}
]
}

Related

Group by and remove duplicates across arrays objects using JQ

Given the JSON, I need to group the userClientDetailDTOList objects by the key userName across all sites -> buildings -> floors and remove any duplicate MAC addresses.
I have been able to do it using this jq expression:
[.billingDetailPerSiteDTOList[].billingDetailPerBuildingDTOList[].billingDetailsPerFloorDTOList[].userClientDetailDTOList[] ] | group_by(.userName) | map((.[0]|del(.associatedMacs)) + { associatedMacs: (map(.associatedMacs[]) | unique) })
This groups by userName and also removes duplicate MACs belonging to a particular user. It results in a list like this:
[
{
"userName": "1",
"associatedMacs": [
"3:3:3:3:3:3",
"5:5:5:5:5:5"
]
},
{
"userName": "10",
"associatedMacs": [
"4:4:4:4:4:4",
"6:6:6:6:6:6"
]
},
{
"userName": "2",
"associatedMacs": [
"1:1:1:1:1:1",
"2:2:2:2:2:2"
]
},
{
"userName": "3",
"associatedMacs": [
"2:2:2:2:2:2"
]
}
]
Questions:
Can the expression be simplified?
How do I remove duplicate MAC addresses across all users? The MAC address 2:2:2:2:2:2 is repeated for users 2 and 3.

The filter is practically as good as it can get. If you really wanted to, you could still change
del(.associatedMacs) to {userName} for a positive definition, and
(…) + {…} to {userName: …, associatedMacs: …} to avoid the addition,
resulting in
… | map({userName: (.[0].userName), associatedMacs: (map(.associatedMacs[]) | unique)})
As for the second question, if you treated the input as an INDEX on the MAC addresses, you could mostly reuse the code from earlier (of course, the unique part wouldn't be necessary anymore):
[INDEX(…; .associatedMacs[])[]] | group_by(.userName) | map(…)
[
{
"userName": "1",
"associatedMacs": [
"3:3:3:3:3:3",
"5:5:5:5:5:5"
]
},
{
"userName": "10",
"associatedMacs": [
"4:4:4:4:4:4",
"6:6:6:6:6:6"
]
},
{
"userName": "2",
"associatedMacs": [
"1:1:1:1:1:1"
]
},
{
"userName": "3",
"associatedMacs": [
"2:2:2:2:2:2"
]
}
]
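To make the INDEX idea concrete, here is a minimal sketch on a made-up two-user input (the data and the variable names users and deduped are illustrative, not taken from the question). INDEX builds an object keyed by MAC address, so when two users share a MAC, the row seen last overwrites the earlier one. It assumes a jq version that ships INDEX (1.6 or later, as far as I know).

```shell
# Hypothetical flat list of users; 2:2:2:2:2:2 appears under both.
users='[
  {"userName": "2", "associatedMacs": ["1:1:1:1:1:1", "2:2:2:2:2:2"]},
  {"userName": "3", "associatedMacs": ["2:2:2:2:2:2"]}
]'

deduped=$(printf '%s' "$users" | jq -c '
  INDEX(.[]; .associatedMacs[])      # object keyed by MAC; a later row replaces an earlier one
  | to_entries
  | map({userName: .value.userName, mac: .key})
  | group_by(.userName)
  | map({userName: .[0].userName, associatedMacs: (map(.mac) | sort)})
')
printf '%s\n' "$deduped"
```

Here user 2 keeps only 1:1:1:1:1:1 and user 3 keeps 2:2:2:2:2:2, which matches the shape of the output shown above. Which user ends up owning a shared MAC depends purely on input order, so a different tie-breaking rule would need a different filter.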

Using `jq` to add key/value to a json file using another json file as a source

Been struggling with this for a while and I'm no closer to a solution. I'm not very experienced using jq.
I'd like to take the values from one json file and add them to another file when other values in the dict match. The example files below demonstrate what I'd like more clearly than an explanation.
hosts.json:
{
"hosts": [
{
"host": "hosta.example.com",
"hostid": "101",
"proxy_hostid": "1"
},
{
"host": "hostb.example.com",
"hostid": "102",
"proxy_hostid": "1"
},
{
"host": "hostc.example.com",
"hostid": "103",
"proxy_hostid": "2"
}
]
}
proxies.json:
{
"proxies": [
{
"host": "proxy1.example.com",
"proxyid": "1"
},
{
"host": "proxy2.example.com",
"proxyid": "2"
}
]
}
I also have the above file available with proxyid as the key, if this makes it easier:
{
"proxies": {
"1": {
"host": "proxy1.example.com",
"proxyid": "1"
},
"2": {
"host": "proxy2.example.com",
"proxyid": "2"
}
}
}
Using these json files above (from the Zabbix API), I'd like to add the value of .proxies[].host (from proxies.json) as .hosts[].proxy_host (to hosts.json).
This would only be when .hosts[].proxy_hostid equals .proxies[].proxyid
Desired output:
{
"hosts": [
{
"host": "hosta.example.com",
"hostid": "101",
"proxy_hostid": "1",
"proxy_host": "proxy1.example.com"
},
{
"host": "hostb.example.com",
"hostid": "102",
"proxy_hostid": "1",
"proxy_host": "proxy1.example.com"
},
{
"host": "hostc.example.com",
"hostid": "103",
"proxy_hostid": "2",
"proxy_host": "proxy2.example.com"
}
]
}
I've tried many different ways of doing this, and think I need to use jq -s or jq --slurpfile, but I've reached a lot of dead-ends and can't find a solution.
jq 'input as $p | map(.[].proxy_host = $p.proxies[].proxyid)' hosts.json proxies.json
I think I would need something like this as well, but not sure how to use it.
if .hosts[].proxy_hostid == .proxies[].proxyid then .hosts[].proxy_host = .proxies[].host else empty end'
I've found these questions but they haven't helped :(
How do I use a value as a key reference in jq? <- I think this one is the closest
Lookup values from one JSON file and replace in another
Using jq find key/value pair based on another key/value pair

This indeed is easier with the alternative version of your proxies.json. All you need is to store proxies in a variable as reference, and retrieve proxy hosts from it while updating hosts.
jq 'input as { $proxies } | .hosts[] |= . + { proxy_host: $proxies[.proxy_hostid].host }' hosts.json proxies.json
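If only the array form of proxies.json is available, a lookup object can be built on the fly. The following is a sketch under a few assumptions: the JSON is held in shell variables instead of files so the example is self-contained, the names hosts, proxies, joined, $p and $by_id are illustrative, and INDEX requires jq 1.6 or later as far as I know.

```shell
# Trimmed-down copies of the two documents from the question.
hosts='{"hosts":[
  {"host":"hosta.example.com","proxy_hostid":"1"},
  {"host":"hostc.example.com","proxy_hostid":"2"}
]}'
proxies='{"proxies":[
  {"host":"proxy1.example.com","proxyid":"1"},
  {"host":"proxy2.example.com","proxyid":"2"}
]}'

joined=$(printf '%s' "$hosts" | jq -c --argjson p "$proxies" '
  INDEX($p.proxies[]; .proxyid) as $by_id     # turn the array into an object keyed by proxyid
  | .hosts[] |= . + {proxy_host: $by_id[.proxy_hostid].host}
')
printf '%s\n' "$joined"
```

When working with the files directly, --slurpfile p proxies.json could replace --argjson; it binds $p to an array of the documents in the file, so the lookup would start from $p[0].proxies[] instead.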

JQ if then statement scope

I'd like to use JQ to grab only the sub-records that match an if-then statement. When I use
jq 'if .services[].banner == "FQMDAAICCg==" then .services[].port else empty end'
it grabs all of the ports for the record. (There are multiple services under each record, and I want to restrict my then statement to only the service in whose scope the if condition actually matched.)
How do I just get the port, banner, etc. for the specific service underneath the record which hit my condition?
example:
{
"services": [
{
"tls_detected": false,
"banner_is_raw": true,
"transport_protocol": "tcp",
"banner": "PCFET0NUWVBFIEhU",
"certificate": null,
"timestamp": "2020-03-22T00:38:01.074Z",
"protocol": null,
"port": 4444
},
{
"tls_detected": false,
"banner_is_raw": true,
"transport_protocol": "tcp",
"banner": "SFRUUC8xLjEgMzA",
"certificate": null,
"timestamp": "2020-03-19T01:39:45.288Z",
"protocol": null,
"port": 8080
},
{
"tls_detected": false,
"banner_is_raw": true,
"transport_protocol": "tcp",
"banner": "FQMDAAICCg==",
"certificate": null,
"timestamp": "2020-03-19T01:39:45.288Z",
"protocol": null,
"port": 8085
},
{
"tls_detected": false,
"banner_is_raw": false,
"transport_protocol": "tcp",
"banner": "Q2FjaGUtQ29ud",
"certificate": null,
"timestamp": "2020-03-20T04:25:24Z",
"protocol": "http",
"port": 8080
}
],
"ip": "103.238.62.68",
"autonomous_system": {
"description": "CHAPTECH-AS-AP Chaptech Pty Ltd",
"asn": 133493,
"routed_prefix": "103.238.62.0/24",
"country_code": "AU",
"name": "CHAPTECH-AS-AP Chaptech Pty Ltd",
"path": [
11164,
3491,
63956,
7594,
7594,
7594,
7594,
133493
]
},
"location": {
"country_code": "AU",
"registered_country": "Australia",
"registered_country_code": "AU",
"continent": "Oceania",
"timezone": "Australia/Sydney",
"latitude": -33.494,
"longitude": 143.2104,
"country": "Australia"
}
}
Update:
Thanks to peak but I couldn't get the additional goals bit working below. I ended up using
jq 'select(.services[].banner == "FQMDAAICCg==") | {port: .services[].port, banner: .services[].banner, ip: .ip}' censys.json | jq 'if .banner == "FQMDAAICCg==" then .ip,.port else empty end'
which is ugly but did the trick and still allowed me to stream the data to the first filter.

Original question
How do I just get the port, banner, etc. for the specific service underneath the record which hit my condition?
To get just the "port" for the service matching the condition, you could modify your query:
.services[]
| if .banner == "FQMDAAICCg==" then .port else empty end
Equivalently:
.services[]
| select(.banner == "FQMDAAICCg==")
| .port
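The reason the original query returns every port is that the condition and the then branch each iterate over .services independently: as soon as the condition is true for one service, .services[].port emits the ports of all of them. A small sketch on a trimmed, two-service record (the record and the variable names all_ports and matched are illustrative) shows the difference:

```shell
# Reduced version of the sample record: two services, one matching banner.
record='{"ip":"103.238.62.68","services":[
  {"banner":"PCFET0NUWVBFIEhU","port":4444},
  {"banner":"FQMDAAICCg==","port":8085}
]}'

# Condition and branch both expand .services[]: all ports come back once any banner matches.
all_ports=$(printf '%s' "$record" | jq -c '[if .services[].banner == "FQMDAAICCg==" then .services[].port else empty end]')

# Iterate once, then test and read the port inside the scope of that one service.
matched=$(printf '%s' "$record" | jq -c '[.services[] | select(.banner == "FQMDAAICCg==") | .port]')

printf '%s\n%s\n' "$all_ports" "$matched"
```

The first filter yields [4444,8085], the second only [8085].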
Additional goal
I want to end up in this example with '8085' + '103.238.62.68'
If you really want the two values in that format, you could write something along the following lines, invoking jq with the -r option:
.ip as $ip
| (.services[] | select(.banner == "FQMDAAICCg==") | .port) as $port
| "'\($port)' + '\($ip)'"
or more briefly but less readably:
"'\(.services[] | select(.banner == "FQMDAAICCg==") | .port)' + '\(.ip)'"

How To Combine Two Json Values

I have a text file with several hundred JSON entries. The first two entries look like this:
{
"ip": "127.0.0.1",
"timestamp": "1565343832",
"ports": [{
"port": 80,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 245
}]
}
{
"ip": "127.0.0.2",
"timestamp": "1565343837",
"ports": [{
"port": 81,
"proto": "tcp",
"status": "open",
"reason": "syn-ack",
"ttl": 43
}]
}
I would like to parse this JSON file, combine the values for ip and port, and put them in a text file, so that the text file has entries like this:
127.0.0.1:80
127.0.0.2:81
How can this be done?

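Here is a C++ example that uses boost::property_tree to read and parse the JSON data. It assumes the entries have already been split into one JSON document per string (write_entries and entries are illustrative names):
#include <boost/property_tree/json_parser.hpp>
#include <boost/property_tree/ptree.hpp>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Assumes the caller has already split the input into one JSON document per string.
void write_entries(const std::vector<std::string>& entries) {
    std::ofstream outfile("entries.txt");
    for (const std::string& entry : entries) {
        std::stringstream ss(entry);
        boost::property_tree::ptree pt;
        boost::property_tree::read_json(ss, pt);
        const std::string ip = pt.get<std::string>("ip");
        // "ports" is an array, so visit each element instead of using the path "ports.port".
        for (const auto& item : pt.get_child("ports")) {
            outfile << ip << ":" << item.second.get<std::string>("port") << "\n";
        }
    }
}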
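Since the rest of this page deals with jq, the same result could also be sketched as a one-line filter. jq reads a stream of concatenated JSON objects one at a time, so no manual splitting is needed. The variable names scan and pairs are illustrative, and the entries are shortened copies of the two from the question.

```shell
# Two concatenated JSON documents, as in the question (fields trimmed).
scan='{"ip":"127.0.0.1","timestamp":"1565343832","ports":[{"port":80,"proto":"tcp"}]}
{"ip":"127.0.0.2","timestamp":"1565343837","ports":[{"port":81,"proto":"tcp"}]}'

# One output line per (ip, port) pair; an entry with several ports yields several lines.
pairs=$(printf '%s\n' "$scan" | jq -r '"\(.ip):\(.ports[].port)"')
printf '%s\n' "$pairs"
```

With the data in a file, the filter could be run against the file and redirected into a text file such as entries.txt.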

How to output keys on different levels if value found in array

Using jq, I would like to output multiple values on different levels of a JSON file based on whether they exist in an array.
My data looks like the following. It displays a number of hosts I examine regarding the people who have access to it:
[
{
"server": "example_1",
"version": "Debian8",
"keys": [
{
"fingerprint": "SHA256:fingerprint1",
"for_user": "root",
"name": "user1"
},
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "git",
"name": "user2"
}
]
},
{
"server": "example_2",
"version": "Debian9",
"keys": [
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "root",
"name": "user2"
},
{
"fingerprint": "SHA256:fingerprint2",
"for_user": "www",
"name": "user2"
}
]
},
{
"server": "example_3",
"version": "CentOS",
"keys": [
null
]
}
]
I want to extract the value of server and the value of for_user for every occurrence where user2 is found as a name in .keys[]. Basically, the output could look like this:
example_1, git
example_2, root
example_2, www
What I can already do is displaying the first column, so the .server value:
cat test.json | jq -r '.[] | select(.keys[].name | index("user2")) | .server'
How could I also print a value in the selected array element?

You can use the following jq command:
jq -r '.[]|"\(.server), \(.keys[]|select(.name=="user2").for_user)"'
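An equivalent and perhaps easier to extend form binds the server name to a variable before descending into .keys[], so that further fields of the matching key can be added to the output line. This is a sketch on a trimmed copy of the sample data; the variable names hosts, access and $server are illustrative.

```shell
# Trimmed copy of the question's data, including the host whose keys array holds null.
hosts='[
  {"server":"example_1","keys":[{"for_user":"root","name":"user1"},{"for_user":"git","name":"user2"}]},
  {"server":"example_2","keys":[{"for_user":"root","name":"user2"},{"for_user":"www","name":"user2"}]},
  {"server":"example_3","keys":[null]}
]'

access=$(printf '%s' "$hosts" | jq -r '
  .[]
  | .server as $server               # remember the outer field
  | .keys[]
  | select(.name? == "user2")        # a null key has no name and is skipped
  | "\($server), \(.for_user)"
')
printf '%s\n' "$access"
```

The null entry under example_3 produces no line, because its name evaluates to null and the comparison fails.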
