I am trying to get a value from an array by matching a value in a child array, but everything I try returns either nothing or all of the members of the parent array. I only want the info from the parent where the child matches.
Specifically, I want to list all of the AWS security groups that have port 22 rules in them.
This is a reduced sample output from the aws command line that I am trying to parse:
{
"SecurityGroups": [
{
"Description": "ssh and web group",
"IpPermissions": [
{
"FromPort": 22,
"ToPort": 22
},
{
"FromPort": 80,
"ToPort": 80
}
],
"GroupName": "ssh-web",
"GroupId": "sg-11111111"
},
{
"Description": "https group",
"IpPermissions": [
{
"FromPort": 443,
"ToPort": 443
},
{
"FromPort": 8443,
"ToPort": 8443
}
],
"GroupName": "https",
"GroupId": "sg-22222222"
}
]
}
I have tried this:
aws ec2 describe-security-groups |
jq '.SecurityGroups[] as $top |
.SecurityGroups[].IpPermissions[] |
select(.FromPort == 22) |
$top'
and this:
aws ec2 describe-security-groups |
jq '. as $top |
.SecurityGroups[].IpPermissions[] |
select(.FromPort == 22) |
$top'
Both commands show both of the top-level array entries instead of just one containing the port 22 entry; they just show the entire output from the aws command.
The person who answered this question below specifically refers to the potential scoping problem that I am actually having, but his brief description of how to deal with it isn't enough for me to understand:
jq - How do I print a parent value of an object when I am already deep into the object's children?
I want to see this:
GroupName: "ssh-web"
GroupId: "sg-11111111"
I don't think I fully understand how using 'as' works, which may be my stumbling block.
Don't descend into the children if you need the parent.
.SecurityGroups[]
| select(any(.IpPermissions[]; .FromPort == 22))
| .GroupName, .GroupId
should work.
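To check the filter against the sample from the question, here is a minimal sketch. The shell variable names sample and result are only for illustration, and the JSON is trimmed to the fields the filter reads.

```shell
# Trimmed copy of the sample output from the question.
sample='{
  "SecurityGroups": [
    {"GroupName": "ssh-web", "GroupId": "sg-11111111",
     "IpPermissions": [{"FromPort": 22, "ToPort": 22}, {"FromPort": 80, "ToPort": 80}]},
    {"GroupName": "https", "GroupId": "sg-22222222",
     "IpPermissions": [{"FromPort": 443, "ToPort": 443}, {"FromPort": 8443, "ToPort": 8443}]}
  ]
}'

# Stay at the group level: any/2 only tests the children,
# so "." is still the whole group when the fields are printed.
result=$(printf '%s' "$sample" | jq -r '
  .SecurityGroups[]
  | select(any(.IpPermissions[]; .FromPort == 22))
  | .GroupName, .GroupId')

printf '%s\n' "$result"
```

This should print ssh-web and sg-11111111. As for the two attempts in the question: ". as $top" binds the whole document, so every match re-emits everything, and ".SecurityGroups[] as $top" binds each group in turn but then searches all groups again from ".", so each $top is emitted as long as any group has a port 22 rule.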
Related
When I run jq to parse a JSON document from the AWS CLI, I have the following problem.
I'm extracting the IP address and a tag called "Environment". When the Environment tag does not exist on an instance, I get no result for that instance at all.
Here's an example of the relevant output returned by the AWS CLI
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
I’m running the following command
aws ec2 describe-instances --filters "Name=tag:Name,Values=Balance-OTA-SS_a" | jq -c '.Reservations[].Instances[] | ({IP: .PrivateIpAddress, Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)})'
## output
empty
How do I show the IP address in the output of the command even if the Environment tag does not exist?
Regards,
Let's assume this input:
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
This is the format returned by describe-instances, but with all the irrelevant fields removed.
Note that Tags is always a list of objects, each of which has a Key and a Value. This format is perfect for from_entries, which can transform this list of tags into a convenient mapping object. Try this:
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags|from_entries.Environment)
}
{"IP":"10.0.0.1","Ambiente":"alpha"}
{"IP":"10.0.0.2","Ambiente":null}
That answers how to do it. But you probably want to understand why your approach didn't work.
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)
}
The .[] filter you're using on the tags can return zero or multiple results. Similarly, the select filter can eliminate some or all items. When you apply this inside an object constructor (the expression from { to }), you're causing that whole object to be created a variable number of times. You need to be very careful where you use these filters, because often that's not what you want at all. Often you instead want to do one of the following:
Wrap the expression that returns multiple results in an array constructor [ ... ]. That way instead of outputting the parent object potentially zero or multiple times, you output it once containing an array that potentially has zero or multiple items. E.g.
[.Tags[]|select(.Key=="Environment")]
Apply map to the array to keep it an array but process its contents, e.g.
.Tags|map(select(.Key=="Environment"))
Apply first(expr) to capture only the first value emitted by the expression. If the expression might emit zero items, you can use the comma operator to provide a default, e.g.
first((.Tags[]|select(.Key=="Environment")),null)
Apply some other array-level function, such as from_entries.
.Tags|from_entries.Environment
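The difference between a generator inside the object constructor and one of the alternatives above can be seen side by side. This is a minimal sketch on a trimmed copy of the input; the shell variable names sample, dropped and kept are made up for the example.

```shell
# Two instances; only the first carries an Environment tag.
sample='{"Reservations":[
  {"Instances":[{"PrivateIpAddress":"10.0.0.1",
    "Tags":[{"Key":"Name","Value":"Balance-OTA-SS_a"},{"Key":"Environment","Value":"alpha"}]}]},
  {"Instances":[{"PrivateIpAddress":"10.0.0.2",
    "Tags":[{"Key":"Name","Value":"Balance-OTA-SS_a"}]}]}
]}'

# Generator inside the object constructor:
# the second instance yields zero values, so no object is built for it.
dropped=$(printf '%s' "$sample" | jq -c '
  .Reservations[].Instances[]
  | {IP: .PrivateIpAddress,
     Ambiente: (.Tags[] | select(.Key == "Environment") | .Value)}')

# first(...) with a null fallback: exactly one object per instance.
kept=$(printf '%s' "$sample" | jq -c '
  .Reservations[].Instances[]
  | {IP: .PrivateIpAddress,
     Ambiente: first((.Tags[] | select(.Key == "Environment") | .Value), null)}')

printf 'generator:\n%s\nfirst + default:\n%s\n' "$dropped" "$kept"
```

The first command should print only the 10.0.0.1 object, while the second prints one object per instance, with Ambiente set to null for 10.0.0.2.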
You can either use an if ... then ... else ... end construct, or //. For example:
.Reservations[].Instances[]
| {IP: .PrivateIpAddress} +
({Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)}
// null)
Given a json such as:
{
"clusters": [
{
"domain": "crap1",
"name": "BB1",
"nodes": [
{
"gpu": null,
"node": "bb1-1",
"role": "worker"
},
{
"gpu": {
"P40": 2
},
"node": "bb1-2",
"role": "master"
}
],
"site": "B-place",
"hardware": "prod-2",
"timezone": "US/Eastern",
"type": "CCE",
"subtype": null
}
]
}
where there are actually many more clusters, I want to search the JSON for a node, for example bb1-2, and print the name of the cluster it belongs to (BB1).
I know I can search for that node with:
.clusters[] | .nodes[] | select(.node == "bb1-2")
but I can't figure out how to print a value from a higher level.
In addition to the other approaches, a very general way to hold on to higher level context is to bind it to a variable.
jq '
.clusters[] |
. as $cluster |
.nodes[] |
select(.node == "bb1-2") |
{cluster_name:$cluster.name, node:.}
'
{
"cluster_name": "BB1",
"node": {
"gpu": {
"P40": 2
},
"node": "bb1-2",
"role": "master"
}
}
This makes sure you know both the cluster and the matching node itself, and avoids the confusion that arises if your select condition matches the same cluster more than once.
How about
.clusters[] | select(.nodes[].node == "bb1-2").name
Try it:
JQ play
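One caveat with putting a generator inside select is that the cluster is emitted once per matching node. The sketch below illustrates this on a trimmed input; the second cluster (BB2) and the role-based condition are hypothetical additions, used only to force more than one match per cluster.

```shell
# BB1 is trimmed from the question; BB2 is a made-up cluster with two workers.
sample='{"clusters":[
  {"name":"BB1","nodes":[{"node":"bb1-1","role":"worker"},{"node":"bb1-2","role":"master"}]},
  {"name":"BB2","nodes":[{"node":"bb2-1","role":"worker"},{"node":"bb2-2","role":"worker"}]}
]}'

# Node names are unique here, so the short form gives one result.
by_node=$(printf '%s' "$sample" | jq -r '.clusters[] | select(.nodes[].node == "bb1-2").name')

# A condition that matches two nodes in BB2 emits BB2 twice.
dup=$(printf '%s' "$sample" | jq -r '.clusters[] | select(.nodes[].role == "worker").name')

# any/2 collapses the per-node results into a single boolean per cluster.
once=$(printf '%s' "$sample" | jq -r '.clusters[] | select(any(.nodes[]; .role == "worker")).name')

printf 'by node: %s\nduplicated:\n%s\nwith any:\n%s\n' "$by_node" "$dup" "$once"
```

If node names are guaranteed unique, the short form is fine; otherwise any(...) keeps the output to one line per cluster.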
Sorry if this sounds too simple, but I am still learning and have spent a few hours trying to find a solution. I have a large JSON file, and I would like to search for a specific value in one object and return a value from another object.
For example, from the data below, I would like to find all objects whose unique_number matches "123456" and return this value along with the IP address.
jq should return something like: 123456, 127.0.0.1
Since the file is going to be about 300 MB with many IP addresses, will there be any performance issues?
Partial JSON:
{
"ip": "127.0.0.1",
"data": {
"tls": {
"status": "success",
"protocol": "tls",
"result": {
"handshake_log": {
"server_hello": {
"version": {
"name": "TLSv1.2",
"value": 1111
},
"random": "dGVzdA==",
"session_id": "dGVzdA==",
"cipher_suite": {
"name": "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
"value": 1122
},
"compression_method": 0,
},
"server_certificates": {
"certificate": {
"raw": "dGVzdA==",
"parsed": {
"version": 3,
"unique_number": "123456",
"signature_algorithm": {
"name": "SHA256-RSA",
"oid": "1.2.4.5.6"
},
The straightforward way is to use the select filter (either standalone on multiple values or with map on an array) to keep only the objects matching your criterion (e.g. equal to "123456") and then transform them into your required output format (e.g. using string interpolation).
jq -r '.[]
| select(.data.tls.result.handshake_log.server_certificates.certificate.parsed.unique_number=="123456")
| "\(.ip), \(.data.tls.result.handshake_log.server_certificates.certificate.parsed.unique_number)"'
Because the unique_number property is nested quite deeply and cumbersome to write twice, it makes sense to first transform your object into something simpler, then filter, and finally output in the desired format:
jq -r '.[]
| { ip, unique_number: .data.tls.result.handshake_log.server_certificates.certificate.parsed.unique_number }
| select(.unique_number=="123456")
| "\(.ip), \(.unique_number)"'
Alternatively using join:
.[]
| { ip, unique_number: .data.tls.result.handshake_log.server_certificates.certificate.parsed.unique_number }
| select(.unique_number=="123456")
| [.ip, .unique_number]
| join(", ")
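A small end-to-end run of the transform-then-filter approach might look like the sketch below. It assumes, as the filters above do, that the file's top level is an array of records. The second and third records are hypothetical, the --arg variable name wanted is made up, and the output order follows the one requested in the question (number first, then IP).

```shell
# Top-level array of records; only the first record comes from the question.
sample='[
  {"ip":"127.0.0.1","data":{"tls":{"result":{"handshake_log":{"server_certificates":{"certificate":{"parsed":{"unique_number":"123456"}}}}}}}},
  {"ip":"192.0.2.10","data":{"tls":{"result":{"handshake_log":{"server_certificates":{"certificate":{"parsed":{"unique_number":"654321"}}}}}}}},
  {"ip":"192.0.2.11","data":{}}
]'

# --arg passes the search value in as a jq string variable,
# so the filter text does not need to be edited for each search.
result=$(printf '%s' "$sample" | jq -r --arg wanted 123456 '
  .[]
  | {ip, unique_number: .data.tls.result.handshake_log.server_certificates.certificate.parsed.unique_number}
  | select(.unique_number == $wanted)
  | "\(.unique_number), \(.ip)"')

printf '%s\n' "$result"
```

Records that lack the nested path (like the third one) simply produce a null unique_number and are dropped by select rather than raising an error.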
Here is a simplified JSON file of a Terraform state file (let's call it dev.tfstate)
{
"version": 4,
"terraform_version": "0.12.9",
"serial": 2,
"lineage": "ba56cc3e-71fd-1488-e6fb-3136f4630e70",
"outputs": {},
"resources": [
{
"module": "module.rds.module.reports_cpu_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.reports_lag_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.cross_region_replica_lag_alert",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds",
"mode": "managed",
"type": "aws_db_instance",
"name": "master",
"provider": "provider.aws",
"instances": [
{
"schema_version": 0,
"attributes": {
"address": "dev-database.123456.us-east-8.rds.amazonaws.com",
"allocated_storage": 10,
"password": "",
"performance_insights_enabled": false,
"tags": {
"env": "development"
},
"timeouts": {
"create": "6h",
"delete": "6h",
"update": "6h"
},
"timezone": "",
"username": "admin",
"vpc_security_group_ids": [
"sg-1234"
]
},
"private": ""
}
]
}
]
}
There are many modules at the same level as module.rds inside the instances. I took out many of them to create the simplified version of the raw data. The key takeaway: do not assume the array index will be constant in all cases.
I wanted to extract the password field in the above example.
My first attempt was to use an equality check to extract the relevant modules:
jq '.resources[].module == "module.rds"' dev.tfstate
but it just produced a list of boolean values. I don't see any mention of a built-in function like filter in jq's manual.
Then I tried to access the field directly:
jq '.resources[].module[].attributes[].password?' dev.tfstate
which throws the following error:
jq: error (at dev.tfstate:1116): Cannot iterate over string ("module.rds")
So what is the best way to extract the value? Ideally it should look at the password attribute in the module.rds module only.
Edit:
My purpose is to detect whether a password is left inside a state file. I want to ensure that passwords are stored exclusively in AWS Secrets Manager.
You can extract the module you want like this.
jq '.resources[] | select(.module == "module.rds")'
I'm not confident that I understand the requirements for the rest of the solution. So this might not only not be the best way of doing what you want; it might not do what you want at all!
If you know where password will be, you can do this.
jq '.resources[] | select(.module == "module.rds") | .instances[].attributes.password'
If you don't know exactly where password will be, this is a way of finding it.
jq '.resources[] | select(.module == "module.rds") | .. | .password? | values'
According to the manual under the heading "Recursive Descent," ..|.a? will "find all the values of object keys “a” in any object found “below” ."
values filters out the null results.
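Both variants can be tried on a trimmed state file. In this sketch the password "hunter2" is a made-up placeholder (the sample in the question has an empty password), and the shell variable names are only illustrative.

```shell
# Trimmed state: one datadog resource without instances, one RDS resource.
state='{"resources":[
  {"module":"module.rds.module.reports_cpu_warning","type":"datadog_monitor","instances":[]},
  {"module":"module.rds","type":"aws_db_instance",
   "instances":[{"attributes":{"username":"admin","password":"hunter2",
                               "tags":{"env":"development"}}}]}
]}'

# Known location: walk straight to the attribute.
fixed=$(printf '%s' "$state" | jq -r '
  .resources[] | select(.module == "module.rds")
  | .instances[].attributes.password')

# Unknown location: visit every value below the resource, try .password
# on each ("?" silences errors on strings and arrays), drop the nulls.
found=$(printf '%s' "$state" | jq -r '
  .resources[] | select(.module == "module.rds")
  | .. | .password? | values')

printf 'fixed path: %s\nrecursive:  %s\n' "$fixed" "$found"
```

Note that values only removes null, so an empty string such as the "password": "" in the question's sample would still be printed (as an empty line). For a check that flags only real leftovers, appending | select(. != "") is one option.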
You could also get the password value out of the state file without jq by using Terraform outputs. Your module should define an output with the value you want to output and you should also output this at the root module.
Without seeing your Terraform code you'd want something like this:
modules/rds/main.tf
resource "aws_db_instance" "master" {
# ...
}
output "password" {
value = aws_db_instance.master.password
sensitive = true
}
example/main.tf
module "rds" {
source = "../modules/rds"
# ...
}
output "rds_password" {
value = module.rds.password
sensitive = true
}
The sensitive = true parameter means that Terraform won't print the output to stdout when running terraform apply but it's still held in plain text in the state file.
To then access this value without jq you can use the terraform output command which will retrieve the output from the state file and print it to stdout. From there you can use it however you want.
I have two hosted zones with the same name. I want to get the hosted zone ID of the one used for the us-west-2 region.
aws route53 list-hosted-zones-by-name --dns-name domainname
It gives the following output:
{
"HostedZones": [
{
"ResourceRecordSetCount": 3,
"CallerReference": "2018-08-07T14:02:30.733383821+05:30",
"Config": {
"Comment": "Private Hosted Zone for tenant:us-west-2",
"PrivateZone": true
},
"Id": "/hostedzone/D2JGX0PDINSIDA",
"Name": "domainname."
},
{
"ResourceRecordSetCount": 3,
"CallerReference": "2018-08-16T16:38:29.821900042+05:30",
"Config": {
"Comment": "Private Hosted Zone for tenant:eu-west-1",
"PrivateZone": true
},
"Id": "/hostedzone/Q1HEEHGD5JH3G3",
"Name": "domainname."
}
],
"DNSName": "domainname",
"IsTruncated": false,
"MaxItems": "100"
}
As you can see, there are two records for the same name. I want to get the Id of the hosted zone used for us-west-2. Right now nothing other than the Comment uniquely identifies the hosted zone used for US.
I tried jq, but I don't know how to give it conditions.
aws route53 list-hosted-zones-by-name --dns-name domainname | jq ".HostedZones | .[] | .Config"
Any help or references would be appreciated.
It is a simple jq filter: use endswith or test to match us-west-2 in the .Config.Comment field value. (See it working on jqplay.org.)
jq '.HostedZones[] | select( .Config.Comment | test("us-west-2$") ).Id'
As ever, to remove the outer quotes, use the --raw-output mode (jq -r).
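For comparison, here is a sketch that runs both the test and the endswith variants on a trimmed copy of the output from the question. The variable names are made up, and stripping the /hostedzone/ prefix at the end is an assumption about what the bare zone ID should look like.

```shell
# Trimmed copy of the list-hosted-zones-by-name output from the question.
sample='{"HostedZones":[
  {"Id":"/hostedzone/D2JGX0PDINSIDA","Name":"domainname.",
   "Config":{"Comment":"Private Hosted Zone for tenant:us-west-2","PrivateZone":true}},
  {"Id":"/hostedzone/Q1HEEHGD5JH3G3","Name":"domainname.",
   "Config":{"Comment":"Private Hosted Zone for tenant:eu-west-1","PrivateZone":true}}
]}'

# Regex match anchored at the end of the comment.
with_test=$(printf '%s' "$sample" | jq -r '
  .HostedZones[] | select(.Config.Comment | test("us-west-2$")).Id')

# Plain suffix match; no regex involved.
with_endswith=$(printf '%s' "$sample" | jq -r '
  .HostedZones[] | select(.Config.Comment | endswith("us-west-2")).Id')

# Shell parameter expansion removes the leading path component.
zone_id=${with_endswith#/hostedzone/}

printf '%s\n%s\n%s\n' "$with_test" "$with_endswith" "$zone_id"
```

Both filters should select the same zone. endswith is the simpler choice when the region is always the last part of the comment, while test leaves room for looser patterns.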