Packer SSH timeout when setting custom VPC, subnet and security group - packer

So I needed to move my Packer builders inside a private VPC and attach a locked-down security group that only allows SSH from a restricted range of IPs, thus:
"builders": [{
"type": "amazon-ebs",
"associate_public_ip_address": false,
"access_key": "{{user `aws_access_key`}}",
"secret_key": "{{user `aws_secret_key`}}",
"region": "{{user `aws_region`}}",
"source_ami_filter": {
"filters": {
"virtualization-type": "hvm",
"name": "{{user `ami_source_name`}}",
"root-device-type": "ebs"
},
"owners": ["{{user `ami_source_owner_id`}}"],
"most_recent": true
},
"instance_type": "t3.small",
"iam_instance_profile": "{{user `iam_instance_profile`}}",
"ssh_username": "{{user `ssh_username`}}",
"ami_name": "{{user `ami_name_prefix`}}_{{user `ami_creation_date`}}",
"ami_users": "{{user `share_amis_with_account`}}",
"ebs_optimized": true,
"vpc_id": "vpc-123456",
"subnet_id": "subnet-123456",
"security_group_id": "sg-123456",
"user_data_file": "scripts/disable_tty.sh",
"launch_block_device_mappings": [{
"device_name": "{{user `root_device_name`}}",
"volume_size": 10,
"volume_type": "gp2",
"delete_on_termination": true
}],
"tags": {
"packer": "true",
"ansible_role": "{{user `ansible_role`}}",
"builtby": "{{user `builtby`}}",
"ami_name": "{{user `ami_name_prefix`}}_{{user `ami_creation_date`}}",
"ami_name_prefix": "{{user `ami_name_prefix`}}",
"project": "{{user `project`}}"
}
}]
To start with I added "associate_public_ip_address": false (false is also the default), because every time I ran Packer the host was assigned a public IP address. Even with that setting it still picks up a public IP.
I used a security group that I had already assigned to Jenkins build slaves, which also communicate over port 22, and I haven't had any issue accessing them from any part of my infrastructure.
I get this error:
1562344256,,ui,error,Build 'amazon-ebs' errored: Timeout waiting for SSH.
1562344256,,error-count,1
1562344256,,ui,error,\n==> Some builds didn't complete successfully and had errors:
1562344256,amazon-ebs,error,Timeout waiting for SSH.
1562344256,,ui,error,--> amazon-ebs: Timeout waiting for SSH.
During the wait period for SSH to respond I was able to run nc -v 1.2.3.5 22 and got a connection, so the security group is allowing communication on port 22 from my IP address.
If I change the security group to 0.0.0.0/0 it connects straight away. But if I can nc to port 22 with the restricted security group, why can't Packer initiate an SSH connection? Is Packer trying to use the public IP address that I cannot, for the life of me, turn off?
I thought it might be quite helpful to tcpdump the traffic on port 22 to see what was happening, but I have a locked-down laptop that does not allow the install of that particular handy item.
I can also SSH to the builder from my laptop, but I get a "Too many authentication failures" error and can't log in to see what is going on.

So the reason that the Packer builder is getting a public IP comes down to the subnet setting map_public_ip_on_launch = true.
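If you want to confirm that diagnosis, here is a minimal boto3 sketch that inspects the attribute; the region and subnet ID are placeholders, not the real values.

import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # placeholder region

# MapPublicIpOnLaunch = True is what hands the builder its unwanted public IP
subnet = ec2.describe_subnets(SubnetIds=["subnet-123456"])["Subnets"][0]
print(subnet["MapPublicIpOnLaunch"])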
So the answer is to build a new private subnet for the Packer builder, build a new NAT gateway in the public subnet, and then route from the private subnet to the NAT gateway with a new route table.
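For reference, a minimal boto3 sketch of that setup; the VPC ID, public subnet ID and CIDR block are placeholders rather than the values from my template.

import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # placeholder region

# New private subnet for the Packer builder (illustrative CIDR)
private = ec2.create_subnet(VpcId="vpc-123456", CidrBlock="10.0.42.0/24")["Subnet"]

# Elastic IP plus a NAT gateway that lives in an existing public subnet
eip = ec2.allocate_address(Domain="vpc")
natgw = ec2.create_nat_gateway(
    SubnetId="subnet-public-123456",  # placeholder public subnet ID
    AllocationId=eip["AllocationId"],
)["NatGateway"]
ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[natgw["NatGatewayId"]])

# New route table: default route from the private subnet via the NAT gateway
rtb = ec2.create_route_table(VpcId="vpc-123456")["RouteTable"]
ec2.create_route(
    RouteTableId=rtb["RouteTableId"],
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=natgw["NatGatewayId"],
)
ec2.associate_route_table(RouteTableId=rtb["RouteTableId"], SubnetId=private["SubnetId"])

The builder's subnet_id then points at the new private subnet, which never assigns a public IP.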

Related

Packer custom image build failed with ssh authentication error

I'm trying to build a custom image for an AWS EKS managed node group. Note: my custom image (Ubuntu) already has MFA and private-key-based authentication enabled.
I cloned the GitHub repository below to build the EKS-related changes:
git clone https://github.com/awslabs/amazon-eks-ami && cd amazon-eks-ami
Next I made a few changes so I could run the Makefile:
cat eks-worker-al2.json
{
  "variables": {
    "aws_region": "eu-central-1",
    "ami_name": "template",
    "creator": "{{env `USER`}}",
    "encrypted": "false",
    "kms_key_id": "",
    "aws_access_key_id": "{{env `AWS_ACCESS_KEY_ID`}}",
    "aws_secret_access_key": "{{env `AWS_SECRET_ACCESS_KEY`}}",
    "aws_session_token": "{{env `AWS_SESSION_TOKEN`}}",
    "binary_bucket_name": "amazon-eks",
    "binary_bucket_region": "eu-central-1",
    "kubernetes_version": "1.20",
    "kubernetes_build_date": null,
    "kernel_version": "",
    "docker_version": "19.03.13ce-1.amzn2",
    "containerd_version": "1.4.1-2.amzn2",
    "runc_version": "1.0.0-0.3.20210225.git12644e6.amzn2",
    "cni_plugin_version": "v0.8.6",
    "pull_cni_from_github": "true",
    "source_ami_id": "ami-12345678",
    "source_ami_owners": "00012345",
    "source_ami_filter_name": "template",
    "arch": null,
    "instance_type": null,
    "ami_description": "EKS Kubernetes Worker AMI with AmazonLinux2 image",
    "cleanup_image": "true",
    "ssh_interface": "",
    "ssh_username": "nandu",
    "ssh_private_key_file": "/home/nandu/.ssh/template_rsa.ppk",
    "temporary_security_group_source_cidrs": "",
    "security_group_id": "sg-08725678910",
    "associate_public_ip_address": "",
    "subnet_id": "subnet-01273896789",
    "remote_folder": "",
    "launch_block_device_mappings_volume_size": "4",
    "ami_users": "",
    "additional_yum_repos": "",
    "sonobuoy_e2e_registry": ""
After adding the user and private key, the build fails with the error below.
logs
amazon-ebs: Error waiting for SSH: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain.
For me the fix was simply to change the AWS region, or to remove the AWS region setting, in Packer.

Connection to RDS MySQL from ECS Fargate WordPress container times out

I have a container running (a WordPress container, to be more specific) which tries to connect to a MySQL RDS instance.
Parameters for the Fargate ECS service container:
{
  "executionRoleArn": "ignore-this",
  "containerDefinitions": [
    {
      "name": "MyCoolContainer",
      "image": "wordpress:latest",
      "essential": true,
      "environment": [
        {"name": "WORDPRESS_DB_HOST", "value": "host:3306"},
        {"name": "WORDPRESS_DB_USER", "value": "user"},
        {"name": "WORDPRESS_DB_PASSWORD", "value": "password"},
        {"name": "WORDPRESS_DB_NAME", "value": "name"}
      ],
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/aws/ecs/fargate/prefix",
          "awslogs-region": "eu-west-1",
          "awslogs-stream-prefix": "prefix"
        }
      }
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "family": "wordpress"
}
Also, in the security groups I have opened ports 22, 80, 443 and 3306 to any IP address.
But the container in ECS still fails to start with the reason:
[17-Sep-2019 08:42:24 UTC] PHP Warning: mysqli::__construct():
(HY000/2002): Connection timed out in Standard input code on line 22
MySQL Connection Error: (2002) Connection timed out
MySQL Connection Error: (2002) Connection timed out
However, I can confirm that the RDS instance is accessible when connecting from a local machine with the command:
mysql -uuser -ppassword -hhost -P3306
I can also confirm that the (WordPress) container runs successfully on my local machine and connects to the remote RDS database with no timeouts.
EDIT
This is how my environment looks in the ECS UI panel:
(I have tried copy-pasting these values into my local mysql command and it connected successfully.)
I suspect there is something wrong with aws services configuration. Any ideas?
Thanks to Adiii and some other articles found on the internet, I have a complete solution to this problem.
You simply need to attach a NAT Gateway to the subnet in which you are launching your ECS Fargate task.
Simply launching in a public subnet with an Internet Gateway does not, for some weird reason, solve the problem (even though you would think it should).
TL;DR:
NAT Gateway is needed. AWS is f****d up.
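As an illustration of the networking side of that fix, here is a minimal boto3 sketch that places the Fargate service in a private subnet whose route table points at the NAT gateway; the cluster, task definition, subnet and security group identifiers are placeholders, not values from the question.

import boto3

ecs = boto3.client("ecs", region_name="eu-west-1")  # placeholder region

# Run the WordPress service in a subnet that routes 0.0.0.0/0 through the
# NAT gateway; all identifiers below are placeholders.
ecs.create_service(
    cluster="my-cluster",
    serviceName="wordpress",
    taskDefinition="wordpress",
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-private-123456"],
            "securityGroups": ["sg-123456"],
            "assignPublicIp": "DISABLED",
        }
    },
)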

Data is not being received at Azure IoT Hub?

I am using Azure IoT Edge V1 with Ubuntu. I created one IoT Hub, with the name, say, X, and then created two devices, say dev1 and dev2. After that I changed the simulated_device_cloud_upload_lin.json file.
modules:
IotHub - "args": {
  "IoTHubName": "X",
  "IoTHubSuffix": "azure-devices.net",
  "Transport": "MQTT",
  "RetryPolicy": "EXPONENTIAL_BACKOFF_WITH_JITTER"
}
mapping - "args": [
  {
    "macAddress": "01:01:01:01:01:01",
    "deviceId": "dev1",
    "deviceKey": "primary key of dev1"
  },
  {
    "macAddress": "02:02:02:02:02:02",
    "deviceId": "dev2",
    "deviceKey": "Primary key of dev2"
  }
]
Then I go inside the build folder and run the command
./samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
This starts sending messages to the IoT Hub, but when I check the IoT Hub with iothub-explorer it shows me the error given below:
error receiving reply from Event hub management end point : undefined.
The messages also do not reach the IoT Hub.
Could you please tell me what I have done wrong?

Cannot connect to RDS from Heroku but all 'grants' done - conflict in RDS docs on approach?

Tearing my hair out. I learnt a lot from my previous mistakes (Cannot connect remotely to EC2 MySQL installation), and I have now configured everything identically (AFAICT, outputs below), but I cannot get Heroku to connect to my new AWS RDS MySQL instance! My old instances are fine.
One concern I have is that there seems to be conflicting information out there about how to use wildcards in the GRANT statements.
The Heroku RDS article https://devcenter.heroku.com/articles/amazon-rds says:
GRANT USAGE ON *.* TO 'username'@'%';
BUT https://www.flydata.com/blog/access-denied-issue-amazon-rds/ suggests a different syntax using '%':
GRANT USAGE ON `%`.* TO `username`@`%` IDENTIFIED BY 'pwd';
to no effect.
So..
all instances created with same security group
security group has inbound access (and works for 2 other instances)
GRANT access (as per my original 2 instances)
Tried new suggested syntax of % not *
Have tried
with or without SSL
creating a new security group
Security groups (all instances are the same for my 3 environments, but there is one I cannot connect to from Heroku):
$ grep sg- aws_instance.txt
"VpcSecurityGroupId": "sg-c8ce36b4"
"VpcSecurityGroupId": "sg-c8ce36b4"
"VpcSecurityGroupId": "sg-c8ce36b4"
Security group config
Visually I can see the inbound config: MYSQL, TCP, 3306, 0.0.0.0/0
{
  "DBSecurityGroups": [
    {
      "DBSecurityGroupDescription": "default",
      "IPRanges": [
        {
          "Status": "authorized",
          "CIDRIP": "0.0.0.0/32"
        },
        {
          "Status": "authorized",
          "CIDRIP": "0.0.0.0/0"
        },
        {
          "Status": "authorized",
          "CIDRIP": "87.1.1.1/32"
        }
      ],
      "OwnerId": "xxxxxxx",
      "DBSecurityGroupArn": "arn:aws:rds:us-east-1:xxxxxxx:secgrp:default",
      "EC2SecurityGroups": [
        {
          "Status": "authorized",
          "EC2SecurityGroupName": "default",
          "EC2SecurityGroupOwnerId": "xxxxxxxxx",
          "EC2SecurityGroupId": "sg-2aca2f43"
        }
      ],
      "DBSecurityGroupName": "default"
    },
    {
      "VpcId": "vpc-a7d034c1",
      "DBSecurityGroupDescription": "Inbound DB only",
      "IPRanges": [],
      "OwnerId": "xxxxxx",
      "DBSecurityGroupArn": "arn:aws:rds:us-east-1:xxxxxxx:secgrp:mysecuritygroupdbonly",
      "EC2SecurityGroups": [],
      "DBSecurityGroupName": "mysecuritygroupdbonly"
    }
  ]
}

Cluster communication and firewalls in Google Container Engine

I'm trying to set up the following environment on Google Cloud and have 3 major problems with it:
Database Cluster
- 3 nodes
- one port open to the world, a few ports open to the compute cluster
Compute Cluster
- 5 nodes
- communicates with the database cluster
- two ports open to the world
- runs Docker containers
a) The database cluster runs fine and I have the configuration port open to the world, but how do I limit the other ports to only the compute cluster?
I managed to get the first Pod and ReplicationController running on the compute cluster and created a Service to open the container to the world:
controller:
{
  "id": "api-controller",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "desiredState": {
    "replicas": 2,
    "replicaSelector": {
      "name": "api"
    },
    "podTemplate": {
      "desiredState": {
        "manifest": {
          "version": "v1beta1",
          "id": "apiController",
          "containers": [{
            "name": "api",
            "image": "gcr.io/my/api",
            "ports": [{
              "name": "api",
              "containerPort": 3000
            }]
          }]
        }
      },
      "labels": {
        "name": "api"
      }
    }
  }
}
service:
{
  "id": "api-service",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "selector": {
    "name": "api"
  },
  "containerPort": "api",
  "protocol": "TCP",
  "port": 80,
  "selector": { "name": "api" },
  "createExternalLoadBalancer": true
}
b) The container exposes port 3000, the service port 80. Where's the connection between the two?
The firewall works with labels. I want 4-5 different pods running in my compute cluster with 2 of them having open ports to the world. There can be 2 or more containers running on the same instance. The labels however are specific to the nodes, not the containers.
c) Do I expose all nodes with the same firewall configuration? I can't assign labels to containers, so I'm not sure how to expose the api service, for example.
I'll try to answer all of your questions as best I can.
First off, you will want to upgrade to using v1 of the Kubernetes API because v1beta1 and v1beta3 will no longer be available after Aug. 5th:
https://cloud.google.com/container-engine/docs/v1-upgrade
Also, use YAML. It's so much less verbose ;)
--
Now on to the questions you asked:
a) I'm not sure I completely understand what you are asking here, but it sounds like running the services in the same cluster (with resource limits) would be much easier than trying to deal with cross-cluster networking.
b) You need to specify a targetPort so that the Service knows which port to use on the container. This should match port 3000 from your replication controller. See the docs for more info.
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "api-service",
    "labels": {
      "name": "api-service"
    }
  },
  "spec": {
    "selector": {
      "name": "api"
    },
    "ports": [{
      "port": 80,
      "targetPort": 3000
    }],
    "type": "LoadBalancer"
  }
}
c) Yes. In Kubernetes the kube-proxy accepts traffic on any node and routes it to the appropriate node or local pod. You don't need to worry about mapping the load balancer to, or writing firewall rules for, those specific nodes that happen to be running your pods (that could actually change if you do a rolling update!). kube-proxy will route traffic to the right place even if your service is not running on that node.