We have a cluster of Tomcat servers in AWS Elastic Beanstalk connected to AWS RDS (MySQL) with Multi-AZ availability.
Some days ago, an OS patch was applied to the RDS instance, which triggered a Multi-AZ failover to another RDS instance.
The result was a Production system that stayed down for hours (it happened at night) until we restarted Tomcat on each instance. We had thousands of "Connection refused" errors when connecting to the database.
According to AWS support, when a failover instance is launched the endpoint stays the same but its IP changes, and my Tomcats had the old IP cached. Restarting Tomcat cleared the cache, the new IP was used, and the connectivity issue was resolved. They referred me to this SO question.
That makes a lot of sense; however, I couldn't reproduce the issue in a controlled test with the same application in Production.
I changed the IP of a domain in /etc/hosts and my current Beanstalk Production Tomcat detected the IP change 30 seconds later, so it should have detected the RDS endpoint IP change too.
The Java TTL property in my Beanstalk environment is set as:
#networkaddress.cache.ttl=-1
Since the line is commented out, the JVM uses its default cache TTL of 30 seconds, which matches my experiment.
[EDIT] As suggested in the comments, I tried to simulate a failover through DNS. In this case, I changed a CNAME record so that a domain pointed to a different domain. I ran the same test and Tomcat again detected the change 30 seconds later.
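The DNS side of this experiment can be scripted. Below is a minimal sketch in Python, assuming a hypothetical host name (`db.example.com`); `resolve` and `watch_resolution` are illustrative names, not part of any AWS or Tomcat tooling. It only watches what the OS resolver returns, so it shows when the DNS answer changes, not what the JVM has cached.

```python
import socket
import time


def resolve(host):
    """Return the sorted list of IP addresses the OS resolver gives for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def watch_resolution(host, polls, interval=5.0, resolver=resolve, sleep=time.sleep):
    """Poll the resolver and return a list of (poll_index, addresses) changes.

    The first poll is always recorded; later polls are recorded only when
    the address list differs from the previous poll.
    """
    changes = []
    previous = None
    for i in range(polls):
        current = resolver(host)
        if current != previous:
            changes.append((i, current))
            previous = current
        if i < polls - 1:
            sleep(interval)
    return changes


# Example (hypothetical host name, about one minute of polling):
#   for index, addresses in watch_resolution("db.example.com", polls=12):
#       print(f"poll {index}: {addresses}")
```

Running this next to the application during a simulated failover gives a timestamp-free but ordered record of when the resolver started returning the new address, which can then be compared with when Tomcat picked it up.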
Do you have any idea why in this case the RDS endpoint IP change was not detected by Tomcat/JVM?
Related
I was trying to deploy a new application version to my Python Beanstalk environment. During the deployment, the associated RDS instance shut down and restarted, which took almost 2 hours. The DB instance class and the allocated storage also changed automatically, and the application was down for almost 2 hours. How can I troubleshoot the reason for the RDS restart/update at deployment time?
Please advise.
I have an application running on EC2 instances that store data in an RDS instance. All of these instances are in an AWS VPC with security groups configured to allow them to connect to each other.
For reporting purposes, I would like to connect to the RDS instance from my laptop (e.g. using SQLAlchemy) to run simple queries. Every time I try to connect using the connection string that the EC2 apps use, the connection times out.
For Google, one can use the Cloud SQL proxy for this, but I can't find an analogous product for AWS. Instead, it seems like what I am supposed to do is attach an internet gateway to the VPC and configure the security groups to allow connection from my machine. However, the documents are unclear on how to do this other than allowing all inbound connections or allowing a static IP. Unfortunately my laptop doesn't have a static IP, and I'm uncomfortable allowing all inbound connections as it seems insecure and an invitation to attacks. I also have not been able to find a way to configure a security group to allow connections based on IAM credentials for example using the AWS CLI. Since I will be routinely generating reports, a solution that involves updating a security group (i.e. allowing my current IP) every time I want to connect seems suboptimal.
I have tried the approaches in the following documents, but so far have had no success in finding a solution that does not allow all connections:
Allow users to connect to RDS using IAM*
Connecting to RDS instance from command line
Connecting to RDS on VPC from internet
*My RDS instance configuration does not allow me to enable IAM authentication, and I'm not sure why:
IAM Database Authentication is not supported for the configuration in the DB Instance db.
Modify your Db Instance to another instance class and try again.
(Service: AmazonRDS; Status Code: 400; Error Code: InvalidParameterCombination;
Request ID: a6194fb8-2ab9-4a6a-a2be-63835e6e0184)
Is there something I'm not understanding or overlooking? Is allowing connections from all IPs not a big deal since the DB instance is still secured by DB user credentials?
Select this connection type, as shown in the screenshot. Then fill in all the details, using your NAT instance's .pem file to connect. In effect, you are connecting to the VPC through the NAT instance or internet gateway.
Another option is to install a VPN in the VPC and connect through it.
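Assuming the connection described above is an SSH tunnel through the NAT instance, the same thing can be done from a script before running reports. This is a minimal sketch: `tunnel_command`, the key file name, the bastion address and the RDS host name are all hypothetical placeholders. The `-i`, `-N` and `-L` options are standard OpenSSH client options.

```python
import shlex


def tunnel_command(key_file, bastion, rds_host, local_port=3307, rds_port=3306):
    """Build an ssh command that forwards a local port to the RDS endpoint."""
    return [
        "ssh",
        "-i", key_file,  # private key (.pem) for the NAT/bastion instance
        "-N",            # no remote command, port forwarding only
        "-L", f"{local_port}:{rds_host}:{rds_port}",
        bastion,         # e.g. "ec2-user@<public address of the NAT instance>"
    ]


def as_shell(command):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)


# Example with placeholder values:
#   cmd = tunnel_command("nat.pem", "ec2-user@203.0.113.10",
#                        "mydb.example.us-east-1.rds.amazonaws.com")
#   subprocess.Popen(cmd)  # then connect SQLAlchemy to 127.0.0.1:3307
```

With the tunnel up, the reporting client connects to 127.0.0.1 on the local port, so the RDS security group only needs to allow the NAT instance, and only SSH on that instance has to be reachable from the laptop.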
I have my Sails application on an AWS instance with all dependencies installed with no apparent issues. However, each time I try to launch the app I get the following error.
error: AdapterError: Connection is already registered
I have not managed to successfully lift sails yet on the instance and sails-mysql was freshly installed so no connections should be registered.
I have taken the following steps to deploy my app:
Set up a MySql RDS instance (EU-West)
Created and set up an Ubuntu AMD-64 t2.micro EC2 instance (EU-West)
Installed all prerequisites (Git, NVM, NodeJs, Sails, etc.)
Cloned my Sails project
Installed dependencies for Sails
Correctly configured my connection settings for Sails to use my RDS instance.
I know that my connection settings are correct as I have been able to run Sails on my local machine with a connection to my RDS instance and it would consistently lift without any issues.
I am also able to connect to my RDS instance using SequelPro with no problems.
I have had issues with dependencies in the past but have managed to fix those issues and have not had any of them on my local machine or with my EC2 instance.
After searching for a while I have come across a few users who have had similar issues but have managed to fix them with Waterline's teardown methods, however, I am unsure how to achieve this.
I have done my best to provide as much information as possible and any help would be massively appreciated.
Sails Version: 0.12.11
Thank you in advance.
I managed to fix the issue by carrying out the following:
Switched my environment to production in config/bootstrap.js
eg. process.env.NODE_ENV = 'development'
In connections.js, added connectTimeout: 20000 to make sure the request does not time out before the connection is made.
Ensured that the security group inbound rules for the RDS instance allow connections from the security group associated with my EC2 instance.
Type: MySQL/Aurora
Protocol: TCP
Port Range: 3306
Source: < Your security group ID >
Following the above points also meant I overcame the issue with handshake timeouts when communicating with the RDS.
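The inbound rule listed above can also be expressed as data. The sketch below builds the permission entry in the shape that boto3's EC2 `authorize_security_group_ingress` call accepts in its `IpPermissions` list, assuming boto3 is the tool used; `mysql_ingress_from_group` and the group IDs are hypothetical, and the function itself makes no AWS calls.

```python
def mysql_ingress_from_group(source_group_id, port=3306):
    """Describe an inbound TCP rule that admits traffic from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Reference the EC2 instance's security group instead of an IP range.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


# Example with placeholder IDs (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(
#       GroupId="sg-RDS-PLACEHOLDER",
#       IpPermissions=[mysql_ingress_from_group("sg-EC2-PLACEHOLDER")],
#   )
```

Referencing the source security group rather than an address means the rule keeps working when the EC2 instance's IP changes.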
I started several GCE instances and was unable to connect to even one of them using ssh. For the Debian wheezy instances the SSH server appeared not to be running ("nc IP 22" timed out). Even though I enabled ICMP in the default network, the Debian instances did not respond to ping.
The CentOS instances responded to ping, and I was intermittently able to get an SSH banner using nc, but connecting with the ssh command repeatedly timed out.
I suspected a network outage, but "gcutil listzones" showed that all the zones I was using were UP (us-central).
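The "nc IP 22" check can be approximated in a few lines of Python, which makes it easy to repeat against several instances. This is a minimal sketch; `read_banner` is a hypothetical helper name and the timeout value is an arbitrary choice.

```python
import socket


def read_banner(host, port=22, timeout=5.0):
    """Return the first line the server sends, or None if connecting or reading fails."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(256)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
    return data.decode("ascii", errors="replace").strip() or None


# Example: loop over instance addresses and print which ones answer.
#   for ip in ["<instance IP 1>", "<instance IP 2>"]:
#       print(ip, read_banner(ip))
```

A `None` result for every new instance, while older instances still return a banner, would point at the network rather than at the SSH daemon.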
From https://groups.google.com/d/msg/gce-operations/coBWszq91j4/dRPq5_gJ3t4J:
We're investigating an issue with network connectivity to new Google Compute Engine instances. Currently-running instances are not affected. We will provide more information shortly.
I have an app with two workers (Web and Background) on AppHarbor that connect to a MySql database hosted on Amazon's RDS.
I keep getting "Unable to connect to any of the specified MySQL hosts." exception.
The RDS instance is in the US-East region and I have added the following AppHarbor CIDR blocks to the security group.
50.17.211.192/28
54.235.159.192/27
I have added my own CIDR to the security group and I connect to the instance just fine.
However when the app is running on AppHarbor it fails.
My connection string (censored) is:
Server=myinstanceXXXX.cykjvptrw5xs.us-east-1.rds.amazonaws.com;Database=MyDatabase;UID=XXXXXX;PWD=XXXXX;
I have tried including the port 3306 on the server endpoint but it made no difference.
Am I missing something on getting the two to play nice with one another?
By default AppHarbor uses Amazon's internal DNS service for resolving hostnames. Because of that, Amazon RDS instances in the same region as AppHarbor will resolve to private IP addresses rather than the public ones listed in the knowledge base article, so rules based on the public IPs will not work most of the time.
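To see why CIDR rules for the public ranges do not match traffic that arrives from a private address, here is a small sketch using Python's standard `ipaddress` module. The two CIDR blocks are the ones from the question; the private address `10.10.1.5` and the name `is_allowed` are hypothetical.

```python
import ipaddress

# The public AppHarbor ranges that were added to the security group.
ALLOWED = [
    ipaddress.ip_network("50.17.211.192/28"),
    ipaddress.ip_network("54.235.159.192/27"),
]


def is_allowed(source_ip, networks=ALLOWED):
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)


# A public address inside the first block matches the rule:
#   is_allowed("50.17.211.200")  -> True
# A private source address (hypothetical) does not:
#   is_allowed("10.10.1.5")      -> False
```

When the connection reaches the RDS instance over the private network, the source address it sees is outside both blocks, so the CIDR rules never apply.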
In case Amazon's DNS service becomes unavailable we'll fail over to an external DNS service. This means you'll still have to configure the external IPs for the highest availability as an external DNS service will resolve the public IPs. This way you can ensure that your application is resilient towards DNS failures.
You can set up security group based access rules for your RDS security group. We've updated this knowledge base article with a section specifically for Amazon RDS where you can find the information necessary to set this up.