I'm using RDS (MySQL) with one of my Laravel projects, but one question keeps floating in my mind: what happens to the project while Amazon is creating a backup of the RDS instance? Does it:
Freeze the project
Throw an exception
Keep working normally
For a single-instance RDS deployment, database I/O may be suspended for a few seconds while the snapshot is created. During this period all requests to the database are paused, and they resume once the snapshot has been created.
So if you have a web app, requests received during the I/O suspension will be served more slowly than usual.
You can mitigate this with a Multi-AZ RDS deployment, because in that case the snapshot is taken from the standby instance, so there is no I/O suspension on the primary instance.
Relevant documentation: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html#USER_WorkingWithAutomatedBackups.BackupWindow
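If you decide to go the Multi-AZ route, the switch is a single API call. A minimal sketch with boto3, assuming the instance identifier my-db-instance (a placeholder) and credentials already configured:

    import boto3

    # Sketch: convert an existing single-AZ RDS instance to Multi-AZ so that
    # automated snapshots are taken from the standby instead of the primary.
    # "my-db-instance" is a placeholder identifier.
    rds = boto3.client("rds", region_name="us-east-1")

    rds.modify_db_instance(
        DBInstanceIdentifier="my-db-instance",
        MultiAZ=True,
        ApplyImmediately=True,  # apply now rather than in the next maintenance window
    )

    # The conversion happens in the background; the instance stays available.
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="my-db-instance")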
Your application will continue to work normally during backups. Since AWS RDS uses volume snapshots, the MySQL service keeps running without any interruption. Manual snapshots and point-in-time recovery work the same way.
From my understanding, AWS RDS provides backups for the MySQL database, but it is not cheap.
Would using a Docker image for MySQL save us more in terms of cost? We would only need to pull the Docker image from Docker Hub and use it for free (e.g., create an instance and run the container).
Is there any reason to use RDS other than the database backups it provides?
Here are several features of RDS which may warrant using it over a self-managed MySQL Docker container on an EC2 instance or ECS:
RDS is a managed service, so all OS updates and MySQL patches are handled by AWS and you don't have to worry about them.
RDS supports storage auto-scaling: you can start with a small database, and RDS will extend the storage automatically as needed.
Point-in-time recovery allowing you to "rewind" your recent db changes.
Read replicas - you can create up to 5 read replicas of your database to offload read-intensive workloads from your primary DB instance.
Cross-region read replicas - you can keep a replica in a different region, which is good for disaster recovery (an entire AWS region going down).
Automated and manual backups, including backups to a different region.
IAM authentication to your db instead of regular username/password.
Multi-AZ - RDS can keep a standby replica of your primary database instance in a different availability zone, for quick recovery if the primary fails.
CloudWatch integrated db metrics and logs.
RDS event notifications allow for straightforward automation, e.g. invoking a Lambda function automatically for every backup, or whenever something fails.
Easier integration with other services, e.g. use of RDS Proxy in Lambda functions.
All these and other features of RDS make it much more expensive than hosting a self-managed MySQL Docker container. But if MySQL in a Docker container meets all your requirements, then there is no need to use RDS. You can always start with Docker and, if your data and requirements grow, migrate to RDS later.
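To make the comparison concrete, here is a hedged boto3 sketch that provisions an RDS MySQL instance with several of the features listed above (storage auto-scaling, Multi-AZ, automated backups, IAM authentication). All identifiers and sizes are placeholders, not recommendations:

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Sketch only: names, sizes, and credentials are placeholders.
    rds.create_db_instance(
        DBInstanceIdentifier="app-db",
        Engine="mysql",
        DBInstanceClass="db.t3.small",
        MasterUsername="admin",
        MasterUserPassword="change-me",        # use Secrets Manager in practice
        AllocatedStorage=20,                   # starting storage, in GiB
        MaxAllocatedStorage=100,               # storage auto-scaling ceiling
        MultiAZ=True,                          # standby replica in another AZ
        BackupRetentionPeriod=7,               # automated backups, in days
        EnableIAMDatabaseAuthentication=True,  # IAM auth in addition to passwords
    )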
I added a reader replica to my RDS Aurora MySQL cluster. The instance is running with minor CPU usage, but it does not show any connections on the monitoring page. I have enabled detailed monitoring, and the security groups are the same as on the writer instance.
How do I ensure that traffic is going to my reader instance?
AWS RDS Aurora does not support splitting of read/write transactions.
To forward read-only queries to the read replica endpoint and read/write queries to the master endpoint, you need to add a function or a proxy to your application that inspects each query and forwards it to the read replica or to the master; in other words, the application logic has to manage this routing.
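As an illustration of that application-level routing, here is a minimal Python sketch using PyMySQL (the endpoints and credentials are placeholders). It naively sends statements that start with SELECT to the reader endpoint and everything else to the writer; a real proxy would also have to keep transactions and SELECT ... FOR UPDATE pinned to the writer and account for replication lag:

    import pymysql

    # Placeholders: replace with your cluster's writer and reader endpoints.
    WRITER = "mycluster.cluster-xxxx.us-east-1.rds.amazonaws.com"
    READER = "mycluster.cluster-ro-xxxx.us-east-1.rds.amazonaws.com"

    def connect(host):
        return pymysql.connect(host=host, user="app", password="secret",
                               database="mydb", autocommit=True)

    writer_conn = connect(WRITER)
    reader_conn = connect(READER)

    def execute(sql, args=None):
        # Naive routing: statements that start with SELECT go to the reader,
        # everything else goes to the writer.
        conn = reader_conn if sql.lstrip().lower().startswith("select") else writer_conn
        with conn.cursor() as cur:
            cur.execute(sql, args)
            return cur.fetchall()

    execute("SELECT id, name FROM users WHERE id = %s", (1,))       # reader
    execute("UPDATE users SET name = %s WHERE id = %s", ("bob", 1)) # writer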
I have created an Amazon Aurora database cluster running MySQL with three instances: the main instance that backs the cluster and two read replicas for balancing. However, the cluster does not seem to be balancing the reads at all. One replica is handling 700+ selects/sec, maxing out its CPU at 99.75% or higher, while the other replica is doing virtually nothing at 4% CPU usage and one select per second, if that. The main cluster instance itself is at 33% CPU usage, as it is being written to while the replicas are being read from. The lag between the replicas is under 20 milliseconds.
My application is querying the read-only endpoint of the cluster, but no balancing appears to be happening. Does anyone have any insight into why this may be happening, or why one replica is at such a high CPU usage? The queries being run against it are not complex by any means.
Aurora cluster endpoints are DNS records, and they only do DNS round robin during resolution. This means that when your client application opens connections to a cluster endpoint, you end up resolving the endpoint to different instances (different IPs, basically), thereby striping your connections across multiple replicas. Past that point, there is no load balancing. Connections are striped across instances, and queries run on each of those connections go to the corresponding instance backing it.
Now consider the scenario where your connection pool was created against the cluster endpoint while you had one instance behind it. If you add more instances, there will be no impact on your application unless you terminate your connections and re-establish them. You would then do DNS round robin again, and this time some of your connections would land on the new instance that you provisioned.
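You can observe this striping yourself by resolving the reader endpoint repeatedly. A small sketch (the endpoint name is a placeholder); with DNS caching out of the way, you should see the replica IPs rotate between resolutions:

    import socket
    import time

    # Placeholder reader endpoint. Each resolution may return a different
    # replica's IP, because Aurora rotates the DNS answer.
    READER_ENDPOINT = "mycluster.cluster-ro-xxxx.us-east-1.rds.amazonaws.com"

    for _ in range(10):
        print(socket.gethostbyname(READER_ENDPOINT))
        time.sleep(6)  # the record's TTL is about 5 seconds, so wait it out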
A few callouts:
In Aurora, you have 2 cluster endpoints. One (RW) endpoint always points to the current writer and one (RO) does the DNS round robin between your read replicas.
Also, DNS propagation might take a few seconds when failovers happen, so occasional errors are natural while a failover is in progress.
Hope this helps.
We've implemented a driver to try to mitigate this problem, with some visible gains: https://github.com/DiceTechnology/dice-fairlink
It regularly discovers the read replicas to keep up with cluster changes and round-robins connections among them.
Although we haven't measured CPU utilisation, we've observed a better load distribution than with the native DNS-based round robin of the cluster reader endpoint.
Aurora's DNS-based load balancing works at the connection level, not at the individual query level. You must keep resolving the endpoint without caching DNS to get a different instance IP on each resolution. If you resolve the endpoint only once and then keep the connection in your pool, every query on that connection goes to the same instance. If you cache DNS, you receive the same instance IP each time you resolve the endpoint.
Unless you use a smart database driver, you depend on DNS record updates and DNS propagation for failovers, instance scaling, and load balancing across Aurora replicas. Currently, Aurora DNS zones use a short time-to-live (TTL) of 5 seconds. Ensure that your network and client configurations don't further increase the DNS cache TTL. Remember that DNS caching can occur anywhere, from your network layer, through the operating system, to the application container. For example, Java virtual machines (JVMs) are notorious for caching DNS indefinitely unless configured otherwise. See the AWS documentation and the Aurora whitepaper on configuring the DNS cache TTL.
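If you use a connection pool, one pragmatic mitigation is to recycle pooled connections periodically, so that new connections re-resolve the endpoint and get striped across replicas again. A sketch with SQLAlchemy (the URL is a placeholder; pool_recycle and pool_pre_ping are standard engine options):

    from sqlalchemy import create_engine

    # Placeholder URL. pool_recycle replaces pooled connections older than the
    # given number of seconds, forcing a fresh DNS resolution of the reader
    # endpoint; pool_pre_ping drops dead connections after failovers.
    engine = create_engine(
        "mysql+pymysql://app:secret@mycluster.cluster-ro-xxxx.us-east-1.rds.amazonaws.com/mydb",
        pool_recycle=300,
        pool_pre_ping=True,
    )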
My guess is that you are not connecting to the cluster endpoint.
Load Balancing – Connecting to the cluster endpoint allows Aurora to load-balance connections across the replicas in the DB cluster. This helps to spread the read workload around and can lead to better performance and more equitable use of the resources available to each replica. In the event of a failover, if the replica that you are connected to is promoted to the primary instance, the connection will be dropped. You can then reconnect to the reader endpoint in order to send your read queries to the other replicas in the cluster.
New Reader Endpoint for Amazon Aurora – Load Balancing & Higher Availability
[EDIT]
To load-balance within a single application, you will need to reconnect to the endpoint. If you use the same connection for all queries, only one replica will be responding. However, opening connections is expensive, so this might not provide much benefit unless your queries take some time to run.
The new Google Cloud SQL (MySQL) Second Generation spins up its own VM instance to run the MySQL server.
What is the difference between using a Second Generation instance and using my own Compute Engine VM instance with a manually installed copy of MySQL on it? Are there any advantages when it comes to high availability, security, or performance?
Adding to the answer that Terry posted, and answering your question in the comment:
You can create a highly available Cloud SQL Second Generation by doing the following:
Set up your master instance correctly including sizing it appropriately and setting up binary logging. The master instance must have one backup after binary logging has been enabled. You should place your master instance in a zone that's close to your other services. See preparing the master instance.
Create one failover replica in a different zone than the master (a minimal sketch of this call follows these steps). See creating a failover replica.
Optionally, create one or more read replicas. Note that a master instance with a failover replica is sufficient for creating a highly available configuration.
Optionally, test failover. Keep in mind that testing the failover moves the master to a new zone.
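For step 2, a hedged sketch of creating the failover replica through the Cloud SQL Admin API with the Python client (project, instance names, region, and tier are placeholders; replicaConfiguration.failoverTarget is what marks the replica as the HA failover target):

    from googleapiclient import discovery

    # Sketch under assumptions: names are placeholders and application-default
    # credentials are already configured.
    service = discovery.build("sqladmin", "v1beta4")

    body = {
        "name": "master-instance-failover",
        "masterInstanceName": "master-instance",
        "region": "us-central1",
        "replicaConfiguration": {"failoverTarget": True},  # failover replica
        "settings": {"tier": "db-n1-standard-1"},
    }

    service.instances().insert(project="my-project", body=body).execute()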
To answer your question "So what happens if the VM instance they create fails?"
A master instance falls out of high-availability mode when the failover replica becomes unavailable. This can happen, for example, if the network connection between the master instance and the failover replica is interrupted, or if the failover replica is down due to a failure of its own zone. During this time the master instance is not in high-availability mode, and you will not be able to fail over to the replica because it is not safe to do so. The failover replica resumes replication on reconnection, and high-availability mode is re-enabled once the failover replica finishes catching up.
The major difference is that Cloud SQL v2 does not have to be managed by you. Google Cloud handles management, replication, and snapshots. Additionally, Cloud SQL v2 with the Cloud SQL Proxy works with the App Engine standard and flexible runtimes, allowing flexible but secure connections to SQL from other clients.
In return, you do not get any access to the underlying system.
I am studying the new Amazon RDS product, and it seems it can be scaled only vertically (i.e., by moving to a stronger server).
Did anyone see a possibility to configure multiple instances so that one is master and the other/s is/are replication slaves?
Same question asked (and answered) here http://developer.amazonwebservices.com/connect/thread.jspa?threadID=37823
Looks like there are plans for Master-Master HA or similar, but that's not the same as a replicated scale-out offering.
According to the FAQ it is possible now, see http://aws.amazon.com/rds/faqs/#86 :
Q: What types of replication does Amazon RDS support and when should I use each?

Amazon RDS provides two distinct replication options to serve different purposes.

If you are looking to use replication to increase database availability while protecting your latest database updates against unplanned outages, consider running your DB Instance as a Multi-AZ deployment. When you create or modify your DB Instance to run as a Multi-AZ deployment, Amazon RDS will automatically provision and manage a "standby" replica in a different Availability Zone (independent infrastructure in a physically separate location). In the event of planned database maintenance, DB Instance failure, or an Availability Zone failure, Amazon RDS will automatically fail over to the standby so that database operations can resume quickly without administrative intervention. Multi-AZ deployments utilize synchronous replication, making database writes concurrently on both the primary and standby so that the standby will be up-to-date in the event a failover occurs. While our technological implementation for Multi-AZ DB Instances maximizes data durability in failure scenarios, it precludes the standby from being accessed directly or used for read operations. The fault tolerance offered by Multi-AZ deployments makes them a natural fit for production environments; to learn more about Multi-AZ deployments, please visit this FAQ section.

If you are looking to take advantage of MySQL 5.1's built-in replication to scale beyond the capacity constraints of a single DB Instance for read-heavy database workloads, Amazon RDS makes it easier with Read Replicas. You can create a Read Replica of a given "source" DB Instance using the AWS Management Console or the CreateDBInstanceReadReplica API. Once the Read Replica is created, database updates on the source DB Instance will be propagated to the Read Replica. You can create multiple Read Replicas for a given source DB Instance and distribute your application's read traffic amongst them. Unlike Multi-AZ deployments, Read Replicas use MySQL 5.1's built-in replication and are subject to its strengths and limitations. In particular, updates are applied to your Read Replica(s) after they occur on the source DB Instance ("asynchronous" replication), and replication lag can vary significantly. This means recent database updates made to a standard (non Multi-AZ) source DB Instance may not be present on associated Read Replicas in the event of an unplanned outage on the source DB Instance. As such, Read Replicas do not offer the same data durability benefits as Multi-AZ deployments. While Read Replicas can provide some read availability benefits, they are not designed to improve write availability.

With Amazon RDS, you can use Multi-AZ deployments and Read Replicas in conjunction to enjoy the complementary benefits of each. You can simply specify that a given Multi-AZ deployment is the source DB Instance for your Read Replica(s). That way you gain both the data durability and availability benefits of Multi-AZ deployments and the read scaling benefits of Read Replicas.
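In API terms, combining the two might look like the following boto3 sketch (identifiers are placeholders): a Multi-AZ source instance plus a Read Replica created from it.

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Assumes "app-db" already exists as a Multi-AZ deployment (placeholder name).
    # Each replica gets its own endpoint; spread read traffic across them.
    rds.create_db_instance_read_replica(
        DBInstanceIdentifier="app-db-replica-1",
        SourceDBInstanceIdentifier="app-db",
    )

    rds.get_waiter("db_instance_available").wait(
        DBInstanceIdentifier="app-db-replica-1"
    )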