Architecture of MySQL on EBS for scaling (Amazon Web Services) - mysql

I'm trying to understand how to architect an Amazon Web Services application.
I have an instance running off of EBS. As far as I understand, I need to mount the EBS drive so that I can store my MySQL database on it.
When I later want to scale up, how do I do so? I understand that I can add more server instances, but how will they access the database, given that (from what I understand) an EBS volume can only be attached to one server instance?

I can't speak to this particular setup, as I have no experience running MySQL on EBS, but this type of scaling is typically accomplished by dedicating one instance as the master database server. Any additional web servers you spin up still connect to the master DB's IP. Once your database becomes the bottleneck, you spin up a slave DB instance on one of the existing boxes (or on its own dedicated box). You can then configure replication either in a master-to-slave direction or as circular replication, so that you can write to the slave instance as well.
If you choose the classic master to slave replication then you will have to make sure your writes are only performed on the master DB instance.
You can set up something like Zeus or any other connection load balancer so that you only ever have to connect to a single database IP, which then round-robins your read connections across your pool of servers. Otherwise you'd have to manage the connections yourself, which is definitely not trivial. Good luck.
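If you do end up managing the connections yourself, the routing rule described above (writes go to the master only, reads rotate across the pool) might look roughly like this. It is a minimal sketch, not a replacement for a real load balancer: the ReadWriteRouter name and the IP addresses are made up for illustration, and classifying statements by their first word is a simplifying assumption.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical helper: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads while no replica exists yet.
        self._reads = itertools.cycle(replicas or [master])

    def host_for(self, sql):
        # Simplification: only a statement starting with SELECT counts as a read.
        # Locking reads such as SELECT ... FOR UPDATE would need the master too.
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._reads)
        return self.master


# Illustrative addresses only.
router = ReadWriteRouter("10.0.0.10", ["10.0.0.11", "10.0.0.12"])
```

The application would ask router.host_for(sql) for a host before opening or picking a connection. With classic master-to-slave replication, anything that is not a plain read has to land on the master, which is what the fallback branch enforces.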

Growing Amazon EBS Volume sizes

You can try MySQL clustering on your EBS-backed instances. I have a similar question, with more requirements, posted here.

EBS volume capacity can be scaled up using the snapshot -> launch new volume technique; alternatively, storage capacity can be scaled out using EBS striping (RAID 0).
In AWS you cannot attach the same EBS volume to two EC2 instances simultaneously, so when you scale your application you need to scale your MySQL DB out or up through either replication or clustering. AWS RDS is a very good option for MySQL; if your application is read-intensive, you can also scale out using RDS Read Replicas. If you need write scaling, functional partitioning or MySQL sharding can be explored.
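To make the sharding option a bit more concrete, here is a minimal sketch of hash-based shard selection. The shard host names and the choice of a customer ID as the shard key are assumptions for illustration; a real setup also needs a plan for adding shards later, which plain modulo hashing does not handle well.

```python
import zlib

# Hypothetical shard hosts; each one would be its own MySQL master.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(customer_id):
    """Map a customer ID to the shard that stores that customer's rows."""
    # crc32 gives the same result in every process and on every host,
    # so all application servers agree on where a customer lives.
    key = str(customer_id).encode("utf-8")
    return SHARDS[zlib.crc32(key) % len(SHARDS)]
```

Every write for a given customer then goes to shard_for(customer_id), so write load is spread across several masters instead of one.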

AWS has an entire product dedicated to this: RDS.
In all but the rarest and most specialized of circumstances you're going to be better off using RDS than trying to create and tune your own EBS/EC2/MySQL infrastructure.
RDS also directly answers your question: it lets you create read-only databases to use as query slaves. RDS also performs backups, upgrades, and all sorts of failover infrastructure for you.
With EBS there's no way to attach a disk to multiple EC2 instances, so you're not going to be able to build out a failover cluster using that approach. Instead you're going to need replication or backup tools of some type.

Related

mysql - Why do we need RDS when we can use docker image for mysql

From my understanding, AWS RDS facilitates backups for a MySQL database, but it is not cheap.
Wouldn't using a Docker image for MySQL save us more in terms of cost? We only need to download the Docker image from Docker Hub and can use it directly for free (e.g. create an instance and run the container).
Is there any reason to use RDS other than facilitating backups for the database?
Here are several features of RDS which may warrant using it over a self-managed MySQL Docker container on an EC2 instance or ECS:
RDS is managed service, so all OS updates, MySQL patches are managed by AWS and you don't have to worry about them.
RDS supports storage auto-scaling - you can start with small db, and RDS will extend storage automatically as needed.
Point-in-time recovery allowing you to "rewind" your recent db changes.
Read replicas - you can create up to 5 read replicas of your database to off-load read intensive applications from your primary db instance.
Cross-region read replica - you can have your replica in different region which is good for disaster recovery (entire AWS region goes down)
Automated and manual backups, including backups to a different region.
IAM authentication to your db instead of regular username/password.
Multi-AZ - RDS can keep a stand-by replica of your primary database instance in different availability zone, for quick recovery if it fails.
CloudWatch integrated db metrics and logs.
RDS event notifications make it straightforward to build automations, e.g. invoke a Lambda function automatically for every backup, or when something fails.
Easier integration with other services, e.g. use of RDS Proxy in Lambda functions.
All these and other features of RDS make it much more expensive than hosting a self-managed MySQL Docker container. But if MySQL in a Docker container meets all your requirements, then there is no need to use RDS. You can always start with Docker and migrate to RDS if your data and requirements grow.

MySql localhost vs Amazon RDS instance Performance

I am running a Django REST API on an AWS EC2 server. Right now the APIs use a MySQL database on localhost. Should I move my database from MySQL on localhost to an Amazon RDS instance?
As far as I know, a remote server adds a little extra time to transmit each request, and its resources are shared. Would this little extra time be worth it for migrating my database from MySQL on localhost to an Amazon RDS instance?
I read this answer but it didn't help me much.
MySql localhost vs Amazon RDS instance
An answer with all possible Pros and Cons will really be appreciated.
Pros for local MySQL
Slightly faster, because of proximity to the application
Cons for local MySQL
Not Easily Scalable
If you want to use autoscaling to handle your application load and traffic, you might have nightmares, because as you scale out, every new node will run its own MySQL server as well.
Pros for RDS
You don't have to worry about installing and maintaining MySQL server
You don't have to worry about scaling
You don't have to worry about load balancing
You don't have to worry about EC2 upgrades and patching
You don't have to worry about failure recovery, because when you provision a Multi-AZ DB instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone (AZ)
Cons for RDS
Slightly slower due to network latency
It depends how database intensive your application is.
See this benchmark: the local database blew RDS out of the water on query latency under low load.
The answer is probably to use both: a local Redis/MySQL for quick queries, and an off-server RDS for long queries over large data sets, where paying the additional network latency makes sense.
Also think about using SQLite on S3. If you can easily shard your data, and most queries are read intensive it could be a lot cheaper, especially with something like Redis on the server to cache frequent queries.
If you want to really eke out performance per $$, you can use a lot of Pang's tricks by having a hierarchy of SQLite files.

Scalable web application architecture

I have a really simple bookshop web application written with the Spring framework, just to test its scalability.
I deployed this bookshop on one EC2 instance (t1.micro), with the database on Amazon RDS (t1.micro) using master/slave replication with one master instance and 3 slave instances (there are far more reads than writes). One t1.micro RDS instance allows at most 32 concurrent connections.
Then I did stress testing with JMeter and found that the bottleneck is the database, because of that 32-connection limit per t1.micro RDS instance.
Should I auto scale RDS database instances, given that creating a new replica modifies the master and takes a long time to become available?
Instead of using RDS should I create EC2 instances with MySQL master/replica and then auto scale these instances?
Should I shard my database instead of replication?
Application also uses com.mysql.jdbc.ReplicationDriver to load balance between master and slave instances. Should I use something different like HAProxy?
Have you considered caching and partitioning? The web application we worked on used Memcache, and it really helps with performance issues. On the other hand, if you have tables with a very large number of records, you should consider partitioning; accessing these tables by partition can have a remarkable effect.
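The caching idea is usually applied as a cache-aside pattern: look in the cache first and only hit the database on a miss. Below is a minimal sketch of that pattern. It uses an in-process dict with a TTL as a stand-in for Memcache so that it runs on its own; the CacheAside class and the fake_query function are made up for illustration.

```python
import time


class CacheAside:
    """Hypothetical cache-aside wrapper around a database lookup function."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value); stand-in for Memcache

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = self._load(key)  # cache miss or expired: query the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this after a write so readers do not see stale data.
        self._store.pop(key, None)


calls = []


def fake_query(book_id):
    # Stand-in for a real SELECT against the bookshop database.
    calls.append(book_id)
    return {"id": book_id, "title": "Book %d" % book_id}


cache = CacheAside(fake_query, ttl_seconds=30)
first = cache.get(1)   # miss: runs fake_query
second = cache.get(1)  # hit: served from the cache
```

With a read-heavy workload like the bookshop, most requests would be served from the cache and never consume one of the limited database connections.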

What is the difference between Database Mirroring and Database Replication such as Multi-AZ deployment in Amazon RDS?

I have an application database running with MySQL engine on Amazon RDS. For better availability of our data for users in all parts of the world I'm looking for the best solution.
In the previous version of the application, we mirrored our database in US and Singapore, so that users got a better performance in terms of speed and on our side, we had backup if any disaster occurred.
Now that we have moved to Amazon, will a Multi-AZ deployment serve us in the same way? I mean, does it replicate the database across all regions, or will RDS still work in a single region only?
I have done some research but am still not sure, so please ask me further questions if I'm being unclear.
Thank you.
I think you need both the Multi-AZ and the Read Replica features of AWS RDS.
Multi-AZ just creates a non-accessible secondary DB in another availability zone and in case the primary fails, AWS would switch over to the secondary DB. So you have failover.
In the case you want to increase the performance, and your application can work in read-only mode in Singapore (for example), the Read Replica would be perfect. If writes are also required, you would need to route them to the primary read-write database.
AWS supports a combination of the two approaches.
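On the application side, combining the two features comes down to a small routing rule: reads go to the replica in the user's region, and every write goes to the single primary. The sketch below assumes one primary and one read replica per region; the endpoint names are placeholders, not real RDS endpoints.

```python
# Placeholder endpoints, for illustration only.
PRIMARY = "bookshop-primary.example.internal"
READ_REPLICAS = {
    "us": "bookshop-replica-us.example.internal",
    "singapore": "bookshop-replica-sg.example.internal",
}


def endpoint_for(region, is_write):
    """Pick the database endpoint for a request from the given region."""
    if is_write:
        # Replicas are read-only, so all writes go to the primary.
        return PRIMARY
    # Regions without a replica fall back to reading from the primary.
    return READ_REPLICAS.get(region, PRIMARY)
```

Multi-AZ failover does not change this rule, because the standby is not accessible to the application and only takes over the primary's role when the primary fails.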
RDS MySQL Multi-AZ deployments currently work only inside a single Amazon EC2 region. RDS Read Replicas also need to be in the same Amazon EC2 region. Inter-region replication, which would replicate data to another RDS instance in a different Amazon EC2 region, is the most requested feature for AWS RDS. It is currently on their roadmap.

What are the respective advantages/limitations of Amazon RDS vs. EC2 with MySQL? [closed]

I realize a couple of basic differences between the two, i.e.
EC2 is going to be cheaper
With RDS I wouldn't have to do maintenance
Other than those two, are there any advantages to running my database on RDS as opposed to a separate EC2 server acting as a MySQL server? Assuming similar instance sizes, are both going to run into the same limitations in terms of the load they can handle?
To give you a little bit more info about my use, I've got a database, nothing too big or anything (biggest table 1 million rows), just high SELECT volume.
This is a simple question with a very complicated answer!
In short: EC2 will provide maximum performance if you go with a RAID0 EBS. Doing RAID0 EBS requires a pretty significant amount of maintenance overhead, for example:
http://alestic.com/2009/06/ec2-ebs-raid
http://alestic.com/2009/09/ec2-consistent-snapshot
EC2 without RAID0 EBS will provide crappy I/O performance, thus it's not even really an option.
RDS will provide very good (though not maximum) performance out of the box. The management console is fantastic and it's easy to upgrade instances. High availability and read only slaves are a click away. It's REALLY awesome.
Short answer: Go with RDS. Still on the fence? Go with RDS!!! If you enjoy headaches and tuning every last little bit for maximum performance, then you can consider EC2 + EBS RAID 0. Vanilla EC2 is a terrible option for MySQL hosting.
In this post there is an excellent benchmark between:
Running MySql on a Small EC2 + EBS
Running MySql on a Small EC2 + EBS + adjusted MySql parameters
A Small RDS
The benchmark is very good since it does not focus only on ideal conditions (a single thread) but also covers more realistic scenarios, with 50 threads hitting the database.
RDS is not really a high availability system. Read the fine print in the RDS FAQ. A failover event can take up to 3 minutes. Additionally, Amazon will sometimes decide it needs to "upgrade" your RDS instance and do a failover at that point, which will take your database down for "up to 3 minutes" (our experience is that it can take longer than that).
RDS high availability is very different from master-master or master-slave replication, and is much slower. It doesn't use MySQL replication but some kind of EBS replication. So in a failover situation it will mount the EBS volume on the backup machine, start MySQL, wait for MySQL to do failure recovery (hopefully nothing got corrupted too badly), then do a DNS switch.
I hope this helps you with your evaluation.
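If you stay on RDS, the application has to ride out that failover window itself. A common approach is to retry failed operations with a growing delay until the database is reachable again. The following is a minimal sketch; the call_with_retry helper is hypothetical, the 180-second budget mirrors the "up to 3 minutes" figure above, and a real MySQL driver would raise its own exception types, which you would pass in through retry_on.

```python
import time


def call_with_retry(operation, max_wait_seconds=180.0, first_delay=1.0,
                    max_delay=30.0, retry_on=(ConnectionError,),
                    sleep=time.sleep):
    """Run operation(), retrying with exponential backoff on connection errors."""
    delay = first_delay
    waited = 0.0
    while True:
        try:
            return operation()
        except retry_on:
            if waited >= max_wait_seconds:
                raise  # retry budget spent: surface the error to the caller
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay)  # back off, capped at max_delay
```

Because the failover ends with a DNS switch, the operation passed in should open a fresh connection on each attempt, so that it resolves the endpoint again instead of reusing a connection to the old machine.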
We chose to use EC2 MySQL instances because we have a high read volume and need master-slave replication. Of course, you can spin up multiple RDS instances and set up MySQL replication between them yourself, but we use Scalr.net, which manages that for you using EC2 instances.
Basically, we just tell Scalr how many MySQL instances we want and it keeps them up, automates the setup of replication, handles automatic failover by promoting a slave to master if the master gets terminated, etc. It does both SQL dump backups and EBS volume snapshots of the master. So, when it needs to create a new slave, it automatically and temporarily mounts an EBS volume created from the last master snapshot to initialize the slave DB, then starts replication from the appropriate point. All point and click :)
(and no, I don't work for Scalr or anything. Scalr is available as Open Source if you don't want to use their service)
Regarding the maintenance window question. If you use Multi-AZ then RDS will create a standby replica in another availability zone so that there's no down time for maintenance and you protect yourself against a zone failure.
That's what I'm planning to do in the next week or so. Of course it's going to cost you more but I haven't worked that bit out yet.
MySQL on EC2 vs RDS MySQL
Advantages of MySQL on EC2
Amazon EC2 Inter Region Replication
Copy Snapshots across Amazon EC2 regions
RAID 0 with EBS Striping in MySQL EC2
More than 3TB of Disk space ( You will not need this for your size) can be attached on MySQL on EC2.
Disadvantages of MySQL on EC2
Configuration, Monitoring and Maintenance compared to RDS
Point in time backups available in RDS
Lower IOPS than RDS MySQL (even after RAID 0): currently 10800 IOPS with 6 disks for MySQL on EC2, whereas RDS MySQL reaches 12500 IOPS at 16KB
I have been trying out RDS for a few months and here are some issues I have:
Using SQL profiler is tricky. Since you cannot connect profiler directly to the server, you have to run some stored procedures to create a log file that you can analyze. While they offer some suggestions about how that is done, it is far from user friendly. I would only recommend that you have a certified SQL professional do this kind of work.
While Amazon backs up your instance, you cannot restore an individual database. I have a web app with several separate customer-specific databases, and my solution was to launch an EC2 instance with SQL running on it, attach to the production RDS database, import the data, and then back it up on the EC2 instance. The other solution was to use a 3rd party tool that creates a massive SQL script (on the app server) that will recreate the schema and populate the data back to a restore point.
I had the same question this weekend. There is a 4-hour downtime window per week for RDS in which they do maintenance. RDS seemed more expensive if you can get away with a micro EC2 instance. (This is true of test instances, which have minimal traffic.) I also wasn't able to change the timezone of the RDS instance because I don't have permission.
I am now actually looking at http://xeround.com/ which is mysql on EC2 by another company. They do not use InnoDB, instead they have their own engine called IDG. I am just starting to investigate that but they are in BETA and will give 500MB of space.