I am running a Django REST API on an AWS EC2 server. Right now the APIs use a MySQL database on localhost. Should I move my database from local MySQL to an Amazon RDS instance?
As far as I know, a remote database server adds a little extra time to every request because of network transmission and shared resources. Is that extra latency worth it when migrating from local MySQL to an Amazon RDS instance?
I read this answer, but it didn't help me much.
MySql localhost vs Amazon RDS instance
An answer with all possible Pros and Cons will really be appreciated.
Pros for local MySQL
Slightly faster, because of proximity to the application
Cons for local MySQL
Not Easily Scalable
If you want to autoscale your application with load and traffic, you are in for nightmares: every new node you scale out to will also be running its own MySQL server.
Pros for RDS
You don't have to worry about installing and maintaining MySQL server
You don't have to worry about scaling
You don't have to worry about load balancing
You don't have to worry about EC2 upgrades and patching
You don't have to worry about failure recovery because when you provision a Multi-AZ DB Instance, Amazon RDS synchronously replicates the data to a standby instance in a different Availability Zone (AZ)
Cons for RDS
Slightly slower due to network latency
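For the Django case in the question, trying both options mostly comes down to which host the settings point at. Here is a minimal sketch, assuming the connection details come from environment variables; the variable names (DB_HOST and so on), the defaults, and the function name are hypothetical and not something from the original setup:

```python
import os


def database_settings(env=None):
    """Build a Django-style DATABASES dict from environment variables."""
    env = os.environ if env is None else env
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            # All names and defaults below are illustrative assumptions.
            "NAME": env.get("DB_NAME", "appdb"),
            "USER": env.get("DB_USER", "appuser"),
            "PASSWORD": env.get("DB_PASSWORD", ""),
            # "localhost" keeps MySQL on the same EC2 box; set DB_HOST to the
            # RDS endpoint to move the database off the instance.
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "3306"),
        }
    }
```

In settings.py this would be used as DATABASES = database_settings(), so you can switch between local MySQL and RDS without a code change and compare the two under your own load.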
It depends how database intensive your application is.
See this benchmark: the local database blew RDS out of the water on query latency under low load.
The answer is probably to use both: a local Redis/MySQL for quick queries, and an off-server RDS instance for long queries over large data sets, where paying the additional network latency makes sense.
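The "use both" idea can be sketched as a small read-through cache that sits in front of the remote database. This is only an in-process illustration with a plain dict standing in for Redis; the class name and the loader callable are hypothetical:

```python
import time


class ReadThroughCache:
    """Serve repeated reads locally; fall back to a slower loader on a miss."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # e.g. a function that queries the remote DB
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh entry: no network round trip
        value = self._loader(key)  # miss or expired: pay the remote latency once
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Call after a write so the next read goes back to the database."""
        self._entries.pop(key, None)
```

The trade-off is the usual one: reads can be stale for up to ttl_seconds unless writes call invalidate.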
Also think about using SQLite on S3. If you can easily shard your data, and most queries are read intensive it could be a lot cheaper, especially with something like Redis on the server to cache frequent queries.
If you really want to eke out performance per dollar, you can use a lot of Pang's tricks by having a hierarchy of SQLite files.
Related
I've been looking around for best practices when setting up your database on the cloud but it still isn't clear to me which of the following solutions should we be going for?
Amazon RDS Aurora
Amazon RDS MySQL
MySQL on EC2 instances
I see Amazon Aurora being marketed as the better alternative; however, after some research, it doesn't seem like people are using it. Is there a problem with it?
You should benchmark Aurora carefully before you consider it. Launch an instance and set up a test instance of your application and your database. Generate as high a load as you can. I did this at my last company, and I found that despite Amazon's claims of high performance, Aurora failed spectacularly: it was two orders of magnitude slower than RDS. Our app had a high rate of write traffic.
Our conclusion: if you have secondary indexes and have high write traffic, Aurora is not suitable. I bet it's good for read-only traffic though.
(Edit: the testing I'm describing was done in Q1 of 2017. As with most AWS services, I expect Aurora to improve over time. Amazon has an explicit strategy of "Release ideas at 70% and then iterate." From this, we should conclude that a new product from AWS is worth testing, but probably not production-ready for at least a few years after it's introduced).
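If you want a starting point for that kind of benchmark, here is a rough latency harness. It is a sketch, not the test I ran: the in-memory SQLite workload is only a stand-in so the example runs anywhere, and the table and function names are made up. In a real test you would pass a callable that runs your own write queries against RDS or Aurora.

```python
import sqlite3
import statistics
import time


def measure_latency(run_query, iterations=200):
    """Time repeated calls to run_query and summarize the latency in ms."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "count": len(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def make_write_workload():
    """Stand-in workload: inserts into a table that has a secondary index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

    def insert_one():
        conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)", ("alice", 9.99)
        )
        conn.commit()

    return insert_one
```

Calling measure_latency(make_write_workload()) returns the summary dict. Run the same callable against each candidate and compare the median and tail latencies rather than a single average.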
At that company, I recommended RDS. They had no dedicated DBA staff, and the automation that RDS gives you for DB operations like upgrades and backups was very helpful. You sacrifice a little bit of flexibility on tuning options, but that shouldn't be a problem.
The worst inconvenience of RDS is that you can't have a MySQL user with SUPER privilege, but RDS provides stored procs for most common tasks you would need SUPER privilege for.
I compared a multi-AZ RDS instance versus a replica set of EC2 instances, managed by Orchestrator. Because Orchestrator requires three nodes so you can have quorum, RDS was the clear winner on cost here, as well as ease of setup and operations.
I don't use Aurora personally, but I can HIGHLY recommend RDS over running your own on EC2. Having the failover happen automatically and also the backups is just worth every penny. Especially since RDS isn't that much more expensive.
Aurora looks really good on paper, but the more flexible choice of instances has kept me on PostgreSQL until now. We're looking at migrating to Aurora though, mainly because of the autoscaling storage provisioning and the higher performance.
AWS RDS is the managed database solution, which supports multiple database engines: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. When you go with RDS, it provides built-in configuration options such as:
Database Replication for High Availability
Read Replicas for Scalability
Backups & Restore
Operating system and software patches, etc.
This reduces the overhead of database administration. However, the flexibility is limited to what RDS offers.
Alternatively, if you host your database on an EC2 instance, you can install the required versions of the database engines, install any needed extensions, and so on. This provides more flexibility, but it also requires expertise and adds administration overhead.
Amazon Aurora differs from the rest of the RDS engines because it's new, implemented by Amazon from the ground up, and offers higher performance and reliability out of the box (as marketed by Amazon) at reasonable pricing. However, one limitation of Aurora is that it's not included in the AWS free tier; the smallest instance type it supports is "small".
Note: some of the features offered by RDS, and the cost, differ based on the database option you select.
I have a really simple bookshop web application written with the Spring framework, just to test its scalability.
I deployed this bookshop on one EC2 instance (t1.micro), with the database on Amazon RDS (t1.micro) using master/slave replication with one master instance and 3 slave instances (there are really a lot more reads than writes). One t1.micro RDS instance allows at most 32 concurrent connections.
Then I did stress testing with JMeter and figured out that the bottleneck is the database, since a t1.micro RDS instance allows at most 32 concurrent connections.
Should I auto scale RDS database instances, since creating a new replica modifies the master and takes a really long time to become available?
Instead of using RDS should I create EC2 instances with MySQL master/replica and then auto scale these instances?
Should I shard my database instead of replication?
Application also uses com.mysql.jdbc.ReplicationDriver to load balance between master and slave instances. Should I use something different like HAProxy?
Have you considered caching and partitioning? The web application we worked on used Memcache, and it really helped with performance issues. Also, if you have tables with a very large number of records, you should consider partitioning; accessing those tables by partition can have a remarkable effect.
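Alongside caching, it may also help to make the application wait for a free slot instead of opening more connections than the 32 the question mentions. The following is only a sketch of that idea in Python (the application in the question is Java, where a connection pool's maximum size setting plays the same role); the class name is hypothetical:

```python
import threading
from contextlib import contextmanager


class BoundedConnectionGate:
    """Cap how many callers may hold a database connection at once."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)

    @contextmanager
    def slot(self, timeout=None):
        # Block until a slot is free, or give up after `timeout` seconds.
        acquired = self._slots.acquire(timeout=timeout)
        if not acquired:
            raise TimeoutError("no free database connection slot")
        try:
            yield
        finally:
            self._slots.release()
```

With a gate sized a little below the server limit, a traffic spike turns into queued requests rather than refused connections.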
I've been playing with AWS EC2 and really like it. There is one drawback though: the instance could disappear due to hardware failure or for whatever other reason. This happened to me in my first week of operation. I was wondering whether there are good solutions to back up a MySQL database so that I don't lose my customer credentials?
You can transfer a MySQL dump directly from the EC2 machine to an S3 bucket, but you will pay more for bandwidth and storage. Alternatively, you can use a (trustworthy) third-party backup application or plugin, which compresses and encrypts your data before saving it to S3. You can also enable snapshots and take snapshots of your volumes (hard drives).
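The do-it-yourself route (dump, compress, copy to S3) can be sketched roughly as below. The key layout and function names are hypothetical, the script assumes mysqldump can find its password on its own (for example in ~/.my.cnf), and the upload step needs the third-party boto3 package with AWS credentials configured. Treat it as an outline, not a hardened backup tool.

```python
import datetime
import gzip
import shutil
import subprocess


def backup_key(database, now=None):
    """Build a timestamped S3 object key (the layout is an assumption)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return f"mysql-backups/{database}/{now:%Y-%m-%dT%H-%M-%SZ}.sql.gz"


def dump_command(database, user, host="localhost"):
    # --single-transaction takes a consistent InnoDB snapshot without locking.
    return ["mysqldump", "--single-transaction",
            f"--host={host}", f"--user={user}", database]


def dump_to_gzip(database, user, out_path, host="localhost"):
    """Stream mysqldump output into a gzip file so less data is uploaded."""
    proc = subprocess.Popen(dump_command(database, user, host),
                            stdout=subprocess.PIPE)
    with gzip.open(out_path, "wb") as out:
        shutil.copyfileobj(proc.stdout, out)
    if proc.wait() != 0:
        raise RuntimeError("mysqldump failed")


def upload_to_s3(path, bucket, key):
    import boto3  # third-party; imported lazily so the helpers work without it
    boto3.client("s3").upload_file(path, bucket, key)
```

Run from cron, this would call dump_to_gzip and then upload_to_s3 with the result of backup_key as the object key.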
I suggest using the 'StoreGrid' backup software to back up your MySQL database on the EC2 machine. See the following link to learn more about its online backup service for Amazon EC2/S3: http://storegrid.vembu.com/online-backup/amazon-ec2-s3-cloud-online-backup.php
See the following link to configure MySQL database backup: http://storegrid.vembu.com/online-backup/mysql-backup.php?ct=1
Note: you mentioned that hardware failures occur, so you can back up entire hard drives too using the above software.
I hope your MySQL database is now backed up from the EC2 instance and stored safely in S3.
Cheers!
Amazon now offers the Relational Database Service (RDS): pre-configured EC2 instances, without any OS access, that host MySQL (or Oracle, or SQL Server) for you. It aims to solve many of the availability, reliability and durability issues one faces when hosting a transactional data store on a bare EC2 instance.
http://aws.amazon.com/rds/
"automated backups, DB snapshots, automatic host replacement, and Multi-AZ deployments"
I have an app whose database is being migrated to amazon RDS.
I experienced a significant drop of performance, due to the latency of the queries between RDS and our server (like 30s of loading time only because of the queries). There is no explicit caching, and the requests could be optimized a bit more, but this is still more than 10x slower than with a local database.
Is this kind of performance drop expected? If yes, is there a way to use a cloud database with performance similar to a local one?
There have been some reports from people of poor RDS performance, although as far as I've seen Amazon hasn't acknowledged these issues.
RDS (which is just a custom version of MySQL) uses EBS as the storage backend, and I'm sure you are well aware of the failure they just had with that service.
I've read that a lot of companies just run their own MySQL databases on EC2 instances because that has been shown to give more reliable performance.
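Separately from any RDS-specific issues, the slowdown in the question is also consistent with plain per-query round trips adding up when a page issues many small queries one after another. A back-of-the-envelope sketch, where the query count and latencies are made-up illustrative numbers and not measurements from the question:

```python
def estimated_query_time(query_count, round_trip_ms, server_time_ms=0.0):
    """Seconds spent if queries run one after another over the network."""
    return query_count * (round_trip_ms + server_time_ms) / 1000.0


# Illustrative only: 3000 sequential queries at 10 ms per round trip.
remote_seconds = estimated_query_time(3000, 10)
# The same 3000 queries at an assumed 0.1 ms against a local database.
local_seconds = estimated_query_time(3000, 0.1)
```

If your numbers look like this, reducing the number of queries per request (joins, batching, caching) will help more than switching database hosts.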
I'm trying to understand how to architect an Amazon Web Services application.
I have an instance running off of EBS. As far as I understand, I need to mount the EBS drive so that I can store my MySQL database on it.
When I later want to scale up, how do I do so? I understand that I can add more server instances, but how will they access the database, since from what I understand an EBS volume can only be attached to one server instance?
I can't speak to this particular setup, as I do not have experience using EBS with a MySQL instance, but this type of scaling is typically accomplished by dedicating a particular instance as the master database server. Any time you spin up additional web servers, they still use the master DB's IP to connect. Once your database becomes the bottleneck, you spin up a slave DB instance on one of the boxes (or on its own dedicated box). You can then configure replication either in a master-to-slave direction or as circular replication, so that you can write to the slave instance as well.
If you choose classic master-to-slave replication, then you will have to make sure your writes are only performed on the master DB instance.
You can set up something like Zeus or any other connection load balancer so that you only ever have to connect to a single database IP, which will then round-robin your read connections across your pool of servers. Otherwise you'd have to manage the connections yourself, which is definitely not trivial. Good luck.
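If you do end up managing the read/write split in the application instead of behind a load balancer, the core of it looks something like the sketch below. It uses the method names that Django's database routers expect (db_for_read, db_for_write, allow_relation, allow_migrate) so it could be listed in DATABASE_ROUTERS, but the class name and the connection aliases "default", "replica1" and "replica2" are assumptions for illustration:

```python
import itertools


class PrimaryReplicaRouter:
    """Send writes to the master and round-robin reads across replicas."""

    def __init__(self, primary="default", replicas=("replica1", "replica2")):
        self._primary = primary
        # cycle() hands out replica aliases in round-robin order.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def db_for_read(self, model, **hints):
        return next(self._replicas) if self._replicas else self._primary

    def db_for_write(self, model, **hints):
        # With master-to-slave replication, all writes must hit the master.
        return self._primary

    def allow_relation(self, obj1, obj2, **hints):
        # Every alias holds the same data, so relations are always allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes go to the master only and replicate from there.
        return db == self._primary
```

Keep replication lag in mind: a read routed to a replica right after a write may not see that write yet.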
Growing Amazon EBS Volume sizes
You can give MySQL clustering a try on your EBS-backed instances. I have a similar question, with more requirements, posted here.
EBS volume capacity can be scaled up using the snapshot -> launch new volume technique; alternatively, storage capacity can be scaled out using EBS striping (RAID 0).
In AWS you cannot mount the same EBS volume on 2 EC2 instances simultaneously, so when you are scaling your application you need to scale your MySQL DB out or up through either replication or clustering. AWS RDS is a very good option for MySQL; if your application is read intensive, you can scale out using RDS read replicas as well. If you need write scaling, then functional partitioning or MySQL shards can be explored.
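For the sharding option, the routing step is the simple part: map a key to a shard in a way that every application server agrees on. A minimal sketch, assuming hash-based sharding on something like a customer ID (the function name and the choice of SHA-256 are illustrative):

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index in [0, shard_count) deterministically."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Use a stable hash: Python's built-in hash() of a str varies per process.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

Note that with plain modulo hashing, changing shard_count moves most keys to a different shard, so pick the count with growth in mind or look into consistent hashing.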
AWS has an entire product dedicated to this: RDS.
In all but the rarest and most specialized of circumstances you're going to be better off using RDS than trying to create and tune your own EBS/EC2/MySQL infrastructure.
RDS also directly answers your question: it lets you create read-only databases to use as query slaves. RDS also performs backups, upgrades, and all sorts of fail-over infrastructure for you.
With EBS there's no way to attach a disk to multiple EC2 instances, so you're not going to be able to build out a failover cluster using that approach. Instead you're going to need replication or backup tools of some type.