Best tool to take a backup from MySQL

I am using MySQL Workbench to take a backup/dump of my database hosted on the Amazon RDS service. My database is quite large (about 8 GiB) and takes 9-10 hours to download from the read replica; meanwhile, I cannot tell whether the download process is stuck or still running.
Is there a GUI tool that can take a backup faster and also show details of the running process, such as which table is currently downloading, its row count, or the percentage of the total download completed? MySQL Workbench is a good tool, but it doesn't expose all the options of the mysqldump command-line utility, and it is also very slow. I also have doubts about my data integrity. Can someone explain how it works, especially with regard to data integrity?
Thanks

First of all, your 8 GB database is by no means 'huge'. Second, I'm not clear on what you're trying to do. Amazon provides multiple ways for you to have backups.
From: http://aws.amazon.com/rds/faqs/
Q: Do I need to enable backups for my DB Instance or is it done automatically?
By default and at no additional charge, Amazon RDS enables automated backups of your DB Instance with a 1 day retention period.
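If you do fall back to the command line, a minimal sketch that addresses both the progress and the integrity concerns might look like this (the hostname and database name are placeholders, and it assumes the pv utility is installed; --single-transaction takes a consistent InnoDB snapshot without locking, --verbose logs each table to stderr, and pv shows bytes and throughput on the data stream):
mysqldump -h mydb.xxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p \
    --single-transaction --verbose mydb | pv > mydb.sql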


What is an efficient way to maintain a local readonly copy of a live remote MySQL database?

I maintain a server that runs daily cron jobs to aggregate data sources and generate reports, accessible by a private Ruby on Rails application.
One of our data sources is a partial dump of one of our partner's databases. The partner runs an active application and the MySQL DB has hundreds of tables. They have given us read-only access to a relatively underpowered readonly slave of their application DB.
Because of latency issues and performance bottlenecks on their slave DB, we have been maintaining a limited local copy of their DB. We only need about 20 tables for our reports, so I only dump those tables. We also only need the data at a daily granularity, so real-time sync is not a requirement.
For a few months, I had a nightly cron that streamed the dump of the necessary tables into a local production_tmp database. Then, when all tables were imported, I dropped production and renamed production_tmp to production. This worked until the DB grew to over 25 GB and we started running into disk space limitations.
For now, I have removed the redundancy step and am just streaming the dump straight into production on our local server. This feels a bit flimsy to me, and I would like to implement a safer approach. Also, the full dump/load currently takes our server over 2 hours, and I'd like to implement an approach that doesn't take as long. The database will only keep growing, so I'd like to implement something future-proof.
Any suggestions would be appreciated!
I take it you have never heard of, or considered, MySQL replication?
The idea is that you do your backup & restore once, and then configure the replica to "subscribe" to a continuous stream of changes as they are made on the primary MySQL instance. Any change applied to the primary is applied automatically to the replica within seconds. You don't have to do the backup & restore procedure again, unless the replica gets damaged.
It takes some care to set up and keep working, but it's a much more efficient method of keeping two instances in sync.
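For reference, a minimal sketch of pointing a fresh replica at the primary (hostnames and credentials are placeholders; the binlog file and position come from the initial backup, e.g. from mysqldump --master-data):
mysql -h replica.example.internal -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='primary.example.internal',
  MASTER_USER='repl',
  MASTER_PASSWORD='********',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
SQL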
@SusannahPotts mentions hot backup and/or incremental backup. You can get both of these features for free, without paying for MySQL Enterprise, by using Percona XtraBackup.
You can also consider using MySQL Transportable Tablespaces.
You'll need filesystem access to run either Percona XtraBackup or MySQL Enterprise Backup, so it's not possible to use these physical backup tools with Amazon RDS, for example.
One alternative is to create a replication slave in the same network as the live system, and run Percona XtraBackup on that slave, where you do have filesystem access.
Another option is to stream the binary logs to another host (see https://dev.mysql.com/doc/refman/5.6/en/mysqlbinlog-backup.html) and then transfer them periodically to your local instance and replay them.
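A sketch of that binlog-streaming approach (host and user are placeholders; --raw and --stop-never need a MySQL 5.6+ client):
# continuously pull binary logs from the live server to local disk
mysqlbinlog --read-from-remote-server --host=primary.example.internal \
    --user=repl --password --raw --stop-never mysql-bin.000001
# later, replay the downloaded logs against the local copy
mysqlbinlog mysql-bin.[0-9]* | mysql -u root -p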
Each of these solutions has pros and cons. It's hard to recommend which solution is best for you, because you aren't sharing full details about your requirements.
This was working until the DB grew to over 25GB, and we started running into disk space limitations.
A few questions here:
Why don't you just increase the available disk space for your database? 25 GB is next to nothing in terms of disk space.
Why don't you modify your script to work per table: download table1, import table1_tmp, drop table1_prod, rename table1_tmp to table1_prod; rinse and repeat (a sketch follows below).
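A rough sketch of that per-table swap (hosts, credentials, and table names are placeholders; it assumes an empty production_tmp schema exists locally). Because RENAME TABLE across schemas is atomic on the same instance, you only ever need one extra table's worth of disk at a time:
for t in table1 table2 table3; do
  # pull one table from the partner's replica into the local staging schema
  mysqldump -h partner-replica.example.com -u ro_user -p"$RO_PW" partnerdb "$t" \
    | mysql -u root -p"$LOCAL_PW" production_tmp
  # atomically swap the fresh copy into place
  mysql -u root -p"$LOCAL_PW" -e \
    "DROP TABLE IF EXISTS production.\`$t\`; RENAME TABLE production_tmp.\`$t\` TO production.\`$t\`;"
done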
Other than that:
Why don't you ask your partner for a system with enough performance to run your reports on? I'm quite sure they would prefer that over having YOU download sensitive data to your "local site" every day.
Last thought (requires MySQL Enterprise Backup https://www.mysql.de/products/enterprise/backup.html):
Rather than dumping, downloading and importing 25 GB every day:
Create a full backup
Download and import
Use differential or incremental backups from then on.
The next day, you download (and import) only the data delta: https://dev.mysql.com/doc/mysql-enterprise-backup/4.0/en/mysqlbackup.incremental.html
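Roughly, with mysqlbackup (user and paths are placeholders):
# one-time full backup
mysqlbackup --user=backup_user --password --backup-dir=/backups/full backup
# nightly incrementals containing only the delta since the last backup
mysqlbackup --user=backup_user --password --incremental \
    --incremental-base=history:last_backup \
    --backup-dir=/backups/incr-$(date +%F) backup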

Export huge database from Amazon RDS to local MySQL

I have a MySQL database on Amazon RDS (about 600 GB of data). I need to move it back home to our local dedicated servers, but I don't know where to start.
Every time I try to initiate a mysqldump it freezes. Is there a way to move it onto S3, maybe even splitting it into smaller parts before starting the download?
How would you go about migrating a 600 GB MySQL DB?
Have you tried the innobackupex script? It lets you back up a live database (hot backup) and tar|gzip the final backup, so you get a smaller file. It works only with file_per_table=1.
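For example, something along these lines (user and paths are placeholders; note this needs filesystem access to the data directory, so it cannot run directly against RDS):
innobackupex --user=backup_user --password='********' --stream=tar /tmp \
    | gzip > /backups/hot-$(date +%F).tar.gz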
If you have downtime available to move the database, you can also try optimizing tables to reclaim some space (especially if you have done a lot of deletes).
You can also think about getting rid of some data (logs, archives, etc.) and moving it later.

Scheduled CloudBees MySQL Backup

This may be a stupid question, but after hours of googling I can't find a suitable answer to this.
We have a business-critical application running on CloudBees. The source code is backed up properly, and we want the same for our DB. The CloudBees docs say:
"CloudBees MySQL databases are backed by EBS volumes on Amazon EC2 which provides a first layer of storage redundancy. EBS volumes are backed up to S3 every 24 hours for disaster recovery and are not generally available for customer use on multi-tenant MySQL clusters. Customers using Dedicated MySQL instances can request rollbacks to previous backup snapshots by filing a support ticket."
So basically we are protected out of the box in case of emergencies, but not if an employee accidentally deletes something they should not.
So my question is: how can we automatically back up a CloudBees MySQL DB every night? We have Amazon S3 storage where it could be put.
Any ideas?
You can use a command-line script that backs up your databases to your S3 account quite easily, and run it as often as you like. I had exactly the same problem a while back and wrote up this handy tutorial. It should be perfect for what you want to do.
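A minimal sketch of such a script (host, credentials, and bucket name are placeholders; it assumes the AWS CLI is installed and configured, though the tutorial may use a different S3 client):
#!/bin/sh
# dump, compress, and ship the database to S3, then clean up
STAMP=$(date +%F)
mysqldump -h mysql.mycompany.example -u app_user -p"$DB_PASSWORD" appdb \
    | gzip > /tmp/appdb-$STAMP.sql.gz
aws s3 cp /tmp/appdb-$STAMP.sql.gz s3://my-backup-bucket/mysql/appdb-$STAMP.sql.gz
rm /tmp/appdb-$STAMP.sql.gz
Run it nightly from cron, e.g.: 0 3 * * * /usr/local/bin/mysql-s3-backup.sh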

Amazon RDS mysqldump outside of the Amazon ecosystem

I would like to do a daily mysqldump to my own local disk outside of the Amazon ecosystem. I have a few reasons for wanting to do this daily.
I want to have more control over my database for when RDS/EBS goes down again.
RDS only allows you to restore within the same availability zone. This really gets me, because a natural disaster or network fault in that availability zone pretty much renders the backups useless: you can only restore to the same zone. :/
I would like a sandbox/test database where I don't have to pay for space and bandwidth.
My big question: if I do a daily mysqldump of a 50 GB database, will my bandwidth/IO costs skyrocket? I'm assuming they will! Has anyone done something like this before?
UPDATE:
I am running a Multi-AZ production environment, but recent outages still proved that there is no such thing as complete failover.
Our company has two services: a front-facing web site and internal processing. It's most important that our internal operations don't stop; our web site could go dark for several hours if need be. Having a recent mysqldump at my fingertips seems priceless to me.
So you have a few points of concern here.
With regard to being in control of your database, I am not really sure what you are getting at. If your production DB goes down, you don't have control over it, and even a local backup won't do you much good if you don't have a place to host that data.
Is your current production RDS instance a Multi-AZ instance, to help shield against an AZ outage? If it is, failover would happen automatically for you.
RDS snapshots can be restored into a different availability zone. See the documentation for the rds-restore-db-instance command-line tool at this link: http://docs.amazonwebservices.com/AmazonRDS/latest/CommandLineReference/CLIReference-cmd-RestoreDBInstanceFromDBSnapshot.html
Note that you can specify which AZ you want to restore to.
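With the current AWS CLI, the equivalent restore-into-another-AZ call looks roughly like this (all identifiers are placeholders):
aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier mydb-restored \
    --db-snapshot-identifier mydb-snapshot-2012-06-01 \
    --availability-zone us-east-1b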
Based on a daily backup of 50 GB (about 1.5 TB of transfer per month), you would be talking about spending roughly $180 per month on data transfers for backups alone. It would be MUCH cheaper to simply have a small test RDS instance in the same region as your production RDS instance for testing (I think a micro is around $5/month). All data transfer between these instances (i.e., moving snapshots onto it) would be free.
You can do the math on pricing yourself here: http://aws.amazon.com/rds/#pricing
This is not to mention that taking your daily backups against production would interrupt production DB access for as long as the dump locks the DB. That is, of course, unless you pay for an RDS read replica that you can take the dumps from.
Finally, there are subtle differences between RDS and a standalone MySQL server in how they are configured. I would much rather have my testing environment be as similar to my production environment as possible.
Just try it. I pull from Amazon to my local MySQL server, which runs on Ubuntu.
mysqldump signs -h signs.c3x4aregvxxx.us-east-1.rds.amazonaws.com -P 3306 -u cartersxxx -pxxxxxx | mysql -u root -pxxxxxx signs
I have been unable to predict billing at Amazon in advance, and I am actively trying to get away from them. FYI, I pay $72/month for 10 GB of MySQL with low bandwidth. IMHO, table size dictates cost.

Should I stick only to AWS RDS Automated Backup or DB Snapshots?

I am using AWS RDS for MySQL. When it comes to backup, I understand that Amazon provides two types: automated backups and database (DB) snapshots. The difference is explained here. However, I am still confused: should I stick to automated backups only, or use both automated backups and manual DB snapshots?
What do you think? What's your own setup? I have heard from others that automated backups are unreliable, because the database can be unrecoverable when the DB instance crashes, so DB snapshots are what rescue you. But if I take daily DB snapshots with settings similar to the automated backups, I am going to pay quite a lot.
I hope someone can enlighten me or advise me on the right setup.
From personal experience, I recommend doing both. I have the automated backups set to 8 days, and I also have a script that takes a snapshot once per day and deletes snapshots older than 7 days. The reason is that, from what I understand, there are certain situations where you cannot restore from the automated backups. For example, if you accidentally deleted your RDS instance and did not take a final snapshot, you would not be able to access the automated backups that were taken. But it is also good to have the automated backups turned on, because they provide the point-in-time restore.
Hope this helps.
EDIT
To answer your comment, I use a certain naming convention when my script creates the snapshots. Something like:
autosnap-instancename-2012-03-23
When it goes to do the cleanup, it retrieves all the snapshots, looks for that naming convention, parses the date, and deletes any older than a certain date.
I think you could also look at the snapshot creation date, but this is just how I ended up doing it.
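A rough sketch of that create-and-prune routine using the modern AWS CLI (the original script likely used the older rds-* command-line tools; the instance name, the autosnap- prefix, and the 7-day retention are assumptions, and date -d is GNU date):
#!/bin/bash
# take today's snapshot using the naming convention above
aws rds create-db-snapshot --db-instance-identifier instancename \
    --db-snapshot-identifier "autosnap-instancename-$(date +%F)"
# delete manual snapshots whose embedded date is older than 7 days
CUTOFF=$(date -d '7 days ago' +%F)
aws rds describe-db-snapshots --snapshot-type manual \
    --query 'DBSnapshots[].DBSnapshotIdentifier' --output text \
  | tr '\t' '\n' | grep '^autosnap-' | while read -r snap; do
      snap_date=${snap: -10}   # trailing YYYY-MM-DD from the name
      if [[ "$snap_date" < "$CUTOFF" ]]; then
          aws rds delete-db-snapshot --db-snapshot-identifier "$snap"
      fi
  done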
Just from personal experience: yesterday I accidentally deleted a table and had to restore from an RDS snapshot. The latest snapshot was only 10 minutes old, which was perfect. However, Amazon RDS took about 3 hours to bring the snapshot online, during which time the affected section of our site was completely offline.
So if you need to make a very quick recovery, do NOT depend on RDS backups.
Keep in mind that you can't download your snapshot to view a database dump; your only option is to wait for it to load into a new database instance. So if you're only looking to restore a single table, RDS backups can make it a very painful process.
No blame to Amazon on this; they are awesome. But it is just something to keep in mind when planning, because it was a learning experience for us.
There are some situations where an automated backup cannot recover the specific table you want, even though it has a point-in-time recovery feature. I suggest enabling the Backtrack feature for this kind of recovery. You can also use the AWS Backup service to manage backups of Amazon RDS DB instances; backups managed by AWS Backup are considered manual DB snapshots.
Also, you will be required to keep automated backups enabled in order to create a read replica for a DB instance to improve read performance. The retention period for automated backups must be between 1 and 35 days, so you can keep it at a minimum of 1 day.