Moving a large MySQL database from a resource-limited server

I have a Windows Server with MySQL Database Server installed.
There are multiple databases on it; among them, database A contains a huge table named 'tlog', about 220 GB in size.
I would like to move database A to another server for backup purposes.
I know I can do an SQL dump or use MySQL Workbench/SQLyog to do a table copy.
But because disk storage on the server is limited (less than 50 GB), an SQL dump is not possible.
The server is also serving other workloads, so CPU and RAM are limited too. As a result, copying the table without using up CPU and RAM is not possible.
Is there any other method for moving the huge database A over to another server?
Thanks in advance.

You have a few ways:
Method 1
Dump and compress at the same time: mysqldump ... | gzip > blah.sql.gz
This method is good because chances are the compressed dump will be less than 50 GB: the dump is plain text, and you're compressing it on the fly.
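Method 1 can also be driven from a script. The following is a minimal Python sketch, not a tested backup tool: the helper name `dump_compressed` and the output file name are assumptions made for illustration, and a tiny stand-in command replaces mysqldump so the sketch runs without a database. It gzips the command's output as it arrives, so the uncompressed dump never touches the disk.

```python
import gzip
import shutil
import subprocess
import sys


def dump_compressed(dump_cmd, out_path, chunk_size=1024 * 1024):
    """Run dump_cmd and gzip its stdout into out_path as it streams.

    dump_compressed is a hypothetical helper, not part of MySQL.
    Only the compressed bytes are written to disk.
    """
    proc = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    try:
        with gzip.open(out_path, "wb") as gz:
            # Copy in fixed-size chunks so memory use stays flat
            # no matter how large the dump is.
            shutil.copyfileobj(proc.stdout, gz, chunk_size)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, dump_cmd)
    return out_path


# Stand-in producer so the sketch runs without a MySQL server.
demo_cmd = [sys.executable, "-c", "print('CREATE TABLE t (id INT);')"]
dump_compressed(demo_cmd, "demo.sql.gz")
```

In real use the command would be something like `["mysqldump", "--single-transaction", "--quick", "A"]` plus connection options, and `out_path` could point at a network share on the target server if even the compressed dump is too large for the local disk.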
Method 2
You can use slave replication; this method will require a dump of the data.
Method 3
You can also use xtrabackup.
Method 4
You can shut down the database and rsync the data directory.
Note: You don't actually have to shut down the database. You can instead run rsync several times until a pass finds nothing left to change (unlikely if the database is busy, so do it during a slow period), which means the copy is in sync.
I've had to do this with fairly large PostgreSQL databases (1 TB+). It takes a few rsyncs, but hey, that's the cost of zero downtime.
Method 5
If you're in a virtual environment you could:
Clone the disk image.
If you're in AWS you could create an AMI.
You could add another disk and just sync locally; then detach the disk, and re-attach to the new VM.
If you're worried about consuming resources during the dump or transfer, you can use ionice and renice to lower the priority of the dump/transfer.

Related

Mysqldump on RDS Read Replica Slave is 50x slower

I created a read-replica of my MySQL database on amazon RDS.
When executing the following command, it is super fast (half a second) on the master, but takes more like 30 seconds on the slave. Super annoying because I wanted to dump off of the slave so that I don't slow down the master.
mysqldump --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql
There are three issues to consider.
The most significant is that mysqldump does not perform well when run at a distance from the database, due to limitations in the traditional MySQL client/server wire protocol, which makes no allowance for pipelining a series of commands.
The mysqldump utility uses no magic to generate dump files -- it issues SQL statements to the server, and takes the results of those queries to generate its output.
As a result, every single object (schema, table, view, stored function/procedure, event) in the database requires at least one round trip and sometimes more than one.
For each table, mysqldump first issues SHOW CREATE TABLE t1; followed by SELECT * FROM t1; ... so a round trip time of 100 ms would mean that extracting a dump file of 150 tables would mean 150 × 2 × 0.100 = 30 seconds are simply wasted by the distance between the machine running mysqldump and the server -- and this is true even if the tables are completely empty.
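The arithmetic above is easy to rerun for other latencies. This is a small sketch of the same estimate; the function name and the 1 ms comparison value are illustrative assumptions, and the two-round-trips-per-table figure is the simplification used in the answer.

```python
def round_trip_overhead(num_tables, rtt_seconds, trips_per_table=2):
    """Seconds spent purely waiting on the network during a dump."""
    return num_tables * trips_per_table * rtt_seconds


# The answer's figures: 150 tables, 100 ms round trip.
print(round(round_trip_overhead(150, 0.100), 3))  # 30.0
# The same schema with a hypothetical 1 ms round trip.
print(round(round_trip_overhead(150, 0.001), 3))  # 0.3
```

The overhead scales linearly with both the object count and the round-trip time, which is why the same dump can feel instant from a nearby host and slow from a distant one.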
This is not a recommendation, but you might take a look at mydumper, which claims to be able to create the backup using multiple database connections in parallel. Parallelizing the dump process could help mitigate the cycles wasted as commands pass to the server and return to the client. I don't know the quality of this code base, but something like this could help.
Next, you almost always want to use the --compress option for mysqldump. Contrary to what you might assume, this does not compress the backup file. The generated backup file is identical when this option is used, but when this feature is activated, the server compresses the data it sends to mysqldump on the wire, and mysqldump decompresses the data again before writing it out -- so this option will almost always make for a faster process unless the machine running mysqldump and the database server are connected by a low-latency, high-bandwidth network. Because the generated file is identical, there are no compatibility concerns when using this option.
Finally, there's an issue with newly-created RDS servers that you need to be aware of, so that it doesn't skew your benchmarks. When you create an RDS replica, it is originally seeded with data from a snapshot of the upstream master. This is, behind the scenes, an EBS snapshot of the master's hard drive, and the new database instance is backed by an EBS volume restored from that snapshot. EBS volumes are lazily-loaded from the snapshot, so they have a documented first-touch penalty. This issue could have a substantial impact on the performance of the first complete backup, but should have no meaningful impact after that.

What is an efficient way to maintain a local readonly copy of a live remote MySQL database?

I maintain a server that runs daily cron jobs to aggregate data sources and generate reports, accessible by a private Ruby on Rails application.
One of our data sources is a partial dump of one of our partner's databases. The partner runs an active application and the MySQL DB has hundreds of tables. They have given us read-only access to a relatively underpowered readonly slave of their application DB.
Because of latency issues and performance bottlenecking on their slave DB, we have been maintaining a limited local copy of their DB. We only need about 20 tables for our reports, so I only dump those tables. We also only need the data to a daily granularity, so realtime sync is not a requirement.
For a few months, I had implemented a nightly cron which streamed the dump of the necessary tables into a local production_tmp database. Then, when all tables were imported, I dropped production and renamed production_tmp to production. This was working until the DB grew to over 25GB, and we started running into disk space limitations.
For now, I have removed the redundancy step and am just streaming the dump straight into production on our local server. This feels a bit flimsy to me, and I would like to implement a safer approach. Also, currently doing the full dump/load takes our server over 2 hours, and I'd like to implement an approach that doesn't take as long. The database will only keep growing, so I'd like to implement something future proof.
Any suggestions would be appreciated!
I take it you have never heard of, or considered MySQL Replication?
The idea is that you do your backup & restore once, and then configure the replica to "subscribe" to a continuous stream of changes as they are made on the primary MySQL instance. Any change applied to the primary is applied automatically to the replica within seconds. You don't have to do the backup & restore procedure again, unless the replica gets damaged.
It takes some care to set up and keep working, but it's a much more efficient method of keeping two instances in sync.
@SusannahPotts mentions hot backup and/or incremental backup. You can get both of these features for free, without paying for MySQL Enterprise, by using Percona XtraBackup.
You can also consider using MySQL Transportable Tablespaces.
You'll need filesystem access to run either Percona XtraBackup or MySQL Enterprise Backup. It's not possible to use these physical backup tools for Amazon RDS, for example.
One alternative is to create a replication slave in the same network as the live system, and run Percona XtraBackup on that slave, where you do have filesystem access.
Another option is to stream the binary logs to another host (see https://dev.mysql.com/doc/refman/5.6/en/mysqlbinlog-backup.html) and then transfer them periodically to your local instance and replay them.
Each of these solutions has pros and cons. It's hard to recommend which solution is best for you, because you aren't sharing full details about your requirements.
This was working until the DB grew to over 25GB, and we started running into disk space limitations.
Some question marks "here":
Why don't you just increase the available disk space for your database? 25 GB is very little as disk space goes.
Why don't you modify your script to: download table1, import it as table1_tmp, drop table1_prod, rename table1_tmp to table1_prod; rinse and repeat.
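The per-table swap could be scripted along these lines. This is a sketch of a variation, not the exact steps above: instead of dropping the live table first, it uses a single multi-table RENAME TABLE statement, which MySQL executes atomically, so readers never see a missing table. The `_tmp` and `_old` suffixes and the function name are assumptions for illustration.

```python
import re


def swap_statements(table):
    """Return SQL that replaces `table` with a freshly loaded `table`_tmp."""
    # Reject anything that is not a plain identifier before quoting it.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    tmp, old = f"{table}_tmp", f"{table}_old"
    return [
        # Both renames happen in one atomic statement.
        f"RENAME TABLE `{table}` TO `{old}`, `{tmp}` TO `{table}`",
        # The previous copy is only dropped after the swap succeeded.
        f"DROP TABLE `{old}`",
    ]


for name in ["table1", "table2"]:
    for stmt in swap_statements(name):
        print(stmt + ";")
```

Either way, only one table is duplicated at a time, so the extra disk space needed is the size of the largest table rather than the whole database.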
Other than that:
Why don't you ask your partner for a system with enough performance to run your reports on? I'm quite sure they would prefer this to having YOU download sensitive data every day to your "local site".
Last thought (requires MySQL Enterprise Backup https://www.mysql.de/products/enterprise/backup.html):
Rather than dumping, downloading and importing 25 GB every day:
Create a full backup
Download and import
Use differential or incremental backups from now on.
The next day you download (and import) only the data-delta: https://dev.mysql.com/doc/mysql-enterprise-backup/4.0/en/mysqlbackup.incremental.html

Are docker-hosted databases somehow exempt from backup best practices?

As far as I was aware, for MS SQL, PostgreSQL, and even MySQL databases (so, I assumed, in general for RDBMS engines), you cannot simply back up the file system they are hosted on, but need to do an SQL-level backup to have any hope of internal consistency and therefore ability to actually restore.
But then answers like this and indeed the official docs referenced seem to suggest that one can just tar away on database data:
docker run --volumes-from dbdata -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata
These two ideas seem at odds with one another. Is there something special about how Docker works that makes it unnecessary to use SQL-level backups? If not, what am I missing in my understanding? (Why is something used as the official example when you can't use it to back up a production database? That can't be right...)
Under certain circumstances, it should be safe to use the image of a database on a disk:
The database server is not running.
All persistent data is on the disk system(s) being backed up (logs, tablespaces, temporary storage).
All components are restored together.
You are restoring the image to the same server on the same path.
The last condition is important, because some aspects of the database configuration may be stored in operating system files.
You need to do the backup within the database whenever the server is running. The server is responsible for the internal consistency of the data, and the disk image may not be complete or recoverable. If the server is not running, then the state of the database should be consistent in the persistent storage.

What are the best practices for Mysql backup

We have one PHP application and a MySQL server running on one of our production servers.
The MySQL database is currently 4 GB, and we expect it to grow to tens or even hundreds of GB.
What I am curious to find out is: what are the best practices for backing up a MySQL database, given that the application must stay live under any circumstances? Is it better to have a MySQL replication server on which we run the backup scripts, or to run them on the live server? What is more likely to slow down? We have the possibility to add additional server(s) if needed. Where should I store the MySQL dumps? Is it advisable to copy the MySQL backup files to a remote server via FTP?
What is the best practice for organizing web application backups if the number of server instances is not a problem?
MySQL backup methods are documented in the MySQL documentation.
The ideal backup solution is MySQL Enterprise Backup. This is a licensed product sold on the Oracle store. It is very fast compared to mysqldump.
MySQL Enterprise Backup: A licensed product that performs hot backups
of MySQL databases. It offers the most efficiency and flexibility when
backing up InnoDB tables, but can also back up MyISAM and other kinds
of tables.
If you are looking for a free solution with MySQL community edition, you can set up a replication server and either run mysqldump on it or make a raw data backup. While the backup runs on the replication server, your main master database keeps running. Since your data is big or will get bigger, backing up the raw data files is recommended. This is basically a process of copying the data and log files from disk. Details are explained in the MySQL documentation.
For larger databases, where mysqldump would be impractical or
inefficient, you can back up the raw data files instead. Using the raw
data files option also means that you can back up the binary and relay
logs that will enable you to recreate the slave in the event of a
slave failure.
Finally, you should copy backup files to another physical disk on the same server to recover from disk failures, or to another physical server to recover easily from complete server failures.
Replication protects against hardware errors, for example, a crashed hard disk.
Backup protects against software errors, for example, data deleted from a table through human error.
It is definitely good practice to combine both of these technologies by running the backup utility on a replica. This not only reduces the load on the production database, but also covers more recovery scenarios.
In case of a hardware error, you can restore the most up-to-date data from the replica, and in case of data corruption, you can choose which date's backup to use for recovery. And if both the main server and the replica fail, the backup will still save you.
What is the best way to make backups?
mysqldump is a good solution for small databases. It is a utility for creating logical backups and is included with MySQL Server. Its output is a .sql file that recreates the database.
For large databases, it is better to use a physical backup. There are two ways to do it.
mysqlbackup is a utility included with MySQL Enterprise Edition. As a result, you get a binary file. Such a backup is created much faster than with mysqldump and puts less load on the server.
xtrabackup, from Percona, is a lot like the MySQL Enterprise backup utility, but it's free. A more detailed comparison can be found here.
How often should backups be made?
The more often you make backups, the better, but you can't keep an unlimited number of them, since you will run out of space in the backup storage. There are two ways to handle this:
Find a compromise between the frequency of backups and the duration of storage.
Use incremental backups. The physical backup utilities above support incremental backups, but managing such backups is more complicated (read more here).
Where should the backups be stored?
Anywhere you prefer, but not in the same place as the MySQL Server. Overall, I think cloud storage is a good choice. Almost every provider today has a command-line interface.
How to automate a backup?
The process of creating regular backups should be automated, and a person should intervene in it only in case of failure. A good backup process should include the following steps:
Creating a backup copy
Compression/Encryption
Uploading to storage
Sending success/failure notification
Removing old backups from the storage (so that it does not overflow)
The simplest script that implements this can be found, for example, here.
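As a rough illustration of how the steps listed above fit together, here is a minimal Python sketch. It is not the linked script: every step is a caller-supplied callable, and all names (`run_backup`, `create`, `compress`, `upload`, `prune`, `notify`) are hypothetical. Pruning runs before the notification so that a failure while removing old backups is reported too.

```python
def run_backup(create, compress, upload, prune, notify):
    """Run the backup steps in order and report the outcome.

    Each argument is a caller-supplied callable (all hypothetical here).
    Returns True on success, False if any step raised.
    """
    try:
        raw_path = create()                # 1. create the backup copy
        archive_path = compress(raw_path)  # 2. compress and/or encrypt it
        upload(archive_path)               # 3. upload to storage
        prune()                            # 5. remove old backups
    except Exception as exc:
        # 4. a person is only involved when something fails
        notify(False, f"backup failed: {exc}")
        return False
    notify(True, f"backup uploaded: {archive_path}")
    return True
```

Keeping the steps as separate callables makes it easy to swap mysqldump for xtrabackup, or one storage provider for another, without touching the control flow.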
Something else?
Yes, the most important thing is not creating a backup but being able to restore it. Therefore, it is best practice to regularly test your recovery scenarios.
Happy backups!
What is better, to have mysql replication server on which we will run backup scripts or to run on live server
It depends on your db size (and time needed to dump it using mysqldump) and your reliability requirements.
If your db is relatively small and mysqldump dumps it in seconds or a few minutes, then it's OK to just run scheduled backups. In most cases it is sufficient to have a daily backup that runs at a time when your app is mostly idle (at night, when your clients are sleeping). You can use a nice tool, automysqlbackup, for that: it takes care of scheduling and backup rotation; all you need to do is add it as a cron task and set up its config once.
Setting up a replica is only needed if:
Your backup takes a long time (dozens of minutes or hours) to complete, so you cannot just stop your service for that long.
You cannot afford losing any history in case of a main db crash. E.g. if you process financial transactions, you may want to ensure that nothing is lost if the master db server dies.
In these cases you may want a replica with backups. But you must understand that adding replication adds a new layer of problems: replicas may go out of sync, silently crash (and you will not notice, because the master and your app are running fine), etc.

How do I backup a MySQL database?

What do I have to consider when backing up a database with millions of entries? Are there any tools (maybe bundled with the MySQL server) that I could use?
Depending on your requirements, there are several options that I have been using myself:
if you don't need hot backups, take down the db server and back up on the file system level, e.g. using tar, rsync or similar.
if you do need the database server to keep running, you can start out with the mysqlhotcopy tool (a perl script), which locks the tables that are being backed up and allows you to select single tables and databases.
if you want the backup to be portable, you might want to use mysqldump, which creates SQL scripts to recreate the data, but which is slower than mysqlhotcopy
if you have a copy of the db at a certain point in time, you could also just keep the binlogs (starting at that point in time) somewhere safe. This can be very easy to do and doesn't interfere with the server's operation, but might not be the fastest to restore, and you have to make sure you don't miss part of the logs.
Methods I haven't tried, but that make sense to me:
if you have a filesystem like ZFS or are running on LVM, it might be a good idea to snapshot the database by taking a filesystem snapshot, because snapshots are very, very quick. Just remember to ensure a consistent state of your db during the whole operation, e.g. by doing FLUSH TABLES WITH READ LOCK (and of course, don't forget UNLOCK TABLES afterwards).
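The lock, snapshot, unlock sequence could be wrapped as follows. This is a sketch under assumptions: `cursor` is any Python DB-API cursor connected to the MySQL server, and `take_snapshot` is a hypothetical callable that triggers the ZFS or LVM snapshot. The lock only lasts while the session that issued it stays open, so the same connection must be kept open until the snapshot call returns.

```python
def snapshot_with_read_lock(cursor, take_snapshot):
    """Hold a global read lock while take_snapshot() runs.

    cursor: a DB-API cursor on a connection that stays open throughout.
    take_snapshot: hypothetical callable that creates the filesystem snapshot.
    """
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        return take_snapshot()
    finally:
        # Always release the lock, even if the snapshot step fails.
        cursor.execute("UNLOCK TABLES")
```

The try/finally guards against the failure mode the answer warns about: forgetting UNLOCK TABLES and leaving writers blocked.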
Additionally:
you can use a master-slave setup to replicate your production server to either a different machine or a second instance on the same machine and do any of the above to the replicated copy, leaving your production machine alone. Instead of running continuously, you can also fire up the slave at regular intervals, let it read the binlog, and switch it off again.
I think, MySQL cluster and the enterprise licensed version have more tools, but I have never tried them.
Mysqlhotcopy is badly described - it only works if you use MyISAM, and it's not hot.
The problem with mysqldump is the time it takes to restore the backup (but it can be made hot if you have all InnoDB tables, see --single-transaction).
I recommend using a hot backup tool, like what is available in XtraBackup:
http://www.percona.com/docs/wiki/percona-xtrabackup:start
Watch out if using mysqldump on large tables using the MyISAM storage engine; it blocks selects while the dump is running on each table and this can take down busy sites for 5-10 minutes in some cases.
Using InnoDB, by comparison, you get non-blocking backups because of its row-level locking, so this is not such an issue.
If you need to use MyISAM, a common strategy is to replicate to a second MySQL instance and do the mysqldump against the replicated copy instead.
Use the export tab in phpMyAdmin. phpMyAdmin is a free, easy-to-use web interface for MySQL administration.
I think mysqldump is the proper way of doing it.