We have a very large database that we need to occasionally replicate on our dev+staging machines.
At the moment we use mysqldump and then import the database script using "mysql -u xx -p dbname < dumpscript.sql"
This would be fine if it didn't take 2 days to complete!
Is it possible to simply copy the entire database as a file from one server to another and skip the whole export/import nonsense?
Cheers
There are a couple of solutions:
have a separate replication slave you can stop at any time and take a file-level backup
if you use the InnoDB engine - you can take a file-system-level snapshot [e.g. with LVM] and then copy the files over to your test environment
if you have plenty of tables/databases - you can parallelize the dumping and restoring process to speed things up (see the sketch below)
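For the third option, here is a minimal sketch of a parallelized dump & restore, assuming a Linux shell, credentials stored in ~/.my.cnf, that db1..db4 stand in for your database names, and that the empty databases already exist on the target:

# Dump several databases at once, up to 4 jobs in parallel, compressing on the fly
for db in db1 db2 db3 db4; do echo "$db"; done | \
  xargs -P4 -I{} sh -c 'mysqldump --single-transaction {} | gzip > {}.sql.gz'

# On the dev/staging server, restore them in parallel
ls *.sql.gz | xargs -P4 -I{} sh -c 'gunzip -c {} | mysql "$(basename {} .sql.gz)"'

Adjust -P to however many parallel jobs your disks and CPU can tolerate.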
I have many restrictions on where I can run scripts, on how I can access sources and targets, and on having enough space to prepare the data for the task.
I get my zipped database dump from the hosting provider.
I split the unzipped dump so that the INSERT INTO lines go into one file and all the other statements go into a second one.
Then I create the database structures from the second one.
I convert the INSERT INTO statements into per-table CSV files.
Finally, I load the CSV files in parallel (up to 50 tables concurrently); this way a 130GB text dump is cloned in 3 hours, instead of the 17 hours it takes with the statement-by-statement method.
In the 3 hours, I include:
the copy over (10 minutes),
sanity check (10 minutes) and
filtering of logs (10 minutes), as the log entries need to be from the latest academic year only.
The remote zipped file is between 7GB and 13GB, pulled over a 40MBps line.
The upload goes to a remote server, also over a 40MBps line.
If your MySQL server is local, the upload can be faster.
I use scp, gzip, zgrep, sed, awk, ps, mysqlimport and mysql, plus some other utilities (pv, rg, pigz) to speed up decompression and filtering when they are available.
If I had direct access to the database server, LVM with folder-level snapshot capability would be the preferred solution, giving you speeds limited only by the copy speed of the media.
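A rough sketch of the split-and-load approach described above, assuming a dump.sql produced by mysqldump; the file names, credentials and CSV settings are examples, and the INSERT-to-CSV conversion itself depends on your dump format:

# 1. Split the dump: INSERT statements into one file, everything else (DDL etc.) into another
grep '^INSERT INTO' dump.sql > inserts.sql
grep -v '^INSERT INTO' dump.sql > schema.sql

# 2. Create the database structures only
mysql -u xx -p dbname < schema.sql

# 3. After converting inserts.sql into one CSV file per table, load them in parallel;
#    mysqlimport picks the target table from each file name, --use-threads loads several at once
mysqlimport --local --use-threads=8 --fields-terminated-by=',' --fields-enclosed-by='"' \
            -u xx -p dbname /path/to/csv/*.csv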
We want to run BI analytics on a (copy of a) mysql database.
Currently we get a full MySQL dump file daily, but it takes up to 16 hours to import this file into our "BI" server. The dump file itself is about 9GB, resulting in 360 tables with a total of 20GB on disk; the biggest table file (.ibd) is around 6GB.
I am no MySQL expert, but I think the import takes so long because the dump file only contains the data, so the database needs to rebuild every index from scratch. (As a side question: any thoughts on how to improve the import?) We have been looking at a separate SSD for the data files and have added more RAM/CPU to the server, but that does not really improve the import speed...
Is there a way to have some sort of "snapshot" of the source database, so that it can be copied over as-is instead of importing?
I'm thinking of something like a .zip file with all the .ibd files (and the other necessary files).
Yes, importing a large mysqldump is notoriously slow. The dump file contains the definitions of indexes, but not the index storage itself. So the indexes must be rebuilt every time you import the dump file.
At my company we use Percona XtraBackup.
It's a physical backup tool, meaning the result is not an SQL dump file that must be imported. It just makes a copy of the .ibd files and the iblog files to reconcile transactions. The .ibd files contain both rows of data and indexes.
We use this backup & restore solution to clone databases up to 100x as large as yours.
Percona XtraBackup is free and open-source.
There are a few caveats:
Doesn't work on Windows last I checked (I haven't checked in several years, because I don't use Windows).
You can backup without interrupting service on the source instance, but to "restore" you need to shut down your local MySQL Server, copy the backup into the datadir, and restart the MySQL Server.
It's nearly impossible to import just one database to an instance. In other words, backup and restore is for the full instance, with all tables and schemas. This will overwrite anything else you have on your local instance. Whereas mysqldump is more flexible because you can dump & import just one table or just one schema, and you can import to a running MySQL Server instance without stopping it.
It's worth mentioning that if you don't use a proper backup tool, you should not try to make a zip archive of the MySQL data directory of a running MySQL Server. You're almost certain to get a non-consistent copy of the data files, meaning they will be corrupt and not restorable.
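As a rough illustration of the XtraBackup workflow (exact flags vary by version; recent releases use a single xtrabackup binary, older ones use innobackupex, and the user, password and paths below are just examples):

# 1. Take a physical backup while the source server keeps running
xtrabackup --backup --user=backup_user --password=backup_pass --target-dir=/backups/full

# 2. Prepare the backup (applies the copied redo log so the files are consistent)
xtrabackup --prepare --target-dir=/backups/full

# 3. On the target: stop MySQL, restore into an empty datadir, fix ownership, restart
systemctl stop mysql
xtrabackup --copy-back --target-dir=/backups/full   # datadir is read from my.cnf
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql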
I created a read-replica of my MySQL database on amazon RDS.
When executing the following command, it is super fast (half a second) on the master, but takes more like 30 seconds on the slave. Super annoying because I wanted to dump off of the slave so that I don't slow down the master.
mysqldump --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql
There are three issues to consider.
The most significant is that mysqldump does not perform well when run at a distance from the database, due to limitations in the traditional MySQL client/server wire protocol, which makes no allowance for pipelining a series of commands.
The mysqldump utility uses no magic to generate dump files -- it issues SQL statements to the server, and takes the results of those queries to generate its output.
As a result, every single object (schema, table, view, stored function/procedure, event) in the database requires at least one round trip and sometimes more than one.
For each table, mysqldump first issues SHOW CREATE TABLE t1; followed by SELECT * FROM t1; ... so a round-trip time of 100 ms means that extracting a dump of 150 tables wastes at least 150 × 2 × 0.100 = 30 seconds purely on the distance between the machine running mysqldump and the server -- and this is true even if the tables are completely empty.
This is not a recommendation, but you might take a look at mydumper, which claims to be able to create the backup over multiple database connections in parallel; parallelizing the dump could help mitigate the cycles wasted as commands pass to the server and results return to the client. I don't know the quality of this code base, but something like this could help.
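As a hedged sketch of what that looks like (flag names can differ between mydumper versions; the host and credentials are taken from the question):

# Dump one schema over 8 parallel connections, compressing the output files
mydumper --host=myDomain.com --user=dev --password=myPassword \
         --database=mySchema --threads=8 --compress --outputdir=/backups/mySchema

# Reload it locally with the companion tool, also in parallel
myloader --host=localhost --user=dev --password=myPassword \
         --directory=/backups/mySchema --threads=8 --overwrite-tables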
Next, you almost always want to use the --compress option for mysqldump. Contrary to what you might assume, this does not compress the backup file. The generated backup file is identical when this option is used, but when this feature is activated, the server compresses the data it sends to mysqldump on the wire, and mysqldump decompresses the data again before writing it out -- so this option will almost always make for a faster process unless the machine running mysqldump and the database server are connected by a low-latency, high-bandwidth network. Because the generated file is identical, there are no compatibility concerns when using this option.
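For example, the command from the question with compression enabled (note that newer MySQL clients deprecate --compress in favour of --compression-algorithms):

mysqldump --compress --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql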
Finally, there's an issue with newly-created RDS servers that you need to be aware of, so that it doesn't skew your benchmarks. When you create an RDS replica, it is originally seeded with data from a snapshot of the upstream master. This is, behind the scenes, an EBS snapshot of the master's hard drive, and the new database instance is backed by an EBS volume restored from that snapshot. EBS volumes are lazily-loaded from the snapshot, so they have a documented first-touch penalty. This issue could have a substantial impact on the performance of the first complete backup, but should have no meaningful impact after that.
I have a Windows Server with MySQL Database Server installed.
Multiple databases exist; among them, database A contains a huge table named 'tlog', about 220GB in size.
I would like to move over database A to another server for backup purposes.
I know I can do SQL Dump or use MySQL Workbench/SQLyog to do table copy.
But due to limited disk storage on the server (less than 50GB), an SQL dump is not possible.
The server is serving other workloads, so CPU & RAM are limited too. As a result, copying the table without using up CPU & RAM is not possible.
Is there any other method that can move the huge database A over to another server, please?
Thanks in advance.
You have a few ways:
Method 1
Dump and compress at the same time: mysqldump ... | gzip > blah.sql.gz
This method is good because, since the dump is plain text and you're compressing it on the fly, chances are the result will be less than 50GB.
Method 2
You can use slave replication; this method will require a dump of the data.
Method 3
You can also use xtrabackup.
Method 4
You can shutdown the database, and rsync the data directory.
Note: You don't actually have to shut down the database; you can instead do multiple rsyncs, and eventually almost nothing will have changed between passes (unlikely if the database is busy; you'll have to do it during a slow time), which means the database will have synced over.
I've had to do this method with fairly large PostgreSQL databases (1TB+). It takes a few rsyncs: but, hey; it's the cost of 0 down time.
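A rough sketch of the repeated-rsync approach (paths and hostnames are examples; for a guaranteed-consistent copy the final pass should run with MySQL stopped):

# Early passes: run while MySQL is still up; each pass only transfers the delta
rsync -avz /var/lib/mysql/ backupuser@newserver:/var/lib/mysql/
rsync -avz /var/lib/mysql/ backupuser@newserver:/var/lib/mysql/

# Final pass: stop MySQL so the last, small delta is consistent, then restart
systemctl stop mysql
rsync -avz --delete /var/lib/mysql/ backupuser@newserver:/var/lib/mysql/
systemctl start mysql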
Method 5
If you're in a virtual environment you could:
Clone the disk image.
If you're in AWS you could create an AMI.
You could add another disk and just sync locally; then detach the disk, and re-attach to the new VM.
If you're worried about consuming resources during the dump or transfer you can use ionice and renice to limit the priority of the dump/transfer.
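For example (assuming a Linux box where credentials come from a config file; ionice's idle class only has full effect with I/O schedulers that support it, and dbA, backupuser, newserver and /backups are stand-in names):

# Run the dump at the lowest CPU and I/O priority and stream it off-box so it doesn't use local disk
nice -n 19 ionice -c3 mysqldump --single-transaction dbA | gzip | ssh backupuser@newserver 'cat > /backups/dbA.sql.gz'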
What is the best method to do a MySQL backup with compression? Also, how do you dump that to a specific directory such as C:\targetdir?
The mysqldump command will output CREATE TABLE and INSERT statements that are sufficient to recreate your whole database. You can back up individual tables or databases with this command.
You can easily compress this. If you want it to be compressed as it goes, you will need some sort of streaming tool for the command line. On UNIX it would be mysqldump ... | gzip. On Windows, you will have to find a tool that works with pipes.
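For example, on Windows, 7-Zip can read from a pipe; this is a hedged sketch where the database name, 7-Zip path and target directory are all examples (-si reads stdin, -tgzip picks the gzip format):

mysqldump -u root -p mydb | "C:\Program Files\7-Zip\7z.exe" a -si -tgzip C:\targetdir\mydb.sql.gz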
This I think is what you are looking for. I will list other options just because.
FLUSH TABLES WITH READ LOCK will flush all data to disk and lock the tables against changes, which you can do while you are making a copy of the data folder.
Keep in mind, when doing restores, that if you want to preserve the full capability of MySQL binlogs, you will not want to restore parts of a database by touching the files directly. The best option is to have an alternate data dir with the restored files, dump from there, and then feed it to your production database through regular mysql connection channels. Any direct changes to the filesystem will not be recorded by the binlogs.
If you restore the whole database from files, you will be OK; just not if you do it in pieces.
mysqldump does not have this problem.
Replication will allow you to back up to another instance of MySQL running on the same or different machine.
binlogs. Given a static copy of a database, you can use these to move it forward in time. binlogs are a log of all the commands that ever changed the data. If you have binlogs back to day one, then you may already have what you are looking for. You can run all the commands from the binlogs from day one to any date you wish and then you have a copy of the database from that date.
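A hedged sketch of rolling a restored copy forward with the binlogs (the binlog file names, paths and cutoff date are examples):

# Replay everything recorded up to (but not including) the given timestamp
mysqlbinlog --stop-datetime="2014-06-30 23:59:59" \
    /var/log/mysql/mysql-bin.000001 /var/log/mysql/mysql-bin.000002 | mysql -u root -p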
I recommend checking out Percona XtraBackup. It's a GPL licensed alternative to MySQL's paid Enterprise Backup tool and can create consistent non-blocking backups from databases even when they are written to. See this article for more information on why you'd want to use this over mysqldump.
You could use a script like AutoMySQLBackup, which automatically does a backup every day, keeping daily, weekly and monthly backups, keeping your backup directory pretty clean and uncluttered, while still providing you a long history of backups.
The backups are also compressed, naturally.
The databases are prohibitively large (> 400MB), so dump > SCP > source is proving to be hours and hours of work.
Is there an easier way? Can I connect to the DB directly and import from the new server?
You can simply copy the whole /data folder.
Have a look at High Performance MySQL - transferring large files
You can use ssh to directly pipe your data over the Internet. First set up SSH keys for password-less login. Next, try something like this:
$ mysqldump -u db_user -p some_database | gzip | ssh someuser@newserver 'gzip -d | mysql -u db_user --password=db_pass some_database'
Notes:
The basic idea is that you are just dumping standard output straight into a command on the other side, which SSH is perfect for.
If you don't need encryption then you can use netcat but it's probably not worth it
The SQL text data goes over the wire compressed!
Obviously, change db_user to your user and some_database to your database. someuser is the (Linux) system user, not the MySQL user.
You will also have to use --password the long way because having mysql prompt you will be a lot of headache.
You could set up MySQL slave replication and let MySQL copy the data, and then make the slave the new master.
400MB is really not a large database; transferring it to another machine will only take a few minutes over a 100Mbit network. If you do not have a 100Mbit network between your machines, you are in big trouble!
If they are running the exact same version of MySQL and have identical (or similar ENOUGH) my.cnf and you just want a copy of the entire data, it is safe to copy the server's entire data directory across (while both instances are stopped, obviously). You'll need to delete the data directory of the target machine first of course, but you probably don't care about that.
Backup/restore is usually slowed down by the restoration having to rebuild the table structure, rather than the file copy. By copying the data files directly, you avoid this (subject to the limitations stated above).
If you are migrating a server:
The dump files can be very large, so it is better to compress them before sending, or to use the -C flag of scp. Our methodology for transferring files is to create a full dump in which the incremental logs are flushed (use --master-data=2 --flush-logs, and please check you don't mess up any slave hosts if you have them). Then we copy the dump over and play it. Afterwards we flush the logs again (mysqladmin flush-logs), take the recent incremental log (which shouldn't be very large) and play only that. Keep doing this until the last incremental log is very small, so that you can stop the database on the original machine, copy the last incremental log and then play it - it should take only a few minutes.
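A rough sketch of that cycle, where the hostnames, users, database name and binlog names are examples:

# 1. Full dump that records binlog coordinates and rotates the logs at the same time
mysqldump --master-data=2 --flush-logs --single-transaction -u xxx -p yyy | gzip > full.sql.gz
scp full.sql.gz newhost:/tmp/
ssh newhost 'gunzip -c /tmp/full.sql.gz | mysql -u aaa -p yyy'

# 2. Repeat until the incremental log is tiny: rotate, copy, replay
mysqladmin -u xxx -p flush-logs
scp -C /var/log/mysql/mysql-bin.000123 newhost:/tmp/
ssh newhost 'mysqlbinlog /tmp/mysql-bin.000123 | mysql -u aaa -p'

# 3. Stop the old database, copy & replay the last (very small) log, then switch over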
If you just want to copy data from one server to another:
mysqldump -C --host=oldhost --user=xxx --database=yyy -p | mysql -C --host=newhost --user=aaa -p
You will need to set the db users correctly and provide access to external hosts.
Try importing the dump on the new server using the mysql console, not auxiliary software.
I have no experience doing this with MySQL, but to me it seems the bottleneck is transferring the actual data?
400MB isn't that much. But if dump -> SCP is slow, I don't think connecting to the db server from the remote box would be any faster?
I'd suggest dumping, compressing, then copying over the network or burning to disk and manually transferring the data.
Compressing such a dump will most likely give you quite a good compression ratio since, most likely, there's a lot of repetitive data.
If you are only copying all the databases of the server, copy the entire /data directory.
If you are just copying one or more databases and adding them to an existing mysql server:
create the empty database in the new server, set up the permissions for users etc.
copy the folder for the database in /data/databasename to the new server /data/databasename
I like to use BigDump: Staggered MySQL Dump Importer after exporting my database from the old server.
http://www.ozerov.de/bigdump/
One thing to note though: if you don't set the export options (namely the maximum length of created queries) appropriately for the load your new server can handle, it'll just fail and you'll have to try again with different parameters. Personally, I set mine to about 25,000, but that's just me. Test it out a bit and you'll get the hang of it.