How to limit bandwidth used by mysqldump - mysql

I have to dump a large database over a network pipe that doesn't have much bandwidth and that other people need to use concurrently. If I try it, it soaks up all the bandwidth, latency soars, and everyone else gets messed up.
I'm aware of the --compress flag to mysqldump, which helps somewhat.
How can I do this without soaking up all the bandwidth over this connection?
Update:
The suggestion to copy a dumpfile using scp with the -l flag is a good one, but I should note that I don't have SSH access to the database server.

trickle?
trickle is a portable lightweight userspace bandwidth shaper
You don't mention how you are actually transferring the DB dump, but if the transfer happens over TCP/IP, trickle should work. For example, if you use nc on the receiving end (for example: nc -l -p 1234 > backup.sql), the following command will transfer the backup at no more than 20 KB/s:
mysqldump [database name] | trickle -u 20 nc backup.example.com 1234

You will have to have access to a Linux machine (sorry, I'm a Linuxy sort of person).
An ingress policy can decrease the amount of incoming traffic, but the server on the other side needs to have a fairly well-behaved TCP/IP stack.
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 50 \
u32 match ip src server.ip.address/32 police rate 256kbit \
burst 10k drop flowid :1
tc qdisc add dev eth0 root tbf \
rate 256kbit latency 25ms burst 10k
You can find more information on ingress filters in the advanced routing howto.
http://www.linux.org/docs/ldp/howto/Adv-Routing-HOWTO/index.html
If you are doing it on Linux, you can dump the file locally, compress it, and use scp with the -l switch to copy the file while limiting the bandwidth used:
-l limit
Limits the used bandwidth, specified in Kbit/s.
e.g.
scp -l 16 dumpfile remotehost:filepathandname

One trick I've used is to dump in CSV format rather than as INSERT statements. It doesn't change how much bandwidth you use per unit of time, but it can reduce the total number of bytes you're pulling out.
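A minimal sketch of that, assuming mysqldump's --tab output is acceptable (note that --tab makes the server write the data files on the database host itself, subject to the secure_file_priv setting and the FILE privilege; the directory here is only an example):
mysqldump --tab=/var/lib/mysql-files \
          --fields-terminated-by=',' --fields-enclosed-by='"' \
          [database name]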

If you send it over TCP, the bandwidth will be shared roughly equally between all connections. If you want to lower the speed even more, you need to shape the traffic on your device so that only a certain amount of data goes out.

On the client, you can run a proxy that will limit the speed of the download.
You can also control the number of connections, etc.
If you are on windows, this should work nicely:
http://www.youngzsoft.net/ccproxy/index.html

Are you using a transactional table engine like InnoDB? Is this your master database? Be very careful! mysqldump will hold table locks and disrupt the production use of your database. Slowing down the backup will only cause this period of disruption to get longer. Always mysqldump to a local disc, and then copy the dump from there.
One other approach might be to set up a replication slave at your remote site, and take your backups from that. Then database updates will trickle over your contended link instead of coming down in one big lump.
Another alternative: do your backups when no one else is using the network :)
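For example, a sketch of scheduling the dump off-hours with cron (the 03:00 time, database name, and backup path are placeholders; --single-transaction assumes InnoDB tables, and % must be escaped in crontab entries):
0 3 * * * mysqldump --single-transaction mydb | gzip > /backups/mydb-$(date +\%F).sql.gz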

Related

Mysqldump on RDS Read Replica Slave is 50x slower

I created a read-replica of my MySQL database on amazon RDS.
When executing the following command, it is super fast (half a second) on the master, but takes more like 30 seconds on the slave. Super annoying because I wanted to dump off of the slave so that I don't slow down the master.
mysqldump --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql
There are three issues to consider.
The most significant is that mysqldump does not perform well when run at a distance from the database, due to limitations in the traditional MySQL client/server wire protocol, which makes no allowance for pipelining a series of commands.
The mysqldump utility uses no magic to generate dump files -- it issues SQL statements to the server, and takes the results of those queries to generate its output.
As a result, every single object (schema, table, view, stored function/procedure, event) in the database requires at least one round trip and sometimes more than one.
For each table, mysqldump first issues SHOW CREATE TABLE t1; followed by SELECT * FROM t1; ... so with a round-trip time of 100 ms, extracting a dump of 150 tables wastes at least 150 × 2 × 0.100 = 30 seconds purely on the distance between the machine running mysqldump and the server -- and this is true even if the tables are completely empty.
This is not a recommendation, but you might take a look at mydumper, which claims to be able to create the backup using multiple database connections in parallel; by parallelizing the dump process, this could help mitigate the cycles wasted as commands pass to the server and results return to the client. I don't know the quality of this code base, but something like this could help.
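As a rough sketch only -- the exact flag names below are from memory and worth checking against mydumper --help -- a parallel dump of the schema in the question might look something like:
mydumper --host=myDomain.com --user=dev --password=myPassword \
         --database=mySchema --threads=4 --outputdir=/tmp/mySchema-dump
Each table ends up in its own file, which also makes a later parallel restore with its companion tool, myloader, easier.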
Next, you almost always want to use the --compress option for mysqldump. Contrary to what you might assume, this does not compress the backup file. The generated backup file is identical when this option is used, but when this feature is activated, the server compresses the data it sends to mysqldump on the wire, and mysqldump decompresses the data again before writing it out -- so this option will almost always make for a faster process unless the machine running mysqldump and the database server are connected by a low-latency, high-bandwidth network. Because the generated file is identical, there are no compatibility concerns when using this option.
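Applied to the command in the question, that is simply:
mysqldump --compress --set-gtid-purged=OFF -h myDomain.com -u dev -pmyPassword mySchema > out.sql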
Finally, there's an issue with newly-created RDS servers that you need to be aware of, so that it doesn't skew your benchmarks. When you create an RDS replica, it is originally seeded with data from a snapshot of the upstream master. This is, behind the scenes, an EBS snapshot of the master's hard drive, and the new database instance is backed by an EBS volume restored from that snapshot. EBS volumes are lazily-loaded from the snapshot, so they have a documented first-touch penalty. This issue could have a substantial impact on the performance of the first complete backup, but should have no meaningful impact after that.

How to open and work with a very large .SQL file that was generated in a dump?

I have a very large .SQL file of 90 GB.
It was generated with a dump on a server:
mysqldump -u root -p siafi > /home/user_1/siafi.sql
I downloaded this .SQL file on a computer with Ubuntu 16.04 and MySQL Community Server (8.0.16). It has 8GB of RAM
So I did these steps in Terminal:
# Access
/usr/bin/mysql -u root -p
# I create a database with the same name to receive the .SQL information
CREATE DATABASE siafi;
# I set the privileges for the user reinaldo
GRANT ALL PRIVILEGES ON siafi.* TO 'reinaldo'@'localhost';
# Enable the changes
FLUSH PRIVILEGES;
# Then I open another terminal and type command for the created database to receive the data from the .SQL file
mysql --user=reinaldo --password="type_here" --database=siafi < /home/reinaldo/Documentos/Code/test/siafi.sql
I typed these same commands with other .SQL files, only smaller ones, with a maximum of 2 GB, and it worked normally.
But this 90 GB file has been processing for over twelve hours without finishing, and I do not know if it's working.
Please, is there any more efficient way to do this? Maybe splitting the .SQL file?
Break the file up into smaller chunks and process them separately.
You're probably hitting the logging high-water mark and mysql is trying to roll everything back, and that is a slow process.
Split the file into approx 1Gb chunks, breaking on whole lines. Perhaps using:
split -l 1000000 bigfile.sql part.
Then run them in order using your current command.
You'll have to experiment with split to get the size right; split implementations and options vary between systems, so split --number=100 may also work for you.
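Assuming split produced files named part.aa, part.ab, and so on, a sketch of loading them in order would be:
for f in part.*; do
    mysql --user=reinaldo --password="type_here" --database=siafi < "$f" || break
done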
Two things that might be helpful:
Use pv to see how much of the .sql file has already been read. This gives you a progress bar that at least tells you the import is not stuck (see the example after this list).
Log into MySQL and use SHOW PROCESSLIST to see what MySQL is currently executing. If it's still running, just let it run to completion.
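A sketch of the pv approach, reusing the command from the question (it requires the pv package to be installed):
pv /home/reinaldo/Documentos/Code/test/siafi.sql | mysql --user=reinaldo --password="type_here" --database=siafi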
If it is turned on, it might really help to turn off the binlog for the duration of the restore. Another thing that may or may not be helpful: if you have the choice, use the fastest disks available. You may have that kind of option if you're running on a hoster like Amazon. You're going to really feel the pain if you're (for example) doing this on a standard EC2 host.
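One way to skip binary logging for just this session -- a sketch, assuming the importing user has the privileges needed to set sql_log_bin:
( echo "SET SESSION sql_log_bin = 0;"; cat /home/reinaldo/Documentos/Code/test/siafi.sql ) \
    | mysql --user=reinaldo --password="type_here" --database=siafi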
You can use a third-party tool like
https://philiplb.de/sqldumpsplitter3/
Very easy to use; you can define the chunk size, location, etc.
Or use this one, which does the same but has a slightly more colorful interface and is also easy to use:
https://sqldumpsplitter.net/

mysqldump increase apache process threads?

We are running into a problem with the number of Apache processes drastically increasing at a specific time. On further investigation, we found that "mysqldump" was running on the MySQL server during that time.
We noticed that while mysqldump was running, the count of Apache instances (processes) shot up to the maximum. Since we limited MaxClients to 150, there is no increase beyond that.
My question is: Is it possible that mysqldump would increase number of processes in apache?
MySQLdump itself does not cause Apache to do anything. MySQLdump and Apache Server are only related in that, since they are running on the same machine, they share the resources of that machine.
I suspect that when you run MySQLdump your server becomes resource-constrained (it could be CPU, network, or disk). When Apache is resource-constrained, an individual Apache process is not able to complete as quickly and move on to the next request, so new processes are spawned instead.
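If that is the case, one common mitigation -- a sketch, assuming a Linux host where ionice's idle class has an effect (this depends on the I/O scheduler) and placeholder database and path names -- is to run the dump at low CPU and I/O priority:
nice -n 19 ionice -c3 mysqldump --single-transaction mydb | gzip > /backups/mydb.sql.gz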

Migrating a MySQL server from one box to another

The databases are prohibitively large (> 400 MB), so dump > SCP > source is proving to be hours and hours of work.
Is there an easier way? Can I connect to the DB directly and import from the new server?
You can simply copy the whole /data folder.
Have a look at High Performance MySQL - transferring large files
You can use ssh to pipe your data directly over the Internet. First set up SSH keys for password-less login. Next, try something like this:
$ mysqldump -u db_user -p some_database | gzip | ssh someuser@newserver 'gzip -d | mysql -u db_user --password=db_pass some_database'
Notes:
The basic idea is that you are just dumping standard output straight into a command on the other side, which SSH is perfect for.
If you don't need encryption then you can use netcat but it's probably not worth it
The SQL text data goes over the wire compressed!
Obviously, change db_user to your user and some_database to your database. someuser is the (Linux) system user, not the MySQL user.
You will also have to pass --password the long way, because having mysql prompt you in the middle of a pipeline is a headache.
You could set up MySQL slave replication and let MySQL copy the data, and then make the slave the new master.
400 MB is really not a large database; transferring it to another machine will only take a few minutes over a 100 Mbit network. If you do not have a 100 Mbit network between your machines, you are in big trouble!
If they are running the exact same version of MySQL and have identical (or similar ENOUGH) my.cnf and you just want a copy of the entire data, it is safe to copy the server's entire data directory across (while both instances are stopped, obviously). You'll need to delete the data directory of the target machine first of course, but you probably don't care about that.
Backup/restore is usually slowed down by the restoration having to rebuild the table structure, rather than the file copy. By copying the data files directly, you avoid this (subject to the limitations stated above).
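A sketch of that raw copy, assuming the default data directory /var/lib/mysql and mysqld stopped on both machines (adjust paths and ownership to your setup):
rsync -a /var/lib/mysql/ newhost:/var/lib/mysql/
ssh newhost 'chown -R mysql:mysql /var/lib/mysql'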
If you are migrating a server:
The dump files can be very large, so it is better to compress them before sending, or to use the -C flag of scp. Our methodology for transferring is to create a full dump in which the incremental logs are flushed (use --master-data=2 --flush-logs, and please check that you don't mess up any slave hosts if you have them). Then we copy the dump and replay it. Afterwards we flush the logs again (mysqladmin flush-logs), take the recent incremental log (which shouldn't be very large) and replay only it. Keep doing this until the last incremental log is very small, so that you can stop the database on the original machine, copy the last incremental log and then replay it -- it should take only a few minutes.
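A rough sketch of those steps -- the binlog file name is a placeholder, and the exact names depend on your log-bin setting:
# full dump with the binlog position recorded and the logs rotated
mysqldump --master-data=2 --flush-logs --all-databases | gzip > full.sql.gz
# copy full.sql.gz to the new host and load it: gunzip -c full.sql.gz | mysql
# later, rotate the logs again and replay only the new increment on the new host
mysqladmin flush-logs
mysqlbinlog mysql-bin.000123 | mysql --host=newhost --user=aaa -p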
If you just want to copy data from one server to another:
mysqldump -C --host=oldhost --user=xxx --database=yyy -p | mysql -C --host=newhost --user=aaa -p
You will need to set the db users correctly and provide access to external hosts.
Try importing the dump on the new server using the mysql console, not auxiliary software.
I have no experience with doing this with MySQL, but to me it seems the bottleneck is transferring the actual data?
400 MB isn't that much. But if dump -> SCP is slow, I don't think connecting to the DB server from the remote box would be any faster.
I'd suggest dumping, compressing, then copying over the network or burning to disc and manually transferring the data.
Compressing such a dump will most likely give you quite a good compression ratio, since there is most likely a lot of repetitive data.
If you are only copying all the databases of the server, copy the entire /data directory.
If you are just copying one or more databases and adding them to an existing mysql server:
create the empty database in the new server, set up the permissions for users etc.
copy the folder for the database in /data/databasename to the new server /data/databasename
I like to use BigDump: Staggered Mysql Dump Importer after Exporting my database from the old server.
http://www.ozerov.de/bigdump/
One thing to note, though: if you don't set the export options (namely the maximum length of created queries) appropriately for the load your new server can handle, it'll just fail and you will have to try again with different parameters. Personally, I set mine to about 25,000, but that's just me. Test it out a bit and you'll get the hang of it.

What's the quickest way to dump & load a MySQL InnoDB database using mysqldump?

I would like to create a copy of a database with approximately 40 InnoDB tables and around 1.5GB of data with mysqldump and MySQL 5.1.
What are the best parameters (ie: --single-transaction) that will result in the quickest dump and load of the data?
As well, when loading the data into the second DB, is it quicker to:
1) pipe the results directly to the second MySQL server instance and use the --compress option
or
2) load it from a text file (ie: mysql < my_sql_dump.sql)
QUICKLY dumping a quiesced database:
Using the "-T " option with mysqldump results in lots of .sql and .txt files in the specified directory. This is ~50% faster for dumping large tables than a single .sql file with INSERT statements (takes 1/3 less wall-clock time).
Additionally, there is a huge benefit when restoring if you can load multiple tables in parallel, and saturate multiple cores. On an 8-core box, this could be as much as an 8X difference in wall-clock time to restore the dump, on top of the efficiency improvements provided by "-T". Because "-T" causes each table to be stored in a separate file, loading them in parallel is easier than splitting apart a massive .sql file.
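A sketch of that workflow -- the directory is a placeholder that must be writable by the server and allowed by secure_file_priv, and mysqlimport's --use-threads flag is worth verifying on your version:
# dump: one .sql (schema) and one .txt (data) file per table, written by the server
mysqldump -T /var/lib/mysql-files mydb
# restore: load the schemas first, then import the data files across 8 threads
cat /var/lib/mysql-files/*.sql | mysql mydb
mysqlimport --use-threads=8 mydb /var/lib/mysql-files/*.txt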
Taking the strategies above to their logical extreme, one could create a script to dump a database in parallel. Well, that's exactly what the Maatkit mk-parallel-dump (see http://www.maatkit.org/doc/mk-parallel-dump.html) and mk-parallel-restore tools are: Perl scripts that make multiple calls to the underlying mysqldump program. However, when I tried to use these, I had trouble getting the restore to complete without duplicate key errors that didn't occur with vanilla dumps, so keep in mind that your mileage may vary.
Dumping data from a LIVE database (w/o service interruption):
The --single-transaction switch is very useful for taking a dump of a live database without having to quiesce it, or for taking a dump of a slave database without having to stop slaving.
Sadly, -T is not compatible with --single-transaction, so you only get one.
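For a live InnoDB database, a consistent dump piped straight into a second server therefore looks something like this (hostnames, user, and password are placeholders; passing --password on the target side avoids a second prompt inside the pipeline):
mysqldump --single-transaction -h source_host -u user -p mydb \
    | mysql -h target_host -u user --password=target_pass mydb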
Usually, taking the dump is much faster than restoring it. There is still room for a tool that takes the incoming monolithic dump file and breaks it into multiple pieces to be loaded in parallel. To my knowledge, such a tool does not yet exist.
Transferring the dump over the Network is usually a win
To listen for an incoming dump on one host run:
nc -l 7878 > mysql-dump.sql
Then on your DB host, run
mysqldump $OPTS | nc myhost.mydomain.com 7878
This reduces contention for the disk spindles on the master from writing the dump to disk slightly speeding up your dump (assuming the network is fast enough to keep up, a fairly safe assumption for two hosts in the same datacenter). Plus, if you are building out a new slave, this saves the step of having to transfer the dump file after it is finished.
Caveats - obviously, you need to have enough network bandwidth not to slow things down unbearably, and if the TCP session breaks, you have to start all over, but for most dumps this is not a major concern.
Lastly, I want to clear up one point of common confusion.
Despite how often you see these flags in mysqldump examples and tutorials, they are superfluous because they are turned ON by default:
--opt
--add-drop-table
--add-locks
--create-options
--disable-keys
--extended-insert
--lock-tables
--quick
--set-charset.
From http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html:
Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.
Of those behaviors, "--quick" is one of the most important (it skips caching the entire result set in mysqld before transmitting the first row), and it can be used with "mysql" (which does NOT turn --quick on by default) to dramatically speed up queries that return a large result set (e.g. dumping all the rows of a big table).
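For example, a sketch of pulling a big table through the mysql client with --quick (the table and database names are placeholders):
mysql --quick -e "SELECT * FROM big_table" mydb > big_table.tsv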
Pipe it directly to another instance, to avoid disk overhead. Don't bother with --compress unless you're running over a slow network, since on a fast LAN or loopback the network overhead doesn't matter.
I think it will be a lot faster and save you disk space if you tried database replication as opposed to using mysqldump. Personally, I use SQLyog Enterprise for my really heavy lifting, but there are also a number of other tools that can provide the same services. Unless, of course, you would like to use only mysqldump.
For InnoDB, --order-by-primary --extended-insert is usually the best combo. If you're after every last bit of performance and the target box has many CPU cores, you might want to split the resulting dump file and do parallel inserts in many threads, up to innodb_thread_concurrency/2.
Also, tweak innodb_buffer_pool_size on the target to the max you can afford, and increase innodb_log_file_size to 128 or 256 MB (careful with this: you need to remove the old log files before restarting the mysql daemon, otherwise it won't restart).
Use mk-parallel-dump tool from Maatkit.
At least that would probably be faster. I'd trust mysqldump more.
How often are you doing this? Is it really an application performance problem? Perhaps you should design a way of doing this which doesn't need to dump the whole data (replication?)
On the other hand, 1.5G is quite a small database so it probably won't be much of a problem.
mydumper is a good choice, with parallel export, even parallel threads per table, and compressed files; see: