Slow write of database using `mysqldump` - mysql

I'm trying to automate a mysql dump of all databases from an Azure Database for MySQL Server. Current size of databases:
mysql> SELECT table_schema "DB Name", Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
FROM information_schema.tables GROUP BY table_schema;
+--------------------+---------------+
| DB Name            | DB Size in MB |
+--------------------+---------------+
| db1                |         278.3 |
| db2                |          51.8 |
| information_schema |           0.2 |
| mysql              |           8.9 |
| performance_schema |           0.0 |
| db3                |          43.3 |
| sys                |           0.0 |
+--------------------+---------------+
7 rows in set (31.80 sec)
I have a python script, on a different VM, that calls mysqldump to dump all of these into a file. However, I'm running into an issue with db1: it is being dumped to a file, but very slowly, less than ~4 MB in 30 minutes. db2 and db3, on the other hand, are dumped almost immediately, in seconds.
I have tried all of the following options and combinations to see if the write speed changes, but it doesn't:
--compress
--lock-tables (true / false)
--skip-lock-tables
--max-allowed-packet (512M)
--quick
--single-transaction
--opt
I'm currently not even using the script, just running the commands in a shell, with the same result.
mysqldump -h <host> -P <port> -u'<user>' -p'<password>' db1 > db1.sql
db1 has ~500 tables.
I understand that it is bigger than db2 and db3, but not by that much, so I'm wondering: does anyone know what could be the issue here?
EDIT
After these helpful answers and some research suggested that the database itself is most likely fine, I ran a test: I duplicated the db1 database on the server into a test database and then deleted tables one by one to decrease its size. At around 50 MB, the writes became instant, just like for the other databases. This leads me to believe that some throttling is going on in Azure, because the database itself is fine, and we will take it up with their support team. I have also found a lot of posts complaining about Azure database speeds in general.
In the meantime, I changed the script to ignore large databases. We will also try to move the databases to a SQL Server provided by Azure, or to a simple VM with a MySQL server on it, to see where we get better performance.

It's possible it's slow on the MySQL Server end, but it seems unlikely. You can open a second shell window, connect to MySQL and use SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS to check for stuck queries or locks.
It's also possible it's having trouble writing the data to db1.sql, if you have very slow storage. But 4 MB in 30 minutes is ridiculously slow. Make sure you're saving to storage local to the instance you're running mysqldump on; don't save to remote storage. Also check whether the storage volume to which you're writing the dump has other heavy I/O traffic saturating it, as that could slow down writes.
Another way you can test for slow data writes is to try mysqldump ... > /dev/null and if that is fast, then it's a pretty good clue that the slowness is the fault of the disk writes.
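The `/dev/null` comparison above can be made quantitative with a small helper that reports how fast a stream is actually moving. This is a sketch, not part of the original answer; `rate_mb` is a hypothetical helper name, and the mysqldump options/credentials are the ones from the question.

```shell
#!/bin/sh
# rate_mb: consume stdin, then report approximate throughput in MB/s.
rate_mb() {
  start=$(date +%s)
  bytes=$(wc -c)                # count all bytes arriving on stdin
  end=$(date +%s)
  secs=$((end - start))
  [ "$secs" -eq 0 ] && secs=1   # avoid division by zero on fast streams
  echo "$((bytes / 1024 / 1024 / secs)) MB/s"
}

# Compare the two paths (credentials as in the question):
#   mysqldump -h <host> -u '<user>' -p'<password>' db1 | rate_mb   # server + network only
#   mysqldump ... db1 > db1.sql                                    # adds local disk writes
head -c 10485760 /dev/zero | rate_mb   # demo on 10 MB of local data
```

If the piped rate is high but writing `db1.sql` is slow, the disk is the bottleneck; if both are slow, look at the server or the network.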
Finally, there's a possibility that the network is causing the slowness. If saving the dump file to /dev/null is still slow, I'd suspect the network.
An answer in https://serverfault.com/questions/233963/mysql-checking-permission-takes-a-long-time suggests that slowness in the "checking permissions" state might be caused by having too much data in the MySQL grant tables (e.g. mysql.user). If you have thousands of user credentials, this could be the cause. You can try eliminating these entries (and run FLUSH PRIVILEGES afterwards).

Create a backup of your database first. After that, try checking (and, if needed, repairing) the tables:
mysqlcheck <database>
More info: see the mysqlcheck documentation.

Related

Best method to transfer/clone large MySQL databases to another server

I am running a web application within a shared hosting environment which uses a MYSQL database which is about 3GB large.
For testing purposes I have set up a XAMPP environment on my local macOS machine. To copy the online DB to my local machine, I used mysqldump on the server and then directly imported the dump file into mysql:
// Server
$ mysqldump -alv -h127.0.0.3 --default-character-set=utf8 -u dbUser -p'dbPass' --extended-insert dbName > dbDump.sql
// Local machine
$ mysql -h 127.0.0.3 -u dbUser -p'dbPass' dbName < dbDump.sql
The only optimization here is the use of extended-insert. However, the import takes about 10 hours!
After some searching I found that disabling foreign key checks during the import should speed up the process, so I added the following line at the beginning of the dump file:
// dbDump.sql
SET FOREIGN_KEY_CHECKS=0;
...
However, this did not make any significant difference... the import now took about 8 hours. Faster, but still pretty long.
Why does it take so much time to import the data? Is there a better/faster way to do this?
The server is not the fastest (shared hosting...), but it takes only about 2 minutes to export/dump the data. That exporting is faster than importing (no syntax checks, no parsing, just writing) is not surprising, but 300 times faster (10 hours vs. 2 minutes)? That is a huge difference...
Isn't there any other solution that would be faster? Copy the binary DB file instead, for example? Anything would be better than using a text file as transfer medium.
This is not just about transferring the data to another machine for testing purposes. I also create daily backups of the database. If it would be necessary to restore the DB it would be pretty bad if the site is down for 10 hours...
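One way to get the effect of the `SET FOREIGN_KEY_CHECKS=0;` edit without touching the (possibly huge) dump file is to wrap it at import time. A sketch only: `wrap_dump` is a hypothetical helper, and the host/credentials are the ones from the commands above.

```shell
#!/bin/sh
# wrap_dump FILE: emit FILE surrounded by statements that disable key checks
# and autocommit for the duration of the import, then restore them.
wrap_dump() {
  echo "SET FOREIGN_KEY_CHECKS=0;"
  echo "SET UNIQUE_CHECKS=0;"
  echo "SET autocommit=0;"
  cat "$1"
  echo "COMMIT;"
  echo "SET UNIQUE_CHECKS=1;"
  echo "SET FOREIGN_KEY_CHECKS=1;"
}

# Usage (credentials as above):
#   wrap_dump dbDump.sql | mysql -h 127.0.0.3 -u dbUser -p'dbPass' dbName
```

This keeps the original dump file pristine, which also matters for the daily-backup case: the backup stays a plain dump, and the speed-up statements are only added on the way in.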

Slow running MySql restore - 10 times slower than backup speed and still going

I'm moving a large (~80GB) database from its testbed into what will be its production environment. We're working on Windows servers. This is the first time we've worked with MySQL and we're still learning the expected behaviours.
We backed up the data with
mysqldump -u root -p --opt [database name] > [database name].sql
Which took about 3 hours and created a file 45GB in size. It copied over to its new home overnight and, next morning, I used MySQL Workbench to launch a restore. According to its log, it ran
mysql.exe --defaults-file="[a path]\tmpc8tz9l.cnf" --protocol=tcp --host=127.0.0.1 --user=[me] --port=3306 --default-character-set=utf8 --comments --database=[database name] < "H:\[database name].sql"
And it's working - if I connect to the instance I can see the database and some of its tables.
The trouble is, it seems to be taking forever. I presumed it would restore in roughly the same 3-4 hour time frame it took to back up, maybe faster because it's restoring onto a more powerful server with SSD drives.
But it's now about 36 hours since the restore started and the DB is apparently 30GB in size. And it appears to be getting slower as it goes on.
I don't want to interrupt it now that it's started working so I guess I just have to wait. But for future reference: is this treacle-slow restore speed normal? Is there anything we can do it improve matters next time we need to restore a big DB?
Very large imports are notoriously hard to make fast. It sounds like your import is slowing down--processing fewer rows per second--as it progresses. That probably means MySQL is checking each new row to see whether it has key-conflicts with the rows already inserted.
A few things you can do:
Before starting, disable key checking.
SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
After the import ends, restore your key checking:
SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
And, if you can wrap every few thousand lines of INSERT operations in
START TRANSACTION;
INSERT ...
INSERT ...
...
COMMIT;
you'll save a lot of disk churning.
Notice that this only matters for tables with many thousands of rows or more.
mysqldump can be made to create a dump that disables keys: https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_disable-keys
mysqldump --disable-keys
Similarly,
mysqldump --extended-insert --no-autocommit
will make the dumped sql file contain a variant of my suggestion about using transactions.
In your case, if you had used --opt --no-autocommit you probably would have gotten an optimal dump file (you already used --opt; it just lacked --no-autocommit).
I changed my.ini and got some improvements while also using mysqldump --extended-insert --no-autocommit
my.ini for 16 GB RAM on Windows 10, MySQL under XAMPP 7.4
# Comment the following if you are using InnoDB tables
#skip-innodb
innodb_data_home_dir="C:/xampp74/mysql/data"
innodb_data_file_path=ibdata1:10M:autoextend
innodb_log_group_home_dir="C:/xampp74/mysql/data"
#innodb_log_arch_dir = "C:/xampp74/mysql/data"
## You can set .._buffer_pool_size up to 50 - 80 %
## of RAM but beware of setting memory usage too high
#innodb_buffer_pool_size=16M
innodb_buffer_pool_size=8G
## Set .._log_file_size to 25 % of buffer pool size
#innodb_log_file_size=5M
innodb_log_file_size=2G
innodb_log_buffer_size=8M
#innodb_flush_log_at_trx_commit=1
#Use for restore only
innodb_flush_log_at_trx_commit=2
innodb_lock_wait_timeout=50

ERROR 2006 (HY000) at line MySQL server has gone away

Problem
I encountered this error during a MySQL DB dump and restore. None of the solutions posted anywhere solved my problem, so I thought I'd post the answer I found on my own, for posterity.
Source Env:
CentOS 4 i386 ext3, MySQL 5.5 dump. Most table engines are MyISAM, with a few InnoDBs.
Destination Env:
CentOS 6 x86_64 XFS, MySQL 5.6
Source DB is 25GB on disk, and a gzipped dump is 4.5GB.
Dump
Dump command from source -> destination was run like so:
mysqldump $DB_NAME | gzip -c | sudo ssh $USER@$IP_ADDRESS 'cat > /$PATH/$DB_NAME-`date +%d-%b-%Y`.gz'
This makes the dump, gzips it on the fly, and writes it over SSH to the destination. You don't have to do it this way, but it is convenient.
Import
On the new source DB I ran the import like so:
gunzip < /$PATH/$DB_NAME.gz | mysql -u root $DB_NAME
Note that you have to issue a CREATE DATABASE DB_NAME statement to make the new empty destination DB before starting the import.
Every time I tried this, I got this type of error:
ERROR 2006 (HY000) at line MySQL server has gone away
Source DB conf
My source DB is a virt server using VMWare so I can resize the RAM/CPU as needed. For this project I temporarily scaled up to 8CPU/16GB of RAM, and then scaled back down after the import. This is a luxury I had, that you may not.
With so much RAM I was able to tune the heck out of the /etc/my.cnf file. Everyone else had suggested increasing
max-allowed-packet
bulk_insert_buffer_size
to double or triple the default values. This didn't fix it for me. Then, after reading more online, I tried increasing the timeouts:
interactive_timeout
wait_timeout
net_read_timeout
net_write_timeout
connect_timeout
I did this and it still didn't work. So then I went crazy and set everything unreasonably high. Here is what I ended up with:
key_buffer_size=512M
table_cache=2G
sort_buffer_size=512M
max-allowed-packet=2G
bulk_insert_buffer_size=2G
innodb_flush_log_at_trx_commit = 0
net_buffer_length=1000000
innodb_buffer_pool_size=3G
innodb_file_per_table
interactive_timeout=600
wait_timeout=600
net_read_timeout=300
net_write_timeout=300
connect_timeout=300
Still no luck. I felt deflated. Then I noticed that the import kept failing at the same spot. So I reviewed the SQL. I noticed nothing strange. Nothing in the log files either.
Solution
There's something about the DB structure that's causing the import to fail. I suspect it's size related, but who knows.
To fix it I started splitting the dumps up into smaller chunks. The source DB has about 75 tables, so I made 3 dumps with approximately 25 tables each. You just have to pass the table names to the dump command. For example:
mysqldump $DB_NAME $TABLE1 $TABLE2 ... $TABLE25 | gzip -c | sudo ssh $USER@$IP_ADDRESS 'cat > /$PATH/$DB_NAME-TABLES1-25-`date +%d-%b-%Y`.gz'
Then I simply imported each chunk independently on the destination. Finally, no errors. Hopefully this is useful to someone else.
The answer to this question was to split the dump into chunks by tables. Then do multiple imports. See details in the original post.

MySQL Query with LARGE number of records gets Killed

I run the following query from my shell :
mysql -h my-host.net -u myuser -p -e "SELECT component_id, parent_component_id FROM myschema.components comp INNER JOIN my_second_schema.component_parents related_comp ON comp.id = related_comp.component_id ORDER BY component_id;" > /tmp/IT_component_parents.txt
The query runs for a LONG time and then gets KILLED.
However if I add LIMIT 1000, then the query runs till the end and output is written in file.
I further investigated and found (using COUNT(*)) that the total number of records that would be returned are 239553163.
Some information about my server is here:
MySQL 5.5.27
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 10       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| wait_timeout               | 28800    |
+----------------------------+----------+
Here's STATE of the query as I monitored :
copying to tmp table on disk
sorting results
sending data
writing to net
sending data
writing to net
sending data
writing to net
sending data ...
KILLED
Any guesses what's wrong here ?
The mysql client probably runs out of memory.
Use the --quick option to not buffer results in memory.
What is wrong is that you are returning 239,553,163 rows of data! Don't be surprised if it takes a lot of time to process. Actually, the longest part might very well be sending the result set back to your client.
Reduce the result set (do you really need all these rows?). Or try to output the data in smaller batches:
mysql -h my-host.net -u myuser -p -e "SELECT ... LIMIT 0, 10000" >> dump.txt
mysql -h my-host.net -u myuser -p -e "SELECT ... LIMIT 10000, 10000" >> dump.txt
Assuming you mean 8 hours when you say a long time: the value 28800 for your wait_timeout causes the connection to be dropped after 28,800 seconds, i.e. 8 hours, with no further activity. If you can't optimize the statement to run in less than 8 hours, you should increase this value.
See this page for further information on the wait_timeout variable.
The interactive_timeout variable is used for interactive client connections, so if you run long queries from an interactive session, that's the one you need to look at.
You may want to use the SELECT ... INTO OUTFILE mechanism if you are going to dump large amounts of data. That, or mysqldump, will be much more efficient (and OUTFILE has the benefit of not locking down the table).
You said in a comment that your MySQL instance is on RDS. This means you can't be running the query from the same host, since you can't log into an RDS host. I guess you might be doing this query over the WAN from your local network.
You're most likely having trouble because of a slow network. Your process state frequently showing "writing to net" makes me think this is your bottleneck.
Your bottleneck might also be the sorting. Your sort is writing to a temp table, and that can take a long time for a result set that large. Can you skip the ORDER BY?
Even so, I wouldn't expect the query to be killed even if it runs for 3100 seconds or more. I wonder if your DBA has some periodic job killing long-running queries, like pt-kill. Ask your DBA.
To reduce network transfer time, you could try using the compression protocol. You can use the --compress or -C flags to the mysql client for this (see https://dev.mysql.com/doc/refman/5.7/en/mysql-command-options.html#option_mysql_compress)
On a slow network, compression can help. For example, read about some comparisons here: https://www.percona.com/blog/2007/12/20/large-result-sets-vs-compression-protocol/
Another alternative is to run the query from an EC2 spot instance running in the same AZ as your RDS instance. The network between those two instances will be a lot faster, so it won't delay your data transfer. Save the query output to a file on the EC2 spot instance.
Once the query result is saved on your EC2 instance, you can download it to your local machine, using scp or something, which should be more tolerant of slow networks.

how to 'load data infile' on amazon RDS?

Not sure if this is a question better suited for Server Fault, but I've been messing with Amazon RDS lately and was having trouble granting 'file' privileges to my web host's mysql user.
I'd assume that a simple:
grant file on *.* to 'webuser'@'%';
would work, but it does not, and I can't seem to do it with my 'root' user either. What gives? The reason we use load data is because it is super super fast for doing thousands of inserts at once.
anyone know how to remedy this or do I need to find a different way?
This page, http://docs.amazonwebservices.com/AmazonRDS/latest/DeveloperGuide/index.html?Concepts.DBInstance.html seems to suggest that I need to find a different way around this.
Help?
UPDATE
I'm not trying to import a database -- I just want to use the file load option to insert several hundred-thousand rows at a time.
after digging around this is what we have:
mysql> grant file on *.* to 'devuser'@'%';
ERROR 1045 (28000): Access denied for user 'root'@'%' (using password: YES)
mysql> select User, File_priv, Grant_priv, Super_priv from mysql.user;
+----------+-----------+------------+------------+
| User     | File_priv | Grant_priv | Super_priv |
+----------+-----------+------------+------------+
| rdsadmin | Y         | Y          | Y          |
| root     | N         | Y          | N          |
| devuser  | N         | N          | N          |
+----------+-----------+------------+------------+
You need to use LOAD DATA LOCAL INFILE as the file is not on the MySQL server, but is on the machine you are running the command from.
As per comment below you may also need to include the flag:
--local-infile=1
For whatever it's worth... You can add the LOCAL operand to the LOAD DATA INFILE instead of using mysqlimport to get around this problem.
LOAD DATA LOCAL INFILE ...
This will work without granting FILE permissions.
Also struggled with this issue, trying to upload .csv data into AWS RDS instance from my local machine using MySQL Workbench on Windows.
The addition I needed was adding OPT_LOCAL_INFILE=1 in: Connection > Advanced > Others. Note CAPS was required.
I found this answer by PeterMag in AWS Developer Forums.
For further info:
SHOW VARIABLES LIKE 'local_infile'; already returned ON
and the query was:
LOAD DATA LOCAL INFILE 'filepath/file.csv'
INTO TABLE `table_name`
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Copying from the answer source referenced above:
Apparently this is a bug in MYSQL Workbench V8.X. In addition to the
configurations shown earlier in this thread, you also need to change
the MYSQL Connection in Workbench as follows:
Go to the Welcome page of MYSQL which displays all your connections
Select Manage Server Connections (the little spanner icon)
Select your connection
Select Advanced tab
In the Others box, add OPT_LOCAL_INFILE=1
Now I can use the LOAD DATA LOCAL INFILE query on MYSQL RDS. It seems
that the File_priv permission is not required.
Pretty sure you can't do it yet, as you don't have the highest level MySQL privileges with RDS. We've only done a little testing, but the easiest way to import a database seems to be to pipe it from the source box, e.g.
mysqldump MYDB | mysql -h rds-amazon-blah.com --user=youruser --pass=thepass
Importing bulk data into Amazon MySQL RDS is possible in two ways. You can choose whichever is more convenient.
Using Import utility.
mysqlimport --local --compress -u <user-name> -p<password> -h <host-address> <database-name> --fields-terminated-by=',' TEST_TABLE.csv
--Make sure the utility inserts the data into TEST_TABLE only (mysqlimport derives the table name from the file name).
Sending a bulk insert SQL by piping into into mysql command.
mysql -u <user-name> -p<password> -h <host-address> <database-name> < TEST_TABLE_INSERT.SQL
--Here the file TEST_TABLE_INSERT.SQL will have a bulk-insert SQL statement like the one below
--insert into TEST_TABLE values('1','test1','2017-09-08'),('2','test2','2017-09-08'),('3','test3','2017-09-08'),('3','test3','2017-09-08');
I ran into similar issues. I was in fact trying to import a database but the conditions should be the same - I needed to use load data due to the size of some tables, a spotty connection, and the desire for a modest resume functionality.
I agree with chris finne that not specifying the LOCAL option can lead to that error. After many fits and starts I found that the mk-parallel-restore tool from Maatkit provided what I needed, with some excellent extra features. It might be a great match for your use case.