Import large MySQL file without replication lag

I'm about to import a 5 GB table on the command line:
mysql -u dbuser -p customersdb < transactions.sql
Previously I imported a 2 GB file and that caused replication to lag for long periods of time. Is there any way to avoid that here? Somehow adding a pause after every few thousand inserts seems like it would be ideal.
I've tried googling it but it doesn't seem like this use case comes up often.
Edit: Additionally, is there any way to monitor the progress of an import?

The issue causing the lag is that the slave SQL thread is single-threaded by default. All operations - those from your import and everything else - go through a single queue.
Starting with MySQL 5.6 you can enable multi-threading there by setting the slave_parallel_workers option. In MySQL 5.6 this distributes operations from different schemas across workers; with 5.7 it can also parallelize within a single schema.
See https://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html#sysvar_slave_parallel_workers
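As a rough sketch, enabling it on the replica would look something like the following (the worker count of 4 is just an illustrative value, and slave_parallel_type only exists from 5.7 on):
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 4;  -- illustrative value
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';  -- 5.7 only: parallelize within a schema
START SLAVE SQL_THREAD;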

Related

Good idea to use SQS to move thousands of databases?

We want to move from using MySQL on an EC2 instance to RDS and set up replication. Seems like a no-brainer, right? Well, I've got 30,000 databases to move (don't ask). While setting up replication seems to work well, the process of getting the 30,000 databases into RDS is a royal pain; it takes forever and something almost always happens.
The nightly backup takes about two hours. I end up with a multi-GB SQL dump file. When I try to restore it, something almost always goes wrong: the RDS instance wasn't big enough memory-wise and crashed, the localhost ran out of swap space, the network connection went flaky. Whatever! I did get it to restore once; IIRC it took 23 hours (30K MySQL DBs are a ton of file IO).
So today, I decided to use mydumper. It generated 30,000 schema files for the database in about two hours, then suddenly, the source MySQL went into uninterruptible sleep according to top, I lost my client connections, strace showed it was still trying to read files, and the mydumper process crashed. I restarted the whole process and just checked the status; mysqld restarted 2.5 hours into it for some reason.
So here's what I'm thinking and I'd like your input: I write two Python scripts. firstScript.py runs mydumper on a single database, updates a status table, packages up the SQL, and puts it onto an AWS SQS queue, repeating until no more databases are found; secondScript.py reads from the queue, runs the SQL, and updates the status table, repeating until no more messages are found.
I think this can work. Do you? The main thing I'm not sure of is this: can I simply run multiple secondScript.py by Ctrl-Z-ing them into the background?
Or does someone have a better way of moving 30,000 databases?
I would not use mysqldump or mydumper to make a logical dump. Loading the resulting SQL-format dump takes too long.
Instead, use Percona XtraBackup to make a physical backup of your EC2 instance, and upload the backup to S3. Then restore to the RDS instance from S3, set up replication on the RDS instance to your EC2 instance, and let it catch up.
The feature of restoring a physical MySQL backup to RDS was announced in November 2017.
See also:
https://www.percona.com/blog/2018/04/02/migrate-to-amazon-rds-with-percona-xtrabackup/
https://aws.amazon.com/about-aws/whats-new/2017/11/easily-restore-an-amazon-rds-mysql-database-from-your-mysql-backup/
You should try it out with a smaller instance than your 30k databases just so you get some practice with the steps. See the steps in the Percona blog I linked to above.
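A minimal sketch of that sequence, with bucket names, role ARN, binlog coordinates, and credentials all illustrative placeholders (the exact flags are in the AWS and Percona pages linked above):
# on the EC2 source: stream a physical backup straight to S3
xtrabackup --backup --stream=xbstream --target-dir=/tmp | aws s3 cp - s3://my-backup-bucket/backup.xbstream
# create the RDS instance from that backup (instance class, storage, and credential flags omitted here)
aws rds restore-db-instance-from-s3 --db-instance-identifier mydb --engine mysql --source-engine mysql --source-engine-version 5.6.39 --s3-bucket-name my-backup-bucket --s3-ingestion-role-arn <role-arn> ...
# then, connected to the new RDS instance, point it at the EC2 master and let it catch up
CALL mysql.rds_set_external_master('ec2-host', 3306, 'repl_user', 'repl_pass', 'mysql-bin.000001', 4, 0);
CALL mysql.rds_start_replication;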

What is the risk of `mysql_upgrade` without doing `mysqldump`?

After upgrading MySQL from 5.5 to 5.7 a few months ago, I forgot to run mysql_upgrade.
Now I'm facing some problems: the mysql, sys, and performance_schema databases are missing and root privileges are broken. A lot of Access denied for user 'root'... messages pop up when I try to do anything involving MySQL user privileges.
This Stack answer should solve my problem. But I need to know it won't affect any of the schemas, data, etc.
My database is pretty large: it amounts to 10 GB and consists of about 50 tables, and I'm afraid something bad could happen. I know the answer will be to take a mysqldump first.
But the full backup will take a long time, maybe an hour, and the business won't accept that downtime.
So what is the risk of mysql_upgrade without doing mysqldump?
The risk of doing anything administrative to your database without backups is unacceptably high... not because of any limitations in MySQL per se, but because we're talking about something critical to your business. You should be backing it up no less often than the interval of data you are willing to lose.
If you are using InnoDB, then use the --single-transaction option of mysqldump and there should be no locking, because MVCC handles the consistency. If you are not using InnoDB, that is a problem in itself, but using --skip-lock-tables should minimize locking.
Note that it should be quite safe to kill a mysqldump in progress if you find it is causing issues -- find the thread-id of the dump using SHOW PROCESSLIST; and then KILL QUERY #; where # is the ID of the dump connection from the process list.
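For example, a minimal sketch (the database name and thread id are purely illustrative):
# non-blocking logical backup of an InnoDB schema
mysqldump --single-transaction mydb | gzip > mydb.sql.gz
# and, from another mysql session, if the dump has to be stopped:
SHOW PROCESSLIST;
KILL QUERY 1234;  -- 1234 being the Id of the mysqldump connection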
The potential problem with the answer you cited is that 5.1 to 5.5 is a supported upgrade path, because those two versions are sequential. 5.5 to 5.7 isn't. You should have upgraded to 5.6 and then to 5.7, running the appropriate version of mysql_upgrade both before and after each step (appropriate meaning the version of the utility matching the version of the server running at the moment).
You may be in a more delicate situation than you imagine... or you may not.
Given similar circumstances, I would not want to do anything less than stop the server completely, clone it by copying all the files to a new machine, and test the remediation steps against the clone.
If this system is business-critical, it should have a live, running replica server, that could be promoted to master and permanently replace this machine in the event of a failure. In a circumstance like this one, you would apply your fixes to the replica and promote it.
Access denied for user 'root'... may or may not be related to the schema incompatibilities.

MySQL full lock on big import

I have one MySQL server with 5 databases.
I was using phpMyAdmin's CSV import to load a very large amount of data into one table of one database.
I understand that all other operations on this machine may get slower due to the amount of processing involved, but MySQL is simply not responding to any other simultaneous request, even against another table or another database.
Because of this, Apache doesn't answer any request that needs a database connection (it just keeps loading forever).
After the import is finished, Apache and MySQL go back to working normally... I don't need to restart or execute any other command.
My question is: is this behavior normal? Should MySQL stop answering all other requests because of a single giant one?
I'm afraid that if I have a big query running in one database on this server, all my other databases will be locked as well and my applications will stop working.

Connection during MySQL import

What happens to a long lasting query executed from commandline via SSH if the connection to MySQL or SSH is lost?
Context:
We have 2 servers with a very large MySQL database on them. One server acts as the Master, and the other as Slave. During regular maintenance, the replication became corrupt, and we noticed data was missing from the slave, even though it reported Seconds_Behind_Master = 0.
So I am in the process of repairing the replication. I am, as we speak, importing one of two large dumps into the slave. I am connected to MySQL through SSH, and used the MySQL "\. file.sql" command to import the dump.
Right now I am constantly getting results like "Query OK, 6798 rows affected".
It has been running for probably 30 minutes now. My question and worry is, what happens if I lose connection through SSH while this is running?
I have another, even larger dump to import after this.
Thanks for the answer!
-Steve
If you lose your connection, all children of your bash process will die, including the mysql client.
To avoid this problem, use the screen command.
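For example (the session and file names are illustrative):
# start a named screen session and run the import inside it
screen -S import
mysql -u root -p thedb
mysql> \. file.sql
# detach with Ctrl-a d; the import keeps running, and you can reattach later with:
screen -r import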

mysqldump | mysql yields 'too many open files' error. Why?

I have a RHEL 5 system with a fresh new hard drive I just dedicated to the MySQL server. To get things started, I used "mysqldump --host otherhost -A | mysql", even though I noticed the manpage never explicitly recommends trying this (mysqldump into a file is a no-go. We're talking 500G of database).
This process fails at random intervals, complaining that too many files are open (at which point mysqld gets the relevant signal, and dies and respawns).
I tried upping the limit via sysctl and ulimit, but the problem persists. What do I do about it?
mysqldump by default performs a per-table lock of all involved tables. If you have many tables, that can exceed the number of file descriptors available to the mysql server process.
Try --skip-lock-tables, or, if locking is imperative, --lock-all-tables.
http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html
--lock-all-tables, -x
Lock all tables across all databases. This is achieved by acquiring a global read lock for the duration of the whole dump. This option automatically turns off --single-transaction and --lock-tables.
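In other words, something along these lines for the piped copy (and, if every table is InnoDB, --single-transaction is another option that avoids the locking entirely):
# skip the per-table LOCK TABLES entirely
mysqldump --host otherhost -A --skip-lock-tables | mysql
# or take one global read lock instead of opening every table
mysqldump --host otherhost -A --lock-all-tables | mysql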
mysqldump has been reported to yield that error for larger databases (1, 2, 3). Explanation and workaround from MySQL Bugs:
[3 Feb 2007 22:00] Sergei Golubchik
This is not really a bug.
mysqldump by default has --lock-tables enabled, which means it tries to lock all tables to be dumped before starting the dump. And doing LOCK TABLES t1, t2, ... for really big number of tables will inevitably exhaust all available file descriptors, as LOCK needs all tables to be opened.
Workarounds: --skip-lock-tables will disable such a locking completely. Alternatively, --lock-all-tables will make mysqldump to use FLUSH TABLES WITH READ LOCK which locks all tables in all databases (without opening them). In this case mysqldump will automatically disable --lock-tables because it makes no sense when --lock-all-tables is used.
Edit: Please check Dave's workaround for InnoDB in the comment below.
If your database is that large, you've got a few issues:
1. You have to lock the tables to dump the data.
2. mysqldump will take a very, very long time and your tables will need to be locked during this time.
3. Importing the data on the new server will also take a long time.
Since your database is going to be essentially unusable while #1 and #2 are happening, I would actually recommend stopping the database and using rsync to copy the files to the other server. It's faster than using mysqldump and much faster than importing, because you don't have the added I/O and CPU of generating indexes.
In production environments on Linux, many people put MySQL data on an LVM partition. Then they stop the database, take an LVM snapshot, start the database, and copy off the state of the stopped database at their leisure.
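A rough sketch of both approaches (paths, hostnames, and volume names are all illustrative; on RHEL 5 the service command manages mysqld):
# plain copy: stop mysqld, rsync the data directory to the new server, start again
service mysqld stop
rsync -av /var/lib/mysql/ otherhost:/var/lib/mysql/
service mysqld start
# LVM variant: only stay down long enough to take the snapshot, then copy at leisure
service mysqld stop
lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql-data
service mysqld start
mount /dev/vg0/mysql-snap /mnt/mysql-snap
rsync -av /mnt/mysql-snap/ otherhost:/var/lib/mysql/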
I just restarted the MySQL server and then I could use the mysqldump command flawlessly.
Thought this might be a helpful tip here.