I have two MySQL servers. One is the master of the other (actually the master is itself a slave of another server). Both are running on similar remote servers (same amount of RAM). Everything is working fine, except that the slave takes 2-3 times longer than the master to run the same large query. Can someone think of a reason for this problem?
If there are writes (coming from Master) to a, b, etc, then the SELECT will run slower on the Slave. How much slower depends on far too many things to predict.
To 'prove' that, do (on the Slave) STOP SLAVE SQL_THREAD;, run the query, then START SLAVE SQL_THREAD;. That turns off the replicated writes for the duration of the SELECT.
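A minimal sketch of that test, with the tables a and b standing in for whatever your real query touches:

```sql
-- On the Slave: stop only the applier; the IO thread keeps downloading
-- relay logs, so nothing is lost while it is paused.
STOP SLAVE SQL_THREAD;

-- Run the slow query with no replicated writes competing for resources.
SELECT COUNT(*) FROM a JOIN b USING (id);

-- Resume applying the relayed events.
START SLAVE SQL_THREAD;
```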
But what about "fixing" it?
If the "writes" are INSERTs, can they be 'batched'? That is, INSERT multiple rows in the same statement. This may not 'solve' the problem, but it could mitigate it.
Related
We set up database replication about a week ago, and we are having an issue keeping it in sync.
The setup is master-master replication with MariaDB 10.1.35/MySQL 5.5.5. Only one database receives application traffic; the other will only be used as a backup. I will refer to that one as the slave, and it's the slave we're having issues with. The replication is statement based.
The first 24 hours went fine. The next day, the slave fell further and further behind, until it was almost 24 hours behind. When we checked 24 hours later, the slave was back on track again, behind the master by just a few seconds.
Now again, it's starting to get behind more and more (over 5 hours of data now).
It's still syncing, so the replication itself is working. However, some queries just take way too long on the slave, which is delaying everything.
All queries are being executed quite fast, except for one UPDATE query. It's this one that stays in the processlist for 5, 10 and sometimes even 20 or 30 seconds. The query is handled in less than a second on the master, and when we execute it manually on the slave it doesn't take longer than a second either. So we don't think it's related to the query itself. The structure of both databases/tables is exactly the same. The storage engine of the table is InnoDB.
At this point, we have no clue what could be causing this delay. Inserts are being processed instantly.
There's one difference in the processlist when the query is being executed on the slave: the Command column stays on 'Connect', while it says 'Execute' on the master. Is this normal behaviour?
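For reference, this is how we're comparing the two; nothing here is specific to our schema, it's just the standard processlist output:

```sql
-- Run on both servers while the replicated UPDATE is in flight;
-- this is where we see Command = 'Connect' on the slave and
-- Command = 'Execute' on the master for the same statement.
SHOW FULL PROCESSLIST;
```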
If I should provide more information, please let me know. It's clear that a slave only handles one query at a time, so it can fall behind if there are a lot of queries on the master, but that query should not need up to 30 seconds when it takes less than one second executed manually.
Thank you.
P.S. We already optimized the table (OPTIMIZE TABLE), but unfortunately that didn't make a difference.
We have a MySQL master with a replica (5.7 with row-based replication).
Now, the master performs at peak about 3000 inserts per second, and the replica seems to keep up with that just fine. However, sometimes we execute long-running SELECT queries (10 to 20 seconds), and during those queries the replication lag becomes very large.
What I do not understand is how ordinary MySQL threads executing SELECTs (without locking any tables) can cause the replication thread to slow down (i.e. it performs about 2.5K inserts per second instead of 3K like the master). What exactly would I need to tune?
Now I checked the slave status and it's not the IO thread - that one manages to read events from the master just fine. It's the slave SQL thread that somehow does not manage to catch up. The isolation level is READ COMMITTED, so the SELECT queries could potentially lock some records and make the slave thread wait, but I'm not sure about that.
UPDATED. I have checked again - it turns out that even a single heavy query (one that scans the entire table, for example) on the slave produces the lag. It seems like the slave SQL thread is blocked, but I do not understand why.
UPDATED 2. I finally found the solution. First I increased slave_parallel_workers to 4 and set slave_parallel_type to LOGICAL_CLOCK. However, and this is important, that alone gave me no improvement at all, since the transactions were dependent. But after I increased binlog_group_commit_sync_delay on the master to 10000 (that is, 10 milliseconds), the lag disappeared.
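Roughly the statements involved, for anyone hitting the same thing (the values are from my setup, not general recommendations):

```sql
-- On the replica: enable a parallel applier.
-- slave_parallel_type can only be changed while the SQL thread is stopped.
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type    = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;

-- On the master: delay group commit by 10 ms (the unit is microseconds),
-- so that more independent transactions share a commit group and can be
-- applied in parallel under LOGICAL_CLOCK.
SET GLOBAL binlog_group_commit_sync_delay = 10000;
```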
There might be many reasons for replication lag on a MySQL slave database.
But as you mentioned:
It's the slave SQL thread that somehow does not manage to catch up.
Assuming that IO works fine, Percona says (emphasis mine):
[...] when the slave SQL_THREAD is the source of replication delays it is probably because of queries coming from the replication stream are taking too long to execute on the slave. This is sometimes because of different hardware between master/slave, different schema indexes, workload. Moreover, the slave OLTP workload sometimes causes replication delays because of locking. For instance, if a long-running read against a MyISAM table blocks the SQL thread, or any transaction against an InnoDB table creates an IX lock and blocks DDL in the SQL thread. Also, take into account that slave is single threaded prior to MySQL 5.6, which would be another reason for delays on the slave SQL_THREAD.
I'm trying to understand an issue I am having with a MySQL 5.5 server.
This server hosts a number of databases. Each day at a certain time a process runs a series of inserts into TWO tables within one of these databases. This process lasts from 5 to 15 minutes depending on the amount of rows being inserted.
This process runs perfectly, but it has a very unexpected side effect: all other INSERTs and UPDATEs on tables unrelated to the two being inserted into just sit and wait until the process has finished. Reads and writes outside of this database work just fine, and SELECT statements are fine too.
So how is it possible for a single table to block the rest of a database but not the entire server (due to loading)?
A bit of background:
- The tables being inserted into are MyISAM with 10-20 million rows.
- MySQL is Percona 5.5 and is serving one slave, both running on Debian.
- No explicit locking is requested by the process inserting the records.
- None of the INSERT statements select data from any other table. They are also INSERT IGNORE statements.
ADDITIONAL INFO:
While this is happening there are no locked-table entries in the PROCESSLIST, and the process inserting the records that causes this problem does NOT issue any table locks.
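These are the (standard) checks I kept running while the inserts were in progress:

```sql
-- Any MyISAM tables currently marked as in use or name-locked?
SHOW OPEN TABLES WHERE In_use > 0;

-- What every connection is doing; lock victims would show a state
-- such as 'Waiting for table level lock'.
SHOW FULL PROCESSLIST;
```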
I've already investigated the usual causes of table locking and I think I've ruled them out. This behaviour is either something to do with how MySQL works, a quirk of having large database files, or possibly even something to do with the OS/file system.
After a few weeks of trying things I eventually found this: Yoshinori Matsunobu Blog - MyISAM and Disk IO Scheduler
Yoshinori demonstrates that increasing the disk scheduler queue size to 100000 (from the default of 128) dramatically improves the throughput of MyISAM on most schedulers.
After making this change to my system there were no longer any dramatic instances of the database hanging on MyISAM tables while this process was running. There was a slight slowdown, as is to be expected with the volume of data, but the system remained stable.
Anyone experiencing performance issues with MyISAM should read Yoshinori's blog entry and consider this fix.
I have master-to-master replication set up between two MySQL 5.0 servers.
My requirement is that before starting my application I have to ensure that the databases are identical. I would like to confirm: is Seconds_Behind_Master being 0 in the output of SHOW SLAVE STATUS enough to consider the two databases synchronized?
http://dev.mysql.com/doc/refman/5.0/en/show-slave-status.html
Thank you
No, that is not sufficient to guarantee that the databases are identical, just that the slave has run all statements from the master's binary log. You could update or delete large chunks of data on the slave database and still have seconds behind master = 0, but the slave would certainly not be identical to the master.
You should use a tool like Percona's pt-table-checksum if you really want to verify that the databases are identical.
It's not really a guarantee, as master-to-master replication does not guarantee atomicity across servers.
Also, a single read of 0 could be meaningless, but a consistent read of 0 over time would be a good indication that the servers are synced, if you have no errors reported.
So, I say, read it every second for 10 seconds, and if they're all 0, with no errors, you're probably good enough, short of comparing checksums, as Ike suggested.
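Concretely, these are the fields to poll; this is just the standard SHOW SLAVE STATUS output, not extra tooling:

```sql
-- Run on the slave once a second for ~10 seconds; you want to keep seeing
--   Slave_IO_Running: Yes
--   Slave_SQL_Running: Yes
--   Seconds_Behind_Master: 0
--   Last_Errno: 0
SHOW SLAVE STATUS\G
```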
Currently we have 3 slave databases,
but almost always one of them is extremely slow compared to the others (it can be an hour behind the master database).
Has anyone met a similar problem? What may be the cause?
I'd guess some other process is running on the same host as the slow replica, and it's hogging resources.
Try running "top" (or use Nagios or Cactus or something) to monitor system performance on the three replica hosts and see if there are any trends you can observe. CPU utilization pegged by another process besides mysqld, or I/O constantly saturated, that sort of thing.
update: Read the following two articles by MySQL performance expert Peter Zaitsev:
Managing Slave Lag with MySQL Replication
Fighting MySQL Replication Lag
The author points out that replication is single-threaded, and the replica executes queries sequentially, instead of in parallel as they were executed on the master. So if you have a few replicated queries that are very long-running, they can "hold up the queue."
He suggests the remedy is to simplify long-running SQL queries so they run more quickly. For example:
If you have an UPDATE that affects millions of rows, break it up into multiple UPDATEs that each act on a subset of the rows (see the sketch after this list).
If you have complex SELECT statements incorporated into your UPDATE or INSERT queries, separate the SELECT into its own statement, generate a set of literal values in application code, and then run your UPDATE or INSERT on those. Of course the SELECT won't be replicated; the replica will only see the UPDATE/INSERT with literal values.
If you have a long-running batch job running, it could be blocking other updates from executing on the replica. You can put some sleeps into the batch job, or even write the batch job to check the replication lag at intervals and sleep if needed.
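A rough sketch of the first two suggestions (table and column names are invented for illustration, not taken from the articles):

```sql
-- 1) Chunk a huge UPDATE by primary-key range instead of replicating
--    one statement that runs for minutes; repeat for each range:
UPDATE orders
   SET archived = 1
 WHERE id BETWEEN 1 AND 100000
   AND created_at < '2023-01-01';
-- ...then id BETWEEN 100001 AND 200000, and so on.

-- 2) Instead of replicating INSERT ... SELECT with an expensive SELECT,
--    run the SELECT on its own (it is not replicated), build the values
--    in application code, and insert the literals, which is all the
--    replica ever sees:
INSERT INTO daily_totals (day, total)
VALUES ('2023-01-01', 4211), ('2023-01-02', 3987);
```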
Are all the slave servers located in the same location? In my case, one of the slave servers was located in another location and it turned out to be a network issue.