mysql slave database problem - mysql

Currently we have 3 slave databases,
but almost always there is one of them extremly slow than others(can be an hour after master database)
Has any one met similar problem?What may be the cause?

I'd guess some other process is running on the same host as the slow replica, and it's hogging resources.
Try running "top" (or use Nagios or Cactus or something) to monitor system performance on the three replica hosts and see if there are any trends you can observe. CPU utilization pegged by another process besides mysqld, or I/O constantly saturated, that sort of thing.
update: Read the following two articles by MySQL performance expert Peter Zaitsev:
Managing Slave Lag with MySQL Replication
Fighting MySQL Replication Lag
The author points out that replication is single-threaded, and the replica executes queries sequentially, instead of in parallel as they were executed on the master. So if you have a few replicated queries that are very long-running, they can "hold up the queue."
He suggests the remedy is to simplify long-running SQL queries so they run more quickly. For example:
If you have an UPDATE that affects millions of rows, break it up into multiple UPDATEs that act on a subset of the rows.
If you have complex SELECT statements incorporated into your UPDATE or INSERT queries, separate the SELECT into its own statement, generate a set of literal values in application code, and then run your UPDATE or INSERT on these. Of course the SELECT won't be replicated, the replica will only see the UPDATE/INSERT with literal values.
If you have a long-running batch job running, it could be blocking other updates from executing on the replica. You can put some sleeps into the batch job, or even write the batch job to check the replication lag at intervals and sleep if needed.

Are all the slave servers located in the same location? In my case, one of the slave server was located in another location and it was a network issue.

Related

Why would long running select queries cause a replication lag in MySQL slave database?

we have a MySQL with a replica (5.7 with row based replication).
Now, the master performs at peak about 3000 inserts per second, and the replica seems to read that just fine. However, sometimes we execute long-time select queries (that ran from 10 to 20 seconds). And during those queries the replication lag becomes very huge.
What I do not understand is how the usual mysql threads that execute selects (without locking any tables) can cause the replication thread to slow down (i.e. it performs about 2.5K inserts instead of 3K like master)? What would I need to tune exactly?
Now I checked the slave status and it's not about the IO thread - this one manages to read events from the master just fine. It's SQL slave thread, that somehow does not manage to catch up. The isolation level is Read Committed, so the select queries potentially could lock some records and make the slave thread wait. But I'm not sure about that.
UPDATED. I have checked again - it turns out that even a single heavy query (that scans the entire table for example) on the slave produces the lag. It seems like slave sql thread is blocked, but I do not understand why?
UPDATED 2. I finally found the solution. First I increased number of slave_parallel_workers to 4 and set slave_parallel_type to LOGICAL_CLOCK. However, and this is important, that gave me no improvement at all, since the transactions were dependent. But, after I increased on master binlog_group_commit_sync_delay to 10000 (that is, 10 milliseconds), the lag disappeared.
There Might be many reasons why replication lag in mysql slave database.
But as you mentioned
It's SQL slave thread, that somehow does not manage to catch up.
Assuming that IO works fine, Percona says (emphasis mine):
[...] when the slave SQL_THREAD is the source of replication delays it is probably because of queries coming from the replication stream are taking too long to execute on the slave. This is sometimes because of different hardware between master/slave, different schema indexes, workload. Moreover, the slave OLTP workload sometimes causes replication delays because of locking. For instance, if a long-running read against a MyISAM table blocks the SQL thread, or any transaction against an InnoDB table creates an IX lock and blocks DDL in the SQL thread. Also, take into account that slave is single threaded prior to MySQL 5.6, which would be another reason for delays on the slave SQL_THREAD.

Replication issue

We have a one master and two VIP slave database servers. We changed data type of column from VARCHAR(255) to TEXT on master.
The application is currently configured to use master only for writing operations and configured slaves for reading operation.
After changing the data type on master server using ALTER TABLE command the slave server becomes unresponsive.
We are using Mariadb 10.0
[PROCESSES INFORMATION]
Id User Host Db Command Time(sec) State Info
-----------------------------------------------------------------------
203739 repl slave1 Binlog Dump 75,143,121 Master has sent all binlog to slave; waiting for binlog to be updated
203740 repl slave2 Binlog Dump 75,143,121 Master has sent all binlog to slave; waiting for binlog to be updated
The slave instance becomes very slow due to slow queries.
number of sessions: 1590
thread_pool_max_thread=500
Current value =648
After performing ALTER TABLE on Master server, it was replicating to slave server and in the same time number of sessions were get increased rapidly on slave server.
I think slaves becomes unresponsive because of slow queries.
But I don't know why this queries became so slow and slaves got unresponsive.
The DBA's saying that after executing ANALYZE TABLE command, the issue has been solved.
But I don't understand why this happened because ANALYZE TABLE only update the statistic information.
It would be helpful if anyone comment on this why it happened?
How to avoid such issues in future.
There is one minor case where TEXT is slower than VARCHAR. When a SELECT needs to build a temporary table (often for sorting due to GROUP BY or ORDER BY), it first tries to build a MEMORY table. But, TEXT and BLOB prevent it from using such, so it uses MyISAM instead. This is slower (but gets the job done).
I say this is a "minor case" because users rarely identify it with phrases like "very slow" and "becomes unresponsive". I would guess that a SELECT might run twice as slow.
Also, the ANALYZE TABLE discussion does not hold water. Again it may be coincidence, not causation.
So, the change to TEXT may be a 'red herring'. Instead, let's discover what is being slow by using the slowlog. See this for what I like to work from.

Mysql Replica server's CPU utilization became high in EC2

I have an account in amazone. my primary MySql is a large instance. and i have other 2 small instances as replication servers. The hierarchy is Primary -> slave 1-> slave 2 . The problem is that some times the slave 1 and slave 2 showing high CPU utilization. We couldn't find out the exact reason. Slave 1 is act as a slave of primary and at the same time it act as Master of Slave 2. We searched a lot but we are still stucked as blind.
Thanks in advance for all helps.
As a general rule, when using MySQL native, asynchronous replication -- which is what RDS uses -- the replica servers need be as large as the master or sometimes larger. The replicas receive "replication events" from the master, which may contain the actual queries that modified data -- insert, update, delete but not select -- this is "statement based replication" -- or may receive binary images of the rows added removed or changes on the master -- which is "row based replication." By default, the master server decides, on a query-by-query basis, which format to use to send the replication events ("mixed").
In all cases SELECT statements are not sent to the replicas (except of course for INSERT ... SELECT), but they do need sufficient capacity to handled both the incoming changes from the master, as well as the SELECT queries that are run directly against the replica by your application.
In RDS for MySQL 5.6 and above,you can set the binlog_format to ROW to force the master to always use row-based replication. This might improve your performance, and it might not -- it depends on the workload. You cannot force RDS to use only "statement" mode, nor should you want to.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MySQL.html
In general, however, any time the replica is a machine with fewer resources than the master, the chance of replication lags increases. The lag of the replicas can be monitored in Cloudwatch.

Difference in speed between two MySQL servers

I have two MySQL servers. One is the master of other (actually the master is also a slave of another server). Both are running on similar remote servers (same amount of RAM). Everything is working fine except the slave is taking 2-3 times more time than the master server to run a same large query. Can someone think of a reason for this problem.
If there are writes (coming from Master) to a, b, etc, then the SELECT will run slower on the Slave. How much slower depends on far too many things to predict.
To 'prove' that, do (in the Slave) STOP SLAVE SQL_THREAD;, run the query; START SLAVE SQL_THREAD;. That will turn off the writes for the SELECT.
But what about "fixing" it?
If the "writes" are INSERTs, can they be 'batched'? That is, INSERT multiple rows in the same statement. This may not 'solve' the problem, but it could mitigate it.

Large number of INSERT statements locking inserts on other tables in MySQL 5.5

I'm trying to understand an issue I am having with a MySQL 5.5 server.
This server hosts a number of databases. Each day at a certain time a process runs a series of inserts into TWO tables within this database. This process lasts from 5 to 15 minutes depending on the amount of rows being inserted.
This process runs perfectly. But it has a very unexpected side effect. All other inserts and update's running on tables unrelated to the two being inserted to just sit and wait until the process has stopped. Reads and writes outside of this database work just fine and SELECT statements too are fine.
So how is it possible for a single table to block the rest of a database but not the entire server (due to loading)?
A bit of background:-
Tables being inserted to are MyISAM with 10 - 20 million rows.
MySQL is Percona V5.5 and is serving one slave both running on
Debian.
No explicit locking is called for by the process inserting the
records.
None of the Insert statements do not select data from any other
table. They are also INSERT IGNORE statements.
ADDITIONAL INFO:
While this is happening there are no LOCK table entries in PROCESS LIST and the processor inserting the records causing this problem does NOT issue any table locks.
I've already investigated the usual causes of table locking and I think I've rules them out. This behaviour is either something to do with how MySQL works, a quirk of having large database files or possibly even something to do with the OS/File System.
After a few weeks of trying things I eventually found this: Yoshinori Matsunobu Blog - MyISAM and Disk IO Scheduler
Yoshinori demonstrates that changing the scheduler queue to 100000 (from the default 128) dramatically improves the throughput of MyISAM on most schedulers.
After making this change to my system there were no longer any dramatic instances of database hang on MyISAM tables while this process was running. There was slight slowdown as to be expected with the volume of data but the system remained stable.
Anyone experiencing performance issues with MyISAM should read Yoshinori's blog entry and consider this fix.