We have one master and two VIP slave database servers. We changed the data type of a column from VARCHAR(255) to TEXT on the master.
The application is configured to use the master only for write operations and the slaves for read operations.
After changing the data type on the master server using an ALTER TABLE statement, the slave servers became unresponsive.
We are using MariaDB 10.0.
[PROCESSES INFORMATION]
Id      User  Host    Db  Command      Time (sec)  State                                                                   Info
--------------------------------------------------------------------------------------------------------------------------------
203739  repl  slave1      Binlog Dump  75,143,121  Master has sent all binlog to slave; waiting for binlog to be updated
203740  repl  slave2      Binlog Dump  75,143,121  Master has sent all binlog to slave; waiting for binlog to be updated
The slave instances became very slow due to slow queries.
Number of sessions: 1590
thread_pool_max_threads=500
Current value = 648
After performing the ALTER TABLE on the master, it replicated to the slaves, and at the same time the number of sessions on the slave servers increased rapidly.
I think the slaves became unresponsive because of the slow queries.
But I don't know why these queries became so slow and the slaves became unresponsive.
The DBAs say that after executing an ANALYZE TABLE command, the issue was resolved.
But I don't understand why this helped, because ANALYZE TABLE only updates table statistics.
It would be helpful if anyone could comment on why this happened.
How can we avoid such issues in the future?
There is one minor case where TEXT is slower than VARCHAR. When a SELECT needs to build a temporary table (often for sorting due to GROUP BY or ORDER BY), it first tries to build a MEMORY table. But TEXT and BLOB columns prevent it from using MEMORY, so it falls back to an on-disk MyISAM temporary table instead. This is slower (but gets the job done).
I say this is a "minor case" because users rarely identify it with phrases like "very slow" and "becomes unresponsive". I would guess that a SELECT might run twice as slow.
Also, the ANALYZE TABLE discussion does not hold water. Again it may be coincidence, not causation.
So, the change to TEXT may be a 'red herring'. Instead, let's discover what is being slow by using the slowlog. See this for what I like to work from.
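As a rough illustration of that approach (not part of the original answer; the one-second threshold is just an example), you can turn on the slow query log at runtime and also watch the temporary-table counters to see whether queries are spilling to disk:

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;  -- log statements slower than 1 second (example value)
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';  -- a fast-growing Created_tmp_disk_tables points at on-disk temp tables

Then examine the slow log entries that accumulate while the slave is struggling.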
Related
Is there a way to see when the last byte of data was copied over from the master to the slave? Currently, to check how 'current' the data is, I'm doing something pretty crude such as:
select max(last_updated) from one_of_my_tables
But it doesn't work too well. Is there a more formal way to do this?
From a privileged MySQL account (e.g. root) use:
mysql> show slave status;
The field:
Seconds_Behind_Master: 0
tells you how far out of date the slave is.
Unfortunately, there's no way to get the information from SHOW SLAVE STATUS without using that statement. I searched for this recently and learned that there are replication info tables in the PERFORMANCE_SCHEMA, however none of them contain the Seconds_Behind_Master.
Strictly speaking, the Seconds_Behind_Master doesn't tell you what you asked, anyway. You asked "when the last byte of data was copied over from the master to the replica?" Seconds_Behind_Master tells you the difference between the system time on the replica vs. the timestamp of the last executed event from the relay log.
Or if the replica has executed all downloaded events, it reports 0.
But suppose the replica has lost contact with its master, and there are more logs sitting around on the master waiting to be downloaded? The replica doesn't report this, because it has no idea there are more logs.
A more accurate way of measuring replication lag is to use the pt-heartbeat script, which is included in the free Percona Toolkit.
You execute the script on the master, and it inserts a timestamp to a table once per second, like a heartbeat.
Then on the replica you can query the timestamp and compare it to the system time.
If the replica is caught up, the difference will be zero.
If the replica has downloaded all the logs, but executing events is lagging, the timestamp difference will show this.
If the replica has lost contact with its master and hasn't downloaded all the logs, but we know the heartbeat timestamp should update once per second, then you can still get an accurate measure of the replication lag.
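For illustration only (the schema, table, and column names below assume pt-heartbeat's conventional percona.heartbeat table with a ts column written in UTC; adjust to how you run the tool), the comparison on the replica looks roughly like:

SELECT TIMESTAMPDIFF(SECOND, MAX(ts), UTC_TIMESTAMP()) AS lag_seconds
FROM percona.heartbeat;  -- placeholder database/table name

pt-heartbeat itself can also do this comparison for you with its --check and --monitor modes.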
We have a MySQL master with a replica (5.7 with row-based replication).
At peak the master performs about 3000 inserts per second, and the replica normally keeps up just fine. However, sometimes we execute long-running SELECT queries (10 to 20 seconds), and during those queries the replication lag becomes very large.
What I do not understand is how ordinary MySQL threads that execute SELECTs (without locking any tables) can cause the replication thread to slow down (i.e. it performs about 2.5K inserts per second instead of the master's 3K). What exactly would I need to tune?
I checked the slave status and it's not the IO thread; that one reads events from the master just fine. It's the slave SQL thread that somehow does not manage to catch up. The isolation level is READ COMMITTED, so the SELECT queries could potentially lock some records and make the slave thread wait, but I'm not sure about that.
UPDATE: I checked again; it turns out that even a single heavy query on the slave (one that scans an entire table, for example) produces the lag. It seems like the slave SQL thread is blocked, but I do not understand why.
UPDATE 2: I finally found the solution. First I increased slave_parallel_workers to 4 and set slave_parallel_type to LOGICAL_CLOCK. However, and this is important, that alone gave no improvement at all, since the transactions were dependent. But after I increased binlog_group_commit_sync_delay on the master to 10000 (that is, 10 milliseconds), the lag disappeared.
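For reference, a sketch of the settings described in those updates (the values are the ones quoted above, not general recommendations; the SQL thread has to be stopped while changing the parallel replication settings):

-- On the replica
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;

-- On the master
SET GLOBAL binlog_group_commit_sync_delay = 10000;  -- microseconds, i.e. 10 ms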
There might be many reasons for replication lag on a MySQL slave.
But as you mentioned:
It's the slave SQL thread that somehow does not manage to catch up.
Assuming that IO works fine, Percona says (emphasis mine):
[...] when the slave SQL_THREAD is the source of replication delays it is probably because of queries coming from the replication stream are taking too long to execute on the slave. This is sometimes because of different hardware between master/slave, different schema indexes, workload. Moreover, the slave OLTP workload sometimes causes replication delays because of locking. For instance, if a long-running read against a MyISAM table blocks the SQL thread, or any transaction against an InnoDB table creates an IX lock and blocks DDL in the SQL thread. Also, take into account that slave is single threaded prior to MySQL 5.6, which would be another reason for delays on the slave SQL_THREAD.
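If you suspect the SQL thread is being blocked by locks, one way to look (a sketch, not part of the quoted material) is to inspect the replication threads and any InnoDB lock waits on the slave:

-- Replication threads show up as user 'system user' in the process list
SELECT id, command, time, state, info
FROM information_schema.PROCESSLIST
WHERE user = 'system user';

-- On MySQL 5.7, pending InnoDB lock waits can be listed with
SELECT * FROM information_schema.INNODB_LOCK_WAITS;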
I have an account on Amazon. My primary MySQL is a large instance, and I have two other small instances as replication servers. The hierarchy is primary -> slave 1 -> slave 2. The problem is that sometimes slave 1 and slave 2 show high CPU utilization, and we couldn't find out the exact reason. Slave 1 acts as a slave of the primary and at the same time acts as the master of slave 2. We have searched a lot but we are still stuck.
Thanks in advance for any help.
As a general rule, when using MySQL native, asynchronous replication -- which is what RDS uses -- the replica servers need to be as large as the master, or sometimes larger. The replicas receive "replication events" from the master, which may contain the actual queries that modified data -- insert, update, delete, but not select -- this is "statement-based replication" -- or may contain binary images of the rows added, removed, or changed on the master -- which is "row-based replication." By default, the master server decides, on a query-by-query basis, which format to use to send the replication events ("mixed").
In all cases SELECT statements are not sent to the replicas (except of course for INSERT ... SELECT), but the replicas do need sufficient capacity to handle both the incoming changes from the master and the SELECT queries that your application runs directly against them.
In RDS for MySQL 5.6 and above, you can set binlog_format to ROW to force the master to always use row-based replication. This might improve your performance, and it might not -- it depends on the workload. You cannot force RDS to use only "statement" mode, nor should you want to.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MySQL.html
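For illustration, checking the current format looks like this; on a self-managed master you could switch it with SET GLOBAL, whereas on RDS the change is made through the DB parameter group:

SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SET GLOBAL binlog_format = 'ROW';  -- self-managed MySQL only; on RDS use the parameter group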
In general, however, any time the replica is a machine with fewer resources than the master, the chance of replication lags increases. The lag of the replicas can be monitored in Cloudwatch.
I have a MySQL master-slave(s) replication setup with MyISAM tables. All updates are done on the master and selects are done on either the master or the slaves.
It appears that we might need to manually lock a few tables when we do certain updates. While this write lock is on the tables, no selects can happen on the locked table. But what about on the slaves? Does the lock propagate out?
Say I have table_A and table_B. I initiate a lock on table_A and table_B on the master and start performing the update. At this time no other connection can read table_A or table_B on the master, right? But what if, at this time, another connection tries to read the tables from a slave; can it do so?
Everything that MySQL replicates can be found in the binary logs.
You can run the following command to see the details.
show global variables like 'log_bin%';
log_bin_basename will tell you the path to your binary logs with base file name.
and run
show binary logs
to find the binary files that are currently present on your server.
You can check the actual statements that are written to the file by using the mysqlbinlog command together with the file name, or by running SHOW BINLOG EVENTS ... from the MySQL CLI.
Also, check which binlog_format you are using.
Basically, the table locks are not directly propagated to the slaves; but when the slaves execute the replicated updates, they will themselves lock the updated tables if needed.
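A quick sketch of the inspection steps above (the binlog file name is only a placeholder; substitute one reported by SHOW BINARY LOGS):

SHOW GLOBAL VARIABLES LIKE 'log_bin%';
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'mysql-bin.000042' LIMIT 20;  -- placeholder file name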
As far as I know, write locks do not propagate into the binlog; you can verify that by doing a quick test and looking at the binlog. If you want to avoid issues on the master as well and for some reason cannot migrate to InnoDB, consider integrating something like GET_LOCK() into your application instead of completely locking a table. MyISAM is quite iffy when it comes to concurrency.
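A minimal sketch of the GET_LOCK() approach (the lock name and timeout are arbitrary examples):

SELECT GET_LOCK('myapp_table_a_maintenance', 10);  -- wait up to 10 seconds for the named lock
-- ... perform the updates while holding the lock ...
SELECT RELEASE_LOCK('myapp_table_a_maintenance');  -- release it when done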
I have 2 servers set up using MySQL. It's using a standard replication setup, with one slave, no circular replication.
Is there a way to programmatically tell how far behind the slave is in reading the data from the binary log?
If I run the statement:
SHOW MASTER STATUS;
On the master, and run
SHOW SLAVE STATUS;
on the slave, I can compare the Position column from master status, and the Read_Master_Log_Pos column from slave status to determine how far behind the slave is.
However, this only works if the slave is reading from the same file the master is writing to. So if the slave is still reading a previous log file, because it is running behind, I can't figure out how to determine how much data is left until it catches up to the current position that the master is at. A solution using only SQL would be optimal, but I'm open to other solutions. Hopefully not one that requires reading the directory containing the log files.
I like to use the Seconds_Behind_Master field from SHOW SLAVE STATUS to determine whether the master and slave are caught up. As a secondary check, I also run a COUNT(*) query on a specific table (i.e. one that gets updated frequently) on both servers and then compare the record counts.
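As a rough sketch of both checks (run the first statement on the master and the rest on the slave; the table name is only a placeholder):

SHOW MASTER STATUS;  -- on the master: note File and Position
SHOW SLAVE STATUS;   -- on the slave: compare Master_Log_File/Read_Master_Log_Pos (IO thread) and Relay_Master_Log_File/Exec_Master_Log_Pos (SQL thread) with the master's values, and check Seconds_Behind_Master
SELECT COUNT(*) FROM frequently_updated_table;  -- placeholder table; run on both servers and compare the counts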