I'm doing a migration from MySQL to MariaDB where replication is involved. Everything is working fine, and compatibility of a MySQL master (5.5.59) with a MariaDB slave (10.1.26) is good.
The problem occurs when I enable replication from a MariaDB master to a MariaDB slave (same version: 10.1.26). In some situations, identified as massive updates, the slave starts to lag.
If I restore the master to MySQL (5.5.59) and replicate to the same MariaDB slave, the lag never occurs on the same set of updates.
I checked the relay logs on the lagging MariaDB slave, comparing the ones received when MySQL is the master with the ones received when MariaDB is the master; the only difference is that when the master is MariaDB I can see statements related to GTID.
I would like to stop the GTID statements from appearing in the relay log when the master is MariaDB and make replication similar to the "old style" MySQL replication without GTID, but I haven't found out whether that is possible.
The replication lag was due to the engine of the mysql.gtid_slave_pos table on the slave server: by default this table is InnoDB, while the tables receiving the replicated updates are not InnoDB.
As explained in the link below, every transaction executed by the slave also causes an update to mysql.gtid_slave_pos; if the engines of the tables differ, that can cause bad performance (in my case the server was lagging 4000+ seconds, and after changing the engine of mysql.gtid_slave_pos, replication is now immediate).
https://mariadb.com/kb/en/library/mysqlgtid_slave_pos-table/
From MariaDB 10.3.1, a new parameter has been introduced to help with this problem: gtid_pos_auto_engines. This parameter makes MariaDB create a separate mysql.gtid_slave_pos table for each engine involved in the replication. Unfortunately it does not seem possible to accomplish that with previous versions of MariaDB; the mysql.gtid_slave_pos table must be unique, and the choice of its engine is up to the DBA, based on the tables/queries involved in the replication.
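As a minimal sketch, assuming the replicated tables use MyISAM (substitute whichever engine your tables actually use), the engine of the position table can be changed on the slave while replication is stopped:
STOP SLAVE;
ALTER TABLE mysql.gtid_slave_pos ENGINE=MyISAM;
START SLAVE;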
I have been facing a major issue with my MySQL server for 2 days. My slave server is 70000 seconds behind the master and that number has not come down in 2 days. At night it suddenly increases, but then it goes back to growing slowly. Is there any way to synchronize master/slave replication FAST? What could the problem be? The slave's IO and SQL threads are both running (Yes). Please help me out if there is any way.
Is it repeatedly bouncing between 70000 and about 0? If so, that is a mystery that I have seen on and off for more than a decade. Ignore it, it will go away.
If Seconds_behind_master is rising at the rate of 1 second per second, then look at what the Slave is doing: SHOW PROCESSLIST; You will probably find something like an ALTER that has been running a long time, tying up replication.
If Seconds_behind_master is getting big, but not going down much, then there are several possible answers.
Is the Slave a "weaker" machine than the Master? Keep in mind that Replication is (depending on the version) only single-threaded. Multiple writes can happen on the Master simultaneously, but then have to be done one at a time on the Slave.
Is the Slave running a big query that is locking what the replication thread would like to get to? Look at the Slave's PROCESSLIST.
Which Engine are you using? VM? Cloud hosted? Performing backups at night?
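A minimal set of statements for these checks, run on the Slave (nothing beyond what is described above):
-- Watch Seconds_Behind_Master, Slave_IO_Running and Slave_SQL_Running
SHOW SLAVE STATUS\G
-- Look for a long-running ALTER or other big query tying up the replication thread
SHOW PROCESSLIST;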
I have a very strange issue. I've set up master/slave replication with the slave being a Percona Cluster node.
Everything seems to be running correctly; however, no data appears in the slave databases and the data files themselves are not growing on the slave.
Oddly though, I can see the file size of the binlogs growing quite a lot on the slave (nothing else runs on this server at the moment).
My question is this: during master/slave replication, does InnoDB/XtraDB cache a certain amount of data in the slave's binlogs before flushing it to the actual database?
If so, can I configure this "flushing"?
Many Thanks
Binlog files are not directly used by Galera for replication, but the binlog subsystem is (for its own replication protocol). Make sure that you have activated log-slave-updates on the slave. Additionally, although it should work with the default STATEMENT format, due to some problems found in the past (auto-increment values working differently in Galera), I would recommend doing replication in ROW format.
If this doesn't work, we can try something else (are the binary logs increasing, or are the relay logs increasing? log-slave-updates is not enabled by default; is the SQL thread stopped? are you trying to replicate non-InnoDB tables?)
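As a rough sketch, enabling log-slave-updates and switching to ROW format would look like this in the slave's my.cnf (the file location and exact option spelling may differ per installation):
[mysqld]
log_slave_updates = 1
binlog_format     = ROW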
We use Percona MySQL on EC2 and have a master/slave setup for HA. What we observe is that replication to the slave always falls behind by hours or even days, as we continuously write data into the master, which is the nature of our application.
What could be the problem here?
First, think about how MySQL Replication is organized
Major Components for MySQL Replication
Slave IO Thread: It's responsible for maintaining a constant connection to the Master. It receives binlog events from the Master's binlogs and collects them FIFO style in the Slave's relay logs.
Slave SQL Thread: It's responsible for reading binlog events stored in the relay logs. Each event (DML, DDL, etc.) is executed on this internal thread.
Seconds_Behind_Master: Each binlog event carries the timestamp of the event (DML, DDL, etc.). Seconds_Behind_Master is simply the NOW() of the Slave server minus the timestamp of the event. Seconds_Behind_Master is displayed in SHOW SLAVE STATUS\G.
What is the Problem?
If Seconds_Behind_Master is ever-increasing, consider the following: the single-threaded execution path for binlog events in MySQL Replication is nothing more than the serialization of all the SQL commands that were executed on the Master in parallel. Picture this: if 10 UPDATE commands were executed on the Master in parallel and take 1 second each to process, they get placed in the relay logs and executed FIFO style. All the UPDATEs have the same timestamp. Subtracting the timestamp from each UPDATE processed on the Slave yields a 1-second increase in Seconds_Behind_Master. Multiply by 10, and you get 10 additional seconds of Replication Lag. This reveals the SQL Thread as a bottleneck.
Suggestions
Master and Slave may be underspec'd. Perhaps add more memory and/or cores so that the Slave can process binlog events faster. (Scaling up; slightly linear improvements at best.)
Try configuring InnoDB to use more cores (see the config sketch after this list)
Switch to MySQL 5.6 and implement parallel slave threads (if you have multiple databases)
Wait for Percona 5.6, then upgrade and implement parallel slave threads (if you have multiple databases)
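A hedged sketch of those suggestions in the slave's my.cnf; the values are illustrative assumptions rather than tuned recommendations, and slave_parallel_workers only exists from MySQL 5.6 onward (where it parallelizes per database):
[mysqld]
# Let InnoDB use more I/O threads on a multi-core box (the defaults are 4 each)
innodb_read_io_threads  = 8
innodb_write_io_threads = 8
# MySQL 5.6+ only: one applier thread per schema, useful when writes span multiple databases
slave_parallel_workers  = 4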
I have master/slave replication on my MySQL DB.
My slave DB was down for a few hours and is back up again (the master was up all the time). When issuing SHOW SLAVE STATUS I can see that the slave is X seconds behind the master.
The problem is that the slave doesn't seem to catch up with the master; the X seconds behind master doesn't seem to drop...
Any ideas on how I can help the slave catch up?
Here is an idea
In order to know whether MySQL is fully processing the SQL from the relay logs, try the following:
STOP SLAVE IO_THREAD;
This will stop replication from downloading new entries from the master into its relay logs.
The other thread, known as the SQL thread, will continue processing the SQL statements it downloaded from the master.
When you run SHOW SLAVE STATUS\G, keep your eye on Exec_Master_Log_Pos. Run SHOW SLAVE STATUS\G again. If Exec_Master_Log_Pos does not move after a minute, you can go ahead and run START SLAVE IO_THREAD;. This may reduce the number of Seconds_Behind_Master.
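Putting those steps together, the whole check is just a few statements plus a short wait (this is only a restatement of the procedure above):
-- Stop pulling new events from the master; the SQL thread keeps draining the relay logs
STOP SLAVE IO_THREAD;
-- Note Exec_Master_Log_Pos, wait about a minute, then check it again
SHOW SLAVE STATUS\G
SHOW SLAVE STATUS\G
-- If the position has stopped moving, resume downloading from the master
START SLAVE IO_THREAD;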
Other than that, there is really nothing you can do except to:
Trust Replication
Monitor Seconds_Behind_Master
Monitor Exec_Master_Log_Pos
Run SHOW PROCESSLIST; and take note of the SQL thread to see if it is processing long-running queries.
BTW, keep in mind that when you run SHOW PROCESSLIST; with replication running, there should be two DB connections whose user name is system user. One of those DB connections will have the current SQL statement being processed by replication. As long as a different SQL statement is visible each time you run SHOW PROCESSLIST;, you can trust that MySQL is still replicating properly.
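If the full process list is noisy, a shortcut (my own illustration, not part of the original answer) is to filter it for the replication threads directly:
SELECT id, user, time, state, info
FROM information_schema.processlist
WHERE user = 'system user';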
What binary log format are you using? Are you using ROW or STATEMENT?
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
If you are using ROW as the binlog format, make sure that all your tables have a primary or unique key:
SELECT t.table_schema, t.table_name, t.engine
FROM information_schema.tables t
INNER JOIN information_schema.columns c
        ON t.table_schema = c.table_schema
       AND t.table_name   = c.table_name
WHERE t.table_schema NOT IN ('performance_schema','information_schema','mysql')
GROUP BY t.table_schema, t.table_name, t.engine
HAVING SUM(IF(c.column_key IN ('PRI','UNI'), 1, 0)) = 0;
If you execute, e.g., one DELETE statement on the master to delete 1 million records in a table without a PK or unique key, then only one full table scan will take place on the master's side, which is not the case on the slave.
When the ROW binlog_format is being used, MySQL writes the row changes to the binary logs (not the statement, as with the STATEMENT binlog_format), and those changes will be applied on the slave's side row by row, which means 1 million full table scans will take place on the slave to reflect only one DELETE statement on the master, and that is what causes the slave lagging problem.
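One common remedy is to give such tables a primary key so that row-based events can locate rows without a full scan; the table and column names below are placeholders:
ALTER TABLE mydb.mytable
    ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;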
"seconds behind" isn't a very good tool to find out how much behind the master you really is. What it says is "the query I just executed was executed X seconds ago on the master". That doesn't mean that you will catch up and be right behind the master the next second.
If your slave is normally not lagging behind and the work load on the master is roughly constant you will catch up, but it might take some time, it might even take "forever" if the slave is normally just barely keeping up with the master. Slaves operate on one single thread so it is by design much slower than the master, also if there are some queries that take a while on the master they will block replication while running on the slave.
Just check that you have the same time and time zone on both servers, i.e., the master as well as the slave.
If you are using InnoDB tables, check that you have innodb_flush_log_at_trx_commit set to a value different from 0 on the slave.
http://dev.mysql.com/doc/refman/4.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit
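To check the current value on the slave, and change it dynamically if you decide to (the value 2 here is just one possible choice, not a recommendation from this answer):
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SET GLOBAL innodb_flush_log_at_trx_commit = 2;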
We had exactly the same issue after setting up our slave from a recent backup.
We had changed the configuration of our slave to be more crash-safe:
sync_binlog = 1
sync_master_info = 1
relay_log_info_repository = TABLE
relay_log_recovery = 1
I think that especially sync_binlog = 1 causes the problem, as this slave's specs are not as fast as the master's. This config option forces the slave to store every transaction in the binary log before it is executed (instead of the default of every 10k transactions).
After setting these config options back to their default values, I see that the slave is catching up again.
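For reference, reverting would look roughly like this on a MySQL 5.6-era server (the exact defaults vary between versions, so check the manual for yours, and keep in mind this trades back the crash-safety those settings provided):
sync_binlog = 0
sync_master_info = 10000
relay_log_info_repository = FILE
relay_log_recovery = 0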
Just to add the findings from my similar case.
There were a few bulk temporary-table inserts/updates/deletes happening on the master which occupied most of the space in the relay log on the slave. And in MySQL 5.5, replication being single-threaded, the CPU was always at 100% and it took a lot of time to process these records.
All I did was add these lines to the MySQL cnf file:
replicate-ignore-table=<dbname>.<temptablename1>
replicate-ignore-table=<dbname>.<temptablename2>
and everything became smooth again.
In order to figure out which tables are taking up the most space in the relay log, try the following commands and then open the output in a text editor. You may get some hints:
cd /var/lib/mysql
mysqlbinlog relay-bin.000010 > /root/RelayQueries.txt
less /root/RelayQueries.txt
If you have multiple schemas, consider using multi-threaded slave replication. This is a relatively new feature.
This can be done dynamically without stopping the server. Just stop the slave SQL thread, set the variable, and start the thread again. (The example below uses MariaDB's slave_parallel_threads; on stock MySQL 5.6+ the equivalent variable is slave_parallel_workers.)
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_threads = 4;
START SLAVE SQL_THREAD;
I had an issue similar to this, and both of my MySQL servers (master and replica) are hosted on AWS EC2. Increasing the EBS disk size (which automatically increased IOPS) for the MySQL slave server turned out to be the solution for me: R/W throughput and bandwidth increased, and R/W latency decreased.
Now my MySQL database replication is catching up with the master, and Seconds_Behind_Master has decreased (it had been increasing from day to day).
So if you have MySQL hosted on EC2, I suggest you try increasing the EBS disk size or its IOPS on the slave.
I know it's been a while since the OP asked, but it would have helped me to read the following answer.
In /etc/mysql/mysql.cnf:
[mysqld]
disable_log_bin
innodb_flush_log_at_trx_commit=2
innodb_doublewrite = 0
sync_binlog=0
disable_log_bin really did the trick for me.