MySQL replication slave losing connection every 10 minutes

I have multiple servers set up with MySQL one-way replication for backup purposes. One of these slaves has a problem: exactly every 10 minutes it loses the connection and then reconnects without problems. Example from the error log:
121216 18:05:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000002' at position 782733912
121216 18:05:49 [ERROR] Slave I/O: error reconnecting to master 'repl@127.0.0.1:5002' - retry-time: 60 retries: 86400, Error_code: 2013
121216 18:06:49 [Note] Slave: connected to master 'repl@127.0.0.1:5002',replication resumed in log 'mysql-bin.000002' at position 782733912
121216 18:15:49 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
121216 18:15:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000002' at position 822218944
121216 18:15:49 [ERROR] Slave I/O: error reconnecting to master 'repl@127.0.0.1:5002' - retry-time: 60 retries: 86400, Error_code: 2013
121216 18:16:49 [Note] Slave: connected to master 'repl@127.0.0.1:5002',replication resumed in log 'mysql-bin.000002' at position 822218944
121216 18:25:49 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
121216 18:25:49 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000002' at position 850106111
121216 18:25:49 [ERROR] Slave I/O: error reconnecting to master 'repl@127.0.0.1:5002' - retry-time: 60 retries: 86400, Error_code: 2013
So, everything works, but the error log is flooded with messages.
I looked at various MySQL settings, but I don't see any set to 10 minutes or 600 seconds.
FWIW, replication runs through an SSH tunnel kept up by AutoSSH. I looked into sshd_config but did not see any timeout setting there either.
Which setting should I look into?

I have been looking at similar problems lately, and it turned out that our firewall blocks the autossh monitoring port, so autossh restarts ssh every 10 minutes. The same may be happening to you.
Check your autossh log. It is usually /var/log/syslog unless you specify AUTOSSH_LOGFILE.

As @interskh pointed out, the culprit may be ssh. My /var/log/syslog contained messages like the following:
Sep 15 16:34:57 servername autossh[2799]: timeout polling to accept read connection
Sep 15 16:34:57 servername autossh[2799]: port down, restarting ssh
Sep 15 16:34:57 servername autossh[2799]: starting ssh (count 136)
Sep 15 16:34:57 servername autossh[2799]: ssh child pid is 11664
I found a Debian bug report thread that suggested that contrary to many tutorials, it isn't necessary to include the -M parameter. Since version 1.4a-1, autossh will use a randomly selected "high" port by default (which is arguably better than manually specifying a monitoring port with -M).
Omitting the -M flag solved the problem for me.
Previous command (restarts the SSH connection every 10 minutes):
autossh -p2223 -M 20000 -f username@example.com -L 12345:127.0.0.1:3306 -N
New (working) command:
autossh -p2223 -f username@example.com -L 12345:127.0.0.1:3306 -N
In case it helps anyone, our SSH client is running Ubuntu and the SSH server is running CentOS.
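To check whether autossh restarts coincide with the replication drops, the restart message quoted above can be counted in the log. This is only a minimal sketch: the function name count_autossh_restarts is made up for illustration, and the default path /var/log/syslog applies only if AUTOSSH_LOGFILE is not set.

```shell
# Count autossh restart events in a syslog-style log file.
# Assumes the message text "port down, restarting ssh" as quoted above.
count_autossh_restarts() {
    logfile="${1:-/var/log/syslog}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'autossh\[[0-9]*]: port down, restarting ssh' "$logfile"
}
```

If the count goes up by one each time the slave logs Error_code 2013, the tunnel restart is the likely cause rather than any MySQL timeout.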

Related

Kubernetes MySQL Statefulset slave log sequence number is in future

I am using
https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
to run MySQL replication. The MySQL master and slave were running fine.
Last week I scaled the MySQL StatefulSet down to 1 replica to stop the MySQL slave. Now I want to start the slave again, but when I changed the StatefulSet replicas back to 2, I got
'Could not find first log file name in binary log index file'.
So I decided to do a fresh clone from the master: I deleted the PV on the slave and recreated the slave pod. It created a new PV with no data, and xtrabackup cloned the files from the MySQL master.
But on startup I get the error below (edited):
[ERROR] InnoDB: Page [page id: space=5868, page number=1181] log sequence number 4431188670431 is in the future! Current system log sequence number 4431180574947.
Even though the StatefulSet runs
"xtrabackup --prepare /var/lib/mysql"
the slave throws this error and fails to start. Any ideas?

MySQL-8.0.12 slave replication failed

I use MySQL 8.0.12 to set up a master-slave replication cluster, but the slave always gets the following errors. Does anyone know how to fix this?
2018-11-01T04:17:58.327576Z 19 [ERROR] [MY-010834] [Server] next log
error: -1 offset: 50 log: ./mysql-relay-bin.000002 included: 1,
2018-11-01T04:17:58.327675Z 19 [ERROR] [MY-010596] [Repl] Error
reading relay log event for channel '': Error purging processed logs,
2018-11-01T04:17:58.327932Z 19 [ERROR] [MY-013121] [Repl] Slave SQL
for channel '': Relay log read failure: Could not parse relay log
event entry. The possible reasons are: the master's binary log is
corrupted (you can check this by running 'mysqlbinlog' on the binary
log), the slave's relay log is corrupted (you can check this by
running 'mysqlbinlog' on the relay log), a network problem, or a bug
in the master's or slave's MySQL code. If you want to check the
master's binary log or slave's relay log, you will be able to know
their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code:
MY-013121,
2018-11-01T04:17:58.327982Z 19 [ERROR] [MY-010586] [Repl] Error
running query, slave SQL thread aborted. Fix the problem, and restart
the slave SQL thread with "SLAVE START". We stopped at log
'mysql-bin.000003' position 805
Check the disk space on the slave.
I faced the same issue once.
During replication, if the slave server's disk is full and no space is left, the MySQL replication thread waits for space to be freed; the wait time is 60 seconds. If the server is restarted during that time, the relay log cannot be recovered and the slave cannot read it.
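As a rough way to check the disk space mentioned above, the used percentage of the filesystem holding the relay logs can be read from df. This is a sketch under assumptions: disk_use_percent is a hypothetical helper name, and /var/lib/mysql is only an assumed data directory (relay logs are written to the data directory by default), so pass your own path if it differs.

```shell
# Print the percentage of used space on the filesystem holding a directory.
# The default /var/lib/mysql is an assumed data directory; pass your own.
disk_use_percent() {
    # df -P prints one POSIX-format line per filesystem; field 5 is "NN%".
    df -P "${1:-/var/lib/mysql}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

A value close to 100 on the slave would be consistent with the relay log damage described here.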

MySQL SSL replication error - Error_code: 2026

Master MySql version - 5.6.24-enterprise-commercial-advanced-log MySQL Enterprise Server
Slave MySql version - 5.7.15-enterprise-commercial-advanced-log MySQL Enterprise Server
SSL replication is enabled between the nodes:
Master SSL config - my.cnf
ssl-ca=/data/mysql_data/ca.pem
ssl-cert=/data/mysql_data/server-cert.pem
ssl-key=/data/mysql_data/server-key.pem
Slave - config
CHANGE MASTER TO MASTER_HOST='ip_address_host', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000013', MASTER_LOG_POS= 507,MASTER_USER='repl', MASTER_SSL=1 , MASTER_SSL_CA='/data/server-cert.pem' , MASTER_SSL_CIPHER=' DHE-RSA-AES256-SHA:AES128-SHA'
We see the following error in the slave logs:
2017-10-02T17:33:37.348979Z 1 [ERROR] Slave I/O for channel '': error connecting to master 'repl@10.10.*.*:3306' - retry-time: 60 retries: 1, Error_code: 2026
When we downgraded the slave to the same version as the master (5.6), it worked.
I had a similar issue but with identical MySQL versions. It happened after an upgrade to 5.7.41.
All self-made certificates were disabled in my.cnf (by the upgrade).
A normal connection with mysql -u my-replication-user -p -h my-awesome-server --ssl worked as expected, but the replication slave still could not connect.
I pinpointed it to our old certificates still being in use: they were still referenced in the master configuration. I solved it by clearing the originally configured file names of the SSL certificates:
CHANGE MASTER TO MASTER_SSL_CA = '', MASTER_SSL_CERT = '', MASTER_SSL_KEY = '';

MySQL replication slave goes down

I have a problem with MySQL replication.
I configured two virtual hosts.
Server 1: Apache + mysql Ver 15.1 Distrib 5.5.41-MariaDB
Master, and slave of Server 2
Server 2: mysql Ver 14.14 Distrib 5.5.42
Master, and slave of Server 1
Topology: master + master
When I restart the slaves, everything works well, with low latency and fast updates. But after a few minutes the replication stops working: if I update, insert, or delete a row, the slave does not pick up the change.
The logs do not show any error, but the master log position differs between master and slave.
If I restart the slaves, everything works again: the database is updated and replication works well.
I don't know what is happening; it seems the threads are sleeping or dead.
Thanks for any ideas on how to fix the problem.
In both cases the processes seem OK.
SERVER1
Kill 168 system user None Connect 1146 Waiting for master to send event ---
Kill 169 system user None Connect 945 Slave has read all relay log; waiting for the slave I/O thread to update it ---
Kill 170 master XXXXXXX:59273 None Binlog Dump 1145 Master has sent all binlog to slave; waiting for binlog to be updated ---
SERVER2
Kill 73 root XXXXXX:55089 None Binlog Dump 1137 Master has sent all binlog to slave; waiting for binlog to be updated ---
Kill 76 system user None Connect 1137 Waiting for master to send event ---
Kill 77 system user None Connect 985 Slave has read all relay log; waiting for the slave I/O thread to update it ---
The problem is latency.
My solution: create a cron job that stops and starts the slave every minute.
Now everything works.
Cristian
SHOW SLAVE STATUS;
on each server. That is likely to tell you what is wrong.
You do understand the potential problems with AUTO_INCREMENT and UNIQUE keys when you are writing to both heads of a dual-Master topology?
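For the SHOW SLAVE STATUS check suggested above, the output is long, so it can help to keep only the fields that show thread health and the last error. This is a sketch, not a definitive diagnostic: slave_status_summary is a hypothetical helper name, and it only filters text on standard input.

```shell
# Keep only the health-related fields from `SHOW SLAVE STATUS\G` output
# read on standard input.
slave_status_summary() {
    grep -E '^ *(Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master|Last_IO_Error|Last_SQL_Error):'
}
```

On each server it could be fed with: mysql -e 'SHOW SLAVE STATUS\G' | slave_status_summary (credentials and host options omitted here). A "No" for either running flag, or a non-empty error field, points to where replication stopped.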

MySQL master/slave replication set up but not working

I'm having trouble setting up MySQL replication between a master and a slave.
The setup completed successfully, but the data doesn't update.
Master : show master status;
[File]: mysql-bin.000033
[Position]: 1757196
[Binlog_Do_DB]: ciel
Master : show processlist;
[User]: slave
[Host]: 92.222.177.xxx:57578 ( right slave ip )
[db]:
[Command]: Binlog Dump
[Time]: 1231
[State]: Has sent all binlog to slave; waiting for binlog to be updated
Slave : show slave status;
[Slave_IO_State]: Waiting for master to send event
[Master_Host]: 46.105.122.xxx
[Master_User]: slave
[Master_Port]: 3306
[Connect_Retry]: 60
[Master_Log_File]: mysql-bin.000033
[Read_Master_Log_Pos]: 1757196
[Relay_Log_File]: mysqld-relay-bin.000006
[Relay_Log_Pos]: 252
[Relay_Master_Log_File]: mysql-bin.000033
[Slave_IO_Running]: Yes
[Slave_SQL_Running]: Yes
[Replicate_Do_DB]: ciel
[Exec_Master_Log_Pos]: 1757196
[Relay_Log_Space]: 409
[Until_Condition]: None
[Master_SSL_Allowed]: No
[Master_SSL_Verify_Server_Cert]: No
[Master_Server_Id]: 1
Slave : show processlist;
[User]: system user
[Host]:
[db]:
[Command]: Connect
[Time]: 1231
[State]: Waiting for master to send event
[Info]:
[Id]: 2
[User]: system user
[Host]:
[db]:
[Command]: Connect
[Time]: 1231
[State]: Slave has read all relay log; waiting for the slave I/O thread to update it
Then, selecting the same data on master and slave:
master: lastmod: 2014-10-26 17:14:55
slave: lastmod: 2014-10-26 15:45:45
I'm feeling lost, because after 8 hours I still haven't found how to set this up correctly.