Using Percona MySQL 5.6 with slave_parallel_workers=5 on Debian 8. Sometimes GTID replication breaks and I don't know why. I thought that GTIDs were executed in consecutive order, but the slave status shows gaps in Executed_Gtid_Set:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: d22.local
Master_User: xyz
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.039232
Read_Master_Log_Pos: 219044
Relay_Log_File: mysqld-relay-bin.072392
Relay_Log_Pos: 90640
Relay_Master_Log_File: mysql-bin.036196
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: xyz_etl
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table xyz.sessions; Can't find record in 'sessions', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.036196, end_log_pos 78709552
Skip_Counter: 0
Exec_Master_Log_Pos: 78708927
Relay_Log_Space: 1337994488
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table xyz.sessions; Can't find record in 'sessions', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.036196, end_log_pos 78709552
Replicate_Ignore_Server_Ids:
Master_Server_Id: 22
Master_UUID: 0e7b97a8-a689-11e5-8b79-901b0e8b0f53
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 161219 20:32:20
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 0e7b97a8-a689-11e5-8b79-901b0e8b0f53:60397-45157441
Executed_Gtid_Set: 0e7b97a8-a689-11e5-8b79-901b0e8b0f53:1-42679868:42679870-42679876:42679878-42679879:42679881-42679890:42679892-42679908:42679910:42679913:42679916-42679917:42679919-42679927:42679929-42679932:42679934:42679936:42679938-42679939:42679944:42679946-42679950:42679952-42679955:42679957-42679964:42679966:42679969-42679970:42679972:42679974-42679977:42679979-42679980:42679984-42679986:42679988-42679990:42679994-42679996:42679998:42680000-42680001:42680003-42680006:42680009-42680011:42680013-42680018:42680021:42680024:42680026:42680030:42680032:42680035:42680038,
aea3618e-bacf-11e6-9506-b8ca3a67f830:1-10937274
Auto_Position: 1
1 row in set (0.00 sec)
I'm a bit confused. slave_parallel_workers is set to 0 now, but the error reported above is for GTID 42679909 rather than for the transaction right after 42679868, as I expected. What is the reason for this, and what are the correct steps to repair a broken replication like the one above?
What I don't understand is that the transaction with GTID 42679869 could, in theory, be executed without problems, yet STOP SLAVE; START SLAVE; does not process it.
To answer my own question and help others, here are the steps I took:
Set slave_parallel_workers=0.
Pay attention to the Executed_Gtid_Set field only, and handle the gaps in the GTID list one after another with STOP SLAVE; SET GTID_NEXT="[...]"; BEGIN; COMMIT; SET GTID_NEXT="AUTOMATIC"; START SLAVE;
Once replication continues on its own without errors, set slave_parallel_workers back to its previous value.
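Listing the gaps by hand is error-prone with a set as long as the one above. Here is a minimal Python sketch that extracts the missing transaction numbers for one server UUID and prints the empty-transaction statements from the steps above. The function names are made up for this illustration, and the input is a shortened copy of the Executed_Gtid_Set shown earlier.

```python
def parse_intervals(gtid_set, uuid):
    """Return sorted (start, end) intervals for `uuid` from a GTID set string."""
    for part in gtid_set.replace("\n", "").split(","):
        fields = part.strip().split(":")
        if fields[0] == uuid:
            intervals = []
            for item in fields[1:]:
                # An item is either a single number "N" or a range "N-M".
                start, _, end = item.partition("-")
                intervals.append((int(start), int(end or start)))
            return sorted(intervals)
    return []


def find_gaps(intervals):
    """Return the transaction numbers missing between consecutive intervals."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
        gaps.extend(range(prev_end + 1, next_start))
    return gaps


def empty_transaction_sql(uuid, seqno):
    """SQL that commits an empty transaction under the given GTID."""
    return (
        f'SET GTID_NEXT="{uuid}:{seqno}"; BEGIN; COMMIT; '
        'SET GTID_NEXT="AUTOMATIC";'
    )


# Shortened copy of the Executed_Gtid_Set from the status output above.
executed = (
    "0e7b97a8-a689-11e5-8b79-901b0e8b0f53:1-42679868:42679870-42679876:"
    "42679878-42679879,\n"
    "aea3618e-bacf-11e6-9506-b8ca3a67f830:1-10937274"
)
master_uuid = "0e7b97a8-a689-11e5-8b79-901b0e8b0f53"

gaps = find_gaps(parse_intervals(executed, master_uuid))
print(gaps)  # [42679869, 42679877]
for seqno in gaps:
    print(empty_transaction_sql(master_uuid, seqno))
```

Keep in mind that committing an empty transaction marks the GTID as executed without applying the master's change, so each skipped transaction may leave the slave's data different from the master's.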
I have two MySQL servers in a master/slave configuration, and replication is failing. The master crashed and a new entry was created in mysql-bin.index. I deleted that entry because the file it referred to did not exist on the file system. The master then restarted successfully.
Now I get the following error on the slave:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.64.253.99
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.001050
Read_Master_Log_Pos: 54868051
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.001050
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 54868051
Relay_Log_Space: 107
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
If I execute "show master status" to view the MySQL binary log file and position:
mysql> show master status \G
*************************** 1. row ***************************
File: mysql-bin.001050
Position: 55586895
Binlog_Do_DB: aaa
Binlog_Ignore_DB: xxx,yyy,zzz,mysql
Then I applied the new configuration on the slave:
STOP SLAVE;
CHANGE MASTER TO
MASTER_HOST='10.64.253.99',
MASTER_USER='slaveUser',
MASTER_PASSWORD='12345',
MASTER_LOG_FILE='mysql-bin.001050',
MASTER_LOG_POS=55586895;
START SLAVE;
And when I check the slave status again, I get the same error:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.64.253.99
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.001050
Read_Master_Log_Pos: 55586895
Relay_Log_File: mysqld-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.001050
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 55586895
Relay_Log_Space: 107
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
I have checked on the master that mysql-bin.001050 exists and is not empty.
This is a production environment, so new data is inserted every minute. The position value can change within minutes or seconds. I don't know if this is a problem.
The max_allowed_packet variable has the same value on both MySQL servers (16M).
Why can't the slave find the binary log file?
You can try this:
Slave: stop slave;
Master: flush logs;
Master: show master status; (take note of the master log file and master log position)
Slave: CHANGE MASTER TO MASTER_LOG_FILE='log-bin.00000X', MASTER_LOG_POS=106; (substitute the file and position noted in the previous step)
Slave: start slave;
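Because the question mentions a hand-edited mysql-bin.index, it may also be worth checking that the index file and the files on disk agree before re-pointing the slave. The following Python sketch is only an illustration: `check_binlog_index` is a hypothetical helper, it assumes that relative entries resolve against the directory holding the index file, and the demo runs in a temporary directory rather than a real data directory.

```python
import os
import tempfile


def check_binlog_index(index_path, wanted):
    """Return (wanted_is_listed, missing_files) for a binary log index file."""
    base_dir = os.path.dirname(os.path.abspath(index_path))
    listed, missing = False, []
    with open(index_path) as index_file:
        for line in index_file:
            entry = line.strip()
            if not entry:
                continue
            # Assumption: relative entries (e.g. ./mysql-bin.000001) are
            # resolved against the directory that holds the index file.
            path = os.path.normpath(os.path.join(base_dir, entry))
            if os.path.basename(path) == wanted:
                listed = True
            if not os.path.isfile(path):
                missing.append(entry)
    return listed, missing


# Self-contained demo in a temporary directory (not a real data directory).
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "mysql-bin.001049"), "w").close()
    index_path = os.path.join(demo_dir, "mysql-bin.index")
    with open(index_path, "w") as index_file:
        index_file.write("./mysql-bin.001049\n./mysql-bin.001050\n")
    result = check_binlog_index(index_path, "mysql-bin.001050")

print(result)  # (True, ['./mysql-bin.001050'])
```

In the demo the requested file is listed in the index but missing on disk. The opposite case, a file on disk that the index no longer lists, would show up as `False` in the first element.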
I've just created a MySQL 5.6 master/slave relationship automatically through my provider's API, meaning I didn't have a root user. So after the slave was set up, I enabled the root user on the slave, which of course broke replication. I need to skip that GTID, but I'm having difficulty understanding the how-tos.
STOP SLAVE;
SET GTID_NEXT="5b182ac6-8a79-11e4-8f28-001851cf5e10:10";
BEGIN; COMMIT;
SET GTID_NEXT="AUTOMATIC";
START SLAVE;
results in the same error. What's the right GTID to select?
*************************** 1. row ***************************
Slave_IO_State: Waiting for the slave SQL thread to free enough relay log space
Master_Host: 10.188.52.218
Master_User: slave_6b72a386
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: replica-1007782573-bin.000006
Read_Master_Log_Pos: 1071986205
Relay_Log_File: replica-1130155763-relay.000003
Relay_Log_Pos: 5880
Relay_Master_Log_File: replica-1007782573-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1396
Last_Error: Error 'Operation CREATE USER failed for 'root'@'%'' on query. Default database: ''. Query: 'CREATE USER 'root'@'%''
Skip_Counter: 0
Exec_Master_Log_Pos: 5644
Relay_Log_Space: 5370067417
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1396
Last_SQL_Error: Error 'Operation CREATE USER failed for 'root'@'%'' on query. Default database: ''. Query: 'CREATE USER 'root'@'%''
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1007782573
Master_UUID: 5b182ac6-8a79-11e4-8f28-001851cf5e10
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 141223 15:14:12
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 5b182ac6-8a79-11e4-8f28-001851cf5e10:9-5851
Executed_Gtid_Set: 08bc8aa0-8a7a-11e4-8f2d-001851502460:1-41,
5b182ac6-8a79-11e4-8f28-001851cf5e10:9-69
Auto_Position: 0
Look at the executed GTID set.
Executed_Gtid_Set: 08bc8aa0-8a7a-11e4-8f2d-001851502460:1-41,
5b182ac6-8a79-11e4-8f28-001851cf5e10:9-69
08bc8aa0-8a7a-11e4-8f2d-001851502460 is probably your slave; your master is
Master_UUID: 5b182ac6-8a79-11e4-8f28-001851cf5e10
as you can see a few lines above. You have therefore executed transactions 9-69 from the master, so the next GTID is 70.
Also note that you should enable auto-positioning (Auto_Position: 1) with a CHANGE MASTER TO MASTER_AUTO_POSITION = 1 statement.
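The reasoning above can be sketched in a few lines of Python. `next_gtid` is a name invented for this example, and it assumes the master's part of the set has no gaps, as is the case here. If there are gaps, the first gap is the more likely candidate.

```python
def next_gtid(executed_gtid_set, master_uuid):
    """Return the GTID following the highest executed one for master_uuid."""
    for part in executed_gtid_set.split(","):
        uuid, *intervals = part.strip().split(":")
        if uuid != master_uuid or not intervals:
            continue
        # Each interval is "N" or "N-M"; take the largest upper bound.
        highest = max(int(item.split("-")[-1]) for item in intervals)
        return f"{master_uuid}:{highest + 1}"
    return None


# Executed_Gtid_Set and Master_UUID copied from the status output above.
executed = (
    "08bc8aa0-8a7a-11e4-8f2d-001851502460:1-41,\n"
    "5b182ac6-8a79-11e4-8f28-001851cf5e10:9-69"
)
master_uuid = "5b182ac6-8a79-11e4-8f28-001851cf5e10"

print(next_gtid(executed, master_uuid))
# 5b182ac6-8a79-11e4-8f28-001851cf5e10:70
```

The printed value is what goes into SET GTID_NEXT in place of the ":10" used in the question.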
I've spent two days so far looking through Stack Overflow answers and Google, and I just cannot get it working.
I'm trying to set up master/slave replication in MySQL/MariaDB, yet when I start replication, the slave just reports errors saying that tables don't exist.
Do I need to create the database and tables first?
If so, what happens if a new table is created on the master? Will this break replication?
This is the current state of the slave:
MariaDB [(none)]> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: somedomain.com
Master_User: someuser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 63687969
Relay_Log_File: mariadb-relay-bin.000002
Relay_Log_Pos: 382
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Error 'Table 'xxxxx.xxxx' doesn't exist' on query. Default database: 'xxxxxxxxxxxx'. Query: 'UPDATE xxxx SET lastused = NOW(), lingertime = 7 WHERE siteid = 'xxxxxxxxxxx''
Skip_Counter: 0
Exec_Master_Log_Pos: 98
Relay_Log_Space: 63763807
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Error 'Table 'xxxxxxxx.xxxxx' doesn't exist' on query. Default database: 'xxxxxxx'. Query: 'UPDATE xxxxx SET lastused = NOW(), lingertime = 7 WHERE siteid = 'xxxxxxxxxx''
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Do I need to create the database and tables first?
Yes, you have to do that first. The slave must have the same schemas and tables as the master.
If so, what happens if a new table is made on the master? will this break replication?
Every statement you execute on the master is replicated to the slave as well, including CREATE statements for tables and schemas, so you won't have any problem.
I have a slave that was started after the master had already been running. I started it at the position in the master binary log where I began importing databases on the master. The slave had been running, but now it is stuck and not progressing.
It has been stuck at relay master log file 000055, at the same position, for hours.
Results of show slave status:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: db10.domain.com
Master_User: replicator
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: DB10-bin.000102
Read_Master_Log_Pos: 917727958
Relay_Log_File: dbbk9-relay-bin.000152
Relay_Log_Pos: 863694346
Relay_Master_Log_File: DB10-bin.000055
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: %.rmallaccountinfo,%.rmaccountbalance,%.rmoldestcharge,%.rmautotemp,%.rmitemtemp,%.rmcust
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 863694199
Relay_Log_Space: 56169138270
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 842902
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 100
1 row in set (0.00 sec)
I looked up what is at that position in the master binlog, and it's just a DROP TEMPORARY TABLE:
# at 863694053
#120827 15:32:50 server id 100 end_log_pos 863694199 Query thread_id=6068 exec_time=0 error_code=0
SET TIMESTAMP=1346095970/*!*/;
DROP TEMPORARY TABLE IF EXISTS `rmtemptableinclude` /* generated by server */
/*!*/;
# at 863694199
#120827 15:32:51 server id 100 end_log_pos 863694282 Query thread_id=7152 exec_time=1 error_code=0
SET TIMESTAMP=1346095971/*!*/;
BEGIN
/*!*/;
Whenever I see replication hung up on something, there's an error displayed in the STATUS output...
You can try skipping one statement in the replication:
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
I just experienced the same thing: the slave did not report any error, it was just stuck on an old "Read_Master_Log_Pos".
I executed "STOP SLAVE;" followed by "START SLAVE;" and now it's catching up with the master! This is a little disconcerting because the slave wasn't even reporting that it was stuck.
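Since a slave in this state shows no error, it can help to compare the relevant fields programmatically rather than by eye. This rough Python sketch parses the text of a SHOW SLAVE STATUS \G output and flags the "running but behind" case. `parse_status` and `diagnose` are illustrative names, and the sample contains only a few fields from the status output in the question.

```python
def parse_status(text):
    """Turn 'Key: value' lines of SHOW SLAVE STATUS \\G output into a dict."""
    status = {}
    for line in text.splitlines():
        # Split at the first colon only; error messages may contain more.
        key, sep, value = line.strip().partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def diagnose(status):
    """Return a short description of the replication state."""
    if status.get("Slave_IO_Running") != "Yes":
        return "IO thread stopped: " + status.get("Last_IO_Error", "")
    if status.get("Slave_SQL_Running") != "Yes":
        return "SQL thread stopped: " + status.get("Last_SQL_Error", "")
    if status.get("Relay_Master_Log_File") != status.get("Master_Log_File"):
        lag = status.get("Seconds_Behind_Master", "unknown")
        return f"running, but the SQL thread is behind the IO thread (lag: {lag} s)"
    return "running"


# A few fields copied from the status output in the question.
sample = """
            Master_Log_File: DB10-bin.000102
      Relay_Master_Log_File: DB10-bin.000055
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
      Seconds_Behind_Master: 842902
"""

print(diagnose(parse_status(sample)))
# running, but the SQL thread is behind the IO thread (lag: 842902 s)
```

Running such a check twice a few minutes apart, and comparing Exec_Master_Log_Pos between the runs, would show whether the SQL thread is making any progress at all.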
Here are the steps I took.
I am replicating from MySQL 5.0.95 as the master to MySQL 5.5 as the slave. I'm using Linux. My DB engine is InnoDB.
I have set up the master with
/etc/my.cnf
#mysql Server setup
server-id=1
bind-address = 192.168.1.41
innodb_flush_log_at_trx_commit=1
sync_binlog=1
log-bin = /var/log/mysql/mysql-bin.log
binlog-do-db=my_db
I have set up the slave with
/etc/my.cnf
#mysql replication client setup
server-id=5
master-host=192.168.1.41
master-user=web_master
master-password=webmaster
master-connect-retry=60
replicate-do-db=dev_my_db
log-slave-updates
I have granted permissions on the master as
GRANT SUPER,REPLICATION CLIENT,REPLICATION SLAVE,RELOAD ON *.* TO 'web_master'@'192.168.1.41' identified by 'webmaster';
I have run CHANGE MASTER with the binlog file and position on the slave
CHANGE MASTER TO MASTER_HOST='192.168.1.41', MASTER_USER='web_master', MASTER_PASSWORD='webmaster', MASTER_LOG_FILE='mysql-bin.00002', MASTER_LOG_POS=107;
I did two inserts and an update, logged my queries,
and checked my master status as
mysql> SHOW master STATUS\G
File: mysql-bin.000004
Position: 13790
Binlog_Do_DB: my_db
Binlog_Ignore_DB:
1 row in set (0.00 sec)
I checked my slave status as
mysql> SHOW SLAVE STATUS \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.41
Master_User: web_master
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 13790
Relay_Log_File: mysqld-relay-bin.000007
Relay_Log_Pos: 244
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: dev_CHGV2_dbo
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 13790
Relay_Log_Space: 401
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 4
1 row in set (0.00 sec)
ERROR:
No query specified
I can see that the binary log position is updated on the slave, but my updates on the master are not replicated to the slave, and I can't figure out what is causing the issue.
Can anyone suggest what I might have missed?