Cannot start Debezium MySQL connector due to Error code 1236 - mysql

When I check the status of my Debezium connector via the Kafka Connect REST API, I see this error message for the connector:
org.apache.kafka.connect.errors.ConnectException: The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires. Error code: 1236; SQLSTATE: HY000.
	at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
	at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)
	at io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:997)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)
	at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)
	at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.github.shyiko.mysql.binlog.network.ServerException: The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.
	at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:914)
	... 3 more
Is this an issue with how I am configuring my Debezium connector, or an issue with MySQL? What's crazy is that even when I tried setting the option snapshot.mode to never, this error was still thrown! According to the documentation, when snapshot.mode is set to either never or when_needed it should not require the GTID, so I am super confused about what is happening.
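For reference, this is how I check the connector through the Kafka Connect REST API (the Connect host and connector name are placeholders for my actual ones):

# current status, including any task-level error trace
curl -s http://localhost:8083/connectors/my-mysql-connector/status
# the configuration the connector is actually running with
curl -s http://localhost:8083/connectors/my-mysql-connector/config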

The problem is that Debezium was probably down for some time, and some of the transactions it had not yet seen are no longer available on the server because the binlogs containing them have been purged.

That could be an issue with wrong offsets for the connector.
So I deleted the connector, deleted all related Kafka topics (like the schema history topic, etc.), and cleaned the offsets using the following guide: https://debezium.io/documentation/faq/#how_to_remove_committed_offsets_for_a_connector
And it helped! After re-creation, the connector works as expected.
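For anyone following the same route, here is a rough sketch of the commands involved; the connector name, topic names, and broker address are placeholders, and the exact offsets-topic key and partition to use are explained in the linked FAQ:

# delete the connector through the Kafka Connect REST API
curl -s -X DELETE http://localhost:8083/connectors/my-mysql-connector

# delete the connector's topics, e.g. the database history topic
kafka-topics --bootstrap-server localhost:9092 --delete --topic my-connector-history

# per the Debezium FAQ: write a tombstone for the connector's key into the Connect
# offsets topic (kafkacat's -Z sends the empty value as NULL)
echo '["my-mysql-connector",{"server":"my-db-server"}]#' | \
  kafkacat -b localhost:9092 -t my_connect_offsets -P -Z -K '#'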

Related

AWS Aurora MySQL Blue/Green Deployment replication failure

I wanted to use the new fully managed Blue/Green deployments to upgrade Aurora 1 databases (MySQL 5.6) to Aurora 2 (MySQL 5.7). Though it worked great on one pre-production environment, replication fails on other environments, including production. The replication fails with errors like:
Read Replica Replication Error - SQLError: 1032, reason: Could not execute Delete_rows event on table my_schema.my_table; Can't find record in 'my_table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin-changelog.000122, end_log_pos 2040
On other instances there were duplicate entries for the primary key (Error code 1062). I tried ROW and MIXED as binlog_format. AWS support's recommendation is to solve those problems manually, which does seem impractical. For debugging purposes, I tried to read the binlog with the mysqlbinlog utility. This led to the following error:
ERROR: Could not construct log event object: Found invalid event in binary log
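The binlog was read remotely with something along these lines (the instance endpoint and user are placeholders; the file name is the one from the replication error above):

mysqlbinlog --read-from-remote-server \
  --host=<aurora-instance-endpoint> --user=<admin-user> --password \
  --base64-output=decode-rows --verbose \
  mysql-bin-changelog.000122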
Did anybody encounter similar issues? Is there a way to get more insight into these errors and solve them?

Replication failed with error code 1062 using Google Database Migration Service on a big database (~800 GB)

I'm trying to create a MySQL 5.7.35 read-only replica on GCP from an external origin. The database is enormous, with approximately 800 GB of data.
I have already adjusted the definer on the triggers, views, and functions in a way that GCP accepts (root@%), and therefore the full dump that the Database Migration Service makes worked. I also got replication working with just the schema of this database (no data).
So far I have made only one attempt with data. On this attempt the full dump was successful (it took 2 days and 10 hours), but replication failed some time after it started, with the following error:
2021-09-05T06:09:33.293123Z 2 [ERROR] Slave SQL for channel '': Could not execute Write_rows event on table pacsdb.content_item; Duplicate entry '1441957' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000005, end_log_pos 78621021, Error_code: 1062
Selecting this row on the replica returned the same data as the origin (the row was already there).
Since I can't run STOP SLAVE, set sql_slave_skip_counter, and START SLAVE (or anything like that) on GCP, I have to figure out why this is happening.
My next step would be to try to make the dump manually using the flags that Google recommends, roughly as sketched below.
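The exact flags Google recommends are in their external-replica docs; this is just a rough example of the kind of consistent, GTID-aware dump I have in mind (host and user are placeholders, pacsdb is the schema from the error above):

mysqldump --host=<origin-host> --user=<user> --password \
  --databases pacsdb \
  --single-transaction --set-gtid-purged=ON \
  --hex-blob --routines --triggers > pacsdb_dump.sql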
Has anyone had a similar problem, or a clue why this is happening?
Any tips are appreciated, thanks!
Activating the consistency warnings and GTID-based replication should work. There is information on replication with Global Transaction Identifiers for MySQL 5.7 here [1].
[1] - https://dev.mysql.com/doc/refman/5.7/en/replication-gtids.html
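A minimal sketch of the online GTID rollout on the 5.7 origin, following the procedure in [1] (run against the origin server; let each step settle before moving to the next):

# allow only GTID-consistent statements, first as warnings, then enforced
mysql -e "SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = WARN;"
mysql -e "SET @@GLOBAL.ENFORCE_GTID_CONSISTENCY = ON;"

# move gtid_mode up one step at a time
mysql -e "SET @@GLOBAL.GTID_MODE = OFF_PERMISSIVE;"
mysql -e "SET @@GLOBAL.GTID_MODE = ON_PERMISSIVE;"
mysql -e "SET @@GLOBAL.GTID_MODE = ON;"

# also persist gtid_mode=ON and enforce_gtid_consistency=ON in my.cnf so the
# settings survive a restart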

Communication link failure: 1047 WSREP has not yet prepared node for application use

We had a three-node cluster running MariaDB 10.4. We had an outage and all the servers rebooted, with one having an irrecoverable network issue at the time.
We set up another server and added it to the cluster as a third member later.
However, ever since then, we have been getting this error every now and then:
*3287799 FastCGI sent in stderr: "PHP message: An Error occurred while handling another error:
PDOException: SQLSTATE[08S01]: Communication link failure: 1047 WSREP has not yet prepared node for application use in /var/....yii2/db/Command.php:1293
In order to fix this issue, we shut down all three nodes one by one and then re-initialized the cluster, even with a new cluster name and all.
The first one was started with "galera_new_cluster" and the remaining two were added to this cluster, roughly as shown below. However, we still kept getting the same error intermittently.
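The bootstrap and the checks we use are the standard ones (hostnames omitted; the wsrep status variables are how we verify a node is actually usable):

# on the first node only: bootstrap a brand new cluster
galera_new_cluster

# on each of the remaining nodes: join the cluster that was just bootstrapped
systemctl start mariadb

# check whether a node is ready for application use
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_ready';"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"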
The workaround at "mariadb galera - Error when a node shutdown ERROR 1047 WSREP has not yet prepared node for application use" was followed, but that didn't do anything, as expected.
Next, we set up a single fresh server and installed the new 10.5.x MariaDB server on it. We took a backup from the old cluster using mariabackup and restored it onto this new single server.
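The backup and restore followed the usual mariabackup sequence (directories and credentials are placeholders):

# on one node of the old cluster: take and prepare a full physical backup
mariabackup --backup --user=<backup-user> --password=<password> --target-dir=/backup/full
mariabackup --prepare --target-dir=/backup/full

# on the new server (MariaDB stopped, datadir empty): restore and fix ownership
mariabackup --copy-back --target-dir=/backup/full
chown -R mysql:mysql /var/lib/mysql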
This single server was set up as a new cluster with fresh details and everything. We wanted to run it as a single-node cluster to see whether the error still persisted. Oddly enough, the error is still there, and it comes up every half an hour or so.
Has anyone got any clue what could be the reason for this weird issue we're facing? Currently, we don't know what exactly the issue is, which is why we're having a hard time solving it.
Any help would be greatly appreciated.
Update:
We turned off Galera on this single-node cluster and ran it as a simple stand-alone MariaDB server. However, we still got the same errors in our web server's logs. This is bonkers.
Any idea? Anyone?

Unable to connect to the binlog client in NiFi

I'm building a NiFi dataflow, and I need to get the data changes from a MySQL database, so I want to use the CaptureChangeMySQL processor to do that.
I get the following error when I run the CaptureChangeMySQL processor, and I don't see what's causing it:
Failed to process session due to Could not connect binlog client to any of the specified hosts due to: BinaryLogClient was unable to connect in 10000ms: org.apache.nifi.processor.exception.ProcessException: Could not connect binlog client to any of the specified hosts due to: BinaryLogClient was unable to connect in 10000ms
I have the following controller services enabled:
DistributedMapCacheClientService
DistributedMapCacheServer
But I'm not sure if they are properly configured:
DistributedMapCacheServer properties
DistributedMapCacheClientService properties
In MySQL, I have enabled the log_bin variable (by default it wasn't enabled). I checked, and binlog files are indeed created when data changes; the checks I ran are shown below.
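From what I read, CaptureChangeMySQL needs row-based binlog events and a non-zero server_id, so these are the checks (run on the MySQL server):

mysql -e "SHOW VARIABLES LIKE 'log_bin';"        # ON after enabling it
mysql -e "SHOW VARIABLES LIKE 'binlog_format';"  # ROW is what CDC needs, as I understand it
mysql -e "SHOW VARIABLES LIKE 'server_id';"      # must be non-zero for binlog clients
mysql -e "SHOW BINARY LOGS;"                     # the binlog files I mentioned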
So I think the issue is with the controller services and how they connect, but it's not clear to me.
I searched for tutorials about how to use this NiFi processor but could not find how to fix this error. I looked mainly at this one: https://community.hortonworks.com/articles/113941/change-data-capture-cdc-with-apache-nifi-version-1-1.html but it did not help me.
Has anyone already used this processor to do CDC?
Thank you in advance.
I found what was wrong: I was trying to connect to the wrong port in the MySQL host setting of the CaptureChangeMySQL processor.
For others who are still facing similar issues, check whether the server's firewall is blocking the connection. Allow MySQL's port 3306 in your firewall rules.
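A quick way to check both things from the NiFi host (the hostname and user are placeholders; 3306 is the default MySQL port):

# is the MySQL port reachable at all from the NiFi host?
nc -vz <mysql-host> 3306

# can we actually log in over the network with the account NiFi uses?
mysql --host=<mysql-host> --port=3306 --user=<nifi-user> -p -e "SELECT 1;"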

AWS DMS replicate-changes-only error

I have a prod AWS Aurora DB and I want to replicate changes to a test MySQL DB (the schema is the same; Aurora is based on MySQL).
I am using AWS DMS for this.
When performing full replication for certain tables, the replication works fine.
When I want to perform replicate-changes-only, the replication fails.
I've set binlog_checksum=NONE and binlog_format=ROW in the parameter group.
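To confirm that the parameter group changes actually took effect on the source, and to keep binlogs around long enough for CDC, something like this can be used (the retention call is the RDS/Aurora-specific procedure AWS suggests for CDC; 24 hours is just an example value):

mysql -e "SHOW VARIABLES LIKE 'binlog_format';"    # expect ROW
mysql -e "SHOW VARIABLES LIKE 'binlog_checksum';"  # expect NONE

# RDS/Aurora-specific: keep binlogs long enough for DMS to read them
mysql -e "CALL mysql.rds_set_configuration('binlog retention hours', 24);"
mysql -e "CALL mysql.rds_show_configuration;"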
The error I am receiving while running is:
Last Error The task stopped abnormally Stop Reason RECOVERABLE_ERROR Error Level RECOVERABLE
Last Error Task 'task-id' was suspended due to 6 successive unexpected failures Stop Reason FATAL_ERROR Error Level FATAL
Loading a snapshot to the test db isn't an option.
I just want to replicate the changes between specific tables.
Thanks in advance.
I was having the same error; the task always stopped about 10 minutes after starting. Adding more verbose logs didn't show more information, but I got it working by changing the task configuration, especially the parameter MaxFullLoadSubTasks.
By default the value is "MaxFullLoadSubTasks": 8; I changed it to "MaxFullLoadSubTasks": 1.
It is slower but it's working for now. You may be able to increase it a bit to be quicker without having the same error.
You can modify the task configuration by first copying the task JSON settings (you will find them under DMS > Task > Overview), then changing the value, saving it to a file, and running:
aws dms modify-replication-task --replication-task-arn <TASK_ARN_ID> --replication-task-settings file:///path/to/your/task_config.json
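For completeness, the whole round trip can be scripted roughly like this, assuming the settings JSON has the default layout where MaxFullLoadSubTasks sits under FullLoadSettings (the ARN is a placeholder, as above):

# dump the current task settings to a file
aws dms describe-replication-tasks \
  --filters Name=replication-task-arn,Values=<TASK_ARN_ID> \
  --query 'ReplicationTasks[0].ReplicationTaskSettings' --output text > task_config.json

# lower the number of parallel full-load sub-tasks
jq '.FullLoadSettings.MaxFullLoadSubTasks = 1' task_config.json > task_config_new.json

# the task has to be stopped before its settings can be modified
aws dms stop-replication-task --replication-task-arn <TASK_ARN_ID>
aws dms modify-replication-task --replication-task-arn <TASK_ARN_ID> \
  --replication-task-settings file://task_config_new.json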