Kafka sink connector with MySQL DB: table not found

I was trying to configure a Kafka sink connector to a MySQL DB. The Kafka topic has values in Avro format, and I want to dump the data into MySQL. I was getting an error saying the table was not found (Table 'airflow.mytopic' doesn't exist). I expected the table to be created as 'myschema.mytopic', but the connector was looking for the table in the airflow schema. I had enabled "auto.create": "true" expecting the table to be created wherever it was needed.
I am using Confluent Kafka 5.4.1, started manually.
Configuration:
"topics": "mytopic",
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"connection.url": "jdbc:mysql://<mysqlDB>:3306/myschema",
"connection.user": "db_user",
"connection.password": "db_pwd",
"tasks.max": "1",
"auto.evolve": "true",
"auto.create": "true",
"transforms": "routeRecords",
"transforms.routeRecords.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.routeRecords.replacement": "$1",
"transforms.routeRecords.regex": "(.*)",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://localhost:8084",
"connection.attempts": "1",
"dialect.name": "MySqlDatabaseDialect",
"table.name.format": "myschema.mytopic"
Error stack:
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:561)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.ConnectException: java.sql.SQLException: Exception chain:
java.sql.SQLSyntaxErrorException: Table 'airflow.mytopic' doesn't exist
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:122)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:539)
... 10 more
Caused by: java.sql.SQLException: Exception chain:
java.sql.SQLSyntaxErrorException: Table 'airflow.mytopic' doesn't exist
at io.confluent.connect.jdbc.sink.JdbcSinkTask.getAllMessagesException(JdbcSinkTask.java:150)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:102)
... 11 more
Any clue what could be the reason for this error?

The issue got resolved by downgrading the MySQL driver (mysql-connector-java-5.1.17.jar). Below is the working configuration:
"topics": "mytopic",
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"connection.url": "jdbc:mysql://<mysqlDB>:3306/myschema",
"connection.user": "db_user",
"connection.password": "db_pwd",
"tasks.max": "1",
"insert.mode": "insert",
"auto.evolve": "true",
"auto.create": "true",
"transforms": "unwrap",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.unwrap.drop.tombstones": "false",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://localhost:8084",
"connection.attempts": "1",
"dialect.name": "MySqlDatabaseDialect",
"table.name.format": "myschema.mytopic"

Related

Kafka Connector: Encountered change event for table <table name> whose schema isn't known to this connector

We are facing an issue with our new connector. We want to capture change events for a MySQL table, but we are seeing the error with the trace below:
org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.handleEvent(MySqlStreamingChangeEventSource.java:369)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.lambda$execute$25(MySqlStreamingChangeEventSource.java:860)
at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1125)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:973)
at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.debezium.DebeziumException: Error processing binlog event
... 7 more
Caused by: io.debezium.DebeziumException: Encountered change event for table tablename whose schema isn't known to this connector
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.informAboutUnknownTableIfRequired(MySqlStreamingChangeEventSource.java:654)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.handleUpdateTableMetadata(MySqlStreamingChangeEventSource.java:633)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.lambda$execute$13(MySqlStreamingChangeEventSource.java:831)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.handleEvent(MySqlStreamingChangeEventSource.java:349
We suspect that we started facing this issue after downgrading from MySQL 8.0 to 5.7. We tried deleting the history topic and changing the snapshot mode configs, but haven't been able to solve it.
The properties are as follows:
{
"name": "speed-account-table-v3",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"snapshot.locking.mode": "none",
"topic.creation.default.partitions": "1",
"tasks.max": "1",
"database.history.consumer.sasl.jaas.config": "jass config",
"database.history.kafka.topic": "speed-history.speed-account-table-new",
"bootstrap.servers": "cluster name",
"database.history.consumer.security.protocol": "SASL_SSL",
"tombstones.on.delete": "true",
"snapshot.new.tables": "parallel",
"topic.creation.default.replication.factor": "2",
"database.history.skip.unparseable.ddl": "true",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"database.allowPublicKeyRetrieval": "true",
"database.history.producer.sasl.mechanism": "SCRAM-SHA-512",
"database.user": "username",
"database.server.id": "server id",
"database.history.producer.security.protocol": "SASL_SSL",
"database.history.kafka.bootstrap.servers": "cluster name",
"database.server.name": "speed-account-v3",
"database.port": "portnumber",
"key.converter.schemas.enable": "false",
"value.converter.schema.registry.url": "xxxx",
"database.hostname": "xxxxxx",
"database.password": "xxxxx",
"value.converter.schemas.enable": "false",
"name": "speed-account-table-v3",
"table.include.list": "speed.tbl_account",
"database.history.consumer.sasl.mechanism": "SCRAM-SHA-512",
"snapshot.mode": "initial",
"database.include.list": "speed"
}
}
We have tried changing our infrastructure by using a different Kafka Connect cluster as well as a completely different MSK cluster, but that also failed.

Debezium Mysql KafkaConnect - capture only new changelog data

I am using Debezium on a MySQL table to capture changelogs to Kafka with the Kafka Connect configuration below:
{
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"database.hostname": "mysql",
"database.port": "3306",
"database.user": "xxxx",
"database.password": "xxxx",
"database.server.id": "42",
"database.server.name": "xxxx",
"table.whitelist": "demo.movies",
"database.history.kafka.bootstrap.servers": "broker:9092",
"database.history.kafka.topic": "dbhistory.demo" ,
"decimal.handling.mode": "double",
"include.schema.changes": "true",
"transforms": "unwrap,dropTopicPrefix",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.dropTopicPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropTopicPrefix.regex":"asgard.demo.(.*)",
"transforms.dropTopicPrefix.replacement":"$1",
"key.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url": "http://schema-registry:8081",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081"
}
However, it is sending all the old records from the table to the Kafka topic.
Is there any way to read only the new changelog data?
The default behavior is to snapshot the table (take all existing data) and then read new data.
To read only new data, you need to add "snapshot.mode": "schema_only".
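As a rough sketch based on the configuration above (hostnames and credentials remain placeholders), the only change is the extra property on the source connector; with schema_only, Debezium captures only the table structure at startup and then streams changes made after the connector starts:
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "database.hostname": "mysql",
  "database.port": "3306",
  "database.user": "xxxx",
  "database.password": "xxxx",
  "database.server.id": "42",
  "database.server.name": "xxxx",
  "table.whitelist": "demo.movies",
  "database.history.kafka.bootstrap.servers": "broker:9092",
  "database.history.kafka.topic": "dbhistory.demo",
  "snapshot.mode": "schema_only"
}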

Hard delete events on Kafka Connect to sync databases don't work or give an error

I have connected my Postgres database to sync to a MySQL database.
The create and update events work fine on the sink, but when I delete a row on the source (not just data from a column) it gives an error.
I've tried a few things, but without luck.
1 - When I don't put "createKey" and "extractInt" in the "transforms" of my MySQL sink, I receive an error and the key column isn't created as bigserial:
"BLOB/TEXT column 'id_consultor' used in key specification without a key length".
2 - But if I do put "createKey" and "extractInt" in my configuration, creates and updates work fine, but delete events give this error:
"Only Map objects supported in absence of schema for [copying fields from value to key], found: null".
"transforms.createKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields":"id_consultor",
"transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractInt.field": "id_consultor"
3 - If I put on my source (Postgres)
"transforms.unwrap.delete.handling.mode": "rewrite"
the delete only partially works, executing a "soft delete": it doesn't erase the row, it just erases the data and fills the not-null fields with 0.
Can somebody help me? Thanks!
Postgres Connector:
"name": "postgres-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"database.hostname": "**",
"database.port": "5432",
"database.user": "**",
"database.password": "**",
"database.dbname" : "**",
"database.server.name": "kafkaPostgres",
"database.history.kafka.bootstrap.servers": "kafka:9092",
"database.history.kafka.topic": "history",
"schema.include.list": "public",
"table.include.list": "public.consultor",
"time.precision.mode": "connect",
"tombstones.on.delete": "true",
"plugin.name": "pgoutput",
"transforms": "unwrap, dropPrefix",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.unwrap.drop.tombstones": "false",
"transforms.unwrap.delete.handling.mode":"rewrite",
"transforms.unwrap.add.fields": "table,lsn",
"transforms.unwrap.add.headers": "db",
"transforms.dropPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropPrefix.regex":"kafkaPostgres.public.(.*)",
"transforms.dropPrefix.replacement":"$1"
MySql Sink:
"name": "mysql-sink",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"topics": "consultor",
"key.converter":"org.apache.kafka.connect.storage.StringConverter",
"key.converter.schemas.enable": "true",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true",
"connection.url": "**,
"connection.user":"**",
"connection.password":"**",
"auto.create": "true",
"auto.evolve": "true",
"insert.mode": "upsert",
"dialect.name": "MySqlDatabaseDialect",
"Database Dialect": "MySqlDatabaseDialect",
"table.name.format": "consultor",
"pk.mode": "record_key",
"pk.fields": "id_consultor",
"delete.enabled": "true",
"drop.invalid.message": "true",
"delete.retention.ms": 1,
"fields.whitelist": "id_consultor, idempresaorganizacional, cd_consultor_cpf, dt_consultor_nascimento , ds_justificativa, nn_consultor , cd_consultor_rg, id_motivo, id_situacao , id_sub_motivo",
"transforms": "unwrap, flatten, route, createKey, extractInt ",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.unwrap.drop.tombstones": "false",
"transforms.unwrap.delete.handling.mode":"rewrite",
"transforms.flatten.type": "org.apache.kafka.connect.transforms.Flatten$Value",
"transforms.flatten.delimiter": ".",
"transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.route.regex": "(?:[^.]+)\\.(?:[^.]+)\\.([^.]+)",
"transforms.route.replacement": "$1",
"transforms.createKey.type":"org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields":"id_consultor",
"transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractInt.field": "id_consultor"
I've added these properties on the source connector:
"key.converter": "io.apicurio.registry.utils.converter.AvroConverter",
"key.converter.apicurio.registry.url" :"http://apicurio:8080/api",
"key.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.GetOrCreateIdStrategy",
"value.converter": "io.apicurio.registry.utils.converter.AvroConverter",
"value.converter.apicurio.registry.url":"http://apicurio:8080/api",
"value.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.GetOrCreateIdStrategy",
"key.converter.schemas.enable": "true",
"value.converter.schemas.enable": "true",
and replaced this on the sink:
"key.converter":"org.apache.kafka.connect.storage.StringConverter",
"key.converter.schemas.enable": "true",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true",
to:
"key.converter": "io.apicurio.registry.utils.converter.AvroConverter",
"key.converter.apicurio.registry.url" :"http://apicurio:8080/api",
"key.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.GetOrCreateIdStrategy",
"value.converter": "io.apicurio.registry.utils.converter.AvroConverter",
"value.converter.apicurio.registry.url":"http://apicurio:8080/api",
"value.converter.apicurio.registry.global-id": "io.apicurio.registry.utils.serde.strategy.GetOrCreateIdStrategy",
"key.converter.schemas.enable": "true",
"value.converter.schemas.enable": "true",
and everything worked well, but I can't read the messages in the topic because I'm using the Debezium Kafka distribution and it doesn't ship an Avro console consumer.
Now I'm trying to find a plugin for this version to be able to read the Avro messages.
I hope this helps somebody.
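For context, the JDBC sink can only turn tombstone records into DELETEs when the record key carries the primary key and delete handling is enabled; since a tombstone has a null value, a ValueToKey/ExtractField transform on the sink has nothing to copy from, which is why keying the records on the source side (as in the Avro setup above) avoids the error. A minimal sketch of the delete-related sink properties under that assumption (field name taken from the config above):
"pk.mode": "record_key",
"pk.fields": "id_consultor",
"delete.enabled": "true",
"insert.mode": "upsert"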

MySQL Debezium Kafka : schema isn't known to this connector

I started the MySQL Debezium Kafka connector (version 0.9.2.Final) with one table in "table.whitelist" and it was working fine. After adding another table to the whitelist and restarting the connector, I am getting the error below.
org.apache.kafka.connect.errors.ConnectException: Encountered change event for table paperclip.ilt whose schema isn't known to this connector
at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)
at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:477)
at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)
at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)
at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.ConnectException: Encountered change event for table paperclip.ilt whose schema isn't known to this connector
at io.debezium.connector.mysql.BinlogReader.informAboutUnknownTableIfRequired(BinlogReader.java:727)
at io.debezium.connector.mysql.BinlogReader.handleUpdateTableMetadata(BinlogReader.java:702)
at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:461)
... 5 more
Please find below the configuration I have used. I was hoping that with this setting ("database.history.store.only.monitored.tables.ddl": "false") it would work.
How can I resolve this?
{
"name": "Mysql-rnd-engagex",
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": "3",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"errors.log.enable": "true",
"errors.log.include.messages": "true",
"database.hostname": "devmysql.xxxx.net",
"database.port": "3306",
"database.user": "xxxxxx",
"database.password": "xxxxx",
"database.server.name": "rnd_engagex_cdc",
"database.history.kafka.bootstrap.servers": "xxxxxx.aivencloud.com:xxxx",
"database.history.kafka.topic": "rnd_engagex_dbhistory",
"database.history.skip.unparseable.ddl": "false",
"database.history.store.only.monitored.tables.ddl": "false",
"include.schema.changes": "false",
"include.query": "false",
"table.ignore.builtin": "true",
"database.whitelist": "paperclip",
"table.whitelist": "paperclip.elearning", //added new table : "paperclip.elearning,paperclip.ilt"
"column.blacklist": "paperclip.elearning.description",
"gtid.source.filter.dml.events": "true",
"tombstones.on.delete": "true",
"connect.keep.alive": "true",
"snapshot.minimal.locks": "true",
"database.history.producer.ssl.truststore.location": "/xxxx/yyyy/keys/public.truststore.jks",
"value.converter.schemas.enable": "false",
"database.history.consumer.ssl.truststore.location": "/xxxx/yyyy/keys/public.truststore.jks",
"database.history.producer.ssl.truststore.password": "password",
"database.history.producer.ssl.keystore.location": "/xxxx/yyyy/keys/public.keystore.p12",
"database.history.consumer.ssl.truststore.password": "password",
"database.history.consumer.ssl.keystore.location": "/xxxx/yyyy/keys/public.keystore.p12",
"database.history.producer.ssl.keystore.type": "PKCS12",
"database.history.producer.ssl.keystore.password": "ppppppppp",
"database.history.consumer.ssl.key.password": "ppppppppp",
"database.history.producer.security.protocol": "SSL",
"database.history.consumer.ssl.keystore.type": "PKCS12",
"database.history.consumer.ssl.keystore.password": "ppppppppp",
"database.history.producer.ssl.key.password": "ppppppppp",
"database.history.consumer.security.protocol": "SSL",
"key.converter.schemas.enable": "false"
}
You need to add the property "snapshot.new.tables": "parallel" while creating the connector; only then will you be able to whitelist more tables at a later stage. This is not mentioned in the documentation, since the feature came as a beta in 0.9.x.
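As a sketch of where that property sits, the relevant subset of the configuration above would become (all other properties unchanged, table list taken from the inline comment above):
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"snapshot.new.tables": "parallel",
"table.whitelist": "paperclip.elearning,paperclip.ilt"
As noted, the property needs to be set when the connector is first created.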

How to tell the Debezium MySQL source connector to stop retaking snapshots of existing tables in the Kafka topic?

I'm using the Debezium MySQL CDC source connector to move a database from MySQL to Kafka. The connector is working fine except for the snapshots, where it's acting weird: the connector took the first snapshot successfully, then after a few hours went down due to a heap memory limit (this is not the problem). I paused the connector, stopped the worker on the cluster, fixed the issue, then started the worker again... The connector is now running fine but is taking snapshots again!
It looks like the connector is not resuming from where it left off, and I think something is wrong in my configs.
I'm using Debezium 0.9.5.
I changed snapshot.mode from initial to initial_only but it didn't work.
Connect properties:
{
"properties": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"snapshot.locking.mode": "minimal",
"errors.log.include.messages": "false",
"table.blacklist": "mydb.someTable",
"include.schema.changes": "true",
"database.jdbc.driver": "com.mysql.cj.jdbc.Driver",
"database.history.kafka.recovery.poll.interval.ms": "100",
"poll.interval.ms": "500",
"heartbeat.topics.prefix": "__debezium-heartbeat",
"binlog.buffer.size": "0",
"errors.log.enable": "false",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"snapshot.fetch.size": "100000",
"errors.retry.timeout": "0",
"database.user": "kafka_readonly",
"database.history.kafka.bootstrap.servers": "bootstrap:9092",
"internal.database.history.ddl.filter": "DROP TEMPORARY TABLE IF EXISTS .+ /\\* generated by server \\*/,INSERT INTO mysql.rds_heartbeat2\\(.*\\) values \\(.*\\) ON DUPLICATE KEY UPDATE value \u003d .*,FLUSH RELAY LOGS.*,flush relay logs.*",
"heartbeat.interval.ms": "0",
"header.converter": "org.apache.kafka.connect.json.JsonConverter",
"autoReconnect": "true",
"inconsistent.schema.handling.mode": "fail",
"enable.time.adjuster": "true",
"gtid.new.channel.position": "latest",
"ddl.parser.mode": "antlr",
"database.password": "pw",
"name": "mysql-cdc-replication",
"errors.tolerance": "none",
"database.history.store.only.monitored.tables.ddl": "false",
"gtid.source.filter.dml.events": "true",
"max.batch.size": "2048",
"connect.keep.alive": "true",
"database.history": "io.debezium.relational.history.KafkaDatabaseHistory",
"snapshot.mode": "initial_only",
"connect.timeout.ms": "30000",
"max.queue.size": "8192",
"tasks.max": "1",
"database.history.kafka.topic": "history-topic",
"snapshot.delay.ms": "0",
"database.history.kafka.recovery.attempts": "100",
"tombstones.on.delete": "true",
"decimal.handling.mode": "double",
"snapshot.new.tables": "parallel",
"database.history.skip.unparseable.ddl": "false",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"table.ignore.builtin": "true",
"database.whitelist": "mydb",
"bigint.unsigned.handling.mode": "long",
"database.server.id": "6022",
"event.deserialization.failure.handling.mode": "fail",
"time.precision.mode": "adaptive_time_microseconds",
"errors.retry.delay.max.ms": "60000",
"database.server.name": "host",
"database.port": "3306",
"database.ssl.mode": "disabled",
"database.serverTimezone": "UTC",
"task.class": "io.debezium.connector.mysql.MySqlConnectorTask",
"database.hostname": "host",
"database.server.id.offset": "10000",
"connect.keep.alive.interval.ms": "60000",
"include.query": "false"
}
}
I can confirm Gunnar's answer above. I ran into some issues during snapshotting and had to restart the whole snapshotting process. Right now, the connector does not support resuming a snapshot from a certain point. Your configs seem fine to me. Hope this helps.