Kafka to Database data Pipelining - mysql

I have data coming into Kafka, and I want to push it from Kafka to a database (PostgreSQL).
I am following the steps in this link "https://hellokoding.com/kafka-connect-sinks-data-to-postgres-example-with-avro-schema-registry-and-python/" and I am getting an error. Any suggestions?

No code is required for this; it can be achieved using Kafka Connect and the JDBC sink connector available on Confluent Hub:
https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc
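For reference, a minimal JDBC sink configuration might look like the sketch below; the connector name, topic, connection URL and credentials are placeholders, not values taken from the question.

name=postgres-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=my_topic
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=postgres
connection.password=secret
auto.create=true
insert.mode=insert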

The error you're getting is this:
Failed to deserialize the data for topic
Error deserializing Avro message
Unknown magic byte
This is because you have specified the Avro converter, but the data in your topic is not Avro.
See this article for details: https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained
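If the topic actually holds plain JSON rather than Avro, the sink's converter has to match it; a sketch of the relevant properties (schemas.enable=false assumes the JSON carries no embedded schema):

value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false

Alternatively, keep the Avro converter and make sure the producer really writes Avro through the Schema Registry serializer, as the linked tutorial does.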

Related

Receiving strange values to kafka topic from kura

Trying to send data to a Kafka topic from Kura through the MQTT proxy, but receiving strange values in the Kafka topic. We are trying to send JSON data to the Kafka topic.
The MQTT proxy only sends binary data (bytes or strings), and Control Center can only show strings.
You will want to verify the Kafka producer settings to see if it is truly sending JSON, and not other binary format, or compressed/encrypted bytes.
You can debug further with kafka-console-consumer rather than using any UI tool.
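For example (broker address and topic name here are placeholders, not values from the question):

kafka-console-consumer --bootstrap-server localhost:9092 --topic mytopic --from-beginning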

How to get a json metadata object with Heroku log drain

Right now I'm fetching logs from the backend using Heroku log drains. The problem with that approach is that the logs are streamed line by line, and it is not possible to get a JSON metadata object.
What options do I have available to solve my issues? Thank you!

Kafka Connect transforming JSON string to actual JSON

I'm trying to figure out whether it's possible to transform JSON values that are stored as strings into actual JSON structures using Kafka Connect.
I tried looking for such a transformation but couldn't find one. As an example, this could be the source:
{
  "UserID": 2105058535,
  "DocumentID": 2105058535,
  "RandomJSON": "{\"Tags\":[{\"TagID\":1,\"TagName\":\"Java\"},{\"TagID\":2,\"TagName\":\"Kafka\"}]}"
}
And this is my goal:
{
  "UserID": 2105058535,
  "DocumentID": 2105058535,
  "RandomJSON": {
    "Tags": [
      {
        "TagID": 1,
        "TagName": "Java"
      },
      {
        "TagID": 2,
        "TagName": "Kafka"
      }
    ]
  }
}
I'm trying to make these transformations for the Elasticsearch sink connector, if it makes a difference.
I know I can use Logstash together with JSON filter in order to do this, but I'd like to know whether there's a way to do it using just Kafka Connect.
Sounds like this would be a Single Message Transform (and thus applicable to any connector, not just Elasticsearch), but there isn't one out of the box that does what you describe. The API is documented here.
I had a similar issue, but in reverse: I had the data as JSON and needed to convert some of it into a JSON string representation to store it in Cassandra using the Cassandra sink. I ended up creating a Kafka Streams app that reads from the topic and writes the reshaped JSON object to another topic, which is read by the connector; see the sketch below.
topic document <- read by your Kafka Streams app, which calls mapValues (or builds a Jackson POJO that serializes the way you want) and then writes the value to -> topic document.elasticsearch
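A minimal sketch of that approach for this question's data, assuming String keys/values, Jackson for parsing, and the topic names used above (document -> document.elasticsearch):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import java.util.Properties;

public class InlineJsonStringApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "inline-json-string");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        ObjectMapper mapper = new ObjectMapper();
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("document", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> {
                   try {
                       ObjectNode doc = (ObjectNode) mapper.readTree(value);
                       // Parse the stringified field and swap in the real JSON structure.
                       JsonNode inner = mapper.readTree(doc.get("RandomJSON").asText());
                       doc.set("RandomJSON", inner);
                       return mapper.writeValueAsString(doc);
                   } catch (Exception e) {
                       throw new RuntimeException("Could not rewrite RandomJSON", e);
                   }
               })
               .to("document.elasticsearch", Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }
}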
You can use the FromJson transformation.
Please check this link for more details:
https://jcustenborder.github.io/kafka-connect-documentation/projects/kafka-connect-json-schema/transformations/examples/FromJson.inline.html
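Roughly, it is wired into the sink connector configuration like the sketch below; the property names are my reading of the linked page and should be double-checked against it, and the inline schema here is only a hypothetical fragment.

transforms=fromJson
transforms.fromJson.type=com.github.jcustenborder.kafka.connect.json.FromJson$Value
transforms.fromJson.json.schema.location=Inline
transforms.fromJson.json.schema.inline={"type":"object","properties":{"Tags":{"type":"array"}}}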

copy JSON formatted messages from a kafka topic to another topic in AVRO format

I have a Kafka Connect setup running where a source connector reads structured records from text files and stores them in a topic in JSON format (with schema). There is a sink connector running which inserts those messages into a Cassandra table. While this setup runs fine, I needed to introduce another sink connector to transfer those messages to HDFS as well. So I tried to implement the HDFS sink connector (CP 3.0). But this connector expects the messages to be Avro formatted and hence throws errors like 'Failed to deserialize data to Avro'.
Is there a way so that I can copy and convert the JSON messages from the source topic to another topic in Avro format and point the HDFS sink connector to the new topic to read from? Can it be done using Kafka Streams?
My distributed Connect Config file contains --
...
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
...
My message in the topic is as below --
{"schema":{"type":"struct",
"fields":[{"type":"string","optional":false,"field":"id"},
{"type":"string","optional":false,"field":"name"},
{"type":"integer","optional":false,"field":"amount"}
],
"optional":false,
"name":"myrec",
"version":1
},
"payload":{"id":"A123","name":"Sample","amount":75}
}
Can anyone help me with this? Thanks in advance...
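For what it's worth, the kind of Streams copy job being asked about could look roughly like the sketch below. It is only an illustration of the idea, assuming Confluent's GenericAvroSerde with a Schema Registry (and a newer Streams API than CP 3.0); the topic names, URLs, and the hard-coded Avro schema mirroring the id/name/amount fields are placeholders.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import java.util.Collections;
import java.util.Properties;

public class JsonToAvroCopy {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "json-to-avro-copy");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        // Avro schema mirroring the fields of the JSON envelope's payload.
        Schema avroSchema = SchemaBuilder.record("myrec").fields()
                .requiredString("id")
                .requiredString("name")
                .requiredInt("amount")
                .endRecord();

        // Value serde backed by a Schema Registry (placeholder URL).
        GenericAvroSerde avroSerde = new GenericAvroSerde();
        avroSerde.configure(
                Collections.singletonMap("schema.registry.url", "http://localhost:8081"), false);

        ObjectMapper mapper = new ObjectMapper();
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("source-json-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(json -> {
                   try {
                       // The JsonConverter envelope wraps the data in a "payload" field.
                       JsonNode payload = mapper.readTree(json).get("payload");
                       GenericRecord record = new GenericData.Record(avroSchema);
                       record.put("id", payload.get("id").asText());
                       record.put("name", payload.get("name").asText());
                       record.put("amount", payload.get("amount").asInt());
                       return record;
                   } catch (Exception e) {
                       throw new RuntimeException("Could not convert JSON message", e);
                   }
               })
               .to("target-avro-topic", Produced.with(Serdes.String(), avroSerde));

        new KafkaStreams(builder.build(), props).start();
    }
}

The HDFS sink connector would then be pointed at target-avro-topic with the Avro converter configured.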

SpringXD JSON parser to Oracle DB

I am trying to use SpringXD to stream some JSON metrics data to an Oracle database.
I am using this example from here: SpringXD Example
Http call being made: EarthquakeJsonExample
My shell cmd.
stream create earthData --definition "trigger|usgs| jdbc --columns='mag,place,time,updated,tz,url,felt,cdi,mni,alert,tsunami,status,sig,net,code,ids,souces,types,nst,dmin,rms,gap,magnitude_type' --driverClassName=driver --username=username --password --url=url --tableName=Test_Table" --deploy
I would like to capture just the properties portion of this JSON response into the given table columns. I got it to the point where it doesn't give me an error on the hashing, but instead just deposits a bunch of nulls into the columns.
I think my problem is the parsing of the JSON itself, since the properties object really sits inside the features array. Can SpringXD distinguish this for me out of the box, or will I need to write a custom processor?
Here is a look at what the database looks like after a successful cmd.
Any advice? I'm new to parsing JSON in this fashion and I'm not really sure how to find more documentation or examples for SpringXD itself.
Here is reference to the documentation: SpringXD Doc
The transformer in the JDBC sink expects a simple document that can be converted to a map of keys/values. You would need to add a transformer upstream, perhaps in your usgs processor or even a separate processor. You could use a #jsonPath expression to extract the properties key and make it the payload.
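A rough sketch of what that fragment might look like in the stream definition, with the jdbc options kept as in the original command (the JSON path, which takes only the first feature, is an assumption, and the single quotes around it typically need to be escaped/doubled inside the double-quoted --definition in the XD shell):

... | usgs | transform --expression=#jsonPath(payload,'$.features[0].properties') | jdbc --columns='mag,place,...' ...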