Kafka Connect with MySQL Source - mysql

Before I begin, I'd like to start by saying I am completely new to Kafka and am fairly new to Linux, so if this ends up being a ridiculously simple answer, please be kind! :)
The high level idea of what I'm trying to do is use Confluent's Kafka Connect to read from a MySQL database that is having sensor data streamed to it on a minute or sub-minute basis and then use Kafka as an "ETL pipeline" to instantly route that data to a Data Warehouse and/or MongoDB for reporting or even tie in directly to Kafka from our web-app.
I am using Robin Moffatt's series as well as Confluent's JDBC Source Connector Quickstart as my initial guide. As far as where these are hosted, I am using an Amazon RDS MySQL database and a separate AWS EC2 t2.large instance with Ubuntu 16.04.2 to run Kafka Connect.
Using Robin's workflow, I am to the point where I have created the configuration file, but I am not using the json format he uses. I am using the format from the quickstart article.
name=jdbc_source_mysql_4427_Data
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
connection.url=jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*****
table.whitelist=4427_Data
mode=timestamp
timestamp.column.name=TmStamp
validate.non.null=false
topic.prefix=mysql-
And that is saved at:
/etc/kafka-connect-jdbc/kafka-connect-jdbc-source.properties
I then run:
/usr/bin/confluent load jdbc_source_mysql_4427_Data -d /etc/kafka-connect-jdbc/kafka-connect-jdbc-source.properties
and get this error:
{
"error_code": 400,
"message": "Connector configuration is invalid and contains the following 2 error(s):\nInvalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*** for configuration Couldn't open connection to jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=***\nInvalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*** for configuration Couldn't open connection to jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=***\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"
}
It seems to be a driver issue. My question at this point is, "Do I need to download the MySQL JDBC driver to my EC2 instance, or should that have been included in the Confluent Platform package?"
Also, does my overall idea sound like a good fit for Kafka Connect?
As I mentioned earlier, I am new to these technologies, but have found the best way to learn something is to jump right in and try to solve a problem. Any ideas and suggestions would be more than welcome. Thank you!

The overall concept makes sense to me. You do need to download the driver and add it to your worker classpath. It isn't packaged for licensing reasons I assume.

As @dawsaw says, you do need to make the MySQL JDBC driver available to the connector.
My observation would be that, given a free hand over the application and architecture you describe, it would be better to stream from the sensor into Kafka, and from there into MySQL, Mongo, the web app, and so on.
Streaming into a database only to stream back out of it is not ideal if you have the option.

It's because the Confluent distribution does not include a MySQL driver. I think you can solve the problem by downloading a MySQL driver JAR file, putting it in the confluent/share/java/kafka-connect-jdbc folder, and re-running the command.
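The copy step described above could be sketched as a small shell function. This is only a sketch: the function name install_jdbc_driver is hypothetical, and both the jar file name and the connector directory depend on your download and on how Confluent was installed, so treat the paths in the usage line as assumptions.

```shell
#!/bin/sh
# install_jdbc_driver JAR PLUGIN_DIR
# Copy a downloaded MySQL JDBC driver jar into the directory that holds
# the kafka-connect-jdbc jars. Both arguments are supplied by the caller;
# nothing here knows where your installation actually lives.
install_jdbc_driver() {
    jar="$1"
    plugin_dir="$2"
    if [ ! -f "$jar" ]; then
        echo "driver jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$plugin_dir" ]; then
        echo "connector directory not found: $plugin_dir" >&2
        return 1
    fi
    cp "$jar" "$plugin_dir"/
}
```

Usage might look like `install_jdbc_driver ./mysql-connector-java-*.jar /usr/share/java/kafka-connect-jdbc` (assumed paths). The Connect worker most likely needs a restart afterwards so that it picks up the new jar.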

Related

Send data to a MySQL server over an internet connection

I'm a total beginner to MySQL, I'm more of a firmware specialist. I'm working on an application where I will be getting GPS coordinates from a microcontroller + cellular device and I would like some way to store the coordinates and do processing on them. I figured a database hosted on a server made the most sense, which is what has brought me to MySQL.
Basically, I'm wondering what the basic protocol is for sending data to a MySQL server over an internet connection (my device has data). Like how do I connect to the server and publish data to it?
I'm experienced with MQTT and I think I could do TCP as well but I'm looking for a protocol that is not super power-intensive and I can't use anything that requires an operating system, like a python script.
To be clear, I am NOT asking you to tell me every step for how this is done, but basically what protocol and what tools could I use? Anything you can tell me would be appreciated.
I was thinking that I could use the MySQL client C code to help write a driver that could allow me to connect to the server. I'm experienced with writing drivers and the microcontroller I'm using uses C.
You need no direct connection to the DB at all. Your cellular device should be able to establish a TCP connection to an IP address and port and send a byte stream through it. It can even be a dumb, unidirectional protocol that tolerates losses.
You need some service listening on the other side that parses the byte stream, extracts the valid packets from it, and then sends the data to the database. Frankly, that service can even be written in Linux shell:
nc -lk 1234 | collector.sh
where collector.sh is a script like this:
#!/bin/sh
# Read one record per line from stdin (piped in from nc).
while read -r LINE
do
    # Parse $LINE into LAT, LON and DTIME here.
    mysql -e "INSERT INTO mygps.nmea (lat,lon,dtime) VALUES ($LAT, $LON, $DTIME);"
done
Sure, it isn't the best solution, but it was really helpful for me at the very beginning. You can then process the gathered data in any way you like.
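The parsing step left open in the script could look something like the sketch below. It assumes, purely for illustration, that the device sends one comma-separated line per fix in the form lat,lon,dtime with dtime as a numeric Unix timestamp; the function name parse_fix is hypothetical, and the table and column names are the ones from the script above. Only plain numbers are accepted, so that nothing unexpected ends up inside the SQL statement.

```shell
#!/bin/sh
# parse_fix LINE
# Split one "lat,lon,dtime" line (assumed format) and print an INSERT
# statement for it. Returns non-zero if a field is missing or non-numeric.
parse_fix() {
    IFS=, read -r lat lon dtime <<EOF
$1
EOF
    # Reject anything that is not digits, dots or minus signs. Extra
    # fields end up in dtime with their commas and are rejected here too.
    case "$lat$lon$dtime" in
        ''|*[!0-9.-]*) return 1 ;;
    esac
    # All three fields must be present.
    [ -n "$lat" ] && [ -n "$lon" ] && [ -n "$dtime" ] || return 1
    printf 'INSERT INTO mygps.nmea (lat,lon,dtime) VALUES (%s, %s, %s);\n' \
        "$lat" "$lon" "$dtime"
}
```

Inside the loop this could be used as `parse_fix "$LINE" | mysql`, since the mysql client reads statements from standard input; lines that fail validation simply produce no statement.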
Build a simple server that receives whatever data is gathered, and then have that server send the data to MySQL with the help of a MySQL connector. Building your part of the protocol will be quite time consuming. - nbk
If you "can't use anything that requires an operating system", you need some middleware that can run the MySQL client driver to talk to the database; you would then use MQTT to pass data between your sensor and the middleware. If you don't want to write this middleware yourself, something like Node-RED might come in handy.
You certainly can reimplement the driver for your MC, though I personally would not want to waste the time on something like this when I can assemble a solution from existing components. Database protocols are typically chatty, synchronous, and sensitive to network quality, and I wouldn't want to waste my MC cycles on that when I can make middleware do that asynchronously. - mustaccio
Simply "reverse ssh port forwarding"? That can be done, I think, with a single ssh command at one (or both) end of the connection. MySQL, by default, needs the client to connect on port 3306 to the server. - rick-james

Couchbase says: No valid node found to bootstrap from

What does this error message mean?
com.couchbase.client.core.config.ConfigurationException: No valid node found to bootstrap from. Please check your network configuration.
From the source code:
https://github.com/couchbase/couchbase-jvm-core/blob/master/src/main/java/com/couchbase/client/core/message/cluster/SeedNodesRequest.java
it looks like my node host is found, but is not valid:
If memory serves, it means that the Couchbase SDK cannot connect to the cluster you have in your connect string in your connection object. It is trying to connect and get the cluster map to know the cluster topology, what services are available and where in the cluster they are.
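A quick way to check the connectivity side of this from the machine running the application is to probe the node's ports directly. The sketch below assumes the cluster uses Couchbase's default unencrypted ports (8091 for the management/REST service and 11210 for the data service) and that a netcat with the -z and -w options is installed; the function names are hypothetical.

```shell
#!/bin/sh
# can_reach HOST PORT
# Succeed if a TCP connection to HOST:PORT can be opened within 3 seconds.
can_reach() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# report_node HOST
# Print one line per port saying whether it is reachable from this machine.
# 8091 and 11210 are assumed defaults; adjust them if your cluster differs.
report_node() {
    for port in 8091 11210; do
        if can_reach "$1" "$port"; then
            echo "$1:$port reachable"
        else
            echo "$1:$port NOT reachable"
        fi
    done
}
```

Running `report_node` with the host name from your connection string shows whether a firewall or a wrong address is in the way; if both ports are reachable, the cause probably lies elsewhere, for example in the bucket name or credentials.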
In the future, please add the code you are using to your question, so that people can answer it and the community here benefits as well.

Eclipse Data Source Explorer Does Not See MySql Schemas

I configured a connection profile and I could ping it successfully, but the Data Source Explorer does not see any of the schemas (databases) in MySQL.
When I configure the connection for JPA, the database-specific validations fail and none of my TABLE names mapped to my entities are seen, and cause the failures.
I looked at previous questions and answers, and followed the suggestions, which allowed me to configure the connection successfully, but the explorer still does not see schemas.
Help would be greatly appreciated. I’ve burned a full day on this issue and I cannot find any answers from uncle Google.
The MySQL driver is configured correctly (otherwise the ping would fail).
URL: jdbc:mysql://localhost:3306/finances
Database Name: finances
Driver Class: com.mysql.jdbc.Driver
The driver Jar is attached and visible in the driver JAR list.
The JPA properties of the project point at the same connection as the Data Source Explorer, and the "connect" button connects successfully; the status bar at the bottom of the window shows MyConnectionProfile (Connected).
The JPA properties are:
Platform: EclipseLink 2.5.x
User Library: EclipseLink 2.5.2
JPA Version: 2.1
The database-specific validation errors are
Table "XXXXX" cannot be resolved, for each of 15 tables.
I wanted to embed snapshots but I did not know how.
My very first question on SO! Be Gentle!

Connecting to a MySQL db with the anylogic objects

I'm trying to connect to a local MySQL database, with the anylogic object "database".
I'm using the type: "Other Database", and the connection URL: "Server=localhost;Database=anylogicdata;"
but I constantly get a RuntimeException saying: "not suitable driver found"
The help file says that you have to install the driver, but I don't know which one, or whether my connection URL is the problem.
Does anyone have some pointers to help me along the way?
You need a JDBC driver to be able to connect to a MySQL server from Java (which AnyLogic is based on), and you can find one here. After you have installed the driver, you should find it in the list of available JDBC drivers in AnyLogic. The name should be com.mysql.jdbc.Driver if you chose the suggested one.
First you have to download the mysql-connector-java-*.jar file and add it to the AnyLogic project.
Then type com.mysql.jdbc.Driver into the JDBC driver dropdown box. Finally, the connection string should look something like
jdbc:mysql://[host]:[port]/[db]
Another thing I found that might be helpful to others is that you can get the Java connection from the AnyLogic database object with database.getConnection(). This is very useful if you want to run your own queries, e.g. a bulk insert instead of the single insert that AnyLogic provides.

Yii doesn't find PDO MySQL driver

The Yii requirements page says the PDO extension and the MySQL driver work, phpinfo() says that PDO and the MySQL driver are installed, and I have configured the 'db' component in the main config file for my project generated with yiic webapp. I have checked and double-checked that the settings are correct (and yes, I am using MySQL).
I have made a new migration script in /[mywebapp]/protected/migrations and now I'm trying to run the ./protected/yiic migrate command, but I just get an exception:
exception 'CDbException' with message 'CDbConnection failed to open the DB connection: could not find driver'
I have no idea what is wrong. I have been googling for 2 hours now and I find a lot of other users experiencing the same problem, but usually they are missing the drivers or something obvious. Is there anything I'm completely overlooking?
Although the real answer is in the comments on this question, I am posting it here so that it appears as an answer. The yiic migrate command uses the configuration stored in console.php, not the main config file. You need to set your database connection there as well for yiic to use it.