I'm looking for an open-source tool to replicate MySQL to Hadoop. I found two options, but:
Sqoop, Flume: no real-time support for UPDATE and DELETE
Tungsten: closed source and paid
What other tools meet that requirement?
To the best of my knowledge, Kafka can be useful for your case.
kafka-mysql-connector is a plugin that lets you easily replicate MySQL changes to Apache Kafka, and from Kafka you can load the data into HDFS or Hive.
For a MySQL->Kafka solution based on Kafka Connect, check out the excellent Debezium project.
http://debezium.io/
For a MySQL->Kafka solution that is a standalone application, check out the excellent Maxwell project, upon which this connector was based.
http://maxwells-daemon.io/
Hope this helps.
(Note: I have not used this solution myself, but you can give it a try.)
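To make the Debezium route concrete, here is a hedged sketch of a connector configuration (hostnames, credentials, and names are placeholders, not taken from the question) that you would POST to the Kafka Connect REST API at its `/connectors` endpoint:

```json
{
  "name": "mysql-cdc-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql-host",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "secret",
    "database.server.id": "184054",
    "database.server.name": "mydb-server",
    "database.include.list": "mydb"
  }
}
```

Debezium then tails the MySQL binlog and writes one topic per table (e.g. `mydb-server.mydb.orders`), which a sink connector such as the HDFS sink can consume.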
Related
I am trying to learn Kafka and I want to set up a small pipeline on my PC that will take data from MySQL and push it into a Kafka cluster.
From the small amount of research I did, it looks like the best way to do this is a Kafka connector.
Can someone please help with tips or information about how to implement the above logic?
The easiest way would be to use the JDBC Source connector to pull data from your MySQL server into a Kafka topic for further processing. See Confluent's JDBC source connector documentation, and review the various configuration options the connector offers to better align your data for subsequent processing.
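As a rough sketch (database name, credentials, and topic prefix are placeholders), a standalone JDBC source connector configuration might look like:

```properties
name=mysql-jdbc-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:mysql://localhost:3306/mydb
connection.user=kafka
connection.password=secret
mode=incrementing
incrementing.column.name=id
topic.prefix=mysql-
```

Here `mode=incrementing` picks up new rows by a strictly increasing `id` column; use `timestamp` or `timestamp+incrementing` if you also need to capture updated rows.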
How to upgrade a production DB schema from a dev DB schema automatically, via command line? The dev version has changes to the schema that need to be made in the production schema, but I cannot lose the data in production.
Schema Migrations
Most modern projects use a tool to track each individual change to the database, and associate some version number with the change. The database must also have some table to store its current version. That way the tool can query the current version and figure out which (if any) changes to apply.
There are several free tools to do this, like:
Liquibase
Flyway
Rails Migrations
Doctrine Migrations
SQLAlchemy Migrate
All of these require that you write meticulous code files for each change as you develop. It would be hard to reverse-engineer a project if you haven't been following the process of creating schema change code all along.
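With Flyway, for instance, each change is a plain SQL file whose version number is encoded in the filename (the table and column here are purely illustrative):

```sql
-- db/migration/V2__add_email_to_users.sql
ALTER TABLE users ADD COLUMN email VARCHAR(255);
```

Running `flyway migrate` compares the versions recorded in Flyway's schema history table against the files on disk and applies only the pending ones, so dev and production converge without losing production data.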
There are tools like mysqldbcompare that can help you generate the minimal ALTER TABLE statements to upgrade your production database.
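As a hedged sketch (server addresses and schema names are placeholders), mysqldbcompare can be pointed at both servers and asked to emit the ALTER statements for production:

```shell
# Print the SQL needed to make the schema on server2 (production)
# match the schema on server1 (dev); nothing is executed.
mysqldbcompare --server1=dev_user@dev-host \
               --server2=prod_user@prod-host \
               --changes-for=server2 --difftype=sql \
               appdb:appdb
```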
There is also a newer free tool called Shift (I work with the engineer who created it), which helps automate the process of upgrading your database. It even provides a nice web interface for entering your schema changes, running them as online changes, and monitoring their progress. But it requires quite a lot of experience to use, so I wouldn't recommend it for a beginner.
You can also use MySQL Workbench (https://dev.mysql.com/downloads/workbench/) if you want an easy GUI; it's also cross-OS compatible.
I'm looking into setting up FlywayDB as our migration toolkit for our webapp, however there are some migrations (such as adding a column) on large tables (90 million rows) that take many minutes to run.
Usually when this is the case we use Percona Toolkit to run the schema change as it allows the application to continue running and not block incoming queries. So my question is if there a way to run FlywayDB migrations through Percona Toolkit or something similar? I have been unable to find much if any real documentation on such a situation.
There is no direct integration with Percona's online schema change tool from external sources. You would have to code hooks into FlywayDB to execute pt-osc for you during deployments/migrations, or write a plugin for pt-osc that reads FlywayDB files.
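For reference, a typical pt-online-schema-change invocation looks like the sketch below (host, database, table, and column names are placeholders); a Flyway hook would essentially have to shell out to something like this instead of running the ALTER directly:

```shell
# Rebuilds big_table in the background and atomically swaps it in,
# so the ALTER does not block reads/writes. Replace --execute with
# --dry-run to test first.
pt-online-schema-change \
  --alter "ADD COLUMN email VARCHAR(255)" \
  --ask-pass --execute \
  D=appdb,t=big_table,h=prod-host,u=admin
```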
Does anyone know an efficient way to migrate from dashDB to MySQL? I have a database schema on dashDB containing a number of tables that I want to migrate to MySQL.
The only way I can think of is exporting each table to CSV and then importing it into MySQL, but I'm sure there is a better way to do this. Any suggestions?
The IBM DataWorks service on Bluemix is designed for this exact purpose.
After creating a free Bluemix account, find the service in the catalog and create an instance of it. You can try the service under the 'starter' level plan for free. A getting started guide can be found in the Bluemix Docs.
Hope this helps.
So the question is simple: how do you make a query from Oracle to MySQL, and the other way around? ODBC is out of the question due to slow performance.
Disclaimer: I am the creator of the MySQL DataController plugin.
As for Oracle to MSSQL, you can create database links for such queries. There is plenty of documentation on how to create a database link on Oracle using the MSSQL driver; just Google "database link from Oracle to SQL Server".
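As a hedged sketch (the link name, credentials, and TNS alias are placeholders, and it assumes an Oracle Database Gateway such as DG4MSQL is already configured), a database link and a cross-database query look like:

```sql
CREATE DATABASE LINK mssql_link
  CONNECT TO "mssql_user" IDENTIFIED BY "password"
  USING 'dg4msql_alias';

-- Query the remote SQL Server table through the link:
SELECT * FROM orders@mssql_link;
```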
If you want to do the inverse, you can use the plugin that we wrote for MySQL. The plugin uses FreeTDS, an open-source project used to communicate with MSSQL. We have written a blog post about the MySQL plugin; if you need help compiling it, we can help you out.
See the following link for a short video and blog post about the plugin:
http://www.acentera.com/mysql-datacontroller/
There is a MySQL client library driver for Oracle:
https://docs.oracle.com/database/121/DRDAA/mysql_driver.htm
With regard to MySQL, you may be able to use the FEDERATED MySQL storage engine.
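A minimal sketch of a FEDERATED table (host, credentials, and table names are placeholders; note the FEDERATED engine is disabled by default and must be enabled with `--federated` or in my.cnf):

```sql
CREATE TABLE remote_orders (
  id INT NOT NULL,
  total DECIMAL(10,2),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://user:pass@remote-host:3306/shop/orders';
```

Queries against `remote_orders` are transparently forwarded to the `orders` table on `remote-host`.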
This may also be useful:
"The DataController project is being designed in order to facilitate and provide an easy database integration between MS SQL, Oracle and MySQL databases. It is being designed to provide real-time, performant replication between MySQL and other databases. The code is originally derived from MySQL."
https://launchpad.net/datacontroller