Is there any better way to do the data migration? - mysql

I have written couple of methods to retrieve data from LDap and put it into MySql database. I put those methods in a Listener, so that it executes while deploying the War.
Now this is a one time action. That means, I have to take all the data from Ldap and put those into the MySql DB, and then work on the database tables. I have nothing to do with the LDap data farther.
Is there any better way to do the data migration thing? Since it is a one time work, and once the database is created successfully, there is no need of these methods.
Please Suggest!
Thanks. :)

For migration exercises, look into the Open Source Pentaho Data Integration tool (PDI, or commonly known as Kettle).
There is a slight learning curve, but it's easy to use, and you'll have it forever.

Related

ETL between a MySQL primary Data Store and a MongoDB secondary Data Store

We have a rails app that has a MySQL backend, each client has one DB and the schema is identical. We use a custom gem to change the DB based on the URL of the request (This is some legacy code that we are trying to move away from)
We need to capture some changes from those MySQL databases (Changes in inventory, some order information, etc) transform and store in a single MongoDB database (multitenant data store), this data will be used for analytics at first, but our idea is to move everything there.
There was something in place to do this, using AR callbacks and Rabbit, but to be honest it wasn't working correctly and it looked like it was more trouble to fix it than to start over with a fresh approach.
We did some research and found some tools to do ETL but they are overkill for our needs.
Does anyone have some experience with a similar problem?
Recommendations on how to architect and implement this simple ETL
Pentaho provides change-data-capture option which can solve Data-synchronization problems.
If by Overkill you mean Setup, Configuration, then Yes that is the common problem with ETL tools and PENTAHO is the easiest among them.
If you can provide more details, I'll be glad to provide an elaborate answer.

meteor reporting of data in existing mysql db. how?

I'm trying to make some reports using meteor and raphael js. I have to report data from an existing MySQL database. I do not wish to write to that database. I need only the "R" from CRUD.
I have thought of various manual ways of: exporting .csv files from the MySQL db via the application itself (Limesurvey) and using mongoimport to populate a MongoDB collection, and then do my CollectionName.find() etc in Meteor.
or perhaps some way of exposing REST full endpoints only to consume data, and use the http package for Meteor.
Is there a good clean solution for using existing SQL data in a Meteor JS application?
How can one use pre-existing SQL data?
(I've no problem with duplication in MongoDB, mind you. however it has to be...)
Thank You
You can do it without any duplication completely from inside Meteor, but you will have to jump through a couple of hoops.
Firstly, use the mysql npm package to query the SQL database. Though Meteor provides Npm to require node packages, I find that using meteor-npm is an easier. Then to do the "R"eading form MySQL, create a Meteor.method on your server which queries the MySQL directly.
Then the second problem is that the mysql package is completely asynchronous. Hence, the execution of the SQL query returns value in a call back and by that point, your Meteor.method call would return leaving the client with an undefined. To fix that issue, we can use Future.
There are a couple of ways of smoothing over this step:
Using `meteor-sync-methods
Spinning out your own version from advice from the issue to allow this natively
Use this easy to implement one-time pattern: "fence has already activated -- too late to add writes"
Hope that helps.

How to log mysql database structural changes

I'm working with a project which is using mysql as the database. The application is hosted with many clients and we are doing upgrades for the current live systems often.
There are some instances where the client has change the database structure(adding new tables) and causes some unexpected db crashes.
I need to log all the structural changes which were done at that database, so we can find the correct root cause for that. We can't do it 100% correct with diff tool because it will not show the intermediate changes.
I found http://www.liquibase.org/ tool but seems little bit complex.
Is there any well known technique or a tool to track database structural changes only.
well from mysql studio you can generate all object's schema definition and compare them with your standard schema definition and this way you can compare two database schema...
generate scrips of both database (One is client's Database and One is master copy database) and then compare it using file compare tool would be the best practice according to me because this way you can track which collumn was added, which column was deleted, which index was added like wise without any tool download.
Possiable duplication of Compare two MySQL databases ?
Hope this helps.
If you have an application for your clients to manage these schema changes, you can use a mechanism at application level. If you have a Python and Django-based solution, you could probably use South which provides schema change tracking and rollbacks.

Migrating subsets of production data back to dev

In our rails app we sometimes have db entries created by users that we'd like to make part of our dev environment, without exporting the whole table. So, we'd like to be able to have a special 'dev and testing' dump.
Any recommended best practices? mysqldump seems pretty cumbersome, and we'd like to pull in rails associations as well, so maybe a rake task would make more sense.
Ideas?
You could use an ETL tool like Pentaho Kettle. Once you have initial transformation setup that you want you could easily run it with different parameters in the future. This way you could also keep all your associations. I wrote a little blurb about Pentaho for another question here.
If you provide a rough schema I could probably help you get started on what your transformation would look like.
I had a similar need and I ended up creating a plugin for that. It was developed for Rails 2.x and worked fine for me, but I didn't have much use for it lately.
The documentation is lacking, but it's pretty simple. You basically install the plugin and then have a method to_sql available on all your models. Options are explained in README.
You can try it out and let me know if you have any issues, I'll try to help.
I'd go after it using a Rails runner script. That will allow your code to access the same things your Rails app would, including the database initializations. ActiveRecord will be able to take advantage of the model relationships you've defined.
Create some "transfer" tables in your production database and copy the desired data into those using the "runner" script. From there you could serialize the data, or use a dump tool, since you'll be dealing with a reduced amount of records. Reverse the process in the development environment to move the data into the database.
I had a need to populate the database in one of my apps from remote web logs and wrote a runner script that fired off periodically via cron, ftps the data from my site and inserts the data.

How to synchronize development and production database

Do you know any applications to synchronize two databases - during development sometimes it's required to add one or two table rows or new table or column.
Usually I write every sql statement in some file and during uploading path I evecute those lines on my production database (earlier backing it up).
I work with mySQL and postreSQL databases.
What is your practise and what applications helps you in that.
You asked for a tool or application answer, but what you really need is a a process answer. The underlying theme here is that you should be versioning your database DDL (and DML, when needed) and providing change scripts to be able to update any version of your database to a higher version.
This set of links provided by Jeff Atwood and written by K. Scott Allen explain in detail what this ought to look like - and they do it better than I can possibly write up here: http://www.codinghorror.com/blog/2008/02/get-your-database-under-version-control.html
For PostgreSQL you could use Another PostgreSQL Diff Tool . It can diff two SQL Dumps very fast (a few seconds on a db with about 300 tables, 50 views and 500 stored procedures). So you can find your changes easily and get a sql diff which you can execute.
From the APGDiff Page:
Another PostgreSQL Diff Tool is simple PostgreSQL diff tool that is useful for schema upgrades. The tool compares two schema dump files and creates output file that is (after some hand-made modifications) suitable for upgrade of old schema.
Have scripts (under source control of course) that you only ever add to the bottom off. That combined with regular restores from your production database to dev you should be golden. If you are strict about it, this works very well.
Otherwise I know lots of people use redgate stuff for SQLServer.
Another vote for RedGate SQL Compare
http://www.red-gate.com/products/SQL_Compare/index.htm
Wouldn't want to live without it!
Edit: Sorry, it seems this is only for SQL Server. Still - if any SQL Server users have the same question I'd definitely recommend this tool.
If you write your SQL statements for your development database (which are, I imagine, series of DDL instructions such as CREATE, ALTER and DROP), why don't you keep track of them by recording them in a table, with a "version" index? You will then be able to:
track your version changes
make a small routine allowing the "automatic" update of your production database by sending the recorded instructions to the database.
I really like the EMS tools.
There tools are available for all popular DB's and you have the same user experience for every type of DB.
One of the tools is the DB Comparer.
TOAD
saved many an ass several times in the past. Why do people run sql with no exit strategy?
the redgate one is good also.
Siebel (CRM, Sales, etc. management product) has a built-in tool to align the production database with the development one (dev2prod).
Otherwise, you've got to stick with manually executed scripts.
Navicat has a structure synchronisation wizard that handles this.
I solve this by using Hibernate. It can detect and autocreate missing tables, columns, etc.
You could add some automation to your current way of doing things by using dbDeploy or a similar script. This will allow you to keep track of your schema changes and to upgrade/rollback your schema as you see fit.
Here's a straight linux bash script I wrote for syncing Magento databases... but you can easily modify it for other uses :)
http://markshust.com/2011/09/08/syncing-magento-instance-production-development
DBV - "Database version control, made easy!" (PHP)