MySQL export to MongoDB

I am looking to export an existing MySQL database table to seed a MongoDB database.
I would have thought this was a well-trodden path, but it appears not to be, as I am coming up blank searching for a simple mysqldump -> MongoDB JSON converter.
It won't take much effort to code up such a conversion utility.

There is a method that doesn't require any software other than the mysql and mongodb utilities. The disadvantage is that you have to go table by table, but in your case you only need to migrate one table, so it won't be painful.
I followed this tutorial. Relevant parts are:
Get a CSV with your data. You can generate one with the following query in mysql.
SELECT [fields] INTO OUTFILE 'user.csv' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' FROM [table];
Finally, import the file using mongoimport.
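For example (the database, collection, and field names below are placeholders; since the CSV produced by INTO OUTFILE has no header row, the field names are passed explicitly):
mongoimport --db mydb --collection users --type csv --file user.csv --fields id,name,email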
That's all

If you're using Ruby, you can also try Mongify.
It will read your MySQL database, build a translation file, and allow you to map the information.
It supports:
Updating internal IDs (to BSON ObjectID)
Updating referencing IDs
Type Casting values
Embedding Tables into other documents
Before filters (to change data manually)
and much much more...
Read more about it at: http://mongify.com/getting_started.html

MongoVue is a new project that includes a MySQL import feature. I have not used that feature myself.

If you are a Mac user you can use MongoHub, which has a built-in feature to import (and export) data from MySQL databases.

If you are using Java you can try this:
http://code.google.com/p/sql-to-nosql-importer/

For a powerful conversion utility, check out Tungsten Replicator.
I'm still looking into this one called SQLToNoSQLImporter, which is written in Java.

I've put a little something up on GitHub - it's not even 80% there, but it's growing for work and it might be something others of you could help me out on!
https://github.com/jaredwa/mysqltomongo

Related

Validation of migrated data for MySQL

I'm migrating a large (approx. 10 GB) MySQL database (InnoDB engine).
I've figured out the migration part. Export -> mysqldump, Import -> mysql.
However, I'm trying to figure out the optimum way to validate if the migrated data is correct. I thought of the following approaches but they don't completely work for me.
One approach could have been using CHECKSUM TABLE. However, I can't use it since the target database would have data continuously written to it (from other sources) even during migration.
Another approach could have been using the combination of MD5(), GROUP_CONCAT, and CONCAT. However, that also won't work for me as some of the columns contain large JSON data.
So, what would be the best way to validate that the migrated data is correct?
Thanks.
How about this?
Do SELECT ... INTO OUTFILE from each old and new table, writing them into .csv files. Then run diff(1) between the files, eyeball the results, and convince yourself that the new tables' rows are an appropriate superset of the old tables'.
These flat files are modest in size compared to a whole database and diff is fast enough to be practical.
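A minimal sketch of that workflow, assuming an orders table with an id primary key (table names and paths are placeholders):
SELECT * INTO OUTFILE '/tmp/orders_old.csv' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' FROM orders ORDER BY id;
Run the same statement against the new database, writing to '/tmp/orders_new.csv', then copy both files to one machine and compare them (the ORDER BY keeps the row order deterministic, so the diff output is meaningful):
diff /tmp/orders_old.csv /tmp/orders_new.csv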

Talend - Process large delimited file

I've got a question regarding how to process a delimited file with a large number of columns (>3000).
I tried to extract the fields with the standard delimited file input component, but creating the schema takes hours, and when I run the job I get an error because the toString() method exceeds the 65535-byte limit. At that point I can run the job, but all the columns are messed up and I can't really work with them anymore.
Is it possible to split that .csv file with Talend? Is there any other way to handle it, maybe with some sort of Java code? If you have any further questions, don't hesitate to comment.
Cheers!
You can create the schema of the delimited file in Metadata, right? I tested 3k columns with some millions of records and it did not even take 5 minutes to load all the column names with data types. Obviously you can't split that file by taking each row as one cell; it could exceed the string length limit in Talend. But you can do it in Java using a BufferedReader.
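A rough sketch of that Java approach (the file names, delimiter, and chunk size are assumptions, a plain split() does not handle quoted delimiters, and you would normally also copy a key column into every chunk so the pieces can be joined back together later):
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Splits a very wide delimited file into several narrower files
// (chunk_0.csv, chunk_1.csv, ...) of at most COLUMNS_PER_CHUNK columns each,
// so each piece stays within Talend's schema limits.
public class WideCsvSplitter {
    private static final String DELIMITER = ";";        // adjust to your file
    private static final int COLUMNS_PER_CHUNK = 1000;  // columns per output file

    public static void main(String[] args) throws IOException {
        List<BufferedWriter> writers = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader("input.csv"))) {
            String header = reader.readLine();
            if (header == null) {
                return; // empty file, nothing to do
            }
            int totalColumns = header.split(DELIMITER, -1).length;
            int chunks = (totalColumns + COLUMNS_PER_CHUNK - 1) / COLUMNS_PER_CHUNK;
            for (int i = 0; i < chunks; i++) {
                writers.add(new BufferedWriter(new FileWriter("chunk_" + i + ".csv")));
            }
            writeSlices(writers, header);
            String line;
            while ((line = reader.readLine()) != null) {
                writeSlices(writers, line);
            }
        } finally {
            for (BufferedWriter w : writers) {
                w.close();
            }
        }
    }

    // Writes each slice of COLUMNS_PER_CHUNK columns to its own output file.
    private static void writeSlices(List<BufferedWriter> writers, String line) throws IOException {
        String[] fields = line.split(DELIMITER, -1);
        for (int i = 0; i < writers.size(); i++) {
            int from = i * COLUMNS_PER_CHUNK;
            int to = Math.min(from + COLUMNS_PER_CHUNK, fields.length);
            StringBuilder slice = new StringBuilder();
            for (int j = from; j < to; j++) {
                if (j > from) {
                    slice.append(DELIMITER);
                }
                slice.append(fields[j]);
            }
            writers.get(i).write(slice.toString());
            writers.get(i).newLine();
        }
    }
}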
To deal with a big delimited file, we need something designed for big data. I think it would be a good choice to load your file into a MongoDB collection using this command, with no need to create a 3k-column collection before importing the file:
mongoimport --db users --collection contacts --type csv --headerline --file /opt/backups/contacts.csv
After that, you can process your data easily using an ETL tool.
See the mongoimport reference.
Maybe you could have a go with uniVocity. It is built to handle all sorts of extreme situations when processing data.
Check out the tutorial and see if it suits your needs.
Here's a simple project which works with CSV inputs: https://github.com/uniVocity/worldcities-import/
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

How to go about updating a MySQL Table from a CSV file every [time interval]?

Firstly, I understand that attempting to do this from MySQL itself is not allowed:
http://dev.mysql.com/doc/refman/5.6/en/stored-program-restrictions.html
When I try to use LOAD DATA INFILE 'c:/data.csv' ..., I get the "LOAD DATA IS NOT ALLOWED IN STORED PROCEDURES" error.
I am a beginner with moving data around MySQL and I realize this may not be a task it was designed to handle. Therefore, what approach should I use to grab data from a CSV file and append it to a table at a regular time interval? (I have researched cron a little bit, but that is for UNIX-like systems only and we are using a Windows-based OS.)
You can run cron-style jobs on Windows as well. I found a couple of links after searching. Please look into these links:
waytocode.com/2012/setup-cron-job-on-windows-server
http://stackoverflow.com/questions/24035090/run-cron-job-on-php-script-on-localhost-in-windows
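For example, on Windows you can wrap the load in a small batch file and register it with the built-in Task Scheduler (all paths, credentials, database, and table names below are placeholders):
REM load_csv.bat - appends c:/data.csv to mydb.mytable
"C:\Program Files\MySQL\MySQL Server 5.6\bin\mysql.exe" -u myuser -pMyPassword mydb -e "LOAD DATA INFILE 'c:/data.csv' INTO TABLE mytable FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'"
Register it to run every hour:
schtasks /create /tn "LoadCsvHourly" /sc hourly /tr "C:\scripts\load_csv.bat"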

How can I turn a CSV file into a web-based SQL database table, without having any database already setup?

Currently, I have a CSV file with data in it. I want to turn it into a SQL table, so I can run SQL queries on it. I want the table to be within a web-based database that others in my organization can also access. What's the easiest way to go from CSV file to this end result? I would appreciate insight on setting up the database and table, giving others access, and getting the data inside. Preferably PostgreSQL, but MySQL is fine too.
How you create the table depends on the number of columns you have. If you have only a few, then do it manually:
CREATE TABLE <table name> (<column name> <column type (e.g. INT or VARCHAR(100))>, <etc.>);
If you have many columns, you can open the CSV file in Excel and use 'SQL Converter for Excel', which will build a create statement for you using your column headings (and autodetect column types too).
Loading data from a CSV is also pretty straightforward:
LOAD DATA INFILE <filepath (e.g. 'C:/Users/<username>/Desktop/test.csv')>
INTO TABLE <table name>
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
(Only use the IGNORE 1 ROWS line if the CSV includes a header row with column names.)
As for a web-based solution: https://cloud.google.com/products/cloud-sql/
That's a relatively open-ended question. A couple of noteworthy pointers off the top of my head:
MySQL allows you to store your data in different formats, one of them being CSV. That's a very straightforward solution if you're happy with it and don't mind a few limitations (see http://dev.mysql.com/doc/refman/5.0/en/csv-storage-engine.html).
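For example, a CSV-backed table can be created like this (note that the CSV storage engine requires every column to be NOT NULL and does not support indexes; the table and column names are placeholders):
CREATE TABLE contacts (
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(100) NOT NULL
) ENGINE = CSV;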
Otherwise you can import your data into a table with a full-featured engine (see other answer(s) for details).
If you're happy with PostgreSQL and are looking for a fully web-based solution, have a look at Heroku.
There are a great many ways to make your data available through web services without accessing the back-end data store directly. Have a look at REST and SOAP for instance.
HTH

How to move data from one SQLite to MySQL with different designs?

The problem is:
I've got a SQLite database which is constantly being updated through a proprietary application.
I'm building an application which uses MySQL, and its database design is very different from the SQLite one.
I then have to copy data from SQLite to MySQL, but it should be done very carefully, as not everything should be moved, tables and fields have different names, and sometimes data from one table goes into two tables (or the opposite).
In short, SQLite should behave as a client to MySQL, inserting what is new and updating the old in an automated way. It doesn't need to update in real time; every X hours would be enough.
A google search gave me this:
http://migratedb.sourceforge.net/
And asking a friend, I got information about the Multisource plugin (SQuirreL SQL) on this page:
http://squirrel-sql.sourceforge.net/index.php?page=plugins
I would like to know if there is a better way to solve the problem or if I will have to make a custom script myself.
Thank you!
I recommend a custom script for this:
If it's not a one-to-one conversion between the tables and fields, tools might not help there. In your question, you've said:
...and sometimes data from one table goes to two tables (or the opposite).
If you only want the differences, then you'll need to build the logic for that unless every record in the SQLite db has timestamps.
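For example, if the SQLite rows did carry a last-modified timestamp, the incremental pull could be as simple as the following (the table and column names are invented, and :last_sync_time would be supplied by your script):
SELECT id, name, updated_at
FROM source_table
WHERE updated_at > :last_sync_time;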
Are you going to be updating the MySQL db at all? If not, are you okay to completely delete the MySQL db and refresh it every X hours with all the data from SQLite?
Also, if you are comfortable with a scripting language (like PHP, Python, Perl, Ruby, etc.), they have APIs for both SQLite and MySQL; it would be easy enough to build your own script, which you can control and customise more easily based on program logic. Especially if you want to run "conversions" between the data from one to the other and not just do a simple mapping.
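To make the idea concrete, here is a minimal sketch of such a script, written in Java with JDBC rather than a scripting language (it assumes the xerial sqlite-jdbc and MySQL Connector/J drivers are on the classpath; every table and column name below is invented for illustration):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Copies rows from a SQLite table into a differently shaped MySQL table.
// "Insert what is new, update the old" is handled with ON DUPLICATE KEY UPDATE,
// which requires a unique key on customers.legacy_id.
public class SqliteToMysqlSync {
    public static void main(String[] args) throws SQLException {
        try (Connection sqlite = DriverManager.getConnection("jdbc:sqlite:/path/to/app.db");
             Connection mysql = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/targetdb", "user", "password")) {

            String select = "SELECT id, full_name FROM people";
            String upsert = "INSERT INTO customers (legacy_id, name) VALUES (?, ?) "
                          + "ON DUPLICATE KEY UPDATE name = VALUES(name)";

            try (Statement st = sqlite.createStatement();
                 ResultSet rs = st.executeQuery(select);
                 PreparedStatement ps = mysql.prepareStatement(upsert)) {
                while (rs.next()) {
                    ps.setLong(1, rs.getLong("id"));
                    ps.setString(2, rs.getString("full_name"));
                    ps.addBatch();
                }
                ps.executeBatch(); // one round trip for the whole batch
            }
        }
    }
}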
I hope I understand you correctly: you want to periodically flush the data stored in a SQLite DB to a MySQL DB. Right?
So this is how I would do it:
Create a cron job which starts the script every x minutes.
Export the data from SQLite into a CSV file (see the sqlite3 example below).
Do a LOAD DATA INFILE to import the CSV data into MySQL.
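A sqlite3 command-line export for the second step might look like this (the database path, query, and output file are placeholders; adjust the FIELDS TERMINATED BY clause of the LOAD DATA statement below to match the delimiter you export with):
sqlite3 -header -csv /path/to/app.db "SELECT col1, col2, col3 FROM source_table;" > /tmp/export.csv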
Code example for LOAD DATA INFILE
LOAD DATA INFILE 'PATH_TO_EXPORTED_CSV'
REPLACE INTO TABLE your_table
FIELDS TERMINATED BY ';'
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@value_column1, @unimportant_value, @value_column2, @unimportant_value, @unimportant_value, @value_column3)
SET diff_mysql_column1 = @value_column1,
    diff_mysql_column2 = @value_column2,
    diff_mysql_column3 = @value_column3;
You can use this kind of query for as many DB tables as you want. You can also rename the variables such as @value_column1.
I'm in a hurry, so that's it for now. Ask if something is unclear.
Greets, Michael