Importing thousands of text files into a database - MySQL

I am pretty new to databases and need help. I have n (large) files, each of which contains m (a very large number of) lines of numeric text data. What is the best way to import those files into a MySQL database, particularly with regard to the names of the fields?

Usually one would write a script in Perl (or whatever scripting language you prefer that offers MySQL support) and process the files one after another, applying whatever processing is necessary to the files and the lines inside them; a minimal sketch of this approach follows below. If you would like a more specific answer, ask a more specific question.
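Here is a rough sketch of such a script in Python (assuming tab-separated numeric files and an existing table; the file pattern, table name, and column names are placeholders, not taken from the question):

    import glob
    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(user="user", password="secret", database="mydb")
    cur = conn.cursor()

    for path in glob.glob("data/*.txt"):      # process files one after another
        with open(path) as f:
            rows = [line.rstrip("\n").split("\t") for line in f if line.strip()]
        cur.executemany(
            "INSERT INTO measurements (col_a, col_b, col_c) VALUES (%s, %s, %s)",
            rows,
        )
        conn.commit()                         # one commit per file

    conn.close()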

If you only need to do it once, or if the import process remains fairly similar each time, I would recommend using Pentaho's ETL software, commonly referred to as Kettle (officially Pentaho Data Integration). While this software is far from perfect, I've found that I can often import data in a fraction of the time I would have to spend writing a script for one specific file. You can select a text file input, specify the delimiters, fixed widths, etc., and then simply export directly into your SQL server (it supports MySQL, SQLite, Oracle, and many more).
If you would like to research other software of this type, it's often referred to as ETL software, short for Extract, Transform, Load.
If you're familiar with Python, I would also recommend the last post on this page.

Related

Transferring FileMaker DB data to MySQL DB

My office has a FileMaker database which they asked me to replace with a MySQL one. The MySQL one is now set up and running, but it doesn't have the exact same structure as the FileMaker one (they asked for more things to be added, redundant things were left out, etc.).
I've seen that the FileMaker data can be exported as .xml files; could I use those to populate the MySQL database?
If so: I've only ever used Cooktop, but I'm currently on Mac OS X 10.6/Lubuntu. Is there maybe an equivalent (free) piece of software that could do that?
All suggestions are welcome.
Thanks
I can add some information about the various export formats that FileMaker provides. I've done extensive research and testing on this topic.
Below, you'll see a chart with all of the formats that FileMaker offers along the top. Along the side, you'll see various features of these file formats that are unique to FileMaker when exporting. Some are limitations of the FileMaker export process and others are general pros and cons of the format itself.
I'll explain them briefly:
Headers: column labels are exported
Delimiter: the type of separator symbol used
UTF 8/16: "yes" if either of these encodings is available; this can matter for special characters or some languages
Only 1 format: only one type of encoding is available
Other encoding: a list of all encoding options
Can be imported: FileMaker allows importing this format (not important for this question)
Future proof: according to Wikipedia, the format is still widely used and actively maintained
Open standard: the format is an open standard
Size: the size of the file produced when exporting one of our tables
I would recommend also considering some of these factors when deciding which format will work for you. It will depend on the contents and type of your data.
MySQL is just the backend database, so you need a UI to perform the import. You could use FileMaker for this as well, if you set up the MySQL database as an ESS source. If you do this, then you can use familiar import steps in FileMaker to populate your new database.
This may be what the previous answer mentions, but just to distinguish between the two: the ODBC insert via Execute SQL is limited, while External SQL Sources (ESS) give you a native UI in FileMaker.
If the FileMaker database is hosted on a server, you could set up an ODBC link to the MySQL database. You could then create a script in FileMaker to loop through the data, creating rows in MySQL with only the columns you are looking to populate.
Other than that, you can export the data from FileMaker into many other formats, including TAB, CSV, Excel, and XML, and push it into MySQL; a sketch of that last step follows below.
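For the CSV route, a rough sketch of the "push it into MySQL" step in Python — the file, table, and column names here are assumptions, and since the new schema differs from the old one, you would remap columns as needed:

    import csv
    import mysql.connector

    conn = mysql.connector.connect(user="user", password="secret", database="office")
    cur = conn.cursor()

    with open("filemaker_export.csv", newline="") as f:
        for row in csv.reader(f):
            # Keep only the columns that exist in the new schema;
            # the indices depend on your FileMaker export order.
            cur.execute(
                "INSERT INTO contacts (name, email) VALUES (%s, %s)",
                (row[0], row[2]),
            )

    conn.commit()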

How to migrate existing database from Domino Server to Relational database (MySQL)

Is there any good way to migrate an existing database from a Domino server to a relational database like MySQL without using any tool?
I've explored this a bit and learned that it's possible using XML, but I don't know how, or what the procedure would be.
Any help would be appreciated.
Without using any tool: NO.
There are two big difficulties in exporting data:
The first is Notes rich text, which is a proprietary format that has to be "transcoded" somehow. This is not an easy thing to do "manually" and needs either a lot of coding or some kind of tool.
The second is the fact that there is no "forced" structure in Notes documents. There can be several forms that "define" how the documents look, and there can be different versions of these forms that have been used over time. A document may or may not contain any number of fields of any thinkable type (a field may even be a number in one document and text in another).
You have to KNOW the structure of your documents to get them out. Of course you can simply export them as "Structured Text" or as "Comma separated values" to get most of it, but then you need views that show the documents in the order you need them. Exporting them as XML is another "standard" way to get the data, but then you need to understand the XML to get it into your relational database; a rough sketch of that step follows below.
In short: without at least a little coding knowledge OR a tool (that costs money), there is no chance of getting the data out.
Ah yes, there is an "ODBC driver" for Lotus Notes / Domino, but that will not help you much: if you do not know the structure of your documents and how Notes databases work, it will not work either.
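To give an idea of what "understanding the XML" means in practice, here is a very rough Python sketch. The element names and fields are assumptions about what a Notes XML (DXL) export might contain — real exports are messier (namespaces, rich text, varying forms), which is exactly the point made above:

    import xml.etree.ElementTree as ET
    import mysql.connector

    conn = mysql.connector.connect(user="user", password="secret", database="notes_data")
    cur = conn.cursor()

    tree = ET.parse("export.xml")
    for doc in tree.iter("document"):              # one Notes document
        # Map item names to their text values; types may vary per document.
        fields = {item.get("name"): (item.findtext("text") or "")
                  for item in doc.iter("item")}
        cur.execute(
            "INSERT INTO documents (subject, body) VALUES (%s, %s)",
            (fields.get("Subject"), fields.get("Body")),
        )

    conn.commit()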
As Torsten said above, you can't do it without a tool; either you buy one or write one yourself.
I wrote a tool like that several years ago to export Notes databases as XML. It is a bit of work, especially with the rich text fields. You may also want to export/detach attachments and embedded images.
You can read more about my export tool here: http://www.texasswede.com/websites/texasswede.nsf/Page/Notes%20XML%20Exporter

How to convert data stored in XML files into a relational database (MySQL)?

I have a few XML files containing data for a research project which I need to run some statistics on. The amount of data is close to 100GB.
The structure is not so complex (it could be mapped to perhaps 10 tables in a relational model), and given the nature of the problem, this data will never be updated again; I only need it available in a place where it's easy to run queries on it.
I've read about XML databases and the possibility of running XPath-style queries on them, but I have never used them and I'm not so comfortable with the idea. Having the data in a relational database would be my preferred choice.
So, I'm looking for a way to convert the data stored in XML into a relational database (think of a big .sql file similar to the one generated by mysqldump, but anything else would do).
The ultimate goal is to be able to run SQL queries for crunching the data.
After some research I'm almost convinced I have to write it on my own.
But I feel this is a common problem, and therefore there should be a tool which already does that.
So, do you know of any tool that would transform XML data into a relational database?
PS1:
My idea would be something like this (it could work differently, but just to make sure you get my point):
Analyse the data structure (based on the XML files themselves, or on an XSD)
Build the relational database (tables, keys) based on that structure
Generate SQL statements to create the database
Generate SQL statements to fill in the data
PS2:
I've seen some posts here on SO, but I still couldn't find a solution.
Microsoft's "Xml Bulk Load" tool seems to do something in that direction, but I don't have a MS SQL Server.
Databases are not the only way to search data. I can highly recommend Apache Solr.
Strategies to implement search on XML files:
Keep your raw data as XML and search it using the Solr index.
Importing XML files of the right format into a MySQL database is easy:
https://dev.mysql.com/doc/refman/5.6/en/load-xml.html
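For instance, a sketch of what that looks like from Python (the table and file names are made up; LOAD XML expects rows shaped like <row> elements whose attributes or children match your columns, and LOCAL INFILE must be enabled on both client and server):

    import mysql.connector

    conn = mysql.connector.connect(user="user", password="secret",
                                   database="research", allow_local_infile=True)
    cur = conn.cursor()
    cur.execute(
        "LOAD XML LOCAL INFILE 'data.xml' "
        "INTO TABLE measurements ROWS IDENTIFIED BY '<row>'"
    )
    conn.commit()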
This means you typically have to transform your XML data into that kind of format. How you do this depends on the complexity of the transformation, the programming languages you know, and whether you want to use XSLT (which is most probably a good idea).
From your former answers it seems you know Python, so http://xmlsoft.org/XSLT/python.html may be the right place for you to start.
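As a quick illustration of the XSLT step (using lxml here, a common alternative to the libxslt bindings linked above; the stylesheet and file names are placeholders):

    from lxml import etree

    transform = etree.XSLT(etree.parse("to_mysql_rows.xsl"))
    result = transform(etree.parse("input.xml"))
    result.write_output("rows.xml")   # feed this to LOAD XML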
Take a look at StAX instead of XSD for analyzing/extracting the data. It's stream-based and can deal with huge XML files.
If you feel comfortable with Perl, I've had pretty good luck with the XML::Twig module for processing really big XML files.
Basically, all you need is to set up a few twig handlers and import your data into MySQL using DBI/DBD::mysql.
There is a pretty good example on xmltwig.org.
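The same streaming idea sketched in Python, for comparison (this is not the xmltwig.org example; the element and column names are assumed). Like twig handlers or StAX, iterparse keeps memory flat, which matters at 100GB:

    import xml.etree.ElementTree as ET
    import mysql.connector

    conn = mysql.connector.connect(user="user", password="secret", database="research")
    cur = conn.cursor()

    for event, elem in ET.iterparse("big.xml", events=("end",)):
        if elem.tag == "record":
            cur.execute(
                "INSERT INTO records (id, value) VALUES (%s, %s)",
                (elem.get("id"), elem.findtext("value")),
            )
            elem.clear()   # drop the subtree we just handled

    conn.commit()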
If you are comfortable with commercial products, you might want to have a look at Data Wizard for MySQL by the SQL Maestro Group.
This application is targeted especially at exporting and, of course, importing data from/to MySQL databases, and this includes XML import. You can download a 30-day trial to check whether this is what you are looking for.
I have to admit that I have not used their MySQL product line yet, but I had a good user experience with their Firebird Maestro and SQLite Maestro products.

Extensible DB editor

I'm using MySQL.
In my DB there are several tables containing fields with data serialized in a custom binary format. (Actually, these fields contain lists of fixed-format records, like a "sub-table".)
I need a tool that lets me edit those fields by hand while my own fancy data administration UI is still in development.
I wonder if there is a DB viewer/editor (like phpMyAdmin or Sequel Pro or whatever) that I could easily extend to deserialize that extra data.
Note that the [de]serialization library is in plain C and I do not want to spend much time rewriting it in another language. (I would rather spend that time on the data administration UI.)
Any clues?
P.S. I need the editor to work on OS X or Ubuntu (Wine is fine) or be web-based.
Sequel Pro is open source, so you can probably get the sources and hack your code in there.
Get it here.
This is a Java app: http://www.isqlviewer.com/ . You could load your C library into it using JNI. I've used iSQLViewer a lot with various databases, and the download comes with source code, but I can't say I've ever looked at the code!
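Another route, since a web-based editor is acceptable: wrap the existing C library from Python with ctypes instead of rewriting it. This is only a sketch; the library name and function signature below are invented for illustration:

    import ctypes

    lib = ctypes.CDLL("./libmyformat.so")          # your existing C library
    # Hypothetical C function: char *deserialize_to_text(char *buf, size_t len)
    lib.deserialize_to_text.restype = ctypes.c_char_p
    lib.deserialize_to_text.argtypes = [ctypes.c_char_p, ctypes.c_size_t]

    def decode_field(blob: bytes) -> str:
        """Render one serialized column value as editable text."""
        return lib.deserialize_to_text(blob, len(blob)).decode("utf-8")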

Easy data conversion tool

I often have data in Excel or text files that I need to get into SQL Server. I can use ODBC to query the Excel file and I can parse the text file. What I want, though, is some tool that will just grab the data and put it into tables with little or no effort. Does anyone know of such a tool?
Have you tried the SQL Server Import/Export Wizard?
In SQL Server Management Studio, right-click your database name, select the Tasks menu, then Import Data. For Data Source, select Microsoft Excel and browse to the .XLS...
If you are using SQL Server, look at Integration Services (SSIS).
You can also take a look at parse-o-matic
Use DTS or SSIS depending on which version of SQL Server you have. There is an import wizard which can get you started, but data imports are rarely simple and usually involve some sort of data cleanup so that your incoming data is acceptable to the table where you intend to store it. Excel data, in my experience, is usually particularly bad in this respect because it often isn't stored properly in Excel to begin with.
I haven't seen commercial tools that do this. I create this kind of tool at work all the time, and the data validation is not trivial; it just makes sure that you don't get bad data into your database.
I found that for simple data conversion needs, something like FileHelpers is pretty good. It still needs programming, though. The framework is fairly easy to use, and somebody with a little bit of experience could bang something out for you.
On further thought, you can also use the SQL Server bcp utility to upload the contents of a text file. This is a command-line utility and has a lot of switches. I would suggest you experiment on a test table before you use it on a production table.
It's been a while since I used it, so I can't remember whether you can use an Excel spreadsheet directly. Text files are always the easiest to deal with in any case.
It seems like it'd be pretty easy to write a script that reads the text file and converts it to "INSERT INTO table" SQL statements; a bare-bones sketch follows below. I suspect this has already been done, but a simple implementation would be less than 100 lines of code in your favorite scripting language.
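A bare-bones version of such a script in Python (tab-delimited input assumed; the table and file names are placeholders):

    # Reads a tab-delimited text file and prints one INSERT per line.
    with open("data.txt") as f:
        for line in f:
            values = line.rstrip("\n").split("\t")
            # Naive quoting: fine for a one-off import, not for untrusted data.
            quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
            print(f"INSERT INTO MyTable VALUES ({quoted});")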
Hey, Google says SQL Server comes with such a tool, BULK INSERT:
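A minimal illustration of the syntax, with hypothetical table and file names (check the SQL Server docs for the full set of options):

    BULK INSERT MyTable
    FROM 'C:\data\input.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');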