how to convert excel sheet into mysql database using etl tools - mysql

I need to load an Excel sheet, taken as input, into a MySQL database table. While searching for tools, I found ETL tools (e.g. the Pentaho Kettle ETL tool). Can anyone say whether such a tool is the right choice, or whether there is another approach to the task?

Sounds like a good option but you have to dig into the subject more yourself.
Take a look at this video tutorial for example (the tool mentioned in the tutorial - Spoon is part of Kettle).

Yes, Pentaho Data Integration, aka Kettle, will allow you to do that, as would any ETL tool for that matter. You may also find specific tools that are simpler to use for your purpose; installing a tool of several hundred MB just for that may be overkill.

Related

Is Alteryx an ETL tool? How it differs from SSIS? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 6 years ago.
My client wants me to implement an ETL process using Alteryx, as they have a license for it. I am not sure whether Alteryx is an ETL tool or not. I believe that Alteryx is commonly used to prepare data for the Tableau data visualization tool.
Please advise whether it's an ETL tool or not. How does it differ from SSIS?
Thanks,
Alteryx is a data preparation / advanced analytics application. People use it in many different ways because it supports data preparation, spatial analytics and predictive analytics.
I work with many clients who choose to use Alteryx purely for its ETL capabilities moving data from one database to another, e.g. I have worked with one client who has used Alteryx to automate their loads into their Amazon Redshift database from MySQL, another who is using SQL -> Tableau data engine, and many other examples involving a range of data inputs (Alteryx supports everything from custom APIs -> Excel).
If you're already working with SSIS then you'll find Alteryx a breath of fresh air, to be honest. I was working with SSIS in a past life and have since found Alteryx to be much faster to develop with. It is more forgiving of changes to data and allows tighter integration of many different data sources. The new in-database tools give a much tighter integration with SQL than was previously possible, allowing the work to be done inside the database.
Finally, compared to SSIS, I think you'll find Alteryx very simple to learn. The online training videos on their site will give you as much introduction as you need.
Enjoy, I think you'll enjoy the experience.
Chris
Alteryx can be used for ETL as long as you have an Alteryx Server. I've used it for a number of use-cases especially between cloud & database.
Some things that in my personal opinion make it clearly superior to SSIS:
If input has column names (from database or from csv file with headers), it handles unexpected new columns or column order changes automatically, without requiring you to change the flows at all.
You can build flows as "macros" which you can then unit test completely independently of your source/destination databases (try that in SSIS..)
Ability to drop a browse tool anywhere in the flow and effectively debug.
Build in assertions using "Test" tools.
Flows are runnable from the command line on a server. The easiest way I've found (besides using Alteryx's own scheduler) is to save the flow as an "App" and then run it from the command line using the Alteryx engine executable, passing it parameters via an XML file. You can save a sample XML parameter file from your flow by hitting the magic wand button (after saving the flow as a .yxwz app). This brings up a panel that lets you set the variables, and that panel has a handy "save" button which generates an XML file in the right format.
Within the flows themselves, parameterise things like environment settings either via action tools or module level parameters (User.*) - you can then for example set a database server on an input using %User.[Your variable name]% in the field.
Error logs are generally excellent (identify the tool that failed, useful error messages), and command line throws useful errorlevel numbers, so pretty trivial to schedule with some third party scheduler (or just use the Alteryx Server's own scheduler).
Obviously if you need to do any serious data manipulation, pivoting etc, then it's hands down the easiest tool I've used.
Yes, Alteryx is an ETL and data wrangling tool, but it does a lot more than pure ETL. Alteryx wraps up pre-baked connectivity options (Experian, Tableau, etc.) alongside a host of embedded features (like data mining, geospatial, data cleansing) to provide a suite of tools within one product.
If all you are looking for is basic a->b ETL mapping, and you don't need the additional features that Alteryx has, a cheaper product like SSIS would tend to be more than sufficient.
Alteryx is a data mining workbench, and ETL is often a big part of the data mining process. Alteryx has plenty of ETL tools/capabilities, and much more too. I haven't used SSIS in ages, certainly not since acquiring Alteryx.
Cate
Alteryx has three basic capabilities: ETL, advanced analytics and reporting.
The part I like best is the advanced analytics, but ETL is also there. So I consider it a complete analytics tool that covers everything from ETL up to reporting. I used to connect it to data stored on magnetic tapes.

Extract Transform Load into MySql

I come from a Microsoft background and have worked mostly with SSIS on ETL-type projects.
Now I have another project on hand that involves loading .csv files into a MySQL database. While being loaded, the data has to go through some transformations before it reaches the destination table, so it is very much an ETL project.
The client doesn't have SSIS (BIDS), so I am compelled to use open source tools.
I did a bit of research and found that the Talend Data Integration tool best fits my situation.
As I am new to this environment, and I am sure there are experts in this area, I need some advice on the best tools for this type of ETL and on best practices.
If you need any further information, please let me know.
If I remember correctly, PhpMyAdmin can import CSV into MySQL, and this question is about a similar topic too, but these don't come close to what SSIS can offer...
Yes, you are right: Talend Open Studio is a pretty good tool with hundreds of connectors.
In your case, just create a job that takes the CSV as its source and MySQL as its destination, apply any transformation if required, and load it.
You can get more information on CSV to MySQL loads, with examples, on the Talend forum.
If you have a basic plan, share it with me and I can guide you on how to transfer the CSV to a MySQL table.
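For comparison, the same source, transformation and destination steps can be sketched without an ETL tool. This is only a minimal sketch under assumptions: the `customers` table, its columns and the `clean_customer` transformation are hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs on its own. With a MySQL driver you would pass that driver's connection and leave `placeholder` at `"%s"`.

```python
import csv
import io
import sqlite3


def load_csv(conn, csv_file, table, columns, transform, placeholder="%s"):
    """Read rows from csv_file, transform each one, and insert into table.

    conn is any DB-API 2.0 connection. placeholder is the driver's
    parameter marker: "%s" for the common MySQL drivers, "?" for sqlite3.
    Table and column names are assumed to come from trusted configuration.
    """
    marks = ", ".join([placeholder] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({marks})"
    rows = []
    for row in csv.DictReader(csv_file):
        clean = transform(row)
        rows.append(tuple(clean[c] for c in columns))
    cur = conn.cursor()
    cur.executemany(sql, rows)
    conn.commit()
    return len(rows)


def clean_customer(row):
    """Hypothetical transformation: trim names, upper-case the country code."""
    return {
        "name": row["name"].strip(),
        "country": row["country"].strip().upper(),
    }


# Demo: an in-memory SQLite database stands in for the MySQL destination.
SAMPLE = "name,country\n Alice ,de\nBob,us\n"
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE customers (name TEXT, country TEXT)")
loaded = load_csv(demo, io.StringIO(SAMPLE), "customers",
                  ["name", "country"], clean_customer, placeholder="?")
print(loaded, demo.execute("SELECT name, country FROM customers").fetchall())
```

Keeping the transformation in its own function means it can be tested without any database at all, which is roughly what an ETL tool's transformation step gives you.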

Automated ETL / Database Migration Solution

I'm looking for an ETL solution that we can create and configure by hand and then deploy to run autonomously. This is basic transformation, so it need not be feature heavy. Key requirements are that it is free or open source software that can be tailored to suit specific needs.
In fact, this could be reduced to a simple DB migration tool that will run on a Linux server. Essentially the same as the above but we probably won't need to validate / transform the data at all besides renaming columns.
I forgot to mention that this is going to have to be very cross platform. I'd like to be able to deploy it to a server, as well as test it on OSX and Windows.
Try Pentaho or Talend. Pentaho has a nice job-scheduling paradigm as well as the ETL workbench (Kettle). I haven't used Talend, but I've heard good things and I imagine it carries similar functionality.
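If the job really does reduce to copying tables while renaming columns, a small cross-platform script may be enough. The sketch below assumes DB-API 2.0 connections on both sides; the table and column names are hypothetical, and two in-memory SQLite databases stand in for the real source and destination servers.

```python
import sqlite3


def copy_table(src, dst, src_table, dst_table, rename, placeholder="?"):
    """Copy rows from src_table to dst_table, renaming columns on the way.

    rename maps source column names to destination column names.
    src and dst are DB-API 2.0 connections; identifiers are trusted input.
    placeholder is the destination driver's parameter marker.
    """
    src_cols = list(rename)
    dst_cols = [rename[c] for c in src_cols]
    select = f"SELECT {', '.join(src_cols)} FROM {src_table}"
    marks = ", ".join([placeholder] * len(dst_cols))
    insert = f"INSERT INTO {dst_table} ({', '.join(dst_cols)}) VALUES ({marks})"
    read_cur = src.cursor()
    read_cur.execute(select)
    write_cur = dst.cursor()
    copied = 0
    while True:
        batch = read_cur.fetchmany(500)  # stream in batches, not all at once
        if not batch:
            break
        write_cur.executemany(insert, batch)
        copied += len(batch)
    dst.commit()
    return copied


# Demo with two in-memory SQLite databases (stand-ins for the real servers).
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE kunden (kd_nr INTEGER, kd_name TEXT)")
old.executemany("INSERT INTO kunden VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
copied = copy_table(old, new, "kunden", "customer",
                    {"kd_nr": "id", "kd_name": "name"})
print(copied, new.execute("SELECT id, name FROM customer ORDER BY id").fetchall())
```

A script like this runs unchanged on Linux, OSX and Windows, but it gives you none of the scheduling, monitoring or connectors that Pentaho or Talend provide.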

ETL Tool for transferring old Firebird Database to a new organized Firebird Database

After looking at a lot of questions, I found no real answer for this.
I redesigned a database for our customer.
In Microsoft Access I found a good tool for getting old table data into my new, well-formed database structure. It is really easy but takes a lot of time (because the old data has to be handled with a lot of care).
Are there any open source tools that offer the same facilities as Microsoft Access?
To clear it up: I "just" want to reorganize old Firebird database data in a new "best-practice" way.
Edit:
It would be really nice if I could get a log file or something similar to have some documentation of the changes.
Update:
After checking some of the tools on that Wikipedia page, I found no real logging mechanism.
How do you document the changes to a database? Simply by writing them down?
Result:
So I didn't get a real answer; I am still searching for a nice tool. Thank you for the hints and your thoughts regarding this question. I want to reward Kenneth Cochran with the bounty because he pointed me to ETL. Thank you!
Talend's Open Source ETL supports FireBird. Very cool tool.
http://www.talend.com/download.php?src=DataGovernanceBlog
It sounds like what you're asking for is an ETL (extract, transform, load) tool.
Wikipedia has a list of open source tools that may help with this. I've not used any of them personally.
Well, I used the Pentaho suite for doing ETL using their Kettle tool.
It's quite easy to use and should be more than enough to reach your intent.
And it's open source.
Have a look at it.
I advise you to use a tool like IBExpert or Database Workbench, which are the best tools for Firebird.
For migrating from Firebird 1.5 to Firebird 2.1, you just have to make a backup of your database with the Firebird 1.5 server and restore it with the Firebird 2.1 server.
I've used Excel in the past to document data model changes - each worksheet used the application version in order to sync with our tags in CVS. Every thing was logged in it - columns that were removed as well as minor alterations to datatypes like varchar(10) to varchar(20) etc along with a note describing why the change was made.
Personally, I've only ever scripted things like these as DDL/DML scripts broken into a script that dealt with table creation, constraint dropping, index drops, DML script(s), constraint application, index application, and removing orphaned tables.
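One way to get both the scripted approach and the log file asked for in the question is to run the scripts through a small runner that records each step in a changelog table. This is only a sketch under assumptions: the step names, the SQL statements and the `schema_changelog` table are hypothetical, and SQLite stands in for Firebird so the example runs on its own.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical steps: structure first, then data, then orphaned tables.
STEPS = [
    ("001_create_customer",
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_copy_customer",
     "INSERT INTO customer (id, name) SELECT kd_nr, kd_name FROM old_kunden"),
    ("003_drop_old_table", "DROP TABLE old_kunden"),
]


def apply_steps(conn, steps):
    """Run each step once, in order, recording it in schema_changelog."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_changelog "
        "(step TEXT PRIMARY KEY, statement TEXT, applied_at TEXT)"
    )
    done = {row[0] for row in conn.execute("SELECT step FROM schema_changelog")}
    applied = []
    for name, statement in steps:
        if name in done:
            continue  # already applied on an earlier run
        conn.execute(statement)
        conn.execute(
            "INSERT INTO schema_changelog VALUES (?, ?, ?)",
            (name, statement, datetime.now(timezone.utc).isoformat()),
        )
        conn.commit()
        applied.append(name)
    return applied


# Demo: an in-memory database with one legacy table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_kunden (kd_nr INTEGER, kd_name TEXT)")
conn.execute("INSERT INTO old_kunden VALUES (1, 'Alice')")
first_run = apply_steps(conn, STEPS)
second_run = apply_steps(conn, STEPS)  # nothing left to do
print(first_run, second_run)
```

The changelog table then doubles as the documentation: it lists which statement ran and when, and it can be exported alongside the scripts kept in version control.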
If you want a basic ETL tool that is client based (and cheap at $300), look at Advanced Query Tool. It mainly queries any type of ODBC connection (including Excel files set up that way), but also has some extended features, including moving data, and it has a command line interface. http://www.querytool.com/
I've used it instead of Informatica for one-off jobs, but I've also used it to extract from Excel to another file for business users, for a few months, scheduled from my desktop.

Easy data conversion tool

I often have data in Excel or text that I need to get into SqlServer. I can use ODBC to query the Excel file and I can parse the text file. What I want though is some tool that will just grab the data and put it into tables with little / no effort. Does anyone know of such a tool?
Have you tried the SQL Server Import/Export Wizard ?
In SQL Server Management Studio, right-click your Database Name, and select Tasks menu, Import Data. For Data Source, select Microsoft Excel, browse to the .XLS...
If you are using Sql Server look at Integration Services (SSIS).
You can also take a look at parse-o-matic
Use DTS or SSIS depending on which version of SQL Server you have. There is an import wizard which can get you started, but data imports are rarely simple and usually involve some sort of data cleanup so that your incoming data is acceptable to the table where you intend to store it. Excel data, in my experience, is usually particularly bad in this respect because it often isn't stored properly in Excel to begin with.
I haven't seen commercial tools that do this. I create tools of this kind at work all the time, and the data validation is not trivial. The validation is what makes sure that you don't have bad data making it into your database.
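To give an idea of what such validation involves, here is a minimal sketch that checks each incoming row and separates clean rows from rejected ones. The column names (`name`, `amount`, `date`) and the rules are hypothetical; real imports usually need many more checks.

```python
from datetime import datetime


def validate_row(row):
    """Return a list of problems found in one imported row (empty if clean)."""
    problems = []
    if not row.get("name", "").strip():
        problems.append("name is empty")
    try:
        if float(row.get("amount", "")) < 0:
            problems.append("amount is negative")
    except ValueError:
        problems.append("amount is not a number")
    try:
        datetime.strptime(row.get("date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("date is not YYYY-MM-DD")
    return problems


def split_rows(rows):
    """Separate clean rows from rejected ones, keeping the reasons."""
    good, bad = [], []
    for number, row in enumerate(rows, start=1):
        problems = validate_row(row)
        if problems:
            bad.append((number, problems))
        else:
            good.append(row)
    return good, bad


rows = [
    {"name": "Alice", "amount": "12.50", "date": "2020-01-31"},
    {"name": "", "amount": "abc", "date": "31/01/2020"},
]
good, bad = split_rows(rows)
print(len(good), bad)
```

Collecting every problem per row, rather than stopping at the first, makes it easier to send the rejected rows back to whoever maintains the spreadsheet.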
I found that for simple data conversion needs something like FileHelpers is pretty good. It still needs programming though. This framework is fairly easy to use, and somebody with a little bit of experience could bang something out for you.
On further thought, you can use the SQL Server bcp utility to upload the contents of a text file. This is a command-line utility and has a lot of switches. I would suggest you experiment on a test table before you use this in a production table.
It's been a while since I used it, so I can't remember if you can directly use an Excel spreadsheet. Text files are always the easiest to deal with in any case.
Seems like it'd be pretty easy to write a script that reads the text file and converts it to "INSERT INTO table ... VALUES (...)" SQL statements. I suspect this has already been done, but a simple implementation would be less than 100 lines of code in your favorite scripting language.
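A minimal sketch of such a script in Python, assuming a comma-separated file whose first row holds the column names. The table name `people` is hypothetical, every value is written as a quoted string (leaving type conversion to the database), and empty fields are treated as NULL, which is an assumption rather than a rule.

```python
import csv
import io


def sql_literal(value):
    """Render a text field as a SQL string literal; empty fields become NULL."""
    if value == "":
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def csv_to_inserts(lines, table):
    """Yield one INSERT statement per data row; the first row names the columns."""
    reader = csv.reader(lines)
    header = next(reader)
    columns = ", ".join(header)
    for row in reader:
        values = ", ".join(sql_literal(field) for field in row)
        yield f"INSERT INTO {table} ({columns}) VALUES ({values});"


# Demo on an in-memory file; a real run would pass an open text file instead.
sample = io.StringIO("id,name\n1,O'Brien\n2,\n")
for statement in csv_to_inserts(sample, "people"):
    print(statement)
```

The quoting here only doubles single quotes, so it suits trusted, one-off files. For anything recurring, parameterised inserts or a bulk loader are the safer route.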
Hey, Google says SQLServer comes with such a tool, BULK INSERT: