How to process Excel files stored in an image data type column using an SSIS package?

I have a .NET WebForms front end that allows admin users to upload two .xls files for offline processing. As these files will be used for validation (and aggregation), I store them in an image column in a table.
My ultimate goal is to create an SSIS package that will process these files offline. Does anyone know how to use SSIS to read a blob from a table into its native (in this case .xls) format for use in a Data Flow task?

In my (admittedly limited) experience with SSIS, it is quite good at rapidly getting something up and running, but frustratingly limited in getting something that "feels" like the most elegant, efficient solution to a programmer.
Since the Excel Source Editor seems to take only files as input, you need to give it a file or reimplement its functionality in code that can take a blob. I understand that this is unsatisfying, but in the end, this is a time-saving tool.
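A common workaround is a Script Task ahead of the Data Flow that streams the blob out to a temporary file, then points the Excel connection manager at that file through a package variable (set the connection manager's ConnectionString by expression). A minimal C# sketch; the table, column, and variable names are assumptions, not from the original post:

    // Script Task sketch: export the uploaded .xls blob to a temp file so
    // the Excel Source can read it. dbo.UploadedFiles / FileData are
    // placeholder names; adjust to your schema.
    using System;
    using System.Data.SqlClient;
    using System.IO;

    public void Main()
    {
        string tempPath = Path.Combine(Path.GetTempPath(), "upload.xls");

        using (var conn = new SqlConnection(
            Dts.Variables["User::ConnString"].Value.ToString()))
        using (var cmd = new SqlCommand(
            "SELECT FileData FROM dbo.UploadedFiles WHERE FileId = @id", conn))
        {
            cmd.Parameters.AddWithValue("@id", Dts.Variables["User::FileId"].Value);
            conn.Open();
            byte[] blob = (byte[])cmd.ExecuteScalar();
            File.WriteAllBytes(tempPath, blob);
        }

        // The Excel connection manager picks this path up via an expression.
        Dts.Variables["User::ExcelFilePath"].Value = tempPath;
        Dts.TaskResult = (int)ScriptResults.Success;
    }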

Related

NetSuite Migrations

Has anyone had much experience with data migration into and out of NetSuite? I have to export DB2 tables into MySQL, manipulate the data, and then export it as a CSV file. Then take a CSV file of accounts and manipulate the data again so the accounts from our old system match up with the new one. Has anyone tried to do this in MySQL?
A couple of options:
Invest in a data transformation tool that connects to NetSuite and DB2 or MySQL. Look at Dell Boomi, IBM Cast Iron, etc. These tools allow you to connect to both systems, define the data to be extracted, perform data transformation functions and mappings and do all the inserts/updates or whatever you need to do.
For MySQL to NetSuite, PHP scripts can be written to access MySQL and NetSuite. On the NetSuite side, you can either do SOAP web services, or you can write custom REST APIs within NetSuite. SOAP is probably a bit slower than REST, but with REST, you have to write the API yourself (server-side JavaScript; it's not hard, but there's a learning curve).
Hope this helps.
I'm an IBM i programmer; try CPYTOIMPF to create a pretty generic CSV file. It will go to a stream file; if you have NetServer running you can map a network drive to the IFS directory, or you can use FTP to get the CSV file from the IFS to another machine on your network.
Try Adeptia's NetSuite integration tool to perform ETL. You can also try Pentaho ETL for this (as far as I know, Celigo's NetSuite connector is built on Pentaho). Jitterbit also has an extension for NetSuite.
We primarily have two options to pump data into NS:
i) SuiteTalk, with which we can do SOAP-based transformations. There are two versions of SuiteTalk: synchronous and asynchronous.
Typical tools like Boomi/Mule/Jitterbit use synchronous SuiteTalk to pump data into NS. They also have decent editors to help you do the mapping.
ii) RESTlets, which are typical REST-based architectures provided by NS; they can also be used, but you may have to write external brokers to communicate with them.
Depending on your needs, you can use either; in most cases you will be using SuiteTalk to bring data into NetSuite.
Hope this helps ...
We just got done doing this. We used an iPaaS platform called Jitterbit (similar to Dell Boomi). It can connect to MySQL and to NetSuite, and you can do transformations in the tool. I have been really impressed with the platform overall so far.
There are different approaches; I like the following for processing a batch job:
To import data into NetSuite:
Export a CSV from the old system and place it in a folder in NetSuite's File Cabinet (use a RESTlet or web services for this).
Run a scheduled script to load the files in the folder and update the records.
Don't forget to handle errors. Ways to handle them: send an email, create a custom record, log to a file, or write to a record.
Once a file has been processed, move it to another folder or delete it.
To export data out of NetSuite:
Gather the data and export it to a CSV (you can use a saved search or similar).
Place the CSV in a File Cabinet folder.
From an external server, call web services or a RESTlet to grab new CSV files in the folder (a sketch of such a call follows this list).
Process the file.
Handle errors.
Call web services or a RESTlet to move or delete the CSV file.
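For the "grab new CSV files" step, a rough sketch of an external C# caller is below. The script/deploy ids and the NLAuth header values are placeholders (NLAuth is NetSuite's legacy header-based authentication; newer accounts would use token-based auth instead):

    // Sketch: calling a NetSuite RESTlet from an external server to pull a CSV.
    using System.IO;
    using System.Net;

    class RestletClient
    {
        static void Main()
        {
            // Hypothetical script/deploy ids for the RESTlet deployment.
            var url = "https://rest.netsuite.com/app/site/hosting/restlet.nl" +
                      "?script=123&deploy=1";
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "GET";
            request.ContentType = "application/json";
            request.Headers.Add("Authorization",
                "NLAuth nlauth_account=ACCT, nlauth_email=user@example.com, " +
                "nlauth_signature=PASSWORD, nlauth_role=3");   // placeholders

            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                File.WriteAllText("export.csv", reader.ReadToEnd());
            }
        }
    }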
You can also use Pentaho Data Integration; it's free and the learning curve is not that steep. I took this course and was able to play around with the tool within a couple of hours.

How to make SSIS choose data source depending on parameter?

I have an SSIS data flow task that reads a CSV file with certain fields, tweaks it a little and inserts results into a table. The source file name is a package parameter. All is good and fine there.
Now, I need to process a slightly different kind of CSV file with an extra field. This extra field can be safely ignored, so the processing is essentially the same. The only difference is in the column mapping of the data source.
I could, of course, create a copy of the whole package and tweak the data source to match the second file format. However, this "solution" seems like terrible duplication: if there are any changes in the course of processing, I will have to do them twice. I'd rather pass another parameter to the package that would tell it what kind of file to process.
The trouble is, I don't know how to make SSIS read from one data source or another depending on parameter, hence the question.
I would duplicate the Connection Manager (CSV definition) and Data Flow in the SSIS package and tweak them for the new file format. Then I would use the parameter you described to Enable/Disable either Data Flow.
In essence, SSIS doesn't work with variable metadata. If this is going to be a recurring pattern, I would deal with it upstream from SSIS by building a VB/C# command-line app to shred the files into SQL tables.
You could make the connection manager push all the data into 1 column. Then use a script transformation component to parse out the data to the output, depending on the number of fields in the row.
You can split the data on the delimiter into, say, a string array (I googled for help when I needed to do this). The array's length then tells you what type of file has been connected.
Then, your mapping to the destination can remain the same. No need to duplicate any components either.
I had to do something similar myself once: although the files I was using were meant to always be the same format, it could change depending on the version of the system sending the file, and by handling it in a script transformation this way I was able to absorb the minor variations in the file format. If the files are 99% the same, that is fine; if they were radically different, you would be better off using a separate file connection manager.
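A rough sketch of that script transformation, assuming the flat file connection delivers each row as a single column named Line and the output has three columns (all names here are placeholders):

    // SSIS Script Component (transformation) sketch: the connection manager
    // reads each row into one wide column; we split it ourselves so files
    // with or without the extra trailing field map to the same output.
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        string[] fields = Row.Line.Split(',');

        Row.Field1 = fields[0];
        Row.Field2 = fields[1];
        Row.Field3 = fields[2];

        // fields.Length tells you which file format you were handed;
        // a row carrying the extra field is simply truncated here.
    }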

Extract Transform Load into MySql

I am basically from a Microsoft background, having worked a lot with SSIS on ETL sorts of projects.
Now I have another project on hand that deals with loading .csv files into a MySQL database. While loading these tables, the data has to go through some transformations before reaching the destination tables. It is very much an ETL project.
The client doesn't have SSIS (BIDS), and I am compelled to use open source tools.
I did a bit of research and found that the Talend Data Integration tool best fits my situation.
As I am new to this environment and am sure there are experts in this area, I need some advice on the best tools to do ETL of this type, and on best practices.
If you need any further information, please let me know.
If I remember correctly, phpMyAdmin can import CSV into MySQL, and this question is about a similar topic too, but these don't come close to what SSIS can offer...
Yes, you are right: Talend Open Studio is a pretty good tool with hundreds of connectors.
In your case, just create a job that takes the CSV as your source and MySQL as the destination, apply any transformations required, and load it.
You can get more information on CSV-to-MySQL loads, with examples, on the Talend forum.
If you have a basic plan, share it with me and I can guide you through transferring the CSV to a MySQL table.
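If a little code alongside the ETL tool is acceptable, the Connector/NET MySqlBulkLoader class (a thin wrapper over MySQL's LOAD DATA INFILE) is another free route. A minimal sketch; the connection string, table, and file names are placeholders:

    using MySql.Data.MySqlClient;   // from the MySql.Data package

    class CsvLoad
    {
        static void Main()
        {
            using (var conn = new MySqlConnection(
                "server=localhost;database=etl;uid=user;pwd=secret"))
            {
                conn.Open();
                var loader = new MySqlBulkLoader(conn)
                {
                    TableName = "staging_orders",
                    FileName = @"C:\data\orders.csv",
                    FieldTerminator = ",",
                    LineTerminator = "\n",
                    NumberOfLinesToSkip = 1   // skip the header row
                };
                int rows = loader.Load();
            }
        }
    }

Loading into a staging table first and applying the transformations in SQL afterwards keeps the load step simple.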

Fastest way to load a flat file in SSIS

I have a single flat file which needs to be loaded into SQL Server. For that I have to use SSIS. Now I want to know about things which can help me load this file in the fastest way:
Should I use the Flat File connection manager or a Script Task to load the flat file? (In one of my earlier questions, I got an answer stating that a Script Task loads things faster.)
Which destination: ADO.NET or SQL Server?
Any other settings/best practices for flat files that I can apply to load the file faster?
Here is a reference describing how Microsoft was able to load 1 TB in 30 minutes using SSIS.
I am surprised when you say that scripting is faster, since to accomplish this same feat Microsoft used Flat File sources and OLE DB destinations. They also optimized the load by breaking the load process into smaller chunks, by partitioning the destination tables, and by using very well-tuned hardware. The techniques they use in their SSIS packages are what I would use if I had to load a large dataset from SSIS.
(Microsoft) We Loaded 1TB in 30 Minutes with SSIS, and so can you
I think for what you're trying to accomplish, SSIS would be a great way to go; it allows for much more flexibility. As far as using the Flat File manager vs. scripting: scripting will always give you better performance, but I use SSIS as it makes things easier to navigate (and repair). I'm sure many of the die-hard SQL devs will tell you to script it, but I find either way works.
As far as the destination, I exclusively use SQL Server, so I can't speak to that part of your question.
Best practice is, in my opinion, to keep it as simple as you can; the easier you make things, the better performance you'll get. In my 3 years with SSIS, I always try to optimize ANY query to the best of my ability, THEN put it into SSIS.
It sounds like you're doing nothing more than a simple ETL on these files; if that's the case, I recommend SSIS based on my experience. Once you have everything loaded, you can adjust the data types for your different cases of char, varchar, and int.
Hope this helps!
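For the scripting route specifically, the usual fast path from .NET is SqlBulkCopy, which rides the same bulk-load API as bcp and BULK INSERT. A rough sketch under an assumed file layout, table, and connection string:

    // Sketch: loading a comma-delimited flat file into SQL Server with
    // SqlBulkCopy. TableLock and a large batch size are the usual
    // fast-load knobs.
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    class FastLoad
    {
        static void Main()
        {
            var table = new DataTable();
            table.Columns.Add("Col1");   // placeholder columns
            table.Columns.Add("Col2");

            foreach (var line in File.ReadLines(@"C:\data\input.txt"))
                table.Rows.Add(line.Split(','));

            using (var bulk = new SqlBulkCopy(
                "Server=.;Database=Staging;Integrated Security=true",
                SqlBulkCopyOptions.TableLock))
            {
                bulk.DestinationTableName = "dbo.StagingTable";
                bulk.BatchSize = 10000;
                bulk.WriteToServer(table);
            }
        }
    }

For very large files you would feed SqlBulkCopy an IDataReader instead of building the whole DataTable in memory.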

Easy data conversion tool

I often have data in Excel or text files that I need to get into SQL Server. I can use ODBC to query the Excel file and I can parse the text file. What I want, though, is some tool that will just grab the data and put it into tables with little or no effort. Does anyone know of such a tool?
Have you tried the SQL Server Import/Export Wizard?
In SQL Server Management Studio, right-click your database name and select the Tasks menu, then Import Data. For Data Source, select Microsoft Excel and browse to the .XLS...
If you are using SQL Server, look at Integration Services (SSIS).
You can also take a look at parse-o-matic
Use DTS or SSIS depending on which version of SQL Server you have. There is an import wizard which can get you started, but data imports are rarely simple and usually involve some sort of data cleanup so that your incoming data is acceptable to the table where you intend to store it. Excel data, in my experience, is usually particularly bad in this respect because it often isn't stored properly in Excel to begin with.
I haven't seen commercial tools that do this. I create this kind of tool at work all the time, and the data validation is not trivial. It just makes sure that you don't have bad data making it into your database.
I found that for simple data conversion needs, something like FileHelpers is pretty good. It still needs programming, though. The framework is fairly easy to use, and somebody with a little bit of experience could bang something out for you.
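To give a feel for it, a minimal FileHelpers sketch; the record layout is a made-up example:

    using System;
    using FileHelpers;   // FileHelpers NuGet package

    [DelimitedRecord(",")]
    public class Customer
    {
        public int Id;
        public string Name;
        [FieldConverter(ConverterKind.Date, "yyyy-MM-dd")]
        public DateTime Created;
    }

    class Program
    {
        static void Main()
        {
            var engine = new FileHelperEngine<Customer>();
            Customer[] records = engine.ReadFile(@"C:\data\customers.txt");
            // From here, push the records into SQL Server (SqlBulkCopy,
            // plain INSERTs, etc.) after whatever validation you need.
        }
    }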
On further thought, you can use the SQL Server bcp utility to upload the contents of a text file. It is a command-line utility and has a lot of switches. I would suggest you experiment on a test table before you use it on a production table.
It's been a while since I used it, so I can't remember whether you can use an Excel spreadsheet directly. Text files are always the easiest to deal with in any case.
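A typical bcp invocation looks something like this; the server, table, and file names are placeholders (-T is Windows authentication, -c is character mode, -t sets the field terminator):

    bcp Staging.dbo.MyTable in C:\data\input.txt -S myserver -T -c -t,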
Seems like it'd be pretty easy to write a script that reads the text file and converts it to "INSERT INTO table ..." SQL statements. I suspect this has already been done, but a simple implementation would be less than 100 lines of code in your favorite scripting language.
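A sketch of that approach in C#; the table and column names are placeholders, and real code should use parameters rather than string concatenation:

    // Turn each line of a comma-delimited file into an INSERT statement.
    using System;
    using System.IO;
    using System.Linq;

    class InsertGen
    {
        static void Main()
        {
            foreach (var line in File.ReadLines(@"C:\data\input.txt"))
            {
                var values = line.Split(',')
                    .Select(v => "'" + v.Replace("'", "''") + "'");
                Console.WriteLine(
                    "INSERT INTO dbo.MyTable (Col1, Col2, Col3) VALUES ({0});",
                    string.Join(", ", values));
            }
        }
    }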
Hey, Google says SQL Server comes with such a tool: BULK INSERT.
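A minimal statement looks something like this; the table, path, and terminators are placeholders:

    BULK INSERT dbo.MyTable
    FROM 'C:\data\input.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);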