Dynamically adding a derived column in SSIS

I have a scenario where my source can be on different versions of our database. As a result, the source file could have a different number of columns, while my destination has a defined number of columns.
What we are trying to do is: load data from the source to flat files, move them to a central server, and then load that data into a central database. But if any column is missing in a flat file, I need to add a derived column.
What is the best way to do this? How can I dynamically add derived columns?

You can either do this with BimlScript as others have suggested in the comments, or you can write a script task that reads the file, analyzes the contents, and imports it. Yet another option would be to bulk import the file as-is into a staging table (which would have to be dropped and re-created every time) and write a stored procedure that analyzes the DDL and contents and imports the data to the destination table.
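If you go the staging-table route, here is a minimal T-SQL sketch of the "analyze the DDL and fill in missing columns" step. All object names (dbo.Staging, dbo.CentralDest) are hypothetical, and STRING_AGG assumes SQL Server 2017 or later:

    -- Build a SELECT list that takes each destination column from staging
    -- when the flat file supplied it, and derives a NULL (or any default)
    -- when it did not.
    DECLARE @selectList nvarchar(max), @sql nvarchar(max);

    SELECT @selectList = STRING_AGG(
               CAST(CASE WHEN s.name IS NOT NULL
                         THEN QUOTENAME(d.name)               -- column was in the file
                         ELSE 'NULL AS ' + QUOTENAME(d.name)  -- missing: derive it
                    END AS nvarchar(max)), ', ')
               WITHIN GROUP (ORDER BY d.column_id)
    FROM sys.columns AS d
    LEFT JOIN sys.columns AS s
           ON s.object_id = OBJECT_ID('dbo.Staging')
          AND s.name      = d.name
    WHERE d.object_id = OBJECT_ID('dbo.CentralDest');

    -- Assumes dbo.CentralDest has no identity or computed columns.
    SET @sql = N'INSERT INTO dbo.CentralDest SELECT ' + @selectList
             + N' FROM dbo.Staging;';
    EXEC sys.sp_executesql @sql;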

Related

How to handle a mid-load data failure case in SSIS

Hi, I have a doubt in SSIS.
I want to load source Excel file data into a SQL Server database table.
The source Excel file has billions of records (huge data).
During the load, half of the records were loaded into the destination table, and then the load failed because some data arrived in an incorrect format.
In this situation, how do I design the package so that all the data is loaded into the destination using SSIS?
Source: Excel (Emp information data)
Destination: table emp
I tried using checkpoint configuration to rerun from the point of failure, but it is not useful for handling data at the row level, and duplicate data gets loaded.
I also tried another way: truncating the data in the destination table and then using redirect row for error handling. But that is not a good implementation, because it truncates the destination table.
Please tell me the ways to achieve this task (a complete load) at the SSIS package level.
Load your data from Excel into a staging table, which you truncate before every load.
Make all of the columns of the staging table nvarchar(max) type, so they can handle any format of incoming character data.
Then run a stored procedure that de-dupes, formats and transfers the data to the final destination table.
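A rough sketch of that stored procedure body, with hypothetical column names and formats (TRY_CONVERT needs SQL Server 2012+):

    -- De-dupe on the business key, convert the nvarchar(max) staging data
    -- to the real types, and insert only the rows that convert cleanly.
    ;WITH deduped AS (
        SELECT EmpId, EmpName, HireDate, Salary,
               ROW_NUMBER() OVER (PARTITION BY EmpId
                                  ORDER BY (SELECT NULL)) AS rn
        FROM dbo.EmpStaging
    )
    INSERT INTO dbo.emp (EmpId, EmpName, HireDate, Salary)
    SELECT TRY_CONVERT(int,   EmpId),
           EmpName,
           TRY_CONVERT(date,  HireDate),
           TRY_CONVERT(money, Salary)
    FROM   deduped
    WHERE  rn = 1
      AND  TRY_CONVERT(int, EmpId) IS NOT NULL;   -- reject rows with a bad key

Rows that fail conversion could instead be inserted into an error table for review, rather than silently skipped.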

SSIS package design, where 3rd party data is replacing existing data

I have created many SSIS packages in the past, though the need for this one is a bit different than the others which I have written.
Here's the quick description of the business need:
We have a small database on our end sourced from a 3rd party vendor, and this needs to be overwritten nightly.
The source of this data is a bunch of flat files (CSV) from the 3rd party vendor.
Current setup: we truncate the tables of this database, and we then insert the new data from the files, all via SSIS.
Problem: There are times when the files fail to arrive, and what happens is that we truncate the old data but don't get the fresh data set. This leaves us with an empty database, whereas we would prefer to have yesterday's data over no data at all.
Desired Solution: I would like some sort of mechanism to see if the new data truly exists (these files) prior to truncating our current data.
What I have tried: I tried to capture the data from the files, add it to an ADO recordset, and only proceed if that part was successful. This doesn't seem to work for me: I have all the data-capture activities in one data flow, and I don't see a way to reuse that data. It would also seem wasteful of resources to do that and let the in-memory tables just sit there.
What have you done in a similar situation?
If the files are not present, set flags such as IsFile1Found to false and pass these flags to a stored procedure that truncates on a conditional basis.
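A sketch of such a stored procedure (table and flag names are assumptions):

    CREATE PROCEDURE dbo.TruncateIfFilesArrived
        @IsFile1Found bit,
        @IsFile2Found bit
    AS
    BEGIN
        -- Only discard yesterday's data when today's file actually arrived.
        IF @IsFile1Found = 1
            TRUNCATE TABLE dbo.VendorTable1;
        IF @IsFile2Found = 1
            TRUNCATE TABLE dbo.VendorTable2;
    END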
To check whether a file is empty, you can use PowerShell through an Execute Process Task to extract the first two rows; if there are two rows (header + data row), the data file is not empty. Then you can truncate the table and import the data.
Another approach: load the data into staging tables, insert from the staging tables into the destination tables using a SQL stored procedure, and truncate the staging tables after the data has been moved to all the destination tables. That way, before truncating a destination table, you can check whether the corresponding staging table is empty.
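For the staging variant, the guard could look something like this (names hypothetical):

    -- Only replace the destination data when staging actually received rows.
    IF EXISTS (SELECT 1 FROM dbo.VendorStaging)
    BEGIN
        BEGIN TRANSACTION;
            TRUNCATE TABLE dbo.VendorTable;
            INSERT INTO dbo.VendorTable
            SELECT * FROM dbo.VendorStaging;
            TRUNCATE TABLE dbo.VendorStaging;
        COMMIT;
    END
    -- Otherwise leave yesterday's data in place.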
I looked around and found that some others were struggling with the same issue, though none of them had a very elegant solution, nor do I.
What I ended up doing was to create a flat file connection to each file of interest and have a Row Count task save the record count to a variable. If a file isn't there, the package fails and you can stop execution at that point. There are some of these files whose actual count is interesting to me, though for the most part, I don't care. If you don't care what the counts are, you can keep recycling the same variable; this will reduce the number of variables you create (I needed 31). In order to preserve resources (read: reduce package execution time), I excluded all but one of the columns in each data source; it made a tremendous difference.

Add parameter value to table at import in SSIS

I am importing an Excel table from an FTP site to SQL using SSIS. The destination table is going to be used to calculate good and bad records based on another SQL database. Here is my problem. The Excel file is named RTW_032613_ABC_123.xls. This file name is a concatenation of a number of fields. I cannot recreate it based on the fields in the table, so I need to retain it and pass it to the new table in SQL. I have a parameter #FileName that I am using to loop through the files in the FTP folder. What I would like to do is either combine the import of data from the Excel file with the file name, or insert the file name into each record after the import. I am calling the SSIS procedure from another stored procedure in SQL. I tried adding a SQL data flow task, but I am not seeing where to add the insert statement on either the SQL Server Compact Destination or the SQL Server Destination.
I am over my head with SSIS on this one. The key is that the parameter that I need is available in SSIS but I really need to get it passed on to my SQL table.
TIA
If I'm reading your question right, you have an SSIS package with a variable containing the filename and you want to save the filename with each row that you are sending to your SQL table? If so:
Add a derived column to the data flow, making a new column and referencing the variable in the expression
Include that new column in the mapping for the destination of your data flow, sending the filename to whichever column you would like to save that data in.
No need for a separate SQL task.
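That said, if you did prefer the post-import route mentioned in the question, a minimal sketch would be an Execute SQL Task running something like this, with the SSIS variable mapped to @FileName (table and column names are assumptions):

    -- Stamp the file name onto the rows the load just inserted.
    UPDATE dbo.ImportedRecords
    SET    SourceFileName = @FileName
    WHERE  SourceFileName IS NULL;   -- assumes the column starts out NULL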

Dynamically mapping columns from source to destination

I am creating an SSIS package which has a flat file source and a destination database.
The mappings between the columns are based on the following:
There is a table which contains records indicating the mappings, i.e. source column name and destination column name. The mapping rows to use are determined by the name of the flat file.
The reason this has been done is so that the destination column names can be changed in the database rather than needing to recreate or edit the package.
Please could you advise as to how I could do this "lookup" and create the mappings dynamically.
There is no way to create the mappings dynamically in a standard data flow. You'll need to either generate the SSIS package programmatically on the fly, or use another method (OPENROWSET, bcp, BULK INSERT...).
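If you go the staging + dynamic SQL route instead, a minimal sketch follows. The mapping table, staging and destination names are all hypothetical, @FileName is assumed to be a parameter of the surrounding procedure, and STRING_AGG needs SQL Server 2017+:

    -- Assume dbo.ColumnMappings(FlatFileName, SourceColumn, DestColumn) and
    -- that the file was bulk-loaded unchanged into dbo.Staging.
    DECLARE @destCols nvarchar(max), @srcCols nvarchar(max), @sql nvarchar(max);

    -- Both lists are ordered the same way so positions line up.
    SELECT @destCols = STRING_AGG(QUOTENAME(DestColumn),   ', ')
                           WITHIN GROUP (ORDER BY DestColumn),
           @srcCols  = STRING_AGG(QUOTENAME(SourceColumn), ', ')
                           WITHIN GROUP (ORDER BY DestColumn)
    FROM   dbo.ColumnMappings
    WHERE  FlatFileName = @FileName;

    SET @sql = N'INSERT INTO dbo.Destination (' + @destCols + N')'
             + N' SELECT ' + @srcCols + N' FROM dbo.Staging;';
    EXEC sys.sp_executesql @sql;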

Updating a SQL table with CSV data?

I am trying to update one of my SQL tables with new columns in my source CSV file. The CSV records in this file are already in this SQL table, but this SQL table is lacking some of the new columns from this CSV file.
I already added the new columns to my SQL table structure via ALTER TABLE. But now I just need to import the data from this CSV file into the new columns. How can I do this? I am trying to use SSIS and SQL Server to accomplish this, but am pretty new to Excel.
This is probably too late to solve salvationishere's problem, but I'm posting it for future readers!
You could just generate the SQL INSERT/UPDATE/etc. commands by parsing the CSV file (a simple Python script will do).
You could alternatively use this online parser:
http://www.convertcsv.com/csv-to-sql.htm
(Hoping that it'd still be available when you click!)
to generate your SQL command. The interface is extremely straightforward and it does the entire job in an awesome way.
You have several options:
If you are loading the data into a non-production system where you can edit the target tables, you could load the data into a new table, rename the old table to obsolete, and rename the new table to the old table name.
You can load the data into a staging table and then write a SQL statement to update the target table from the staging table (see the sketch after this list).
You can open the CSV file in Excel and write a formula to generate an update script, drag the formula down across all rows so that you get a separate update statement for each row, and then run the separate update statements in management studio.
You can truncate the target table and update your existing SSIS package that imports the file to use the new columns, if you have the full history in your CSV file.
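For the staging-table option, the update statement might look like this (table, key, and column names are all hypothetical):

    -- Copy just the new columns from the staged CSV into the existing rows,
    -- matching on the shared key.
    UPDATE t
    SET    t.NewColumnA = s.NewColumnA,
           t.NewColumnB = s.NewColumnB
    FROM   dbo.TargetTable AS t
    JOIN   dbo.CsvStaging  AS s
      ON   s.RecordKey = t.RecordKey;   -- assumed natural key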
There are more options, but any of the above would probably be more than adequate solutions.