SSIS to import data from Excel into multiple tables

I have an Excel sheet (input) where each row needs to be saved in one of three SQL Server tables based on the Record type (column 1) of the row.
Example:
If the Record type is EMP, the whole row should go to the Employee table.
If the Record type is CUS, the whole row should go to the Customer table.
I am trying to use a Multicast and am not sure how to split the data from the Multicast to the destination tables. Do I need any other component in between?
Any idea would be appreciated.

A Conditional Split Component sounds like just what you need. A Conditional Split uses expressions you define to route each input row to one output. In your case, your Conditional Split would define three outputs, each of which would be attached to a SQL destination.
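As a minimal sketch, assuming the first column is named RecordType and the third record type is something like VEN (both names are placeholders for whatever your data actually uses), the three output conditions could look like this:

    Employee output: RecordType == "EMP"
    Customer output: RecordType == "CUS"
    Vendor output:   RecordType == "VEN"

Any row matching none of these goes to the default output, which you can leave unattached or route to an error destination.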
In comparison, the Multicast Component you're currently using sends each input row to all outputs. This component would be useful if you were trying to save a copy of each row to all three SQL destinations.

Related

SSIS package to insert flat file data into two different tables

I want to insert flat file data into two different SQL tables. Some additional fields coming from the flat file should be inserted into another table based on an indicator field, while the usual fields should be inserted into the regular table.
The other issue is that the additional field cannot be inserted directly, because there is no column mapping for it.
eg:
1234 056 Y Tushar
5678 065 N
So 1234 056 should be inserted into the regular table, but the indicator Y tells us that Tushar should be inserted into the other table.
But I cannot insert Tushar into that table directly, as it does not have a column for 1234.
For indicator N, the row should just be inserted normally into the base table.
So what I did was use a Conditional Split and then an OLE DB Command, but it is inserting multiple records into the table.
If you put a Multicast task right after your flat file source, you can create extra copies of your data set. Then you can use one copy to insert into Regular Table, and then you can put your Conditional Split on the second copy.
Your data flow would then look like this:
In my Flat File Source I defined four columns:
The Multicast doesn't need any configuration, and I assume the Regular Table destination isn't giving you any trouble. So next, you'd create the Indicator check with a Conditional Split task. Check for a value of Y like this:
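As a sketch, assuming the indicator column is simply named Indicator (adjust to your real column name), the condition on the output feeding Other Table would be:

    Indicator == "Y"

Rows with N fall through to the default output, which you can leave unattached since those rows are already covered by the Regular Table path.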
Then just map whichever available columns you want to insert into Other Table. I chose the second column (I called mine Seq) and the Name column. You may have these named differently.

Calculate Difference between current and previous rows in SSIS

How do I calculate the difference between current and previous rows in SSIS, and then use that result to add a new column to the existing table?
I'm assuming that when you say "current and previous rows" you mean the row count of the table before and after you load the new data.
Create two package variables, let's say NumBefore and NumAfter.
Both are Int32.
Inside the Data Flow Task, use a source component (let's say an OLE DB Source) and select whether it reads from a table or a query. Let's say it's a table T.
Drag a Row Count from the Data Flow Transformations list. Double-click it and, in the Variable Name section, select the variable User::NumBefore. At runtime, the Row Count will save the number of rows that passed through it in that variable.
Do whatever you want to do with the data extracted from table T. My guess is that you are going to insert new rows in the same table T, right?
Then use a second Data Flow Task in the Control Flow. Inside it, drag another OLE DB Source pointing at the same table T. Use another Row Count, but this time with the variable User::NumAfter. After the Row Count, use either a Script Component or a Derived Column.
If you use a Derived Column, write a name for the column and choose the 'Replace <column>' option if you want to replace the value of an existing column, or 'Add as new column' if you want to add it as a new output column.
In the Expression field, write @[User::NumAfter] - @[User::NumBefore], and then place your OLE DB Destination.
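A rough sketch of that Derived Column configuration, assuming you add the result as a new column called RowDifference (the name is just a placeholder):

    Derived Column Name: RowDifference
    Derived Column:      <add as new column>
    Expression:          @[User::NumAfter] - @[User::NumBefore]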
Hope this was what you were looking for.

SSIS: How to store master-details records by condition?

I'm new to SSIS and I'm completely stuck on what is perhaps an easy question.
I have two tables with one-to-many relationship. I parse HTML data in a Script component and create two outputs for Master Data and Detail records.
Then I check the condition for overwriting the existing data and, if it is satisfied, I write the Master record to the table. Unfortunately, my data flow looks like the picture above (schematic view): Detail records are added in any case. I would like the Details to be stored only if the condition is met (the green arrow in the picture), but I can't figure out how to do it.
I have faced the same problem when loading XML data into parent-child tables. For this, I added two Data Flow Tasks to the package. In the first DFT, I parsed the XML and loaded data into the master table only. In the second DFT, I parsed the child XML nodes and passed this output to a Merge Join operator (first input). For the second input to the Merge Join, I extracted the data back from the master table.
Eventually I managed to resolve the problem. I split the whole process into two data flows. In the first one I parse the HTML, save the master data to its table if needed, and save the parsed detail data in a package Object variable. The first data flow also has a Row Count component which saves its value in a MasterRowCount variable. In the second data flow I save the detail data to its table. The two data flows are connected by an expression-constrained precedence constraint (@[User::MasterRowCount] > 0). Thus, the second data flow executes only if master data were added.
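A rough sketch of that precedence constraint between the two Data Flow Tasks, assuming the variable is User::MasterRowCount:

    Evaluation operation: Expression and Constraint
    Value:                Success
    Expression:           @[User::MasterRowCount] > 0

This way the second data flow runs only when the first one both succeeded and actually produced master rows.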

SSIS - Reuse OLE DB source when matching fact against lookup table twice

I am pretty new to SSIS and BI in general, so first of all sorry if this is a newbie question.
I have my source data for the fact table in a CSV, so I want to match the IDs against the surrogate keys in the lookup tables.
The data structure in the csv is like this
... userId, OriginStationId, DestinyStationId,..
What I am trying to accomplish is to match the data against my lookup table. So what I am doing is
Reading Lookup data using OLE DB Source
Reading my csv file
Sorting both inputs by the same field
Doing a left join by Id, in order to get the SK
This way, if there is no match (i.e. it can't find the surrogate key), I can redirect that row to a rejected CSV and handle it later.
something like this:
(sorry for the spanish!)
I am doing this for each dimension, so I can handle each one with different error codes.
Since OriginStationId and DestinyStationId are two values from the same dimension (they both match against the same lookup table), I wanted to know if there's a way to avoid reading the data from the table twice (I mean, not using two OLE DB sources to read the same table).
I tried adding a second output to the Sort but I am not allowed to. The same goes for adding another output from the OLE DB Source.
I see there's a "cache option"; is that the best way to go? (Although it would still imply creating another OLE DB source, right?)
The third option I thought of was joining by the two fields, but since there is only one field in the lookup table (the same field), I am getting an error when I try to map both columns from my CSV against the same column in my lookup table:
There are columns missing with the sort order 2 to 2
What is the best way to go about this?
Or am I thinking about it incorrectly?
If something is not clear, let me know and I'll update my question.
Any time you wish you could have multiple outputs from a component that only allows one, all you have to do is follow that component with the Multicast component, whose sole purpose is to split a Data Flow stream into multiple outputs.
Gonzalo
I have just used this article on how to derive columns when building a data warehouse:- How to Populate a Fact Table using SSIS (part 1).
Using this I built a simple package that reads a CSV file with two columns that are used to derive separate values from the same CodeTable. The CodeTable has two fields Id and Description.
The Data Flow has two "Lookup" tasks. The first one joins the attribute Lookup1 against the Description to derive its Id. The second joins the attribute Lookup2 against the Description to derive a different Id.
Here is the Data Flow:-
Note the "Data Conversion" was required to convert the string attributes from the CSV file into "Unicode string [DT_WSTR]" so they could be joined to the nvarchar(50) description attribute in the table.
Here is the Data Conversion:-
Here is the first Lookup (the second one joins "Copy of Lookup2" to the Description):-
Here is the Data Viewer output with the two derived Ids, CodeTableFirstId and CodeTableSecondId:-
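Incidentally, instead of a separate Data Conversion component, a Derived Column with explicit casts would also do the job; a minimal sketch, assuming the DT_WSTR length 50 matches the nvarchar(50) Description column:

    Copy of Lookup1: (DT_WSTR,50)Lookup1
    Copy of Lookup2: (DT_WSTR,50)Lookup2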
Hopefully I understand your problem and this is of use to you.
Cheers John

SSIS - Writing to Excel After Skipping Rows

Is there a way to write data to an Excel spreadsheet after skipping x number of rows? Excel is my destination and a SQL query would be my source.
My scenario is one where I have a lot of header rows that I need to skip before data insertion. I would like to do this in an SSIS package. I am using SQL 2008 and Excel 2010.
Thanks
If you right-click on the Excel connection manager at the bottom of the page and then click Options, there is a setting called FirstRowHasColumnName; set it to FALSE. Let me know if that helps. I didn't really understand whether you just want to skip the first row (the column names from the SQL query) or more; there are other ways.
The easiest way would be to modify your SQL query to exclude the header rows. If you can't do that, then you need some logic to determine whether a row is a header row (like checking if a certain field is a number).
If you can do that, then you can do this:
Read all columns in as text.
Put in a Derived Column where you generate a new column IsHeader using your logic (see the sketch below).
Use a Conditional Split to filter out the rows where IsHeader is true.
Use a Data Conversion or Derived Column to convert the columns to the correct data types.
Output to Excel as usual.
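A minimal sketch of the IsHeader column and the conditional split, assuming the first column comes in as a text column named Col1 and that real data rows always start with a digit (both are placeholders for whatever your header logic really is):

    Derived Column IsHeader:        SUBSTRING(Col1,1,1) < "0" || SUBSTRING(Col1,1,1) > "9"
    Conditional Split (data rows):  !IsHeader

Only the rows that satisfy !IsHeader are sent on to the data conversion and the Excel destination.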