We are reading in data from two OLE DB data sources, sorting each and then merging them. We then perform a Conditional Split to place the data into three "buckets". We next want to pass each resulting record set to each one's appropriate stored procedure. There are several columns in the results.
Snapshot of SSIS flow:
How can we best pass these rows to the SP?
Should this be done in the Data Flow or as another step in the Control Flow?
Pass rows to a Stored Procedure via an OLEDB Command Component.
And yes, you do this in the dataflow, not another step in the control flow.
Related
I need to push the filtered data into data flow task...
In the control flow task I have 2 'execute SQL task' and one Data flow task connected one after the other. HOW can I use the output result set of the Execute sql taks into the data flow ?
The two 'execute sql task' performs filter operations and is running fine while debugging.
Inside the datflow task I use a source OLEDB ? What shall I use as a source to get the filtered output data from SQL task in Control Flow...
Adding to this, since you have two EST (Execute SQL Task) which generate a filtered data set which needs to be passed to a DFT (Data Flow Task), you can use a variable substitution method.
Here, you can replace direct SQL with a variable and, create a dynamic SQL using Script task and assign final SQL to the SSIS variable. Now in DFT, use SQL with variable option in your OLEDB Source, this will allow you to get rid of 2 EST's with a single variable which has T-SQL statements
The output data of the Execute SQL Task must be written into the some storage OR in object type variable which can be used as a source in your data flow task.
You can also filter the data in source of the data flow task.
You can store the output of execute SQL task to #Temp table (other properties like delay validation, remainSameConnetion will be required to be set TRUE) OR Permanent table and access that from data flow.
Can somebody please help me to transfer around 15 tables from one database to another database. At present I can do this one by one using Data Flow task, but then I need to do this task 15 times which is very time consuming.
Why don't you just use a task? Maybe tasks->export is what you're looking for.
Otherwise you'll need to create separate blocks for each table or:
Create a variable of type object
Script Task: Add to your list all table names.
Iterate over this object variable with For each loop container
Inside the loop create a source from a variable. In this variable specify the connection dynamically depending on the current loop value.
you can use SSIS package, select Transfer SQL server objects from SSIS toolbox , in Object specify the source and destination servers and database. for copyAllObjects make it false . ObjectToCopy select CopyAllTables true or make it false and pick from the list the table you want to copy.
I have an SSIS application that needs to get data from 2 databases of different servers (not link). I need to get the match names and DOB records between 2 database then use the results to insert/update a table.
My initial approach is to use OLE DB source then Merge Join and put the results to recordset. Then on controlflow, use the results of the recordset to insert/update a table. But I can't see the recordset at the control flow.
Alternative solution is to create temp tables. But the temp tables are not visible since they reside at the tempdb database of each servers.
What is a better approach for this problem?
what do you mean by put the results to recordset?
If you join two sources on the data flow using a join, that "recordset" on the join will only be available during the current dataflow. You cant use it on the control flow after the data flow is finisehd.
why cant you just insert the resultset on the destination DB? You can perform any other transform operation on the same data flow and insert the result on the destination database.
Or, if you really need to do something that can only be done on the control flow before insert the data, you can yes, insert the recordset on a temp table on the destination using a oleDBDestination and access in on another dataflow (not a very good approach, though)
In this case, I would keep a database around for work table or create a schema for those work tables.
Next, add a SQL control flow task that truncates the table that will hold the intermediate result. After this, load the intermediate result set into the table, do the operation and optionally, truncate the table again.
The recordset destination is fine for smaller datasets. But if you plan to use it for larger datasets that dont fit memory it will be very slow.
If you dont have a database/schema that can serve as a workspace, you could use RAW files to hold the intermediate result. Those are very fast too.
I would like to bring in an XML source and do data conversion and update it in a table. Data from this table will be used to update another table. How to accomplish this in SSIS?
I understand the first two steps. But lost after that.
XML Source (under dataflow task)
Data Conversion
OLE DB Destination? (If I use OLE DB Destination, then I cannot use that as a source again to update another table). What component should I be using to accomplish this?
TIA
Within a dataflow you can split the records to go to multiple tables using either a conditional split (if you want some records to go one way and some to go another way) or a mulicast task if you want all records to go to both destinations. We use a multicast to create two staging tables, one where the raw data from the file will stay and one where the data will be cleaned and transformed before going into our prod tables. This enables us to easily research if some problem data that came in was due to our transformation process (a bug) or bad data being sent (a problem at the client end, but which might require more steps to handle if they can't fix).
You can also have multiple data flows that all have the same source. Or you can insert to one staging table and then have a second data flow or exec SQL task to move that data to where you want it.
Use the OLE DB Destination to inject your XML source data into your staging table. Then, in your control flow use an Execute SQL task after your data flow task to execute a stored procedure or T-SQL script to move your data from the staging table into the production table(s) and truncate the staging table if required.
I've found that SSIS is great for ETL work, but moving data around inside a DB or aggregation work is best carried out using T-SQL in stored procs. Easier to write, control and you know you're not going to have any RBAR shenanigans you can happen upon in a DFT.
YMMV
I'm fairly new to SSIS,
I'm importing from an XLS spreadsheet into a database table. Along the way I want to select a record from a table, but it is NOT a lookup, ie: a straight SELECT with no join from input source. Then I want to merge this along with the other rows from the XLS.
What is the best way to do this? Variables? OLE DB commands?
Thanks
You could use an OLE DB command but the important thing to remember about this is that it is fired on a per-row basis and could potentially be slow. You can still use a lookup for this purpose, but make sure that you use set the error output to ignore lookup errors for the cases when the lookup transformation does not contain an value for the match you are looking for.
You could also use a merge transformation with an outer join condition rather than an inner join.
If the record that you are retrieving from the database table is not dependent on the data within the row from the spreadsheet then it will probably be the same for each row - is that what you are hoping for?
In this case, I would consider using an Execute SQL Task in the Control Flow to retrieve the record and save it to a variable. You can use a Script Component in the Data Flow to copy the values in the record from the variable to the appropriate fields in each row. This will mean that the lookup data is retrieved only once and not once per row which is slow as jn29098 said above.
If the target for your Data Flow is the same database as the one from which you are extracting the 'lookup' record then you could also consider using an Execute SQL Task (in the Control Flow) to add the lookup values once the spreadsheet data has arrived in the database (once the Data Flow has completed). This would be much more efficient.