I need to read data from one database (DB1) and write it to another (DB2).
I use a complex query with CTEs and temp tables, and no, I can't put this query in a stored procedure.
I use an OLE DB source and an OLE DB Destination.
When I put the query as a SQL command in the OLE DB Source, I get the usual complaint about not being able to determine metadata because a CTE is using a temp table.
I can't use the WITH RESULT SETS workaround because it is not a stored procedure, so I tried the other workaround, SET FMTONLY ON/OFF.
Now the OLE DB Source accepts my query, but it outputs two result sets: the first is empty and the second contains the data I need. The OLE DB Destination doesn't write a single row because it reads only the first result set, the empty one.
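To give an idea of the setup, the command in the OLE DB Source looks roughly like this (a simplified sketch; the table and column names are placeholders, not the real query):

SET FMTONLY OFF;   -- the workaround, so SSIS actually runs the query when it asks for metadata

-- Placeholder shape of the real query: a temp table is built first...
SELECT src.Id, src.Amount
INTO   #work
FROM   dbo.SourceTable AS src;

-- ...then a CTE reads from the temp table and produces the rows to load
WITH cte AS (
    SELECT Id, Amount FROM #work
)
SELECT Id, Amount
FROM   cte;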
How can I solve this?
I cannot change the temp tables into something else, and basically I can't change the query. I am looking for an SSIS solution if possible, not a SQL solution.
Thx.
For an SSIS solution, you cannot use an OLE DB Source. That component can only access the first result set.
What you can do is use a Script Transformation as your data source, access the second result set in the usual way, and send its columns to the output of the script.
I'm looking for the best practice to insert or update rows from a MySQL connection to a SQL Server connection.
First of all, I added an ADO.NET data source to grab the MySQL content (a simple table Supplier with two fields, id and name). Then, I added a Lookup transformation to split new rows from updated rows. It works well when I need to insert new rows. However, I would like to use an OLE DB Command to update existing rows, but it doesn't work due to an incompatibility between my connection manager and the component (ADO.NET vs OLE DB).
Any idea how to update modified rows?! Should I use a cache component?!
Thanks in advance!
Just get rid of the lookup and conditional split altogether.
Outside of your SSIS package, build a staging table that contains the fields you need for inserts/updates.
In your SSIS Package, create a control flow that does the following:
Execute SQL Task to truncate the staging table.
Data Flow task to load the MySQL data from the source system to the staging table. If you can do this based on a "changes-only" type process, such as using a timestamp that you check, it would be faster.
Execute SQL Task to perform an UPDATE statement on your target table using the staging table joined to the target table.
Execute SQL Task to perform an INSERT statement on your target table using a query based on the target table and your staging table (with a WHERE NOT EXISTS or some such on a key field); see the sketch below.
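A rough sketch of the statements behind steps 3 and 4, assuming a staging table named dbo.Supplier_Staging and a target table dbo.Supplier with the id and name columns mentioned in the question (adjust the names to your schema):

-- Step 3: update existing rows from the staging table
UPDATE t
SET    t.name = s.name
FROM   dbo.Supplier AS t
JOIN   dbo.Supplier_Staging AS s ON s.id = t.id;

-- Step 4: insert rows that are not in the target yet
INSERT INTO dbo.Supplier (id, name)
SELECT s.id, s.name
FROM   dbo.Supplier_Staging AS s
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Supplier AS t WHERE t.id = s.id);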
I would change the SQL connection to use OLE DB. As well as allowing the OLE DB Command to work, you may also find the OLE DB Destination is faster.
I am using SSIS 2012 with a data flow task containing a data source and an OLE DB Command.
The data source produces a set of Ids {1, 2, 3, etc.}, and the OLE DB Command deletes a record in a table in another database for each Id. What I am seeing in SQL Profiler is a delete command for each Id, which is expected as it works on a record-by-record basis. I can get up to 10,000 records.
Is there any way I can work with the output of the data source as a set and say:
delete from Table1 where Id in { set of Id's }
You cannot do that in SSIS.
In fact, you can build an expression and execute that expression in SSIS, but you don't WANT to do that. Expressions are limited in the number of characters they can have, and they are a mess to maintain.
Some things are better done directly in a stored procedure, while others are better done in SSIS. The art of SSIS is knowing when to do it in SSIS and when to do it in a procedure.
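If you do push the delete down to SQL, a common pattern (just a sketch; dbo.Ids_ToDelete is a made-up staging table that the data flow would load) is to land the Ids in a table and run a single set-based delete from an Execute SQL Task or a stored procedure:

-- Sketch only: dbo.Ids_ToDelete is a hypothetical staging table loaded by the data flow
DELETE t
FROM   dbo.Table1 AS t
JOIN   dbo.Ids_ToDelete AS d ON d.Id = t.Id;

TRUNCATE TABLE dbo.Ids_ToDelete;   -- clean up for the next run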
Good luck!
I have an SSIS application that needs to get data from 2 databases on different servers (not linked). I need to get the records with matching names and DOB between the 2 databases, then use the results to insert into or update a table.
My initial approach was to use OLE DB Sources, then a Merge Join, and put the results into a recordset. Then, in the control flow, use the contents of the recordset to insert into or update a table. But I can't see the recordset in the control flow.
An alternative solution is to create temp tables, but the temp tables are not visible since they reside in the tempdb database of each server.
What is a better approach for this problem?
What do you mean by "put the results to recordset"?
If you join two sources in the data flow using a Merge Join, that "recordset" will only be available during the current data flow. You can't use it in the control flow after the data flow has finished.
Why can't you just insert the result set into the destination DB? You can perform any other transform operation in the same data flow and insert the result into the destination database.
Or, if you really need to do something that can only be done in the control flow before inserting the data, then yes, you can insert the recordset into a temp table on the destination using an OLE DB Destination and access it in another data flow (not a very good approach, though).
In this case, I would keep a database around for work tables, or create a schema for those work tables.
Next, add an Execute SQL Task in the control flow that truncates the table that will hold the intermediate result. After this, load the intermediate result set into the table, do the operation (see the sketch below), and optionally truncate the table again.
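For the "do the operation" step, here is a sketch of what that SQL might look like once both intermediate results are loaded into work tables (the work.PersonsA, work.PersonsB and dbo.MatchedPersons names and columns are assumptions for illustration only):

-- Match on name and DOB across the two work tables, then update/insert the target
UPDATE m
SET    m.LastSeen = GETDATE()          -- LastSeen is a hypothetical column to update
FROM   dbo.MatchedPersons AS m
JOIN   work.PersonsA AS a ON a.Name = m.Name AND a.DOB = m.DOB
JOIN   work.PersonsB AS b ON b.Name = a.Name AND b.DOB = a.DOB;

INSERT INTO dbo.MatchedPersons (Name, DOB, LastSeen)
SELECT a.Name, a.DOB, GETDATE()
FROM   work.PersonsA AS a
JOIN   work.PersonsB AS b ON b.Name = a.Name AND b.DOB = a.DOB
WHERE  NOT EXISTS (SELECT 1 FROM dbo.MatchedPersons AS m
                   WHERE m.Name = a.Name AND m.DOB = a.DOB);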
The Recordset Destination is fine for smaller datasets, but if you plan to use it for larger datasets that don't fit in memory, it will be very slow.
If you don't have a database/schema that can serve as a workspace, you could use RAW files to hold the intermediate result. Those are very fast too.
I use a Lookup component. With the No Match output, I insert the rows into a target table. I would like to update the target table with these rows when the Lookup matches.
How can I do that?
Thx!!
In the Lookup transformation, map the Lookup Match Output to an OLE DB Command transformation. In the OLE DB Command transformation, use an UPDATE statement or a stored procedure and map the columns accordingly. Here is a link that describes how to use the OLE DB Command transformation.
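A sketch of the kind of statement you would put in the OLE DB Command (the table and column names here are examples only); the ? parameter markers get mapped to the input columns from the Lookup Match Output on the Column Mappings tab:

-- Example only: replace dbo.TargetTable and the columns with your own
UPDATE dbo.TargetTable
SET    ColumnA = ?,
       ColumnB = ?
WHERE  KeyColumn = ?;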
Please note that if you have too many rows to update, the OLE DB Command transformation might slow things down.
There are a couple of options:
You can use a second Lookup transformation between the first Lookup transformation and the OLE DB Command. In the second Lookup, map all the columns between source and destination that you will be updating, including the key column, and redirect the output to the OLE DB Command only if there are no matching records.
Split the output from the Lookup Match Output into multiple outputs using a sequence number and have multiple OLE DB Command transformations. Please find my answer in this Stack Overflow question, where I split the output from one transformation into multiple outputs before redirecting to an OLE DB Command.
Hope that helps.
Rather new to SSIS so not sure how to handle this.
I have a flat file which I managed to read from successfully, so right now my data flow consists of just a Flat File Source.
What I want to do is something like this:
UPDATE s
SET s.columnA = f.columnA
FROM SqlTable s
JOIN FlatFile f ON s.columnID = f.columnID
Right now the only way I can see of doing this would be to insert the contents of the flat file into a SQL table, then do my update. This seems wasteful considering I don't need to keep the data from the flat file. I just need to update an existing SQL table based on the data in the flat file. So is there some way to run the query directly in the SSIS package instead of having to insert a bunch of data into a SQL table that I will just wind up dropping?
thanks
UPDATE s SET s.columnA = f.columnA FROM SqlTable s JOIN FlatFile f ON s.columnID = f.columnID
The statement above is a SQL statement. You cannot connect a SQL table to a flat file. You need to work in SQL to do an update, since that is where the table lives.
You have 2 choices:
Use an OLE DB Command component within the data flow. The downside is that this calls the statement for each record, so if you have thousands of records it is very inefficient.
Push the records to a table using an OLE DB Destination, and then call your update using an Execute SQL Task. You can then truncate the table if you like.
A possible 3rd option is to roll your own OLE DB destination that does updates on record sets instead of record by record.
While creating a table in the database just to store update records might sound wasteful, it is done very often. You just drop the work table, or truncate it, when complete.
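For option 2, a sketch of what the Execute SQL Task could run after the data flow loads the flat file into a staging table (dbo.FlatFile_Staging is a made-up name; the other names come from the question's pseudo-query):

-- Update the target from the staged flat-file rows, then clear the staging table
UPDATE s
SET    s.columnA = f.columnA
FROM   SqlTable AS s
JOIN   dbo.FlatFile_Staging AS f ON f.columnID = s.columnID;

TRUNCATE TABLE dbo.FlatFile_Staging;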
You could add an OLE DB Command component to the data flow that retrieves data from the flat file. The OLE DB Command would do a single-row update for each record retrieved from the flat file. This might be okay if there are few rows in the flat file, but you can imagine how bad performance will be if there are many rows.
I think you'll find that sending the flat file rows to a database table and running a single UPDATE is going to be the best performer for lots of data.
I haven't tried this, but have you tried sending the rows to a Recordset Destination and then running the update using that?
A bulk load into a temporary table is the way to go; then do your updates from the temp table. As a previous poster says, it is quite a common approach to stuff data into a staging area prior to doing some more work with it, and then drop or truncate the table.