Combining two tables with SSIS into one destination table - sql-server-2008

I am new to SSIS, so please bear with me.
I created an Integration Services Project for SQL Server 2008 to import data from an old db to a new one. One of the things I need to do is import data from two old source tables into one new destination table.
What is the best way to do this?
I can easily see the results I want with a simple inner join query in T-SQL, but I am not having any luck with the SSIS package. My current approach is a three-step process:
Add OLE DB Source component that pulls all columns from my first source table
Add a Lookup component, which is the next step after my OLE DB Source component. In this I query the second source table 'using the results of a sql query' that returns no nulls, then drag the foreign key id from the 'available input columns' to the primary key in the available lookup columns. I also check the checkboxes in 'available input columns' to add 2 more columns.
Add OLE DB Destination, pointed to my destination table.
This process fails at the first step, not at the lookup step, and fails with the error "Row yielded no match during lookup". The foreign key cannot be null, and obviously the primary key can't either. I used a SQL statement in step two so I could make sure I don't get any null date values in the columns (there were a few), but I am still getting the error. If I send the first step's failure path to a Flat File Destination, I get an empty CSV (watching in debug mode shows ~600k records going into the flat file).
I am pretty stumped at this point, and this seems like it should be a super easy task. I have scoured the web for answers, and found this link that sounds like the exact same problem I am having, but changing the cache setting didn't help.
Help appreciated!

It sounds like you have a mismatch in the lookup. I'd run the queries by hand and verify that the OLE DB Source has no null foreign keys, and that each foreign key matches something in the lookup table.
There is a simpler approach here. Use the inner join query you mentioned in the OLE DB Source. Don't select the table; provide your SQL query with the join instead. This lets SQL Server do all of the heavy lifting of the join, and then SSIS only has to do the transferring.
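As a rough illustration of both suggestions, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (OldParent, OldChild, ParentId and so on) are hypothetical, since the question does not give the real schema. The first query lists foreign keys that match nothing in the lookup table, which is the situation that makes a Lookup report "Row yielded no match during lookup". The second is the kind of inner join query you would paste into the OLE DB Source as a SQL command.

```python
import sqlite3

# In-memory stand-in for the old database; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OldParent (ParentId INTEGER PRIMARY KEY, ParentName TEXT, CreatedOn TEXT);
    CREATE TABLE OldChild  (ChildId INTEGER PRIMARY KEY, ParentId INTEGER NOT NULL, ChildName TEXT);
    INSERT INTO OldParent VALUES (1, 'Alpha', '2008-01-01'), (2, 'Beta', '2008-02-01');
    INSERT INTO OldChild  VALUES (10, 1, 'a-1'), (11, 2, 'b-1'), (12, 99, 'orphan');
""")

# Diagnostic: child rows whose foreign key has no matching parent row.
orphans = conn.execute("""
    SELECT c.ChildId, c.ParentId
    FROM OldChild AS c
    LEFT JOIN OldParent AS p ON p.ParentId = c.ParentId
    WHERE p.ParentId IS NULL
""").fetchall()

# The join itself: only rows with a match come back, ready for the destination.
joined = conn.execute("""
    SELECT c.ChildId, c.ChildName, p.ParentName, p.CreatedOn
    FROM OldChild AS c
    INNER JOIN OldParent AS p ON p.ParentId = c.ParentId
    ORDER BY c.ChildId
""").fetchall()

print(orphans)
print(joined)
```

If the diagnostic query returns any rows against the real tables, those are the rows the Lookup cannot match. The inner join simply leaves them out.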

Related

SSIS Delete records using one table that exists in a different table from different databases (SSDT)

I'm using a data flow task and 2 OLE DB sources. The 2 sources bring in data from tables in 2 different databases on the same server. The 2 tables can be mapped by ids. All of the ids from the second table (closedstops) exist in the first table (stops). I need to remove all the closed stops by id from the first table. Afterwards I need to export the first table out of the database into a text file.
Do I need to use a merge join before deleting, or do I need to use an OLE DB Command to delete records (see attached screenshot)? I have looked at many questions and answers on Stack Overflow as well as tutorials, and none of them quite answer my question. Any help is greatly appreciated. Thank you.
Closed stops is the driver table. Leave it be.
Instead of using an OLE DB Source for "stops", change that to a Lookup component. You are only interested in rows that match.
And then you can use your OLE DB Command to fire off single delete statements.
My preference for performance and traceability would be to insert all the "to be deleted" ids into a table on the Stops database. When the Data Flow has completed, an Execute SQL Task would then fire up to perform the deletes in a set based operation(s).
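The set-based variant might look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for SQL Server, and the names Stops, StopsToDelete, StopId and StopName are assumptions for illustration. In the real package the Data Flow would fill the staging table, and the DELETE statement would be the body of the Execute SQL Task.

```python
import sqlite3

# In-memory stand-in for the Stops database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stops (StopId INTEGER PRIMARY KEY, StopName TEXT);
    CREATE TABLE StopsToDelete (StopId INTEGER PRIMARY KEY);
    INSERT INTO Stops VALUES (1, 'Main St'), (2, 'Oak Ave'), (3, 'Pine Rd'), (4, 'Elm Ct');
    -- In the package, the Data Flow would load these ids from closedstops.
    INSERT INTO StopsToDelete VALUES (2), (4);
""")

# One set-based statement instead of one DELETE per row from an OLE DB Command.
cur = conn.execute("""
    DELETE FROM Stops
    WHERE EXISTS (SELECT 1 FROM StopsToDelete AS d WHERE d.StopId = Stops.StopId)
""")
deleted_count = cur.rowcount

remaining = conn.execute(
    "SELECT StopId, StopName FROM Stops ORDER BY StopId"
).fetchall()
print(deleted_count, remaining)
```

Keeping the ids in a table also leaves a record of exactly what was removed, which is the traceability benefit mentioned above.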

Package is stuck at "Execute phase is beginning" at Lookup task

I have used a Lookup in my data flow task. When I use Full Cache mode, the data flow task runs fine. But when I use Partial Cache or no Cache in my lookup, the records do not go past the lookup task and it keeps running for hours. I have checked for errors but there aren't any errors displayed. Could anyone please help me on this?
A Lookup is not appropriate for your task. Instead:
Add an OLE DB Source to pull in the data
Sort the records from the incoming source and the OLE DB Source
Perform a merge join (Full outer).
Add a Derived Column transformation that checks ISNULL on the two joining columns and creates a new output column called Action. Rows where the target's joining column is NULL are tagged as INSERT records.
Add a conditional split to send the INSERT records to an OLE DB Destination, which inserts the new records.
You can also check to see if there are matches between the two populations and perform updates, or look for NULLs in the source and DELETE in the destination.
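To make the logic of those steps concrete, here is a small pure-Python sketch of a full outer merge over two inputs sorted on the join key, with an Action tag on each row. It is only an illustration of the idea, not how SSIS implements Merge Join. The function name, the (key, value) row shape, and the "NONE" tag for unchanged rows are assumptions introduced here.

```python
def full_outer_merge(source_rows, target_rows):
    """Merge two lists of (key, value) tuples, each sorted by a unique key.

    Returns (key, source_value, target_value, action) tuples, where action
    plays the role of the derived "Action" column.
    """
    i = j = 0
    out = []
    while i < len(source_rows) or j < len(target_rows):
        s = source_rows[i] if i < len(source_rows) else None
        t = target_rows[j] if j < len(target_rows) else None
        if t is None or (s is not None and s[0] < t[0]):
            # Key exists only in the source: target side is NULL, so INSERT.
            out.append((s[0], s[1], None, "INSERT"))
            i += 1
        elif s is None or t[0] < s[0]:
            # Key exists only in the target: source side is NULL, so DELETE.
            out.append((t[0], None, t[1], "DELETE"))
            j += 1
        else:
            # Keys match: update only when the values differ.
            action = "UPDATE" if s[1] != t[1] else "NONE"
            out.append((s[0], s[1], t[1], action))
            i += 1
            j += 1
    return out


# Both inputs must be sorted first, just as Merge Join requires sorted inputs.
source = sorted([(3, "c"), (1, "a"), (2, "b2")])
target = sorted([(2, "b"), (4, "d"), (1, "a")])
merged = full_outer_merge(source, target)

# The conditional split: route only the INSERT rows to the destination.
inserts = [row for row in merged if row[3] == "INSERT"]
print(merged)
print(inserts)
```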

How to insert/update rows from MySQL to SQL Server by using SSIS

I'm looking for the best practice to insert or update rows from a MySQL connection to a SQL Server connection.
First of all, I added an ADO.NET data source to grab the MySQL content (a simple table Supplier with two fields, id and name). Then, I added a Lookup transformation to split new rows from updated rows. It works well when I need to insert new rows. However, I would like to use an OLE DB Command to update existing rows, but it doesn't work due to an incompatibility between my connection manager and the component (ADO.NET vs OLE DB).
Any idea how to update modified rows? Should I use a cache component?
Thanks in advance !
Just get rid of the lookup and conditional split altogether.
Outside of your SSIS package, build a staging table that contains the fields you need for inserts/updates.
In your SSIS Package, create a control flow that does the following:
Execute SQL Task to truncate the staging table.
Data Flow task to load the MySQL data from the source system to the staging table. If you can do this based on a "changes-only" type process, such as using a timestamp that you check, it would be faster.
Execute SQL Task to perform an UPDATE statement on your target table using the staging table joined to the target table.
Execute SQL Task to perform an INSERT statement on your target table using a query based on the target table and your staging table (with a WHERE NOT EXISTS or some such on a key field).
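The control flow above could be sketched as follows, again with Python's sqlite3 module standing in for SQL Server. The Supplier table with id and name comes from the question. The staging table name SupplierStaging and the sample rows are assumptions. SQLite has no TRUNCATE, so a plain DELETE is used for the first step, and the UPDATE is written with a correlated subquery. On SQL Server you would more likely use TRUNCATE TABLE and UPDATE ... FROM with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Supplier (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE SupplierStaging (id INTEGER PRIMARY KEY, name TEXT);  -- hypothetical name
    INSERT INTO Supplier VALUES (1, 'Acme'), (2, 'Globex');
""")

# Made-up rows standing in for what the Data Flow would read from MySQL.
mysql_rows = [(2, "Globex Corp"), (3, "Initech")]

# Step 1: empty the staging table.
conn.execute("DELETE FROM SupplierStaging")

# Step 2: load the source rows into staging (the Data Flow task).
conn.executemany("INSERT INTO SupplierStaging (id, name) VALUES (?, ?)", mysql_rows)

# Step 3: update target rows that have a matching staging row.
conn.execute("""
    UPDATE Supplier
    SET name = (SELECT s.name FROM SupplierStaging AS s WHERE s.id = Supplier.id)
    WHERE EXISTS (SELECT 1 FROM SupplierStaging AS s WHERE s.id = Supplier.id)
""")

# Step 4: insert staging rows whose key is not in the target yet.
conn.execute("""
    INSERT INTO Supplier (id, name)
    SELECT s.id, s.name
    FROM SupplierStaging AS s
    WHERE NOT EXISTS (SELECT 1 FROM Supplier AS t WHERE t.id = s.id)
""")

result = conn.execute("SELECT id, name FROM Supplier ORDER BY id").fetchall()
print(result)
```

Running the UPDATE before the INSERT means the freshly inserted rows are not touched a second time.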
I would change the SQL connection to use OLE DB. As well as allowing the OLE DB Command to work, you may also find the OLE DB Destination is faster.

SSIS - check if row exists before importing from source to destination

Can someone please advise:
I have a SOURCE (the OLD SQL Server) and we need to move data to a new DESTINATION (the NEW SERVER), so the data is moving between different instances.
I'm struggling with how to write a package that looks in the destination first and checks whether the row exists: if it does, do nothing, otherwise INSERT.
Regards
Here are the steps:
Take an OLE DB Source and connect it to a Lookup task.
Select the column that can be looked up. There should be some kind of ID for you to do this. Also select all the columns that need to be passed through (SSIS provides check boxes for this).
Connect the lookup no match rows to an OLEDB destination, do the mapping, and you are done.
If you want to redirect all those matching rows to somewhere, say a notepad file, you can do that too...
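Conceptually, a Lookup in full cache mode loads the destination keys once and then routes each source row to the match or no-match output. The short Python sketch below imitates that routing. The function name, the dictionary row shape, and the "id" key column are assumptions for illustration, and this is not the SSIS API.

```python
def lookup_split(source_rows, destination_keys, key="id"):
    """Split source rows into (match, no_match) lists by key membership."""
    cache = set(destination_keys)  # plays the role of the Lookup's full cache
    match, no_match = [], []
    for row in source_rows:
        (match if row[key] in cache else no_match).append(row)
    return match, no_match


# Hypothetical sample data: ids 1 and 2 already exist in the destination.
source_rows = [
    {"id": 1, "name": "already there"},
    {"id": 2, "name": "also there"},
    {"id": 3, "name": "new row"},
]
match, no_match = lookup_split(source_rows, destination_keys=[1, 2])

# Only the no-match rows would be sent on to the OLE DB Destination.
print(no_match)
```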
I would use a Lookup transformation and redirect the match output to something else, such as an OLE DB Command. There you can write an IF EXISTS statement, or put that logic in a stored procedure, so that it either inserts or updates the data and never inserts duplicates.

Which SSIS transformation can perform 'NOT IN' constraint used in SQL query?

I have two OLEDB Data Sources that have similar columns:
TMP_CRUZTRANS
-------------
CUENTA_CTE numeric (20,0)
TMP_CTACTE_S_USD
----------------
CON_OPE numeric(20,0)
I need to subtract all the matching values between these two tables and keep the rows that are different. Is there a transformation/task within SSIS that can perform the NOT IN constraint normally used in a SQL query?
Currently, I am performing this operation using Execute SQL Task on Control Flow.
The top Data Flow creates the first table, TMP_CRUZTRANS (a merge join between 2 other tables, but I guess that's not important for my question). From it I need to keep the values that differ from the second table.
In the Execute SQL Task, I have the following statement:
INSERT INTO [dbo].[TMP_CYA]
SELECT RUT_CLIE, CUENTA_CTE, MONTO_TRANSAC
FROM [dbo].[TMP_CRUZTRANS]
WHERE CUENTA_CTE NOT IN (SELECT CON_OPE FROM TMP_CTACTE_S_USD)
Finally, with the new table TMP_CYA I can continue with my work.
The problem with this approach is that TMP_CRUZTRANS has about 5 million rows, so it's VERY slow inserting all this data into a table using an Execute SQL Task. It takes about 5 hours to perform this operation. That's why I need to do this inside the Data Flow task.
You can use the Lookup transformation available within the Data Flow task to achieve your requirement.
Here is a sample that illustrates what you are trying to achieve.
Create a package with a data flow task. Inside the data flow task, use an OLE DB Source to read data from your source table TMP_CRUZTRANS. Use a Lookup transformation to check whether the values exist in the table dbo.TMP_CTACTE_S_USD, comparing the given columns. Then redirect the non-matching output to an OLE DB Destination to insert the rows into table dbo.TMP_CYA.
Here is how the data flow task would look in place of the Execute SQL Task that you are currently using.
Configure the Lookup transformation as shown below:
On the General tab page, select "Redirect rows to no match output" from "Specify how to handle rows with no matching entries", because you are interested only in non-matching rows.
On the Connection tab page, select the appropriate OLE DB Connection manager and select the table dbo.TMP_CTACTE_S_USD. That is the table against which you would like to validate the data.
On the Columns tab page, drag the column CUENTA_CTE and drop it on CON_OPE to establish the mapping between the source and lookup tables. Click OK.
When you connect the Lookup transformation to the OLE DB Destination, the Input Output Selection dialog will appear. Make sure to select Lookup No Match Output.
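To sanity-check what the no-match output should contain, you can run the equivalent anti-join on a small sample. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server. The table and column names come from the question, but the sample rows are made up. NOT EXISTS is used here as one common way of writing the same filter as the NOT IN statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TMP_CRUZTRANS (RUT_CLIE INTEGER, CUENTA_CTE NUMERIC, MONTO_TRANSAC NUMERIC);
    CREATE TABLE TMP_CTACTE_S_USD (CON_OPE NUMERIC);
    CREATE TABLE TMP_CYA (RUT_CLIE INTEGER, CUENTA_CTE NUMERIC, MONTO_TRANSAC NUMERIC);
    -- Made-up sample rows for illustration only.
    INSERT INTO TMP_CRUZTRANS VALUES (1, 100, 10), (2, 200, 20), (3, 300, 30), (4, 400, 40);
    INSERT INTO TMP_CTACTE_S_USD VALUES (100), (300);
""")

# Rows of TMP_CRUZTRANS with no matching CON_OPE: what the
# Lookup No Match Output is expected to deliver to TMP_CYA.
conn.execute("""
    INSERT INTO TMP_CYA (RUT_CLIE, CUENTA_CTE, MONTO_TRANSAC)
    SELECT t.RUT_CLIE, t.CUENTA_CTE, t.MONTO_TRANSAC
    FROM TMP_CRUZTRANS AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM TMP_CTACTE_S_USD AS u WHERE u.CON_OPE = t.CUENTA_CTE
    )
""")

no_match = conn.execute(
    "SELECT RUT_CLIE, CUENTA_CTE FROM TMP_CYA ORDER BY RUT_CLIE"
).fetchall()
print(no_match)
```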
Here is the sample before executing the package.
You can see that only the 2 non-matching rows have been transferred to the OLE DB Destination.
After package execution, the destination table contains the two non-matching rows.
Hope that helps.