how to upsert data from mysql to ssms using ssis - mysql

how can we insert new data or update the data from one table to another table from MySQL to SQL server using ssis and by not using lookup.

A common way to do this is to insert new data to an empty temporary table, and then run SQL Merge command (using separate SQL Query task).
MERGE command is super powerful and can do updates, inserts or even deletes. See full description of Merge here:
https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017

The design for this will look like below :
You will have 4 tables and 1 view : Source, TMP_Dest (exactly as source with no PK), CHG_Dest(for changes, exactly as destination with no PK), Dest(will have PK), FV_TMP_Dest (this is in case the destination looks different than the source - different field types)
SSIS package :
1.Use ExecuteSQLTask and truncate TMP_Dest because it is just temporary for the extracted data
Use ExecuteSQlTask and truncate CHG_Dest because it is just temporary for the extracted data
Use one DataFlowTask for loading data from Source to TMP_Dest
Define two variables OperationIDInsert=1 and OperationIDUpdate=2 (the values are not important, you can set them as you want) -> you will use them at 5. point below
Use another DataFlowTask in which you will have:
on the left side OLE DB Source in which you will extract data from the view, ordered by PK (do not forget to set the SortKeyPosition from Advanced Editor for the PK fields)
on the right side OLE DB Source in which you will extract data from the Dest ordered by PK (do not forget to set the SortKeyPosition from Advanced Editor for the PK fields)
LEFT JOIN between this
on the left side ( "insert side") you will have: a derived column in which you will assign as Expression the OperationIDInsert variable AND an OLE DB Destination for inserting the data in CHG_Dest table. In this way, you will insert the data that have to be inserted in the destination table and you know this because you have the OperationIDInsert column.
on the right side you will do the same thing but using OperationIDUpdate column
You will use ExecuteSQLTask in the ControlFlow and will have an SQL Merge. Based on the PK fields and OperationIDInsert/OperationIDUpdate fields you will either insert the data or update it.
Hope this will help you. Let me know if you need additional info.

Related

Update a table (that has relationships) using another table in SSIS

I want to be able to update a specific column of a table using data from another table. Here's what the two tables look like, the DB type and SSIS components used to get the tables data (btw, both ID and Code are unique).
Table1(ID, Code, Description) [T-SQL DB accessed using ADO NET Source component]
Table2(..., Code, Description,...) [MySQL DB accessed using ODBC Source component]
I want to update the column Table1.Description using the Table2.Description by matching them with the right Code first (because Table1.Code is the same as Table2.Code).
What i tried:
Doing a Merge Join transformation using the Code column but I couldn't figure out how to reinsert the table because since Table1 has relationships i can't simply drop the table and replace it with the new one
Using a Lookup transformation but since both tables are not the same type it didn't allow me to create the lookup table's connection manager (which would be for in my case MySQL)
I'm still new to SSIS but any ideas or help would be greatly appreciated
My solution is based on #Akina's comments. Although using a linked server would've definitely fit, my requirement is to make an SSIS package to take care of migrating some old data.
The first and last are SQL tasks, while the Migrate ICDDx is the DFT that transfers the data to a staging table created during the first SQL task.
Here's the SQL commands that gets executed during Create Staging Table :
DROP TABLE IF EXISTS [tempdb].[##stagedICDDx];
CREATE TABLE ##stagedICDDx (
ID INT NOT NULL,
Code VARCHAR(15) NOT NULL,
Description NVARCHAR(500) NOT NULL,
........
);
and here's the sql command (based on #Akina's comment) for transferring from staged to final (inside Transfer Staged):
UPDATE [MyDB].[dbo].[ICDDx]
SET [ICDDx].[Description] = [##stagedICDDx].[Description]
FROM [dbo].[##stagedICDDx]
WHERE [ICDDx].[Code]=[##stagedICDDx].[Code]
GO
Here's the DFT used (both TSQL and MySQL sources return sorted output using ORDER BY Code, so i didnt have to insert Sort components before the Merge Join) :
Note: Btw, you have to setup the connection manager to retain/reuse the same connection so that the temporary table doesn't get deleted before we transfer data to it. If all goes well, then after the Transfer Staged SQL Task, the connection would be closed and the global temporary table would be deleted.

SSIS- Load destination Table with new column

We have one table (S1) from database1 and are loading data into another database (datbase2) and table (D1). We implemented this in ssis using OLEDB source (database1.S1) and OLEDB Destination (datbase2.D1).
We have to add a new column 'Addeddate' to destination table. For this we have used a Derived column in between source and destination.
Now my Thought is, instead of using a derived column can we create the added column in the source itself? because we need just record loaded date.
Yes, if you use a SQL Query in your Source (and you should) all you have to do is add the column to the query.
Something like,
SELECT
S1.Column1, S1.Column2, ... S1.ColumnN, GETDATE() AS RecordLoadedDate
FROM S1
WHERE ...

Update all rows of a single column from one table to the same table in another database

Ok, so I have a database in my testing environment called 'Food'. In this database, there is a table called 'recipe', with a column called 'source'.
This same database exists in my local environment. However, I just received an updated database (in my local environment) where all the column values (for 'source') have changed.
Is there any way I can migrate the 'source' column from my local to my test environment, without changing the values for any other column? There are 1186 rows in the 'Food' database 'recipe' table in my test environment that need to be updated ONLY with the 'source' column.
You need some way to uniquely identify your Recipes. If both tables have a surrogate key that remained constant, use that. Otherwise figure out some way to match up the new data with your test data: you might already have a unique index in mind or you might need to decide on a combination of fields that uniquely identify your Recipes.
On a side note, why can't you just overwrite all the columns? It is just test data, right?
If only a column has changed and you have IDs (or keys) on your rows, you could follow these steps:
create an intermediate table locally
insert keys and new source values there (either those which have changed or all)
use mysqldump to selectively export the table from the local database
copy the dumped table to the remote database server
import it there
join it with the production table in an update statement to replace the values
drop the intermediate table on the server

Creating centralized DB

I have 2 databases(A) with same name in different servers( B & C). Both the databases have same schema. (sql server 2008 r2)
Task 1: Copy(transfer) both the databases into 3rd server (D) with the names (A_B and A_C).
Task 2: Merge both the databases into one database(A_D). (I don't know how will I handle keys)
Task 3: On daily basis I have to get data from servers B & C and put in centralized server D.
Any help would be appreciated.
Thanks.
Ritesh
Here are a few ideas:
Task 1: Transfer databases by doing a backup an restore to server D.
Task 2: I think this will involve ETL processes and creating new surrogate keys in database A_D. Keep keys from original source in a data source id column. I think a MERGE statement would be helpful.
Task 3: Leverage logic in Task 2
Update for Task 2:
Say a source Table1 in database A and B has an key column named Table1_ID. In database A_D add columns Table1_SourceID and Table1_Source. Populate Table1_SourceID with the key from source database, and use Table1_Source to indicate the source database.
Use Table1_ID as the key for Table1, and is unique to database A_D. This will account for collisions for key columns in the source databases. Also, you can track the row to the source database.
Task 1: Create destination Databases with no structures. I'd use tasks -> export function on the source databases with create structures option in SSMS. After export you will have exact copies in destination.
Task 2: In each table of A_D create a new key column (SurKey). It has to be a combination of values which will give unique values in the whole table. E.g. source table abbreviation + PK column + date.
For each table create two Data Flows in SSIS Package, which will load data from A_B and A_C. Put a Derived Column component, which will add a new column - SurKey.
In A_B DataFlow put the A_B as an abbreviation, A_C in the second one.
Task 3: Use Data Flows you created. Script a Job in SSMS, add it to the daily plan.

Script to migrate data between two SQL Server databases

I have two SQL Server databases, and I need to write a script to migrate data from database A to database B. Both databases have the same schema.
I must loop through the tables and for each table I must follow those rules:
If the item I'm migrating does not exist in the target table (for example, the comparison is made on a column Name) then I insert it directly.
If the item I'm migrating exists in the target table then I need to only update certain columns (for example, only update Age and Address but do not touch other columns)
Can anyone help me with that script? Any example would suffice. Thanks a lot
EDIT:
I just need an example for one table. No need to loop, I can handle each table separately (because each table has its own comparison column and update columns)
The MERGE statement looks like it can help you here. An Example:
MERGE StudentTotalMarks AS stm
USING (SELECT StudentID,StudentName FROM StudentDetails) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE
WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25
WHEN NOT MATCHED THEN
INSERT(StudentID,StudentMarks)
VALUES(sd.StudentID,25);
The merge statement is available as of SQL Server 2008 so you are in luck
Instead of creating a script why don't you copy the source table under a different name into the target server (update needs to take place).
Then just do a simple insert where name does not exist.
Here is the SQL for step 1 only.
INSERT INTO [TableA]
SELECT Name,
XX,
XXXX
FROM TableB
WHERE NOT NAME IN(SELECT NAME
FROM TableA)