How to specify column mapping in AWS Data pipeline? - mysql

I am using AWS Data Pipeline to copy data from Redshift to MySQL in RDS. The data is copied to MySQL. In the pipeline, the insert query is specified as below:
insert into test_Employee(firstname,lastname,email,salary) values(?,?,?,?);
Is there any way for me to specify the column names of the source table in place of ? in the above query? I tried adding the source table's column names, but that does not seem to work. Currently the column names in the source and destination tables are the same.
Thanks for your time. Let me know if any other information is required.

Specifying column names instead of ? won't work, because the insert SQL query knows nothing about your source data source. The AWS copy activity just passes the parameters to this query in the same order in which you selected them from the source dataset.
However, the column list for the destination table (test_Employee) in the insert query doesn't have to follow the order of the columns in the DDL, so you can reorder it to match the order of the columns in the source table.
For example, if your source dataset has the following columns:
email,first_name,last_name,salary
Insert query:
insert into test_Employee(email,firstname,lastname,salary) values(?,?,?,?);
Note: as you can see, the column names of the source and destination tables don't have to match. Only the position of each value matters.
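To illustrate the positional binding described above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL and the pipeline (an assumption made only so the example can run anywhere); the function name and the sample row are hypothetical.

```python
import sqlite3


def copy_by_position(source_rows):
    """Insert source rows through positional '?' placeholders."""
    conn = sqlite3.connect(":memory:")
    # The destination DDL order deliberately differs from the source order.
    conn.execute(
        "CREATE TABLE test_Employee "
        "(firstname TEXT, lastname TEXT, email TEXT, salary REAL)"
    )
    # The column list follows the SOURCE order:
    # email, first_name, last_name, salary.
    conn.executemany(
        "INSERT INTO test_Employee (email, firstname, lastname, salary) "
        "VALUES (?, ?, ?, ?)",
        source_rows,
    )
    return conn.execute(
        "SELECT firstname, lastname, email, salary FROM test_Employee"
    ).fetchall()


# Hypothetical source row, in source column order.
print(copy_by_position([("ann@example.com", "Ann", "Lee", 5000.0)]))
# prints [('Ann', 'Lee', 'ann@example.com', 5000.0)]
```

Each value lands in the column named at the same position in the insert's column list, regardless of what the source column was called.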

Related

How to upsert data from MySQL to SSMS using SSIS

How can we insert new data or update existing data from one table to another, from MySQL to SQL Server, using SSIS and without using Lookup?
A common way to do this is to insert the new data into an empty temporary table and then run the SQL MERGE command (using a separate Execute SQL task).
The MERGE command is very powerful and can do updates, inserts, or even deletes. See the full description of MERGE here:
https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
The design for this will look like the following.
You will have 4 tables and 1 view: Source, TMP_Dest (exactly like Source, with no PK), CHG_Dest (for changes; exactly like Dest, with no PK), Dest (which has a PK), and FV_TMP_Dest (a view over TMP_Dest, for the case where the destination looks different from the source, e.g. different field types).
SSIS package:
1. Use an ExecuteSQLTask to truncate TMP_Dest, because it only holds the extracted data temporarily.
2. Use an ExecuteSQLTask to truncate CHG_Dest, because it is also just temporary.
3. Use one DataFlowTask to load data from Source into TMP_Dest.
4. Define two variables, OperationIDInsert=1 and OperationIDUpdate=2 (the values are not important; you can set them as you want). You will use them in step 5.
5. Use another DataFlowTask in which you will have:
   a. on the left side, an OLE DB Source that extracts data from the view, ordered by PK (do not forget to set the SortKeyPosition in the Advanced Editor for the PK fields)
   b. on the right side, an OLE DB Source that extracts data from Dest, ordered by PK (do not forget to set the SortKeyPosition in the Advanced Editor for the PK fields)
   c. a LEFT JOIN between them
   d. on the left side (the "insert side"), a derived column whose Expression is the OperationIDInsert variable, and an OLE DB Destination that inserts the data into the CHG_Dest table. This way you store the rows that have to be inserted into the destination table, and you can recognize them by the OperationIDInsert value.
   e. on the right side, the same thing, but using the OperationIDUpdate variable
6. Use an ExecuteSQLTask in the ControlFlow to run a SQL MERGE. Based on the PK fields and the OperationIDInsert/OperationIDUpdate values, it will either insert the data or update it.
Hope this helps. Let me know if you need additional info.
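The classify-then-apply flow in the steps above can be sketched in plain SQL. This is only an illustration under stated assumptions: it uses Python's sqlite3 instead of SSIS and SQL Server, reads TMP_Dest directly rather than through the view, replaces MERGE (which SQLite lacks) with a separate UPDATE and INSERT, and invents the `id` and `name` columns and both function names.

```python
import sqlite3

OPERATION_ID_INSERT = 1  # arbitrary marker values, as in step 4
OPERATION_ID_UPDATE = 2


def stage_and_apply(conn):
    """Classify staged rows via LEFT JOIN, then apply them to Dest."""
    conn.execute("DELETE FROM CHG_Dest")  # stands in for TRUNCATE
    # A staged row with no match in Dest is an insert, otherwise an update.
    conn.execute(
        """
        INSERT INTO CHG_Dest (id, name, operation_id)
        SELECT t.id, t.name,
               CASE WHEN d.id IS NULL THEN ? ELSE ? END
        FROM TMP_Dest AS t
        LEFT JOIN Dest AS d ON d.id = t.id
        """,
        (OPERATION_ID_INSERT, OPERATION_ID_UPDATE),
    )
    # Apply the updates (the WHEN MATCHED branch of a MERGE).
    conn.execute(
        """
        UPDATE Dest
        SET name = (SELECT c.name FROM CHG_Dest AS c
                    WHERE c.id = Dest.id AND c.operation_id = ?)
        WHERE id IN (SELECT id FROM CHG_Dest WHERE operation_id = ?)
        """,
        (OPERATION_ID_UPDATE, OPERATION_ID_UPDATE),
    )
    # Apply the inserts (the WHEN NOT MATCHED branch of a MERGE).
    conn.execute(
        "INSERT INTO Dest (id, name) "
        "SELECT id, name FROM CHG_Dest WHERE operation_id = ?",
        (OPERATION_ID_INSERT,),
    )
    return conn.execute("SELECT id, name FROM Dest ORDER BY id").fetchall()


def demo_stage_and_apply():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TMP_Dest (id INTEGER, name TEXT);
        CREATE TABLE CHG_Dest (id INTEGER, name TEXT, operation_id INTEGER);
        CREATE TABLE Dest (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO Dest VALUES (1, 'old'), (2, 'keep');
        INSERT INTO TMP_Dest VALUES (1, 'new'), (3, 'added');
        """
    )
    return stage_and_apply(conn)


print(demo_stage_and_apply())
```

On SQL Server, the two apply statements would typically collapse into one MERGE keyed on the PK.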

SSIS- Load destination Table with new column

We have one table (S1) in database1 and are loading its data into a table (D1) in another database (database2). We implemented this in SSIS using an OLE DB Source (database1.S1) and an OLE DB Destination (database2.D1).
We have to add a new column, 'Addeddate', to the destination table. For this we have used a Derived Column between the source and the destination.
Now my thought is: instead of using a Derived Column, can we create the added column in the source itself? We just need the date the record was loaded.
Yes. If you use a SQL query in your source (and you should), all you have to do is add the column to the query.
Something like:
SELECT
S1.Column1, S1.Column2, ... S1.ColumnN, GETDATE() AS RecordLoadedDate
FROM S1
WHERE ...
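As a runnable illustration of computing the load date in the source query, here is a small sketch. It assumes SQLite via Python's sqlite3, so `CURRENT_TIMESTAMP` replaces SQL Server's `GETDATE()`; the two data columns and the function name are hypothetical.

```python
import sqlite3


def load_with_added_date():
    """Copy S1 into D1, stamping each row in the source SELECT."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE S1 (Column1 TEXT, Column2 TEXT);
        CREATE TABLE D1 (Column1 TEXT, Column2 TEXT, Addeddate TEXT);
        INSERT INTO S1 VALUES ('a', 'b'), ('c', 'd');
        """
    )
    # The load date comes from the source query itself, so no separate
    # derived-column step is needed between source and destination.
    conn.execute(
        """
        INSERT INTO D1 (Column1, Column2, Addeddate)
        SELECT Column1, Column2, CURRENT_TIMESTAMP FROM S1
        """
    )
    return conn.execute(
        "SELECT Column1, Column2, Addeddate FROM D1 ORDER BY Column1"
    ).fetchall()


for row in load_with_added_date():
    print(row)
```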

Can't copy unique records from one database table to another?

Hi,
I am trying to copy unique records from a database table to a table of the same name in a different database. The source database contains some records that are already present in the destination database; I don't need those, only the other ones. The destination database is called "test" and the source database is "forums". The table is named store in both cases. I am using this query:
INSERT INTO test.store (cs_key, cs_value, cs_array, cs_updated,cs_rebuild)
SELECT DISTINCT cs_key, cs_value, cs_array, cs_updated,cs_rebuild
FROM forums.store
But I am getting many errors as I try to run this query. Why?
Thank you.

Update all rows of a single column from one table to the same table in another database

Ok, so I have a database in my testing environment called 'Food'. In this database, there is a table called 'recipe', with a column called 'source'.
This same database exists in my local environment. However, I just received an updated database (in my local environment) where all the column values (for 'source') have changed.
Is there any way I can migrate the 'source' column from my local to my test environment, without changing the values for any other column? There are 1186 rows in the 'Food' database 'recipe' table in my test environment that need to be updated ONLY with the 'source' column.
You need some way to uniquely identify your Recipes. If both tables have a surrogate key that remained constant, use that. Otherwise figure out some way to match up the new data with your test data: you might already have a unique index in mind or you might need to decide on a combination of fields that uniquely identify your Recipes.
On a side note, why can't you just overwrite all the columns? It is just test data, right?
If only one column has changed and you have IDs (or keys) on your rows, you could follow these steps:
1. Create an intermediate table locally.
2. Insert the keys and the new source values there (either only those that have changed, or all of them).
3. Use mysqldump to selectively export that table from the local database.
4. Copy the dumped table to the remote database server.
5. Import it there.
6. Join it with the target table in an update statement to replace the values.
7. Drop the intermediate table on the server.
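The final update and cleanup steps might look like the sketch below. It is an illustration only: it runs on SQLite through Python's sqlite3 rather than MySQL, uses a correlated subquery where MySQL would allow a multi-table UPDATE ... JOIN, and the `id` and `name` columns, the `recipe_source_tmp` table, and the function names are all hypothetical.

```python
import sqlite3


def apply_source_updates(conn):
    """Overwrite recipe.source from the imported intermediate table."""
    # Only the 'source' column is touched; rows without a match are skipped.
    conn.execute(
        """
        UPDATE recipe
        SET source = (SELECT s.source FROM recipe_source_tmp AS s
                      WHERE s.id = recipe.id)
        WHERE id IN (SELECT id FROM recipe_source_tmp)
        """
    )
    # Drop the intermediate table once the values are in place.
    conn.execute("DROP TABLE recipe_source_tmp")
    return conn.execute(
        "SELECT id, name, source FROM recipe ORDER BY id"
    ).fetchall()


def demo_recipe_update():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recipe (id INTEGER PRIMARY KEY, name TEXT, source TEXT);
        INSERT INTO recipe VALUES (1, 'Soup', 'old-a'), (2, 'Stew', 'old-b');
        CREATE TABLE recipe_source_tmp (id INTEGER PRIMARY KEY, source TEXT);
        INSERT INTO recipe_source_tmp VALUES (1, 'new-a');
        """
    )
    return apply_source_updates(conn)


print(demo_recipe_update())
```

Restricting the update with the `WHERE id IN (...)` clause matters: without it, rows missing from the intermediate table would have their `source` set to NULL.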

Script to migrate data between two SQL Server databases

I have two SQL Server databases, and I need to write a script to migrate data from database A to database B. Both databases have the same schema.
I must loop through the tables and for each table I must follow those rules:
If the item I'm migrating does not exist in the target table (for example, the comparison is made on a column Name) then I insert it directly.
If the item I'm migrating exists in the target table then I need to only update certain columns (for example, only update Age and Address but do not touch other columns)
Can anyone help me with that script? Any example would suffice. Thanks a lot
EDIT:
I just need an example for one table. No need to loop, I can handle each table separately (because each table has its own comparison column and update columns)
The MERGE statement looks like it can help you here. An example:
MERGE StudentTotalMarks AS stm
USING (SELECT StudentID,StudentName FROM StudentDetails) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE
WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25
WHEN NOT MATCHED THEN
INSERT(StudentID,StudentMarks)
VALUES(sd.StudentID,25);
The MERGE statement is available as of SQL Server 2008, so you are in luck.
Instead of creating a script, why don't you copy the source table under a different name onto the target server (where the update needs to take place)?
Then just do a simple insert where the name does not exist.
Here is the SQL for step 1 only.
INSERT INTO [TableA]
SELECT Name,
       XX,
       XXXX
FROM TableB
WHERE Name NOT IN (SELECT Name
                   FROM TableA)
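Both rules from the question (insert when the name is missing, otherwise update only Age and Address) can be sketched together as follows. This is an illustration under assumptions: it runs on SQLite through Python's sqlite3 rather than SQL Server, TableB is taken to be the copied source table and TableA the target as in the snippet above, Name is assumed to be non-NULL and unique, and the Notes column and function names are hypothetical.

```python
import sqlite3


def migrate_people(conn):
    """Update Age/Address for existing names, insert the missing ones."""
    # Existing names: update only Age and Address, leave other columns alone.
    conn.execute(
        """
        UPDATE TableA
        SET Age = (SELECT b.Age FROM TableB AS b
                   WHERE b.Name = TableA.Name),
            Address = (SELECT b.Address FROM TableB AS b
                       WHERE b.Name = TableA.Name)
        WHERE Name IN (SELECT Name FROM TableB)
        """
    )
    # Missing names: insert the whole row.
    conn.execute(
        """
        INSERT INTO TableA (Name, Age, Address, Notes)
        SELECT Name, Age, Address, Notes FROM TableB
        WHERE Name NOT IN (SELECT Name FROM TableA)
        """
    )
    return conn.execute(
        "SELECT Name, Age, Address, Notes FROM TableA ORDER BY Name"
    ).fetchall()


def demo_migrate_people():
    conn = sqlite3.connect(":memory:")
    ddl = "(Name TEXT NOT NULL, Age INTEGER, Address TEXT, Notes TEXT)"
    conn.execute("CREATE TABLE TableA " + ddl)
    conn.execute("CREATE TABLE TableB " + ddl)
    conn.execute("INSERT INTO TableA VALUES ('Ann', 30, 'Old St', 'keep me')")
    conn.execute(
        "INSERT INTO TableB VALUES "
        "('Ann', 31, 'New St', 'ignored'), ('Bob', 40, 'Main St', 'new')"
    )
    return migrate_people(conn)


print(demo_migrate_people())
```

The update runs before the insert so that freshly inserted rows are not processed twice. On SQL Server, a single MERGE with WHEN MATCHED and WHEN NOT MATCHED branches would cover both statements.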