I have two SQL Server databases, and I need to write a script to migrate data from database A to database B. Both databases have the same schema.
I must loop through the tables and for each table I must follow those rules:
If the item I'm migrating does not exist in the target table (for example, the comparison is made on a column Name) then I insert it directly.
If the item I'm migrating exists in the target table then I need to only update certain columns (for example, only update Age and Address but do not touch other columns)
Can anyone help me with that script? Any example would suffice. Thanks a lot
EDIT:
I just need an example for one table. No need to loop, I can handle each table separately (because each table has its own comparison column and update columns)
The MERGE statement looks like it can help you here. An Example:
MERGE StudentTotalMarks AS stm
USING (SELECT StudentID,StudentName FROM StudentDetails) AS sd
ON stm.StudentID = sd.StudentID
WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE
WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25
WHEN NOT MATCHED THEN
INSERT(StudentID,StudentMarks)
VALUES(sd.StudentID,25);
The merge statement is available as of SQL Server 2008 so you are in luck
Instead of creating a script why don't you copy the source table under a different name into the target server (update needs to take place).
Then just do a simple insert where name does not exist.
Here is the SQL for step 1 only.
INSERT INTO [TableA]
SELECT Name,
XX,
XXXX
FROM TableB
WHERE NOT NAME IN(SELECT NAME
FROM TableA)
Related
I'm trying to delete records in my target table based on whether records exist in source table. I tried using a 'Delete' step, but I noticed that this step is based on a conditional clause.
My condition is quite simple "if the record/row does NOT exist in table A [source], delete the record/row from table B [destination]".
I also read about using a 'Merge Rows (diff)' step, but it seems to check/compare the entire set of tables for differences.
The table has several million records with many hundreds of columns in a MySQL server, I need to do this in the most efficient way.
I'm doing a search of table A with the Table input object and sql command:
'' ' SELECT I went , user , password , attribute , op FROM viewuserradiusunisulma
Any help would be appreciated.
print - image screen pentaho transformation
Transformation
Delete Pentaho
if your source and target table are in the same database, you can use a SQL query to delete all records in tableB that don't have a corresponding entry in tableA:
delete tableB where not exists (select id from tableA where id = tableB.id)
if source and destination tables are not in the same database, you would have to go through all rows in tableB and check whether the record exists in tableA. If your source tableA has a limited number of rows, loading the key values in memory and then performing a stream lookup instead of a database lookup would be much faster. I'd probably try that even with higher number of rows because of the significant performance impact.
note: I hope I haven't messed up the sql syntax, I'm thinking almost exclusively in abap at the moment and that messes with my memory a bit. So please test this on some backup before firing away.
I found the solution. In this case, I check the records, then report, update and enter the new data
Trasnsformation
how can we insert new data or update the data from one table to another table from MySQL to SQL server using ssis and by not using lookup.
A common way to do this is to insert new data to an empty temporary table, and then run SQL Merge command (using separate SQL Query task).
MERGE command is super powerful and can do updates, inserts or even deletes. See full description of Merge here:
https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
The design for this will look like below :
You will have 4 tables and 1 view : Source, TMP_Dest (exactly as source with no PK), CHG_Dest(for changes, exactly as destination with no PK), Dest(will have PK), FV_TMP_Dest (this is in case the destination looks different than the source - different field types)
SSIS package :
1.Use ExecuteSQLTask and truncate TMP_Dest because it is just temporary for the extracted data
Use ExecuteSQlTask and truncate CHG_Dest because it is just temporary for the extracted data
Use one DataFlowTask for loading data from Source to TMP_Dest
Define two variables OperationIDInsert=1 and OperationIDUpdate=2 (the values are not important, you can set them as you want) -> you will use them at 5. point below
Use another DataFlowTask in which you will have:
on the left side OLE DB Source in which you will extract data from the view, ordered by PK (do not forget to set the SortKeyPosition from Advanced Editor for the PK fields)
on the right side OLE DB Source in which you will extract data from the Dest ordered by PK (do not forget to set the SortKeyPosition from Advanced Editor for the PK fields)
LEFT JOIN between this
on the left side ( "insert side") you will have: a derived column in which you will assign as Expression the OperationIDInsert variable AND an OLE DB Destination for inserting the data in CHG_Dest table. In this way, you will insert the data that have to be inserted in the destination table and you know this because you have the OperationIDInsert column.
on the right side you will do the same thing but using OperationIDUpdate column
You will use ExecuteSQLTask in the ControlFlow and will have an SQL Merge. Based on the PK fields and OperationIDInsert/OperationIDUpdate fields you will either insert the data or update it.
Hope this will help you. Let me know if you need additional info.
Is this the right syntax:
INSERT INTO stock (Image)
SELECT Image,
FROM productimages
WHERE stock.Name_of_item = productimages.number;
SQL Server Management Studio's "Import Data" task (right-click on the DB name, then tasks) will do most of this for you. Run it from the database you want to copy the data into.
If the tables don't exist it will create them for you, but you'll probably have to recreate any indexes and such. If the tables do exist, it will append the new data by default but you can adjust that (edit mappings) so it will delete all existing data.
I use this all the time and it works fairly well.
INSERT INTO bar..tblFoobar( *fieldlist* )
SELECT *fieldlist* FROM foo..tblFoobar
This just moves the data. If you want to move the table definition (and other attributes such as permissions and indexes), you'll have to do something else.
The query logic that you are trying appears to be not correct (the query itself is buggy).
Assuming you have the correct query for the above logic and what you are trying is to insert new rows into the table stock by selecting a column from productimages table with a matching record as stock.Name_of_item = productimages.number
The above logic will add redundant data in to the table.
You perhaps looking to update instead of insert, something as -
update stock s
join productimages p on p.number = s.Name_of_item
set s.Image = p.Image
I have created a system using PHP/MySQL that downloads a large XML dataset, parses it and then inserts the parsed data into a MySQL database every week.
This system is made up of two databases with the same structure. One is a production database and one is a temporary database where the data is parsed and inserted into first.
When the data has been inserted into the temporary database I perform a merge by inserting/replacing the data in the production database. I have done all of the above so far. I then realised, data that might have been removed in a new dataset will be left to linger in the production database.
I need to perform a check to see if the new data is still in the production database, if it is then leave it, if it isn't delete the row from the production database so that the rows aren't left to linger.
For arguments sake, let's say the two databases are called database_temporary and database_production.
How can I go about doing this?
If you are using SQL to merge, a simple SQL can do the delete as well:
delete from database_production.table
where pk not in (select pk from database_temporary.table)
Notes:
This assumes that there is a a row can be uniquely identified. This may be based on a single column, multiple columns or another mechanism.
If your dataset is large, a not exists mey perform better than not in. See What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL? and NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: SQL Server
An example not exists:
delete from database_production.table p
where not exists (select 1 from database_temporary.table t where t.pk = p.pk)
Performance Notes:
As pointed out by #mgonzalez in the comments on the question, you may want to use a timestamp column (something like last modified) for comparing/merging in general so that you vompare only changed rows. This does not apply to the delete specifically, you cannot use timestamp for the delete because, well, the row would not exist.
I'm working on a database right now and I have a pretty specific problem that I'm trying to figure out:
I have a large master table in my database with all the info we're gathering. We're updating records in this master table based on Excel files returned to us by various team members across the company - all of the records have unique ID numbers so we know what fields in the master table to update. We are tracking who responds by updating the file name into the master table as well. I want to update this with the file name; however, if two sources give me the same data, I want to append the second file to the first file rather than replace it with an update.
The problem is, I need the query to "know" when to update and when to append. Is there some IF statement I can use - maybe Update when Null, Append when Not Null?
You can refer to an Excel sheet or range in a query:
INSERT INTO Table1 ( ADate )
SELECT SomeDate FROM [Excel 8.0;HDR=YES;DATABASE=Z:\Docs\Test.xls].[Sheet1$a1:a4]
WHERE SomeDate Is Not Null
This means that you can run queries based on the presence or absence of data in the Excel file.