SSIS: Update a few columns of a row for which the primary key already exists

The following example explains my scenario. My database table has the following columns:
Column 1: Operating_ID (the primary key)
Column 2: Name
Column 3: Phone
Column 4: Address
Column 5: Start Date
Column 6: End Date
The values for columns 1, 2, 3, and 4 come from an extract that is pushed to the database daily by an SSIS data flow task.
The values for columns 5 and 6 are entered by users in a web application and saved to the database.
In the SSIS process, instead of getting a primary key violation error, I need to update columns 2, 3, and 4 if the primary key (column 1) already exists.
First I considered a replace, but that deletes the user-entered data in columns 5 and 6.
I would like to keep the data in columns 5 and 6 and update columns 2, 3, and 4 when column 1 already exists.

Do a Lookup on Operating_ID. Change the lookup's no-match handling from "Fail component" to "Redirect rows to no match output".
If no match is found, route the row to the INSERT.
If a match is found, route the row to the UPDATE. You can run an OLE DB Command to update row by row, but for a large data set you are better off loading the rows into a table and doing an UPDATE with a JOIN.

This is what I would do. I would put all the data in a staging table. Then I would use a data flow to insert the new records; the source of that data flow would be the staging table, with a NOT EXISTS clause referencing the prod table.
Then I would use an Execute SQL task in the control flow to update the data for existing rows.
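
To make the staging approach concrete, here is a minimal sketch using Python's built-in sqlite3 module so it runs standalone. The table names Prod and Staging and the sample rows are made up for illustration. On SQL Server the update step would more likely be an UPDATE with a JOIN; correlated subqueries are used here only because they work on any SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables mirroring the question's six columns.
conn.executescript("""
CREATE TABLE Prod (
    Operating_ID INTEGER PRIMARY KEY,
    Name TEXT, Phone TEXT, Address TEXT,
    Start_Date TEXT, End_Date TEXT
);
CREATE TABLE Staging (
    Operating_ID INTEGER PRIMARY KEY,
    Name TEXT, Phone TEXT, Address TEXT
);
INSERT INTO Prod VALUES (1, 'Old Name', '111', 'Old St', '2020-01-01', '2020-12-31');
INSERT INTO Staging VALUES (1, 'New Name', '222', 'New St');
INSERT INTO Staging VALUES (2, 'Second', '333', 'Other St');
""")

# Step 1 (the data flow): insert only the rows whose key is not in Prod yet.
conn.execute("""
INSERT INTO Prod (Operating_ID, Name, Phone, Address)
SELECT s.Operating_ID, s.Name, s.Phone, s.Address
FROM Staging s
WHERE NOT EXISTS (SELECT 1 FROM Prod p WHERE p.Operating_ID = s.Operating_ID)
""")

# Step 2 (the Execute SQL task): refresh columns 2-4 for keys present in Staging.
# Start_Date and End_Date are never mentioned, so user input is preserved.
conn.execute("""
UPDATE Prod
SET Name    = (SELECT s.Name    FROM Staging s WHERE s.Operating_ID = Prod.Operating_ID),
    Phone   = (SELECT s.Phone   FROM Staging s WHERE s.Operating_ID = Prod.Operating_ID),
    Address = (SELECT s.Address FROM Staging s WHERE s.Operating_ID = Prod.Operating_ID)
WHERE EXISTS (SELECT 1 FROM Staging s WHERE s.Operating_ID = Prod.Operating_ID)
""")
conn.commit()

rows = conn.execute("SELECT * FROM Prod ORDER BY Operating_ID").fetchall()
for row in rows:
    print(row)
```

Row 1 keeps its Start Date and End Date while Name, Phone, and Address are refreshed; row 2 is inserted with empty dates.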

Related

Trying to copy or somehow move the contents (values) in a TEXT column 3k+ large rows to another table in the same database without success

I have created a new column in the "destination" table with the same name, datatype and other values as appear in the "source" column in a different table. I have tried many suggested solutions as found on stackoverflow. This one appeared to work (found on Quora) but when I went to the destination table the previously empty column remains empty with nothing but NULL values noted. This is the Quora suggestion:
you can fill data in column from another existing one by using INSERT INTO statement and SELECT statement together like that
INSERT INTO `table1`(column_name)
SELECT column_name FROM `table2`
here you filled a single column in table 1 with data located in a single column in table 2
so if you want to fill the whole table 1 (all columns) with data located in table 2 you can make table 1 like a copy of table 2 by using the same code but without column name
INSERT INTO `table1`
SELECT * FROM `table2`
but note to do this (copy table content to another one) ensure that both of tables have the same column count and data types.
I'm not sure what is meant by column count (do the two tables have to have the same number of columns?).
When I run it I get error # 1138.
Any help greatly appreciated. -JG
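
One possible reason the existing rows still show NULL: INSERT ... SELECT appends new rows, it does not fill a column on rows that already exist. The sketch below shows the difference using Python's sqlite3 module. The tables source_t and dest_t and the shared key id are assumptions, not details from the question; in MySQL the update step would more typically be written as UPDATE ... JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables; "id" is an assumed key shared by both.
conn.executescript("""
CREATE TABLE source_t (id INTEGER PRIMARY KEY, notes TEXT);
CREATE TABLE dest_t (id INTEGER PRIMARY KEY, title TEXT, notes TEXT);
INSERT INTO source_t VALUES (1, 'first note'), (2, 'second note');
INSERT INTO dest_t (id, title) VALUES (1, 'A'), (2, 'B');
""")

# INSERT ... SELECT adds two NEW rows; rows 1 and 2 keep notes = NULL.
conn.execute("INSERT INTO dest_t (notes) SELECT notes FROM source_t")
after_insert = conn.execute(
    "SELECT id, title, notes FROM dest_t ORDER BY id").fetchall()

# Undo the stray rows, then fill the existing rows by matching on the key.
conn.execute("DELETE FROM dest_t WHERE title IS NULL")
conn.execute("""
UPDATE dest_t
SET notes = (SELECT s.notes FROM source_t s WHERE s.id = dest_t.id)
""")
after_update = conn.execute(
    "SELECT id, title, notes FROM dest_t ORDER BY id").fetchall()

print(after_insert)
print(after_update)
```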

SSIS Lookup Transformation No Match Output Only Populates Null

I am trying to use the Lookup transformation but cannot seem to get the functionality I need out of it. I have two tables with exactly the same structure:
Temp Table (input): Smaller table but may have entries that do not exist in other table
Reference Lookup Table: Larger table that may not have identical entries to Temp Table.
I am trying to compare the entries of the Temp Table to the entries of the Reference Lookup Table. Anything that exists in the Temp Table, but not the Lookup should be output to a separate table (No match output).
It is a very simple Data Flow, but it does not seem to accomplish the lookup properly. It will find "No Match" rows, but the "no match" table is populated with null values for every column. I am trying to figure out why the data is losing its values?
How the Lookup is setup:
The data in temp table is what drives your data flow. 151 rows flowed out of it.
Your lookup is going to match based on whatever criteria you specify, and you've indicated that if there is no match, you want to push the no-match data into a table.
Since the lookup task cannot add columns to the no-match output path, this would imply your source (temp table) started NULL across the board.
Drop a data viewer/data tap onto the data flow between the lookup and the destination and then compare that data to your source. I suspect you're going to discover that the process that populated Temp table is at fault.
In the Lookup Transformation, in the columns tab you have identified that you want to use the value from the reference table to replace the value from the source.
That works great until you get a no-match, in which case the component does the non-intuitive thing (even to me, with 15+ years of working with it) of updating that column whether it matches or not.
Source query
SELECT 21 AS tipID, NULL AS tipYear
UNION ALL SELECT 22, 2020
UNION ALL SELECT 64263810, 2020
This adds three rows to my data flow, the first with no tipYear and the next two rows with a year of 2020. Stamp of 1 in the below image
Lookup query
SELECT *
FROM (
    VALUES (20, 1111), (21, 2021), (22, 2022)
) D(tipID, tipYear)
This reference data will supply a year for all the matches (21 and 22). In the matched path, we'll see 21 supplied with a value and 22 will have its year updated. Stamp 2 in the image
For id 64263810, however, no match will be found, and we'll see the initial value of 2020 replaced with the value from the (nonexistent) matching row, i.e. NULL. Stamp 3
Lesson learned: if you need to use data from the reference table but also have a no-match output path, do not replace the column in the Lookup transformation (unless your intention is to wipe out data).
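
The behaviour described in this answer can be mimicked with a toy model in Python. This is not how SSIS is implemented, only a sketch of the described effect: the function lookup_replace is made up for illustration, and the data is the source and reference rows from above.

```python
# Source rows from the answer's source query.
source_rows = [
    {"tipID": 21, "tipYear": None},
    {"tipID": 22, "tipYear": 2020},
    {"tipID": 64263810, "tipYear": 2020},
]

# Reference rows from the answer's lookup query, keyed by tipID.
reference = {20: 1111, 21: 2021, 22: 2022}


def lookup_replace(rows, ref):
    """Toy model of a lookup set to REPLACE tipYear with the reference value."""
    matched, no_match = [], []
    for row in rows:
        out = dict(row)
        # The column is overwritten with the lookup result either way;
        # when nothing matches, that result is None (NULL).
        out["tipYear"] = ref.get(row["tipID"])
        if row["tipID"] in ref:
            matched.append(out)
        else:
            no_match.append(out)
    return matched, no_match


matched, no_match = lookup_replace(source_rows, reference)
print(matched)
print(no_match)
```

In this model the no-match row for 64263810 loses its original 2020, which is the effect described above. Adding the reference value as a new column instead of replacing the existing one would leave the source value intact.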

How to run a series of checks governed by a mysql table and store the results

I have to run a series of checks (governed by the table "Checks") and store the results in a table "Check_results" (in a MySQL database).
The table "Checks" contains an identifier (checkno) and a SQL statement (possibly returning many rows, each with a single value) to be executed.
The table "Check_results" has to contain all the rows returned by the SQL statement, with a reference to checkno and an auto-increment column checkentry for each returned row.
Is it possible to do this?
What I was suggesting was when your table has the 2 SQL statements, you should read each record and construct another SQL statement along the lines of:
insert into check_results(checkno, checkresult )
select 1, i.val1-i.val2 from import i;
The select just needs the checkno added into it and the checkentry should be an autoincrement column.
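
A rough sketch of that loop, using Python's sqlite3 module so it runs standalone. The column name sqltext and the sample checks are assumptions. Rather than editing checkno into each stored SELECT, this variation wraps the stored statement as a subquery, which only works if every check returns exactly one column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema; "sqltext" holds the statement for each check.
conn.executescript("""
CREATE TABLE import (val1 INTEGER, val2 INTEGER);
INSERT INTO import VALUES (10, 3), (7, 7);

CREATE TABLE Checks (checkno INTEGER PRIMARY KEY, sqltext TEXT);
INSERT INTO Checks VALUES (1, 'select i.val1 - i.val2 from import i');
INSERT INTO Checks VALUES (2, 'select count(*) from import');

CREATE TABLE check_results (
    checkentry  INTEGER PRIMARY KEY AUTOINCREMENT,
    checkno     INTEGER,
    checkresult INTEGER
);
""")

checks = conn.execute(
    "SELECT checkno, sqltext FROM Checks ORDER BY checkno").fetchall()

for checkno, sqltext in checks:
    # The stored SQL is spliced in as text, so the Checks table must be trusted.
    conn.execute(
        "INSERT INTO check_results (checkno, checkresult) "
        f"SELECT ?, q.* FROM ({sqltext}) AS q",
        (checkno,),
    )
conn.commit()

results = conn.execute(
    "SELECT checkentry, checkno, checkresult FROM check_results "
    "ORDER BY checkentry").fetchall()
for row in results:
    print(row)
```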

SSIS-Replace the duplicate column with empty string keeping the original column

Can anyone please help me with the requirement below?
I need to check whether a column in a record matches any other column in the same record, and if so, replace the duplicate column with an empty string.
Say I have columns x1, x2, x3. How do I check whether x1 matches any of the x1, x2, x3 columns and, if it does, replace the duplicate column with an empty string?
Doing this is more complex than one would expect. Here are 2 options:
Try the fuzzy lookup by duplicating the file and comparing it with itself with a high threshold. I suspect you want to check for the same record if there is a match on other columns so you will need to create an exact match on the key (go under the Columns tab and right click on the link, Edit Mappings) and do the fuzzy on the others. You can only link a field once so duplicate the columns as needed.
Do a stored proc with all the combinations and have it generate an out table with the results (you can run a stored proc using the OLE DB Command). I would probably go with that one if I am sure of the "exactness" of the data. Otherwise, go with the fuzzy.
Since you only have a few columns, you could just run a set of update statements like the following:
update Contacts
set Phone2 = null
where Phone2 = Phone1
update Contacts
set Phone3 = null
where Phone3 = Phone1
update Contacts
set Phone3 = null
where Phone3 = Phone2
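
For anyone who wants to try the three statements above, here is a small standalone run using Python's sqlite3 module. The Contacts rows are invented. It sets duplicates to NULL as in the statements above; if you want an empty string as asked in the question, swap NULL for '' in the UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up sample data: each row has a different duplication pattern.
conn.executescript("""
CREATE TABLE Contacts (
    id INTEGER PRIMARY KEY, Phone1 TEXT, Phone2 TEXT, Phone3 TEXT
);
INSERT INTO Contacts VALUES
    (1, '555-0100', '555-0100', '555-0199'),
    (2, '555-0101', '555-0102', '555-0102'),
    (3, '555-0103', '555-0103', '555-0103');
""")

# Same three updates as above: (column to blank, column it duplicates).
pairs = [("Phone2", "Phone1"), ("Phone3", "Phone1"), ("Phone3", "Phone2")]
for target, other in pairs:
    # Column names come from the fixed list above, not from user input.
    conn.execute(
        f"UPDATE Contacts SET {target} = NULL WHERE {target} = {other}")
conn.commit()

rows = conn.execute(
    "SELECT id, Phone1, Phone2, Phone3 FROM Contacts ORDER BY id").fetchall()
for row in rows:
    print(row)
```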
Accomplishing this task within an SSIS dataflow would be a bit tricky, because you would be trying to compare all of the other rows in all the buffers compared to the current row.
Instead, I would recommend staging the data in a table as Gordon Bell has suggested. Then you need to determine which row wins when a duplicate is found. You might have a date column to sort it out, or you may add a row number column to the data flow in ssis and sort by how you received the data.
Here is an example of how you might find the winning row and update others with a self join: Deleting duplicate record in SQL Server

Update column data type and update the data

I have a SQL server database design problem.
I have an existing database table with hundreds of records in it. One of the columns is of type NVARCHAR, but it should be an integer referencing a lookup table that holds the actual values.
Is there any clever way in SQL Server to get the data out of the column and into a new lookup table, change the data type of the column, and update the values with the correct ID from the new lookup table?
Thanks in advance.
I'm using SQL Server 2008
No, you have to do it as 3 steps:
Insert the values into the new lookup table
Update the current rows so that the nvarchar column now contains appropriate ID values from the lookup table
Change the column definition to int
(4th of 3 :-)) create a foreign key constraint between this column and the ID column of the lookup table.
Thankfully, int values should always be able to fit in an nvarchar, unless it's an especially small one (in which case you'll have to expand it first).
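
The first two steps can be sketched as follows, using Python's sqlite3 module so it runs standalone. The tables Orders and StatusLookup and their columns are made up for illustration. SQLite has no ALTER COLUMN, so the last two steps are only shown as T-SQL in comments and are not executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table whose text column "Status" should become an integer ID.
conn.executescript("""
CREATE TABLE Orders (id INTEGER PRIMARY KEY, Status TEXT);
INSERT INTO Orders VALUES (1, 'Open'), (2, 'Closed'), (3, 'Open');
CREATE TABLE StatusLookup (
    StatusID INTEGER PRIMARY KEY, StatusName TEXT UNIQUE
);
""")

# Step 1: insert the distinct values into the new lookup table.
conn.execute("""
INSERT INTO StatusLookup (StatusName)
SELECT DISTINCT Status FROM Orders ORDER BY Status
""")

# Step 2: overwrite each text value with the matching ID (still stored as text).
conn.execute("""
UPDATE Orders
SET Status = (SELECT l.StatusID FROM StatusLookup l
              WHERE l.StatusName = Orders.Status)
""")
conn.commit()

# Step 3 and the foreign key, in SQL Server (not run here):
#   ALTER TABLE Orders ALTER COLUMN Status INT;
#   ALTER TABLE Orders ADD CONSTRAINT FK_Orders_Status
#       FOREIGN KEY (Status) REFERENCES StatusLookup (StatusID);

# Sanity check: joining through the IDs gives back the original names.
resolved = conn.execute("""
SELECT o.id, l.StatusName
FROM Orders o
JOIN StatusLookup l ON l.StatusID = CAST(o.Status AS INTEGER)
ORDER BY o.id
""").fetchall()
print(resolved)
```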