Empty lookup table vs non-matching lookup table (lookup transform) - ssis

Is an empty lookup table the same as a non-matching lookup table in the Lookup transform?
What would be the result if no row redirection is configured?
1. an empty result set, or
2. package failure at the lookup transform

You would get #2: package failure. The Lookup would not be able to find any row in the lookup table (since it's empty).
Edit: I should add that if you set the Error Configuration to Ignore Failure, you will get an empty rowset instead.
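As a rough analogy (not the actual SSIS internals), the Lookup's match output behaves like an inner join against the reference table, so an empty reference table can never produce a match; the table and column names below are hypothetical:

-- Rough analogy only: the Lookup's match output resembles an inner join.
select s.*, r.SurrogateKey
from StagingRows as s
join ReferenceDim as r
  on r.BusinessKey = s.BusinessKey;
-- If ReferenceDim is empty, this returns zero rows. In SSIS, each unmatched
-- input row instead triggers the configured error behavior (fail or ignore).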

Related

Duplicate row detected during DML action - Snowflake - Talend

I want to load data into Snowflake with Talend. I used tSnowflakeOutput with the Upsert option because I want to insert data if it does not exist in Snowflake, or update rows if it does. I used the primary key to identify the rows that already exist.
When I run my job, I get the following error:
Duplicate row detected during DML action
I am aware that the problem is due to a row that already exists in Snowflake; I want to update that row, but all I get is this error.
Do you have an idea why?
Please help :)
The Talend connector might be internally using Snowflake's MERGE operation. As mentioned by @mike-walton, the error is reported because MERGE does not accept duplicates in the source data. Since it is an insert-or-update-if-exists operation, if multiple source rows join to a target record, the system cannot decide which source row to use for the operation.
From the docs
When a merge joins a row in the target table against multiple rows in the source, the following join conditions produce nondeterministic results (i.e. the system is unable to determine the source value to use to update or delete the target row)
A target row is selected to be updated with multiple values (e.g. WHEN MATCHED ... THEN UPDATE)
Solution 1
One option, as mentioned in the documentation, is to set the ERROR_ON_NONDETERMINISTIC_MERGE parameter to FALSE. The merge will then just pick an arbitrary source row to update from.
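For example, at session level (note that this relaxes the safety check rather than fixing the duplicates):

-- Let MERGE pick an arbitrary matching source row instead of raising
-- "Duplicate row detected during DML action".
alter session set ERROR_ON_NONDETERMINISTIC_MERGE = false;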
Solution 2
Another option is to make the merge deterministic by using a MERGE query of the following form. This essentially de-duplicates the source table and lets you pick one of the duplicates as the preferred row for the update.
merge into target_table t
using (
    select *
    from source_table
    qualify
        row_number() over (
            partition by the_join_key
            order by some_ordering_column asc
        ) = 1
) s
on s.the_join_key = t.the_join_key
when matched then update set
    ...
when not matched then insert
    ...
;
Doing the same thing in Talend may just require a de-duplication step upstream in the ETL mapping.

how to upsert data from mysql to ssms using ssis

How can we insert new data or update existing data from one table to another, going from MySQL to SQL Server, using SSIS and without using a Lookup transform?
A common way to do this is to load the new data into an empty temporary (staging) table, and then run the SQL MERGE command in a separate Execute SQL Task.
The MERGE command is very powerful and can do updates, inserts and even deletes in a single statement. See the full description of MERGE here:
https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
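A minimal sketch of that pattern, assuming a staging table stg.Customer loaded by the data flow and a destination table dbo.Customer keyed on CustomerId (all names here are hypothetical):

-- Upsert from the staging table into the destination in one statement.
merge dbo.Customer as tgt
using stg.Customer as src
    on tgt.CustomerId = src.CustomerId
when matched then
    update set tgt.Name = src.Name, tgt.Email = src.Email
when not matched by target then
    insert (CustomerId, Name, Email)
    values (src.CustomerId, src.Name, src.Email);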
The design for this will look like the following:
You will have 4 tables and 1 view: Source, TMP_Dest (exactly like Source, with no PK), CHG_Dest (for changes, exactly like the destination, with no PK), Dest (which has the PK), and FV_TMP_Dest (a view, in case the destination looks different from the source, e.g. different field types).
SSIS package:
1. Use an Execute SQL Task to truncate TMP_Dest, because it is just temporary storage for the extracted data.
2. Use an Execute SQL Task to truncate CHG_Dest, for the same reason.
3. Use one Data Flow Task for loading data from Source into TMP_Dest.
4. Define two variables, OperationIDInsert=1 and OperationIDUpdate=2 (the values are not important, you can set them as you want); you will use them in step 5 below.
5. Use another Data Flow Task in which you will have:
- on the left side, an OLE DB Source that extracts data from the view, ordered by PK (do not forget to set SortKeyPosition in the Advanced Editor for the PK fields);
- on the right side, an OLE DB Source that extracts data from Dest, ordered by PK (again, set SortKeyPosition in the Advanced Editor for the PK fields);
- a Merge Join (left outer join) between these two sources;
- on the left ("insert") side: a Derived Column that assigns the OperationIDInsert variable as its expression, and an OLE DB Destination that inserts the data into the CHG_Dest table. This way, you insert the rows that have to be inserted into the destination table, and you know which they are because of the OperationIDInsert column;
- on the right side, do the same thing but using the OperationIDUpdate column.
6. Use an Execute SQL Task in the Control Flow with an SQL MERGE (or separate UPDATE/INSERT statements). Based on the PK fields and the OperationIDInsert/OperationIDUpdate values, you will either insert the data or update it; see the sketch below.
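A hedged sketch of step 6, assuming CHG_Dest carries the OperationID flag plus the destination's fields, and Id is the PK (all column names are hypothetical):

-- Apply updates (OperationID = 2), then inserts (OperationID = 1).
update d
set    d.Name = c.Name, d.Amount = c.Amount
from   Dest d
join   CHG_Dest c on c.Id = d.Id
where  c.OperationID = 2;

insert into Dest (Id, Name, Amount)
select Id, Name, Amount
from   CHG_Dest
where  OperationID = 1;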
Hope this will help you. Let me know if you need additional info.

SSIS Lookup on data being imported

This is a classic "does the record exist, then update, otherwise insert" scenario. The data flow process I am following is as follows:
OLE DB Source: returns a list of fields, one of which is a calculated hash value over a number of the fields in the dataset.
Lookup: checks whether the row has changed by comparing the hash value and another key field (which allows duplicates) against the latest record in the target table for that key field.
If it matches, update the record; if there is no match, insert it.
The problem I am having is that one of the records I would match against is in the dataset I am importing and not in the target table, so I always get a NO MATCH result. If I delete the row causing the error from the target table and re-run the import, I do get a MATCH.
I have tried turning off the cache for the lookup, with no success. Suffice to say I have searched for the answer with no luck. HELP

mysql: show truncated or modified data announced by warning

When INSERTing into a MySQL table, I sometimes get a "Data truncated for column 'x' at row n" message, and I'm unable to identify the data that was truncated or the resulting truncated value. I tried to use the column/row number, but this didn't lead me to any truncated value.
Is my only option to verify the inserted data by selecting it from the target table and comparing it to the source data?
Your inserted data exceeded the maximum size of the given column type. If it comes from another table, alter your destination table schema to be identical to the source table schema (at least for the columns which exist in both tables) and the problem will be solved.
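For example, if the destination column is narrower than its counterpart in the source table, widening it is usually enough (a sketch; the table name, column name and size are hypothetical):

-- Widen destination column `x` to match the source definition.
alter table destination_table modify column x varchar(255);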

BIDS - SSIS - Redirect row on Error sends too many rows

I have a simple data flow.
The source is a small flat file with approximately 16k rows in it.
The destination is an OLE DB destination, a SQL 2008 table with a 3 part Unique key on it.
The data flow goes via some simple transformations; Row Count, derived columns, data conversion etc.
All simple and all that works fine.
My problem is that within this data there are 2 rows which are duplicates in terms of the primary key, plus 2 duplicate rows that violate that key, so 4 rows in total. On the OLE DB destination I have set the error output to Redirect Row, and the rows are sent to an error table which has enough columns for me to identify the bad rows.
The problem is that even though there are only 4 culprits, the transformation keeps writing 1268 rows to the error table.
Any ideas?
Thanks.
Edit: Just to add, if I remove the 2 duplicate rows, the whole file imports successfully: 16,875 rows.
There is no question that only 2 rows violate the key, but the error redirection affects 1268.
I have found the solution.
The problem goes away if you load the data using the data access mode 'Table or view' in the OLE DB Destination rather than 'Table or view - fast load'.
The only relevant comment I can find is on MSDN:
Any constraint failure at the destination causes the entire batch of rows defined by FastLoadMaxInsertCommitSize to fail.
So it seems that the batch size was 1268 in my case, and the 2 duplicate rows that violated the key caused the whole batch to be redirected to the error table.
Are you sure that the other rows are errors due to the PK violation? There are a couple of additional columns (ErrorCode, ErrorColumn) available through the error path; these may show that you have different issues.
In SQL 2008 you can redirect failing rows to e.g. a flat file destination. Go to the destination OLE DB task and open the error output (select all fields in the window). In the combo box below, choose Redirect Row, click Apply, then OK.
Next, drag the red arrow (the error output path) from the OLE DB destination to a new Flat File destination and configure that destination (don't change the default mapping of the columns).
Now you should be able to find the error rows more easily.
Eric