I am trying to use the Lookup transformation but cannot seem to get the functionality I need out of it. I have two tables with exactly the same structure:
Temp Table (input): Smaller table but may have entries that do not exist in other table
Reference Lookup Table: Larger table that may not have identical entries to Temp Table.
I am trying to compare the entries of the Temp Table to the entries of the Reference Lookup Table. Anything that exists in the Temp Table, but not the Lookup should be output to a separate table (No match output).
It is a very simple Data Flow, but it does not seem to accomplish the lookup properly. It will find "No Match" rows, but the "no match" table is populated with null values for every column. I am trying to figure out why the data is losing its values?
How the Lookup is set up:
The data in temp table is what drives your data flow. 151 rows flowed out of it.
Your lookup is going to match based on whatever criteria you specify, and you've specified that if there is no match, the no-match rows should be pushed into a table.
Since the lookup task cannot add columns to the no-match output path, this would imply your source (temp table) started NULL across the board.
Drop a data viewer/data tap onto the data flow between the lookup and the destination and then compare that data to your source. I suspect you're going to discover that the process that populated Temp table is at fault.
In the Lookup Transformation, in the columns tab you have identified that you want to use the value from the reference table to replace the value from the source.
Which works great until you get a no-match, in which case the component is going to do the non-intuitive (even to me with 15+ years of working with it) thing of updating that column whether it matches or not.
Source query
SELECT 21 AS tipID, NULL AS tipYear
UNION ALL SELECT 22, 2020
UNION ALL SELECT 64263810, 2020
This adds three rows to my data flow: the first with no tipYear and the next two with a year of 2020 (stamp 1 in the image below).
Lookup query
SELECT
*
FROM
(
values (20, 1111), (21, 2021), (22, 2022)
)D(tipID, tipYear)
This reference data will supply a year for all the matches (21 and 22). In the matched path, we'll see 21 supplied with a value and 22 will have its year updated. Stamp 2 in the image
For id 64263810 however, no match will be found and we'll see the initial value of 2020 replaced with the matching row aka NULL. Stamp 3
Lesson learned: if you need to use the data from the reference table but have a no-match output path, do not replace the column in the Lookup transformation (unless your intention is to wipe out data).
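The effect can be approximated outside SSIS. The following is a minimal sketch, not the Lookup component itself: it uses Python's built-in sqlite3 module and a LEFT JOIN as a stand-in for the lookup, with the same sample rows as above. The table names temp_src and ref and the alias refTipYear are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_src (tipID INTEGER, tipYear INTEGER);  -- hypothetical name
    CREATE TABLE ref (tipID INTEGER, tipYear INTEGER);       -- hypothetical name
    INSERT INTO temp_src VALUES (21, NULL), (22, 2020), (64263810, 2020);
    INSERT INTO ref VALUES (20, 1111), (21, 2021), (22, 2022);
""")

# Taking tipYear from the reference side imitates "replace column":
# rows without a match end up with NULL in place of their own value.
replaced = conn.execute("""
    SELECT s.tipID, r.tipYear
    FROM temp_src AS s
    LEFT JOIN ref AS r ON r.tipID = s.tipID
    ORDER BY s.tipID
""").fetchall()

# Keeping the source column and exposing the reference value under a new
# name imitates "add as new column": the source value survives a no-match.
kept = conn.execute("""
    SELECT s.tipID, s.tipYear, r.tipYear AS refTipYear
    FROM temp_src AS s
    LEFT JOIN ref AS r ON r.tipID = s.tipID
    ORDER BY s.tipID
""").fetchall()

print(replaced)  # [(21, 2021), (22, 2022), (64263810, None)]
print(kept)      # [(21, None, 2021), (22, 2020, 2022), (64263810, 2020, None)]
```

In the second result, id 64263810 keeps its original 2020, which is what the no-match destination would need.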
Related
I have created a new column in the "destination" table with the same name, datatype and other values as the "source" column in a different table. I have tried many suggested solutions found on Stack Overflow. This one (found on Quora) appeared to work, but when I went to the destination table the previously empty column was still empty, with nothing but NULL values. This is the Quora suggestion:
you can fill data in column from another existing one by using INSERT INTO statement and SELECT statement together like that
INSERT INTO `table1`(column_name)
SELECT column_name FROM `table2`
here you filled a single column in table 1 with data located in a single column in table 2
so if you want to fill the whole table 1 (all columns) with data located in table 2 you can make table 1 like a copy of table 2 by using the same code but without column name
INSERT INTO `table1`
SELECT * FROM `table2`
but note to do this (copy table content to another one) ensure that both of tables have the same column count and data types.
I'm not sure what is meant by column count (do the two tables have to have the same number of columns?)
When I run it I get error # 1138.
Any help greatly appreciated. -JG
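One thing worth checking: INSERT ... SELECT appends new rows, it does not fill a column in rows that already exist, which would be consistent with the existing rows still showing NULL. Filling existing rows is normally done with an UPDATE that matches the two tables on a key. Below is a minimal sketch using Python's built-in sqlite3 module. The id and other columns are assumptions (the question does not say what the tables share), and MySQL would more typically express this as UPDATE ... JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "id" and "other" are hypothetical; only column_name comes from the question
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, column_name TEXT);
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, other TEXT, column_name TEXT);
    INSERT INTO table2 VALUES (1, 'a'), (2, 'b');
    INSERT INTO table1 (id, other) VALUES (1, 'x'), (2, 'y');
""")

# Fill the new column in the rows that already exist, matching on id.
conn.execute("""
    UPDATE table1
    SET column_name = (SELECT t2.column_name
                       FROM table2 AS t2
                       WHERE t2.id = table1.id)
    WHERE EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.id = table1.id)
""")

rows = conn.execute(
    "SELECT id, other, column_name FROM table1 ORDER BY id"
).fetchall()
print(rows)  # [(1, 'x', 'a'), (2, 'y', 'b')]
```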
The following is an example to better explain my scenario. My database table has following columns
Column -1: Operating_ID (which is the primary key)
Column -2: Name
Column -3: Phone
Column -4: Address
Column -5: Start Date
Column -6: End Date
The values for the columns 1,2,3,4 come from an extract and this extract is pushed to the database daily using SSIS data flow task
The values for columns 5 and 6 are entered by users in a web application and saved to the database.
Now, in the SSIS process, instead of throwing a primary key violation error, I need to update columns 2, 3 and 4 if the primary key (column 1) already exists.
First I considered a replace, but that deletes the user-entered data in columns 5 and 6.
I would like to keep the data in columns 5 and 6 and update columns 2, 3 and 4 when column 1 already exists.
Do a Lookup on Operating_ID. Change the lookup's no-match handling from "Fail component" to "Redirect rows to no match output".
If match not found, go to INSERT
If match found, go to UPDATE. You can use an OLE DB Command transformation to update, but it runs once per row, so for a large set of data you are better off putting the rows into a table and doing an UPDATE with a JOIN.
This is what I would do. I would put all the data in a staging table. Then I would use a data flow to insert the new records; the source of that data flow would be the staging table with a NOT EXISTS clause referencing the prod table.
Then I would use an Execute SQL task in the control flow to update the data for existing rows.
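The staging approach can be sketched as two statements. This is an illustration only, run against SQLite through Python's sqlite3 module rather than SQL Server. The table names prod and staging, the underscored date column names and the sample rows are assumptions, and in T-SQL the update would usually be written as UPDATE ... FROM ... JOIN instead of correlated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod (
        Operating_ID INTEGER PRIMARY KEY,
        Name TEXT, Phone TEXT, Address TEXT,
        Start_Date TEXT, End_Date TEXT        -- user-entered columns
    );
    CREATE TABLE staging (
        Operating_ID INTEGER PRIMARY KEY,
        Name TEXT, Phone TEXT, Address TEXT   -- columns from the daily extract
    );
    INSERT INTO prod VALUES (1, 'Old', '111', 'Old St', '2020-01-01', '2020-12-31');
    INSERT INTO staging VALUES (1, 'New', '222', 'New St'), (2, 'Other', '333', 'Elm St');
""")

# Step 1: insert only the rows whose key is not in prod yet.
conn.execute("""
    INSERT INTO prod (Operating_ID, Name, Phone, Address)
    SELECT s.Operating_ID, s.Name, s.Phone, s.Address
    FROM staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM prod AS p WHERE p.Operating_ID = s.Operating_ID)
""")

# Step 2: refresh the extract columns for keys that exist in staging,
# leaving Start_Date and End_Date untouched.
conn.execute("""
    UPDATE prod
    SET Name    = (SELECT s.Name    FROM staging AS s WHERE s.Operating_ID = prod.Operating_ID),
        Phone   = (SELECT s.Phone   FROM staging AS s WHERE s.Operating_ID = prod.Operating_ID),
        Address = (SELECT s.Address FROM staging AS s WHERE s.Operating_ID = prod.Operating_ID)
    WHERE EXISTS (SELECT 1 FROM staging AS s WHERE s.Operating_ID = prod.Operating_ID)
""")

result = conn.execute("SELECT * FROM prod ORDER BY Operating_ID").fetchall()
print(result)
# [(1, 'New', '222', 'New St', '2020-01-01', '2020-12-31'),
#  (2, 'Other', '333', 'Elm St', None, None)]
```

Row 1 keeps its user-entered dates while its extract columns are refreshed, and row 2 arrives as a new record.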
I have a table full of traffic accident data with columns such as 'Vehicle_Manoeuvre' that contain integer codes. For example, 13 means that the vehicle manoeuvre which caused the accident was 'overtaking moving vehicle'.
I know the mappings from integers to text, as I have a (quite large) Excel file with this data.
An example of what I want to know is the percentage of accidents that involved this type of manoeuvre, but I don't want to have to open the Excel file and look up the integer-to-text mappings every time I write a query.
I could manually change the integers in all the columns (write a query with all the possible mappings for each column, add them as new columns, then delete the original columns), but this would take a long time.
Is it possible to create some type of variable (like an array with first column as integers and second column with the mapped text) that SQL could use to understand how text relates to the integers allowing me to write a query below:
SELECT COUNT(Vehicle_Manoeuvre) FROM traffictable WHERE Vehicle_Manoeuvre='overtaking moving vehicle';
rather than:
SELECT COUNT(Vehicle_Manoeuvre) FROM traffictable WHERE Vehicle_Manoeuvre=13;
even though the data in the table is still in integer form?
You would do this with a Manoeuvres reference table:
create table Manoeuvres (
ManoeuvreId int primary key,
Name varchar(255) unique
);
insert into Manoeuvres(ManoeuvreId, Name)
values (13, 'Overtaking');
You might even have such a table already, if you know that 13 has a special meaning.
Then use a join:
SELECT COUNT(*)
FROM traffictable tt JOIN
Manoeuvres m
ON tt.Vehicle_Manoeuvre = m.ManoeuvreId
WHERE m.name = 'Overtaking';
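Since the question asks for a percentage rather than a count, the same join can feed a conditional sum. Here is a minimal sketch using Python's built-in sqlite3 module. The Accident_ID column, code 99 and the four sample rows are invented for the illustration. Only code 13 and the table and column names come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manoeuvres (
        ManoeuvreId INTEGER PRIMARY KEY,
        Name TEXT UNIQUE
    );
    CREATE TABLE traffictable (
        Accident_ID INTEGER PRIMARY KEY,   -- hypothetical key column
        Vehicle_Manoeuvre INTEGER
    );
    -- 99 is a made-up code so the sample has more than one manoeuvre
    INSERT INTO Manoeuvres VALUES (13, 'Overtaking'), (99, 'Placeholder manoeuvre');
    INSERT INTO traffictable (Vehicle_Manoeuvre) VALUES (13), (99), (13), (99);
""")

# Count the matching rows with a CASE expression and divide by all rows.
# The 100.0 literal forces floating-point division.
pct = conn.execute("""
    SELECT 100.0 * SUM(CASE WHEN m.Name = 'Overtaking' THEN 1 ELSE 0 END) / COUNT(*)
    FROM traffictable AS tt
    JOIN Manoeuvres AS m ON tt.Vehicle_Manoeuvre = m.ManoeuvreId
""").fetchone()[0]

print(pct)  # 50.0
```

Note that an inner join drops accidents whose code has no row in Manoeuvres, so the denominator only covers mapped codes.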
I have created an SSIS package that reads data from a CSV file and loads it into Table 1. Another data flow task does a lookup on Table 1. Table 1 has columns x, y, z, a, b. Table 2 has columns a, b, y, z. The lookup is done on columns y and z: based on y and z, it picks up a and b from Table 1 and updates Table 2. The problem is that the data gets updated, but I end up with multiple rows: one without the update and one with it.
I can provide a clearer explanation if needed.
Fleshing out Nick's suggestion, I would get rid of your second data flow (the one from Table 2 to Table 2).
After the first Dataflow that populates table 1, then just do an EXECUTE SQL task that performs an UPDATE on Table 2, and joins to Table 1 to get the new data.
EDIT in response to comment:
You need to use a WHERE clause that will match rows uniquely. Apparently Model_Cd is not a UNIQUE column in JLRMODEL_DIMS. If you cannot make the WHERE clause unique because of the relationship between the two tables, then you need to select either an aggregate [Length (cm)] like MIN(), MAX() etc, or you need to use TOP 1, so that you only get one row from the subquery.
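The aggregate option can be sketched like this, using Python's built-in sqlite3 module. Only Model_Cd, JLRMODEL_DIMS and the length column are taken from the discussion. The target table name target_models, the simplified column name Length_cm and the sample rows are assumptions. SQLite is also more forgiving than SQL Server here: it does not raise an error when a scalar subquery returns several rows, so the MIN() is what makes the choice explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE JLRMODEL_DIMS (Model_Cd TEXT, Length_cm REAL);
    CREATE TABLE target_models (Model_Cd TEXT PRIMARY KEY, Length_cm REAL);  -- hypothetical
    -- Model_Cd 'A1' appears twice, i.e. it is not unique in the source
    INSERT INTO JLRMODEL_DIMS VALUES ('A1', 450.0), ('A1', 455.0), ('B2', 480.0);
    INSERT INTO target_models (Model_Cd) VALUES ('A1'), ('B2'), ('C3');
""")

# The aggregate collapses duplicate matches to a single value per Model_Cd.
conn.execute("""
    UPDATE target_models
    SET Length_cm = (SELECT MIN(d.Length_cm)
                     FROM JLRMODEL_DIMS AS d
                     WHERE d.Model_Cd = target_models.Model_Cd)
""")

updated = conn.execute(
    "SELECT Model_Cd, Length_cm FROM target_models ORDER BY Model_Cd"
).fetchall()
print(updated)  # [('A1', 450.0), ('B2', 480.0), ('C3', None)]
```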
I have 2 existing tables in a MySQL DB. The tables have identical structures. I want to copy data from one table to another.
insert into `Table1`
select * from Table2
where department = "engineering"
The above code seemed to work and it copied the data correctly except for 1 column. The "department" column did not copy over, so it was blank. All the other fields seemed to copy over correctly for all of the records.
What can be causing this? As I mentioned both tables have identical structures, same number of columns and everything...
Any ideas?
Note: I just realized that there are actually 2 columns that are not copying over. The "department" and "Category" fields come over blank. So basically, when I insert the data from table 2 into table 1, 12 out of 14 columns are copied over successfully, but 2 columns remain blank.
Below is the DESCRIBE of Table1 and Table2
The only difference I can see when I do a DESCRIBE on both tables is that the 2 fields in question have a data type of enum (.....), but the values between the parentheses differ. Could this be causing the issue, and if so, is there a simple way around it? I'm thinking I might have to run an update query after the initial insert that brings the "department" and "category" fields from table 2 into table 1 by joining on the ID field.
From the docs:
If you insert an invalid value into an ENUM (that is, a string not
present in the list of permitted values), the empty string is
inserted instead as a special error value.
Read about ENUM.
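To see which values would be blanked, the two ENUM definitions from DESCRIBE can be compared directly. The helper below is a rough sketch in plain Python. The function names and the example type strings are invented, since the real definitions are not shown in the question, and the parsing assumes the values contain no embedded or escaped quotes.

```python
import re

def enum_values(column_type):
    """Extract the permitted values from a type string such as "enum('a','b')"."""
    match = re.fullmatch(r"enum\((.*)\)", column_type.strip(),
                         flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError(f"not an ENUM type: {column_type!r}")
    # Simplification: values are assumed to contain no quote characters.
    return re.findall(r"'([^']*)'", match.group(1))

def missing_in_target(source_type, target_type):
    """Return source values that the target ENUM does not permit."""
    # Exact, case-sensitive comparison is a simplification; MySQL's own
    # matching depends on the column's collation.
    allowed = set(enum_values(target_type))
    return [v for v in enum_values(source_type) if v not in allowed]

# Hypothetical definitions, as they might appear in the Type column of DESCRIBE
table2_department = "enum('engineering','sales','support')"
table1_department = "enum('Engineering Dept','sales','support')"

missing = missing_in_target(table2_department, table1_department)
print(missing)  # ['engineering']
```

Any value the helper reports would need to be added to the target column's ENUM list (or mapped to a permitted value) before the copy.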