Add columns to result - integration

I use Pentaho Data Integration.
I have three columns, A, B and C, and I want to use C in a "Table input" step for a select against another database.
So I added a "Select values" step before the "Table input" step; my SQL works fine and returns only one column, D.
But now I have two streams: one with A, B, C and another with D.
The second stream has no primary key, so how can I merge all the columns?
My final result should be A, B, C, D.
I tried "Merge join", but it does not work because I have no primary key.
PS: both streams return the same number of rows.

Try the "Join Rows (cartesian product)" step and do not select any fields; this will add column D to the main flow.
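To see what a join without a key does, here is a rough sketch in Python rather than in PDI itself. The sample rows and both helper functions are made up for illustration. A condition-less join pairs every row of one stream with every row of the other, so it adds D cleanly when the D stream holds a single row; with N rows on both sides it would produce N×N rows. Pairing by position is shown for comparison and assumes both streams arrive in the same order.

```python
from itertools import product

# Hypothetical sample data: the main stream carries A, B, C; the other carries D.
main_stream = [("a1", "b1", "c1"), ("a2", "b2", "c2")]
d_single = [("d1",)]            # D stream with exactly one row
d_many = [("d1",), ("d2",)]     # D stream with one row per input row


def join_rows(left, right):
    """Condition-less join: every left row is paired with every right row."""
    return [l + r for l, r in product(left, right)]


def join_by_position(left, right):
    """Pair the first row with the first row, the second with the second, etc."""
    return [l + r for l, r in zip(left, right)]


print(join_rows(main_stream, d_single))        # one output row per main row
print(len(join_rows(main_stream, d_many)))     # row count multiplies
print(join_by_position(main_stream, d_many))   # relies on matching row order
```

If the row count multiplies in your transformation, that suggests the D stream has more than one row and the rows need a key or a per-row lookup instead.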

Related

SSIS- Update few columns of a row for which the primary key already exists

The following is an example to better explain my scenario. My database table has following columns
Column -1: Operating_ID (which is the primary key)
Column -2: Name
Column -3: Phone
Column -4: Address
Column -5: Start Date
Column -6: End Date
The values for columns 1, 2, 3 and 4 come from an extract, which is pushed to the database daily by an SSIS data flow task.
The values for columns 5 and 6 are entered by users in a web application and saved to the database.
Now, instead of the SSIS process throwing a primary key violation error, I need it to update columns 2, 3 and 4 if the primary key (column 1) already exists.
At first I considered a replace, but that deletes the user-entered data in columns 5 and 6.
I would like to keep the data in columns 5 and 6 and update columns 2, 3 and 4 when column 1 already exists.
Do a LOOKUP on Operating_ID. Change that lookup from "FAIL ON NOT FOUND" to "REDIRECT ROWS TO NO MATCH".
If no match is found, go to INSERT.
If a match is found, go to UPDATE. You can run OLE DB Commands to update, but for a large set of data you are better off loading the rows into a table and doing an UPDATE with a JOIN.
This is what I would do: put all the data in a staging table. Then I would use a data flow to insert the new records; the source of that data flow would be the staging table with a NOT EXISTS clause referencing the prod table.
Then I would use an Execute SQL task in the control flow to update the data for existing rows.
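The staging-table approach can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not SSIS or SQL Server. The table names prod and staging, the underscore column names and the sample rows are assumptions, and SQL Server would normally express the update as UPDATE ... FROM ... JOIN rather than with correlated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod (
    Operating_ID INTEGER PRIMARY KEY,
    Name TEXT, Phone TEXT, Address TEXT,
    Start_Date TEXT, End_Date TEXT          -- user-entered columns
);
CREATE TABLE staging (
    Operating_ID INTEGER PRIMARY KEY,
    Name TEXT, Phone TEXT, Address TEXT     -- columns from the daily extract
);
INSERT INTO prod VALUES (1, 'Old Name', '111', 'Old St', '2020-01-01', '2020-12-31');
INSERT INTO staging VALUES (1, 'New Name', '222', 'New St');
INSERT INTO staging VALUES (2, 'Second', '333', 'Other St');
""")

# Step 1: insert only the staged rows whose key is not yet in prod.
conn.execute("""
INSERT INTO prod (Operating_ID, Name, Phone, Address)
SELECT s.Operating_ID, s.Name, s.Phone, s.Address
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM prod p WHERE p.Operating_ID = s.Operating_ID)
""")

# Step 2: refresh the extract columns of rows that have a staged match,
# leaving Start_Date and End_Date untouched.
conn.execute("""
UPDATE prod
SET Name    = (SELECT s.Name    FROM staging s WHERE s.Operating_ID = prod.Operating_ID),
    Phone   = (SELECT s.Phone   FROM staging s WHERE s.Operating_ID = prod.Operating_ID),
    Address = (SELECT s.Address FROM staging s WHERE s.Operating_ID = prod.Operating_ID)
WHERE EXISTS (SELECT 1 FROM staging s WHERE s.Operating_ID = prod.Operating_ID)
""")

rows = conn.execute("SELECT * FROM prod ORDER BY Operating_ID").fetchall()
print(rows)
```

Row 1 keeps its user-entered dates while its name, phone and address are refreshed; row 2 is inserted with empty dates.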

SSIS-Replace the duplicate column with empty string keeping the original column

Can anyone please help me with the requirement below?
I need to check whether a column in a record matches any other column, and if it does, replace the duplicate column with an empty string.
Say I have columns x1, x2 and x3. How can I check whether x1 matches any of the other columns and, if it does, replace the duplicate column with an empty string?
Doing this is more complex than one would expect. Here are two options:
Try the Fuzzy Lookup by duplicating the file and comparing it with itself using a high threshold. I suspect you want to check the same record for a match on other columns, so you will need to create an exact match on the key (go to the Columns tab, right-click on the link, Edit Mappings) and do the fuzzy match on the others. You can only link a field once, so duplicate the columns as needed.
Write a stored procedure with all the combinations and have it generate an output table with the results (you can run a stored procedure using the OLE DB Command). I would probably go with that one if I were sure of the "exactness" of the data. Otherwise, go with the fuzzy match.
Since you only have a few columns, you could just run a set of update statements like the following:
update Contacts
set Phone2 = null
where Phone2 = Phone1
update Contacts
set Phone3 = null
where Phone3 = Phone1
update Contacts
set Phone3 = null
where Phone3 = Phone2
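To check how the three updates above behave together, here is a small sketch that runs them against an in-memory SQLite table through Python's sqlite3 module. The id column and the sample phone numbers are assumptions added for the demonstration. The statements set duplicates to NULL as in the answer; the question asked for an empty string, which would mean writing '' instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contacts (id INTEGER PRIMARY KEY, Phone1 TEXT, Phone2 TEXT, Phone3 TEXT);
INSERT INTO Contacts VALUES (1, '555-0100', '555-0100', '555-0199');  -- Phone2 repeats Phone1
INSERT INTO Contacts VALUES (2, '555-0101', '555-0102', '555-0102');  -- Phone3 repeats Phone2
INSERT INTO Contacts VALUES (3, '555-0103', '555-0103', '555-0103');  -- all three equal

-- The same three statements as in the answer, run in the same order.
UPDATE Contacts SET Phone2 = NULL WHERE Phone2 = Phone1;
UPDATE Contacts SET Phone3 = NULL WHERE Phone3 = Phone1;
UPDATE Contacts SET Phone3 = NULL WHERE Phone3 = Phone2;
""")

rows = conn.execute(
    "SELECT id, Phone1, Phone2, Phone3 FROM Contacts ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Each row keeps the first occurrence of a number and clears the later repeats. A comparison against a column that is already NULL never matches, so the third statement does not touch rows whose Phone2 was cleared by the first.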
Accomplishing this within an SSIS data flow would be a bit tricky, because you would have to compare the current row against all of the other rows in all the buffers.
Instead, I would recommend staging the data in a table, as Gordon Bell has suggested. Then you need to determine which row wins when a duplicate is found. You might have a date column to sort it out, or you could add a row number column in the SSIS data flow and sort by the order in which you received the data.
Here is an example of how you might find the winning row and update the others with a self join: Deleting duplicate record in SQL Server

SSIS Lookup data update

I have created an SSIS package that reads data from a CSV file and loads it into Table 1. Another data flow task does a lookup on Table 1. Table 1 has columns x, y, z, a, b. Table 2 has columns a, b, y, z. The lookup is based on columns y and z: for matching y and z, it picks up a and b from Table 1 and updates Table 2. The problem is that the data gets updated, but I end up with multiple rows: one without the update and one with it.
I can provide a clearer explanation if needed.
Fleshing out Nick's suggestion, I would get rid of your second data flow (the one from Table 2 to Table 2).
After the first data flow, which populates Table 1, just run an Execute SQL task that performs an UPDATE on Table 2 and joins to Table 1 to get the new data.
EDIT in response to comment:
You need to use a WHERE clause that matches rows uniquely. Apparently Model_Cd is not a UNIQUE column in JLRMODEL_DIMS. If you cannot make the WHERE clause unique because of the relationship between the two tables, then you need to either select an aggregate of [Length (cm)] such as MIN() or MAX(), or use TOP 1, so that the subquery returns only one row.
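A minimal sketch of such an in-place update, written against SQLite through Python's sqlite3 module rather than SQL Server. The lower-case table names and the sample rows are assumptions based on the question's description (Table 1 with x, y, z, a, b and Table 2 with a, b, y, z). Because two Table 1 rows share the same y and z here, the subqueries use MAX() so that each returns a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (x TEXT, y TEXT, z TEXT, a INTEGER, b INTEGER);
CREATE TABLE table2 (a INTEGER, b INTEGER, y TEXT, z TEXT);
INSERT INTO table1 VALUES ('r1', 'y1', 'z1', 10, 100);
INSERT INTO table1 VALUES ('r2', 'y1', 'z1', 20, 200);   -- same (y, z) as r1
INSERT INTO table1 VALUES ('r3', 'y2', 'z2', 30, 300);
INSERT INTO table2 VALUES (NULL, NULL, 'y1', 'z1');
INSERT INTO table2 VALUES (NULL, NULL, 'y2', 'z2');
INSERT INTO table2 VALUES (7, 70, 'y9', 'z9');            -- no match in table1
""")

# Update table2 in place; no rows are inserted, so no duplicates appear.
conn.execute("""
UPDATE table2
SET a = (SELECT MAX(t1.a) FROM table1 t1 WHERE t1.y = table2.y AND t1.z = table2.z),
    b = (SELECT MAX(t1.b) FROM table1 t1 WHERE t1.y = table2.y AND t1.z = table2.z)
WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.y = table2.y AND t1.z = table2.z)
""")

rows = conn.execute("SELECT a, b, y, z FROM table2 ORDER BY y").fetchall()
print(rows)
```

Table 2 still has three rows afterwards, and the unmatched row is left alone. Note that taking MAX() of a and of b independently can combine values from different Table 1 rows; if the two values must come from the same row, a single-row pick (TOP 1 in SQL Server) is the safer choice.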

Replace one column of a table with another column of another table in SQL

I have a table with several columns: Table1 (Col A, Col B).
I also have another table with one column: Table2 (Col C).
What I want to do is:
Replace Col B of Table1 with Col C of Table2.
Is this possible in SQL? I am using phpMyAdmin to execute queries.
Why do I need to do this?
- I was playing around with the database structure and changed the column type from text to integer, which messed up the entries in the column.
- The good thing: I have a backup Excel file, so I am planning to replace the affected column with the original values from the backup.
No can do.
You seem to be making an incorrect assumption, namely that the order of rows in a table is significant. Else what's confusing some of the commenters would be clear to you: there's no information in table2 to relate it to table1.
Since you still have the data in Excel, drop table2 and re-create it with rows having the key to table1. Then write a view to join them. Easiest is probably to insert that join result into a third table, and then drop the first two and rename the third.
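The rebuild described above can be sketched like this, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The id key column, the col_a/col_b/col_c names and the sample values are assumptions; the question does not say what Table1's key is, and the approach only works if the Excel backup carries that key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- table1 with the damaged column (text was converted to integer).
CREATE TABLE table1 (id INTEGER PRIMARY KEY, col_a TEXT, col_b INTEGER);
INSERT INTO table1 VALUES (1, 'first', 0);
INSERT INTO table1 VALUES (2, 'second', 0);

-- table2 re-created from the backup, this time carrying table1's key.
CREATE TABLE table2 (id INTEGER PRIMARY KEY, col_c TEXT);
INSERT INTO table2 VALUES (1, 'original one');
INSERT INTO table2 VALUES (2, 'original two');

-- Store the join result in a third table, then swap it in.
CREATE TABLE table3 AS
    SELECT t1.id, t1.col_a, t2.col_c AS col_b
    FROM table1 t1
    JOIN table2 t2 ON t2.id = t1.id;
DROP TABLE table1;
DROP TABLE table2;
ALTER TABLE table3 RENAME TO table1;
""")

rows = conn.execute("SELECT id, col_a, col_b FROM table1 ORDER BY id").fetchall()
print(rows)
```

A table built with CREATE TABLE ... AS SELECT does not inherit the primary key or indexes of the source tables, so those would need to be added back after the rename.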

delphi DBGrid display JOIN results

I am working with BDS 2006 and a MySQL database (using MyDAC components for the connection), and I have a DBGrid component on my form that displays the records from my table.
Now I need to JOIN two tables and display the results in my DBGrid.
The view I should get is the result of this query:
SELECT e_salary.e_basic,e_details.e_name
FROM e_details
INNER JOIN e_salary
ON e_details.e_id=e_salary.e_id;
While searching, I also found another way to write it:
SELECT
e_salary.e_basic,e_details.e_name
FROM
e_details, e_salary
WHERE
e_details.e_id=e_salary.e_id;
e_details and e_salary are my two tables, and e_id is my PRIMARY KEY.
At present I have two DBGrids, one for e_details and the other for e_salary.
Is it possible to have only one DBGrid displaying values from both tables, or do I have to display two separate DBGrids?
If it is possible, how do I go about it?
P.S. There are more columns to be added to the view, and both tables have the same number of rows.
Thanks in advance.
A DBGrid displays the data of a dataset, and that data may be the result of any SQL query. DBGrid, TDataSet and TDataSource do not care what the SQL query was: a single-table SELECT, a multi-table SELECT with joins, a stored procedure call or a SHOW command. So yes, you can use one DBGrid to display the result set of your SELECT joining two tables.
If both tables have the same number of rows and e_id is the primary key of both, why not have a single table containing the columns of both? Also, if you need to edit the dataset, updating columns of both tables may cause problems, which is one more argument for a single table.
You can use WHERE e_details.e_id=e_salary.e_id instead of JOIN e_salary ON e_details.e_id=e_salary.e_id, but the JOIN is preferred because it states your intent to the DBMS more explicitly and is more readable for others.
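As a quick check that the two spellings return the same result set, here is a sketch that runs both against an in-memory SQLite database through Python's sqlite3 module instead of MySQL and MyDAC. The sample employees are invented, and an ORDER BY is added only to make the comparison deterministic. In the Delphi application, either statement would simply be the SQL text of the single dataset feeding the grid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE e_details (e_id INTEGER PRIMARY KEY, e_name TEXT);
CREATE TABLE e_salary  (e_id INTEGER PRIMARY KEY, e_basic INTEGER);
INSERT INTO e_details VALUES (1, 'Asha');
INSERT INTO e_details VALUES (2, 'Ben');
INSERT INTO e_salary  VALUES (1, 1000);
INSERT INTO e_salary  VALUES (2, 2000);
""")

# Explicit INNER JOIN, as in the first query of the question.
explicit_join = conn.execute("""
SELECT e_salary.e_basic, e_details.e_name
FROM e_details
INNER JOIN e_salary ON e_details.e_id = e_salary.e_id
ORDER BY e_details.e_id
""").fetchall()

# Comma-separated tables with the join condition in WHERE.
implicit_join = conn.execute("""
SELECT e_salary.e_basic, e_details.e_name
FROM e_details, e_salary
WHERE e_details.e_id = e_salary.e_id
ORDER BY e_details.e_id
""").fetchall()

print(explicit_join)
print(explicit_join == implicit_join)
```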
The DBGrid is probably not the component you need.
Take a look at TTreeView:
http://delphi.about.com/od/vclusing/l/aa060603a.htm