Duplicate set key value in SSIS

I need to pivot rows to columns in SSIS.
I am using Integration Services in Microsoft Visual Studio 2010.
I have a flat file with the following info:
column 0 column1 column2
-------------------------------------
d-5454-s34 name Frans
d-5454-s34 sd xyh
d-5454-s34 description Group zen
d-5454-s34 member xxxx
d-5454-s34 member yyyy
d-5454-s34 member zzzzz
d-5454-s34 member uuuuu
d-5454-s45 name He-man
d-5454-s45 sd ygh
d-5454-s45 description Group Comics
d-5454-s45 member eeee
d-5454-s45 member ffffff
e-3434-t45 name Calvin
e-3434-t45 sd trdg
and the final output should be
id name sd description member
---------------------------------------------------------------------------
d-5454-s34 Frans xyh Group zen xxxx; yyyy; zzzzz; uuuuu
d-5454-s45 He-man ygh Group Comics eeee; ffffff
e-3434-t45 Calvin trdg NULL NULL
I read the file with the Flat File source component, and its output is the same as the input shown above, before the final output.
I set up the Pivot component in SSIS as follows:
The PIVOT KEY is column 1 (it contains the values name, sd, description and member; the last one is repeated), the SET KEY is column 0, because it holds the id that should not be repeated, and the pivot value is column 2. I then set the pivot output columns to C_NAME, C_sd, C_description and C_member. Because member is repeated in several rows, the component throws this error: Duplicate key value "member". How can I overcome this?
As a test, I deleted all but one member row, and in that case it works. Now I need a way to aggregate the rows in which member is repeated for the same id in column 0. How can I use the Aggregate transformation in SSIS to group only the member rows in column 1 and join their column 2 values, separated by ;, as shown in the last table? Thank you.

You need to change your approach slightly and aggregate your data before doing the pivot operation.
I built a sample package to demonstrate the solution.
In the package, the data is sorted first, because the script compares each record with the previous one. Next comes a Script Component (type: transformation). Select all the required input columns and create the matching output columns. The output columns have the same data types as the inputs; just make sure to increase the size of the last column (Column2 in the code below), which receives the concatenated values. Also make sure the Script Component is asynchronous, because it emits a different number of rows than it receives.
Use the code below in the Script Component. It checks the previous row's values and appends the data as a semicolon-separated list of related records.
bool initialRow = true; // Indicator for the first row
string column0 = "";
string column1 = "";
string column2 = "";
public override void Input0_ProcessInput(Input0Buffer Buffer)
{
// Loop through buffer
while (Buffer.NextRow())
{
// Process an input row
Input0_ProcessInputRow(Buffer);
// Change the indicator after the first row has been processed
initialRow = false;
}
// Check if this is the last row
if (Buffer.EndOfRowset())
{
// Fill the columns of the existing output row with values
// from the variable before closing this Script Component
Output0Buffer.Column0 = column0;
Output0Buffer.Column1 = column1;
Output0Buffer.Column2 = column2;
}
}
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
if (initialRow)
{
// This is for the first input row only
// Create a new output row
Output0Buffer.AddRow();
// Now fill the variables with the values from the input row
column0 = Row.column0;
column1 = Row.column1;
column2 = Row.column2;
}
else if ((!initialRow) && ((column0 != Row.column0) || (column1 != Row.column1)))
{
// This isn't the first row, but either column0 or column1 changed
// Fill the columns of the existing output row with values
// from the variable before creating a new output row
Output0Buffer.Column0 = column0;
Output0Buffer.Column1 = column1;
Output0Buffer.Column2 = column2;
// Create a new output row
Output0Buffer.AddRow();
// Now fill the variables with the values from the input row
column0 = Row.column0;
column1 = Row.column1;
column2 = Row.column2;
}
else if ((!initialRow) && (column0 == Row.column0) && (column1 == Row.column1) && (column1 == "member"))
{
// This isn't the first row, and neither column0 nor column1 (member) changed
// Concatenate the member value to the variable
column2 += "; " + Row.column2;
}
}
Reference: link
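The same aggregate-then-pivot idea can be sketched outside SSIS. The following is only an illustration in Python, not the Script Component itself: the function names collapse_duplicates and pivot are hypothetical, the rows are copied from the question, and unlike the script above it merges any repeated key, not just member.

```python
from itertools import groupby

# Sample rows copied from the question: (column0, column1, column2)
rows = [
    ("d-5454-s34", "name", "Frans"),
    ("d-5454-s34", "sd", "xyh"),
    ("d-5454-s34", "description", "Group zen"),
    ("d-5454-s34", "member", "xxxx"),
    ("d-5454-s34", "member", "yyyy"),
    ("d-5454-s34", "member", "zzzzz"),
    ("d-5454-s34", "member", "uuuuu"),
    ("d-5454-s45", "name", "He-man"),
    ("d-5454-s45", "sd", "ygh"),
    ("d-5454-s45", "description", "Group Comics"),
    ("d-5454-s45", "member", "eeee"),
    ("d-5454-s45", "member", "ffffff"),
    ("e-3434-t45", "name", "Calvin"),
    ("e-3434-t45", "sd", "trdg"),
]


def collapse_duplicates(sorted_rows, separator="; "):
    """Merge consecutive rows that share (column0, column1) into one row."""
    merged = []
    for (set_key, pivot_key), group in groupby(sorted_rows, key=lambda r: (r[0], r[1])):
        merged.append((set_key, pivot_key, separator.join(r[2] for r in group)))
    return merged


def pivot(merged_rows, columns=("name", "sd", "description", "member")):
    """Build one dict per id; attributes that never occur stay None."""
    result = {}
    for set_key, pivot_key, value in merged_rows:
        record = result.setdefault(set_key, dict.fromkeys(columns))
        record[pivot_key] = value
    return result


# sorted() is stable, so member values keep their original relative order
merged = collapse_duplicates(sorted(rows, key=lambda r: (r[0], r[1])))
table = pivot(merged)
for set_key, record in table.items():
    print(set_key, record)
```

After the collapse step every (column0, column1) pair is unique, which is the condition the Pivot transformation needs to avoid the duplicate key error.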

SSIS provides a lot of transformations, but most of the time, inserting the data into a temp table and writing a simple query saves a lot of time, and performance may be better.
For example:
with #tempTable as (
select 'd-5454-s34' column0, 'name' column1, 'Frans' column2
union all select 'd-5454-s34', 'sd ', 'xyh'
union all select 'd-5454-s34', 'description', 'Group zen'
union all select 'd-5454-s34', 'member', 'xxxx'
union all select 'd-5454-s34', 'member', 'yyyy'
union all select 'd-5454-s34', 'member', 'zzzzz'
union all select 'd-5454-s34', 'member', 'uuuuu'
union all select 'd-5454-s45', 'name', 'He-man'
union all select 'd-5454-s45', 'sd', 'ygh '
union all select 'd-5454-s45', 'description', 'Group Comics'
union all select 'd-5454-s45', 'member', 'eeee'
union all select 'd-5454-s45', 'member', 'ffffff'
union all select 'e-3434-t45', 'name', 'Calvin'
union all select 'e-3434-t45', 'sd', 'trdg'
)
SELECT column0
, [name]
, sd
, description
, member
FROM ( SELECT column0,column1, column2 , STUFF(( SELECT '; ' + column2
FROM #tempTable T1
WHERE T1.column0 = t2.column0
AND column1 = 'member'
FOR XML PATH('') ),1, 2, '') member
FROM #tempTable t2 ) t
PIVOT ( MAX(t.column2) FOR t.column1 IN ([name], sd, description)) AS pivotable
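To try the "load into a table, then query" approach without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. SQLite has neither PIVOT nor FOR XML PATH, so this swaps in conditional aggregation and GROUP_CONCAT; the table name staging is an assumption, and the rows are a subset of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (column0 TEXT, column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO staging VALUES (?, ?, ?)",
    [
        ("d-5454-s45", "name", "He-man"),
        ("d-5454-s45", "sd", "ygh"),
        ("d-5454-s45", "description", "Group Comics"),
        ("d-5454-s45", "member", "eeee"),
        ("d-5454-s45", "member", "ffffff"),
        ("e-3434-t45", "name", "Calvin"),
        ("e-3434-t45", "sd", "trdg"),
    ],
)

# One output row per id: single-valued attributes via MAX(CASE ...),
# repeated member values joined with GROUP_CONCAT (NULLs are skipped).
query = """
SELECT column0,
       MAX(CASE WHEN column1 = 'name' THEN column2 END) AS name,
       MAX(CASE WHEN column1 = 'sd' THEN column2 END) AS sd,
       MAX(CASE WHEN column1 = 'description' THEN column2 END) AS description,
       GROUP_CONCAT(CASE WHEN column1 = 'member' THEN column2 END, '; ') AS member
FROM staging
GROUP BY column0
ORDER BY column0
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

SQLite does not guarantee the order of values inside GROUP_CONCAT, so the member list should be treated as unordered in this sketch.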

Related

How do I batch multiple SELECT query calls into a single MySQL query or procedure

I have this function which checks friendship status between 2 users using their userIds and it is working fine.
function checkFriendShipBetweenUsers(user1Id, user2Id) {
var checkFriendShipBetweenUsersQuery = "SELECT status FROM friends WHERE (user1Id=? AND user2Id =?) OR (user1Id=? AND user2Id =?)"
var queryParameterList = [user1Id, user2Id, user2Id, user1Id]
}
I have a case in which I need to check the friendship status between a user and 3 other users.
I can call the function above 3 times, once for each other user, to get the desired result, but I would like to do it with a single DB call, using a single query or a MySQL procedure.
function checkFriendShipBetweenUsers(user1Id, userIdList) {
var checkFriendShipBetweenUsersQuery = ""
var queryParameterList = []
}
So this query/procedure call should return 3 integers indicating user1's friendship status with users in userIdList.
Here is an example db fiddle:
db-fiddle.com/f/p5RP61V3AcawRgJcogeXey/1
given user1Id : 'a8t57h6p8n2efden' and
userIdList : ['typ3vg6xb1vt7nw2', 'cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl', '0bw87kprb97pes1crom8ceodi07r2kd0']
How do I write such a query or procedure?
DEMO
-- source data
CREATE TABLE test (
id INT,
user1Id VARCHAR(100),
user2Id VARCHAR(100),
status INT
);
INSERT INTO test (id,user1Id,user2Id,status) VALUES
(1,'a8t57h6p8n2efden','typ3vg6xb1vt7nw2',0),
(2,'cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl','a8t57h6p8n2efden',1),
(3,'0bw87kprb97pes1crom8ceodi07r2kd0','a8t57h6p8n2efden',2),
(4,'a8t57h6p8n2efden','ap21wzbew0bprt5t',0);
SELECT * FROM test;
id  user1Id                           user2Id           status
--------------------------------------------------------------
1   a8t57h6p8n2efden                  typ3vg6xb1vt7nw2  0
2   cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl  a8t57h6p8n2efden  1
3   0bw87kprb97pes1crom8ceodi07r2kd0  a8t57h6p8n2efden  2
4   a8t57h6p8n2efden                  ap21wzbew0bprt5t  0
-- searching parameters
SET @user1Id := 'a8t57h6p8n2efden';
SET @userIdList := '[
"typ3vg6xb1vt7nw2",
"cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl",
"0bw87kprb97pes1crom8ceodi07r2kd0",
"absent value"
]';
SELECT jsontable.userid, test.status
FROM JSON_TABLE( @userIdList,
'$[*]' COLUMNS ( rowid FOR ORDINALITY,
userid VARCHAR(255) PATH '$'
)) jsontable
LEFT JOIN test
ON (@user1Id, jsontable.userid) IN ( (test.user1Id, test.user2Id),
(test.user2Id, test.user1Id)
)
userid                            status
----------------------------------------
typ3vg6xb1vt7nw2                  0
cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl  1
0bw87kprb97pes1crom8ceodi07r2kd0  2
absent value                      null
fiddle
If you do not need a status value for the IDs which are not found, use INNER JOIN.
If you want to receive the output as one solid value, add an appropriate GROUP BY and aggregation. Use jsontable.rowid to put the values in the needed order.
PS. If you do not use an aggregation, you do not need the rowid value; in that case simply remove rowid FOR ORDINALITY.
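If JSON_TABLE is not available, a similar single-call lookup can be sketched with an IN list built from placeholders. This is only an illustration using Python's sqlite3 module against the demo data above; the function name check_friendship_statuses is hypothetical, and the query is a different technique from the JSON_TABLE join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, user1Id TEXT, user2Id TEXT, status INT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (1, "a8t57h6p8n2efden", "typ3vg6xb1vt7nw2", 0),
        (2, "cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl", "a8t57h6p8n2efden", 1),
        (3, "0bw87kprb97pes1crom8ceodi07r2kd0", "a8t57h6p8n2efden", 2),
        (4, "a8t57h6p8n2efden", "ap21wzbew0bprt5t", 0),
    ],
)


def check_friendship_statuses(conn, user_id, other_ids):
    """Return {other_id: status or None} using a single SELECT."""
    if not other_ids:
        return {}
    placeholders = ", ".join("?" for _ in other_ids)
    query = f"""
        SELECT CASE WHEN user1Id = ? THEN user2Id ELSE user1Id END AS other_id,
               status
        FROM test
        WHERE (user1Id = ? AND user2Id IN ({placeholders}))
           OR (user2Id = ? AND user1Id IN ({placeholders}))
    """
    # Parameter order follows the order of the ? marks in the query text
    params = [user_id, user_id, *other_ids, user_id, *other_ids]
    statuses = dict.fromkeys(other_ids)  # ids without a row stay None
    for other_id, status in conn.execute(query, params):
        statuses[other_id] = status
    return statuses


statuses = check_friendship_statuses(
    conn,
    "a8t57h6p8n2efden",
    [
        "typ3vg6xb1vt7nw2",
        "cy6mqqyykpldc2j1g5vm5cqsi6x1dgrl",
        "0bw87kprb97pes1crom8ceodi07r2kd0",
        "absent value",
    ],
)
print(statuses)
```

Pre-filling the result with None plays the role of the LEFT JOIN: ids that have no friendship row still appear in the output.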

How to select the row value based on preference order in the table using MySql

I have a table named data_attributes with a column data_type:
SELECT * FROM DATA_ATTRIBUTES;
DATA_TYPE
----------
NAME
MOBILE
ETHINICITY
CC_INFO
BANK_INFO
ADDRESS
BANK_INFO and CC_INFO are classified as Risk1,
MOBILE and ETHINICITY are classified as Risk2,
NAME and ADDRESS are classified as Risk3.
I should get the risk classification as output.
For example: if any row contains a Risk1 type, the output should be Risk1;
otherwise, if any row contains a Risk2 type, the output should be Risk2;
otherwise, if any row contains a Risk3 type, the output should be Risk3.
I wrote the query below for this:
SELECT COALESCE(COL1,COL2,COL3) FROM
(SELECT
CASE WHEN DATATYPE IN ('BANK_INFO','CC_INFO') THEN 'RISK1' ELSE NULL END AS COL1,
CASE WHEN DATATYPE IN ('MOBILE','ETHINICITY') THEN 'RISK2' ELSE NULL END AS COL2,
CASE WHEN DATATYPE IN ('NAME','ADDRESS') THEN 'RISK3' ELSE NULL END AS COL3
FROM DEMO.TPA_CLASS1) A;
The required output is: Risk1 (only one value).
Please give me some ideas on how to achieve this.
You can use conditional aggregation:
SELECT
CASE
WHEN MAX(DATATYPE IN ('BANK_INFO','CC_INFO')) = 1 THEN 'RISK1'
WHEN MAX(DATATYPE IN ('MOBILE','ETHINICITY')) = 1 THEN 'RISK2'
WHEN MAX(DATATYPE IN ('NAME','ADDRESS')) = 1 THEN 'RISK3'
END AS RISK
FROM DEMO.TPA_CLASS
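The conditional aggregation above can be tried locally with a small sketch. It uses Python's sqlite3 module, where a boolean expression such as datatype IN (...) also evaluates to 0 or 1; the table name tpa_class and the helper overall_risk are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tpa_class (datatype TEXT)")

# MAX(...) over a 0/1 expression is 1 when at least one row matches;
# CASE returns the first branch that holds, which gives the priority order.
RISK_QUERY = """
SELECT CASE
         WHEN MAX(datatype IN ('BANK_INFO', 'CC_INFO')) = 1 THEN 'RISK1'
         WHEN MAX(datatype IN ('MOBILE', 'ETHINICITY')) = 1 THEN 'RISK2'
         WHEN MAX(datatype IN ('NAME', 'ADDRESS')) = 1 THEN 'RISK3'
       END AS risk
FROM tpa_class
"""


def overall_risk(datatypes):
    """Reload the table with the given values and return the single risk label."""
    conn.execute("DELETE FROM tpa_class")
    conn.executemany("INSERT INTO tpa_class VALUES (?)", [(d,) for d in datatypes])
    return conn.execute(RISK_QUERY).fetchone()[0]


print(overall_risk(["NAME", "MOBILE", "ETHINICITY", "CC_INFO", "BANK_INFO", "ADDRESS"]))
print(overall_risk(["NAME", "MOBILE"]))
```

Because the query aggregates without GROUP BY, it always returns exactly one row, which matches the "only one value" requirement.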

MySql Insert Dynamic Query with Nested Select

I'm trying to write a MySQL INSERT query with a mix of static and dynamic values. The INSERT command is:
INSERT INTO ebdb.requestaction(RequestID,
ActionID,
TransactionID,
IsActive,
IsComplete)
VALUES (
1,
**Dynamic Value from Below Query,
Dynamic Value from Below Query,**
1,
0);
The values for fields 2 and 3 come from the query below:
SELECT transitionaction.TransitionID, transitionaction.ActionID
FROM transitionaction
INNER JOIN transition
ON transitionaction.TransitionID = transition.TransitionID
WHERE transition.TenantID = 1
AND transition.ProcessID = 1
AND transition.CurrentStateID = 1
ORDER BY transitionaction.TransitionID;
I'm doing something wrong in here.
Please guide me as to how this can be achieved in the most optimized way.
You can select static values as part of a query, e.g.:
INSERT INTO ebdb.requestaction(RequestID, ActionID, TransactionID, IsActive, IsComplete)
SELECT 1, transitionaction.ActionID, transitionaction.TransitionID, 1, 0
FROM transitionaction
INNER JOIN transition
ON transitionaction.TransitionID = transition.TransitionID
WHERE transition.TenantID = 1
AND transition.ProcessID = 1
AND transition.CurrentStateID = 1
ORDER BY transitionaction.TransitionID;
For more information, refer to MySQL's Insert...Select Syntax
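As a small end-to-end illustration of INSERT ... SELECT with constant columns, the sketch below uses Python's sqlite3 module. The table layouts are guessed from the column names in the question and the sample rows are invented, so treat all of them as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transition (TransitionID INTEGER, TenantID INTEGER,
                         ProcessID INTEGER, CurrentStateID INTEGER);
CREATE TABLE transitionaction (TransitionID INTEGER, ActionID INTEGER);
CREATE TABLE requestaction (RequestID INTEGER, ActionID INTEGER,
                            TransactionID INTEGER, IsActive INTEGER,
                            IsComplete INTEGER);
-- Invented sample data: transition 10 matches the filter, 11 does not
INSERT INTO transition VALUES (10, 1, 1, 1), (11, 1, 1, 2);
INSERT INTO transitionaction VALUES (10, 100), (10, 101), (11, 200);
""")

request_id = 1  # the static RequestID, passed as a bound parameter
conn.execute(
    """
    INSERT INTO requestaction (RequestID, ActionID, TransactionID, IsActive, IsComplete)
    SELECT ?, ta.ActionID, ta.TransitionID, 1, 0
    FROM transitionaction AS ta
    INNER JOIN transition AS t ON ta.TransitionID = t.TransitionID
    WHERE t.TenantID = 1 AND t.ProcessID = 1 AND t.CurrentStateID = 1
    """,
    (request_id,),
)
inserted = conn.execute("SELECT * FROM requestaction ORDER BY ActionID").fetchall()
print(inserted)
```

One row is inserted per row returned by the SELECT, so a transition with several actions produces several requestaction rows in a single statement.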

zend select from two tables 3 sets of rows

I have this problem using Zend, and I think it is entirely DB-related:
I have two tables, one contains:
id, ..., file, desc, date
and the second table contains:
id, ..., file_1, desc_1, file_2, desc_2, date
What I need as a result is:
id, ..., file, desc, date
From both tables, which means I need the corresponding file and desc from the first table, plus file_1 -> file, desc_1 -> desc and file_2 -> file, desc_2 -> desc from the second, all in this one result.
Any idea how to do this with Zend 1.12?
You need to use JOIN in Zend ORM,
for example:
public function getPendingProjects($owner){
$data = $this ->getAdapter()
->select()
->from('campaign' , array('title', 'id'))
->joinLeft('job', 'campaign.id = job.campaign_id', array('count(user_id)'))
->where('campaign.employer_id = ' . (int)$owner . ' AND job.status = 3' );
return $data->query()->fetchAll();
}
Taken from here: http://zend-frameworks.com/en/articles/zend_db_zend_mysql.html

SQL Query w/ multiple name / value pairs in Where clause

I have a table called Properties (pid, uid, pname, pvalue). The pid column is auto generated. Each uid (user id) could have multiple name value pairs stored in the pname and pvalue.
As an input I've multiple name value pairs for pname and pvalue which forms a complicated boolean expression.
For example: let's start w/ one name value pair. Say I want to retrieve all uid's whose 'favorite_color' is 'red'.
I wrote an SQL query:
SELECT *
FROM properties
WHERE ((pname = 'favorite_color') and (pvalue = 'red'))
The query soon gets complicated if I have to retrieve something like: fetch all uid's whose 'favorite_color' is 'red' or 'blue' and 'favorite_drink' is 'juice' or 'milk' and 'favorite_hobby' is 'music' or 'art', etc.
I wrote an SQL query:
SELECT *
FROM properties
WHERE (((pname = 'favorite_color') and (pvalue = 'red'))
OR ((pname = 'favorite_color') and (pvalue = 'blue')))
AND (((pname = 'favorite_drink') and (pvalue = 'juice'))
OR ((pname = 'favorite_drink') and (pvalue = 'milk')))
AND (((pname = 'favorite_hobby') and (pvalue = 'music'))
OR ((pname = 'favorite_hobby') and (pvalue = 'art')))
I got the expression correct but unfortunately it fails because the evaluation is done on each row. What if I wanted to add more name value pairs to the where clause?
Questions:
Is it possible to write an SQL query for this?
The other idea I had was to fetch all the pname, pvalue pairs for each user, then build a dynamic expression using an expression language and my input name value pairs to evaluate it. I have Apache's JEXL in mind.
To do the ANDs you need to do as many self-joins as you have and-ed conditions:
SELECT *
FROM Properties p1, Properties p2, Properties p3
WHERE p1.uid = p2.uid AND p1.uid = p3.uid
AND (p1.pname = 'favorite_color' AND p1.pvalue IN ('red', 'blue'))
AND (p2.pname = 'favorite_drink' AND p2.pvalue IN ('juice', 'milk'))
AND (p3.pname = 'favorite_hobby' AND p3.pvalue IN ('music', 'art'))
EDIT:
Another possibility is to denormalize the data, and then use FIND_IN_SET() or RLIKE:
SELECT uid, group_concat(concat(pname, '=', pvalue)) props
FROM Properties
GROUP BY uid
HAVING props RLIKE 'favorite_color=(red|blue)'
AND props RLIKE 'favorite_drink=(juice|milk)'
AND props RLIKE 'favorite_hobby=(music|art)'
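Since the question also asks what happens when more name value pairs are added, here is a rough sketch that generates the self-join for any number of conditions. It uses Python's sqlite3 module; the helper matching_uids and the sample rows are hypothetical, and it expects at least one condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Properties (pid INTEGER PRIMARY KEY, uid INTEGER, pname TEXT, pvalue TEXT)"
)
conn.executemany(
    "INSERT INTO Properties (uid, pname, pvalue) VALUES (?, ?, ?)",
    [
        (1, "favorite_color", "red"), (1, "favorite_drink", "milk"), (1, "favorite_hobby", "art"),
        (2, "favorite_color", "blue"), (2, "favorite_drink", "tea"), (2, "favorite_hobby", "music"),
        (3, "favorite_color", "green"), (3, "favorite_drink", "juice"), (3, "favorite_hobby", "art"),
    ],
)


def matching_uids(conditions):
    """conditions: list of (pname, allowed values); every one must hold for a uid."""
    tables, where, params = [], [], []
    for i, (pname, values) in enumerate(conditions):
        alias = f"p{i}"  # one alias of Properties per AND-ed condition
        tables.append(f"Properties {alias}")
        if i > 0:
            where.append(f"p0.uid = {alias}.uid")
        marks = ", ".join("?" for _ in values)
        where.append(f"({alias}.pname = ? AND {alias}.pvalue IN ({marks}))")
        params += [pname, *values]
    sql = (
        f"SELECT DISTINCT p0.uid FROM {', '.join(tables)} "
        f"WHERE {' AND '.join(where)} ORDER BY p0.uid"
    )
    return [row[0] for row in conn.execute(sql, params)]


print(matching_uids([
    ("favorite_color", ["red", "blue"]),
    ("favorite_drink", ["juice", "milk"]),
    ("favorite_hobby", ["music", "art"]),
]))
```

Only the aliases and placeholders are generated as text; all names and values are passed as bound parameters.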
SELECT p1.*, p2.pname, p2.pvalue, p3.pname, p3.pvalue
FROM
(SELECT *
FROM Properties
WHERE pname = 'favorite_color' AND pvalue IN ('red', 'blue')) p1,
(SELECT *
FROM Properties
WHERE pname = 'favorite_drink' AND pvalue IN ('juice', 'milk')) p2,
(SELECT *
FROM Properties
WHERE pname = 'favorite_hobby' AND pvalue IN ('music', 'art')) p3
WHERE p1.uid = p2.uid AND p1.uid = p3.uid