Dynamic Header Mapping in SSIS - ssis

Can anyone help with the implementation below?
The source text/Excel file may arrive in any of the following formats:
Case-1
CName Pan Mobile
------------------------
A PANA1 1234567891
B PANB2 1234567892
Case-2
Pan_No Mobile_No CustomerName Gender
----------------------------------------
PANA1 1234567891 A M
PANB2 1234567892 B F
Case-3
Email Mobile_Number Customer_Name PanNumber
----------------------------------------------------
A@gmail.com 1234567891 A PANA1
B@gmail.com 1234567892 B PANB2
Destination Table
Customer Table
C_Name C_PanNo C_MobileNo
---------------------------
A PANA1 1234567891
B PANB2 1234567892
ExternalHeaderMapping Table
Id DestinationColumnName ExternalHeaderName
-----------------------------------------------
1 C_Name CName
2 C_Name CustomerName
3 C_Name Customer_Name
4 C_PanNo Pan
5 C_PanNo Pan_No
6 C_PanNo PanNumber
7 C_MobileNo Mobile
8 C_MobileNo Mobile_No
9 C_MobileNo Mobile_Number
Given the above, I need to build an SSIS package that works for all three cases, even when
the order of the columns changes or new columns are added.
In Case-2 and Case-3 it should ignore the Gender and Email columns.
I am new to SSIS; please help me achieve this. I know it can only be achieved through a Script Component, but
I don't know how to do it.

If I had to do this, I would use a Script Task with the logic below. (Sorry, I can't write the code, but I can give you the overall idea.) I am not sure about performance, but it works.
Use a "Script Task"
Output: C1, C2, C3 (assume the output columns are fixed)
Variables
@Headers = H1,H2,H3,H4 (read from the first row/header row of the file)
@Columns_Object = multiple rows from the header mapping table
Each row value should look like:
1) C_Name|CName,CustomerName,Customer_Name
2) C_PanNo|Pan,Pan_No,PanNumber
@Array_Header = @Headers split on ","
Now the values are:
[0] = H1,
[1] = H2, etc.
For each value in @Array_Header:
Pick one value, e.g. "H1", and find its row in @Columns_Object
Once you have the row, take the value before "|" (e.g. C_Name) and store it in a local variable @SelectCol
Switch
{
If @SelectCol = "C_Name"
Then store into
Output C1 = datarow.col[0]
(the index 0 comes from the position of H1; if the loop is looking at H2, the index is 1)
Else if @SelectCol = "C_PanNo"
Output C2 = datarow.col[1]
END
}
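The idea above can be sketched in Python to show the lookup logic outside SSIS. This is only an illustration, not a Script Component: HEADER_MAP is a hypothetical in-memory copy of the ExternalHeaderMapping table, map_rows is a made-up helper name, and the file is assumed to be comma-delimited with the header in the first row.

```python
import csv
import io

# Hypothetical in-memory copy of the ExternalHeaderMapping table:
# external header name -> destination column name.
HEADER_MAP = {
    "CName": "C_Name", "CustomerName": "C_Name", "Customer_Name": "C_Name",
    "Pan": "C_PanNo", "Pan_No": "C_PanNo", "PanNumber": "C_PanNo",
    "Mobile": "C_MobileNo", "Mobile_No": "C_MobileNo", "Mobile_Number": "C_MobileNo",
}


def map_rows(text, delimiter=","):
    """Yield one dict per data row, keyed by destination column name."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    headers = [h.strip() for h in next(reader)]
    # Remember the position of every recognised header; headers that are
    # not in the mapping (Gender, Email, ...) are simply skipped.
    positions = {HEADER_MAP[h]: i for i, h in enumerate(headers) if h in HEADER_MAP}
    for row in reader:
        yield {dest: row[i] for dest, i in positions.items()}


# Case-2 from the question, written as comma-delimited text (an assumption).
case2 = (
    "Pan_No,Mobile_No,CustomerName,Gender\n"
    "PANA1,1234567891,A,M\n"
    "PANB2,1234567892,B,F\n"
)
rows = list(map_rows(case2))
```

Because the positions are computed from the header row of each file, the same code handles reordered columns and extra columns without changes.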

Related

combining two select queries from the same table

I need to do something like this:
id tag status user trial Value (other columns)...
1 A Pass peter first 0
2 A Pass peter second 1
3 A Fail peter third 3
4 B Pass peter first 4
5 B Pass peter second 5
6 B Pass peter third 6
Select the rows where tag equals A and status equals Pass, and find the matching value for another tag (e.g. B):
id tag status user trial Value_tag_A Value_tag_B (other columns)...
1 A Pass peter first 0 4
2 A Pass peter second 1 5
I can do some processing in PHP to get this result, but I'm wondering if I can do it directly in SQL.
I've tried numerous variations and can't seem to get close to the result.
Solution: http://sqlfiddle.com/#!9/e9068d/17
I don't know why the rows where tag = A should also have a Value_tag_B. Ignoring that, the following query may be one approach.
SELECT DISTINCT y.status, y.`user`, y.trial,
(SELECT Value FROM toto WHERE y.`user` = `user` and y.trial = trial and tag = 'A' ) AS Value_tag_A,
(SELECT Value FROM toto WHERE y.`user` = `user` and y.trial = trial and tag = 'B' ) AS Value_tag_B
FROM toto y
WHERE y.trial NOT IN (SELECT DISTINCT trial FROM toto WHERE `status` <> 'Pass')
The code has been modified.
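For comparison, here is a minimal sketch of a plain self-join that reproduces the two rows shown in the question. It runs against an in-memory SQLite database from Python purely so it can be tried without a MySQL server; the table name toto comes from the answer above, and the join condition (same user and same trial) is my reading of the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE toto (id INTEGER, tag TEXT, status TEXT, "user" TEXT, trial TEXT, value INTEGER)'
)
conn.executemany(
    "INSERT INTO toto VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A", "Pass", "peter", "first", 0),
        (2, "A", "Pass", "peter", "second", 1),
        (3, "A", "Fail", "peter", "third", 3),
        (4, "B", "Pass", "peter", "first", 4),
        (5, "B", "Pass", "peter", "second", 5),
        (6, "B", "Pass", "peter", "third", 6),
    ],
)

# Each passing tag-A row is paired with the tag-B row of the same user and trial.
result = conn.execute(
    """
    SELECT a.id, a.tag, a.status, a."user", a.trial,
           a.value AS Value_tag_A, b.value AS Value_tag_B
    FROM toto AS a
    JOIN toto AS b
      ON b."user" = a."user" AND b.trial = a.trial AND b.tag = 'B'
    WHERE a.tag = 'A' AND a.status = 'Pass'
    ORDER BY a.id
    """
).fetchall()
```

The third trial is dropped because its tag-A row has status Fail.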

MySQL query to gather incorrectly stored data

I have recently taken over an email campaign project and need to generate a report for the customer. However, the data has been stored very strangely.
Basically, the client wants a report of the first name and last name of each subscriber to an emailing list.
Example table data.
------------------------------------------------------------
id | owner_id | list_id | field_id | email_address | value
------------------------------------------------------------
1 10 1 137 me@example.com John
2 10 1 138 me@example.com Doe
So as you can see, John Doe has subscribed to mailing list 1; field_id 137 is his first name and field_id 138 is his last name.
The client is looking for an export with each user's first name and last name together in one field.
I tried the following SQL query:
SELECT value
FROM Table_A AS child
INNER JOIN Table_A AS parent
ON parent.email_address = child.email_address
WHERE child.owner_id = '10'
Unfortunately, the query returns the values as many separate rows instead of combining the first name and last name into one field.
If anyone can provide some assistance, that would be awesome.
Thanks.
SELECT
CONCAT(parent.value, ' ', child.value) AS name
FROM mytable AS child
LEFT JOIN mytable AS parent
ON parent.email_address = child.email_address
WHERE child.owner_id = '10'
AND parent.field_id = 137 AND child.field_id = 138
See it at http://sqlfiddle.com/#!9/199b4b/45
I think you have to accumulate everything into a variable and then select that variable under the name you want.
For example (SQL Server syntax):
DECLARE @yourvariable VARCHAR(MAX)
SELECT @yourvariable = COALESCE(@yourvariable + ' ', '') + value
FROM table_A
WHERE owner_id = 10
SELECT @yourvariable AS FullName
Try that, it might help.
You can try this code (the name column here corresponds to value in your original DB):
select a.name
from
table_a a inner join table_a b
on a.email_address = b.email_address and a.field_id <> b.field_id
where a.owner_id=10
order by a.field_id
Here is the example link:
http://sqlfiddle.com/#!9/5fbdf6/25/0
This assumes that the first name has field id 137 and the last name has field id 138.
You can try the following query to get the desired result; the ORDER BY inside GROUP_CONCAT makes sure the first name comes before the last name.
SELECT CONCAT(SUBSTRING_INDEX(GROUP_CONCAT(`value` ORDER BY field_id),",",1)," ",SUBSTRING_INDEX(GROUP_CONCAT(`value` ORDER BY field_id),",",-1)) AS client_name
FROM Table_A
WHERE owner_id = 10
AND field_id IN (137, 138)
GROUP BY email_address;
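Another way to turn the two field rows into one name is conditional aggregation, sketched below. It runs on an in-memory SQLite database from Python only so it is easy to try; in MySQL you would use CONCAT(...) instead of the || operator. The second subscriber is made-up sample data added to show the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_A (id INTEGER, owner_id INTEGER, list_id INTEGER,"
    " field_id INTEGER, email_address TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO Table_A VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, 1, 137, "me@example.com", "John"),
        (2, 10, 1, 138, "me@example.com", "Doe"),
        # Hypothetical second subscriber, not part of the question's data.
        (3, 10, 1, 137, "you@example.com", "Jane"),
        (4, 10, 1, 138, "you@example.com", "Roe"),
    ],
)

# One output row per email address: pick the first name (field 137) and
# the last name (field 138) out of the group and glue them together.
names = conn.execute(
    """
    SELECT email_address,
           MAX(CASE WHEN field_id = 137 THEN value END) || ' ' ||
           MAX(CASE WHEN field_id = 138 THEN value END) AS full_name
    FROM Table_A
    WHERE owner_id = 10 AND field_id IN (137, 138)
    GROUP BY email_address
    ORDER BY email_address
    """
).fetchall()
```

This does not depend on the order in which the rows are read, because each CASE expression selects its value by field_id.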

select one row multiple time when using IN()

I have this query :
select
name
from
provinces
WHERE
province_id IN(1,3,2,1)
ORDER BY FIELD(province_id, 1,3,2,1)
The number of values in IN() is dynamic.
How can I get all rows, including duplicates (in this example, 1), in the order given by ORDER BY?
The result should look like this:
name1
name3
name2
name1
Also, I shouldn't have to use UNION ALL:
select * from provinces WHERE province_id=1
UNION ALL
select * from provinces WHERE province_id=3
UNION ALL
select * from provinces WHERE province_id=2
UNION ALL
select * from provinces WHERE province_id=1
You need a helper table here. On SQL Server that can be something like:
SELECT name
FROM (Values (1),(3),(2),(1)) As list (id) -- list of values to join to, as a table
INNER JOIN provinces ON province_id = list.id
Update: In MySQL, the technique from "Split Comma Separated String Into Temp Table" can be used to split a string parameter into a helper table.
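A minimal sketch of the helper-table idea, with the list position stored so the requested order can be restored. It uses SQLite from Python only to keep the example self-contained; the temporary table wanted and the function names_in_order are hypothetical names, and the same join should carry over to a MySQL temporary table.

```python
import sqlite3


def names_in_order(conn, ids):
    """Return province names for ids, keeping duplicates and the list order."""
    conn.execute("CREATE TEMP TABLE wanted (pos INTEGER, id INTEGER)")
    try:
        # One helper row per requested id, numbered by its position in the list.
        conn.executemany("INSERT INTO wanted VALUES (?, ?)", list(enumerate(ids)))
        rows = conn.execute(
            """
            SELECT p.name
            FROM wanted AS w
            JOIN provinces AS p ON p.province_id = w.id
            ORDER BY w.pos
            """
        ).fetchall()
    finally:
        conn.execute("DROP TABLE wanted")
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE provinces (province_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO provinces VALUES (?, ?)",
    [(1, "name1"), (2, "name2"), (3, "name3")],
)
ordered = names_in_order(conn, [1, 3, 2, 1])
```

Because the join is driven by the helper table, an id that appears twice in the list produces two rows.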
To get the same row more than once you need to join in another table. I suggest to create, only once(!), a helper table. This table will just contain a series of natural numbers (1, 2, 3, 4, ... etc). Such a table can be useful for many other purposes.
Here is the script to create it:
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
For the task at hand it is not necessary to add many records, as you only need to make sure you never have more repetitions in your in condition than in the above seq table. I guess 128 will be good enough, but feel free to double the number of records a few times more.
Once you have the above, you can write queries like this:
select province_id,
name,
@pos := instr(@in2 := insert(@in2, @pos+1, 1, '#'),
concat(',',province_id,',')) ord
from (select @in := '0,1,2,3,1,0', @in2 := @in, @pos := 10000) init
inner join provinces
on find_in_set(province_id, @in)
inner join seq
on num <= length(replace(@in, concat(',',province_id,','),
concat(',+',province_id,',')))-length(@in)
order by ord asc
Output for the sample data and sample in list:
| province_id | name | ord |
|-------------|--------|-----|
| 1 | name 1 | 2 |
| 2 | name 2 | 4 |
| 3 | name 3 | 6 |
| 1 | name 1 | 8 |
How it works
You need to put the list of values in the assignment to the variable @in. For it to work, every valid id must be wrapped between commas, which is why there is a dummy zero at the start and the end.
By joining in the seq table the result set can grow. The number of records joined in from seq for a particular provinces record is equal to the number of occurrences of the corresponding province_id in the list @in.
There is no out-of-the-box function to count the number of such occurrences, so the expression at the right of num <= may look a bit complex. But it just adds a character for every match in @in and checks how much the length grows by that action. That growth is the number of occurrences.
In the select clause the position of the province_id in the @in list is returned and used to order the result set, so it corresponds to the order in the @in list. In fact, the position is taken with reference to @in2, which is a copy of @in, but is allowed to change:
While this @pos is being calculated, the number at the previously found @pos in @in2 is destroyed with a # character, so the same province_id cannot be found again at the same position.
It's unclear exactly what you want, but here's why it's not working the way you expect. The IN keyword is shorthand for a statement like ... WHERE province_id = 1 OR province_id = 2 OR province_id = 3 OR province_id = 1. Since province_id = 1 already evaluates to true at the beginning of that statement, it doesn't matter that it is included again later. Repeating a value in the list therefore has no bearing on whether the result returns a duplicate.

MySQL Pivot multiple rows into new columns

I am trying to write a pivot function in MySQL workbench and many of the places I've looked have not been super relevant.
I currently have:
order_ID Part Description Order number
1 103 A 1
2 104 B 1
3 103 A 2
4 105 C 3
5 103 A 4
6 105 C 4
7 107 D 4
I would like to create:
Order Part1 Description Part2 Description Part3 Description
1 103 A 104 B
2 103 A
3 105 C
4 103 A 105 C 107 D
I can keep the primary key in the output, but it is not necessary. The problem I am running into is that many pivot functions involve using distinct parts names to move them; however, I have over 500 parts. I also would like to move the description and the part together so they are next to each other--most pivot functions are not powerful enough to address that.
I did write a macro to do this in Excel, but it must be done in a database because of further analysis in R and I am pulling data from a database and I must automate any changes made to the data. As a result, I DO NOT have a choice in how the data is organized and laid out. Please do not mention normalizing data or other database techniques because I am trying to fix the data and how messy it is, but I DO NOT have a choice in how the data is inputted.
These are some resources I used to learn about pivoting in MySQL, but I have not been able to get any of the code to work:
MySQL pivot table
mysql pivoting - how can I fetch data from the same table into different columns?
http://en.wikibooks.org/wiki/MySQL/Pivot_table
http://buysql.com/mysql/14-how-to-automate-pivot-tables.html
Select group_concat(Table.column1) as anything,
group_concat(Table.column2 separator ';')
AS Anything2, Table.`column3`
FROM Table
group by Table.column3;
Alter TABLE Table ADD
`newcolumn1` varchar(100) DEFAULT '' after `column3`;
Alter TABLE Table ADD
`newcolumn2` varchar(500) DEFAULT '' after `newcolumn1`;
UPDATE Table SET
`newcolumn1` = IF (
LOCATE(',', column1) >0,
SUBSTRING(column1, 1,LOCATE(',', column1)-1),
column1
),
`newcolumn2` = IF(
LOCATE(',', column1) > 0,
SUBSTRING(column1, LOCATE(',', column1)+1),
'');
UPDATE Table SET
newcolumn2 = SUBSTRING_INDEX(newcolumn2, ',', 1);
UPDATE Table SET
newcolumn3 = SUBSTRING_INDEX(newcolumn3, ',', 1);
This code achieved exactly the format I wanted above.
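For readers who want the wide layout from the question directly, here is a hedged sketch that numbers the parts within each order and then pivots with conditional aggregation. It is not the code above: the table name parts, the snake_case column names, and the limit of three part slots are assumptions for illustration, and it runs on SQLite from Python so it can be tested without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (order_id INTEGER, part INTEGER, description TEXT, order_number INTEGER)"
)
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?, ?)",
    [
        (1, 103, "A", 1), (2, 104, "B", 1), (3, 103, "A", 2), (4, 105, "C", 3),
        (5, 103, "A", 4), (6, 105, "C", 4), (7, 107, "D", 4),
    ],
)

slots = range(1, 4)  # assumed maximum number of parts per order
cols = ", ".join(
    f"MAX(CASE WHEN rn = {n} THEN part END) AS part{n}, "
    f"MAX(CASE WHEN rn = {n} THEN description END) AS description{n}"
    for n in slots
)
# rn is the position of a part within its order, counted by order_id.
sql = f"""
    SELECT order_number, {cols}
    FROM (
        SELECT p.*,
               (SELECT COUNT(*) FROM parts AS q
                 WHERE q.order_number = p.order_number
                   AND q.order_id <= p.order_id) AS rn
        FROM parts AS p
    ) AS numbered
    GROUP BY order_number
    ORDER BY order_number
"""
wide = conn.execute(sql).fetchall()
```

The pivot is by position within the order rather than by part name, so the number of distinct parts does not matter; only the maximum number of parts per order does.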

Progress OpenEdge, ODBC, recordsets, joining, oh my

So frustrated here. I'm not a DB admin but can get around. I'm writing some ODBC queries against a Progress OpenEdge database that we only have view access to. For the longest time there were no problems, until they recently changed the data structure and, for who knows what reason, moved customer phone numbers into their own table called "contact". Before, they were in "cif", where the address etc. still remain.
Instead of creating the "contact" table with one row for each customer and a field for each phone number, they use a code of 0-4, the number/email, and the customer. So if a customer has 4 phone numbers, they have 4 rows with different code and contact fields and the customer name repeated.
I'm trying to join the "contact" table with the "cif" table so that it returns each occurrence of the customer in "cif", no matter how many times it is listed there, but includes all associated phone numbers from "contact" on each line.
Table structure simplified is like so:
Table "contact"
code | contact(#) | customer
--------------------------------
0 | (123)456-7890 | ABC Corp
1 | (123)456-7891 | ABC Corp
0 | (987)654-3210 | CBA Inc
Table "cif"
customer | b_in_low | b_in_high
----------------------------------
ABC Corp | 50.45 | 134.66
ABC Corp | 64.45 | 188.99
CBA Inc | 12.56 | 890.33
What I'm trying to return is a joined row for each row in "cif" but include all numbers from "contact" so the table above would return:
rsRow1) ABC Corp, 0, (123)456-7890, 1, (123)456-7891, 50.45, 134.66
rsRow2) ABC Corp, 0, (123)456-7890, 1, (123)456-7891, 64.45, 188.99
rsRow3) CBA Inc, 0, (987)654-3210,,, 12.56, 890.33
What I do NOT want:
rsRow1) ABC Corp, 0, #, 50.45, 134.66
rsRow2) ABC Corp, 1, #, 50.45, 134.66
rsRow3) ABC Corp, 0, #, 64.45, 188.99
rsRow4) ABC Corp, 1, #, 64.45, 188.99
rsRow5) CBA Inc, 0, #, 12.56, 890.33
Make sense? I can get it to work by running one recordset on the "cif" table and, during each repeat region, performing another query on "contact" using "cif.customer" as a WHERE filter, but obviously that is extremely slow and could result in thousands of queries.
I can get it to return only 1 line from "cif", but then with only 1 number from "contact",
or
I can get it to return up to 5 duplicate "cif" lines, one for each of the 5 different phone numbers.
So in a nutshell, how can I efficiently get 1 row from "cif" while listing all of the up to 5 phone numbers from "contact"?
How about this:
SELECT c.customer
, ISNULL(c1.code,'')
, ISNULL(c1.contact,'')
, ISNULL(c2.code,'')
, ISNULL(c2.contact,'')
, ISNULL(c3.code,'')
, ISNULL(c3.contact,'')
, ISNULL(c4.code,'')
, ISNULL(c4.contact,'')
, ISNULL(c5.code,'')
, ISNULL(c5.contact,'')
, c.b_in_low
, c.b_in_high
FROM CIF AS c
LEFT OUTER JOIN Contact AS c1
ON c1.customer = c.customer
AND c1.code = 0
LEFT OUTER JOIN Contact AS c2
ON c2.customer = c.customer
AND c2.code = 1
LEFT OUTER JOIN Contact AS c3
ON c3.customer = c.customer
AND c3.code = 2
LEFT OUTER JOIN Contact AS c4
ON c4.customer = c.customer
AND c4.code = 3
LEFT OUTER JOIN Contact AS c5
ON c5.customer = c.customer
AND c5.code = 4
What is returned depends on the type of the 'code' field; if you want it to be blank, you will probably need another conversion.
Not pretty, but I think it works.
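The one-join-per-code approach can be tried against the sample data from the question. The sketch below generates the five LEFT JOINs in Python and runs them on an in-memory SQLite database, which is only a stand-in; whether the same statement is accepted unchanged over OpenEdge ODBC is not something this shows. It also assumes a customer has at most one contact row per code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contact (code INTEGER, contact TEXT, customer TEXT);
    CREATE TABLE cif (customer TEXT, b_in_low REAL, b_in_high REAL);
    INSERT INTO contact VALUES (0, '(123)456-7890', 'ABC Corp'),
                               (1, '(123)456-7891', 'ABC Corp'),
                               (0, '(987)654-3210', 'CBA Inc');
    INSERT INTO cif VALUES ('ABC Corp', 50.45, 134.66),
                           ('ABC Corp', 64.45, 188.99),
                           ('CBA Inc', 12.56, 890.33);
    """
)

codes = range(5)  # contact codes 0-4, as described in the question
select_cols = ", ".join(f"c{n}.contact" for n in codes)
# One LEFT JOIN per code, so a missing number becomes NULL instead of
# removing or duplicating the cif row.
joins = "\n".join(
    f"LEFT JOIN contact AS c{n} ON c{n}.customer = c.customer AND c{n}.code = {n}"
    for n in codes
)
sql = f"""
    SELECT c.customer, {select_cols}, c.b_in_low, c.b_in_high
    FROM cif AS c
    {joins}
    ORDER BY c.customer, c.b_in_low
"""
rows = conn.execute(sql).fetchall()
```

Each cif row appears exactly once, with up to five contact columns filled in.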
There is an XML option in SQL Server that lets you take multiple results and merge them into a concatenated string in a single field. It's the STUFF ... FOR XML PATH technique. Here's an example of how I've used it.
SELECT call_number, item_number,
REPLACE(REPLACE(STUFF((SELECT DISTINCT ',',''''
+ CONVERT(VARCHAR(20), item_line) + '**'
+ item_number + '**' + work_code + ''''
FROM stage_call_item_detail s
WHERE h.source_system_code = s.source_system_code
AND h.domain_code = s.domain_code
AND h.call_number = s.call_number
AND s.site_code IS NOT NULL
ORDER BY 2
FOR XML PATH(''))
,1, 1, '') ,'<item_number>',''''),'</item_number>','''') call_line_item_list, *
FROM stage_ssm_call_history h
WHERE call_number = 'A1014-01'
Can you use buffers when building a query?
If you can, you could do something like:
define buffers contactA ... contactN for contact.
FOR EACH cif WHERE cif.customer = "ABC Corp",
FIRST contact OF cif OUTER-JOIN,
FIRST contactA OF cif WHERE ROWID(contactA) <> ROWID(contact) OUTER-JOIN,
...
FIRST contactN of cif
WHERE ROWID(contactN) <> ROWID(contact)
AND ROWID(contactN) <> ROWID(contactA)
...
This is not a nice solution, and performance can suffer seriously. It will also only work if you have a limited number of contacts, say 0-4.