SSIS Derived Column - ssis

How do I convert an empty string into a default string?
That is, I have a column of names, and some of its values are empty strings. I want those empty strings replaced with UNKNOWN.
Before:
ID  Name     Age
1            15
2   Sam      20
3            47
4   Smith    25
After:
ID  Name     Age
1   UNKNOWN  15
2   Sam      20
3   UNKNOWN  47
4   Smith    25

You can use a conditional ("if-else") expression in the Derived Column transformation editor. SSIS expressions take string literals in double quotes. For example, you could try:
[Name] == "" ? "UNKNOWN" : [Name]
More info is here: http://msdn.microsoft.com/en-us/library/ms141069.aspx
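For readers who want to check the intended result outside SSIS, here is a minimal Python sketch of the same conditional logic applied to the sample rows from the question. The function name `fill_unknown` and the dictionary layout are illustrative assumptions, not part of SSIS.

```python
def fill_unknown(rows, default="UNKNOWN"):
    """Return copies of the rows with empty Name values replaced by default."""
    return [
        # Same test as the SSIS expression: only an empty string is replaced.
        {**row, "Name": row["Name"] if row["Name"] != "" else default}
        for row in rows
    ]


# Sample data copied from the "Before" table in the question.
rows = [
    {"ID": 1, "Name": "", "Age": 15},
    {"ID": 2, "Name": "Sam", "Age": 20},
    {"ID": 3, "Name": "", "Age": 47},
    {"ID": 4, "Name": "Smith", "Age": 25},
]

cleaned = fill_unknown(rows)
for row in cleaned:
    print(row["ID"], row["Name"], row["Age"])
```

Like the expression above, this only handles empty strings. NULL values or names made up of spaces would need an extra check.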

Related

SQL QUERY to show records of names that are recorded several times and are unique towards each other

For example, consider this table of names (Mike, Glenn, Daryl, Shane and Patricia) with their respective ages:
Id  Name      Age
1   Mike      25
2   Glenn     19
3   Glenn     19
4   Daryl     56
5   Shane     30
6   Shane     30
7   Patricia  16
Now I want a query that shows each name only once, without repetitions.
EDIT: I entered the data from the first picture. The request is to list the names without duplicates, as shown in the second and third pictures, but I will not convert those to text.
DISTINCT specifies removal of duplicate rows from the result set.
SELECT DISTINCT Name
FROM tablename
see: use DISTINCT in SELECT
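You can also use GROUP BY to achieve it. Selecting only grouped or aggregated columns keeps the query valid in databases that reject SELECT * combined with GROUP BY:
SELECT MIN(id) AS id, name, age
FROM your_table
GROUP BY name, age
ORDER BY id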
With the data you gave, the result of this query will be:
id  name      age
1   Mike      25
2   Glenn     19
4   Daryl     56
5   Shane     30
7   Patricia  16
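Both approaches can be tried against the sample data with a small script. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table name `people` and the alias `first_id` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people (id, name, age) VALUES (?, ?, ?)",
    [
        (1, "Mike", 25),
        (2, "Glenn", 19),
        (3, "Glenn", 19),
        (4, "Daryl", 56),
        (5, "Shane", 30),
        (6, "Shane", 30),
        (7, "Patricia", 16),
    ],
)

# DISTINCT: one row per distinct name.
distinct_names = [
    row[0] for row in conn.execute("SELECT DISTINCT name FROM people ORDER BY name")
]

# GROUP BY: one row per (name, age) pair, keeping the lowest id of each group.
grouped = conn.execute(
    "SELECT MIN(id) AS first_id, name, age "
    "FROM people GROUP BY name, age ORDER BY first_id"
).fetchall()

print(distinct_names)
print(grouped)
```

DISTINCT is enough when only the names are needed. The GROUP BY form is useful when one representative id per name should be kept as well.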

Mysql Parsing logic on Multiple rows

I have parsing queries based on the references below:
link1 - SET and Select Query combine Run in a Single MySql Query to pass result in pentaho
link2
The input is shown in Col1 below. For @input in the reference above I considered only one record and applied the parsing logic to each cell. The issue is handling multiple rows (n rows) and combining the results with the parsing logic.
Col1
--------------
22:4,33:4
33:6,89:7,69:2,63:2
78:6
blank record
22:6,63:1
I want to do this in a single query, as in the reference link I asked about.
Expected Output
xyz count
------------
22 10
33 10
89 7
69 2
63 3
78 6
I tried passing the values in the following ways:
a where condition that passes col1 one value at a time: col1 in (my query)
MAX(col1)
group_concat
but I could not get the expected output while fitting all of this into a single query.
I finally found a solution to my question: group_concat worked for this.
SET @input = (SELECT GROUP_CONCAT(Col1) FROM (SELECT Col1 FROM table LIMIT 10) s);
group_concat merges all the rows of Col1 into one comma-separated string:
22:4,33:4,33:6,89:7,69:2,63:2,78:6,blank record,22:6,63:1
Now that we have a single string, we can apply the same logic as shown in link1.
The blank record can be stripped out with REPLACE and ignored.
Output after applying the logic from link1:
xyz count
------------
22 4
33 4
33 6
89 7
69 2
63 2
78 6
22 6
63 1
Then just use GROUP BY:
select xyz, sum(count) from (select link1 output) s group by xyz;
which gives the final output:
xyz count
------------
22 10
33 10
89 7
69 2
63 3
78 6
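To cross-check the expected totals independently of MySQL, the same split-and-sum logic can be sketched in a few lines of Python. This is only an illustration of the aggregation, not a replacement for the single-query solution. The function name `sum_pairs` is an assumption, and the blank record is modelled as an empty string.

```python
def sum_pairs(rows):
    """Sum the counts per key across rows of 'key:count' pairs separated by commas."""
    totals = {}
    for row in rows:
        for item in row.split(","):
            key, sep, value = item.partition(":")
            if not sep:
                # Blank or malformed entries carry no key:count pair, so skip them.
                continue
            key = key.strip()
            totals[key] = totals.get(key, 0) + int(value)
    return totals


# Col1 values from the question; the blank record is represented as "".
rows = ["22:4,33:4", "33:6,89:7,69:2,63:2", "78:6", "", "22:6,63:1"]

totals = sum_pairs(rows)
for key, count in totals.items():
    print(key, count)
```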

how to organize an undefined data set

I have a data1.csv file with 74 rows containing names and about 300 columns containing dates.
I used the following code:
data1 <- read.csv("data1.csv")
names(data1)[1] <- "name"
So, the data looks like the following:
name   v2          v3          etc  v300
       2011/08/01  2011/08/02  etc  2014/03/03
name1  123         132         etc  134
name2  12          14          etc  15
etc
For each name (name1 to name74) there is one value per date. I don't need the V2-V300 labels; I just want the second row to be used as "date". How can I transpose the data by name so that it looks like the following:
name date data
name1 2011/08/01 123
name1 2011/08/02 132
etc etc etc
name2 2011/08/01 12
name2 2011/08/02 14
etc etc etc
Thanks in advance,
chw
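What is being asked for is a wide-to-long reshape. The logic is sketched below in Python rather than R, purely to illustrate the transformation, and it assumes the layout described above: a header row, then a row holding the dates with an empty first cell, then one row per name. The function name `wide_to_long` and the inline sample are assumptions made for the example.

```python
import csv
import io


def wide_to_long(rows):
    """Turn rows of [name, value, value, ...] into (name, date, value) tuples.

    rows[0] is the header (name, v2, v3, ...), which is ignored.
    rows[1] holds the dates, with an empty first cell.
    """
    dates = rows[1][1:]
    long_rows = []
    for row in rows[2:]:
        name, values = row[0], row[1:]
        for date, value in zip(dates, values):
            long_rows.append((name, date, value))
    return long_rows


# A tiny stand-in for data1.csv with two date columns and two names.
sample = """name,v2,v3
,2011/08/01,2011/08/02
name1,123,132
name2,12,14
"""

rows = list(csv.reader(io.StringIO(sample)))
long_rows = wide_to_long(rows)
for name, date, value in long_rows:
    print(name, date, value)
```

The values stay as strings here because csv.reader does not convert types.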

Mysql :Exclude row that does not satisfy the condition list

So here is my data:
ID   C1       C2        C3
6             Digit 2   6,8,10,12
12            Digit 3   15
15   127      Digit 2   6,7,8,9,10,11,12,13
68   140,141  Digit 11  85,86,87,88,167,168,158,159
73   1        Digit 11  85,86,87,88,169,170
76            Digit 11  85,86,87,91,164,165,166,167,168
99            Digit 11  20,27,85,86,87
106           Digit 1   1,2
111           Digit 11  85,86,87,88
112           Digit 11  85,86,87,88
135           Digit 11  85,86,87
and my condition list is (2,6,15,37,42,52,62,65,79,85,94,100,104,107,113,124,131)
Now I want to exclude rows 3, 4 and 5 if the values 127, 140, 141 and 1 are not in the condition list. I tried NOT IN, but to no avail. I think I might be missing something basic, but I just can't get it.
It's better not to store multiple values in a column if possible. Then it's easier to do queries like this.
You cannot use "IN" or "NOT IN" because they are looking for a list of separate items. But C3 is just one item that happens to have commas in it.
Try this:
SELECT * FROM
(SELECT ID, C1, C2, CONCAT('|',REPLACE(C3,',','|'),'|') AS C3 FROM `table` WHERE `C3`) AS t1
WHERE t1.C3 NOT LIKE '%|127|%' AND t1.C3 NOT LIKE '%|140|%' AND t1.C3 NOT LIKE '%|141|%' AND t1.C3 NOT LIKE '%|1|%'
You could avoid the "|" and just concat "," to the start and end.
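The reason for wrapping the list in delimiters is that a plain substring match on 1 would also hit 10, 12 or 15. A minimal sketch of the comma-wrapped variant is shown below, using Python's built-in sqlite3 module as a stand-in for MySQL. Note that SQLite concatenates with || where MySQL uses CONCAT. The table name `t` and the reduced sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, C3 TEXT)")
conn.executemany(
    "INSERT INTO t (ID, C3) VALUES (?, ?)",
    [
        (6, "6,8,10,12"),
        (12, "15"),
        (106, "1,2"),
        (111, "85,86,87,88"),
    ],
)

# Naive substring match: '%1%' also matches 10, 12 and 15, so too much is excluded.
naive = [
    row[0]
    for row in conn.execute("SELECT ID FROM t WHERE C3 NOT LIKE '%1%' ORDER BY ID")
]

# Delimiter-wrapped match: ',1,' can only match the whole value 1.
wrapped = [
    row[0]
    for row in conn.execute(
        "SELECT ID FROM t WHERE (',' || C3 || ',') NOT LIKE '%,1,%' ORDER BY ID"
    )
]

print(naive)
print(wrapped)
```

Only the row whose list really contains the value 1 is dropped by the wrapped version.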
Or you could fix your database schema so that it is actually a normalized relational database.
Every column that contains multiple values should be separated out into its own table.
There should be no column C3 in your table above. Instead, you should have a table, some_other_data.
Because C3=6 is related to more than one record in the main table, you also need a third, linking table in addition to some_other_data. See below.
`some_other_data`
id
6
8
10
12
15
`main_table_to_some_other_data_link`
some_other_data_id | main_table_id
6                  | 6
8                  | 6
10                 | 6
12                 | 6
15                 | 12
6                  | 15
etc. You can see that the linking table can contain duplicates of either value. But your other two tables would have completely unique ids.
I think you're trying to solve the wrong problem.
(I'm assuming you can change your table structure. If you can't, someone else will need to address your question.)
The long lists of comma-separated data are a flag that they have a one-to-many relationship with ID.
For example, make the data in C3 its own table:
ID MainID C3
================
1 6 6
2 6 8
3 6 10
4 6 12
5 12 15
6 15 6
7 15 7
8 15 8
9 15 9
10 15 10
11 15 11
12 15 12
13 15 13
// and so forth //
So ID is the primary key of the new table, MainID is the foreign key that refers to the record in your primary table, and C3 is the data in C3.
Each separate value of C3 now has its own record.
Now, you're in a position to use something like
Select * from MainTable
Inner Join NewTable
On MainTable.ID = NewTable.MainID
Where NewTable.C3 Not In (2,6,15,37,42,52,62,65,79,85,94,100,104,107,113,124,131);
If you can, pulling out the one-to-many relationships into their own tables will make things easier for you.
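To see how the query behaves once the values live in their own table, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. It loads only the first five rows of the new table shown above, so the data is deliberately incomplete, and the explicit column list in the SELECT is a choice made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MainTable (ID INTEGER PRIMARY KEY);
    CREATE TABLE NewTable (
        ID INTEGER PRIMARY KEY,
        MainID INTEGER REFERENCES MainTable(ID),
        C3 INTEGER
    );
    INSERT INTO MainTable (ID) VALUES (6), (12);
    INSERT INTO NewTable (ID, MainID, C3) VALUES
        (1, 6, 6), (2, 6, 8), (3, 6, 10), (4, 6, 12), (5, 12, 15);
    """
)

# The condition list from the question, passed as bound parameters.
condition = (2, 6, 15, 37, 42, 52, 62, 65, 79, 85, 94, 100, 104, 107, 113, 124, 131)
placeholders = ",".join("?" * len(condition))

query = (
    "SELECT MainTable.ID, NewTable.C3 "
    "FROM MainTable "
    "INNER JOIN NewTable ON MainTable.ID = NewTable.MainID "
    f"WHERE NewTable.C3 NOT IN ({placeholders}) "
    "ORDER BY NewTable.ID"
)
matches = conn.execute(query, condition).fetchall()
print(matches)
```

Each C3 value is now compared as a separate item, which is what NOT IN expects. Here the values 6 and 15 are filtered out because they appear in the condition list.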

SSIS Derived Column casting numeric(6,0) to ANSI string drops leading digits

I have a package that's reading a table from an Oracle data source. The surrogate key for the table (an identity column) was assigned a numeric(6,0) data type by SSIS.
When I take that column into a Derived Column component and cast it as (DT_STR,100,1252) to store in a lookup table downstream, the string column produces incorrect output - but in a very strange way.
Here's some sample output:
ID ID_AS_STRING
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 0
11 11
12 12
.
.
20 0
21 21
22 22
.
.
.
30 0
.
.
40 0
Basically, it drops the leading digit of the ID if it's divisible by 10 (this goes on into the hundreds and thousands, i.e. 740 becomes 40, 9920 becomes 920.)
Needless to say, this is doubly wrong, since it misses some IDs for the lookup and creates duplicates of others.
If you change the initial column data type to a four-byte integer (DT_I4) it works fine, so in a way this question is merely academic. But can anyone explain what's going on in SSIS to drop that leading digit of a multiple of 10?