Selecting distinct 5 columns combination in MySQL

I have a mysql table that looks like this: Col 1 is UNIQUE.
1 value1 value2 0 2
2 value1 value2 1 3
3 value3 value4 3 2
4 value4 value5 4 1
5 value3 value4 3 1
I need a query that selects rows so that no combination of columns 2 and 3 repeats. For this example, the output I want looks like this:
1 value1 value2 0 2
3 value3 value4 3 2
4 value4 value5 4 1
All five columns taken together are always distinct (because col 1 is unique), but I want to display col 1, 2 and 3 without the combination of col 2 and 3 repeating.
I've found a few samples on how to do it, but they all select distinct on each column individually. I tried many Stack Overflow answers too, but my question is different.

One method that performs well is a correlated subquery:
select t.*
from t
where t.col1 = (select min(t2.col1)
                from t t2
                where t2.col2 = t.col2 and t2.col3 = t.col3
               );
For best performance, you want an index on (col2, col3, col1).
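To check the correlated subquery against the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the column names col4 and col5 and the index name are made up for illustration. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, "
    "col4 INTEGER, col5 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "value1", "value2", 0, 2),
        (2, "value1", "value2", 1, 3),
        (3, "value3", "value4", 3, 2),
        (4, "value4", "value5", 4, 1),
        (5, "value3", "value4", 3, 1),
    ],
)
# The index suggested in the answer; the index name is hypothetical.
conn.execute("CREATE INDEX idx_t_col2_col3_col1 ON t (col2, col3, col1)")

# Keep only the row with the smallest col1 for each (col2, col3) pair.
query = """
SELECT t.*
FROM t
WHERE t.col1 = (SELECT MIN(t2.col1)
                FROM t t2
                WHERE t2.col2 = t.col2 AND t2.col3 = t.col3)
ORDER BY t.col1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this keeps the rows with col1 = 1, 3 and 4, which is the output asked for in the question.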
I strongly advise having a primary key on all tables, but if you did not have one, then row_number() would be the way to go:
select t.*
from (select t.*,
             row_number() over (partition by col2, col3 order by col2) as seqnum
      from t
     ) t
where seqnum = 1;
This incurs a tad more overhead because row numbers need to be assigned to all rows before they are filtered down to the first one in each group.

It could be achieved by using ROW_NUMBER:
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY col2, col3 ORDER BY col1) AS rn
FROM tab) sub
WHERE rn=1
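The ROW_NUMBER variant can be tried the same way. This is only a sketch: it assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module in place of MySQL 8, and the names col4 and col5 are invented for the two unnamed columns.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT, "
    "col4 INTEGER, col5 INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?)",
    [
        (1, "value1", "value2", 0, 2),
        (2, "value1", "value2", 1, 3),
        (3, "value3", "value4", 3, 2),
        (4, "value4", "value5", 4, 1),
        (5, "value3", "value4", 3, 1),
    ],
)

# Number the rows inside each (col2, col3) group, lowest col1 first,
# then keep only the first row of every group.
query = """
SELECT col1, col2, col3, col4, col5
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY col2, col3 ORDER BY col1) AS rn
      FROM tab) sub
WHERE rn = 1
ORDER BY col1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ordering the window by col1 makes the choice of the surviving row deterministic: the lowest col1 in each group wins.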

Related

Filter twice with MAX() in MySQL

I have the following table called my_values in a MySQL 5.7 database:
value1 value2 value3
foo 7 something4
foo 5 something1
foo 12 anything5
bar 3 something7
bar 18 anything5
bar 0 anything8
baz 99 anything9
baz 100 something0
As you can see, there are duplicates in value1. I want to SELECT each unique value1 only once, keeping the row with the highest value in value2.
I'm using this query for that:
SELECT v.* FROM my_values v WHERE v.value2 = (SELECT MAX(v2.value2) FROM my_values v2 WHERE v2.value1 = v.value1);
The result is:
value1 value2 value3
foo 12 anything5
bar 18 anything5
baz 100 something0
Here's a fiddle of that.
From this result I want to SELECT each unique value3 only once, again keeping the row with the highest value in value2 (no matter what value1 is).
So expected result would be:
value1 value2 value3
bar 18 anything5
baz 100 something0
How can I do that?
Here is how you can do it:
select t1.*
from my_values t1
natural join (select value1, MAX(value2) value2
from my_values
group by value1 ) t2
natural join (select value3, MAX(value2) value2
from my_values
group by value3) t3
fiddle
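As a rough check of the double-filter idea, the sketch below runs it against the sample data. Two things here are assumptions rather than part of the answer: SQLite (via Python's sqlite3 module) stands in for MySQL 5.7, and the joins are written with explicit ON conditions instead of NATURAL JOIN, so that the matched columns are spelled out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 5.7 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_values (value1 TEXT, value2 INTEGER, value3 TEXT)")
conn.executemany(
    "INSERT INTO my_values VALUES (?, ?, ?)",
    [
        ("foo", 7, "something4"),
        ("foo", 5, "something1"),
        ("foo", 12, "anything5"),
        ("bar", 3, "something7"),
        ("bar", 18, "anything5"),
        ("bar", 0, "anything8"),
        ("baz", 99, "anything9"),
        ("baz", 100, "something0"),
    ],
)

# A row survives only if its value2 is the maximum for its value1
# and also the maximum for its value3.
query = """
SELECT t1.*
FROM my_values t1
JOIN (SELECT value1, MAX(value2) AS value2
      FROM my_values
      GROUP BY value1) t2
  ON t2.value1 = t1.value1 AND t2.value2 = t1.value2
JOIN (SELECT value3, MAX(value2) AS value2
      FROM my_values
      GROUP BY value3) t3
  ON t3.value3 = t1.value3 AND t3.value2 = t1.value2
ORDER BY t1.value1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The row (foo, 12, anything5) drops out because 18, not 12, is the highest value2 for anything5.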
You can use tuples for the comparison:
select t.*
from my_values t
where (t.value2, t.value3) = (select t2.value2, t2.value3
from my_values t2
where t2.value1 = t.value1
order by t2.value2 desc, t2.value3 desc
limit 1
);

SQL Select rows where col1 or col2 equals variable

I want to select rows from a table where col1 or col2 equals a variable X. However, if a row has already been selected where one column equals X and the other equals some value Y, then no further row with the same pair of values (in either order) should be selected. Everything should be ordered by the TIME column, descending.
Let's say this is my table:
COL1 COL2 TIME COL4
1 2 0 A
1 2 1 B
2 1 2 C
1 3 3 D
3 1 4 E
4 2 5 F
3 4 6 G
1 2 7 H
4 1 8 I
And let's say that variable X equals 1; then I want to have these rows:
COL1 COL2 TIME COL4
4 1 8 I
1 2 7 H
3 1 4 E
So it won't show me this row
COL1 COL2 TIME COL4
2 1 2 C
because there is already a combination where col1/col2 is 2/1 or 1/2.
Sorry if I explained it in a bad way, but I can't think of better explanation.
Thank you guys.
Making a couple of key assumptions...
SELECT a.*
FROM my_table a
JOIN
   ( SELECT MAX(time) time
     FROM my_table
     WHERE 1 IN (COL1,COL2)
     GROUP BY LEAST(col1,col2)
            , GREATEST(col1,col2)
   ) b
ON b.time = a.time;
EDIT: I posted this answer when it was thought that OP's database was SQL Server. But as it turns out, the database is MySQL.
I think this query should do it:
select t.col1, t.col2, t.time, t.col4
from (select t.*,
row_number() over (
partition by
case when col1 < col2 then col1 else col2 end,
case when col1 < col2 then col2 else col1 end
order by time desc) as rn
from tbl t
where t.col1 = x or t.col2 = x) t
where t.rn = 1
order by t.time desc
The key part is defining the row_number partition by clause in such a way that (1, 2) is considered equivalent to (2, 1), which is what the CASE expressions do. Once the partitioning works correctly, you just need to keep the first row of every "partition" (where t.rn = 1) to exclude duplicate rows.
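The partitioning trick can be tried on the sample table from the question. The sketch below assumes SQLite 3.25 or later through Python's sqlite3 module rather than the database the answer was written for, passes X as a named parameter :x, and uses the alias ranked for the derived table (both names are made up here).

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 INTEGER, time INTEGER, col4 TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, 2, 0, "A"), (1, 2, 1, "B"), (2, 1, 2, "C"),
        (1, 3, 3, "D"), (3, 1, 4, "E"), (4, 2, 5, "F"),
        (3, 4, 6, "G"), (1, 2, 7, "H"), (4, 1, 8, "I"),
    ],
)

# Partition by the (smaller, larger) value of each pair so that (1, 2)
# and (2, 1) fall into the same group, then keep the newest row per group.
query = """
SELECT ranked.col1, ranked.col2, ranked.time, ranked.col4
FROM (SELECT t.*,
             ROW_NUMBER() OVER (
                 PARTITION BY
                     CASE WHEN col1 < col2 THEN col1 ELSE col2 END,
                     CASE WHEN col1 < col2 THEN col2 ELSE col1 END
                 ORDER BY time DESC) AS rn
      FROM tbl t
      WHERE t.col1 = :x OR t.col2 = :x) ranked
WHERE ranked.rn = 1
ORDER BY ranked.time DESC
"""
rows = conn.execute(query, {"x": 1}).fetchall()
for row in rows:
    print(row)
```

For X = 1 this returns rows I, H and E, in that order, matching the output requested in the question.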

How to split one row into two based on 2 different column values

I am trying to accomplish something simple, but can't think straight. I have a case where one row can have values in two different columns. If that's the case, then instead of displaying just one row for these two values, I need to display two rows, one for each column value. For example:
ID Col1 col2 col3 col4
46054 2011W3974 164505 1 2
58765 2014W3777 275908 1 NULL
52311 2013W1877 247047 1 NULL
63032 2015W3317 295279 1 NULL
57552 2014W2813 274810 1 NULL
44584 2011W2622 173985 1 2
Rows 1 and 6 each need to be split into 2 rows, like below:
46054 2011W3974 164505 1 NULL
46054 2011W3974 164505 NULL 2
58765 2014W3777 275908 1 NULL
52311 2013W1877 247047 1 NULL
63032 2015W3317 295279 1 NULL
57552 2014W2813 274810 1 NULL
44584 2011W2622 173985 1 NULL
44584 2011W2622 173985 NULL 2
What is the best possible way to do this? I looked at the SPLIT XML function, but I don't think that will be helpful here. I also played with ranking functions, but since this involves 2 columns, I don't think that will work either. Please suggest.
Thanks,
RV
I'd probably just union it together:
-- one row for every non-null Col3 value
SELECT Id, Col1, Col2, Col3, NULL AS Col4
FROM <Your Table>
WHERE Col3 IS NOT NULL
UNION
-- a second row only where Col4 has a value
SELECT Id, Col1, Col2, NULL, Col4
FROM <Your Table>
WHERE Col4 IS NOT NULL
Just use UNION, not UNION ALL.
SELECT Id, Col1, Col2, Col3, NULL AS Col4
FROM YourTable
WHERE isnull(Col3, 0) <> 0
UNION
SELECT Id, Col1, Col2, NULL AS Col3, Col4
FROM YourTable
WHERE isnull(Col4, 0) <> 0
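To see what a UNION-based split yields on the sample rows, here is a small sketch. It assumes SQLite through Python's sqlite3 module in place of SQL Server, a hypothetical table name your_table, and plain IS NOT NULL filters, so that every non-null Col3 or Col4 value ends up on a row of its own.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (Id INTEGER, Col1 TEXT, Col2 INTEGER, "
    "Col3 INTEGER, Col4 INTEGER)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?)",
    [
        (46054, "2011W3974", 164505, 1, 2),
        (58765, "2014W3777", 275908, 1, None),
        (52311, "2013W1877", 247047, 1, None),
        (63032, "2015W3317", 295279, 1, None),
        (57552, "2014W2813", 274810, 1, None),
        (44584, "2011W2622", 173985, 1, 2),
    ],
)

# First branch: one row per non-null Col3, with Col4 blanked out.
# Second branch: one row per non-null Col4, with Col3 blanked out.
query = """
SELECT Id, Col1, Col2, Col3, NULL AS Col4
FROM your_table
WHERE Col3 IS NOT NULL
UNION
SELECT Id, Col1, Col2, NULL AS Col3, Col4
FROM your_table
WHERE Col4 IS NOT NULL
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The six input rows become eight output rows: the two rows that carry both values are split, and the other four pass through unchanged.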

how to achieve concatenation using group by

Suppose there is a table named 'a' with following data:
col1, col2
-----------
1 1
1 2
1 3
2 2
2 3
3 4
then to achieve following results:
col1, col2
--------------
1 6
2 5
3 4
I can run a query like:
select col1, sum(col2) from a group by col1.
But suppose my table is:
col1, col2
---------
1 a
1 b
1 c
2 d
2 e
3 f
Here col2 is of varchar type, not a numeric type.
What would be the SQL query to give the following results?
col1, col2
------------
1 a,b,c
2 d,e
3 f
I have tried GROUP BY on col1, but how do I concatenate the values in col2?
The problem is that col2 is of varchar type.
In MySQL you can use GROUP_CONCAT like this:
SELECT
col1,
GROUP_CONCAT(col2) as col2
FROM demo
GROUP BY col1;
Here is the sqlfiddle.
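For a quick local check without a fiddle, the sketch below runs the same kind of query on the sample data. It assumes SQLite through Python's sqlite3 module instead of MySQL; SQLite also provides a group_concat aggregate with a comma as the default separator, though neither database guarantees the order of the concatenated values unless you ask for one.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (3, "f")],
)

# One output row per col1, with all col2 values joined by commas.
query = """
SELECT col1, GROUP_CONCAT(col2) AS col2
FROM demo
GROUP BY col1
ORDER BY col1
"""
rows = conn.execute(query).fetchall()
for col1, joined in rows:
    print(col1, joined)
```

In MySQL, if the order inside each list matters, GROUP_CONCAT(col2 ORDER BY col2) makes it explicit.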
In SQL Server you can use STUFF together with FOR XML PATH like this:
SELECT t1.col1,
stuff((SELECT ',' + CAST(t2.col2 as VARCHAR(10))
FROM demo t2 WHERE t1.col1 = t2.col1
FOR xml path('')),1,1,'') col2
FROM demo t1
GROUP BY t1.col1;
Here is the sqlfiddle.
You can use group_concat function in mysql
select
col1,
group_concat(col2) as col2
from table_name
group by col1
Here is a good example, I ran into a similar issue whilst coding up a schedule (working example: www.oldiesplus.com/schedule/)
Here is the link to my question with answer: https://stackoverflow.com/a/27047139

mysql, find frequencies in table

I have a table in PHPMyAdmin with six columns.
In each cell there is a name.
Now I want to know how often each name occurs in each column.
For example:
column1 column2 column3
name1 name3 name2
name1 name2 name2
name2 name3 name1
Then I need a list with:
column1 column2 column3
name1 - 2 0 1
name2 - 1 1 2
name3 - 0 2 0
I tried to play with:
SELECT Count(*) FROM aanmeldingen2013 WHERE column1 LIKE name1.
Can someone help me with the SQL code to generate this output?
I think this is the most efficient method:
select name,
       sum(n = 1) as Column1Cnt,
       sum(n = 2) as Column2Cnt,
       sum(n = 3) as Column3Cnt
from (select (case when n.n = 1 then column1
                   when n.n = 2 then column2
                   when n.n = 3 then column3
              end) as name,
             n.n
      from t cross join
           (select 1 as n union all select 2 union all select 3) n
     ) t
group by name;
This should be more efficient than a union all query because it only scans the original table once. I've shown the example here for three columns (as in your sample data). It should be clear how to generalize this to six columns.
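The unpivot-and-count idea can be checked against the three-column sample. This is a sketch under a few assumptions: SQLite through Python's sqlite3 module stands in for MySQL, the table is called t, the helper table of numbers is aliased nums, and the outer query groups by name to get one row per name.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("name1", "name3", "name2"),
        ("name1", "name2", "name2"),
        ("name2", "name3", "name1"),
    ],
)

# Cross join with (1, 2, 3) to turn each row into three (name, n) pairs,
# then count per name how often it came from column n.
query = """
SELECT name,
       SUM(n = 1) AS Column1Cnt,
       SUM(n = 2) AS Column2Cnt,
       SUM(n = 3) AS Column3Cnt
FROM (SELECT CASE nums.n WHEN 1 THEN column1
                         WHEN 2 THEN column2
                         WHEN 3 THEN column3
             END AS name,
             nums.n AS n
      FROM t CROSS JOIN
           (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) nums
     ) unpivoted
GROUP BY name
ORDER BY name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The counts come out as name1: 2, 0, 1; name2: 1, 1, 2; name3: 0, 2, 0, which is the list asked for in the question.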
Check this query:
SELECT COUNT(*) FROM `tbl_name` WHERE column1 = 'name1';