I want to select distinct combinations of two columns in MySQL, treating each pair as unordered.
For example, given the table below:
col1 col2 col3
--------------
a b val1
a c val2
b a val1
b c val3
c a val2
c b val3
I need to select distinct (col1, col2) pairs regardless of the order of the two values.
col1 = a AND col2 = b
is equivalent to
col1 = b AND col2 = a
in my case, because the col3 value is the same for both orderings of col1 and col2.
The expected result is:
col1 col2 col3
--------------
a b val1
a c val2
b c val3
In other words, I want to eliminate the duplicates.
Any help you can give would be greatly appreciated.
Thank you in advance.
Use greatest and least functions to create groups:
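SELECT col1, col2, col3
FROM (
  SELECT *,
         -- rows sharing the same unordered pair fall into one partition;
         -- ordering by col1, col2 keeps the row with col1 < col2 first
         ROW_NUMBER() OVER (
           PARTITION BY LEAST(col1, col2), GREATEST(col1, col2)
           ORDER BY col1, col2
         ) AS rn
  FROM mytable
) t
WHERE rn = 1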
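To try the least/greatest idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the query above: SQLite has no LEAST/GREATEST, so its two-argument min()/max() are used instead, and a GROUP BY replaces the window function. The aliases lo, hi and val are made up for the illustration.

```python
import sqlite3

# In-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("a", "b", "val1"), ("a", "c", "val2"), ("b", "a", "val1"),
     ("b", "c", "val3"), ("c", "a", "val2"), ("c", "b", "val3")],
)

# Assumption: SQLite's scalar min(x, y) / max(x, y) stand in for MySQL's
# LEAST / GREATEST. Grouping by the normalized pair collapses (a, b) and
# (b, a) into one row; min(col3) is safe because col3 is the same for both.
unordered_pairs = conn.execute(
    """
    SELECT min(col1, col2) AS lo, max(col1, col2) AS hi, min(col3) AS val
    FROM mytable
    GROUP BY lo, hi
    ORDER BY lo, hi
    """
).fetchall()

for row in unordered_pairs:
    print(row)
```

On the sample data this gives the three rows from the expected result. Note that it always reports the pair with the smaller value first, even if only the reversed row exists in the table.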
As an alternative that is probably better in terms of performance, you may try the query below, which does not use window functions:
SELECT * FROM mytable M1
WHERE NOT EXISTS (SELECT 1 FROM mytable M2
WHERE M1.col1 = M2.col2
AND M1.col2 = M2.col1
AND M2.col1 < M2.col2)
Since it uses a NOT EXISTS clause instead of a window function, it is likely to perform faster than the query above.
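The NOT EXISTS query is plain enough SQL to check locally. A minimal sketch with Python's sqlite3 module (SQLite here is only a convenient stand-in for MySQL, and the ORDER BY is added so the output is stable):

```python
import sqlite3

# In-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("a", "b", "val1"), ("a", "c", "val2"), ("b", "a", "val1"),
     ("b", "c", "val3"), ("c", "a", "val2"), ("c", "b", "val3")],
)

# A row is dropped only when its mirrored row exists AND that mirrored
# row is the one with col1 < col2, so exactly one of each pair survives.
kept = conn.execute(
    """
    SELECT col1, col2, col3
    FROM mytable M1
    WHERE NOT EXISTS (SELECT 1 FROM mytable M2
                      WHERE M1.col1 = M2.col2
                        AND M1.col2 = M2.col1
                        AND M2.col1 < M2.col2)
    ORDER BY col1, col2
    """
).fetchall()

for row in kept:
    print(row)
```

One difference from the least/greatest approach: a row whose mirror is missing from the table is returned unchanged, even if its col1 is greater than its col2.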
Suppose I have data containing two columns I am interested in. Ideally, the data in these is always in matching sets like this:
A 1
A 1
B 2
B 2
C 3
C 3
C 3
However, there might be bad data where the same value in one column has different values in the other column, like this:
D 4
D 5
or:
E 6
F 6
How do I isolate these bad rows, or at least show that some of them exist?
You can use exists:
select t.*
from t
where exists (select 1 from t t2 where t2.col1 = t.col1 and t2.col2 <> t.col2);
If you just want the col1 values that have non-matches, you can use aggregation:
select col1, min(col2), max(col2)
from t
group by col1
having min(col2) <> max(col2);
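Both queries can be tried on the question's data with Python's sqlite3 module. This is only a sketch: the column types are assumed, and the last query (the same aggregation with the columns swapped) is my addition, included because the two queries above only look for one col1 with several col2 values and so would not flag the E 6 / F 6 case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")  # types are assumed
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1), ("A", 1), ("B", 2), ("B", 2),
     ("D", 4), ("D", 5), ("E", 6), ("F", 6)],
)

# All rows whose col1 value also appears with a different col2 value.
bad_rows = conn.execute(
    """
    SELECT t.col1, t.col2
    FROM t
    WHERE EXISTS (SELECT 1 FROM t t2
                  WHERE t2.col1 = t.col1 AND t2.col2 <> t.col2)
    ORDER BY t.col1, t.col2
    """
).fetchall()

# Just the col1 values that map to more than one col2 value.
bad_col1 = conn.execute(
    """
    SELECT col1, min(col2), max(col2)
    FROM t
    GROUP BY col1
    HAVING min(col2) <> max(col2)
    """
).fetchall()

# Mirrored check (not in the answer): col2 values shared by several col1 values.
bad_col2 = conn.execute(
    """
    SELECT col2, min(col1), max(col1)
    FROM t
    GROUP BY col2
    HAVING min(col1) <> max(col1)
    """
).fetchall()

print(bad_rows, bad_col1, bad_col2)
```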
Using MIN and MAX as analytic functions we can try:
WITH cte AS (
SELECT t.*, MIN(col2) OVER (PARTITION BY col1) AS min_col2,
MAX(col2) OVER (PARTITION BY col1) AS max_col2
FROM yourTable t
)
SELECT col1, col2
FROM cte
WHERE min_col2 <> max_col2;
The above approach, while seemingly verbose, would return all offending rows.
exTab
PK col1 col2 col3
---------------------------------
1 val1 val4 val7 **want to return this row only
2 val1 val4 val8
3 val1 val4 val8
4 val1 val5 val9
5 val2 val5 val9
6 val2 val5 val9
7 val2 val6 val0
8 val3 val6 val0
How do I use SQL (with MySQL) to return just the rows that have multiple of the same value in col1 with multiple of the same value in col2 but with a unique value in col3?
In the table above (exTab), for instance, val1 occurs 4 times in col1, and for these 4 occurrences val4 occurs 3 times in col2, but for these 3 occurrences val7 occurs only once in col3, so I would want to return this row (row 1). Given the criteria, row 1 would be the only row I would want to return from this table.
I've tried various combinations with group by, having count > 1, distinct, where not exists, and more, to no avail. This is my first post, so my apologies if I've done something incorrectly.
I would do this by combining the results of two subqueries:
In subquery 1 I would get the col1-col2 combinations which occur more than once.
In subquery 2 I would get the col1-col2-col3 combinations that occur only once.
The intersection (inner join) of these 2 subqueries would yield the record you are looking for.
select t1.*
from
exTab t1
inner join
-- col1-col2 combinations that occur more than once
(select col1, col2 from exTab
group by col1, col2
having count(*)>1) t2 on t1.col1=t2.col1
and t1.col2=t2.col2
inner join
-- col1-col2-col3 combinations that occur exactly once
(select col1, col2, col3 from exTab
group by col1, col2, col3
having count(*)=1) t3 on t1.col1=t3.col1
and t1.col2=t3.col2
and t1.col3=t3.col3
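As a quick check of the two-subquery idea on the exTab data from the question, here is a minimal sketch using Python's sqlite3 module. SQLite stands in for MySQL, the column types are assumed, and each join is given its own ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exTab (PK INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)"
)
conn.executemany(
    "INSERT INTO exTab VALUES (?, ?, ?, ?)",
    [(1, "val1", "val4", "val7"), (2, "val1", "val4", "val8"),
     (3, "val1", "val4", "val8"), (4, "val1", "val5", "val9"),
     (5, "val2", "val5", "val9"), (6, "val2", "val5", "val9"),
     (7, "val2", "val6", "val0"), (8, "val3", "val6", "val0")],
)

matching_pks = conn.execute(
    """
    SELECT t1.PK
    FROM exTab t1
    INNER JOIN (SELECT col1, col2 FROM exTab
                GROUP BY col1, col2
                HAVING count(*) > 1) t2          -- pairs seen more than once
            ON t1.col1 = t2.col1 AND t1.col2 = t2.col2
    INNER JOIN (SELECT col1, col2, col3 FROM exTab
                GROUP BY col1, col2, col3
                HAVING count(*) = 1) t3          -- triples seen exactly once
            ON t1.col1 = t3.col1 AND t1.col2 = t3.col2 AND t1.col3 = t3.col3
    ORDER BY t1.PK
    """
).fetchall()

print(matching_pks)
```

Only the row with PK 1 passes both filters: its (col1, col2) pair occurs three times, while its (col1, col2, col3) triple occurs once.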
If I have understood the problem correctly, this SQL query might help (window functions require MySQL 8.0 or later):
SELECT
SubTab.PK
FROM
(SELECT
PK,
COUNT(*) OVER (PARTITION BY col1, col2) AS pair_count,
COUNT(*) OVER (PARTITION BY col1, col2, col3) AS triple_count
FROM
exTab) SubTab
WHERE
SubTab.pair_count > 1 AND SubTab.triple_count = 1;
It runs two window aggregate functions over the original table inside a derived table, and from that derived table selects only the PK of rows whose col1-col2 pair occurs more than once while their col3 value occurs only once within that pair.
You could try something along the lines of:
SELECT
*
FROM exTab
WHERE col1 IN (SELECT col1 FROM exTab GROUP BY 1 HAVING count(*)>1)
AND col2 IN (SELECT col2 FROM exTab GROUP BY 1 HAVING count(*)>1)
AND col3 IN (SELECT col3 FROM exTab GROUP BY 1 HAVING count(*)=1)
Though the performance may be terrible if your table is large.
Here is my sample table
Col1 Col2
A A
B B
A C
B D
C C
I want to be able to select distinct records where all rows have the same value in Col1 and Col2. So my answer should be
Col1 Col2
A A
B B
C C
Simply:
select distinct * from t where col1 = col2;
If both columns can be NULL and you want to get those rows too:
select distinct * from t where coalesce(col1, col2) is null or col1 = col2;
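The difference between the two queries only shows up when a row has NULL in both columns, because NULL = NULL is not true in SQL. A minimal sketch with Python's sqlite3 module; the (NULL, NULL) row is an assumption added to the question's sample data so that the difference is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", "A"), ("B", "B"), ("A", "C"), ("B", "D"), ("C", "C"),
     (None, None)],  # hypothetical extra row, not in the question
)

# Plain equality: the (NULL, NULL) row is filtered out.
equal_only = set(conn.execute(
    "SELECT DISTINCT * FROM t WHERE col1 = col2"
))

# coalesce(col1, col2) is NULL only when both columns are NULL.
with_nulls = set(conn.execute(
    "SELECT DISTINCT * FROM t WHERE coalesce(col1, col2) IS NULL OR col1 = col2"
))

print(sorted(equal_only))
print(with_nulls - equal_only)
```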
The query is already written in your request:
select distinct records where all rows have the same value in Col1 and Col2
SELECT DISTINCT *
FROM tbl
WHERE Col1 = Col2
How do I structure my query so that I can count how many times each value in column 1 appears in column 2, and then store that result in a new column in the same table? (If a value is duplicated in the first column, I still want to store the same count in the new column.) For example, if I had a table like this:
COL1 COL2
1 2
1 4
2 1
3 1
4 1
4 2
The resulting table will look like this:
COL1 COL2 COL3
1 2 3
1 4 3
2 1 2
3 1 0
4 1 1
4 2 1
Any help is appreciated, as I am new to SQL! Thanks in advance!
SELECT
mytable.col1,
mytable.col2,  -- qualified: temp also exposes a col2 column
COALESCE(temp.col3, 0) AS col3
FROM
mytable
LEFT JOIN
( SELECT count(*) AS col3, col2
FROM mytable
GROUP BY col2) AS temp ON temp.col2 = mytable.col1
And if you want the update (thanks, Thorsten Kettner):
UPDATE mytable
LEFT JOIN ( Select count(*) as col3, col2
from mytable
GROUP BY col2) as temp ON temp.col2 = mytable.col1
SET mytable.col3 = COALESCE(temp.col3,0)
You can easily count on-the-fly. Don't store this redundantly. This would only cause problems later.
select
col1,
col2,
(
select count(*)
from mytable m
where m.col2 = mytable.col1
) as col3
from mytable;
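The on-the-fly count can be checked against the expected table from the question. A minimal sketch with Python's sqlite3 module (SQLite as a stand-in, integer columns assumed; the inner alias m and the ORDER BY are choices made for this illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 2), (1, 4), (2, 1), (3, 1), (4, 1), (4, 2)],
)

# For every row, count how often its col1 value occurs anywhere in col2.
counted = conn.execute(
    """
    SELECT col1,
           col2,
           (SELECT count(*)
            FROM mytable m
            WHERE m.col2 = mytable.col1) AS col3
    FROM mytable
    ORDER BY col1, col2
    """
).fetchall()

for row in counted:
    print(row)
```

The value 3 never appears in col2, so its row gets a count of 0 without needing COALESCE, since count(*) over an empty set is 0.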
If you think you must store it, here is the corresponding UPDATE statement:
update mytable
set col3 =
(
select count(*)
from mytable m
where m.col2 = mytable.col1
);
To do that, you can try:
SELECT COL1, COL2, (SELECT COUNT(*) FROM `tablename` AS t2
WHERE t2.COL2 = t1.COL1) AS COL3 FROM `tablename` AS t1
Enjoy :)
I am trying to run a SQL query that will not show distinct/duplicate values.
For example, using the DISTINCT option displays only one unique result, but I would like to skip all detected distinct values, i.e. not display them at all.
Is it possible?
select col1 d from tb_col where col1 = '123';
col1
------
123
123
(2 rows)
select distinct col1 d from tb_col where col1 = '123';
col1
------
123
(1 row)
SELECT col1
FROM tb_col
GROUP BY col1
HAVING count(*) = 1
Not showing duplicates at all:
SELECT col1 AS d
FROM tb_col
GROUP BY col1
HAVING COUNT(*) = 1 --- or perhaps HAVING COUNT(*) > 1
--- it's not clear what you want.
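Since the question can be read either way, here is a minimal sketch of both HAVING variants using Python's sqlite3 module. The value '456' is a hypothetical extra row added so that the two results differ; only '123' comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_col (col1 TEXT)")
conn.executemany(
    "INSERT INTO tb_col VALUES (?)",
    [("123",), ("123",), ("456",)],  # '456' is an assumed extra value
)

# Values that occur exactly once (anything duplicated is hidden entirely).
appears_once = conn.execute(
    "SELECT col1 FROM tb_col GROUP BY col1 HAVING count(*) = 1"
).fetchall()

# Values that occur more than once (only the duplicated ones are shown).
appears_repeatedly = conn.execute(
    "SELECT col1 FROM tb_col GROUP BY col1 HAVING count(*) > 1"
).fetchall()

print(appears_once, appears_repeatedly)
```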
select col1
from tb_col
group by col1
having count(*) < 2
Try with DISTINCT; it will work:
SELECT DISTINCT(col1) as d from tb_col where col1 = '123';