I have this table, here is my db Fiddle
CREATE TABLE table1 (
`ID` VARCHAR(100),
`Val` VARCHAR(100),
`Val2` VARCHAR(100),
`Val3` VARCHAR(100)
);
INSERT INTO table1
(`ID`, `Val`, `Val2`, `Val3`)
VALUES
('1','100','200','90'),
('2','100','200','10'),
('3','100','200','20'),
('4','20','100','55'),
('5','20','100','10'),
('6','112','100','20'),
('7','112','100','20'),
('8','90','200','90'),
('9','30','90','180'),
('10','30','90','29');
I want the rows that meet all of these conditions:
Val must be duplicated, AND
Val2 must be duplicated, AND
once I have the duplicated rows, I need to check Val3 within each duplicate group: a row qualifies only if its Val3 value is unique within that group.
I tried this query:
SELECT
t1.*
FROM
table1 t1
WHERE
EXISTS (
SELECT
1
FROM
table1
WHERE
ID <> t1.ID
AND Val = t1.Val
AND Val2 = t1.Val2
)
AND NOT EXISTS (
SELECT
1
FROM
table1
WHERE
Val = t1.Val
AND Val2 = t1.Val2
AND Val3 IN (
SELECT Val3
FROM table1
GROUP BY Val3
HAVING count( * ) > 1
)
)
I expect the result to be like this:
ID Val Val2 Val3
1 100 200 90
2 100 200 10
3 100 200 20
4 20 100 55
5 20 100 10
9 30 90 180
10 30 90 29
But I got this result instead:
ID Val Val2 Val3
9 30 90 180
10 30 90 29
Sample 2
INSERT INTO table1
(`ID`, `Val`, `Val2`, `Val3`)
VALUES
('1','100','200','90'),
('2','100','200','10'),
('3','100','200','20'),
('19','100','200','20'),
('4','20','100','55'),
('5','20','100','10'),
('6','112','100','20'),
('7','112','100','20'),
('8','90','200','90'),
('9','30','90','180'),
('10','30','90','29');
Expected result 2
ID Val Val2 Val3
1 100 200 90
2 100 200 10
4 20 100 55
5 20 100 10
9 30 90 180
10 30 90 29
dbfiddle 2
Sample 3
INSERT INTO table1
(`ID`, `Val`, `Val2`, `Val3`)
VALUES
('1','100','200','aa'),
('2','100','200','aa'),
('3','100','200','aa'),
('19','100','200','ab'),
('4','20','100','SD2'),
('5','20','100','SD1'),
('6','112','100','aa'),
('7','112','100','ab'),
('8','90','200','aa'),
('9','30','90','SF2'),
('10','30','90','SF1');
Expected result 3
ID Val Val2 Val3
4 20 100 SD2
5 20 100 SD1
6 112 100 aa
7 112 100 ab
9 30 90 SF2
10 30 90 SF1
Some people might be confused by sample 3, so here is a note on it:
In sample 3, ID 19 has the same Val and Val2 values (100 and 200) as IDs 1, 2, and 3. However, IDs 1, 2, and 3 all share the value aa in Val3, so they must be excluded: they fail the last condition, that (Val, Val2, Val3) be unique. ID 19 passes that condition on its own, but once IDs 1, 2, and 3 are excluded it no longer has any duplicate for Val and Val2, so it is excluded as well. If sample 3 contained another row such as '200','100','200','ae', ID 19 would be included in the result, because it would then have a duplicate other than IDs 1, 2, and 3.
So ID 19 would be included if the data in table1 looked like this:
Sample 3 (different case)
INSERT INTO table1
(`ID`, `Val`, `Val2`, `Val3`)
VALUES
('1','100','200','aa'),
('2','100','200','aa'),
('3','100','200','aa'),
('19','100','200','ab'),
('200','100','200','ae'),
('4','20','100','SD2'),
('5','20','100','SD1'),
('6','112','100','aa'),
('7','112','100','ab'),
('8','90','200','aa'),
('9','30','90','SF2'),
('10','30','90','SF1');
The expected result will be like this
ID Val Val2 Val3
4 20 100 SD2
5 20 100 SD1
19 100 200 ab
200 100 200 ae
6 112 100 aa
7 112 100 ab
9 30 90 SF2
10 30 90 SF1
Join the table to the queries that apply your conditions:
select distinct t.*
from (
select val, val2
from table1
group by val, val2
having count(*) > 1
) t1
inner join (
select val, val2, val3
from table1
group by val, val2, val3
having count(*) = 1
) t2
on t2.val = t1.val and t2.val2 = t1.val2
inner join (
select val, val2, val3
from table1
group by val, val2, val3
having count(*) = 1
) t3
on t3.val = t1.val and t3.val2 = t1.val2 and t3.val3 <> t2.val3
inner join table1 t on t2.val = t.val and t2.val2 = t.val2 and t.val3 in (t2.val3, t3.val3)
See demo1, demo2, demo3, demo4.
As I understand your question, you want rows whose (val, val2) tuple is not unique, and whose (val, val2, val3) is unique.
Here is one way to express this by filtering the dataset with correlated subqueries:
select t1.*
from table1 t1
where
(
select count(*)
from table1 t2
where t2.val = t1.val and t2.val2 = t1.val2
) > 1
and (
select count(*)
from table1 t2
where t2.val = t1.val and t2.val2 = t1.val2 and t2.val3 = t1.val3
) = 1
order by id
For performance, consider an index on (val, val2, val3) (the ordering of columns in the index matters here).
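As a sketch, such an index could be created as follows, using the column names from the question's table1. The index name idx_val_val2_val3 is made up for illustration.

```sql
-- Composite index serving both correlated subqueries:
-- the first filters on (val, val2), a leftmost prefix of the index;
-- the second filters on all three columns.
create index idx_val_val2_val3 on table1 (val, val2, val3);
```

With val first, the (val, val2) lookup can use the index prefix, which is why the column order matters.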
If you are lucky enough to be running MySQL 8.0, this can be phrased more simply and more efficiently using window functions:
select id, val, val2, val3
from (
select
t1.*,
count(*) over(partition by val, val2) cnt_1,
count(*) over(partition by val, val2, val3) cnt_2
from table1 t1
) t
where cnt_1 > 1 and cnt_2 = 1
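Note that on sample 3 these two counts keep ID 19, because cnt_1 also counts IDs 1, 2, and 3, which the question wants discarded first. If that matters, one possible variation (a sketch, again assuming MySQL 8.0; the aliases cnt_triple, cnt_pair, u, and v are mine) is to filter on the triple first and count the pair only among the surviving rows:

```sql
select id, val, val2, val3
from (
    -- Step 2: among the surviving rows, count how many share (val, val2).
    select u.*,
           count(*) over (partition by val, val2) as cnt_pair
    from (
        -- Step 1: count how often each (val, val2, val3) triple occurs.
        select t1.*,
               count(*) over (partition by val, val2, val3) as cnt_triple
        from table1 t1
    ) u
    where cnt_triple = 1   -- keep only rows whose triple is unique
) v
where cnt_pair > 1;        -- keep only rows that still have a (val, val2) duplicate
```

On samples 1 to 3 and the "different case" sample from the question, this returns the expected IDs.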
As @GMB put it more simply in his answer, you want rows whose (val, val2) tuple is not unique, and whose (val, val2, val3) is unique.
The following query should accomplish that:
select t.*
from table1 t
inner join
(
select t1.val, t1.val2
from table1 t1
inner join
(select val,val2,val3
from table1
group by val,val2,val3
having count(val3) = 1
) t2
on t1.val = t2.val and t1.val2 = t2.val2 and t1.val3 = t2.val3
group by t1.val, t1.val2
having count(distinct t1.id) > 1
) tmp
on tmp.val = t.val and tmp.val2 = t.val2
inner join
(select val,val2,val3
from table1
group by val,val2,val3
having count(val3) = 1
) t3
on t.val = t3.val and t.val2 = t3.val2 and t.val3 = t3.val3
Please find the fiddle link for Sample1, Sample2, Sample3 and Sample4.
Given a (MySQL) audit table containing similar rows, is it possible to view only the columns whose values differ?
For example, a table containing four columns where column key is primary key, and column id is the identifier to match rows:
key id col1 col2
1 123 B C
2 123 A C
3 456 B C
4 789 B A
5 789 B B
6 987 A C
In the example above I need the query to return only rows 1, 2, 4, and 5, as they have matching id values and differing values in col1 or col2 (B, A in col1 and A, B in col2):
key id  col1 col2
1   123 B
2   123 A
4   789      A
5   789      B
I know it might not be a very efficient solution, but it gives what you want. Try this:
SELECT A.ID, (CASE A.col1 WHEN B.col1 THEN NULL ELSE B.col1 END), (CASE A.col2 WHEN B.col2 THEN NULL ELSE B.col2 END) FROM tblName A
INNER JOIN tblName B
ON
A.ID=B.ID
WHERE
(A.col1=B.col1 AND A.col2<>B.col2)
OR
(A.col2=B.col2 AND A.col1<>B.col1)
MySQL has no FULL OUTER JOIN, so this uses INNER JOIN, which gives the same result here.
This is a bit contrived, in the sense that adding more rows will give very different results - but anyway...
SELECT x.my_key
, x.id
, IF(y.col1=x.col1,'',x.col1) col1
, IF(y.col2=x.col2,'',x.col2) col2
FROM my_table x
JOIN my_table y
ON y.id = x.id
AND y.my_key <> x.my_key
WHERE (y.col1 <> x.col1 OR y.col2 <> x.col2)
ORDER
BY my_key;
Thanks for all the responses, which guided me.
Using your suggestions, I wrote the SQL like this:
SELECT
T1.KEY,
T1.ID,
CASE T2.COL1_DISTINCT_VALUES WHEN 1 THEN NULL ELSE T1.COL1 END AS COL1,
CASE T2.COL2_DISTINCT_VALUES WHEN 1 THEN NULL ELSE T1.COL2 END AS COL2
FROM
TAB1 T1
INNER JOIN
(
SELECT
ID,
COUNT(DISTINCT COL1) AS COL1_DISTINCT_VALUES,
COUNT(DISTINCT COL2) AS COL2_DISTINCT_VALUES
FROM
TAB1
GROUP BY
ID
) T2
ON T1.ID=T2.ID
WHERE
T2.COL1_DISTINCT_VALUES > 1
OR T2.COL2_DISTINCT_VALUES > 1
ORDER BY
T1.`KEY`, T1.ID;
I have a relational DB and can't work out how to form this query.
Here's the info:
Table1
id name
1 Mike
Table2
id table_1_id value setting
1 1 something setting1
2 1 something2 setting2
3 1 something3 setting3
Currently, this is my sql query
SELECT * FROM Table1
JOIN Table2 on Table2.table_1_id = Table1.id
What this outputs is something like this
id name table_1_id value setting
1 Mike 1 something1 setting1
1 Mike 1 something2 setting2
1 Mike 1 something3 setting3
Is it possible to construct this in such a way to return these results so I can export it to a CSV file?
id name table_1_id something1 something2 something3
1 Mike 1 setting1 setting2 setting3
SELECT
Table1.*,
something1Table.setting AS something1,
something2Table.setting AS something2,
something3Table.setting AS something3
FROM Table1
JOIN Table2 AS something1Table ON something1Table.table_1_id = Table1.id AND something1Table.value = 'something'
JOIN Table2 AS something2Table ON something2Table.table_1_id = Table1.id AND something2Table.value = 'something2'
JOIN Table2 AS something3Table ON something3Table.table_1_id = Table1.id AND something3Table.value = 'something3'
You need a conditional aggregation:
select table1.id, table1.name,
max(case when value = 'something1' then setting end) as setting1,
max(case when value = 'something2' then setting end) as setting2,
max(case when value = 'something3' then setting end) as setting3
from table1 join
table2
on table1.id = table2.table_1_id
group by table1.id, table1.name
This type of data transformation is known as a pivot, but MySQL does not have a pivot function. So you will want to replicate it using an aggregate function with a CASE expression.
If you know the number of values ahead of time, then you can hard-code your query similar to this:
select t1.id,
t1.name,
max(case when t2.value = 'something' then t2.setting end) as setting1,
max(case when t2.value = 'something2' then t2.setting end) as setting2,
max(case when t2.value = 'something3' then t2.setting end) as setting3
from table1 t1
left join table2 t2
on t1.id = t2.table_1_id
group by t1.id, t1.name;
See SQL Fiddle with Demo
But if you have an unknown number of values that you want to transform into columns, then you can use a prepared statement to generate dynamic sql.
The query would be similar to this:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'max(case when t2.value = ''',
value,
''' then t2.setting end) AS `',
value, '`'
)
) INTO @sql
FROM table2;
SET @sql = CONCAT('SELECT t1.id,
t1.name, ', @sql, '
FROM table1 t1
left join table2 t2
on t1.id = t2.table_1_id
group by t1.id, t1.name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
See SQL Fiddle with Demo
The result of both versions is:
| ID | NAME | SOMETHING | SOMETHING2 | SOMETHING3 |
---------------------------------------------------
| 1 | Mike | setting1 | setting2 | setting3 |
GROUP_CONCAT may be of use. It doesn't give you exactly what you want, because it would put the concatenated values into a single field. But depending on what you're actually trying to accomplish, perhaps you can work around that. The advantage of the GROUP_CONCAT is that it can handle any number of table2 rows per table1 row, whereas the conditional aggregation above hardwires having three entries (which may well be what you want).
SELECT table1.*,
GROUP_CONCAT(value) AS value_group,
GROUP_CONCAT(setting) AS setting_group
FROM table1
INNER JOIN table2
ON table2.table_1_id = table1.id
GROUP BY table1.id, table1.name
returns
id,name,value_group,setting_group
1,Mike,"something1,something2,something3","setting1,setting2,setting3"
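One caveat: without an ORDER BY inside GROUP_CONCAT, the order of the concatenated values is not guaranteed, so the two lists may not line up position by position. A sketch that orders both lists by table2.id (assuming that column reflects the order you want) and groups explicitly:

```sql
SELECT table1.id,
       table1.name,
       -- Same ORDER BY in both calls keeps value N paired with setting N.
       GROUP_CONCAT(table2.value   ORDER BY table2.id) AS value_group,
       GROUP_CONCAT(table2.setting ORDER BY table2.id) AS setting_group
FROM table1
INNER JOIN table2
        ON table2.table_1_id = table1.id
GROUP BY table1.id, table1.name;
```

Also keep in mind that the result is truncated at group_concat_max_len, which defaults to 1024 bytes, so very long lists may need that variable raised.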
I have a denormalized table where, for each row, I have to count how many of the row's columns hold the same value as a given column.
I'm using the InfiniDB MySQL storage engine.
This is my table:
col1 | col2 | col3
------------------
A | B | B
A | B | C
A | A | A
This is what I expect:
col1Values | col2Values | col3Values
------------------------------------
1 | 2 | 2 -- Because B is in Col2 and Col3
1 | 1 | 1
3 | 3 | 3
Is there something like
-- function count_values(needle, haystack1, ...haystackN)
select count_values(col1, col1, col2, col3) as col1values -- col1 is needle
, count_values(col2, col1, col2, col3) as col2values -- col2 is needle
, count_values(col3, col1, col2, col3) as col3values -- col3 is needle
from table
or am I missing something simple that will do the trick? :-)
Thanks in advance
Roman
select
CASE WHEN col1 = col2 and col1=col3 THEN '3'
WHEN col1 = col2 or col1=col3 THEN '2'
WHEN col1 != col2 and col1!=col3 THEN '1'
ELSE '0' END AS col1_values,
CASE WHEN col2 = col1 and col2=col3 THEN '3'
WHEN col2 = col1 or col2=col3 THEN '2'
WHEN col2 != col1 and col2!=col3 THEN '1'
ELSE '0' END AS col2_values,
CASE WHEN col3 = col1 and col3=col2 THEN '3'
WHEN col3 = col1 or col3=col2 THEN '2'
WHEN col3 != col1 and col3!=col2 THEN '1'
ELSE '0' END AS col3_values
FROM table_name
fiddle demo
Assuming the table has got a key, you could:
Unpivot the table.
Join the unpivoted dataset back to the original.
For every column in the original, count matches against the unpivoted column.
Here's how the above could be implemented:
SELECT
COUNT(t.col1 = s.col OR NULL) AS col1Values,
COUNT(t.col2 = s.col OR NULL) AS col2Values,
COUNT(t.col3 = s.col OR NULL) AS col3Values
FROM atable t
INNER JOIN (
SELECT
t.id,
CASE colind
WHEN 1 THEN t.col1
WHEN 2 THEN t.col2
WHEN 3 THEN t.col3
END AS col
FROM atable t
CROSS JOIN (SELECT 1 AS colind UNION ALL SELECT 2 UNION ALL SELECT 3) x
) s ON t.id = s.id
GROUP BY t.id
;
The subquery uses a cross join to unpivot the table. The id column is a key column. The OR NULL bit is explained in this answer.
I have found a different, very very simple solution :-)
select if(col1=col1,1,0) + if(col2=col1,1,0) + if(col3=col1,1,0) as col1values -- col1 is needle
from table
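Extending the same idea to all three columns could look like the sketch below (atable is a placeholder table name, as in the unpivot answer). Each IF contributes 1 when a column equals the needle, and a comparison involving NULL counts as 0:

```sql
select
    if(col1 = col1, 1, 0) + if(col2 = col1, 1, 0) + if(col3 = col1, 1, 0) as col1Values, -- col1 is needle
    if(col1 = col2, 1, 0) + if(col2 = col2, 1, 0) + if(col3 = col2, 1, 0) as col2Values, -- col2 is needle
    if(col1 = col3, 1, 0) + if(col2 = col3, 1, 0) + if(col3 = col3, 1, 0) as col3Values  -- col3 is needle
from atable;
```

For the three sample rows in the question this yields 1, 2, 2, then 1, 1, 1, then 3, 3, 3, matching the expected table.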