MySQL: How to delete all duplicates that violate a UNIQUE constraint

I want to add a UNIQUE index to a table, like this:
ALTER TABLE `mytable` ADD UNIQUE `myunique_name`(`first`, `second`, `third`);
MySQL responds with:
Duplicate entry '1-2-3' for key 'myunique_name'
I know for sure that this combination is just one out of thousands that violate the constraint.
In this special case I know for sure that all the rows that contain the same values in the three specified columns also contain the same data in the other relevant fields (the primary index differs of course, but is irrelevant), therefore all the duplicates can be deleted.
Is there a way to delete all duplicate entries but keep one (it doesn't matter which primary key is kept) so that the unique index can be added?

-- sample data: ids 5 and 6 duplicate ids 1 and 2
CREATE TEMPORARY TABLE IF NOT EXISTS MyTable engine=memory
select 1 as id, 1 col1,1 col2,1 col3
union all
select 2 as id, 2 col1,2 col2,2 col3
union all
select 3 as id, 3 col1,3 col2,3 col3
union all
select 4 as id, 4 col1,4 col2,4 col3
union all
select 5 as id, 1 col1,1 col2,1 col3
union all
select 6 as id, 2 col1,2 col2,2 col3;
-- collect every (col1, col2, col3) combination that occurs more than once
CREATE TEMPORARY TABLE IF NOT EXISTS MyDuplicateTableWithCount engine=memory
select col1 , col2 , col3, count(*) Count_1
from MyTable
group by col1 , col2 , col3
having count(*)>1;
-- list all rows that belong to a duplicated combination
select a.* from MyTable a
inner join
(select col1 , col2 , col3
from MyDuplicateTableWithCount
) b
on a.col1 =b.col1 and a.col2 =b.col2 and a.col3 =b.col3
order by a.id;
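After getting the duplicate ids, write a DELETE query that lists the ids you want to remove (here the higher id of each duplicated combination):
delete from MyTable where id in (5,6);
Alternatively, keep the lowest id of each combination and delete everything else, using MyTable from above: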
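-- keep list: the lowest id of every (Col1, Col2, Col3) combination
CREATE TEMPORARY TABLE IF NOT EXISTS MyTable2 engine=memory
SELECT MIN(id) as id, Col1, Col2, Col3
FROM MyTable
GROUP BY Col1, Col2, Col3;
-- delete every row whose id is not in the keep list
DELETE a FROM MyTable as a
LEFT JOIN (
SELECT * from MyTable2
) as b ON
b.id = a.id
WHERE
b.id IS NULL;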
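The keep-the-lowest-id approach above can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the table and column names are taken from the sample data in this answer, and the helper name dedupe_and_add_unique is made up for illustration.

```python
import sqlite3


def dedupe_and_add_unique(conn):
    """Keep the lowest id per (col1, col2, col3), delete the rest, add the index."""
    conn.execute(
        """
        DELETE FROM mytable
        WHERE id NOT IN (
            SELECT MIN(id) FROM mytable GROUP BY col1, col2, col3
        )
        """
    )
    # With the duplicates gone, the unique index can be created.
    conn.execute(
        "CREATE UNIQUE INDEX myunique_name ON mytable (col1, col2, col3)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 INT, col2 INT, col3 INT)"
)
conn.executemany(
    "INSERT INTO mytable (id, col1, col2, col3) VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 2, 2, 2), (3, 3, 3, 3),
     (4, 4, 4, 4), (5, 1, 1, 1), (6, 2, 2, 2)],
)

dedupe_and_add_unique(conn)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM mytable ORDER BY id")]
print(remaining_ids)  # [1, 2, 3, 4]
```

In MySQL itself, a DELETE whose subquery reads the table being deleted from is rejected with error 1093 unless the subquery is wrapped in a derived table, which is one reason the answer above goes through a separate keep-list table.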

Related

Delete from database where one part of a two part primary key has duplicates

This question stumps me. I have a database with a table that has a primary key that consists of two fields. In the end I require that the primary key only be one field, but I need to delete the duplicate entries from the table.
In other words the table has:
PRIMARY KEY (`field1`, `field2`)
There are entries that have duplicate field1 and different field2. So I have entries like this:
field1 | field2
1 | 1
1 | 2
2 | 1
2 | 2
3 | 1
4 | 1
I want to delete 1 of each of those entries that have duplicates on field1.
How can I do this with MySQL / SQL?
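I think this will work in your case:
DELETE t1 FROM `table` t1
INNER JOIN `table` t2
ON t1.field1 = t2.field1
AND t1.field2 > t2.field2;
This query joins the table to itself, picks the rows that share a field1 value with another row that has a lower field2, and removes them. The row with the lowest field2 for each field1 is kept. (The table in the question has no id column, so field2 is used to tell the rows apart.)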
Hope this works!!
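To check the idea on the sample rows from the question, here is a minimal sketch using Python's sqlite3 module. SQLite has no multi-table DELETE, so the self-join is written as a correlated EXISTS instead; the table name pairs is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pairs (field1 INT, field2 INT, PRIMARY KEY (field1, field2))"
)
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (4, 1)],
)

# Delete a row when another row has the same field1 and a lower field2.
# The row with the lowest field2 in each group never matches, so it survives.
conn.execute(
    """
    DELETE FROM pairs
    WHERE EXISTS (
        SELECT 1 FROM pairs AS other
        WHERE other.field1 = pairs.field1
          AND other.field2 < pairs.field2
    )
    """
)

rows_left = conn.execute(
    "SELECT field1, field2 FROM pairs ORDER BY field1"
).fetchall()
print(rows_left)  # [(1, 1), (2, 1), (3, 1), (4, 1)]
```

After this, field1 is unique on its own, so it could serve as a single-column primary key.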
I don't know exactly how the DELETE needs to be written in MySQL syntax, but essentially you are trying to remove the second entry for each value of field1 that appears twice. If you can retrieve those records with a SELECT, you can feed that SELECT into the DELETE.
For instance, here is a query that selects the second row for each value of field1 that is repeated:
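-- window functions require MySQL 8.0 or later
select field1, field2
from
(
select *, count(*) over (partition by field1) as ct
, rank() over (partition by field1 order by field2 desc) as rn
from temp
) ranked -- MySQL requires an alias on a derived table
where rn = 1 and ct = 2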
In your case it would return below records
field1 field2
1 2
2 2
So then all you need to do is wrap a DELETE around that SELECT statement.
NOTE: I wanted a solution without a join, which is why I rely on these two analytic functions.
For instance, this works in something like BigQuery:
delete from TABLE where concat(field1, field2) in
(
select concat(field1, field2)
from
(
select *, count(*) over (partition by field1) as ct
, rank() over (partition by field1 order by field2 desc) as rn
from TABLE
) where rn = 1 and ct = 2
)
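The window-function idea can be sketched in a runnable form with Python's sqlite3 module (this needs an SQLite build with window functions, 3.25 or later). Note that this sketch swaps in ROW_NUMBER() for the RANK() and COUNT() pair above, so it also handles groups with more than two rows; the table name pairs is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pairs (field1 INT, field2 INT, PRIMARY KEY (field1, field2))"
)
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (4, 1)],
)

# Number the rows inside each field1 group; rn = 1 is the row to keep.
extra_rows = conn.execute(
    """
    SELECT field1, field2
    FROM (
        SELECT field1, field2,
               ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field2) AS rn
        FROM pairs
    ) AS ranked
    WHERE rn > 1
    ORDER BY field1, field2
    """
).fetchall()
print(extra_rows)  # [(1, 2), (2, 2)]

# Delete exactly the rows that were selected, matching on the full key.
conn.executemany(
    "DELETE FROM pairs WHERE field1 = ? AND field2 = ?", extra_rows
)
remaining = conn.execute(
    "SELECT field1, field2 FROM pairs ORDER BY field1"
).fetchall()
print(remaining)  # [(1, 1), (2, 1), (3, 1), (4, 1)]
```

Matching on both key columns avoids the concat() trick, where (1, 12) and (11, 2) would produce the same string.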

Select last duplicate row in MySQL

I select duplicate rows from a MySQL table using:
SELECT * FROM `table` GROUP BY `col1`,`col2` HAVING COUNT(`col1`)>1 AND COUNT(`col2`)>1
Actual result: the above query returns the first duplicate entry. In my data, row 1 and row 7 contain the same values in col1 and col2.
Expected result: I need to get the last duplicate entry instead.
How do you define the last duplicate? In a database table, records are not inherently ordered, and you did not say which column we should use for ordering.
If you want to order by col3, then you can just use aggregation, like so:
select col1, col2, max(col3) -- or min(col3)
from mytable
group by col1, col2
-- having count(*) > 1
-- uncomment the above line if you want to see only records for which a duplicate exists
If you want to order by some other column, say id, then you can filter with a correlated subquery:
select col1, col2, col3
from mytable t
where id = (
select max(id) from mytable t1 where t1.col1 = t.col1 and t1.col2 = t.col2
)
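As a rough illustration of the correlated-subquery approach, here is a sketch with Python's sqlite3 module. The sample rows and the id column are invented for the example, and an extra COUNT(*) filter is added so that only groups that really contain duplicates are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "a", "x", "first"),
        (2, "b", "y", "only"),
        (3, "a", "x", "second"),
        (4, "c", "z", "c1"),
        (5, "a", "x", "third"),
        (6, "c", "z", "c2"),
    ],
)

# For each (col1, col2) group keep the row with the highest id,
# and only report groups that contain more than one row.
last_duplicates = conn.execute(
    """
    SELECT id, col1, col2, col3
    FROM mytable AS t
    WHERE id = (
        SELECT MAX(id) FROM mytable AS t1
        WHERE t1.col1 = t.col1 AND t1.col2 = t.col2
    )
    AND (
        SELECT COUNT(*) FROM mytable AS t2
        WHERE t2.col1 = t.col1 AND t2.col2 = t.col2
    ) > 1
    ORDER BY id
    """
).fetchall()
print(last_duplicates)  # [(5, 'a', 'x', 'third'), (6, 'c', 'z', 'c2')]
```

Dropping the COUNT(*) condition returns the last row of every group, including groups with a single row, which is what the query above does.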

MySQL Counting the number of occurrences of a value from a column in another column and storing in new column

How do I structure my query so that I can count how many times each value in column 1 appears in column 2, and then store that result in a new column in the same table? (If a value is duplicated in the first column, I still want to store the same count for each of its rows.) For example, if I had a table like this:
COL1 COL2
1 2
1 4
2 1
3 1
4 1
4 2
The resulting table will look like this:
COL1 COL2 COL3
1 2 3
1 4 3
2 1 2
3 1 0
4 1 1
4 2 1
Any help is appreciated, I am new to SQL! Thanks in advance!
Select
col1,
col2,
COALESCE(col3,0) as col3
FROM
mytable
LEFT JOIN
( Select count(*) as col3, col2
from mytable
GROUP BY col2) as temp ON temp.col2 = mytable.col1
And if you want the update (thanks Thorsten Kettner):
UPDATE mytable
LEFT JOIN ( Select count(*) as col3, col2
from mytable
GROUP BY col2) as temp ON temp.col2 = mytable.col1
SET mytable.col3 = COALESCE(temp.col3,0)
You can easily count on the fly. Don't store this redundantly; it would only cause problems later.
select
col1,
col2,
(
select count(*)
from mytable m
where m.col2 = mytable.col1
) as col3
from mytable;
If you think you must store it, here is the corresponding UPDATE statement:
-- note: MySQL may reject this with error 1093, since the subquery reads the table being updated
update mytable
set col3 =
(
select count(*)
from mytable m
where m.col2 = mytable.col1
);
To do that, you can try:
SELECT COL1, COL2, (SELECT COUNT(*) FROM `tablename` AS t2
WHERE t2.COL2 = t1.COL1) AS COL3 FROM `tablename` AS t1
Enjoy :)
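To check the counting logic against the table in the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. It computes the count on the fly with a correlated subquery and then stores it in col3; SQLite accepts this form of UPDATE, which may not hold for other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INT, col2 INT, col3 INT)")
conn.executemany(
    "INSERT INTO mytable (col1, col2) VALUES (?, ?)",
    [(1, 2), (1, 4), (2, 1), (3, 1), (4, 1), (4, 2)],
)

# Count, for every row, how often its col1 value occurs in col2.
counted = conn.execute(
    """
    SELECT col1, col2,
           (SELECT COUNT(*) FROM mytable AS m WHERE m.col2 = mytable.col1) AS col3
    FROM mytable
    ORDER BY col1, col2
    """
).fetchall()
print(counted)
# [(1, 2, 3), (1, 4, 3), (2, 1, 2), (3, 1, 0), (4, 1, 1), (4, 2, 1)]

# Store the same count in the col3 column.
conn.execute(
    """
    UPDATE mytable
    SET col3 = (SELECT COUNT(*) FROM mytable AS m WHERE m.col2 = mytable.col1)
    """
)
stored = conn.execute(
    "SELECT col1, col2, col3 FROM mytable ORDER BY col1, col2"
).fetchall()
```

The stored values match the expected table from the question, but they go stale as soon as rows are inserted or deleted, which is the redundancy problem mentioned above.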

How to get SUM of certain column without losing all rows?

CREATE TABLE tmp ( col1 int, col2 int );
INSERT INTO tmp VALUES (1,3), (2,5), (3,7);
SELECT col1, col2, SUM(col2) AS Total FROM tmp; -- ???
The SELECT statement leaves me with this data set:
col1 col2 Total
1 3 15
Is there a way to allow all the rows to appear without introducing a subquery, so that the result is this:
col1 col2 Total
1 3 15
2 5 15
3 7 15
You can use a cross join to avoid a subquery:
SELECT t1.col1, t1.col2, sum(t2.col2) sum_col2
from tmp t1
cross join tmp t2
group by 1, 2
Note that this only works if combinations of col1 and col2 are unique.
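The uniqueness caveat is easy to see in a quick experiment. The sketch below uses Python's sqlite3 module and a made-up helper, rows_with_total, to run the cross join once on the rows from the question and once with a duplicated row.

```python
import sqlite3


def rows_with_total(rows):
    """Run the cross-join query on a fresh in-memory tmp table holding rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tmp (col1 INT, col2 INT)")
    conn.executemany("INSERT INTO tmp VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT t1.col1, t1.col2, SUM(t2.col2) AS total
        FROM tmp AS t1
        CROSS JOIN tmp AS t2
        GROUP BY t1.col1, t1.col2
        ORDER BY t1.col1
        """
    ).fetchall()
    conn.close()
    return result


unique_rows = rows_with_total([(1, 3), (2, 5), (3, 7)])
print(unique_rows)      # [(1, 3, 15), (2, 5, 15), (3, 7, 15)]

# With (1, 3) present twice, the two copies collapse into one group
# and their totals are added together: 36 instead of 18.
duplicated_rows = rows_with_total([(1, 3), (2, 5), (3, 7), (1, 3)])
print(duplicated_rows)  # [(1, 3, 36), (2, 5, 18), (3, 7, 18)]
```

If the table has a unique key, grouping by that key instead of by col1 and col2 would avoid the collapse.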

Trying to get the DISTINCT values of 3 columns in sqlite db

So I have a table with 3 columns:
Col1 Col2 Col3
a b c
b c null
a null b
c d a
And my desired output will be:
a,b,c,d,null
I am hoping to have the output in a single string if possible.
I have tried:
SELECT DISTINCT col1, col2, col3 FROM table
and didn't get the desired results. Any ideas?
A single-string solution:
SELECT GROUP_CONCAT(COALESCE(c, 'NULL'), ',')
FROM (
SELECT col1 c
FROM mytable
UNION
SELECT col2 c
FROM mytable
UNION
SELECT col3 c
FROM mytable
) q
SELECT Col1
FROM table
UNION
SELECT Col2
FROM table
UNION
SELECT Col3
FROM table
Does this work in SQLite?
select col1 from table
union
select col2 from table
union
select col3 from table
Or, to eliminate nulls:
select col1 from table where col1 is not null
union
select col2 from table where col2 is not null
union
select col3 from table where col3 is not null
Note: I don't think this would be fast to execute, but I know that in MSSQL a UNION performs a DISTINCT on the results.
If you are using MySQL, you could use this solution:
select group_concat(coalesce(c,'null') order by c is null, c)
from (
select col1 c from tbl
union
select col2 c from tbl
union
select col3 c from tbl
) u
The UNION query selects all values and removes all duplicates. The result is then returned as a single string, ordered by value with the null value at the end, and with null converted to 'null' (since group_concat would otherwise ignore null values).
If you are using SQLite, older versions do not support ORDER BY inside Group_Concat, and you could use this instead:
select group_concat(coalesce(c,'null'))
from (
select col1 c, col1 is null o from mytable
union
select col2 c, col2 is null o from mytable
union
select col3 c, col3 is null o from mytable
order by o, c
) u
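Since SQLite does not promise the order in which group_concat sees its rows, one alternative is to let the UNION remove the duplicates and do the ordering in the calling code. A minimal sketch with Python's sqlite3 module, using the table from the question (the variable names are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("a", "b", "c"), ("b", "c", None), ("a", None, "b"), ("c", "d", "a")],
)

# UNION (without ALL) removes duplicates across the three columns.
values = [
    row[0]
    for row in conn.execute(
        """
        SELECT col1 FROM mytable
        UNION
        SELECT col2 FROM mytable
        UNION
        SELECT col3 FROM mytable
        """
    )
]

# Sort in Python: real values first in alphabetical order, NULL last.
ordered = sorted(values, key=lambda v: (v is None, v or ""))
result = ",".join("null" if v is None else v for v in ordered)
print(result)  # a,b,c,d,null
```

This trades a single SQL statement for a deterministic order that does not depend on how the engine evaluates the subquery.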