MySQL - Delete duplicate rows without a temp table

I have this table email_addr_bean_rel with these fields: id, email_address_id, bean_id, bean_module, primary_address, reply_to_address, date_created, date_modified, deleted
Of these, only the bean_id column contains duplicates; each duplicated value appears twice.
I have tried this, but it doesn't work:
CREATE TABLE email_addr_bean_rel_V AS SELECT DISTINCT * FROM email_addr_bean_rel;
DROP TABLE email_addr_bean_rel;
RENAME TABLE email_addr_bean_rel_V TO email_addr_bean_rel;
It still contains the same number of records.
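A likely reason the attempt changes nothing: SELECT DISTINCT * compares every column, and rows that share a bean_id still differ in id, so no two rows are identical. A minimal sketch of a delete without any extra table, assuming id is unique per row and that rows sharing a bean_id really are duplicates that can be collapsed to one:

```sql
-- Preview first: rows that have a sibling with the same bean_id and a lower id.
SELECT DISTINCT r1.*
FROM email_addr_bean_rel AS r1
INNER JOIN email_addr_bean_rel AS r2
  ON  r1.bean_id = r2.bean_id
  AND r1.id > r2.id;

-- Then delete those rows, keeping the row with the lowest id per bean_id.
DELETE r1
FROM email_addr_bean_rel AS r1
INNER JOIN email_addr_bean_rel AS r2
  ON  r1.bean_id = r2.bean_id
  AND r1.id > r2.id;
```

Taking a backup before running the DELETE is prudent, since the statement cannot be undone outside a transaction.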

Related

SQL insert avoiding duplicates from queried table + avoiding to insert duplicates in updated table

I wish to update specific columns of a table from certain columns from a second table avoiding duplicates in a unique column, with the extra twist that the second table itself has duplicates.
INSERT INTO cie (cik,name,sic,fye)
SELECT dumpsubq2.cik,dumpsubq2.name,dumpsubq2.sic,dumpsubq2.fye
FROM dumpsubq2
WHERE NOT EXISTS(SELECT cik
FROM cie
WHERE dumpsubq2.cik = cie.cik)
Now this would work fine if only there were no duplicates in the dumpsubq2 table. So the trick is to somehow include DISTINCT while querying the dumpsubq2 table, and I fail to achieve this in one pass :/
Do I have to do this in 2 steps?
Eliminate the duplicates in dumpsubq2 using a temp table
Run the above query (which currently stops with error #1062 - Duplicate entry '1606069' for key 'cie.cik' when it encounters a duplicate in dumpsubq2)
Thanks for your help :)
This query did it for me :)
INSERT INTO cie(cik,name,sic,fye)
SELECT * from ( select distinct cik,name,sic,fye FROM `dumpsubq2` WHERE form = '10-k') AS tmp
WHERE NOT EXISTS (
SELECT cik FROM cie WHERE tmp.cik = cie.cik
)
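One caveat: SELECT DISTINCT over cik, name, sic, fye still returns two rows for the same cik whenever any of the other three columns differ, which would raise error #1062 again. A possible variant, sketched under the assumption that any single value per cik is acceptable, groups by cik instead. MIN() is used here only to pick a deterministic value per column, so the resulting row may combine values from different source rows.

```sql
-- One output row per cik, skipping any cik already present in cie.
INSERT INTO cie (cik, name, sic, fye)
SELECT d.cik, MIN(d.name), MIN(d.sic), MIN(d.fye)
FROM dumpsubq2 AS d
WHERE d.form = '10-k'
  AND NOT EXISTS (SELECT 1 FROM cie WHERE cie.cik = d.cik)
GROUP BY d.cik;
```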

Delete duplicate rows on a huge list

I have a huge list of roads and the place of that road, like below:
StreetName,PlaceName,xcoord,ycoord
Ovayok Road,Cambridge Bay,-104.99656,69.12876
Ovayok Road,Cambridge Bay,-104.99693,69.12865
Ovayok Road,Cambridge Bay,-104.99794,69.12842
Ovayok Road,Cambridge Bay,-104.99823,69.12835
Hikok Drive,Kugluktuk,-115.09433,67.82674
Hikok Drive,Kugluktuk,-115.09570,67.82686
Hikok Drive,Kugluktuk,-115.09593,67.82689
Hikok Drive,Kugluktuk,-115.09630,67.82695
Sivulliq Avenue,Rankin Inlet,-92.08252,62.81265
Sivulliq Avenue,Rankin Inlet,-92.08276,62.81265
Sivulliq Avenue,Rankin Inlet,-92.08461,62.81262
How can I delete rows that have duplicate data in the first and second columns? All the numbers (coordinates) are different.
If you don't have any column, such as an ID, that uniquely identifies your rows,
then copy the unique records into a copy of the table and rename that temp table to the original table's name.
The queries for this are below:
CREATE TABLE street_details_temp LIKE street_details;
INSERT INTO street_details_temp SELECT DISTINCT * FROM street_details;
DROP TABLE street_details;
RENAME TABLE street_details_temp TO street_details;
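Because every coordinate pair in the sample data is different, SELECT DISTINCT * would treat each row as unique and keep them all. A sketch that keeps one row per StreetName/PlaceName pair instead, assuming MySQL 8.0 or later (for ROW_NUMBER()), column names taken from the CSV header, and a hypothetical table name street_details_dedup:

```sql
CREATE TABLE street_details_dedup LIKE street_details;

-- Number the rows within each (StreetName, PlaceName) group and keep only the first.
INSERT INTO street_details_dedup (StreetName, PlaceName, xcoord, ycoord)
SELECT StreetName, PlaceName, xcoord, ycoord
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (PARTITION BY StreetName, PlaceName
                            ORDER BY xcoord, ycoord) AS rn
  FROM street_details AS s
) AS ranked
WHERE rn = 1;

DROP TABLE street_details;
RENAME TABLE street_details_dedup TO street_details;
```

Which coordinate survives is decided by the ORDER BY inside the window; here it is simply the smallest xcoord, which is an arbitrary choice.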

Delete records meeting criteria across tables

Long time lurker, first post(er?)
I need to delete some duplicate entries in my database based on attributes spread across multiple tables. This is the way I've done it, but I'm sure there is a better way (I'm by no means a SQL expert!). Any pointers would be great. (When I went to do this a second time, it failed.)
I'm first getting all the rows from a table, sorting by date, then picking the newest entry (dup1). Then I'm matching that list to another table based on a value (dup2). Finally, I'm creating a list of rows, based on their ID, that appear in both tables (dup3). Then I want to delete from the main table where the ID is in the third temp table.
First, I created a temp table:
create temporary table dup1
as
select * from
(SELECT hex(media_id) as asset_id, folder_id, name, ingest_date
FROM media
order by ingest_date DESC) as dup
group by name having count(name)>1 and count(hex(folder_id))>1
Then created a second temp table:
create temporary table dup2
as
SELECT hex(asset_id) as asset_id, value FROM datavalues where name_id = 103 group by value having count(value)>1;
Created a final temp table that merges the two previous temp tables
create temporary table dup3
as
select dup1.asset_id from dup1
join dup2 on dup2.asset_id = dup1.asset_id
Then deleted all assets from the media table that exist in dup3
DELETE FROM media where hex(media_id) in (SELECT * from dup3);

Delete just one record of a duplicate

I am trying to delete e-mail duplicates from the table nlt_user.
This query correctly shows the records that have duplicates:
select [e-mail], count([e-mail])
from nlt_user
group by [e-mail]
having count([e-mail]) > 1
Now, how can I delete all but one record from each set of duplicates?
Thank you
If your MySQL version is earlier than 5.7.4, you can add a UNIQUE index on the e-mail column with the IGNORE keyword.
This will remove all the duplicate e-mail rows:
ALTER IGNORE TABLE nlt_user
ADD UNIQUE INDEX `idx_e-mail` (`e-mail`);
On 5.7.4 or later you can use a temporary table (IGNORE is no longer possible on ALTER):
CREATE TABLE nlt_user_new LIKE nlt_user;
ALTER TABLE nlt_user_new ADD UNIQUE INDEX (`e-mail`);
INSERT IGNORE INTO nlt_user_new SELECT * FROM nlt_user;
DROP TABLE nlt_user;
RENAME TABLE nlt_user_new TO nlt_user;
Try this:
delete n1 from nlt_user n1
inner join nlt_user n2 on n1.`e-mail`=n2.`e-mail` and n1.id>n2.id;
This keeps the record with the minimum id value in each set of duplicates and deletes the remaining duplicate records.
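Before running a self-join DELETE like this, it can help to preview exactly which rows it would remove. A minimal sketch, assuming the same id column the answer relies on and a column literally named e-mail (hence the backticks):

```sql
-- Rows that would be deleted: every row with a same-address sibling of lower id.
-- DISTINCT is needed because a row with several lower-id siblings matches more than once.
SELECT DISTINCT n1.*
FROM nlt_user AS n1
INNER JOIN nlt_user AS n2
  ON  n1.`e-mail` = n2.`e-mail`
  AND n1.id > n2.id
ORDER BY n1.`e-mail`, n1.id;
```

If the count of rows returned matches the number of surplus duplicates reported by the GROUP BY ... HAVING query, the DELETE should remove the same set.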
A ranking window function can be employed to retain only the unique values (MySQL 8.0 or later). ROW_NUMBER() is used rather than RANK(), because RANK() ordered by the partitioning column gives every row in a group rank 1.
1: Create a new table which contains only unique values
Example: nlt_user_unique
CREATE TABLE nlt_user_unique AS
SELECT * FROM
(SELECT A.*, ROW_NUMBER() OVER (PARTITION BY email ORDER BY email) AS RNK
FROM nlt_user A) AS ranked
WHERE RNK = 1;
2: Truncate the original table containing duplicates
TRUNCATE TABLE nlt_user;
3: Insert the unique rows from the table created in step 1 into your table nlt_user
INSERT INTO nlt_user (email)
SELECT email FROM nlt_user_unique;

How do I delete all the duplicate records in a MySQL table without temp tables

I've seen a number of variations on this but nothing quite matches what I'm trying to accomplish.
I have a table, TableA, which contains the answers given by users to configurable questionnaires. The columns are member_id, quiz_num, question_num, answer_num.
Somehow a few members got their answers submitted twice. So I need to remove the duplicated records, but make sure that one row is left behind.
There is no primary column so there could be two or three rows all with the exact same data.
Is there a query to remove all the duplicates?
Add Unique Index on your table:
ALTER IGNORE TABLE `TableA`
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);
Another way to do this would be:
Add a primary key to your table; then you can easily remove duplicates using the following query:
DELETE FROM member
WHERE id IN (SELECT *
FROM (SELECT id FROM member
GROUP BY member_id, quiz_num, question_num, answer_num HAVING (COUNT(*) > 1)
) AS A
);
Instead of dropping TableA, you could delete all rows (delete from TableA;) and then repopulate the original table with the rows from TableA_Verify (insert into TableA select * from TableA_Verify). This way you won't lose the objects attached to the original table (indexes, ...).
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;
DELETE FROM TableA;
INSERT INTO TableA SELECT * FROM TableA_Verify;
DROP TABLE TableA_Verify;
This doesn't use TEMP Tables, but real tables instead. If the problem is just about temp tables and not about table creation or dropping tables, this will work:
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;
DROP TABLE TableA;
RENAME TABLE TableA_Verify TO TableA;
Thanks to jveirasv for the answer above.
If you need to remove duplicates based on a specific set of columns, you can use this (for example, if the table has a timestamp column that varies between otherwise identical rows):
CREATE TABLE TableA_Verify AS SELECT * FROM TableA WHERE 1 GROUP BY [COLUMN TO remove duplicates BY];
DELETE FROM TableA;
INSERT INTO TableA SELECT * FROM TableA_Verify;
DROP TABLE TableA_Verify;
Add Unique Index on your table:
ALTER IGNORE TABLE TableA
ADD UNIQUE INDEX (member_id, quiz_num, question_num, answer_num);
This works very well.
If you are not using a primary key, execute the following queries in one go, replacing these values:
# table_name - Your Table Name
# column_name_of_duplicates - Name of column where duplicate entries are found
create table table_name_temp like table_name;
insert into table_name_temp select distinct(column_name_of_duplicates),value,type from table_name group by column_name_of_duplicates;
delete from table_name;
insert into table_name select * from table_name_temp;
drop table table_name_temp
Create a temporary table and store the distinct (non-duplicate) values
Empty the original table
Insert the values into the original table from the temp table
Delete the temp table
It is always advisable to take a backup of the database before you play with it.
As noted in the comments, the query in Saharsh Shah's answer must be run multiple times if items are duplicated more than once.
Here's a solution that doesn't delete any data, and keeps the data in the original table the entire time, allowing for duplicates to be deleted while keeping the table 'live':
alter table tableA add column duplicate tinyint(1) not null default '0';
update tableA set
duplicate=if(@member_id=member_id
and @quiz_num=quiz_num
and @question_num=question_num
and @answer_num=answer_num,1,0),
member_id=(@member_id:=member_id),
quiz_num=(@quiz_num:=quiz_num),
question_num=(@question_num:=question_num),
answer_num=(@answer_num:=answer_num)
order by member_id, quiz_num, question_num, answer_num;
delete from tableA where duplicate=1;
alter table tableA drop column duplicate;
This basically checks to see if the current row is the same as the last row, and if it is, marks it as duplicate (the order statement ensures that duplicates will show up next to each other). Then you delete the duplicate records. I remove the duplicate column at the end to bring it back to its original state.
It looks like alter table ignore also might go away soon: http://dev.mysql.com/worklog/task/?id=7395
An alternative way would be to create a new temporary table with the same structure.
CREATE TABLE temp_table AS SELECT * FROM original_table LIMIT 0
Then create the primary key in the table.
ALTER TABLE temp_table ADD PRIMARY KEY (primary-key-field)
Finally copy all records from the original table while ignoring the duplicate records.
INSERT IGNORE INTO temp_table SELECT * FROM original_table
Now you can delete the original table and rename the new table.
DROP TABLE original_table
RENAME TABLE temp_table TO original_table
Tested in MySQL 5. I don't know about other versions.
If you want to keep the row with the lowest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id > n2.id AND n1.member_id = n2.member_id AND n1.answer_num = n2.answer_num
If you want to keep the row with the highest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id < n2.id AND n1.member_id = n2.member_id AND n1.answer_num = n2.answer_num
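Note that these statements compare only member_id and answer_num, so two rows for different questions that happen to share an answer number would also match. A sketch that compares all four columns from the question, assuming TableA has no key yet and that a surrogate column (named id here purely for illustration) can be added temporarily:

```sql
-- Assumption: TableA has no key, so add a temporary surrogate id.
ALTER TABLE TableA
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

-- Delete every row that has a fully identical row with a lower id.
DELETE n1
FROM TableA AS n1
INNER JOIN TableA AS n2
  ON  n1.member_id    = n2.member_id
  AND n1.quiz_num     = n2.quiz_num
  AND n1.question_num = n2.question_num
  AND n1.answer_num   = n2.answer_num
  AND n1.id > n2.id;

-- Optional: restore the original column layout.
ALTER TABLE TableA DROP COLUMN id;
```

Because the join matches every lower-id sibling, a single run removes all surplus copies even when a row is duplicated three or more times. If any of the four columns can be NULL, the null-safe operator <=> would be needed in place of =, since NULL = NULL does not evaluate to true.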