Deleting duplicate rows with sql

I am trying to delete duplicate rows from my MySQL table. I've tried multiple queries, but I keep getting this error: #1093 - You can't specify target table 'usa_city' for update in FROM clause
The table looks like this:
usa_city
--------
id(pk)
id_state
city_name
And the queries I have tried were:
DELETE FROM usa_city
WHERE id NOT IN
(
SELECT MIN(id)
FROM usa_city
GROUP BY city_name, id_state
)
And:
DELETE
FROM usa_city
WHERE usa_city.id IN
-- List 1 - all rows that have duplicates
(SELECT F.id
FROM usa_city AS F
WHERE Exists (SELECT city_name, id_state, Count(id)
FROM usa_city
WHERE usa_city.city_name = F.city_name
AND usa_city.id_state = F.id_state
GROUP BY usa_city.city_name, usa_city.id_state
HAVING Count(usa_city.id) > 1))
AND usa_city.id NOT IN
-- List 2 - one row from each set of duplicate
(SELECT Min(id)
FROM usa_city AS F
WHERE Exists (SELECT city_name, id_state, Count(id)
FROM usa_city
WHERE usa_city.city_name = F.city_name
AND usa_city.id_state = F.id_state
GROUP BY usa_city.city_name, usa_city.id_state
HAVING Count(usa_city.id) > 1)
GROUP BY city_name, id_state);
Thanks in advance.

Try to select the duplicates first, then delete them:
DELETE FROM usa_city WHERE id IN
(
SELECT id FROM usa_city
GROUP BY city_name, id_state
HAVING count(id) > 1
)
Hope it helps!!!
MODIFIED: Based on the comment, if you want to keep one record, you can use a self-join and keep the row with the lowest id:
DELETE c1 FROM usa_city c1, usa_city c2 WHERE c1.id > c2.id AND
(c1.city_name = c2.city_name AND c1.id_state = c2.id_state)
Be sure to make a backup before executing the query above...

From the MySQL documentation:
"Currently, you cannot delete from a table and select from the same
table in a subquery."
But here is a workaround for UPDATE, which should work for DELETE too.
Alternatively, you could select the rows and then delete them in a loop, for example in PHP.

You may find an answer to your problem here: How to delete duplicate records in mysql database?
You should also improve your database by using key fields to prevent duplicate rows, so you don't need to clean up in the future.

Edit: This solution is also found if you follow the link posted by BloodyWorld, so if it works, please go and upvote DMin's post here.
Found this browsing the internet (#1 Google result for "mysql delete duplicate rows"); have you tried it?
DELETE FROM table1
USING table1, table1 AS vtable
WHERE (table1.ID > vtable.ID) -- deletes the later copies, so the row with the lowest ID survives
AND (table1.field_name = vtable.field_name)

Judging from your examples, when you say "duplicate", you mean "having the same combination of id_state and city_name", correct? If so, after you have finished removing the duplicates, I strongly suggest creating a UNIQUE constraint on {id_state, city_name}.
To actually remove the duplicates, it is not enough to just identify the set of duplicates; you must also decide which of the identified duplicates to keep. Assuming you want to keep the ones with the smallest id, the following piece of SQL will do the job:
CREATE TEMPORARY TABLE usa_city_to_delete AS
SELECT id FROM usa_city T1
WHERE EXISTS (
SELECT * FROM usa_city T2
WHERE
T1.id_state = T2.id_state
AND T1.city_name = T2.city_name
AND T1.id > T2.id
);
DELETE FROM usa_city
WHERE id IN (SELECT id FROM usa_city_to_delete);
DROP TEMPORARY TABLE usa_city_to_delete;
Unfortunately, MySQL does not allow correlated subqueries on the target table in DELETE; otherwise we could have done that in a single statement, without the temporary table.
--- EDIT ---
You can't have a correlated subquery, but you can have a JOIN, as illustrated by Carlos Quijano's answer. Also, the temporary table can be created implicitly, as suggested by Kokers.
So it is possible to do it in a single statement, contrary to what I wrote above...
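
For anyone who wants to try the "keep the smallest id, then add a UNIQUE constraint" sequence on throwaway data first, here is a minimal sketch. It is only an illustration under assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL (SQLite does not raise error 1093, so this checks the logic, not the MySQL restriction), and the sample rows, the index name uq_state_city and the function name are made up.

```python
import sqlite3

# Made-up sample data: (id, id_state, city_name)
SAMPLE_ROWS = [
    (1, 10, "Springfield"),
    (2, 10, "Springfield"),
    (3, 20, "Springfield"),
    (4, 10, "Springfield"),
    (5, 30, "Portland"),
]


def dedupe_then_constrain(rows):
    """Keep the lowest id per (id_state, city_name), then forbid new duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE usa_city (id INTEGER PRIMARY KEY, id_state INTEGER, city_name TEXT)"
    )
    conn.executemany("INSERT INTO usa_city VALUES (?, ?, ?)", rows)

    # The derived table "keep" holds one id per group. Wrapping the aggregate
    # in a derived table is the single-statement form discussed in this thread.
    conn.execute(
        """
        DELETE FROM usa_city
        WHERE id NOT IN (
            SELECT id FROM (
                SELECT MIN(id) AS id
                FROM usa_city
                GROUP BY id_state, city_name
            ) AS keep
        )
        """
    )

    # Hypothetical index name. The MySQL counterpart would be along the lines of
    # ALTER TABLE usa_city ADD UNIQUE (id_state, city_name).
    conn.execute("CREATE UNIQUE INDEX uq_state_city ON usa_city (id_state, city_name)")

    # Re-inserting an existing (id_state, city_name) pair should now fail.
    try:
        conn.execute(
            "INSERT INTO usa_city (id_state, city_name) VALUES (?, ?)", rows[0][1:3]
        )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    remaining = conn.execute(
        "SELECT id, id_state, city_name FROM usa_city ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining, duplicate_rejected
```

Calling dedupe_then_constrain(SAMPLE_ROWS) returns the surviving rows plus a flag telling whether the constraint rejected a fresh duplicate. Creating the unique index only succeeds once the duplicates are gone, which is why the order of the two steps matters.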

Related

MySQL 5.7 remove duplicate rows in the same table based on multiple columns

I have a table with existing records, and I want to add a unique constraint on multiple columns (app_instance_config_uuid, external_resource_id and spaceId), but first I need to remove the existing duplicates.
This is an example of the table I want to add the constraint to.
The best solution I found is
DELETE FROM spaces_apps
WHERE id IN ( SELECT id FROM ( SELECT MIN(id) AS id FROM spaces_apps
GROUP BY spaceId, app_instance_config_uuid, external_resource_id
HAVING COUNT(id) > 1 ) temp )
but the issue is that it only deletes one duplicate per group, and if I need to delete more than one I need to run it again.
Important note: this is MySQL 5.7, so using ROW_NUMBER() and similar approaches doesn't work.
UPDATE:
The first solution works even better when just changing IN to NOT IN and removing the HAVING clause! Thanks to @Pankaj for pointing this out!
DELETE FROM spaces_apps
WHERE id NOT IN ( SELECT id FROM ( SELECT MIN(id) AS id FROM spaces_apps
GROUP BY spaceId, app_instance_config_uuid, external_resource_id ) temp )
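
To see why the IN ... HAVING form needs several runs while the NOT IN form finishes in one, here is a rough sketch on made-up rows. Assumptions: it uses Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL 5.7, the sample values are invented, and the derived-table aliases (first_ids, keep_ids) are my own names.

```python
import sqlite3

# Made-up rows: (id, spaceId, app_instance_config_uuid, external_resource_id)
ROWS = [
    (1, "s1", "cfg-a", "res-1"),
    (2, "s1", "cfg-a", "res-1"),
    (3, "s1", "cfg-a", "res-1"),
    (4, "s2", "cfg-b", "res-2"),
]

# Deletes only the lowest id of each duplicated group, so one row per run.
IN_WITH_HAVING = """
DELETE FROM spaces_apps
WHERE id IN (SELECT id FROM (SELECT MIN(id) AS id FROM spaces_apps
    GROUP BY spaceId, app_instance_config_uuid, external_resource_id
    HAVING COUNT(id) > 1) AS first_ids)
"""

# Keeps the lowest id of every group and deletes everything else in one pass.
NOT_IN_WITHOUT_HAVING = """
DELETE FROM spaces_apps
WHERE id NOT IN (SELECT id FROM (SELECT MIN(id) AS id FROM spaces_apps
    GROUP BY spaceId, app_instance_config_uuid, external_resource_id) AS keep_ids)
"""


def ids_after(statement):
    """Load ROWS into a fresh table, run one DELETE, return the surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spaces_apps (id INTEGER PRIMARY KEY, spaceId TEXT, "
        "app_instance_config_uuid TEXT, external_resource_id TEXT)"
    )
    conn.executemany("INSERT INTO spaces_apps VALUES (?, ?, ?, ?)", ROWS)
    conn.execute(statement)
    ids = [row[0] for row in conn.execute("SELECT id FROM spaces_apps ORDER BY id")]
    conn.close()
    return ids
```

With three copies of the same row, ids_after(IN_WITH_HAVING) still leaves two of them behind, while ids_after(NOT_IN_WITHOUT_HAVING) leaves exactly one row per group.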

I found a solution for this. It's not the prettiest, but it's the only one that works in my case.
DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE
t1.created_at < t2.created_at AND
t1.app_instance_config_uuid=t2.app_instance_config_uuid AND
t1.external_resource_id=t2.external_resource_id AND
t1.spaceId=t2.spaceId;

sql MySQL error (1241) operand should contain 1 column(s)

First, I would like to state that I am still a newbie at writing SQL queries. I searched thoroughly for an answer to this error and found a good number of answers, but none seemed helpful, or rather, I don't really know how to apply those solutions to my case.
Here is my challenge: I have an application table that stores applicants' records with some unique columns, e.g. (dl_number, parent_id, person_id). The parent_id ties each applicant's history records to his/her first record, and each applicant is meant to have a unique dl_number. For some reason, some applicants' dl_number(s) are not unique, hence the need to identify the records with changing dl_number(s).
Below is the SQL query on which I am getting the [sql error (1241) operand should contain 1 column(s)] error.
SELECT id,application_id,dl_number,surname,firstname,othername,birth_date,status_id,expiry_date,person_id,COUNT(DISTINCT(dl_number,parent_id,birth_date)) AS NumOccurrences
FROM tbl_dl_application
WHERE status_id > 1
GROUP BY dl_number,parent_id,birth_date
HAVING NumOccurrences > 1
Any help on how to solve this, or a better way to approach it, would be appreciated.
Sample table and expected result

DISTINCT is not really a function to be used that way.
You can do SELECT DISTINCT column1, column2 FROM table to get unique rows only, or similarly SELECT column, count(DISTINCT anothercolumn) FROM table GROUP BY column to count the unique values within each group.
Problem as I understand it: you are looking for duplicates in your table. Duplicates are defined as having identical values in these 3 columns: dl_number, parent_id and birth_date.
I'm also assuming that id is a primary key in your table. If not, replace the t2.id <> t.id condition with one that uniquely identifies your row.
If you only wanted to know what the duplicated groups are, this should work:
SELECT dl_number, parent_id, birth_date, count(*) AS NumOccurrences -- You can only add aggregate functions here, not another column, unless you group by it.
FROM tbl_dl_application t
WHERE status_id > 1 -- I don't know what this is but it should do no harm.
GROUP BY dl_number, parent_id, birth_date
HAVING count(*) > 1
If, however, you want to know details of each duplicated row, this query will give you that:
SELECT *
FROM tbl_dl_application t
WHERE
status_id > 1 -- I don't know what this is but it should do no harm.
AND EXISTS (
SELECT 1
FROM tbl_dl_application t2
WHERE
t2.dl_number = t.dl_number
AND t2.parent_id = t.parent_id
AND t2.birth_date = t.birth_date
AND t2.id <> t.id
)
ORDER BY dl_number, parent_id, birth_date, id; -- So you have your duplicates nicely next to each other.
Please explain further if I misunderstood your objective, or ask if the solution is not clear enough.
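
If it helps, the two queries above can be tried on a tiny made-up table. This is only a sketch under assumptions: it runs on an in-memory SQLite database via Python's sqlite3 instead of MySQL, the table is cut down to the columns the queries touch, and the sample rows and the function name are invented.

```python
import sqlite3

# Made-up rows: (id, dl_number, parent_id, birth_date, status_id)
APPLICATIONS = [
    (1, "DL100", 7, "1980-01-01", 2),
    (2, "DL100", 7, "1980-01-01", 3),
    (3, "DL200", 8, "1975-05-05", 2),
    (4, "DL100", 7, "1980-01-01", 1),  # excluded by status_id > 1
]


def find_duplicates(rows):
    """Return (duplicated groups, ids of the individual duplicated rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_dl_application (id INTEGER PRIMARY KEY, dl_number TEXT, "
        "parent_id INTEGER, birth_date TEXT, status_id INTEGER)"
    )
    conn.executemany("INSERT INTO tbl_dl_application VALUES (?, ?, ?, ?, ?)", rows)

    # One line per duplicated combination, with its number of occurrences.
    groups = conn.execute(
        """
        SELECT dl_number, parent_id, birth_date, COUNT(*) AS NumOccurrences
        FROM tbl_dl_application
        WHERE status_id > 1
        GROUP BY dl_number, parent_id, birth_date
        HAVING COUNT(*) > 1
        """
    ).fetchall()

    # Every row that has at least one other row with the same three values.
    detail_ids = [
        row[0]
        for row in conn.execute(
            """
            SELECT t.id
            FROM tbl_dl_application t
            WHERE t.status_id > 1
              AND EXISTS (
                  SELECT 1
                  FROM tbl_dl_application t2
                  WHERE t2.dl_number = t.dl_number
                    AND t2.parent_id = t.parent_id
                    AND t2.birth_date = t.birth_date
                    AND t2.id <> t.id
              )
            ORDER BY t.id
            """
        )
    ]
    conn.close()
    return groups, detail_ids
```

The first result gives one summary line per duplicated combination, the second gives the ids of the actual rows, which is the difference between the two queries.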

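The error comes from wrapping three columns in parentheses after DISTINCT: DISTINCT is not a function, so (dl_number,parent_id,birth_date) is read as a single row operand. Use just one of these three fields inside the parentheses and the query will run. (MySQL also accepts several columns in COUNT(DISTINCT ...) as long as they are not wrapped in an extra pair of parentheses.)
For example: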
SELECT id,application_id,dl_number,surname,firstname,othername,birth_date,status_id,expiry_date,person_id,COUNT(DISTINCT(parent_id)) AS NumOccurrences
FROM tbl_dl_application
WHERE status_id > 1
GROUP BY dl_number,parent_id,birth_date
HAVING NumOccurrences > 1

Remove rows that are already in another table

I have 2 MySQL tables. They are both lists of personal info (names, phone numbers, emails, etc.).
Although the field names are similar, they are not identical.
I need to remove rows from the 1st table containing phone numbers that are found in the second table.
Is this possible and can someone point me in the right direction?
Thanks in advance.

delete t1
from first_table t1
join second_table t2 on t1.phone = t2.phone

Try this:
DELETE FROM `table1` WHERE `phonenumber` IN (SELECT `phonenumber` FROM `table2`)

First, write a query that identifies the set of rows you want to remove. For example:
SELECT o.*
FROM table_one o
JOIN table_two t
ON o.phone_number = t.phone_number
Verify that this is the set of rows you want to remove. (This is also convenient if you want to store a backup copy of the rows you are going to delete. If there are multiple rows from table_two that have the same phone number, you can add a GROUP BY clause with the primary key of table_one, or a DISTINCT keyword, etc.)
Convert the SELECT statement into a DELETE statement by replacing the SELECT keyword and the select list with DELETE o.* (if it's rows from table_one you want to remove.)
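
A rough way to rehearse the "preview first, then delete" steps on toy data is sketched below. Assumptions: it uses Python's sqlite3 with an in-memory SQLite database, the column names and sample rows are invented, and because SQLite has no DELETE ... JOIN the delete step uses the IN (subquery) form from another answer here instead of DELETE o.*.

```python
import sqlite3


def remove_known_phones():
    """Preview the rows to remove from table_one, then delete them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_one (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT)"
    )
    conn.execute(
        "CREATE TABLE table_two (id INTEGER PRIMARY KEY, full_name TEXT, phone_number TEXT)"
    )
    conn.executemany(
        "INSERT INTO table_one VALUES (?, ?, ?)",
        [(1, "Ann", "555-0101"), (2, "Bob", "555-0102"), (3, "Cy", "555-0103")],
    )
    conn.executemany(
        "INSERT INTO table_two VALUES (?, ?, ?)",
        [(1, "Bob B.", "555-0102"), (2, "Robert", "555-0102"), (3, "Dee", "555-0199")],
    )

    # Step 1: preview. DISTINCT collapses the repeated match caused by
    # table_two listing the same phone number twice.
    preview = [
        row[0]
        for row in conn.execute(
            """
            SELECT DISTINCT o.id
            FROM table_one o
            JOIN table_two t ON o.phone_number = t.phone_number
            ORDER BY o.id
            """
        )
    ]

    # Step 2: delete the same set of rows.
    deleted = conn.execute(
        "DELETE FROM table_one WHERE phone_number IN (SELECT phone_number FROM table_two)"
    ).rowcount

    remaining = [row[0] for row in conn.execute("SELECT id FROM table_one ORDER BY id")]
    conn.close()
    return preview, deleted, remaining
```

Comparing the preview with the number of deleted rows is a cheap sanity check before running the real statement on production data.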

sql delete all but 2 duplicates

I want to be able to limit the number of duplicate records in a MySQL database table to 2.
(Excluding the id field, which is auto-increment.)
My table is set up like
id   city      item
---------------------
1    Miami     4
2    Detroit   5
3    Miami     4
4    Miami     18
5    Miami     4
So in that table, only row 5 would be deleted.
How can I do this?

MySQL has some foibles when reading and writing to the same table, so I don't actually know if this will work. The syntax is fine in many implementations of SQL, but I don't know if it's MySQL-friendly...
DELETE FROM
yourTable
WHERE
1 < (SELECT COUNT(*)
FROM yourTable as Lookup
WHERE city = yourTable.city AND item = yourTable.item AND id < yourTable.id)
EDIT
Amazingly convoluted, but worth a try?
DELETE
yourTable
FROM
yourTable
INNER JOIN
(
SELECT
id
FROM
(
SELECT
id
FROM
yourTable
WHERE
1 < (SELECT COUNT(*)
FROM yourTable as Lookup
WHERE city = yourTable.city AND item = yourTable.item AND id < yourTable.id)
)
AS inner_deletes
)
AS deletes
ON deletes.id = yourTable.id
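
The condition itself (a row goes if more than one earlier row has the same city and item) can be checked on the sample data from the question. This is a sketch under assumptions: it runs on an in-memory SQLite database through Python's sqlite3, which accepts the direct correlated form, so it only confirms the logic and says nothing about whether MySQL accepts either statement. The function name is made up.

```python
import sqlite3

# Sample data from the question: (id, city, item)
QUESTION_ROWS = [
    (1, "Miami", 4),
    (2, "Detroit", 5),
    (3, "Miami", 4),
    (4, "Miami", 18),
    (5, "Miami", 4),
]


def keep_first_two(rows):
    """Delete every row that has more than one earlier row with the same city and item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (id INTEGER PRIMARY KEY, city TEXT, item INTEGER)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

    # For each candidate row, count the rows in the same (city, item) group
    # with a smaller id; two or more earlier rows means this one is surplus.
    conn.execute(
        """
        DELETE FROM yourTable
        WHERE 1 < (SELECT COUNT(*)
                   FROM yourTable AS Lookup
                   WHERE Lookup.city = yourTable.city
                     AND Lookup.item = yourTable.item
                     AND Lookup.id < yourTable.id)
        """
    )

    remaining = [row[0] for row in conn.execute("SELECT id FROM yourTable ORDER BY id")]
    conn.close()
    return remaining
```

On the question's table this removes only row 5, as requested. Changing the 1 in the comparison to N - 1 would keep the first N rows of each group instead.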

I think your problem here is that both your code and/or table structure allows inserting duplicates and you are asking this question when you should really fix your db and/or code.

I think a better solution is to avoid allowing more than 5 records in the first place: implement a validation so that if select count(*) > 3, you do not accept the new insert.
If you want to do this in the data tier, you have to use a stored procedure, because you first need to identify all the groups with more than 3 records and then delete only the last one.
Regards

Due to MySQL being notoriously difficult when it comes to updating queried tables (see for example the answers from Dems), the best I can figure out is sadly more than one statement, but on the plus side fairly readable:
CREATE TEMPORARY TABLE Dump AS SELECT id FROM table1 WHERE id NOT IN
(SELECT MIN(id) FROM table1 GROUP BY city,item UNION
SELECT MAX(id) FROM table1 GROUP BY city,item);
DELETE FROM table1 WHERE id IN (SELECT * FROM Dump);
DROP TEMPORARY TABLE Dump;
Not sure if it was important which duplicate was removed; this keeps the first and last.

In your reply to Joachim's answer, you ask about saving 3 or 5 rows. This is one way to accomplish it. Depending on how you are using this database, you could either call this in a loop, or you could turn it into a stored procedure. Either way, you would continue to run this entire block of code until Rows Affected = 0:
drop table if exists TempTable;
create table TempTable
select city, item,
count(*) as record_count,
min(id) as ItemToDrop -- this could be changed to max() if you
-- want to delete new stuff instead
from YourTable
group by city, item
having count(*) > 2; -- This value = number of rows you save
delete from YourTable
where id in (select ItemToDrop from TempTable);

Delete - I can't specify target table?

Why doesn't this query work?
DELETE FROM recent_edits
WHERE trackid NOT IN
(SELECT DISTINCT history.trackid
FROM history JOIN recent_edits ON history.trackid=recent_edits.trackid
GROUP BY recent_edits.trackid)
I get this message: "You can't specify target table "recent_edits" for update in FROM clause"

Try it this way:
DELETE FROM recent_edits
WHERE trackid NOT IN
(select * from (SELECT DISTINCT history.trackid
FROM history JOIN recent_edits ON history.trackid=recent_edits.trackid
GROUP BY recent_edits.trackid) as t);

You can't post-process a table which is locked for deletion. Using the hack select * from (query), as Nicola states, will generate a temporary table instead of accessing the table directly.
Edit - make sure that you give an alias to each table you use, since the query is nested and every derived table requires its own unique alias.
