How to delete all duplicate rows in MySQL

This is my table role_users for a particular id.
I want to delete all duplicate rows for role_id and user_id: if two or more records exist for the same pair, only the latest record should remain and the others should be deleted. How can I write this?

You can use the delete ... from ... join... syntax:
delete r
from role_users r
inner join (
select role_id, user_id, max(id) max_id
from role_users
group by role_id, user_id
) r1
on r.role_id = r1.role_id
and r.user_id = r1.user_id
and r.id < r1.max_id

Try the following approach:
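The following query returns the id of the latest record in each group of rows that share the same role_id and user_id:
SELECT MAX(id) FROM role_users GROUP BY role_id, user_id
Note the MAX function, which picks the most recently created record, assuming id is auto_increment.
Use it in a nested query to delete all the other duplicate records. MySQL does not allow a DELETE to select from its own target table in a plain subquery (error 1093), so the subquery is wrapped in a derived table:
DELETE FROM role_users WHERE id NOT IN (SELECT max_id FROM (SELECT MAX(id) AS max_id FROM role_users GROUP BY role_id, user_id) AS latest);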
Let me know if it works :)
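The keep-the-latest idea above can be tried on throwaway data before touching a real table. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, and SQLite accepts the plain subquery form that MySQL may reject), and the sample rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE role_users ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, role_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO role_users (role_id, user_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (2, 10), (1, 10), (2, 20), (2, 20)],
)

# Ids to keep: the highest id in each (role_id, user_id) group.
keep_ids = "SELECT MAX(id) FROM role_users GROUP BY role_id, user_id"

# Delete every row whose id is not one of the kept ids.
conn.execute(f"DELETE FROM role_users WHERE id NOT IN ({keep_ids})")
conn.commit()

remaining = conn.execute(
    "SELECT id, role_id, user_id FROM role_users ORDER BY id"
).fetchall()
print(remaining)
```

After the delete, each (role_id, user_id) pair should appear exactly once, with the highest id it had before.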

Related

Delete all duplicates except first one mysql

I have a table with a column serial_number that is repeated a few times. How would I delete the entire row except the first duplicate?
With the following query I can select all the duplicates, but I can't delete them.
SELECT serial_number, COUNT(*) FROM trademark_merge GROUP BY serial_number HAVING COUNT(*) > 1
Assuming that the primary key of your table is id, you could phrase this as a delete/join query, like:
delete tm
from trademark_merge tm
inner join (
select serial_number, min(id) id
from trademark_merge
group by serial_number
) tm1 on tm.serial_number = tm1.serial_number and tm.id > tm1.id
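The rule behind this delete is "remove a row if an earlier row with the same serial_number exists". A rough way to check that rule on sample data is sketched below. It assumes Python's sqlite3 as a stand-in for MySQL; since SQLite has no DELETE ... JOIN syntax, the same condition is written with EXISTS instead, and the serial numbers are invented.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); serial numbers are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trademark_merge (id INTEGER PRIMARY KEY, serial_number TEXT)"
)
conn.executemany(
    "INSERT INTO trademark_merge (id, serial_number) VALUES (?, ?)",
    [(1, "A100"), (2, "B200"), (3, "A100"), (4, "A100"), (5, "B200"), (6, "C300")],
)

# A row is a duplicate if another row with the same serial_number has a lower id.
cur = conn.execute(
    """
    DELETE FROM trademark_merge
    WHERE EXISTS (
        SELECT 1
        FROM trademark_merge AS earlier
        WHERE earlier.serial_number = trademark_merge.serial_number
          AND earlier.id < trademark_merge.id
    )
    """
)
deleted = cur.rowcount
conn.commit()

kept = conn.execute(
    "SELECT id, serial_number FROM trademark_merge ORDER BY id"
).fetchall()
print(deleted, kept)
```

Only the first occurrence of each serial_number should survive, and rows that were never duplicated are left alone.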

Delete records based on another query in mysql

I have a query in MySQL based on which I am finding duplicate records of some columns.
select max(id), count(*) as cnt
from table group by start_id, end_id, mysqltable
having cnt>1;
This above query gives me the max(id) and the count of number of records that have start_id,end_id,mysqltable column values same.
I want to delete all the records that match the max(id) column of the above query
How can I do that?
I have tried like below
delete from table
where (select max(id), count(*) as cnt
from table group by start_id,end_id,mysqltable
having cnt>1)
But I am unable to delete the records.
You can remove duplicate records using JOIN.
DELETE t1 FROM table t1
INNER JOIN
table t2
WHERE
t1.id > t2.id AND t1.start_id = t2.start_id AND t1.end_id = t2.end_id AND t1.mysqltable = t2.mysqltable;
This query keeps the row with the lowest id in each group of duplicates and removes all the others.
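I think this command should work. The subquery is wrapped in a derived table because MySQL does not let a DELETE select directly from its own target table; note that each run removes only the highest id per duplicate group:
delete from table
where id in (
select max_id from (
select max(id) as max_id from table
group by start_id, end_id, mysqltable
having count(*) > 1
) as dup
);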
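Deleting only the MAX(id) row of each duplicate group removes one row per group per run, so a group with three or more copies needs several runs. The sketch below illustrates that by repeating the delete until nothing is left to remove. It assumes Python's sqlite3 as a stand-in for MySQL, and `jobs` is a hypothetical table name used in place of the question's `table`.

```python
import sqlite3

# SQLite stands in for MySQL (assumption). "jobs" is a hypothetical name
# replacing the question's placeholder "table".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, start_id INTEGER, "
    "end_id INTEGER, mysqltable TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (start_id, end_id, mysqltable) VALUES (?, ?, ?)",
    [(1, 5, "a"), (1, 5, "a"), (1, 5, "a"), (2, 6, "b"), (2, 6, "b"), (3, 7, "c")],
)

delete_newest_duplicate = """
    DELETE FROM jobs
    WHERE id IN (
        SELECT MAX(id) FROM jobs
        GROUP BY start_id, end_id, mysqltable
        HAVING COUNT(*) > 1
    )
"""

# Each pass removes the highest id from every group that still has
# duplicates; stop once a pass deletes nothing.
deleted_per_pass = []
while True:
    deleted = conn.execute(delete_newest_duplicate).rowcount
    if deleted == 0:
        break
    deleted_per_pass.append(deleted)
conn.commit()

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM jobs ORDER BY id")]
print(deleted_per_pass, remaining_ids)
```

With the sample rows, the triple needs two passes and the pair needs one, which is why a single run of the MAX(id) delete may leave duplicates behind.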

Optimizing MySQL Join and Group By with intermediate table

Simplifying but I have three tables:
users (user_id, team_id)
results (user_id, result)
user_signups (user_id, team_id, event_id)
results.user_id is a foreign key.
The tables have a large number of rows. If I do
select sum(result)
from results
inner join users on users.id = results.user_id
group by team_id
It is fast. "Explain" has results with 150k rows, users with 1 row.
If I do
select sum(result)
from results
inner join user_signups on user_signups.user_id = results.user_id
where event_id = 1
group by team_id
It is very slow (from 1 second to 14). "Explain" has results with 28 rows, user_signups with 5345 rows.
Things I have tried:
A unique index on event_id and user_id on user_signups.
An index on event_id, user_id, team_id on user_signups.
Rewriting as
select sum(result)
from results
inner join (select * from user_signups where event_id = 1) user_signups on user_signups.user_id = results.user_id
group by team_id
Rewriting as
select sum(result)
from results
inner join users on users.id = results.user_id
inner join user_signups on user_signups.user_id = users.id
where event_id = 1
group by user_signups.team_id
Any other suggestions?
By grouping on the team_id, I assume that you want one row for each record in results.
Is this what you're looking for?
SELECT *, sum(result) FROM results
LEFT JOIN users ON (users.user_id=results.user_id)
LEFT JOIN user_signups ON (user_signups.user_id=users.user_id)
GROUP BY table.field
From here, you can group on whatever you like. This structure assumes that most of your data will be present in the results table and will join users to the results table and user_signups to the users table.
Make the multicolumn index on (event_id, user_id, team_id) in user_signups table and try to run the following query.
If this doesn't work then post your explain here.
select sum(result)
from results
inner join (
select event_id, user_id, team_id
from user_signups
where event_id = 1
) user_signups on user_signups.user_id = results.user_id
group by team_id

Removing duplicate records from relational db table

I have a database table with three columns: id, user_id, book_id. The table contains some duplicates: a user_id should have only one record per book_id, but in some cases a user_id has several records for the same book_id. There are already a couple of million records, and I'm wondering how to remove the duplicates.
Try following.
SQL SERVER
WITH ORDERED AS
(
SELECT id,
ROW_NUMBER() OVER (PARTITION BY [user_id] , [book_id] ORDER BY id ASC) AS rn
FROM
tableName
)
delete from tableName
where id in ( select id from ORDERED where rn != 1)
MYSQL
delete from tableName
where id not in(
select MIN(id) from tableName
group by user_id, book_id
)
Edited as per comments: in MySQL, you can't modify a table that is also read by a subquery in the same statement (error 1093).
Wrapping the subquery in a derived table solves the issue.
delete from tableName
where id not in(
select temp.temp_id from (
select MIN(id) as temp_id from tableName
group by user_id, book_id
) as temp
)
This will keep only one combination of (user_id, book_id)
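The ROW_NUMBER() form from this answer can be checked on sample data. The sketch below assumes Python's sqlite3 as a stand-in (it needs a SQLite build with window functions, version 3.25 or later), and `user_books` is a hypothetical table name with invented rows.

```python
import sqlite3

# SQLite stands in for the real database (assumption); window functions
# require SQLite 3.25+. "user_books" and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_books (id INTEGER PRIMARY KEY, user_id INTEGER, book_id INTEGER)"
)
conn.executemany(
    "INSERT INTO user_books (user_id, book_id) VALUES (?, ?)",
    [(1, 100), (1, 100), (1, 200), (2, 100), (2, 100), (2, 100)],
)

# Number the rows inside each (user_id, book_id) group by ascending id,
# then delete everything except row number 1 of each group.
dedupe = """
    WITH ordered AS (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id, book_id ORDER BY id ASC
               ) AS rn
        FROM user_books
    )
    DELETE FROM user_books
    WHERE id IN (SELECT id FROM ordered WHERE rn != 1)
"""
conn.execute(dedupe)
conn.commit()

remaining = conn.execute(
    "SELECT id, user_id, book_id FROM user_books ORDER BY id"
).fetchall()
print(remaining)
```

The result should contain the lowest id for each (user_id, book_id) pair, the same rows the MIN(id) variant keeps.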
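The statement below deletes all duplicate records for each user_ID, leaving only the row with the greatest ID. Note that it groups by user_ID alone; to deduplicate per (user_id, book_id), add book_id to the GROUP BY and to the join condition.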
DELETE a
FROM tableName a
LEFT JOIN
(
SELECT user_ID, MAX(ID) max_ID
FROM tableName
GROUP BY user_ID
) b ON a.user_ID = b.user_ID AND
a.ID = b.max_ID
WHERE b.max_ID IS NULL
Hope this query will allow you to remove duplicates:
DELETE bl1 FROM book_log bl1
JOIN book_log bl2
ON (
bl1.id > bl2.id AND
bl1.user_id = bl2.user_id AND
bl1.book_id = bl2.book_id
);

Count unique users from db

I have the following table structure in my db (MySQL):
id group_id item_id project_id user_id
Users can have multiple entries within the same project. How do I count unique users within a particular project (minus project owner id)?
SELECT COUNT(user_id) AS cnt
FROM myTable
WHERE project_id = $myProject
AND user_id != 3
GROUP BY user_id
This looks right but I don't believe I'm getting the right results. Am I missing something?
Select Count(Distinct user_id)
From MyTable
Where project_id = $myProject
And user_id != 3
Add DISTINCT to your COUNT and eliminate the GROUP BY.
SELECT COUNT(DISTINCT user_id) AS cnt
FROM myTable
WHERE project_id = $myProject
AND user_id != 3
You don't need a GROUP BY clause for this.
SELECT COUNT(DISTINCT user_id) AS cnt
FROM myTable
WHERE project_id = $myProject
AND user_id != 3;
If you want to list the member count for each project in the same query, you can GROUP BY project_id:
SELECT project_id, COUNT(DISTINCT user_id) AS cnt
FROM myTable
GROUP BY project_id;
By grouping on user_id as you do now, the result set has one row per user, each holding that user's number of entries, rather than a single count of unique users.
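The difference between the two queries is easy to see on a few rows. The sketch below assumes Python's sqlite3 as a stand-in for MySQL; the rows are invented, and the project id 7 and owner id 3 are bound as parameters in place of $myProject and the hard-coded 3.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, group_id INTEGER, "
    "item_id INTEGER, project_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO myTable (group_id, item_id, project_id, user_id) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 7, 3),  # project owner, excluded by the WHERE clause
        (1, 2, 7, 4),
        (1, 3, 7, 4),  # second entry by the same user
        (1, 4, 7, 5),
        (1, 5, 8, 6),  # different project
    ],
)
project_id, owner_id = 7, 3

# Original query shape: one row per user, holding that user's entry count.
grouped = conn.execute(
    "SELECT COUNT(user_id) FROM myTable "
    "WHERE project_id = ? AND user_id != ? GROUP BY user_id ORDER BY user_id",
    (project_id, owner_id),
).fetchall()

# COUNT(DISTINCT ...) without GROUP BY: a single number of unique users.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM myTable "
    "WHERE project_id = ? AND user_id != ?",
    (project_id, owner_id),
).fetchone()[0]

print(grouped, distinct_count)
```

The grouped query returns several rows of per-user counts, while COUNT(DISTINCT user_id) returns the one figure the question asks for.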
Try DISTINCT inside the COUNT?
SELECT COUNT(DISTINCT user_id) AS cnt FROM myTable WHERE project_id = $myProject AND user_id != 3