SQL Query with COUNT, Having Count >1, display full details of duplicates


I have a table like this:
name     employment_Status  email
-------  -----------------  -----------------
David    E                  David@email.com
John     U                  John@email.com
Michael  E                  Michael@email.com
Steve    E                  Michael@email.com
James    U                  David@email.com
Mary     U                  Mary@email.com
Beth     E                  Beth@email.com
I started by selecting email and count(email):
SELECT email, COUNT(email) AS emailCount
FROM Table
GROUP BY email
HAVING ( COUNT(email) > 1 );
The problem occurred when I tried to include name as well:
SELECT name, email, COUNT(email) AS emailCount
FROM Table
GROUP BY name, email
HAVING ( COUNT(email) > 1 );
I would like to find all people with a duplicate email address, but only where both people are employed (E). However, the query above returns zero results.
I'd like to display all information for people who have duplicate emails and employment_Status E. If two people have the same email but one or both are unemployed (U), they should be ignored.
Could anyone advise?

I think you want exists:
select t.*
from t
where t.employment_Status = 'E' and
exists (select 1
from t t2
where t2.email = t.email and t2.employment_Status = 'E' and
t2.name <> t.name
);
Note that this assumes that name (or at least name/email) is unique.
In MySQL 8+, you can use window functions:
select t.*
from (select t.*, count(*) over (partition by t.email) as cnt
from t
where t.employment_Status = 'E'
) t
where cnt >= 2;
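To see why the GROUP BY name, email attempt comes back empty while the EXISTS version works, here is a minimal sketch using the sample rows from the question. It runs against an in-memory SQLite database from Python rather than MySQL, and the table name people is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, employment_Status TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("David", "E", "David@email.com"),
        ("John", "U", "John@email.com"),
        ("Michael", "E", "Michael@email.com"),
        ("Steve", "E", "Michael@email.com"),
        ("James", "U", "David@email.com"),
        ("Mary", "U", "Mary@email.com"),
        ("Beth", "E", "Beth@email.com"),
    ],
)

# Grouping by (name, email) puts every row in its own group,
# so no group can ever have a count above 1.
grouped_rows = conn.execute(
    """
    SELECT name, email, COUNT(email) AS emailCount
    FROM people
    GROUP BY name, email
    HAVING COUNT(email) > 1
    """
).fetchall()

# Keep an employed row only if another employed row shares its email.
exists_rows = conn.execute(
    """
    SELECT p.name, p.employment_Status, p.email
    FROM people p
    WHERE p.employment_Status = 'E'
      AND EXISTS (SELECT 1
                  FROM people p2
                  WHERE p2.email = p.email
                    AND p2.employment_Status = 'E'
                    AND p2.name <> p.name)
    ORDER BY p.name
    """
).fetchall()

print(grouped_rows)
for row in exists_rows:
    print(row)
```

David and James share an email too, but James is unemployed, so neither of them is returned.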

One way would be to use your query as a subquery in FROM clause, and JOIN the result with the main table.
SELECT t.*, d.emailCount
FROM (
SELECT email, employment_Status, COUNT(*) AS emailCount
FROM my_table
WHERE employment_Status = 'E'
GROUP BY email, employment_Status
HAVING emailCount > 1
) d
JOIN my_table t USING(email, employment_Status)
You could also use GROUP_CONCAT(name) if you are fine with getting the names in a comma-separated string:
SELECT email, COUNT(*) AS emailCount, GROUP_CONCAT(name) as names
FROM my_table
WHERE employment_Status = 'E'
GROUP BY email
HAVING emailCount > 1
The result for your sample data would be:
email              emailCount  names
-----------------  ----------  -------------
Michael@email.com  2           Michael,Steve
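Both ideas from this answer (joining the grouped subquery back to the table, and GROUP_CONCAT) can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, which is an assumption of this example; note that the WHERE clause has to come before GROUP BY, and SQLite does not promise any particular order of names inside GROUP_CONCAT.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, employment_Status TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("David", "E", "David@email.com"),
        ("John", "U", "John@email.com"),
        ("Michael", "E", "Michael@email.com"),
        ("Steve", "E", "Michael@email.com"),
        ("James", "U", "David@email.com"),
        ("Mary", "U", "Mary@email.com"),
        ("Beth", "E", "Beth@email.com"),
    ],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
# The grouped result is then joined back to get the full rows.
joined_rows = conn.execute(
    """
    SELECT t.name, t.email, d.emailCount
    FROM (
        SELECT email, employment_Status, COUNT(*) AS emailCount
        FROM my_table
        WHERE employment_Status = 'E'
        GROUP BY email, employment_Status
        HAVING COUNT(*) > 1
    ) d
    JOIN my_table t USING (email, employment_Status)
    ORDER BY t.name
    """
).fetchall()

# One row per duplicated email, with the names collapsed into a string.
concat_rows = conn.execute(
    """
    SELECT email, COUNT(*) AS emailCount, GROUP_CONCAT(name) AS names
    FROM my_table
    WHERE employment_Status = 'E'
    GROUP BY email
    HAVING COUNT(*) > 1
    """
).fetchall()

print(joined_rows)
print(concat_rows)
```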

Related

Eliminate the duplicate rows from the table in SQL

I want to eliminate the duplicate rows based on email from the table and retrieve all the rows without duplicates.
I have tried using distinct but I'm not getting desired results.
SELECT
DISTINCT Email
FROM
Users
Example Table:
Id  Email           Username
--  --------------  --------
1   sam@gmail.com   sam1122
2   john@gmail.com  john1122
3   sam@gmail.com   sam2233
4   lily@gmail.com  lily#as
What I want to retrieve:
Id  Email           Username
--  --------------  --------
1   john@gmail.com  john1122
2   lily@gmail.com  lily#as

We can try using exists logic here:
SELECT Id, Email, Username
FROM Users u1
WHERE NOT EXISTS (
SELECT 1
FROM Users u2
WHERE u2.Email = u1.Email AND
u2.Id <> u1.Id
);
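As a quick check of the NOT EXISTS logic, the sketch below loads the example rows from the question into an in-memory SQLite database (used here as a stand-in for MySQL, which is an assumption) and compares it with the equivalent IN ... HAVING COUNT(*) = 1 form. The rows keep their original Id values; neither query renumbers them.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Id INTEGER PRIMARY KEY, Email TEXT, Username TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [
        (1, "sam@gmail.com", "sam1122"),
        (2, "john@gmail.com", "john1122"),
        (3, "sam@gmail.com", "sam2233"),
        (4, "lily@gmail.com", "lily#as"),
    ],
)

# Keep a row only when no other row (different Id) has the same Email.
not_exists_rows = conn.execute(
    """
    SELECT Id, Email, Username
    FROM Users u1
    WHERE NOT EXISTS (SELECT 1
                      FROM Users u2
                      WHERE u2.Email = u1.Email
                        AND u2.Id <> u1.Id)
    ORDER BY Id
    """
).fetchall()

# Same result: keep rows whose Email appears exactly once.
having_rows = conn.execute(
    """
    SELECT Id, Email, Username
    FROM Users
    WHERE Email IN (SELECT Email
                    FROM Users
                    GROUP BY Email
                    HAVING COUNT(*) = 1)
    ORDER BY Id
    """
).fetchall()

print(not_exists_rows)
print(having_rows)
```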

You can do it using left join :
select u.*
from Users u
left join (
select email, max(id) as Id
from Users
group by email
having count(1) > 1
) as s on s.email = u.email
where s.email is null;

Yet another option, if you are using MySQL 8 -
SELECT Id, Email, Username
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY Email) AS cnt
FROM Users
) t
WHERE t.cnt = 1;

SELECT Id, Email, Username
FROM Users
WHERE Email IN (
SELECT Email
FROM Users
GROUP BY Email
HAVING COUNT(*) = 1
)

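-- Id and Username are wrapped in MIN() so the query also runs with
-- ONLY_FULL_GROUP_BY enabled; each kept group has exactly one row anyway.
SELECT MIN(id) AS id,
Email,
MIN(Username) AS Username,
count(*) AS duplicate_email_count
FROM Users
GROUP BY Email
HAVING duplicate_email_count=1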

MySQL problem on LeetCode

Here is a question from LeetCode. I don't know why my output is wrong. First I wrote the SELECT in the parentheses to find the repeated email addresses. Then I used DELETE to remove the repeated email addresses. Does anyone know what is wrong with my code?

It is very simple. Try this:
-- Solution 1
with cte as
(
select id, email, Rank() OVER (partition by email order by id) ranks
from person where email in(
select email from person
group by email having count(email) >1
)
)
DELETE FROM person where id in
(
SELECT id FROM cte where ranks!=1
);
-- Solution 2
DELETE p from person p
inner join (
select MIN(id) id, email from person
where email in(
select email from person group by email having count(email)>1
) group by email
) a ON p.id > a.id AND p.email = a.email;
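The idea behind both solutions is to keep the row with the smallest id for each email and delete the rest. Here is a minimal sketch of that idea, run against SQLite from Python; the sample rows are made up for illustration. It uses a NOT IN (SELECT MIN(id) ...) formulation, which is not what the answer above uses: SQLite has no DELETE ... JOIN, and in MySQL this exact statement is rejected with error 1093 unless the subquery is wrapped in a derived table.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [
        (1, "john@example.com"),  # hypothetical sample data
        (2, "bob@example.com"),
        (3, "john@example.com"),
    ],
)

# Delete every row that is not the smallest id for its email.
conn.execute(
    """
    DELETE FROM person
    WHERE id NOT IN (SELECT MIN(id)
                     FROM person
                     GROUP BY email)
    """
)

remaining = conn.execute("SELECT id, email FROM person ORDER BY id").fetchall()
print(remaining)
```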

find duplicates in pairs in mysql

I want to know how I can find duplicate values in a table across several columns combined.
Suppose my table has the fields id || name || father_name || region || dob.
How can I get a result set of the duplicates?
That is, I want to find all rows where three columns are the same.

select t1.*
from your_table t1
join
(
select name, father_name, region
from your_table
group by name, father_name, region
having count(*) > 1
) t2 on t1.name = t2.name
and t1.father_name = t2.father_name
and t1.region = t2.region

If you are using MySQL 8.0, you could use a window function. The query below returns every row whose name, fatherName and country combination occurs more than once:
select id, name, fatherName, country from (
select id,
name,
fatherName,
country,
count(id) over (partition by name, fatherName, country) cnt
from Tbl
) `a` where cnt > 1;

I also need this kind of feature often, where I have to compare all columns for equal values except the auto-incremented primary key id column.
In that case I always use GROUP BY.
For example:
SELECT A.*
FROM YourTable A
INNER JOIN (SELECT name,city,state
FROM YourTable
GROUP BY name,city,state
HAVING COUNT(*) > 1) B
ON A.name = B.name AND A.city = B.city AND A.state = B.state
You can add as many columns as you want to compare.
I hope this helps in your case too.
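The join-to-a-grouped-subquery pattern can be sketched on the columns from the question (name, father_name, region). The table name people and the sample rows are assumptions made up for this example, and SQLite is used from Python in place of MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, "
    "father_name TEXT, region TEXT, dob TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ali", "Omar", "North", "1990-01-01"),  # hypothetical rows
        (2, "Ali", "Omar", "North", "1992-05-05"),
        (3, "Ali", "Omar", "South", "1990-01-01"),
        (4, "Sara", "Yusuf", "East", "1988-03-03"),
        (5, "Sara", "Yusuf", "East", "1988-03-03"),
        (6, "Lina", "Adam", "West", "1995-07-07"),
    ],
)

# The subquery finds (name, father_name, region) combinations that occur
# more than once; the join pulls back every full row with such a combination.
dup_rows = conn.execute(
    """
    SELECT a.id, a.name, a.father_name, a.region
    FROM people a
    INNER JOIN (SELECT name, father_name, region
                FROM people
                GROUP BY name, father_name, region
                HAVING COUNT(*) > 1) b
      ON a.name = b.name
     AND a.father_name = b.father_name
     AND a.region = b.region
    ORDER BY a.id
    """
).fetchall()

dup_ids = [row[0] for row in dup_rows]
print(dup_ids)
```

Row 3 shares name and father_name with rows 1 and 2 but has a different region, so it is not reported.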

Find duplicates using MySQL considering multiple columns

I need to find duplicate users based on either the same email OR the same first_name, last_name combination OR the same birth_date. What I could comfortably try was:
SELECT id, first_name, last_name
FROM users
where id IN (SELECT id
from users
GROUP BY email
HAVING count(*) > 1)
GROUP BY email, id;
The above gives only the duplicate email details, but I'm a bit confused about handling the other conditions (first_name, last_name combination OR same birth_date) as well.
Is it possible to achieve it in a single query?

Try a UNION of three separate queries, one for each duplicate criterion:
SELECT id
FROM users
WHERE email IN
(
SELECT email
FROM users
GROUP BY email
HAVING COUNT(*) > 1
)
UNION
SELECT t1.id
FROM users t1
INNER JOIN
(
SELECT first_name, last_name
FROM users
GROUP BY first_name, last_name
HAVING COUNT(*) > 1
) t2
ON t1.first_name = t2.first_name AND
t1.last_name = t2.last_name
UNION
SELECT id
FROM users
WHERE birth_date IN
(
SELECT birth_date
FROM users
GROUP BY birth_date
HAVING COUNT(*) > 1
);
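Here is a small sketch of the UNION approach, where each branch finds the ids matching one duplicate criterion and UNION removes ids that match more than one. It runs on SQLite from Python instead of MySQL, and the sample users are invented for illustration; the column names follow the question (first_name, last_name, birth_date).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, email TEXT, birth_date TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "ann@example.com", "1990-01-01"),  # hypothetical rows
        (2, "Bob", "Ray", "ann@example.com", "1985-02-02"),  # same email as 1
        (3, "Cat", "Fox", "cat@example.com", "1970-03-03"),
        (4, "Cat", "Fox", "cf@example.com", "1971-04-04"),   # same name as 3
        (5, "Dan", "Kim", "dan@example.com", "1999-09-09"),
        (6, "Eve", "Poe", "eve@example.com", "1999-09-09"),  # same birth date as 5
        (7, "Zed", "Orr", "zed@example.com", "2001-12-12"),  # unique on everything
    ],
)

rows = conn.execute(
    """
    SELECT id FROM users
    WHERE email IN (SELECT email FROM users
                    GROUP BY email HAVING COUNT(*) > 1)
    UNION
    SELECT t1.id FROM users t1
    INNER JOIN (SELECT first_name, last_name FROM users
                GROUP BY first_name, last_name HAVING COUNT(*) > 1) t2
      ON t1.first_name = t2.first_name AND t1.last_name = t2.last_name
    UNION
    SELECT id FROM users
    WHERE birth_date IN (SELECT birth_date FROM users
                         GROUP BY birth_date HAVING COUNT(*) > 1)
    """
).fetchall()

duplicate_ids = sorted(row[0] for row in rows)
print(duplicate_ids)
```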

Find duplicate records in MySQL

I want to pull out duplicate records in a MySQL Database. This can be done with:
SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1
Which results in:
100 MAIN ST 2
I would like to pull it so that it shows each row that is a duplicate. Something like:
JIM JONES 100 MAIN ST
JOHN SMITH 100 MAIN ST
Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.

The key is to rewrite this query so that it can be used as a subquery.
SELECT firstname,
lastname,
list.address
FROM list
INNER JOIN (SELECT address
FROM list
GROUP BY address
HAVING COUNT(id) > 1) dup
ON list.address = dup.address;

SELECT date FROM logs group by date having count(*) >= 2

Why not just INNER JOIN the table with itself?
SELECT a.firstname, a.lastname, a.address
FROM list a
INNER JOIN list b ON a.address = b.address
WHERE a.id <> b.id
A DISTINCT is needed if the address could exist more than two times.
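The remark about DISTINCT is easy to verify with an address that occurs three times: each of the three rows then matches two partners, so the plain self-join returns six rows. The sketch below uses SQLite from Python as a stand-in for MySQL, with made-up names beyond the two in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list (id INTEGER PRIMARY KEY, firstname TEXT, "
    "lastname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?, ?)",
    [
        (1, "JIM", "JONES", "100 MAIN ST"),
        (2, "JOHN", "SMITH", "100 MAIN ST"),
        (3, "JANE", "DOE", "100 MAIN ST"),  # hypothetical third occupant
        (4, "AMY", "POND", "7 ELM RD"),     # hypothetical unique address
    ],
)

join_sql = """
    SELECT {distinct} a.firstname, a.lastname, a.address
    FROM list a
    INNER JOIN list b ON a.address = b.address
    WHERE a.id <> b.id
"""

# Without DISTINCT every row is repeated once per matching partner.
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
# With DISTINCT each duplicated person is listed once.
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

print(len(plain_rows), len(distinct_rows))
```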

I tried the best answer chosen for this question, but it confused me somewhat. I actually needed that just on a single field from my table. The following example from this link worked out very well for me:
SELECT COUNT(*) c,title FROM `data` GROUP BY title HAVING c > 1;

Isn't this easier :
SELECT *
FROM tc_tariff_groups
GROUP BY group_id
HAVING COUNT(group_id) >1
?

select `cityname` from `codcities` group by `cityname` having count(*)>=2
This is similar to the query you asked for; it works and it is easy too.
Enjoy!!!

Find duplicate users by email address with this query...
SELECT users.name, users.uid, users.mail, from_unixtime(created)
FROM users
INNER JOIN (
SELECT mail
FROM users
GROUP BY mail
HAVING count(mail) > 1
) dupes ON users.mail = dupes.mail
ORDER BY users.mail;

We can also find duplicates based on more than one field. For those cases you can use the format below.
SELECT COUNT(*), column1, column2
FROM tablename
GROUP BY column1, column2
HAVING COUNT(*)>1;

Finding duplicate addresses is much more complex than it seems, especially if you require accuracy. A MySQL query is not enough in this case...
I work at SmartyStreets, where we do address validation and de-duplication and other stuff, and I've seen a lot of diverse challenges with similar problems.
There are several third-party services which will flag duplicates in a list for you. Doing this solely with a MySQL subquery will not account for differences in address formats and standards. The USPS (for US address) has certain guidelines to make these standard, but only a handful of vendors are certified to perform such operations.
So, I would recommend the best answer for you is to export the table into a CSV file, for instance, and submit it to a capable list processor. One such is LiveAddress which will have it done for you in a few seconds to a few minutes automatically. It will flag duplicate rows with a new field called "Duplicate" and a value of Y in it.

Another solution would be to use table aliases, like so:
SELECT p1.id, p2.id, p1.address
FROM list AS p1, list AS p2
WHERE p1.address = p2.address
AND p1.id != p2.id
All you're really doing in this case is taking the original list table, creating two pretend tables -- p1 and p2 -- out of that, and then performing a join on the address column (line 3). The 4th line makes sure that the same record doesn't show up multiple times in your set of results ("duplicate duplicates").

Not going to be very efficient, but it should work:
SELECT *
FROM list AS o
WHERE (SELECT COUNT(*)
FROM list AS i
WHERE i.address = o.address) > 1;

This will select duplicates in one table pass, no subqueries.
SELECT *
FROM (
SELECT ao.*, (@r := @r + 1) AS rn
FROM (
SELECT @_address := 'N'
) vars,
(
SELECT *
FROM
list a
ORDER BY
address, id
) ao
WHERE CASE WHEN @_address <> address THEN @r := 0 ELSE 0 END IS NOT NULL
AND (@_address := address ) IS NOT NULL
) aoo
WHERE rn > 1
This query actually emulates ROW_NUMBER(), which is present in Oracle and SQL Server.
See the article in my blog for details:
Analytic functions: SUM, AVG, ROW_NUMBER - emulating in MySQL.
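Where window functions are available (MySQL 8+), the same numbering can be written with ROW_NUMBER() directly instead of emulating it with user variables. The sketch below shows the idea on SQLite from Python, which is an assumption of this example and needs SQLite 3.25 or newer; the sample addresses are made up. As with the query above, rn > 1 returns only the second and later rows of each address, not the first one.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8+ (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (id INTEGER PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO list VALUES (?, ?)",
    [
        (1, "A ST"),  # hypothetical sample data
        (2, "B ST"),
        (3, "A ST"),
        (4, "A ST"),
        (5, "C ST"),
    ],
)

# Number the rows within each address by id, then keep everything after the first.
extra_rows = conn.execute(
    """
    SELECT id, address, rn
    FROM (SELECT id, address,
                 ROW_NUMBER() OVER (PARTITION BY address ORDER BY id) AS rn
          FROM list) AS numbered
    WHERE rn > 1
    ORDER BY id
    """
).fetchall()

print(extra_rows)
```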

This will also show you how many duplicates there are and will order the results, without joins:
SELECT `Language` , id, COUNT( id ) AS how_many
FROM `languages`
GROUP BY `Language`
HAVING how_many >=2
ORDER BY how_many DESC

SELECT firstname, lastname, address FROM list
WHERE
Address in
(SELECT address FROM list
GROUP BY address
HAVING count(*) > 1)

select * from table_name t1 inner join (select distinct <attribute list> from table_name as temp)t2 where t1.attribute_name = t2.attribute_name
For your table it would be something like
select * from list l1 inner join (select distinct address from list as list2)l2 where l1.address=l2.address
This query will give you all the distinct address entries in your list table... I am not sure how this will work if you have any primary key values for name, etc..

Fastest duplicates removal queries procedure:
/* create temp table with one primary column id */
INSERT INTO temp(id) SELECT MIN(id) FROM list GROUP BY (isbn) HAVING COUNT(*)>1;
DELETE FROM list WHERE id IN (SELECT id FROM temp);
DELETE FROM temp;

Personally this query has solved my problem:
SELECT `SUB_ID`, COUNT(SRV_KW_ID) as subscriptions FROM `SUB_SUBSCR` group by SUB_ID, SRV_KW_ID HAVING subscriptions > 1;
What this script does is show all the subscriber IDs that exist more than once in the table, along with the number of duplicates found.
These are the table columns:
| SUB_SUBSCR_ID | int(11) | NO | PRI | NULL | auto_increment |
| MSI_ALIAS | varchar(64) | YES | UNI | NULL | |
| SUB_ID | int(11) | NO | MUL | NULL | |
| SRV_KW_ID | int(11) | NO | MUL | NULL | |
I hope it will be helpful for you too!

SELECT t.*,(select count(*) from city as tt where tt.name=t.name) as count FROM `city` as t where (select count(*) from city as tt where tt.name=t.name) > 1 order by count desc
Replace city with your Table.
Replace name with your field name

SELECT id, count(*) as c
FROM `list`
GROUP BY id HAVING c > 1
This will return you the id with the number of times that id is repeated, or nothing in which case you will not have repeated id.
Change the id in the group by (ex: address) and it will return the number of times an address is repeated identified by the first found id with that address.
SELECT id, count(*) as c
FROM `list`
GROUP BY address HAVING c > 1
I hope it helps. Enjoy ;)

SELECT *
FROM (SELECT address, COUNT(id) AS cnt
FROM list
GROUP BY address
HAVING ( COUNT(id) > 1 )) AS dup;

I use the following:
SELECT * FROM mytable
WHERE id IN (
SELECT id FROM mytable
GROUP BY column1, column2, column3
HAVING count(*) > 1
)

Most of the answers here don't cope with the case when you have MORE THAN ONE duplicate result and/or when you have MORE THAN ONE column to check for duplications. When you are in such case, you can use this query to get all duplicate ids:
SELECT address, email, COUNT(*) AS QUANTITY_DUPLICATES, GROUP_CONCAT(id) AS ID_DUPLICATES
FROM list
GROUP BY address, email
HAVING COUNT(*)>1;
If you want to list every result as a single line, you need a more complex query. This is the one I found working:
CREATE TEMPORARY TABLE IF NOT EXISTS temptable AS (
SELECT GROUP_CONCAT(id) AS ID_DUPLICATES
FROM list
GROUP BY address, email
HAVING COUNT(*)>1
);
SELECT d.*
FROM list AS d, temptable AS t
WHERE FIND_IN_SET(d.id, t.ID_DUPLICATES)
ORDER BY d.id;

Find duplicate Records:
Suppose we have table : Student
student_id int
student_name varchar
Records:
+------------+---------------------+
| student_id | student_name |
+------------+---------------------+
| 101 | usman |
| 101 | usman |
| 101 | usman |
| 102 | usmanyaqoob |
| 103 | muhammadusmanyaqoob |
| 103 | muhammadusmanyaqoob |
+------------+---------------------+
Now we want to see duplicate records
Use this query:
select student_name,student_id ,count(*) c from student group by student_id,student_name having c>1;
+---------------------+------------+---+
| student_name | student_id | c |
+---------------------+------------+---+
| usman | 101 | 3 |
| muhammadusmanyaqoob | 103 | 2 |
+---------------------+------------+---+

To quickly see the duplicate rows you can run a single simple query
Here I am querying the table and listing all duplicate rows with same user_id, market_place and sku:
select user_id, market_place,sku, count(id)as totals from sku_analytics group by user_id, market_place,sku having count(id)>1;
To delete the duplicate row you have to decide which row you want to delete. Eg the one with lower id (usually older) or maybe some other date information. In my case I just want to delete the lower id since the newer id is latest information.
First double check if the right records will be deleted. Here I am selecting the record among duplicates which will be deleted (by unique id).
select a.user_id, a.market_place,a.sku from sku_analytics a inner join sku_analytics b where a.id< b.id and a.user_id= b.user_id and a.market_place= b.market_place and a.sku = b.sku;
Then I run the delete query to delete the dupes:
delete a from sku_analytics a inner join sku_analytics b where a.id< b.id and a.user_id= b.user_id and a.market_place= b.market_place and a.sku = b.sku;
Backup, Double check, verify, verify backup then execute.
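The preview-then-delete routine described above can be sketched as follows. It runs on SQLite from Python rather than MySQL, so the delete is written with a correlated EXISTS instead of MySQL's DELETE ... JOIN syntax (a swapped-in formulation, not the answer's statement), and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sku_analytics (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "market_place TEXT, sku TEXT)"
)
conn.executemany(
    "INSERT INTO sku_analytics VALUES (?, ?, ?, ?)",
    [
        (1, 10, "US", "SKU-1"),  # hypothetical sample data
        (2, 10, "US", "SKU-1"),
        (3, 10, "US", "SKU-1"),
        (4, 10, "UK", "SKU-1"),
        (5, 11, "US", "SKU-2"),
    ],
)

# Step 1: preview the rows that have a newer twin (same key, higher id).
preview_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT a.id
        FROM sku_analytics a
        INNER JOIN sku_analytics b
          ON a.id < b.id
         AND a.user_id = b.user_id
         AND a.market_place = b.market_place
         AND a.sku = b.sku
        ORDER BY a.id
        """
    )
]

# Step 2: delete exactly those rows, keeping the highest id per key.
conn.execute(
    """
    DELETE FROM sku_analytics
    WHERE EXISTS (SELECT 1
                  FROM sku_analytics b
                  WHERE b.id > sku_analytics.id
                    AND b.user_id = sku_analytics.user_id
                    AND b.market_place = sku_analytics.market_place
                    AND b.sku = sku_analytics.sku)
    """
)

remaining_ids = [
    row[0] for row in conn.execute("SELECT id FROM sku_analytics ORDER BY id")
]
print(preview_ids, remaining_ids)
```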

SELECT * FROM bookings
WHERE DATE(created_at) = '2022-01-11'
AND code IN (
SELECT code FROM bookings
GROUP BY code
HAVING COUNT(code) > 1
) ORDER BY id DESC

Would go with something like this:
SELECT t1.firstname, t1.lastname, t1.address FROM list t1
INNER JOIN list t2
WHERE
t1.id < t2.id AND
t1.address = t2.address;

select address from list where address = any (select address from (select address, count(id) cnt from list group by address having cnt > 1 ) as t1) order by address
the inner sub-query returns rows with duplicate address then
the outer sub-query returns the address column for address with duplicates.
the outer sub-query must return only one column because it used as operand for the operator '= any'

Powerlord answer is indeed the best and I would recommend one more change: use LIMIT to make sure db would not get overloaded:
SELECT firstname, lastname, list.address FROM list
INNER JOIN (SELECT address FROM list
GROUP BY address HAVING count(id) > 1) dup ON list.address = dup.address
LIMIT 10
It is a good habit to use LIMIT if there is no WHERE and when making joins. Start with small value, check how heavy the query is and then increase the limit.
