MySQL look for duplicates on multiple fields

I have a MySQL database with the following fields:
id, email, first_name, last_name
I want to run an SQL query that will display rows where the same id and email combination exists more than once.
Basically, each id and email combination should have only one row, and I would like to run a query to see if there are any duplicates.

If you just want to return the id and email that are duplicated, you can just use a GROUP BY query:
SELECT id, email
FROM yourtable
GROUP BY id, email
HAVING COUNT(*)>1
If you also want to return the full rows, join the previous query back to the table:
SELECT yourtable.*
FROM
yourtable INNER JOIN (
SELECT id, email
FROM yourtable
GROUP BY id, email
HAVING COUNT(*)>1
) s
ON yourtable.id = s.id AND yourtable.email=s.email
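To try both queries without a MySQL server, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER, email TEXT, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", "Ann", "Lee"),
        (1, "a@example.com", "Anne", "Lee"),  # same (id, email) pair again
        (2, "b@example.com", "Bob", "Ray"),
        (3, "a@example.com", "Al", "Kim"),    # same email, different id
    ],
)

# (id, email) pairs that occur more than once.
dup_pairs = conn.execute(
    "SELECT id, email FROM yourtable GROUP BY id, email HAVING COUNT(*) > 1"
).fetchall()

# Full rows that belong to those pairs.
dup_rows = conn.execute(
    """
    SELECT t.*
    FROM yourtable t
    INNER JOIN (
        SELECT id, email FROM yourtable
        GROUP BY id, email HAVING COUNT(*) > 1
    ) s ON t.id = s.id AND t.email = s.email
    ORDER BY t.first_name
    """
).fetchall()

print(dup_pairs)
print(dup_rows)
```

Only the pair (1, a@example.com) is reported; the row with id 3 shares the email but not the id, so it does not count as a duplicate of the pair.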

You'll want something like this:
select field1,field2,field3, count(*)
from table_name
group by field1,field2,field3
having count(*) > 1
See also this question.

You can search for all ids that meet a specific count by grouping them and using a having clause like this:
SELECT id, COUNT(*) AS totalCount
FROM myTable
GROUP BY id
HAVING COUNT(*) > 1;
Anything this query returns has a duplicate. To check for duplicate emails, change the column you select and group by.

Related

MySQL select 1 of multiple rows with same field based on where clause

I need to be able to select my entire table but where there are duplicate id's, only select 1 of them based on the data in a different field.
For example if my table looks like this
I want to select all rows, but if there are 2 of the same id, select only the row with Billing as the address type.
You can do it this way:
select * from Table1
where (AddressType='Billing') or
(AddressType='Shipping' and ID not in (select ID from Table1 where AddressType='Billing'))
order by ID
Explanation:
1st condition is to filter only Billing address types.
2nd condition is to filter Shipping address types which do not have Billing with the same ID.
Result in SQL Fiddle
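A minimal sketch of the two conditions in action, using Python's sqlite3 module as a stand-in for MySQL. The City column and the sample rows are hypothetical, since the table from the question is not shown.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; City and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, AddressType TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "Billing", "Oslo"),
        (1, "Shipping", "Bergen"),  # dropped: ID 1 also has a Billing row
        (2, "Shipping", "Lyon"),    # kept: ID 2 has no Billing row
        (3, "Billing", "Porto"),
    ],
)

rows = conn.execute(
    """
    SELECT * FROM Table1
    WHERE AddressType = 'Billing'
       OR (AddressType = 'Shipping'
           AND ID NOT IN (SELECT ID FROM Table1 WHERE AddressType = 'Billing'))
    ORDER BY ID
    """
).fetchall()

print(rows)
```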
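Try this (it relies on 'Billing' sorting before 'Shipping', so MIN picks Billing when both exist for an ID):
SELECT T.*
FROM YOUR_TABLE T
JOIN (SELECT ID, MIN(ADDRESSTYPE) AS ADDRESSTYPE
FROM YOUR_TABLE
GROUP BY ID) X
ON T.ID = X.ID AND T.ADDRESSTYPE = X.ADDRESSTYPE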

SQL - Get all rows of table one and count of rows

How do I create an SQL query which gets all the rows of the table and count of rows inserted under e-mail?
I tried something like this, but this groups the rows and so I don't get all the rows.
SELECT *, COUNT(email) AS 'count' FROM adverts GROUP BY email
select a1.*, a2.count
from adverts a1
join
(
SELECT email, COUNT(*) AS 'count'
FROM adverts
GROUP BY email
) a2 on a1.email = a2.email
try this,
select *,
(select count(Email)
from adverts where adverts.Email =a.Email) as EmailCount
from adverts as a
or this (window functions need MySQL 8.0 or later)
SELECT *, COUNT(email) OVER (PARTITION BY email) as EmailCount FROM adverts

Count the number of occurrences of each email address

I have a mySQL workbench table called table_contacts, with the fields:
user_id and PrimaryEmail
I want to write a query that, for each row in the table will return:
User_id, PrimaryEmail and Number of occurrences of that email address in the table. So I want the following table to be returned:
I know I need to use a sub query. So far I have:
select user_id, PrimaryEmail,
(select Count(PrimaryEmail) from table_contacts where PrimaryEmail = table_contacts.PrimaryEmail)
from table_contacts
But this is returning the count of all email addresses in the table.
What am I doing wrong?
The solutions from Simone and Grażynka group by email address, so you lose rows whenever an address appears more than once.
To display every row together with the count for its email, you can do:
SELECT t1.user_id, t1.PrimaryEmail,
(SELECT COUNT(*) FROM table_contacts t2 WHERE t2.PrimaryEmail = t1.PrimaryEmail)
FROM table_contacts t1
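The reason the query in the question counts every row is that, without aliases, table_contacts.PrimaryEmail inside the subquery refers to the subquery's own copy of the table, so the condition compares each row with itself. The sketch below contrasts the two versions using Python's sqlite3 module as a stand-in for MySQL; the sample rows are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_contacts (user_id INTEGER, PrimaryEmail TEXT)")
conn.executemany(
    "INSERT INTO table_contacts VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
)

# No aliases: both sides of the comparison resolve to the inner table,
# so the subquery counts every row with a non-NULL email.
unaliased = conn.execute(
    """
    SELECT user_id, PrimaryEmail,
           (SELECT COUNT(PrimaryEmail) FROM table_contacts
            WHERE PrimaryEmail = table_contacts.PrimaryEmail)
    FROM table_contacts
    ORDER BY user_id
    """
).fetchall()

# With aliases: t1 is the outer row, t2 the rows being counted.
aliased = conn.execute(
    """
    SELECT t1.user_id, t1.PrimaryEmail,
           (SELECT COUNT(*) FROM table_contacts t2
            WHERE t2.PrimaryEmail = t1.PrimaryEmail)
    FROM table_contacts t1
    ORDER BY t1.user_id
    """
).fetchall()

print([row[2] for row in unaliased])
print([row[2] for row in aliased])
```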
try this:
select user_id, PrimaryEmail, Count(PrimaryEmail)
from table_contacts
group by PrimaryEmail
in SQL tryit editor a similar query would be
SELECT customerid,count(country),country FROM [Customers] group by country
but in this case you receive only the count for each email (one row per email). Other (better) solutions have been proposed if you want to list all the rows with the count added.
Try this one:
Select user_id, primaryemail, count(*)
from table_contacts
group by user_id, primaryemail
You need a group by, not a subquery
something like
select user_id, PrimaryEmail, Count(PrimaryEmail)
from table_contacts
group by PrimaryEmail
This should do the job:
select t1.user_id, t1.PrimaryEmail, count(*)
from table_contacts t1
join table_contacts t2 on t1.PrimaryEmail = t2.PrimaryEmail
group by t1.user_id, t1.PrimaryEmail
order by t1.user_id;

Find most recent duplicates ID with MySQL

I used to do
SELECT email, COUNT(email) AS occurences
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);
to find duplicates based on their email.
But now I'd need their ID to be able to define which one to remove exactly.
The second constraint is: I want only the LAST INSERTED duplicates.
So if there are 2 entries with test@test.com as an email and their IDs are respectively 40 and 12782, it would delete only the 12782 entry and keep the 40 one.
Any ideas on how I could do this? I've been mashing SQL for about an hour and can't seem to find exactly how to do this.
Thanks and have a nice day!
Well, you sort of answer your question. You seem to want max(id):
SELECT email, COUNT(email) AS occurences, max(id)
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);
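You can delete the others with a DELETE statement. Note that the statements below keep the row with the highest id for each email and remove the older ones; to keep the oldest row instead, select min(id) and delete the rows whose id is greater than it. Delete with join has a tricky syntax where you have to list the table name first and then specify the from clause with the join: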
delete wineries
from wineries join
(select email, max(id) as maxid
from wineries
group by email
having count(*) > 1
) we
on we.email = wineries.email and
wineries.id < we.maxid;
Or writing this as an exists clause:
delete from wineries
where exists (select 1
from (select email, max(id) as maxid
from wineries
group by email
) we
where we.email = wineries.email and wineries.id < we.maxid
)
select email, max(id), COUNT(email) AS occurences
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);
delete from wineries
where id not in
(
select * from
(
select min(id)
from wineries
group by email
) x
)
You need a subquery to trick MySQL to delete from a table it is selecting from at the same time.
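To see which rows survive this DELETE, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. SQLite does not have MySQL's restriction on selecting from the table being deleted from, so the extra derived table is harmless there; the sketch only illustrates the keep-the-lowest-id logic. The sample rows reuse the ids from the question plus one invented row.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; row 41 is an invented non-duplicate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wineries (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO wineries VALUES (?, ?)",
    [(40, "test@test.com"), (41, "other@test.com"), (12782, "test@test.com")],
)

# Keep the lowest id per email; delete every other row.
conn.execute(
    """
    DELETE FROM wineries
    WHERE id NOT IN (
        SELECT * FROM (SELECT MIN(id) FROM wineries GROUP BY email) x
    )
    """
)

remaining = conn.execute("SELECT id, email FROM wineries ORDER BY id").fetchall()
print(remaining)
```

Running a SELECT with the same WHERE clause first is a cheap way to review what would be removed before deleting anything.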
DELETE duplicates.*
FROM wineries
JOIN wineries AS duplicates USING (email)
WHERE duplicates.id < wineries.id;
play with it on sqlfiddle.com
This is the simplest option:
DELETE FROM wineries
WHERE id NOT IN
(
SELECT id FROM
(
SELECT MIN(id) id
FROM wineries
GROUP BY email
) x
);
This keeps only the first inserted record for each email address; all other records are deleted. The extra derived table (x) is needed because MySQL does not allow a DELETE to select directly from its own target table in a subquery. Credit for this answer should go to @juergen d since this is just a revised version of his answer.

Mysql select distinct

I am trying to select the duplicate rows in a MySQL table. It works, but the query only lets me select the field I used with DISTINCT, not all the fields. Here are the queries, for better understanding:
mysql_query("SELECT DISTINCT ticket_id FROM temp_tickets ORDER BY ticket_id")
mysql_query("SELECT * , DISTINCT ticket_id FROM temp_tickets ORDER BY ticket_id")
The first one works fine.
When I try to select all fields with the second one, I get errors.
I am trying to select the latest of the duplicates. Say ticket_id 127 appears 3 times, on row ids 7, 8 and 9; I want to select it once, with the latest entry (9 in this case), and the same applies to all the other ticket_ids.
Any ideas?
Thanks
DISTINCT is not a function that applies only to some columns. It's a query modifier that applies to all columns in the select-list.
That is, DISTINCT reduces rows only if all columns are identical to the columns of another row.
DISTINCT must follow immediately after SELECT (along with other query modifiers, like SQL_CALC_FOUND_ROWS). Then following the query modifiers, you can list columns.
RIGHT: SELECT DISTINCT foo, ticket_id FROM table...
Output a row for each distinct pairing of values across ticket_id and foo.
WRONG: SELECT foo, DISTINCT ticket_id FROM table...
If there are three distinct values of ticket_id, would this return only three rows? What if there are six distinct values of foo? Which three values of the six possible values of foo should be output?
It's ambiguous as written.
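A small sketch of the point that DISTINCT applies to the whole select list, using Python's sqlite3 module as a stand-in for MySQL. The foo values and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_tickets (id INTEGER PRIMARY KEY, ticket_id INTEGER, foo TEXT)"
)
conn.executemany(
    "INSERT INTO temp_tickets VALUES (?, ?, ?)",
    [(7, 127, "x"), (8, 127, "y"), (9, 127, "y"), (10, 128, "x")],
)

# One column in the select list: one row per distinct ticket_id.
distinct_ticket = conn.execute(
    "SELECT DISTINCT ticket_id FROM temp_tickets ORDER BY ticket_id"
).fetchall()

# Two columns: one row per distinct (foo, ticket_id) pairing.
distinct_pair = conn.execute(
    "SELECT DISTINCT foo, ticket_id FROM temp_tickets ORDER BY ticket_id, foo"
).fetchall()

print(distinct_ticket)
print(distinct_pair)
```

ticket_id 127 shows up twice in the second result because it is paired with two different foo values.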
Are you looking for "SELECT * FROM temp_tickets GROUP BY ticket_id ORDER BY ticket_id"?
UPDATE
SELECT t.*
FROM
(SELECT ticket_id, MAX(id) as id FROM temp_tickets GROUP BY ticket_id) a
INNER JOIN temp_tickets t ON (t.id = a.id)
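The join above can be checked with a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The status column and the sample rows are hypothetical; the ids 7, 8 and 9 for ticket_id 127 follow the example in the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; status and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_tickets (id INTEGER PRIMARY KEY, ticket_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO temp_tickets VALUES (?, ?, ?)",
    [(7, 127, "open"), (8, 127, "pending"), (9, 127, "closed"), (10, 128, "open")],
)

# The inner query finds the highest id per ticket_id; the join pulls full rows.
latest = conn.execute(
    """
    SELECT t.*
    FROM (SELECT ticket_id, MAX(id) AS id
          FROM temp_tickets GROUP BY ticket_id) a
    INNER JOIN temp_tickets t ON t.id = a.id
    ORDER BY t.id
    """
).fetchall()

print(latest)
```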
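You can use GROUP BY instead of DISTINCT. With DISTINCT, it is hard to select all the columns of the table, whereas with GROUP BY you can get distinct values together with the other fields. Be aware that columns that are neither grouped nor aggregated take their values from an arbitrary row of each group, and MySQL generally rejects such queries when the ONLY_FULL_GROUP_BY SQL mode is enabled.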
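You can use DISTINCT like this, although the parentheses have no effect: DISTINCT still applies to every column in the select list.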
mysql_query("SELECT DISTINCT(ticket_id), column1, column2, column3
FROM temp_tickets
ORDER BY ticket_id");
use a subselect:
http://forums.asp.net/t/1470093.aspx