Find duplicate records in table mysql

I have a table like the following:
| ID | Short Name | Long Name |
|----|------------|-----------|
| 1 | s1 | l2 |
| 2 | s1 | l2 |
| 3 | s1 | l2 |
| 4 | s5 | l6 |
| .. | ... | |
I want to get all records that share the same Short Name and Long Name. I need their shared short name, long name, and 3 duplicates' IDs. For this particular example, I want {s1, l2, 1,2,3}

This is a fairly simple problem to solve. Basically what you want to do is write a subquery that counts the number of rows that match on your specified field for each row in your query. I have included a few examples below.
Find all rows that are duplicates and match on both name fields
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.ShortName = TableName.ShortName AND T2.LongName = TableName.LongName) > 1;
Find all rows that are duplicates and match on the short name
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.ShortName = TableName.ShortName) > 1;
Find all rows that are duplicates and match on the long name
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.LongName = TableName.LongName) > 1;

Simply use a self join of your table and select those rows where the two names are equal and the IDs differ:
SELECT *
FROM <table> t1, <table> t2
WHERE t1.id <> t2.id
AND t1.short_name = t2.short_name
AND t1.long_name = t2.long_name;

You can use EXISTS to check whether any other row has the same values but a different ID.
select t1.* from table_name t1
where exists (
select 1 from table_name t2
where
t1.ID <> t2.ID
and t1.`Short Name` = t2.`Short Name`
and t1.`Long Name` = t2.`Long Name`
);
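None of the queries above collapses the IDs into one row per duplicate group, which is the {s1, l2, 1,2,3} shape the question asks for. Here is a minimal sketch of one way to get that shape, using GROUP BY with a string aggregate. It is written in Python against an in-memory SQLite database only so that it can be run as-is; the table name names and the columns short_name and long_name are assumptions. In MySQL the aggregate would be written GROUP_CONCAT(id ORDER BY id).

```python
import sqlite3


def duplicate_groups(conn):
    """Return (short_name, long_name, sorted ids) for each name pair seen more than once."""
    rows = conn.execute(
        """
        SELECT short_name, long_name, group_concat(id) AS ids
        FROM names
        GROUP BY short_name, long_name
        HAVING COUNT(*) > 1
        """
    ).fetchall()
    # SQLite does not promise an order inside group_concat, so sort in Python.
    return [(s, l, sorted(int(i) for i in ids.split(","))) for s, l, ids in rows]


# Sample data copied from the question; the schema itself is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (id INTEGER PRIMARY KEY, short_name TEXT, long_name TEXT)"
)
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [(1, "s1", "l2"), (2, "s1", "l2"), (3, "s1", "l2"), (4, "s5", "l6")],
)

print(duplicate_groups(conn))  # [('s1', 'l2', [1, 2, 3])]
```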

Related

Find Values between two different columns in a table

table one
+----------+----------+
| column A | Column B |
+----------+----------+
| 2        | 4        |
| 3        | 5        |
| 1        | 2        |
| 1        | 2        |
| 8        | 7        |
+----------+----------+
Output
+---+---+
| 1 | 2 |
| 1 | 2 |
+---+---+
I want to print only the rows shown above (every duplicated record), without a COUNT column. Could someone show an example? Please help.
How about the WHERE clause below?
select * from t where columnA=1 and columnB=2
or
select columnA,columnB from t
group by columnA,columnB
having count(*)>1
or you can use exists
select t1.* from t t1 where exists
(select 1 from t t2 where t2.columnA=t1.columnA
and t2.columnB=t1.columnB group by columnA,columnB
having count(*)>1
)
You possibly want only those rows which are duplicate. If you don't have Window Functions available in your MySQL version, you can do the following:
SELECT
t.*
FROM your_table AS t
JOIN (SELECT columnA, columnB
FROM your_table
GROUP BY columnA, columnB
HAVING COUNT(*) > 1) AS dt
ON dt.columnA = t.columnA AND dt.columnB = t.columnB
Details: In a derived table, we get all combinations of columnA and columnB that occur in more than one row (HAVING COUNT(*) > 1).
Now, we simply join this result-set back to the main table, to get those rows only.
Note: This approach would not be needed if you want to fetch only these two columns. A simple Group By with Having would suffice, as suggested in other answer(s). However, if you have more columns in the table, and you will need to fetch all of them, and not just the columns (used to determine duplicates); you will need to use this approach.
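The derived-table join described above can be tried out with the following sketch. It is only an illustration: it runs the same query shape against an in-memory SQLite database from Python rather than against MySQL, and the extra note column is a made-up addition to show that non-key columns come back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "note" is a hypothetical extra column; columnA/columnB follow the question.
conn.execute("CREATE TABLE your_table (columnA INTEGER, columnB INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(2, 4, "a"), (3, 5, "b"), (1, 2, "c"), (1, 2, "d"), (8, 7, "e")],
)


def duplicate_rows(conn):
    """Return every full row whose (columnA, columnB) pair occurs more than once."""
    return conn.execute(
        """
        SELECT t.columnA, t.columnB, t.note
        FROM your_table AS t
        JOIN (SELECT columnA, columnB
              FROM your_table
              GROUP BY columnA, columnB
              HAVING COUNT(*) > 1) AS dt
          ON dt.columnA = t.columnA AND dt.columnB = t.columnB
        ORDER BY t.note
        """
    ).fetchall()


print(duplicate_rows(conn))  # [(1, 2, 'c'), (1, 2, 'd')]
```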
You can use the IN operator with a grouped subquery:
select *
from tab
where ( columnA, columnB) in
(
select columnA, columnB
from tab
group by columnA, columnB
having count(*) > 1
);
or use a join against the same grouped subquery:
select t1.columnA, t1.columnB
from tab t1
join
(
select columnA, columnB
from tab
group by columnA, columnB
having count(*) > 1
) t2
on ( t1.columnA = t2.columnA and t1.columnB = t2.columnB );
I would use EXISTS, if the table has a primary key column:
SELECT t.*
FROM `table` t
WHERE EXISTS (SELECT 1 FROM `table` t1 WHERE t1.col1 = t.col1 AND t1.col2 = t.col2 AND t1.pk <> t.pk);

MYSQL Updating row to maximum value of similar rows

I have a table like this in MYSQL:
ID | NAME | VALUE |
----------------------------
1 | Bob | 1 |
2 | Bob | 2 |
3 | Jack | 5 |
4 | Jack | 8 |
5 | Jack | 10 |
and I'm trying to update the VALUE column to the highest value of rows with same NAME. So the result should be:
ID | NAME | VALUE |
----------------------------
1 | Bob | 2 |
2 | Bob | 2 |
3 | Jack | 10 |
4 | Jack | 10 |
5 | Jack | 10 |
I managed to get the max value like this:
SELECT MAX(Value) max FROM `table` GROUP BY Name having count(*) >1 AND MAX(Value) != MIN(Value)
But can't figure out how to put it in my update
Update table set Value = (SELECT MAX(Value) max FROM `table` GROUP BY Name having count(*) >1 AND MAX(Value) != MIN(Value))
Doesn't work. I'd appreciate any help.
This is easier than other answers are making it.
UPDATE MyTable AS t1 INNER JOIN MyTable AS t2 USING (Name)
SET t1.Value = GREATEST(t1.Value, t2.Value);
You don't have to find the largest value. You just have to join each row to the set of rows with the same name, and set the Value to the greater Value of the two joined rows. This is a no-op on some rows, but it will apply to every row in turn.
http://sqlfiddle.com/#!9/f79a3/1
UPDATE t1
INNER JOIN (SELECT name, MAX(`value`) max_value
FROM t1 GROUP BY name) t2
ON t1.name = t2.name
SET t1.value = t2.max_value;
Create a temporary table holding each NAME and its maximum VALUE as follows:
CREATE TEMPORARY TABLE TABLE1 AS
SELECT NAME, MAX(Value) AS value FROM `table` GROUP BY Name HAVING COUNT(*) > 1
AND MAX(Value) != MIN(Value);
Use this temporary table to do your update as follows:
UPDATE `table` AS Table_A
INNER JOIN TABLE1 AS Table_B
ON Table_A.NAME = Table_B.NAME
SET Table_A.value = Table_B.value;
This code is somewhat of an approximation, as I am more familiar with SQL in general than with MySQL.
Let me know if this doesn't help.
A simple left join would do the trick.
Try this out and let me know if you have any questions.
select a.id,a.name,b.value
from
`table` a
left join
(select name,max(value) as value from `table` group by name) b
on a.name=b.name;
You may use this query. The table is joined with a subquery (table t2) that contains the results you want to update your table with:
UPDATE `table` t1,
(SELECT Name, MAX(Value) maxv, MIN(Value) minv
FROM `table`
GROUP BY Name
HAVING COUNT(*)>1 AND maxv != minv) t2
SET t1.Value = t2.maxv
WHERE t1.Name = t2.Name;
If you want to know how the values will be updated, you can first run an equivalent SELECT query:
SELECT t1.*, t2.maxv
FROM `table` t1,
(SELECT Name, MAX(Value) maxv, MIN(Value) minv
FROM `table`
GROUP BY Name
HAVING COUNT(*)>1 AND maxv != minv) t2
WHERE t1.Name = t2.Name;
This query will display all the fields of table, followed by the new value maxv. You can check the current value and the new value, and if it looks fine, you may run the UPDATE query.
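As a rough illustration of that preview step, the sketch below runs an equivalent SELECT on the question's sample data. The assumptions: it uses Python with an in-memory SQLite database instead of MySQL, the table is named people, and the HAVING clause repeats the aggregates rather than relying on the maxv/minv aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name "people" is an assumption; rows are the question's sample data.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Bob", 1), (2, "Bob", 2), (3, "Jack", 5), (4, "Jack", 8), (5, "Jack", 10)],
)


def preview_update(conn):
    """Return (id, name, current value, value the UPDATE would write) per row."""
    return conn.execute(
        """
        SELECT t1.id, t1.name, t1.value, t2.maxv
        FROM people AS t1
        JOIN (SELECT name, MAX(value) AS maxv
              FROM people
              GROUP BY name
              HAVING COUNT(*) > 1 AND MAX(value) != MIN(value)) AS t2
          ON t1.name = t2.name
        ORDER BY t1.id
        """
    ).fetchall()


for row in preview_update(conn):
    print(row)
```

Each printed row pairs the current value with the per-name maximum, so rows where the two already agree would be left unchanged by the UPDATE.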

MySQL find duplicates in multiple columns

I have a table with user IDs split into 2 columns. (To explain this a little more, we capture the IDs of participants by scanning barcodes. Sometimes the barcode scanner function doesn't work for whatever reason, so we also allow manual entry of the ID, IF the barcode scanner doesn't work.) This results in data like the following:
+------+-----------+
| ID | ID_MANUAL |
+------+-----------+
| A | NULL |
| NULL | A |
| B | NULL |
| B | NULL |
| NULL | C |
| C | NULL |
| NULL | D |
| NULL | D |
+------+-----------+
I want to find all of the duplicate IDs, taking both columns into account. It's easy to find the duplicates that are only in 1 column ("B" and "D"). But how do I find the duplicates "A" and "C"? Ideally, the query would find and return ALL duplicates (A,B,C, and D).
Thanks!
Try this:
SELECT DUP.* FROM (SELECT ID_MANUAL FROM yourtable) ORI
LEFT JOIN yourtable DUP ON DUP.ID = ORI.ID_MANUAL WHERE DUP.ID IS NOT NULL
A word of advice: a field named ID must be unique and not null. But if you have this structure, you can try this:
SELECT id
FROM yourtable t
WHERE id is not null
AND
(SELECT COUNT(*)
FROM yourtable t2
WHERE t2.id = t.id) +
(SELECT COUNT(*)
FROM yourtable t3
WHERE t3.id_manual = t.id) > 1
UNION
SELECT id_manual
FROM yourtable t
WHERE id_manual is not null
AND
(SELECT COUNT(*)
FROM yourtable t2
WHERE t2.id = t.id_manual) +
(SELECT COUNT(*)
FROM yourtable t3
WHERE t3.id_manual = t.id_manual) > 1
You could try UNION ALL here:
select id,count(*)
from
(
select id
from yourtable
union all
select id_manual as id
from yourtable
) a
group by id
having count(*) >1;
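One detail worth checking with the UNION ALL approach: the NULLs from both columns end up in a single group of their own, so a NULL row can show up among the "duplicates" unless it is filtered out. The sketch below shows the same query with an added IS NOT NULL filter on the question's sample data. It runs on an in-memory SQLite database from Python as a stand-in for MySQL, and the table name scans is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table name "scans" is an assumption; rows are the question's sample data.
conn.execute("CREATE TABLE scans (id TEXT, id_manual TEXT)")
conn.executemany(
    "INSERT INTO scans VALUES (?, ?)",
    [
        ("A", None), (None, "A"), ("B", None), ("B", None),
        (None, "C"), ("C", None), (None, "D"), (None, "D"),
    ],
)


def duplicate_ids(conn):
    """Stack both ID columns into one, drop NULLs, and keep IDs seen more than once."""
    return conn.execute(
        """
        SELECT id, COUNT(*) AS n
        FROM (SELECT id FROM scans
              UNION ALL
              SELECT id_manual AS id FROM scans) AS a
        WHERE id IS NOT NULL
        GROUP BY id
        HAVING COUNT(*) > 1
        ORDER BY id
        """
    ).fetchall()


print(duplicate_ids(conn))  # [('A', 2), ('B', 2), ('C', 2), ('D', 2)]
```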
try:
select id, count(*)
from
(
select id
from data
where id_manual is null
union all
select id_manual as id
from data
where id is null
) a
group by id
having count(*) > 1;
and
select id, id_manual
from data
group by id, id_manual
having count(*) > 1;
You can do this with a simple JOIN, using COALESCE and DISTINCT if you have a surrogate auto-increment primary key:
SELECT DISTINCT s2.pk, s2.ID, s2.ID_MANUAL
FROM scans s1
JOIN scans s2
ON COALESCE(s2.ID, s2.ID_MANUAL) = COALESCE(s1.ID, s1.ID_MANUAL)
AND s2.pk > s1.pk
This will exclude the original record, so you could delete the records returned in this result set.

MySql select next lower number without using limit

Is it possible to select the next lower number from a table without using LIMIT?
E.g.: if my table had 10, 3, 2, 1, I'm trying to select * from table where col < 10.
The result I'm expecting is 3. I know I can use LIMIT 1, but can it be done without that?
Try
SELECT MAX(no) no
FROM table1
WHERE no < 10
Output:
| NO |
------
| 3 |
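A small sketch of the MAX-below-a-threshold idea, for anyone who wants to try it quickly. It uses Python with an in-memory SQLite database rather than MySQL, and the column is named num here (an assumption, chosen to avoid any keyword clash); the helper name next_lower is also made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column name "num" is an assumption; the values come from the question.
conn.execute("CREATE TABLE table1 (num INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(10,), (3,), (2,), (1,)])


def next_lower(conn, threshold):
    """Return the largest value strictly below threshold, or None if there is none."""
    row = conn.execute(
        "SELECT MAX(num) FROM table1 WHERE num < ?", (threshold,)
    ).fetchone()
    # MAX over an empty set yields NULL, which sqlite3 maps to None.
    return row[0]


print(next_lower(conn, 10))  # 3
print(next_lower(conn, 1))   # None
```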
Try this query
SELECT
    *
FROM
    (SELECT
        @rid:=@rid+1 as rId,
        a.*
    FROM
        tbl a
    JOIN
        (SELECT @rid:=0) b
    ORDER BY
        id DESC) tmp
WHERE rId=2;
SQL FIDDLE:
| RID | ID | TYPE | DETAILS |
------------------------------------
| 2 | 28 | Twitter | @sqlfiddle5 |
Another approach
select a.* from supportContacts a inner join
(select max(id) as id
from supportContacts
where
id in (select id from supportContacts where id not in
(select max(id) from supportContacts)))b
on a.id=b.id
SQL FIDDLE:
| ID | TYPE | DETAILS |
------------------------------
| 28 | Twitter | @sqlfiddle5 |
Alternatively, this query will always get the second highest number based on the inner where clause.
SELECT *
FROM
(
SELECT t.col,
(
SELECT COUNT(DISTINCT t2.col)
FROM tableName t2
WHERE t2.col >= t.col
) AS rnk -- RANK is a reserved word in MySQL 8.0, so use another alias
FROM tableName t
WHERE col <= 10
) xx
WHERE rnk = 2 -- <<== means second highest
A correlated count of distinct values can pick the n-th value from either end of the table. The following query returns the second-lowest distinct value:
SELECT distinct col FROM table1 a
WHERE 2 = (SELECT count(DISTINCT(b.col)) FROM table1 b WHERE a.col >= b.col);
To get the third-lowest value, pass 3 in place of 2 in the WHERE clause.
To get the second-highest value instead (the next number below the maximum, 3 in the question's example), change the condition in the inner query's WHERE clause to
a.col <= b.col
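To see how the direction of the inner comparison changes the result, here is a sketch that wraps the correlated-count query in a small helper. It is an illustration only: it runs on an in-memory SQLite database from Python instead of MySQL, the helper name nth_distinct is made up, and a duplicate 3 has been added to the question's values to show that duplicates do not shift the ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col INTEGER)")
# The question's values plus one duplicate (an addition for this example).
conn.executemany("INSERT INTO table1 VALUES (?)", [(10,), (3,), (3,), (2,), (1,)])


def nth_distinct(conn, n, highest=True):
    """Return the n-th highest (or n-th lowest) distinct value, or None."""
    # a.col <= b.col counts values at or above a.col (rank from the top);
    # a.col >= b.col counts values at or below a.col (rank from the bottom).
    op = "<=" if highest else ">="
    sql = f"""
        SELECT DISTINCT a.col
        FROM table1 AS a
        WHERE ? = (SELECT COUNT(DISTINCT b.col)
                   FROM table1 AS b
                   WHERE a.col {op} b.col)
    """
    row = conn.execute(sql, (n,)).fetchone()
    return row[0] if row else None


print(nth_distinct(conn, 2, highest=True))   # 3
print(nth_distinct(conn, 2, highest=False))  # 2
```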

mysql query to identify and delete duplicates based on timestamp

I am trying to build a MySQL query to list all column a's that have a duplicate column b from a single table. The trick is that I have a timestamp on the rows, so I need to identify which is the older of the duplicates so I can delete it. Any help would be appreciated.
Just an example: this query returns the duplicate posts; now you just need to execute the delete.
id| title | text_desc | created
-------------------------------------------------------
1 | The title | description here |2012-02-21 10:58:58
2 | The title | description here 1 |2012-02-21 10:58:58
3 | The title | description here 3 |2012-02-21 10:58:58
select bad_rows.*
from posts as bad_rows
inner join (
select title, MIN(id) as min_id
from posts
group by title
having count(*) > 1
) as good_rows on good_rows.title = bad_rows.title
and good_rows.min_id <> bad_rows.id;
Here are the returned rows:
id| title | text_desc | created
-------------------------------------------------------
2 | The title | description here 1 |2012-02-21 10:58:58
3 | The title | description here 3 |2012-02-21 10:58:58
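The keep-the-lowest-id query above can be reproduced with the sketch below. It is an illustration under a few assumptions: it runs on an in-memory SQLite database from Python rather than MySQL, the fourth row ("Another title") is a made-up non-duplicate added for contrast, and the helper name redundant_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, text_desc TEXT, created TEXT)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, "The title", "description here", "2012-02-21 10:58:58"),
        (2, "The title", "description here 1", "2012-02-21 10:58:58"),
        (3, "The title", "description here 3", "2012-02-21 10:58:58"),
        # Hypothetical extra row with a unique title; it should not be reported.
        (4, "Another title", "unique", "2012-02-22 09:00:00"),
    ],
)


def redundant_ids(conn):
    """Return ids of duplicate-title rows, excluding the lowest id in each group."""
    rows = conn.execute(
        """
        SELECT bad_rows.id
        FROM posts AS bad_rows
        INNER JOIN (SELECT title, MIN(id) AS min_id
                    FROM posts
                    GROUP BY title
                    HAVING COUNT(*) > 1) AS good_rows
          ON good_rows.title = bad_rows.title
         AND good_rows.min_id <> bad_rows.id
        ORDER BY bad_rows.id
        """
    ).fetchall()
    return [r[0] for r in rows]


print(redundant_ids(conn))  # [2, 3]
```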
Here's your query:
DELETE t1
FROM tablename t1
JOIN tablename t2
ON t2.cola = t1.cola AND t2.colb = t1.colb
AND t2.timecol > t1.timecol
WHERE t1.cola = t1.colb;
The join matches each record where cola = colb to other rows with the same values and a later date, and the DELETE removes those older records. It is written as a multi-table DELETE because MySQL does not allow a subquery in the WHERE clause to select from the table being deleted from.
If you're looking to remove duplicate cola, then this is the query:
DELETE t1
FROM tablename t1
JOIN tablename t2
ON t2.cola = t1.cola
AND t2.timecol > t1.timecol;
SELECT FOOCODE, COUNT(*) AS DUPS
FROM `TABLE`
GROUP BY FOOCODE
HAVING COUNT(FOOCODE) > 1;
The above query will return all the duplicates. Is this what you are looking for?