MySQL find duplicates in multiple columns - mysql

I have a table with user IDs split into 2 columns. (To explain this a little more, we capture the IDs of participants by scanning barcodes. Sometimes the barcode scanner function doesn't work for whatever reason, so we also allow manual entry of the ID, IF the barcode scanner doesn't work.) This results in data like the following:
+------+-----------+
| ID | ID_MANUAL |
+------+-----------+
| A | NULL |
| NULL | A |
| B | NULL |
| B | NULL |
| NULL | C |
| C | NULL |
| NULL | D |
| NULL | D |
+------+-----------+
I want to find all of the duplicate IDs, taking both columns into account. It's easy to find the duplicates that are only in 1 column ("B" and "D"). But how do I find the duplicates "A" and "C"? Ideally, the query would find and return ALL duplicates (A,B,C, and D).
Thanks!

Try this (it returns the rows whose ID also appears in ID_MANUAL, i.e. the cross-column duplicates such as A and C):
SELECT DUP.* FROM yourtable ORI
INNER JOIN yourtable DUP ON DUP.ID = ORI.ID_MANUAL

A piece of advice: a field named ID should be unique and not null. But if you have this structure, you can try this:
SELECT id
FROM yourtable t
WHERE id is not null
AND
(SELECT COUNT(*)
FROM yourtable t2
WHERE t2.id = t.id) +
(SELECT COUNT(*)
FROM yourtable t3
WHERE t3.id_manual = t.id) > 1
UNION
SELECT id_manual
FROM yourtable t
WHERE id_manual is not null
AND
(SELECT COUNT(*)
FROM yourtable t2
WHERE t2.id = t.id_manual) +
(SELECT COUNT(*)
FROM yourtable t3
WHERE t3.id_manual = t.id_manual) > 1
You can go on Sql Fiddle

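You could try UNION ALL here (the inner filters keep NULL from being grouped and reported as a duplicate value):
select id, count(*)
from
(
select id
from yourtable
where id is not null
union all
select id_manual as id
from yourtable
where id_manual is not null
) a
group by id
having count(*) > 1;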
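As a quick check of the UNION ALL idea, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database (an assumption made only so the example runs without a MySQL server) and counts every non-NULL value across both columns. The function name find_duplicate_ids is made up for illustration.

```python
import sqlite3

def find_duplicate_ids(rows):
    """Return {id: count} for IDs seen more than once across both columns."""
    # SQLite stands in for MySQL here (assumption); the query itself is plain SQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (id TEXT, id_manual TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)
    query = """
        SELECT id, COUNT(*) AS n
        FROM (
            SELECT id FROM yourtable WHERE id IS NOT NULL
            UNION ALL
            SELECT id_manual AS id FROM yourtable WHERE id_manual IS NOT NULL
        ) a
        GROUP BY id
        HAVING COUNT(*) > 1
        ORDER BY id
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result

# Sample data from the question: (ID, ID_MANUAL)
sample = [("A", None), (None, "A"), ("B", None), ("B", None),
          (None, "C"), ("C", None), (None, "D"), (None, "D")]
print(find_duplicate_ids(sample))
```

On the sample data this should report A, B, C and D, each with a count of 2.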

Try:
select id, count(*)
from
(
select id
from data
where id_manual is null
union all
select id_manual as id
from data
where id is null
) a
group by id
having count(*) > 1;
and
select id, id_manual
from data
group by id, id_manual
having count(*) > 1;

You can do this with a simple JOIN, using COALESCE and DISTINCT if you have a surrogate auto-increment primary key:
SELECT DISTINCT s2.pk, s2.ID, s2.ID_MANUAL
FROM scans s1
JOIN scans s2
ON COALESCE(s2.ID, s2.ID_MANUAL) = COALESCE(s1.ID, s1.ID_MANUAL)
AND s2.pk > s1.pk
This will exclude the original record, so you could delete the records returned in this result set.
Here's the SQL Fiddle.
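To see which rows the COALESCE self-join flags, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL (an assumption, as is the helper name later_copies). The table and column names follow the query above, with pk as the auto-increment key.

```python
import sqlite3

def later_copies(rows):
    """Return the rows that repeat an ID already seen in an earlier row."""
    conn = sqlite3.connect(":memory:")
    # pk is the surrogate auto-increment key the answer relies on.
    conn.execute(
        "CREATE TABLE scans ("
        "pk INTEGER PRIMARY KEY AUTOINCREMENT, id TEXT, id_manual TEXT)"
    )
    conn.executemany("INSERT INTO scans (id, id_manual) VALUES (?, ?)", rows)
    query = """
        SELECT DISTINCT s2.pk, s2.id, s2.id_manual
        FROM scans s1
        JOIN scans s2
          ON COALESCE(s2.id, s2.id_manual) = COALESCE(s1.id, s1.id_manual)
         AND s2.pk > s1.pk
        ORDER BY s2.pk
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample data from the question, inserted in order so pk runs from 1 to 8.
scan_rows = [("A", None), (None, "A"), ("B", None), ("B", None),
             (None, "C"), ("C", None), (None, "D"), (None, "D")]
for row in later_copies(scan_rows):
    print(row)
```

Only the second occurrence of each ID comes back (pk 2, 4, 6 and 8), which is why the result set is safe to use as a delete list.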

Related

Select IDs from table where col_2 is not null (duplicate id)

I have a table that contains the following data
ID | Col_2
A | 'ABC'
A | 'GHI'
A | null
B | 'null'
B | 'HJH'
B | 'NBN'
C | null
I have two cases to handle:
Duplicate IDs:
In case of duplicate IDs, I only want the rows that do not have null in col_2
E.g.
Query should return :
A | 'ABC'
A | 'GHI'
B | 'HJH'
B | 'NBN'
Non Duplicate Id:
In case of a non-duplicate ID, the query should return the row irrespective of the value present in col_2
So the final result of the query should be
ID | Col_2
A | 'ABC'
A | 'GHI'
B | 'HJH'
B | 'NBN'
C | null
I have managed to write the following query, which handles the duplicate-ID case but not the non-duplicate case.
Query :
select id,col_2
from mytable
group by id,col_2
having (sum(case when col_2 is not null then 1 else 0 end) > 0)
What changes should be made to the query to also cover the non-duplicate case?
Thanks in advance!!!
Assuming NULL is NULL and not a string and that you have only one NULL value per id, you can do something like this:
select t.*
from t
where t.col_2 is not null or
not exists (select 1 from t t2 where t2.id = t.id and t2.col_2 is not null);
If your null values can be duplicated and you want only one row for them, then tweak this to:
select t.*
from t
where t.col_2 is not null
union all
select distinct t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.col_2 is not null);
Here is a db<>fiddle.
For performance, you want an index on (id, col_2).
If you just want the col_2 values for each id, you can concatenate them on each row:
select id, group_concat(col_2)
from t
group by id;
Another alternative uses window functions (available in MySQL 8.0 and later):
select t.id, col_2
from (select t.*,
rank() over (partition by id order by col_2 is not null desc) as seqnum
from t
) t
where seqnum = 1;
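Here is a minimal sketch of the NOT EXISTS version run against the data from the question, assuming the nulls are real NULLs and using an in-memory SQLite database in place of MySQL. The helper name rows_to_keep is hypothetical.

```python
import sqlite3

def rows_to_keep(rows):
    """Keep non-NULL rows, plus NULL rows for ids that have nothing else."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id TEXT, col_2 TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    query = """
        SELECT t.id, t.col_2
        FROM t
        WHERE t.col_2 IS NOT NULL
           OR NOT EXISTS (SELECT 1
                          FROM t t2
                          WHERE t2.id = t.id
                            AND t2.col_2 IS NOT NULL)
        ORDER BY t.id, t.col_2
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Data from the question, with every null stored as a real NULL (assumption).
data = [("A", "ABC"), ("A", "GHI"), ("A", None),
        ("B", None), ("B", "HJH"), ("B", "NBN"),
        ("C", None)]
for row in rows_to_keep(data):
    print(row)
```

The NULL rows for A and B are dropped because those ids have other values, while C keeps its single NULL row.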

Show id and Count the same Foreign Key

I'm using MySQL. I have a table with two columns: id (primary key) and id_of (foreign key). Several ids can share the same id_of. For every id, I want the number of rows that have the same id_of as that id. How can I write this query? So far I could only get this:
SELECT id, (SELECT COUNT(id_of) FROM test_table) AS count FROM test_table;
database's table:
id | id_of
----------------
abasb | 2131233
hdafd | 2131233
fajdf | 3546541
pogad | 3546541
afdaj | 2131233
fafda | 8661565
the results I want:
id | count
----------------
abasb | 3
hdafd | 3
fajdf | 2
pogad | 2
afdaj | 3
fafda | 1
Your query just needs a small correction: correlate the subquery on id_of.
SELECT id,
(SELECT COUNT(*) FROM test_table t2 where t2.id_of=t1.id_of) AS count
FROM test_table t1
You may try this (common table expressions require MySQL 8.0 or later):
with cte as ( select id_of, count(*) as coun from test_table group by id_of )
select t.id, c.coun from test_table as t inner join cte as c on t.id_of = c.id_of
You can try this (window functions require MySQL 8.0 or later):
select id, count(*) over (partition by id_of) as count from test_table
You could use a join on a subquery for the count:
select a.id, t.count_of_id_of
from test_table a
inner join (
select id_of, count(*) count_of_id_of
from test_table
group by id_of
) t on t.id_of= a.id_of
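The join-on-a-grouped-subquery approach can be checked against the table from the question with a short sketch like the following. SQLite in memory is assumed as a stand-in for MySQL, and the helper name count_per_id is made up.

```python
import sqlite3

def count_per_id(rows):
    """Return {id: number of rows sharing that id's id_of}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_table (id TEXT PRIMARY KEY, id_of INTEGER)")
    conn.executemany("INSERT INTO test_table VALUES (?, ?)", rows)
    query = """
        SELECT a.id, t.count_of_id_of
        FROM test_table a
        INNER JOIN (
            SELECT id_of, COUNT(*) AS count_of_id_of
            FROM test_table
            GROUP BY id_of
        ) t ON t.id_of = a.id_of
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result

# Table contents from the question.
table_rows = [("abasb", 2131233), ("hdafd", 2131233), ("fajdf", 3546541),
              ("pogad", 3546541), ("afdaj", 2131233), ("fafda", 8661565)]
print(count_per_id(table_rows))
```

Each id is paired with the size of its id_of group, matching the result table the question asks for.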

MySQL - Select results with specified ID or with null

I have one table:
| ID | ADV_ID | USER_ID |
| 1 | 22 | NULL |
| 2 | 22 | 3 |
| 5 | 44 | NULL |
Now I want to select the row where adv_id = 22 and user_id = 3. If that row doesn't exist, I want to get the row where adv_id = 22 and user_id is null.
I tried it this way:
SELECT * FROM `table` WHERE adv_id = 22 AND (user_id = 3 OR user_id is null)
but this query returns two rows: one with user_id = NULL and one with user_id = 3. I want to get one row: the one with user_id = 3 or, if it doesn't exist, the one with user_id = NULL.
How can I do it in one query?
Thanks.
Use conditional aggregation:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT
ADV_ID,
CASE WHEN COUNT(CASE WHEN USER_ID = 3 THEN 1 END) > 0 THEN 3 END USER_ID
FROM yourTable
GROUP BY ADV_ID
) t2
ON t1.ADV_ID = t2.ADV_ID AND
((t1.USER_ID IS NULL AND t2.USER_ID IS NULL) OR (t1.USER_ID = t2.USER_ID))
WHERE
t1.ADV_ID = 22;
Demo
For an explanation, the subquery I have aliased as t2 aggregates over the ADV_ID, and outputs the value 3 if that value occurs in one or more records, otherwise it outputs NULL. Then, we join this subquery back to your original table on the condition that both USER_ID values are NULL, or, if not, that the two USER_ID values match.
You may modify the demo to see that it generates the output you want for other inputs.
SELECT *
FROM test
WHERE ADV_ID = 22 AND USER_ID = 3
UNION ALL
SELECT *
FROM test
WHERE ADV_ID = 22 AND USER_ID IS NULL AND NOT EXISTS (
SELECT 1
FROM test
WHERE ADV_ID = 22 AND USER_ID = 3
)
The first SELECT returns the row that matches the first condition: ADV_ID = 22 AND USER_ID = 3.
The second SELECT, appended with UNION ALL, returns the NULL row from the same table only if no row matching the first condition EXISTS.
So we only get the fallback row if the first condition does not return any rows.
The MySQL UNION ALL operator is used to combine the result sets of 2 or more SELECT statements.
Try it like this:
SELECT * FROM `table` t1 WHERE (t1.adv_id = 22)
AND ((t1.user_id = 3) OR
(NOT EXISTS (select * from `table` t2 where t2.adv_id=t1.adv_id and t2.user_id = 3) AND t1.user_id is null ))
DEMO
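The "exact match, else NULL row" logic is easy to get subtly wrong, so here is a minimal sketch that exercises the NOT EXISTS variant for several inputs. It assumes an in-memory SQLite database in place of MySQL; the table name adverts and the helper names make_db and pick_row are hypothetical.

```python
import sqlite3

def make_db():
    """Build the three-row table from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE adverts (id INTEGER PRIMARY KEY, adv_id INTEGER, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO adverts VALUES (?, ?, ?)",
                     [(1, 22, None), (2, 22, 3), (5, 44, None)])
    return conn

def pick_row(conn, adv_id, user_id):
    """Return the row for (adv_id, user_id), else the NULL-user row for adv_id."""
    query = """
        SELECT t1.id, t1.adv_id, t1.user_id
        FROM adverts t1
        WHERE t1.adv_id = ?
          AND (t1.user_id = ?
               OR (t1.user_id IS NULL
                   AND NOT EXISTS (SELECT 1
                                   FROM adverts t2
                                   WHERE t2.adv_id = t1.adv_id
                                     AND t2.user_id = ?)))
    """
    return conn.execute(query, (adv_id, user_id, user_id)).fetchall()

conn = make_db()
print(pick_row(conn, 22, 3))   # exact match exists
print(pick_row(conn, 44, 3))   # no exact match, falls back to the NULL row
```

The NULL row is only returned when no row with the requested user_id exists for the same adv_id, so at most one of the two branches contributes rows.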

MYSQL Updating row to maximum value of similar rows

I have a table like this in MYSQL:
ID | NAME | VALUE |
----------------------------
1 | Bob | 1 |
2 | Bob | 2 |
3 | Jack | 5 |
4 | Jack | 8 |
5 | Jack | 10 |
and I'm trying to update the VALUE column to the highest value of rows with same NAME. So the result should be:
ID | NAME | VALUE |
----------------------------
1 | Bob | 2 |
2 | Bob | 2 |
3 | Jack | 10 |
4 | Jack | 10 |
5 | Jack | 10 |
I managed to get the max value like this:
SELECT MAX(Value) max FROM `table` GROUP BY Name having count(*) >1 AND MAX(Value) != MIN(Value)
But can't figure out how to put it in my update
Update table set Value = (SELECT MAX(Value) max FROM `table` GROUP BY Name having count(*) >1 AND MAX(Value) != MIN(Value))
Doesn't work. I'd appreciate any help.
This is easier than other answers are making it.
UPDATE MyTable AS t1 INNER JOIN MyTable AS t2 USING (Name)
SET t1.Value = GREATEST(t1.Value, t2.Value);
You don't have to find the largest value. You just have to join each row to the set of rows with the same name, and set the Value to the greater Value of the two joined rows. This is a no-op on some rows, but it will apply to every row in turn.
http://sqlfiddle.com/#!9/f79a3/1
UPDATE t1
INNER JOIN (SELECT name, MAX(`value`) max_value
FROM t1 GROUP BY name) t2
ON t1.name = t2.name
SET t1.value = t2.max_value;
Create a temporary table consisting of NAME and the maximum VALUE as follows:
CREATE TEMPORARY TABLE TABLE1 AS
SELECT NAME, MAX(Value) AS value FROM `table` GROUP BY Name HAVING COUNT(*) > 1
AND MAX(Value) != MIN(Value);
Use this temporary table to do your update as follows:
UPDATE `table` AS Table_A
INNER JOIN TABLE1 AS Table_B
ON Table_A.NAME = Table_B.NAME
SET Table_A.value = Table_B.value;
I am more familiar with SQL in general than with MySQL specifically, so treat this code as an approximation.
Let me know if this doesn't help.
A simple left join would do the trick. Note that this only selects the new values; it does not update the table.
Try this out and let me know if you have any questions.
select a.id,a.name,b.value
from
`table` a
left join
(select name,max(value) as value from `table` group by name) b
on a.name=b.name;
You may use this query. The table is joined with a subquery (table t2) that contains the results you want to update your table with:
UPDATE `table` t1,
(SELECT Name, MAX(Value) maxv, MIN(Value) minv
FROM `table`
GROUP BY Name
HAVING COUNT(*)>1 AND maxv != minv) t2
SET t1.Value = t2.maxv
WHERE t1.Name = t2.Name;
If you want to see how the values will be updated, you can first run an equivalent SELECT query:
SELECT t1.*, t2.maxv
FROM `table` t1,
(SELECT Name, MAX(Value) maxv, MIN(Value) minv
FROM `table`
GROUP BY Name
HAVING COUNT(*)>1 AND maxv != minv) t2
WHERE t1.Name = t2.Name;
This query will display all the fields of table, followed by the new value maxv. You can check the current value and the new value, and if it looks fine, you may run the UPDATE query.
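The preview-then-update workflow can be sketched end to end as follows. This is only an illustration: it uses an in-memory SQLite database instead of MySQL, a hypothetical table name scores (to avoid the reserved word table), and it applies the change from Python row by row rather than with MySQL's multi-table UPDATE syntax.

```python
import sqlite3

def preview_and_apply(rows):
    """Preview the new value per row, then write it back by primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (id INTEGER PRIMARY KEY, name TEXT, value INTEGER)"
    )
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    # Step 1: the SELECT preview, current value next to the per-name maximum.
    preview = conn.execute("""
        SELECT t1.id, t1.name, t1.value, t2.maxv
        FROM scores t1
        JOIN (SELECT name, MAX(value) AS maxv
              FROM scores
              GROUP BY name
              HAVING COUNT(*) > 1 AND MAX(value) != MIN(value)) t2
          ON t1.name = t2.name
        ORDER BY t1.id
    """).fetchall()
    # Step 2: apply exactly what the preview showed.
    conn.executemany("UPDATE scores SET value = ? WHERE id = ?",
                     [(maxv, row_id) for row_id, _name, _old, maxv in preview])
    updated = conn.execute(
        "SELECT id, name, value FROM scores ORDER BY id").fetchall()
    conn.close()
    return preview, updated

score_rows = [(1, "Bob", 1), (2, "Bob", 2),
              (3, "Jack", 5), (4, "Jack", 8), (5, "Jack", 10)]
preview, updated = preview_and_apply(score_rows)
print(preview)
print(updated)
```

Writing back by primary key keeps the update tied to the rows that were actually reviewed in the preview step.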

Find duplicate records in table mysql

I have a table like the following:
| ID | Short Name | Long Name |
|----|------------|-----------|
| 1 | s1 | l2 |
| 2 | s1 | l2 |
| 3 | s1 | l2 |
| 4 | s5 | l6 |
| .. | ... | |
I want to get all records that share the same Short Name and Long Name. I need their shared short name, long name, and 3 duplicates' IDs. For this particular example, I want {s1, l2, 1,2,3}
This is a fairly simple problem to solve. Basically what you want to do is write a subquery that counts the number of rows that match on your specified field for each row in your query. I have included a few examples below.
Find all rows that are duplicates and match on both name fields
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.ShortName = TableName.ShortName AND T2.LongName = TableName.LongName) > 1;
Find all rows that are duplicates and match on the short name
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.ShortName = TableName.ShortName) > 1;
Find all rows that are duplicates and match on the long name
SELECT * FROM TableName WHERE (SELECT COUNT(*) FROM TableName AS T2 WHERE T2.LongName = TableName.LongName) > 1;
Simply use a self join of your table and select those rows where the two names are equal and the ids differ:
SELECT *
FROM <table> t1, <table> t2
WHERE t1.id <> t2.id
AND t1.short_name = t2.short_name
AND t1.long_name = t2.long_name;
You can use EXISTS to check whether any other row exists with the same names but a different ID.
select t1.* from table_name t1
where exists (
select 1 from table_name t2
where
t1.ID <> t2.ID
and t1.`Short Name` = t2.`Short Name`
and t1.`Long Name` = t2.`Long Name`
);
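The answers above return one row per duplicate. If the grouped form from the question ({s1, l2, 1,2,3}) is wanted instead, a different technique, GROUP BY with GROUP_CONCAT, is one option. Here is a minimal sketch of it, assuming an in-memory SQLite database in place of MySQL and hypothetical names (table names_table, columns short_name and long_name, helper duplicate_groups).

```python
import sqlite3

def duplicate_groups(rows):
    """Return (short_name, long_name, [ids]) for each name pair seen twice or more."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE names_table ("
        "id INTEGER PRIMARY KEY, short_name TEXT, long_name TEXT)"
    )
    conn.executemany("INSERT INTO names_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT short_name, long_name, GROUP_CONCAT(id) AS ids
        FROM names_table
        GROUP BY short_name, long_name
        HAVING COUNT(*) > 1
        ORDER BY short_name, long_name
    """
    # GROUP_CONCAT yields a comma-separated string; split and sort it here
    # because the concatenation order is not guaranteed.
    result = [(s, l, sorted(int(i) for i in ids.split(",")))
              for s, l, ids in conn.execute(query)]
    conn.close()
    return result

name_rows = [(1, "s1", "l2"), (2, "s1", "l2"), (3, "s1", "l2"), (4, "s5", "l6")]
print(duplicate_groups(name_rows))
```

This gives one result row per duplicated name pair, with the matching IDs collected into a list.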