Turning a duplicate selection into an update - mysql

I've managed to select the count and IDs of each record that has duplicates via:
select T1.ID,T2.Count
from MyTable T1
join (SELECT NumberField,NameField,Count(*) as Count FROM MyTable
where Field_C=X
and Field_S=Y
group by NumberField,NameField having count(*)>1) T2
on T1.NumberField=T2.NumberField
and T1.NameField = T2.NameField
This returns the ID of the records I want to update (T1.ID) and the value I want to update a CountField with (T2.Count).
I'm just unsure how to turn this into an update after getting this far.

If you already have the select, you already have the update. In MySQL the SET clause goes after the join:
UPDATE MyTable T1
join (SELECT NumberField,NameField,Count(*) as Count FROM MyTable
where Field_C=X
and Field_S=Y
group by NumberField,NameField having count(*)>1) T2
on T1.NumberField=T2.NumberField
and T1.NameField = T2.NameField
SET T1.CountField = T2.Count

I guess a long day at the office but this solved it pretty easily. I think having the first select threw me off until I realized I needed to get rid of it entirely as I was not selecting but updating:
Update MyTable T1
join (SELECT NumberField,NameField,Count(*) as Count FROM MyTable
where Field_C=X
and Field_S=Y
group by NumberField,NameField having count(*)>1) T2
on T1.NumberField=T2.NumberField
and T1.NameField = T2.NameField
Set T1.CountField=T2.Count
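One way to sanity-check what the UPDATE ... JOIN would write is to run the SELECT and apply its (ID, Count) pairs by hand. The sketch below does that with Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server; SQLite has no UPDATE ... JOIN syntax). The sample rows and the helper name apply_duplicate_counts are hypothetical.

```python
import sqlite3


def apply_duplicate_counts(conn, field_c, field_s):
    """Write each duplicate group's size into CountField; return rows updated."""
    # Same shape as the SELECT in the question; the derived table exposes
    # the grouping columns so the join condition can reference them.
    pairs = conn.execute(
        """
        SELECT T1.ID, T2.Count
        FROM MyTable T1
        JOIN (SELECT NumberField, NameField, COUNT(*) AS Count
              FROM MyTable
              WHERE Field_C = ? AND Field_S = ?
              GROUP BY NumberField, NameField
              HAVING COUNT(*) > 1) T2
          ON T1.NumberField = T2.NumberField
         AND T1.NameField = T2.NameField
        """,
        (field_c, field_s),
    ).fetchall()
    # Apply each pair; MySQL's UPDATE ... JOIN does this in a single statement.
    conn.executemany(
        "UPDATE MyTable SET CountField = ? WHERE ID = ?",
        [(count, row_id) for row_id, count in pairs],
    )
    return len(pairs)


# Hypothetical sample data: (10, 'a') appears three times, (30, 'c') twice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, NumberField INTEGER, "
    "NameField TEXT, Field_C TEXT, Field_S TEXT, CountField INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable (NumberField, NameField, Field_C, Field_S) "
    "VALUES (?, ?, 'X', 'Y')",
    [(10, "a"), (10, "a"), (20, "b"), (10, "a"), (30, "c"), (30, "c")],
)
updated = apply_duplicate_counts(conn, "X", "Y")
result = conn.execute("SELECT ID, CountField FROM MyTable ORDER BY ID").fetchall()
```

Rows that have no duplicate keep a NULL CountField, which matches the inner join in the UPDATE: only rows that find a partner in T2 are touched.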

Related

Using group by in SET clause

I'm trying to update a column of a table so that it is equal to the count of something in another table. Like this:
UPDATE TABLE
SET TOTAL = (SELECT COUNT(f1)
FROM TABLE2
GROUP BY f2);
But I keep getting "Subquery returns more than 1 row", and I can't think of how to fix it.
UPDATE (copied from the comment)
f2 is the relation between TABLE and TABLE2 – Thomasd d
Based on your comment
f2 is the relation between TABLE and TABLE2
you probably want something like this
UPDATE TABLE T1, (SELECT f2, COUNT(F1) cnt FROM TABLE2 GROUP BY f2) T2
SET T1.TOTAL = T2.cnt
WHERE T1.f2=T2.f2
adapt T1.f2 if necessary
UPDATE t1
SET total = ( SELECT COUNT(f1)
FROM t2
WHERE t1.f2 = t2.f2 );
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=91de17deff657f66fa54b42fe20ed3c5
Add WHERE total IS NULL if you do not need to recalculate values for rows which have a value already.
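As a rough illustration of the correlated form, the sketch below runs it through Python's sqlite3 module (a stand-in for MySQL, chosen only so the example is self-contained). The table contents are made up. It also shows one difference from the join form: rows of t1 with no match in t2 get 0 here, whereas an inner-join update would leave them untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (f2 INTEGER PRIMARY KEY, total INTEGER);
    CREATE TABLE t2 (f1 TEXT, f2 INTEGER);
    INSERT INTO t1 (f2) VALUES (1), (2), (3);
    -- COUNT(f1) skips the NULL f1, so f2 = 2 counts one row, not two.
    INSERT INTO t2 (f1, f2) VALUES ('a', 1), ('b', 1), ('c', 2), (NULL, 2);
""")

# The subquery is evaluated once per row of t1 and always returns one value.
conn.execute(
    "UPDATE t1 SET total = (SELECT COUNT(f1) FROM t2 WHERE t1.f2 = t2.f2)"
)
totals = conn.execute("SELECT f2, total FROM t1 ORDER BY f2").fetchall()
```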
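Your subquery returns multiple rows, but SET expects a single value. Removing the GROUP BY makes the subquery return exactly one row, although every row of TABLE then gets the same overall count. This might fix your code if that is what you are looking for.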
UPDATE TABLE
SET TOTAL = (SELECT COUNT(f1)
FROM TABLE2)

MySQL Select works fine but Delete hangs indefinitely based on the position of GROUP BY

select * from table1 where ID in (
select min(a.ID) from (select * from table1) a group by id_x, id_y, col_z having count(*) > 1)
The above query ran in 2.2 seconds, returning four results. Now when I change the select * to delete, it hangs indefinitely.
delete from table1 where ID in (
select min(a.ID) from (select * from table1) a group by id_x, id_y, col_z having count(*) > 1)
If I move the position of group by clause inside the alias select query, it will no longer hang.
delete from table1 where ID in (
select a.ID from (select min(ID) as ID from table1 group by id_x, id_y, col_z having count(*) > 1) a)
Why does it hang? Even though (select * from table1) pulls millions of records, the select finishes in seconds, while the delete keeps executing for hours. Can anybody explain what hinders the query? It puzzles me because the select query works fine whereas the delete query hangs.
EDIT:
My focus here is why it hangs. I already posted a workaround that works fine. But in order to develop a prevention system, I need to get to the root cause of this.
Use a JOIN instead of WHERE ID IN (SELECT ...).
DELETE t1
FROM table1 AS t1
JOIN (
SELECT MIN(id) AS minId
FROM table1
GROUP BY id_x, id_y, col_z
HAVING COUNT(*) > 1) AS t2
ON t1.id = t2.minId
I think your query is not being optimized because it has to recalculate the subquery after each deletion, since deleting a row could change the MIN(id) for that group. Using a JOIN requires the grouping and aggregation to be done just once.
Try this:
delete t
from table1 t join
(select min(id) as min_id
from table1
group by id_x, id_y, col_z
having count(*) >= 2
) tt
on tt.min_id = t.id;
That said, you probably don't want to delete just the minimum id. I'm guessing you want to keep the most recent id. If so:
delete t
from table1 t left join
(select max(id) as max_id
from table1
group by id_x, id_y, col_z
) tt
on tt.max_id = t.id
where tt.max_id is null;
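A small check of the "keep the most recent id" idea, written against Python's sqlite3 module because it runs without a server. SQLite has no multi-table DELETE, so this sketch swaps in a NOT IN filter over a derived table instead of the LEFT JOIN; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, id_x INTEGER, id_y INTEGER, col_z TEXT);
    INSERT INTO table1 (id, id_x, id_y, col_z) VALUES
        (1, 1, 1, 'a'), (2, 1, 1, 'a'), (3, 1, 1, 'a'),  -- three copies
        (4, 2, 2, 'b'),                                  -- unique row
        (5, 3, 3, 'c'), (6, 3, 3, 'c');                  -- two copies
""")

# The grouping runs once inside the derived table; every row whose id is
# not the largest id of its group is removed.
deleted = conn.execute("""
    DELETE FROM table1
    WHERE id NOT IN (
        SELECT max_id
        FROM (SELECT MAX(id) AS max_id
              FROM table1
              GROUP BY id_x, id_y, col_z) AS keep
    )
""").rowcount
remaining = [row[0] for row in conn.execute("SELECT id FROM table1 ORDER BY id")]
```

The unique row (id 4) survives because its group's maximum id is its own id; no HAVING filter is needed for that.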

Why is this SQL correct, and how does it run internally?

The following SQL comes from the MySQL documentation:
SELECT * FROM t1 AS t
WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);
The documentation says it finds all rows in table t1 containing a value that occurs twice in a given column, but it does not explain the SQL.
t1 and t are the same table, so the
count(*) in subquery == select count(*) from t
, isn't it?
count(*) in subquery == select count(*) from t
is wrong, because the subquery is filtered by t1.id = t.id, so it counts only the rows that share the current row's id. That is how the query returns the rows whose id occurs exactly twice.
If you want the count of each repeated value:
SELECT id, name, count(*) AS all_count FROM t1 GROUP BY id HAVING all_count > 1 ORDER BY all_count DESC
You can also get the same rows as your query like this:
select * from t1 where id in ( select id from t1 group by id having count(*) > 1 )
The query contains a correlated subquery in WHERE clause:
SELECT COUNT(*) FROM t1 WHERE t1.id = t.id
It is called correlated because it is related to the main query via t.id. So, this subquery counts the number of records having an id value that is equal to the current id value of the record returned by the main query.
Thus, predicate
(SELECT COUNT(*) FROM t1 WHERE t1.id = t.id) = 2
evaluates to true for any row with an id value that occurs twice in the table.
SELECT * FROM t1 AS t
WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id);
This query goes through each record in t1 and, in the subquery, looks into t1 again to see whether that record's id is found exactly 2 times. You can do the same for any other column in t1 (or any table for that matter).
To see all values that occur multiple times in the table, change WHERE 2 = to WHERE 1 <. This also returns the values that occur 3 times, 4 times, etc.
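To make the row-by-row evaluation concrete, here is a minimal sketch using Python's sqlite3 module (a stand-in for MySQL so the example runs anywhere). The label column and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, label TEXT);
    -- id 1 occurs twice, id 2 once, id 3 three times
    INSERT INTO t1 (id, label) VALUES
        (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (3, 'e'), (3, 'f');
""")

# For each outer row t, the subquery counts the rows of t1 sharing t.id.
exactly_twice = conn.execute("""
    SELECT id, label FROM t1 AS t
    WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id)
    ORDER BY id, label
""").fetchall()

# Same query with "1 <": any id that occurs more than once qualifies.
more_than_once = conn.execute("""
    SELECT id, label FROM t1 AS t
    WHERE 1 < (SELECT COUNT(*) FROM t1 WHERE t1.id = t.id)
    ORDER BY id, label
""").fetchall()
```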
SELECT id,count( * )
FROM
MyTable
group by id
having count( * )>1
With this code you can see the ids that are repeated more than once,
and you can adapt the query to your needs.
How about using GROUP BY and HAVING:
SELECT id, count(1) as Total FROM MyTable AS t1
GROUP BY t1.id
HAVING Total = 2

How to SELECT date and time within that date?

I read a few articles about this. "Select max date, then max time" seems the most helpful, but I do not see a way to implement it.
There are five tables, which I join. I need to select only the one row with the highest date and highest time from the first table, the same from the second table, and join the rest on some other value. With the code I wrote I get multiple rows. It seems the time selection is not right.
It might be done with subquery in subquery. I've tried something like this:
SELECT * from table1
INNER JOIN table2 ON table1.date = table2.date AND table1.gm = table2.gm
INNER JOIN table3 ON table2.gm = table3.gm ...
WHERE table3.date = :date AND table4.date = :date ...
AND table1.date IN(
SELECT MAX(table1.date) FROM table1 WHERE table1.time IN(
SELECT MAX(table1.time) FROM table1
)
)
AND table2.date IN(
SELECT MAX(table2.date) FROM table1 WHERE table2.time IN(
SELECT MAX(table2.time) FROM table2 )
)
ORDER BY table1.id
The question is:
How do I get a single row after joining all of this, where the date is the highest and the time is the highest on that date?
Thanks!
EDIT: I am sorry for this. I forgot to say that I need the max time of the max date for each specific value in the gm columns (table1.gm, table2.gm, ... in the example I gave). So I need one row for each gm value, which is the same in every table, not just one row altogether. The solutions Nick and Salim provided work, but they did not solve my problem.
EDIT: SOLVED! After implementing Nick's solution I just needed to add GROUP BY cntrs_reper.gm_company_no, cntrs_reper.date.
And that's it. For every row in one table, I get the entries with the highest date and time from the others! Thanks to all.
EDIT: In case it helps, this is the full query:
SELECT cntrs_gm.gm_company_no AS company_c_g,
bns_gms.ded_bns AS ded_bns_gms,
bns_gms.no_ded_bns AS no_ded_bns_gms,
bns_gms.wag_ded_bns AS wag_ded_bns_gms,
cntrs_gm.cur_credit AS cur_credit_c_g,
cntrs_gm.cdrop AS cdrop_c_g,
cntrs_gm.total_jp AS total_jp_c_g,
cntrs_gm.games AS games_c_g,
cntrs_gm.wgames AS wgames_c_g,
cntrs_gm.doors AS doors_c_g,
cntrs_gm.power AS power_c_g,
cntrs_gm.total_in AS total_in_c_g,
cntrs_gm.total_out AS total_out_c_g,
cntrs_gm.total_acc AS total_acc_c_g,
cntrs_gm.total_bet AS total_bet_c_g,
cntrs_gm.total_win AS total_win_c_g,
cntrs_gm.total_bonus AS total_bonus_c_g,
cntrs_gm.date AS date_c_g,
cntrs_reper.gm_company_no AS company_reper,
bns_reper.ded_bns AS ded_bns_reper,
bns_reper.no_ded_bns AS no_ded_bns_reper,
bns_reper.wag_ded_bns AS wag_ded_bns_reper,
cntrs_reper.cur_credit AS cur_credit_reper,
cntrs_reper.cdrop AS cdrop_reper,
cntrs_reper.total_jp AS total_jp_reper,
cntrs_reper.games AS games_reper,
cntrs_reper.wgames AS wgames_reper,
cntrs_reper.doors AS doors_reper,
cntrs_reper.power AS power_reper,
cntrs_reper.total_in AS total_in_reper,
cntrs_reper.total_out AS total_out_reper,
cntrs_reper.total_acc AS total_acc_reper,
cntrs_reper.total_bet AS total_bet_reper,
cntrs_reper.total_win AS total_win_reper,
cntrs_reper.total_bonus AS total_bonus_reper,
cntrs_reper.date AS date_reper,
cntrs_reper.time AS time_reper,
bns_reper.time AS time_c_g,
gms_cfg.gm_no AS machine_id,
gms_cfg.denom_cin AS machine_cin
FROM bns_gms
INNER JOIN cntrs_gm
ON bns_gms.gm_company_no = cntrs_gm.gm_company_no AND bns_gms.date = cntrs_gm.date
INNER JOIN bns_reper
ON cntrs_gm.gm_company_no = bns_reper.gm_company_no
INNER JOIN cntrs_reper
ON bns_reper.gm_company_no = cntrs_reper.gm_company_no AND bns_reper.date = cntrs_reper.date
INNER JOIN gms_cfg
ON cntrs_reper.gm_company_no = gms_cfg.gm_no
WHERE bns_reper.date IN(
SELECT MAX(DATE(bns_reper.date)) FROM bns_reper WHERE bns_reper.time IN(
SELECT MAX(TIME(bns_reper.time)) FROM bns_reper
)
)
AND cntrs_reper.date IN(
SELECT MAX(DATE(cntrs_reper.date)) FROM cntrs_reper WHERE cntrs_reper.time IN(
SELECT MAX(TIME(cntrs_reper.time)) FROM cntrs_reper
)
)
ORDER BY cntrs_gm.gm_company_no
DB example
bns_gms
bns_reper
cntrs_gm
cntrs_reper
gms_cfg
The problem with your current query is that it will select all rows where table1.date is the latest date on which the highest time occurs, which may well be more than one e.g. for data such as
id  date        time
1   2018-03-30  18:40
2   2018-03-31  12:20
3   2018-03-31  19:20
Your WHERE clause:
table1.date IN(
SELECT MAX(table1.date) FROM table1 WHERE table1.time IN(
SELECT MAX(table1.time) FROM table1
)
)
will select rows with id=2 and id=3 as they both have date = '2018-03-31' which is when the maximum time occurs.
What you want to do is select the row which has the latest time on the latest date, for which you could use
table1.date = (SELECT MAX(date) FROM table1) AND
table1.time = (SELECT MAX(time) FROM table1 WHERE date = (SELECT MAX(date) FROM table1))
By using aliasing, that can be simplified (since we already know table1.date = MAX(date) FROM table1) to
table1.date = (SELECT MAX(date) FROM table1) AND
table1.time = (SELECT MAX(time) FROM table1 AS t1 WHERE t1.date = table1.date)
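The difference between the two WHERE clauses can be checked on the three sample rows above. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; dates and times are stored as text, which compares correctly here because both use fixed-width formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, date TEXT, time TEXT);
    INSERT INTO table1 (id, date, time) VALUES
        (1, '2018-03-30', '18:40'),
        (2, '2018-03-31', '12:20'),
        (3, '2018-03-31', '19:20');
""")

# Original filter: every row on the date where the overall max time occurs.
original_ids = [row[0] for row in conn.execute("""
    SELECT id FROM table1
    WHERE table1.date IN (
        SELECT MAX(table1.date) FROM table1 WHERE table1.time IN (
            SELECT MAX(table1.time) FROM table1))
    ORDER BY id
""")]

# Corrected filter: latest date, then the latest time on that same date.
fixed_ids = [row[0] for row in conn.execute("""
    SELECT id FROM table1
    WHERE table1.date = (SELECT MAX(date) FROM table1)
      AND table1.time = (SELECT MAX(time) FROM table1 AS t1
                         WHERE t1.date = table1.date)
    ORDER BY id
""")]
```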
I don't have MySQL but here is the general idea you can use. I don't have enough points to write a comment so I am responding as a reply. Essentially make a subquery/inline view for each table to select max of a column, then join those subqueries/inline views together.
Here is Oracle syntax. You can convert it to ANSI syntax.
select table1.column1, table2.column2, table3.column3
from
(select id1, max(column1) as column1 from table1 group by id1) table1,
(select id2, max(column2) as column2 from table2 group by id2) table2,
(select id3, max(column3) as column3 from table3 group by id3) table3
where
table1.id1 = table2.id2
and table1.id1 = table3.id3
;
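The edit in the question asks for the latest row per gm value rather than one row overall. A rough sketch of that variant follows, using correlated subqueries in place of the inline views; it runs against Python's sqlite3 module as a stand-in for MySQL. The single-table layout and the sample rows are assumptions, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, gm INTEGER, date TEXT, time TEXT);
    INSERT INTO table1 (id, gm, date, time) VALUES
        (1, 100, '2018-03-30', '23:00'),
        (2, 100, '2018-03-31', '12:20'),
        (3, 100, '2018-03-31', '19:20'),
        (4, 200, '2018-03-29', '08:00'),
        (5, 200, '2018-03-29', '07:00');
""")

# Both subqueries are correlated on gm, so "latest" is decided per gm value:
# first the latest date for that gm, then the latest time on that date.
latest = conn.execute("""
    SELECT t.id, t.gm, t.date, t.time
    FROM table1 AS t
    WHERE t.date = (SELECT MAX(d.date) FROM table1 AS d WHERE d.gm = t.gm)
      AND t.time = (SELECT MAX(x.time) FROM table1 AS x
                    WHERE x.gm = t.gm AND x.date = t.date)
    ORDER BY t.gm
""").fetchall()
```

Row 1 has the highest time for gm 100 but is skipped, because the time is only compared among rows on that gm's latest date.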

Delete duplicates in mysql when based on values from 3 columns matching

I have the query below that shows me the duplicates in my table. I would like to know how I can turn this into a delete query that deletes these duplicate rows but leaves just one of each. My table does have an auto-increment id column.
SELECT * FROM tbl_user_tmp AS t1
INNER JOIN (
SELECT name, activity, class, COUNT(1) AS cnt FROM tbl_user_tmp
WHERE user = 'test' AND disregard = 0
GROUP BY name, activity, class
HAVING cnt > 1
) AS t2
ON t1.name = t2.name AND t1.activity = t2.activity AND t1.class = t2.class
WHERE user = 'test' AND disregard = 0
GROUP BY t1.name, t1.activity, t1.class
I have tried the query below and it seems to work, but I'm afraid I'm missing something. Does it look correct?
delete from tbl_user_tmp
where user='test' AND id not in
(
select minid from
(select min(id) as minid from tbl_user_tmp where user='test' group by name, activity, class) as newtable
)
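One thing to check is that the DELETE covers the same rows as the SELECT, which also filters on disregard = 0. The sketch below adds that filter in both places (an assumption about the intent) and runs the statement against Python's sqlite3 module as a stand-in for MySQL; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_user_tmp (
        id INTEGER PRIMARY KEY, user TEXT, name TEXT,
        activity TEXT, class TEXT, disregard INTEGER);
    INSERT INTO tbl_user_tmp (id, user, name, activity, class, disregard) VALUES
        (1, 'test',  'A', 'run',  'c1', 0),  -- kept: lowest id of its group
        (2, 'test',  'A', 'run',  'c1', 0),  -- duplicate
        (3, 'test',  'A', 'run',  'c1', 1),  -- disregard = 1, out of scope
        (4, 'test',  'B', 'swim', 'c2', 0),  -- unique
        (5, 'other', 'A', 'run',  'c1', 0),  -- different user, out of scope
        (6, 'test',  'A', 'run',  'c1', 0);  -- duplicate
""")

# Same structure as the DELETE in the question, with disregard = 0 added
# to the outer filter and to the grouped subquery.
deleted = conn.execute("""
    DELETE FROM tbl_user_tmp
    WHERE user = 'test' AND disregard = 0 AND id NOT IN (
        SELECT minid FROM (
            SELECT MIN(id) AS minid
            FROM tbl_user_tmp
            WHERE user = 'test' AND disregard = 0
            GROUP BY name, activity, class
        ) AS newtable
    )
""").rowcount
remaining = [row[0] for row in conn.execute("SELECT id FROM tbl_user_tmp ORDER BY id")]
```

Without the disregard filter, row 3 would be treated as one more duplicate of row 1 and deleted as well.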
You can use LIMIT.
Example:
DELETE FROM users
LIMIT 2;
Now you just need to use COUNT - 1 as the limit for each group of duplicates, so that one row remains ;)