MySQL to update subsequent duplicates of a row - mysql

We're going through a bit of a clean-up exercise and I need to remove duplicate data that has accidentally been added to our database table. The ID is obviously different, but other fields are the same.
I can use the following query to select the duplicate data sets:
SELECT user_id, start_datetime, count(id) AS dup_count
FROM our_table
WHERE status = 1
GROUP BY user_id, start_datetime
HAVING count(id) > 1;
What I need to do is create a query that would take each of the duplicate IDs APART FROM THE FIRST and use that to update the status to 0.
I'm not sure I can do this in one query, but I think the steps are as follows:
Run a query similar to the one above
Extract all the IDs for the duplicate sets
Ignore the first in the list as we don't want to alter the correctly added first record
Run the update on the remaining set of IDs
Am I out of luck here - or is it possible to do?
Many thanks!

You can do this with an update/join:
UPDATE our_table ot JOIN
(SELECT user_id, start_datetime, count(id) AS dup_count, min(id) as minid
FROM our_table
WHERE status = 1
GROUP BY user_id, start_datetime
HAVING count(id) > 1
) dups
ON ot.user_id = dups.user_id and
ot.start_datetime = dups.start_datetime and
ot.id > dups.minid
SET ot.status = 0;
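Before running the update, it can help to preview which rows it would touch. The query below is a minimal sketch that reuses the same join as the UPDATE above with a SELECT in place of the UPDATE ... SET; the kept_id alias is only an illustrative name.

```sql
-- Preview: rows the UPDATE would set to status = 0.
-- Same derived table and join conditions as the UPDATE; nothing is modified.
SELECT ot.id, ot.user_id, ot.start_datetime, ot.status,
       dups.minid AS kept_id   -- the first (lowest) id of each duplicate set
FROM our_table ot
JOIN (SELECT user_id, start_datetime, MIN(id) AS minid
      FROM our_table
      WHERE status = 1
      GROUP BY user_id, start_datetime
      HAVING COUNT(id) > 1
     ) dups
  ON ot.user_id = dups.user_id
 AND ot.start_datetime = dups.start_datetime
 AND ot.id > dups.minid
ORDER BY ot.user_id, ot.start_datetime, ot.id;
```

Note that the outer side of the join is not filtered on status, so rows that are already at status 0 can appear in the preview as well. Adding AND ot.status = 1 to the join conditions would narrow it, if that is what you want.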

You can use this update query that will join OUR_TABLE with itself:
UPDATE
our_table o1 INNER JOIN our_table o2
ON o1.status=1
AND o2.status=1
AND o1.user_id = o2.user_id
AND o1.start_datetime = o2.start_datetime
AND o1.ID > o2.ID
SET
o1.status = 0

Related

Using the results of a function multiple times for duplicates - SQL

I am trying to produce a result that shows duplicates in a table. One method I found for getting duplicates and showing them is to run the select statement again through an inner join. However, one of my columns needs to be the result of a function. The only thing I can think of is to use an alias, but I can't use the alias twice in a SELECT statement.
I am not sure what the best way is to run this code to get the duplicates I need.
My code below
SELECT EXTRACT(YEAR_MONTH FROM date) as 'ndate', a.transponderID
FROM dispondo_prod_disposition.event a
inner JOIN (SELECT EXTRACT(YEAR_MONTH FROM date) as ???,
transponderID, COUNT(*)
FROM dispondo_prod_disposition.event
GROUP BY mdate, transponderID
HAVING count(*) > 1 ) b
ON ndate = ???
AND a.transponderID = b.transponderID
ORDER BY b.transponderID
SELECT b.ndate, transponderID
FROM dispondo_prod_disposition.event a
INNER JOIN ( SELECT EXTRACT(YEAR_MONTH FROM date) as ndate,
transponderID
FROM dispondo_prod_disposition.event
GROUP BY 1, 2
HAVING COUNT(*) > 1 ) b USING (transponderID)
WHERE b.ndate = ??? -- for example, WHERE b.ndate = 202201
ORDER BY transponderID

Getting Newest Record from a table using mysql

This has to be a no brainer, but I am stumped. I'm used to using aggregate 'FIRST' in MsAccess, but MySql has no such thing.
Here is a simple table. I want to return the most recent record based on the date,
for each unique 'group ID'. I need the three records in yellow.
I was asked to add my full query. I tried one of the suggestions using the JOIN feature replacing 't' with the temp table name, but it failed to work. "Can't reopen table 't'"
The code is below. I know it's ugly, but it does return the correct data set.
I cleaned up the code a bit and added the JOIN code. Error: "Can't reopen table 't'"
DROP TABLE IF EXISTS `tmpMaxLookupResults`;
create temporary table tmpMaxLookupResults
as
SELECT
REPORTS.dtmReportCompleted,
RESULTS.lngMainReport_ID, RESULTS.lngLocationGroupSub_ID
FROM
(tbl_010_040_ProcedureVsTest_Sub as ProcVsSub
INNER JOIN tbl_010_050_CheckLog_RESULTS as RESULTS
ON (ProcVsSub.lngLocationGroupSub_ID = RESULTS.lngLocationGroupSub_ID)
AND (ProcVsSub.lngProcedure_ID = RESULTS.lngProcedure_ID)
AND (ProcVsSub.lngItemizedTestList_ID = RESULTS.lngItemizedTestList_ID)
AND (ProcVsSub.strPasscodeAdmin = RESULTS.strPasscodeAdmin)
AND (ProcVsSub.strCFICode = RESULTS.strCFICode))
INNER JOIN
tbl_000_010_MAIN_REPORT_INFO as REPORTS ON (RESULTS.lngPCC_ID =
REPORTS.lngPCC_ID)
AND (RESULTS.lngProcedure_ID = REPORTS.lngProcedure_ID)
AND (RESULTS.lngMainReport_ID = REPORTS.idMainReport_ID)
AND (RESULTS.strPasscodeAdmin = REPORTS.strPasscodeAdmin)
AND (RESULTS.strCFICode = REPORTS.strCFICode)
WHERE
(((RESULTS.lngProcedure_ID) = 143)
AND ((RESULTS.dtmExpireDate) IS NOT NULL)
AND ((RESULTS.strCFICode) = 'ems'))
GROUP BY RESULTS.lngMainReport_ID, RESULTS.lngLocationGroupSub_ID
ORDER BY (REPORTS.dtmReportCompleted) DESC;
SELECT t.*
FROM tmpMaxLookupResults AS t
JOIN (
SELECT lngLocationGroupSub_ID,
MAX(dtmReportCompleted) AS max_date_completed
FROM tmpMaxLookupResults
GROUP BY lngLocationGroupSub_ID ) AS dt
ON dt.lngLocationGroupSub_ID = t.lngLocationGroupSub_ID AND
dt.max_date_completed = t.dtmReportCompleted
Try this
SELECT
tn.*
FROM
tableName tn
RIGHT OUTER JOIN
(
SELECT
groupId, MAX(date_completed) as max_date_completed
FROM
tableName
GROUP BY
groupId
) AS gt
ON
(gt.max_date_completed = tn.date_completed AND gt.groupId = tn.groupId)
You can use the following SQL.
select * from table1 order by date_completed desc Limit 1;
Use Order By
SELECT *
FROM table_name
ORDER BY your_date_column_name
DESC
LIMIT 1
In a Derived Table, get the maximum date_completed value for every group_id.
Join this result-set back to the main table, in order to get the complete row corresponding to maximum date_completed value for every group_id
Try the following query:
SELECT t.*
FROM your_table_name AS t
JOIN (
SELECT group_id,
MAX(date_completed) AS max_date_completed
FROM your_table_name
GROUP BY group_id
) AS dt
ON dt.group_id = t.group_id AND
dt.max_date_completed = t.date_completed
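As an alternative sketch, assuming MySQL 8.0 or later (window functions are not available in earlier versions), the same "newest row per group" result can be written with ROW_NUMBER(). The table and column names are the placeholders from the query above; ranked and rn are illustrative aliases.

```sql
-- Number the rows of each group from newest to oldest, then keep row 1.
SELECT *
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY group_id
                              ORDER BY date_completed DESC) AS rn
    FROM your_table_name AS t
) AS ranked
WHERE rn = 1;
```

Two differences from the join version are worth knowing. When two rows in a group share the same maximum date, the join returns both, while ROW_NUMBER() keeps exactly one of them. And because this form reads the table only once, it should also sidestep the "Can't reopen table" error, which MySQL raises when a TEMPORARY table is referenced more than once in the same query. The extra rn column appears in the output; list the columns explicitly if you don't want it.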

Select rows with similar value in one column

I have a table called trades that has a session_id field. Some rows share the same session id, because when a trade is placed it takes an existing session id.
I now want to select the rows that share a session id and do something with them.
This is my first query that lists all the rows
SELECT * FROM trades
where trade_session_status="DONE" AND
trade_profit_worker_status="UNDONE"
I have tried this query as well
SELECT * FROM trades
where trade_session_status="DONE" AND
trade_profit_worker_status="UNDONE"
order BY(session_id)
I have looked at the distinct queries and came up with this query
SELECT DISTINCT session_id,id
FROM trades
WHERE trade_session_status="DONE" AND
trade_profit_worker_status="UNDONE"
ORDER BY session_id
Queries #2 and #3 both return the same number of rows. My question is: will queries #2 and #3 always return the rows with distinct session_id without leaving any rows out?
Sounds to me that you could use an EXISTS for this.
SELECT *
FROM trades t
WHERE trade_session_status = 'DONE'
AND trade_profit_worker_status = 'UNDONE'
AND EXISTS
(
SELECT 1
FROM trades d
WHERE d.trade_session_status = 'DONE'
AND d.trade_profit_worker_status = 'UNDONE'
AND d.session_id = t.session_id
AND d.id <> t.id
);
Note that the criteria for trade_session_status & trade_profit_worker_status are also used in the query for the EXISTS. I don't know if that's needed for your purpose, so remove them if that's not what you expect. But you get the idea.
Another way is to inner join to a sub-query with the duplicate session_id's.
SELECT t.*
FROM trades t
JOIN
(
SELECT session_id
FROM trades
WHERE trade_session_status = 'DONE'
AND trade_profit_worker_status = 'UNDONE'
GROUP BY session_id
HAVING COUNT(*) > 1
) d ON d.session_id = t.session_id
WHERE t.trade_session_status = 'DONE'
AND t.trade_profit_worker_status = 'UNDONE';

UPDATE TABLE WITH CASE

I have two tables, user_master and login_history. In user_master I want to set the status column to A (absent) or P (present), depending on whether the user has a login in login_history for the current date. The query I'm trying updates all the rows. What I want is to match the two tables and set the user_master status column to P if the user has logged in, and A otherwise. I hope my question is clear; help would be really appreciated. Here is my MySQL query:
UPDATE User_master a
INNER JOIN
(
SELECT DISTINCT user_name FROM login_history WHERE DATE(`login_time`)=CURRENT_DATE()
) b
SET a.`user_status` = CASE
WHEN a.`user_name`=B.`user_name` THEN 'P'
WHEN a.`user_name`!=B.`user_name` THEN 'A'
END
Hmmm, I am thinking LEFT JOIN:
UPDATE User_master m
LEFT JOIN Login_History lh
ON m.user_name = lh.user_name AND
DATE(lh.login_time) = CURRENT_DATE()
SET m.user_status = (CASE WHEN lh.user_name IS NULL THEN 'A' ELSE 'P' END);
It occurs to me that there might be more than one login on a given date. The result is additional updates on the same row. You can prevent this by doing:
UPDATE User_master m LEFT JOIN
(SELECT lh.user_name, 'P' as user_status
FROM Login_History lh
WHERE lh.login_time >= CURRENT_DATE() AND
lh.login_time < DATE_ADD(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY lh.user_name
) lh
ON m.user_name = lh.user_name
SET m.user_status = COALESCE(lh.user_status, 'A');
Notice that I changed the date arithmetic as well. This version should make better use of an index.
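To see why the range form is friendlier to an index, here is a small sketch. It assumes an index on login_time exists; the index name and its column order are hypothetical and not part of the original schema.

```sql
-- Hypothetical index, for illustration only.
CREATE INDEX idx_login_history_time_user
    ON Login_History (login_time, user_name);

-- Range comparison on the bare column: the optimizer can use a range scan
-- on the index to find today's rows.
EXPLAIN
SELECT lh.user_name
FROM Login_History lh
WHERE lh.login_time >= CURRENT_DATE()
  AND lh.login_time < DATE_ADD(CURRENT_DATE(), INTERVAL 1 DAY)
GROUP BY lh.user_name;

-- Column wrapped in a function: DATE() has to be evaluated for every row,
-- so the index generally cannot be used to narrow the range.
EXPLAIN
SELECT lh.user_name
FROM Login_History lh
WHERE DATE(lh.login_time) = CURRENT_DATE()
GROUP BY lh.user_name;
```

Comparing the type and rows columns of the two EXPLAIN outputs on your own data is the quickest way to check whether the difference matters for your table size.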
Might be easier with two queries:
Set everyone absent (update ... set user_status='A')
Set the present people to P with a select set.
Like so:
update user_master set user_status='A';
update user_master set user_status='P'
where user_name in (select distinct user_name from login_history...);
A join is somewhat quicker, but this is a pretty clean, understandable approach.

MySQL update value from the same table with count

What I want to do is give every patient a unique patient code that starts at 1 and is not based on the row id; the id only specifies the order. Something like this:
patient_id patient_code
2 1
3 2
4 3
This is my query:
UPDATE patients p1
SET p1.patient_code = (
SELECT COUNT( * )
FROM patients p2
WHERE p2.patient_id <= p1.patient_id
)
But it is throwing error:
#1093 - You can't specify target table 'p1' for update in FROM clause
I found this thread: Mysql error 1093 - Can't specify target table for update in FROM clause. But I don't know how to apply the accepted answer to a subquery with a WHERE clause, which I need for the COUNT.
UPDATE
patients AS p
JOIN
( SELECT
p1.patient_id
, COUNT(*) AS cnt
FROM
patients AS p1
JOIN
patients AS p2
ON p2.patient_id <= p1.patient_id
GROUP BY
p1.patient_id
) AS g
ON g.patient_id = p.patient_id
SET
p.patient_code = g.cnt ;
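If you are on MySQL 8.0 or later (an assumption; the question does not state a version), the same numbering can be sketched with ROW_NUMBER() instead of the self-join count. The derived table is still joined back to patients, so the update does not read its own target table directly. The aliases g and rn are illustrative.

```sql
-- Number the patients 1, 2, 3, ... in patient_id order,
-- then copy that number into patient_code.
UPDATE patients AS p
JOIN (
    SELECT patient_id,
           ROW_NUMBER() OVER (ORDER BY patient_id) AS rn
    FROM patients
) AS g
  ON g.patient_id = p.patient_id
SET p.patient_code = g.rn;
```

This avoids comparing every row with every lower-numbered row, which the p2.patient_id <= p1.patient_id join has to do, so it is likely to scale better on a large table.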
I found a working solution, but it is just a workaround:
SET @code=0;
UPDATE patients SET patient_code = (SELECT @code:=@code+1 AS code)
Try this,
UPDATE patients p1 INNER JOIN
(
SELECT COUNT(*) as count,patient_id
FROM patients
group by patient_id
)p2
SET p1.patient_code=p2.count
WHERE p2.patient_id <= p1.patient_id
Thanks to Mari's answer I found a solution to my similar problem. But I wanted to add a bit of an explanation which for me at first wasn't too clear from his answer.
What I wanted to do would have been as simple as the following:
UPDATE my_comments AS c
SET c.comment_responses = (
SELECT COUNT(*) FROM my_comments AS c1
WHERE c1.parent_uid = c.uid
);
Thanks to Mari I then found the solution on how to achieve this without running into the error You can't specify target table 'p1' for update in FROM clause:
UPDATE my_comments AS c
INNER JOIN (
SELECT c1.parent_uid, COUNT(*) AS cnt
FROM my_comments AS c1
WHERE c1.parent_uid <> 0
GROUP BY c1.parent_uid
) AS c2
SET c.comment_responses = c2.cnt
WHERE c2.parent_uid = c.uid;
Two things tripped me up before I got to this solution:
the parent_uid field doesn't always contain the id of a parent, which is why I added the WHERE clause inside the inner join's subquery
I didn't quite understand why I needed the GROUP BY until I executed the SELECT statement on its own. The answer is that without GROUP BY, COUNT collapses the whole result into a single row and counts everything. The GROUP BY is needed to get one count per group instead. In my case I had to group by parent_uid rather than uid to get the correct count: grouped by uid the COUNT would always be 1, whereas each parent_uid appears multiple times in the result. I suggest you run the SELECT statement on its own to check that it returns what you expect before you execute the full UPDATE statement.
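The check described above can be sketched as two standalone queries against the my_comments table from this answer. Running them side by side shows what the GROUP BY changes.

```sql
-- Without GROUP BY: a single row holding the total number of replies
-- across all parents.
SELECT COUNT(*) AS cnt
FROM my_comments
WHERE parent_uid <> 0;

-- With GROUP BY parent_uid: one row per parent comment, each with its own
-- reply count. This is the shape the UPDATE's inner join needs.
SELECT parent_uid, COUNT(*) AS cnt
FROM my_comments
WHERE parent_uid <> 0
GROUP BY parent_uid;
```

One consequence of the inner join is that comments with no replies never match a row of the second query, so their comment_responses value is left as it was rather than being set to 0.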