Update Mysql Query Optimisation - mysql

I need to write a MySQL query with a very large number of conditions (about 150,000).
At the moment the query is:
UPDATE table SET activated=NULL
WHERE (
id=XXXX
OR id=YYYY
OR id=ZZZZ
OR id=...
...
)
AND activated IS NOT NULL
Do you know a better way to do this?

If you're talking about thousands of items, an IN clause probably isn't going to work. In that case you would want to insert the items into a temporary table, then join with it for the update, like so:
UPDATE table tb
JOIN temptable ids ON ids.id = tb.id
SET tb.activated = NULL
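A minimal sketch of the temporary-table approach, run against Python's built-in sqlite3 module as a stand-in for MySQL so it can be executed anywhere. The table name `items`, the helper `deactivate`, and the temp table `temp_ids` are assumptions made for illustration. SQLite has no multi-table UPDATE ... JOIN, so the update matches against the temp table with IN (SELECT ...) instead; the staging step is the same idea as in the answer above.

```python
import sqlite3


def deactivate(conn, ids):
    """Set activated = NULL for rows whose id is in `ids`; return rows changed."""
    cur = conn.cursor()
    # Stage the ids in a temporary table instead of one huge OR / IN list.
    cur.execute("CREATE TEMP TABLE temp_ids (id INTEGER PRIMARY KEY)")
    cur.executemany(
        "INSERT OR IGNORE INTO temp_ids (id) VALUES (?)",
        ((i,) for i in ids),
    )
    # Only touch rows that are still activated, as in the original query.
    cur.execute(
        "UPDATE items SET activated = NULL "
        "WHERE activated IS NOT NULL AND id IN (SELECT id FROM temp_ids)"
    )
    changed = cur.rowcount
    cur.execute("DROP TABLE temp_ids")
    conn.commit()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, activated INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(i, 1) for i in range(1, 11)])

changed = deactivate(conn, [2, 4, 6, 999])  # 999 does not exist and is ignored
remaining = conn.execute(
    "SELECT COUNT(*) FROM items WHERE activated IS NOT NULL"
).fetchone()[0]
```

With a real MySQL connection the staging insert would normally be sent in batches, so that no single statement carries all 150,000 values.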

UPDATE table
SET activated = NULL
WHERE id in ('XXXX', 'YYYY', 'zzzz')
AND activated IS NOT NULL

Related

Multiple conditions on a temporary table - MySQL

I have a temporary table called 'tempaction'. I want to update rows in another table whose 'ActionID' matches one in the temporary table. Matching on ActionID alone gave me the safe update mode error, I think because ActionID is only part of a compound primary key. However, when I try
UPDATE action
SET Status = 'Sent'
WHERE ActionID IN( select ActionID from tempaction)
AND DeviceID IN( select DeviceID from tempaction);
I get a "temporary table cannot be reopened" error.
Checking both parts of the primary key has worked around the safe update error in the past. I also understand that I cannot reference a temporary table twice in the same statement.
How can I select rows with matching ActionIDs, or matching ActionIDs AND DeviceIDs, from this temporary table?
Temporary table:
CREATE TEMPORARY TABLE tempaction (ActionID BIGINT)
SELECT *
FROM action
WHERE DeviceID = '1234'
AND Status = 'Pending'
You can try an UPDATE with a JOIN instead of the sub-queries, so the temporary table is referenced only once:
UPDATE action a
JOIN
tempaction t ON a.ActionID = t.ActionID
SET
a.Status = 'Sent';
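To match on both parts of the compound key while still referencing the temporary table only once, the match condition can compare both columns. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, with made-up sample rows; SQLite has no UPDATE ... JOIN, so it uses a correlated EXISTS. In MySQL the equivalent change would be adding `AND a.DeviceID = t.DeviceID` to the ON clause of the join above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE action (
        ActionID INTEGER,
        DeviceID TEXT,
        Status   TEXT,
        PRIMARY KEY (ActionID, DeviceID)
    );
    INSERT INTO action VALUES
        (1, '1234', 'Pending'),
        (1, '5678', 'Pending'),
        (2, '1234', 'Pending'),
        (3, '1234', 'Done');
    -- Stage both key columns of the rows to be updated.
    CREATE TEMP TABLE tempaction AS
        SELECT ActionID, DeviceID
        FROM action
        WHERE DeviceID = '1234' AND Status = 'Pending';
""")

# One reference to tempaction, matching on ActionID AND DeviceID.
conn.execute("""
    UPDATE action
    SET Status = 'Sent'
    WHERE EXISTS (
        SELECT 1
        FROM tempaction AS t
        WHERE t.ActionID = action.ActionID
          AND t.DeviceID = action.DeviceID
    )
""")
conn.commit()

sent = conn.execute(
    "SELECT ActionID, DeviceID FROM action WHERE Status = 'Sent' ORDER BY ActionID"
).fetchall()
still_pending = conn.execute(
    "SELECT ActionID, DeviceID FROM action WHERE Status = 'Pending'"
).fetchall()
```

The row (1, '5678') shares an ActionID with a staged row but keeps its 'Pending' status, which is the difference between matching on ActionID alone and matching on the full key.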

MySQL Procedure Insert only if not exists

I'm using MySQL stored procedures and I want to insert some rows from a table in one database into a table in another database through a stored procedure. More specifically, from database "new_schema", table "Routers" and field "mac_address" to database "data_warehouse2", table "dim_cpe" and field "mac_address".
This is the code I used in the first insertion, that worked perfectly.
insert into data_warehouse2.dim_cpe (data_warehouse2.dim_cpe.mac_address, data_warehouse2.dim_cpe.ssid)
(select new_schema.Routers.mac_address, new_schema.Routers.ssid from new_schema.Routers, data_warehouse2.dim_cpe);
Now I have more rows in the table "Routers" to be inserted into "dim_cpe" but, since there are rows already there, I want just to insert the new ones.
As seen in other posts, I tried a where clause:
where new_schema.device_info.mac_address != data_warehouse2.dim_cpe.mac_address
and a:
on duplicate key update new_schema.Routers.mac_address = data_warehouse2.dim_cpe.mac_address
Both didn't work. What's the best way to do this?
Thanks in advance.
You could leave the target table (dim_cpe) out of the from clause, and use a not exists clause instead:
where not exists
(select mac_address from data_warehouse2.dim_cpe
where data_warehouse2.dim_cpe.mac_address = new_schema.Routers.mac_address
and data_warehouse2.dim_cpe.ssid = new_schema.Routers.ssid)
Or you could left join and check whether the fields from dim_cpe are null:
insert into data_warehouse2.dim_cpe
(data_warehouse2.dim_cpe.mac_address, data_warehouse2.dim_cpe.ssid)
(select new_schema.Routers.mac_address, new_schema.Routers.ssid
from new_schema.Routers
left join data_warehouse2.dim_cpe on
new_schema.Routers.mac_address = data_warehouse2.dim_cpe.mac_address
and new_schema.Routers.ssid = data_warehouse2.dim_cpe.ssid
where dim_cpe.mac_address is null and dim_cpe.ssid is null);
Edit to say this is a general SQL solution. I'm not sure if there's a better MySQL-specific approach to this.
Edit to show your query:
insert into data_warehouse2.dim_cpe (mac_address, ssid)
select new_schema.Routers.mac_address, new_schema.Routers.ssid
from new_schema.Routers where not exists
(select data_warehouse2.dim_cpe.mac_address from data_warehouse2.dim_cpe
where data_warehouse2.dim_cpe.mac_address = new_schema.Routers.mac_address
and data_warehouse2.dim_cpe.ssid = new_schema.Routers.ssid);
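As a quick check that the NOT EXISTS form only inserts rows that are missing, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The two databases are collapsed into two tables in one in-memory database, and the sample MAC addresses and SSIDs are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE routers (mac_address TEXT, ssid TEXT);
    CREATE TABLE dim_cpe (mac_address TEXT, ssid TEXT);
    INSERT INTO routers VALUES ('aa:aa', 'home'), ('bb:bb', 'office');
    INSERT INTO dim_cpe VALUES ('aa:aa', 'home');  -- already copied earlier
""")

INSERT_NEW = """
    INSERT INTO dim_cpe (mac_address, ssid)
    SELECT r.mac_address, r.ssid
    FROM routers AS r
    WHERE NOT EXISTS (
        SELECT 1
        FROM dim_cpe AS d
        WHERE d.mac_address = r.mac_address
          AND d.ssid = r.ssid
    )
"""


def count_dim_cpe():
    return conn.execute("SELECT COUNT(*) FROM dim_cpe").fetchone()[0]


before = count_dim_cpe()
conn.execute(INSERT_NEW)      # copies only ('bb:bb', 'office')
after_first = count_dim_cpe()
conn.execute(INSERT_NEW)      # running it again inserts nothing
after_second = count_dim_cpe()
conn.commit()
```

Because the statement can be re-run without creating duplicates, it is safe to call from a stored procedure on a schedule.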

SQL Update with inner join

I have a table of records, which has a self-relationship.
Additionally - to make searching easier - I have a flag which determines that a record has been referenced and hence that row is now "obsolete" and is only there for audit purposes:
CREATE TABLE Records
(
RecordID INT(5) NOT NULL,
Replaces INT(5) NULL,
Obsolete INT(1) NOT NULL
)
RecordID is the PK, Replaces links to a previous RecordID which has now been replaced, and Obsolete is redundant information which just says that another record has replaced this one. It just makes searching a lot easier. The table is very large. These are just 3 of the columns.
The only problem is: there was a typo in one of the queries in the system so for a small set of rows, the Obsolete value was not set to 1 (true).
This query will show all the records with Obsolete equal to 0 which should be equal to 1:
SELECT *
FROM Records AS rec1
LEFT JOIN Records AS rec2
ON rec1.Replaces = rec2.RecordID
WHERE rec2.RecordID IS NOT NULL
AND rec2.Obsolete = 0;
Now I need to run an UPDATE to change all those rec2.Obsolete values from 0 to 1, but I'm not sure how to write a query with an INNER JOIN.
You don't need an inner join. Since your query already returns the records that need to be updated, just do this:
Update Records
set Obsolete=1 where
RecordID in (
SELECT rec2.RecordID
FROM Records AS rec1
LEFT JOIN Records AS rec2
ON rec1.Replaces = rec2.RecordID
WHERE rec2.RecordID IS NOT NULL
AND rec2.Obsolete = 0
)
UPDATE Records
SET obsolete = 1
WHERE recordID in (
SELECT rec2.RecordID
FROM Records AS rec1
LEFT JOIN Records AS rec2
ON rec1.Replaces = rec2.RecordID
WHERE rec2.RecordID IS NOT NULL
AND rec2.Obsolete = 0
)
I would suggest doing this in two steps using a temporary table:
-- Create temporary table for holding RecordIDs to be marked as obsolete
CREATE TEMPORARY TABLE `mark_obsolete` (`RecordID` INT NOT NULL);
-- Insert RecordIDs to mark as obsolete into temp table
INSERT INTO `mark_obsolete` (`RecordID`)
SELECT `rec2`.`RecordID`
FROM
`Records` AS `rec1`
INNER JOIN `Records` AS `rec2`
ON `rec1`.`Replaces` = `rec2`.`RecordID`
WHERE `rec2`.`Obsolete` = 0;
-- Update records using inner join to temp table
UPDATE
`Records` AS `r`
INNER JOIN `mark_obsolete` AS `o`
ON `r`.`RecordID` = `o`.`RecordID`
SET `r`.`Obsolete` = 1;
DROP TEMPORARY TABLE `mark_obsolete`;
Note that using a LEFT JOIN with WHERE rec2.RecordID IS NOT NULL is the same as an INNER JOIN.
The reason for using a temporary table is to avoid locking issues when updating the same table used in the sub-query. And it might also give you better performance than using the IN clause.
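The two-step version can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, so the final update matches against the temp table with IN (SELECT ...) instead of an INNER JOIN; the four sample rows are invented to cover the cases described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Records (
        RecordID INTEGER PRIMARY KEY,
        Replaces INTEGER,
        Obsolete INTEGER NOT NULL
    );
    INSERT INTO Records VALUES
        (1, NULL, 0),  -- replaced by record 2, but the flag was never set
        (2, 1,    1),  -- replaced by record 3, flag set correctly
        (3, 2,    0),  -- current record
        (4, NULL, 0);  -- never replaced

    -- Step 1: collect the RecordIDs that are referenced but not yet flagged.
    CREATE TEMP TABLE mark_obsolete AS
        SELECT rec2.RecordID
        FROM Records AS rec1
        INNER JOIN Records AS rec2 ON rec1.Replaces = rec2.RecordID
        WHERE rec2.Obsolete = 0;

    -- Step 2: flag them, reading only from the temp table.
    UPDATE Records
    SET Obsolete = 1
    WHERE RecordID IN (SELECT RecordID FROM mark_obsolete);

    DROP TABLE mark_obsolete;
""")

flags = conn.execute(
    "SELECT RecordID, Obsolete FROM Records ORDER BY RecordID"
).fetchall()
```

Only record 1 changes: it is the one row that another record replaces while its flag is still 0.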

Does creating a CTE help in this case?

I have a very poorly written query in SQL Server 2008:
UPDATE PatientChartImages
SET PatientChartImages.IsLockDown = @IsLockdown
WHERE PatientChartImages.IsLockDown = @IsNotLockdown
AND PatientChartId IN (
SELECT PatientCharts.PatientChartId
FROM PatientCharts
WHERE ( PatientCharts.ChartStatusID = @ChartCompletedStatusID
OR PatientCharts.ChartStatusID = @ChartOnBaseStatusID
)
AND PatientCharts.IsLockDown = @IsNotLockdown
AND PatientCharts.CompletedOn IS NOT NULL
AND DATEDIFF(MINUTE, PatientCharts.CompletedOn, GETUTCDATE()) >= ( SELECT
tf.LockUpInterval
FROM
@tblFacCOnf tf
WHERE
tf.facilityId = PatientCharts.FacilityId
) )
This query locks the main table and results in a timeout. If I first create a CTE of all the updatable records and then update the main table by joining to the CTE, will it help?
First, I'd advise replacing the IN condition with EXISTS. Second, move all the conditional logic into a CTE. Third, replace the sub-select on @tblFacCOnf with a join.
The last suggestion depends on your business logic and is less important, in my opinion.
In the end you will get something like:
WITH search_cte as (
SELECT PatientCharts.PatientChartId
FROM PatientCharts
JOIN @tblFacCOnf tf on tf.facilityId = PatientCharts.FacilityId
WHERE ( PatientCharts.ChartStatusID = @ChartCompletedStatusID
OR PatientCharts.ChartStatusID = @ChartOnBaseStatusID
)
AND PatientCharts.IsLockDown = @IsNotLockdown
AND PatientCharts.CompletedOn IS NOT NULL
AND DATEDIFF(MINUTE, PatientCharts.CompletedOn, GETUTCDATE()) >= tf.LockUpInterval
) --cte end
UPDATE PatientChartImages
SET PatientChartImages.IsLockDown = @IsLockdown
WHERE PatientChartImages.IsLockDown = @IsNotLockdown
AND EXISTS (select 1 from search_cte where search_cte.PatientChartId = PatientChartImages.PatientChartId)
One additional thing I might suggest if the other suggestions don't get you enough speed is not to use a table variable. Temp Tables are often faster for large data sets and can be indexed if need be.
The update lock is held for the time it takes to compute the CTE plus the time for the update itself. The CTE time is probably what causes the timeout.
To reduce the lock time to the minimum required to update the target table, I suggest you create a temp table with two columns. Col1 is the primary key or cluster key of the target table and Col2 is the value you want in the target table. Create the temp table and fill it with values according to your business logic within one transaction. Then update the target table using a join to the temp table, taking the value from the temp table, in a separate transaction. After the update, drop the temp table.
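The two-transaction idea can be sketched as follows. This is only an illustration of the staging pattern, run on Python's built-in sqlite3 module rather than SQL Server, so it says nothing about SQL Server's actual locking behaviour. The `ImageId` key column, the `MinutesSinceCompleted` column (standing in for the DATEDIFF expression), the `to_lock` temp table, and the fixed 60-minute interval are all assumptions made for the example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the transactions below are opened and closed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE PatientCharts (
        PatientChartId INTEGER PRIMARY KEY,
        IsLockDown INTEGER,
        MinutesSinceCompleted INTEGER  -- hypothetical stand-in for DATEDIFF(...)
    );
    CREATE TABLE PatientChartImages (
        ImageId INTEGER PRIMARY KEY,   -- hypothetical key column
        PatientChartId INTEGER,
        IsLockDown INTEGER
    );
    INSERT INTO PatientCharts VALUES (1, 0, 120), (2, 0, 5);
    INSERT INTO PatientChartImages VALUES (10, 1, 0), (11, 1, 0), (12, 2, 0);
""")
LOCK_UP_INTERVAL = 60  # hypothetical; the original reads this per facility

# Transaction 1: the expensive selection. It writes only to the temp table.
conn.execute("BEGIN")
conn.execute("CREATE TEMP TABLE to_lock (ImageId INTEGER PRIMARY KEY, NewValue INTEGER)")
conn.execute(
    """
    INSERT INTO to_lock (ImageId, NewValue)
    SELECT i.ImageId, 1
    FROM PatientChartImages AS i
    JOIN PatientCharts AS c ON c.PatientChartId = i.PatientChartId
    WHERE i.IsLockDown = 0
      AND c.IsLockDown = 0
      AND c.MinutesSinceCompleted >= ?
    """,
    (LOCK_UP_INTERVAL,),
)
conn.execute("COMMIT")

# Transaction 2: a short update driven purely by the staged keys and values.
conn.execute("BEGIN")
conn.execute(
    """
    UPDATE PatientChartImages
    SET IsLockDown = (
        SELECT NewValue FROM to_lock
        WHERE to_lock.ImageId = PatientChartImages.ImageId
    )
    WHERE ImageId IN (SELECT ImageId FROM to_lock)
    """
)
conn.execute("COMMIT")
conn.execute("DROP TABLE to_lock")

locked = [
    row[0]
    for row in conn.execute(
        "SELECT ImageId FROM PatientChartImages WHERE IsLockDown = 1 ORDER BY ImageId"
    )
]
```

The point of the split is that the second transaction does no searching of its own: every row it touches was identified beforehand by key.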
I think you should create an SQL script (or a stored procedure, if you will call it from a higher level) where you store the results of your selection in a cursor (you only have to find the PatientChartIds of the rows to be updated) and then use it in your update. So the answer is yes.
It's easy to test this: put these commands into a transaction, run a SELECT to check the results, and then roll the transaction back. Good luck.

Updating cached counts in MySQL

In order to fix a bug, I have to iterate over all the rows in a table, updating a cached count of children to what its real value should be. The rows in the table form a tree.
In Rails, the following does what I want:
Thing.all.each do |th|
Thing.connection.update(
"
UPDATE #{Thing.quoted_table_name}
SET children_count = #{th.children.count}
WHERE id = #{th.id}
"
)
end
Is there any way of doing this in a single MySQL query?
Alternatively, is there any way of doing this in multiple queries, but in pure MySQL?
I want something like
UPDATE table_name
SET children_count = (
SELECT COUNT(*)
FROM table_name AS tbl
WHERE tbl.parent_id = table_name.id
)
except the above doesn't work (I understand why it doesn't).
You probably got this error, right?
ERROR 1093 (HY000): You can't specify target table 'table_name' for update in FROM clause
The easiest way around this is probably to select the child counts into a temporary table, then join to that table for the updates.
This should work, assuming the depth of the parent/child relationship is always 1. Based on your original update this seems like a safe assumption.
I added an explicit write lock on the table to ensure that no rows are modified after I create the temp table. You should only do this if you can afford to have the table locked for the duration of this update, which will depend on the amount of data.
lock tables table_name write;
create temporary table temp_child_counts as
select parent_id, count(*) as child_count
from table_name
group by parent_id;
alter table temp_child_counts add unique key (parent_id);
update table_name
inner join temp_child_counts on temp_child_counts.parent_id = table_name.id
set table_name.children_count = temp_child_counts.child_count;
unlock tables;
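A runnable sketch of the same count-then-update idea, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `things` and the sample tree are assumptions for illustration. One deliberate difference from the inner join above: rows with no children get no entry in the temp table, so an inner join would leave their stale count in place. The sketch resets them to 0 with COALESCE; in MySQL a LEFT JOIN combined with COALESCE would have the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER,
        children_count INTEGER NOT NULL DEFAULT 0
    );
    -- Every cached count starts out wrong (9) to make the repair visible.
    INSERT INTO things (id, parent_id, children_count) VALUES
        (1, NULL, 9),
        (2, 1,    9),
        (3, 1,    9),
        (4, 2,    9);

    -- Step 1: compute the real child count per parent.
    CREATE TEMP TABLE temp_child_counts AS
        SELECT parent_id, COUNT(*) AS child_count
        FROM things
        WHERE parent_id IS NOT NULL
        GROUP BY parent_id;

    -- Step 2: write the counts back; rows without children fall back to 0.
    UPDATE things
    SET children_count = COALESCE(
        (SELECT t.child_count
         FROM temp_child_counts AS t
         WHERE t.parent_id = things.id),
        0
    );

    DROP TABLE temp_child_counts;
""")

counts = conn.execute(
    "SELECT id, children_count FROM things ORDER BY id"
).fetchall()
```

Row 1 ends up with two children, row 2 with one, and the two leaf rows with zero.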
Your subselect update should work; let's try touching it up a bit:
UPDATE table_name
SET children_count = (
SELECT COUNT(sub_table_name.id)
FROM sub_table_name
WHERE sub_table_name.parent_id = table_name.id
)
Or if the sub-table is the same table:
UPDATE table_name as top_table
SET children_count = (
SELECT COUNT(sub_table.id)
FROM (select * from table_name) as sub_table
WHERE sub_table.parent_id = top_table.id
)
But that's not super efficient I'm guessing.