MySQL RDS Stored Procedure Update query is slow

I have an update query in the Stored Procedure that updates TABLE1 based on the IDs present from TABLE2. This is written using a subquery as follows.
update TABLE1 A
set status = 'ABC'
where A.ID in (
select ID
from TABLE2 B
where B.SE_ID = V_ID
and B.LOAD_DT = V_DT
);
I have rewritten this using:
a JOIN
masking the subquery from the main query
a temp table and a join
Run standalone, the update is fast, but placed inside the stored procedure it is very slow. TABLE1 needs to be updated with 2,000 records, based on the 2,000 IDs from TABLE2. Can someone please help with this?

Avoid using subqueries in place of joins. MySQL optimizes this use of a subquery very poorly; that is, it may run the subquery up to 2,000 times.
Use a join:
UPDATE TABLE1 A
INNER JOIN TABLE2 B
ON A.ID = B.ID
SET A.status = 'ABC'
WHERE B.SE_ID = V_ID
AND B.LOAD_DT = V_DT;
You'll want to create an index to optimize this.
ALTER TABLE TABLE2 ADD INDEX (SE_ID, LOAD_DT, ID);
No need to create an index on TABLE1, if my assumption is correct that its ID column is its primary key. That is itself an index.
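To confirm the rewritten update can use that index, you can check the plan first. This is a minimal sketch, assuming the procedure parameters are substituted with representative literal values (EXPLAIN on an UPDATE needs MySQL 5.6 or later):
EXPLAIN
UPDATE TABLE1 A
INNER JOIN TABLE2 B
ON A.ID = B.ID
SET A.status = 'ABC'
WHERE B.SE_ID = 123            -- placeholder for V_ID
AND B.LOAD_DT = '2024-01-01';  -- placeholder for V_DT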

Performance issue when updating one table based on another

Given:
Table A has columns guid and missing; Table B has a single column, guid.
Table A: ~400k rows
Table B: ~150k rows
The tables largely share the same guids, but I need to mark in A all the guids that are missing from B. Both guid columns are indexed.
Query:
update table_a
left join table_b b on table_a.guid = b.guid
set missing = true
where b.guid is null;
This query works but took 4.5 hours on my machine. Is there any way I can make this query run faster?
UPD:
All three answers below gave me some tips to think about.
The following query ran for 8 seconds.
update table_a a
set missing = true
where a.guid not in (
select a.guid
from table_b b,
table_a a
where b.guid = a.guid
);
It's much faster; it only tests whether the row exists:
UPDATE table_a ta
SET missing = true
WHERE NOT EXISTS ( SELECT 1 from table_b tb WHERE ta.guid = tb.guid );
Does table_b have an index starting with guid? (A PRIMARY KEY is an INDEX.)
Do you need to run this query frequently? If not, get rid of it after this initial update. In the future, whenever you modify table_b, reach over and update table_a; TRIGGERs might be a good way to do that. A DELETE trigger could set missing = 1, an INSERT trigger could clear it, etc.
If, instead, you choose to run the UPDATE repeatedly, see this for how to chunk the action: http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks
Another approach is to check for whether the row is "missing" by a LEFT JOIN in the SELECT, not by having the column.
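A minimal sketch of that LEFT JOIN check, computing the "missing" status on the fly instead of storing it in a column:
SELECT a.guid
FROM table_a a
LEFT JOIN table_b b ON b.guid = a.guid
WHERE b.guid IS NULL;  -- guids in table_a with no match in table_b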
Using TRIGGERs would be something like this:
DELIMITER //
CREATE TRIGGER del BEFORE DELETE ON table_b
FOR EACH ROW
BEGIN
UPDATE table_a
SET missing = true
WHERE guid = OLD.guid; -- or maybe test id??
END;
//
DELIMITER ;
And one for INSERT. And one for UPDATE if you might change guid; that one would need two UPDATE table_a statements: one for the old guid (like the delete case) and one for the new guid (like the insert case).
These would add a small burden when table_b is modified, but probably not enough to worry about.
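A companion INSERT trigger might look like the following sketch, assuming missing should be cleared whenever a matching guid appears in table_b:
DELIMITER //
CREATE TRIGGER ins AFTER INSERT ON table_b
FOR EACH ROW
BEGIN
UPDATE table_a
SET missing = false
WHERE guid = NEW.guid;
END;
//
DELIMITER ;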
Try:
update table_a
set missing = true
where guid not in (select guid from table_b)

MySQL Procedure Insert only if not exists

I'm using MySQL stored procedures and I want to insert some rows from a table in one database into a table in another database through a stored procedure. More specifically, from database "new_schema", table "Routers", field "mac_address" into database "data_warehouse2", table "dim_cpe", field "mac_address".
This is the code I used for the first insertion, which worked perfectly.
insert into data_warehouse2.dim_cpe (data_warehouse2.dim_cpe.mac_address, data_warehouse2.dim_cpe.ssid)
(select new_schema.Routers.mac_address, new_schema.Routers.ssid from new_schema.Routers, data_warehouse2.dim_cpe);
Now I have more rows in the table "Routers" to be inserted into "dim_cpe" but, since some rows are already there, I just want to insert the new ones.
As seen in other posts, I tried a where clause:
where new_schema.device_info.mac_address != data_warehouse2.dim_cpe.mac_address
and a:
on duplicate key update new_schema.Routers.mac_address = data_warehouse2.dim_cpe.mac_address"
Neither worked. What's the best way to do this?
Thanks in advance.
You could leave the destination table out of the from clause, and use a not exists clause instead:
where not exists
(select mac_address from data_warehouse2.dim_cpe
where mac_address = new_schema.Routers.mac_address
and ssid = new_schema.Routers.ssid)
Or you could left join and check whether the fields from dim_cpe are null:
insert into data_warehouse2.dim_cpe
(data_warehouse2.dim_cpe.mac_address, data_warehouse2.dim_cpe.ssid)
(select new_schema.Routers.mac_address, new_schema.Routers.ssid
from new_schema.Routers
left join data_warehouse2.dim_cpe on
new_schema.Routers.mac_address = data_warehouse2.dim_cpe.mac_address
and new_schema.Routers.ssid = data_warehouse2.dim_cpe.ssid
where dim_cpe.mac_address is null and dim_cpe.ssid is null);
Edit to say this is a general SQL solution. I'm not sure if there's a better MySql-specific approach to this.
Edit to show your query:
insert into data_warehouse2.dim_cpe (mac_address, ssid)
select new_schema.Routers.mac_address, new_schema.Routers.ssid
from new_schema.Routers where not exists
(select data_warehouse2.dim_cpe.mac_address from data_warehouse2.dim_cpe
where data_warehouse2.dim_cpe.mac_address = new_schema.Routers.mac_address
and data_warehouse2.dim_cpe.ssid = new_schema.Routers.ssid);
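As a MySQL-specific alternative, if dim_cpe has (or can be given) a unique key on (mac_address, ssid), which is an assumption rather than something stated in the question, INSERT IGNORE will skip rows that would duplicate it:
ALTER TABLE data_warehouse2.dim_cpe
ADD UNIQUE KEY uq_mac_ssid (mac_address, ssid);

INSERT IGNORE INTO data_warehouse2.dim_cpe (mac_address, ssid)
SELECT mac_address, ssid
FROM new_schema.Routers;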

Update Mysql Query Optimisation

I need to run a MySQL update query that matches a large number of ids (about 150,000).
For the moment the query is:
UPDATE table SET activated=NULL
WHERE (
id=XXXX
OR id=YYYY
OR id=ZZZZ
OR id=...
...
)
AND activated IS NOT NULL
Do you know of a better way to do this?
If you're talking about thousands of items, an IN clause probably isn't going to work. In that case you would want to insert the items into a temporary table, then join with it for the update, like so:
UPDATE table tb
JOIN temptable ids ON ids.id = tb.id
SET tb.activated = NULL
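A minimal sketch of creating and filling that temporary table (the table name temptable and the values shown are illustrative):
CREATE TEMPORARY TABLE temptable (id INT PRIMARY KEY);

-- load the ~150,000 ids in multi-row batches
INSERT INTO temptable (id) VALUES (1234), (5678), (9012);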
Alternatively, for a list of ids you can use a single IN clause instead of chained ORs:
UPDATE table
SET activated = NULL
WHERE id IN ('XXXX', 'YYYY', 'ZZZZ')
AND activated IS NOT NULL;

Large records table insertion issue Mysql

I am a developer and I am facing an issue while managing a table which has a large number of records.
I am executing a cron job to fill up data in a primary table (Table A), which has 5-6 columns and approximately 400,000 to 500,000 rows, and then creating another table from it; the data will continue to increase over time.
TABLE A contains the raw data and my output table is TABLE B.
My cron script truncates Table B and then repopulates it with an INSERT ... SELECT:
TRUNCATE TABLE_B;
INSERT INTO TABLE_B (field1, field2)
SELECT DISTINCT(t1.field2), t2.field2
FROM TABLE_A AS t1
INNER JOIN TABLE_A t2 ON t2.field1=t1.field1
WHERE t1.field2 <> t2.field2
GROUP BY t1.field2, t2.field2
ORDER BY COUNT(t1.field2) DESC;
The above select query produces approximately 150,000 to 200,000 rows.
It now takes too much time to populate TABLE B, and meanwhile, if my application tries to access TABLE B, its select queries fail.
Running EXPLAIN on the query gives the following:
'1','PRIMARY','T1','ALL','field1_index',NULL,NULL,NULL,'431743','Using temporary;Using filesort'
'1','PRIMARY','T2','ref','field1_index','field1_index','767','DBNAME.T1.field1','1','Using where'
Can someone please help me improve this process, or suggest alternatives to it?
Thanks
Suketu
You should do the whole process in a stored proc.
Do not truncate such a large table. Instead, follow these steps (a sketch of the idea follows the list):
Copy the TableB structure to TableB_Copy.
DROP TableB.
Rename TableB_Copy to TableB.
Disable indexes on TableB.
Insert the data from TableA into TableB.
Create the indexes on TableB.
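One way to sketch that idea in SQL, assuming the table and column names from the question, and filling the copy before swapping it in so readers never see an empty table (RENAME TABLE makes the swap atomic):
CREATE TABLE TABLE_B_copy LIKE TABLE_B;

INSERT INTO TABLE_B_copy (field1, field2)
SELECT DISTINCT t1.field2, t2.field2
FROM TABLE_A AS t1
INNER JOIN TABLE_A t2 ON t2.field1 = t1.field1
WHERE t1.field2 <> t2.field2;

RENAME TABLE TABLE_B TO TABLE_B_old, TABLE_B_copy TO TABLE_B;
DROP TABLE TABLE_B_old;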
In my view, the solution would be like this:
SELECT
DISTINCT(t1.field2), t2.field2
FROM
TABLE_A AS t1
INNER JOIN
TABLE_A t2 ON
t2.field1=t1.field1
WHERE
t1.field2 <> t2.field2
GROUP BY
t1.field2, t2.field2
ORDER BY
COUNT(t1.field2)
DESC
INTO OUTFILE 'PATH-TO-FILE';
For instance, a file such as "C:\TEMP\DATA1.SQL". What this query does is create a new tab-delimited file, which can then be loaded into any table.
Now, to import the data into the table:
LOAD DATA INFILE 'PATH-TO-FILE'
INTO TABLE table_name;
With this query the data will be inserted, and in the meantime you will still be able to use the table into which you are inserting the data.

Delete statement on the same table

I need to run a DELETE statement on a table where the condition is a correlated subquery against that same table.
In MySQL I can't directly run a DELETE statement that checks a condition against the same table through a correlated subquery.
I also want to know whether using a temp table will affect MySQL's memory/performance.
Any help will be highly appreciated.
Thanks.
You can make MySQL build the temp table for you by wrapping your "where" subquery as an inline derived table.
This original query will give you the dreaded "You can't specify target table for update in FROM clause":
DELETE FROM sametable
WHERE id IN (
SELECT id FROM sametable WHERE stuff=true
)
Rewriting it to use an inline (derived) temp table becomes:
DELETE FROM sametable
WHERE id IN (
SELECT implicitTemp.id from (SELECT id FROM sametable WHERE stuff=true) implicitTemp
)
Your question is really not clear, but I would guess you have a correlated subquery and you're having trouble doing a SELECT from the same table that is locked by the DELETE. For instance to delete all but the most recent revision of a document:
DELETE FROM document_revisions d1 WHERE edit_date <
(SELECT MAX(edit_date) FROM document_revisions d2
WHERE d2.document_id = d1.document_id);
This is a problem for MySQL.
Many examples of these types of problems can be solved using MySQL multi-table delete syntax:
DELETE d1 FROM document_revisions d1 JOIN document_revisions d2
ON d1.document_id = d2.document_id AND d1.edit_date < d2.edit_date;
But these solutions are best designed on a case-by-case basis, so if you edit your question and be more specific about the problem you're trying to solve, perhaps we can help you.
In other cases you may be right, using a temp table is the simplest solution.
can't directly run a delete statement and check a condition for the same table
Sure you can. If you want to delete from table1 while checking the condition that col1 = 'somevalue', you could do this:
DELETE
FROM table1
WHERE col1 = 'somevalue'
EDIT
To delete using a correlated subquery, please see the following example:
create table project (id int);
create table emp_project (id int, project_id int);
insert into project values (1);
insert into project values (2);
insert into emp_project values (100, 1);
insert into emp_project values (200, 1);
/* Delete any project record that doesn't have associated emp_project records */
DELETE
FROM project
WHERE NOT EXISTS
(SELECT *
FROM emp_project e
WHERE e.project_id = project.id);
/* project 2 doesn't have any emp_project records, so it was deleted, now
we have 1 project record remaining */
SELECT * FROM project;
Result:
id
1
Create a temp table with the values you want to delete, then join it to the table while deleting. In this example I have a table "Games" with an ID column, and I want to delete ids greater than 3. I gather the targets in a temp table first so I can report on them later.
CREATE TEMPORARY TABLE DeletedRows (ID int);

INSERT INTO DeletedRows (ID)
SELECT ID
FROM Games
WHERE ID > 3;

DELETE g
FROM Games g
JOIN DeletedRows x ON x.ID = g.ID;
I have used a GROUP BY aggregate with a HAVING clause on the same table; the query was like this:
DELETE
FROM TableName
WHERE id in
(select implicitTable.id
FROM (
SELECT id
FROM `TableName`
GROUP by id
HAVING count(id)>1
) as implicitTable
)
You mean something like:
DELETE FROM table WHERE someColumn = "someValue";
?
This is definitely possible, read about the DELETE syntax in the reference manual.
You can delete from the same table. The DELETE statement is as follows:
DELETE FROM table_name
WHERE some_column = some_value;