If I run a very long update (one that has millions of records to update and is going to take several hours), is there any way to kill it without having InnoDB roll back the changes?
I would like the records that were already updated to stay as they are (and the table locks to be released ASAP), so that I can continue the update later when I have time for it.
This is similar to what MyISAM would do when killing an update.
If you mean a single UPDATE statement, I may be wrong but I doubt that's possible. However, you can always split your query into smaller batches. Rather than:
UPDATE foo SET bar=gee
... use:
UPDATE foo SET bar=gee WHERE id BETWEEN 1 AND 100;
UPDATE foo SET bar=gee WHERE id BETWEEN 101 AND 200;
UPDATE foo SET bar=gee WHERE id BETWEEN 201 AND 300;
...
This can be automated in a number of ways.
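For instance, here is a minimal stored-procedure sketch; the procedure name, the 10000-row batch size, and the assumption that id is a dense integer key are mine, not from the question:

DELIMITER $$

CREATE PROCEDURE update_in_chunks()
BEGIN
    DECLARE max_id INT;
    DECLARE from_id INT DEFAULT 1;
    DECLARE chunk INT DEFAULT 10000;  -- assumed batch size; tune to taste

    SELECT MAX(id) INTO max_id FROM foo;

    WHILE from_id <= max_id DO
        -- With autocommit on, each UPDATE commits on its own, so a kill
        -- only rolls back the chunk that is currently running.
        UPDATE foo SET bar = gee
        WHERE id BETWEEN from_id AND from_id + chunk - 1;
        SET from_id = from_id + chunk;
    END WHILE;
END $$

DELIMITER ;

Run it with CALL update_in_chunks(); persisting from_id in a small progress table after each chunk would also let you stop the run and resume it later.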
My suggestion would be to create a BLACKHOLE table, with fields matching the needs of your update statement.
CREATE TABLE bh_table1 (
    -- ..field defs
) ENGINE = BLACKHOLE;
Now create a trigger on the blackhole table.
DELIMITER $$

CREATE TRIGGER ai_bh_table1_each AFTER INSERT ON bh_table1 FOR EACH ROW
BEGIN
    -- implicit start of transaction happens here
    UPDATE table1 t1 SET t1.field1 = NEW.field1 WHERE t1.id = NEW.id;
    -- implicit commit happens here
END $$

DELIMITER ;
You can then express the update statement as an insert into the blackhole table.
INSERT INTO bh_table1 (id, field1)
SELECT id, field1
FROM same_table_with_lots_of_rows
WHERE filter_that_still_leaves_lots_of_rows;
This will still be a lot slower than your initial update.
Let me know how it turns out.
If I understand correctly, this code
START TRANSACTION;
SELECT field FROM table WHERE ... FOR UPDATE; -- single row
UPDATE table SET field = ...;
COMMIT;
will lock the selected row until COMMIT.
But if I use MAX()
START TRANSACTION;
SELECT MAX(field) FROM table WHERE ... FOR UPDATE; -- whole table
UPDATE table SET field = ...;
COMMIT;
will this code lock the whole table until COMMIT?
EDIT
Sorry, I had my question wrong.
Obviously the above code will lock the rows matched by WHERE, but it won't lock the whole table. Meaning
INSERT INTO table() VALUES();
could still take place regardless of the COMMIT.
That would mean the return value of
SELECT MAX(field) FROM table WHERE ... FOR UPDATE;
is no longer valid.
How do I lock the table during the transaction, so that neither an INSERT nor an UPDATE can take place before the COMMIT?
It doesn't matter what you're selecting. FOR UPDATE locks all the rows that have to be examined to evaluate the WHERE clause. Otherwise, another transaction could change the columns that are mentioned there, so the later UPDATE would assign to different rows.
And since inserting a new row can change the value of MAX(field), it actually locks the entire table. When I try your example and try to insert a new row from another transaction, the second transaction blocks until I commit the first transaction.
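A quick way to reproduce this; the two-session labels and the small table t(id, field) are my own illustration, assuming the default REPEATABLE READ isolation level:

-- Session 1:
START TRANSACTION;
SELECT MAX(field) FROM t WHERE field > 0 FOR UPDATE;

-- Session 2, in a separate connection: this INSERT blocks until session 1
-- ends, because the locking read took next-key/gap locks on the scanned range.
INSERT INTO t (id, field) VALUES (999, 1);

-- Session 1:
COMMIT; -- session 2's INSERT now proceeds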
I have a weird issue with my trigger. There are two tables: Table A and Table B.
Whenever a row is inserted into Table A, the sum of a column in this table is written to Table B.
It was working fine at first, but recently I noticed that when more than one row is inserted at exactly the same time for a user, the trigger computes the sum in a weird way.
CREATE TRIGGER `update_something` AFTER INSERT ON `Table_A`
FOR EACH ROW BEGIN
    DECLARE sum BIGINT(20);
    SELECT IFNULL(SUM(number), 0) INTO sum FROM Table_A WHERE `user` = NEW.user;
    UPDATE Table_B SET sum_number = sum WHERE id = NEW.id;
END
Example:
Table A
User X has a sum of 15 currently, then (with almost no delay in between):
Number 5 is inserted for him
Number 7 is inserted for him
Table B
On this table, where we hold the sum, the sum for this user was 15.
The trigger updates this table in this way:
20
22 <--- Wrong, this should be 27
As you can see, no number 2 was ever inserted; it effectively added 7 - 5 = 2 for some reason.
How is that possible, and why does it subtract 5 from 7 and add 2 to the sum instead of simply adding 7?
Edit 1:
Warning: This won't work, check the accepted answer instead
One of the answers suggested the SELECT ... FOR UPDATE method.
Will this SELECT ... FOR UPDATE affect the performance negatively in a huge way?
CREATE TRIGGER `update_something` AFTER INSERT ON `Table_A`
FOR EACH ROW BEGIN
    DECLARE sum BIGINT(20);
    SELECT IFNULL(SUM(number), 0) INTO sum FROM Table_A WHERE `user` = NEW.user FOR UPDATE;
    UPDATE Table_B SET sum_number = sum WHERE id = NEW.id;
END
Basically, do we just add FOR UPDATE to the end of the SELECT line like this, and it will perform a row lock in InnoDB and fix the issue?
SELECT IFNULL(SUM(number), 0) INTO sum FROM Table_A WHERE user = NEW.user FOR UPDATE;
Edit 2 (Temporary Fix):
In case someone needs a very quick temporary fix before applying the proper suggested fix: what I did was put a random usleep(rand(1, 500000)) before the INSERT query in PHP to reduce the chance of simultaneous inserts.
The reason for this behaviour is that the inserted data is only committed to the database when the trigger finishes executing. So when both insert operations (5 and 7) execute the trigger in parallel, they read the data as it is in their transaction, i.e. the committed data with the changes made in their own transaction, but not the changes made in any other ongoing transaction.
The committed data in table A sums up to 15 for both transactions, and to that is added the record that is inserted in their own transaction. For the one this is 5, for the other it is 7, but as these records were not yet committed, the other transaction does not see their values.
That is why the sum is 15 + 5 = 20 for the one, and 15 + 7 = 22 for the other. The transactions then both update table B, one after the other (because table B is locked during an update and until the end of the transaction), and the one that commits last "wins".
To solve this, don't read the sum from table A, but keep a running sum in Table B:
CREATE TRIGGER `update_something` AFTER INSERT ON `Table_A`
FOR EACH ROW BEGIN
    UPDATE Table_B SET sum_number = sum_number + NEW.number WHERE id = NEW.id;
END;
/
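If Table_B might not yet have a row for that id, a hypothetical upsert variant of the same trigger (assuming id is a primary or unique key of Table_B) would be:

CREATE TRIGGER `update_something` AFTER INSERT ON `Table_A`
FOR EACH ROW BEGIN
    -- Create the row on first sight, accumulate on every later insert.
    INSERT INTO Table_B (id, sum_number)
    VALUES (NEW.id, NEW.number)
    ON DUPLICATE KEY UPDATE sum_number = sum_number + NEW.number;
END;
/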
I suppose you already have triggers for delete and update on Table_A, as otherwise you'd have another source of inconsistencies.
So these need to be (re)written too:
CREATE TRIGGER `delete_something` AFTER DELETE ON `Table_A`
FOR EACH ROW BEGIN
    UPDATE Table_B SET sum_number = sum_number - OLD.number WHERE id = OLD.id;
END;
/

CREATE TRIGGER `update_something_upd` AFTER UPDATE ON `Table_A`
FOR EACH ROW BEGIN
    UPDATE Table_B SET sum_number = sum_number - OLD.number WHERE id = OLD.id;
    UPDATE Table_B SET sum_number = sum_number + NEW.number WHERE id = NEW.id;
END;
/
This way you avoid locking potentially many rows in your triggers.
Then, after you have done the above, you can fix the issues from the past with a one-shot update:
UPDATE Table_B
JOIN (SELECT id, user, IFNULL(SUM(number), 0) AS sum_number
      FROM Table_A
      GROUP BY id, user) A
   ON Table_B.id = A.id
  AND Table_B.sum_number <> A.sum_number
SET Table_B.sum_number = A.sum_number;
You get this because of the race condition in the trigger. Both triggers are fired at the same time, so the SELECT returns the same base value for both of them: 15. The first trigger updates the value, adding 5 and resulting in 20, and then the second update runs with 15 + 7 = 22.
What you should do is use SELECT ... FOR UPDATE instead. This way, if the first trigger issues the SELECT, the second one has to wait until the first one finishes.
EDIT:
Your question made me think, and maybe using FOR UPDATE is not the best solution. According to the documentation:
For index records the search encounters, SELECT ... FOR UPDATE locks the rows and any associated index entries, the same as if you issued an UPDATE statement for those rows.
And because you are selecting the sum of entries from Table A, it will lock those entries but still allow inserting new ones, so the problem would not be solved.
It would be better to operate only on data from Table B inside the trigger, as suggested by trincot.
I need to update a table with pre-calculated values from tables where data can be added/updated/deleted.
I could use
insert into precalculated(...)
select ... from ...
on duplicate key update ...
to add/update the pre-calculated table, but is there an efficient way to delete the obsolete rows?
I think you should create a stored procedure that deletes the data of your related tables if and only if the records fulfill a condition.
There's not enough information in your question to design the procedure, but I can give you a little example:
delimiter $$

create procedure delete_orphans()
begin
    declare id_orphan int;
    declare done int default false;
    declare cur_orphans cursor for
        select distinct d.id
        from data as d
        left join precalculated as p on d.id = p.id
        where p.id is null;
    declare continue handler for not found set done = true;

    open cur_orphans;
    loop_delete_orphans: loop
        fetch cur_orphans into id_orphan;
        if done then
            leave loop_delete_orphans;
        end if;
        delete from data where id = id_orphan;
    end loop;
    close cur_orphans;
end$$

delimiter ;
This procedure will delete every row in the data table that does not have at least one related row in the precalculated table.
Of course, this approach might be inefficient because it deletes the rows one by one, but as I said, this is only an example. You can customize it to fit your needs.
You can call this procedure from a trigger if you want (with call delete_orphans()).
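Alternatively, if you don't need per-row logic, the same cleanup can be expressed as one set-based statement over the same data and precalculated tables, avoiding the cursor loop entirely:

-- Delete every row in data that has no matching row in precalculated,
-- in a single multi-table DELETE instead of one DELETE per cursor row.
DELETE d
FROM data AS d
LEFT JOIN precalculated AS p ON d.id = p.id
WHERE p.id IS NULL;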
Hope this helps.
Since you are always adding or updating rows that exist in these other tables, and you want to remove any rows that don't exist, why don't you just:
DELETE FROM precalculated;

INSERT INTO precalculated (...)
SELECT ... FROM ...
ON DUPLICATE KEY UPDATE ...;
Always starting clean means you don't have to worry about orphans later.
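If other sessions may read the table while it is being rebuilt, one option is to wrap both statements in a transaction so readers never observe an empty table. A sketch, where the column list and the source_rows table are placeholders of mine and precalculated is assumed to be InnoDB:

START TRANSACTION;
DELETE FROM precalculated;             -- cleared only inside this transaction
INSERT INTO precalculated (id, total)  -- rebuild; names are illustrative
SELECT s.id, SUM(s.amount)
FROM source_rows AS s
GROUP BY s.id;
COMMIT;                                -- readers see the old rows until here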
You could add triggers for insert, delete and update on the main tables that maintain precalculated.
When inserting or updating, the same code can be used to calculate the values and issue a REPLACE INTO precalculated (...) VALUES (...).
When deleting it's much the same, with the addition that you'll also delete rows from precalculated that have become orphans. Be smart here and use the values from the original delete to query precalculated for orphans instead of doing a table scan, as sketched below.
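For the delete case, a sketch of such a trigger; main_table and the shared id column are placeholders of mine, not names from the question:

DELIMITER $$

CREATE TRIGGER ad_main_table AFTER DELETE ON main_table
FOR EACH ROW
BEGIN
    -- Check only the deleted row's key instead of scanning for orphans.
    DELETE FROM precalculated
    WHERE id = OLD.id
      AND NOT EXISTS (SELECT 1 FROM main_table m WHERE m.id = OLD.id);
END $$

DELIMITER ;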
I may have found my solution using RENAME TABLE.
So basically, I will do a simple INSERT ... SELECT into the temporary table and then:
RENAME TABLE precalculated TO precalculated_temprename,
             precalculated_temp TO precalculated,
             precalculated_temprename TO precalculated_temp;
TRUNCATE TABLE precalculated_temp;
It still needs some testing, but it seems the RENAME TABLE operation is fast and atomic.
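Put together, one refresh cycle might look like this; the column list and source_rows are illustrative placeholders, and precalculated_temp is assumed to have the same structure as precalculated:

-- Rebuild the shadow copy while readers keep using the live table.
INSERT INTO precalculated_temp (id, total)
SELECT s.id, SUM(s.amount)
FROM source_rows AS s
GROUP BY s.id;

-- One atomic swap: queries always see either the old or the new table.
RENAME TABLE precalculated TO precalculated_temprename,
             precalculated_temp TO precalculated,
             precalculated_temprename TO precalculated_temp;

-- Empty the now-offline copy, ready for the next rebuild.
TRUNCATE TABLE precalculated_temp;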
How can I create a trigger in MySQL that ONLY updates a row if it exists but NEVER inserts?
Thanks in advance.
Edit: I had not known that an UPDATE statement only updates rows that exist, and does not throw any error if none exist. Thanks to #juergend
You can specify when a trigger gets fired: after an update, for instance.
When it fires, you can do whatever you want, for instance update another table.
Generally it works like this:
delimiter //

CREATE TRIGGER upd_trigger_name AFTER UPDATE ON your_updated_table
FOR EACH ROW
BEGIN
    UPDATE other_table SET col1 = a_value WHERE id = other_value;
END;
//
delimiter ;
MySQL keywords:
update -> updates a record if found; does nothing (and raises no error) if not
insert -> inserts a new record
replace -> deletes any existing record with the same key, then inserts the new one
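A quick illustration of the difference, using a hypothetical table t with a primary key:

CREATE TABLE t (id INT PRIMARY KEY, val INT);

UPDATE t SET val = 1 WHERE id = 42;  -- no row 42 yet: 0 rows affected, no error
INSERT INTO t VALUES (42, 1);        -- inserts; would fail if id 42 already existed
REPLACE INTO t VALUES (42, 2);       -- key exists: old row deleted, new row inserted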
I have a situation in which I don't want inserts to take place (the transaction should roll back) if a certain condition is met. I could write this logic in the application code, but say that, for some reason, it has to be written in MySQL itself (for example, clients written in different languages will be inserting into this InnoDB table) [that's a separate discussion].
Table definition:
CREATE TABLE table1(x int NOT NULL);
The trigger looks something like this:
CREATE TRIGGER t1 BEFORE INSERT ON table1
FOR EACH ROW
BEGIN
    IF (condition) THEN
        SET NEW.x = NULL;
    END IF;
END;
I am guessing it could also be written as (untested):
CREATE TRIGGER t1 BEFORE INSERT ON table1
FOR EACH ROW
BEGIN
    IF (condition) THEN
        ROLLBACK;
    END IF;
END;
But, this doesn't work:
CREATE TRIGGER t1 BEFORE INSERT ON table1 ROLLBACK;
You are guaranteed that:
Your DB will always be MySQL
Table type will always be InnoDB
That NOT NULL column will always stay the way it is
Question: Do you see anything objectionable in the 1st method?
From the trigger documentation:
The trigger cannot use statements that explicitly or implicitly begin or end a transaction such as START TRANSACTION, COMMIT, or ROLLBACK.
So your second option couldn't even be created. However:
Failure of a trigger causes the statement to fail, so trigger failure also causes rollback.
So Eric's suggestion to use a query that is guaranteed to result in an error is the next option. However, MySQL doesn't have the ability to raise custom errors -- you'll have false positives to deal with. Encapsulating inside a stored procedure won't be any better, due to the lack of custom error handling...
If we knew more detail about what your condition is, it's possible it could be dealt with via a constraint.
Update
I've confirmed that though MySQL has CHECK constraint syntax, it's not enforced by any engine. If you lock down access to a table, you could handle limitation logic in a stored procedure. The following trigger won't work, because it is referencing the table being inserted to:
CREATE TRIGGER t1 BEFORE INSERT ON table1
FOR EACH ROW
BEGIN
    DECLARE num INT;

    SET num = (SELECT COUNT(t.col)
                 FROM your_table t
                WHERE t.col = NEW.col);

    IF (num > 100) THEN
        SET NEW.col = 1/0;
    END IF;
END;
...results in MySQL error 1235.
Have you tried raising an error to force a rollback? For example:
CREATE TRIGGER t1 BEFORE INSERT ON table1
FOR EACH ROW
BEGIN
    IF (condition) THEN
        SELECT 1/0 FROM table1 LIMIT 1;
    END IF;
END;