If I understand correctly, this code
START TRANSACTION;
SELECT field FROM table WHERE ... FOR UPDATE; -- single row
UPDATE table SET field = ... ;
COMMIT;
will lock the selected row until COMMIT.
But if I use MAX()
START TRANSACTION;
SELECT MAX(field) FROM table WHERE ... FOR UPDATE; -- whole table
UPDATE table SET field = ... ;
COMMIT;
will this code lock the whole table until COMMIT?
EDIT
Sorry, I got my question wrong.
Obviously the above code will lock the rows affected by the WHERE clause, but it wouldn't lock the table. Meaning
INSERT INTO table() VALUES();
could still take place regardless of COMMIT.
That would mean the return value of
SELECT MAX(field) FROM table WHERE ... FOR UPDATE ;
is now no longer valid.
How can I lock the table during the transaction so that neither an INSERT nor an UPDATE can take place before COMMIT?
It doesn't matter what you're selecting. FOR UPDATE locks all the rows that have to be examined to evaluate the WHERE clause. Otherwise, another transaction could change the columns that are mentioned there, so the later UPDATE would assign to different rows.
And since inserting a new row can change the value of MAX(field), it actually locks the entire table. When I try your example and insert a new row from another transaction, the second transaction blocks until I commit the first transaction.
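For example, a minimal two-session sketch of that behavior (the table name t and the value 42 are placeholders, not from the original example):

-- session 1
START TRANSACTION;
SELECT MAX(field) FROM t FOR UPDATE; -- next-key locks cover the scanned rows and the gaps between them

-- session 2
INSERT INTO t (field) VALUES (42); -- blocks here

-- session 1
COMMIT; -- session 2's INSERT now proceeds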
Related
I have a simple table
CREATE TABLE test (
col INT,
data TEXT,
KEY (col)
);
and a simple transaction
START TRANSACTION;
SELECT * FROM test WHERE col = 4 FOR UPDATE;
-- If no results, generate data and insert
INSERT INTO test SET col = 4, data = 'data';
COMMIT;
I am trying to ensure that two copies of this transaction running concurrently result in no duplicate rows and no deadlocks. I also don't want to incur the cost of generating data for col = 4 more than once.
I have tried:
SELECT .. (without FOR UPDATE or LOCK IN SHARE MODE):
Both transactions see that there are no rows with col = 4 (without acquiring a lock), both generate data, and both insert a row with col = 4, resulting in two copies.
SELECT .. LOCK IN SHARE MODE
Both transactions acquire a shared lock on col = 4, generate data and attempt to insert a row with col = 4. Both transactions wait for the other to release their shared lock so it can INSERT, resulting in ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction.
SELECT .. FOR UPDATE
I would expect that one transaction's SELECT will succeed and acquire an exclusive lock on col = 4 and the other transaction's SELECT will block waiting for the first.
Instead, both SELECT .. FOR UPDATE queries succeed and the transactions proceed to deadlock just like with SELECT .. LOCK IN SHARE MODE. The exclusive lock on col = 4 just doesn't seem to work.
How can I write this transaction without causing duplicate rows and without deadlock?
Adjust your schema slightly:
CREATE TABLE test (
col INT NOT NULL PRIMARY KEY,
data TEXT
);
With col being a primary key it cannot be duplicated.
Then use the ON DUPLICATE KEY feature:
INSERT INTO test (col, data) VALUES (4, ...)
ON DUPLICATE KEY UPDATE data=VALUES(data)
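For instance, running the statement twice with different (made-up) values leaves a single row behind:

INSERT INTO test (col, data) VALUES (4, 'first')
ON DUPLICATE KEY UPDATE data=VALUES(data); -- inserts (4, 'first')
INSERT INTO test (col, data) VALUES (4, 'second')
ON DUPLICATE KEY UPDATE data=VALUES(data); -- updates the existing row to (4, 'second')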
Maybe this...
START TRANSACTION;
INSERT IGNORE INTO test (col, data) VALUES (4, NULL); -- or ''
-- if Rows_affected() == 0, generate data and replace `data`
UPDATE test SET data = 'data' WHERE col = 4;
COMMIT;
Caution: If the PRIMARY KEY is an AUTO_INCREMENT, this may 'burn' an id.
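One way to check Rows_affected() on the server side is ROW_COUNT(); a minimal sketch under that assumption:

START TRANSACTION;
INSERT IGNORE INTO test (col, data) VALUES (4, NULL);
SET @inserted = ROW_COUNT(); -- 1 if this session created the row, 0 if it already existed
-- only when @inserted = 1: generate data, then
UPDATE test SET data = 'data' WHERE col = 4;
COMMIT;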
Note that InnoDB has two types of exclusive locks: one for update and delete, and another one for insert. To execute your SELECT FOR UPDATE transaction, InnoDB first has to take the update lock in one transaction; the second transaction then tries to take the same lock and blocks waiting for the first (it couldn't have succeeded as you claimed in the question). When the first transaction then tries to execute its INSERT, it has to convert its lock from the update lock to the insert lock. The only way InnoDB can do that is to first downgrade the lock to a shared one and then upgrade it back to an insert lock, and it can't downgrade the lock while another transaction is also waiting to acquire the exclusive lock. That's why you get a deadlock error in this situation.
The only way for you to correctly execute this is to have a unique index on col, try to INSERT the row with col = 4 (you can put dummy data if you don't want to generate it before the INSERT), then roll back in case of a duplicate-key error, and, if the INSERT was successful, UPDATE the row with the correct data.
Note though that if you don't want to incur the cost of generating data unnecessarily, that probably means generating it takes a long time, and for all that time you'll hold an open transaction that inserted the row with col = 4, which will leave every other process trying to insert the same row hanging. I'm not sure that would be significantly better than generating the data first and then inserting it.
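A minimal sketch of that flow, assuming a UNIQUE (or primary) key on col; the duplicate-key branch is handled by the application:

START TRANSACTION;
INSERT INTO test (col, data) VALUES (4, NULL); -- dummy data for now
-- on duplicate-key error (1062): ROLLBACK, another session owns the row
-- on success: generate the real data, then
UPDATE test SET data = 'generated data' WHERE col = 4;
COMMIT;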
If your goal is to have only one session insert the missing row, and any other sessions do nothing without even attempting an insert of DATA, then you need to either lock the entire table (which reduces your concurrency) or insert an incomplete row and follow it with an update.
A. create a primary key on column COL
Code:
begin
insert into test values (4,null);
update test set data = ... where col = 4;
commit;
exception
when dup_val_on_index then
null;
end;
The first session that attempts the insert on col 4 will succeed and proceed to the update, where you can do the expensive calculation of DATA. Any other session trying to do this will raise a PK violation (ORA-00001, DUP_VAL_ON_INDEX) and go to the exception handler, which traps it and does nothing (NULL). It will never reach the update statement, so it won't do whatever expensive thing it is you do to calculate DATA.
Now, this will cause the other session to wait while the first session calculates DATA and does the update. If you don't want that wait, you can use NOWAIT to cause the lagging sessions to throw an exception immediately if the row is locked. If the row doesn't exist, that will also throw an exception, but a different one. Not great to use exception handling for normal code branches, but hey, it should work.
declare
var_junk number;
begin
begin
select col into var_junk from test where col = 4 for update nowait;
exception
when no_data_found then
insert into test values (4, null);
update test set data = ... where col = 4;
commit;
when others then
null;
end;
end;
In my application I want to take a value from an InnoDB table, and then increment and return it within a single transaction. I also want to lock the row that I am going to update in order to prevent another session from changing the value during the transaction. I wrote this query:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT @no:=`value` FROM `counter` WHERE `name`='booking' FOR UPDATE;
UPDATE `counter` SET `value` = `value` + 1 WHERE `name`='booking';
SELECT @no;
COMMIT;
I want to know if the isolation level is right and whether there is any need for the FOR UPDATE clause. Am I doing it right?
Yes, what you are doing is perfectly fine.
The lines below are quoted directly from the MySQL documentation.
"If you query data and then insert or update related data within the same transaction, the regular SELECT statement does not give enough protection.
..
To implement reading and incrementing the counter, first perform a locking read of the counter using FOR UPDATE, and then increment the counter. For example:
SELECT counter_field FROM child_codes FOR UPDATE;
UPDATE child_codes SET counter_field = counter_field + 1;
A SELECT ... FOR UPDATE reads the latest available data, setting exclusive locks on each row it reads. Thus, it sets the same locks a searched SQL UPDATE would set on the rows.
Reference:
https://dev.mysql.com/doc/refman/5.6/en/innodb-locking-reads.html
I have two tables, source and target. Is it possible to perform the following operation in a single query?
If the row exists in both the source and target, UPDATE the target;
If the row only exists in the source, INSERT the row into the target;
If the row exists in the target but not the source,
DELETE the row from the target.
You can't do it all in one query, but you can do it all in one transaction if you are using a transactional storage engine (like InnoDB). This might be what you want:
START TRANSACTION;
INSERT...;
DELETE...;
UPDATE...;
COMMIT;
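As a hedged sketch of what those statements could look like, assuming both tables share a primary key id and a single data column val (names are placeholders), with the insert and update cases folded into one INSERT ... ON DUPLICATE KEY UPDATE:

START TRANSACTION;
-- rows that exist only in target: delete them
DELETE target FROM target
LEFT JOIN source ON source.id = target.id
WHERE source.id IS NULL;
-- rows in source: insert them, or update target when the key already exists
INSERT INTO target (id, val)
SELECT id, val FROM source
ON DUPLICATE KEY UPDATE val = VALUES(val);
COMMIT;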
In case I run a very long update (which has millions of records to update and is going to take several hours), I was wondering if there is any way to kill the update without having InnoDB roll back the changes.
I would like the records that were already updated to stay as they are (and the table locks released ASAP), meaning to continue the update later when I have time for it.
This is similar to what MyISAM would do when killing an update.
If you mean a single UPDATE statement, I may be wrong but I doubt that's possible. However, you can always split your query into smaller sets. Rather than:
UPDATE foo SET bar=gee
... use:
UPDATE foo SET bar=gee WHERE id BETWEEN 1 AND 100;
UPDATE foo SET bar=gee WHERE id BETWEEN 101 AND 200;
UPDATE foo SET bar=gee WHERE id BETWEEN 201 AND 300;
...
This can be automated in a number of ways.
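One possible automation is a small stored procedure that walks the id range in fixed-size chunks; the foo/bar/gee names come from the example above, while the procedure name and the chunk size of 100 are assumptions:

DELIMITER $$
CREATE PROCEDURE update_in_chunks()
BEGIN
  DECLARE max_id INT;
  DECLARE from_id INT DEFAULT 1;
  SELECT MAX(id) INTO max_id FROM foo;
  WHILE from_id <= max_id DO
    -- with autocommit on, each chunk is its own transaction,
    -- so killing the procedure only rolls back the chunk in flight
    UPDATE foo SET bar = gee WHERE id BETWEEN from_id AND from_id + 99;
    SET from_id = from_id + 100;
  END WHILE;
END $$
DELIMITER ;

CALL update_in_chunks();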
My suggestion would be to create a BLACKHOLE table, with fields to match your needs for the update statement.
CREATE TABLE bh_table1 (
..field defs
) ENGINE = BLACKHOLE;
Now create a trigger on the blackhole table.
DELIMITER $$
CREATE TRIGGER ai_bh_table1_each AFTER INSERT ON bh_table1 FOR EACH ROW
BEGIN
-- implicit start transaction happens here
UPDATE table1 t1 SET t1.field1 = NEW.field1 WHERE t1.id = NEW.id;
-- implicit commit happens here
END $$
DELIMITER ;
You can do the update statement as an insert into the blackhole.
INSERT INTO bh_table1 (id, field1)
SELECT id, field1
FROM same_table_with_lots_of_rows
WHERE filter_that_still_leaves_lots_of_rows;
This will still be a lot slower than your initial update.
Let me know how it turns out.
Edit:
I found a solution here http://mysql.bigresource.com/Track/mysql-8TvKWIvE/
Assuming the SELECT takes a long time to execute, will this lock the table for a long time?
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT foo FROM bar WHERE wee = 'yahoo!';
DELETE FROM bar WHERE wee = 'yahoo!';
COMMIT;
I wish to use criteria to select rows in MySQL, return them to my app as a result set, and then delete those rows. How can this be done? I know I can do the following, but it's too inefficient:
select * from MyTable t where _criteria_
//get the resultset and then
delete from MyTable where id in (...result...)
Do I need to use a transaction? Is there a single query solution?
I needed to SELECT some rows by some criteria, do something with the data, and then DELETE those same rows atomically, that is, without deleting any rows that meet the criteria but were inserted after the SELECT.
Contrary to other answers, REPEATABLE READ is not sufficient. Refer to Consistent Nonlocking Reads. In particular note this callout:
The snapshot of the database state applies to SELECT statements within a transaction, not necessarily to DML statements. If you insert or modify some rows and then commit that transaction, a DELETE or UPDATE statement issued from another concurrent REPEATABLE READ transaction could affect those just-committed rows, even though the session could not query them.
You can try it yourself:
First create a table:
CREATE TABLE x (i INT NOT NULL, PRIMARY KEY (i)) ENGINE = InnoDB;
Start a transaction and examine the table (this will be called session 1 now):
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT * FROM x;
Start another session (session 2) and insert a row. Note this session is in autocommit mode.
INSERT INTO x VALUES (1);
SELECT * FROM x;
You will see your newly inserted row. Then back in session 1 again:
SELECT * FROM x;
DELETE FROM x;
COMMIT;
In session 2:
SELECT * FROM x;
You'll see that even though you get nothing from the SELECT in session 1, you delete one row. In session 2 you will see the table is empty at the end. Note the following output from session 1 in particular:
mysql> SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.00 sec)
mysql> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM x;
Empty set (0.00 sec)
/* --- insert in session 2 happened here --- */
mysql> SELECT * FROM x;
Empty set (0.00 sec)
mysql> DELETE FROM x;
Query OK, 1 row affected (0.00 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.06 sec)
mysql> SELECT * FROM x;
Empty set (0.00 sec)
This testing was done with MySQL 5.5.12.
For a correct solution
Use the SERIALIZABLE transaction isolation level. However, note that session 2 will block on the INSERT.
It seems that SELECT ... FOR UPDATE will also do the trick (a sketch follows below). I have not studied the manual 100% in depth to understand this, but it worked when I tried it. The advantage is you don't have to change the transaction isolation level. Again, session 2 will block on the INSERT.
Delete the rows individually after the SELECT. Basically you'd have to include a unique column (the primary key would be good) in the SELECT and then use DELETE FROM x WHERE i IN (...), or something similar, where IN contains a list of keys from the SELECT's result set. The advantage is you don't need to use a transaction at all and session 2 will not be blocked at any time. The disadvantage is that you have more data to send back and forth to the SQL server. Also I don't know if deleting the rows individually is as efficient as using the same WHERE clause as the original SELECT, but if the original SELECT's WHERE clause was complicated or slow the individual deletion may well be faster, so that could be another advantage.
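A minimal sketch of the SELECT ... FOR UPDATE variant, reusing the bar/wee names from the question and assuming the default REPEATABLE READ level:

START TRANSACTION;
SELECT foo FROM bar WHERE wee = 'yahoo!' FOR UPDATE; -- locks the matching rows and the gaps around them
-- work with the result set; concurrent INSERTs of matching rows block here
DELETE FROM bar WHERE wee = 'yahoo!';
COMMIT;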
To editorialize, this is one of those things that is so dangerous that even though it's documented it could almost be considered a "bug." But hey, the MySQL designers didn't ask me (or anyone else, apparently).
Do I need to use a transaction? Is there a single query solution?
Yes, you need to use a transaction. You cannot delete and select rows in a single query (i.e., there is no way to "return" or "select" the rows you have deleted).
You don't necessarily need the REPEATABLE READ option - I believe you could also select the rows FOR UPDATE, although this is a higher level of locking. REPEATABLE READ does seem to be the lowest isolation level you could use to execute this transaction safely. It happens to be the default for InnoDB.
How much this affects your table depends on whether you have an index on the wee column or not. Without it, I believe MySQL would have to lock the entire table against writes.
Further reading:
Wikipedia - Isolation (database systems)
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
Do a select statement. While looping through it, create a list string of unique IDs. Then pass this list back to MySQL using IN.
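In SQL terms the round trip looks like this; the id column and the values 3, 17, 42 are made up, standing in for whatever the application collects while looping over the first result set:

SELECT id, foo FROM bar WHERE wee = 'yahoo!';
-- the application builds the list of ids it just read, then:
DELETE FROM bar WHERE id IN (3, 17, 42);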
You could select your rows into a temporary table, then delete using the same criteria as your select. Since SELECT ... FOR UPDATE also returns a result set, you could materialize the SELECT FOR UPDATE into a temporary table. Then delete your selected rows, either using your original criteria, or by using the data in the temporary table as the criteria.
Something like this (but I haven't checked it for syntax):
START TRANSACTION;
CREATE TEMPORARY TABLE TMP_TABLE
  SELECT a, b FROM table_a WHERE a=1 FOR UPDATE;
DELETE FROM table_a
  USING table_a JOIN TMP_TABLE ON (table_a.a=TMP_TABLE.a AND table_a.b=TMP_TABLE.b);
COMMIT;
Now your records are gone from the original table, but you also have a copy in your temporary table, which you can keep, or delete.
There is no single query solution. Use
select * from MyTable t where _criteria_
//get the resultset and then
delete from MyTable where _criteria_
Execute the SELECT statement with the WHERE clause and then use the same WHERE clause in the DELETE statement as well. Assuming there were no interim changes to the data, the same rows should be deleted.
EDIT: Yes, you could set this up as a single transaction so there's no modification to the tables while you're doing this.