I am getting the following exception when I try to do a batch update. Multiple threads run at the same time and may access the same row in the database, and I am doing multiple batch updates. Can anyone comment on the relation between batch size and deadlocks? Will decreasing the batch size (currently 1000) decrease the probability of a deadlock?
The exception I am getting is
com.mysql.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
Short answer:
yes, the probability would decrease
Long answer:
Let's figure out why the deadlocks occur. When you update a row, an exclusive lock is set on that row, and it is held until your transaction is committed or rolled back.
That means no other transaction may update it; it would just block until the transaction is finished. A deadlock occurs when tran1 wants to lock rows held by tran2 while tran2, in turn, is already waiting for rows locked by tran1.
Here's an example:
MariaDB [test]> create table a (id int primary key, value int);
Query OK, 0 rows affected (0.14 sec)
MariaDB [test]> insert into a values (1, 0), (2, 0), (3, 0), (4, 0);
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql console 1:
step 1> start transaction;
step 3> update a set value = 1 where id = 2;
step 5> update a set value = 1 where id = 1;
mysql console 2:
step 2> start transaction;
step 4> update a set value = 1 where id = 1;
step 6> update a set value = 1 where id = 2;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
The more rows each batch update touches (i.e. updates), the higher the probability of this kind of conflict.
You can lower this probability by traversing the rows in a well-defined order. With a consistent order, the deadlock in the simple example above could not occur: both transactions would update id = 1 first, so the second would simply wait for the first.
More details on avoiding deadlocks are in this awesome article:
http://www.xaprb.com/blog/2006/08/03/a-little-known-way-to-cause-a-database-deadlock/
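The "well-defined order" advice can be sketched on the client side. This is only an illustration, not the poster's code: it assumes each worker holds its pending updates as an id-to-value mapping, and the helper name ordered_batches is hypothetical. Sorting by primary key before splitting into batches means every worker requests row locks in the same ascending order.

```python
from typing import Dict, Iterator, List, Tuple


def ordered_batches(updates: Dict[int, int], batch_size: int) -> Iterator[List[Tuple[int, int]]]:
    """Yield (id, value) pairs in ascending id order, batch_size at a time."""
    ordered = sorted(updates.items())  # sort by primary key
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Two workers that received the same rows in different orders
worker_a = {2: 1, 1: 1, 4: 1}
worker_b = {1: 1, 4: 1, 2: 1}

batches_a = list(ordered_batches(worker_a, batch_size=2))
batches_b = list(ordered_batches(worker_b, batch_size=2))
```

Each batch would then be sent as one transaction. Smaller batches hold fewer locks for a shorter time, and the shared ordering removes the lock-order inversion shown in the two-console example.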
CREATE TABLE `test`.`t1` (
`id` INT NULL);
CREATE TABLE `test`.`t2` (
`id` INT NULL);
INSERT INTO test.t1 VALUES(1);
INSERT INTO test.t2 VALUES(1);
example1:
sqlConnection1:
SET autocommit = 0;
START TRANSACTION;
UPDATE test.t1 set id = 1 WHERE id = 2;
sqlConnection2:
SET autocommit = 0;
START TRANSACTION;
LOCK TABLES test.t2 WRITE,test.t1 WRITE;
COMMIT;
UNLOCK TABLES;
sqlConnection1:
UPDATE test.t2 set id = 1 where id = 2;
COMMIT;
sqlConnection2:
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
example2:
sqlConnection1:
SET autocommit = 0;
START TRANSACTION;
UPDATE test.t1 set id = 1 WHERE id = 1;
sqlConnection2:
SET autocommit = 0;
START TRANSACTION;
LOCK TABLES test.t2 WRITE,test.t1 WRITE;
COMMIT;
UNLOCK TABLES;
sqlConnection1:
UPDATE test.t2 set id = 1 where id = 1;
COMMIT;
sqlConnection1:
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
example3:
No deadlock occurs on MySQL 5.6 and later.
question:
What causes the results to differ across the three examples?
There are several points in this problem.
When InnoDB updates a record, it places a RECORD LOCK or a GAP LOCK on the range it is going to update. For records that exist, InnoDB takes a record lock; otherwise it takes a gap lock. In example 1 no matching row exists, so InnoDB takes a gap lock, in this case a [2,∞) gap lock on the table.
MySQL's LOCK TABLES command is implemented in the MySQL server itself, not in InnoDB. Although LOCK TABLES has a "deadlock-free design" with table-lock-only storage engines such as MyISAM, it cannot prevent deadlocks with InnoDB, because InnoDB also has row-level locks. Fortunately, InnoDB has a deadlock detection mechanism that can detect when an InnoDB row lock and a MySQL table lock conflict on the same table, and it does what you see: it rolls back the "lighter" transaction, the so-called "victim", to break the deadlock.
InnoDB generally picks the transaction that affects fewer rows as the victim. In example 1, connection 1 holds two GAP LOCKS on [2,∞), one per table, while connection 2 has only acquired two WRITE LOCKS and has not done any work. Connection 2 is therefore "lighter" and is selected as the victim. In example 2, connection 1 holds only two RECORD LOCKS while connection 2 still has two WRITE LOCKS, and connection 1 has done essentially no work. InnoDB therefore decides that connection 1 uses fewer resources (i.e. locks in this case) and chooses it as the victim.
Since MySQL 5.6, a new lock wait mechanism has been in place. When MySQL cannot acquire a table lock, it waits for as long as lock_wait_timeout specifies. Quoting the MySQL documentation:
This variable[lock-wait-timeout] specifies the timeout in seconds for attempts to acquire metadata locks. The permissible values range from 1 to 31536000 (1 year). The default is 31536000.
This timeout applies to all statements that use metadata locks. These include DML and DDL operations on tables, views, stored procedures, and stored functions, as well as LOCK TABLES, FLUSH TABLES WITH READ LOCK, and HANDLER statements.
Thus, MySQL does not acquire the table lock when it sees that InnoDB already holds a lock on the table it is about to manipulate; it waits until InnoDB has finished its work. For further information, refer to the MySQL documentation (the following pages are for version 8.0, but the basic principles are the same):
https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html
https://dev.mysql.com/doc/refman/8.0/en/innodb-deadlock-detection.html
I have a simple table
CREATE TABLE test (
col INT,
data TEXT,
KEY (col)
);
and a simple transaction
START TRANSACTION;
SELECT * FROM test WHERE col = 4 FOR UPDATE;
-- If no results, generate data and insert
INSERT INTO test SET col = 4, data = 'data';
COMMIT;
I am trying to ensure that two copies of this transaction running concurrently result in no duplicate rows and no deadlocks. I also don't want to incur the cost of generating data for col = 4 more than once.
I have tried:
SELECT .. (without FOR UPDATE or LOCK IN SHARE MODE):
Both transactions see that there are no rows with col = 4 (without acquiring a lock) and both generate data and insert two copies of the row with col = 4.
SELECT .. LOCK IN SHARE MODE
Both transactions acquire a shared lock on col = 4, generate data and attempt to insert a row with col = 4. Both transactions wait for the other to release their shared lock so it can INSERT, resulting in ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction.
SELECT .. FOR UPDATE
I would expect that one transaction's SELECT will succeed and acquire an exclusive lock on col = 4 and the other transaction's SELECT will block waiting for the first.
Instead, both SELECT .. FOR UPDATE queries succeed and the transactions proceed to deadlock just like with SELECT .. LOCK IN SHARE MODE. The exclusive lock on col = 4 just doesn't seem to work.
How can I write this transaction without causing duplicate rows and without deadlock?
Adjust your schema slightly:
CREATE TABLE test (
col INT NOT NULL PRIMARY KEY,
data TEXT
);
With col being a primary key it cannot be duplicated.
Then use the ON DUPLICATE KEY feature:
INSERT INTO test (col, data) VALUES (4, ...)
ON DUPLICATE KEY UPDATE data=VALUES(data)
Maybe this...
START TRANSACTION;
INSERT IGNORE INTO test (col, data) VALUES (4, NULL); -- or ''
-- if ROW_COUNT() = 1 (this session inserted the row), generate data and replace `data`
UPDATE test SET data = 'data' WHERE col = 4;
COMMIT;
Caution: If the PRIMARY KEY is an AUTO_INCREMENT, this may 'burn' an id.
Note that InnoDB has two types of exclusive locks: one for update and delete, and another for insert. To execute your SELECT ... FOR UPDATE transaction, InnoDB first has to take the update lock in one transaction; the second transaction then tries to take the same lock and blocks waiting for the first (it cannot have succeeded as you claimed in the question). When the first transaction then tries to execute the INSERT, it has to change its lock from the update lock to the insert lock. The only way InnoDB can do that is to first downgrade the lock to a shared one and then upgrade it to the insert lock. It cannot downgrade the lock while another transaction is waiting to acquire the exclusive lock as well. That is why you get a deadlock error in this situation.
The only way to execute this correctly is to have a unique index on col and try to INSERT the row with col = 4 first (you can insert dummy data if you don't want to generate it before the INSERT). On a duplicate key error, roll back; if the INSERT succeeded, UPDATE the row with the correct data.
Note though that if you don't want to incur cost of generating data unnecessarily it probably means that generating it takes a long time, and all that time you'll hold an open transaction that inserted row with col = 4 which will hold all other processes trying to insert the same row hanging. I'm not sure that would be significantly better than generating data first and then inserting it.
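The insert-a-placeholder-then-update flow can be sketched as follows. This is a rough illustration under stated assumptions, not a drop-in solution: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, SQLite's INSERT OR IGNORE in place of MySQL's INSERT IGNORE, and a hypothetical helper named claim_and_fill. It shows only the control flow (who generates the data), not InnoDB's locking behaviour.

```python
import sqlite3


def claim_and_fill(conn, col, generate):
    """Insert a placeholder row for col; only the caller whose insert
    succeeded runs generate(). Returns True if this caller filled the row."""
    with conn:  # commits on normal exit, rolls back if an exception escapes
        cur = conn.execute(
            "INSERT OR IGNORE INTO test (col, data) VALUES (?, NULL)", (col,)
        )
        if cur.rowcount == 0:
            return False  # row already existed: someone else claimed it
        conn.execute("UPDATE test SET data = ? WHERE col = ?", (generate(), col))
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col INTEGER PRIMARY KEY, data TEXT)")

calls = []


def generate():
    calls.append(1)  # count how often the expensive step runs
    return "data"


first = claim_and_fill(conn, 4, generate)
second = claim_and_fill(conn, 4, generate)
rows = conn.execute("SELECT col, data FROM test").fetchall()
```

The expensive generate() step runs only for the caller whose insert took effect; the second caller sees zero affected rows and skips it.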
If your goal is to have only one session insert the missing row, and any other sessions do nothing without even attempting an insert of DATA, then you need to either lock the entire table (which reduces your concurrency) or insert an incomplete row and follow it with an update.
A. create a primary key on column COL
Code:
begin
insert into test values (4,null);
update test set data = ... where col = 4;
commit;
exception
when dup_val_on_index then
null;
end;
The first session that attempts the insert on col 4 will succeed and proceed to the update, where you can do the expensive calculation of DATA. Any other session trying to do this will raise a PK violation (ORA-00001, or DUP_VAL_ON_INDEX) and go to the exception handler, which traps it and does nothing (NULL). It never reaches the update statement, so it won't do whatever expensive thing you do to calculate DATA.
Now, this will cause the other session to wait while the first session calculates DATA and does the update. If you don't want that wait, you can use NOWAIT to cause the lagging sessions to throw an exception immediately if the row is locked. If the row doesn't exist, that will also throw an exception, but a different one. Not great to use exception handling for normal code branches, but hey, it should work.
declare
var_junk number;
begin
begin
select col into var_junk from test where col = 4 for update nowait;
exception
when no_data_found then
insert into test values (4,null);
update test set data = ... where col = 4;
commit;
when others then
null;
end;
end;
I am performing SELECT ... FOR UPDATE or row level locking with InnoDB tables.
My intention is that only one request can read a given row at a time. So if two users request the same data at the same time, only one of them gets it: whoever fires the query first.
But how can I test whether the lock is placed or not? I am testing it by retrieving the same data at the same time, and both users are getting the data.
Note: My tables are InnoDB, My query executes in transaction, my query as below:
SELECT * FROM table_name WHERE cond FOR UPDATE;
Any other thing I have to check for this to make work?
Open two mysql client sessions.
on session 1:
mysql> start transaction;
mysql> SELECT * FROM table_name WHERE cond FOR UPDATE;
... (result here) ...
1 row in set (0.00 sec)
on session 2:
mysql> start transaction;
mysql> SELECT * FROM table_name WHERE cond FOR UPDATE;
... (no result yet, will wait for the lock to be released) ...
back to session 1, to update selected record (and release the lock):
mysql> UPDATE table_name SET something WHERE cond;
mysql> commit;
back to session 2:
1) either showing lock timeout error
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
2) or showing result
... (result here) ...
1 row in set (0.00 sec)
3) or showing no result (because corresponding record has been modified, so specified condition was not met)
Empty set (0.00 sec)
You can use your own lock mechanism with a locked_by column.
UPDATE table_name SET locked_by=#{process_id} WHERE cond AND locked_by IS NULL
Now in your program you will get count of affected rows:
if(affected_rows==0)
return 'rows locked'
else
//do your stuff with the rows where locked_by=#{process_id}
With this mechanism you can control locked rows and locking processes. You can also add in UPDATE statement locked_at=NOW() to get more info about locked row.
Don't forget to add some index on locked_by column.
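A minimal sketch of this locked_by approach, assuming a hypothetical jobs table and using Python's built-in sqlite3 module as a stand-in for MySQL. The point it illustrates is that the conditional UPDATE is the lock attempt, and the affected-row count tells the caller whether it won.

```python
import sqlite3


def try_claim(conn, process_id, row_id):
    """Try to mark the row as owned by process_id; True if the claim succeeded."""
    with conn:  # commit the claim immediately so other processes can see it
        cur = conn.execute(
            "UPDATE jobs SET locked_by = ?, locked_at = CURRENT_TIMESTAMP "
            "WHERE id = ? AND locked_by IS NULL",
            (process_id, row_id),
        )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, locked_by TEXT, locked_at TEXT)"
)
conn.execute("INSERT INTO jobs (id) VALUES (1)")
conn.commit()

got_a = try_claim(conn, "worker-a", 1)  # first claimant wins
got_b = try_claim(conn, "worker-b", 1)  # row is already claimed
owner = conn.execute("SELECT locked_by FROM jobs WHERE id = 1").fetchone()[0]
```

Releasing the lock would be a matching UPDATE that sets locked_by back to NULL for rows owned by the same process_id.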
Here are the MySQL docs about working with locks.
You can acquire a lock before the update and release it afterwards. In another transaction you can check the lock by its unique name. The naming strategy is up to you.
Here is an interesting situation.
I start a transaction with MySQL. My transaction involves 3 related queries.
Each query must succeed, and if not then none should be written to the database.
Now, on purpose, for the 2nd query (which happens to be an UPDATE query) I changed the PK value identifying the record to be updated to an invalid (non-existent) PK value. I wanted the 2nd query to fail for testing purposes. The query is fine; it is just that the c_id value is wrong (the record I'm trying to UPDATE does not exist).
The problem is that the query is executed with an "OK"...
mysql> UPDATE tableX SET bal = 4576.99 WHERE c_id = 3789;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0 Changed: 0 Warnings: 0
This is a problem because the error (is error from my perspective since a key record that must be updated was not updated in a chain of related queries) was not caught and the transaction thus did not abort and rollback, instead the process goes on to the 3rd query which also succeeds and then the transaction is committed.
So, I find it strange that such an error is not caught by MySQL or not labeled an error by MySQL.
Any insights as to why or how to fix?
It is correct, 0 rows were updated.
If, for your logic, that is an error you should test the number of affected rows and then raise an error if that number is 0:
DECLARE count INT;
UPDATE tableX SET bal = 4576.99 WHERE c_id = 3789;
SELECT ROW_COUNT() INTO count;
IF count = 0 THEN
CALL raise_error;
END IF;
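The error aborts the chain of queries, so the transaction can be rolled back instead of committed.
To raise an error, just call a routine that doesn't exist, as explained in this SO question (on MySQL 5.5 and later, SIGNAL SQLSTATE '45000' raises one directly):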
How to raise an error within a MySQL function
further info about row_count():
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_row-count
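The same check can also be done in application code rather than in a stored routine. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module as a stand-in for a MySQL driver, a hypothetical audit table for the first and third queries, and a hypothetical NoRowsUpdated exception. It treats "0 rows affected" as a failure and lets the exception roll back the whole transaction.

```python
import sqlite3


class NoRowsUpdated(Exception):
    """Raised when an UPDATE that must hit a row matches nothing."""


def update_balance(conn, c_id, bal):
    cur = conn.execute("UPDATE tableX SET bal = ? WHERE c_id = ?", (bal, c_id))
    if cur.rowcount == 0:
        raise NoRowsUpdated(f"no row with c_id = {c_id}")


def run_transaction(conn):
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO audit (note) VALUES ('step 1')")
            update_balance(conn, 3789, 4576.99)  # c_id 3789 does not exist
            conn.execute("INSERT INTO audit (note) VALUES ('step 3')")
    except NoRowsUpdated:
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (c_id INTEGER PRIMARY KEY, bal REAL)")
conn.execute("CREATE TABLE audit (note TEXT)")
conn.execute("INSERT INTO tableX VALUES (1, 0.0)")
conn.commit()

ok = run_transaction(conn)
audit_rows = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
```

Because the exception leaves the with block, the first query's insert is rolled back too, which is the all-or-nothing behaviour the question asks for.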
I get deadlock error in my mysql transaction.
The simple example of my situation:
Thread1 > BEGIN;
Query OK, 0 rows affected (0.00 sec)
Thread1 > SELECT * FROM A WHERE ID=1000 FOR UPDATE;
1 row in set (0.00 sec)
Thread2 > BEGIN;
Query OK, 0 rows affected (0.00 sec)
Thread2 > INSERT INTO B (AID, NAME) VALUES (1000, 'Hello world');
[Hangs]
Thread1 > INSERT INTO B (AID, NAME) VALUES (1000, 'Hello world2');
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Thread2 >
Query OK, 1 row affected (10.00 sec)
B.AID is a FOREIGN KEY referring to A.ID
I see three solutions:
catch deadlock error in code and retry query.
use innodb_locks_unsafe_for_binlog in my.cnf
lock (for update) table A in Thread2 before insert
Are there any other solutions?
I don't know what code surrounds these examples, but it might be worth using LOCK IN SHARE MODE in both threads, since you're not actually updating the row itself. If you must use FOR UPDATE, I would think that locking the other thread would be the only logical path.
Also, if you're open to moving away from MySQL, I've found that PostgreSQL resolves deadlocks much better. In some cases MySQL deadlocked every time I ran the same script on more than one thread, whereas PostgreSQL handled the same script just fine for any number of parallel threads.
Based on a function from the mysql high performance blog.
I was able to implement the following deadlock handling code in PHP:
/* maximum number of attempts for deadlock */
$MAX_ATTEMPTS = 20;
/* query */
$sql = "INSERT INTO B (AID, NAME) VALUES (1000, 'Hello world')";
/* current attempt counter */
$current = 0;
/* try to query */
while ($current++ < $MAX_ATTEMPTS)
{
    $result = mysql_query($sql);
    /* 1205 = lock wait timeout, 1213 = deadlock: retry the query */
    if (!$result && (mysql_errno() == 1205 || mysql_errno() == 1213))
        continue;
    else
        break;
}
Hopefully this might give you some good ideas.
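The same retry idea can be sketched in a driver-independent way. Everything named here is an assumption for illustration: DbError is a hypothetical stand-in for whatever exception your driver raises with the MySQL error number attached, and run_with_retry is not a library function. Since InnoDB rolls back the whole transaction on a deadlock (1213), the operation passed in should replay the entire transaction, not just the last statement.

```python
import time

RETRYABLE_ERRNOS = {1205, 1213}  # lock wait timeout, deadlock


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno):
        super().__init__(f"MySQL error {errno}")
        self.errno = errno


def run_with_retry(operation, max_attempts=20, base_delay=0.0):
    """Call operation() until it succeeds, retrying only on lock errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except DbError as exc:
            if exc.errno not in RETRYABLE_ERRNOS or attempt == max_attempts:
                raise  # not a lock error, or out of attempts
            time.sleep(base_delay * attempt)  # simple linear backoff


attempts = []


def flaky_insert():
    attempts.append(1)
    if len(attempts) < 3:
        raise DbError(1213)  # simulate two deadlocks before success
    return "inserted"


result = run_with_retry(flaky_insert)
```

A non-zero base_delay spaces the retries out, which tends to help when the same two transactions keep colliding.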
There are no deadlocks here, what version of MySQL and what isolation level do you use?
I got these results, adding timestamp column to table B:
Thread1 > BEGIN;
Thread1 > SELECT * FROM A WHERE ID=1000 FOR UPDATE;
/* 0 rows affected, 1 rows found */
Thread2 > BEGIN;
Thread2 > INSERT INTO B (AID, NAME, date) VALUES (1000, 'Hello world', NOW());
[Hangs]
-- after 5 seconds
Thread1 > INSERT INTO B (AID, NAME, date) VALUES (1000, 'Hello world2', NOW());
/* 1 rows affected, 0 rows found */
Thread1 > COMMIT;
Thread2 > COMMIT;
B will contain 2 rows that look like:
1000 'Hello world' '2011-06-11 19:23:15'
1000 'Hello world2' '2011-06-11 19:23:20'
The situation you described takes place only when B.NAME has a unique index and you are trying to insert the same values. The first insert waits for the A.ID index to be released, which will never happen because of the duplicate value for B.NAME.