We have this table:
CREATE TABLE TEST_SUBSCRIBERS (
SUBSCRIPTION_ID varchar(255) NOT NULL COMMENT 'Subscriber id in format MSISDN-SERVICE_ID-TIMESTAMP',
MSISDN varchar(12) NOT NULL COMMENT 'Subscriber phone',
STATE enum ('ACTIVE', 'INACTIVE', 'UNSUBSCRIBED_SMS', 'UNSUBSCRIBED_PARTNER', 'UNSUBSCRIBED_ADMIN', 'UNSUBSCRIBED_REBILLING') NOT NULL,
SERVICE_ID varchar(255) NOT NULL COMMENT 'Id of service',
PRIMARY KEY (SUBSCRIPTION_ID)
)
ENGINE = INNODB
CHARACTER SET utf8
COLLATE utf8_general_ci;
In parallel threads we perform actions (in Java) like these:
1. Select active subscribers:
SELECT *
FROM TEST_SUBSCRIBERS
WHERE SERVICE_ID='web-sub-1'
and MSISDN='000000002'
AND STATE IN ('ACTIVE', 'INACTIVE');
2. If there are no such subscribers, insert one:
INSERT INTO TEST_SUBSCRIBERS
(SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID)
VALUES ('web-sub-1-000000002-1504624819', '000000002', 'ACTIVE', 'web-sub-1');
Under concurrency, two threads can both try to insert a row with MSISDN='000000002' and SERVICE_ID='web-sub-1' but different SUBSCRIPTION_IDs, because the current timestamp can differ. Both threads perform the first SELECT, get zero results, and both insert. So we tried to join these two queries into a transaction, but there is a problem with locking rows that do not exist yet: we would need some kind of lock for the insert.
We also do not want to lock the whole table during these two actions, because we expect the system would become too slow in that case.
We cannot create a unique key for this situation, because one subscriber can have multiple rows with the same unsubscribed statuses. And if two threads try to insert subscribers for the same service, the primary keys can contain timestamps that differ by seconds.
We tried SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE, but we got deadlocks, and these are heavy operations for the database server.
To test this, we opened two terminals and executed the following step by step:
# Window 1
mysql> start transaction;
mysql> SELECT SUBSCRIPTION_ID FROM TEST_SUBSCRIBERS s
WHERE s.SERVICE_ID="web-sub-1" AND s.MSISDN="000000002" FOR UPDATE;
# Window 2
start transaction;
mysql> SELECT SUBSCRIPTION_ID FROM TEST_SUBSCRIBERS s
WHERE s.SERVICE_ID="web-sub-1" AND s.MSISDN="000000002" FOR UPDATE;
# Window 1
mysql> INSERT INTO TEST_SUBSCRIBERS
(SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID)
VALUES('web-sub-1-000000002-1504624818', '000000002', 'ACTIVE', 'web-sub-1');
# Window 2
mysql> INSERT INTO TEST_SUBSCRIBERS
(SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID)
VALUES('web-sub-1-000000002-1504624819', '000000002', 'ACTIVE', 'web-sub-1');
ERROR 1213 (40001): Deadlock found when trying to get lock;
try restarting transaction
Is there any way to do this without deadlocks and without locking the full table? Other variants that we analyzed were:
1. separate table
2. inserting and deleting unwanted rows.
Plan A. This will either insert (if necessary) or silently do nothing:
INSERT IGNORE ...;
Plan B. This may be overkill, since nothing needs "updating":
INSERT INTO ...
(...)
ON DUPLICATE KEY UPDATE
...;
Plan C. This statement is mostly replaced by IODKU:
REPLACE ... (same syntax as INSERT, but it does a silent DELETE first)
A and B (and probably C) are "atomic", so there is no chance of a deadlock.
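For illustration, here is a minimal JDBC sketch of Plan A against the table above. The dataSource and the insertIfAbsent name are assumptions for this sketch, and note that INSERT IGNORE only deduplicates on a unique key; as the question points out, the current primary key alone may not capture the conflict.

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.DataSource;

public class SubscriberDao {
    private final DataSource dataSource; // assumed to be configured elsewhere

    public SubscriberDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns true if the row was inserted, false if a duplicate already existed. */
    public boolean insertIfAbsent(String subscriptionId, String msisdn, String serviceId)
            throws Exception {
        String sql = "INSERT IGNORE INTO TEST_SUBSCRIBERS "
                   + "(SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID) VALUES (?, ?, 'ACTIVE', ?)";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, subscriptionId);
            ps.setString(2, msisdn);
            ps.setString(3, serviceId);
            // INSERT IGNORE reports 0 affected rows when the unique key already exists.
            return ps.executeUpdate() == 1;
        }
    }
}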
Following the answer from @RickJames.
Plan D. Use READ-COMMITTED
Window 1
mysql> set tx_isolation='READ-COMMITTED';
mysql> start transaction;
mysql> SELECT SUBSCRIPTION_ID FROM TEST_SUBSCRIBERS s
WHERE s.SERVICE_ID="web-sub-1" AND s.MSISDN="000000002" FOR UPDATE;
Window 2
mysql> set tx_isolation='READ-COMMITTED';
mysql> start transaction;
mysql> SELECT SUBSCRIPTION_ID FROM TEST_SUBSCRIBERS s
WHERE s.SERVICE_ID="web-sub-1" AND s.MSISDN="000000002" FOR UPDATE;
Window 1
mysql> INSERT INTO TEST_SUBSCRIBERS (SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID)
VALUES('web-sub-1-000000002-10', '000000002', 'ACTIVE', 'web-sub-1');
Window 2
mysql> INSERT INTO TEST_SUBSCRIBERS (SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID)
VALUES('web-sub-1-000000002-10', '000000002', 'ACTIVE', 'web-sub-1');
<begins lock wait>
Window 1
mysql> commit;
Window 2
<lock wait ends immediately>
ERROR 1062 (23000): Duplicate entry 'web-sub-1-000000002-10' for key 'PRIMARY'
The duplicate-key error is not a deadlock, but it is still an error. However, it does not roll back the entire transaction; it only cancels the attempted insert. You still have an active transaction, and any other changes that executed successfully are still pending.
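In JDBC, that duplicate-key error typically surfaces as an SQLIntegrityConstraintViolationException (vendor error 1062, SQLState 23000), which can be caught without abandoning the transaction. A hedged sketch, assuming the caller supplies a connection with autocommit disabled:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLIntegrityConstraintViolationException;

class DuplicateTolerantInsert {
    // Assumes `con` has autocommit disabled, i.e. a transaction is already open.
    static void insertSubscriber(Connection con, String subscriptionId,
                                 String msisdn, String serviceId) throws Exception {
        String sql = "INSERT INTO TEST_SUBSCRIBERS "
                   + "(SUBSCRIPTION_ID, MSISDN, STATE, SERVICE_ID) VALUES (?, ?, 'ACTIVE', ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, subscriptionId);
            ps.setString(2, msisdn);
            ps.setString(3, serviceId);
            ps.executeUpdate();
        } catch (SQLIntegrityConstraintViolationException e) {
            // Error 1062: only this INSERT failed; the transaction is still open,
            // so earlier changes remain pending and the caller can continue or commit.
        }
    }
}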
Plan E. Use a queue
Instead of having concurrent Java threads inserting to the database, just have the Java threads enter items into a message queue (e.g. ActiveMQ). Then create one Java thread to do nothing but pull items from the queue and insert them into the database. This prevents deadlocks because there's only one thread inserting to the database.
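A minimal in-process sketch of the idea using java.util.concurrent; a real deployment would use an external broker such as ActiveMQ, and the Subscription payload and insert(...) call are placeholders:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleWriterQueue {
    // Hypothetical payload: the fields needed for one INSERT.
    static class Subscription {
        final String subscriptionId, msisdn, serviceId;
        Subscription(String subscriptionId, String msisdn, String serviceId) {
            this.subscriptionId = subscriptionId;
            this.msisdn = msisdn;
            this.serviceId = serviceId;
        }
    }

    private final BlockingQueue<Subscription> queue = new LinkedBlockingQueue<>();

    // Any number of producer threads call this instead of touching the database.
    public void submit(Subscription s) throws InterruptedException {
        queue.put(s);
    }

    // Exactly one consumer thread drains the queue, so inserts never race.
    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    insert(queue.take()); // placeholder for the SELECT-then-INSERT logic
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop on shutdown
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    private void insert(Subscription s) { /* JDBC insert goes here */ }
}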
Plan F. Embrace the deadlocks
You can't prevent all types of deadlocks; you can only handle them when they occur. Concurrent systems should be designed to anticipate some number of deadlocks and retry operations when necessary, as in the sketch below.
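A hedged sketch of such a retry wrapper: MySQL reports deadlocks as vendor error 1213 (SQLState 40001), and the transaction body is supplied by the caller. The class and interface names are illustrative:

import java.sql.Connection;
import java.sql.SQLException;

final class DeadlockRetry {
    interface TxBody { void run(Connection con) throws SQLException; }

    // Retries the whole transaction when MySQL reports a deadlock (error 1213).
    static void inTransaction(Connection con, TxBody body, int maxAttempts)
            throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                con.setAutoCommit(false);
                body.run(con);
                con.commit();
                return;
            } catch (SQLException e) {
                con.rollback();
                boolean deadlock = e.getErrorCode() == 1213; // ER_LOCK_DEADLOCK
                if (!deadlock || attempt >= maxAttempts) {
                    throw e;
                }
                // Optionally back off briefly before retrying.
            }
        }
    }
}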
Related
The MySQL documentation (https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html) mentions:
If a duplicate-key error occurs, a shared lock on the duplicate index record is set. This use of a shared lock can result in deadlock should there be multiple sessions trying to insert the same row if another session already has an exclusive lock. ...
...
INSERT ... ON DUPLICATE KEY UPDATE differs from a simple INSERT in that an exclusive lock rather than a shared lock is placed on the row to be updated when a duplicate-key error occurs.
and I've read the source code (https://github.com/mysql/mysql-server/blob/f8cdce86448a211511e8a039c62580ae16cb96f5/storage/innobase/row/row0ins.cc#L1930) corresponding to this situation; InnoDB indeed sets an S or X lock when a duplicate-key error occurs.
if (flags & BTR_NO_LOCKING_FLAG) {
  /* Set no locks when applying log
  in online table rebuild. */
} else if (allow_duplicates) {
  ... ...
  /* If the SQL-query will update or replace duplicate key we will take
  X-lock for duplicates ( REPLACE, LOAD DATAFILE REPLACE, INSERT ON
  DUPLICATE KEY UPDATE). */
  err = row_ins_set_rec_lock(LOCK_X, lock_type, block, rec, index, offsets, thr);
} else {
  ... ...
  err = row_ins_set_rec_lock(LOCK_S, lock_type, block, rec, index, offsets, thr);
}
But I wonder why InnoDB has to set such locks; it seems these locks bring more problems than they solve (the problem they do solve is described in: MySQL duplicate key error causes a shared lock set on the duplicate index record?).
First, they can easily result in deadlocks; the same MySQL documentation shows two examples of such deadlocks.
Worse, the S or X lock is not a single index-record lock; it is a next-key lock, and it may block many values from being inserted rather than just the one duplicated value.
e.g.
CREATE TABLE `t` (
`id` int NOT NULL AUTO_INCREMENT,
`c` int DEFAULT NULL,
`d` int DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uniq_idx_c` (`c`)
) ENGINE=InnoDB AUTO_INCREMENT=48 DEFAULT CHARSET=utf8mb4
mysql> select * from t;
+----+------+------+
| id | c | d |
+----+------+------+
| 30 | 10 | 10 |
| 36 | 100 | 100 |
+----+------+------+
mysql> show variables like '%iso%';
+-----------------------+-----------------+
| Variable_name | Value |
+-----------------------+-----------------+
| transaction_isolation | REPEATABLE-READ |
+-----------------------+-----------------+
1 row in set (0.41 sec)
# Transaction 1
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t values (null, 100, 100);
ERROR 1062 (23000): Duplicate entry '100' for key 't.uniq_idx_c'
# not committed
# Transaction 2
mysql> insert into t values(null, 95, 95);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t values(null, 20, 20);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> insert into t values(null, 50, 50);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
# No c in [10, 100] can be inserted
The goal in an ACID database is that queries in your session give the same result if you run them again.
Example: you run an INSERT query that results in a duplicate-key error. You would expect that if you retried that INSERT, it would fail again with the same error.
But what if another session updates the row that caused the conflict and changes the unique value? Then if you retried your INSERT, it would succeed, which is unexpected.
InnoDB has no way to implement true REPEATABLE-READ transactions when your statements are locking. E.g. INSERT/UPDATE/DELETE, or even SELECT with the locking options FOR UPDATE, FOR SHARE, or LOCK IN SHARE MODE. Locking SQL statements in InnoDB always act on the latest committed version of a row, not the version of that row that is visible to your session.
So how can InnoDB simulate REPEATABLE-READ, ensuring that the row affected by a locking statement is the same as the latest committed row?
By locking rows that are indirectly referenced by your locking statement, preventing them from being changed by other concurrent sessions.
Another possible explanation I found in the MySQL source code, in row0ins.cc at line 2141:
We set a lock on the possible duplicate: this
is needed in logical logging of MySQL to make
sure that in roll-forward we get the same duplicate
errors as in original execution
I have the following table:
CREATE TABLE `accounts` (
`name` varchar(50) NOT NULL,
`balance` int NOT NULL,
PRIMARY KEY (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
And it has two accounts in it. "Bob" has a balance of 100. "Jim" has a balance of 200.
I run this query to transfer 50 from Jim to Bob:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT * FROM accounts;
SELECT SLEEP(10);
SET @bobBalance = (SELECT balance FROM accounts WHERE name = 'bob' FOR UPDATE);
SET @jimBalance = (SELECT balance FROM accounts WHERE name = 'jim' FOR UPDATE);
UPDATE accounts SET balance = @bobBalance + 50 WHERE name = 'bob';
UPDATE accounts SET balance = @jimBalance - 50 WHERE name = 'jim';
COMMIT;
While that query is sleeping, I run the following query in a different session to set Jim's balance to 500:
UPDATE accounts SET balance = 500 WHERE name = 'jim';
What I thought would happen is that this would cause a bug. The transaction would set Jim's balance to 150, because the first read in the transaction (before the SLEEP) would establish a snapshot in which Jim's balance is 200, and that snapshot would be used in the later query to get Jim's balance. So we would subtract 50 from 200 even though Jim's balance has actually been changed to 500 by the other query.
But that's not what happens. Actually, the end result is correct. Bob has 150 and Jim has 450. But I don't understand why this is.
The MySQL documentation says about Repeatable Read:
This is the default isolation level for InnoDB. Consistent reads within the same transaction read the snapshot established by the first read. This means that if you issue several plain (nonlocking) SELECT statements within the same transaction, these SELECT statements are consistent also with respect to each other. See Section 15.7.2.3, “Consistent Nonlocking Reads”.
So what am I missing here? Why does it seem like the SELECT statements in the transaction are not all using a snapshot established by the first SELECT statement?
The repeatable-read behavior only works for non-locking SELECT queries. It reads from the snapshot established by the first query in the transaction.
But any locking SELECT query reads the latest committed version of the row, as if you had started your transaction in READ-COMMITTED isolation level.
A SELECT is implicitly a locking read if it's involved in any kind of SQL statement that modifies data.
For example:
INSERT INTO table2 SELECT * FROM table1 WHERE ...;
The above locks the examined rows in table1, even though the statement is just copying them to table2.
SET @myvar = (SELECT ... FROM table1 WHERE ...);
This is also copying a value from table1, into a variable. It locks the examined row in table1.
Likewise SELECT statements that are invoked in a trigger, or as part of a multi-table UPDATE or DELETE, and so on. Anytime the SELECT is part of a larger statement that modifies any data (in a table or in a variable), it locks the rows examined by the SELECT.
And therefore it's a locking read, and behaves like an UPDATE with respect to which row version it reads.
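A minimal JDBC demonstration of this difference against the accounts table above; the connection URL and credentials are hypothetical, and it assumes Jim's committed balance starts at 200:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LockingReadDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost/testdb"; // hypothetical database
        try (Connection session1 = DriverManager.getConnection(url, "user", "password");
             Connection session2 = DriverManager.getConnection(url, "user", "password")) {
            session1.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
            session1.setAutoCommit(false);

            // First read establishes the snapshot: prints 200.
            printBalance(session1, "SELECT balance FROM accounts WHERE name = 'jim'");

            // Another session changes the committed value.
            try (Statement st = session2.createStatement()) {
                st.executeUpdate("UPDATE accounts SET balance = 500 WHERE name = 'jim'");
            }

            // Non-locking read: still 200, from the snapshot.
            printBalance(session1, "SELECT balance FROM accounts WHERE name = 'jim'");
            // Locking read: 500, the latest committed version.
            printBalance(session1, "SELECT balance FROM accounts WHERE name = 'jim' FOR UPDATE");

            session1.rollback();
        }
    }

    static void printBalance(Connection con, String sql) throws Exception {
        try (Statement st = con.createStatement(); ResultSet rs = st.executeQuery(sql)) {
            rs.next();
            System.out.println(rs.getInt(1));
        }
    }
}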
I have a complex database. I can simplify it like this:
The table a:
CREATE TABLE a
(
Id int(10) unsigned NOT NULL AUTO_INCREMENT,
A int(11) DEFAULT NULL,
CalcUniqId int(11) DEFAULT NULL,
PRIMARY KEY (Id)
) ENGINE=InnoDB;
CREATE TRIGGER a_before_ins_tr before INSERT on a
FOR EACH ROW
BEGIN
select Max(CalcUniqId) from A into @MaxCalcUniqId;
set new.CalcUniqId=IfNull(@MaxCalcUniqId,1)+1;
END $
It is used like this:
start transaction
insert into A(A)
... inserts into other tables; this takes between 30 and 60 seconds
commit;
The problem is that the trigger returns the same CalcUniqId for all transactions that run at the same time.
Is there any solution or workaround?
Is this a solution:
start transaction;
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
insert into A(A) values(10);
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
....
commit;
You can run this test:
Session 1:
Step1: start transaction;
Step2: insert into A(A) values(1);
Step3: commit;
Session 2:
Step1: start transaction;
Step2: insert into A(A) values(2);
Step3: commit;
Run steps 1 and 2 in session 1 and steps 1 and 2 in session 2, then step 3 in both. After that, run:
select Id, A, CalcUniqId from a;
Both rows have the same CalcUniqId=2.
Change the SELECT in the trigger to this:
select Max(CalcUniqId) from A into @MaxCalcUniqId
FOR UPDATE; -- add this
That tells the server that you intend to change the value; this blocks other transactions from changing it.
This will probably lead to your 30-60 second transactions running one after another, and probably dying after exceeding lock_wait_timeout. Rather than increasing that setting (which is already "too high"), please explain the bigger picture; perhaps we can concoct a workaround that gets the "correct" value and still runs in parallel.
I have a PHP/5.2-driven application that uses transactions under MySQL/5.1 so it can roll back multiple inserts if an error condition is met. I have different reusable functions to insert different types of items. So far so good.
Now I need to use table locking for some of the inserts. As the official manual suggests, I'm using SET autocommit=0 instead of START TRANSACTION so LOCK TABLES does not issue an implicit commit. And, as documented, unlocking tables implicitly commits any active transaction:
http://dev.mysql.com/doc/refman/5.1/en/lock-tables-and-transactions.html
And here lies the problem: if I simply avoid UNLOCK TABLES, the second call to LOCK TABLES commits the pending changes anyway!
It appears that the only way is to perform all necessary LOCK TABLES in a single statement. That's a maintenance nightmare.
Does this issue have a sensible workaround?
Here's a little test script:
DROP TABLE IF EXISTS test;
CREATE TABLE test (
test_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
random_number INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (test_id)
)
COLLATE='utf8_spanish_ci'
ENGINE=InnoDB;
-- No table locking: everything's fine
START TRANSACTION;
INSERT INTO test (random_number) VALUES (ROUND(10000* RAND()));
SELECT * FROM test ORDER BY test_id;
ROLLBACK;
SELECT * FROM test ORDER BY test_id;
-- Table locking: everything's fine if I avoid START TRANSACTION
SET autocommit=0;
INSERT INTO test (random_number) VALUES (ROUND(10000* RAND()));
SELECT * FROM test ORDER BY test_id;
ROLLBACK;
SELECT * FROM test ORDER BY test_id;
SET autocommit=1;
-- Table locking: I cannot nest LOCK/UNLOCK blocks
SET autocommit=0;
LOCK TABLES test WRITE;
INSERT INTO test (random_number) VALUES (ROUND(10000* RAND()));
SELECT * FROM test ORDER BY test_id;
ROLLBACK;
UNLOCK TABLES; -- Implicit commit
SELECT * FROM test ORDER BY test_id;
SET autocommit=1;
-- Table locking: I cannot chain LOCK calls either
SET autocommit=0;
LOCK TABLES test WRITE;
INSERT INTO test (random_number) VALUES (ROUND(10000* RAND()));
SELECT * FROM test ORDER BY test_id;
-- UNLOCK TABLES;
LOCK TABLES test WRITE; -- Implicit commit
INSERT INTO test (random_number) VALUES (ROUND(10000* RAND()));
SELECT * FROM test ORDER BY test_id;
-- UNLOCK TABLES;
ROLLBACK;
SELECT * FROM test ORDER BY test_id;
SET autocommit=1;
Apparently, LOCK TABLES cannot be made to play well with transactions. A workaround is to replace it with SELECT ... FOR UPDATE. You don't need any special syntax (you can use a regular START TRANSACTION) and it works as expected:
START TRANSACTION;
SELECT COUNT(*) FROM foo FOR UPDATE; -- Lock issued
INSERT INTO foo (foo_name) VALUES ('John');
SELECT COUNT(*) FROM bar FOR UPDATE; -- Lock issued, no side effects
ROLLBACK; -- Rollback works as expected
Please note that COUNT(*) is just an example; you can use any SELECT statement that fetches the data you actually need. ;-)
(This information was provided by Frank Heikens.)
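Although the original question is about PHP, the same row-locking pattern translates directly to JDBC. A minimal sketch, with a hypothetical connection URL, credentials, and the foo/bar tables from the example above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ForUpdateInsteadOfLockTables {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL and credentials.
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost/testdb", "user", "password")) {
            con.setAutoCommit(false); // equivalent to START TRANSACTION
            try (Statement st = con.createStatement()) {
                // Lock the rows we care about instead of the whole table.
                st.executeQuery("SELECT COUNT(*) FROM foo FOR UPDATE").close();
                st.executeUpdate("INSERT INTO foo (foo_name) VALUES ('John')");
                st.executeQuery("SELECT COUNT(*) FROM bar FOR UPDATE").close();
                con.rollback(); // rollback works as expected; con.commit() also works
            } catch (Exception e) {
                con.rollback();
                throw e;
            }
        }
    }
}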