Handling Latency in MySQL Transactions - mysql

The Problem
I'm trying to figure out how to correctly set up a transaction in a database, and account for potential latency.
The Setup
In my example I have a table of users, keys, where each user can have multiple keys, and a config table that dictates how many keys each user is allowed to have.
I want to run a stored procedure that:
figures out if the given user is allowed to request a key.
gets an available, unclaimed key.
attempts to redeem the key for the given user.
the pseudocode for the procedure would be:
START TRANSACTION
(1) CALL check_permission(...,@result);
IF (@result = 'has_permission') THEN
(2) SET @unclaimed_key_id = (QUERY FOR RETURNING AVAILABLE KEY ID);
(3) CALL claim_key(@unclaimed_key_id);
END IF;
COMMIT;
The problem I am running into is that when I simulate lag after step 1 (using SELECT SLEEP(<seconds>)), a user can redeem multiple keys when they only have permission to redeem one, by running the procedure in multiple sessions before the first call has finished its sleep.
Here is the code for the Tables and the Procedures
(note: for the small example I didn't bother with indexes and foreign keys, but obviously I use those on the actual project).
To see my issue just set up the tables and procedures in a database, then open two mysql terminals, and in the first run this:
CALL `P_user_request_key`(10,1,@out);
SELECT @out;
And then quickly (you have 10 seconds) in the second run this:
CALL `P_user_request_key`(0,1,@out);
SELECT @out;
Both queries will successfully return key_claimed and User Bob will end up with 4 keys assigned to him, although the max value in config is set to 3 per user.
The Questions
What is the best way of avoiding issues like this? I'm trying to use a transaction, but I feel like it's not going to help with this specific issue, and I may be implementing it wrong.
I realize that one possible way to fix the problem would be to just encapsulate everything in one large update query, but I would prefer to avoid that, since I like being able to set up individual procedures, where each is only meant to do a single task.
The database behind this example is intended to be used by many (thousands) of concurrent users. As such it would be best if one user attempting to redeem a code doesn't block all other users from redeeming one. I'm fine with changing my code to just attempt to redeem again if another user already claimed a key, but it should absolutely not happen that a user can redeem two codes when they only have permission to get one.

You're off the hook for not wanting to encapsulate everything in one large query, because that won't actually solve anything either. It just makes the problem less likely.
What you need are locks on the rows, or locks on the index where the new row would be inserted.
InnoDB uses an algorithm called next-key locking that combines index-row locking with gap locking. InnoDB performs row-level locking in such a way that when it searches or scans a table index, it sets shared or exclusive locks on the index records it encounters. Thus, the row-level locks are actually index-record locks. In addition, a next-key lock on an index record also affects the “gap” before that index record. That is, a next-key lock is an index-record lock plus a gap lock on the gap preceding the index record. If one session has a shared or exclusive lock on record R in an index, another session cannot insert a new index record in the gap immediately before R in the index order.
http://dev.mysql.com/doc/refman/5.5/en/innodb-next-key-locking.html
So how do we get exclusive locks?
Two connections, mysql1 and mysql2, each of them requesting an exclusive lock using SELECT ... FOR UPDATE. The table 'history' has a column 'user_id' which is indexed. (It's also a foreign key.) There are no rows found, so they both appear to proceed normally as if nothing unusual is going to happen. The user_id 2808 is valid but has nothing in history.
mysql1> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql2> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql1> select * from history where user_id = 2808 for update;
Empty set (0.00 sec)
mysql2> select * from history where user_id = 2808 for update;
Empty set (0.00 sec)
mysql1> insert into history(user_id) values (2808);
... and I don't get my prompt back ... no response ... because another session has a lock, too ... but then:
mysql2> insert into history(user_id) values (2808);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Then mysql1 immediately returns success on the insert.
Query OK, 1 row affected (3.96 sec)
All that is left is for mysql1 to COMMIT and magically, we prevented a user with 0 entries from inserting more than 1 entry. The deadlock occurred because both sessions needed incompatible things to happen: mysql1 needed mysql2 to release its lock before its insert could proceed, and mysql2 needed mysql1 to release its lock before it would be able to insert. Somebody has to lose that fight, and generally the thread that has done the least work is the loser.
But what if there had been 1 or more rows already existing when I did the SELECT ... FOR UPDATE? In that case, the lock would have been on the rows, so the second session to try to SELECT would actually block waiting for the SELECT until the first session decided to either COMMIT or ROLLBACK, at which time the second session would have seen an accurate count of the number of rows (including any inserted or deleted by the first session) and could have accurately decided the user already had the maximum allowed.
You can't outrace a race condition, but you can lock them out.

Related

How to resolve database deadlock issue caused by parallel goroutines using retry transaction? [duplicate]

I have an InnoDB table which records online users. It gets updated on every page refresh by a user to keep track of which pages they are on and their last access date to the site. I then have a cron job that runs every 15 minutes to DELETE old records.
I got a 'Deadlock found when trying to get lock; try restarting transaction' for about 5 minutes last night and it appears to be when running INSERTs into this table. Can someone suggest how to avoid this error?
=== EDIT ===
Here are the queries that are running:
First Visit to site:
INSERT INTO onlineusers SET
ip = '123.456.789.123',
datetime = now(),
userid = 321,
page = '/thispage',
area = 'thisarea',
type = 3
On each page refresh:
UPDATE onlineusers SET
ip = '123.456.789.123',
datetime = now(),
userid = 321,
page = '/thispage',
area = 'thisarea',
type = 3
WHERE id = 888
Cron every 15 minutes:
DELETE FROM onlineusers WHERE datetime <= now() - INTERVAL 900 SECOND
It then does some counts to log some stats (ie: members online, visitors online).
One easy trick that can help with most deadlocks is sorting the operations in a specific order.
You get a deadlock when two transactions try to take two locks in opposite orders, i.e.:
connection 1: locks key(1), locks key(2);
connection 2: locks key(2), locks key(1);
If both run at the same time, connection 1 will lock key(1), connection 2 will lock key(2) and each connection will wait for the other to release the key -> deadlock.
Now, if you changed your queries such that the connections would lock the keys at the same order, ie:
connection 1: locks key(1), locks key(2);
connection 2: locks key(1), locks key(2);
it will be impossible to get a deadlock.
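The ordering rule is not specific to MySQL. Here is a minimal in-process sketch in Python, where two threading.Lock objects stand in for the row locks on key(1) and key(2). The names locks, touch_keys and completed are made up for illustration. Each worker is handed the keys in a different order but always acquires them in ascending order, so neither can end up holding one lock while waiting for the other.

```python
import threading

# Two in-process locks stand in for the row locks on key(1) and key(2).
locks = {1: threading.Lock(), 2: threading.Lock()}
completed = []


def touch_keys(name, keys, rounds=1000):
    """Lock every key in `keys`, always in ascending key order."""
    ordered = sorted(keys)
    for _ in range(rounds):
        for k in ordered:
            locks[k].acquire()
        try:
            pass  # work on the locked rows would happen here
        finally:
            for k in reversed(ordered):
                locks[k].release()
    completed.append(name)


# The callers ask for the keys in opposite orders, as in the deadlock example.
workers = [
    threading.Thread(target=touch_keys, args=("connection 1", [1, 2])),
    threading.Thread(target=touch_keys, args=("connection 2", [2, 1])),
]
for w in workers:
    w.start()
for w in workers:
    w.join(timeout=30)
```

If the sorted() call is removed, the two workers can each grab one lock and wait forever for the other, which is the situation described above.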
So this is what I suggest:
Make sure you have no other queries that lock more than one key at a time, except for the delete statement. If you do (and I suspect you do), order their WHERE in (k1,k2,..kn) in ascending order.
Fix your delete statement to work in ascending order:
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
To
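DELETE FROM onlineusers
WHERE id IN (
-- wrapped in a derived table: MySQL does not allow a subquery
-- to select directly from the table being deleted from
SELECT id FROM (
SELECT id FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
ORDER BY id
) u
);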
Another thing to keep in mind is that the MySQL documentation suggests that in case of a deadlock the client should retry automatically. You can add this logic to your client code (say, 3 retries on this particular error before giving up).
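A minimal sketch of such a retry loop in Python. DatabaseError here is a hypothetical stand-in for whatever exception type your driver raises, and run_with_retry and flaky_transaction are illustrative names. The only value taken from MySQL itself is error code 1213, the deadlock error. Because InnoDB rolls back the transaction it picks as the deadlock victim, the callable passed in has to re-run the whole transaction from the start, not just the failed statement.

```python
import time

DEADLOCK_ERRNO = 1213  # MySQL's "Deadlock found when trying to get lock"


class DatabaseError(Exception):
    """Hypothetical stand-in for the driver's exception type."""

    def __init__(self, errno, msg):
        super().__init__(errno, msg)
        self.errno = errno


def run_with_retry(transaction, retries=3, delay=0.05):
    """Call transaction(); on a deadlock, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Anything other than a deadlock, or the last attempt: give up.
            if exc.errno != DEADLOCK_ERRNO or attempt == retries:
                raise
            time.sleep(delay * (attempt + 1))  # brief, growing pause


# Demo: a transaction that is the deadlock victim twice, then succeeds.
attempts = []


def flaky_transaction():
    attempts.append(1)
    if len(attempts) < 3:
        raise DatabaseError(DEADLOCK_ERRNO, "Deadlock found when trying to get lock")
    return "committed"


result = run_with_retry(flaky_transaction)
```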
Deadlocks happen when two transactions wait on each other to acquire a lock. Example:
Tx 1: lock A, then B
Tx 2: lock B, then A
There are numerous questions and answers about deadlocks. Each time you insert, update, or delete a row, a lock is acquired. To avoid deadlocks, you must make sure that concurrent transactions don't update rows in an order that could result in a deadlock. Generally speaking, try to always acquire locks in the same order, even in different transactions (e.g. always table A first, then table B).
Another cause of deadlocks can be missing indexes. When a row is inserted, updated, or deleted, the database needs to check the relational constraints, that is, make sure the relations are consistent. To do so, the database needs to check the foreign keys in the related tables. This might result in locks being acquired on rows other than the one being modified. So be sure to always have an index on the foreign keys (and of course primary keys), otherwise it could result in a table lock instead of a row lock. If a table lock happens, lock contention is higher and the likelihood of deadlock increases.
In case someone is still struggling with this issue:
I faced a similar issue where 2 requests were hitting the server at the same time. There was no situation like the one below:
T1:
BEGIN TRANSACTION
INSERT TABLE A
INSERT TABLE B
END TRANSACTION
T2:
BEGIN TRANSACTION
INSERT TABLE B
INSERT TABLE A
END TRANSACTION
So, I was puzzled why the deadlock was happening.
Then I found that there was a parent-child relationship between the 2 tables because of a foreign key. When I was inserting a record in the child table, the transaction was acquiring a lock on the parent table's row. Immediately after that I was trying to update the parent row, which triggered an upgrade of the lock to an EXCLUSIVE one. As the 2nd concurrent transaction was already holding a SHARED lock, this caused a deadlock.
Refer to: https://blog.tekenlight.com/2019/02/21/database-deadlock-mysql.html
It is likely that the delete statement will affect a large fraction of the total rows in the table. Eventually this might lead to a table lock being acquired when deleting. Holding on to a lock (in this case row- or page locks) and acquiring more locks is always a deadlock risk. However I can't explain why the insert statement leads to a lock escalation - it might have to do with page splitting/adding, but someone knowing MySQL better will have to fill in there.
For a start it can be worth trying to explicitly acquire a table lock right away for the delete statement. See LOCK TABLES and Table locking issues.
You might try having that delete job operate by first inserting the key of each row to be deleted into a temp table like this pseudocode
create temporary table deletetemp (userid int);
insert into deletetemp (userid)
select userid from onlineusers where datetime <= now() - interval 900 second;
delete from onlineusers where userid in (select userid from deletetemp);
Breaking it up like this is less efficient but it avoids the need to hold a key-range lock during the delete.
Also, modify your select queries to add a where clause excluding rows older than 900 seconds. This avoids the dependency on the cron job and allows you to reschedule it to run less often.
Theory about the deadlocks: I don't have a lot of background in MySQL but here goes... The delete is going to hold a key-range lock for datetime, to prevent rows matching its where clause from being added in the middle of the transaction, and as it finds rows to delete it will attempt to acquire a lock on each page it is modifying. The insert is going to acquire a lock on the page it is inserting into, and then attempt to acquire the key lock. Normally the insert will wait patiently for that key lock to open up, but this will deadlock if the delete tries to lock the same page the insert is using, because the delete needs that page lock and the insert needs that key lock. This doesn't seem right for inserts though: the delete and insert are using datetime ranges that don't overlap, so maybe something else is going on.
http://dev.mysql.com/doc/refman/5.1/en/innodb-next-key-locking.html
For Java programmers using Spring, I've avoided this problem using an AOP aspect that automatically retries transactions that run into transient deadlocks.
See @RetryTransaction Javadoc for more info.
cron is dangerous. If one instance of cron fails to finish before the next is due, they are likely to fight each other.
It would be better to have a continuously running job that would delete some rows, sleep some, then repeat.
Also, INDEX(datetime) is very important for avoiding deadlocks.
But, if the datetime test includes more than, say, 20% of the table, the DELETE will do a table scan. Smaller chunks deleted more often is a workaround.
Another reason for going with smaller chunks is to lock fewer rows.
Bottom line:
INDEX(datetime)
Continually running task -- delete, sleep a minute, repeat.
To make sure that the above task has not died, have a cron job whose sole purpose is to restart it upon failure.
Other deletion techniques: http://mysql.rjweb.org/doc.php/deletebig
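A rough Python sketch of one pass of the "smaller chunks" idea. It relies on the fact that a single-table DELETE in MySQL accepts ORDER BY and LIMIT. The run_delete callable is hypothetical: it is assumed to execute the statement on your connection, commit, and return the affected row count. The chunk size of 1000 is an arbitrary choice.

```python
import time

# A single-table DELETE in MySQL accepts ORDER BY and LIMIT.
DELETE_CHUNK_SQL = (
    "DELETE FROM onlineusers "
    "WHERE datetime <= NOW() - INTERVAL 900 SECOND "
    "ORDER BY id LIMIT %s"
)


def purge_old_rows(run_delete, chunk_size=1000, pause=0.0):
    """Delete expired rows in chunks; return the total number deleted.

    run_delete(sql, params) is a hypothetical callable that executes the
    statement, commits, and returns the number of rows affected.
    """
    total = 0
    while True:
        deleted = run_delete(DELETE_CHUNK_SQL, (chunk_size,))
        total += deleted
        if deleted < chunk_size:
            return total  # nothing (or only a partial chunk) was left
        time.sleep(pause)  # let other sessions in between chunks


# Demo with a fake executor that pretends 2500 rows have expired.
remaining = [2500]


def fake_run_delete(sql, params):
    n = min(params[0], remaining[0])
    remaining[0] -= n
    return n


total_deleted = purge_old_rows(fake_run_delete, chunk_size=1000)
```

The continuously running task would call purge_old_rows, sleep for a minute, and repeat. Each chunk commits on its own, so row locks are held only briefly.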
@Omry Yadan's answer ( https://stackoverflow.com/a/2423921/1810962 ) can be simplified by using ORDER BY.
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
to
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
ORDER BY ID
to keep the order in which you delete items consistent. Also if you are doing multiple inserts in a single transaction, make sure they are also always ordered by id.
According to the mysql delete documentation:
If the ORDER BY clause is specified, the rows are deleted in the order that is specified.
You can find a reference here: https://dev.mysql.com/doc/refman/8.0/en/delete.html
I have a method, the internals of which are wrapped in a MySqlTransaction.
The deadlock issue showed up for me when I ran the same method in parallel with itself.
There was not an issue running a single instance of the method.
When I removed MySqlTransaction, I was able to run the method in parallel with itself with no issues.
Just sharing my experience, I'm not advocating anything.

FOR UPDATE doesn't seem to lock the row in MySql InnoDB

MySql = v5.6
Table engine = InnoDB
I have one mysql cli open. I run:
START TRANSACTION;
SELECT id FROM my_table WHERE id=1 FOR UPDATE;
I then have a second cli open and run:
SELECT id FROM my_table WHERE id=1;
I expected it to wait until I either committed or rolled back the first transaction but it doesn't, it just brings back the row straight away as if no row-locking had occurred.
I did another test where I updated a status field in the first cli and I couldn't see that change in the 2nd cli until I committed the transaction, proving the transactions are actually working.
Am I misunderstanding FOR UPDATE or doing something wrong?
update:
Needed FOR UPDATE on the 2nd SELECT query
The behavior you saw is valid. With "MVCC", different connections can see different versions of the row(s).
The first connection grabbed a type of lock that prevents writes, but not reads. If the second connection had done FOR UPDATE or INSERT or other "write" type of operation, it would have been either delayed waiting for the lock to be released, or deadlocked. (A deadlock would require other locks going on also.)
Common Pattern
BEGIN;
SELECT ... FOR UPDATE; -- the row(s) you will update in this transaction
-- miscellaneous work
UPDATE...; -- those row(s).
COMMIT;
If two threads are running that code at the "same" time on the same row(s), the second one will stall at the SELECT..FOR UPDATE. After the first thread finishes, the SELECT will run, getting the new values. All is well.
Meanwhile, other threads can SELECT (without for update) and get some value. Think of these threads as getting the value before or after the transaction, depending on the exact timing of all the threads. The important thing is that these 'other' threads will see a consistent view of the data -- either none of the updates in that transaction have been applied, or all have been applied. This is what "Atomic" means.

Select only unlocked rows mysql

I have locked one row in one transaction with the following query
START TRANSACTION;
SELECT id FROM children WHERE id=100 FOR UPDATE;
And in another transaction I have the query below
START TRANSACTION;
SELECT id FROM children WHERE id IN (98,99,100) FOR UPDATE;
It fails with the error "Lock wait timeout exceeded".
Here 100 is already locked (in the first transaction), but ids 98 and 99 are not. Is there any way to return records 98 and 99 when only row 100 is locked? The result should be as below:
Id
===
98
99
===
Id 100 should be ignored because 100 is locked by a transaction.
Looks like SKIP LOCKED option mentioned in a previous answer is now available in MySQL. It does not wait to acquire a row lock and allows you to work with rows that are not currently locked.
From MySQL 8.0.0 Release Notes/Changes in MySQL 8.0.1:
InnoDB now supports NOWAIT and SKIP LOCKED options with SELECT ... FOR SHARE and SELECT ... FOR UPDATE locking read statements. NOWAIT causes the statement to return immediately if a requested row is locked by another transaction. SKIP LOCKED removes locked rows from the result set. See Locking Read Concurrency with NOWAIT and SKIP LOCKED.
Sample usage (complete example with outputs can be found in the link above):
START TRANSACTION;
SELECT * FROM tableName FOR UPDATE SKIP LOCKED;
Also, it might be good to include the warning in the Reference Manual here as well:
Queries that skip locked rows return an inconsistent view of the data. SKIP LOCKED is therefore not suitable for general transactional work. However, it may be used to avoid lock contention when multiple sessions access the same queue-like table.
MySQL does not have a way to ignore locked rows in a SELECT. You'll have to find a different way to set a row aside as "already processed".
The simplest way is to lock the row briefly in the first query just to mark it as "already processed", then unlock it and lock it again for the rest of the processing - the second query will wait for the short "marker" query to complete, and you can add an explicit WHERE condition to ignore already-marked rows. If you don't want to rely on the first operation being able to complete successfully, you may need to add a bit more complexity with timestamps and such to clean up after those failed operations.
MySQL does not have this feature. For anyone searching for this topic in general, some RDBMS have better/smarter locking features than others.
For developers constrained to MySQL, the best approach is to add a column (or use an existing one, e.g., a status column) that can be set to "locked" or "in progress" or similar. Execute SELECT ID ... WHERE IN_PROGRESS != 1 FOR UPDATE; to get the ID of the row you want to lock, then issue UPDATE .. SET IN_PROGRESS = 1 WHERE ID = XX and commit, which releases the row lock while leaving the row marked as taken.
Using LOCK IN SHARE MODE is almost never the solution: it will let you read the old value, but that value is in the process of being updated, so unless you are performing a non-atomic task there's no point in even looking at that record.
Better* RDBMS recognize this pattern (select one row to work on and lock it, work on it, unlock it) and provide a smarter approach that lets you only search unlocked records. For example, PostgreSQL 9.5+ provide SELECT ... SKIP LOCKED which only selects from within the unlocked subset of rows matching the query. That lets you obtain an exclusive lock on a row, service that record to completion, then update & unlock the record in question without having to block other threads/consumers from being able to work independent of yourself.
*Here "better" means from the perspective of atomic updates, multi-consumer architecture, etc. and not necessarily "better designed" or "overall better." Not trying to start a flamewar here.
As per http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
The solution is to perform the SELECT in a locking mode using LOCK IN SHARE MODE:
SELECT * FROM parent WHERE NAME = 'Jones' LOCK IN SHARE MODE;

While in a transaction, how can reads to an affected row be prevented until the transaction is done?

I'm fairly sure this has a simple solution, but I haven't been able to find it so far. Provided an InnoDB MySQL database with the isolation level set to SERIALIZABLE, and given the following operation:
BEGIN WORK;
SELECT * FROM users WHERE userID=1;
UPDATE users SET credits=100 WHERE userID=1;
COMMIT;
I would like to make sure that as soon as the select inside the transaction is issued, the row corresponding to userID=1 is locked for reads until the transaction is done. As it stands now, UPDATEs to this row will wait for the transaction to be finished if it is in process, but SELECTs simply will read the previous value. I understand this is the expected behaviour in this case, but I wonder if there is a way to lock the row in such a way that SELECTs will also wait until the transaction is finished to return the values?
The reason I'm looking for that is that at some point, and with enough concurrent users, it could happen that while the previous transaction is in process someone else reads the "credits" to calculate something else. Ideally the code run by that someone else should wait for the transaction to finish to use the new value, because otherwise it could lead to irreversible desync issues.
Note that I don't want to lock the entire table for reads, just the specific row.
Also, I could add a boolean "locked" field to the tables and set it to 1 every time I'm starting a transaction but I don't really feel this is the most elegant solution here, unless there is absolutely no other way to handle this through mysql directly.
I found a workaround, specifically:
SELECT ... LOCK IN SHARE MODE sets a shared mode lock on the rows
read. A shared mode lock enables other sessions to read the rows but
not to modify them. The rows read are the latest available, so if they
belong to another transaction that has not yet committed, the read
blocks until that transaction ends.
(Source)
It seems that one can just include LOCK IN SHARE MODE in the critical SELECT statements that rely on transactional data and they will indeed wait for current transactions to finish before retrieving the row/s. For this to work the transaction has to use FOR UPDATE explicitly (as opposed to the original example I gave). E.g., given the following:
BEGIN WORK;
SELECT * FROM users WHERE userID=1 FOR UPDATE;
UPDATE users SET credits=100 WHERE userID=1;
COMMIT;
Anywhere else in the code I could use:
SELECT * FROM users WHERE userID=1 LOCK IN SHARE MODE;
Since this statement is not wrapped in a transaction, the lock is released immediately and has no impact on subsequent queries. But if the row for userID=1 has been selected for update within a transaction, this statement waits until that transaction is done, which is exactly what I was looking for.
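For reference, here is a sketch of how the writing side might look from client code in Python. It assumes conn is a DB-API connection to MySQL with autocommit off and the %s parameter style (as used by drivers such as PyMySQL). The function name add_credits is made up; the table and column names come from the example above.

```python
def add_credits(conn, user_id, delta):
    """Read-modify-write on users.credits while holding the row lock.

    `conn` is assumed to be a DB-API connection to MySQL with autocommit
    off and the %s parameter style.
    """
    cur = conn.cursor()
    try:
        # Waits here while another transaction holds a lock on this row.
        cur.execute(
            "SELECT credits FROM users WHERE userID = %s FOR UPDATE",
            (user_id,),
        )
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return None
        new_credits = row[0] + delta
        cur.execute(
            "UPDATE users SET credits = %s WHERE userID = %s",
            (new_credits, user_id),
        )
        conn.commit()  # ends the transaction and releases the row lock
        return new_credits
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Readers that must not see a value that is about to change would use the LOCK IN SHARE MODE query shown above. Plain SELECTs are not blocked and keep returning the last committed value.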
You could try the SELECT ... FOR UPDATE locking read.
A SELECT ... FOR UPDATE reads the latest available data, setting exclusive locks on each row it reads. Thus, it sets the same locks a searched SQL UPDATE would set on the rows.
Please go through the following site: http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html

MySQL pause index rebuild on bulk INSERT without TRANSACTION

I have a lot of data to INSERT LOW_PRIORITY into a table. As the index is rebuilt every time a row is inserted, this takes a long time. I know I could use transactions, but this is a case where I don't want the whole set to fail if just one row fails.
Is there any way to get MySQL to stop rebuilding indices on a specific table until I tell it that it can resume?
Ideally, I would like to insert 1,000 rows or so, let the index do its thing, and then insert the next 1,000 rows.
I cannot use INSERT DELAYED as my table type is InnoDB. Otherwise, INSERT DELAYED would be perfect for me.
Not that it matters, but I am using PHP/PDO to access MySQL. Any advice you could give would be appreciated. Thanks!
ALTER TABLE tableName DISABLE KEYS;
-- perform inserts
ALTER TABLE tableName ENABLE KEYS;
This disables updating of all non-unique indexes. The disadvantage is that those indexes won't be used for select queries either. Note that DISABLE KEYS only takes effect on MyISAM tables; InnoDB ignores it.
You can however use multi-inserts (INSERT INTO table(...) VALUES(...),(...),(...)), which will also update indexes in batches.
AFAIK, for those that use InnoDB tables, if you don't want indexes to be rebuilt after each INSERT, you must use transactions.
For example, for inserting a batch of 1000 rows, use the following SQL:
SET autocommit=0;
-- Insert the rows one after the other, or using multi-value inserts
COMMIT;
By disabling autocommit, a transaction will be started at the first INSERT. Then, the rows are inserted one after the other and at the end, the transaction is committed and the indexes are rebuilt.
If an error occurs during execution of one of the INSERT, the transaction is not rolled back but an error is reported to the client which has the choice of rolling back or continuing. Therefore, if you don't want the entire batch to be rolled back if one INSERT fails, you can log the INSERTs that failed and continue inserting the rows, and finally commit the transaction at the end.
However, take into account that wrapping the INSERTs in a transaction means you will not be able to see the inserted rows until the transaction is committed. It is possible to set the transaction isolation level for the SELECT to READ_UNCOMMITTED but as I've tested it, the rows are not visible when the SELECT happens very close to the INSERT. See my post.
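The log-and-continue approach from this answer could look roughly like the following in Python. This is a sketch only: it runs against an in-memory SQLite database as a stand-in so that it is self-contained, and the items table and the insert_batch name are made up. With a MySQL driver the placeholders would be %s instead of ?, and db_error would be the driver's own exception class. Be aware that some MySQL errors, a deadlock for example, roll back the whole transaction rather than just the failed statement.

```python
import sqlite3


def insert_batch(conn, sql, rows, db_error):
    """Insert rows in one transaction; collect failures instead of aborting."""
    failed = []
    cur = conn.cursor()
    for row in rows:
        try:
            cur.execute(sql, row)
        except db_error as exc:
            failed.append((row, str(exc)))  # log the bad row and keep going
    conn.commit()  # one commit for the whole batch
    return failed


# Demo on an in-memory SQLite database, standing in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
rows = [(1, "a"), (2, "b"), (1, "duplicate id"), (3, "c")]
failed = insert_batch(
    conn, "INSERT INTO items (id, name) VALUES (?, ?)", rows, sqlite3.Error
)
stored = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
```

After the run, failed holds the one rejected row together with its error message, and the other three rows are committed.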