Intermittent Lock Wait Timeout in Laravel DB Transaction (with 5 retries) - MySQL

We've been experiencing intermittent lock timeout errors (roughly 1-2 a day out of ~250 orders).
On checkout, we get all of the user's details, save the order, process any payments, and then update the order. I think it may be the secondary update that's causing it.
Example of our code (not exactly the same but close enough):
DB::transaction(function () use ($data, $paymentMethod, $singleUseTokenId, $requiresPayment, $chargeAccount) {
    // create order locally
    $order = Order::create([
        'blah' => $data['blah'],
    ]);

    // handle payment
    $this->handlePayment();

    // update order with new status (with a secondary transaction for safety)
    DB::transaction(function () use ($order) {
        $order->update([
            'status' => 'new status',
        ]);
    }, 5);
}, 5); // Retry the transaction 5 times - this reduced the lock timeout errors a lot
And the intermittent error we get back is (actual values removed):
SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction (SQL: insert into `orders` (`user_id`, `customer_uuid`, `type_uuid`, `status_uuid`, `po_number`, `order_details`, `cart_identifier`, `cart_content`, `cart_sub_total`, `cart_tax`, `cart_grand_total`, `payment_type_uuid`, `shipping_address`, `uuid`, `updated_at`, `created_at`)
I've read up on this a lot; some people say to increase the timeout (which seems like a workaround), others suggest optimistic locking (I thought transactions already did that), and other things.
From what I can tell from database breadcrumbs, the order create sometimes takes a long time (e.g. I saw one at 3s and another at 23s for some reason, when the insert is usually ~50ms), then the other steps happen, and when it tries to update the order the row is still locked from the create().
Notes:
We have 4 foreign keys on the orders table (customer uuid, user uuid, order type uuid, order status uuid) - I suspect these may be causing issues.
Some Eloquent creates take 3s, others 23s (we only checked the problem ones). On most orders the request is 500ms max, so these are outliers.
Any suggestions?
Solution: there was no primary key on orders.uuid. Very silly mistake. It caused InnoDB to fall back to its hidden 6-byte clustered index key, which locked up under the consecutive insert-then-update pattern.
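For anyone else hitting this, the fix is just to declare the primary key explicitly. A minimal sketch, assuming the `orders` table and `uuid` column from the error message above:
-- Without an explicit PRIMARY KEY (or a UNIQUE NOT NULL index),
-- InnoDB generates a hidden 6-byte GEN_CLUST_INDEX row id instead.
ALTER TABLE `orders` ADD PRIMARY KEY (`uuid`);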

If you see "lock wait timeout" errors, look at the other transactions. Long-running transactions are particularly harmful. You can spot them in SHOW ENGINE INNODB STATUS\G, and a climbing InnoDB history list length is another sign that they exist. Currently running long transactions are listed in information_schema.INNODB_TRX.
Note that once a transaction grabs an exclusive lock, it is not released until the end of the transaction, not the end of the query.
First, rule out long-running queries. For example, a slow UPDATE will hold its locks for its entire execution time.
Once all queries are reasonably fast, review your transactions. Make them as short as possible. Quite often clients open a transaction, execute a query or two, then go off to third-party API calls or other heavy lifting while keeping the transaction open. Meanwhile, other transactions will be getting "Lock wait timeout" errors.
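As a concrete starting point, this is roughly the query I would run against information_schema.INNODB_TRX to spot transactions that have been open for a while (the 5-second threshold is just an example):
-- List transactions that have been open longer than 5 seconds, oldest first
SELECT trx_id, trx_started, trx_mysql_thread_id, trx_query
FROM information_schema.INNODB_TRX
WHERE trx_started < NOW() - INTERVAL 5 SECOND
ORDER BY trx_started;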

Related

MariaDB. Use Transaction Rollback without locking tables

On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so that if any query fails at any point I can easily roll back and undo all changes.
Now I noticed that this blocks INSERTs from other connections, and I'm not talking about while the query is running (that's obvious), but until the transaction is closed.
DELETE is only blocked if the rows share a common index key (comments for the same page), but luckily UPDATE is not blocked.
Can I run a transaction that does not block the table from new inserts (while the transaction is ongoing, not just during the actual query), or is there any other method that lets me conveniently "undo" any queries done after some point?
PD:
I start the transaction with PHP's mysqli_begin_transaction() without any flags, and then commit with mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is keep the transaction time as short as possible. For example, you can prepare all statements before you start the transaction, but that is a matter of milliseconds. If you transfer files and that can take 40 seconds, then you shouldn't do it while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates the operation is not complete. You can also save them in a different folder, but on the same partition. Then, when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean up and remove the unrenamed files.
All write operations work in similar ways -- they lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also acquire locks this way.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?

How to resolve database deadlock issue caused by parallel goroutines using retry transaction? [duplicate]

I have an InnoDB table which records online users. It gets updated on every page refresh by a user to keep track of which pages they are on and their last access date to the site. I then have a cron job that runs every 15 minutes to DELETE old records.
I got a 'Deadlock found when trying to get lock; try restarting transaction' error for about 5 minutes last night, and it appears to have happened while running INSERTs into this table. Can someone suggest how to avoid this error?
=== EDIT ===
Here are the queries that are running:
First Visit to site:
INSERT INTO onlineusers SET
    ip = '123.456.789.123',
    datetime = now(),
    userid = 321,
    page = '/thispage',
    area = 'thisarea',
    type = 3
On each page refresh:
UPDATE onlineusers SET
    ip = '123.456.789.123',
    datetime = now(),
    userid = 321,
    page = '/thispage',
    area = 'thisarea',
    type = 3
WHERE id = 888
Cron every 15 minutes:
DELETE FROM onlineusers WHERE datetime <= now() - INTERVAL 900 SECOND
It then does some counts to log some stats (ie: members online, visitors online).
One easy trick that can help with most deadlocks is to sort the operations into a specific order.
You get a deadlock when two transactions try to acquire two locks in opposite orders, i.e.:
connection 1: locks key(1), locks key(2);
connection 2: locks key(2), locks key(1);
If both run at the same time, connection 1 will lock key(1), connection 2 will lock key(2) and each connection will wait for the other to release the key -> deadlock.
Now, if you change your queries so that the connections lock the keys in the same order, i.e.:
connection 1: locks key(1), locks key(2);
connection 2: locks key(1), locks key(2);
it will be impossible to get a deadlock.
So this is what I suggest:
Make sure you have no queries other than the delete statement that lock more than one key at a time. If you do (and I suspect you do), order their WHERE ... IN (k1, k2, ..., kn) lists in ascending order.
Fix your delete statement to work in ascending order:
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
To
DELETE FROM onlineusers
WHERE id IN (
    SELECT id FROM (
        -- the derived table works around MySQL error 1093
        -- (you can't select from the table being deleted)
        SELECT id FROM onlineusers
        WHERE datetime <= now() - INTERVAL 900 SECOND
        ORDER BY id
    ) u
);
Another thing to keep in mind is that the MySQL documentation suggests that in case of a deadlock the client should retry automatically. You can add this logic to your client code (say, 3 retries on this particular error before giving up).
Deadlocks happen when two transactions wait on each other to acquire a lock. Example:
Tx 1: lock A, then B
Tx 2: lock B, then A
There are numerous questions and answers about deadlocks. Each time you insert, update, or delete a row, a lock is acquired. To avoid deadlocks, you must make sure that concurrent transactions don't update rows in an order that could result in a deadlock. Generally speaking, try to always acquire locks in the same order, even across different transactions (e.g. always table A first, then table B).
Another reason for deadlocks in a database can be missing indexes. When a row is inserted, updated, or deleted, the database needs to check the relational constraints, that is, make sure the relations are consistent. To do so, it checks the foreign keys in the related tables, which may result in locks being acquired on rows other than the one being modified. Be sure to always have indexes on the foreign keys (and of course on primary keys), otherwise it could result in a table lock instead of a row lock. When table locks happen, lock contention is higher and the likelihood of deadlock increases.
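A sketch with hypothetical table names; the point is that the foreign key column carries its own index, so constraint checks can take row locks rather than scanning (InnoDB itself creates an index on the referencing column when you declare a FOREIGN KEY, but it is worth verifying both sides):
-- hypothetical schema: order_item.order_id references orders.id
SHOW INDEX FROM order_item;  -- verify the FK column is indexed
ALTER TABLE order_item ADD INDEX idx_order_item_order_id (order_id);  -- add it if missing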
In case someone is still struggling with this issue:
I faced a similar issue where 2 requests were hitting the server at the same time. There was no situation like the one below:
T1:
BEGIN TRANSACTION
INSERT TABLE A
INSERT TABLE B
END TRANSACTION
T2:
BEGIN TRANSACTION
INSERT TABLE B
INSERT TABLE A
END TRANSACTION
So I was puzzled as to why the deadlock was happening.
Then I found that there was a parent-child relationship between the 2 tables because of a foreign key. When I inserted a record into the child table, the transaction acquired a SHARED lock on the parent table's row. Immediately after that I updated the parent row, which triggered an upgrade of the lock to an EXCLUSIVE one. As the 2nd concurrent transaction was already holding a SHARED lock on the same row, this caused a deadlock.
Refer to: https://blog.tekenlight.com/2019/02/21/database-deadlock-mysql.html
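A minimal sketch of that pattern with hypothetical parent/child tables; the FK check on the INSERT takes a shared lock on the parent row, and the subsequent UPDATE has to upgrade it to an exclusive one:
-- hypothetical schema: child.parent_id references parent.id
BEGIN;
INSERT INTO child (parent_id, data) VALUES (1, 'x');           -- shared lock on parent row 1 (FK check)
UPDATE parent SET child_count = child_count + 1 WHERE id = 1;  -- upgrade to an exclusive lock
COMMIT;
-- Two of these running concurrently can each hold the shared lock
-- and block the other's upgrade: a deadlock.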
It is likely that the delete statement will affect a large fraction of the total rows in the table. Eventually this might lead to a table lock being acquired when deleting. Holding on to a lock (in this case row or page locks) and acquiring more locks is always a deadlock risk. However, I can't explain why the insert statement leads to lock escalation - it might have to do with page splitting/adding, but someone who knows MySQL better will have to fill that in.
For a start it can be worth trying to explicitly acquire a table lock right away for the delete statement. See LOCK TABLES and Table locking issues.
You might try having that delete job operate by first inserting the key of each row to be deleted into a temp table, like this:
create temporary table deletetemp (userid int);

insert into deletetemp (userid)
select userid from onlineusers where datetime <= now() - interval 900 second;

delete from onlineusers where userid in (select userid from deletetemp);
Breaking it up like this is less efficient but it avoids the need to hold a key-range lock during the delete.
Also, modify your select queries to add a where clause excluding rows older than 900 seconds. This avoids the dependency on the cron job and allows you to reschedule it to run less often.
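For example (table and column names from the question):
select * from onlineusers
where datetime > now() - interval 900 second;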
Theory about the deadlocks: I don't have a lot of background in MySQL, but here goes... The delete is going to hold a key-range lock for datetime, to prevent rows matching its where clause from being added in the middle of the transaction, and as it finds rows to delete it will attempt to acquire a lock on each page it is modifying. The insert is going to acquire a lock on the page it is inserting into, and then attempt to acquire the key lock. Normally the insert will wait patiently for that key lock to open up, but this will deadlock if the delete tries to lock the same page the insert is using, because the delete needs that page lock and the insert needs that key lock. This doesn't seem right for inserts though; the delete and insert are using datetime ranges that don't overlap, so maybe something else is going on.
http://dev.mysql.com/doc/refman/5.1/en/innodb-next-key-locking.html
For Java programmers using Spring, I've avoided this problem using an AOP aspect that automatically retries transactions that run into transient deadlocks.
See the @RetryTransaction Javadoc for more info.
cron is dangerous. If one instance of cron fails to finish before the next is due, they are likely to fight each other.
It would be better to have a continuously running job that would delete some rows, sleep some, then repeat.
Also, INDEX(datetime) is very important for avoiding deadlocks.
But if the datetime test matches more than, say, 20% of the table, the DELETE will do a table scan. Deleting smaller chunks more often is a workaround.
Another reason for going with smaller chunks is to lock fewer rows.
Bottom line:
INDEX(datetime)
Continually running task -- delete, sleep a minute, repeat.
To make sure that the above task has not died, have a cron job whose sole purpose is to restart it upon failure.
Other deletion techniques: http://mysql.rjweb.org/doc.php/deletebig
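A sketch of the smaller-chunks approach mentioned above (the batch size is just an example); run it repeatedly, sleeping between runs, and stop when it affects zero rows:
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
ORDER BY id
LIMIT 1000;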
@Omry Yadan's answer (https://stackoverflow.com/a/2423921/1810962) can be simplified by using ORDER BY.
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
to
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
ORDER BY id
to keep the order in which you delete items consistent. Also, if you are doing multiple inserts in a single transaction, make sure they are always ordered by id.
According to the MySQL DELETE documentation:
If the ORDER BY clause is specified, the rows are deleted in the order that is specified.
You can find a reference here: https://dev.mysql.com/doc/refman/8.0/en/delete.html
I have a method, the internals of which are wrapped in a MySqlTransaction.
The deadlock issue showed up for me when I ran the same method in parallel with itself.
There was not an issue running a single instance of the method.
When I removed MySqlTransaction, I was able to run the method in parallel with itself with no issues.
Just sharing my experience, I'm not advocating anything.

MySQL "LOCK TABLES" timeout?

What's the timeout for mysql LOCK TABLES statement?
Can't find it anywhere.
I tried setting the variable innodb_lock_wait_timeout in my.cnf, but it seems to be related to a different kind of (row-level) locking, not to table locking.
It simply has no effect on LOCK TABLES.
I want to set some low timeout value for the deadlock case, because if some operation locks tables and something goes wrong, it hangs up the whole site!
Which is terrible, for example, when a customer is finishing a purchase on your site.
My work-around is to create a dedicated lock table and just lock a row in that table. This has the advantage of only locking the processes that specifically want to be locked. Other parts of the application can continue to access the tables even if they are at some point touched by the update processes.
Setup
CREATE TABLE `mutex` (
    EMPTY ENUM('') NOT NULL,
    PRIMARY KEY (EMPTY)
);
Usage
set innodb_lock_wait_timeout = 1;
start transaction;
insert into `mutex` values();
[... do the real work here ... or somewhere else ... even a different machine ...]
delete from `mutex`;
commit;
Why are you using LOCK TABLES?
If you are using MyISAM (which sometimes needs LOCK TABLES), you should convert to InnoDB.
If you are using InnoDB, you should never use LOCK TABLES. Instead, depend on innodb_lock_wait_timeout (default is an unreasonably high 50 seconds). And you should check for errors.
InnoDB Deadlocks are caught and immediately cause an error. Certain non-deadlocks may wait for innodb_lock_wait_timeout.
Edit
Since the transaction looks like
BEGIN;
SELECT ...;
compute some stuff
UPDATE ... (using that stuff);
COMMIT;
You need to add FOR UPDATE on the end of the SELECT.
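A sketch of the corrected transaction with a hypothetical inventory table; the FOR UPDATE takes the exclusive row lock up front, so the later UPDATE cannot deadlock against another session doing the same read-then-write:
BEGIN;
SELECT qty FROM inventory WHERE item_id = 42 FOR UPDATE;  -- lock the row now
-- compute the new quantity in the application
UPDATE inventory SET qty = qty - 1 WHERE item_id = 42;
COMMIT;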
I think you are after the table_lock_wait_timeout variable, which was introduced in MySQL 5.0.10 but subsequently removed in 5.5. Unfortunately, the release notes don't specify an alternative, and I'm guessing that the general attitude is to switch over to using InnoDB transactions, as @Rick James has stated in his answer.
I think that removing the variable was unhelpful. Others may regard this as a case of the XY Problem, where we are trying to fix a symptom (deadlocks) by changing the timeout period of locking tables when really we should resolve the root cause by switching over to transactions instead. I think there may still be cases where table locks are more suitable to the application than using transactions and are perhaps a lot easier to comprehend, even if they are worse performing.
The nice thing about using LOCK TABLES is that you can state the tables your queries depend upon before proceeding. With transactions, the locks are grabbed at the last possible moment, and if they can't be fetched and time out, you then need to check for this failure and roll back before trying everything all over again. It's simpler to have a 1-second timeout (the minimum) on the LOCK TABLES query, keep retrying to get the lock(s) until you succeed, and then proceed with your queries before unlocking the tables. This logic is at no risk of deadlocks.
I believe the developers' attitude is summed up by the following excerpt from the documentation:
...avoid using the LOCK TABLES statement, because it does not offer
any extra protection, but instead reduces concurrency.
The correct answer is the lock_wait_timeout system variable.
From the documentation:
This variable specifies the timeout in seconds for attempts to acquire
metadata locks. The permissible values range from 1 to 31536000 (1
year). The default is 31536000.
This timeout applies to all statements that use metadata locks. These
include DML and DDL operations on tables, views, stored procedures,
and stored functions, as well as LOCK TABLES, FLUSH TABLES WITH READ
LOCK, and HANDLER statements.
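So to get the fail-fast behaviour the question asks for, something like this (the 5-second value and the table name are just examples):
SET SESSION lock_wait_timeout = 5;  -- applies to metadata locks, including LOCK TABLES
LOCK TABLES orders WRITE;           -- now errors out after ~5 seconds instead of hanging
-- ... do the work ...
UNLOCK TABLES;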
I think you mean the default timeout value, which is 50 seconds. Per the MySQL documentation:
innodb_lock_wait_timeout (default: 50) - The timeout in seconds an InnoDB transaction may wait for a row lock before giving up. The default value is 50 seconds.

Why does mysql deadlock here?

Once in a while I get a MySQL error. The error is:
Deadlock found when trying to get lock; try restarting transaction
The query is
var res = cn.Execute(@"insert ignore into
    Post(`desc`, item_id, user, flags)
    select @desc, @itemid, @userid, 0",
    new { desc, itemid, userid });
How on earth can this query cause a deadlock? When googling I saw something about how long-running queries lock rows and cause this problem, but no existing rows need to be touched for this insert.
Deadlocks are caused by inter-transaction ordering and lock acquisitions. Generally there is one active transaction per connection (although different databases may work differently). So it is only in the case of multiple connections and thus multiple overlapping transactions that deadlocks can occur. A single connection/transaction cannot deadlock itself because there is no lock it can't acquire: it has it, or it can get it.
An insert deadlock can be caused by a unique constraint - so check for a unique key constraint as a culprit. Other causes could be locks held for select "for update" statements, etc.
Also, ensure all transactions are completed immediately (committed or rolled back) after the operation(s) that require them. If a transaction is not closed in a timely manner it can lead to such deadlock behavior trivially. While "autocommit" usually handles this, it can be changed and should not be relied upon: I recommend proper manual transaction usage.
See Mysql deadlock explanation needed and How to Cope with Deadlocks for more information. In this case, it is likely sufficient to "just try again".

How can I troubleshoot MySQL Lock Timeout Errors with Rails?

All of a sudden (without any changes to related code) we are getting lock errors through active record such as:
ActiveRecord::StatementInvalid: Mysql2::Error: Lock wait timeout exceeded;
try restarting transaction: UPDATE `items` SET `state` = 'reserved', `updated_at` = '2012-09-15 17:58:21' WHERE `items`.`id` = 248220
and
ActiveRecord::StatementInvalid: Mysql2::Error: Lock wait timeout exceeded;
try restarting transaction: DELETE FROM `sessions` WHERE `sessions`.`id` = 41997883
We aren't doing our own transactions in either of these models, so the only transactions are the built in rails ones. There has not been a surge in traffic or request volume.
These errors appear to happen when a "new" query tries to run against a locked table and has to wait. How do we see what it's waiting for? How do we figure out which part of our code is issuing queries that lock tables for extended periods of time?
Any ideas on where we can look or how to investigate the cause of this?
Take a look at pt-deadlock-logger; while not directly related to Rails, it should give you a considerable amount of information about the deadlocks occurring.
http://www.percona.com/doc/percona-toolkit/2.1/pt-deadlock-logger.html
There is a nice writeup with some examples:
http://www.mysqlperformanceblog.com/2012/09/19/logging-deadlocks-errors/
The tool is very simple and useful. It monitors the output of SHOW ENGINE INNODB STATUS and logs new deadlocks to a file or a table that we can later review. Let's see how it works with an example.
The article goes on to explain that this can log information about the deadlock such as queries involved, which hosts, thread ids, etc.
I've also found it helpful to prefix queries with comments to allow tracking, such as the file or module, the function, or even which user triggered them. Query comments usually get passed all the way down to diagnostic tools like this, and can help track down which parts of the code, and under which circumstances, are causing deadlocks.
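For example, a hypothetical tagged query based on the UPDATE from the question; inline comments like this generally survive into the slow log, SHOW ENGINE INNODB STATUS output, and pt-deadlock-logger:
/* checkout#reserve_item user:42 */ UPDATE `items` SET `state` = 'reserved' WHERE `items`.`id` = 248220;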