MySQL: How to lock tables and start a transaction? - mysql

TL;DR - MySQL doesn't let you lock a table and use a transaction at the same time. Is there any way around this?
I have a MySQL table I am using to cache some data from a (slow) external system. The data is used to display web pages (written in PHP.) Every once in a while, when the cached data is deemed too old, one of the web connections should trigger an update of the cached data.
There are three issues I have to deal with:
Other clients will try to read the cache data while I am updating it
Multiple clients may decide the cache data is too old and try to update it at the same time
The PHP instance doing the work may be terminated unexpectedly at any time, and the data should not be corrupted
I can solve the first and last issues by using a transaction, so clients will be able to read the old data until the transaction is committed, when they will immediately see the new data. Any problems will simply cause the transaction to be rolled back.
I can solve the second problem by locking the tables, so that only one process gets a chance to perform the update. By the time any other processes get the lock they will realise they have been beaten to the punch and don't need to update anything.
This means I need to both lock the table and start a transaction. According to the MySQL manual, this is not possible. Starting a transaction releases the locks, and locking a table commits any active transaction.
Is there a way around this, or is there another way entirely to achieve my goal?

This means I need to both lock the table and start a transaction
This is how you can do it:
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
For more info, see the MySQL manual page on LOCK TABLES and transaction interaction.

If it were me, I'd use the advisory locking function within MySQL to implement a mutex for updating the cache, and a transaction for read isolation. e.g.
begin_transaction(); // although reading a single row doesn't really require this
$cached=runquery("SELECT * FROM cache WHERE key=$id");
end_transaction();
if (is_expired($cached)) {
    $cached = refresh_data($cached, $id);
}
...
function refresh_data($cached, $id)
{
    $lockname = some_deterministic_transform($id);
    if (1 == runquery("SELECT GET_LOCK('$lockname', 0)")) {
        $cached = fetch_source_data($id);
        begin_transaction();
        write_data($cached, $id);
        end_transaction();
        runquery("SELECT RELEASE_LOCK('$lockname')");
    }
    return $cached;
}
(BTW: bad things may happen if you try this with persistent connections)
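Trying the GET_LOCK() pattern requires a live MySQL connection, but the non-blocking mutex semantics can be sketched without one. Below is a rough Python stand-in (the `get_lock`/`release_lock` helpers are illustrative substitutes, not MySQL APIs): only the caller that wins the mutex refreshes the cache, and everyone else keeps serving the stale value.

```python
import threading

# Hypothetical stand-in for MySQL's GET_LOCK(name, 0): a non-blocking
# acquire that returns 1 on success and 0 if someone else holds the lock.
_locks = {}
_registry_guard = threading.Lock()

def get_lock(name):
    with _registry_guard:
        lock = _locks.setdefault(name, threading.Lock())
    return 1 if lock.acquire(blocking=False) else 0

def release_lock(name):
    _locks[name].release()

def refresh_data(cached, key, fetch, write):
    """Mirror of the PHP refresh_data(): only the caller that wins the
    mutex hits the slow source; losers keep serving the stale value."""
    lockname = f"cache-refresh-{key}"
    if get_lock(lockname) == 1:
        try:
            cached = fetch(key)   # slow external system
            write(key, cached)    # would be wrapped in a DB transaction
        finally:
            release_lock(lockname)
    return cached
```

The key design point is the `0` timeout: a loser does not wait for the winner, it just returns the old data immediately, which is exactly the "beaten to the punch" behavior the question asks for.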

I'd suggest solving the issue by removing the contention altogether.
Add a timestamp column to your cached data.
When you need to update the cached data:
Just add new cached data to your table using the current timestamp
Remove cached data older than, let's say, 24 hours.
When you need to serve the cached data:
Sort by timestamp (DESC) and return the newest cached data
At any given time your clients will retrieve records which are never deleted by any other process. Moreover, you don't care if a client gets cached data belonging to different writes (i.e. with different timestamps).
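The append-only scheme above can be sketched quickly; this uses SQLite in place of MySQL, and the table and column names (`cache`, `key`, `value`, `created_at`) are made up for illustration. Writers only ever insert new generations, so readers never race with a delete of the row they are reading.

```python
import sqlite3
import time

# Sketch of the append-only cache, using SQLite in place of MySQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cache (key TEXT, value TEXT, created_at REAL)")

def write_cache(key, value, now=None):
    # Writers only ever append; they never touch rows readers may hold.
    db.execute("INSERT INTO cache VALUES (?, ?, ?)",
               (key, value, now if now is not None else time.time()))

def read_cache(key):
    # Readers take the newest row for the key, ignoring older generations.
    row = db.execute(
        "SELECT value FROM cache WHERE key = ? ORDER BY created_at DESC LIMIT 1",
        (key,),
    ).fetchone()
    return row[0] if row else None

def purge_cache(max_age_seconds=24 * 3600):
    # Housekeeping: drop generations older than the retention window.
    db.execute("DELETE FROM cache WHERE created_at < ?",
               (time.time() - max_age_seconds,))
```

In MySQL you would want an index on `(key, created_at)` so the `ORDER BY ... LIMIT 1` read stays cheap.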

The second problem may be solved without involving the database at all. Have a lock file for the cache update procedure so that other clients know that someone is already on it. This may not catch each and every corner case, but is it that big of a deal if two clients update the cache at the same time? After all, they do the update in transactions, so the cache will still be consistent.
You may even implement the lock yourself by storing the last cache update time in a table. When a client wants to update the cache, make it lock that table, check the last update time and then update the field.
I.e., implement your own locking mechanism to prevent multiple clients from updating the cache. Transactions will take care of the rest.

Related

Alternative to SKIP LOCKED in MariaDB

Is there any good and performant alternative to FOR UPDATE SKIP LOCKED in MariaDB? Or is there any good practice to achieve job queueing in MariaDB?
Instead of using a lock to indicate that a queue record is being processed, use an indexed processing column. Set it to 0 for new records and, in a separate transaction from any processing, select a single not-yet-processing record and update it to 1. Possibly also store the time, the process or thread id, and the server that is processing the record. Have a separate monitoring process detect jobs flagged as processing that did not complete within the expected time.
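The claim-via-update pattern can be sketched like this, using SQLite in place of MariaDB (the `jobs` table and `processing` column names are illustrative). The guard `AND processing = 0` in the UPDATE makes the claim a compare-and-set: if another worker claimed the row first, zero rows are affected and we report no job.

```python
import sqlite3

# Sketch of the claim-don't-lock queue pattern; SQLite stands in for MariaDB.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, "
    "processing INTEGER DEFAULT 0)"
)
db.execute("CREATE INDEX idx_jobs_processing ON jobs (processing)")

def claim_next_job():
    # Pick a candidate row that nobody has claimed yet.
    row = db.execute(
        "SELECT id, payload FROM jobs WHERE processing = 0 ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # Compare-and-set: only succeeds if the row is still unclaimed.
    # In MariaDB this UPDATE runs in its own short transaction,
    # separate from the (possibly long) processing work.
    updated = db.execute(
        "UPDATE jobs SET processing = 1 WHERE id = ? AND processing = 0",
        (row[0],),
    )
    return row if updated.rowcount == 1 else None
```

A worker that gets `None` back from the compare-and-set can simply loop and try the next candidate.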
An alternative that avoids even the temporary lock on a non-primary index needed to select a record is to use a separate, non-database message queue to notify you of new records available in the database queue. (Unless you won't ever care if a unit of work is processed more than once, I would always use a database table in addition to any non-database queue.)
You can also try
DELETE FROM QUEUE_TABLE LIMIT 1 RETURNING *
for dequeue operations. Depending on your needs, it might work OK.
Update 2022-06-14:
MariaDB supports SKIP LOCKED now.

MySQL/MariaDB InnoDB Simultaneous Transactions & Locking Behaviour

As part of the persistence process in one of my models, an md5 check_sum of the entire record is generated and stored with the record. The md5 check_sum is computed over a flattened representation of the entire record, including all EAV attributes etc. This makes preventing absolute duplicates very easy and efficient.
I am not using a unique index on this check_sum for a specific reason: I want this all to be silent, i.e. if a user submits a duplicate then the app just silently ignores it and returns the already existing record. This ensures backwards compatibility with legacy apps and APIs.
I am using Laravel's eloquent. So once a record has been created and before committing the application does the following:
$taxonRecords = TaxonRecord::where('check_sum', $taxonRecord->check_sum)->get();
if ($taxonRecords->count() > 0) {
    DB::rollBack();
    return $taxonRecords->first();
}
However, recently I encountered a 60,000-to-1 incident (odds based on record counts at that time). A single duplicate ended up in the database with the same check_sum. When I reviewed the logs I noticed that the creation times were identical down to the second. Further investigation of the Apache logs showed a valid POST, but the POST was duplicated. I presume the user's browser malfunctioned or something, but both POSTs arrived simultaneously, resulting in two simultaneous transactions.
My question is how can I ensure that a transaction and its contained SELECT for the previous check_sum is Atomic & Isolated. Based upon my reading the answer lies in https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html and isolation levels.
If transaction A and transaction B arrive at the server at the same time, they should not run side by side; the second should wait for the first to complete.
You created a classic race condition. Both transactions are calculating the checksum while they're both in progress, not yet committed. Neither can read the other's data, since they're uncommitted. So they calculate that they're the only one with the same checksum, and they both go through and commit.
To solve this, you need to run such transactions serially, to be sure that there aren't other concurrent transactions submitting the same data.
You may have to use GET_LOCK() before starting your transaction to calculate the checksum, then RELEASE_LOCK() after you commit. That will make sure other concurrent requests wait for your data to be committed, so they will see it when they try to calculate their checksum.
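The serialization this answer describes can be demonstrated without a MySQL server; here a Python `threading.Lock` plays the role of GET_LOCK()/RELEASE_LOCK() and SQLite stands in for MySQL (the `taxon_records` table name mirrors the question; the helper names are made up). With the mutex around the whole check-then-insert, concurrent submitters can never both see "no existing row".

```python
import hashlib
import sqlite3
import threading

# SQLite stands in for MySQL; a process-local mutex stands in for GET_LOCK().
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute(
    "CREATE TABLE taxon_records (id INTEGER PRIMARY KEY, data TEXT, check_sum TEXT)"
)
checksum_mutex = threading.Lock()  # plays the role of GET_LOCK('taxon')

def save_if_new(data):
    check_sum = hashlib.md5(data.encode()).hexdigest()
    with checksum_mutex:  # nobody else can check-then-insert concurrently
        row = db.execute(
            "SELECT id FROM taxon_records WHERE check_sum = ?", (check_sum,)
        ).fetchone()
        if row:
            return row[0]          # silently return the existing record
        cur = db.execute(
            "INSERT INTO taxon_records (data, check_sum) VALUES (?, ?)",
            (data, check_sum),
        )
        db.commit()
        return cur.lastrowid
```

Without the mutex, two simultaneous callers could both pass the SELECT before either commits, which is exactly the race the question ran into.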

MariaDB. Use Transaction Rollback without locking tables

On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I noticed that this locks the table against other INSERTs, and I'm not talking about while the query is running (that's obvious), but for as long as the transaction remains open.
Then, DELETE is only blocked if the rows share a common index key (comments for the same page), but luckily UPDATE is not blocked.
Can I use a transaction that does not lock the table against new inserts (while the transaction is ongoing, not during the actual query), or is there any other method that lets me conveniently "undo" any query done after some point?
PS:
I start the transaction with PHP's mysqli_begin_transaction() without any of the flags, and then mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is keep the transaction time as short as possible. For example, you can prepare all statements before you start the transaction, but that is a matter of milliseconds. If you transfer files and it can take 40 seconds, then you shouldn't do that while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates the operation is not complete. You can also save them in a different folder, but on the same partition. Then, when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean up and remove unrenamed files.
All write operations work in similar ways: they lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also take locks.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?

MySQL (MyISAM): "waiting for lock" REPLACE blocking all SELECTs

I have a MyISAM table (I can't change it to use InnoDB, so please don't suggest that) which is pretty big (~20GB).
I have a worker which regularly dumps this table (I launch it with the --skip-lock-tables option).
During the dump (which takes ~5 min), concurrent SELECTs run correctly, as I would expect. When I do a REPLACE during the dump, this REPLACE is "waiting for metadata lock", which seems normal too.
But every SELECT started after the start of the REPLACE will also be "waiting for metadata lock". I can't understand why. Could you help me with this, and tell me how I can have all the SELECTs run correctly (even after this REPLACE)?
Thanks!
What is happening is:
Your worker is making a big SELECT. The SELECT locks the table with a read lock. By the way, --skip-lock-tables only means that you are not locking all the tables at once; the SELECT query still locks each table individually. More info in this answer.
Your REPLACE is trying to INSERT but has to wait for the first SELECT (the dump) to finish in order to acquire a write lock. It is put in the write lock queue.
Every SELECT after the REPLACE is put in the read lock queue.
This is a behavior described in the doc on table-level locking:
Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not “starved” even when there is heavy SELECT activity for the table.
If you want the SELECT to not wait for the REPLACE you could (never actually tested that) try the LOW_PRIORITY modifier on your replace.
If you use the LOW_PRIORITY modifier, execution of the INSERT is delayed until no other clients are reading from the table. This includes other clients that began reading while existing clients are reading, and while the INSERT LOW_PRIORITY statement is waiting. It is possible, therefore, for a client that issues an INSERT LOW_PRIORITY statement to wait for a very long time (or even forever) in a read-heavy environment. (This is in contrast to INSERT DELAYED, which lets the client continue at once.)
However be careful as it might never run if there are always a lot of select.

Insert/ update at the same time in a MySql table?

I have a MySQL database hosted on a webserver which has a set of tables with data in it. I am distributing my front-end application, which is built using HTML5 / JavaScript / CSS3.
Now when multiple users try to make an insert/update into one of the tables at the same time, is it going to create a conflict, or will MySQL handle the locking of the table for me automatically? For example, when one user is using it, will it lock the table for him and let the rest follow in a queue, releasing the lock once he finishes and giving it to the next in the queue? Is this going to happen, or do I need to handle this case in the MySQL database?
EXAMPLE:
When a user wants to make an insert into the database, he calls a PHP file located on a webserver which has an insert command to post data into the database. I am concerned whether, if two or more people make an insert at the same time, all the inserts will go through.
mysqli_query($con,"INSERT INTO cfv_postbusupdate (BusNumber, Direction, StopNames, Status, comments, username, dayofweek, time) VALUES (".trim($busnum).", '".trim($direction3)."', '".trim($stopname3)."', '".$status."', '".$comments."', '".$username."', '".trim($dayofweek3)."', '".trim($btime3)."' )");
MySQL handles table locking automatically.
Note that with MyISAM engine, the entire table gets locked, and statements will block ("queue up") waiting for a lock to be released.
The InnoDB engine provides more concurrency, and can do row level locking, rather than locking the entire table.
There may be some cases where you want to take locks on multiple MyISAM tables, if you want to maintain referential integrity, for example, and you want to disallow other sessions from making changes to any of the tables while your session does its work. But, this really kills concurrency; this should be more of an "admin" type function, not really something a concurrent application should be doing.
If you are making use of transactions (InnoDB), the issue your application needs to deal with is the sequence in which rows in which tables are locked. It's possible for an application to experience "deadlock" exceptions, when MySQL detects that there are two (or more) transactions that can't proceed because each needs to obtain locks held by the other. The only thing MySQL can do is detect that, and its only recovery is to choose one of the transactions as the victim: that's the transaction that gets the "deadlock" exception, killed by MySQL so that at least one of the transactions can proceed.
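The practical upshot of the deadlock-victim behavior is that applications should be prepared to retry a transaction that gets killed. A generic sketch follows; the `Deadlock` exception class is made up for illustration, and real code would catch the driver's specific error (MySQL reports deadlocks as error 1213).

```python
import random
import time

class Deadlock(Exception):
    """Stand-in for the driver's deadlock error (MySQL error 1213)."""

def run_with_retry(txn, attempts=3):
    """Run a transaction callable, retrying if it is picked as the
    deadlock victim. Backoff is jittered so two retrying clients
    don't immediately collide again."""
    for attempt in range(attempts):
        try:
            return txn()
        except Deadlock:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(random.uniform(0, 0.01 * (2 ** attempt)))
```

The retried callable must be safe to re-run from the start, which is exactly what a rolled-back transaction gives you: the victim's changes were undone, so replaying the whole transaction is correct.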