MySQL with InnoDB and a SERIALIZABLE transaction does not (always) lock rows

I have a transaction with a SELECT and possible INSERT. For concurrency reasons, I added FOR UPDATE to the SELECT. To prevent phantom rows, I'm using the SERIALIZABLE transaction isolation level. This all works fine when there are any rows in the table, but not if the table is empty. When the table is empty, the SELECT FOR UPDATE does not do any (exclusive) locking and a concurrent thread/process can issue the same SELECT FOR UPDATE without being locked.
CREATE TABLE t (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  display_order INT
) ENGINE = InnoDB;

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT COALESCE(MAX(display_order), 0) + 1 FROM t FOR UPDATE;
..
This concept works as expected with SQL Server, but not with MySQL. Any ideas on what I'm doing wrong?
EDIT
Adding an index on display_order does not change the behavior.

There's something interesting here: both transactions are ready to acquire the real lock. As soon as one of them tries to perform an insert, the lock takes effect. If both transactions try it, one will get a deadlock and roll back. If only one of them tries, it will get a lock wait timeout.
If you detect the lock wait timeout you can roll back, and this will allow the other transaction to perform its insert.
So I think you're likely to get a deadlock exception or a timeout exception quite fast, and this should save the situation. But in terms of a perfect 'serializable' world, this is effectively a bad side effect of the empty table. The engine cannot be perfect in all cases; at least no double inserts from two transactions can occur.
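For illustration, here is a sketch of how that race plays out with two sessions on the empty table t from the question (the lock explanations reflect InnoDB's gap-locking behavior, where gap locks held by different transactions are compatible with each other):
-- Session A:
START TRANSACTION;
SELECT COALESCE(MAX(display_order), 0) + 1 FROM t FOR UPDATE;  -- returns 1, takes only a gap lock
-- Session B (concurrently):
START TRANSACTION;
SELECT COALESCE(MAX(display_order), 0) + 1 FROM t FOR UPDATE;  -- also returns 1, does not block
-- Session A:
INSERT INTO t (display_order) VALUES (1);  -- blocks: B holds the gap lock too
-- Session B:
INSERT INTO t (display_order) VALUES (1);  -- deadlock detected; one session is rolled back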
I saw yesterday an interesting case of true serializability vs. engine serializability in the PostgreSQL documentation; check this example, it's a fun one: http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
Update:
Other interesting resource: Does MySQL/InnoDB implement true serializable isolation?

This is probably not a bug.
The way that the different databases implement specific transaction isolation levels is NOT 100% consistent, and there are a lot of edge-cases to consider which behave differently. InnoDB was meant to emulate Oracle, but even there, I believe there are cases where it works differently.
If your application relies on very subtle locking behaviour in specific transaction isolation modes, it is probably broken:
Even if it "works" right now, it might not if somebody changes the database schema
It is unlikely that engineers maintaining your code will understand how it's using the database if it depends upon subtleties of locking

Did you have a look at this document:
http://dev.mysql.com/doc/refman/5.1/en/innodb-locking-reads.html
If you ask me, MySQL wasn't built to be used in that way...
My recommendation: if you can afford it, lock the whole table.
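A minimal sketch of that whole-table approach, reusing the table t from the question (SET autocommit = 0 before LOCK TABLES, then COMMIT before UNLOCK TABLES, is the documented way to combine table locks with an InnoDB transaction; the inserted value is illustrative):
SET autocommit = 0;
LOCK TABLES t WRITE;
SELECT COALESCE(MAX(display_order), 0) + 1 FROM t;  -- no FOR UPDATE needed: the table lock serializes everything
INSERT INTO t (display_order) VALUES (1);           -- use the value computed above
COMMIT;
UNLOCK TABLES;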

Related

InnoDB deadlock with lock modes S and X

In my application, I have two queries that occur from time to time (from different processes), that cause a deadlock.
Query #1
UPDATE tblA, tblB SET tblA.varcharfield=tblB.varcharfield WHERE tblA.varcharfield IS NULL AND [a few other conditions];
Query #2
INSERT INTO tmp_tbl SELECT * FROM tblA WHERE [various conditions];
Both of these queries take a significant time, as these tables have millions of rows. When query #2 is running, it seems that tblA is locked in mode S. It seems that query #1 requires an X lock. Since this is incompatible with an S lock, query #1 waits for up to 30 seconds, at which point I get a deadlock:
Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
Based on what I've read in the documentation, I think I have a couple options:
Set an index on tblA.varcharfield. Unfortunately, I think that this would require a very large index to store the field of varchar(512). (See edit below... this didn't work.)
Disable locking with SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;. I don't understand the implications of this and am worried about corrupt data. I don't use explicit transactions in my application currently, but I might at some point in the future.
Split my time-consuming queries into small pieces so that they can queue and run in MySQL without reaching the 30-second timeout. This wouldn't really fix the heart of the issue, and I am concerned that when my database servers get busy that the problem will occur again.
Simply retrying queries over and over again... not an option I am hoping for.
How should I proceed? Are there alternate methods I should consider?
EDIT: I have tried setting an index on varcharfield, but the table is still locking. I suspect that the locking happens when the UPDATE portion is actually executing. Are there other suggestions to get around this problem?
A. If we assume that indexing varcharfield takes a lot of disk space and that adding a new column will not hit you hard, I can suggest the following approach:
create a new field with datatype TINYINT;
index it;
this field will store 0 if varcharfield is NULL and 1 otherwise;
rewrite the first query to do the update relying on the new field. In this case it will not cause entire-table locking.
Hope it helps.
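A hedged sketch of that flag-column approach (the column and index names are illustrative, and the bracketed conditions are kept from the original query):
ALTER TABLE tblA ADD COLUMN varchar_is_set TINYINT NOT NULL DEFAULT 0;
UPDATE tblA SET varchar_is_set = 1 WHERE varcharfield IS NOT NULL;
CREATE INDEX idx_varchar_is_set ON tblA (varchar_is_set);
-- Query #1 can then filter on the small indexed flag instead of the VARCHAR:
UPDATE tblA, tblB
SET tblA.varcharfield = tblB.varcharfield, tblA.varchar_is_set = 1
WHERE tblA.varchar_is_set = 0 AND [a few other conditions];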
You can index just a prefix of the varchar column; it will still work and will require less space. Just specify the index size:
CREATE INDEX someindex ON sometable (varcharcolumn(32))
I was able to solve the issue by adding explicit LOCK TABLES statements around both queries. This turned out to be a better solution, since each query affects so many records and both are background processes. They now wait on each other.
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html
While this is an okay solution for me, it obviously isn't the answer for everyone. A WRITE lock means that nobody else can read the table; only a READ lock will allow others to keep reading.
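A sketch of what that could look like for the two queries above (which table gets a READ vs. WRITE lock is my assumption, not stated in the answer; the bracketed conditions are kept from the original):
-- Around query #1:
LOCK TABLES tblA WRITE, tblB READ;
UPDATE tblA, tblB SET tblA.varcharfield = tblB.varcharfield
WHERE tblA.varcharfield IS NULL AND [a few other conditions];
UNLOCK TABLES;
-- Around query #2:
LOCK TABLES tmp_tbl WRITE, tblA READ;
INSERT INTO tmp_tbl SELECT * FROM tblA WHERE [various conditions];
UNLOCK TABLES;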

Locking a table within a transaction

I would like to be able to lock an entire table to prevent any INSERTs or UPDATEs in it between the "beginTransaction" and the ending "commit" or "rollback".
I know that beginning a transaction results in an implicit UNLOCK TABLES, and that LOCK TABLES results in an implicit COMMIT... so is there any way to do what I want?
Why? Perhaps you have missed the point of transactions.
If you use repeatable-read transaction isolation, inserts, updates, etc. can happen during your transaction, BUT YOU WILL NOT SEE THEM. So as far as your process is concerned, the table is locked for inserts/updates. Except they are still happening, they're still durable to disk, and other processes can continue.
After you do your first "select", a snapshot is created, and you are effectively reading that snapshot, not the latest version. If this is what you want, repeatable-read works well for you.
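A minimal sketch of that snapshot behavior (the table name t is illustrative):
-- Session A:
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT COUNT(*) FROM t;  -- the first read creates the snapshot
-- Session B inserts a row and commits; it is not blocked.
SELECT COUNT(*) FROM t;  -- still returns the original count: we read the snapshot
COMMIT;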
SELECT COUNT(*) FROM table within a transaction locks the table on MS SQL Server 2000.
If you're using PHP, then while a transaction is going on you can set a session variable to tell the script not to do anything with the database, i.e. $_SESSION['on_going_transaction'] = true.
When the transaction is completed, just destroy the session variable so that another transaction can occur. This is much easier.

Explain inexplicable deadlock

First of all, I don't see how I could be getting any deadlock at all: I am using no explicit locking, there's only one table involved, there's a separate process each to insert, select, and update rows, only one row is inserted or updated at a time, and each process runs only rarely (perhaps once a minute).
It's an email queue:
CREATE TABLE `emails_queue` (
  `id` varchar(40) NOT NULL,
  `email_address` varchar(128) DEFAULT NULL,
  `body` text,
  `status_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `status` enum('pending','inprocess','sent','discarded','failed') DEFAULT NULL,
  KEY `status` (`status`),
  KEY `status_time` (`status`,`status_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The generating process, in response to some user action but roughly every 90 seconds, does an insert to the table, setting the status to "pending".
There's a monitoring process that every minute checks that the number of "pending" and "failed" emails is not excessive. It takes less than a second to run and has never given me any trouble.
Every minute, the sending process grabs all the pending emails. It loops through and one email at a time, sets its status to "inprocess", tries to send it, and finally sets its status accordingly to "sent", "discarded" (it has reasons for deciding an email shouldn't go out), or "failed" (rejected by the SMTP system).
The statement for setting the status is unusual.
UPDATE emails_queue SET status=?, status_time=NOW() WHERE id=? AND status = ?
That is, I only update the status if the current status is already what I believe it to be. Before this mechanism, I accidentally kicked off two sending processes, and they would each try to send the same email. Now, if that were to happen, one process would successfully move the email from "pending" to "inprocess", but the second one would update zero rows, realize there's a problem, and skip that email.
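For illustration, the same compare-and-set with the placeholders filled in (the id value is made up; ROW_COUNT() reports whether this process won the race):
UPDATE emails_queue SET status = 'inprocess', status_time = NOW()
WHERE id = 'abc123' AND status = 'pending';
SELECT ROW_COUNT();  -- 1: we claimed the email; 0: another process already changed it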
The problem is, about one time in 100, the update fails altogether! I get com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
WTH?
This is the only table and only query that this happens to and it only happens in production (to maximize difficulty in investigating it).
The only two things that seem at all unusual are (1) updating a column that participates in the WHERE clause, and (2) the (unused) automatic updating of the status_time.
I'm looking for any suggestions or diagnostic techniques.
Firstly, deadlocks do not depend on explicit locking. MySQL's LOCK TABLE or using non-default transaction isolation modes are NOT required to have a deadlock. You can still have deadlocks if you never use an explicit transaction.
Deadlocks can happen on a single table, quite easily. Most commonly it's from a single hot table.
Deadlocks can even happen if all your transactions just do a single row insert.
A deadlock can happen if you have
More than one connection to the database (obviously)
Any operation that internally involves more than one lock.
What is not obvious, is that most of the time, a single row insert or update involves more than one lock. The reason for this is that secondary indexes also need to be locked during inserts / updates.
SELECTs won't lock (assuming you're using the default isolation mode, and aren't using FOR UPDATE) so they can't be the cause.
SHOW ENGINE INNODB STATUS is your friend. It will give you a bunch of (admittedly very confusing) information about deadlocks, specifically, the most recent one.
You can't completely eliminate deadlocks; they will continue to happen in production (even on test systems, if you stress them properly).
Aim for a very low amount of deadlocks. If 1% of your transactions deadlock, that is possibly too many.
Consider changing the transaction isolation level of your transactions to read-committed IF YOU FULLY UNDERSTAND THE IMPLICATIONS
Ensure that your software handles deadlocks appropriately (a sketch follows below).
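Since the exception in the question comes from JDBC, the natural place to retry is the application code, but as a self-contained illustration here is a hedged sketch of a retry loop as a MySQL stored procedure (the procedure name and retry count are my own; SQLSTATE '40001' is the state behind deadlock error 1213):
DELIMITER //
CREATE PROCEDURE claim_email(IN p_id VARCHAR(40))
BEGIN
  DECLARE retries INT DEFAULT 0;
  DECLARE done INT DEFAULT 0;
  WHILE done = 0 AND retries < 3 DO
    BEGIN
      -- On deadlock, count the attempt and exit this inner block;
      -- the WHILE loop then retries the statement.
      DECLARE EXIT HANDLER FOR SQLSTATE '40001'
        SET retries = retries + 1;
      UPDATE emails_queue SET status = 'inprocess', status_time = NOW()
      WHERE id = p_id AND status = 'pending';
      SET done = 1;
    END;
  END WHILE;
END //
DELIMITER ;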
With some database servers there are default settings for locking behavior. The usual default is to use locks (at least on the systems I used). I'm not sure this is true for MySQL, but I believe it is.
Do you have an index on the emails_queue table? The type of index can change how the server does locking. In one case I dealt with, not having a clustered index on the table caused it to use page locking instead of row locking. I had explicitly told it to use row locking, and it silently changed it. Page locking can cause deadlocks. Try checking that index.
If that doesn't help, the solution is the one suggested in the error message: catch the exception for deadlocks and re-run the SQL when it happens.
You have not described the scope of the transactions in your description. If each process that you have described is trying to do everything within a single transaction, then there certainly is the potential for deadlock in this system.
While it may seem like a deadlock should not occur because only a single table is involved, the resources being locked are not tables but rows. Two processes may each be holding a row lock that is required by the other process, if the same transaction is used to manipulate multiple rows.

Locking mySQL tables/rows

Can someone explain the need to lock tables and/or rows in MySQL?
I am assuming that it is to prevent multiple writes to the same field; is this best practice?
First, let's look at a good document. It is not MySQL documentation; it's about PostgreSQL, but it's one of the simplest and clearest docs I've read on transactions. You'll understand MySQL transactions better after reading this link: http://www.postgresql.org/docs/8.4/static/mvcc.html
When you're running a transaction, 4 rules apply (ACID):
Atomicity: all or nothing (rollback)
Consistency: coherent before, coherent after
Isolation: not impacted by other concurrent transactions
Durability: once committed, it's really done
Of these rules, only one is problematic: Isolation. Using a transaction does not ensure a perfect isolation level. The previous link explains better what phantom reads and similar isolation problems between concurrent transactions are. To make it simple: you should really use row-level locks to prevent other transactions, running at the same time as yours (and maybe committing before yours), from altering the same records. But with locks come deadlocks...
Then, when you try to use transactions with locks, you'll need to handle deadlocks, and you'll need to handle the fact that a transaction can fail and should be relaunched (simple for or while loops).
Edit:
Recent versions of InnoDB provide greater levels of isolation than previous ones. I've done some tests, and I must admit that even the phantom reads that should happen are now difficult to reproduce.
MySQL is by default on level 3 of the 4 isolation levels explained in the PostgreSQL document (whereas PostgreSQL is on level 2 by default). This is REPEATABLE READ. That means you won't have dirty reads and you won't have non-repeatable reads. So someone modifying a row on which you made your SELECT inside your transaction will hit an implicit lock (as if you had performed a SELECT ... FOR UPDATE).
Warning: if you work with an older version of MySQL, like 5.0, you may be on level 2, and you'll need to take the row lock explicitly with FOR UPDATE!
We can always find some nice race conditions; when working with aggregate queries it can be safer to be on the 4th isolation level (by using LOCK IN SHARE MODE at the end of your query) if you do not want people adding rows while you're performing some task. I've been able to reproduce one serializable-level problem, but I won't explain the complex example here: really tricky race conditions.
There is a very nice example of race conditions that even the serializable level cannot fix here: http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
When working with transactions, the most important things are:
data used in your transaction must always be read INSIDE the transaction (re-read it if you had the data from before the BEGIN);
understand why a high isolation level sets implicit locks and may block some other queries (and make them time out);
try to avoid deadlocks (lock tables in the same order), but handle them (retry a transaction aborted by MySQL);
try to freeze important source tables with the serializable isolation level (LOCK IN SHARE MODE) when your application code assumes that no insert or update should modify the dataset it's using (otherwise you will not get errors, but your results will have ignored the concurrent changes); see the sketch below.
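A minimal sketch of that last point, reusing the table t from the first question (the aggregate itself is illustrative):
START TRANSACTION;
-- Shared locks stop concurrent transactions from changing the rows we aggregate:
SELECT SUM(display_order) FROM t LOCK IN SHARE MODE;
-- ... application work that assumes the dataset does not change ...
COMMIT;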
It is not a best practice. Modern versions of MySQL support transactions with well defined semantics. Use transactions, and forget about locking stuff by hand.
The only new thing you'll have to deal with is that transaction commits may fail because of race conditions, but you'd be doing error checking with locks anyway, and it is easier to retry the logic that led to a transaction failure than to recover from errors in a non-transactional setup.
If you do get race conditions and failed commits, then you may want to fine-tune the isolation configuration for your transactions.
For example, suppose you need to generate invoice numbers which are sequential and have no numbers missing; this is a requirement at least in the country I live in.
If you have a few web servers, then a few users might be buying stuff literally at the same time.
If you do SELECT MAX(invoice_id) + 1 FROM invoice to get the new invoice number, two web servers might do that at the same time (before the new invoice has been added) and get the same invoice number for the invoices they're trying to create.
If you use a mechanism such as AUTO_INCREMENT, it is only meant to generate unique values and makes no guarantees about not skipping numbers (if one transaction tries to insert a row and then rolls back, the number is "lost").
So the solution is to (a) lock the table, (b) SELECT MAX(invoice_id) + 1 FROM invoice, (c) do the insert, and (d) commit and unlock the table.
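A hedged sketch of steps (a) through (d); the column names other than invoice_id are made up, and the SET autocommit = 0 / COMMIT / UNLOCK TABLES ordering is the documented way to combine LOCK TABLES with an InnoDB transaction:
SET autocommit = 0;
LOCK TABLES invoice WRITE;                                             -- (a)
SET @next := (SELECT COALESCE(MAX(invoice_id), 0) + 1 FROM invoice);   -- (b)
INSERT INTO invoice (invoice_id, total) VALUES (@next, 100.00);        -- (c)
COMMIT;                                                                -- (d) commit...
UNLOCK TABLES;                                                         -- ...then release the lock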
On another note, in MySQL you're best off using InnoDB with row-level locking. A LOCK TABLES command can implicitly commit the transaction you're working on.
Take a look here for general introduction to what transactions are and how to use them.
Databases are designed to work in concurrent environments, so locking the tables and/or records helps to keep the transactions consistent.
So a record affected by one transaction should not be altered until this transaction commits or rolls back.

What does MySQL do if you attempt to update a table that is being queried?

I have a very slow query that I need to run on a MySQL database from time to time.
I've discovered that attempts to update the table that is being queried are blocked until the query has finished.
I guess this makes sense, as otherwise the results of the query might be inconsistent, but it's not ideal for me, as the query is of much lower importance than the update.
So my question really has two parts:
Out of curiosity, what exactly does MySQL do in this situation? Does it lock the table for the duration of the query? Or try to lock it before the update?
Is there a way to make the slow query not blocking? I guess the options might be:
Kill the query when an update is needed.
Run the query on a copy of the table as it was just before the update took place
Just let the query go wrong.
Anyone have any thoughts on this?
It sounds like you are using a MyISAM table, which uses table level locking. In this case, the SELECT will set a shared lock on the table. The UPDATE then will try to request an exclusive lock and block and wait until the SELECT is done. Once it is done, the UPDATE will run like normal.
MyISAM Locking
If you switched to InnoDB, then your SELECT will set no locks by default. There is no need to change transaction isolation levels as others have recommended (repeatable read is default for InnoDB and no locks will be set for your SELECT). The UPDATE will be able to run at the same time. The multi-versioning that InnoDB uses is very similar to how Oracle handles the situation. The only time that SELECTs will set locks is if you are running in the serializable transaction isolation level, you have a FOR UPDATE/LOCK IN SHARE MODE option to the query, or it is part of some sort of write statement (such as INSERT...SELECT) and you are using statement based binary logging.
InnoDB Locking
For the purposes of the select statement, you should probably issue a:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
command on the connection, which causes the subsequent select statements to operate without locking.
Don't use the 'SELECT ... FOR UPDATE', as that definitely locks the table rows that are affected by the select statement.
The full list of MySQL transaction isolation levels is in the docs.
First of all, you need to know which engine you're using (MyISAM or InnoDB).
This is clearly a transaction problem.
Take a look at section 13.4.6, SET TRANSACTION Syntax, in the MySQL manual.
UPDATE LOW_PRIORITY ... may be helpful; the MySQL docs aren't clear whether this would let the user requesting the update continue, with the update happening when it can (which is what I think happens), or whether the user has to wait (which would be worse than at present ...), and I can't remember.
What table types are you using? If you are on MyISAM, switching to InnoDB (if you can; it has no full-text indexing) opens up more options for this sort of thing, as it supports transactional features and row-level locking.
I don't know MySQL, but it sounds like a transaction problem.
You should be able to set the transaction type to dirty read in your select query.
That won't necessarily give you correct results, but it shouldn't be blocked.
Better still would be to make the first query go faster. Do some analysis and check whether you can speed it up with correct indexing and so on.