Does a MySQL InnoDB table wait for write locks even for a query such as SELECT COUNT(*) FROM t?
My situation:
I've got a table with 50,000 rows that receives many updates (a views count in every row). InnoDB should put a write lock on each updated row. But when I run a query that is only a COUNT(*) on this table, MySQL should be able to answer it without waiting for write locks, because no UPDATE will change the number of rows.
Thanks a lot!
No, MySQL doesn't lock InnoDB tables for queries that only read data from them.
That is only the case for old MyISAM tables, where all readers must wait until the writer is done, and vice versa.
For InnoDB tables, MySQL implements multiversion concurrency control.
In MySQL terminology this is called Consistent Nonlocking Reads.
In short: when the reader starts a query, the database takes a snapshot of the data as of the point in time when the query started, and the reader (the query) sees only changes made visible (committed) up to that point in time, not changes made by later transactions. This allows readers to read data without locking and without waiting for writers, while still preserving ACID guarantees.
There are subtle differences depending on the transaction isolation level; you can find a detailed description here: http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html
In short: in READ UNCOMMITTED, READ COMMITTED, and REPEATABLE READ modes, all SELECT statements that only read data (SELECTs without FOR UPDATE or LOCK IN SHARE MODE clauses) are performed in a nonlocking fashion.
In SERIALIZABLE mode all transactions are serialized and, depending on autocommit mode, a SELECT can be blocked when it conflicts with other transactions (when autocommit=true), or is automatically converted to SELECT ... LOCK IN SHARE MODE (when autocommit=false). All details are explained in the link above.
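To see the nonlocking behaviour in action, here is a minimal two-session sketch (table and column names are made up for illustration):

-- Session 1: update a row, but don't commit yet
START TRANSACTION;
UPDATE t SET views = views + 1 WHERE id = 42;

-- Session 2: answers immediately from the snapshot,
-- without waiting for session 1's row lock
SELECT COUNT(*) FROM t;

-- Session 1: finish
COMMIT;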
Related
I have a query such as
SELECT COUNT(*) FROM log WHERE num = ?;
If I set the isolation level to SERIALIZABLE, then a range lock will be acquired for the WHERE clause.
My question is: can other transactions also acquire the range lock in shared mode in order to read the count as above, or is the range lock exclusive, so that all other transactions have to wait until the current transaction commits before executing the read query above?
Background: I am trying to implement a view counter for a heavy-traffic website. To reduce IO to the database, I created a log table, so that every time there is a view I only write a new row to the log table. Once in a while I (randomly) decide whether to clear the log table and add the number of its rows into a column of a view-count table. This means I have to be careful with interleaving transactions.
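For reference, the flush I have in mind would look roughly like this (table and column names are a sketch; the FOR UPDATE assumes num is indexed, so its next-key locks also block concurrent inserts into the counted range):

START TRANSACTION;
SELECT COUNT(*) INTO @n FROM log WHERE num = 1 FOR UPDATE;
UPDATE view_count SET views = views + @n WHERE num = 1;
DELETE FROM log WHERE num = 1;   -- deletes exactly the rows counted above
COMMIT;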
The statements below are relevant only to SQL Server and were made before the OP made clear this was really about MySQL, about which I know nothing. I'm leaving it here since it (and the resulting discussion) might be of some use nevertheless, but it is not a complete, relevant answer to the question.
SELECT statements only ever acquire shared locks, on all isolation levels (unless overridden with a table hint). And shared locks are always compatible with each other (see Lock Compatibility), so there's no problem if other transactions want to acquire shared (range) locks as well. So yes, you can have any number of queries performing SELECT COUNT(*) in parallel and they will never block each other.
This doesn't mean other transactions don't have to wait. In particular, a DELETE query must eventually acquire an exclusive lock, and it will have to wait if the SELECT is holding a shared lock. Normally this is not an issue, since the engine releases locks as soon as possible. When it does become an issue, you'll want to look at solutions like snapshot isolation, which uses optimistic concurrency and conflict detection rather than locking. Under that model, a SELECT will never block any other query (save those that want table locks). Of course, this isn't free; the row versioning it uses takes up disk space and I/O.
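For completeness, on SQL Server the snapshot model mentioned above is switched on like this (database and table names are placeholders):

-- enable once, at the database level
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- then, per session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM log WHERE num = 1;   -- does not block on writers
COMMIT;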
My question is a follow-up to this answer. I want to find out how to perform a SELECT statement without locking a table that uses the MyISAM engine.
The answer states that the following works if you have InnoDB, but not MyISAM. What is the equivalent for the MyISAM engine?
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM TABLE_NAME;
COMMIT;
This is the default behaviour with MyISAM tables. If one actually wants to lock a MyISAM table, one must manually acquire a table-level lock. Transaction isolation levels, START TRANSACTION, COMMIT, and ROLLBACK have no effect on the behaviour of MyISAM tables, since MyISAM does not support transactions.
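For example, to actually hold a lock across several statements on a MyISAM table (t and id are placeholder names):

LOCK TABLES t READ;       -- other sessions can read t, but not write to it
SELECT COUNT(*) FROM t;
SELECT MAX(id) FROM t;    -- guaranteed to see the same data as the COUNT above
UNLOCK TABLES;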
More about internal locking mechanisms
A READ lock is implicitly acquired before, and released after, execution of a SELECT statement. Notice that several concurrent SELECT statements can run at the same time, because several sessions may hold a READ lock on the same table.
Conversely, a WRITE lock is implicitly acquired before executing an INSERT, UPDATE, or DELETE statement. This means that no read (let alone a concurrent write) can take place as long as a write is in progress*.
The above applies to MyISAM, MEMORY, and MERGE tables only.
You might want to read more about this here:
Internal locking methods
Read vs Write locks
* However, these locks are not always required thanks to this clever trick:
The MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no free blocks in the middle of the data file, rows are always inserted at the end of the data file. In this case, you can freely mix concurrent INSERT and SELECT statements for a MyISAM table without locks.
MyISAM does indeed use a read lock during SELECT. An INSERT at the end of the table can get around that.
But try doing an UPDATE, DELETE, or ALTER TABLE while a long-running SELECT is in progress, or vice versa: reading from a table while a change to that table is running. It's first-come, first-served, and the later thread blocks until the first thread is done.
MyISAM doesn't have any support for transactions, so it must work this way. If a SELECT were reading rows from a table while a concurrent thread changed some of those rows, you would get a race condition: the SELECT might read some of the rows from before the change and some from after it, resulting in a completely mixed-up view of the data.
Anything you do with SET TRANSACTION ISOLATION LEVEL has no effect with MyISAM.
For these reasons, it's recommended to use InnoDB instead.
Can someone explain the need to lock tables and/or rows in MySQL?
I am assuming that it is to prevent multiple writes to the same field. Is this the best practice?
First, let's look at a good document. It is not MySQL documentation (it's about PostgreSQL), but it's one of the simplest and clearest docs I've read on transactions. You'll understand MySQL transactions better after reading this link: http://www.postgresql.org/docs/8.4/static/mvcc.html
When you're running a transaction, four rules apply (ACID):
Atomicity: all or nothing (rollback)
Consistency: consistent before, consistent after
Isolation: not impacted by other concurrent transactions
Durability: once committed, it's really done
Of these rules, only one is problematic: Isolation. Using a transaction does not ensure a perfect isolation level. The link above explains better what phantom reads and other isolation problems between concurrent transactions are. But to make it simple: you should really use row-level locks to prevent other transactions, running at the same time as yours (and maybe committing before yours), from altering the same records. But with locks come deadlocks...
So when you try using nice transactions with locks, you'll need to handle deadlocks, and you'll need to handle the fact that a transaction can fail and should be relaunched (simple for or while loops; see the sketch below).
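As a sketch of such a retry loop, here is a stored procedure, with made-up names and a made-up transaction body, that retries up to three times when MySQL aborts the transaction with a deadlock (reported as SQLSTATE '40001'):

DELIMITER //
CREATE PROCEDURE bump_counter_with_retry()
BEGIN
  DECLARE attempts INT DEFAULT 0;
  DECLARE done BOOLEAN DEFAULT FALSE;
  WHILE NOT done AND attempts < 3 DO
    BEGIN
      -- on deadlock, roll back, count the attempt, and let the loop retry
      DECLARE EXIT HANDLER FOR SQLSTATE '40001'
      BEGIN
        ROLLBACK;
        SET attempts = attempts + 1;
      END;
      START TRANSACTION;
      UPDATE counters SET views = views + 1 WHERE id = 1;  -- the real work goes here
      COMMIT;
      SET done = TRUE;
    END;
  END WHILE;
END //
DELIMITER ;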
Edit:
Recent versions of InnoDB provide greater levels of isolation than previous ones. I've done some tests, and I must admit that even the phantom reads that should happen are now difficult to reproduce.
MySQL is by default at level 3 of the 4 isolation levels explained in the PostgreSQL document (where PostgreSQL is at level 2 by default). This is REPEATABLE READ. That means you won't have dirty reads and you won't have non-repeatable reads. So someone modifying a row on which you made your SELECT inside your transaction will hit an implicit LOCK (as if you had performed a SELECT ... FOR UPDATE).
Warning: if you work with an older version of MySQL, such as 5.0, you may be at level 2, and you'll need to perform the row lock using the FOR UPDATE keywords!
You can always find some nice race conditions; when working with aggregate queries, it can be safer to be at the 4th isolation level (by using LOCK IN SHARE MODE at the end of your query) if you do not want people adding rows while you're performing some task. I've been able to reproduce one serializable-level problem, but I won't explain the complex example here; it involves really tricky race conditions.
There is a very nice example of race conditions that even serializable level cannot fix here : http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
When working with transactions, the most important things are:
data used in your transaction must always be read INSIDE the transaction (re-read it if you fetched it before the BEGIN)
understand why a high isolation level sets implicit locks and may block some other queries (and make them time out)
try to avoid deadlocks (try to lock tables in the same order), but handle them (retry a transaction aborted by MySQL)
try to freeze important source tables with a serializable-style read (LOCK IN SHARE MODE) when your application code assumes that no insert or update will modify the dataset it's using (otherwise you won't get errors, but your results will have ignored the concurrent changes); see the sketch after this list
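As a sketch of that last point (table and column names are made up), freezing the rows an aggregate depends on could look like this:

START TRANSACTION;
-- shared locks: other sessions may still read these rows, but cannot
-- modify them (or insert into the locked range) until we commit
SELECT COUNT(*) FROM log WHERE num = 1 LOCK IN SHARE MODE;
-- ... act on the count, e.g. update a summary table ...
COMMIT;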
It is not a best practice. Modern versions of MySQL support transactions with well-defined semantics. Use transactions, and forget about locking stuff by hand.
The only new thing you'll have to deal with is that transaction commits may fail because of race conditions, but you'd be doing error checking with locks anyway, and it is easier to retry the logic that led to a transaction failure than to recover from errors in a non-transactional setup.
If you do get race conditions and failed commits, then you may want to fine-tune the isolation configuration for your transactions.
For example, suppose you need to generate invoice numbers which are sequential, with no numbers missing - this is a requirement at least in the country I live in.
If you have a few web servers, then a few users might be buying stuff literally at the same time.
If you do SELECT MAX(invoice_id)+1 FROM invoice to get the new invoice number, two web servers might do that at the same time (before the new invoice has been added) and get the same invoice number for the invoices they're trying to create.
If you use a mechanism such as AUTO_INCREMENT, it is only meant to generate unique values, and makes no guarantee about not missing numbers (if one transaction tries to insert a row and then rolls back, the number is "lost").
So the solution is to (a) lock the table, (b) SELECT MAX(invoice_id)+1 FROM invoice, (c) do the insert, (d) commit and unlock the table.
On another note, in MySQL you're best off using InnoDB and row-level locking. Running a LOCK TABLES command can implicitly commit the transaction you're working on; a sketch of the row-locking approach follows.
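A hedged sketch of that row-locking version, keeping the invoice table from the example above (the COALESCE for the empty-table case and the amount column are assumptions):

START TRANSACTION;
-- FOR UPDATE makes concurrent sessions queue up on this read,
-- so no two transactions can compute the same next number
SELECT COALESCE(MAX(invoice_id), 0) + 1 INTO @next FROM invoice FOR UPDATE;
INSERT INTO invoice (invoice_id, amount) VALUES (@next, 100.00);
COMMIT;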
Take a look here for general introduction to what transactions are and how to use them.
Databases are designed to work in concurrent environments, so locking the tables and/or records helps to keep the transactions consistent.
So a record affected by one transaction should not be altered until this transaction commits or rolls back.
So after reading Performance in PDO / PHP / MySQL: transaction versus direct execution with regard to the performance issues I was thinking about, I did some research on locking tables in MySQL.
On http://dev.mysql.com/doc/refman/5.0/en/table-locking.html
Table locking enables many sessions to read from a table at the same time, but if a session wants to write to a table, it must first get exclusive access. During the update, all other sessions that want to access this particular table must wait until the update is done.
This part struck me particularly, because most of our queries will be updates rather than inserts. I was wondering: if one created a table called foo on which all updates/inserts were carried out, and then a view called foo_view (a copy of foo, or perhaps foo joined with several other tables) on which all selects occurred, would this locking issue still occur?
That is, would SELECT queries on foo_view still have to wait for an update to finish on foo?
Another brief question my colleague asked: does this affect caching? That is, if the SELECT is cached, will it hit the cache and return results, or will it wait for the lock to be released first?
Your view will experience the same locking as the underlying tables.
From the MySQL Reference page on locking:
MySQL grants table write locks as follows:
If there are no locks on the table, put a write lock on it.
Otherwise, put the lock request in the write lock queue.
MySQL grants table read locks as follows:
If there are no write locks on the table, put a read lock on it.
Otherwise, put the lock request in the read lock queue.
It's worth mentioning that this depends on the database engine you are using. MyISAM will follow the steps above and lock the entire table (even if it is split into multiple partitions), whereas an engine like InnoDB will do row-level locking instead.
If you're not reaching the necessary performance benchmarks with MyISAM, and you have shown that your bottleneck is waiting on table locks from updates, I would suggest changing the storage engine of your table to InnoDB.
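Changing the engine is a single statement; note that it rebuilds the table, so it can take a while on a large table (foo is the table from the question):

ALTER TABLE foo ENGINE=InnoDB;
SHOW TABLE STATUS LIKE 'foo';  -- verify the Engine column now says InnoDB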
In MySQL:
Every minute I empty the table and fill it with new data. I don't want users to read the data during the fill process; before or after is OK.
How do I achieve this?
Are transactions the way?
Assuming you use a transactional engine (usually InnoDB), clear and refill the table in the same transaction.
Make sure your readers use the READ COMMITTED or higher transaction isolation level (the default, REPEATABLE READ, is higher).
That way readers will continue to see the old contents of the table during the update.
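A minimal sketch of that, with made-up table names, using DELETE rather than TRUNCATE because TRUNCATE is not transaction-safe and would implicitly commit:

START TRANSACTION;
DELETE FROM t;                       -- clear the old contents
INSERT INTO t (id, payload)
  SELECT id, payload FROM staging;   -- whatever produces the new data
COMMIT;                              -- readers switch to the new data here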
There are a few things to be careful of:
If the table is so big that it exhausts the rollback area - this is possible if you replace the whole of (say) a 1M-row table. Of course this is tunable, but there are limits.
If the transaction fails partway through and gets rolled back - rolling back big transactions is VERY inefficient in InnoDB (it is optimised for commits, not rollbacks).
Be careful of deadlocks and lock wait timeouts, which are more likely if you use big transactions.
You can LOCK your table for the duration of your operation:
http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
A table lock protects only against inappropriate reads or writes by other sessions. The session holding the lock, even a read lock, can perform table-level operations such as DROP TABLE. Truncate operations are not transaction-safe, so an error occurs if the session attempts one during an active transaction or while holding a table lock.
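Applied to the once-a-minute refill described above, with made-up table names, that looks like this (note DELETE rather than TRUNCATE, since, as quoted, a truncate errors out while holding a table lock):

LOCK TABLES t WRITE, staging READ;   -- every table touched must be locked
DELETE FROM t;
INSERT INTO t (id, payload)
  SELECT id, payload FROM staging;
UNLOCK TABLES;                       -- readers were blocked until here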
I don't know enough about the internal row-versioning mechanisms of MySQL (or indeed, whether there is one), but other databases (Oracle, PostgreSQL, and more recently, SQL Server) have invested a lot of effort into allowing writers not to block readers, insofar as readers have access to the version of the rows that existed immediately before the update/write process started. Once the update is committed, that version of the row becomes the one made available to all readers, thereby avoiding the bottleneck that the above behaviour in MySQL will introduce.
This policy ensures that table locking is deadlock free. There are, however, other things you need to be aware of about this policy: If you are using a LOW_PRIORITY WRITE lock for a table, it means only that MySQL waits for this particular lock until there are no other sessions that want a READ lock. When the session has gotten the WRITE lock and is waiting to get the lock for the next table in the lock table list, all other sessions wait for the WRITE lock to be released. If this becomes a serious problem with your application, you should consider converting some of your tables to transaction-safe tables.
You can load your data into a shadow table as slowly as you like, then instantly swap the shadow and actual with RENAME TABLE:
truncate table shadow; # make sure it is clean to start with
insert into shadow .....; # lots of inserts etc against shadow table
rename table active to temp, shadow to active, temp to shadow;
truncate table shadow; # throw away the old active data
The RENAME statement is atomic. An intermediate name, "temp", is used to help swap the names of shadow and active.
This should work with all storage engines.
Rename table - MySQL Manual