I have a query such as
SELECT COUNT(*) FROM log WHERE num = ?;
If I set the isolation level to serializable, then a range lock will be acquired for the WHERE clause.
My question is: can another transaction also acquire the range lock in share mode to read the count as above, or is the range lock exclusive, so that all other transactions have to wait until the current transaction commits before executing the same read query?
Background: I am trying to implement a view counter for a heavy-traffic website. To reduce IO to the database, I created a log table so that every time there is a view, I only write a new row to the log table. Once in a while, I (randomly) decide whether to clear the log table and add the number of its rows to a column of a view-count table. This means I have to be careful with interleaving transactions.
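Here is a minimal sketch of what I mean, with placeholder table and column names:

INSERT INTO log (num) VALUES (?);  -- one cheap append per view

-- once in a while, fold the log into the aggregate counter:
START TRANSACTION;
SELECT COUNT(*) FROM log WHERE num = ?;                 -- count the pending views
UPDATE view_count SET views = views + ? WHERE num = ?;  -- add that count
DELETE FROM log WHERE num = ?;                          -- clear the log
COMMIT;

The interleaving I worry about is another view being logged between the SELECT and the DELETE.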
The statements below are relevant only to SQL Server and were made before the OP made clear this was really about MySQL, about which I know nothing. I'm leaving it here since it (and the resulting discussion) might be of some use nevertheless, but it is not a complete, relevant answer to the question.
SELECT statements only ever acquire shared locks, on all isolation levels (unless overridden with a table hint). And shared locks are always compatible with each other (see Lock Compatibility), so there's no problem if other transactions want to acquire shared (range) locks as well. So yes, you can have any number of queries performing SELECT COUNT(*) in parallel and they will never block each other.
This doesn't mean other transactions don't have to wait. In particular, a DELETE query must eventually acquire an exclusive lock, and it will have to wait if the SELECT is holding a shared lock. Normally this is not an issue since the engine releases locks as soon as possible. When it does become an issue, you'll want to look at solutions like snapshot isolation, which uses optimistic concurrency and conflict detection rather than locking. Under that model, a SELECT will never block any other query (save those that want table locks). Of course, this isn't free; the row versioning it uses takes up disk space and I/O.
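For reference, a minimal sketch of enabling this in SQL Server (the database name is a placeholder):

-- let readers work from row versions instead of shared locks
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
-- optionally make plain READ COMMITTED use row versioning as well
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;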
Related
What is the exact difference between the two locking read clauses:
SELECT ... FOR UPDATE
and
SELECT ... LOCK IN SHARE MODE
And why would you need to use one over the other?
I have been trying to understand the difference between the two. I'll document what I have found in hopes it'll be useful to the next person.
Both LOCK IN SHARE MODE and FOR UPDATE ensure no other transaction can update the rows that are selected. The difference between the two is in how they treat locks while reading data.
LOCK IN SHARE MODE does not prevent another transaction from reading the same row that was locked.
FOR UPDATE prevents other locking reads of the same row (non-locking reads can still read that row; LOCK IN SHARE MODE and FOR UPDATE are locking reads).
This matters in cases like updating counters, where you read the value in one statement and update it in another. Here, using LOCK IN SHARE MODE allows two transactions to read the same initial value. So if the counter is incremented by 1 by both transactions, the ending count might increase only by 1, since both transactions initially read the same value.
Using FOR UPDATE would have blocked the second transaction from reading the value until the first one is done. This ensures the counter is incremented by 2.
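A minimal sketch of the FOR UPDATE variant (counters is a hypothetical table):

START TRANSACTION;
-- a second transaction issuing the same SELECT ... FOR UPDATE blocks here
SELECT val FROM counters WHERE id = 1 FOR UPDATE;
UPDATE counters SET val = val + 1 WHERE id = 1;
COMMIT;

Each transaction reads the value the previous one committed, so two increments really do add 2.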
FOR UPDATE: You're informing MySQL that the selected rows may be updated in later steps (before the end of this transaction), so MySQL doesn't grant any locking read on the same set of rows to any other transaction at that moment. Other transactions (whether they want a locking read or a write) must wait until the first transaction is finished.
FOR SHARE: Indicates to MySQL that you're selecting the rows from the table only for reading, and that they must not be modified before the end of the transaction. Any number of transactions can hold a read lock on the rows.
Note: There is a chance of a deadlock if these statements (FOR UPDATE, FOR SHARE) are not used properly.
Either way the integrity of your data will be guaranteed, it's just a question of how the database guarantees it. Does it do so by raising runtime errors when transactions conflict with each other (i.e. FOR SHARE), or does it do so by serializing any transactions that would conflict with each other (i.e. FOR UPDATE)?
FOR SHARE (a.k.a. LOCK IN SHARE MODE): Transactions face a higher probability of failure due to deadlock, because they delay blocking until the moment an update statement arrives (at which point they either block until all read locks are released, or fail with a deadlock if another write is in progress). Only one of the contending clients blocks and eventually succeeds: the others fail with a deadlock when they try to update and have to retry their transactions.
FOR UPDATE: Transactions won't fail due to deadlock, because they won't be allowed to run concurrently. This may be desirable for example because it makes it easier to reason about multi-threading if all updates are serialized across all clients. However, it limits the concurrency you can achieve because all other transactions block until the first transaction is finished.
Pro-Tip: As an exercise I recommend taking some time to play with a local test database and a couple of MySQL clients on the command line to prove this behavior for yourself. That is how I eventually understood the difference myself, because it can be very abstract until you see it in action.
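For example, with a hypothetical counters table, something like this reproduces the FOR SHARE deadlock described above:

-- client A: START TRANSACTION;
-- client A: SELECT val FROM counters WHERE id = 1 LOCK IN SHARE MODE;  -- reads 5
-- client B: START TRANSACTION;
-- client B: SELECT val FROM counters WHERE id = 1 LOCK IN SHARE MODE;  -- also reads 5
-- client A: UPDATE counters SET val = 6 WHERE id = 1;  -- blocks on B's shared lock
-- client B: UPDATE counters SET val = 6 WHERE id = 1;  -- deadlock: one client gets error 1213 and must retry

Repeating the experiment with FOR UPDATE instead, client B's SELECT simply blocks until A commits, and both increments succeed.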
Does a MySQL InnoDB table wait for write locks even for a query such as SELECT COUNT(*) FROM t?
My situation:
I've got a table with 50,000 rows and many updates (a views count in every row). InnoDB should place a write lock on the updated row. But when I run a query with only COUNT(*) on this table, MySQL should be able to answer it without waiting for write locks, because no UPDATE will change the number of rows.
Thanks a lot!
No, MySQL doesn't lock InnoDB tables for queries that only read data from them.
This is only the case for old MyISAM tables, where all readers must wait until the writer is done and vice versa.
For InnoDB tables, MySQL implements multiversion concurrency control (MVCC).
In MySQL terminology this is called consistent nonlocking reads.
In short: when the reader starts the query, the database takes a snapshot of the database at the point in time the query started. The reader (the query) sees only changes that were made visible (committed) up to that point in time, and doesn't see changes made by later transactions. This allows readers to read data without locking and without waiting for writers, while still preserving ACID.
There are subtle differences depending on the transaction isolation level; you can find a detailed description here: http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html
In short: in READ UNCOMMITTED, READ COMMITTED and REPEATABLE READ modes, all SELECT statements that only read data (SELECTs without FOR UPDATE or LOCK IN SHARE MODE clauses) are performed in a nonlocking fashion.
In SERIALIZABLE mode all transactions are serialized: with autocommit disabled, a plain SELECT is automatically converted to SELECT ... LOCK IN SHARE MODE and can block when it conflicts with other transactions; with autocommit enabled, the SELECT is its own transaction and can run as a consistent read. All details are explained in the link above.
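You can see the nonlocking behaviour with two sessions (t is a hypothetical InnoDB table):

-- session A: START TRANSACTION;
-- session A: UPDATE t SET views = views + 1 WHERE id = 1;  -- row now exclusively locked, not committed
-- session B: SELECT COUNT(*) FROM t;                       -- answers immediately from the snapshot
-- session A: COMMIT;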
In my application, I want any insert to the database to be executed as soon as a request comes for writing some data.
I am using InnoDB engine.
Since insertion requires an exclusive lock, it is possible that while the current read query holds a shared lock, more reads come in that again take shared locks, and the write operation has to wait a long time.
I want that when there is a write operation in the queue, no new read operation gets a shared lock. As soon as the reads initiated before the current write request complete, the write should execute; after that, all the other read operations should take place.
How can this be implemented?
Edit
Since I am using InnoDB tables and am not taking table locks, there should not be a conflict between SELECT and INSERT; it would be between SELECT and UPDATE. (Please correct me if there can be a conflict between SELECT and INSERT as well.)
In MySQL, UPDATE has higher priority than SELECT. But suppose some read queries are being executed, then an update query comes in, followed by more read queries. In that case, will the read queries coming after the update wait for it to finish, as mentioned here: http://dev.mysql.com/doc/refman/5.0/en//table-locking.html, or will they get a shared lock along with the read queries that were there before the update query was fired?
You don't need to acquire shared locks when reading from the database. In fact, with the default transaction isolation level REPEATABLE READ, ordinary SELECT queries within one transaction read from a consistent snapshot. No locks are acquired or required; within this transaction you will simply not see any changes committed by other transactions.
Since no shared locks are acquired, exclusive locks for updating queries are immediately granted to other sessions, in the order they are requested.
The MySQL docs say the following:
Consistent read is the default mode in which InnoDB processes SELECT statements in READ COMMITTED and REPEATABLE READ isolation levels. A consistent read does not set any locks on the tables it accesses, and therefore other sessions are free to modify those tables at the same time a consistent read is being performed on the table.
Suppose that you are running in the default REPEATABLE READ isolation level. When you issue a consistent read (that is, an ordinary SELECT statement), InnoDB gives your transaction a timepoint according to which your query sees the database. If another transaction deletes a row and commits after your timepoint was assigned, you do not see the row as having been deleted. Inserts and updates are treated similarly.
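A quick two-session illustration of that quote (t is a hypothetical table):

-- session A: START TRANSACTION;
-- session A: SELECT COUNT(*) FROM t;      -- snapshot taken here; say it returns 100
-- session B: DELETE FROM t WHERE id = 7;  -- autocommits
-- session A: SELECT COUNT(*) FROM t;      -- still 100 inside this transaction
-- session A: COMMIT;
-- session A: SELECT COUNT(*) FROM t;      -- now 99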
You can try looking at INSERT DELAYED (see the MySQL documentation on INSERT DELAYED).
Unfortunately, it is not available for InnoDB.
PS: An exclusive lock cannot be acquired while there is already a shared lock on the table. So basically, in your situation: three reads obtain a shared lock, and one insert needs an exclusive lock. The insert will only be able to obtain the lock after the selects have finished.
I am reading the book High Performance MySQL; it mentions:
performing one query per table uses table locks more efficiently: the queries will lock the tables individually and relatively briefly, instead of locking them all for a longer time.
Does MyISAM place a table lock even when just selecting something? Can someone explain a little bit?
MyISAM has different kinds of locks. A SELECT operation places a READ LOCK on the table. There can be multiple active read locks at any given time, as long as there are no active WRITE LOCKS. Operations that modify the table, e.g. INSERT, UPDATE, DELETE or ALTER TABLE, place a WRITE LOCK on the table. A write lock can only be placed on a table when there are no active read locks; if there are active read locks, MyISAM queues the write lock to be activated as soon as all active read locks are released.
Likewise, when there's an active write lock, attempting to place a read lock on the table will queue the lock (and the associated query) until the write lock has been released.
Ultimately this all means that:
You can have any number of active read locks (also called shared locks)
You can only have one active write lock (also called an exclusive lock)
For more information see: http://dev.mysql.com/doc/refman/5.5/en/internal-locking.html
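A sketch of the queueing behaviour using explicit locks (t is a hypothetical MyISAM table):

-- session A: LOCK TABLES t READ;       -- read lock held
-- session B: SELECT COUNT(*) FROM t;   -- fine, read locks are shared
-- session C: UPDATE t SET x = 1;       -- queued until all read locks are released
-- session D: SELECT COUNT(*) FROM t;   -- queued BEHIND the writer, since write locks have priority
-- session A: UNLOCK TABLES;            -- C runs, then D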
reko_t provided a good answer, I will try to elaborate on it:
Yes.
You can have EITHER one writer or several readers
Except there is a special case, called concurrent inserts. This means that you can have one thread doing an insert, while one or more threads are doing select (read) queries.
There are a lot of caveats to doing this (see the settings sketch after the list):
the insert has to go "at the end" of the table, not into a "hole" in the middle
Only inserts can be done concurrently (no updates, deletes)
There is still contention on the single MyISAM key buffer. There is a single key buffer, protected by a single mutex, for the whole server. Everything which uses an index needs to take it (typically several times).
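Whether concurrent inserts happen at all is governed by the concurrent_insert server variable; a hedged sketch (t is a placeholder table):

SHOW VARIABLES LIKE 'concurrent_insert';  -- AUTO/1 allows them only when the table has no holes
OPTIMIZE TABLE t;                         -- defragments the table so inserts go at the end again
SET GLOBAL concurrent_insert = 2;         -- ALWAYS: allow them even with holes, at the cost of space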
Essentially, MyISAM has poor concurrency. You can try to fake it, but it's bad whichever way you look at it. MySQL / Oracle has made no attempt to improve it recently (looking at the source code, I'm not surprised; they'd only introduce bugs).
If you have a workload with lots of "big" SELECTs that retrieve many rows or are expensive in some way, they may often overlap, and this may seem OK. But a single row update or delete will block the whole lot of them.
Can someone explain the need to lock tables and/or rows in MySQL?
I am assuming that it is to prevent multiple writes to the same field. Is this the best practice?
First, let's look at a good document. This is not MySQL-related documentation; it's about PostgreSQL, but it's one of the simplest and clearest docs I've read on transactions. You'll understand MySQL transactions better after reading this link: http://www.postgresql.org/docs/8.4/static/mvcc.html
When you're running a transaction, four rules apply (ACID):
Atomicity: all or nothing (rollback)
Consistency: coherent before, coherent after
Isolation: not impacted by other concurrent transactions
Durability: commit; if it's done, it's really done
Of these rules, only one is problematic: Isolation. Using a transaction does not ensure a perfect isolation level. The link above explains better what phantom reads and other such isolation problems between concurrent transactions are. To make it simple: you should really use row-level locks to prevent other transactions, running at the same time as yours (and maybe committing before yours), from altering the same records. But with locks come deadlocks...
So when you try using nice transactions with locks, you'll need to handle deadlocks, and you'll need to handle the fact that a transaction can fail and should be relaunched (with simple for or while loops).
Edit:
Recent versions of InnoDB provide greater levels of isolation than previous ones. I've done some tests, and I must admit that even the phantom reads that should happen are now difficult to reproduce.
By default MySQL is on level 3 of the 4 isolation levels explained in the PostgreSQL document (PostgreSQL defaults to level 2). This is REPEATABLE READ: you won't have dirty reads and you won't have non-repeatable reads. So a row you SELECTed inside your transaction keeps returning the same values for the whole transaction, even if another transaction modifies it in the meantime (InnoDB achieves this with a snapshot rather than an implicit lock; if you need writers to actually block, perform a SELECT ... FOR UPDATE).
Warning: if you work with an older version of MySQL, like 5.0, you may be on level 2, and you'll need to perform the row lock yourself using FOR UPDATE!
You can always find some nice race conditions; when working with aggregate queries it can be safer to be on the 4th isolation level (by adding LOCK IN SHARE MODE at the end of your query) if you do not want people adding rows while you're performing some task. I've been able to reproduce one serializable-level problem, but I won't explain the complex example here; it involves really tricky race conditions.
There is a very nice example of race conditions that even serializable level cannot fix here : http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
When working with transactions, the most important things are (a short sketch follows this list):
data used in your transaction must always be read INSIDE the transaction (re-read it if you fetched it before the BEGIN)
understand why a high isolation level sets implicit locks and may block some other queries (and make them time out)
try to avoid deadlocks (lock tables in the same order) but handle them (retry a transaction aborted by MySQL)
try to freeze important source tables with the serializable isolation level (LOCK IN SHARE MODE) when your application code assumes that no insert or update will modify the dataset it's using (otherwise you won't get errors, but your results will have ignored the concurrent changes)
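A minimal sketch of the first and third points, assuming a hypothetical page_stats table:

START TRANSACTION;
SELECT views FROM page_stats WHERE page_id = 42 FOR UPDATE;  -- read INSIDE the transaction, with a row lock
UPDATE page_stats SET views = views + 1 WHERE page_id = 42;
COMMIT;
-- if MySQL aborts the transaction with error 1213 (deadlock), rerun the whole block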
It is not a best practice. Modern versions of MySQL support transactions with well-defined semantics. Use transactions, and forget about locking things by hand.
The only new thing you'll have to deal with is that transaction commits may fail because of race conditions, but you'd be doing error checking with locks anyway, and it is easier to retry the logic that led to a transaction failure than to recover from errors in a non-transactional setup.
If you do get race conditions and failed commits, then you may want to fine-tune the isolation configuration for your transactions.
For example, suppose you need to generate invoice numbers that are sequential, with no numbers missing; this is a requirement in at least the country I live in.
If you have a few web servers, then a few users might be buying stuff literally at the same time.
If you do SELECT MAX(invoice_id)+1 FROM invoice to get the new invoice number, two web servers might do that at the same time (before the new invoice has been added) and get the same invoice number for the invoices they're trying to create.
If you use a mechanism such as AUTO_INCREMENT, it is only meant to generate unique values and makes no guarantee about not skipping numbers (if one transaction inserts a row and then rolls back, the number is "lost").
So the solution is to (a) lock the table, (b) SELECT MAX(invoice_id)+1 FROM invoice, (c) do the insert, (d) commit and unlock the table.
On another note, in MySQL you're best off using InnoDB and row-level locking, since issuing a LOCK TABLES command can implicitly commit the transaction you're working on.
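A minimal sketch of that row-level alternative, assuming a hypothetical one-row invoice_counter table holding the last number issued:

START TRANSACTION;
SELECT last_id + 1 INTO @next FROM invoice_counter FOR UPDATE;  -- serializes all allocators on this row
UPDATE invoice_counter SET last_id = @next;
INSERT INTO invoice (invoice_id, customer_id) VALUES (@next, 123);
COMMIT;  -- the row lock is released here; on rollback, no number is lost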
Take a look here for a general introduction to what transactions are and how to use them.
Databases are designed to work in concurrent environments, so locking the tables and/or records helps to keep the transactions consistent.
So a record affected by one transaction should not be altered until this transaction commits or rolls back.