Locking selects - mysql

I use the InnoDB engine for all my tables. I know that by default INSERT locks the rows that will be inserted, and UPDATE locks the rows that it uses (whether in the SET or the WHERE clause). SELECT doesn't lock anything. And nothing locks whole tables.
But what if I did something like this:
SELECT * FROM table INTO OUTFILE '/tmp/file.txt'
If it took 5 minutes, anything could happen in some other thread. I've read I could use:
SELECT * FROM table INTO OUTFILE '/tmp/file.txt' LOCK IN SHARE MODE;
But then again I couldn't do any SELECT operations on this table, and it sucks.
What's the best approach to do this? Also, I've read that the last query should be used inside a transaction with a rollback instead of a commit - why is that so?

If you want a consistent view of an InnoDB table for a long running SELECT, the best approach is to just ensure that the transaction isolation level for the session is set to REPEATABLE READ when the SELECT is run.
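That won't block other threads that attempt to read the same rows. Because a plain SELECT is a nonlocking consistent read, it takes no row locks, so it doesn't block writers either. It does hold a metadata lock on the table, which blocks DDL (e.g. ALTER TABLE) until the transaction ends.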
https://dev.mysql.com/doc/refman/5.6/en/set-transaction.html
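A minimal sketch of that approach, assuming a hypothetical table named my_table (the output path is the one from the question). START TRANSACTION WITH CONSISTENT SNAPSHOT is optional for a single SELECT, which already reads from one snapshot, but it makes the snapshot point explicit if more statements follow:

```sql
-- Applies to all subsequent transactions in this session.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- Establish the read view now rather than at the first read.
START TRANSACTION WITH CONSISTENT SNAPSHOT;

-- Nonlocking consistent read: other sessions can keep reading and writing
-- while the export runs. my_table is a hypothetical table name.
SELECT * INTO OUTFILE '/tmp/file.txt' FROM my_table;

COMMIT;
```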
As an addendum, to clarify some of the points OP raises.
"SELECT doesn't lock anything."
It's true that a non-locking SELECT won't obtain row locks. But some special SELECT statements (as pointed out later) can obtain row locks:
SELECT ... FOR UPDATE
SELECT ... LOCK IN SHARE MODE
And there are meta-data locks, which will block DDL operations on the table (e.g. ALTER TABLE) while the SELECT statement is executing.
"And nothing locks whole tables."
That's not strictly true. A LOCK TABLES statement can obtain a lock on the entire table. And a SELECT ... FOR UPDATE (with no predicates) could (potentially) obtain locks on every row in the table.
"SELECT ... LOCK IN SHARE MODE will block other SELECT statements"
This isn't true. Shared locks will block exclusive locks from other threads. But they won't block other threads from obtaining share locks, and won't block non-locking SELECT statements.
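The behaviour can be checked with two client sessions. This is only a sketch: the table t and its columns id and val are hypothetical, and id is assumed to be the primary key.

```sql
-- Session A
START TRANSACTION;
SELECT * FROM t WHERE id = 1 LOCK IN SHARE MODE;  -- takes a shared (S) lock on the row

-- Session B (autocommit on)
SELECT * FROM t WHERE id = 1;                     -- nonlocking read: returns immediately
SELECT * FROM t WHERE id = 1 LOCK IN SHARE MODE;  -- a second S lock is compatible: returns immediately
UPDATE t SET val = val + 1 WHERE id = 1;          -- needs an exclusive (X) lock: waits for Session A

-- Session A
COMMIT;  -- releases the S lock; the UPDATE in Session B can now proceed
```

If Session A never commits, the UPDATE in Session B eventually fails with a lock wait timeout rather than waiting forever.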
What's the best approach to do this?
To reiterate the first part of my answer... just run a non-locking SELECT statement. As long as the transaction isolation level isn't set to READ UNCOMMITTED, the SELECT statement will get a consistent view of the rows in the table, as of the point in time at which the SELECT begins executing.
Also, I've read that the last query should be used inside a transaction with a rollback instead of a commit - why is that so?
This is a curious notion. It has me puzzled. Why would a ROLLBACK be preferred over a COMMIT?
As long as no DML changes have been applied, I think the COMMIT and the ROLLBACK would be equivalent. In both cases, all of the locks obtained by the transaction would be released. In terms of the database, I don't think it makes a difference.
This has me thinking this recommendation comes from a preferred pattern on the client side. Maybe there's a notion of following a rule such as "don't commit unless you've applied DML changes". But that's just a guess.
My personal recommendation would be to follow the normative pattern of using a COMMIT to end the transaction. I don't favor using an implicit ROLLBACK. In my personal opinion, a ROLLBACK should be issued when we want to explicitly discard DML changes that have been applied in a transaction. And that's typically due to an exception or error condition.

Related

Can range lock in SQL be acquired in share mode

I have a query such as
SELECT COUNT(*) FROM log WHERE num = ?;
If I set the isolation level to serializable, then the range lock will be acquired for the where clause.
My question is: can other transactions also acquire the range lock in shared mode to read the count as above, or is the range lock exclusive, so that all other transactions have to wait until the current transaction commits before executing the same read query?
Background: I am trying to implement a view counter for a heavy-traffic website. To reduce I/O to the database, I created a log table so that every time there is a view, I only write a new row to the log table. Once in a while, I (randomly) decide whether to clear the log table and add the number of rows in the log table to a column of a view count table. This means I have to be careful with interleaving transactions.
The statements below are relevant only to SQL Server and were made before the OP made clear this was really about MySQL, about which I know nothing. I'm leaving it here since it (and the resulting discussion) might be of some use nevertheless, but it is not a complete, relevant answer to the question.
SELECT statements only ever acquire shared locks, on all isolation levels (unless overridden with a table hint). And shared locks are always compatible with each other (see Lock Compatibility), so there's no problem if other transactions want to acquire shared (range) locks as well. So yes, you can have any number of queries performing SELECT COUNT(*) in parallel and they will never block each other.
This doesn't mean other transactions don't have to wait. In particular, a DELETE query must eventually acquire an exclusive lock, and it will have to wait if the SELECT is holding a shared lock. Normally this is not an issue since the engine releases locks as soon as possible. When it does become an issue, you'll want to look at solutions like snapshot isolation, which uses optimistic concurrency and conflict detection rather than locking. Under that model, a SELECT will never block any other query (save those that want table locks). Of course, this isn't free; the row versioning it uses takes up disk space and I/O.

Do "SELECT ... LOCK IN SHARE MODE" and "SELECT ... FOR UPDATE" have to be inside of a transaction?

I'm reading the documentation for these commands and am confused. The descriptions for the commands mention transactions:
SELECT ... LOCK IN SHARE MODE sets a shared mode lock on any rows that
are read. Other sessions can read the rows, but cannot modify them
until your transaction commits. If any of these rows were changed by
another transaction that has not yet committed, your query waits until
that transaction ends and then uses the latest values.
For index records the search encounters, SELECT ... FOR UPDATE blocks
other sessions from doing SELECT ... LOCK IN SHARE MODE or from
reading in certain transaction isolation levels. Consistent reads will
ignore any locks set on the records that exist in the read view. (Old
versions of a record cannot be locked; they will be reconstructed by
applying undo logs on an in-memory copy of the record.)
But then the examples don't show transactions being used. Running a test command such as select * from users for update; without a transaction doesn't result in any errors (it works). Does this mean transactions don't have to be used with these commands? If so, is there any advantage to putting these commands inside of a transaction?
In InnoDB each query is effectively run in a transaction. If you don't start a transaction explicitly (with START TRANSACTION or by setting autocommit to off), each statement runs in its own transaction, which is committed as soon as the statement completes. This means that if you are not in an explicit transaction, the lock acquired with SELECT ... LOCK IN SHARE MODE will be released as soon as the query is completed. There is nothing that prevents you from doing this; it just doesn't make much sense to use locks outside of a transaction, as these locks are meant to guarantee that the value you select won't change until a later query you are going to execute (for example, if you want to insert or update data in one table based on the values in another).
A transaction ensures that all the commands it contains will either run successfully or be rolled back.
These types of select statements affect other transactions in other sessions. So basically, wrapping them in a transaction is only a matter of whether you are selecting the data as part of a larger set of commands.
If you only want to select the data, use either the shared lock or no lock at all; there is no need to begin a transaction.
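To illustrate the difference, here is a sketch using the users table from the question. The userID and credits columns are assumptions made for the example.

```sql
-- Autocommit on, no explicit transaction: the statement is its own
-- transaction, so the row locks are released as soon as it finishes.
SELECT * FROM users FOR UPDATE;

-- Explicit transaction: the lock is held until COMMIT or ROLLBACK, so no
-- other session can change the row between the SELECT and the UPDATE.
START TRANSACTION;
SELECT credits FROM users WHERE userID = 1 FOR UPDATE;
UPDATE users SET credits = credits - 10 WHERE userID = 1;
COMMIT;
```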

How can a row be read when the table is read/write locked?

I am running these queries on MySQL 5.6.13.
I am using the repeatable read isolation level. The table looks like below:
In the Session A terminal I have issued the statement below:
UPDATE manufacturer
SET lead_time = 2
WHERE mname = 'Hayleys';
In the Session B terminal I tried to update the lead_time value of ACL Cables to 2. But since the previous UPDATE command from Session A is not yet committed (and Session A has an exclusive lock on the manufacturer table), this update waits. This I can understand.
But when I try to execute a SELECT statement in Session B as below,
SELECT * FROM manufacturer
WHERE mcode = 'ACL';
it correctly queries the manufacturer table and returns the row. How can this happen? Session A still holds the exclusive lock on the manufacturer table, and as I understand it, when an exclusive lock is held on a table no other transaction can read from or write to it until the previous transaction is committed.
I found the information below on this page:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_repeatable-read
Scope of Transaction Characteristics
You can set transaction characteristics globally, for the current
session, or for the next transaction:
With the GLOBAL keyword, the statement applies globally for all
subsequent sessions. Existing sessions are unaffected.
With the SESSION keyword, the statement applies to all subsequent
transactions performed within the current session.
Without any SESSION or GLOBAL keyword, the statement applies to the
next (not started) transaction performed within the current session.
Has this been taken into consideration?
REPEATABLE READ
This is the default isolation level for InnoDB. For consistent reads,
there is an important difference from the READ COMMITTED isolation
level: All consistent reads within the same transaction read the
snapshot established by the first read. This convention means that if
you issue several plain (nonlocking) SELECT statements within the same
transaction, these SELECT statements are consistent also with respect
to each other.
This article describes it very well:
http://www.mysqlperformanceblog.com/2012/08/28/differences-between-read-committed-and-repeatable-read-transaction-isolation-levels/
It is important to remember that InnoDB actually locks index entries,
not rows. During the execution of a statement InnoDB must lock every
entry in the index that it traverses to find the rows it is modifying.
It must do this to prevent deadlocks and maintain the isolation level.
Are the tables well indexed? Can you run SHOW ENGINE INNODB STATUS to confirm that the lock is held?
There are different kinds of locks in MySQL, including row-level locks and table-level locks.
What you need is row-level locking, which allows reading rows other than the ones being updated.
To get row-level locking, the table's storage engine has to be InnoDB:
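A sketch of how that check might look on the manufacturer table from the question. The information_schema tables below exist in MySQL 5.5 through 5.7 (the question is on 5.6) and are not available in 8.0:

```sql
-- Which indexes exist? Without an index on the column in the WHERE clause,
-- the UPDATE has to scan, and lock, many more index entries.
SHOW INDEX FROM manufacturer;

-- The TRANSACTIONS section of the output lists active transactions
-- and any lock waits.
SHOW ENGINE INNODB STATUS;

-- Active transactions, and which transaction is waiting on which.
SELECT * FROM information_schema.INNODB_TRX;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
```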
alter table TABLE_NAME engine=innodb;

While in a transaction, how can reads to an affected row be prevented until the transaction is done?

I'm fairly sure this has a simple solution, but I haven't been able to find it so far. Provided an InnoDB MySQL database with the isolation level set to SERIALIZABLE, and given the following operation:
BEGIN WORK;
SELECT * FROM users WHERE userID=1;
UPDATE users SET credits=100 WHERE userID=1;
COMMIT;
I would like to make sure that as soon as the select inside the transaction is issued, the row corresponding to userID=1 is locked for reads until the transaction is done. As it stands now, UPDATEs to this row will wait for the transaction to be finished if it is in process, but SELECTs simply will read the previous value. I understand this is the expected behaviour in this case, but I wonder if there is a way to lock the row in such a way that SELECTs will also wait until the transaction is finished to return the values?
The reason I'm looking for that is that at some point, and with enough concurrent users, it could happen that while the previous transaction is in process someone else reads the "credits" to calculate something else. Ideally the code run by that someone else should wait for the transaction to finish to use the new value, because otherwise it could lead to irreversible desync issues.
Note that I don't want to lock the entire table for reads, just the specific row.
Also, I could add a boolean "locked" field to the tables and set it to 1 every time I'm starting a transaction but I don't really feel this is the most elegant solution here, unless there is absolutely no other way to handle this through mysql directly.
I found a workaround, specifically:
SELECT ... LOCK IN SHARE MODE sets a shared mode lock on the rows
read. A shared mode lock enables other sessions to read the rows but
not to modify them. The rows read are the latest available, so if they
belong to another transaction that has not yet committed, the read
blocks until that transaction ends.
(Source)
It seems that one can just include LOCK IN SHARE MODE in the critical SELECT statements that rely on transactional data and they will indeed wait for current transactions to finish before retrieving the row/s. For this to work the transaction has to use FOR UPDATE explicitly (as opposed to the original example I gave). E.g., given the following:
BEGIN WORK;
SELECT * FROM users WHERE userID=1 FOR UPDATE;
UPDATE users SET credits=100 WHERE userID=1;
COMMIT;
Anywhere else in the code I could use:
SELECT * FROM users WHERE userID=1 LOCK IN SHARE MODE;
Since this statement is not wrapped in a transaction, the lock is released immediately, thus having no impact on subsequent queries. But if the row for userID=1 has been selected for update within a transaction, this statement waits until that transaction is done, which is exactly what I was looking for.
You could try the SELECT ... FOR UPDATE locking read.
A SELECT ... FOR UPDATE reads the latest available data, setting exclusive locks on each row it reads. Thus, it sets the same locks a searched SQL UPDATE would set on the rows.
Please go through the following site: http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html

Mysql InnoDB - Locking scenario

I am a developer and have only fair knowledge about databases. I need to understand the transaction level locking mechanism in InnoDB.
I read that InnoDB uses row level locking? As far as I understand, it locks down a particular row within a transaction. What will happen to a select statement when a table update is going on ?
For Example, assume there is transaction and a select statement both triggered from two different processes and assume Transaction1 starts before the select statement is issued.
Transaction1 : Start
Update table_x set x = y where 1=1
Transaction1 : End
Select Query
Select x from table_x
What will happen to the select statement. Will it return values "during" Transaction1 takes place or "after" it completes? And if it can begin only after Transaction1 ends, where is Row level locking in this picture?
Am I making sense or my fundamental understanding itself is wrong? Please advise.
It depends on the isolation level:
SERIALIZABLE
REPEATABLE READ
READ COMMITTED
READ UNCOMMITTED
These are explained well on Wikipedia and in the MySQL documentation.
It does not depend only on the locking involved, but on the isolation level, which uses locking to provide the transaction isolation as defined by ACID standards. InnoDB uses not only locking, but also multiversioning of the rows to speed up transactions.
At the SERIALIZABLE isolation level a plain SELECT takes shared (read) locks (when autocommit is disabled), which conflict with the exclusive locks held by the update, so the select has to wait for the first transaction to complete. At lower isolation levels only the update takes locks (exclusive ones), and plain selects are not blocked. In REPEATABLE READ and READ COMMITTED the select uses the undo log to reconstruct the previous committed value of a record that has been updated, and in READ UNCOMMITTED it returns the current, possibly uncommitted, value.
The difference between table-level locking and row-level locking shows when you have two transactions that each run an update query. With table-level locking, the second has to wait for the first one, as the whole table is locked. With row-level locking, only the rows that match the where clause* (as well as some gaps between them, but this is another topic) are locked, which means that different transactions can update different parts of the table without needing to wait for each other.
*assuming that there is index covering the where clause
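A two-session sketch of the row-level case, assuming table_x has a primary key column named id (a hypothetical column, not given in the question):

```sql
-- Session A
START TRANSACTION;
UPDATE table_x SET x = 1 WHERE id = 10;  -- locks only the index record for id = 10

-- Session B
START TRANSACTION;
UPDATE table_x SET x = 2 WHERE id = 20;  -- different row: proceeds without waiting
UPDATE table_x SET x = 3 WHERE id = 10;  -- same row as Session A: waits

-- Session A
COMMIT;  -- releases the lock; Session B's second UPDATE now runs
```

Without an index on the column in the WHERE clause, both updates would have to scan the table and would block each other even though they target different rows.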
The select will not wait for the transaction to complete; instead it will return the last committed value of the rows (i.e., the values as they were before the transaction changed them).
If you want the select to wait for the transaction to finish you can use "LOCK IN SHARE MODE":
Select x from table_x LOCK IN SHARE MODE;
This will cause the select to wait for any row(s) that are currently locked by a transaction holding an exclusive (update/delete) lock on them.
A read performed with LOCK IN SHARE MODE reads the latest available
data and sets a shared mode lock on the rows read. A shared mode lock
prevents others from updating or deleting the row read. Also, if the
latest data belongs to a yet uncommitted transaction of another
session, we wait until that transaction ends.
http://dev.mysql.com/doc/refman/5.0/en/innodb-lock-modes.html
A SELECT started from outside of a transaction will see the table as it was before the transaction started. It will see the updated values only after the transaction is committed.