Transaction vs locking in MySQL

I have some confusion regarding transactions and locking in MySQL.
What is the difference between a transaction and a lock in MySQL, and how are they related to each other?
Do transactions apply to DML (INSERT, UPDATE and DELETE) only, or do they also apply to SELECT queries?
Do transactions cover TRUNCATE?
For example:
START TRANSACTION;
SELECT * FROM XYX;
UPDATE abc SET summary=788 WHERE type=1;
TRUNCATE TABLE pqr;
INSERT INTO ABL VALUES('OK');
COMMIT;

It requires a long explanation to fully cover your question.
In short, a transaction is an "atomic operation": if it is committed, everything inside it is committed; if it is rolled back, everything inside it is rolled back.
Locks are a mechanism to avoid dirty and phantom reads (and two processes making updates at the same time, messing with one another) in parallel/concurrent environments.
Generally speaking, transaction isolation levels define locking strategies.
Almost everything runs inside a transaction, including SELECT. TRUNCATE is the exception: in MySQL it performs an implicit commit, so it cannot be rolled back.
I suggest you hit the books to learn about transaction isolation levels and locks (strategies, granularity, performance, deadlocks, starvation, the dining philosophers problem...).
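For example, a minimal sketch (reusing the table names from the question) of the all-or-nothing behaviour and of TRUNCATE's implicit commit:

START TRANSACTION;
UPDATE abc SET summary=788 WHERE type=1;
ROLLBACK;            # the UPDATE above is undone

START TRANSACTION;
TRUNCATE TABLE pqr;  # implicitly commits the open transaction
ROLLBACK;            # too late: pqr is already empty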

What is transaction?
A transaction comprises a unit of work performed within a database
management system (or similar system) against a database, and treated
in a coherent and reliable way independent of other transactions.
There is also documentation on the MySQL site.
What is database lock?
A lock, as a read lock or write lock, is used when multiple users need
to access a database concurrently.
So they are completely different things; you can't really 'compare' them.

Related

Are changes made to DB only through transactions?

I am not able to get a clear, complete understanding of the role of transactions in databases.
I know that operations clubbed together in a transaction will be executed together and then either committed or rolled back.
But then what about any other query that I write to the database without manually creating a transaction?
Is a transaction created internally for them?
Also, what about SELECT statements? Are transactions created for them too?
I have been using databases and SQL for some time now, but alas I am not clear on these points.
Are changes to DBs happening only through transactions? The short answer is yes.
There is always a transaction involved:
It might be automatically started (before) and committed (after) every single DML statement you issue, if you're relying on the AUTOCOMMIT behaviour of your database session
Or you may explicitly start one with BEGIN, execute your statements and end it with COMMIT
I like to think of a transaction as a boundary that imposes clear semantics of ATOMICITY and ISOLATION on the statements contained within it.
You describe atomicity (all-or-nothing behaviour), but that is not the only guarantee a transaction can give you: there is also isolation, and this has to do with the reads you perform within transactions (e.g. SELECTs).
In a concurrent application (many clients writing and reading to the same db/table at the same time), transaction ISOLATION is the property that defines "how much of the effects of other operations" can be observed in the current one. For example, assume you need to perform a transaction that involves doing the same SELECT multiple times: do you want this SELECT to return (possibly) different results each time (because some modification happened concurrently) or not?
For single statements:
A single DML statement (UPDATE, INSERT...) by itself is effectively "as if it were in a transaction with a single statement that gets immediately committed after execution" (either it works like this because you're in AUTOCOMMIT mode, or you wrapped the single statement within BEGIN...COMMIT)
For a single SELECT it's the same. The transaction in this case (implicit or not) gives you the possibility of specifying different isolation levels. It might sound strange to consider transactions for SELECTs, but requiring a particular isolation level might mean that the db is acquiring some lock on the data under the hood: committing the transaction in that case would release such locks.
Since you tagged mysql, here you can read on transaction isolations supported by mysql:
https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html
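As a concrete sketch (reusing the table names from the first question above), here are the same statements under autocommit versus an explicit transaction:

SET autocommit = 1;                        # the MySQL default
UPDATE abc SET summary=788 WHERE type=1;   # its own transaction, committed immediately

START TRANSACTION;                         # explicit transaction:
UPDATE abc SET summary=788 WHERE type=1;   # both statements commit
INSERT INTO ABL VALUES('OK');              # (or roll back) together
COMMIT;

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT * FROM XYX;                         # may acquire shared locks under SERIALIZABLE
COMMIT;                                    # releases any locks the SELECT took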
A SQL transaction is any statement that contains Data Manipulation Language (DML). That is, any statement that changes values in a table, such as UPDATE, INSERT, MERGE, DELETE, etc.

Can range lock in SQL be acquired in share mode

I have a query such as
SELECT COUNT(*) FROM log WHERE num = ?;
If I set the isolation level to serializable, then a range lock will be acquired for the WHERE clause.
My question is: can other transactions also acquire the range lock in shared mode to read the count as above, or is the range lock exclusive, so that all other transactions have to wait until the current transaction commits before executing the same read query?
Background: I am trying to implement a view counter for a heavy-traffic website. To reduce IO to the database, I created a log table, so that every time there is a view, I only write a new row to the log table. Once in a while, I (randomly) decide whether to clear the log table and add the number of its rows to a column of a view-count table. This means I have to be careful with interleaving transactions.
The statements below are relevant only to SQL Server and were made before the OP made clear this was really about MySQL, about which I know nothing. I'm leaving it here since it (and the resulting discussion) might be of some use nevertheless, but it is not a complete, relevant answer to the question.
SELECT statements only ever acquire shared locks, on all isolation levels (unless overridden with a table hint). And shared locks are always compatible with each other (see Lock Compatibility), so there's no problem if other transactions want to acquire shared (range) locks as well. So yes, you can have any number of queries performing SELECT COUNT(*) in parallel and they will never block each other.
This doesn't mean other transactions don't have to wait. In particular, a DELETE query must eventually acquire an exclusive lock, and it will have to wait if the SELECT is holding a shared lock. Normally this is not an issue since the engine releases locks as soon as possible. When it does become an issue, you'll want to look at solutions like snapshot isolation, which uses optimistic concurrency and conflict detection rather than locking. Under that model, a SELECT will never block any other query (save those that want table locks). Of course, this isn't free; the row versioning it uses takes up disk space and I/O.
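For the counter-flush scenario in the question, here is a minimal MySQL-flavoured sketch (the view_count table and its columns are hypothetical) that keeps the count-and-clear step atomic by using a locking read:

START TRANSACTION;
SELECT COUNT(*) INTO @n FROM log WHERE num = 1 FOR UPDATE;  # locks the counted rows
UPDATE view_count SET views = views + @n WHERE num = 1;
DELETE FROM log WHERE num = 1;
COMMIT;                # releases the locks; concurrent inserts resume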

How can priority be given to write/update over read in MySQL?

In my application, I want any insert to the database to be executed as soon as a request comes for writing some data.
I am using InnoDB engine.
Since insertion requires an exclusive lock, it is possible that while the current read query holds a shared lock, other reads might come in which again take a shared lock, and the write operation might have to wait for a long time.
I want that when a write operation is queued, no new read operation gets a shared lock. As soon as the reads which were initiated before the current write request are completed, the write operation should be executed. After that, all other read operations should take place.
How can this be implemented?
Edit
Since I am using InnoDB tables and am not taking table locks, there should not be a conflict between SELECT and INSERT; it would be between SELECT and UPDATE. (Please correct me if there can be a conflict between SELECT and INSERT as well.)
In MySQL, UPDATE has a higher priority than SELECT. But suppose some read queries are being executed, then an update query comes in, followed by some more read queries. In that case, will the read queries coming after the update wait for it to finish, as mentioned here http://dev.mysql.com/doc/refman/5.0/en/table-locking.html, or will they get a shared lock along with the read queries that were there before the update query was fired?
You don't need to acquire shared locks when reading from the database. In fact, with the default transaction isolation level, REPEATABLE READ, ordinary SELECT queries within one transaction read from a consistent snapshot. No locks are acquired or required; within this transaction you will simply not see any changes committed by other transactions.
Since no shared locks are acquired, exclusive locks for updating queries are immediately granted to other sessions, in the order they are requested.
The MySQL doc says the following:
Consistent read is the default mode in which InnoDB processes SELECT statements in READ COMMITTED and REPEATABLE READ isolation levels. A consistent read does not set any locks on the tables it accesses, and therefore other sessions are free to modify those tables at the same time a consistent read is being performed on the table.
Suppose that you are running in the default REPEATABLE READ isolation level. When you issue a consistent read (that is, an ordinary SELECT statement), InnoDB gives your transaction a timepoint according to which your query sees the database. If another transaction deletes a row and commits after your timepoint was assigned, you do not see the row as having been deleted. Inserts and updates are treated similarly.
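A two-session sketch of this behaviour (the orders table is hypothetical), showing that a consistent read neither blocks nor is blocked by a concurrent update:

# Session A (REPEATABLE READ, the InnoDB default):
START TRANSACTION;
SELECT SUM(amount) FROM orders;      # consistent snapshot, no locks taken

# Session B, meanwhile, is not blocked:
UPDATE orders SET amount = amount + 1 WHERE id = 42;   # proceeds immediately

# Back in session A, the same query still sees the original snapshot:
SELECT SUM(amount) FROM orders;
COMMIT;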
You can try looking at INSERT DELAYED: MySQL Insert Delayed.
Unfortunately it is not available for InnoDB.
PS: An exclusive lock cannot be acquired while there is already a shared lock on the table. So basically, in your situation: three reads obtain a shared lock, one insert needs an exclusive lock. The insert will only be able to obtain the lock after the selects have finished.

Locking MySQL tables/rows

Can someone explain the need to lock tables and/or rows in MySQL?
I am assuming that it is to prevent multiple writes to the same field. Is this the best practice?
First let's look at a good document. This is not MySQL-related documentation, it's about PostgreSQL, but it's one of the simplest and clearest docs I've read on transactions. You'll understand MySQL transactions better after reading this link: http://www.postgresql.org/docs/8.4/static/mvcc.html
When you're running a transaction, 4 rules apply (ACID):
Atomicity: all or nothing (rollback)
Consistency: consistent before, consistent after
Isolation: not impacted by other concurrent transactions
Durability: once a commit is done, it's really done
Of these rules, only one is problematic: Isolation. Using a transaction does not ensure a perfect isolation level. The link above explains better what phantom reads and similar isolation problems between concurrent transactions are. But to keep it simple, you should really use row-level locks to prevent other transactions, running at the same time as yours (and maybe committing before yours), from altering the same records. But with locks come deadlocks...
Then, when you try using transactions with locks, you'll need to handle deadlocks and the fact that a transaction can fail and should be relaunched (a simple for or while loop).
Edit:
Recent versions of InnoDB provide greater levels of isolation than previous ones. I've done some tests, and I must admit that even the phantom reads that should happen are now difficult to reproduce.
MySQL is by default on level 3 of the 4 isolation levels explained in the PostgreSQL document (whereas PostgreSQL is on level 2 by default). This is REPEATABLE READ. That means you won't have dirty reads and you won't have non-repeatable reads. So someone modifying a row on which you made your SELECT in your transaction will hit an implicit LOCK (as if you had performed a SELECT ... FOR UPDATE).
Warning: if you work with an older version of MySQL, like 5.0, you may be on level 2, and you'll need to perform the row lock using the FOR UPDATE keywords!
You can always find some nice race conditions; when working with aggregate queries it can be safer to be on the 4th isolation level (by using LOCK IN SHARE MODE at the end of your query) if you do not want people adding rows while you're performing some task. I've been able to reproduce one serializable-level problem, but I won't explain the complex example here; the race conditions are really tricky.
There is a very nice example of race conditions that even the serializable level cannot fix here: http://www.postgresql.org/docs/8.4/static/transaction-iso.html#MVCC-SERIALIZABILITY
When working with transactions, the most important things are:
data used in your transaction must always be read INSIDE the transaction (re-read it if you fetched it before the BEGIN)
understand why higher isolation levels set implicit locks and may block some other queries (and make them time out)
try to avoid deadlocks (lock tables in the same order) but handle them (retry a transaction aborted by MySQL)
try to freeze important source tables at the serializable isolation level (LOCK IN SHARE MODE) when your application code assumes that no insert or update should modify the dataset it's using; otherwise you will not get errors, but your results will have silently ignored the concurrent changes (see the sketch after this list)
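A minimal sketch of such a locking read with deadlock handling in mind (the stock and stock_summary tables are hypothetical):

START TRANSACTION;
# Freeze the rows this calculation depends on; concurrent writers will block:
SELECT qty FROM stock WHERE product_id = 7 LOCK IN SHARE MODE;
# ... compute in the application, then persist the result ...
UPDATE stock_summary SET total = total + 1 WHERE product_id = 7;
COMMIT;
# If MySQL aborts the transaction with ER_LOCK_DEADLOCK (error 1213),
# re-run the whole block in a simple retry loop.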
It is not a best practice. Modern versions of MySQL support transactions with well-defined semantics. Use transactions, and forget about locking stuff by hand.
The only new thing you'll have to deal with is that transaction commits may fail because of race conditions, but you'd be doing error checking with locks anyway, and it is easier to retry the logic that led to a transaction failure than to recover from errors in a non-transactional setup.
If you do get race conditions and failed commits, then you may want to fine-tune the isolation configuration for your transactions.
For example, suppose you need to generate invoice numbers which are sequential with no numbers missing; this is a requirement, at least in the country I live in.
If you have a few web servers, then a few users might be buying stuff literally at the same time.
If you do SELECT MAX(invoice_id)+1 FROM invoice to get the new invoice number, two web servers might do that at the same time (before the new invoice has been added) and get the same invoice number for the invoices they're trying to create.
If you use a mechanism such as AUTO_INCREMENT, it is only meant to generate unique values and makes no guarantee about not skipping numbers (if one transaction tries to insert a row, then does a rollback, the number is "lost").
So the solution is to (a) lock the table, (b) SELECT MAX(invoice_id)+1 FROM invoice, (c) do the insert, (d) commit and unlock the table.
On another note, in MySQL you're best off using InnoDB and row-level locking. Issuing a LOCK TABLES command can implicitly commit the transaction you're working on.
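A row-level sketch of the same pattern with InnoDB, using SELECT ... FOR UPDATE instead of a table lock (the invoice table and its columns are hypothetical):

START TRANSACTION;
# FOR UPDATE serializes concurrent sessions on the scanned rows,
# so two sessions cannot compute the same maximum at once:
SELECT COALESCE(MAX(invoice_id), 0) + 1 INTO @next FROM invoice FOR UPDATE;
INSERT INTO invoice (invoice_id, total) VALUES (@next, 100.00);
COMMIT;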
Take a look here for general introduction to what transactions are and how to use them.
Databases are designed to work in concurrent environments, so locking the tables and/or records helps to keep the transactions consistent.
So a record affected by one transaction should not be altered until this transaction commits or rolls back.

Prevent read when updating the table

In MySQL:
Every minute I empty the table and fill it with new data. I don't want users to read the data during the fill process; before or after is OK.
How do I achieve this?
Are transactions the way to do it?
Assuming you use a transactional engine (usually InnoDB), clear and refill the table in the same transaction.
Make sure that your readers use the READ COMMITTED or higher transaction isolation level (the default is REPEATABLE READ, which is higher).
That way readers will continue to see the old contents of the table during the update (a sketch follows the caveats below).
There are a few things to be careful of:
If the table is so big that it exhausts the rollback area: this is possible if you update the whole of (say) a 1M-row table. Of course this is tunable, but there are limits.
If the transaction fails part way through and gets rolled back: rolling back big transactions is VERY inefficient in InnoDB (it is optimised for commits, not rollbacks).
Be careful of deadlocks and lock wait timeouts, which are more likely if you use big transactions.
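A minimal sketch of the transactional refill (report and staging are hypothetical table names; DELETE is used rather than TRUNCATE because TRUNCATE implicitly commits and is not transaction-safe):

START TRANSACTION;
DELETE FROM report;                                 # stays rollback-able, unlike TRUNCATE
INSERT INTO report SELECT id, total FROM staging;   # refill from the new data
COMMIT;                                             # readers switch to the new contents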
You can LOCK your table for the duration of your operation:
http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
A table lock protects only against inappropriate reads or writes by other sessions. The session holding the lock, even a read lock, can perform table-level operations such as DROP TABLE. Truncate operations are not transaction-safe, so an error occurs if the session attempts one during an active transaction or while holding a table lock.
I don't know enough about the internal row-versioning mechanisms of MySQL (or indeed, whether there is one), but other databases (Oracle, PostgreSQL and, more recently, SQL Server) have invested a lot of effort into allowing writers not to block readers, insofar as readers have access to the version of the rows that existed immediately before the update/write process started. Once the update is committed, that version of the rows becomes the one made available to all readers, thereby avoiding the bottleneck that the above behaviour in MySQL would introduce.
This policy ensures that table locking is deadlock free. There are, however, other things you need to be aware of about this policy: if you are using a LOW_PRIORITY WRITE lock for a table, it means only that MySQL waits for this particular lock until there are no other sessions that want a READ lock. When the session has gotten the WRITE lock and is waiting to get the lock for the next table in the lock table list, all other sessions wait for the WRITE lock to be released. If this becomes a serious problem with your application, you should consider converting some of your tables to transaction-safe tables.
You can load your data into a shadow table as slowly as you like, then instantly swap the shadow and actual with RENAME TABLE:
truncate table shadow; # make sure it is clean to start with
insert into shadow .....; # lots of inserts etc against shadow table
rename table active to temp, shadow to active, temp to shadow;
truncate table shadow; # throw away the old active data
The RENAME statement is atomic. The intermediate name "temp" is used to help swap the names of "shadow" and "active".
This should work with all storage engines.
Rename table - MySQL Manual