MariaDB. Use Transaction Rollback without locking tables - mysql

On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I noticed that an INSERT inside the transaction blocks other INSERTs into the same table, and I'm not talking about while the query is running (that's obvious), but until the transaction is closed.
DELETE is only blocked if the rows share a common index key (comments for the same page), and luckily UPDATE is not blocked.
Can I do any Transaction that does not lock the table from new inserts (while the transaction is ongoing, not the actual query), or any other method that lets me conveniently "undo" any query done after some point?
P.S.:
I start the transaction with PHP's mysqli_begin_transaction() function without any flags, and then call mysqli_commit().

I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is keep the transaction time as short as possible. For example, you can prepare all statements before you start the transaction, but that is a matter of milliseconds. If you transfer files and that can take 40 seconds, then you shouldn't do it while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates that the operation is not complete. You can also save them in a different folder, but on the same partition. Then, when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean up and remove files that were never renamed.
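The stage-then-rename idea above can be sketched as follows. This is only an illustration in Python, not the asker's PHP code: the `.pending` suffix and the function names are hypothetical, and a real cleanup job would probably also check file age before deleting anything.

```python
import os
import tempfile
from pathlib import Path

PENDING_SUFFIX = ".pending"  # hypothetical marker meaning "operation not complete"

def stage_upload(directory: Path, final_name: str, data: bytes) -> Path:
    """Slow step: write the file under a temporary name, before any transaction starts."""
    staged = directory / (final_name + PENDING_SUFFIX)
    staged.write_bytes(data)
    return staged

def finalize_upload(staged: Path) -> Path:
    """Fast step: rename to the final name; intended to run while the transaction is open."""
    final = staged.with_name(staged.name[: -len(PENDING_SUFFIX)])
    os.replace(staged, final)  # a rename in the same directory, no data is copied
    return final

def cleanup_stale(directory: Path) -> int:
    """Occasional cleanup: remove staged files that were never renamed."""
    removed = 0
    for leftover in directory.glob("*" + PENDING_SUFFIX):
        leftover.unlink()
        removed += 1
    return removed

if __name__ == "__main__":
    upload_dir = Path(tempfile.mkdtemp())
    staged = stage_upload(upload_dir, "avatar.jpg", b"image bytes")
    # ... start the database transaction, run the queries ...
    print(finalize_upload(staged).name)
    # ... commit ...
```

Keeping the staged file in the same directory (or at least on the same partition) is what keeps the rename cheap, which is the point of the advice above.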

All write operations work in similar ways: they lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also take such locks.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?

Related

Which version of record will be returned in read committed isolation of MYSQL

I have a scenario where my cluster is in read committed isolation mode and the use case is like below:
A SELECT statement takes around 1 minute to run and return its response.
Committed updates to the data can happen during this 1-minute window.
So my question is: will I get the updated record in the response, or the old record?
I read the documentation, and it mentions that phantom reads are allowed.
I am confused here, so I just want some clarity. Please help.
Using READ COMMITTED has additional effects (reference: MySQL docs):
For UPDATE or DELETE statements, InnoDB holds locks only for rows
that it updates or deletes. Record locks for nonmatching rows are
released after MySQL has evaluated the WHERE condition. This greatly
reduces the probability of deadlocks, but they can still happen.
For UPDATE statements, if a row is already locked, InnoDB performs a
“semi-consistent” read, returning the latest committed version to
MySQL so that MySQL can determine whether the row matches the WHERE
condition of the UPDATE. If the row matches (must be updated), MySQL
reads the row again and this time InnoDB either locks it or waits
for a lock on it.
There is no way concurrent updates to data can modify a given query while it is executing. It's as if every query runs in its own REPEATABLE READ snapshot, even if your transaction is READ COMMITTED.
It will return rows that had been committed at the time the statement began executing. It will not include any rows committed after the statement began.
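The rule in the two sentences above can be pictured with a toy model. This is a sketch only: the logical timestamps and the `RowVersion` type are made up for illustration, and InnoDB's real mechanism (read views and undo logs) is not modeled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RowVersion:
    key: str
    value: str
    committed_at: int  # hypothetical logical commit time

def read_as_of(versions, statement_start):
    """Return, per key, the newest version committed at or before statement_start."""
    visible = {}
    for version in sorted(versions, key=lambda v: v.committed_at):
        if version.committed_at <= statement_start:
            visible[version.key] = version.value
    return visible

history = [
    RowVersion("row1", "old", committed_at=5),
    RowVersion("row1", "new", committed_at=12),             # committed while the query runs
    RowVersion("row2", "inserted later", committed_at=15),  # also committed mid-query
]

# A statement that began at t=10 sees only what was committed by then.
print(read_as_of(history, statement_start=10))  # {'row1': 'old'}

# Only a new statement, started later, sees the newer commits.
print(read_as_of(history, statement_start=20))
```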
Re your comment:
No, there is no transaction isolation level that can change this. Even if you use READ UNCOMMITTED, a given query reads only rows that were committed at the time the query began executing.
If you want to query recent updates, you can only do it by starting a new query.
If you're concerned that you aren't getting notified about recent updates, then you need to optimize your query so it doesn't take 60 seconds to execute.
This is starting to sound like you're polling the database. Running frequent expensive queries to poll a database is an indication that perhaps you need to use a message queue instead.
Re your second comment:
Locking SQL statements, including UPDATE and DELETE and also locking SELECT statements do function like READ COMMITTED even when your transaction is REPEATABLE READ. Locking statements always read the most recent row that was committed at the time the statement started.
But they still cannot read new rows committed after the statement started, if for no other reason than that they can't get the locks on those rows.
Your original question was about SELECT statements, and I assumed you meant non-locking SELECT (that is, without the options of FOR UPDATE or LOCK IN SHARE MODE). Those SELECT statements also cannot view rows added after the SELECT started.
P.S. I have never found a good use of READ UNCOMMITTED for any purpose.
By default, InnoDB will lock the tables during processing, but there are ways to do an UNLOCKED SELECT. In that case, it will run on a versioned snapshot of the table, so any COMMIT during the processing won't alter the result.
For more information:
https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html
In all cases, the ACID property of databases will always prevent unstable functions: https://en.wikipedia.org/wiki/ACID

Mysql (myisam) : "waiting for lock" REPLACE blocking all selects

I have a MyISAM table (I can't change it to use InnoDB, so please don't suggest that) which is pretty big (~20GB).
I have a worker which regularly dumps this table (I launch it with the --skip-lock-tables option).
During the dump (which takes ~5min), concurrent SELECTs run correctly, as I would expect. When I do a REPLACE during the dump, this REPLACE is "waiting for metadata lock", which seems normal too.
But every SELECT started after the start of the REPLACE will also be "waiting for metadata lock". I can't understand why. Could you help me with this, and tell me how I can have all the SELECTs run correctly (even after this REPLACE)?
Thanks !
What is happening is:
Your worker is making a big SELECT. The SELECT is locking the table with a read lock. By the way, the skip-lock-tables only means that you are not locking all the tables at once, but the SELECT query is still locking each table individually. More info on this answer.
Your REPLACE is trying to INSERT but has to wait for the first SELECT (the dump) to finish in order to acquire a write lock. It is put in the write lock queue.
Every SELECT after the REPLACE is put in the read lock queue.
This is a behavior described in the doc on table-level locking:
Table updates are given higher priority than table retrievals. Therefore, when a lock is released, the lock is made available to the requests in the write lock queue and then to the requests in the read lock queue. This ensures that updates to a table are not “starved” even when there is heavy SELECT activity for the table.
If you want the SELECT to not wait for the REPLACE you could (never actually tested that) try the LOW_PRIORITY modifier on your replace.
If you use the LOW_PRIORITY modifier, execution of the INSERT is delayed until no other clients are reading from the table. This includes other clients that began reading while existing clients are reading, and while the INSERT LOW_PRIORITY statement is waiting. It is possible, therefore, for a client that issues an INSERT LOW_PRIORITY statement to wait for a very long time (or even forever) in a read-heavy environment. (This is in contrast to INSERT DELAYED, which lets the client continue at once.)
However, be careful: it might never run if there are always a lot of SELECTs.
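The queueing behavior described in this answer can be sketched with a small model. This is a simplification of the documented rules, not the server's implementation: it assumes one long read (the dump) already holds the table lock, and the request names are hypothetical.

```python
def classify(requests):
    """Split requests arriving while a long read holds the table lock.

    requests: list of (name, kind) in arrival order, where kind is
    'read', 'write', or 'low_priority_write'.
    Returns (running_now, waiting).
    """
    running, waiting = [], []
    normal_write_waiting = False
    for name, kind in requests:
        if kind == "read" and not normal_write_waiting:
            # Reads can share the lock with the dump as long as no
            # normal-priority write is queued ahead of them.
            running.append(name)
        else:
            waiting.append(name)
            if kind == "write":
                # A queued normal write takes priority over later reads.
                normal_write_waiting = True
    return running, waiting

# Normal REPLACE: the SELECT issued after it has to wait as well.
print(classify([("select1", "read"), ("replace", "write"), ("select2", "read")]))

# LOW_PRIORITY: later SELECTs are not held up, but the write may wait a long time.
print(classify([("select1", "read"), ("replace", "low_priority_write"), ("select2", "read")]))
```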

MySQL InnoDB: Difference Between `FOR UPDATE` and `LOCK IN SHARE MODE`

What is the exact difference between the two locking read clauses:
SELECT ... FOR UPDATE
and
SELECT ... LOCK IN SHARE MODE
And why would you need to use one over the other?
I have been trying to understand the difference between the two. I'll document what I have found in hopes it'll be useful to the next person.
Both LOCK IN SHARE MODE and FOR UPDATE ensure no other transaction can update the rows that are selected. The difference between the two is in how they treat locks while reading data.
LOCK IN SHARE MODE does not prevent another transaction from reading the same row that was locked.
FOR UPDATE prevents other locking reads of the same row (non-locking reads can still read that row; LOCK IN SHARE MODE and FOR UPDATE are locking reads).
This matters in cases like updating counters, where you read the value in one statement and update it in another. Here, using LOCK IN SHARE MODE will allow 2 transactions to read the same initial value. So if the counter was incremented by 1 by both transactions, the ending count might increase only by 1, since both transactions initially read the same value.
Using FOR UPDATE would have locked the 2nd transaction from reading the value till the first one is done. This will ensure the counter is incremented by 2.
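The compatibility rules described in this answer can be written down as a small table. This is a sketch of shared/exclusive lock compatibility only, with hypothetical function names; it does not model waiting, deadlock detection, or gap locks.

```python
# "S" = shared row lock (LOCK IN SHARE MODE)
# "X" = exclusive row lock (FOR UPDATE, UPDATE, DELETE)
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock other transactions hold on the row."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)

def request_for(statement):
    """Map a statement kind to the row lock it requests; None means a non-locking read."""
    return {
        "select": None,
        "lock_in_share_mode": "S",
        "for_update": "X",
        "update": "X",
    }[statement]

def proceeds(statement, held_by_others):
    """True if the statement can run now instead of waiting for other transactions."""
    lock = request_for(statement)
    return lock is None or can_grant(lock, held_by_others)

print(proceeds("lock_in_share_mode", ["S"]))  # True: two shared readers coexist
print(proceeds("for_update", ["X"]))          # False: second FOR UPDATE waits
print(proceeds("update", ["S"]))              # False: writers wait for shared locks
print(proceeds("select", ["X"]))              # True: non-locking reads are not blocked
```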
FOR UPDATE: You're informing MySQL that the selected rows may be updated in the next steps (before the end of this transaction), so that MySQL doesn't grant any read locks on the same set of rows to any other transaction at that moment. The other transaction (whether for read or write) has to wait until the first transaction is finished.
FOR SHARE: Indicates to MySQL that you're selecting the rows from the table only for reading, and not to modify them before the end of the transaction. Any number of transactions can acquire a read lock on the rows.
Note: There are chances of getting a deadlock if these statements (FOR UPDATE, FOR SHARE) are not used properly.
Either way the integrity of your data will be guaranteed, it's just a question of how the database guarantees it. Does it do so by raising runtime errors when transactions conflict with each other (i.e. FOR SHARE), or does it do so by serializing any transactions that would conflict with each other (i.e. FOR UPDATE)?
FOR SHARE (a.k.a. LOCK IN SHARE MODE): Transactions face a higher probability of failure due to deadlock, because they delay blocking until the moment an update statement is received (at which point they either block until all readlocks are released, or fail due to deadlock if another write is in progress). However, only one client blocks and eventually succeeds: the other clients will fail with deadlock if they try to update, so only one of them will succeed and the rest will have to retry their transactions.
FOR UPDATE: Transactions won't fail due to deadlock, because they won't be allowed to run concurrently. This may be desirable for example because it makes it easier to reason about multi-threading if all updates are serialized across all clients. However, it limits the concurrency you can achieve because all other transactions block until the first transaction is finished.
Pro-Tip: As an exercise I recommend taking some time to play with a local test database and a couple mysql clients on the command line to prove this behavior for yourself. That is how I eventually understood the difference myself, because it can be very abstract until you see it in action.

Do "SELECT ... LOCK IN SHARE MODE" and "SELECT ... FOR UPDATE" have to be inside of a transaction?

I'm reading the documentation for these commands and am confused. The descriptions for the commands mention transactions:
SELECT ... LOCK IN SHARE MODE sets a shared mode lock on any rows that
are read. Other sessions can read the rows, but cannot modify them
until your transaction commits. If any of these rows were changed by
another transaction that has not yet committed, your query waits until
that transaction ends and then uses the latest values.
For index records the search encounters, SELECT ... FOR UPDATE blocks
other sessions from doing SELECT ... LOCK IN SHARE MODE or from
reading in certain transaction isolation levels. Consistent reads will
ignore any locks set on the records that exist in the read view. (Old
versions of a record cannot be locked; they will be reconstructed by
applying undo logs on an in-memory copy of the record.)
But then the examples don't show transactions being used. Running a test command such as select * from users for update; without a transaction doesn't result in any errors (it works). Does this mean transactions don't have to be used with these commands? If so, is there any advantage to putting these commands inside of a transaction?
In InnoDB, each query is effectively run in a transaction. If you don't start a transaction explicitly (with START TRANSACTION or by setting autocommit to off), each transaction is committed as soon as the query has run. This means that if you are not in a transaction, the lock acquired with SELECT ... LOCK IN SHARE MODE will be released as soon as the query is completed. There is nothing that prevents you from doing this; it just doesn't make much sense to use locks outside of a transaction, as these locks are there to guarantee that the value you select won't change until a later query you are going to execute (for example, if you want to insert or update data in one table based on the values in another).
A transaction ensures that all the commands it contains will either run successfully or rollback.
These types of select statements affect other transactions in other sessions. So basically wrapping these in transactions is only a matter of whether you are selecting the data as part of a larger set of commands.
If you only want to select the data you should either use the shared lock or no lock at all and no need to begin a transaction.

Locking a table within a transaction

I would like to be able to lock an entire table to prevent any INSERTs or UPDATEs in it between the "beginTransaction" and the ending "commit" or "rollback".
I know that beginning a transaction results in an implicit UNLOCK TABLES and that LOCK TABLES results in an implicit COMMIT... so is there any way to do what I want?
Why? Perhaps you have missed the point of transactions.
If you use repeatable-read transaction isolation, inserts, updates etc, can happen during your transaction, BUT YOU WILL NOT SEE THEM. So as far as your process is concerned, the table is locked for inserts/updates. Except they are still happening, they're still durable to disc, and other processes can continue.
After you do your first "select", a snapshot is created, and you are effectively reading that snapshot, not the latest version. If this is what you want, repeatable-read works well for you.
select count(*) from table
within a transaction locks the table on MS SQL 2000.
If you're using PHP, then while a transaction is ongoing you can set a SESSION variable to tell the script not to do anything with the database, i.e. $_SESSION['on_going_transaction'] = true.
When the transaction is completed, just destroy the SESSION variable so that another transaction can occur. This is much easier.