Do MySQL transaction commits apply queries, or results? - mysql

If I start a transaction, run some queries, and then commit it, does the commit work by applying the results of the queries, or the queries themselves?
For instance, if my transaction contains insert into b select x from a, and the x changes after I run this query but before I commit the transaction, will the result be the value of x as it was during the transaction, or the value of x at the time of the commit?

Answer: x will have the value that it had at the timestamp of read.
To be more specific, commit is a part of the transaction, so you may not address it as during the transaction or at the time of commit, they are not two separate operation.
Clarification: Using transaction will ensure that all the operations within the transaction boundaries look atomic to outside world.
Atomic operation means that:
The operation should be either succeed or fail. There must not be an intermediary step in between. In particular, if a transaction starts and a failure (like power outage) occurs, it needs to rollback all the changes that it has made so far.
Also it means that outside world should see it as a single operation. If a transaction uses some values, all the values should be taken from the timestamp right at the start of the transaction. Similarly all the changes that transaction should make, will be available to outside world right after the timestamp at which the transaction ends. This is achieved by proper form of locking (exclusive, read or write) tables involved in the transaction.
So to be more specific regarding your question, when select x from a is called in a transaction, table a is locked till the transaction either commits or rolled back. The scenario you mentioned, x changes after I run this query (before transaction commits/rolled back), is not practically possible .

Related

Are changes made to DB only through transactions?

I am not able to get a clear complete understanding regarding the role of transactions in databases.
I know operations clubbed in a transactions will be executed together and then either committed or rolled back.
But then what about about any other query that I write to the database without manually creating a transaction.
Is a transaction created internally for them?
Also what about select statements then? Are transactions created for them too?
I have been using database and sql for some time now, alas I am not clear on these
Are changes to DBs happening only through transactions? Short answer is yes.
There is always a transaction involved:
It might be automatically started (before) and commited (after) every single DML statement you issue, if you're relying on AUTOCOMMIT behaviour of your database session
Or you may explictly start one with BEGIN, execute your statements and end it with COMMIT
I like to think a transaction as a boundary that imposes clear semantics of ATOMICITY and ISOLATION to the statements that are contained within.
You describe atomicity (all or nothing behaviour) but that is not the only guarantee that a transaction can give you: there's also isolation (and this has to do with reads you within transactions (E.g. SELECTs).
In a concurrent application (many clients writing and reading to the same db/table at the same time), transaction ISOLATION is the property that defines "how much of the effects of other operations" can be observed in the current one. For example, assume you need to perform a transaction that involves doing the same SELECT multiple times: do you want this SELECT to return (possibly) different results each time (because some modification happened concurrently) or not?
For single statements is:
A single DML (UPDATE, INSERT...) statement by itself is effectively "As if it was in a transaction with a single statement, that gets immediately commited after execution" (Either it works like this because you're in AUTOCOMMIT, or you wrapped a single statement within BEGIN...COMMIT)
For a single SELECT it's the same. The transaction in this case (implicit or not, gives you the possibility of specifiying different isolation levels). It might sound strange to consider transactions for SELECTS, but requiring particular isolation levels might mean that the db is acquiring some lock to the data under the hood: committing the transaction in that case would release such lock.
Since you tagged mysql, here you can read on transaction isolations supported by mysql:
https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html
A SQL transaction is any statement that contains Data Manipulation Language (DML). That is, any statement that changes values in a table, such as UPDATE, INSERT, MERGE, DELETE, etc.

MariaDB. Use Transaction Rollback without locking tables

On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I noticed that this locks the tables when I do an INSERT from an other INSERT, and I'm not talking while the query is running, that’s obvious, but until the transaction is not closed.
Then DELETE is only locked if they share a common index key (comments for the same page), but luckily UPDATE is no locked.
Can I do any Transaction that does not lock the table from new inserts (while the transaction is ongoing, not the actual query), or any other method that lets me conveniently "undo" any query done after some point?
PD:
I start Transaction with PHPs function mysqli_begin_transaction() without any of the flags, and then mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is to keep the transaction time as short as possible. For example you can prepare all statements before you start the transaction. But that is a matter of milliseconds. If you transfer files and it can take 40 seconds, then you shouldn't do that while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates that the operation is not complete. You can also save them in a different folder but on the same partition. Then when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean-up and remove unrenamed files.
All write operations work in similar ways -- They lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT...FOR UPDATE and SELECT...WITH SHARED LOCK also get involved.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?

Ensure MySQL transaction fails on COMMIT

I want to test a script what it does when MySQL transaction fails. So I want to simulate transaction failure on COMMIT.
Image the script opens a transaction, makes some queries, and then commits the transaction. I want to implant some extra query to the transaction, to ensure the transaction will fail on COMMIT. So I can test how the script recovers from the transation failure.
I don't want to use explicit ROLLBACK. I want to simulate some real-life case when a commit fails. The exact DB structure is unimportant. It may adapt to the solution.
Edit: I need the transaction to fail on COMMIT, not on some prior query. Therefore answers that rollback prior to COMMIT are not what I want. Such as this one: How to simulate a transaction failure?
My unsuccessful attempts:
Inserting a row with invalid PK or FK fails immediately with insert. Temporarily disabling FK checks with FOREIGN_KEY_CHECKS=0 won't help as they won't be rechecked on COMMIT. If it was psql, defferable constraints would help. But not in mysql.
Opening two parallel transactions and inserting a row with the same PK (or any column with unique constaint) in both transactions locks the later transaction on insert and waits for the former transaction. So the transaction rolls back on insert not on commit.
So I believe you can try the following
Two phase commit : Use two database ( First and second) and make
use of two phase commit. When the commit statement is supposed to be
executed you can shutdown the second db. This way your commit
operation will fail and transaction will rollback.
You executed several inserts and before your commit your database server dies. A transaction may fail if it doesn't receive commit .
Hopefully it helps!

MySQL InnoDB: Difference Between `FOR UPDATE` and `LOCK IN SHARE MODE`

What is the exact difference between the two locking read clauses:
SELECT ... FOR UPDATE
and
SELECT ... LOCK IN SHARE MODE
And why would you need to use one over the other?
I have been trying to understand the difference between the two. I'll document what I have found in hopes it'll be useful to the next person.
Both LOCK IN SHARE MODE and FOR UPDATE ensure no other transaction can update the rows that are selected. The difference between the two is in how they treat locks while reading data.
LOCK IN SHARE MODE does not prevent another transaction from reading the same row that was locked.
FOR UPDATE prevents other locking reads of the same row (non-locking reads can still read that row; LOCK IN SHARE MODE and FOR UPDATE are locking reads).
This matters in cases like updating counters, where you read value in 1 statement and update the value in another. Here using LOCK IN SHARE MODE will allow 2 transactions to read the same initial value. So if the counter was incremented by 1 by both transactions, the ending count might increase only by 1 - since both transactions initially read the same value.
Using FOR UPDATE would have locked the 2nd transaction from reading the value till the first one is done. This will ensure the counter is incremented by 2.
For Update --- You're informing Mysql that the selected rows can be updated in the next steps(before the end of this transaction) ,,so that mysql does'nt grant any read locks on the same set of rows to any other transaction at that moment. The other transaction(whether for read/write )should wait until the first transaction is finished.
For Share- Indicates to Mysql that you're selecting the rows from the table only for reading purpose and not to modify before the end of transaction. Any number of transactions can access read lock on the rows.
Note: There are chances of getting a deadlock if this statement( For update, For share) is not properly used.
Either way the integrity of your data will be guaranteed, it's just a question of how the database guarantees it. Does it do so by raising runtime errors when transactions conflict with each other (i.e. FOR SHARE), or does it do so by serializing any transactions that would conflict with each other (i.e. FOR UPDATE)?
FOR SHARE (a.k.a. LOCK IN SHARE MODE): Transactions face a higher probability of failure due to deadlock, because they delay blocking until the moment an update statement is received (at which point they either block until all readlocks are released, or fail due to deadlock if another write is in progress). However, only one client blocks and eventually succeeds: the other clients will fail with deadlock if they try to update, so only one of them will succeed and the rest will have to retry their transactions.
FOR UPDATE: Transactions won't fail due to deadlock, because they won't be allowed to run concurrently. This may be desirable for example because it makes it easier to reason about multi-threading if all updates are serialized across all clients. However, it limits the concurrency you can achieve because all other transactions block until the first transaction is finished.
Pro-Tip: As an exercise I recommend taking some time to play with a local test database and a couple mysql clients on the command line to prove this behavior for yourself. That is how I eventually understood the difference myself, because it can be very abstract until you see it in action.

Do "SELECT ... LOCK IN SHARE MODE" and "SELECT ... FOR UPDATE" have to be inside of a transaction?

I'm reading the documentation for these commands and am confused. The descriptions for the commands mention transactions:
SELECT ... LOCK IN SHARE MODE sets a shared mode lock on any rows that
are read. Other sessions can read the rows, but cannot modify them
until your transaction commits. If any of these rows were changed by
another transaction that has not yet committed, your query waits until
that transaction ends and then uses the latest values.
For index records the search encounters, SELECT ... FOR UPDATE blocks
other sessions from doing SELECT ... LOCK IN SHARE MODE or from
reading in certain transaction isolation levels. Consistent reads will
ignore any locks set on the records that exist in the read view. (Old
versions of a record cannot be locked; they will be reconstructed by
applying undo logs on an in-memory copy of the record.)
But then the examples don't show transactions being used. Running a test command such as select * from users for update; without a transaction doesn't result in any errors (it works). Does this mean transactions don't have to be used with these commands? If so, is there any advantage to putting these commands inside of a transaction?
In InnoDB each query is effectively run in a transaction. If you don't start transaction explicitly (with start transaction or by setting autocommit to off), each transaction is committed after the query run. This means that if you are not in a transaction, the lock acquired with SELECT ... IN SHARE MODE will be released as soon as the query is completed. There is nothing that prevents you from doing this, it just doesn't make much sense to use locks outside of a transaction; as these locks are to guarantee that the value you select won't change until a later query you are going to execute (like if you want to insert/update data in one table based on the values in another)
A transaction ensures that all the commands it contains will either run successfully or rollback.
These types of select statements affect other transactions in other sessions. So basically wrapping these in transactions is only a matter of whether you are selecting the data as part of a larger set of commands.
If you only want to select the data you should either use the shared lock or no lock at all and no need to begin a transaction.