For example, I am trying to create a new record in my MySQL database. When I get sql.ErrTxDone, what does it actually mean, and what should I do in case the transaction was already committed?
You get this error if a transaction is in a state where it cannot be used anymore.
sql.Tx:
After a call to Commit or Rollback, all operations on the transaction fail with ErrTxDone.
And also sql.ErrTxDone:
ErrTxDone is returned by any operation that is performed on a transaction that has already been committed or rolled back.
var ErrTxDone = errors.New("sql: transaction has already been committed or rolled back")
What should you do? Don't use the transaction anymore. If you have further tasks, do them outside of it or in another transaction.
If you have tasks that should be in the same transaction, don't commit it until you do everything you have to. If the transaction was rolled back (e.g. due to a previous error), you have no choice but to retry (using another transaction) or report failure.
If you're already using transactions, try to put everything that needs to happen all-or-nothing into the transaction. That's the point of transactions: either everything in it gets applied, or none of it does. If you use them properly, you don't have to think about cleaning up after them. They either succeed and you're happy, or they don't and you either retry or report an error, but you don't have to do any cleanup.
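The all-or-nothing behaviour described above can be sketched as follows. This is only a minimal illustration in Python, using the standard-library sqlite3 module as a stand-in for MySQL; the record and audit tables and the create_record_with_audit function are hypothetical names that do not come from the question.

```python
import sqlite3

def create_record_with_audit(conn, name, fail=False):
    """Insert a record and an audit row in one transaction; undo both on error."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO record (name) VALUES (?)", (name,))
        if fail:
            # Simulated failure halfway through the unit of work.
            raise RuntimeError("simulated failure in the middle of the transaction")
        conn.execute("INSERT INTO audit (note) VALUES (?)", ("created " + name,))
        conn.execute("COMMIT")  # commit only once everything has been done
        return True
    except Exception:
        conn.execute("ROLLBACK")  # nothing from this attempt remains
        return False

# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE record (name TEXT)")
conn.execute("CREATE TABLE audit (note TEXT)")

ok_first = create_record_with_audit(conn, "a", fail=True)  # rolled back
ok_second = create_record_with_audit(conn, "b")            # committed
record_count = conn.execute("SELECT COUNT(*) FROM record").fetchone()[0]
audit_count = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
```

After the failed attempt there is nothing to clean up: the first insert was undone by the rollback, and the retry runs in a fresh transaction.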
On a website, when a user posts a comment I do several queries, Inserts and Updates. (On MariaDB 10.1.29)
I use START TRANSACTION so if any query fails at any given point I can easily do a rollback and delete all changes.
Now I noticed that this blocks an INSERT made while another INSERT's transaction is open. I'm not talking about while the query is running, that's obvious, but until the transaction is closed.
DELETE is only blocked if the rows share a common index key (comments for the same page), and luckily UPDATE is not blocked.
Can I do any Transaction that does not lock the table from new inserts (while the transaction is ongoing, not the actual query), or any other method that lets me conveniently "undo" any query done after some point?
P.S.:
I start the transaction with PHP's function mysqli_begin_transaction() without any of the flags, and then call mysqli_commit().
I don't think that a simple INSERT would block other inserts for longer than the insert time. AUTO_INC locks are not held for the full transaction time.
But if two transactions try to UPDATE the same row like in the following statement (two replies to the same comment)
UPDATE comment SET replies=replies+1 WHERE com_id = ?
the second one will have to wait until the first one is committed. You need that lock to keep the count (replies) consistent.
I think all you can do is to keep the transaction time as short as possible. For example, you can prepare all statements before you start the transaction. But that is a matter of milliseconds. If you transfer files and it can take 40 seconds, then you shouldn't do that while the database transaction is open. Transfer the files before you start the transaction and save them with a name that indicates that the operation is not complete. You can also save them in a different folder but on the same partition. Then when you run the transaction, you just need to rename the files, which should not take much time. From time to time you can clean up and remove files that were never renamed.
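The "upload first, rename inside the transaction" idea could look roughly like this. It is a sketch in Python under stated assumptions: the ".partial" suffix and the function names stage_upload, finalize_upload and cleanup_partials are made up for illustration, and a real cleanup job would probably only delete partial files older than some age.

```python
import os
import tempfile

PARTIAL_SUFFIX = ".partial"  # hypothetical marker for "operation not complete"

def stage_upload(folder, name, data):
    """Slow step, done BEFORE the database transaction starts."""
    path = os.path.join(folder, name + PARTIAL_SUFFIX)
    with open(path, "wb") as f:
        f.write(data)
    return path

def finalize_upload(staged_path):
    """Fast step, done while the transaction is open: just a rename."""
    final_path = staged_path[: -len(PARTIAL_SUFFIX)]
    os.replace(staged_path, final_path)  # rename within the same partition
    return final_path

def cleanup_partials(folder):
    """Periodic cleanup: remove files that were never renamed."""
    removed = []
    for entry in os.listdir(folder):
        if entry.endswith(PARTIAL_SUFFIX):
            os.remove(os.path.join(folder, entry))
            removed.append(entry)
    return sorted(removed)

folder = tempfile.mkdtemp()
staged = stage_upload(folder, "photo.jpg", b"image bytes")
final_path = finalize_upload(staged)          # would run inside the transaction
stage_upload(folder, "abandoned.jpg", b"x")   # an upload whose transaction never ran
removed = cleanup_partials(folder)
remaining = sorted(os.listdir(folder))
```

Only finalize_upload needs to run while the transaction is open, so locks are held for the duration of a rename rather than a file transfer.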
All write operations work in similar ways -- They lock the rows that they touch (or might touch) from the time the statement is executed until the transaction is closed via either COMMIT or ROLLBACK. SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE also get involved.
When a write operation occurs, deadlock checking is done.
In some situations, there is "gap" locking. Did com_id happen to be the last id in the table?
Did you leave out any SELECTs that needed FOR UPDATE?
I want to know if there's a way to commit a transaction partially. I have a long running transaction in C# and when two users are running this transaction parallel to each other, the data is co-dependent and should show to them both even while in the transaction. For example say I have a table with these 3 columns
username | left_child | right_child
I am making a binary tree and whenever a new user is added into the database they end up somewhere in the tree. But I am running all of the insertions and updates in one transaction so if there's even one error the whole transaction can be rolled back and the structure of the tree is not disturbed. But the problem arises when two users are using my web app at the same time.
Say that username 'jackie_' does not have any children at the minute. Two new users 'king_' and 'robbo' enter parallel to each other and the transaction is running for both of them. Since the results of the transaction running for one user are not visible to the other user in the actual database yet, they both think that the left_child of 'jackie_' hasn't been set yet and so they both update the left_child to their own username. During the transaction since the update was successful for both of them, they both commit the transaction. Now I have two users but only one of them is actually successfully entered into the tree and the structure of the tree is disturbed completely.
So what I need is to be able to commit a transaction "partially" while it is still running. So if 'robbo' sets the left_child of 'jackie_' first, the transaction writes that change to the database, so that when 'king_' tries to update the same row, he can't. But if some other problem occurs for 'robbo' along the way, I still want to be able to roll back the whole transaction. Any other solutions which would be more practical are appreciated as well.
For all the queries that I am running, this is the way I am doing it in the transaction
string insertTreesQuery = "INSERT INTO tree (username) VALUES('king_')";
MySqlCommand insertTreesQueryCmd = new MySqlCommand(insertTreesQuery , con);
insertTreesQueryCmd.Transaction = sqlTrans;
insertTreesQueryCmd.ExecuteNonQuery();
where sqlTrans is the transaction that I am using for all the MySqlCommand objects before executing them
What you are asking is not possible. You cannot "partially" commit a transaction. I'm assuming that your example is greatly simplified since you mention the transactions are very large. In that case, it would probably be best to split it up into smaller ones that can be committed independently, thus reducing the chance of there being a conflict.
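One possible alternative, offered here as an assumption rather than something the answer above proposes, is to let the UPDATE itself decide who wins by making it conditional on the slot still being empty. In InnoDB, an UPDATE waits for a row locked by another transaction and then evaluates its WHERE clause against the latest committed row, so the second writer would match zero rows. The sketch below uses Python's sqlite3 as a stand-in for MySQL and runs the two attempts one after the other on a single connection, so it shows the check but not the waiting; claim_left_child is a hypothetical helper name.

```python
import sqlite3

def claim_left_child(conn, parent, child):
    """Set parent's left_child only if it is still empty; return True on success."""
    cur = conn.execute(
        "UPDATE tree SET left_child = ? WHERE username = ? AND left_child IS NULL",
        (child, parent),
    )
    # rowcount is 1 if this caller claimed the slot, 0 if someone else already did.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tree (username TEXT PRIMARY KEY, left_child TEXT, right_child TEXT)"
)
conn.execute("INSERT INTO tree (username) VALUES ('jackie_')")

robbo_won = claim_left_child(conn, "jackie_", "robbo")  # slot was empty
king_won = claim_left_child(conn, "jackie_", "king_")   # slot already taken
left_child = conn.execute(
    "SELECT left_child FROM tree WHERE username = 'jackie_'"
).fetchone()[0]
```

When the helper returns False, the caller can pick another position in the tree or roll back its own transaction, instead of silently overwriting the other user's entry.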
I develop an online reservation system. To simplify let's say that users can book multiple items and each item can be booked only once. Items are first added to the shopping cart.
The app uses a MySQL / InnoDB database. According to the MySQL documentation, the default isolation level is REPEATABLE READ.
Here is the checkout procedure I've come up with so far:
1. Begin transaction
2. Select items in the shopping cart (with for update lock)
   Records from cart-item and items tables are fetched at this step.
3. Check if items haven't been booked by anybody else
   Basically check if quantity > 0. It's more complicated in the real application, thus I put it here as a separate step.
4. Update items, set quantity = 0
   Also perform other essential database manipulations.
5. Make payment (via external API like PayPal or Stripe)
   No user interaction is necessary as payment details can be collected before checkout.
6. If everything went fine commit transaction or rollback otherwise
7. Continue with non-essential logic
   Send e-mail etc. in case of success, redirect for error.
I am unsure if that is sufficient. I'm worried whether:
1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
4. Is it possible that MySQL rolls back a transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
5. I guess it's enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
7. I'm a bit confused about the "SELECT ... FOR UPDATE on non-existent rows" section of Willem Renzema's answer. When may it become important? Could you provide any example?
Here are some resources I've read:
How to deal with concurrent updates in databases?, MySQL: Transactions vs Locking Tables, Do database transactions prevent race conditions?,
Isolation (database systems), InnoDB Locking and Transaction Model, A beginner’s guide to database locking and the lost update phenomena.
Rewrote my original question to make it more general.
Added follow-up questions.
1. Begin transaction
2. Select items in shopping cart (with for update lock)
So far so good, this will at least prevent the user from doing checkout in multiple sessions (trying to check out the same cart multiple times, which is good for dealing with double clicks).
3. Check if items haven't been booked by other user
How do you check? With a standard SELECT or with a SELECT ... FOR UPDATE? Based on step 5, I'm guessing you are checking a reserved column on the item, or something similar.
The problem here is that the SELECT ... FOR UPDATE in step 2 is NOT going to apply the FOR UPDATE lock to everything else. It is only applying to what is SELECTed: the cart-item table. Based on the name, that is going to be a different record for each cart/user. This means that other transactions will NOT be blocked from proceeding.
4. Make payment
5. Update items marking them as reserved
6. If everything went fine commit transaction, rollback otherwise
Following the above, based on the information you've provided, you may end up with multiple people buying the same item, if you aren't using SELECT ... FOR UPDATE on step 3.
Suggested Solution
1. Begin transaction
2. SELECT ... FOR UPDATE the cart-item table.
   This will lock a double click out from running. What you select here should be some kind of "cart ordered" column. If you do this, a second transaction will pause here and wait for the first to finish, and then read the result that the first saved to the database.
   Make sure to end the checkout process here if the cart-item table says it has already been ordered.
3. SELECT ... FOR UPDATE the table where you record if an item has been reserved.
   This will block OTHER carts/users from reading those items with a locking read (SELECT ... FOR UPDATE) until this transaction finishes.
   Based on the result, if the items are not reserved, continue:
4. UPDATE ... the table in step 3, marking the item as reserved. Do any other INSERTs and UPDATEs you need, as well.
5. Make payment. Issue a rollback if the payment service says the payment didn't work.
6. Record payment, if success.
7. Commit transaction
Make sure you don't do anything that might fail between steps 5 and 7 (like sending emails), else you may end up with them making a payment without it being recorded, in the event the transaction gets rolled back.
Step 3 is the important step with regards to making sure two (or more) people don't try to order the same item. If two people do try, the 2nd person will end up having their webpage "hang" while it processes the first. Then when the first finishes, the 2nd will read the "reserved" column, and you can return a message to the user that someone has already purchased that item.
Payment in transaction or not
This is subjective. Generally, you want to close transactions as quickly as possible, to avoid multiple people being locked out from interacting with the database at once.
However, in this case, you actually do want them to wait. It's just a matter of how long.
If you choose to commit the transaction before payment, you'll need to record your progress in some intermediate table, run the payment, and then record the result. Be aware that if the payment fails, you'll then have to manually undo the item reservation records that you updated.
SELECT ... FOR UPDATE on non-existent rows
Just a word of warning, in case your table design involves inserting rows where you need to earlier SELECT ... FOR UPDATE: If a row doesn't exist, that transaction will NOT cause other transactions to wait, if they also SELECT ... FOR UPDATE the same non-existent row.
So, make sure to always serialize your requests by doing a SELECT ... FOR UPDATE on a row that you know exists first. Then you can SELECT ... FOR UPDATE on the row that may or may not exist yet. (Don't try to do just a SELECT on the row that may or may not exist, as you'll be reading the state of the row at the time the transaction started, not at the moment you run the SELECT. So, SELECT ... FOR UPDATE on non-existent rows is still something you need to do in order to get the most up to date information, just be aware it will not cause other transactions to wait.)
1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
Yes. While the active transaction holds a FOR UPDATE lock on a record, statements in other transactions that need any lock on it (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended until either the active transaction commits or rolls back, or "Lock wait timeout" is exceeded.
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
This will not be a problem, as this is exactly what is necessary. Checkout transactions for the same items should be executed sequentially, i.e. a later checkout should not proceed before the earlier one finishes.
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
The REPEATABLE READ isolation level ensures that changes made by a transaction are not visible to others until that transaction is committed. Therefore item availability will be displayed correctly. Nothing will be shown as unavailable before it is actually paid for. No locks are necessary.
SELECT ... LOCK IN SHARE MODE would cause a checkout transaction to wait until the reading transaction is finished. This could slow down checkouts without giving any payoff.
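4. Is it possible that MySQL rolls back a transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
It is possible. InnoDB rolls back a transaction when it is chosen as the victim of a deadlock. When "Lock wait timeout" is exceeded, by default only the statement that was waiting is rolled back; the whole transaction is rolled back only if innodb_rollback_on_timeout is enabled. In either case it would be a good idea to roll back and retry the whole transaction automatically.
By default suspended statements fail after 50 seconds (innodb_lock_wait_timeout).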
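An automatic retry could be sketched like this. The MySQL error numbers 1213 (deadlock) and 1205 (lock wait timeout) are real; everything else is an assumption for illustration: DbError is a hypothetical stand-in for whatever exception your driver raises, and run_with_retry and flaky_checkout are made-up names. The work function is assumed to roll back and restart the whole transaction on each call.

```python
RETRYABLE_ERRNOS = {1205, 1213}  # lock wait timeout exceeded, deadlock found

class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error number."""
    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno

def run_with_retry(work, max_attempts=3):
    """Call work() until it succeeds, retrying only on retryable lock errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except DbError as exc:
            # Give up on non-lock errors, or once the attempts are used up.
            if exc.errno not in RETRYABLE_ERRNOS or attempt == max_attempts:
                raise

attempts = []

def flaky_checkout():
    """Simulated transaction: deadlocks twice, then commits."""
    attempts.append(1)
    if len(attempts) < 3:
        raise DbError(1213, "Deadlock found when trying to get lock")
    return "committed"

result = run_with_retry(flaky_checkout)
```

Errors that are not lock-related (for example a duplicate key) are re-raised immediately, since retrying them would not help.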
5. I guess it's enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
Yes, SELECT ... FOR UPDATE on items table should be enough.
Yes, these selects wait, because FOR UPDATE is an exclusive lock.
Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE lock all read records, so whatever we JOIN is included. See MySql Docs.
What's interesting (at least for me) is that everything that is scanned in the processing of the SQL statement gets locked, no matter whether it is selected or not. For example, WHERE id < 10 would also lock the record with id = 10!
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.
If I start a transaction, run some queries, and then commit it, does the commit work by applying the results of the queries, or the queries themselves?
For instance, if my transaction contains insert into b select x from a, and the x changes after I run this query but before I commit the transaction, will the result be the value of x as it was during the transaction, or the value of x at the time of the commit?
Answer: x will have the value that it had at the timestamp of read.
To be more specific, the commit is part of the transaction, so it does not make sense to distinguish "during the transaction" from "at the time of the commit": they are not two separate operations.
Clarification: Using transaction will ensure that all the operations within the transaction boundaries look atomic to outside world.
Atomic operation means that:
The operation should either succeed or fail. There must not be an intermediate state in between. In particular, if a transaction starts and a failure (like a power outage) occurs, all the changes that it has made so far need to be rolled back.
It also means that the outside world should see it as a single operation. If a transaction uses some values, all the values should be taken from the timestamp right at the start of the transaction. Similarly, all the changes that the transaction makes will become available to the outside world right after the timestamp at which the transaction ends. This is achieved by a proper form of locking (exclusive, read or write) on the tables involved in the transaction.
So, to be more specific regarding your question: when select x from a is called in a transaction, table a is locked until the transaction is either committed or rolled back. The scenario you mentioned, where x changes after the query runs but before the transaction commits or rolls back, is therefore not possible in practice.