Partially commit MySQL Transaction? - mysql

I want to know if there's a way to commit a transaction partially. I have a long running transaction in C# and when two users are running this transaction parallel to each other, the data is co-dependent and should show to them both even while in the transaction. For example say I have a table with these 3 columns
username | left_child | right_child
I am making a binary tree and whenever a new user is added into the database they end up somewhere in the tree. But I am running all of the insertions and updates in one transaction so if there's even one error the whole transaction can be rolled back and the structure of the tree is not disturbed. But the problem is the when two users are using my web app at the same time.
Say that username 'jackie_' does not have any children at the minute. Two new users 'king_' and 'robbo' enter parallel to each other and the transaction is running for both of them. Since the results of the transaction running for one user are not visible to the other user in the actual database yet, they both think that the left_child of 'jackie_' hasn't been set yet and so they both update the left_child to their own username. During the transaction since the update was successful for both of them, they both commit the transaction. Now I have two users but only one of them is actually successfully entered into the tree and the structure of the tree is disturbed completely.
So what I need is to be able to commit one transaction even during, "partially". So if 'robbo' got set the left_child of 'jackie_' first, the transaction implements the change into the database so when 'king_' tries to update the same row, he can't. But if along the way, if some other problem occurs for 'robbo' I still want to be able to rollback the whole transaction. Any other solutions which would be more practical are appreciated as well.
For all the queries that I am running, this is the way I am doing it in the transaction
string insertTreesQuery = "INSERT INTO tree (username) VALUES('king_')";
MySqlCommand insertTreesQueryCmd = new MySqlCommand(insertTreesQuery , con);
insertTreesQueryCmd.Transaction = sqlTrans;
insertTreesQueryCmd.ExecuteNonQuery();
where sqlTrans is the transaction that I am using for all the MySqlCommand objects before executing them

What you are asking is not possible. You cannot "partially" commit a transaction. I'm assuming that your example is greatly simplified since you mention the transactions are very large. In that case, it would probably be best to split it up into smaller ones that can be committed independently, thus reducing the chance of there being a conflict.

Related

MySQL/MariaDB InnoDB Simultaneous Transactions & Locking Behaviour

As part of the persistence process in one of my models an md5 check_sum of the entire record is generated and stored with the record. The md5 check_sum contains a flattened representation of the entire record including all EAV attributes etc. This makes preventing absolute duplicates very easy and efficient.
I am not using a unique index on this check_sum for a specific reason, I want this all to be silent, i.e. if a user submits a duplicate then the app just silently ignores it and returns the already existing record. This ensures backwards compatibility with legacy app's and api's.
I am using Laravel's eloquent. So once a record has been created and before committing the application does the following:
$taxonRecords = TaxonRecord::where('check_sum', $taxonRecord->check_sum)->get();
if ($taxonRecords->count() > 0) {
DB::rollBack();
return $taxonRecords->first();
}
However recently I encountered a 60,000/1 shot incident(odds based on record counts at that time). A single duplicate ended up in the database with the same check_sum. When I reviewed the logs I noticed that the creation time was identical down to the second. Further investigation of Apache logs showed a valid POST but the POST was duplicated. I presume the users browser malfunctioned or something but both POSTS arrived simultaneously resulting in two simultaneous transactions.
My question is how can I ensure that a transaction and its contained SELECT for the previous check_sum is Atomic & Isolated. Based upon my reading the answer lies in https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html and isolation levels.
If transaction A and transaction B arrive at the server at the same time then they should not run side by side but should wait for the first to complete.
You created a classic race condition. Both transactions are calculating the checksum while they're both in progress, not yet committed. Neither can read the other's data, since they're uncommitted. So they calculate that they're the only one with the same checksum, and they both go through and commit.
To solve this, you need to run such transactions serially, to be sure that there aren't other concurrent transactions submitting the same data.
You may have to use use GET_LOCK() before starting your transaction to calculate the checksum. Then RELEASE_LOCK() after you commit. That will make sure other concurrent requests wait for your data to be committed, so they will see it when they try to calculate their checksum.

How to properly use transactions and locks to ensure database integrity?

I develop an online reservation system. To simplify let's say that users can book multiple items and each item can be booked only once. Items are first added to the shopping cart.
App uses MySql / InnoDB database. According to MySql documentation, default isolation level is Repeatable reads.
Here is the checkout procedure I've came up with so far:
Begin transaction
Select items in the shopping cart (with for update lock)
Records from cart-item and items tables are fetched at this step.
Check if items haven't been booked by anybody else
Basically check if quantity > 0. It's more complicated in the real application, thus I put it here as a separate step.
Update items, set quantity = 0
Also perform other essential database manipulations.
Make payment (via external api like PayPal or Stripe)
No user interaction is necessary as payment details can be collected before checkout.
If everything went fine commit transaction or rollback otherwise
Continue with non-essential logic
Send e-mail etc in case of success, redirect for error.
I am unsure if that is sufficient. I'm worried whether:
Other user that tries to book same item at the same time will be handled correcly. Will his transaction T2 wait until T1 is done?
Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
I guess its enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
I'm a bit confused about SELECT ... FOR UPDATE on non-existent rows section of Willem Renzema answer. When may it become important? Could you provide any example?
Here are some resources I've read:
How to deal with concurrent updates in databases?, MySQL: Transactions vs Locking Tables, Do database transactions prevent race conditions?,
Isolation (database systems), InnoDB Locking and Transaction Model, A beginner’s guide to database locking and the lost update phenomena.
Rewrote my original question to make it more general.
Added follow-up questions.
Begin transaction
Select items in shopping cart (with for update lock)
So far so good, this will at least prevent the user from doing checkout in multiple sessions (multiple times trying to checkout the same card - good to deal with double clicks.)
Check if items haven't been booked by other user
How do you check? With a standard SELECT or with a SELECT ... FOR UPDATE? Based on step 5, I'm guessing you are checking a reserved column on the item, or something similar.
The problem here is that the SELECT ... FOR UPDATE in step 2 is NOT going to apply the FOR UPDATE lock to everything else. It is only applying to what is SELECTed: the cart-item table. Based on the name, that is going to be a different record for each cart/user. This means that other transactions will NOT be blocked from proceeding.
Make payment
Update items marking them as reserved
If everything went fine commit transaction, rollback otherwise
Following the above, based on the information you've provided, you may end up with multiple people buying the same item, if you aren't using SELECT ... FOR UPDATE on step 3.
Suggested Solution
Begin transaction
SELECT ... FOR UPDATE the cart-item table.
This will lock a double click out from running. What you select here should be the some kind of "cart ordered" column. If you do this, a second transaction will pause here and wait for the first to finish, and then read the result what the first saved to the database.
Make sure to end the checkout process here if the cart-item table says it has already been ordered.
SELECT ... FOR UPDATE the table where you record if an item has been reserved.
This will lock OTHER carts/users from being able to read those items.
Based on the result, if the items are not reserved, continue:
UPDATE ... the table in step 3, marking the item as reserved. Do any other INSERTs and UPDATEs you need, as well.
Make payment. Issue a rollback if the payment service says the payment didn't work.
Record payment, if success.
Commit transaction
Make sure you don't do anything that might fail between steps 5 and 7 (like sending emails), else you may end up with them making a payment without it being recorded, in the event the transaction gets rolled back.
Step 3 is the important step with regards to making sure two (or more) people don't try to order the same item. If two people do try, the 2nd person will end up having their webpage "hang" while it processes the first. Then when the first finishes, the 2nd will read the "reserved" column, and you can return a message to the user that someone has already purchased that item.
Payment in transaction or not
This is subjective. Generally, you want to close transactions as quickly as possible, to avoid multiple people being locked out from interacting with the database at once.
However, in this case, you actually do want them to wait. It's just a matter of how long.
If you choose to commit the transaction before payment, you'll need to record your progress in some intermediate table, run the payment, and then record the result. Be aware that if the payment fails, you'll then have to manually undo the item reservation records that you updated.
SELECT ... FOR UPDATE on non-existent rows
Just a word of warning, in case your table design involves inserting rows where you need to earlier SELECT ... FOR UPDATE: If a row doesn't exist, that transaction will NOT cause other transactions to wait, if they also SELECT ... FOR UPDATE the same non-existent row.
So, make sure to always serialize your requests by doing a SELECT ... FOR UPDATE on a row that you know exists first. Then you can SELECT ... FOR UPDATE on the row that may or may not exist yet. (Don't try to do just a SELECT on the row that may or may not exist, as you'll be reading the state of the row at the time the transaction started, not at the moment you run the SELECT. So, SELECT ... FOR UPDATE on non-existent rows is still something you need to do in order to get the most up to date information, just be aware it will not cause other transactions to wait.)
1. Other user that tries to book same item at the same time will be handled correcly. Will his transaction T2 wait until T1 is done?
Yes. While active transaction keeps FOR UPDATE lock on a record, statements in other transactions that use any lock (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended untill either active transaction commits or "Lock wait timeout" is exceeded.
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
This will not be a problem, as this is exactly what is necessary. Checkout transactions should be executed sequentially, ie. latter checkout should not start before former finish.
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
Repeatable reads isolation level ensures that changes made by a transaction are not visible until that transaction is commited. Therefore items availability will be displayed correctly. Nothing will be shown unavailable before it is actually paid for. No locks are necessary.
SELECT ... LOCK IN SHARE MODE would cause checkout transaction to wait until it is finished. This could slow down checkouts without giving any payoff.
4. Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
It is possible. Transaction may be rolled back when "Lock wait timeout" is exceeded or when deadlock happens. In that case it would be a good idea to retry it automatically.
By default suspended statements fail after 50s.
5. I guess its enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
Yes, SELECT ... FOR UPDATE on items table should be enough.
Yes, these selects wait, because FOR UPDATE is an exclusive lock.
Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE lock all read records, so whatever we JOIN is included. See MySql Docs.
What's interesting (at least for me) everything that is scanned in the processing of the SQL statement gets locked, no matter wheter it is selected or not. For example WHERE id < 10 would lock also the record with id = 10!
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.

Should I commit after a single select

I am working with MySQL 5.0 from python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
cursor = connection.cursor()
cursor.execute("SELECT * FROM MyTable")
return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections can still be operating on the database in the mean time.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are thwo things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ which means that if you run a SELECT twice inside a transaction you will see exactly the same data even if other transactions have committed changes.
Most of the time people expect to see committed changes when running the second select statement - which is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL and you do expect to see changes in the database if you run a SELECT twice in the same transaction - then you can't do it in the "same" transaction and you need to commit your first SELECT statement.
If you actually want to see a consistent state of the data in your transaction then you should not commit apparently.
then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?
That totally depends on your definition of "is transactional". Anything you do in a relational database "is transactional" (That's not entirely true for MySQL actually, but for the sake of argumentation you can assume this if you are only using InnoDB as your storage engine).
If that "first process" only selects data (i.e. a "read only transaction"), then of course the commit will work. If it tried to modify data that another transaction has already committed and you are running with REPEATABLE READ you probably get an error (after waiting until any locks have been released). I'm not 100% about MySQL's behaviour in that case.
You should really try this manually with two different sessions using your favorite SQL client to understand the behaviour. Do change your isolation level as well to see the effects of the different levels too.

Updating account balances with mysql

I have a field on a User table that holds the account balance for the user. Users can perform a lot of actions with my service that will result in rapid changes to their balance.
I'm trying to use mysql's serializable isolation level to make sure that multiple user actions will not update the value incorrectly. (Action A and action B simultaneously want to deduct 1 dollar from the balance.) However, I'm getting a lot of deadlock errors.
How do I do this correctly without getting all these deadlocks, and still keeping the balance field up to date?
simple schema: user has an id and a balance.
im using doctrine, so i'm doing something like the following:
$con->beginTransaction();
$tx = $con->transaction;
$tx->setIsolation('SERIALIZABLE');
$user = UserTable::getInstance()->find($userId);
$user->setBalance($user->getBalance() + $change);
$user->save();
$con->commit();
First trying to use serializable isolation level on your transaction is a good idea. It means you know at least a minimum what a transation is, and that the isolation level is one of the biggest problem.
Note that serializable is not really a true seriability. More on that on this previous answer, when you'll have some time to read it :-).
But the most important part is that you should consider that having automatic rollbacks on your transaction because of failed serialibility is a normal fact, and that the right thing to do is building your application so that transactions could fail and should be replayed.
One simple solution, and for accounting things I like this simple solution as we can predict all the facts, no suprises, so, one solution is to perform table locks. This is not a fine and elegant solution, no row levels locks, just simple big table locks (and always in the same order). After that you can do your operation as a single player and then release teh locks. Not multi user concurrency on the rows of the tables, no next-row magical locks fails (see previous link). This will certainly slow down your write operations, but if everybody performs the table locks in the same order you'll only get locks timeouts problems, no deadlocks and no 'unserializable auto-rollback'.
Edit
From your code sample I'm not sure you can set the transaction isolation level after the begin. You should activate query logs on MySQL and seewhat is done, then check that other transactions runned by the CMS are not still in the serializable level.

How can i found out if there is transaction open in MySQL?

How can i find out if there is transaction open in mySQL? I need to start new one if there is no transaction open, but i don't want to start new one if there is one running, because that would commit that running transaction.
UPDATE:
i need to query database in one method of my application, but that query could be called as part of bigger transaction, or just as it should be a transaction on its own. Changing application to track if it has open a transaction would be more difficult, as it could started from many pieces of code. Although it would be possible, i'm looking for solution that would be faster to implement. Simple if statement in sql would be effortless.
Thank you
I am assuming you are doing this as a one-off and not trying to establish something that can be done programatically. You can get the list of currently active processes using: SHOW PROCESSLIST
http://dev.mysql.com/doc/refman/5.1/en/show-processlist.html
If you want something programatic, then I would suggest an explicit lock on a table, at the beginning of your transaction and release it at the end.