optimistic locking user credit management - mysql

I have a central database for handling user credit, with multiple servers reading from and writing to it. The application that sits on top of these servers serves user requests by doing the following for each request:
1. check if user has enough credit for the task by reading from db.
2. perform the time consuming request
3. deduct a credit from user account, save the new credit count back to db.
The application uses the database's optimistic locking, so the following might happen:
1. request A comes in and sees that user x has enough credit,
2. request B comes in and sees that user x has enough credit,
3. A performs its work
4. A saves the new credit count back to db
5. B performs its work
6. B tries to save the new credit count back to db; the application gets an exception and fails to account for this credit deduction.
With pessimistic locking, the application would need to explicitly get a lock on the user account to guarantee exclusive access, but this kills performance since the system has many concurrent requests.
So what would be a good new design for this credit system?

Here are two "locking" mechanisms that avoid using InnoDB's locking mechanism, for either of two reasons:
A task that takes longer than you should spend in a BEGIN...COMMIT of InnoDB.
A task that ends in a different program (or different web page) than it started in.
Plan A. (This assumes the race condition is rare, and that the time wasted on Step 2 is acceptable in those rare cases.)
1. (same) Check if the user has enough credit for the task by reading from db.
2. (same) Perform the time-consuming request.
3. (added) START TRANSACTION;
4. (added) Check again that the user has enough credit. (ROLLBACK and abort if not.)
5. (same as old #3) Deduct a credit from the user account and save the new credit count back to db.
6. (added) COMMIT;
START..COMMIT is InnoDB transaction stuff. If a race condition caused 'x' to run out of credit by step 4, you will ROLLBACK and not perform steps 5 and 6.
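Steps 3 to 6 of Plan A can be sketched as follows. This is a minimal illustration, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the `users` table, its columns, and the `deduct_after_work` function are hypothetical names. On MySQL/InnoDB the re-check would typically be a locking read (SELECT ... FOR UPDATE) inside START TRANSACTION.

```python
import sqlite3


def deduct_after_work(conn, user_id, cost=1):
    """Re-check the balance and deduct inside one short transaction."""
    # BEGIN IMMEDIATE is SQLite's way of taking the write lock up front;
    # it stands in for START TRANSACTION plus a locking read on MySQL.
    conn.execute("BEGIN IMMEDIATE")
    row = conn.execute(
        "SELECT credit FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None or row[0] < cost:
        conn.execute("ROLLBACK")  # step 4 failed: credit ran out meanwhile
        return False
    conn.execute(
        "UPDATE users SET credit = credit - ? WHERE id = ?", (cost, user_id)
    )
    conn.execute("COMMIT")
    return True


# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT/ROLLBACK statements above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, credit INTEGER NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 1)")

first = deduct_after_work(conn, 1)   # the only credit is spent here
second = deduct_after_work(conn, 1)  # the re-check fails, nothing is deducted
remaining = conn.execute("SELECT credit FROM users WHERE id = 1").fetchone()[0]
```

The transaction stays short because the time-consuming work has already finished before it starts.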
Plan B. (This is more complex, but you might prefer it.)
1. Have a table Locks for locking. It contains user_id and a timestamp.
2. START TRANSACTION;
3. If user_id is in Locks, abort (ROLLBACK and exit).
4. INSERT the user_id and current_timestamp INTO Locks (thereby "locking" 'x').
5. COMMIT;
6. Perform the processing (original Steps 1,2,3).
7. DELETE FROM Locks WHERE user_id = 'x'; (autocommit=1 suffices here.)
A potential problem: If the processing dies in step 6, not getting around to releasing the lock, that user will be locked out forever. The 'solution' is to periodically check Locks for any timestamps that are 'very' old. If any are found, assume that the processing died, and DELETE the row(s).
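A rough sketch of Plan B, including the cleanup of stale locks, might look like this. It is a variation rather than a literal transcription: it assumes `user_id` is the primary key of the lock table, so the INSERT itself fails when the user is already locked and the separate check in step 3 is not needed. The table name `locks`, the function names, and the one-hour threshold are hypothetical, and SQLite (via Python's sqlite3 module) stands in for MySQL.

```python
import sqlite3
import time

STALE_AFTER_SECONDS = 3600  # hypothetical cutoff for a "very old" lock


def acquire(conn, user_id, now=None):
    """Try to lock the user; return False if somebody else holds the lock."""
    now = time.time() if now is None else now
    try:
        conn.execute(
            "INSERT INTO locks (user_id, locked_at) VALUES (?, ?)", (user_id, now)
        )
        return True
    except sqlite3.IntegrityError:  # primary key clash: already locked
        return False


def release(conn, user_id):
    conn.execute("DELETE FROM locks WHERE user_id = ?", (user_id,))


def reap_stale(conn, now=None):
    """Delete locks whose holder presumably died; return how many were removed."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "DELETE FROM locks WHERE locked_at < ?", (now - STALE_AFTER_SECONDS,)
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE locks (user_id INTEGER PRIMARY KEY, locked_at REAL NOT NULL)")

got_first = acquire(conn, 42, now=0)    # lock taken
got_second = acquire(conn, 42, now=10)  # refused while the lock is held
reaped = reap_stale(conn, now=STALE_AFTER_SECONDS + 1)  # holder never released
got_third = acquire(conn, 42, now=STALE_AFTER_SECONDS + 2)
release(conn, 42)
locks_left = conn.execute("SELECT COUNT(*) FROM locks").fetchone()[0]
```

The cutoff has to be comfortably longer than the slowest legitimate task, otherwise the reaper would release a lock that is still in use.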

You didn't state explicitly what you want to achieve, so I assume you don't want to perform the work just to realise it has been in vain due to low credit.
No-lock
Implement a credit hold at step (1) and associate the work (2) and the deduction (3) with the hold. This way a low-credit user won't pass step (1).
Optimistic locking
As a collision is detected in optimistic locking post factum, I don't think it fits the assumption.
Pessimistic locking
It isn't possible to tell definitively without knowing the schema, but I think it's an exaggeration to say it kills performance. You can smartly incorporate MySQL InnoDB transaction isolation levels and locking reads at a finer granularity than exclusively locking a user account completely. For instance, use SELECT ... LOCK IN SHARE MODE, which sets shared locks and still allows reads by other transactions.
Rick's caution about the task taking longer than MySQL will wait (innodb_lock_wait_timeout) applies here.

You want The Escrow Transactional Method.
You record two numbers: the credit left after doling some out to each updating process, and the credit doled out to (i.e. held in escrow for) those processes. A process retries, until it succeeds, a transaction that increases the credit doled out by what it needs and decreases the credit left by the same amount; the transaction succeeds only if it would leave the credit left non-negative. Then the process does its long calculation. Regardless of whether the calculation succeeds, it then applies a transaction that decreases the credit doled out. On success that transaction also increases the assets, while on failure it increases the credit left.
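The escrow bookkeeping described above can be sketched roughly like this. The `accounts` table, the column names (`credit_left`, `credit_held`, `spent`) and the functions `reserve` and `settle` are hypothetical, and SQLite via Python's sqlite3 module stands in for MySQL. The sketch returns False instead of retrying when the reservation does not fit; a caller could retry or reject the request at that point.

```python
import sqlite3


def reserve(conn, user_id, amount):
    """Move credit from 'left' to 'held'; fails if it would go negative."""
    cur = conn.execute(
        "UPDATE accounts "
        "SET credit_left = credit_left - ?, credit_held = credit_held + ? "
        "WHERE user_id = ? AND credit_left >= ?",
        (amount, amount, user_id, amount),
    )
    return cur.rowcount == 1  # zero rows touched means not enough credit


def settle(conn, user_id, amount, success):
    """Release the hold: book it as spent on success, refund it on failure."""
    target = "spent" if success else "credit_left"
    conn.execute(
        f"UPDATE accounts SET credit_held = credit_held - ?, {target} = {target} + ? "
        "WHERE user_id = ?",
        (amount, amount, user_id),
    )


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute(
    "CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, credit_left INTEGER, "
    "credit_held INTEGER, spent INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 1, 0, 0)")

a_reserved = reserve(conn, 1, 1)      # request A holds the only credit
b_reserved = reserve(conn, 1, 1)      # request B is refused before doing any work
settle(conn, 1, 1, success=False)     # A's work failed: the credit is refunded
c_reserved = reserve(conn, 1, 1)      # so a later request can hold it again
settle(conn, 1, 1, success=True)      # and this time it is spent
state = conn.execute(
    "SELECT credit_left, credit_held, spent FROM accounts WHERE user_id = 1"
).fetchone()
```

Each statement is a single short UPDATE, so no lock is held while the long calculation runs.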

Use the timestamp/rowversion approach that you will find in all real database engines except MySQL.
You can emulate them with MySQL in this way. Have a TIMESTAMP column (updated) that gets updated whenever a row is updated. Select that column along with the rest of the data you require. Use the returned timestamp as a condition in your WHERE clause so that the row will only be updated when the timestamp is still the same as when you read the row.
UPDATE table SET col1 = value WHERE id = 1 AND updated = timestamp_value_read
Now when you run the update and the timestamps do not match, no update will be performed. You can test for this by checking rows affected: if zero rows were updated, then you know that the row was modified between the read and the write. Handle that condition in your code in whichever way is best for your application and your users.
Timestamp tutorial
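The compare-on-write idea in this answer can be sketched as follows. To keep the example deterministic it uses an integer `version` column instead of a TIMESTAMP column (a swapped-in variant of the same technique), and it runs on SQLite through Python's sqlite3 module rather than on MySQL. The `users` table and the helper names are hypothetical.

```python
import sqlite3


def read_credit(conn, user_id):
    """Return (credit, version) as seen at read time."""
    return conn.execute(
        "SELECT credit, version FROM users WHERE id = ?", (user_id,)
    ).fetchone()


def write_credit(conn, user_id, new_credit, seen_version):
    """Write only if nobody changed the row since it was read."""
    cur = conn.execute(
        "UPDATE users SET credit = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_credit, user_id, seen_version),
    )
    return cur.rowcount == 1  # zero rows affected means the row moved on


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, credit INTEGER NOT NULL, "
    "version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO users (id, credit) VALUES (1, 5)")

credit_a, version_a = read_credit(conn, 1)  # request A reads
credit_b, version_b = read_credit(conn, 1)  # request B reads the same state
a_saved = write_credit(conn, 1, credit_a - 1, version_a)
b_saved = write_credit(conn, 1, credit_b - 1, version_b)  # stale, rejected

# B handles the conflict by re-reading and trying again.
credit_b, version_b = read_credit(conn, 1)
b_retry_saved = write_credit(conn, 1, credit_b - 1, version_b)
final_credit = read_credit(conn, 1)[0]
```

Without the retry, B's deduction would simply be lost, which is the failure described in the question.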

Related

SQL: why we need transaction for ticket booking?

I have read that transactions are usually used in movie ticket booking websites to solve the concurrent purchase problem. However, I fail to understand why they are necessary.
If at the same time, 2 users book the same seat (ID = 1) on the same show (ID = 99), can't you simply issue the following SQL command?
UPDATE seat_db
SET takenByUserID=someUserId
WHERE showID=99 AND seatID=1 AND takenByUserID IS NULL
As far as I can see, this SQL is already executed atomically, so there's no concurrency issue. The database will assign seat ID=1 to the first user whose request the server receives, then let the second user's request fail. So why is a transaction still needed for a ticket booking system?
When you batch all of your DML statements into a single transaction, you are typically telling the database a few things:
Make this operation (i.e. book movie ticket) an all-or-nothing operation.
Ensure you don't leave any orphan rows, so the data stays consistent.
Hold locks on the rows the operation touches until it commits, so that conflicting writes have to wait while it runs. (InnoDB takes these locks row by row as statements execute rather than locking whole tables up front.)
Prevent other transactions from modifying the rows your current operation has already changed.
If a deadlock does occur, let the database abort one of the conflicting transactions so that processing can continue.
Whether you need to wrap your UPDATE seat_db request in its own transaction depends on what other processing (DML) is being done before and after it.
You'll have to use transactions if your action involves multiple unrelated rows. For example, if the user has to pay for the ticket, then there will be at least two updates: update the user's credit and mark the seat as occupied. If either of the two updates were performed alone, you'd definitely get into trouble.
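The two-update case can be illustrated with a small sketch: charge the user and claim the seat in one transaction, and roll back both if either fails. The `seat_db` columns follow the question; the `users` table, the `price` parameter, and the `book_seat` function are hypothetical, and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3


def book_seat(conn, user_id, show_id, seat_id, price):
    """Charge the user and take the seat, all or nothing."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        paid = conn.execute(
            "UPDATE users SET credit = credit - ? WHERE id = ? AND credit >= ?",
            (price, user_id, price),
        ).rowcount
        taken = conn.execute(
            "UPDATE seat_db SET takenByUserID = ? "
            "WHERE showID = ? AND seatID = ? AND takenByUserID IS NULL",
            (user_id, show_id, seat_id),
        ).rowcount
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    if paid == 1 and taken == 1:
        conn.execute("COMMIT")
        return True
    conn.execute("ROLLBACK")  # undo whichever half did succeed
    return False


conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, credit INTEGER NOT NULL)")
conn.execute(
    "CREATE TABLE seat_db (showID INTEGER, seatID INTEGER, takenByUserID INTEGER, "
    "PRIMARY KEY (showID, seatID))"
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 10), (2, 10)])
conn.execute("INSERT INTO seat_db VALUES (99, 1, NULL)")

first_booking = book_seat(conn, 1, 99, 1, 10)
second_booking = book_seat(conn, 2, 99, 1, 10)  # seat gone, charge is rolled back
credits = dict(conn.execute("SELECT id, credit FROM users"))
seat_owner = conn.execute(
    "SELECT takenByUserID FROM seat_db WHERE showID = 99 AND seatID = 1"
).fetchone()[0]
```

Without the surrounding transaction, the second user would have been charged for a seat they did not get.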

MySQL/MariaDB InnoDB Simultaneous Transactions & Locking Behaviour

As part of the persistence process in one of my models an md5 check_sum of the entire record is generated and stored with the record. The md5 check_sum contains a flattened representation of the entire record including all EAV attributes etc. This makes preventing absolute duplicates very easy and efficient.
I am not using a unique index on this check_sum for a specific reason: I want this all to be silent, i.e. if a user submits a duplicate then the app just silently ignores it and returns the already existing record. This ensures backwards compatibility with legacy apps and APIs.
I am using Laravel's Eloquent. So once a record has been created and before committing, the application does the following:
$taxonRecords = TaxonRecord::where('check_sum', $taxonRecord->check_sum)->get();
if ($taxonRecords->count() > 0) {
    DB::rollBack();
    return $taxonRecords->first();
}
However, recently I encountered a 60,000-to-1 incident (odds based on record counts at that time). A single duplicate ended up in the database with the same check_sum. When I reviewed the logs I noticed that the creation time was identical down to the second. Further investigation of the Apache logs showed a valid POST, but the POST was duplicated. I presume the user's browser malfunctioned or something, but both POSTs arrived simultaneously, resulting in two simultaneous transactions.
My question is how can I ensure that a transaction and its contained SELECT for the previous check_sum is Atomic & Isolated. Based upon my reading the answer lies in https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html and isolation levels.
If transaction A and transaction B arrive at the server at the same time then they should not run side by side but should wait for the first to complete.
You created a classic race condition. Both transactions are calculating the checksum while they're both in progress, not yet committed. Neither can read the other's data, since they're uncommitted. So they calculate that they're the only one with the same checksum, and they both go through and commit.
To solve this, you need to run such transactions serially, to be sure that there aren't other concurrent transactions submitting the same data.
You may have to use GET_LOCK() before starting your transaction to calculate the checksum, then RELEASE_LOCK() after you commit. That will make sure other concurrent requests wait for your data to be committed, so they will see it when they try to calculate their checksum.

How to properly use transactions and locks to ensure database integrity?

I develop an online reservation system. To simplify let's say that users can book multiple items and each item can be booked only once. Items are first added to the shopping cart.
The app uses a MySQL / InnoDB database. According to the MySQL documentation, the default isolation level is REPEATABLE READ.
Here is the checkout procedure I've come up with so far:
1. Begin transaction
2. Select items in the shopping cart (with for update lock)
   Records from cart-item and items tables are fetched at this step.
3. Check if items haven't been booked by anybody else
   Basically check if quantity > 0. It's more complicated in the real application, thus I put it here as a separate step.
4. Update items, set quantity = 0
   Also perform other essential database manipulations.
5. Make payment (via external api like PayPal or Stripe)
   No user interaction is necessary as payment details can be collected before checkout.
6. If everything went fine commit transaction or rollback otherwise
7. Continue with non-essential logic
   Send e-mail etc in case of success, redirect for error.
I am unsure if that is sufficient. I'm worried whether:
1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
4. Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
5. I guess it's enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
I'm a bit confused about the "SELECT ... FOR UPDATE on non-existent rows" section of Willem Renzema's answer. When may it become important? Could you provide any example?
Here are some resources I've read:
How to deal with concurrent updates in databases?, MySQL: Transactions vs Locking Tables, Do database transactions prevent race conditions?,
Isolation (database systems), InnoDB Locking and Transaction Model, A beginner’s guide to database locking and the lost update phenomena.
Rewrote my original question to make it more general.
Added follow-up questions.
1. Begin transaction
2. Select items in shopping cart (with for update lock)
   So far so good, this will at least prevent the user from doing checkout in multiple sessions (trying to check out the same cart multiple times; good for dealing with double clicks).
3. Check if items haven't been booked by other user
   How do you check? With a standard SELECT or with a SELECT ... FOR UPDATE? Based on step 5, I'm guessing you are checking a reserved column on the item, or something similar.
   The problem here is that the SELECT ... FOR UPDATE in step 2 is NOT going to apply the FOR UPDATE lock to everything else. It is only applying to what is SELECTed: the cart-item table. Based on the name, that is going to be a different record for each cart/user. This means that other transactions will NOT be blocked from proceeding.
4. Make payment
5. Update items marking them as reserved
6. If everything went fine commit transaction, rollback otherwise
Following the above, based on the information you've provided, you may end up with multiple people buying the same item if you aren't using SELECT ... FOR UPDATE in step 3.
Suggested Solution
1. Begin transaction
2. SELECT ... FOR UPDATE the cart-item table.
   This will lock a double click out from running. What you select here should be some kind of "cart ordered" column. If you do this, a second transaction will pause here and wait for the first to finish, and then read the result that the first saved to the database.
   Make sure to end the checkout process here if the cart-item table says it has already been ordered.
3. SELECT ... FOR UPDATE the table where you record if an item has been reserved.
   This will lock OTHER carts/users from being able to read those items.
   Based on the result, if the items are not reserved, continue:
4. UPDATE ... the table in step 3, marking the item as reserved. Do any other INSERTs and UPDATEs you need, as well.
5. Make payment. Issue a rollback if the payment service says the payment didn't work.
6. Record payment, if success.
7. Commit transaction
Make sure you don't do anything that might fail between steps 5 and 7 (like sending emails), else you may end up with them making a payment without it being recorded, in the event the transaction gets rolled back.
Step 3 is the important step with regards to making sure two (or more) people don't try to order the same item. If two people do try, the 2nd person will end up having their webpage "hang" while it processes the first. Then when the first finishes, the 2nd will read the "reserved" column, and you can return a message to the user that someone has already purchased that item.
Payment in transaction or not
This is subjective. Generally, you want to close transactions as quickly as possible, to avoid multiple people being locked out from interacting with the database at once.
However, in this case, you actually do want them to wait. It's just a matter of how long.
If you choose to commit the transaction before payment, you'll need to record your progress in some intermediate table, run the payment, and then record the result. Be aware that if the payment fails, you'll then have to manually undo the item reservation records that you updated.
SELECT ... FOR UPDATE on non-existent rows
Just a word of warning, in case your table design involves inserting rows that you need to SELECT ... FOR UPDATE beforehand: if a row doesn't exist, a SELECT ... FOR UPDATE on it will NOT make other transactions wait when they also SELECT ... FOR UPDATE the same non-existent row.
So, make sure to always serialize your requests by doing a SELECT ... FOR UPDATE on a row that you know exists first. Then you can SELECT ... FOR UPDATE on the row that may or may not exist yet. (Don't try to do just a SELECT on the row that may or may not exist, as you'll be reading the state of the row at the time the transaction started, not at the moment you run the SELECT. So, SELECT ... FOR UPDATE on non-existent rows is still something you need to do in order to get the most up to date information, just be aware it will not cause other transactions to wait.)
1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
Yes. While the active transaction keeps a FOR UPDATE lock on a record, statements in other transactions that use any lock (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended until either the active transaction commits or the "Lock wait timeout" is exceeded.
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
This will not be a problem, as this is exactly what is necessary. Checkout transactions for the same items should be executed sequentially, i.e. a later checkout should not start before the earlier one finishes.
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
The REPEATABLE READ isolation level ensures that changes made by a transaction are not visible to others until that transaction is committed. Therefore item availability will be displayed correctly. Nothing will be shown as unavailable before it is actually paid for. No locks are necessary.
SELECT ... LOCK IN SHARE MODE would make a checkout transaction wait until the reading transaction has finished. This could slow down checkouts without giving any payoff.
4. Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
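It is possible. A transaction is rolled back when a deadlock happens, and a statement fails when the "Lock wait timeout" is exceeded (by default InnoDB then rolls back only that statement, not the whole transaction, unless innodb_rollback_on_timeout is enabled). In either case it would be a good idea to retry automatically.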
By default suspended statements fail after 50s.
5. I guess it's enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
Yes, SELECT ... FOR UPDATE on items table should be enough.
Yes, these selects wait, because FOR UPDATE is an exclusive lock.
Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE and DELETE lock all records they read, so whatever we JOIN is included. See the MySQL docs.
What's interesting (at least for me) is that everything that is scanned in the processing of the SQL statement gets locked, no matter whether it is selected or not. For example, WHERE id < 10 would also lock the record with id = 10!
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.

MySQL Repeatable Read and dirty reads

According to this wikipedia entry, the repeatable read isolation level holds read and write locks when selecting data.
My understanding is that this can prevent the age old banking example:
Start a transaction
Get (SELECT) account balance ($100)
Withdraw $10 and UPDATE new value ($90)
Commit transaction
If in between 2 & 3 the customer receives a deposit of $1000, that transaction should be blocked because of the read/write lock acquired in step 2. Otherwise, step 3 would write $90 instead of $1090.
However, according to the MySQL docs, repeatable read (default) works differently. All it ensures is that no matter how many SELECTs we do, we get the same value, regardless whether the value has been changed by another transaction. Also other transactions are allowed to modify the values we read.
This sounds broken, not sure why I would want to read an old balance. The doc says that an explicit FOR UPDATE needs to be added to the SELECT to acquire the appropriate locks.
I'm confused about the definition and implementation of repeatable read. Could somebody clarify how the banking problem is solved?
I'll talk about how it works in MySQL and PostgreSQL, since I'm not as familiar with other SQL implementations. In the banking example you gave, the following should be noted:
the deposit would not be blocked. Repeatable read isolation does not prevent concurrent changes in another transaction. It only dictates how read operations will behave within a transaction. Namely, as you said, read operations get a frozen version of the database, from the time the transaction began.
the withdraw operation in step 3, however, would be blocked by the deposit operation, and would have to wait until the deposit transaction has been committed. This kind of lock happens on all transaction isolation levels, even the least conservative (read uncommitted), because changes to data are involved.
Now, the resulting balance after both transactions are finished will depend on how you write your statements. With a SQL statement for the withdrawal step done as follows,
UPDATE accounts SET balance = balance - 10 WHERE id = 10;
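the final balance will be $1090. MySQL will realize the data has changed in another transaction, and the withdrawal will thus use the latest value, even though the UPDATE statement was called within a "repeatable read" transaction. In PostgreSQL the same happens under its default READ COMMITTED level; under REPEATABLE READ the UPDATE instead fails with a serialization error and the transaction has to be retried.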
If you, however, subtract $10 from the $100 you got from step 2 in code, and then run the withdrawal like this
UPDATE accounts SET balance = 90 WHERE id = 10;
the transaction isolation won't help, and the balance will end up being $90.
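The difference between the two UPDATE styles can be replayed with a small sketch. It is only an illustration of the arithmetic: the interleaving is replayed sequentially on a single SQLite connection (Python's sqlite3 module) rather than with two concurrent MySQL transactions, and the function name `final_balance` is hypothetical.

```python
import sqlite3


def final_balance(relative):
    """Replay: read 100, a deposit of 1000 lands, then withdraw 10."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("INSERT INTO accounts VALUES (10, 100)")

    # Step 2: the withdrawal reads the balance into application code.
    seen = conn.execute("SELECT balance FROM accounts WHERE id = 10").fetchone()[0]

    # Between steps 2 and 3 the deposit from another transaction is committed.
    conn.execute("UPDATE accounts SET balance = balance + 1000 WHERE id = 10")

    # Step 3: the withdrawal writes.
    if relative:
        # The database computes the new value from the current row.
        conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 10")
    else:
        # The application writes a value derived from its stale read.
        conn.execute("UPDATE accounts SET balance = ? WHERE id = 10", (seen - 10,))

    return conn.execute("SELECT balance FROM accounts WHERE id = 10").fetchone()[0]


relative_result = final_balance(relative=True)
absolute_result = final_balance(relative=False)  # the deposit is overwritten
```

The relative form keeps the deposit, while the absolute form silently loses it, which is the lost update the question is worried about.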

Updating account balances with mysql

I have a field on a User table that holds the account balance for the user. Users can perform a lot of actions with my service that will result in rapid changes to their balance.
I'm trying to use mysql's serializable isolation level to make sure that multiple user actions will not update the value incorrectly. (Action A and action B simultaneously want to deduct 1 dollar from the balance.) However, I'm getting a lot of deadlock errors.
How do I do this correctly without getting all these deadlocks, and still keeping the balance field up to date?
Simple schema: a user has an id and a balance.
I'm using Doctrine, so I'm doing something like the following:
$con->beginTransaction();
$tx = $con->transaction;
$tx->setIsolation('SERIALIZABLE');
$user = UserTable::getInstance()->find($userId);
$user->setBalance($user->getBalance() + $change);
$user->save();
$con->commit();
First, trying to use the serializable isolation level on your transaction is a good idea. It means you know at least the minimum about what a transaction is, and that the isolation level is one of the biggest problems.
Note that serializable is not really true serializability. More on that in this previous answer, when you have some time to read it :-).
But the most important part is that you should treat automatic rollbacks of your transaction because of failed serializability as a normal fact, and that the right thing to do is to build your application so that transactions can fail and be replayed.
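Replaying a failed transaction can be sketched with a small wrapper like the one below. Everything here is hypothetical scaffolding: `TransientTransactionError` is a made-up exception that the caller would raise when the driver reports a deadlock (MySQL error 1213) or a lock wait timeout (error 1205), and `run_with_retry` is an illustrative name, not part of Doctrine or any MySQL driver.

```python
import time


class TransientTransactionError(Exception):
    """Hypothetical marker for errors worth replaying (deadlock, lock timeout)."""


def run_with_retry(transaction, attempts=3, backoff_seconds=0.0):
    """Call transaction() until it succeeds or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TransientTransactionError:
            if attempt == attempts:
                raise  # give up and let the caller report the failure
            time.sleep(backoff_seconds * attempt)  # simple linear backoff


calls = {"count": 0}


def flaky_balance_update():
    # Stand-in for "begin, update balance, commit" that is chosen as the
    # deadlock victim on its first two runs.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientTransactionError("deadlock victim, rolled back")
    return "committed"


result = run_with_retry(flaky_balance_update)
```

The whole transaction has to be replayed from its first statement, including any reads it based its writes on.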
One simple solution, and for accounting things I like a simple solution because we can predict all the facts with no surprises, is to perform table locks. This is not a fine and elegant solution: no row-level locks, just simple big table locks (and always taken in the same order). After that you can do your operation as a single player and then release the locks. There is no multi-user concurrency on the rows of the tables, and no magical next-row lock failures (see previous link). This will certainly slow down your write operations, but if everybody takes the table locks in the same order you'll only get lock timeout problems, no deadlocks and no 'unserializable auto-rollback'.
Edit
From your code sample I'm not sure you can set the transaction isolation level after the begin. You should activate query logs on MySQL and see what is done, then check that other transactions run by the CMS are not still at the serializable level.