Quick question/clarification required here. I have a DB table that will quite possibly have simultaneous updates to a record. I am using Zend Framework for the application, and I have read about two directions to go to avoid this, first being table locking (LOCK TABLES test WRITE) or something like that, will go back and re-read how to do it exactly if that is the best solution. The second being transactions: $db->beginTransaction(); ... $db->commit();
Now 'assuming' I am using a transactional storage engine such as InnoDB, transactions seem like the more common solution. However does that avoid the following scenario:
User A is on a webpage -> submits data -> begin transaction -> read row -> calculate new value -> update row -> save -> commit
User B is on the same webpage at the same time and submits data almost simultaneously (User B calls the update function at a point between begin transaction and commit for User A's transaction). User B relies on the committed data from User A's transaction before it can do an accurate calculation for updating the record.
For example:
Opening value in database row: 5. User A submits a value of 5 (begin transaction -> read value (5) -> add submitted value (5+5=10) -> write the updated value -> save -> commit).
User B submits the value of 7. I need to make sure that the value User B's transaction reads is 10, and not 5 (which is what it gets if the update isn't done before the read).
I know this is a long winded explanation, I apologize, I am not exactly sure of the correct terminology to simplify the question.
Thanks
Transactions don't ensure locking. The whole block in a transaction is treated as an atomic update to the db (if anything fails in between, all previous changes in the block are rolled back). So two transactions running in parallel can still update the same row.
You need to use both:
Transaction do
row.lock
update row
end
See if row-level locking can make it easier for you.
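When the new value can be computed in SQL, as in the 5 + 5 + 7 example from the question, a different technique from the lock-then-update outline above is to let the database do the read and the write in one statement. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL/InnoDB; the table name `counters` and the helper names are made up for illustration.

```python
import sqlite3


def make_db(opening_value):
    # In-memory SQLite stands in for the real MySQL/InnoDB table (assumption).
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, value INTEGER NOT NULL)")
    conn.execute("INSERT INTO counters (id, value) VALUES (1, ?)", (opening_value,))
    return conn


def add_submitted_value(conn, row_id, submitted):
    # The read and the write happen inside a single UPDATE statement,
    # so no other writer can slip in between them.
    conn.execute(
        "UPDATE counters SET value = value + ? WHERE id = ?",
        (submitted, row_id),
    )


def current_value(conn, row_id):
    row = conn.execute("SELECT value FROM counters WHERE id = ?", (row_id,)).fetchone()
    return row[0]
```

If the calculation has to happen in application code, this shortcut does not apply and the row has to be locked (for example with SELECT ... FOR UPDATE) inside the transaction instead.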
I develop an online reservation system. To simplify let's say that users can book multiple items and each item can be booked only once. Items are first added to the shopping cart.
App uses MySql / InnoDB database. According to MySql documentation, default isolation level is Repeatable reads.
Here is the checkout procedure I've come up with so far:
1. Begin transaction.
2. Select items in the shopping cart (with a FOR UPDATE lock). Records from the cart-item and items tables are fetched at this step.
3. Check that the items haven't been booked by anybody else. Basically check if quantity > 0. It's more complicated in the real application, thus I put it here as a separate step.
4. Update items, set quantity = 0. Also perform other essential database manipulations.
5. Make payment (via an external API like PayPal or Stripe). No user interaction is necessary as payment details can be collected before checkout.
6. If everything went fine, commit the transaction, or roll back otherwise.
7. Continue with non-essential logic. Send e-mail etc. in case of success, redirect on error.
I am unsure if that is sufficient. I'm worried whether:
1. Another user who tries to book the same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
3. Item availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use a shared lock?
4. Is it possible that MySQL rolls back a transaction by itself? Is it generally better to retry automatically or display an error message and let the user try again?
5. I guess it's enough if I do SELECT ... FOR UPDATE on the items table. This way both a request caused by a double click and another user will have to wait till the transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile a vanilla SELECT will just see a snapshot of the db before the transaction, with no delay though, right?
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
I'm a bit confused about the "SELECT ... FOR UPDATE on non-existent rows" section of Willem Renzema's answer. When may it become important? Could you provide any example?
Here are some resources I've read:
How to deal with concurrent updates in databases?, MySQL: Transactions vs Locking Tables, Do database transactions prevent race conditions?,
Isolation (database systems), InnoDB Locking and Transaction Model, A beginner’s guide to database locking and the lost update phenomena.
Rewrote my original question to make it more general.
Added follow-up questions.
1. Begin transaction
2. Select items in shopping cart (with for update lock)
So far so good, this will at least prevent the user from doing checkout in multiple sessions (trying to check out the same cart multiple times, which is good for dealing with double clicks).
3. Check if items haven't been booked by other user
How do you check? With a standard SELECT or with a SELECT ... FOR UPDATE? Based on step 5, I'm guessing you are checking a reserved column on the item, or something similar.
The problem here is that the SELECT ... FOR UPDATE in step 2 is NOT going to apply the FOR UPDATE lock to everything else. It only applies to what is SELECTed: the cart-item table. Based on the name, that is going to be a different record for each cart/user. This means that other transactions will NOT be blocked from proceeding.
4. Make payment
5. Update items marking them as reserved
6. If everything went fine commit transaction, rollback otherwise
Following the above, based on the information you've provided, you may end up with multiple people buying the same item if you aren't using SELECT ... FOR UPDATE in step 3.
Suggested Solution
1. Begin transaction
2. SELECT ... FOR UPDATE the cart-item table.
This will lock a double click out from running. What you select here should be some kind of "cart ordered" column. If you do this, a second transaction will pause here and wait for the first to finish, and then read what the first saved to the database.
Make sure to end the checkout process here if the cart-item table says it has already been ordered.
3. SELECT ... FOR UPDATE the table where you record if an item has been reserved.
This will block OTHER carts/users from reading those items with their own SELECT ... FOR UPDATE.
Based on the result, if the items are not reserved, continue:
4. UPDATE ... the table in step 3, marking the item as reserved. Do any other INSERTs and UPDATEs you need, as well.
5. Make payment. Issue a rollback if the payment service says the payment didn't work.
6. Record payment, if success.
7. Commit transaction
Make sure you don't do anything that might fail between steps 5 and 7 (like sending emails), else you may end up with them making a payment without it being recorded, in the event the transaction gets rolled back.
Step 3 is the important step with regard to making sure two (or more) people don't try to order the same item. If two people do try, the 2nd person will end up having their webpage "hang" while it processes the first. Then when the first finishes, the 2nd will read the "reserved" column, and you can return a message to the user that someone has already purchased that item.
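The flow of the suggested solution can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 instead of MySQL, so `BEGIN IMMEDIATE` (SQLite's way of taking the write lock up front) plays the role that SELECT ... FOR UPDATE plays in InnoDB. The `items` and `payments` tables and the `charge` callback are hypothetical, and the cart-item step is left out.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, reserved INTEGER NOT NULL DEFAULT 0)")
    conn.execute("CREATE TABLE payments (item_id INTEGER NOT NULL, reference TEXT NOT NULL)")
    conn.execute("INSERT INTO items (id) VALUES (1)")
    return conn


def checkout(conn, item_id, charge):
    # `charge` is a hypothetical payment call: it returns a reference string,
    # or None when the payment service reports a failure.
    conn.execute("BEGIN IMMEDIATE")  # stand-in for the locking read in InnoDB
    try:
        row = conn.execute("SELECT reserved FROM items WHERE id = ?", (item_id,)).fetchone()
        if row is None or row[0]:
            conn.execute("ROLLBACK")
            return "already reserved"
        conn.execute("UPDATE items SET reserved = 1 WHERE id = ?", (item_id,))
        reference = charge(item_id)
        if reference is None:
            conn.execute("ROLLBACK")  # undoes the reservation as well
            return "payment failed"
        conn.execute(
            "INSERT INTO payments (item_id, reference) VALUES (?, ?)",
            (item_id, reference),
        )
        conn.execute("COMMIT")
        return "ok"
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

Nothing that can fail for unrelated reasons (such as sending e-mail) sits between the payment call and the commit, which matches the warning above.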
Payment in transaction or not
This is subjective. Generally, you want to close transactions as quickly as possible, to avoid multiple people being locked out from interacting with the database at once.
However, in this case, you actually do want them to wait. It's just a matter of how long.
If you choose to commit the transaction before payment, you'll need to record your progress in some intermediate table, run the payment, and then record the result. Be aware that if the payment fails, you'll then have to manually undo the item reservation records that you updated.
SELECT ... FOR UPDATE on non-existent rows
Just a word of warning, in case your table design involves inserting rows where you need to earlier SELECT ... FOR UPDATE: If a row doesn't exist, that transaction will NOT cause other transactions to wait, if they also SELECT ... FOR UPDATE the same non-existent row.
So, make sure to always serialize your requests by doing a SELECT ... FOR UPDATE on a row that you know exists first. Then you can SELECT ... FOR UPDATE on the row that may or may not exist yet. (Don't try to do just a SELECT on the row that may or may not exist, as you'll be reading the state of the row at the time the transaction started, not at the moment you run the SELECT. So, SELECT ... FOR UPDATE on non-existent rows is still something you need to do in order to get the most up to date information, just be aware it will not cause other transactions to wait.)
1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
Yes. While an active transaction keeps a FOR UPDATE lock on a record, statements in other transactions that need any lock on it (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended until either the active transaction commits or rolls back, or "Lock wait timeout" is exceeded.
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
This will not be a problem, as this is exactly what is necessary. Checkout transactions for the same items should be executed sequentially, i.e. a later checkout should not start before the earlier one finishes.
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
The REPEATABLE READ isolation level ensures that changes made by a transaction are not visible to others until that transaction is committed. Therefore item availability will be displayed correctly. Nothing will be shown as unavailable before it is actually paid for. No locks are necessary.
SELECT ... LOCK IN SHARE MODE would make these reads wait for a running checkout transaction, and make a checkout wait for them. This could slow down checkouts without giving any payoff.
4. Is it possible that MySQL rolls back a transaction by itself? Is it generally better to retry automatically or display an error message and let the user try again?
It is possible. InnoDB rolls back a transaction when it is picked as the victim of a deadlock. When "Lock wait timeout" is exceeded, by default only the waiting statement fails and is rolled back (the whole transaction is rolled back only if innodb_rollback_on_timeout is enabled). In either case it would be a good idea to retry the transaction automatically.
By default suspended statements fail after 50s.
5. I guess it's enough if I do SELECT ... FOR UPDATE on the items table. This way both a request caused by a double click and another user will have to wait till the transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile a vanilla SELECT will just see a snapshot of the db before the transaction, with no delay though, right?
Yes, SELECT ... FOR UPDATE on items table should be enough.
Yes, these selects wait, because FOR UPDATE is an exclusive lock.
Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE and DELETE lock all records they read, so whatever we JOIN is included. See MySql Docs.
What's interesting (at least for me) is that everything that is scanned in the processing of the SQL statement gets locked, no matter whether it is selected or not. For example WHERE id < 10 would also lock the record with id = 10!
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.
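For the automatic retry mentioned in point 4, a minimal sketch could look like the following. `TransientDbError` is a hypothetical placeholder for whatever exception your driver raises on a deadlock or lock wait timeout, and `transaction` is assumed to be a callable that runs the whole transaction from its start, since a rolled-back transaction has to be repeated in full.

```python
import time


class TransientDbError(Exception):
    """Hypothetical stand-in for a driver's deadlock / lock-wait-timeout error."""


def run_with_retry(transaction, attempts=3, delay_seconds=0.0):
    # `transaction` must restart the whole unit of work on every call.
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except TransientDbError as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

If the transaction includes an external payment, the callable would also need to make sure a retry cannot charge the customer twice; that part is not covered here.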
I have a central database for handling user credit, with multiple servers reading from and writing to it. The application that sits on top of these servers serves user requests by doing the following for each request:
1. check if user has enough credit for the task by reading from db.
2. perform the time consuming request
3. deduct a credit from user account, save the new credit count back to db.
The application uses the database's optimistic locking, so the following might happen:
1. request a comes in, see that user x has enough credit,
2. request b comes in, see that user x has enough credit,
3. a performs work
4. a saves the new credit count back to db
5. b performs work
6. b tries to save the new credit count back to db, application gets an exception and fails to account for this credit deduction.
With pessimistic locking, the application will need to explicitly get a lock on the user account to guarantee exclusive access, but this kills performance since the system has many concurrent requests.
So what would be a good new design for this credit system?
Here are two "locking" mechanisms that avoid using InnoDB's locking mechanism, for either of two reasons:
A task that takes longer than you should spend in a BEGIN...COMMIT of InnoDB.
A task that ends in a different program (or different web page) than it started in.
Plan A. (This assumes the race condition is rare, and the time wasted for Step 2 is acceptable in those rare cases.)
1. (same) Check if the user has enough credit for the task by reading from the db.
2. (same) Perform the time-consuming request.
3. (added) START TRANSACTION;
4. (added) Check again whether the user has enough credit. (ROLLBACK and abort if not.)
5. (same as old #3) Deduct a credit from the user account and save the new credit count back to the db.
6. (added) COMMIT;
START..COMMIT is InnoDB transaction stuff. If a race condition caused 'x' to not have credit by step 4, you will ROLLBACK and not perform steps 5 and 6.
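Plan A might be sketched like this. It is an illustration only, using Python's sqlite3 in place of MySQL: `BEGIN IMMEDIATE` is SQLite's way of taking the write lock at the start of the transaction, whereas in InnoDB the re-check would typically be a locking read. The `users` table and the `work` callback are assumptions made for the example.

```python
import sqlite3


def make_db(credit):
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, credit INTEGER NOT NULL)")
    conn.execute("INSERT INTO users (id, credit) VALUES (1, ?)", (credit,))
    return conn


def has_credit(conn, user_id):
    row = conn.execute("SELECT credit FROM users WHERE id = ?", (user_id,)).fetchone()
    return row is not None and row[0] >= 1


def handle_request(conn, user_id, work):
    if not has_credit(conn, user_id):        # step 1: cheap check, no lock
        return False
    work()                                   # step 2: time-consuming request
    conn.execute("BEGIN IMMEDIATE")          # step 3: short transaction starts
    try:
        if not has_credit(conn, user_id):    # step 4: check again
            conn.execute("ROLLBACK")
            return False
        conn.execute(                        # step 5: deduct one credit
            "UPDATE users SET credit = credit - 1 WHERE id = ?", (user_id,)
        )
        conn.execute("COMMIT")               # step 6
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

A `False` result after step 2 is the rare case the plan accepts: the work was done but could not be charged.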
Plan B. (This is more complex, but you might prefer it.)
1. Have a table Locks for locking. It contains user_id and a timestamp.
2. START TRANSACTION;
3. If user_id is in Locks, abort (ROLLBACK and exit).
4. INSERT INTO Locks the user_id and current_timestamp (thereby "locking" 'x').
5. COMMIT;
6. Perform the processing (original Steps 1,2,3)
7. DELETE FROM Locks WHERE user_id = 'x'; (autocommit=1 suffices here.)
A potential problem: If the processing dies in step 6, not getting around to releasing the lock, that user will be locked out forever. The 'solution' is to periodically check Locks for any timestamps that are 'very' old. If any are found, assume that the processing died, and DELETE the row(s).
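A small sketch of the Locks table idea, again with Python's sqlite3 standing in for MySQL. One deliberate difference from the steps above: instead of checking and then inserting inside a transaction, it assumes a primary key on user_id and lets a failed INSERT signal "already locked". The function names and the `now` parameter are invented for the example.

```python
import sqlite3
import time


def make_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE Locks (user_id INTEGER PRIMARY KEY, locked_at REAL NOT NULL)")
    return conn


def try_lock(conn, user_id, now=None):
    now = time.time() if now is None else now
    try:
        conn.execute("INSERT INTO Locks (user_id, locked_at) VALUES (?, ?)", (user_id, now))
        return True
    except sqlite3.IntegrityError:
        # A row for this user already exists: someone else holds the lock.
        return False


def unlock(conn, user_id):
    conn.execute("DELETE FROM Locks WHERE user_id = ?", (user_id,))


def release_stale_locks(conn, max_age_seconds, now=None):
    # Periodic cleanup for locks whose owner presumably died mid-processing.
    now = time.time() if now is None else now
    cur = conn.execute("DELETE FROM Locks WHERE locked_at < ?", (now - max_age_seconds,))
    return cur.rowcount
```

How old a lock must be before it counts as stale depends on how long the processing can legitimately take.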
You didn't state explicitly what you want to achieve, so I assume you don't want to perform the work just to realise it has been in vain due to low credit.
No-lock
Implement credit hold on step (1) and associate the work (2) and the deduction (3) with the hold. This way low credit user won't pass step (1).
Optimistic locking
As a collision is detected in optimistic locking post factum, I don't think it fits the assumption.
Pessimistic locking
It isn't possible to tell definitely without knowing the schema, but I think it's an exaggeration about killing performance. You can smartly incorporate MySQL InnoDB transaction isolation levels and locking reads at finer granularity than exclusively locking a user account completely. For instance, using SELECT ... LOCK IN SHARE MODE which sets shared locks and allows reads for other transactions.
Rick's caution about the task taking longer than MySQL will wait (innodb_lock_wait_timeout) applies here.
You want The Escrow Transactional Method.
You record the credit left after doling out some to each updating process and the credit doled out to (ie held in escrow for) them. A process retries until success a transaction that increases the credit doled out by what it needs and decreases the credit that is left by what it needs; it succeeds only if that would leave the credit left non-negative. Then it does its long calculation. Regardless of the calculation's success it then applies a transaction that decreases the credit doled out. But on success it also increases the assets while on failure it increases the credit left.
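To make the bookkeeping concrete, here is a rough sketch of the escrow idea with Python's sqlite3 as a stand-in database. The column names (`credit_left`, `credit_held`, and `credit_spent` as a stand-in for the "assets" side) and the function names are assumptions for illustration, not part of any particular library.

```python
import sqlite3


def make_db(credit):
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " credit_left INTEGER NOT NULL,"
        " credit_held INTEGER NOT NULL DEFAULT 0,"
        " credit_spent INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO accounts (id, credit_left) VALUES (1, ?)", (credit,))
    return conn


def hold(conn, account_id, amount):
    # Moves credit into escrow only if that leaves credit_left non-negative.
    cur = conn.execute(
        "UPDATE accounts SET credit_left = credit_left - ?, credit_held = credit_held + ?"
        " WHERE id = ? AND credit_left >= ?",
        (amount, amount, account_id, amount),
    )
    return cur.rowcount == 1


def settle(conn, account_id, amount, succeeded):
    # Always releases the hold; on failure the credit flows back.
    target = "credit_spent" if succeeded else "credit_left"
    conn.execute(
        f"UPDATE accounts SET credit_held = credit_held - ?, {target} = {target} + ?"
        " WHERE id = ?",
        (amount, amount, account_id),
    )


def balances(conn, account_id):
    return conn.execute(
        "SELECT credit_left, credit_held, credit_spent FROM accounts WHERE id = ?",
        (account_id,),
    ).fetchone()
```

The long calculation runs between `hold` and `settle`, outside any database transaction, so no lock is held while it works.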
Use the timestamp/rowversion approach that you will find in all real database engines except MySQL.
You can emulate them with MySQL in this way. Have a TIMESTAMP column (updated) that gets updated whenever a row is updated. Select that column along with the rest of the data you require. Use the returned timestamp as a condition in your WHERE clause so that the row will only be updated when the timestamp is still the same as when you read the row.
UPDATE table SET col1 = value WHERE id = 1 AND updated = timestamp_value_read
Now when you run the update and the timestamps do not match no update will be performed. You can test for this by using rows affected, if zero rows were updated then you know that the row was modified between read and write. Handle that condition in your code whichever way is best for your application and your users.
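A minimal sketch of this check, using Python's sqlite3 as a stand-in for MySQL. To keep the example deterministic it uses an integer `version` counter in place of the TIMESTAMP column; the table and function names are made up for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, col1 TEXT, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO items (id, col1, version) VALUES (1, 'old', 1)")
    return conn


def read_row(conn, row_id):
    # Read the data together with the version it had at read time.
    return conn.execute(
        "SELECT col1, version FROM items WHERE id = ?", (row_id,)
    ).fetchone()


def update_if_unchanged(conn, row_id, new_value, version_read):
    # The WHERE clause only matches if nobody changed the row since we read it.
    cur = conn.execute(
        "UPDATE items SET col1 = ?, version = version + 1 WHERE id = ? AND version = ?",
        (new_value, row_id, version_read),
    )
    return cur.rowcount == 1  # zero rows affected means we lost the race
```

When `update_if_unchanged` returns False, the caller can re-read the row and decide whether to retry or report the conflict to the user.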
A two part question:
In my CodeIgniter script, I'm starting a transaction, then inserting a row, setting the insert_id() to a php variable, inserting more rows into another table using the new ID as a foreign key, and then I commit everything.
So my question is: if everything does not commit before ending the transaction, how is mysql able to return the last insert ID, if nothing was even inserted? My script works (almost) perfectly, with the new ID being used in subsequent queries.
(I say "almost" because, using the PDO mysql driver, sometimes the first insert that is supposed to return the insert_id() is duplicated: it gets inserted twice. Any idea why that would be? Is that related to getting the last ID? It never happens if using the mysqli or mysql driver.)
I first wrote the script without transactions, so I have code that checks for mysql errors along the way, such as:
if(!$this->db->insert($table, $data)) {
//log message here
}
How does this affect the mysql process once I wrapped all my mysql code in a transaction? It's not causing any visible errors (hopefully unrelated to the problem stated above), but should it be removed?
Thank you.
To answer your first question...
When using transactions, your queries are executed normally as far as your connection is concerned. You can choose to commit, saving those changes, or rollback, reverting all of the changes. Consider the following pseudo-code:
-- SQL, run inside the open transaction
insert into number (Random_number) values (rand());
select Random_number from number where Number_id = last_insert_id();
// PHP: $num holds the Random_number value read above (rand() is always below 1)
if ($num < 0.5) {
    $this->db->query('rollback;'); // This number is too depressing.
} else {
    $this->db->query('commit;'); // This number is just right.
}
The random number that was generated can be read prior to commit to ensure that it is suitable before saving it for everyone to see (e.g. commit and unlock the row).
If the PDO driver is not working, consider using the mysqli driver. If that is not an option, you can always use the query 'select last_insert_id() as id;' rather than the $this->db->insert_id() function.
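The point that an insert id is available before the commit can be tried out with a few lines of Python and sqlite3 (used here purely as a stand-in for MySQL; `lastrowid` is sqlite3's counterpart to insert_id()). The `parents` and `children` tables are hypothetical.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE parents (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE children (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL)")
    return conn


def insert_parent_and_child(conn, commit):
    conn.execute("BEGIN")
    cur = conn.execute("INSERT INTO parents (name) VALUES ('p')")
    parent_id = cur.lastrowid  # already known, although nothing is committed yet
    conn.execute("INSERT INTO children (parent_id) VALUES (?)", (parent_id,))
    conn.execute("COMMIT" if commit else "ROLLBACK")
    return parent_id


def counts(conn):
    parents = conn.execute("SELECT COUNT(*) FROM parents").fetchone()[0]
    children = conn.execute("SELECT COUNT(*) FROM children").fetchone()[0]
    return parents, children
```

Inside the transaction the new row exists for this connection, so its id can be used as a foreign key; a rollback then removes both rows together.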
To answer your second question, if you are inserting or updating data that other models will be updating or reading, be sure to use transactions. For example, if a column 'Number_remaining' is set to 1 the following problem can occur.
Person A reads 1
Person B reads 1
Person A wins $1000!
Person A updates 1 to be 0
Person B wins $1000!
Person B updates 0 to be 0
Using transactions in the same situation would yield this result:
Person A starts transaction
Person A reads '1' from Number_remaining (The row is now locked if select for update is used)
Person B attempts to read Number_remaining - forced to wait
Person A wins $1000
Person A updates 1 to be 0
Person A commits
Person B reads 0
Person B does not win $1000
Person B cries
You may want to read up on transaction isolation levels as well.
Be careful of deadlock, which can occur in this case:
Person A reads row 1 (select ... for update)
Person B reads row 2 (select ... for update)
Person A attempts to read row 2, forced to wait
Person B attempts to read row 1, forced to wait
Person A reaches innodb_lock_wait_timeout (default 50 sec), its statement fails with an error and Person A rolls back (with deadlock detection on, which is the default, InnoDB instead aborts one of the two transactions right away)
Person B reads row 1 and continues normally
At the end, since Person B has probably reached PHP's max_execution_time, the current query will finish executing independently of PHP, but no further queries will be received. If this was a transaction with autocommit=0, the query will automatically rollback when the connection to your PHP server is severed.
I'm writing a strategy-kind of multi user game for the web. It has a playfield (X by Y squares) that I plan to serialize and store in a BLOB in a MySQL (innodb) database, one row for each ongoing game.
I now try to figure out a good way of keeping the database updated with any changes to the playfield, and at the same time finding a convenient solution to how to handle things that happen to the playfield in the time frame between loading the page and actually making a move.
I don't use AJAX.
There will be at most 20 players in each game, each player making between 1 and 10 moves in 24 hours, so it is a "slow" game.
My plan (so far) is to also store a kind of checksum for the playfield next to the blob and compare the databases state with the state loaded before trying to make changes to the playfield.
What I worry about is how to prevent race conditions.
Is it enough to:
1. Begin transaction.
2. Load playfield from table.
3. If checksum differs - rollback and update the user's view.
4. If checksum unchanged - update table and commit changes.
Is the BEGIN TRANSACTION enough to block the race, or do I need to do something more in step 2 to show my intent to update the table?
Thankful for all advice.
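If you use SELECT ... FOR UPDATE when you load the playfield from the database, it will block other locking reads and writes of that row (such as another SELECT ... FOR UPDATE) until you commit or roll back the transaction. Plain SELECTs are not blocked.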
No. You will need to issue a LOCK TABLES command for the tables you need to protect against conflicting updates. This would look something like...
LOCK TABLE my_table WRITE;
More details may be found here... http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
Don't forget to UNLOCK them afterwards!
How can I undo the most recently executed mysql query?
If you define the table type as InnoDB, you can use transactions. You will need to set AUTOCOMMIT=0, and after that you can issue COMMIT or ROLLBACK at the end of the query or session to submit or cancel a transaction.
ROLLBACK -- will undo the changes that you have made
You can only do so during a transaction.
BEGIN;
INSERT INTO xxx ...;
DELETE FROM ...;
Then you can either:
COMMIT; -- will confirm your changes
Or
ROLLBACK; -- will undo your previous changes
Basically: If you're doing a transaction just do a rollback. Otherwise, you can't "undo" a MySQL query.
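The difference can be seen in a short experiment. This sketch uses Python's sqlite3 in autocommit mode as a stand-in for MySQL, with the placeholder table name `xxx` borrowed from the example above; the helper names are made up.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE xxx (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO xxx (id) VALUES (?)", [(1,), (2,), (3,)])
    return conn


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM xxx").fetchone()[0]


def delete_all(conn, inside_transaction):
    # Returns how many rows are left after the delete (and rollback, if any).
    if inside_transaction:
        conn.execute("BEGIN")
    conn.execute("DELETE FROM xxx")
    if inside_transaction:
        conn.execute("ROLLBACK")  # brings the deleted rows back
    return row_count(conn)
```

Called with `inside_transaction=True` the rows survive; with `False` the delete is committed immediately and there is nothing left to roll back.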
For some instructions, like ALTER TABLE, this is not possible with MySQL, even with transactions.
You can stop a query that is still being processed like this:
Find the Id of the query process with: show processlist;
Then: kill id;
In case you need more than undoing your last query (although I know your question only asks about that), and a transaction therefore might not help you, you need to implement a workaround:
Copy the original data before committing your query, and write it back on demand based on the unique id, which must be the same in both tables: your rollback table (with the copies of the unchanged data) and your actual table (containing the data that should then be "undone").
For databases with many tables, a single "rollback table" containing structured dumps/copies of the original data would be better than one for each actual table. It would contain the name of the actual table, the unique id of the row, and in a third field the content in any desired format that represents the data structure and values clearly (e.g. XML). Based on the first two fields, this third one would be parsed and written back to the actual table. A fourth field with a timestamp would help with cleaning up this rollback table.
Since there is no real undo in SQL dialects apart from "rollback" in a transaction (please correct me if I'm wrong, maybe there is one now), this is the only way, I guess, and you have to write the code for it on your own.