Transaction + Select ... For Update... Skipping indexes [duplicate]

This question already has answers here:
When to fix auto-increment gaps in MYSQL
(2 answers)
MySQL AUTO_INCREMENT does not ROLLBACK
(11 answers)
Closed 1 year ago.
I noticed something funny with my table's index column after running an experiment to answer a different question.
In my experiment, I had two conditions. In one, autocommit=FALSE and in another autocommit=TRUE. I had 2 sessions connected to the server and tried various combinations of session #1 starting transactions / selecting FOR UPDATE and session #2 attempting selects and inserts, etc under those conditions. Important note: all transactions started by session #1 were rolled back, not committed.
Reviewing my results, I noticed that when the first session was in a transaction AND had selected FOR UPDATE, the insert made by session 2 (after the first session finished its transaction, of course) had its index advanced by 2. In all other inserts by session 2 the index only advanced by 1. That includes inserts forced to wait because session 1 had selected FOR UPDATE (this is possible when autocommit is off, by the way).
Without losing another hour or two to testing, I was hoping someone could explain how the index advanced by 2. My best guess is this: I assume that index + 1 was reserved for session #1, which was rolled back instead of committed, and index + 2 was reserved for session #2 because its request came in while index + 1 was reserved. After session #1 rolled back, nothing was put at index + 1, and session #2 inserted at the index already reserved for it, index + 2.
Is that true? If it is, I am concerned that if many sessions request to update at the same time and then don’t commit, whole patches of the index may go unused… Can I prevent this? Can I remediate it if and when it happens?
Thank you in advance.

Related

How to properly use transactions and locks to ensure database integrity?

I develop an online reservation system. To simplify let's say that users can book multiple items and each item can be booked only once. Items are first added to the shopping cart.
The app uses a MySQL / InnoDB database. According to the MySQL documentation, the default isolation level is REPEATABLE READ.
Here is the checkout procedure I've come up with so far:
1. Begin transaction
2. Select items in the shopping cart (with for update lock)
Records from cart-item and items tables are fetched at this step.
3. Check if items haven't been booked by anybody else
Basically check if quantity > 0. It's more complicated in the real application, thus I put it here as a separate step.
4. Update items, set quantity = 0
Also perform other essential database manipulations.
5. Make payment (via external api like PayPal or Stripe)
No user interaction is necessary as payment details can be collected before checkout.
6. If everything went fine commit transaction or rollback otherwise
7. Continue with non-essential logic
Send e-mail etc in case of success, redirect for error.
I am unsure if that is sufficient. I'm worried whether:
1. Another user who tries to book the same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
3. Item availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use a shared lock?
4. Is it possible that MySQL rolls back a transaction by itself? Is it generally better to retry automatically or display an error message and let the user try again?
5. I guess it's enough if I do SELECT ... FOR UPDATE on the items table. This way both a request caused by a double click and another user will have to wait till the transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile a vanilla SELECT will just see a snapshot of the db before the transaction, with no delay though, right?
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
7. I'm a bit confused about the "SELECT ... FOR UPDATE on non-existent rows" section of Willem Renzema's answer. When may it become important? Could you provide any example?
Here are some resources I've read:
How to deal with concurrent updates in databases?, MySQL: Transactions vs Locking Tables, Do database transactions prevent race conditions?,
Isolation (database systems), InnoDB Locking and Transaction Model, A beginner’s guide to database locking and the lost update phenomena.
Rewrote my original question to make it more general.
Added follow-up questions.

1. Begin transaction
2. Select items in shopping cart (with for update lock)
So far so good, this will at least prevent the user from doing checkout in multiple sessions (multiple times trying to checkout the same cart - good to deal with double clicks.)
3. Check if items haven't been booked by other user
How do you check? With a standard SELECT or with a SELECT ... FOR UPDATE? Based on step 5, I'm guessing you are checking a reserved column on the item, or something similar.
The problem here is that the SELECT ... FOR UPDATE in step 2 is NOT going to apply the FOR UPDATE lock to everything else. It is only applying to what is SELECTed: the cart-item table. Based on the name, that is going to be a different record for each cart/user. This means that other transactions will NOT be blocked from proceeding.
4. Make payment
5. Update items marking them as reserved
6. If everything went fine commit transaction, rollback otherwise
Following the above, based on the information you've provided, you may end up with multiple people buying the same item, if you aren't using SELECT ... FOR UPDATE on step 3.
Suggested Solution
1. Begin transaction
2. SELECT ... FOR UPDATE the cart-item table.
This will lock a double click out from running. What you select here should be some kind of "cart ordered" column. If you do this, a second transaction will pause here and wait for the first to finish, and then read the result that the first saved to the database.
Make sure to end the checkout process here if the cart-item table says it has already been ordered.
3. SELECT ... FOR UPDATE the table where you record if an item has been reserved.
This will block OTHER carts/users from reading those items with their own SELECT ... FOR UPDATE until your transaction ends.
Based on the result, if the items are not reserved, continue:
4. UPDATE ... the table in step 3, marking the item as reserved. Do any other INSERTs and UPDATEs you need, as well.
5. Make payment. Issue a rollback if the payment service says the payment didn't work.
6. Record payment, if success.
7. Commit transaction
Make sure you don't do anything that might fail between steps 5 and 7 (like sending emails), else you may end up with them making a payment without it being recorded, in the event the transaction gets rolled back.
Step 3 is the important step with regards to making sure two (or more) people don't try to order the same item. If two people do try, the 2nd person will end up having their webpage "hang" while it processes the first. Then when the first finishes, the 2nd will read the "reserved" column, and you can return a message to the user that someone has already purchased that item.
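To make the waiting behaviour concrete, here is a toy model in Python rather than real SQL. It is only a sketch under assumptions: a `threading.Lock` stands in for the row lock that SELECT ... FOR UPDATE holds until commit or rollback, and the names `checkout`, `new_item` and `run_two_checkouts` are made up for this illustration.

```python
import threading


def new_item():
    # Toy row: the lock plays the role of the InnoDB row lock.
    return {"lock": threading.Lock(), "reserved": False}


def checkout(item, pay):
    """Model steps 3-7 for one item; returns a status string."""
    with item["lock"]:                # step 3: wait here while someone else holds the row
        if item["reserved"]:          # read the state only after the lock is ours
            return "already reserved"
        item["reserved"] = True       # step 4: mark the item as reserved
        if not pay():                 # step 5: payment attempt
            item["reserved"] = False  # models ROLLBACK
            return "payment failed"
        return "booked"               # leaving the block models COMMIT


def run_two_checkouts():
    """Two users try to book the same item at the same time."""
    item = new_item()
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(checkout(item, lambda: True)))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Whichever thread gets the lock first books the item; the other waits, then sees the reserved flag and is told the item is gone.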
Payment in transaction or not
This is subjective. Generally, you want to close transactions as quickly as possible, to avoid multiple people being locked out from interacting with the database at once.
However, in this case, you actually do want them to wait. It's just a matter of how long.
If you choose to commit the transaction before payment, you'll need to record your progress in some intermediate table, run the payment, and then record the result. Be aware that if the payment fails, you'll then have to manually undo the item reservation records that you updated.
SELECT ... FOR UPDATE on non-existent rows
Just a word of warning, in case your table design involves inserting rows that you first need to SELECT ... FOR UPDATE: if a row doesn't exist, that transaction will NOT cause other transactions to wait, if they also SELECT ... FOR UPDATE the same non-existent row.
So, make sure to always serialize your requests by doing a SELECT ... FOR UPDATE on a row that you know exists first. Then you can SELECT ... FOR UPDATE on the row that may or may not exist yet. (Don't try to do just a SELECT on the row that may or may not exist, as you'll be reading the state of the row at the time the transaction started, not at the moment you run the SELECT. So, SELECT ... FOR UPDATE on non-existent rows is still something you need to do in order to get the most up to date information, just be aware it will not cause other transactions to wait.)

1. Other user that tries to book same item at the same time will be handled correctly. Will his transaction T2 wait until T1 is done?
Yes. While an active transaction holds a FOR UPDATE lock on a record, statements in other transactions that use any lock (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE) will be suspended until either the active transaction commits or rolls back, or the "Lock wait timeout" is exceeded.
2. Payment using PayPal or Stripe may take some time. Wouldn't this become a problem in terms of performance?
This will not be a problem, as this is exactly what is necessary. Checkout transactions for the same items should be executed sequentially, i.e. a later checkout should not start before the earlier one finishes.
3. Items availability will be shown correctly all the time (items should be available until checkout succeeds). Should these read-only selects use shared lock?
The REPEATABLE READ isolation level ensures that changes made by a transaction are not visible to other transactions until that transaction is committed. Therefore item availability will be displayed correctly. Nothing will be shown as unavailable before it is actually paid for. No locks are necessary.
A SELECT ... LOCK IN SHARE MODE would make a checkout transaction wait until the reading transaction is finished. This could slow down checkouts without giving any payoff.
4. Is it possible that MySql rollbacks transaction by itself? Is it generally better to retry automatically or display an error message and let user try again?
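It is possible. When a deadlock happens, InnoDB rolls back one of the transactions involved. When the "Lock wait timeout" is exceeded, by default only the statement that was waiting is rolled back; the whole transaction is rolled back only if innodb_rollback_on_timeout is enabled. In both cases it would be a good idea to roll back and retry automatically.
By default suspended statements fail after 50s (innodb_lock_wait_timeout).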
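An automatic retry could look roughly like the sketch below. It is not tied to any particular driver: `DbError` is a hypothetical stand-in for whatever exception your MySQL library raises, and `transaction` and `rollback` are callables you would supply. The two error numbers are MySQL's codes for lock wait timeout (1205) and deadlock (1213).

```python
import time

# MySQL error numbers for the two conditions worth retrying.
ER_LOCK_WAIT_TIMEOUT = 1205
ER_LOCK_DEADLOCK = 1213
RETRYABLE = {ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK}


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(transaction, rollback, attempts=3, delay=0.0):
    """Call transaction(); on a retryable error, roll back and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DbError as exc:
            # Roll back explicitly: a lock wait timeout may undo only the
            # failed statement and leave the transaction open.
            rollback()
            if exc.errno not in RETRYABLE or attempt == attempts:
                raise
            time.sleep(delay * attempt)  # simple linear backoff between tries
```

Anything that is not a deadlock or lock wait timeout (a duplicate key error, for instance) is re-raised immediately, since retrying would not help.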
5. I guess it's enough if I do SELECT ... FOR UPDATE on items table. This way both request caused by double click and other user will have to wait till transaction finishes. They'll wait because they also use FOR UPDATE. Meanwhile vanilla SELECT will just see a snapshot of db before the transaction, with no delay though, right?
Yes, SELECT ... FOR UPDATE on items table should be enough.
Yes, these selects wait, because FOR UPDATE is an exclusive lock.
Yes, simple SELECT will just grab value as it was before transaction started, this will happen immediately.
6. If I use JOIN in SELECT ... FOR UPDATE, will records in both tables be locked?
Yes, SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE, UPDATE, DELETE lock all read records, so whatever we JOIN is included. See MySql Docs.
What's interesting (at least for me): everything that is scanned in the processing of the SQL statement gets locked, no matter whether it is selected or not. For example WHERE id < 10 would also lock the record with id = 10!
If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.

Select only unlocked rows mysql

I have locked one row in one transaction by following query
START TRANSACTION;
SELECT id FROM children WHERE id=100 FOR UPDATE;
And in another transaction i have a query as below
START TRANSACTION;
SELECT id FROM children WHERE id IN (98,99,100) FOR UPDATE;
It gives error lock wait timeout exceeded.
Here 100 is already locked (in the first transaction), but the ids 98 and 99 are not locked. Is there any way to return the records 98 and 99 if only row 100 is locked in the above query? The result should be as below:
Id
===
98
99
===
Id 100 should be ignored because 100 is locked by a transaction.

Looks like SKIP LOCKED option mentioned in a previous answer is now available in MySQL. It does not wait to acquire a row lock and allows you to work with rows that are not currently locked.
From MySQL 8.0.0 Release Notes/Changes in MySQL 8.0.1:
InnoDB now supports NOWAIT and SKIP LOCKED options with SELECT ... FOR SHARE and SELECT ... FOR UPDATE locking read statements. NOWAIT causes the statement to return immediately if a requested row is locked by another transaction. SKIP LOCKED removes locked rows from the result set. See Locking Read Concurrency with NOWAIT and SKIP LOCKED.
Sample usage (complete example with outputs can be found in the link above):
START TRANSACTION;
SELECT * FROM tableName FOR UPDATE SKIP LOCKED;
Also, it might be good to include the warning in the Reference Manual here as well:
Queries that skip locked rows return an inconsistent view of the data. SKIP LOCKED is therefore not suitable for general transactional work. However, it may be used to avoid lock contention when multiple sessions access the same queue-like table.
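As a plain illustration of the semantics quoted above (not of how InnoDB implements them), the two options can be modelled as functions over a set of row ids that other sessions currently hold locked. The function names and the `RuntimeError` are made up for this sketch; the ids are the ones from the question.

```python
def select_for_update_skip_locked(candidate_ids, locked_by_others):
    """Model of SKIP LOCKED: return only the rows we can lock right now."""
    return [row_id for row_id in candidate_ids if row_id not in locked_by_others]


def select_for_update_nowait(candidate_ids, locked_by_others):
    """Model of NOWAIT: fail at once if any requested row is locked."""
    blocked = [row_id for row_id in candidate_ids if row_id in locked_by_others]
    if blocked:
        raise RuntimeError(f"rows locked by another transaction: {blocked}")
    return list(candidate_ids)


# Session 1 holds id 100; session 2 asks for 98, 99 and 100.
print(select_for_update_skip_locked([98, 99, 100], {100}))  # [98, 99]
```

The skipped row is simply missing from the result, which is why the manual calls the view inconsistent.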

MySQL does not have a way to ignore locked rows in a SELECT. You'll have to find a different way to set a row aside as "already processed".
The simplest way is to lock the row briefly in the first query just to mark it as "already processed", then unlock it and lock it again for the rest of the processing - the second query will wait for the short "marker" query to complete, and you can add an explicit WHERE condition to ignore already-marked rows. If you don't want to rely on the first operation being able to complete successfully, you may need to add a bit more complexity with timestamps and such to clean up after those failed operations.

MySQL does not have this feature. For anyone searching for this topic in general, some RDBMS have better/smarter locking features than others.
For developers constrained to MySQL, the best approach is to add a column (or use an existing one, e.g. a status column) that can be set to "locked" or "in progress" or similar: execute SELECT ID, * ... WHERE IN_PROGRESS != 1 FOR UPDATE; to get the ID of the row you want to lock, then issue UPDATE .. SET IN_PROGRESS = 1 WHERE ID = XX and commit to release the row locks.
Using LOCK IN SHARE MODE is almost never the solution because, while it'll let you read the old value, the old value is in the process of being updated, so unless you are performing a non-atomic task, there's no point in even looking at that record.
Better* RDBMS recognize this pattern (select one row to work on and lock it, work on it, unlock it) and provide a smarter approach that lets you only search unlocked records. For example, PostgreSQL 9.5+ provide SELECT ... SKIP LOCKED which only selects from within the unlocked subset of rows matching the query. That lets you obtain an exclusive lock on a row, service that record to completion, then update & unlock the record in question without having to block other threads/consumers from being able to work independent of yourself.
*Here "better" means from the perspective of atomic updates, multi-consumer architecture, etc. and not necessarily "better designed" or "overall better." Not trying to start a flamewar here.

As per http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
The solution is to perform the SELECT in a locking mode using LOCK IN SHARE MODE:
SELECT * FROM parent WHERE NAME = 'Jones' LOCK IN SHARE MODE;

Is SELECT ... FOR UPDATE suitable for a one-time-use row scenario?

I have a table of promo_codes that can be activated by a web application. There is a state column which can be either 0 for unactivated or 1 for activated. If I run a transaction with
SELECT id FROM promo_codes WHERE state=0 LIMIT 1 FOR UPDATE;
UPDATE promo_codes SET state=1 WHERE id = ?;
What happens to a second transaction running:
SELECT id FROM promo_codes WHERE state=0 LIMIT 1 FOR UPDATE;
Does it simply return the next row, or does it block until the first transaction is done?
I've actually started thinking about just setting a lock based on the row id in redis because it's obvious to me how that would work and I know it wouldn't create any performance issues in MySQL, but on the other hand, there must be a clean and performant way to make this work purely in SQL. Maybe I could just do an UPDATE ... LIMIT 1 first, but how do I get the id of the promo code back in that case?

The SELECT ... FOR UPDATE and LOCK IN SHARE MODE modifiers effectively run in READ-COMMITTED isolation mode even if the current isolation mode is REPEATABLE-READ. This is done because InnoDB can only lock the current version of a row. Think about a similar case where the row is being deleted. Even if InnoDB were able to set locks on rows which no longer exist, would it do any good for you? Not really: for example, you could try to update the row which you just locked with SELECT FOR UPDATE, but this row is already gone, so you would get a quite unexpected error updating the row which you thought you had locked successfully. Anyway, it is done this way for good reason; all other decisions would be even more troublesome.
LOCK IN SHARE MODE is actually often used to bypass multiversioning and make sure we're reading the most current data, plus to ensure it can't be changed. This can be used, for example, to read a set of rows, compute new values for some of them and write them back. If we did not use LOCK IN SHARE MODE we could be in trouble, as rows could be updated before we write new values to them and such an update could be lost.

Why does the Autoincrement process in MySQL on a InnoDB table sometimes increments by more then 1? [duplicate]

A co-worker just made me aware of a very strange MySQL behavior.
Assume you have a table with an auto_increment field and another field that is set to unique (e.g. a username field). When trying to insert a row with a username that's already in the table, the insert fails, as expected. Yet the auto_increment value is increased, as can be seen when you insert a valid new entry after several failed attempts.
For example, when our last entry looks like this...
ID: 10
Username: myname
...and we try five new entries with the same username value on our next insert we will have created a new row like so:
ID: 16
Username: mynewname
While this is not a big problem in itself it seems like a very silly attack vector to kill a table by flooding it with failed insert requests, as the MySQL Reference Manual states:
"The behavior of the auto-increment mechanism is not defined if [...] the value becomes bigger than the maximum integer that can be stored in the specified integer type."
Is this expected behavior?

InnoDB is a transactional engine.
This means that in the following scenario:
1. Session A inserts record 1
2. Session B inserts record 2
3. Session A rolls back
there is either the possibility of a gap, or session B would have to block until session A committed or rolled back.
InnoDB designers (like most other transactional engine designers) chose to allow gaps.
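The scenario above can be played through with a toy counter in Python. This is only a model of the observable behaviour, not of InnoDB internals; the class `AutoIncTable` and its methods are invented for the illustration.

```python
class AutoIncTable:
    """Toy table: the counter only moves forward, even when an insert is undone."""

    def __init__(self):
        self.next_id = 1
        self.rows = {}

    def insert(self, value):
        row_id = self.next_id
        self.next_id += 1             # the value is handed out immediately
        self.rows[row_id] = value
        return row_id

    def rollback(self, row_id):
        self.rows.pop(row_id, None)   # the row disappears, the counter stays put


table = AutoIncTable()
a = table.insert("session A")         # gets id 1
b = table.insert("session B")         # gets id 2
table.rollback(a)                     # session A rolls back
c = table.insert("session B again")   # gets id 3, not 1
print(sorted(table.rows))             # [2, 3]: id 1 stays a gap
```

Reusing id 1 would require session B to wait for session A's outcome before it could be given a number, which is the blocking alternative the answer describes.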
From the documentation:
When accessing the auto-increment counter, InnoDB uses a special table-level AUTO-INC lock that it keeps to the end of the current SQL statement, not to the end of the transaction. The special lock release strategy was introduced to improve concurrency for inserts into a table containing an AUTO_INCREMENT column
…
InnoDB uses the in-memory auto-increment counter as long as the server runs. When the server is stopped and restarted, InnoDB reinitializes the counter for each table for the first INSERT to the table, as described earlier.
If you are afraid of the id column wrapping around, make it BIGINT (8-byte long).
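For a sense of scale, here is a back-of-the-envelope calculation. The rate of one million wasted ids per second is an arbitrary assumption chosen to be pessimistic, and the helper name is made up; the maximum values are those of the MySQL integer types.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Largest value each column type can hold.
MAX_VALUES = {
    "INT UNSIGNED": 2**32 - 1,
    "BIGINT": 2**63 - 1,
    "BIGINT UNSIGNED": 2**64 - 1,
}


def seconds_to_exhaust(column_type, ids_per_second):
    """Seconds until the counter hits the type's maximum at a constant rate."""
    return MAX_VALUES[column_type] / ids_per_second


rate = 1_000_000  # assumed: one million ids consumed per second
print(seconds_to_exhaust("INT UNSIGNED", rate) / 60)         # roughly 72 minutes
print(seconds_to_exhaust("BIGINT", rate) / SECONDS_PER_YEAR)  # roughly 292,000 years
```

So a flood of failed inserts could plausibly exhaust a 32-bit id, but not a BIGINT one.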

Without knowing the exact internals, I would say yes, the auto-increment SHOULD allow for skipped values due to failed inserts. Let's say you are doing a banking transaction, or another one where the entire transaction and multiple records go as an all-or-nothing. If you try your insert, get an ID, then stamp all subsequent details with that transaction ID and insert the detail records, you need to ensure your qualified uniqueness. If you have multiple people slamming the database, they too will need to ensure they get their own transaction ID so as not to conflict with yours when their transaction gets committed. If something fails on the first transaction, no harm done, and no dangling elements downstream.

Old post, but this may help people: you may have to set innodb_autoinc_lock_mode to 0 or 2.
System variables that take a numeric value can be specified as --var_name=value on the command line or as var_name=value in option files.
Command-line parameter format:
--innodb-autoinc-lock-mode=0
OR
Open your MySQL option file (my.cnf, or my.ini on Windows) and add the following line:
innodb_autoinc_lock_mode=0

I know that this is an old article, but since I also couldn't find the right answer, I actually found a way to do this. You have to wrap your query in an if statement. It's usually INSERT queries or INSERT ... ON DUPLICATE KEY UPDATE queries that mess up the auto-increment order, so for regular inserts use:
// email_exists() and insert_user() are hypothetical helpers wrapping your own SELECT and INSERT
$check_email_address = email_exists($email);
if ($check_email_address === false) {
    insert_user($email); // your query inside of here
}
And instead of INSERT ... ON DUPLICATE KEY UPDATE, use an UPDATE ... SET ... WHERE query; inside or outside an if statement, it doesn't matter. A REPLACE INTO query also seems to work.
