I have read about the optimistic locking scheme, where clients can read values, perform their computation, and, when a write needs to happen, the update is validated before being written to the database.
Let's say we use a version column for optimistic locking. Then (in the case of two clients) both will issue update statements like:
UPDATE tableName SET field = val, version = oldVersion + 1
WHERE version = oldVersion AND id = x;
Now let's consider the following scenario with two clients:
Both clients read the values of field and version.
Both clients compute something at their end and generate a new value for field.
Now both clients send their update query to the database server.
As soon as the queries reach the database:
One client's UPDATE starts executing.
But in the meantime interleaving happens and the other client's UPDATE starts executing.
Will this query interleaving cause data races on the table?
What I mean is that we can't say optimistic locking works entirely on its own. I understand the case where row-level locking (or some other locking such as table-level locking) happens underneath; then it's fine. But then optimistic locking doesn't really work on its own, it also needs a pessimistic lock (row level / table level, which depends entirely on the underlying storage engine implementation).
What happens when there are no row/table-level locks already in place, but we want to implement an optimistic locking strategy? With query interleaving, will that cause data races on the table (i.e. only field gets updated, version does not, and then interleaving happens)? Or does this depend entirely on the isolation level set for the query?
I'm a little confused by this scenario.
Also, what is the right use case where optimistic locking is really helpful and increases the overall performance of an application compared to pessimistic locking?
Here is the worst case in pseudo code: two clients update the same record.
Scenario 1 (your scenario: optimistic locking):
Final constraints are checked on the application (server) side; the optimistic read is used only for presentation purposes.
Client one orders a product of which there is only 1 in stock.
Client two orders the same product of which there is only 1 in stock.
Both clients get this presented on the screen.
Products table:
CREATE TABLE products (
product_id VARCHAR(200),
stock INT,
price DOUBLE(5,2)
) ENGINE=InnoDB;
Presentation code:
-- Presentation:
SELECT * FROM products WHERE product_id="product_a";
-- Presented to client
Order code:
-- Verification of the record (executed in the same block of code,
-- within as short a time interval as possible):
SELECT stock FROM products WHERE product_id="product_a";
IF(stock>0) THEN
-- Client clicks "order" (one click method=also payment);
START TRANSACTION;
-- Gets a record lock
SELECT * FROM products WHERE product_id="product_a" FOR UPDATE;
UPDATE products SET stock=stock-1 WHERE product_id="product_a";
INSERT INTO orders (customer_id,product_id,price)
VALUES (customer_1, "product_a",price);
COMMIT;
END IF;
The result of this scenario is that both orders can succeed: both clients see stock>0 in the first SELECT, and both then execute the order placement. This is an unwanted situation in almost any scenario, so it would then have to be addressed in code by cancelling one of the orders, which takes a few more transactions.
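For comparison, the version-column scheme from the question would close this window without FOR UPDATE, because the stock check and the write happen in a single atomic statement. A minimal sketch, assuming a version INT column has been added to products:
-- Both clients read the row first and see, say, stock = 1 and version = 7:
SELECT stock, version FROM products WHERE product_id="product_a";

-- Each client then sends the conditional update; the statements may interleave,
-- but only one of them can match version = 7:
UPDATE products
SET stock = stock - 1, version = version + 1
WHERE product_id="product_a" AND version = 7;

-- Client code checks the affected-row count: 1 row means the order may proceed,
-- 0 rows means the other client won, so re-read and retry (or cancel the order).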
Scenario 2: Alternative to optimistic locking:
Final constraints are checked on the database side; the optimistic read is used only for presentation purposes. Fewer database queries than in the previous scenario, and less chance of having to redo the order.
Client one orders a product of which there is only 1 in stock.
Client two orders the same product of which there is only 1 in stock.
Both clients get this presented on the screen.
Products table:
CREATE TABLE products (
product_id VARCHAR(200),
stock INT,
price DOUBLE(5,2),
CHECK (stock>=0) -- The constraint preventing overselling (enforced by MySQL 8.0.16 and later)
) ENGINE=InnoDB;
Presentation code:
-- Presentation:
SELECT * FROM products WHERE product_id="product_a";
-- Presented to client
Order code:
-- Client clicks "order" (one click method=also payment);
START TRANSACTION;
-- Gets a record lock
SELECT * FROM products WHERE product_id="product_a" FOR UPDATE;
UPDATE products SET stock=stock-1 WHERE product_id="product_a";
INSERT INTO orders (customer_id,product_id,price)
VALUES (customer_1, "product_a",price);
COMMIT;
So now two customers get presented this product and click order at the same time. The system executes both orders simultaneously. The result: one order is placed; the other gets an exception, because the constraint fails to verify, and that transaction is aborted. This abort (exception) has to be handled in code, but it does not take any further queries or transactions.
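Spelled out for the losing client, the flow looks roughly like this (a sketch only; how the constraint violation surfaces to the application depends on the driver):
-- Client two's transaction, arriving a moment after client one's:
START TRANSACTION;
SELECT * FROM products WHERE product_id="product_a" FOR UPDATE; -- blocks until client one commits
UPDATE products SET stock=stock-1 WHERE product_id="product_a"; -- rejected: stock would fall below 0,
                                                                -- so the CHECK constraint aborts this statement
ROLLBACK; -- the application catches the error; no further queries or transactions are needed
-- (The INSERT INTO orders is never reached for this client.)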
Related
I have a MySQL table of Users, and a table of Actions performed by the Users (linked to a User by the primary key, userid). The Actions table has an auto-incrementing key indx. Whenever I add a new row to that table, I then update the latest column of the relevant Users row with the indx of the row I just added to the Actions table. So something like:
INSERT INTO actions(indx,actionname,userid) VALUES(default, "myaction", 1);
UPDATE users SET latest=LAST_INSERT_ID() WHERE userid=1;
The idea being that I can check for updates for a User by seeing if latest is higher than it was the last time I checked.
My issue is that if more than one connection is open on the database and both try to add an Action for the same User at the same time, connection 2 could conceivably run its INSERT and UPDATE between the INSERT and UPDATE of connection 1, and the latest entry of the user they're both trying to update would no longer hold the indx of the most recent action entry.
I've been reading up on transactions, isolation levels, etc., but haven't really found a way around this (though my understanding of how these work exactly is pretty shaky, so maybe I just misunderstood). I think I need a way to lock the Actions table until the Users table is updated. This application only gets used by a few hundred users tops, so I don't think the performance hit from momentarily locking the table will be too bad.
So is that something that can be done in MySQL? Is there a better solution? I imagine this general pattern must be pretty common: one table with a bunch of varieties of rows, and a second table with a row that tracks metadata for each variety in table A and needs to be updated atomically each time the first table is changed. So I'm hoping there's a solution that isn't too complex.
Use SELECT ... FOR UPDATE to lock the row, in order to serialize access to that user's row and prevent race conditions:
START TRANSACTION;
SELECT any_column FROM users WHERE userid=1 FOR UPDATE;
INSERT INTO actions(indx,actionname,userid) VALUES(default, "myaction", 1);
UPDATE users SET latest=LAST_INSERT_ID() WHERE userid=1;
COMMIT;
However this will slow down your INSERT rate, because these transactions are serialized across all sessions that touch the same user.
The better option is to not store the last ID in the users table at all. Just use SELECT MAX(indx) FROM actions WHERE userid = xxx wherever this number is required. With an index on actions(userid) this query will be very fast (indx being the primary key, InnoDB secondary indexes include it, so the maximum can be resolved from the index alone), and the inserts will not be slowed down.
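A minimal sketch of that approach (the index name is chosen here for illustration; the column names are from the question):
-- Secondary index on userid; in InnoDB it implicitly contains the primary key indx,
-- so MAX(indx) per user can be answered from the index alone.
CREATE INDEX idx_actions_userid ON actions (userid);

-- "What is the latest action for user 1?" - nothing in the users table to keep in sync.
SELECT MAX(indx) AS latest FROM actions WHERE userid = 1;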
High-level summary of the issue: locking of the inventory table when placing orders is resulting in order failures due to lock timeouts.
Tracing through the checkout process, I see the following queries being executed: (comments added by me)
-- Lock stock and product tables
SELECT `si`.*, `p`.`type_id` FROM `cataloginventory_stock_item` AS `si`
INNER JOIN `catalog_product_entity` AS `p` ON p.entity_id=si.product_id
WHERE (stock_id=1) AND (product_id IN(28775, 28777)) FOR UPDATE
-- Perform the actual stock update
UPDATE `cataloginventory_stock_item`
SET `qty` =
CASE product_id
WHEN 28775 THEN qty-2
WHEN 28777 THEN qty-1
ELSE
qty
END
WHERE (product_id IN (28775, 28777)) AND (stock_id = 1)
My understanding of the FOR UPDATE modifier of a SELECT statement is that all rows in tables that were returned in the SELECT will be locked (read and write?) until the transaction is committed.
From my understanding of MySQL, the fact that the cataloginventory_stock_item query has a calculated value for the qty column (i.e. the value wasn't calculated in PHP and passed into the query, the new column value is based on the existing column value when the query is performed) means it will not be susceptible to race conditions.
My questions are:
Are my assumptions correct?
Why does Magento need a lock of catalog_product_entity in order to update the stock?
Why does Magento need a lock of cataloginventory_stock_item if the cataloginventory_stock_item UPDATE is atomic?
1) Yes, your assumptions regarding FOR UPDATE are correct: the rows selected in both cataloginventory_stock_item and catalog_product_entity will be locked exclusively. That is, other transactions that try to modify these rows, or lock them with FOR UPDATE or LOCK IN SHARE MODE, will block until the transaction commits (plain non-locking SELECTs are still served from the MVCC snapshot).
2) I don't know, and in fact it seems it doesn't need to. Perhaps this is to prevent race conditions when a user is manually updating stock status or similar, but I still don't see why it couldn't be removed. Another possibility is that the original author intended to support multiple stock items per product and thought the "parent" row should be locked.
3) Because the PHP code checks whether the item is "salable" using the loaded values before issuing the UPDATE. Without locking, two processes could load the same value and then race to update it. So even though the UPDATE is atomic, it does not fail when the data it was based on has already changed underneath it (sketched below).
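A stripped-down sketch of the pattern described in 3), using the product and stock ids from the queries above (the quantity check itself happens in PHP):
START TRANSACTION;
-- Lock the stock row so nobody else can change qty between our check and our write.
SELECT qty FROM cataloginventory_stock_item
WHERE product_id = 28775 AND stock_id = 1 FOR UPDATE;
-- Application code verifies here that qty covers the requested amount ("salable").
UPDATE cataloginventory_stock_item SET qty = qty - 2
WHERE product_id = 28775 AND stock_id = 1;
COMMIT;
-- Without the FOR UPDATE, two checkouts could both read qty = 2, both pass the
-- salable check, and both decrement, driving qty negative.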
It is unclear to me (from reading the MySQL docs) whether the following query, run against InnoDB tables on MySQL 5.1, takes a write lock on each of the rows it updates internally (5000 in total), or locks all the rows in the batch at once. As the database is under really heavy load, this is very important.
UPDATE `records`
INNER JOIN (
SELECT id, name FROM related LIMIT 0, 5000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`
I'd expect it to be per row, but as I do not know a way to make sure of it, I decided to ask someone with deeper knowledge. If this is not the case and the DB locks all the rows in the set, I'd be grateful for an explanation of why.
The UPDATE runs in a transaction: it's an atomic operation, which means that if one of the rows fails (because of a unique constraint, for example) it won't update any of the 5000 rows. This is one of the ACID properties of a transactional database.
Because of this, the UPDATE holds a lock on all of the rows for the entire transaction. Otherwise another transaction could further update the value of a row based on its current value (say UPDATE records SET value = value * 2). That statement would produce a different result depending on whether the first transaction commits or rolls back, so it has to wait for the first transaction to complete all 5000 updates.
If you want to release the locks sooner, just do the update in (smaller) batches, as sketched below.
P.S. autocommit controls whether each statement is issued in its own transaction, but does not affect the execution of a single query.
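A hedged sketch of that batching idea (the id range bound is an addition so each batch is well defined; the batch size is arbitrary):
-- Run repeatedly, advancing the id range, so each transaction locks far fewer rows.
UPDATE `records`
INNER JOIN (
    SELECT id, name FROM related
    WHERE id > 0 AND id <= 1000   -- next run: 1000 < id <= 2000, and so on
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`;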
This is a follow up on my previous question (you can skip it as I explain in this post the issue):
MySQL InnoDB SELECT...LIMIT 1 FOR UPDATE Vs UPDATE ... LIMIT 1
Environment:
JSF 2.1 on Glassfish
JPA 2.0 EclipseLink and JTA
MySQL 5.5 InnoDB engine
I have a table:
CREATE TABLE v_ext (
v_id INT NOT NULL AUTO_INCREMENT,
product_id INT NOT NULL,
code VARCHAR(20),
username VARCHAR(30),
PRIMARY KEY (v_id)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
It is populated with 20,000 records like this one (product_id is 54 for all records, code is randomly generated and unique, username is set to NULL):
v_id product_id code username
-----------------------------------------------------
1 54 '20 alphanumerical' NULL
...
20,000 54 '20 alphanumerical' NULL
When a user purchases product 54, he gets a code from that table. If the user purchases multiple times, he gets a code each time (there is no unique constraint on username). Because I am preparing for high activity I want to make sure that:
No concurrency problem or deadlock can occur
Performance is not impacted by whatever locking mechanism is needed
From the SO question (see link above) I found that doing such a query is faster:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL LIMIT 1 FOR UPDATE;
// Use result for next query
UPDATE v_ext SET username=xxx WHERE v_id=...;
COMMIT;
However I found a deadlock issue ONLY when using an index on the username column. I thought adding an index would help speed things up a little, but it creates a deadlock after about 19,970 records (quite consistently at that number of rows). Is there a reason for this? I don't understand. Thank you.
From a purely theoretical point of view, it looks like you are not locking the right rows (the condition in the first statement differs from the one in the UPDATE statement; besides, you only lock one row because of LIMIT 1, whereas you could possibly update more rows later on).
Try this:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL AND v_id=yyy FOR UPDATE;
UPDATE v_ext SET username=xxx WHERE v_id=yyy;
COMMIT;
[edit]
As for the reason for your deadlock, this is the probable answer (from the manual):
If you have no indexes suitable for your statement and MySQL must scan
the entire table to process the statement, every row of the table
becomes locked (...)
Without an index, the SELECT ... FOR UPDATE statement is likely to lock the entire table, whereas with an index, it only locks some rows. Because you didn't lock the right rows in the first statement, an additional lock is acquired during the second statement.
Obviously, a deadlock cannot happen if the whole table is locked (i.e. without an index).
A deadlock can certainly occur in the second setup.
First of all, the definition of the table is wrong. You have no tid column in the table, so I suspect the primary key is v_id.
Second of all, if you SELECT ... FOR UPDATE, you lock the row. Any other locking select that comes along before the first transaction is done will wait for the row to be released, because it hits the exact same record. So you will have waits on this row.
However, I rather doubt this can be a serious problem in your case, because you have the username there, and you have the product id there. It is extremely unlikely that you will have a lot of hits on the exact same record you hit initially, and even if you do, the transaction should run very fast.
You have to understand that by using transactions, you usually give up a good deal of concurrency in exchange for consistent data. There is no way to fully have both consistency of data and concurrency at the same time.
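For reference, the single-statement variant from the linked question could be sketched like this (the username value is a placeholder; LAST_INSERT_ID(expr) is the documented trick for reading back which row this session just claimed):
-- Claim one free code atomically; no separate SELECT ... FOR UPDATE is needed.
UPDATE v_ext
SET username = 'some_user',
    v_id = LAST_INSERT_ID(v_id)   -- remembers the claimed id for this session
WHERE username IS NULL
LIMIT 1;

-- ROW_COUNT() = 0 means no free codes were left; otherwise fetch the claimed id:
SELECT ROW_COUNT() AS claimed_rows, LAST_INSERT_ID() AS claimed_v_id;
Whether this performs better than SELECT ... FOR UPDATE under load is exactly what the linked question compares.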
I would like to get a suggestion on improving my setup, which is causing the SQL server to return deadlock messages. I have a multithreaded application that uses the Task Parallel Library, and each task uses a stored procedure to select an id from a table for its processing. I immediately delete that id from the table in the same procedure, and I think that is what is causing the deadlocks. The table consists of one column of unique ids with no indexes. I thought of doing a batch delete periodically, but that means keeping a tally of used ids across multiple servers.
Here is my SQL stored procedure:
CREATE PROCEDURE [dbo].[get_Ids]
    @id nvarchar(20) OUTPUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    SELECT TOP 1 @id = siteid FROM siteids
    DELETE siteids WHERE siteid = @id
END
Is there any better way to do this? My processes work very fast; I used to request this id from a web request service, but that took 3 seconds.
Some things to try:
Maybe try hinting to the DB that you will delete the record you just selected; this way it will grab the lock early. For this to work you'll need to wrap the whole procedure in a transaction, then hint the SELECT. It should look something like:
BEGIN TRANSACTION
    SELECT TOP 1 @id = siteid FROM siteids WITH (UPDLOCK, HOLDLOCK)
    DELETE siteids WHERE siteid = @id
COMMIT TRANSACTION
Also make sure the siteid column is indexed (or tag it as the primary key, since you say it is unique); otherwise the delete has to scan the table to find the record, which can make deadlocking worse since it spends much more time deleting.
For deadlocks in general, run the SQL Profiler and see what the deadlock graph looks like - it might be that something else is going on.
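Another pattern worth trying, sketched here under the assumption that taking any available id is acceptable: delete and return the id in a single atomic statement, so there is no window between the SELECT and the DELETE at all. The READPAST hint is optional and changes semantics, letting concurrent tasks skip rows another task has locked instead of waiting on them.
DECLARE @taken TABLE (siteid nvarchar(20));

DELETE TOP (1) FROM siteids WITH (ROWLOCK, READPAST)
OUTPUT DELETED.siteid INTO @taken;          -- captures the id that was just removed

SELECT TOP 1 @id = siteid FROM @taken;      -- @id as in the original procedure
With an index or primary key on siteid, each task tends to touch only its own row, which should also reduce the deadlocks.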