Here is my Query:
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
DROP TEMPORARY TABLE IF EXISTS taken;
CREATE TEMPORARY Table taken(
id int,
invoice_id int
);
INSERT INTO taken(id, invoice_id)
SELECT id, $invoice_id FROM `licenses` l
WHERE l.`status` = 0 AND `type` = $type
LIMIT $serial_count
FOR UPDATE;
UPDATE `licenses` SET `status` = 1
WHERE id IN (SELECT id FROM taken);
If I'm going to face high concurrency, is the query above thread-safe? I mean, I don't want to assign records that have already been assigned to someone else.
With your FOR UPDATE clause, you lock all the selected licenses until the transaction ends, so you can be sure there will be no concurrency problem on those records.
The only problem I can see is that if your query takes a long time to run (how many licenses do you expect to process per query?) and other queries need the same licenses at the same time (other locking reads and writes on those rows will wait), your system will slow down.
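A minimal sketch of the whole claim as one transaction, assuming hypothetical literal values (invoice 42, type 3, 5 serials) in place of $invoice_id, $type and $serial_count, and assuming an extra index on (status, type) is acceptable for this table. The index name is made up. The ORDER BY and the final COMMIT are additions to the query in the question; the locks taken by FOR UPDATE are held until that COMMIT.

```sql
-- Hypothetical index, so the locking read does not have to scan unrelated rows.
CREATE INDEX idx_licenses_status_type ON `licenses` (`status`, `type`);

SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;

DROP TEMPORARY TABLE IF EXISTS taken;
CREATE TEMPORARY TABLE taken (
  id INT,
  invoice_id INT
);

-- Lock the free licenses and remember which ones were picked.
INSERT INTO taken (id, invoice_id)
SELECT l.id, 42
FROM `licenses` l
WHERE l.`status` = 0 AND l.`type` = 3
ORDER BY l.id
LIMIT 5
FOR UPDATE;

-- Mark the picked licenses as assigned while the locks are still held.
UPDATE `licenses` SET `status` = 1
WHERE id IN (SELECT id FROM taken);

-- Releases the row locks; keep the time between START TRANSACTION and here short.
COMMIT;
```

Ordering by id makes every session lock rows in the same order, which is a common way to reduce the chance of two sessions waiting on each other.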
Related
I need to select, manipulate, and update a lot of data in less than 3 minutes. We decided to create a kind of locking mechanism so that separate processes can run in parallel, with each process locking, selecting, and updating its own rows.
To make this possible, we decided to add the column worker_id to the table.
Table structure:
CREATE TABLE offers
(
id int(10) unsigned PRIMARY KEY NOT NULL AUTO_INCREMENT,
offer_id int(11) NOT NULL,
offer_sid varchar(255) NOT NULL,
offer_name varchar(255),
account_name varchar(255),
worker_id varchar(255)
);
CREATE UNIQUE INDEX offers_offer_id_offer_sid_unique ON offers (offer_id, offer_sid);
CREATE INDEX offers_offer_id_index ON offers (offer_id);
CREATE INDEX offers_offer_sid_index ON offers (offer_sid);
Also, we decided to start with 5 parallel processes. To prevent different processes from selecting the same row, we use the formula offer_id % max_amount_of_processes = process_number (process_number starts from 0, so the first is 0 and the last is 4).
Each process follows these steps:
1. Set worker_id to the current process id on the first 1000 rows, using the query: update offers set worker_id = :process_id where worker_id is null and offer_id % 5 = :process_number order by offer_id asc limit 1000
2. Select those rows: select * from offers where worker_id = :process_id order by offer_id asc limit 1000
3. Manipulate the data; store the last offer_id in one variable and the prepared data in another for a later update.
4. Run the same query as in step 1 to lock the next 1000 rows.
5. Run the same query as in step 2 with an additional where clause, and offer_id > :last_selected_id, to select the next 1000 rows.
6. Repeat these steps in a loop until all rows are locked.
7. Remove all locks: update offers set worker_id = null where worker_id = :process_id
8. Run the query to update all collected data.
The other 4 processes follow the same steps.
The issue is that I get a deadlock when all 5 processes simultaneously run the query from step 1 to lock rows (set worker_id), even though each process locks only its own rows according to the formula. I tried setting the transaction isolation level to READ COMMITTED, but the issue remains.
I'm a novice with locking mechanisms, and I need help to prevent deadlocks here or to create a better mechanism.
The expression offer_id%5 = :process_number cannot use an index, so it can only scan all the rows matched by the first condition, worker_id is null.
You can prove this with two windows:
mysql1> begin;
mysql1> set @p=1;
mysql1> update offers set worker_id = @p where worker_id is null and offer_id%5 = @p;
Don't commit the transaction in window 1 yet.
mysql2> set @p=2;
mysql2> update offers set worker_id = @p where worker_id is null and offer_id%5 = @p;
...waits for about 50 seconds, or value of innodb_lock_wait_timeout, then...
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
This demonstrates that each concurrent session locks overlapping sets of rows, not only the rows that match the modulus expression. So the sessions queue up against each other's locks.
This will get worse if you put all the steps into a transaction like @SloanThrasher suggests. Making the work of each worker take longer will only make them hold their locks longer, and further delay the other processes waiting on those locks.
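One way to let the modulus condition use an index, sketched under the assumption of MySQL 5.7 or later (generated columns) and a fixed worker count of 5. The column name offer_bucket, the index name, and the worker id 'worker-1' are made up for illustration, and whether the optimizer picks the index should be checked with EXPLAIN rather than taken for granted.

```sql
-- Materialize offer_id % 5 so that it can be indexed (hypothetical column name).
ALTER TABLE offers
  ADD COLUMN offer_bucket TINYINT AS (offer_id % 5) STORED,
  ADD INDEX offers_bucket_worker_offer (offer_bucket, worker_id, offer_id);

-- The claim query filters on the indexed column instead of the expression.
UPDATE offers
SET worker_id = 'worker-1'
WHERE worker_id IS NULL
  AND offer_bucket = 1
ORDER BY offer_id ASC
LIMIT 1000;

-- Check which index the statement would use before relying on it.
EXPLAIN UPDATE offers
SET worker_id = 'worker-1'
WHERE worker_id IS NULL
  AND offer_bucket = 1
ORDER BY offer_id ASC
LIMIT 1000;
```

This only narrows the set of rows each session has to examine and lock; it does not address the load-balancing concern.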
I do not understand how updated_at field can cause the issue as I'm still updating other fields
I'm not sure because you haven't posted the InnoDB deadlock diagnostics from SHOW ENGINE INNODB STATUS.
I do notice that your table has a secondary UNIQUE KEY, which will also require locks. There are some cases of deadlocks that occur because of non-atomicity of the lock assignment.
Worker 1                            Worker 2
UPDATE SET worker_id = 1
(acquires locks on PK)
                                    UPDATE SET worker_id = 2
                                    (waits for PK locks held by worker 1)
(waits for locks on UNIQUE KEY)
Both worker 1 and worker 2 can therefore be waiting on each other, and enter into a deadlock.
This is just a guess. Another possibility is that the ORM is doing a second UPDATE for the updated_at column, and this introduces another opportunity for a race condition. I haven't quite worked that out mentally, but I think it's possible.
Below is a recommendation for a different system that would avoid these problems:
There's another problem, that you're not really balancing the work over your processes to achieve the best completion time. There might not be an equal number of offers in each group when you split them by modulus. And each offer might not take the same amount of time to process anyway. So some of your workers could finish and have nothing to do, while the last worker is still processing its work.
You can solve both problems, the locking and the load-balancing:
Change the table columns in the following way:
ALTER TABLE offers
CHANGE worker_id work_state ENUM('todo', 'in progress', 'done') NOT NULL DEFAULT 'todo',
ADD INDEX (work_state),
ADD COLUMN updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
ADD INDEX (updated_at);
Create ONE process that reads from the table periodically, and adds the primary key id values of offers in a 'todo' state to a message queue. All the offers, regardless of their offer_id value, get queued in the same way.
SELECT id FROM offers WHERE work_state = 'todo'
/* push each id onto the queue */
Then each of the workers can pull one id at a time from the message queue. The worker does the following steps with each id:
UPDATE offers SET work_state = 'in progress' WHERE id = :id
The worker performs the work for its one offer.
UPDATE offers SET work_state = 'done' WHERE id = :id
These worker queries only reference one offer at a time, and they address the offers by primary key, which will use the PK index and only lock one row at a time.
Once it has finished one offer, then the worker pulls the next offer from the queue.
In this way, the workers will all finish at the same time, and the work will be balanced over the workers better. Also you can start or stop workers at any time, and you don't care about what worker number they are, because your offers don't need to be processed by a worker with the same number as the modulus of the offer_id.
When the workers finish all the offers, the message queue will be empty. Most message queues allow workers to do blocking reads, so while the queue is empty, the worker will just wait for the read to return. When you use a database, the workers have to poll frequently for new work.
There's a chance a worker will fail during its work, and never mark an offer 'done'. You need to check periodically for orphaned offers. Assume they are not going to be completed, and mark their state 'todo'.
UPDATE offers SET work_state = 'todo'
WHERE work_state = 'in progress' AND updated_at < NOW() - INTERVAL 5 MINUTE
Choose the interval length so it's certain that any worker would have finished it by that time unless something had gone wrong. You would probably do this "reset" before the dispatcher queries for current offers todo, so the offers that had been forgotten will be re-queued.
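The worker's claim step can be made defensive, so that an id delivered twice (for example, after the reset re-queues it) is processed only once. A minimal sketch, where the id 123 stands in for :id; the extra work_state conditions and the ROW_COUNT() check are additions for illustration, not part of the recommendation above.

```sql
-- Claim the offer only if it is still waiting.
UPDATE offers
SET work_state = 'in progress'
WHERE id = 123 AND work_state = 'todo';

-- 1 means this worker claimed the offer; 0 means it was not in the 'todo' state, so skip it.
SELECT ROW_COUNT() AS claimed;

-- ... the worker performs the work for the offer ...

-- Mark it done only if it is still in progress.
UPDATE offers
SET work_state = 'done'
WHERE id = 123 AND work_state = 'in progress';
```

ROW_COUNT() must be read on the same connection, immediately after the UPDATE it refers to.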
I found the issue. My ORM updates timestamp fields by default when doing an update (I removed them from the table structure to simplify the example above), and after I turned that off the deadlock disappeared. But still, I do not understand how the updated_at field can cause the issue, as I'm still updating other fields.
I have two tables:
spies
----------------
id        | PK
weapon_id | FK
name      |
weapons
----------------
id        | PK
name      |
I'm trying to clarify whether there is a difference between these two SQL updates (when using MySQL InnoDB):
Query 1:
UPDATE spies SET name = 'Bond', weapon_id = 1 WHERE id = 1
OR
Query 2:
UPDATE spies SET name = 'Bond' WHERE id = 1
I have heard that updating a row that has a FK creates a read-only lock (not sure if that's the correct term) on the parent.
Would using Query 2 avoid the lock on the parent table?
Consider the following schema:
(Rem stmts left in for your convenience) :
-- drop table if exists spies;
-- drop table if exists weapons;
-- weapons is created first because spies references it
create table weapons
( id int primary key,
name varchar(100) not null
)engine=InnoDB;
create table spies
( id int primary key,
weapon_id int not null,
name varchar(100) not null,
key(weapon_id),
foreign key (weapon_id) references weapons(id)
)engine=InnoDB;
insert weapons(id,name) values (1,'slingshot'),(2,'Ruger');
insert spies(id,weapon_id,name) values (1,2,'Sally');
-- truncate table spies;
Now, we have 2 processes, P1 and P2. It is best to test with P1 in, say, MySQL Workbench and P2 in a MySQL command-line window. In other words, you have to set these up as separate connections. You need a meticulous eye to run these step by step in the proper order (described in the Narrative below) and to see the impact on the other process's window.
Consider the following queries, keeping in mind that a mysql query not wrapped in an explicit transaction is itself an implicit transaction. But below, I swung for explicit:
Q1:
START TRANSACTION;
-- place1
UPDATE spies SET name = 'Bond', weapon_id = 1 WHERE id = 1;
-- place2
COMMIT;
Q2:
START TRANSACTION;
-- place1
UPDATE spies SET name = 'Bond' WHERE id = 1;
-- place2
COMMIT;
Q3:
START TRANSACTION;
-- place1
SELECT id into @mine_to_use from weapons where id=1 FOR UPDATE; -- place2
-- place3
COMMIT;
Q4:
START TRANSACTION;
-- place1
SELECT id into @mine_to_use from spies where id=1 FOR UPDATE; -- place2
-- place3
COMMIT;
Q5 (hodge podge of queries):
SELECT * from weapons;
SELECT * from spies;
Narrative
Q1: When P1 runs Q1 and gets to place2, it has obtained row-level locks in both tables, weapons and spies, for the id=1 row (2 rows total, 1 row in each table): an exclusive lock on the spies row it updates, and a shared lock on the weapons row that the foreign key check looks at (see Note1 below). This can be proved by P2 starting to run Q3, getting to place1, but blocking at place2, and only being freed when P1 gets around to calling COMMIT. Everything I just said about P2 running Q3 also applies to P2 running Q4. In summary, on the P2 screen, place2 freezes until the P1 COMMIT.
A note again about implicit transactions. Your real Q1 query will run very fast and do an implicit commit as it finishes. The prior paragraph, however, breaks down what would happen if you had more time-costly routines running.
Q2: When P1 runs Q2 and gets to place2, it has obtained an exclusive row-level lock only on the spies row with id=1; it holds no lock in weapons. So P2 has no issues running Q3 against weapons, but P2 blocks at place2 when running Q4 against spies.
So, the difference between Q1 and Q2 comes down to MySQL knowing that the FK is not relevant to any column in the UPDATE, as the manual states in Note1 below.
When P1 runs Q1, P2 has no problems running the read-only, non-lock-acquiring Q5 types of queries. The only issue is which version of the data P2 sees, based on the ISOLATION LEVEL in place.
Note1: From the MySQL Manual Page entitled Locks Set by Different SQL Statements in InnoDB:
If a FOREIGN KEY constraint is defined on a table, any insert, update,
or delete that requires the constraint condition to be checked sets
shared record-level locks on the records that it looks at to check the
constraint. InnoDB also sets these locks in the case where the
constraint fails.
The above is why, with Q2, P2 is free to perform an UPDATE or to acquire an exclusive lock on weapons: the engine is not checking the constraint on weapon_id for P1, and thus P1 holds no row-level lock in that table.
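To see these locks directly rather than infer them from blocking, the lock table can be queried from a third connection while P1 sits at place2. This is a sketch that assumes MySQL 8.0 or later, where performance_schema.data_locks exists; it is not available on older servers.

```sql
-- Run while P1 has executed its UPDATE but has not yet committed.
SELECT object_name, index_name, lock_type, lock_mode, lock_status, lock_data
FROM performance_schema.data_locks
WHERE object_name IN ('spies', 'weapons');
```

Comparing the rows returned for Q1 and for Q2 should show whether anything in weapons is locked in each case.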
To pull this back to 50,000 feet, one's biggest concern is the duration for which a lock is held, either in an implicit transaction (one with no START/COMMIT) or in an explicit transaction before a COMMIT. In theory, a peer process can be prevented from acquiring the lock it needs for an UPDATE indefinitely. But each attempt at acquiring that lock is governed by its setting for innodb_lock_wait_timeout. What that means is that, by default, it times out after about 50 seconds. For a view of your setting, run:
select @@innodb_lock_wait_timeout;
For me, at the moment, it is 50 (seconds).
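If waiting that long is not acceptable for a particular connection, the timeout can be lowered for that session only. A small sketch; the value 5 is an arbitrary example, not a recommendation.

```sql
-- Applies to the current connection only; other sessions keep the global value.
SET SESSION innodb_lock_wait_timeout = 5;
SELECT @@SESSION.innodb_lock_wait_timeout;
```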
Why not run EXPLAIN for this query and check it for yourself?
So, let's run it!
EXPLAIN UPDATE spies SET name = 'Bond', weapon_id = 1 WHERE id = 1\G
Then check how many rows this query scans: look at the rows column of the output.
Do the same for the below one as well.
EXPLAIN UPDATE spies SET name = 'Bond' WHERE id = 1\G
Now, coming to your question: InnoDB will lock each row you update in a table. But remember, this is row-level locking.
So, to answer your question, updating a row with or without a foreign key will not make a difference if it's the same row and the same table.
It will make a difference if it's a different row or a different table.
I ran into a problem and can't choose the right solution.
I have a SELECT query that selects records from a table.
These records have a status column, as seen below.
SELECT id, <...>, status FROM table WHERE something
Now, right after this SELECT I have to UPDATE the status column.
How can I do it to avoid a race condition?
What I want to achieve is that once somebody (a session) has selected something, it cannot be selected by anybody else until I release it manually (for example, using a status column).
Thoughts?
There is some MySQL documentation that may help with your task. I'm not sure it fits your needs, but it describes the right way to do a select followed by an update.
The technique described does not prevent other sessions from reading, but it prevents writing to the selected record until the end of the transaction.
It contains an example similar to your problem:
SELECT counter_field FROM child_codes FOR UPDATE;
UPDATE child_codes SET counter_field = counter_field + 1;
It requires that your tables use the InnoDB engine and that your programs use transactions.
If you need locking only for a short time, i.e. one session selects a row with a lock, updates it, and releases the lock within the same session, then you do not need a status field at all. Just use select ... for update and select ... lock in share mode together with transactions: select ... for update followed by update to modify, and select ... lock in share mode to just read. This will meet your requirements.
If you need to lock for a long time, i.e. select and lock in one session and then update and release in another, then you are right to use some storage to keep the lock status, and all sessions should use it as follows. For the update scenario: select ... for update, then set the status and the status owner in one session; then, in another session, select ... for update, check the status and owner, update, and remove the status. For the read scenario: select ... lock in share mode and check the status.
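The short-lived case can be sketched as follows, assuming a hypothetical InnoDB table named tbl with an id primary key and an integer status column where 0 means free and 1 means taken. The names and values are illustrative only.

```sql
-- Reset the variable first: SELECT ... INTO leaves it unchanged when no row matches.
SET @picked_id = NULL;

START TRANSACTION;

-- Pick one free row and lock it; another session running the same locking read
-- on that row waits until this transaction ends.
SELECT id INTO @picked_id
FROM tbl
WHERE status = 0
ORDER BY id
LIMIT 1
FOR UPDATE;

-- Mark the row as taken while the lock is still held.
-- If nothing was found, @picked_id is NULL and no row is updated.
UPDATE tbl SET status = 1 WHERE id = @picked_id;

-- Releases the lock.
COMMIT;
```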
You can do it with some preparations. Add a column sessionId to your table. It has to be NULL-able and it will contain the unique ID of the session that acquires the row. Also add an index on this new column; we'll use the column to search for rows in the table.
ALTER TABLE `tbl`
ADD COLUMN `sessionId` CHAR(32) DEFAULT NULL,
ADD INDEX `sessionId`(`sessionId`)
When a session needs to acquire some rows (based on some criteria) run:
UPDATE `tbl`
SET `sessionId` = 'aaa'
WHERE `sessionId` IS NULL
AND ...
LIMIT bbb
Replace aaa with the current session ID and ... with the conditions you need to select the correct rows. Replace bbb with the number of rows you need to acquire. Add an ORDER BY clause if you need to process the rows in a certain order (if some of them have higher priority than others). You can also add status = ... to the SET clause to change the status of the acquired rows (to pending, for example), to let other instances of the code know those rows are being processed right now.
The query above acquires some rows. Next, run:
SELECT *
FROM `tbl`
WHERE `sessionId` = 'aaa'
This query gets the acquired rows to be processed in the client code.
After each row is processed, you either DELETE the row or UPDATE it and set sessionId to NULL (release the row) and status to reflect its new status.
Also you should release the rows (using the same procedure as above) when the session is closed.
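The release steps might look like this. It is a sketch that assumes the table also has an id primary key and a string status column, neither of which is shown in the ALTER TABLE above; the id 123 and the status value 'done' are placeholders.

```sql
-- After processing one row: release it and record its new status.
-- The sessionId condition makes sure a session only releases rows it owns.
UPDATE `tbl`
SET `sessionId` = NULL, `status` = 'done'
WHERE `id` = 123 AND `sessionId` = 'aaa';

-- When the session closes: release anything it still holds.
UPDATE `tbl`
SET `sessionId` = NULL
WHERE `sessionId` = 'aaa';
```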
I have several servers hitting a common MySQL box and I need exclusive access to a table of scheduled jobs.
After some reading here and elsewhere I was led to believe SELECT...FOR UPDATE was what I wanted, but now we are (very rarely) seeing multiple servers pick up the same record.
Here's the PROC (minus the BEGIN/END stuff because it was playing hell with my formatting):
CREATE DEFINER=`root`@`%` PROCEDURE `PopScheduledJob`(OUT `JobId` varchar(36) )
SELECT ScheduledJobId INTO JobId
FROM scheduledjob
WHERE
Status = 0
AND NextRun < UTC_TIMESTAMP()
ORDER BY StartDate
LIMIT 1
FOR UPDATE;
UPDATE scheduledjob
SET Status = 2
WHERE ScheduledJobId = JobId;
So the intent here is that it should only pick up a job with Status=0, and it sets it to 1 immediately.
My hope was that this would prevent any other thread/process from accessing the same record, but now it seems that's not the case.
EDIT: forgot to mention, we have an InnoDB backing store
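One thing worth checking, offered as a possibility rather than a diagnosis: a lock taken by SELECT ... FOR UPDATE lasts only until the enclosing transaction ends. If the procedure is called with autocommit enabled and no surrounding transaction, each statement commits on its own, so the lock is gone before the UPDATE runs. A sketch of the procedure with an explicit transaction, under the hypothetical name PopScheduledJobTx:

```sql
DELIMITER //

CREATE PROCEDURE PopScheduledJobTx(OUT JobId VARCHAR(36))
BEGIN
  -- Hold the row lock from the SELECT until the status change has been committed.
  START TRANSACTION;

  SELECT ScheduledJobId INTO JobId
  FROM scheduledjob
  WHERE Status = 0
    AND NextRun < UTC_TIMESTAMP()
  ORDER BY StartDate
  LIMIT 1
  FOR UPDATE;

  -- If no job was found, JobId is NULL and this updates nothing.
  UPDATE scheduledjob
  SET Status = 2
  WHERE ScheduledJobId = JobId;

  COMMIT;
END//

DELIMITER ;
```

The DELIMITER lines are for the mysql command-line client; other clients may not need them.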
This is a follow up on my previous question (you can skip it as I explain in this post the issue):
MySQL InnoDB SELECT...LIMIT 1 FOR UPDATE Vs UPDATE ... LIMIT 1
Environment:
JSF 2.1 on Glassfish
JPA 2.0 EclipseLink and JTA
MySQL 5.5 InnoDB engine
I have a table:
CREATE TABLE v_ext (
v_id INT NOT NULL AUTO_INCREMENT,
product_id INT NOT NULL,
code VARCHAR(20),
username VARCHAR(30),
PRIMARY KEY (v_id)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
It is populated with 20,000 records like this one (product_id is 54 for all records, code is randomly generated and unique, username is set to NULL):
v_id product_id code username
-----------------------------------------------------
1 54 '20 alphanumerical' NULL
...
20,000 54 '20 alphanumerical' NULL
When a user purchases product 54, he gets a code from that table. If the user purchases multiple times, he gets a code each time (no unique constraint on username). Because I am preparing for high activity, I want to make sure that:
No concurrency/deadlock can occur
Performance is not impacted by the locking mechanism which will be needed
From the SO question (see link above) I found that doing such a query is faster:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL LIMIT 1 FOR UPDATE;
-- Use result for next query
UPDATE v_ext SET username=xxx WHERE v_id=...;
COMMIT;
However, I found a deadlock issue ONLY when using an index on the username column. I thought adding an index would help speed things up a little, but it creates a deadlock after about 19,970 records (actually quite consistently at this number of rows). Is there a reason for this? I don't understand. Thank you.
From a purely theoretical point of view, it looks like you are not locking the right rows (different condition in the first statement than in the update statement; besides you only lock one row because of LIMIT 1, whereas you possibly update more rows later on).
Try this:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL AND v_id=yyy FOR UPDATE;
UPDATE v_ext SET username=xxx WHERE v_id=yyy;
COMMIT;
[edit]
As for the reason for your deadlock, this is the probable answer (from the manual):
If you have no indexes suitable for your statement and MySQL must scan
the entire table to process the statement, every row of the table
becomes locked (...)
Without an index, the SELECT ... FOR UPDATE statement is likely to lock the entire table, whereas with an index, it only locks some rows. Because you didn't lock the right rows in the first statement, an additional lock is acquired during the second statement.
Obviously, a deadlock cannot happen if the whole table is locked (i.e. without an index).
A deadlock can certainly occur in the second setup.
First of all, the definition of the table is wrong. You have no tid column in the table, so I suspect the primary key is v_id.
Second of all, if you select for update, you lock the row. Any other select coming until the first transaction is done will wait for the row to be cleared, because it will hit the exact same record. So you will have waits for this row.
However, I doubt this can be a serious problem in your case, because first of all, you have the username there, and second of all, you have the product id there. It is extremely unlikely that you will have a lot of hits on the exact same record you hit initially, and even if you do, the transaction should run very fast.
You have to understand that by using transactions, you usually give up pretty much on concurrency for consistent data. There is no way to support consistency of data and concurrency at the same time.