I have a table of "commands to do" with a status ('toprocess', 'processing', 'done').
I have several instances (Amazon EC2), each running a daemon that asks for "commands to do".
The daemon asks for rows with status 'toprocess', processes them, and at the end of each loop changes the status to 'done'.
The problem is that, before starting that loop, I need to change all 'toprocess' rows to status 'processing' so that other instances will not take the same rows.
I've read about InnoDB row locks, but I don't understand them very well.
SELECT * from commands where status = 'toprocess'
Then I need to take the IDs of these results and update their status to 'processing', locking the rows until they are updated.
How can I do it?
Thank you
You'd use a transaction and read the data with FOR UPDATE, which blocks other selects that also use FOR UPDATE on the selected rows until you commit:
START TRANSACTION;
SELECT * FROM commands WHERE status = 'toprocess' FOR UPDATE;
for each row in the result:
    add the data to an array/list for processing later
    UPDATE commands SET status = 'processing' WHERE id = row.id;
COMMIT;
process all the data
Read up on FOR UPDATE and the InnoDB isolation levels.
A possible (yet not very elegant) solution may be to first UPDATE the record, then read its data:
Each daemon will have a unique ID, and the table will have a new column named 'owner' for that ID.
Then the daemon will run something like "UPDATE table SET status='processing', owner='theDaemonId' where status='toprocess' ... LIMIT 1"
While the update runs the row is locked, so no other daemon's UPDATE can claim it.
After the update the row is owned by a specific daemon, which can then run a SELECT to fetch all necessary data from that row (WHERE status='processing' AND owner='theDaemonId').
Finally, the last UPDATE will set the row to 'processed', and may (or may not) clear the owner field. Keeping it there will also enable some statistics about the daemons' work.
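The claim-then-read steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the `payload` column, the `claim_commands` helper and the daemon IDs are made up for the example. On MySQL the claim would be the plain `UPDATE ... LIMIT n` from the answer; the subquery form is only there because stock SQLite builds lack `UPDATE ... LIMIT`.

```python
import sqlite3


def claim_commands(conn, daemon_id, batch_size=1):
    """Mark up to batch_size 'toprocess' rows as owned by daemon_id, then read them back."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Step 1: claim rows. Only rows still in 'toprocess' can be claimed,
        # so a row already taken by another daemon is never picked up again.
        conn.execute(
            "UPDATE commands SET status = 'processing', owner = ? "
            "WHERE id IN (SELECT id FROM commands "
            "             WHERE status = 'toprocess' ORDER BY id LIMIT ?)",
            (daemon_id, batch_size),
        )
        # Step 2: fetch everything this daemon currently owns.
        rows = conn.execute(
            "SELECT id, payload FROM commands "
            "WHERE status = 'processing' AND owner = ? ORDER BY id",
            (daemon_id,),
        ).fetchall()
    return rows


# Hypothetical demo data: three pending commands.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commands ("
    "id INTEGER PRIMARY KEY, payload TEXT, status TEXT NOT NULL, owner TEXT)"
)
conn.executemany(
    "INSERT INTO commands (payload, status) VALUES (?, 'toprocess')",
    [("a",), ("b",), ("c",)],
)
conn.commit()

first = claim_commands(conn, "daemon-1", batch_size=2)
second = claim_commands(conn, "daemon-2", batch_size=2)
print(first)
print(second)
```

The second daemon only receives the row the first one left behind, because the claim and the ownership check both go through the status and owner columns rather than through a long-held lock.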
As far as I know you can't use MySQL to lock a row (using a built-in method). You have two options though:
If your table should not be read by any other process until the locks are released then you can use table level locking as described here
You can implement your own basic row locking by updating a value in each row you're processing, and then have all your other daemons checking whether this property is set (a BIT data type would suffice).
InnoDB locks at a row level for reading and updating anyway, but if you want to lock the rows for an arbitrary period then you may have to go with the second option.
Related
I need to run a PHP script, and I want to make sure that no more than one instance of it runs at the same time.
I am using MySQL, and I thought about this solution:
I built the table below:
job_id           | task_id | last_updated_time (AUTO UPDATE)
"sending_emails" | 77238   | 2107-5-3 12:2:2
Before running the script I create a random task ID, then I run a query to update the task_id.
$task_id = generate_random_task_id();
$query = "
    UPDATE jobs
    SET task_id = $task_id
    WHERE task_id = $task_id
       OR NOW() - last_updated_time > 30
    LIMIT 1
";
run($query);
/*
Then I need to check if there was an update. If yes, I will run the script;
otherwise I will stop, since there is already another script running.
*/
$query = "SELECT job_id FROM jobs WHERE task_id = $task_id";
$result = run($query);
if (!isset($result['job_id'])) {
    die();
}
Is there any chance that two scripts run at the same time?
No, they can't run at the same time. Here is what MySQL's documentation says about locking in UPDATE and SELECT:
UPDATE ... WHERE ... sets an exclusive next-key lock on every record
the search encounters. However, only an index record lock is required
for statements that lock rows using a unique index to search for a
unique row.
Here's more about Shared and Exclusive locks:
A shared (S) lock permits the transaction that holds the lock to read
a row.
An exclusive (X) lock permits the transaction that holds the lock to
update or delete a row.
If a transaction T1 holds an exclusive (X) lock on row r, a request
from some distinct transaction T2 for a lock of either type on r
cannot be granted immediately. Instead, transaction T2 has to wait for
transaction T1 to release its lock on row r.
Yes there's every chance you could run the same task again.
There are two obvious solutions.
One is to open a MySQL connection, then acquire a lock with GET_LOCK() using a short timeout. If you acquire the lock, you're good to go. You need to keep the DB connection open for the lifetime of the script.
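Alternatively, you could create a table with a unique constraint on finish_time, INSERT a record with a null finish time to indicate the start (the intent being that it fails if there is already a record with a null finish time), then update finish_time to NOW() when the task completes. Note that a MySQL UNIQUE index permits multiple NULL values, so for the INSERT to actually fail the "still running" marker has to be a non-NULL sentinel value.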
However, using the database to represent the state of a running task only makes sense when the task is running within a loosely coupled but highly available cluster, implying that the database is also clustered. And the nature of the clustering (NDB, async, semi-sync, multi-master) has a lot of impact on how this will behave in practice.
OTOH if that is not the case, then using the database to represent the state is the wrong way to solve the problem.
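As a rough sketch of the unique-constraint idea, here is a variation that puts the unique key on the job name instead of on finish_time. It is not the exact scheme described above, and it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it is runnable as-is; the `job_locks` table and the `try_acquire` / `release` helpers are hypothetical names.

```python
import sqlite3


def try_acquire(conn, job_id):
    """Return True if this caller inserted the lock row, False if one already exists."""
    try:
        with conn:  # commits on success, rolls back when the INSERT fails
            conn.execute("INSERT INTO job_locks (job_id) VALUES (?)", (job_id,))
        return True
    except sqlite3.IntegrityError:
        # The primary key already holds this job_id: another run owns the lock.
        return False


def release(conn, job_id):
    """Remove the lock row so that the next run can start."""
    with conn:
        conn.execute("DELETE FROM job_locks WHERE job_id = ?", (job_id,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_locks (job_id TEXT PRIMARY KEY)")

got_first = try_acquire(conn, "sending_emails")   # lock is free
got_second = try_acquire(conn, "sending_emails")  # lock is taken
release(conn, "sending_emails")
got_third = try_acquire(conn, "sending_emails")   # free again after release
print(got_first, got_second, got_third)
```

A script that crashes before calling release leaves the row behind, so a real setup would still need some expiry rule, such as the last_updated_time check in the question.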
Yes, they can run at the same time.
If you want them to run one at a time, the SELECT query should be changed to:
SELECT job_id FROM jobs WHERE task_id = $task_id LOCK IN SHARE MODE
In this case it uses a read lock.
This is the same whether you use NDB or InnoDB.
I have locked one row in one transaction with the following query:
START TRANSACTION;
SELECT id FROM children WHERE id=100 FOR UPDATE;
In another transaction I have the query below:
START TRANSACTION;
SELECT id FROM children WHERE id IN (98,99,100) FOR UPDATE;
It fails with the error: lock wait timeout exceeded.
Here 100 is already locked (in the first transaction), but IDs 98 and 99 are not. Is there any way for the above query to return records 98 and 99 when only row 100 is locked? The result should be as below:
Id
===
98
99
===
Id 100 should be ignored because 100 is locked by a transaction.
Looks like SKIP LOCKED option mentioned in a previous answer is now available in MySQL. It does not wait to acquire a row lock and allows you to work with rows that are not currently locked.
From MySQL 8.0.0 Release Notes/Changes in MySQL 8.0.1:
InnoDB now supports NOWAIT and SKIP LOCKED options with SELECT ... FOR SHARE and SELECT ... FOR UPDATE locking read statements. NOWAIT causes the statement to return immediately if a requested row is locked by another transaction. SKIP LOCKED removes locked rows from the result set. See Locking Read Concurrency with NOWAIT and SKIP LOCKED.
Sample usage (complete example with outputs can be found in the link above):
START TRANSACTION;
SELECT * FROM tableName FOR UPDATE SKIP LOCKED;
Also, it might be good to include the warning in the Reference Manual here as well:
Queries that skip locked rows return an inconsistent view of the data. SKIP LOCKED is therefore not suitable for general transactional work. However, it may be used to avoid lock contention when multiple sessions access the same queue-like table.
MySQL does not have a way to ignore locked rows in a SELECT. You'll have to find a different way to set a row aside as "already processed".
The simplest way is to lock the row briefly in the first query just to mark it as "already processed", then unlock it and lock it again for the rest of the processing - the second query will wait for the short "marker" query to complete, and you can add an explicit WHERE condition to ignore already-marked rows. If you don't want to rely on the first operation being able to complete successfully, you may need to add a bit more complexity with timestamps and such to clean up after those failed operations.
MySQL does not have this feature. For anyone searching for this topic in general, some RDBMS have better/smarter locking features than others.
For developers constrained to MySQL, the best approach is to add a column (or use an existing one, e.g., a status column) that can be set to "locked" or "in progress" or similar; execute SELECT ID ... WHERE IN_PROGRESS != 1 FOR UPDATE to get the ID of the row you want to lock; then issue UPDATE ... SET IN_PROGRESS = 1 WHERE ID = XX and commit, which releases the row lock.
Using LOCK IN SHARE MODE is almost never the solution: it will let you read the old value, but that value is in the process of being updated, so unless you are performing a non-atomic task there's no point in even looking at that record.
Better* RDBMS recognize this pattern (select one row to work on and lock it, work on it, unlock it) and provide a smarter approach that lets you only search unlocked records. For example, PostgreSQL 9.5+ provide SELECT ... SKIP LOCKED which only selects from within the unlocked subset of rows matching the query. That lets you obtain an exclusive lock on a row, service that record to completion, then update & unlock the record in question without having to block other threads/consumers from being able to work independent of yourself.
*Here "better" means from the perspective of atomic updates, multi-consumer architecture, etc. and not necessarily "better designed" or "overall better." Not trying to start a flamewar here.
As per http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
The solution is to perform the SELECT in a locking mode using LOCK IN SHARE MODE:
SELECT * FROM parent WHERE NAME = 'Jones' LOCK IN SHARE MODE;
The Problem
I'm trying to figure out how to correctly set up a transaction in a database, and account for potential latency.
The Setup
In my example I have a table of users, keys, where each user can have multiple keys, and a config table that dictates how many keys each user is allowed to have.
I want to run a stored procedure that:
1. figures out if the given user is allowed to request a key,
2. gets an available, unclaimed key,
3. attempts to redeem the key for the given user.
The pseudocode for the procedure would be:
START TRANSACTION;
(1) CALL check_permission(...,@result);
IF (@result = 'has_permission') THEN
    (2) SET @unclaimed_key_id = (QUERY FOR RETURNING AVAILABLE KEY ID);
    (3) CALL claim_key(@unclaimed_key_id);
END IF;
COMMIT;
The problem that I am running into, is that when I simulate lag after step 1, (by using SELECT SLEEP(<seconds>)), it's possible for a given user to redeem multiple keys when they only have permissions to redeem one, by running the procedure in multiple sessions before the first procedure has finished its sleep (which again, is to simulate lag)
Here is the code for the Tables and the Procedures
(note: for the small example I didn't bother with indexes and foreign keys, but obviously I use those on the actual project).
To see my issue just set up the tables and procedures in a database, then open two mysql terminals, and in the first run this:
CALL `P_user_request_key`(10,1,@out);
SELECT @out;
And then quickly (you have 10 seconds) in the second run this:
CALL `P_user_request_key`(0,1,@out);
SELECT @out;
Both queries will successfully return key_claimed and User Bob will end up with 4 keys assigned to him, although the max value in config is set to 3 per user.
The Questions
What is the best way of avoiding issues like this? I'm trying to use a transaction, but I feel like it's not going to help with this specific issue, and I may be implementing it wrong.
I realize that one possible way to fix the problem would be to just encapsulate everything in one large update query, but I would prefer to avoid that, since I like being able to set up individual procedures, where each is only meant to do a single task.
The database behind this example is intended to be used by many (thousands) of concurrent users. As such it would be best if one user attempting to redeem a code doesn't block all other users from redeeming one. I'm fine with changing my code to just attempt to redeem again if another user already claimed a key, but it should absolutely not happen that a user can redeem two codes when they only have permission to get one.
You're off the hook for not wanting to encapsulate everything in one large query, because that won't actually solve anything either; it just makes the race less likely.
What you need are locks on the rows, or locks on the index where the new row would be inserted.
InnoDB uses an algorithm called next-key locking that combines index-row locking with gap locking. InnoDB performs row-level locking in such a way that when it searches or scans a table index, it sets shared or exclusive locks on the index records it encounters. Thus, the row-level locks are actually index-record locks. In addition, a next-key lock on an index record also affects the “gap” before that index record. That is, a next-key lock is an index-record lock plus a gap lock on the gap preceding the index record. If one session has a shared or exclusive lock on record R in an index, another session cannot insert a new index record in the gap immediately before R in the index order.
http://dev.mysql.com/doc/refman/5.5/en/innodb-next-key-locking.html
So how do we get exclusive locks?
Two connections, mysql1 and mysql2, each of them requesting an exclusive lock using SELECT ... FOR UPDATE. The table 'history' has a column 'user_id' which is indexed. (It's also a foreign key.) There are no rows found, so they both appear to proceed normally as if nothing unusual is going to happen. The user_id 2808 is valid but has nothing in history.
mysql1> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql2> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql1> select * from history where user_id = 2808 for update;
Empty set (0.00 sec)
mysql2> select * from history where user_id = 2808 for update;
Empty set (0.00 sec)
mysql1> insert into history(user_id) values (2808);
... and I don't get my prompt back ... no response ... because another session has a lock, too ... but then:
mysql2> insert into history(user_id) values (2808);
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Then mysql1 immediately returns success on the insert.
Query OK, 1 row affected (3.96 sec)
All that is left is for mysql1 to COMMIT and magically, we prevented a user with 0 entries from inserting more than 1 entry. The deadlock occurred because both sessions needed incompatible things to happen: mysql1 needed mysql2 to release its lock before it would be able to commit and mysql2 needed mysql1 to release its lock before it would be able to insert. Somebody has to lose that fight, and generally the thread that has done the least work is the loser.
But what if there had been 1 or more rows already existing when I did the SELECT ... FOR UPDATE? In that case, the lock would have been on the rows, so the second session to try to SELECT would actually block waiting for the SELECT until the first session decided to either COMMIT or ROLLBACK, at which time the second session would have seen an accurate count of the number of rows (including any inserted or deleted by the first session) and could have accurately decided the user already had the maximum allowed.
You can't outrace a race condition, but you can lock them out.
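To make the "lock first, then check, then write" shape concrete, here is a minimal sketch. It is not the InnoDB mechanism described above: it uses Python's built-in sqlite3 module so that it runs without a server, and SQLite's BEGIN IMMEDIATE (a database-wide write lock) stands in for the row and gap locks that SELECT ... FOR UPDATE takes in InnoDB. The `user_keys` table, the `claim_key` helper and the limit of 3 are assumptions for the example.

```python
import sqlite3

MAX_KEYS_PER_USER = 3  # assumed limit, standing in for the config table


def claim_key(conn, user_id):
    """Give user_id one unclaimed key, or return None if at the limit or out of keys."""
    # Take the write lock before reading the count, so that no other session
    # can slip a claim in between the check and the update.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (owned,) = conn.execute(
            "SELECT COUNT(*) FROM user_keys WHERE user_id = ?", (user_id,)
        ).fetchone()
        key_id = None
        if owned < MAX_KEYS_PER_USER:
            row = conn.execute(
                "SELECT id FROM user_keys WHERE user_id IS NULL ORDER BY id LIMIT 1"
            ).fetchone()
            if row is not None:
                key_id = row[0]
                conn.execute(
                    "UPDATE user_keys SET user_id = ? WHERE id = ?", (user_id, key_id)
                )
        conn.execute("COMMIT")
        return key_id
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE user_keys (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO user_keys (user_id) VALUES (?)", [(None,)] * 5)

results = [claim_key(conn, 1) for _ in range(4)]
other = claim_key(conn, 2)
print(results, other)
```

The fourth request for user 1 is refused because the count is read only after the lock is held. In InnoDB the equivalent would be to run the counting query with FOR UPDATE inside the transaction, which locks far less than the whole database.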
MySQL Version: v5.0.95
Basically I have clients trying to get data - each client should only get unique rows.
START TRANSACTION;
SELECT id where result='new';
UPDATE SET result='old' WHERE id=$id;
COMMIT;
LOCK IN SHARE MODE on the select statement still lets other clients read the data, which seems like a problem.
Basically I need the data selected once, updated, and not read again by another client.
SELECT ... FOR UPDATE will block another locking read, while LOCK IN SHARE MODE will allow the read but won't allow an update from another client.
I have a table of promo_codes that can be activated by a web application. There is a state column which can be either 0 for unactivated or 1 for activated. If I run a transaction with
SELECT id FROM promo_codes WHERE state=0 LIMIT 1 FOR UPDATE;
UPDATE promo_codes SET state=1 WHERE id = ?;
What happens to a second transaction running:
SELECT id FROM promo_codes WHERE state=0 LIMIT 1 FOR UPDATE;
Does it simply return the next row, or does it block until the first transaction is done?
I've actually started thinking about just setting a lock based on the row id in Redis, because it's obvious to me how that would work and I know it wouldn't create any performance issues in MySQL. On the other hand, there must be a clean and performant way to make this work purely in SQL. Maybe I could just do an UPDATE ... LIMIT 1 first, but how do I get the id of the promo code back in that case?
The SELECT ... FOR UPDATE and LOCK IN SHARE MODE modifiers effectively run in READ-COMMITTED isolation mode even if the current isolation mode is REPEATABLE-READ. This is done because InnoDB can only lock the current version of a row. Think about a similar case where the row is being deleted. Even if InnoDB were able to set locks on rows which no longer exist, would it do any good for you? Not really. For example, you could try to update the row which you just locked with SELECT ... FOR UPDATE, but this row is already gone, so you would get a quite unexpected error updating the row which you thought you had locked successfully. Anyway, it is done this way for good reason; all other decisions would be even more troublesome.
LOCK IN SHARE MODE is actually often used to bypass multiversioning and make sure we're reading the most current data, plus to ensure it can't be changed. This can be used, for example, to read a set of rows, compute new values for some of them and write them back. If we did not use LOCK IN SHARE MODE we could be in trouble, as rows could be updated before we write new values to them and such an update could be lost.