My question is similar to:
Ignoring locked row in a MySQL query
except that I have already implemented logic close to what's suggested in the accepted answer. My question is how to set the process id in the first place. All servers run a query like the following (the code is in Ruby on Rails, but this is the resulting MySQL query):
UPDATE (some_table) SET process_id=(some process_id) WHERE (some condition on row_1) AND process_id is null ORDER BY (row_1) LIMIT 100
Now what happens is that all processes try to update the same rows; they get locked and time out waiting for the lock. I would like the servers to ignore the rows that are locked (because after the lock is released the process_id won't be NULL anymore, so there is no point in waiting for it).
I could try to randomize the batch of records to update but the problem is I want to prioritize the update based on row_1 as in the query above.
So my question is: is there a way in MySQL to check whether a record is locked, and ignore it if it is?
No, there is no way to ignore already-locked rows. Your best bet is to ensure that nothing locks any row for an extended period of time; that keeps any lock conflict very short in duration. That generally means "advisory" locking: lock the rows briefly within a transaction (using FOR UPDATE) and update each row to mark it as "locked".
For example, first you want to find your candidate row(s) without locking anything:
SELECT id FROM t WHERE lock_expires IS NULL AND lock_holder IS NULL <some other conditions>;
Now lock only the row you want, very quickly:
START TRANSACTION;
-- Re-check that the row is still unclaimed, and lock it until COMMIT:
SELECT * FROM t WHERE id = <id> AND lock_expires IS NULL AND lock_holder IS NULL FOR UPDATE;
-- If the SELECT returned the row, claim it; otherwise another process won the race:
UPDATE t SET lock_expires = <some time>, lock_holder = <me> WHERE id = <id>;
COMMIT;
(Technical note: If you are planning to lock multiple rows, always lock them in a specific order. Ascending order by primary key is a decent choice. Locking out-of-order or in random order will subject your program to deadlocks from competing processes.)
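For instance, a sketch of locking a few candidate rows in primary-key order (the id list is purely illustrative):
-- Lock candidate rows in ascending primary-key order, so competing
-- processes tend to acquire their locks in the same sequence:
SELECT * FROM t WHERE id IN (4, 17, 23) ORDER BY id FOR UPDATE;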
Now you can take as long as you want (less than lock_expires) to process your row(s) without blocking any other process (they won't match the row during the non-locking select, so will always ignore it). Once the row is processed, you can UPDATE or DELETE it by id, also without blocking anything.
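(One caveat on "no way to ignore locked rows": if you can rely on MySQL 8.0 or later, SELECT ... FOR UPDATE SKIP LOCKED does this directly; the advisory-locking pattern above is for versions without it. A minimal sketch using the names from the question:)
START TRANSACTION;
-- MySQL 8.0+ only: rows currently locked by other transactions are
-- skipped instead of waited on:
SELECT id FROM some_table
WHERE process_id IS NULL
ORDER BY row_1
LIMIT 100
FOR UPDATE SKIP LOCKED;
-- ...set process_id on the returned rows, then:
COMMIT;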
Related
I am trying to read 10 records from a MySQL table and update a field IsRead to 1 to avoid duplicate reads. So when I read again, the next 10 records should be returned, not the records already marked by IsRead.
select * from tablename where IsRead=0 limit 10;
But my question is: how can I read and update the 10 records at the same time, using a single query?
EDIT
Previously I was reading and updating one record at a time, but now I want to avoid the double round trip (once for reading and once for updating), so what would be a suitable way to read and update 10 records? Duplicate records should not be read.
What you're looking for is not a single statement, but transactions.
Transactions are a way to make multiple statements ACID-compliant. Read about them in the link provided. In short, it means "all or nothing".
Code wise it would simply look something like this:
START TRANSACTION;
SELECT * FROM tablename WHERE IsRead = 0 ORDER BY created_or_whatever_column LIMIT 10 FOR UPDATE;
UPDATE tablename SET IsRead = 1 WHERE IsRead = 0 ORDER BY created_or_whatever_column LIMIT 10;
COMMIT;
Notice that I added an ORDER BY clause. Using LIMIT without ORDER BY doesn't make sense; there is no inherent order to the data in a database unless you specify one.
I also added FOR UPDATE to the SELECT statement, so the rows are locked until the transaction ends (with COMMIT) and no other transaction can manipulate those rows in the meantime.
What you should also have a look at in this context are the isolation levels.
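If you would rather not rely on the SELECT and the UPDATE matching the same rows through an identical filter, order, and limit, here is a sketch that marks the locked rows by primary key instead (the id column name is an assumption; the question doesn't show the schema):
START TRANSACTION;
-- Lock the 10 oldest unread rows until COMMIT:
SELECT id FROM tablename
WHERE IsRead = 0
ORDER BY created_or_whatever_column
LIMIT 10
FOR UPDATE;
-- Mark exactly those rows, using the ids returned above:
UPDATE tablename SET IsRead = 1 WHERE id IN (/* ids from the SELECT */);
COMMIT;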
It is unclear to me (from reading the MySQL docs) whether the following query, run against InnoDB tables on MySQL 5.1, takes a write lock on each row it updates internally, one at a time (5,000 in total), or locks all the rows in the batch. As the database is under really heavy load, this is very important.
UPDATE `records`
INNER JOIN (
SELECT id, name FROM related LIMIT 0, 5000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`
I'd expect it to be per row but as I do not know a way to make sure it is so, I decided to ask someone with deeper knowledge. If this is not the case and the db would LOCK all the rows in the set, I'd be thankful if you give me explanation why.
The UPDATE runs in a transaction; it's an atomic operation, which means that if one of the rows fails (because of a unique constraint, for example) none of the 5,000 rows is updated. This is one of the ACID properties of a transactional database.
Because of this, the UPDATE holds a lock on all of the rows for the entire transaction. Otherwise another transaction could further update the value of a row based on its current value (say, UPDATE records SET value = value * 2). That statement would produce a different result depending on whether the first transaction commits or rolls back, so it has to wait for the first transaction to complete all 5,000 updates.
If you want to release the locks, just do the update in (smaller) batches.
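For example, a sketch of the same update done in chunks (the chunk size is arbitrary; the ORDER BY makes the paging deterministic, which a bare LIMIT is not):
-- Each statement is its own short transaction under autocommit,
-- so at most 500 rows are locked at any one time:
UPDATE `records`
INNER JOIN (
    SELECT id, name FROM related ORDER BY id LIMIT 0, 500
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`;
-- ...then LIMIT 500, 500 and so on, until all 5000 rows are covered.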
P.S. autocommit controls whether each statement runs in its own transaction, but it does not affect the execution of a single query.
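To make that concrete, a sketch of the difference:
-- With autocommit = 1 (the default) the UPDATE is its own transaction:
-- all row locks are taken and released with that one statement.
-- With autocommit = 0 the locks are held until an explicit COMMIT:
SET autocommit = 0;
UPDATE `records`
INNER JOIN (
    SELECT id, name FROM related ORDER BY id LIMIT 0, 5000
) AS `j` ON `j`.`id` = `records`.`id`
SET `records`.`name` = `j`.`name`;  -- row locks acquired here
COMMIT;                             -- and released only here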
I need a little help with SELECT FOR UPDATE (resp. LOCK IN SHARE MODE).
I have a table with around 400 000 records and I need to run two different processing functions on each row.
The table structure is approximately this:
data (
`id`,
`mtime`, -- When was data1 set last
`data1`,
`data2` DEFAULT NULL,
`priority1`,
`priority2`,
PRIMARY KEY `id`,
INDEX (`mtime`),
FOREIGN KEY ON `data2`
)
Functions are a little different:
first function - has to run in a loop over all records (is pretty fast), should select records based on priority1; sets data1 and mtime
second function - has to run only once on each record (is pretty slow), should select records based on priority2; sets data2 and mtime
They shouldn't modify the same row at the same time, but the select may return the same row in both of them (priority1 and priority2 have different values), and it's okay for a transaction to wait in that case (and I'd expect that to be the only case where it blocks).
I'm selecting data based on following queries:
-- For the first function - not processed first, then the oldest,
-- the same age goes based on priority
SELECT id FROM data ORDER BY mtime IS NULL DESC, mtime, priority1 LIMIT 250 FOR UPDATE;
-- For the second function - only not-yet-processed records, ordered by priority
SELECT id FROM data WHERE data2 IS NULL ORDER BY priority2 LIMIT 50 FOR UPDATE;
But what I am experiencing is that only one of the queries returns at a time.
So my questions are:
Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
Do I have that many collisions between the first and second query? (I have trouble debugging this; any hint on how to debug SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT ...) would be appreciated.)
Can ORDER BY ... LIMIT ... cause any issues?
Can indexes and keys cause any issues?
Key things to check for before getting much further:
Ensure the table engine is InnoDB, otherwise "for update" isn't going to lock the row, as there will be no transactions.
Make sure you're using the "for update" feature correctly. If you select something for update, it's locked to that transaction. While other transactions may be able to read the row, it can't be selected for update, updated or deleted by any other transaction until the lock is released by the original locking transaction.
To keep things clean, try explicitly starting a transaction using "START TRANSACTION", run your select "for update", do whatever you're going to do to the records that are returned, and finish up by explicitly executing a "COMMIT" to close out the transaction.
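A sketch of that shape using the first query from the question (the values written by the UPDATE are placeholders):
START TRANSACTION;
-- Lock this batch of rows until COMMIT:
SELECT id FROM data
ORDER BY mtime IS NULL DESC, mtime, priority1
LIMIT 250
FOR UPDATE;
-- ...process the returned rows, then update them by primary key...
UPDATE data SET data1 = '<result>', mtime = NOW() WHERE id IN (/* ids from above */);
COMMIT;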
Order and limit will have no impact on the issue you're experiencing as far as I can tell; whatever was going to be returned by the SELECT will be the rows that get locked.
To answer your questions:
Is it possible to acquire two separate locks in two separate transactions on separate bunch of rows (in the same table)?
Yes, but not on the same rows. Locks can only exist at the row level in one transaction at a time.
Do I have that many collisions between first and second query (I have troubles debugging that, any hint on how to debug SELECT ... FROM (SELECT ...) WHERE ... IN (SELECT) would be appreciated )?
There could be a short period while the row locks are being taken, which will delay the second query; however, unless you're running many hundreds of these SELECT ... FOR UPDATEs at once, it shouldn't cause any significant or noticeable delays.
Can ORDER BY ... LIMIT ... cause any issues?
Not in my experience. They should work just as they always would on a normal select statement.
Can indexes and keys cause any issues?
Indexes should exist as always to ensure sufficient performance, but they shouldn't cause any issues with obtaining a lock.
All points in the accepted answer seem fine, except these two:
"whatever was going to be returned by the Select will be the rows that get locked." and
"Can indexes and keys cause any issues?
but they shouldn't cause any issues with obtaining a lock."
In fact, all the rows the database reads internally while deciding which rows to select and return get locked. For example, the query below will lock all rows of the table but might select and return only a few:
select * from table where non_primary_non_indexed_column = ? for update
Since there is no index, the database has to read the entire table to find your desired rows, and hence it locks the entire table.
If you want to lock only one row, you need to specify its primary key or an indexed column in the WHERE clause. Indexing thus becomes very important when you want to lock only the appropriate rows.
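A minimal sketch of the difference (the table and column names are the placeholders from above; `table` is quoted because it is a reserved word):
-- Without an index, the FOR UPDATE scan reads, and therefore locks, every row.
-- With one, InnoDB can seek straight to the matching rows and lock only those:
CREATE INDEX idx_col ON `table` (non_primary_non_indexed_column);
SELECT * FROM `table` WHERE non_primary_non_indexed_column = 'x' FOR UPDATE;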
This is a good reference - https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html
This is a follow up on my previous question (you can skip it as I explain in this post the issue):
MySQL InnoDB SELECT...LIMIT 1 FOR UPDATE Vs UPDATE ... LIMIT 1
Environment:
JSF 2.1 on Glassfish
JPA 2.0 EclipseLink and JTA
MySQL 5.5 InnoDB engine
I have a table:
CREATE TABLE v_ext (
v_id INT NOT NULL AUTO_INCREMENT,
product_id INT NOT NULL,
code VARCHAR(20),
username VARCHAR(30),
PRIMARY KEY (v_id)
) ENGINE=InnoDB DEFAULT CHARSET=UTF8;
It is populated with 20,000 records like this one (product_id is 54 for all records, code is randomly generated and unique, username is set to NULL):
v_id product_id code username
-----------------------------------------------------
1 54 '20 alphanumerical' NULL
...
20,000 54 '20 alphanumerical' NULL
When a user purchases product 54, he gets a code from that table. If the user purchases multiple times, he gets a code each time (there is no unique constraint on username). Because I am preparing for high activity, I want to make sure that:
No concurrency/deadlock can occur
Performance is not impacted by the locking mechanism which will be needed
From the SO question (see link above) I found that doing such a query is faster:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL LIMIT 1 FOR UPDATE;
-- use the v_id returned by the previous query
UPDATE v_ext SET username=xxx WHERE v_id=...;
COMMIT;
However, I ran into a deadlock issue ONLY when using an index on the username column. I thought adding an index would help speed things up a little, but it creates a deadlock after about 19,970 records (quite consistently at that number of rows). Is there a reason for this? I don't understand. Thank you.
From a purely theoretical point of view, it looks like you are not locking the right rows (the condition in the first statement differs from the one in the UPDATE statement; besides, you only lock one row because of LIMIT 1, whereas you possibly update more rows later on).
Try this:
START TRANSACTION;
SELECT v_id FROM v_ext WHERE username IS NULL AND v_id=yyy FOR UPDATE;
UPDATE v_ext SET username=xxx WHERE v_id=yyy;
COMMIT;
[edit]
As for the reason for your deadlock, this is the probable answer (from the manual):
If you have no indexes suitable for your statement and MySQL must scan
the entire table to process the statement, every row of the table
becomes locked (...)
Without an index, the SELECT ... FOR UPDATE statement is likely to lock the entire table, whereas with an index, it only locks some rows. Because you didn't lock the right rows in the first statement, an additional lock is acquired during the second statement.
Obviously, a deadlock cannot happen if the whole table is locked (i.e. without an index).
A deadlock can certainly occur in the second setup.
First of all, the definition of the table is wrong: there is no tid column in the table, so I suspect the primary key is v_id.
Second of all, if you SELECT ... FOR UPDATE, you lock the row. Any other SELECT arriving before the first transaction is done will wait for the row to be released, because it hits the exact same record. So you will get waits on this row.
However, I very much doubt this is a real, serious problem in your case: first of all you have the username there, and second of all you have the product id. It is extremely unlikely that you will get a lot of hits on the exact same record you hit initially, and even if you do, the transaction should run very fast.
You have to understand that by using transactions you usually give up quite a bit of concurrency in exchange for consistent data. There is no way to get full consistency and full concurrency at the same time.
I have an InnoDB table read by a lot of different instances (cloud).
A daemon in each instance takes 100 rows from this table to "do things" with, but I don't want 2 (or more) instances to take the same rows.
So I have a "status" column ("todo", "doing", "done").
INSTANCE 1: it takes 100 rows where status = "todo"... Then I need to UPDATE these rows ASAP to status "doing", so INSTANCES 2, 3, ..., x can't take the same ones.
How can I do it?
Please, I would like a solution that doesn't lock the WHOLE table but only the rows involved (that's why I use InnoDB). I have read a lot about this (LOCK IN SHARE MODE, FOR UPDATE, COMMITs...) but I haven't found the right approach.
You should use the LOCK TABLES and UNLOCK TABLES statements to do this:
http://dev.mysql.com/doc/refman/5.1/en/lock-tables.html
Use a transaction and then SELECT ... FOR UPDATE when you read the records.
This way the records you read are locked. When you have all the data, update the records to "doing" and COMMIT the transaction.
Maybe what you were missing is the use of a transaction, or the correct order of commands. Here is a basic example:
START TRANSACTION;
-- Lock the batch of 'todo' rows until COMMIT:
SELECT * FROM `table` WHERE STATUS = 'todo' LIMIT 100 FOR UPDATE;
-- Loop over the results in code, saving the necessary data to an array/list...
UPDATE `table` SET STATUS = 'doing' WHERE ...;
COMMIT;
-- Process the data...
UPDATE `table` SET STATUS = 'done' WHERE ...;