I have a database table that is basically a first-in-first-out queue. Rows are simply inserted into the table by other parts of the system and forgotten about. Every 5 minutes, a job runs to process items from the queue. Each row to be processed has its status field changed from a pending value to a processing value. Subsequent duplicates in the queue are matched up and marked as the duplicate of an earlier queued item that is being processed. The queue processor job is the only thing that does anything with the table, apart from the parts of the system which just blindly insert rows.
This is exactly what the processor does with the queue:
START TRANSACTION;
SELECT id
FROM api_queue
WHERE status=:status_processing
-- Application checks this result set is empty, then...
UPDATE api_queue qs
INNER JOIN api_queue qdupes ON qdupes.products_id=qs.products_id AND qdupes.action=qs.action
SET qdupes.status = IF(qs.id=qdupes.id, :status_processing, :status_processing_duplicate)
WHERE qs.id IN (:queue_ids) ;
COMMIT;
-- Each queue item is processed
-- Once processing is complete, we purge the queue
START TRANSACTION;
SELECT COUNT(*) AS total FROM api_queue WHERE status = :status_processing ;
-- Application sanity checks the number of processing items it's about to delete against how many it's processed, and then...
DELETE FROM api_queue WHERE status IN (:status_processing, :status_processing_duplicate) ;
COMMIT;
In a typical 5 minutes, the queue will build up a backlog of about 100 items, though occasionally it can be in the thousands if a lot of changes have occurred in the catalog.
The first transaction is typically pretty fast when it doesn't hit a deadlock (0.1 - 0.2 seconds to complete), but it does seem to hit deadlocks about 10% of the time.
Why does it hit deadlocks so often? Even if a transaction locks all the rows currently in a table, should I expect this to cause contention when new rows are added to the table? If so, why is that?
I've also noticed that sometimes the first transaction above (containing the UPDATE query) doesn't appear to actually apply at all - though I think this may well be an unrelated bug.
My queue table looks like this:
CREATE TABLE IF NOT EXISTS `api_queue` (
`id` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`products_id` int(11) NOT NULL,
`action` tinyint(3) NOT NULL,
`triggered_by` tinyint(3) NOT NULL,
`status` tinyint(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ;
My Mantra: "Don't queue it, just do it". I say this because I have seen too many queues implemented in MySQL that flopped for one reason or another. A common reason is that the overhead of inserting/checking/removing the items may be as costly as "just doing the task". So why double the cost? And, apparently, the queuing is causing extra deadlocks.
According to the info you gave, the system should be able to handle 1500-3000 every 5 minutes. That should handle the "100" to "thousands".
Your queuing mechanism seems overly complex since it involves a JOIN and other things that are not simply 1-in, 1-out.
Assuming you reject my comments so far, I will proceed to critique the code...
SELECT ... FOR UPDATE is possibly required for both SELECTs.
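For example, applied to the first SELECT in the question (just a sketch, using the same placeholders; the row locks are then held until the transaction commits):
SELECT id
FROM api_queue
WHERE status = :status_processing
FOR UPDATE;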
The SELECT next to the DELETE could possibly be merged with the DELETE as a multi-table DELETE. Or it might be possible to pull it, plus the associated code, out of the transaction. (Faster transactions are less likely to deadlock.)
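A sketch of the second idea: keep the sanity-check SELECT outside the transaction so the transaction is a single fast statement (placeholders as in the question; note the count then becomes advisory, since rows can change between the check and the DELETE):
SELECT COUNT(*) AS total FROM api_queue WHERE status = :status_processing;
-- application compares `total` with the number it processed, then:
START TRANSACTION;
DELETE FROM api_queue WHERE status IN (:status_processing, :status_processing_duplicate);
COMMIT;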
You are checking for errors (deadlock, etc) after the COMMITs, yes? That's when Galera gets the hit.
When using an IN(...), sort the elements. The underlying code is probably locking the rows in the order of the IN elements. This could turn a deadlock into a delay of up to innodb_lock_wait_timeout seconds. (Such a delay is not as 'bad' as a deadlock.)
You repeat the transaction when it gets a deadlock, correct? (That's the simple way to deal with deadlocks.)
Edit (IN)
If one thread is doing UPDATE ... WHERE id IN (11,22) and another is doing UPDATE ... WHERE id IN (22,11), and each gets one row locked, then trying to get the other row locked is a deadlock -- and one would have to ROLLBACK. If, instead, both said (11,22), then (at worst) one would have to wait (but not be deadlocked). I am assuming, without proof, that InnoDB code is not clever enough to somehow avoid this IN deadlock -- by sorting the numbers, by atomically locking, or whatever. (And I would argue that clever=slower, hence not worth doing for such a rare happening.)
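Concretely (ids made up, and using a simplified form of the question's UPDATE), these two statements issued by two connections can deadlock; if both sorted their lists as (11, 22), the worst case would be that one of them waits:
UPDATE api_queue SET status = :status_processing WHERE id IN (22, 11);   -- connection 1
UPDATE api_queue SET status = :status_processing WHERE id IN (11, 22);   -- connection 2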
Related
Let's say I have a table as follows
CREATE TABLE `Foo` (
`id` int(10) unsigned NOT NULL,
`bar1` int(10) unsigned NOT NULL,
`bar2` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
);
And I have two queries:
update Foo set bar1=10 where id=5000;
update Foo set bar1=10 where id=5000 and bar1=0;
My guess is that the second query will not run slower than the first query, but I need confirmation from someone with certain knowledge.
(The reason I want to do the second is that when multiple clients select from the table first and then update it simultaneously, only one client will be able to update successfully.)
Consider the steps each UPDATE goes through:
1) Find the row. The Optimizer will look at the possible indexes (just the PK) and decide to start with id=5000. There is at most one such row.
2) (For the second case) verify that bar1=0. If not, the query is finished.
3) Check to see if there is anything to change -- is bar1 already set to 10? If so, finish.
4) Do the work of updating -- this involves saving the existing row in case of a ROLLBACK, tentatively storing the new value, etc, etc. -- This step is likely to be the most costly step.
Step 2 is the only difference -- and it is a quite small step. It is not worth worrying about when it comes to performance.
On the other hand, Step 2 means that the two Updates are different -- What should happen if bar1=4567? The first Update would change it, but the second won't.
Your final comment implies that maybe you should be using transactions to keep one client from stepping on another. Perhaps the code should be more like:
BEGIN;
SELECT ... WHERE id = 5000 FOR UPDATE;
-- decide what to do, which might include ROLLBACK and exit
UPDATE Foo SET bar1=10 WHERE id = 5000;
COMMIT;
Bottom Line: Use transactions, not extra code, to deal with concurrency.
Caveat: A transaction should be "fast" (less than a few seconds). If you need a "long" transaction (eg, a 'shopping cart' that could take minutes to finish), a different mechanism is needed. If you need a long transaction, start a new question explaining the situation. (The current question is discussing the performance of a single Update.)
I have an InnoDB table which records online users. It gets updated on every page refresh by a user to keep track of which pages they are on and their last access date to the site. I then have a cron that runs every 15 minutes to DELETE old records.
I got a 'Deadlock found when trying to get lock; try restarting transaction' for about 5 minutes last night and it appears to be when running INSERTs into this table. Can someone suggest how to avoid this error?
=== EDIT ===
Here are the queries that are running:
First Visit to site:
INSERT INTO onlineusers SET
ip = '123.456.789.123',
datetime = now(),
userid = 321,
page = '/thispage',
area = 'thisarea',
type = 3
On each page refresh:
UPDATE onlineusers SET
ip = '123.456.789.123',
datetime = now(),
userid = 321,
page = '/thispage',
area = 'thisarea',
type = 3
WHERE id = 888
Cron every 15 minutes:
DELETE FROM onlineusers WHERE datetime <= now() - INTERVAL 900 SECOND
It then does some counts to log some stats (ie: members online, visitors online).
One easy trick that can help with most deadlocks is sorting the operations in a specific order.
You get a deadlock when two transactions try to lock two locks in opposite orders, ie:
connection 1: locks key(1), locks key(2);
connection 2: locks key(2), locks key(1);
If both run at the same time, connection 1 will lock key(1), connection 2 will lock key(2) and each connection will wait for the other to release the key -> deadlock.
Now, if you changed your queries such that the connections would lock the keys at the same order, ie:
connection 1: locks key(1), locks key(2);
connection 2: locks key(1), locks key(2);
it will be impossible to get a deadlock.
So this is what I suggest:
Make sure you have no other queries that lock more than one key at a time except for the delete statement. If you do (and I suspect you do), order their WHERE ... IN (k1, k2, ..., kn) lists in ascending order.
Fix your delete statement to work in ascending order:
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
To
DELETE FROM onlineusers
WHERE id IN (
    SELECT id FROM (
        SELECT id FROM onlineusers
        WHERE datetime <= now() - INTERVAL 900 SECOND
        ORDER BY id
    ) u
);
Another thing to keep in mind is that the MySQL documentation suggests that in case of a deadlock the client should retry automatically. You can add this logic to your client code (say, 3 retries on this particular error before giving up).
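Client retry code varies by language; purely as a language-neutral sketch of the same idea, here is a made-up stored procedure that retries the purge up to 3 times on error 1213 (deadlock):
DELIMITER //
CREATE PROCEDURE purge_onlineusers()
BEGIN
  DECLARE attempts INT DEFAULT 0;
  DECLARE deadlocked BOOL DEFAULT FALSE;
  -- 1213 = deadlock; InnoDB has already rolled the transaction back when this fires
  DECLARE CONTINUE HANDLER FOR 1213 SET deadlocked = TRUE;

  purge_loop: LOOP
    SET deadlocked = FALSE;
    DELETE FROM onlineusers
    WHERE datetime <= NOW() - INTERVAL 900 SECOND;
    IF NOT deadlocked OR attempts >= 3 THEN
      LEAVE purge_loop;
    END IF;
    SET attempts = attempts + 1;
  END LOOP;
END //
DELIMITER ;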
Deadlocks happen when two transactions wait on each other to acquire a lock. Example:
Tx 1: lock A, then B
Tx 2: lock B, then A
There are numerous questions and answers about deadlocks. Each time you insert, update, or delete a row, a lock is acquired. To avoid deadlock, you must then make sure that concurrent transactions don't update rows in an order that could result in a deadlock. Generally speaking, try to acquire locks always in the same order even in different transactions (e.g. always table A first, then table B).
Another reason for deadlock in a database can be missing indexes. When a row is inserted, updated, or deleted, the database needs to check the relational constraints, that is, make sure the relations are consistent. To do so, the database needs to check the foreign keys in the related tables. This might result in other locks being acquired beyond the row that is modified. Be sure then to always have indexes on the foreign keys (and of course primary keys), otherwise it could result in a table lock instead of a row lock. If a table lock happens, the lock contention is higher and the likelihood of deadlock increases.
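For example (hypothetical orders/order_items tables, not from the question), making sure the column used in the constraint check is indexed:
-- If the column involved in the foreign-key check is not indexed, the check
-- may have to scan, and therefore lock, far more rows than the one being modified.
ALTER TABLE order_items ADD INDEX idx_order_items_order_id (order_id);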
In case someone is still struggling with this issue:
I faced similar issue where 2 requests were hitting the server at the same time. There was no situation like below:
T1:
BEGIN TRANSACTION
INSERT TABLE A
INSERT TABLE B
END TRANSACTION
T2:
BEGIN TRANSACTION
INSERT TABLE B
INSERT TABLE A
END TRANSACTION
So, I was puzzled why deadlock is happening.
Then I found that there was a parent-child relationship between the 2 tables because of a foreign key. When I was inserting a record in the child table, the transaction was acquiring a lock on the parent table's row. Immediately after that I was trying to update the parent row, which triggered an upgrade of the lock to an EXCLUSIVE one. As the second concurrent transaction was already holding a SHARED lock, it was causing a deadlock.
Refer to: https://blog.tekenlight.com/2019/02/21/database-deadlock-mysql.html
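A rough sketch of that pattern, with made-up parent/child tables:
-- Two sessions run this concurrently for the same parent row.
START TRANSACTION;
-- The foreign-key check takes a shared lock on parent row 42.
INSERT INTO child (parent_id, payload) VALUES (42, 'x');
-- Upgrading to an exclusive lock on the same parent row; if the other session
-- already holds its own shared lock on it, neither can proceed: deadlock.
UPDATE parent SET child_count = child_count + 1 WHERE id = 42;
COMMIT;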
It is likely that the delete statement will affect a large fraction of the total rows in the table. Eventually this might lead to a table lock being acquired when deleting. Holding on to a lock (in this case row- or page locks) and acquiring more locks is always a deadlock risk. However I can't explain why the insert statement leads to a lock escalation - it might have to do with page splitting/adding, but someone knowing MySQL better will have to fill in there.
For a start it can be worth trying to explicitly acquire a table lock right away for the delete statement. See LOCK TABLES and Table locking issues.
You might try having that delete job operate by first inserting the key of each row to be deleted into a temp table like this pseudocode
create temporary table deletetemp (userid int);
insert into deletetemp (userid)
select userid from onlineusers where datetime <= now() - interval 900 second;
delete from onlineusers where userid in (select userid from deletetemp);
Breaking it up like this is less efficient but it avoids the need to hold a key-range lock during the delete.
Also, modify your select queries to add a where clause excluding rows older than 900 seconds. This avoids the dependency on the cron job and allows you to reschedule it to run less often.
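For instance (a sketch), a stats query could count only rows from the last 900 seconds, whether or not the purge has run recently:
SELECT COUNT(*) AS users_online
FROM onlineusers
WHERE datetime > NOW() - INTERVAL 900 SECOND;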
Theory about the deadlocks: I don't have a lot of background in MySQL but here goes... The delete is going to hold a key-range lock for datetime, to prevent rows matching its where clause from being added in the middle of the transaction, and as it finds rows to delete it will attempt to acquire a lock on each page it is modifying. The insert is going to acquire a lock on the page it is inserting into, and then attempt to acquire the key lock. Normally the insert will wait patiently for that key lock to open up, but this will deadlock if the delete tries to lock the same page the insert is using, because the delete needs that page lock and the insert needs that key lock. This doesn't seem right for inserts though; the delete and insert are using datetime ranges that don't overlap, so maybe something else is going on.
http://dev.mysql.com/doc/refman/5.1/en/innodb-next-key-locking.html
For Java programmers using Spring, I've avoided this problem using an AOP aspect that automatically retries transactions that run into transient deadlocks.
See @RetryTransaction Javadoc for more info.
cron is dangerous. If one instance of cron fails to finish before the next is due, they are likely to fight each other.
It would be better to have a continuously running job that would delete some rows, sleep some, then repeat.
Also, INDEX(datetime) is very important for avoiding deadlocks.
But, if the datetime test includes more than, say, 20% of the table, the DELETE will do a table scan. Smaller chunks deleted more often is a workaround.
Another reason for going with smaller chunks is to lock fewer rows.
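A sketch of one pass of such a task (the loop and the sleep live in the job runner, not in SQL; the chunk size is arbitrary):
DELETE FROM onlineusers
WHERE datetime <= NOW() - INTERVAL 900 SECOND
LIMIT 1000;   -- small chunk, so fewer rows are locked per statement
-- repeat until ROW_COUNT() returns 0, then sleep a minute and start again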
Bottom line:
INDEX(datetime)
Continually running task -- delete, sleep a minute, repeat.
To make sure that the above task has not died, have a cron job whose sole purpose is to restart it upon failure.
Other deletion techniques: http://mysql.rjweb.org/doc.php/deletebig
Omry Yadan's answer (https://stackoverflow.com/a/2423921/1810962) can be simplified by using ORDER BY.
Change
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
to
DELETE FROM onlineusers
WHERE datetime <= now() - INTERVAL 900 SECOND
ORDER BY ID
to keep the order in which you delete items consistent. Also if you are doing multiple inserts in a single transaction, make sure they are also always ordered by id.
According to the mysql delete documentation:
If the ORDER BY clause is specified, the rows are deleted in the order that is specified.
You can find a reference here: https://dev.mysql.com/doc/refman/8.0/en/delete.html
I have a method, the internals of which are wrapped in a MySqlTransaction.
The deadlock issue showed up for me when I ran the same method in parallel with itself.
There was not an issue running a single instance of the method.
When I removed MySqlTransaction, I was able to run the method in parallel with itself with no issues.
Just sharing my experience, I'm not advocating anything.
Here's a neat locking problem with MariaDB/MySQL.
A server is reassembling multipart SMS messages. Messages arrive in segments. Segments with the same "smsfrom" and "uniqueid" are part of the same message. Segments have a segment number starting from 1 up to "segmenttotal". When all segments of a message have arrived, the message is complete. We have a table of unmatched segments waiting to be reassembled, as follows:
CREATE TABLE frags (
smsfrom TEXT,
uniqueid VARCHAR(32) NOT NULL,
smsbody TEXT,
segmentnum INTEGER NOT NULL,
segmenttotal INTEGER NOT NULL);
When a new segment comes in, we do, in a transaction,
SELECT ... FROM frags WHERE smsfrom = % AND uniqueid = %;
This gets us all the segments received so far. If the new one plus these has all the segment numbers, we have a complete message. We send the message off for further processing and delete the fragments involved. Fine.
If not all segments have arrived yet, we do an INSERT of the segment we just got. Autocommit is off, so both operations are part of a transaction. InnoDB engine, incidentally.
This has a race condition. Two segments come in at the same time for a two-segment message, and are processed by separate processes. Process A does the SELECT, finds nothing. Process B does the SELECT, finds nothing. Process A inserts segment 1, no problem. Process B inserts segment 2, no problem. Now we're stuck - all segments are in the table but we didn't notice. So the message is stuck there forever. (In practice we do a purge every few minutes to remove old unmatched stuff, but ignore that for now.)
So what's wrong? The SELECTs lock no rows, because they find nothing.
We need a row lock on a row that doesn't exist yet. Adding FOR UPDATE to the SELECT doesn't help; nothing to lock. Nor does LOCK IN SHARE MODE. Even going to a transaction type of SERIALIZABLE doesn't help, because that's just global LOCK IN SHARE MODE.
OK, so suppose we do the INSERT first and then do a SELECT to see if we have all the segments. Process A does the INSERT of 1, no problem. Process B does the insert of 2, no problem. Process A does a SELECT, and sees only 1. Process B does a SELECT, and sees only 2. That's repeatable read semantics. No good.
The brute force approach is a LOCK TABLE before doing any of this. That ought to work, although it's annoying, because I'm in a transaction involving other tables and LOCK TABLE implies a commit.
Doing a commit after each INSERT might work, but I'm not totally sure.
Is there a more elegant solution?
Why not
1) Process 1. Insert into your frags table. Nothing else.
Insert ....
Commit;
2) Process 2
This finds the complete multipart SMS messages by:
select smsfrom, uniqueid, count(*) from frags group by smsfrom, uniqueid having count(*) = min(segmenttotal);
Move them to the new table
delete from frags where smsfrom = ? and uniqueid = ?;
commit;
As I wrote above, I ended up doing this:
INSERT ... ;   -- Insert new fragment.
COMMIT;
SELECT ... FROM frags WHERE smsfrom = % AND uniqueid = % FOR UPDATE;
-- Check if the SELECT returned a complete set of fragments.
-- If so, reassemble and process the message, then:
DELETE FROM frags WHERE smsfrom = % AND uniqueid = %;
COMMIT;   -- releases the row locks taken by FOR UPDATE
Both the COMMIT and the FOR UPDATE are necessary. The COMMIT is needed so that each process sees any INSERT from another process. The FOR UPDATE is needed on the SELECT to row lock all the fragments until the DELETE can be done. Otherwise, two processes might see the complete set of fragments in the SELECT and reassemble and process the message twice.
This is surprisingly complicated for a one-table problem, but seems to work.
First of all, I don't see how I could be getting any deadlock at all, since I am using no explicit locking, there's only one table involved, there's a separate process each to insert, select, and update rows, only one row is inserted or updated at a time, and each process only rarely (perhaps once a minute) runs at all.
It's an email queue:
CREATE TABLE `emails_queue` (
`id` varchar(40) NOT NULL,
`email_address` varchar(128) DEFAULT NULL,
`body` text,
`status_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` enum('pending','inprocess','sent','discarded','failed') DEFAULT NULL,
KEY `status` (`status`),
KEY `status_time` (`status`,`status_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
The generating process, in response to some user action but roughly every 90 seconds, does an insert to the table, setting the status to "pending".
There's a monitoring process that every minute checks that the number of "pending" and "failed" emails is not excessive. It takes less than a second to run and has never given me any trouble.
Every minute, the sending process grabs all the pending emails. It loops through and one email at a time, sets its status to "inprocess", tries to send it, and finally sets its status accordingly to "sent", "discarded" (it has reasons for deciding an email shouldn't go out), or "failed" (rejected by the SMTP system).
The statement for setting the status is unusual.
UPDATE emails_queue SET status=?, status_time=NOW() WHERE id=? AND status = ?
That is, I only update the status if the current status is already what I believe it to be. Before this mechanism, I accidentally kicked off two sending processes and they would each try to send the same email. Now, if that were to happen, one process would successfully move the email from "pending" to "inprocess", but the second one would update zero rows, realize there's a problem, and skip that email.
The problem is, about one time in 100, the update fails altogether! I get com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
WTH?
This is the only table and only query that this happens to and it only happens in production (to maximize difficulty in investigating it).
The only two things that seem at all unusual are (1) updating a column that participates in the WHERE clause, and (2) the (unused) automatic updating of the status_time.
I'm looking for any suggestions or diagnostic techniques.
Firstly, deadlocks do not depend on explicit locking. MySQL's LOCK TABLE or using non-default transaction isolation modes are NOT required to have a deadlock. You can still have deadlocks if you never use an explicit transaction.
Deadlocks can happen on a single table, quite easily. Most commonly it's from a single hot table.
Deadlocks can even happen if all your transactions just do a single row insert.
A deadlock can happen if you have
More than one connection to the database (obviously)
Any operation that internally involves more than one lock.
What is not obvious, is that most of the time, a single row insert or update involves more than one lock. The reason for this is that secondary indexes also need to be locked during inserts / updates.
SELECTs won't lock (assuming you're using the default isolation mode, and aren't using FOR UPDATE) so they can't be the cause.
SHOW ENGINE INNODB STATUS is your friend. It will give you a bunch of (admittedly very confusing) information about deadlocks, specifically, the most recent one.
You can't completely eliminate deadlocks; they will continue to happen in production (even on test systems if you stress them properly).
Aim for a very low number of deadlocks. If 1% of your transactions deadlock, that is possibly too many.
Consider changing the transaction isolation level of your transactions to read-committed IF YOU FULLY UNDERSTAND THE IMPLICATIONS.
Ensure that your software handles deadlocks appropriately.
With some database servers there are default settings for locking behavior. The usual default is to use locks (at least on the systems I used). I'm not sure this is true on MySQL but I believe it is.
Do you have an index on the emails_queue table? The type of index can change how it does locking. In one case I dealt with, not having a clustered index on the table caused it to use page locking instead of row locking. I had explicitly told it to use row locking and it silently changed it. Page locking can cause deadlocks. Try checking that index.
If those don't help, the solution is the one suggested in the error message: catch the exception for deadlocks and re-run the SQL when it happens.
You have not described the scope of the transactions in your description. If each process that you have described is trying to do everything within a single transaction, then there certainly is the potential for deadlock in this system.
While it may seem like a deadlock should not occur because only a single table is involved, the resources that are being locked are not tables but rows. Two processes may each be holding a row lock that is required by the other processes, if the same transaction is used to manipulate multiple rows.
It's probably the tenth time I'm implementing something like this, and I've never been 100% happy about solutions I came up with.
The reason using mysql table instead of a "proper" messaging system is attractive is primarily because most application already use some relational database for other stuff (which tends to be mysql for most of the stuff I've been doing), while very few applications use a messaging system. Also - relational databases have very strong ACID properties, while messaging systems often don't.
The first idea is to use:
create table jobs(
id auto_increment not null primary key,
message text not null,
process_id varbinary(255) null default null,
key jobs_key(process_id)
);
And then enqueue looks like this:
insert into jobs(message) values('blah blah');
And dequeue looks like this:
begin;
select * from jobs where process_id is null order by id asc limit 1;
update jobs set process_id = ? where id = ?; -- whatever i just got
commit;
-- return (id, message) to application, cleanup after done
Table and enqueue look nice, but dequeue kinda bothers me. How likely is it to roll back? Or to get blocked? What keys should I use to make it O(1)-ish?
Or is there any better solution that what I'm doing?
Your dequeue could be more concise. Rather than relying on the transaction rollback, you could do it in one atomic statement without an explicit transaction:
UPDATE jobs SET process_id = ? WHERE process_id IS NULL ORDER BY ID ASC LIMIT 1;
Then you can pull jobs with (brackets [] mean optional, depending on your particulars):
SELECT * FROM jobs WHERE process_id = ? [ORDER BY ID LIMIT 1];
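When the worker finishes, a matching cleanup (again just a sketch) removes the row it claimed:
DELETE FROM jobs WHERE id = ? AND process_id = ?;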
I've built a few message queuing systems and I'm not certain what type of message you're referring to, but in the case of dequeuing (is that a word?) I've done the same thing you've done. Your method looks simple, clean and solid. Not that my work is the best, but it's proven very effective for large-scale monitoring across many sites. (error logging, mass email marketing campaigns, social networking notices)
My vote: no worries!
Brian Aker talked about a queue engine a while ago. There's been talk about a SELECT table FROM DELETE syntax, too.
If you're not worried about throughput, you can always use SELECT GET_LOCK() as a mutex. For example:
SELECT GET_LOCK('READQUEUE', 10);   -- GET_LOCK requires a timeout (in seconds)
SELECT * FROM jobs;
DELETE FROM jobs WHERE id = ?;
SELECT RELEASE_LOCK('READQUEUE');
And if you want to get really fancy, wrap it in a stored procedure.
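For example, a minimal sketch of such a procedure (the name, the 10-second lock timeout, and the OUT parameters are all made up):
DELIMITER //
CREATE PROCEDURE dequeue_job(OUT p_id INT, OUT p_message TEXT)
BEGIN
  -- If the queue is empty, the SELECT ... INTO finds nothing; leave p_id NULL.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET p_id = NULL;
  SET p_id = NULL;
  IF GET_LOCK('READQUEUE', 10) = 1 THEN
    SELECT id, message INTO p_id, p_message
    FROM jobs ORDER BY id ASC LIMIT 1;
    IF p_id IS NOT NULL THEN
      DELETE FROM jobs WHERE id = p_id;
    END IF;
    DO RELEASE_LOCK('READQUEUE');
  END IF;
END //
DELIMITER ;
-- usage: CALL dequeue_job(@id, @msg); SELECT @id, @msg;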
In MySQL 8 you can use the new NOWAIT and SKIP LOCKED keywords to avoid complexity with special locking mechanisms:
START TRANSACTION;
SELECT id, message FROM jobs
WHERE process_id IS NULL
ORDER BY id ASC LIMIT 1
FOR UPDATE SKIP LOCKED;
UPDATE jobs
SET process_id = ?
WHERE id = ?;
COMMIT;
Traditionally this was hard to achieve without hacks and unusual special tables or columns, unreliable solutions or losing concurrency.
SKIP LOCKED may cause performance issues with extremely large numbers of consumers.
This still does not, however, handle automatically marking the job complete on transaction rollback. For this you may need savepoints. That, however, might not solve all cases. You would really want to set an action to execute on transaction failure, but as part of the transaction!
In the future it's possible there may be more features to help optimise cases such as an UPDATE that can also return the matched rows. It's important to keep apprised of new features and capabilities in the change log.
Here is a solution I used, working without the process_id of the current thread, or locking the table.
SELECT * from jobs ORDER BY ID ASC LIMIT 0,1;
Get the result in a $row array, and execute:
DELETE from jobs WHERE ID=$row['ID'];
Then get the number of affected rows (mysql_affected_rows). If there is an affected row, process the job in the $row array. If there are 0 affected rows, it means some other process is already processing the selected job. Repeat the above steps until there are no rows left.
I've tested this with a 'jobs' table having 100k rows, and spawning 20 concurrent processes that do the above. No race conditions happened. You can modify the above queries to update a row with a processing flag, and delete the row after you actually processed it:
while (time() - $startTime < $timeout)
{
    SELECT * FROM jobs WHERE processing IS NULL ORDER BY ID ASC LIMIT 0,1;
    if (count($row) == 0) break;
    UPDATE jobs SET processing = 1 WHERE ID = $row['ID'];
    if (mysql_affected_rows() == 0) continue;   // another process claimed this job first
    // process your job here
    DELETE FROM jobs WHERE ID = $row['ID'];
}
Needless to say, you should use a proper message queue (ActiveMQ, RabbitMQ, etc) for this kind of work. We had to resort to this solution though, as our host regularly breaks things when updating software, so the less stuff to break the better.
I would suggest using Quartz.NET
It has providers for SQL Server, Oracle, MySql, SQLite and Firebird.
This thread has design information that should be mappable.
To quote:
Here's what I've used successfully in the past:
MsgQueue table schema
MsgId identity -- NOT NULL
MsgTypeCode varchar(20) -- NOT NULL
SourceCode varchar(20) -- process inserting the message -- NULLable
State char(1) -- 'N'ew if queued, 'A'(ctive) if processing, 'C'ompleted, default 'N' -- NOT NULL
CreateTime datetime -- default GETDATE() -- NOT NULL
Msg varchar(255) -- NULLable
Your message types are what you'd expect - messages that conform to a contract between the process(es) inserting and the process(es) reading, structured with XML or your other choice of representation (JSON would be handy in some cases, for instance).
Then 0-to-n processes can be inserting, and 0-to-n processes can be reading and processing the messages. Each reading process typically handles a single message type. Multiple instances of a process type can be running for load-balancing.
The reader pulls one message and changes the state to "A"ctive while it works on it. When it's done it changes the state to "C"omplete. It can delete the message or not depending on whether you want to keep the audit trail. Messages of State = 'N' are pulled in MsgType/Timestamp order, so there's an index on MsgType + State + CreateTime.
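In MySQL terms (the quoted schema is from a different engine, and the message-type value here is made up), the pull-one-message step could be sketched as:
-- Claim the oldest 'N'ew message of one type. A reader-code column (see the
-- variations below) lets the reader then look up exactly the row it claimed.
UPDATE MsgQueue
SET State = 'A'
WHERE MsgTypeCode = 'some_msg_type'
  AND State = 'N'
ORDER BY CreateTime
LIMIT 1;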
Variations:
State for "E"rror.
Column for Reader process code.
Timestamps for state transitions.
This has provided a nice, scalable, visible, simple mechanism for doing a number of things like you are describing. If you have a basic understanding of databases, it's pretty foolproof and extensible. There's never been an issue with locks, roll-backs, etc., because of the atomic state-transition transactions.
You can have an intermediate table to maintain the offset for the queue.
create table scan(
scan_id int primary key,
offset_id int
);
You might have multiple scans going on as well, hence one offset per scan. Initialise the offset_id = 0 at the start of the scan.
begin;
select * from jobs where id > (select offset_id from scan where scan_id = 0) order by id asc limit 1;
update scan set offset_id = ? where scan_id = ?; -- whatever i just got
commit;
All you need to do is just to maintain the last offset. This would also save you significant space (process_id per record). Hope this sounds logical.