Enqueue each row in a ssb queue from a large table - sql-server-2008

I have a table that contains 2.5 million rows; each row has one column of type xml. When a message arrives in another queue (the trigger queue), all records should be deleted from the table and enqueued in a SQL Server Service Broker queue. Performance is very important, and the current approach is too slow. What would be the best way to achieve this?
Currently we use an activated stored procedure on the trigger queue that runs the following in a while (@message <> null) loop:
begin transaction
delete top (1) from table output #tempTable
select top 1 @message = message from #tempTable
send on conversation @message
commit transaction
Are there faster ways to tackle this problem?
By the way, before someone asks: we need to start from the table because it is filled with the output of an earlier merge statement.

So your performance problem is on the send side rather than the receive side, right? (It's a bit unclear from your question.) In that case, start by trying the following:
Batch many operations in a single transaction. You're most likely being hit hardest by synchronous log flushes at commit time.
Process the table more efficiently (e.g. move more rows at once into the temp table, then use a cursor to iterate over it and send the messages).
In case you're experiencing problems on the receive side, take a look at this great article by Remus.
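The two send-side suggestions could be combined roughly as follows. This is only a sketch under assumptions: the table dbo.SourceTable, its message column, the message type [//Example/RowMessage] and the batch size of 1000 are all hypothetical, and @handle is assumed to already hold an open conversation handle (for example from BEGIN DIALOG CONVERSATION).

```sql
-- Hypothetical names: dbo.SourceTable(message xml), [//Example/RowMessage].
-- Assumption: @handle is set to an open conversation handle before the loop.
DECLARE @handle uniqueidentifier;
DECLARE @batch TABLE (message xml);
DECLARE @message xml;
DECLARE @rows int;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;

    DELETE FROM @batch;

    -- Move up to 1000 rows out of the source table in one statement.
    DELETE TOP (1000) FROM dbo.SourceTable
    OUTPUT deleted.message INTO @batch (message);
    SET @rows = @@ROWCOUNT;

    -- Send every row of the batch inside the same transaction.
    DECLARE batch_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT message FROM @batch;
    OPEN batch_cursor;
    FETCH NEXT FROM batch_cursor INTO @message;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SEND ON CONVERSATION @handle
            MESSAGE TYPE [//Example/RowMessage] (@message);
        FETCH NEXT FROM batch_cursor INTO @message;
    END;
    CLOSE batch_cursor;
    DEALLOCATE batch_cursor;

    -- One commit, and so one log flush, per batch instead of per row.
    COMMIT TRANSACTION;
END;
```

The batch size is a tuning knob: larger batches mean fewer commits but longer transactions, so it is worth measuring a few values.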

Related

How can I prevent a stored procedure from running twice at the same time?

I'm using an Aurora DB (ie MySQL version 5.6.10) as a queue, and I'm using a stored procedure to pull records out of a table in batches. The sproc works with the following steps...
Select the next batch of data into a temptable
Write the IDs of the records in the temp table to a log table
Output the records
Once a record has been added to the log, the sproc won't select it again the next time it's called, so multiple servers can call this sproc and each deal with batches of data from the queue without stepping on each other's toes.
The sproc runs in a fraction of a second, but my company is now spinning up servers automatically, and these cloned servers are calling the sproc at exactly the same time. The result is that the same records are being selected twice.
Is there a way to limit this sproc to one call at a time? Ideally, any additional calls should wait until the first call has finished, and then run.
Unfortunately, I have very little experience with MySQL, so I'm not really sure where to start. I'd much appreciate it if anyone could point me in the right direction.
This is a job for MySQL table locking. Try something like this. (You didn't show us your queries so there's a lot of guesswork here.)
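SET autocommit = 0;
-- While LOCK TABLES is in effect, a session can only access tables it has locked, so lock both.
LOCK TABLES logtable WRITE, whatevertable WRITE;
CREATE TEMPORARY TABLE temptable AS
SELECT whatever FROM whatevertable FOR UPDATE;
INSERT INTO logtable (id)
SELECT id FROM temptable;
-- Commit before releasing the table locks.
COMMIT;
UNLOCK TABLES;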
If more than one connection tries to run this sequence concurrently, one will wait for the other's UNLOCK TABLES; to proceed. You say your SP is quick, so probably nobody will notice the short wait.
Pro tip: When you have the same timed code running on lots of servers, it's best to put in a short random delay before running the job. That way the shared resources (like your MySQL database) won't get hammered by a whole lot of requests precisely timed to be simultaneous.
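If the delay has to live on the database side, one possible sketch is to sleep for a random fraction of a couple of seconds before calling the procedure. The procedure name get_next_batch and the two-second ceiling are hypothetical; the same jitter could just as well be added in the calling application.

```sql
-- Wait a random 0 to 2 seconds so that cloned servers do not all arrive at once.
DO SLEEP(RAND() * 2);
-- Hypothetical name for the batch-pulling stored procedure.
CALL get_next_batch();
```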

Alternative to skip locked in mariaDB

Is there a good, performant alternative to FOR UPDATE SKIP LOCKED in MariaDB? Or is there a good practice for achieving job queueing in MariaDB?
Instead of using a lock to indicate a queue record is being processed, use an indexed processing column. Set it to 0 for new records, and, in a separate transaction from any processing, select a single not yet processing record and update it to 1. Possibly also store the time and process or thread id and server that is processing the record. Have a separate monitoring process to detect jobs flagged as processing that did not complete processing within the expected time.
An alternative that avoids even the temporary lock on a non-primary index needed to select a record is to use a separate, non-database message queue to notify you of new records available in the database queue. (Unless you won't ever care if a unit of work is processed more than once, I would always use a database table in addition to any non-database queue.)
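A minimal sketch of the processing-column approach, assuming a hypothetical table job_queue(id, payload, processing, worker, claimed_at) with an index on processing. The claim runs in its own short transaction; the actual work happens afterwards.

```sql
-- Hypothetical schema: job_queue(id, payload, processing, worker, claimed_at).
SET @job_id = NULL;

START TRANSACTION;
-- Pick one unclaimed row; the row lock is held only until COMMIT below.
SELECT id INTO @job_id
  FROM job_queue
 WHERE processing = 0
 ORDER BY id
 LIMIT 1
   FOR UPDATE;
-- Flag it as claimed and record who took it and when.
UPDATE job_queue
   SET processing = 1,
       worker     = CONNECTION_ID(),
       claimed_at = NOW()
 WHERE id = @job_id;
COMMIT;

-- Process job @job_id here (if it is not NULL), then delete the row.

-- Monitoring query: jobs claimed more than 10 minutes ago and still present.
SELECT id, worker, claimed_at
  FROM job_queue
 WHERE processing = 1
   AND claimed_at < NOW() - INTERVAL 10 MINUTE;
```

The 10-minute threshold is arbitrary; it should reflect how long a job is expected to take.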
Another option for dequeue operations is MariaDB's DELETE ... RETURNING:
DELETE FROM QUEUE_TABLE LIMIT 1 RETURNING *
Depending on your needs, it might work well enough.
Update 2022-06-14:
MariaDB supports SKIP LOCKED now.

How can I parallelize Writes to the same row in MySQL?

I'm currently building a system that does running computations, and every 5 seconds inserts or updates information based on those computations to a few rows in MySQL. I'm working on running this system on a few different servers at once right now with a few agents that are each doing similar processing and then writing on the same set of rows. I already randomize the order in which each agent writes its set of rows, but there's still a lot of deadlock happening. What's the best/fastest way to get through those deadlocks? Should I just rerun the query each time one happens, or do row locks, or something else entirely?
I suggest you try something that won't require more than one client to update your 'few rows.'
For example, you could have each agent that produces results do an INSERT into a staging table that uses the MEMORY storage engine.
Then, every five seconds you can run a MySQL event (a stored procedure within the server) that loops through all the rows in that table, posting their results to your 'few rows' and then deleting them. If it's important for the rows in your staging table to be processed in order, then you can use an AUTO_INCREMENT id field. But it might not be important for them to be in order.
If you want to get fancier and more scalable than that, you'll need a queue management system like Apache ActiveMQ.
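The staging-table idea might look roughly like this. All names (staging_results, totals, apply_staged_results) are hypothetical, and the sketch assumes that each agent's result can be expressed as a delta to add to a row. It applies the staged rows with one set-based UPDATE rather than a row-by-row loop, and it requires the event scheduler to be enabled.

```sql
-- Hypothetical staging table; agents only ever INSERT into it.
CREATE TABLE staging_results (
    id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    row_key INT NOT NULL,
    delta   INT NOT NULL
) ENGINE = MEMORY;

-- What each agent runs instead of updating the shared rows directly:
INSERT INTO staging_results (row_key, delta) VALUES (1, 5);

-- DELIMITER is a mysql client command, needed for the multi-statement body.
DELIMITER //
CREATE EVENT apply_staged_results
ON SCHEDULE EVERY 5 SECOND
DO
BEGIN
    DECLARE max_id BIGINT UNSIGNED;
    -- Snapshot the highest id so rows inserted meanwhile wait for the next run.
    SELECT MAX(id) INTO max_id FROM staging_results;
    IF max_id IS NOT NULL THEN
        -- Hypothetical target table totals(row_key, total).
        UPDATE totals t
          JOIN (SELECT row_key, SUM(delta) AS delta
                  FROM staging_results
                 WHERE id <= max_id
                 GROUP BY row_key) s ON s.row_key = t.row_key
           SET t.total = t.total + s.delta;
        DELETE FROM staging_results WHERE id <= max_id;
    END IF;
END//
DELIMITER ;
```

Because only the event writes to the target rows, the agents no longer contend for the same row locks. Note that MEMORY tables are not transactional and lose their contents on a server restart, so this only suits results that can be recomputed.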

mysql row level read lock to replace messaging queue

I have a MySQL table in which I store jobs to be processed, mainly text fields of raw data that will take around a minute each to process.
I have 2 servers pulling data from that table, processing it, and then deleting it.
To manage the job allocation between the 2 servers I am currently using Amazon SQS. I store all the row IDs that need processing in SQS, and the worker servers poll SQS to get new rows to work on.
The system currently works but SQS adds a layer of complexity and costs that I feel are overkill to achieve what I am doing.
I am trying to implement the same thing without SQS and was wondering if there is any way to read lock a row so that if one worker is working on one row, no other worker can select that row. Or if there's any better way to do it.
A simple workaround: add one more column to your jobs table, is_taken_by INT.
Then in your worker you do something like this:
select id into @job_id from jobs where is_taken_by is null limit 1 for update;
update jobs set is_taken_by = @worker_pid where id = @job_id;
SELECT ... FOR UPDATE sets exclusive locks on rows it reads. This way you ensure that no other worker can take the same job.
Note: you have to run those two lines in an explicit transaction.
Locking of rows for update using SELECT FOR UPDATE only applies when autocommit is disabled (either by beginning a transaction with START TRANSACTION or by setting autocommit to 0). If autocommit is enabled, the rows matching the specification are not locked.

Seeking an example of a procedure that uses row_count

I want to write a procedure that will handle the insert of data into 2 tables. If the insert should fail in either one then the whole procedure should fail. I've tried this many different ways and cannot get it to work. I've purposefully made my second insert fail but the data is inserted into the first table anyway.
I've tried to nest IF statements based on the rowcount but even though the data fails on the second insert, the data is still being inserted into the first table. I'm looking for a total number of 2 affected rows.
Can someone please show me how to handle multiple inserts and rollback if one of them fails? A short example would be nice.
If you are using InnoDB tables (or other compatible engine) you can use the Transaction feature of MySQL that allows you to do exactly what you want.
Basically, you start the transaction,
run the queries, checking the result of each one.
If every result is OK, you call COMMIT;
otherwise you call ROLLBACK to undo all the queries within the transaction.
You can read an article about it, with examples, here.
HTH!
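A short sketch of that pattern inside a stored procedure, assuming two hypothetical InnoDB tables table_one and table_two. Instead of checking each result by hand, it uses an exit handler that rolls back on any SQL error and passes the error on to the caller.

```sql
-- Hypothetical tables and procedure name; both tables are assumed to be InnoDB.
DELIMITER //
CREATE PROCEDURE insert_both(IN p_name VARCHAR(50), IN p_note VARCHAR(50))
BEGIN
    -- On any SQL error, undo everything done so far and re-raise the error.
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;

    START TRANSACTION;
    INSERT INTO table_one (name) VALUES (p_name);
    -- If this insert fails, the handler rolls back the first insert too.
    INSERT INTO table_two (one_id, note) VALUES (LAST_INSERT_ID(), p_note);
    COMMIT;
END//
DELIMITER ;
```

Calling CALL insert_both('a', 'b'); then either inserts one row into each table or leaves both untouched.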
You could try turning autocommit off. It might be automatically committing your first insert even though you haven't explicitly committed the transaction that's been started:
SET autocommit = 0;
START TRANSACTION;
......