Leasing jobs (atomic update and get) from a MySQL database - mysql

I have a MySQL table that manages jobs that worker-clients can lease for processing. Apart from the columns that describe the job, the table has a unique primary key column id, a time-stamp-column lease, a boolean-column complete, and an int-column priority.
I'm trying to write a (set of) SQL statement(s) that will manage the leasing-process. My current plan is to find the first incomplete job that has a lease-date that is at least 8 hours in the past (no job should take more than one hour, so an incomplete lease that is that old probably means that the client died and the job needs to be restarted), set its lease-date to the current time-stamp, and return its info. All of this, of course, needs to happen atomically.
I found a neat trick here on SO and a variation of it in the discussion of the MySQL documentation (see post on 7-29-04 here) that uses user-defined variables to return the leased job from an UPDATE statement.
And, indeed, this works fine:
UPDATE jobs SET lease=NOW() WHERE TIMESTAMPDIFF(HOUR,lease,NOW())>=8 AND NOT complete AND @id:=id LIMIT 1;
SELECT * FROM jobs WHERE id=@id;
The problem comes in when I try to add priorities to the jobs and add ORDER BY priority into the UPDATE statement right before LIMIT. The UPDATE still works as expected, but the SELECT always returns the same row back (either the first or the last, but not the one that was actually updated). I'm a little confused by this, since LIMIT 1 should make sure that the first update that actually happens will terminate the UPDATE process, leaving @id set to the correct value of that updated row, no? For some reason it seems to keep evaluating the condition @id:=id for all rows anyways, even after it's done with its update (or maybe it evaluates it first for all rows before even figuring out which one to update, I don't know...).
To fix this, I tried rewriting the statement to make sure the variable really only gets set for the matching row:
UPDATE jobs SET lease=NOW(),@id:=id WHERE TIMESTAMPDIFF(HOUR,lease,NOW())>=8 AND NOT complete ORDER BY priority LIMIT 1;
But for some reason, this gives me the following error:
Error Code : 1064
You have an error in your SQL syntax; check the manual that corresponds
to your MySQL server version for the right syntax to use near
'@id:=id WHERE TIMESTAMPDIFF(HOUR,lease,NOW())>=8 AND NOT complete ORDER BY prior'
at line 1
So, it seems that I can't assign the variable in the SET-part of the UPDATE (although this was the way it was suggested in the SO-answer linked above).
Can this approach be salvaged somehow or is there a better one altogether?
PS: I'm using MySQL server v5.5.44-0+deb8u1

My solution, with a little trick:
First: you must use a subselect wrapped in a derived table, so that UPDATE does not know that it is the same table.
Second: you must initialize @id with "(SELECT @id:=0)"; otherwise, if no row is found, the variable still holds the last value that was set. Here you can also specify whether 0 or '' is returned when no result is found.
UPDATE jobs SET lease=NOW() WHERE id =
( SELECT * FROM
( SELECT @id:=id FROM jobs,(SELECT @id:=0) AS tmp_id
WHERE TIMESTAMPDIFF(HOUR,lease,NOW())>=8
AND NOT complete ORDER BY priority LIMIT 1
) AS tmp
);

It is OK that you found a solution.
If this must be quite stable, I would go for a different solution. I would not rely on atomicity, but on "commit"-like workflows. You should identify your worker-client with a unique key, either in its own table or with a secure hash key. You add two fields to your jobs table: worker and state. So if you look for a job for worker W345, you assign that worker to the job.
First part would be
update jobs set worker='W345', state='planning', lease=now()
where TIMESTAMPDIFF(HOUR,lease,NOW())>=8
AND NOT complete
ORDER BY priority LIMIT 1;
Next part (could be even from different part of application)
select * from jobs where worker='W345' and state='planning';
get id and data, update:
update jobs set state='sending', lease=now() where id=...;
Maybe you can even have the sending of the job confirmed; otherwise you assume that the job started once it was sent.
update jobs set state='working', lease=now() where id = ...;
You can find all jobs that died before being sent to a worker by their state and a lease that is a few minutes old. You can find out where the process got into trouble, which workers run into trouble most often, and so on.
Maybe the real details differ, but as long as you have some status column you should be quite flexible and find your solution.
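To make the claim step concrete, here is a minimal sketch in Python. It uses the standard sqlite3 module as a stand-in for MySQL so that it runs anywhere, which is an assumption of this example and not part of the answer. The lease is stored as integer seconds, and the function name claim_job and the worker key 'W346' are made up for illustration. SQLite's default build has no UPDATE ... ORDER BY ... LIMIT, so the sketch picks the row with a subquery; on MySQL you would keep the statement shown above.

```python
import sqlite3

def claim_job(conn, worker, now, stale_after=8 * 3600):
    """Assign the best stale, incomplete job to `worker`; return its row or None."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            """
            UPDATE jobs
               SET worker = ?, state = 'planning', lease = ?
             WHERE id = (SELECT id FROM jobs
                          WHERE complete = 0 AND ? - lease >= ?
                          ORDER BY priority LIMIT 1)
            """,
            (worker, now, now, stale_after),
        )
        if cur.rowcount == 0:  # nothing was stale, so nothing was claimed
            return None
        # Second step of the workflow: look the job up by worker and state.
        return conn.execute(
            "SELECT id, priority, worker, state FROM jobs "
            "WHERE worker = ? AND state = 'planning'",
            (worker,),
        ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, priority INTEGER, lease INTEGER,"
    " complete INTEGER DEFAULT 0, worker TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (id, priority, lease) VALUES (?, ?, 0)",
    [(1, 5), (2, 1), (3, 3)],
)

NOW = 100_000  # pretend current time in seconds; all leases above are stale
first = claim_job(conn, "W345", NOW)
second = claim_job(conn, "W346", NOW)
print(first, second)
```

Because the fresh lease makes a claimed job non-stale, the second worker gets the next job by priority and not the same one.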

I was able to fix things with the following hack:
UPDATE jobs SET lease=IF(@id:=id,NOW(),0) WHERE TIMESTAMPDIFF(HOUR,lease,NOW())>=8 AND NOT complete ORDER BY priority LIMIT 1;
Seems like a bare variable assignment is simply not allowed in the SET section of UPDATE; it has to be part of an expression that is assigned to a column.
Note: Since the id column is an auto-increment primary key, it is never 0 or NULL. Thus, the assignment @id:=id inside the IF-statement should always evaluate to TRUE and therefore lease should be set correctly (correct me if I'm wrong on this, please!).
One thing to keep in mind: The variable @id by default is scoped to the MySQL connection (not any Java Statement-object, for example, or similar), so if one connection is to be used for multiple job-leases, one needs to ensure that the different UPDATE/SELECT-pairs never get interleaved. Or one could add an increasing number to the variable-name (@id1, @id2, @id3, ...) to guarantee correct results, but I don't know what performance (or memory-use) impact this will have on the MySQL-server. Or, the whole thing could be packaged up into a stored procedure and the variable declared as local.

Related

Switch values in MySQL table

I have a table in MySQL with a field "Ordering". These are just auto-incremented numbers. Now I wonder if there is a query to change the values from the last to the first...
So the entry with ordering 205 should become 1, 204 -> 2 and so on...
It's actually not an auto-increment. The problem is I started adding projects from the current website. From page 1 to page 20, but the first item on page 1 is the latest. The way I added the new projects, the newest is on the last page..
If the ordering field is switched, the new items added will be correctly numbered again and added to the front page. It's just a wrong way I started adding old projects...
Structure
Examples of the content
I can't comment due to limitations, but I really agree with @Abhik Chakraborty.
You don't want to do this. Just use the order by as he suggested.
Example:
SELECT * FROM tableName
ORDER BY columnName DESC;
Just in case you would like to know more about it: http://www.w3schools.com/sql/sql_orderby.asp
Try this as one statement call:
SET @MaxSort := (SELECT MAX(Ordering) FROM MyTable);
UPDATE MyTable t set t.Ordering = (@MaxSort + 1 - t.Ordering);
This will work if the field doesn't have a unique constraint.
But this field should not be an auto_increment field in the first place. Auto-increment is an increasing counter, NOT a decreasing one. The exception is if you are just trying to fix existing data and the new records will keep increasing.
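As a quick check of the arithmetic, here is a minimal sketch in Python using the standard sqlite3 module in place of MySQL (an assumption made so the example is self-contained). The maximum is read first and passed as a parameter, which plays the role of the variable above. The name column and the four sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (name TEXT, Ordering INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("a", 1), ("b", 2), ("c", 3), ("d", 4)],
)

def reverse_ordering(conn):
    """Map Ordering k to (max + 1 - k), so the last entry becomes the first."""
    (max_sort,) = conn.execute("SELECT MAX(Ordering) FROM MyTable").fetchone()
    with conn:  # commit the update as one transaction
        conn.execute("UPDATE MyTable SET Ordering = ? + 1 - Ordering", (max_sort,))

reverse_ordering(conn)
rows = conn.execute("SELECT name, Ordering FROM MyTable ORDER BY name").fetchall()
print(rows)
```

Running the function a second time restores the original numbering, since the mapping is its own inverse.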
Additional explanation
Thanks for pointing it out. Multiple queries inside a single statement call don't work with php_mysqli; this is not allowed because of potential SQL injection attacks if the server permits it. Maybe you can set up PHPMyAdmin to use PHP PDO.
I can use multiple queries, but I'm using PHP PDO or DBeaver database manager.
I can only suggest to supply MaxSort manually (since this is one time job anyway):
UPDATE
MyTable t
set
t.Ordering = 254 - t.Ordering + 1;

Selecting and updating a row while dealing with race conditions?

We have a table of elements that can be issued to clients. These elements can only ever be given to a client once, and we have situations where many clients could be pulling elements all at the same time. We then need to return data associated with it (so there is an update, and then a select).
The current solution is that a random one is found/updated to be issued=true and sets its id as LAST_INSERT_ID; then immediately afterwards it makes the select call to find where('id = LAST_INSERT_ID()') which is unique per connection.
Since we're updating where issued=false to issued=true and [last inserted], that one call is small enough to not encounter race condition issues.
But, all this is being done in SQL and feels very hackish. This does not seem like a rare enough problem that it has not been solved using a more Railsy solution. Wrapping a transaction might work to prevent double-issues, but then we'd need retry logic in the case the transaction failed.
What solution are we not thinking of?
You will want to use database-level locking to avoid race conditions.
One way to do this in MySQL is SELECT FOR UPDATE like this:
SELECT * FROM elements WHERE issued=false LIMIT 1 FOR UPDATE
In ActiveRecord (Rails), this is called pessimistic locking, and an implementation would look like this:
Element.transaction do
  element = Element.lock(true).where(issued: false).first
  element.issued = true
  # ... do other stuff to assign to a given client
  element.save!
end
If that got kicked off more than once at the same time, the 2nd call would be blocked until the first call finished, so by the time it executed, the first record would already be updated to issued=true and the 2nd call would return the next record instead of the same record.
You can read about SELECT FOR UPDATE here

Update statistic counter or just count(*) - Performance

What is the faster/better way to keep track on statistical data in a message board?
-> number of posts/topics
Update a column like 'number_of_posts' for each incoming post or after a post gets deleted.
Or just count(*) on the posts matching a topicId?
Just use count(*) - it's built into the database. It's well tested, and already written.
Having a special column to do this for you means you need to write the code to manage it, keep it in sync with the actual value (on adds and deletes). Why make more work for yourself?

How to deal with two transactions updating the same row in SQL Server

I have the following situation that I wish to deal with:
A table of values:
id int
val varchar(20)
used bit
flag int
I want to find the first row WHERE used = 0 AND flag IS NULL and stick something in 'flag'. Once that is done any other user will not be able to use that row (because flag is not null)
That's simple enough to do of course:
UPDATE top (1) mytable
SET flag = someUniqueValue
WHERE used = 0
AND flag IS NULL
What I want to know is what happens if two users are running the same UPDATE at the same time. Obviously one will get there first.
I don't know how to go about testing this scenario, so can't find out myself.
Does the second user over-ride the first? (straight away or after the lock is released?)
Does the second user get locked out and get an error? (If so how do I go about detecting the error?)
Any help would be appreciated.
The second user will over-ride the first. It won't get locked unless enclosed in a transaction.
Check this link....
OK, after a bit of research I have found my own answer.
It seems I have to lock the table, do my update, and then release the table lock. The following sql does all of that in one go:
UPDATE top (1) mytable WITH (TABLOCKX)
SET flag = someUniqueValue
WHERE used = 0
AND flag IS NULL
To test it, I ran two loops (of 10000 cycles each - a bit over the top but did the trick). The first loop stuck one value in, the second another value. The end result showed that there were exactly 10000 of each value in the table when both loops finished running.

How to properly avoid Mysql Race Conditions

I know this has been asked before, but I'm still confused and would like to avoid any problems before I go into programming if possible.
I plan on having an internal website with at least 100 users active at any given time. Users would post an item (inserted into db with a 0 as its value) and that item would be shown via a php site (db query). Users then get the option to press a button and lock that item as theirs (assign the value of that item as their id)
How do I ensure that 2 or more users don't retrieve the same item at the same time? I know in programming like C++ I would just use a plain ol' mutex lock. Is there an equivalent in MySQL that will lock just one item entry like that? I've seen references to LOCK_TABLES and GET_LOCK and many others, so I'm still very confused about what would be best.
There is potential for many people all racing to press that one button and it would be disastrous if multiple people get a confirmation.
I know this is a prime example of a race condition, but mysql is foreign territory for me.
I obviously will query the value of the item before I update it and make sure it hasn't been written, but what is the best way to ensure that this race condition is avoided?
Thanks in advance.
To achieve this, you will need to lock the record somehow.
Add a column LockedBy defaulting to 0.
When someone pushes the button execute a query resembling this:
UPDATE table SET LockedBy= WHERE LockedBy=0 and id=;
After the update verify the affected rows (in php mysql_affected_rows). If the value is 0 it means the query did not update anything because the LockedBy column is not 0 and thus locked by someone else.
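A minimal sketch of this check in Python, using the standard sqlite3 module as a stand-in for MySQL (an assumption so the example runs on its own). The table name items, the function try_lock and the user ids 42 and 77 are made up for illustration; cursor.rowcount plays the role of mysql_affected_rows.

```python
import sqlite3

def try_lock(conn, item_id, user_id):
    """Claim the item only if nobody holds it; True means this caller won."""
    with conn:  # commit the single UPDATE as its own transaction
        cur = conn.execute(
            "UPDATE items SET LockedBy = ? WHERE LockedBy = 0 AND id = ?",
            (user_id, item_id),
        )
    # 1 affected row: we took the lock. 0 rows: someone else already holds it.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, LockedBy INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO items (id) VALUES (1)")

won = try_lock(conn, 1, 42)   # first user presses the button
lost = try_lock(conn, 1, 77)  # second user presses it afterwards
owner = conn.execute("SELECT LockedBy FROM items WHERE id = 1").fetchone()[0]
print(won, lost, owner)
```

The test and the write happen in one statement, so there is no gap between checking LockedBy and setting it.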
Hope this helps
When you post a row, set the column to NULL, not 0.
Then when a user updates the row to make it their own, update it as follows:
UPDATE MyTable SET ownership = COALESCE(ownership, $my_user_id) WHERE id = ...
COALESCE() returns its first non-null argument. So even if you and I are updating concurrently, the first one to commit gets to set the value. The second one will not override that value.
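Here is a minimal sketch of the COALESCE variant in Python, again with the standard sqlite3 module standing in for MySQL (an assumption of this example). The function name claim and the user ids are made up. Since the UPDATE matches the row even when it changes nothing, the sketch reads the column back to see who actually owns the row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, ownership INTEGER)")
conn.execute("INSERT INTO MyTable (id, ownership) VALUES (1, NULL)")

def claim(conn, item_id, user_id):
    """Set ownership only if it is still NULL; report whether we own the row."""
    with conn:
        conn.execute(
            "UPDATE MyTable SET ownership = COALESCE(ownership, ?) WHERE id = ?",
            (user_id, item_id),
        )
    owner = conn.execute(
        "SELECT ownership FROM MyTable WHERE id = ?", (item_id,)
    ).fetchone()[0]
    return owner == user_id

first_claim = claim(conn, 1, 42)   # ownership was NULL, so 42 is stored
second_claim = claim(conn, 1, 77)  # COALESCE keeps the existing 42
print(first_claim, second_claim)
```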
You may consider transactions:
START TRANSACTION;
SELECT ownership FROM ....;
UPDATE table .....; -- set the ownership if the row is not owned yet
COMMIT;
You can also ROLLBACK all the queries in the transaction if you catch an error!