Read Lock on a row of a Database table - mysql

I am trying to build a job scheduler. I have a list of jobs to be executed on 2-3 different machines on a time basis, so any machine can pick any job and will execute it if its next_execution_time < current_time. I am storing all the jobs in a database table, and I am using a SELECT ... FOR UPDATE query in SQL to select a job for execution.
The problem with this approach is that if machine1 has picked a job, the other machines will still select the same job for execution, but they can't execute it: since there is only a write lock, they will wait for the lock to be released, or a lock timeout will occur. Is there any way for the other machines to skip this job and execute other jobs using SQL locks? No other column should be added to the table.
The flow is something like this:
select a job and lock it -> execute the job -> release the lock
I am using Ruby on Rails to develop this. If there is a no-wait or set_lock_timeout = 0 option in Rails, it could probably solve the problem. If it exists, what is the syntax?

Actually you have a simple way of doing this with your current table in MySQL: temporarily lock the table when selecting the next task. I'm assuming you have a column in the table that flags already started/done tasks; otherwise you can reuse the datetime column that holds the job's start time to flag that it is already started/done:
lock tables jobs write;
select * from jobs where start_time < current_time and status = 'pending' order by start_time;
-- be careful here to check for SQL errors in your code and run unlock tables
-- if an exception is thrown or something like that
update jobs set status = 'started' where id = the_one_you_selected;
unlock tables;
And that's it: multiple concurrent threads/processes can use the jobs table to execute tasks without having two threads running the same task.
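For what it's worth, if the server is MySQL 8.0 or later, the row-skipping behaviour the question asks about exists natively: SELECT ... FOR UPDATE accepts NOWAIT and SKIP LOCKED modifiers, so a worker can claim the next unlocked row instead of blocking on one another session already holds. A sketch against the same jobs table as above:

```sql
-- MySQL 8.0+: claim the next job that no other session has locked
START TRANSACTION;
SELECT * FROM jobs
WHERE start_time < NOW() AND status = 'pending'
ORDER BY start_time
LIMIT 1
FOR UPDATE SKIP LOCKED;  -- NOWAIT would raise an error instead of skipping
UPDATE jobs SET status = 'started' WHERE id = /* selected id */;
COMMIT;
```

With SKIP LOCKED each machine simply sees a result set that excludes rows locked by the others, which is exactly the "skip this job and execute other jobs" behaviour asked for.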

Related

Table Renaming in an Explicit Transaction

I am extracting a subset of data from a backend system to load into a SQL table for querying by a number of local systems. I do not expect the dataset to ever be very large - no more than a few thousand records. The extract will run every two minutes on a SQL2008 server. The local systems are in use 24 x 7.
In my prototype, I extract the data into a staging table, then drop the live table and rename the staging table to become the live table in an explicit transaction.
SELECT fieldlist
INTO Temp_MyTable_Staging
FROM FOOBAR;
BEGIN TRANSACTION
IF(OBJECT_ID('dbo.MyTable') Is Not Null)
DROP TABLE MyTable;
EXECUTE sp_rename N'dbo.Temp_MyTable_Staging', N'MyTable';
COMMIT
I have found lots of posts on the theory of transactions and locks, but none that explain what actually happens if a scheduled job tries to query the table in the few milliseconds while the drop/rename executes. Does the scheduled job just wait a few moments, or does it terminate?
Conversely, what happens if the rename starts while a scheduled job is selecting from the live table? Does the rename transaction fail to get a lock and therefore terminate?
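Part of this can be pinned down from SQL Server's documented defaults: DROP TABLE and sp_rename take schema-modification locks, a concurrent query requests a schema-stability lock, and with default settings a session waits for locks indefinitely (LOCK_TIMEOUT = -1), so a blocked query waits those few milliseconds rather than terminating. Whether a session waits or errors out is configurable; a sketch, with the 5000 ms value chosen arbitrarily:

```sql
-- Default LOCK_TIMEOUT is -1: a blocked query waits indefinitely.
-- With a finite timeout, a blocked query fails with error 1222
-- ("Lock request time out period exceeded") instead of waiting.
SET LOCK_TIMEOUT 5000;   -- milliseconds
SELECT fieldlist FROM dbo.MyTable;
```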

Make sure the cron job won't do the same job twice

I've got a list of similar tasks in a MySQL database and a PHP script that takes out one task at a time and executes it. When it's done, it changes the flag from pending to done.
I want to speed up performance by adding more scripts (up to 20) running against the same database. How do I make sure these scripts won't execute the same task twice, i.e. process the same row in the table?
Thanks in advance!
One possible approach is:
You can change the datatype of the flag column to an ENUM type (if it is not already), with three possible values: pending, in_process, done.
When selecting a pending task to do, take an explicit WRITE lock on the table so that no other session can update it.
Code example:
LOCK TABLES tasks_table WRITE; -- locking the table for read/write
-- Selecting a pending task to do
SELECT * FROM tasks_table
WHERE flag = 'pending'
LIMIT 1;
-- In application code (PHP) - get the Primary key value of the selected task.
-- Now update the flag to in_process for the selected task
UPDATE tasks_table
SET flag = 'in_process'
WHERE primary_key_field = $selected_value;
At the end, do not forget to release the explicit lock.
Code:
-- Release the explicit Lock
UNLOCK TABLES;
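If holding a table-level lock while choosing a task turns out to be a bottleneck, the same no-two-workers guarantee can come from a single atomic UPDATE that claims one pending row, checked via the affected-row count. A sketch against the same hypothetical tasks_table; the LAST_INSERT_ID(expr) assignment is a standard MySQL idiom for recording which row the UPDATE touched:

```sql
-- Atomically claim one pending task; no explicit table lock needed.
-- Assigning primary_key_field = LAST_INSERT_ID(primary_key_field)
-- leaves the key unchanged but records it for this session.
UPDATE tasks_table
SET flag = 'in_process',
    primary_key_field = LAST_INSERT_ID(primary_key_field)
WHERE flag = 'pending'
LIMIT 1;

-- ROW_COUNT() = 1 means this session won a task;
-- LAST_INSERT_ID() then yields its primary key.
SELECT ROW_COUNT() AS claimed, LAST_INSERT_ID() AS task_id;
```

Two scripts running this concurrently will each claim a different pending row (or get ROW_COUNT() = 0 when none are left), so no task is processed twice.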

How to manage single thread job using mysql database?

I need to run a PHP script, and I want to make sure no more than one instance of it is running at the same time.
I am using MySQL, and I thought about this solution:
I built the table below:
job_id           | task_id | last_updated_time (AUTO UPDATE)
"sending_emails" | 77238   | 2017-05-03 12:02:02
Before running the script I create a random task id, then I run a query to update the task_id:
$task_id = generate_random_task_id();
$query = "
UPDATE
jobs
SET
task_id = $task_id
WHERE
task_id = $task_id
OR
NOW() - last_updated_time > 30
LIMIT 1
"
/*
Then I need to check if there was an update, if yes then I will run the script otherwise i will stop since there is already another script running
*/
$query = "SELECT job_id FROM jobs WHERE task_id = $task_id";
$result = run($query);
if (!isset($result['job_id'])) {
    die();
}
is there any chance that two scripts run at the same time ?
No, they can't run at the same time. Here's what MySQL's documentation says about the locks set by UPDATE:
UPDATE ... WHERE ... sets an exclusive next-key lock on every record
the search encounters. However, only an index record lock is required
for statements that lock rows using a unique index to search for a
unique row.
Here's more about Shared and Exclusive locks:
A shared (S) lock permits the transaction that holds the lock to read
a row.
An exclusive (X) lock permits the transaction that holds the lock to
update or delete a row.
If a transaction T1 holds an exclusive (X) lock on row r, a request
from some distinct transaction T2 for a lock of either type on r
cannot be granted immediately. Instead, transaction T2 has to wait for
transaction T1 to release its lock on row r.
Yes there's every chance you could run the same task again.
There are two obvious solutions.
One is to open a MySQL connection and then acquire a lock using GET_LOCK() with a short timeout; if you acquire the lock, you're good to go. You need to maintain the DB connection for the lifetime of the script.
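The GET_LOCK() approach looks roughly like this; 'job_runner' is an arbitrary lock name chosen here for illustration:

```sql
-- Try to take a named advisory lock, waiting at most 0 seconds.
-- Returns 1 if acquired, 0 if another session holds it, NULL on error.
SELECT GET_LOCK('job_runner', 0);

-- ... run the task only if the result was 1 ...

SELECT RELEASE_LOCK('job_runner');
```

The lock is also released automatically if the connection drops, which is why the connection must stay open for the script's lifetime.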
Alternatively you could create a table with a unique constraint on finish_time, INSERT a record with a null finish time to indicate the start (intending it to fail if there is already a record with a null finish time), then update the finish_time to NOW() when it completes. (Caveat: MySQL unique indexes allow multiple NULL values, so in practice the "running" marker needs a non-NULL sentinel value rather than a NULL finish time.)
However, using the database to represent the state of a running task only makes sense when the task is running within a loosely coupled but highly available cluster, implying that the database is also clustered. And the nature of the clustering (NDB, async, semi-sync, multi-master) has a lot of impact on how this will behave in practice.
OTOH if that is not the case, then using the database to represent the state is the wrong way to solve the problem.
Yes, they can run at the same time.
If you want them to run one at a time, the SELECT query should be changed to:
SELECT job_id FROM jobs WHERE task_id = $task_id LOCK IN SHARE MODE
In this case it uses a read lock.
This is the same whether you use NDB or InnoDB.

Minimal logging not happening for INSERT INTO temp table

I have an SP with a set of 3 temp tables created within the SP with no indexes. All three are inserted into using INSERT INTO ... WITH (TABLOCK). The database recovery model is SIMPLE for userDB as well as tempDB.
This SP is generating and inserting new data and a Transaction Commit/Rollback is good enough to maintain data integrity. So I want it to do minimal logging which I think I have enabled by using the TABLOCK hint.
I am checking the log generation before and after execution of the SP using the query below, and I see no difference in log generation after adding the TABLOCK hint. (I check tempDB since the tables are temp tables.)
SELECT count(1) as NumLogEntries
, sum("log record length") as TotalLengthWritten
FROM fn_dblog(null, null);
Is there anything else I need to do in order to enable minimal logging?
Note: I am able to see reduced logging if I run the same INSERT INTO with this hint separately in Management Studio, but not when it runs inside the SP.
I have also tried adding the trace flag 610 ON before the insert statement but to no effect.

application setup to avoid timeouts and deadlocks in SQL server DB table

My application accesses a local DB where it inserts records into a table (+- 30-40 million a day). I have processes that run and process data and do these inserts. Part of the process involves selecting an id from an IDs table, which is unique, and this is done using a simple transaction:
Begin Transaction
Select top 1 @id = siteid from siteids WITH (UPDLOCK, HOLDLOCK)
delete siteids where siteid = @id
Commit Transaction
I then immediately delete that id with a separate statement from that very table so that no other process grabs it. This is causing tremendous timeout issues, which surprises me with only 4 processes accessing it. I also get timeout issues when checking my main post table to see whether a record was inserted using the above id. It runs fast, but with all the deadlocks and timeouts I think this indicates poor design and is a recipe for disaster.
Any advice?
EDIT
This is the actual statement that someone else here helped with. I then removed the delete and included it in my code as a separately executed statement. Will the ORDER BY clause really help here?
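For reference (this is not necessarily the statement the edit refers to), a common way to make this kind of id-claiming atomic in SQL Server is a single DELETE with an OUTPUT clause, which removes the row and returns it in one statement, leaving no window between SELECT and DELETE for another process to grab the same id:

```sql
-- Claim one id atomically: delete it and return it in one statement
DELETE TOP (1) FROM siteids
OUTPUT deleted.siteid;
```

Since any free id will do, the lack of an ORDER BY on DELETE TOP (1) is harmless for an interchangeable id pool.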