ShedLock and temporarily stopping task(s)?

When running several scheduled tasks on multiple servers, ShedLock seems great, but sometimes we also need to halt some of the tasks for a short or long period.
Of course, it is possible to control each task with additional properties/flags, but my suggestion is to use ShedLock for this too: introduce a logical "node/server" for the task we wish to stop, update the row in the shedlock table with a lock held by that node, set lockedAt to a time in the future, and set lockUntil to that future time + 1 second (so that "locked longer than maxRunning" is not triggered). The task will then start again automatically, or we can move the time further into the future if needed.
Any thoughts on this kind of use of ShedLock: smart, or bad practice? It is still used for locking, just locking the job to a logical, fake server.

It is possible to (mis)use ShedLock for this. The update you are looking for can look like this:
update shedlock
set lock_until = :future, locked_at = now(), locked_by = 'manual'
where name = :name and lock_until < now()
The important part is the condition lock_until < now(), which prevents meddling with an existing lock held by a currently running task. You do not have to set locked_by, since it is mostly ignored by the library, but it is still better to set it in case someone else wonders why the tasks are not being executed.
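To release the manual hold early and let the scheduler pick the task up again, a similar conditional update should work (a sketch, assuming the default ShedLock table layout and the 'manual' marker from above):
-- sketch: assumes the default shedlock table and the 'manual' marker set above
update shedlock
set lock_until = now()
where name = :name and locked_by = 'manual' and lock_until > now()
The condition on locked_by keeps this from cutting short a lock held by a real node.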

Related

Transaction not working as I expected in Laravel to prevent duplicate entries

Within a Laravel-based platform there are three relevant tables:
users
codes
users_codes
Users can claim a code - in which case, what should happen is that the system gets the next available code from the codes table, marks it as allocated, and then sets up a new entry in users_codes which ties the user to the code.
Nothing special would need to be done while the site is under low load; however, anticipating high load, this was initially written as a transaction, which I had thought would prevent the same code from being allocated twice.
DB::transaction(function () use ($user) {
    $code = Codes::getNextAvailableCode(); // Not the actual function, but works for the sake of example
    $code->allocated = 1;
    $code->save();

    $uc = new UserCode();
    $uc->user_id = $user->id;
    $uc->code_id = $code->id;
    $uc->save();
});
Now that the site is under high load, the same code has been allocated to two different users a couple of times, so it's clear that a transaction alone isn't what I want.
Thinking it through, I initially thought that replacing the transaction with locking would be an option, but the more I think about it, I can't really lock a table.
Instead, I think I need to focus on checking, before creating and saving the new UserCode(), that there is no existing UserCode with the same $code->id?
Any suggestions? Is there an approach I've not considered that will allow this to work smoothly under high load (i.e. not continually throw errors back to users when they try to claim a code that was taken a millisecond before)?
You want a PESSIMISTIC_WRITE-type lock, which lets you obtain an exclusive lock and prevent the data from being read, updated, or deleted by other transactions.
Laravel's query builder supports Pessimistic Locking.
The query builder also includes a few functions to help you achieve "pessimistic locking" when executing your select statements. To execute a statement with a "shared lock", you may call the sharedLock method. A shared lock prevents the selected rows from being modified until your transaction is committed:
DB::table('users')
    ->where('votes', '>', 100)
    ->sharedLock()
    ->get();
I think you definitely need locking here, not just a transaction.
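For the exclusive (PESSIMISTIC_WRITE) variant, Laravel's query builder offers lockForUpdate(), which on MySQL boils down to SELECT ... FOR UPDATE inside the transaction. A rough sketch of the underlying SQL, assuming free codes are marked with allocated = 0 (@user_id, @code_id and that flag value are assumptions, not from the question):
-- Sketch only: table/column names come from the question; @user_id and allocated = 0 are assumptions.
SET @user_id = 123;                -- placeholder for the authenticated user's id

START TRANSACTION;

-- Lock the next free code row; concurrent requests block here until COMMIT
SELECT id INTO @code_id
FROM codes
WHERE allocated = 0
ORDER BY id
LIMIT 1
FOR UPDATE;

UPDATE codes SET allocated = 1 WHERE id = @code_id;
INSERT INTO users_codes (user_id, code_id) VALUES (@user_id, @code_id);

COMMIT;
Because the SELECT takes a row lock, two concurrent claims can no longer read the same unallocated code.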

Trigger Performance Degradation (To use or not to use)

I've created this trigger:
DELIMITER $$
CREATE TRIGGER `increment_daily_called_count` BEFORE UPDATE ON `list`
FOR EACH ROW BEGIN
    IF (NEW.called_count != OLD.called_count) THEN
        SET NEW.daily_called_count = OLD.daily_called_count + (NEW.called_count - OLD.called_count);
        SET NEW.modify_date = OLD.modify_date;
    END IF;
END$$
DELIMITER ;
The database table this runs on is accessed and used by hundreds of different scripts in the larger system, and the reason for the trigger is so I don't have to hunt down every little place in those scripts where called_count might get updated...
My concern is that, because this particular table gets modified constantly (I'm talking dozens of times per second), this trigger will put undue strain on the database. Am I better off in the long run hunting down all the called_count update queries in the myriad scripts and adding daily_called_count = daily_called_count + 1?
Some specifics I'd like to know the answer to here:
Does use of this trigger essentially make this 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
Could this trigger cause any unforeseen weirdness that I'm not anticipating?
Two disclaimers:
I have not worked with MySQL in a very long time, and I have never used triggers with it. I can only speak from general experience with RDBMSs.
The only way to really know anything for sure is to run a performance test.
That said, my attempts to answer with semi-educated guesses (from experience):
Does use of this trigger essentially make this 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
I don't think it's a separate update in the sense of statement execution, but you are adding a computation overhead cost to each row.
However, what I am more worried about is the row-by-row nature of this trigger. It literally says FOR EACH ROW. Generally speaking, row-by-row operations scale poorly in an RDBMS compared to set-based operations. MS SQL Server uses statement-level triggers, where the entire set of affected rows is passed in, so a row-by-row operation is not necessary. This may not be an option in MySQL triggers - I really don't know.
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
It would certainly make the system do less work. How large the performance impact is, numerically, I can't say; you'd have to test. If it's only a 1% difference, the trigger is probably fine. If it's 50%, it would be worth hunting down all the code. Since hunting down the code is a burden, I suspect it's either embedded in an application or generated dynamically by an ORM. If that is the case, as long as the performance cost of the trigger is acceptable, I'd rather stick with the trigger, as it keeps a DB-specific detail in the DB.
Measure, measure, measure.
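For reference, "hunting down the queries" simply means folding the increment into each originating update, roughly like this (a sketch; the WHERE clause is whatever predicate each script already uses):
-- Sketch: the increment moves from the trigger into every update that touches called_count.
UPDATE list
SET called_count = called_count + 1,
    daily_called_count = daily_called_count + 1
WHERE id = 123;   -- placeholder predicate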
Could this trigger cause any unforeseen weirdness that I'm not anticipating?
Caching comes to mind. If these columns are part of something an application reads and caches, its cache invalidation is probably tied to when it thinks it changed the data. If the database changes data underneath it, like with a trigger, caching may result in stale data being processed.
First, thanks to @Brandon for his response. I built my own script and test database to benchmark and answer my question... While I don't have a good answer to points 1 and 3, I do have an answer on the performance question...
Note that I am using 10.0.24-MariaDB on our development server, which didn't have anything else running on it at the time.
Here are my results...
Updating 100000 rows:
TRIGGER QUERY TIME: 6.85960197 SECONDS
STANDARD QUERY TIME: 5.90444183 SECONDS
Updating 200000 rows:
TRIGGER QUERY TIME: 13.19935203 SECONDS
STANDARD QUERY TIME: 11.88235188 SECONDS
You folks can decide for yourselves which way to go.
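For anyone who wants to run a similar comparison, the timing can be captured with user variables; a minimal sketch (the two table names are hypothetical copies of list, one with the trigger attached and one without):
-- Copy of `list` with the trigger attached (hypothetical name)
SET @t0 = NOW(6);
UPDATE list_with_trigger SET called_count = called_count + 1;
SELECT TIMESTAMPDIFF(MICROSECOND, @t0, NOW(6)) / 1000000 AS trigger_seconds;

-- Copy without the trigger, carrying the increment in the statement itself
SET @t0 = NOW(6);
UPDATE list_plain
SET called_count = called_count + 1,
    daily_called_count = daily_called_count + 1;
SELECT TIMESTAMPDIFF(MICROSECOND, @t0, NOW(6)) / 1000000 AS standard_seconds;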

Change Data Capture For Updates and Deletes Only

Our database is insert-intensive (200-500k inserts per night) but update-light (maybe a few hundred updates per day).
I need to preserve a history of all changes to the inserted rows themselves for an indefinite period of time (but not the actual insert). I would love to use Change Data Capture, but the amount of space required to support this is not available. If I could figure out how to do one of the following, my life would be much easier:
1) Limit change data capture to UPDATEs and DELETEs only
2) Clean up only INSERTs from the CDC tables regularly
In the past, I'd have just used a trigger (which is still not off the table!).
I would just use a trigger to capture updates and deletes.
I don't think you can tell CDC which DML to pay attention to, and I think it's quite wasteful to let CDC record all of these inserts only to delete them afterward. That in and of itself is expensive, and the fragmentation it will cause will also cause issues for any queries you run against the capture tables (you'll have lots of mostly-empty pages), as will the work statistics maintenance has to do to constantly keep the stats up to date.
You could possibly put an INSTEAD OF INSERT trigger on the capture table that just does nothing, but I haven't tried this to see if it is even allowed, and I certainly don't know what impact it would have on the CDC functions. Possibly worth some investigation, but my original answer stands even if this hack does work: just use a trigger.
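For the plain-trigger route, a minimal sketch of an update/delete audit trigger (all object names, PrimaryKeyId, and the column types are placeholders):
-- Sketch: a hand-rolled history table plus an AFTER UPDATE, DELETE trigger (all names are placeholders).
CREATE TABLE dbo.TableYouWantToCapture_History (
    HistoryId   bigint IDENTITY(1,1) PRIMARY KEY,
    Operation   char(1)      NOT NULL,                         -- 'U' or 'D'
    ChangedAt   datetime2    NOT NULL DEFAULT SYSUTCDATETIME(),
    ColumnName1 nvarchar(50) NULL,                             -- pre-change values of the columns you care about
    ColumnName2 nvarchar(50) NULL
);
GO

CREATE TRIGGER dbo.trg_TableYouWantToCapture_Audit
ON dbo.TableYouWantToCapture
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- "deleted" holds the pre-change rows for both UPDATE and DELETE;
    -- rows that also appear in "inserted" were updated, the rest were deleted.
    INSERT INTO dbo.TableYouWantToCapture_History (Operation, ColumnName1, ColumnName2)
    SELECT CASE WHEN i.PrimaryKeyId IS NOT NULL THEN 'U' ELSE 'D' END,
           d.ColumnName1,
           d.ColumnName2
    FROM deleted AS d
    LEFT JOIN inserted AS i ON i.PrimaryKeyId = d.PrimaryKeyId;
END
GO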
If space is a consideration, you can always assign the CDC tables to a different filegroup, which could potentially live on a different drive. You'd do that this way:
ALTER DATABASE YourDatabase
ADD FILEGROUP [cdc_ChangeTables];
GO

-- This step requires creating a folder somewhere on your hard drive first
ALTER DATABASE YourDatabase
ADD FILE ( NAME = N'cdc_ChangeTables',
           FILENAME = N'E:\NameOfFolderYouSetUp\YourDatabase_cdc_ChangeTables.mdf',
           SIZE = 1048576KB,
           FILEGROWTH = 102400KB )
TO FILEGROUP [cdc_ChangeTables];
GO
Then when you want to set up your CDC tables, you point them toward that filegroup instead:
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'TableYouWantToCapture',
    @role_name = N'cdc_admin',
    @filegroup_name = N'cdc_ChangeTables', -- this is where you name the filegroup from the previous step
    @supports_net_changes = 1,
    @capture_instance = N'dbo_TableYouWantToCapture',
    @captured_column_list = 'ColumnName1, ColumnName2'; -- comma-delimited list of column names
GO
If you want to query only updates/deletes, you can use the system function like so:
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_TableYouWantToCapture(@from_lsn, @to_lsn, N'all update old')
WHERE __$operation IN (1, 3, 4)
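If you need a starting point for the LSN range, the CDC helper functions sys.fn_cdc_get_min_lsn and sys.fn_cdc_get_max_lsn can supply it, for example:
-- Sketch: derive the LSN range from the capture instance, then pull only updates and deletes
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_TableYouWantToCapture');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_TableYouWantToCapture(@from_lsn, @to_lsn, N'all update old')
WHERE __$operation IN (1, 3, 4); -- 1 = delete, 3 = update (before image), 4 = update (after image)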

SQL Server 2008 - How to implement a "Watch Dog Service" that woofs when there are too many insert statements on a table

As my title describes: how can I implement something like a watchdog service in SQL Server 2008 that performs the following task: alerting or taking an action when too many inserts are committed on a table.
For instance: in a normal situation the error table gets 10 error messages in one second. If there are more than 100 error messages (100 inserts) in one second, then: ALERT!
Would appreciate it if you could help me.
P.S.: No. SQL Jobs are not an option because the watchdog should be live and woof on the fly :-)
Integration Services? Are there easier ways to implement such a service?
Kind regards,
Sani
I don't understand your problem exactly, so I'm not entirely sure whether my answer actually solves anything or just makes an underlying problem worse. Especially if you are facing performance or concurrency problems, this may not work.
If you can update the original table, just add a datetime2 field like
InsertDate datetime2 NOT NULL DEFAULT GETDATE()
Preferably, make an index on that column, and then, at whatever interval fits, poll the table by seeing how many rows have an InsertDate > GetDate - X.
For this particular case, you might benefit from making the polling process read uncommitted (or use WITH NOLOCK), although one has to be careful when doing so.
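A sketch of what that poll could look like (ErrorLog is a placeholder name for the error table, with the InsertDate column added above; the one-second window matches the example in the question):
-- Sketch: count rows inserted within the last second (ErrorLog is a placeholder name).
-- NOLOCK keeps the poll from blocking writers, at the cost of dirty reads.
SELECT COUNT(*) AS InsertsLastSecond
FROM dbo.ErrorLog WITH (NOLOCK)
WHERE InsertDate > DATEADD(SECOND, -1, SYSDATETIME());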
If you can't modify the table itself and you can't or won't make another process or job monitor the relevant variables, I'd suggest the following:
Make a 'counter' table that just has one Datetime2 column.
On the original table, create an AFTER INSERT trigger (sketched after this list) that:
Deletes all rows where the datetime-field is older than X seconds.
Inserts one row with current time.
Counts to see if too many rows are now present in the counter-table.
Acts if necessary - ie. by executing a procedure that will signal sender/throw exception/send mail/whatever.
If you can modify the original table, add the datetime column to that table instead and make the trigger count all rows that aren't yet X seconds old, and act if necessary.
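A rough sketch of that trigger-based counter approach (reusing the placeholder ErrorLog name from above; the counter table, procedure name, window, and threshold are all examples):
-- Sketch: counter table plus an AFTER INSERT trigger that does its own housekeeping.
CREATE TABLE dbo.InsertCounter (InsertedAt datetime2 NOT NULL);
GO

CREATE TRIGGER dbo.trg_ErrorLog_Watchdog
ON dbo.ErrorLog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Drop entries that have fallen out of the one-second window
    DELETE FROM dbo.InsertCounter
    WHERE InsertedAt < DATEADD(SECOND, -1, SYSDATETIME());

    -- Record the current insert (one row per inserted row)
    INSERT INTO dbo.InsertCounter (InsertedAt)
    SELECT SYSDATETIME() FROM inserted;

    -- Woof if the window now holds too many rows
    IF (SELECT COUNT(*) FROM dbo.InsertCounter) > 100
        EXEC dbo.usp_RaiseInsertAlert;  -- placeholder: send mail, log, RAISERROR, ...
END
GO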
I would also look into getting another process (e.g. a SQL job or a homemade service) to do all the housekeeping, i.e. deleting old rows, counting rows, and acting on the count. Keeping this as the work of the trigger is not a good design and will probably cause problems in the long run, so if at all possible, have some other process do the housekeeping.
Update: A better solution would probably be to make the trigger insert notifications (i.e. datetimes) into a queue; if you then have something listening on that queue, you can put the logic that decides whether your threshold has been exceeded there. However, that requires moving some of the logic to another process, which I initially understood was not an option.

How to roll back the effect of the last executed MySQL query

I just ran a command
update sometable set col = '1';
by mistake without specifying the where condition.
Is it possible to recover the previous version of the table?
Unless you...
Started a transaction before running the query, and...
Didn't already commit the transaction
...then no, you're out of luck, barring any backups of previous versions of the database you might have made yourself.
(If you don't use transactions when manually entering queries, you might want to in the future to prevent headaches like the one you probably have now. They're invaluable for mitigating the realized-5-seconds-later kind of mistake.)
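In practice, that means wrapping risky manual statements in an explicit transaction (this assumes a transactional storage engine such as InnoDB):
-- Sketch: requires a transactional engine such as InnoDB.
START TRANSACTION;

UPDATE sometable SET col = '1';   -- the risky statement

-- Inspect the effect before it becomes permanent
SELECT ROW_COUNT();               -- or re-run a SELECT against the table

ROLLBACK;   -- undo the mistake; COMMIT only once you are sure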
Consider enabling sql_safe_updates in future if you are worried about doing this kind of thing again.
SET SESSION sql_safe_updates = 1
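With that enabled, the accidental statement from the question is rejected instead of executed:
UPDATE sometable SET col = '1';
-- Rejected with error 1175: safe update mode refuses UPDATE/DELETE statements
-- that lack a key-based WHERE clause (or LIMIT), so the data is left untouched.

UPDATE sometable SET col = '1' WHERE id = 42;   -- a keyed WHERE still works (id = 42 is a placeholder)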
No. MySQL does have transaction support for some table types, but because you're asking this question, I'll bet you're not using it.
Everybody does this once. It's when you do it twice you have to worry :)