Execute mediawiki extension job periodically


I'm developing an extension for mediawiki. My extension needs to execute some database updating periodically (e.g. every 30 mins).
Reading mediawiki manual I found there is a job queue implemented, but it does not have support for scheduling.
Is there any way to set a mediawiki extension job to execute periodically?

This is not what a job queue is for; it is to run a task as soon as there are free resources. Create a maintenance script and use cron to run it periodically.

The job queue is driven by wiki visits: jobs are run as part of ordinary page requests, at a rate configured by $wgJobRunRate in your LocalSettings.php.
It is probably not what you are looking for, but if you really want to flush this queue every 30 minutes, you can still use a cron job. For instance:
*/30 * * * * php ./maintenance/runJobs.php
Based on the few details you have given, I would instead configure cron to execute one of your extension's own scripts, and explain the setup in your install documentation.

Related

How can I delay deletion?

I would like to delay deletion of data from the database. I am using MySQL and nest.js. I heard that cron is what I need. I want to delete the entry after a week. Can you help me with this? Is cron what I need, or should I use something else?

A cron job (or at in Windows) or a MySQL EVENT can be created to periodically check for something and take action. The resolution is only 1 minute.
If you need a very precise resolution, another technique would be required. For example, if you don't want to show a user something that is more than 1 week old to the second, then simply exclude it in the SELECT. That is, add something like this to the WHERE: AND created_date >= NOW() - INTERVAL 7 DAY.
Doing the above gives you the freedom to schedule the actual DELETE for only, say, once a day -- rather than pounding on the database only to usually find nothing to do.
If you do choose to "pound on the database", be aware of the following problem. If one instance of the deleter script runs for a long time (for any of a number of reasons), it might not be finished before the next copy comes along. In some situations these scripts can stumble over each other to the extent of effectively "crashing" the server.
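One common guard against overlapping runs is a lock file: each copy of the script tries to take an exclusive lock and simply exits if another copy already holds it. Here is a minimal sketch in Python, assuming a Unix-like system (it uses fcntl.flock); the lock path and the function names are made up for illustration.

```python
import fcntl
import os
import tempfile

# Hypothetical lock path; any location writable by the cron user would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "purge_old_rows.lock")


def try_acquire_lock(path=LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_once(task, path=LOCK_PATH):
    """Run task() only if no other copy holds the lock; return True if it ran."""
    lock = try_acquire_lock(path)
    if lock is None:
        return False  # a previous run is still busy, so skip this one
    try:
        task()
        return True
    finally:
        lock.close()  # closing the file releases the lock
```

The cron entry would then call a script that wraps the DELETE in run_once(...), so a slow run causes the next one to be skipped rather than piled on top.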
That leads to another solution -- a single script that runs forever. It has a simple loop:
Do the actions needed (deleting old rows)
Sleep 1 -- or 10 or 60 or whatever -- this is to be a "nice guy" and not "pound on the system".
The only tricky part is making sure that it starts up again after any server restart or crash of the script.
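The loop above could look roughly like this in Python. This is only a sketch under assumptions: it uses sqlite3 as a stand-in for MySQL so that it is self-contained, and the table name entries, the column created_at (a Unix timestamp), and the max_iterations parameter (there only so the loop can be stopped in a test) are all invented for the example.

```python
import sqlite3
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def purge_old_rows(conn, now=None, max_age=WEEK_SECONDS):
    """Delete rows older than max_age seconds; return how many were removed."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "DELETE FROM entries WHERE created_at < ?", (now - max_age,)
    )
    conn.commit()
    return cur.rowcount


def run_forever(conn, pause=60, max_iterations=None):
    """Purge, sleep, repeat. With max_iterations=None the loop never ends."""
    done = 0
    while max_iterations is None or done < max_iterations:
        purge_old_rows(conn)
        time.sleep(pause)  # be a "nice guy" between passes
        done += 1
```

Restarting it after a reboot or crash is left to whatever supervisor you already have (an init system unit, for example); that part is not shown here.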

You can configure a cronjob to periodically delete it.
There are several ways to configure a cron job.
You can write a shell script that periodically deletes entities in the db and schedule it with the Linux crontab, or you can use an application that provides scheduled jobs, such as Jenkins or Airflow.
AWS Lambda also supports cron-style scheduled invocation.
Using the task scheduling support provided by NestJS seems to be the simplest way to solve the problem.
See:
https://docs.nestjs.com/techniques/task-scheduling

How to manage server-side processes using MySQL

I have a perl script which takes in unique parameters (one of the parameters being --user=username_here). Users can start these processes using a web interface I am developing.
A MySQL table, transactions, keeps track of users that run the perl script
id  user  script_parameters                execute  last_modified
23  alex  --user=alex --keywords=thisthat  0        2014-05-06 05:49:01
24  alex  --user=alex --keywords=thisthat  0        2014-05-06 05:49:01
25  alex  --user=alex --keywords=lg        0        2014-05-06 05:49:01
26  alex  --user=alex --keywords=lg        0        2014-04-30 04:31:39
The execute value for a given row will be "1" if the process should be running. It is set to "0" if the process should be ended.
My perl script constantly checks this value to make sure it's not "0" and if it is, the perl script terminates.
However, I need to manage these process to protect against this problem:
What if my server abruptly crashes and restarts, OR the script crashes? I will need something running in the background that reads the transactions table and makes sure it restarts the perl script as many times as needed, using the appropriate parameters.
And so, I'm having trouble figuring out how to balance giving control to the user to manage his/her own transaction(s), while I also make sure that the transactions that SHOULD be running, ARE running, and those that AREN'T, AREN'T.
Hope that makes sense and I appreciate any help!

It seems you're trying to launch long-running processes from a web server and then track those processes in a database. That's not impossible, but not a recommended practice.
The main problem is that an HTTP request needs to be currently being handled in your web server for you to actually do anything (including tracking processes running on the system) -- you need something that can run all the time...
Instead, a better idea would be to have another daemonized "manager" process (as you mention perl, that'd be a good language to write it in) spawn & track the long running tasks (by PID and signals), and for that process to update your SQL database.
You can then have your "manager" process listen for requests to start a new process from your web server. There are various IPC mechanisms you could use. (e.g: signals, SysV shm, unix domain sockets, in-process queues like ZeroMQ, etc).
This has multiple benefits:
If your spawned scripts need to run with user/group based isolation (either from the system or each other), then your webserver doesn't need to run as root, nor be setgid.
If a spawned process "crashes", a signal will be delivered to the "manager" process, so it can track mis-executions without issues.
If you use in-process queues (e.g: ZeroMQ) to deliver requests to the "manager" process, it can "throttle" requests from the web server (so that users cannot intentionally or accidentally cause D.O.S).
Whether or not the spawned process ends well, you don't need an 'active' HTTP request to the web server in order to update your tracking database.
As to whether something that should be running is running, that's really up to your semantics. (i.e: is it based on a known run time? based on data consumed? etc).
The check as to whether it is running can be two-fold:
The "manager" process updates the database as appropriate, including the spawned PID.
Your web server hosted code can actually list processes to determine if the PID in the database is actually running, and even how much time it's been doing something useful!
The check for whether it is not running would have to be based on convention:
Name the spawned processes something you can predict.
Get a process list to determine what's still running (defunct?) that shouldn't be.
In either case, you could either inform the users who requested the processes be spawned and/or actually do something about it.
One approach might be to have a CRON job which reads from the SQL database and does ps to determine which spawned processes need to be restarted, and then re-requests that the "manager" process does so using the same IPC mechanism used by the web server. How you differentiate starts vs. restarts in your tracking/monitoring/logging is up to you.
If the server itself loses power or crashes, then you could have the "manager" process perform cleanup when it first runs, e.g:
Look for entries in the database for spawned processes that were allegedly running before the server was shut down.
Check for those processes by PID and run time (this is important).
Either re-spawn the spawned processes that didn't complete, or store something in the database to indicate to the web server that this was the case.
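To make the PID part of that cleanup concrete, here is a rough sketch in Python (the same idea carries over to perl). It assumes rows shaped like (id, pid, execute), which is my own simplification of the tracking table, and the function names are hypothetical. It only covers the PID check: a live PID may belong to an unrelated process after a reboot, which is exactly why the run time should be compared as well, and that comparison is not shown here.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def rows_to_respawn(rows, is_alive=pid_is_alive):
    """Return ids of rows that should be running (execute == 1) but are not."""
    return [
        row_id
        for row_id, pid, execute in rows
        if execute == 1 and not is_alive(pid)
    ]
```

The "manager" would run rows_to_respawn(...) over the table once at startup and then either re-spawn those entries or flag them in the database.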
Update #1
Per your comment, here are some pointers to get started:
You mentioned perl, so presuming you have some proficiency there -- here are some perl modules to help you on your way to writing the "manager" process script:
If you're not already familiar with it, CPAN is the repository for perl modules that do basically anything.
Daemon::Daemonize - To daemonize process so that it will continue running after you log out. Also provides methods for writing scripts to start/stop/restart the daemon.
Proc::Spawn - Helps with 'spawning' child scripts. Basically does fork() then exec(), but also handles STDIN/STDOUT/STDERR (or even tty) of child process. You could use this to launch your long-running perl scripts.
If your web server front-end code is not already written in perl, you'll need something that's pretty portable for inter-process message-passing and queuing; I'd probably make your web server front end in something easy to deploy (like PHP).
Here are two possibilities (there are many more):
Perl and PHP implementations for the Spread Toolkit.
Perl and PHP implementations for the ZeroMQ library.
Proc::ProcessTable - You can use this to check on running processes (and get all sorts of stats as discussed above).
Time::HiRes - Use the high-granularity time functions from this package to implement your 'throttling' framework. Basically just limit the number of requests you de-queue per unit of time.
DBI (with mysql) - Update your MySQL database from the "manager" process.
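As an illustration of the "throttling" idea mentioned for Time::HiRes (limit the number of requests you de-queue per unit of time), here is a small sketch. It is written in Python rather than perl just to keep it short, and the Throttle class and drain function are names I made up, not part of any of the modules above.

```python
import time
from collections import deque


class Throttle:
    """Allow at most max_requests de-queues per window of `period` seconds."""

    def __init__(self, max_requests, period, clock=time.monotonic):
        self.max_requests = max_requests
        self.period = period
        self.clock = clock
        self.recent = deque()  # timestamps of recently accepted requests

    def allow(self):
        now = self.clock()
        # Forget accepted requests that have fallen out of the window.
        while self.recent and now - self.recent[0] >= self.period:
            self.recent.popleft()
        if len(self.recent) < self.max_requests:
            self.recent.append(now)
            return True
        return False


def drain(pending, throttle):
    """Pop requests from `pending` while the throttle allows; return them."""
    taken = []
    while pending and throttle.allow():
        taken.append(pending.popleft())
    return taken
```

The "manager" would call drain(...) on each pass of its main loop; anything left in the queue simply waits for the next window.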

Is there a way to restrict Hudson CI jobs to run only during a certain time of day?

Is there a plugin or some configuration in Hudson CI so that a job will not run during a certain time of day? I was thinking of a job like:
Run job A on version control change if not after 5pm and before 9am
Thanks!

Maybe you could set up a global property that gets set to on/off depending on time and you configure the builds to use that global property.
But the best way would probably be to set up the SCM polling so that it just does not poll at the undesired time of day. Then no changes are found and no build is triggered ;-)

We run automated tests in our Jenkins instance. When these are being run on certain environments, we disable the deploy job to the involved environments. This is done using Jenkins CLI. https://wiki.jenkins-ci.org/display/JENKINS/Jenkins+CLI
I bet you could do the same thing but instead use the "Build periodically" function in the job.

LAMP: How to Implement Scheduling?

Users of my application need to be able to schedule certain tasks to run at certain times (e.g. once only, every minute, every hour, etc.). My plan is to have cron run a script every minute to check the application to see if it has tasks to execute. If so, then execute the tasks.
Questions:
Is the running of cron every minute a good idea?
How do I model intervals in the database like cron does (e.g. every minute, every 5th minute of every hour, etc.)?
I'm using LAMP.

Or, rather than doing any, you know, real work, simply create an interface for the users, and then publish entries in cron! Rather than having cron call you every minute, have it call scripts as directed by the users. When they add or change jobs, rewrite the crontab.
No big deal.
In unix, cron allows each user (unix login, that is) to have their own crontab, so you can have one dedicated to your app and don't have to use the root crontab for this.

Do you mean that you have a series of user-defined jobs that need to be executed at user-defined intervals, and you'd like to have cron facilitate the processing of those jobs? If so, you'd want to have a database with at least 2 fields:
JOB,
OFTEN
where OFTEN is how often they'd like the job to run, using syntax similar to CRON.
You'd then need to write a script (in python, ruby, or some similar language) to parse that data. This script would be what runs every 1 minute via your actual cron.
Take a look at this StackOverflow question, and this StackOverflow question, regarding how to parse crontab data via python.
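To give an idea of what that parsing script involves, here is a minimal sketch of matching a cron-style OFTEN value against the current minute. It assumes the usual five fields (minute, hour, day of month, month, day of week with 0 = Sunday) and supports only `*`, numbers, ranges, comma lists and `/step`. It is a simplification: names like JAN or MON are not handled, and all five fields must match, whereas real cron treats day of month and day of week specially. The function names are my own.

```python
from datetime import datetime


def field_matches(field, value, minimum):
    """Check one cron field (e.g. '*/5', '1,15', '9-17') against a value."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = minimum, value  # '*' covers every value
        elif "-" in part:
            low, high = part.split("-")
            start, end = int(low), int(high)
        else:
            start = end = int(part)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def is_due(expression, when):
    """Return True if the five-field cron expression matches `when`."""
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, when.isoweekday() % 7, 0)  # 0 = Sunday
    )
```

The script that your real cron runs every minute would load the (JOB, OFTEN) rows and run each JOB for which is_due(OFTEN, datetime.now()) is true.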

Ways of managing the data in a database

I'm new to databases and web servers and that kind of thing. So I am looking for information so I can begin to figure out a starting point and options open to me.
I need to have a database that can be accessed by an iPhone app. So logically it will be hosted on a webserver somewhere.
To get/insert the data from/into the database, the app would make an HTTP connection to a PHP file on the same server as the DB, which would then insert/return the relevant data. To stop random hackers messing with the DB, the app would have some validation code inside it to send to the PHP file, to check that it's not a hacker trying to mess with the database. Does this all make sense, or will that not be secure enough?
Now the most confusing part to get my head around is :
I need to check every minute whether any data in the database has become too old, and remove it if so. So something needs to be running on the server, constantly checking/managing the database. What would this be? What is commonly used to do this kind of thing? Is there some keyword for it that I can start searching and reading about to see what options there are?
Thanks for your advice,
-Code

One way to do this is to have a purge script run via crontab. The script can run every minute and check for old data and remove it.
MySQL 5.1.6 and later include a built-in event scheduler, which can be used to schedule periodic jobs inside the MySQL server itself.
http://dev.mysql.com/doc/refman/5.1/en/events.html

Sounds to me like you need a cron job. Cron is the standard task-scheduling application for Unix-type systems.
You would have some sort of script that connects to the database and performs a cleanup query, and you would schedule that script via cron.
http://en.wikipedia.org/wiki/Cron
