Best practice for synchronized jobs in application clusters (MySQL)

We have three REST applications running in a cluster.
So each application server can receive requests from outside.
We also have timed events that analyse the database, add/remove rows, send emails, and so on.
The problem is that every application server starts these timed events, so it can happen that two application servers start the same analysis job at the same time.
We have a SQL table backing this.
Our idea was to lock a table in the SQL database when the job starts. If the table is already locked, we exit the job, because another application has just started the analysis.
What's a good practice for implementing some kind of semaphore?
Any ideas?

Don't use semaphores; you are overcomplicating things. Use message queueing: queue your tasks and have them executed in order.
Make ONLY one separate node/process/child process consume from the queue and get your tasks done.
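As a minimal sketch of what that single consumer could look like (here in Python with pika and RabbitMQ; the queue name and the analysis stub are invented for illustration):

    import pika

    # One dedicated consumer process: only this process runs the analysis jobs,
    # so two application servers can never start the same job at the same time.
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="analysis-jobs", durable=True)

    def run_analysis(payload):
        # Placeholder for the real work: analyse the database, add/remove rows, send mails.
        print("analysing", payload)

    def handle_job(ch, method, properties, body):
        run_analysis(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_qos(prefetch_count=1)                  # one job at a time, in order
    channel.basic_consume(queue="analysis-jobs", on_message_callback=handle_job)
    channel.start_consuming()

The application servers then only publish "please run job X" messages instead of running the jobs themselves.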

We (at a previous employer) used a database-based semaphore. Each of several (for redundancy and load sharing) servers had the same set of cron jobs. The first thing in each was a custom library call that did:
Connect to the database and check for (or insert) "I'm working on X".
If the flag was already set, then the cron job silently exited.
When finished, the flag was cleared.
The table included a timestamp and a host name -- for debugging and recovering from cron jobs that fail to finish gracefully.
I forget exactly how the "test and set" was done -- possibly an optimistic INSERT, then a check for a "duplicate key" error.
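If you want to reproduce that pattern, the "test and set" can indeed be an optimistic INSERT against a primary key, with the duplicate-key error serving as the "already running" signal. Here is a rough sketch in Python with PyMySQL; the table and connection details are assumptions, not anything from the original setup:

    import pymysql

    # Assumed table:
    #   CREATE TABLE job_semaphore (
    #       job_name   VARCHAR(64) PRIMARY KEY,
    #       host       VARCHAR(64),
    #       started_at TIMESTAMP
    #   );
    conn = pymysql.connect(host="db.example", user="jobs", password="secret",
                           database="ops", autocommit=True)

    def try_acquire(job_name, host):
        """Return True if we got the semaphore, False if another host already holds it."""
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO job_semaphore (job_name, host, started_at) VALUES (%s, %s, NOW())",
                    (job_name, host),
                )
            return True
        except pymysql.err.IntegrityError:   # duplicate key: someone else is working on it
            return False

    def release(job_name):
        with conn.cursor() as cur:
            cur.execute("DELETE FROM job_semaphore WHERE job_name = %s", (job_name,))

    if try_acquire("nightly-analysis", "app-server-1"):
        try:
            pass   # do the actual cron job work here
        finally:
            release("nightly-analysis")
    # else: exit silently, another server is already on it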

How to make polling from database scalable?

I am trying to find a scalable way for my desktop application to run a command when a change is made in the database.
The application is for running a remote command on your PC. The user logs into the website and can choose to run the command. Currently, users have to download a desktop application that checks the database every few seconds to see if a value has changed. The value can only be changed when they log in to the website and press a button.
For now it seems to be working fine since there aren't many users. But once I hit 100+ users, having the database hit 100+ times every few seconds is not good. What might be a better approach?
It's true that polling for changes is too expensive, especially if you have many clients. The queries are often very costly, and it's tempting to run the queries frequently to make sure the client gets notified promptly after a change. It's better to avoid polling the database.
One suggestion in the comments above is to use a UDF called from a trigger. But I don't recommend this, because a trigger runs when you do an INSERT/UPDATE/DELETE, not when you COMMIT the change. So a client could be notified of a change, and then when they check the database the change appears to not be there, because either the transaction was rolled back, or else the transaction simply hasn't been committed yet.
Another reason the trigger solution is not good is that MySQL triggers execute once for each row changed, not once for each INSERT/UPDATE/DELETE statement. So you could cause notification spam, if you do an UPDATE that affects thousands of rows.
A different solution is to use a message queue like RabbitMQ or ActiveMQ or Amazon SQS (there are many others). When a client commits their INSERT/UPDATE/DELETE, they confirm the commit succeeded, then post a message on a message queue topic. Many clients can be notified efficiently this way. But it requires that every client who commits changes to the database write code to post to the message queue.
Another solution is for clients to subscribe to MySQL's binary log and read it as a change data capture log. Every committed change to the database is logged in the binary log. You can make clients read this, and it has no more impact on the database server than a replication client (MySQL can easily support hundreds of replicas).
A hybrid solution is to consume the binary log, and turn those changes into events in a message queue. This is how a product like Debezium works. It reads the binary log, and posts events to an Apache Kafka message queue. Then other clients can wait for events on the Kafka queue and respond to them.
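To make the hybrid approach a bit more concrete, here is a hedged sketch of a client consuming Debezium-style change events from Kafka with the kafka-python package; the topic name, broker address and group id are invented:

    import json
    from kafka import KafkaConsumer

    # Debezium typically publishes one topic per table ("<server>.<database>.<table>").
    consumer = KafkaConsumer(
        "dbserver1.app.commands",                       # assumed topic name
        bootstrap_servers=["kafka.example:9092"],
        group_id="desktop-client",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")) if raw else None,
    )

    for message in consumer:
        event = message.value
        if event is None:                               # tombstone record, nothing to do
            continue
        # React to the committed change instead of polling the table.
        print("row after change:", event.get("payload", {}).get("after"))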

Handling Doctrine 2 connections in long-running background scripts

I'm running PHP commandline scripts as rabbitmq consumers which need to connect to a MySQL database. Those scripts run as Symfony2 commands using Doctrine2 ORM, meaning opening and closing the database connection is handled behind the scenes.
The connection is normally closed automatically when the CLI command exits -- which, by definition, does not happen for a long time in a background consumer.
This is a problem when the consumer is idle (no incoming messages) for longer than the wait_timeout setting in the MySQL server configuration. If no message is consumed for longer than that period, the database server closes the connection and the next message fails with a "MySQL server has gone away" exception.
I've thought about two solutions to the problem:
Open the connection before each message and close it manually after handling the message.
Implement a ping message which runs a dummy SQL query like SELECT 1 FROM table every n minutes, and call it using a cronjob.
The problem with the first approach is: if the traffic on that queue is high, there might be significant overhead for the consumer in opening/closing connections. The second approach just sounds like an ugly hack to deal with the issue, but at least I can keep a single connection during high-load times.
Are there any better solutions for handling doctrine connections in background scripts?
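For what it's worth, a middle ground between the two ideas is to keep a single connection but check it, and reconnect if necessary, right before handling each message. A minimal sketch of that pattern in Python with PyMySQL rather than Doctrine (all names are placeholders); as far as I know, Doctrine's DBAL 2.x exposes a comparable ping() on its connection object:

    import pymysql

    conn = pymysql.connect(host="db.example", user="worker", password="secret",
                           database="app", autocommit=True)

    def handle_message(job_id):
        # Re-establish the connection if MySQL closed it during an idle period,
        # instead of failing with "MySQL server has gone away".
        conn.ping(reconnect=True)
        with conn.cursor() as cur:
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))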
Here is another solution: try to avoid long-running Symfony 2 workers. They will always cause problems due to their long execution time; the kernel isn't made for that.
The solution here is to build a proxy in front of the real Symfony command, so every message triggers a fresh Symfony kernel. Sounds like a good solution to me.
http://blog.vandenbrand.org/2015/01/09/symfony2-and-rabbitmq-lessons-learned/
My approach is a little bit different. My workers only process one message, then die. I have supervisor configured to create a new worker every time. So, a worker will:
1. Ask for a new message.
2. If there are no messages, sleep for 20 seconds before exiting; otherwise supervisor will think there's something wrong and stop re-creating the worker.
3. If there is a message, process it.
4. If processing a message is super fast, maybe sleep as well, for the same reason as in step 2.
5. After processing the message, just exit.
This has worked very well using AWS SQS.
Comments are welcomed.
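A rough sketch of that one-message-per-run worker, using boto3 and SQS (the queue URL and the processing function are placeholders); supervisor simply restarts the script every time it exits:

    import sys
    import time
    import boto3

    QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/work"   # placeholder

    def process(body):
        print("processing", body)   # placeholder for the real message handling

    def main():
        sqs = boto3.client("sqs")
        # The 20 second long poll doubles as the "sleep when there is no work" step.
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
        messages = resp.get("Messages", [])
        if not messages:
            sys.exit(0)              # no work: exit and let supervisor start a fresh worker
        message = messages[0]
        process(message["Body"])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
        time.sleep(1)                # optional: avoid respawning too fast when work is trivial

    if __name__ == "__main__":
        main()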
This is a big problem when running PHP scripts for a long time. For me, the best solution is to restart the script from time to time. You can see how to do this in this topic: How to restart PHP script every 1 hour?
You should also run multiple instances of your consumer. Add a counter to each one and terminate it after a certain number of runs. You then need a tool to ensure a consistent number of worker processes, something like this: http://kamisama.me/2012/10/12/background-jobs-with-php-and-resque-part-4-managing-worker/

How to manage server-side processes using MySQL

I have a perl script which takes in unique parameters (one of the parameters being --user=username_here). Users can start these processes using a web interface I am developing.
A MySQL table, transactions, keeps track of users that run the perl script:
id  user  script_parameters                execute  last_modified
23  alex  --user=alex --keywords=thisthat  0        2014-05-06 05:49:01
24  alex  --user=alex --keywords=thisthat  0        2014-05-06 05:49:01
25  alex  --user=alex --keywords=lg        0        2014-05-06 05:49:01
26  alex  --user=alex --keywords=lg        0        2014-04-30 04:31:39
The execute value for a given row will be "1" if the process should be running. It is set to "0" if the process should be ended.
My perl script constantly checks this value to make sure it's not "0" and if it is, the perl script terminates.
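The check the script performs could look something like this (sketched in Python rather than perl; the table and column names follow the example above, everything else is made up):

    import time
    import pymysql

    conn = pymysql.connect(host="localhost", user="runner", password="secret",
                           database="app", autocommit=True)

    def should_keep_running(transaction_id):
        """Return True while the row's `execute` flag is still 1."""
        with conn.cursor() as cur:
            cur.execute("SELECT `execute` FROM transactions WHERE id = %s", (transaction_id,))
            row = cur.fetchone()
        return row is not None and row[0] == 1

    while should_keep_running(23):
        # ... do one unit of the script's real work here ...
        time.sleep(5)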
However, I need to manage these processes to protect against the following problem:
What if my server abruptly crashes and restarts, or the script crashes? I will need something running in the background, reading the transactions table and making sure it restarts the perl script as many times as needed, with the appropriate parameters.
And so, I'm having trouble figuring out how to balance giving control to the user to manage his/her own transaction(s), while I also make sure that the transactions that SHOULD be running, ARE running, and those that AREN'T, AREN'T.
Hope that makes sense and I appreciate any help!
It seems you're trying to launch long-running processes from a web server and then track those processes in a database. That's not impossible, but not a recommended practice.
The main problem is that an HTTP request needs to be actively handled by your web server for you to actually do anything (including tracking processes running on the system) -- you need something that can run all the time...
Instead, a better idea would be to have another daemonized "manager" process (as you mention perl, that'd be a good language to write it in) spawn & track the long-running tasks (by PID and signals), and for that process to update your SQL database.
You can then have your "manager" process listen for requests to start a new process from your web server. There are various IPC mechanisms you could use. (e.g: signals, SysV shm, unix domain sockets, in-process queues like ZeroMQ, etc).
This has multiple benefits:
If your spawned scripts need to run with user/group based isolation (either from the system or each other), then your webserver doesn't need to run as root, nor be setgid.
If a spawned process "crashes", a signal will be delivered to the "manager" process, so it can track mis-executions without issues.
If you use in-process queues (e.g. ZeroMQ) to deliver requests to the "manager" process, it can "throttle" requests from the web server (so that users cannot intentionally or accidentally cause a DoS).
Whether or not the spawned process ends well, you don't need an 'active' HTTP request to the web server in order to update your tracking database.
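A very reduced sketch of the shape of such a manager loop, using pyzmq and subprocess; it is in Python rather than perl purely for brevity, the endpoint and message format are invented, and signal handling, throttling and the database updates are left out:

    import subprocess
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind("tcp://127.0.0.1:5555")          # the web front end connects here

    running = {}                                 # pid -> Popen handle, for later status checks

    while True:
        request = socket.recv_json()             # e.g. {"user": "alex", "keywords": "lg"}
        proc = subprocess.Popen([
            "perl", "script.pl",
            "--user=" + request["user"],
            "--keywords=" + request["keywords"],
        ])
        running[proc.pid] = proc
        # A real manager would also update the MySQL tracking table here
        # and reap finished children (e.g. via SIGCHLD or proc.poll()).
        socket.send_json({"status": "started", "pid": proc.pid})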
As to whether something that should be running is running, that's really up to your semantics. (i.e: is it based on a known run time? based on data consumed? etc).
The check as to whether it is running can be two-fold:
The "manager" process updates the database as appropriate, including the spawned PID.
Your web server hosted code can actually list processes to determine if the PID in the database is actually running, and even how much time it's been doing something useful!
The check for whether it is not running would have to be based on convention:
Name the spawned processes something you can predict.
Get a process list to determine what's still running (defunct?) that shouldn't be.
In either case, you could either inform the users who requested the processes be spawned and/or actually do something about it.
One approach might be to have a CRON job which reads from the SQL database and does ps to determine which spawned processes need to be restarted, and then re-requests that the "manager" process does so using the same IPC mechanism used by the web server. How you differentiate starts vs. restarts in your tracking/monitoring/logging is up to you.
If the server itself loses power or crashes, then you could have the "manager" process perform cleanup when it first runs, e.g:
Look for entries in the database for spawned processes that were allegedly running before the server was shut down.
Check for those processes by PID and run time (this is important).
Either re-spawn the spawned processes that didn't complete, or store something in the database to indicate to the web server that this was the case.
Update #1
Per your comment, here are some pointers to get started:
You mentioned perl, so presuming you have some proficiency there -- here are some perl modules to help you on your way to writing the "manager" process script:
If you're not already familiar with it, CPAN is the repository for perl modules that do basically anything.
Daemon::Daemonize - To daemonize process so that it will continue running after you log out. Also provides methods for writing scripts to start/stop/restart the daemon.
Proc::Spawn - Helps with 'spawning' child scripts. Basically does fork() then exec(), but also handles STDIN/STDOUT/STDERR (or even tty) of child process. You could use this to launch your long-running perl scripts.
If your web server front-end code is not already written in perl, you'll need something that's pretty portable for inter-process message-passing and queuing; I'd probably make your web server front end in something easy to deploy (like PHP).
Here are two possibilities (there are many more):
Perl and PHP implementations for the Spread Toolkit.
Perl and PHP implementations for the ZeroMQ library.
Proc::ProcessTable - You can use this to check on running processes (and get all sorts of stats, as discussed above).
Time::HiRes - Use the high-granularity time functions from this package to implement your 'throttling' framework. Basically just limit the number of requests you de-queue per unit of time.
DBI (with mysql) - Update your MySQL database from the "manager" process.

Implementing locking in MySQL

I want to use a MySQL row-level lock. I can't lock the complete table. I want to avoid two processes processing two different messages for the same server at the same time.
What I thought was that I could have a table called:
server_lock, and if one process starts working on a server it will insert a row into the table.
The problem with this approach is that if the application crashes, we need to remove the lock manually.
Is there a way to take a row-level lock such that the lock gets released if the application crashes?
Edit
I am using C++ as the language.
My application is similar to a message queue, but the difference is that there are two queues, each populated by its own process. If both consumers process an action belonging to the same object at the same time, it may result in wrong data. So I want a locking mechanism between these two queues so that both processors don't modify the same object at the same time.
I can think of two ways:
Implement an error handler in your program that removes the lock. Without knowing anything about your program it is hard to say how to do this, but most languages have some way to do work before exiting after a crash. This is dangerous, because a crash happens when something is not right; if you continue to do any work, it is possible that you corrupt the database or something like that.
Periodically update the lock. Add a thread to your program that periodically refreshes the lock, or refresh it inside some loop you are already running. Then, when a lock has not been updated in a while, you know that it belonged to a program that crashed.
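A minimal sketch of the second idea: the lock row carries a heartbeat timestamp, the owner refreshes it periodically, and a lock whose heartbeat has gone stale is simply thrown away before the next acquisition attempt. It is written in Python with PyMySQL just to keep it short (the question uses C++), and the table, timings and names are assumptions:

    import pymysql

    # Assumed table: object_lock(object_id PRIMARY KEY, owner VARCHAR, heartbeat TIMESTAMP)
    conn = pymysql.connect(host="localhost", user="queue", password="secret",
                           database="app", autocommit=True)
    STALE_SECONDS = 30          # a lock not refreshed for this long is treated as abandoned

    def try_lock(object_id, owner):
        """Return True if this process now holds the lock for object_id."""
        with conn.cursor() as cur:
            # Throw away a lock whose owner stopped heartbeating (i.e. crashed).
            cur.execute(
                "DELETE FROM object_lock WHERE object_id = %s "
                "AND heartbeat < NOW() - INTERVAL %s SECOND",
                (object_id, STALE_SECONDS),
            )
            try:
                cur.execute(
                    "INSERT INTO object_lock (object_id, owner, heartbeat) VALUES (%s, %s, NOW())",
                    (object_id, owner),
                )
                return True
            except pymysql.err.IntegrityError:   # someone else holds a fresh lock
                return False

    def heartbeat(object_id, owner):
        """Call this periodically (e.g. from a timer thread) while holding the lock."""
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE object_lock SET heartbeat = NOW() WHERE object_id = %s AND owner = %s",
                (object_id, owner),
            )

    def unlock(object_id, owner):
        with conn.cursor() as cur:
            cur.execute("DELETE FROM object_lock WHERE object_id = %s AND owner = %s",
                        (object_id, owner))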

How To Mutex Across a Network?

I have a desktop application that runs on a network and every instance connects to the same database.
So, in this situation, how can I implement a mutex that works across all running instances that are connected to the same database?
In other words, I don't want two or more instances to run the same function at the same time. If one is already running the function, the other instances shouldn't have access to it.
PS: A database transaction won't solve this, because the function I want to mutex doesn't use the database. I've mentioned the database only because it can be used to exchange information between the running instances.
PS2: The function takes about ~30 minutes to complete, so if a second instance tries to run the same function I would like to display a nice message that it can't be performed right now because computer 'X' is already running that function.
PS3: The function has to be processed on the client machine, so I can't use stored procedures.
I think you're looking for a database transaction. A transaction will isolate your changes from all other clients.
Update:
You mentioned that the function doesn't currently write to the database. If you want to mutex this function, there will have to be some central location to store the current mutex holder. The database can work for this -- just add a new table that includes the computername of the current holder. Check that table before starting your function.
I think your question may be confused, though. Mutexes are about protecting resources. If your function is not accessing the database, then what shared resource are you protecting?
Put the code inside a transaction, either in the app or (better) inside a stored procedure, and call the stored procedure.
The transaction mechanism will isolate the code between the callers.
Alternatively, consider a message queue. As mentioned, the DB can manage all of this for you, either with transactions or with serial access to tables (à la MyISAM).
In the past I have done the following:
Create a table that basically has two fields, function_name and is_running
I don't know what RDBMS you are using, but most have a way to lock individual records for update. Here is some pseudocode based on Oracle:
BEGIN TRANSACTION;
SELECT is_running FROM function_table WHERE function_name = 'foo' FOR UPDATE;
-- Check here to see if it is running; if not, you can set is_running to 'Y'
UPDATE function_table SET is_running = 'Y' WHERE function_name = 'foo';
COMMIT;
Now I don't have the Oracle PL/SQL docs with me, but you get the idea. The FOR UPDATE clause locks the record from the read until the commit, so other processes will block on that SELECT statement until the current process commits.
You can use Terracotta to implement such functionality, if you've got a Java stack.
Even if your function does not currently use the database, you could still solve the problem with a specific table for the purpose of synchronizing this function. The specifics would depend on your DB and how it handles isolation levels and locking. For example, with SQL Server you would set the transaction isolation to repeatable read, read a value from your locking row and update it inside a transaction. Don't commit the transaction until your function is done. You can also use explicit table locks in a transaction on most databases which might be simpler. This is probably the simplest solution given you are already using a database.
If you do not want to rely on the database for whatever reason you could write a simple service that would accept TCP connections from your client. Each client would request permission to run and would return a response when done. The server would be able to ensure only one client gets permission to run at a time. Dead clients would eventually drop the TCP connection and be detected as long as you have the correct keep alive setting.
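As a hedged illustration of that idea, here is a toy permission server in Python: the first client to connect holds the "mutex" until its TCP connection closes (cleanly, or because the machine died and keep-alive noticed), and everyone else is told who is currently running the function. The port, protocol and messages are all invented:

    import select
    import socket

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", 9100))
    server.listen(16)

    holder = None          # socket of the instance currently allowed to run the function
    holder_name = b"?"     # host name it reported, used for the "busy" message

    while True:
        watched = [server] + ([holder] if holder else [])
        readable, _, _ = select.select(watched, [], [])
        for sock in readable:
            if sock is server:
                client, _ = sock.accept()
                client.settimeout(5.0)                    # a silent client must not stall the loop
                try:
                    name = client.recv(256).strip() or b"unknown"   # client sends its host name first
                except socket.timeout:
                    name = b"unknown"
                if holder is None:
                    holder, holder_name = client, name    # this client now holds the mutex
                    client.sendall(b"GRANTED\n")
                else:
                    client.sendall(b"BUSY " + holder_name + b"\n")
                    client.close()
            else:
                # Any event on the holder socket means it finished or disappeared;
                # either way, release the mutex.
                sock.close()
                holder = None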
The message queue solution suggested by Xepoch would also work. You could use something like MSMQ or Java Message Queue and have a single message that would act as a run token. All your clients would request the message and then repost it when done. You risk a deadlock if a client dies before reposting so you would need to devise some logic to detect this and it might get complicated.