How To Mutex Across a Network? - language-agnostic

I have a desktop application that runs on a network and every instance connects to the same database.
So, in this situation, how can I implement a mutex that works across all running instances that are connected to the same database?
In other words, I don't want two or more instances to run the same function at the same time. If one is already running the function, the other instances shouldn't have access to it.
PS: A database transaction won't solve this, because the function I want to mutex doesn't use the database. I've mentioned the database only because it can be used to exchange information across the running instances.
PS2: The function takes about 30 minutes to complete, so if a second instance tries to run the same function I would like to display a nice message that it can't be performed right now because computer 'X' is already running that function.
PS3: The function has to be processed on the client machine, so I can't use stored procedures.

I think you're looking for a database transaction. A transaction will isolate your changes from all other clients.
Update:
You mentioned that the function doesn't currently write to the database. If you want to mutex this function, there will have to be some central location to store the current mutex holder. The database can work for this -- just add a new table that includes the computername of the current holder. Check that table before starting your function.
I think your question may be confused, though. Mutexes should be about protecting resources. If your function is not accessing the database, then what shared resource are you protecting?

Put the code inside a transaction, either in the app or, better, inside a stored procedure, and call the stored procedure.
The transaction mechanism will isolate the code between the callers.

Alternatively, consider a message queue. As mentioned, the DB should manage all of this for you, either through transactions or through serialized access to tables (à la MyISAM).

In the past I have done the following:
Create a table that basically has two fields, function_name and is_running.
I don't know what RDBMS you are using, but most have a way to lock individual records for update. Here is some pseudocode based on Oracle:
BEGIN TRANS
-- FOR UPDATE goes at the end of the SELECT and locks the matching row
SELECT is_running FROM function_table WHERE function_name='foo' FOR UPDATE;
-- Check here to see if it is running; if not, you can set is_running to 'Y'
UPDATE function_table SET is_running='Y' WHERE function_name='foo';
COMMIT TRANS
Now I don't have the Oracle PL/SQL docs with me, but you get the idea. The 'FOR UPDATE' clause locks the record after the read until the commit, so other processes will block on that SELECT statement until the current process commits.

You can use Terracotta to implement such functionality, if you've got a Java stack.

Even if your function does not currently use the database, you could still solve the problem with a specific table for the purpose of synchronizing this function. The specifics would depend on your DB and how it handles isolation levels and locking. For example, with SQL Server you would set the transaction isolation to repeatable read, read a value from your locking row and update it inside a transaction. Don't commit the transaction until your function is done. You can also use explicit table locks in a transaction on most databases which might be simpler. This is probably the simplest solution given you are already using a database.
If you do not want to rely on the database for whatever reason you could write a simple service that would accept TCP connections from your client. Each client would request permission to run and would return a response when done. The server would be able to ensure only one client gets permission to run at a time. Dead clients would eventually drop the TCP connection and be detected as long as you have the correct keep alive setting.
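A rough sketch of such a service in Python, using only the standard library. The line-based protocol (client sends its name, server answers GRANTED or BUSY, client sends DONE) and every class and function name here are made up for illustration. The lock is tied to the connection: if the client's connection drops, the blocking read on the server returns and the lock is freed.

```python
import socket
import socketserver
import threading

class LockHandler(socketserver.StreamRequestHandler):
    def handle(self):
        server = self.server
        name = self.rfile.readline().decode().strip()  # client announces itself
        with server.state_lock:
            granted = server.holder is None
            if granted:
                server.holder = name
            current = server.holder
        if not granted:
            self.wfile.write(f"BUSY {current}\n".encode())
            return
        try:
            self.wfile.write(b"GRANTED\n")
            # Blocks until the client sends DONE or the connection is lost.
            self.rfile.readline()
        except OSError:
            pass
        finally:
            with server.state_lock:
                server.holder = None
        try:
            self.wfile.write(b"RELEASED\n")  # acknowledgement, if still connected
        except OSError:
            pass

class LockServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, LockHandler)
        self.holder = None
        self.state_lock = threading.Lock()

def request_lock(address, name):
    """Return (socket, None) when granted, or (None, holder_name) when busy."""
    sock = socket.create_connection(address)
    sock.sendall(f"{name}\n".encode())
    with sock.makefile() as reader:
        reply = reader.readline().strip()
    if reply == "GRANTED":
        return sock, None  # keep this socket open for as long as the work runs
    sock.close()
    return None, reply.split(" ", 1)[1]

def release_lock(sock):
    with sock:
        sock.sendall(b"DONE\n")
        with sock.makefile() as reader:
            reader.readline()  # wait until the server confirms the release
```

A client would call request_lock before the long function and release_lock afterwards. Detecting a client whose machine vanished without closing the connection still depends on TCP keep-alive being configured, as noted above; this sketch does not set it.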
The message queue solution suggested by Xepoch would also work. You could use something like MSMQ or Java Message Queue and have a single message that would act as a run token. All your clients would request the message and then repost it when done. You risk a deadlock if a client dies before reposting so you would need to devise some logic to detect this and it might get complicated.
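The run-token idea can be illustrated with an in-process queue. This is only a stand-in: Python's queue.Queue plays the role of the message broker, and the names run_token_queue and run_if_token_available are hypothetical. A real deployment would use the broker's own client library.

```python
import queue

# The "broker": a queue that holds exactly one message, the run token.
run_token_queue = queue.Queue()
run_token_queue.put("run-token")

def run_if_token_available(work):
    """Run work() only if the token can be taken; return whether it ran."""
    try:
        token = run_token_queue.get_nowait()  # take the token, do not wait
    except queue.Empty:
        return False  # someone else holds the token right now
    try:
        work()
    finally:
        run_token_queue.put(token)  # repost the token when done
    return True
```

The finally block only protects against exceptions inside the same process. If the machine holding the token dies, nothing reposts it, which is exactly the deadlock risk described above.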

Related

ActiveRecord::StaleObject error on opening each result on a new tab

Recently we added functionality to our RoR application that lets users open a particular record in its own individual tab. Since then, we've started seeing frequent ActiveRecord::StaleObject errors. On investigating the issue I found that Rails tries to update the session store first whenever a resource is opened in a tab, and that is where the exception is raised.
We have a lock_version column in our ActiveRecord session store, so Rails applies optimistic locking to it by default. Is there any way we could solve this issue without introducing much complexity, given that the application is already live on the client's machine, and without affecting any of the session data we've stored in our session store DB?
Any suggestions would be much appreciated. Thanks
It sounds like you're using optimistic locking on a db session record and updating the session record when you process an update to other records. Not sure what you'd need to update in the session, but if you're worried about possibly conflicting updates to the session object (and need the locking) then these errors might be desired.
If you don't, you can refresh the session object before saving the session (or disable its optimistic locking) to avoid this error for these session updates.
You also might look into what about the session is being updated and whether it's strictly necessary. If you're updating something like "last_active_on" then you might be better off sending off a background job to do this and/or using the update_column method which bypasses the rather heavyweight activerecord save callback chain.
--- UPDATE ---
Pattern: Putting side-effects in background jobs
There are several common Rails patterns that start to break down as your app usage grows. One of the most common that I've run into is when a controller endpoint for a specific record also updates a common/shared record (for example, if creating a 'message' also updates the messages_count for a user using counter cache, or updates a last_active_at on a session). These patterns create bottlenecks in your application as multiple different types of requests across your application will compete for write locks on the same database rows unnecessarily.
These tend to creep into your app over time and become hard to refactor later. I'd recommend always handling side-effects of a request in an asynchronous job (using something like Sidekiq). Something like:
class Message < ActiveRecord::Base
  after_commit :enqueue_update_messages_count_job
  def enqueue_update_messages_count_job
    Jobs::UpdateUserMessageCountJob.enqueue(self.id)
  end
end
While this may seem like overkill at first, it creates an architecture that is significantly more scalable. If counting the messages becomes slow... that will make the job slower but not impact the usability of the product. In addition, if certain activities create lots of objects with the same side-effects (lets say you have a "signup" controller that creates a bunch of objects for a user that all trigger an update of user.updated_at) it becomes easy to throw out duplicate jobs and prevent updating the same field 20 times.
Pattern: Skipping the activerecord callback chain
Calling save on an ActiveRecord object runs validations and all the before and after callbacks. These can be slow and (at times) unnecessary. For example, updating a message_count cached value doesn't necessarily care about whether the user's email address is valid (or any other validations) and you may not care about other callbacks running. The same applies if you're just updating a user's updated_at value to clear a cache. You can bypass the activerecord callback chain by calling user.update_column(:message_count, ..) to write that field directly to the database (update_attribute, by contrast, skips validations but still runs callbacks). In theory this shouldn't be necessary for a well designed application but in practice some larger/legacy codebases may make significant use of the activerecord callback chain to handle business logic that you may not want to invoke.
--- Update #2 ---
On Deadlocks
One reason to avoid updating (or generally locking) a common/shared object from a concurrent request is that it can introduce Deadlock errors.
Generally speaking, a "deadlock" in a database occurs when two processes each need a lock that the other one holds. Neither can continue, so the database must abort one of them with an error. How this is detected varies by database: some only run a deadlock check after a session has waited on a lock for a configured interval (PostgreSQL's deadlock_timeout setting works this way), and most also offer a separate lock-wait timeout that fails a statement that has waited too long. While contention for locks is common (e.g. two updates that are both updating a 'session' object), a true deadlock is often rare (where thread A has a lock on the session that thread B needs, but thread B has a lock on a different object that thread A needs), so you may be able to partially address the problem by looking at / extending your lock timeouts. While this may reduce the errors, it doesn't fix the issue that the threads may be waiting for up to that timeout. An alternative approach is to have a short timeout and rescue/retry a few times.
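The rescue/retry idea is not specific to Rails. Here is a language-neutral sketch in Python; DeadlockDetected is a hypothetical stand-in for whatever deadlock or lock-timeout exception your database driver raises, and the attempt count and delays are arbitrary.

```python
import random
import time

class DeadlockDetected(Exception):
    """Hypothetical stand-in for the driver-specific deadlock/timeout error."""

def with_deadlock_retry(operation, attempts=3, base_delay=0.05):
    """Call operation(); on a deadlock error, back off briefly and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except DeadlockDetected:
            if attempt == attempts:
                raise  # give up after the last attempt
            # Small, growing, jittered pause so the competing requests
            # are less likely to collide again at the same moment.
            time.sleep(base_delay * attempt + random.uniform(0, base_delay))
```

The operation passed in should be the whole transaction, since the database rolls the aborted transaction back and everything in it has to be redone.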

What could cause mysql db read to return stale data

I am chasing a problem in a MySQL application. At some point my client INSERTs some data, using a query wrapped in a START TRANSACTION; .... COMMIT; statement. Right after that, another client comes and reads back the data, and it is not there (I am sure of the order of things).
I am running nodejs, express, mysql2, and use connection pooling, with multiple statements queries.
What is interesting is that I see weird things on mysqlworkbench. I just had a workbench instance which would not see the newly inserted data either. I opened a second one, it saw the new data. Minutes later, the first instance would still not see the new data. Hit 'Reconnect to DBMS', and now it sees it. The workbench behaviour, if applied to my node client, would explain the bad result I see in node / mysql2.
There is some sort of caching going on somewhere... no idea where to start :-( Any pointers? Thanks!
It sounds like your clients are living in their own snapshot of the database, which would be true if they have an open transaction using the REPEATABLE-READ isolation level. In other words, no data committed after that client started its transaction will be visible to that client.
One workaround is to force a new transaction to start. Just run COMMIT in the client session where it appears to be viewing stale data. That will resolve any open transaction and the next query will start a new transaction.
Another way you can test is to use a locking read query such as SELECT ... FOR UPDATE. This will read the most recently committed data, regardless of the client's transaction isolation level. That is, even if the client had started their transaction using REPEATABLE-READ, a locking read behaves as if they had started their transaction with READ-COMMITTED.
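The snapshot effect is easy to reproduce in miniature. The sketch below uses SQLite in WAL mode rather than MySQL, purely because it runs without a server; it is an analogy, not MySQL's implementation. SQLite readers in WAL mode also keep a snapshot for the life of their transaction, so the pattern of "stale until COMMIT" looks the same. The function name and table are made up.

```python
import os
import sqlite3
import tempfile

def demo_stale_snapshot():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: the driver adds no implicit BEGIN/COMMIT,
    # so every transaction boundary below is explicit.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("CREATE TABLE items (name TEXT)")
    reader = sqlite3.connect(path, isolation_level=None)

    # The reader opens a transaction; its snapshot is taken at the first read.
    reader.execute("BEGIN")
    before = reader.execute("SELECT COUNT(*) FROM items").fetchone()[0]

    # Another client inserts and commits.
    writer.execute("BEGIN")
    writer.execute("INSERT INTO items VALUES ('new row')")
    writer.execute("COMMIT")

    # Still inside the old transaction: the committed row is not visible.
    stale = reader.execute("SELECT COUNT(*) FROM items").fetchone()[0]

    # Ending the transaction lets the next query see current data.
    reader.execute("COMMIT")
    fresh = reader.execute("SELECT COUNT(*) FROM items").fetchone()[0]

    writer.close()
    reader.close()
    return before, stale, fresh
```

In a pooled Node setup the equivalent culprit would be a pooled connection that was handed back with a transaction still open, which is worth checking for.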

MySQL paginated retrieval of the data avoiding race conditions

My service is clustered and I am running several instances of it.
I need to collect all entities in the paginated fashion and push them into the caching layer (Redis).
While doing so on one application server, an application that is running on server #2 can already be making the changes.
Those paginated calls to db will be fetching 1000 items at one call.
Now, since I want to prevent modifications while retrieval is ongoing, how do I achieve that?
Can I use SELECT FOR UPDATE mechanism even though I am not updating anything in this transaction, but only fetch the data in a paginated fashion?
If it were one app instance with multiple threads, you could use a critical section. But that doesn't work for a cluster of app instances.
I implemented this for a service a couple of months ago. The app is deployed in several instances. These instances don't communicate with each other, so they can't coordinate directly. But they all connect to the same MySQL database.
What I did was use the GET_LOCK() builtin function of MySQL.
When a routine wants exclusive access, it calls GET_LOCK('mylock', 0). This returns immediately, with a true value if it acquired the lock, or a false value if the lock was already held by some other client. That tells the client app whether it is the "winner" or not.
If a client is not the winner, then it calls GET_LOCK('mylock', -1) which means wait indefinitely. It does this because the winner is working on whatever it needs to do in the critical section.
When the winner finishes, it must call RELEASE_LOCK('mylock'). This unblocks the clients who were waiting. They now know that the work of the critical section is done, and they can feel free to read the contents of the cache or whatever else they need to do.
Also remember that the clients that were waiting on GET_LOCK('mylock', -1) need to call RELEASE_LOCK('mylock') immediately, because once they stop waiting, they have actually acquired the lock themselves.
This design allows a single lock coordinator (MySQL) to be used by multiple clients. It implements pessimistic locking, without needing to rely on locking any table or set of rows.
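The winner/waiter flow described above could be sketched like this in Python. It assumes cursor is a DB-API cursor from a MySQL driver that uses the %s parameter style; the function name and the critical_section callback are hypothetical. All calls must go through the same connection, because MySQL ties a named lock to the session that acquired it.

```python
def refresh_cache_exclusively(cursor, lock_name, critical_section):
    """Return True if this client was the winner and ran critical_section()."""
    # Timeout 0: return immediately with 1 (acquired) or 0 (held elsewhere).
    cursor.execute("SELECT GET_LOCK(%s, 0)", (lock_name,))
    if cursor.fetchone()[0] == 1:
        try:
            critical_section()  # e.g. page through the rows and fill the cache
        finally:
            cursor.execute("SELECT RELEASE_LOCK(%s)", (lock_name,))
            cursor.fetchone()
        return True

    # Not the winner: wait (negative timeout = no limit) until the winner is done.
    cursor.execute("SELECT GET_LOCK(%s, -1)", (lock_name,))
    cursor.fetchone()
    # Waking up means we now hold the lock ourselves, so give it back at once.
    cursor.execute("SELECT RELEASE_LOCK(%s)", (lock_name,))
    cursor.fetchone()
    return False
```

Because the lock belongs to the session, it is also released automatically if the winner's connection dies, so a crashed winner does not block the others forever.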

Best Practice for synchronized jobs in Application clusters

We have got 3 REST-Applications within a cluster.
So each application server can receive requests from "outside".
Now we have timed events that analyse the database, add/remove rows, send emails, etc.
The problem is that each application server starts these timed events, so it can happen that two application servers start this analysis job at the same time.
We have a SQL table in the back.
Our idea was to lock a table within the SQL database when starting the job. If the table is locked, we exit the job, because another application has just started the analysis.
What's a good practice for adding some kind of semaphore?
Any ideas ?
Don't use semaphores; you are overcomplicating things. Just use message queueing, where you queue your tasks and have them executed in order.
Use ONLY one separate node/process/child_process to consume from the queue and get your tasks done.
We (at a previous employer) used a database-based semaphore. Each of several (for redundancy and load sharing) servers had the same set of cron jobs. The first thing in each was a custom library call that did:
Connect to the database and check for (or insert) "I'm working on X".
If the flag was already set, then the cron job silently exited.
When finished, the flag was cleared.
The table included a timestamp and a host name -- for debugging and recovering from cron jobs that fail to finish gracefully.
I forget how the "test and set" was done. Possibly an optimistic INSERT, then check for "duplicate key".
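One way the "optimistic INSERT, then check for duplicate key" test-and-set could look, sketched in Python with SQLite so it runs standalone. The table job_flags, its columns, and the stale_after takeover rule are assumptions for illustration; the original library may have handled recovery by hand using the timestamp and host name.

```python
import sqlite3
import time

def create_job_flags(conn):
    # The primary key on job_name is what makes the INSERT a test-and-set.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_flags ("
        "job_name TEXT PRIMARY KEY, host TEXT NOT NULL, started_at REAL NOT NULL)"
    )
    conn.commit()

def try_start(conn, job_name, host, stale_after=3600.0, now=None):
    """Return True if this host may run the job, False if it should exit."""
    now = time.time() if now is None else now
    # Assumed recovery rule: drop a flag left behind by a run that
    # started too long ago and presumably never finished.
    conn.execute(
        "DELETE FROM job_flags WHERE job_name = ? AND started_at < ?",
        (job_name, now - stale_after),
    )
    try:
        conn.execute(
            "INSERT INTO job_flags VALUES (?, ?, ?)", (job_name, host, now)
        )
    except sqlite3.IntegrityError:
        conn.rollback()  # duplicate key: another host is working on it
        return False
    conn.commit()
    return True

def finish(conn, job_name, host):
    # Clear the flag, but only if it is ours.
    conn.execute(
        "DELETE FROM job_flags WHERE job_name = ? AND host = ?", (job_name, host)
    )
    conn.commit()
```

Each cron job would call try_start first and exit silently on False, then call finish at the end. The stale_after value has to be comfortably longer than the slowest legitimate run.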

Is a MySQL procedure thread safe?

I am developing some websites that need to interact with a database. I will not bring here a complicated example. My question actually comes down to: Is a MySQL procedure thread safe? If one client on my site triggers a procedure, can I assume it is atomic, or could it interfere with another request from another user?
It depends on whether you're using SQL transactions. It's possible, without the appropriate use of transactions and the right isolation level, for a procedure to expose some data (in a write call, for instance) that is visible to other queries/procedures before the procedure has completed.
In short: a given procedure will only be atomic if you use the appropriate transaction isolation level.
The database will handle concurrency for you. This is normally done via transactions - any set of statements within a transaction is considered atomic and isolated from other processes. In some databases, a stored procedure will be in an implicit transaction (so you don't need to declare one) - read the documentation for your RDBMS.
Sometimes this will mean that records are locked while another process tries to use them.
You will need to write your application so it can detect such occurrences and retry.
It really depends on how your server is configured to use transactions. There are tradeoffs to consider depending on how your data is used and whether or not dirty, non-repeatable, or phantom reads are acceptable for your application.
Yes.
It's the DB's job to ensure thread safety among its worker threads, and it's your job to ensure thread safety among your application threads. Since there's a separation between the DB server, and your application, you don't need to worry about thread safety in this case. MySQL's data locking mechanisms will prevent you from corrupting the data in the DB due to simultaneous access from multiple threads in your own app.
Thread safety is more about modifying data in-memory, that is also shared among multiple threads within your app. Since the DB server is its own, separate application, it basically protects you from the scenario you've outlined above.