MySQL paginated retrieval of data while avoiding race conditions

My service is clustered and I am running several instances of it.
I need to collect all entities in the paginated fashion and push them into the caching layer (Redis).
While this runs on one application server, the application running on server #2 may already be making changes.
Each paginated call to the db fetches 1000 items.
Now, since I want to prevent modifications while retrieval is ongoing, how do I achieve that?
Can I use SELECT FOR UPDATE mechanism even though I am not updating anything in this transaction, but only fetch the data in a paginated fashion?

If it were one app instance with multiple threads, you could use a critical section. But that doesn't work for a cluster of app instances.
I implemented this for a service a couple of months ago. The app is deployed in several instances. These instances don't communicate with each other, so they can't coordinate directly. But they all connect to the same MySQL database.
What I did was use the GET_LOCK() builtin function of MySQL.
When a routine wants exclusive access, it calls GET_LOCK('mylock', 0). This returns immediately, with a true value if it acquired the lock, or a false value if the lock was already held by some other client. That tells the client app whether it is the "winner" or not.
If a client is not the winner, then it calls GET_LOCK('mylock', -1) which means wait indefinitely. It does this because the winner is working on whatever it needs to do in the critical section.
When the winner finishes, it must call RELEASE_LOCK('mylock'). This unblocks the clients who were waiting. They now know that the work of the critical section is done, and they can feel free to read the contents of the cache or whatever else they need to do.
Also remember that the clients that were waiting on GET_LOCK('mylock', -1) need to call RELEASE_LOCK('mylock') immediately, because once they stop waiting, they have actually acquired the lock themselves.
This design allows a single lock coordinator (MySQL) to be used by multiple clients. It implements pessimistic locking, without needing to rely on locking any table or set of rows.
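The winner/waiter flow described above could look roughly like this in Ruby. This is only a sketch: it assumes `conn` behaves like an ActiveRecord connection (responds to `select_value` and `quote`), and the method name `run_exclusively` and the returned symbols are made up for illustration. Since GET_LOCK locks belong to the database session, every call has to go through the same connection.

```ruby
# Sketch only: `conn` is assumed to behave like an ActiveRecord connection
# (responds to #select_value and #quote). `run_exclusively` is a hypothetical name.
def run_exclusively(conn, lock_name = 'mylock')
  name = conn.quote(lock_name)

  # Timeout 0: return immediately. 1 means we got the lock, 0 means another
  # client already holds it.
  if conn.select_value("SELECT GET_LOCK(#{name}, 0)").to_i == 1
    begin
      yield # critical section, e.g. rebuild the cache
    ensure
      # Release even if the critical section raises.
      conn.select_value("SELECT RELEASE_LOCK(#{name})")
    end
    :winner
  else
    # Negative timeout: wait indefinitely until the winner releases the lock.
    conn.select_value("SELECT GET_LOCK(#{name}, -1)")
    # We now hold the lock ourselves, so give it back right away.
    conn.select_value("SELECT RELEASE_LOCK(#{name})")
    :waiter
  end
end
```

A caller would wrap the cache rebuild in the block and, when the result is :waiter, simply read what the winner wrote.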

Related

ActiveRecord::StaleObject error on opening each result on a new tab

Recently we've added functionality to our RoR application that lets users open each record in its own browser tab. Since then, we've been seeing frequent ActiveRecord::StaleObject errors. On investigating the issue, I found that Rails tries to update the session store first whenever a resource is opened in a tab, and that is where the exception is raised.
We have a lock_version column in our ActiveRecord session store, so Rails applies optimistic locking by default. Is there any way to solve this without introducing much complexity? The application is already live on the client's machine, and we can't affect the session data already stored in the session store DB.
Any suggestions would be much appreciated. Thanks
It sounds like you're using optimistic locking on a db session record and updating the session record when you process an update to other records. Not sure what you'd need to update in the session, but if you're worried about possibly conflicting updates to the session object (and need the locking) then these errors might be desired.
If you don't need the locking, you can reload the session object before saving it (or disable its optimistic locking) to avoid this error for these session updates.
You might also look into what is being updated in the session and whether it's strictly necessary. If you're updating something like "last_active_on", you might be better off sending off a background job to do this and/or using the update_column method, which bypasses the rather heavyweight ActiveRecord save callback chain.
--- UPDATE ---
Pattern: Putting side-effects in background jobs
There are several common Rails patterns that start to break down as your app usage grows. One of the most common that I've run into is when a controller endpoint for a specific record also updates a common/shared record (for example, if creating a 'message' also updates the messages_count for a user using counter cache, or updates a last_active_at on a session). These patterns create bottlenecks in your application as multiple different types of requests across your application will compete for write locks on the same database rows unnecessarily.
These tend to creep into your app over time and become hard to refactor later. I'd recommend always handling side-effects of a request in an asynchronous job (using something like Sidekiq). Something like:
class Message < ActiveRecord::Base
  # Run the side-effect only once the message is actually committed
  after_commit :enqueue_update_messages_count_job

  def enqueue_update_messages_count_job
    Jobs::UpdateUserMessageCountJob.enqueue(self.id)
  end
end
While this may seem like overkill at first, it creates an architecture that is significantly more scalable. If counting the messages becomes slow, that will make the job slower but won't impact the usability of the product. In addition, if certain activities create lots of objects with the same side-effects (let's say you have a "signup" controller that creates a bunch of objects for a user that all trigger an update of user.updated_at), it becomes easy to throw out duplicate jobs and avoid updating the same field 20 times.
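To illustrate the "throw out duplicate jobs" idea, here is a minimal in-memory sketch. The DedupingQueue class is hypothetical and is not part of Sidekiq or any other library; a real app would rely on whatever uniqueness mechanism its job system offers, if any.

```ruby
require 'set'

# Hypothetical in-memory stand-in for a job queue, for illustration only.
class DedupingQueue
  def initialize
    @pending = Set.new
    @jobs = []
  end

  # Enqueue a job unless an identical (job_name, args) pair is already waiting.
  # Returns true if the job was added, false if it was a duplicate.
  def enqueue(job_name, *args)
    key = [job_name, args]
    return false unless @pending.add?(key)
    @jobs << key
    true
  end

  # Run all pending jobs once, yielding the name and argument list of each.
  def drain
    until @jobs.empty?
      key = @jobs.shift
      @pending.delete(key)
      yield(*key)
    end
  end
end
```

With this, twenty callbacks that all enqueue the same "touch user 42" job during one signup would result in a single job being run.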
Pattern: Skipping the activerecord callback chain
Calling save on an ActiveRecord object runs validations and all the before and after callbacks. These can be slow and (at times) unnecessary. For example, updating a message_count cached value doesn't necessarily depend on whether the user's email address is valid (or on any other validation), and you may not care about other callbacks running. Similarly, if you're just updating a user's updated_at value to clear a cache. You can bypass validations and the ActiveRecord callback chain by calling user.update_column(:message_count, ..) to write that field directly to the database (update_attribute, by contrast, skips validations but still runs callbacks). In theory this shouldn't be necessary for a well designed application, but in practice some larger/legacy codebases make significant use of the ActiveRecord callback chain to handle business logic that you may not want to invoke.
--- Update #2 ---
On Deadlocks
One reason to avoid updating (or generally locking) a common/shared object from a concurrent request is that it can introduce Deadlock errors.
Generally speaking, a "deadlock" in a database is when two processes each need a lock the other one holds. Neither can continue, so the database must abort one of them with an error. Detection is not free, so databases differ in when they check: PostgreSQL, for example, only runs its deadlock check after a process has waited on a lock for deadlock_timeout, and many databases also have a separate lock wait timeout that raises an error when a lock cannot be acquired in time, even if there is no true deadlock. While contention for locks is common (e.g. two requests that both update a 'session' object), a true deadlock is comparatively rare (thread A holds a lock on the session that thread B needs, while thread B holds a lock on a different object that thread A needs), so if the errors you see are really lock wait timeouts, you may be able to partially address the problem by extending that timeout. While this may reduce the errors, it doesn't fix the underlying issue that threads may sit waiting for up to the full timeout. An alternative approach is to keep the timeout short and rescue/retry a few times.
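The rescue/retry approach could be sketched as below. The helper name with_deadlock_retry and the DeadlockError class are stand-ins invented for this example; in a Rails app you would presumably pass the adapter's own error (e.g. ActiveRecord::Deadlocked) as retry_on, and the block should contain the whole transaction so that it is re-run from the start.

```ruby
# Stand-in error class so the sketch runs on its own (hypothetical).
class DeadlockError < StandardError; end

# Runs the block, retrying when it raises `retry_on`, up to max_attempts times.
# The current attempt number (starting at 1) is passed to the block.
def with_deadlock_retry(max_attempts: 3, base_delay: 0.05, retry_on: DeadlockError)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue retry_on
    # Out of attempts: let the error propagate to the caller.
    raise if attempts >= max_attempts
    # Back off a little longer after each failure before trying again.
    sleep(base_delay * attempts)
    retry
  end
end
```

Usage would be along the lines of with_deadlock_retry { Session.transaction { ... } }, where Session is whatever model is being contended for.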

Read after write consistency with mysql and multiple concurrent connections

I'm trying to understand whether it is possible to achieve the following:
I have multiple instances of an application server running behind a round-robin load balancer. The client expects GET after POST/PUT semantics, in particular the client will make a POST request, wait for the response and immediately make a GET request expecting the response to reflect the change made by the POST request, e.g:
> Request: POST /some/endpoint
< Response: 201 CREATED
< Location: /some/endpoint/123
> Request: GET /some/endpoint/123
< Response must not be 404 Not Found
It is not guaranteed that both requests are handled by the same application server. Each application server has a pool of connections to the DB. Each request will commit a transaction before responding to the client.
Thus the database will see an INSERT statement followed by a COMMIT on one connection. On another connection, it will see a SELECT statement. Temporally, the SELECT comes strictly after the COMMIT, although the delay may be only a few milliseconds.
The application server I have in mind uses Java, Spring, and Hibernate. The database is MySQL 5.7.11 managed by Amazon RDS in a multiple availability zone setup.
I'm trying to understand whether this behavior can be achieved and how so. There is a similar question, but the answer suggesting to lock the table does not seem right for an application that must handle concurrent requests.
Under ordinary circumstances, you will not have any issue with this sequence of requests, since your MySQL will have committed the changes to the database by the time the 201 response has been sent back. Therefore, any subsequent statements will see the created / updated record.
What could be the extraordinary circumstances under which the subsequent select will not find the updated / inserted record?
Another process commits an update or delete statement that changes or removes the given record. There is not too much you can do about this, since it is part of the normal operation. If you do not want such thing to happen, then you have to implement application level locking of data.
The subsequent GET request is routed not only to a different application server, but to one that uses (or is forced to use) a different database instance, which does not have the most recent state of that record. I would expect this to happen only if there is a severe failure at either the application or the database server level, or if routing of the request goes really wrong (e.g. it is routed to a data center in a different geographical location). These should not happen too frequently.
If you're using MyISAM tables, you might be seeing the effects of 'concurrent inserts' (see 8.11.3 in the mysql manual). You can avoid them by either setting the concurrent_insert system variable to 0, or by using the HIGH_PRIORITY keyword on the INSERT.

Correct way to do a distributed mutex in Rails?

I am building a feature that requires application-level lock functionality.
The feature goes like this:
user logs in to the site
they hit a button which kicks off a bunch of API requests (this is a long-running, synchronous, process)
process finishes and all is well
The issue is that there can only be one instance of this process running at any one time. Any kind of double-submit will cause major problems.
My current strategy is to implement the following logic:
I will put a boolean field on a table that indicates whether or not the long-running process is currently active
when the user first submits their action, I will update that boolean using a lock:
pc = ProcessControl.first
pc.with_lock do
  if pc.process_is_running?
    return # abort
  else
    pc.process_is_running = true
    pc.save!
  end
end
LongRunningProcess.start!
then, the long process will run, and at the end, I'll flip the flag back to false.
So my question is: will this work in a distributed environment? I have multiple app servers and I want to be sure that once the long-running process is off and running on one of the app servers, any request to kick off the long-running process around the same time will read pc.process_is_running? as true and abort, preventing the double-submit.
I have found some resources already that indicate there are other ways to do a distributed lock, I'm hoping that this (maybe naive?) approach above will work.
Resources I've looked at:
http://makandracards.com/makandra/1026-simple-database-mutex-mysql-lock
https://github.com/mceachen/with_advisory_lock

Database strategy for synchronization based on changes

I have a Spring+Hibernate+MySQL backend that exposes my model (8 different entities) to a desktop client. To stay synchronized, I want the client to regularly ask the server for recent changes. The process may be as follows:
Point A: The client connects for the first time and retrieves the whole model from the server.
Point B: The client asks the server for all changes since Point A.
Point C: The client asks the server for all changes since Point B.
To retrieve the changes (points B and C), I could create an HQL query that returns all rows in all my tables that have been modified since my previous retrieval. However, I'm afraid this could be a heavy query and degrade performance if executed often.
For this reason I was considering alternatives, such as keeping a separate table with recent updates for fast access. I have looked into using the L2 query cache, but it doesn't seem to serve my purpose.
Does someone know a good strategy for my purpose? My initial thought is to keep control of synchronization and avoid using "automatic" synchronization tools.
Many thanks
You can store changes in a queue table. Triggers can populate the queue on insert, update, and delete. This preserves the order of the changes, e.g. insert, update, update, delete. Empty the queue after download.
Emptying the queue would cause issues if you have multiple clients, so you may need to think about a design to handle that case.
There are several designs you can go with, all with trade-offs. I have used the queue design before, but it was only copying data to a single destination, not multiple.
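One possible way to handle multiple clients (an assumption on my part, not something the answers above prescribe) is to stop emptying the queue on download and instead give every change a monotonically increasing sequence number, with each client remembering the last number it has seen. The sketch below models that in memory; in the database the appending would be done by the triggers, and all names here are hypothetical.

```ruby
# In-memory model of a change queue table, for illustration only.
class ChangeQueue
  Change = Struct.new(:seq, :table, :operation, :row_id)

  def initialize
    @changes = []
    @next_seq = 1
  end

  # Append a change and return its sequence number.
  def record(table, operation, row_id)
    change = Change.new(@next_seq, table, operation, row_id)
    @next_seq += 1
    @changes << change
    change.seq
  end

  # Changes newer than the client's cursor, oldest first.
  def since(cursor)
    @changes.select { |c| c.seq > cursor }
  end

  # Drop entries that every client has already seen
  # (min_cursor is the smallest cursor across all clients).
  def prune(min_cursor)
    @changes.reject! { |c| c.seq <= min_cursor }
  end
end
```

Each client calls since(last_seen_seq) and stores the highest seq it received; pruning then only needs the smallest cursor across clients, so one slow client delays cleanup but never loses changes.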

How To Mutex Across a Network?

I have a desktop application that runs on a network and every instance connects to the same database.
So, in this situation, how can I implement a mutex that works across all running instances that are connected to the same database?
In other words, I don't want two or more instances to run the same function at the same time. If one is already running the function, the other instances shouldn't have access to it.
PS: A database transaction won't solve this, because the function I want to mutex doesn't use the database. I've mentioned the database just because it can be used to exchange information across the running instances.
PS2: The function takes about 30 minutes to complete, so if a second instance tries to run the same function, I would like to display a nice message saying that it can't be performed right now because computer 'X' is already running that function.
PS3: The function has to be processed on the client machine, so I can't use stored procedures.
I think you're looking for a database transaction. A transaction will isolate your changes from all other clients.
Update:
You mentioned that the function doesn't currently write to the database. If you want to mutex this function, there will have to be some central location to store the current mutex holder. The database can work for this -- just add a new table that includes the computername of the current holder. Check that table before starting your function.
I think your question may be confused, though. Mutexes should be about protecting resources. If your function is not accessing the database, then what shared resource are you protecting?
Put the code inside a transaction, either in the app or, better, inside a stored procedure, and call the stored procedure.
The transaction mechanism will isolate the code between the callers.
Alternatively, consider a message queue. As mentioned, the DB should manage all of this for you, either with transactions or with serial access to tables (à la MyISAM).
In the past I have done the following:
Create a table that basically has two fields, function_name and is_running
I don't know what RDBMS you are using, but most have a way to lock individual records for update. Here is some pseudocode based on Oracle:
-- Pseudocode: how a transaction is started varies by RDBMS
BEGIN;
SELECT is_running FROM function_table WHERE function_name = 'foo' FOR UPDATE;
-- Check here to see if it is running; if not, you can set is_running to 'Y'
UPDATE function_table SET is_running = 'Y' WHERE function_name = 'foo';
COMMIT;
Now I don't have the Oracle PL/SQL docs with me, but you get the idea. The 'FOR UPDATE' clause locks the record after the read until the commit, so other processes will block on that SELECT statement until the current process commits.
You can use Terracotta to implement such functionality, if you've got a Java stack.
Even if your function does not currently use the database, you could still solve the problem with a specific table for the purpose of synchronizing this function. The specifics would depend on your DB and how it handles isolation levels and locking. For example, with SQL Server you would set the transaction isolation to repeatable read, read a value from your locking row and update it inside a transaction. Don't commit the transaction until your function is done. You can also use explicit table locks in a transaction on most databases which might be simpler. This is probably the simplest solution given you are already using a database.
If you do not want to rely on the database for whatever reason you could write a simple service that would accept TCP connections from your client. Each client would request permission to run and would return a response when done. The server would be able to ensure only one client gets permission to run at a time. Dead clients would eventually drop the TCP connection and be detected as long as you have the correct keep alive setting.
The message queue solution suggested by Xepoch would also work. You could use something like MSMQ or Java Message Queue and have a single message that would act as a run token. All your clients would request the message and then repost it when done. You risk a deadlock if a client dies before reposting so you would need to devise some logic to detect this and it might get complicated.