Should I try a rollback before beginning a new transaction?

I have a question and I can't find a similar one.
In a generic PHP script like:
$pdo->beginTransaction();
//...
//many things to do...
//...
$pdo->commit();
Let's say the user stops the page loading or loses their connection before the commit is reached.
Does the transaction remain open? Do I have to try a rollback before the next beginTransaction?

If you are worried about a user dropping the connection, you are better off using ignore_user_abort.
That way, regardless of whether the user stops the page loading or anything else happens on their end, the script runs to completion.
As for the direct question: no, you don't need to roll back first. PHP's MySQL connections are normally non-persistent, so when a script terminates its connection closes, and MySQL rolls back any uncommitted transaction on that connection; nothing stays open for a later request to clean up.
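A minimal sketch combining ignore_user_abort with an explicit rollback on failure (assuming $pdo is the PDO connection from the question):

<?php
// Keep running even if the user aborts the request mid-transaction.
ignore_user_abort(true);
set_time_limit(0); // optional: don't let the script time out mid-work

$pdo->beginTransaction();
try {
    // ...many things to do...
    $pdo->commit();
} catch (Exception $e) {
    $pdo->rollBack(); // clean up explicitly if something goes wrong
    throw $e;
}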

Related

Long running transactions in Slick

I'm working on an akka-http/slick web service, and I need to do the following in a transaction:
Insert a row in a table
Call some external web service
Commit the transaction
The web service I need to call is sometimes really slow to respond (let's say ~2 seconds).
I'm worried that this might keep the SQL connection open for too long, exhaust Slick's connection pool, and affect other, independent requests.
Is this a possibility? Or does Slick do something to make sure this "idle" mid-transaction connection does not starve the pool?
If it is something I should be worried about - is there anything I can do to remedy this?
If it matters, I'm using MySQL with TokuDB.
The Slick documentation seems to say that this will be a problem:
The use of a transaction always implies a pinned session.
And
You can use withPinnedSession to force the use of a single session, keeping the existing session open even when waiting for non-database computations.
From: http://slick.lightbend.com/doc/3.2.0/dbio.html#transactions-and-pinned-sessions

Correct way to do a distributed mutex in Rails?

I am building a feature that requires application-level lock functionality.
The feature goes like this:
user logs in to the site
they hit a button which kicks off a bunch of API requests (this is a long-running, synchronous, process)
process finishes and all is well
The issue is that there can only be one instance of this process running at any one time. Any kind of double-submit will cause major problems.
My current strategy is to implement the following logic:
I will put a boolean field on a table that indicates whether or not the long-running process is currently active
when the user first submits their action, I will update that boolean using a lock:
pc = ProcessControl.first
pc.with_lock do
  if pc.process_is_running?
    return # abort: another request already started the process
  else
    pc.process_is_running = true
    pc.save!
  end
end
LongRunningProcess.start!
then, the long process will run, and at the end, I'll flip the flag back to false.
So my question is: will this work in a distributed environment? I have multiple app servers, and I want to be sure that once the long-running process is off and running on one app server, any request to kick off the long-running process around the same time will read pc.process_is_running?, see that it is true, and abort, preventing the double-submit.
I have found some resources already indicating there are other ways to do a distributed lock; I'm hoping that this (maybe naive?) approach above will work.
Resources I've looked at:
http://makandracards.com/makandra/1026-simple-database-mutex-mysql-lock
https://github.com/mceachen/with_advisory_lock

MySQL transactions: reads while writing

I'm implementing PayPal Payments Standard in the website I'm working on. The question is not related to PayPal, I just want to present this question through my real problem.
PayPal can notify your server about a payment in two ways:
PayPal IPN - after each payment, PayPal sends a (server-to-server) notification to a URL (chosen by you) with the transaction details.
PayPal PDT - after a payment (if you set this up in your PP account), PayPal will redirect the user back to your site, passing the transaction id in the URL, so you can query PayPal about that transaction to get the details.
The problem is that you can't be sure which one happens first:
will your server be notified by IPN, or
will the user be redirected back to your site?
Whichever is happening first, I want to be sure I'm not processing a transaction twice.
So, in both cases, I query my DB with the transaction id coming from PayPal (and the payment status actually... but it doesn't matter now) to see if I have already saved and processed that transaction. If not, I process it and save the transaction id with the other transaction details into my database.
QUESTION
What happens if I start processing the first request (let it be the PDT, so the user was redirected back to my site but my server wasn't notified by IPN yet), but before I actually save the transaction to the database, the second request (the IPN) arrives and tries to process the transaction too, because it doesn't find it in the DB?
I would love to make sure that while I'm writing a transaction into the database, no other query can read the table looking for that given transaction id.
I'm using InnoDB, and don't want to lock the whole table, for the time of the write.
Can this be solved simply with transactions, or do I have to lock that row "manually"? I'm really confused, and I hope some more experienced MySQL developers can help make this clear for me and solve the problem.
Native database locks are almost useless in a Web context, particularly in situations like this. MySQL connections are generally NOT persistent - when a script shuts down, so does the MySQL connection; all locks are released and any in-flight transactions are rolled back.
e.g.
situation 1: You direct a user to PayPal's site to complete the purchase.
When they head off to PayPal, the script which sent the HTTP redirect terminates and shuts down. Locks/transactions are released/rolled back, and the user comes back in a "virgin" state as far as the DB is concerned. Their record is no longer locked.
situation 2: PayPal does a server-to-server response. This is done via a completely separate HTTP connection, utterly distinct from the one established between the user and your server. That means any locks you take in the yourserver<->user connection are held by a different session than the paypal<->yourserver one, so the PayPal response will run into those locked tables/rows. And of course, there's no way of predicting when the PayPal response comes in. If the network gods smile upon you and PayPal's not swamped, you get a response very quickly, possibly while the user<->you connection is still open. If things are slow and the response is delayed, that response MAY encounter unlocked tables/rows because the user<->server session has completed.
You COULD use persistent MySQL connections, but they open up a whole other world of pain. e.g. consider the case where your script has a bug which gets triggered halfway through processing. You connect, do some transaction work, set up some locks... and then the script dies. Because the MySQL connection is persistent, MySQL will NOT see that the client script has died, and it will keep the transactions/locks in flight. But the connection is still sitting there in the shared pool, waiting for another session to pick it up. When it inevitably is picked up, that new script has no idea that it's been handed this old "stale" connection. It'll step into the middle of a mess of locks and transactions it has no idea exist. You can VERY easily get yourself into a deadlock situation like this, because your buggy scripts have dumped garbage all over the system and other scripts cannot cope with that garbage.
Basically, unless you implement your own locking mechanism on top of the system, e.g. UPDATE users SET locked=1 WHERE id=XXX, you cannot use native DB locking mechanisms in a Web context except in one-shot-per-script situations. Locks should never be held across multiple independent requests.
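For the PayPal case specifically, one application-level way to make the "check and process" step atomic is a UNIQUE (or primary) key on the transaction id, so whichever request inserts the row first wins. A minimal sketch with PDO (the payments table and processPayment() are made-up names):

<?php
// Assumed schema: CREATE TABLE payments (txn_id VARCHAR(32) PRIMARY KEY, status VARCHAR(16));
$stmt = $pdo->prepare('INSERT IGNORE INTO payments (txn_id, status) VALUES (?, ?)');
$stmt->execute([$txnId, 'processing']);

if ($stmt->rowCount() === 1) {
    processPayment($txnId); // we inserted the row, so we "own" this transaction
}
// rowCount() === 0 means the row already existed: the other notification
// (IPN or PDT) got here first, so do nothing.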

Implementing locking in MySQL

I want to use a MySQL row-level lock. I can't lock the complete table. I want to avoid two processes handling two different messages for the same server at the same time.
What I thought is that I can have a table called server_lock, and if one process starts working on a server, it inserts a row into that table.
The problem with this approach is that if the application crashes, we need to remove the lock manually.
Is there a way to take a row-level lock such that the lock gets released if the application crashes?
Edit
I am using C++ as the language.
My application is similar to a message queue, except that there are two queues, each populated by its own process. If both processors handle actions belonging to the same object at the same time, it may result in wrong data. So I want a locking mechanism between these two queues so that the two processors don't modify the same object at the same time.
I can think of two ways:
Implement some error handler in your program where you remove the lock. Without knowing anything about your program it is hard to say how, but most languages have some way to run cleanup work before exiting on a crash. This is dangerous, though, because a crash happens when something is not right; if you continue to do any work, it is possible that you corrupt the database or something like that.
Periodically update the lock. Add a thread to your program that periodically refreshes the lock's timestamp, or refresh it inside some loop you are already running. Then, when a lock has not been updated in a while, you know it belonged to a program that crashed (see the sketch below).
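A minimal sketch of the second approach, in PHP/PDO for brevity (the SQL is the portable part; the table and column names are made up). A lock is considered stale, and may be stolen, once its heartbeat is older than a timeout:

<?php
// Assumed schema:
//   CREATE TABLE server_lock (
//       server_id INT PRIMARY KEY,
//       owner     VARCHAR(64) NOT NULL,
//       heartbeat TIMESTAMP   NOT NULL
//   );
const LOCK_TIMEOUT = 30; // seconds without a heartbeat before a lock is stale

// Try to take the lock: delete a stale row if any, then insert our own.
// INSERT IGNORE fails silently if a live lock row already exists.
function acquire(PDO $pdo, int $serverId, string $me): bool {
    $pdo->prepare('DELETE FROM server_lock WHERE server_id = ?
                   AND heartbeat < NOW() - INTERVAL ' . LOCK_TIMEOUT . ' SECOND')
        ->execute([$serverId]);
    $stmt = $pdo->prepare('INSERT IGNORE INTO server_lock (server_id, owner, heartbeat)
                           VALUES (?, ?, NOW())');
    $stmt->execute([$serverId, $me]);
    return $stmt->rowCount() === 1; // 1 row inserted = we own the lock
}

// Call this periodically from the processing loop while working.
function heartbeat(PDO $pdo, int $serverId, string $me): void {
    $pdo->prepare('UPDATE server_lock SET heartbeat = NOW()
                   WHERE server_id = ? AND owner = ?')
        ->execute([$serverId, $me]);
}

Here $me would be something unique per process, e.g. the hostname plus the PID. Note that MySQL also has connection-scoped advisory locks (GET_LOCK()/RELEASE_LOCK()), which the server releases automatically when the connection dies; if each process keeps its own long-lived connection, that may be a simpler fit for the crash case.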

How To Mutex Across a Network?

I have a desktop application that runs on a network and every instance connects to the same database.
So, in this situation, how can I implement a mutex that works across all running instances that are connected to the same database?
In other words, I don't want two or more instances to run the same function at the same time. If one is already running the function, the other instances shouldn't have access to it.
PS: A database transaction won't solve it, because the function I want to mutex doesn't use the database. I've mentioned the database just because it can be used to exchange information across the running instances.
PS2: The function takes ~30 minutes to complete, so if a second instance tries to run it, I would like to display a nice message saying that it can't be performed right now because computer 'X' is already running that function.
PS3: The function has to be processed on the client machine, so I can't use stored procedures.
I think you're looking for a database transaction. A transaction will isolate your changes from all other clients.
Update:
You mentioned that the function doesn't currently write to the database. If you want to mutex this function, there has to be some central place to store the current mutex holder. The database can work for this - just add a new table that records the computer name of the current holder, and check that table before starting your function (sketch below).
I think there may be some confusion in your question, though. Mutexes should be about protecting resources. If your function is not accessing the database, then what shared resource are you protecting?
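A minimal sketch of that table-based mutex, in PHP/PDO for consistency with the rest of this page (the pattern is language-agnostic; mutex_holder and runExpensiveFunction() are made-up names):

<?php
// Assumed schema: CREATE TABLE mutex_holder (
//     function_name VARCHAR(64) PRIMARY KEY,
//     computer_name VARCHAR(64) NOT NULL
// );
$stmt = $pdo->prepare('INSERT IGNORE INTO mutex_holder (function_name, computer_name)
                       VALUES (?, ?)');
$stmt->execute(['expensive_function', gethostname()]);

if ($stmt->rowCount() === 0) {
    // Someone else holds the mutex: find out who, for the "nice message" in PS2.
    $q = $pdo->prepare('SELECT computer_name FROM mutex_holder WHERE function_name = ?');
    $q->execute(['expensive_function']);
    echo "Can't run right now: computer '{$q->fetchColumn()}' is already running it.";
} else {
    try {
        runExpensiveFunction(); // the ~30 minutes of client-side work
    } finally {
        $pdo->prepare('DELETE FROM mutex_holder WHERE function_name = ?')
            ->execute(['expensive_function']);
    }
}

The obvious caveat is the one raised in the other answers here: if the client crashes before the DELETE, the row stays behind, so you may want to pair this with a heartbeat or a timeout.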
Put the code inside a transaction, either in the app or (better) inside a stored procedure, and call the stored procedure.
The transaction mechanism will isolate the code between the callers.
Alternatively, consider a message queue. As mentioned, the DB should manage all of this for you, either with transactions or with serialized access to tables (à la MyISAM).
In the past I have done the following:
Create a table that basically has two fields, function_name and is_running
I don't know what RDBMS you are using, but most have a way to lock individual records for update. Here is some pseudocode based on Oracle:
BEGIN TRANSACTION;
-- FOR UPDATE locks this row from the read until the commit.
SELECT is_running FROM function_table WHERE function_name = 'foo' FOR UPDATE;
-- Check here to see if it is running; if not, you can claim it:
UPDATE function_table SET is_running = 'Y' WHERE function_name = 'foo';
COMMIT;
Now I don't have the Oracle PL/SQL docs with me, but you get the idea. The FOR UPDATE clause locks the record from the read until the commit, so other processes will block on that SELECT statement until the current process commits.
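The same pattern works on MySQL/InnoDB. A sketch with PDO, using the same table as above:

<?php
$pdo->beginTransaction();
try {
    // FOR UPDATE row-locks the matching row until commit/rollback,
    // so concurrent callers block right here on the SELECT.
    $stmt = $pdo->prepare('SELECT is_running FROM function_table WHERE function_name = ? FOR UPDATE');
    $stmt->execute(['foo']);

    if ($stmt->fetchColumn() === 'Y') {
        $pdo->rollBack(); // someone else is running it; give up
    } else {
        $pdo->prepare("UPDATE function_table SET is_running = 'Y' WHERE function_name = ?")
            ->execute(['foo']);
        $pdo->commit();   // releases the row lock; we now "own" the function
    }
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}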
You can use Terracotta to implement such functionality, if you've got a Java stack.
Even if your function does not currently use the database, you could still solve the problem with a specific table for the purpose of synchronizing this function. The specifics would depend on your DB and how it handles isolation levels and locking. For example, with SQL Server you would set the transaction isolation to repeatable read, read a value from your locking row and update it inside a transaction. Don't commit the transaction until your function is done. You can also use explicit table locks in a transaction on most databases which might be simpler. This is probably the simplest solution given you are already using a database.
If you do not want to rely on the database for whatever reason, you could write a simple service that accepts TCP connections from your clients. Each client would request permission to run and report back when done, and the server would ensure only one client gets permission at a time. Dead clients would eventually drop their TCP connection and be detected, as long as you have the correct keep-alive settings (a sketch follows below).
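A sketch of such a permission server, in PHP for illustration (assumptions: a single token, and a client holds permission for as long as its TCP connection stays open, so a crashed client releases it when the OS closes its socket):

<?php
$server = stream_socket_server('tcp://0.0.0.0:9000', $errno, $errstr)
    or die("$errstr ($errno)\n");

$holder  = null; // [socket, name] of the current permission holder
$clients = [];   // all connected client sockets

while (true) {
    $read = $clients;
    $read[] = $server;
    $w = $e = null;
    stream_select($read, $w, $e, null); // block until something happens

    foreach ($read as $sock) {
        if ($sock === $server) { // new client connecting
            $clients[] = stream_socket_accept($server);
        } elseif (($line = fgets($sock)) === false) { // client disconnected
            if ($holder !== null && $sock === $holder[0]) {
                $holder = null; // holder died or finished: release the token
            }
            unset($clients[array_search($sock, $clients, true)]);
            fclose($sock);
        } elseif ($holder === null) { // client asks to run and the token is free
            $holder = [$sock, trim($line)];
            fwrite($sock, "GRANTED\n");
        } else { // token is taken: tell the client who has it (for the PS2 message)
            fwrite($sock, "BUSY {$holder[1]}\n");
        }
    }
}

The client protocol would be: connect, send your computer name, wait for GRANTED (or display the BUSY message), do the work, then close the connection to release the token.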
The message queue solution suggested by Xepoch would also work. You could use something like MSMQ or Java Message Queue and have a single message act as a run token. All your clients would request the message and then repost it when done. You risk a deadlock if a client dies before reposting, so you would need some logic to detect this, and it might get complicated.