In my project there is a table called process_detail. A row is inserted into this table as soon as a cron process starts, and it is updated when the cron process completes. We are using Grails, which manages transactions at the service-method level: the transaction starts when the method is entered, commits if the method completes successfully, and rolls back on any exception.
The problem is that if the transaction fails, this row is rolled back too, which I don't want, because this is effectively a log table. I tried creating a nested transaction to save the row and update it at the end, but that fails with a lock acquisition exception.
I am considering using MyISAM for this particular table; that way I don't have to worry about transactions, because MyISAM does not support them: writes are committed immediately and no rollback is possible. Here's pseudo code for what I am trying to achieve.
def someProcess() {
    // Transaction starts
    saveProcessDetail(details)   // Commit this immediately; should not roll back if the code below fails
    someOtherWork()
    updateProcessDetail(details) // Commit this immediately as well
    // Transaction ends
}
Pseudo code for saving and updating the process detail:
def saveProcessDetail(processName, processStatus) {
    ProcessDetail pd = new ProcessDetail(processName, processStatus)
    pd.save()
}

def updateProcessDetail(processDetail, processStatus) {
    processDetail.processStatus = processStatus
    processDetail.save()
}
Please advise if there is a better way of doing this with InnoDB. The answer can be at the MySQL level; I can figure out the Grails side myself. Let me know if any other info is required.
Make someProcess @NonTransactional, then manage the transactional behaviour yourself: write the initial saveProcessDetail with flush: true, then make the remainder of the processing transactional, e.g. with withTransaction.
Or
@NonTransactional
def someProcess() {
    saveProcessDetail(details) // I'd still use a flush: true
    transactionalProcessWork()
}

@Transactional
def transactionalProcessWork() {
    someOtherWork()
    updateProcessDetail(details)
}
I have a piece of Go code with 3 different functions: insertIntoMysql, updateRowMysql and deleteRowMysql. I check the operation type and run one of these functions as needed.
I want to convert my normal functions into goroutines to be able to handle more operations.
But here is the issue:
If I convert them into goroutines, I will lose the sequence of the operations.
For example, insert operations are much more frequent than delete operations, so while inserts are being queued in the insert channel and the delete channel is empty, it is possible for my code to try to delete a row before it gets inserted (e.g. a row is inserted and then deleted 1 second later).
Any ideas on how to make sure the sequence of my operations against MySQL is the same as the order in which the operations were received?
Here is the code:
go insertIntoMysql(insertChan, fields, db, config.DestinationTable)
go updatefieldMysql(updateChan, fields, db, config.DestinationTable)
go deleteFromMysql(deleteChan, fields, db, config.DestinationTable)

for !opsDone {
    select {
    case op, open := <-mysqlChan:
        if !open {
            opsDone = true
            break
        }
        switch op.Operation {
        case "i":
            //fmt.Println("got insert operation")
            insertChan <- op
        case "u":
            fmt.Println("got update operation")
            updateChan <- op
        case "d":
            fmt.Println("got delete operation")
            deleteChan <- op
        case "c":
            fmt.Println("got command operation")
            // no logic yet
        }
    }
}
close(done)
close(insertChan)
close(deleteChan)
close(updateChan)
}
Looking at your code, your question/requirement is: stay in the order in which the original Go channel delivered the data.
Staying in order while still firing off multiple goroutines to process the data could be done like this:
func processStatements(mysqlChan chan *instructions) {
    var wg sync.WaitGroup
    prevOp := "" // previous operation type
    for {
        op, open := <-mysqlChan
        if !open {
            break
        }
        if prevOp != op.Operation {
            // Next operation type; it may have side effects on rows touched by
            // previous operations, so wait for those to finish before continuing.
            wg.Wait()
        }
        switch op.Operation {
        case "i":
            wg.Add(1)
            go processInsert(op, &wg)
        case "u":
            wg.Add(1)
            go processUpdate(op, &wg)
        case "d":
            wg.Add(1)
            go processDelete(op, &wg)
        case "c":
            // no logic yet
        }
        prevOp = op.Operation
    }
    // Previous goroutines might still be running; wait until they are done.
    wg.Wait()
    // Close any channels here.
}

func processInsert(op *instructions, wg *sync.WaitGroup) {
    defer wg.Done()
    // Do the actual insert, etc.
}
The main differences with your program are:
Processing is in order, safe against your base scenario.
Processing of the next operation type waits until the previous operation type has finished.
Multiple operations of the same type run in parallel, whereas in your code a single goroutine per type processes its operations sequentially. Depending on the data distribution, either approach can be faster or slower: for the sequence i,i,i,d,u there are 3 waits in this code and also 3 waits in yours, but here the 3 inserts run in parallel while in your code they run sequentially. (Also look into MySQL's INSERT ... ON DUPLICATE KEY UPDATE, since depending on your data your inserts could suddenly be updates.)
How would the one that calls delete know about the row in the first place? From a SQL perspective: if it is not committed, it does not exist. Something calls your code to insert. Your code should return only after the transaction is completed. At which point the caller could start select, update or delete.
For example, in a web server: only the client that called insert knows of the record's existence initially. Other clients will only learn of its existence after running a select, and that select will only return the new record once the transaction that inserted it has committed. At that point they might decide to upsert or delete.
Many SQL databases take care of proper row locking, which will ensure data integrity during concurrent access. So if you use Go routines to write to the same record, they will become sequential (in any given order) on the DB side. Concurrent reads could still happen.
Please note that net/http already runs a Go routine for every request. And so do most other server front-ends for Go. If you are writing something different altogether, like a custom TCP listener, you could initiate your own Go routine for each request.
As far as I understand the problem, the following can be done to resolve it:
Call insert, delete or update as goroutines; they will run concurrently, performing all these operations.
Ensure you have row-level locking in your MySQL DB (InnoDB provides that).
For your delete and update operations, add an isExist check to verify that the row you are updating/deleting exists before acting on it; this gives you some level of control.
You can implement a retry mechanism in your channel (preferably with jitter) so that even if a delete operation arrives before its insert, it fails the existence check and is retried (after, say, 0.5 or 1 second, or some configurable delay), by which point the insert should already have been performed.
Hope this helps.
I have a Spring application which updates an entity in MySQL using a @Transactional method. Within the same method, I call an endpoint of a second Spring application via an @Async method; that application reads the same entity from MySQL and updates a value in Redis storage.
The problem is that every time I update a value on the entity, sometimes it gets updated in Redis and sometimes it doesn't.
When I tried to debug, I found that the second application sometimes reads the old value of the entity from MySQL instead of the updated one.
Can anyone suggest what can be done to make sure the second application always picks up the updated value of the entity from MySQL?
The answer from M. Deinum is good, but there is still another way to achieve this which may be simpler in your case, depending on the state of your current application.
You can simply wrap the call to the async method in an event that is processed after your current transaction commits, so you will read the updated entity from the DB correctly every time.
It's quite simple to do; let me show you:
import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.support.TransactionSynchronization;
import org.springframework.transaction.support.TransactionSynchronizationManager;
@Transactional
public void doSomething() {
    // application code here

    // this code will still execute async - but only after the
    // outer transaction that surrounds this lambda is completed
    executeAfterTransactionCommits(() -> theOtherServiceWithAsyncMethod.doIt());

    // more business logic here, in the same transaction
}

private void executeAfterTransactionCommits(Runnable task) {
    TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
        @Override
        public void afterCommit() {
            task.run();
        }
    });
}
Basically what happens here is that we supply an implementation of the current transaction's synchronization callback and override only the afterCommit method; there are other methods there that might be useful, so check them out. To avoid typing the same boilerplate wherever you want to use this, and to keep the calling method readable, I extracted the registration into a helper method.
The solution is not that hard: apparently you want to trigger an update after the data has been written to the database. @Transactional only commits after the method has finished executing. If an @Async method is called at the end of that method, then depending on the duration of the commit (or of the actual REST call) the transaction may or may not have committed yet.
As something outside of your transaction can only see committed data, it might see the updated value (if already committed) or still the old one. This also depends on the isolation level of your transaction, but you generally don't want to hold an exclusive lock on the database for performance reasons.
To fix this, the @Async method should not be called from inside the @Transactional method but right after it returns. That way the data is always committed and the other service will see the updated data.
@Service
public class WrapperService {

    private final TransactionalEntityService service1;
    private final AsyncService service2;

    public WrapperService(TransactionalEntityService service1, AsyncService service2) {
        this.service1 = service1;
        this.service2 = service2;
    }

    public void updateAndSyncEntity(Entity entity) {
        service1.update(entity); // update in the DB first
        service2.sync(entity);   // after the commit, trigger a sync with the remote system
    }
}
This service is non-transactional, so service1.update (which, presumably, is @Transactional) will update the database and commit. When that is done, you can trigger the external sync.
I have the following method in my DAL:
public void SavePlan()
{
using (TransactionScope scope =
new TransactionScope(TransactionScopeOption.RequiresNew))
{
CallSaveDataProc();
CallLogMsgProc();
scope.Complete();
}
}
I have deliberately put a COMMIT TRANSACTION in CallLogMsgProc without it having started a transaction of its own. This results in a SqlException being thrown from the CallLogMsgProc procedure, and scope.Complete() never executes.
However, in my database I can still see the records saved by the first call, CallSaveDataProc. Am I doing something wrong?
Starting and committing transactions have to be paired, and each pair should ideally be in the same scope (though one pair doesn't have to be in the same scope as another pair).
So what you have is: a transaction started via the new TransactionScope, followed by the COMMIT in your stored procedure (which saves the work, as you are seeing), followed by an attempt to commit the transaction "seen" by the TransactionScope, which has by then become invalid.
Code
double timeout_in_hours = 6.0;
MyDataContext db = new MyDataContext();
using (TransactionScope tran = new TransactionScope(
    TransactionScopeOption.Required,
    new TransactionOptions()
    {
        IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted,
        Timeout = TimeSpan.FromHours(timeout_in_hours)
    },
    EnterpriseServicesInteropOption.Automatic))
{
    int total_records_processed = 0;
    foreach (DataRow datarow in data.Rows)
    {
        // Code runs some commands on the DataContext (db),
        // possibly reading/writing records and calling db.SubmitChanges
        total_records_processed++;
        try
        {
            db.SubmitChanges();
        }
        catch (Exception err)
        {
            MessageBox.Show(err.Message);
        }
    }
    tran.Complete();
    return total_records_processed;
}
While the above code is running, it successfully completes 600 to 700 loop iterations. However, after 10 to 20 minutes, the catch block above catches the following error:
{"The transaction associated with the current connection has completed but has not been disposed. The transaction must be disposed before the connection can be used to execute SQL statements."}
The tran.Complete call is never made, so why is it saying the transaction associated with the connection is completed?
Why, after successfully submitting hundreds of changes, does the connection associated with the DataContext suddenly enter a closed state? (That's the other error I sometimes get here).
When profiling SQL Server, there are just lots of consecutive selects and inserts, with really nothing else, while it's running. The very last thing the profiler catches is a sudden "Audit Logout", and I'm not sure whether that's the cause of the problem or a side effect of it.
Wow, the max timeout is limited by machine.config: http://forums.asp.net/t/1587009.aspx/1
"OK, we resolved this issue. Apparently the .NET 4.0 framework doesn't allow you to set your TransactionScope timeouts in code as we have done in the past. We had to make the machine.config changes by adding

<system.transactions>
    <machineSettings maxTimeout="02:00:00"/>
    <defaultSettings timeout="02:00:00"/>
</system.transactions>

to the machine.config file. Using the 2.0 framework we did not have to make these entries, as our code was overriding the default value to begin with."
It seems that the timeout you set in TransactionScope's constructor is ignored or overridden by a maximum timeout setting in the machine.config file. There is no mention of this in the documentation for the TransactionScope constructor that accepts a timeout parameter: http://msdn.microsoft.com/en-us/library/9wykw3s2.aspx
This makes me wonder: what if this were a shared hosting environment, where I could not access the machine.config file? There's really no way to break up the transaction, since it involves creating data in multiple tables with relationships and identity columns whose values are auto-incremented. What a poor design decision. If this was meant to protect servers in shared hosting, it's pointless, because such a long-running transaction would be isolated to my own database anyway. Also, if a program specifies a longer timeout, it obviously expects the transaction to take longer, so it should be allowed. This limitation is just a pointless handicap, IMO, that is going to cause problems. See also: TransactionScope maximumTimeout
I have a question about testing the queries in a transaction. I've been using MySQL transactions for quite some time now, and every time I do this, I use something like:
$doCommit = true;
$error = "";
mysql_query("BEGIN");

/* Repeat this part for the different queries in the transaction;
   this often involves updating and inserting in multiple tables */
$query = "SELECT, UPDATE, INSERT, etc";
$result = mysql_query($query);
if (!$result) {
    $error .= mysql_error() . " in " . $query . "<BR>";
    $doCommit = false;
}
/* end of repeating part */

if ($doCommit) {
    mysql_query("COMMIT");
} else {
    echo $error;
    mysql_query("ROLLBACK");
}
Now, it often happens that I want to test my transaction, so I change mysql_query("COMMIT"); to mysql_query("ROLLBACK");, but I can imagine this is not a very good way to test this kind of thing. It's usually not really feasible to copy every table to a temp table, update and insert into those, and delete them afterwards (for instance because the tables may be very large). Of course, when the code goes into production, proper error handling (instead of just printing the error) is put in place.
What's the best way to do stuff like this?
First of all, there is a bug in your implementation. If a query errors out, the current transaction is automatically rolled back and then closed. So as you continue to execute queries, they will not be within a transaction (they will be committed to the DB immediately). Then, when you execute ROLLBACK, it will silently fail. From the MySQL docs:
Rolling back can be a slow operation that may occur implicitly without the user
having explicitly asked for it (for example, when an error occurs).
The explicit command ROLLBACK should only be used if you determine in the application that you need to rollback (for reasons other than a query error). For example, if you're deducting funds from an account, you'd explicitly rollback if you found out the user didn't have enough funds to complete the exchange...
As far as testing the transactions goes, I do copy the database. I create a new database and install a set of "dummy data". Then I run all the tests using an automated tool. The tool will actually commit the transactions, force rollbacks, and check that the expected database state is maintained throughout the tests. Since it's harder to programmatically know the end state of a transaction when its input is unknown, testing against live (or even copied-from-live) data is not going to be easy. You can do it (and should), but don't depend on those results to determine whether your system is working; use them to build new test cases for the automated tester.
Maybe you could refactor your first example and use a DB-access wrapper class?
That wrapper class can have a variable $normalCommit = true;
and a method SetCommitMode() which sets that $normalCommit variable.
Its Commit() method then only commits if ($normalCommit == true).
You could even have a variable $failTransaction which issues mysql_query("ROLLBACK"); on demand (so you can pass/fail many sequential tests).
Then when you run the test, you can set somewhere in the test code file:
$myDBClass->SetCommitMode(false);
or
$myDBClass->RollBackNextOperation(true);
before the operation which you want to fail, and it will just fail. That way the code you are testing does not contain those fail/commit checks; only the DB class contains them.
Normally only the test code (especially if you do unit testing) should call those SetCommitMode and RollBackNextOperation methods, so that you do not accidentally leave those calls in the production code.
Or you could pass some invalid data to your method (if you are testing a method), like negative values to be saved in UNSIGNED fields; then your transaction should fail 100% of the time, provided your code does not commit after such an SQL error (which it should not do).
Generally I use something like this (using PDO for my example):
$db->beginTransaction();
try {
$db->exec('INSERT/DELETE/UPDATE');
$db->commit();
}
catch (PDOException $e) {
    $db->rollBack();
    // rethrow the exception or handle the error here
}
Or, if you have your own exception handler, use a special clause for PDOException where you roll back the execution. Example:
function my_exception_handler($exception) {
if($exception instanceof PDOException) {
// assuming you have a registry class
Registry::get('database')->rollBack();
}
}