#Never transaction attribute is terribly slow - mysql

I have written sort of a benchmark which estimates how different combinations of transaction attributes affect the performance of a Java EE program. The benchmark calls a method annotated with 'Y' annotation from method with 'X' annotation. Transactions in my benchmark cover the situation of a bank transfer:
#Required #RequiresNew
theCallerMethod() -> updateAccount(Account acc)
#RequiresNew
-> updateOwner(Company c)
#RequiresNew
-> addLogEntry(Transfer t)
So being in the context of a callerMethod transaction a container have to suspend the caller's transaction, start a new transaction, update an account, commit, switch to the caller's, suspend, start a new one, update a company, commit, return to caller's, suspend, start yet another one, add log entry, commit, and return to the caller method where finally commit the caller's transaction.
And I was quite surprised when it came out that the slowest calls was from #Never-annotated caller method: to perform 1000 described above call cases for #Required -> #Required scenario it took 5,71 sec., #Required -> #RequiresNew 6,35 sec., but 9,05 sec. for #Never -> #Not_Supported and 8,95 sec. for #Never -> #Supports.
Is it OK for #Never-contexts to execute for so long? I mean we even do not have a transaction to suspend and resume. Maybe there is some general knowledge about #Never transaction attribute that I have missed?
I use Java EE 6, GlassFish 3, MySQL 5.1.69 InnoDB.
Thanks in advance.

I mean we even do not have a transaction to suspend and resume.
I would not be so sure about that. This is what the ejb3.1 specification says:
13.6.5 Handling of Methods that Run with “an unspecified transaction context”
The EJB specification does not prescribe how the container should manage the execution of a method with an unspecified transaction context the transaction semantics are left to the container implementation.
Some techniques for how the container may choose to implement the execution of a method with
an unspecified transaction context are as follows (the list is not inclusive of all possible strategies):
(among other possibilities)
The container may treat each call of an instance to a resource manager as a single transaction
(e.g. the container may set the auto-commit option on a JDBC connection).

Related

Grails Immediate commit for objects in a transaction

In my project there is a table called process_detail. Row inserted in this table as soon as a cron process starts and is updated at the end
of the cron process completion. We are using grails which internally takes care of transaction at service level method i.e. transaction starts at the start of the method, commit if the method execution successful, rollback if any exception.
Here what happens is that if the transaction fails this row also being rolled back this I don't want because this is type of a log
table. I tried creating a nested transaction and save this row and at the end update it but that fails with lock acquisition exception.
I am thinking of using MyISAM for this particular table,
this way I don't have to worry about transaction because MyISAM does not support it and it will commit immediately and no rollback possible. Here's pseudo code for what I am trying to achieve.
def someProcess(){
//Transaction starts
saveProcessDetail(details); //Commit this immediately, should not rollback if below code fails.
someOtherWork;
updateProcessDetail(details); //Commit this immediately, should
//Transaction Ends
}
Pseudo code for save and update process detail;
def saveProcessDetail(processName, processStatus){
ProcessDetail pd = new ProcessDetail(processName, processStatus);
pd.save();
}
def updateProcessDetail(processDetail, processStatus){
pd.procesStatus = processStatus;
pd.save();
}
Please advice if there is better of doing this in InnoDB. Answer could be mysql level I can find the grails solution my self. Let me know if any other info required.
Make someProcess #NonTransactional, then manage the transactional nature yourself. Write the initial saveProcessDetail with a flush:true, then make the remainder of the processing transactional, withTransaction?
Or
#NonTransactional
def someProcess() {
saveProcessDetail(details) // I'd still use a flush:true
transactionalProcessWork()
}
#Transactional
def transactionalProcessWork() {
someOtherWork()
updateProcessDetail(details)
}

RxJava: retryWhen with retry limit

I am new to ReactiveX and reactive programming in general. I need to implement a retry mechanism for Couchbase CAS operations, but the example on the Couchbase website shows a retryWhen which seems to retry indefinitely. I need to have a retry limit and retry count somewhere in there.
The simple retry() would work, since it accepts a retryLimit, but I don't want it to retry on every exception, only on CASMismatchException.
Any ideas? I'm using the RxJava library.
In addition to what Simon Basle said, here is a quick version with linear backoff:
.retryWhen(notification ->
notification
.zipWith(Observable.range(1, 5), Tuple::create)
.flatMap(att ->
att.value2() == 3 ? Observable.error(att.value1()) : Observable.timer(att.value2(), TimeUnit.SECONDS)
)
)
Note that "att" here is a tuple which consists of both the throwable and the number of retries, so you can very specifically implement a return logic based on those two params.
If you want to learn even more, you can peek at the resilient doc I'm currently writing: https://gist.github.com/daschl/db9fcc9d2b932115b679#retry-with-delay
retryWhen is clearly a little bit more complicated than simple retry, but here's the gist of it:
you pass a notificationHandler function to retryWhen which takes an Observable<Throwable> and outputs an Observable<?>
the emission of this returned Observable determine when retry should occur or stop
so, for each occurring Exception in the original stream, if the handler's one emits 1 item, there'll be 1 retry. If it emits 2 items, there'll be 2...
as soon as the handler's stream emits an error, retry is aborted.
Using this, you can both:
work only on CasMismatchExceptions: just have your function return an Observable.error(t) in other cases
retry only for a specific number of times: for each exception, flatMap from an Observable.range representing the max number of retries, have it return an Observable.timer using the retry # if you need increasing delays.
Your use case is pretty close to the one in RxJava doc here
reviving this thread since in the Couchbase Java SDK 2.1.2 there's a new simpler way to do that: use the RetryBuilder:
Observable<Something> retryingObservable =
sourceObservable.retryWhen(
RetryBuilder
//will limit to the relevant exception
.anyOf(CASMismatchException.class)
//will retry only 5 times
.max(5)
//delay doubling each time, from 100ms to 2s
.delay(Delay.linear(TimeUnit.MILLISECONDS, 2000, 100, 2.0))
.build()
);

DataContext connection closed or transaction completed unexpectedly while submitting changes within a TransactionScope transaction?

Code
double timeout_in_hours = 6.0;
MyDataContext db = new MyDataContext();
using (TransactionScope tran = new TransactionScope( TransactionScopeOption.Required, new TransactionOptions(){ IsolationLevel= System.Transactions.IsolationLevel.ReadCommitted, Timeout=TimeSpan.FromHours( timeout_in_hours )}, EnterpriseServicesInteropOption.Automatic ))
{
int total_records_processed = 0;
foreach (DataRow datarow in data.Rows)
{
//Code runs some commands on the DataContext (db),
//possibly reading/writing records and calling db.SubmitChanges
total_records_processed++;
try
{
db.SubmitChanges();
}
catch (Exception err)
{
MessageBox.Show( err.Message );
}
}
tran.Complete();
return total_records_processed;
}
While the above code is running, it successfully completes 6 or 7 hundred loop iterations. However, after 10 to 20 minutes, the catch block above catches the following error:
{"The transaction associated with the current connection has completed but has not been disposed. The transaction must be disposed before the connection can be used to execute SQL statements."}
The tran.Complete call is never made, so why is it saying the transaction associated with the connection is completed?
Why, after successfully submitting hundreds of changes, does the connection associated with the DataContext suddenly enter a closed state? (That's the other error I sometimes get here).
When profiling SQL Server, there are just a lot of consecutive selects and inserts with really nothing else while its running. The very last thing the profiler catches is a sudden "Audit Logout", which I'm not sure if that's the cause of the problem or a side-effect of it.
Wow, the max timeout is limited by machine.config: http://forums.asp.net/t/1587009.aspx/1
"OK, we resolved this issue. apparently the .net 4.0 framework doesn't
allow you to set your transactionscope timeouts in the code as we have
done in the past. we had to make the machine.config changes by adding
< system.transactions> < machineSettings maxTimeout="02:00:00"/>
< defaultSettings timeout="02:00:00"/> < /system.transactions>
to the machine.config file. using the 2.0 framework we did not have
to make these entries as our code was overriding teh default value to
begin with."
It seems that the timeout you set in TransactionScope's constructor is ignored or defeated by a maximum timeout setting in the machine.config file. There is no mention of this in the documentation for the TransactionScope's constructor that accepts a time out parameter: http://msdn.microsoft.com/en-us/library/9wykw3s2.aspx
This makes me wonder, what if this was a shared hosting environment I was dealing with, where I could not access the machine.config file? There's really no way to break up the transaction, since it involves creating data in multiple tables with relationships and identity columns whose values are auto-incremented. What a poor design decision. If this was meant to protect servers with shared hosting, it's pointless, because such a long-running transaction would be isolated to my own database only. Also, if a program specifies a longer timeout, then it obviously expects a transaction to take a longer amount of time, so it should be allowed. This limitation is just a pointless handicap IMO that's going to cause problems. See also: TransactionScope maximumTimeout

Why EclipseLink 's auto commit doesn't work with MySQL?

Using the following code:
EntityManager manager = factory.createEntityManager();
manager.setFlushMode(FlushModeType.AUTO);
PhysicalCard card = new PhysicalCard();
card.setIdentifier("012345ABCDEF");
card.setStatus(CardStatusEnum.Assigned);
manager.persist(card);
manager.close();
when code runs to this line, the "card" record does not appear in the database. However, if using the FlushModeType.COMMIT, and using transaction like this:
EntityManager manager = factory.createEntityManager();
manager.setFlushMode(FlushModeType.COMMIT);
manager.getTransaction().begin();
PhysicalCard card = new PhysicalCard();
card.setIdentifier("012345ABCDEF");
card.setStatus(CardStatusEnum.Assigned);
manager.persist(card);
manager.getTransaction().commit();
manager.close();
it works fine. From the eclipselink's log i can see the previous code doesn't issue an INSERT statement while the second code does.
Do I miss something here? I'm using EclipseLink 2.3 and mysql connection/J 5.1
I am assuming that you are using EclipseLink in a Java SE application, or in a Java EE application but with an application managed EntityManager instead of a container managed EntityManager.
In both scenarios, all updates made to the persistence context are flushed only when the transaction associated with the EntityManager commits (using EntityTransaction.commit), or when the EntityManager's persistence context is flushed (using EntityManager.flush). This is the reason why the second code snippet issues the INSERT as it invokes the EntityTransaction's begin and commit methods, while the first doesn't; an invocation of em.persist does not issue an INSERT.
As far as FlushModeType values are concerned, the API documentation states the following:
COMMIT
public static final FlushModeType COMMIT
Flushing to occur at transaction commit. The provider may flush at
other times, but is not required to.
AUTO
public static final FlushModeType AUTO
(Default) Flushing to occur at query execution.
Since, queries haven't been executed in the first case case, no flushing i.e. no INSERT statements corresponding to the persistence of the PhysicalCard entity will be issued. It is the explicit commit of the EntityTransaction in the second, that is resulting in the INSERT statement being issued.

What is an idempotent operation?

What is an idempotent operation?
In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. For example, removing an item from a set can be considered an idempotent operation on the set.
In mathematics, an idempotent operation is one where f(f(x)) = f(x). For example, the abs() function is idempotent because abs(abs(x)) = abs(x) for all x.
These slightly different definitions can be reconciled by considering that x in the mathematical definition represents the state of an object, and f is an operation that may mutate that object. For example, consider the Python set and its discard method. The discard method removes an element from a set, and does nothing if the element does not exist. So:
my_set.discard(x)
has exactly the same effect as doing the same operation twice:
my_set.discard(x)
my_set.discard(x)
Idempotent operations are often used in the design of network protocols, where a request to perform an operation is guaranteed to happen at least once, but might also happen more than once. If the operation is idempotent, then there is no harm in performing the operation two or more times.
See the Wikipedia article on idempotence for more information.
The above answer previously had some incorrect and misleading examples. Comments below written before April 2014 refer to an older revision.
An idempotent operation can be repeated an arbitrary number of times and the result will be the same as if it had been done only once. In arithmetic, adding zero to a number is idempotent.
Idempotence is talked about a lot in the context of "RESTful" web services. REST seeks to maximally leverage HTTP to give programs access to web content, and is usually set in contrast to SOAP-based web services, which just tunnel remote procedure call style services inside HTTP requests and responses.
REST organizes a web application into "resources" (like a Twitter user, or a Flickr image) and then uses the HTTP verbs of POST, PUT, GET, and DELETE to create, update, read, and delete those resources.
Idempotence plays an important role in REST. If you GET a representation of a REST resource (eg, GET a jpeg image from Flickr), and the operation fails, you can just repeat the GET again and again until the operation succeeds. To the web service, it doesn't matter how many times the image is gotten. Likewise, if you use a RESTful web service to update your Twitter account information, you can PUT the new information as many times as it takes in order to get confirmation from the web service. PUT-ing it a thousand times is the same as PUT-ing it once. Similarly DELETE-ing a REST resource a thousand times is the same as deleting it once. Idempotence thus makes it a lot easier to construct a web service that's resilient to communication errors.
Further reading: RESTful Web Services, by Richardson and Ruby (idempotence is discussed on page 103-104), and Roy Fielding's PhD dissertation on REST. Fielding was one of the authors of HTTP 1.1, RFC-2616, which talks about idempotence in section 9.1.2.
No matter how many times you call the operation, the result will be the same.
Idempotence means that applying an operation once or applying it multiple times has the same effect.
Examples:
Multiplication by zero. No matter how many times you do it, the result is still zero.
Setting a boolean flag. No matter how many times you do it, the flag stays set.
Deleting a row from a database with a given ID. If you try it again, the row is still gone.
For pure functions (functions with no side effects) then idempotency implies that f(x) = f(f(x)) = f(f(f(x))) = f(f(f(f(x)))) = ...... for all values of x
For functions with side effects, idempotency furthermore implies that no additional side effects will be caused after the first application. You can consider the state of the world to be an additional "hidden" parameter to the function if you like.
Note that in a world where you have concurrent actions going on, you may find that operations you thought were idempotent cease to be so (for example, another thread could unset the value of the boolean flag in the example above). Basically whenever you have concurrency and mutable state, you need to think much more carefully about idempotency.
Idempotency is often a useful property in building robust systems. For example, if there is a risk that you may receive a duplicate message from a third party, it is helpful to have the message handler act as an idempotent operation so that the message effect only happens once.
A good example of understanding an idempotent operation might be locking a car with remote key.
log(Car.state) // unlocked
Remote.lock();
log(Car.state) // locked
Remote.lock();
Remote.lock();
Remote.lock();
log(Car.state) // locked
lock is an idempotent operation. Even if there are some side effect each time you run lock, like blinking, the car is still in the same locked state, no matter how many times you run lock operation.
An idempotent operation produces the result in the same state even if you call it more than once, provided you pass in the same parameters.
An idempotent operation is an operation, action, or request that can be applied multiple times without changing the result, i.e. the state of the system, beyond the initial application.
EXAMPLES (WEB APP CONTEXT):
IDEMPOTENT:
Making multiple identical requests has the same effect as making a single request. A message in an email messaging system is opened and marked as "opened" in the database. One can open the message many times but this repeated action will only ever result in that message being in the "opened" state. This is an idempotent operation. The first time one PUTs an update to a resource using information that does not match the resource (the state of the system), the state of the system will change as the resource is updated. If one PUTs the same update to a resource repeatedly then the information in the update will match the information already in the system upon every PUT, and no change to the state of the system will occur. Repeated PUTs with the same information are idempotent: the first PUT may change the state of the system, subsequent PUTs should not.
NON-IDEMPOTENT:
If an operation always causes a change in state, like POSTing the same message to a user over and over, resulting in a new message sent and stored in the database every time, we say that the operation is NON-IDEMPOTENT.
NULLIPOTENT:
If an operation has no side effects, like purely displaying information on a web page without any change in a database (in other words you are only reading the database), we say the operation is NULLIPOTENT. All GETs should be nullipotent.
When talking about the state of the system we are obviously ignoring hopefully harmless and inevitable effects like logging and diagnostics.
Just wanted to throw out a real use case that demonstrates idempotence. In JavaScript, say you are defining a bunch of model classes (as in MVC model). The way this is often implemented is functionally equivalent to something like this (basic example):
function model(name) {
function Model() {
this.name = name;
}
return Model;
}
You could then define new classes like this:
var User = model('user');
var Article = model('article');
But if you were to try to get the User class via model('user'), from somewhere else in the code, it would fail:
var User = model('user');
// ... then somewhere else in the code (in a different scope)
var User = model('user');
Those two User constructors would be different. That is,
model('user') !== model('user');
To make it idempotent, you would just add some sort of caching mechanism, like this:
var collection = {};
function model(name) {
if (collection[name])
return collection[name];
function Model() {
this.name = name;
}
collection[name] = Model;
return Model;
}
By adding caching, every time you did model('user') it will be the same object, and so it's idempotent. So:
model('user') === model('user');
Quite a detailed and technical answers. Just adding a simple definition.
Idempotent = Re-runnable
For example,
Create operation in itself is not guaranteed to run without error if executed more than once.
But if there is an operation CreateOrUpdate then it states re-runnability (Idempotency).
Idempotent Operations: Operations that have no side-effects if executed multiple times.
Example: An operation that retrieves values from a data resource and say, prints it
Non-Idempotent Operations: Operations that would cause some harm if executed multiple times. (As they change some values or states)
Example: An operation that withdraws from a bank account
It is any operation that every nth result will result in an output matching the value of the 1st result. For instance the absolute value of -1 is 1. The absolute value of the absolute value of -1 is 1. The absolute value of the absolute value of absolute value of -1 is 1. And so on. See also: When would be a really silly time to use recursion?
An idempotent operation over a set leaves its members unchanged when applied one or more times.
It can be a unary operation like absolute(x) where x belongs to a set of positive integers. Here absolute(absolute(x)) = x.
It can be a binary operation like union of a set with itself would always return the same set.
cheers
In short, Idempotent operations means that the operation will not result in different results no matter how many times you operate the idempotent operations.
For example, according to the definition of the spec of HTTP, GET, HEAD, PUT, and DELETE are idempotent operations; however POST and PATCH are not. That's why sometimes POST is replaced by PUT.
An operation is said to be idempotent if executing it multiple times is equivalent to executing it once.
For eg: setting volume to 20.
No matter how many times the volume of TV is set to 20, end result will be that volume is 20. Even if a process executes the operation 50/100 times or more, at the end of the process the volume will be 20.
Counter example: increasing the volume by 1. If a process executes this operation 50 times, at the end volume will be initial Volume + 50 and if a process executes the operation 100 times, at the end volume will be initial Volume + 100. As you can clearly see that the end result varies based upon how many times the operation was executed. Hence, we can conclude that this operation is NOT idempotent.
I have highlighted the end result in bold.
If you think in terms of programming, let's say that I have an operation in which a function f takes foo as the input and the output of f is set to foo back. If at the end of the process (that executes this operation 50/100 times or more), my foo variable holds the value that it did when the operation was executed only ONCE, then the operation is idempotent, otherwise NOT.
foo = <some random value here, let's say -2>
{ foo = f( foo ) }   curly brackets outline the operation
if f returns the square of the input then the operation is NOT idempotent. Because foo at the end will be (-2) raised to the power (number of times operation is executed)
if f returns the absolute of the input then the operation is idempotent because no matter how many multiple times the operation is executed foo will be abs(-2).
Here, end result is defined as the final value of variable foo.
In mathematical sense, idempotence has a slightly different meaning of:
f(f(....f(x))) = f(x)
here output of f(x) is passed as input to f again which doesn't need to be the case always with programming.
my 5c:
In integration and networking the idempotency is very important.
Several examples from real-life:
Imagine, we deliver data to the target system. Data delivered by a sequence of messages.
1. What would happen if the sequence is mixed in channel? (As network packages always do :) ). If the target system is idempotent, the result will not be different. If the target system depends of the right order in the sequence, we have to implement resequencer on the target site, which would restore the right order.
2. What would happen if there are the message duplicates? If the channel of target system does not acknowledge timely, the source system (or channel itself) usually sends another copy of the message. As a result we can have duplicate message on the target system side.
If the target system is idempotent, it takes care of it and result will not be different.
If the target system is not idempotent, we have to implement deduplicator on the target system side of the channel.
For a workflow manager (as Apache Airflow) if an idempotency operation fails in your pipeline the system can retry the task automatically without affecting the system. Even if the logs change, that is good because you can see the incident.
The most important in this case is that your system can retry the task that failed and doesn't mess up the pipeline (e.g. appending the same data in a table each retry)
Let's say the client makes a request to "IstanceA" service which process the request, passes it to DB, and shuts down before sending the response. since the client does not see that it was processed and it will retry the same request. Load balancer will forward the request to another service instance, "InstanceB", which will make the same change on the same DB item.
We should use idempotent tokens. When a client sends a request to a service, it should have some kind of request-id that can be saved in DB to show that we have already executed the request. if the client retries the request, "InstanceB" will check the requestId. Since that particular request already has been executed, it will not make any change to the DB item. Those kinds of requests are called idempotent requests. So we send the same request multiple times, but we won't make any change