Missing branches when using assertTrue instead of assertNull

Missing branches when using assertTrue instead of assertNull - junit

In Java/Junit, I need to test for null with some object. There are a variety of ways I can test a condition but I have been using assertTrue for most of my tests. When I check for nulls in an assertTrue, EclEmma states that it is only testing one branch.
When I resolve the statement into a variable manually (like setting the result to a boolean and passing it into assertTrue) the code coverage is deemed complete on the assert but not on the variable initializing line.
Why is this happening? Is this related to the extra byte code that Java apparently adds as mentioned on http://sourceforge.net/apps/trac/eclemma/wiki/FilteringOptions? Any solutions (besides using other assert statements).
assertTrue:
assertTrue( myObject == null ); //1 of 2 branches
assertTrue:
boolean test = (myObject == null); //1 of 2 branches missing
assertTrue(test); // complete
assertNull:
assertNull( myObject ) //complete;

For most Boolean expressions, the Java compiler generates extra branches in the byte code.
JaCoCo produces "branch coverage" based on the generated byte code, not based on the original Java code, and hence shows additional branch coverage information for almost any Boolean expression you would use.
In your code, the Boolean expression you use is myObject == null.
To compute this value, the Java compiler generates code pushing the two arguments on the stack, and then doing a conditional jump in order to push 1 (true) or 0 (false) on the stack. JaCoCo reports the branch coverage of this conditional jump.
Thus, the fact that you use myObject == null triggers the behavior you describe.
As some other examples, try this:
boolean t = true;
boolean f = false;
boolean result1 = (t && f) || f; // 3 out of 6 missed.
boolean result2 = !t; // 1 out of 2 missed.
This can be useful if the Boolean expression is, for example, returned by a function, which is used as condition in an if-then-else statement somewhere else. While mostly a consequence of the way the Java compiler works, it helps to assess condition coverage (instead of mere branch coverage) of the original Java code.
This feature isn't too well documented, but here are some pointers:
The JaCoCo test cases for Boolean Expressions
A JaCoCo forum discussion on branches generated for the statement a = !a
does anyone have a pointer to more documentation?
So it is, indeed, related to the extra byte code is generated, but not to the specific examples of synthetic byte compiler constructs for which the filtering options are intended.
NOTE: Did major EDIT since initial answer was too much of a guess. Thanks to #ira-baxter for good & critical discussion.

The fact that Emma treats a conditional expression as "something with a branch" for (branch) coverage counting IMHO seems simply broken. It isn't a conditional branch.
We can argue more about the Assert; if it were defined as "throws exception on assert failure" then it really does have a conditonal branch; if it is defined [as I think i does, I'm not a Java expert] as "terminate my program on assert failure" then it isn't really a branch. Also obscure are method calls; these are conditional branches in the sense that if the called method throws an exception, control flow does not continue to the "rest of the statement".
Our Java Test Coverage tool gets the (branch) coverage analysis on such conditionals "right".

To get 100% code coverage on boolean methods, do the following
Class RecordService{
public boolean doesRecordExist(String id){
return id!=null;
}
}
//Method inside your mock
#Test
public boolean testDoesRecordExist(){
RecordService recordService = mock(RecordService.class);
when(recordService.doesRecordExists()).thenReturn(
anyString()).thenReturn(null);
}

Related

Is the "if" statement considered a method?

Interesting discussion came up among my peers as to whether or not the "if" statement is considered a method? Although "if" is appended with the word statement it still behaves similar to a simple non-return value method.
For example:
if(myValue) //myValue is the parameter passed in
{
//Execute
}
Likewise a method could perform the same operation:
public void MyMethod(myValue)
{
switch(myValue)
{
case true:
//Logic
break;
case false:
//Logic
break;
}
}
Is it accurate to call (consider) the "if" statement a simple predefined method in a programming language?

In languages such as C, C++, C#, Java, IF is a statement implemented as a reserved word, part of the core of the language. In programming languages of the LISP family (Scheme comes to mind) IF is an expression (meaning that it returns a value) and is implemented as a special form. On the other hand, in pure object-oriented languages such as Smalltalk, IF really is a method (more precisely: a message), typically implemented on the Boolean class or one of its subclasses.
Bottom line: the true nature of the conditional instruction IF depends on the programming language, and on the programming paradigm of that language.

No, the "if" statement is nothing like a method in C#. Consider the ways in which it is not like a method:
The entities in the containing block are in scope in the body of an "if". But a method does not get any access to the binding environment of its caller.
In many languages methods are members of something -- a type, probably. Statements are not members.
In languages with first-class methods, methods can be passed around as data. (In C#, by converting them to delegates.) "if" statements are not first-class.
and so on. The differences are myriad.
Now, it does make sense to think of some things as a kind of method, just not "if" statements. Many operators, for instance, are a lot like methods. There's very little conceptual difference between:
decimal x = y + z;
and
decimal x = Add(y, z);
And in fact if you disassemble an addition of two decimals in C#, you'll find that the generated code actually is a method call.
Some operators have unusual characteristics that make it hard to characterize them as methods though:
bool x = Y() && Z();
is different from
bool x = And(Y(), Z());
in a language that has eager evaluation of method arguments; in the first, Z() is not evaluated if Y() is false. In the second, both are evaluated.
Your creation of an "if" method rather begs the question; the implementation is more complicated than an "if" statement. Saying that you can emulate "if" with a switch is like saying that you can emulate a bicycle with a motorcycle; replacing something simple with something far more complex is not compelling. It would be more reasonable to point out that a switch is actually a fancy "if".

You can't create a myIfStatement() method and expect the following to work:
...
myIfStatement(something == somethingElse)
{
// execute if equal
}
else
{
// execute if different
}
if is a control statement, and cannot be replicated by a method, nor can you replace a method call with if:
myVariable = if(something == somethingElse);
if cannot be overloaded.
These are a few signs that if is not a method, but there are others I suspect.

Depends on the language for sure, but in C, java, perl, no, they're language commands. Reserved words. If they were functions, you'd be able to overload them and get pointers to them and do all the other things that you can do with functions.
This is more of a philiosophical question than a programming question though.

A method has a signature and its main intention is resuable logic, whereas if is simply a condition that controls the flow of execution.
If you understand assembly, you would know that both are different even on a very low level.

You can of course write If() and IfElse() methods but that does not make them the same.
if() is defined as a statement in the language , at the same level as method calls. But there are differences in a.o. syntax and optimization possibilities.
So: No, the if() statement is not a method. You cannot for instance not assign it to a delegate.

Considering the if statement to be a method only makes it confusing, in my opinion. The similarities with a method call is just superficial.
The if statement is one of the statements that control the execution flow. When it's compiled into native machine code, it will evaluate the expression and make a conditional jump.
Pseudo code:
load myValue, reg0
test reg0
jumpeq .skip
; code inside the if
.skip:
If you use else, you will get two jumps:
load myValue, reg0
test reg0
jumpeq .else
; code inside the if
jmp .done
.else:
; code inside the else
.done:

Is the “if” statement considered a method?
No, it's not considered a method as you may have already seen in the other answers. However, if your question were - "Does it behave like a method?", then the answer could be yes depending on the language in question. Any language that supports first-class functions could do without an in-built construct/statement like if. Ignore all the fluffy stuff like return values and syntax, as basically it is just a function that evaluates a boolean value and if it is true, then it executes some block of code. Also ignore OO and functional differences because the following examples can be implemented as a method on the Boolean class in whatever language is being used like Smalltalk does it.
Ruby supports blocks of executable code that can be stored in a variable and passed around to methods. So here's a custom _if_ function implemented in Ruby. The stuff within the { .. } is a piece of executable code that's passed to the function. It's also known as a block in Ruby.
def _if_ (condition)
condition && yield
end
# simple statement
_if_ (42 > 0) { print "42 is indeed greater than 0" }
# complicated statement
_if_ (2 + 3 == 5) {
_if_ (3 + 5 == 8) { puts "3 + 5 is 8" }
_if_ (5 + 8 == 13) { puts "5 + 8 is 13" }
}
We can do the same thing in C, C++, Objective-C, JavaScript, Python, LISP, and many other languages. Here's a JavaScript example.
function _if_(condition, code) {
condition && code();
}
_if_(42 > 0, function() { console.log("Yes!"); });

If it were to be classed as a method then surely we would be in the realms of OO, however we're not, so I'll assume we're on about a function. Certainly a function/subroutine could be written to replicate the if behaviour (I think it is actually a function in lisp/scheme).
I wouldn't class it as a function or even a subroutine though, just control flow.

If by method we understand a block of code that could be called and the control flow automatically returns to the caller when the method ends, then ifs aren't methods. The control flow doesn't return anywhere after an if is executed.

The IF statement is a conditional contruct feature used in most lanuages which executes a path flow from the boolean condition evaluation of true or false. Apart from the case of branch predication, this is always achieved by selectively altering the control flow based on some condition.
The IF construct is the most basic and needed logic used when programming. It allows the building blocks for functions to be introduced.

Yes, if is a function in certain languages, even though it's rare and the uses are limited.
Usually the construct is something like if(booleanCondition, functionPointerToCallIfConditionTrue, functionPointerToCallIfCondtionFalse) This can itself be used as a delegate to other functions if you want.
Mathematica, for example, behaves this way and even C# can do so with a bit of work if you use Linq-expressions; Take a look at System.Linq.Expressions.Expression.IfThenElse.

No. You don't return back when you are finished with an if. It's merely a control statement.

Note that in your example, you replaced one "selection statement" (C# 4 specification, section 8.7), the if statement (section 8.7.1) with another, the switch statement (section 8.7.2). You also refactored the selection statement into a separate method. You haven't replaced the use of a selection statement with a method, however.
The answer to your question is "no".

Are preconditions and postconditions needed in addition to invariants in member functions if doing design by contract?

I understand that in the DbC method, preconditions and postconditions are attached to a function.
What I'm wondering is if that applies to member functions as well.
For instance, assuming I use invariants at the beginning at end of each public function, a member function will look like this:
edit: (cleaned up my example)
void Charcoal::LightOnFire() {
invariant();
in_LightOnFire();
StartBurning();
m_Status = STATUS_BURNING;
m_Color = 0xCCCCCC;
return; // last return in body
out_LightOnFire();
invariant();
}
inline void Charcoal::in_LightOnFire() {
#ifndef _RELEASE_
assert (m_Status == STATUS_UNLIT);
assert (m_OnTheGrill == true);
assert (m_DousedInLighterFluid == true);
#endif
}
inline void Charcoal::out_LightOnFire() {
#ifndef _RELEASE_
assert(m_Status == STATUS_BURNING);
assert(m_Color == 0xCCCCCC);
#endif
}
// class invariant
inline void Charcoal::invariant() {
assert(m_Status == STATUS_UNLIT || m_Status == STATUS_BURNING || m_Status == STATUS_ASHY);
assert(m_Color == 0x000000 || m_Color == 0xCCCCCC || m_Color == 0xEEEEEE);
}
Is it okay to use preconditions and postconditions with global/generic functions only and just use invariants inside classes?
This seems like overkill, but maybe its my example is bad.
edit:
Isn't the postcondition just checking a subset of the invariant?
In the above, I am following the instructions of http://www.digitalmars.com/ctg/contract.html that states, "The invariant is checked when a class constructor completes, at the start of the class destructor, before a public member is run, and after a public function finishes."
Thanks.

Restricting the contracts in the classes to invariants is not optimal.
Preconditions and Postconditions are not just a subset of the invariants.
Invariants, Pre-conditions and Post-conditions have very different roles.
Invariants confirms the internal coherence of the object. They should be valid at the end of the constructor and before and after each method call.
Pre-conditions are checking that the status of the object and the arguments are suitable for the execution of the method. Preconditions are complementary to the invariants. They cover the check of the arguments (a stronger check that the type itself, i.e. not null, > 0,.. etc) but also could check for the object internal status (i.e. a call to file.write("hello") is a valid call only if file.is_rw and file.is_open are true).
Post-conditions are cheking that the method satisfied its obligation Post-conditions are also complementary to the invariants. Of course the status of the object has to be coherent after the method execution, but the Post-conditions are checking that the expected action was performed (i.e. list.add(i) should have as consequence that list.has(i) is true and list.count = old list.count + 1).

Yes.
Class C's invariant is a common property of all of its instances (objects). The invariant evaluates to true if and only if the object is in a semantically valid state.
An elevator's invariant may contain information such as ASSERT(IsStopped() || Door.IsClosed()), because it is invalid for an elevator to be in a state different than stopped (say, going up) and with the door open.
In contrast, a member function such as MoveTo(int flat) may have CurrentFlat()==flat as a postcondition; because after a call to MoveTo(6) the current flat is 6. Similarly, it may have IsStopped() as a precondition, because (depending on the design) you can't invoke function MoveTo if the elevator is already moving. First, you have to query its state, make sure that it is stopped, and then call the function.
Of course I may be totally oversimplifying how an elevator works.
In any case, the preconditions and postconditions will make no sense, in general, as invariant conditions; an elevator doesn't need to be at floor 6 to be in a valid state.
A more concise example can be found here: Interception and Attributes: A Design-By-Contract Sample by Sasha Goldshtein.

Well, the point of an invariant is that it describes something that's true of the object at all times. In this case, something is on the grill, or not (nothing in between). They normally describe a property of the entire state of the object.
Pre and post conditions describe things that are true just before a method executes, and just after, and will concern just the state that should have been touched by the method. This is different, presumably, from the state of the object. Pre and post conditions might be thought of as describing the footprint of a method - just what it needed, just what it touched.
So, to the specific question, the ideas do different things, so you may well want both. You certainly cannot just use invariants instead of pre and post conditions - in this instance, part of the object invariant is "Something is on the grill or not", but the precondition of lightOnFire needs to know that the item is on the grill. You can never infer this from the object invariant. It is true that from pre and postconditions and a known start state, you can (assuming that the objects structure is only mutable through methods, and the pre and post conditions describe all the environmental changes), infer an object invariant. However, this can be complex, and when you're stating things "in language", it's easier to just provide both.
Of course, doing in variants that state a boolean item is either true or false is a bit pointless - the type system ensures that.

When to check function/method parameters?

when writing small functions I often have the case that some parameters are given to a function which itself only passes them to different small functions to serve its purpose.
For example (C#-ish Syntax):
public void FunctionA(object param)
{
DoA(param);
DoB(param);
DoC(param);
// etc.
}
private void DoA(object param)
{
DoD(param);
}
private void DoD(object param)
{
// Error if param == null
param.DoX();
}
So the parameters are not used inside the called function but "somewhere" in the depths of the small functions that do the job.
So when is it best to check if my param-Object is null?
When checking in FunctionA:
Pro:
-There is no overhead through the use of further methods which at last will do nothing because object is null.
Con:
-My syntactically wonderful FunctionA is dirtied by ugly validation code.
When checking only when param-object is used:
Pro:
-My syntactically wonderful FunctionA keeps a joy to read :)
Cons:
-There will be overhead through calling methods which will do nothing because the param-object is null.
-Further cons I don't think about at the moment.

Always put it as far down the call stack as possible, so that if you later refactor the code and something else calls DoD other than DoA you have the check in place and don't have to rework your parameter checks. The overhead of a small null check and possibly a few extra method calls is going to be trivial in most cases and doing the check an extra few times is not something you should be worrying about.

Unless you think the value is likely to be null the vast majority of the time, I'd put the validation in DoD(). If you put it in FunctionA() you'll have to repeat the validation code later when you decide FunctionB() also needs to use DoD(). To me, the extra overhead is worth not having to repeat myself.

As a guideline, I make a habit of it to check every parameter that is used by the method, even including my own private variables. I would therefore only check for nil in your DoD method.
You might want to check out Bertrand Meyers Design By Contract mantra.

Fail early. Unless a partial result is preferable to no result at all, execution should stop as soon as the code can detect that there is a problem. Why should the code run through several methods when the result downstream is going to be an invalid or missing argument?
If it is possible that the downstream methods could be called separately then validation could be handled by a call to ca common validation methed as has already been suggested.

Always check everything :) Coming from the deep bowels of coding libraries for embedded systems, this is the method I'd use:
public void FunctionA(object param)
{
assert(param != null && param.canDoX());
DoA(param);
DoB(param);
DoC(param);
// etc.
}
private void DoA(object param)
{
assert(param != null && param.canDoX());
DoD(param);
}
private void DoD(object param)
{
assert(param != null && param.canDoX());
if ( param != null )
param.DoX();
else
// Signal error, for instance by throwing a runtime exception
// This error-handling is then verified by a unit test that
// uses a release build of the code.
}
To de-clutter this, the obvious solution is to break out the validation to a separate validator function. Using a C-style preprocessor, or just sticking to asserts, it should be trivial to have this "paranoid" validation excluded from release builds.

It's the caller's responsibility to pass a valid parameter. In this case:
if(param != null)
{
FunctionA(param);
}

Is using an if() coupled with an immediate return an accepted practise?

Is using an if coupled with an immediate return like in the example below an acceptable practise instead of having a if with a block of code inside {} ? Are these equivalent in practise or is there a disadvantage to one of the approaches ?
An example in Java:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
ServletContext sc = this.getServletContext();
// Throw exception for fatal error (Servlet not defined in web.xml ?)
if( sc == null )
return; // old-style programming
// Careful with silent bugs ! Correct way of handling this is:
// throw new RuntimeException( "BookDetail: ServletContext is null" );
BookList bookList = WebUtil.getBookList( sc );

Martin Fowler would favour the early return, and calls the idea a Guard Clause.
Personally, I don't like it in Java, as I prefer one return per method. However this is subjective and I may be in the minority.
I've blogged about this for and against.

That is not a return, it's an exception. The code is perfectly ok tho.
Even if you'd replace that throw with a "return something", it would still be ok.

I think it comes down to readability. The code should function the same either way.
I use stuff like this all the time
Function Blah() As Boolean
If expr Then
Return False
End If
Do Other work...
Return result
End Function

For error conditions, generally it's best to throw an exception - exception handling was invented to get rid of the manual return code style error checking in C that comprises about 30% of a C program.
However, early returns are fine - they are far more readable than adding an extra scope with curly braces.
if (!_cache.has_key(key))
return null;
return _cache[key]
Is better than:
if (_cache_has_key(key))
{
return _cache[key]
}
else
return null;
And it only gets more obvious the more early returns that you add, 5 early returns beats the hell out of 5 nested if statements.
Note that I didn't return null on an error condition, it's expected that often the key won't be in the cache - but it still means the caller has to write code to check the result. In .NET there's a better pattern of returning a boolean, and setting the result via an out parameter. The methods beginning with Try usually follow this pattern:
Foo foo;
if (!TryGetCachedFoo("myfoo", foo))
{
foo = new Foo(...);
AddToCache("myfoo", foo);
}
// do something with foo

As long as you're using them for conditional escapes as the first thing in the routine. I think the fact that they are obvious in that location, and avoid at least one level of indentation outweighs the negative of having a multiple returns.

In the example you give, I'd favor throwing an exception because a null ServletContext is usually a sign that something has gone wrong. However, there are times when checking whether a parameter is null and returning immediately from the method is both useful and valid.
For instance, if you are gathering contact information about a user and the user has the option of providing a phone number. In that case, you may have a method that validates that the phone number contains all numbers, has the correct number of digits, etc, but which would immediately return if the phone number was empty or null.
public void validatePhone(String phoneNumber) throws ValidationException {
if (phoneNumber == null || phoneNumber.equals("")) {
return;
}
//do validation stuff, throwing exception if not valid

In your example, there is no return after the if statement; you are throwing an exception. (edit: I see you have changed the code since I posted this answer).
There are purists who think that you should have only one return statement in a method (at the end of the method). There's some merit to that idea - it makes the code more clear, it makes it easier to see what can be returned for the method, and especially when you need to cleanup resources (especially in a language without garbage collection; or in Java where you need to close for example an InputStream) it's more clear and easier if you have just one return at the bottom, and do the cleanup code just before the return.
I would not have any objection against the code in your example, however.

I have a few (subjective, or not) remarks:
I always use accolades with a if even the block contains only one line
I don't like to have many return on one method
I don't think that this null check is necessary. If getServletContext() returns null, then you have a much bigger problem with your webapp that should definitely be fixed. In that case, having a NullPointerException later in the code is an exceptional error so I wouldn't bother handling it.

Conditional logging with minimal cyclomatic complexity

After reading "What’s your/a good limit for cyclomatic complexity?", I realize many of my colleagues were quite annoyed with this new QA policy on our project: no more 10 cyclomatic complexity per function.
Meaning: no more than 10 'if', 'else', 'try', 'catch' and other code workflow branching statement. Right. As I explained in 'Do you test private method?', such a policy has many good side-effects.
But: At the beginning of our (200 people - 7 years long) project, we were happily logging (and no, we can not easily delegate that to some kind of 'Aspect-oriented programming' approach for logs).
myLogger.info("A String");
myLogger.fine("A more complicated String");
...
And when the first versions of our System went live, we experienced huge memory problem not because of the logging (which was at one point turned off), but because of the log parameters (the strings), which are always calculated, then passed to the 'info()' or 'fine()' functions, only to discover that the level of logging was 'OFF', and that no logging were taking place!
So QA came back and urged our programmers to do conditional logging. Always.
if(myLogger.isLoggable(Level.INFO) { myLogger.info("A String");
if(myLogger.isLoggable(Level.FINE) { myLogger.fine("A more complicated String");
...
But now, with that 'can-not-be-moved' 10 cyclomatic complexity level per function limit, they argue that the various logs they put in their function is felt as a burden, because each "if(isLoggable())" is counted as +1 cyclomatic complexity!
So if a function has 8 'if', 'else' and so on, in one tightly-coupled not-easily-shareable algorithm, and 3 critical log actions... they breach the limit even though the conditional logs may not be really part of said complexity of that function...
How would you address this situation ?
I have seen a couple of interesting coding evolution (due to that 'conflict') in my project, but I just want to get your thoughts first.
Thank you for all the answers.
I must insist that the problem is not 'formatting' related, but 'argument evaluation' related (evaluation that can be very costly to do, just before calling a method which will do nothing)
So when a wrote above "A String", I actually meant aFunction(), with aFunction() returning a String, and being a call to a complicated method collecting and computing all kind of log data to be displayed by the logger... or not (hence the issue, and the obligation to use conditional logging, hence the actual issue of artificial increase of 'cyclomatic complexity'...)
I now get the 'variadic function' point advanced by some of you (thank you John).
Note: a quick test in java6 shows that my varargs function does evaluate its arguments before being called, so it can not be applied for function call, but for 'Log retriever object' (or 'function wrapper'), on which the toString() will only be called if needed. Got it.
I have now posted my experience on this topic.
I will leave it there until next Tuesday for voting, then I will select one of your answers.
Again, thank you for all the suggestions :)

With current logging frameworks, the question is moot
Current logging frameworks like slf4j or log4j 2 don't require guard statements in most cases. They use a parameterized log statement so that an event can be logged unconditionally, but message formatting only occurs if the event is enabled. Message construction is performed as needed by the logger, rather than pre-emptively by the application.
If you have to use an antique logging library, you can read on to get more background and a way to retrofit the old library with parameterized messages.
Are guard statements really adding complexity?
Consider excluding logging guards statements from the cyclomatic complexity calculation.
It could be argued that, due to their predictable form, conditional logging checks really don't contribute to the complexity of the code.
Inflexible metrics can make an otherwise good programmer turn bad. Be careful!
Assuming that your tools for calculating complexity can't be tailored to that degree, the following approach may offer a work-around.
The need for conditional logging
I assume that your guard statements were introduced because you had code like this:
private static final Logger log = Logger.getLogger(MyClass.class);
Connection connect(Widget w, Dongle d, Dongle alt)
throws ConnectionException
{
log.debug("Attempting connection of dongle " + d + " to widget " + w);
Connection c;
try {
c = w.connect(d);
} catch(ConnectionException ex) {
log.warn("Connection failed; attempting alternate dongle " + d, ex);
c = w.connect(alt);
}
log.debug("Connection succeeded: " + c);
return c;
}
In Java, each of the log statements creates a new StringBuilder, and invokes the toString() method on each object concatenated to the string. These toString() methods, in turn, are likely to create StringBuilder instances of their own, and invoke the toString() methods of their members, and so on, across a potentially large object graph. (Before Java 5, it was even more expensive, since StringBuffer was used, and all of its operations are synchronized.)
This can be relatively costly, especially if the log statement is in some heavily-executed code path. And, written as above, that expensive message formatting occurs even if the logger is bound to discard the result because the log level is too high.
This leads to the introduction of guard statements of the form:
if (log.isDebugEnabled())
log.debug("Attempting connection of dongle " + d + " to widget " + w);
With this guard, the evaluation of arguments d and w and the string concatenation is performed only when necessary.
A solution for simple, efficient logging
However, if the logger (or a wrapper that you write around your chosen logging package) takes a formatter and arguments for the formatter, the message construction can be delayed until it is certain that it will be used, while eliminating the guard statements and their cyclomatic complexity.
public final class FormatLogger
{
private final Logger log;
public FormatLogger(Logger log)
{
this.log = log;
}
public void debug(String formatter, Object... args)
{
log(Level.DEBUG, formatter, args);
}
… &c. for info, warn; also add overloads to log an exception …
public void log(Level level, String formatter, Object... args)
{
if (log.isEnabled(level)) {
/*
* Only now is the message constructed, and each "arg"
* evaluated by having its toString() method invoked.
*/
log.log(level, String.format(formatter, args));
}
}
}
class MyClass
{
private static final FormatLogger log =
new FormatLogger(Logger.getLogger(MyClass.class));
Connection connect(Widget w, Dongle d, Dongle alt)
throws ConnectionException
{
log.debug("Attempting connection of dongle %s to widget %s.", d, w);
Connection c;
try {
c = w.connect(d);
} catch(ConnectionException ex) {
log.warn("Connection failed; attempting alternate dongle %s.", d);
c = w.connect(alt);
}
log.debug("Connection succeeded: %s", c);
return c;
}
}
Now, none of the cascading toString() calls with their buffer allocations will occur unless they are necessary! This effectively eliminates the performance hit that led to the guard statements. One small penalty, in Java, would be auto-boxing of any primitive type arguments you pass to the logger.
The code doing the logging is arguably even cleaner than ever, since untidy string concatenation is gone. It can be even cleaner if the format strings are externalized (using a ResourceBundle), which could also assist in maintenance or localization of the software.
Further enhancements
Also note that, in Java, a MessageFormat object could be used in place of a "format" String, which gives you additional capabilities such as a choice format to handle cardinal numbers more neatly. Another alternative would be to implement your own formatting capability that invokes some interface that you define for "evaluation", rather than the basic toString() method.

In Python you pass the formatted values as parameters to the logging function. String formatting is only applied if logging is enabled. There's still the overhead of a function call, but that's minuscule compared to formatting.
log.info ("a = %s, b = %s", a, b)
You can do something like this for any language with variadic arguments (C/C++, C#/Java, etc).
This isn't really intended for when the arguments are difficult to retrieve, but for when formatting them to strings is expensive. For example, if your code already has a list of numbers in it, you might want to log that list for debugging. Executing mylist.toString() will take a while to no benefit, as the result will be thrown away. So you pass mylist as a parameter to the logging function, and let it handle string formatting. That way, formatting will only be performed if needed.
Since the OP's question specifically mentions Java, here's how the above can be used:
I must insist that the problem is not 'formatting' related, but 'argument evaluation' related (evaluation that can be very costly to do, just before calling a method which will do nothing)
The trick is to have objects that will not perform expensive computations until absolutely needed. This is easy in languages like Smalltalk or Python that support lambdas and closures, but is still doable in Java with a bit of imagination.
Say you have a function get_everything(). It will retrieve every object from your database into a list. You don't want to call this if the result will be discarded, obviously. So instead of using a call to that function directly, you define an inner class called LazyGetEverything:
public class MainClass {
private class LazyGetEverything {
#Override
public String toString() {
return getEverything().toString();
}
}
private Object getEverything() {
/* returns what you want to .toString() in the inner class */
}
public void logEverything() {
log.info(new LazyGetEverything());
}
}
In this code, the call to getEverything() is wrapped so that it won't actually be executed until it's needed. The logging function will execute toString() on its parameters only if debugging is enabled. That way, your code will suffer only the overhead of a function call instead of the full getEverything() call.

In languages supporting lambda expressions or code blocks as parameters, one solution for this would be to give just that to the logging method. That one could evaluate the configuration and only if needed actually call/execute the provided lambda/code block.
Did not try it yet, though.
Theoretically this is possible. I would not like to use it in production due to performance issues i expect with that heavy use of lamdas/code blocks for logging.
But as always: if in doubt, test it and measure the impact on cpu load and memory.

Thank you for all your answers! You guys rock :)
Now my feedback is not as straight-forward as yours:
Yes, for one project (as in 'one program deployed and running on its own on a single production platform'), I suppose you can go all technical on me:
dedicated 'Log Retriever' objects, which can be pass to a Logger wrapper only calling toString() is necessary
used in conjunction with a logging variadic function (or a plain Object[] array!)
and there you have it, as explained by #John Millikin and #erickson.
However, this issue forced us to think a little about 'Why exactly we were logging in the first place ?'
Our project is actually 30 different projects (5 to 10 people each) deployed on various production platforms, with asynchronous communication needs and central bus architecture.
The simple logging described in the question was fine for each project at the beginning (5 years ago), but since then, we has to step up. Enter the KPI.
Instead of asking to a logger to log anything, we ask to an automatically created object (called KPI) to register an event. It is a simple call (myKPI.I_am_signaling_myself_to_you()), and does not need to be conditional (which solves the 'artificial increase of cyclomatic complexity' issue).
That KPI object knows who calls it and since he runs from the beginning of the application, he is able to retrieve lots of data we were previously computing on the spot when we were logging.
Plus that KPI object can be monitored independently and compute/publish on demand its information on a single and separate publication bus.
That way, each client can ask for the information he actually wants (like, 'has my process begun, and if yes, since when ?'), instead of looking for the correct log file and grepping for a cryptic String...
Indeed, the question 'Why exactly we were logging in the first place ?' made us realize we were not logging just for the programmer and his unit or integration tests, but for a much broader community including some of the final clients themselves. Our 'reporting' mechanism had to be centralized, asynchronous, 24/7.
The specific of that KPI mechanism is way out of the scope of this question. Suffice it to say its proper calibration is by far, hands down, the single most complicated non-functional issue we are facing. It still does bring the system on its knee from time to time! Properly calibrated however, it is a life-saver.
Again, thank you for all the suggestions. We will consider them for some parts of our system when simple logging is still in place.
But the other point of this question was to illustrate to you a specific problem in a much larger and more complicated context.
Hope you liked it. I might ask a question on KPI (which, believe or not, is not in any question on SOF so far!) later next week.
I will leave this answer up for voting until next Tuesday, then I will select an answer (not this one obviously ;) )

Maybe this is too simple, but what about using the "extract method" refactoring around the guard clause? Your example code of this:
public void Example()
{
if(myLogger.isLoggable(Level.INFO))
myLogger.info("A String");
if(myLogger.isLoggable(Level.FINE))
myLogger.fine("A more complicated String");
// +1 for each test and log message
}
Becomes this:
public void Example()
{
_LogInfo();
_LogFine();
// +0 for each test and log message
}
private void _LogInfo()
{
if(!myLogger.isLoggable(Level.INFO))
return;
// Do your complex argument calculations/evaluations only when needed.
}
private void _LogFine(){ /* Ditto ... */ }

In C or C++ I'd use the preprocessor instead of the if statements for the conditional logging.

Pass the log level to the logger and let it decide whether or not to write the log statement:
//if(myLogger.isLoggable(Level.INFO) {myLogger.info("A String");
myLogger.info(Level.INFO,"A String");
UPDATE: Ah, I see that you want to conditionally create the log string without a conditional statement. Presumably at runtime rather than compile time.
I'll just say that the way we've solved this is to put the formatting code in the logger class so that the formatting only takes place if the level passes. Very similar to a built-in sprintf. For example:
myLogger.info(Level.INFO,"A String %d",some_number);
That should meet your criteria.

Conditional logging is evil. It adds unnecessary clutter to your code.
You should always send in the objects you have to the logger:
Logger logger = ...
logger.log(Level.DEBUG,"The foo is {0} and the bar is {1}",new Object[]{foo, bar});
and then have a java.util.logging.Formatter that uses MessageFormat to flatten foo and bar into the string to be output. It will only be called if the logger and handler will log at that level.
For added pleasure you could have some kind of expression language to be able to get fine control over how to format the logged objects (toString may not always be useful).

(source: scala-lang.org)
Scala has a annontation #elidable() that allows you to remove methods with a compiler flag.
With the scala REPL:
C:>scala
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.
6.0_16).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import scala.annotation.elidable
import scala.annotation.elidable
scala> import scala.annotation.elidable._
import scala.annotation.elidable._
scala> #elidable(FINE) def logDebug(arg :String) = println(arg)
logDebug: (arg: String)Unit
scala> logDebug("testing")
scala>
With elide-beloset
C:>scala -Xelide-below 0
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.
6.0_16).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import scala.annotation.elidable
import scala.annotation.elidable
scala> import scala.annotation.elidable._
import scala.annotation.elidable._
scala> #elidable(FINE) def logDebug(arg :String) = println(arg)
logDebug: (arg: String)Unit
scala> logDebug("testing")
testing
scala>
See also Scala assert definition

As much as I hate macros in C/C++, at work we have #defines for the if part, which if false ignores (does not evaluate) the following expressions, but if true returns a stream into which stuff can be piped using the '<<' operator.
Like this:
LOGGER(LEVEL_INFO) << "A String";
I assume this would eliminate the extra 'complexity' that your tool sees, and also eliminates any calculating of the string, or any expressions to be logged if the level was not reached.

Here is an elegant solution using ternary expression
logger.info(logger.isInfoEnabled() ? "Log Statement goes here..." : null);

Consider a logging util function ...
void debugUtil(String s, Object… args) {
if (LOG.isDebugEnabled())
LOG.debug(s, args);
}
);
Then make the call with a "closure" round the expensive evaluation that you want to avoid.
debugUtil(“We got a %s”, new Object() {
#Override String toString() {
// only evaluated if the debug statement is executed
return expensiveCallToGetSomeValue().toString;
}
}
);

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008