Mockito matchers | Correct Usage - junit

Mockito provides a lot of matchers like any(), anyClass() etc.
People can debate on their usage. But I feel that matchers should be used when we don't really care about what the input object contains as long as it is of the expected class.
I just want to know if this usage is correct? If not, what is the better way to use them?
For example:
Say we have a test that expects a Runtime Exception, when a method is called with some request object. Since we explicitly throw a runtime exception here when the mock is called, it does not matter what the request object's contents are. So this test seems logical.
#Test(expected = RunTimeException.class)
public void testExceptionOccurs() {
when(mock.method(any(RequestObject.class))).thenThrow(new RuntimeException());
mock.method(new RequestObject());
}
Is this the correct way?

You're right that allowing any object of a given type is a great use for Matchers, and that you're calling matchers correctly, but the code isn't doing what you're describing.
In Mockito 1.x, any(RequestObject.class) will not actually check the type of the parameter. Its behavior is identical to anyObject(), except that previous to Java 8, the Java compiler couldn't infer a generic type specified as a parameter. Instead, use isA(RequestObject.class) to check types.
Mockito's default behavior is to check argument equality—specifically equality using equals (or == for primitive types). I've found any (and anyInt etc) to be the most valuable, because they ignore an argument entirely for the purposes of matching, but there are plenty of reasons that you'd want to override that equals behavior with Matchers:
To only check an argument's type, using isA.
To check instances more specifically than equals is defined for, like checking for referential equality using same.
To check properties of arguments, like that a passed bean has a certain value for one of its properties.
To reuse Hamcrest matchers via intThat, argThat, and so forth.
Final note: make sure that if you're using matchers or captors anywhere in your list of arguments, that you use one for each and every argument in that call. Mockito needs that 1:1 mapping.
UPDATE: Mockito committer Brice has offered some historical background and future direction:
For historical reference, any is a shorthand alias of anything, at that time the API was forcing one to cast, and contributors and/or commiters thought about passing the class as a param to avoid this cast, without changing the semantic of this API. However this change eventually modified what people thought that this API was doing. This will be fixed in mockito 2+

Related

Is assert in privation function redundant if check has already been made by the calling public function?

Effective java states a good practice of assertions in private methods.
"For an unexported method, you as the package author control the circumstances under which the method is called, so you can and should ensure that only valid parameter values are ever passed in. Therefore, nonpublic methods should generally check their parameters using assertions, as shown below:
For example:
// Private helper function for a recursive sort
private static void sort(long a[]) {
assert a != null;
// Do the computation;
}
My question is would asserts be required even if the public function calling the sort has a null pointer check ?
Example:
public void computeTwoNumbersThatSumToInputValue(int a[], int x) {
if (a == null) {
throw new Nullptrexception();
}
sort(a);
// code to do the required.
}
In other words, will asserts in private function be 'redudant' or mandatory in this case.
Thanks,
It's redundant if you're sure that you've got the assertion in all the calling code. In some cases, that's very obvious - in other cases it can be less so. If you're calling sort from 20 places in the class, are you sure you've checked it in every case?
It's a matter of taste and balance, with no "one size fits all" answer. The balance is in terms of code clarity (both ways!), performance (in extreme cases) and of course safety. It depends on the exact context, and I wouldn't personally like to even guarantee that I'm entirely consistent. (In other words, "level of caffeine at the time of coding" may turn out to be an influence too.)
Note that your assert is only going to execute when assertions are turned on anyway - I personally prefer to validate parameters consistently however you're running the code. I generally use the Preconditions class from Guava to make preconditions unobtrusive.
Assertions will make the helper function sort more robust to use.
Checking for parameters before passing it to any method is a good methodology to have more control over the Exceptions occurring unintentionally at the runtime.
My suggestion will be to use both the approaches in your code as there is no guarantee that all the callers of sort will do such checks. If assertions in helper methods are algorithmically of high order or seems redundant then this can be disabled (esp for production use) via use of -disableassertions or -da from command-line.
You could do that. I will quote from the Oracle docs.
An assertion is a statement in the JavaTM programming language that
enables you to test your assumptions about your program. For example,
if you write a method that calculates the speed of a particle, you
might assert that the calculated speed is less than the speed of
light.
I do not personally use assertions, but from what I gathered readings the oracle docs on it, it enables you to test your assumptions about what you expect something to do. Try/catch blocks are more for failing gracefully as an inevitability of failures bound to happen (like networking, computer problems). Basically, in a perfect world your code would always run successfully because theres nothing wrong with it code wise. But this isn't a perfect world. Also note:
Experience has shown that writing assertions while programming is one
of the quickest and most effective ways to detect and correct bugs. As
an added benefit, assertions serve to document the inner workings of
your program, enhancing maintainability.
I would say use as a preference. To answer your question, I would mainly use it to test code as the docs say, while testing assumptions you have about your code. As the second quote mentions, it has the added benefit of telling other developers (or future you) what you assume to get as parameters. As a personal preference, I leave control flow to try/catch blocks as that is what they were designed for.
*But keep in mind that assertions could be turned off.

How are implemented classes in dynamic languages?

How are implemented classes in dynamic languages ?
I know that Javascript is using a prototype pattern (there is 'somewhere' a container of unbound JS functions, which are bind when calling them through an object), but I have no idea of how it works in other languages.
I'm curious about this, because I can't think of an efficient way to have native bound methods without wasting memory and/or cpu by copying members for each instance.
(By bound method, I mean that the following code should work :)
class Foo { function bar() : return 42; };
var test = new Foo();
var method = test.bar;
method() == 42;
This highly depends on the language and the implementation. I'll tell you what I know about CPython and PyPy.
The general idea, which is also what CPython does for the most part, goes like this:
Every object has a class, specifically a reference to that class object.
Apart from instance members, which are obviously stored in the individual object, the class also has members. This includes methods, so methods don't have a per-object cost.
A class has a method resolution order (MRO) determined by the inheritance relationships, wherein each base class occurs exactly once. If we didn't have multiple inheritance, this would simply be a reference to the base class, but this way the MRO is hard to figure out on the fly (you'd have to start from the most derived class every time).
(Classes are also objects and have classes themselves, but we'll gloss over that for now.)
If attribute lookup on an object fails, the same attribute is looked up on the classes in the MRO, in the order specified by the MRO. (This is the default behavior, which can be changed by defining magic methods like __getattr__ and __getattribute__.)
So far so simple, and not really an explanation for bound methods. I just wanted to make sure we're talking about the same thing. The missing piece is descriptors. The descriptor protocol is defined in the "deep magic" section of the language reference, but the short and simple story is that lookup on a class can be hijacked by the object it results in via a __get__ method. More importantly, this __get__ method is told whether the lookup started on an instance or on the "owner" (the class).
In Python 2, we have an ugly and unnecessary UnboundMethod descriptor which (apart from the __get__ method) simply wraps the function to throw errors on Class.method(self) if self is not of an acceptable type. In Python 3, the __get__ is simply part of all function objects, and unbound methods are gone. In both cases, the __get__ method returns itself when you look it up on a class (so you can use Class.method, which is useful in a few cases) and a "bound method" object when you look it up on an object. This bound method object does nothing more than storing the raw function and the instance, and passing the latter as first argument to the former in its __call__ (special method overriding the function call syntax).
So, for CPython: While there is a cost to bound methods, it's smaller than you might think. Only two references are needed space-wise, and the CPU cost is limited to a small memory allocation, and an extra indirection when calling. Note though that this cost applies to all method calls, not just those which actually make use of bound method features. a.f() has to call the descriptor and use its return value, because in a dynamic language we don't know if it's monkey-patched to do something different.
In PyPy, things are more interesting. As it's an implementation which doesn't compromise on correctness, the above model is still correct for reasoning about semantics. However, it's actually faster. Apart from the fact that the JIT compiler inlines and then eliminates the entire mess described above in most cases, they also tackle the problem on bytecode level. There are two new bytecode instructions, which preserve the semantics but omit the allocation of the bound method object in the case of a.f(). There is also a method cache which can simplify the lookup process, but requires some additional bookkeeping (though some of that bookkeeping is already done for the JIT).

Should my persistence class return Option or rely on exceptions?

My application's persistence layer is formed by a Storage trait and an implementing class. I'm vacillating on this issue: should the fetchFoo(key: Key) methods should return Option[Foo], or should they throw FooNotFound exceptions if the key cannot be found?
To add flavour to the issue, the persistence layer - it is written in Scala - is called by Java code. Dealing with scala.Option in Java code? Hmmm.
In fact, until yesterday, the persistence layer was written in Java; I've just re-written it in Scala. As a Java code base, it relied on exceptions rather than returning nulls; but now that I've encountered scala.Option, I'm reconsidering. It seems to me that Scala is less enamoured of exceptions than Java.
My take on the general problem is that it depends on where the keys are coming from.
If they are being entered by some user or untrusted system, then I use Option so I can meaningfully denote the possibility of an unknown key and deal with it appropriately.
On the other hand, if the keys are coming from a known system (this includes things like keys embedded in links that originally came from the system), and are assumed to be valid and exist, I would leave it as a runtime exception, handled by a catch-all at the outer level. For the link example, if someone manually changes the key in a url for one reason or another, it should be considered as undefined behaviour and an exception is appropriate, IMO.
Another way to think of it is how you would handle the situation when it arises. If you're using Option and are just delegating the None case to some catch-all error handling, then an exception is probably more appropriate. If you're explicitly catching the NotFound exception and altering the program flow (eg, asking the user to re-enter the key), then use Option, or a checked exception (or Either in Scala) to ensure that the situation is dealt with.
In relation to integrating with Java, Option is easy enough to use from there once the Scala runtime library is available on the classpath. Alternatively, there's an Option implementation in the Functional Java library. In any case, I would steer clear of using null to indicate "not found".
In Java, you can call Option's isEmpty, isDefined and get without any special hassle (the really useful Option methods, such as getOrElse, are another matter.) Checking the result of the isDefined method in an if-clause should be faster than checking exceptions in a try-catch block.
In some cases (like your example) an Option is fine and the "monadic" behavior (map, flatMap, filter...) very convenient, but in other cases you need more information about the cause of problem, which can be better expressed with an exception. Now you probably want your error handling as "uniform" as possible, so I would suggest to use Either, which gives you both a behavior similar to Option and an expressiveness like an exception.
In Java, you need just a helper function which "unpacks" an Either. If it finds a Right(value), it gives the value back, if it finds a Left(Exception), it re-throws it. After this, your back to normal Java behavior.

Using one method with constants as parameters versus several methods

In Kent Beck's Implementation Patterns, one can read
"A common use of constants is to
communicate variations of a message in
an interface. For example, to center
text you could invoke
setJustification(Justification.CENTERED).
One advantage of this style of API is
that you can add new variants of
existing methods by adding new
constants without breaking
implementors. However, these messages
don't communicate as well as having a
separate method for each variation. In
this style, the message above would be
justifyCentered(). An interface where
all invocations of a method have
literal constants as arguments can be
improved by giving it separate methods
for each constant value."
Why is this? Generally when I'm coding and I notice that I have a couple of similar parameterless methods that could be reduced to just one, with an argument, like in the following example,
void justifyRight()
void justifyLeft()
void justifyCentered()
I'd generally do just the opposite of what Kent advices, which would be to group it into
setJustification(Justification justification)
How do you usually handle this situation? Is this totally subjective or there is really a very strong reason that I can't see in favour of Kent's view of this matter?
Thanks
File access methods usually have parameters regarding read/write mode, whether to create non-existing files, security attributes, locking modes and so on. Imagine the amount of methods you'd have if you'd create a separate method for each valid combination of parameters!
I've highlighted the biggest argument in favor of separate methods; it's fail-safe because you have strict control over the API. The caller cannot pass in invalid arguments, or invalid combinations of parameters, if you don't expose such parameters. This also implies less complex parameter validation.
However, I'm not in favor of this practice. API's should be well-designed and should change as little as possible. Kent Beck on breaking API changes:
One advantage of [parameterized methods] is that you can add new variants of existing methods by adding new constants without breaking implementors.
His argument in favor of separate methods is:
However, [parameterized methods] don't communicate as well as having a separate method for each variation.
I disagree. Method parameters can be just as readable. Especially in combination with named parameters, a feature which is supported by several languages. Besides, separate methods would result in a cluttered API.
I suppose it's subjective. Some may argue that justifyLeft is clearer than justify(Justification.LEFT) Collapsing it all into one method may result in a nicer API - less clutter - and the mode can be stored in a variable and simply feeding it to the single setXY method (with different methods for each, you'd have to decide which to call depending on the value manually). Therefore I usually prefer this way way. Though it's usually just:
void justify(Justification justification) {
switch(justification) {
Justification.RIGHT: this.justifyRight();
Justification.LEFT: this.justifyLeft();
Justification.CENTERED: this.justifyCenter();
}
}
Of course this is only advisable when all these methods are very closely related.

How to design a class that has only one heavy duty work method and data returning other methods?

I want to design a class that will parse a string into tokens that are meaningful to my application.
How do I design it?
Provide a ctor that accepts a string, provide a Parse method and provide methods (let's call them "minor") that return individual tokens, count of tokens etc. OR
Provide a ctor that accepts nothing, provide a Parse method that accepts a string and minor methods as above. OR
Provide a ctor that accepts a string and provide only minor methods but no parse method. The parsing is done by the ctor.
1 and 2 have the disadvantage that the user may call minor methods without calling the Parse method. I'll have to check in every minor method that the Parse method was called.
The problem I see in 3 is that the parse method may potentially do a lot of things. It just doesn't seem right to put it in the ctor.
2 is convenient in that the user may parse any number of strings without instantiating the class again and again.
What's a good approach? What are some of the considerations?
(the language is c#, if someone cares).
Thanks
I would have a separate class with a Parse method that takes a string and converts it into a separate new object with a property for each value from the string.
ValueObject values = parsingClass.Parse(theString);
I think this is a really good question...
In general, I'd go with something that resembles option 3 above. Basically, think about your class and what it does; does it have any effective data other than the data to parse and the parsed tokens? If not, then I would generally say that if you don't have those things, then you don't really have an instance of your class; you have an incomplete instance of your class; something which you'd like to avoid.
One of the considerations that you point out is that the parsing of the tokens may be a relatively computationally complicated process; it may take a while. I agree with you that you may not want to take the hit for doing that in the constructor; in that case, it may make sense to use a Parse() method. The question that comes in, though, is whether or not there's any sensible operations that can be done on your class before the parse() method completes. If not, then you're back to the original point; before the parse() method is complete, you're effectively in an "incomplete instance" state of your class; that is, it's effectively useless. Of course, this all changes if you're willing and able to use some multithreading in your application; if you're willing to offload the computationally complicated operations onto another thread, and maintain some sort of synchronization on your class methods / accessors until you're done, then the whole parse() thing makes more sense, as you can choose to spawn that in a new thread entirely. You still run into issues of attempting to use your class before it's completely parsed everything, though.
I think an even more broad question that comes into this design, though, is what is the larger scope in which this code will be used? What is this code going to be used for, and by that, I mean, not just now, with the intended use, but is there a possibility that this code may need to grow or change as your application does? In terms of the stability of implementation, can you expect for this to be completely stable, or is it likely that something about the set of data you'll want to parse or the size of the data to parse or the tokens into which you will parse will change in the future? If the implementation has a possibility of changing, consider all the ways in which it may change; in my experience, those considerations can strongly lead to one or another implementation. And considering those things is not trivial; not by a long shot.
Lest you think this is just nitpicking, I would say, at a conservative estimate, about 10 - 15 percent of the classes that I've written have needed some level of refactoring even before the project was complete; rarely has a design that I've worked on survived implementation to come out the other side looking the same way that it did before. So considering the possible permutations of the implementation becomes very useful for determining what your implementation should be. If, say, your implementation will never possibly want to vary the size of the string to tokenize, you can make an assumption about the computatinal complexity, that may lead you one way or another on the overall design.
If the sole purpose of the class is to parse the input string into a group of properties, then I don't see any real downside in option 3. The parse operation may be expensive, but you have to do it at some point if you're going to use it.
You mention that option 2 is convenient because you can parse new values without reinstantiating the object, but if the parse operation is that expensive, I don't think that makes much difference. Compare the following code:
// Using option 3
ParsingClass myClass = new ParsingClass(inputString);
// Parse a new string.
myClass = new ParsingClass(anotherInputString);
// Using option 2
ParsingClass myClass = new ParsingClass();
myClass.Parse(inputString);
// Parse a new string.
myClass.Parse(anotherInputString);
There's not much difference in use, but with Option 2, you have to have all your minor methods and properties check to see if parsing had occurred before they can proceed. (Option 1 requires to you do everything that option 2 does internally, but also allows you to write Option 3-style code when using it.)
Alternatively, you could make the constructor private and the Parse method static, having the Parse method return an instance of the object.
// Option 4
ParsingClass myClass = ParsingClass.Parse(inputString);
// Parse a new string.
myClass = ParsingClass.Parse(anotherInputString);
Options 1 and 2 provide more flexibility, but require more code to implement. Options 3 and 4 are less flexible, but there's also less code to write. Basically, there is no one right answer to the question. It's really a matter of what fits with your existing code best.
Two important considerations:
1) Can the parsing fail?
If so, and if you put it in the constructor, then it has to throw an exception. The Parse method could return a value indicating success. So check how your colleagues feel about throwing exceptions in situations which aren't show-stopping: default is to assume they won't like it.
2) The constructor must get your object into a valid state.
If you don't mind "hasn't parsed anything yet" being a valid state of your objects, then the parse method is probably the way to go, and call the class SomethingParser.
If you don't want that, then parse in the constructor (or factory, as Garry suggests), and call the class ParsedSomething.
The difference is probably whether you are planning to pass these things as parameters into other methods. If so, then having a "not ready yet" state is a pain, because you either have to check for it in every callee and handle it gracefully, or else you have to write documentation like "the parameter must already have parsed a string". And then most likely check in every callee with an assert anyway.
You might be able to work it so that the initial state is the same as the state after parsing an empty string (or some other base value), thus avoiding the "not ready yet" problem.
Anyway, if these things are likely to be parameters, personally I'd say that they have to be "ready to go" as soon as they're constructed. If they're just going to be used locally, then you might give users a bit more flexibility if they can create them without doing the heavy lifting. The cost is requiring two lines of code instead of one, which makes your class slightly harder to use.
You could consider giving the thing two constructors and a Parse method: the string constructor is equivalent to calling the no-arg constructor, then calling Parse.