Test cases, "when", "what", and "why"? - language-agnostic

I'm new to test-based development, and this question has been bugging me. How much is too much? What should be tested, how should it be tested, and why should it be tested? The examples given are in C# with NUnit, but I assume the question itself is language-agnostic.
Here are two current examples of my own: tests on a generic list object (tested here with strings; the initialisation function adds three items, {"Foo", "Bar", "Baz"}):
[Test]
public void CountChanging()
{
    Assert.That(_list.Count, Is.EqualTo(3));
    _list.Add("Qux");
    Assert.That(_list.Count, Is.EqualTo(4));
    _list[7] = "Quuuux";
    Assert.That(_list.Count, Is.EqualTo(8));
    _list.Remove("Quuuux");
    Assert.That(_list.Count, Is.EqualTo(7));
}

[Test]
public void ContainsItem()
{
    Assert.That(_list.Contains("Qux"), Is.EqualTo(false));
    _list.Add("Qux");
    Assert.That(_list.Contains("Qux"), Is.EqualTo(true));
    _list.Remove("Qux");
    Assert.That(_list.Contains("Qux"), Is.EqualTo(false));
}
The code is fairly self-commenting, so I won't go into what's happening, but is this sort of thing taking it too far? Add() and Remove() are tested separately, of course, so what level should I go to with these sorts of tests? Should I even have these sorts of tests?

I would say that what you're actually testing are equivalence classes. In my view, there is no difference between adding to a list that has 3 items and one that has 7 items. However, there is a difference between 0 items, 1 item and >1 items. I would probably have 3 tests each for the Add/Remove methods for these cases initially.
Once bugs start coming in from QA/users, I would add each such bug report as a test case; see the bug reproduced by getting a red bar; fix the bug by getting a green bar. Each such 'bug-detecting' test is there to stay - it is my safety net (read: regression test) ensuring that even if I make this mistake again, I will have instant feedback.
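A minimal JUnit-style sketch of those equivalence classes (the test names are hypothetical, and the standard ArrayList stands in here for the question's custom list, which is actually exercised with C#/NUnit):

import static org.junit.Assert.assertEquals;
import org.junit.Test;
import java.util.ArrayList;
import java.util.List;

public class AddEquivalenceClassTest {
    // Equivalence class: adding to an empty list
    @Test
    public void addToEmptyList() {
        List<String> list = new ArrayList<>();
        list.add("Foo");
        assertEquals(1, list.size());
    }

    // Equivalence class: adding to a list with exactly one item
    @Test
    public void addToOneItemList() {
        List<String> list = new ArrayList<>();
        list.add("Foo");
        list.add("Bar");
        assertEquals(2, list.size());
    }

    // Equivalence class: adding to a list that already holds more than one item
    @Test
    public void addToManyItemList() {
        List<String> list = new ArrayList<>();
        list.add("Foo");
        list.add("Bar");
        list.add("Baz");
        list.add("Qux");
        assertEquals(4, list.size());
    }
}

The Remove tests would mirror these three cases.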

Think of your tests as a specification. If your system can break (or have material bugs) without your tests failing, then you don't have enough test coverage. If one single point of failure causes many tests to break, you probably have too much (or are too tightly coupled).
This is really hard to define in an objective way. I suppose I'd say err on the side of testing too much. Then when tests start to annoy you, those are the particular tests to refactor/repurpose (because they are too brittle, or test the wrong thing, and their failures aren't useful).

A few tips:
Each testcase should only test one thing. That means that the structure of the testcase should be "setup", "execute", "assert". In your examples, you mix these phases. Try splitting your test-methods up. That makes it easier to see exactly what you are testing.
Try giving your test-methods a name that describes what they are testing. I.e. the three test cases contained in your ContainsItem() become: containsReportsFalseIfTheItemHasNotBeenAdded(), containsReportsTrueIfTheItemHasBeenAdded(), containsReportsFalseIfTheItemHasBeenAddedThenRemoved(). I find that forcing myself to come up with a descriptive name like that helps me conceptualize what I have to test before I code the actual test. (A sketch of this split appears after these tips.)
If you do TDD, you should write your tests first and only add code to your implementation when you have a failing test. Even if you don't actually do this, it will give you an idea of how many tests are enough. Alternatively, use a coverage tool. For a simple class like a container, you should aim for 100% coverage.
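As a rough illustration of the first two tips, here is how the ContainsItem() test from the question might be split into three named tests in JUnit (the original is C#/NUnit; the setup method here plays the role of the question's initialisation function):

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import org.junit.Before;
import org.junit.Test;
import java.util.ArrayList;
import java.util.List;

public class ContainsItemTest {
    private List<String> list;

    @Before
    public void setUp() {
        // Setup phase, shared by every test
        list = new ArrayList<>();
        list.add("Foo");
        list.add("Bar");
        list.add("Baz");
    }

    @Test
    public void containsReportsFalseIfTheItemHasNotBeenAdded() {
        assertFalse(list.contains("Qux"));
    }

    @Test
    public void containsReportsTrueIfTheItemHasBeenAdded() {
        list.add("Qux");                  // execute
        assertTrue(list.contains("Qux")); // assert
    }

    @Test
    public void containsReportsFalseIfTheItemHasBeenAddedThenRemoved() {
        list.add("Qux");
        list.remove("Qux");
        assertFalse(list.contains("Qux"));
    }
}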

Is _list an instance of a class you wrote? If so, I'd say testing it is reasonable. Though in that case, why are you building a custom List class?
If it's not code you wrote, don't test it unless you suspect it's in some way buggy.
I try to test code that's independent and modular. If there's some sort of God-function in code I have to maintain, I strip out as much of it as possible into sub-functions and test them independently. Then the God function can be written to be "obviously correct" -- no branches, no logic, just passing results from one well-tested subfunction to another.
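A small, hypothetical sketch of what that shape looks like (the class and its steps are invented purely for illustration):

public class ReportBuilder {
    // "Obviously correct" top level: no branches, no logic, just plumbing
    public String buildReport(String rawInput) {
        String cleaned = stripComments(rawInput);
        int total = sumValues(cleaned);
        return formatSummary(total);
    }

    // Each step is small and gets its own independent tests
    String stripComments(String input) {
        return input.replaceAll("#.*", "");
    }

    int sumValues(String input) {
        int sum = 0;
        for (String token : input.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                sum += Integer.parseInt(token);
            }
        }
        return sum;
    }

    String formatSummary(int total) {
        return "Total: " + total;
    }
}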

Detecting JUnit "tests" that never assert anything

We used to have a technical director who liked to contribute code and was also very enthusiastic about adding unit tests. Unfortunately his preferred style of test was to produce some output to screen and visually check the result.
Given that we have a large bank of tests, are there any tools or techniques I could use to identify the tests that never assert anything?
Since that's a one-time operation, I would:
scan all test methods (easy: get the JUnit report XML)
use an IDE or other tool to search for references to Assert.*, and export the result as a list of methods
awk/perl/excel the results to find mismatches
Edit: another option is to just look for references to System.out or whatever his preferred way to output stuff was, most tests won't have that.
Not sure of a tool, but the thought that comes to mind is two-fold.
Create a TestRule class that keeps track of the number of asserts per test (use a static counter, clear the counter at the beginning of each test, and assert that it is not 0 at the end of the test).
Wrap the Assert class in your own proxy that increments the TestRule's counter each time it is called.
If your Assert class is also called Assert, you would only need to update the imports and add the Rule to the tests. The mechanism described above is not thread-safe, so if you have multiple tests running concurrently you will get incorrect results.
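A hedged sketch of that two-part idea, assuming JUnit 4 (class and method names invented; only wrap the Assert methods your suite actually uses):

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

public class AssertCountRule implements TestRule {
    // Incremented by the proxy below; not thread-safe, as noted above
    static int assertCount = 0;

    @Override
    public Statement apply(final Statement base, Description description) {
        return new Statement() {
            @Override
            public void evaluate() throws Throwable {
                assertCount = 0;     // clear the counter at the start of the test
                base.evaluate();     // run the test body
                if (assertCount == 0) {
                    throw new AssertionError("Test made no assertions");
                }
            }
        };
    }

    // Proxy for org.junit.Assert: each test only changes its import to use this
    public static class Assert {
        public static void assertTrue(boolean condition) {
            assertCount++;
            org.junit.Assert.assertTrue(condition);
        }

        public static void assertEquals(Object expected, Object actual) {
            assertCount++;
            org.junit.Assert.assertEquals(expected, actual);
        }
        // ...and so on for the other Assert methods in use
    }
}

Each test class would then declare the rule with @Rule public AssertCountRule noSilentTests = new AssertCountRule();.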
If those tests are the only ones that produce output, an automated bulk replacement of System.out.println( with org.junit.Assert.fail("Fix test: " + would highlight exactly those tests that aren't pulling their weight. This technique would make it easy to inspect those tests in an IDE after a run and decide whether to fix or delete them; it also gives a clear indication of progress.

How strict should I be with "do the simplest thing that could possibly work" while doing TDD

For TDD you have to:
Create a test that fails
Do the simplest thing that could possibly work to pass the test
Add more variants of the test and repeat
Refactor when a pattern emerges
With this approach you're supposed to cover all the cases (the ones that come to my mind, at least), but I wonder whether I'm being too strict here, and whether it is possible to "think ahead" to some scenarios instead of simply discovering them.
For instance, I'm processing a file, and if it doesn't conform to a certain format I should throw an InvalidFormatException.
So my first test was:
@Test
public void testFormat() throws InvalidFormatException {
    // empty doesn't do anything nor throw anything
    processor.validate("empty.txt");
    try {
        processor.validate("invalid.txt");
        assert false : "Should have thrown InvalidFormatException";
    } catch (InvalidFormatException ife) {
        assert "Invalid format".equals(ife.getMessage());
    }
}
I run it and it fails because it doesn't throw an exception.
So the next thing that comes to my mind is "Do the simplest thing that could possibly work", so I write:
public void validate(String fileName) throws InvalidFormatException {
    if (fileName.equals("invalid.txt")) {
        throw new InvalidFormatException("Invalid format");
    }
}
Doh!! (Although the real code is a bit more complicated, I found myself doing something like this several times.)
I know that I will eventually have to add another file name, and other tests that would make this approach impractical and force me to refactor into something that makes sense (which, if I understood correctly, is the point of TDD: to discover the patterns that usage unveils), but:
Q: Am I taking the "Do the simplest thing..." advice too literally?
I think your approach is fine, if you're comfortable with it. You didn't waste time writing a silly case and solving it in a silly way - you wrote a serious test for real desired functionality and made it pass in - as you say - the simplest way that could possibly work. Now - and into the future, as you add more and more real functionality - you're ensuring that your code has the desired behavior of throwing the correct exception on one particular badly-formatted file. What comes next is to make that behavior real - and you can drive that by writing more tests. When it becomes simpler to write the correct code than to fake it again, that's when you'll write the correct code. That assessment varies among programmers - and of course some would decide that time is when the first failing test is written.
You're using very small steps, and that's the most comfortable approach for me and some other TDDers. If you're more comfortable with larger steps, that's fine, too - but know you can always fall back on a finer-grained process on those occasions when the big steps trip you up.
Of course your interpretation of the rule is too literal.
It should probably sound like "Do the simplest potentially useful thing..."
Also, I think that when writing the implementation you should forget the body of the test you are trying to satisfy. You should remember only the name of the test (which should tell you what it tests). In this way you will be forced to write code that is generic enough to be useful.
I too am a TDD newbie struggling with this question. While researching, I found this blog post by Roy Osherove that was the first and only concrete and tangible definition of "the simplest thing that could possibly work" that I have found (and even Roy admitted it was just a start).
In a nutshell, Roy says:
Look at the code you just wrote in your production code and ask yourself the following:
“Can I implement the same solution in a way that is ..”
“.. More hard-coded ..”
“.. Closer to the beginning of the method I wrote it in.. “
“.. Less indented (in as less “scopes” as possible like ifs, loops, try-catch) ..”
“.. shorter (literally less characters to write) yet still readable ..”
“… and still make all the tests pass?”
If the answer to one of these is “yes” then do that, and see all the tests still passing.
Lots of comments:
If validation of "empty.txt" throws an exception, you don't catch it.
Don't Repeat Yourself. You should have a single test function that decides whether validation does or does not throw the exception, then call that function twice with two different expected results (a sketch of such a helper appears at the end of this answer).
I don't see any signs of a unit-testing framework. Maybe I'm missing them? But just using assert won't scale to larger systems. When you get a result from validation, you should have a way to announce to a testing framework that a given test, with a given name, succeeded or failed.
I'm alarmed at the idea that checking a file name (as opposed to contents) constitutes "validation". In my mind, that's a little too simple.
Regarding your basic question, I think you would benefit from a broader idea of what the simplest thing is. I'm also not a fundamentalist TDDer, and I'd be fine with allowing you to "think ahead" a little bit. This means thinking ahead to this afternoon or tomorrow morning, not thinking ahead to next week.
You missed point #0 in your list: know what to do. You say you are processing a file for validation purposes. Once you have specified what "validation" means (hint: do this before writing any code), you might have a better idea of how to a) write tests that, well, test the specification as implemented, and b) write the simplest thing.
If, e.g., validation is "must be XML", your test case is just some non-XML-conformant string, and your implementation uses an XML library and (if necessary) transforms its exceptions into those specified for your "validation" feature.
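Picking up the Don't Repeat Yourself point above, a sketch of what that shared helper might look like (Processor and InvalidFormatException are the question's classes; the helper name and the boolean parameter are invented for illustration):

import static org.junit.Assert.fail;
import org.junit.Test;

public class ValidationTest {
    private final Processor processor = new Processor();

    // Single place that decides whether validation should or should not throw
    private void checkValidation(String fileName, boolean expectException) {
        try {
            processor.validate(fileName);
            if (expectException) {
                fail("Expected InvalidFormatException for " + fileName);
            }
        } catch (InvalidFormatException e) {
            if (!expectException) {
                fail("Did not expect InvalidFormatException for " + fileName);
            }
        }
    }

    @Test
    public void emptyFileIsAccepted() {
        checkValidation("empty.txt", false);
    }

    @Test
    public void invalidFileIsRejected() {
        checkValidation("invalid.txt", true);
    }
}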
One thing of note to future TDD learners - the TDD mantra doesn't actually include "Do the simplest thing that could possibly work." Kent Beck's TDD Book has only 3 steps:
Red— Write a little test that doesn't work, and perhaps doesn't even compile at first.
Green— Make the test work quickly, committing whatever sins necessary in the process.
Refactor— Eliminate all of the duplication created in merely getting the test to work.
Although the phrase "Do the simplest thing..." is often attributed to Ward Cunningham, he actually asked the question "What's the simplest thing that could possibly work?", but that question was later turned into a command, which Ward believes may confuse rather than help.
Edit: I can't recommend reading Beck's TDD Book strongly enough - it's like having a pairing session with the master himself, giving you his insights and thoughts on the Test Driven Development process.
Just as a method should do one thing only, one test should test one thing (one behavior) only. To address the example given, I'd write two tests, for instance test_no_exception_for_empty_file and test_exception_for_invalid_file. The second could indeed be several tests, one per sort of invalidity.
The third step of the TDD process should be interpreted as "add a new variant of the test", not "add a new variant to the test". Indeed, a unit test should be atomic (test one thing only) and generally follows the triple-A pattern: Arrange - Act - Assert. And it's very important to verify that the test fails first, to ensure it is really testing something.
I would also separate the responsibility of reading the file from validating its content. That way, the test can pass a buffer to the validate() function, and the tests do not have to read files. Usually unit tests do not access the filesystem, because this slows them down.
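A sketch of both suggestions together: two atomic tests that feed content straight to the validator (the Reader-based validate() overload is an assumption here, not the question's actual API):

import static org.junit.Assert.fail;
import org.junit.Test;
import java.io.StringReader;

public class ValidatorContentTest {
    private final Processor processor = new Processor(); // class from the question

    @Test
    public void test_no_exception_for_empty_content() throws InvalidFormatException {
        // No file I/O: the content is handed to the validator directly
        processor.validate(new StringReader(""));
    }

    @Test
    public void test_exception_for_invalid_content() {
        try {
            processor.validate(new StringReader("not the expected format"));
            fail("Should have thrown InvalidFormatException");
        } catch (InvalidFormatException expected) {
            // expected: invalid content must be rejected
        }
    }
}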

Should I change code to make it more testable?

I often find myself changing my code to make it more testable, and I always wonder whether this is a good idea or not. Some of the things I find myself doing are:
Adding setters just so I can set an internal object to a mock.
Adding getters for internal maps/lists so I can check the internal state of the object has changed after performing some external action.
Wrapping concrete system classes and creating a new interface so I can mock them. For example, File classes can be hard to mock, so I'll create a new interface FileInterface and a WrappedFile which implements it, and then use FileInterface instead of File (a rough sketch follows this list).
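Something along these lines, assuming Java's java.io.File and an interface restricted to whichever methods the calling code actually needs (the method selection here is illustrative):

import java.io.File;

// FileInterface.java: the narrow seam the production code depends on
public interface FileInterface {
    boolean exists();
    long length();
    String getName();
}

// WrappedFile.java: production implementation that simply delegates to File
public class WrappedFile implements FileInterface {
    private final File file;

    public WrappedFile(File file) {
        this.file = file;
    }

    @Override public boolean exists() { return file.exists(); }
    @Override public long length() { return file.length(); }
    @Override public String getName() { return file.getName(); }
}

Tests can then hand the code a mock or fake FileInterface without touching the filesystem.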
Changing your code to make it more testable can be a good thing, but only if it makes your code itself better. Refactoring for testability can make your code better independent of the test suite's needs. Those are good changes.
Of your three examples only #3 is a really good one; often those new interfaces will make your code more flexible for regular use later. #1 is usually addressed for testing via dependency injection, which in my mind makes code needlessly more complicated but does at least make it easier to test. #2 sounds like a bad idea in general.
It is perfectly fine and even recommended to change your code to make it more testable. Here is a list of 10 things that make code hard to test.
I think your third is really OK, but I'm not too fond of the first and the second. If you just open up your class internals with getters and setters, then you're giving up encapsulation completely. Depending on your language, there are ways to open up the visibility of some members for testing. But what I actually do (which breaks encapsulation a little less) is to make the fields I want to check protected (when dependency injection doesn't solve the problem).
Then, in the test project, I inherit from the class and create a "more powerful" one where I can check the internals, but I change nothing in the implementation, and I use this subclass in the tests.
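A tiny, hypothetical sketch of that pattern (names invented):

// Production class: the field is protected only so a test subclass can see it
public class Cache {
    protected int hits;

    public String lookup(String key) {
        hits++;
        return null; // real lookup elided
    }
}

// Lives in the test project; exposes internal state, changes no behaviour
class TestableCache extends Cache {
    int hitCount() {
        return hits;
    }
}

Tests instantiate TestableCache instead of Cache and assert on hitCount().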
Finally, changing your code to have dependency injection and inversion of control is also highly recommended, as it makes your code easier to test AND more readable and maintainable.
Though changing is OK, the best thing to do is TDD. It produces testable code naturally, since the tests are written first.
It's a trade-off.
You want either a slim, streamlined API, or a bloated, more complicated, but more easily tested one.
Not the answer you wanted to hear, I know :)
Seems reasonable. Some things don't need checking, though; I'd suggest that checking whether adding to a list worked is a little useless. But do whatever you feel comfortable with.
Ideally, every class you design should test itself, so you don't need to change the public interface. But you are talking about legacy code, so I think that changing code is reasonable only when the impact on the public interface isn't very noticeable. I would prefer to add a static inner test class rather than bloat the interface of the class under test.

Should a TDD test always fail first?

As a follow-on to the discussion in the comments of this answer, should a TDD test always be made to fail first?
Consider the following example. If I am writing an implementation of LinkedHashSet and one test tests that after inserting a duplicate, the original is in the same iteration order as before the insert, I might want to add a separate test that the duplicate is not in the set at all.
The first test will be observed to fail first, and then implemented.
The problem is that it is quite likely that the implementation to make the first test pass used a different set implementation to store the data, so just as a side effect the second test already passes.
I would think that the main purpose of seeing the test fail is to ensure that the test is a good test (many times I've written a test I thought would fail but didn't, because the test was written wrong). But if you are confident that the test you write does indeed test something, isn't it valuable to have it anyway, to ensure that you don't break that behavior later?
Of course it's valuable, because then it is a useful regression test. In my opinion, regression tests are more important than testing newly developed code.
To say that they must always fail first is taking a rule beyond practicality.
Yes, TDD tests must fail before they turn green (work). Otherwise you do not know if you have a valid test.
TDD for me is more of a design tool, not an afterthought. So there is no other way: the test will fail simply because there is no code to make it pass yet; only after I create that code can the test pass.
I think the point of "failing first" is to avoid kidding yourself that a test worked. If you have a set of tests checking the same method with different parameters, one (or more) of them is likely to pass from the start. Consider this example:
public String doFoo(int param) {
    //TODO implement me
    return null;
}
The tests would be something like:
public void testDoFoo_matches() {
    assertEquals("Geoff Hurst", createBar().doFoo(1966));
}

public void testDoFoo_validNoMatch() {
    assertEquals("no match", createBar().doFoo(1));
}

public void testDoFoo_outOfRange() {
    assertEquals(null, createBar().doFoo(-1));
}

public void testDoFoo_tryAgain() {
    assertEquals("try again", createBar().doFoo(0));
}
One of those tests will pass, but clearly the others won't, so you have to implement the code properly for the set of tests to pass. I think that is the true requirement. The spirit of the rule is to ensure you have thought about the expected outcome before you start hacking.
What you're actually asking is how you can test the test to verify that it is a valid one and it tests what you intend.
Making it fail at first is an ok option, but note that even if it fails when you plan it to fail and succeeds after you refactor the code to make it succeed, that still doesn't mean that your test actually tested what you wanted... Of course you can write some other classes which behave differently to test your test... But that's actually a test which tests your original test - How do you know that the new test is valid? :-)
So making a test fail first is a good idea but it still isn't foolproof.
IMHO, the importance of failing first is to make sure that the test you created doesn't have a flaw. You could, for instance, forget the Assert in your test, and you'd maybe never know that.
A similar case occurs when you're doing boundary tests: you've already built the code that covers them, but it is still recommended to test them.
I think it's not a big problem if your test doesn't fail first, but you have to make sure it is indeed testing what it should (by debugging, maybe).

Developing to an interface with TDD

I'm a big fan of TDD and use it for the vast majority of my development these days. One situation I run into somewhat frequently, though, and have never found what I thought was a "good" answer for, is something like the following (contrived) example.
Suppose I have an interface, like this (writing in Java, but really, this applies to any OO language):
public interface PathFinder {
    GraphNode[] getShortestPath(GraphNode start, GraphNode goal);
    int getShortestPathLength(GraphNode start, GraphNode goal);
}
Now, suppose I want to create three implementations of this interface. Let's call them DijkstraPathFinder, DepthFirstPathFinder, and AStarPathFinder.
The question is, how do I develop these three implementations using TDD? Their public interface is going to be the same, and, presumably, I would write the same tests for each, since the results of getShortestPath() and getShortestPathLength() should be consistent among all three implementations.
My choices seem to be:
Write one set of tests against PathFinder as I code the first implementation. Then write the other two implementations "blind" and make sure they pass the PathFinder tests. This doesn't seem right because I'm not using TDD to develop the second two implementation classes.
Develop each implementation class in a test-first manner. This doesn't seem right because I would be writing the same tests for each class.
Combine the two techniques above; now I have a set of tests against the interface and a set of tests against each implementation class, which is nice, but the tests are all the same, which isn't nice.
This seems like a fairly common situation, especially when implementing a Strategy pattern, and of course the differences between implementations might be more than just time complexity. How do others handle this situation? Is there a pattern for test-first development against an interface that I'm not aware of?
You write interface tests to exercise the interface, and you write more detailed tests for the actual implementations. Interface-based design talks a bit about the fact that your unit tests should form a kind of "contract" specification for that interface. Maybe when Spec# comes out, there'll be a language-supported way to do this.
In this particular case, which is a strict strategy implementation, the interface tests are enough. In other cases, where an interface is a subset of the implementation's functionality, you would have tests for both the interface and the implementation. Think of a class which implements 3 interfaces, for example.
EDIT: This is useful so that when you add another implementation of the interface down the road, you already have tests for verifying that the class implements the contract of the interface correctly. This can work for something as specific as ISortingStrategy to something as wide-ranging as IDisposable.
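One common way to express that contract in JUnit is an abstract test class with a factory method that each implementation's test overrides; a rough sketch against the question's PathFinder interface (the zero-length-path-to-self rule and GraphNode's no-arg constructor are assumptions made for illustration):

import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Contract tests written once against the interface...
public abstract class PathFinderContractTest {

    // Each implementation's test class supplies its own instance
    protected abstract PathFinder createPathFinder();

    @Test
    public void shortestPathLengthFromANodeToItselfIsZero() {
        PathFinder finder = createPathFinder();
        GraphNode node = new GraphNode(); // assumes a no-arg constructor
        assertEquals(0, finder.getShortestPathLength(node, node));
    }

    // ...more tests that every implementation must satisfy
}

// ...and inherited by each implementation's test with no duplication:
class DijkstraPathFinderTest extends PathFinderContractTest {
    @Override
    protected PathFinder createPathFinder() {
        return new DijkstraPathFinder();
    }
}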
There is nothing wrong with writing tests against the interface and reusing them for each implementation, for example:
public class TestPathFinder : TestClass
{
    public IPathFinder _pathFinder;
    public IGraphNode _startNode;
    public IGraphNode _goalNode;

    public TestPathFinder() : this(null, null, null) { }
    public TestPathFinder(IPathFinder ipf,
                          IGraphNode start, IGraphNode goal) : base()
    {
        _pathFinder = ipf;
        _startNode = start;
        _goalNode = goal;
    }
}

TestPathFinder tpfDijkstra = new TestPathFinder(
    new DijkstraPathFinder(), n1, nN);
tpfDijkstra.RunTests();
//etc. - factory optional
I would argue that this is the least effort solution, which is very much in line with Agile/TDD principles.
I would have no problem going with option 1, and keep in mind that refactoring is part of TDD and it's usually during a refactoring phase that you move to a design pattern such as strategy, so I wouldn't feel bad about doing that w/o writing new tests.
If you wanted to test the implementation-specific details of each PathFinder impl, you might consider passing mock GraphNodes which are somehow capable of helping to assert the Dijkstra-ness or DepthFirst-ness, etc, of the implementation. (Perhaps these mock GraphNodes could record how they are traversed, or somehow measure performance.) Maybe this is testing overkill, but then again if you know your system needs these three distinct strategies for some reason, it'd probably be good to have tests to demonstrate why - otherwise why not just pick one implementation and throw the others away?
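Very roughly, such a recording node might look like this (entirely hypothetical; the question doesn't show how GraphNode exposes its neighbours):

// Hypothetical test double: counts how often the path finder expands this node,
// so a test can compare how much of the graph each strategy explores.
public class RecordingGraphNode extends GraphNode {
    private int expansions = 0;

    @Override
    public GraphNode[] getNeighbours() { // assumed accessor, not from the question
        expansions++;
        return super.getNeighbours();
    }

    public int expansions() {
        return expansions;
    }
}

A test could then assert, for example, that the A* implementation expands fewer recording nodes than the depth-first one on the same graph.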
I don't mind reusing test code as a template for new tests that have similar functionality. Depending on the particular class under test, you may have to rework them with different mock objects and expectations. At the least you'll have to refactor them to use the new implementation. I would follow the TDD method, though, of taking one test, reworking it for the new class, then writing just the code to pass that test. This may take even more discipline, though, since you already have one implementation under your belt and will undoubtedly be influenced by code you have already written.
"This doesn't seem right because I'm not using TDD to develop the second two implementation classes."
Sure you are.
Start by commenting out all the tests but one. As you make a test pass, either refactor or uncomment another test.
Jtf