Detecting JUnit "tests" that never assert anything - junit

We used to have a technical director who liked to contribute code and was also very enthusiastic about adding unit tests. Unfortunately his preferred style of test was to produce some output to screen and visually check the result.
Given that we have a large bank of tests, are there any tools or techniques I could use to identify the tests never assert?

Since that's a one time operation I would:
scan all test methods (easy, get the jUnit report XML)
use an IDE or other to search references to Assert.*, export result as a list of method
awk/perl/excel the results to find mismatches
Edit: another option is to just look for references to System.out or whatever his preferred way to output stuff was, most tests won't have that.

Not sure of a tool, but the thought that comes to mind is two-fold.
Create a TestRule class that keeps track of the number of asserts per test (use static counter, clear counter at beginning of test, assert that it is not 0 at end of test).
Wrap the Assert class in your own proxy that increments the TestRule's counter each time it is called.
Is your Assert class is called Assert that you would only need to update the imports and add the Rule to the tests. The above described mechanism is not thread-safe so if you have multiple tests running concurrently you will be incorrect results.

If those tests are the only ones that produce output, an automated bulk replacement of System.out.println( with org.junit.Assert.fail("Fix test: " + would highlight exactly those tests that aren't pulling their weight. This technique would make it easy to inspect those tests in an IDE after a run and decide whether to fix or delete them; it also gives a clear indication of progress.

Related

Prevent jUnit from failing if assumption not satisfied

When running a series of test cases using jUnit, it is possible to skip the test for some cases using assumptions. However, if there are no cases in which the assumption is satisfied, the test is marked as failed. Is there a way to prevent that?
For example, if the property testTopic is passed as vertical_move, all of the tests for vertical_move should run and none of the tests for horizontal_move. Currently I am using assume as shown below to skip these tests.
assumeTrue(TestCons.get("testTopic").contains("horizontal_move"));
The problem is that if the assumption fails in all cases, the test is marked as failed. I want it prevent that. In other words, if the assumption fails, just skip the test without failing it, even if the assumption is never satisfied. Is there a way to do that? Thanks.
You can use #Ignore to mark tests that should be completely excluded from execution, but I am not aware how you could determine that from within a test case.
And honestly: I think you shouldn't even try that.
Unit tests should be straight forward. They exist to help you to quickly identify a bug in your production code. But: any piece of "extra logic" that a reader needs to digest to understand the reason for a failing testcase makes that harder.
Thus: avoid putting any such extra logic in your test cases. Write testcases that create a clear setup, and then assert for the result that you expect for exactly that setup, and nothing else.

NUnit equivalent for JUnit test state management with #Before/#After

I come from Java world and I mostly used JUnit, and now I have some problems expressing some aspects of tests with NUnit 3. In JUnit, each test creates its own instance of a test class, so it's perfectly valid to create some instance variables in a test class, set up them in #Before method, test method and helpers can access these variables freely without worrying they would be overwritten by other tests run in parallel, and #After tears down the test data nicely. With NUnit it does not work and SetUp and TearDown methods seem to be useless in this case, because test fixture instance is reused between invocations of test method(s), so fields of test fixture class can (and are) overwritten by every invocation of a test method (my class has a few test methods, and each of them generates several test cases, so there are some tens of invocations in one test run).
I do not know how to work around this problem. In my scenario, set up would create a temporary folder, which would be used as a work folder for following test case. Tear down would delete the temporary folder afterwards, cleaning up all intermediate files created by tested method. But now, when SetUp creates and stores a temporary folder path in instance field (so it can be read by test logic and somewhat complicated asserts and verifiers), the value of such field is overwritten by test cases run in parallel. I considered several approaches:
implement an IDisposable which would represent a context of each test, and enclose it with using in each test method - I do not like this idea, because I do not like the idea of IDisposable being used as anything else than resource management tool and combinig IDisposable with using to simulate set up/tear down smells to me like an abuse of this particular language feature,
create a method which accepts a delegate for actual test logic, and which invokes custom SetUpTestCase/TearDownTestCase methods. The method would invoke set up, then test delegate, and tear down afterwards. What I do not like about this approach is that it does not play well with test methods which accept parameters - each set of test methods parametrized in particular way would need a corresponding delegate type. Also it somewhat seems to be against spirit of NUnit and the way of describing test methods with attributes - after all, why should the main logic of my test be delegated to anything? Shouldn't the [Test] or [TestCase] method be actual test?
maybe there's some way to use more advanced aspects of NUnit, like actions or some callbacks/triggers/whatever, I am just too unexperienced to see these. What I particularly miss is the way to transfer data from set up method (for example, a path to a temporary folder created by it) to the test method that follows. I cannot use instance fields for this, and I do not know whether there exists any "tag" structure which would pass test-specific data between methods invoked on different stages of a test lifecycle?
Generally, SetUp and TearDown attributes seem pretty useless to me, if they cannot set up the test case without their result being overwritten immediately by another test case run in parallel. What am I missing here?
How can I implemented such per-test case, scoped setup/tear down behavior with NUnit? What do I do wrong, or what do I miss?
As you have established, the TestFixture class is instantiated once before the OneTimeSetUp is called; then for each test it runs a set of SetUp, Test and TearDown; and finally, the OneTimeTearDown.
If you want the tests to be run in parallel (which is not the default) then you must specify The Parallelizable Attribute. Whether you do that or not, it is a good idea for your tests to be written independently, so they do not conflict with each other - they need to be structured.
The AAA (Arrange, Act, Assert) pattern is a common way of structuring unit tests for a method under test. If your tests are to be run in parallel, then TestFixture fields are not suitable for holding information which may conflict across parallel tests, in the same way that it wouldn't be suitable in a multithreaded class.
I'd suggest using a private method in the TestFixture to set up the temporary folder - it will need to have some way of providing a unique folder name, so that the parallel tests do not interact - perhaps use a Guid or CallerMemberName as part of the folder name, and return the folder name.
This method should be called from the Arrange part of the test. And you'll need a try...finally wrapping the rest of the Test to ensure the folder gets torn down. Or you could go with your IDisposable idea - I don't think there's anything wrong with that: the whole point of that is to guarantee tidying up resources (both managed and unmanaged) when something goes out of scope.
Your second suggestion of a delegate would also be fine if you used lambda expressions rather than strictly-defined delegates - the lambda expression can capture variables from the containing scope.

Drools testing with junit

What is the best practice to test drools rules with junit?
Until now we used junit with dbunit to test rules. We had sample data that was put to hsqldb. We had couple of rule packages and by the end of the project it is very hard to make a good test input to test certain rules and not fire others.
So the exact question is that how can I limit tests in junit to one or more certain rule(s) for testing?
Personally I use unit tests to test isolated rules. I don't think there is anything too wrong with it, as long as you don't fall into a false sense of security that your knowledge base is working because isolated rules are working. Testing the entire knowledge base is more important.
You can write the isolating tests with AgendaFilter and StatelessSession
StatelessSession session = ruleBase.newStatelessSesssion();
session.setAgendaFilter( new RuleNameMatches("<regexp to your rule name here>") );
List data = new ArrayList();
... // create your test data here (probably built from some external file)
StatelessSessionResult result == session.executeWithResults( data );
// check your results here.
Code source: http://blog.athico.com/2007/07/my-rules-dont-work-as-expected-what-can.html
Do not attempt to limit rule execution to a single rule for a test. Unlike OO classes, single rules are not independent of other rules, so it does not make sense to test a rule in isolation in the same way that you would test a single class using a unit test. In other words, to test a single rule, test that it has the right effect in combination with the other rules.
Instead, run tests with a small amount of data on all of your rules, i.e. with a minimal number of facts in the rule session, and test the results and perhaps that a particular rule was fired. The result is not actually that much different from what you have in mind, because a minimal set of test data might only activate one or two rules.
As for the sample data, I prefer to use static data and define minimal test data for each test. There are various ways of doing this, but programmatically creating fact objects in Java might be good enough.
I created simple library that helps to write unit tests for Drools. One of the features is exactly what you need: declare particular drl files you want to use for your unit test:
#RunWith(DroolsJUnitRunner.class)
#DroolsFiles(value = "helloworld.drl", location = "/drl/")
public class AppTest {
#DroolsSession
StatefulSession session;
#Test
public void should_set_discount() {
Purchase purchase = new Purchase(new Customer(17));
session.insert(purchase);
session.fireAllRules();
assertTrue(purchase.getTicket().hasDiscount());
}
}
For more details have a look on blog post: https://web.archive.org/web/20140612080518/http://maciejwalkowiak.pl/blog/2013/11/24/jboss-drools-unit-testing-with-junit-drools/
A unit test with DBUnit doesn't really work. An integration test with DBUnit does. Here's why:
- A unit test should be fast.
-- A DBUnit database restore is slow. Takes 30 seconds easily.
-- A real-world application has many not null columns. So data, isolated for a single feature, still easily uses half the tables of the database.
- A unit test should be isolated.
-- Restoring the dbunit database for every test to keep them isolated has drawbacks:
--- Running all tests takes hours (especially as the application grows), so no one runs them, so they constantly break, so they are disabled, so there is no testing, so you application is full of bugs.
--- Creating half a database for every unit test is a lot of creation work, a lot of maintenance work, can easily become invalid (with regards to validation which database schema's don't support, see Hibernate Validator) and ussually does a bad job of representing reality.
Instead, write integration tests with DBunit:
- One DBunit, the same for all tests. Load it only once (even if you run 500 tests).
-- Wrap each test in a transaction and rollback the database after every test. Most methods use propagation required anyway. Set the testdata dirty only (to reset it in the next test if there is a next test) only when propagation is requires_new.
- Fill that database with corner cases. Don't add more common cases than are strictly needed to test your business rules, so ussually only 2 common cases (to be able to test "one to many").
- Write future-proof tests:
-- Don't test the number of activated rules or the number of inserted facts.
-- Instead, test if a certain inserted fact is present in the result. Filter the result on a certain property set to X (different from the common value of that property) and test the number of inserted facts with that property set to X.
Unit test is about taking minimum piece of code and test all possible usecases defining specification. With integration tests your goal is not all possible usecases but integration of several units that work together. Do the same with rules. Segregate rules by business meaning and purpose. Simplest 'unit under the test' could be file with single or high cohension set of rules and what is required for it to work (if any), like common dsl definition file and decision table. For integration test you could take meaningful subset or all rules of the system.
With this approach you'll have many isolated unit tests and few integration tests with limited amount of common input data to reproduce and test common scenarios. Adding new rules will not impact most of unit tests but few integration tests and will reflect how new rules impact common data flow.
Consider JUnit testing library that could be suitable for this approach

How strict should I be in the "do the simplest thing that could possible work" while doing TDD

For TDD you have to
Create a test that fail
Do the simplest thing that could possible work to pass the test
Add more variants of the test and repeat
Refactor when a pattern emerge
With this approach you're supposing to cover all the cases ( that comes to my mind at least) but I'm wonder if am I being too strict here and if it is possible to "think ahead" some scenarios instead of simple discover them.
For instance, I'm processing a file and if it doesn't conform to a certain format I am to throw an InvalidFormatException
So my first test was:
#Test
void testFormat(){
// empty doesn't do anything nor throw anything
processor.validate("empty.txt");
try {
processor.validate("invalid.txt");
assert false: "Should have thrown InvalidFormatException";
} catch( InvalidFormatException ife ) {
assert "Invalid format".equals( ife.getMessage() );
}
}
I run it and it fails because it doesn't throw an exception.
So the next thing that comes to my mind is: "Do the simplest thing that could possible work", so I :
public void validate( String fileName ) throws InvalidFormatException {
if(fileName.equals("invalid.txt") {
throw new InvalidFormatException("Invalid format");
}
}
Doh!! ( although the real code is a bit more complicated, I found my self doing something like this several times )
I know that I have to eventually add another file name and other test that would make this approach impractical and that would force me to refactor to something that makes sense ( which if I understood correctly is the point of TDD, to discover the patterns the usage unveils ) but:
Q: am I taking too literal the "Do the simplest thing..." stuff?
I think your approach is fine, if you're comfortable with it. You didn't waste time writing a silly case and solving it in a silly way - you wrote a serious test for real desired functionality and made it pass in - as you say - the simplest way that could possibly work. Now - and into the future, as you add more and more real functionality - you're ensuring that your code has the desired behavior of throwing the correct exception on one particular badly-formatted file. What comes next is to make that behavior real - and you can drive that by writing more tests. When it becomes simpler to write the correct code than to fake it again, that's when you'll write the correct code. That assessment varies among programmers - and of course some would decide that time is when the first failing test is written.
You're using very small steps, and that's the most comfortable approach for me and some other TDDers. If you're more comfortable with larger steps, that's fine, too - but know you can always fall back on a finer-grained process on those occasions when the big steps trip you up.
Of course your interpretation of the rule is too literal.
It should probably sound like "Do the simplest potentially useful thing..."
Also, I think that when writing implementation you should forget the body of the test which you are trying to satisfy. You should remember only the name of the test (which should tell you about what it tests). In this way you will be forced to write the code generic enough to be useful.
I too am a TDD newbie struggling with this question. While researching, I found this blog post by Roy Osherove that was the first and only concrete and tangible definition of "the simplest thing that could possibly work" that I have found (and even Roy admitted it was just a start).
In a nutshell, Roy says:
Look at the code you just wrote in your production code and ask yourself the following:
“Can I implement the same solution in a way that is ..”
“.. More hard-coded ..”
“.. Closer to the beginning of the method I wrote it in.. “
“.. Less indented (in as less “scopes” as possible like ifs, loops, try-catch) ..”
“.. shorter (literally less characters to write) yet still readable ..”
“… and still make all the tests pass?”
If the answer to one of these is “yes” then do that, and see all the tests still passing.
Lots of comments:
If validation of "empty.txt" throws an exception, you don't catch it.
Don't Repeat Yourself. You should have a single test function that decides if validation does or does not throw the exception. Then call that function twice, with two different expected results.
I don't see any signs of a unit-testing framework. Maybe I'm missing them? But just using assert won't scale to larger systems. When you get a result from validation, you should have a way to announce to a testing framework that a given test, with a given name, succeeded or failed.
I'm alarmed at the idea that checking a file name (as opposed to contents) constitutes "validation". In my mind, that's a little too simple.
Regarding your basic question, I think you would benefit from a broader idea of what the simplest thing is. I'm also not a fundamentalist TDDer, and I'd be fine with allowing you to "think ahead" a little bit. This means thinking ahead to this afternoon or tomorrow morning, not thinking ahead to next week.
You missed point #0 in your list: know what to do. You say you are processing a file for validation purposes. Once you have specified what "validation" means (hint: do this before writing any code), you might have a better idea of how to a) write tests that, well, test the specification as implemented, and b) write the simplest thing.
If, e.g., validation is "must be XML", your test case is just some non-xml-conformant string, and your implementation is using an XML library and (if necessary) transform its exceptions into those specified for your "validation" feature.
One thing of note to future TDD learners - the TDD mantra doesn't actually include "Do the simplest thing that could possibly work." Kent Beck's TDD Book has only 3 steps:
Red— Write a little test that doesn't work, and perhaps doesn't even
compile at first.
Green— Make the test work quickly, committing
whatever sins necessary in the process.
Refactor— Eliminate all of the duplication created in merely getting the test to work.
Although the phrase "Do the simplest thing..." is often attributed to Ward Cunningham, he actually asked a question "What's the simplest thing that could possibly work?", but that question was later turned into a command - which Ward believes may confuse rather help.
Edit: I can't recommend reading Beck's TDD Book strongly enough - it's like having a pairing session with the master himself, giving you his insights and thoughts on the Test Driven Development process.
Like a method should do one thing only, one test should test one thing (behavior) only. To address the example given, I'd write two tests, for instance, test_no_exception_for_empty_file and test_exception_for_invalid_file. The second could indeed be several tests - one per sort of invalidity.
The third step of the TDD process shall be interpreted as "add a new variant of the test", not "add a new variant to the test". Indeed, a unit test shall be atomic (test one thing only) and generally follows the triple A pattern: Arrange - Act - Assert. And it's very important to verify the test fails first, to ensure it is really testing something.
I would also separate the responsibility of reading the file and validating its content. That way, the test can pass a buffer to the validate() function, and the tests do not have to read files. Usually unit tests do not access to the filesystem cause this slow them down.

Test cases, "when", "what", and "why"?

Being new to test based development, this question has been bugging me. How much is too much? What should be tested, how should it be tested, and why should it be tested? The examples given are in C# with NUnit, but I assume the question itself is language agnostic.
Here are two current examples of my own, tests on a generic list object (being tested with strings, the initialisation function adds three items {"Foo", "Bar", "Baz"}):
[Test]
public void CountChanging()
{
Assert.That(_list.Count, Is.EqualTo(3));
_list.Add("Qux");
Assert.That(_list.Count, Is.EqualTo(4));
_list[7] = "Quuuux";
Assert.That(_list.Count, Is.EqualTo(8));
_list.Remove("Quuuux");
Assert.That(_list.Count, Is.EqualTo(7));
}
[Test]
public void ContainsItem()
{
Assert.That(_list.Contains("Qux"), Is.EqualTo(false));
_list.Add("Qux");
Assert.That(_list.Contains("Qux"), Is.EqualTo(true));
_list.Remove("Qux");
Assert.That(_list.Contains("Qux"), Is.EqualTo(false));
}
The code is fairly self-commenting, so I won't go into what's happening, but is this sort of thing taking it too far? Add() and Remove() are tested seperately of course, so what level should I go to with these sorts of tests? Should I even have these sorts of tests?
I would say that what you're actually testing are equivalence classes. In my view, there is no difference between a adding to a list that has 3 items or 7 items. However, there is a difference between 0 items, 1 item and >1 items. I would probably have 3 tests each for Add/Remove methods for these cases initially.
Once bugs start coming in from QA/users, I would add each such bug report as a test case; see the bug reproduce by getting a red bar; fix the bug by getting a green bar. Each such 'bug-detecting' test is there to stay - it is my safety net (read: regression test) that even if I make this mistake again, I will have instant feedback.
Think of your tests as a specification. If your system can break (or have material bugs) without your tests failing, then you don't have enough test coverage. If one single point of failure causes many tests to break, you probably have too much (or are too tightly coupled).
This is really hard to define in an objective way. I suppose I'd say err on the side of testing too much. Then when tests start to annoy you, those are the particular tests to refactor/repurpose (because they are too brittle, or test the wrong thing, and their failures aren't useful).
A few tips:
Each testcase should only test one thing. That means that the structure of the testcase should be "setup", "execute", "assert". In your examples, you mix these phases. Try splitting your test-methods up. That makes it easier to see exactly what you are testing.
Try giving your test-methods a name that describes what it is testing. I.e. the three testcases contained in your ContainsItem() becomes: containsReportsFalseIfTheItemHasNotBeenAdded(), containsReportsTrueIfTheItemHasBeenAdded(), containsReportsFalseIfTheItemHasBeenAddedThenRemoved(). I find that forcing myself to come up with a descriptive name like that helps me conceptualize what I have to test before I code the actual test.
If you do TDD, you should write your test firsts and only add code to your implementation when you have a failing test. Even if you don't actually do this, it will give you an idea of how many tests are enough. Alternatively use a coverage tool. For a simple class like a container, you should aim for 100% coverage.
Is _list an instance of a class you wrote? If so, I'd say testing it is reasonable. Though in that case, why are you building a custom List class?
If it's not code you wrote, don't test it unless you suspect it's in some way buggy.
I try to test code that's independent and modular. If there's some sort of God-function in code I have to maintain, I strip out as much of it as possible into sub-functions and test them independantly. Then the God function can be written to be "obviously correct" -- no branches, no logic, just passing results from one well-tested subfunction to another.