How to ensure quality of junit tests? - junit

Are there proven ways of verifying the quality of JUnit tests or integration tests?
Should your business analyst review unit tests to certify them? Or are there other ways?
In a traditional code-first environment a peer or lead would review the test plan, but what about automated tests?
I looked at this Stack Overflow thread but couldn't extract anything meaningful.
Thoughts?

Mutation testing and code coverage can verify the quality of your tests.
So first check that your code coverage is high enough. After that, verify with mutation testing that your tests are good. A mutation testing tool makes small changes in the production code and reruns the tests; after each modification, a good test suite should fail. For a mutation testing tool in Java, look at PIT Mutation Testing and this blog post: Introduction to mutation testing with PIT and TestNG
But this is still not enough; tests should also be well written and readable. So you also need code review and quality rules verification for tests. I recommend a nice book about writing good tests, Practical Unit Testing. Chapter 10: Maintainable tests from this book is available for free.
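To make the idea of a mutant concrete, here is a minimal hand-written sketch. The class and method names are hypothetical, and the mutant is written out by hand; a tool such as PIT generates and runs mutants automatically. It shows one boundary mutation (">=" changed to ">") that survives a weak test and is killed by a test that checks the boundary value.

```java
import java.util.function.IntPredicate;

// Hypothetical example: names are illustrative, not taken from any tool.
final class MutationSketch {

    // Production code: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Hand-written mutant of the kind a mutation tool might generate:
    // the boundary operator ">=" has been changed to ">".
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak test: only checks values far away from the boundary.
    static boolean weakTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Stronger test: also checks both sides of the boundary.
    static boolean strongTestPasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5) && impl.test(18) && !impl.test(17);
    }

    public static void main(String[] args) {
        // The weak test passes against the mutant, so the mutant "survives".
        System.out.println("weak test vs mutant passes:   "
                + weakTestPasses(MutationSketch::isAdultMutant));
        // The strong test fails against the mutant, so the mutant is "killed".
        System.out.println("strong test vs mutant passes: "
                + strongTestPasses(MutationSketch::isAdultMutant));
        // Both tests pass against the original code.
        System.out.println("strong test vs original passes: "
                + strongTestPasses(MutationSketch::isAdult));
    }
}
```

A surviving mutant is a hint that no test pins down that piece of behaviour, even though the line itself may be reported as covered.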

Here's a nice linked article:
http://www.ibm.com/developerworks/java/library/j-cq01316/index.html?ca=drs
And:
Good Tests ⇒ High Coverage
High Coverage ⇏ Good Tests
Coverage tools are useful for identifying which areas of your project need more attention, but that doesn't mean that areas with good coverage don't need more attention.

A code coverage tool is a good start, but knowing that a given line was executed does not mean it was tested. Infamous test cases without assertions or with expected=Exception.class are an example.
I can imagine a few criteria on this level:
if the line is tested, any change to it (inverting a condition, removing it...) should fail at least one test
a given piece of logic should be fully reconstructible based only on its tests
the test does not mirror the production code
the test should not depend on the current date, locale, timezone, or the order of other tests
One might try to automate the first one; the others are more or less subjective.
As for an analyst doing test review: probably only FitNesse fixtures are readable enough to satisfy non-developers.
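One of the criteria above, not depending on the current date, can be illustrated with a small sketch. The TrialPeriod class and its 30-day rule are hypothetical; the point is only that the code takes a java.time.Clock as a parameter, so a test can pass a fixed clock instead of reading the real system time.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical class: a trial that lasts 30 days from its start date.
final class TrialPeriod {
    private final Clock clock;
    private final LocalDate start;

    TrialPeriod(Clock clock, LocalDate start) {
        this.clock = clock;
        this.start = start;
    }

    // True while fewer than 30 days have passed since the start date.
    boolean isActive() {
        LocalDate today = LocalDate.now(clock); // reads the injected clock, not the system time
        return today.isBefore(start.plusDays(30));
    }

    public static void main(String[] args) {
        // A fixed clock makes the outcome identical on every run and in every timezone.
        Clock fixed = Clock.fixed(Instant.parse("2020-01-15T12:00:00Z"), ZoneOffset.UTC);

        TrialPeriod recent = new TrialPeriod(fixed, LocalDate.of(2020, 1, 1));
        TrialPeriod old = new TrialPeriod(fixed, LocalDate.of(2019, 12, 1));

        System.out.println(recent.isActive()); // true
        System.out.println(old.isActive());    // false
    }
}
```

In production code the same constructor would typically receive Clock.systemDefaultZone() or a similar real clock.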

Code review is the best way to ensure test quality. I would not have business analysts review the tests, for the simple fact that they might not have the training necessary to understand the tests. Also, unit tests do not all live at the functional level, where analysts' requirements are. An analyst might say 'when the user clicks save, the profile is saved' whereas you might have to write n number of tests across multiple layers to get that functionality.

You might consider code coverage tools to ensure 100% of the code lines are being tested. Emma is a good tool for java (http://emma.sourceforge.net/).

Related

Regression-Test vs. Non-Regression-Test

As a follow up to this answer and a discussion in the comments.
Is regression-test a misnomer for non-regression-test or are these different types of tests?
See this post: https://medium.com/#paulochf/thoughts-about-non-regression-testing-and-why-i-used-it-in-a-pr-for-scikit-learn-f64133abf4b0
I have summarized the conclusion from this post in the graph below:
I've just found this in the Wikipedia article's talk.
1) The "Regression and non-regression testing" section seems to mischaracterize regression testing. This statement: "the intent of regression testing is to assure that a software bug has been successfully corrected by retesting the modified software" is wrong (see regression testing, the aim of regression testing is to ensure that correcting the bug has not introduced any errors in existing functions, not to test if the bug itself has been fixed). Once you remove that incorrect statement the whole paragraph falls apart and I can't see the difference between NRT and RT. I suspect this article should be removed or simply redirected to regression testing as they are the same thing. Meritw (talk) 17:46, 2 August 2013 (UTC)
2) I am not sure completely if the content of NRT is valid or not(yet to read through it), but RT and NRT are quite separate. RT tests existing functionality and NRT will include new functionality testing and would become part of RT at a later point of time depending on lot of variables. In short say an application exists which has a functionality set of X. An RT group exists to cover X. A new code change occurs on the application to make the functionality as (X+Y). The RT will still test X until it evolves at a later point of time to include Y. But to cover the extra changes based on Y, a NRT has to be conducted which most of the times is a manual process until the RT evolves to cover both X+Y. A m i t 웃 18:29, 2 August 2013 (UTC)
So I noticed:
1) the definitions really are somewhat confusing
2) non-regression tests become regression tests once the improvements they check are successfully certified.
Although they're semantically different, they can relate to the same object (the code containing the commands that test the main program).
A regression test is usually a test performed to ensure that the different functionalities of the system still work as expected and that newly added functionality did not break any existing ones. It could be a combination of API/UI/unit tests that are run periodically.
Depending on the context of your project, non-regression tests could mean many different things, such as smoke tests or unit tests that are run on every code check-in. It could also mean story-level testing performed when testing a particular feature or requirement in a story. It could also be security testing, load testing, or stress testing performed at some point in the development lifecycle.

Junit - program verification vs whitebox fuzzing?

I understand that program verification is a branch of computer engineering, but that its practical application to real-world code bases is limited by combinatorial explosion.
I also understand that as part of designing your software change, for a modification to an existing Java framework, it's helpful to think about whitebox, boundary and blackbox tests for your algorithm, in advance. (Some people call this hammock driven development - thinking before you code.)
Assuming you take this thinking and embed it in junit style tests, I'm assuming that the Computer Science name for the contents is strictly 'whitebox testing/fuzzing' and not sufficient to comprise 'program verification'.
So my question is - junit tests - whitebox fuzzing or program verification?
Program verification is done by proving mathematical properties on a mathematical model that is related to your application (the model can be derived from the formal semantics of the programming language or built by hand, for example by writing behavioral types that model your web service).
Take a look at pi-calculus to understand what I mean.
Of course, junit has nothing to do with formal program verification.

How do you stress test your own software?

I've been working on an app, by myself, and I am at a stage where everything works great--as long as the user does everything he or she is supposed to do. :-) The software needs more testing to see how robust it is, how well it works when people do things like click the same button repeatedly, try to open the wrong kind of files, put data in the wrong places, etc.
I'm having a little trouble with this because it's a bit difficult for me to think in terms of using the application incorrectly. These are all edge cases to me. Still, I'd like to have the application as stable and well tested as possible before I start giving it to beta testers. Assuming that I am not talking about hiring professional testers at this point, I'm curious whether y'all have any tips or systematic ways of thinking about this task.
Thanks, as always.
Well, it sounds like you are talking about two different things:
"Testing your application's functionality" and "Stress testing" (which is the title of your question).
Stress testing is when you have a website and want to check that it can serve 100,000 people at the same time, i.e. seeing how your application performs under stress. You can do this in a number of ways, e.g. by recording some actions and then getting a number of agent machines to hit your application concurrently.
This question sounds more like a Quality Assurance question. That is what testers and beta testers are for. But there are things that you can do yourself to validate that your application works as well as it can.
Unit testing your code would be a good start, it helps you to try and find those edge cases. If your method takes in things like ints, try passing in int.max, int.min, and seeing what blows up. Pass nulls into everything. If you are using .Net you might want to look at PEX, it will go through all the branches/codepaths that your application has. That might help you to further refine your unit tests to test your application the best you can.
Integration tests, see what happens end to end for some of your usual things. This will help you 'find bugs' as you are developing later.
Those are some quick tips on things you can do yourself to try and find edge cases that you may have missed. But yes, eventually you will need to pass your app off to someone else to test. Just make sure that you have covered off as much as you can before it hits them :-)
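As a rough Java illustration of the "pass in the extremes and nulls" advice above, the sketch below uses two hypothetical midpoint helpers and a hypothetical null-tolerant length function. It is only meant to show how a boundary value such as Integer.MAX_VALUE exposes an overflow that ordinary inputs never reveal.

```java
// Hypothetical helpers used only to demonstrate boundary-value checks.
final class BoundarySketch {

    // Naive midpoint: a + b overflows when the sum exceeds the int range.
    static int naiveMidpoint(int a, int b) {
        return (a + b) / 2;
    }

    // Safer variant: widen to long before adding, then narrow the result.
    static int safeMidpoint(int a, int b) {
        return (int) (((long) a + b) / 2);
    }

    // Treats null like an empty string instead of throwing.
    static int length(String s) {
        return s == null ? 0 : s.length();
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        int min = Integer.MIN_VALUE;

        // Ordinary inputs: both versions agree.
        System.out.println(naiveMidpoint(2, 4) + " " + safeMidpoint(2, 4)); // 3 3

        // Boundary inputs: the naive version overflows and returns -1.
        System.out.println(naiveMidpoint(max, max)); // -1
        System.out.println(safeMidpoint(max, max) == max); // true
        System.out.println(safeMidpoint(min, min) == min); // true

        // Null input is handled rather than blowing up.
        System.out.println(length(null)); // 0
    }
}
```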
Make sure you have adequate code coverage in your unit tests and integration tests.
Use appropriate UI validation, and test combinations that can break it.
I have found that a well-architected application that reduces the number of possible permutations in the UI (ways the user can break it) helps a lot. Design patterns like MVC can be especially useful in this regard, since they make your UI veneer as thin as possible.
Automation.
(Re)Factor your code so that another program can throw user-events at it. Create simple scripts of user events and play them back to your program. Capture events from beta users and save those as test scripts (useful for reproducing problems and checking for regressions). Write a fuzz-tester that applies small random changes to the scripts and try them against your program as well.
With this kind of automation you can stress an application and find glaring problems like crashes and memory leaks. It won't test the actual functionality. For functionality, unit tests can be helpful. There are a ton of unit testing frameworks out there to try. Pick something useful, learn to write good tests, and integrate them into your build process.
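The script-replay and fuzzing idea above could look roughly like the following sketch. Everything here is an assumption made for illustration: the Event enum, the tiny App stand-in, and the three kinds of mutation (insert, delete, replace). A real application would expose its own event interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of replaying and fuzzing recorded user-event scripts.
final class EventFuzzSketch {

    enum Event { CLICK_OPEN, TYPE_TEXT, CLICK_SAVE, CLOSE }

    // Stand-in for the real application: it only tracks open documents.
    static final class App {
        private int openDocuments = 0;

        void handle(Event e) {
            switch (e) {
                case CLICK_OPEN:
                    openDocuments++;
                    break;
                case CLOSE:
                    if (openDocuments > 0) openDocuments--;
                    break;
                case TYPE_TEXT:
                case CLICK_SAVE:
                    break; // no state change in this sketch
            }
            if (openDocuments < 0) throw new IllegalStateException("negative document count");
        }
    }

    // Returns a copy of the script with one small random change:
    // insert an event, delete an event, or replace an event.
    static List<Event> mutate(List<Event> script, Random rng) {
        List<Event> copy = new ArrayList<>(script);
        Event[] all = Event.values();
        int kind = rng.nextInt(3);
        if (copy.isEmpty() || kind == 0) {
            copy.add(rng.nextInt(copy.size() + 1), all[rng.nextInt(all.length)]);
        } else if (kind == 1) {
            copy.remove(rng.nextInt(copy.size()));
        } else {
            copy.set(rng.nextInt(copy.size()), all[rng.nextInt(all.length)]);
        }
        return copy;
    }

    // Replays a script against a fresh App; returns false if the App throws.
    static boolean replay(List<Event> script) {
        App app = new App();
        try {
            for (Event e : script) app.handle(e);
            return true;
        } catch (RuntimeException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Event> recorded = Arrays.asList(
                Event.CLICK_OPEN, Event.TYPE_TEXT, Event.CLICK_SAVE, Event.CLOSE);
        Random rng = new Random(42); // fixed seed so a failing run can be reproduced
        int failures = 0;
        for (int i = 0; i < 1000; i++) {
            if (!replay(mutate(recorded, rng))) failures++;
        }
        System.out.println("failing scripts: " + failures);
    }
}
```

Using a fixed seed is a deliberate choice: when a mutated script does break the program, the same script can be regenerated and saved as a regression test.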

What is Code Coverage?

I have 3 questions:
What is Code Coverage?
What is it good for?
What tools are used for analyzing Code Coverage?
You can get very good information from SO WEB SITE
Free code coverage tools
What is Code Coverage and how do YOU measure it?
Code Coverage is a measurement of how many lines/blocks/arcs of your code are executed while the automated tests are running. CC is collected by using a specialized tool to instrument the binaries to add tracing calls and then running a full set of automated tests against the instrumented product. Good CC tools will give you not only the percentage of the code that is executed, but will also allow you to drill into the data and see exactly which lines of code were executed during a particular test.
Code coverage algorithms were first created to address the problem of assessing source code by looking directly at the source code. Code coverage belongs to the structural testing category because the assertions are made on the internal parts of the program and not on system outputs. Code coverage therefore aims at finding the parts of the code that the tests do not exercise.
http://www.stickyminds.com/sitewide.asp?Function=edetail&ObjectType=ART&ObjectId=7580
It's good for:
Function coverage, which aims at finding how many functions or procedures were executed.
Statement or line coverage, which identifies how many lines of the source code have been executed.
Condition coverage or decision coverage, which answers the question of how many conditions (such as loop conditions) were evaluated in the program.
Path coverage, which focuses on finding whether all possible paths from a given starting point in the code have been executed.
Entry and exit coverage, which finds how many functions (C/C++, Java) or procedures (Pascal) were executed from beginning to end.
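The difference between line coverage and condition/decision coverage in the list above can be seen in a very small sketch. The label method is hypothetical and deliberately buggy: one call executes every line, yet only one of the two outcomes of the if is ever taken, and the bug sits in the outcome that was never exercised.

```java
// Hypothetical, deliberately buggy method used to contrast line and decision coverage.
final class CoverageGapSketch {

    static String label(String name, boolean shout) {
        String s = null;
        if (shout) s = name.toUpperCase(); // decision with two outcomes on a single line
        return s.trim();                   // throws NullPointerException when shout is false
    }

    public static void main(String[] args) {
        // This single call executes every line of label(): 100% line coverage.
        System.out.println(label(" bob ", true)); // BOB

        // The "false" outcome of the decision was never exercised by the call above,
        // which is exactly what a decision/branch coverage report would flag.
        try {
            label(" bob ", false);
        } catch (NullPointerException e) {
            System.out.println("unexercised branch fails with NullPointerException");
        }
    }
}
```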
TOOLS
http://www.codecoveragetools.com/
http://java-source.net/open-source/code-coverage
http://www.codecoveragetools.com/index.php/coverage-process/code-coverage-tools-java.html
http://open-tube.com/10-code-coverage-tools-c-c/
http://csharp-source.net/open-source/code-coverage
http://www.kdedevelopers.org/node/3190
From wikipedia article
Code coverage is a measure used in software testing. It describes the degree to which the source code of a program has been tested. It is a form of testing that inspects the code directly and is therefore a form of white box testing. Currently, the use of code coverage is extended to the field of digital hardware, the contemporary design methodology of which relies on Hardware description languages (HDLs).
Advocating the use of code coverage
A code coverage tool simply keeps track of which parts of your code get executed and which parts do not. Usually, the results are granular down to the level of each line of code. So in a typical situation, you launch your application with a code coverage tool configured to monitor it. When you exit the application, the tool will produce a code coverage report which shows which lines of code were executed and which ones were not. If you count the total number of lines which were executed and divide by the total number of lines which could have been executed, you get a percentage. If you believe in code coverage, the higher the percentage, the better. In practice, reaching 100% is extremely rare.
The use of a code coverage tool is usually combined with the use of some kind of automated test suite. Without automated testing, a code coverage tool merely tells you which features a human user remembered to use. Such a tool is far more useful when it is measuring how complete your test suite is with respect to the code you have written.
Related articles
The Future of Code-Coverage Tools
The effectiveness of code coverage tools in software testing
Tools
Open Source Code Coverage Tools in Java
Code coverage is a metric, showing how "well" the source code is tested. There are several types of code coverage: line coverage, function coverage, branch coverage.
In order to measure the coverage, you must run the application, either manually or with automated tests.
Tools can be divided in two categories:
- the ones that run the compiled code in a modified environment (like the debugger), counting the required points (functions, lines, etc.);
- the ones that require special compilation - in this case the resulting binary already contains the code which actually does the counting.
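The second category above, where the counting code is compiled into the program, can be imitated by hand. In this sketch the probe array and the probe assignments are written manually and the names are hypothetical; a real tool would insert equivalent bookkeeping automatically during compilation or class loading.

```java
// Hand-instrumented sketch: each probes[i] = true line stands in for
// bookkeeping that a coverage tool would insert automatically.
final class ProbeSketch {

    // One flag per instrumented point in the code below.
    static final boolean[] probes = new boolean[4];

    static int abs(int x) {
        probes[0] = true;          // method entered
        if (x < 0) {
            probes[1] = true;      // negative branch taken
            return -x;
        }
        probes[2] = true;          // non-negative branch taken
        return x;
    }

    static int neverCalled() {
        probes[3] = true;          // stays false unless something calls this method
        return 0;
    }

    // Executed points divided by total points, as a percentage.
    static double coveragePercent() {
        int hit = 0;
        for (boolean p : probes) {
            if (p) hit++;
        }
        return 100.0 * hit / probes.length;
    }

    public static void main(String[] args) {
        abs(5);
        System.out.println(coveragePercent()); // 50.0
        abs(-5);
        System.out.println(coveragePercent()); // 75.0
    }
}
```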
There are several tools for measuring and visualizing the result; which ones apply depends on the platform and on the language of the source code.
Please read the article on Wikipedia.
For specific tool recommendations, please say which OS and language you use.
Code coverage is a measure used in software testing. It describes the degree to which the source code of a program has been tested.
http://en.wikipedia.org/wiki/Code_coverage
The Wikipedia definition is pretty good, but in my own words code coverage tells you how much automated testing you have accounted for. 100% would mean that every single line of code in your application is being covered by a unit test.
NCover is an application for .NET
The term refers to how well your program is covered by your tests. See the following wikipedia article for more info:
http://en.wikipedia.org/wiki/Code_coverage
The other answers already cover what Code Coverage is. The thing I'd like to stress is that you need to be careful not to treat high coverage as implicitly meaning you've tested all scenarios. It doesn't necessarily say how well you've tested the code or the quality of your tests, just that you've hit a certain percentage of code as part of the tests running.
High Code Coverage does not necessarily mean High Test Quality, but High Test Quality does mean High Code Coverage
In practice, I usually aim for 90-95% code coverage which is often achievable. The last few % are often too expensive to be worth trying to hit.
There are many ways to develop applications. One of those is "Extreme Programming" or "Test Driven Design (TDD)". It states that all code should be tested. Code Coverage is a means of measuring how much is tested.
I'd like to make a small remark about this: I don't think all code should be tested, nor that one should set a specific percentage of code coverage. Neither do I think that code shouldn't be tested with Unit Tests (code testing code). I do think one should decide what makes sense to test. For this reason I generally don't use code coverage.
One thing that some tools provide is highlighting of the parts that are tested. This way you might run into some code that isn't tested but actually should be, which is the only thing I use it for.
Good answers.
My two cents is that there is no method of testing that catches all errors, but less testing will never catch more errors, so any testing is good. To my mind, coverage testing is not to show what code has been exercised, but to show what code has not been exercised, because that is where bugs love to lurk.
If you combine it with single-stepping, it is a very good way to review code and catch bugs. Here's an example.
Another useful tool for ensuring code quality (which encompasses code coverage) that I recently used is Sonar.
Here is the link http://www.sonarqube.org/

Design By Contract and Test-Driven Development [closed]

I'm working on improving our group's development process, and I'm considering how best to implement Design By Contract with Test-Driven Development. It seems the two techniques have a lot of overlap, and I was wondering if anyone had some insight on the following (related) questions:
Isn't it against the DRY principle to have TDD and DbC unless you're using some kind of code generator to generate the unit tests based on contracts? Otherwise, you have to maintain the contract in two places (the test and the contract itself), or am I missing something?
To what extent does TDD make DbC redundant? If I write tests well enough, aren't they equivalent to writing a contract? Do I only get added benefit if I enforce the contract at run time as well as through the tests?
Is it significantly easier/more flexible to only use TDD rather than TDD with DbC?
The main point of these questions is this more general question: If we're already doing TDD properly, will we get a significant benefit for the overhead if we also use DbC?
A couple of details, though I think the question is largely language-agnostic:
Our team is very small, <10 programmers.
We mostly use Perl.
Note the differences.
Design driven by contract. Contract Driven Design.
Develop driven by test. Test Driven Development.
They are related in that one precedes the other. They describe software at different levels of abstraction.
Do you discard the design when you go to implementation? Do you consider that a design document is a violation of DRY? Do you maintain the contract and the code separately?
Software is one implementation of the contract. Tests are another. User's manual is a third. Operations guide is a fourth. Database backup/restore procedures are one part of an implementation of the contract.
I can't see any overhead from Design by Contract.
If you're already doing design, then you change the format from too many words to just the right words to outline the contractual relationship.
If you're not doing design, then writing a contract will eliminate problems, reducing cost and complexity.
I can't see any loss of flexibility.
start with a contract,
then
a. write tests and
b. write code.
See how the two development activities are essentially intertwined and both come from the contract.
I think there is overlap between DbC and TDD, however, I don't think there is duplicated work: introducing DbC will probably result in a reduction of test cases.
Let me explain.
In TDD, tests aren't really tests. They are behavioral specifications. However, they are also design tools: by writing the test first, you use the external API of your object under test – that you haven't actually written yet – in the same way that a user would. That way, you design the API in a way that makes sense to a user, and not in the way that makes it easiest for you to implement. Something like queue.full? instead of queue.num_entries == queue.size.
This second part cannot be replaced by Contracts.
The first part can be partially replaced by contracts, at least for unit tests. TDD tests serve as specifications of behavior, both to other developers (unit tests) and domain experts (acceptance tests). Contracts also specify behavior, to other developers, to domain experts, but also to the compiler and the runtime library.
But contracts have fixed granularity: you have method pre- and postconditions, object invariants, module contracts and so on. Maybe loop variants and invariants. Unit tests however, test units of behavior. Those might be smaller than a method or consist of multiple methods. That's not something you can do with contracts. And for the "big picture" you still need integration tests, functional tests and acceptance tests.
And there is another important part of TDD that DbC doesn't cover: the middle D. In TDD, tests drive your development process: you never write a single line of implementation code unless you have a failing test, you never write a single line of test code unless your tests all pass, you only write the minimal amount of implementation code to make the tests pass, you only write the minimal amount of test code to produce a failing test.
In conclusion: use tests to design the "flow", the "feel" of the API. Use contracts to design the, well, contract of the API. Use tests to provide the "rhythm" for the development process.
Something like this:
1. Write an acceptance test for a feature
2. Write a unit test for a unit that implements a part of that feature
3. Using the method signature you designed in step 2, write the method prototype
4. Add the postcondition
5. Add the precondition
6. Implement the method body
7. If the acceptance test passes, goto 1, otherwise goto 2
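The middle of that loop (unit test, prototype, postcondition, precondition, body) can be sketched in Java as follows. This is only an illustration under assumptions: BoundedQueue is a hypothetical class, the contract is checked with plain if statements because Java has no built-in contract syntax, and isFull() plays the role of queue.full? from the earlier example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class: contracts are hand-checked, since Java has no native DbC support.
final class BoundedQueue {
    private final Deque<String> items = new ArrayDeque<>();
    private final int capacity;

    BoundedQueue(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    // API shaped by writing the test first: callers ask isFull(), not size() == capacity.
    boolean isFull() {
        return items.size() == capacity;
    }

    int size() {
        return items.size();
    }

    void enqueue(String item) {
        // Precondition: the item is not null and the queue is not full.
        if (item == null || isFull()) {
            throw new IllegalStateException("precondition violated");
        }
        int oldSize = items.size();

        items.addLast(item); // method body

        // Postcondition: the size grew by exactly one.
        if (items.size() != oldSize + 1) {
            throw new AssertionError("postcondition violated");
        }
    }

    public static void main(String[] args) {
        // The unit test from step 2, written as plain checks to stay self-contained.
        BoundedQueue queue = new BoundedQueue(2);
        queue.enqueue("a");
        System.out.println(queue.isFull()); // false
        queue.enqueue("b");
        System.out.println(queue.isFull()); // true
        try {
            queue.enqueue("c");
        } catch (IllegalStateException e) {
            System.out.println("enqueue on a full queue rejected");
        }
    }
}
```

The test exercises a unit of behaviour (filling the queue), while the contract states what must hold for every single call.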
If you want to know what Bertrand Meyer, the inventor of Design by Contract, thinks about combining TDD and DbC, there is a nice paper by his group, called Contract Driven Development = Test Driven Development - Writing Test Cases. The basic premise is that contracts provide an abstract representation of all possible cases, whereas test cases only test specific cases. Therefore, a suitable test harness can be automatically generated from the contracts.
I would add:
the API is the contract for the programmers, the UI definition is the contract with the clients, the protocol is the contract for client-server interactions. Get those first, then you can take advantage of parallel development tracks and not get lost in the weeds. Yes, periodically review to make sure requirements are met, but never start a new track without the contract. And 'contract' is a strong word: once deployed, it must never change. You should include versioning management and introspection from the get-go, changes to the contract are only implemented by extension sets, version numbers change with these, and then you can do things like graceful degradation when dealing with mixed or old installations.
I learned this lesson the hard way, with a large project that wandered off into never-never land, then applied it the right way later when seriously under the gun, company-survival, short fuse timeline. We defined the protocol, defined and wrote a set of protocol emulations for each side of the transactions (basically canned message generators and received message checker, one evenings' worth of two-brained coding), then parted to separately write the server and client ends of the app. We recombined the night of the show, and it just worked. Requirements, design, contract, test, code, integrate. In that order. Repeat until baked.
I am a little leery of design by TLA. As with Patterns, buzz-word compliant recipes are a good guide, but it is my experience that there is no such thing as a one-size-fits-all design or project management procedure. If you are doing things precisely By The Book (tm) then, unless it is a DOD contract with DOD procedural requirements, you will probably get into trouble somewhere along the way. Read the Book(s), yes, but be sure to understand them, and then take into account also the people side of your team. Rules that are only enforced by the Book will not get enforced uniformly - even when tool-enforced there can be drop-outs (e.g. svn comments left empty or cryptically brief). Procedures only tend to get followed when the tool chain not only enforces them but makes following easier than any possible short-cuts. Believe me, when the going gets tough, the short-cuts get found, and you may not know about the ones that got used at 3am until it is too late.
You can also use executable acceptance tests that are written in the domain language of the contract. It might not be the actual "contract", but half way between unit tests and the contract.
I would recommend using Ruby Cucumber
http://github.com/aslakhellesoy/cucumber
But since you are a Perl shop, then maybe you can use my own small attempt at p5-cucumber.
http://github.com/kesor/p5-cucumber
Microsoft has done work on automatic generation of unit tests, based on code contracts and parameterized unit tests. E.g. the contract says the count must be increased by one when an item is added to a collection, and the parameterized unit test says how to add "n" items to a collection. Pex will then try to create a unit test that proves the contract is broken. See this video for an overview.
If this works, your unit test will only have to be written for one example of each thing you are trying to test, and Pex will then be able to work out which data items will break the test.
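The collection example above can be imitated in Java with a rough sketch. Note the assumptions: Bag is a hypothetical collection with a deliberate flaw (it silently drops duplicates), and plain random search is used as a much cruder stand-in for Pex, which explores code paths rather than guessing inputs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch: random search standing in for a real test-generation tool.
final class ContractSearchSketch {

    // Collection under test. It is backed by a set, so duplicates are silently dropped.
    static final class Bag {
        private final Set<Integer> items = new HashSet<>();

        int count() {
            return items.size();
        }

        void add(int item) {
            items.add(item);
        }
    }

    // Contract check: the count must increase by exactly one for every add.
    static boolean addHonoursContract(Bag bag, int item) {
        int before = bag.count();
        bag.add(item);
        return bag.count() == before + 1;
    }

    // Parameterized test: says how to add n items; the data is supplied from outside.
    static boolean parameterizedTest(int[] items) {
        Bag bag = new Bag();
        for (int item : items) {
            if (!addHonoursContract(bag, item)) return false;
        }
        return true;
    }

    // Tries random inputs until one breaks the contract; returns null if none is found.
    static int[] findCounterexample(Random rng, int attempts) {
        for (int i = 0; i < attempts; i++) {
            int[] items = new int[1 + rng.nextInt(5)];
            for (int j = 0; j < items.length; j++) {
                items[j] = rng.nextInt(4);
            }
            if (!parameterizedTest(items)) return items;
        }
        return null;
    }

    public static void main(String[] args) {
        int[] counterexample = findCounterexample(new Random(1), 1000);
        System.out.println("input that breaks the contract: " + Arrays.toString(counterexample));
    }
}
```

Any input containing a repeated value breaks the contract here, which is the kind of data item a hand-written single-example test could easily miss.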
I had some ruminations about that topic some time ago.
You may want to take a look at
http://gleichmann.wordpress.com/2007/12/09/test-driven-development-and-design-by-contract-friend-or-foe/
When you are using TDD to implement a new method, you need some input: you need to know the assertions to check in your tests. Design-by-contract gives you those assertions: they are the post-conditions and invariants of the method.
I have found DbC very handy for jumpstarting the red-green-refactor cycle because it helps in identifying unit tests to start with. With DbC I start thinking about pre-conditions that the object being TDD-ed must handle and each pre-condition might represent a failing unit test to start a red-green-refactor cycle. At some point I switch to start the cycle with a failing unit test for a post-condition, and then just keep the TDD flow going. I have tried this approach with newcomers to TDD and it really works in kickstarting the TDD mindset.
In summary, think of DbC as an effective way to identify key behavioral unit tests. DbC helps in analyzing inputs (pre-conditions) and outputs (post-conditions), which are the two things that we need to control (inputs) and observe (outputs) to write testable software (an aim it shares with TDD).