Some questions about the process of BDD from a beginner - junit

These days I've read several articles about BDD to find what it is talking about. Now I get a basic understandings, but still not clear about the whole process.
The following is what I think to do in a BDD process:
All the stakeholders(BA, customer, Dev, QA) are sitting together to discuss the requirements, and write the agreed features on story cards. Here I take a "user registeration" feature as example:
As a user,
I want to register on the system,
so that I can use its services
Create several scenarios in Given/When/Then format, and here is one of them:
Scenario: user successfully register
Given an register page
And an un-registered user
When the user fills username "Jeff" and password "123456"
And click on "Register"
Then the user can see a "Success" message
And the user "Jeff" is created in the system
Implement this scenario with some BDD testing framework, say cucumber-jvm, like:
import cucumber.api.java.en.Given;
public class Stepdefs {
#Given("an register page")
public void an_register_page throws Throwable {
// ...
}
#Given("an un-registered user")
public void an_register_page throws Throwable {
// ...
}
// ...
}
for the steps one by one.
But I soon find myself in trouble: there are pages, models, maybe databases need for this scenario, seems a lot of thing to do.
What should I do now?
What should I do now? Do I need to discuss this scenario with all the stakeholders? For BA/Customer/QA, I don't think they really care about the implementations, is it a good idea to discuss it with some other developers?
Suppose after I discuss it with some other developers, we agree to split it to several small parts. Can we make these small parts as "scenario"s with Scenario/Given/When/Then format as what we just did with cucumber-jvm, or can we use JUnit as we normally do in TDD?
1. If choose "cucumber-jvm", it seems a little heavy for small part
2. If choose JUnit, we need to involve more than one testing framework in a project
3. Is it the best if there is a single testing framework to do both things (not sure if there is)
Suppose I choose option 2, using JUnit for the small tasks
The following is what I will do after this decision:
Now we create new small tests to drive implementations, like creating user in database, as we normally do in TDD. (red->green->refactoring). And we don't care the cucumber test Scenario: user successfully register (which is failed) for now, just leave it there. Right?
We develop more small tests with JUnit, made them red -> green -> refactored. (And the imcomplete cucumber test is always failed)
Until all the small tests are passed, we turn to the cucumber test Scenario: user successfully register. Completed it and make sure it turn green at last.
Now develop another scenario, if it's easy, we can implement it just with cucumber, otherwise we will have to split it and write several jUnit test
There are definitely a lot of mis-understandings, even very basic ones. Because I don't find myself gain much value from BDD except the "discuss with all stakeholders" part.
Where is my mistake? Appreciate for any sugguestions!

Don't start with logging in; start with the thing that's different to the other systems out there. Why is someone logging in? Why do they want to use the service? Hard-code a user, pretend they're logged in, focus on the value.
If you focus on UI details you tie yourself to the UI very strongly, and it makes the UI hard to change. Instead, look at what capabilities the system is delivering. I don't recommend using a login scenario anyway, but if I did, I'd expect it to look more like:
Given Jeff isn't registered with the site
When he registers with the username "Jeff" and password "123456"
Then his account creation should be confirmed
And he should be invited to log in for the first time.
Look up "declarative vs. imperative" here to see more on this.
If your UI is really in flux, try out the scenario manually until the UI has settled down a bit. It will be easier to automate then. As you move into more stable scenarios it will be better to automate first (TDD-style).
What should you do now? Well, most people don't write class-level tests for the UI, so don't worry about it until you start driving out the controller and presenter layers. It's normally easier to have frameworks in the same language, but two different frameworks is fine. Cucumber / RSpec, JBehave / JUnit, SpecFlow / NUnit are pretty typical combinations. Do the smallest amount you need to get that first scenario working. It won't be much, because you can hard-code a lot of it. The second scenario will start introducing more interesting behaviour, and then you'll start to see your class-level tests emerge.
BTW, BDD started at a class level, so you can do the same thing with the classes; think of an example of how you might use it, write "Given, When, Then" in comments if your framework doesn't work that way already, and then fill in the gaps!
Yes, your Cucumber scenario will be red throughout, until it isn't.
Ideally you'll be making one last unit test and the Cucumber scenario pass at the same time, rather than just writing a bit of extra code. It's very satisfying to see it finally go green.
The original point of BDD was to get rid of the word "test", since it causes people to think of things like TDD as being about testing. TDD's really about clean design; understanding the responsibilities and behaviour of your code, in the same way that scenarios help you understand the capabilities and behaviour of your system. It should be normal to write both system-level scenarios and class-level tests too.
You're already ahead of all the people who forget to discuss the scenarios before they start coding, though! The conversations with stakeholders are the most important part. You might get value out of including a tester in those conversations. Testers are very good at spotting scenarios that other people miss.
It looks like you're pretty much on the right track where the rest of the process is concerned. You might find some of the other BDD answers in my profile helpful for you too. Congrats and good luck!

I think doing registration/sign_in first is a really good thing to do when you are learning the mechanics of doing BDD. Pretty much everyone understands why you would want to sign into a system, and everyone understands that the system has to know who you are before you can do this, so you have to register first.
Doing this simple task allows you to concentrate on a smaller subset of BDD. By narrowing your focus you can improve quality, whilst being aware that there is much more to learn a little later on.
To write your sign in scenarios you need to focus on two things:
writing scenarios
implementing step definitions
These are the basic mechanics of BDD, but they are only a small part of the overall process. Still I think you'd benefit from working on them because at the moment you are not executing the mechanics very well, which is to be expected because you are new to this.
When you write scenarios you should concentrate on 'what' you are doing and 'why' you are doing it. Scenarios have no need to know anything about 'how' you do things. Anything to do with filling in stuff, clicking on stuff etc. is a smell. When your scenarios only deal with the what and why they become much simpler.
Feature: Registration
A pre-requistite for signing in, see sign_in.feature
Scenario: Register
Given I am a new user
When I register
Then I should be registered
Feature: Sign in
Dependant on registration ...
I want to sign in so I can have personalised content and ...
Scenario: Sign in
Given I am registered
When I sign in
Then I should be signed in
You really don't much more than this to drive the development of a simple sign_in system. Once you have that running you can deal with some sad paths e.g.
Scenario: Sign in with bad password
Given I am registered
When I sign in with a bad password
Then I should not be signed in
And I should be told ...
If you implement things nicely this sad path scenario should be trivial to implement as all the infrastructure is already in place to sign in, all that is different is you are using a bad password.
You can see an example of this at https://github.com/diabolo/cuke_up. The way to use this example is follow the commit history, and in particular notice how I am using the extract_method refactor to take all the code out of the step definitions. Each method I extract is a tool to reused when writing subsequent scenarios. Making an effective set of tools is the key to productivity when implementing scenarios.
Nowadys sign_up is so simple because we can rely on a 3rd party library and their unit tests. This means we can get pretty good results without ever having to worry about the transition to our own code and doing bits of TDD. So for now there really is no need to think about TDD.
So long as you are aware that you are only doing a small subset of BDD, I think you can successfully use this approach to provide foundations for all the extra stuff you have to deal with when working with the things that differentiate your system from others.
To summarize, just focus on
writing simple scenarios
making your step definitions elegant
creating tools (extracted methods) that can be used in the next scenario you write
You have plenty of time to learn the other stuff, and it will be much easier if your basic mechanics are better developed.

Related

Test Driven Design - where did I go wrong?

I am playing with a toy project at home to better understand Test Driven Design. At first things seemed to be going well and I got into the swing of failing tests, code, passing test.
I then came to add a test and realised it would be difficult with my current structure and that furthermore I should split a particular class which had too many responsibilities. Adding even more responsibilities for the next test was clearly wrong. I decided to put aside this test, and refactor what I had. This is where things started to go wrong.
It was difficult to refactor without breaking lots of tests at once, and then the only option seemed to be to make many changes and hope I ended up back at something where the tests passed again. The tests themselves were valid, I just had to break nearly all of them while refactoring. The refactoring (which I'm still not that happy with) took me five or six hours before I had returned to all tests passing. The tests did help me along the way.
It feels like I got off the TDD track. What do you think I did wrong?
As this is mostly a learning exercise I'm considering rolling back all that refactoring and trying to move forward again in a better fashion.
Perhaps you went too fast when splitting your class. The steps for the Extract Class Refactoring are as follows:
create the new class
have an instance of that class as a private data member
move field to the new class, one by one
compile and test for each field
move method to the new class, one by one, testing for each
That way you won't break a large number of tests while refactoring your class, and you can rely on the tests to make sure nothing was broken so far all along the class splitting.
Also, make sure you're testing the behavior, not the implementation.
I wanted to comment the accepted answer but my current reputation does not allow me. So here it is as a new answer.
TDD says:
Create a test that fails. Code a little. Make the test pass.
It insists on coding in tiny steps (especially when beginning). View TDD as systematic validation of successive refactorings you perform to build your programs. If you take too big a step, your refactoring will get out of control.
Perhaps you were testing a too low a level. It's hard to say without seeing you code, but typically I test a feature from end to end, and make sure all the behaviour I expected to happen, happened. Testing every single method in isolation will give you the test-web you have created.
You can use tools like NCover and DotCover to check that you haven't missed any code paths.
The only thing "wrong" was to add a test afterwards. In "true" TTD you first declare all tests before the actual implementation. I say "true" because that's often only theory. But in practice you still have the safty given by the tests.
This is, sadly, something TDD proponents don't talk about enough and that makes people try TDD and then abandon it.
What I do is what I call "High Level Testing" which consists in avoiding unit tests and doing exclusively high level tests (which we could could call "integration tests"). It works pretty well and I avoid the (very important) problem you mentioned. I wrote an article about it a while ago:
http://www.hardcoded.net/articles/high-level-testing.htm
Good luck with TDD, don't give up yet.
Also for TDD continuos regression of test cases are also necessary. So Continuos integration with Coverage tools(as mentioned above) are necessary. So that small changes(i.e Refactoring) can be regressed easily and whether any code path which are missed can be found easily.
Also I feel that if tests were not written previously, time should not be wasted to think whether to write tests or not. Tests should be written immediately.

How do you stress test your own software?

I've been working on an app, by myself, and I am at a stage where everything works great--as long as the user does everything he or she is supposed to do. :-) The software needs more testing to see how robust it is, how well it works when people do things like click the same button repeatedly, try to open the wrong kind of files, put data in the wrong places, etc.
I'm having a little trouble with this because it's a bit difficult for me to think in terms of using the application incorrectly. These are all edge cases to me. Still, I'd like to have the application as stable and well tested as possible before I start giving it to beta testers. Assuming that I am not talking about hiring professional testers at this point, I'm curious whether y'all have any tips or systematic ways of thinking about this task.
Thanks, as always.
Well it sounds like you are talking about 2 different things
"Testing your application's functionality" and "Stress testing"(which is the title of your question)
Stress testing is when you have a website, and want to check that it can server 100,000 people at the same time. Seeing how your application performs under stress. You can do this a number of ways, e.g by recording some actions and then getting a number of agent machines to hit your application concurrently.
This questions sounds more like a Quality Assurance question. That is what testers / beta testers are for. But there are things that you can do yourself to validate your application works the best it can.
Unit testing your code would be a good start, it helps you to try and find those edge cases. If your method takes in things like ints, try passing in int.max, int.min, and seeing what blows up. Pass nulls into everything. If you are using .Net you might want to look at PEX, it will go through all the branches/codepaths that your application has. That might help you to further refine your unit tests to test your application the best you can.
Integration tests, see what happens end to end for some of your usual things. This will help you 'find bugs' as you are developing later.
Those are some quick tips on things you can do yourself to try and find edge cases that you may have missed. But yes, eventually you will need to pass your app off to someone else to test. Just make sure that you have covered off as much as you can before it hits them :-)
Make sure you have adequate code coverage in your unit tests and integration tests.
Use appropriate UI validation, and test combinations that can break it.
I have found that a well-architected application that reduces the number of possible permutations in the UI (ways the user can break it) helps a lot. Design patterns like MVC can be especially useful in this regard, since they make your UI veneer as thin as possible.
Automation.
(Re)Factor your code so that another program can throw user-events at it. Create simple scripts of user events and play them back to your program. Capture events from beta users and save those as test scripts (useful for reproducing problems and checking for regressions). Write a fuzz-tester that applies small random changes to the scripts and try them against your program as well.
With this kind of automation you can stress and application and find glaring problems like caches and memory leaks. It won't test the actual functionality. For functionality, unit tests can be helpful. There are a ton of unit testing frameworks out there to try. Pick something useful, learn to write good tests, and integrate them into your build process.

How do you plan an application's architecture before writing any code? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
One thing I struggle with is planning an application's architecture before writing any code.
I don't mean gathering requirements to narrow in on what the application needs to do, but rather effectively thinking about a good way to lay out the overall class, data and flow structures, and iterating those thoughts so that I have a credible plan of action in mind before even opening the IDE. At the moment it is all to easy to just open the IDE, create a blank project, start writing bits and bobs and let the design 'grow out' from there.
I gather UML is one way to do this but I have no experience with it so it seems kind of nebulous.
How do you plan an application's architecture before writing any code? If UML is the way to go, can you recommend a concise and practical introduction for a developer of smallish applications?
I appreciate your input.
I consider the following:
what the system is supposed to do, that is, what is the problem that the system is trying to solve
who is the customer and what are their wishes
what the system has to integrate with
are there any legacy aspects that need to be considered
what are the user interractions
etc...
Then I start looking at the system as a black box and:
what are the interactions that need to happen with that black box
what are the behaviours that need to happen inside the black box, i.e. what needs to happen to those interactions for the black box to exhibit the desired behaviour at a higher level, e.g. receive and process incoming messages from a reservation system, update a database etc.
Then this will start to give you a view of the system that consists of various internal black boxes, each of which can be broken down further in the same manner.
UML is very good to represent such behaviour. You can describe most systems just using two of the many components of UML, namely:
class diagrams, and
sequence diagrams.
You may need activity diagrams as well if there is any parallelism in the behaviour that needs to be described.
A good resource for learning UML is Martin Fowler's excellent book "UML Distilled" (Amazon link - sanitised for the script kiddie link nazis out there (-: ). This book gives you a quick look at the essential parts of each of the components of UML.
Oh. What I've described is pretty much Ivar Jacobson's approach. Jacobson is one of the Three Amigos of OO. In fact UML was initially developed by the other two persons that form the Three Amigos, Grady Booch and Jim Rumbaugh
I really find that a first-off of writing on paper or whiteboard is really crucial. Then move to UML if you want, but nothing beats the flexibility of just drawing it by hand at first.
You should definitely take a look at Steve McConnell's Code Complete-
and especially at his giveaway chapter on "Design in Construction"
You can download it from his website:
http://cc2e.com/File.ashx?cid=336
If you're developing for .NET, Microsoft have just published (as a free e-book!) the Application Architecture Guide 2.0b1. It provides loads of really good information about planning your architecture before writing any code.
If you were desperate I expect you could use large chunks of it for non-.NET-based architectures.
I'll preface this by saying that I do mostly web development where much of the architecture is already decided in advance (WebForms, now MVC) and most of my projects are reasonably small, one-person efforts that take less than a year. I also know going in that I'll have an ORM and DAL to handle my business object and data interaction, respectively. Recently, I've switched to using LINQ for this, so much of the "design" becomes database design and mapping via the DBML designer.
Typically, I work in a TDD (test driven development) manner. I don't spend a lot of time up front working on architectural or design details. I do gather the overall interaction of the user with the application via stories. I use the stories to work out the interaction design and discover the major components of the application. I do a lot of whiteboarding during this process with the customer -- sometimes capturing details with a digital camera if they seem important enough to keep in diagram form. Mainly my stories get captured in story form in a wiki. Eventually, the stories get organized into releases and iterations.
By this time I usually have a pretty good idea of the architecture. If it's complicated or there are unusual bits -- things that differ from my normal practices -- or I'm working with someone else (not typical), I'll diagram things (again on a whiteboard). The same is true of complicated interactions -- I may design the page layout and flow on a whiteboard, keeping it (or capturing via camera) until I'm done with that section. Once I have a general idea of where I'm going and what needs to be done first, I'll start writing tests for the first stories. Usually, this goes like: "Okay, to do that I'll need these classes. I'll start with this one and it needs to do this." Then I start merrily TDDing along and the architecture/design grows from the needs of the application.
Periodically, I'll find myself wanting to write some bits of code over again or think "this really smells" and I'll refactor my design to remove duplication or replace the smelly bits with something more elegant. Mostly, I'm concerned with getting the functionality down while following good design principles. I find that using known patterns and paying attention to good principles as you go along works out pretty well.
http://dn.codegear.com/article/31863
I use UML, and find that guide pretty useful and easy to read. Let me know if you need something different.
UML is a notation. It is a way of recording your design, but not (in my opinion) of doing a design. If you need to write things down, I would recommend UML, though, not because it's the "best" but because it is a standard which others probably already know how to read, and it beats inventing your own "standard".
I think the best introduction to UML is still UML Distilled, by Martin Fowler, because it's concise, gives pratical guidance on where to use it, and makes it clear you don't have to buy into the whole UML/RUP story for it to be useful
Doing design is hard.It can't really be captured in one StackOverflow answer. Unfortunately, my design skills, such as they are, have evolved over the years and so I don't have one source I can refer you to.
However, one model I have found useful is robustness analysis (google for it, but there's an intro here). If you have your use-cases for what the system should do, a domain model of what things are involved, then I've found robustness analysis a useful tool in connecting the two and working out what the key components of the system need to be.
But the best advice is read widely, think hard, and practice. It's not a purely teachable skill, you've got to actually do it.
I'm not smart enough to plan ahead more than a little. When I do plan ahead, my plans always come out wrong, but now I've spend n days on bad plans. My limit seems to be about 15 minutes on the whiteboard.
Basically, I do as little work as I can to find out whether I'm headed in the right direction.
I look at my design for critical questions: when A does B to C, will it be fast enough for D? If not, we need a different design. Each of these questions can be answer with a spike. If the spikes look good, then we have the design and it's time to expand on it.
I code in the direction of getting some real customer value as soon as possible, so a customer can tell me where I should be going.
Because I always get things wrong, I rely on refactoring to help me get them right. Refactoring is risky, so I have to write unit tests as I go. Writing unit tests after the fact is hard because of coupling, so I write my tests first. Staying disciplined about this stuff is hard, and a different brain sees things differently, so I like to have a buddy coding with me. My coding buddy has a nose, so I shower regularly.
Let's call it "Extreme Programming".
"White boards, sketches and Post-it notes are excellent design
tools. Complicated modeling tools have a tendency to be more
distracting than illuminating." From Practices of an Agile Developer
by Venkat Subramaniam and Andy Hunt.
I'm not convinced anything can be planned in advance before implementation. I've got 10 years experience, but that's only been at 4 companies (including 2 sites at one company, that were almost polar opposites), and almost all of my experience has been in terms of watching colossal cluster********s occur. I'm starting to think that stuff like refactoring is really the best way to do things, but at the same time I realize that my experience is limited, and I might just be reacting to what I've seen. What I'd really like to know is how to gain the best experience so I'm able to arrive at proper conclusions, but it seems like there's no shortcut and it just involves a lot of time seeing people doing things wrong :(. I'd really like to give a go at working at a company where people do things right (as evidenced by successful product deployments), to know whether I'm a just a contrarian bastard, or if I'm really as smart as I think I am.
I beg to differ: UML can be used for application architecture, but is more often used for technical architecture (frameworks, class or sequence diagrams, ...), because this is where those diagrams can most easily been kept in sync with the development.
Application Architecture occurs when you take some functional specifications (which describe the nature and flows of operations without making any assumptions about a future implementation), and you transform them into technical specifications.
Those specifications represent the applications you need for implementing some business and functional needs.
So if you need to process several large financial portfolios (functional specification), you may determine that you need to divide that large specification into:
a dispatcher to assign those heavy calculations to different servers
a launcher to make sure all calculation servers are up and running before starting to process those portfolios.
a GUI to be able to show what is going on.
a "common" component to develop the specific portfolio algorithms, independently of the rest of the application architecture, in order to facilitate unit testing, but also some functional and regression testing.
So basically, to think about application architecture is to decide what "group of files" you need to develop in a coherent way (you can not develop in the same group of files a launcher, a GUI, a dispatcher, ...: they would not be able to evolve at the same pace)
When an application architecture is well defined, each of its components is usually a good candidate for a configuration component, that is a group of file which can be versionned as a all into a VCS (Version Control System), meaning all its files will be labeled together every time you need to record a snapshot of that application (again, it would be hard to label all your system, each of its application can not be in a stable state at the same time)
I have been doing architecture for a while. I use BPML to first refine the business process and then use UML to capture various details! Third step generally is ERD! By the time you are done with BPML and UML your ERD will be fairly stable! No plan is perfect and no abstraction is going to be 100%. Plan on refactoring, goal is to minimize refactoring as much as possible!
I try to break my thinking down into two areas: a representation of the things I'm trying to manipulate, and what I intend to do with them.
When I'm trying to model the stuff I'm trying to manipulate, I come up with a series of discrete item definitions- an ecommerce site will have a SKU, a product, a customer, and so forth. I'll also have some non-material things that I'm working with- an order, or a category. Once I have all of the "nouns" in the system, I'll make a domain model that shows how these objects are related to each other- an order has a customer and multiple SKUs, many skus are grouped into a product, and so on.
These domain models can be represented as UML domain models, class diagrams, and SQL ERD's.
Once I have the nouns of the system figured out, I move on to the verbs- for instance, the operations that each of these items go through to commit an order. These usually map pretty well to use cases from my functional requirements- the easiest way to express these that I've found is UML sequence, activity, or collaboration diagrams or swimlane diagrams.
It's important to think of this as an iterative process; I'll do a little corner of the domain, and then work on the actions, and then go back. Ideally I'll have time to write code to try stuff out as I'm going along- you never want the design to get too far ahead of the application. This process is usually terrible if you think that you are building the complete and final architecture for everything; really, all you're trying to do is establish the basic foundations that the team will be sharing in common as they move through development. You're mostly creating a shared vocabulary for team members to use as they describe the system, not laying down the law for how it's gotta be done.
I find myself having trouble fully thinking a system out before coding it. It's just too easy to only bring a cursory glance to some components which you only later realize are much more complicated than you thought they were.
One solution is to just try really hard. Write UML everywhere. Go through every class. Think how it will interact with your other classes. This is difficult to do.
What I like doing is to make a general overview at first. I don't like UML, but I do like drawing diagrams which get the point across. Then I begin to implement it. Even while I'm just writing out the class structure with empty methods, I often see things that I missed earlier, so then I update my design. As I'm coding, I'll realize I need to do something differently, so I'll update my design. It's an iterative process. The concept of "design everything first, and then implement it all" is known as the waterfall model, and I think others have shown it's a bad way of doing software.
Try Archimate.

Design By Contract and Test-Driven Development [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
I'm working on improving our group's development process, and I'm considering how best to implement Design By Contract with Test-Driven Development. It seems the two techniques have a lot of overlap, and I was wondering if anyone had some insight on the following (related) questions:
Isn't it against the DRY principle to have TDD and DbC unless you're using some kind of code generator to generate the unit tests based on contracts? Otherwise, you have to maintain the contract in two places (the test and the contract itself), or am I missing something?
To what extent does TDD make DbC redundant? If I write tests well enough, aren't they equivalent to writing a contract? Do I only get added benefit if I enforce the contract at run time as well as through the tests?
Is it significantly easier/more flexible to only use TDD rather than TDD with DbC?
The main point of these questions is this more general question: If we're already doing TDD properly, will we get a significant benefit for the overhead if we also use DbC?
A couple of details, though I think the question is largely language-agnostic:
Our team is very small, <10 programmers.
We mostly use Perl.
Note the differences.
Design driven by contract. Contract Driven Design.
Develop driven by test. Test Driven Development.
They are related in that one precedes the other. They describe software at different levels of abstraction.
Do you discard the design when you go to implementation? Do you consider that a design document is a violation of DRY? Do you maintain the contract and the code separately?
Software is one implementation of the contract. Tests are another. User's manual is a third. Operations guide is a fourth. Database backup/restore procedures are one part of an implementation of the contract.
I can't see any overhead from Design by Contract.
If you're already doing design, then you change the format from too many words to just the right words to outline the contractual relationship.
If you're not doing design, then writing a contract will eliminate problems, reducing cost and complexity.
I can't see any loss of flexibility.
start with a contract,
then
a. write tests and
b. write code.
See how the two development activities are essentially intertwined and both come from the contract.
I think there is overlap between DbC and TDD, however, I don't think there is duplicated work: introducing DbC will probably result in a reduction of test cases.
Let me explain.
In TDD, tests aren't really tests. They are behavioral specifications. However, they are also design tools: by writing the test first, you use the external API of your object under test – that you haven't actually written yet – in the same way that a user would. That way, you design the API in a way that makes sense to a user, and not in the way that makes it easiest for you to implement. Something like queue.full? instead of queue.num_entries == queue.size.
This second part cannot be replaced by Contracts.
The first part can be partially replaced by contracts, at least for unit tests. TDD tests serve as specifications of behavior, both to other developers (unit tests) and domain experts (acceptance tests). Contracts also specify behavior, to other developers, to domain experts, but also to the compiler and the runtime library.
But contracts have fixed granularity: you have method pre- and postconditions, object invariants, module contracts and so on. Maybe loop variants and invariants. Unit tests however, test units of behavior. Those might be smaller than a method or consist of multiple methods. That's not something you can do with contracts. And for the "big picture" you still need integration tests, functional tests and acceptance tests.
And there is another important part of TDD that DbC doesn't cover: the middle D. In TDD, tests drive your development process: you never write a single line of implementation code unless you have a failing test, you never write a single line of test code unless your tests all pass, you only write the minimal amount of implementation code to make the tests pass, you only write the minimal amount of test code to produce a failing test.
In conclusion: use tests to design the "flow", the "feel" of the API. Use contracts to design the, well, contract of the API. Use tests to provide the "rhythm" for the development process.
Something like this:
Write an acceptance test for a feature
Write a unit test for a unit that implements a part of that feature
Using the method signature you designed in step 2, write the method prototype
Add the postcondition
Add the precondition
Implement the method body
If the acceptance test passes, goto 1, otherwise goto 2
If you want to know what Bertrand Meyer, the inventor of Design by Contract, thinks about combining TDD and DbC, there is a nice paper by his group, called Contract-Driven Design = Test-Driven Development - Writing Test Cases. The basic premise is that contracts provide an abstract representation of all possible cases, whereas test cases only test specific cases. Therefore, a suitable test harness can be automatically generated from the contracts.
I would add:
the API is the contract for the programmers, the UI definition is the contract with the clients, the protocol is the contract for client-server interactions. Get those first, then you can take advantage of parallel development tracks and not get lost in the weeds. Yes, periodically review to make sure requirements are met, but never start a new track without the contract. And 'contract' is a strong word: once deployed, it must never change. You should include versioning management and introspection from the get-go, changes to the contract are only implemented by extension sets, version numbers change with these, and then you can do things like graceful degradation when dealing with mixed or old installations.
I learned this lesson the hard way, with a large project that wandered off into never-never land, then applied it the right way later when seriously under the gun, company-survival, short fuse timeline. We defined the protocol, defined and wrote a set of protocol emulations for each side of the transactions (basically canned message generators and received message checker, one evenings' worth of two-brained coding), then parted to separately write the server and client ends of the app. We recombined the night of the show, and it just worked. Requirements, design, contract, test, code, integrate. In that order. Repeat until baked.
I am a little leery of design by TLA. As with Patterns, buzz-word compliant recipes are a good guide, but it is my experience that there is no such thing as a one-size-fits-all design or project management procedure. If you are doing things precisely By The Book (tm) then, unless it is a DOD contract with DOD procedural requirements, you will probably get into trouble somewhere along the way. Read the Book(s), yes, but be sure tounderstand them, and then take into account also the people side of your team. Rules that are only enforced by the Book will not get enforced uniformly - even when tool-enforced there can be drop-outs (e.g. svn comments left empty or cryptically brief). Procedures only tend to get followed when the tool chain not only enforces them but makes following easier than any possible short-cuts. Believe me, when the going gets tough, the short-cuts get found, and you may not know about the ones that got used at 3am until it is too late.
You can also use executable acceptance tests that are written in the domain language of the contract. It might not be the actual "contract", but half way between unit tests and the contract.
I would recomment using Ruby Cucumber
http://github.com/aslakhellesoy/cucumber
But since you are a Perl shop, then maybe you can use my own small attempt at p5-cucumber.
http://github.com/kesor/p5-cucumber
Microsoft has done work on automatic generation of unit tests, based on code contracts and parameterized unit test. E.g. the contract says the count must be increased by one when an item is added to a collection, and the parameterized unit test say how to add “n” items to a collection. Pex will then try to create a unit test that proves the contract is broken. See this video for a overview.
If this works, your unit test will only have to be written for one example of each thing you are trying to test, and PEX will be able to then work out witch data items will break the test.
I had some ruminations about that topic some time ago.
You may want to take a look at
http://gleichmann.wordpress.com/2007/12/09/test-driven-development-and-design-by-contract-friend-or-foe/
When you are using TDD to implement a new method, you need some input: you need to know the assertions to check in your tests. Design-by-contract gives you those assertions: they are the post-conditions and invariants of the method.
I have found DbC very handy for jumpstarting the red-green-refactor cycle because it helps in identifying unit tests to start with. With DbC I start thinking about pre-conditions that the object being TDD-ed must handle and each pre-condition might represent a failing unit test to start a red-green-refactor cycle. At some point I switch to start the cycle with a failing unit test for a post-condition, and then just keep the TDD flow going. I have tried this approach with newcomers to TDD and it really works in kickstarting the TDD mindset.
In summary, think of DbC as an effective way to identify key behavioral unit tests. DbC helps at analyzing inputs (pre-conditions) and outputs(post-conditions), which are the two things that we need to control (inputs) and observe (outputs) to write testable software (a similar aim of TDD).

The best way to familiarize yourself with an inherited codebase

Stacker Nobody asked about the most shocking thing new programmers find as they enter the field.
Very high on the list, is the impact of inheriting a codebase with which one must rapidly become acquainted. It can be quite a shock to suddenly find yourself charged with maintaining N lines of code that has been clobbered together for who knows how long, and to have a short time in which to start contributing to it.
How do you efficiently absorb all this new data? What eases this transition? Is the only real solution to have already contributed to enough open-source projects that the shock wears off?
This also applies to veteran programmers. What techniques do you use to ease the transition into a new codebase?
I added the Community-Building tag to this because I'd also like to hear some war-stories about these transitions. Feel free to share how you handled a particularly stressful learning curve.
Pencil & Notebook ( don't get distracted trying to create a unrequested solution)
Make notes as you go and take an hour every monday to read thru and arrange the notes from previous weeks
with large codebases first impressions can be deceiving and issues tend to rearrange themselves rapidly while you are familiarizing yourself.
Remember the issues from your last work environment aren't necessarily valid or germane in your new environment. Beware of preconceived notions.
The notes/observations you make will help you learn quickly what questions to ask and of whom.
Hopefully you've been gathering the names of all the official (and unofficial) stakeholders.
One of the best ways to familiarize yourself with inherited code is to get your hands dirty. Start with fixing a few simple bugs and work your way into more complex ones. That will warm you up to the code better than trying to systematically review the code.
If there's a requirements or functional specification document (which is hopefully up-to-date), you must read it.
If there's a high-level or detailed design document (which is hopefully up-to-date), you probably should read it.
Another good way is to arrange a "transfer of information" session with the people who are familiar with the code, where they provide a presentation of the high level design and also do a walk-through of important/tricky parts of the code.
Write unit tests. You'll find the warts quicker, and you'll be more confident when the time comes to change the code.
Try to understand the business logic behind the code. Once you know why the code was written in the first place and what it is supposed to do, you can start reading through it, or as someone said, prolly fixing a few bugs here and there
My steps would be:
1.) Setup a source insight( or any good source code browser you use) workspace/project with all the source, header files, in the code base. Browsly at a higher level from the top most function(main) to lowermost function. During this code browsing, keep making notes on a paper/or a word document tracing the flow of the function calls. Do not get into function implementation nitti-gritties in this step, keep that for a later iterations. In this step keep track of what arguments are passed on to functions, return values, how the arguments that are passed to functions are initialized how the value of those arguments set modified, how the return values are used ?
2.) After one iteration of step 1.) after which you have some level of code and data structures used in the code base, setup a MSVC (or any other relevant compiler project according to the programming language of the code base), compile the code, execute with a valid test case, and single step through the code again from main till the last level of function. In between the function calls keep moting the values of variables passed, returned, various code paths taken, various code paths avoided, etc.
3.) Keep repeating 1.) and 2.) in iteratively till you are comfortable up to a point that you can change some code/add some code/find a bug in exisitng code/fix the bug!
-AD
I don't know about this being "the best way", but something I did at a recent job was to write a code spider/parser (in Ruby) that went through and built a call tree (and a reverse call tree) which I could later query. This was slightly non-trivial because we had PHP which called Perl which called SQL functions/procedures. Any other code-crawling tools would help in a similar fashion (i.e. javadoc, rdoc, perldoc, Doxygen etc.).
Reading any unit tests or specs can be quite enlightening.
Documenting things helps (either for yourself, or for other teammates, current and future). Read any existing documentation.
Of course, don't underestimate the power of simply asking a fellow teammate (or your boss!) questions. Early on, I asked as often as necessary "do we have a function/script/foo that does X?"
Go over the core libraries and read the function declarations. If it's C/C++, this means only the headers. Document whatever you don't understand.
The last time I did this, one of the comments I inserted was "This class is never used".
Do try to understand the code by fixing bugs in it. Do correct or maintain documentation. Don't modify comments in the code itself, that risks introducing new bugs.
In our line of work, generally speaking we do no changes to production code without good reason. This includes cosmetic changes; even these can introduce bugs.
No matter how disgusting a section of code seems, don't be tempted to rewrite it unless you have a bugfix or other change to do. If you spot a bug (or possible bug) when reading the code trying to learn it, record the bug for later triage, but don't attempt to fix it.
Another Procedure...
After reading Andy Hunt's "Pragmatic Thinking and Learning - Refactor Your Wetware" (which doesn't address this directly), I picked up a few tips that may be worth mentioning:
Observe Behavior:
If there's a UI, all the better. Use the app and get a mental map of relationships (e.g. links, modals, etc). Look at HTTP request if it helps, but don't put too much emphasis on it -- you just want a light, friendly acquaintance with app.
Acknowledge the Folder Structure:
Once again, this is light. Just see what belongs where, and hope that the structure is semantic enough -- you can always get some top-level information from here.
Analyze Call-Stacks, Top-Down:
Go through and list on paper or some other medium, but try not to type it -- this gets different parts of your brain engaged (build it out of Legos if you have to) -- function-calls, Objects, and variables that are closest to top-level first. Look at constants and modules, make sure you don't dive into fine-grained features if you can help it.
MindMap It!:
Maybe the most important step. Create a very rough draft mapping of your current understanding of the code. Make sure you run through the mindmap quickly. This allows an even spread of different parts of your brain to (mostly R-Mode) to have a say in the map.
Create clouds, boxes, etc. Wherever you initially think they should go on the paper. Feel free to denote boxes with syntactic symbols (e.g. 'F'-Function, 'f'-closure, 'C'-Constant, 'V'-Global Var, 'v'-low-level var, etc). Use arrows: Incoming array for arguments, Outgoing for returns, or what comes more naturally to you.
Start drawing connections to denote relationships. Its ok if it looks messy - this is a first draft.
Make a quick rough revision. Its its too hard to read, do another quick organization of it, but don't do more than one revision.
Open the Debugger:
Validate or invalidate any notions you had after the mapping. Track variables, arguments, returns, etc.
Track HTTP requests etc to get an idea of where the data is coming from. Look at the headers themselves but don't dive into the details of the request body.
MindMap Again!:
Now you should have a decent idea of most of the top-level functionality.
Create a new MindMap that has anything you missed in the first one. You can take more time with this one and even add some relatively small details -- but don't be afraid of what previous notions they may conflict with.
Compare this map with your last one and eliminate any question you had before, jot down new questions, and jot down conflicting perspectives.
Revise this map if its too hazy. Revise as much as you want, but keep revisions to a minimum.
Pretend Its Not Code:
If you can put it into mechanical terms, do so. The most important part of this is to come up with a metaphor for the app's behavior and/or smaller parts of the code. Think of ridiculous things, seriously. If it was an animal, a monster, a star, a robot. What kind would it be. If it was in Star Trek, what would they use it for. Think of many things to weigh it against.
Synthesis over Analysis:
Now you want to see not 'what' but 'how'. Any low-level parts that through you for a loop could be taken out and put into a sterile environment (you control its inputs). What sort of outputs are you getting. Is the system more complex than you originally thought? Simpler? Does it need improvements?
Contribute Something, Dude!:
Write a test, fix a bug, comment it, abstract it. You should have enough ability to start making minor contributions and FAILING IS OK :)! Note on any changes you made in commits, chat, email. If you did something dastardly, you guys can catch it before it goes to production -- if something is wrong, its a great way to get a teammate to clear things up for you. Usually listening to a teammate talk will clear a lot up that made your MindMaps clash.
In a nutshell, the most important thing to do is use a top-down fashion of getting as many different parts of your brain engaged as possible. It may even help to close your laptop and face your seat out the window if possible. Studies have shown that enforcing a deadline creates a "Pressure Hangover" for ~2.5 days after the deadline, which is why deadlines are often best to have on a Friday. So, BE RELAXED, THERE'S NO TIMECRUNCH, AND NOW PROVIDE YOURSELF WITH AN ENVIRONMENT THAT'S SAFE TO FAIL IN. Most of this can be fairly rushed through until you get down to details. Make sure that you don't bypass understanding of high-level topics.
Hope this helps you as well :)
All really good answers here. Just wanted to add few more things:
One can pair architectural understanding with flash cards and re-visiting those can solidify understanding. I find questions such as "Which part of code does X functionality ?", where X could be a useful functionality in your code base.
I also like to open a buffer in emacs and start re-writing some parts of the code base that I want to familiarize myself with and add my own comments etc.
One thing vi and emacs users can do is use tags. Tags are contained in a file ( usually called TAGS ). You generate one or more tags files by a command ( etags for emacs vtags for vi ). Then we you edit source code and you see a confusing function or variable you load the tags file and it will take you to where the function is declared ( not perfect by good enough ). I've actually written some macros that let you navigate source using Alt-cursor,
sort of like popd and pushd in many flavors of UNIX.
BubbaT
The first thing I do before going down into code is to use the application (as several different users, if necessary) to understand all the functionalities and see how they connect (how information flows inside the application).
After that I examine the framework in which the application was built, so that I can make a direct relationship between all the interfaces I have just seen with some View or UI code.
Then I look at the database and any database commands handling layer (if applicable), to understand how that information (which users manipulate) is stored and how it goes to and comes from the application
Finally, after learning where data comes from and how it is displayed I look at the business logic layer to see how data gets transformed.
I believe every application architecture can de divided like this and knowning the overall function (a who is who in your application) might be beneficial before really debugging it or adding new stuff - that is, if you have enough time to do so.
And yes, it also helps a lot to talk with someone who developed the current version of the software. However, if he/she is going to leave the company soon, keep a note on his/her wish list (what they wanted to do for the project but were unable to because of budget contraints).
create documentation for each thing you figured out from the codebase.
find out how it works by exprimentation - changing a few lines here and there and see what happens.
use geany as it speeds up the searching of commonly used variables and functions in the program and adds it to autocomplete.
find out if you can contact the orignal developers of the code base, through facebook or through googling for them.
find out the original purpose of the code and see if the code still fits that purpose or should be rewritten from scratch, in fulfillment of the intended purpose.
find out what frameworks did the code use, what editors did they use to produce the code.
the easiest way to deduce how a code works is by actually replicating how a certain part would have been done by you and rechecking the code if there is such a part.
it's reverse engineering - figuring out something by just trying to reengineer the solution.
most computer programmers have experience in coding, and there are certain patterns that you could look up if that's present in the code.
there are two types of code, object oriented and structurally oriented.
if you know how to do both, you're good to go, but if you aren't familiar with one or the other, you'd have to relearn how to program in that fashion to understand why it was coded that way.
in objected oriented code, you can easily create diagrams documenting the behaviors and methods of each object class.
if it's structurally oriented, meaning by function, create a functions list documenting what each function does and where it appears in the code..
i haven't done either of the above myself, as i'm a web developer it is relatively easy to figure out starting from index.php to the rest of the other pages how something works.
goodluck.