Avoiding TDD making big refactorings harder [closed] - language-agnostic

I'm still a relative beginner in TDD, and I often fall into the trap where I've designed myself into a corner at some point when trying to add a new piece of functionality.
Mostly it means that the API that grew out of, say, the first 10 requirements doesn't scale when adding the next requirement, and I realize I have to do a big redesign of the existing functionality, including its structure, to add the new stuff in a nice way.
This is fine, except in this case the API would then change, so all the initial tests would then have to change. This is usually a bigger thing than just renaming methods.
I guess my question is twofold: How should I have avoided getting into this position in the first place, and given that I get into it, what are safe patterns of refactoring the tests and allowing new functionality with a new API to grow?
Edit: Lots of great answers; I will experiment with several techniques. I marked as the solution the answer I felt was most helpful.

How should I have avoided getting into this position in the first place
The most general rule is: write tests with such refactorings in mind. In particular:
The tests should use helper methods whenever they construct anything API-specific (e.g. example objects). This way you have only one place to change if the construction changes (e.g. after adding mandatory fields to the constructed object)
Same goes for checking the output of the API
Tests should be constructed as "diff from default", with the default provided by the above. For example, if your test checks the effect of a method on field x, you should only set the x field in the test, with the rest of the fields taken from the default (see the sketch below).
In fact, these are the same rules that apply to code in general.
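As a minimal sketch of these rules (JUnit, with a made-up Order class inlined so the example compiles): construction and output-checking each live in one helper, and the test itself only states its diff from the default.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class OrderTest {

        // Hypothetical class under test, inlined here so the sketch compiles.
        static class Order {
            int quantity = 1;
            int unitPriceCents = 100;
            int totalInCents() { return quantity * unitPriceCents; }
        }

        // The single place that knows how to build a valid default Order.
        // If construction changes (say, a new mandatory field), only this changes.
        private Order defaultOrder() {
            return new Order();
        }

        // The single place that knows how to check the API's output.
        private void assertTotal(Order order, int expectedCents) {
            assertEquals(expectedCents, order.totalInCents());
        }

        @Test
        public void quantityAffectsTotal() {
            // "Diff from default": set only the field under test.
            Order order = defaultOrder();
            order.quantity = 3;
            assertTotal(order, 300);
        }
    }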
what are safe patterns of refactoring the tests and allowing new functionality with a new API to grow?
Whenever you find out that an API change makes you change the tests in multiple places, try to figure out how to move the code into a single place. This follows from the above.

Make your tests small. A good test calls maybe 1-3 methods on the subject of the test and does some assertions on the result. These tests will only need to change when one of those methods changes.
Make your test code clean. If you haven't already, read 'Clean Code' by Robert C. Martin. Apply its rules to your production code AND your test code. This has the tendency to reduce the affected surface area of any refactoring.
Refactor more often. Instead of (possibly unconsciously) waiting until you must do a large refactoring, do small refactorings a lot.
If you are faced with a huge refactoring, break it down into a couple (or, if necessary, a couple hundred) tiny refactorings.

In this case, I suggest you get the features locked down, and the iterations short.
Because the iterations are short, features will be grouped together into smaller, isolated groups. This lessens the need to think up some grand design, which might not be adaptive to the needs of the users. The code for this week will only work with the code for this week. That lessens the chances of new stuff mucking up the old stuff.

Yeah, it's a problem that is hard to avoid with TDD, since the whole idea behind it is to avoid the over-engineering caused by doing big designs upfront. So with TDD, your design is going to change, and often. The more tests you have, the more effort is required for each refactoring, which effectively discourages it - going against the whole idea behind TDD.
Even though your design will change, your basic requirements will be rather stable, so at the "high level", the way your app works shouldn't change too much. That's why I advise putting all your tests at the "high level" (integration testing, kinda). Low-level tests are a liability because you have to change them when you refactor. High-level testing is a bit more work, but I think it's worth it in the end.
I wrote an article about that a year ago: http://www.hardcoded.net/articles/high-level-testing.htm
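To make the idea concrete, here is a minimal, hypothetical sketch (JUnit; TodoApp is an invented facade standing in for the application's public entry point). The test drives the app only through that surface, so internal redesigns don't break it.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class HighLevelTest {

        // Invented stand-in for the application's top-level API.
        static class TodoApp {
            private final java.util.List<String> items = new java.util.ArrayList<>();
            void add(String item) { items.add(item); }
            void complete(String item) { items.remove(item); }
            int openItems() { return items.size(); }
        }

        @Test
        public void completingAnItemRemovesItFromTheOpenList() {
            // The internal design (lists, repositories, whatever)
            // is free to change without touching this test.
            TodoApp app = new TodoApp();
            app.add("write tests");
            app.add("refactor");
            app.complete("write tests");
            assertEquals(1, app.openItems());
        }
    }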

The following might help:
Follow good coding standards and practices, including TDD and utilising design patterns to produce well structured designs. The application as a whole should then be easier to extend to include new features.
Good separation of concerns. If your API is well separated from the other functionality (calculations, database access etc) then changing the functionality without changing the API or vice-versa should be easier.
Use BDD to provide automated tests at a higher level (i.e. more like user tests than unit tests). These should help to give you confidence while refactoring, even if all your unit tests are breaking due to the refactoring.
Use a Dependency Injection container such as Windsor. By abstracting away your class dependencies, changes to those dependencies cause much less rework (particularly if you have many unit tests) than having them coded into your classes.

You shouldn't find yourself "in a corner" with TDD. The eleventh test shouldn't so seriously change the design that the first ten tests need to change. Think hard about why so many tests need to change - look at it in detail, one by one - and see if you can come up with a way to make the change without breaking your existing tests.
If, for instance, you need to add a parameter to a method that they all call:
you could leave the existing fewer-parameter method in place, delegating to the new method and supplying a default value (sketched after this list);
you could have all the tests call a utility method (or perhaps a setup method) that calls the old method, so you need to change the method call in only one place;
you may be able to let your IDE do all the changes with a single command.
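A sketch of the first option in Java, where a "default parameter" becomes a thin delegating overload (the Mailer class and its methods are invented for illustration):

    public class Mailer {

        // New method: takes the extra parameter.
        public void send(String to, String subject, boolean highPriority) {
            // ... actual sending logic ...
        }

        // Old signature kept as a thin overload so existing tests and
        // callers keep compiling; it just supplies the default value.
        public void send(String to, String subject) {
            send(to, subject, false);
        }
    }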
TDD and Refactoring work symbiotically; each helps the other. Because you emerge from TDD with comprehensive unit tests, refactoring is safe; because you have the organizational, intellectual, and editing tools to refactor freely, you can keep your tests well synchronized with your design. You say you are a beginner at TDD; perhaps you need to be growing your refactoring skills while you learn TDD.

Passing my own project on to someone else - what to do? [closed]

Often there are situations where a project is passed on to someone else. And often this process is unpleasant for both sides - the new owner complains about horrible documentation, bugs and bad design. The original owner is then bothered for months with questions about the project, requests to fix old bugs, etc.
I might soon be in a situation where one of my projects will be given to someone else so I can focus on my other projects. I wonder what I should do to make this transfer as smooth as possible. What I already have is decent documentation, the code is quite well commented, and I'm still improving it. It's a medium-sized project - not very large, but still not something you can code in a week.
I'm looking for a list of things that should be done to help the future owner take over the project, and at the same time spare me all those annoying questions like "and what does this function do, what purpose does this class have...". I know documentation is a must - what else?
Note: although my project is in C++, I believe this is a language-agnostic question. If there are things you think are specific to some language, please mention them too.
Documentation is one thing; getting it into the head of your new project owner is another. IMHO this is a typical situation where "less is more" - the less documentation your colleague has to read to understand something, the better. And, of course, learning takes time - for both of you. Accept it.
So
instead of writing lots of documentation, make your code self-documenting
have all documents / source code etc. in a clean and well-named folder structure
make sure your build process is almost completely automatic
don't forget to document your deployment process too, if it is not automatic
clean up, clean up, clean up!
When taking over a project, documentation is of course desirable, but even more so is a good test suite. Trying to modify a program that you have no means of testing for correctness is a nightmare.
Documentation, but on all levels:
API docs
High level architecture: What components are there, what are their relationships and dependencies
For each component, a high level description pointing to important code sections
Tutorials: If you want to do X, here's how
Data: What data does it use and how, database schemas
Idioms: If you've created some idioms within your code, explain them
And, to start, give the guy a personal introduction to all of the above in person, ideally making some needed change together, pair-programming style
the new owner complains about horrible documentation, bugs and bad design.
I suspect that no matter what you do, the new owner will always complain about something. People are different, so something that looks easy to understand to you will look horrible and extremely complicated to someone else.
The original owner is then bothered for months with questions about the project, requests to fix old bugs etc.
In this case you should clearly refuse to help. If you don't refuse, you'll probably end up doing someone else's job for free. If maintaining the project is no longer your job, then the new guy should fix his problems without your help. If "the new guy" can't deal with that, he isn't suitable for the job and should quit.
It's a medium-sized project,
"Medium-sized" compared to what? How many lines of code, how many files, how many megabytes of code?
I wonder what I should do to make this transfer as smooth as possible. What I already have is decent documentation, the code is quite well commented, and I'm still improving it.
I would handle it like this:
First, do a sweep through the entire code and:
1.1 Remove all commented out blocks of code.
1.2 Remove all unused routines and classes (I'm talking about "forgotten" routines, not parts of a utility library).
1.3 Make sure all code follows consistent formatting rules, i.e. you shouldn't mix class_a, ClassA and CClassA in the same app, you shouldn't use different styles for putting brackets, etc.
1.4 Make sure that all names (class, variable, function) are self-explanatory. Your code should be as self-explaining as possible - this will save you from writing too much documentation.
1.5 In situations where there is a complicated or hard-to-understand function, write comments. Keep them as short as possible, and add them only when they are absolutely necessary.
1.6 Try to make sure that there are no known bugs left. If there are known bugs, document them and their behavior.
1.7 Remove garbage from project directories (files that are not used in project, etc.)
1.8 If possible, make sure that code still compiles and works as expected.
Generate HTML documentation with doxygen. Review it a few times, modifying code comments until you're satisfied - or at least somewhat satisfied - with the result. Do not skip this step.
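For instance, a doxygen-style comment block (shown here on a made-up Java method; the same /** ... */ syntax is picked up by doxygen in C++ and doubles as javadoc) might look like:

    public class SensorUtil {
        /**
         * Converts a raw sensor reading to degrees Celsius.
         *
         * @param rawValue the raw ADC reading (range is a made-up assumption)
         * @return the temperature in degrees Celsius
         */
        public static double toCelsius(int rawValue) {
            return rawValue * 0.0625 - 40.0; // illustrative conversion only
        }
    }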
If there is a version control repository (say, a git repository) with the entire development history, hand it over to the new maintainer, or give them a functional copy of the repository. This will be useful for bisecting (git bisect) and finding the source of bugs.
Once that is done, and the code is transferred to the new maintainer, do not offer "free help" unless you're paid for it (or unless you get something else for helping, or unless it is an order from your boss which makes helping the new maintainer part of your current task). Maintaining the code is no longer your job, and if the new maintainer can't handle it, he isn't qualified for the job.
I think most of the problems can be avoided with just two simple rules.
Keep the code consistent with the platform style guide.
Naming, naming and naming.
If the project is huge, then you just need to run some code camps with the new guys. There's no shortcut for this one.
Remember also that complaining happens mostly because the new guy is not qualified enough, i.e. he doesn't understand something. That's why it is important to keep things simple. And in case he is more qualified, then I guess you deserve it ;)
Some good advice on where to start hacking/changing things is always better than documentation. Consider documentation backup material to consult after you are familiar with the code; it should never be the starting point (unless you are an exceptional technical writer with unlimited resources and time).
If there is good documentation and commented code as you say, then you've done your part. Just make sure that the documentation includes high-level documentation (architecture, data flow, etc.) as well as lower module or procedure-level documentation.
If this is a situation where you can, I would strongly suggest you protect yourself with some type of contract that specifies what future support (if any) you will provide and for how long.
I think for a situation like this, the most important thing is a working, complete build that automatically compiles, documents, and tests the project. That way, there is a well-defined point at which the new developer has it working. He can then figure stuff out from the tests and documentation, in principle.

How do you refactor a large messy codebase?

I have a big mess of code. Admittedly, I wrote it myself - a year ago. It's not well commented, but it's not very complicated either, so I can understand it - just not well enough to know where to start as far as refactoring it.
I violated every rule that I have read about over the past year. There are classes with multiple responsibilities, there are indirect accesses (I forget the term - something like foo.bar.doSomething()), and like I said, it is not well commented. On top of that, it's the beginnings of a game, so the graphics are coupled with the data; or in the places where I tried to decouple graphics and data, I made the data public so the graphics could access the data they need...
It's a huge mess! Where do I start? How would you start on something like this?
My current approach is to take variables and switch them to private and then refactor the pieces that break, but that doesn't seem to be enough. Please suggest other strategies for wading through this mess and turning it into something clean so that I can continue where I left off!
Update two days later: I have been drawing out UML-like diagrams of my classes, and catching some of the "low-hanging fruit" along the way. I've even found some bits of code that were the beginnings of new features, but as I'm trying to slim everything down, I've been able to delete those bits and make the project feel cleaner. I'm probably going to refactor as much as possible before rigging up my test cases (but only the things that are 100% certain not to impact the functionality, of course!), so that I won't have to refactor the test cases as I change functionality. (Do you think I'm doing it right, or would it, in your opinion, be easier for me to suck it up and write the tests first?)
Please vote for the best answer so that I can mark it fairly! Feel free to add your own answer to the bunch as well, there's still room for you! I'll give it another day or so and then probably mark the highest-voted answer as accepted.
Thanks to everyone who has responded so far!
June 25, 2010: I discovered a blog post which directly answers this question, from someone who seems to have a pretty good grasp of programming (or maybe not, if you read his article :) )
To that end, I do four things when I need to refactor code:
Determine what the purpose of the code was
Draw UML and action diagrams of the classes involved
Shop around for the right design patterns
Determine clearer names for the current classes and methods
Pick yourself up a copy of Martin Fowler's Refactoring. It has some good advice on ways to break down your refactoring problem. About 75% of the book is little cookbook-style refactoring steps you can do. It also advocates automated unit tests that you can run after each step to prove your code still works.
As for a place to start, I would sit down and draw out a high-level architecture of your program. You don't have to get fancy with detailed UML models, but some basic UML is not a bad idea. You need a big picture idea of how the major pieces fit together so you can visually see where your decoupling is going to happen. Just a page or two of some basic block diagrams will help with the overwhelming feeling you have right now.
Without some sort of high level spec or design, you just risk getting lost again and ending up with another unmaintainable mess.
If you need to start from scratch, remember that you never truly start from scratch. You have some code and the knowledge you gained from your first time. But sometimes it does help to start with a blank project and pull things in as you go, rather than put out fires in a messy code base. Just remember not to completely throw out the old, use it for its good parts and pull them in as you go.
What was most important for me on different occasions were unit tests: I took a few days to write tests for the old code and then I was free to refactor with confidence. How exactly is a different question, but having the tests made it possible for me to make real, substantial changes to the code.
I'll second everyone's recommendations for Fowler's Refactoring, but in your specific case you may want to look at Michael Feathers' Working Effectively with Legacy Code, which is really perfect for your situation.
Feathers talks about Characterization Tests, which are unit tests not to assert known behaviour of the system but to explore and define the existing (unclear) behaviour -- in the case where you've written your own legacy code, and fixing it yourself, this may not be so important, but if your design is sloppy then it's quite possible there are parts of the code that work by 'magic' and their behaviour isn't clear, even to you -- in that case, characterization tests will help.
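A sketch of what a characterization test can look like (JUnit; the legacy function here is an invented stand-in):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class LegacyPricerCharacterizationTest {

        // Stand-in for legacy code whose rules nobody remembers.
        static int legacyDiscount(int amountCents) {
            if (amountCents > 10000) return amountCents / 10;
            return 0;
        }

        @Test
        public void documentsCurrentBehaviourWhateverItIs() {
            // Nobody decided these numbers are "right"; we ran the code,
            // observed the output, and pinned it down so a refactoring
            // can't silently change it.
            assertEquals(0, legacyDiscount(10000));
            assertEquals(1050, legacyDiscount(10500));
        }
    }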
One great part of the book is the discussion about finding (or creating) seams in your codebase - seams are natural 'fault lines', if you like, where you can break into the existing system to start testing it and pulling it towards a better design. Hard to explain, but well worth a read.
There's a brief paper where Feathers fleshes out some of the concepts from the book, but it really is well worth hunting down the whole thing. It's one of my favourites.
Just an additional refactoring that is more important than you think: Name things correctly!
This goes for any variable name and method name. If the name does not accurately reflect what the thing is used for, then rename it to something more accurate. This might require several iterations. If you cannot find a short, and entirely accurate name, then that item does too much and you have an excellent candidate for a code snippet that needs to be split. The names also clearly indicate where the cuts are to be made.
Also, document your stuff. Whenever the answer to WHY? is not clearly conveyed by the answer to HOW? (being the code) you will need to add some documentation. Capturing design decisions is probably the most important task as it is very hard to do in code.
You could always start from "scratch". That doesn't mean scrap it and start from nothing, but try to rethink high-level things from the beginning, since you seem to have learned a lot since the last time you worked on it.
Start from a higher level, and as you build the scaffolding of your new and improved structure, take all the code you can reuse, which will probably be more than you think if you're willing to read through it and make some small changes.
When you're making the changes, be sure to be strict with yourself about following all the good practices you now know, because you will really thank yourself later.
It can be surprisingly refreshing to properly re-make a program to do exactly what it did before, only more "cleanly". ;)
As others have mentioned as well, unit-tests are your best friend! They help you ensure that your refactoring works, and if you're starting from "scratch", it's the perfect time to write them.
You're in a much better position than many people facing this problem in that you understand what the code is supposed to do.
Taking variables out of a shared scope, as you're doing, is a great start, in that you're partitioning responsibilities. Ultimately you want each class to express a single responsibility. A few other things you might look at:
Easy targets for refactoring are code that's duplicated in lots of places and long methods.
If you're managing application state through statically initialized singletons or, worse, a global state that everything is talking to, consider moving it to a managed initialization system (e.g. a dependency injection framework like Spring or Guice), or at least make sure that the initialization isn't entangled with the rest of the code (see the sketch after this list).
Centralize and standardize how you're accessing outside resources, especially if you've got things like file locations or URLs hardcoded.
Buy an IDE that has good refactoring support. I think IntelliJ is the best, but Eclipse has it now, too.
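Here is a rough sketch of that kind of move in plain Java - no framework required - where a hypothetical ReportService stops reaching for a global singleton and takes its dependency through the constructor instead (a container like Spring or Guice would then just do this wiring for you):

    // Before (sketch): logic entangled with global state.
    //   class ReportService {
    //       String render() { return Config.getInstance().get("title"); }
    //   }

    // After: the dependency is handed in, so tests can pass a fake.
    interface Config {
        String get(String key);
    }

    class ReportService {
        private final Config config;

        ReportService(Config config) {  // constructor injection
            this.config = config;
        }

        String render() {
            return "Report: " + config.get("title");
        }
    }

    class Demo {
        public static void main(String[] args) {
            Config fake = key -> "Quarterly Numbers"; // trivial fake, as a test would use
            System.out.println(new ReportService(fake).render());
        }
    }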
The unit test idea is key as well. You will want to have a suite of large, overall transactions that will give you the overall behavior of the code.
Once you have those, start creating unit tests for classes and smaller packages. Write the tests to demonstrate proper behavior, make your changes, and re-run the tests to demonstrate that you haven't broken everything.
Track code coverage as you go. You'll want to work it up to 70% or better. For the classes you change, you'll want those to be 70% or better before you make your changes.
Build up that safety net over time and you'll be able to refactor with some confidence.
very slowly :D
No seriously... take it one step at a time. For instance, refactor something only if it affects or helps with the current bug/feature that you are working on right now, and do no more than that. And before you refactor, make darn sure that you have some kind of automated test in place that gets run on each build and actually tests what you are writing/refactoring. Even if you don't have unit tests, it is never too late to start adding them for all new and modified code. Over time, your code base will get better in small increments, daily or weekly, instead of worse - all without you making monumental heaps of changes.
In my personal opinion and experience, it's not worth it to refactor a (legacy) codebase en masse just for the sake of refactoring. In those cases, it's best to start from scratch and do it right all over again (though you are very rarely afforded the opportunity to do such a thing). Hence, refactoring incrementally is the way to go.
For Java code, my favorite first step is to run FindBugs and then remove all the dead stores, unused fields, unreachable catch blocks, unused private methods and likely bugs.
Next I run CPD to look for evidence of copy-and-paste code.
It isn't unusual to be able to reduce the code base by 5% by doing this. It also saves you from refactoring code that is never used.
I think you should use Eclipse as an IDE, because it has many plugins and is free of cost. You should follow the MVC pattern, and yes, you must write test cases using JUnit. Eclipse also has a plugin for JUnit, and it provides code refactoring facilities too, which will reduce some of your work. And always remember that just writing code is not what matters; the main thing is to write clean code. So add comments everywhere, so that not only you but anyone else reading the code feels like they are reading an essay.
Refactor the low-hanging fruit. Nibble away at the easy bits, and as you do that, the harder bits will begin to be a little easier. When there aren't any bits left to refactor, you're done.
The refactorings you'll probably find most useful are Rename Method (and even more trivial renamings like Field, Variable, and Parameter), Extract Method, and Extract Class. For each refactoring you perform, write the necessary unit tests to make the refactoring safe, and run the entire suite of unit tests after each refactoring. It's tempting - and, let's be honest, pretty safe - to rely on the automated refactorings of your IDE without the tests, but it's good practice, and it will be good to have the tests in the future as you add functionality to your project.
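As one concrete example, a minimal Extract Class might look like this (Invoice and Address are invented names; behaviour is unchanged, it just moves to a class with a single responsibility):

    // Before (sketch): Invoice holds address fields and formatting.
    //   class Invoice {
    //       String street, city, zip;
    //       String formatAddress() { return street + ", " + zip + " " + city; }
    //   }

    // After "Extract Class":
    class Address {
        final String street, city, zip;

        Address(String street, String city, String zip) {
            this.street = street;
            this.city = city;
            this.zip = zip;
        }

        String format() {
            return street + ", " + zip + " " + city;
        }
    }

    class Invoice {
        private final Address billingAddress;

        Invoice(Address billingAddress) {
            this.billingAddress = billingAddress;
        }

        String formatAddress() {
            return billingAddress.format(); // same behaviour, new home
        }
    }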
You might want to look at Martin Fowler's book Refactoring. This is the book that popularized the term and technique (my thought when taking his course: "I've been doing a lot of this all along, I didn't know it had a name"). A quote from the link:
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which is "too small to be worth doing". However the cumulative effect of each of these transformations is quite significant. By doing them in small steps you reduce the risk of introducing errors. You also avoid having the system broken while you are carrying out the restructuring - which allows you to gradually refactor a system over an extended period of time.
As others have pointed out, unit tests will allow you to refactor with confidence. And start by reducing code duplication. The book will give you lots of other insights.
Here is a catalog of refactorings.
The correct definition of messy code is code that is hard to maintain and change.
For a more mathematical definition, you can check your code with code-metrics tools.
This way, you will keep the code that is already good enough, and find the problem code very fast.
In my experience, this is a very powerful way to improve the quality of your code (especially if your tool can show you the result on each build, or in real time).
Throw it away, build it new.

Stopping code rot [closed]

Given that working features are better value for a company than good code at any given point in time, and that bad code makes adding more features difficult:
How do you stop the code from deteriorating over time?
At any point, getting a feature to work is a higher priority than getting it to work with well-engineered code, which takes longer - even though, as time goes on, the effort for each feature increases.
How do you stop the code turning into an un-maintainable mush over time?
A comprehensive set of unit tests
edit: and it's helpful if they are well written, to accurately test all your classes/interfaces in a human-readable way.
edit 2: as svelil says, refactor your code to keep it clean - but being able to do this is a consequence of having the unit tests.
Unit tests will not stop the rot on their own. I can still write horrendous, unmaintainable code that passes a unit test.
A better answer is unit tests + regular refactoring + peer review (either at the pairing stage or after) + standards.
You do know there is no silver bullet.
Use an iterative development process:
Implement function
Refactor code
Jump to 1.
You have to have some discipline; without it you will end up with a mess. Even if you think "Oh, the code is readable enough", don't skip step 2. Of course, development should always be accompanied by testing.
Periodic refactorings, particularly in the section of code in which you're currently working (the "Boy Scout" rule).
The top and accepted answer in this question should be "Comprehensive unit tests".
This answer is not going to repeat that.
However, adding unit tests to an existing project is much harder, and generally a poor imitation of what can be achieved if the application code itself were written with unit testing in mind.
Also, extreme schedule pressure can make it impossible to consider; for those who haven't experienced unit testing, it's still a big punt.
My recommendation in those conditions is to write as well as you can to achieve the current goals, and be prepared to refactor existing code before adding new functions. While unit testing would make this approach far safer, it is still useful even without unit testing.
Of course good general testing and QA is important.
A decent set of coding standards.
They don't need to be complete, but they should mean that you know what things like your brace indentation are, so you have less to think about (and it means that places where the code was rushed and not formatted properly will stand out like a sore thumb).
In my job, as we approach a code freeze, the 'fast and dirty' approach is sometimes necessary. This usually results in some pretty dodgy code that works but offends thine eyes.
However, immediately after shipping there is a period of relative calm. This is the perfect opportunity to revisit some of that smelly code and knock it into a bit of shape.
It helps to keep a list of areas that you would like to tidy up and assign some sort of priority to them. Despite how good you believe your memory to be, you WILL forget.
This approach works quite well for me but I suppose it depends upon your particular project workflow.

How would you maintain legacy applications [closed]

How would you maintain legacy applications that:
have no unit tests
have big methods with a lot of duplicated logic
have no separation of concerns
have a lot of quick hacks and hard-coded strings
have outdated and wrong documentation
have requirements that are not properly documented! This has actually resulted in disputes between the testers, developers and the clients in the past. Of course there are some non-functional requirements such as "shouldn't be slow" or "don't clash", and other business logic that is known to the application users. But beyond the most common-sense scenario and the most common-sense business workflow, there is little guidance on what should (or should not) be done.
???
You need the book Working Effectively with Legacy Code by Michael C. Feathers.
Write tests as soon as you can. Preferably against the requirements (if they exist). Start with functional tests. Refactor in small chunks. Anytime you touch code, leave it cleaner and better than when you started.
Two things.
Write unit tests as you have the chance.
Once you have enough unit tests to be confident, start refactoring.
The rate at which you accomplish this may be slow... Typically, you're supposed to "just maintain it", not fix it.
During the "learning how to maintain it" phase, however, you can write a lot of unit tests.
Then, as bugs are found and enhancements requested, you can add yet more tests.
It's Agile, applied to legacy.
I have seen, worked in, and am working in a codebase which satisfies all the conditions mentioned in the question :-)
The approach followed in maintaining this codebase is NOT TO BREAK ANYTHING. FWIW, the code works and the end users are happy. No one is going to listen to the developers' cries that there is duplication of code, hard-coded strings, etc. We just steal some time to fix whatever is possible and take the utmost care not to introduce new bugs.
I think I would create a small set of up-to-date information: what action calls which functions, etc.
From there, I would look at refactoring. Duplicated logic seems to be something that could be refactored, but remember that:
it can be a huge task when you realize in how many places that logic is called, and
two functions that seem similar may have a tiny difference, e.g. a - instead of a +
I think the biggest urge to resist is "Just rebuild the whole damn thing!" - get an overview of the system first, to demystify the beast.
sudo rm -rf /
But more seriously, I think it has to be evaluated. If the code is continually a source of requests for change, and the changes are difficult, then before long you have to consider whether it is worth it to try and refactor/re-engineer the system into something more modern. Of course this isn't always practical, so you often end up with just a few people on the team who are responsible for maintaining the legacy parts. As much as possible, everyone on the team should be able to maintain all parts of the system.
One more thing that I think is important is to track the amount of time and effort that a team spends working on a legacy system doing maintenance/feature requests. These metrics can be convincing when evaluating the planning of a new effort to replace the legacy systems/components.
I basically agree with everything Paul C said. I'm not a TDD priest, but anytime you're touching a legacy codebase -- especially one with which you're not intimately familiar -- you need to have a solid way to retest and make sure you've followed Hippocrates: First, do no harm. Testing, good unit and regression tests in particular, are about the only way to make that play.
I highly recommend picking up a copy of Reversing: Secrets of Reverse Engineering Software if it's a codebase with which you're unfamiliar. Although this book goes to great depths that are outside your current needs (and mine, for that matter), it taught me a great deal about how to safely and sanely work with someone else's code.

Best practices: Many small functions/methods, or bigger functions with logical process components inline? [closed]

Is it better to write many small methods (or functions), or to simply write the logic/code of those small processes right into the place where you would have called the small method? What about breaking off code into a small function even if for the time being it is only called from one spot?
If one's choice depends on some criteria, what are they; how should a programmer make a good judgement call?
I'm hoping the answer can be applied generally across many languages, but if necessary, answers given can be specific to a language or languages. In particular, I'm thinking of SQL (functions, rules and stored procedures), Perl, PHP, Javascript and Ruby.
I always break long methods up into logical chunks and try to make smaller methods out of them. I don't normally turn a few lines into a separate method until I need it in two different places, but sometimes I do just to help readability, or if I want to test it in isolation.
Fowler's Refactoring is all about this topic, and I highly recommend it.
Here's a handy rule of thumb that I use from Refactoring. If a section of code has a comment that I could re-word into a method name, pull it out and make it a method.
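A small sketch of that rule in Java (the order-processing names are made up): the comment's wording becomes the method name, and the comment itself becomes unnecessary.

    import java.util.List;

    class OrderProcessor {

        void process(List<Order> orders) {
            // Before, this call was an inline block labelled
            // "// remove cancelled orders" - now the name says it.
            removeCancelledOrders(orders);
            // ... rest of processing ...
        }

        private void removeCancelledOrders(List<Order> orders) {
            orders.removeIf(o -> o.cancelled);
        }

        static class Order {
            boolean cancelled;
        }
    }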
The size of the method is directly linked to its cyclomatic complexity.
The main advantages to keep the size of the method small (which means dividing a big method into several small methods) are:
better unit testing (due to low cyclomatic complexity)
better debugging due to a more explicit stack trace (instead of one error within one giant method)
As always, you can say: it depends. It's more a question of naming and defining the task of a method. Every method should do one (and only one) well-defined task, and should do it completely. The name of the method should indicate the task. If your method is named DoAandB(), it may be better to have separate methods DoA() and DoB(). If you need methods like setupTask, executeTask, finishTask, it may be useful to combine them.
Some points that indicate, that a merge of different methods may be useful:
A method cannot be used alone, without the use of other methods.
You have to be careful to call some dependent methods in the right order.
Some points that indicate, that a splitup of the method could be useful:
Some lines of the existing method have a clear, independent task.
Unit-testing of the big method gets problematic. If tests are easier to write for independent methods, then split the big method up.
As an explanation of the unit-test argument: I wrote a method that did some things, including IO. The IO part was very hard to test, so I thought about it. I came to the conclusion that my method did 5 logical and independent steps, and only one of them involved the IO. So I split the method up into 5 smaller ones, four of which were easy to test.
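A sketch of the same move in Java (the Totaller class is invented): the one step that touches IO stays thin, and the logic becomes a pure function you can test with plain lists, no files needed.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    class Totaller {

        // The IO step: thin and boring, little to get wrong.
        static int totalFromFile(Path file) throws IOException {
            return total(Files.readAllLines(file));
        }

        // The logic: a pure function that is trivial to unit test.
        static int total(List<String> lines) {
            int sum = 0;
            for (String line : lines) {
                sum += Integer.parseInt(line.trim());
            }
            return sum;
        }
    }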
Small methods every time.
They are self documenting (er, if well named)
They break down the problem into manageable parts - you are KeepingItSimple.
You can use OO techniques to more easily (and obviously) plug in behaviour. The large method is by definition more procedural and so less flexible.
They are unit testable. This is the killer: you simply can't unit test some huge method that performs a load of tasks.
Something I learnt from The Code Complete book:
Write methods/functions so that each implements one chunk (or unit, or task) of logic. If that requires breaking it down into subtasks, then write a separate method/function for them and call them.
If I find that the method/function name is getting long, then I try to examine the method to see if it can be broken down into two methods.
Hope this helps
Some rules of thumb:
Functions should not be longer than what can be displayed on screen
Break functions into smaller ones if it makes the code more readable.
I make each function do one thing, and one thing only, and I try not to nest too many levels of logic. Once you start breaking your code down into well named functions, it becomes a lot easier to read, and practically self-documenting.
I find that having many small methods makes code easier to read, maintain and debug.
When I'm reading through a unit that implements some business logic, I can better follow the flow if I see a series of method calls that describe the process. If I care about how the method is implemented, I can go look in the code.
It feels like more work but it ultimately saves time.
There is an art, I think, to knowing what to encapsulate. Everyone has some slight difference of opinion. If I could put it in words I'd say that each method should do one thing that can be described as a complete task.
The bigger the method, the harder it is to test and maintain. I find it's much easier to understand how a large process works when it's broken down into atomic steps. Also, doing this is a great first step toward making your classes extensible. You can mark those individual steps as virtual (for inheritance), or move them into other objects (composition), making your application's behavior easier to customize.
I usually go for splitting functions into smaller functions that each perform a single, atomic task, but only if the function is complex enough to warrant it.
This way, I don't end up with multiple functions for simple tasks, and the functions I do extract can typically be used elsewhere as they don't try to achieve too much. This also aids unit testing as each function (as a logical, atomic action) can then be tested individually.
It depends a bit ... on mindset. Still, this is not an opinionated question.
The answer rather actually depends on the language context.
In a Java/C#/C++ world, where people are following the "Clean Code" school, as preached by Robert Martin, then: many small methods are the way to go.
A method has a clear name, and does one thing. One level of nesting, that's it. That limits its length to 3, 5, max 10 lines.
And honestly: I find this way of coding absolutely superior to any other "style".
The only downside of this approach is that you end up with many small methods, so ordering within a file/class can become an issue. But the answer to that is to use a decent IDE that lets you easily navigate back and forth.
So the only "legit" reason to put all the stuff into one method/function is when your whole team works like that and prefers that style - or when you can't use decent tooling (but then navigating that big ugly function won't work well either).
Personally, I lean significantly in the direction of preferring more, smaller methods, but not to the point of religiously aiming for a maximum line count. My primary criterion or goal is to keep my code DRY. The minute I have a code block which is duplicated (whether in spirit or actually by the text), even if it might be 2 or 4 lines long, I DRY up that code into a separate method. Sometimes I will do so in advance if I think there's a good chance it will be used again in the future.
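For example (a made-up validation snippet): once the same few lines appear at two call sites, they move into one small, named method, and both call sites shrink to a single line.

    class Names {
        // Before, this null/empty/length check was pasted verbatim
        // wherever a name came in; now there is exactly one copy.
        static void requireValidName(String name) {
            if (name == null || name.isEmpty() || name.length() > 50) {
                throw new IllegalArgumentException("bad name: " + name);
            }
        }
    }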
On the flip side, I have also heard it argued that if your broken-off method is too small, then in the context of a team of developers, a teammate is likely not to know about your method and will either write the logic inline or write his own small method that does the same thing. This is admittedly a bad situation.
Some also try to argue that it is more readable to keep things inline, so a reader can just read top-down, instead of having to jump around method definitions, possibly across multiple files. Personally, I think the existence of a stack trace makes this not much of an issue.