When is it time to refactor code? - language-agnostic

On on hand:
1. You never get time to do it.
2. "Context switching" is mentally expensive (difficult to leave what you're doing in the middle of it).
3. It usually isn't an easy task.
4. There's always the fear you'll break something that's now working.
On the other:
1. Using that code is error-prone.
2. Over time you might realize that if you had refactored the code the first time you saw it, that would have saved you time on the long run.
So my question is - Practically - When do you decide it's time to refactor your code?
Thanks.

A couple of observations:
On on hand:
1. You never got time to do it.
If you treat re-factoring as something separate from coding (instead of an intrinsic part of coding decently), and if you can't manage time, then yeah, you'll never have time for it.
"Context switching" is mentally expensive (difficult to leave what you're doing in the middle of it).
See previous point above. Refactoring is an active component of good coding practices. If you separate the two as if they were two different tasks, then 1) your coding practices need improvement/maturing, and 2) you will engage in severe context switching if your code is in a severe need of refactoring (again, code quality.)
It's usually isn't an easy task.
Only if the code you produce is not amenable to refactoring. That is, code that is hard to refactor exhibits one or more of the following (list is not universally inclusive):
High cyclomatic complexity,
No single responsibility per class (or procedure),
High coupling and/or poor low cohesion (aka poor LCOM metrics),
poor structure
Not following the SOLID principles.
No adherence to the Law of Demeter when appropriate.
Excessive adherence to the Law of Demeter when inappropriate.
Programming against implementations instead of interfaces.
There's always the fear you'll break something that's now working.
Testing? Verification? Analysis? Any of these before being checked into source control (and certainly before being delivered to the users)?
On the other:
1. Using that code is error-prone.
Only if it has never tested/verified and/or if there is no clear understanding of the conditions and usage patterns under which the potentially error-prone code operates acceptably.
Over time you might realize that if you would have refactored the code the
first time you saw it - That would have save you time on the long run.
That realization should not occur over time. Good engineering and work ethics calls for that realization to occur when the artifact (being hardware or software) is in the making.
So my question is - Practically - When do you decide it's time to refactor your code?
Practically, when I'm coding; I detect an area that needs improvement (or something that needs correction after a change on requirements or expectations); and I get an opportunity to improve it without sacrificing a deadline. If I cannot re-factor at that moment, I simply document the perceived defect and create a workable, realistic plan to revisit the artifact for refactoring.
In real life, there will be moments that we'll code some ugly kludge just to get things running, or because we are drained and tired or whatever. It's reality. Our job is to make sure that those incidents do not pile up and remain unattended. And the key to this is to refactor as you code, keep the code simple and with a good, simple and elegant structure. And by "elegant" I don't mean "smart-ass" or esoteric, but that displays what is typically considered readable, simple, composable attributes (and mathematical attributes when they apply practically.)
Good code lends itself to refactoring; it displays good metrics; its structure resembles both computer science function composition and mathematical function composition; it has a clear responsibility; it makes its invariants, pre and post-conditions evident; and so on and so on.
Hope it helps.

One of the most common mistakes i see is people associating the word "Refactor" with "Big Change".
Refactoring code does not always have to be big. Even small changes such as changing a bool to a proper enum, or renaming a method to be closer to the actual function vs. the intent is refactoring your code. With the exception of the end of a milestone, I try to make at least a very small refactoring every single time I check in. And you'd be amazed at how quickly this makes a visible difference in the code.
Bigger changes do take bigger planning though. I try and schedule about 1/2 a day every two weeks during a normal development cycle to tackle a bigger refactoring change. This is enough time to make a substantial improvement to the code base. If the refactoring fails 1/2 a day is not that much of a loss. And it's rarely a total loss because even the failed refactoring will teach you something about your code.

Whenever it smells, I refactor. I may not make it perfect now, but I can at least make a small step towards a better state. And those small changes do add up over time...
If I am in the middle of something when I notice the smell, and fixing it isn't trivial (or I am just before release), I may make a (mental or paper) note to return to it once I am finished with the primary task.
Practice makes one better :-) But if I don't see a solution to a problem, I put it aside and let it brew for a while, discuss it with coworkers, or even post it on SO ;-)
If I don't have unit tests and the fix isn't trivial, I start with the tests. If the tests aren't trivial either, I apply point 2.

I start to refactor as soon as I find I am repeating my self. DRY principles ftw.
Also, if methods/functions get too long, to the point where they look unwieldy, or their purpose is being obscured by the length of the function, I break it into private subfunctions that explain what is really going on.
Lastly, if everything's up and running, and the code is dog-slow, I start to look at refactoring for the sake of performance.

When implementing a new feature I often notice that the task would be much simpler if the code I'm working on was structured in a different way. In this case I usually step back, try to do the refactoring first, and only after this is done I continue implementing the new feature.
I also have a habit to track all potential improvements that come to my mind either in notes or the bug tracker. The ideas bake there for some time, some of them don't feel so compelling anymore, and the reasonable ones are implemented during a day which I dedicate to smaller tasks.

Refactor code when it needs to be refactored. Some symptoms I look for:
duplicate code in similar objects.
duplicate code in within methods of one object.
anytime the requirements have changed twice or more.
anytime somebody says "we will clean that up later".
any time I read through code and shake my head thinking "what goofball did this" (even when the goofball in question is me)
In general, less design and/or less clear requirements means more oppurtunities for refactoring.

This might sound like a joke, but really, I only refactor when things "get messy". When a simple task starts taking more time and usual, when I have to twist my mind around to remember what function is doing what and such. Also, if the code starts running slow and it's not because I'm running in a development enviroment (a lot of variable outputs and such) if I can't optimise it, I refactor the code. As you said, it's worthed on the long run.
Still, I allways make sure I have enough time to think things through before I start so I don't get in this sittuation.
Cheers!

I usually refactor when one of the following is true:
I have nothing better to do and waiting for the next project to come to my inbox
The additions/changes I'm making to the code cannot work unless, or would be better if, I refactor
I am aesthetically displeased with the way the code is laid out

Martin Fowler in his book if the same name suggests you do it the third time you're in a block of code to make another change. First time in the block, you happen to notice you should refactor, but don't have time. Second time back...same thing. Third time back-now refactor.
Also, I read the developers of a current release of smalltalk (squeak.org, I think) say they go through a couple weeks of intense coding...then they step back and look at what can be refactored.
Personally I have to resist the impulse to refactor as I code or I get 'paralyzed'.

Related

Do we need to avoid duplicate at any cost?

In our application framework ,whenever a code duplicate starts happen (even in slightest degree) people immediately wanted it to be removed and added as separate function. Though it is good ,when it is being done at the early stage of development soon that function need to be changed in an unexpected way. Then they add if condition and other condition to make that function stay.
Is duplicate really evil? I feel we can wait for triplication (duplicate in three places ) to occur before making it as a function. But i am being seen as alien if i suggest this. Do we need to avoid duplicate at any cost ?
There is a quote from Donald Knuth that pin points your question:
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.
I think that duplication is needed so we are able to understand what are the critical points that need to be DRYed.
Duplicated code is something that should be avoided but there are some (less frequent) contexts where abstractions may lead you to poor code, for example when writing test specs some degree of repetition may keep your code easier to maintain.
I guess you should always ask what is the gain of another abstraction and if there is no quick win you may leave some duplicated code and add TODO comment, this will make you code faster, deliver earlier and DRY only what is most valued.
A similar approach is described by Three Strikes And You Refactor:
The first time you do something, you just do it.
The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway.
(Or if you're like me, you don't wince just yet -- because your DuplicationRefactoringThreshold is 3 or 4.)
The third time you do something similar, you refactor.
This is the holy war between management, developers and QC.
So to make this war end you need to make a conman concept for all three of them.
Abstraction VS Encapsulation, Interface VS Class, Singleton VS Static.
you all need to give yourself time to understand each other point of view to make this work.
For example if you are using scrum then you can go sprints for developing then make a sprint for abstraction and encapsulation.

How do you refactor a large messy codebase?

I have a big mess of code. Admittedly, I wrote it myself - a year ago. It's not well commented but it's not very complicated either, so I can understand it -- just not well enough to know where to start as far as refactoring it.
I violated every rule that I have read about over the past year. There are classes with multiple responsibilities, there are indirect accesses (I forget the term - something like foo.bar.doSomething()), and like I said it is not well commented. On top of that, it's the beginnings of a game, so the graphics is coupled with the data, or the places where I tried to decouple graphics and data, I made the data public in order for the graphics to be able to access the data it needs...
It's a huge mess! Where do I start? How would you start on something like this?
My current approach is to take variables and switch them to private and then refactor the pieces that break, but that doesn't seem to be enough. Please suggest other strategies for wading through this mess and turning it into something clean so that I can continue where I left off!
Update two days later: I have been drawing out UML-like diagrams of my classes, and catching some of the "Low Hanging Fruit" along the way. I've even found some bits of code that were the beginnings of new features, but as I'm trying to slim everything down, I've been able to delete those bits and make the project feel cleaner. I'm probably going to refactor as much as possible before rigging my test cases (but only the things that are 100% certain not to impact the functionality, of course!), so that I won't have to refactor test cases as I change functionality. (do you think I'm doing it right or would it, in your opinion, be easier for me to suck it up and write the tests first?)
Please vote for the best answer so that I can mark it fairly! Feel free to add your own answer to the bunch as well, there's still room for you! I'll give it another day or so and then probably mark the highest-voted answer as accepted.
Thanks to everyone who has responded so far!
June 25, 2010: I discovered a blog post which directly answers this question from someone who seems to have a pretty good grasp of programming: (or maybe not, if you read his article :) )
To that end, I do four things when I
need to refactor code:
Determine what the purpose of the code was
Draw UML and action diagrams of the classes involved
Shop around for the right design patterns
Determine clearer names for the current classes and methods
Pick yourself up a copy of Martin Fowler's Refactoring. It has some good advice on ways to break down your refactoring problem. About 75% of the book is little cookbook-style refactoring steps you can do. It also advocates automated unit tests that you can run after each step to prove your code still works.
As for a place to start, I would sit down and draw out a high-level architecture of your program. You don't have to get fancy with detailed UML models, but some basic UML is not a bad idea. You need a big picture idea of how the major pieces fit together so you can visually see where your decoupling is going to happen. Just a page or two of some basic block diagrams will help with the overwhelming feeling you have right now.
Without some sort of high level spec or design, you just risk getting lost again and ending up with another unmaintainable mess.
If you need to start from scratch, remember that you never truly start from scratch. You have some code and the knowledge you gained from your first time. But sometimes it does help to start with a blank project and pull things in as you go, rather than put out fires in a messy code base. Just remember not to completely throw out the old, use it for its good parts and pull them in as you go.
What was most important for me on different occasions were unit tests: I took a few days to write tests for the old code and then I was free to refactor with confidence. How exactly is a different question, but having the tests made it possible for me to make real, substantial changes to the code.
I'll second everyone's recommendations for Fowler's Refactoring, but in your specific case you may want to look at Michael Feathers' Working Effectively with Legacy Code, which is really perfect for your situation.
Feathers talks about Characterization Tests, which are unit tests not to assert known behaviour of the system but to explore and define the existing (unclear) behaviour -- in the case where you've written your own legacy code, and fixing it yourself, this may not be so important, but if your design is sloppy then it's quite possible there are parts of the code that work by 'magic' and their behaviour isn't clear, even to you -- in that case, characterization tests will help.
One great part of the book is the discussion about finding (or creating) seams in your codebase -- seams are natural 'fault lines', if you like, where you can break into the existing system to start testing it, and pulling it towards a better design. Hard to explain but well worth a read.
There's a brief paper where Feathers fleshes out some of the concepts from the book, but it really is well worth hunting down the whole thing. It's one of my favourites.
Just an additional refactoring that is more important than you think: Name things correctly!
This goes for any variable name and method name. If the name does not accurately reflect what the thing is used for, then rename it to something more accurate. This might require several iterations. If you cannot find a short, and entirely accurate name, then that item does too much and you have an excellent candidate for a code snippet that needs to be split. The names also clearly indicate where the cuts are to be made.
Also, document your stuff. Whenever the answer to WHY? is not clearly conveyed by the answer to HOW? (being the code) you will need to add some documentation. Capturing design decisions is probably the most important task as it is very hard to do in code.
You could always start from "scratch". That doesn't mean scrap it and start from nothing, but try to rethink high-level things from the beginning, since you seem to have learned a lot since the last time you worked on it.
Start from a higher level, and as you build the scaffolding of your new and improved structure, take all the code you can reuse, which will probably be more than you think if you're willing to read through it and make some small changes.
When you're making the changes, be sure to be strict with yourself about following all the good practices you now know, because you will really thank yourself later.
It can be surprisingly refreshing to properly re-make program to do exactly what it did before, only more "cleanly". ;)
As others have mentioned as well, unit-tests are your best friend! They help you ensure that your refactoring works, and if you're starting from "scratch", it's the perfect time to write them.
You're in a much better position than many people facing this problem in that you understand what the code is supposed to do.
Taking variables out of a shared scope, as you're doing, is a great start, in that you're partitioning responsibilities. Ultimately you want each class to express a single responsibility. A few other things you might look at:
Easy targets for refactoring are code that's duplicated in lots of places and long methods.
If you're managing application state through statically initialized singletons or worse, a global state that everything is talking to, consider moving it to a managed initialization system (i.e. a dependency injection framework like spring or guice) or at least make sure that the initialization isn't entangled with the rest of the code.
Centralize and standardize how you're accessing outside resources, especially if you've got things like file locations or urls hardcoded.
Buy an IDE that has good refactoring support. I think IntelliJ is the best, but Eclipse has it now, too.
The unit test idea is key as well. You will want to have a suite of large, overall transactions that will give you the overall behavior of the code.
Once you have those, start creating unit tests for classes and smaller packages. Write the tests to demonstrate proper behavior, make your changes, and re-run the tests to demonstrate that you haven't broken everything.
Track code coverage as you go. You'll want to work it up to 70% or better. For the classes you change, you'll want those to be 70% or better before you make your changes.
Build up that safety net over time and you'll be able to refactor with some confidence.
very slowly :D
No seriously... take it one step at a time. For instance, refactor something only if it affects or helps you write the current bug/feature that you are working on right now and do no more than that. And before you refactor make darn sure that you have some kind of automated test in place that gets run on each build that will actually test what you are writing/refactoring. Even if you don't have unit tests, it is never too late to start adding them for all new and modified code that is being written. Over time, your code base will get better in small increments daily or weekly instead of worse - all without you making monumental heaps of changes.
In my personal opinion and experience, it's not worth it to just refactor a (legacy) codebase en masse for the sake of refactoring. In those cases, it's best to just start from scratch and do it right all over again (and very rarely are you afforded the opportunity to do such a thing). Hence, just refactoring incremental is the way to go.
For Java code, my favorite first step is to run Findbugs and then remove all the dead stores, un-used fields, unreachable catch blocks, unused private methods and likely bugs.
Next I run CPD to look for evidence of cut-copy-paste code.
It isn't unusual to be able to reduce the code base by 5% by doing this. It also saves you from refactoring code that is never used.
I think you should use Eclipse as a IDE because it is having many plugins and free of cost.You should now follow the MVC pattern and yes must write test cases using JUnit.Eclipse also have plugin for JUnit and it is providing code refactoring facility too so that will reduce your some work.And always remember that writing a code is not important the main thing is to write clean code.So now give comments everywhere so that not only you but any other person read the code then while reading the code he must feel that he is reading an essay.
Refactor the low-hanging fruit. Nibble away at the easy bits, and as you do that, the harder bits will begin to be a little easier. When there aren't any bits left to refactor, you're done.
The refactorings you'll probably find most useful are Rename Method (and even more trivial Renamings like Field, Variable, and Parameter), Extract Method, and Extract Class. For each refactoring you perform, write the necessary unit tests to make the refactoring safe, and run the entire suite of unit tests after each refactoring. It's tempting - and, let's be honest, pretty safe - to rely on the automated refactorings of your IDE, without the tests - but it's good practice and will be good to have the tests into the future as you add functionality to your project.
You might want to look at Martin Fowler's book Refactoring. This is the book that popularized the term and technique (my thought when taking his course: "I've been doing a lot of this all along, I didn't know it had a name"). A quote from the link:
Refactoring is a controlled technique
for improving the design of an
existing code base. Its essence is
applying a series of small
behavior-preserving transformations,
each of which "too small to be worth
doing". However the cumulative effect
of each of these transformations is
quite significant. By doing them in
small steps you reduce the risk of
introducing errors. You also avoid
having the system broken while you are
carrying out the restructuring - which
allows you to gradually refactor a
system over an extended period of
time.
As others have pointed out, unit tests will allow you to refactor with confidence. And start by reducing code duplication. The book will give you lots of other insights.
Here is a catalog of refactorings.
The correct definition of messy code, is code that hard to maintain and change.
To use more mathematical definition, you can check your code by code metrics tools.
This way, you will keep the code that already good enough, and find very fast, the wrong code.
My experience say, that is very powerful way to improve the quality of your code. (if your tool can show you the result on each build or on realtime)
Throw it away, build it new.

Being pressured to GOTO the dark-side

We have a situation at work where developers working on a legacy (core) system are being pressured into using GOTO statements when adding new features into existing code that is already infected with spaghetti code.
Now, I understand there may be arguments for using 'just one little GOTO' instead of spending the time on refactoring to a more maintainable solution. The issue is, this isolated 'just one little GOTO' isn't so isolated. At least once every week or so there is a new 'one little GOTO' to add. This codebase is already a horror to work with due to code dating back to or before 1984 being riddled with GOTOs that would make many Pastafarians believe it was inspired by the Flying Spaghetti Monster itself.
Unfortunately the language this is written in doesn't have any ready made refactoring tools, so it makes it harder to push the 'Refactor to increase productivity later' because short-term wins are the only wins paid attention to here...
Has anyone else experienced this issue whereby everybody agrees that we cannot be adding new GOTOs to jump 2000 lines to a random section, but continually have Anaylsts insist on doing it just this one time and having management approve it?
tldr;
How can one go about addressing the issue of developers being pressured (forced) to continually add GOTO statements (by add, I mean add to jump to random sections many lines away) because it 'gets that feature in quicker'?
I'm beginning to fear we may lose valuable developers to the raptors over this...
Clarification:
Goto here
alsoThere: No, I'm talking about the kind of goto that jumps 1000 lines out of one subroutine into another one mid way into a while loop. Goto somewhereClose
there: I wasn't even talking about the kind of gotos you can reasonably read over and determine what a program was doing. Goto alsoThere
somewhereClose: This is the sort of code that makes meatballs midpoint: If first time here Goto nextpoint detail:(each one almost completely different) Goto pointlessReturn
here: In this question, I was not talking about the occasionally okay use of a goto. Goto there
tacoBell: and it has just gone back to the drawing board. Goto Jail
elsewhere: When it takes Analysts weeks to decypher what a program is doing each time it is touched, something is deeply wrong with your codebase. In fact, I'm actually up to my hell:if not up-to-date goto 4 rendition of the spec goto detail pointlessReturn: goto tacoBell
Jail: Actually, just a small update with a small victory. I spent 4 hours refactoring a portion of this particular program a single label at a time, saving each iteration in svn as I went. Each step (about 20 of them) was smallish, logical and easy enough to goto bypass nextpoint: spontaneously jump out of your meal and onto you screen through some weird sort of spaghetti-meatball magnetism. Goto elseWhere
bypass: reasonably verify that it should not introduce any logic changes. Using this new more readable version, I've sat down with the analyst and completed almost all of this change now. Goto end
4: first *if first time here goto hell, no second if first time here goto hell, no third if first time here goto hell fourth now up-to-date goto hell
end:
How many bugs have been introduced because of incorrectly written GOTOs? How much money did they cost the company? Turn the issue into something concrete, rather than "this feels bad". Once you can get it recognized as a problem by the people in charge, turn it into a policy like, "no new GOTOs for anything except simplifying the exit logic for a function", or "no new GOTOs for any functions that don't have 100% unit test coverage". Over time, tighten the policies.
GOTOs don't make good code spaghetti, there are a multitude of other factors involved. This linux mailing list discussion can help put a few things into perspective (comments from Linus Torvalds about the bigger picture of using gotos).
Trying to institute a "no goto policy" just for the sake of not having gotos will not achive anything in the long run, and will not make your code more maintainable. The changes will need to be more subtle and focus on increasing the overall quality of the code, think along the lines of using best practices for the platform/language, unit test coverage, static analysis etc.
The reality of development is that despite all the flowery words about doing it the right way, most clients are more interested in doing it the fast way. The concept of a code base rapidly moving towards the point of imploding and the resulting fallout on their business is something that they cannot comprehend because that would mean having to think beyond today.
What you have is just one example. How you stand on this will dictate how you do development in the future. I think you have 4 options:
Accept the request and accept that you will always be doing this.
Accept the request, and immediately start looking for a new job.
Refuse to do and and be prepared to fight to fix the mess.
Resign.
Which option you choose is going to depend on how much you value your job and your self esteem.
Maybe you can use the boyscout-pattern: Leave the place a little more clean than you found it. So for every featurerequest: don't introduce new gotos, but remove one.
This won't spend too much time for improvements, would give more time, to find newly introduced bugs. Maybe it wouldn't cost much additional time, if you remove a goto from the part, which although had to spend time understanding, and bringing the new feature in.
Argue, that a refactoring of 2 hours will save 20 times 15 minutes in the future, and allow faster and deeper improvements.
Back to principles:
Is it readable?
Does it work?
Is it maintainable?
This is the classic "management" vs. "techies" conflict.
In spite of being on the "techie" team, over the years I have settled firmly the "management" side of this debate.
There are a number of reasons for this:
It's quite possible to have well written easy to read properly structured programs with gotos! Ask any assembler programmer; conditional branch and a primitive do loop are all they have to work with. Just because the "style" is not current and doesn't mean its not well written.
If it is a mess then its going to be a real pain to extract the busines rules you will need if you are going for a complete re-write, or, if you are doing a technical refactoring of the program you will never be sure the behaviour of the refactored program is "correct" i.e. it does exactly what the old program does.
Return on investment -- sticking to the original programming style and making minimal changes will take minimum effort and quickly statisfy the customers request. Spending a lot of time and effort refactoring will be more expensive and take longer.
Risk -- rewrites and refactoring are hard to get right, extensive testing of the refactored code is required and things that look like "bugs" may have been "features" as far as the customer is concerned. A particular problem with "improving" legacy code is that business users may have well established work arounds that depend on a bug being there, or, exploit the existense of a bugs to change the business procedures without informing the IT department.
So all in all management is faced with a decision -- "add one little goto" test the change and get it into production in double quick time with little risk -- or -- go in for a major programming effort and have the business scream at them when a new set of bugs crops up.
And if you do decide to refactor in five years time some snotty nosed college graduate will be moaning that your refactored program is not buzzword compliant any more and demanding he be allowed to spend weeks rewriting it.
If it ain't broke dont fix it!
PS: Even Joel thinks this is a bad idea: things you should never do
Update!-
OK if you want to refactor and improve the code you need to go about it properly.
The basic problem is you are saying to the client is "I want to spend n weeks working on this program and, if everything goes well, it will do exactly what it does now." -- this is a hard sell to say the least.
You need to gather long term data on the number of crashes and outages, the time spent analysing and programming seemingly small changes, the number of change requests that were not done because they were too hard, business opertunities lost because the system could not change fast enough. Also gather some acedemic data on the costs of maintaing well structured programs vs. letting it sink.
Unless you have a watertight case to present to the beancounters you will not get the budget. You really have to sell this to the business management, not, your immediate bosses.
I recently had to work on some code that wasn't legacy per se, but the former developers' habits certainly were and thus GOTO's were everywhere. I don't like GOTO's; they make a hideous mess of things and make debugging a nightmare. Worse yet, replacing them with normal code is not always straightforward.
IF you can't unwind your GOTO's I certainly recommend that you no longer use them.
Unfortunately the language this is written in doesn't have any ready made refactoring tools
Don't you have editors with macro capabilities? Don't you have shell scripts? I've done tons of refactoring over the years, very little of it with refactoring browsers.
The underlying problem here seems to be that you have 'analysts' who describe, presumably necessary, functional changes in terms of adding a goto to some code. And then you have 'programmers' whose job appears to be limited to typing in that change and then complaining about it.
To make anything different, that dysfunctional allocation of responsibilities between those people needs to change. There are a lot of different ways to split up the work: The classic, conventional one (that is quite likely the official, but ignored, policy in your work) is to have analysts write a implementation-independent specification document and programmers implement it as maintainably as they can.
The problem with that theoretical approach is it is actually unworkably unrealistic in many common situations. In particular, doing it 'properly' requires employees seen as junior to win an argument with colleagues who are more senior, have better social connections in the office, and more business-savvy. If the argument 'goto is not implementation-independent, so as an analyst you simply can't say that word' doesn't fly at your workspace, then presumably this is the case.
Far better in many circumstances are alternatives like:
analysts write client-facing code and developers write infrastructure.
Analysts write executable specifications which are used as reference implementations for unit tests.
Developers create a domain specific language in which analysts program.
You drop he distinction between one and the other, having only a hybrid.
If a change to the program requires "just one little goto" I would say that the code was very maintainable.
This is a common problem when dealing with legacy code. Stick to the style the program was originally written in or "modernize" the code. For me the answer is usually to stick with the original style unless you have a really big change that would justify a complete re-write.
Also be aware that several "modern" language features like java's "throw" statement, or SQLs "ON ERROR" are really "GO TO"s in disguise.

How to convince your fellow developer to write short methods?

Long methods are evil on several grounds:
They're hard to understand
They're hard to change
They're hard to reuse
They're hard to test
They have low cohesion
They may have high coupling
They tend to be overly complex
How to convince your fellow developer to write short methods? (weapons are forbidden =)
question from agiledeveloper
Ask them to write unit tests for the methods.
That depends on your definitions of "short" and "long".
When I hear someone say "write short methods", I immediately react badly because I've encountered too much spaghetti written by people who think the ideal method is two lines long: One line to do the tiniest possible unit of work followed by one line to call another method. (You say long methods are evil because "they're hard to understand"? Try walking into a project where every trivial action generates a call stack 50 methods deep and trying to figure out which of those 50 layers is the one you need to change...)
On the other hand, if, by "short", you mean "self-contained and limited to a single conceptual function", then I'm all for it. But remember that this can't be measured simply by lines of code.
And, as tydok pointed out, you catch more flies with honey than vinegar. Try telling them why your way is good instead of why their way is bad. If you can do this without making any overt comparisons or references to them or their practices (unless they specifically ask how your ideas would relate to something they're doing), it'll work even better.
You made a list of drawbacks. Try to make a list of what you'll gain by using short methods. Concrete examples. Then try to convince him again.
I read this quote from somewhere:
Write your code as if the person who has to maintain it is a violent psycho, who knows where you live.
In my experience the best way to convince a peer in these cases is by example. Just find opportunities to show them your code and discuss with them the benefits of short functions vs. long functions. Eventually they'll realize what's better spontaneously, without the need to make them feel "bad" programmers.
Code Reviews!
I suggest you try and get some code reviews going. This way you could have a little workshop on best practices and whatever formatting your company adhers to. This adds the context that short methods is a way to make code more readable and easier to understand and also compliant with the SRP.
If you've tried to explain good design and people just aren't getting it, or are just refusing to get it, then stop trying. It's not worth the effort. All you'll get is a bad rep for yourself. Some people are just hopeless.
Basically what it comes down to is that some programmers just aren't cut out for development. They can understand code that's already written, but they can't create it on their own.
These folks should be steered toward a support role, but they shouldn't be allowed to work on anything new. Support is a good place to see lots of different code, so maybe after a few years they'll come to see the benefits of good design.
I do like the idea of Code Reviews that someone else suggested. These sloppy programmers should not only have their own code reviewed, they should sit in on reviews of good code as well. That will give them a chance to see what good code is. Possibly they've just never seen good code.
To expand upon rvanider's answer, performing the cyclomatic complexity analysis on the code did wonders to get attention to the large method issue; getting people to change was still in the works when I left (too much momentum towards big methods).
The tipping point was when we started linking the cyclomatic complexity to the bug database. A CC of over 20 that wasn't a factory was guaranteed to have several entries in the bug database and oftentimes those bugs had a "bloodline" (fix to Bug A caused Bug B; fix to Bug B caused Bug C; etc). We actually had three CC's over 100 (max of 275) and those methods accounted for 40% of the cases in our bug database -- "you know, maybe that 5000 line function isn't such a good idea..."
It was more evident in the project I led when I started there. The goal was to keep CC as low as possible (97% were under 10) and the end result was a product that I basically stopped supporting because the 20 bugs I had weren't worth fixing.
Bug-free software isn't going to happen because of short methods (and this may be an argument you'll have to address) but the bug fixes are very quick and are often free of side-effects when you are working with short, concise methods.
Though writing unit tests would probably cure them of long methods, your company probably doesn't use unit tests. Rhetoric only goes so far and rarely works on developers who are stuck in their ways; show them numbers about how those methods are creating more work and buggy software.
Finding the right blend between function length and simplicity can be complex. Try to apply a metric such as Cyclomatic Complexity to demonstrate the difficulty in maintaining the code in its present form. Nothing beats a non-personal measurement that is based on testing factors such as branch and decision counts.
Not sure where this great quote comes from, but:
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it"
Force him to read Code Complete by Steve McConnell. Say that every good developer has to read this.
Get him drunk? :-)
The serious point to this answer is the question, "why do I consistently write short functions, and hate myself when I don't?"
The reason is that I have difficulty understanding complex code, be that long functions, things that maintain and manipulate a lot of state, or that sort of thing. I noticed many years ago that there are a fair number of people out there that are significantly better at dealing with this sort of complexity than I am. Ironically enough, it's probably because of that that I tend to be a better programmer than many of them: my own limitations force me to confront and clean up that sort of code.
I'm sorry I can't really provide a real answer here, but perhaps this can provide some insight to help lead us to an answer.
Force them to read the book "Clean Code", there are many others but this one is new, good and an easy read.
Asking them to write Unit tests for the complex code is a good avenue to take. This person needs to see for himself what that debt that complexity brings when performing maintenance or analysis.
The question I always ask my team is: "It's 11 pm and you have to read this code - can you? Do you understand under pressure? Can you, over the phone, no remote login, lead them to the section where they can fix an error?" If the answer is no, the follow up is "Can you isolate some of the complexity?"
If you get an argument in return, it's a lost cause. Throw something then.
I would give them 100 lines of code all under 1 method and then another 100 lines of code divided up between several methods and ask them to write down an explanation of what each does.
Time how long it takes to write both paragraphs and then show them the result.
...Make sure to pick code that will take twice or three times as long to understand if it were all under one method - Main() -
Nothing is better than learning by example.
short or long are terms that can be interpreted differently. For one short is a 2 line method while some else will think that method with no more than 100 lines of code are pretty short.
I think it would be better to state that a single method should not do more than one thing at the same time, meaning it should only have one responsibility.
Maybe you could let your fellow developers read something about how to practice the SOLID principles.
I'd normally show them older projects which have well written methods. I would then step through these methods while explaining the reasons behind why we developed them that way.
Hopefully when looking at the bigger picture, they would understand the reasons behind this.
ps. Also, this exercise could be used in conjuction as a mini knowledge transfer on older projects.
Show him how much easier it is to test short methods. Prove that writing short methods will make it easier and faster for him to write the tests for his methods (he is testing these methods, right?)
Bring it up when you are reviewing his code. "This method is rather long, complicated, and seems to be doing four distinct things. Extract method here, here, and here."
Long methods usually mean that the object model is flawed, i.e. one class has too many responsibilities. Chances are that you don't want just more functions, each one shorter, in the same class, but those responsibilies properly assigned to different classes.
No use teaching a pig to sing. It wastes your time and annoys the pig.
Just outshine someone.
When it comes time to fix a bug in the 5000 line routine, then you'll have a ten-line routine and a 4990-line routine. Do this slowly, and nobody notices a sudden change except that things start working better and slowly the big ball of mud evaporates.
You might want to tell them that he might have a really good memory, but you don't. Some people are able to handle much longer methods than others. If you both have to be able to maintain the code, it can only be done if the methods are smaller.
Only do this if he doesn't have a superiority complex
[edit]
why is this collecting negative scores?
You could start refactoring every single method they wrote into multiple methods, even when they're currently working on them. Assign extra time to your schedule for "refactoring other's methods to make the code maintanable". Do it like you think it should be done, and - here comes the educational part - when they complain, tell them you wouldn't have to refactor the methods if they would have made it right the first time. This way, your boss learns that you have to correct other's lazyness, and your co-workers learn that they should make it different.
That's at least some theory.

How to restrain one's self from the overwhelming urge to rewrite everything?

Setup
Have you ever had the experience of going into a piece of code to make a seemingly simple change and then realizing that you've just stepped into a wasteland that deserves some serious attention? This usually gets followed up with an official FREAK OUT moment, where the overwhelming feeling of rewriting everything in sight starts to creep up.
It's important to note that this bad code does not necessarily come from others as it may indeed be something we've written or contributed to in the past.
Problem
It's obvious that there is some serious code rot, horrible architecture, etc. that needs to be dealt with. The real problem, as it relates to this question, is that it's not the right time to rewrite the code. There could be many reasons for this:
Currently in the middle of a release cycle, therefore any changes should be minimal.
It's 2:00 AM in the morning, and the brain is starting to shut down.
It could have seemingly adverse affects on the schedule.
The rabbit hole could go much deeper than our eyes are able to see at this time.
etc...
Question
So how should we balance the duty of continuously improving the code, while also being a responsible developer? How do we refrain from contributing to the broken window theory, while also being aware of actions and the potential recklessness they may cause?
Update
Great answers! For the most part, there seems to be two schools of thought:
Don't resist the urge as it's a good one to have.
Don't give in to the temptation as it will burn you to the ground.
It would be interesting to know if more people feel any balance exists.
I'm a big fan of making lists!
As soon as the urge takes you to re-write something - spend 10 minutes making a list of the things that need re-writing. Follow all of the alleys that take you further into the code that needs attention and list those, too.
Hopefully within a relatively short space of time you'll have one of two things:
A really long list that completely puts you off wanting to rewrite anything ever again.
A list which actually isn't that long, so why not indulge yourself and go for a re-write?!
I've found that spending two months fixing bugs caused by my "harmless" re-write was enough to cure me of the urge to do these sorts of things without a clear mandate/project plan to do so.
The most memorable project of this kind for me occured some 4 years ago when I was called in to a remote office to "help" with a project that was due in 1 week for major presentation to the client and was not working yet at all. The project had been primarily off-shored to India, and IMO, a project management failure resulted in a ton of spaghetti code that was too fragmented to ever work properly in its current form.
After a full day's review, I presented my opinion to the management that the project simply needed wholesale refactoring and reorganization or it would never work properly. The result of this discussion was 6 days of 20 hours work / 4 hours sleep, 2 of which I actually spent sleeping on the couch in the company lobby due to the wasted time when driving back to the hotel.
The major improvements to the code included:
Application of naming standards
Moved into source control
Development of a build process
Documentation of the individual components
Most of the original code was left in place, but simply moved and reorganized / refactored to make it sustainable in the long term. Was it hell week? Sure. Did it make the project more successful? Yep.
I can't live with spaghetti code, and I'll often donate my own personal time to address it.
How to restrain one’s self from the overwhelming urge to rewrite everything?
Become old*
As this has happened to me, I've gradually lost the urge to rewrite everything. Why?
Once you've done it a few times, you realise that you often end up worse off than you started.
Even if you're god's gift to programming and your brilliant rewrite introduces no new bugs, you'll simply fail to notice or implement about 30% of the small edge-case features/bugs which things rely on. This will cost you months in fixing
Nothing wears down your irrational exuberance like time. Often this is a sad loss, but in this case, it's a win.
*(more experience may also be a suitable substitute if you lack the free time to become old)
The impulse to rewrite is righteous, provided that:
you "freeze" the existing code (with a label)
you start the rewrite attempt in a separate branch
you prepare first some unit-test in order to ascertain the current behavior and make sure you reproduce the existing features...
That said, you have to balance the rewriting process with the measure stability of the legacy code.
"If it is not broken, do not fix it" ;)
It's not quite an answer, but reading the beautiful article The Narcissism of Small Code Differences might help.
You're absolutely right that there's a right time and a wrong time to rewrite (and destabilize) the code.
Reminds me of an engineer I knew who had a habit of diving in and doing major rewriting whenever he felt like it. This drove the QA staff for his product up the wall -- if he report a bug about something trivial, the bug would get fixed, but the engineer's major rewrite of stuff he noticed while fixing the one bug would introduce fifty other new bugs, which then the QA staff needed to track down.
So one way to cure yourself of the temptation to do rewrites is to walk a mile in the shoes of the QA engineer who has to test, diagnose, and report all those new bugs. Maybe even lend yourself to the QA department for a few months to get a personal feel for the type of work they do (if you've never done that).
Another suggestion would be to make a rule for yourself to write up notes to describe any change. Perhaps post these notes when you check in code changes to source code control. You may be motivated to make the changes as minor as possible this way.
Just don't - The rabbit hole always goes too deep. Make any small local changes that leave the place cleaner than you found it, but leave big refactoring for when you're alert and have plenty of time to burn.
A nice starter on tackling big refactorings in small stages at sourcemaking.com :
you have to make like Hansel and
Gretel and nibble around the edges, a
little today, a little more tomorrow
reread "Refactoring".
Take a piece of paper and itemize the "Bad Smells" list.
(for each smell in BadSmells() {
print smell.name;
}
Add comments to the code including item(s) from the list.
while( odorPersists() ) {
Work through the list, laser-focused on one smell at a time.
}
Personally, this is where bug tracking software such as JIRA or Bugzilla come in to place with me.
I have a hard time NOT fixing all the broken windows when I see them, but if the situation is bad enough (and time permits) I will open a ticket and either assign it to me or assign it to the group responsible, that way I don't go off on a tangent.
-- I do only what needs to be done right them, but yet the issue is documented and will be fixed in time.
-- This is not to say that small broken windows should be handled this way; small issues should always be fixed on contact IMHO.
As far as I am concerned, if you have nothing better to do than to rewrite the code then you should make everyone else aware and just do it. Obviously, if it's going to take days to get any sort of result and the changes would mean excessive downtime then it's not a good idea, but eventually that code will become a problem. Make others aware that the code is awful and try to get some sort of movement to rewrite it.
The problem with the above statement is that as a programmer/developer you will ALWAYS have other things to keep yourself busy. Just leave it on the low-priority list of things-to-do, so when you're struggling with some work you can always keep your rhythm going with the rewrite.
Joel has an article about this:
There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong.
I've never worked anywhere particularly agile, so what I do is:
1) Figure out whether there's a reasonable fix that doesn't involve major rewrite. If not, then clear some time (perhaps by explaining to others how difficult the fix is) and do the rewrite.
2) There is a reasonable fix without major rewrite. Apply the fix, run the tests, check it in, mark the bug as fixed.
3) Now, raise a new bug/issue (enhancement request), outlining the proposed rewrite and how it would improve the code (simpler? more maintainable? reduces coupling? affects performance or resource use?). Assign it to myself, CC anyone interested in that bit of code.
4) Give people a chance to comment, then prioritise that new bug within my existing tasks. This usually means don't do it now, because most of the time if I have one "proper" bug to fix, then I have at least two. Do it once the critical list is cleared, or next time I need to do something that isn't boring. Or maybe after a couple of days it won't seem worth doing any more, and it won't get done, and I've saved the time I would have spent doing it.
The important thing, I think, is to avoid shaving a yak every time you make a small bugfix.
The trade-off is that if the code you want to rewrite is really bad, then it's easy to under-prioritise the rewrite, and you end up spending so much time maintaining it that you don't have time to replace it with something that would require less maintenance. That needs to be borne in mind. But no matter what priority the rewrite should be, or how your organisation assigns those priorities, fixing bad design in the same area of code is not the same thing as correcting the out-by-one error that caused the crash you originally went in to deal with. This has to be considered at step 1: if the design means there are probably a whole bunch of other out-by-one errors lurking in the code, then just fixing this one is probably not "reasonable". If it really was a typo, but you happened to spot a refactor opportunity because you had the file open, then correcting it and touching nothing else is "reasonable".
Obviously the more agile your shop, the more significant a refactor you can do without it being so disruptive / time-consuming / political as to require a separate task.
If it's some code you've inherited, start making the code your own. Write unit-tests, then refactor.
Write more unit tests just to find out that the code is running perfectly fine.
If you still have the urge to rewrite it, you will have some tests to find out that your rewriten code is now failing ;)
Bad code is not something that can be avoided totally, but it can be isolated by proper abstraction. If it is indeed wasteland, which means, there is no landfills around, focusing design process is much more effective than trying to make people write perfect code.
IDon't think you SHOULD stop yourself from this. Mostly if you FEEL for the big rewrite it's mostly correct to do. I insanely disagree with Joel Spolsky on this thing...
Though it's one of few places where I disagree with him ... ;)
Stop rewriting when it is good enough. For this you need to know what good program is for you not only for your employer. After all you do programming for life not for good impression. Develop sound criteria which really would tell you that you are good and result is descent and it is time to go to next task. You should like your programs not less then your boss likes them.
Refactor only if your boss/company actually encourages it, otherwise you'll end up doing frequent extra time to bring up code to perfection... Until someone less dedicated touches it again.
Starting assumption: you already "commit early, commit often".
Then there's never a bad time to start a refactoring, because you can easily back out at any point. I find that at the beginning, it always feels like it's going to be easy, but after making a few changes and seeing the effects, I start to realise quite how big a job it's going to be. That is the time to ask whether there is really time to make the changes, or whether living with it until next time you come to this code is the better course.
The willingness to stop, throw away a half-refactored branch, and do it the nasty but quick way where appropriate, is key.
That's for refactoring, where the changes are incremental, and running (or nearly-running) software keeps you well grounded. Rewriting is different, because the time until you figure out it's going to take longer than you thought is so much greater. I may be taking the question too literally, but there's almost never a good time to throw it away and start again.